WO2023061054A1 - Non-contact gesture control method and electronic device - Google Patents

Non-contact gesture control method and electronic device

Info

Publication number
WO2023061054A1
WO2023061054A1 PCT/CN2022/114295 CN2022114295W WO2023061054A1 WO 2023061054 A1 WO2023061054 A1 WO 2023061054A1 CN 2022114295 W CN2022114295 W CN 2022114295W WO 2023061054 A1 WO2023061054 A1 WO 2023061054A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
signal
contact gesture
target application
microphone
Prior art date
Application number
PCT/CN2022/114295
Inventor
曾展秀
沈晔星
余民军
朱云帆
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023061054A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer

Definitions

  • the present application relates to the field of electronic technology, in particular to a non-contact gesture control method and electronic equipment.
  • Non-contact gestures (that is, air gestures) enable human-computer interaction with electronic devices: a user at a relatively long distance from the device can use them to answer calls, browse the web, turn off music, take screenshots, and so on.
  • the present application provides a non-contact gesture control method and an electronic device, which can enable the electronic device to accurately realize the human-computer interaction of the non-contact gesture in a target application, and improve user experience of using the electronic device and the target application.
  • the present application provides a non-contact gesture control method, the method is applied to an electronic device, and the electronic device includes: at least one speaker and at least one microphone.
  • the method includes:
  • the target application is an application that supports non-contact gesture recognition
  • autocorrelation refers to the dependence relationship between the instantaneous value of a signal at one moment and the instantaneous value of another moment, and is a time-domain description of a signal.
  • when the autocorrelation coefficient of a signal is greater than or equal to a preset value, the signal may be called an autocorrelation signal, that is, the signal has good autocorrelation characteristics. This application does not limit the size of the preset value.
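  The definition above can be illustrated with a minimal sketch of a normalized autocorrelation coefficient. The function name `autocorr_coefficient`, the test signal, and the threshold `PRESET` are assumptions chosen for illustration; the application does not specify how the coefficient is computed or what the preset value is.

```python
import math

def autocorr_coefficient(x, lag):
    """Normalized autocorrelation of the sequence x at a non-negative lag."""
    energy = sum(v * v for v in x)
    if energy == 0:
        return 0.0
    # Correlate the signal with a copy of itself shifted by `lag` samples.
    overlap = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
    return overlap / energy

# Illustrative signal: a sine with a 20-sample period, 200 samples long.
sine = [math.sin(2 * math.pi * n / 20) for n in range(200)]

PRESET = 0.8  # hypothetical threshold; the application leaves the value open

r0 = autocorr_coefficient(sine, 0)    # lag 0: the signal against itself
r20 = autocorr_coefficient(sine, 20)  # one full period later
r10 = autocorr_coefficient(sine, 10)  # half a period later (anti-correlated)

is_autocorrelation_signal = r20 >= PRESET
```

  Here the coefficient describes, in the time domain, how strongly the value at one moment depends on the value a fixed delay later, which is the dependence relationship the definition refers to.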
  • the electronic device can use its existing hardware design, such as speakers and microphones, and software design, such as gesture recognition algorithms, to accurately realize the human-computer interaction of non-contact gestures, so that the electronic device can accurately control the target application to respond to non-contact gestures in low-light or even dark environments. This helps improve the user's experience of using the electronic device and the target application, and requires no additional or expensive components in the electronic device, which reduces the cost of having the electronic device control the target application to respond to non-contact gestures.
  • when the ultrasonic signal includes one kind of signal, that signal is an autocorrelation signal, and its frequency range is within the first range;
  • the two kinds of signals are autocorrelation signals
  • the frequency ranges of the two kinds of signals are all within the first range
  • the two kinds of signals are time-divisionally transmitted
  • the frequency ranges of the two kinds of signals are the same and their frequency change rates are opposite;
  • the first range is related to the sampling rate of the microphone and the ultrasonic frequency range.
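  One way to picture two signals with the same frequency range, opposite frequency change rates, and time-division transmission is a pair of linear chirps sent in alternating slots. The sketch below is an assumption-laden illustration: the sampling rate `FS`, the band `F_LO` to `F_HI`, the slot length `SLOT`, and the choice of linear chirps are not taken from the application, which only states that the first range depends on the microphone sampling rate and the ultrasonic frequency range.

```python
import math

def linear_chirp(f_start, f_end, duration, fs):
    """Samples of a linear frequency sweep from f_start to f_end (Hz)."""
    n = int(round(duration * fs))
    k = (f_end - f_start) / duration  # frequency change rate in Hz per second
    samples = []
    for i in range(n):
        t = i / fs
        # The phase is the integral of the instantaneous frequency f_start + k*t.
        samples.append(math.sin(2 * math.pi * (f_start * t + 0.5 * k * t * t)))
    return samples

FS = 48_000                     # assumed microphone sampling rate, Hz
F_LO, F_HI = 20_000.0, 23_000.0 # assumed band: above hearing, below FS / 2
SLOT = 0.01                     # hypothetical slot length, seconds

up = linear_chirp(F_LO, F_HI, SLOT, FS)    # frequency rises over the slot
down = linear_chirp(F_HI, F_LO, SLOT, FS)  # same band, opposite change rate

# Time-division transmission: the two signals occupy consecutive slots.
frame = up + down
```

  Keeping `F_HI` below `FS / 2` reflects the sampling-rate constraint, since a microphone sampled at `FS` cannot represent frequencies above half that rate.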
  • the speaker can transmit the aforementioned ultrasonic signal over a wide coverage area, which increases the gesture recognition range of the electronic device and improves how well the electronic device recognizes non-contact gestures, so that it can recognize them accurately, which helps to accurately realize the human-computer interaction of non-contact gestures.
  • At least one speaker is activated to emit ultrasonic signals, including:
  • two signals are transmitted when two channels of a speaker are activated;
  • When the electronic device includes two speakers, two kinds of signals are transmitted when the two speakers are activated.
  • the electronic device is designed for different numbers of speakers, so that different numbers of speakers can emit ultrasonic signals containing two types of signals, which improves the universality and practicability of the electronic device for recognizing non-contact gestures.
  • the lowest value of the frequency response of the microphone is within the second range, and the second range is used to ensure that the microphone can receive the reflection signal.
  • the microphone can collect as many reflected signals as possible, which improves the accuracy of the electronic device in recognizing non-contact gestures, and does not need to add additional devices or use expensive devices, saving the cost of electronic devices for recognizing non-contact gestures.
  • the target application is controlled to respond to the touchless gesture according to the target application and the reflected signal, including:
  • the distance feature is used to represent the distance change between the waving position of the non-contact gesture and the microphone, and the velocity feature is used to represent the speed change of the non-contact gesture;
  • the electronic device can use the characteristics of the reflected signal to characterize the non-contact gesture and can recognize it accurately, which helps to reduce false touches in the target application and reduces the time cost for the electronic device to recognize the non-contact gesture.
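  As a rough illustration of how a distance feature and a velocity feature could be derived from a reflected signal, the sketch below locates an echo by cross-correlation and converts the delay to a distance. Everything here is an assumption for illustration: the application does not state this method, and the Barker-13 probe sequence, the simulated echoes, `FS`, `FRAME_INTERVAL`, and the co-located speaker and microphone are all hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius
FS = 48_000             # assumed microphone sampling rate, Hz
FRAME_INTERVAL = 0.02   # hypothetical time between two probe frames, seconds

# Barker-13 code: a short sequence with a sharp autocorrelation peak.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def simulate_echo(tx, delay, total, gain=0.5):
    """Received frame containing one attenuated copy of tx after `delay` samples."""
    rx = [0.0] * total
    for i, a in enumerate(tx):
        rx[delay + i] += gain * a
    return rx

def best_lag(tx, rx):
    """Lag in samples at which rx best matches tx, found by cross-correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(len(rx) - len(tx) + 1):
        score = sum(a * rx[lag + i] for i, a in enumerate(tx))
        if score > best_score:
            best, best_score = lag, score
    return best

def lag_to_distance(lag):
    """Round-trip delay to one-way distance, assuming speaker and mic coincide."""
    return SPEED_OF_SOUND * (lag / FS) / 2.0

# Two consecutive frames: the hand is closer in the second one.
d1 = lag_to_distance(best_lag(BARKER13, simulate_echo(BARKER13, 56, 128)))
d2 = lag_to_distance(best_lag(BARKER13, simulate_echo(BARKER13, 42, 128)))

speed = (d2 - d1) / FRAME_INTERVAL  # negative: the hand approaches the device
```

  The sequence of distances over frames plays the role of the distance feature, and the frame-to-frame difference plays the role of the velocity feature.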
  • when the electronic device includes one microphone, the electronic device further includes a shield for the microphone; the shield is used to adjust the propagation direction and/or propagation amount of the ultrasonic signal, so that the electronic device can distinguish non-contact gestures in different directions;
  • the maximum distance between the microphones is greater than a first threshold, and the first threshold is used to ensure that there is a position difference between the at least two microphones and the same speaker.
  • the electronic device is designed for different numbers of microphones, so that different numbers of microphones can receive the signal reflected when the ultrasonic signal meets the non-contact gesture, which improves the universality and practicability of the electronic device for recognizing the non-contact gesture.
  • before starting at least one speaker to emit an autocorrelated ultrasonic signal, the method also includes:
  • the first switch of the electronic device is in a first state, and the first state of the first switch is used to instruct the electronic device to enable the non-contact gesture recognition function.
  • the electronic device can also use whether the target application supports non-contact gesture recognition as a trigger condition to adaptively start or stop the non-contact gesture control method, so that the electronic device can flexibly realize the human-computer interaction of non-contact gestures in the target application, which is conducive to improving the adaptability and flexibility of the electronic device and can also reduce its overall power consumption.
  • before starting at least one speaker to emit an autocorrelated ultrasonic signal, the method further includes:
  • the target application supports contactless gesture recognition, including:
  • When the identification of the target application is stored in the storage module of the electronic device, it is determined that the target application supports non-contact gesture recognition; the storage module is used to store the identifications of all applications supporting non-contact gesture recognition.
  • the state of the second switch is used to indicate whether the target application supports non-contact gesture recognition.
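  The two conditions described above (a device-level first switch, and a stored list of supporting applications) can be sketched as a simple lookup. The application identifiers and the function names `supports_gesture` and `should_start_ultrasound` are hypothetical; the application does not define the storage format or any API.

```python
# Hypothetical contents of the storage module: identifiers of all
# applications that support non-contact gesture recognition.
SUPPORTED_APP_IDS = {"com.example.reader", "com.example.gallery"}

def supports_gesture(app_id, supported_ids=SUPPORTED_APP_IDS):
    """True when the application's identifier is stored in the storage module."""
    return app_id in supported_ids

def should_start_ultrasound(first_switch_on, app_id):
    """Start emitting only if the device-level switch is in the enabling
    state and the target application supports non-contact gestures."""
    return first_switch_on and supports_gesture(app_id)

start_reader = should_start_ultrasound(True, "com.example.reader")
start_unknown = should_start_ultrasound(True, "com.example.notes")
start_disabled = should_start_ultrasound(False, "com.example.reader")
```

  Gating the speaker on both conditions matches the stated goal of avoiding unnecessary power consumption when gestures cannot be used.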
  • the method also includes:
  • the electronic device can adaptively disable the function of the electronic device/target application that supports non-contact gesture recognition, which ensures the availability of the gesture recognition algorithm in the electronic device and reduces the overall power consumption of the electronic device.
  • the present application provides an electronic device, including: a memory and a processor; the memory is used to store program instructions; the processor is used to call the program instructions in the memory so that the electronic device performs the non-contact gesture control method in the first aspect and any possible design of the first aspect.
  • the present application provides a chip system, which is applied to an electronic device including a memory, a display screen, and a sensor; the chip system includes: a processor; when the processor executes the computer instructions stored in the memory, the electronic device executes the non-contact gesture control method in the first aspect and any possible design of the first aspect.
  • the present application provides a computer-readable storage medium, on which a computer program is stored; when the computer program is executed by a processor, the electronic device implements the non-contact gesture control method in the first aspect and any possible design of the first aspect.
  • the present application provides a computer program product, including: execution instructions stored in a readable storage medium; at least one processor of the electronic device can read the execution instructions from the readable storage medium, and execution of the instructions by the at least one processor enables the electronic device to implement the non-contact gesture control method in the first aspect and any possible design of the first aspect.
  • FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 2 is a software structural block diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 3A is a block diagram of the hardware and software structure of an electronic device implementing a non-contact gesture control method provided by an embodiment of the present application;
  • FIG. 3B is a block diagram of software and hardware structure of an electronic device implementing a non-contact gesture control method provided by an embodiment of the present application;
  • FIG. 4 is a software block diagram of an electronic device implementing a non-contact gesture control method provided by an embodiment of the present application
  • FIG. 5A is a schematic diagram of the positions of speakers and microphones when the electronic device is a notebook computer according to an embodiment of the present application;
  • FIG. 5B is a schematic diagram of the position of the speaker and the microphone when the electronic device is a mobile phone according to an embodiment of the present application;
  • FIG. 5C is a schematic diagram of the positions of speakers and microphones provided by an embodiment of the present application when the electronic device is a PC;
  • FIG. 6A is a schematic waveform diagram of a first transmission signal provided by an embodiment of the present application.
  • FIG. 6B is a schematic waveform diagram of a second transmission signal provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a non-contact gesture control method provided by an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a non-contact gesture control method provided by an embodiment of the present application.
  • FIG. 9A is a schematic diagram of a non-contact gesture provided by an embodiment of the present application in which the user's hand approaches the electronic device;
  • FIG. 9B is a schematic diagram of a non-contact gesture provided by an embodiment of the present application in which the user's hand moves away from the electronic device;
  • FIG. 9C is a schematic diagram of a characteristic heat map corresponding to a non-contact gesture provided by an embodiment of the present application where the user's hand first approaches the electronic device and then moves away from the electronic device;
  • FIG. 10A is a schematic diagram of a scenario in which a non-contact gesture provided by an embodiment of the present application is a user's hand waving downward;
  • FIG. 10B is a characteristic heat map, provided by an embodiment of the present application, corresponding to a non-contact gesture in which the user's hand waves downward;
  • FIG. 10C is a schematic diagram of a scenario in which a non-contact gesture provided by an embodiment of the present application is a user's hand waving upwards;
  • FIG. 10D is a characteristic heat map, provided by an embodiment of the present application, corresponding to a non-contact gesture in which the user's hand waves upwards;
  • FIG. 11A is a schematic diagram of a scene in which a non-contact gesture provided by an embodiment of the present application is a user's hand waving to the left;
  • FIG. 11B is a characteristic heat map, provided by an embodiment of the present application, corresponding to a non-contact gesture in which the user's hand waves to the left;
  • FIG. 11C is a schematic diagram of a scene in which a non-contact gesture provided by an embodiment of the present application is the user's hand waving to the right;
  • FIG. 11D is a characteristic heat map, provided by an embodiment of the present application, corresponding to a non-contact gesture in which the user's hand waves to the right;
  • FIG. 12 is a flowchart of an electronic device determining a distance feature and a speed feature according to an embodiment of the present application
  • Fig. 13 is a flowchart of determining a gesture category of a non-contact gesture by an electronic device according to an embodiment of the present application.
  • “At least one” means one or more, and “multiple” means two or more.
  • “And/or” describes the association relationship of associated objects, indicating that there can be three types of relationships, for example, A and/or B, which can mean: A exists alone, A and B exist at the same time, and B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the contextual objects are an “or” relationship.
  • “At least one of the following” or similar expressions refer to any combination of these items, including any combination of single or plural items.
  • At least one item (unit) of a, b or c may mean: a alone, b alone, c alone, a combination of a and b, a combination of a and c, a combination of b and c, or a combination of a, b and c, where a, b, c can be single or multiple.
  • first and second are used for descriptive purposes only, and should not be understood as indicating or implying relative importance. The orientation or positional relationship indicated by the terms “center”, “longitudinal”, “transverse”, “upper”, “lower”, “left”, “right”, “front”, “rear” etc.
  • the present application provides a non-contact gesture control method, an electronic device, a computer-readable storage medium and a computer program product, which can use the existing hardware design in the electronic device, such as the speaker and microphone, and software design, such as the gesture recognition algorithm, to accurately realize the human-computer interaction of non-contact gestures in the target application, so that the electronic device can accurately control the target application to respond to non-contact gestures in low-light or even dark environments. This helps improve the user's experience of using the electronic device and the target application, and also eliminates the need to add additional devices or use expensive devices in the electronic device, reducing the cost of the electronic device controlling the target application to respond to non-contact gestures.
  • the application scenarios of the above method may include but not limited to: office scenarios, entertainment scenarios, functional scenarios, and the like.
  • users can use non-contact gestures such as swiping up and down or swiping left and right to read documents in formats such as PDF, Word, and WPS, and achieve fast interactive responses such as turning pages up and down.
  • Users can also use non-contact gestures such as swipe up and down or swipe left and right to demonstrate documents in PPT and other formats to achieve fast interactive responses such as turning pages up and down.
  • users can use non-contact gestures such as waving up and down to play audio and video, and achieve rapid interactive responses such as increasing or decreasing the volume. Users can also use non-contact gestures such as swiping left and right to browse photos in the gallery and achieve fast interactive responses such as turning pages up and down. Users can also use non-contact gestures such as swipe up and down to make video calls and achieve quick interactive responses such as increasing or decreasing the volume.
  • users can use non-contact gestures such as swaying up and down, opening palms, clenching palms, and waving left and right to quickly capture screenshots or select multiple processes, etc., to achieve fast interactive responses.
  • the electronic device can be a mobile phone (such as a folding-screen mobile phone or a large-screen mobile phone), a tablet computer, a notebook computer, a wearable device, a vehicle-mounted device, an augmented reality (AR)/virtual reality (VR) device, a personal computer (PC), an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart TV, a smart screen, a high-definition TV, a 4K TV, a smart speaker, a smart projector, or other equipment; this application does not impose any restrictions on the specific type of electronic device.
  • FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • the structure illustrated in this application does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than shown, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • the processor 110 may include one or more processing units, for example: the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller may be the nerve center and command center of the electronic device 100.
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • the memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.
  • processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL).
  • processor 110 may include multiple sets of I2C buses.
  • the processor 110 can be respectively coupled to the touch sensor 180K, the charger, the flashlight, the camera 193 and the like through different I2C bus interfaces.
  • the processor 110 may be coupled to the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to realize the touch function of the electronic device 100.
  • the I2S interface can be used for audio communication.
  • processor 110 may include multiple sets of I2S buses.
  • the processor 110 may be coupled to the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170.
  • the audio module 170 can transmit audio signals to the wireless communication module 160 through the I2S interface, so as to realize the function of answering calls through the Bluetooth headset.
  • the PCM interface can also be used for audio communication, sampling, quantizing and encoding the analog signal.
  • the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
  • the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both I2S interface and PCM interface can be used for audio communication.
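  The three PCM steps named above (sampling, quantizing, and encoding an analog signal) can be sketched as follows. The 16-bit little-endian sample format, the 8 kHz rate, the 1 kHz test tone, and the function name `pcm16_encode` are assumptions for illustration; the application does not specify a PCM format.

```python
import math
import struct

def pcm16_encode(analog, fs, duration):
    """Sample analog(t) at fs Hz, quantize to 16 bits, encode little-endian."""
    n = int(round(fs * duration))
    out = bytearray()
    for i in range(n):
        v = analog(i / fs)              # sampling: read the signal at t = i / fs
        v = max(-1.0, min(1.0, v))      # clip to the representable range
        q = int(round(v * 32767))       # quantizing: map to a 16-bit integer
        out += struct.pack("<h", q)     # encoding: two bytes per sample
    return bytes(out)

def tone(t):
    """Illustrative analog source: a 1 kHz sine with amplitude 1."""
    return math.sin(2 * math.pi * 1000 * t)

pcm = pcm16_encode(tone, 8000, 0.001)   # 1 ms of audio at 8 kHz: 8 samples
samples = struct.unpack("<8h", pcm)
```

  Each sample occupies two bytes, so one millisecond at 8 kHz yields 16 bytes of PCM data.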
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.
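  The conversion between parallel and serial data that a UART performs can be sketched as framing one byte into a bit sequence and recovering it. The 8N1 frame format used here (one start bit, eight data bits sent least significant bit first, one stop bit, no parity) is a common configuration assumed for illustration; the application does not state the frame format, and the function names are hypothetical.

```python
def uart_frame(byte):
    """Parallel to serial: turn one byte into a 10-bit 8N1 frame."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    bits = [0]                                    # start bit (line pulled low)
    bits += [(byte >> i) & 1 for i in range(8)]   # data bits, LSB first
    bits.append(1)                                # stop bit (line returns high)
    return bits

def uart_unframe(bits):
    """Serial to parallel: recover the byte from a 10-bit 8N1 frame."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("bad frame")
    return sum(b << i for i, b in enumerate(bits[1:9]))

frame_a = uart_frame(0x41)          # ASCII 'A'
roundtrip = uart_unframe(frame_a)
```

  Because the line is asynchronous, the start and stop bits are what let the receiver find byte boundaries without a shared clock.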
  • a UART interface is generally used to connect the processor 110 and the wireless communication module 160 .
  • the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function.
  • the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
  • the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 194 and the camera 193.
  • the MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), etc.
  • the processor 110 communicates with the camera 193 through the CSI interface to realize the shooting function of the electronic device 100.
  • the processor 110 communicates with the display screen 194 through the DSI interface to realize the display function of the electronic device 100.
  • the GPIO interface can be configured by software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 110 with the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180 and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 130 is an interface conforming to the USB standard specification, specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and peripheral devices. It can also be used to connect headphones and play audio through them. This interface can also be used to connect other electronic devices, such as AR devices.
  • the interface connection relationship between modules shown in this application is only a schematic illustration, and does not constitute a structural limitation of the electronic device 100.
  • the electronic device 100 may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
  • the charging management module 140 is configured to receive a charging input from a charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 can receive charging input from the wired charger through the USB interface 130.
  • the charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device 100. While the charging management module 140 is charging the battery 142, it can also provide power for the electronic device through the power management module 141.
  • the power management module 141 is used for connecting the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives the input from the battery 142 and/or the charging management module 140 to provide power for the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 141 may also be disposed in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be set in the same device.
  • the wireless communication function of the electronic device 100 can be realized by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the electronic device 100.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be set in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio devices (including but not limited to the speaker 170A and the receiver 170B), or displays images or videos through the display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent of the processor 110, and be set in the same device as the mobile communication module 150 or other functional modules.
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • Wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • GNSS can include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
  • the electronic device 100 realizes the display function through the GPU, the display screen 194 , and the application processor.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, quantum dot light-emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
  • the electronic device 100 can realize the shooting function through the ISP, the camera 193 , the video codec, the GPU, the display screen 194 and the application processor.
  • the ISP is used for processing the data fed back by the camera 193 .
  • for example, when a photo is taken, light is transmitted through the lens to the photosensitive element of the camera, where the optical signal is converted into an electrical signal; the photosensitive element then transmits the electrical signal to the ISP, which processes it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin color.
  • ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be located in the camera 193 .
  • Camera 193 is used to capture still images or video.
  • the object generates an optical image through the lens and projects it to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the light signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other image signals.
  • the electronic device 100 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
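The Fourier transform of the energy at a selected frequency point can be illustrated with a single-bin discrete Fourier transform. This is a minimal sketch, not the device's implementation: the function name `bin_energy`, the sample rate, and the test tone are all assumptions made for illustration.

```python
import cmath
import math


def bin_energy(samples, sample_rate_hz, freq_hz):
    """Squared magnitude of the DFT of `samples` evaluated at freq_hz.

    Hypothetical helper: evaluates one frequency point directly instead
    of computing a full FFT.
    """
    acc = 0j
    for k, x in enumerate(samples):
        # Correlate the signal with a complex exponential at freq_hz.
        acc += x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
    return abs(acc) ** 2


# Assumed test signal: a 1 kHz sine sampled at 8 kHz for 80 samples
# (exactly 10 periods).
tone = [math.sin(2 * math.pi * 1000 * k / 8000) for k in range(80)]
energy_at_tone = bin_energy(tone, 8000, 1000)
energy_off_tone = bin_energy(tone, 8000, 2000)
```

Here the energy at 1 kHz is large while the energy at 2 kHz is close to zero, which is how a selected frequency point can be distinguished from the others.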
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs.
  • the electronic device 100 can play or record videos in various encoding formats, for example: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4 and so on.
  • the NPU is a neural-network (NN) computing processor.
  • Applications such as intelligent cognition of the electronic device 100 can be realized through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, so as to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example, saving music, video and other files to the external memory card.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include an area for storing programs and an area for storing data.
  • the stored program area can store an operating system, at least one application program required by a function (such as a sound playing function, an image playing function, etc.) and the like.
  • the storage data area can store data created during the use of the electronic device 100 (such as audio data, phonebook, etc.) and the like.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170 , the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
  • the audio module 170 may also be used to encode and decode audio signals.
  • the audio module 170 can be set in the processor 110, or some functional modules of the audio module 170 can be set in the processor 110.
  • the speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the electronic device 100 can play music or a hands-free call through the speaker 170A.
  • the receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • the receiver 170B can be placed close to the human ear so that the user can listen to the voice.
  • the microphone 170C, also called a "mic", is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can speak with the mouth close to the microphone 170C, so that the sound signal is input to the microphone 170C.
  • the electronic device 100 may be provided with at least one microphone 170C. In some other embodiments, the electronic device 100 may be provided with two microphones 170C, which may also implement a noise reduction function in addition to collecting sound signals. In some other embodiments, the electronic device 100 can also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions, etc.
  • the earphone interface 170D is used for connecting wired earphones.
  • the earphone interface 170D can be the USB interface 130, or a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the pressure sensor 180A is used to sense the pressure signal and convert the pressure signal into an electrical signal.
  • pressure sensor 180A may be disposed on display screen 194 .
  • there are many types of pressure sensors 180A, such as resistive pressure sensors, inductive pressure sensors, and capacitive pressure sensors.
  • a capacitive pressure sensor may be comprised of at least two parallel plates with conductive material.
  • the electronic device 100 determines the intensity of pressure according to the change in capacitance.
  • the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations acting on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example: when a touch operation with a touch operation intensity less than the first pressure threshold acts on the short message application icon, an instruction to view short messages is executed. When a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the icon of the short message application, the instruction of creating a new short message is executed.
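The threshold rule in the short message example can be sketched as follows. The threshold value, the normalised intensity scale, and the instruction names are hypothetical; the source only states that one threshold separates the two instructions.

```python
# Assumed normalised pressure threshold; the source does not give a value.
FIRST_PRESSURE_THRESHOLD = 0.5


def instruction_for_sms_icon(touch_intensity, threshold=FIRST_PRESSURE_THRESHOLD):
    """Map the intensity of a touch on the short message icon to an instruction."""
    if touch_intensity < threshold:
        # Lighter than the first pressure threshold: view short messages.
        return "view_short_messages"
    # Greater than or equal to the threshold: create a new short message.
    return "create_new_short_message"
```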
  • the gyro sensor 180B can be used to determine the motion posture of the electronic device 100 .
  • the angular velocity of the electronic device 100 around three axes may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization. Exemplarily, when the shutter is pressed, the gyro sensor 180B detects the shaking angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to counteract the shaking of the electronic device 100 through reverse movement to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • the air pressure sensor 180C is used to measure air pressure.
  • the electronic device 100 calculates the altitude based on the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
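The source does not say how altitude is derived from air pressure. One common approximation is the international barometric formula, sketched below; the function name and the default sea-level pressure of 1013.25 hPa are assumptions for illustration.

```python
def altitude_m(pressure_hpa, sea_level_hpa=1013.25):
    """Approximate altitude in metres from air pressure in hPa.

    Uses the international barometric formula; this is an assumed choice,
    not a method stated in the source.
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```

Lower measured pressure yields a higher altitude estimate, which is the property the positioning and navigation assistance relies on.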
  • the magnetic sensor 180D includes a Hall sensor.
  • the electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of the flip leather case.
  • when the electronic device 100 is a clamshell machine, the electronic device 100 can detect opening and closing of the clamshell according to the magnetic sensor 180D.
  • features such as automatic unlocking upon flip-open can then be set according to the detected opening and closing state of the leather case or the clamshell.
  • the acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of electronic devices, and can be used in applications such as horizontal and vertical screen switching, pedometers, etc.
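As an illustration of horizontal and vertical screen switching, the gravity components measured while the device is stationary can be compared per axis. This is a simplified sketch: the axis convention (x along the short edge, y along the long edge) and the function name are assumptions.

```python
def screen_orientation(ax, ay):
    """Pick an orientation from gravity components in m/s^2.

    ax: acceleration along the short edge of the device (assumed convention).
    ay: acceleration along the long edge of the device (assumed convention).
    """
    # Gravity mostly along the long edge means the device is held upright.
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```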
  • the distance sensor 180F is used to measure the distance.
  • the electronic device 100 may measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F for distance measurement to achieve fast focusing.
  • Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the electronic device 100 emits infrared light through the light emitting diode.
  • Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it may be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear to make a call, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in leather case mode, automatic unlock and lock screen in pocket mode.
  • the ambient light sensor 180L is used for sensing ambient light brightness.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket, so as to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access to application locks, take pictures with fingerprints, answer incoming calls with fingerprints, and the like.
  • the temperature sensor 180J is used to detect temperature.
  • the electronic device 100 uses the temperature detected by the temperature sensor 180J to implement a temperature treatment strategy. For example, when the temperature reported by the temperature sensor 180J exceeds the threshold, the electronic device 100 may reduce the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
  • when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to prevent the electronic device 100 from being shut down abnormally due to the low temperature.
  • at low temperature, the electronic device 100 may also boost the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
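The temperature treatment strategy can be sketched as a set of threshold checks. All three threshold values and the action names are hypothetical, and using a separate threshold for the voltage boost is an assumption, since the source does not state its condition.

```python
def thermal_actions(temp_c, high_c=45.0, low_heat_c=0.0, low_boost_c=-10.0):
    """Return the protective actions for a reported temperature in Celsius.

    high_c, low_heat_c and low_boost_c are assumed values for illustration.
    """
    actions = []
    if temp_c > high_c:
        # Too hot: reduce processor performance near the sensor.
        actions.append("throttle_processor")
    if temp_c < low_heat_c:
        # Too cold: heat the battery to avoid abnormal shutdown.
        actions.append("heat_battery")
    if temp_c < low_boost_c:
        # Colder still (assumed condition): boost the battery output voltage.
        actions.append("boost_battery_output_voltage")
    return actions
```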
  • the touch sensor 180K is also known as a "touch panel".
  • the touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 together form a touchscreen.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 , which is different from the position of the display screen 194 .
  • the bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the bone that vibrates when a person speaks. The bone conduction sensor 180M can also be placed in contact with the human pulse to receive the blood pressure beating signal. In some embodiments, the bone conduction sensor 180M can also be disposed in an earphone to form a bone conduction earphone.
  • the audio module 170 can parse out the voice signal from the vibration signal of the vocal-part bone acquired by the bone conduction sensor 180M, so as to realize the voice function.
  • the application processor can analyze the heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M, so as to realize the heart rate detection function.
  • the keys 190 include a power key, a volume key and the like.
  • the key 190 may be a mechanical key. It can also be a touch button.
  • the electronic device 100 can receive key input and generate key signal input related to user settings and function control of the electronic device 100 .
  • the motor 191 can generate a vibrating reminder.
  • the motor 191 can be used for incoming call vibration prompts, and can also be used for touch vibration feedback.
  • touch operations applied to different applications may correspond to different vibration feedback effects.
  • the motor 191 may also correspond to different vibration feedback effects for touch operations acting on different areas of the display screen 194 .
  • different application scenarios (for example: time reminder, receiving information, alarm clock, games, etc.) may also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 can be an indicator light, and can be used to indicate charging status, power change, and can also be used to indicate messages, missed calls, notifications, and the like.
  • the SIM card interface 195 is used for connecting a SIM card.
  • the SIM card can be connected and separated from the electronic device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the electronic device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • SIM card interface 195 can support Nano SIM card, Micro SIM card, SIM card etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of multiple cards may be the same or different.
  • the SIM card interface 195 is also compatible with different types of SIM cards.
  • the SIM card interface 195 is also compatible with external memory cards.
  • the electronic device 100 interacts with the network through the SIM card to implement functions such as calling and data communication.
  • the electronic device 100 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100 .
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • This application takes the Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100 .
  • the present application does not limit the type of the operating system of the electronic device, which may be, for example, the Android system, the Linux system, the Windows system, the iOS system, or the Harmony operating system (Hongmeng OS).
  • FIG. 2 is a block diagram of a software structure of an electronic device provided by an embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
  • the Android system is divided into four layers, which from top to bottom are the application layer (APP), the application framework layer (APP framework), the Android runtime and system libraries, and the kernel layer.
  • the application layer can consist of a series of application packages.
  • the application packages can include applications (APPs) such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, game, chat, shopping, travel, instant messaging (such as SMS), smart home, and device control.
  • the smart home application can be used to control or manage home devices with networking functions.
  • household equipment can include lights, televisions, and air conditioners.
  • household equipment can also include anti-theft door locks, speakers, sweeping robots, sockets, body fat scales, desk lamps, air purifiers, refrigerators, washing machines, water heaters, microwave ovens, rice cookers, curtains, fans, TVs, set-top boxes, doors and windows etc.
  • the application packages may also include applications such as a main screen (that is, a desktop), a negative one screen, a control center, and a notification center.
  • the negative one screen, also known as the "-1 screen", refers to the user interface (UI) reached by sliding right on the main screen of the electronic device until the leftmost screen is displayed.
  • the negative one screen can be used to place quick service functions and notification messages, such as global search, quick entries to specific pages of applications (payment code, WeChat, etc.), instant messages and reminders (express delivery information, expenditure information, commuting traffic conditions, taxi travel information, schedule information, etc.), and followed news (football stands, basketball stands, stock information, etc.).
  • the control center is a sliding-up message notification bar of the electronic device, that is, a user interface displayed by the electronic device when the user starts to perform an upward sliding operation on the bottom of the electronic device.
  • the notification center is the drop-down message notification bar of the electronic device, that is, the user interface displayed by the electronic device when the user performs a downward sliding operation starting from the top of the electronic device.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include window managers, content providers, view systems, phone managers, resource managers, notification managers, and so on.
  • the window manager is used to manage window programs, such as managing window status, attributes, view (view) addition, deletion, update, window order, message collection and processing, etc.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • the window manager is the entrance for the outside world to access the window.
  • Content providers are used to store and retrieve data and make it accessible to applications.
  • Data can include videos, images, audio, calls made and received, browsing history and bookmarks, phonebook, etc.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on.
  • the view system can be used to build applications.
  • a display interface can consist of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide communication functions of the electronic device 100 . For example, the management of call status (including connected, hung up, etc.).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, user interface layout files (layout xml), video files, fonts, colors, and the identifiers (IDs) of user interface (UI) components.
  • the resource manager is used to manage the aforementioned resources in a unified manner.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify the download completion, message reminder, etc.
  • the notification manager can also present a notification in the top status bar of the system in the form of a chart or scroll bar text, such as a notification of an application running in the background, or present a notification on the screen in the form of a dialog window, for example, prompting text information in the status bar, issuing a prompt sound, vibrating the electronic device, or flashing the indicator light.
  • the Android runtime includes core libraries and a virtual machine.
  • the Android runtime is responsible for the scheduling and management of the Android system.
  • the core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of the Android system.
  • the application layer and the application framework layer run in virtual machines.
  • the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • a system library can include multiple function modules, for example: a surface manager, media libraries, a 3D graphics processing library (e.g., OpenGL ES), a 2D graphics engine (e.g., SGL), and an image processing library.
  • the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of various commonly used audio and video formats, as well as still image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing, etc.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
  • when the touch sensor 180K receives a touch operation, a corresponding hardware interrupt is sent to the kernel layer.
  • the kernel layer processes the touch operation into a raw input event (including touch coordinates, the time stamp of the touch operation, and other information). Raw input events are stored at the kernel layer.
  • the application framework layer obtains the raw input event from the kernel layer and identifies the control corresponding to the input event. Take as an example a touch operation that is a tap, where the control corresponding to the tap is the smart speaker icon.
  • the smart speaker application calls the interface of the application framework layer to start the smart speaker application, which in turn starts the audio driver by calling the kernel layer; the audio electrical signal is then converted into a sound signal through the speaker 170A.
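The touch workflow above (raw input event, then control identification) can be sketched as a simple hit test. The class names, fields, and coordinates below are hypothetical and do not correspond to Android framework classes.

```python
from dataclasses import dataclass


@dataclass
class RawInputEvent:
    """Raw input event as produced by the kernel layer (illustrative fields)."""
    x: int
    y: int
    timestamp_ms: int


@dataclass
class Control:
    """A rectangular on-screen control, such as an application icon."""
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def identify_control(event, controls):
    """Return the first control whose bounds contain the touch point, else None."""
    for control in controls:
        if control.contains(event.x, event.y):
            return control
    return None


# Assumed layout: two icons side by side.
icons = [
    Control("camera_icon", 0, 0, 100, 100),
    Control("smart_speaker_icon", 100, 0, 200, 100),
]
tapped = identify_control(RawInputEvent(x=150, y=40, timestamp_ms=1234), icons)
```

In this sketch the tap at (150, 40) resolves to the smart speaker icon, after which the corresponding application would be started.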
  • the structure illustrated in this application does not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than shown, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • Figs. 3A-3B are block diagrams of the software and hardware structures of two electronic devices implementing the non-contact gesture control method provided by an embodiment of the present application.
  • Fig. 4 is a software block diagram of an electronic device implementing the non-contact gesture control method provided by an embodiment of the present application.
  • the software and hardware structural block diagram of the electronic device realizing the non-contact gesture control method may include: a switch control module 21, an application recognition module 22, a speaker 23, a microphone 24, a feature extraction module 25, and a model reasoning module 26.
  • in Fig. 3A, the software system of the electronic device is used to recognize non-contact gestures, and the software system of the electronic device includes: a feature extraction module 25, a model reasoning module 26 and a reaction recognition module 27.
  • in Fig. 3B, the target application 28 is used to recognize non-contact gestures, and the target application 28 includes: a feature extraction module 25 , a model reasoning module 26 and a reaction recognition module 27 .
  • next, the modules in Figs. 3A-3B, and a possible implementation of each in an electronic device implementing the non-contact gesture control method, are described in detail.
  • the switch control module 21 can be a software code set in the electronic device.
  • the switch control module 21 is configured to detect the state of the first switch, and determine whether to send a trigger instruction to the application identification module 22 according to the state of the first switch.
  • the first switch may be a software switch in a software system of the electronic device, and/or, the first switch may be a hardware switch on a housing of the electronic device.
  • the present application does not limit parameters such as position, shape, and size of the first switch.
  • the trigger instruction is used to trigger the application identification module 22 .
  • the present application does not limit the specific implementation manner of the trigger instruction.
  • the trigger instruction can be represented by letters, binary numbers, numbers and the like.
  • the first state may be an on state or an off state, which is not limited in this application.
  • the state of the first switch can also be used to indicate whether the electronic device enables the non-contact gesture recognition function.
  • when the first switch is in the first state, the switch control module 21 can determine that the electronic device has enabled the non-contact gesture recognition function, and send a trigger instruction to the application recognition module 22; when the first switch is not in the first state, the switch control module 21 may determine that the electronic device has not enabled the non-contact gesture recognition function, and does not send a trigger instruction to the application recognition module 22.
  • the electronic device may display prompt information corresponding to the aforementioned operation, so as to prompt the user whether the electronic device currently supports non-contact gesture recognition.
  • the present application does not limit the specific implementation manner of the prompt information corresponding to the aforementioned operations.
  • the switch control module 21 may treat the first switch as being in the first state by default. Moreover, the switch control module 21 is an optional module. Correspondingly, when the switch control module 21 is omitted, the application identification module 22 does not need to wait for a trigger instruction before it starts identifying the target application 28.
  • the application identification module 22 may be a software code set in the electronic device.
  • the application identification module 22 is configured to identify the target application 28 after receiving the trigger instruction, and determine whether the target application 28 supports non-contact gesture recognition.
  • the identified target application 28 is an application program that has been started by the electronic device, and a user interface displayed by the target application 28 on the electronic device is a focus window of the electronic device.
  • the focus window of the electronic device mentioned in this application can be understood as the current operating window of the electronic device, that is, the window that the user can currently operate.
  • This application does not limit the relevant parameters of the target application 28 such as type, user interface, and display position.
  • the content in the user interface may be expressed in the form of text, picture, video, audio, etc.
  • the application recognition module 22 may determine whether the target application 28 supports non-contact gesture recognition in various ways.
  • a white list can be set in the application recognition module 22, and whether the target application 28 supports non-contact gesture recognition is determined by judging whether the relevant information of the target application 28 is stored in the white list.
  • relevant information of all applications supporting non-contact gestures is stored in the white list.
  • the relevant information of an application may include: an application identifier, or an application identifier and an application process identifier, and the like.
  • the identification of the application may be represented by an application name, an application ID, and the like.
  • the process of the application can be understood as the execution process of the application to realize the user interface displayed in the electronic device.
  • the identifier of the process may be represented by a process name, a process ID, and the like.
  • when the relevant information of the target application 28 is stored in the white list, the application identification module 22 can determine that the target application 28 supports non-contact gesture recognition. When no relevant information of the target application 28 is stored in the white list, the application identification module 22 may determine that the target application 28 does not support non-contact gesture recognition.
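  • the white-list check described above can be sketched as follows. This is only a minimal illustration: the names `AppInfo`, `WHITE_LIST` and `supports_gesture`, the example application identifiers, and the convention that an empty process set accepts any process are all assumptions and are not defined by this application.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class AppInfo:
    """Relevant information of an application (hypothetical structure)."""
    app_id: str                       # application identifier, e.g. a name or ID
    process_id: Optional[str] = None  # optional identifier of the application process


# Hypothetical white list: application identifier -> allowed process identifiers.
# An empty set is taken to mean "any process of this application is accepted".
WHITE_LIST: Dict[str, Set[str]] = {
    "com.example.reader": set(),
    "com.example.video": {"player_ui"},
}


def supports_gesture(app: AppInfo, white_list: Dict[str, Set[str]] = WHITE_LIST) -> bool:
    """Return True if the application's relevant information is stored in the white list."""
    if app.app_id not in white_list:
        return False
    allowed_processes = white_list[app.app_id]
    return not allowed_processes or app.process_id in allowed_processes
```

  • for example, `supports_gesture(AppInfo("com.example.video", "player_ui"))` is true, while an application whose identifier is absent from the white list is treated as not supporting non-contact gesture recognition.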
  • the application recognition module 22 can detect the state of the second switch, and according to the state of the second switch, can determine whether the target application 28 supports non-contact gesture recognition.
  • the second switch may be a software switch in a software system of the electronic device, and/or, the second switch may be a software switch in a target application.
  • the present application does not limit parameters such as position, shape, and size of the second switch.
  • when the second switch is in the second state, the application identification module 22 can determine that the target application 28 supports non-contact gesture recognition; when the second switch is not in the second state, the application identification module 22 can determine that the target application 28 does not support non-contact gesture recognition.
  • the second state may be an on state or an off state, which is not limited in this application.
  • the electronic device may display prompt information corresponding to the aforementioned operation, so as to prompt the user whether the target application 28 currently supports non-contact gesture recognition.
  • the present application does not limit the specific implementation manner of the prompt information corresponding to the aforementioned operations.
  • the application identification module 22 is also used to determine whether the target application 28 supports non-contact gesture recognition according to the scene type of the user interface of the target application 28 .
  • the scene types corresponding to the user interface of the target application 28 may include: a text browsing scene, a picture browsing scene, a voice playing scene, a video playing scene, and the like.
  • the application recognition module 22 can determine that the target application 28 supports non-contact gesture recognition.
  • the application identification module 22 may determine that the target application 28 does not support non-contact gesture recognition.
  • the application identification module 22 is also used to determine the gesture category of the non-contact gestures supported by the scene type of the user interface of the target application 28 , so as to prompt the user in time.
  • the gesture category of the non-contact gesture supported by the scene type corresponding to the user interface of the target application 28 can be understood as one or more gesture categories of the non-contact gesture that the user can use in the user interface of the target application 28 .
  • when determining that the target application 28 supports non-contact gesture recognition, the application recognition module 22 can start the speaker 23 to emit ultrasonic signals and start the microphone 24 to collect reflected signals.
  • the application recognition module 22 can also start the feature extraction module 25, the model reasoning module 26 and the reaction recognition module 27 to perform corresponding functions (the aforementioned processes are not shown in Fig. 3A, Fig. 3B and Fig. 4).
  • the application identification module 22 is further configured to send the identification of the target application 28 to the reaction identification module 27 .
  • the application identification module 22 does not need to send the identification of the target application 28 to the reaction identification module 27 , and the reaction identification module 27 can directly obtain the identification of the target application 28 .
  • when determining that the target application 28 does not support non-contact gesture recognition: if the speaker 23 and the microphone 24 have not been started, the application identification module 22 does not need to start the speaker 23 to emit ultrasonic signals or start the microphone 24 to collect reflected signals; if the speaker 23 and the microphone 24 have been started, the application identification module 22 controls the speaker 23 and the microphone 24 to stop.
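  • the start/stop behaviour described above can be sketched as a small state update. The class `UltrasonicFrontEnd` and the function `update_front_end` are hypothetical stand-ins for the speaker 23, the microphone 24 and the control performed by the application identification module 22; real devices would be driven through the audio driver rather than a boolean flag.

```python
class UltrasonicFrontEnd:
    """Hypothetical stand-in for the speaker 23 and the microphone 24."""

    def __init__(self) -> None:
        self.active = False  # True while emitting ultrasound and collecting reflections

    def start(self) -> None:
        self.active = True   # start the speaker and the microphone

    def stop(self) -> None:
        self.active = False  # stop the speaker and the microphone


def update_front_end(front_end: UltrasonicFrontEnd, target_supported: bool) -> bool:
    """Start or stop the front end according to whether the target application
    supports non-contact gesture recognition; return the resulting state."""
    if target_supported and not front_end.active:
        front_end.start()
    elif not target_supported and front_end.active:
        front_end.stop()
    # Otherwise the front end is already in the desired state and nothing is done.
    return front_end.active
```

  • calling `update_front_end` each time the focus window changes keeps the speaker and microphone running only while a supporting target application is in focus.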
  • the speaker 23 is provided in the electronic device.
  • the speaker 23 is configured to emit ultrasonic signals, so that the non-contact gesture can reflect the ultrasonic signals emitted by the speaker 23 .
  • This application does not limit parameters such as the number and type of the speakers 23 .
  • the microphone 24 is set in the electronic device.
  • the microphone 24 is used to collect reflection signals and transmit the reflection signals to the feature extraction module 25 .
  • the present application does not limit parameters such as the number and types of the microphones 24 .
  • the reflected signal collected by the microphone 24 includes: the signal reflected when the ultrasonic signal encounters the surrounding environment of the electronic device, and the signal reflected when the ultrasonic signal encounters the non-contact gesture.
  • the feature extraction module 25 is a software code, and the feature extraction module 25 can be set in the software system of the electronic device.
  • the feature extraction module 25 is used to filter out, from the reflected signal, the ultrasonic signal reflected by the surrounding environment of the electronic device, obtain the signal reflected when the ultrasonic signal encounters the non-contact gesture, extract features such as the distance feature and the speed feature represented by that reflected signal, and transmit the distance feature and the speed feature to the model reasoning module 26 .
  • the velocity feature is used to represent the velocity change of the non-contact gesture.
  • the distance feature is used to represent the change in distance between the waving position of the non-contact gesture and the microphone.
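  • one possible way to obtain such features is sketched below. It is an assumption, not the method of this application: each frame is taken to be a delay profile (for example, the magnitude of the correlation between the collected signal and the emitted signal), reflections from the surrounding environment are removed by subtracting a background profile recorded without a gesture, and the speaker and microphone are treated as co-located. All function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def remove_background(profile, background):
    """Subtract the static (environment) reflections from a delay profile."""
    return [p - b for p, b in zip(profile, background)]


def strongest_delay(profile):
    """Return the delay index (in samples) of the largest-magnitude entry."""
    return max(range(len(profile)), key=lambda i: abs(profile[i]))


def distance_m(delay_samples, sample_rate_hz):
    """Convert a round-trip delay in samples into a one-way distance in metres."""
    return SPEED_OF_SOUND_M_S * (delay_samples / sample_rate_hz) / 2.0


def speed_m_s(previous_distance_m, current_distance_m, frame_interval_s):
    """Radial speed between two frames; negative means the hand is approaching."""
    return (current_distance_m - previous_distance_m) / frame_interval_s
```

  • applied frame by frame, the sequence of distances gives a distance feature and the sequence of speeds gives a speed feature that a classifier such as the model reasoning module 26 could consume.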
  • the model reasoning module 26 is software code, and the model reasoning module 26 can be provided in the software system of the electronic device, which is not limited in this application.
  • the model reasoning module 26 is configured to identify the gesture category of the non-contact gesture according to the distance feature and the speed feature, and transmit the gesture category of the non-contact gesture to the reaction recognition module 27 .
  • the gesture categories of non-contact gestures mentioned in this application may include, but are not limited to: palm movements (such as waving, opening, clenching, hovering, moving, etc.), finger movements (such as waving, hovering, changes in the number of fingers, etc.) and hand movements (such as waving, opening, clenching, hovering, moving, etc.).
  • the gesture position of the non-contact gesture needs to be within the gesture recognition range of the electronic device, so as to ensure that the ultrasonic signal can reach the non-contact gesture and be reflected by it.
  • the present application does not limit the specific value of the gesture recognition range of the electronic device.
  • the electronic device can display the gesture recognition range of the electronic device, so that the user can make effective non-contact gestures within the gesture recognition range of the electronic device, so that the electronic device can recognize the non-contact gesture.
  • the reaction recognition module 27 is software code, and the reaction recognition module 27 can be provided in the software system of the electronic device.
  • the reaction identification module 27 is configured to transmit an interaction instruction to the target application 28 according to the identification of the target application 28 and the gesture category of the non-contact gesture.
  • the interaction instruction is used to control the target application 28 to respond to the non-contact gesture, so that the target application 28 realizes the human-computer interaction of the non-contact gesture.
  • the present application does not limit the specific implementation manner of the interactive instruction.
  • the interactive instruction may be represented by letters, binary numbers, numbers, and the like.
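  • as a minimal sketch of how the reaction recognition module 27 could derive an interaction instruction from the identification of the target application 28 and the gesture category, a lookup table may be used. The table contents, the gesture category names and the instruction strings below are hypothetical examples; this application does not limit how the instruction is represented.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping: (application identifier, gesture category) -> interaction instruction.
INTERACTION_TABLE: Dict[Tuple[str, str], str] = {
    ("com.example.reader", "wave_left"): "NEXT_PAGE",
    ("com.example.reader", "wave_right"): "PREVIOUS_PAGE",
    ("com.example.video", "hover"): "PAUSE",
}


def interaction_instruction(app_id: str, gesture_category: str) -> Optional[str]:
    """Return the instruction for this application and gesture, or None if unmapped."""
    return INTERACTION_TABLE.get((app_id, gesture_category))
```

  • returning `None` for an unmapped pair lets the caller simply ignore gestures that the current target application does not respond to.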
  • the target application 28 is configured to respond to the non-contact gesture in response to the interaction instruction.
  • the electronic device needs hardware design for the speaker 23 and microphone 24, and software design for the switch control module 21, application recognition module 22, feature extraction module 25, model reasoning module 26, reaction recognition module 27, and target application 28.
  • the electronic device can start the speaker 23 to emit ultrasonic signals, start the microphone 24 to collect the signals reflected when the ultrasonic signals encounter the non-contact gesture, and use the feature changes of those reflected signals to capture the non-contact gesture and realize human-computer interaction of non-contact gestures in the target application 28 .
  • the feature extraction module 25, the model reasoning module 26 and the reaction recognition module 27 can also be provided in the server of the target application 28; after the feature extraction module 25, the model reasoning module 26 and the reaction recognition module 27 perform their respective operations, the server of the target application 28 may return the execution result to the target application 28 .
  • the target application 28 can realize human-computer interaction of non-contact gestures.
  • the hardware design and software design involved in implementing the non-contact gesture control method for the electronic device are respectively introduced in detail.
  • the electronic device designs the layout and parameter constraints of the speaker and microphone, which can improve the accuracy with which the electronic device recognizes non-contact gestures without adding extra components or using expensive components, thereby saving the cost of recognizing non-contact gestures.
  • the electronic device may arrange the speaker on a non-contact surface of the electronic device (for example, when the electronic device is placed on a table, any surface of the electronic device other than the surface in contact with the table), which ensures that the ultrasonic signal emitted by the speaker has wide coverage and helps the electronic device recognize non-contact gestures better.
  • the present application does not limit parameters such as quantity, type, and position of the speakers.
  • the number of speakers may be one. In some other embodiments, the number of speakers may be at least two, in which case the at least two speakers are arranged symmetrically, that is, their positions are symmetrical, either centrally or axially.
  • the present application does not limit the foregoing content, as long as the propagation range of the ultrasonic signal emitted by the speaker is as wide as possible.
  • the electronic device can arrange the microphone at a position close to the user side of the electronic device, so that the microphone can receive reflected signals, which is beneficial for the electronic device to have a better recognition effect on non-contact gestures.
  • the present application does not limit parameters such as quantity, type, and position of the microphones.
  • the number of microphones may be one, and the electronic device is further provided with a shielding member for the microphone, through which the propagation channel of the reflection signal can be adjusted, so that the microphone collects reflection signals with different characteristics.
  • this application refers to non-contact gestures with opposite gesture directions as gesture 1 and gesture 2, respectively, and refers to the signals reflected when the same ultrasonic signal encounters gesture 1 and gesture 2 as signal 1 and signal 2, respectively.
  • the present application does not limit the specific implementation manners of Gesture 1 and Gesture 2 .
  • Gesture 1 is a left swipe and Gesture 2 is a right swipe.
  • Gesture 1 is a swipe up and gesture 2 is a swipe down.
  • the electronic device can collect signal 1 and signal 2 with different characteristics through a microphone and a shield. Therefore, the electronic device can distinguish gesture 1 and gesture 2 according to the signal 1 and signal 2 with different characteristics, so that the electronic device can recognize non-contact gestures in different directions.
  • the propagation channel can be characterized by parameters such as propagation amount or propagation direction.
  • the propagation channel mentioned in this application may also be referred to as an echo path/propagation path.
  • reflected signals with different characteristics can be characterized by different appearance times of peaks or troughs in characteristic heat maps of reflected signals.
  • the shielding member can be made of materials such as plastic or ceramics.
  • the shielding member is a conducting member with two openings: one opening of the shielding member is used to cover/wrap the sound pickup hole that communicates with the microphone and is provided on the housing of the electronic device, and the other opening of the shielding member is provided with a sound receiving hole.
  • the electronic device is a mobile phone as an example for illustration.
  • the shell of the mobile phone is provided with a sound pickup hole, and the sound pickup hole communicates with the sound pickup surface of the microphone through a connecting piece.
  • the shielding member is a conducting member with two openings: one opening of the shielding member covers/wraps the sound pickup hole; the surface where that opening is located is adjacent to the surface where the other opening of the shielding member is located; the surface where the other opening is located is a chamfered surface; and the other opening is provided with a sound receiving hole, so that the reflected signal can propagate to the sound pickup surface of the microphone through the sound receiving hole, the sound pickup hole and the connecting piece in sequence.
  • the present application does not limit the specific implementation manner of the surface where the other opening of the shielding member is located.
  • the surface where another opening of the shielding member is located is not perpendicular to the surface where an opening of the shielding member is located.
  • the present application does not limit parameters such as position, quantity, and shape of sound receiving holes.
  • the arrangement of the sound receiving hole can change the propagation direction and propagation amount of signal 1 and signal 2, so that the characteristics of signal 1 and signal 2 differ and the electronic device can distinguish non-contact gestures with opposite gesture directions.
  • the electronic device can recognize non-contact gestures in various swipe directions.
  • the shell of the mobile phone is provided with a sound pickup hole, and the sound pickup hole communicates with the sound pickup surface of the microphone through a connecting piece.
  • the shielding member is a conducting member with two openings: one opening of the shielding member covers/wraps the sound pickup hole; the surface where that opening is located is opposite or adjacent to the surface where the other opening of the shielding member is located; and the other opening is provided with sound receiving hole 1 and sound receiving hole 2, which have different signal throughputs, so that the reflected signal can pass through sound receiving hole 1 and sound receiving hole 2 and then propagate to the sound pickup surface of the microphone through the sound pickup hole and the connecting piece.
  • the sound receiving hole 1 and the sound receiving hole 2 are both arranged on the same surface of the shielding member away from the shell of the mobile phone.
  • the sound receiving hole 1 and the sound receiving hole 2 are respectively arranged on two opposite surfaces of the shielding member.
  • the sound receiving hole 1 and the sound receiving hole 2 may adopt different calibers and/or numbers, so that the sound receiving hole 1 and the sound receiving hole 2 have different signal throughputs.
  • the present application does not limit parameters such as position, quantity and shape of the sound receiving holes 1 and 2 .
  • for example, sound receiving hole 1 is one hole, sound receiving hole 2 is two holes, and the diameter of sound receiving hole 1 is larger than that of sound receiving hole 2 .
  • the arrangement of sound receiving hole 1 and sound receiving hole 2 can change the propagation direction and propagation amount of signal 1 and signal 2, so that the characteristics of signal 1 and signal 2 differ and the electronic device can distinguish non-contact gestures with opposite waving directions.
  • the electronic device can recognize non-contact gestures in various swipe directions.
  • the shielding member can be fixedly arranged on the housing of the electronic device, or can be inserted into the housing of the electronic device to realize the detachable connection of the shielding member.
  • This application does not limit the connection method of the shielding member.
  • the present application does not limit parameters such as position, quantity and shape of the sound pickup holes.
  • this application can also change the shape of the microphone to an irregular shape, for example by setting the sound pickup surface or welding surface of the microphone as a beveled surface, so as to change the propagation channel of the reflected signal and allow the electronic device to distinguish non-contact gestures with opposite waving directions.
  • the number of microphones may also be at least two.
  • the maximum distance between the microphones is greater than the first preset threshold, and the first preset threshold is used to ensure that there is a position difference between different microphones relative to the same speaker, that is, there is a position difference between the transmitting position of the ultrasonic signal and the collection position of the reflected signal.
  • this position difference helps reflect changes in the non-contact gesture, and avoids the situation in which non-contact gestures cannot be distinguished because the microphones are too close together.
  • the present application does not limit the specific size of the first preset threshold.
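  • the spacing constraint on multiple microphones can be checked with a short calculation. The sketch below assumes microphone positions are given as coordinates in metres; the function names and any threshold value passed in are hypothetical, since this application does not fix the first preset threshold.

```python
from itertools import combinations
from math import dist


def max_mic_spacing(positions):
    """Largest pairwise distance between microphone positions (same unit as input)."""
    return max(dist(p, q) for p, q in combinations(positions, 2))


def spacing_ok(positions, first_threshold_m):
    """True if there are at least two microphones and their maximum spacing
    is greater than the first preset threshold."""
    return len(positions) >= 2 and max_mic_spacing(positions) > first_threshold_m
```

  • for instance, with microphones at (0, 0), (0.03, 0.04) and (0, 0.01) metres, the maximum spacing is 0.05 m, so the layout passes a hypothetical 0.04 m threshold.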
  • the microphone can be designed on a non-contact surface of a small electronic device such as a mobile phone.
  • the microphone can be designed at the front end of a larger electronic device such as a notebook computer or a PC, so that it is easy to collect changes in reflected signals and ensure that changes in reflected signals can better reflect non-contact gestures.
  • FIG. 5A is a schematic diagram of positions of a speaker and a microphone when the electronic device is a notebook computer according to an embodiment of the present application.
  • the speaker can be arranged at positions on the notebook computer such as: the edge position a1 on the side of the screen facing the user, the side a2 of the housing on the left side of the screen, the side a3 of the housing on the right side of the screen, the upper surface a4 of the housing on the side of the keyboard facing the user, the side a5 of the housing surrounding the side of the keyboard facing the user, the side a6 of the housing surrounding the left side of the keyboard, and the side a7 of the housing surrounding the right side of the keyboard.
  • the number of speakers on the notebook computer can be set to at least two, and the at least two speakers are left-right symmetrical along the symmetry axis of the notebook computer, so that the coverage of the ultrasonic signal is more uniform and broader.
  • the position of the microphone on the laptop computer can be set close to the user side, for example, the clasp of the laptop computer.
  • the microphone can be arranged at positions on the notebook computer such as: the edge position a1 on the side of the screen facing the user, the side a8 of the housing on the upper side of the screen, the side a5 of the housing surrounding the side of the keyboard facing the user, and the side surrounding the keyboard.
  • the number of microphones on the notebook computer can be set to at least two, and the at least two microphones are left-right symmetrical along the symmetry axis of the notebook computer, so that the microphones can receive as many reflected signals as possible.
  • six speakers and three microphones can be set on a laptop computer.
  • the six speakers are divided into three pairs, and each pair of speakers is arranged symmetrically.
  • the first pair of speakers are respectively located at the left edge and the right edge of a1, the second pair of speakers are respectively located on a2 and a3, and the third pair of speakers are located on a5.
  • two of the three microphones are arranged symmetrically, both on a5, and the third microphone is on a8.
  • FIG. 5B is a schematic diagram of positions of speakers and microphones when the electronic device is a mobile phone according to an embodiment of the present application.
  • the speaker can be arranged at positions on the mobile phone such as: the edge position b1 on the side of the screen facing the user, the housing side b2 on the upper side of the screen, the housing side b3 on the lower side of the screen, the housing side b4 on the left side of the screen, and the housing side b5 on the right side of the screen.
  • the number of speakers on the mobile phone can be set to at least two, and at least two speakers are left-right symmetrical along the symmetry axis of the mobile phone, so that the coverage of the ultrasonic signal is more uniform and wider.
  • the microphone can be set on the edge position b1 of the mobile phone, such as on the side facing the user of the screen, on the side b2 of the casing on the upper side of the screen, on the side b3 of the casing on the lower side of the screen, on the side of the screen.
  • at least one microphone is arranged on the side adjacent to the side.
  • the number of microphones on the mobile phone can be set to at least two, and at least two microphones are set on the mobile phone, so that the microphones can receive as many reflected signals as possible.
  • since the mobile phone is a handheld device, the positions of the speaker and the microphone on the mobile phone should, as far as possible, avoid the areas where the user holds the mobile phone.
  • four speakers and four microphones can be set on the mobile phone.
  • the four speakers are divided into two pairs, and each pair of speakers is arranged symmetrically.
  • the first pair of speakers are respectively located at the upper edge and the lower edge of b1, and the second pair of speakers are respectively located at the left edge and right edge of b1.
  • the four microphones are divided into two pairs, and each pair of microphones is arranged symmetrically: the first pair of microphones are respectively located on b2 and b3, and the second pair of microphones are respectively located on b4 and b5.
  • FIG. 5C is a schematic diagram of positions of a speaker and a microphone when the electronic device is a PC according to an embodiment of the present application.
  • the speakers can be arranged at positions on the PC such as: the edge position c1 on the side of the screen facing the user, the housing side c2 on the upper side of the screen, the housing side c3 on the lower side of the screen, the housing side c4 on the left side of the screen, the housing side c5 on the right side of the screen, the bracket side c6 on the left side of the bracket, the bracket side c7 on the right side of the bracket, the position c8 on the side of the bracket facing the user, and the upper surface c9 of the base near the user side of the base (that is, the contact surface between the base and the bracket).
  • the number of speakers on the PC can be set to at least two, and the at least two speakers are left-right symmetrical along the symmetry axis of the PC.
  • the position of the microphone on the PC can be set closer to the user.
  • the microphone can be set on the edge position c1 on the side of the screen facing the user, the side c2 of the casing on the upper side of the screen, the side c3 of the casing on the lower side of the screen, or the left side of the screen.
  • the number of microphones on the PC can be set to at least two, and the at least two microphones are left-right symmetrical along the symmetry axis of the PC, so that the microphones can receive as many reflected signals as possible.
  • four speakers and four microphones can be set on the PC.
  • the four speakers are divided into two pairs, and each pair of speakers is arranged symmetrically.
  • the first pair of speakers are respectively located on c6 and c7, and the second pair of speakers are located on c9.
  • the four microphones are divided into two pairs, and each pair of microphones is arranged symmetrically.
  • the first pair of microphones is located on c9, and the second pair of microphones is located on c10.
  • electronic devices arrange the number and positions of speakers and microphones so that the speakers can emit ultrasonic signals as widely as possible, so that the microphones can collect as many reflected signals as possible, which improves the effect of electronic devices in recognizing non-contact gestures.
  • the electronic device can accurately recognize the non-contact gesture, which is beneficial to accurately realize the human-computer interaction of the non-contact gesture in the target application.
  • the ultrasonic signal may include one kind of signal, or may include multiple kinds of signals, which is not limited in the present application.
  • the condition that each signal in the ultrasonic signal needs to satisfy is: each signal is an autocorrelation signal, that is, the signal has good autocorrelation characteristics.
  • the autocorrelation mentioned in this application refers to the dependence relationship between the instantaneous value of a signal at one moment and the instantaneous value of another moment, and is a time-domain description of a signal.
  • the signal when the autocorrelation coefficient of the signal is greater than or equal to a preset value, the signal may be called an autocorrelation signal, that is, the signal has good autocorrelation characteristics. This application does not limit the size of the preset value.
  • the present application does not limit the specific type of the ultrasonic signal.
  • for example, the ultrasonic signal may be a chirp signal, or a zero-autocorrelation sequence (Zadoff-Chu sequence, ZC sequence), etc.
  • the speaker can emit ultrasonic signals with good autocorrelation characteristics, which makes it easier for the electronic device to extract, from the collected reflected signals, the signal reflected by the non-contact gesture and to filter out the signal reflected by the surrounding environment of the electronic device.
  • the electronic device can thus accurately capture the non-contact gesture according to the signal produced when the ultrasonic signal is reflected by the non-contact gesture.
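As a rough illustration of the autocorrelation property described above, the following sketch compares the autocorrelation of a linear chirp with that of a pure tone. The 48 kHz sampling rate, the 20 kHz to 23 kHz sweep, the 10 ms length, the guard width and the helper names `linear_chirp` and `peak_to_sidelobe` are all assumptions chosen for illustration; the application does not fix any of them.

```python
import numpy as np

FS = 48_000        # assumed sampling rate in Hz
DURATION = 0.010   # assumed signal length in seconds


def linear_chirp(f_start, f_end, duration=DURATION, fs=FS):
    """Return a sine sweep whose frequency moves linearly from f_start to f_end."""
    t = np.arange(int(round(duration * fs))) / fs
    rate = (f_end - f_start) / duration
    return np.sin(2 * np.pi * (f_start * t + 0.5 * rate * t**2))


def peak_to_sidelobe(signal, guard=20):
    """Ratio of the zero-lag autocorrelation peak to the largest magnitude
    found more than `guard` samples away from that peak."""
    ac = np.abs(np.correlate(signal, signal, mode="full"))
    peak = len(signal) - 1                      # index of zero lag
    outside = np.ones(ac.shape, dtype=bool)
    outside[peak - guard:peak + guard + 1] = False
    return ac[peak] / ac[outside].max()


chirp = linear_chirp(20_000, 23_000)
tone = np.sin(2 * np.pi * 21_500 * np.arange(len(chirp)) / FS)

chirp_ratio = peak_to_sidelobe(chirp)   # sharp, isolated peak: large ratio
tone_ratio = peak_to_sidelobe(tone)     # periodic signal: ratio close to 1
```

A larger ratio means a delayed copy of the signal is easier to locate in time, which is why a chirp-like signal suits echo extraction better than a constant tone.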
  • the electronic device can also design the frequency range of each signal in the ultrasonic signal to be within the first range.
  • the frequency range (ie, frequency band) of each signal refers to the actual frequency range of the signal.
  • the first range is related to the sampling rate of the microphone and the ultrasonic frequency range.
  • the sampling rate of the microphone should be greater than twice the highest frequency of the ultrasonic signal. Therefore, the frequency of the ultrasonic signal needs to be less than half the sampling rate of the microphone.
  • the frequency range of ultrasound is greater than or equal to 20 kHz (kilohertz).
  • the first range is greater than or equal to 20 kHz and less than half of the sampling rate of the microphone.
  • the present application does not limit the sampling rate of the microphone.
  • Commercially available microphones in laptops and PCs have a sampling rate of 48kHz.
  • the sampling rate of the microphone in the mobile phone has reached 96kHz, or even higher.
  • the sampling rate of the microphone in this application is illustrated by taking 48kHz as an example.
  • the first range is greater than or equal to 20kHz and less than 24kHz.
  • the speaker can emit ultrasonic signals with a frequency range within the first range, so that the microphone can collect reflected signals within the first range.
  • when the speaker emits multiple signals, in addition to satisfying the above conditions, and considering that the frequency band resources corresponding to the first range that allow ultrasonic signals to propagate are limited, the speaker can transmit the multiple signals in a multi-channel manner.
  • the frequency band resources that allow ultrasonic signals to propagate are related to the first range. For example, when the first range is greater than or equal to 20 kHz and less than 24 kHz, the frequency band resource corresponding to the first range is 4 kHz (that is, 24 kHz - 20 kHz).
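The relation between the microphone sampling rate, the first range and the available frequency band can be worked through with a short sketch. The function name `first_range` and the constant name are hypothetical; the 20 kHz lower bound and the half-sampling-rate upper bound come from the description above.

```python
ULTRASOUND_MIN_HZ = 20_000   # lower bound of ultrasound stated in the description


def first_range(mic_sampling_rate_hz):
    """Return (low, high) in Hz: low is inclusive, high is an exclusive bound
    equal to half the microphone sampling rate."""
    high = mic_sampling_rate_hz / 2
    if high <= ULTRASOUND_MIN_HZ:
        raise ValueError("sampling rate too low to capture any ultrasonic band")
    return ULTRASOUND_MIN_HZ, high


low, high = first_range(48_000)
band_resource_hz = high - low   # width of the usable band
```

With a 48 kHz microphone this gives a band from 20 kHz up to (but excluding) 24 kHz, that is, 4 kHz of band resource; a 96 kHz microphone would widen the upper bound to 48 kHz.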
  • the multi-channel mode means that the speaker transmits the various signals in a time-division manner (that is, in turn/alternately). Additionally, in some embodiments, the frequency ranges of the various signals are the same. In some other embodiments, the frequency ranges of the various signals are the same, and some of the signals have opposite frequency change rates.
  • the frequency change rate refers to a frequency change trend, which may gradually increase or gradually decrease, which is not limited in the present application.
  • the present application does not limit the magnitude of the frequency change rate.
  • the present application does not limit the interval between two adjacent signals among the various signals.
  • the speaker transmits ultrasonic signals in a multi-channel manner, so that the ultrasonic signal can occupy as much of the first range as possible. This helps the ultrasonic signal cover as many frequencies as possible, ensures the propagation quality of the ultrasonic signal, improves the accuracy with which the electronic device recognizes non-contact gestures, and solves the problem that the recognition of non-contact gestures cannot be guaranteed when the microphone uses a low sampling rate.
  • the two signals are both autocorrelation signals
  • the frequency ranges of the two signals are both within the first range
  • the two signals are time-divisionally transmitted
  • the frequency ranges of the two signals are the same, and their frequency change rates are opposite.
  • in the case where the speaker emits two signals, these two signals are respectively referred to as a first transmitted signal and a second transmitted signal.
  • the ultrasonic signal produced when the first transmitted signal is reflected is referred to as a first reflected signal
  • the ultrasonic signal produced when the second transmitted signal is reflected is referred to as a second reflected signal.
  • FIG. 6A is a schematic waveform diagram of a first transmission signal provided by an embodiment of the present application
  • FIG. 6B is a schematic waveform diagram of a second transmission signal provided by an embodiment of the present application.
  • the frequency of the ultrasonic signal adopts the linear frequency sweep mode of the chirp ultrasonic signal
  • the frequency f_1(t) of the first transmitted signal x_1(t) corresponds to curve 11 in FIG. 6A
  • the first transmitted signal x_1(t) corresponds to curve 12 in FIG. 6A
  • the frequency f_2(t) of the second transmitted signal x_2(t) corresponds to curve 21 in FIG. 6B
  • the second transmitted signal x_2(t) corresponds to curve 22 in FIG. 6B.
  • the first transmitted signal and the second transmitted signal x_i(t) are both synchronized swept-sine signals.
  • the frequency f_i(t) of the first or second transmitted signal x_i(t) is a function of time t, that is, f_i(t) changes as t changes. The value range of f_i(t) is greater than or equal to f_0 and less than or equal to f_1, where f_0 is greater than or equal to the minimum value of the first range, f_1 is less than or equal to the maximum value of the first range, and i is 1 or 2.
  • the frequency ranges of the first transmitted signal and the second transmitted signal are the same (that is, both are greater than or equal to f_0 and less than or equal to f_1), and their frequency change rates are opposite (that is, curve 11 gradually rises and curve 21 gradually falls).
  • curve 12 shows a complete waveform of the first transmitted signal, and correspondingly, the minimum transmission duration of a complete waveform of the first transmitted signal is t16.
  • curve 22 shows a complete waveform of the second transmitted signal, and correspondingly, the minimum transmission duration of a complete waveform of the second transmitted signal is t26.
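The pair of signals described above (same frequency range, opposite frequency change rates) can be sketched as follows. This is a minimal sketch that assumes a plain linear sweep; the sampling rate, the limits `F0` and `F1`, the 10 ms duration and the function name `sweep` are illustrative assumptions, not values taken from FIG. 6A or FIG. 6B.

```python
import numpy as np

FS = 48_000                    # assumed sampling rate in Hz
F0, F1 = 20_000.0, 23_000.0    # assumed sweep limits inside the first range
T_SWEEP = 0.010                # assumed duration of one complete waveform


def sweep(f_start, f_end, duration=T_SWEEP, fs=FS):
    """Return (instantaneous frequency, waveform) of a linear sine sweep."""
    t = np.arange(int(round(duration * fs))) / fs
    rate = (f_end - f_start) / duration          # frequency change rate in Hz/s
    freq = f_start + rate * t
    wave = np.sin(2 * np.pi * (f_start * t + 0.5 * rate * t**2))
    return freq, wave


f1_t, x1 = sweep(F0, F1)   # first transmitted signal: frequency rises
f2_t, x2 = sweep(F1, F0)   # second transmitted signal: frequency falls
```

Both waveforms stay between `F0` and `F1`; only the direction of the sweep differs, which keeps them distinguishable even though they share the same band.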
  • the electronic device can design the signals emitted by the speaker to satisfy at least one of the following conditions: each signal has good autocorrelation characteristics, the frequency range of each signal is within the first range, and the two signals are transmitted in a multi-channel manner. The electronic device can then transmit the above ultrasonic signal through the speaker over a wide coverage area, which increases the gesture recognition range of the electronic device and improves its recognition of non-contact gestures. The electronic device can therefore accurately recognize the non-contact gesture and accurately realize human-computer interaction with non-contact gestures in the target application, while reducing the cost of recognizing non-contact gestures.
  • the electronic device can also constrain the frequency response of the ultrasonic signal.
  • the lowest value of the frequency response of the ultrasonic signal is related to the gesture recognition range of the electronic device and the sound source energy of the ultrasonic signal.
  • the highest value of the frequency response of the ultrasonic signal is also related to the influence of the sound pressure intensity of the ultrasonic signal on the living body (such as the bearing capacity of the eardrum).
  • the electronic device can also design the transmission duration of the ultrasonic signal and/or parameters of the ultrasonic signal such as the minimum propagation energy value and the maximum propagation energy value according to the gesture recognition range of the electronic device.
  • the microphone needs to meet the following conditions:
  • the lowest value of the frequency response (frequency response) of the microphone is within the second range.
  • the second range is related to the gesture recognition range of the electronic device.
  • since the frequency range of the ultrasonic signal is within the first range, the microphone needs to respond to ultrasonic signals at every frequency within the first range. Therefore, by setting the lowest value of the frequency response of the microphone within the second range, the microphone can receive reflected signals within the first range within the gesture recognition range of the electronic device.
  • the frequency range of the signal produced when the ultrasonic signal is reflected by the non-contact gesture is within the first range
  • the frequency range of the signal produced when the ultrasonic signal is reflected by the surrounding environment of the electronic device is not within the first range
  • the frequency range of the reflected signal collected by the microphone is usually the full frequency band.
  • This application does not limit the sampling rate of the microphone.
  • This application can use commercially available microphones. Generally, the higher the sampling rate of the microphone, the better the performance of the crystal oscillator, and the higher the frequency, the better the recognition effect of the electronic device on non-contact gestures.
  • the electronic device only needs a microphone capable of collecting reflected signals; meeting the above conditions improves the microphone's performance in collecting reflected signals.
  • the electronic device enables the speaker to emit ultrasonic signals as widely as possible and the microphone to receive as many reflected signals as possible, which improves the accuracy with which the electronic device recognizes non-contact gestures. No additional or expensive devices are needed, which saves the cost of recognizing non-contact gestures.
  • a software design is also proposed for the electronic device to recognize non-contact gestures.
  • the electronic device can accurately estimate the gesture category of the non-contact gesture according to the feature changes of the non-contact gesture that are carried by the part of the reflected signal reflected by the non-contact gesture. Therefore, the electronic device can accurately recognize the non-contact gesture and control the target application to respond to it, which helps realize human-computer interaction with non-contact gestures in the target application.
  • FIG. 7 is a schematic flowchart of a non-contact gesture control method provided by an embodiment of the present application. As shown in Figure 7, the non-contact gesture control method of the present application may include the following steps:
  • the electronic device displays a target application, where the target application is an application supporting non-contact gesture recognition.
  • the electronic device starts the speaker to transmit an autocorrelated ultrasonic signal.
  • the electronic device starts the microphone to collect the reflected signal produced when the ultrasonic signal is reflected by the non-contact gesture.
  • the electronic device controls the target application to respond to the non-contact gesture according to the target application and the reflected signal.
  • the electronic device can use ultrasonic signals, existing hardware such as the speaker and microphone in the electronic device, and software such as a gesture recognition algorithm to accurately realize human-computer interaction with non-contact gestures in the target application. The electronic device can therefore accurately control the target application to respond to non-contact gestures even in low-light or dark environments, which improves the user's experience of the electronic device and the target application. No additional or expensive devices need to be added to the electronic device, which reduces the cost of controlling the target application to respond to non-contact gestures.
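The four steps of FIG. 7 can be sketched as a single control pass. Everything named here (`TargetApp`, `supports_gestures`, `control_once` and the callables passed in) is a hypothetical stand-in for the device's real modules, and the sketch only shows the order of operations.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TargetApp:
    name: str
    supports_gestures: bool   # hypothetical flag: supports non-contact gesture recognition


def control_once(
    app: TargetApp,
    emit_ultrasound: Callable[[], None],
    collect_reflection: Callable[[], list],
    recognize: Callable[[list], Optional[str]],
    respond: Callable[[TargetApp, str], None],
) -> Optional[str]:
    """Run one pass of the four steps and return the recognized gesture, if any."""
    if not app.supports_gestures:      # step 1: only a supporting target application proceeds
        return None
    emit_ultrasound()                  # step 2: speaker emits the ultrasonic signal
    reflection = collect_reflection()  # step 3: microphone collects the reflected signal
    gesture = recognize(reflection)
    if gesture is not None:
        respond(app, gesture)          # step 4: target application responds
    return gesture
```

Passing the hardware and recognition steps in as callables keeps the sketch independent of any particular audio API.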
  • the electronic device may display one or more user interfaces of the target application 28 .
  • the electronic device may also display the icon of the target application 28, or the electronic device may also display user interfaces of other applications, which is not limited in this application.
  • when the electronic device simultaneously displays the user interfaces of multiple applications (including the user interface of the target application 28), the electronic device uses the application identification module 22 to first identify the target application 28 and then determine whether the target application 28 supports non-contact gesture recognition.
  • the application identification module 22 can identify an application that has been started in the electronic device and has a user interface as the focus window as the target application 28 , and the aforementioned user interface is the user interface currently displayed by the target application 28 in the electronic device. Wherein, the application identification module 22 may identify the target application 28 in various ways.
  • the electronic device can store the identification of the application in the storage space of the software system, and the electronic device may start multiple applications at the same time, that is, the identification corresponding to multiple applications can be stored in the storage space . Therefore, the application identification module 22 can retrieve the identifiers of all applications from the storage space to determine all the applications that have been started.
  • the present application does not limit parameters such as location and size of the storage space.
  • the identification of the application may be represented by an application name, an application ID, and the like.
  • the application identification module 22 can obtain the identification of the application corresponding to the focus window through the application programming interface (application programming interface, API) in the software system, such as calling a function, to determine the target application 28.
  • the application identification module 22 can determine whether the identifier of the target application 28 is stored in the storage space, so as to determine whether the target application 28 has been started. Thus, when the application identification module 22 determines that the target application 28 has been started, it can identify the application that has been started in the electronic device and whose user interface is the focus window as the target application 28.
  • the electronic device may simultaneously display user interfaces of multiple applications (including the user interface of the target application 28), but the electronic device has only one focus window. Therefore, the application identification module 22 can obtain the user interface corresponding to the focus window, for example by calling a function.
  • the application identification module 22 can then search all the started applications for the application to which the user interface corresponding to the focus window belongs.
  • the application identification module 22 can identify the application in the electronic device that has been started and whose user interface is the focus window as the target application 28.
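The lookup described above, from the focus window to the started application that owns it, can be sketched as follows. The data layout (a mapping from application identifier to a set of window handles) and the function name `identify_target_app` are assumptions for illustration; a real software system would obtain this information through its own API.

```python
from typing import Dict, Optional, Set


def identify_target_app(started_apps: Dict[str, Set[int]], focus_window: int) -> Optional[str]:
    """Return the identifier of the started application that owns the focus window."""
    for app_id, windows in started_apps.items():
        if focus_window in windows:
            return app_id
    return None   # the focus window belongs to no started application


# Hypothetical snapshot of started applications and their window handles.
started = {"video_player": {11, 12}, "text_editor": {20}}
```

For example, `identify_target_app(started, 12)` yields `"video_player"`, while a window handle that belongs to no started application yields `None`.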
  • the application identification module 22 can use multiple methods to determine whether the target application 28 supports non-contact gesture recognition.
  • for the specific implementation methods, refer to the descriptions in FIG. 3A, FIG. …, which are not repeated here.
  • the application identification module 22 can start the speaker 23 to emit ultrasonic signals, and start the microphone 24 to collect reflected signals.
  • the electronic device can also use the switch control module 21 to send a trigger instruction to the application identification module 22, so that the application identification module 22 executes S101 after receiving the trigger instruction.
  • the electronic device can also use whether the target application supports non-contact gesture recognition as a trigger condition to adaptively start or stop the non-contact gesture control method, so that the electronic device can flexibly realize human-computer interaction with non-contact gestures in the target application. This improves the adaptability and flexibility of the electronic device and can also reduce its overall power consumption.
  • the application identification module 22 may turn off the speaker and the microphone, so that the target application no longer responds to the non-contact gesture. This helps the electronic device adaptively realize human-computer interaction with non-contact gestures and also helps reduce the overall power consumption of the electronic device.
  • the application identification module 22 can turn off the speaker and the microphone, and the switch control module 21 can switch the first switch out of the first state, so that the electronic device turns off its non-contact gesture recognition function. This fully takes into account the availability of human-computer interaction with non-contact gestures on the electronic device and also reduces the overall power consumption of the electronic device.
  • the present application does not limit the specific size of the second threshold.
  • the electronic device can display corresponding prompt information, suggesting that the user either not use non-contact gestures to control the target application, or plug in the electronic device before using non-contact gestures to control the target application.
  • the present application does not limit the specific implementation manner of the foregoing prompt information.
  • the electronic device can adaptively enable or disable the function of non-contact gesture recognition, which ensures the adaptability and availability of the human-computer interaction of the electronic device to realize the non-contact gesture, and reduces the overall power consumption of the electronic device.
  • the electronic device may use the speaker 23 to transmit an autocorrelated ultrasonic signal.
  • the ultrasonic signal may include one type of signal or multiple types of signals, which is not limited in this application.
  • the one signal is an autocorrelation signal with good autocorrelation characteristics, and the frequency range of the one signal is within the first range.
  • the present application does not limit the manner in which the speaker 23 transmits one kind of signal.
  • when the number of speakers 23 is one, the speaker can emit the one kind of signal continuously or at intervals, which is not limited in this application.
  • when the number of speakers 23 is more than one, the multiple speakers can emit signals in a time-division manner, in an alternating manner, or simultaneously, which is not limited in this application.
  • the multiple signals are all autocorrelation signals and have good autocorrelation characteristics, and the frequency ranges of the multiple signals are all within the first range.
  • the present application does not limit the manner in which the speaker 23 transmits various signals.
  • when the number of speakers 23 is one, multiple sound channels of the one speaker can transmit the various signals respectively. When the number of speakers 23 is more than one, the multiple speakers can emit the various signals respectively. The speaker 23 may emit the various signals in a time-division manner, in an alternating manner, or simultaneously, which is not limited in this application.
  • the ultrasonic signal may include two kinds of signals, both of which are autocorrelation signals with good autocorrelation characteristics, the frequency ranges of the two kinds of signals are both within the first range, and the two kinds of signals are transmitted in a multi-channel manner.
  • one speaker can include two sound channels, and the two sound channels can time-divisionally transmit two kinds of signals. That is, one channel transmits the first transmission signal, and after the transmission of the first transmission signal ends, the other channel transmits the second transmission signal.
  • two speakers can also transmit the two kinds of signals in a time-division manner. That is, one speaker transmits the first transmitted signal, and after the transmission of the first transmitted signal ends, the other speaker transmits the second transmitted signal.
  • the frequency ranges of the two signals may be the same, and the frequency change rates of the two signals may be opposite.
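The time-division transmission over two sound channels (or two speakers) can be sketched as a two-column sample buffer in which only one column is active at any moment. The function name `time_division_frame`, the optional silent gap and the two-column layout are assumptions for illustration.

```python
import numpy as np


def time_division_frame(first, second, gap_samples=0):
    """Build one frame of shape (samples, 2): column 0 carries the first
    transmitted signal, then column 1 carries the second one after an
    optional silent gap."""
    n1, n2 = len(first), len(second)
    frame = np.zeros((n1 + gap_samples + n2, 2), dtype=float)
    frame[:n1, 0] = first
    frame[n1 + gap_samples:, 1] = second
    return frame


# Tiny placeholder signals, only to show the layout.
frame = time_division_frame(np.ones(3), 2 * np.ones(2), gap_samples=1)
```

In the resulting buffer the two columns never carry non-zero samples at the same index, which is the defining property of the time-division scheme.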
  • the electronic device starts the speaker to transmit the ultrasonic signal, which widens the coverage of the ultrasonic signal and also enables the electronic device to collect, through the microphone, the signal produced when the ultrasonic signal is reflected by the non-contact gesture.
  • the ultrasonic signal may encounter a non-contact gesture or the surrounding environment of the electronic device. Therefore, the signal reflected by the non-contact gesture is inevitably mixed with the signal reflected by the surrounding environment of the electronic device.
  • the frequency range of the ultrasonic signal reflected by the non-contact gesture is within the first range, and the frequency range of the ultrasonic signal reflected by the surrounding environment of the electronic device is not within the first range.
  • the electronic device can collect the reflected signal after the ultrasonic signal encounters the non-contact gesture through the microphone.
  • the electronic device can use the feature extraction module 25, the model reasoning module 26 and the response recognition module 27 to control the target application 28 to respond to the non-contact gesture according to the target application and the reflection signal.
  • FIG. 8 is a schematic flowchart of a non-contact gesture control method provided by an embodiment of the present application.
  • the non-contact gesture control method of the present application may include:
  • the feature extraction module 25 extracts distance features and velocity features of the reflected signal.
  • the model reasoning module 26 determines the gesture category of the non-contact gesture according to the distance feature and the speed feature.
  • the response recognition module 27 sends an interaction instruction to the target application 28 according to the target application and the gesture category of the non-contact gesture.
  • the target application 28 responds to the non-contact gesture according to the interaction instruction.
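The step in which an interaction instruction is chosen from the target application and the gesture category can be sketched as a table lookup. The application names, gesture labels and instruction names below are all hypothetical; the application does not specify a concrete mapping.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: (target application, gesture category) -> interaction instruction.
INSTRUCTIONS: Dict[Tuple[str, str], str] = {
    ("video_player", "push"): "pause_or_resume",
    ("video_player", "swipe_left"): "rewind",
    ("video_player", "swipe_right"): "fast_forward",
    ("slide_show", "swipe_left"): "previous_page",
    ("slide_show", "swipe_right"): "next_page",
}


def interaction_instruction(app: str, gesture: str) -> Optional[str]:
    """Look up the instruction to send; None means the gesture is ignored."""
    return INSTRUCTIONS.get((app, gesture))
```

Keying the table on both the application and the gesture lets the same gesture trigger different responses in different target applications.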
  • the electronic device can accurately recognize non-contact gestures and realize human-computer interaction with non-contact gestures in the target application without being constrained by ambient light, device cost, or device material, which improves the user's experience of the electronic device and the target application and also reduces the cost of recognizing non-contact gestures.
  • the feature extraction module 25 may use the feature heat map to characterize the feature of the reflected signal. Therefore, the feature extraction module 25 can determine the gesture category of the non-contact gesture through the feature heat map in combination with the arrangement position of the microphone 24 .
  • for illustration, take the following as an example: the electronic device is a notebook computer that includes four uniformly arranged microphones (mic1, mic2, mic3 and mic4), all four microphones are set on the side a5 of the notebook computer casing that surrounds the keyboard and faces the user, and the speaker in the electronic device emits two kinds of signals.
  • the abscissa of the characteristic heat map adopts the time dimension and represents the number of frames, in milliseconds (ms). The duration of one frame is greater than or equal to the sum of the minimum transmission duration of a complete waveform of the first transmitted signal and the minimum transmission duration of a complete waveform of the second transmitted signal.
  • the minimum transmission duration of a complete waveform of the first transmitted signal is t16 shown in FIG. 6A, which is about 10 ms.
  • the minimum transmission duration of a complete waveform of the second transmitted signal is t26 shown in FIG. 6B, which is about 10 ms.
  • the duration of one frame is greater than or equal to t16+t26, which is about 20ms (10ms+10ms).
  • after the first transmitted signal ends, the speaker may directly transmit the second transmitted signal. Then, the duration of one frame is equal to the sum of the minimum transmission duration of a complete waveform of the first transmitted signal and the minimum transmission duration of a complete waveform of the second transmitted signal.
  • alternatively, the speaker transmits the second transmitted signal after an interval has elapsed. Then, the duration of one frame is equal to the sum of the minimum transmission duration of a complete waveform of the first transmitted signal, the minimum transmission duration of a complete waveform of the second transmitted signal, and the interval duration.
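The frame duration in both cases (with and without an interval between the two signals) can be worked through with a short sketch. The function name `frame_duration_ms` is hypothetical; the 10 ms values are the approximate durations of t16 and t26 given above, and the 5 ms interval is an assumed example.

```python
def frame_duration_ms(t_first_ms, t_second_ms, interval_ms=0.0):
    """Duration of one frame: both complete waveforms plus an optional interval."""
    if interval_ms < 0:
        raise ValueError("interval must not be negative")
    return t_first_ms + t_second_ms + interval_ms


back_to_back = frame_duration_ms(10, 10)                   # second signal follows directly
with_interval = frame_duration_ms(10, 10, interval_ms=5)   # assumed 5 ms interval
```

With t16 and t26 both at about 10 ms, a frame lasts about 20 ms when the signals follow each other directly, and longer by exactly the interval otherwise.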
  • the ordinate of the characteristic heat map adopts the feature dimension, and the number N on the ordinate represents the Nth channel impulse response (CIR) feature
  • the CIR feature is a dimensionless quantity that can represent the actual distance between the microphone and the user's hand. The smaller the number on the ordinate, the smaller the actual distance.
  • each microphone may include two channels, and each channel may collect a reflection signal, that is, one channel may collect the first reflection signal, and the other channel may collect the second reflection signal.
  • each channel is divided into 64 dimensions, that is, one channel in each microphone reflects the change of the collected first reflected signal in 64 dimensions on the ordinate, and the other channel in each microphone reflects the change of the collected second reflected signal in another 64 dimensions on the ordinate.
  • the total length of the CIR feature on the ordinate is 512.
  • the numbers 1-64 and 65-128 on the ordinate represent the two channels of mic1, that is, one channel of mic1 collects the first reflected signal, and the other channel of mic1 collects the second reflected signal.
  • Numbers 129-192 and 193-256 on the ordinate represent two channels of mic2, that is, one channel of mic2 collects the first reflected signal, and the other channel of mic2 collects the second reflected signal.
  • Numbers 257-320 and 321-384 on the ordinate represent two channels of mic3, that is, one channel of mic3 collects the first reflected signal, and the other channel of mic3 collects the second reflected signal.
  • the numbers 385-448 and 449-512 on the ordinate represent the two channels of mic4, that is, one channel of mic4 collects the first reflected signal, and the other channel of mic4 collects the second reflected signal.
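The layout of the 512-point ordinate (4 microphones, 2 channels each, 64 dimensions per channel) can be expressed as a small index mapping. The function name `locate` and the constant names are hypothetical; the numbers follow the ranges listed above.

```python
NUM_MICS = 4
CHANNELS_PER_MIC = 2
DIMS_PER_CHANNEL = 64
TOTAL = NUM_MICS * CHANNELS_PER_MIC * DIMS_PER_CHANNEL   # 512


def locate(ordinate):
    """Map a 1-based ordinate number to (microphone 1-4, channel 1-2, dimension 1-64).
    Channel 1 holds the first reflected signal, channel 2 the second."""
    if not 1 <= ordinate <= TOTAL:
        raise ValueError("ordinate must be between 1 and 512")
    mic, rest = divmod(ordinate - 1, CHANNELS_PER_MIC * DIMS_PER_CHANNEL)
    channel, dim = divmod(rest, DIMS_PER_CHANNEL)
    return mic + 1, channel + 1, dim + 1
```

For instance, ordinate 65 is the first dimension of the second channel of mic1, and ordinate 385 is the first dimension of the first channel of mic4.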
  • the coordinate points of the characteristic heat map adopt the color dimension (such as black, white and gray), and the coordinate points represent the size of the CIR feature, which can represent the energy of the reflected signal.
  • the whiter the color of the coordinate point, the stronger the energy of the reflected signal and the closer the actual distance between the microphone and the user's hand.
  • when the color of the coordinate point is black or grayer, the user's hand is in a stationary state or the actual distance between the microphone and the user's hand is farther.
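One common way to obtain a CIR-like feature is to cross-correlate the microphone samples with the emitted waveform; the application does not state how its CIR feature is computed, so the following is only a sketch under that assumption. The function name `cir_feature`, the random stand-in waveform and the 17-sample delay are illustrative.

```python
import numpy as np


def cir_feature(received, reference, taps=64):
    """Magnitude of the cross-correlation between the microphone samples and
    the emitted signal at lags 0 .. taps-1, used here as a stand-in CIR estimate."""
    corr = np.correlate(received, reference, mode="full")
    zero_lag = len(reference) - 1
    return np.abs(corr[zero_lag:zero_lag + taps])


rng = np.random.default_rng(0)
reference = rng.standard_normal(128)          # stand-in for one emitted waveform
delay = 17                                    # assumed echo delay in samples
received = np.zeros(len(reference) + 64)
received[delay:delay + len(reference)] += 0.5 * reference

feature = cir_feature(received, reference)    # one column of a heat map
```

The strongest entry of `feature` sits at the echo delay, so a nearer hand (shorter delay) lights up a smaller index, matching the reading of the ordinate and the color scale described above.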
  • FIG. 9A is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand approaching the electronic device
  • FIG. 9B is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand moving away from the electronic device
  • FIG. 9C is a schematic diagram of the characteristic heat map, provided by an embodiment of the present application, corresponding to a non-contact gesture in which the user's hand first approaches the electronic device and then moves away from the electronic device.
  • the dotted line is used to indicate that there is a certain distance between the user's hand and the laptop computer, that is, the non-contact gesture is an air gesture.
  • the user's hand gradually approaches the laptop along the direction of ray AB; this non-contact gesture is called a push action.
  • the user's hand gradually moves away from the laptop along the direction of ray BA; this non-contact gesture is called a pull action.
  • the electronic device can determine that: the user's hand performs the pushing action first, and then the pulling action.
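A very simplified way to read a push or pull out of the heat map is to follow the strongest CIR dimension over time: a falling index means the hand is getting closer, a rising index means it is moving away. This is only an illustrative heuristic with hypothetical names; the application assigns this decision to the model reasoning module, and a real signal would also contain static reflections that this sketch ignores.

```python
import numpy as np


def push_or_pull(heatmap):
    """heatmap has shape (dimensions, frames) for one microphone channel.
    Compare the strongest dimension in the first and last frame."""
    strongest = np.argmax(heatmap, axis=0)     # strongest dimension per frame
    trend = int(strongest[-1]) - int(strongest[0])
    if trend < 0:
        return "push"      # energy moves to smaller ordinates: hand approaches
    if trend > 0:
        return "pull"      # energy moves to larger ordinates: hand moves away
    return "static"


# Synthetic heat map whose bright spot moves from dimension 40 down to 5.
approach = np.zeros((64, 5))
for frame_idx, dim in enumerate([40, 30, 20, 10, 5]):
    approach[dim, frame_idx] = 1.0
```

Reversing the frame order of `approach` turns the synthetic push into a pull, which mirrors the push-then-pull pattern of FIG. 9C when the two halves are placed one after the other.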
  • FIG. 10A is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand swiping downward
  • FIG. 10B is a schematic diagram of the characteristic heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping downward
  • FIG. 10C is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand swiping upward
  • FIG. 10D is a schematic diagram of the characteristic heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping upward.
  • the dotted line is used to indicate that there is a certain distance between the user's hand and the notebook computer, that is, the non-contact gesture is an air gesture.
  • the user's hand swipes downward along the direction of ray CD; this non-contact gesture is called a downward swipe.
  • the user's hand swipes upward along the direction of ray DC; this non-contact gesture is called an upward swipe.
  • the electronic device may determine that the user's hand performs a downward swipe motion.
  • the electronic device may determine that the user's hand performs an up-swing motion.
  • FIG. 11A is a schematic diagram, provided by an embodiment of the present application, of a scene in which the non-contact gesture is the user's hand swiping to the left.
  • FIG. 11B is the feature heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping to the left.
  • FIG. 11C is a schematic diagram, provided by an embodiment of the present application, of a scene in which the non-contact gesture is the user's hand swiping to the right.
  • FIG. 11D is the feature heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping to the right.
  • The dotted line indicates that there is a certain distance between the user's hand and the laptop, that is, the non-contact gesture is an air gesture.
  • The user's hand swipes to the left along ray EF; this non-contact gesture is called a left swipe.
  • The user's hand swipes to the right along ray FE; this non-contact gesture is called a right swipe.
  • In the channels of mic1 (ordinate values 1-64), mic2 (ordinate values 129-192), mic3 (ordinate values 257-320), and mic4, the coordinate points turn white in turn, indicating that the energy of the reflected signal increases in turn.
  • The electronic device may determine that the user's hand performs a left swipe.
  • In the channels of mic1 (ordinate values 1-64), mic2 (ordinate values 129-192), mic3 (ordinate values 257-320), and mic4, the coordinate points turn black in turn, indicating that the energy of the reflected signal decreases in turn.
  • The electronic device may determine that the user's hand performs a right swipe.
  • The electronic device can determine the gesture category of the non-contact gesture by combining the arrangement position of the microphones with the feature heat map.
  • The arrangement position of the microphones mentioned in this application can be understood as follows: when there is one microphone, it is the layout of that microphone in the electronic device; when there are at least two microphones, it is the layout of those microphones in the electronic device.
  • The feature extraction module 25 can extract the distance feature and the velocity feature of the reflected signal from the feature heat map.
  • FIG. 12 is a flowchart of an electronic device determining a distance feature and a velocity feature according to an embodiment of the present application.
  • The process by which the feature extraction module 25 extracts the distance feature and the velocity feature may include the following steps:
  • The feature extraction module 25 can synchronize the first transmitted signal with the first reflected signal so that the electronic device obtains the starting position of the first reflected signal, which facilitates the subsequent estimation of the impulse response of the propagation channel.
  • The feature extraction module 25 can synchronize the second transmitted signal with the second reflected signal so that the electronic device obtains the starting position of the second reflected signal, which facilitates the subsequent estimation of the impulse response of the propagation channel.
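One common way to locate the start of a reflected signal is to slide the known transmitted waveform across the microphone capture and keep the offset with the largest correlation. The sketch below assumes that approach; the source does not say how module 25 performs the synchronization, and the probe sequence, the simulated capture, and the name `find_reflection_start` are all hypothetical.

```python
def find_reflection_start(transmitted, captured):
    """Return the sample offset at which `transmitted` best matches `captured`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(captured) - len(transmitted) + 1):
        # Correlation of the probe with the capture window starting at `lag`.
        score = sum(t * captured[lag + i] for i, t in enumerate(transmitted))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical probe with a sharp autocorrelation peak (length-7 Barker code).
probe = [1, 1, 1, -1, -1, 1, -1]
# Simulated capture: 5 silent samples, an attenuated echo, then silence again.
capture = [0.0] * 5 + [0.5 * s for s in probe] + [0.0] * 4

start = find_reflection_start(probe, capture)
```

A waveform with good autocorrelation matters here because its correlation peak is much higher than its sidelobes, so the starting position stands out clearly.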
  • The feature extraction module 25 uses the good autocorrelation characteristics of the first reflected signal to estimate the impulse response of the propagation channel: the signal is processed with a discrete Fourier transform (DFT) matrix and matched-filtered, and the spectrum of the matched filter's output signal is then subjected to an inverse discrete Fourier transform (IDFT).
  • The feature extraction module 25 uses the good autocorrelation characteristics of the second reflected signal to estimate the impulse response of the propagation channel: the signal is processed with a discrete Fourier transform (DFT) matrix and matched-filtered, and the spectrum of the matched filter's output signal is then subjected to an inverse discrete Fourier transform (IDFT).
  • The feature extraction module 25 can output channel impulse response (CIR) features, that is, the feature extraction module 25 can draw a feature heat map from the first reflected signal and the second reflected signal, and the feature heat map is characterized by the CIR features.
  • The CIR feature includes the CIR of the non-contact gesture and the CIR of the environment surrounding the electronic device.
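A minimal sketch of frequency-domain matched filtering follows: take the DFT of the transmitted and received frames, multiply the received spectrum by the conjugate of the transmitted spectrum, and apply the IDFT to obtain one magnitude per delay tap. This is a textbook formulation offered as an assumption, not necessarily the exact processing chain of module 25; the m-sequence probe, the two simulated echoes, and the function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * k * m / n) for m in range(n)) / n
            for k in range(n)]


def estimate_cir(tx_frame, rx_frame):
    """Matched filter in the frequency domain; returns one magnitude per delay tap."""
    tx_spec, rx_spec = dft(tx_frame), dft(rx_frame)
    matched = [r * t.conjugate() for r, t in zip(rx_spec, tx_spec)]
    return [abs(v) for v in idft(matched)]


# Hypothetical probe: a length-7 m-sequence (circular autocorrelation 7 at lag 0, -1 elsewhere).
probe = [1, 1, 1, -1, 1, -1, -1]
n = len(probe)
# Simulated frame: a strong path delayed by 1 sample plus a weaker echo delayed by 3 samples.
received = [probe[(k - 1) % n] + 0.5 * probe[(k - 3) % n] for k in range(n)]

cir = estimate_cir(probe, received)
strongest_taps = sorted(range(n), key=lambda d: cir[d], reverse=True)[:2]
```

Stacking such CIR vectors frame after frame, with delay on one axis and time on the other, gives a two-dimensional array of the kind a heat map can display.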
  • The feature extraction module 25 takes the difference of the CIR feature along the time dimension to obtain a differential channel impulse response (dCIR) feature that captures relative motion.
  • The dCIR feature can represent the distance change of the non-contact gesture.
  • The influence of the static environment around the electronic device is removed. The dCIR feature is therefore the distance feature.
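The differencing step can be sketched as subtracting each CIR frame from the next one, so that echoes from static objects cancel and only taps touched by the moving hand remain. The frame layout (a list of per-tap magnitudes) and the simulated values below are assumptions made for illustration.

```python
def differential_cir(cir_frames):
    """Difference consecutive CIR frames along the time dimension."""
    return [[cur - prev for cur, prev in zip(frame, previous)]
            for previous, frame in zip(cir_frames, cir_frames[1:])]


# Hypothetical static echo (for example, a desk) that sits at tap 1 in every frame.
static_environment = [0.0, 4.0, 0.0, 0.0, 0.0, 0.0]

# A hand echo that moves one tap closer in each frame: tap 5, then 4, then 3.
frames = []
for hand_tap in (5, 4, 3):
    frame = list(static_environment)
    frame[hand_tap] += 1.0
    frames.append(frame)

dcir = differential_cir(frames)
```

In `dcir`, tap 1 is zero in every row because the static echo cancels, while the nonzero entries follow the hand from tap 5 toward tap 3, which is the distance change over time.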
  • The feature extraction module 25 applies a Fourier transform (such as a fast Fourier transform, FFT) to the dCIR feature along the time dimension over a longer period of time to obtain the Doppler feature for that period.
  • The Doppler feature can represent the speed change of the non-contact gesture. The Doppler feature is therefore the velocity feature.
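As a rough illustration of how a Fourier transform over time yields velocity, the sketch below takes the history of one delay tap across frames, finds the dominant Doppler frequency, and converts it to radial velocity with the standard two-way relation v = f_d * c / (2 * f_c). The frame rate, the carrier frequency, the speed of sound, and the complex-valued tap history are assumptions chosen for the example, not values from the source.

```python
import cmath

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at about 20 degrees C
CARRIER_HZ = 20000.0        # hypothetical ultrasonic carrier frequency
FRAME_RATE_HZ = 100.0       # hypothetical number of CIR frames per second


def doppler_peak_hz(tap_history, frame_rate_hz):
    """Return the signed frequency of the strongest DFT bin of one tap over time."""
    n = len(tap_history)
    magnitudes = [abs(sum(s * cmath.exp(-2j * cmath.pi * k * m / n)
                          for k, s in enumerate(tap_history)))
                  for m in range(n)]
    peak = max(range(n), key=magnitudes.__getitem__)
    if peak > n // 2:  # bins above n/2 represent negative frequencies
        peak -= n
    return peak * frame_rate_hz / n


def radial_velocity_m_s(doppler_hz):
    """Two-way (reflection) Doppler relation."""
    return doppler_hz * SPEED_OF_SOUND_M_S / (2.0 * CARRIER_HZ)


# Simulated tap whose phase rotates at 25 Hz, observed for 16 frames.
tap_history = [cmath.exp(2j * cmath.pi * 25.0 * k / FRAME_RATE_HZ) for k in range(16)]

shift_hz = doppler_peak_hz(tap_history, FRAME_RATE_HZ)
velocity = radial_velocity_m_s(shift_hz)
```

A longer observation window gives finer frequency resolution (frame rate divided by the number of frames), which is one reason to transform over a longer period.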
  • In this way, the feature extraction module 25 can extract the distance feature and the velocity feature of the reflected signal.
  • The model inference module 26 can recognize the gesture category of the non-contact gesture from the distance feature and the velocity feature in combination with the arrangement position of the microphones.
  • The model inference module 26 may receive the distance feature and the velocity feature from the feature extraction module 25.
  • The feature extraction module 25 can send the distance feature and the velocity feature for a fixed number of frames to the model inference module 26 so that the model inference module 26 can recognize a complete non-contact gesture.
  • The fixed number of frames is related to the execution duration of the non-contact gesture.
  • The fixed number of frames is also related to how quickly the electronic device reacts when recognizing the non-contact gesture.
  • The present application does not limit the specific value of the fixed number of frames. For example, the fixed number of frames is 128.
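One simple way to hand a fixed number of frames to an inference stage is a sliding window that emits a full batch once enough frames have arrived. The sketch below uses the example value of 128 from the text; the class name `FrameWindow` and the sliding (rather than non-overlapping) behavior are assumptions.

```python
from collections import deque


class FrameWindow:
    """Keeps the most recent `size` feature frames."""

    def __init__(self, size=128):
        self.size = size
        self._frames = deque(maxlen=size)  # the oldest frame is dropped automatically

    def push(self, frame):
        """Add a frame; return the full window once `size` frames are available."""
        self._frames.append(frame)
        if len(self._frames) == self.size:
            return list(self._frames)
        return None


window = FrameWindow(size=128)
results = [window.push(i) for i in range(129)]  # integers stand in for feature frames
```

A larger window covers slower gestures but delays the first decision, which reflects the trade-off between gesture duration and reaction rate noted above.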
  • The model inference module 26 can input the distance feature and the velocity feature into the deep learning classification model of the gesture recognition algorithm for inference and obtain the gesture category of the non-contact gesture.
  • The deep learning classification model of the gesture recognition algorithm can be a model such as a temporal convolutional network (TCN).
  • A TCN is a variant of the convolutional neural network that performs sequence modeling tasks by combining aspects of the recurrent neural network (RNN) and convolutional neural network (CNN) architectures.
  • The TCN model is based on the CNN model with the following improvements: 1. to suit sequence modeling, causal convolution; 2. to remember history, dilated (atrous) convolution and residual blocks. As with an RNN model, the TCN architecture can take a sequence of arbitrary length and map it to an output sequence of the same length. The TCN model thus has a very long effective history, which the network achieves by combining residual layers with dilated convolutions.
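To make causal and dilated convolution concrete, the sketch below implements a single-channel causal dilated convolution in plain Python and computes the receptive field of a stack of such layers. It illustrates the general TCN building block only; the kernel values, the dilation schedule, and the function names are hypothetical and are not taken from the model in the source.

```python
def causal_dilated_conv(x, kernel, dilation):
    """Output at time t depends only on x[t], x[t - d], x[t - 2d], ... (never on the future)."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:  # positions before the start act as zero padding
                acc += weight * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of past samples visible to one output of the stacked layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = causal_dilated_conv(impulse, kernel=[1.0, 1.0], dilation=2)
history = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

The output has the same length as the input, and doubling the dilation at each layer makes the visible history grow exponentially with depth, which is how a fairly shallow stack can cover a long gesture.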
  • FIG. 13 is a flowchart of an electronic device determining the gesture category of a non-contact gesture according to an embodiment of the present application.
  • The inference process of the TCN model may include the following steps:
  • The model inference module 26 inputs the distance feature and the velocity feature into the gesture classification model, which can output a classification result for the non-contact gesture.
  • The gesture classification model can adopt a multi-layer network model (multiple convolution layers are used for illustration in FIG. 13). Specifically, a CNN model fuses multiple feature matrices, which then pass through multiple layers of causal convolution joined by residual connections; AutoML is used to search for and optimize the network structure; and the classification results for the various non-contact gestures are obtained after training on multiple training sets.
  • The classification result may be expressed as a coarse-grained gesture category of the non-contact gesture.
  • Classification results can be divided into palm motion, finger motion, and hand motion.
  • Classification results can also be divided into hovering, waving, and changes in the number of fingers.
  • The model inference module 26 inputs the distance feature and the velocity feature into the gesture tracking model, which can output the movement region of the non-contact gesture (for example, as 3D coordinates).
  • The gesture tracking model can adopt a multi-layer network model (multiple convolution layers are used for illustration in FIG. 13). Specifically, a depth camera or any other device capable of detecting hand position records the coordinates for each gesture action, and the gesture speed can be calculated from the coordinates. The network model is then designed using these labeled data (such as gesture category, coordinates, and speed) and trained so that it performs classification and predicts the coordinates and movement speed of the hand.
  • The electronic device controls the gesture range according to the coordinates of the hand, and performs coarse gesture classification according to the speed information of the gesture.
  • Gesture data that is not recognized successfully does not enter the gesture classification model for a second judgment and is output directly as the "other" category.
  • Gesture data that is recognized successfully can be judged again by the gesture classification model. These operations help reduce false triggers in the interactive response and improve the classification accuracy of non-contact gestures.
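The gating described above can be sketched as a two-stage decision: the tracking stage runs first, and only data it recognizes is passed to the classifier. Both models are replaced here by stand-in functions; their names, the energy threshold, the region tuple, and the returned labels are hypothetical.

```python
def recognise(features, tracker, classifier):
    """Return (label, region); data the tracker rejects is labelled "other"."""
    region = tracker(features)
    if region is None:
        return "other", None  # skip the second judgment entirely
    return classifier(features), region


def toy_tracker(features):
    # Stand-in for the gesture tracking model: accept only sufficiently strong input.
    return (0.0, 0.1, 0.3) if max(features) >= 0.5 else None


def toy_classifier(features):
    # Stand-in for the gesture classification model.
    return "swipe_left"


weak = recognise([0.1, 0.2], toy_tracker, toy_classifier)
strong = recognise([0.9, 0.4], toy_tracker, toy_classifier)
```

Rejecting early keeps ambient motion from ever reaching the classifier, which is the mechanism by which a second check can reduce false triggers.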
  • The gesture classification model and the gesture tracking model can also implement the corresponding functions in other ways.
  • The gesture classification model and the gesture tracking model can adopt multi-layer network models with the same structure but different parameters.
  • The model inference module 26 performs fusion inference on the classification result and the movement region of the non-contact gesture to obtain a fine-grained gesture category. In summary, the model inference module 26 can identify the gesture category of the non-contact gesture from the distance feature and the velocity feature, using the gesture classification model and the gesture tracking model in combination with the arrangement position of the microphones.
  • Through this double verification, the model inference module 26 can determine whether a non-contact gesture is present and identify its gesture category.
  • The reaction identification module 27 may send an interaction instruction to the target application 28 according to the target application and the gesture category of the non-contact gesture.
  • The reaction identification module 27 may first determine the interactive response corresponding to the non-contact gesture according to the target application and the gesture category of the non-contact gesture, and then send an interaction instruction for that interactive response to the target application 28.
  • The present application does not limit the specific implementation of the interactive response.
  • The interactive response may be a response to the non-contact gesture such as paging up or down, paging left or right, increasing or decreasing the volume, switching videos, making a call, taking a screenshot, capturing a scrolling screenshot, recording the screen, switching applications, switching to the home screen, switching to the minus-one screen, or selecting among multiple processes.
  • The interactive response can also be represented by keyboard control events, mouse click events, touch events on segmented screen pixels, and the like, so that the target application 28 can perform the interactive response corresponding to the non-contact gesture according to these events.
  • The reaction identification module 27 can determine the interactive response of the target application 28 to the non-contact gesture from the correspondence among application identifiers, gesture categories, and interactive responses, together with the target application and the gesture category of the non-contact gesture.
  • The present application does not limit the specific implementation of the correspondence.
  • The correspondence may be stored in the electronic device as a table, a matrix, an array, key-value pairs, or the like. The reaction identification module 27 can therefore determine the interaction instruction for the interactive response and send it to the target application 28.
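A key-value representation of the correspondence could look like the sketch below, where the key is an (application identifier, gesture category) pair and the value is the interactive response. The application identifiers, gesture labels, and response names are invented for illustration and do not come from the source.

```python
# Hypothetical correspondence: (application id, gesture category) -> interactive response.
RESPONSES = {
    ("document_viewer", "swipe_up"): "page_up",
    ("document_viewer", "swipe_down"): "page_down",
    ("video_player", "swipe_up"): "volume_up",
    ("video_player", "swipe_down"): "volume_down",
}


def interaction_for(app_id, gesture_category):
    """Look up the response for this application and gesture; None if no entry exists."""
    return RESPONSES.get((app_id, gesture_category))


in_viewer = interaction_for("document_viewer", "swipe_up")
in_player = interaction_for("video_player", "swipe_up")
unmapped = interaction_for("video_player", "push")
```

Keying on the application as well as the gesture lets the same swipe turn a page in one application and change the volume in another.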
  • The electronic device can thus control the target application 28 to respond to the non-contact gesture, that is, the target application 28 can perform the interactive response corresponding to the non-contact gesture.
  • The electronic device can promptly and accurately control the target application to execute the interactive response corresponding to the non-contact gesture, so that it accurately and quickly implements human-computer interaction through non-contact gestures in the target application.
  • The present application provides an electronic device, including a memory and a processor; the memory is used to store program instructions; the processor is used to call the program instructions in the memory so that the electronic device executes the non-contact gesture control method in the foregoing embodiments.
  • The present application provides a chip system, which is applied to an electronic device including a memory, a display screen, and a sensor; the chip system includes a processor; when the processor executes the computer instructions stored in the memory, the electronic device executes the non-contact gesture control method in the foregoing embodiments.
  • The present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the electronic device implements the non-contact gesture control method in the foregoing embodiments.
  • The present application provides a computer program product, including execution instructions stored in a readable storage medium; at least one processor of the electronic device can read the execution instructions from the readable storage medium, and execution of the instructions by the at least one processor enables the electronic device to implement the non-contact gesture control method in the foregoing embodiments.
  • All or part of the functions may be implemented by software, hardware, or a combination of software and hardware.
  • When implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the present application are generated in whole or in part.
  • The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored on a computer-readable storage medium.
  • The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that integrates one or more available media.
  • The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid state disk, SSD), or the like.
  • The processes can be completed by a computer program instructing the related hardware.
  • The program can be stored in a computer-readable storage medium.
  • When executed, the program may include the processes of the foregoing method embodiments.
  • The aforementioned storage medium includes ROM, random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Abstract

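A non-contact gesture control method and an electronic device (100). The method is applied to the electronic device (100), which includes at least one speaker (23) and at least one microphone (24). The method includes: displaying a target application (28), the target application (28) being an application that supports non-contact gesture recognition (S101); starting the at least one speaker (23) to emit an autocorrelated ultrasonic signal (S102); starting the at least one microphone (24) to collect a reflected signal produced when the ultrasonic signal is reflected by a non-contact gesture (S103); and controlling, according to the target application (28) and the reflected signal, the target application (28) to respond to the non-contact gesture (S104). The electronic device (100) can thus accurately implement human-computer interaction through non-contact gestures in the target application (28) without being constrained by ambient light, component cost, component material, and the like, which improves the user's experience of the electronic device (100) and the target application (28) and reduces the cost of controlling the target application (28) to respond to non-contact gestures.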

Description

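Non-contact gesture control method and electronic device
This application claims priority to Chinese Patent Application No. 202111194277.9, filed with the China National Intellectual Property Administration on October 13, 2021 and entitled "Non-contact gesture control method and electronic device", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of electronic technology, and in particular to a non-contact gesture control method and an electronic device.
Background
Non-contact gestures can be used for human-computer interaction with an electronic device. They let a user operate the device from a distance with non-contact gestures (that is, air gestures) to perform operations such as answering a call, browsing a web page, turning off music, or taking a screenshot.
At present, electronic devices often recognize non-contact gestures from captured images so that users can control applications with them. However, image capture needs a bright light source; under dim light the electronic device cannot recognize non-contact gestures accurately, so the user cannot use them to control applications, which degrades the user experience.
How an electronic device can accurately recognize non-contact gestures and perform the corresponding operations is therefore a problem in urgent need of a solution.
Summary
The present application provides a non-contact gesture control method and an electronic device that enable the electronic device to accurately implement human-computer interaction through non-contact gestures in a target application, improving the user's experience of the electronic device and the target application.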
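In a first aspect, the present application provides a non-contact gesture control method applied to an electronic device that includes at least one speaker and at least one microphone.
The method includes:
displaying a target application, the target application being an application that supports non-contact gesture recognition;
starting the at least one speaker to emit an autocorrelated ultrasonic signal;
starting the at least one microphone to collect a reflected signal produced when the ultrasonic signal is reflected by a non-contact gesture; and
controlling, according to the target application and the reflected signal, the target application to respond to the non-contact gesture.
Here, autocorrelation refers to the dependence between the instantaneous value of a signal at one moment and its instantaneous value at another moment; it is a time-domain description of a signal. In some embodiments, when the autocorrelation coefficient of a signal is greater than or equal to a preset value, the signal may be called an autocorrelated signal, that is, the signal has good autocorrelation characteristics. The present application does not limit the preset value.
With the non-contact gesture control method provided by the first aspect, the electronic device can use an autocorrelated ultrasonic signal, together with existing hardware such as the speaker and microphone and software such as the gesture recognition algorithm, to accurately implement human-computer interaction through non-contact gestures in the target application. The electronic device can therefore accurately control the target application to respond to non-contact gestures even in dim or dark environments, which improves the user's experience of the electronic device and the target application. No additional or expensive components need to be added to the electronic device, which reduces the cost of controlling the target application to respond to non-contact gestures.
In a possible design, when the ultrasonic signal includes one kind of signal, that signal is an autocorrelated signal whose frequency range lies within a first range;
or, when the ultrasonic signal includes two kinds of signals, both signals are autocorrelated signals whose frequency ranges lie within the first range, the two signals are emitted in a time-division manner, and the two signals have the same frequency range and opposite rates of frequency change;
where the first range is related to the sampling rate of the microphone and to the ultrasonic frequency range.
The speaker can thus emit the ultrasonic signal over a wide coverage area, which enlarges the gesture recognition range of the electronic device and improves recognition performance, so that the electronic device can recognize non-contact gestures accurately and implement human-computer interaction through them precisely.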
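As an illustration of two signals with the same frequency range and opposite rates of frequency change, the sketch below generates a linear up-chirp and a linear down-chirp and concatenates them for time-division emission. Linear chirps are only one possible choice of waveform. The 48 kHz sampling rate, the 20 kHz lower bound, the 10 ms duration, and the rule used in `first_range` (from the ultrasonic floor up to half the sampling rate) are assumptions, since the source only says that the first range is related to the sampling rate and the ultrasonic frequency range.

```python
import math


def first_range(sample_rate_hz, ultrasonic_floor_hz=20000.0):
    """Assumed rule: from the ultrasonic floor up to the Nyquist frequency."""
    return ultrasonic_floor_hz, sample_rate_hz / 2.0


def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz):
    """Sine sweep whose instantaneous frequency moves linearly from start to end."""
    n = round(duration_s * sample_rate_hz)
    rate = (f_end_hz - f_start_hz) / duration_s  # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate_hz
        samples.append(math.sin(2.0 * math.pi * (f_start_hz * t + 0.5 * rate * t * t)))
    return samples


SAMPLE_RATE_HZ = 48000  # hypothetical
low_hz, high_hz = first_range(SAMPLE_RATE_HZ)

up_chirp = linear_chirp(low_hz, high_hz, 0.01, SAMPLE_RATE_HZ)    # first signal
down_chirp = linear_chirp(high_hz, low_hz, 0.01, SAMPLE_RATE_HZ)  # second signal, opposite sweep
time_division_frame = up_chirp + down_chirp                       # emitted one after the other
```

With two speakers, or the two channels of one speaker, each chirp could be routed to its own output while keeping the same alternation in time.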
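In a possible design, starting the at least one speaker to emit the ultrasonic signal includes:
when the electronic device includes one speaker, starting the two channels of that speaker to emit the two signals in a time-division manner;
when the electronic device includes two speakers, starting the two speakers to emit the two signals in a time-division manner.
The electronic device is thus designed for different numbers of speakers, so that any such configuration can emit an ultrasonic signal containing the two signals, which broadens the applicability and practicality of electronic devices that recognize non-contact gestures.
In a possible design, the lowest value of the microphone's frequency response lies within a second range, and the second range is used to ensure that the microphone can receive the reflected signal.
The microphone can thus collect as much of the reflected signal as possible, which improves the accuracy with which the electronic device recognizes non-contact gestures. No additional or expensive components are needed, which saves cost.
In a possible design, controlling the target application to respond to the non-contact gesture according to the target application and the reflected signal includes:
extracting a distance feature and a velocity feature of the reflected signal, where the distance feature represents the change in distance between the position of the non-contact gesture and the microphone, and the velocity feature represents the speed change of the non-contact gesture;
determining the gesture category of the non-contact gesture according to the distance feature and the velocity feature; and
controlling the target application to respond to the non-contact gesture according to the target application and the gesture category of the non-contact gesture.
The electronic device can thus use features of the reflected signal to characterize the non-contact gesture and recognize it accurately, which helps reduce false triggers in the target application and lowers the cost of recognizing non-contact gestures.
In a possible design, when the electronic device includes one microphone, the electronic device further includes a shield for the microphone, and the shield is used to adjust the propagation direction and/or the propagation amount of the ultrasonic signal so that the electronic device can distinguish non-contact gestures in different directions;
when the electronic device includes at least two microphones, the maximum distance between the microphones is greater than a first threshold, and the first threshold is used to ensure that the at least two microphones differ in position relative to the same speaker.
The electronic device is thus designed for different numbers of microphones, so that any such configuration can receive the reflected signal produced when the ultrasonic signal is reflected by a non-contact gesture, which broadens the applicability and practicality of electronic devices that recognize non-contact gestures.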
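In a possible design, before starting the at least one speaker to emit the autocorrelated ultrasonic signal, the method further includes:
determining that a first switch of the electronic device is in a first state, where the first state of the first switch indicates that the electronic device has enabled the non-contact gesture recognition function.
The electronic device can thus also use whether the target application supports non-contact gesture recognition as the trigger condition for adaptively starting or stopping the non-contact gesture control method, so that it implements human-computer interaction through non-contact gestures in the target application flexibly. This improves the adaptability and flexibility of the electronic device and can also reduce its overall power consumption.
In a possible design, before starting the at least one speaker to emit the autocorrelated ultrasonic signal, the method further includes:
determining that the target application supports non-contact gesture recognition.
In a possible design, determining that the target application supports non-contact gesture recognition includes:
when an identifier of the target application is stored in a storage module of the electronic device, determining that the target application supports non-contact gesture recognition, where the storage module is used to store the identifiers of all applications that support non-contact gesture recognition.
This provides one possible way of determining the target application.
Or,
when a second switch of the target application is in a second state, determining that the target application supports non-contact gesture recognition, where the state of the second switch indicates whether the target application supports non-contact gesture recognition.
This provides another possible way of determining the target application and enriches the ways of doing so.
In a possible design, the method further includes:
when the electronic device is not plugged in or the remaining battery level of the electronic device is less than a second threshold, stopping the at least one speaker and the at least one microphone.
The electronic device can thus adaptively turn off the non-contact gesture recognition function of the electronic device or the target application, which safeguards the availability of the gesture recognition algorithm in the electronic device and reduces overall power consumption.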
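The preconditions described in these designs (the device-level first switch, target-application support determined by a stored identifier or by a per-application second switch, and the power condition) can be combined into a single check, as in the sketch below. The data class, the identifier strings, and the 20 percent value used for the second threshold are hypothetical; the source does not specify them.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    gesture_switch_on: bool  # first switch in the first state
    plugged_in: bool
    battery_percent: float


# Hypothetical contents of the storage module: identifiers of supporting applications.
SUPPORTED_APP_IDS = {"com.example.reader", "com.example.player"}


def app_supports_gestures(app_id, app_switch_on=None):
    """Use the per-application second switch when given, else the stored identifiers."""
    if app_switch_on is not None:
        return app_switch_on
    return app_id in SUPPORTED_APP_IDS


def should_start_sensing(state, app_id, second_threshold=20.0, app_switch_on=None):
    """True only if the speaker and microphone should be started."""
    if not state.gesture_switch_on:
        return False
    if not app_supports_gestures(app_id, app_switch_on):
        return False
    # Stop when the device is unplugged or the battery is below the threshold.
    if not state.plugged_in or state.battery_percent < second_threshold:
        return False
    return True


ready = should_start_sensing(DeviceState(True, True, 80.0), "com.example.reader")
low_battery = should_start_sensing(DeviceState(True, True, 10.0), "com.example.reader")
unplugged = should_start_sensing(DeviceState(True, False, 80.0), "com.example.reader")
unsupported = should_start_sensing(DeviceState(True, True, 80.0), "com.example.notes")
```

Checking the cheap conditions before starting any audio hardware means the speaker and microphone stay off whenever the feature cannot be used.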
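In a second aspect, the present application provides an electronic device, including a memory and a processor; the memory is used to store program instructions; the processor is used to call the program instructions in the memory so that the electronic device executes the non-contact gesture control method in the first aspect or any possible design of the first aspect.
In a third aspect, the present application provides a chip system applied to an electronic device including a memory, a display screen, and a sensor; the chip system includes a processor; when the processor executes the computer instructions stored in the memory, the electronic device executes the non-contact gesture control method in the first aspect or any possible design of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the electronic device implements the non-contact gesture control method in the first aspect or any possible design of the first aspect.
In a fifth aspect, the present application provides a computer program product, including execution instructions stored in a readable storage medium; at least one processor of the electronic device can read the execution instructions from the readable storage medium, and execution of the instructions by the at least one processor enables the electronic device to implement the non-contact gesture control method in the first aspect or any possible design of the first aspect.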
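Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 2 is a block diagram of the software structure of an electronic device provided by an embodiment of the present application;
FIG. 3A is a block diagram of the software and hardware structure with which an electronic device implements the non-contact gesture control method, provided by an embodiment of the present application;
FIG. 3B is a block diagram of the software and hardware structure with which an electronic device implements the non-contact gesture control method, provided by an embodiment of the present application;
FIG. 4 is a software block diagram of an electronic device implementing the non-contact gesture control method, provided by an embodiment of the present application;
FIG. 5A is a schematic diagram of the positions of the speakers and microphones when the electronic device is a laptop, provided by an embodiment of the present application;
FIG. 5B is a schematic diagram of the positions of the speakers and microphones when the electronic device is a mobile phone, provided by an embodiment of the present application;
FIG. 5C is a schematic diagram of the positions of the speakers and microphones when the electronic device is a PC, provided by an embodiment of the present application;
FIG. 6A is a schematic waveform diagram of a first transmitted signal provided by an embodiment of the present application;
FIG. 6B is a schematic waveform diagram of a second transmitted signal provided by an embodiment of the present application;
FIG. 7 is a schematic flowchart of a non-contact gesture control method provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of a non-contact gesture control method provided by an embodiment of the present application;
FIG. 9A is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand approaching the electronic device;
FIG. 9B is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand moving away from the electronic device;
FIG. 9C is a schematic diagram of the feature heat map, provided by an embodiment of the present application, for a non-contact gesture in which the user's hand first approaches the electronic device and then moves away from it;
FIG. 10A is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand swiping downward;
FIG. 10B is the feature heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping downward;
FIG. 10C is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand swiping upward;
FIG. 10D is the feature heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping upward;
FIG. 11A is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand swiping to the left;
FIG. 11B is the feature heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping to the left;
FIG. 11C is a schematic diagram of a scene, provided by an embodiment of the present application, in which the non-contact gesture is the user's hand swiping to the right;
FIG. 11D is the feature heat map, provided by an embodiment of the present application, corresponding to the non-contact gesture of the user's hand swiping to the right;
FIG. 12 is a flowchart of an electronic device determining a distance feature and a velocity feature, provided by an embodiment of the present application;
FIG. 13 is a flowchart of an electronic device determining the gesture category of a non-contact gesture, provided by an embodiment of the present application.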
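Detailed Description
In the present application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of single items or plural items. For example, at least one of a, b, or c can mean: a alone, b alone, c alone, a and b, a and c, b and c, or a, b, and c, where each of a, b, and c may be single or multiple. In addition, the terms "first" and "second" are used for description only and should not be understood as indicating or implying relative importance. Orientations or positional relationships indicated by terms such as "center", "longitudinal", "transverse", "up", "down", "left", "right", "front", and "rear" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present application and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore should not be understood as limiting the present application.
The present application provides a non-contact gesture control method, an electronic device, a computer-readable storage medium, and a computer program product. Using an ultrasonic signal with good autocorrelation characteristics, together with existing hardware such as the speaker and microphone and software such as the gesture recognition algorithm, the electronic device can accurately implement human-computer interaction through non-contact gestures in the target application. It can therefore accurately control the target application to respond to non-contact gestures even in dim or dark environments, which improves the user's experience of the electronic device and the target application. No additional or expensive components need to be added to the electronic device, which reduces the cost of controlling the target application to respond to non-contact gestures.
The application scenarios of the above method may include, but are not limited to, office scenarios, entertainment scenarios, and function scenarios.
In an office scenario, the user can use non-contact gestures such as swiping up and down or left and right to read documents in formats such as PDF, Word, and WPS, with fast interactive responses such as paging up and down. The user can also use such gestures to present documents in formats such as PPT, with fast interactive responses such as paging up and down.
In an entertainment scenario, the user can use non-contact gestures such as swiping up and down while playing audio or video, with fast interactive responses such as increasing or decreasing the volume. The user can also use non-contact gestures such as swiping left and right to browse photos in the gallery, with fast interactive responses such as paging up and down. The user can also use non-contact gestures such as swiping up and down during a video call, with fast interactive responses such as increasing or decreasing the volume.
In a function scenario, the user can use non-contact gestures such as swiping up and down, opening the palm, clenching the palm, or swiping left and right to quickly take a screenshot or select among multiple processes, with fast interactive responses.
The electronic device can be a mobile phone (such as a foldable phone or a large-screen phone), a tablet computer, a laptop, a wearable device, an in-vehicle device, an augmented reality (AR)/virtual reality (VR) device, a personal computer (PC), an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart TV, a smart screen, a high-definition TV, a 4K TV, a smart speaker, a smart projector, or a similar device. The present application places no restriction on the specific type of the electronic device.
The electronic device involved in the present application is introduced below with reference to FIG. 1, taking a mobile phone as an example.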
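FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 1, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the present application does not constitute a specific limitation on the electronic device 100. In other embodiments, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.
The controller may be the nerve center and command center of the electronic device 100. The controller can generate operation control signals according to the instruction opcode and timing signals to control instruction fetching and instruction execution.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory can hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.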
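The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 can be coupled to the touch sensor 180K, a charger, a flash, the camera 193, and so on through different I2C bus interfaces. For example, the processor 110 can be coupled to the touch sensor 180K through an I2C interface so that the processor 110 and the touch sensor 180K communicate over the I2C bus interface to implement the touch function of the electronic device 100.
The I2S interface can be used for audio communication. In some embodiments, the processor 110 may include multiple sets of I2S buses. The processor 110 can be coupled to the audio module 170 through an I2S bus to enable communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 can pass audio signals to the wireless communication module 160 through the I2S interface to implement answering calls through a Bluetooth headset.
The PCM interface can also be used for audio communication, sampling, quantizing, and encoding analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 can be coupled through a PCM bus interface. In some embodiments, the audio module 170 can also pass audio signals to the wireless communication module 160 through the PCM interface to implement answering calls through a Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
The UART interface is a universal serial data bus used for asynchronous communication. The bus can be a bidirectional communication bus. It converts the data to be transmitted between serial and parallel form. In some embodiments, the UART interface is typically used to connect the processor 110 and the wireless communication module 160. For example, the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function. In some embodiments, the audio module 170 can pass audio signals to the wireless communication module 160 through the UART interface to implement playing music through a Bluetooth headset.
The MIPI interface can be used to connect the processor 110 to peripheral devices such as the display screen 194 and the camera 193. MIPI interfaces include a camera serial interface (CSI), a display serial interface (DSI), and so on. In some embodiments, the processor 110 and the camera 193 communicate through the CSI interface to implement the shooting function of the electronic device 100. The processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the electronic device 100.
The GPIO interface can be configured by software. The GPIO interface can be configured to carry control signals or data signals. In some embodiments, the GPIO interface can be used to connect the processor 110 to the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and so on. The GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and so on.
The USB interface 130 is an interface that conforms to the USB standard specification; specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 can be used to connect a charger to charge the electronic device 100, or to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect a headset and play audio through it. The interface can also be used to connect other electronic devices, such as AR devices.
It can be understood that the interface connections between the modules illustrated in the present application are only schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments, the electronic device 100 may also use interface connection methods different from those in the above embodiments, or a combination of multiple interface connection methods.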
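The charging management module 140 is used to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 can receive the charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charging management module 140 can receive wireless charging input through the wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 can also supply power to the electronic device through the power management module 141.
The power management module 141 is used to connect the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, the wireless communication module 160, and so on. The power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle count, and battery health (leakage, impedance). In some other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may also be provided in the same device.
The wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and so on.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 can be used to cover one or more communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, the antenna 1 can be multiplexed as a diversity antenna for the wireless local area network. In some other embodiments, an antenna can be used in combination with a tuning switch.
The mobile communication module 150 can provide wireless communication solutions applied to the electronic device 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and so on. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and radiate it as electromagnetic waves through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the same device as at least some modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator is used to modulate the low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then sends the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, and so on) or displays images or video through the display screen 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and provided in the same device as the mobile communication module 150 or other functional modules.
The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 160 may be one or more devices that integrate at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 can also receive signals to be sent from the processor 110, frequency-modulate and amplify them, and radiate them as electromagnetic waves through the antenna 2.
In some embodiments, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150 and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).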
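The electronic device 100 implements the display function through the GPU, the display screen 194, the application processor, and so on. The GPU is a microprocessor for image processing that connects the display screen 194 and the application processor. The GPU performs mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, video, and so on. The display screen 194 includes a display panel. The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flex light-emitting diode (FLED), Miniled, MicroLed, Micro-oLed, quantum dot light emitting diodes (QLED), and so on. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The electronic device 100 can implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and so on.
The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light passes through the lens to the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP, which converts it into an image visible to the naked eye. The ISP can also algorithmically optimize the noise, brightness, and skin tone of the image, and can optimize parameters of the shooting scene such as exposure and color temperature. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. An object produces an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and then passes it to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
The digital signal processor is used to process digital signals; besides digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy.
The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs, so that it can play or record video in multiple encoding formats, for example moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transmission mode between neurons in the human brain, it processes input information quickly and can also learn continuously. The NPU enables applications such as intelligent cognition on the electronic device 100, for example image recognition, face recognition, speech recognition, and text understanding.
The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving files such as music and video on the external memory card.
The internal memory 121 can be used to store computer-executable program code, which includes instructions. The processor 110 executes the various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store the operating system and the applications required by at least one function (such as a sound playback function or an image playback function). The data storage area can store data created during use of the electronic device 100 (such as audio data and the phone book). In addition, the internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS).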
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器 110中。
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。电子设备100可以通过扬声器170A收听音乐,或收听免提通话。
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当电子设备100接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风170C发声,将声音信号输入到麦克风170C。电子设备100可以设置至少一个麦克风170C。在另一些实施例中,电子设备100可以设置两个麦克风170C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,电子设备100还可以设置三个,四个或更多麦克风170C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
耳机接口170D用于连接有线耳机。耳机接口170D可以是USB接口130,也可以是3.5mm的开放移动电子设备平台(open mobile terminal platform,OMTP)标准接口,美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备100根据压力传感器180A检测触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
气压传感器180C用于测量气压。在一些实施例中,电子设备100通过气压传感器180C测得的气压值计算海拔高度,辅助定位和导航。
磁传感器180D包括霍尔传感器。电子设备100可以利用磁传感器180D检测翻盖皮套的开合。在一些实施例中,当电子设备100是翻盖机时,电子设备100可以根据磁传感器180D检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
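The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in each direction (generally along three axes). When the electronic device 100 is stationary, the sensor can detect the magnitude and direction of gravity. It can also be used to recognize the posture of the electronic device, for applications such as switching between landscape and portrait display and pedometers.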
距离传感器180F,用于测量距离。电子设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。电子设备100可以利用接近光传感器180G检测用户手持电子设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,电子设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值,电子设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备100对电池142加热,以避免低温导致电子设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
骨传导传感器180M可以获取振动信号。在一些实施例中,骨传导传感器180M可以获取人体声部振动骨块的振动信号。骨传导传感器180M也可以接触人体脉搏,接收血压跳动信号。在一些实施例中,骨传导传感器180M也可以设置于耳机中,结合成骨传导耳机。音频模块170可以基于骨传导传感器180M获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于骨传导传感器180M获取的血压跳动信号解析心率信息,实现心率检测功能。
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。电子设备100可以接收按键输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。
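The motor 191 can generate vibration alerts. It can be used for incoming-call vibration alerts and for touch vibration feedback. For example, touch operations acting on different applications (such as photographing or audio playback) can correspond to different vibration feedback effects. For touch operations acting on different areas of the display screen 194, the motor 191 can also produce different vibration feedback effects. Different application scenarios (for example, time reminders, receiving messages, alarm clocks, and games) can also correspond to different vibration feedback effects. The touch vibration feedback effect can also be customized.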
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和电子设备100的接触和分离。电子设备100可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。电子设备100通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,电子设备100采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在电子设备100中,不能和电子设备100分离。
电子设备100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请以分层架构的Android系统为例,示例性说明电子设备100的软件结构。其中,本申请对电子设备的操作系统的类型不做限定。例如,Android系统、Linux系统、Windows系统、iOS系统、鸿蒙操作系统(harmony operating system,鸿蒙OS)等。
图2为本申请一实施例提供的一种电子设备的软件结构框图。如图2所示,分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层(APP),应用程序框架层(APP framework),安卓运行时(Android runtime)和系统库(libraries),以及内核层(kernel)。
应用程序层可以包括一系列应用程序包。
如图2所示,应用程序包可以包括相机,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,游戏,聊天,购物,出行,即时通信(如短信息),智能家居,设备控制等应用程序(application,APP)。
其中,智能家居应用可用于对具有联网功能的家居设备进行控制或管理。例如,家居设备可以包括电灯、电视和空调。又如,家居设备还可以包括防盗门锁、音箱、扫地机器人、插座、体脂秤、台灯、空气净化器、电冰箱、洗衣机、热水器、微波炉、电饭锅、窗帘、风扇、电视、机顶盒、门窗等。
另外,应用程序包还可以包括:主屏幕(即桌面),负一屏,控制中心,通知中心等应用程序。
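The leftmost screen (负一屏), also called the "-1 screen", is the user interface (UI) reached by swiping right on the home screen of the electronic device until the leftmost page is displayed. For example, the leftmost screen can hold shortcut service functions and notification messages, such as global search, shortcut entries to a particular page of an application (payment code, WeChat, and so on), instant information and reminders (express delivery information, expenditure information, commute traffic, ride-hailing information, schedule information, and so on), and followed updates (football stand, basketball stand, stock information, and so on). The control center is the swipe-up notification bar of the electronic device, that is, the user interface displayed when the user starts swiping upward from the bottom of the electronic device. The notification center is the pull-down notification bar of the electronic device, that is, the user interface displayed when the user starts swiping downward from the top of the electronic device.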
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。
如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。
窗口管理器(window manager)用于管理窗口程序,如管理窗口状态、属性、视图(view)增加、删除、更新、窗口顺序、消息收集和处理等。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。并且,窗口管理器为外界访问窗口的入口。
内容提供器用于存放和获取数据,并使这些数据可以被应用程序访问。数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器(resource manager)为应用程序提供各种资源,比如本地化字符串,图标,图片,用户界面的布局文件(layout xml),视频文件,字体,颜色,用户界面组件(user interface module,UI组件)的身份证标识号(identity document,ID)等。并且,资源管理器用于统一管理前述资源。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。
安卓运行时包括核心库和虚拟机。安卓运行时负责Android系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是Android系统的核心库。
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(media libraries),三维图形处理库(例如:OpenGLES),2D图形引擎(例如:SGL)和图像处理库等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。
下面结合利用智能音箱播放声音的场景,示例性说明电子设备100的软件和硬件的工作流程。
当触摸传感器180K接收到触摸操作,相应的硬件中断被发给内核层。内核层将触摸操作加工成原始输入事件(包括触摸坐标,触摸操作的时间戳等信息)。原始输入事件被存储在内核层。应用程序框架层从内核层获取原始输入事件,识别该输入事件所对应的控件。以该触摸操作是触摸单击操作,该单击操作所对应的控件为智能音箱图标的控件为例,智能音箱应用调用应用框架层的接口,启动智能音箱应用,进而通过调用内核层启动音频驱动,通过扬声器170A将音频电信号转换成声音信号。
可以理解的是,本申请示意的结构并不构成对电子设备100的具体限定。在另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
基于前述描述,本申请以下实施例将以具有图1和图2所示结构的电子设备为例,结合附图和应用场景,对本申请提供的电子设备和非接触式手势控制方法进行详细阐述。
请参阅图3A、图3B-图4,图3A-图3B为本申请一实施例提供的两种电子设备实现非接触式手势控制方法的软硬件结构框图,图4为本申请一实施例提供的一种电子设备实现非接触式手势控制方法的软件框图。
如图3A-图3B所示,电子设备实现非接触式手势控制方法的软硬件结构框图可以包括:开关控制模块21、应用识别模块22、喇叭23、麦克风24、特征提取模块25、模型推理模块26、反应识别模块27以及目标应用28。
其中,图3A与图3B的不同点在于:图3A中,电子设备的软件系统来识别非接触式手势,且电子设备的软件系统中包括:特征提取模块25、模型推理模块26以及反应识别模块27。图3B中,目标应用28来识别非接触式手势,且目标应用28中包括:特征提取模块25、模型推理模块26以及反应识别模块27。
下面,结合图4,详细阐述图3A-图3B中各个模块,在电子设备实现非接触式手势控制方法中的可能实现方式。
其中,开关控制模块21可为设置在电子设备中的软件代码。
开关控制模块21,用于检测第一开关的状态,并根据第一开关的状态,确定是否向应用识别模块22发送触发指令。
其中,第一开关可为电子设备的软件系统中的软件开关,和/或,第一开关可为电子设备的壳体上的硬件开关。本申请对第一开关的如位置、形状、大小等参数不做限定。
其中,触发指令用于触发应用识别模块22。本申请对触发指令的具体实现方式不做限定。例如,触发指令可采用字母、二进制、数字等表示方式。
其中,第一状态可为开启状态或关闭状态,本申请对此不做限定。并且,第一开关的状态还可用于指示电子设备是否启用非接触式手势识别功能。
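Accordingly, when the first switch is in the first state, the switch control module 21 may determine that the electronic device has enabled the non-contact gesture recognition function and send the trigger instruction to the application identification module 22. When the first switch is not in the first state, the switch control module 21 may determine that the electronic device has not enabled the function, and it does not send the trigger instruction to the application identification module 22.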
另外,在用户对第一开关进行如开启或关闭等操作后,电子设备可显示前述操作对应的提示信息,方便向用户提示电子设备当前是否支持非接触式手势识别。其中,本申请对前述操作对应的提示信息的具体实现方式不做限定。
需要说明的是,除了上述方式之外,开关控制模块21也可默认第一开关处于第一状态。并且,开关控制模块21为可选的模块,对应地,应用识别模块22无需在接收到触发指令后,才开始识别目标应用28。
其中,应用识别模块22可为设置在电子设备中的软件代码。
应用识别模块22,用于在接收到触发指令后,可识别目标应用28,并确定目标应用28是否支持非接触式手势识别。
其中,所识别出的目标应用28为电子设备已启动的一个应用程序,且目标应用28在电子设备中显示的一个用户界面为电子设备的焦点窗口。本申请提及的电子设备的焦点窗口可理解为电子设备当前的操作窗口,即用户现在可操作的窗口。本申请对目标应用28的如类型、用户界面、显示位置等相关参数不做限定。用户界面中的内容可采用如文字、图片、视频、音频等表示形式。
其中,应用识别模块22可采用多种方式确定目标应用28是否支持非接触式手势识别。
在一些实施例中,应用识别模块22中可设置白名单,通过判断白名单中是否存储有目标应用28的相关信息,来确定目标应用28是否支持非接触式手势识别。
其中,白名单中存储有支持非接触式手势的全部应用的相关信息。一个应用的相关信息可以包括:应用的标识,或者,应用的标识以及应用的进程的标识等方式。应用的标识可采用如应用名、应用的ID等表示形式。应用的进程可理解为应用实现显示在电子设备中的用户界面的执行过程。进程的标识可采用如进程名、进程的ID等表示形式。
从而,在白名单中存储有目标应用28的相关信息时,应用识别模块22可确定目标应用28支持非接触式手势识别。在白名单中未存储有目标应用28的相关信息时,应用识别模块22可确定目标应用28不支持非接触式手势识别。
在另一些实施例中,应用识别模块22可检测第二开关的状态,并根据第二开关的状态,可确定目标应用28是否支持非接触式手势识别。
其中,第二开关可为电子设备的软件系统中的软件开关,和/或,第二开关可为目标应用中的软件开关。本申请对第二开关的如位置、形状、大小等参数不做限定。
其中,本申请对触发指令的具体实现方式不做限定。
从而,在第二开关处于第二状态时,应用识别模块22可确定目标应用28支持非接触式手势识别;在第二开关不处于第二状态时,应用识别模块22可确定目标应用28不支持非接触式手势识别。
其中,第二状态可为开启状态或关闭状态,本申请对此不做限定。
另外,在用户对第二开关进行如开启或关闭等操作后,电子设备可显示前述操作对应的提示信息,方便向用户提示目标应用28当前是否支持非接触式手势识别。其中,本申请对前述操作对应的提示信息的具体实现方式不做限定。
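In addition, after determining in the above manner that the target application 28 supports non-contact gesture recognition, the application identification module 22 may further determine, according to the scene type of the user interface of the target application 28, whether the target application 28 supports non-contact gesture recognition in that scene.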
其中,目标应用28的用户界面对应的场景类型可以包括:文字浏览场景、图片浏览场景、语音播放场景以及视频播放场景等。
从而,在目标应用28的用户界面对应的场景类型支持非接触式手势时,应用识别模块22可确定目标应用28支持非接触式手势识别。在目标应用28的用户界面对应的场景类型不支持非接触式手势时,应用识别模块22可确定目标应用28不支持非接触式手势识别。
此外,除了判断目标应用28的用户界面的场景类型是否支持非接触式手势之外,应用识别模块22,还用于确定目标应用28的用户界面的场景类型所支持的非接触式手势的手势类别,便于及时提示给用户。
其中,目标应用28的用户界面对应的场景类型所支持的非接触式手势的手势类别可理解为用户可在目标应用28的用户界面中使用的非接触式手势的一个或多个手势类别。
综上,在确定目标应用28支持非接触式手势识别后,应用识别模块22可启动喇叭23发射超声波信号以及启动麦克风24采集反射信号。
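The two-stage decision described above (whitelist first, then scene type) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name, the application identifier `com.example.reader`, and the scene and gesture labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GestureSupportPolicy:
    # Hypothetical whitelist of identifiers of applications that support gestures.
    whitelist: set = field(default_factory=set)
    # Hypothetical map: scene type -> gesture categories usable in that scene.
    scene_gestures: dict = field(default_factory=dict)

    def supported_gestures(self, app_id, scene_type):
        """Return the gesture categories usable right now; empty means unsupported."""
        if app_id not in self.whitelist:
            return []
        return list(self.scene_gestures.get(scene_type, []))

    def should_start_sensing(self, app_id, scene_type):
        """True when the speaker and microphone should be started."""
        return bool(self.supported_gestures(app_id, scene_type))


policy = GestureSupportPolicy(
    whitelist={"com.example.reader"},
    scene_gestures={"text_browsing": ["swipe_up", "swipe_down"]},
)
```

Returning the list of usable gesture categories, rather than a bare boolean, also covers the prompt to the user about which gestures the current scene accepts.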
另外,与此同时,应用识别模块22还可启动特征提取模块25、模型推理模块26和反应识别模块27执行相应的功能(图3A、图3B和图4中未示出前述过程)。
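In addition, in FIG. 3A, after determining that the target application 28 supports non-contact gesture recognition, the application identification module 22 also sends the identifier of the target application 28 to the reaction identification module 27. In FIG. 3B, the application identification module 22 does not need to send the identifier, because the reaction identification module 27 can obtain the identifier of the target application 28 directly.
When it is determined that the target application 28 does not support non-contact gesture recognition: if the speaker 23 and the microphone 24 have not been started, the application identification module 22 does not start the speaker 23 to emit the ultrasonic signal or the microphone 24 to collect the reflected signal; if the speaker 23 and the microphone 24 have already been started, the application identification module 22 stops them.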
其中,喇叭23设置在电子设备中。喇叭23,用于发射超声波信号,使得非接触式手势能够反射喇叭23发射的超声波信号。本申请对喇叭23的数量、类型等参数不做限定。
其中,麦克风24设置在电子设备中。麦克风24,用于采集反射信号,并向特征提取模块25传输反射信号。本申请对麦克风24的数量、类型等参数不做限定。
需要说明的是,超声波信号遇到非接触式手势后反射的反射信号中难免夹杂着超声波信号遇到电子设备的周围环境后反射的超声波信号,因此,麦克风24采集到的反射信号中包括:超声波信号遇到电子设备的周围环境后反射的反射信号,以及超声波信号遇到非接触式手势后反射的超声波信号。
其中,特征提取模块25为软件代码,特征提取模块25可设置在电子设备的软件系统中。
特征提取模块25,用于可从反射信号中,滤除超声波信号遇到电子设备的周围环境反射的超声波信号,得到超声波信号遇到非接触式手势反射的反射信号,并提取超声波信号遇到非接触式手势反射的反射信号所表征的如距离特征和速度特征等特征,并向模型推理模块26传输距离特征和速度特征。
其中,速度特征用于表示非接触式手势的速度变化。距离特征用于表示非接触式手势的挥动位置与麦克风间的距离变化。
其中,模型推理模块26为软件代码,模型推理模块26可设置在电子设备的软件系统中,本申请对此不做限定。
模型推理模块26,用于根据距离特征和速度特征,识别出非接触式手势的手势类别,并向反应识别模块27传输非接触式手势的手势类别。
其中,本申请提及的非接触式手势的手势类别可包括但不限于:手掌动作(如挥动、张开、紧握、悬停、移动等)、手指动作(如挥动、悬停、手指数量变化等)以及手动作(如挥动、张开、紧握、悬停、移动等)等。本申请对非接触式手势的具体实现方式不做限定。
并且,非接触式手势的手势位置需要在电子设备的手势识别范围内,从而保证了超声波信号能够遇到非接触式手势能够反射出反射信号。
其中,本申请对电子设备的手势识别范围的具体数值不做限定。在一些实施例中,电子设备可显示电子设备的手势识别范围,使得用户能够在电子设备的手势识别范围内做出有效的非接触式手势,使得电子设备能够识别出非接触式手势。
其中,反应识别模块27为软件代码,反应识别模块27可设置在电子设备的软件系统中。
反应识别模块27,用于根据目标应用28的标识和非接触式手势的手势类别,向目标应用28传输交互指令。
其中,交互指令用于控制目标应用28响应非接触式手势,使得目标应用28实现非接触式手势的人机交互。本申请对交互指令的具体实现方式不做限定。例如,交互指令可采用字母、二进制、数字等表示方式。目标应用28,用于响应于交互指令,控制目标应用28响应非接触式手势。
综上,电子设备需要对喇叭23和麦克风24进行硬件设计,还需要对开关控制模块21、应用识别模块22、特征提取模块25、模型推理模块26、反应识别模块27和目标应用28进行软件设计,使得电子设备基于硬件设计和软件设计,可启动喇叭23发射超声波信号,启动麦克风24采集超声波信号遇到非接触式手势反射的反射信号,并借助前述反射信号的特征变化来捕捉非接触式手势,在目标应用28中实现非接触式手势的人机交互。
需要说明的是,除了上述实现方式之外,特征提取模块25、模型推理模块26以及反应识别模块27也可设置在目标应用28的服务器中,目标应用28的服务器在特征提取模块25、模型推理模块26以及反应识别模块27各自执行的操作后,可向目标应用28返回执行结果。从而,目标应用28可实现非接触式手势的人机交互。下面,分别详细介绍电子设备实现非接触式手势控制方法所涉及的硬件设计和软件设计。
一、硬件设计
在硬件设计中,电子设备设计了喇叭和麦克风的布局设置和参数约束,可提高电子设备识别非接触式手势的准确率,也无需增加额外的器件或使用昂贵的器件,节省电子设备识别非接触式手势的成本。
1、布局设置
1.1喇叭
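The electronic device may place the speaker on a non-contact surface of the electronic device (for example, when the electronic device is placed on a table, any surface other than the one in contact with the table). This ensures that the ultrasonic signal emitted by the speaker has wide coverage, which helps the electronic device recognize non-contact gestures more effectively.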
其中,本申请对喇叭的如数量、类型、位置等参数不做限定。
在一些实施例中,喇叭的数量可以为一个。在另一些实施例中,喇叭的数量可以为至少两个。其中,至少两个喇叭对称设置,即至少两个喇叭存在对称位,可以是中心对称或者是轴对称。本申请对前述内容不做限定,只需满足喇叭发射的超声波信号的传播范围越广越好即可。
1.2麦克风
电子设备可将麦克风设置在电子设备中靠近用户侧的位置,使得麦克风能够接收到反射信号,有利于电子设备能够发挥更佳的对非接触式手势的识别效果。
其中,本申请对麦克风的如数量、类型、位置等参数不做限定。
在一些实施例中,麦克风的数量可以为一个,且电子设备还设置有麦克风的遮挡件,通过遮挡件可调整反射信号的传播信道,使得麦克风采集到不同特征的反射信号。
其中,针对手势方向相反的非接触式手势而言,本申请将手势方向相反的非接触式手势分别称为手势1和手势2,相同超声波信号分别遇到手势1和手势2所反射的反射信号分别称为信号1和信号2。其中,本申请对手势1和手势2的具体实现方式不做限定。例如,手势1为左挥,手势2为右挥。手势1为上挥,手势2为下挥。
电子设备通过一个麦克风和遮挡件,可采集到不同特征的信号1与信号2。从而,电子设备可根据不同特征的信号1和信号2区分出手势1和手势2,使得电子设备可识别出不同方向的非接触式手势。
其中,传播信道可采用如传播量或传播方向等参数进行表征。另外,本申请提及的传播信道也可称为回波路径/传播路径。在一些实施例中,不同特征的反射信号可通过反射信号的特征热度图中如波峰或波谷的出现时刻不同进行表征。
其中,本申请对遮挡件的具体实现方式不做限定。例如,遮挡件可采用如塑料或陶瓷等材质。在一些实施例中,遮挡件为具有两个开口的导通件,遮挡件的一个开口用于覆盖/包裹与麦克风连通且设置在电子设备的外壳上的拾音孔,遮挡件的另一个开口设置有收音孔。由此,通过调整收音孔的如数量、位置、口径等参数,来改变反射信号的传播信道。
下面,详细介绍遮挡件的具体实现方式,其中以电子设备为手机为例进行示意。
在一些实施例中,手机的外壳上设置有拾音孔,拾音孔通过连接件与麦克风的拾音面连通。遮挡件为具有两个开口的导通件,遮挡件的一开口将拾音孔覆盖/包裹,遮挡件的一开口所在表面与遮挡件的另一开口所在表面相邻,且遮挡件的另一开口所在表面为斜切面,遮挡件的另一开口设置有收音孔,使得反射信号能够依次经由收音孔、拾音孔和连接件可传播到麦克风的拾音面上。
其中,本申请对遮挡件的另一开口所在表面的具体实现方式不做限定。在一些实施例中,遮挡件的另一开口所在表面不与遮挡件的一开口所在表面垂直。并且,本申请对收音孔的如位置、数量和形状等参数不做限定。
由于收音孔设置在遮挡件的斜切面上,且反射信号均需要先通过收音孔。因此,收音孔的设置可改变信号1与信号2的传播方向和传播量,使得信号1和信号2的特征不同,这样,电子设备能够区分出手势方向相反的非接触式手势。从而,电子设备可识别出各个挥动方向的非接触式手势。
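In other embodiments, a sound pickup hole is provided in the housing of the mobile phone, and the pickup hole communicates with the pickup surface of the microphone through a connector. The shielding member is a conduit with two openings. One opening covers or wraps the pickup hole. The surface containing that opening is opposite or adjacent to the surface containing the other opening, and the other opening is provided with sound receiving hole 1 and sound receiving hole 2, which pass different amounts of signal. The reflected signal thus passes through sound receiving hole 1 and sound receiving hole 2, and then travels through the pickup hole and the connector to the pickup surface of the microphone.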
其中,在遮挡件的一开口所在表面与遮挡件的另一开口所在表面相对时,收音孔1和收音孔2均设置在遮挡件远离手机的外壳的同一面上。在遮挡件的一开口所在表面与遮挡件的另一开口所在表面相邻时,收音孔1和收音孔2分别设置在遮挡件的两个相对面上。
其中,收音孔1和收音孔2可采用不同的口径和/或数量,使得收音孔1和收音孔2具有不同的信号通过量。本申请对收音孔1和收音孔2的如位置、数量和形状等参数不做限定。例如,收音孔1为1个孔,收音孔2为2个孔,且收音孔1的口径大于收音孔2的口径。
由于收音孔1和收音孔2具有不同的信号通过量,且反射信号均需要先通过收音孔1和收音孔2。因此,收音孔1和收音孔2的设置可改变信号1与信号2的传播方向和传播量,导致信号1和信号2的特征不同,使得电子设备能够区分出挥动方向相反的非接触式手势。从而,电子设备可识别出各个挥动方向的非接触式手势。
需要说明的是,遮挡件可固定设置在电子设备的外壳上,也可插接在电子设备的外壳上,实现遮挡件的可拆卸连接,本申请对遮挡件的连接方式不做限定。并且,本申请对拾音孔的如位置、数量和形状等参数不做限定。
另外,在麦克风的数量为一个时,除了设置麦克风的遮挡件之外,如果麦克风距离手机的外壳上的拾音孔较近,本申请也可将麦克风的形状更改为不规则形状的,如将麦克风的拾音面或焊接面设置为斜切面,来改变反射信号的传播信道,使得电子设备能够区分出挥动方向相反的非接触式手势。
在另一些实施例中,麦克风的数量也可以为至少两个。其中,麦克风间的最大距离大于第一预设阈值,第一预设阈值用于确保不同麦克风相对于同一喇叭之间存在位置差异,即超声波信号的发射位置和反射信号的采集位置之间存在位置差异,有利于反映出非接触式手势的变化,避免由于麦克风间的距离较近无法区分出非接触式手势的现象。本申请对第一预设阈值的具体大小不做限定。
另外,在一些实施例中,麦克风可设计在如手机等体积较小的电子设备的非接触面。在一些实施例中,麦克风可设计在如笔记本电脑或PC等体积较大的电子设备的前端位置,易于采集反射信号的变化,确保反射信号的变化能够更好的反映出非接触式手势。
需要说明的是,本申请对喇叭和麦克风之间的距离不做限定。
1.3喇叭和麦克风在电子设备中的布局举例
结合图5A-图5C,详细介绍喇叭和麦克风分别在电子设备中的具体实现位置。
请参阅图5A,图5A为本申请一实施例提供的一种电子设备为笔记本电脑时的喇叭和麦克风的位置示意图。
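As shown in FIG. 5A, the speaker may be placed on the notebook computer at positions such as: the edge a1 of the screen on the side facing the user; the housing side edge a2 on the left of the screen; the housing side edge a3 on the right of the screen; the upper housing surface a4 of the keyboard on the side facing the user; the housing side edge a5 around the keyboard on the side facing the user; the housing side edge a6 around the keyboard on the left; and the housing side edge a7 around the keyboard on the right. In general, at least two speakers may be provided on the notebook computer, arranged symmetrically left and right about the symmetry axis of the notebook computer, so that the coverage of the ultrasonic signal is wider and more uniform.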
继续结合图5A,麦克风在笔记本电脑上的位置可设置为靠近用户侧,例如笔记本电脑的扣手位。其中,麦克风可设置在笔记本电脑的如位于屏幕的正对用户侧的边缘位置a1、位于屏幕的上侧的壳体侧边a8、环绕键盘的正对用户侧的壳体侧边a5、环绕键盘的左侧的壳体侧边a6、环绕键盘的右侧的壳体侧边a7等位置。一般情况下,笔记本电脑上的麦克风的数量可设置为至少两个,且至少两个麦克风沿着笔记本电脑的对称轴左右对称,使得麦克风能够尽可能多地接收到反射信号。
例如,笔记本电脑上可设置六个喇叭和三个麦克风。六个喇叭分成三对,每对喇叭对称设置,第一对喇叭分别位于:a1中的左边缘位置和右边缘位置,第二对喇叭分别位于:a2和a3上,第三对喇叭均位于:a5上。三个麦克风中的两个麦克风对称设置,这两个麦克风均位于:a5上,另一个麦克风位于:a8上。
请参阅图5B,图5B为本申请一实施例提供的一种电子设备为手机时的喇叭和麦克风的位置示意图。如图5B所示,喇叭可设置在手机的如位于屏幕的正对用户侧的边缘位置b1、位于屏幕的上侧的壳体侧边b2、位于屏幕的下侧的壳体侧边b3、位于屏幕的左侧的壳体侧边b4、位于屏幕的右侧的壳体侧边b5等位置。一般情况下,手机上的喇叭的数量可设置为至少两个,且至少两个喇叭沿着手机的对称轴左右对称,使得超声波信号的覆盖范围更均匀且更广阔。
继续结合图5B,麦克风可设置在手机的如位于屏幕的正对用户侧的边缘位置b1、位于屏幕的上侧的壳体侧边b2、位于屏幕的下侧的壳体侧边b3、位于屏幕的左侧的壳体侧边b4、位于屏幕的右侧的壳体侧边b5等位置。在一些实施例中,在手机的屏幕的一侧边上布局有两个及两个以上麦克风时,与该侧边相邻的侧边上至少布局有一个麦克风。一般情况下,手机上的麦克风的数量可设置为至少两个,且至少两个麦克风设置在手机上,使得麦克风能够尽可能多地接收到反射信号。
另外,由于手机为手持设备,因此,喇叭和麦克风在手机上的位置尽量避免为手持位置。
例如,手机上可设置四个喇叭和四个麦克风。四个喇叭分为两对,每对喇叭对称设置,第一对喇叭分别位于:b1中的上边缘位置和下边缘位置,第二对喇叭分别位于:b1中的左边缘位置和右边缘位置。四个麦克风分为两对,每对麦克风对称设置,第一对麦克风分别位于:b2和b3上,第二对麦克风分别位于:b4和b5上。
请参阅图5C,图5C为本申请一实施例提供的一种电子设备为PC时的喇叭和麦克风的位置示意图。如图5C所示,喇叭可设置在PC的如位于屏幕的正对用户侧的边缘位置c1、位于屏幕的上侧的壳体侧边c2、位于屏幕的下侧的壳体侧边c3、位于屏幕的左侧的壳体侧边c4、位于屏幕的右侧的壳体侧边c5、位于支架的左侧的支架侧边c6、位于支架的右侧的支架侧边c7、位于支架的正对用户侧的支架上c8、位于底座靠近用户侧的底座上表面(即底座与支架之间的接触面)c9等位置。一般情况下,PC上的喇叭的数量可设置为至少两个,且至少两个喇叭沿着PC的对称轴左右对称,使得超声波信号的覆盖范围更均匀且更广阔。
继续结合图5C,麦克风在PC上的位置可设置为更靠近用户。其中,麦克风可设置在PC的如位于屏幕的正对用户侧的边缘位置c1、位于屏幕的上侧的壳体侧边c2、位于屏幕的下侧的壳体侧边c3、位于屏幕的左侧的壳体侧边c4、位于屏幕的右侧的壳体侧边c5、位于底座靠近用户侧的底座上表面(即底座与支架之间的接触面)c9、环绕底座的正对用户侧的底座侧边c10等位置。一般情况下,PC上的麦克风的数量可设置为至少两个,且至少两麦克风沿着PC的对称轴左右对称,使得麦克风能够尽可能多地接收到反射信号。
例如,PC上可设置四个喇叭和四个麦克风。四个喇叭分成二对,每对喇叭对称设置,第一对喇叭分别位于:c6和c7上,第二对喇叭均位于:c9上。四个麦克风分为两对,每对麦克风对称设置,第一对麦克风均位于:c9上,第二对麦克风均位于:c10上。综上,电子设备通过布局喇叭和麦克风的数量和位置,使得喇叭能够尽可能广泛地发射超声波信号,使得麦克风能够尽可能多地采集到反射信号,提升了电子设备识别非接触式手势的效果,使得电子设备能够精准识别非接触式手势,有利于在目标应用中精准地实现非接触式手势的人机交互。
2、硬件的参数约束
2.1喇叭发射的超声波信号
本申请中,超声波信号可包括一种信号,也可包括多种信号,本申请对此不做限定。其中,超声波信号中的每种信号需要满足的条件是:每种信号为自相关的信号,即该信号具有良好的自相关特性。
本申请提及的自相关是指信号在1个时刻的瞬时值与另1个时刻的瞬时值之间的依赖关系,是对1个信号的时域描述。在一些实施例中,在信号的自相关系数大于等于预设值时,可称为该信号为自相关的信号,即该信号具有良好的自相关特性。本申请对预设值的大小不做限定。
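This application does not limit the specific type of the ultrasonic signal. Examples include a linear frequency-modulated signal (Chirp signal for short) and a zero-autocorrelation sequence such as a Zadoff-Chu sequence (ZC sequence).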
从而,喇叭可发射自相关特性良好的超声波信号,方便电子设备从反射信号中,提取出超声波信号遇到非接触式手势反射的反射信号,并滤除掉超声波信号遇到电子设备的周围环境反射的超声波信号,使得电子设备能够根据超声波信号遇到非接触式手势反射的反射信号精准地捕捉到非接触式手势。
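To make "good autocorrelation" concrete, the sketch below generates a Zadoff-Chu sequence and computes its periodic autocorrelation, which is zero at every non-zero lag. The formula is the common textbook definition for odd lengths; the length 63 and root 5 are arbitrary illustrative choices, not values from this application.

```python
import cmath
import math


def zadoff_chu(root, length):
    """Zadoff-Chu sequence of odd length; root must be coprime with length."""
    if length % 2 == 0 or math.gcd(root, length) != 1:
        raise ValueError("length must be odd and coprime with root")
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]


def periodic_autocorrelation(seq):
    """Circular autocorrelation: r[lag] = sum_k seq[k + lag] * conj(seq[k])."""
    n = len(seq)
    return [sum(seq[(k + lag) % n] * seq[k].conjugate() for k in range(n))
            for lag in range(n)]


sequence = zadoff_chu(root=5, length=63)
correlation = periodic_autocorrelation(sequence)
peak = abs(correlation[0])                           # equals the sequence length
sidelobe = max(abs(v) for v in correlation[1:])      # numerically close to zero
```

A single sharp peak is what lets a receiver separate echoes that arrive with different delays, which the feature extraction described later relies on.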
另外,电子设备还可设计超声波信号中的每种信号的频率范围在第一范围内。其中,每种信号的频率范围(即频带)指的是信号的实际频率范围。第一范围与麦克风的采样率和超声波频率范围相关。
基于香农采样定理可知,为了不失真地恢复超声波信号,麦克风的采样率应该大于超声波信号的最高频率的2倍。因此,超声波信号的频率需要小于麦克风的采样率的一半。
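Ultrasound is sound whose frequency lies above the range of human hearing, so the ultrasonic frequency range is 20 kHz (kilohertz) and above.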
综上,第一范围为大于等于20kHz且小于麦克风的采样率的一半。
其中,本申请对麦克风的采样率的大小不做限定。市面上,笔记本电脑和PC中的麦克风的采样率为48kHz。手机中的麦克风的采样率已达到96kHz,甚至更高。为了便于说明,本申请中麦克风的采样率采用48kHz为例进行示意。
在麦克风的采样率为48kHz时,第一范围为大于等于20kHz且小于24kHz。
从而,喇叭可发射频率范围在第一范围内的超声波信号,使得麦克风能够采集到在第一范围内的反射信号。
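When the speaker emits multiple kinds of signals, in addition to satisfying the above conditions, and considering that the band resource available for ultrasonic propagation within the first range is limited, the speaker may emit the signals in a multi-channel manner.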
其中,允许超声波信号传播的频带资源与第一范围相关。例如,在第一范围为大于等于20kHz且小于24kHz时,第一范围对应的频带资源为4kHz(即24kHz-20kHz的差值)。
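The arithmetic behind the first range and its band resource can be checked with a few lines. This is only a sketch of the calculation stated in the text; the function names are illustrative.

```python
ULTRASOUND_MIN_HZ = 20_000  # lower bound of the ultrasonic range used in the text


def first_range(mic_sample_rate_hz):
    """Return (low, high) with low <= f < high, or None if no usable band exists."""
    nyquist_hz = mic_sample_rate_hz / 2
    if nyquist_hz <= ULTRASOUND_MIN_HZ:
        return None
    return (ULTRASOUND_MIN_HZ, nyquist_hz)


def band_resource_hz(mic_sample_rate_hz):
    """Width of the first range in Hz (0 if the range is empty)."""
    band = first_range(mic_sample_rate_hz)
    return 0.0 if band is None else band[1] - band[0]
```

With a 48 kHz microphone this gives the 20 kHz to 24 kHz range and the 4 kHz band resource from the example above; a 96 kHz microphone would widen the range to 20 kHz to 48 kHz.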
其中,多声道方式指的是喇叭需要时分(即轮流/交替)发射多种信号。此外,在一些实施例中,多种信号的频率范围均相同。在另一些实施例中,多种信号的频率范围均相同,且多种信号中存在频率变化速率相反的信号。
其中,频率变化速率指的是频率的变化趋势,可逐渐增大,也可逐渐减小,本申请对此不做限定。另外,本申请对频率变化速率的大小不做限定。
另外,本申请对多种信号中的相邻两种信号的间隔时长不做限定。
从而,喇叭采用多声道方式发射超声波信号,使得超声波信号的频率范围能够占据尽可能大的第一范围,有利于超声波信号能够覆盖尽可能多的频率范围,保证超声波信号的传播质量,提升电子设备对非接触式手势的识别准确率,解决了在麦克风采用较低采样率的情况下无法保证非接触式手势的识别效果的问题。
举例而言,在超声波信号包括两种信号时,两种信号均为自相关的信号,两种信号的频率范围均在第一范围内,两种信号时分发射,且两种信号的频率范围相同和频率变化速率相反。
为了便于说明,本申请中,在喇叭发射两种信号的情况下,这两种信号分别称为:第一发射信号和第二发射信号。对应的,第一发射信号反射的超声波信号称为第一反射信号,第二发射信号反射的超声波信号称为第二反射信号进行举例示意。
基于上述描述,下面,结合图6A-图6B,举例说明电子设备通过喇叭发射的第一发射信号和第二发射信号。
请参阅图6A-图6B,图6A为本申请一实施例提供的一种第一发射信号的波形示意图,图6B为本申请一实施例提供的一种第二发射信号的波形示意图。
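When the ultrasonic signal uses the linear frequency sweep of a Chirp signal, as shown in FIG. 6A, the frequency $f_1(t)$ of the first transmitted signal $x_1(t)$ corresponds to curve 11 in FIG. 6A, and the first transmitted signal $x_1(t)$ itself corresponds to curve 12. As shown in FIG. 6B, the frequency $f_2(t)$ of the second transmitted signal $x_2(t)$ corresponds to curve 21 in FIG. 6B, and $x_2(t)$ itself corresponds to curve 22.
Both transmitted signals $x_i(t)$ are synchronized swept-sine signals and can be written as $x_i(t) = A_i \sin(\Phi(t))$. The frequency of the first transmitted signal is $f_1(t) = f_0 + \beta t$, and the frequency of the second transmitted signal is $f_2(t) = f_1 - \beta t$, where $f_0$ and $f_1$ written without a time argument denote the lower and upper bounds of the sweep. The frequency $f_i(t)$ is a function of time $t$ and satisfies $f_0 \le f_i(t) \le f_1$, where $f_0$ is greater than or equal to the minimum of the first range, $f_1$ is less than or equal to the maximum of the first range, and $i$ is 1 or 2.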
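The first and second transmitted signals in FIG. 6A and FIG. 6B therefore satisfy the foregoing conditions:
A. A Chirp signal is used, so the first and second transmitted signals $x_i(t)$ have good autocorrelation properties;
B. Because $f_0$ is greater than or equal to the minimum of the first range and $f_1$ is less than or equal to the maximum of the first range, the frequencies $f_i(t)$ of the first and second transmitted signals both lie within the first range;
C. The first and second transmitted signals have the same frequency range (both from $f_0$ to $f_1$ inclusive), and their frequency change rates are opposite (curve 11 rises gradually, while curve 21 falls gradually).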
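A pair of opposite linear sweeps satisfying conditions A to C can be generated as in the sketch below. The numeric values (20 kHz to 23 kHz, 10 ms, 48 kHz sample rate) are assumptions chosen to fit the first range in the 48 kHz example, and the phase is obtained by integrating the instantaneous frequency, which is the standard construction rather than a formula given in this application.

```python
import math


def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz, amplitude=1.0):
    """Samples of x(t) = A * sin(Phi(t)) with frequency f(t) = f_start + beta * t."""
    count = int(round(duration_s * sample_rate_hz))
    beta = (f_end_hz - f_start_hz) / duration_s  # sweep rate in Hz per second
    samples = []
    for k in range(count):
        t = k / sample_rate_hz
        # Phase is 2*pi times the integral of the instantaneous frequency.
        phase = 2 * math.pi * (f_start_hz * t + 0.5 * beta * t * t)
        samples.append(amplitude * math.sin(phase))
    return samples


def instantaneous_frequency(f_start_hz, f_end_hz, duration_s, t):
    """Frequency of the sweep at time t, in Hz."""
    return f_start_hz + (f_end_hz - f_start_hz) / duration_s * t


# Assumed illustrative parameters: f0 = 20 kHz, f1 = 23 kHz, 10 ms per sweep.
F0, F1, SWEEP_S, RATE = 20_000.0, 23_000.0, 0.010, 48_000
first_signal = linear_chirp(F0, F1, SWEEP_S, RATE)    # rising sweep
second_signal = linear_chirp(F1, F0, SWEEP_S, RATE)   # falling sweep
```

Both sweeps occupy the same band, and only the sign of the sweep rate differs, which matches condition C.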
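In addition, in FIG. 6A, curve 12 shows one complete waveform of the first transmitted signal; correspondingly, the minimum emission duration of one complete waveform of the first transmitted signal is t16. In FIG. 6B, curve 22 shows one complete waveform of the second transmitted signal; correspondingly, the minimum emission duration of one complete waveform of the second transmitted signal is t26.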
可见,电子设备可设计喇叭所发射的每种信号具有良好的自相关特性、每种信号的频率范围在第一范围内以及两种信号采用多声道方式中的至少一个条件,使得电子设备通过喇叭可将前述超声波信号发射到广泛的覆盖范围,增大了电子设备的手势识别范围,还提升了电子设备识别非接触式手势的效果,使得电子设备能够精准识别非接触式手势,有利于在目标应用中精准地实现非接触式手势的人机交互,降低了识别非接触式手势的成本。
另外,电子设备也可对超声波信号的频响进行约束。其中,超声波信号的频响的最低值与电子设备的手势识别范围和超声波信号的声源能量相关。超声波信号的频响的最高值还与超声波信号的声压强度对生物体的影响(如耳膜的承载能力)相关。
另外,电子设备还可根据电子设备的手势识别范围,设计超声波信号的发射时长和/或超声波信号的如最小传播能量值和最大传播能量值等参数。
2.2麦克风
在电子设备通过麦克风能够采集到反射信号时,麦克风需要满足如下条件:
A、频响的最低值在第二范围内
其中,频响,即频率响应(frequency response),用于描述麦克风对不同频率的声波信号所响应的能力。第二范围与电子设备的手势识别范围相关。
由于超声波信号的频率范围在第一范围内。因此,麦克风需要对第一范围内的每个频率的超声波信号进行响应。从而,通过设置麦克风的频响的最低值在第二范围内,使得麦克风在电子设备的手势识别范围内,能够接收到第一范围内的反射信号。
需要说明的是,一般情况下,超声波信号遇到非接触式手势反射的反射信号的频率范围在第一范围内,超声波信号遇到电子设备的周围环境反射的超声波信号的频率范围不在第一范围内,使得麦克风采集到的反射信号的频率范围通常为全频段。
B、采样率
本申请对麦克风的采样率的大小不做限定。本申请可采用市面上的麦克风。一般情况下,麦克风的采样率越高,晶振性能越好,频率越高,电子设备对非接触式手势的识别效果越好。
可见,电子设备可采用能够采集到反射信号的麦克风即可,上述条件能够提升麦克风采集到反射信号的性能。
综上,电子设备通过约束喇叭和麦克风的参数,使得喇叭能够尽可能广泛地发射超声波信号,使得麦克风能够尽可能多地接收到反射信号,可提高电子设备识别非接触式手势的准确率,也无需增加额外的器件或使用昂贵的器件,节省电子设备识别非接触式手势的成本。
二、软件设计
除了硬件设计之外,电子设备还提出了对非接触式手势进行识别的软件设计。在软件设计中,电子设备根据反射信号中的超声波信号遇到非接触式手势所反射的反射信号所表示的非接触式手势的特征变化,可准确估计出非接触式手势的手势类别。从而,电子设备可精准识别出非接触式手势,并控制目标应用响应非接触式手势,有利于在目标应用中实现非接触式手势的人机交互。
下面,基于上述关于电子设备中的喇叭和麦克风的硬件设计的描述,结合图7,详细介绍电子设备实现非接触式手势控制方法的软件设计。
请参阅图7,图7为本申请一实施例提供的一种非接触式手势控制方法的流程示意图。如图7所示,本申请的非接触式手势控制方法可以包括如下步骤:
S101、电子设备显示目标应用,目标应用为支持非接触式手势识别的应用程序。
S102、电子设备启动喇叭发射自相关的超声波信号。
S103、电子设备启动麦克风采集超声波信号遇到非接触式手势反射的反射信号。
S104、电子设备根据目标应用和反射信号,控制目标应用响应非接触式手势。
综上,电子设备可借助超声波信号,利用电子设备中现有的喇叭和麦克风等硬件设计以及手势识别算法等软件设计,在目标应用中精准地实现非接触式手势的人机交互,使得电子设备能够在光线不足甚至黑暗的环境中准确地控制目标应用响应非接触式手势,有利于提升用户使用电子设备和目标应用的体验,也无需在电子设备中增加额外的器件或使用昂贵的器件,降低了电子设备控制目标应用响应非接触式手势的成本。
下面,结合图3A、图3B-图4中的各个模块,详细介绍图7中每个步骤的具体实现方式。
S101
S101中,电子设备可显示目标应用28的一个或多个用户界面。另外,在电子设备显示目标应用28的用户界面之外,电子设备还可显示目标应用28的图标,或者,电子设备还可显示其他应用的用户界面,本申请对此不做限定。
考虑到电子设备可能同时显示多个应用的用户界面(其中包括目标应用28的用户界面),因此,电子设备利用应用识别模块22,可先识别目标应用28,再确定目标应用28是否支持非接触式手势识别。
应用识别模块22可将电子设备中已启动且用户界面为焦点窗口的应用识别为目标应用28,前述提及的用户界面为目标应用28当前显示在电子设备中的用户界面。其中,应用识别模块22可采用多种方式识别出目标应用28。
在一些实施例中,由于在应用启动后,电子设备可在软件系统的存储空间中存储该应用的标识,且电子设备可能同时启动多个应用,即存储空间中可存储多个应用对应的标识。因此,应用识别模块22可从存储空间中调取全部应用的标识,来确定已启动的全部应用。
其中,本申请对存储空间的如位置、大小等参数不做限定。应用的标识可采用如应用名、应用的ID等表示形式。
应用识别模块22通过软件系统中的应用程序接口(application programming interface,API)采用如调用函数等方式,可获取焦点窗口对应的应用的标识,来确定目标应用28。
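Accordingly, the application identification module 22 can check whether the identifier of the target application 28 is stored in the storage space to determine whether the target application 28 has been started. When it determines that the target application 28 has been started, the application identification module 22 can identify the application that has been started and whose user interface is the focus window as the target application 28.
In other embodiments, the electronic device may display the user interfaces of multiple applications at the same time (including that of the target application 28), but the electronic device has only one focus window. The application identification module 22 can therefore obtain the user interface corresponding to the focus window, for example by calling a function.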
从而,应用识别模块22在已启动的全部应用中,可查找焦点窗口对应的用户界面对应的应用。由此,应用识别模块22可将电子设备中的已启动且焦点窗口为用户界面的应用识别为目标应用28。
在识别出目标应用28后,应用识别模块22可采用多种方式确定目标应用28是否支持非接触式手势识别,具体实现方式可参阅图3A、图3B-图4中的描述,此处不做赘述。
从而,在确定目标应用28支持非接触式手势识别后,应用识别模块22可启动喇叭23发射超声波信号,以及启动麦克风24采集反射信号。
另外,电子设备还可利用开关控制模块21,向应用识别模块22发送触发指令,使得应用识别模块22在接收到触发指令后执行S101,具体实现方式可参阅图3A、图3B-图4中的描述,此处不做赘述。
由此,电子设备还可设计将目标应用是否支持非接触式手势识别作为自适应地启动或关闭电子设备实现非接触式手势控制方法的触发条件,使得电子设备能够在目标应用中灵活地实现非接触式手势的人机交互,有利于提升电子设备的自适应性和灵活性,还可降低电子设备的整机功耗。
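In addition, after determining that the electronic device has exited the target application, or has switched to a process in the target application that does not support non-contact gesture recognition, the application identification module 22 may stop the speaker and the microphone so that the target application no longer responds to non-contact gestures. This helps the electronic device adaptively provide non-contact gesture interaction and reduces the overall power consumption of the device.
In addition, when it is determined that the electronic device is not plugged in or that its remaining battery level is below a second threshold, the application identification module 22 may stop the speaker and the microphone, and the switch control module 21 may switch the first switch out of the first state, so that the electronic device disables its non-contact gesture recognition function. This takes the practical availability of non-contact gesture interaction into account and reduces overall power consumption. This application does not limit the specific value of the second threshold.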
另外,在电子设备和/或目标应用不支持非接触式手势识别后,电子设备可显示对应的提示信息,方便向用户建议不使用非接触式手势来操控目标应用或者插电使用非接触式手势来操控目标应用。本申请对前述提示信息的具体实现方式不做限定。
可见,电子设备能够自适应启用或关闭非接触式手势识别的功能,保障了电子设备实现非接触式手势的人机交互的自适应性和可用性,降低了电子设备的整机功耗。
S102
S102中,在电子设备确定目标应用后,电子设备可利用喇叭23发射自相关的超声波信号。前述提及的自相关可参见前文的描述,此处不做赘述。其中,超声波信号可包括一种信号,也可包括多种信号,本申请对此不做限定。
在超声波信号包括一种信号的情况下,一种信号为自相关的信号,具有良好的自相关特性,且一种信号的频率范围在第一范围内。其中,本申请对喇叭23发射一种信号的方 式不做限定。
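When there is one speaker 23, the speaker may emit the one kind of signal continuously or at intervals; this application does not limit this. When there are multiple speakers 23, they may emit the one kind of signal in a time-division manner, in an interleaved manner, or simultaneously; this application does not limit this.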
在超声波信号包括多种信号的情况下,多种信号均为自相关的信号,均具有良好的自相关特性,且多种信号的频率范围均在第一范围内。其中,本申请对喇叭23发射多种信号的方式不做限定。
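When there is one speaker 23, multiple channels of that speaker may each emit one of the signals. When there are multiple speakers 23, the speakers may each emit one of the signals. The speaker 23 may emit the multiple signals in a time-division, interleaved, or simultaneous manner; this application does not limit this.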
举例而言,超声波信号可以包括两种信号,两种信号均为自相关的信号,两种信号均具有良好的自相关特性,两种信号的频率范围均在第一范围内,且采用多声道方式发射两种信号。
在喇叭的数量为一个时,一个喇叭可包括两个声道,两个声道可时分发射两种信号。即,一个声道发射第一发射信号,在第一发射信号发射结束后,另一个声道发射第二发射信号。
在喇叭的数量为至少两个时,针对对称设置的两个喇叭而言,这两个喇叭可时分发射两种信号。即,一个喇叭发射第一发射信号,在第一发射信号发射结束后,另一个喇叭发射第二发射信号。
另外,除了时分发射两种信号之外,两种信号的频率范围可相同,两种信号的频率变化速率可相反。
综上,电子设备启动喇叭发射超声波信号,使得超声波信号发射的覆盖范围更为广泛,也使得电子设备通过麦克风能够采集到超声波信号遇到非接触式手势反射的反射信号。
S103
S103中,在超声波信号发射后,超声波信号可遇到非接触式手势,也可遇到电子设备的周围环境。因此,超声波信号遇到非接触式手势反射的反射信号中难免夹杂着超声波信号遇到电子设备的周围环境反射的超声波信号。
其中,超声波信号遇到非接触式手势反射的反射信号的频率范围在第一范围内,超声波信号遇到电子设备的周围环境反射的超声波信号的频率范围不在第一范围内。
综上,电子设备通过麦克风可采集到超声波信号遇到非接触式手势后反射的反射信号。
S104
S104中,在麦克风采集到反射信号后,电子设备可利用特征提取模块25、模型推理模块26和反应识别模块27,根据目标应用以及反射信号,控制目标应用28响应非接触式手势。
下面,结合图8,详细介绍上述过程的一种可实现方式。
请参阅图8,图8为本申请一实施例提供的一种非接触式手势控制方法的流程示意图。如图8所示,本申请的非接触式手势控制方法可以包括:
S201、特征提取模块25提取反射信号的距离特征和速度特征。
S202、模型推理模块26根据距离特征和速度特征,确定非接触式手势的手势类别。
S203、反应识别模块27根据目标应用以及非接触式手势的手势类别,向目标应用28发送交互指令。
S204、目标应用28在接收到交互指令后,控制目标应用28响应非接触式手势。
综上,电子设备能够精准识别出非接触式手势,在目标应用中实现非接触式手势的人机交互,无需考虑周围环境的光线、器件成本和器件材质等约束,提升了用户使用电子设备和目标应用的体验,还降低了电子设备识别非接触式手势的成本。
S201
S201中,特征提取模块25可采用特征热度图来表征反射信号的特征。从而,特征提取模块25结合麦克风24的排列位置,可通过特征热度图确定出非接触式手势的手势类别。
下面,结合图9A-图9C,图10A-图10D和图11A-图11D,详细介绍特征热度图的具体实现方式。
为了便于说明,图9A-图9C,图10A-图10D和图11A-图11D中,以电子设备为笔记本电脑,电子设备中包括4个距离均匀设置的麦克风(mic1、mic2、mic3和mic4),4个麦克风皆设置在笔记本电脑的位于环绕键盘的正对用户侧的壳体侧边a5上,且电子设备中的喇叭发射两种信号为例进行示意。
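In FIGS. 9A-9C, 10A-10D, and 11A-11D, the horizontal axis of the feature heat map is the time dimension, expressed as a frame index. The duration of one frame, measured in milliseconds (ms), is greater than or equal to the sum of the minimum emission duration of one complete waveform of the first transmitted signal and that of the second transmitted signal.
Taking the first transmitted signal in FIG. 6A and the second transmitted signal in FIG. 6B as an example, the minimum emission duration of one complete waveform of the first transmitted signal is t16 in FIG. 6A, about 10 ms, and that of the second transmitted signal is t26 in FIG. 6B, about 10 ms. One frame therefore lasts at least t16 + t26, about 20 ms (10 ms + 10 ms).
If the interval between the first and second transmitted signals is 0, the speaker emits the second transmitted signal immediately after finishing the first. One frame then lasts exactly the sum of the two minimum emission durations.
If the interval is greater than 0, the speaker finishes emitting the first transmitted signal, waits for the interval, and then emits the second transmitted signal. One frame then lasts the sum of the two minimum emission durations plus the interval.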
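The frame timing can be written as a small helper. This sketch assumes a frame holds exactly one sweep of each transmitted signal plus an optional gap, which is the minimum frame length the text allows; the function names are illustrative.

```python
def frame_duration_ms(first_sweep_ms, second_sweep_ms, gap_ms=0.0):
    """Minimum frame length: both sweeps plus the gap between them."""
    if min(first_sweep_ms, second_sweep_ms, gap_ms) < 0:
        raise ValueError("durations must be non-negative")
    return first_sweep_ms + second_sweep_ms + gap_ms


def frame_start_ms(frame_index, frame_ms):
    """Start time of a frame, counting frames from 0."""
    return frame_index * frame_ms
```

With the roughly 10 ms sweeps from FIG. 6A and FIG. 6B and no gap, a frame is 20 ms, so frame 135 in a heat map would begin about 2.7 s after frame 0 under these assumptions.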
图9A-图9C,图10A-图10D和图11A-图11D中,特征热度图的纵坐标采用特征维度,纵坐标上的数字N代表第N个信道冲激响应(channel impulse response,CIR)特征,为无量纲量,可表征麦克风与用户的手之间的实际距离,纵坐标上的数字越小,实际距离越小。
由于喇叭需要发射两种发射信号(即第一发射信号和第二发射信号)。因此,麦克风可采集到两种反射信号(即第一反射信号和第二反射信号)。这样,每个麦克风中可包括2个通道,且每个通道可采集一种反射信号,即一个通道可采集第一反射信号,另一个通道可采集第二反射信号。
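With 4 microphones there are 8 channels in total. Each channel is divided into 64 dimensions: one channel of each microphone maps the changes of the collected first reflected signal onto 64 dimensions of the vertical axis, and the other channel maps the changes of the collected second reflected signal onto another 64 dimensions. The CIR features on the vertical axis therefore have a total length of 512 (8 × 64).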
其中:
纵坐标上的数字1-64和65-128代表mic1的两个通道,即mic1的一个通道采集第一反射信号,mic1的另一个通道采集第二反射信号。
纵坐标上的数字129-192和193-256代表mic2的两个通道,即mic2的一个通道采集第一反射信号,mic2的另一个通道采集第二反射信号。
纵坐标上的数字257-320和321-384代表mic3的两个通道,即mic3的一个通道采集第一反射信号,mic3的另一个通道采集第二反射信号。
纵坐标上的数字385-448和449-512代表mic4的两个通道,即mic4的一个通道采集第一反射信号,mic4的另一个通道采集第二反射信号。
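The row layout listed above can be captured by an index mapping. The sketch assumes that, within each microphone, the first block of 64 rows belongs to the first reflected signal and the second block to the second reflected signal; the text implies this order but does not state it outright.

```python
ROWS_PER_CHANNEL = 64
CHANNELS_PER_MIC = 2
NUM_MICS = 4


def locate_row(row):
    """Map a 1-based heat-map row to (mic, signal, bin), all 1-based."""
    total_rows = ROWS_PER_CHANNEL * CHANNELS_PER_MIC * NUM_MICS
    if not 1 <= row <= total_rows:
        raise ValueError("row must be between 1 and %d" % total_rows)
    zero_based = row - 1
    mic = zero_based // (ROWS_PER_CHANNEL * CHANNELS_PER_MIC) + 1
    signal = (zero_based // ROWS_PER_CHANNEL) % CHANNELS_PER_MIC + 1
    cir_bin = zero_based % ROWS_PER_CHANNEL + 1
    return mic, signal, cir_bin
```

For instance, row 129 is the first bin of mic2's channel for the first reflected signal, and row 512 is the last bin of mic4's channel for the second.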
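In FIGS. 9A-9C, 10A-10D, and 11A-11D, each point of the feature heat map uses a color dimension (black, white, and gray) to represent the magnitude of the CIR feature, which indicates the energy of the reflected signal. The whiter a point, the stronger the reflected signal and the closer the user's hand is to the microphone. A black or dark gray point indicates that the user's hand is stationary or is farther from the microphone.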
请参阅图9A-图9C,图9A为本申请一实施例提供的一种非接触式手势为用户的手靠近电子设备的场景示意图,图9B为本申请一实施例提供的一种非接触式手势为用户的手远离电子设备的场景示意图,图9C为本申请一实施例提供的一种非接触式手势为用户的手先靠近电子设备后远离电子设备对应的特征热度图的示意图。
图9A-图9B中,虚线用于表示用户的手与笔记本电脑之间有一定的距离,即非接触式手势为隔空手势。图9A中,用户的手沿着射线AB所在的方向逐渐靠近笔记本电脑,称为前述非接触式手势为推动作。图9B中,用户的手沿着射线BA所在的方向逐渐远离笔记本电脑,称为前述非接触式手势为拉动作。
由于4个麦克风的排列位置皆设置在笔记本电脑的一个面上。因此,图9C中,在用户的手靠近或远离电子设备时,用户的手与4个麦克风之间的实际距离的变化趋势是相同,即4个麦克风分别接收到的反射信号(第一反射信号或第二反射信号)的能量的变化走势也是相同。
以纵坐标上的数字1-64对应的mic1的通道为例,图9C中,随着时间的推移,在135帧-158帧之间,坐标点的颜色变白且纵坐标上的数字变小,即用户的手靠近4个麦克风;在159帧-178帧之间,坐标点的颜色变黑且纵坐标上的数字变大,即用户的手逐渐远离4个麦克风。
从而,电子设备可确定:用户的手先执行推动作,后执行拉动作。
请参阅图10A-图10D,图10A为本申请一实施例提供的一种非接触式手势为用户的手向下挥动的场景示意图,图10B为本申请一实施例提供的一种非接触式手势为用户的手向下挥动对应的特征热度图,图10C为本申请一实施例提供的一种非接触式手势为用户的手向上挥动的场景示意图,图10D为本申请一实施例提供的一种非接触式手势为用户的手向上挥动对应的特征热度图。
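In FIGS. 10A and 10C, the dashed lines indicate that there is a certain distance between the user's hand and the notebook computer, that is, the non-contact gesture is an air gesture. In FIG. 10A, the user's hand swipes downward along ray CD; this non-contact gesture is called a swipe-down action. In FIG. 10C, the user's hand swipes upward along ray DC; this non-contact gesture is called a swipe-up action.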
由于4个麦克风的排列位置皆设置在笔记本电脑的一个面上。因此,图10B中,在用户的手向下挥动或向上挥动时,用户的手与4个麦克风之间的实际距离的变化趋势是相同,即4个麦克风分别接收到的反射信号(第一反射信号或第二反射信号)的能量的变化走势也是相同。
以纵坐标上的数字1-64对应的mic1的通道为例,图10B中,随着时间的推移,在169帧-188帧之间,坐标点的颜色变白且纵坐标上的数字变小,即用户的手靠近4个麦克风;在189帧-250帧之间,坐标点的颜色变黑,可表示用户的手远离4个麦克风。
从而,电子设备可确定:用户的手执行下挥动作。
以纵坐标上的数字1-64对应的mic1的通道为例,图10D中,随着时间的推移,在151帧-178帧之间,坐标点的颜色从白色变为灰色且纵坐标上的数字变大,即用户的手远离4个麦克风。在179帧-233帧之间,坐标点的颜色变黑,可表示用户的手远离4个麦克风。
从而,电子设备可确定:用户的手执行上挥动作。
请参阅图11A-图11D,图11A为本申请一实施例提供的一种非接触式手势为用户的手向左挥动的场景示意图,图11B为本申请一实施例提供的一种非接触式手势为用户的手向左挥动对应的特征热度图,图11C为本申请一实施例提供的一种非接触式手势为用户的手向右挥动的场景示意图,图11D为本申请一实施例提供的一种非接触式手势为用户的手向右挥动对应的特征热度图。
图11A和图11C中,虚线用于表示用户的手与笔记本电脑之间有一定的距离,即非接触式手势为隔空手势。图11A中,用户的手沿着射线EF所在的方向向左挥动,称为前述非接触式手势为左挥动作。图11C中,用户的手沿着射线FE所在的方向向右挥动,称为前述非接触式手势为右挥动作。
由于4个麦克风的排列位置皆设置在笔记本电脑的一个面上。因此,图11B中,在用户的手向左挥动或向右挥动时,用户的手与4个麦克风之间的实际距离的变化趋势是相同,即4个麦克风分别接收到的反射信号(第一反射信号或第二反射信号)的能量的变化走势也是相同。
以纵坐标上的数字1-64对应的mic1的通道为例,图11B中,随着时间的推移,在149帧-212帧之间,坐标点的颜色变白且纵坐标上的数字先变小再变大,即用户的手先靠近再远离4个麦克风。
并且,在同一时刻,纵坐标上的数字1-64对应的mic1的通道、纵坐标上的数字129-192对应的mic2的通道、纵坐标上的数字257-320对应的mic3的通道和纵坐标上的数字385-448对应的mic4的通道,坐标点的颜色变白,可表示反射信号的能量依次升高。
从而,电子设备可确定:用户的手执行左挥动作。
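Taking the mic1 channel corresponding to rows 1-64 of the vertical axis as an example, in FIG. 11D, between frame 156 and frame 243, the points turn white and the row index first decreases and then increases as time passes, that is, the user's hand first approaches and then moves away from the 4 microphones.
Moreover, at the same moment, across the mic1 channel (rows 1-64), the mic2 channel (rows 129-192), the mic3 channel (rows 257-320), and the mic4 channel (rows 385-448), the points become progressively darker, which indicates that the energy of the reflected signal decreases from one microphone to the next.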
从而,电子设备可确定:用户的手执行右挥动作。
综上,电子设备结合麦克风的排列位置和特征热度图,可确定出非接触式手势的手势类别。其中,本申请提及的麦克风的排列位置可理解为:在麦克风的数量为一个时,麦克风的排列位置即一个麦克风在电子设备中的布局;在麦克风的数量为至少两个时,麦克风的排列位置即至少两个麦克风在电子设备中的布局。
从而,特征提取模块25可从特征热度图中提取反射信号的距离特征和速度特征。
请参阅图12,图12为本申请一实施例提供的一种电子设备确定距离特征和速度特征的流程框图。
如图12所示,特征提取模块25提取距离特征和速度特征的具体过程可以包括如下步骤:
S11:特征提取模块25可同步第一发射信号和第一反射信号,使得电子设备可获取到第一反射信号的起始位置,方便后续执行传播信道的脉冲响应估计。
对应地,特征提取模块25可同步第二发射信号和第二反射信号,使得电子设备可获取到第二反射信号的起始位置,方便后续执行传播信道的脉冲响应估计。
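S12: The feature extraction module 25 uses the good autocorrelation property of the first reflected signal to estimate the impulse response of the propagation channel: it applies a discrete Fourier transform (DFT) matrix, performs matched filtering, and then applies an inverse discrete Fourier transform (IDFT) to the spectrum of the matched filter output.
Correspondingly, the feature extraction module 25 estimates the impulse response of the propagation channel from the second reflected signal in the same way (DFT, matched filtering, then IDFT).
As a result, the feature extraction module 25 can output CIR features; that is, from the first and second reflected signals it can draw the feature heat map, in which the CIR features are represented. The CIR features include the CIR features of the non-contact gesture and the CIR features of the environment around the electronic device.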
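The DFT, matched filter, IDFT chain of S12 can be sketched on a toy signal. The example below is an illustration under assumptions, not the application's algorithm: it uses a short Zadoff-Chu probe instead of a chirp, a naive DFT instead of an optimized transform, circular delays, and no noise. All function names are hypothetical.

```python
import cmath
import math


def make_probe(length=31, root=1):
    """Zadoff-Chu probe (an assumption; any sharply autocorrelated signal works)."""
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]


def dft(x, inverse=False):
    """Naive O(N^2) DFT or IDFT, kept simple for illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def estimate_cir(received, probe):
    """DFT, multiply by the conjugate probe spectrum (matched filter), then IDFT."""
    spectrum_rx = dft(received)
    spectrum_tx = dft(probe)
    matched = [r * p.conjugate() for r, p in zip(spectrum_rx, spectrum_tx)]
    energy = sum(abs(p) ** 2 for p in probe)
    return [v / energy for v in dft(matched, inverse=True)]


def simulate_echoes(probe, paths):
    """Sum of circularly delayed, scaled probe copies; paths = [(delay, gain), ...]."""
    n = len(probe)
    return [sum(gain * probe[(i - delay) % n] for delay, gain in paths)
            for i in range(n)]


probe = make_probe()
received = simulate_echoes(probe, [(3, 0.5), (7, 0.2)])  # two reflecting paths
cir = estimate_cir(received, probe)
```

Each reflecting path appears as one tap of the estimated CIR at its delay, with magnitude equal to its gain; a later tap corresponds to a longer echo path and hence a more distant reflector.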
S13:特征提取模块25在时间维度上对CIR特征进行差分,可得到相对运动的差分信道冲激响应(diff channel impulse response,dCIR)特征。其中,dCIR特征可表示非接触式手势的距离变化。从而,去除固定不变的电子设备的周围环境的影响。可见,dCIR特征即距离特征。
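The frame-to-frame difference in S13 is simple enough to show directly. In this sketch each frame is a list of CIR tap values, and the toy numbers are invented for illustration: tap 0 stands for a static reflector and the other taps for a moving hand.

```python
def differential_cir(cir_frames):
    """dCIR: difference between each frame and the one before it, tap by tap."""
    return [[current - previous for current, previous in zip(cur_frame, prev_frame)]
            for prev_frame, cur_frame in zip(cir_frames, cir_frames[1:])]


# Tap 0: static environment (constant). Taps 1 and 2: a reflector that moves.
frames = [
    [1.0, 0.0, 0.0],
    [1.0, 0.5, 0.0],
    [1.0, 0.0, 0.5],
]
dcir = differential_cir(frames)
```

The static tap cancels to zero in every dCIR frame, while the moving reflector leaves a non-zero trace, which is why the difference suppresses the fixed surroundings.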
S14:特征提取模块25在时间维度上使用dCIR特征在长时间周期内进行傅里叶变换(如快速傅里叶变换(fast Fourier transform,FFT)),可得到该长时间周期内的多普勒特征。其中,多普勒特征可表示非接触式手势的速度变化。可见,多普勒特征即速度特征。
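The Doppler step in S14 amounts to a Fourier transform along the frame axis for each CIR tap. The sketch below uses a naive DFT and a synthetic tap whose phase rotates at a known rate. The 16-frame window and 20 ms frame length are assumptions, and converting the Doppler frequency to a hand speed would additionally need the carrier wavelength, which is not covered here.

```python
import cmath
import math


def doppler_spectrum(dcir_frames, tap):
    """Magnitude of the DFT along slow time (the frame axis) for one tap."""
    series = [frame[tap] for frame in dcir_frames]
    n = len(series)
    return [abs(sum(series[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n)))
            for k in range(n)]


def doppler_bin_to_hz(k, n_frames, frame_ms):
    """Signed Doppler frequency of bin k; the upper half of the bins is negative."""
    k_signed = k if k < n_frames / 2 else k - n_frames
    return k_signed * 1000.0 / (n_frames * frame_ms)


# Synthetic tap rotating 3 cycles over 16 frames (assumed values).
N_FRAMES = 16
frames = [[cmath.exp(2j * math.pi * 3 * m / N_FRAMES)] for m in range(N_FRAMES)]
spectrum = doppler_spectrum(frames, tap=0)
peak_bin = max(range(N_FRAMES), key=lambda k: spectrum[k])
```

The sign of the peak frequency distinguishes approaching from receding motion, and its magnitude grows with speed, which is the information the velocity feature carries.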
从而,特征提取模块25可提取出反射信号的距离特征和速度特征。
S202
S202中,模型推理模块26结合麦克风的排列位置,可根据距离特征以及速度特征,识别出非接触式手势的手势类别。
在一些实施例中,模型推理模块26可从特征提取模块25接收距离特征和速度特征。
需要说明的是,由于非接触式手势需要花费一定的执行时长。因此,特征提取模块25可将固定帧数的距离特征和速度特征发送给模型推理模块26,使得模型推理模块26能够识别出一个完整的非接触式手势。
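The fixed number of frames is related to the time needed to perform a non-contact gesture, and also to how quickly the electronic device should respond to a recognized gesture. This application does not limit its specific value. For example, the fixed number of frames is 128.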
在接收到距离特征和速度特征后,模型推理模块26可将距离特征和速度特征输入到手势识别算法对应的深度学习分类模型进行推理,得到非接触式手势的手势类别。
其中,手势识别算法对应的深度学习分类模型可采用如时序卷积神经网络(temporal convolutional networks,TCN)等模型。
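A TCN is a variant of the convolutional neural network that handles sequence modeling tasks by combining aspects of recurrent neural network (RNN) and convolutional neural network (CNN) architectures. Compared with a long short-term memory (LSTM) network, the convolutional architecture of a TCN performs better across a range of tasks and datasets while retaining a longer effective memory.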
TCN模型以CNN模型为基础,并做了如下改进:1、适用序列模型:因果卷积(causal convolution);2、记忆历史:空洞卷积/膨胀卷积(dilated convolution),残差模块(residual block)。并且,相比于RNN模型,TCN模型的体系结构可以采用任意长度的序列,并将其映射到相同长度的输出序列。从而,TCN模型具有非常长的有效历史记录,网络带有残差层的增强和扩张卷积的组合。
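The three ingredients named above (causal convolution, dilation, and a residual connection) can be shown in plain Python. This is a deliberately reduced sketch with one channel and one convolution per block; practical TCN blocks usually stack two convolutions with normalization and dropout, and nothing here reflects the actual model of this application.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """y[t] = sum_i weights[i] * x[t - i * dilation]; inputs before t = 0 are zero."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(weights):
            idx = t - i * dilation
            if idx >= 0:  # never look at samples from the future
                acc += w * x[idx]
        y.append(acc)
    return y


def residual_block(x, weights, dilation):
    """Causal dilated convolution, ReLU, then add the input back (residual)."""
    conv = causal_dilated_conv1d(x, weights, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]


def receptive_field(kernel_size, dilations):
    """Number of past-and-present samples that influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

The output has the same length as the input, and doubling the dilation at each layer (1, 2, 4, 8, ...) makes the receptive field grow exponentially with depth, which is where the long effective history comes from.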
请参阅图13,图13为本申请一实施例提供的一种电子设备确定非接触式手势的手势类别的流程框图。
如图13所示,TCN模型的具体推理过程可包括如下步骤:
S21:模型推理模块26将距离特征和速度特征输入到手势分类模型中,可输出非接触式手势的分类结果。
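The gesture classification model may be a multi-layer network model (FIG. 13 illustrates it with multiple convolutional layers). Specifically, a CNN model fuses multiple feature matrices, which then pass through multiple layers of causal convolution with a residual connection at each layer; AutoML is used to search for and optimize the network structure, and the model is then tuned on multiple training sets to produce classification results for multiple kinds of non-contact gestures. A classification result can be expressed as a coarse-grained gesture category. For example, the results may be divided into palm actions, finger actions, and hand actions; or into hovering, swiping, and changes in the number of fingers.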
S22:模型推理模块26将距离特征和速度特征输入到手势追踪模型中,可输出非接触式手势的移动区域(如3D坐标)。
其中,手势追踪模型可采用多层网络模型(图13中采用多层卷积层进行示意),具体通过深度摄像机及全部能够检测手部位置的设备为每一次手势动作记录坐标,同时通过坐标可以计算手势的速度,通过这些标注数据(如手势类别、坐标、速度等),重新设计网络模型,并通过网络模型训练,实现分类功能,还可预测手的坐标及运动速度。
此外,电子设备根据手的坐标,实现对手势作用范围的控制;电子设备根据手势的速度信息,实现粗略的手势分类。并且,对于未识别成功的手势数据不会进入手势分类模型进行二次判断,直接输出成其他类别。对于识别成功的手势数据可再次通过手势分类模型进行二次判断。这样,前述操作有助于减少交互反应的误触,提高非接触式手势的分类精度。
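Note that, in addition to the above implementations, the gesture classification model and the gesture tracking model may implement their functions in other ways. The two models may also be multi-layer network models with the same structure but different parameters.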
S23:模型推理模块26根据非接触式手势的分类结果和非接触式手势的移动区域进行融合推理,可得到细粒度的非接触式手势的手势类别。综上,模型推理模块26结合麦克风的排列位置,通过手势分类模型和手势追踪模型,根据距离特征和速度特征,可识别出非接触式手势的手势类别。
从而,模型推理模块26通过双重校验方式,可识别是否是非接触式手势以及识别出非接触式手势的手势类别。
S203
S203中,反应识别模块27根据目标应用以及非接触式手势的手势类别,可向目标应用28发送交互指令。
在一些实施例中,反应识别模块27根据目标应用以及非接触式手势的手势类别,可先确定非接触式手势对应的交互响应,再向目标应用28发送交互响应的交互指令。
其中,本申请对交互响应的具体实现方式不做限定。在一些实施例中,交互响应为与非接触式手势对应的如上下翻页、左右翻页、增加音量、降低音量、切换视频、拨打电话、截屏、截取长图、录制屏幕、切换应用、切换至主屏幕、切换至负一屏、选择多进程等响应。
另外,交互响应也可采用如键盘的控制事件、鼠标的点击事件、屏幕像素点的分块后的触碰事件等进行表示,使得目标应用28能够根据前述事件执行非接触式手势对应的交互响应。
由于同一个非接触式手势在不同类型的应用中所表现的交互响应可能相同或不同。因此,反应识别模块27根据各个应用的标识、非接触式手势的手势类别以及交互响应间的对应关系,根据目标应用以及非接触式手势的手势类别,可确定非接触式手势响应于目标应用28的交互响应。
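This application does not limit how the correspondence is implemented. In some embodiments, the correspondence may be stored in the electronic device as a table, matrix, array, or set of key-value pairs. The reaction identification module 27 can thus determine the interaction instruction for the interaction response and send it to the target application 28.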
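A key-value form of the correspondence could look like the sketch below. The application identifiers, gesture labels, and response names are all hypothetical placeholders; the point is only that the same gesture can map to different responses in different applications.

```python
# Hypothetical correspondence: (application identifier, gesture category) -> response.
INTERACTION_TABLE = {
    ("com.example.reader", "swipe_up"): "page_up",
    ("com.example.reader", "swipe_down"): "page_down",
    ("com.example.video", "swipe_up"): "volume_up",
    ("com.example.video", "swipe_left"): "previous_video",
}


def resolve_interaction(app_id, gesture, table=INTERACTION_TABLE):
    """Return the interaction response for this app and gesture, or None if unmapped."""
    return table.get((app_id, gesture))
```

Returning `None` for an unmapped pair gives the caller a simple way to ignore gestures that the current application does not use.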
S204
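In S204, after receiving the interaction instruction, the target application 28 responds to the non-contact gesture, that is, it executes the interaction response corresponding to the gesture.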
综上,电子设备通过非接触式手势,可及时且准确地控制目标应用执行非接触式手势对应的交互响应,使得电子设备能够准确且快速地在目标应用中实现非接触式手势的人机交互。
示例性地,本申请提供一种电子设备,包括:存储器和处理器;存储器用于存储程序指令;处理器用于调用存储器中的程序指令使得电子设备执行前文实施例中的非接触式手势控制方法。
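By way of example, this application provides a chip system applied to an electronic device that includes a memory, a display screen, and a sensor. The chip system includes a processor; when the processor executes computer instructions stored in the memory, the electronic device performs the non-contact gesture control method of the foregoing embodiments.
By way of example, this application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the electronic device implements the non-contact gesture control method of the foregoing embodiments.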
示例性地,本申请提供一种计算机程序产品,包括:执行指令,执行指令存储在可读存储介质中,电子设备的至少一个处理器可以从可读存储介质读取执行指令,至少一个处理器执行执行指令使得电子设备实现前文实施例中的非接触式手势控制方法。
在上述实施例中,全部或部分功能可以通过软件、硬件、或者软件加硬件的组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如,固态硬盘(solid state disk,SSD))等。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,该流程可以由计算机程序来指令相关的硬件完成,该程序可存储于计算机可读取存储介质中,该程序在执行时,可包括如上述各方法实施例的流程。而前述的存储介质包括:ROM或随机存储记忆体RAM、磁碟或者光盘等各种可存储程序代码的介质。

Claims (15)

  1. 一种非接触式手势控制方法,其特征在于,应用于电子设备,所述电子设备包括:至少一个喇叭和至少一个麦克风,所述方法包括:
    显示目标应用,所述目标应用为支持非接触式手势识别的应用程序;
    启动所述至少一个喇叭发射自相关的超声波信号;
    启动所述至少一个麦克风采集所述超声波信号遇到非接触式手势后反射的反射信号;
    根据所述目标应用以及所述反射信号,控制所述目标应用响应所述非接触式手势。
  2. 根据权利要求1所述的方法,其特征在于,
    在所述超声波信号包括一种信号时,所述一种信号为自相关的信号,且所述一种信号的频率范围在第一范围内;
    其中,所述第一范围与所述麦克风的采样率以及超声波频率范围相关。
  3. 根据权利要求1所述的方法,其特征在于,
    在所述超声波信号包括两种信号时,所述两种信号均为自相关的信号,所述两种信号的频率范围均在第一范围内,所述两种信号时分发射,且所述两种信号的频率范围相同和频率变化速率相反;
    其中,所述第一范围与所述麦克风的采样率以及超声波频率范围相关。
  4. 根据权利要求3所述的方法,其特征在于,所述启动所述至少一个喇叭发射超声波信号,包括:
    在所述电子设备包括一个喇叭时,启动所述一个喇叭的两个声道时分发射所述两种信号;
    在所述电子设备包括两个喇叭时,启动所述两个喇叭时分发射所述两种信号。
  5. 根据权利要求1所述的方法,其特征在于,所述超声波信号采用线性调频信号。
  6. 根据权利要求1所述的方法,其特征在于,
    所述麦克风的频响的最低值在第二范围内,所述第二范围用于确保所述麦克风可接收到所述反射信号。
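  7. The method according to claim 1, wherein controlling, according to the target application and the reflected signal, the target application to respond to the non-contact gesture comprises:
    extracting a distance feature and a velocity feature of the reflected signal, wherein the distance feature represents the change in distance between the swiping position of the non-contact gesture and the microphone, and the velocity feature represents the change in velocity of the non-contact gesture;
    determining the gesture category of the non-contact gesture according to the distance feature and the velocity feature; and
    controlling, according to the target application and the gesture category of the non-contact gesture, the target application to respond to the non-contact gesture.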
  8. 根据权利要求1所述的方法,其特征在于,
    在所述电子设备包括一个麦克风时,所述电子设备还包括:所述麦克风的遮挡件,所述遮挡件用于调整所述超声波信号的传播方向和/或传播量,以使所述电子设备能够区分不同方向的非接触式手势;
    在所述电子设备包括至少两个麦克风时,麦克风间的最大距离大于第一阈值,所述第一阈值用于确保所述至少两个麦克风与同一喇叭间存在位置差异。
  9. 根据权利要求1-8任一项所述的方法,其特征在于,在启动所述至少一个喇叭发射自相关的超声波信号之前,所述方法还包括:
    确定所述电子设备的第一开关处于第一状态,所述第一开关的第一状态用于指示所述电子设备启用非接触式手势识别的功能。
  10. 根据权利要求1-9任一项所述的方法,其特征在于,在启动所述至少一个喇叭发射自相关的超声波信号之前,所述方法还包括:
    确定所述目标应用支持非接触式手势识别。
  11. 根据权利要求10所述的方法,其特征在于,所述确定所述目标应用支持非接触式手势识别,包括:
    在所述电子设备的存储模块中存储有所述目标应用的标识时,确定所述目标应用支持非接触式手势识别,所述存储模块用于存储支持非接触式手势识别的全部应用的标识;
    或者,在所述目标应用的第二开关处于第二状态时,确定所述目标应用支持非接触式手势识别,所述第二开关的状态用于表示所述目标应用是否支持非接触式手势识别。
  12. 根据权利要求1-11任一项所述的方法,其特征在于,所述方法还包括:
    在所述电子设备未插电或所述电子设备的剩余电量小于第二阈值时,停止启动所述至少一个喇叭以及所述至少一个麦克风。
  13. 一种电子设备,其特征在于,包括:存储器和处理器;
    所述存储器用于存储程序指令;
    所述处理器用于调用所述存储器中的程序指令使得所述电子设备执行权利要求1-12任一项所述的非接触式手势控制方法。
  14. 一种计算机可读存储介质,其特征在于,包括计算机指令,当所述计算机指令在电子设备上运行时,使得所述电子设备执行如权利要求1-12任一项所述的非接触式手势控制方法。
  15. 一种计算机程序产品,其特征在于,当所述计算机程序产品在计算机上运行时,使得所述计算机执行如权利要求1-12任一项所述的非接触式手势控制方法。
PCT/CN2022/114295 2021-10-13 2022-08-23 非接触式手势控制方法和电子设备 WO2023061054A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111194277.9 2021-10-13
CN202111194277.9A CN115981454A (zh) 2021-10-13 2021-10-13 非接触式手势控制方法和电子设备

Publications (1)

Publication Number Publication Date
WO2023061054A1 (zh) 2023-04-20

Family

ID=85956759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/114295 WO2023061054A1 (zh) 2021-10-13 2022-08-23 Non-contact gesture control method and electronic device

Country Status (2)

Country Link
CN (1) CN115981454A (zh)
WO (1) WO2023061054A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130154919A1 (en) * 2011-12-20 2013-06-20 Microsoft Corporation User control gesture detection
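CN104978022A (zh) * 2014-04-10 2015-10-14 MediaTek Inc. Ultrasound-based non-contact gesture recognition method and apparatus
CN105718064A (zh) * 2016-01-22 2016-06-29 Nanjing University Ultrasound-based gesture recognition system and method
CN106446801A (zh) * 2016-09-06 2017-02-22 Tsinghua University Micro-gesture recognition method and system based on active ultrasonic detection
CN107291308A (zh) * 2017-07-26 2017-10-24 Shanghai Kostal-Huayang Automotive Electric Co., Ltd. Gesture recognition apparatus and recognition method thereof
CN107943300A (zh) * 2017-12-07 2018-04-20 Shenzhen University Ultrasound-based gesture recognition method and system
US20190041994A1 (en) * 2017-08-04 2019-02-07 Center For Integrated Smart Sensors Foundation Contactless gesture recognition system and method thereof
CN109857245A (zh) * 2017-11-30 2019-06-07 Tencent Technology (Shenzhen) Co., Ltd. Gesture recognition method and terminal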

Also Published As

Publication number Publication date
CN115981454A (zh) 2023-04-18

Similar Documents

Publication Publication Date Title
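JP7142783B2 (ja) Voice control method and electronic device
WO2021013158A1 (zh) Display method and related apparatus
WO2021000803A1 (zh) Method for controlling a small on-screen window and related device
WO2021129326A1 (zh) Screen display method and electronic device
WO2020177585A1 (zh) Gesture processing method and device
WO2020052529A1 (zh) Method for quickly bringing up a small window during full-screen video display, graphical user interface, and terminal
WO2020211701A1 (zh) Model training method, emotion recognition method, and related apparatus and device
CN113645351B (zh) Application interface interaction method, electronic device, and computer-readable storage medium
WO2021036770A1 (zh) Split-screen processing method and terminal device
WO2019072178A1 (zh) Notification processing method and electronic device
WO2021185244A1 (zh) Device interaction method and electronic device
WO2021063098A1 (zh) Touchscreen response method and electronic device
WO2022068483A1 (zh) Application launch method and apparatus, and electronic device
WO2021078032A1 (zh) User interface display method and electronic device
WO2022068819A1 (zh) Interface display method and related apparatus
WO2022037726A1 (zh) Split-screen display method and electronic device
WO2020238759A1 (zh) Interface display method and electronic device
WO2022017393A1 (zh) Display interaction system, display method, and device
WO2021052139A1 (zh) Gesture input method and electronic device
WO2021082815A1 (zh) Method for displaying display elements and electronic device
WO2022022609A1 (zh) Method for preventing accidental touches and electronic device
US20230168784A1 (en) Interaction method for electronic device and electronic device
WO2022007707A1 (zh) Home device control method, terminal device, and computer-readable storage medium
WO2022028290A1 (zh) Method for interaction between devices based on a pointing operation, and electronic device
CN110058729B (zh) Method for adjusting touch detection sensitivity and electronic device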

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22880001; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2022880001; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2022880001; Country of ref document: EP; Effective date: 20240319)