WO2022156438A1 - Wake-up method and electronic device - Google Patents

Wake-up method and electronic device

Info

Publication number
WO2022156438A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
wave signal
sound
ultrasonic
pickup direction
Prior art date
Application number
PCT/CN2021/138534
Other languages
English (en)
French (fr)
Inventor
刘长飞
李树为
孙渊
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21920812.1A (EP4258259A4)
Publication of WO2022156438A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/02 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems using reflection of acoustic waves
    • G01S15/04 Systems determining presence of a target
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/86 Combinations of sonar systems with lidar systems; Combinations of sonar systems with systems not using wave reflection
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/87 Combinations of sonar systems
    • G01S15/876 Combination of several spaced transmitters or receivers of known location for determining the position of a transponder or a reflector
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S3/00 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
    • G01S3/80 Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
    • G01S3/802 Systems for determining direction or deviation from predetermined direction
    • G01S3/808 Systems for determining direction or deviation from predetermined direction using transducers spaced apart and measuring phase or time difference between signals therefrom, i.e. path-difference systems
    • G01S3/8083 Systems for determining direction or deviation from predetermined direction using transducers spaced apart and measuring phase or time difference between signals therefrom, i.e. path-difference systems determining direction of source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/88 Sonar systems specially adapted for specific applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L2015/088 Word spotting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2499/00 Aspects covered by H04R or H04S not otherwise provided for in their subgroups
    • H04R2499/10 General applications
    • H04R2499/11 Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's

Definitions

  • the present application relates to the field of terminals, and in particular, to a wake-up method and an electronic device.
  • voice assistants (for example, Little E, Siri, etc.)
  • an electronic device presets one or more wake-up words (e.g., "hello little E", "hi Siri", etc.). After detecting a preset wake-up word, the electronic device wakes up and interacts with the user through the voice assistant.
  • the electronic device does not wake up even if the sound wave signal sent by the user contains a preset wake-up word; or, sometimes the electronic device wakes up even though the sound wave signal sent by the user does not contain the preset wake-up word. This brings a bad experience to the user.
  • the present application provides a wake-up method and an electronic device.
  • the technical solution provided by the present application can improve the accuracy of wake-up of the electronic device, reduce the probability of false wake-up of the electronic device, and improve the user experience.
  • an electronic device which is in an unawakened state.
  • the electronic device includes: a processor; a memory; M microphones (M is a positive integer greater than 1), each microphone corresponding to a sound pickup inlet, where the M sound pickup inlets of the M microphones are located on a first surface of the electronic device, the first surface lies in a plane, and the distance between any two of the M microphones is fixed; P ultrasonic transmitters (P is a positive integer greater than or equal to 1), each ultrasonic transmitter corresponding to an ultrasonic emission port, where the P ultrasonic emission ports of the P ultrasonic transmitters are located on a second surface that is different from the first surface; and Q ultrasonic receivers (Q is a positive integer greater than 1), each ultrasonic receiver corresponding to an ultrasonic receiving port, where the Q ultrasonic receiving ports of the Q ultrasonic receivers are located on a third surface of the electronic device, and the third surface lies in a plane;
  • the first sound wave signal is detected by the M microphones; in response to the first sound wave signal, a first sound pickup direction is obtained according to the arrival time difference of the first sound wave signal at at least two of the M microphones and the distance between some or all of the at least two microphones; wherein the first sound pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane where the first surface is located, relative to a fixed point on that plane (the fixed point is different from the first projection point); further, a first sound wave signal component of the first sound wave signal in the first sound pickup direction is obtained; when the similarity between the first sound wave signal component and the preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, a second sound wave signal, which is an ultrasonic signal, is transmitted through the P ultrasonic transmitters; further, the second sound wave signal is received by the Q ultrasonic receivers; in response
  • the direction of the second projection point relative to the fixed point (the fixed point is different from the second projection point); further, a second sound wave signal component of the first sound wave signal in the second sound pickup direction is obtained; when the similarity between the second sound wave signal component and the preset wake-up word is greater than a preset third threshold, the first sound wave signal is considered to contain the wake-up word, and the electronic device wakes up.
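As a rough illustration of how an arrival time difference and a fixed microphone spacing can yield a sound pickup direction, the following Python sketch uses a far-field, two-microphone model. The model, the speed-of-sound constant, and the name `pickup_direction_deg` are assumptions made for illustration; the application does not state which formula it uses.

```python
import math

# Assumption: speed of sound in air at roughly 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def pickup_direction_deg(delta_t_s: float, mic_spacing_m: float,
                         speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Estimate a direction from one time difference of arrival.

    delta_t_s is (arrival time at microphone 2) - (arrival time at microphone 1).
    The returned angle is measured from the axis pointing from microphone 2
    toward microphone 1, so 0 degrees means the source lies on that axis on
    the microphone 1 side, and 90 degrees means it is broadside.

    Far-field assumption: path difference = speed * delta_t = spacing * cos(angle).
    """
    if mic_spacing_m <= 0:
        raise ValueError("microphone spacing must be positive")
    ratio = speed_m_s * delta_t_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))


if __name__ == "__main__":
    # Hypothetical values: microphones 10 cm apart, 0.1 ms time difference.
    print(pickup_direction_deg(1e-4, 0.10))
```

A single microphone pair only resolves the angle up to a mirror ambiguity about the microphone axis, which is one reason a device might combine several pairs.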
  • the wake-up method provided by the electronic device in the first aspect can be divided into two stages.
  • the electronic device can first locate the first sound pickup direction according to the sound wake-up process, and then identify the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word.
  • the wake-up method can enter the second stage.
  • the electronic device can use the ultrasonic signal to locate the second sound pickup direction, and then identify the similarity between the sound wave signal component of the first sound wave signal in the second pickup direction and the preset wake-up word.
  • the electronic device wakes up.
  • the electronic device can determine the final sound source position of the user by locating the sound pickup direction in two stages, so as to identify the wake-up word according to the final sound source position, improving the wake-up accuracy of the electronic device and reducing the probability of false wake-up of the electronic device.
  • the electronic device further performs: waking up the electronic device. That is to say, if in the above-mentioned first stage the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word is high, the first sound wave signal detected by the electronic device is close to the preset wake-up word, so the electronic device is woken up and there is no need to enter the second stage to perform positioning again.
  • the electronic device further performs: remaining in the unawakened state. That is to say, if in the above-mentioned first stage the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word is low, the first sound wave signal detected by the electronic device differs considerably from the preset wake-up word, so the electronic device can remain in the unawakened state and does not need to enter the second stage to perform positioning again.
  • the electronic device further performs: remaining in the unawakened state. That is to say, if in the above-mentioned second stage the similarity between the sound wave signal component of the first sound wave signal in the second sound pickup direction and the preset wake-up word is relatively high, then although the first stage found that the similarity between the first sound wave signal and the wake-up word was not high, ultrasonic positioning shows that the actual first sound wave signal is close to the preset wake-up word, and the electronic device is woken up.
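The two-stage flow described above can be sketched in Python as follows. This is only a sketch under stated assumptions: the names `Decision` and `two_stage_wake` are hypothetical, the similarity scores are taken as given numbers, and the exact boundary handling for the "high" and "low" first-stage cases (at or above the first threshold, below the second threshold) is inferred from the threshold ranges in the text rather than stated there.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    WAKE = "wake"
    STAY_UNAWAKENED = "stay_unawakened"


def two_stage_wake(first_stage_similarity: float,
                   second_stage_similarity: Callable[[], float],
                   first_threshold: float,
                   second_threshold: float,
                   third_threshold: float) -> Decision:
    """Decide whether to wake, running the ultrasonic stage only when needed.

    second_stage_similarity is a callable so that the (hypothetical)
    ultrasonic positioning and re-scoring is triggered only in the
    uncertain range between the second and first thresholds.
    """
    if second_threshold > first_threshold:
        raise ValueError("second threshold must not exceed first threshold")

    # Stage 1, assumed boundary: similarity at or above the first threshold.
    if first_stage_similarity >= first_threshold:
        return Decision.WAKE
    # Stage 1, assumed boundary: similarity below the second threshold.
    if first_stage_similarity < second_threshold:
        return Decision.STAY_UNAWAKENED

    # Stage 2: second_threshold <= similarity < first_threshold.
    if second_stage_similarity() > third_threshold:
        return Decision.WAKE
    return Decision.STAY_UNAWAKENED


if __name__ == "__main__":
    # Hypothetical thresholds and scores, for illustration only.
    print(two_stage_wake(0.6, lambda: 0.75, 0.8, 0.5, 0.7))
```

Passing the second-stage score as a callable reflects the point made in the text that a confident first-stage result avoids a second positioning pass.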
  • the above-mentioned Q ultrasonic receivers may specifically be part or all of the above-mentioned M microphones, where Q is less than or equal to M; the ultrasonic receiving ports are the sound pickup inlets; the third surface is the same as the first surface.
  • the electronic device can use the existing microphone to participate in the ultrasonic positioning, without the need to add an additional ultrasonic receiver, thereby reducing the cost of ultrasonic positioning in a voice interaction scenario.
  • the above-mentioned Q ultrasonic receivers may be different from some or all of the above-mentioned M microphones.
  • the electronic device further includes: N speakers, where the N sound wave emission ports of the N speakers are located on the fourth surface; N is a positive integer greater than or equal to 1; The fourth surface is different from the first surface described above.
  • the above-mentioned P ultrasonic transmitters are part or all of the above-mentioned N speakers, where P is less than or equal to N; the ultrasonic emission ports are the sound wave emission ports; the fourth surface is the same as the second surface.
  • the electronic device can use the existing speakers to participate in the ultrasonic positioning, and no additional ultrasonic transmitters need to be added, thereby reducing the cost of ultrasonic positioning in a voice interaction scenario.
  • the above-mentioned P ultrasonic transmitters may be different from part or all of the N speakers.
  • the second surface is parallel to the first surface.
  • an electronic device which is in an unawakened state.
  • the electronic device includes: a processor; a memory; M microphones (M is a positive integer greater than 1), each microphone corresponding to a sound pickup inlet, where the M sound pickup inlets of the M microphones are located on a first surface of the electronic device, the first surface lies in a plane, and the distance between any two of the M microphones is fixed; P ultrasonic transmitters (P is a positive integer greater than or equal to 1), each ultrasonic transmitter corresponding to an ultrasonic emission port, where the P ultrasonic emission ports of the P ultrasonic transmitters are located on a second surface that is different from the first surface; and Q ultrasonic receivers (Q is a positive integer greater than 1), each ultrasonic receiver corresponding to an ultrasonic receiving port, where the Q ultrasonic receiving ports of the Q ultrasonic receivers are located on a third surface of the electronic device, and the third surface lies in a plane;
  • the first sound wave signal is detected by the M microphones; in response to the first sound wave signal, a first sound pickup direction is obtained according to the arrival time difference of the first sound wave signal at at least two of the M microphones and the distance between some or all of the at least two microphones; the first sound pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane where the first surface is located, relative to a fixed point on the plane where the first surface is located.
  • the receiver receives the second sound wave signal; in response to the second sound wave signal, a second sound pickup direction is obtained according to the arrival time difference of the second sound wave signal at at least two of the Q ultrasonic receivers and the distance between some or all of the at least two ultrasonic receivers; the second sound pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane where the first surface is located, relative to the fixed point
  • the fixed point is different from the second projection point; the third sound pickup direction is
  • the wake-up method performed by the electronic device provided in the second aspect can also be divided into two stages.
  • the electronic device can first locate the first sound pickup direction according to the sound wake-up process, and then identify the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word.
  • the wake-up method can enter the second stage.
  • the electronic device can use the ultrasonic signal to locate the second sound pickup direction, and then correct the first sound pickup direction through the second sound pickup direction, so as to obtain a third sound pickup direction that is closer to the actual location of the user.
  • the electronic device can identify the similarity between the sound wave signal component of the first sound wave signal in the third sound pickup direction and the preset wake-up word.
  • the electronic device wakes up. Furthermore, the accuracy rate of the electronic device waking up is higher, and the probability of the electronic device waking up by mistake is lower.
  • the electronic device further performs: waking up the electronic device. Similar to the first aspect, if in the above-mentioned first stage the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word is high, the first sound wave signal detected by the electronic device is close to the preset wake-up word, so the electronic device is woken up and there is no need to enter the second stage to perform positioning again.
  • the electronic device further performs: remaining in the unawakened state. Similar to the first aspect, if in the above-mentioned first stage the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word is low, the first sound wave signal detected by the electronic device differs considerably from the preset wake-up word, so the electronic device can remain in the unawakened state and does not need to enter the second stage to perform positioning again.
  • the electronic device further performs: remaining in the unawakened state.
  • the electronic device determines the third sound pickup direction according to the first sound pickup direction and the second sound pickup direction, including: if the absolute value of the direction deviation between the first sound pickup direction and the second sound pickup direction is less than a preset fourth threshold, or greater than a preset fifth threshold, the third sound pickup direction is the same as the first sound pickup direction.
  • the electronic device determines the third sound pickup direction according to the first sound pickup direction and the second sound pickup direction, including: if the absolute value of the direction deviation between the first sound pickup direction and the second sound pickup direction is greater than the preset fourth threshold and less than the preset fifth threshold, the third sound pickup direction is the first sound pickup direction with the product of the absolute value of the direction deviation and a preset scale factor superimposed on it.
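The correction rule for the third sound pickup direction can be illustrated with a short Python sketch. Several points here are assumptions rather than statements from the application: directions are represented as angles in degrees, "superimposed" is read literally as addition, angle wrap-around is ignored, and deviations exactly equal to a threshold (a case the text does not cover) keep the first direction. The function name `third_pickup_direction` is hypothetical.

```python
def third_pickup_direction(first_deg: float, second_deg: float,
                           fourth_threshold_deg: float,
                           fifth_threshold_deg: float,
                           scale_factor: float) -> float:
    """Correct the first pickup direction using the second one.

    Small deviations (below the fourth threshold) and large deviations
    (above the fifth threshold) leave the first direction unchanged;
    only deviations strictly between the two thresholds are corrected.
    """
    if fourth_threshold_deg > fifth_threshold_deg:
        raise ValueError("fourth threshold must not exceed fifth threshold")

    deviation = abs(first_deg - second_deg)
    if fourth_threshold_deg < deviation < fifth_threshold_deg:
        # Literal reading of the text: first direction plus
        # |deviation| * scale factor. A signed scale factor would be
        # needed to shift the direction the other way.
        return first_deg + scale_factor * deviation
    return first_deg


if __name__ == "__main__":
    # Hypothetical values: thresholds of 5 and 20 degrees, scale factor 0.5.
    print(third_pickup_direction(30.0, 40.0, 5.0, 20.0, 0.5))
```

One plausible reading of the two thresholds is that a very small deviation needs no correction, while a very large one suggests the ultrasonic echo came from something other than the speaker.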
  • the above-mentioned Q ultrasonic receivers are part or all of the above-mentioned M microphones, where Q is less than or equal to M; the above-mentioned ultrasonic receiving ports are the sound pickup inlets; the third surface is the same as the first surface.
  • the above-mentioned Q ultrasonic receivers are different from some or all of the above-mentioned M microphones.
  • the electronic device further includes: N speakers, where the N sound wave emission ports of the N speakers are located on a fourth surface; N is a positive integer greater than or equal to 1; the fourth surface is different from the first surface.
  • the above-mentioned P ultrasonic transmitters are part or all of the above-mentioned N speakers, where P is less than or equal to N; the above-mentioned ultrasonic emission ports are the sound wave emission ports; the fourth surface is the same as the second surface.
  • the above-mentioned P ultrasonic transmitters are different from part or all of the above-mentioned N speakers.
  • the second surface is parallel to the first surface.
  • a wake-up method includes: detecting a first sound wave signal through M microphones; in response to the first sound wave signal, obtaining a first sound pickup direction according to the arrival time difference of the first sound wave signal at at least two of the M microphones and the distance between some or all of the at least two microphones; wherein the first sound pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane where the first surface is located, relative to a fixed point on that plane; the fixed point is different from the first projection point; obtaining a first sound wave signal component of the first sound wave signal in the first sound pickup direction; when the similarity between the first sound wave signal component and the preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, transmitting a second sound wave signal through P ultrasonic transmitters, and the second acoustic
  • the wake-up method provided in the third aspect can be divided into two stages.
  • the electronic device can first locate the first sound pickup direction according to the sound wake-up process, and then identify the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word.
  • the wake-up method can enter the second stage.
  • the electronic device can use the ultrasonic signal to locate the second sound pickup direction, and then identify the similarity between the sound wave signal component of the first sound wave signal in the second pickup direction and the preset wake-up word.
  • the electronic device wakes up.
  • the electronic device can determine the final sound source position of the user by locating the sound pickup direction in two stages, so as to identify the wake-up word according to the final sound source position, improving the wake-up accuracy of the electronic device and reducing the probability of false wake-up of the electronic device.
  • the method further includes: waking up the electronic device.
  • the method further includes: the electronic device remains in the unawakened state.
  • the method further includes: the electronic device remains in the unawakened state.
  • Any implementation manner of the third aspect corresponds to any implementation manner of the first aspect, respectively.
  • For the technical effect corresponding to any one of the implementation manners of the third aspect, reference may be made to the technical effect corresponding to any one of the above-mentioned implementation manners of the first aspect, which will not be repeated here.
  • a wake-up method includes: detecting a first sound wave signal through M microphones; in response to the first sound wave signal, obtaining a first sound pickup direction according to the arrival time difference of the first sound wave signal at at least two of the M microphones and the distance between some or all of the at least two microphones; wherein the first sound pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane where the first surface is located, relative to a fixed point on that plane; the fixed point is different from the first projection point; obtaining a first sound wave signal component of the first sound wave signal in the first sound pickup direction; when the similarity between the first sound wave signal component and the preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, transmitting a second sound wave signal through P ultrasonic transmitters, where the second sound wave signal is an ultrasonic signal;
  • the direction of the projection point relative to the fixed point; the fixed point is different from the second projection point; a third sound pickup direction is determined according to the first sound pickup direction and the second sound pickup direction, wherein the third sound pickup direction indicates the direction of a third projection point, which is the projection of a third sound source position onto the plane where the first surface is located, relative to the above-mentioned fixed point; a third sound wave signal component of the first sound wave signal in the third sound pickup direction is obtained; when the similarity between the third sound wave signal component and the preset wake-up word is greater than a preset third threshold, the electronic device wakes up.
  • the wake-up method provided in the fourth aspect can also be divided into two stages.
  • the electronic device can first locate the first sound pickup direction according to the sound wake-up process, and then identify the similarity between the sound wave signal component of the first sound wave signal in the first sound pickup direction and the preset wake-up word.
  • the wake-up method can enter the second stage.
  • the electronic device can use the ultrasonic signal to locate the second sound pickup direction, and then correct the first sound pickup direction through the second sound pickup direction, so as to obtain a third sound pickup direction that is closer to the actual location of the user.
  • the electronic device can identify the similarity between the sound wave signal component of the first sound wave signal in the third pickup direction and the preset wake-up word.
  • the electronic device wakes up. Furthermore, the accuracy rate of the electronic device waking up is higher, and the probability of the electronic device waking up by mistake is lower.
  • the method further includes: waking up the electronic device.
  • the method further includes: the electronic device remains in the unawakened state.
  • the method further includes: the electronic device remains in the unawakened state.
  • the third sound pickup direction is determined according to the first sound pickup direction and the second sound pickup direction, including: when the absolute value of the direction deviation between the first sound pickup direction and the second sound pickup direction is less than a preset fourth threshold, or greater than a preset fifth threshold, the third sound pickup direction is the same as the first sound pickup direction.
  • the third sound pickup direction is determined according to the first sound pickup direction and the second sound pickup direction, including: when the absolute value of the direction deviation between the first sound pickup direction and the second sound pickup direction is greater than the preset fourth threshold and less than the preset fifth threshold, the third sound pickup direction is the first sound pickup direction with the product of the absolute value of the direction deviation and a preset scale factor superimposed on it.
  • Any implementation manner of the fourth aspect corresponds to any implementation manner of the second aspect, respectively.
  • For the technical effect corresponding to any one of the implementation manners of the fourth aspect, reference may be made to the technical effect corresponding to any one of the implementation manners of the second aspect, which will not be repeated here.
  • a wake-up method includes: detecting a first sound wave signal through M microphones; in response to the first sound wave signal, obtaining a first sound pickup direction according to the arrival time difference of the first sound wave signal at at least two of the M microphones and the distance between some or all of the at least two microphones; wherein the first sound pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane where the first surface is located, relative to a fixed point on that plane; the fixed point is different from the first projection point; transmitting a second sound wave signal through P ultrasonic transmitters, where the second sound wave signal is an ultrasonic signal; receiving the second sound wave signal through Q ultrasonic receivers; in response to the second sound wave signal, according to the arrival time difference of the second sound wave signal at at least two of the Q ultra
  • the third projection point on the plane, relative to the above-mentioned fixed point; a third sound wave signal component of the first sound wave signal in the third sound pickup direction is obtained; and after the similarity between the third sound wave signal component and the preset wake-up word is greater than the preset third threshold, the electronic device wakes up.
  • after the electronic device detects the first sound wave signal, it can perform two positioning processes.
  • the first sound pickup direction can be obtained by one positioning process, according to the times at which the first sound wave signal reaches the M microphones; the second sound pickup direction can be obtained by another positioning process, which locates obstacles by sending and receiving ultrasonic signals.
  • a third sound pickup direction that is closer to the actual location of the user can be obtained. In this way, the electronic device can identify the similarity between the sound wave signal component of the first sound wave signal in the third pickup direction and the preset wake-up word.
  • the electronic device wakes up. As a result, the wake-up accuracy of the electronic device is higher, and the probability of a false wake-up is lower.
  • the present application provides a computer-readable storage medium comprising computer instructions that, when executed on the above-mentioned electronic device, cause the electronic device to execute any one of the above-mentioned wake-up methods.
  • the present application provides a computer program product that, when run on the above electronic device, causes the electronic device to execute any one of the wake-up methods described above.
  • FIG. 1A is a schematic diagram of a scenario of a wake-up method provided by an embodiment of the present application
  • FIG. 1B is a schematic diagram of a sound source position located by the electronic device;
  • FIG. 1C is a schematic diagram of a sound source position located by the electronic device;
  • FIG. 2 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of the principle of acoustic signal positioning in the wake-up method provided by the embodiment of the present application;
  • FIGS. 4 to 7 are schematic flowcharts of acoustic wave signal processing in a wake-up method provided by an embodiment of the present application;
  • FIG. 8 is a schematic partial flowchart of a wake-up method provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the principle of positioning an obstacle with an ultrasonic signal in a wake-up method provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an ultrasonic signal locating an obstacle position in a wake-up method provided by an embodiment of the present application
  • FIG. 11 is a schematic flowchart of acoustic signal processing in a wake-up method provided by an embodiment of the present application.
  • FIG. 12 is a schematic partial flowchart of another wake-up method provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a sound source position in another wake-up method provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of acoustic signal processing in another wake-up method provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • references in this specification to "one embodiment” or “some embodiments” and the like mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in some other embodiments," etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments" unless specifically emphasized otherwise.
  • the terms "including", "comprising", "having" and their variants mean "including but not limited to" unless specifically emphasized otherwise.
  • the term "connected" includes both direct and indirect connections unless otherwise specified. "First" and "second" are only for descriptive purposes, and should not be interpreted as indicating or implying relative importance or implying the number of indicated technical features.
  • words such as "exemplarily" or "for example" are used to present examples, instances or illustrations. Any embodiment or design described in the embodiments of the present application as "exemplarily" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplarily" or "for example" is intended to present the related concepts in a specific manner.
  • FIG. 1A is a schematic diagram of a scenario of a wake-up method provided by an embodiment of the present application.
  • the electronic device 100 has the function of voice interaction and can receive sound wave signals.
  • the upper surface of the electronic device 100 is provided with the sound pickup inlets of a plurality of microphones or microphone arrays 170C; each microphone or each microphone array corresponds to one sound pickup inlet; the plurality of microphones or microphone arrays 170C receive sound wave signals through the different sound pickup inlets.
  • other parts (e.g., side surfaces) or the upper surface of the electronic device 100 may be provided with speakers (not shown in the figure) for outputting sound wave signals.
  • the user 200 can wake up the electronic device 100 through a sound wave signal, and after the electronic device 100 wakes up, control the electronic device through further voice commands to perform corresponding functions.
  • the electronic device 100 may be a device with a voice interaction function, such as a smart speaker, a smart TV, a smart air conditioner, a smart door lock, and a smart lamp. This application does not limit this.
  • the sound wave signal includes a voice signal (a signal with a frequency of 20Hz-20000Hz).
  • the sound wave signal further includes an ultrasonic signal (a signal with a frequency greater than 20000 Hz).
  • the sound wave signal may also include an infrasonic signal (a signal with a frequency lower than 20 Hz).
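  • The three frequency bands named above can be summarised in a small sketch. The function name and the returned labels are hypothetical; the band edges (20 Hz and 20000 Hz) are the ones stated in the text.

```python
def classify_frequency(freq_hz):
    """Hypothetical helper: name the band a frequency (in Hz) falls into."""
    if freq_hz < 20:
        return "infrasonic"   # lower than 20 Hz
    if freq_hz <= 20000:
        return "voice"        # 20 Hz to 20000 Hz
    return "ultrasonic"       # greater than 20000 Hz
```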
  • the sound wave signal sent by the user refers to the voice signal sent by the user.
  • the arrangement in FIG. 1A, in which the sound pickup inlets of the plurality of microphones or microphone arrays 170C are provided on the upper surface of the electronic device 100, is only a schematic example.
  • the aforementioned plurality of pickup inlets may also be provided on another surface.
  • the upper surface or the other surface is horizontal or nearly horizontal.
  • the other surface being nearly horizontal means that, although the surface is not perfectly flat, the effect of the unevenness is small and the surface can be approximated as a horizontal plane.
  • the description will be given by taking as an example that a plurality of microphones or microphone arrays 170C are disposed on the upper surface of the electronic device 100 .
  • the electronic device 100 does not wake up even if the sound wave signal sent by the user 200 contains a preset wake-up word; or, sometimes the electronic device 100 wakes up even though the sound wave signal sent by the user 200 does not contain the preset wake-up word. This disturbs the user and degrades the user experience.
  • through long-term in-depth research, experiments and analysis, the inventor has concluded that the above-mentioned errors are mainly caused by two reasons.
  • the voice interaction process on the side of the electronic device and the process of locating the position of the sound source by the electronic device according to the detected sound wave signal are introduced.
  • FIG. 1B is a schematic diagram of the sound source position located by the electronic device 100.
  • the upper surface of the electronic device 100 is the XY axis plane, and the center point of the upper surface is the O point.
  • the electronic device 100 can only identify the sound source position A1 (X1, Y1), but cannot identify the height of the sound source position. Therefore, the concept of the sound source position hereinafter is essentially the projection of the sound source position on the upper surface of the electronic device 100 .
  • the above-mentioned point O being the center point of the upper surface is only a schematic example. In fact, any fixed point on the upper surface can be the O point.
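  • Because only the projection on the upper surface is used, a pickup direction can be pictured as the bearing of the projected point as seen from the point O. The sketch below is illustrative only: the function name, the use of degrees and the counter-clockwise-from-X-axis convention are assumptions, not something the text specifies.

```python
import math


def pickup_direction_deg(source_xyz, fixed_point_xy=(0.0, 0.0)):
    """Hypothetical helper: direction of the projected source from point O.

    The height (Z) of the source is discarded, which mirrors the statement
    that the device cannot identify the height of the sound source.
    """
    x, y, _height = source_xyz
    dx = x - fixed_point_xy[0]
    dy = y - fixed_point_xy[1]
    if dx == 0 and dy == 0:
        # The fixed point must differ from the projection point.
        raise ValueError("projection coincides with the fixed point")
    # Assumed convention: angle measured counter-clockwise from the X axis.
    return math.degrees(math.atan2(dy, dx)) % 360.0
```

  • Two sources that differ only in height produce the same direction in this sketch, which is why the description treats the sound source position as a point (X1, Y1) on the XY plane.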
  • the voice interaction process on the electronic device side can be divided into five steps: wake-up, response, input, understanding and feedback.
  • the voice interaction function may be implemented by a voice assistant installed in the electronic device 100 .
  • with reference to FIG. 1A, the above five steps are further described.
  • the electronic device 100 is in a state before waking up (e.g., a standby state).
  • the user 200 outputs a sound wave signal containing a preset wake-up word.
  • the electronic device 100 identifies whether the sound wave signal contains a preset wake-up word.
  • the electronic device 100 invokes the voice interaction assistant, or activates the voice interaction function of the electronic device 100, and the electronic device 100 wakes up and enters the working state.
  • the electronic device 100 may also respond to the above-mentioned sound wave signal sent by the user. In this way, the electronic device 100 switches from the first state (e.g., a standby state) to the second state (e.g., a working state). After that, the user 200 can issue further voice commands. After receiving a further voice command, the electronic device 100 can recognize the corresponding semantic content through a voice recognition algorithm, that is, understand the further voice command, so as to execute the corresponding function.
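  • The switch from the first state to the second state can be pictured as a very small state machine. The names `DeviceState` and `next_state` are hypothetical, and the sketch covers only the wake-up step, not the response, input, understanding and feedback steps.

```python
from enum import Enum


class DeviceState(Enum):
    STANDBY = "standby"   # first state, before waking up
    WORKING = "working"   # second state, after waking up


def next_state(state, contains_wake_word):
    """Hypothetical helper: apply the wake-up transition."""
    if state is DeviceState.STANDBY and contains_wake_word:
        return DeviceState.WORKING
    # Otherwise the device keeps its current state.
    return state
```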
  • the sound pickup device of the electronic device 100 In order to respond to the sound wave signal in time, the sound pickup device of the electronic device 100 usually needs to be always on.
  • the sound pickup device of the electronic device 100 may be a microphone array or a plurality of microphones.
  • the electronic device 100 may detect the sound wave signal in real time through a microphone array or a plurality of microphones.
  • after detecting a sound wave signal, the electronic device 100 identifies the sound source position corresponding to the sound wave signal, obtains the direction from the sound source position (which may be called the sound pickup direction), then obtains the component of the sound wave signal in the sound pickup direction, and performs processing based on that component. In this way, the amount of data to be processed can be reduced and the response speed can be improved.
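  • The text does not name the algorithm used to obtain the component in the sound pickup direction. One standard technique for this kind of task is delay-and-sum beamforming, sketched below as a stand-in rather than as the method of this application. The function name is hypothetical, and the per-microphone delays (in whole samples) are assumed to have been derived already from the pickup direction and the microphone spacing.

```python
def delay_and_sum(channels, delays):
    """Hypothetical helper: align the channels for one direction and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   arrival delay of each channel in samples for the chosen
              direction (0 for the microphone reached first).
    """
    length = len(channels[0])
    output = [0.0] * length
    for samples, delay in zip(channels, delays):
        for i in range(length):
            j = i + delay  # advance a late channel so wavefronts line up
            if 0 <= j < length:
                output[i] += samples[j]
    return [value / len(channels) for value in output]
```

  • When the delays match the true direction, the aligned copies add up coherently; with mismatched delays the same pulse is smeared and attenuated, which is the sense in which the output emphasises the chosen direction.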
  • the electronic device 100 can locate the sound source position A1 corresponding to the sound wave signal. The electronic device 100 may then take the direction from the sound source position A1 as the sound pickup direction, and obtain the sound wave signal component in that sound pickup direction. Subsequently, the electronic device 100 may input the acquired sound wave signal component into the wake word model. In the wake word model, a preset algorithm is used to extract the sound wave feature of the sound wave signal component, and the similarity (also called confidence) between that sound wave feature and the sound wave feature corresponding to the preset wake word is computed.
  • if the similarity is greater than the preset threshold, the electronic device 100 may confirm that the detected sound wave signal contains the preset wake-up word; at this time, the electronic device 100 wakes up and enters the working state. If the similarity is less than the preset threshold, the electronic device 100 may confirm that the detected sound wave signal does not contain the preset wake-up word; at this time, the electronic device 100 may remain in the state before wake-up (e.g., a standby state).
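  • The threshold decision can be sketched as follows. The text only says that a preset algorithm extracts features and that a similarity is compared with a threshold; cosine similarity over feature vectors is used here purely as a stand-in, and the function names are hypothetical. Treating a similarity exactly equal to the threshold as "no wake-up" is also an assumption.

```python
import math


def cosine_similarity(a, b):
    """Stand-in similarity measure between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_wake(component_features, wake_word_features, threshold):
    """Hypothetical helper: wake up only if the similarity exceeds the threshold."""
    return cosine_similarity(component_features, wake_word_features) > threshold
```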
  • FIG. 1C is a schematic diagram of the sound source position located by the electronic device 100.
  • the electronic device 100 locates the sound source position as the sound source position A1 according to the detected sound wave signal, but actually the user 200 sends out the sound wave signal at the sound source position A2.
  • such a deviation affects the series of subsequent processing steps of the electronic device 100, resulting in inaccurate processing results and large errors.
  • TOA: time of arrival.
  • TDOA: time difference of arrival.
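  • To make the TDOA idea concrete, the sketch below estimates an arrival angle from the time difference between two microphones. It is a textbook far-field approximation, not necessarily the algorithm used in this application: the function name, the speed of sound of 343 m/s, and the convention that the angle is measured from the normal to the line joining the two microphones are all assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees Celsius (assumed)


def tdoa_angle_deg(delta_t, mic_distance, c=SPEED_OF_SOUND):
    """Hypothetical helper: far-field arrival angle from a time difference.

    delta_t:      arrival time difference between the two microphones (s).
    mic_distance: distance between the two microphones (m).
    """
    ratio = c * delta_t / mic_distance
    # Clamp to [-1, 1] so that measurement noise cannot break asin().
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

  • A zero time difference corresponds to a source directly in front of the microphone pair, while a time difference equal to the spacing divided by the speed of sound corresponds to a source on the line through both microphones.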
  • the environment where the electronic device 100 and the user 200 are located generally has a noise source, and the noise emitted by the noise source will also lead to a positioning deviation of the position of the sound source.
  • although the electronic device 100 can filter out part of the noise signal through a noise reduction algorithm, the residual noise signal still affects the sound source localization result of the electronic device 100, so that the sound source position A1 located by the electronic device 100 deviates from the sound source position A2 where the user 200 is actually located.
  • if the electronic device 100 takes the direction from the sound source position A1 as the sound pickup direction, the sound wave signal component subsequently extracted in that sound pickup direction cannot accurately reflect the sound wave signal input by the user. As a result, the accuracy with which the electronic device 100 recognizes the wake word is lowered, and the user experience is poor.
  • the present application provides a wake-up method and an electronic device.
  • the wake-up method provided by the embodiment of the present application is applied to an electronic device.
  • the electronic device may be a smart home device such as a smart speaker, smart TV, smart air conditioner, smart refrigerator, smart light, smart door, smart lock or smart curtain; a smart phone; a wearable electronic device such as smart glasses, a smart watch or a smart bracelet; or a tablet computer, notebook computer, personal digital assistant (PDA), in-vehicle device, virtual reality device, augmented reality device or other electronic device with a voice interaction function. This application does not limit this.
  • FIG. 2 shows a schematic diagram of a hardware structure of an electronic device 100 provided by an embodiment of the present application.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone array 170C, a headphone interface 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, an ultrasonic transmitter 196, an ultrasonic receiver 197, a USB interface 198, and the like.
  • the structures illustrated in the embodiments of the present application do not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than those shown in FIG. 2, or combine some components, or separate some components, or arrange the components differently.
  • the components shown in Figure 2 may be implemented in hardware, software or a combination of software and hardware.
  • Processor 110 may include one or more processing units.
  • for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU).
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby increasing the efficiency of the system.
  • the processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100 . While the charging management module 140 charges the battery 142 , it can also supply power to the electronic device through the power management module 141 .
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory interface 120, the wireless communication module 160, and the like.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance).
  • the power management module 141 may also be provided in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160 and the like.
  • the mobile communication module 150 can provide wireless communication solutions, including 2G/3G/4G/5G, applied on the electronic device.
  • the mobile communication module 150 may include one or more filters, switches, power amplifiers, low noise amplifiers (LNAs), and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modem processor, and then convert it into an electromagnetic wave for radiation through the antenna 1.
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the same device as at least part of the modules of the processor 110 .
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and other technologies.
  • the wireless communication module 160 may be one or more devices integrating one or more communication processing modules.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function, for example, to save files such as music and videos on the external memory card.
  • Internal memory 121 may be used to store one or more computer programs including instructions.
  • the processor 110 may execute the above-mentioned instructions stored in the internal memory 121, thereby causing the electronic device to execute the wake-up methods provided in some embodiments of the present application, as well as various functional applications and data processing.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area may store the operating system, and may also store one or more application programs (such as a gallery or contacts application).
  • the storage data area can store data (such as photos, contacts, etc.) created during the use of the electronic device.
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, universal flash storage (UFS), and the like.
  • the processor 110 executes the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor, so that the electronic device executes the wake-up method provided in the embodiments of the present application , as well as various functional applications and data processing.
  • the electronic device can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone array 170C, the earphone interface 170D, the application processor, and the like.
  • the audio module 170 is used for converting digital audio information into analog sound wave signal output, and also for converting analog audio input into digital sound wave signal.
  • the audio module 170 may also be used to encode and decode sound wave signals.
  • the audio module 170 may be provided in the processor 110 , or some functional modules of the audio module 170 may be provided in the processor 110 .
  • the speaker 170A, also referred to as a "loudspeaker", is used to convert audio electrical signals into sound wave signals.
  • the user can listen to music or to a hands-free call through the speaker 170A of the electronic device.
  • Microphone array 170C includes a plurality of microphones.
  • a microphone, also referred to as a "mike" or a "mic", is used to convert sound wave signals into electrical signals.
  • the electronic device can use the microphone array 170C to collect sound wave signals, and then identify the sound source according to the sound wave signal collected by each microphone in the microphone array 170C, so as to realize functions such as sound source localization and directional recording.
  • the electronic device may be provided with one or more microphone arrays 170C.
  • the microphone array 170C may be replaced with multiple microphones; that is, the electronic device 100 does not include the microphone array 170C, but instead includes multiple microphones.
  • the pickup inlets of the plurality of microphones are located on the same surface, such as the upper surface, of the electronic device 100 .
  • the sensor module 180 may include a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like. This application does not limit this.
  • the ultrasonic transmitter 196 and the ultrasonic receiver 197 are respectively used for transmitting ultrasonic signals and receiving ultrasonic signals. There may be one or more ultrasonic transmitters 196 and ultrasonic receivers 197; this application does not limit this. Those skilled in the art can set this according to actual experience or actual application scenarios.
  • An ultrasonic signal is a sound wave signal with a frequency higher than 20,000 Hz (Hertz). Ultrasonic signals have the characteristics of good directionality, strong reflection ability and strong penetration ability.
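  • Because ultrasonic signals reflect strongly, the distance to a reflecting obstacle can be estimated from the echo delay. The sketch below shows the generic time-of-flight relation only; it is not taken from this application, the function name is hypothetical, and the speed of sound of 343 m/s is an assumed value for air at about 20 degrees Celsius.

```python
def echo_distance_m(round_trip_s, c=343.0):
    """Hypothetical helper: obstacle distance from an ultrasonic echo delay.

    round_trip_s: time between transmitting the pulse and receiving its
                  echo, in seconds. The pulse travels to the obstacle and
                  back, so the one-way distance is half the path length.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    return c * round_trip_s / 2.0
```

  • For instance, an echo delay of 10 ms corresponds to an obstacle roughly 1.7 m away under these assumptions.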
  • the ultrasonic transmitter 196 may specifically be a plurality of speakers 196A (not shown in FIG. 2 ), that is, the speakers 196A have the function of transmitting ultrasonic signals.
  • the ultrasonic receiver 197 may be a microphone array 197A (not shown in FIG. 2) or a plurality of microphones 197B (not shown in FIG. 2); that is, the microphone array 197A and the plurality of microphones 197B have the function of receiving ultrasonic signals.
  • electronic device 100 does not include ultrasonic transmitter 196; the functionality of ultrasonic transmitter 196 is integrated in speaker 170A. That is, the speaker 170A can emit not only sound wave signals that can be perceived by human ears, but also ultrasonic signals. In this way, the electronic device no longer needs to be additionally provided with the ultrasonic transmitter 196 .
  • electronic device 100 similarly does not include ultrasonic receiver 197; the functionality of ultrasonic receiver 197 is integrated in microphone array 170C. That is to say, the microphone array 170C can receive not only the sound wave signal that the human ear can perceive, but also the ultrasonic wave signal. In this way, the electronic device no longer needs to be additionally provided with the ultrasonic receiver 197 .
  • electronic device 100 does not include ultrasonic transmitter 196 nor ultrasonic receiver 197 .
  • the function of the ultrasonic transmitter 196 is integrated in the speaker 170A, and the function of the ultrasonic receiver 197 is integrated in the microphone array 170C.
  • the USB interface 198 can be used to connect other devices.
  • the USB interface 198 may be one or more USB interfaces.
  • when the electronic device 100 includes an ultrasonic transmitter 196 and multiple microphones or microphone arrays 170C, and the ultrasonic transmitter 196 and the multiple microphones or microphone arrays 170C are not integrated, the plane where the top of the ultrasonic transmitter 196 is located is parallel or approximately parallel to the plane where the sound pickup inlets of the multiple microphones or microphone arrays 170C are located. Approximately parallel means that, although the two planes are not strictly parallel, the angle between them is small enough that they can be regarded as parallel.
  • the electronic device 100 when the electronic device 100 is a smart speaker, the electronic device 100 may further include one or more devices such as a GPU, a display screen, and buttons.
  • when the electronic device 100 is a smart TV, the electronic device 100 may further include one or more devices such as a GPU and a display screen, and may also be equipped with one or more accessories such as a remote control and an infrared sensor.
  • the electronic device 100 when the electronic device 100 is a smart phone, the electronic device 100 may further include one or more devices such as a GPU, a display screen, a headphone jack, buttons, a battery, a motor, an indicator, and a SIM card interface.
  • when the electronic device 100 detects whether a received sound wave signal contains a wake-up word, it can additionally use ultrasonic signals to detect the position of the user, so as to improve the accuracy of sound source localization.
  • FIG. 3 is a schematic diagram of the principle of positioning a sound wave signal in a wake-up method provided by an embodiment of the present application.
  • the user 200 emits a sound wave signal at the sound source position B1.
  • the electronic device 100 uses the TOA algorithm or the TDOA algorithm to locate according to the received sound wave signal (the received sound wave signal includes but is not limited to the sound wave signal sent by the user 200 ), and obtains the sound source position B2.
  • the electronic device 100 can also use the ultrasonic positioning method to locate the user to obtain the obstacle position B3.
  • the electronic device 100 combines the two positioning results (i.e., the sound source position B2 and the obstacle position B3) to finally determine the sound source position B4 where the user is located (the sound source position B4 may be the same as or different from the sound source position B2 or the obstacle position B3). In this way, the electronic device 100 can correct the sound source position B2 according to the obstacle position B3, so that the sound source position B4 determined by the electronic device 100 is closer to the sound source position B1 where the user is actually located.
  • the electronic device can subsequently use the direction from the sound source position B4 as the sound pickup direction, and identify whether the detected sound wave signal contains a wake-up word. Since the deviation between the sound source position B4 determined by the electronic device and the actual sound source position B1 of the user is small, the electronic device can judge more accurately, from the sound wave signal component in the direction of the sound source position B4, whether the wake-up word is present. This improves the wake-up accuracy of the electronic device, reduces the probability of false wake-up, and improves the user experience.
  • the electronic device 100 may include N (N is a positive integer greater than 1) speakers and L (L is a positive integer greater than or equal to 1) microphone arrays.
  • each microphone array includes M (M is a positive integer greater than 1) microphones.
  • The N speakers and the L microphone arrays are arranged at different positions of the electronic device 100.
  • the distance between any two of the N speakers and the distance between any two of the M microphones are fixed (either equal or unequal, but both are fixed).
  • the sound pickup inlets of the M microphones or the sound pickup inlets of the L microphone arrays are located on the same surface, such as the upper surface, of the electronic device 100 .
  • the electronic device 100 may include N (N is a positive integer greater than 1) speakers and M (M is a positive integer greater than 1) microphones.
  • N speakers and M microphones are arranged at different positions of the electronic device 100 .
  • the distance between any two of the N speakers and the distance between any two of the M microphones are fixed.
  • the pickup inlets of the M microphones are located on the same surface, such as the upper surface, of the electronic device 100 .
  • For convenience, the following description takes M microphones located in one microphone array as an example. Those skilled in the art should understand that M individual microphones that are not located in one microphone array also fall within the protection scope of the present application.
  • each of the N loudspeakers can be used as an ultrasonic transmitter to transmit ultrasonic signals (a sound wave signal higher than 20000 Hz).
  • each of the N speakers can also play a sound wave signal (a sound wave signal of 20 Hz to 20000 Hz) that can be perceived by the human ear.
  • Each of the M microphones can be used as an ultrasonic receiver to receive ultrasonic signals.
  • each of the M microphones can also collect sound wave signals that can be perceived by human ears. In this way, the electronic device 100 can use the speaker and the microphone to realize ultrasonic positioning, without the need to add additional ultrasonic transmitters and ultrasonic receivers, thereby reducing the cost of ultrasonic positioning in a voice interaction scenario.
  • each of the N speakers can only play a sound wave signal (a sound wave signal of 20 Hz to 20000 Hz) that the human ear can perceive.
  • Each of the M microphones can only collect sound wave signals that the human ear can perceive.
  • the electronic device 100 is additionally provided with P ultrasonic transmitters, and Q ultrasonic receivers. Among them, P is a positive integer greater than or equal to 1, and Q is a positive integer greater than 1. The distance between any two of the Q ultrasonic receivers is fixed. When P is a positive integer greater than 1, the distance between any two of the P ultrasonic transmitters is fixed.
  • The electronic device 100 may keep the M microphones in the microphone array in a normally-on state so as to collect sound wave signals in real time. If an ultrasonic signal exists in the environment of the electronic device 100, each microphone also collects it as a high-frequency sound wave signal. When only the audible sound wave signal is needed, each microphone can feed the collected sound wave signal into a corresponding low-pass filter that removes the ultrasonic components above 20000 Hz.
  • Based on the filtered sound wave signal and the sound pickup direction from the sound source position, the electronic device can obtain the sound wave signal component in the sound pickup direction and determine whether that component contains a preset wake-up word.
  • Likewise, each microphone can feed the collected sound wave signal into a corresponding high-pass filter that removes the components below 20000 Hz.
  • the electronic device can perform ultrasonic positioning based on the filtered sound wave signal, so as to locate the obstacle position B3.
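  • The band separation described above can be illustrated with a minimal sketch. It assumes a sampling rate high enough to capture ultrasound (96 kHz here) and uses an ideal frequency-domain split as a stand-in for the low-pass and high-pass filters. The function name `split_bands` and all parameter values are illustrative assumptions, not part of the application.

```python
import cmath
import math


def split_bands(samples, sample_rate, cutoff_hz=20000.0):
    """Split a real signal into (audible, ultrasonic) parts at cutoff_hz.

    Uses a naive O(n^2) DFT, so it is only meant for short illustrative inputs.
    """
    n = len(samples)
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    low = [0j] * n
    high = [0j] * n
    for k, value in enumerate(spectrum):
        # Bins above n/2 mirror the negative frequencies of a real signal.
        freq = min(k, n - k) * sample_rate / n
        if freq <= cutoff_hz:
            low[k] = value   # kept by the "low-pass" branch
        else:
            high[k] = value  # kept by the "high-pass" branch

    def inverse(bins):
        return [
            sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)
        ]

    return inverse(low), inverse(high)


# Example: a 1 kHz voice-band tone mixed with a 30 kHz ultrasonic tone.
RATE = 96000
voice = [math.sin(2 * math.pi * 1000 * t / RATE) for t in range(96)]
ultra = [0.5 * math.sin(2 * math.pi * 30000 * t / RATE) for t in range(96)]
audible, ultrasonic = split_bands([v + u for v, u in zip(voice, ultra)], RATE)
```

  • In this example `audible` recovers the 1 kHz tone for wake-up word detection and `ultrasonic` recovers the 30 kHz tone for ultrasonic positioning. A real device would more likely use streaming filters than a block transform.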
  • FIGS. 4 to 7 are schematic flowcharts of acoustic wave signal processing in the wake-up method provided by the embodiments of the present application.
  • The sound wave signal A is collected by the M microphones in the microphone array (the sound wave signal A is assumed not to include an ultrasonic signal; if it does, the ultrasonic signal can be removed by a low-pass filter). Because the M microphones are at different positions, the waveforms of the sound wave signal A collected by different microphones may differ (slightly or not at all), and the time points at which different microphones collect the sound wave signal A may also differ. Therefore, as shown in FIG. 4, the electronic device 100 obtains M corresponding channels of sound wave signal A through the M microphones.
  • The electronic device 100 can also locate the sound source position from the M channels of sound wave signal A. Exemplarily, since each of the M channels of sound wave signal A reaches its microphone at a different time point, the electronic device can calculate the corresponding sound source position B2 from these time points by using the TOA algorithm or the TDOA algorithm.
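  • The arrival-time idea can be sketched for the simplest case of two microphones. The sketch below estimates the delay between two channels by cross-correlation and converts it to a far-field arrival angle. It is a simplification under stated assumptions (two microphones, a distant source, a speed of sound of 343 m/s), the names `estimate_delay` and `arrival_angle` are hypothetical, and it yields only a direction rather than the full position B2.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air, assumed value


def estimate_delay(ref, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += x * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle(delay_s, mic_spacing_m):
    """Far-field angle (radians, measured from broadside) for a two-mic pair."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.asin(ratio)


# Example: the same pulse arrives 3 samples later at the second microphone.
ref = [0.0] * 32
other = [0.0] * 32
ref[10] = 1.0
other[13] = 1.0
lag = estimate_delay(ref, other, max_lag=8)
angle = arrival_angle(lag / 48000, mic_spacing_m=0.05)
```

  • With M microphones, several such pairwise delays would be combined to solve for a position instead of a single angle.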
  • The electronic device 100 may determine the direction from the sound source position B2 as the first sound pickup direction. The M channels of sound wave signal A are first aligned in time (i.e., the start time points of the channels are aligned). After alignment, the component of each channel in the first pickup direction is obtained, giving the M channels of sound wave signal A components 501. These components 501 are then fused into a single sound wave signal, that is, the sound wave signal A'.
  • Alternatively, after determining the direction from the sound source position B2 as the first sound pickup direction, the electronic device 100 may first acquire the component of each of the M channels of sound wave signal A in the first pickup direction, that is, the M channels of sound wave signal A components 501, and then align them in time (i.e., align the start time points of the channels). After alignment, the M channels of sound wave signal A components 501 are fused into a single sound wave signal, that is, the sound wave signal A'.
  • The fusion may directly superimpose the M channels of sound wave signal A components 501, take their weighted average, or use another method; this application does not limit this.
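  • The align-then-fuse step can be sketched as a weighted average of time-shifted channels. This is an illustrative assumption about one possible fusion, not the application's prescribed method; `align_and_fuse`, the per-channel sample delays, and the weights are all hypothetical.

```python
def align_and_fuse(channels, delays, weights=None):
    """Align channels by dropping each one's leading delay, then average them.

    channels: list of equal-rate sample lists, one per microphone
    delays:   per-channel start offset in samples (non-negative integers)
    weights:  optional per-channel weights; equal weights if omitted
    """
    if weights is None:
        weights = [1.0] * len(channels)
    # Only the overlapping part of the aligned channels can be fused.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    total = sum(weights)
    return [
        sum(w * ch[d + t] for ch, d, w in zip(channels, delays, weights)) / total
        for t in range(length)
    ]


# Example: the second channel carries the same waveform one sample later.
fused = align_and_fuse(
    [[0, 1, 2, 3, 0, 0], [0, 0, 1, 2, 3, 0]],
    delays=[0, 1],
)
```

  • Passing unequal `weights` gives the weighted-average variant; multiplying the result by the number of channels gives plain superposition.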
  • After obtaining the sound wave signal A', the electronic device 100 can input it into a preset wake-up word model.
  • The wake-up word model stores preset sound wave features 701 of the wake-up word.
  • A preset algorithm extracts the sound wave feature 702 of the sound wave signal A', and the extracted sound wave feature 702 is compared with the sound wave feature 701 of the preset wake-up word to obtain the similarity (also called the confidence) between the two.
  • The similarity finally obtained is recorded as similarity 1 (also referred to as the first similarity).
  • the sound wave feature 702 and the sound wave feature 701 may be represented by related codes, functions, matrices or spectrograms, which are not limited in this application.
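  • The application does not specify how the two features are compared. As one hedged illustration, if the sound wave features 701 and 702 are represented as numeric vectors, a cosine similarity could serve as the similarity score. The function name and the vector representation are assumptions.

```python
import math


def cosine_similarity(feature_a, feature_b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(feature_a, feature_b))
    norm_a = math.sqrt(sum(x * x for x in feature_a))
    norm_b = math.sqrt(sum(y * y for y in feature_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero feature matches nothing
    return dot / (norm_a * norm_b)
```

  • A score near 1 would indicate that the extracted feature closely matches the stored wake-up word feature; the score could then be scaled to a percentage or point value before being compared with the thresholds.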
  • FIG. 8 is a schematic partial flowchart of a wake-up method provided by an embodiment of the present application.
  • When the similarity 1 is greater than the first threshold (for example, 90% or 90 points), the electronic device 100 may determine that the sound wave signal A contains the preset wake-up word, and the electronic device 100 wakes up.
  • Exemplarily, the electronic device 100 invokes a voice assistant, and the voice assistant interacts with the user by voice.
  • When the similarity 1 is less than the second threshold (for example, 60% or 60 points), the sound wave signal A detected by the electronic device 100 differs considerably from the preset wake-up word, and the electronic device 100 can determine that the sound wave signal A does not contain the preset wake-up word. The electronic device 100 then remains in an unawakened state.
  • the second threshold is smaller than the first threshold.
  • both the second threshold and the first threshold can be adjusted, and are not limited to the thresholds exemplified above.
  • When the similarity 1 is between the second threshold and the first threshold, the sound wave signal A detected by the electronic device 100 may contain the preset wake-up word. The electronic device 100 can then use ultrasonic positioning, according to S801-S805, to further determine whether the sound wave signal A contains the preset wake-up word and hence whether the electronic device 100 wakes up.
  • S801: The electronic device 100 obtains an obstacle position from the time between transmitting and receiving the ultrasonic signal and the propagation speed of ultrasound in air, and may additionally use the sound source position obtained with the TOA algorithm or the TDOA algorithm.
  • FIG. 9 is a schematic diagram of the principle of positioning an obstacle by an ultrasonic signal in a wake-up method provided by an embodiment of the present application.
  • the electronic device 100 is provided with P ultrasonic transmitters and Q ultrasonic receivers.
  • the P ultrasonic transmitters and the Q ultrasonic receivers face different directions on the electronic device 100 . That is, the P ultrasonic transmitters transmit ultrasonic waves in the K direction, and the Q ultrasonic receivers are not located in the K direction of the P ultrasonic transmitters.
  • The Q ultrasonic receivers therefore cannot receive ultrasonic signals directly from the P ultrasonic transmitters; they can only receive ultrasonic signals that were transmitted by the P ultrasonic transmitters and then reflected by obstacles.
  • P is a positive integer greater than or equal to 1
  • Q is a positive integer greater than 1.
  • the P ultrasonic transmitters include a speaker 1001, a speaker 1002, and a speaker 1003
  • the Q ultrasonic receivers include a microphone 1011, a microphone 1012, and a microphone 1013
  • the speaker 1001 can transmit the ultrasonic signal 1 within a certain angle range.
  • the speaker 1002 can transmit the ultrasonic signal 2 (not shown in FIG. 9 ) within a certain angular range
  • the speaker 1003 can transmit the ultrasonic signal 3 (not shown in FIG. 9 ) within a certain angular range.
  • the ultrasonic signal 1 , the ultrasonic signal 2 and the ultrasonic signal 3 are reflected after encountering obstacles including the user 200 .
  • The microphone 1011, the microphone 1012, and the microphone 1013 can collect the reflected ultrasonic signal 1, ultrasonic signal 2, and ultrasonic signal 3.
  • The ultrasonic signal 1 is reflected after encountering obstacles including the user 200, and the reflected ultrasonic signal 1 reaches the microphone 1011, the microphone 1012, and the microphone 1013 after different durations and at different time points.
  • The electronic device 100 obtains the position of the obstacle from the time elapsed between transmitting and receiving the ultrasonic signal 1 and the propagation speed of ultrasound in air. The position obtained in this way also has a certain deviation.
  • the acquired position of the obstacle may be the obstacle position B3 shown in FIG. 3 .
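  • The time-of-flight relation behind this step can be sketched as follows. The sketch assumes, for simplicity, that the transmitter and the receiver are close enough together that the echo path is twice the obstacle distance, and that sound travels at 343 m/s. The function name is hypothetical, and a full implementation would combine the ranges seen by several microphones to obtain a position rather than a single distance.

```python
def echo_distance(t_transmit_s, t_receive_s, speed_m_per_s=343.0):
    """Distance to a reflecting obstacle from the round-trip time of an echo."""
    round_trip = t_receive_s - t_transmit_s
    if round_trip < 0:
        raise ValueError("echo cannot arrive before transmission")
    # The signal travels to the obstacle and back, hence the division by two.
    return speed_m_per_s * round_trip / 2.0
```

  • For example, an echo received 10 ms after transmission corresponds to an obstacle roughly 1.7 m away under these assumptions.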
  • FIG. 10 is a schematic diagram of an ultrasonic signal locating an obstacle position in a wake-up method provided by an embodiment of the present application. As shown in FIG. 10, ultrasonic positioning yields two obstacle positions, namely the obstacle position B3 and the obstacle position B5.
  • Because the obstacle position B5 differs greatly from the sound source position B2 and lies outside a certain range, the obstacle position B5 is excluded; because the difference between the obstacle position B3 and the sound source position B2 is within that range, the obstacle position B3 is retained.
  • With this method, one or more obstacle positions may be retained.
  • The retained obstacle position may be recorded as positioning result 1.
  • When multiple obstacle positions are retained, they may be superimposed and averaged to obtain one obstacle position.
  • The certain range may be the range of directions that lie within a certain angle of the direction from the sound source position B2.
  • The certain angle may be a preset angle.
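  • The screening and averaging of obstacle positions can be sketched as below. The sketch assumes 2D coordinates with the electronic device at the origin and a preset angle of 15 degrees; the function names and these values are illustrative assumptions.

```python
import math


def filter_obstacles(candidates, source_pos, max_angle_deg=15.0):
    """Keep obstacle positions whose bearing is within max_angle_deg of the source's."""
    source_angle = math.atan2(source_pos[1], source_pos[0])
    kept = []
    for x, y in candidates:
        diff = math.atan2(y, x) - source_angle
        # Wrap the difference into [-pi, pi] so 359 deg vs 1 deg counts as 2 deg.
        diff = math.atan2(math.sin(diff), math.cos(diff))
        if abs(math.degrees(diff)) <= max_angle_deg:
            kept.append((x, y))
    return kept


def average_position(positions):
    """Average several retained obstacle positions into one."""
    n = len(positions)
    return (sum(x for x, _ in positions) / n, sum(y for _, y in positions) / n)


# Example: B2 lies at 45 degrees; one candidate is near that bearing, one is not.
kept = filter_obstacles([(1.0, 1.2), (-1.0, 1.0)], source_pos=(1.0, 1.0))
```

  • Here the candidate at about 50 degrees is retained (like B3), while the candidate at 135 degrees is excluded (like B5).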
  • The ultrasonic signal 2 can also be transmitted separately, and the positioning result 2, based on the transmission and reflection of the ultrasonic signal 2, can be obtained in the same manner.
  • The positioning result 2 may include one or more obstacle positions.
  • The ultrasonic signal 3 can also be transmitted separately, and the positioning result 3, based on the transmission and reflection of the ultrasonic signal 3, can be obtained in the same manner.
  • The positioning result 3 may include one or more obstacle positions.
  • the electronic device 100 may use a preset clustering algorithm to perform cluster analysis on the above-mentioned positioning result 1 , positioning result 2 and positioning result 3 .
  • The clustering algorithm may include the k-means clustering algorithm or the self-organizing map (SOM) neural network clustering algorithm.
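  • As a hedged illustration of the k-means option, the sketch below clusters 2D obstacle positions pooled from several positioning results and returns the centre of the most populated cluster. The deterministic initialisation, the choice of k, and the rule of picking the largest cluster are assumptions made for this example; the application does not state how the clustering output is used.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on 2D points; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


def densest_cluster_center(points, k=2):
    """Centre of the cluster that contains the most positions."""
    centroids, clusters = kmeans(points, k)
    best = max(range(k), key=lambda i: len(clusters[i]))
    return centroids[best]


# Example: three positioning results agree near (1, 1); one outlier sits at (5, 5).
center = densest_cluster_center([(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.2)])
```

  • In this example the agreeing positions dominate, so the outlier does not pull the final obstacle position away from them.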
  • the electronic device 100 may also use only one obstacle position determined by the ultrasonic signal 1 .
  • The sound wave signal collected by each microphone may include an ultrasonic signal and may also include a sound wave signal perceptible to the human ear.
  • Each microphone can first feed the collected sound wave signal into the corresponding high-pass filter, which removes the components below 20000 Hz and leaves the corresponding ultrasonic signal.
  • the electronic device 100 can determine an obstacle position according to the above method.
  • S802: The electronic device 100 uses the direction from the obstacle position as the second sound pickup direction and acquires the component of the sound wave signal A in the second sound pickup direction, that is, the sound wave signal A".
  • FIG. 11 is a schematic flowchart of acoustic wave signal processing in a wake-up method provided by an embodiment of the present application.
  • The electronic device 100 can use the direction from the obstacle position as the second pickup direction. After aligning the collected M channels of sound wave signal A in time, the electronic device 100 extracts the component of each channel in the second pickup direction, obtaining the M channels of sound wave signal A components 1101. The electronic device 100 can then fuse these components 1101 to obtain the sound wave signal A".
  • S803: The electronic device 100 may input the sound wave signal A" into the preset wake-up word model and calculate the similarity 2 (also referred to as the second similarity) between the sound wave signal A" and the wake-up word.
  • When the similarity 2 is greater than the third threshold, the electronic device 100 may determine that the sound wave signal A contains the preset wake-up word. The electronic device 100 can then perform step S804.
  • The third threshold may be greater than or smaller than the first threshold.
  • For example, the third threshold may be 95% or 95 points, or 80% or 80 points.
  • Otherwise, the electronic device 100 may determine that the sound wave signal A does not contain the preset wake-up word. The electronic device 100 may then perform step S805.
  • S804: The electronic device 100 may invoke a voice assistant or activate a function of the voice assistant.
  • S805: The electronic device 100 remains in a non-awake state (e.g., a standby state).
  • the wake-up method provided by the present application can be divided into two stages.
  • In the first stage, the electronic device determines, according to the sound wake-up process, the similarity 1 between the detected sound wave signal and the preset wake-up word.
  • When the similarity 1 is greater than the first threshold, the electronic device wakes up; when the similarity 1 is less than the second threshold, the electronic device remains in an unawakened state; when the similarity 1 is between the first threshold and the second threshold, the wake-up method enters the second stage.
  • In the second stage, the electronic device uses the ultrasonic signal to locate obstacle positions, screens and combines them using the sound source position identified in the first stage, and finally obtains one obstacle position.
  • The electronic device then calculates the similarity 2 between the preset wake-up word and the sound wave signal component in the direction from that obstacle position. If the similarity 2 satisfies the corresponding threshold condition, the electronic device wakes up; otherwise, it remains in an unawakened state.
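  • The two-stage decision can be summarised in a short sketch. Similarities are expressed here as fractions between 0 and 1, the threshold values are example settings only, and `compute_similarity_2` is a hypothetical callback standing in for the ultrasonic positioning and second scoring, which run only when the first stage is inconclusive.

```python
def two_stage_wake(similarity_1, compute_similarity_2,
                   first_threshold=0.90, second_threshold=0.60,
                   third_threshold=0.80):
    """Return True if the device should wake up, False otherwise."""
    if similarity_1 > first_threshold:
        return True   # stage 1: confident match
    if similarity_1 < second_threshold:
        return False  # stage 1: confident mismatch
    # Stage 2: inconclusive, so re-score along the ultrasonically located direction.
    return compute_similarity_2() > third_threshold
```

  • Passing the second stage as a callable keeps the more expensive ultrasonic step off the common paths where the first stage already decides.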
  • When the first threshold is set higher, the third threshold may be set to a value smaller than the first threshold.
  • For example, the first threshold may be set to 95 and the third threshold to a value less than 95 (e.g., 70 or 80). That is, the electronic device determines in the first stage that the sound wave signal contains the preset wake-up word only when the similarity 1 is greater than the first threshold (for example, 95). Otherwise, the second stage is entered: the electronic device performs ultrasonic positioning and calculates the similarity 2, and wakes up when the similarity 2 is greater than the third threshold (e.g., 70 or 80).
  • When the first threshold is set lower, the third threshold may be set to a value greater than the first threshold.
  • For example, the first threshold may be set to 75 and the third threshold to a value greater than 75 (e.g., 85 or 95). That is, when the first stage detects that the similarity 1 is less than the first threshold (for example, 75), the second stage is entered: the electronic device performs ultrasonic positioning and calculates the similarity 2. When the similarity 2 is greater than the third threshold (e.g., 85 or 95), the electronic device wakes up.
  • the present application also provides another embodiment of the wake-up method.
  • This other wake-up method has the same first stage as the aforementioned wake-up method but a different second stage. The first stage is not described again here.
  • FIG. 12 is a schematic partial flowchart of another wake-up method provided by an embodiment of the present application.
  • In the second stage of this method, the second sound pickup direction is determined through ultrasonic positioning, and the third sound pickup direction is then determined from the first sound pickup direction and the second sound pickup direction. The component of the sound wave signal A in the third pickup direction, that is, the sound wave signal A"', is then obtained, and whether the sound wave signal A"' contains the preset wake-up word is determined in order to decide whether the electronic device 100 wakes up.
  • When the similarity 1 is greater than the first threshold (for example, 90% or 90 points), the electronic device 100 may determine that the sound wave signal A contains the preset wake-up word, and the electronic device 100 wakes up. Exemplarily, the electronic device 100 invokes a voice assistant, and the voice assistant interacts with the user by voice.
  • When the similarity 1 is less than the second threshold (for example, 60% or 60 points), the sound wave signal A detected by the electronic device 100 differs considerably from the preset wake-up word, and the electronic device 100 can determine that the sound wave signal A does not contain the preset wake-up word. The electronic device 100 then remains in an unawakened state.
  • the second threshold is smaller than the first threshold.
  • both the second threshold and the first threshold can be adjusted, and are not limited to the thresholds exemplified above.
  • Part of the flow of this other wake-up method includes:
  • S1201: The electronic device 100 obtains an obstacle position from the time between transmitting and receiving the ultrasonic signal and the propagation speed of ultrasound in air, and may additionally use the sound source position obtained with the TOA algorithm or the TDOA algorithm.
  • For S1201, refer to the description of S801; details are not repeated here.
  • S1202: The electronic device 100 uses the direction from the obstacle position as the second sound pickup direction and determines the third sound pickup direction from the second sound pickup direction and the first sound pickup direction.
  • the electronic device may also combine the sound source position B2 and the obstacle position B3 to re-determine the sound source position of the user.
  • FIG. 13 is a schematic diagram of a sound source position in another wake-up method provided by an embodiment of the present application.
  • the upper surface of the electronic device 100 is the XY axis plane, and the center point of the upper surface is the O point.
  • the X-axis and the Y-axis are two mutually perpendicular coordinate axes passing through the O point.
  • the XY-axis coordinate system is the same as the coordinate system of the previous FIGS. 1B and 1C .
  • the sound source position B2 is the sound source position located by the electronic device 100 in the first stage through the TOA algorithm or the TDOA algorithm;
  • the obstacle position B3 is an obstacle position finally located by the electronic device according to S1201, using the ultrasonic signal;
  • the sound source position B4 is the finally calculated sound source position, which is closer to the user. Connecting the points B2, B3, and B4 to the point O gives the angle α between the line segment B2O and the X axis, the angle β between the line segment B3O and the X axis, and the angle γ between the line segment B4O and the X axis.
  • α reflects the relative direction between the sound source position B2 and the electronic device; β reflects the relative direction between the obstacle position B3 and the electronic device; γ reflects the relative direction between the sound source position B4 and the electronic device.
  • The sign ± in formula (1) is + or −, depending on which of the two angles being compared is greater.
  • The electronic device 100 can also determine the sound source position B2 itself as the sound source position of the user; in this case, the angle γ between the determined sound source position B4 and the X axis is equal to α.
  • The electronic device can adjust the weighting in formula (1) through the proportional coefficient k and finally determine γ, that is, the relative direction between the sound source position B4 and the electronic device 100. In other words, starting from the sound source position B2 located first, the electronic device can correct it using the obstacle position B3 obtained by ultrasonic positioning and obtain the sound source position B4, which is closer to the user's position. In this way, when the located sound source position B2 deviates greatly from the user's position because of factors such as noise, the electronic device 100 can still obtain the sound source position B4, which is closer to the user's position.
  • ⁇ 1 can also be a negative value.
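  • Formula (1) is not reproduced in this text, so the following is only a sketch under an assumed form. It writes the angles of B2, B3, and B4 to the X axis as `alpha`, `beta`, and `gamma`, and moves `alpha` toward `beta` by a fraction `k` of their difference. The linear form, the function name, and the value of `k` are assumptions, chosen only to be consistent with the description of a sign that depends on the ordering of the angles and a proportional coefficient k.

```python
def corrected_angle(alpha, beta, k=0.5):
    """Assumed correction: shift alpha toward beta by a fraction k of the gap.

    alpha: bearing of the acoustically located source position B2 (degrees)
    beta:  bearing of the ultrasonically located obstacle position B3 (degrees)
    k:     proportional coefficient in [0, 1]; 0 keeps alpha, 1 adopts beta
    """
    # (beta - alpha) is positive or negative depending on which angle is
    # larger, which plays the role of the +/- sign in the description.
    return alpha + k * (beta - alpha)
```

  • Under this assumed form, `corrected_angle(30.0, 50.0, 0.5)` gives a bearing halfway between the two estimates, and `k = 0` reduces to the case where B2 is used unchanged.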
  • S1203: The electronic device 100 acquires the component of the sound wave signal A in the third pickup direction, that is, the sound wave signal A"'.
  • The electronic device 100 may use the direction indicated by the angle between the line segment B4O and the X axis as the third pickup direction.
  • FIG. 14 is a schematic diagram of acoustic signal processing in a wake-up method provided by an embodiment of the present application. As shown in FIG. 14, the electronic device 100 can extract the components 1401 of the M channels of sound wave signal A in the third pickup direction and fuse the extracted components 1401 into the sound wave signal A"', thereby obtaining the sound wave signal in the third pickup direction.
  • S1204: The electronic device 100 may input the sound wave signal A"' into the preset wake-up word model and calculate the similarity between the sound wave signal A"' and the preset wake-up word, that is, similarity 3 (also referred to as the third similarity).
  • The fusion and subsequent processing are similar to those for the first pickup direction and the second pickup direction described above; refer to the foregoing content, which is not repeated here.
  • When the similarity 3 is greater than the third threshold, the electronic device 100 can determine that the sound wave signal A contains the preset wake-up word. The electronic device 100 can then execute step S1205.
  • The third threshold may be greater than or smaller than the first threshold.
  • For example, the third threshold may be 95% or 95 points, or 80% or 80 points.
  • Otherwise, the electronic device 100 may determine that the sound wave signal A does not contain the preset wake-up word. The electronic device 100 can then perform step S1206.
  • S1205: The electronic device 100 may invoke a voice assistant or activate a function of the voice assistant.
  • S1206: The electronic device 100 remains in a non-awake state (e.g., a standby state).
  • Since the third sound pickup direction is closer to the direction indicated by the user's sound source position, the similarity 3 calculated when the electronic device 100 inputs the sound wave signal A"' into the wake-up word model is more accurate; the wake-up accuracy of the electronic device 100 is therefore higher, and the probability of false wake-up is lower.
  • The inventors have verified through experiments that, in a noisy scene, an electronic device 100 using the wake-up method provided by the embodiments of the present application achieves a higher wake-up accuracy and a lower probability of false wake-up.
  • When detecting whether the sound wave signal A contains a wake-up word according to the above method, the electronic device 100 can obtain the first sound pickup direction and the second sound pickup direction, derive the third sound pickup direction from them, and decide whether to wake up according to the similarity between the wake-up word and the component of the sound wave signal A in the third sound pickup direction. If that similarity is greater than the preset threshold, the electronic device wakes up; otherwise, it remains unawakened.
  • Here, the similarity is obtained by inputting the component of the sound wave signal A in the third pickup direction into the wake-up word model, extracting its sound wave feature with the preset algorithm, and comparing that feature with the sound wave feature of the preset wake-up word. Refer to the description above for details.
  • When detecting whether the sound wave signal A contains a wake-up word according to the above method, the electronic device 100 can obtain the similarity 1 between the sound wave signal A' (the component of the sound wave signal A in the first pickup direction) and the wake-up word, the similarity 2 between the sound wave signal A" (the component of the sound wave signal A in the second pickup direction) and the wake-up word, and the similarity 3 between the sound wave signal A"' (the component of the sound wave signal A in the third pickup direction) and the wake-up word.
  • That is, the electronic device 100 can obtain all three similarities. Suppose similarity 3 is the highest of the three; this indicates that the third pickup direction, which corresponds to similarity 3, is closest to the direction indicated by the user's sound source position. The electronic device 100 can then use the sound wave signal corresponding to similarity 3 as the basis for recognizing the wake-up word this time and compare similarity 3 with the preset threshold. If similarity 3 is greater than the preset threshold, the electronic device wakes up; afterwards, it uses the third pickup direction to extract the sound wave signals it subsequently detects and to execute further voice commands.
  • Alternatively, when detecting whether the sound wave signal A contains a wake-up word according to the above method, the electronic device 100 can obtain only the similarity 1 between the sound wave signal A' (the component of the sound wave signal A in the first pickup direction) and the wake-up word and the similarity 2 between the sound wave signal A" (the component of the sound wave signal A in the second pickup direction) and the wake-up word.
  • That is, the electronic device 100 can obtain the two similarities 1 and 2. Suppose similarity 2 is the higher of the two; this indicates that the second pickup direction, which corresponds to similarity 2, is closer to the direction indicated by the user's sound source position. The electronic device 100 can then use the sound wave signal corresponding to similarity 2 as the basis for recognizing the wake-up word this time and compare similarity 2 with the preset threshold. If similarity 2 is greater than the preset threshold, the electronic device wakes up; afterwards, it uses the second pickup direction to extract the sound wave signals it subsequently detects and to execute further voice commands.
  • In other embodiments, after the electronic device 100 calculates similarity 1 according to the above method, if similarity 1 lies between the second threshold and the first threshold, the electronic device 100 can calculate similarity 2 and/or similarity 3 according to the above method. That is, when the electronic device 100 detects whether sound wave signal A contains the wake-up word, at least one of similarity 2 and similarity 3 can be obtained once similarity 1 meets this threshold condition. The electronic device 100 may then determine the maximum value among all the similarities obtained (for example, similarity 1, similarity 2, and similarity 3).
  • If similarity 1 is the maximum value, sound wave signal A' detected by the electronic device 100 is closest to the preset wake-up word, and the first pickup direction is closest to the direction indicated by the user's sound source position. The electronic device 100 can then use sound wave signal A' as the basis for this wake-up word recognition and decide whether to wake up according to similarity 1 between sound wave signal A' and the wake-up word.
  • The electronic device 100 can continue to detect sound wave signals in the first pickup direction corresponding to similarity 1, and thereby recognize the user's voice commands.
  • The voice assistant can be used to collect further sound wave signals.
  • The electronic device can acquire the component of the sound wave signal in the first pickup direction, and recognize and execute the corresponding voice command according to that component.
  • The sound wave signal component in the first pickup direction reproduces the user's actual voice more faithfully.
  • The accuracy of subsequent speech recognition by the electronic device 100 improves accordingly. Taking a noisy scene as an example, the above method can improve the speech recognition accuracy of the electronic device 100 by 4% or more.
  • the electronic device that executes the wake-up method in the foregoing embodiment is an electronic device having an ultrasonic positioning function, which is not limited in this embodiment of the present application.
  • FIG. 15 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • The electronic device may specifically include: multiple ultrasonic transmitters 1501 (an ultrasonic transmitter 1501 may specifically be a speaker); multiple ultrasonic receivers 1502 (an ultrasonic receiver 1502 may specifically be a microphone); one or more processors 1503; a memory 1504; one or more application programs (not shown); and one or more computer programs 1505. These components may be connected by one or more communication buses 1506.
  • The one or more computer programs 1505 are stored in the memory 1504 and configured to be executed by the one or more processors 1503; the one or more computer programs 1505 include instructions that can be used to perform the steps performed by the electronic device in the above embodiments.
  • The electronic device may also include input devices such as a touch screen (which may include a touch sensor and a display screen) and a mouse.
  • FIG. 15 is only an example, and is not used to limit the scope of the present application.
  • The electronic device provided by the present application may also have other hardware structures.
  • The functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
  • The above integrated units may be implemented in the form of hardware or in the form of software functional units.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • The computer-readable storage medium includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage medium includes flash memory, a removable hard disk, read-only memory, random access memory, a magnetic disk, an optical disk, and other media that can store program code.

Abstract

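A wake-up method and an electronic device. The electronic device includes: a processor; a memory; M microphones; P ultrasonic transmitters; Q ultrasonic receivers; and a computer program on the memory which, when executed by the processor, causes the electronic device to: detect a first sound wave signal through the M microphones; obtain a first pickup direction; when the similarity between the component of the first sound wave signal in the first pickup direction and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, obtain a second pickup direction by transmitting and receiving a second sound wave signal, the second sound wave signal being an ultrasonic signal; and when the similarity between the component of the first sound wave signal in the second pickup direction and the preset wake-up word is greater than a preset third threshold, wake up. This improves the wake-up accuracy of the electronic device, reduces the probability of false wake-ups, and improves the user experience.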

Description

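Wake-up method and electronic device
This application claims priority to Chinese patent application No. 202110075531.7, entitled "Wake-up method and electronic device" (一种唤醒方法及电子设备), filed with the China National Intellectual Property Administration on January 20, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of terminals, and in particular to a wake-up method and an electronic device.
Background
With the development of speech recognition technology, many electronic devices are equipped with a voice assistant (for example, 小E, Siri, and so on) for voice interaction with the user. Typically, an electronic device presets one or more wake-up words (for example, "你好小E", "hi Siri", and so on). After detecting a preset wake-up word, the electronic device wakes up and interacts with the user by voice through the voice assistant.
In practice, however, the electronic device sometimes fails to wake up even though the sound wave signal uttered by the user contains the preset wake-up word, and sometimes wakes up even though the sound wave signal uttered by the user does not contain the preset wake-up word. This gives the user a poor experience.
Summary
To solve the above technical problem, the present application provides a wake-up method and an electronic device. The technical solution provided by the present application can improve the wake-up accuracy of the electronic device, reduce the probability of false wake-ups, and improve the user experience.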
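In a first aspect, an electronic device in a non-awake state is provided. The electronic device includes: a processor; a memory; M microphones (M is a positive integer greater than 1), each corresponding to one pickup inlet, where the M pickup inlets are located on a first surface of the electronic device, the first surface lies in one plane, and the distance between any two of the M microphones is fixed; P ultrasonic transmitters (P is a positive integer greater than or equal to 1), each corresponding to one ultrasonic emission port, where the P ultrasonic emission ports are located on a second surface that is different from the first surface; Q ultrasonic receivers (Q is a positive integer greater than 1), each corresponding to one ultrasonic receiving port, where the Q ultrasonic receiving ports are located on a third surface of the electronic device, the third surface lies in one plane and is different from the first surface, the distance between any two of the Q ultrasonic receivers is fixed, and the Q ultrasonic receiving ports and the P ultrasonic emission ports face different directions; and a computer program stored on the memory which, when executed by the processor, causes the electronic device to perform the following steps:
detecting a first sound wave signal through the M microphones; in response to the first sound wave signal, obtaining a first pickup direction according to the differences between the arrival times of the first sound wave signal at at least two of the M microphones and the distances between some or all of those at least two microphones, where the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of the first surface, relative to a fixed point on that plane (the fixed point being different from the first projection point); then obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction; after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, transmitting a second sound wave signal through the P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal; then receiving the second sound wave signal through the Q ultrasonic receivers; in response to the second sound wave signal, obtaining a second pickup direction according to the differences between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and the distances between some or all of those at least two ultrasonic receivers, where the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point (the fixed point being different from the second projection point); then obtaining a second sound wave signal component of the second sound wave signal in the second pickup direction; and after the similarity between the second sound wave signal component and the preset wake-up word is greater than a preset third threshold, which indicates that the first sound wave signal contains the wake-up word, waking up the electronic device.
As can be seen, the wake-up method performed by the electronic device of the first aspect can be divided into two stages. In the first stage, the electronic device follows the voice wake-up procedure: it locates the first pickup direction and then determines the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word. When that similarity lies between the second threshold and the first threshold, the method enters the second stage. In the second stage, the electronic device uses ultrasonic signals to locate the second pickup direction and then determines the similarity between the component of the first sound wave signal in the second pickup direction and the preset wake-up word. When that similarity satisfies the corresponding threshold condition, the electronic device wakes up. In this way, the electronic device determines the user's sound source position through two stages of pickup-direction localization and recognizes the wake-up word according to the finally determined sound source position, which improves wake-up accuracy and reduces the probability of false wake-ups.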
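According to the first aspect, after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the electronic device further performs: waking up the electronic device. That is, if in the first stage the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word is high, the first sound wave signal detected by the electronic device is close to the preset wake-up word, so the electronic device wakes up without entering the second stage for a further localization.
According to the first aspect, or any implementation of the first aspect above, after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the electronic device further performs: remaining in the non-awake state. That is, if in the first stage the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word is low, the first sound wave signal detected by the electronic device differs considerably from the preset wake-up word, so the electronic device remains in the non-awake state and likewise does not enter the second stage.
According to the first aspect, or any implementation of the first aspect above, after the similarity between the second sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the electronic device further performs: remaining in the non-awake state. Conversely, if in the second stage the similarity between the component of the first sound wave signal in the second pickup direction and the preset wake-up word is high, then although the first stage did not find a high similarity between the first sound wave signal and the wake-up word, ultrasonic localization shows that the first sound wave signal is in fact close to the preset wake-up word, and the electronic device wakes up.
According to the first aspect, or any implementation of the first aspect above, the Q ultrasonic receivers may specifically be some or all of the M microphones, where Q is less than or equal to M; in this case the ultrasonic receiving ports are the pickup inlets, and the third surface is the same as the first surface. In this way, the electronic device can use its existing microphones for ultrasonic localization without adding ultrasonic receivers, which reduces the cost of ultrasonic localization in voice interaction scenarios.
According to the first aspect, or any implementation of the first aspect above, the Q ultrasonic receivers may be different from some or all of the M microphones.
According to the first aspect, or any implementation of the first aspect above, the electronic device further includes N speakers, where the N sound wave emission ports of the N speakers are located on a fourth surface, N is a positive integer greater than or equal to 1, and the fourth surface is different from the first surface.
According to the first aspect, or any implementation of the first aspect above, the P ultrasonic transmitters are some or all of the N speakers, where P is less than or equal to N; in this case the ultrasonic emission ports are the sound wave emission ports, and the fourth surface is the same as the second surface. In this way, the electronic device can use its existing speakers for ultrasonic localization without adding ultrasonic transmitters, which reduces the cost of ultrasonic localization in voice interaction scenarios.
According to the first aspect, or any implementation of the first aspect above, the P ultrasonic transmitters may be different from some or all of the N speakers.
According to the first aspect, or any implementation of the first aspect above, the second surface is parallel to the first surface.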
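In a second aspect, an electronic device in a non-awake state is provided. The electronic device includes: a processor; a memory; M microphones (M is a positive integer greater than 1), each corresponding to one pickup inlet, where the M pickup inlets are located on a first surface of the electronic device, the first surface lies in one plane, and the distance between any two of the M microphones is fixed; P ultrasonic transmitters (P is a positive integer greater than or equal to 1), each corresponding to one ultrasonic emission port, where the P ultrasonic emission ports are located on a second surface that is different from the first surface; Q ultrasonic receivers (Q is a positive integer greater than 1), each corresponding to one ultrasonic receiving port, where the Q ultrasonic receiving ports are located on a third surface of the electronic device, the third surface lies in one plane and is different from the first surface, the distance between any two of the Q ultrasonic receivers is fixed, and the Q ultrasonic receiving ports and the P ultrasonic emission ports face different directions; and a computer program stored on the memory which, when executed by the processor, causes the electronic device to perform the following steps:
detecting a first sound wave signal through the M microphones; in response to the first sound wave signal, obtaining a first pickup direction according to the differences between the arrival times of the first sound wave signal at at least two of the M microphones and the distances between some or all of those at least two microphones, where the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of the first surface, relative to a fixed point on that plane, the fixed point being different from the first projection point; obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction; after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, transmitting a second sound wave signal through the P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal; receiving the second sound wave signal through the Q ultrasonic receivers; in response to the second sound wave signal, obtaining a second pickup direction according to the differences between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and the distances between some or all of those at least two ultrasonic receivers, where the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, the fixed point being different from the second projection point; determining a third pickup direction according to the first pickup direction and the second pickup direction, where the third pickup direction indicates the direction of a third projection point, which is the projection of a third sound source position onto the plane of the first surface, relative to the fixed point; obtaining a third sound wave signal component of the first sound wave signal in the third pickup direction; and after the similarity between the third sound wave signal component and the preset wake-up word is greater than a preset third threshold, waking up the electronic device.
As can be seen, the wake-up method performed by the electronic device of the second aspect can also be divided into two stages. In the first stage, the electronic device follows the voice wake-up procedure: it locates the first pickup direction and then determines the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word. When that similarity lies between the second threshold and the first threshold, the method enters the second stage. Unlike in the first aspect, in the second stage the electronic device uses ultrasonic signals to locate the second pickup direction and then corrects the first pickup direction with the second pickup direction, obtaining a third pickup direction that is closer to the user's actual position. The electronic device then determines the similarity between the component of the first sound wave signal in the third pickup direction and the preset wake-up word. When that similarity satisfies the corresponding threshold condition, the electronic device wakes up. The wake-up accuracy is therefore higher and the probability of false wake-ups lower.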
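According to the second aspect, after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the electronic device further performs: waking up the electronic device. As in the first aspect, if in the first stage the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word is high, the first sound wave signal detected by the electronic device is close to the preset wake-up word, so the electronic device wakes up without entering the second stage for a further localization.
According to the second aspect, or any implementation of the second aspect above, after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the electronic device further performs: remaining in the non-awake state. As in the first aspect, if in the first stage the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word is low, the first sound wave signal detected by the electronic device differs considerably from the preset wake-up word, so the electronic device remains in the non-awake state and likewise does not enter the second stage.
According to the second aspect, or any implementation of the second aspect above, after the similarity between the third sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the electronic device further performs: remaining in the non-awake state.
According to the second aspect, or any implementation of the second aspect above, determining the third pickup direction according to the first pickup direction and the second pickup direction includes: if the absolute value of the direction deviation between the first pickup direction and the second pickup direction is less than a preset fourth threshold, or greater than a preset fifth threshold, the third pickup direction is the same as the first pickup direction.
According to the second aspect, or any implementation of the second aspect above, determining the third pickup direction according to the first pickup direction and the second pickup direction includes: if the absolute value of the direction deviation between the first pickup direction and the second pickup direction is greater than the preset fourth threshold and less than the fifth threshold, the third pickup direction is obtained by superimposing, on the first pickup direction, the product of that absolute deviation and a preset proportionality coefficient.
According to the second aspect, or any implementation of the second aspect above, the Q ultrasonic receivers are some or all of the M microphones, where Q is less than or equal to M; the ultrasonic receiving ports are the pickup inlets; and the third surface is the same as the first surface.
According to the second aspect, or any implementation of the second aspect above, the Q ultrasonic receivers are different from some or all of the M microphones.
According to the second aspect, or any implementation of the second aspect above, the electronic device further includes N speakers, where the N sound wave emission ports of the N speakers are located on a fourth surface, N is a positive integer greater than or equal to 1, and the fourth surface is different from the first surface.
According to the second aspect, or any implementation of the second aspect above, the P ultrasonic transmitters are some or all of the N speakers, where P is less than or equal to N; the ultrasonic emission ports are the sound wave emission ports; and the fourth surface is the same as the second surface.
According to the second aspect, or any implementation of the second aspect above, the P ultrasonic transmitters are different from some or all of the N speakers.
According to the second aspect, or any implementation of the second aspect above, the second surface is parallel to the first surface.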
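In a third aspect, a wake-up method is provided. The wake-up method includes: detecting a first sound wave signal through M microphones; in response to the first sound wave signal, obtaining a first pickup direction according to the differences between the arrival times of the first sound wave signal at at least two of the M microphones and the distances between some or all of those at least two microphones, where the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of a first surface, relative to a fixed point on that plane, the fixed point being different from the first projection point; obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction; after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, transmitting a second sound wave signal through P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal; receiving the second sound wave signal through Q ultrasonic receivers; in response to the second sound wave signal, obtaining a second pickup direction according to the differences between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and the distances between some or all of those at least two ultrasonic receivers, where the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, the fixed point being different from the second projection point; obtaining a second sound wave signal component of the second sound wave signal in the second pickup direction; and after the similarity between the second sound wave signal component and the preset wake-up word is greater than a preset third threshold, waking up the electronic device.
Corresponding to the electronic device of the first aspect, the wake-up method of the third aspect can be divided into two stages. In the first stage, the electronic device follows the voice wake-up procedure: it locates the first pickup direction and then determines the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word. When that similarity lies between the second threshold and the first threshold, the method enters the second stage. In the second stage, the electronic device uses ultrasonic signals to locate the second pickup direction and then determines the similarity between the component of the first sound wave signal in the second pickup direction and the preset wake-up word. When that similarity satisfies the corresponding threshold condition, the electronic device wakes up. In this way, the electronic device determines the user's sound source position through two stages of pickup-direction localization and recognizes the wake-up word according to the finally determined sound source position, which improves wake-up accuracy and reduces the probability of false wake-ups.
According to the third aspect, after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the method further includes: waking up the electronic device.
According to the third aspect, or any implementation of the third aspect above, after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the method further includes: the electronic device remaining in the non-awake state.
According to the third aspect, or any implementation of the third aspect above, after the similarity between the second sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the method further includes: the electronic device remaining in the non-awake state.
Each implementation of the third aspect corresponds to an implementation of the first aspect. For the technical effects of any implementation of the third aspect, refer to the technical effects of the corresponding implementation of the first aspect; they are not repeated here.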
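In a fourth aspect, a wake-up method is provided. The wake-up method includes: detecting a first sound wave signal through M microphones; in response to the first sound wave signal, obtaining a first pickup direction according to the differences between the arrival times of the first sound wave signal at at least two of the M microphones and the distances between some or all of those at least two microphones, where the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of a first surface, relative to a fixed point on that plane, the fixed point being different from the first projection point; obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction; after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold, transmitting a second sound wave signal through P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal; receiving the second sound wave signal through Q ultrasonic receivers; in response to the second sound wave signal, obtaining a second pickup direction according to the differences between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and the distances between some or all of those at least two ultrasonic receivers, where the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, the fixed point being different from the second projection point; determining a third pickup direction according to the first pickup direction and the second pickup direction, where the third pickup direction indicates the direction of a third projection point, which is the projection of a third sound source position onto the plane of the first surface, relative to the fixed point; obtaining a third sound wave signal component of the first sound wave signal in the third pickup direction; and after the similarity between the third sound wave signal component and the preset wake-up word is greater than a preset third threshold, waking up the electronic device.
Corresponding to the electronic device of the second aspect, the wake-up method of the fourth aspect can also be divided into two stages. In the first stage, the electronic device follows the voice wake-up procedure: it locates the first pickup direction and then determines the similarity between the component of the first sound wave signal in the first pickup direction and the preset wake-up word. When that similarity lies between the second threshold and the first threshold, the method enters the second stage. Unlike in the third aspect, in the second stage the electronic device uses ultrasonic signals to locate the second pickup direction and then corrects the first pickup direction with the second pickup direction, obtaining a third pickup direction that is closer to the user's actual position. The electronic device then determines the similarity between the component of the first sound wave signal in the third pickup direction and the preset wake-up word. When that similarity satisfies the corresponding threshold condition, the electronic device wakes up. The wake-up accuracy is therefore higher and the probability of false wake-ups lower.
According to the fourth aspect, after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the method further includes: waking up the electronic device.
According to the fourth aspect, or any implementation of the fourth aspect above, after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the method further includes: the electronic device remaining in the non-awake state.
According to the fourth aspect, or any implementation of the fourth aspect above, after the similarity between the third sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the method further includes: the electronic device remaining in the non-awake state.
According to the fourth aspect, or any implementation of the fourth aspect above, determining the third pickup direction according to the first pickup direction and the second pickup direction includes: after the absolute value of the direction deviation between the first pickup direction and the second pickup direction is less than a preset fourth threshold, or greater than a preset fifth threshold, the third pickup direction is the same as the first pickup direction.
According to the fourth aspect, or any implementation of the fourth aspect above, determining the third pickup direction according to the first pickup direction and the second pickup direction includes: after the absolute value of the direction deviation between the first pickup direction and the second pickup direction is greater than the preset fourth threshold and less than the fifth threshold, the third pickup direction is obtained by superimposing, on the first pickup direction, the product of that absolute deviation and a preset proportionality coefficient.
Each implementation of the fourth aspect corresponds to an implementation of the second aspect. For the technical effects of any implementation of the fourth aspect, refer to the technical effects of the corresponding implementation of the second aspect; they are not repeated here.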
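In a fifth aspect, a wake-up method is provided. The wake-up method includes: detecting a first sound wave signal through M microphones; in response to the first sound wave signal, obtaining a first pickup direction according to the differences between the arrival times of the first sound wave signal at at least two of the M microphones and the distances between some or all of those at least two microphones, where the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of a first surface, relative to a fixed point on that plane, the fixed point being different from the first projection point; transmitting a second sound wave signal through P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal; receiving the second sound wave signal through Q ultrasonic receivers; in response to the second sound wave signal, obtaining a second pickup direction according to the differences between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and the distances between some or all of those at least two ultrasonic receivers, where the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, the fixed point being different from the second projection point; determining a third pickup direction according to the first pickup direction and the second pickup direction, where the third pickup direction indicates the direction of a third projection point, which is the projection of a third sound source position onto the plane of the first surface, relative to the fixed point; obtaining a third sound wave signal component of the first sound wave signal in the third pickup direction; and after the similarity between the third sound wave signal component and a preset wake-up word is greater than a preset third threshold, waking up the electronic device.
In the wake-up method of the fifth aspect, after detecting the first sound wave signal the electronic device can perform two localizations. One localization uses the arrival times of the first sound wave signal at the M microphones and yields the first pickup direction; the other locates obstacles by transmitting and receiving ultrasonic signals and yields the second pickup direction. Correcting the first pickup direction with the second pickup direction then yields a third pickup direction that is closer to the user's actual position. The electronic device then determines the similarity between the component of the first sound wave signal in the third pickup direction and the preset wake-up word. When that similarity satisfies the corresponding threshold condition, the electronic device wakes up. The wake-up accuracy is therefore higher and the probability of false wake-ups lower.
In a sixth aspect, the present application provides a computer-readable storage medium including computer instructions which, when run on the above electronic device, cause the electronic device to perform any of the wake-up methods described above.
In a seventh aspect, the present application provides a computer program product which, when run on the above electronic device, causes the electronic device to perform any of the wake-up methods described above.
It can be understood that the computer-readable storage medium and the computer program product provided in the above aspects are applied to the corresponding methods and electronic devices provided above. For the beneficial effects they can achieve, refer to the beneficial effects of the corresponding electronic devices or methods provided above; they are not repeated here.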
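Brief Description of the Drawings
FIG. 1A is a schematic diagram of a scenario of the wake-up method provided by an embodiment of the present application;
FIG. 1B is a schematic diagram of a sound source position located by the electronic device;
FIG. 1C is a schematic diagram of a sound source position located by the electronic device;
FIG. 2 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the principle of sound wave signal localization in the wake-up method provided by an embodiment of the present application;
FIG. 4 to FIG. 7 are schematic flowcharts of sound wave signal processing in the wake-up method provided by an embodiment of the present application;
FIG. 8 is a schematic partial flowchart of a wake-up method provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the principle of locating an obstacle with ultrasonic signals in a wake-up method provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of obstacle positions located with ultrasonic signals in a wake-up method provided by an embodiment of the present application;
FIG. 11 is a schematic flowchart of sound wave signal processing in a wake-up method provided by an embodiment of the present application;
FIG. 12 is a schematic partial flowchart of another wake-up method provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of sound source positions in another wake-up method provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of sound wave signal processing in another wake-up method provided by an embodiment of the present application;
FIG. 15 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
Detailed Description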
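The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. The terms used in the following embodiments are only for describing particular embodiments and are not intended to limit the present application. As used in the specification and the appended claims, the singular forms "a", "an", "said", "the above", "the", and "this" are intended to also include forms such as "one or more", unless the context clearly indicates otherwise. It should also be understood that, in the following embodiments, "at least one" and "one or more" mean one, two, or more than two. The term "and/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it.
References in this specification to "one embodiment", "some embodiments", and the like mean that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the present application. Thus, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in various places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless specifically emphasized otherwise. The term "connection" covers both direct and indirect connection, unless otherwise stated. "First" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to.
In the embodiments of the present application, words such as "exemplarily" or "for example" are used to present an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" should not be construed as preferred over or more advantageous than other embodiments or designs. Rather, such words are intended to present the relevant concept in a concrete manner.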
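FIG. 1A is a schematic diagram of a scenario of the wake-up method provided by an embodiment of the present application. As shown in FIG. 1A, the electronic device 100 has a voice interaction function and can receive sound wave signals. Specifically, the pickup inlets of multiple microphones or of a microphone array 170C are arranged on the upper surface of the electronic device 100; each microphone or each microphone array corresponds to one pickup inlet; and the multiple microphones or the microphone array 170C receive sound wave signals through the different pickup inlets on the upper surface. Optionally, a speaker (not shown) for outputting sound wave signals may be arranged on another part of the electronic device 100 (for example, a side) or on the upper surface. The user 200 can wake up the electronic device 100 with a sound wave signal and, once the electronic device 100 is awake, control it with further voice commands to perform the corresponding functions. Exemplarily, the electronic device 100 may be a smart speaker, smart TV, smart air conditioner, smart door lock, smart lamp, or another device with a voice interaction function. The present application does not limit this.
It should be noted that sound wave signals include speech signals (signals with frequencies from 20 Hz to 20000 Hz). Optionally, sound wave signals also include ultrasonic signals (frequencies above 20000 Hz). Optionally, sound wave signals may also include infrasonic signals (frequencies below 20 Hz). A sound wave signal uttered by the user means a speech signal uttered by the user.
It should be noted that placing the pickup inlets of the multiple microphones or the microphone array 170C on the upper surface of the electronic device 100 in FIG. 1A is only an illustrative example. The pickup inlets may also be placed on another surface. When the electronic device 100 is in use, the upper surface or the other surface is horizontal or nearly horizontal. "Nearly horizontal" means that although the other surface is not perfectly flat and has some unevenness, the unevenness has little effect and the surface can be treated as approximately horizontal. For ease of description, the following takes the case where the multiple microphones or the microphone array 170C are arranged on the upper surface of the electronic device 100 as an example.
In practice, the electronic device 100 sometimes fails to wake up even though the sound wave signal uttered by the user 200 contains the preset wake-up word, and sometimes wakes up even though the sound wave signal uttered by the user 200 does not contain the preset wake-up word. This disturbs the user and gives the user a poor experience.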
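To solve the above technical problem, the inventors, after long and thorough research, experiments, and analysis, concluded that the above errors have two main causes. Before explaining these two causes, the voice interaction process on the electronic device side, and the process by which the electronic device locates the sound source position from the detected sound wave signal, are introduced first.
It should be noted that, because the pickup inlets of the multiple microphones or the microphone array 170C are arranged on the upper surface of the electronic device 100, the electronic device 100 cannot identify the three-dimensional position of the sound source; it can only identify the position of the projection of the sound source onto the plane of the upper surface, that is, a two-dimensional position. This is explained below with reference to FIG. 1B, a schematic diagram of the sound source position located by the electronic device 100. As shown in FIG. 1B, the upper surface of the electronic device 100 is the XY plane, and the center point of the upper surface is point O. From the received sound wave signal, the electronic device 100 can only identify the sound source position A1 (X1, Y1); it cannot identify the height of the sound source. The notion of sound source position used below therefore essentially means the projection of the sound source position onto the upper surface of the electronic device 100. Taking the center of the upper surface as point O is only an illustrative example; in fact, any fixed point on the upper surface can serve as point O.
The voice interaction process on the electronic device side can generally be divided into five phases: wake-up, response, input, understanding, and feedback. Exemplarily, the voice interaction function may be implemented by a voice assistant installed on the electronic device 100. The five phases are further explained with reference to FIG. 1A. As shown in FIG. 1A, the electronic device 100 is in its pre-wake-up state (for example, a standby state). The user 200 utters a sound wave signal containing the preset wake-up word. After receiving the sound wave signal, the electronic device 100 determines whether it contains the preset wake-up word. If the preset wake-up word is recognized, the electronic device 100 invokes the voice assistant or activates its voice interaction function, wakes up, and enters a working state. Optionally, the electronic device 100 may also respond to the sound wave signal uttered by the user. In this way, the electronic device 100 switches from a first state (for example, a standby state) to a second state (for example, a working state). The user 200 can then issue a further voice command. After receiving the further voice command, the electronic device 100 can recognize the corresponding semantic content with a speech recognition algorithm, that is, understand the further voice command, and perform the corresponding function. To respond to sound wave signals promptly, the pickup apparatus of the electronic device 100 usually needs to be always on. Exemplarily, the pickup apparatus of the electronic device 100 may be a microphone array or multiple microphones, through which the electronic device 100 detects sound wave signals in real time.
To wake up accurately and respond quickly to the user's voice, the electronic device 100 identifies, from the detected sound wave signal, the sound source position corresponding to that signal, obtains the direction from that sound source position (which may be called the pickup direction), obtains the component of the sound wave signal in the pickup direction, and performs its processing on that component. This reduces the amount of data to be processed and speeds up the response.
Specifically, as shown in FIG. 1B, after detecting a sound wave signal, the electronic device 100 can locate the sound source position A1 corresponding to the sound wave signal. The electronic device 100 can then take the direction from sound source position A1 as the pickup direction and obtain the sound wave signal component in that pickup direction. The electronic device 100 then inputs the obtained component into a wake-up word model. In the wake-up word model, a preset algorithm extracts the acoustic features of the component and compares them with the acoustic features of the preset wake-up word to obtain a similarity (also called a confidence). If the similarity is greater than a preset threshold, the electronic device 100 can conclude that the detected sound wave signal contains the preset wake-up word; the electronic device 100 then wakes up and enters the working state. If the similarity is less than the preset threshold, the electronic device 100 can conclude that the detected sound wave signal does not contain the preset wake-up word; the electronic device 100 then remains in its pre-wake-up state (for example, a standby state).
However, the inventors found that the sound source position located by the electronic device 100 from the detected sound wave signal generally deviates from the true position. FIG. 1C is a schematic diagram of the sound source position located by the electronic device 100. As shown in FIG. 1C, the sound source position located by the electronic device 100 from the detected sound wave signal is A1, whereas the user 200 actually utters the sound wave signal at sound source position A2. This deviation affects the subsequent processing by the electronic device 100 and makes the results inaccurate, with large errors.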
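After long and thorough research, experiments, and analysis, the inventors concluded that the deviation in sound source localization has two main causes:
1. The time of arrival (TOA) algorithm or time difference of arrival (TDOA) algorithm that the electronic device 100 uses for sound source localization is not accurate enough in itself, so the calculated sound source position is inherently biased.
2. The environment of the electronic device 100 and the user 200 generally contains noise sources, and the noise they emit also causes deviations in sound source localization. Although the electronic device 100 can filter out part of the noise with a noise reduction algorithm, the residual noise still affects the localization result, so that the sound source position A1 located by the electronic device 100 deviates from the sound source position A2 where the user 200 actually is.
When the sound source position A1 located by the electronic device 100 deviates from the actual sound source position A2 of the user 200, and the electronic device 100 takes the direction from sound source position A1 as the pickup direction and extracts the component of the sound wave signal in that pickup direction, the extracted component cannot accurately reflect the sound wave signal input by the user. As a result, the accuracy with which the electronic device 100 recognizes the wake-up word drops, and the user experience is poor.
To improve the wake-up accuracy of the electronic device, reduce the probability of false wake-ups, and improve the user experience, the present application provides a wake-up method and an electronic device. The wake-up method provided by the embodiments of the present application is applied to an electronic device. The electronic device may be any electronic device with a voice interaction function: a smart home device such as a smart speaker, smart TV, smart air conditioner, smart refrigerator, smart lamp, smart door, smart lock, or smart curtain; a smartphone; a wearable electronic device such as smart glasses, a smart watch, or a smart band; a tablet computer; a notebook computer; a personal digital assistant (PDA); an in-vehicle device; a virtual reality device; an augmented reality device; and so on. The present application does not limit this.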
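Exemplarily, FIG. 2 shows a schematic diagram of the hardware structure of the electronic device 100 provided by an embodiment of the present application. As shown in FIG. 2, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone array 170C, a headset jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, an ultrasonic transmitter 196, an ultrasonic receiver 197, a USB interface 198, and so on.
It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown in FIG. 2, combine some components, split some components, or arrange the components differently. The components shown in FIG. 2 may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory can hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory, which avoids repeated accesses, reduces the waiting time of the processor 110, and thus improves system efficiency.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.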
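The charging management module 140 receives charging input from a charger, which may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 can also supply power to the electronic device through the power management module 141.
The power management module 141 connects the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory interface 120, the wireless communication module 160, and so on. The power management module 141 can also monitor parameters such as battery capacity, battery cycle count, and battery health (leakage, impedance). In some other embodiments, the power management module 141 may be provided in the processor 110. In still other embodiments, the power management module 141 and the charging management module 140 may be provided in the same device.
The wireless communication function of the electronic device 100 can be implemented by the antenna 1, the antenna 2, the wireless communication module 160, and so on.
The mobile communication module 150 can provide wireless communication solutions, including 2G/3G/4G/5G, applied to the electronic device. The mobile communication module 150 may include one or more filters, switches, power amplifiers, low noise amplifiers (LNA), and so on. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and pass them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and radiate it as electromagnetic waves through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the same device as at least some modules of the processor 110.
The wireless communication module 160 can provide wireless communication solutions applied to the electronic device, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 160 may be one or more devices integrating one or more communication processing modules. The wireless communication module 160 receives electromagnetic waves through the antenna 2, frequency-modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 110. The wireless communication module 160 can also receive a signal to be sent from the processor 110, frequency-modulate and amplify it, and radiate it as electromagnetic waves through the antenna 2.
The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example saving music, video, and other files on the external memory card.
The internal memory 121 can be used to store one or more computer programs, which include instructions. By running the instructions stored in the internal memory 121, the processor 110 can cause the electronic device to perform the wake-up method provided in some embodiments of the present application, as well as various functional applications and data processing. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store the operating system and one or more applications (such as a gallery or contacts). The data storage area can store data created during use of the electronic device (such as photos and contacts). In addition, the internal memory 121 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, or universal flash storage (UFS). In other embodiments, the processor 110 causes the electronic device to perform the wake-up method provided in the embodiments of the present application, as well as various functional applications and data processing, by running instructions stored in the internal memory 121 and/or instructions stored in the memory provided in the processor.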
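The electronic device can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone array 170C, the headset jack 170D, the application processor, and so on.
The audio module 170 converts digital audio information into an analog sound wave signal for output, and also converts analog audio input into a digital sound wave signal. The audio module 170 can also encode and decode sound wave signals. In some embodiments, the audio module 170 may be provided in the processor 110, or some functional modules of the audio module 170 may be provided in the processor 110.
The speaker 170A, also called a "loudspeaker", converts an audio electrical signal into a sound wave signal. The user can listen to music or to a hands-free call through the speaker 170A of the electronic device.
The microphone array 170C includes multiple microphones. A microphone converts a sound wave signal into an electrical signal. When making a call or sending a voice message, the user can speak close to the microphone to input the sound wave signal into it. In some embodiments, the electronic device can use the microphone array 170C to collect sound wave signals and identify the sound source from the signals collected by each microphone in the array, implementing functions such as sound source localization and directional recording. The electronic device may be provided with one or more microphone arrays 170C. In another implementation, the microphone array 170C may be replaced by multiple microphones; that is, the electronic device 100 does not include the microphone array 170C but includes multiple microphones. The pickup inlets of the multiple microphones are located on the same surface of the electronic device 100, such as the upper surface.
The sensor module 180 may include a pressure sensor, a gyroscope sensor, a barometric pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and so on; the embodiments of the present application place no limitation on this.
The ultrasonic transmitter 196 and the ultrasonic receiver 197 are used to transmit and receive ultrasonic signals, respectively. There may be one or more of each; the present application does not limit this, and those skilled in the art can configure them according to practical experience or the actual application scenario. An ultrasonic signal is a sound wave signal with a frequency above 20000 Hz (hertz). Ultrasonic signals have good directivity and strong reflection and penetration capabilities.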
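Exemplarily, the ultrasonic transmitter 196 may specifically be multiple speakers 196A (not shown in FIG. 2); that is, the speakers 196A can transmit ultrasonic signals. Exemplarily, the ultrasonic receiver 197 may specifically be a microphone array 197A (not shown in FIG. 2) or multiple microphones 197B (not shown in FIG. 2); that is, the microphone array 197A and the multiple microphones 197B can receive ultrasonic signals.
In some embodiments, the electronic device 100 does not include the ultrasonic transmitter 196; its function is integrated into the speaker 170A. That is, the speaker 170A can emit both sound wave signals perceptible to the human ear and ultrasonic signals, so the electronic device does not need a separate ultrasonic transmitter 196.
Similarly, in some embodiments, the electronic device 100 does not include the ultrasonic receiver 197; its function is integrated into the microphone array 170C. That is, the microphone array 170C can receive both sound wave signals perceptible to the human ear and ultrasonic signals, so the electronic device does not need a separate ultrasonic receiver 197.
In some embodiments, the electronic device 100 includes neither the ultrasonic transmitter 196 nor the ultrasonic receiver 197. The function of the ultrasonic transmitter 196 is integrated into the speaker 170A, and the function of the ultrasonic receiver 197 is integrated into the microphone array 170C.
The USB interface 198 can be used to connect other devices. Exemplarily, the USB interface 198 may be one or more USB interfaces.
It should be noted that, when the electronic device 100 includes the ultrasonic transmitter 196 and the multiple microphones or microphone array 170C, and the ultrasonic transmitter 196 is not integrated with the multiple microphones or microphone array 170C, the plane containing the top of the ultrasonic transmitter 196 is parallel or approximately parallel to the plane containing the pickup inlets of the multiple microphones or microphone array 170C. "Approximately parallel" means that although the two planes are not parallel, the angle between them is so small that they can be regarded as parallel.
Optionally, when the electronic device 100 is a smart speaker, it may also include one or more of a GPU, a display screen, buttons, and the like. The embodiments of the present application place no limitation on this.
Optionally, when the electronic device 100 is a smart TV, it may also include one or more of a GPU, a display screen, and the like, and may also be equipped with one or more of a remote control, an infrared sensor, and the like. The embodiments of the present application place no limitation on this.
Optionally, when the electronic device 100 is a smartphone, it may also include one or more of a GPU, a display screen, a headset jack, buttons, a battery, a motor, an indicator, a SIM card interface, and the like. The embodiments of the present application place no limitation on this.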
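In the embodiments of the present application, when the electronic device 100 detects whether a received sound wave signal contains the wake-up word, it can bring in ultrasonic signals to detect the sound source position of the user, which improves the accuracy of sound source localization.
FIG. 3 is a schematic diagram of the principle of sound wave signal localization in the wake-up method provided by an embodiment of the present application.
As shown in FIG. 3, the user 200 utters a sound wave signal at sound source position B1. From the received sound wave signal (which includes but is not limited to the sound wave signal uttered by the user 200), the electronic device 100 performs localization with the TOA or TDOA algorithm and obtains sound source position B2. After receiving the sound wave signal, the electronic device 100 can also locate the user with an ultrasonic localization method and obtain obstacle position B3. The electronic device 100 then combines the two localization results (sound source position B2 and obstacle position B3) to finally determine the user's sound source position B4 (which may or may not coincide with sound source position B2 or obstacle position B3). In this way, the electronic device 100 corrects sound source position B2 with obstacle position B3, so that the determined sound source position B4 is closer to the user's actual sound source position B1.
The electronic device can then take the direction from sound source position B4 as the pickup direction and determine whether the detected sound wave signal contains the wake-up word. Because the deviation between the determined sound source position B4 and the user's actual sound source position B1 is small, judging whether the wake-up word is present from the sound wave signal component in the direction of sound source position B4 is more accurate. This improves wake-up accuracy, reduces the probability of false wake-ups, and improves the user experience.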
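Exemplarily, the electronic device 100 may include N speakers (N is a positive integer greater than 1) and L microphone arrays (L is a positive integer greater than or equal to 1), where each microphone array includes M microphones (M is a positive integer greater than 1). The N speakers and the L microphone arrays are arranged at different positions on the electronic device 100. The distance between any two of the N speakers and the distance between any two of the M microphones are fixed (the distances may be equal or unequal, but each is fixed). The pickup inlets of the M microphones or of the L microphone arrays are located on the same surface of the electronic device 100, such as the upper surface.
Alternatively, the electronic device 100 may include N speakers (N is a positive integer greater than 1) and M microphones (M is a positive integer greater than 1). The N speakers and the M microphones are arranged at different positions on the electronic device 100. The distance between any two of the N speakers and the distance between any two of the M microphones are fixed. The pickup inlets of the M microphones are located on the same surface of the electronic device 100, such as the upper surface.
For ease of description, the following takes M microphones located in one microphone array as an example. Those skilled in the art should understand that M separate microphones that are not part of a microphone array also fall within the protection scope of the present application.
Each of the N speakers can serve as an ultrasonic transmitter and transmit ultrasonic signals (sound wave signals above 20000 Hz), and can also play sound wave signals perceptible to the human ear (20 Hz to 20000 Hz). Each of the M microphones can serve as an ultrasonic receiver and receive ultrasonic signals, and can also collect sound wave signals perceptible to the human ear. In this way, the electronic device 100 can perform ultrasonic localization with its speakers and microphones without adding ultrasonic transmitters or receivers, which reduces the cost of ultrasonic localization in voice interaction scenarios.
Alternatively, each of the N speakers can only play sound wave signals perceptible to the human ear (20 Hz to 20000 Hz), and each of the M microphones can only collect sound wave signals perceptible to the human ear. The electronic device 100 is then additionally provided with P ultrasonic transmitters and Q ultrasonic receivers, where P is a positive integer greater than or equal to 1 and Q is a positive integer greater than 1. The distance between any two of the Q ultrasonic receivers is fixed. When P is greater than 1, the distance between any two of the P ultrasonic transmitters is fixed.
In some embodiments, the electronic device 100 can keep the M microphones of the microphone array always on and collect sound wave signals in real time through them. If ultrasonic signals are present in the environment of the electronic device 100, they may also be collected by the microphones as high-frequency sound wave signals. When only audible sound wave signals are needed, each microphone can feed the collected signal into a corresponding low-pass filter that removes the ultrasonic content above 20000 Hz. The electronic device can then obtain, from the filtered signal and the pickup direction from the sound source position, the sound wave signal component in the pickup direction, and determine whether that component contains the preset wake-up word. When only ultrasonic signals are needed, each microphone can feed the collected signal into a corresponding high-pass filter that removes the content below 20000 Hz. The electronic device can then perform ultrasonic localization on the filtered signal and locate obstacle position B3.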
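Exemplarily, FIG. 4 to FIG. 7 are schematic flowcharts of sound wave signal processing in the wake-up method provided by an embodiment of the present application. As shown in FIG. 4, the M microphones of the microphone array collect sound wave signal A (assume that sound wave signal A contains no ultrasonic signal; any ultrasonic content can be removed by the low-pass filter). Because the M microphones are at different positions, the waveforms of sound wave signal A collected by different microphones may differ (only slightly, or not at all), and the time points at which different microphones collect sound wave signal A may also differ. Therefore, as shown in FIG. 4, the electronic device 100 obtains M channels of sound wave signal A through the M microphones.
After obtaining the M channels of sound wave signal A, the electronic device 100 can also locate the sound source position from them. Exemplarily, because each channel of sound wave signal A arrives at its microphone at a different time point, the electronic device can calculate the corresponding sound source position B2 from these time points with the TOA or TDOA algorithm.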
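The text above does not spell out the TDOA computation. As a minimal sketch only, assuming a far-field source, exactly two microphones, and a constant speed of sound, the arrival-time difference and the microphone spacing determine the angle of arrival. The function name `tdoa_angle` and the constant `SPEED_OF_SOUND` are illustrative and do not come from the application.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed constant)


def tdoa_angle(delta_t: float, mic_distance: float,
               c: float = SPEED_OF_SOUND) -> float:
    """Far-field angle of arrival (radians) for a two-microphone pair.

    delta_t is the arrival time at microphone 1 minus the arrival time at
    microphone 2, in seconds. The angle is measured from the axis pointing
    from microphone 1 to microphone 2. Hypothetical helper for illustration.
    """
    # Path difference divided by spacing equals cos(angle) in the far field.
    ratio = c * delta_t / mic_distance
    # Clamp to [-1, 1] so measurement noise cannot break acos().
    ratio = max(-1.0, min(1.0, ratio))
    return math.acos(ratio)
```

A zero time difference puts the source broadside to the pair (90 degrees). A real device with M microphones would combine several pairs to resolve the direction in the plane of the pickup inlets.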
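As shown in FIG. 5, after calculating sound source position B2, the electronic device 100 can take the direction from sound source position B2 as the first pickup direction. It aligns the M channels of sound wave signal A in time (that is, aligns the starting time points of the channels). After time alignment, it obtains the component of each channel in the first pickup direction, giving M channels of sound wave signal A components 501. It then fuses the M channels of components 501 into a single sound wave signal, namely sound wave signal A'.
Alternatively, as shown in FIG. 6, after calculating sound source position B2 and taking the direction from it as the first pickup direction, the electronic device 100 can first obtain the component of each of the M channels of sound wave signal A in the first pickup direction, giving M channels of components 501, and then align them in time (that is, align the starting time points of the channels). After time alignment, it fuses the M channels of components 501 into a single sound wave signal, namely sound wave signal A'.
The fusion may be a direct superposition of the M channels of components 501, a weighted average of them, or some other method. The present application does not limit this.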
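The align-then-fuse step can be sketched as follows. This is only an illustration under stated assumptions: each channel is a list of samples at the same sampling rate, the per-channel delays are already known as whole numbers of samples, and fusion is a weighted average (one of the options the text names). The function `align_and_fuse` is a hypothetical helper, not part of the application.

```python
def align_and_fuse(channels, delays, weights=None):
    """Align channels by dropping leading samples, then fuse them.

    channels: list of sample lists, one per microphone (same sample rate).
    delays:   per-channel delay in whole samples (non-negative integers);
              a channel that received the signal later has a larger delay.
    weights:  optional per-channel weights; defaults to a plain average.
    """
    if weights is None:
        weights = [1.0 / len(channels)] * len(channels)
    # Time alignment: shift every channel so the signal onsets coincide.
    shifted = [ch[d:] for ch, d in zip(channels, delays)]
    # Only the overlapping part of all channels can be fused.
    n = min(len(s) for s in shifted)
    return [sum(w * s[i] for w, s in zip(weights, shifted)) for i in range(n)]
```

Passing `weights=[1.0] * len(channels)` gives the direct superposition instead of the average.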
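As shown in FIG. 7, after obtaining sound wave signal A', the electronic device 100 can input it into a preset wake-up word model. The wake-up word model stores the acoustic features 701 of the preset wake-up word. In the wake-up word model, a preset algorithm extracts the acoustic features 702 of sound wave signal A' and compares the extracted features 702 with the features 701 of the preset wake-up word to obtain the similarity between the two (also called the confidence). The resulting similarity is similarity 1 (also called the first similarity). The acoustic features 702 and 701 can be represented by code, functions, matrices, or spectrograms; the present application does not limit this.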
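The application leaves the feature extraction and the similarity measure open. Purely as an assumed stand-in, if the acoustic features 701 and 702 are represented as numeric vectors of equal length, cosine similarity is one common way to score how close they are; the function below is illustrative and is not claimed to be the measure used by the wake-up word model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors.

    Returns 0.0 when either vector is all zeros, so that silence never
    scores as a match. Assumed metric, for illustration only.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The resulting score can be scaled to whatever range the thresholds use (for example percentages).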
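FIG. 8 is a schematic partial flowchart of a wake-up method provided by an embodiment of the present application. As shown in FIG. 8, after similarity 1 is obtained with the processing flow of FIG. 4 to FIG. 7, if similarity 1 is greater than the first threshold (for example, 90% or a score of 90), sound wave signal A detected by the electronic device 100 is close to the preset wake-up word, and the electronic device 100 can determine that sound wave signal A contains the preset wake-up word. The electronic device 100 then wakes up. Exemplarily, the electronic device 100 invokes the voice assistant, which interacts with the user by voice.
If similarity 1 is less than the second threshold (for example, 60% or a score of 60), sound wave signal A detected by the electronic device 100 differs considerably from the preset wake-up word, and the electronic device 100 can determine that sound wave signal A does not contain the preset wake-up word. The electronic device 100 then remains in the non-awake state. The second threshold is less than the first threshold. Both thresholds are adjustable and are not limited to the example values above.
If similarity 1 lies between the second threshold and the first threshold, sound wave signal A detected by the electronic device 100 may contain the preset wake-up word. The electronic device 100 can then follow S801 to S805 and use ultrasonic localization to further determine whether sound wave signal A contains the preset wake-up word, and hence whether the electronic device 100 should wake up.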
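S801. The electronic device 100 obtains an obstacle position according to the time between transmission and reception of the ultrasonic signal and the propagation speed of ultrasound in air, and optionally also according to the sound source position obtained with the TOA or TDOA algorithm.
Before S801 is described in detail, the principle of ultrasonic localization is introduced. FIG. 9 is a schematic diagram of the principle of locating an obstacle with ultrasonic signals in a wake-up method provided by an embodiment of the present application. The electronic device 100 is provided with P ultrasonic transmitters and Q ultrasonic receivers, which face different directions on the electronic device 100. That is, the P ultrasonic transmitters emit ultrasound in direction K, and the Q ultrasonic receivers do not lie in direction K of the P ultrasonic transmitters. In other words, the Q ultrasonic receivers cannot receive the ultrasonic signals emitted directly by the P ultrasonic transmitters; they can only receive those signals after they have been reflected by an obstacle. P is a positive integer greater than or equal to 1, and Q is a positive integer greater than 1.
As shown in FIG. 9, take as an example the case where the P ultrasonic transmitters are speaker 1001, speaker 1002, and speaker 1003, and the Q ultrasonic receivers are microphone 1011, microphone 1012, and microphone 1013. Speaker 1001 can emit ultrasonic signal 1 within a certain angular range. Similarly, speaker 1002 can emit ultrasonic signal 2 within a certain angular range (not shown in FIG. 9), and speaker 1003 can emit ultrasonic signal 3 within a certain angular range (not shown in FIG. 9). Ultrasonic signals 1, 2, and 3 are reflected when they meet obstacles, including the user 200. Microphones 1011, 1012, and 1013 can collect the reflected ultrasonic signals 1, 2, and 3.
Taking ultrasonic signal 1 as an example: when it meets an obstacle, such as the user 200, ultrasonic signal 1 is reflected, and the reflected signal reaches microphones 1011, 1012, and 1013 after different durations and at different time points. The electronic device 100 obtains the position of the obstacle from the time between transmission and reception of ultrasonic signal 1 and the propagation speed of ultrasound in air. Of course, the obstacle position obtained in this way also has some deviation.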
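The ranging part of this step is the usual time-of-flight relation. As a simplified sketch, assuming the transmitter and receiver are close enough together that the outbound and return paths have equal length, and assuming a constant speed of sound, the obstacle distance is half the round-trip path. `echo_distance` is a hypothetical helper name.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed constant)


def echo_distance(round_trip_time: float, c: float = SPEED_OF_SOUND) -> float:
    """Distance to a reflecting obstacle from the echo round-trip time.

    Assumes transmitter and receiver are effectively co-located, so the
    signal travels the obstacle distance twice.
    """
    return c * round_trip_time / 2.0
```

With Q receivers at known, fixed spacings, the different arrival times of the same echo additionally give the direction of the obstacle, in the same way as the audible-signal TDOA case.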
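Exemplarily, if the user 200 is the only obstacle around the electronic device 100, the obtained obstacle position can be obstacle position B3 shown in FIG. 3.
Exemplarily, if there are several obstacles around the electronic device 100, including but not limited to the user 200, the previously obtained sound source position B2 can be used to discard obstacle positions that differ greatly from sound source position B2 and keep those that lie within a certain range of it. FIG. 10 is a schematic diagram of obstacle positions located with ultrasonic signals in a wake-up method provided by an embodiment of the present application. As shown in FIG. 10, ultrasonic localization yields two obstacle positions, B3 and B5. Obstacle position B5 differs greatly from sound source position B2 and falls outside the range, so it is discarded; obstacle position B3 falls within the range of sound source position B2, so it is kept. One or more obstacle positions may be kept in this way.
Exemplarily, the obstacle positions that are kept can be recorded as localization result 1.
Exemplarily, when several obstacle positions are kept, they can be summed and averaged to obtain a single obstacle position.
Exemplarily, the range can be the set of directions that differ from the direction from sound source position B2 by no more than a certain angle. That angle can be a preset angle.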
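The screening and averaging described above can be sketched as follows, assuming positions are (x, y) coordinates in the plane of the upper surface with the fixed point O at the origin, and assuming the "certain range" is an angular tolerance in radians. All function names and the tolerance are illustrative assumptions.

```python
import math


def angular_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def filter_obstacles(obstacles, source, max_angle):
    """Keep obstacle positions whose direction from O is within max_angle
    of the direction of the sound source position (for example B2)."""
    src_dir = math.atan2(source[1], source[0])
    return [p for p in obstacles
            if angular_diff(math.atan2(p[1], p[0]), src_dir) <= max_angle]


def mean_position(points):
    """Average several kept obstacle positions into one position."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
```

For example, with a source direction along the positive X axis and a 15 degree tolerance, an obstacle at (1.0, 0.1) is kept while one at (-1.0, 0.0) is discarded.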
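Optionally, ultrasonic signal 2 can also be transmitted separately. Localization result 2 can then be obtained in the same way from the transmission and reflection of ultrasonic signal 2. Localization result 2 may include one or more obstacle positions.
Optionally, ultrasonic signal 3 can also be transmitted separately. Localization result 3 can then be obtained in the same way from the transmission and reflection of ultrasonic signal 3. Localization result 3 may include one or more obstacle positions.
The electronic device 100 can then use a preset clustering algorithm to perform cluster analysis on localization results 1, 2, and 3. Exemplarily, the clustering algorithm may be the k-means clustering algorithm, the self-organizing map (SOM) neural network clustering algorithm, or the like. Through cluster analysis, the electronic device 100 can merge obstacle positions that are highly similar across the localization results into one obstacle position (for example, obstacle position B3 in FIG. 3). The electronic device 100 can then take the merged obstacle position as the obstacle position to use.
Optionally, the electronic device 100 can also use only the single obstacle position determined with ultrasonic signal 1.
It should be noted that the sound wave signals collected by each microphone may include both ultrasonic signals and sound wave signals audible to the human ear. In S801, each microphone can first feed the collected sound wave signal into the corresponding high-pass filter, which removes the content below 20000 Hz and leaves the ultrasonic signal. The electronic device 100 can then determine an obstacle position as described above.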
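S802. The electronic device 100 takes the direction from the obstacle position as the second pickup direction and obtains the component of sound wave signal A in the second pickup direction, namely sound wave signal A".
S802 is further explained with reference to FIG. 11, a schematic flowchart of sound wave signal processing in a wake-up method provided by an embodiment of the present application. As shown in FIG. 11, the electronic device 100 takes the direction from the obstacle position as the second pickup direction, aligns the M channels of sound wave signal A in time, and then extracts the component of each channel in the second pickup direction, obtaining M channels of sound wave signal A components 1101. The electronic device 100 then fuses the M channels of components 1101 to obtain sound wave signal A". Alternatively, as in FIG. 6, the components in the second pickup direction can be extracted first, then aligned in time and fused; the detailed flow is not repeated.
S803. The electronic device 100 inputs sound wave signal A" into the preset wake-up word model and calculates similarity 2 (also called the second similarity) between sound wave signal A" and the wake-up word.
Similarity 2 is calculated in the same way as similarity 1; the details are not repeated here.
If similarity 2 is greater than the third threshold, sound wave signal A", the component of sound wave signal A in the second pickup direction, is close to the preset wake-up word, and the electronic device 100 can determine that sound wave signal A contains the preset wake-up word. The electronic device 100 then performs step S804. The third threshold may be greater than or less than the first threshold; for example, it may be 95% (or a score of 95), or 80% (or a score of 80).
If similarity 2 is less than or equal to the third threshold, sound wave signal A", the component of sound wave signal A in the second pickup direction, differs considerably from the preset wake-up word, and the electronic device 100 can determine that sound wave signal A does not contain the preset wake-up word. The electronic device 100 then performs step S805.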
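S804. The electronic device 100 wakes up.
Exemplarily, the electronic device 100 can invoke the voice assistant or activate the functions of the voice assistant.
S805. The electronic device 100 remains in the non-awake state.
Exemplarily, the electronic device 100 remains in the non-awake state (for example, a standby state).
As can be seen, the wake-up method provided by the present application can be divided into two stages. In the first stage, the electronic device follows the voice wake-up procedure and determines similarity 1 between the detected sound wave signal and the preset wake-up word. If similarity 1 is greater than the first threshold, the electronic device wakes up; if similarity 1 is less than the second threshold, the electronic device remains in the non-awake state; if similarity 1 lies between the second threshold and the first threshold, the method enters the second stage.
In the second stage, the electronic device uses ultrasonic signals to locate obstacle positions, screens and processes them using the sound source position identified in the first stage, and finally obtains one obstacle position. The electronic device calculates similarity 2 between the sound wave signal component in the direction from that obstacle position and the preset wake-up word. If similarity 2 satisfies the corresponding threshold condition, the electronic device wakes up; otherwise, it remains in the non-awake state.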
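The two-stage decision summarized above can be sketched as a small function. The threshold defaults are the example values mentioned in the text expressed as fractions; `second_stage` is a hypothetical callable standing in for S801 to S803 (ultrasonic localization followed by computing similarity 2), and the case where similarity 1 exactly equals the first threshold, which the text leaves open, is routed to the second stage here as an assumption.

```python
def wake_decision(similarity_1, second_stage,
                  first_threshold=0.90, second_threshold=0.60,
                  third_threshold=0.80):
    """Return True if the device should wake up.

    similarity_1:  stage-one similarity for the first pickup direction.
    second_stage:  zero-argument callable that runs ultrasonic localization
                   and returns similarity 2; only called when needed.
    """
    if similarity_1 > first_threshold:
        return True          # clearly the wake-up word: wake immediately
    if similarity_1 < second_threshold:
        return False         # clearly not the wake-up word: stay asleep
    # Ambiguous band: spend the extra effort on the ultrasonic second stage.
    return second_stage() > third_threshold
```

Because `second_stage` is only invoked in the ambiguous band, the ultrasonic transmission is skipped whenever stage one is already conclusive.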
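It should be noted that those skilled in the art can set the third threshold according to practical experience or the actual application scenario. Exemplarily, when the first threshold is set high, the third threshold can be set lower than the first threshold. For example, the first threshold can be set to 95 and the third threshold to a value below 95 (for example, 70 or 80). That is, the electronic device determines in the first stage that the sound wave signal contains the preset wake-up word only when similarity 1 is greater than the first threshold (for example, 95). Otherwise, provided similarity 1 is not below the second threshold, the method enters the second stage, in which the electronic device uses ultrasonic localization and calculates similarity 2. When similarity 2 is greater than the third threshold (for example, 70 or 80), the electronic device wakes up.
Alternatively, when the first threshold is set low, the third threshold can be set higher than the first threshold. For example, the first threshold can be set to 75 and the third threshold to a value above 75 (for example, 85 or 95). That is, when the first stage finds that similarity 1 is less than the first threshold (for example, 75) but not below the second threshold, the method enters the second stage, in which the electronic device uses ultrasonic localization and calculates similarity 2. When similarity 2 is greater than the third threshold (for example, 85 or 95), the electronic device wakes up.
In addition, the present application provides an embodiment of another wake-up method. This other wake-up method has the same first stage as the wake-up method described above but a different second stage. The first stage is not described again here.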
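The second stage of the other wake-up method is described with reference to FIG. 12, a schematic partial flowchart of another wake-up method provided by an embodiment of the present application. As shown in FIG. 12, after similarity 1 is obtained with the processing flow of FIG. 4 to FIG. 7, the second pickup direction is determined by ultrasonic localization, a third pickup direction is determined from the first and second pickup directions, the component of sound wave signal A in the third pickup direction, namely sound wave signal A''', is obtained, and whether sound wave signal A''' contains the preset wake-up word determines whether the electronic device 100 wakes up.
Specifically, after similarity 1 is obtained with the processing flow of FIG. 4 to FIG. 7, if similarity 1 is greater than the first threshold (for example, 90% or a score of 90), sound wave signal A detected by the electronic device 100 is close to the preset wake-up word, and the electronic device 100 can determine that sound wave signal A contains the preset wake-up word. The electronic device 100 then wakes up. Exemplarily, the electronic device 100 invokes the voice assistant, which interacts with the user by voice.
If similarity 1 is less than the second threshold (for example, 60% or a score of 60), sound wave signal A detected by the electronic device 100 differs considerably from the preset wake-up word, and the electronic device 100 can determine that sound wave signal A does not contain the preset wake-up word. The electronic device 100 then remains in the non-awake state. The second threshold is less than the first threshold. Both thresholds are adjustable and are not limited to the example values above.
If similarity 1 lies between the second threshold and the first threshold, sound wave signal A detected by the electronic device 100 may contain the preset wake-up word. The electronic device 100 can then follow S1201 to S1206 and use ultrasonic localization to further determine whether sound wave signal A contains the preset wake-up word, and hence whether the electronic device 100 should wake up. Specifically, the partial flow of the other wake-up method includes: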
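S1201. The electronic device 100 obtains an obstacle position according to the time between transmission and reception of the ultrasonic signal and the propagation speed of ultrasound in air, and optionally also according to the sound source position obtained with the TOA or TDOA algorithm.
For S1201, refer to the description of S801; it is not repeated here.
S1202. The electronic device 100 takes the direction from the obstacle position as the second pickup direction and determines a third pickup direction from the second pickup direction and the first pickup direction.
Because both sound source position B2 and obstacle position B3 may contain errors, the electronic device can combine them to re-determine the user's sound source position more accurately.
S1202 is further explained with reference to FIG. 13, a schematic diagram of sound source positions in another wake-up method provided by an embodiment of the present application. Exemplarily, as shown in FIG. 13, the upper surface of the electronic device 100 is the XY plane, and the center point of the upper surface is point O. The X axis and the Y axis are two mutually perpendicular coordinate axes through point O. This XY coordinate system is the same as that of FIG. 1B and FIG. 1C. Sound source position B2 is the sound source position located by the electronic device 100 in the first stage with the TOA or TDOA algorithm; obstacle position B3 is the single obstacle position finally located by the electronic device with ultrasonic signals according to S1201; and sound source position B4 is assumed to be the finally calculated sound source position, which is closer to the user. Connecting B2, B3, and B4 to point O gives the angle α between segment B2O and the X axis, the angle β between segment B3O and the X axis, and the angle γ between segment B4O and the X axis. α reflects the direction of sound source position B2 relative to the electronic device; β reflects the direction of obstacle position B3 relative to the electronic device; and γ reflects the direction of sound source position B4 relative to the electronic device. α and β can both be calculated as described above, whereas γ is unknown. Let Δ = |α − β| be the absolute value of the difference between α and β. The electronic device 100 can then calculate γ according to formula (1) below.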
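$$\gamma=\begin{cases}\alpha, & \Delta<\theta_1 \ \text{or}\ \Delta>\theta_2\\ \alpha\pm k\,\Delta, & \theta_1\le\Delta\le\theta_2\end{cases}\qquad(1)$$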
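Here k is a preset proportionality coefficient with 0 ≤ k ≤ 1; θ1 is preset value 1 (for example, 5°) and θ2 is preset value 2 (for example, 10°). When α is less than β, the ± in formula (1) is +; when α is greater than β, the ± in formula (1) is −.
That is, when Δ is small or large, the error of obstacle position B3 obtained by ultrasonic localization and calculation may be large, so the electronic device 100 can take sound source position B2 as the final sound source position of the user; the angle γ between the determined sound source position B4 and the X axis is then α.
When Δ is within the preset range (the interval defined by θ1 and θ2), the electronic device can adjust the weight of Δ with the proportionality coefficient k and thereby determine γ, that is, the direction of sound source position B4 relative to the electronic device 100. In other words, starting from the initially located sound source position B2, the electronic device can use obstacle position B3 obtained by ultrasonic localization to correct sound source position B2 and obtain sound source position B4, which is closer to the user's position. In this way, when the located sound source position B2 deviates considerably from the user's position because of noise or other factors, the electronic device 100 can obtain sound source position B4, which is closer to the user's position.
Optionally, θ1 may also be negative.
It should be noted that formula (1) is only an illustrative example; γ can also be calculated with other formulas.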
当然,本领域技术人员还可以按照上述原理,设置其他坐标系(例如三维坐标系),本申请对此不做限制。
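The correction of α by β described for S1202 can be sketched in a few lines. The function name `third_pickup_angle` and the default values are hypothetical. The sketch assumes that angles are given in degrees, that the boundaries △ = θ1 and △ = θ2 fall back to α (the text does not state how equality is handled), and that α and β do not straddle the 0°/360° wrap-around.

```python
def third_pickup_angle(alpha_deg, beta_deg, k=0.5, theta1_deg=5.0, theta2_deg=10.0):
    """Combine the acoustic angle alpha with the ultrasonic angle beta into gamma."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must satisfy 0 <= k <= 1")
    delta = abs(alpha_deg - beta_deg)
    if theta1_deg < delta < theta2_deg:
        # Move alpha toward beta by the fraction k of the deviation.
        sign = 1.0 if alpha_deg < beta_deg else -1.0
        return alpha_deg + sign * k * delta
    # Deviation too small or too large: keep the acoustic estimate alpha.
    return alpha_deg


corrected = third_pickup_angle(30.0, 38.0)   # deviation of 8 degrees is in range
unchanged = third_pickup_angle(30.0, 50.0)   # deviation of 20 degrees is out of range
```

With k = 0 the result always equals α, and with k = 1 an in-range deviation moves γ all the way to β, which matches the stated constraint 0≤k≤1.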
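S1203: Electronic device 100 obtains the component of sound wave signal A in the third pickup direction, namely sound wave signal A'''.
Similar to step S802 in the foregoing embodiment, electronic device 100 may take the direction indicated by γ as the third pickup direction. FIG. 14 is a schematic diagram of sound wave signal processing in the wake-up method provided by an embodiment of this application. As shown in FIG. 14, electronic device 100 may extract the components 1401 of the M channels of sound wave signals in the third pickup direction and fuse the extracted M components 1401 into sound wave signal A''', thereby obtaining the sound wave signal in the third pickup direction.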
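The application does not state how the M channels are extracted and fused. One common way to emphasize a given direction is a delay-and-sum beamformer, sketched below purely as an assumption. The sketch assumes a far-field source, microphone coordinates in metres in the XY plane described above, and delays rounded to whole samples; all names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed speed of sound in air


def steering_delays(mic_positions_m, angle_deg, sample_rate_hz,
                    speed=SPEED_OF_SOUND_M_PER_S):
    """Whole-sample delays that align a plane wave arriving from angle_deg."""
    ux = math.cos(math.radians(angle_deg))
    uy = math.sin(math.radians(angle_deg))
    # A microphone further along the arrival direction hears the wave earlier.
    advances_s = [(x * ux + y * uy) / speed for x, y in mic_positions_m]
    earliest = min(advances_s)
    return [round((a - earliest) * sample_rate_hz) for a in advances_s]


def delay_and_sum(channels, delays):
    """Delay each channel by its sample count, then average the channels."""
    length = len(channels[0])
    fused = [0.0] * length
    for samples, delay in zip(channels, delays):
        for i in range(delay, length):
            fused[i] += samples[i - delay]
    return [value / len(channels) for value in fused]


# Two microphones 0.343 m apart on the X axis, source on the positive X axis.
delays = steering_delays([(0.0, 0.0), (0.343, 0.0)], 0.0, 1000)
fused = delay_and_sum([[0, 0, 1, 0, 0, 0], [0, 1, 0, 0, 0, 0]], delays)
```

In the example the second microphone is closer to the source and receives the pulse one sample earlier, so it is delayed by one sample before the two channels are averaged.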
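S1204: Electronic device 100 may input sound wave signal A''' into the preset wake-up word model and calculate similarity 3 (also called the third similarity) between sound wave signal A''' and the wake-up word.
Similar to step S803 in the foregoing embodiment, electronic device 100 obtains the similarity between sound wave signal A''' and the preset wake-up word, namely similarity 3 (also called the third similarity). The fusion and subsequent processing are similar to those for the first pickup direction and the second pickup direction; refer to the foregoing description, which is not repeated here.
If similarity 3 is greater than the third threshold, sound wave signal A''', the component of sound wave signal A in the third pickup direction, is close to the preset wake-up word, and electronic device 100 may determine that sound wave signal A contains the preset wake-up word. In this case, electronic device 100 may perform step S1205. The third threshold may be greater than or less than the first threshold. For example, the third threshold may be 95% or 95 points, or 80% or 80 points.
If similarity 3 is less than the third threshold, sound wave signal A''', the component of sound wave signal A in the third pickup direction, differs considerably from the preset wake-up word, and electronic device 100 may determine that sound wave signal A does not contain the preset wake-up word. In this case, electronic device 100 may perform step S1206.
S1205: Electronic device 100 wakes up.
For example, electronic device 100 may invoke the voice assistant or activate the functions of the voice assistant.
S1206: Electronic device 100 remains in the non-woken state.
For example, electronic device 100 remains in the non-woken state (such as a standby state).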
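The decision flow from similarity 1 through S1205 and S1206 can be summarized in a short sketch. The names `WakeDecision`, `decide_wake`, and `similarity_3_fn` are hypothetical; the callback stands in for S1201 to S1204. Treating a similarity 3 that exactly equals the third threshold as "stay asleep" follows the dependent claims, and the handling of similarity 1 exactly equal to the first threshold is an assumption.

```python
from enum import Enum


class WakeDecision(Enum):
    WAKE = "wake"                # corresponds to S1205
    STAY_ASLEEP = "stay_asleep"  # corresponds to S1206


def decide_wake(similarity_1, first_threshold, second_threshold,
                third_threshold, similarity_3_fn):
    """Three-way decision on similarity 1, with an ultrasonic second stage."""
    if second_threshold >= first_threshold:
        raise ValueError("the second threshold must be below the first threshold")
    if similarity_1 > first_threshold:
        return WakeDecision.WAKE
    if similarity_1 < second_threshold:
        return WakeDecision.STAY_ASLEEP
    # Inconclusive first stage: run S1201 to S1204 to obtain similarity 3.
    similarity_3 = similarity_3_fn()
    if similarity_3 > third_threshold:
        return WakeDecision.WAKE
    return WakeDecision.STAY_ASLEEP


result = decide_wake(75, 90, 60, 80, lambda: 85)
```

Because the second stage is a callback, the ultrasonic transmitters are only exercised for the inconclusive band between the two thresholds.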
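Because the third pickup direction is closer to the direction indicated by the user's sound source position, similarity 3, which electronic device 100 calculates by inputting sound wave signal A''' into the wake-up word model, is more accurate. The wake-up accuracy of electronic device 100 is therefore higher, and its probability of false wake-up is lower. The inventors have verified experimentally that, in noisy scenarios, electronic device 100 using the wake-up method provided by the embodiments of this application achieves higher wake-up accuracy and a lower probability of false wake-up.
In some other embodiments, when electronic device 100 detects whether sound wave signal A contains the wake-up word according to the above method, it may obtain the first pickup direction and the second pickup direction, obtain the third pickup direction from the first pickup direction and the second pickup direction, and determine whether to wake up based on the similarity between the component of sound wave signal A in the third pickup direction and the wake-up word. When this similarity is greater than a preset threshold, the electronic device wakes up. Otherwise, the electronic device remains in the non-woken state. Here, the similarity between the component of sound wave signal A in the third pickup direction and the wake-up word is obtained by inputting that component into the wake-up word model: it is the similarity between the sound wave features extracted according to a preset algorithm and the sound wave features corresponding to the preset wake-up word. For details, refer to the description above.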
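The application leaves the feature extraction and the similarity measure to the preset wake-up word model. Purely as an assumed stand-in, the sketch below compares two already-extracted feature vectors with cosine similarity; the function name and the choice of measure are not taken from the application.

```python
import math


def cosine_similarity(features_a, features_b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(features_a) != len(features_b) or not features_a:
        raise ValueError("feature vectors must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(features_a, features_b))
    norm_a = math.sqrt(sum(a * a for a in features_a))
    norm_b = math.sqrt(sum(b * b for b in features_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors for the signal component and the wake-up word.
score = cosine_similarity([0.2, 0.8, 0.1], [0.25, 0.75, 0.05])
```

A score produced this way lies between -1 and 1, so it would need rescaling before being compared with thresholds expressed as percentages or points.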
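In some other embodiments, when electronic device 100 detects whether sound wave signal A contains the wake-up word according to the above method, it may obtain similarity 1 between sound wave signal A' (the component of sound wave signal A in the first pickup direction) and the wake-up word, similarity 2 between sound wave signal A'' (the component of sound wave signal A in the second pickup direction) and the wake-up word, and similarity 3 between sound wave signal A''' (the component of sound wave signal A in the third pickup direction) and the wake-up word.
That is, when detecting whether sound wave signal A contains the wake-up word, electronic device 100 may obtain all three of similarity 1 to similarity 3. Suppose similarity 3 is the highest of the three; this indicates that the third pickup direction, which corresponds to similarity 3, is closest to the direction indicated by the user's sound source position. Electronic device 100 may then use the sound wave signal corresponding to similarity 3 as the basis for recognizing the wake-up word this time and compare similarity 3 with a preset threshold. If similarity 3 is greater than the preset threshold, the electronic device wakes up; afterwards, it uses the third pickup direction to extract the sound wave signals it subsequently detects in this session and executes further voice commands.
In some other embodiments, when electronic device 100 detects whether sound wave signal A contains the wake-up word according to the above method, it may obtain similarity 1 between sound wave signal A' (the component of sound wave signal A in the first pickup direction) and the wake-up word, and similarity 2 between sound wave signal A'' (the component of sound wave signal A in the second pickup direction) and the wake-up word.
That is, when detecting whether sound wave signal A contains the wake-up word, electronic device 100 may obtain both similarity 1 and similarity 2. Suppose similarity 2 is the higher of the two; this indicates that the second pickup direction, which corresponds to similarity 2, is closer to the direction indicated by the user's sound source position. Electronic device 100 may then use the sound wave signal corresponding to similarity 2 as the basis for recognizing the wake-up word this time and compare similarity 2 with a preset threshold. If similarity 2 is greater than the preset threshold, the electronic device wakes up; afterwards, it uses the second pickup direction to extract the sound wave signals it subsequently detects in this session and executes further voice commands.
In some other embodiments, after electronic device 100 calculates similarity 1 according to the above method, if similarity 1 is between the first threshold and the second threshold, electronic device 100 may calculate similarity 2 and/or similarity 3 according to the above method. That is, when detecting whether sound wave signal A contains the wake-up word, electronic device 100 may additionally obtain at least one of similarity 2 and similarity 3 once similarity 1 satisfies a certain threshold condition. Electronic device 100 may then determine the maximum of all similarities obtained (for example, similarity 1, similarity 2, and similarity 3). For example, if similarity 1 is the maximum, sound wave signal A' detected by electronic device 100 is closer to the preset wake-up word, and the first pickup direction is closer to the direction indicated by the user's sound source position. Electronic device 100 may then use sound wave signal A' as the basis for recognizing the wake-up word this time and determine whether to wake up based on similarity 1 between sound wave signal A' and the wake-up word.
Continuing the example in which similarity 1 is the maximum of similarity 1, similarity 2, and similarity 3: after waking up, electronic device 100 may continue to detect sound wave signals in the first pickup direction, which corresponds to similarity 1, in order to recognize the user's voice commands. For example, after waking up, electronic device 100 may use the voice assistant to collect further sound wave signals. The electronic device may then obtain the component of such a sound wave signal in the first pickup direction, and recognize and execute the corresponding voice command based on that component. In this scenario, because the first pickup direction is closer to the direction indicated by the user's sound source position than the second or third pickup direction, the sound wave signal in the first pickup direction reproduces the user's actual speech more faithfully. The accuracy of the subsequent speech recognition performed by electronic device 100 in this session improves accordingly. Taking a noisy scenario as an example, the speech recognition accuracy of electronic device 100 using the above method can improve by 4% or more.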
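Selecting the pickup direction with the highest similarity, and reusing that direction after wake-up, can be sketched as follows. The dictionary keys and the function name `wake_and_select` are hypothetical, and the sketch works with whichever of similarity 1 to similarity 3 happen to be available.

```python
def wake_and_select(similarities, threshold):
    """Pick the direction with the highest similarity and decide whether to wake.

    similarities maps a direction label to its similarity score.
    Returns (should_wake, best_direction_label).
    """
    if not similarities:
        raise ValueError("at least one similarity is required")
    best_direction, best_score = max(similarities.items(), key=lambda item: item[1])
    # The best direction is also the one to keep using for later voice commands.
    return best_score > threshold, best_direction


decision = wake_and_select({"first": 82.0, "second": 74.0, "third": 91.0}, 80.0)
```

The returned label is what the device would use to extract subsequent sound wave signals in the same session.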
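It should be noted that the electronic device performing the wake-up method in the foregoing embodiments is an electronic device with an ultrasonic positioning function; the embodiments of this application impose no further limitation on it.
FIG. 15 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of this application. The electronic device may specifically include: multiple ultrasonic transmitters 1501 (which may specifically be speakers); multiple ultrasonic receivers 1502 (which may specifically be microphones); one or more processors 1503; a memory 1504; one or more application programs (not shown); and one or more computer programs 1505. These components may be connected through one or more communication buses 1506. The one or more computer programs 1505 are stored in the memory 1504 and configured to be executed by the one or more processors 1503; they include instructions that may be used to perform the steps performed by the electronic device in the foregoing embodiments. Of course, the electronic device may also include input devices such as a touchscreen (which may, for example, include a touch sensor and a display) and a mouse.
It should be noted that the hardware structure shown in FIG. 15 is only an example and is not intended to limit the scope of this application. The electronic device provided by this application may also have other hardware structures.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, only the above division into functional modules is used as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or some of the functions described above. For the specific working processes of the systems, apparatuses, and units described above, refer to the corresponding processes in the foregoing method embodiments, which are not repeated here.
The functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence or in the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes various media that can store program code, such as a flash memory, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The foregoing is only a specific implementation of the embodiments of this application, but the protection scope of the embodiments of this application is not limited thereto. Any change or replacement within the technical scope disclosed in the embodiments of this application shall fall within the protection scope of the embodiments of this application. Therefore, the protection scope of the embodiments of this application shall be subject to the protection scope of the claims.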

Claims (34)

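  1. An electronic device in a non-woken state, wherein the electronic device comprises:
    a processor;
    a memory;
    M microphones, each microphone corresponding to one sound pickup inlet, wherein the M sound pickup inlets of the M microphones are located on a first surface of the electronic device, the first surface lies in one plane, the distance between any two of the M microphones is fixed, and M is a positive integer greater than 1;
    P ultrasonic transmitters, each ultrasonic transmitter corresponding to one ultrasonic emission port, wherein the P ultrasonic emission ports of the P ultrasonic transmitters are located on a second surface, P is a positive integer greater than or equal to 1, and the second surface is different from the first surface;
    Q ultrasonic receivers, each ultrasonic receiver corresponding to one ultrasonic receiving port, wherein the Q ultrasonic receiving ports of the Q ultrasonic receivers are located on a third surface of the electronic device, the third surface lies in one plane, the distance between any two of the Q ultrasonic receivers is fixed, Q is a positive integer greater than 1, the Q ultrasonic receiving ports and the P ultrasonic emission ports face different directions, and the third surface is different from the first surface;
    and a computer program, wherein the computer program is stored in the memory and, when executed by the processor, causes the electronic device to perform the following steps:
    detecting a first sound wave signal through the M microphones;
    in response to the first sound wave signal, obtaining a first pickup direction based on the difference between the arrival times of the first sound wave signal at at least two of the M microphones and on the distance between some or all of the at least two microphones, wherein the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of the first surface, relative to a fixed point on the plane of the first surface, and the fixed point is different from the first projection point;
    obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction;
    after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold,
    transmitting a second sound wave signal through the P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal;
    receiving the second sound wave signal through the Q ultrasonic receivers;
    in response to the second sound wave signal, obtaining a second pickup direction based on the difference between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and on the distance between some or all of the at least two ultrasonic receivers, wherein the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, and the fixed point is different from the second projection point;
    obtaining a second sound wave signal component of the second sound wave signal in the second pickup direction;
    after the similarity between the second sound wave signal component and the preset wake-up word is greater than a preset third threshold, the electronic device wakes up.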
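  2. The electronic device according to claim 1, wherein the electronic device further performs:
    after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the electronic device wakes up.
  3. The electronic device according to claim 1 or 2, wherein the electronic device further performs:
    after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the electronic device remains in the non-woken state.
  4. The electronic device according to any one of claims 1 to 3, wherein the electronic device further performs:
    after the similarity between the second sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the electronic device remains in the non-woken state.
  5. The electronic device according to any one of claims 1 to 4, wherein the Q ultrasonic receivers are some or all of the M microphones; Q is less than or equal to M; the ultrasonic receiving port is the sound pickup inlet; and the third surface is the same as the first surface.
  6. The electronic device according to any one of claims 1 to 4, wherein the Q ultrasonic receivers are different from some or all of the M microphones.
  7. The electronic device according to any one of claims 1 to 6, wherein the electronic device further comprises:
    N speakers, wherein the N sound wave emission ports of the N speakers are located on a fourth surface, N is a positive integer greater than or equal to 1, and the fourth surface is different from the first surface.
  8. The electronic device according to claim 7, wherein
    the P ultrasonic transmitters are some or all of the N speakers; P is less than or equal to N; the ultrasonic emission port is the sound wave emission port; and the fourth surface is the same as the second surface.
  9. The electronic device according to claim 7, wherein the P ultrasonic transmitters are different from some or all of the N speakers.
  10. The electronic device according to any one of claims 1 to 9, wherein the second surface is parallel to the first surface.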
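  11. An electronic device in a non-woken state, wherein the electronic device comprises:
    a processor;
    a memory;
    M microphones, each microphone corresponding to one sound pickup inlet, wherein the M sound pickup inlets of the M microphones are located on a first surface of the electronic device, the first surface lies in one plane, the distance between any two of the M microphones is fixed, and M is a positive integer greater than 1;
    P ultrasonic transmitters, wherein the P ultrasonic emission ports of the P ultrasonic transmitters are located on a second surface, P is a positive integer greater than or equal to 1, and the second surface is different from the first surface;
    Q ultrasonic receivers, each ultrasonic receiver corresponding to one ultrasonic receiving port, wherein the Q ultrasonic receiving ports of the Q ultrasonic receivers are located on a third surface of the electronic device, the third surface lies in one plane, the distance between any two of the Q ultrasonic receivers is fixed, Q is a positive integer greater than 1, the Q ultrasonic receiving ports and the P ultrasonic emission ports face different directions, and the third surface is different from the first surface;
    and a computer program, wherein the computer program is stored in the memory and, when executed by the processor, causes the electronic device to perform the following steps:
    detecting a first sound wave signal through the M microphones;
    in response to the first sound wave signal, obtaining a first pickup direction based on the difference between the arrival times of the first sound wave signal at at least two of the M microphones and on the distance between some or all of the at least two microphones, wherein the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of the first surface, relative to a fixed point on the plane of the first surface, and the fixed point is different from the first projection point;
    obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction;
    after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold,
    transmitting a second sound wave signal through the P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal;
    receiving the second sound wave signal through the Q ultrasonic receivers;
    in response to the second sound wave signal, obtaining a second pickup direction based on the difference between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and on the distance between some or all of the at least two ultrasonic receivers, wherein the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, and the fixed point is different from the second projection point;
    determining a third pickup direction from the first pickup direction and the second pickup direction, wherein the third pickup direction indicates the direction of a third projection point, which is the projection of a third sound source position onto the plane of the first surface, relative to the fixed point;
    obtaining a third sound wave signal component of the first sound wave signal in the third pickup direction;
    after the similarity between the third sound wave signal component and the preset wake-up word is greater than a preset third threshold, the electronic device wakes up.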
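  12. The electronic device according to claim 11, wherein the electronic device further performs:
    after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the electronic device wakes up.
  13. The electronic device according to claim 11 or 12, wherein the electronic device further performs:
    after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the electronic device remains in the non-woken state.
  14. The electronic device according to any one of claims 11 to 13, wherein the electronic device further performs:
    after the similarity between the third sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the electronic device remains in the non-woken state.
  15. The electronic device according to any one of claims 11 to 14, wherein the determining a third pickup direction from the first pickup direction and the second pickup direction comprises:
    after the absolute value of the direction deviation between the first pickup direction and the second pickup direction is less than a preset fourth threshold, or the absolute value of the direction deviation between the first pickup direction and the second pickup direction is greater than a preset fifth threshold, the third pickup direction is the same as the first pickup direction.
  16. The electronic device according to any one of claims 11 to 15, wherein the determining a third pickup direction from the first pickup direction and the second pickup direction comprises:
    after the absolute value of the direction deviation between the first pickup direction and the second pickup direction is greater than the preset fourth threshold and less than the fifth threshold, the third pickup direction is the first pickup direction superimposed with the product of the absolute value of the direction deviation between the first pickup direction and the second pickup direction and a preset proportionality coefficient.
  17. The electronic device according to any one of claims 11 to 16, wherein the Q ultrasonic receivers are some or all of the M microphones; Q is less than or equal to M; the ultrasonic receiving port is the sound pickup inlet; and the third surface is the same as the first surface.
  18. The electronic device according to any one of claims 11 to 17, wherein the Q ultrasonic receivers are different from some or all of the M microphones.
  19. The electronic device according to any one of claims 11 to 18, wherein the electronic device further comprises:
    N speakers, wherein the N sound wave emission ports of the N speakers are located on a fourth surface, N is a positive integer greater than or equal to 1, and the fourth surface is different from the first surface.
  20. The electronic device according to claim 19, wherein
    the P ultrasonic transmitters are some or all of the N speakers; P is less than or equal to N; the ultrasonic emission port is the sound wave emission port; and the fourth surface is the same as the second surface.
  21. The electronic device according to claim 19, wherein the P ultrasonic transmitters are different from some or all of the N speakers.
  22. The electronic device according to any one of claims 11 to 21, wherein the second surface is parallel to the first surface.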
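  23. A wake-up method, applied to an electronic device, wherein the electronic device is in a non-woken state and comprises: a processor; a memory; M microphones, each microphone corresponding to one sound pickup inlet, wherein the M sound pickup inlets of the M microphones are located on a first surface of the electronic device, the first surface lies in one plane, the distance between any two of the M microphones is fixed, and M is a positive integer greater than 1; P ultrasonic transmitters, wherein the P ultrasonic emission ports of the P ultrasonic transmitters are located on a second surface, P is a positive integer greater than or equal to 1, and the second surface is different from the first surface; and Q ultrasonic receivers, each ultrasonic receiver corresponding to one ultrasonic receiving port, wherein the Q ultrasonic receiving ports of the Q ultrasonic receivers are located on a third surface of the electronic device, the third surface lies in one plane, the distance between any two of the Q ultrasonic receivers is fixed, Q is a positive integer greater than 1, the Q ultrasonic receiving ports and the P ultrasonic emission ports face different directions, and the third surface is different from the first surface; the method comprises:
    detecting a first sound wave signal through the M microphones;
    in response to the first sound wave signal, obtaining a first pickup direction based on the difference between the arrival times of the first sound wave signal at at least two of the M microphones and on the distance between some or all of the at least two microphones, wherein the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of the first surface, relative to a fixed point on the plane of the first surface, and the fixed point is different from the first projection point;
    obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction;
    after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold,
    transmitting a second sound wave signal through the P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal;
    receiving the second sound wave signal through the Q ultrasonic receivers;
    in response to the second sound wave signal, obtaining a second pickup direction based on the difference between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and on the distance between some or all of the at least two ultrasonic receivers, wherein the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, and the fixed point is different from the second projection point;
    obtaining a second sound wave signal component of the second sound wave signal in the second pickup direction;
    after the similarity between the second sound wave signal component and the preset wake-up word is greater than a preset third threshold, the electronic device wakes up.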
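  24. The method according to claim 23, wherein the method further comprises:
    after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the electronic device wakes up.
  25. The method according to claim 23 or 24, wherein the method further comprises:
    after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the electronic device remains in the non-woken state.
  26. The method according to any one of claims 23 to 25, wherein the method further comprises:
    after the similarity between the second sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the electronic device remains in the non-woken state.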
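  27. A wake-up method, applied to an electronic device, wherein the electronic device is in a non-woken state and comprises: a processor; a memory; M microphones, each microphone corresponding to one sound pickup inlet, wherein the M sound pickup inlets of the M microphones are located on a first surface of the electronic device, the first surface lies in one plane, the distance between any two of the M microphones is fixed, and M is a positive integer greater than 1; P ultrasonic transmitters, wherein the P ultrasonic emission ports of the P ultrasonic transmitters are located on a second surface, P is a positive integer greater than or equal to 1, and the second surface is different from the first surface; and Q ultrasonic receivers, each ultrasonic receiver corresponding to one ultrasonic receiving port, wherein the Q ultrasonic receiving ports of the Q ultrasonic receivers are located on a third surface of the electronic device, the third surface lies in one plane, the distance between any two of the Q ultrasonic receivers is fixed, Q is a positive integer greater than 1, the Q ultrasonic receiving ports and the P ultrasonic emission ports face different directions, and the third surface is different from the first surface; the method comprises:
    detecting a first sound wave signal through the M microphones;
    in response to the first sound wave signal, obtaining a first pickup direction based on the difference between the arrival times of the first sound wave signal at at least two of the M microphones and on the distance between some or all of the at least two microphones, wherein the first pickup direction indicates the direction of a first projection point, which is the projection of a first sound source position onto the plane of the first surface, relative to a fixed point on the plane of the first surface, and the fixed point is different from the first projection point;
    obtaining a first sound wave signal component of the first sound wave signal in the first pickup direction;
    after the similarity between the first sound wave signal component and a preset wake-up word is less than a preset first threshold and greater than or equal to a preset second threshold,
    transmitting a second sound wave signal through the P ultrasonic transmitters, the second sound wave signal being an ultrasonic signal;
    receiving the second sound wave signal through the Q ultrasonic receivers;
    in response to the second sound wave signal, obtaining a second pickup direction based on the difference between the arrival times of the second sound wave signal at at least two of the Q ultrasonic receivers and on the distance between some or all of the at least two ultrasonic receivers, wherein the second pickup direction indicates the direction of a second projection point, which is the projection of a second sound source position onto the plane of the first surface, relative to the fixed point, and the fixed point is different from the second projection point;
    determining a third pickup direction from the first pickup direction and the second pickup direction, wherein the third pickup direction indicates the direction of a third projection point, which is the projection of a third sound source position onto the plane of the first surface, relative to the fixed point;
    obtaining a third sound wave signal component of the first sound wave signal in the third pickup direction;
    after the similarity between the third sound wave signal component and the preset wake-up word is greater than a preset third threshold, the electronic device wakes up.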
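  28. The method according to claim 27, wherein the method further comprises:
    after the similarity between the first sound wave signal component and the preset wake-up word is greater than the first threshold, the electronic device wakes up.
  29. The method according to claim 27 or 28, wherein the method further comprises:
    after the similarity between the first sound wave signal component and the preset wake-up word is less than the second threshold, the electronic device remains in the non-woken state.
  30. The method according to any one of claims 27 to 29, wherein the method further comprises:
    after the similarity between the third sound wave signal component and the preset wake-up word is less than or equal to the third threshold, the electronic device remains in the non-woken state.
  31. The method according to any one of claims 27 to 30, wherein the determining a third pickup direction from the first pickup direction and the second pickup direction comprises:
    after the absolute value of the direction deviation between the first pickup direction and the second pickup direction is less than a preset fourth threshold, or the absolute value of the direction deviation between the first pickup direction and the second pickup direction is greater than a preset fifth threshold, the third pickup direction is the same as the first pickup direction.
  32. The method according to any one of claims 27 to 31, wherein the determining a third pickup direction from the first pickup direction and the second pickup direction comprises:
    after the absolute value of the direction deviation between the first pickup direction and the second pickup direction is greater than the preset fourth threshold and less than the fifth threshold, the third pickup direction is the first pickup direction superimposed with the product of the absolute value of the direction deviation between the first pickup direction and the second pickup direction and a preset proportionality coefficient.
  33. A computer-readable storage medium, wherein the computer-readable storage medium comprises a computer program which, when run on an electronic device, causes the electronic device to perform the method according to any one of claims 23 to 32.
  34. A computer program product which, when run on a computer, causes the computer to perform the method according to any one of claims 23 to 32.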
PCT/CN2021/138534 2021-01-20 2021-12-15 Wake-up method and electronic device WO2022156438A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21920812.1A EP4258259A4 (en) 2021-01-20 2021-12-15 WAKE-UP METHOD AND ELECTRONIC DEVICE

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110075531.7 2021-01-20
CN202110075531.7A CN114863936A (zh) 2021-01-20 2021-01-20 Wake-up method and electronic device

Publications (1)

Publication Number Publication Date
WO2022156438A1 true WO2022156438A1 (zh) 2022-07-28

Family

ID=82548460

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/138534 WO2022156438A1 (zh) 2021-01-20 2021-12-15 Wake-up method and electronic device

Country Status (3)

Country Link
EP (1) EP4258259A4 (zh)
CN (1) CN114863936A (zh)
WO (1) WO2022156438A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622770A (zh) * 2017-09-30 2018-01-23 百度在线网络技术(北京)有限公司 语音唤醒方法及装置
CN108537032A (zh) * 2018-04-19 2018-09-14 湖南德熠智能科技有限公司 一种人脸识别方法
CN110364151A (zh) * 2019-07-15 2019-10-22 华为技术有限公司 一种语音唤醒的方法和电子设备
CN110600058A (zh) * 2019-09-11 2019-12-20 深圳市万睿智能科技有限公司 基于超声波唤醒语音助手的方法、装置、计算机设备及存储介质
US20200090643A1 (en) * 2019-07-01 2020-03-19 Lg Electronics Inc. Speech recognition method and device
CN111245995A (zh) * 2020-01-13 2020-06-05 Oppo广东移动通信有限公司 麦克风组件及电子设备
CN111667843A (zh) * 2019-03-05 2020-09-15 北京京东尚科信息技术有限公司 终端设备的语音唤醒方法、系统、电子设备、存储介质
US20210014600A1 (en) * 2019-07-11 2021-01-14 Infineon Technologies Ag Portable device and method for operating the same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4258259A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116456441A (zh) * 2023-06-16 2023-07-18 荣耀终端有限公司 声音处理装置、方法和电子设备
CN116456441B (zh) * 2023-06-16 2023-10-31 荣耀终端有限公司 声音处理装置、方法和电子设备

Also Published As

Publication number Publication date
EP4258259A1 (en) 2023-10-11
EP4258259A4 (en) 2024-05-01
CN114863936A (zh) 2022-08-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21920812

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021920812

Country of ref document: EP

Effective date: 20230707

NENP Non-entry into the national phase

Ref country code: DE