US20210256979A1 - Voice Control Method, Wearable Device, and Terminal - Google Patents

Voice Control Method, Wearable Device, and Terminal

Info

Publication number
US20210256979A1
Authority
US
United States
Prior art keywords
voice, user, voice component, component, voiceprint
Legal status
Abandoned
Application number
US17/256,845
Other languages
English (en)
Inventor
Long Zhang
Chunjian Li
Cunshou Qiu
Qing Chang
Current Assignee
Huawei Technologies Co., Ltd.
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Assigned to HUAWEI TECHNOLOGIES CO., LTD. Assignors: CHANG, QING; LI, CHUNJIAN; QIU, CHUNSHOU; ZHANG, LONG.
Corrective assignment to HUAWEI TECHNOLOGIES CO., LTD. to correct the third inventor's name previously recorded at Reel 55018, Frame 041. Assignors: CHANG, QING; LI, CHUNJIAN; QIU, CUNSHOU; ZHANG, LONG.
Publication of US20210256979A1

Classifications

    • G10L 17/00: Speaker identification or verification techniques
    • G10L 17/02: Preprocessing operations, e.g. segment selection; pattern representation or modelling; feature selection or extraction
    • G10L 17/06: Decision making techniques; pattern matching strategies
    • G10L 17/22: Interactive procedures; man-machine interfaces
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/221: Announcement of recognition results
    • G10L 2015/223: Execution procedure of a spoken command
    • G10L 2015/225: Feedback of the input speech
    • G06F 1/163: Wearable computers, e.g. on a belt
    • G06F 21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • H04R 1/08: Mouthpieces; microphones; attachments therefor
    • H04R 1/1041: Earpieces and earphones; mechanical or electronic switches, or control elements
    • H04R 2460/13: Hearing devices using bone conduction transducers

Definitions

  • This application relates to the terminal field, and in particular, to a voice control method, a wearable device, and a terminal.
  • A voiceprint is a sound wave spectrum that carries voice information when a user makes a sound, and can reflect an audio feature of the user. Because vocal organs (for example, a tongue, teeth, a larynx, a lung, and a nasal cavity) used by different persons during speaking differ in size and form, the sound wave spectrums of any two persons are usually different. Therefore, one or more types of voice information may be analyzed through voiceprint recognition (speaker recognition, SR) to distinguish between unknown voices.
  • A conventional microphone is mainly used to collect a speaker's voice signal propagated through air, and the speaker's identity is then identified based on the collected voice signal.
  • However, the collected voice signal usually contains much noise, which easily interferes with the accuracy of voiceprint recognition.
  • As a result, a security risk of a terminal such as a mobile phone may increase because the terminal cannot accurately identify the voice signal.
  • This application provides a voice control method, a wearable device, and a terminal, to improve accuracy and security of voiceprint recognition when a user uses a voice control terminal.
  • According to a first aspect, this application provides a voice control method, including: establishing, by a terminal, a communication connection to a wearable device; when a voicing user enters voice information to the wearable device, performing, by the terminal, identity authentication on the voicing user based on a first voiceprint recognition result of a first voice component in the voice information and a second voiceprint recognition result of a second voice component in the voice information, where the first voice component is collected by a first voice sensor of the wearable device, and the second voice component is collected by a second voice sensor of the wearable device; and if a result of the identity authentication performed by the terminal on the voicing user is that the voicing user is an authorized user, executing, by the terminal, an operation instruction corresponding to the voice information.
  • In other words, the wearable device collects two pieces of voice information (that is, the first voice component and the second voice component) by using two voice sensors.
  • The terminal may then separately perform voiceprint recognition on the two pieces of voice information.
  • When the voiceprint recognition results of the two pieces of voice information both match those of the authorized user, it may be determined that the current voicing user is the authorized user. Clearly, compared with voiceprint recognition on a single piece of voice information, this dual voiceprint recognition process can significantly improve accuracy and security of user identity authentication.
  • In addition, because the second voice component is collected by a bone conduction microphone of the wearable device, successful recognition of the second voice component indicates that the user is wearing the wearable device when making the sound. This avoids a case in which an unauthorized user maliciously controls the terminal of the authorized user by using a recording of the authorized user.
  • Before the performing, by the terminal, identity authentication on the voicing user based on a first voiceprint recognition result of a first voice component in the voice information and a second voiceprint recognition result of a second voice component in the voice information, the method further includes: obtaining, by the terminal, the first voiceprint recognition result and the second voiceprint recognition result from the wearable device, where the first voiceprint recognition result is obtained after the wearable device performs voiceprint recognition on the first voice component, and the second voiceprint recognition result is obtained after the wearable device performs voiceprint recognition on the second voice component.
  • That is, the wearable device may locally perform voiceprint recognition on the two voice components and then send the recognition results to the terminal. This reduces the complexity of implementing voice control on the terminal.
  • Alternatively, before the performing, by the terminal, identity authentication on the voicing user based on a first voiceprint recognition result of a first voice component in the voice information and a second voiceprint recognition result of a second voice component in the voice information, the method further includes: obtaining, by the terminal, the first voice component and the second voice component from the wearable device; and separately performing, by the terminal, voiceprint recognition on the first voice component and the second voice component, to obtain the first voiceprint recognition result corresponding to the first voice component and the second voiceprint recognition result corresponding to the second voice component.
  • In this case, the wearable device may send the two voice components to the terminal for voiceprint recognition. This reduces power consumption and implementation complexity of the wearable device.
  • The separately performing, by the terminal, voiceprint recognition on the first voice component and the second voice component includes: performing, by the terminal, voiceprint recognition on the first voice component and the second voice component when the voice information includes a preset keyword; or performing, by the terminal, voiceprint recognition on the first voice component and the second voice component when a preset operation entered by the user is received. Otherwise, it indicates that the user does not need to perform voiceprint recognition at this time, and the terminal does not need to enable a voiceprint recognition function. This reduces power consumption of the terminal.
  • The separately performing, by the terminal, voiceprint recognition on the first voice component and the second voice component includes: determining, by the terminal, whether the first voice component matches a first voiceprint model of an authorized user, where the first voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the first voice sensor; and determining, by the terminal, whether the second voice component matches a second voiceprint model of the authorized user, where the second voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the second voice sensor.
  • the performing, by the terminal, identity authentication on the voicing user based on a first voiceprint recognition result of a first voice component in the voice information and a second voiceprint recognition result of a second voice component in the voice information includes: if the first voice component matches the first voiceprint model of the authorized user, and the second voice component matches the second voiceprint model of the authorized user, determining, by the terminal, that the voicing user is an authorized user, or otherwise, determining, by the terminal, that the voicing user is an unauthorized user.
  • the determining, by the terminal, whether the first voice component matches a first voiceprint model of an authorized user includes: calculating, by the terminal, a first degree of matching between the first voice component and the first voiceprint model of the authorized user; and if the first matching degree is greater than a first threshold, determining, by the terminal, that the first voice component matches the first voiceprint model of the authorized user; and the determining, by the terminal, whether the second voice component matches a second voiceprint model of the authorized user includes: calculating, by the terminal, a second degree of matching between the second voice component and the second voiceprint model of the authorized user; and if the second matching degree is greater than a second threshold, determining, by the terminal, that the second voice component matches the second voiceprint model of the authorized user.
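  • For illustration only, the dual-threshold decision described above can be sketched in Python as follows; the concrete threshold values and the match_degree scoring callable are assumptions and are not specified in this application.

```python
# Illustrative sketch (not from the application text) of the dual-threshold
# identity authentication: both voice components must match the authorized
# user's voiceprint models. Thresholds and match_degree() are assumptions.

FIRST_THRESHOLD = 0.80   # assumed threshold for the first (air conduction) component
SECOND_THRESHOLD = 0.75  # assumed threshold for the second (bone conduction) component

def authenticate(first_component, second_component,
                 first_model, second_model, match_degree):
    """Return True only if both voice components match the authorized user's models."""
    first_degree = match_degree(first_component, first_model)
    second_degree = match_degree(second_component, second_model)
    return first_degree > FIRST_THRESHOLD and second_degree > SECOND_THRESHOLD
```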
  • Before the performing, by a terminal, identity authentication on the voicing user based on a first voiceprint recognition result of a first voice component in the voice information and a second voiceprint recognition result of a second voice component in the voice information, the method further includes: obtaining, by the terminal, an enabling instruction sent by the wearable device, where the enabling instruction is generated by the wearable device in response to a wake-up voice entered by the user; and enabling, by the terminal, a voiceprint recognition function in response to the enabling instruction.
  • Alternatively, the method further includes: determining, by the terminal based on the first voice component and the second voice component, whether the voice information includes a preset wake-up word; and enabling, by the terminal, a voiceprint recognition function if the voice information includes the preset wake-up word.
  • In this way, the user may trigger, by saying the wake-up word, the terminal to enable the voiceprint recognition function; otherwise, it indicates that the user does not need to perform voiceprint recognition at this time, and the terminal does not need to enable the voiceprint recognition function. This reduces power consumption of the terminal.
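  • A minimal sketch of this wake-up-word gate is shown below; the example wake-up word and the transcribe speech-to-text callable are hypothetical placeholders, not part of the application.

```python
# Illustrative sketch: enable the voiceprint recognition function only when the
# voice information contains a preset wake-up word. The wake-up word value and
# the transcribe() callable are hypothetical placeholders.

PRESET_WAKE_UP_WORD = "hello assistant"  # hypothetical preset wake-up word

def should_enable_voiceprint(first_component, second_component, transcribe) -> bool:
    text = transcribe(first_component, second_component).lower()
    return PRESET_WAKE_UP_WORD in text  # otherwise the function stays disabled to save power
```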
  • If the voicing user is an authorized user, the method further includes: automatically executing, by the terminal, an unlock operation.
  • In this way, the user only needs to enter the voice information once to complete a series of operations such as user identity authentication, mobile phone unlocking, and enabling of a function of the mobile phone. This greatly improves the user's control efficiency over the mobile phone and the user experience.
  • Before the executing, by the terminal, an operation instruction corresponding to the voice information, the method further includes: obtaining, by the terminal, a device identifier of the wearable device; and the executing, by the terminal, an operation instruction corresponding to the voice information includes: if the device identifier of the wearable device is a preset authorized device identifier, executing, by the terminal, the operation instruction corresponding to the voice information.
  • In this way, the terminal may receive and execute a related operation instruction sent by an authorized Bluetooth device, and when an unauthorized Bluetooth device sends an operation instruction to the terminal, the terminal may discard the operation instruction to improve security.
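  • A minimal sketch of this device-identifier check follows; the identifier format (a Bluetooth MAC address) and the example value are assumptions made only for illustration.

```python
# Illustrative sketch: execute an operation instruction only if it comes from a
# wearable device whose identifier is on the preset authorized-device list.
# Using a Bluetooth MAC address as the identifier is an assumption.

AUTHORIZED_DEVICE_IDS = {"11:22:33:44:55:66"}  # hypothetical authorized headset

def handle_instruction(device_id: str, instruction: str, execute) -> None:
    if device_id in AUTHORIZED_DEVICE_IDS:
        execute(instruction)
    # instructions from unauthorized devices are discarded for security
```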
  • According to a second aspect, this application provides a voice control method, including: establishing, by a wearable device, a communication connection to a terminal; collecting, by the wearable device, a first voice component in voice information by using a first voice sensor; collecting, by the wearable device, a second voice component in the voice information by using a second voice sensor; and separately performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component to perform identity authentication on a voicing user.
  • The first voice sensor is located on a side of the wearable device that is not in contact with the user, and the second voice sensor is located on a side of the wearable device that is in contact with the user.
  • For example, the first voice sensor is an air conduction microphone, and the second voice sensor is a bone conduction microphone.
  • Before the collecting, by the wearable device, a first voice component in voice information by using a first voice sensor, the method further includes: detecting ambient light intensity by using an optical proximity sensor on the wearable device; detecting an acceleration value by using an acceleration sensor on the wearable device; and if the ambient light intensity is less than a preset light intensity threshold, or the acceleration value is greater than a preset acceleration threshold, or the ambient light intensity is less than the preset light intensity threshold and the acceleration value is greater than the preset acceleration threshold, determining that the wearable device is in a wearing state.
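  • The wearing-state decision just described can be summarized by the short sketch below; the specific threshold values are assumptions, since the application only refers to preset thresholds.

```python
# Illustrative sketch of the wearing-state decision: the device is considered worn
# when the ambient light is low (the sensor is covered by the ear) or the device
# is moving. The threshold values are assumptions.

LIGHT_INTENSITY_THRESHOLD = 10.0  # lux (assumed preset light intensity threshold)
ACCELERATION_THRESHOLD = 0.5      # m/s^2 beyond gravity (assumed preset acceleration threshold)

def is_wearing(ambient_light: float, acceleration: float) -> bool:
    return ambient_light < LIGHT_INTENSITY_THRESHOLD or acceleration > ACCELERATION_THRESHOLD
```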
  • The method further includes: performing, by the wearable device, voice activity detection (VAD) on the first voice component to obtain a first VAD value; and performing, by the wearable device, VAD on the second voice component to obtain a second VAD value; and the performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component includes: performing voiceprint recognition on the first voice component and the second voice component when the first VAD value and the second VAD value each meet a preset condition.
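  • A toy sketch of this VAD gate is given below; the energy-based VAD, its threshold, and the interpretation of the preset condition as "speech present in both components" are assumptions.

```python
# Illustrative sketch: a toy energy-based VAD and the gate that runs voiceprint
# recognition only when speech is detected in both voice components. The energy
# threshold and the "preset condition" interpretation are assumptions.

def energy_vad(frame_samples, energy_threshold: float = 1e-3) -> int:
    """Return 1 (speech present) if the frame's mean energy exceeds the threshold, else 0."""
    energy = sum(s * s for s in frame_samples) / max(len(frame_samples), 1)
    return 1 if energy > energy_threshold else 0

def should_run_voiceprint(first_vad: int, second_vad: int) -> bool:
    return first_vad == 1 and second_vad == 1
```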
  • The performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component includes: performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component when the voice information includes a preset keyword; or performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component when a preset operation entered by the user is received.
  • The performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component includes: determining, by the wearable device, whether the first voice component matches a first voiceprint model of an authorized user, where the first voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the first voice sensor; and determining, by the wearable device, whether the second voice component matches a second voiceprint model of the authorized user, where the second voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the second voice sensor.
  • the method further includes: collecting, by the wearable device by using the first voice sensor, a first registration component in a registration voice entered by the authorized user to establish the first voiceprint model of the authorized user; and collecting, by the wearable device by using the second voice sensor, a second registration component in the registration voice entered by the authorized user to establish the second voiceprint model of the authorized user.
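  • A simplified enrollment sketch follows; treating the stored registration features themselves as the voiceprint "model" is an assumption made purely for illustration, since the application does not describe the modelling step.

```python
# Illustrative sketch of voiceprint enrollment: extract features from the two
# registration components and store them as the two voiceprint models. Real systems
# would train statistical models; storing normalized features is an assumption.

def extract_features(samples):
    peak = max((abs(s) for s in samples), default=1.0) or 1.0
    return [s / peak for s in samples]  # placeholder for a real spectral front end

def enroll(first_registration_component, second_registration_component):
    first_voiceprint_model = extract_features(first_registration_component)
    second_voiceprint_model = extract_features(second_registration_component)
    return first_voiceprint_model, second_voiceprint_model
```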
  • the determining, by the wearable device, whether the first voice component matches a first voiceprint model of an authorized user includes: calculating, by the wearable device, a first degree of matching between the first voice component and the first voiceprint model of the authorized user; and if the first matching degree is greater than a first threshold, determining, by the wearable device, that the first voice component matches the first voiceprint model of the authorized user; and the determining, by the wearable device, whether the second voice component matches a second voiceprint model of the authorized user includes: calculating, by the wearable device, a second degree of matching between the second voice component and the second voiceprint model of the authorized user; and if the second matching degree is greater than a second threshold, determining, by the wearable device, that the second voice component matches the second voiceprint model of the authorized user.
  • the method further includes: sending, by the wearable device, an authentication success message or an unlock instruction to the terminal if the voicing user is an authorized user.
  • the method further includes: if the voicing user is an authorized user, sending, by the wearable device, an operation instruction corresponding to the voice information to the terminal.
  • Before the performing, by the wearable device, voiceprint recognition on the first voice component and the second voice component, the method further includes: performing, by the wearable device, noise reduction processing on the first voice component and the second voice component; and/or canceling, by the wearable device, an echo signal in each of the first voice component and the second voice component by using an echo cancellation algorithm.
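  • As a stand-in for such preprocessing, a toy noise gate is sketched below; real devices would use dedicated noise-suppression and echo-cancellation algorithms, so this snippet only illustrates where the step fits and is entirely an assumption.

```python
# Illustrative sketch: a toy noise gate standing in for the noise reduction step.
# It zeroes samples whose magnitude falls below a small threshold; the threshold
# value is an assumption.

def noise_gate(samples, threshold: float = 0.02):
    return [s if abs(s) >= threshold else 0.0 for s in samples]
```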
  • Before the collecting, by the wearable device, a first voice component in voice information by using a first voice sensor, the method further includes: receiving, by the wearable device, a wake-up voice entered by the user, where the wake-up voice includes a preset wake-up word; and sending, by the wearable device, an enabling instruction to the terminal in response to the wake-up voice, where the enabling instruction is used to instruct the terminal to enable a voiceprint recognition function.
  • According to a third aspect, this application provides a terminal, including a connection unit, an obtaining unit, a recognition unit, an authentication unit, and an execution unit.
  • the connection unit is configured to establish a communication connection to a wearable device.
  • the authentication unit is configured to: when a voicing user enters voice information to the wearable device, perform identity authentication on the voicing user based on a first voiceprint recognition result of a first voice component in the voice information and a second voiceprint recognition result of a second voice component in the voice information, where the first voice component is collected by a first voice sensor of the wearable device, and the second voice component is collected by a second voice sensor of the wearable device.
  • the execution unit is configured to: if a result of the identity authentication performed by the terminal on the voicing user is that the voicing user is an authorized user, execute an operation instruction corresponding to the voice information.
  • the obtaining unit is configured to obtain the first voiceprint recognition result and the second voiceprint recognition result from the wearable device, where the first voiceprint recognition result is obtained after the wearable device performs voiceprint recognition on the first voice component, and the second voiceprint recognition result is obtained after the wearable device performs voiceprint recognition on the second voice component.
  • the obtaining unit is configured to obtain the first voice component and the second voice component from the wearable device
  • the recognition unit is configured to separately perform voiceprint recognition on the first voice component and the second voice component, to obtain the first voiceprint recognition result corresponding to the first voice component and the second voiceprint recognition result corresponding to the second voice component.
  • the recognition unit is specifically configured to: perform voiceprint recognition on the first voice component and the second voice component when the voice information includes a preset keyword; or perform voiceprint recognition on the first voice component and the second voice component when a preset operation entered by the user is received.
  • the recognition unit is specifically configured to: determine whether the first voice component matches a first voiceprint model of an authorized user, where the first voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the first voice sensor; and determine whether the second voice component matches a second voiceprint model of the authorized user, where the second voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the second voice sensor; and the authentication unit is specifically configured to: if the first voice component matches the first voiceprint model of the authorized user, and the second voice component matches the second voiceprint model of the authorized user, determine that the voicing user is an authorized user, or otherwise, determine that the voicing user is an unauthorized user.
  • the recognition unit is specifically configured to: calculate a first degree of matching between the first voice component and the first voiceprint model of the authorized user; if the first matching degree is greater than a first threshold, determine that the first voice component matches the first voiceprint model of the authorized user; calculate a second degree of matching between the second voice component and the second voiceprint model of the authorized user; and if the second matching degree is greater than a second threshold, determine that the second voice component matches the second voiceprint model of the authorized user.
  • the obtaining unit is further configured to obtain an enabling instruction sent by the wearable device, where the enabling instruction is generated by the wearable device in response to a wake-up voice entered by the user, and the execution unit is further configured to enable a voiceprint recognition function in response to the enabling instruction.
  • the recognition unit is further configured to determine, based on the first voice component and the second voice component, whether the voice information includes a preset wake-up word, and the execution unit is further configured to enable a voiceprint recognition function if the voice information includes the preset wake-up word.
  • the execution unit is further configured to automatically perform an unlock operation if the voicing user is an authorized user.
  • the obtaining unit is further configured to obtain a device identifier of the wearable device
  • the execution unit is specifically configured to: if the device identifier of the wearable device is a preset authorized device identifier, execute the operation instruction corresponding to the voice information.
  • According to a fourth aspect, this application provides a wearable device, including a connection unit, a detection unit, a recognition unit, an authentication unit, and a sending unit.
  • the connection unit is configured to establish a communication connection to a terminal.
  • The detection unit is configured to collect a first voice component in voice information by using a first voice sensor, and to collect a second voice component in the voice information by using a second voice sensor.
  • the recognition unit is configured to separately perform voiceprint recognition on the first voice component and the second voice component.
  • the detection unit is further configured to: detect ambient light intensity by using an optical proximity sensor on the wearable device; detect an acceleration value by using an acceleration sensor on the wearable device; and if the ambient light intensity is less than a preset light intensity threshold, or the acceleration value is greater than a preset acceleration threshold, or the ambient light intensity is less than the preset light intensity threshold and the acceleration value is greater than the preset acceleration threshold, determine that the wearable device is in a wearing state.
  • the detection unit is further configured to: perform voice activity detection (VAD) on the first voice component to obtain a first VAD value; and perform VAD on the second voice component to obtain a second VAD value; and the recognition unit is specifically configured to perform voiceprint recognition on the first voice component and the second voice component when the first VAD value and the second VAD value each meet a preset condition.
  • VAD voice activity detection
  • the recognition unit is specifically configured to: perform voiceprint recognition on the first voice component and the second voice component when the voice information includes a preset keyword; or perform voiceprint recognition on the first voice component and the second voice component when a preset operation entered by the user is received.
  • the recognition unit is specifically configured to: determine whether the first voice component matches a first voiceprint model of an authorized user, where the first voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the first voice sensor; and determine whether the second voice component matches a second voiceprint model of the authorized user, where the second voiceprint model is used to reflect an audio feature that is of the authorized user and that is collected by the second voice sensor; and the authentication unit is specifically configured to: if the first voice component matches the first voiceprint model of the authorized user, and the second voice component matches the second voiceprint model of the authorized user, determine that the voicing user is an authorized user, or otherwise, determine that the voicing user is an unauthorized user.
  • the recognition unit is specifically configured to: calculate a first degree of matching between the first voice component and the first voiceprint model of the authorized user; if the first matching degree is greater than a first threshold, determine that the first voice component matches the first voiceprint model of the authorized user; calculate a second degree of matching between the second voice component and the second voiceprint model of the authorized user; and if the second matching degree is greater than a second threshold, determine that the second voice component matches the second voiceprint model of the authorized user.
  • the sending unit is further configured to send an authentication success message or an unlocking instruction to the terminal if the voicing user is an authorized user.
  • the sending unit is further configured to: if the voicing user is an authorized user, send an operation instruction corresponding to the voice information to the terminal.
  • the detection unit is further configured to detect a wake-up voice entered by the user, where the wake-up voice includes a preset wake-up word
  • the sending unit is further configured to send an enabling instruction to the terminal, where the enabling instruction is used to instruct the terminal to enable a voiceprint recognition function.
  • According to a fifth aspect, this application provides a terminal, including a touchscreen, one or more processors, a memory, and one or more programs.
  • the processor is coupled to the memory, and the one or more programs are stored in the memory.
  • the processor executes the one or more programs stored in the memory, so that the terminal performs any one of the foregoing voice control methods.
  • According to a sixth aspect, this application provides a wearable device, including a first voice sensor disposed outside the wearable device and a second voice sensor disposed inside the wearable device, one or more processors, a memory, and one or more programs.
  • the processor is coupled to the memory, and the one or more programs are stored in the memory.
  • the processor executes the one or more programs stored in the memory, so that the wearable device performs any one of the foregoing voice control methods.
  • According to a seventh aspect, this application provides a computer storage medium, including a computer instruction.
  • When the computer instruction is run on a terminal or a wearable device, the terminal or the wearable device is enabled to perform the voice control method according to any one of the foregoing designs.
  • According to an eighth aspect, this application provides a computer program product.
  • When the computer program product runs on a computer, the computer is enabled to perform the voice control method according to any one of the first aspect or the possible implementations of the first aspect.
  • the terminal according to the third aspect and the fifth aspect, the wearable device according to the fourth aspect and the sixth aspect, the computer storage medium according to the seventh aspect, and the computer program product according to the eighth aspect are all used to perform the corresponding method provided above. Therefore, for advantageous effects that the terminal, the wearable device, the computer storage medium, and the computer program product can achieve, refer to advantageous effects in the corresponding methods provided above. Details are not described herein.
  • FIG. 1 is an architectural diagram 1 of a scenario of a voice control method according to an embodiment of this application;
  • FIG. 2 is a schematic structural diagram 1 of a wearable device according to an embodiment of this application;
  • FIG. 3 is a schematic structural diagram 1 of a terminal according to an embodiment of this application;
  • FIG. 4 is a schematic interaction diagram 1 of a voice control method according to an embodiment of this application;
  • FIG. 5 is an architectural diagram 2 of a scenario of a voice control method according to an embodiment of this application;
  • FIG. 6 is a schematic interaction diagram 2 of a voice control method according to an embodiment of this application;
  • FIG. 7(a) and FIG. 7(b) are an architectural diagram 3 of a scenario of a voice control method according to an embodiment of this application;
  • FIG. 8 is a schematic structural diagram 2 of a terminal according to an embodiment of this application;
  • FIG. 9 is a schematic structural diagram 2 of a wearable device according to an embodiment of this application; and
  • FIG. 10 is a schematic structural diagram of a terminal according to an embodiment of this application.
  • a voice control method provided in an embodiment of this application may be applied to a voice control system including a wearable device 11 and a terminal 12 .
  • the wearable device 11 may be a device that has a voice collection function, such as a wireless headset, a wired headset, smart glasses, a smart helmet, or a smart wristwatch.
  • the terminal 12 may be a device such as a mobile phone, a tablet computer, a notebook computer, an ultra-mobile personal computer (Ultra-mobile Personal Computer, UMPC), or a personal digital assistant (Personal Digital Assistant, PDA). This is not limited in the embodiments of this application.
  • the wearable device 11 may specifically include a first voice sensor 201 disposed outside the wearable device 11 and a second voice sensor 202 disposed inside the wearable device 11 .
  • An inside of the wearable device 11 refers to a side that is directly in contact with a user when the user uses the wearable device 11
  • an outside of the wearable device 11 refers to a side that is not directly in contact with the user.
  • the first voice sensor 201 may be an air conduction microphone
  • the second voice sensor 202 may be a sensor capable of collecting a vibration signal generated when the user makes a sound, such as a bone conduction microphone, an optical vibration sensor, an acceleration sensor, or an air conduction microphone.
  • A manner of collecting voice information by the air conduction microphone is transmitting a vibration signal generated during vocalization to the microphone through air.
  • A manner of collecting voice information by the bone conduction microphone is transmitting a vibration signal generated during vocalization to the microphone through a bone.
  • For example, the first voice sensor 201 is an air conduction microphone, and the second voice sensor 202 is a bone conduction microphone.
  • In this case, the wearable device 11 may collect, by using the first voice sensor 201, voice information sent by the user after air propagation, and may also collect, by using the second voice sensor 202, the voice information sent by the user after bone propagation.
  • There may be a plurality of first voice sensors 201 on the wearable device 11.
  • For example, the first voice sensor 201 is an air conduction microphone.
  • Two air conduction microphones may be disposed outside the wearable device 11, and the two air conduction microphones jointly collect voice information sent by the user after air propagation, to obtain a first voice component in the voice information.
  • In addition, the bone conduction microphone may collect the voice information sent by the user after bone propagation, to obtain a second voice component in the voice information.
  • the wearable device 11 may further include components such as an acceleration sensor 203 (where the acceleration sensor 203 may be also used as the second voice sensor 202 ), an optical proximity sensor 204 , a communications module 205 , a speaker 206 , a calculation module 207 , a storage module 208 , and a power supply 209 . It may be understood that the wearable device 11 may have more or fewer components than those shown in FIG. 2 , may combine two or more components, or may have different component configurations. Various components shown in FIG. 2 may be implemented in hardware, software, or a combination of hardware and software that includes one or more signal processing or application-specific integrated circuits.
  • the terminal 12 in the voice control system may be specifically a mobile phone 100 .
  • the mobile phone 100 may specifically include: components such as a processor 101 , a radio frequency (radio frequency, RF) circuit 102 , a memory 103 , a touchscreen 104 , a Bluetooth apparatus 105 , one or more sensors 106 , a Wi-Fi apparatus 107 , a positioning apparatus 108 , an audio circuit 109 , a peripheral interface 110 , and a power supply apparatus 111 .
  • These components may communicate by using one or more communications buses or signal cables (not shown in FIG. 3 ).
  • a person skilled in the art may understand that a hardware structure shown in FIG. 3 does not constitute a limitation on the mobile phone 100 .
  • the mobile phone 100 may include more or fewer components than those shown in the figure, or combine some components, or have different component arrangements.
  • the processor 101 is a control center of the mobile phone 100 .
  • the processor 101 is connected to parts of the mobile phone 100 by using various interfaces and cables, runs or executes an application program stored in the memory 103 , and invokes data and an instruction stored in the memory 103 , to perform various functions of the mobile phone 100 and process data.
  • the processor 101 may include one or more processing units.
  • the processor 101 may further integrate an application processor and a modem processor.
  • the application processor mainly processes an operating system, a user interface, an application program, and the like.
  • the modem processor mainly processes wireless communication. It may be understood that the modem processor may alternatively not be integrated into the processor 101 .
  • the processor 101 may be a Kirin 960 multi-core processor manufactured by Huawei Technologies Co., Ltd.
  • The radio frequency circuit 102 may be configured to receive and send a radio signal in an information receiving and sending process or a call process. Specifically, after receiving downlink data from a base station, the radio frequency circuit 102 may send the downlink data to the processor 101 for processing, and send related uplink data to the base station.
  • the radio frequency circuit includes but is not limited to an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
  • the radio frequency circuit 102 may further communicate with another device through wireless communication.
  • the wireless communication may use any communications standard or protocol, including but not limited to the global system for mobile communications, general packet radio service, code division multiple access, wideband code division multiple access, long term evolution, email, messaging service, and the like.
  • the memory 103 is configured to store an application program and data.
  • the processor 101 runs the application program and the data that are stored in the memory 103 , to execute various functions of the mobile phone 100 and process data.
  • the memory 103 mainly includes a program storage area and a data storage area.
  • the program storage area may store an operating system, and an application program required by at least one function (for example, a sound playing function or an image playing function),
  • the data storage area may store data (for example, audio data or a phone book) created based on use of the mobile phone 100 .
  • the memory 103 may include a high-speed random access memory, and may further include a non-volatile memory such as a magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
  • The memory 103 may store various operating systems such as an iOS® operating system developed by Apple and an ANDROID® operating system developed by Google.
  • the touchscreen 104 may include a touch-sensitive surface 104 - 1 and a display 104 - 2 .
  • the touch-sensitive surface 104 - 1 may collect a touch event performed by a user of the mobile phone 100 on or near the touch-sensitive surface 104 - 1 (for example, an operation performed by the user on the touch-sensitive surface 104 - 1 or near the touch-sensitive surface 104 - 1 by using any proper object such as a finger or a stylus), and send collected touch information to another component, for example, the processor 101 .
  • the touch event performed by the user near the touch-sensitive surface 104 - 1 may be referred to as a floating touch.
  • the floating touch may mean that the user does not need to directly touch the touchpad for selecting, moving, or dragging an object (for example, an icon), and the user only needs to be near the terminal to execute a desired function.
  • terms such as “touch” and “contact” do not imply a direct contact with the touchscreen, but a contact near or close to the touchscreen.
  • the touch-sensitive surface 104 - 1 on which the floating touch can be performed may be implemented in a capacitive type, an infrared light sensing type, an ultrasonic wave type, or the like.
  • the touch-sensitive surface 104 - 1 may include two parts: a touch detection apparatus and a touch controller.
  • the touch detection apparatus detects a touch orientation of the user, detects a signal generated by a touch operation, and transmits the signal to the touch controller.
  • the touch controller receives touch information from the touch detection apparatus, converts the touch information into touchpoint coordinates, and sends the touchpoint coordinates to the processor 101 .
  • the touch controller may further receive an instruction sent by the processor 101 , and execute the instruction.
  • the touch-sensitive surface 104 - 1 may be implemented in a plurality of types, such as a resistive type, a capacitive type, an infrared type, and a surface acoustic wave type.
  • the display (also referred to as a display screen) 104 - 2 may be configured to display information entered by the user or information provided for the user, and various menus of the mobile phone 100 .
  • the display 104 - 2 may be configured in a form such as a liquid crystal display or an organic light emitting diode.
  • the touch-sensitive surface 104 - 1 may cover the display 104 - 2 . After detecting a touch event on or near the touch-sensitive surface 104 - 1 , the touch-sensitive surface 104 - 1 transmits the touch event to the processor 101 to determine a type of the touch event. Then, the processor 101 may provide corresponding visual output on the display 104 - 2 based on the type of the touch event.
  • Although in FIG. 3 the touch-sensitive surface 104 - 1 and the display screen 104 - 2 are used as two independent parts to implement input and output functions of the mobile phone 100, in some embodiments, the touch-sensitive surface 104 - 1 and the display screen 104 - 2 may be integrated to implement the input and output functions of the mobile phone 100.
  • The touchscreen 104 is formed by stacking a plurality of layers of materials. Only the touch-sensitive surface (layer) and the display screen (layer) are presented in the embodiments of this application, and other layers are not described in the embodiments of this application.
  • the touch-sensitive surface 104 - 1 may cover the display 104 - 2 , and a size of the touch-sensitive surface 104 - 1 is greater than a size of the display screen 104 - 2 . Therefore, the display screen 104 - 2 is entirely covered by the touch-sensitive surface 104 - 1 .
  • the touch-sensitive surface 104 - 1 may be configured on the front of the mobile phone 100 in a full panel form, in other words, any touch performed by the user on the front of the mobile phone 100 can be sensed by the mobile phone. In this way, full touch control experience on the front of the mobile phone can be implemented.
  • the touch-sensitive surface 104 - 1 is configured on the front of the mobile phone 100 in the full panel form
  • the display screen 104 - 2 may also be configured on the front of the mobile phone 100 in the full panel form. In this way, a bezel-less structure can be implemented on the front of the mobile phone.
  • the touchscreen 104 may further include one or more groups of sensor arrays, so that the touchscreen 104 may also sense pressure and the like applied by the user on the touchscreen 104 while sensing a touch event performed by the user on the touchscreen 104 .
  • the mobile phone 100 may further include the Bluetooth apparatus 105 , configured to implement data exchange between the mobile phone 100 and another short-distance terminal (for example, the wearable device 11 ).
  • the Bluetooth apparatus may be an integrated circuit, a Bluetooth chip, or the like.
  • the mobile phone 100 may further include at least one type of the sensor 106 , such as a light sensor, a motion sensor, and another sensor.
  • the optical sensor may include an ambient light sensor and a proximity sensor.
  • The ambient light sensor may adjust luminance of the display of the touchscreen 104 based on brightness of ambient light, and the proximity sensor may power off the display when the mobile phone 100 moves close to an ear.
  • an accelerometer sensor may detect acceleration values in various directions (usually on three axes).
  • The accelerometer sensor may detect a value and a direction of gravity when the accelerometer sensor is stationary, and may be applied to an application for recognizing a mobile phone posture (such as switching between landscape mode and portrait mode, a related game, and magnetometer posture calibration), a function related to vibration recognition (such as a pedometer and a knock), and the like.
  • Other sensors such as a fingerprint recognition component, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may be further configured on the mobile phone 100 . Details are not described herein.
  • the Wi-Fi apparatus 107 is configured to provide, for the mobile phone 100 , network access that complies with a Wi-Fi-related standard protocol.
  • the mobile phone 100 may access a Wi-Fi access point by using the Wi-Fi apparatus 107 , to help the user to receive and send an email, browse a web page, access streaming media, and the like.
  • The Wi-Fi apparatus 107 provides wireless broadband internet access for the user.
  • the Wi-Fi apparatus 107 may be used as a Wi-Fi wireless access point, and may provide another terminal with Wi-Fi network access.
  • the positioning apparatus 108 is configured to provide a geographical location for the mobile phone 100 . It can be understood that the positioning apparatus 108 may be specifically a receiver of a positioning system, such as a global positioning system (global positioning system, GPS) or a BeiDou navigation satellite system. After receiving the geographical location sent by the positioning system, the positioning apparatus 108 sends the information to the processor 101 for processing, or sends the information to the memory 103 for storage. In some other embodiments, the positioning apparatus 108 may be a receiver of an assisted global positioning system (assisted global positioning system, AGPS). The AGPS runs in a manner in which GPS positioning is performed with specific assistance.
  • The AGPS enables the mobile phone 100 to complete positioning faster.
  • the positioning apparatus 108 may obtain positioning assistance through communication with an assisted positioning server (for example, a mobile phone positioning server).
  • The AGPS system is used as an assistance server to assist the positioning apparatus 108 in completing ranging and positioning services.
  • the assisted positioning server provides positioning assistance by communicating with a terminal such as the positioning apparatus 108 (a GPS receiver) of the mobile phone 100 by using a wireless communications network.
  • the audio circuit 109 , a speaker 113 , and a microphone 114 may provide an audio interface between the user and the mobile phone 100 .
  • the audio circuit 109 may convert received audio data into an electrical signal and then transmit the electrical signal to the speaker 113 , and the speaker 113 converts the electrical signal into a sound signal for output.
  • the microphone 114 converts a collected sound signal into an electrical signal.
  • the audio circuit 109 receives the electrical signal, converts the electrical signal into audio data, and then outputs the audio data to the RF circuit 102 , to send the audio data to, for example, another mobile phone, or outputs the audio data to the memory 103 for further processing.
  • the peripheral interface 110 is configured to provide various interfaces for an external input/output device (for example, a keyboard, a mouse, an external display, an external memory, or a subscriber identity module card).
  • For example, the mobile phone 100 is connected to the mouse by using a universal serial bus interface, and is electrically connected, by using a metal contact on a card slot of the subscriber identity module card, to the subscriber identity module (subscriber identity module, SIM) card provided by a telecommunications operator.
  • the peripheral interface 110 may be configured to couple the external input/output peripheral device to the processor 101 and the memory 103 .
  • the mobile phone 100 may further include the power supply apparatus 111 (for example, a battery and a power supply management chip) that supplies power to the components.
  • the battery may be logically connected to the processor 101 by using the power supply management chip, so that functions such as charging, discharging, and power consumption management are implemented by using the power supply apparatus 111 .
  • the mobile phone 100 may further include a camera, a flash, a micro projection apparatus, a near field communication (near field communication, NFC) apparatus, and the like. Details are not described herein.
  • In the following embodiments, for example, the wearable device 11 is a Bluetooth headset and the terminal 12 is a mobile phone.
  • The Bluetooth headset and the mobile phone may communicate with each other by using a Bluetooth connection. In this embodiment of this application, the user may enter voice information to the Bluetooth headset when wearing the Bluetooth headset.
  • the Bluetooth headset may separately collect the voice information by using the externally disposed first voice sensor 201 and the internally disposed second voice sensor 202 .
  • the voice information collected by the first voice sensor 201 is a first voice component
  • the voice information collected by the second voice sensor 202 is a second voice component.
  • the Bluetooth headset may separately perform voiceprint recognition on the first voice component and the second voice component, to obtain a first voiceprint recognition result corresponding to the first voice component and a second voiceprint recognition result corresponding to the second voice component.
  • the Bluetooth headset may pre-store a first voiceprint model and a second voiceprint model of an authorized user.
  • the first voiceprint model is generated based on a registration voice that is entered by the authorized user to the first voice sensor 201 in advance.
  • the second voiceprint model is generated based on a registration voice that is entered by the authorized user to the second voice sensor 202 in advance.
  • the Bluetooth headset may match the first voiceprint model with the collected first voice component, and match the second voiceprint model with the collected second voice component.
  • the Bluetooth headset may calculate, by using a specific algorithm, a first degree of matching between the first voice component and the first voiceprint model and a second degree of matching between the second voice component and the second voiceprint model.
  • a higher matching degree indicates more similarity between the voice component and the corresponding voiceprint model, and a higher possibility that a voicing user is the authorized user.
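  • The application does not name the specific matching algorithm, so the sketch below uses cosine similarity between feature vectors purely as an example of how a matching degree could be computed; this choice is an assumption.

```python
# Illustrative sketch: one possible matching-degree computation, here the cosine
# similarity between a feature vector extracted from a voice component and the
# stored voiceprint model vector. The choice of cosine similarity is an assumption.

import math

def matching_degree(component_features, model_features) -> float:
    dot = sum(a * b for a, b in zip(component_features, model_features))
    norm = (math.sqrt(sum(a * a for a in component_features))
            * math.sqrt(sum(b * b for b in model_features)))
    return dot / norm if norm else 0.0
```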
  • If both the first matching degree and the second matching degree are greater than the corresponding thresholds, the Bluetooth headset may determine that the first voice component matches the first voiceprint model and that the second voice component matches the second voiceprint model. Further, the Bluetooth headset may send, to the mobile phone, an operation instruction corresponding to the voice information, for example, an unlock instruction, a power-off instruction, or an instruction for calling a specific contact. In this way, the mobile phone can perform a corresponding operation based on the operation instruction, so that the user can control the mobile phone by using a voice.
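  • The application does not define the format of this operation instruction, so the sketch below merely assumes a small JSON message carried over the Bluetooth link; the field names and example values are hypothetical.

```python
# Illustrative sketch: one possible shape for the operation instruction the headset
# could send to the phone after authentication succeeds. The JSON encoding and the
# field names are assumptions, not part of the application.

import json

def build_operation_instruction(command: str, device_id: str) -> bytes:
    message = {
        "type": "operation_instruction",
        "command": command,      # e.g. "unlock", "power_off", "call_contact"
        "device_id": device_id,  # lets the phone check its authorized-device list
    }
    return json.dumps(message).encode("utf-8")

# Example: build_operation_instruction("unlock", "11:22:33:44:55:66")
```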
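  • For illustration only, the headset-side dual-match decision described above can be sketched as follows in Python; the feature vectors, the cosine-similarity scoring, and the threshold values are assumptions standing in for whatever matching-degree algorithm and values an implementation actually uses.

```python
import numpy as np

def matching_degree(features, model_vector):
    # Toy matching degree: cosine similarity mapped to a 0-100 score.
    # Stands in for the (unspecified) voiceprint scoring algorithm.
    cos = float(np.dot(features, model_vector) /
                (np.linalg.norm(features) * np.linalg.norm(model_vector) + 1e-9))
    return 50.0 * (cos + 1.0)  # map [-1, 1] to [0, 100]

def is_authorized(first_feat, second_feat, first_model, second_model,
                  first_threshold=85.0, second_threshold=85.0):
    # True only if BOTH the air-conduction and the bone-conduction components match.
    return (matching_degree(first_feat, first_model) > first_threshold and
            matching_degree(second_feat, second_model) > second_threshold)
```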
  • the Bluetooth headset may alternatively send the collected first voice component and the collected second voice component to the mobile phone.
  • the mobile phone separately performs voiceprint recognition on the first voice component and the second voice component, and determines, based on recognition results, whether the user entering the voice information is the authorized user. If the user is the authorized user, the mobile phone may execute the operation instruction corresponding to the voice information.
  • the authorized user is a user that can pass an identity authentication measure preset by the mobile phone.
  • For example, if identity authentication measures preset by the terminal include entering a password, fingerprint recognition, and voiceprint recognition, a user who enters the correct password, or whose fingerprint information or voiceprint model is pre-entered in the terminal and who passes user identity authentication, may be considered as the authorized user of the terminal.
  • there may be one or more authorized users for one terminal and any user other than the authorized user may be considered as an unauthorized user of the terminal. After passing a specific identity authentication measure, the unauthorized user may also be considered as the authorized user. This is not limited in the embodiments of this application.
  • the wearable device 11 may collect voice information generated in an ear canal and voice information generated outside the ear canal when the user makes a sound.
  • the wearable device 11 generates two pieces of voice information (that is, the first voice component and the second voice component). Therefore, the wearable device 11 (or the terminal 12 ) may separately perform voiceprint recognition on the two pieces of voice information.
  • When voiceprint recognition results of the two pieces of voice information each match the voiceprint model of the authorized user, it may be determined that the user entering the voice information at this time is the authorized user. Clearly, compared with a voiceprint recognition process performed on one piece of voice information, the dual voiceprint recognition process performed on the two pieces of voice information can significantly improve accuracy and security of user identity authentication.
  • In addition, because the wearable device 11 can collect, through bone conduction, the voice information entered by the user only after the user wears the wearable device 11, when the voice information collected by the wearable device 11 through bone conduction can pass voiceprint recognition, it also indicates that the voice information is generated when the authorized user wearing the wearable device 11 makes a sound. This avoids a case in which an unauthorized user maliciously controls a terminal of the authorized user by using a recording of the authorized user.
  • The following describes in detail a voice control method provided in the embodiments of this application by using an example in which a mobile phone is used as the terminal and a Bluetooth headset is used as the wearable device.
  • FIG. 4 is a schematic flowchart of a voice control method according to an embodiment of this application. As shown in FIG. 4 , the voice control method may include the following steps.
  • S 401 A mobile phone establishes a Bluetooth connection to a Bluetooth headset.
  • a user may enable a Bluetooth function of the Bluetooth headset when wanting to use the Bluetooth headset.
  • the Bluetooth headset may send outside a pairing broadcast. If a Bluetooth function is enabled on the mobile phone, the mobile phone may receive the pairing broadcast and notify the user that a related Bluetooth device is scanned. After the user selects the Bluetooth headset on the mobile phone, the mobile phone may pair with the Bluetooth headset and establish the Bluetooth connection. Subsequently, the mobile phone and the Bluetooth headset may communicate with each other by using the Bluetooth connection. Certainly, if the mobile phone and the Bluetooth headset are successfully paired before the current Bluetooth connection is established, the mobile phone may automatically establish a Bluetooth connection to the Bluetooth headset found by scanning.
  • the user may operate the mobile phone to establish a Wi-Fi connection to the headset.
  • the user inserts a headset cable plug into a corresponding headset interface of the mobile phone to establish a wired connection. This is not limited in the embodiments of this application.
  • the Bluetooth headset detects whether the Bluetooth headset is in a wearing state.
  • an optical proximity sensor and an acceleration sensor may be disposed on the Bluetooth headset.
  • the optical proximity sensor is disposed on a side in contact with the user when the user wears the Bluetooth headset.
  • the optical proximity sensor and the acceleration sensor may be periodically enabled to obtain a currently detected measurement value.
  • the Bluetooth headset After wearing the Bluetooth headset, the user blocks light emitted into the optical proximity sensor. Therefore, when light intensity detected by the optical proximity sensor is less than a preset light intensity threshold, the Bluetooth headset may determine that the Bluetooth headset is in the wearing state at this time. In addition, after the user wears the Bluetooth headset, the Bluetooth headset may move with the user. Therefore, when an acceleration value detected by the acceleration sensor is greater than a preset acceleration threshold, the Bluetooth headset may determine that the Bluetooth headset is in the wearing state at this time. Alternatively, when the light intensity detected by the optical proximity sensor is less than the preset light intensity threshold, if it is detected that the acceleration value detected by the acceleration sensor at this time is greater than the preset acceleration threshold, the Bluetooth headset may determine that the Bluetooth headset is in the wearing state at this time.
  • a second voice sensor (for example, a bone conduction microphone or an optical vibration sensor)
  • the Bluetooth headset may further collect, by using the second voice sensor, a vibration signal generated in a current environment.
  • Because the Bluetooth headset is in direct contact with the user when being in the wearing state, a vibration signal collected by the second voice sensor in the wearing state is stronger than that collected in a non-wearing state. In this case, if energy of the vibration signal collected by the second voice sensor is greater than an energy threshold, the Bluetooth headset may determine that the Bluetooth headset is in the wearing state.
  • Alternatively, when the vibration signal collected by the second voice sensor has a preset spectrum feature, the Bluetooth headset may determine that the Bluetooth headset is in the wearing state. This reduces the probability that the Bluetooth headset cannot accurately detect the wearing status by using only the optical proximity sensor or the acceleration sensor, for example, in a scenario in which the user puts the Bluetooth headset into a pocket.
  • the energy threshold or the preset spectrum feature may be obtained through statistics collection on various vibration signals that are generated by sounds, motion, or the like after a large quantity of users wear the Bluetooth headset, and is quite different from the energy or spectrum feature of a signal detected by the second voice sensor when the user does not wear the Bluetooth headset.
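  • As a rough sketch of the wearing-state check described above (the threshold values and the simple energy measure are illustrative assumptions, not values from this application):

```python
import numpy as np

LIGHT_THRESHOLD = 10.0    # illustrative: light level below which the sensor is considered covered
ACCEL_THRESHOLD = 0.5     # illustrative: motion level above which the headset moves with the user
ENERGY_THRESHOLD = 1e-3   # illustrative: energy of the bone-conduction vibration signal

def is_worn(light_intensity, accel_value, vibration_frame):
    # Each individual check may declare the wearing state; the vibration check
    # helps reject cases such as the headset lying in a pocket.
    optical_ok = light_intensity < LIGHT_THRESHOLD
    motion_ok = accel_value > ACCEL_THRESHOLD
    vibration_ok = float(np.mean(np.square(vibration_frame))) > ENERGY_THRESHOLD
    return optical_ok or motion_ok or vibration_ok
```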
  • In addition, the first voice sensor (for example, an air conduction microphone) does not need to be enabled before the Bluetooth headset detects that the Bluetooth headset is currently in the wearing state. After detecting that the Bluetooth headset is in the wearing state, the Bluetooth headset may enable the first voice sensor to collect voice information generated when the user makes a sound. This reduces power consumption of the Bluetooth headset.
  • If the Bluetooth headset detects that the Bluetooth headset is currently in the wearing state, the Bluetooth headset may continue to perform the following steps S 403 to S 407; otherwise, the Bluetooth headset may enter a sleep state, and continue to perform the following steps S 403 to S 407 after detecting that the Bluetooth headset is currently in the wearing state.
  • In other words, only when the Bluetooth headset detects that the user wears the Bluetooth headset, that is, when the user has an intention to use the Bluetooth headset, is the Bluetooth headset triggered to collect the voice information entered by the user and perform voiceprint recognition and the like. This reduces power consumption of the Bluetooth headset.
  • step S 402 is optional. To be specific, regardless of whether the user wears the Bluetooth headset, the Bluetooth headset may continue to perform the following steps S 403 to S 407 . This is not limited in the embodiments of this application.
  • the Bluetooth headset collects, by using the first voice sensor, a first voice component in the voice information entered by the user, and collects a second voice component in the voice information by using the second voice sensor.
  • the Bluetooth headset may enable a voice detection module to separately collect, by using the first voice sensor and the second voice sensor, the voice information entered by the user, to obtain the first voice component and the second voice component in the voice information.
  • the first voice sensor is an air conduction microphone
  • the second voice sensor is a bone conduction microphone.
  • the user may enter voice information "Xiao E, pay by using WeChat".
  • the Bluetooth headset may receive, by using the air conduction microphone, a vibration signal (in other words, the first voice component in the voice information) generated by air vibration after the user makes a sound
  • the Bluetooth headset may receive, by using the bone conduction microphone, a vibration signal (in other words, the second voice component in the voice information) generated by vibration of the ear bone and the skin after the user makes a sound.
  • the Bluetooth headset may further distinguish between a voice signal and background noise in the voice information by using a voice activity detection (VAD) algorithm.
  • the Bluetooth headset may separately input the first voice component and the second voice component in the voice information into a corresponding VAD algorithm to obtain a first VAD value corresponding to the first voice component and a second VAD value corresponding to the second voice component.
  • a VAD value may be used to reflect whether the voice information is a normal voice signal of the speaker or a noise signal.
  • the VAD value may be set to be in a range from 0 to 100.
  • the VAD value When the VAD value is greater than a VAD threshold, it indicates that the voice information is a normal voice signal of the speaker, or when the VAD value is less than the VAD threshold, it indicates that the voice information is a noise signal.
  • the VAD value may be set to 0 or 1. When the VAD value is 1, it indicates that the voice information is a normal voice signal of the speaker, or when the VAD value is 0, it indicates that the voice information is a noise signal.
  • the Bluetooth headset may determine, based on the two VAD values: the first VAD value and the second VAD value, whether the voice information is a noise signal. For example, when both the first VAD value and the second VAD value are 1, the Bluetooth headset may determine that the voice information is not a noise signal, but is a normal voice signal of the speaker. For another example, when the first VAD value and the second VAD value each are greater than a preset value, the Bluetooth headset may determine that the voice information is not a noise signal, but is a normal voice signal of the speaker.
  • the Bluetooth headset may also determine, based on only the second VAD value, whether the voice information is a noise signal.
  • Voice activity detection is separately performed on the first voice component and the second voice component. If the Bluetooth headset determines that the voice information is a noise signal, the Bluetooth headset may discard the voice information. If the Bluetooth headset determines that the voice information is not a noise signal, the Bluetooth headset may continue to perform the following steps S 404 to S 407. In other words, only when the user enters valid voice information into the Bluetooth headset is the Bluetooth headset triggered to perform a subsequent process such as voiceprint recognition. This reduces power consumption of the Bluetooth headset.
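  • A minimal sketch of the dual-VAD gating described above is given below; the energy-based VAD and its floor value are placeholders, since a real implementation would use a proper VAD algorithm.

```python
import numpy as np

def simple_vad(frame, energy_floor=1e-4):
    # Toy frame-level VAD stand-in: 1 for speech-like energy, 0 for noise.
    return 1 if float(np.mean(np.square(frame))) > energy_floor else 0

def is_valid_voice(first_component, second_component):
    # Keep the voice information only if BOTH components look like speech.
    first_vad = simple_vad(first_component)    # air-conduction component
    second_vad = simple_vad(second_component)  # bone-conduction component
    return first_vad == 1 and second_vad == 1
```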
  • the Bluetooth headset may further separately measure a noise value of the voice information by using a noise estimation algorithm (for example, a minimum statistics algorithm or a minima controlled recursive averaging algorithm).
  • the Bluetooth headset may set storage space specially used for storing the noise value, and after calculating a new noise value each time, the Bluetooth headset may update the new noise value to the storage space. In other words, a latest calculated noise value is always stored in the storage space.
  • the Bluetooth headset may separately perform noise reduction processing on the first voice component and the second voice component by using the noise value in the storage space, so that recognition results obtained when the Bluetooth headset (or the mobile phone) subsequently performs voiceprint recognition on the first voice component and the second voice component are more accurate.
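  • The noise handling described above could look roughly like the following sketch, which keeps only the latest noise estimate and applies basic spectral subtraction; the FFT size and the crude minimum-tracking update are assumptions rather than the minimum statistics or MCRA algorithms themselves.

```python
import numpy as np

class NoiseTracker:
    """Keeps only the most recently estimated noise spectrum, as described above."""
    def __init__(self):
        self.noise_psd = None

    def update(self, frame, fft_size=512):
        # Crude stand-in for minimum statistics / MCRA: track the minimum
        # power spectrum seen so far as the latest noise estimate.
        psd = np.abs(np.fft.rfft(frame, fft_size)) ** 2
        self.noise_psd = psd if self.noise_psd is None else np.minimum(self.noise_psd, psd)

    def denoise(self, frame, fft_size=512):
        # Basic spectral subtraction using the stored noise estimate.
        if self.noise_psd is None:
            return frame
        spec = np.fft.rfft(frame, fft_size)
        mag = np.maximum(np.abs(spec) ** 2 - self.noise_psd, 1e-10) ** 0.5
        return np.fft.irfft(mag * np.exp(1j * np.angle(spec)), fft_size)[:len(frame)]
```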
  • the Bluetooth headset sends the first voice component and the second voice component to the mobile phone by using the Bluetooth connection.
  • the Bluetooth headset may send the first voice component and the second voice component to the mobile phone. Then, the mobile phone performs the following steps S 405 to S 407, to implement operations such as voiceprint recognition on the voice information entered by the user and user identity authentication.
  • the mobile phone separately performs voiceprint recognition on the first voice component and the second voice component, to obtain a first voiceprint recognition result corresponding to the first voice component and a second voiceprint recognition result corresponding to the second voice component.
  • Voiceprint models of one or more authorized users may be pre-stored on the mobile phone.
  • Each authorized user has two voiceprint models, one is a first voiceprint model established based on a voice feature of the user collected when the air conduction microphone (in other words, the first voice sensor) works, and the other is a second voiceprint model established based on a voice feature of the user collected when the bone conduction microphone (in other words, the second voice sensor) works.
  • the first phase is a background model training phase.
  • A developer may collect voices of related texts (for example, "Hello, Xiao E") generated when a large quantity of speakers wearing the Bluetooth headset make sounds.
  • the mobile phone may extract an audio feature (for example, a time-frequency noise spectrum graph or a gammatone-like spectrogram) from the background sound, and establish a background model for voiceprint recognition by using a machine learning algorithm such as a Gaussian mixture model (GMM), a support vector machine (SVM), or a deep neural network framework.
  • the mobile phone or the Bluetooth headset may establish, based on the background model and a registration voice entered by a user, a first voiceprint model and a second voiceprint model belonging to the user.
  • the deep neural network framework includes but is not limited to a deep neural network (DNN) algorithm, a recurrent neural network (RNN) algorithm, a long short-term memory (LSTM) algorithm, and the like.
  • the second phase is a process in which when the user uses a voice control function on the mobile phone for the first time, the first voiceprint model and the second voiceprint model belonging to the user are established by entering the registration voice.
  • the voice assistant APP may prompt the user to wear a Bluetooth headset and say a registration voice “Hello Xiao E”.
  • Because the Bluetooth headset includes an air conduction microphone and a bone conduction microphone, the Bluetooth headset may obtain, from the registration voice, a first registration component collected by using the air conduction microphone and a second registration component collected by using the bone conduction microphone.
  • the mobile phone may separately extract an audio feature of the user 1 in the first registration component and the second registration component, and further input the audio feature of the user 1 into the background model. In this way, the first voiceprint model and the second voiceprint model of the user 1 are obtained.
  • the mobile phone may locally store the first voiceprint model and the second voiceprint model of the authorized user 1 , or may send the first voiceprint model and the second voiceprint model of the authorized user 1 to the Bluetooth headset for storage.
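  • For illustration, the two training phases described above can be sketched with a Gaussian mixture model; the use of scikit-learn, the number of mixture components, and fitting a separate per-user model (instead of MAP-adapting the background model) are simplifying assumptions.

```python
from sklearn.mixture import GaussianMixture

def train_background_model(all_speaker_features, n_components=8):
    # Phase 1: background model trained offline on audio features
    # (n_frames x n_features) collected from a large quantity of speakers.
    ubm = GaussianMixture(n_components=n_components, covariance_type="diag")
    ubm.fit(all_speaker_features)
    return ubm

def enroll_user(registration_features, n_components=8):
    # Phase 2 (simplified): fit a per-user model on the registration voice features.
    # A production system would typically MAP-adapt the background model instead.
    user_model = GaussianMixture(n_components=n_components, covariance_type="diag")
    user_model.fit(registration_features)
    return user_model

# One model per sensor path, matching the description above (names are hypothetical):
# first_voiceprint_model  = enroll_user(features_of_first_registration_component)
# second_voiceprint_model = enroll_user(features_of_second_registration_component)
```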
  • the mobile phone may further use the Bluetooth headset currently connected to the mobile phone as an authorized Bluetooth device.
  • the mobile phone may locally store an identifier (for example, a MAC address of the Bluetooth headset) of the authorized Bluetooth device.
  • the mobile phone may receive and execute a related operation instruction sent by the authorized Bluetooth device, and when an unauthorized Bluetooth device sends an operation instruction to the mobile phone, the mobile phone may discard the operation instruction to improve security.
  • One mobile phone can manage one or more authorized Bluetooth devices. As shown in FIG. 7( a ), the user may access a setting screen 701 of a voiceprint recognition function from a setting function, and after clicking a setting button 705, the user may access a management screen 706 of the authorized device shown in FIG. 7( b ).
  • the user may add or delete an authorized Bluetooth device on the management screen 706 of the authorized device.
  • the mobile phone may separately extract an audio feature of each of the first voice component and the second voice component, and then match the first voiceprint model of the authorized user 1 with the audio feature of the first voice component, and match the second voiceprint model of the authorized user 1 with the audio feature of the second voice component.
  • the mobile phone may calculate, by using a specific algorithm, a first matching degree (that is, a first voiceprint recognition result) between the first voiceprint model and the first voice component, and a second matching degree (that is, the second voiceprint recognition result) between the second voiceprint model and the second voice component.
  • a higher matching degree indicates more similarity between the audio feature of the voice information and the audio feature of the authorized user 1 , and a higher possibility that the user entering the voice information is the authorized user 1 .
  • the mobile phone may further calculate one by one, according to the foregoing method, a first degree of matching between the first voice component and another authorized user (for example, an authorized user 2 or an authorized user 3 ), and a second degree of matching between the second voice component and the another authorized user.
  • Further, the mobile phone may determine an authorized user (for example, an authorized user A) with a highest matching degree as a current voicing user.
  • the mobile phone may further pre-determine whether voiceprint recognition needs to be performed on the first voice component and the second voice component.
  • If the Bluetooth headset or the mobile phone identifies a preset keyword in the voice information entered by the user, for example, a keyword related to user privacy or funds such as "transfer", "payment", "**bank", or "chat record", it indicates that the user has a relatively high security requirement for controlling the mobile phone through a voice at this time. Therefore, the mobile phone may perform step S 405, that is, perform voiceprint recognition.
  • Alternatively, if the Bluetooth headset receives a preset operation that is performed by the user and that is used to enable a voiceprint recognition function, for example, an operation of tapping the Bluetooth headset or simultaneously pressing the volume up button and the volume down button, it indicates that the user needs to verify the user identity through voiceprint recognition at this time. Therefore, the Bluetooth headset may instruct the mobile phone to perform step S 405, that is, perform voiceprint recognition.
  • keywords corresponding to different security levels may be preset on the mobile phone. For example, a keyword at a highest security level includes “pay”, “payment”, or the like, a keyword at a relatively high security level includes “photographing”, “calling”, or the like, and a keyword at a lowest security level includes “listening to a song”, “navigation”, or the like.
  • When it is detected that the collected voice information includes a keyword at the highest security level, the mobile phone may be triggered to separately perform voiceprint recognition on the first voice component and the second voice component, in other words, perform voiceprint recognition on both of the two collected voice sources, to improve security of controlling the mobile phone by using a voice.
  • When the collected voice information includes a keyword at a relatively high security level, the mobile phone may be triggered to perform voiceprint recognition on only the first voice component or the second voice component. When the collected voice information includes a keyword at the lowest security level, the mobile phone does not need to perform voiceprint recognition on the first voice component or the second voice component. This reduces power consumption of the mobile phone.
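  • The keyword-to-security-level routing described above might be sketched as follows; the keyword table and the returned actions are hypothetical examples rather than values defined in this application.

```python
# Hypothetical keyword-to-security-level table; the actual keywords and levels
# would be configured on the mobile phone.
SECURITY_LEVELS = {
    "pay": 3, "payment": 3,
    "photographing": 2, "calling": 2,
    "listening to a song": 1, "navigation": 1,
}

def voiceprint_plan(recognized_text):
    # Decide how much voiceprint recognition the collected voice information needs.
    level = max((lvl for kw, lvl in SECURITY_LEVELS.items() if kw in recognized_text),
                default=0)
    if level >= 3:
        return "verify_both_components"
    if level == 2:
        return "verify_one_component"
    return "skip_voiceprint"
```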
  • the mobile phone may further preset one or more wake-up words to wake up the mobile phone and enable the voiceprint recognition function.
  • the wake-up word may be “Hello, Xiao E”.
  • the Bluetooth headset or the mobile phone may identify whether the voice information is a wake-up voice including the wake-up word.
  • the Bluetooth headset may send the first voice component and the second voice component in the collected voice information to the mobile phone. If the mobile phone further identifies that the voice information includes the wake-up word, the mobile phone may enable the voiceprint recognition function (for example, power on a voiceprint recognition chip). Subsequently, if the voice information collected by the Bluetooth headset includes the keyword, the mobile phone may perform voiceprint recognition according to the method in step S 405 by using the enabled voiceprint recognition function.
  • the Bluetooth headset may further identify whether the voice information includes the wake-up word. If the voice information includes the wake-up word, it indicates that the user may subsequently need to use the voiceprint recognition function. In this case, the Bluetooth headset may send an enabling instruction to the mobile phone, so that the mobile phone enables the voiceprint recognition function in response to the enabling instruction.
  • the mobile phone performs user identity authentication based on the first voiceprint recognition result and the second voiceprint recognition result.
  • In step S 406, after obtaining, through voiceprint recognition, the first voiceprint recognition result corresponding to the first voice component and the second voiceprint recognition result corresponding to the second voice component, the mobile phone may perform, based on the two voiceprint recognition results, identity authentication on the user entering the voice information. This improves accuracy and security of user identity authentication.
  • the first degree of matching between the first voiceprint model of the authorized user and the first voice component is the first voiceprint recognition result
  • the second degree of matching between the second voiceprint model of the authorized user and the second voice component is the second voiceprint recognition result.
  • the authentication policy is that when the first matching degree is greater than a first threshold and the second matching degree is greater than a second threshold (the second threshold is the same as or different from the first threshold), the mobile phone determines that the user sending the first voice component and the second voice component is the authorized user, or otherwise, the mobile phone may determine that the user sending the first voice component and the second voice component is an unauthorized user.
  • Alternatively, the authentication policy may be that the mobile phone calculates a weighted average value of the first matching degree and the second matching degree. When the weighted average value is greater than a preset threshold, the mobile phone may determine that the user sending the first voice component and the second voice component is the authorized user; otherwise, the mobile phone may determine that the user sending the first voice component and the second voice component is an unauthorized user.
  • the mobile phone may use different authentication policies in different voiceprint recognition scenarios. For example, when the collected voice information includes a keyword at the highest security level, the mobile phone may set both the first threshold and the second threshold to 99 points. In this way, only when both the first matching degree and the second matching degree are greater than 99 points does the mobile phone determine that the current voicing user is the authorized user. When the collected voice information includes a keyword at a relatively low security level, the mobile phone may set both the first threshold and the second threshold to 85 points. In this way, when both the first matching degree and the second matching degree are greater than 85 points, the mobile phone may determine that the current voicing user is the authorized user. In other words, for voiceprint recognition scenarios at different security levels, the mobile phone may use authentication policies at different security levels to perform user identity authentication.
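  • As a sketch of the two authentication policies described above (the threshold values mirror the example above, and the 50/50 weighting is an illustrative assumption):

```python
def authenticate(first_degree, second_degree, security_level="high", weight_first=0.5):
    # Thresholds per security level; 99 and 85 mirror the example values above.
    thresholds = {"high": 99.0, "medium": 85.0}
    t = thresholds.get(security_level, 85.0)

    # Policy A: both matching degrees must exceed their thresholds.
    both_above = first_degree > t and second_degree > t

    # Policy B: a weighted average of the two degrees must exceed the threshold.
    weighted = weight_first * first_degree + (1.0 - weight_first) * second_degree
    weighted_above = weighted > t

    # The phone would apply whichever policy is configured for the scenario.
    return {"both_above_threshold": both_above, "weighted_average_above": weighted_above}
```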
  • the mobile phone stores voiceprint models of a plurality of authorized users, for example, the mobile phone stores voiceprint models of an authorized user A, an authorized user B, and an authorized user C
  • the voiceprint model of each authorized user includes a first voiceprint model and a second voiceprint model.
  • the mobile phone may separately match a collected first voice component and a collected second voice component with the voiceprint model of each authorized user according to the foregoing method.
  • the mobile phone may determine an authorized user (for example, the authorized user A) that meets the authentication policy and has a highest matching degree as a current voicing user.
  • the voiceprint model that is of the authorized user and that is stored on the mobile phone may alternatively be established after the mobile phone combines the first registration component and the second registration component in the registration voice.
  • each authorized user has a voiceprint model, and the voiceprint model can reflect an audio feature of a voice of the authorized user when the voice is transmitted through the air, and can also reflect an audio feature of the voice of the authorized user when the voice is transmitted through a bone.
  • the mobile phone may perform voiceprint recognition after combining the first voice component and the second voice component, for example, the mobile phone calculates a degree of matching between the voiceprint model of the authorized user and a combination of the first voice component and the second voice component. Further, the mobile phone can also perform user identity authentication based on the matching degree. According to this identity authentication method, the voiceprint models of the authorized user are combined into one voiceprint model. Therefore, complexity of the voiceprint model and required storage space are reduced correspondingly. In addition, because information about the voiceprint feature of the second voice component is used, dual voiceprint assurance and a liveness detection function are also provided.
  • the mobile phone may generate the operation instruction corresponding to the voice information. For example, when the voice information is “Xiao E, pay by using WeChat”, the operation instruction corresponding to the voice information is displaying a payment screen of a WeChat APP. In this way, after generating the operation instruction for displaying the payment screen on the WeChat APP, the mobile phone may automatically enable the WeChat APP, and display the payment screen on the WeChat APP.
  • the mobile phone may further first unlock a screen, and then execute the operation instruction for displaying the payment screen on the WeChat APP, to display a payment screen 501 on the WeChat APP.
  • the voice control method provided in steps S 401 to S 407 may be a function provided by the voice assistant APP.
  • When the Bluetooth headset interacts with the mobile phone, if the mobile phone determines, through voiceprint recognition, that the current voicing user is the authorized user, the mobile phone may send data such as the generated operation instruction or the voice information to the voice assistant APP running at an application layer. Further, the voice assistant APP invokes a related interface or service at an application framework layer to execute the operation instruction corresponding to the voice information.
  • the mobile phone may be unlocked and execute the operation instruction corresponding to the voice information while identifying the user identity by using a voiceprint.
  • the user only needs to enter the voice information once to complete a series of operations such as user identity authentication, mobile phone unlocking, and enabling a function of the mobile phone. This greatly improves control efficiency of the user on the mobile phone and user experience.
  • In steps S 401 to S 407, the mobile phone is used as an execution body to perform operations such as voiceprint recognition and user identity authentication. It may be understood that some or all of steps S 401 to S 407 may also be completed by the Bluetooth headset. This can reduce implementation complexity of the mobile phone and power consumption of the mobile phone. As shown in FIG. 6 , the voice control method may include the following steps.
  • S 601 A mobile phone establishes a Bluetooth connection to a Bluetooth headset.
  • the Bluetooth headset detects whether the Bluetooth headset is in a wearing state.
  • the Bluetooth headset collects, by using a first voice sensor, a first voice component in voice information entered by a user, and collects a second voice component in the voice information by using a second voice sensor.
  • For steps S 601 to S 603 of establishing the Bluetooth connection between the Bluetooth headset and the mobile phone, detecting whether the Bluetooth headset is in the wearing state, and detecting the first voice component and the second voice component in the voice information, refer to the related descriptions of steps S 401 to S 403. Details are not described herein again.
  • the Bluetooth headset may further perform operations such as VAD detection, noise reduction, or filtering on the detected first voice component and the detected second voice component. This is not limited in the embodiments of this application.
  • the Bluetooth headset has an audio playback function
  • an air conduction microphone and a bone conduction microphone on the Bluetooth headset may receive an echo signal of a sound source played by the speaker. Therefore, after obtaining the first voice component and the second voice component, the Bluetooth headset may further cancel an echo signal in each of the first voice component and the second voice component by using an echo cancellation algorithm (for example, adaptive echo cancellation, AEC), to improve accuracy of subsequent voiceprint recognition.
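  • A minimal normalized-LMS sketch of the echo cancellation step described above is shown below; the filter length and step size are illustrative, and a production AEC would be considerably more involved.

```python
import numpy as np

def nlms_echo_cancel(mic_signal, far_end, filter_len=128, mu=0.5, eps=1e-6):
    # Subtract the estimated echo of the locally played sound (far_end, same
    # length as mic_signal) from the microphone signal using NLMS adaptation.
    w = np.zeros(filter_len)
    out = np.zeros(len(mic_signal))
    for n in range(len(mic_signal)):
        # Most recent filter_len far-end samples, newest first, zero-padded at the start.
        x = np.pad(np.asarray(far_end[max(0, n - filter_len + 1): n + 1])[::-1],
                   (0, max(0, filter_len - (n + 1))))
        echo_estimate = float(np.dot(w, x))
        e = mic_signal[n] - echo_estimate
        w += (mu / (np.dot(x, x) + eps)) * e * x   # normalized LMS update
        out[n] = e
    return out
```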
  • the Bluetooth headset separately performs voiceprint recognition on the first voice component and the second voice component, to obtain a first voiceprint recognition result corresponding to the first voice component and a second voiceprint recognition result corresponding to the second voice component.
  • the Bluetooth headset may pre-store voiceprint models of one or more authorized users. In this way, after obtaining the first voice component and the second voice component, the Bluetooth headset may perform voiceprint recognition on the first voice component and the second voice component by using the voiceprint models locally stored on the Bluetooth headset.
  • the Bluetooth headset performs user identity authentication based on the first voiceprint recognition result and the second voiceprint recognition result.
  • For a process in which the Bluetooth headset performs user identity authentication based on the first voiceprint recognition result and the second voiceprint recognition result, refer to the related descriptions in step S 406 in which the mobile phone performs user identity authentication based on the first voiceprint recognition result and the second voiceprint recognition result. Details are not described herein again.
  • the Bluetooth headset sends an operation instruction corresponding to the voice information to the mobile phone by using the Bluetooth connection.
  • the Bluetooth headset may generate the operation instruction corresponding to the voice information. For example, when the voice information is “Xiao E, pay by using WeChat”, the operation instruction corresponding to the voice information is displaying a payment screen of a WeChat APP. In this way, the Bluetooth headset may send, to the mobile phone by using the established Bluetooth connection, the operation instruction for displaying the payment screen on the WeChat APP. As shown in FIG. 5 , after receiving the operation instruction, the mobile phone may automatically enable the WeChat APP, and display a payment screen 501 on the WeChat APP.
  • the Bluetooth headset may further send a success message of user identity authentication or an unlocking instruction to the mobile phone, so that the mobile phone may first unlock a screen, and then execute the operation instruction corresponding to the voice information.
  • the Bluetooth headset may also send the collected voice information to the mobile phone, and the mobile phone generates a corresponding operation instruction based on the voice information, and executes the operation instruction.
  • the Bluetooth headset when sending the voice information or the corresponding operation instruction to the mobile phone, the Bluetooth headset may further send a device identifier (for example, a MAC address) of the Bluetooth headset to the mobile phone. Because the mobile phone stores an identifier of an authorized Bluetooth device passing authentication, the mobile phone may determine, based on the received device identifier, whether the currently connected Bluetooth headset is the authorized Bluetooth device. If the Bluetooth headset is the authorized Bluetooth device, the mobile phone may further execute the operation instruction sent by the Bluetooth headset, or perform an operation such as voice recognition on voice information sent by the Bluetooth headset; or otherwise, the mobile phone may discard the operation instruction sent by the Bluetooth headset. This avoids a security problem caused by malicious control of the mobile phone by an unauthorized Bluetooth device.
  • For example, the mobile phone and the authorized Bluetooth device may pre-agree on a passcode or password used for transmitting the operation instruction.
  • In this case, the Bluetooth headset may further send the pre-agreed passcode or password to the mobile phone, so that the mobile phone determines whether the currently connected Bluetooth headset is the authorized Bluetooth device.
  • the mobile phone and the authorized Bluetooth device may pre-agree on an encryption algorithm and a decryption algorithm used for transmitting the operation instruction.
  • the Bluetooth headset may encrypt the operation instruction by using the agreed encryption algorithm.
  • After receiving an encrypted operation instruction, if the mobile phone can decrypt the operation instruction by using the agreed decryption algorithm, it indicates that the currently connected Bluetooth headset is the authorized Bluetooth device, and the mobile phone may further execute the operation instruction sent by the Bluetooth headset; or otherwise, it indicates that the currently connected Bluetooth headset is an unauthorized Bluetooth device, and the mobile phone may discard the operation instruction sent by the Bluetooth headset.
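  • For illustration, the pre-agreed encryption check described above could be sketched with a symmetric authenticated cipher; the choice of the third-party cryptography package and the Fernet scheme is an assumption, since this application does not specify an algorithm.

```python
# Requires the third-party "cryptography" package; Fernet is only an example cipher.
from cryptography.fernet import Fernet, InvalidToken

shared_key = Fernet.generate_key()   # pre-agreed between the phone and the authorized headset

def headset_send(instruction: bytes) -> bytes:
    # Headset side: encrypt the operation instruction with the agreed key.
    return Fernet(shared_key).encrypt(instruction)

def phone_receive(token: bytes):
    # Phone side: only an instruction that decrypts correctly is executed.
    try:
        return Fernet(shared_key).decrypt(token)   # authorized device
    except InvalidToken:
        return None                                # discard: unauthorized device
```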
  • steps S 401 to S 407 and steps S 601 to S 607 are merely two implementations of the voice control method provided in this application. It may be understood that a person skilled in the art may set, based on an actual application scenario or actual experience, which steps are performed by the Bluetooth headset and which steps are performed by the mobile phone in the foregoing embodiments. This is not limited in the embodiments of this application.
  • the Bluetooth headset may also send the obtained first voiceprint recognition result and the obtained second voiceprint recognition result to the mobile phone, and subsequently the mobile phone performs an operation such as user identity authentication based on the voiceprint recognition result.
  • the Bluetooth headset may pre-determine whether voiceprint recognition needs to be performed on the first voice component and the second voice component. If voiceprint recognition needs to be performed on the first voice component and the second voice component, the Bluetooth headset may send the first voice component and the second voice component to the mobile phone, so that the mobile phone completes subsequent operations such as voiceprint recognition and user identity authentication; or otherwise, the Bluetooth headset does not need to send the first voice component and the second voice component to the mobile phone. This avoids increasing power consumption of the mobile phone for processing the first voice component and the second voice component.
  • the user may further access a setting screen 701 of the mobile phone to enable or disable the voice control function.
  • For example, the user may set, by using a setting button 702, a keyword for triggering the voice control function, for example, "Xiao E" or "Pay". The user may manage a voiceprint model of the authorized user by using a setting button 703, for example, add or delete the voiceprint model of the authorized user. The user may also set, by using a setting button 704, an operation instruction that can be supported by the voice assistant, for example, payment, making a call, or ordering a meal. In this way, the user can obtain customized voice control experience.
  • an embodiment of this application discloses a terminal.
  • the terminal is configured to implement the methods recorded in the foregoing method embodiments, and the terminal includes a connection unit 801 , an obtaining unit 802 , a recognition unit 803 , an authentication unit 804 , and an execution unit 805 .
  • the connection unit 801 is configured to support the terminal in performing the process S 401 in FIG. 4 and the process S 601 in FIG. 6 .
  • the obtaining unit 802 supports the terminal in performing the process S 404 in FIG. 4 and the process S 606 in FIG. 6 .
  • the recognition unit 803 is configured to support the terminal in performing the process S 405 in FIG. 4 .
  • the authentication unit 804 is configured to support the terminal in performing the process S 406 in FIG. 4 .
  • the execution unit 805 is configured to support the terminal in performing the process S 407 in FIG. 4 and the process S 607 in FIG. 6 . All related content of the steps in the foregoing method embodiments may be cited in function descriptions of the corresponding functional modules. Details are not described herein.
  • an embodiment of this application discloses a wearable device.
  • the wearable device is configured to implement the methods recorded in the foregoing method embodiments, and the wearable device includes: a connection unit 901 , a detection unit 902 , a sending unit 903 , a recognition unit 904 , and an authentication unit 905 .
  • the connection unit 901 is configured to support the wearable device in performing the process S 401 in FIG. 4 and the process S 601 in FIG. 6 .
  • the detection unit 902 is configured to support the wearable device in performing the processes S 402 and S 403 in FIG. 4 and the processes S 602 and S 603 in FIG. 6 .
  • the recognition unit 904 is configured to support the wearable device in performing the process S 604 in FIG. 6 .
  • the authentication unit 905 is configured to support the wearable device in performing the process S 605 in FIG. 6 .
  • the sending unit 903 is configured to support the wearable device in performing the process S 404 in FIG. 4 and the process S 606 in FIG. 6 . All related content of the steps in the foregoing method embodiments may be cited in function descriptions of the corresponding functional modules. Details are not described herein again.
  • an embodiment of this application discloses a terminal.
  • the terminal may include a touchscreen 1001 , where the touchscreen 1001 includes a touch-sensitive surface 1006 and a display screen 1007 , one or more processors 1002 , a memory 1003 , one or more applications (not shown), and one or more computer programs 1004 .
  • the foregoing components may be connected by using one or more communications buses 1005 .
  • the one or more computer programs 1004 are stored in the memory 1003 and configured to be executed by the one or more processors 1002 .
  • the one or more computer programs 1004 include an instruction. The instruction may be used to perform steps in FIG. 4 , FIG. 6 , and corresponding embodiments.
  • Functional units in the embodiments of this application may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
  • the integrated unit may be implemented in a form of hardware, or may be implemented in a form of a software functional unit.
  • the integrated unit When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or some of the steps of the methods described in the embodiments of this application.
  • the foregoing storage medium includes: any medium that can store program code, such as a flash memory, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Telephone Function (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephonic Communication Services (AREA)
US17/256,845 2018-06-29 2018-06-29 Voice Control Method, Wearable Device, and Terminal Abandoned US20210256979A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/093829 WO2020000427A1 (fr) 2018-06-29 2018-06-29 Procédé de commande vocale, appareil pouvant être porté et terminal

Publications (1)

Publication Number Publication Date
US20210256979A1 true US20210256979A1 (en) 2021-08-19

Family

ID=68772588

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/256,845 Abandoned US20210256979A1 (en) 2018-06-29 2018-06-29 Voice Control Method, Wearable Device, and Terminal

Country Status (6)

Country Link
US (1) US20210256979A1 (fr)
EP (1) EP3790006A4 (fr)
KR (1) KR102525294B1 (fr)
CN (2) CN112420035A (fr)
RU (1) RU2763392C1 (fr)
WO (1) WO2020000427A1 (fr)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674749A (zh) * 2021-08-26 2021-11-19 珠海格力电器股份有限公司 一种控制方法、装置、电子设备和存储介质
US20210390959A1 (en) * 2020-06-15 2021-12-16 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method thereof
CN113963683A (zh) * 2020-07-01 2022-01-21 广州汽车集团股份有限公司 一种后备箱开启控制方法及后备箱开启控制系统
US20220172702A1 (en) * 2020-12-02 2022-06-02 National Applied Research Laboratories Method for converting vibration to voice frequency wirelessly
US20220201453A1 (en) * 2019-04-02 2022-06-23 Huawei Technologies Co., Ltd. Service connection establishment method, bluetooth master device, chip, and bluetooth system
US11393449B1 (en) * 2021-03-25 2022-07-19 Cirrus Logic, Inc. Methods and apparatus for obtaining biometric data
US11393206B2 (en) * 2018-03-13 2022-07-19 Tencent Technology (Shenzhen) Company Limited Image recognition method and apparatus, terminal, and storage medium
US11757871B1 (en) * 2021-07-13 2023-09-12 T-Mobile Usa, Inc. Voice command security and authorization in user computing devices
WO2023172936A1 (fr) * 2022-03-08 2023-09-14 University Of Houston System Systèmes et appareil d'authentification multifactorielle par conduction osseuse et signaux audio
US12014741B2 (en) 2021-02-23 2024-06-18 Samsung Electronics Co., Ltd. Electronic device and controlling method thereof

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111210823B (zh) * 2019-12-25 2022-08-26 秒针信息技术有限公司 收音设备检测方法和装置
CN111131966B (zh) * 2019-12-26 2022-05-20 上海传英信息技术有限公司 模式控制方法、耳机系统及计算机可读存储介质
CN111613229A (zh) * 2020-05-13 2020-09-01 深圳康佳电子科技有限公司 一种电视音箱的声纹控制方法、存储介质及智能电视
CN113823288A (zh) * 2020-06-16 2021-12-21 华为技术有限公司 一种语音唤醒的方法、电子设备、可穿戴设备和系统
CN113830026A (zh) * 2020-06-24 2021-12-24 华为技术有限公司 一种设备控制方法及计算机可读存储介质
CN111683358A (zh) * 2020-07-06 2020-09-18 Oppo(重庆)智能科技有限公司 控制方法、装置、移动终端、存储介质和无线耳机
CN111627450A (zh) * 2020-07-28 2020-09-04 南京新研协同定位导航研究院有限公司 一种mr眼镜的延长续航系统及其续航方法
CN112259124B (zh) * 2020-10-21 2021-06-15 交互未来(北京)科技有限公司 基于音频频域特征的对话过程捂嘴手势识别方法
CN112133313A (zh) * 2020-10-21 2020-12-25 交互未来(北京)科技有限公司 基于单耳机语音对话过程捂嘴手势的识别方法
CN112259097A (zh) * 2020-10-27 2021-01-22 深圳康佳电子科技有限公司 一种语音识别的控制方法和计算机设备
CN114553289A (zh) * 2020-11-27 2022-05-27 成都立扬信息技术有限公司 一种基于北斗通信的通信终端
CN112530430A (zh) * 2020-11-30 2021-03-19 北京百度网讯科技有限公司 车载操作系统控制方法、装置、耳机、终端及存储介质
WO2022147905A1 (fr) * 2021-01-11 2022-07-14 深圳市韶音科技有限公司 Procédé d'optimisation d'un état de fonctionnement d'écouteurs à conduction osseuse
CN112951243A (zh) * 2021-02-07 2021-06-11 深圳市汇顶科技股份有限公司 语音唤醒方法、装置、芯片、电子设备及存储介质
CN115132212A (zh) * 2021-03-24 2022-09-30 华为技术有限公司 一种语音控制方法和装置
CN115484347B (zh) * 2021-05-31 2024-06-11 华为技术有限公司 一种语音控制方法、电子设备、芯片系统及存储介质
CN113329356B (zh) * 2021-06-02 2022-06-03 中国工商银行股份有限公司 切换接听方式的方法、装置、电子设备及介质
CN113298507B (zh) * 2021-06-15 2023-08-22 英华达(上海)科技有限公司 支付验证方法、系统、电子设备和存储介质
CN113612738B (zh) * 2021-07-20 2023-05-16 深圳市展韵科技有限公司 声纹实时鉴权加密的方法、声纹鉴权设备及受控设备
CN113559511A (zh) * 2021-07-26 2021-10-29 歌尔科技有限公司 控制方法、游戏装置、计算机程序产品及可读存储介质
CN113409819B (zh) * 2021-08-19 2022-01-25 中国空气动力研究与发展中心低速空气动力研究所 一种基于听觉谱特征提取的直升机声信号识别方法
CN114120603B (zh) * 2021-11-26 2023-08-08 歌尔科技有限公司 语音控制方法、耳机和存储介质
CN115273909A (zh) * 2022-07-28 2022-11-01 歌尔科技有限公司 语音活性检测方法、装置、设备及计算机可读存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170193208A1 (en) * 2015-12-30 2017-07-06 Motorola Mobility Llc Multimodal biometric authentication system and method with photoplethysmography (ppg) bulk absorption biometric
US20170193207A1 (en) * 2015-12-30 2017-07-06 Motorola Mobility Llc Multimodal biometric authentication system and method with photoplethysmography (ppg) bulk absorption biometric
US20180107813A1 (en) * 2016-10-18 2018-04-19 Plantronics, Inc. User Authentication Persistence
US20180113673A1 (en) * 2016-10-20 2018-04-26 Qualcomm Incorporated Systems and methods for in-ear control of remote devices
US20180167745A1 (en) * 2015-06-30 2018-06-14 Essilor International (Compagnie Generale D'optique) A head mounted audio acquisition module
US20180240463A1 (en) * 2017-02-22 2018-08-23 Plantronics, Inc. Enhanced Voiceprint Authentication
US20180293981A1 (en) * 2017-04-07 2018-10-11 Google Inc. Multi-user virtual assistant for verbal device control

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2234727C1 (ru) * 2002-12-04 2004-08-20 Открытое акционерное общество "Корпорация "Фазотрон - научно-исследовательский институт радиостроения" Устройство управления бортовой радиолокационной станцией с альтернативным каналом речевого управления
CN101441869A (zh) * 2007-11-21 2009-05-27 联想(北京)有限公司 语音识别终端用户身份的方法及终端
KR101599533B1 (ko) * 2008-07-29 2016-03-03 엘지전자 주식회사 오디오 신호 처리 방법 및 장치
US9262612B2 (en) * 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
CN103078828A (zh) * 2011-10-25 2013-05-01 上海博路信息技术有限公司 一种云模式的语音鉴权系统
CN103259908B (zh) * 2012-02-15 2017-06-27 联想(北京)有限公司 一种移动终端及其智能控制方法
CN103871419B (zh) * 2012-12-11 2017-05-24 联想(北京)有限公司 一种信息处理方法及电子设备
US9589120B2 (en) * 2013-04-05 2017-03-07 Microsoft Technology Licensing, Llc Behavior based authentication for touch screen devices
CN104348621A (zh) * 2013-08-02 2015-02-11 成都林海电子有限责任公司 一种基于声纹识别的鉴权系统及方法
US9343068B2 (en) * 2013-09-16 2016-05-17 Qualcomm Incorporated Method and apparatus for controlling access to applications having different security levels
CN103838991A (zh) * 2014-02-20 2014-06-04 联想(北京)有限公司 一种信息处理方法及电子设备
CN103730120A (zh) * 2013-12-27 2014-04-16 深圳市亚略特生物识别科技有限公司 电子设备的语音控制方法及系统
CN104050406A (zh) * 2014-07-03 2014-09-17 南昌欧菲生物识别技术有限公司 利用指纹组合进行鉴权的方法及终端设备
KR102246900B1 (ko) * 2014-07-29 2021-04-30 삼성전자주식회사 전자 장치 및 이의 음성 인식 방법
CN105469791A (zh) * 2014-09-04 2016-04-06 中兴通讯股份有限公司 业务处理方法及装置
DK3272101T3 (da) * 2015-03-20 2020-03-02 Aplcomp Oy Audiovisuel associativ autentificeringsfremgangsmåde, tilsvarende system og apparat
CN105096937A (zh) * 2015-05-26 2015-11-25 努比亚技术有限公司 语音数据处理方法及终端
CN106373575B (zh) * 2015-07-23 2020-07-21 阿里巴巴集团控股有限公司 一种用户声纹模型构建方法、装置及系统
US10062388B2 (en) * 2015-10-22 2018-08-28 Motorola Mobility Llc Acoustic and surface vibration authentication
CN105224850A (zh) * 2015-10-24 2016-01-06 北京进化者机器人科技有限公司 组合鉴权方法及智能交互系统
US10446143B2 (en) * 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
CN107231232B (zh) * 2016-03-23 2020-04-28 阿里巴巴集团控股有限公司 一种身份验证方法及装置
GB2545534B (en) * 2016-08-03 2019-11-06 Cirrus Logic Int Semiconductor Ltd Methods and apparatus for authentication in an electronic device
JP2018025855A (ja) * 2016-08-08 2018-02-15 ソニーモバイルコミュニケーションズ株式会社 情報処理サーバ、情報処理装置、情報処理システム、情報処理方法、およびプログラム
CN106847275A (zh) * 2016-12-27 2017-06-13 广东小天才科技有限公司 一种用于控制穿戴设备的方法及穿戴设备
CN106714023B (zh) * 2016-12-27 2019-03-15 广东小天才科技有限公司 一种基于骨传导耳机的语音唤醒方法、系统及骨传导耳机
CN106686494A (zh) * 2016-12-27 2017-05-17 广东小天才科技有限公司 一种可穿戴设备的语音输入控制方法及可穿戴设备
CN107277272A (zh) * 2017-07-25 2017-10-20 深圳市芯中芯科技有限公司 一种基于软件app的蓝牙设备语音交互方法及系统
CN107393535A (zh) * 2017-08-29 2017-11-24 歌尔科技有限公司 一种开启终端语音识别功能的方法、装置、耳机及终端
CN107682553B (zh) * 2017-10-10 2020-06-23 Oppo广东移动通信有限公司 通话信号发送方法、装置、移动终端及存储介质
CN107886957A (zh) * 2017-11-17 2018-04-06 广州势必可赢网络科技有限公司 一种结合声纹识别的语音唤醒方法及装置
CN108062464A (zh) * 2017-11-27 2018-05-22 北京传嘉科技有限公司 基于声纹识别的终端控制方法及系统
CN107863098A (zh) * 2017-12-07 2018-03-30 广州市艾涛普电子有限公司 一种语音识别控制方法和装置

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180167745A1 (en) * 2015-06-30 2018-06-14 Essilor International (Compagnie Generale D'optique) A head mounted audio acquisition module
US20170193208A1 (en) * 2015-12-30 2017-07-06 Motorola Mobility Llc Multimodal biometric authentication system and method with photoplethysmography (ppg) bulk absorption biometric
US20170193207A1 (en) * 2015-12-30 2017-07-06 Motorola Mobility Llc Multimodal biometric authentication system and method with photoplethysmography (ppg) bulk absorption biometric
US20180107813A1 (en) * 2016-10-18 2018-04-19 Plantronics, Inc. User Authentication Persistence
US20180113673A1 (en) * 2016-10-20 2018-04-26 Qualcomm Incorporated Systems and methods for in-ear control of remote devices
US20180240463A1 (en) * 2017-02-22 2018-08-23 Plantronics, Inc. Enhanced Voiceprint Authentication
US20180293981A1 (en) * 2017-04-07 2018-10-11 Google Inc. Multi-user virtual assistant for verbal device control

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11393206B2 (en) * 2018-03-13 2022-07-19 Tencent Technology (Shenzhen) Company Limited Image recognition method and apparatus, terminal, and storage medium
US20220201453A1 (en) * 2019-04-02 2022-06-23 Huawei Technologies Co., Ltd. Service connection establishment method, bluetooth master device, chip, and bluetooth system
US20210390959A1 (en) * 2020-06-15 2021-12-16 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method thereof
US11664033B2 (en) * 2020-06-15 2023-05-30 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method thereof
CN113963683A (zh) * 2020-07-01 2022-01-21 广州汽车集团股份有限公司 一种后备箱开启控制方法及后备箱开启控制系统
US20220172702A1 (en) * 2020-12-02 2022-06-02 National Applied Research Laboratories Method for converting vibration to voice frequency wirelessly
US11699428B2 (en) * 2020-12-02 2023-07-11 National Applied Research Laboratories Method for converting vibration to voice frequency wirelessly
US12014741B2 (en) 2021-02-23 2024-06-18 Samsung Electronics Co., Ltd. Electronic device and controlling method thereof
US11393449B1 (en) * 2021-03-25 2022-07-19 Cirrus Logic, Inc. Methods and apparatus for obtaining biometric data
US20220310057A1 (en) * 2021-03-25 2022-09-29 Cirrus Logic International Semiconductor Ltd. Methods and apparatus for obtaining biometric data
US11710475B2 (en) * 2021-03-25 2023-07-25 Cirrus Logic, Inc. Methods and apparatus for obtaining biometric data
US11757871B1 (en) * 2021-07-13 2023-09-12 T-Mobile Usa, Inc. Voice command security and authorization in user computing devices
CN113674749A (zh) * 2021-08-26 2021-11-19 珠海格力电器股份有限公司 一种控制方法、装置、电子设备和存储介质
WO2023172936A1 (fr) * 2022-03-08 2023-09-14 University Of Houston System Systèmes et appareil d'authentification multifactorielle par conduction osseuse et signaux audio
WO2023172937A1 (fr) * 2022-03-08 2023-09-14 University Of Houston System Procédé d'authentification multifactorielle utilisant la conduction osseuse et des signaux audio

Also Published As

Publication number Publication date
EP3790006A1 (fr) 2021-03-10
KR102525294B1 (ko) 2023-04-24
CN110574103B (zh) 2020-10-23
EP3790006A4 (fr) 2021-06-09
KR20210015917A (ko) 2021-02-10
RU2763392C1 (ru) 2021-12-28
CN110574103A (zh) 2019-12-13
CN112420035A (zh) 2021-02-26
WO2020000427A1 (fr) 2020-01-02

Similar Documents

Publication Title
KR102525294B1 (ko) Voice control method, wearable device, and terminal
US20140310764A1 (en) Method and apparatus for providing user authentication and identification based on gestures
WO2019011109A1 (fr) Permission control method and related product
WO2020088483A1 (fr) Audio control method and electronic device
CN109074171B (zh) Input method and electronic device
WO2018133282A1 (fr) Dynamic recognition method and terminal device
EP3499853B1 (fr) PPG authentication method and device
WO2019218843A1 (fr) Key configuration method and apparatus, mobile terminal, and storage medium
WO2021052306A1 (fr) Voiceprint feature registration
US11601806B2 (en) Device, computer program and method
AU2019211885B2 (en) Authentication window display method and apparatus
CN111477225A (zh) Voice control method and apparatus, electronic device, and storage medium
WO2019011108A1 (fr) Iris recognition method and related product
US20240013789A1 (en) Voice control method and apparatus
WO2019019837A1 (fr) Biometric identification method and related product
CN111681655A (zh) Voice control method and apparatus, electronic device, and storage medium
CN111341317B (zh) Wake-up audio data evaluation method and apparatus, electronic device, and medium
CN111652624A (zh) Ticket purchase processing method, ticket check processing method, apparatus, device, and storage medium
CN109032008A (zh) Sound production control method and apparatus, and electronic apparatus
CN109116982A (zh) Information playing method and apparatus, and electronic apparatus
CN111028846B (zh) Method and apparatus for wake-up-word-free registration
CN108650398B (zh) Task execution method and mobile terminal
KR20230002728A (ko) Information processing method and electronic device
WO2019128430A1 (fr) Method, apparatus, and device for determining bandwidth, and storage medium
US10425819B2 (en) Apparatus and method for controlling outbound communication

Legal Events

Date Code Title Description
AS Assignment

Owner name: HUAWEI TECHNOLOGIES CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHANG, LONG;LI, CHUNJIAN;QIU, CHUNSHOU;AND OTHERS;REEL/FRAME:055018/0041

Effective date: 20210122

AS Assignment

Owner name: HUAWEI TECHNOLOGIES CO., LTD., CHINA

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE THIRD INVENTOR'S NAME PREVIOUSLY RECORDED AT REEL: 55018 FRAME: 041. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNORS:ZHANG, LONG;LI, CHUNJIAN;QIU, CUNSHOU;AND OTHERS;REEL/FRAME:055672/0009

Effective date: 20210122

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION