CN108540357B - Voice control method and device and sound equipment - Google Patents

Publication number
CN108540357B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810289522.6A
Other languages
Chinese (zh)
Other versions
CN108540357A (en
Inventor
王声平
张立新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Waterward Information Co Ltd
Original Assignee
Shenzhen Water World Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shenzhen Water World Co Ltd
Priority to CN201810289522.6A
Priority to PCT/CN2018/082196 (WO2019184006A1)
Publication of CN108540357A
Application granted
Publication of CN108540357B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • H04L12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • H04L12/282 Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Lock And Its Accessories (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a voice control method, a voice control device, and sound equipment. The method comprises: detecting whether a permission opening condition is met; entering a permission open mode when the permission opening condition is met; and, in the permission open mode, responding to voice instructions sent by unauthorized users. An unauthorized user can thus control the equipment by voice under specific conditions, which removes the cumbersome process of authorizing a temporary user (such as a visitor) and grants temporary access quickly. The flexibility of equipment control and the user experience are improved while the privacy and security of equipment control are preserved.

Description

Voice control method and device and sound equipment
Technical Field
The invention relates to the technical field of smart homes, and in particular to a voice control method, a voice control device, and sound equipment.
Background
At present, voice control is widely used in smart homes, intelligent robot interaction, mobile terminals, and similar fields. A user only needs to issue a voice instruction to control an electronic device, which frees the user's hands and is convenient and fast.
Taking sound equipment as an example, the user can turn the equipment on and off, adjust the volume, change songs, and so on through voice instructions. However, voice control technology executes an instruction by parsing the content of the user's speech, so anyone who speaks the same content can control the sound equipment, and privacy and security are poor.
To solve this problem, the prior art provides sound equipment that responds only to voice instructions from authorized users, so that an unauthorized user cannot control it, which improves the privacy and security of the equipment. However, when a visitor temporarily needs to control the equipment, the host must authorize the visitor by adding the visitor as an authorized user, and must delete that authorization once the visitor has finished, so the whole process is cumbersome.
Therefore, how to improve the flexibility of equipment control while ensuring the privacy and security of equipment control is a technical problem that currently needs to be solved.
Disclosure of Invention
The main object of the invention is to provide a voice control method, a voice control device, and sound equipment that improve the flexibility of equipment control while ensuring the privacy and security of equipment control.
To achieve the above object, an embodiment of the present invention provides a voice control method, including:
detecting whether a permission opening condition is met;
when the permission opening condition is met, entering a permission open mode;
and responding, in the permission open mode, to voice instructions sent by unauthorized users.
Optionally, the step of detecting whether the permission opening condition is satisfied includes:
detecting whether a permission opening instruction sent by an authorized user is received;
and when a permission opening instruction sent by an authorized user is received, judging that the permission opening condition is met.
Optionally, the step of detecting whether a permission opening instruction sent by an authorized user is received includes:
when a voice instruction is received, judging whether the voice instruction is sent by an authorized user;
when the voice instruction is sent by an authorized user, judging whether the voice instruction is a permission opening instruction;
and when the voice instruction is a permission opening instruction, judging that a permission opening instruction sent by an authorized user is received.
Optionally, the step of detecting whether the permission opening condition is satisfied includes:
detecting whether an authorized user is on site;
and when an authorized user is on site, judging that the permission opening condition is met.
Optionally, the step of detecting whether an authorized user is on site includes:
collecting on-site sound information;
judging whether the sound information contains voice information of an authorized user;
and when the sound information contains voice information of an authorized user, judging that the authorized user is on site.
Optionally, the step of judging whether the sound information contains voice information of an authorized user includes:
detecting whether the sound information contains voice information;
when the sound information contains voice information, extracting voiceprint features of the voice information;
judging whether the voiceprint features of the voice information match the pre-stored voiceprint features;
and when the voiceprint features of the voice information match the pre-stored voiceprint features, judging that the sound information contains voice information of an authorized user.
Optionally, the step of detecting whether an authorized user is on site includes:
collecting on-site image information;
judging whether the image information contains an image of an authorized user;
and when the image information contains an image of an authorized user, judging that the authorized user is on site.
Optionally, the step of determining whether the voice command is issued by an authorized user includes:
extracting the voiceprint characteristics of the voice command;
judging whether the voiceprint features of the voice command are matched with the prestored voiceprint features;
and when the voiceprint characteristics of the voice command are matched with the pre-stored voiceprint characteristics, judging that the voice command is sent by an authorized user.
Optionally, after the step of entering the permission open mode, the method further includes: exiting the permission open mode when the authorized user is no longer on site.
Optionally, after the step of entering the permission open mode, the method further includes: exiting the permission open mode when an instruction to exit the permission open mode sent by the authorized user is received.
The embodiment of the invention also provides a voice control device, which comprises:
the condition detection module is used for detecting whether the permission opening condition is met;
the mode switching module is used for entering the permission opening mode when the permission opening condition is met;
and the instruction response module is used for responding to the voice instruction sent by the unauthorized user in the permission open mode.
Optionally, the condition detecting module includes:
the instruction detection unit is used for detecting whether a permission opening instruction sent by an authorized user is received;
the first judgment unit is used for judging that the permission opening condition is met when a permission opening instruction sent by an authorized user is received.
Optionally, the instruction detection unit includes:
the user judging unit is used for judging whether the voice instruction is sent by an authorized user when the voice instruction is received;
the instruction judging unit is used for judging whether the voice instruction is a permission opening instruction when the voice instruction is sent by an authorized user;
and the first judging unit is used for judging that a permission opening instruction sent by an authorized user is received when the voice instruction is a permission opening instruction.
Optionally, the condition detection module includes:
the on-site detection unit is used for detecting whether an authorized user is on site;
and the second judgment unit is used for judging that the permission opening condition is met when an authorized user is on site.
Optionally, the on-site detection unit comprises:
the voice acquisition unit is used for collecting on-site sound information;
the voice judging unit is used for judging whether the sound information contains voice information of an authorized user;
and the second judging unit is used for judging that the authorized user is on site when the sound information contains voice information of the authorized user.
Optionally, the voice judging unit includes:
the voice detection subunit is used for detecting whether the sound information contains voice information;
the first extraction subunit is used for extracting the voiceprint features of the voice information when the sound information contains voice information;
the first matching subunit is used for judging whether the voiceprint features of the voice information match the pre-stored voiceprint features;
and the first judging subunit is used for judging that the sound information contains voice information of the authorized user when the voiceprint features of the voice information match the pre-stored voiceprint features.
Optionally, the on-site detection unit comprises:
the image acquisition unit is used for acquiring on-site image information;
the image judging unit is used for judging whether the image information contains image information of an authorized user;
and the third judging unit is used for judging that the authorized user is on site when the image information contains image information of the authorized user.
Optionally, the user judging unit includes:
the second extraction subunit is used for extracting the voiceprint features of the voice command;
the second matching subunit is used for judging whether the voiceprint features of the voice command are matched with the prestored voiceprint features;
and the second judging subunit is used for judging that the voice command is sent by an authorized user when the voiceprint characteristics of the voice command are matched with the pre-stored voiceprint characteristics.
Optionally, the apparatus further comprises an exit module configured to: and when the authorized user is not on site, exiting the permission open mode.
Optionally, the apparatus further comprises an exit module configured to: and when receiving an exit permission opening instruction sent by the authorized user, exiting the permission opening mode.
An embodiment of the present invention further provides an audio device, which includes a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, where the application program is configured to execute the aforementioned voice control method.
In the voice control method provided by the embodiment of the invention, a permission open mode is provided: the equipment enters the permission open mode when the permission opening condition is met, and responds to voice instructions sent by unauthorized users while in that mode. An unauthorized user can therefore control the equipment by voice under specific conditions, which removes the cumbersome process of authorizing a temporary user (such as a visitor) and grants temporary access quickly. The flexibility of equipment control and the user experience are improved while the privacy and security of equipment control are preserved.
Drawings
FIG. 1 is a flow chart of one embodiment of a voice control method of the present invention;
FIG. 2 is a block diagram of a first embodiment of the voice control apparatus of the present invention;
FIG. 3 is a block schematic diagram of the condition detection module of FIG. 2;
FIG. 4 is a block schematic diagram of the instruction detection unit of FIG. 3;
FIG. 5 is a block diagram of the user judging unit of FIG. 4;
FIG. 6 is a further block diagram of the condition detection module of FIG. 2;
FIG. 7 is a block schematic diagram of the on-site detection unit of FIG. 6;
FIG. 8 is a block diagram of the voice judging unit of FIG. 7;
FIG. 9 is a further block schematic diagram of the on-site detection unit of FIG. 6;
FIG. 10 is a block diagram of the image judging unit of FIG. 9;
FIG. 11 is a block diagram of a voice control apparatus according to a second embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The voice control method and the voice control device of the embodiments of the invention can be applied to various electronic devices, including smart home devices (such as smart speakers, smart televisions, smart refrigerators, and smart air conditioners), artificial intelligence devices (such as robots), mobile terminals (such as mobile phones and tablets), and computer terminals (such as personal computers and notebook computers). The following description takes application to sound equipment (a smart speaker) as an example.
Referring to FIG. 1, an embodiment of the voice control method of the present invention is provided. The method includes the following steps:
S11: detecting whether the permission opening condition is met. When the permission opening condition is met, the flow proceeds to step S12.
S12: entering the permission open mode.
S13: responding, in the permission open mode, to voice instructions sent by unauthorized users.
In the embodiment of the invention, the permission open mode is added for the sound equipment. The audio device is normally in a non-permission open mode, and enters the permission open mode only when it is detected that the permission open condition is satisfied. In the non-permission open mode, the sound equipment only responds to the voice instruction sent by the authorized user and does not respond to the voice instruction sent by the unauthorized user, namely, the unauthorized user cannot perform voice control on the sound equipment. In the permission open mode, the sound equipment not only responds to the voice command sent by the authorized user, but also responds to the voice command sent by the unauthorized user, namely anyone can carry out voice control on the sound equipment.
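The two modes described above amount to a simple gate on each incoming voice instruction. The following is a minimal sketch of that gate, assuming the speaker has already been classified as authorized or unauthorized; the names `Mode` and `should_respond` are hypothetical and do not come from the patent.

```python
from enum import Enum


class Mode(Enum):
    NON_PERMISSION_OPEN = "non_permission_open"  # default: authorized users only
    PERMISSION_OPEN = "permission_open"          # anyone may control the equipment


def should_respond(mode: Mode, speaker_is_authorized: bool) -> bool:
    """Return True if a voice instruction should be executed in the given mode."""
    if mode is Mode.PERMISSION_OPEN:
        # In the permission open mode, every speaker is served.
        return True
    # Otherwise only authorized users are served.
    return speaker_is_authorized
```

Keeping the gate as a pure function of the mode and the speaker classification keeps the mode-switching logic separate from speaker recognition.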
In the embodiment of the invention, the sound equipment can judge whether the permission opening condition is met by detecting a permission opening instruction, by detecting whether an authorized user is on site, or by similar means.
Optionally, in the non-permission open mode, the sound equipment detects whether a permission opening instruction sent by an authorized user is received. When such an instruction is received, the sound equipment judges that the permission opening condition is met and enters the permission open mode.
Specifically, when a voice instruction is received, the sound equipment judges whether the voice instruction was sent by an authorized user; when it was, the sound equipment judges whether the voice instruction is a permission opening instruction; and when it is, the sound equipment judges that a permission opening instruction sent by an authorized user has been received.
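The ordered checks in the preceding paragraph (speaker first, then instruction content) can be sketched as follows. This is an illustration only: the `VoiceInstruction` type, the `OPEN_PHRASES` set, and the phrase wording are assumptions, since the patent does not specify how a permission opening instruction is worded or recognized.

```python
from dataclasses import dataclass


@dataclass
class VoiceInstruction:
    text: str              # recognized content of the instruction
    from_authorized: bool  # result of the authorized-user check (assumed given)


# Hypothetical trigger phrases; the patent does not specify the wording.
OPEN_PHRASES = {"open permission", "guest mode on"}


def is_permission_opening_instruction(instr: VoiceInstruction) -> bool:
    """Apply the two checks in order: speaker first, then instruction content."""
    if not instr.from_authorized:
        # An unauthorized speaker can never open permissions.
        return False
    return instr.text.strip().lower() in OPEN_PHRASES
```

Checking the speaker before the content means an unauthorized user repeating the trigger phrase has no effect.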
For the judgment of the authorized user, the biological characteristic information of the user can be collected, and whether the user is the authorized user is judged according to the biological characteristic information, wherein the biological characteristic information can be a voiceprint characteristic, a fingerprint characteristic, an iris characteristic, a sclera characteristic, a face characteristic and the like.
Taking voiceprint features as an example, the sound device may collect and store voiceprint features of authorized users in advance. When a voice instruction sent by a user is received, the sound equipment extracts the voiceprint features of the voice instruction, judges whether the voiceprint features of the voice instruction are matched with the pre-stored voiceprint features, judges that the voice instruction is sent by an authorized user when the voiceprint features of the voice instruction are matched with the pre-stored voiceprint features, and judges that the voice instruction is sent by an unauthorized user otherwise. The pre-stored voiceprint characteristics can be one or at least two, i.e. there can be one authorized user or at least two authorized users. When at least two voiceprint features are prestored, the voiceprint feature of the voice instruction only needs to be matched with one of the prestored voiceprint features.
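The rule that an instruction's voiceprint only needs to match one of several pre-stored voiceprints can be sketched as below. The sketch assumes voiceprint features are fixed-length numeric vectors compared by cosine similarity against a threshold; the patent does not name a feature representation or a matching metric, so `cosine_similarity`, `is_authorized`, and the `0.8` threshold are all illustrative choices.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_authorized(
    voiceprint: Sequence[float],
    enrolled: Sequence[Sequence[float]],
    threshold: float = 0.8,  # assumed value; a real system would tune this
) -> bool:
    """True if the voiceprint matches at least one pre-stored voiceprint."""
    return any(cosine_similarity(voiceprint, e) >= threshold for e in enrolled)
```

With an empty list of pre-stored voiceprints, `is_authorized` returns `False`, so no speaker is treated as authorized until at least one user is enrolled.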
In this scheme, while in the permission open mode, the sound equipment exits the permission open mode and returns to the non-permission open mode when it receives an instruction to exit the permission open mode sent by an authorized user; or the sound equipment automatically exits the permission open mode once the mode has lasted for a preset time; or the sound equipment automatically exits the permission open mode when no voice instruction has been received for a preset time; and so on.
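The exit conditions listed above can be sketched as a single check. The patent presents them as alternatives; combining all three in one `OpenSession` class is a design choice made here for illustration, and the class name, field names, and default durations are assumptions.

```python
from dataclasses import dataclass


@dataclass
class OpenSession:
    entered_at: float             # time the permission open mode was entered (seconds)
    last_instruction_at: float    # time of the most recent voice instruction (seconds)
    max_duration: float = 3600.0  # assumed preset cap on how long the mode may last
    idle_timeout: float = 600.0   # assumed preset limit on time without instructions

    def should_exit(self, now: float, exit_requested_by_authorized: bool = False) -> bool:
        """Return True if any of the three exit conditions holds."""
        if exit_requested_by_authorized:
            return True  # explicit exit instruction from an authorized user
        if now - self.entered_at >= self.max_duration:
            return True  # the mode has lasted for the preset time
        # No voice instruction has been received for the preset time.
        return now - self.last_instruction_at >= self.idle_timeout
```

A caller would update `last_instruction_at` whenever an instruction is handled and poll `should_exit` periodically with the current time.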
Optionally, in the non-permission open mode, the sound device detects whether an authorized user is on site, and when an authorized user is on site, it determines that the permission open condition is satisfied. For the detection of whether the authorized user is on site or not, the sound equipment can perform detection and identification through voice, images and the like.
Take voice-based detection as an example. The sound equipment collects on-site sound information through a microphone, judges whether the sound information contains voice information of an authorized user, and judges that the authorized user is on site when it does.
The sound equipment can collect and store voiceprint features of authorized users in advance. When judging whether the sound information contains voice information of an authorized user, the sound equipment first performs Voice Activity Detection (VAD) on the sound information to detect whether it contains voice information; when the sound information contains voice information, it extracts the voiceprint features of that voice information; it then judges whether those voiceprint features match the pre-stored voiceprint features; and when they match, it judges that the sound information contains voice information of an authorized user. One voiceprint feature or at least two may be pre-stored, that is, there may be one authorized user or at least two. When at least two voiceprint features are pre-stored, the voiceprint features of the voice information only need to match one of them.
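The three-stage pipeline described above (voice activity detection, voiceprint extraction, matching) can be sketched as below. The energy threshold used for `contains_voice` is a crude stand-in for a real VAD algorithm and is an assumption, as are all function names; feature extraction and matching are passed in as callables because the patent does not specify them.

```python
from typing import Callable, Sequence

Samples = Sequence[float]
Feature = Sequence[float]


def contains_voice(samples: Samples, energy_threshold: float = 0.01) -> bool:
    """Crude stand-in for VAD: mean signal energy above an assumed threshold."""
    if not samples:
        return False
    energy = sum(s * s for s in samples) / len(samples)
    return energy > energy_threshold


def authorized_user_present(
    samples: Samples,
    extract: Callable[[Samples], Feature],
    matches: Callable[[Feature], bool],
) -> bool:
    """Run the stages in order, stopping at the first one that fails."""
    if not contains_voice(samples):
        return False  # no voice information, so nothing to extract
    return matches(extract(samples))
```

Running the inexpensive voice check first avoids extracting voiceprint features from silence or background noise.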
Take image-based detection as an example. The sound equipment collects on-site image information through a camera (a panoramic camera or an ordinary camera), judges whether the image information contains an image of an authorized user, and judges that the authorized user is on site when it does.
The sound equipment can collect and store image features of the face images of authorized users in advance. When judging whether the image information contains an image of an authorized user, the sound equipment first detects whether the image information contains a face image; when the image information contains a face image, it extracts the image features of the face image; it then judges whether those image features match the pre-stored image features; and when they match, it judges that the image information contains an image of an authorized user. The image features of one face image or of at least two face images may be pre-stored, that is, there may be one authorized user or at least two. When the image features of at least two face images are pre-stored, the image features of the face image contained in the image information only need to match the image features of one of them.
In this scheme, while in the permission open mode, the sound equipment exits the permission open mode and returns to the non-permission open mode when it detects that the authorized user is no longer on site; or the sound equipment automatically exits the permission open mode once the mode has lasted for a preset time; or the sound equipment automatically exits the permission open mode when no voice instruction has been received for a preset time; and so on.
When detecting whether the authorized user is on site, the sound device can detect through voice, images and the like. For example, when the voice or image of the authorized user is not detected for a preset time, it is determined that the authorized user is not present.
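The absence rule in the preceding paragraph (no voice or image of the authorized user for a preset time) can be sketched as a small tracker. The class name `PresenceTracker`, its methods, and the default timeout are hypothetical.

```python
from typing import Optional


class PresenceTracker:
    """Tracks when an authorized user was last detected by voice or image."""

    def __init__(self, absence_timeout: float = 300.0) -> None:
        self.absence_timeout = absence_timeout  # assumed preset time, in seconds
        self.last_seen: Optional[float] = None  # None until a first detection

    def mark_detected(self, now: float) -> None:
        """Record a voice or image detection of an authorized user."""
        self.last_seen = now

    def is_present(self, now: float) -> bool:
        """The user counts as on site until the timeout elapses without a detection."""
        if self.last_seen is None:
            return False
        return now - self.last_seen < self.absence_timeout
```

Either detector (voice or image) can call `mark_detected`, so the user is treated as on site as long as at least one of them keeps reporting detections.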
In the voice control method provided by the embodiment of the invention, a permission open mode is provided: the equipment enters the permission open mode when the permission opening condition is met, and responds to voice instructions sent by unauthorized users while in that mode. An unauthorized user can therefore control the equipment by voice under specific conditions, which removes the cumbersome process of authorizing a temporary user (such as a visitor) and grants temporary access quickly. The flexibility of equipment control and the user experience are improved while the privacy and security of equipment control are preserved.
Referring to fig. 2, the voice control apparatus of the present invention is proposed, which includes a condition detection module 10, a mode switching module 20, and an instruction response module 30, wherein: the condition detection module 10 is used for detecting whether the permission opening condition is met; the mode switching module 20 is used for entering the permission opening mode when the permission opening condition is met; and the instruction response module 30 is used for responding to a voice instruction sent by an unauthorized user in the permission open mode.
In the embodiment of the invention, a permission open mode is added to the sound equipment. The sound equipment is normally in the non-permission open mode, and the mode switching module 20 switches from the non-permission open mode to the permission open mode only when the condition detection module 10 detects that the permission opening condition is met. In the non-permission open mode, the instruction response module 30 responds only to voice instructions sent by authorized users and does not respond to voice instructions sent by unauthorized users; that is, an unauthorized user cannot control the sound equipment by voice. In the permission open mode, the instruction response module 30 responds to voice instructions sent by both authorized and unauthorized users; that is, anyone can control the sound equipment by voice.
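One way to picture how the three modules cooperate is the sketch below. It is not the patent's implementation: the class `VoiceControlDevice`, the injected `condition_met` callable standing in for the condition detection module 10, and the `executed` list are all assumptions made for illustration.

```python
from typing import Callable, List


class VoiceControlDevice:
    def __init__(self, condition_met: Callable[[], bool]) -> None:
        self.condition_met = condition_met  # stands in for condition detection module 10
        self.permission_open = False        # state managed by mode switching module 20
        self.executed: List[str] = []       # instructions that were acted on

    def update_mode(self) -> None:
        """Mode switching: enter the permission open mode once the condition holds."""
        if not self.permission_open and self.condition_met():
            self.permission_open = True

    def handle(self, instruction: str, speaker_is_authorized: bool) -> bool:
        """Instruction response: execute only if the current mode allows it."""
        self.update_mode()
        if self.permission_open or speaker_is_authorized:
            self.executed.append(instruction)
            return True
        return False
```

Exiting the permission open mode is left out of this sketch; it would reset `permission_open` to `False` under the exit conditions described in the method embodiment.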
In the embodiment of the present invention, the condition detection module 10 may determine whether the permission opening condition is satisfied by detecting the permission opening instruction, detecting whether the authorized user is on site, and the like.
Optionally, as shown in fig. 3, the condition detection module 10 may include an instruction detection unit 11 and a first deciding unit 12, wherein: the instruction detection unit 11 is configured to detect whether a permission opening instruction issued by an authorized user is received; and the first deciding unit 12 is configured to decide that the permission opening condition is satisfied when a permission opening instruction issued by an authorized user is received.
Specifically, as shown in fig. 4, the instruction detection unit 11 includes a user judging unit 111, an instruction judging unit 112, and a first judging unit 113, where: the user judging unit 111 is configured to judge, when a voice instruction is received, whether the voice instruction was issued by an authorized user; the instruction judging unit 112 is configured to judge, when the voice instruction was issued by an authorized user, whether the voice instruction is a permission opening instruction; and the first judging unit 113 is configured to judge, when the voice instruction is a permission opening instruction, that a permission opening instruction issued by an authorized user has been received.
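The order of the two checks (speaker first, instruction content second) can be sketched as follows. The function name and the phrases in `OPEN_PHRASES` are hypothetical; the patent does not specify any wording for the permission opening instruction, and speech-to-text and speaker verification are assumed to have already run.

```python
# Hypothetical phrases; the patent does not define the instruction wording.
OPEN_PHRASES = {"open permission", "enable guest mode"}


def received_permission_opening_instruction(
    text: str, speaker_is_authorized: bool
) -> bool:
    """Return True only for a permission opening instruction from an authorized user."""
    # Role of unit 111: was the voice instruction issued by an authorized user?
    if not speaker_is_authorized:
        return False
    # Role of unit 112: is the instruction a permission opening instruction?
    # Role of unit 113: if so, report that such an instruction was received.
    return text.strip().lower() in OPEN_PHRASES
```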
For the judgment of the authorized user, the user judgment unit 111 may collect biometric information of the user, and judge whether the user is an authorized user according to the biometric information, where the biometric information may be a voiceprint feature, a fingerprint feature, an iris feature, a sclera feature, a face feature, or the like.
Taking voiceprint features as an example, the audio device may collect and store voiceprint features of authorized users in advance. In this case, as shown in fig. 5, the user judging unit 111 includes a second extraction subunit 1111, a second matching subunit 1112, and a second judging subunit 1113, where: the second extraction subunit 1111 is configured to extract a voiceprint feature of the user's voice instruction; the second matching subunit 1112 is configured to determine whether the voiceprint feature of the voice instruction matches a pre-stored voiceprint feature; and the second judging subunit 1113 is configured to judge that the voice instruction was issued by an authorized user when the voiceprint feature of the voice instruction matches a pre-stored voiceprint feature.
The pre-stored voiceprint characteristics can be one or at least two, i.e. there can be one authorized user or at least two authorized users. When at least two voiceprint features are prestored, the voiceprint feature of the voice instruction only needs to be matched with one of the prestored voiceprint features.
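As an illustration of matching against one or several pre-stored voiceprints, the sketch below treats each voiceprint as a numeric feature vector and accepts the speaker if any stored vector is similar enough. The use of cosine similarity, the `threshold` value of 0.8, and both function names are assumptions made for this example; the patent does not state how features are represented or compared.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_authorized_voice(
    feature: Sequence[float],
    enrolled: Sequence[Sequence[float]],
    threshold: float = 0.8,  # assumed value, not taken from the patent
) -> bool:
    # A match with any one pre-stored voiceprint is sufficient.
    return any(cosine_similarity(feature, e) >= threshold for e in enrolled)
```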
Alternatively, as shown in fig. 6, the condition detection module 10 may include an on-site detection unit 13 and a second deciding unit 14, wherein: the on-site detection unit 13 is configured to detect whether an authorized user is on site; and the second deciding unit 14 is configured to decide that the permission opening condition is satisfied when an authorized user is on site.
To detect whether an authorized user is on site, the on-site detection unit 13 may perform recognition by voice, image, and the like.
When detection is performed by voice recognition, as shown in fig. 7, the on-site detection unit 13 includes a sound collecting unit 131, a voice judging unit 132, and a second judging unit 133, in which: the sound collecting unit 131 is configured to collect on-site sound information; the voice judging unit 132 is configured to judge whether the sound information includes voice information of an authorized user; and the second judging unit 133 is configured to judge that an authorized user is on site when the sound information includes voice information of the authorized user.
The audio device may collect and store the voiceprint features of the authorized user in advance. In this case, as shown in fig. 8, the voice judging unit 132 includes a voice detection subunit 1321, a first extracting subunit 1322, a first matching subunit 1323, and a first determining subunit 1324, where: the voice detection subunit 1321 is configured to perform voice activity detection on the sound information and detect whether the sound information includes voice information; the first extracting subunit 1322 is configured to extract a voiceprint feature of the voice information when the sound information includes voice information; the first matching subunit 1323 is configured to determine whether the voiceprint feature of the voice information matches a pre-stored voiceprint feature; and the first determining subunit 1324 is configured to determine that the sound information includes voice information of an authorized user when the voiceprint feature of the voice information matches a pre-stored voiceprint feature.
When detection is performed by image recognition, as shown in fig. 9, the on-site detection unit 13 includes an image acquisition unit 134, an image judging unit 135, and a third judging unit 136, wherein: the image acquisition unit 134 is configured to acquire on-site image information; the image judging unit 135 is configured to judge whether the image information includes image information of an authorized user; and the third judging unit 136 is configured to judge that an authorized user is on site when the image information includes image information of the authorized user.
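The voice-based presence check is a three-stage pipeline: voice activity detection, voiceprint extraction, and matching. A minimal sketch follows, assuming audio arrives as a list of float samples. The energy-based `has_speech` check is a crude stand-in chosen for this example, not the voice activity detection method of the patent, and `extract` and `matches` are hypothetical callables supplied by the caller.

```python
from typing import Callable, Sequence


def has_speech(samples: Sequence[float], energy_threshold: float = 0.01) -> bool:
    """Crude energy-based voice activity check (assumed stand-in for subunit 1321)."""
    if not samples:
        return False
    energy = sum(s * s for s in samples) / len(samples)
    return energy >= energy_threshold


def authorized_user_present(
    samples: Sequence[float],
    extract: Callable[[Sequence[float]], Sequence[float]],
    matches: Callable[[Sequence[float]], bool],
) -> bool:
    # Stage 1: skip extraction entirely when no voice activity is detected.
    if not has_speech(samples):
        return False
    # Stage 2: extract a voiceprint feature from the detected speech.
    feature = extract(samples)
    # Stage 3: compare it with the pre-stored voiceprint features.
    return matches(feature)
```

Running voice activity detection first means the feature extractor is never invoked on silence, which is the ordering the paragraph above describes.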
The audio device may acquire and store image features of a face image of an authorized user in advance. In this case, as shown in fig. 10, the image judging unit 135 includes an image detection subunit 1351, a third extraction subunit 1352, a third matching subunit 1353, and a third judging subunit 1354, where: the image detection subunit 1351 is configured to detect whether the image information includes a face image; the third extraction subunit 1352 is configured to extract image features of the face image when the image information includes a face image; the third matching subunit 1353 is configured to determine whether the image features of the face image match pre-stored image features; and the third judging subunit 1354 is configured to judge that the image information includes image information of an authorized user when the image features of the face image match the pre-stored image features.
The image features of one face image or of at least two face images may be pre-stored; that is, there can be one authorized user or at least two authorized users. When the image features of at least two face images are pre-stored, the image features of the face image contained in the image information only need to match the image features of one of the pre-stored face images.
Further, as shown in fig. 11, in the second embodiment of the voice control apparatus of the present invention, the apparatus further includes an exit module 40, where the exit module 40 is configured to exit the permission open mode when an exit condition is satisfied.
Optionally, when an instruction to exit the permission open mode issued by an authorized user is received, the exit module 40 exits the permission open mode and restores the non-permission open mode.
Optionally, when it is detected that the authorized user is not on site, the exit module 40 exits the permission open mode and restores the non-permission open mode. The exit module 40 may detect whether the authorized user is on site through voice, image, and the like. For example, when no voice or image of the authorized user is detected for a preset time, the exit module 40 determines that the authorized user is not on site.
Alternatively, the exit module 40 automatically exits the permission open mode after the permission open mode lasts for a preset time.
Alternatively, when no voice command is received for a preset time, the exit module 40 automatically exits the permission open mode.
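The four exit options can be viewed as independent checks, any enabled one of which ends the permission open mode. The sketch below is an assumption-laden illustration: `OpenModeSession` and its fields are hypothetical names, timestamps are plain floats in seconds supplied by the caller, and a timeout left as `None` means that option is not used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpenModeSession:
    """Hypothetical bookkeeping for one stay in the permission open mode."""

    entered_at: float
    last_instruction_at: float
    last_authorized_seen_at: float
    max_duration: Optional[float] = None     # mode lasts at most this long
    idle_timeout: Optional[float] = None     # no voice instruction for this long
    absence_timeout: Optional[float] = None  # authorized user undetected this long

    def should_exit(
        self, now: float, exit_instruction_from_authorized: bool = False
    ) -> bool:
        # Option 1: explicit exit instruction from an authorized user.
        if exit_instruction_from_authorized:
            return True
        # Option 2: no voice or image of the authorized user for a preset time.
        if (
            self.absence_timeout is not None
            and now - self.last_authorized_seen_at >= self.absence_timeout
        ):
            return True
        # Option 3: the permission open mode has lasted for a preset time.
        if self.max_duration is not None and now - self.entered_at >= self.max_duration:
            return True
        # Option 4: no voice instruction has been received for a preset time.
        if (
            self.idle_timeout is not None
            and now - self.last_instruction_at >= self.idle_timeout
        ):
            return True
        return False
```

Passing `now` explicitly, rather than reading a clock inside the method, keeps the sketch deterministic and easy to test.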
By providing a permission open mode, the voice control apparatus of the embodiment of the invention enters the permission open mode when the permission opening condition is met and, in that mode, responds to voice instructions issued by unauthorized users. An unauthorized user can therefore control the device by voice under specific conditions, which removes the cumbersome process of authorizing a temporary user and achieves quick temporary authorization. This preserves the privacy and security of device control while improving its flexibility and the user experience.
The invention also proposes an audio device comprising a memory, a processor and at least one application stored in the memory and configured to be executed by the processor, the application being configured for executing the voice control method. The voice control method comprises the following steps: detecting whether a permission opening condition is met; when the permission opening condition is met, entering a permission open mode; and responding to a voice instruction sent by an unauthorized user in the permission open mode. The voice control method described in this embodiment is the voice control method according to the above embodiment of the present invention, and is not described herein again.
Those skilled in the art will appreciate that the present invention includes apparatus for performing one or more of the operations described in the present application. These devices may be specially designed and manufactured for the required purposes, or they may comprise known devices in general-purpose computers. These devices have stored therein computer programs that are selectively activated or reconfigured. Such a computer program may be stored in a device (e.g., computer) readable medium, including, but not limited to, any type of disk including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks, ROMs (Read-Only Memories), RAMs (Random Access Memories), EPROMs (Erasable Programmable Read-Only Memories), EEPROMs (Electrically Erasable Programmable Read-Only Memories), flash memories, magnetic cards, or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a bus. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. Those skilled in the art will appreciate that the computer program instructions may be implemented by a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the features specified in the block or blocks of the block diagrams and/or flowchart illustrations of the present disclosure.
Those of skill in the art will appreciate that various operations, methods, steps in the processes, acts, or solutions discussed in the present application may be alternated, modified, combined, or deleted. Further, various operations, methods, steps in the flows, which have been discussed in the present application, may be interchanged, modified, rearranged, decomposed, combined, or eliminated. Further, steps, measures, schemes in the various operations, methods, procedures disclosed in the prior art and the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (8)

1. A voice control method, comprising the steps of:
detecting whether a permission opening condition is met;
when the permission opening condition is met, entering a permission opening mode;
responding to a voice instruction sent by an unauthorized user in the permission open mode;
under the permission open mode, when detecting that the authorized user is not on site, the sound equipment exits the permission open mode and recovers the non-permission open mode;
when the voice or the image of the authorized user is not detected for the preset time, judging that the authorized user is not on the spot;
the step of detecting whether the permission opening condition is met comprises the following steps:
detecting whether a permission opening instruction sent by an authorized user is received;
and when receiving a permission opening instruction sent by an authorized user, judging that the permission opening condition is met.
2. The voice control method according to claim 1, wherein the step of detecting whether a permission opening instruction issued by an authorized user is received comprises:
when a voice instruction is received, judging whether the voice instruction is sent by an authorized user;
when the voice instruction is sent by an authorized user, judging whether the voice instruction is a permission opening instruction or not;
and when the voice instruction is a permission opening instruction, judging that the permission opening instruction sent by the authorized user is received.
3. The voice control method according to claim 1, wherein the step of detecting whether the permission opening condition is satisfied comprises:
detecting whether an authorized user is on site;
and when an authorized user is on the spot, judging that the permission opening condition is met.
4. The voice control method of claim 3, wherein the step of detecting whether an authorized user is present comprises:
collecting on-site sound information;
judging whether the sound information contains the voice information of an authorized user;
and when the sound information contains the voice information of the authorized user, judging that the authorized user is on the spot.
5. A voice control apparatus, comprising:
the condition detection module is used for detecting whether the permission opening condition is met;
the mode switching module is used for entering the permission opening mode when the permission opening condition is met;
the instruction response module is used for responding to a voice instruction sent by an unauthorized user in the permission open mode;
the exit module is used for exiting the permission open mode and recovering the non-permission open mode when detecting that the authorized user is not in the field, and judging that the authorized user is not in the field when the voice or the image of the authorized user is not detected for the duration of the preset time;
the condition detection module includes:
the instruction detection unit is used for detecting whether a permission opening instruction sent by an authorized user is received;
the first judgment unit is used for judging that the permission opening condition is met when a permission opening instruction sent by an authorized user is received.
6. The voice control apparatus according to claim 5, wherein the instruction detection unit includes:
the user judging unit is used for judging whether the voice instruction is sent out by an authorized user or not when the voice instruction is received;
the instruction judging unit is used for judging whether the voice instruction is a permission opening instruction or not when the voice instruction is sent by an authorized user;
and the first judging unit is used for judging that the permission opening instruction sent by the authorized user is received when the voice instruction is the permission opening instruction.
7. The voice control apparatus of claim 5, wherein the condition detection module comprises:
the on-site detection unit is used for detecting whether an authorized user is on the site;
the second judgment unit is used for judging that the permission opening condition is met when an authorized user is on the site;
the on-site detection unit includes:
the sound collecting unit is used for collecting on-site sound information;
the voice judging unit is used for judging whether the sound information contains voice information of an authorized user;
and the second judging unit is used for judging that the authorized user is on the spot when the voice information of the authorized user is contained in the sound information.
8. An audio device comprising a memory, a processor and at least one application stored in the memory and configured to be executed by the processor, wherein the application is configured to perform the voice control method of any of claims 1 to 4.
CN201810289522.6A 2018-03-30 2018-03-30 Voice control method and device and sound equipment Active CN108540357B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810289522.6A CN108540357B (en) 2018-03-30 2018-03-30 Voice control method and device and sound equipment
PCT/CN2018/082196 WO2019184006A1 (en) 2018-03-30 2018-04-08 Voice control method and apparatus, and audio equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810289522.6A CN108540357B (en) 2018-03-30 2018-03-30 Voice control method and device and sound equipment

Publications (2)

Publication Number Publication Date
CN108540357A CN108540357A (en) 2018-09-14
CN108540357B true CN108540357B (en) 2020-10-09

Family

ID=63482491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810289522.6A Active CN108540357B (en) 2018-03-30 2018-03-30 Voice control method and device and sound equipment

Country Status (2)

Country Link
CN (1) CN108540357B (en)
WO (1) WO2019184006A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109781134A (en) * 2018-12-29 2019-05-21 百度在线网络技术(北京)有限公司 Navigation control method, device, engine end and storage medium
CN110213138A (en) * 2019-04-23 2019-09-06 深圳康佳电子科技有限公司 Intelligent terminal user authentication method, intelligent terminal and storage medium
CN116710889A (en) * 2020-12-30 2023-09-05 深圳市大疆创新科技有限公司 Method, apparatus, and system for operating a device based on voice commands

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9367978B2 (en) * 2013-03-15 2016-06-14 The Chamberlain Group, Inc. Control device access method and apparatus
CN105700389A (en) * 2014-11-27 2016-06-22 青岛海尔智能技术研发有限公司 Smart home natural language control method
US9396598B2 (en) * 2014-10-28 2016-07-19 The Chamberlain Group, Inc. Remote guest access to a secured premises
CN106506442A (en) * 2016-09-14 2017-03-15 上海百芝龙网络科技有限公司 A kind of smart home multi-user identification and its Rights Management System
CN107015481A (en) * 2017-05-31 2017-08-04 苏州远唯景电子科技有限公司 A kind of intelligent voice control blind system with voice identification authentication
CN108390859A (en) * 2018-01-22 2018-08-10 深圳慧安康科技有限公司 A kind of interphone extension intelligent robot

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102497423B (en) * 2011-11-10 2014-11-12 贵阳朗玛信息技术股份有限公司 Method, device and system for playing songs in webpage chat room
CN104123940A (en) * 2014-08-06 2014-10-29 苏州英纳索智能科技有限公司 Voice control system and method based on intelligent home system
CN104902070A (en) * 2015-04-13 2015-09-09 青岛海信移动通信技术股份有限公司 Mobile terminal voice control method and mobile terminal
CN107742069A (en) * 2017-09-18 2018-02-27 广东美的制冷设备有限公司 terminal control method, device and storage medium

Also Published As

Publication number Publication date
CN108540357A (en) 2018-09-14
WO2019184006A1 (en) 2019-10-03

Similar Documents

Publication Publication Date Title
CN110851809B (en) Fingerprint identification method and device and touch screen terminal
CN104866750B (en) Using startup method and apparatus
CN108540357B (en) Voice control method and device and sound equipment
CN104850827B (en) Fingerprint identification method and device
EP3147768A1 (en) Screen interface unlocking method and screen interface unlocking device
CN104778416B (en) A kind of information concealing method and terminal
CN104331651A (en) Fingerprint- and voice recognition-based control system and equipment
CN103577737A (en) Mobile terminal and automatic authority adjusting method thereof
CN101494690A (en) Mobile terminal and unlocking method thereof
EP3357210B1 (en) System and method for person reidentification
CN108734838B (en) Intelligent lock with video-based biological feature verification device
CN105426730A (en) Login authentication processing method and device as well as terminal equipment
CN103856614A (en) Method and device for avoiding error hibernation of mobile terminal
CN103177238A (en) Terminal and user identifying method
CN102110195A (en) Computer system and identification method and device for user
WO2004061818A2 (en) Identification apparatus and method
EP1461781B1 (en) User identification method and device
CN110077361B (en) Vehicle control method and device
CN112509586A (en) Method and device for recognizing voice print of telephone channel
CN111611437A (en) Method and device for preventing face voiceprint verification and replacement attack
CN108768977A (en) A kind of terminal system login method based on speech verification
CN105827810B (en) A kind of communication terminal based on Application on Voiceprint Recognition recovers method and communication terminal
CN104092813A (en) Handset and operation processing method thereof
CN105303092A (en) Identity authentication method and apparatus
CN107025398B (en) A kind of method and terminal device of controlling terminal equipment switch operating state

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220523

Address after: 518000 floor 1, building 3, Dexin Chang wisdom Park, No. 23 Heping Road, Qinghua community, Longhua street, Longhua District, Shenzhen, Guangdong

Patentee after: Shenzhen waterward Information Co.,Ltd.

Address before: 518000, block B, huayuancheng digital building, 1079 Nanhai Avenue, Shekou, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: SHENZHEN WATER WORLD Co.,Ltd.
