CN110727821A

CN110727821A - Method, apparatus, system and computer storage medium for preventing device from being awoken by mistake

Info

Publication number: CN110727821A
Application number: CN201910967844.6A
Authority: CN
Inventors: 金国阳
Original assignee: Shenzhen Hai Yi Zhi Xin Technology Co Ltd
Current assignee: Shenzhen Hai Yi Zhi Xin Technology Co Ltd
Priority date: 2019-10-12
Filing date: 2019-10-12
Publication date: 2020-01-24

Abstract

The invention provides a method, an apparatus, a system and a computer storage medium for preventing a device from being awoken by mistake. The method comprises the following steps: performing wake-up word detection on the audio information being played by the intelligent voice assistant device; determining the position of the wake-up word in the audio information; and controlling the intelligent voice assistant device to turn off the awakened function at that position. As a result, the intelligent voice assistant device can be awakened only when the wake-up word is detected in a human voice, so that the device is prevented from being awakened by mistake, its normal operation is ensured, and the user experience is improved.

Description

Method, apparatus, system and computer storage medium for preventing device from being awoken by mistake

Technical Field

The present application relates to the field of speech processing, and in particular, to a method, an apparatus, a system, and a computer storage medium for preventing a device from being awoken by mistake.

Background

With the development of science and technology, more and more devices support voice control and are becoming voice-enabled and intelligent, so that voice is truly becoming an interface for human-computer interaction. In voice interaction devices, voice wake-up technology is increasingly important and serves as a bridge for communication between people and devices.

Voice wake-up has a wide range of applications, such as robots, mobile phones, wearable devices, smart homes, vehicles and the like. Many voice-enabled devices rely on voice wake-up as the starting point or entrance of human-machine interaction. Typically, after being powered on, a device automatically loads its resources and remains in a sleep state. Then, when the user speaks a specific wake-up word, the device wakes up and switches to an operating state to wait for the user's next instruction.

The device then recognizes the user's instruction and performs the corresponding operation according to the recognition result. For example, after successful recognition, the device may play a corresponding piece of audio to interact with the user. However, if that audio also contains the wake-up word, the device, on hearing the wake-up word, cannot distinguish whether it comes from a human voice or from its own audio. The device's own audio may therefore wake the device by mistake, which disrupts the normal interaction between the user and the device.

Disclosure of Invention

The present invention has been made in view of the above problems.

According to an aspect of the present invention, there is provided a method for preventing a device from being mistakenly woken, the method including:

performing awakening word detection on the audio information which is played by the intelligent voice assistant equipment;

determining the position of the awakening word in the audio information;

controlling the intelligent voice assistant device to turn off the awakened function at the location.

In one implementation, the audio information is played by the intelligent voice assistant device.

In one implementation, the audio information is played by an external device that is directly or indirectly connected to the intelligent voice assistant device.

In one implementation, the external device and the intelligent voice assistant device are connected through a Bluetooth or wireless network.

In one implementation, the audio information is text-to-speech (TTS) speech.

In one implementation, the location includes a starting location and a length of the wake word in the audio information; or the position comprises a starting position and an ending position of the awakening word in the audio information.

In one implementation, the method further comprises:

controlling the intelligent voice assistant device to resume the awakened function after the location.

According to another aspect of the present invention, there is provided an apparatus for preventing a device from being mistakenly woken, the apparatus being configured to perform the steps of the method, the apparatus including:

the detection module is used for carrying out awakening word detection on the audio information which is played by the intelligent voice assistant equipment;

a determining module, configured to determine a position of the wake-up word in the audio information;

a control module to control the intelligent voice assistant device to turn off the awakened function at the location.

In one implementation, the apparatus and the intelligent voice assistant device are in communication connection through a wired or wireless mode.

According to another aspect of the present invention, there is provided a system for preventing a device from being mistakenly awakened, including a memory, a processor and a computer program stored in the memory and running on the processor, wherein the processor implements the steps of the method when executing the computer program.

According to another aspect of the present invention, there is provided a computer storage medium having stored thereon a computer program which, when executed by a computer, performs the steps of the above method.

According to the method for preventing a device from being awoken by mistake provided by the embodiment of the invention, wake-up words are detected while the voice device plays voice information, and the voice device is controlled to refuse to wake up at the positions of those wake-up words. As a result, the intelligent voice assistant device can be awakened only when the wake-up word is detected in a human voice, so that the device is prevented from being awakened by mistake, its normal operation is ensured, and the user experience is improved.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent by describing in more detail embodiments of the present invention with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings, like reference numbers generally represent like parts or steps.

FIG. 1 is a schematic flow chart diagram of a method for preventing a device from being mistakenly awakened according to an embodiment of the invention;

FIG. 2 is a schematic diagram of the connection between a voice device and a peripheral device according to an embodiment of the present invention;

FIG. 3 is a schematic interaction diagram of a method for preventing a device from being mistakenly awakened according to an embodiment of the invention;

FIG. 4 is a schematic block diagram of an apparatus for preventing a device from being mistakenly awakened according to an embodiment of the present invention;

FIG. 5 is a schematic block diagram of a system for preventing a device from being mistakenly woken up according to an embodiment of the present invention.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art, that the present invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the invention.

It is apparent that practice of the invention is not limited to the specific details familiar to those skilled in the art. Preferred embodiments of the invention are described in detail below; however, the invention may have other embodiments in addition to those described and should not be construed as limited to the embodiments set forth herein.

It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention, as the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. When the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "upper", "lower", "front", "rear", "left", "right" and the like as used herein are for purposes of illustration only and are not limiting.

Ordinal words such as "first" and "second" are referred to herein merely as labels, and do not have any other meaning, such as a particular order, etc. Also, for example, the term "first component" does not itself imply the presence of "second component", and the term "second component" does not itself imply the presence of "first component".

In the embodiment of the invention, the intelligent voice assistant device can be awakened when it hears the wake-up word, and can then interact with the user. The wake-up word may be preset: for example, it may be set by the manufacturer and pre-stored in the device's memory; as another example, it may be customized or modified by the user and stored in the device's memory.

The setting of the wake-up word is usually related to the name of the device, which is usually defined by the developer of the device, for example, the wake-up word may be any one of the following: name, name + title, brand + name, hi + name, hello + name, name + hello, my + name, and so forth. Of course, the wake-up word may also be in any other form than the list above, and the wake-up word may also be independent of the name of the device. The wake-up word can be set by the user in a customized manner, for example, the user can set the name of a pet that is kept by the child according to the preference of the user, and the like.

The intelligent voice assistant device may be referred to as an intelligent voice device, an intelligent assistant device, an intelligent voice interaction device, a voice device, etc., among others. In addition, the intelligent voice assistant device may be an internet device and may be communicatively connected to other devices, so that the intelligent voice assistant device may control the other devices communicatively connected thereto according to the user's instructions, for example, turn on/off an air conditioner, turn on/off lights, play/stop music, and the like.

In use, the device is generally in a dormant state (standby state) after being powered on, and it monitors whether a wake-up word is present in the sound it receives. If the wake-up word is heard, the device switches from the dormant state to the working state to interact with the user. For example, the device may play a preset voice, so that the user can speak an instruction for the device to execute after hearing the preset voice. The preset voice may be TTS (Text To Speech) speech, such as "I'm here, master. What are your instructions?". During the interaction with the user, the device may play a response voice. For example, the user says "How is the weather today?" and the device plays the response voice "It is sunny today, with temperatures of 5 to 18 ℃." The response voice may also be TTS speech.

In addition, when the user uses the device, the device may be directly or indirectly connected to other external devices for convenience of use, for example, the device may be directly connected to a bluetooth speaker, a car radio, or the like, or connected to the bluetooth speaker through a mobile terminal (e.g., a smart phone, a wearable device, or the like). At this time, the device may play corresponding voices, such as TTS voices including preset voice, response voice, etc., through the external devices.

When the voice played by the device, or by an external device connected to it, includes the wake-up word, the device may hear the wake-up word but cannot tell whether it was spoken by a person. The device may then be awakened again because it heard the wake-up word, which interrupts the interaction with the user. For example, assume that the wake-up word is "Small A". After the user says "Small A" to wake the device, the user asks "How is the weather today?" The device should play the response voice "Small A has found for the owner: the weather is sunny today, and the temperature is 5 to 18 ℃." However, because the response voice includes "Small A", the device may mistake it for a wake-up word and be awakened again when it hears "Small A" in the response voice. The device then cuts off its own response voice, so that, for example, the user hears only "Small A has found for the owner". The user cannot obtain the desired information, and the user experience is therefore poor.

The current common solution uses a single speech engine: the device takes the content it is playing as a reference sound and acoustically suppresses and filters out the wake-up words in that content. However, when the sound is played not by the device itself but by a peripheral, the single-speech-engine solution does not work.

In order to solve some or all of the above technical problems, embodiments of the present invention provide a method for preventing a device from being awoken by mistake. With this method, the device can be reliably prevented from being awakened whether the voice containing the wake-up word is played by the device itself or by an external device, thereby achieving a good self-wake-up suppression effect.

Specific embodiments of the present invention will now be described in more detail with reference to the accompanying drawings, which illustrate representative embodiments of the invention and do not limit the invention.

The embodiment of the invention provides a method for preventing a device from being awoken by mistake. FIG. 1 is a schematic flow chart of the method, which includes:

S110, performing wake-up word detection on the audio information being played by the intelligent voice assistant device;

S120, determining the position of the wake-up word in the audio information;

S130, controlling the intelligent voice assistant device to turn off the awakened function at the position.
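The three steps can be sketched in a few lines of Python. This is a minimal illustration only, not the patented implementation: the `Token` record (a played phrase plus the frames it occupies), the `VoiceDevice` class, and every function name are hypothetical, and a simple text match stands in for a real acoustic wake-up word detector.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for played audio: a phrase and the frames it spans."""
    text: str
    start_frame: int
    num_frames: int


def detect_wake_word(tokens: List[Token], wake_word: str) -> Optional[Tuple[int, int]]:
    """S110 + S120: return (start_frame, length) of the first wake-up word, or None."""
    for token in tokens:
        if token.text.lower() == wake_word.lower():
            return token.start_frame, token.num_frames
    return None


class VoiceDevice:
    """Minimal model of a device whose awakened function can be switched off."""

    def __init__(self) -> None:
        self.suppressed: Optional[Tuple[int, int]] = None  # (start, length)

    def disable_wake_at(self, start: int, length: int) -> None:
        """S130: turn off the awakened function for the given frame interval."""
        self.suppressed = (start, length)

    def wake_allowed(self, frame: int) -> bool:
        """True if a wake-up word heard at this frame may wake the device."""
        if self.suppressed is None:
            return True
        start, length = self.suppressed
        return not (start <= frame < start + length)


def protect_playback(device: VoiceDevice, tokens: List[Token],
                     wake_word: str) -> Optional[Tuple[int, int]]:
    """Run S110 to S130 once for a piece of playback audio."""
    position = detect_wake_word(tokens, wake_word)
    if position is not None:
        device.disable_wake_at(*position)
    return position
```

For instance, with `tokens = [Token("Small A", 0, 10), Token("has found", 10, 8)]`, calling `protect_playback(VoiceDevice(), tokens, "small a")` yields the position `(0, 10)` and leaves the device refusing to wake during frames 0 to 9.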

For convenience of description, in the embodiment described in conjunction with FIG. 1, the intelligent voice assistant device is simply referred to as a voice device.

Illustratively, the audio information in S110 may be played by the voice device itself, or may be played by a peripheral connected to the voice device. For example, the user may interact directly with the voice device, i.e., the user may speak towards the voice device, and the voice device receives the user's instructions directly. For another example, an external device (peripheral for short), such as a Bluetooth speaker or a radio, may be directly or indirectly connected to the voice device. The connection between the peripheral and the voice device may be Bluetooth or a wireless network, for example Wireless Fidelity (WIFI). It is understood that other connection modes between the peripheral and the voice device are also possible, such as a cable connection, for example a Universal Serial Bus (USB) 2.0 cable, which are not listed here. Referring to FIG. 2, the Bluetooth speaker 20 is a peripheral directly connected to the voice device 10 via Bluetooth, and the radio 30 is a peripheral indirectly connected to the voice device 10 in a wireless manner (e.g., WIFI) via the smart phone 40. For example, the Bluetooth speaker 20 may be one of a True Wireless Stereo (TWS) headset, an Active Noise Cancellation (ANC) headset, and a smart headset; the Bluetooth speaker 20 is not limited to one type and may also be another type of device.

Illustratively, the audio information may be TTS speech.

In S110, the wake-up word may be detected in the same way as, or in a way similar to, prior-art methods for detecting a wake-up word in a human voice heard by the device, which is not limited in the present invention.

For example, in S120, the position of the wake-up word in the audio information may be determined at the same time as the detection, or the position of the wake-up word may be determined after the detection. Illustratively, the position may be a starting position and a length of the wake-up word in the audio information; or, the starting position and the ending position of the awakening word in the audio information; alternatively, other positional forms are possible. Where the position may be in seconds or more precisely in frames. For example, "0, 10" may be used to indicate the position of the wake-up word in the audio information, specifically, the start position of the wake-up word in the audio information is frame 0, and the total length is 10 frames.
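The two position forms, and the choice between frames and seconds, can be illustrated with a small helper. This is a sketch under stated assumptions: the `WakeWordPosition` class and its methods are hypothetical, the end position is treated as exclusive, and the 10 ms frame duration is an assumed value that the description does not specify.

```python
from dataclasses import dataclass
from typing import Tuple

FRAME_MS = 10  # assumed frame duration in milliseconds; not fixed by the description


@dataclass(frozen=True)
class WakeWordPosition:
    """Position stored as start frame plus length in frames."""
    start_frame: int
    num_frames: int

    @classmethod
    def from_start_end(cls, start_frame: int, end_frame: int) -> "WakeWordPosition":
        """Build from the start/end form (end frame exclusive, by assumption)."""
        if end_frame < start_frame:
            raise ValueError("end_frame must not precede start_frame")
        return cls(start_frame, end_frame - start_frame)

    @classmethod
    def parse(cls, text: str) -> "WakeWordPosition":
        """Parse the "start, length" notation, e.g. "0, 10"."""
        start, length = (int(part) for part in text.split(","))
        return cls(start, length)

    @property
    def end_frame(self) -> int:
        return self.start_frame + self.num_frames

    def to_seconds(self, frame_ms: int = FRAME_MS) -> Tuple[float, float]:
        """Return (start, end) in seconds for the assumed frame duration."""
        return (self.start_frame * frame_ms / 1000, self.end_frame * frame_ms / 1000)
```

Under these assumptions, `WakeWordPosition.parse("0, 10")` and `WakeWordPosition.from_start_end(0, 10)` describe the same interval, which would span the first 0.1 seconds of the audio.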

Illustratively, in S130, the voice device refuses to wake up at the position determined in S120. Specifically, within the position interval determined in S120, the voice device turns off its awakened function, and during that interval it does not respond to any wake-up word it hears.

Therefore, for the audio information (such as TTS voice) played by the intelligent voice assistant device, the voice device is not awakened even if that audio information contains the wake-up word. The voice device is thus prevented from being mistakenly awakened by sounds other than the user's voice, and the user experience is improved.

In this embodiment of the present invention, after S130, the method may include: controlling the intelligent voice assistant device to resume the awakened function after the location. Thus, the voice device can resume the awakened function after the location interval determined in S120. That is, after the location interval, the voice device again responds to any wake-up word it hears.

For example, in the above example, the position determined in S120 is denoted as "0, 10", which means that the wake-up word starts at frame 0 of the audio information and has a total length of 10 frames. While the voice information is played, either directly by the voice device or by a peripheral connected to it, the voice device cannot be woken up during the 10-frame interval starting at frame 0, and it can be woken up again after those 10 frames.
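The "0, 10" example can be traced frame by frame with a small gate object. The `WakeGate` class below is a hypothetical sketch, assuming the interval is half-open (frames 0 to 9 suppressed, frame 10 onward enabled again); the description does not say whether the boundary frame itself is included.

```python
class WakeGate:
    """Hypothetical gate that disables waking inside one frame interval."""

    def __init__(self, start_frame: int, num_frames: int) -> None:
        self.start = start_frame
        self.end = start_frame + num_frames  # exclusive, by assumption

    def is_enabled(self, frame: int) -> bool:
        """The awakened function is off inside [start, end) and on elsewhere."""
        return not (self.start <= frame < self.end)

    def should_wake(self, frame: int, wake_word_heard: bool) -> bool:
        """Wake only if a wake-up word was heard while the function is enabled."""
        return wake_word_heard and self.is_enabled(frame)


gate = WakeGate(start_frame=0, num_frames=10)
# Frames 0..9 fall inside the suppressed interval; frame 10 is after it.
timeline = [(frame, gate.is_enabled(frame)) for frame in (0, 9, 10, 11)]
```

A wake-up word heard at frame 3 would be ignored by this gate, while the same word heard at frame 12 would wake the device, which matches the resume behaviour described above.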

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

As an example, an application program for performing an embodiment of the present invention may be installed on the hardware of the intelligent voice assistant device. As another example, an application for performing voice playback and the like, and one or more applications for performing wake-up word detection, may both be installed on the hardware of the intelligent voice assistant device.

As one example, an application program for performing voice playback or the like may be installed on this hardware device of the intelligent voice assistant device, and one or more application programs for performing wake word detection may be installed on another device. The intelligent voice assistant device and another device may be connected and interact in a wired or wireless manner to implement the steps of the above-described method.

It will be appreciated that the steps of the above-described method may be performed by another device connected to the intelligent voice assistant device. As exemplarily shown in FIG. 3, for example, the wake word detection may be performed using another device that includes one or more independent speech recognition engines, and the intelligent voice assistant device may be notified when the wake word is detected, such that the intelligent voice assistant device is not woken up during wake word play. Illustratively, the other device shown in fig. 3 may also be referred to as an independent wake word detection engine, to which the present invention is not limited.

Referring to FIG. 3, the intelligent voice assistant device starts playing voice information and transmits the voice information to another device. The other device performs wake-up word detection and determines the location of the wake-up word in the voice information, then sends that location to the intelligent voice assistant device and instructs it to turn off the awakened function at the determined location, so that the intelligent voice assistant device refuses to wake up at the corresponding location.

Specifically, the embodiment shown in fig. 3 includes:

S21, the intelligent voice assistant device starts playing the voice information.

Illustratively, the voice information may be audio information played by the intelligent voice assistant device itself or played through its peripheral. As an example, it may be TTS speech.

S22, the intelligent voice assistant device transmits the played voice information to another device.

Illustratively, another device may obtain the voice information for wake word detection.

S23, the other device performs wake word detection and determines the location of the wake word in the voice message.

For example, the detection of the wake word may be performed in real time, and as an implementation, a neural network-based detection model may be used, although other existing detection methods may be used, which are not listed here.

For example, the position of the wake-up word in the voice message may be determined at the same time as the detection, or the position thereof may be determined after the detection of the wake-up word.

Illustratively, the location may be represented in any available form, e.g., the location may be the starting location and length of the wake-up word in the voice message; or, the starting position and the ending position of the awakening word in the voice message can be obtained; alternatively, other positions are possible, and are not listed here.

S24, the other device sends a control command to the intelligent voice assistant device, the control command including a wake word location. The control command is used to control the intelligent voice assistant device not to wake up at the wake word position.

For example, the information of the wake-up word position may be carried at a specific field of the control command. Alternatively, the wake word location may be encapsulated as a separate signaling with the control command before being sent.
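One way to carry the wake-up word position in a field of the control command is sketched below. The description does not define a wire format, so the JSON encoding, the `disable_wake` command type, and the `position`, `start_frame` and `num_frames` field names are all assumptions chosen for illustration.

```python
import json
from typing import Tuple


def build_control_command(start_frame: int, num_frames: int) -> bytes:
    """Encode a hypothetical control command with the position in its own field."""
    command = {
        "type": "disable_wake",  # assumed command name
        "position": {"start_frame": start_frame, "num_frames": num_frames},
    }
    return json.dumps(command).encode("utf-8")


def parse_control_command(payload: bytes) -> Tuple[int, int]:
    """Decode the command on the voice device and return (start_frame, num_frames)."""
    command = json.loads(payload.decode("utf-8"))
    if command.get("type") != "disable_wake":
        raise ValueError("unexpected command type")
    position = command["position"]
    return int(position["start_frame"]), int(position["num_frames"])
```

The sending device would call `build_control_command(0, 10)` and the voice device would recover `(0, 10)` with `parse_control_command`. Sending the position as separate signaling would simply mean transmitting the `position` object in its own message alongside the command.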

S25, the intelligent voice assistant device turns off the awakened function.

Illustratively, the intelligent voice assistant device refuses to wake up at the wake-up word position in accordance with the received control command. For example, the awakened function may be turned off at the awakening word position, while the awakened function may be kept on at a position other than the awakening word position of the voice message.

In such embodiments, the intelligent voice assistant device performs wakeup word detection itself, where the detected wakeup word may be a human voice or its own voice; another device connected to the intelligent voice assistant device also performs wake word detection, which detected wake words are the voice of the voice device. When another device detects the wake word, the intelligent voice assistant device is notified to refuse to wake up at the corresponding location. Therefore, the intelligent voice assistant equipment can be awakened only when detecting that the voice comprises the awakening word, so that the intelligent voice assistant equipment is prevented from being awakened by mistake, the function operation of the intelligent voice assistant equipment is ensured, and the user experience is improved.
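On the voice device side, the decision of whether to act on its own wake-up word detections can be sketched as a lookup against the intervals reported by the other device. The `SuppressionSchedule` class below is hypothetical; it assumes positions arrive as (start frame, length) pairs, that several wake-up words may occur in one playback, and that overlapping or adjacent intervals may be merged.

```python
from bisect import bisect_right
from typing import List, Tuple


class SuppressionSchedule:
    """Hypothetical store of frame intervals during which waking is refused."""

    def __init__(self) -> None:
        # Sorted, non-overlapping (start, end) pairs; end is exclusive.
        self._intervals: List[Tuple[int, int]] = []

    def add(self, start: int, length: int) -> None:
        """Record an interval reported by the other device, merging overlaps."""
        new_start, new_end = start, start + length
        kept = []
        for s, e in self._intervals:
            if e < new_start or s > new_end:
                kept.append((s, e))  # disjoint from the new interval
            else:
                new_start, new_end = min(new_start, s), max(new_end, e)
        kept.append((new_start, new_end))
        kept.sort()
        self._intervals = kept

    def accepts_wake(self, frame: int) -> bool:
        """True if a wake-up word detected locally at this frame should wake the device."""
        # Find the last interval whose start is <= frame.
        index = bisect_right(self._intervals, (frame, float("inf"))) - 1
        if index < 0:
            return True
        s, e = self._intervals[index]
        return not (s <= frame < e)
```

With intervals added for frames 0 to 9 and 40 to 54, a local detection at frame 5 would be ignored as the device's own playback, while a detection at frame 20 would be treated as a human voice and wake the device.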

According to another aspect of the present invention, there is provided an apparatus for preventing a device from being awoken by mistake. As shown in FIG. 4, the apparatus comprises:

the detection module 410 is used for performing awakening word detection on the audio information being played by the intelligent voice assistant device;

a determining module 420, configured to determine a position of the wake word in the audio information;

a control module 430 to control the intelligent voice assistant device to turn off the awakened function at the location.

Illustratively, the audio information may be played by the intelligent voice assistant device itself, or may be played by a peripheral connected to the intelligent voice assistant device. For example, the user may interact directly with the intelligent voice assistant device, i.e., the user may speak towards the intelligent voice assistant device, with the instructions of the user being received directly by the intelligent voice assistant device. For another example, an external device (peripheral for short) may be connected to the intelligent voice assistant device, and the peripheral may be directly or indirectly connected to the intelligent voice assistant device, such as a bluetooth speaker, a radio, and the like. The connection mode between the peripheral and the intelligent voice assistant device can be Bluetooth or a wireless network, for example, the wireless network is wireless fidelity (WIFI). It is understood that other connection modes between the peripheral and the intelligent voice assistant device are also possible, for example, a cable line connection, for example, a USB2.0 line connection, etc., which are not listed here.

Illustratively, the audio information may be TTS speech.

The detection module 410 may perform the wakeup word detection in real time, wherein the detection method may be the same as or similar to the method of performing wakeup word detection on the sensed human voice in the prior art, which is not limited in the present invention.

For example, the determination module 420 may determine the location of the wake word in the audio information at the same time as the detection, or may determine its location after the wake word is detected. Illustratively, the position may be a starting position and a length of the wake-up word in the audio information; or, the starting position and the ending position of the awakening word in the audio information; alternatively, other positional forms are possible.

Illustratively, the control module 430 may control the intelligent voice assistant device to turn off the awakened function at the location determined by the determining module 420. In particular, the control module 430 may notify or instruct the intelligent voice assistant device to refuse to wake up during the location interval determined by the determining module 420, during which the intelligent voice assistant device does not respond to any wake-up word it hears.

Additionally, the control module 430 may be further operable to control the intelligent voice assistant device to resume the awakened function after the location determined by the determining module 420.
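The division of work among modules 410, 420 and 430 can be sketched as three small classes. Everything here is an illustrative assumption: the class names, the per-frame boolean flags standing in for a real detector's output, and the `("disable_wake", frame)` and `("resume_wake", frame)` messages passed to a caller-supplied `send` callback.

```python
from typing import Callable, Sequence, Tuple

Message = Tuple[str, int]


class DetectionModule:
    """Module 410: decide whether the playback contains a wake-up word."""

    def detect(self, frame_flags: Sequence[bool]) -> bool:
        # frame_flags[i] is True if frame i belongs to a wake-up word (assumed input).
        return any(frame_flags)


class DeterminingModule:
    """Module 420: locate the first wake-up word as (start_frame, length)."""

    def locate(self, frame_flags: Sequence[bool]) -> Tuple[int, int]:
        flags = list(frame_flags)
        start = flags.index(True)
        length = 0
        for flag in flags[start:]:
            if not flag:
                break
            length += 1
        return start, length


class ControlModule:
    """Module 430: tell the voice device when to disable and resume waking."""

    def __init__(self, send: Callable[[Message], None]) -> None:
        self._send = send

    def apply(self, start: int, length: int) -> None:
        self._send(("disable_wake", start))
        self._send(("resume_wake", start + length))


class Apparatus:
    """Wires the three modules together for one piece of playback audio."""

    def __init__(self, send: Callable[[Message], None]) -> None:
        self.detection = DetectionModule()
        self.determining = DeterminingModule()
        self.control = ControlModule(send)

    def process(self, frame_flags: Sequence[bool]) -> bool:
        if not self.detection.detect(frame_flags):
            return False
        start, length = self.determining.locate(frame_flags)
        self.control.apply(start, length)
        return True
```

Passing `list.append` of an empty list as `send` makes it easy to inspect which messages the control module would issue for a given input.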

According to another aspect of the present invention, there is provided a system for preventing a device from being mistakenly woken, comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the steps of the method are implemented when the computer program is executed by the processor.

In one embodiment of the invention, as shown in FIG. 5, a system for preventing a device from being mistakenly woken includes one or more processors 510, one or more memories 520. Optionally, the system may also include at least one of an input device 530, an output device 540, a communication interface 550, which are interconnected via a bus system 560 and/or other form of connection mechanism (not shown). It should be noted that the components and configuration of the system shown in FIG. 5 are exemplary only, and not limiting, as the system may have other components and configurations as desired.

The processor 510 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the system to perform desired functions. The processor 510 is configured to execute the steps of the method for preventing the device from being mistakenly woken according to the embodiment of the present invention. For example, the processor 510 can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware Finite State Machines (FSMs), Digital Signal Processors (DSPs), or a combination thereof.

The memory 520 is used to store various types of data to support the operation of the system. For example, the memory 520 may include one or more computer program products that may include various forms of computer-readable storage media. The memory may be either volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be Random Access Memory (RAM), which acts as external cache memory. By way of example, but not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).

The input device 530 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.

The output device 540 may output various information (e.g., images or sounds) to an outside (e.g., a user), and may include one or more of a display, a speaker, and the like.

The communication interface 550 is used for wired or wireless communication between the system and other devices. The system may access a wireless network based on a communication standard, such as WiFi, 2G, 3G, 4G, 5G, or a combination thereof. In an exemplary embodiment, the communication interface 550 receives a broadcast signal or broadcast-associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication interface 550 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wide Band (UWB) technology, Bluetooth (BT) technology, and other technologies.

In one embodiment, the method for preventing a device from being mistakenly woken as described above is performed when program code stored in the memory 520 is executed by the processor 510.

Illustratively, the information storage mode may include one of the following storage modes: local storage, database storage, distributed file system storage, and remote storage. The storage service address may include a server IP and a server port.

Illustratively, the above-described access to information may be performed in the form of a stream. For example, access to information may be achieved by transmission of a binary stream.
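By way of illustration only, stream-based access can be sketched as reading a binary stream in fixed-size chunks. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the function name `read_chunks`, the 4096-byte chunk size, and the in-memory `io.BytesIO` stand-in for the stored audio are all hypothetical.

```python
import io
from typing import BinaryIO, Iterator


def read_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks from a binary stream until it is exhausted.

    The chunk size is an assumed value chosen only for illustration.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of stream
            break
        yield chunk


# Hypothetical stand-in for stored audio information: 10000 zero bytes.
audio_stream = io.BytesIO(bytes(10000))
sizes = [len(chunk) for chunk in read_chunks(audio_stream)]
```

Because the consumer only ever holds one chunk at a time, the same loop would apply whether the underlying stream is a local file, a network socket, or another binary source.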

Furthermore, according to an embodiment of the present invention, there is also provided a storage medium on which program instructions are stored. When executed by a computer or a processor, the program instructions perform the corresponding steps of the method for preventing a device from being awoken by mistake according to the embodiment of the present invention, and implement the corresponding modules in the apparatus for preventing a device from being awoken by mistake as shown in FIG. 4 according to the embodiment of the present invention. The storage medium may be a computer-readable storage medium, and may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, or any combination of the above storage media. The computer-readable storage medium can be any combination of one or more computer-readable storage media, e.g., one containing computer-readable program code for randomly generating a sequence of action instructions and another containing computer-readable program code for performing a method of preventing a device from being mistakenly awakened.

In one embodiment, the computer program instructions, when executed by a computer, may implement the functional modules of the apparatus for preventing a device from being awoken by mistake as shown in FIG. 4 according to the embodiment of the present invention, and/or may execute the method for preventing a device from being awoken by mistake according to the embodiment of the present invention.

Furthermore, according to an embodiment of the present invention, a computer program is also provided, which, when executed by a computer or a processor, is configured to perform the corresponding steps of the method for preventing a device from being awoken by mistake as shown in FIG. 1 or FIG. 3 according to an embodiment of the present invention.

Therefore, in the embodiment of the invention, awakening word detection is performed on the audio information that the voice device plays, and the voice device is controlled to refuse to be awakened at the position of the awakening word in that audio information. As a result, the intelligent voice assistant device can be awakened only when it detects that a user's voice, rather than the played audio, includes the awakening word, so that the intelligent voice assistant device is prevented from being awakened by mistake, its normal operation is ensured, and the user experience is improved.
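Illustratively, the flow summarized above (detect the awakening word in the audio to be played, determine its position, and turn off the awakened function at that position) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the names `WakeWordSpan`, `WakeGate`, and `find_wake_word_spans` are hypothetical, positions are assumed to be expressed in seconds, and the word-level timestamps are assumed to come from some upstream detector or from the TTS engine. A simple case-insensitive text match stands in for real awakening word detection.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class WakeWordSpan:
    """Position of an awakening word in the played audio (seconds, assumed unit)."""
    start: float
    end: float

    @classmethod
    def from_start_and_length(cls, start: float, length: float) -> "WakeWordSpan":
        # A position given as start + length converts to start + end.
        return cls(start=start, end=start + length)


def find_wake_word_spans(
    words: List[Tuple[str, float, float]], wake_word: str
) -> List[WakeWordSpan]:
    """Return the spans of every occurrence of wake_word.

    `words` is a list of (text, start, end) tuples; this input format is an
    assumption, and the text comparison is only a stand-in for a detector.
    """
    target = wake_word.lower()
    return [WakeWordSpan(start, end) for text, start, end in words if text.lower() == target]


class WakeGate:
    """Decides whether the awakened function is on at a given playback time."""

    def __init__(self, spans: List[WakeWordSpan]) -> None:
        self._spans = list(spans)

    def wake_enabled(self, playback_time: float) -> bool:
        # The awakened function is turned off while playback is inside any span.
        return not any(s.start <= playback_time <= s.end for s in self._spans)


# Hypothetical transcript of audio the device is about to play.
played_words = [("hello", 0.0, 0.5), ("assistant", 0.5, 1.25), ("news", 1.25, 1.8)]
gate = WakeGate(find_wake_word_spans(played_words, "assistant"))
```

With this gate, a wake-up detection that fires while playback is inside a recorded span would be ignored, while detections at other times would be handled normally. The `from_start_and_length` helper shows that the two position representations (starting position plus length, or starting position plus ending position) carry the same information.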

Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the foregoing illustrative embodiments are merely exemplary and are not intended to limit the scope of the invention thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.

The above describes only specific embodiments of the present invention or explanations thereof, and the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method for preventing a device from being mistakenly woken, the method comprising:

determining the position of the awakening word in the audio information;

2. The method of claim 1, wherein the audio information is played by the intelligent voice assistant device.

3. The method of claim 1, wherein the audio information is played by an external device directly or indirectly connected to the intelligent voice assistant device.

4. The method of claim 3, wherein the external device and the intelligent voice assistant device are connected via Bluetooth or a wireless network.

5. The method of claim 1, wherein the audio information is text-to-speech (TTS) speech.

6. The method of claim 1, wherein the position comprises a starting position and a length of the awakening word in the audio information; or the position comprises a starting position and an ending position of the awakening word in the audio information.

7. The method of any of claims 1 to 5, further comprising:

8. An apparatus for preventing a device from being mistakenly woken up, the apparatus being configured to perform the steps of the method of any one of the preceding claims 1 to 7, the apparatus comprising:

9. A system for preventing a device from being mistakenly woken up, comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.

10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a computer, implements the steps of the method of any of claims 1 to 7.