CN111429917B - Equipment awakening method and terminal equipment - Google Patents

Info

Publication number
CN111429917B
Authority
CN
China
Prior art keywords
wake
voice
word
equipment
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010191577.0A
Other languages
Chinese (zh)
Other versions
CN111429917A (en)
Inventor
陈天峰
冯大航
靳源
常乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing SoundAI Technology Co Ltd
Priority to CN202010191577.0A
Publication of CN111429917A
Application granted
Publication of CN111429917B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/225 Feedback of the input speech
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention provides a device wake-up method and a terminal device. The terminal device is connected to at least one speaker, and the at least one speaker is respectively installed on at least one voice device and respectively covers the sound inlet channel of the microphone of each voice device. The method comprises: receiving a first wake-up voice input by a user; and, when the first wake-up voice matches the first wake-up word of the terminal device, waking up the terminal device, synthesizing a second wake-up voice matching the second wake-up word of a target voice device, and playing the second wake-up voice through the speaker installed on the target voice device so as to wake up the target voice device, wherein the target voice device is at least one of the at least one voice device. Embodiments of the invention allow the target voice device to be woken up indirectly by waking up the terminal device, which is convenient for the user and reduces the burden of memorizing the wake-up words of different devices.

Description

Equipment awakening method and terminal equipment
Technical Field
The present invention relates to the field of speech processing technologies, and in particular, to a device wake-up method and a terminal device.
Background
With the progress and development of technology, smart voice devices, smart home appliances and similar devices have become widespread, and many of them support voice functions. A household is often equipped with several voice-controlled devices, such as a smart speaker, a smart television and a smart air conditioner, and each device may have a different wake-up word. For example, the wake-up word of the smart speaker may be "colleague", the wake-up word of the smart television may be "smart television", and the wake-up word of the smart air conditioner may be "hello-happy". A user can wake up a device only by speaking its corresponding wake-up word; if the wake-up word is wrong, the device cannot be woken up.
In practice, different smart devices may come from different vendors, so wake-up words are difficult to unify. When there are many wake-up words, the user has to memorize the wake-up word of each device, and a device cannot be woken up if the user forgets or confuses its wake-up word. In some public places, such as hotels and companies, there are more smart voice devices and users are unfamiliar with them, so the varied wake-up words of different devices are even more likely to make the devices difficult to use.
Disclosure of Invention
Embodiments of the present invention provide a device wake-up method and a terminal device, to solve the prior-art problem that the differing wake-up words of different devices make the devices difficult for users to use.
In order to solve the technical problems, the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides a device wake-up method applied to a terminal device, where the terminal device is connected to at least one speaker, and the at least one speaker is respectively installed on at least one voice device and respectively covers the sound inlet channel of the microphone of each voice device, where the method includes:
receiving a first wake-up voice input by a user;
and under the condition that the first wake-up voice is matched with the first wake-up word of the terminal equipment, waking up the terminal equipment, synthesizing second wake-up voice matched with a second wake-up word of target voice equipment, and playing the second wake-up voice through a loudspeaker installed on the target voice equipment so as to wake up the target voice equipment through the second wake-up voice, wherein the target voice equipment is at least one of the at least one voice equipment.
Optionally, the synthesizing the second wake up speech matched with the second wake up word of the target speech device includes:
determining a second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device;
and synthesizing a second wake-up voice matched with the second wake-up word.
Optionally, before the receiving the first wake-up voice input by the user, the method further includes:
receiving a wake-up word of each voice device in the at least one voice device input by a user;
storing wake-up words of each of the at least one voice device;
the determining the second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device comprises:
and determining a second wake-up word of the target voice device according to the stored wake-up word of each voice device in the at least one voice device.
Optionally, after the terminal device is awakened, before the second awakening voice matched with the second awakening word of the target voice device is synthesized, the method further includes:
receiving a voice control instruction input by a user;
and determining the corresponding target voice equipment based on the voice control instruction, wherein the target voice equipment is one of the at least one voice equipment capable of responding to the voice control instruction.
Optionally, after receiving the voice control instruction input by the user, before determining the corresponding target voice device based on the voice control instruction, the method further includes:
determining, according to a pre-recorded common control instruction corresponding to the first wake-up word, whether the voice control instruction matches the common control instruction;
the determining, based on the voice control instruction, the corresponding target voice device includes:
and under the condition that the voice control instruction is matched with the common control instruction, determining the corresponding target voice equipment based on the voice control instruction.
Optionally, before the receiving the first wake-up voice input by the user, the method further includes:
receiving a third wake-up word input by a user;
setting the third wake-up word as a first wake-up word of the terminal equipment;
and determining the score of the first wake word and outputting the score.
Optionally, in the case that the first wake-up voice matches the first wake-up word of the terminal device, waking up the terminal device includes:
extracting voiceprint features in the first wake-up speech;
and waking up the terminal equipment under the condition that the first wake-up voice is matched with the first wake-up word of the terminal equipment and the voiceprint characteristic is matched with a preset voiceprint characteristic.
In a second aspect, an embodiment of the present invention provides a terminal device connected to at least one speaker, the at least one speaker being respectively installed on at least one voice device and respectively covering the sound inlet channel of the microphone of each voice device, the terminal device including:
the first receiving module is used for receiving a first wake-up voice input by a user;
the wake-up module is configured to wake up the terminal device when the first wake-up voice is matched with a first wake-up word of the terminal device, synthesize a second wake-up voice matched with a second wake-up word of a target voice device, and play the second wake-up voice through a speaker installed on the target voice device, so as to wake up the target voice device through the second wake-up voice, where the target voice device is at least one of the at least one voice device.
Optionally, the wake-up module includes:
the determining unit is used for determining a second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device;
and the synthesis unit is used for synthesizing the second wake-up voice matched with the second wake-up word.
Optionally, the terminal device further includes:
the second receiving module is used for receiving wake-up words of each voice device in the at least one voice device input by a user;
the storage module is used for storing wake-up words of each voice device in the at least one voice device;
the determining unit is used for determining a second wake-up word of the target voice device according to the stored wake-up word of each voice device in the at least one voice device.
Optionally, the terminal device further includes:
the third receiving module is used for receiving a voice control instruction input by a user;
the first determining module is configured to determine, based on the voice control instruction, the corresponding target voice device, where the target voice device is a voice device capable of responding to the voice control instruction in the at least one voice device.
Optionally, the terminal device further includes:
the second determining module is used for determining whether the voice control instruction is matched with the common control instruction according to a pre-recorded common control instruction corresponding to the first wake-up word;
the first determining module is configured to determine, based on the voice control instruction, the corresponding target voice device if the voice control instruction matches the common control instruction.
Optionally, the terminal device further includes:
the fourth receiving module is used for receiving a third wake-up word input by a user;
the setting module is used for setting the third wake-up word as a first wake-up word of the terminal equipment;
and the third determining module is used for determining the score of the first wake-up word and outputting the score.
Optionally, the wake-up module includes:
the extraction unit is used for extracting voiceprint features in the first wake-up voice;
the wake-up unit is used for waking up the terminal equipment under the condition that the first wake-up voice is matched with a first wake-up word of the terminal equipment and the voiceprint characteristic is matched with a preset voiceprint characteristic.
In a third aspect, an embodiment of the present invention provides a terminal device, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program when executed by the processor implements steps in the device wake-up method described above.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the device wake-up method described above.
In the embodiment of the invention, the terminal equipment is connected with at least one loudspeaker, the at least one loudspeaker is respectively arranged on at least one voice equipment and respectively covers the sound inlet channel of the microphone of each voice equipment, on the basis, the terminal equipment is triggered to synthesize the wake-up voice of the target voice equipment by waking up the terminal equipment, and the synthesized wake-up voice is played by the loudspeaker, so that the aim of waking up the target voice equipment is fulfilled. Therefore, the user does not need to memorize the wake-up words of different voice devices, only needs to memorize the wake-up words of the terminal device, and indirectly achieves the purpose of waking up the target voice device in a mode of waking up the terminal device, so that convenience and quickness are brought to the user, and the burden of memorizing the wake-up words of different devices by the user is reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
FIG. 1 is a flow chart of a device wake-up method provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to FIG. 1, FIG. 1 is a flowchart of a device wake-up method provided by an embodiment of the present invention. The method is applied to a terminal device, where the terminal device is connected to at least one speaker, and the at least one speaker is respectively installed on at least one voice device and respectively covers the sound inlet channel of the microphone of each voice device. As shown in FIG. 1, the method includes the following steps:
Step 101, receiving a first wake-up voice input by a user.
In the embodiment of the invention, the terminal device may be a standalone portable electronic device serving as a dedicated wake-up manager, or it may be a mobile terminal, such as a mobile phone or a wearable device, on which a wake-up manager app is installed for managing the wake-up of other voice devices.
When the terminal device is a mobile terminal, it may be externally connected to one or more speakers, and the data connection between the terminal device and the speakers may be established via Bluetooth, a hotspot, or the like, so that the terminal device can transmit audio data to the connected speakers for playback.
To achieve the purpose of the present invention, namely indirectly waking up other voice devices by waking up the terminal device, the at least one speaker connected to the terminal device needs to be respectively installed on the at least one voice device and respectively cover the sound inlet channel of the microphone of each voice device. Here a voice device is a device that supports functions such as voice wake-up and voice control, typically a smart device such as a smart speaker or a smart home appliance. Because the sound inlet channel of its microphone is covered by the speaker, the voice device can hardly receive external sound through its microphone, that is, it can hardly receive and respond directly to voice instructions input by the user. However, because the speaker sits directly over the sound inlet channel, the audio transmission distance is very short, so the voice device receives the voice played by the speaker well and can respond more quickly and accurately.
In practical applications, when the voice devices, especially a plurality of voice devices, are located in a home, hotel or company, a speaker may be installed on each voice device so that the user can more easily wake up any one of them, with each speaker covering the microphone opening of its voice device. The speaker may be designed to fit the shape of the housing of the voice device, for example square or round with a suitable thickness, and the terminal device may establish a data connection with each of these speakers.
In step 101, the first wake-up speech may be a speech signal input by the user for waking up the terminal device, and in order to ensure that the terminal device is woken up, the first wake-up speech needs to be matched with a wake-up word of the terminal device, for example, the wake-up word of the terminal device is "hello-xiaojun", and the user may input the wake-up speech of "hello-xiaojun".
That is, the terminal device may be in a sleep state under the condition that the wake-up voice of the user is not received, and when the user needs to wake up one or more of the at least one voice device, a first wake-up voice matched with the wake-up word of the terminal device may be input first to wake up the terminal device, so that the voice device that needs to be woken up is woken up through the terminal device.
Optionally, before the step 101, the method further includes:
receiving a third wake-up word input by a user;
setting the third wake-up word as a first wake-up word of the terminal equipment;
and determining the score of the first wake word and outputting the score.
In this embodiment, in order to meet the requirement of the user for flexibly setting the wake-up word and ensure a better wake-up effect, the wake-up word of the terminal device may be set according to the user input, and the wake-up word input by the user may be scored and evaluated, so as to guide the user to input a proper wake-up word with a better wake-up effect.
Specifically, before the user wakes up by using the terminal device, the user may register a wake-up word on the terminal device, that is, the user may input any wake-up word according to his own needs, use habits, personal preferences, etc., and the terminal device may set the wake-up word input by the user, that is, the third wake-up word, as the wake-up word of the terminal device, that is, the first wake-up word, so that the subsequent user may wake up the terminal device by inputting wake-up voice matched with the first wake-up word.
When the wake-up word input by the user is too long, contains too many repeated words, or is hard to pronounce (like a tongue twister), the wake-up effect may be poor, for example with frequent false wake-ups or a low wake-up success rate. To ensure a good wake-up effect, the wake-up word input by the user, namely the first wake-up word, may therefore be scored. Specifically, the score may combine factors such as the length of the first wake-up word and the confusion degree of its state sequence, where the confusion degree may be defined according to the arrangement of words and the number of repeated words in the first wake-up word. For example, a wake-up word consisting entirely of repeated words has a high state-sequence confusion degree, whereas a wake-up word that reads smoothly and has no repetition has a low one. The scoring rule may be that a wake-up word of moderate length with a low state-sequence confusion degree scores high, while a wake-up word that is too long or too short, or that has a high state-sequence confusion degree, scores low.
After scoring the first wake-up word, the score of the first wake-up word can be directly output (e.g., displayed or voice prompted), so that a user can determine whether the first wake-up word is suitable as the wake-up word of the terminal device to continue to use according to the score, if the score is higher, the user does not need to modify the wake-up word of the terminal device, and if the score is lower, the user can modify the wake-up word of the terminal device.
Further, in order to achieve a better prompting effect, a prompting signal may be output when it is determined that the score of the first wake-up word is lower than a preset score, where the prompting signal is used to prompt a user to change the first wake-up word. For example, under the condition that the score of the first wake-up word is determined to be lower than 60 points, a text prompt message or a voice prompt signal can be output to prompt the user that the score of the current wake-up word is lower, and the wake-up word can be replaced to obtain a better wake-up effect.
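The scoring and prompting described above can be illustrated with a minimal sketch. The patent gives no formula, so everything here is an assumption: the ideal length range, the equal weighting of the two penalties, and the use of a simple repeated-character ratio as a stand-in for the state-sequence confusion degree. Only the example threshold of 60 points comes from the text.

```python
from collections import Counter

# Illustrative assumptions, not values from the patent
# (except the example prompt threshold of 60 points).
IDEAL_MIN_LEN = 3
IDEAL_MAX_LEN = 6
PROMPT_THRESHOLD = 60


def _units(wake_word: str) -> list:
    """Split the wake-up word into non-space characters (assumed scoring unit)."""
    return [c for c in wake_word if not c.isspace()]


def repetition_ratio(wake_word: str) -> float:
    """Fraction of characters that repeat an earlier one; a crude confusion proxy."""
    chars = _units(wake_word)
    if not chars:
        return 1.0
    repeated = sum(n - 1 for n in Counter(chars).values())
    return repeated / len(chars)


def length_penalty(wake_word: str) -> float:
    """0.0 inside the assumed ideal length range, rising to 1.0 outside it."""
    n = len(_units(wake_word))
    if IDEAL_MIN_LEN <= n <= IDEAL_MAX_LEN:
        return 0.0
    distance = IDEAL_MIN_LEN - n if n < IDEAL_MIN_LEN else n - IDEAL_MAX_LEN
    return min(1.0, distance / IDEAL_MIN_LEN)


def score_wake_word(wake_word: str) -> int:
    """Combine both penalties into a 0-100 score (higher is better)."""
    penalty = 0.5 * length_penalty(wake_word) + 0.5 * repetition_ratio(wake_word)
    return round(100 * (1.0 - penalty))


def needs_change_prompt(wake_word: str) -> bool:
    """True when the score falls below the prompt threshold."""
    return score_wake_word(wake_word) < PROMPT_THRESHOLD
```

A real implementation would more likely derive the confusion degree from the phonetic or acoustic-model state sequence than from raw characters; the character ratio is used here only to keep the sketch self-contained.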
Step 102, waking up the terminal device and synthesizing a second wake-up voice matched with a second wake-up word of a target voice device under the condition that the first wake-up voice is matched with the first wake-up word of the terminal device, and playing the second wake-up voice through a loudspeaker installed on the target voice device so as to wake up the target voice device through the second wake-up voice, wherein the target voice device is at least one of the at least one voice device.
After the first wake-up voice is received, the first wake-up voice can be subjected to matching verification, namely, whether the first wake-up voice is matched with a first wake-up word of the terminal equipment or not is verified, if the first wake-up voice is matched with the first wake-up word, the terminal equipment can be woken up in response to the first wake-up voice, namely, the terminal equipment enters a wake-up state, wherein the first wake-up word is a predefined wake-up word used for waking up the terminal equipment, and the wake-up word of the terminal equipment can be customized by a user according to self preference or use habit.
In the embodiment of the invention, the purpose of waking up the terminal device by the user is to expect to wake up one or more voice devices in the at least one voice device, so after the terminal device is woken up, a second wake-up voice for waking up the target voice device can be further synthesized through voice synthesis, the second wake-up voice can be transmitted to a loudspeaker installed on the target voice device, the second wake-up voice is played through the loudspeaker, and thus the target voice device can receive the second wake-up voice and respond to the second wake-up voice, and then enter a wake-up state, and thus, the target voice device is successfully woken up.
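As a rough illustration of this flow, the sketch below models the terminal device as a small class that checks the first wake-up word, synthesizes the second wake-up voice for each target, and sends it to the speaker mounted on that target. The names `FakeSpeaker`, `synthesize` and `WakeManager` are hypothetical, the text-to-speech step is a placeholder that returns tagged bytes rather than audio, and wake-word matching is reduced to comparing already-recognized text.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSpeaker:
    """Hypothetical stand-in for a speaker mounted over one device's microphone."""
    device_name: str
    played: list = field(default_factory=list)

    def play(self, audio: bytes) -> None:
        # A real speaker would output sound; here we only record the request.
        self.played.append(audio)


def synthesize(text: str) -> bytes:
    """Placeholder for a text-to-speech engine; returns tagged bytes, not audio."""
    return b"TTS:" + text.encode("utf-8")


@dataclass
class WakeManager:
    first_wake_word: str
    device_wake_words: dict  # device name -> second wake-up word
    speakers: dict           # device name -> FakeSpeaker
    awake: bool = False

    def on_wake_voice(self, recognized_text: str, targets: list) -> list:
        """Wake the terminal on a match, then wake each target via its speaker."""
        if recognized_text.strip().lower() != self.first_wake_word.lower():
            return []  # no match: the terminal stays asleep, nothing is played
        self.awake = True
        woken = []
        for name in targets:
            audio = synthesize(self.device_wake_words[name])
            self.speakers[name].play(audio)
            woken.append(name)
        return woken
```

Because each target is handled separately, the same loop also covers the case where several devices with different wake-up words are woken at once.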
The target voice device may be at least one of the at least one voice device; that is, one voice device or a plurality of voice devices may be woken up at a time. The target voice device may be determined in several different ways. For example, it may be determined from the function indicated by a voice control instruction that the user inputs after the first wake-up voice. Alternatively, wake-up periods may be preset for different voice devices, so that the voice devices whose wake-up period covers the current time are determined as targets. As a further alternative, a plurality of wake-up words (e.g., two) may be preset for the terminal device, each used to wake up a corresponding group of voice devices (e.g., wake-up word 1 wakes up voice devices 1, 2 and 3, and wake-up word 2 wakes up voice devices 4 and 5).
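Two of these selection strategies, wake-word groups and preset wake-up periods, can be sketched as simple lookups. The tables below are hypothetical configuration; the group contents mirror the example in the text, while the time ranges are invented for illustration.

```python
from datetime import time

# Hypothetical configuration: each terminal wake-up word maps to a device group.
WAKE_WORD_GROUPS = {
    "wake word 1": ["device 1", "device 2", "device 3"],
    "wake word 2": ["device 4", "device 5"],
}

# Hypothetical configuration: preset wake-up period (start, end) per device.
WAKE_PERIODS = {
    "device 1": (time(7, 0), time(9, 0)),
    "device 4": (time(18, 0), time(23, 0)),
}


def targets_by_wake_word(wake_word: str) -> list:
    """Return the device group bound to the terminal wake-up word that was spoken."""
    return list(WAKE_WORD_GROUPS.get(wake_word, []))


def targets_by_period(now: time) -> list:
    """Return devices whose preset wake-up period contains the current time."""
    return [
        name
        for name, (start, end) in WAKE_PERIODS.items()
        if start <= now < end
    ]
```

This sketch assumes periods do not cross midnight; a period such as 22:00 to 06:00 would need an extra wrap-around check.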
The wake-up words of the target voice device may be obtained through a mode of user pre-entry, for example, the user may pre-enter the wake-up words of the voice device expected to wake up through the terminal device into the terminal device, or may enter the types of the voice devices into the terminal device, the terminal device searches for the corresponding wake-up words according to the device types, or may scan two-dimensional codes of the voice devices to obtain device type information, and then find out the corresponding wake-up words.
It should be noted that, when a plurality of voice devices need to be awakened at the same time and the awakening words of the plurality of voice devices are not identical, the awakening voices of the corresponding voice devices can be respectively synthesized, and the awakening voices for awakening the corresponding voice devices are respectively played through the corresponding speakers, so that the purpose of awakening the plurality of voice devices at one time is achieved.
Optionally, the synthesizing the second wake up speech matched with the second wake up word of the target speech device includes:
determining a second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device;
and synthesizing a second wake-up voice matched with the second wake-up word.
In this embodiment, in order to ensure that the second wake-up speech of the target speech device is accurately synthesized, the second wake-up word of the target speech device may be determined first to synthesize the second wake-up speech according to the second wake-up word, specifically, the terminal device may obtain the wake-up word of each speech device in the at least one speech device in advance, so that when the wake-up speech of the target speech device needs to be synthesized, the second wake-up word of the target speech device may be directly found from the pre-obtained wake-up words of the speech devices, and then a speech synthesis technology is used to synthesize the second wake-up speech including the second wake-up word.
The wake-up words of each voice device in the at least one voice device may be obtained through a mode of user pre-entry, for example, a user may pre-enter wake-up words of voice devices expected to be awakened up through the terminal device into the terminal device, or may pre-enter the types of the voice devices into the terminal device, the terminal device searches for corresponding wake-up words according to the device types and performs associated storage, or may pre-scan two-dimensional codes of the voice devices to obtain device model information, and then search for corresponding wake-up words and perform associated storage.
Therefore, after the terminal equipment wakes up, the second wake-up word of the target voice equipment can be rapidly determined based on the wake-up word of each voice equipment in the at least one voice equipment, so that the second wake-up voice matched with the second wake-up word can be rapidly synthesized, and the purposes of improving the equipment wake-up speed and further improving the user wake-up experience can be achieved.
Further, before the receiving the first wake-up voice input by the user, the method further includes:
receiving a wake-up word of each voice device in the at least one voice device input by a user;
storing wake-up words of each of the at least one voice device;
the determining the second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device comprises:
and determining a second wake-up word of the target voice device according to the stored wake-up word of each voice device in the at least one voice device.
In this embodiment, to ensure that the terminal device can quickly synthesize the wake-up voice of any one of the at least one voice device during wake-up, and to ensure that the wake-up word of each voice device is accurate, the user may enter the wake-up word of each voice device in advance. That is, before using the terminal device for voice wake-up, the user may input the wake-up word of each voice device in the at least one voice device on the terminal device. Specifically, the user may input the device information (such as a device name or device model) of each voice device together with its wake-up word, or the terminal device may first add each voice device and the user may then input the corresponding wake-up word for each one.
In order to complete the configuration of the terminal device at one time, the wake-up word of each voice device in the at least one voice device may be input together when the wake-up word is registered for the terminal device.
After the user finishes the input, each voice device and its corresponding wake-up word can be stored in association with each other, either locally or in the cloud. When the second wake-up voice of the target voice device needs to be synthesized, the second wake-up word of the target voice device can be looked up quickly among the stored wake-up words of each voice device in the at least one voice device, and the second wake-up voice matching the second wake-up word is then synthesized.
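The associated storage and lookup described above can be illustrated with a minimal sketch. The class name `WakeWordRegistry`, the device names, and the wake-up words below are hypothetical and are not taken from the patent; an in-memory dictionary stands in for the local or cloud storage.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class WakeWordRegistry:
    """Hypothetical helper that stores each voice device with its wake-up word."""

    _words: Dict[str, str] = field(default_factory=dict)

    def register(self, device_name: str, wake_word: str) -> None:
        # Store the device and its wake-up word in association.
        self._words[device_name] = wake_word

    def lookup(self, device_name: str) -> Optional[str]:
        # Return the stored wake-up word, or None if the device is unknown.
        return self._words.get(device_name)


# The user enters the wake-up words in advance (illustrative values).
registry = WakeWordRegistry()
registry.register("smart_speaker", "hello speaker")
registry.register("smart_tv", "hi television")

# Later, the second wake-up word of the target voice device is looked up.
second_wake_word = registry.lookup("smart_tv")
```

A dictionary keyed by device name gives constant-time lookup, which fits the stated goal of finding the second wake-up word quickly; a persistent or cloud-backed store could replace it without changing the interface.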
Optionally, after the terminal device is awakened, before the second awakening voice matched with the second awakening word of the target voice device is synthesized, the method further includes:
receiving a voice control instruction input by a user;
and determining the corresponding target voice equipment based on the voice control instruction, wherein the target voice equipment is one of the at least one voice equipment capable of responding to the voice control instruction.
In this embodiment, to avoid waking up unnecessary voice devices, voice synthesis need not be performed immediately after the terminal device is woken up. Instead, the user may first input a voice control instruction, and the target voice device that the user wants to wake up is then determined by analyzing that instruction. The voice control instruction may be an instruction to control a certain function of the target voice device, such as "raise temperature", "raise volume", or "play music".
After receiving the voice control instruction, the terminal device may determine the corresponding target voice device based on it. Specifically, the terminal device may analyze the function control instruction corresponding to the voice control instruction and combine the result with the functions that each voice device in the at least one voice device supports controlling. That is, the target voice device is a voice device, among the at least one voice device, that is capable of responding to the voice control instruction.
For example, if the user inputs the voice control instruction "raise temperature", it may be determined that the target voice device the user wants to wake up is a smart air conditioner; if the user inputs "switch to a news channel", the target voice device may be determined to be a smart television; and if the user inputs "play music", the target voice device may be determined to be a smart speaker. After the target voice device is awakened, it can respond to the voice control instruction, that is, adjust or start the corresponding function as the instruction indicates.
Therefore, by determining the corresponding target voice equipment based on the voice control instruction input by the user, the voice equipment expected by the user can be accurately awakened, other unnecessary voice equipment can be prevented from being awakened by mistake, and the power consumption of the equipment is further reduced.
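One way to picture this determination is a lookup from the recognized instruction text to the device whose supported functions include it. The sketch below assumes that speech recognition has already produced text, and the table `DEVICE_FUNCTIONS` and the function `determine_target_device` are hypothetical names with illustrative contents; the patent does not prescribe a particular matching technique.

```python
from typing import Dict, Optional, Set

# Hypothetical table of the control functions each voice device supports.
DEVICE_FUNCTIONS: Dict[str, Set[str]] = {
    "smart_air_conditioner": {"raise temperature", "lower temperature"},
    "smart_tv": {"switch to a news channel"},
    "smart_speaker": {"play music"},
}


def determine_target_device(instruction: str) -> Optional[str]:
    """Return the first device able to respond to the instruction, if any."""
    normalized = instruction.strip().lower()
    for device, functions in DEVICE_FUNCTIONS.items():
        if normalized in functions:
            return device
    # No registered device supports this instruction, so nothing is woken up.
    return None
```

For example, `determine_target_device("Play music")` selects the smart speaker, while an instruction that no device supports yields `None`, so no wake-up voice would be synthesized.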
Further, after receiving the voice control instruction input by the user, before determining the corresponding target voice device based on the voice control instruction, the method further includes:
determining whether the voice control instruction is matched with the common control instruction according to a pre-recorded common control instruction corresponding to the first wake-up word;
the determining, based on the voice control instruction, the corresponding target voice device includes:
and under the condition that the voice control instruction is matched with the common control instruction, determining the corresponding target voice equipment based on the voice control instruction.
In this embodiment, after a voice control instruction input by the user is received, it may first be determined whether the instruction matches a common control instruction; only if it matches is the corresponding target voice device determined based on the instruction. This guards against the user inputting a wrong voice control instruction by mistake and against false wake-ups and false responses of the voice devices. The common control instruction may be an instruction, corresponding to the first wake-up word, that the user entered in advance; for example, the user may register the common control instructions together with the wake-up word of the terminal device. Thus, after the terminal device is awakened, the target voice device in the at least one voice device is awakened, and responds to the voice control instruction, only when the received voice control instruction matches a common control instruction.
For example, if the voice control instruction input by the user is "play music", it can be determined that the instruction matches a common control instruction, and the corresponding target voice device can be determined to be a smart speaker based on the instruction; after the smart speaker is awakened, it starts playing music. If the voice control instruction input by the user is "turn high", the instruction does not match any common control instruction because the common control instructions do not include it; the instruction is therefore not responded to, and no voice device is awakened.
Therefore, by pre-entering the common control instruction corresponding to the first wake-up word and matching the voice control instruction, the voice control instruction of the user can be ensured to be responded more accurately, and unnecessary equipment wake-up and voice instruction response are avoided.
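The matching step can be sketched as a simple gate placed in front of target-device selection. The set `COMMON_INSTRUCTIONS` and the function `gate_instruction` are hypothetical, and exact text matching after normalization is an assumption made for illustration; the patent does not state how the match is computed.

```python
from typing import Optional, Set

# Hypothetical common control instructions pre-entered by the user
# together with the first wake-up word.
COMMON_INSTRUCTIONS: Set[str] = {"play music", "raise temperature"}


def gate_instruction(instruction: str,
                     common: Set[str] = COMMON_INSTRUCTIONS) -> Optional[str]:
    """Pass the instruction on only if it matches a common control instruction."""
    normalized = instruction.strip().lower()
    # A non-matching instruction is dropped: no device is woken, no response.
    return normalized if normalized in common else None
```

With these illustrative values, "play music" passes the gate and can be used to select the target voice device, whereas "turn high" is rejected and leads to no wake-up.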
Optionally, in the case that the first wake-up voice matches the first wake-up word of the terminal device, waking up the terminal device includes:
extracting voiceprint features in the first wake-up voice;
and waking up the terminal equipment under the condition that the first wake-up voice is matched with the first wake-up word of the terminal equipment and the voiceprint characteristic is matched with a preset voiceprint characteristic.
To enhance the security of device wake-up and prevent unauthorized users from waking up devices at will, voiceprint matching can be added to the conditions for waking up the terminal device. Specifically, when the first wake-up voice of the user is received, the voiceprint features of the user who input it are extracted from the first wake-up voice. The extracted voiceprint features are matched against preset voiceprint features, and the first wake-up voice is matched against the first wake-up word of the terminal device. The preset voiceprint features may be voiceprint features, pre-recorded in the terminal device, of users who have wake-up authority, and may include the voiceprint features of one or more users.
The current user is determined to be an authorized user only if the voiceprint features extracted from the first wake-up voice match the preset voiceprint features. If, in addition, the first wake-up voice matches the first wake-up word of the terminal device, the terminal device is woken up in response to the first wake-up voice.
Therefore, when the at least one voice device does not have the voice print recognition function, voice print matching is added in the wake-up of the terminal device, so that the voice print recognition function is indirectly provided in the wake-up of the at least one voice device, and the wake-up security of the voice devices is improved.
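The combined condition (wake-up word match and voiceprint match) can be sketched as follows. The patent does not specify how voiceprint features are extracted or compared; this sketch assumes the features are already available as numeric vectors and uses cosine similarity with an arbitrary threshold of 0.8, both of which are illustrative assumptions. The function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_wake(recognized_text: str,
                voiceprint: Sequence[float],
                first_wake_word: str,
                preset_voiceprints: Sequence[Sequence[float]],
                threshold: float = 0.8) -> bool:
    """Wake only if both the wake-up word and the voiceprint match."""
    word_matches = (recognized_text.strip().lower()
                    == first_wake_word.strip().lower())
    # The preset features may belong to one or more authorized users.
    voiceprint_matches = any(
        cosine_similarity(voiceprint, preset) >= threshold
        for preset in preset_voiceprints
    )
    return word_matches and voiceprint_matches
```

Because the check runs on the terminal device, voice devices that have no voiceprint recognition of their own are still only woken for authorized speakers.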
In the embodiments of the present invention, the terminal device may be any device having a storage medium, for example a computer, a mobile phone, a tablet computer (Tablet Personal Computer), a laptop computer, a personal digital assistant (PDA), a mobile internet device (MID), a wearable device, or a portable electronic device.
According to the equipment awakening method, terminal equipment is connected with at least one loudspeaker, the at least one loudspeaker is respectively arranged on at least one voice equipment and respectively covers the sound inlet channels of the microphones of each voice equipment, on the basis, the terminal equipment is triggered to synthesize awakening voice of target voice equipment through awakening, and the synthesized awakening voice is played through the loudspeaker, so that the aim of awakening the target voice equipment is achieved. Therefore, the user does not need to memorize the wake-up words of different voice devices, only needs to memorize the wake-up words of the terminal device, and indirectly achieves the purpose of waking up the target voice device in a mode of waking up the terminal device, so that convenience and quickness are brought to the user, and the burden of memorizing the wake-up words of different devices by the user is reduced.
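The overall flow summarized above can be sketched end to end under simplifying assumptions. Speech recognition and speech synthesis are stubbed out: `fake_synthesize` merely encodes text as bytes, and each loudspeaker is represented by a callback that records what it was asked to play. The class `TerminalDevice` and all names and values are hypothetical and serve only to illustrate the sequence of steps.

```python
from typing import Callable, Dict, List, Tuple

# Records (device, audio) pairs so the sketch can show what was played.
played: List[Tuple[str, bytes]] = []


def make_speaker(device_name: str) -> Callable[[bytes], None]:
    """Stand-in for the loudspeaker mounted over a device's microphone inlet."""
    def play(audio: bytes) -> None:
        played.append((device_name, audio))
    return play


def fake_synthesize(text: str) -> bytes:
    # Placeholder for a real text-to-speech engine (assumption).
    return text.encode("utf-8")


class TerminalDevice:
    def __init__(self, first_wake_word: str,
                 device_wake_words: Dict[str, str],
                 speakers: Dict[str, Callable[[bytes], None]]) -> None:
        self.first_wake_word = first_wake_word
        self.device_wake_words = device_wake_words
        self.speakers = speakers
        self.awake = False

    def handle_wake_voice(self, recognized_text: str, target_device: str) -> bool:
        # Step 1: wake the terminal only if the first wake-up word matches.
        if recognized_text.strip().lower() != self.first_wake_word.lower():
            return False
        self.awake = True
        # Step 2: look up the second wake-up word of the target voice device.
        second_wake_word = self.device_wake_words.get(target_device)
        speaker = self.speakers.get(target_device)
        if second_wake_word is None or speaker is None:
            return False
        # Step 3: synthesize the second wake-up voice and play it through
        # the loudspeaker installed on the target voice device.
        speaker(fake_synthesize(second_wake_word))
        return True


terminal = TerminalDevice(
    "hello hub",
    {"smart_speaker": "hello speaker"},
    {"smart_speaker": make_speaker("smart_speaker")},
)
woke = terminal.handle_wake_voice("Hello hub", "smart_speaker")
```

The user only ever speaks the terminal device's wake-up word; the device-specific wake-up word is produced by the terminal device itself.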
Referring to fig. 2, fig. 2 is a schematic structural diagram of a terminal device according to an embodiment of the present invention, where the terminal device is connected to at least one speaker, and the at least one speaker is respectively mounted on at least one voice device and covers an incoming channel of a microphone of each voice device, as shown in fig. 2, the terminal device 200 includes:
a first receiving module 201, configured to receive a first wake-up voice input by a user;
a wake-up module 202, configured to wake up the terminal device if the first wake-up voice matches a first wake-up word of the terminal device, synthesize a second wake-up voice that matches a second wake-up word of a target voice device, and play the second wake-up voice through a speaker installed on the target voice device, so as to wake up the target voice device through the second wake-up voice, where the target voice device is at least one of the at least one voice device.
Optionally, the wake-up module 202 includes:
the determining unit is used for determining a second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device;
And the synthesis unit is used for synthesizing the second wake-up voice matched with the second wake-up word.
Optionally, the terminal device 200 further includes:
the second receiving module is used for receiving wake-up words of each voice device in the at least one voice device input by a user;
the storage module is used for storing wake-up words of each voice device in the at least one voice device;
the determining unit is used for determining a second wake-up word of the target voice device according to the stored wake-up word of each voice device in the at least one voice device.
Optionally, the terminal device 200 further includes:
the third receiving module is used for receiving a voice control instruction input by a user;
the first determining module is configured to determine, based on the voice control instruction, the corresponding target voice device, where the target voice device is a voice device capable of responding to the voice control instruction in the at least one voice device.
Optionally, the terminal device 200 further includes:
the second determining module is used for determining whether the voice control instruction is matched with the common control instruction according to a pre-recorded common control instruction corresponding to the first wake-up word;
The first determining module is configured to determine, based on the voice control instruction, the corresponding target voice device if the voice control instruction matches the common control instruction.
Optionally, the terminal device 200 further includes:
the fourth receiving module is used for receiving a third wake-up word input by a user;
the setting module is used for setting the third wake-up word as a first wake-up word of the terminal equipment;
and the third determining module is used for determining the score of the first wake-up word and outputting the score.
Optionally, the wake-up module 202 includes:
the extraction unit is used for extracting voiceprint features in the first wake-up voice;
the wake-up unit is used for waking up the terminal equipment under the condition that the first wake-up voice is matched with a first wake-up word of the terminal equipment and the voiceprint characteristic is matched with a preset voiceprint characteristic.
The terminal device 200 is capable of implementing each process implemented by the terminal device in the method embodiment of fig. 1; to avoid repetition, the description is not repeated here. The terminal device 200 of the embodiment of the invention can enter the wake-up state upon receiving the first wake-up voice input by the user, synthesize the wake-up voice of the target voice device, and play the synthesized wake-up voice through the loudspeaker so as to wake up the target voice device. Therefore, the user does not need to memorize the wake-up words of different voice devices and only needs to memorize the wake-up word of the terminal device. The target voice device is woken up indirectly by waking up the terminal device, which is convenient and quick for the user and reduces the burden of memorizing the wake-up words of different devices.
The embodiment of the invention also provides a terminal device, which comprises a processor, a memory, and a computer program stored in the memory and capable of running on the processor. When executed by the processor, the computer program implements each process of the above device wake-up method embodiment and achieves the same technical effects. To avoid repetition, details are not described again here.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, realizes the processes of the above device wake-up method embodiment, and can achieve the same technical effects, and in order to avoid repetition, the description is omitted here. Wherein the computer readable storage medium is selected from Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.

Claims (12)

1. A device wake-up method applied to a terminal device, wherein the terminal device is connected to at least one speaker, the at least one speaker is respectively mounted on at least one voice device and covers an incoming channel of a microphone of each voice device, the method comprising:
receiving a first wake-up voice input by a user;
and under the condition that the first wake-up voice is matched with the first wake-up word of the terminal equipment, waking up the terminal equipment, synthesizing second wake-up voice matched with a second wake-up word of target voice equipment, and playing the second wake-up voice through a loudspeaker installed on the target voice equipment so as to wake up the target voice equipment through the second wake-up voice, wherein the target voice equipment is at least one of the at least one voice equipment.
2. The method of claim 1, wherein the synthesizing a second wake up speech that matches a second wake up word of the target speech device comprises:
determining a second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device;
and synthesizing a second wake-up voice matched with the second wake-up word.
3. The method of claim 2, wherein prior to receiving the first wake-up speech input by the user, the method further comprises:
receiving a wake-up word of each voice device in the at least one voice device input by a user;
storing wake-up words of each of the at least one voice device;
the determining the second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device comprises:
and determining a second wake-up word of the target voice device according to the stored wake-up word of each voice device in the at least one voice device.
4. The method of claim 1, wherein after waking up the terminal device, the method further comprises, before synthesizing a second wake up speech that matches a second wake up word of a target speech device:
receiving a voice control instruction input by a user;
and determining the corresponding target voice equipment based on the voice control instruction, wherein the target voice equipment is one of the at least one voice equipment capable of responding to the voice control instruction.
5. The method of claim 4, wherein after receiving the voice control command input by the user, before determining the corresponding target voice device based on the voice control command, the method further comprises:
determining whether the voice control instruction is matched with the common control instruction according to a pre-recorded common control instruction corresponding to the first wake-up word;
the determining, based on the voice control instruction, the corresponding target voice device includes:
and under the condition that the voice control instruction is matched with the common control instruction, determining the corresponding target voice equipment based on the voice control instruction.
6. The method of claim 1, wherein prior to receiving the first wake-up speech input by the user, the method further comprises:
receiving a third wake-up word input by a user;
setting the third wake-up word as a first wake-up word of the terminal equipment;
and determining the score of the first wake word and outputting the score.
7. The method of claim 1, wherein waking up the terminal device if the first wake-up speech matches a first wake-up word of the terminal device comprises:
extracting voiceprint features in the first wake-up voice;
and waking up the terminal equipment under the condition that the first wake-up voice is matched with the first wake-up word of the terminal equipment and the voiceprint characteristic is matched with a preset voiceprint characteristic.
8. A terminal device connected to at least one speaker, the at least one speaker being mounted on at least one voice device and covering an inlet channel of a microphone of each voice device, respectively, the terminal device comprising:
the first receiving module is used for receiving a first wake-up voice input by a user;
the wake-up module is configured to wake up the terminal device when the first wake-up voice is matched with a first wake-up word of the terminal device, synthesize a second wake-up voice matched with a second wake-up word of a target voice device, and play the second wake-up voice through a speaker installed on the target voice device, so as to wake up the target voice device through the second wake-up voice, where the target voice device is at least one of the at least one voice device.
9. The terminal device of claim 8, wherein the wake-up module comprises:
The determining unit is used for determining a second wake-up word of the target voice device according to the pre-acquired wake-up word of each voice device in the at least one voice device;
and the synthesis unit is used for synthesizing the second wake-up voice matched with the second wake-up word.
10. The terminal device according to claim 9, characterized in that the terminal device further comprises:
the second receiving module is used for receiving wake-up words of each voice device in the at least one voice device input by a user;
the storage module is used for storing wake-up words of each voice device in the at least one voice device;
the determining unit is used for determining a second wake-up word of the target voice device according to the stored wake-up word of each voice device in the at least one voice device.
11. A terminal device, comprising a processor, a memory and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the device wake-up method according to any one of claims 1 to 7.
12. A computer readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the device wake-up method according to any of claims 1 to 7.
CN202010191577.0A 2020-03-18 2020-03-18 Equipment awakening method and terminal equipment Active CN111429917B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010191577.0A CN111429917B (en) 2020-03-18 2020-03-18 Equipment awakening method and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010191577.0A CN111429917B (en) 2020-03-18 2020-03-18 Equipment awakening method and terminal equipment

Publications (2)

Publication Number Publication Date
CN111429917A CN111429917A (en) 2020-07-17
CN111429917B true CN111429917B (en) 2023-09-22

Family

ID=71547551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010191577.0A Active CN111429917B (en) 2020-03-18 2020-03-18 Equipment awakening method and terminal equipment

Country Status (1)

Country Link
CN (1) CN111429917B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111798850B (en) * 2020-08-05 2024-03-01 深圳市北科瑞声科技股份有限公司 Method and system for operating equipment by voice and server
CN112000836A (en) * 2020-08-20 2020-11-27 北京声智科技有限公司 Song playing method and device and electronic equipment
CN115242571A (en) * 2021-04-25 2022-10-25 佛山市顺德区美的电热电器制造有限公司 Distributed voice interaction method and device, readable storage medium and household appliance

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104243717A (en) * 2014-09-30 2014-12-24 广东欧珀移动通信有限公司 Method and device for unlocking mobile phone in voice recognition mode on basis of social applications
CN108712566A (en) * 2018-04-27 2018-10-26 维沃移动通信有限公司 A kind of voice assistant awakening method and mobile terminal
CN108899027A (en) * 2018-08-15 2018-11-27 珠海格力电器股份有限公司 Speech analysis method and device
CN109243459A (en) * 2018-11-30 2019-01-18 广东美的制冷设备有限公司 Voice awakening method, device, household appliance and the control system of equipment
KR20190082689A (en) * 2019-06-20 2019-07-10 엘지전자 주식회사 Method and apparatus for recognizing a voice
CN110097876A (en) * 2018-01-30 2019-08-06 阿里巴巴集团控股有限公司 Voice wakes up processing method and is waken up equipment
CN110827836A (en) * 2019-10-23 2020-02-21 珠海格力电器股份有限公司 Method and device for resetting awakening words, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10475449B2 (en) * 2017-08-07 2019-11-12 Sonos, Inc. Wake-word detection suppression
US10679629B2 (en) * 2018-04-09 2020-06-09 Amazon Technologies, Inc. Device arbitration by multiple speech processing systems

Also Published As

Publication number Publication date
CN111429917A (en) 2020-07-17

Similar Documents

Publication Publication Date Title
US11386905B2 (en) Information processing method and device, multimedia device and storage medium
CN111429917B (en) Equipment awakening method and terminal equipment
CN107895578B (en) Voice interaction method and device
CN110634483B (en) Man-machine interaction method and device, electronic equipment and storage medium
CN104394491B (en) A kind of intelligent earphone, Cloud Server and volume adjusting method and system
EP2674941B1 (en) Terminal apparatus and control method thereof
US11763808B2 (en) Temporary account association with voice-enabled devices
CN102117614B (en) Personalized text-to-speech synthesis and personalized speech feature extraction
US20020088336A1 (en) Method of identifying pieces of music
EP3611724A1 (en) Voice response method and device, and smart device
CN106992008B (en) Processing method and electronic equipment
CN111343028A (en) Distribution network control method and device
US20210168460A1 (en) Electronic device and subtitle expression method thereof
KR20140055502A (en) Broadcast receiving apparatus, server and control method thereof
CN111640434A (en) Method and apparatus for controlling voice device
CN107994879A (en) Volume control method and device
CN113409764B (en) Speech synthesis method and device for speech synthesis
CN104182039B (en) Apparatus control method, device and electronic equipment
JP2005031540A (en) Household electric appliance with voice function
CN111161742A (en) Directional person communication method, system, storage medium and intelligent voice device
CN110415703A (en) Voice memos information processing method and device
KR20200016547A (en) Electronic device and method for registering new user through authentication by registered user
CN110012359A (en) Answer reminding method and device
CN112820265B (en) Speech synthesis model training method and related device
CN113314115A (en) Voice processing method of terminal equipment, terminal equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant