WO2022141990A1 - Household appliance and voice control method thereof, voice device, and computer storage medium - Google Patents

Household appliance and voice control method thereof, voice device, and computer storage medium

Info

Publication number
WO2022141990A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
command
full
control method
voice control
Prior art date
Application number
PCT/CN2021/090041
Inventor
颜林
霍伟明
张新健
徐浩
席红艳
陈柏仰
Original Assignee
广东美的制冷设备有限公司
美的集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东美的制冷设备有限公司, 美的集团股份有限公司
Publication of WO2022141990A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • H04L12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • H04L12/282 Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home

Definitions

  • the present application relates to the field of household electrical appliances, and in particular, to household electrical appliances and a voice control method thereof, a voice device, and a computer storage medium.
  • The embodiments of the present application address the poor fluency of voice interaction with home appliances in the prior art by providing a home appliance and a voice control method thereof, a voice device, and a computer storage medium.
  • An embodiment of the present application provides a voice control method for household electrical appliances, which performs voice control on the household electrical appliances through a voice device; the voice control method includes the following steps:
  • the voice information of the surrounding environment is collected and recognized, and when a voice command is recognized, the voice mode is switched according to the type of the voice command; the voice mode includes a full-duplex mode and a non-full-duplex mode.
  • the collecting and identifying the voice information of the surrounding environment includes:
  • the obtained voice feature information is matched with the voice feature information corresponding to the type of the voice command, and the type of the voice command of the ambient voice is determined according to the matching result.
  • Before speech recognition is performed on the picked-up ambient speech, the method further includes:
  • Segmentation processing is performed on the collected ambient speech, and speech recognition is performed on the segmented ambient speech at the same time.
  • the voice control method further includes:
  • if no voice command is recognized within a preset time, the current voice mode is exited, and the wake-up state is exited.
  • the voice control method further includes:
  • In the non-full-duplex mode, control the home appliance to execute the recognized voice command, and determine whether to exit the wake-up state according to the type of the non-full-duplex mode.
  • the voice control method further includes:
  • if the picked-up ambient voice is recognized as a voice command, the current voice broadcast is stopped, and the home appliance is controlled to execute the voice control command.
  • Embodiments of the present application further provide a voice device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the voice control method of the above embodiments, so that voice control is performed on the household electrical appliance.
  • An embodiment of the present application further provides a household appliance, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the voice control method of the above embodiments, so that voice control is performed on the household electrical appliance.
  • An embodiment of the present application further provides a computer storage medium, where a voice control application program is stored on the readable storage medium, and the voice control application program, when executed by a processor, implements the steps of the voice control method for a home appliance in any one of the foregoing embodiments, so as to realize voice control of home appliances.
  • The embodiment of the present application switches automatically between the full-duplex mode and the non-full-duplex mode according to the type of voice command issued, which not only meets the needs of different usage scenarios but also makes mode switching simple, so that voice control is smoother;
  • the voice pickup function and the voice broadcast function work in parallel, so that the user's voice control of the home appliance is smoother, and the user can interrupt the voice broadcast as the situation requires;
  • the operation is simpler, faster and more flexible;
  • voice recognition is performed while the voice is being collected, so that a voice control command can be quickly recognized and the current voice broadcast interrupted, false interruptions caused by noise are avoided, and the user's intention is learned in time, which improves the efficiency of speech recognition;
  • the time limit set on voice control not only allows effective voice control but also avoids compromising the user's privacy and security.
  • FIG. 1 is a schematic flowchart of a voice control method for a household electrical appliance according to an embodiment of the present application;
  • FIG. 2 is a working example diagram of the voice pickup function and the voice broadcast function when entering the full-duplex mode in the embodiment of the present application;
  • FIG. 3 is a schematic diagram of a refinement process of performing voice recognition on collected ambient voices in a voice control method for household electrical appliances according to an embodiment of the present application;
  • FIG. 4 is a working example diagram of the voice pickup function and the voice broadcast function when entering the full-duplex mode in the embodiment of the present application;
  • FIG. 5 is an application example of the voice control method according to the embodiment of the present application;
  • FIG. 6 is an application example of the voice control method according to the embodiment of the present application.
  • Conventional voice control of home appliances requires the user to speak a valid wake-up word, such as "Tmall Genie" or "Xiao Ai", to wake up the home appliance before issuing a voice control command such as "turn on the air conditioner" or "play a song".
  • This kind of voice control is accurate, but only one voice control command can be issued per wake-up. To issue another voice control command, such as adjusting the temperature of the air conditioner, the user has to wake up the home appliance again, which makes voice control cumbersome.
  • A newer voice control mode in the prior art has the home appliance alternate between voice pickup and voice broadcast after it is woken up.
  • This voice control method removes the need for repeated wake-ups, but the fluency of voice control still needs to be improved.
  • The technical solution of the present application mainly provides a voice control solution for home appliances. The voice control solution includes a full-duplex mode and a non-full-duplex mode. In the full-duplex mode, voice pickup and voice broadcast of the home appliance can work in parallel, just like a natural dialogue between people, making the voice control of home appliances smoother; users can also interrupt the voice broadcast as the situation requires in order to issue a voice command, which makes the voice control of home appliances more convenient, fast and flexible.
  • Automatic switching between the full-duplex mode and the non-full-duplex mode is performed according to the type of voice command issued, which not only meets the needs of different usage scenarios but also makes mode switching simple, so that voice control is smoother.
  • FIG. 1 is a schematic flowchart of a voice control method for a home appliance according to an embodiment of the present application.
  • the voice control method of the household electrical appliance in this embodiment includes the following steps:
  • Step S110: receive a voice wake-up command, and wake up the voice device according to the voice wake-up command;
  • Step S120: collect and recognize the voice information of the surrounding environment, and when a voice command is recognized, switch the voice mode according to the type of the voice command.
  • The voice wake-up instruction in the above step S110 may include a wake-up word set by default, such as "Xiaomei Xiaomei". The wake-up word may also be customized through the control terminal of the household appliance.
  • The above-mentioned voice device can be a functional component arranged on the household appliance, or a voice device independent of the household appliance. It implements the voice pickup function and the voice broadcast function, and also has a network communication function, for example for communicating with the household appliance over a network: the collected voice information is recognized and converted into voice commands, which are then sent to the home appliance to realize voice control of the home appliance.
  • the voice mode includes a full-duplex mode and a non-full-duplex mode, and the corresponding voice mode is entered according to different voice commands.
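  • The wake-up and mode-switching flow of steps S110 and S120 can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the `Mode` enum, the `step` function, and the phrase tables are hypothetical names introduced for illustration, and which phrases count as full-duplex or non-full-duplex commands is assumed rather than taken from the application.

```python
from enum import Enum


class Mode(Enum):
    ASLEEP = "asleep"                    # not yet woken up
    AWAKE = "awake"                      # woken up, no voice mode selected yet
    FULL_DUPLEX = "full_duplex"
    NON_FULL_DUPLEX = "non_full_duplex"


# Hypothetical phrase tables, for illustration only.
WAKE_WORD = "xiaomei xiaomei"
FULL_DUPLEX_COMMANDS = {"open natural dialogue", "turn on the air conditioner"}
NON_FULL_DUPLEX_COMMANDS = {"play a song"}


def step(mode: Mode, text: str) -> Mode:
    """Return the voice mode after hearing `text` while in `mode`."""
    text = text.strip().lower()
    if mode is Mode.ASLEEP:
        # Step S110: only the wake-up word has any effect while asleep.
        return Mode.AWAKE if text == WAKE_WORD else Mode.ASLEEP
    # Step S120: switch the voice mode according to the command type.
    if text in FULL_DUPLEX_COMMANDS:
        return Mode.FULL_DUPLEX
    if text in NON_FULL_DUPLEX_COMMANDS:
        return Mode.NON_FULL_DUPLEX
    return mode  # unrecognized speech leaves the mode unchanged
```

  • Keeping the transition logic in one pure function makes it easy to check each switch in isolation, independently of any audio hardware.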
  • The full-duplex mode is modeled on continuous dialogue between people: the voice device can perform voice pickup and voice broadcast at the same time. Therefore, in the full-duplex mode, the user only needs to wake up the device once to keep issuing voice commands, while the device can at the same time broadcast by voice and feed back the results of command execution.
  • the voice device includes two functional modules, a voice pickup module and a voice broadcast module, and the two functional modules operate independently and do not interfere with each other.
  • the voice pickup module and the voice broadcast module enable the voice device to collect the voice in the surrounding environment and broadcast the voice content that needs to be broadcast.
  • FIG. 2 is a working example diagram of the voice pickup function and the voice broadcast function when entering the full-duplex mode in the embodiment of the present application.
  • After the voice pickup module collects and recognizes the voice wake-up command, it determines whether to enter the full-duplex mode, and enters the full-duplex mode when the conditions for doing so are met. In the full-duplex mode, both the voice pickup function and the voice broadcast function are working.
  • When the voice pickup module collects and recognizes voice control command 1, it controls the home appliance to execute voice control command 1, and the voice broadcast module then broadcasts the execution result or execution state by voice. While the voice broadcast module is broadcasting, the voice pickup module can still collect the voice information of the surrounding environment; that is, the voice pickup function and the voice broadcast function do not conflict, and both can run at the same time.
  • the non-full-duplex mode includes modes such as single-round interaction mode and multi-round interaction mode.
  • In both of these modes, the voice pickup module and the voice broadcast module operate alternately: after the voice pickup module collects and recognizes a voice control command, the recognition result needs to be fed back promptly, that is, broadcast by voice. The main difference between the two is that the single-round interaction mode requires a wake-up before each voice control command, while the multi-round interaction mode allows multiple voice control commands after one wake-up, but only while the voice broadcast module has stopped broadcasting and the voice pickup module is working.
  • Automatic switching between the full-duplex mode and the non-full-duplex mode is performed according to the type of voice command issued, which not only meets the needs of different usage scenarios but also makes switching voice modes simple, so that voice control is smoother.
  • the voice pickup function and the voice broadcast function work in parallel, making the user's voice control of the home appliance smoother.
  • FIG. 3 is a schematic diagram of a refinement process of voice recognition for the collected ambient voice in the voice control method of the household electrical appliance according to an embodiment of the present application.
  • the type judgment of the voice command in step S120 of the above embodiment may include the following steps:
  • Step S121: pick up surrounding environmental voices according to the current voice mode, and process the picked-up environmental voices to obtain voice feature information;
  • Step S122 Match the obtained voice feature information with the voice feature information corresponding to the type of the voice command, and determine the type of the voice command of the ambient voice according to the matching result.
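  • Steps S121 and S122 amount to extracting features from an utterance and matching them against per-type feature sets. A minimal sketch follows, assuming the features are simply lower-cased words and that a type matches when all words of one of its patterns are present; the table `TYPE_FEATURES` and both function names are hypothetical, and a real system would use acoustic and semantic features rather than word sets.

```python
# Hypothetical feature patterns per command type (illustration only).
TYPE_FEATURES = {
    "full_duplex": [{"set", "temperature"}, {"wind", "speed"}],
    "non_full_duplex": [{"play", "song"}, {"weather"}],
}


def extract_features(utterance: str) -> set:
    """Stand-in for step S121: reduce an utterance to a set of feature words."""
    return set(utterance.lower().replace(",", " ").split())


def command_type(utterance: str):
    """Stand-in for step S122: return the best-matching type, or None."""
    features = extract_features(utterance)
    best_type, best_score = None, 0
    for ctype, patterns in TYPE_FEATURES.items():
        for pattern in patterns:
            # A pattern matches when all of its words occur in the utterance;
            # longer patterns win so that more specific matches are preferred.
            if pattern <= features and len(pattern) > best_score:
                best_type, best_score = ctype, len(pattern)
    return best_type
```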
  • the voice signal of the surrounding environment is picked up by the voice pickup module, and voice recognition is performed on the picked-up voice signal.
  • the voice pickup module includes, for example, a microphone and related components for voice recognition.
  • One, two or more microphones can be provided. With two or more microphones, voice information can be collected from multiple directions, and a differential noise cancellation algorithm can be used to improve the quality of voice collection and the voice recognition rate.
  • The voice pickup module recognizes the voice signal of the surrounding environment while collecting it. For example, the voice pickup module runs at least two processes: one collects the voice signal of the surrounding environment, and the other performs voice recognition on the collected voice signal. In this way, the voice pickup module can quickly recognize the voice and learn the user's intention in time.
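  • The "recognize while collecting" arrangement is essentially a producer/consumer pipeline. The sketch below is an illustration under assumptions: it uses two Python threads and a `queue.Queue` in place of the two processes mentioned above, the function names are hypothetical, and upper-casing a chunk stands in for real speech recognition.

```python
import queue
import threading


def collector(chunks, q):
    """Producer: push each collected audio chunk as soon as it is available."""
    for chunk in chunks:
        q.put(chunk)
    q.put(None)  # sentinel: collection has finished


def recognizer(q, results):
    """Consumer: recognize chunks while collection is still in progress."""
    while True:
        chunk = q.get()
        if chunk is None:
            break
        results.append(chunk.upper())  # placeholder for real recognition


def run_pipeline(chunks):
    q = queue.Queue()
    results = []
    producer = threading.Thread(target=collector, args=(chunks, q))
    consumer = threading.Thread(target=recognizer, args=(q, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results
```

  • Because the queue is first-in, first-out and there is a single consumer, the recognition results come out in the order the chunks were collected.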
  • the voice pickup module can establish communication with the cloud server, the voice pickup module collects the voice signals of the surrounding environment, and sends the collected voice signals to the cloud server, and the cloud server performs voice recognition on the received voice signals.
  • The voice pickup module can use a local differential noise reduction algorithm for voice acquisition and processing, so that the continuous, dynamic collection of voice information is completed quickly and efficiently; combined with a voice recognition algorithm in the cloud, this further improves voice recognition efficiency and allows the user's intention to be learned in time.
  • After the voice pickup module collects the voice information of the surrounding environment, it can also perform noise reduction on the collected environmental voice, and then perform semantic recognition on it to extract voice feature information.
  • a database of speech features can be preset, and the feature words of semantic recognition are compared and screened with the preset database to obtain final voice feature information.
  • A full-duplex command includes control-type command words (voice feature information). The recognition result does not need to be fed back immediately when the command word is input; instead, feedback is given according to the execution result of the voice command.
  • the command word has a corresponding grammatical structure, taking "close the device" as an example, the grammatical structure is as follows:
  • Non-full-duplex commands include natural dialogue-type command words (voice feature information), and the recognition result needs to be fed back immediately according to the input voice command. If the command is not recognized, that is fed back as well.
  • Unlike voice commands in the full-duplex mode, these command words need not follow a corresponding grammatical structure and can be more free-form and flexible. Specific examples are as follows:
  • The voice feature information obtained in step S121 is identified. When the picked-up ambient voice is identified as a full-duplex command, the full-duplex mode is entered; when the picked-up ambient voice is identified as a non-full-duplex command, the non-full-duplex mode is entered.
  • the judgment of the full-duplex command and the non-full-duplex command may also be performed according to historical voice commands.
  • the historical voice commands include, for example, voice control commands set by default, voice control commands recognized during use and through machine learning, voice control commands manually added by the user, updated voice control commands, and the like.
  • Each historical voice command includes voice feature information corresponding to the full-duplex command and the non-full-duplex command.
  • the relevant voice control commands in the full-duplex mode and the non-full-duplex mode can be obtained, so that it can be determined whether the voice feature information is a voice control command in the full-duplex mode.
  • the judgment of full-duplex commands and non-full-duplex commands can be made more accurate.
  • The voice mode is switched according to the type of the voice command. For example, after the voice device is woken up, if the currently recognized voice command is a full-duplex command, the voice device enters the full-duplex mode. In this mode, the voice pickup module and the voice broadcast module of the voice device work in parallel, and the voice pickup module recognizes while picking up. When a full-duplex command is recognized, the command is executed, and the result is broadcast by voice.
  • the voice device After the voice device is woken up, if the currently recognized voice command is a non-full-duplex command, the voice device enters a non-full-duplex mode.
  • In the non-full-duplex mode, the home appliance is controlled to execute the recognized non-full-duplex command, and whether to exit the wake-up state is determined according to the specific non-full-duplex mode: in the single-round interaction mode the wake-up state is exited, and in the multi-round interaction mode it is not.
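  • The rule for leaving the wake-up state in the non-full-duplex mode can be written down directly. This is a minimal sketch; `InteractionMode`, `stays_awake`, and `run_non_full_duplex` are hypothetical names, and `execute` stands in for whatever actually drives the appliance.

```python
from enum import Enum


class InteractionMode(Enum):
    SINGLE_ROUND = "single_round"
    MULTI_ROUND = "multi_round"


def stays_awake(mode: InteractionMode) -> bool:
    """Single-round interaction exits the wake-up state; multi-round keeps it."""
    return mode is InteractionMode.MULTI_ROUND


def run_non_full_duplex(command, mode, execute):
    """Execute the recognized command, then report whether the device stays awake."""
    execute(command)
    return stays_awake(mode)
```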
  • If a full-duplex command is recognized while the voice device is in the non-full-duplex mode, the voice device exits the non-full-duplex mode and enters the full-duplex mode.
  • the full-duplex command may include multiple voice dialogue scenarios, such as “air conditioning control”, “sleep control” and so on.
  • The relevant voice commands in this scene include, for example, "set the temperature to 26°C", "set the wind speed to mid-range", "set the wind direction to sweep up and down", "set the humidity to 60%", and "turn on cooling mode". If the obtained voice feature information is "mid-range wind", "wind speed is mid-range", "wind speed is set to mid-range", etc., it is determined that the voice feature information is a voice control command in the dialogue scene, namely "set the wind speed to mid-range".
  • the voice feature information is a voice control command in the dialogue scene, that is, "set the wind direction to sweep up and down".
  • Command recognition can first be performed according to the historical voice commands in the current scene, so that recognition is faster. If the voice command cannot be recognized in the current scene, command recognition is performed according to the historical voice commands of the other scenes; if it still cannot be recognized in any scene, the collected voice information is determined to be noise.
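  • The scene-first lookup with fallback can be sketched as an ordered search. In this illustration the `SCENES` table and the `recognize` function are hypothetical; the "air conditioning control" entries follow the examples in the text, while the "sleep control" entry is an invented placeholder.

```python
# Hypothetical per-scene tables mapping feature phrases to commands.
SCENES = {
    "air conditioning control": {
        "mid-range wind": "set the wind speed to mid-range",
        "wind speed is mid-range": "set the wind speed to mid-range",
    },
    "sleep control": {
        "sleep mode": "turn on sleep mode",  # placeholder entry, not from the source
    },
}


def recognize(features: str, current_scene: str):
    """Search the current scene first, then the others; (None, None) means noise."""
    order = [current_scene] + [s for s in SCENES if s != current_scene]
    for scene in order:
        command = SCENES.get(scene, {}).get(features)
        if command is not None:
            return scene, command
    return None, None
```

  • For example, `recognize("sleep mode", "air conditioning control")` falls through to the second scene, while an unknown phrase returns `(None, None)` and would be treated as noise.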
  • When the voice pickup module collects the surrounding ambient voice, it also segments the collected ambient voice and performs voice recognition on the segments as they are produced.
  • By segmenting the collected speech information, speech recognition can be started on each segment in advance, thereby improving speech recognition efficiency.
  • segmentation processing may be performed according to the volume of the voice information, and when voice collection is performed, the collected voice information is divided into multiple voice segments.
  • For example, a volume threshold (such as 3000) is set, and voice information below the volume threshold is judged as not speaking. Accordingly, whenever the volume of the collected voice information falls below the volume threshold, the voice information is segmented at that point, and speech recognition is performed on the segmented speech information.
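  • Volume-based segmentation can be illustrated on a sequence of per-frame volume levels. This is a minimal sketch: `segment_by_volume` is a hypothetical name, the input is assumed to be one volume value per audio frame, and the unit of the example threshold 3000 is not specified in the source.

```python
def segment_by_volume(levels, threshold=3000):
    """Split a sequence of per-frame volume levels into speech segments.

    Frames below `threshold` are treated as "not speaking" and close the
    segment currently being built.
    """
    segments, current = [], []
    for level in levels:
        if level >= threshold:
            current.append(level)
        elif current:
            segments.append(current)
            current = []
    if current:  # flush a segment that runs to the end of the input
        segments.append(current)
    return segments
```

  • Each closed segment can be handed to recognition immediately, without waiting for the rest of the utterance.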
  • the speech information may be segmented according to the pause time between the speech information, and when the speech is collected, the collected speech information is divided into a plurality of speech segments.
  • For example, a time threshold (0.5 seconds) is set, and a pause longer than the time threshold is judged as not speaking. Accordingly, if the pause between pieces of collected speech information exceeds the time threshold, the speech information is segmented at the pause, and speech recognition is performed on the segmented speech information at the same time.
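  • Pause-based segmentation can be sketched in the same spirit. Here the input is assumed to be a time-ordered list of voiced intervals given as `(start, end)` pairs in seconds; that representation and the name `segment_by_pause` are assumptions made for illustration.

```python
def segment_by_pause(intervals, max_pause=0.5):
    """Group voiced (start, end) intervals, splitting where the pause is too long.

    A gap longer than `max_pause` seconds between consecutive intervals is
    treated as "not speaking" and starts a new segment.
    """
    segments = []
    for start, end in intervals:
        if segments and start - segments[-1][-1][1] <= max_pause:
            segments[-1].append((start, end))
        else:
            segments.append([(start, end)])
    return segments
```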
  • The judgment of a voice control command also takes into account the recognition results of the preceding and following voice segments. Because the voice information is processed in segments, a single utterance may be divided into several pieces, so the user's true intention cannot be accurately determined from the recognition result of one piece alone; several consecutive pieces may need to be combined before the corresponding voice control command can be accurately parsed. For example, combining consecutive pieces yields "Increase the temperature", which generates a voice control command of "Raise the target temperature".
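  • Combining consecutive segments can be sketched as a sliding window over per-segment recognition results. This is an illustration under assumptions: `KNOWN_COMMANDS`, `parse_segments`, and the window size are hypothetical, and exact phrase lookup stands in for real semantic parsing.

```python
# Hypothetical mapping from a recognized phrase to a control command.
KNOWN_COMMANDS = {"increase the temperature": "raise the target temperature"}


def parse_segments(segment_texts, max_window=3):
    """Join up to `max_window` consecutive segments until a command matches."""
    commands = []
    i = 0
    while i < len(segment_texts):
        for size in range(1, max_window + 1):
            phrase = " ".join(segment_texts[i:i + size]).lower()
            if phrase in KNOWN_COMMANDS:
                commands.append(KNOWN_COMMANDS[phrase])
                i += size  # consume the segments that formed the command
                break
        else:
            i += 1  # no command starts at this segment; move on
    return commands
```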
  • The full-duplex mode can also be entered in other ways, for example under the control of a third-party control device, such as a mobile terminal, on which the relevant plug-in is installed. That is, when an instruction to enter the full-duplex mode sent by the mobile terminal is received, it is determined to enter the full-duplex mode.
  • the mobile terminal can be installed with a plug-in/application program for controlling the voice device, through which the configuration management of the voice device can be realized, and the function activation of the voice device can also be realized.
  • The above-mentioned full-duplex mode can also be exited in other ways. In one embodiment, a full-duplex voice command can be used to exit, for example "close the full-duplex mode" or "exit full-duplex mode". In another embodiment, after the full-duplex mode is entered, if no voice control command is recognized within a preset time, the full-duplex mode is exited and the wake-up state is exited.
  • A preset time is set, such as 30 seconds. If no voice control command is recognized within the preset time, the current voice mode is exited and the wake-up state is exited.
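  • The timeout-based exit can be sketched with an explicit clock value passed in, which keeps the logic testable without waiting in real time. `WakeSession` and its methods are hypothetical names; the 30-second default follows the example above.

```python
class WakeSession:
    """Track the wake-up state and exit it after a period without commands."""

    def __init__(self, timeout=30.0, now=0.0):
        self.timeout = timeout
        self.last_command_at = now
        self.awake = True

    def on_command(self, now):
        """Record that a voice command was recognized at time `now` (seconds)."""
        if self.awake:
            self.last_command_at = now

    def tick(self, now):
        """Exit the wake-up state once `timeout` seconds pass with no command."""
        if self.awake and now - self.last_command_at >= self.timeout:
            self.awake = False
        return self.awake
```

  • In a device, `tick` would be driven by a timer using a monotonic clock such as `time.monotonic()`.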
  • The time limit set on the voice mode not only allows effective voice control but also avoids compromising the user's privacy and security.
  • the method further includes: if the picked-up voice information is identified as a voice command, stopping the current voice broadcast and controlling the household appliance to execute the voice control command.
  • the two functional modules of the voice pickup module and the voice broadcast module of the above-mentioned voice device will also be controlled by the processor of the voice device.
  • The processor can issue control instructions at any time to make the voice pickup module and the voice broadcast module stop working. For example, while the voice broadcast module is broadcasting, the processor can make it stop broadcasting according to the content collected by the voice pickup module, just as people in conversation can, depending on what the other party is saying, either keep listening or interrupt.
  • The voice pickup module can then pick up the voice information issued by the user more accurately, so the home appliance may also be controlled to execute the voice control command only after the complete voice signal has been collected and confirmed again as a voice control command.
  • If the picked-up voice information is not recognized as a voice command, the voice device maintains the current voice mode; if the voice broadcast module is currently broadcasting, the broadcast continues, thereby avoiding false interruptions caused by noise.
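  • The interruption rule (stop the broadcast only for a recognized command, otherwise keep broadcasting) can be sketched as a single decision function. The name `handle_pickup` and the use of a plain set of known commands are assumptions made for illustration.

```python
def handle_pickup(text, broadcasting, known_commands):
    """Decide what to do with picked-up speech during full-duplex operation.

    Returns (broadcasting_after, command_to_execute). A recognized command
    interrupts the current broadcast; anything else is treated as noise and
    leaves the broadcast state untouched.
    """
    if text in known_commands:
        return False, text
    return broadcasting, None
```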
  • FIG. 4 is a working example diagram of the voice pickup function and the voice broadcast function when the full-duplex mode is entered in the embodiment of the present application. In this full-duplex mode, both the voice pickup function and the voice broadcast function are working.
  • the voice pickup module collects and recognizes the voice control command 2
  • the voice broadcast module is performing voice broadcast
  • the voice pickup module collects and recognizes the voice control command 3
  • the voice broadcast module continues to broadcast.
  • In this way, a voice control command in the full-duplex mode can be quickly recognized and the current voice broadcast interrupted, which not only improves voice recognition efficiency and makes voice control smoother, but also avoids false interruptions caused by noise.
  • When the voice broadcast module of the above embodiment receives the voice content to be broadcast, it parses that content and selects a corresponding voice broadcast mode according to the parsing result.
  • The voice broadcast module parses the voice broadcast content to be played, for example by performing word segmentation and sentence segmentation on the content and determining the key information of the broadcast, and controls the volume, the speech rate, the pause time between words, and so on during the voice broadcast, so that the broadcast sounds better and the user experience is improved.
  • When the voice broadcast module of the above embodiment receives the voice content to be broadcast, it can also, according to the broadcast control, use a voice mode suited to the user to perform the voice broadcast.
  • The voice pickup module collects the voice information of the surrounding environment and performs voice recognition on it to identify the user type of the current user, such as an elderly person, a child, a man or a woman, so that the corresponding voice mode is selected for the voice broadcast according to the identified user type, further improving the user experience.
  • FIG. 5 is an application example of the voice control method according to the embodiment of the present application.
  • The user says "Xiaomei Xiaomei"; the voice device collects the environmental voice, recognizes it as a wake-up command, and gives the voice feedback "I'm here" through voice broadcast. Other feedback phrases can also be configured, such as "please order".
  • The user then says "open natural dialogue".
  • The voice device collects the environmental voice and recognizes it as a full-duplex command, so the voice device enters the full-duplex mode and gives the voice feedback "Now you can talk freely" through voice broadcast.
  • The voice pickup function and the voice broadcast function of the voice device are activated and work in parallel.
  • The voice device collects and recognizes the environmental voice; if the voice is a full-duplex command, the voice device controls the air conditioner to turn on and gives the voice feedback "air conditioner is on, cooling mode, 26°C, natural wind" through voice broadcast.
  • the user can issue a voice command again, such as "adjust the temperature to 24°C, strong wind”.
  • The voice device collects the environmental voice, recognizes it as a full-duplex command, and remains in the full-duplex mode. It controls the air conditioner to adjust the target temperature and wind speed, and gives the voice feedback "the temperature has been adjusted to 24°C, and the wind speed has been adjusted to strong wind" through voice broadcast.
  • If the voice device collects the ambient voice and recognizes it as a non-full-duplex command, it exits the full-duplex mode and enters the non-full-duplex mode, feeds back the recognition result "OK, play it for the master immediately" through voice broadcast, and then plays the selected song.
  • If the non-full-duplex mode uses the single-round interaction mode, the wake-up state is exited, and the device must be woken up again for further voice control; if the non-full-duplex mode uses the multi-round interaction mode, the wake-up state is not exited, and voice control continues in the multi-round interaction mode.
  • FIG. 6 is an application example of the voice control method according to the embodiment of the present application.
  • the user sends out the voice "Xiaomei Xiaomei”, and the voice device collects and recognizes the environmental voice as a wake-up command, and gives voice feedback "I'm here" through voice broadcast. For example, it can also be set to other feedback voices, "please order” and so on.
  • the voice device collects and recognizes that the environmental voice is a full-duplex command, then enters the full-duplex mode, controls the air conditioner to turn on, and then gives a voice feedback through the voice broadcast that "the air conditioner is turned on. , cooling mode, 26°C, natural wind”.
  • the user can issue a voice command again. If the user does not issue any voice command within the preset time (for example, 30 seconds), and the voice device has not collected the voice command at this time, the voice broadcast will be used. Feedback "Go back first, remember to wake me up later", then exit the current voice mode and exit the wake-up state.
  • the above-mentioned home appliances and voice devices can each include a processor, a memory, and a communication module.
  • the memory can be used as a computer storage medium, and the memory can include the operating system and the voice control program of the home appliance.
  • the voice control program is invoked by the processor of the home appliance to execute the steps of the voice control method for the home appliance in the above embodiment.
  • the voice control program is invoked by the processor of the voice device to execute the steps of the voice control method of the household electrical appliance in the above-mentioned embodiment.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not preclude the presence of a plurality of such elements.
  • the present application may be implemented by means of hardware comprising several different components and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware.
  • the use of the words first, second, third, etc. does not denote any order. These words can be interpreted as names.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Selective Calling Equipment (AREA)

Abstract

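A voice control method for a home appliance, in which the home appliance is voice-controlled through a voice device, comprising: receiving a voice wake-up command, and waking up the voice device according to the voice wake-up command (S110); and collecting and recognizing voice information in the surrounding environment, and, when a voice command is recognized, switching the voice mode according to the type of the voice command (S120). A home appliance, a voice device, and a computer storage medium are also disclosed.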

Description

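Home appliance and voice control method thereof, voice device, and computer storage medium
This application claims priority to Chinese patent application No. 202011645138.9, filed on December 31, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of home appliances, and in particular to a home appliance and a voice control method thereof, a voice device, and a computer storage medium.
Background
As home appliances become more intelligent, people expect more of them. For example, when controlling a home appliance by voice, people want the voice exchange with the appliance to be smoother, and even to approach natural person-to-person communication.
However, current voice control technology for home appliances still needs improvement in the fluency of voice interaction.
Technical Problem
By providing a home appliance and a voice control method thereof, a voice device, and a computer storage medium, the embodiments of the present application solve the technical problem in the prior art that the voice interaction of home appliances is not fluent enough.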
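Technical Solution
An embodiment of the present application provides a voice control method for a home appliance, in which the home appliance is voice-controlled through a voice device. The voice control method includes the following steps:
receiving a voice wake-up command, and waking up the voice device according to the voice wake-up command;
collecting and recognizing voice information in the surrounding environment, and, when a voice command is recognized, switching the voice mode according to the type of the voice command, where the voice modes include a full-duplex mode and a non-full-duplex mode.
In an embodiment of the present application, collecting and recognizing voice information in the surrounding environment includes:
picking up the ambient voice according to the current voice mode, and processing the picked-up ambient voice to obtain voice feature information;
matching the obtained voice feature information against the voice feature information corresponding to each type of voice command, and determining the voice command type of the ambient voice according to the matching result.
In an embodiment of the present application, before voice recognition is performed on the picked-up ambient voice, the method further includes:
segmenting the collected ambient voice, and at the same time performing voice recognition on the segmented ambient voice.
In an embodiment of the present application, the voice control method further includes:
if no voice command is recognized within a preset time, exiting the current voice mode and exiting the wake-up state.
In an embodiment of the present application, the voice control method further includes:
in the non-full-duplex mode, controlling the home appliance to execute the recognized voice command, and determining whether to exit the wake-up state according to the type of non-full-duplex mode.
In an embodiment of the present application, the voice control method further includes:
entering the full-duplex mode when an instruction to enter the full-duplex mode is received from a mobile terminal.
In an embodiment of the present application, the voice control method further includes:
if the picked-up ambient voice is recognized as a voice command, stopping the current voice broadcast and controlling the home appliance to execute the voice command.
An embodiment of the present application further provides a voice device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements the steps of the voice control method of the above embodiments to control the home appliance by voice.
An embodiment of the present application further provides a home appliance, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements the steps of the voice control method of the above embodiments to control the home appliance by voice.
An embodiment of the present application further provides a computer storage medium on which a voice control application program is stored. When executed by a processor, the voice control application program implements the steps of the voice control method for a home appliance of any of the above embodiments, thereby achieving voice control of the home appliance.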
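Beneficial Effects
The one or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
(1) The embodiments of the present application switch automatically between the full-duplex mode and the non-full-duplex mode according to the type of voice command issued, which both meets the needs of different usage scenarios and keeps mode switching simple, making voice control smoother.
(2) In the full-duplex voice mode, the voice pickup function and the voice broadcast function work in parallel, which makes the user's voice control of the home appliance smoother. The user can also interrupt the voice broadcast as needed to issue a voice command, which makes voice control simpler, faster, and more flexible.
(3) The voice processing of the embodiments of the present application recognizes speech while it is being collected. A voice control command can therefore be recognized quickly and can interrupt the current voice broadcast, while noise does not interrupt it by mistake; the user's intention is learned in time, and voice recognition efficiency is improved.
(4) In the embodiments of the present application, the collected voice information is segmented, so that voice recognition can start early on the segments that are already complete, which improves voice recognition efficiency.
(5) The time limit on voice control both enables effective voice control and protects the user's privacy.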
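Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a voice control method for a home appliance according to an embodiment of the present application;
FIG. 2 is a working example diagram of the voice pickup function and the voice broadcast function after the full-duplex mode is entered, according to an embodiment of the present application;
FIG. 3 is a detailed schematic flowchart of performing voice recognition on the collected ambient voice in the voice control method for a home appliance according to an embodiment of the present application;
FIG. 4 is a working example diagram of the voice pickup function and the voice broadcast function after the full-duplex mode is entered, according to an embodiment of the present application;
FIG. 5 is an application example of the voice control method according to an embodiment of the present application;
FIG. 6 is an application example of the voice control method according to an embodiment of the present application.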
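Embodiments of the Invention
To better understand the above technical solutions, exemplary embodiments of the present application are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present application, it should be understood that the present application can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present application can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
In the prior art, to control a home appliance by voice, the user must first say a valid wake-up word, such as "天猫精灵" (Tmall Genie) or "小爱同学" (Xiao AI), to wake up the home appliance, and only then can the appliance be controlled by voice, for example with "turn on the air conditioner" or "play a song". This kind of voice control is accurate, but only one voice control command can be issued per wake-up. To issue another command, for example to adjust the air conditioner temperature, the appliance has to be woken up again, which makes voice control cumbersome. In response, a newer voice control mode has appeared in the prior art: after being woken up, the home appliance alternates between voice pickup and voice broadcast, so it does not need to be woken up repeatedly, and the user can continue to control it by issuing a new voice control command after the voice broadcast ends. Compared with the former approach, this solves the problem of repeated wake-ups, but the fluency of voice control still needs improvement.
The technical solution of the present application mainly provides a voice control scheme for home appliances that includes a full-duplex mode and a non-full-duplex mode. In the full-duplex mode, voice pickup and voice broadcast can work in parallel, as in a natural conversation between people, which makes voice control of the home appliance smoother. The user can also interrupt the voice broadcast as needed to issue a voice command, which makes voice control simpler, faster, and more flexible. In addition, the scheme switches automatically between the full-duplex mode and the non-full-duplex mode according to the type of voice command issued, which both meets the needs of different usage scenarios and keeps mode switching simple, making voice control smoother.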
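As shown in FIG. 1, FIG. 1 is a schematic flowchart of a voice control method for a home appliance according to an embodiment of the present application. The voice control method in this embodiment includes the following steps:
Step S110: receive a voice wake-up command, and wake up the voice device according to the voice wake-up command;
Step S120: collect and recognize voice information in the surrounding environment, and, when a voice command is recognized, switch the voice mode according to the type of the voice command.
The voice wake-up command in step S110 may include a default wake-up word, for example "Xiaomei Xiaomei"; the wake-up word can also be personalized through the control terminal of the home appliance. The voice device may be a functional component arranged on the home appliance, or it may be a separate voice device, independent of the home appliance, that provides the voice pickup and voice broadcast functions. Such a separate device also has a network communication function: for example, it communicates with the home appliance over a network, performs voice recognition on the collected voice information, converts it into a voice command, and sends the command to the home appliance, thereby achieving voice control of the home appliance.
In step S120, the voice modes include a full-duplex mode and a non-full-duplex mode, and the voice mode entered depends on the voice command. The full-duplex mode is modeled on continuous conversation between people: the voice device can pick up voice and broadcast voice at the same time. In the full-duplex mode, therefore, the user only needs to wake up the device once and can then issue voice commands continuously, while the device broadcasts the results of executing those commands. For example, the voice device includes two functional modules, a voice pickup module and a voice broadcast module, which run independently and do not interfere with each other. With these two modules, the voice device collects voice from the surrounding environment while broadcasting whatever voice content needs to be broadcast.
As shown in FIG. 2, FIG. 2 is a working example diagram of the voice pickup function and the voice broadcast function after the full-duplex mode is entered, according to an embodiment of the present application. After the voice pickup module collects and recognizes the voice wake-up command, it is determined whether to enter the full-duplex mode; once the conditions for entering the full-duplex mode are met, the full-duplex mode is entered, and in this mode both the voice pickup function and the voice broadcast function are working. In FIG. 2, when the voice pickup module collects and recognizes voice control command 1, the home appliance is controlled to execute voice control command 1, and the voice broadcast module then broadcasts the execution result or execution status. While the voice broadcast module is broadcasting, the voice pickup module can collect voice information from the surrounding environment; that is, the voice pickup function and the voice broadcast function do not conflict and can run at the same time.
The non-full-duplex mode includes, for example, a single-round interaction mode and a multi-round interaction mode. In the non-full-duplex mode, the voice pickup module and the voice broadcast module run alternately: after the voice pickup module collects and recognizes a voice control command, the recognition result must be fed back promptly through voice broadcast. The main difference between the two is that the single-round interaction mode requires a wake-up before every voice control command, whereas the multi-round interaction mode allows multiple voice control commands after one wake-up, but only while the voice broadcast module has stopped broadcasting and the voice pickup module is working.
In the embodiments of the present application, the voice device switches automatically between the full-duplex mode and the non-full-duplex mode according to the type of voice command issued, which both meets the needs of different usage scenarios and keeps voice mode switching simple, making voice control smoother. In addition, in the full-duplex mode the voice pickup function and the voice broadcast function work in parallel, which makes the user's voice control of the home appliance smoother.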
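Further, as shown in FIG. 3, FIG. 3 is a detailed schematic flowchart of performing voice recognition on the collected ambient voice in the voice control method for a home appliance according to an embodiment of the present application. Determining the type of voice command in step S120 of the above embodiment may include the following steps:
Step S121: pick up the ambient voice according to the current voice mode, and process the picked-up ambient voice to obtain voice feature information;
Step S122: match the obtained voice feature information against the voice feature information corresponding to each type of voice command, and determine the voice command type of the ambient voice according to the matching result.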
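Specifically, the voice device picks up voice signals from the surrounding environment through the voice pickup module and performs voice recognition on the picked-up signals. The voice pickup module includes, for example, a microphone and the components related to voice recognition. One, two, or more microphones may be provided. With two or more microphones, voice information can be collected from multiple directions and processed with a differential noise-cancellation algorithm, which improves the quality of voice collection and, in turn, the voice recognition rate.
In one embodiment, the voice pickup module recognizes the ambient voice signal while collecting it. For example, the voice pickup module uses at least two running processes: one collects the ambient voice signal, and the other performs voice recognition on the collected signal. In this way, the voice pickup module can recognize speech quickly and learn the user's intention in time. In another embodiment, the voice pickup module can establish communication with a cloud server: the voice pickup module collects the ambient voice signal and sends it to the cloud server, and the cloud server performs voice recognition on the received signal. In this architecture, the voice pickup module can apply a local differential noise-reduction algorithm during collection, so that the continuous, dynamic collection of voice information is completed quickly and efficiently; combined with the cloud-side voice recognition algorithm, this further improves voice recognition efficiency and lets the user's intention be learned in time.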
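The "recognize while collecting" arrangement can be illustrated with a small producer/consumer sketch. This is not the patent's implementation: it uses two Python threads and a queue as a stand-in for the two running processes, and the names `capture`, `recognize`, `run_pipeline`, and the `recognizer` callback are hypothetical.

```python
import queue
import threading


def capture(chunks, audio_queue):
    """Capture worker: hand over each audio chunk as soon as it is available."""
    for chunk in chunks:
        audio_queue.put(chunk)
    audio_queue.put(None)  # sentinel: no more audio will arrive


def recognize(audio_queue, results, recognizer):
    """Recognition worker: consume chunks while capture may still be running."""
    while True:
        chunk = audio_queue.get()
        if chunk is None:
            break
        results.append(recognizer(chunk))


def run_pipeline(chunks, recognizer):
    """Run capture and recognition concurrently and return the recognition results."""
    audio_queue = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=capture, args=(chunks, audio_queue)),
        threading.Thread(target=recognize, args=(audio_queue, results, recognizer)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Here `recognizer` stands for whatever speech recognizer is used, local or cloud-side; because the queue is first-in, first-out and there is a single consumer, results come back in capture order.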
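After the voice pickup module collects voice information from the surrounding environment, it may first apply noise reduction to the collected ambient voice and then perform semantic recognition on it to extract voice feature information. In one embodiment, a database of voice features can be preset, and the feature words obtained by semantic recognition are compared with and filtered against the preset database to obtain the final voice feature information.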
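In an embodiment of the present application, the voice commands of the full-duplex mode (full-duplex commands) include control-type command words (voice feature information). No immediate feedback on the input command words is required; instead, feedback is given on the result of executing the voice command. These command words have a corresponding grammar structure. Taking "turn off the device" as an example, the grammar structure is as follows:
Figure dest_path_image001
The voice commands of the non-full-duplex mode (non-full-duplex commands) include natural-dialogue-type command words (voice feature information). The recognition result must be fed back immediately for each input voice command: for example, if the command is understood, the device responds by voice, and if not, it reports by voice that the command cannot be recognized. Unlike full-duplex commands, these command words do not have a corresponding grammar structure and can be more casual and flexible. Specific examples are as follows:
Figure dest_path_image002
Based on the voice feature information corresponding to the full-duplex and non-full-duplex commands, the voice feature information obtained in step S121 is recognized. When the picked-up ambient voice is recognized as a full-duplex command, the full-duplex mode is entered; when it is recognized as a non-full-duplex command, the non-full-duplex mode is entered.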
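A minimal sketch of the type decision in steps S121 and S122, assuming that recognition has already produced text and that each command type is represented by a small phrase table. The phrase tables and the substring matching rule are illustrative assumptions; the actual grammar structures are in the figures, which are not reproduced here.

```python
from enum import Enum


class CommandType(Enum):
    FULL_DUPLEX = "full_duplex"
    NON_FULL_DUPLEX = "non_full_duplex"
    NOISE = "noise"


# Hypothetical feature tables; a real system would use the preset feature database.
FEATURES = {
    CommandType.FULL_DUPLEX: ("turn on the air conditioner", "turn off the device"),
    CommandType.NON_FULL_DUPLEX: ("play a soothing song", "tell me a joke"),
}


def classify(utterance):
    """Return the command type whose feature phrase occurs in the utterance."""
    text = utterance.strip().lower()
    for command_type, phrases in FEATURES.items():
        if any(phrase in text for phrase in phrases):
            return command_type
    # Neither a full-duplex nor a non-full-duplex command: treat it as noise.
    return CommandType.NOISE
```

Anything that matches neither table is classified as noise, which corresponds to the case where the voice device keeps its current mode.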
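In another embodiment, historical voice commands can also be used to decide whether a command is a full-duplex or a non-full-duplex command. The historical voice commands include, for example, default voice control commands, voice control commands recognized through machine learning during use, voice control commands added manually by the user, voice control commands added by upgrades, and so on. Each historical voice command includes the voice feature information corresponding to a full-duplex command or a non-full-duplex command. The historical voice commands reveal which voice control commands belong to the full-duplex mode and which to the non-full-duplex mode, so it can be determined whether given voice feature information is a full-duplex voice control command. Using historical voice commands makes the distinction between full-duplex and non-full-duplex commands more accurate.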
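In the above embodiments, the voice mode is switched according to the type of voice command. For example, after the voice device is woken up, if the currently recognized voice command is a full-duplex command, the voice device enters the full-duplex mode. In this mode, the voice pickup module and the voice broadcast module work in parallel, and the voice pickup module recognizes speech while picking it up. When a full-duplex command is recognized, it is executed and the result is broadcast by voice. When a non-full-duplex command is recognized, the voice device exits the full-duplex mode, enters the non-full-duplex mode, and controls the home appliance to execute the recognized non-full-duplex command. Whether to exit the wake-up state then depends on the specific non-full-duplex mode: in the single-round interaction mode the wake-up state is exited, and in the multi-round interaction mode it is not.
As another example, after the voice device is woken up, if the currently recognized voice command is a non-full-duplex command, the voice device enters the non-full-duplex mode. In this mode, the home appliance is controlled to execute the recognized non-full-duplex command, and whether to exit the wake-up state depends on the specific non-full-duplex mode: in the single-round interaction mode the wake-up state is exited, and in the multi-round interaction mode it is not. If a full-duplex command is recognized in the multi-round interaction mode, the voice device exits the non-full-duplex mode and enters the full-duplex mode.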
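The switching rules above amount to a small state machine. The sketch below is one possible reading of them, not the patent's code; the class name `VoiceDevice`, the string labels for modes, and the `multi_round` flag are hypothetical.

```python
class VoiceDevice:
    """Toy model of the wake-up state and the current voice mode."""

    def __init__(self, multi_round=False):
        self.multi_round = multi_round  # which non-full-duplex variant is configured
        self.awake = False
        self.mode = None  # None, "full_duplex" or "non_full_duplex"

    def on_wake_word(self):
        self.awake = True

    def on_command(self, command_type):
        if not self.awake:
            return  # commands are ignored until the device is woken up
        if command_type == "full_duplex":
            self.mode = "full_duplex"
        elif command_type == "non_full_duplex":
            self.mode = "non_full_duplex"
            if not self.multi_round:
                # Single-round interaction: leave the wake-up state after one command.
                self.awake = False
                self.mode = None
        # Any other input is treated as noise and leaves the state unchanged.
```

For example, a device configured with `multi_round=True` stays awake after a non-full-duplex command and returns to the full-duplex mode as soon as a full-duplex command is recognized.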
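Further, full-duplex commands may cover multiple voice dialogue scenarios, such as "air conditioner control" and "sleep control". Taking the "air conditioner control" scenario as an example, the related voice commands include "set the temperature to 26°C", "set the wind speed to medium", "set the wind direction to up-and-down sweep", "set the humidity to 60%", "turn on cooling mode", and so on. If the obtained voice feature information is "medium wind", "wind speed medium", "adjust wind speed to medium", or the like, it is determined to be the voice control command "set the wind speed to medium" in this scenario. If the obtained voice feature information is "up-and-down sweep", "sweep", or the like, it is determined to be the voice control command "set the wind direction to up-and-down sweep" in this scenario. During voice recognition, the historical voice commands of the current scenario can be tried first, which speeds up recognition. If no voice command can be recognized in the current scenario, the historical voice commands of the other scenarios are tried; if no voice command can be recognized there either, the collected voice information is determined to be noise.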
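The scenario-first lookup can be sketched as follows, assuming the historical commands are stored as a per-scenario table of accepted phrasings. The table contents, the scenario keys, and the function name `resolve` are hypothetical.

```python
# Hypothetical per-scenario history: canonical command -> accepted phrasings.
SCENES = {
    "air_conditioner": {
        "set the wind speed to medium": ["medium wind", "wind speed medium"],
        "set the wind direction to up-and-down sweep": ["up-and-down sweep", "sweep"],
    },
    "sleep": {
        "start sleep mode": ["sleep mode", "going to sleep"],
    },
}


def resolve(feature, current_scene, scenes=SCENES):
    """Look up a feature phrase, trying the current scenario before the others."""
    ordered = [current_scene] + [name for name in scenes if name != current_scene]
    for scene in ordered:
        for command, variants in scenes.get(scene, {}).items():
            if feature in variants:
                return scene, command
    return None  # no scenario matched: the input is treated as noise
```

Trying the current scenario first keeps the common case fast, while the fallback over the remaining scenarios still lets the user change topic without re-waking the device.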
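Further, after collecting the ambient voice, the voice pickup module also segments the collected ambient voice and, at the same time, performs voice recognition on the segmented ambient voice. Because the collected voice information is segmented, voice recognition can start early on the segments that are already complete, which improves voice recognition efficiency.
Specifically, in one embodiment, the voice information can be segmented by volume: during collection, the collected voice information is split into multiple voice segments. For example, a volume threshold (such as 3000) is set, and voice information below the threshold is judged to be silence. When the volume of the collected voice information falls below the threshold, the voice information is split at that point, and the resulting segment is passed to voice recognition. In another embodiment, the voice information can be segmented by the pause time between utterances: during collection, the collected voice information is split into multiple voice segments. For example, a time threshold (0.5 seconds) is set, and a pause longer than the threshold is judged to be silence. When the pause between pieces of collected voice information exceeds the threshold, the voice information is split at that point, and the resulting segment is passed to voice recognition.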
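As an illustration, the two criteria can be combined in one pass over per-frame volume levels. This is a sketch under assumptions: the input is a list of volume values, one per fixed-length frame; the 100 ms frame length and the name `split_segments` are hypothetical; only the 3000 and 0.5 second thresholds come from the text.

```python
def split_segments(levels, volume_threshold=3000, frame_ms=100, pause_ms=500):
    """Split per-frame volume levels into speech segments.

    A frame below volume_threshold counts as silence. A segment is closed once
    the silence has lasted longer than pause_ms; shorter pauses stay inside the
    segment, and silent frames themselves are not stored.
    """
    segments, current, silent_frames = [], [], 0
    for level in levels:
        if level >= volume_threshold:
            current.append(level)
            silent_frames = 0
        else:
            silent_frames += 1
            if current and silent_frames * frame_ms > pause_ms:
                segments.append(current)  # this segment can be recognized right away
                current = []
    if current:
        segments.append(current)
    return segments
```

Each closed segment can be handed to recognition immediately, while collection of the next segment continues.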
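Further, when voice recognition is performed on the segmented voice information, the recognition results of the preceding and following segments are also combined to determine the voice control command. Because the voice information has been segmented, several consecutive segments may need to be combined before the corresponding voice control command can be determined accurately. For example, if the user says "It's so cold... raise the temperature", segmentation may split the utterance into several segments, and the recognition result of a single segment is not enough to know the user's real intention. Only by analyzing the segments together can it be determined that the user's real intention is to "raise the target temperature of the air conditioner", which yields the voice control command "raise the target temperature".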
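Further, besides a full-duplex voice command, the full-duplex mode can also be entered in another way: through a third-party control device with the relevant plug-in installed. That is, when an instruction to enter the full-duplex mode is received from a mobile terminal, the full-duplex mode is entered. The mobile terminal may have a plug-in or application for controlling the voice device installed; through it, the voice device can be configured and managed, and its functions can be started.
Further, besides a non-full-duplex command, the full-duplex mode can also be exited in other ways. In one embodiment, it is exited by a voice command for leaving the full-duplex mode, such as "turn off full-duplex mode" or "exit full-duplex mode". In another embodiment, if no voice control command is recognized within a preset time after the full-duplex mode is entered, the full-duplex mode is exited and the wake-up state is exited.
After the full-duplex or non-full-duplex mode is entered, voice commands can be collected, but speech in the surrounding environment that is not a voice command is also collected by the voice device. This applies especially to the full-duplex mode, in which the voice pickup module is picking up voice continuously. For the sake of personal privacy, users usually do not want speech that is not a voice command to be collected. In this embodiment, therefore, a preset time is set, for example 30 seconds; if no voice control command is recognized within the preset time, the current voice mode is exited and the wake-up state is exited.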
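In the embodiments of the present application, the time limit on the voice mode both enables effective voice control and protects the user's privacy.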
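The preset-time rule can be sketched with an explicit clock value passed in by the caller, which keeps the logic easy to test. The class name `WakeSession` and its methods are hypothetical; only the 30-second example value comes from the text.

```python
class WakeSession:
    """Tracks how long the device has been awake without a recognized command."""

    def __init__(self, timeout_seconds=30.0, now=0.0):
        self.timeout_seconds = timeout_seconds
        self.last_command_at = now  # the timer starts when the device wakes up
        self.awake = True

    def on_command(self, now):
        """A recognized voice command restarts the preset time."""
        if self.awake:
            self.last_command_at = now

    def tick(self, now):
        """Exit the wake-up state once the preset time has elapsed; return the state."""
        if self.awake and now - self.last_command_at >= self.timeout_seconds:
            self.awake = False  # also the point at which the voice mode is exited
        return self.awake
```

In a real device `now` would come from a monotonic clock, and `tick` would be called periodically by the main loop.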
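Further, when the full-duplex mode is entered in step S120, the method further includes: if the picked-up voice information is recognized as a voice command, stopping the current voice broadcast and controlling the home appliance to execute the voice command.
The two functional modules of the voice device, the voice pickup module and the voice broadcast module, are also controlled by the processor of the voice device, which can issue a control instruction at any time to stop either module. For example, while the voice broadcast module is broadcasting, the processor can stop the broadcast based on what the voice pickup module has collected, just as a person in a conversation can choose, depending on what the other person is saying, either to keep listening or to interrupt.
Specifically, because the ambient voice signal is recognized while it is being collected, the user's intention can be predicted before the complete voice signal has been collected, and it can thus be determined whether the voice signal is a voice control command. If so, the voice broadcast is stopped and the home appliance is controlled to execute the voice control command. Because the broadcast has stopped, the voice pickup module can pick up the user's speech more accurately, so the home appliance can alternatively be controlled to execute the command only after the complete voice signal has been collected and confirmed again as a voice control command.
In the above embodiments, if the ambient voice is recognized as not being a voice command, that is, neither a full-duplex command nor a non-full-duplex command, the ambient voice is determined to be noise, and the voice device keeps its current voice mode. If the voice broadcast module is currently broadcasting, the broadcast continues, which prevents noise from interrupting it by mistake.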
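The interrupt rule (a recognized command stops the broadcast, noise does not) can be sketched as below. The `Broadcaster` class, the `handle_recognition` function, and the convention that `None` stands for noise are all hypothetical choices made for the illustration.

```python
class Broadcaster:
    """Minimal stand-in for the voice broadcast module."""

    def __init__(self):
        self.playing = False
        self.text = None

    def start(self, text):
        self.text = text
        self.playing = True

    def stop(self):
        self.playing = False


def handle_recognition(result, broadcaster, execute):
    """Apply the interrupt rule to one recognition result.

    result is the recognized voice command, or None when the input was noise.
    Returns True if a command was executed.
    """
    if result is None:
        return False  # noise: the current broadcast continues untouched
    if broadcaster.playing:
        broadcaster.stop()  # a real command interrupts the broadcast
    execute(result)
    return True
```

Here `execute` stands for whatever sends the command to the home appliance; in the test below it is simply a list's `append` method.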
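As shown in FIG. 4, FIG. 4 is a working example diagram of the voice pickup function and the voice broadcast function after the full-duplex mode is entered, according to an embodiment of the present application. In the full-duplex mode, both the voice pickup function and the voice broadcast function are working. In FIG. 4, when the voice pickup module collects and recognizes voice control command 2, the home appliance is controlled to execute voice control command 2, and the voice broadcast module then broadcasts the execution result or execution status. If the voice pickup module collects and recognizes voice control command 3 while the voice broadcast module is broadcasting, the voice broadcast module is controlled to stop broadcasting. In FIG. 2, by contrast, if what the voice pickup module collects and recognizes while the voice broadcast module is broadcasting is not a voice control command (that is, it is noise), the voice broadcast module continues broadcasting.
With the voice recognition processing of this embodiment, a full-duplex voice control command can be recognized quickly and can interrupt the current voice broadcast. This improves voice recognition efficiency, makes voice control smoother, and also prevents noise from interrupting the broadcast by mistake.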
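Further, when the voice broadcast module of the above embodiments receives voice content to be broadcast, it parses the content and selects the corresponding voice broadcast mode according to the parsing result.
In one embodiment, the voice broadcast module parses the voice content to be broadcast, for example by performing word segmentation and sentence segmentation to determine the key information, and controls the volume, the speaking rate, the pauses between words, and so on during the broadcast, so that the broadcast sounds better and the user experience is improved.
Further, when the voice broadcast module of the above embodiments receives voice content to be broadcast, it can also, according to the broadcast control settings, broadcast in a voice mode suited to the user. For example, after collecting voice information from the surrounding environment, the voice pickup module performs voice recognition on it to identify the type of the current user, such as an elderly person, a child, a man, or a woman, so that a corresponding voice mode can be selected for the broadcast according to the identified user type, further improving the user experience.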
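The voice control process of the embodiments of the present application is illustrated below using the voice control of an air conditioner as an example.
As shown in FIG. 5, FIG. 5 is an application example of the voice control method according to an embodiment of the present application. The user says "Xiaomei Xiaomei"; the voice device collects and recognizes the ambient voice as a wake-up command and gives the voice feedback "I'm here" through voice broadcast. Other feedback phrases, such as "please order", can also be configured. The user then says "turn on natural dialogue"; the voice device collects and recognizes the ambient voice as a full-duplex command, enters the full-duplex mode, and gives the voice feedback "You can talk freely now" through voice broadcast. In the full-duplex mode, the user does not need to wake up the device again, and the voice pickup function and the voice broadcast function of the voice device are started and work in parallel. When the user says "turn on the air conditioner", the voice device collects and recognizes the ambient voice as a full-duplex command, controls the air conditioner to turn on, and gives the voice feedback "The air conditioner is turned on, cooling mode, 26°C, natural wind" through voice broadcast. During this broadcast, the user can issue another voice command, for example "set the temperature to 24°C, strong wind". The voice device collects and recognizes the ambient voice as a full-duplex command and stays in the full-duplex mode; that is, it controls the air conditioner to adjust the target temperature and wind speed and gives the voice feedback "The temperature has been adjusted to 24°C, and the wind speed has been adjusted to strong wind" through voice broadcast. If the user then issues the voice command "play a soothing song", the voice device collects and recognizes the ambient voice as a non-full-duplex command, exits the full-duplex mode, enters the non-full-duplex mode, feeds back the recognition result "OK, playing it for you right away, master" through voice broadcast, and then controls the voice device to play the selected song. If the non-full-duplex mode adopts the single-round interaction mode, the wake-up state is exited, and the device must be woken up again before further voice control; if the non-full-duplex mode adopts the multi-round interaction mode, the wake-up state is not exited, and voice control continues in the multi-round interaction mode.
As shown in FIG. 6, FIG. 6 is an application example of the voice control method according to an embodiment of the present application. The user says "Xiaomei Xiaomei"; the voice device collects and recognizes the ambient voice as a wake-up command and gives the voice feedback "I'm here" through voice broadcast. Other feedback phrases, such as "please order", can also be configured. When the user says "turn on the air conditioner", the voice device collects and recognizes the ambient voice as a full-duplex command, enters the full-duplex mode, controls the air conditioner to turn on, and then gives the voice feedback "The air conditioner is turned on, cooling mode, 26°C, natural wind" through voice broadcast. During this broadcast, the user can issue another voice command. If the user does not issue any voice command within the preset time (for example, 30 seconds), so that the voice device collects no voice command, the voice device gives the voice feedback "Going back for now, remember to wake me up later" through voice broadcast, then exits the current voice mode and exits the wake-up state.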
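The home appliance and the voice device mentioned above may each include a processor, a memory, and a communication module. The memory can serve as a computer storage medium and may contain an operating system and the voice control program of the home appliance. The voice control program is invoked by the processor of the home appliance to execute the steps of the voice control method for a home appliance in the above embodiments. Alternatively, the voice control program is invoked by the processor of the voice device to execute the steps of the voice control method for a home appliance in the above embodiments.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should be noted that, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present application may be implemented by means of hardware comprising several different components and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words can be interpreted as names.
Although preferred embodiments of the present application have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present application.
Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. Thus, if these modifications and variations of the present application fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include them.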

Claims (10)

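  1. A voice control method for a home appliance, wherein the home appliance is voice-controlled through a voice device, and the voice control method comprises the following steps:
    receiving a voice wake-up command, and waking up the voice device according to the voice wake-up command;
    collecting and recognizing voice information in the surrounding environment, and, when a voice command is recognized, switching the voice mode according to the type of the voice command, wherein the voice modes comprise a full-duplex mode and a non-full-duplex mode.
  2. The voice control method for a home appliance according to claim 1, wherein collecting and recognizing voice information in the surrounding environment comprises:
    picking up the ambient voice according to the current voice mode, and processing the picked-up ambient voice to obtain voice feature information;
    matching the obtained voice feature information against the voice feature information corresponding to each type of voice command, and determining the voice command type of the ambient voice according to the matching result.
  3. The voice control method for a home appliance according to claim 2, wherein, before voice recognition is performed on the picked-up ambient voice, the method further comprises:
    segmenting the collected ambient voice, and at the same time performing voice recognition on the segmented ambient voice.
  4. The voice control method for a home appliance according to any one of claims 1-3, wherein the voice control method further comprises:
    if no voice command is recognized within a preset time, exiting the current voice mode and exiting the wake-up state.
  5. The voice control method for a home appliance according to any one of claims 1-3, wherein the voice control method further comprises:
    in the non-full-duplex mode, controlling the home appliance to execute the recognized voice command, and determining whether to exit the wake-up state according to the type of non-full-duplex mode.
  6. The voice control method for a home appliance according to claim 1, wherein the voice control method further comprises:
    entering the full-duplex mode when an instruction to enter the full-duplex mode is received from a mobile terminal.
  7. The voice control method for a home appliance according to claim 1, wherein the voice control method further comprises:
    if the picked-up ambient voice is recognized as a voice command, stopping the current voice broadcast and controlling the home appliance to execute the voice command.
  8. A voice device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the voice control method for a home appliance according to any one of claims 1-7 to control the home appliance by voice.
  9. A home appliance, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the voice control method for a home appliance according to any one of claims 1-7.
  10. A computer storage medium, wherein a voice control program is stored on the computer storage medium, and the voice control program, when executed by a processor, implements the steps of the voice control method for a home appliance according to any one of claims 1-7.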
PCT/CN2021/090041 2020-12-31 2021-04-26 家电设备及其语音控制方法、语音装置、计算机存储介质 WO2022141990A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011645138.9 2020-12-31
CN202011645138.9A CN112820290A (zh) 2020-12-31 2020-12-31 家电设备及其语音控制方法、语音装置、计算机存储介质

Publications (1)

Publication Number Publication Date
WO2022141990A1 true WO2022141990A1 (zh) 2022-07-07

Family

ID=75856699

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090041 WO2022141990A1 (zh) 2020-12-31 2021-04-26 家电设备及其语音控制方法、语音装置、计算机存储介质

Country Status (2)

Country Link
CN (1) CN112820290A (zh)
WO (1) WO2022141990A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113539263B (zh) * 2021-07-09 2023-11-14 广东金鸿星智能科技有限公司 一种电动门的语音控制方法和系统
CN113707143A (zh) * 2021-08-20 2021-11-26 珠海格力电器股份有限公司 语音处理方法、装置、电子设备和存储介质
CN114400001A (zh) * 2021-12-20 2022-04-26 上海华兴数字科技有限公司 作业机械语音交互方法、系统及作业机械
CN115631752B (zh) * 2022-12-19 2023-02-28 深圳慢云智能科技有限公司 一种支持机器学习的智能设备ai语音控制方法及系统

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104679472A (zh) * 2015-02-13 2015-06-03 百度在线网络技术(北京)有限公司 人机语音交互方法和装置
US10778826B1 (en) * 2015-05-18 2020-09-15 Amazon Technologies, Inc. System to facilitate communication
CN107424607A (zh) * 2017-07-04 2017-12-01 珠海格力电器股份有限公司 语音控制模式切换方法、装置及具有该装置的设备
CN109994108A (zh) * 2017-12-29 2019-07-09 微软技术许可有限责任公司 用于聊天机器人和人之间的会话交谈的全双工通信技术
US20200150919A1 (en) * 2018-11-13 2020-05-14 Synervoz Communications Inc. Systems and methods for contextual audio detection and communication mode transactions
CN109712621A (zh) * 2018-12-27 2019-05-03 维沃移动通信有限公司 一种语音交互控制方法及终端
CN109657091A (zh) * 2019-01-02 2019-04-19 百度在线网络技术(北京)有限公司 语音交互设备的状态呈现方法、装置、设备及存储介质
CN111508474A (zh) * 2019-08-08 2020-08-07 马上消费金融股份有限公司 一种语音打断方法、电子设备及存储装置
CN110557451A (zh) * 2019-08-30 2019-12-10 北京百度网讯科技有限公司 对话交互处理方法、装置、电子设备和存储介质
CN110618613A (zh) * 2019-09-03 2019-12-27 珠海格力电器股份有限公司 一种智能设备的联动控制方法及装置
CN112735398A (zh) * 2019-10-28 2021-04-30 苏州思必驰信息科技有限公司 人机对话模式切换方法及系统
CN112002315A (zh) * 2020-07-28 2020-11-27 珠海格力电器股份有限公司 一种语音控制方法、装置、电器设备、存储介质及处理器

Also Published As

Publication number Publication date
CN112820290A (zh) 2021-05-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21912753

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.11.2023)