WO2021004236A1

WO2021004236A1 - Voice control method and system, device and computer-readable storage medium

Info

Publication number: WO2021004236A1
Application number: PCT/CN2020/096267
Authority: WO
Inventors: 庄健春
Original assignee: 深圳开立生物医疗科技股份有限公司
Priority date: 2019-07-08
Filing date: 2020-06-16
Publication date: 2021-01-14
Also published as: CN110335599B; CN110335599A

Abstract

A voice control method and system, a device and a computer-readable storage medium, applied to a smart device. Said method comprises: when determining to execute a voice interaction function, the smart device continuously acquiring a voice to obtain a target voice (S101); and identifying a target command in the target voice, and responding to the target command (S102). As the voice is continuously acquired, a user can continuously input a voice without continuing to wake up a smart device, and there is no case where the smart device sleeps before receiving the voice audio, so that the efficiency of voice acquisition by the smart device can be improved, and the efficiency of voice processing can further be improved.

Description

Voice control method, system, equipment and computer readable storage medium

This application claims priority to Chinese patent application No. 201910611505.4, entitled "A voice control method, system, device and computer-readable storage medium" and filed with the Chinese Patent Office on July 8, 2019, the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of communication technology, and more specifically, to a voice control method, system, device, and computer-readable storage medium.

Background art

With the development of communication technology, more and more smart devices have entered users' lives and attracted their attention. One feature of smart devices is that they can recognize and respond to users' voices. Taking a mobile phone as an example of a smart device: when the user wakes up the voice recognition function of the mobile phone with a specific voice, the mobile phone collects the voice input by the user over a period of time and processes it accordingly, enters the sleep state after the processing is completed, and waits to be awakened by the user again. That is to say, when using the voice interaction function of a smart device such as a mobile phone, the user has to wake up the device many times, and if the voice input is not completed within a specific time after wake-up, the mobile phone still enters the sleep state. This gives the user a poor experience and makes the smart device less efficient at processing voice. In addition, the portability of a mobile phone can make up for the shortcomings of its way of triggering voice input (long pressing the menu button, etc.), but for smart devices that are relatively large and not portable, such operation is time-consuming and laborious.

Summary of the invention

The purpose of this application is to provide a voice control method that can solve the problem of how to improve the efficiency of voice processing by smart devices to a certain extent. This application also provides a voice control system, equipment, and computer-readable storage medium.

In order to achieve the above objectives, this application provides the following technical solutions:

A voice control method applied to smart devices, including:

When it is determined to perform the voice interaction function, continue to collect voices to obtain the target voice;

Recognizing the target command in the target voice, and responding to the target command.

Preferably, the target voice is composed of voice units;

The continuously collecting voice to obtain the target voice includes:

Determine whether the current moment belongs to the preset voice collection moment;

If the current moment belongs to the voice collection moment, starting from the current moment, a voice of a preset duration is collected as the voice unit;

If the current time does not belong to the voice collection time, return to the step of determining whether the current time belongs to the preset voice collection time.

Preferably, before the judging whether the current moment belongs to the preset voice collection moment, the method further includes:

Determining the voice collection moments and the preset duration according to the principle that the duration between adjacent voice collection moments is less than the preset duration, and the preset duration is greater than or equal to the voice duration of the target command.

Preferably, the determining the voice collection moments and the preset duration according to the principle that the duration between adjacent voice collection moments is less than the preset duration, and the preset duration is greater than or equal to the voice duration of the target command, includes:

Determining the voice collection moments and the preset duration according to a duration relationship formula, following the principle that the duration between adjacent voice collection moments is less than the preset duration, and the preset duration is greater than or equal to the voice duration of the target command;

The duration relationship formula includes:

X≤(N-1)L/N; L=NP;

Wherein, X represents the voice duration of the target command; N represents a positive integer greater than 1; L represents the preset duration; P represents the duration between adjacent voice collection moments.

Preferably, the collecting a voice of a preset duration from the current moment as the voice unit includes:

Select a free storage space for storing voice as the target storage space;

All the voices collected from the current moment are stored in the target storage space until the target storage space is full to obtain the voice unit;

Wherein, the duration of the voice that can be stored in the storage space is the preset duration.

Preferably, the selecting a free storage space for storing voice as the target storage space includes:

Determine whether there is free storage space;

If there is no free storage space, create a storage space and use it as the target storage space;

If there is a free storage space, a free storage space is selected as the target storage space.

Preferably, after the voices collected from the current moment are all stored in the target storage space until the target storage space is full to obtain the voice unit, the method further includes:

Storing the voice unit in the target storage space in a preset audio queue;

Release the target storage space;

The recognizing the target command in the target voice includes:

Acquiring one of the voice units from the preset audio queue for command recognition;

Deleting the acquired voice unit from the preset audio queue.

Preferably, the recognizing the target command in the target voice includes:

The target voice is matched with a preset grammar, and if the matching is successful, the preset grammar that matches the target voice is mapped to the target command.

Preferably, the smart device includes an ultrasound device;

The recognizing the target command in the target voice and responding to the target command includes:

Recognizing the ultrasonic instruction in the target voice, and responding to the ultrasonic instruction.

A voice control system applied to smart devices, including:

The first collection module is used to continuously collect voices to obtain the target voice when it is determined to perform the voice interaction function;

The first recognition module is used to recognize the target command in the target voice and respond to the target command.

An ultrasound device, including:

Memory, used to store computer programs;

The processor is used to implement the steps of any of the above voice control methods when executing the computer program.

A computer-readable storage medium is applied to a smart device. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of any of the above voice control methods are realized.

The voice control method provided by the present application is applied to a smart device: when it is determined to perform the voice interaction function, the smart device continuously collects voice to obtain the target voice, recognizes the target command in the target voice, and responds to the target command. Because the voice is continuously collected, the user can keep inputting voice without repeatedly waking up the smart device, and the smart device never goes to sleep before the voice has been received. This improves the efficiency with which the smart device collects voice and, in turn, the efficiency of voice processing. The voice control system, device, and computer-readable storage medium provided by this application solve the corresponding technical problems as well.

Description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained from the provided drawings without creative work.

FIG. 1 is a first flowchart of a voice control method provided by an embodiment of this application;

FIG. 2 is a second flowchart of a voice control method provided by an embodiment of this application;

FIG. 3 is a schematic diagram of the relationship between the voice duration of the target command, the preset duration, and the duration between adjacent voice collection moments;

FIG. 4 is a schematic structural diagram of a voice control system provided by an embodiment of this application;

FIG. 5 is a schematic structural diagram of a voice control device provided by an embodiment of this application;

FIG. 6 is another schematic structural diagram of a voice control device provided by an embodiment of the application.

Detailed description

The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

With the development of communication technology, more and more smart devices have entered users' lives and attracted their attention. One feature of smart devices is that they can recognize and respond to users' voices. Taking a mobile phone as an example of a smart device: when the user wakes up the voice recognition function of the mobile phone with a specific voice, the mobile phone collects the voice input by the user over a period of time and processes it accordingly, enters the sleep state after the processing is completed, and waits to be awakened by the user again. That is, when a user uses the voice interaction function of a smart device such as a mobile phone, the device needs to be awakened multiple times, and if the voice input is not completed within a certain time after wake-up, the mobile phone still goes to sleep. For smart devices that are relatively large and not portable, this way of operating by voice degrades the user experience. The voice control method provided in the present application can improve the convenience and voice processing efficiency for a user of a smart device.

Please refer to FIG. 1. FIG. 1 is a first flowchart of a voice control method according to an embodiment of the application.

A voice control method provided by an embodiment of the present application, applied to a smart device, may include the following steps:

Step S101: When it is determined to perform the voice interaction function, the voice is continuously collected to obtain the target voice.

In practical applications, when the smart device determines that the voice interaction function is to be performed, it continuously collects voice and obtains the corresponding target voice. The type of smart device can be chosen according to actual needs; for example, it can be a mobile phone, a tablet, an ultrasound device, etc. The way in which the smart device determines that the voice interaction function is to be performed can also be chosen flexibly according to actual needs. For example, the smart device can determine that it needs to perform the voice interaction function after receiving a specific trigger command, when a specific button of the device is triggered, or when a button of the device is triggered in a specific trigger mode.

Step S102: Recognize the target command in the target voice, and respond to the target command.

In practical applications, after the smart device collects the target voice, it can recognize the target command in the target voice and respond to the target command accordingly. In a specific application scenario, a grammar recognition network can be built in the smart device in advance, and the target voice can be matched against the grammar recognition network to obtain the corresponding target command. Alternatively, the target voice can be matched directly with a preset grammar; if the matching succeeds, the preset grammar that matches the target voice is mapped to the target command.
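The direct-matching variant can be illustrated with a minimal sketch. It assumes that the target voice has already been transcribed to text by some speech recognizer; the grammar patterns, the command names, and the function `match_command` are hypothetical and are not taken from the application.

```python
import re
from typing import Optional

# Hypothetical preset grammars: each regular expression maps to one command name.
PRESET_GRAMMARS = {
    r"(please )?freeze( the)? image": "FREEZE",
    r"(please )?save( the)? image": "SAVE_IMAGE",
    r"increase (the )?gain": "GAIN_UP",
}


def match_command(transcript: str) -> Optional[str]:
    """Return the command whose preset grammar matches the transcript, or None."""
    text = transcript.strip().lower()
    for pattern, command in PRESET_GRAMMARS.items():
        # fullmatch succeeds only if the whole transcript fits the grammar.
        if re.fullmatch(pattern, text):
            return command
    return None
```

A transcript that matches no grammar yields `None`, which corresponds to the case where matching is unsuccessful and no target command is produced.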

In a specific application scenario, the smart device may be an ultrasound device. At this time, when recognizing the target command in the target voice and responding to the target command, it can recognize the ultrasonic command in the target voice and respond to the ultrasonic command.

In practical applications, whether the smart device turns off the voice interaction function can be controlled externally, for example through an instruction. In that case, after responding to the target command, the smart device can also determine whether it has received a voice interaction function closing instruction; if the closing instruction has been received, voice collection is stopped; if it has not been received, voice collection continues. It should be pointed out that the voice interaction function closing instruction may be an instruction input by the user's voice, or an instruction generated after the user triggers a button on the smart device.
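The overall flow of steps S101 and S102 together with the closing check can be sketched as a simple loop. This is only an illustration: the function `run_voice_interaction` and its callback parameters are hypothetical names, and the voice source is modelled as an iterable rather than a real microphone.

```python
from typing import Callable, Iterable, Optional


def run_voice_interaction(
    voice_source: Iterable[str],
    recognize: Callable[[str], Optional[str]],
    respond: Callable[[str], None],
    close_requested: Callable[[], bool],
) -> None:
    """Collect, recognize and respond until a closing instruction is seen."""
    for target_voice in voice_source:        # S101: keep collecting voice
        command = recognize(target_voice)    # S102: recognize the target command
        if command is not None:
            respond(command)                 # S102: respond to the target command
        if close_requested():                # closing instruction (voice or button)
            break                            # stop collecting voice
```

The closing check is passed in as a callback so that the same loop works whether the closing instruction comes from the user's voice or from a button on the device.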

The voice control method provided by the present application is applied to a smart device: when it is determined to perform the voice interaction function, the smart device continuously collects voice to obtain the target voice, recognizes the target command in the target voice, and responds to the target command. Because the voice is continuously collected, the user can keep inputting voice without repeatedly waking up the smart device, and the smart device never goes to sleep before the voice has been received. This improves the efficiency with which the smart device collects voice and, in turn, the efficiency of voice processing. Because there is no need to wake up the device repeatedly, the operation is simple and convenient and is suitable for large smart devices.

Please refer to FIG. 2. FIG. 2 is a second flowchart of a voice control method provided by an embodiment of this application.

In practical applications, the target voice in this application may be composed of multiple voice units, and a voice control method provided in an embodiment of this application may include the following steps:

Step S201: When it is determined to perform the voice interaction function, it is determined whether the current time belongs to the preset voice collection time, if it is, step S202 is executed, and if not, step S201 is returned to.

Step S202: Collect a voice of a preset duration as a voice unit, and perform step S203.

In practical applications, if the smart device collects voice continuously without any interval, its power consumption will be large. To reduce the power consumption, when voice is continuously collected, the smart device can first determine whether the current moment belongs to the preset voice collection moments; if so, a voice of the preset duration is collected as a voice unit. Since voice collection only starts at the voice collection moments, the power consumption of the smart device when collecting voice can be reduced compared with collecting voice continuously without interval. Moreover, compared with collecting voice continuously to obtain one whole target voice, collecting voices of the preset duration at different voice collection moments as voice units is equivalent to splitting the target voice into multiple voice units, so that command recognition and processing can be performed on the collected voice unit by unit; that is, while the next voice unit is being collected, the voice unit already collected can be processed. Compared with processing the voice only after the complete target voice has been collected, this improves the efficiency of command recognition and processing. It should be pointed out that the voice collection moments involved in this application form a set; that is, the voice collection moment does not take a single value, and the number of moments can be determined by the total voice collection duration in the specific application scenario.
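How a stream is divided into voice units can be illustrated offline. The sketch below assumes an already recorded sequence of samples, with voice collection moments at every multiple of `period` samples and a preset duration of `duration` samples; the function name and parameters are hypothetical, and a real device would fill the units in real time rather than slice a finished recording.

```python
from typing import List, Sequence


def split_into_voice_units(
    samples: Sequence[int], period: int, duration: int
) -> List[List[int]]:
    """Cut a recorded sample stream into voice units.

    A unit starts at every multiple of `period` (the voice collection moments)
    and spans `duration` samples (the preset duration). Units that would run
    past the end of the recording are dropped.
    """
    if period <= 0 or duration <= 0:
        raise ValueError("period and duration must be positive")
    units = []
    start = 0
    while start + duration <= len(samples):
        units.append(list(samples[start:start + duration]))
        start += period  # next voice collection moment
    return units
```

For example, eight samples with `period=2` and `duration=4` give three units that start at samples 0, 2 and 4, each overlapping its neighbour by two samples.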

In specific application scenarios, when voices of the preset duration are collected as voice units at different voice collection moments, the target command in the target voice may be contained in a single voice unit. In that case, when command recognition is performed on each voice unit, the smart device only needs to respond to the target command directly after recognizing it. When the target command is spread over multiple voice units, the command recognized from each voice unit is only part of the target command. In that case, after the commands in the voice units have been recognized, the recognized partial commands need to be pieced together to recover the target command before the smart device responds to it.
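The application does not say how partial commands are pieced together. One possible approach, shown here purely as an assumption, is to treat each partial result as a text fragment and join consecutive fragments until they form a known command. The function `assemble_commands` and the command strings are hypothetical, and the sketch assumes that fragments from consecutive voice units do not overlap.

```python
from typing import Iterable, List, Set


def assemble_commands(fragments: Iterable[str], known_commands: Set[str]) -> List[str]:
    """Join consecutive partial commands until they form a known command."""

    def is_prefix(text: str) -> bool:
        return any(cmd.startswith(text) for cmd in known_commands)

    completed: List[str] = []
    pending = ""
    for fragment in fragments:
        candidate = (pending + " " + fragment).strip()
        if not is_prefix(candidate):
            candidate = fragment      # earlier words lead nowhere; restart here
        if candidate in known_commands:
            completed.append(candidate)
            pending = ""
        elif is_prefix(candidate):
            pending = candidate       # still incomplete; wait for the next unit
        else:
            pending = ""              # fragment matches nothing; discard it
    return completed
```

This extra bookkeeping is the kind of patching work that the overlapping-unit scheme described in the following paragraphs is intended to avoid.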

In specific application scenarios, if the preset duration is less than the duration between two adjacent voice collection moments, the target voice collected by the smart device will be incomplete, and the smart device may be unable to recognize the instruction in the target voice. To avoid this, before judging whether the current moment belongs to the preset voice collection moments, the voice collection moments and the preset duration can be determined according to the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command. Under this principle, the target command tends to be collected completely into one voice unit, so that the smart device can obtain the complete target command from a single voice unit without having to piece recognized commands together, which further improves the efficiency of the smart device in processing voice. Of course, other methods may also be used to determine the voice collection moments and the preset duration, which are not specifically limited in this application.

In specific application scenarios, when the voice collection moments and the preset duration are determined according to the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command, they can be determined according to a duration relationship formula.

The duration relationship formula includes:

X≤(N-1)L/N; L=NP;

Among them, X represents the voice duration of the target command; N represents a positive integer greater than 1; L represents the preset duration; P represents the duration between adjacent voice collection moments.

The derivation process of the duration relationship formula is as follows:

Please refer to FIG. 3, which is a schematic diagram of the relationship between the voice duration of the target command, the preset duration, and the duration between adjacent voice collection moments. To align the data and make it easier to process, assume that L=NP, that is, L is an integer multiple of P. For some voice unit to contain the entire target command, X≤(N-1)P is required, that is, X≤(N-1)L/N. For ease of understanding, assume that the voice duration of the target command is 2 seconds and the duration between adjacent voice collection moments is 2 seconds; with N=2, the preset duration is 4 seconds. The target command is then collected into a single voice unit no matter when it is spoken.
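The containment condition can also be checked step by step. The following is a sketch of one possible argument, under the assumptions (not stated explicitly in the application) that voice units start exactly at times 0, P, 2P, ... and that the target command is a single contiguous utterance starting at an arbitrary time t; the symbols t and k are introduced here for illustration only.

```latex
\begin{align*}
&\text{Let } k = \lfloor t/P \rfloor, \text{ so that } 0 \le t - kP < P. \\
&\text{Unit } k \text{ covers } [\,kP,\; kP + L\,], \text{ and the command covers } [\,t,\; t + X\,]. \\
&t + X - kP \;<\; P + X \;\le\; P + (N-1)P \;=\; NP \;=\; L \\
&\Longrightarrow\; kP \le t \quad\text{and}\quad t + X < kP + L .
\end{align*}
```

With the values from the example (X = 2 s, P = 2 s, N = 2, L = 4 s), a command that starts at t = 3 s and ends at 5 s falls entirely inside the unit that starts at 2 s and ends at 6 s.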

According to the calculation formula, it can be ensured that the target command can be completely collected into a voice unit.
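The formula can be turned into a small numerical check. The sketch below picks the smallest P and L that satisfy X≤(N-1)L/N with L=NP, and then tests whether a command starting at a given time fits inside one voice unit. The function names are hypothetical, and the sketch assumes that units start at exact multiples of P.

```python
from typing import Optional, Tuple


def choose_durations(command_duration: float, n: int = 2) -> Tuple[float, float]:
    """Return the smallest (L, P) with L = N*P and X <= (N-1)*L/N.

    With L = N*P the condition is equivalent to P >= X / (N-1).
    """
    if n < 2:
        raise ValueError("N must be an integer greater than 1")
    if command_duration <= 0:
        raise ValueError("command duration must be positive")
    period = command_duration / (n - 1)   # P
    preset = n * period                   # L
    return preset, period


def covering_unit(start: float, x: float, preset: float, period: float) -> Optional[int]:
    """Return the index k of a unit [k*P, k*P + L] containing [start, start + x]."""
    # The latest unit that begins at or before `start` also ends the latest,
    # so it is the only candidate that needs to be checked.
    k = int(start // period)
    if start + x <= k * period + preset:
        return k
    return None
```

With X = 2 and N = 2, `choose_durations` returns L = 4 and P = 2, matching the example in the text; shortening L to 3 while keeping P = 2 makes `covering_unit(1.5, 2.0, 3.0, 2.0)` return `None`, because the command would then straddle two units.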

Step S203: Recognize the target command in the voice unit, and respond to the target command.

In practical applications, to make it easier for the smart device to process the target voice, different voice storage carriers can be used to distinguish the voices collected at different voice collection moments. For example, the voice units can be saved in storage spaces, where the duration of voice that a storage space can hold is exactly equal to the duration of a voice unit. A storage space can then store only one voice unit, so that different voice units can be distinguished by their storage spaces. Accordingly, when collecting a voice of the preset duration as the voice unit, a free storage space for storing voice can be selected as the target storage space, and the voices collected from the current moment are all stored in the target storage space until it is full, which yields the voice unit; the duration of the voice that can be stored in the storage space is the preset duration.

In specific application scenarios, the number of existing storage spaces may be limited, and if all of them are occupied, the voice unit cannot be stored. To avoid this, when selecting a free storage space for storing voice as the target storage space, the smart device can determine whether there is a free storage space; if there is none, it creates a storage space and uses it as the target storage space; if there is one, it selects a free storage space as the target storage space.
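The select-or-create rule can be sketched as a small buffer pool. The class names `VoiceBuffer` and `BufferPool` and the sample-count capacity are illustrative assumptions; the application only speaks of "storage spaces" whose capacity equals the preset duration.

```python
from typing import List


class VoiceBuffer:
    """A storage space that holds exactly one voice unit of `capacity` samples."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.samples: List[int] = []
        self.in_use = False

    def is_full(self) -> bool:
        return len(self.samples) >= self.capacity


class BufferPool:
    """Hands out free storage spaces and creates a new one only when needed."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffers: List[VoiceBuffer] = []

    def acquire(self) -> VoiceBuffer:
        """Return a free buffer as the target storage space."""
        for buffer in self.buffers:
            if not buffer.in_use:          # a free storage space exists
                buffer.in_use = True
                return buffer
        buffer = VoiceBuffer(self.capacity)  # none free: create one
        buffer.in_use = True
        self.buffers.append(buffer)
        return buffer

    def release(self, buffer: VoiceBuffer) -> None:
        """Empty the buffer and mark it free so it can be reused."""
        buffer.samples.clear()
        buffer.in_use = False
```

Reusing released buffers keeps the number of storage spaces bounded by the number of units being filled at the same time.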

In specific application scenarios, storage spaces can be used not only to distinguish different voice units but also to process them. In this process, to improve the utilization of storage space and to help the smart device process the voice units accurately, after storing all the voices collected from the current moment in the target storage space until it is full and obtaining the voice unit, the smart device can also store the voice unit in the target storage space into a preset audio queue and release the target storage space. Accordingly, when recognizing the target command in the voice unit, the smart device can acquire one voice unit from the preset audio queue for recognition and delete the acquired voice unit from the preset audio queue. That is, after the smart device obtains a voice unit, it stores the voice unit in the preset audio queue and then releases the target storage space so that the target storage space can store the next voice unit, which reduces the number of storage spaces created and increases their utilization. Moreover, the smart device acquires only one voice unit from the preset audio queue for recognition each time, which avoids recognizing multiple voice units, and hence multiple commands, at once; this prevents recognition errors caused by too many commands being present during recognition and ensures the accuracy with which the smart device recognizes voice.
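The hand-off from storage space to audio queue can be sketched as follows. The class `VoiceUnitQueue` and its method names are hypothetical; the sketch models a storage space as a plain Python list and assumes first-in, first-out order, which the application does not state explicitly.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class VoiceUnitQueue:
    """Preset audio queue that is consumed one voice unit at a time."""

    def __init__(self) -> None:
        self._queue: Deque[List[int]] = deque()

    def __len__(self) -> int:
        return len(self._queue)

    def on_buffer_full(self, buffer: List[int]) -> None:
        """Copy a completed voice unit into the queue, then free the buffer."""
        self._queue.append(list(buffer))  # store the voice unit in the queue
        buffer.clear()                    # release the target storage space

    def recognize_next(
        self, recognize: Callable[[List[int]], Optional[str]]
    ) -> Optional[str]:
        """Take exactly one unit off the queue and run recognition on it."""
        if not self._queue:
            return None
        unit = self._queue.popleft()      # acquiring also deletes it from the queue
        return recognize(unit)
```

Because `recognize_next` removes a single unit per call, the recognizer never sees more than one voice unit at a time, which mirrors the one-unit-per-recognition rule described above.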

The present application also provides a voice control system, which has the corresponding effects of the voice control method provided in the embodiments of the present application. Please refer to FIG. 4, which is a schematic structural diagram of a voice control system provided by an embodiment of the application.

A voice control system provided by an embodiment of the present application, applied to a smart device, may include:

The first collection module 101 is configured to continuously collect voices to obtain target voices when it is determined to perform the voice interaction function;

The first recognition module 102 is configured to recognize the target command in the target voice and respond to the target command.

A voice control system provided by an embodiment of the present application is applied to a smart device, and the target voice may be composed of voice units;

The first collection module may include:

The first judgment sub-module is used to judge whether the current moment belongs to the preset voice collection moment; if the current moment belongs to the voice collection moment, to collect, starting from the current moment, a voice of the preset duration as the voice unit; and if the current moment does not belong to the voice collection moment, to return to the step of judging whether the current moment belongs to the preset voice collection moment.

A voice control system provided by an embodiment of the present application, applied to a smart device, may also include:

The first determining sub-module is used to determine, before the first judgment sub-module judges whether the current moment belongs to the preset voice collection moment, the voice collection moments and the preset duration according to the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command.

A voice control system provided by an embodiment of the present application is applied to a smart device, and the first determining submodule may include:

The first determining unit is configured to determine the voice collection moments and the preset duration according to the duration relationship formula, following the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command;

The duration relationship formula includes:

X≤(N-1)L/N; L=NP;

The voice control system provided by the embodiment of the present application is applied to a smart device, and the first judgment submodule may include:

The first selection sub-module is used to select a free storage space for storing voice as the target storage space;

The first storage sub-module is used to store the voices collected from the current moment in the target storage space until the target storage space is filled to obtain the voice unit;

The duration of the voice that can be stored in the storage space is the preset duration.

The voice control system provided by the embodiment of the present application is applied to a smart device, and the first selection submodule may include:

The first judging unit is used to judge whether there is a free storage space; if there is no free storage space, a storage space is created and used as the target storage space; if there is a free storage space, a free storage space is selected as the target storage space.

The second storage sub-module is used to store the voice unit in the target storage space into the preset audio queue after the first storage sub-module has stored all the voices collected from the current moment in the target storage space until the target storage space is full and obtained the voice unit;

The first release submodule is used to release the target storage space;

The first recognition module may include:

The first acquisition sub-module is used to acquire a voice unit from the preset audio queue for recognition;

The first deletion sub-module is used to delete the selected voice unit from the preset audio queue.

The voice control system provided by the embodiment of the present application is applied to a smart device, and the first recognition module may include:

The first matching unit is configured to match the target voice with the preset grammar, and if the matching is successful, map the preset grammar matching the target voice to the target command.

A voice control system provided by an embodiment of the application is applied to a smart device, and the smart device may include an ultrasound device;

The first recognition module may include:

The first recognition unit is used for recognizing the ultrasonic instruction in the target voice and responding to the ultrasonic instruction.

This application also provides an ultrasound device and a computer-readable storage medium, both of which have the corresponding effects of the voice control method provided in the embodiments of the application. Please refer to FIG. 5, which is a schematic structural diagram of an ultrasound device provided by an embodiment of the application.

An ultrasound device provided by an embodiment of the present application is applied to a smart device and includes a memory 201 and a processor 202. A computer program is stored in the memory 201. When the processor 202 executes the computer program stored in the memory 201, the steps of the voice control method described in any of the above embodiments are implemented.

Referring to FIG. 6, another ultrasound device provided by an embodiment of the present application may further include: an input port 203 connected to the processor 202 and used to transmit externally input commands to the processor 202; a display unit 204 connected to the processor 202 and used to display the processing results of the processor 202 to the outside; and a communication module 205 connected to the processor 202 and used to implement communication between the ultrasound device and the outside. The display unit 204 can be a display panel, a laser scanning display, etc.; the communication modes adopted by the communication module 205 include but are not limited to Mobile High-Definition Link (MHL), universal serial bus (USB), high-definition multimedia interface (HDMI), and wireless connections such as wireless fidelity (WiFi), Bluetooth, Bluetooth Low Energy, and communication technology based on IEEE802.11s.

An embodiment of the present application provides a computer-readable storage medium, which is applied to a smart device, and a computer program is stored in the computer-readable storage medium. When the computer program is executed by a processor, the steps of the voice control method described in any of the above embodiments are implemented.

The computer-readable storage media involved in this application include random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium known in the technical field.

For the relevant parts of the voice control system, device, and computer-readable storage medium provided in the embodiments of the present application, please refer to the detailed description of the corresponding parts of the voice control method provided in the embodiments of the present application; they are not repeated here. In addition, the parts of the foregoing technical solutions whose implementation principles are consistent with those of the corresponding technical solutions in the prior art are not described in detail, so as to avoid redundant description.

It should also be noted that, in this document, relational terms such as "first" and "second" are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.

The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use this application. Various modifications to these embodiments will be obvious to those skilled in the art, and the general principles defined in this document can be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the application will not be limited to the embodiments shown in this document, but should conform to the widest scope consistent with the principles and novel features disclosed in this document.

Claims

A voice control method, characterized in that it is applied to a smart device and comprises:

when it is determined that a voice interaction function is to be performed, continuously collecting voice to obtain a target voice;

recognizing a target command in the target voice, and responding to the target command.
The method according to claim 1, wherein the target voice is composed of voice units;

the continuously collecting voice to obtain the target voice comprises:

determining whether the current moment belongs to a preset voice collection moment;

if the current moment belongs to the voice collection moment, collecting, starting from the current moment, voice of a preset duration as the voice unit;

if the current moment does not belong to the voice collection moment, returning to the step of determining whether the current moment belongs to the preset voice collection moment.
The method according to claim 2, wherein, before the determining whether the current moment belongs to the preset voice collection moment, the method further comprises:

determining the voice collection moment and the preset duration according to the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command.
The method according to claim 3, wherein the determining the voice collection moment and the preset duration according to the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command comprises:

determining the voice collection moment and the preset duration according to a duration relationship formula and the principle that the duration between adjacent voice collection moments is less than the preset duration and the preset duration is greater than or equal to the voice duration of the target command;

the duration relationship formula comprises:

X≤(N-1)L/N; L=NP;

Wherein, X represents the voice duration of the target command; N represents a positive integer greater than 1; L represents the preset duration; P represents the duration between adjacent voice collection moments.
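As an illustration only, the duration relationship can be checked numerically. Because L=NP, the condition X≤(N-1)L/N is equivalent to X≤L-P; that is, the overlap between adjacent voice units is at least as long as the target command, so a command that starts at any moment falls entirely within a single voice unit. The following Python sketch assumes that durations are expressed as integer milliseconds; the function names plan_collection and covering_unit are hypothetical and do not appear in this application.

```python
def plan_collection(command_ms: int, n: int) -> tuple[int, int]:
    """Pick the smallest integer P (and L = N*P) with X <= (N-1)*L/N.

    command_ms is X, n is N; returns (P, L) in milliseconds.
    """
    if n < 2:
        raise ValueError("N must be a positive integer greater than 1")
    if command_ms <= 0:
        raise ValueError("the command duration X must be positive")
    # X <= (N-1)*L/N with L = N*P reduces to X <= (N-1)*P,
    # so P is the ceiling of X / (N-1).
    period_ms = -(-command_ms // (n - 1))
    unit_ms = n * period_ms
    return period_ms, unit_ms


def covering_unit(start_ms: int, command_ms: int, period_ms: int, unit_ms: int) -> int:
    """Return the index k of a voice unit [k*P, k*P + L] that contains the
    whole command [start_ms, start_ms + command_ms]."""
    if command_ms > unit_ms - period_ms:
        raise ValueError("X exceeds the overlap L - P; coverage is not guaranteed")
    # The unit whose collection moment is the latest one not after the start.
    return start_ms // period_ms


period, unit = plan_collection(command_ms=2000, n=3)
print(period, unit)                               # 1000 3000
print(covering_unit(1500, 2000, period, unit))    # 1
```

In this example a 2000 ms command that starts at 1500 ms ends at 3500 ms, which lies inside the voice unit collected from 1000 ms to 4000 ms. A larger N shortens P relative to L, which increases the number of overlapping voice units that are being collected at the same time.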
The method according to claim 2, wherein the collecting, starting from the current moment, voice of a preset duration as the voice unit comprises:

selecting a free storage space for storing voice as a target storage space;

storing all voice collected from the current moment in the target storage space until the target storage space is full, to obtain the voice unit;

wherein the duration of voice that the storage space can store is the preset duration.
The method according to claim 5, wherein the selecting a free storage space for storing voice as a target storage space comprises:

determining whether a free storage space exists;

if no free storage space exists, creating a storage space and using it as the target storage space;

if a free storage space exists, selecting one free storage space as the target storage space.
The method according to claim 5, wherein, after the storing all voice collected from the current moment in the target storage space until the target storage space is full, to obtain the voice unit, the method further comprises:

storing the voice unit in the target storage space into a preset audio queue;

releasing the target storage space;

the recognizing the target command in the target voice comprises:

acquiring one voice unit from the preset audio queue for command recognition;

and deleting the acquired voice unit from the preset audio queue.
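The storage-space and audio-queue handling recited in claims 5 to 7 can be sketched as follows. This is a minimal, single-threaded illustration under stated assumptions, not the implementation of this application: each sample is assumed to be one byte, the preset duration is expressed as a sample count, and the names VoiceUnitCollector, collect_unit, and next_unit are hypothetical. Overlapping collection of several voice units at once is not modeled.

```python
from collections import deque
from itertools import islice


class VoiceUnitCollector:
    """Fills fixed-size storage spaces with samples and queues the results."""

    def __init__(self, unit_size: int):
        self.unit_size = unit_size    # samples per voice unit (preset duration)
        self.free_buffers = []        # released storage spaces, ready for reuse
        self.audio_queue = deque()    # preset audio queue of finished voice units
        self.buffers_created = 0

    def _select_buffer(self) -> bytearray:
        # Reuse a free storage space if one exists, otherwise create one.
        if self.free_buffers:
            return self.free_buffers.pop()
        self.buffers_created += 1
        return bytearray()

    def collect_unit(self, sample_source) -> bool:
        """Read samples until the storage space is full, then queue the unit.

        Returns False if the source ran out before the space was full.
        """
        buffer = self._select_buffer()
        buffer.extend(islice(sample_source, self.unit_size))
        full = len(buffer) == self.unit_size
        if full:
            # Copy the voice unit into the audio queue.
            self.audio_queue.append(bytes(buffer))
        # Release the storage space so that a later unit can reuse it.
        buffer.clear()
        self.free_buffers.append(buffer)
        return full

    def next_unit(self):
        """Take one voice unit for recognition and delete it from the queue."""
        return self.audio_queue.popleft() if self.audio_queue else None


collector = VoiceUnitCollector(unit_size=4)
samples = iter(range(10))             # stand-in for a microphone stream
while collector.collect_unit(samples):
    pass
print(collector.next_unit())          # b'\x00\x01\x02\x03'
```

Copying the voice unit into the queue before releasing the storage space lets collection and recognition proceed independently, and reusing released storage spaces avoids allocating a new one for every voice unit.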
The method according to any one of claims 1 to 7, wherein the recognizing the target command in the target voice comprises:

matching the target voice with a preset grammar, and if the matching is successful, mapping the preset grammar that matches the target voice to the target command.
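One possible way to picture the matching and mapping steps is shown below. The sketch assumes that the target voice has already been converted to text by a speech recognizer, which this claim does not state, and it represents each preset grammar as a regular expression. The grammar patterns and the command names FREEZE_IMAGE and SAVE_IMAGE are hypothetical examples and are not taken from this application.

```python
import re

# Hypothetical preset grammars, each paired with the command it maps to.
PRESET_GRAMMARS = [
    (re.compile(r"\b(freeze|pause)( the)? image\b"), "FREEZE_IMAGE"),
    (re.compile(r"\bsave( the)? image\b"), "SAVE_IMAGE"),
]


def recognize_command(transcript: str):
    """Return the command mapped to the first matching grammar, or None."""
    text = transcript.lower()
    for pattern, command in PRESET_GRAMMARS:
        if pattern.search(text):
            return command
    # No grammar matched, so the voice unit carries no target command.
    return None


print(recognize_command("Please freeze the image now"))   # FREEZE_IMAGE
print(recognize_command("What time is it"))               # None
```

Restricting recognition to a small set of preset grammars means that speech that matches no grammar is simply ignored, which suits continuous collection where most of the collected voice is not a command.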
The method according to claim 8, wherein the smart device comprises an ultrasound device;

The recognizing the target command in the target voice and responding to the target command includes:

Recognizing the ultrasonic instruction in the target voice, and responding to the ultrasonic instruction.
A voice control system, characterized in that it is applied to a smart device and comprises:

a first collection module, used to continuously collect voice to obtain a target voice when it is determined that a voice interaction function is to be performed;

a first recognition module, used to recognize a target command in the target voice and respond to the target command.
An ultrasonic device, characterized in that it comprises:

Memory, used to store computer programs;

The processor is configured to implement the steps of the voice control method according to any one of claims 1 to 9 when executing the computer program.
A computer-readable storage medium, characterized in that it is applied to a smart device, and a computer program is stored in the computer-readable storage medium; when the computer program is executed by a processor, the steps of the voice control method according to any one of claims 1 to 9 are implemented.