CN114627865A

CN114627865A - Voice instruction execution control method and device, terminal equipment and storage medium

Info

Publication number: CN114627865A
Application number: CN202011435608.9A
Authority: CN
Inventors: 林嘉明
Original assignee: Shenzhen TCL New Technology Co Ltd
Current assignee: Shenzhen TCL New Technology Co Ltd
Priority date: 2020-12-10
Filing date: 2020-12-10
Publication date: 2022-06-14

Abstract

The invention discloses a voice instruction execution control method, a voice instruction execution control device, terminal equipment and a storage medium, wherein the method comprises the following steps: acquiring voice operation request information generated based on a voice instruction; according to the voice operation request information, determining an operation instruction corresponding to the voice operation request information; and executing the operation corresponding to the operation instruction according to the operation instruction, and receiving feedback information responding to the operation instruction in real time. The invention can determine the operation instruction based on the voice operation request information, then execute the corresponding operation based on the operation instruction, and can receive the feedback information responding to the operation instruction in real time, thereby timely acquiring the state of the operation instruction in the executed process according to the feedback information and providing convenience for the use of a user.

Description

Voice instruction execution control method and device, terminal equipment and storage medium

Technical Field

The present invention relates to the field of voice instruction execution technologies, and in particular, to a method and an apparatus for controlling voice instruction execution, a terminal device, and a storage medium.

Background

As natural language processing technology flourishes, voice interaction technology is becoming increasingly mature. Voice assistants are widely used in IoT (Internet of Things) devices such as mobile phones, televisions and computers, and cover fields such as film and television, weather queries, device control, and shopping. However, across the many fields that voice interaction touches, the prior art requires the user to repeatedly confirm whether a voice instruction has been executed while it is being carried out, which is time-consuming, laborious, and inconvenient for the user.

Thus, there is a need for improvements and enhancements in the art.

Disclosure of Invention

The present invention provides a method, an apparatus, a terminal device and a storage medium for controlling execution of a voice command, aiming to solve the problems of time and labor consumption and inconvenience for a user, which are caused by the need of repeatedly confirming whether a voice command is executed during the execution of the voice command in the prior art.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows:

in a first aspect, the present invention provides a method for controlling execution of a voice command, wherein the method includes:

acquiring voice operation request information generated based on a voice instruction;

according to the voice operation request information, determining an operation instruction corresponding to the voice operation request information;

and executing the operation corresponding to the operation instruction according to the operation instruction.

In one implementation, the acquiring voice operation request information generated based on a voice instruction includes:

collecting the voice instruction in a preset range in real time;

acquiring sound characteristics in the voice instruction, wherein the sound characteristics comprise: voiceprint features, loudness identification, pitch identification and timbre identification;

and determining the voice operation request information according to the sound characteristics.

In one implementation manner, determining, according to the voice operation request information, an operation instruction corresponding to the voice operation request information includes:

analyzing the voice operation request information to obtain voice information corresponding to the voice instruction in the voice operation request information;

and converting the voice information into text information, and determining an operation instruction corresponding to the voice operation request information according to the text information.

In one implementation manner, the determining, according to the text information, an operation instruction corresponding to the voice operation request information includes:

analyzing the text information to obtain field information in the text information;

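and determining an operation instruction corresponding to the field information according to the field information.

In one implementation manner, the determining, according to the field information, an operation instruction corresponding to the field information includes:

rewriting the field information into specified field information;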

and determining a specified operation instruction corresponding to the specified field information according to the specified field information, and taking the specified operation instruction as the operation instruction.

In one implementation manner, the executing the operation corresponding to the operation instruction according to the operation instruction, and receiving feedback information responding to the operation instruction in real time includes:

freezing the specified food according to the freezing instruction, and acquiring the image information of the specified food in real time;

according to the image information, determining the freezing state of the specified food, and taking the freezing state as the feedback information;

and outputting voice prompt information according to the freezing state.

In one implementation, the determining a freezing status of the designated food from the image information includes:

matching the image information with pre-stored frozen image information of the specified food which is frozen to obtain a matching result;

and determining the freezing state of the specified food according to the matching result.

In a second aspect, an embodiment of the present invention further provides a device for controlling execution of a voice instruction, where the device includes:

the request information acquisition module is used for acquiring voice operation request information generated based on the voice instruction;

the operation execution determining module is used for determining an operation instruction corresponding to the voice operation request information according to the voice operation request information;

and the operation instruction execution module is used for executing the operation corresponding to the operation instruction according to the operation instruction and receiving the feedback information responding to the operation instruction in real time.

In a third aspect, an embodiment of the present invention further provides a terminal device, where the terminal device includes a memory, a processor, and a voice instruction execution control program that is stored in the memory and is executable on the processor, and when the processor executes the voice instruction execution control program, the step of implementing the voice instruction execution control method in any one of the foregoing schemes is implemented.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a voice instruction execution control program is stored on the computer-readable storage medium, and when the voice instruction execution control program is executed by a processor, the steps of the voice instruction execution control method in any one of the above schemes are implemented.

Beneficial effects: compared with the prior art, the invention provides a voice instruction execution control method. The method first acquires voice operation request information generated based on a voice instruction. It then determines, according to the voice operation request information, an operation instruction corresponding to the voice operation request information. Finally, it executes the operation corresponding to the operation instruction. The invention can determine the operation instruction based on the voice operation request information, execute the corresponding operation based on the operation instruction, and receive feedback information responding to the operation instruction in real time, so that the state of the operation instruction during execution can be obtained promptly from the feedback information, which makes the device more convenient to use.

Drawings

Fig. 1 is a flowchart of a specific implementation of a voice command execution control method according to an embodiment of the present invention.

Fig. 2 is a flowchart of acquiring request information in a voice instruction execution control method according to an embodiment of the present invention.

Fig. 3 is a flowchart of determining an operation instruction in the voice instruction execution control method according to the embodiment of the present invention.

Fig. 4 is a flowchart of executing an operation instruction in the voice instruction execution control method according to the embodiment of the present invention.

Fig. 5 is a schematic block diagram of a voice instruction execution control apparatus according to an embodiment of the present invention.

Fig. 6 is a schematic block diagram of an internal structure of a terminal device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and effects of the present invention clearer and more definite, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

As natural language processing technology flourishes, voice interaction technology is becoming increasingly mature. Voice assistants are widely used in IoT (Internet of Things) devices such as mobile phones, televisions and computers, and cover fields such as film and television, weather queries, device control, and shopping. However, across the many fields that voice interaction touches, the prior art requires the user to repeatedly confirm whether a voice instruction has been executed while it is being carried out, which is time-consuming, laborious, and inconvenient for the user. For example, intelligent terminals have become part of daily life in recent years, their development is still ongoing, and intelligent appliances make life more convenient. In daily life, people often want to make frozen foods such as ice cream, bean jelly and jelly. A user can send a voice instruction to an intelligent refrigerator so that it freezes the food automatically, but the user still has to keep track of the preparation time and repeatedly open the refrigerator to check whether the food is completely frozen, which is time-consuming and laborious.

In order to solve the problems in the prior art, the present embodiment provides a method for controlling execution of a voice instruction. The method can promptly obtain the state of the operation instruction during execution from the feedback information, which makes the device more convenient to use. In specific implementation, the embodiment first obtains the voice operation request information generated based on the voice instruction. It then determines, according to the voice operation request information, an operation instruction corresponding to the voice operation request information. Finally, it executes the operation corresponding to the operation instruction.

For example, the method of this embodiment can be applied to a smart home scenario. When the intelligent refrigerator needs to be controlled to start freezing food, the user issues a voice instruction for freezing food to the intelligent refrigerator. The intelligent refrigerator obtains the voice operation request information generated from the voice instruction and then derives the operation instruction from it, that is, to start freezing the food. While the operation instruction is being executed, the intelligent refrigerator receives feedback information responding to the operation instruction in real time and thereby knows its own state during the freezing process, so that the user can learn the freezing state of the food directly, which makes the device more convenient to use.

Exemplary method

The voice instruction execution control method in the embodiment can be applied to terminal equipment, such as intelligent refrigerators, intelligent air conditioners and other equipment. Specifically, as shown in fig. 1, the method of the present embodiment includes the steps of:

Step S100, acquiring voice operation request information generated based on the voice instruction.

In this embodiment, when a user wants to complete an operation, the user sends a voice instruction, that is, voice information spoken by the user, to the intelligent terminal. After obtaining the voice instruction, the intelligent terminal generates voice operation request information from it; the voice operation request information is request information reflecting that the user wants the voice instruction to be executed. For example, when the user wants to watch Swordsmen, the user sends the voice instruction "play Swordsmen" to the smart television, and after receiving it, the smart television generates voice operation request information indicating that the user wants to play Swordsmen. As another example, when the user wants to freeze ice lollies, the user sends the voice instruction "freeze the food" to the intelligent refrigerator, which then generates voice operation request information indicating that the user wants to freeze food.

In one implementation, as shown in fig. 2, the step S100 specifically includes:

Step S101, collecting a voice instruction within a preset range in real time;

Step S102, acquiring sound characteristics in the voice instruction, wherein the sound characteristics comprise: voiceprint features, loudness identification, pitch identification and timbre identification;

Step S103, determining the voice operation request information according to the sound characteristics.

In specific implementation, this embodiment collects voice instructions within a preset range in real time, a voice instruction being voice information uttered by the user. For example, the intelligent refrigerator can collect voice instructions within a range of 5 meters. After a voice instruction is collected, the embodiment acquires the sound characteristics in it: the voiceprint features, the loudness identification, the pitch identification and the timbre identification. The sound characteristics are used to analyze the voice instruction and judge whether it was issued by a preset user. For the convenience of the user, a corresponding preset user is first set for each terminal device, and only the preset user can trigger that terminal device. For example, if the preset user corresponding to the intelligent refrigerator is user A, only voice instructions issued by user A are recognized and responded to by the intelligent refrigerator; that is, only user A can control the intelligent refrigerator by voice. Therefore, after the voice instruction is collected, its voiceprint features, loudness identification, pitch identification and timbre identification are analyzed. The voiceprint features are used to analyze the matching degree between the voiceprint in the received voice instruction and the voiceprint of the preset user; a matching degree higher than 80% indicates that the received voice instruction was issued by the preset user. Similarly, the loudness identification is used to analyze whether the loudness of the received voice instruction matches the loudness of the preset user's voice. Likewise, the pitch identification and the timbre identification are used to analyze the matching degree between the pitch and timbre of the received voice instruction and those of the preset user; a matching degree higher than 80% indicates that the received voice instruction was issued by the preset user. After the voice instruction has been analyzed through the sound characteristics, it can be judged whether the received voice instruction was issued by the preset user, and if so, the voice operation request information generated based on the voice instruction can be acquired.
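Steps S101 to S103 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that each sound characteristic has already been reduced to a numeric vector, it uses cosine similarity as a stand-in for the unspecified "matching degree", and it applies the 80% threshold to all four characteristics, although the text only states a threshold for voiceprint, pitch and timbre. The names `SoundCharacteristics`, `matching_degree`, `is_preset_user` and `build_request` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MATCH_THRESHOLD = 0.80  # the "higher than 80%" matching degree from the text


@dataclass(frozen=True)
class SoundCharacteristics:
    """Hypothetical container: each characteristic is a numeric vector."""
    voiceprint: Sequence[float]
    loudness: Sequence[float]
    pitch: Sequence[float]
    timbre: Sequence[float]


def matching_degree(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity clipped to [0, 1]; an assumed stand-in metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return max(0.0, dot / norm) if norm else 0.0


def is_preset_user(received: SoundCharacteristics,
                   enrolled: SoundCharacteristics) -> bool:
    """True only if every characteristic matches above the threshold."""
    pairs = [
        (received.voiceprint, enrolled.voiceprint),
        (received.loudness, enrolled.loudness),
        (received.pitch, enrolled.pitch),
        (received.timbre, enrolled.timbre),
    ]
    return all(matching_degree(r, e) > MATCH_THRESHOLD for r, e in pairs)


def build_request(voice_instruction: str,
                  received: SoundCharacteristics,
                  enrolled: SoundCharacteristics) -> Optional[dict]:
    """Step S103: produce request information only for the preset user."""
    if not is_preset_user(received, enrolled):
        return None
    return {"voice_instruction": voice_instruction}
```

In this sketch a voice instruction from anyone other than the enrolled user yields `None`, so no operation instruction is ever derived from it.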

Step S200, determining an operation instruction corresponding to the voice operation request information according to the voice operation request information.

In specific implementation, after receiving the voice operation request information, this embodiment analyzes it and then determines the corresponding operation instruction. For example, the intelligent terminal may include a recording function. When the user says "play Swordsmen", the intelligent terminal captures this voice information with the recording function; this voice information is the voice instruction. After obtaining the voice instruction "play Swordsmen", the intelligent terminal generates the corresponding voice operation request information, which contains that voice instruction. Therefore, once the intelligent terminal has determined the voice information from the voice operation request information, it can obtain the corresponding operation instruction, namely "play Swordsmen".

In one implementation, as shown in fig. 3, the step S200 specifically includes the following steps:

Step S201, analyzing the voice operation request information to obtain voice information corresponding to the voice instruction in the voice operation request information;

Step S202, converting the voice information into text information, and determining an operation instruction corresponding to the voice operation request information according to the text information.

In order to determine the operation instruction corresponding to the voice operation request information, after the voice operation request information is obtained, this embodiment analyzes it to obtain the voice information corresponding to the voice instruction, converts the voice information into text information, and determines the operation instruction corresponding to the voice operation request information according to the text information. In one implementation, the voice information may be converted into text information by voice recognition or voice translation. After the text information is obtained, the embodiment analyzes it to obtain the field information it contains; the field information consists of certain characters or words in the text information. The text information is the sentence or words recognized from the voice information, and some of those words carry no meaning for execution. For example, suppose the recognized text information is "please play the hot Conutting Transmission". In this text, only two pieces of field information, "play" and "Conutting Transmission", matter for executing the voice instruction, and the intelligent terminal can determine the user's intention from these two alone. The words "please" and "hot" have no bearing on the execution of the voice instruction. Therefore, when determining the field information from the text information, the embodiment needs to filter out useless or insubstantial fields so that the user's intention can be determined more accurately.
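The filtering described above can be sketched as follows. This is only an illustration under stated assumptions: the speech-to-text step is taken as already done, the filler list `FILLER_WORDS` and the phrase list `KNOWN_PHRASES` are hypothetical and would in practice come from some vocabulary the text does not specify, and the English example sentence stands in for the original utterance.

```python
import re
from typing import List

# Assumed stop list: words that carry no meaning for execution.
FILLER_WORDS = {"please", "the", "hot"}

# Assumed vocabulary of multi-word phrases that must stay together.
KNOWN_PHRASES = ["conutting transmission"]


def extract_fields(text: str) -> List[str]:
    """Return the field information left after dropping filler words."""
    lowered = text.lower()
    # Protect multi-word phrases by joining them with underscores first.
    for phrase in KNOWN_PHRASES:
        lowered = lowered.replace(phrase, phrase.replace(" ", "_"))
    tokens = re.findall(r"[a-z_']+", lowered)
    return [t.replace("_", " ") for t in tokens if t not in FILLER_WORDS]


fields = extract_fields("Please play the hot Conutting Transmission")
# fields == ["play", "conutting transmission"]
```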

After the field information is determined, this embodiment determines the operation instruction corresponding to it. The field information is obtained from the text information, and the text information is obtained from the voice information corresponding to the voice instruction, so the field information reflects the user's intention. For example, with the field information "play" and "Conutting Transmission" from the above example, the user's intention is for the smart television to play the drama Conutting Transmission. The user's intention in turn reflects the type of operation event. In one implementation, a mapping file may be preset that records the correspondence between field information and operation instructions. After the field information is obtained, it is matched against the mapping file to obtain the corresponding operation instruction. In the above example, "play" in the field information is matched to the operation instruction "play event", and "Conutting Transmission" is matched to the type "movie and television drama"; combining the play event with the drama gives the final operation instruction: play the movie and television drama. As another example, when the field information extracted from the voice information is "freeze" and "jelly", the operation instruction can be determined to be: make frozen jelly.
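The mapping-file lookup can be sketched as below. The JSON layout, the keys `actions` and `objects`, and the function `determine_operation` are all assumptions made for illustration; the text says only that the mapping file records a correspondence between field information and operation instructions.

```python
import json
from typing import List, Optional

# Hypothetical mapping file content, inlined here so the sketch is runnable.
MAPPING_JSON = """
{
  "actions": {"play": "play event", "freeze": "freeze event"},
  "objects": {"conutting transmission": "movie and television drama",
              "jelly": "food"}
}
"""


def determine_operation(fields: List[str], mapping: dict) -> Optional[dict]:
    """Match field information against the mapping to build an instruction."""
    action = next(
        (mapping["actions"][f] for f in fields if f in mapping["actions"]),
        None,
    )
    if action is None:
        return None  # no recognizable action word, so nothing to execute
    target = next((f for f in fields if f in mapping["objects"]), None)
    return {
        "action": action,
        "target": target,
        "target_type": mapping["objects"].get(target),
    }


mapping = json.loads(MAPPING_JSON)
operation = determine_operation(["play", "conutting transmission"], mapping)
```

Keeping the correspondence in a data file rather than in code means new phrases can be supported by editing the file, which fits the customization the text goes on to describe.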

In addition, to cover more application scenarios, this embodiment may further rewrite the field information, that is, customize it, so that the operation instruction can be customized as well. In specific implementation, the field information obtained from the voice information is rewritten into specified field information, namely the intention information the user intends. For example, in the above example, the field information obtained from the voice information is "play" and "Conutting Transmission". To meet the user's viewing needs, this embodiment may adaptively rewrite the determined field information, that is, rewrite one or more pieces of field information into information associated with the original field information (the specified field information), then determine a specified operation instruction according to the specified field information, and take the specified operation instruction as the operation instruction. For example, "Conutting Transmission" may be rewritten into "Exemplary Transmission": on the network, these two dramas share the most associated entries, so the field information is changed into the specified field information "Exemplary Transmission", and the resulting operation instruction is to play the drama Exemplary Transmission, so that the user can still watch normally. As another example, this embodiment may also forcibly rewrite the field information. When the intelligent refrigerator recognizes that the field information includes "I want to freeze", "help me freeze" or "do frozen food", each is forcibly rewritten into the specified field information "make frozen food", so that the corresponding operation instruction is determined to be freezing food.
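Both kinds of rewriting amount to a table lookup, as the following sketch shows. The two tables and the function `rewrite_fields` are hypothetical; in particular, how the association between two dramas would be computed from network entries is not described in the text and is not modeled here.

```python
from typing import Dict, List

# Adaptive rewriting: replace a field with an associated one (assumed table).
ASSOCIATED_REWRITES: Dict[str, str] = {
    "conutting transmission": "exemplary transmission",
}

# Forced rewriting: several phrasings collapse into one specified field.
FORCED_REWRITES: Dict[str, str] = {
    "i want to freeze": "make frozen food",
    "help me freeze": "make frozen food",
    "do frozen food": "make frozen food",
}


def rewrite_fields(fields: List[str], table: Dict[str, str]) -> List[str]:
    """Rewrite each field found in the table; leave the others unchanged."""
    return [table.get(field.lower(), field) for field in fields]


adapted = rewrite_fields(["play", "conutting transmission"], ASSOCIATED_REWRITES)
forced = rewrite_fields(["Help me freeze"], FORCED_REWRITES)
```

The rewritten fields would then be passed to the same instruction lookup as unmodified fields, so the specified operation instruction is obtained without any change to the later steps.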

Step S300, executing the operation corresponding to the operation instruction according to the operation instruction.

After the operation instruction is determined, the embodiment may execute an operation corresponding to the operation instruction according to the operation instruction. In order to timely know the execution condition of the operation instruction, the present embodiment may obtain feedback information in response to the operation instruction after the operation instruction is executed, where the feedback information is used to reflect a state of the terminal device after the operation instruction is executed, so as to timely know details of the execution of the operation instruction.

In one implementation, as shown in fig. 4, this embodiment takes a freezing instruction as an example. If the operation instruction is a freezing instruction, step S300 specifically includes:

Step S301, freezing the specified food according to the freezing instruction, and acquiring image information of the specified food in real time;

Step S302, determining the freezing state of the specified food according to the image information, and taking the freezing state as the feedback information;

Step S303, outputting voice prompt information according to the freezing state.

In specific implementation, after the intelligent refrigerator receives the freezing instruction, it freezes the specified food accordingly. To keep track of how the freezing instruction is progressing, this embodiment acquires image information of the specified food in real time while the intelligent refrigerator executes the freezing instruction, determines the freezing state of the specified food according to the image information, and takes the freezing state as the feedback information. Specifically, the collected image information of the food is compared with pre-stored frozen image information showing the specified food in its fully frozen state, which was captured and stored in advance using a plurality of cameras. The comparison yields a matching result, namely whether the similarity between the image information of the specified food and the pre-stored frozen image information exceeds 70%. The freezing state, that is, whether the specified food has finished freezing, is determined from this result: if the similarity exceeds 70%, the specified food is determined to be frozen.
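The image comparison can be illustrated with a deliberately simple sketch. The similarity measure here (one minus the mean absolute pixel difference, on flat lists of 8-bit grayscale values) is an assumption chosen for brevity; the text does not say how similarity is computed, and a real system would likely use a more robust image-matching method. The names `similarity` and `freezing_state` are hypothetical.

```python
from typing import List, Sequence

SIMILARITY_THRESHOLD = 0.70  # the "more than 70%" similarity from the text


def similarity(image: Sequence[int], reference: Sequence[int]) -> float:
    """Assumed metric: 1 minus the mean absolute difference of 8-bit pixels."""
    if not image or len(image) != len(reference):
        raise ValueError("images must be non-empty and the same size")
    mean_abs_diff = sum(abs(a - b) for a, b in zip(image, reference)) / len(image)
    return 1.0 - mean_abs_diff / 255.0


def freezing_state(image: Sequence[int],
                   frozen_references: List[Sequence[int]]) -> str:
    """Compare against every pre-stored frozen image and keep the best match."""
    best = max(similarity(image, ref) for ref in frozen_references)
    return "frozen" if best > SIMILARITY_THRESHOLD else "not frozen"


# Toy 2x2 "images" flattened to four pixels each.
references = [[200, 210, 220, 230]]
state_now = freezing_state([198, 212, 219, 229], references)
state_early = freezing_state([20, 30, 40, 50], references)
```

Taking the best match over several reference images reflects the statement that the frozen images are captured with a plurality of cameras, so any one viewpoint may match.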

Once the freezing state is obtained, voice prompt information is output according to the freezing state of the specified food. For example, when the freezing state of the specified food is "freezing complete", a voice prompt that freezing is complete is output, and when the freezing state is "freezing not complete", a voice prompt that freezing is not yet complete is output. Of course, the operation instruction in this embodiment may be of another kind. For example, if the operation instruction is to play a TV drama, video information such as the play duration can be received from the terminal device while the drama is playing. That is, whatever the operation instruction and whatever the scenario, this embodiment can receive feedback information once the terminal device has received the operation instruction and responded to it.
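The prompt step for the freezing example can be sketched as a small loop over observed states. The prompt wording, the state strings and the function `report_freezing` are assumptions; `speak` stands in for whatever text-to-speech output the terminal device provides.

```python
from typing import Callable, Iterable

# Assumed prompt wording for each freezing state.
PROMPTS = {
    "frozen": "Freezing is complete.",
    "not frozen": "Freezing is not yet complete.",
}


def report_freezing(states: Iterable[str], speak: Callable[[str], None]) -> bool:
    """Speak one prompt per observed state; stop once the food is frozen."""
    for state in states:
        speak(PROMPTS[state])
        if state == "frozen":
            return True
    return False


spoken = []
done = report_freezing(["not frozen", "not frozen", "frozen"], spoken.append)
```

Passing `speak` in as a parameter keeps the sketch independent of any particular audio library, and the same pattern would apply to other feedback such as play duration.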

In summary, the present embodiment first obtains the voice operation request information generated based on the voice instruction. It then determines, according to the voice operation request information, an operation instruction corresponding to the voice operation request information. Finally, it executes the operation corresponding to the operation instruction. The embodiment can determine the operation instruction based on the voice operation request information, execute the corresponding operation based on the operation instruction, and receive the feedback information responding to the operation instruction in real time, so that the state of the operation instruction during execution can be obtained promptly from the feedback information, which makes the device more convenient to use.

Exemplary device

As shown in fig. 5, an embodiment of the present invention provides a voice instruction execution control apparatus, including: a request information acquisition module 10, an operation execution determination module 20, and an operation instruction execution module 30. Specifically, the request information obtaining module 10 is configured to obtain voice operation request information generated based on a voice instruction. The operation execution determining module 20 is configured to determine, according to the voice operation request information, an operation instruction corresponding to the voice operation request information. The operation instruction execution module 30 is configured to execute an operation corresponding to the operation instruction according to the operation instruction.
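The three-module structure can be sketched as one class that chains three callables, one per module. This is an illustrative composition only; the class name `VoiceInstructionController` and the callable signatures are assumptions, and each callable stands in for the corresponding module 10, 20 or 30.

```python
from typing import Any, Callable, Optional


class VoiceInstructionController:
    """Chains the three modules: acquire (10), determine (20), execute (30)."""

    def __init__(self,
                 acquire: Callable[[str], Optional[dict]],
                 determine: Callable[[dict], Optional[Any]],
                 execute: Callable[[Any], Any]) -> None:
        self.acquire = acquire
        self.determine = determine
        self.execute = execute

    def handle(self, voice_instruction: str) -> Optional[Any]:
        """Run one voice instruction through all three modules in order."""
        request = self.acquire(voice_instruction)
        if request is None:
            return None  # e.g. the speaker is not the preset user
        operation = self.determine(request)
        if operation is None:
            return None  # no operation instruction matches the request
        return self.execute(operation)


controller = VoiceInstructionController(
    acquire=lambda v: {"voice_instruction": v} if v else None,
    determine=lambda r: "freeze" if "freeze" in r["voice_instruction"] else None,
    execute=lambda op: f"{op}: started",
)
result = controller.handle("freeze the food")
```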

In one implementation, the request information obtaining module 10 includes:

the voice instruction acquisition unit is used for acquiring a voice instruction within a preset range in real time;

a sound characteristic acquiring unit, configured to acquire a sound characteristic in the voice instruction, where the sound characteristic includes: voiceprint features, loudness identification, pitch identification and timbre identification;

and the operation request determining unit is used for determining the voice operation request information according to the sound characteristics.

In one implementation, the operation execution determination module 20 includes:

a voice information determining unit, configured to analyze the voice operation request information to obtain voice information corresponding to the voice instruction in the voice operation request information;

and an operation instruction determining unit, configured to convert the voice information into text information and to determine, according to the text information, the operation instruction corresponding to the voice operation request information.
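The two steps performed by the operation instruction determining unit, converting voice information to text and then mapping the text to an operation instruction, can be sketched as below. This is a simplified illustration under stated assumptions: speech recognition is replaced by a stand-in that decodes UTF-8 bytes, and the mapping is a plain keyword lookup. The keyword table, the instruction names and both function names are hypothetical.

```python
import re
from typing import Optional

# Hypothetical keyword table; only a freezing instruction is named in the text,
# the "stop" entry is added purely for illustration.
KEYWORD_TO_INSTRUCTION = {
    "freeze": "freezing_instruction",
    "stop": "stop_instruction",
}


def speech_to_text(voice_information: bytes) -> str:
    """Stand-in for speech recognition: treats the payload as UTF-8 text."""
    return voice_information.decode("utf-8")


def determine_operation_instruction(text: str) -> Optional[str]:
    """Return the instruction for the first known keyword in the text, if any."""
    words = re.findall(r"[a-z]+", text.lower())
    for word in words:
        if word in KEYWORD_TO_INSTRUCTION:
            return KEYWORD_TO_INSTRUCTION[word]
    return None
```

A real system would replace `speech_to_text` with an actual recognizer; the lookup step is kept separate so that it can be exercised with plain text.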

In one implementation, in which the operation instruction is a freezing instruction, the operation instruction execution module 30 includes:

an image information acquisition unit, configured to freeze the specified food according to the freezing instruction and to acquire image information of the specified food in real time;

a feedback information determination unit, configured to determine a freezing state of the specified food according to the image information and to take the freezing state as the feedback information;

and a prompt information output unit, configured to output voice prompt information according to the freezing state.
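The cooperation of the three units above can be illustrated with a short feedback loop. How the freezing state is derived from the image information is not specified here, so the sketch assumes that each image has already been reduced to a single number, the fraction of the food surface that appears frosted; the thresholds, state names, prompt wording and function names are likewise hypothetical.

```python
from typing import Callable, Iterable, Optional


def classify_freezing_state(frost_ratio: float) -> str:
    """Map a frost ratio in [0, 1] to a freezing state (hypothetical thresholds)."""
    if frost_ratio >= 0.8:
        return "frozen"
    if frost_ratio >= 0.3:
        return "partially frozen"
    return "not frozen"


def monitor_freezing(frames: Iterable[float], speak: Callable[[str], None]) -> str:
    """Consume per-image frost ratios and emit a prompt whenever the state changes."""
    last_state: Optional[str] = None
    for frost_ratio in frames:
        state = classify_freezing_state(frost_ratio)  # feedback information
        if state != last_state:
            speak(f"The food is {state}.")  # stand-in for a voice prompt
            last_state = state
        if state == "frozen":
            break  # freezing finished, stop monitoring
    return last_state or "unknown"
```

Passing `speak` as a callable keeps the sketch independent of any particular text-to-speech interface; in a test it can simply be the `append` method of a list.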

Based on the above embodiments, the present invention further provides a terminal device, a schematic block diagram of which may be as shown in fig. 6. The terminal device comprises a processor, a memory, a network interface, a display screen and a temperature sensor, which are connected through a system bus. The processor of the terminal device is configured to provide computing and control capabilities. The memory of the terminal device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the terminal device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements a voice instruction execution control method. The display screen of the terminal device may be a liquid crystal display screen or an electronic ink display screen, and the temperature sensor of the terminal device is arranged inside the terminal device in advance and is used for detecting the operating temperature of the internal components.

It will be understood by those skilled in the art that the block diagram of fig. 6 shows only the part of the structure that is related to the solution of the present invention and does not limit the terminal device to which the solution is applied; a specific terminal device may include more or fewer components than those shown in the figure, may combine some components, or may have a different arrangement of components.

In one embodiment, a terminal device is provided, where the terminal device includes a memory, a processor, and a voice instruction execution control program stored in the memory and executable on the processor, and when the processor executes the voice instruction execution control program, the following steps are implemented:

acquiring voice operation request information generated based on a voice instruction;

according to the voice operation request information, determining an operation instruction corresponding to the voice operation request information;

and executing the operation corresponding to the operation instruction according to the operation instruction, and receiving feedback information responding to the operation instruction in real time.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, storage, databases, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).

In summary, the present invention discloses a method, an apparatus, a terminal device and a storage medium for controlling execution of a voice instruction, wherein the method comprises: acquiring voice operation request information generated based on a voice instruction; according to the voice operation request information, determining an operation instruction corresponding to the voice operation request information; and executing the operation corresponding to the operation instruction according to the operation instruction, and receiving feedback information responding to the operation instruction in real time. The invention can determine the operation instruction based on the voice operation request information, then execute the corresponding operation based on the operation instruction, and can receive the feedback information responding to the operation instruction in real time, thereby timely acquiring the state of the operation instruction in the executed process according to the feedback information and providing convenience for the use of a user.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A voice instruction execution control method, the method comprising:

2. The voice instruction execution control method according to claim 1, wherein the acquiring voice operation request information generated based on the voice instruction includes:

collecting the voice instruction in a preset range in real time;

3. The method of claim 1, wherein determining the operation command corresponding to the voice operation request information according to the voice operation request information comprises:

4. The method according to claim 3, wherein the determining an operation instruction corresponding to the voice operation request instruction based on the text information includes:

5. The method according to claim 3, wherein the determining an operation instruction corresponding to the voice operation request instruction based on the text information includes:

rewriting the field information into specified field information;

6. The method of claim 1, wherein the operation command is a freezing command, and the performing an operation corresponding to the operation command according to the operation command and receiving feedback information in real time in response to the operation command comprises:

and outputting voice prompt information according to the freezing state.

7. The voice instruction execution control method according to claim 6, wherein the determining a freezing state of the specified food based on the image information includes:

8. A voice instruction execution control apparatus, characterized in that the apparatus comprises:

and the operation instruction execution module is used for executing the operation corresponding to the operation instruction according to the operation instruction.

9. A terminal device, characterized in that the terminal device comprises a memory, a processor and a voice instruction execution control program stored in the memory and operable on the processor, and the processor executes the voice instruction execution control program to implement the steps of the voice instruction execution control method according to any one of claims 1 to 7.

10. A computer-readable storage medium, wherein a voice instruction execution control program is stored on the computer-readable storage medium, and when the voice instruction execution control program is executed by a processor, the steps of the voice instruction execution control method according to any one of claims 1 to 7 are implemented.