CN116246624A - Voice control method and device of intelligent equipment, storage medium and electronic device - Google Patents

Voice control method and device of intelligent equipment, storage medium and electronic device

Info

Publication number
CN116246624A
CN116246624A
Authority
CN
China
Prior art keywords
voice
equipment
control instruction
voice control
intelligent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310048578.3A
Other languages
Chinese (zh)
Inventor
任学磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Haier Uplus Intelligent Technology Beijing Co Ltd
Original Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Haier Uplus Intelligent Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Haier Technology Co Ltd, Haier Smart Home Co Ltd, Haier Uplus Intelligent Technology Beijing Co Ltd filed Critical Qingdao Haier Technology Co Ltd
Priority to CN202310048578.3A priority Critical patent/CN116246624A/en
Publication of CN116246624A publication Critical patent/CN116246624A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/26 Speech to text systems
    • G10L 15/28 Constructional details of speech recognition systems
    • G10L 15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 specially adapted for particular use for comparison or discrimination
    • G10L 2015/223 Execution procedure of a spoken command
    • G10L 2015/225 Feedback of the input speech
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Selective Calling Equipment (AREA)

Abstract

The present application discloses a voice control method and device of an intelligent device, a storage medium and an electronic device, and relates to the technical field of smart homes. The voice control method of the intelligent device comprises: parsing voice interaction data of a target object to determine a plurality of voice data packets corresponding to the voice interaction data; determining, according to recognition results of the plurality of voice data packets, a voice control instruction of the target object and the intelligent device to be controlled by the voice control instruction; and searching a device instruction set supported by the intelligent device for a target voice control instruction consistent with the voice control instruction, and controlling the intelligent device according to the target voice control instruction. This technical solution solves the technical problem of how to control a device so as to improve its response speed.

Description

Voice control method and device of intelligent equipment, storage medium and electronic device
Technical Field
The present application relates to the technical field of smart homes, and in particular to a voice control method and device of an intelligent device, a storage medium and an electronic device.
Background
Currently, with the continuous progress of technology, more and more intelligent devices support voice control functions. For example, Haier smart home appliances allow an air conditioner to be controlled with a voice instruction such as "Xiaoyou, turn on the air conditioner", without manually operating a remote controller. However, when a smart home appliance is controlled by voice, the device's reply to the user is easily delayed, which degrades the user experience. The solutions in the related art include: 1. increasing the user's home network speed; 2. increasing the response speed of the device.
For the first solution, for technical and cost reasons the industry generally performs ASR (automatic speech recognition), NLP (natural language processing) and TTS (text to speech) on the server side to complete speech collection, speech broadcasting and related work. However, the user's home network speed has a bottleneck and cannot be increased further once a certain threshold is exceeded; moreover, the causes of reply delay include more than differences in network speed, so uniformly increasing the network speed cannot completely solve the problem of reply delay.
For the second solution, the response speed of the device sometimes cannot meet the requirements that the device access platform places on the device; for example, the actual response speed of the device cannot match the time within which the platform requires the device to respond to a command, which is unfavorable for the popularization of the device access platform. In addition, constrained by the device hardware, the response speed of the device cannot be increased without limit.
Accordingly, the related art faces the technical problem of how to control a device so as to improve its response speed.
For this technical problem, no effective solution has yet been proposed in the related art.
Disclosure of Invention
The embodiments of the present application provide a voice control method and device of an intelligent device, a storage medium and an electronic device, so as to at least solve the technical problem in the related art of how to control a device to improve its response speed.
According to an embodiment of the present application, a voice control method of an intelligent device is provided, including: parsing voice interaction data of a target object, and determining a plurality of voice data packets corresponding to the voice interaction data, wherein the plurality of voice data packets comprise a first voice data packet and a second voice data packet, the first voice data packet comprises an entity word indicating the intelligent device, and the second voice data packet comprises the voice interaction data other than the first voice data packet; determining, according to recognition results of the plurality of voice data packets, a voice control instruction of the target object and the intelligent device to be controlled by the voice control instruction; and searching a device instruction set supported by the intelligent device for a target voice control instruction consistent with the voice control instruction, and controlling the intelligent device according to the target voice control instruction.
In an exemplary embodiment, determining a plurality of voice data packets corresponding to the voice interaction data includes: grouping the voice interaction data according to a preset period to obtain a plurality of pieces of voice interaction data, wherein the pieces of voice interaction data are continuous in time; and respectively recognizing each piece of voice interaction data in the plurality of pieces of voice interaction data to obtain the plurality of voice data packets included in each piece of voice interaction data.
In an exemplary embodiment, determining the voice control instruction of the target object according to the recognition result of the plurality of voice data packets includes: respectively comparing the plurality of second voice data packets with preset white noise to obtain data similarity of the plurality of second voice data packets and the preset white noise; determining a second voice data packet corresponding to the maximum data similarity as a target voice data packet; and under the condition that the data similarity of the target voice data packet is larger than a preset threshold value, determining other voice data packets except the target voice data packet from the plurality of second voice data packets, and determining a voice control instruction of the target object according to voice recognition results of the other voice data packets and the first voice data packet.
In one exemplary embodiment, before looking up the target voice control instruction consistent with the voice control instruction from the device instruction set supported by the smart device, the method further comprises: acquiring equipment information sent by the intelligent equipment, and determining the equipment state of the intelligent equipment based on the equipment information; and under the condition that the equipment state of the intelligent equipment is determined to be the working state, acquiring an equipment instruction set supported by the intelligent equipment and preset by the target object.
In an exemplary embodiment, controlling the smart device according to the target voice control instruction includes: acquiring a mutually exclusive instruction set of the intelligent device, wherein the mutually exclusive instruction set comprises: different equipment modes which allow the intelligent equipment to run and voice control instructions which cannot be supported by the different equipment modes; under the condition that the target voice control instruction is not found in the mutually exclusive instruction set, suspending executing the control instruction in the current equipment mode of the intelligent equipment, and executing the target voice control instruction; and after the target voice control instruction is executed, continuing to execute the control instruction in the current equipment mode.
In one exemplary embodiment, after controlling the smart device according to the target voice control instruction, it includes: receiving a reply message of the intelligent equipment, wherein the reply message comprises an execution result of the intelligent equipment after executing the target voice control instruction; and under the condition that the execution result indicates that the intelligent equipment fails to execute the target voice control instruction, if the equipment state of the intelligent equipment is determined to be the working state, the target voice control instruction is sent to the intelligent equipment so as to control the intelligent equipment to execute the target voice control instruction again.
In one exemplary embodiment, before looking up the target voice control instruction consistent with the voice control instruction from the device instruction set supported by the smart device, the method further comprises: acquiring a special equipment instruction set preset for the target object, wherein the special equipment instruction set represents a historical equipment control instruction of the target object; and searching a target voice control instruction consistent with the voice control instruction from the special equipment instruction set.
According to another embodiment of the present application, there is also provided a voice control apparatus of an intelligent device, including: the first determining module is used for analyzing the voice interaction data of the target object and determining a plurality of voice data packets corresponding to the voice interaction data; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data; the second determining module is used for determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to the recognition results of the voice data packets; and the control module is used for searching a target voice control instruction consistent with the voice control instruction from the equipment instruction set supported by the intelligent equipment and controlling the intelligent equipment according to the target voice control instruction.
According to yet another aspect of the embodiments of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the voice control method of the smart device when running.
According to still another aspect of the embodiments of the present application, there is further provided an electronic apparatus including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the voice control method of the smart device through the computer program.
In the embodiment of the application, the voice interaction data of the target object are analyzed, and a plurality of voice data packets corresponding to the voice interaction data are determined; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data; determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to recognition results of the voice data packets; searching a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device, and controlling the intelligent device according to the target voice control instruction; by adopting the technical scheme, the technical problem of how to control the equipment so as to improve the response speed of the equipment is solved, the response speed of the equipment is further improved, and the user experience is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic diagram of a hardware environment of a voice control method of an intelligent device according to an embodiment of the present application;
Fig. 2 is a flowchart of a voice control method of an intelligent device according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a voice control method of an intelligent device according to an embodiment of the present application;
Fig. 4 is a block diagram (one) of a voice control apparatus of an intelligent device according to an embodiment of the present application;
Fig. 5 is a block diagram (two) of a voice control apparatus of an intelligent device according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application will be described below clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to one aspect of the embodiments of the present application, a voice control method of an intelligent device is provided. The voice control method is widely applicable to whole-house intelligent digital control scenarios such as the smart home (Smart Home), the smart home device ecosystem, and the intelligent house (Intelligence House) ecosystem. Optionally, in the present embodiment, the voice control method of the intelligent device may be applied to a hardware environment constituted by the terminal device 102 and the server 104 as shown in Fig. 1. As shown in Fig. 1, the server 104 is connected to the terminal device 102 through a network and may be used to provide services (such as application services) for the terminal or for a client installed on the terminal. A database may be provided on the server or independently of the server to provide data storage services for the server 104, and cloud computing and/or edge computing services may be configured on the server or independently of the server to provide data computing services for the server 104.
The network may include, but is not limited to, at least one of: a wired network and a wireless network. The wired network may include, but is not limited to, at least one of: a wide area network, a metropolitan area network and a local area network; the wireless network may include, but is not limited to, at least one of: WIFI (Wireless Fidelity) and Bluetooth. The terminal device 102 may include, but is not limited to, a PC, a mobile phone, a tablet computer, an intelligent air conditioner, an intelligent range hood, an intelligent refrigerator, an intelligent oven, an intelligent cooktop, an intelligent washing machine, an intelligent water heater, an intelligent washing device, an intelligent dishwasher, an intelligent projection device, an intelligent television, an intelligent clothes hanger, an intelligent curtain, an intelligent audio-video device, an intelligent socket, an intelligent sound box, an intelligent fresh-air device, an intelligent kitchen and bathroom device, an intelligent bathroom device, an intelligent sweeping robot, an intelligent window-cleaning robot, an intelligent mopping robot, an intelligent air purification device, an intelligent steam box, an intelligent microwave oven, an intelligent kitchen appliance, an intelligent purifier, an intelligent water dispenser, an intelligent door lock, and the like.
In this embodiment, a voice control method of an intelligent device is provided and applied to the computer terminal, and fig. 2 is a flowchart of a voice control method of an intelligent device according to an embodiment of the present application, where the flowchart includes the following steps:
Step S202, parsing voice interaction data of a target object, and determining a plurality of voice data packets corresponding to the voice interaction data, wherein the plurality of voice data packets comprise a first voice data packet and a second voice data packet, the first voice data packet comprises an entity word indicating the intelligent device, and the second voice data packet comprises the voice interaction data other than the first voice data packet;
step S204, determining a voice control instruction of the target object and an intelligent device to be controlled by the voice control instruction according to the recognition results of the voice data packets;
step S206, searching a target voice control instruction consistent with the voice control instruction from the equipment instruction set supported by the intelligent equipment, and controlling the intelligent equipment according to the target voice control instruction.
Through the steps, the voice interaction data of the target object are analyzed, and a plurality of voice data packets corresponding to the voice interaction data are determined; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data; determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to recognition results of the voice data packets; the target voice control instruction consistent with the voice control instruction is searched from the equipment instruction set supported by the intelligent equipment, and the intelligent equipment is controlled according to the target voice control instruction, so that the technical problem of how to control equipment to improve the equipment response speed in the related technology is solved, the equipment response speed is further improved, and the user experience is improved.
In the above embodiment, it should be noted that, in the process of searching the device instruction set supported by the intelligent device for a target voice control instruction consistent with the voice control instruction, if the search fails, a preset reply is sent to the user (i.e. the target object); at the same time, the voice interaction data of the user is parsed again to determine the latest plurality of voice data packets, and the recognition results of the latest plurality of voice data packets are then used to determine the voice control instruction of the user and the intelligent device to be controlled by the voice control instruction.
Optionally, the preset reply word may include, but is not limited to, "This operation is not supported in the current mode" and the like. In this embodiment, after a target voice control instruction consistent with the voice control instruction has been found in the device instruction set supported by the intelligent device, a preset reply word such as "OK, the target voice control instruction has been found" may also be sent to the user (i.e. the target object).
In an exemplary embodiment, in order to better understand the process of determining the plurality of voice data packets corresponding to the voice interaction data in the above step S202, the following technical solution is provided, the specific steps of which include: grouping the voice interaction data according to a preset period to obtain a plurality of pieces of voice interaction data, wherein the pieces of voice interaction data are continuous in time; and respectively recognizing each piece of voice interaction data to obtain the voice data packets included in each piece of voice interaction data.
Through the above embodiment, a technical solution for periodically grouping voice interaction data is provided. The preset period may be flexibly adjusted: it may be fixed, in which case the preset time interval between any two consecutive groups of voice interaction data is the same; or it may be changed in real time, in which case the preset time interval between two consecutive groups may differ. Optionally, for example, the interval between two consecutive groups may be set to 2 s for group A and to 3 s for group (A+1).
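As a concrete illustration of this grouping step, the following minimal Python sketch splits a raw PCM audio stream into small, time-continuous packets according to a preset period. The sample rate, the 16-bit PCM format and the VoicePacket container are illustrative assumptions, not details taken from the application.

```python
from dataclasses import dataclass
from typing import Iterator

SAMPLE_RATE = 16_000   # assumed sampling rate in Hz
BYTES_PER_SAMPLE = 2   # assumed 16-bit PCM


@dataclass
class VoicePacket:
    """One piece of voice interaction data, continuous in time with its neighbours."""
    index: int
    start_ms: float
    pcm: bytes


def split_into_packets(pcm_stream: bytes, period_ms: int = 200) -> Iterator[VoicePacket]:
    """Group a raw PCM stream into packets of period_ms milliseconds each.

    The preset period may be kept fixed or changed by the caller between calls;
    packets are emitted in order, so they remain continuous in time.
    """
    bytes_per_packet = int(SAMPLE_RATE * period_ms / 1000) * BYTES_PER_SAMPLE
    for offset in range(0, len(pcm_stream), bytes_per_packet):
        yield VoicePacket(
            index=offset // bytes_per_packet,
            start_ms=offset / (SAMPLE_RATE * BYTES_PER_SAMPLE) * 1000,
            pcm=pcm_stream[offset:offset + bytes_per_packet],
        )


if __name__ == "__main__":
    one_second_of_silence = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE)
    packets = list(split_into_packets(one_second_of_silence, period_ms=200))
    print(len(packets))   # 5 packets of 200 ms each
```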
In an exemplary embodiment, further, in order to better understand the technical solution of determining the voice control instruction of the target object according to the recognition results of the plurality of voice data packets in the step S204, specifically, the following implementation steps are provided: respectively comparing the plurality of second voice data packets with preset white noise to obtain data similarity of the plurality of second voice data packets and the preset white noise; determining a second voice data packet corresponding to the maximum data similarity as a target voice data packet; and under the condition that the data similarity of the target voice data packet is larger than a preset threshold value, determining other voice data packets except the target voice data packet from the plurality of second voice data packets, and determining a voice control instruction of the target object according to voice recognition results of the other voice data packets and the first voice data packet.
Optionally, when the preset white noise data is detected, a noise time period during which the preset white noise data is detected is acquired; if the noise time period is determined to be greater than a preset time period, the intelligent device is controlled to send an inquiry message to the target object, wherein the inquiry message is used to ask whether the target object has finished the current voice interaction process; when a reply message sent by the target object is received, if the reply message indicates that the target object has not finished the current voice interaction process, the voice interaction data of the target object continues to be acquired; and if the reply message indicates that the target object has finished the current voice interaction process, the acquisition of the voice interaction data of the target object is stopped, and the intelligent device is controlled to prompt the target object that the current voice interaction process has ended.
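To make the white-noise comparison concrete, the sketch below scores each second voice data packet against a preset white-noise reference with a simple spectral cosine similarity, discards the most noise-like packet when it exceeds the preset threshold, and asks the user whether the interaction is over after sustained noise. The similarity measure, the threshold values and the NumPy-based helpers are assumptions for illustration; the application does not specify a particular similarity algorithm.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.9   # assumed value of the "preset threshold"
NOISE_PERIOD_MS = 500        # assumed value of the "preset time period"


def spectrum(pcm: np.ndarray) -> np.ndarray:
    """Normalized magnitude spectrum used as a crude fingerprint of a packet."""
    mag = np.abs(np.fft.rfft(pcm.astype(np.float64)))
    norm = np.linalg.norm(mag)
    return mag / norm if norm > 0 else mag


def noise_similarity(pcm: np.ndarray, white_noise_ref: np.ndarray) -> float:
    """Cosine similarity between a packet and the preset white-noise reference."""
    n = min(len(pcm), len(white_noise_ref))
    return float(np.dot(spectrum(pcm[:n]), spectrum(white_noise_ref[:n])))


def drop_noise_packet(second_packets: list, white_noise_ref: np.ndarray) -> list:
    """Discard the second voice data packet most similar to white noise when its
    similarity exceeds the preset threshold; the remaining packets, together with
    the first voice data packet, are passed on to recognition."""
    if not second_packets:
        return second_packets
    scores = [noise_similarity(p, white_noise_ref) for p in second_packets]
    best = int(np.argmax(scores))
    if scores[best] > SIMILARITY_THRESHOLD:
        return [p for i, p in enumerate(second_packets) if i != best]
    return second_packets


def should_query_user(noise_duration_ms: float) -> bool:
    """True when white noise has lasted long enough that the device should ask the
    target object whether the current voice interaction process is finished."""
    return noise_duration_ms > NOISE_PERIOD_MS
```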
In an exemplary embodiment, before searching for a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device, further, other technical solutions are further provided, including: acquiring equipment information sent by the intelligent equipment, and determining the equipment state of the intelligent equipment based on the equipment information; and under the condition that the equipment state of the intelligent equipment is determined to be the working state, acquiring an equipment instruction set supported by the intelligent equipment and preset by the target object.
By means of this embodiment, the state information of the intelligent device can be queried in advance and the device instruction set supported by the intelligent device can be determined in advance, which saves the time that would otherwise be spent searching for the device instruction set after the user finishes the voice interaction, thereby improving the device control efficiency.
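A minimal sketch of this pre-query is given below, assuming a hypothetical server-side interface consisting of a query_device_info call and a load_instruction_set call (neither name comes from the application).

```python
from typing import Callable, Optional, Set

WORKING_STATE = "working"    # assumed representation of the "working state"


def prefetch_instruction_set(device_id: str,
                             query_device_info: Callable[[str], dict],
                             load_instruction_set: Callable[[str], Set[str]]
                             ) -> Optional[Set[str]]:
    """Query the device state ahead of time and, only when the device is in the
    working state, fetch the device instruction set preset by the target object,
    so that the lookup is already available when the user finishes speaking."""
    device_info = query_device_info(device_id)   # device information sent by the device
    if device_info.get("state") != WORKING_STATE:
        return None                              # not working: nothing to prefetch
    return load_instruction_set(device_id)
```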
In an exemplary embodiment, a technical solution for explaining the controlling of the intelligent device according to the target voice control instruction in the step S206 is provided, and the specific steps include: acquiring a mutually exclusive instruction set of the intelligent device, wherein the mutually exclusive instruction set comprises: different equipment modes which allow the intelligent equipment to run and voice control instructions which cannot be supported by the different equipment modes; under the condition that the target voice control instruction is not found in the mutually exclusive instruction set, suspending executing the control instruction in the current equipment mode of the intelligent equipment, and executing the target voice control instruction; and after the target voice control instruction is executed, continuing to execute the control instruction in the current equipment mode.
It should be noted that the control instruction in the current device mode of the intelligent device may include, for example, a remote control instruction, an on-site control instruction, and the like, but is not limited thereto.
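The mutual exclusion check and the pause/resume behaviour can be sketched as follows; the MUTEX_SET contents, the mode names and the device object's pause_current_mode, resume_current_mode and execute methods are hypothetical stand-ins rather than an interface defined by the application.

```python
from contextlib import contextmanager

# Assumed shape of the mutually exclusive instruction set: for each device mode
# the intelligent device is allowed to run, the voice control instructions that
# the mode cannot support.
MUTEX_SET = {
    "sleep_mode": {"increase_fan_speed", "strong_cooling"},
    "dry_mode": {"humidify"},
}


def is_mutually_exclusive(current_mode: str, instruction: str) -> bool:
    """True if the target instruction is listed as unsupported in the current mode."""
    return instruction in MUTEX_SET.get(current_mode, set())


@contextmanager
def paused_mode(device):
    """Suspend the control instruction of the current device mode and resume it
    after the body has finished."""
    device.pause_current_mode()
    try:
        yield
    finally:
        device.resume_current_mode()


def execute_if_allowed(device, current_mode: str, instruction: str) -> bool:
    """Execute the target voice control instruction only when it is not found in
    the mutually exclusive instruction set."""
    if is_mutually_exclusive(current_mode, instruction):
        return False                      # caller broadcasts the "not supported" reply
    with paused_mode(device):
        device.execute(instruction)       # run the target voice control instruction
    return True                           # current-mode control instruction resumes
```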
In an exemplary embodiment, further, after the intelligent device is controlled according to the target voice control instruction, the following technical solutions are further provided, where the specific steps include: receiving a reply message of the intelligent equipment, wherein the reply message comprises an execution result of the intelligent equipment after executing the target voice control instruction; and under the condition that the execution result indicates that the intelligent equipment fails to execute the target voice control instruction, if the equipment state of the intelligent equipment is determined to be the working state, the target voice control instruction is sent to the intelligent equipment so as to control the intelligent equipment to execute the target voice control instruction again.
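A sketch of this retry handling is given below; the query_device_state and send_instruction callables represent whatever server-side interface actually carries the messages, which the application does not name.

```python
from typing import Callable

WORKING_STATE = "working"    # assumed representation of the "working state"


def handle_execution_reply(device_id: str,
                           target_instruction: str,
                           execution_succeeded: bool,
                           query_device_state: Callable[[str], str],
                           send_instruction: Callable[[str, str], bool],
                           max_retries: int = 1) -> bool:
    """Re-send the target voice control instruction when the reply message shows
    the device failed to execute it and the device is still in the working state."""
    if execution_succeeded:
        return True
    for _ in range(max_retries):
        if query_device_state(device_id) != WORKING_STATE:
            break                          # device no longer working: stop retrying
        if send_instruction(device_id, target_instruction):
            return True                    # retry succeeded
    return False                           # caller informs the user of the failure
```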
In an exemplary embodiment, further, before searching for a target voice control instruction consistent with the voice control instruction from the device instruction set supported by the smart device, a specific device instruction set preset for the target object may be further acquired, where the specific device instruction set represents a historical device control instruction of the target object; and searching a target voice control instruction consistent with the voice control instruction from the special equipment instruction set.
Further, in this embodiment, in the case that the search is successful, the intelligent device is controlled according to the searched target voice control instruction; and under the condition of failure in searching, searching a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device.
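The two-level lookup (dedicated instruction set first, general device instruction set as the fallback) reduces to a simple search; the normalization step in the sketch below is an illustrative assumption.

```python
from typing import Optional, Set


def find_target_instruction(voice_instruction: str,
                            dedicated_set: Set[str],
                            supported_set: Set[str]) -> Optional[str]:
    """Search the target object's dedicated (historical) instruction set first;
    fall back to the device instruction set supported by the intelligent device
    only when that search fails."""
    normalized = voice_instruction.strip().lower()    # assumed normalization
    if normalized in dedicated_set:
        return normalized                 # hit in the user's historical instructions
    if normalized in supported_set:
        return normalized                 # hit in the general device instruction set
    return None                           # both searches failed: broadcast a preset reply


# Example: a dedicated-set hit avoids searching the larger supported set.
print(find_target_instruction("Turn on the air conditioner",
                              {"turn on the air conditioner"},
                              {"turn on the air conditioner", "turn off the air conditioner"}))
```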
In an optional embodiment, a preset reply word corresponding to the target voice control instruction is obtained; a broadcast form supported by the intelligent device is determined according to the device type of the intelligent device, and the preset reply word is broadcast in the broadcast form supported by the intelligent device.
The broadcast forms supported by the intelligent device may include, for example, audio and video broadcasting, text broadcasting, pop-up broadcasting, virtual-image (avatar) broadcasting, and the like, which is not limited in the present application.
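A sketch of choosing the broadcast form by device type follows; the mapping and the device type names are made-up examples, and print() stands in for the real output channels.

```python
# Assumed mapping from device type to supported broadcast forms; the application
# mentions audio/video, text, pop-up and virtual-image (avatar) broadcasting.
BROADCAST_FORMS = {
    "smart_speaker": ["audio"],
    "smart_tv": ["audio_video", "pop_up", "avatar"],
    "smart_fridge": ["text", "audio"],
}


def broadcast_reply(device_type: str, preset_reply: str) -> list:
    """Broadcast the preset reply word in every form the device type supports,
    defaulting to text when the type is unknown."""
    forms = BROADCAST_FORMS.get(device_type, ["text"])
    for form in forms:
        print(f"[{form}] {preset_reply}")   # placeholder for the actual broadcast
    return forms


broadcast_reply("smart_tv", "OK, the operation has been performed for you")
```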
In order to better understand the voice control method of the intelligent device, the implementation flow of the voice control of the intelligent device is described below with reference to an optional embodiment, which, however, does not limit the technical solution of the embodiments of the present application.
In this embodiment, a voice control method of an intelligent device is provided in conjunction with fig. 3, and fig. 3 is a schematic diagram of a voice control method of an intelligent device according to an embodiment of the present application, as shown in fig. 3, and specific steps are as follows:
Step 1, the voice interaction data of the user can be split and uploaded to the server side in a small audio packet mode, for example, the audio duration of each small audio packet can be set to be less than 200ms.
Optionally, whether the user has ended the voice interaction may be determined based on whether the audio within the small audio packets is white noise; for example, when the audio within consecutive small audio packets is recognized as white noise for more than 500 ms, the user is considered to have finished speaking.
Step 2, performing streaming ASR (i.e. recognizing the uploaded audio in real time) and returning the recognition result in real time;
step 3, carrying out streaming NLP (i.e. carrying out real-time semantic understanding on the identified content), and identifying the user intention in real time;
step 4, inquiring the state information of the intelligent equipment;
it should be noted that, step 2, step 3, and step 4 may be performed simultaneously or sequentially, which is not limited in this application.
Step 5, pre-synthesizing the message to be broadcast (namely the preset reply word) according to the intention identified in steps 2-3.
Step 6, performing mutual exclusion logic verification on the intention identified in steps 2-3 (namely, checking whether the user command is supported for execution in the current mode of the intelligent device) and broadcasting the corresponding reply according to the verification result. If the verification passes, the success reply is broadcast directly, for example "OK, the operation has been performed for you"; if the verification does not pass, the mutually exclusive reply is broadcast, for example "This operation is not supported in the current mode", "The operation has already been performed in the XX mode", and the like.
Step 7, performing fallback processing: in the case of a broadcast failure, the latest state of the device is re-acquired through the server side and the command is automatically retried; if it still fails, the user is informed. This step generally applies to scenarios that have passed the mutual exclusion verification but still fail during execution.
Through the embodiment, the equipment can be controlled according to the state information of the intelligent equipment, and the response accuracy of the equipment is improved through the mutual exclusion logic verification.
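The seven steps above can be wired together roughly as follows. The asr, nlp, device and tts arguments are duck-typed stand-ins (their method names are assumptions, not an interface defined by the application); the sketch only shows how the steps interleave, in particular that the device-state query of step 4 runs in parallel with recognition.

```python
import concurrent.futures


def run_voice_pipeline(audio_packets, asr, nlp, device, tts):
    """Steps 1-7: streaming recognition, streaming intent understanding, a
    parallel device-state query, reply pre-synthesis, mutual exclusion
    verification and fallback retry."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        state_future = pool.submit(device.query_state)        # step 4, in parallel
        text = ""
        for packet in audio_packets:                          # step 1: small audio packets
            text = asr.feed(packet)                           # step 2: streaming ASR
        intent = nlp.understand(text)                         # step 3: streaming NLP
        reply_audio = tts.presynthesize(intent.reply_text)    # step 5: pre-synthesis
        state = state_future.result()

    if device.is_mutually_exclusive(state.mode, intent.instruction):     # step 6
        return device.broadcast("This operation is not supported in the current mode")
    if device.execute(intent.instruction):
        return device.broadcast_audio(reply_audio)            # success reply
    # Step 7 (fallback): re-acquire the latest device state and retry once.
    if device.query_state().is_working and device.execute(intent.instruction):
        return device.broadcast_audio(reply_audio)
    return device.broadcast("The operation failed, please try again later")
```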
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the embodiments of the present application.
Fig. 4 is a block diagram (one) of a voice control apparatus of an intelligent device according to an embodiment of the present application; as shown in Fig. 4, the apparatus includes:
the first determining module 42 is configured to parse the voice interaction data of the target object, and determine a plurality of voice data packets corresponding to the voice interaction data; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data;
a second determining module 44, configured to determine a voice control instruction of the target object and an intelligent device to be controlled by the voice control instruction according to recognition results of the plurality of voice data packets;
and the control module 46 is configured to search a device instruction set supported by the intelligent device for a target voice control instruction consistent with the voice control instruction, and control the intelligent device according to the target voice control instruction.
Through the device, the voice interaction data of the target object are analyzed, and a plurality of voice data packets corresponding to the voice interaction data are determined; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data; determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to recognition results of the voice data packets; the target voice control instruction consistent with the voice control instruction is searched from the equipment instruction set supported by the intelligent equipment, and the intelligent equipment is controlled according to the target voice control instruction, so that the technical problem of how to control equipment to improve the equipment response speed in the related technology is solved, the equipment response speed is further improved, and the user experience is improved.
In the above embodiment, it should be noted that, in the process of searching the device instruction set supported by the intelligent device for a target voice control instruction consistent with the voice control instruction, if the search fails, a preset reply is sent to the user (i.e. the target object); at the same time, the voice interaction data of the user is parsed again to determine the latest plurality of voice data packets, and the recognition results of the latest plurality of voice data packets are then used to determine the voice control instruction of the user and the intelligent device to be controlled by the voice control instruction.
Optionally, the preset reply word may include, but is not limited to, "This operation is not supported in the current mode" and the like. In this embodiment, after a target voice control instruction consistent with the voice control instruction has been found in the device instruction set supported by the intelligent device, a preset reply word such as "OK, the target voice control instruction has been found" may also be sent to the user (i.e. the target object).
In an exemplary embodiment, the first determining module 42 is further configured to: grouping the voice interaction data according to a preset period to obtain a plurality of voice interaction data, wherein the voice interaction data are continuous in time; and respectively identifying each piece of voice interaction data in the plurality of pieces of voice interaction data to obtain a voice data packet included in each piece of voice data.
Through the above embodiment, a technical solution for periodically grouping voice interaction data is provided. The preset period may be flexibly adjusted: it may be fixed, in which case the preset time interval between any two consecutive groups of voice interaction data is the same; or it may be changed in real time, in which case the preset time interval between two consecutive groups may differ. Optionally, for example, the interval between two consecutive groups may be set to 2 s for group A and to 3 s for group (A+1).
In an exemplary embodiment, the second determining module 44 is further configured to: respectively comparing the plurality of second voice data packets with preset white noise to obtain data similarity of the plurality of second voice data packets and the preset white noise; determining a second voice data packet corresponding to the maximum data similarity as a target voice data packet; and under the condition that the data similarity of the target voice data packet is larger than a preset threshold value, determining other voice data packets except the target voice data packet from the plurality of second voice data packets, and determining a voice control instruction of the target object according to voice recognition results of the other voice data packets and the first voice data packet.
Optionally, when the preset white noise data is detected, a noise time period during which the preset white noise data is detected is acquired; if the noise time period is determined to be greater than a preset time period, the intelligent device is controlled to send an inquiry message to the target object, wherein the inquiry message is used to ask whether the target object has finished the current voice interaction process; when a reply message sent by the target object is received, if the reply message indicates that the target object has not finished the current voice interaction process, the voice interaction data of the target object continues to be acquired; and if the reply message indicates that the target object has finished the current voice interaction process, the acquisition of the voice interaction data of the target object is stopped, and the intelligent device is controlled to prompt the target object that the current voice interaction process has ended.
In an exemplary embodiment, the second determining module 44 is further configured to: before a target voice control instruction consistent with the voice control instruction is searched from a device instruction set supported by the intelligent device, obtaining device information sent by the intelligent device, and determining the device state of the intelligent device based on the device information; and under the condition that the equipment state of the intelligent equipment is determined to be the working state, acquiring an equipment instruction set supported by the intelligent equipment and preset by the target object.
In one exemplary embodiment, the control module 46 is further configured to: acquiring a mutually exclusive instruction set of the intelligent device, wherein the mutually exclusive instruction set comprises: different equipment modes which allow the intelligent equipment to run and voice control instructions which cannot be supported by the different equipment modes; under the condition that the target voice control instruction is not found in the mutually exclusive instruction set, suspending executing the control instruction in the current equipment mode of the intelligent equipment, and executing the target voice control instruction; and after the target voice control instruction is executed, continuing to execute the control instruction in the current equipment mode.
It should be noted that the control instruction in the current device mode of the intelligent device may include, for example, a remote control instruction, an on-site control instruction, and the like, but is not limited thereto.
In one exemplary embodiment, further, the control module 46 is further configured to: receiving a reply message of the intelligent equipment, wherein the reply message comprises an execution result of the intelligent equipment after executing the target voice control instruction; and under the condition that the execution result indicates that the intelligent equipment fails to execute the target voice control instruction, if the equipment state of the intelligent equipment is determined to be the working state, the target voice control instruction is sent to the intelligent equipment so as to control the intelligent equipment to execute the target voice control instruction again.
In an exemplary embodiment, as shown in fig. 5, fig. 5 is a block diagram (two) of a voice control apparatus of an intelligent device according to an embodiment of the present application, where the voice control apparatus of an intelligent device further includes, in addition to the first determining module 42, the second determining module 44, and the control module 46, a searching module 52, configured to: before searching a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device, a special device instruction set preset for the target object can be obtained, wherein the special device instruction set represents a historical device control instruction of the target object; and searching a target voice control instruction consistent with the voice control instruction from the special equipment instruction set.
Further, in this embodiment, in the case that the search is successful, the intelligent device is controlled according to the searched target voice control instruction; and under the condition of failure in searching, searching a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device.
In an optional embodiment, a preset reply word corresponding to the target voice control instruction is obtained; a broadcast form supported by the intelligent device is determined according to the device type of the intelligent device, and the preset reply word is broadcast in the broadcast form supported by the intelligent device.
The broadcasting forms supported by the intelligent device can include audio and video broadcasting, text broadcasting, popup broadcasting, virtual image broadcasting and the like, for example, and the application is not limited to this.
Embodiments of the present application also provide a storage medium including a stored program, wherein the program performs the method of any one of the above when run.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store program code for performing the steps of:
s1, analyzing voice interaction data of a target object, and determining a plurality of voice data packets corresponding to the voice interaction data; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data;
s2, determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to recognition results of the voice data packets;
s3, searching a target voice control instruction consistent with the voice control instruction from the equipment instruction set supported by the intelligent equipment, and controlling the intelligent equipment according to the target voice control instruction.
Embodiments of the present application also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
s1, analyzing voice interaction data of a target object, and determining a plurality of voice data packets corresponding to the voice interaction data; wherein the plurality of voice data packets comprises: a first voice data packet and a second voice data packet, the first voice data packet comprising: the entity word for indicating the intelligent device, the second voice data packet includes: other voice interaction data except the first voice data packet in the voice interaction data;
s2, determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to recognition results of the voice data packets;
S3, searching a target voice control instruction consistent with the voice control instruction from the equipment instruction set supported by the intelligent equipment, and controlling the intelligent equipment according to the target voice control instruction.
Alternatively, in the present embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Alternatively, specific examples in this embodiment may refer to examples described in the foregoing embodiments and optional implementations, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the present application described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network formed by multiple computing devices; optionally, they may be implemented by program code executable by computing devices, so that they may be stored in a storage device and executed by the computing devices, and in some cases the steps shown or described may be performed in an order different from that described herein; alternatively, they may be respectively fabricated as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing is merely a preferred embodiment of the present application. It should be noted that several improvements and modifications may be made by those skilled in the art without departing from the principles of the present application, and such improvements and modifications shall also be regarded as falling within the scope of protection of the present application.

Claims (10)

1. A voice control method of an intelligent device, characterized by comprising the following steps:
parsing voice interaction data of a target object, and determining a plurality of voice data packets corresponding to the voice interaction data, wherein the plurality of voice data packets comprise a first voice data packet and a second voice data packet, the first voice data packet comprises an entity word indicating the intelligent device, and the second voice data packet comprises the voice interaction data other than the first voice data packet;
determining a voice control instruction of the target object and intelligent equipment to be controlled by the voice control instruction according to recognition results of the voice data packets;
and searching a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device, and controlling the intelligent device according to the target voice control instruction.
2. The voice control method of an intelligent device according to claim 1, wherein determining a plurality of voice data packets corresponding to the voice interaction data includes:
grouping the voice interaction data according to a preset period to obtain a plurality of pieces of voice interaction data, wherein the pieces of voice interaction data are continuous in time;
and respectively recognizing each piece of voice interaction data in the plurality of pieces of voice interaction data to obtain the plurality of voice data packets included in each piece of voice interaction data.
3. The voice control method of the intelligent device according to claim 1, wherein determining the voice control instruction of the target object according to the recognition result of the plurality of voice data packets comprises:
respectively comparing the plurality of second voice data packets with preset white noise to obtain data similarity of the plurality of second voice data packets and the preset white noise;
determining a second voice data packet corresponding to the maximum data similarity as a target voice data packet;
and under the condition that the data similarity of the target voice data packet is larger than a preset threshold value, determining other voice data packets except the target voice data packet from the plurality of second voice data packets, and determining a voice control instruction of the target object according to voice recognition results of the other voice data packets and the first voice data packet.
4. The voice control method of an intelligent device according to claim 1, further comprising, before searching a target voice control instruction consistent with the voice control instruction from a device instruction set supported by the intelligent device:
acquiring equipment information sent by the intelligent equipment, and determining the equipment state of the intelligent equipment based on the equipment information;
and under the condition that the equipment state of the intelligent equipment is determined to be the working state, acquiring an equipment instruction set supported by the intelligent equipment and preset by the target object.
5. The voice control method of an intelligent device according to claim 1 or 4, wherein controlling the intelligent device according to the target voice control instruction comprises:
acquiring a mutually exclusive instruction set of the intelligent device, wherein the mutually exclusive instruction set comprises: different equipment modes which allow the intelligent equipment to run and voice control instructions which cannot be supported by the different equipment modes;
under the condition that the target voice control instruction is not found in the mutually exclusive instruction set, suspending executing the control instruction in the current equipment mode of the intelligent equipment, and executing the target voice control instruction; and after the target voice control instruction is executed, continuing to execute the control instruction in the current equipment mode.
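The mutual-exclusion check of claim 5 could be sketched as follows; the mode names, the command strings, and the suspend/resume behaviour reduced to a returned status string are all illustrative assumptions.

```python
MUTEX_INSTRUCTION_SET = {
    # device mode -> voice control instructions that mode cannot support
    "self_clean": {"power off"},
    "sleep": {"strong wind"},
}

def run_with_mode_interrupt(current_mode, target_command):
    unsupported = MUTEX_INSTRUCTION_SET.get(current_mode, set())
    if target_command in unsupported:
        return f"'{target_command}' conflicts with mode '{current_mode}': not executed"
    # Suspend the current mode's instruction, execute the target command, then resume.
    return (f"suspended '{current_mode}', executed '{target_command}', "
            f"resumed '{current_mode}'")

print(run_with_mode_interrupt("self_clean", "set 26 degrees"))
print(run_with_mode_interrupt("self_clean", "power off"))
```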
6. The voice control method of an intelligent device according to claim 1, comprising, after controlling the intelligent device according to the target voice control instruction:
receiving a reply message from the intelligent device, wherein the reply message comprises an execution result of the intelligent device after executing the target voice control instruction; and
in a case where the execution result indicates that the intelligent device failed to execute the target voice control instruction, if the device state of the intelligent device is determined to be a working state, sending the target voice control instruction to the intelligent device again so as to control the intelligent device to re-execute the target voice control instruction.
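Claim 6's retry-on-failure behaviour, sketched with a fake send() transport and a single re-issue; the reply fields "result" and "state" are assumed names, not the patented message format.

```python
import random

def send(device, command):
    """Fake transport: returns a reply message carrying the execution result."""
    ok = random.random() > 0.3
    return {"device": device, "result": "success" if ok else "failure",
            "state": "working"}

def control_with_retry(device, command):
    reply = send(device, command)
    if reply["result"] == "failure" and reply["state"] == "working":
        # Re-issue the target voice control instruction once.
        reply = send(device, command)
    return reply

print(control_with_retry("air conditioner", "power on"))
```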
7. The voice control method of an intelligent device according to claim 1, further comprising, before searching the device instruction set supported by the intelligent device for the target voice control instruction consistent with the voice control instruction:
acquiring a special device instruction set preset for the target object, wherein the special device instruction set represents historical device control instructions of the target object; and
searching the special device instruction set for the target voice control instruction consistent with the voice control instruction.
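One possible reading of claim 7's special instruction set, built here from the user's most frequent historical commands; the frequency heuristic and the substring match are assumptions made for the sketch.

```python
from collections import Counter

def build_special_set(history, top_n=3):
    """Keep the target object's most frequent historical device control instructions."""
    return {cmd for cmd, _ in Counter(history).most_common(top_n)}

def find_target(command_text, special_set):
    """Search the special instruction set before falling back to the generic one."""
    return next((c for c in special_set if c in command_text), None)

special = build_special_set(["power on", "power on", "set 26 degrees", "power off"])
print(find_target("please power on the air conditioner", special))  # power on
```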
8. A voice control apparatus for an intelligent device, characterized by comprising:
a first determining module, configured to parse voice interaction data of a target object and determine a plurality of voice data packets corresponding to the voice interaction data, wherein the plurality of voice data packets comprise a first voice data packet and a second voice data packet, the first voice data packet comprising an entity word indicating the intelligent device, and the second voice data packet comprising the voice interaction data other than the first voice data packet;
a second determining module, configured to determine, according to recognition results of the plurality of voice data packets, a voice control instruction of the target object and the intelligent device to be controlled by the voice control instruction; and
a control module, configured to search a device instruction set supported by the intelligent device for a target voice control instruction consistent with the voice control instruction and to control the intelligent device according to the target voice control instruction.
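Claim 8's module split, shown as a purely structural class whose three injected callables correspond to the claimed modules; the method name and signatures are assumptions, not part of the claim.

```python
class VoiceControlApparatus:
    """Structural sketch of the three-module apparatus; behaviour is injected."""

    def __init__(self, first_determining, second_determining, control):
        self.first_determining_module = first_determining    # packets from interaction data
        self.second_determining_module = second_determining  # command + target device
        self.control_module = control                         # match and dispatch

    def handle(self, voice_interaction_data):
        packets = self.first_determining_module(voice_interaction_data)
        command, device = self.second_determining_module(packets)
        return self.control_module(command, device)
```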
9. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein the program, when run, performs the method of any one of claims 1 to 7.
10. An electronic device comprising a memory and a processor, characterized in that the memory stores a computer program and the processor is configured to execute the method of any one of claims 1 to 7 by means of the computer program.
CN202310048578.3A 2023-01-31 2023-01-31 Voice control method and device of intelligent equipment, storage medium and electronic device Pending CN116246624A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310048578.3A CN116246624A (en) 2023-01-31 2023-01-31 Voice control method and device of intelligent equipment, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN116246624A true CN116246624A (en) 2023-06-09

Family

ID=86632347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310048578.3A Pending CN116246624A (en) 2023-01-31 2023-01-31 Voice control method and device of intelligent equipment, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN116246624A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination