CN212624795U - Interactive system, voice interaction equipment and control equipment - Google Patents


Info

Publication number
CN212624795U
Authority
CN
China
Prior art keywords
voice interaction, button, corpus text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202020557184.2U
Other languages
Chinese (zh)
Inventor
刘兆健
葛佩
汪贇
李岳冰
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Application granted
Publication of CN212624795U
Legal status: Active

Abstract

Embodiments of the utility model provide an interactive system, a voice interaction device, and a control device, relating to the field of communication technology. The system comprises: a voice interaction device and a control device communicatively connected to it, wherein the control device is configured to detect an operation performed on the control device for interacting with the voice interaction device and, based on that operation, to send a corresponding corpus text to the voice interaction device; and the voice interaction device is configured to receive the corpus text sent by the control device and to execute a corresponding operation instruction based on the corpus text. Through the embodiments of the utility model, a user can conveniently interact with the voice interaction device even when voice interaction with it is unsuitable.

Description

Interactive system, voice interaction equipment and control equipment
Technical Field
Embodiments of the utility model relate to the field of communication technology, and in particular to an interactive system, a voice interaction device, and a control device.
Background
With the development of terminal technology, voice interaction devices such as smart speakers have become increasingly popular. A voice interaction device collects a user's voice data and then provides services according to the collected data, bringing automation and intelligence to the user's daily life. For example, when a user says "I want to listen to a children's story" to a smart speaker, the speaker collects the voice data through a microphone, converts it to text using automatic speech recognition, and then uses natural language processing to recognize the user's intention from the text and execute the corresponding instruction. In some application scenarios, however, voice is not a suitable way for the user to interact with the device. For example, a user whose vocal cords are injured may be unable to speak for a period of time yet still wish to use the device. As another example, a child's pronunciation may be non-standard, or the user may speak a dialect the device cannot recognize. As yet another example, speech recognition accuracy is low in noisy environments. All of these scenarios degrade the user's voice interaction experience. Therefore, how to interact conveniently with a voice interaction device when voice interaction with it is unsuitable has become a technical problem to be solved.
SUMMARY OF THE UTILITY MODEL
The purpose of the application is to provide an interactive system, a voice interaction device, and a control device, so as to solve the technical problem in the prior art of how to interact conveniently with a voice interaction device when voice interaction with it is unsuitable.
According to a first aspect of embodiments of the utility model, an interactive system is provided. The system comprises: a voice interaction device and a control device communicatively connected to it, wherein the control device is configured to detect an operation performed on the control device for interacting with the voice interaction device and, based on that operation, to send a corresponding corpus text to the voice interaction device; and the voice interaction device is configured to receive the corpus text sent by the control device and to execute a corresponding operation instruction based on the corpus text.
According to a second aspect of embodiments of the utility model, an interactive system is provided. The system comprises: a floor sweeper with a voice interaction function and a button panel communicatively connected to the sweeper, wherein the button panel is configured to detect an operation performed on the button panel for interacting with the sweeper and, based on that operation, to send a corpus text for sweeping the kitchen to the sweeper; and the sweeper is configured to receive the corpus text sent by the button panel and to sweep the kitchen floor based on the corpus text.
According to a third aspect of embodiments of the utility model, an interactive system is provided. The system comprises: a voice interaction device and a control device communicatively connected to it, wherein the control device is configured to detect an operation performed on the control device for interacting with the voice interaction device, determine the time point of the operation, and send a corresponding corpus text to the voice interaction device based on the customized time period within which that time point falls; and the voice interaction device is configured to receive the corpus text sent by the control device and to execute a corresponding operation instruction based on the corpus text.
According to a fourth aspect of embodiments of the utility model, a voice interaction device is provided. The device comprises: a controller, and a microphone, a loudspeaker, and a short-range wireless communication component connected to the controller through a circuit board. The microphone collects the user's voice data and sends it to the controller, which forwards it to the cloud, so that the cloud executes the corresponding voice instruction according to the voice data and controls the loudspeaker to play the execution result. The short-range wireless communication component receives a corpus text that a control device interacting with the voice interaction device sends based on an operation performed on the control device, and passes the corpus text to the controller, which forwards it to the cloud, so that the cloud executes the corresponding operation instruction according to the corpus text and controls the loudspeaker to play the execution result.
According to a fifth aspect of embodiments of the utility model, a voice interaction device is provided. The device comprises: a controller, and a microphone, a loudspeaker, and a button connected to the controller through a circuit board. The microphone collects the user's voice data and sends it to the controller, which forwards it to the cloud, so that the cloud executes the corresponding voice instruction according to the voice data and controls the loudspeaker to play the execution result. The button is configured to detect an operation performed on it and, based on that operation, to send a corresponding corpus text to the controller, which forwards the text to the cloud, so that the cloud executes the corresponding operation instruction according to the corpus text and controls the loudspeaker to play the execution result.
According to a sixth aspect of embodiments of the utility model, a control device is provided. The device comprises: a key, and a controller configured to send a corresponding corpus text to a voice interaction device when it detects that the key is pressed, so that the voice interaction device executes a corresponding operation instruction based on the corpus text.
According to a seventh aspect of embodiments of the utility model, a control device is provided. The device comprises: a sensor for detecting an operation performed on the control device; and a controller communicatively connected to the sensor and configured to, when the sensor detects an operation on the control device, send a corresponding corpus text to the voice interaction device based on that operation, so that the voice interaction device executes a corresponding operation instruction based on the corpus text.
According to an eighth aspect of embodiments of the utility model, a control device is provided. The device comprises: an input component configured to determine the corpus text entered by the user based on the user's corpus text input operation and to send that text to a controller communicatively connected to the input component; and the controller, configured to send the user-entered corpus text to the voice interaction device, so that the voice interaction device executes a corresponding operation instruction based on it.
According to the interaction scheme provided by the embodiments of the utility model, a control device communicatively connected to a voice interaction device detects an operation performed on the control device for interacting with the voice interaction device and, based on that operation, sends a corresponding corpus text to the voice interaction device; the voice interaction device receives the corpus text and executes a corresponding operation instruction based on it. Compared with existing approaches, this allows a user to interact conveniently with the voice interaction device even when voice interaction with it is unsuitable.
Drawings
Other features, objects and advantages of the utility model will become more apparent upon reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
fig. 1A is a schematic structural diagram of an interactive system according to the first embodiment of the utility model;
fig. 1B is a schematic diagram of an interaction process according to the first embodiment of the utility model;
fig. 2A is a schematic structural diagram of an interactive system according to the second embodiment of the utility model;
fig. 2B is a schematic diagram of an interaction process according to the second embodiment of the utility model;
fig. 3 is a schematic structural diagram of an interactive system according to the third embodiment of the utility model;
fig. 4A is a schematic structural diagram of a voice interaction device according to the fourth embodiment of the utility model;
fig. 4B is a schematic diagram of an interaction process according to the fourth embodiment of the utility model;
fig. 5 is a schematic structural diagram of a voice interaction device according to the fifth embodiment of the utility model;
fig. 6A is a schematic structural diagram of a control device according to the sixth embodiment of the utility model;
fig. 6B is a schematic diagram of an interaction process according to the sixth embodiment of the utility model;
fig. 7A is a schematic structural diagram of a control device according to the seventh embodiment of the utility model;
fig. 7B is a schematic diagram of an interaction process according to the seventh embodiment of the utility model;
fig. 8A is a schematic structural diagram of a control device according to the eighth embodiment of the utility model;
fig. 8B is a schematic diagram of an interaction process according to the eighth embodiment of the utility model;
fig. 9 is a flowchart of the steps of an interaction method according to the ninth embodiment of the utility model;
fig. 10 is a schematic structural diagram of an interaction apparatus according to the tenth embodiment of the utility model;
fig. 11 is a schematic structural diagram of an electronic device according to the eleventh embodiment of the utility model;
fig. 12 is a hardware structure diagram of an electronic device according to the twelfth embodiment of the utility model.
Detailed Description
The utility model is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the relevant invention and do not limit it. It should also be noted that, for convenience of description, only the portions relevant to the invention are shown in the drawings.
It should be noted that the embodiments of the utility model, and the features within them, may be combined with one another as long as they do not conflict. The utility model is described in detail below with reference to the drawings and in conjunction with the embodiments.
Referring to fig. 1A, a schematic structural diagram of an interactive system according to the first embodiment of the utility model is shown. The interactive system provided in this embodiment comprises: a voice interaction device 10 and a control device 20 communicatively connected to it, where the control device 20 is configured to detect an operation performed on the control device 20 for interacting with the voice interaction device and to send a corresponding corpus text to the voice interaction device 10 based on that operation; and the voice interaction device 10 is configured to receive the corpus text sent by the control device 20 and to execute a corresponding operation instruction based on it.
In this embodiment, the voice interaction device 10 includes at least one of the following: a smart speaker with a voice interaction function, a television with a voice interaction function, and a mobile phone with a voice interaction function. A voice interaction function is one through which a user interacts with the device by voice. The control device 20 can be understood as a device that gives the user a control entry point to the voice interaction device; it may be a button panel, a child's toy, a distributed button, a handwriting pad, and so on. The control device 20 interacts with the voice interaction device 10 through at least one of the following communication means: near field communication, Bluetooth, Zigbee, or a wireless local area network. The operation performed on the control device 20 may be a pressing operation on one of its keys, a sending operation for a corpus text the user entered on the device, or the like. A corpus text is a text that characterizes the user's intention, such as "play music" or "listen to a children's story". It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, when the voice interaction device 10 executes an operation instruction based on the corpus text, it performs semantic analysis on the corpus text to obtain a semantic analysis result, matches the result against operation instructions, and then executes the matched instruction. To a machine, a text is merely a string of characters; semantic recognition is needed to determine the natural-language meaning the text expresses, so that the content of the corpus text can be understood. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
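The semantic-analysis-and-matching step described above can be sketched as follows. This is a minimal illustrative Python sketch, not the patent's actual implementation: the intent table, the keyword-containment matcher, and all names are assumptions for illustration.

```python
# Hypothetical intent table: corpus-text phrase -> operation instruction.
INTENT_TABLE = {
    "play music": "PLAY_MUSIC",
    "children's story": "PLAY_CHILDRENS_STORY",
    "turn off the light": "LIGHT_OFF",
}

def match_instruction(corpus_text):
    """Return the operation instruction whose registered phrase appears
    in the corpus text, or NO_MATCH if semantic analysis finds none."""
    text = corpus_text.strip().lower()
    for phrase, instruction in INTENT_TABLE.items():
        if phrase in text:
            return instruction
    return "NO_MATCH"

print(match_instruction("Please play music"))   # PLAY_MUSIC
print(match_instruction("tell me a joke"))      # NO_MATCH
```

A production system would replace the keyword lookup with a real natural language understanding model, but the contract is the same: text in, matched operation instruction out.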
In a specific example, the control device includes a first button, arranged on the panel, for sending a corpus text, and a second button for setting the corpus text that corresponds to the first button. The second button sets the first button's corpus text based on how long the user presses the second button. In this way, the corpus text assigned to the first button can be configured simply by pressing the second button for different durations. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, when the user's press of the second button lasts 100 ms, the corpus text of the first button is set to "play music"; when the press lasts 500 ms, it is set to "listen to a children's story". It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, the control device includes a first button, arranged on the panel, for sending a corpus text, and a second button that sets the first button's corpus text based on the number of times the user presses the second button in succession. In this way, the corpus text assigned to the first button can be configured simply by pressing the second button a different number of times. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, presses of the second button count as consecutive when the interval between two adjacent presses is less than a preset duration. When the user presses the second button three times in succession, the corpus text of the first button is set to "play music"; when the user presses it twice in succession, it is set to "listen to a children's story". It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
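The two configuration modes above (press duration and consecutive-press count) can be sketched together. The thresholds and corpus texts mirror the examples in the description; the function names and table layout are illustrative assumptions.

```python
# (minimum press duration in ms, corpus text), longest threshold first
# so that a long press is not mistaken for a short one.
DURATION_MAP = [
    (500, "listen to a children's story"),
    (100, "play music"),
]

# consecutive-press count -> corpus text
COUNT_MAP = {3: "play music", 2: "listen to a children's story"}

def corpus_for_duration(press_ms):
    """Pick the corpus text whose duration threshold was reached."""
    for threshold, text in DURATION_MAP:
        if press_ms >= threshold:
            return text
    return None  # press too short: no corpus text assigned

def corpus_for_count(press_count):
    """Pick the corpus text for a number of consecutive presses."""
    return COUNT_MAP.get(press_count)

print(corpus_for_duration(120))  # play music
print(corpus_for_count(2))       # listen to a children's story
```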
In one particular example, the control device 20 is a child's toy. The toy detects operations performed on it through its nine-axis sensor and sends a corresponding corpus text to the voice interaction device 10 based on the operation. The operation on the toy includes at least one of: waving the toy upward, waving it downward, waving it to the left, waving it to the right, and drawing a circle with it. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, when the nine-axis sensor detects that the toy was waved upward, the toy sends the corpus text "play a children's song" to the voice interaction device 10; when it detects a downward wave, the toy sends "play a children's story"; when it detects a wave to the left, the toy sends "play a children's poem"; when it detects a wave to the right, the toy sends "play children's piano music"; and when it detects a circle being drawn, the toy sends "play children's light music". It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
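The gesture-to-corpus-text mapping above reduces to a simple lookup once the gesture has been classified. Real nine-axis (accelerometer, gyroscope, magnetometer) gesture classification is far more involved; in this illustrative sketch the gesture label is assumed to come from such a classifier, and all names are assumptions.

```python
# Hypothetical mapping from a classified toy gesture to the corpus text
# the toy transmits; the entries mirror the examples in the description.
GESTURE_TO_CORPUS = {
    "wave_up": "play a children's song",
    "wave_down": "play a children's story",
    "wave_left": "play a children's poem",
    "wave_right": "play children's piano music",
    "draw_circle": "play children's light music",
}

def corpus_for_gesture(gesture):
    """Look up the corpus text to send for a classified toy gesture;
    unknown gestures send nothing."""
    return GESTURE_TO_CORPUS.get(gesture)

print(corpus_for_gesture("wave_left"))  # play a children's poem
```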
In a specific example, the control device 20 comprises a plurality of distributed buttons located in different areas. The distributed buttons detect operations performed on them for interacting with the voice interaction device 10 and send corresponding corpus texts to the voice interaction device 10 based on those operations. An operation on a distributed button may be a pressing operation. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, the distributed buttons may include one in the living room, one in the master bedroom, one in the secondary bedroom, and so on. When the living-room button detects that the user has pressed it, it sends the corpus text "turn on the light in the living room" to the voice interaction device 10 based on that press. When the master-bedroom button detects a press, it sends the corpus text "close the master bedroom curtains". When the secondary-bedroom button detects a press, it sends the corpus text "open the smart mirror". It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
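The distributed-button arrangement above can be sketched as follows: each button knows only its own configurable corpus text and forwards it unchanged when pressed. The transport (e.g. Zigbee or a wireless LAN) is abstracted into a send callback; the class and names here are illustrative assumptions, not the patent's implementation.

```python
class DistributedButton:
    """One per-room button: a press sends its configured corpus text."""

    def __init__(self, room, corpus_text, send):
        self.room = room
        self.corpus_text = corpus_text  # configurable via a client app
        self.send = send                # delivers text to the voice device

    def on_press(self):
        """On a press, forward this button's corpus text unchanged."""
        self.send(self.corpus_text)

received = []  # stands in for the voice interaction device's inbox
living_room = DistributedButton(
    "living room", "turn on the light in the living room", received.append
)
living_room.on_press()
print(received[0])  # turn on the light in the living room
```

The design keeps the buttons entirely ignorant of intent matching: all interpretation happens on the voice interaction device (or its cloud), so new buttons need no firmware changes.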
In some optional embodiments, the system further comprises a terminal device communicatively connected to the distributed buttons; through a client installed on the terminal device, a corresponding corpus text can be set for each distributed button individually. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, for the button located in the living room, the client installed on the terminal device sets the corpus text "turn on the light in the living room"; for the button in the master bedroom, it sets "close the master bedroom curtains"; and for the button in the secondary bedroom, it sets "open the smart mirror". It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In some optional embodiments, the control device 20 includes a handwriting pad. The handwriting pad detects a corpus text input operation performed on it to generate an input corpus text, and then detects a sending operation for that text, upon which it sends the text to the voice interaction device. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
In a specific example, when the handwriting pad detects a corpus text input operation, it generates the input corpus text based on that operation. After generating the text, the pad watches for a sending operation from the user; if one is detected, the pad sends the input corpus text to the voice interaction device. It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
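The two-step handwriting-pad flow above (input first, then an explicit send) can be sketched as a small state machine. Handwriting recognition itself is out of scope, so `recognize` is a stub, and every name here is an assumption for illustration.

```python
class HandwritingPad:
    """Input operation produces a pending corpus text; a later send
    operation transmits it to the voice interaction device."""

    def __init__(self, send):
        self.send = send          # delivers text to the voice device
        self.pending_text = None  # no text until an input operation

    def on_input(self, strokes):
        """Recognize the written strokes into a corpus text."""
        self.pending_text = self.recognize(strokes)

    def on_send(self):
        """Only transmit if an input operation produced text first."""
        if self.pending_text is not None:
            self.send(self.pending_text)
            self.pending_text = None

    @staticmethod
    def recognize(strokes):
        # Stub: assume strokes arrive pre-recognized as text.
        return strokes

out = []
pad = HandwritingPad(out.append)
pad.on_send()               # nothing sent: no input operation yet
pad.on_input("play music")
pad.on_send()
print(out)  # ['play music']
```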
In a specific example, as shown in fig. 1B, the voice interaction device 10 is a smart speaker and the control device 20 is a button panel connected to the speaker over a wireless local area network or Bluetooth. The user customizes the corpus texts of three buttons on the panel: button 1 may be set to "children's story", button 2 to "tell the time", and button 3 to "turn off the light". The user then needs no voice interaction with the speaker at all; pressing the corresponding button is enough to make the speaker execute the corresponding operation instruction. Specifically, when the user presses button 1, the panel sends the corpus text "children's story" corresponding to button 1 to the speaker; the speaker transparently forwards the received text to the cloud it is communicatively connected to; and the cloud performs semantic analysis on the text to obtain a semantic analysis result, matches an operation instruction to that result, and executes the matched instruction (playing a children's story). When the user presses button 2, the panel sends the corpus text "tell the time"; the speaker forwards it to the cloud, which analyzes it, matches the corresponding instruction, and executes it (announcing the time).
When the user presses button 3, the panel sends the corpus text "turn off the light"; the speaker forwards it to the cloud, which analyzes it, matches the corresponding instruction, and executes it (turning off the light). It should be understood that the above description is only exemplary, and the embodiments of the utility model are not limited thereto.
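The end-to-end flow of fig. 1B, press to execution, can be sketched in three stages: the panel looks up the button's configured corpus text, the speaker passes it through unmodified, and the cloud performs semantic matching. All function, table, and intent names below are illustrative assumptions.

```python
# Button id -> configured corpus text (as in the fig. 1B example).
BUTTON_CORPUS = {1: "children's story", 2: "tell the time", 3: "turn off the light"}

# The cloud's registered intents, reduced to a lookup for this sketch.
CLOUD_INTENTS = {
    "children's story": "PLAY_CHILDRENS_STORY",
    "tell the time": "ANNOUNCE_TIME",
    "turn off the light": "LIGHT_OFF",
}

def cloud_handle(corpus_text):
    """Semantic analysis and instruction matching, stubbed as a lookup."""
    return CLOUD_INTENTS.get(corpus_text, "NO_MATCH")

def speaker_forward(corpus_text):
    """The speaker transparently forwards the text to the cloud."""
    return cloud_handle(corpus_text)

def panel_press(button_id):
    """A press sends the configured corpus text toward the speaker."""
    return speaker_forward(BUTTON_CORPUS[button_id])

print(panel_press(2))  # ANNOUNCE_TIME
```

Because the speaker only forwards text, the same pipeline serves voice input (after speech recognition) and button input without any device-side adaptation.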
The interaction scheme provided by this embodiment extends the ways a voice interaction device can be operated: the control device sends the user's intention to the voice interaction device as text, and the voice interaction device applies natural language processing to that text and executes the corresponding instruction. The approach is general: different control devices can customize their own corpus texts, and the voice interaction device needs no additional adaptation.
In summary, in this embodiment a control device communicatively connected to a voice interaction device detects an operation performed on it for interacting with the voice interaction device and, based on that operation, sends a corresponding corpus text; the voice interaction device receives the text and executes the corresponding operation instruction. Compared with existing approaches, this lets a user interact conveniently with the voice interaction device even when voice interaction with it is unsuitable.
Referring to fig. 2A, a schematic structural diagram of an interaction system according to a second embodiment of the present invention is shown. Specifically, the interactive system provided in this embodiment includes: the intelligent floor sweeping machine comprises a floor sweeping machine 30 with a voice interaction function and a button panel 40 in communication connection with the floor sweeping machine 30, wherein the button panel 40 is used for detecting operation on the button panel 40 used for interacting with the floor sweeping machine 30 and sending a corpus text for sweeping in a kitchen to the floor sweeping machine 30 based on the operation; the sweeper 30 is configured to receive the corpus text sent by the button panel 40, and sweep the floor in the kitchen based on the corpus text. Therefore, the voice interaction with the sweeper can be conveniently carried out under the condition that the voice interaction with the sweeper is not suitable. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In the embodiments of the present invention, the voice interaction function can be understood as a function that allows a user to interact with a device through voice. The button panel 40 interacts with the sweeper 30 through at least one of the following communication modes: near field communication, Bluetooth, ZigBee, or wireless local area network. The operation on the button panel 40 may be a pressing operation on a button of the button panel 40. The corpus text can be understood as a text that characterizes the user's intention, for example, "sweeping in the kitchen". It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, when the sweeper 30 sweeps the kitchen based on the corpus text, the sweeper 30 performs semantic analysis on the corpus text to obtain a semantic analysis result, matches an operation instruction to that result, and then executes the matched operation instruction. To a machine, a corpus text is initially just a string of characters; the meaning the text expresses must first be determined through semantic recognition before the content of the corpus text can be acted upon. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
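The analyze-then-match flow above can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the actual cloud service: the "semantic analysis" is reduced to keyword extraction, and the instruction table, function names, and return strings are all illustrative assumptions.

```python
# Toy sketch of the cloud-side flow: the corpus text is parsed into an
# intent (the semantic analysis result), the intent is matched against a
# table of operation instructions, and the matched instruction is executed.
# All names and the keyword-based "analysis" are illustrative assumptions.

# intent -> operation instruction (here modelled as a callable)
INSTRUCTION_TABLE = {
    ("sweep", "kitchen"): lambda: "sweeper starts sweeping in the kitchen",
    ("sweep", "living room"): lambda: "sweeper starts sweeping in the living room",
    ("sweep", "bedroom"): lambda: "sweeper starts sweeping in the bedroom",
}

def analyze(corpus_text: str):
    """Toy semantic analysis: extract an (action, location) pair."""
    action = "sweep" if "sweep" in corpus_text else None
    for location in ("living room", "kitchen", "bedroom"):
        if location in corpus_text:
            return (action, location)
    return (action, None)

def execute(corpus_text: str) -> str:
    """Match the semantic analysis result to an instruction and run it."""
    instruction = INSTRUCTION_TABLE.get(analyze(corpus_text))
    if instruction is None:
        return "no matching operation instruction"
    return instruction()
```

A corpus text such as "sweeping in the kitchen" resolves to the intent ("sweep", "kitchen") and triggers the matching instruction; an unrecognized text falls through to the no-match branch.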
In a specific example, as shown in fig. 2B, the button panel is in communication connection with the sweeper via a wireless LAN or Bluetooth, and the user customizes the corpus text assigned to each of the three buttons on the panel. For example, button 1 may be set to "sweeping in the kitchen", button 2 to "sweeping in the living room", and button 3 to "sweeping in the bedroom". The user then does not need to speak to the sweeper at all: pressing the corresponding button is enough to make the sweeper execute the corresponding operation instruction. Specifically, when the user presses button 1, the button panel sends the corpus text "sweeping in the kitchen" to the sweeper; the sweeper receives it through its short-range wireless communication device and forwards it to the cloud in communication connection with the sweeper; the cloud performs semantic analysis on the corpus text to obtain a semantic analysis result and matches an operation instruction to that result, and the matched instruction (sweeping in the kitchen) is then executed.
Pressing button 2 or button 3 follows the same flow with the corpus texts "sweeping in the living room" and "sweeping in the bedroom", resulting in the sweeper sweeping the living room or the bedroom, respectively. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
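On the button panel side, the example above amounts to a fixed button-to-corpus-text mapping and a plain-text send. The sketch below assumes that mapping; the transport is abstracted into a callback, whereas in the embodiment it would be Bluetooth or a wireless LAN. All names are hypothetical.

```python
# Minimal sketch of the button panel: each button is mapped to a
# user-configured corpus text, and pressing a button sends that text
# (and nothing else) to the sweeper via a supplied send callback.
BUTTON_CORPUS = {
    1: "sweeping in the kitchen",
    2: "sweeping in the living room",
    3: "sweeping in the bedroom",
}

def on_button_pressed(button_id, send_to_sweeper):
    """Look up the configured corpus text and transmit it, if any."""
    corpus_text = BUTTON_CORPUS.get(button_id)
    if corpus_text is not None:
        send_to_sweeper(corpus_text)

# Example usage: collect the "transmitted" texts in a list.
sent = []
on_button_pressed(1, sent.append)  # sent == ["sweeping in the kitchen"]
```

An unmapped button id simply sends nothing, mirroring a panel button with no configured corpus text.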
Referring to fig. 3, a schematic structural diagram of an interaction system according to a third embodiment of the present invention is shown. Specifically, the interaction system provided in this embodiment includes: a voice interaction device 10 and a control device 20 in communication connection with the voice interaction device 10. The control device 20 is configured to detect an operation on the control device for interacting with the voice interaction device, determine the operation time point corresponding to the operation, and send a corresponding corpus text to the voice interaction device based on the customized time period in which the operation time point falls; the voice interaction device 10 is configured to receive the corpus text sent by the control device and execute a corresponding operation instruction based on the corpus text.
In a specific example, before the control device 20 is put into use, the corpus texts corresponding to the customized time periods of the control device 20 are configured in advance. For example, when the control device 20 is a button panel in communication connection with a sweeper, the corpus text for the customized time period from 9:00 to 11:00 may be set to "sweeping in the kitchen", the corpus text for 14:00 to 17:00 may be set to "sweeping in the living room", and the corpus text for 19:00 to 20:00 may be set to "sweeping in the bedroom". After this configuration, the control device 20 detects an operation on the control device for interacting with the voice interaction device and determines the operation time point of that operation. If the operation time point falls within the 9:00-to-11:00 period, the control device 20 sends the corpus text "sweeping in the kitchen" to the voice interaction device 10; if it falls within the 14:00-to-17:00 period, the control device 20 sends "sweeping in the living room". It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
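The time-based lookup can be sketched as a search over configured periods. This is a hypothetical illustration using the periods from the example above; the data structure and helper names are assumptions, not the actual device firmware.

```python
# Sketch of the third embodiment's time-based selection: the control device
# records the operation time point and returns the corpus text configured
# for the customized time period containing it (or None if no period matches).
from datetime import time

CUSTOMIZED_PERIODS = [
    (time(9, 0), time(11, 0), "sweeping in the kitchen"),
    (time(14, 0), time(17, 0), "sweeping in the living room"),
    (time(19, 0), time(20, 0), "sweeping in the bedroom"),
]

def corpus_text_for(operation_time: time):
    """Return the corpus text whose customized period covers the time point."""
    for start, end, corpus_text in CUSTOMIZED_PERIODS:
        if start <= operation_time <= end:
            return corpus_text
    return None  # no customized period covers this operation time point
```

A press at 10:30 thus selects "sweeping in the kitchen", while a press at 12:00, which lies in no configured period, selects nothing.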
According to the interaction scheme provided by this embodiment of the utility model, the control device in communication connection with the voice interaction device detects an operation on the control device for interacting with the voice interaction device, determines the operation time point corresponding to the operation, and sends the corresponding corpus text to the voice interaction device based on the customized time period in which the operation time point falls; the voice interaction device executes the corresponding operation instruction based on the received corpus text. Compared with other existing modes, this makes it convenient to interact with the voice interaction device when voice interaction with it is unsuitable.
Referring to fig. 4A, a schematic structural diagram of a voice interaction device according to a fourth embodiment of the present invention is shown. Specifically, the voice interaction device 10 provided in this embodiment includes: a microphone 11, a speaker 12, a circuit board 13, and a controller 14 and a short-range wireless communication device 15 arranged on the circuit board 13, wherein the microphone 11, the speaker 12, and the short-range wireless communication device 15 are connected to the controller 14 through the circuit board 13. The microphone 11 collects the user's voice data and sends it to the controller 14, and the controller 14 forwards the voice data to the cloud 30, so that the cloud 30 executes a corresponding voice instruction according to the voice data and controls the speaker 12 to play the execution result. The short-range wireless communication device 15 is configured to receive the corpus text sent, based on an operation on the control device 20, by the control device 20 interacting with the voice interaction device 10, and to pass it to the controller 14; the controller 14 forwards the corpus text to the cloud 30, so that the cloud 30 executes a corresponding operation instruction according to the corpus text and controls the speaker 12 to play the execution result. In this way, receiving a corpus text through the short-range wireless communication device and executing the corresponding operation instruction supplements the device's voice interaction mode.
It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, the short-range wireless communication device includes at least one of the following: a Bluetooth device, a ZigBee device, a near field communication device, or a wireless local area network device. The controller may be a voice chip or a processor with voice signal and data processing capability and a control function; it may specifically be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, as shown in fig. 4B, the voice interaction device 10 is a robot and the control device 20 is a button panel. The button panel is in communication connection with the robot through a wireless LAN or Bluetooth, and the user customizes the corpus text assigned to each of the three buttons on the panel. For example, button 1 may be set to "play music", button 2 to "play a poem", and button 3 to "play a novel". The user does not need to speak to the voice interaction device: pressing the corresponding button is enough to make the robot execute the corresponding operation instruction. Specifically, when the user presses button 1, the button panel sends the corpus text "play music" to the robot; the robot receives it through its short-range wireless communication device and forwards it to the cloud in communication connection with the robot; the cloud performs semantic analysis on the corpus text to obtain a semantic analysis result and matches an operation instruction to that result, and the matched instruction (playing music) is then executed.
Pressing button 2 or button 3 follows the same flow with the corpus texts "play a poem" and "play a novel", causing the robot to play a poem or a novel, respectively. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
Referring to fig. 5, a schematic structural diagram of a voice interaction device according to a fifth embodiment of the present invention is shown. Specifically, the voice interaction device 10 provided in this embodiment includes: a microphone 11, a speaker 12, a circuit board 13, and a controller 14 and a button 16 arranged on the circuit board 13, wherein the microphone 11, the speaker 12, and the button 16 are connected to the controller 14 through the circuit board 13. The microphone 11 collects the user's voice data and sends it to the controller 14, and the controller 14 forwards the voice data to the cloud 30, so that the cloud 30 executes a corresponding voice instruction according to the voice data and controls the speaker 12 to play the execution result. The button 16 is configured to detect an operation on itself and, based on the operation, to send a corresponding corpus text to the controller 14; the controller 14 forwards the corpus text to the cloud 30, so that the cloud 30 executes a corresponding operation instruction according to the corpus text and controls the speaker 12 to play the execution result. In this way, a button press on the device itself triggers the same corpus-text flow, supplementing the device's voice interaction mode. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, the controller may be a voice chip or a processor with voice signal and data processing capability and a control function; it may specifically be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
Referring to fig. 6A, a schematic structural diagram of a control device according to a sixth embodiment of the present invention is shown. Specifically, the control device provided in this embodiment includes: a key 21 and a controller 22 in communication connection with the key 21. The controller 22 is configured to send a corresponding corpus text to the voice interaction device when it detects that the key 21 is pressed, so that the voice interaction device executes a corresponding operation instruction based on the corpus text. In this way, the user can interact with the voice interaction device conveniently. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, the controller may be a chip or a processor with data processing capability and a control function; it may specifically be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, as shown in fig. 6B, the voice interaction device is a sound box and the control device is a button panel. The button panel is in communication connection with the sound box through a wireless LAN or Bluetooth, and the user customizes the corpus text assigned to each of the three buttons on the panel. For example, button 1 may be set to "open the curtain", button 2 to "turn on the desk lamp", and button 3 to "turn on the water heater". The user does not need to speak to the voice interaction device: pressing the corresponding button is enough to make the sound box execute the corresponding operation instruction. Specifically, when the user presses button 1, the button panel sends the corpus text "open the curtain" to the sound box; the sound box forwards it to the cloud in communication connection with the sound box; the cloud performs semantic analysis on the corpus text to obtain a semantic analysis result and matches an operation instruction to that result, and the matched instruction (opening the curtain) is then executed. Pressing button 2 or button 3 follows the same flow with the corpus texts "turn on the desk lamp" and "turn on the water heater", resulting in the desk lamp or the water heater being turned on, respectively. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
Referring to fig. 7A, a schematic structural diagram of a control device according to a seventh embodiment of the present invention is shown. Specifically, the control device provided in this embodiment includes: a sensor 23 for detecting an operation on the control device; and a controller 22 in communication connection with the sensor 23, configured to send, when the sensor 23 detects an operation on the control device, a corresponding corpus text to the voice interaction device based on the operation, so that the voice interaction device executes a corresponding operation instruction based on the corpus text. In this way, the user can interact with the voice interaction device conveniently. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, the sensor 23 may be a nine-axis sensor. The controller may be a chip or a processor with data processing capability and a control function; it may specifically be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, as shown in fig. 7B, the voice interaction device is a sound box and the control device is a child's toy. The toy is in communication connection with the sound box through a wireless LAN or Bluetooth, and the corpus text corresponding to each operation on the toy can be configured in advance. For example, the corpus text for waving the toy upward may be set to "play children's story", and the corpus text for waving the toy downward may be set to "play children's poem". A child does not need to speak to the device: performing the corresponding gesture with the toy is enough to make the sound box execute the corresponding operation instruction. Specifically, when the toy is waved upward, the toy sends the corpus text "play children's story" to the sound box; the sound box forwards it to the cloud in communication connection with the sound box; the cloud performs semantic analysis on the corpus text to obtain a semantic analysis result and matches an operation instruction to that result, and the matched instruction (playing a children's story) is then executed. When the toy is waved downward, the same flow runs with the corpus text "play children's poem", resulting in a children's poem being played. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
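The toy-side logic above reduces to a gesture-to-corpus-text mapping, sketched below under stated assumptions: classifying the nine-axis sensor's raw readings into gesture labels is out of scope here, and all names and the callback transport are hypothetical.

```python
# Sketch of the seventh embodiment's toy: a recognized gesture label is
# mapped to a pre-configured corpus text, which the toy sends to the sound
# box through a supplied callback. Gesture labels are assumed to come from
# a separate classifier over the nine-axis sensor data.
GESTURE_CORPUS = {
    "wave_up": "play children's story",
    "wave_down": "play children's poem",
}

def on_gesture(gesture, send_to_sound_box):
    """Send the corpus text configured for this gesture, if any."""
    corpus_text = GESTURE_CORPUS.get(gesture)
    if corpus_text is not None:
        send_to_sound_box(corpus_text)
        return corpus_text
    return None  # unrecognized or unconfigured gesture: send nothing
```

The same table could be extended with the leftward wave, rightward wave, and circle-drawing operations mentioned in the later method embodiment.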
Referring to fig. 8A, a schematic structural diagram of a control device according to an eighth embodiment of the present invention is shown. Specifically, the control device provided in this embodiment includes: an input device, configured to determine the corpus text input by the user based on the user's corpus text input operation and to send it to a controller in communication connection with the input device; and the controller, configured to send the corpus text input by the user to the voice interaction device, so that the voice interaction device executes a corresponding operation instruction based on that corpus text. In this way, the user can interact with the voice interaction device conveniently. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, the input device may be a touch display screen. The controller may be a chip or a processor with data processing capability and a control function; it may specifically be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, as shown in fig. 8B, the voice interaction device is a sound box and the control device is a handwriting board. The handwriting board is in communication connection with the sound box through a wireless LAN or Bluetooth. The user does not need to speak to the voice interaction device: performing the corresponding corpus text input operation on the handwriting board is enough to make the sound box execute the corresponding operation instruction. Specifically, when the user performs a corpus text input operation on the handwriting board, the handwriting board generates the input corpus text from the operation and sends it to the sound box; the sound box forwards the received corpus text to the cloud in communication connection with the sound box; the cloud performs semantic analysis on the corpus text to obtain a semantic analysis result and matches an operation instruction to that result, and the matched instruction is then executed. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
Referring to fig. 9, a flowchart illustrating steps of an interaction method according to a ninth embodiment of the present invention is shown.
Specifically, the interaction method provided by this embodiment includes the following steps (the execution subject is the control device):
In step S901, an operation on the control device for interacting with the voice interaction device is detected.
In the embodiments of the present invention, the voice interaction device includes at least one of the following: a sound box with a voice interaction function, a television with a voice interaction function, or a mobile phone terminal with a voice interaction function. The voice interaction function can be understood as a function that allows a user to interact with the device through voice. The control device can be understood as a device that provides the user with a control entry to the voice interaction device; it may be a button panel, a child's toy, a distributed button, a handwriting board, or the like. The control device interacts with the voice interaction device through at least one of the following communication modes: near field communication, Bluetooth, ZigBee, or wireless local area network. The operation is used to instruct the control device to send a corresponding corpus text to the voice interaction device; it may be a pressing operation on a key of the control device, a sending operation on a corpus text input by the user at the control device, or the like. The corpus text can be understood as a text that characterizes the user's intention, such as "play music" or "listen to a children's story". It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In some optional embodiments, before detecting the operation on the control device for interacting with the voice interaction device, the method further includes: configuring a corresponding corpus text for each key of the control device. Detecting the operation on the control device then includes: detecting an operation on a key of the control device. In this way, a corresponding corpus text can be configured for each key of the control device. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In a specific example, if the control device has two keys, one key may be configured with the corpus text "play children's story" and the other with the corpus text "play children's music". It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In some optional embodiments, the control device includes a child's toy, and detecting the operation on the control device for interacting with the voice interaction device includes: detecting an operation on the toy through the toy's nine-axis sensor. The operation on the toy includes at least one of the following: waving the toy upward, waving it downward, waving it leftward, waving it rightward, or drawing a circle with it. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In some optional embodiments, the control device includes a plurality of distributed buttons located in different regions, and before detecting the operation on the control device for interacting with the voice interaction device, the method further includes: setting, through a client installed on a terminal device, a corresponding corpus text for each of the distributed buttons, wherein the distributed buttons are each in communication connection with the client. Detecting the operation on the control device then includes: detecting an operation on a distributed button. In this way, the corresponding corpus texts can be set for the distributed buttons through the client installed on the terminal device. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In some optional embodiments, the control device includes a handwriting board. Detecting the operation on the control device for interacting with the voice interaction device then includes: detecting the corpus text input operation on the handwriting board to generate the input corpus text, and, after the input corpus text is generated, detecting the sending operation on it so that the input corpus text is sent to the voice interaction device. It should be understood that the above description is only exemplary, and the embodiments of the present invention are not limited thereto.
In step S902, based on the operation, a corresponding corpus text is sent to the voice interaction device, so that the voice interaction device executes a corresponding operation instruction based on the corpus text.
In practical applications, a user can conveniently interact with the voice interaction device through the control device: the control device does not need to implement a complex data protocol and only needs to send the corresponding corpus text. Because the data exchanged between the voice interaction device and the control device is reduced to a corpus text, the interaction protocol between the two devices is simpler and more universal.
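The point above is that the entire "protocol" reduces to transporting a plain text string, which the voice interaction device then handles exactly like recognized speech. A sketch of such a minimal wire format follows; the length-prefix framing is an illustrative assumption, not specified by the patent.

```python
# Hypothetical minimal framing for a corpus text: a 2-byte big-endian length
# prefix followed by UTF-8 bytes. Any transport (BLE, zigbee, WLAN) can carry it.
import struct

def encode_corpus_text(text):
    payload = text.encode("utf-8")
    return struct.pack(">H", len(payload)) + payload

def decode_corpus_text(frame):
    (length,) = struct.unpack(">H", frame[:2])
    return frame[2:2 + length].decode("utf-8")
```

Since the receiver needs no command table or versioned schema, any new control device can interoperate by emitting text alone.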
According to the interaction method provided by the embodiment of the present invention, an operation on a control device used for interacting with the voice interaction device is detected, where the operation instructs the control device to send a corresponding corpus text to the voice interaction device; based on the operation, the corresponding corpus text is sent to the voice interaction device, so that the voice interaction device executes a corresponding operation instruction based on the corpus text. Compared with other existing approaches, sending a corpus text in response to an operation on the control device allows a user to interact with the voice interaction device conveniently even in situations where voice interaction is unsuitable.
The interaction method provided by the present embodiment may be performed by any suitable device having data processing capabilities, including but not limited to: a camera, a terminal, a mobile terminal, a PC, a server, an in-vehicle device, an entertainment device, an advertising device, a Personal Digital Assistant (PDA), a tablet, a laptop, a handheld game machine, glasses, a watch, a wearable device, a virtual display device, a display enhancement device, or the like.
Referring to fig. 10, a schematic structural diagram of an interaction device in the tenth embodiment of the present invention is shown.
The interaction device provided by the embodiment comprises: a detecting module 1003, configured to detect an operation on a control device configured to interact with the voice interaction device, where the operation is used to instruct the control device to send a corresponding corpus text to the voice interaction device; a sending module 1004, configured to send, based on the operation, a corresponding corpus text to the voice interaction device, so that the voice interaction device executes a corresponding operation instruction based on the corpus text.
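The two modules of Fig. 10 can be sketched as follows. This is an illustrative structure only: the class names, the operation-to-corpus lookup, and the transport stub are assumptions introduced for the example.

```python
# Hypothetical sketch of the detection module (1003) and sending module (1004).

class DetectionModule:
    def __init__(self, corpus_by_operation):
        # Maps a detected operation to the corpus text it should trigger.
        self._corpus_by_operation = corpus_by_operation

    def detect(self, operation):
        return self._corpus_by_operation.get(operation)

class SendingModule:
    def __init__(self, transport):
        self._transport = transport  # e.g. a Bluetooth or WLAN send function

    def send(self, corpus_text):
        if corpus_text is not None:
            self._transport(corpus_text)

sent = []
detector = DetectionModule({"key_press": "what's the weather today"})
sender = SendingModule(sent.append)
sender.send(detector.detect("key_press"))
```

The voice interaction device on the receiving end simply treats the delivered text as if it had been spoken.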
Optionally, the apparatus further includes: a configuration module 1001, configured to configure, before the detection module 1003 performs detection, a corresponding corpus text for a key of the control device; the detection module 1003 is specifically configured to detect an operation on the key of the control device.
Optionally, the control device includes a toy for children, and the detection module 1003 is specifically configured to: detecting, by a nine-axis sensor of the child toy, an operation of the child toy.
Optionally, the operation of the child toy comprises at least one of: an upward waving operation of the child toy, a downward waving operation of the child toy, a leftward waving operation of the child toy, a rightward waving operation of the child toy, a circle drawing operation of the child toy.
Optionally, the control device includes a plurality of distributed buttons located in different areas, and the apparatus further includes: a setting module 1002, configured to set, through a client installed on a terminal device, a corresponding corpus text for each of the plurality of distributed buttons before the detection module 1003 performs detection, where the plurality of distributed buttons are each in communication connection with the client; the detection module 1003 is specifically configured to detect an operation on a distributed button.
Optionally, the control device includes a writing pad, and the detection module 1003 is specifically configured to: detect a corpus text input operation on the writing pad to generate an input corpus text; and, after the input corpus text is generated, detect a send operation for the input corpus text, so that the input corpus text is sent to the voice interaction device.
Optionally, the control device interacts with the voice interaction device through at least one of the following communication modes: near field communication mode, bluetooth, zigbee, wireless local area network.
Optionally, the voice interaction device comprises at least one of: the system comprises a sound box with a voice interaction function, a television with the voice interaction function and a mobile phone terminal with the voice interaction function.
The interaction device of this embodiment is used to implement the corresponding interaction method in the foregoing method embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein again.
Fig. 11 is a schematic structural diagram of an electronic device in an eleventh embodiment of the present invention; the electronic device may include:
one or more processors 1101;
a computer-readable medium 1102, which may be configured to store one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the interaction method according to the above-mentioned embodiment nine.
Fig. 12 is a hardware structure of an electronic device in a twelfth embodiment of the present invention; as shown in fig. 12, the hardware structure of the electronic device may include: a processor 1201, a communication interface 1202, a computer readable medium 1203, and a communication bus 1204;
wherein the processor 1201, the communication interface 1202, and the computer readable medium 1203 are in communication with each other via a communication bus 1204;
alternatively, the communication interface 1202 may be an interface of a communication module, such as an interface of a GSM module;
the processor 1201 may be specifically configured to: detecting an operation on a control device used for interacting with the voice interaction device, wherein the operation is used for instructing the control device to send a corresponding corpus text to the voice interaction device; and sending a corresponding corpus text to the voice interaction equipment based on the operation, so that the voice interaction equipment executes a corresponding operation instruction based on the corpus text.
The Processor 1201 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The computer-readable medium 1203 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code configured to perform the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. The above-mentioned functions defined in the method of the invention are performed when the computer program is executed by a Central Processing Unit (CPU). It should be noted that the computer readable medium of the present invention can be a computer readable signal medium, a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In contrast, in the present invention, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code configured to carry out operations of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions configured to implement the specified logical function(s). In the above embodiments, specific precedence relationships are provided, but these precedence relationships are only exemplary, and in particular implementations, the steps may be fewer, more, or the execution order may be modified. That is, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a detection module and a transmission module. The names of these modules do not in some cases constitute a limitation on the module itself, for example, the detection module may also be described as a "module that detects the operation of a manipulation device for interacting with the voice interaction device".
As another aspect, the present invention further provides a computer-readable medium, on which a computer program is stored, which when executed by a processor implements the interaction method as described in the above embodiment nine.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: detecting an operation on a control device used for interacting with the voice interaction device, wherein the operation is used for instructing the control device to send a corresponding corpus text to the voice interaction device; and sending a corresponding corpus text to the voice interaction equipment based on the operation, so that the voice interaction equipment executes a corresponding operation instruction based on the corpus text.
The expressions "first", "second", "the first", or "the second" used in various embodiments of the present disclosure may modify various components regardless of order and/or importance, and these expressions do not limit the corresponding components; they are used only to distinguish one element from another. For example, a first user device and a second user device are different user devices, although both are user devices. Likewise, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure.
When an element (e.g., a first element) is referred to as being (operably or communicatively) "coupled" or "connected" to another element (e.g., a second element), it is to be understood that the element is either directly connected to the other element or indirectly connected to it via yet another element (e.g., a third element). In contrast, when an element (e.g., a first element) is referred to as being "directly connected" or "directly coupled" to another element (e.g., a second element), no intervening element (e.g., a third element) is present between them.
The above description covers only preferred embodiments of the invention and illustrates the technical principles applied. It will be understood by those skilled in the art that the scope of the present invention is not limited to the specific combination of the above-mentioned features, and also covers other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the spirit of the present invention, for example, a technical solution formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present invention.

Claims (18)

1. An interactive system, characterized in that the system comprises:
the voice interaction device and the control device which is in communication connection with the voice interaction device,
the control device is used for detecting the operation of the control device used for interacting with the voice interaction device and sending corresponding corpus texts to the voice interaction device based on the operation;
the voice interaction device is used for receiving the corpus text sent by the control device and executing a corresponding operation instruction based on the corpus text.
2. The system of claim 1, wherein the manipulation device comprises a child toy,
the child toy is used for detecting an operation on the child toy through a nine-axis sensor of the child toy, and sending a corresponding corpus text to the voice interaction device based on the operation.
3. The system of claim 2, wherein the manipulation of the child toy comprises at least one of: an upward waving operation of the child toy, a downward waving operation of the child toy, a leftward waving operation of the child toy, a rightward waving operation of the child toy, a circle drawing operation of the child toy.
4. The system of claim 1, wherein the manipulation device comprises a plurality of distributed buttons located in different regions,
the distributed button is used for detecting the operation of the distributed button for interacting with the voice interaction equipment and sending corresponding corpus texts to the voice interaction equipment based on the operation.
5. The system of claim 4, further comprising:
and the terminal equipment is in communication connection with the distributed buttons and is used for respectively setting corresponding corpus texts for the distributed buttons through a client installed on the terminal equipment.
6. The system of claim 1, wherein the manipulation device comprises a writing pad,
the writing pad is used for detecting a corpus text input operation on the writing pad to generate an input corpus text, and, after the input corpus text is generated, detecting a send operation for the input corpus text, so as to send the input corpus text to the voice interaction device.
7. The system according to claim 1, wherein the manipulation device includes a first button disposed on a panel for transmitting the corpus text and a second button for setting the corpus text corresponding to the first button,
and the second button is used for setting the corpus text corresponding to the first button based on the pressing duration of the second button by the user.
8. The system according to claim 1, wherein the manipulation device includes a first button disposed on a panel for transmitting the corpus text and a second button for setting the corpus text corresponding to the first button,
and the second button is used for setting the corpus text corresponding to the first button based on the continuous pressing times of the user for the second button.
9. The system according to any one of claims 1-8, wherein the manipulation device interacts with the voice interaction device through at least one of the following communication means:
near field communication mode, bluetooth, zigbee, wireless local area network.
10. The system of any one of claims 1-8, wherein the voice interaction device comprises at least one of:
the system comprises a sound box with a voice interaction function, a television with the voice interaction function and a mobile phone terminal with the voice interaction function.
11. An interactive system, characterized in that the system comprises:
a sweeper with a voice interaction function and a button panel in communication connection with the sweeper,
the button panel is used for detecting the operation of the button panel for interacting with the sweeper and sending the corpus text for sweeping in the kitchen to the sweeper based on the operation;
the sweeper is used for receiving the corpus text sent by the button panel and sweeping the floor in the kitchen based on the corpus text.
12. An interactive system, characterized in that the system comprises:
the voice interaction device and the control device which is in communication connection with the voice interaction device,
the control device is used for detecting the operation of the control device for interacting with the voice interaction device, determining an operation time point corresponding to the operation, and sending a corresponding corpus text to the voice interaction device based on the customized time period of the operation time point;
the voice interaction device is used for receiving the corpus text sent by the control device and executing a corresponding operation instruction based on the corpus text.
13. A voice interaction device, the device comprising:
the microphone, the loudspeaker and the short-range wireless communication device are connected with the controller through the circuit board;
the microphone collects voice data of a user and sends the voice data to the controller, and the controller forwards the voice data to a cloud end, so that the cloud end executes a corresponding voice instruction according to the voice data and controls the loudspeaker to play an execution result of the voice instruction;
the short-range wireless communication device is used for receiving a corpus text, which is sent by a control device interacting with the voice interaction device based on an operation on the control device, and sending the corpus text to the controller; the controller forwards the corpus text to the cloud end, so that the cloud end executes a corresponding operation instruction according to the corpus text and controls the loudspeaker to play an execution result of the operation instruction.
14. The apparatus of claim 13, wherein the short-range wireless communication device comprises at least one of: a Bluetooth device, a zigbee device, a near field communication device, a wireless local area network device.
15. A voice interaction device, the device comprising:
the microphone, the loudspeaker and the button are connected with the controller through the circuit board;
the microphone collects voice data of a user and sends the voice data to the controller, and the controller forwards the voice data to a cloud end, so that the cloud end executes a corresponding voice instruction according to the voice data and controls the loudspeaker to play an execution result of the voice instruction;
the button is used for detecting the operation of the button and sending corresponding corpus texts to the controller based on the operation, and the controller forwards the corpus texts to the cloud end, so that the cloud end executes corresponding operation instructions according to the corpus texts and controls the loudspeaker to play execution results of the operation instructions.
16. A manipulation device, characterized in that the device comprises:
a key and a controller in communication connection with the key,
the controller is configured to send a corresponding corpus text to a voice interaction device when it is detected that the key is pressed, so that the voice interaction device executes a corresponding operation instruction based on the corpus text.
17. A manipulation device, characterized in that the device comprises:
a sensor for detecting an operation for the manipulation device;
and the controller is in communication connection with the sensor and is used for sending corresponding corpus texts to the voice interaction equipment based on the operation when the sensor detects the operation aiming at the control equipment, so that the voice interaction equipment executes corresponding operation instructions based on the corpus texts.
18. A manipulation device, characterized in that the device comprises:
the input device is used for determining the corpus text input by the user based on the corpus text input operation of the user and sending the corpus text input by the user to the controller in communication connection with the input device;
the controller is used for sending the corpus text input by the user to the voice interaction equipment, so that the voice interaction equipment executes a corresponding operation instruction based on the corpus text input by the user.
CN202020557184.2U 2020-04-15 2020-04-15 Interactive system, voice interaction equipment and control equipment Active CN212624795U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202020557184.2U CN212624795U (en) 2020-04-15 2020-04-15 Interactive system, voice interaction equipment and control equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202020557184.2U CN212624795U (en) 2020-04-15 2020-04-15 Interactive system, voice interaction equipment and control equipment

Publications (1)

Publication Number Publication Date
CN212624795U true CN212624795U (en) 2021-02-26

Family

ID=74702201

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202020557184.2U Active CN212624795U (en) 2020-04-15 2020-04-15 Interactive system, voice interaction equipment and control equipment

Country Status (1)

Country Link
CN (1) CN212624795U (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113539250A (en) * 2020-04-15 2021-10-22 阿里巴巴集团控股有限公司 Interaction method, device, system, voice interaction equipment, control equipment and medium


Similar Documents

Publication Publication Date Title
US11488591B1 (en) Altering audio to improve automatic speech recognition
US10887710B1 (en) Characterizing environment using ultrasound pilot tones
US11922925B1 (en) Managing dialogs on a speech recognition platform
CN108829235B (en) Voice data processing method and electronic device supporting the same
US10121465B1 (en) Providing content on multiple devices
US10930277B2 (en) Configuration of voice controlled assistant
EP3077921B1 (en) Natural language control of secondary device
US9087520B1 (en) Altering audio based on non-speech commands
KR102305992B1 (en) Voice play method and device
CN113168227A (en) Method of performing function of electronic device and electronic device using the same
KR20210016815A (en) Electronic device for managing a plurality of intelligent agents and method of operating thereof
JP2018194832A (en) User command processing method and system for adjusting output volume of sound to be output, based on input volume of received voice input
JP6619488B2 (en) Continuous conversation function in artificial intelligence equipment
WO2020135773A1 (en) Data processing method, device, and computer-readable storage medium
KR102629796B1 (en) An electronic device supporting improved speech recognition
CN212624795U (en) Interactive system, voice interaction equipment and control equipment
US10002611B1 (en) Asynchronous audio messaging
US10062386B1 (en) Signaling voice-controlled devices
JP6985113B2 (en) How to provide an interpreter function for electronic devices
KR20210001082A (en) Electornic device for processing user utterance and method for operating thereof
CN110308886A (en) The system and method for voice command service associated with personalized task are provided
KR102380717B1 (en) Electronic apparatus for processing user utterance and controlling method thereof
KR20200057501A (en) ELECTRONIC APPARATUS AND WiFi CONNECTING METHOD THEREOF
US20200202861A1 (en) Electronic device controlling system, voice output device, and methods therefor
CN111161734A (en) Voice interaction method and device based on designated scene

Legal Events

Date Code Title Description
GR01 Patent grant