CN112233670A - Voice interaction method and system based on Alexa cloud service

Publication number
CN112233670A
Authority
CN
China
Prior art keywords
voice
alexa
execution
instruction
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010885996.4A
Other languages
Chinese (zh)
Inventor
何志宏
高裘生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou Zhixiang Information Technology Co ltd
Original Assignee
Fuzhou Zhixiang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou Zhixiang Information Technology Co ltd
Priority to CN202010885996.4A
Publication of CN112233670A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/142 Managing session states for stateless protocols; Signalling session states; State transitions; Keeping-state mechanisms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/143 Termination or inactivation of sessions, e.g. event-controlled end of session
    • H04L67/145 Termination or inactivation of sessions, e.g. event-controlled end of session avoiding end of session, e.g. keep-alive, heartbeats, resumption message or wake-up for inactive or interrupted session
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • H04L69/162 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields involving adaptations of sockets based mechanisms
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention provides a voice interaction method and system based on the Alexa cloud service, in the technical field of smart speakers. The method comprises the following steps: step S10, setting a wake word for the speaker, the light state to display for each execution instruction, the interface with which to respond to each execution instruction, and an activation duration; step S20, the speaker receives sound within its pickup range in real time and, after verifying the received sound against the wake word, activates the Alexa voice assistant; step S30, within the activation duration, the speaker continuously receives voice instructions issued by the user, converts each voice instruction into an execution instruction, and inputs the execution instructions into the Alexa voice assistant in sequence; and step S40, the Alexa voice assistant executes each received execution instruction, controls the display screen to respond with the corresponding interface, controls the light to display the corresponding state, maintains a persistent connection for the Alexa voice assistant over the WebSocket protocol, and monitors the execution of the execution instruction. The advantages of the invention are that the smart speaker keeps a persistent connection and responds with an interface, which greatly improves the user experience.

Description

Voice interaction method and system based on Alexa cloud service
Technical Field
The invention relates to the technical field of smart speakers, and in particular to a voice interaction method and system based on the Alexa cloud service.
Background
With the continuous progress of science and technology, smart speakers have gradually come into public view. A smart speaker can not only play music but also interact with the user by voice. However, if a task is interrupted during execution, for example because the network disconnects, a conventional smart speaker cannot resume the task, and it cannot respond with a corresponding interface during the voice interaction, which results in a poor user experience.
Therefore, how to provide a voice interaction method and system based on the Alexa cloud service that keeps the smart speaker on a persistent connection and responds with an interface, thereby improving the user experience, has become a problem in urgent need of a solution.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a voice interaction method and system based on the Alexa cloud service that keeps the smart speaker on a persistent connection and responds with an interface, thereby improving the user experience.
In one aspect, the invention provides a voice interaction method based on the Alexa cloud service, comprising the following steps:
step S10, setting a wake word for the speaker, the light state to display for each execution instruction, the interface with which to respond to each execution instruction, and an activation duration;
step S20, the speaker receives sound within its pickup range in real time and, after verifying the received sound against the wake word, activates the Alexa voice assistant;
step S30, within the activation duration, the speaker continuously receives voice instructions issued by the user, converts each voice instruction into an execution instruction, and inputs the execution instructions into the Alexa voice assistant in sequence;
step S40, the Alexa voice assistant executes each received execution instruction, controls the display screen to respond with the corresponding interface, controls the light to display the corresponding state, maintains a persistent connection for the Alexa voice assistant over the WebSocket protocol, and monitors the execution of the execution instruction.
Further, step S20 is specifically:
the speaker receives sound within its pickup range in real time using a sound pickup, converts the received sound into text in real time using a speech engine, and compares the converted text with the wake word; if they match, the Alexa voice assistant is activated; if not, the speaker continues to receive and recognize sound within the pickup range.
Further, step S30 is specifically:
within the activation duration, the speaker continuously receives voice instructions issued by the user using the sound pickup, classifies the voice instructions using voiceprint recognition, recognizes the underlying intent of each classified voice instruction using a neural network, and then converts the voice instructions into execution instructions and inputs them into the Alexa voice assistant in sequence.
Further, in step S30, the execution instruction includes an execution duration.
Further, in step S40, maintaining the persistent connection of the Alexa voice assistant over the WebSocket protocol comprises:
step S41, setting a heartbeat cycle and monitoring whether execution of the execution instruction is interrupted within the execution duration; if so, proceeding to step S42; if not, proceeding to step S20;
step S42, using the WebSocket protocol to check, at intervals of the heartbeat cycle, whether the interruption has been recovered; if so, continuing to execute the execution instruction; if not, continuing to check at intervals of the heartbeat cycle whether the interruption has been recovered.
In another aspect, the invention provides a voice interaction system based on the Alexa cloud service, comprising the following modules:
a speaker initialization module, configured to set a wake word for the speaker, the light state to display for each execution instruction, the interface with which to respond to each execution instruction, and an activation duration;
an Alexa voice assistant activation module, configured so that the speaker receives sound within its pickup range in real time and, after verifying the received sound against the wake word, activates the Alexa voice assistant;
an instruction receiving module, configured so that, within the activation duration, the speaker continuously receives voice instructions issued by the user, converts each voice instruction into an execution instruction, and inputs the execution instructions into the Alexa voice assistant in sequence;
an instruction execution module, configured so that the Alexa voice assistant executes each received execution instruction, controls the display screen to respond with the corresponding interface, controls the light to display the corresponding state, maintains a persistent connection for the Alexa voice assistant over the WebSocket protocol, and monitors the execution of the execution instruction.
Further, the Alexa voice assistant activation module is specifically:
the speaker receives sound within its pickup range in real time using a sound pickup, converts the received sound into text in real time using a speech engine, and compares the converted text with the wake word; if they match, the Alexa voice assistant is activated; if not, the speaker continues to receive and recognize sound within the pickup range.
Further, the instruction receiving module is specifically:
within the activation duration, the speaker continuously receives voice instructions issued by the user using the sound pickup, classifies the voice instructions using voiceprint recognition, recognizes the underlying intent of each classified voice instruction using a neural network, and then converts the voice instructions into execution instructions and inputs them into the Alexa voice assistant in sequence.
Further, in the instruction receiving module, the execution instruction includes an execution duration.
Further, in the instruction execution module, maintaining the persistent connection of the Alexa voice assistant over the WebSocket protocol comprises:
an interruption monitoring unit, configured to set a heartbeat cycle and monitor whether execution of the execution instruction is interrupted within the execution duration; if so, control passes to the heartbeat testing unit; if not, control passes to the Alexa voice assistant activation module;
a heartbeat testing unit, configured to use the WebSocket protocol to check, at intervals of the heartbeat cycle, whether the interruption has been recovered; if so, the execution instruction continues to be executed; if not, the unit continues to check at intervals of the heartbeat cycle whether the interruption has been recovered.
The advantages of the invention are as follows:
1. A persistent connection for the Alexa voice assistant is maintained over the WebSocket protocol and the execution of each execution instruction is monitored. When an execution instruction is interrupted within its execution duration, a heartbeat test is performed at intervals of the heartbeat cycle, and execution of the instruction continues once the interruption is recovered, so the smart speaker keeps a persistent connection. Because an interface is set for each execution instruction, the Alexa voice assistant makes the display screen jump to the corresponding interface after receiving the instruction, so the smart speaker responds with an interface, which greatly improves the user experience.
2. Using the Alexa voice assistant greatly improves the accuracy of English recognition.
3. Because an activation duration is set, after waking the Alexa voice assistant the user can keep issuing voice instructions throughout the activation duration without waking the assistant again before each one. The user can therefore interact with the speaker continuously, which greatly improves the user experience.
4. Voice instructions are classified using voiceprint recognition, so the speaker can distinguish different users and apply per-user preference settings. Take playing music as an example: user A prefers rock music and user B prefers popular film and television songs. When the speaker receives a voice instruction to play music and voiceprint recognition identifies user A as the speaker, rock music is played. This makes the speaker more intelligent, which greatly improves the user experience.
5. The underlying intent of each classified voice instruction is recognized using a neural network, which greatly improves the recognition accuracy of voice instructions.
Drawings
The invention is further described below by way of embodiments and with reference to the accompanying drawings.
Fig. 1 is a flowchart of a voice interaction method based on the Alexa cloud service according to the invention.
Fig. 2 is a schematic structural diagram of a voice interaction system based on the Alexa cloud service according to the invention.
Detailed Description
The general idea of the technical solution in the embodiments of the application is as follows: a persistent connection for the Alexa voice assistant is maintained over the WebSocket protocol; when an execution instruction is interrupted within its execution duration, a heartbeat test is performed at intervals of the heartbeat cycle, and execution of the instruction continues once the interruption is recovered; and because an interface is set for each execution instruction, the Alexa voice assistant makes the display screen jump to the corresponding interface after receiving the instruction. The smart speaker thus keeps a persistent connection and responds with an interface, which improves the user experience.
The smart speaker used by the invention is provided with a display screen, an indicator light, a sound pickup, and a wireless communication module. The display screen displays the interface corresponding to the execution instruction; the indicator light displays different states to inform the user of the speaker's current operating condition; the sound pickup picks up the sound made by the user; and the wireless communication module connects to and interacts with a server or other smart devices.
Referring to Fig. 1 and Fig. 2, a preferred embodiment of the voice interaction method based on the Alexa cloud service according to the invention comprises the following steps:
Step S10, setting a wake word for the speaker, the light state to display for each execution instruction, the interface with which to respond to each execution instruction, and an activation duration. Because an interface is set for each execution instruction, the Alexa voice assistant makes the display screen jump to the corresponding interface after receiving the instruction, so the smart speaker responds with an interface, which greatly improves the user experience.
Step S20, the speaker receives sound within its pickup range in real time and, after verifying the received sound against the wake word, activates the Alexa voice assistant. Using the Alexa voice assistant greatly improves the accuracy of English recognition.
Step S30, within the activation duration, the speaker continuously receives voice instructions issued by the user, converts each voice instruction into an execution instruction, and inputs the execution instructions into the Alexa voice assistant in sequence. Because an activation duration is set, after waking the Alexa voice assistant the user can keep issuing voice instructions throughout the activation duration without waking the assistant again before each one. The user can therefore interact with the speaker continuously, which greatly improves the user experience.
Step S40, the Alexa voice assistant executes each received execution instruction, controls the display screen to respond with the corresponding interface, controls the light to display the corresponding state, maintains a persistent connection for the Alexa voice assistant over the WebSocket protocol, and monitors the execution of the execution instruction. When an execution instruction is interrupted within its execution duration, a heartbeat test is performed at intervals of the heartbeat cycle, and execution of the instruction continues once the interruption is recovered, so the smart speaker keeps a persistent connection.
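The settings of step S10 and the activation duration used in step S30 can be sketched as follows. This is a minimal illustration only: the names `SpeakerConfig`, `ActivationWindow`, and `respond`, the fallback values `"home"` and `"idle"`, and the choice of a fixed window measured from the moment of waking are all assumptions, since the patent does not specify data structures or whether the window is extended by later instructions.

```python
from dataclasses import dataclass, field


@dataclass
class SpeakerConfig:
    """Settings from step S10. Field names and values are illustrative."""
    wake_word: str
    activation_seconds: float
    light_states: dict = field(default_factory=dict)  # instruction -> light state
    interfaces: dict = field(default_factory=dict)    # instruction -> screen to show


class ActivationWindow:
    """Tracks whether the assistant is still inside the activation duration."""

    def __init__(self, config: SpeakerConfig):
        self.config = config
        self.deadline = None  # None until the wake word has been heard

    def wake(self, now: float) -> None:
        # Assumption: the window is fixed and starts when the assistant wakes.
        self.deadline = now + self.config.activation_seconds

    def accepts(self, now: float) -> bool:
        return self.deadline is not None and now <= self.deadline


def respond(config: SpeakerConfig, instruction: str):
    """Return the (interface, light state) pair configured for an instruction."""
    return (config.interfaces.get(instruction, "home"),
            config.light_states.get(instruction, "idle"))
```

With a 30-second activation duration, a voice instruction arriving 20 seconds after waking would be accepted without a second wake word, while one arriving 31 seconds after waking would not.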
Step S20 is specifically:
the speaker receives sound within its pickup range in real time using a sound pickup, converts the received sound into text in real time using a speech engine, and compares the converted text with the wake word; if they match, the Alexa voice assistant is activated; if not, the speaker continues to receive and recognize sound within the pickup range.
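The comparison in step S20 can be sketched as below, assuming the speech engine has already produced text transcripts. The function names are hypothetical, and the case and punctuation normalization is an added assumption; the patent only states that the converted text is compared with the wake word.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop punctuation and extra spaces before comparing."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def is_wake_word(transcript: str, wake_word: str) -> bool:
    """True when the transcribed text matches the configured wake word."""
    return normalize(transcript) == normalize(wake_word)


def listen(transcripts, wake_word):
    """Scan transcripts in order; return the index of the first match.

    Returns None when no transcript matches, which corresponds to the
    speaker continuing to receive and recognize sound.
    """
    for index, transcript in enumerate(transcripts):
        if is_wake_word(transcript, wake_word):
            return index
    return None
```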
Step S30 is specifically:
within the activation duration, the speaker continuously receives voice instructions issued by the user using the sound pickup, classifies the voice instructions using voiceprint recognition, recognizes the underlying intent of each classified voice instruction using a neural network, and then uses the speech engine to convert the voice instructions into execution instructions, which are input into the Alexa voice assistant in sequence; an execution instruction is a precise text instruction. Classifying voice instructions using voiceprint recognition lets the speaker distinguish different users and apply per-user preference settings. Take playing music as an example: user A prefers rock music and user B prefers popular film and television songs. When the speaker receives a voice instruction to play music and voiceprint recognition identifies user A as the speaker, rock music is played. This makes the speaker more intelligent, which greatly improves the user experience. Recognizing the underlying intent of each classified voice instruction using a neural network greatly improves the recognition accuracy of voice instructions.
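The per-user preference example above can be sketched as follows. The patent does not say how voiceprints are compared, so this sketch assumes each voice instruction has already been reduced to a numeric embedding and matches it against enrolled users by cosine similarity; the threshold of 0.8, the default genre, and all function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify_speaker(embedding, enrolled, threshold=0.8):
    """Return the enrolled user whose voiceprint is closest, or None.

    A user is only returned when the similarity reaches the threshold.
    """
    best_user, best_score = None, threshold
    for user, reference in enrolled.items():
        score = cosine(embedding, reference)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user


def choose_genre(user, preferences, default="popular"):
    """Pick the genre to play for the identified user."""
    return preferences.get(user, default)
```

In this sketch an unrecognized voice falls back to a default genre rather than being rejected; that is a design choice of the example, not something the patent states.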
In step S30, the execution instruction includes an execution duration. For example, if the instruction is to play music for half an hour, the execution duration of the execution instruction is half an hour.
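One way to picture an execution instruction that carries its execution duration is shown below. The `ExecutionInstruction` type and the small phrase parser are illustrative assumptions; the patent describes the execution instruction only as a precise text instruction that includes an execution duration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExecutionInstruction:
    action: str                      # e.g. "play music"
    duration_seconds: Optional[int]  # None when no duration was given


_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_instruction(text: str) -> ExecutionInstruction:
    """Split a trailing 'for <duration>' phrase off a text instruction."""
    text = text.lower().strip()
    match = re.search(r"\s+for\s+half\s+an\s+hour$", text)
    if match:
        return ExecutionInstruction(text[:match.start()], 1800)
    match = re.search(r"\s+for\s+(\d+)\s+(second|minute|hour)s?$", text)
    if match:
        seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        return ExecutionInstruction(text[:match.start()], seconds)
    return ExecutionInstruction(text, None)
```

For the example in the text, `parse_instruction("Play music for half an hour")` yields the action `"play music"` with a duration of 1800 seconds.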
In step S40, maintaining the persistent connection of the Alexa voice assistant over the WebSocket protocol specifically comprises:
step S41, setting a heartbeat cycle and monitoring whether execution of the execution instruction is interrupted within the execution duration; if so, proceeding to step S42; if not, proceeding to step S20;
step S42, using the WebSocket protocol to check, at intervals of the heartbeat cycle, whether the interruption has been recovered; if so, continuing to execute the execution instruction; if not, continuing to check at intervals of the heartbeat cycle whether the interruption has been recovered.
For example, suppose the execution instruction is to play music for one hour and the heartbeat cycle is one minute. If playback is interrupted by a network problem after half an hour, the speaker checks once a minute whether the network has recovered; once it has, playback resumes and continues until a full hour of music has been played.
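The one-hour playback example can be traced with a small simulation. No real WebSocket traffic is sent here: `network_up` is a hypothetical stand-in for the heartbeat check that the patent performs over the WebSocket protocol, time is counted in whole simulated minutes, and the function name and event log are assumptions made for illustration.

```python
def run_with_heartbeat(total_minutes, heartbeat_minutes, network_up):
    """Simulate executing an instruction with interruption monitoring.

    network_up(t) reports whether the link is up at wall-clock minute t.
    Returns (wall_clock_minutes_elapsed, event_log).
    """
    played = 0  # minutes of the instruction actually executed
    clock = 0   # simulated wall-clock minutes
    log = []
    while played < total_minutes:
        if network_up(clock):
            # Normal execution: one more minute of the instruction completes.
            played += 1
            clock += 1
        else:
            # Step S41 detected an interruption; hand over to step S42.
            log.append(("interrupted", clock))
            # Step S42: probe once per heartbeat cycle until the link recovers.
            while not network_up(clock):
                clock += heartbeat_minutes
            log.append(("resumed", clock))
    return clock, log
```

As in step S42, the inner loop keeps probing for as long as the interruption lasts, so a production version would likely need a timeout or cancellation path, which the patent does not describe.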
A preferred embodiment of the voice interaction system based on the Alexa cloud service according to the invention comprises the following modules:
a speaker initialization module, configured to set a wake word for the speaker, the light state to display for each execution instruction, the interface with which to respond to each execution instruction, and an activation duration. Because an interface is set for each execution instruction, the Alexa voice assistant makes the display screen jump to the corresponding interface after receiving the instruction, so the smart speaker responds with an interface, which greatly improves the user experience.
An Alexa voice assistant activation module, configured so that the speaker receives sound within its pickup range in real time and, after verifying the received sound against the wake word, activates the Alexa voice assistant. Using the Alexa voice assistant greatly improves the accuracy of English recognition.
An instruction receiving module, configured so that, within the activation duration, the speaker continuously receives voice instructions issued by the user, converts each voice instruction into an execution instruction, and inputs the execution instructions into the Alexa voice assistant in sequence. Because an activation duration is set, after waking the Alexa voice assistant the user can keep issuing voice instructions throughout the activation duration without waking the assistant again before each one. The user can therefore interact with the speaker continuously, which greatly improves the user experience.
An instruction execution module, configured so that the Alexa voice assistant executes each received execution instruction, controls the display screen to respond with the corresponding interface, controls the light to display the corresponding state, maintains a persistent connection for the Alexa voice assistant over the WebSocket protocol, and monitors the execution of the execution instruction. When an execution instruction is interrupted within its execution duration, a heartbeat test is performed at intervals of the heartbeat cycle, and execution of the instruction continues once the interruption is recovered, so the smart speaker keeps a persistent connection.
The Alexa voice assistant activation module is specifically:
the speaker receives sound within its pickup range in real time using a sound pickup, converts the received sound into text in real time using a speech engine, and compares the converted text with the wake word; if they match, the Alexa voice assistant is activated; if not, the speaker continues to receive and recognize sound within the pickup range.
The instruction receiving module is specifically:
within the activation duration, the speaker continuously receives voice instructions issued by the user using the sound pickup, classifies the voice instructions using voiceprint recognition, recognizes the underlying intent of each classified voice instruction using a neural network, and then uses the speech engine to convert the voice instructions into execution instructions, which are input into the Alexa voice assistant in sequence; an execution instruction is a precise text instruction. Classifying voice instructions using voiceprint recognition lets the speaker distinguish different users and apply per-user preference settings. Take playing music as an example: user A prefers rock music and user B prefers popular film and television songs. When the speaker receives a voice instruction to play music and voiceprint recognition identifies user A as the speaker, rock music is played. This makes the speaker more intelligent, which greatly improves the user experience. Recognizing the underlying intent of each classified voice instruction using a neural network greatly improves the recognition accuracy of voice instructions.
In the instruction receiving module, the execution instruction includes an execution duration. For example, if the instruction is to play music for half an hour, the execution duration of the execution instruction is half an hour.
In the instruction execution module, maintaining the persistent connection of the Alexa voice assistant over the WebSocket protocol specifically comprises:
an interruption monitoring unit, configured to set a heartbeat cycle and monitor whether execution of the execution instruction is interrupted within the execution duration; if so, control passes to the heartbeat testing unit; if not, control passes to the Alexa voice assistant activation module;
a heartbeat testing unit, configured to use the WebSocket protocol to check, at intervals of the heartbeat cycle, whether the interruption has been recovered; if so, the execution instruction continues to be executed; if not, the unit continues to check at intervals of the heartbeat cycle whether the interruption has been recovered.
For example, suppose the execution instruction is to play music for one hour and the heartbeat cycle is one minute. If playback is interrupted by a network problem after half an hour, the unit checks once a minute whether the network has recovered; once it has, playback resumes and continues until a full hour of music has been played.
In summary, the invention has the following advantages:
1. A long-lived connection of the Alexa voice assistant is maintained through the WebSocket protocol, and the execution status of each execution instruction is monitored. When an execution instruction is interrupted within its execution duration, a heartbeat test is performed at intervals of the heartbeat cycle, and execution resumes once the interruption is resolved, which gives the smart speaker a long-lived connection. In addition, a response interface is set for each execution instruction, so that after receiving an execution instruction the Alexa voice assistant makes the display screen jump to the corresponding interface. This gives the smart speaker an interface response and greatly improves the user experience.
2. Adopting the Alexa voice assistant greatly improves the accuracy of English speech recognition.
3. Because an activation duration is set, a user who has woken the Alexa voice assistant can keep issuing voice instructions throughout the activation duration without waking the assistant again before each one. The user can therefore interact with the speaker continuously, which greatly improves the user experience.
4. Voice instructions are classified using voiceprint recognition, so the speaker can tell different users apart and apply preferences set for each user. Take music playback as an example: user A prefers rock music and user B prefers film and television hits. When the speaker receives a voice instruction to play music and voiceprint recognition identifies user A as the speaker of that instruction, rock music is played. This makes the speaker more intelligent and greatly improves the user experience.
5. A neural network is used to identify the potential intention of the classified voice instructions, which greatly improves the recognition accuracy of voice instructions.
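The per-user preference described in advantage 4 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the three-element vectors stand in for real voiceprint embeddings, cosine similarity with a fixed threshold stands in for whatever voiceprint recognition the speaker actually uses, and `ENROLLED`, `PREFERENCES`, `identify_speaker`, and `genre_for` are hypothetical names.

```python
import math

# Hypothetical enrolled voiceprints (toy vectors, not real embeddings).
ENROLLED = {
    "user_a": [0.9, 0.1, 0.0],
    "user_b": [0.1, 0.8, 0.3],
}
# Hypothetical per-user music preferences, following the example in the text.
PREFERENCES = {"user_a": "rock", "user_b": "film and television hits"}
DEFAULT_GENRE = "popular"  # assumed fallback for unrecognized voices


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def identify_speaker(voiceprint, threshold=0.8):
    """Return the best-matching enrolled user, or None if no match reaches the threshold."""
    best_user, best_score = None, threshold
    for user, reference in ENROLLED.items():
        score = cosine(voiceprint, reference)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user


def genre_for(voiceprint):
    """Pick the genre to play for a 'play music' instruction."""
    user = identify_speaker(voiceprint)
    return PREFERENCES.get(user, DEFAULT_GENRE)


print(genre_for([0.85, 0.15, 0.05]))  # voice close to user A's enrolled print
print(genre_for([0.0, 0.0, 1.0]))     # voice that matches nobody well
```

Falling back to a default genre for an unrecognized voice is a design choice of this sketch; the source does not say what the speaker does in that case.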
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (10)

1. A voice interaction method based on the Alexa cloud service, characterized in that the method comprises the following steps:
step S10, setting a wake-up word for the speaker, the light display state corresponding to each execution instruction, the response interface corresponding to each execution instruction, and an activation duration;
step S20, the speaker receives sound within its receiving range in real time, verifies the received sound against the wake-up word, and activates the Alexa voice assistant once the verification succeeds;
step S30, within the activation duration, the speaker continuously receives voice instructions issued by the user, converts each voice instruction into an execution instruction, and inputs the execution instructions into the Alexa voice assistant in sequence;
step S40, the Alexa voice assistant executes the received execution instruction, controls the display screen to respond with the corresponding interface, and controls the lights to display the corresponding state, while the long-lived connection of the Alexa voice assistant is maintained through the WebSocket protocol and the execution status of the execution instruction is monitored.
2. The voice interaction method based on the Alexa cloud service as claimed in claim 1, characterized in that step S20 specifically comprises:
the speaker receives sound within its receiving range in real time using a sound pickup, converts the received sound into text in real time using a speech engine, and compares the converted text with the wake-up word; if they match, the Alexa voice assistant is activated; if not, the speaker continues to receive and recognize sound within the receiving range.
3. The voice interaction method based on the Alexa cloud service as claimed in claim 1, characterized in that step S30 specifically comprises:
within the activation duration, the speaker uses the sound pickup to continuously receive voice instructions issued by the user, classifies the voice instructions using voiceprint recognition, identifies the potential intention of the classified voice instructions using a neural network, converts the voice instructions into execution instructions, and inputs the execution instructions into the Alexa voice assistant in sequence.
4. The voice interaction method based on the Alexa cloud service as claimed in claim 1, characterized in that, in step S30, the execution instruction includes an execution duration.
5. The voice interaction method based on the Alexa cloud service as claimed in claim 1, characterized in that, in step S40, maintaining the long-lived connection of the Alexa voice assistant through the WebSocket protocol specifically comprises:
step S41, setting a heartbeat cycle and monitoring whether execution of the execution instruction is interrupted within the execution duration; if so, proceeding to step S42; if not, returning to step S20;
step S42, using the WebSocket protocol to check, at intervals of the heartbeat cycle, whether the interruption has been resolved; if so, resuming execution of the execution instruction; if not, continuing to check at intervals of the heartbeat cycle.
6. A voice interaction system based on the Alexa cloud service, characterized in that the system comprises the following modules:
a speaker initialization module, used for setting a wake-up word for the speaker, the light display state corresponding to each execution instruction, the response interface corresponding to each execution instruction, and an activation duration;
an Alexa voice assistant activation module, used for the speaker to receive sound within its receiving range in real time, verify the received sound against the wake-up word, and activate the Alexa voice assistant once the verification succeeds;
an instruction receiving module, used for the speaker to continuously receive, within the activation duration, voice instructions issued by the user, convert each voice instruction into an execution instruction, and input the execution instructions into the Alexa voice assistant in sequence;
an instruction execution module, used for the Alexa voice assistant to execute the received execution instruction, control the display screen to respond with the corresponding interface, and control the lights to display the corresponding state, while the long-lived connection of the Alexa voice assistant is maintained through the WebSocket protocol and the execution status of the execution instruction is monitored.
7. The voice interaction system based on the Alexa cloud service as claimed in claim 6, characterized in that the Alexa voice assistant activation module is specifically configured as follows:
the speaker receives sound within its receiving range in real time using a sound pickup, converts the received sound into text in real time using a speech engine, and compares the converted text with the wake-up word; if they match, the Alexa voice assistant is activated; if not, the speaker continues to receive and recognize sound within the receiving range.
8. The voice interaction system based on the Alexa cloud service as claimed in claim 6, characterized in that the instruction receiving module is specifically configured as follows:
within the activation duration, the speaker uses the sound pickup to continuously receive voice instructions issued by the user, classifies the voice instructions using voiceprint recognition, identifies the potential intention of the classified voice instructions using a neural network, converts the voice instructions into execution instructions, and inputs the execution instructions into the Alexa voice assistant in sequence.
9. The voice interaction system based on the Alexa cloud service as claimed in claim 6, characterized in that, in the instruction receiving module, the execution instruction includes an execution duration.
10. The voice interaction system based on the Alexa cloud service as claimed in claim 6, characterized in that, in the instruction execution module, maintaining the long-lived connection of the Alexa voice assistant through the WebSocket protocol specifically involves:
an interruption monitoring unit, used for setting a heartbeat cycle and monitoring whether execution of the execution instruction is interrupted within the execution duration; if so, control passes to the heartbeat testing unit; if not, control passes to the Alexa voice assistant activation module;
a heartbeat testing unit, used for checking through the WebSocket protocol, at intervals of the heartbeat cycle, whether the interruption has been resolved; if so, execution of the execution instruction resumes; if not, the unit continues to check at intervals of the heartbeat cycle.
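The wake-up check and activation duration recited in claims 1 to 3 (steps S20 and S30) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the claimed implementation: utterances are assumed to arrive already transcribed to text, the caller supplies the time in seconds, the `Speaker` class and its `hear` method are hypothetical names, and hearing the wake-up word again is assumed to restart the activation window.

```python
class Speaker:
    """Toy model of wake-up verification followed by an activation window."""

    def __init__(self, wake_word, activation_seconds):
        self.wake_word = wake_word.strip().lower()
        self.activation_seconds = activation_seconds
        self.active_until = None  # time at which the activation window closes
        self.queue = []           # accepted instructions, in arrival order

    def hear(self, text, now):
        """Handle one transcribed utterance heard at time `now` (seconds)."""
        heard = text.strip().lower()
        if heard == self.wake_word:
            # S20: the text matches the wake-up word, so activate the assistant.
            self.active_until = now + self.activation_seconds
            return "activated"
        if self.active_until is not None and now < self.active_until:
            # S30: inside the activation duration, accept without a new wake-up.
            self.queue.append(heard)
            return "queued"
        # Outside the activation duration, keep listening for the wake-up word.
        return "ignored"


speaker = Speaker(wake_word="Alexa", activation_seconds=10)
results = [
    speaker.hear("play music", now=0),   # before wake-up
    speaker.hear("alexa", now=1),        # wake-up word
    speaker.hear("play music", now=3),   # inside the window
    speaker.hear("turn it up", now=9),   # still inside the window
    speaker.hear("stop", now=12),        # window closed at t = 11
]
print(results)
print(speaker.queue)
```

Only the two instructions issued inside the ten-second window reach the queue; the one spoken before the wake-up word and the one spoken after the window closes are ignored. Taking `now` as a parameter instead of reading the system clock keeps the sketch deterministic.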
CN202010885996.4A 2020-08-28 2020-08-28 Voice interaction method and system based on alexa cloud service Pending CN112233670A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010885996.4A CN112233670A (en) 2020-08-28 2020-08-28 Voice interaction method and system based on alexa cloud service

Publications (1)

Publication Number Publication Date
CN112233670A true CN112233670A (en) 2021-01-15

Family

ID=74115772

Country Status (1)

Country Link
CN (1) CN112233670A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113157350A (en) * 2021-03-18 2021-07-23 福建马恒达信息科技有限公司 Office auxiliary system and method based on voice recognition

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108962250A (en) * 2018-09-26 2018-12-07 出门问问信息科技有限公司 Audio recognition method, device and electronic equipment
CN109410951A (en) * 2018-11-21 2019-03-01 广州番禺巨大汽车音响设备有限公司 Audio controlling method, system and stereo set based on Alexa voice control
CN109545206A (en) * 2018-10-29 2019-03-29 百度在线网络技术(北京)有限公司 Voice interaction processing method, device and the smart machine of smart machine
WO2020022573A1 (en) * 2018-07-27 2020-01-30 (주)휴맥스 Smart device and control method therefor
CN110910886A (en) * 2019-12-17 2020-03-24 广州三星通信技术研究有限公司 Man-machine interaction method and device
CN111292733A (en) * 2018-12-06 2020-06-16 阿里巴巴集团控股有限公司 Voice interaction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination