CN115410580A - Voice recognition method, device, equipment and medium for command scheduling system

Info

Publication number
CN115410580A
Authority
CN
China
Prior art keywords
voice recognition
voice
command
client
voice instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211035571.XA
Other languages
Chinese (zh)
Inventor
蒋俊兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Feixun Digital Technology Co ltd
Original Assignee
Beijing Feixun Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Feixun Digital Technology Co ltd
Priority to CN202211035571.XA
Publication of CN115410580A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0807 Network architectures or network communication protocols for network security for authentication of entities using tickets, e.g. Kerberos
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/108 Network architectures or network communication protocols for network security for controlling access to devices or network resources when the policy decisions are valid for a limited amount of time
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/321 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving a third party or a trusted authority
    • H04L9/3213 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving a third party or a trusted authority using tickets or tokens, e.g. Kerberos
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

The invention discloses a voice recognition method, apparatus, device and medium for a command and dispatch system. The method comprises: acquiring, by a voice recognition server, a voice instruction message sent by a command scheduling client; analyzing the voice instruction message based on a preset communication protocol to acquire the voice instruction data corresponding to the voice instruction message; and then performing voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and sending the voice recognition result to the command scheduling client. According to the technical scheme of the embodiment, the voice recognition software development kit is deployed in the voice recognition server to perform voice recognition on the voice instruction data of the command scheduling client, so that voice recognition can be realized for command scheduling clients running on different operating systems, and the system compatibility of voice recognition for command scheduling clients can be improved.

Description

Voice recognition method, device, equipment and medium for command scheduling system
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a medium for speech recognition in a command and dispatch system.
Background
In an audio and video command and dispatch system, a user can command and dispatch by voice, for example by controlling intelligent equipment by voice, which can greatly improve the convenience of command and dispatch.
At present, a speech recognition method in an audio/video command and dispatch system generally integrates a speech recognition service provided by a third party into the audio/video command and dispatch client in order to realize the speech recognition function. However, a speech recognition service provided by a third party is often limited to particular operating systems; that is, it is not compatible with all operating systems, especially newly developed domestic operating systems. Therefore, on audio/video command and dispatch clients running on some operating systems, speech recognition may not be achievable through the third-party speech recognition service.
Disclosure of Invention
The invention provides a voice recognition method, a voice recognition device, voice recognition equipment and a voice recognition medium for a command and dispatch system, which can realize voice recognition of command and dispatch clients of different operating systems and can improve the system compatibility of the voice recognition of the command and dispatch clients.
According to an aspect of the present invention, there is provided a speech recognition method for a command and dispatch system, applied to a speech recognition server, including:
acquiring a voice instruction message sent by a command scheduling client;
analyzing the voice instruction message based on a preset communication protocol to obtain voice instruction data corresponding to the voice instruction message;
and performing voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and sending a voice recognition result to the command scheduling client.
According to another aspect of the present invention, there is provided a speech recognition method for a command and dispatch system, applied to a command and dispatch client, comprising:
responding to a voice scheduling request of a user, and acquiring voice instruction data;
acquiring a voice instruction message according to the voice instruction data and a preset communication protocol, and sending the voice instruction message to a voice recognition server;
and receiving a voice recognition result corresponding to the voice instruction data sent by the voice recognition server, and conducting command scheduling based on the voice recognition result.
According to another aspect of the present invention, there is provided a speech recognition apparatus for a command and dispatch system, applied to a speech recognition server, including:
the first voice instruction message acquisition module is used for acquiring a voice instruction message sent by a command scheduling client;
the first voice instruction data acquisition module is used for analyzing the voice instruction message based on a preset communication protocol and acquiring voice instruction data corresponding to the voice instruction message;
and the voice recognition result sending module is used for carrying out voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit and sending a voice recognition result to the command scheduling client.
According to another aspect of the present invention, there is provided a speech recognition apparatus for a command and dispatch system, applied to a command and dispatch client, including:
the second voice instruction data acquisition module is used for responding to a voice scheduling request of a user and acquiring voice instruction data;
the second voice instruction message acquisition module is used for acquiring a voice instruction message according to the voice instruction data and a preset communication protocol and sending the voice instruction message to a voice recognition server;
and the voice recognition result receiving module is used for receiving the voice recognition result corresponding to the voice instruction data sent by the voice recognition server and conducting command scheduling based on the voice recognition result.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the speech recognition method for a command and dispatch system as described in any of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the speech recognition method for a command and dispatch system according to any one of the embodiments of the present invention.
According to the technical scheme of the embodiment of the invention, the voice recognition server obtains the voice instruction message sent by the command scheduling client and analyzes it based on the preset communication protocol to obtain the corresponding voice instruction data; it then performs voice recognition on the voice instruction data through the pre-deployed voice recognition software development kit and sends the voice recognition result to the command scheduling client. Because the voice recognition software development kit is deployed in the voice recognition server rather than in each client, voice recognition can be realized for command scheduling clients running on different operating systems, and the system compatibility of voice recognition for command scheduling clients can be improved.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1A is a flowchart of a speech recognition method of a command and dispatch system according to a first embodiment of the present invention;
Fig. 1B is a flowchart illustrating a prior-art speech recognition method in a command and dispatch system;
Fig. 2A is a flowchart of a speech recognition method of a command and dispatch system according to a second embodiment of the present invention;
Fig. 2B is a flowchart illustrating a speech recognition method of a command and dispatch system according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a speech recognition apparatus of a command and dispatch system according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a speech recognition apparatus of a command and dispatch system according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device implementing a speech recognition method of a command and dispatch system according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," "object," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Embodiment one
Fig. 1A is a flowchart of a voice recognition method of a command and dispatch system according to a first embodiment of the present invention. This embodiment is applicable to the case where voice recognition is performed on a voice instruction input by a user in the command and dispatch system. The method may be executed by the voice recognition apparatus of the command and dispatch system according to the third embodiment of the present invention; the apparatus may be implemented in the form of hardware and/or software and may be configured in an electronic device, which typically may be a server. As shown in Fig. 1A, the method includes:
S110, obtaining a voice instruction message sent by the command scheduling client.
The command scheduling client may be an application program for command scheduling and may be installed on a computer device or a mobile terminal; typically, it may be an audio and video command scheduling client. In this embodiment, the command scheduling client may control associated intelligent devices (e.g., a smart television, a smart air conditioner, etc.) according to the command scheduling voice of the user. Command scheduling clients may run on different operating systems, and there may be one or more of them. It should be noted that the command scheduling client is authorized by the user in advance before collecting the user's command scheduling voice.
In this embodiment, when the command scheduling client detects a command scheduling voice input by a user, it may encapsulate the command scheduling voice based on a preset communication protocol, for example the WebSocket protocol, the Transmission Control Protocol (TCP), or the Hypertext Transfer Protocol (HTTP), to generate a voice instruction message, and send the voice instruction message to the voice recognition server through a pre-established communication link with the voice recognition server. The voice recognition server can then receive the voice instruction message sent by the command scheduling client.
The voice instruction message may further include an identity of the command scheduling client, identity verification information (e.g., an identity certificate), and address information, for example a Uniform Resource Locator (URL) and the domain name address of the voice recognition server, among other content.
S120, analyzing the voice instruction message based on a preset communication protocol, and acquiring voice instruction data corresponding to the voice instruction message.
Specifically, after receiving the voice instruction message sent by the command scheduling client, the voice recognition server may analyze the voice instruction message based on the preset communication protocol, for example WebSocket, TCP, or HTTP, to obtain the voice instruction data it contains. The preset communication protocol can be configured in advance in both the voice recognition server and the command scheduling client.
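As an illustration of the encapsulation in S110 and the analysis in S120, the following is a minimal sketch in Python. The source does not specify a message layout, so the frame format used here (a 4-byte length prefix, a JSON header, then raw audio bytes) and the names `build_voice_message`, `parse_voice_message`, `client_id`, `token` and `audio_len` are all assumptions made for this example; the resulting bytes could be carried over WebSocket, TCP, or HTTP.

```python
import json
import struct
from typing import Tuple

# Hypothetical frame layout (not specified in the source):
#   4-byte big-endian header length | JSON header (UTF-8) | raw audio bytes


def build_voice_message(client_id: str, token: str, audio: bytes) -> bytes:
    """Client side: wrap voice instruction data into a voice instruction message."""
    header = json.dumps(
        {"client_id": client_id, "token": token, "audio_len": len(audio)}
    ).encode("utf-8")
    return struct.pack(">I", len(header)) + header + audio


def parse_voice_message(message: bytes) -> Tuple[dict, bytes]:
    """Server side: recover the header fields and the voice instruction data."""
    if len(message) < 4:
        raise ValueError("message too short for length prefix")
    (header_len,) = struct.unpack(">I", message[:4])
    header_end = 4 + header_len
    if header_end > len(message):
        raise ValueError("truncated header")
    header = json.loads(message[4:header_end].decode("utf-8"))
    audio = message[header_end:]
    if len(audio) != header["audio_len"]:
        raise ValueError("audio length mismatch")
    return header, audio
```

Because both sides share the same layout, the server can reject truncated or inconsistent messages before any recognition work is done.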
S130, performing voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and sending a voice recognition result to the command scheduling client.
In this embodiment, the speech recognition server is pre-deployed with a speech recognition Software Development Kit (SDK) for implementing the speech recognition function. The voice recognition SDK may be provided by a third party manufacturer, and the embodiment does not specifically limit the type and applicable system of the voice recognition SDK.
Therefore, after the voice recognition server acquires the voice instruction data, it can issue a request that calls the pre-deployed voice recognition SDK to perform voice recognition on the voice instruction data, thereby obtaining a voice recognition result. The voice recognition server can then send the voice recognition result to the corresponding command scheduling client based on the domain name address of the command scheduling client in the voice instruction message. The voice recognition result may be text data corresponding to the voice instruction data.
It should be noted that the prior-art voice recognition process of a command scheduling system is shown in Fig. 1B: a voice recognition SDK is integrated in the audio/video command scheduling client, and when voice recognition is required, the client directly calls the voice recognition SDK to obtain a voice recognition result. However, current voice recognition SDKs are not compatible with all operating systems, especially newly developed operating systems or domestic operating systems with a small number of users. In addition, each audio/video command scheduling client requires its own voice recognition SDK, so the voice recognition SDK cannot be reused.
In view of the above problems, in this embodiment the voice recognition SDK is deployed on the voice recognition server; the command scheduling client calls the voice recognition server, and the voice recognition server then calls the voice recognition SDK in a unified manner to obtain a voice recognition result. This overcomes the operating system limitation, so that voice instructions from command scheduling clients on different operating systems can be recognized; it also allows the voice recognition SDK to be reused, which reduces the number of voice recognition SDK installations and the cost of voice recognition.
Optionally, a matching server port may be set for the voice recognition SDK in advance. After the voice recognition server receives the voice instruction message, it can then automatically analyze the message to obtain the voice instruction data, and automatically call the preset server port to invoke the corresponding voice recognition SDK to perform voice recognition on the voice instruction data.
The advantage of this arrangement is that it simplifies the voice recognition flow of the voice recognition server and improves its efficiency.
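One way to picture the port-to-SDK arrangement is a lookup table from a preset port number to the SDK entry point. The sketch below is only an assumption about how such a mapping might look: the class `RecognitionServer`, its methods, and the stand-in `fake_sdk` are hypothetical, and a real third-party SDK would expose its own interface.

```python
from typing import Callable, Dict

# A recognizer takes raw voice instruction data and returns recognized text.
Recognizer = Callable[[bytes], str]


class RecognitionServer:
    """Hypothetical server-side registry mapping a preset port to an SDK."""

    def __init__(self) -> None:
        self._sdk_by_port: Dict[int, Recognizer] = {}

    def register_sdk(self, port: int, recognizer: Recognizer) -> None:
        # Done once at deployment time, before any client connects.
        self._sdk_by_port[port] = recognizer

    def recognize(self, port: int, audio: bytes) -> str:
        # Look up the SDK bound to the port and delegate recognition to it.
        try:
            recognizer = self._sdk_by_port[port]
        except KeyError:
            raise LookupError(f"no SDK registered for port {port}") from None
        return recognizer(audio)


def fake_sdk(audio: bytes) -> str:
    # Stand-in for a third-party SDK call; it does no real recognition.
    return f"<{len(audio)} bytes recognized>"
```

With this shape, clients never load the SDK themselves; only the server needs a build of the SDK that matches its own operating system.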
According to the technical scheme of the embodiment of the invention, the voice recognition server obtains the voice instruction message sent by the command scheduling client and analyzes it based on the preset communication protocol to obtain the corresponding voice instruction data; it then performs voice recognition on the voice instruction data through the pre-deployed voice recognition software development kit and sends the voice recognition result to the command scheduling client. Because the voice recognition software development kit is deployed in the voice recognition server rather than in each client, voice recognition can be realized for command scheduling clients running on different operating systems, and the system compatibility of voice recognition for command scheduling clients can be improved.
In an optional implementation manner of this embodiment, before obtaining the voice instruction message sent by the command scheduling client, the method may further include:
when a communication link establishment request sent by the command scheduling client is received, acquiring an identity token corresponding to the command scheduling client according to the communication link establishment request;
and if the identity token corresponding to the command scheduling client is detected to pass the verification successfully, generating response data corresponding to the communication link establishment request, and sending the response data corresponding to the communication link establishment request to the command scheduling client so as to establish a communication link with the command scheduling client.
The communication link establishment request may include an identity token corresponding to the command scheduling client, and the identity token may be used for identity verification.
It should be noted that, when a user first logs in to the command scheduling client, the command scheduling client sends a login request including a user name and a user password to the voice recognition server; after receiving the login request, the voice recognition server verifies the user name and the user password. After successful verification, the voice recognition server issues an identity token and sends it to the command scheduling client. Upon receiving the identity token, the command scheduling client may store it locally, e.g., in a cookie or in local storage.
Thereafter, every time the command scheduling client sends message data to the voice recognition server, it carries the identity token issued by the voice recognition server. Accordingly, after receiving the communication link establishment request of the command scheduling client, the voice recognition server verifies the identity token in the request. If the same identity token is found to be stored in the server and the identity token is within its validity period, the command scheduling client is determined to have passed verification; response data corresponding to the communication link establishment request is then generated and sent to the command scheduling client, and the communication link with the command scheduling client is established. If the command scheduling client fails verification, the voice recognition server may not feed back any data to the command scheduling client.
The communication link establishment request may be request data generated by the command scheduling client for establishing a communication link with the voice recognition server; correspondingly, the response data corresponding to the communication link establishment request may be response data generated by the voice recognition server to indicate that the communication link establishment request has passed verification.
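The token check described above (the token must be stored in the server and still within its validity period) can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not say how tokens are generated or stored, so the in-memory `TokenStore`, the random token from Python's `secrets` module, the one-hour default lifetime, and the response dictionary are all hypothetical.

```python
import secrets
import time
from typing import Dict, Optional


class TokenStore:
    """Hypothetical in-memory store of issued identity tokens and expiry times."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self._ttl = ttl_seconds
        self._expiry_by_token: Dict[str, float] = {}

    def issue(self, now: Optional[float] = None) -> str:
        # Called only after the user name and password have been verified.
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._expiry_by_token[token] = now + self._ttl
        return token

    def is_valid(self, token: str, now: Optional[float] = None) -> bool:
        # Valid means: the same token is stored here and has not yet expired.
        now = time.time() if now is None else now
        expiry = self._expiry_by_token.get(token)
        return expiry is not None and now < expiry


def handle_link_request(
    store: TokenStore, token: str, now: Optional[float] = None
) -> Optional[dict]:
    """Return response data on success, or None to send nothing back on failure."""
    if store.is_valid(token, now):
        return {"type": "link_established"}
    return None
```

Returning `None` mirrors the statement that the server may send no data at all when verification fails.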
Correspondingly, obtaining the voice instruction packet sent by the command scheduling client may include:
and acquiring a voice instruction message sent by the command scheduling client based on the communication link with the command scheduling client.
Specifically, after the communication link between the speech recognition server and the command scheduling client is successfully established, the transmission of the speech instruction packet may be performed based on the communication link.
In another optional implementation manner of this embodiment, after acquiring, according to the communication link establishment request, the identity token corresponding to the command scheduling client, the method may further include:
if the identity token corresponding to the command scheduling client is detected to be expired, generating identity token expiration response data, and sending the identity token expiration response data to the command scheduling client;
wherein the command scheduling client is configured to generate an identity token update request upon detecting the identity token expiration response data sent by the voice recognition server, and to send the identity token update request to the voice recognition server;
acquiring the identity token update request sent by the command scheduling client, and acquiring an updated identity token according to the identity token update request;
and sending the updated identity token to the command scheduling client to complete the update operation of the identity token.
It should be noted that the identity token issued by the voice recognition server has a corresponding validity period, and the identity token expires once that period is exceeded. Therefore, when the voice recognition server detects that the same identity token is stored in the server but has expired, it can generate identity token expiration response data and send it to the command scheduling client to inform the client that its current identity token has expired.
After receiving the identity token expiration response data, the command scheduling client can generate an identity token update request and send it to the voice recognition server to request an updated identity token. After receiving the identity token update request, the voice recognition server can issue an updated identity token and send it to the command scheduling client. Further, after receiving the updated identity token, the command scheduling client may construct a new communication link establishment request based on the updated identity token and send it to the voice recognition server, so as to establish a communication link with the voice recognition server.
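The expiry-and-update exchange can be traced end to end in a small simulation. The sketch below rests on assumptions the source leaves open: the update request is authenticated by presenting the old token, replies are plain strings, and the names `Server`, `on_link_request`, `on_update_request` and `connect` are invented for illustration.

```python
import secrets
from typing import Dict, Tuple


class Server:
    """Hypothetical voice recognition server tracking token expiry times."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.expiry: Dict[str, float] = {}

    def issue(self, now: float) -> str:
        token = secrets.token_urlsafe(16)
        self.expiry[token] = now + self.ttl
        return token

    def on_link_request(self, token: str, now: float) -> str:
        if token not in self.expiry:
            return "rejected"  # unknown token
        if now >= self.expiry[token]:
            return "token_expired"  # stored but past its validity period
        return "link_established"

    def on_update_request(self, old_token: str, now: float) -> str:
        # Assumption: only a token this server issued can be exchanged.
        if old_token not in self.expiry:
            raise PermissionError("unknown token")
        del self.expiry[old_token]
        return self.issue(now)


def connect(server: Server, token: str, now: float) -> Tuple[str, str]:
    """Client side: request a link, refreshing the token once if it has expired."""
    reply = server.on_link_request(token, now)
    if reply == "token_expired":
        token = server.on_update_request(token, now)
        reply = server.on_link_request(token, now)
    return reply, token
```

The client retries exactly once with the updated token, which matches the sequence of expiration response, update request, updated token, and new link establishment request described above.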
In another optional implementation manner of this embodiment, after obtaining the voice instruction message sent by the command scheduling client, the method may further include:
establishing a thread corresponding to the command scheduling client;
analyzing the voice instruction packet based on a preset communication protocol to obtain voice instruction data corresponding to the voice instruction packet may include:
analyzing the voice instruction message based on a preset communication protocol through a thread corresponding to the command scheduling client, and acquiring voice instruction data corresponding to the voice instruction message;
performing voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and sending a voice recognition result to the command scheduling client, where the performing of voice recognition may include:
and performing voice recognition on the voice instruction data by adopting a pre-deployed voice recognition software development kit through a thread corresponding to the command scheduling client, and sending a voice recognition result to the command scheduling client.
In this embodiment, for each command scheduling client, the speech recognition server may respectively establish a corresponding thread for performing corresponding data processing. Specifically, after receiving a voice instruction packet sent by a command scheduling client, the voice recognition server may establish a thread corresponding to the command scheduling client, analyze the voice instruction packet based on the thread, and call the voice recognition SDK to perform voice recognition on voice instruction data.
In this embodiment, by respectively establishing the threads corresponding to the command and dispatch clients, speech recognition of the speech instruction data of a plurality of command and dispatch clients can be simultaneously achieved, and the efficiency of speech recognition of the command and dispatch clients can be improved.
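As a rough illustration of the one-thread-per-client arrangement, the following sketch starts a worker thread for each client and collects the recognition results. The names `recognize_stub`, `handle_client` and `serve` are hypothetical, and `recognize_stub` merely stands in for the pre-deployed voice recognition SDK, whose interface is not specified in this document.

```python
import threading


def recognize_stub(audio: bytes) -> str:
    """Stand-in for the pre-deployed voice recognition SDK (hypothetical)."""
    return f"<{len(audio)} bytes recognized>"


def handle_client(client_id, packet, results, lock):
    """Per-client worker: parse the packet, run recognition, record the result."""
    audio = packet  # parsing is protocol-specific; the payload is used as-is here
    text = recognize_stub(audio)
    with lock:  # protect the shared result table
        results[client_id] = text


def serve(packets):
    """Start one thread per client and wait for all of them to finish."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=handle_client, args=(cid, pkt, results, lock))
        for cid, pkt in packets.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A real server would send each result back over that client's communication link rather than collecting the results in a dictionary.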
Example two
Fig. 2A is a flowchart of a voice recognition method of a command and dispatch system according to a second embodiment of the present invention, where this embodiment is applicable to a case where voice recognition is performed on a voice instruction input by a user in the command and dispatch system, and the method may be executed by a voice recognition device of the command and dispatch system according to a fourth embodiment of the present invention, where the voice recognition device of the command and dispatch system may be implemented in a form of hardware and/or software, and the voice recognition device of the command and dispatch system may be configured in an electronic device, where the electronic device may be, typically, a computer device or a mobile terminal equipped with a command and dispatch client. As shown in fig. 2A, the method includes:
S210, responding to the voice scheduling request of the user, and acquiring voice instruction data.
In this embodiment, a user may send a voice scheduling request to the command scheduling client by clicking a voice input button in the command scheduling client; or, the voice scheduling request can be sent to the command scheduling client through a preset wake-up keyword. And the command scheduling client can acquire the voice instruction data of the user according to the voice scheduling request. The present embodiment does not specifically limit the format of the voice instruction data.
S220, acquiring a voice instruction message according to the voice instruction data and a preset communication protocol, and sending the voice instruction message to a voice recognition server.
Specifically, the command scheduling client may encapsulate the voice instruction data based on a preset communication protocol to obtain a voice instruction packet, and send the voice instruction packet to the voice recognition server. After receiving the voice instruction message, the voice recognition server can analyze the voice instruction message based on a preset communication protocol to acquire voice instruction data; and then, carrying out voice recognition on the voice instruction data by calling the pre-deployed voice recognition SDK, and sending a voice recognition result to the command scheduling client. The preset communication protocol may include WebSocket, TCP, HTTP, and the like.
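The encapsulation and parsing steps can be sketched as follows. The message layout is not defined in this document, so the framing used here (a 4-byte big-endian header length, a JSON header, then the raw audio bytes) and the names `pack_message`, `unpack_message`, `client_id` and `audio_len` are assumptions chosen only to make the round trip concrete.

```python
import json
import struct


def pack_message(client_id: str, audio: bytes) -> bytes:
    """Client side: wrap voice instruction data in a length-prefixed frame."""
    header = json.dumps({"client_id": client_id, "audio_len": len(audio)}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + audio


def unpack_message(packet: bytes):
    """Server side: recover the client id and the voice instruction data."""
    (header_len,) = struct.unpack_from(">I", packet, 0)
    header = json.loads(packet[4:4 + header_len].decode("utf-8"))
    start = 4 + header_len
    audio = packet[start:start + header["audio_len"]]
    if len(audio) != header["audio_len"]:
        raise ValueError("truncated voice instruction message")
    return header["client_id"], audio
```

The same frame could be carried as a binary WebSocket message, a TCP stream segment or an HTTP request body; only the transport differs.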
S230, receiving a voice recognition result corresponding to the voice instruction data sent by the voice recognition server, and conducting command scheduling based on the voice recognition result.
In this embodiment, after receiving the voice recognition result corresponding to the voice instruction data, the command scheduling client may generate a corresponding command scheduling instruction based on the voice recognition result, and send the command scheduling instruction to the associated smart device, so as to control the smart device to perform a corresponding operation.
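How a recognition result becomes a command scheduling instruction is not detailed in this embodiment. One simple possibility, shown purely as an assumption, is a phrase lookup table; the phrases, the instruction names and the function `to_dispatch_instruction` below are hypothetical.

```python
# Hypothetical mapping from spoken phrases to command scheduling instructions.
COMMAND_TABLE = {
    "call": "START_CALL",
    "hang up": "END_CALL",
    "open camera": "OPEN_VIDEO",
}


def to_dispatch_instruction(recognized_text):
    """Return the instruction for the first matching phrase, or None if none matches."""
    text = recognized_text.lower()
    # Try longer phrases first so that a more specific phrase wins over a shorter one.
    for phrase in sorted(COMMAND_TABLE, key=len, reverse=True):
        if phrase in text:
            return COMMAND_TABLE[phrase]
    return None
```

Returning `None` for unmatched text lets the client ask the user to repeat the instruction instead of sending an arbitrary command to the smart device.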
According to the technical scheme of the embodiment of the invention, the command scheduling client responds to the voice scheduling request of the user to obtain the voice instruction data, and obtains the voice instruction message according to the voice instruction data and the preset communication protocol, and sends the voice instruction message to the voice recognition server; and then, receiving a voice recognition result corresponding to the voice instruction data sent by the voice recognition server, carrying out command scheduling based on the voice recognition result, and sending the voice instruction data to the voice recognition server for uniform voice recognition, so that voice recognition of voice instructions of command scheduling clients of different operating systems can be realized, and the system compatibility of voice recognition of the command scheduling clients can be improved.
In an optional implementation manner of this embodiment, before sending the voice instruction packet to the voice recognition server, the method may further include:
acquiring a locally stored identity token, and generating a communication link establishment request according to the identity token;
and sending the communication link establishment request to the voice recognition server, and establishing a communication link with the voice recognition server when response data corresponding to the communication link establishment request sent by the voice recognition server is received.
In this embodiment, the command and dispatch client may locally store the identity token issued by the voice recognition server. Therefore, after the command scheduling client acquires the voice instruction message, it can first read the identity token from local storage, generate a communication link establishment request based on the identity token, and send the request to the voice recognition server. The voice recognition server can verify the identity token in the received communication link establishment request, and if the verification succeeds, it can generate response data corresponding to the communication link establishment request and send the response data to the command scheduling client so as to establish a communication link with the command scheduling client.
Correspondingly, sending the voice instruction packet to the voice recognition server may include:
and sending the voice instruction message to a voice recognition server based on a communication link with the voice recognition server.
Specifically, the command scheduling client may send the voice instruction packet to the voice recognition server based on a pre-established communication link with the voice recognition server. In addition, the command scheduling client and the voice recognition server can perform subsequent data transmission based on the communication link.
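The link establishment and subsequent transmission described above can be sketched as an in-process exchange. The class `RecognitionServer`, the function `client_send`, the request and response field names and the integer link identifiers are assumptions for illustration; a real deployment would carry these messages over the chosen communication protocol.

```python
import itertools


class RecognitionServer:
    """Hypothetical server that accepts links only for tokens it has issued."""

    def __init__(self, issued_tokens):
        self.issued_tokens = set(issued_tokens)
        self.links = {}
        self._ids = itertools.count(1)

    def establish_link(self, request):
        """Verify the identity token and return response data for the request."""
        token = request.get("identity_token")
        if token not in self.issued_tokens:
            return {"status": "rejected"}
        link_id = next(self._ids)
        self.links[link_id] = token
        return {"status": "accepted", "link_id": link_id}

    def receive(self, link_id, packet):
        """Accept a voice instruction message only over an established link."""
        if link_id not in self.links:
            raise PermissionError("no established communication link")
        return len(packet)  # placeholder for parsing and recognition


def client_send(server, stored_token, packet):
    """Client side: build the request from the locally stored token, then send."""
    response = server.establish_link({"identity_token": stored_token})
    if response["status"] != "accepted":
        return None
    return server.receive(response["link_id"], packet)
```

Because the link identifier is kept after establishment, later messages from the same client can reuse it without presenting the identity token again.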
In a specific implementation manner of this embodiment, a flow of a speech recognition method of the command and dispatch system may be as shown in fig. 2B. Specifically, the command and dispatch system may include three audio/video command and dispatch clients and a voice recognition server, each audio/video command and dispatch client may be deployed on a terminal device of a different operating system, and the voice recognition server is pre-deployed with a voice recognition SDK. Firstly, each audio and video command scheduling client can send voice instruction data to the voice recognition server, and after receiving the voice instruction data, the voice recognition server can call the voice recognition SDK to perform voice recognition on each voice instruction data so as to obtain a corresponding voice recognition result. Furthermore, the voice recognition server can respectively send each voice recognition result to the corresponding audio and video command and dispatch client, so that voice recognition of the command and dispatch system is realized.
Example three
Fig. 3 is a schematic structural diagram of a speech recognition device of a command scheduling system according to a third embodiment of the present invention. As shown in fig. 3, the apparatus may be applied to a voice recognition server, and includes: a first voice instruction message obtaining module 310, a first voice instruction data obtaining module 320 and a voice recognition result sending module 330; wherein,
a first voice instruction message obtaining module 310, configured to obtain a voice instruction message sent by a command and dispatch client;
a first voice instruction data obtaining module 320, configured to parse the voice instruction packet based on a preset communication protocol, and obtain voice instruction data corresponding to the voice instruction packet;
and the voice recognition result sending module 330 is configured to perform voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and send a voice recognition result to the command scheduling client.
According to the technical scheme of the embodiment of the invention, the voice instruction message sent by the command scheduling client is obtained through the voice recognition server, and the voice instruction message is analyzed based on the preset communication protocol, so that the voice instruction data corresponding to the voice instruction message is obtained; and then carrying out voice recognition on the voice instruction data through the pre-deployed voice recognition software development kit, sending a voice recognition result to the commanding and dispatching client, and deploying the voice recognition software development kit in the voice recognition server to carry out voice recognition on the voice instruction data of the commanding and dispatching client, so that the voice recognition of the commanding and dispatching clients of different operating systems can be realized, and the system compatibility of the voice recognition of the commanding and dispatching clients can be improved.
Optionally, the voice recognition apparatus of the command and dispatch system further includes:
the identity token acquisition module is used for acquiring an identity token corresponding to the commanding and scheduling client according to the communication link establishment request when the communication link establishment request sent by the commanding and scheduling client is received;
the communication link establishing module is used for generating response data corresponding to the communication link establishing request and sending the response data corresponding to the communication link establishing request to the commanding and dispatching client to establish a communication link with the commanding and dispatching client if the identity token corresponding to the commanding and dispatching client is detected to pass the verification successfully;
the first voice instruction packet obtaining module 310 is specifically configured to obtain, based on a communication link with the command and dispatch client, a voice instruction packet sent by the command and dispatch client.
Optionally, the voice recognition apparatus of the command scheduling system further includes:
the identity token expiration response data generating module is used for generating identity token expiration response data and sending the identity token expiration response data to the command and dispatch client if the identity token corresponding to the command and dispatch client is detected to be expired;
the command scheduling client is used for generating an identity token updating request when detecting the identity token expiration response data sent by the voice recognition server, and sending the identity token updating request to the voice recognition server;
the updating identity token acquisition module is used for acquiring an identity token updating request sent by the command scheduling client and acquiring an updating identity token according to the identity token updating request;
and the update identity token sending module is used for sending the update identity token to the command scheduling client so as to complete the update operation of the identity token.
Optionally, the voice recognition apparatus of the command and dispatch system further includes:
the thread establishing module is used for establishing a thread corresponding to the command scheduling client;
the first voice instruction data obtaining module 320 is specifically configured to parse, based on a preset communication protocol, the voice instruction packet through a thread corresponding to the command and dispatch client, and obtain voice instruction data corresponding to the voice instruction packet;
the voice recognition result sending module 330 is specifically configured to perform voice recognition on the voice instruction data by using a pre-deployed voice recognition software development kit through a thread corresponding to the command and dispatch client, and send a voice recognition result to the command and dispatch client.
The voice recognition device of the command scheduling system provided by the embodiment of the invention can execute the voice recognition method of the command scheduling system provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of a speech recognition device of a command and dispatch system according to a fourth embodiment of the present invention. As shown in fig. 4, the apparatus may be applied to a command scheduling client, and includes: a second voice instruction data acquisition module 410, a second voice instruction message acquisition module 420 and a voice recognition result receiving module 430; wherein,
a second voice instruction data obtaining module 410, configured to respond to a voice scheduling request of a user, and obtain voice instruction data;
a second voice instruction message obtaining module 420, configured to obtain a voice instruction message according to the voice instruction data and a preset communication protocol, and send the voice instruction message to a voice recognition server;
a voice recognition result receiving module 430, configured to receive a voice recognition result corresponding to the voice instruction data sent by the voice recognition server, and perform command scheduling based on the voice recognition result.
According to the technical scheme of the embodiment of the invention, the command scheduling client responds to the voice scheduling request of the user to acquire the voice instruction data, acquires the voice instruction message according to the voice instruction data and the preset communication protocol, and sends the voice instruction message to the voice recognition server; and then, receiving a voice recognition result corresponding to the voice instruction data sent by the voice recognition server, carrying out command scheduling based on the voice recognition result, and sending the voice instruction data to the voice recognition server for uniform voice recognition, so that voice recognition of voice instructions of command scheduling clients of different operating systems can be realized, and the system compatibility of voice recognition of the command scheduling clients can be improved.
Optionally, the voice recognition apparatus of the command scheduling system further includes:
the communication link establishment request generation module is used for acquiring a locally stored identity token and generating a communication link establishment request according to the identity token;
a communication link establishment request sending module, configured to send the communication link establishment request to the voice recognition server, and establish a communication link with the voice recognition server when receiving response data corresponding to the communication link establishment request sent by the voice recognition server;
the second voice instruction packet obtaining module 420 is specifically configured to send the voice instruction packet to the voice recognition server based on the communication link with the voice recognition server.
The voice recognition device of the command scheduling system provided by the embodiment of the invention can execute the voice recognition method of the command scheduling system provided by the second embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
It should be noted that, in the technical solution of the present embodiment, the acquisition, storage, application, and the like of the personal information of the related user all comply with the relevant laws and regulations, and do not violate public order and good customs.
Example five
FIG. 5 illustrates a schematic diagram of an electronic device 50 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 50 includes at least one processor 51, and a memory communicatively connected to the at least one processor 51, such as a Read Only Memory (ROM) 52, a Random Access Memory (RAM) 53, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 51 may perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 52 or the computer program loaded from a storage unit 58 into the Random Access Memory (RAM) 53. In the RAM 53, various programs and data necessary for the operation of the electronic apparatus 50 can also be stored. The processor 51, the ROM 52, and the RAM 53 are connected to each other via a bus 54. An input/output (I/O) interface 55 is also connected to bus 54.
A plurality of components in the electronic apparatus 50 are connected to the I/O interface 55, including: an input unit 56 such as a keyboard, a mouse, or the like; an output unit 57 such as various types of displays, speakers, and the like; a storage unit 58 such as a magnetic disk, an optical disk, or the like; and a communication unit 59 such as a network card, modem, wireless communication transceiver, etc. The communication unit 59 allows the electronic device 50 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 51 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processors 51 include, but are not limited to, Central Processing Units (CPUs), Graphics Processing Units (GPUs), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processors, controllers, microcontrollers, and the like. The processor 51 performs the various methods and processes described above, such as the voice recognition method of the command dispatch system.
In some embodiments, the speech recognition method of the command and dispatch system may be implemented as a computer program, tangibly embodied in a computer-readable storage medium, such as storage unit 58. In some embodiments, part or all of the computer program may be loaded and/or installed onto electronic device 50 via ROM 52 and/or communications unit 59. When the computer program is loaded into RAM 53 and executed by processor 51, one or more steps of the speech recognition method of the command and dispatch system described above may be performed. Alternatively, in other embodiments, the processor 51 may be configured by any other suitable means (e.g., by means of firmware) to perform the speech recognition method of the command scheduling system.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Computer programs for implementing the methods of the present invention can be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A voice recognition method of a command scheduling system is applied to a voice recognition server and comprises the following steps:
acquiring a voice instruction message sent by a command scheduling client;
analyzing the voice instruction message based on a preset communication protocol to obtain voice instruction data corresponding to the voice instruction message;
and performing voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and sending a voice recognition result to the command scheduling client.
2. The method of claim 1, before obtaining the voice instruction message sent by the command scheduling client, further comprising:
when a communication link establishment request sent by the command scheduling client is received, acquiring an identity token corresponding to the command scheduling client according to the communication link establishment request;
if the identity token corresponding to the command scheduling client is detected to pass the verification successfully, generating response data corresponding to the communication link establishment request, and sending the response data corresponding to the communication link establishment request to the command scheduling client to establish a communication link with the command scheduling client;
the method for acquiring the voice instruction message sent by the commanding and scheduling client comprises the following steps:
and acquiring the voice instruction message sent by the command and dispatch client based on the communication link with the command and dispatch client.
3. The method of claim 2, after obtaining the identity token corresponding to the command scheduling client according to the communication link establishment request, further comprising:
if the identity token corresponding to the command and dispatch client is detected to be expired, generating identity token expiration response data, and sending the identity token expiration response data to the command and dispatch client;
the command scheduling client is used for generating an identity token updating request and sending the identity token updating request to the voice recognition server when detecting the identity token expiration response data sent by the voice recognition server;
acquiring an identity token updating request sent by the command scheduling client, and acquiring an updating identity token according to the identity token updating request;
and sending the updated identity token to the command scheduling client so as to complete the updating operation of the identity token.
4. The method of claim 1, after obtaining the voice instruction message sent by the command scheduling client, further comprising:
establishing a thread corresponding to the command scheduling client;
analyzing the voice instruction message based on a preset communication protocol to acquire voice instruction data corresponding to the voice instruction message, wherein the method comprises the following steps:
analyzing the voice instruction message based on a preset communication protocol through a thread corresponding to the command scheduling client, and acquiring voice instruction data corresponding to the voice instruction message;
performing voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit, and sending a voice recognition result to the command scheduling client, including:
and performing voice recognition on the voice instruction data by adopting a pre-deployed voice recognition software development kit through a thread corresponding to the command scheduling client, and sending a voice recognition result to the command scheduling client.
5. A voice recognition method of a command and dispatch system is applied to a command and dispatch client and comprises the following steps:
responding to a voice scheduling request of a user, and acquiring voice instruction data;
acquiring a voice instruction message according to the voice instruction data and a preset communication protocol, and sending the voice instruction message to a voice recognition server;
and receiving a voice recognition result corresponding to the voice instruction data sent by the voice recognition server, and conducting command scheduling based on the voice recognition result.
6. The method of claim 5, further comprising, prior to sending the voice instruction message to a voice recognition server:
acquiring a locally stored identity token, and generating a communication link establishment request according to the identity token;
sending the communication link establishment request to the voice recognition server, and establishing a communication link with the voice recognition server when response data corresponding to the communication link establishment request sent by the voice recognition server is received;
sending the voice instruction message to a voice recognition server, comprising:
and sending the voice instruction message to a voice recognition server based on a communication link with the voice recognition server.
7. A voice recognition device of a command and dispatch system is applied to a voice recognition server and comprises the following components:
the first voice instruction message acquisition module is used for acquiring a voice instruction message sent by a command scheduling client;
the first voice instruction data acquisition module is used for analyzing the voice instruction message based on a preset communication protocol and acquiring voice instruction data corresponding to the voice instruction message;
and the voice recognition result sending module is used for carrying out voice recognition on the voice instruction data through a pre-deployed voice recognition software development kit and sending a voice recognition result to the command scheduling client.
8. A voice recognition device for a command scheduling system, applied to a command scheduling client, the device comprising:
a second voice instruction data acquisition module, configured to acquire voice instruction data in response to a voice scheduling request of a user;
a second voice instruction message acquisition module, configured to acquire a voice instruction message according to the voice instruction data and a preset communication protocol and send the voice instruction message to a voice recognition server;
and a voice recognition result receiving module, configured to receive the voice recognition result corresponding to the voice instruction data sent by the voice recognition server and conduct command scheduling based on the voice recognition result.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the voice recognition method for a command scheduling system of any one of claims 1 to 4, or of any one of claims 5 to 6.
10. A computer-readable storage medium storing computer instructions that, when executed, cause a processor to implement the voice recognition method for a command scheduling system of any one of claims 1 to 4, or of any one of claims 5 to 6.
CN202211035571.XA 2022-08-26 2022-08-26 Voice recognition method, device, equipment and medium for command scheduling system Pending CN115410580A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211035571.XA CN115410580A (en) 2022-08-26 2022-08-26 Voice recognition method, device, equipment and medium for command scheduling system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211035571.XA CN115410580A (en) 2022-08-26 2022-08-26 Voice recognition method, device, equipment and medium for command scheduling system

Publications (1)

Publication Number Publication Date
CN115410580A true CN115410580A (en) 2022-11-29

Family

ID=84161073

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211035571.XA Pending CN115410580A (en) 2022-08-26 2022-08-26 Voice recognition method, device, equipment and medium for command scheduling system

Country Status (1)

Country Link
CN (1) CN115410580A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116072119A (en) * 2023-03-31 2023-05-05 北京华录高诚科技有限公司 Voice control system, method, electronic equipment and medium for emergency command

Similar Documents

Publication Publication Date Title
CN107277153B (en) Method, device and server for providing voice service
CN108897854B (en) Monitoring method and device for overtime task
CN109936587B (en) Control method, control device, electronic apparatus, and storage medium
CN109783427B (en) Method, server and system for realizing linked schedule reminding
CN111786939B (en) Method, device and system for testing management platform of Internet of things
CN113361838A (en) Business wind control method and device, electronic equipment and storage medium
CN108243222A (en) Server network architecture method and device
CN115410580A (en) Voice recognition method, device, equipment and medium for command scheduling system
CN115794313A (en) Virtual machine debugging method, system, electronic equipment and storage medium
CN106411713B (en) State notification method and server
CN109788251B (en) Video processing method, device and storage medium
CN114328132A (en) Method, device, equipment and medium for monitoring state of external data source
CN112887355B (en) Service processing method and device for abnormal server
CN106302432B (en) A kind of communication device and control method based on car networking
CN111767176A (en) Method and device for remotely controlling terminal equipment
CN115509714A (en) Task processing method and device, electronic equipment and storage medium
CN112992142B (en) Voice message reply method, device, equipment and medium
CN108989404A (en) A kind of barrage message issuing method, server, system and storage medium
CN114924937A (en) Batch task processing method and device, electronic equipment and computer readable medium
CN113590243A (en) Energy enterprise project creation method and device, computer equipment and medium
CN112333262A (en) Data updating prompting method and device, computer equipment and readable storage medium
CN110768855B (en) Method and device for testing linkmzation performance
CN108429741B (en) Method and system for realizing NCSI protocol
CN111782445A (en) Configuration method and device of equipment debugging environment
CN111368512B (en) Service data conversion method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination