CN111883126A - Data processing mode selection method and device and electronic equipment - Google Patents
- Publication number
- CN111883126A CN111883126A CN202010730769.4A CN202010730769A CN111883126A CN 111883126 A CN111883126 A CN 111883126A CN 202010730769 A CN202010730769 A CN 202010730769A CN 111883126 A CN111883126 A CN 111883126A
- Authority
- CN
- China
- Prior art keywords
- processing
- processing mode
- mode
- voice recognition
- processing request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The application discloses a method and device for selecting a data processing mode, and an electronic device. The method comprises the following steps: performing speech recognition on a received voice command to obtain a speech recognition result; analyzing the speech recognition result through a classification model to obtain a matched processing mode, the matched processing mode being an online processing mode or an offline processing mode; and processing the target processing request corresponding to the speech recognition result according to the matched processing mode. Because the classification model determines the matched processing mode directly from the speech recognition result, the slow and inefficient voice interaction of the prior art is avoided, so the efficiency of voice interaction is improved and the waste of computer processing resources is reduced.
Description
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to a method and an apparatus for selecting a data processing mode, and an electronic device.
Background
With the continuous progress of artificial intelligence, voice-based human-computer interaction has become widespread, especially in the smart-home field: smart speakers, voice-controlled electric cookers, voice-controlled air conditioners, and the like. Voice modules combining software and hardware have therefore been developed and mass-produced; a complete voice module provides at least Wi-Fi connectivity, automatic speech recognition (ASR), natural-language understanding (NLU), and speech synthesis, and the speed at which the module returns results has become an important factor in a user's purchase decision.
In the related art, most intelligent voice modules adopt both an offline and an online processing mode: results are obtained through both modes, a confidence is determined for each result, and the result with the higher confidence is returned. Generally, the online processing mode returns results more slowly than the offline processing mode, and returns better results only in some fields, such as music and news. The related art therefore suffers from slow processing and low efficiency in voice interaction.
Disclosure of Invention
The application aims to provide a method and device for selecting a data processing mode, and an electronic device, so as to solve the prior-art problems of slow processing and low efficiency in voice interaction.
In a first aspect, an embodiment of the present application provides a method for selecting a data processing mode, including:
carrying out voice recognition on the received voice command to obtain a voice recognition result;
analyzing the voice recognition result through a classification model to obtain a matched processing mode; the matching processing mode is an online processing mode or an offline processing mode;
and processing the target processing request corresponding to the voice recognition result according to the matched processing mode.
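The three steps of the first aspect can be sketched as a minimal pipeline. The function names, the byte-decoding stand-in for the ASR module, and the keyword rule standing in for the trained classification model are illustrative assumptions only, not from the patent:

```python
def recognize(audio: bytes) -> str:
    """Stand-in for the ASR step: a real module would decode audio."""
    return audio.decode("utf-8")

def classify(text: str) -> str:
    """Stand-in for the classification model: returns the matched mode."""
    # Toy keyword rule in place of the trained model described later.
    online_hints = ("weather", "news", "play")
    return "online" if any(w in text for w in online_hints) else "offline"

def handle(audio: bytes) -> str:
    text = recognize(audio)   # step 1: speech recognition
    mode = classify(text)     # step 2: select the matched processing mode
    return f"{mode}: {text}"  # step 3: process the request in that mode

print(handle(b"play today's news"))    # -> online: play today's news
print(handle(b"increase the volume"))  # -> offline: increase the volume
```

The point of the structure is that the mode decision happens once, before any processing, rather than after both modes have returned results.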
In this method, the classification model selects the processing mode matched to the voice command, and the command is then processed only through that mode. The related art waits for both the online and the offline processing mode to return results, computes a confidence for each, and compares them to pick an answer. The method provided by the application instead first determines the matched processing mode through the classification model and processes the speech recognition result only through that mode, eliminating the wait for results from all processing modes and the confidence comparison, and thereby improving the processing speed and efficiency of voice interaction.
In a possible embodiment, before the parsing the speech recognition result through the classification model, the method further includes:
and determining that the current network state meets the processing condition of the online processing mode.
The beneficial effect of this embodiment is: to ensure that the speech recognition result can be processed normally through the matched processing mode, it is confirmed before analysis through the classification model that the current network state can meet the requirements of the online processing mode. This guarantees the accuracy of the processing-mode selection and avoids the situation in which the matched mode, once selected, cannot process the speech recognition result.
In a possible embodiment, the method further comprises:
and if the current network state does not meet the processing condition of the online processing mode, determining the matched processing mode as the offline processing mode.
The beneficial effect of this embodiment is: when the current network state is determined not to meet the processing condition of the online processing mode, processing proceeds directly in the offline processing mode. In addition, because the target processing requests in some speech recognition results obtain better results through the online processing mode, a prompt about the network state can be sent to the user when the current state cannot meet the online condition; once the network recovers, analysis through the classification model can match among all possible processing modes, improving the accuracy with which the target processing request corresponding to the voice command is processed.
In a possible embodiment, the processing the target processing request corresponding to the speech recognition result according to the matching processing manner includes:
if the matched processing mode is the online processing mode and the target processing request is an operation-type processing request, acquiring the operation instruction corresponding to the target processing request from the network online and then executing the corresponding operation; or
if the matched processing mode is the offline processing mode and the target processing request is an operation-type processing request, acquiring the operation instruction corresponding to the target processing request from local storage offline and then executing the corresponding operation; or
if the matched processing mode is the online processing mode and the target processing request is a query-type processing request, acquiring the query result corresponding to the target processing request from the network online and then returning it; or
if the matched processing mode is the offline processing mode and the target processing request is a query-type processing request, acquiring the query result corresponding to the target processing request from local storage offline and then returning it.
The beneficial effect of this embodiment is: after the matched processing mode is determined, processing differs according to the type of target processing request determined from the speech recognition result, which improves both the efficiency and the accuracy of voice interaction.
In a possible embodiment, the classification model is obtained based on the following method:
acquiring training samples, each containing a matching relation between a speech recognition result and a processing mode;
inputting the training samples into a BERT model for feature-vector extraction to obtain the feature vector corresponding to each training sample;
and taking each feature vector as the input of a twin support vector machine (TWSVM), training the TWSVM based on the sequential minimal optimization (SMO) algorithm, and taking the trained TWSVM as the classification model.
The beneficial effect of this embodiment is: the classification model can analyze the speech recognition result obtained after speech recognition of the voice command, yielding the processing mode matched to the command, which improves the efficiency of voice interaction while preserving its accuracy.
In one possible embodiment, if the TWSVM is an untrained TWSVM, the training samples are historical data;
and if the TWSVM is a trained TWSVM, the training samples are historical data and/or speech recognition results paired with the matched processing modes obtained by analyzing those results through the classification model.
The beneficial effect of this embodiment is: after the matched processing mode for a speech recognition result is obtained through the classification model, that result and its matched mode can be used as a new training sample, and/or further historical data can be acquired as new training samples, so the classification model is continuously adjusted and the accuracy of its analysis is improved through incremental training.
In a second aspect, an embodiment of the present application provides a device for selecting a data processing method, where the device includes:
the voice recognition unit is used for carrying out voice recognition on the received voice command to obtain a voice recognition result;
the analysis unit is used for analyzing the voice recognition result through the classification model to obtain a matched processing mode; the matching processing mode is an online processing mode or an offline processing mode;
and the processing unit is used for processing the target processing request corresponding to the voice recognition result according to the matched processing mode.
In a possible embodiment, the apparatus further comprises:
and the determining unit is used for determining that the current network state meets the processing condition of the online processing mode before the speech recognition result is analyzed through the classification model.
In a possible embodiment, the determining unit is further configured to:
and if the current network state does not meet the processing condition of the online processing mode, determining the matched processing mode as the offline processing mode.
In a possible embodiment, the processing unit is configured to process the target processing request corresponding to the speech recognition result according to the matched processing mode, and is specifically configured to:
if the matched processing mode is the online processing mode and the target processing request is an operation-type processing request, acquire the operation instruction corresponding to the target processing request from the network online and then execute the corresponding operation; or
if the matched processing mode is the offline processing mode and the target processing request is an operation-type processing request, acquire the operation instruction corresponding to the target processing request from local storage offline and then execute the corresponding operation; or
if the matched processing mode is the online processing mode and the target processing request is a query-type processing request, acquire the query result corresponding to the target processing request from the network online and then return it; or
if the matched processing mode is the offline processing mode and the target processing request is a query-type processing request, acquire the query result corresponding to the target processing request from local storage offline and then return it.
In a possible embodiment, the classification model is obtained based on the following method:
acquiring training samples, each containing a matching relation between a speech recognition result and a processing mode;
inputting the training samples into a BERT model for feature-vector extraction to obtain the feature vector corresponding to each training sample;
and taking each feature vector as the input of a twin support vector machine (TWSVM), training the TWSVM based on the sequential minimal optimization (SMO) algorithm, and taking the trained TWSVM as the classification model.
In one possible embodiment, if the TWSVM is an untrained TWSVM, the training samples are historical data;
and if the TWSVM is a trained TWSVM, the training samples are historical data and/or speech recognition results paired with the matched processing modes obtained by analyzing those results through the classification model.
In a third aspect, another embodiment of the present application further provides an electronic device comprising at least one processor and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute any of the data-processing-mode selection methods provided by the embodiments of the application.
In a fourth aspect, another embodiment of the present application further provides a computer storage medium storing a computer program for enabling a computer to execute any of the data-processing-mode selection methods in the embodiments of the present application.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is an application scenario diagram of a method for selecting a data processing manner according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a method for selecting a data processing method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a method for training a classification model according to an embodiment of the present application;
fig. 4 is another schematic flow chart of a method for selecting a data processing method according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a selection apparatus for data processing modes according to an embodiment of the present application;
fig. 6 is a schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that such descriptions are interchangeable under appropriate circumstances such that the embodiments of the disclosure can be practiced in sequences other than those illustrated or described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
In the related art, most intelligent voice modules adopt both an offline and an online processing mode: results are obtained through both modes, a confidence is determined for each result, and the result with the higher confidence is returned. Generally, the online processing mode returns results more slowly than the offline processing mode, and returns better results only in some fields, such as music and news. The related art therefore suffers from slow processing and low efficiency in voice interaction.
In view of this, the present application provides a method for selecting a data processing mode: by determining the processing mode matched to the voice command, the target processing request of the command is processed through the matched mode, improving the efficiency of voice interaction.
For example, referring to fig. 1, a scene diagram of the method for selecting a data processing mode provided by an embodiment of the present application is shown. The scenario may comprise a user 10, a terminal device 11, and a server 12. Various clients may be installed on the terminal device 11, such as a microblog client for social contact, a WeChat client, or a client for voice-interaction control. After the client on the terminal device 11 establishes a communication connection with the server 12, the client obtains the voice command and sends it to the server 12. The server 12 then performs speech recognition on the received voice command to obtain a speech recognition result and analyzes the result through the classification model to obtain the matched processing mode; finally, the target processing request corresponding to the speech recognition result is processed according to the matched processing mode.
The terminal device 11 and the server 12 may be communicatively connected through a communication network, which may be a local area network, a wide area network, or the like. The terminal device 11 may be any device that can be used for voice interaction, such as a smart speaker, a smart rice cooker, a voice air conditioner, and the like, and the server 12 may be any server device that can support a selection method of a data processing manner.
Based on the above design concept, referring to fig. 2, a schematic flow chart of a method for selecting a data processing method according to an embodiment of the present application includes:
step S201: and carrying out voice recognition on the received voice command to obtain a voice recognition result.
In a possible embodiment, the device for intelligent interaction comprises a voice module that provides at least Wi-Fi connectivity, automatic speech recognition (ASR), natural-language understanding (NLU), and speech synthesis. After the voice module is woken up, it receives the voice command and performs speech recognition on it to obtain the speech recognition result.
To ensure that the mode matched by the classification model can process the result normally, it is determined before running the model that the current network state meets the processing condition of the online processing mode; that is, both the online and the offline mode must be able to process the speech recognition result. Then, whichever mode the classifier selects, the result can be processed, preserving the user experience.
In addition, if the current network state does not meet the processing condition of the online processing mode, the matched processing mode is determined to be the offline processing mode. Alternatively, a reminder that the current network state does not support the online processing mode is sent, so the user can restore the network, the online condition can be met, and the matched processing mode can then be determined through the classification model.
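The network-state gate described above can be sketched as follows. The TCP connectivity probe and all function names are assumptions, since the patent does not specify how the network state is tested:

```python
import socket

def online_condition_met(host: str = "8.8.8.8", port: int = 53,
                         timeout: float = 1.0) -> bool:
    """Heuristic probe: can we open a TCP connection to a well-known host?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def select_mode(text: str, classify, network_ok: bool) -> str:
    """Fall back to offline when the online mode's condition is not met."""
    if not network_ok:
        return "offline"          # online mode unusable: skip the model
    return classify(text)         # both modes usable: let the model decide

# With no network, the matched mode is offline regardless of the model.
print(select_mode("weather today", lambda t: "online", network_ok=False))
```

Passing the network state in as a parameter keeps the decision testable without a live connection.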
Step S202: analyzing the voice recognition result through a classification model to obtain a matched processing mode; and the matching processing mode is an online processing mode or an offline processing mode.
In an implementation manner, referring to fig. 3, a schematic flow chart of a training method of a classification model provided in an embodiment of the present application includes:
step 301: and acquiring a training sample containing a plurality of matching relations between the voice recognition results and the processing modes.
The data attributes of a training sample at least include a user's target processing request and its matched processing mode, online or offline. For ease of understanding, table 1 below gives several simple examples of training samples:
TABLE 1

| User's target processing request | Matched processing mode |
| --- | --- |
| How is the weather today | Online processing |
| Beijing's weather today | Online processing |
| Play today's news | Online processing |
| Increase the volume | Offline processing |
| Open XX software | Offline processing |
| …… | …… |
It should be noted that ways of acquiring training samples include, but are not limited to: returned results determined by the traditional combined online/offline selection; matched processing modes determined by users' own judgments of how a target processing request should be processed; purchased professional data; and the like.
Step 302: inputting the training samples into a BERT (Bidirectional Encoder Representations from Transformers) model for feature-vector extraction, obtaining the feature vector corresponding to each training sample.
Step 303: taking each feature vector as an input of a twin support vector machine (TWSVM), training the TWSVM based on the sequential minimal optimization (SMO) algorithm, and taking the trained TWSVM as the classification model.
In implementation, the TWSVM is trained based on the SMO algorithm, and a classification model for analyzing the speech recognition result of a voice command is obtained through supervised learning over the training samples. The TWSVM yields, for each input speech recognition result, the best-matched processing mode as output, and the SMO algorithm speeds up training and analysis. Through supervised learning on the training samples, the TWSVM learns a classification rule, providing the basis for selecting the matched processing mode through the classification model.
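The shape of the embed-then-classify pipeline can be shown with a runnable toy. To stay self-contained, a hashed bag-of-words encoder stands in for BERT and a nearest-centroid rule stands in for the SMO-trained twin SVM; neither is the patent's actual model, and all names and the sample data are illustrative:

```python
import hashlib

def embed(text: str, dim: int = 64) -> list:
    """Stand-in for the BERT encoder: a hashed bag-of-words vector.
    A real system would use a sentence embedding from a BERT model."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        bucket = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec

def train(samples):
    """Stand-in for fitting the twin SVM with SMO: one centroid per class.
    (A TWSVM fits one hyperplane per class; a centroid plays a loosely
    similar per-class role in this toy version.)"""
    sums, counts = {}, {}
    for text, label in samples:
        v = embed(text)
        acc = sums.setdefault(label, [0.0] * len(v))
        for i, x in enumerate(v):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {lab: [x / counts[lab] for x in acc] for lab, acc in sums.items()}

def predict(model, text: str) -> str:
    """Return the class whose centroid is nearest to the embedded text."""
    v = embed(text)
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(v, c))
    return min(model, key=lambda lab: dist(model[lab]))

samples = [
    ("how is the weather today", "online"),
    ("play today's news", "online"),
    ("increase the volume", "offline"),
    ("open the settings app", "offline"),
]
model = train(samples)
print(predict(model, "increase the volume"))  # -> offline
```

Swapping in real BERT embeddings and a real TWSVM solver would change only `embed` and `train`; the surrounding flow stays the same.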
To improve the accuracy of the classification model, the embodiments provided by the application adopt incremental training to adjust and optimize the model obtained by the above training method. In implementation, if the TWSVM is already trained (i.e., is the classification model), the training samples are historical data and/or pairs of speech recognition results and the matched processing modes obtained by analyzing those results through the model. That is, while the trained classification model is in service, each analyzed voice command and its matched mode can be used as a new training sample to update the trained TWSVM, making subsequent matching more accurate.
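The incremental-training loop above amounts to folding service data back into the training set and refitting. A minimal sketch, with a majority-vote classifier as a toy stand-in for the TWSVM and all names illustrative:

```python
def retrain_with_feedback(samples, observed, trainer):
    """Fold newly observed (utterance, matched-mode) pairs into the
    training set and refit, as the incremental step above describes."""
    updated = samples + observed   # keep history, append service data
    return trainer(updated), updated

def majority_trainer(samples):
    """Toy stand-in for refitting the TWSVM: predict the majority label."""
    labels = [label for _, label in samples]
    top = max(set(labels), key=labels.count)
    return lambda text: top

clf, history = retrain_with_feedback(
    [("play news", "online")],
    [("increase volume", "offline"), ("open settings", "offline")],
    majority_trainer,
)
print(clf("any utterance"), len(history))  # -> offline 3
```

The `trainer` parameter is the seam where a real TWSVM fit would plug in.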
In addition, other embodiments for obtaining the classification model include: training through a clustering algorithm (unsupervised learning), such as DBSCAN (density-based spatial clustering of applications with noise), K-means clustering, or a Gaussian mixture model (GMM); training through a decision-tree method; or training a hidden Markov model (HMM) to obtain the classification model; and the like.
Step S203: and processing the target processing request corresponding to the voice recognition result according to the matched processing mode.
After the classification model is obtained in the manner shown in fig. 3, the model embodies a mapping between input and output; that is, once the speech recognition result obtained from the user's voice command is input, the best-matched processing mode is obtained through this mapping.
In implementation, the processing modes comprise the online processing mode and the offline processing mode, and users' target processing requests are of different types, so the returned results obtained for different request types differ. The application provides several possible scenes, as follows:
scene 1: and if the matched processing mode is the online processing mode and the target processing request is an operation type processing request, acquiring an operation instruction corresponding to the target processing request from a network in an online mode, and then executing corresponding operation according to the operation instruction.
For example, if the voice command is "update the current weather information", the matched processing mode may be determined to be online processing based on analysis of the speech recognition result; the target processing request corresponding to the command is an operation-type request, namely an update, so after the current weather information is acquired online, the stored weather information is updated to it.
Scene 2: if the matched processing mode is the offline processing mode and the target processing request is an operation-type processing request, the operation instruction corresponding to the target processing request is acquired from local storage in the offline mode, and the corresponding operation is then executed according to the operation instruction.
For example, if the voice command is "increase volume", the matching processing mode is determined to be offline processing; the command is an operation-type processing request, so the volume is increased based on the target processing request.
Scene 3: if the matched processing mode is the online processing mode and the target processing request is a query-type processing request, the query result corresponding to the target processing request is acquired from the network in the online mode and then returned.
For example, if the voice command is "what does 1+1 equal", the command is a query-type processing request; after the matching processing mode is determined to be online processing, the corresponding result is obtained from the network and the query result is returned to the user. Optionally, the result is returned to the user by voice playback, or pushed to the user's terminal.
Scene 4: if the matched processing mode is the offline processing mode and the target processing request is a query-type processing request, the query result corresponding to the target processing request is acquired from local storage in the offline mode and then returned.
For example, if the voice command is "what is Xiaohong's phone number", the matching processing mode is determined to be offline processing and the command is a query-type processing request; based on the target processing request, Xiaohong's number is searched in the locally stored contacts and the queried phone number is returned.
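The four scenes amount to a two-axis dispatch over (processing mode, request type). A hedged sketch follows, with toy lookup tables standing in for the network and local storage; all names and data are illustrative:

```python
# Illustrative dispatch over (processing mode, request type). The lookup
# tables stand in for the network and local storage of the four scenes.
NETWORK = {"current weather": "sunny, 25C", "1+1": "2"}
LOCAL = {"xiaohong": "138-0000-0000"}

def handle(mode: str, request_type: str, key: str) -> str:
    source = NETWORK if mode == "online" else LOCAL
    result = source.get(key, "not found")
    if request_type == "operation":
        return f"executed operation using: {result}"  # scenes 1 and 2
    return result  # scenes 3 and 4: return the query result

answer = handle("online", "query", "1+1")
phone = handle("offline", "query", "xiaohong")
```

Operation-type requests execute an action with the retrieved instruction, while query-type requests simply return what was retrieved.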
By the data processing mode selection method, a classification model is provided to select the matching processing mode for a voice instruction, so that the target processing request corresponding to the instruction is handled by only the matching mode, online or offline. Compared with the prior art, in which the instruction is processed in both the online and offline processing modes and the result is chosen by comparing the confidence degrees of the two, this improves voice interaction efficiency, reduces the user's waiting time during voice interaction, and improves user experience.
To more clearly understand the method for selecting a data processing method provided by the present application, referring to fig. 4, a schematic flow chart of a method for selecting a data processing method provided by another embodiment of the present application includes:
Step 401: and receiving a wake-up instruction of a user for the voice module.
Step 402: and performing voice recognition on the received voice command.
Step 403: and acquiring a voice recognition result.
Step 404: and judging whether the current network state meets the processing requirement of the online processing mode.
If yes, go on to step 405; otherwise, step 406B is performed.
Step 405: and analyzing the voice recognition result through a classification model, and determining a matching processing mode.
Step 406A: and processing the target processing request corresponding to the voice recognition result in an online processing mode.
Step 406B: and processing the target processing request corresponding to the voice recognition result in an off-line processing mode.
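Steps 401 to 406B can be sketched end to end as follows; the classify() rule and the network_ok flag are stand-ins for the classification model and the step 404 network check:

```python
# Minimal sketch of the fig. 4 flow. classify() is a stand-in for the
# classification model; the network_ok flag stands in for the step 404 check.
def classify(result: str) -> str:
    # illustrative rule: route device-control commands offline
    return "offline" if "volume" in result else "online"

def select_and_process(recognition_result: str, network_ok: bool) -> str:
    if not network_ok:
        mode = "offline"                      # network fails: step 406B directly
    else:
        mode = classify(recognition_result)   # step 405: pick the matching mode
    return f"processed '{recognition_result}' in {mode} mode"  # steps 406A/406B

out_no_net = select_and_process("update weather", network_ok=False)
out_net = select_and_process("update weather", network_ok=True)
```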
Based on the same conception, the embodiment of the application also provides a device for selecting the data processing mode.
As shown in fig. 5, the data processing mode selection apparatus 500 may include: a voice recognition unit 510, an analysis unit 520, and a processing unit 530.
A voice recognition unit 510, configured to perform voice recognition on the received voice instruction to obtain a voice recognition result;
an analyzing unit 520, configured to analyze the voice recognition result through the classification model to obtain a matching processing mode; the matching processing mode is an online processing mode or an offline processing mode;
the processing unit 530 is configured to process the target processing request corresponding to the speech recognition result according to the matching processing manner.
In a possible embodiment, the apparatus further comprises:
and the determining unit is used for determining that the current network state meets the processing condition of the online processing mode before the speech recognition result is analyzed through the classification model.
In a possible embodiment, the determining unit is further configured to:
and if the current network state does not meet the processing condition of the online processing mode, determining the matched processing mode as the offline processing mode.
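One common way to approximate the network-state condition checked by the determining unit is a short-timeout TCP probe; the endpoint, port, and timeout below are assumptions, not requirements stated by the application:

```python
import socket

def network_ok(host: str = "8.8.8.8", port: int = 53, timeout: float = 1.5) -> bool:
    """Return True if a TCP connection to a well-known endpoint succeeds.

    The endpoint and timeout are illustrative; a production check might also
    measure latency or bandwidth against the online mode's processing condition.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

state = network_ok()
```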
In a possible embodiment, the processing unit 530 is configured to process, according to the matching processing manner, the target processing request corresponding to the speech recognition result, and is specifically configured to:
if the matched processing mode is the online processing mode and the target processing request is an operation-type processing request, acquire an operation instruction corresponding to the target processing request from the network in the online mode, and then execute the corresponding operation according to the operation instruction; or
if the matched processing mode is the offline processing mode and the target processing request is an operation-type processing request, acquire the operation instruction corresponding to the target processing request from local storage in the offline mode, and then execute the corresponding operation according to the operation instruction; or
if the matched processing mode is the online processing mode and the target processing request is a query-type processing request, acquire a query result corresponding to the target processing request from the network in the online mode and then return the query result; or
if the matched processing mode is the offline processing mode and the target processing request is a query-type processing request, acquire the query result corresponding to the target processing request from local storage in the offline mode and then return the query result.
In a possible embodiment, the classification model is obtained based on the following method:
acquiring training samples containing a plurality of matching relations between voice recognition results and processing modes;
inputting the training samples into a BERT model for extracting feature vectors to obtain the feature vectors corresponding to the training samples;
and taking each feature vector as the input of a twin support vector machine (TWSVM), training the TWSVM based on the Sequential Minimal Optimization (SMO) algorithm, and taking the trained TWSVM as the classification model.
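A hedged sketch of this training step: fixed toy vectors stand in for the BERT feature vectors, and a simple perceptron stands in for the twin support vector machine and its SMO-style optimizer (both substitutions are illustrative, not the patent's method):

```python
# Toy training loop. Each sample pairs a "feature vector" (stand-in for a
# BERT embedding) with a label: 1 = online processing, -1 = offline processing.
samples = [
    ((1.0, 0.0), 1), ((0.9, 0.1), 1),    # e.g. query-like utterances
    ((0.0, 1.0), -1), ((0.1, 0.9), -1),  # e.g. device-control utterances
]

def train_perceptron(data, epochs=20, lr=0.5):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            score = w[0] * x[0] + w[1] * x[1] + b
            if y * score <= 0:  # misclassified: nudge the separating plane
                w = [w[0] + lr * y * x[0], w[1] + lr * y * x[1]]
                b += lr * y
    return w, b

w, b = train_perceptron(samples)

def predict(x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

p_online = predict((0.95, 0.05))
p_offline = predict((0.05, 0.95))
```

A real implementation would solve the two TWSVM quadratic programs; the point here is only the shape of the data flow from feature vectors to a binary online/offline decision.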
In one possible embodiment, if the TWSVM is an untrained TWSVM, the training samples are historical data;
and if the TWSVM is a trained TWSVM, the training samples are historical data and/or data pairing voice recognition results with the matched processing modes obtained by analyzing those results through the classification model.
For the specific implementation of the data processing mode selection apparatus and its functional modules, reference may be made to the description above in conjunction with fig. 1 to 4, which is not repeated here.
After a method and an apparatus for selecting a data processing mode according to an exemplary embodiment of the present application are introduced, an electronic device according to another exemplary embodiment of the present application is introduced next.
As will be appreciated by one skilled in the art, aspects of the present application may be embodied as a system, method, or program product. Accordingly, various aspects of the present application may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."
In some possible implementations, an electronic device according to the present application may include at least one processor, and at least one memory. The memory stores program code, and the program code, when executed by the processor, causes the processor to execute the steps of the method for selecting a data processing mode according to various exemplary embodiments of the present application described above in the present specification. For example, the processor may perform the steps shown in fig. 2-4.
The electronic device 130 according to this embodiment of the present application is described below with reference to fig. 6. The electronic device 130 shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the electronic device 130 is represented in the form of a general electronic device. The components of the electronic device 130 may include, but are not limited to: the at least one processor 131, the at least one memory 132, and a bus 133 that connects the various system components (including the memory 132 and the processor 131).
The memory 132 may include readable media in the form of volatile memory, such as Random Access Memory (RAM)1321 and/or cache memory 1322, and may further include Read Only Memory (ROM) 1323.
The electronic device 130 may also communicate with one or more external devices 134 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with the electronic device 130, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 130 to communicate with one or more other electronic devices. Such communication may occur via input/output (I/O) interfaces 135. Also, the electronic device 130 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 136. As shown, network adapter 136 communicates with other modules for electronic device 130 over bus 133. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 130, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some possible embodiments, aspects of a data processing manner selection method provided by the present application may also be implemented in the form of a program product including computer program code for causing a computer device to perform the steps of a data processing manner selection method according to various exemplary embodiments of the present application described above in this specification when the program product is run on a computer device, for example, the computer device may perform the steps as shown in fig. 2 to 4.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product of the embodiments of the present application may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on an electronic device. However, the program product of the present application is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the consumer electronic device, partly on the consumer electronic device, as a stand-alone software package, partly on the consumer electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server. In the case of remote electronic devices, the remote electronic devices may be connected to the consumer electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external electronic device (e.g., through the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more units described above may be embodied in one unit, according to embodiments of the application. Conversely, the features and functions of one unit described above may be further divided into embodiments by a plurality of units.
Further, while the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (14)
1. A method for selecting a data processing mode is characterized by comprising the following steps:
carrying out voice recognition on the received voice command to obtain a voice recognition result;
analyzing the voice recognition result through a classification model to obtain a matched processing mode; the matching processing mode is an online processing mode or an offline processing mode;
and processing the target processing request corresponding to the voice recognition result according to the matched processing mode.
2. The method of claim 1, wherein prior to parsing the speech recognition results through the classification model, the method further comprises:
and determining that the current network state meets the processing condition of the online processing mode.
3. The method of claim 2, further comprising:
and if the current network state does not meet the processing condition of the online processing mode, determining the matched processing mode as the offline processing mode.
4. The method according to claim 1, wherein the processing the target processing request corresponding to the speech recognition result according to the matching processing manner includes:
if the matched processing mode is the online processing mode and the target processing request is an operation-type processing request, acquiring an operation instruction corresponding to the target processing request from the network in the online mode, and then executing the corresponding operation according to the operation instruction; or
if the matched processing mode is the offline processing mode and the target processing request is an operation-type processing request, acquiring the operation instruction corresponding to the target processing request from local storage in the offline mode, and then executing the corresponding operation according to the operation instruction; or
if the matched processing mode is the online processing mode and the target processing request is a query-type processing request, acquiring a query result corresponding to the target processing request from the network in the online mode and then returning the query result; or
if the matched processing mode is the offline processing mode and the target processing request is a query-type processing request, acquiring the query result corresponding to the target processing request from local storage in the offline mode and then returning the query result.
5. The method of claim 1, wherein the classification model is derived based on the following method:
acquiring training samples containing a plurality of matching relations between voice recognition results and processing modes;
inputting the training samples into a BERT model for extracting feature vectors to obtain the feature vectors corresponding to the training samples;
and taking each feature vector as the input of a twin support vector machine (TWSVM), training the TWSVM based on the Sequential Minimal Optimization (SMO) algorithm, and taking the trained TWSVM as the classification model.
6. The method of claim 5, wherein the training samples are historical data if the TWSVM is an untrained TWSVM;
and if the TWSVM is a trained TWSVM, the training samples are historical data and/or data pairing voice recognition results with the matched processing modes obtained by analyzing those results through the classification model.
7. An apparatus for selecting a data processing method, the apparatus comprising:
the voice recognition unit is used for carrying out voice recognition on the received voice command to obtain a voice recognition result;
the analysis unit is used for analyzing the voice recognition result through the classification model to obtain a matched processing mode; the matching processing mode is an online processing mode or an offline processing mode;
and the processing unit is used for processing the target processing request corresponding to the voice recognition result according to the matched processing mode.
8. The apparatus of claim 7, further comprising:
and the determining unit is used for determining that the current network state meets the processing condition of the online processing mode before the speech recognition result is analyzed through the classification model.
9. The apparatus of claim 8, wherein the determining unit is further configured to:
and if the current network state does not meet the processing condition of the online processing mode, determining the matched processing mode as the offline processing mode.
10. The apparatus according to claim 7, wherein the processing unit is configured to process the target processing request corresponding to the speech recognition result according to the matching processing manner, and is specifically configured to:
if the matched processing mode is the online processing mode and the target processing request is an operation-type processing request, acquire an operation instruction corresponding to the target processing request from the network in the online mode, and then execute the corresponding operation according to the operation instruction; or
if the matched processing mode is the offline processing mode and the target processing request is an operation-type processing request, acquire the operation instruction corresponding to the target processing request from local storage in the offline mode, and then execute the corresponding operation according to the operation instruction; or
if the matched processing mode is the online processing mode and the target processing request is a query-type processing request, acquire a query result corresponding to the target processing request from the network in the online mode and then return the query result; or
if the matched processing mode is the offline processing mode and the target processing request is a query-type processing request, acquire the query result corresponding to the target processing request from local storage in the offline mode and then return the query result.
11. The apparatus of claim 7, wherein the classification model is obtained based on the following method:
acquiring training samples containing a plurality of matching relations between voice recognition results and processing modes;
inputting the training samples into a BERT model for extracting feature vectors to obtain the feature vectors corresponding to the training samples;
and taking each feature vector as the input of a twin support vector machine (TWSVM), training the TWSVM based on the Sequential Minimal Optimization (SMO) algorithm, and taking the trained TWSVM as the classification model.
12. The apparatus of claim 11, wherein the training samples are historical data if the TWSVM is an untrained TWSVM;
and if the TWSVM is a trained TWSVM, the training samples are historical data and/or data pairing voice recognition results with the matched processing modes obtained by analyzing those results through the classification model.
13. An electronic device comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A computer storage medium, characterized in that it stores a computer program for causing a computer to perform the method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010730769.4A CN111883126A (en) | 2020-07-27 | 2020-07-27 | Data processing mode selection method and device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111883126A true CN111883126A (en) | 2020-11-03 |
Family
ID=73200750
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010730769.4A Pending CN111883126A (en) | 2020-07-27 | 2020-07-27 | Data processing mode selection method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111883126A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023179226A1 (en) * | 2022-03-22 | 2023-09-28 | 青岛海尔空调器有限总公司 | Method and apparatus for voice control of air conditioner, and air conditioner and storage medium |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014106523A (en) * | 2012-11-30 | 2014-06-09 | Aisin Aw Co Ltd | Voice input corresponding device and voice input corresponding program |
CN104392720A (en) * | 2014-12-01 | 2015-03-04 | 江西洪都航空工业集团有限责任公司 | Voice interaction method of intelligent service robot |
CN104899002A (en) * | 2015-05-29 | 2015-09-09 | 深圳市锐曼智能装备有限公司 | Conversation forecasting based online identification and offline identification switching method and system for robot |
CN106782558A (en) * | 2016-12-27 | 2017-05-31 | 重庆峰创科技有限公司 | A kind of vehicle-mounted interactive system of intelligent sound with image understanding |
CN106920551A (en) * | 2016-06-28 | 2017-07-04 | 广州零号软件科技有限公司 | Share the bilingual voice recognition method of service robot of a set of microphone array |
CN106919059A (en) * | 2016-06-28 | 2017-07-04 | 广州零号软件科技有限公司 | The bilingual voice recognition method of service robot with separate microphone array |
CN107424607A (en) * | 2017-07-04 | 2017-12-01 | 珠海格力电器股份有限公司 | Voice command mode switching method, device and the equipment with the device |
CN107464567A (en) * | 2017-07-24 | 2017-12-12 | 深圳云知声信息技术有限公司 | Audio recognition method and device |
CN108039171A (en) * | 2018-01-08 | 2018-05-15 | 珠海格力电器股份有限公司 | Sound control method and device |
CN111445911A (en) * | 2020-03-28 | 2020-07-24 | 大连鼎创科技开发有限公司 | Home offline online voice recognition switching logic method |
2020-07-27: CN application CN202010730769.4A filed; publication CN111883126A; status: Pending
Non-Patent Citations (1)
Title |
---|
TIAN YINGJIE等: ""Improved twin support vector machine"", 《SCIENCE CHINA MATHEMATICS》, vol. 57, no. 2, pages 417 - 432 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220020357A1 (en) | On-device learning in a hybrid speech processing system | |
US10685649B2 (en) | Method and apparatus for providing voice service | |
KR102204740B1 (en) | Method and system for processing unclear intention query in conversation system | |
WO2019101083A1 (en) | Voice data processing method, voice-based interactive device, and storage medium | |
US11132509B1 (en) | Utilization of natural language understanding (NLU) models | |
CN110473537B (en) | Voice skill control method, device, equipment and storage medium | |
JP7213943B2 (en) | Audio processing method, device, device and storage medium for in-vehicle equipment | |
US20140207716A1 (en) | Natural language processing method and system | |
US11393490B2 (en) | Method, apparatus, device and computer-readable storage medium for voice interaction | |
CN114840671A (en) | Dialogue generation method, model training method, device, equipment and medium | |
CN112735418B (en) | Voice interaction processing method, device, terminal and storage medium | |
CN110515944B (en) | Data storage method based on distributed database, storage medium and electronic equipment | |
US20220358921A1 (en) | Speech processing for multiple inputs | |
CN110956955A (en) | Voice interaction method and device | |
CN112466289A (en) | Voice instruction recognition method and device, voice equipment and storage medium | |
US20240013784A1 (en) | Speaker recognition adaptation | |
CN115132209A (en) | Speech recognition method, apparatus, device and medium | |
CN113239157B (en) | Method, device, equipment and storage medium for training conversation model | |
US11544504B1 (en) | Dialog management system | |
CN111883126A (en) | Data processing mode selection method and device and electronic equipment | |
CN114299955B (en) | Voice interaction method and device, electronic equipment and storage medium | |
CN114399992B (en) | Voice instruction response method, device and storage medium | |
CN114171016B (en) | Voice interaction method and device, electronic equipment and storage medium | |
US11507752B1 (en) | Evaluating natural language processing components | |
US20220180865A1 (en) | Runtime topic change analyses in spoken dialog contexts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||