CN105810188B - Information processing method and electronic equipment - Google Patents

Information processing method and electronic equipment

Info

Publication number
CN105810188B
Authority
CN
China
Prior art keywords
recognition
voice
voice information
result
information
Prior art date
Legal status
Active
Application number
CN201410840387.1A
Other languages
Chinese (zh)
Other versions
CN105810188A (en)
Inventor
史泳文
戴海生
Current Assignee
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201410840387.1A priority Critical patent/CN105810188B/en
Publication of CN105810188A publication Critical patent/CN105810188A/en
Application granted granted Critical
Publication of CN105810188B publication Critical patent/CN105810188B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • User Interface Of Digital Computer (AREA)
  • Telephonic Communication Services (AREA)

Abstract

In the information processing method provided by the invention, after a user utters a first voice, the electronic device generates a first recognition result from the first voice information. If the user does not confirm the first recognition result but instead utters second voice information corresponding to the same content to be input, the first recognition result can be treated as an erroneous recognition result that the user did not confirm, since the user has repeated the voice input for the same content. When the second voice is recognized, the result of the second recognition is adjusted by combining it with the history information that the first recognition produced an erroneous result, so that the finally obtained second recognition result differs from the first recognition result. The history information is thus used effectively, recognition accuracy is improved, the user does not need to input voice many times for the same content, and the speed of voice input is increased.

Description

Information processing method and electronic equipment
Technical Field
The invention belongs to the field of electronic equipment, and particularly relates to an information processing method and electronic equipment.
Background
With the development of electronic technology, various electronic devices have been added with a voice input function. Since voice input is fast compared to handwriting or typing input, it is widely used in various electronic devices.
In the recognition process, after the user utters voice information, the best result obtained by the electronic device may not be the result the user intends to input, for example because the user's speech has an accent or because the accuracy of the electronic device's speech recognition model is low.
As a result, the user has to input voice many times for the same content, correcting the pronunciation each time, which makes the input process cumbersome and the input speed low.
Disclosure of Invention
In view of the above, an object of the present invention is to provide an information processing method that solves the problem that a user must input voice for the same content many times before the target result is recognized, which makes input cumbersome.
An information processing method comprising:
receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
analyzing the first voice information to obtain a first recognition result;
when the information confirming the first recognition result is not received and second voice information is received, analyzing the second voice information based on the first recognition result to obtain a second recognition result;
the first voice information and the second voice information correspond to the same content to be input, and the second recognition result is different from the first recognition result.
Preferably, the method further includes, after obtaining the first recognition result:
starting timing from the reception of the first voice message, stopping timing when receiving the second voice message, and obtaining a timing value;
comparing the timing value with a preset time threshold value to obtain a first comparison result;
comparing the first voice information with the second voice information to obtain a second comparison result;
and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice information and the second voice information are matched, determining that the first voice information and the second voice information correspond to the same content to be input, and executing the step of analyzing the second voice information based on the first recognition result to obtain a second recognition result.
In the above method, preferably, the comparing the first voice message with the second voice message to obtain a second comparison result includes:
comparing the first voice information with the second voice information according to a preset similarity rule to obtain a similarity value;
and when the similarity value is larger than a preset threshold value, judging that the first voice information is matched with the second voice information.
In the above method, preferably, the analyzing the second speech information based on the first recognition result to obtain a second recognition result includes:
recognizing the second voice information according to a preset voice recognition model to obtain a third recognition result, wherein the third recognition result comprises at least two recognition items;
and adjusting at least two identification items in the third identification result based on the first identification result to obtain a second identification result.
In the above method, preferably, the analyzing the second speech information based on the first recognition result to obtain a second recognition result includes:
recognizing the second voice information according to a preset voice recognition model, and adjusting the matching degree of the recognized recognition items by utilizing the recognition items in the first recognition result;
and sequencing the acquired identification items according to the adjusted matching degree to obtain the second identification result.
Preferably, the method further includes, after analyzing the second speech information based on the first recognition result to obtain a second recognition result:
receiving confirmation information, wherein the confirmation information represents that a user determines that one identification item in the second identification result is a target identification item;
and generating instruction information according to the confirmation information and the target identification item, and executing the instruction information.
Preferably, the method further includes, after executing the instruction information:
acquiring the first voice information, the second voice information and the target identification item;
training a preset voice recognition model in the electronic equipment based on the first voice information, the second voice information and the target recognition item.
In the foregoing method, preferably, when a first recognition item in the third recognition result is the same as a first recognition item in the first recognition result, the adjusting at least two recognition items in the third recognition result based on the first recognition result to obtain the second recognition result includes:
and according to a preset second algorithm, reducing the matching degree of the first recognition item in the third recognition result and the second voice information, and increasing the matching degree of the non-first recognition item in the third recognition result and the second voice information to obtain the second recognition result.
In the above method, preferably, the electronic device displays a preset number of recognition items in order of matching degree, and when each of the preset number of recognition items, counted from the top recognition item of the third recognition result, is the same as the corresponding recognition item in the first recognition result, adjusting at least two recognition items in the third recognition result based on the first recognition result to obtain a second recognition result includes:
according to a preset third algorithm, reducing the matching degree between the second voice information and the preset number of recognition items counted from the top recognition item of the third recognition result, and increasing the matching degree between the second voice information and the recognition items that follow the preset number of recognition items in the second recognition result.
An electronic device, comprising:
the receiving module is used for receiving first voice information, and the first voice information represents voice sent by a user according to the content to be input;
the first analysis module is used for analyzing the first voice information to obtain a first recognition result;
the second analysis module is used for analyzing the second voice information based on the first recognition result to obtain a second recognition result when the information confirming the first recognition result is not received and the second voice information is received;
the first voice information and the second voice information correspond to the same content to be input, and the second recognition result is different from the first recognition result.
The electronic device described above preferably further includes:
the timing module is used for starting timing from the reception of the first voice message and stopping timing when receiving the second voice message to obtain a timing value;
the first comparison module is used for comparing the timing value with a preset time threshold value to obtain a first comparison result;
the second comparison module is used for comparing the first voice information with the second voice information to obtain a second comparison result;
and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice message and the second voice message are matched, determining that the first voice message and the second voice message correspond to the same content to be input, and triggering a second analysis module.
In the above electronic device, preferably, the second comparing module includes:
the comparison unit is used for comparing the first voice information with the second voice information according to a preset similarity rule to obtain a similarity value;
and the judging unit is used for judging that the first voice information is matched with the second voice information when the similarity value is larger than a preset threshold value.
In the electronic device, preferably, the second analysis module includes:
the first recognition unit is used for recognizing the second voice information according to a preset voice recognition model to obtain a third recognition result, and the third recognition result comprises at least two recognition items;
and the first adjusting unit is used for adjusting at least two identification items in the third identification result based on the first identification result to obtain a second identification result.
In the electronic device, preferably, the second analysis module includes:
the second recognition unit is used for recognizing the second voice information according to a preset voice recognition model and adjusting the matching degree of the recognized recognition items by utilizing the recognition items in the first recognition result;
and the second adjusting unit is used for sequencing the acquired identification items according to the adjusted matching degree to obtain a second identification result.
The electronic device described above preferably further includes:
the confirmation module is used for receiving confirmation information, and the confirmation information represents that a user determines that one identification item in the second identification result is a target identification item;
and the instruction module is used for generating instruction information according to the confirmation information and the target identification item and executing the instruction information.
The electronic device described above preferably further includes:
the acquisition module is used for acquiring the first voice information, the second voice information and the target identification item;
and the training module is used for training a preset voice recognition model in the electronic equipment based on the first voice information, the second voice information and the target recognition item.
The invention provides an information processing method applied to an electronic device with a voice recognition module, comprising: analyzing received first voice information to obtain a first recognition result, and, when information confirming the first recognition result is not received and second voice information is received, analyzing the second voice information based on the first recognition result to obtain a second recognition result. With this method, when the user utters a first voice, the electronic device generates a first recognition result from the first voice information; if the user does not confirm the first recognition result but continues to input second voice information corresponding to the same content to be input, i.e., repeats the voice input for the same content, the first recognition result can be treated as an erroneous recognition result that the user did not confirm. Because the speech recognition model used to recognize the second voice is the same model used to recognize the first voice information, recognizing the second voice on its own would yield the same result. Therefore, when recognizing the second voice information, the electronic device adjusts the recognition result by combining it with the history information that the first recognition result was erroneous, so that the finally obtained second recognition result differs from the first recognition result. The history information is used effectively, recognition accuracy is improved, the user does not need to input voice many times for the same content, and the speed of voice input is increased.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of an information processing method embodiment 1 provided in the present application;
fig. 2 is a flowchart of an information processing method embodiment 2 provided in the present application;
fig. 3 is a flowchart of an information processing method embodiment 3 provided in the present application;
fig. 4 is a flowchart of an information processing method embodiment 4 provided in the present application;
fig. 5 is a flowchart of an embodiment 5 of an information processing method provided in the present application;
fig. 6 is a flowchart of an embodiment 6 of an information processing method provided in the present application;
fig. 7 is a flowchart of embodiment 7 of an information processing method provided in the present application;
fig. 8 is a schematic structural diagram of an electronic device embodiment 1 provided in the present application;
fig. 9 is a schematic structural diagram of an electronic device in embodiment 2 provided in the present application;
fig. 10 is a schematic structural diagram of an electronic device in embodiment 3 provided in the present application;
fig. 11 is a schematic structural diagram of an electronic device in embodiment 4 provided in the present application;
fig. 12 is a schematic structural diagram of an electronic device in embodiment 5 provided in the present application;
fig. 13 is a schematic structural diagram of an electronic device according to embodiment 6 provided in the present application;
fig. 14 is a schematic structural diagram of an electronic device in embodiment 7 provided in the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, which is a flowchart of an embodiment 1 of an information processing method provided in the present application, the information processing method is applied to an electronic device, where the electronic device may specifically be an electronic device such as a desktop, a notebook, a tablet computer, a mobile phone, a smart television, a smart watch, and a wearable device, and the electronic device is provided with a voice recognition function.
The method is realized by the following steps:
step S101: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
when a user wants to input a content to be input in a voice mode, the user sends out a voice corresponding to the content to be input.
It should be noted that the scenario here is one in which, after the user inputs voice once, the electronic device fails to recognize the user's target input content (i.e., the content to be input), and the user therefore inputs voice again according to the same content to be input.
It should be noted that the first voice information and the second voice information are only voice information for distinguishing two consecutive voice inputs, the first voice information does not refer to a voice that is first sent by a user according to a content to be input, and correspondingly, the second voice information does not refer to a voice that is last sent by the user according to the content to be input.
Step S102: analyzing the first voice information to obtain a first recognition result;
and analyzing the received first voice information according to the voice recognition model to obtain a first recognition result.
It should be noted that the first recognition result includes at least two recognition items, and that when the speech recognition model analyzes the voice information, the matching degrees between the obtained recognition items and the first voice information are close to one another.
Step S103: and when the information confirming the first recognition result is not received and second voice information is received, analyzing the second voice information based on the first recognition result to obtain a second recognition result.
The first voice information and the second voice information correspond to the same content to be input, and the second recognition result is different from the first recognition result.
The electronic device recognizes voice information as follows: a speech recognition model is preset in the electronic device; the received voice information is analyzed according to the speech recognition model to obtain at least one recognition item, and the recognition items are arranged in order of their matching degree with the voice information to obtain a recognition result.
In a specific implementation, the first recognition result includes at least two recognition items, and the matching degrees of the recognition items are close to each other.
It should be noted that when the user utters a voice according to the content to be input and the electronic device presents the first recognition result generated from the corresponding first voice information, but the user does not see a recognition item corresponding to the intended content in the first recognition result, the user does not confirm any recognition item in the first recognition result and instead continues to utter the voice corresponding to the content to be input, whereupon the electronic device receives the second voice information.
Specifically, the second recognition result is different from the first recognition result in that the arrangement order of the recognition items is different; or at least one identification item in the second identification result is not contained by the first identification result; or the value of the matching degree between each recognition item in the second recognition result and the second voice information is different from the value of the matching degree between the corresponding recognition item in the first recognition result and the first voice information.
Specifically, when the electronic device does not receive information in which the user confirms the first recognition result and instead receives the second voice information, it can be determined that the electronic device recognized the first voice information incorrectly, which is why the user has input the voice corresponding to the content to be input again. Because the speech recognition model used to recognize the second voice information is the same model used to recognize the first voice information, recognizing the second voice information on its own would produce the same result. To improve recognition accuracy, the electronic device therefore combines the recognition of the second voice information with the history information that the first recognition result was erroneous and adjusts the result accordingly, so that the finally obtained second recognition result differs from the first recognition result. The history information is used effectively, recognition accuracy is improved, the user does not need to input voice many times for the same content, and the speed of voice input is increased.
In summary, in the information processing method provided in this embodiment, a first recognition result is obtained by analyzing received first voice information, and when information confirming the first recognition result is not received and second voice information is received, a second recognition result is obtained by analyzing the second voice information based on the first recognition result. With this method, when the user utters a first voice, the electronic device generates a first recognition result from the first voice information; if the user does not confirm the first recognition result but continues to input second voice information for the same content to be input, i.e., repeats the voice input for the same content, the first recognition result can be treated as an erroneous recognition result that the user did not confirm. Because the speech recognition model used to recognize the second voice is the same model used to recognize the first voice information, recognition alone would yield the same result; therefore, when recognizing the second voice information, the electronic device adjusts the recognition result by combining it with the history information that the first recognition result was erroneous, so that the finally obtained second recognition result differs from the first recognition result. The history information is used effectively, recognition accuracy is improved, the user does not need to input voice many times for the same content, and the speed of voice input is increased.
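To make the flow of this embodiment concrete, the following Python sketch shows one possible way to combine the unconfirmed first recognition result with the recognition of the second voice information. It is an illustration only: the functions recognize and voices_match, the representation of results as (item, matching degree) pairs, and the weight values 0.9 and 1.1 are assumptions of the sketch, not details fixed by the patent.

```python
# Illustrative sketch only; recognize(), voices_match() and the 0.9/1.1 weights are assumptions.
def recognize(voice):
    """Run the preset speech recognition model on one piece of voice information and
    return [(recognition_item, matching_degree), ...] sorted from best match to worst."""
    raise NotImplementedError

def voices_match(first_voice, second_voice):
    """Return True when the two utterances appear to correspond to the same content to be input."""
    raise NotImplementedError

def second_recognition(first_voice, first_result, second_voice, demote=0.9, promote=1.1):
    """Re-recognize the second voice information, treating the unconfirmed first
    recognition result as history information about an erroneous recognition."""
    third_result = recognize(second_voice)
    if not voices_match(first_voice, second_voice):
        return third_result                      # independent input: no adjustment
    wrong_top = first_result[0][0]               # leading item the user did not confirm
    adjusted = [(item, score * (demote if item == wrong_top else promote))
                for item, score in third_result]
    adjusted.sort(key=lambda pair: pair[1], reverse=True)   # second recognition result
    return adjusted
```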
As shown in fig. 2, a flowchart of embodiment 2 of an information processing method provided in the present application is implemented by the following steps:
step S201: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
step S202: analyzing the first voice information to obtain a first recognition result;
steps S201 to S202 are the same as steps S101 to S102 in embodiment 1, and this embodiment is not described in detail.
Step S203: starting timing from the reception of the first voice message, stopping timing when receiving the second voice message, and obtaining a timing value;
in order to determine whether two consecutively received pieces of voice information are consecutive inputs from the user, the time between the two receptions needs to be measured.
Specifically, a timer may be started when the first voice message is received and stopped when the second voice message is received, and the timer value represents a time difference between two voice inputs.
Of course, in a specific implementation, other ways of determining whether the voice is input consecutively may also be adopted. For example, the electronic device may start timing after it analyzes the first voice information and obtains the first recognition result, and stop timing when it receives the second voice information. Alternatively, timing may start when the first voice information is received; if the timed duration reaches a preset consecutive-input threshold before the second voice information arrives, the subsequently input second voice information is judged not to be a consecutive input from the user, whereas if the second voice information is received before the timing value reaches the preset consecutive-input threshold, the second voice information is judged to be a consecutive input.
Step S204: comparing the timing value with a preset time threshold value to obtain a first comparison result;
the preset time threshold is the longest interval for which two adjacent voices are still regarded as consecutive input: when the time interval between two adjacent voices is smaller than the preset time threshold, the two voices are consecutive; otherwise, they are not.
When the second voice information is not consecutive with the first voice information, the two inputs are regarded as independent voices with no relation between them.
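As a minimal sketch of the timing check in steps S203 and S204, the snippet below measures the interval between the two receptions and compares it with a threshold; the 3-second value is a placeholder chosen for illustration, since the embodiment does not fix a concrete threshold.

```python
import time

TIME_THRESHOLD = 3.0   # seconds; illustrative placeholder, not specified by the embodiment

start = time.monotonic()                   # timing starts when the first voice information is received
# ... the first recognition result is produced and the user utters the second voice ...
timing_value = time.monotonic() - start    # timing stops when the second voice information is received

# First comparison result: the two inputs count as consecutive only below the threshold.
inputs_are_consecutive = timing_value < TIME_THRESHOLD
```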
Step S205: comparing the first voice information with the second voice information to obtain a second comparison result;
in order to determine whether the received voice information of two times is the repeated input voice of the user for the same content to be input, the received voice information of two times needs to be compared to obtain a second comparison result.
It should be noted that when the comparison of the two pieces of voice information indicates that the first voice information and the second voice information are matched, i.e., that they correspond to the same input content, the second voice information is recognized in combination with the first voice information corresponding to the same content to be input; when the comparison indicates that the two pieces of voice information do not correspond to the same input content, the second voice information is recognized according to the preset speech recognition model alone, and the recognition is unrelated to the process and result of the first recognition.
Step S206: and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice information and the second voice information are matched, determining that the first voice information and the second voice information correspond to the same content to be input, and analyzing the second voice information based on the first recognition result to obtain a second recognition result.
When the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice information and the second voice information are matched, the repeated voice input aiming at the same content to be input in a short time is shown, and the second voice information which is repeatedly input is identified based on the first identification result to obtain a second identification result.
It should be noted that in the present application the time between receiving the first voice information and receiving the second voice information is measured and the timing value is compared with the preset time threshold, after which the first voice information and the second voice information are compared, and whether the two inputs are consecutive repeated inputs for the same content to be input is determined from the two comparison results. The order of the two comparisons is not limited, however, and they may be exchanged: after the second voice information is received, it may first be determined whether the second voice information and the first voice information target the same content to be input, and the timing value may be judged afterwards. Of course, in a specific implementation, one of the two judgments may be skipped when the other condition is already known not to be satisfied; for example, if the timing value does not satisfy the time threshold, the first voice information and the second voice information need not be compared, which reduces the data processing load of the electronic device.
In summary, in the information processing method provided in this embodiment, after the first recognition result is obtained, the method further includes: starting timing when the first voice information is received, stopping timing when the second voice information is received, and obtaining a timing value; comparing the timing value with a preset time threshold to obtain a first comparison result; comparing the first voice information with the second voice information to obtain a second comparison result; and, when the first comparison result shows that the timing value is smaller than the time threshold and the second comparison result shows that the first voice information and the second voice information are matched, determining that the first voice information and the second voice information correspond to the same content to be input, and analyzing the second voice information based on the first recognition result to obtain a second recognition result. With this method, two adjacent voice inputs are judged; when they are consecutive repeated inputs for the same content, the first recognition result can be determined to be an erroneous result that the user did not confirm. Because the speech recognition model used to recognize the second voice is the same model used to recognize the first voice information, recognition alone would yield the same result; therefore, when recognizing the second voice information, the electronic device adjusts the recognition result by combining it with the history information that the first recognition result was erroneous, so that the finally obtained second recognition result differs from the first recognition result. The history information is used effectively, recognition accuracy is improved, the user does not need to input voice many times for the same content, and the speed of voice input is increased.
As shown in fig. 3, a flowchart of embodiment 3 of an information processing method provided in the present application is implemented by the following steps:
step S301: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
step S302: analyzing the first voice information to obtain a first recognition result;
step S303: starting timing from the reception of the first voice message, stopping timing when receiving the second voice message, and obtaining a timing value;
step S304: comparing the timing value with a preset time threshold value to obtain a first comparison result;
steps S301 to 304 are the same as steps S201 to 204 in embodiment 2, and this embodiment is not described in detail.
Step S305: comparing the first voice information with the second voice information according to a preset similarity rule to obtain a similarity value, and judging that the first voice information is matched with the second voice information when the similarity value is larger than a preset threshold value;
the electronic device is preset with a similarity rule for judging the voice information, and the similarity of any two voice information can be judged according to the similarity rule.
The similarity judgment may compare the two pieces of voice information in multiple respects, such as voice duration, the content the voice corresponds to, speech rate, and pitch, to obtain a similarity value: the closer the two pieces of voice information are, the larger the similarity value; conversely, the smaller the similarity value.
The preset threshold value represents the minimum similarity between two pieces of voice information input aiming at the same content to be input, and when the similarity between two pieces of voice is larger than the preset threshold value, the two pieces of voice can be judged to be sent aiming at the same content to be input.
In specific implementations, the preset threshold may be set to a larger value, such as 80%, or 0.75.
It should be noted that, when the same person utters two times of voices for the same information to be input, the similarity between the two times of voices is higher.
Specifically, when the similarity value obtained by comparing the first voice information and the second voice information is greater than the preset threshold, the first voice information and the second voice information correspond to the same content to be input, that is, the first voice information and the second voice information are matched.
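The following sketch illustrates one way such a similarity rule could be realized. The choice of aspects (duration, speech rate, pitch), the per-aspect ratio, the plain average, and the feature dictionaries are all assumptions made for illustration; only the idea of producing a similarity value and comparing it with a preset threshold comes from the embodiment.

```python
SIMILARITY_THRESHOLD = 0.75   # preset threshold; the text above suggests values such as 80% or 0.75

def similarity(first_features, second_features):
    """Each argument is a dict of positive per-aspect measurements of one utterance,
    e.g. {"duration": 1.2, "rate": 4.1, "pitch": 180.0}.
    The per-aspect ratio min/max and the plain average are illustrative choices."""
    aspects = first_features.keys() & second_features.keys()
    ratios = [min(first_features[a], second_features[a]) /
              max(first_features[a], second_features[a]) for a in aspects]
    return sum(ratios) / len(ratios)

def voices_match(first_features, second_features):
    """Second comparison result: the two utterances match when the similarity
    value exceeds the preset threshold."""
    return similarity(first_features, second_features) > SIMILARITY_THRESHOLD
```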
Step S306: and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice information and the second voice information are matched, determining that the first voice information and the second voice information correspond to the same content to be input, and analyzing the second voice information based on the first recognition result to obtain a second recognition result.
Step S306 is the same as step S206 in embodiment 2, and this embodiment is not described in detail.
In summary, in the information processing method provided in this embodiment, comparing the first voice information with the second voice information to obtain a second comparison result includes: comparing the first voice information with the second voice information according to a preset similarity rule to obtain a similarity value; and, when the similarity value is larger than a preset threshold, judging that the first voice information matches the second voice information. With this method, the two pieces of voice information are compared, and when the similarity value obtained by the comparison is larger than the preset threshold, the two pieces of voice information are judged to match and to correspond to the same content to be input. Based on this judgment, when it is determined that the user has consecutively and repeatedly input voice for the same content, the first recognition result can be determined to be an erroneous result that the user did not confirm, and when the electronic device recognizes the second voice information, it adjusts the result of the second recognition by combining it with the history information that the first recognition result was erroneous.
As shown in fig. 4, a flowchart of embodiment 4 of an information processing method provided by the present application is implemented by the following steps:
step S401: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
step S402: analyzing the first voice information to obtain a first recognition result;
steps S401 to 402 are the same as steps S101 to 102 in embodiment 1, and this embodiment is not described in detail.
Step S403: when the information confirming the first recognition result is not received and second voice information is received, recognizing the second voice information according to a preset voice recognition model to obtain a third recognition result, wherein the third recognition result comprises at least two recognition items;
and the electronic equipment identifies the received second voice information based on a preset voice identification model to obtain a third identification result.
It should be noted that the speech recognition model is also used for recognizing the first speech information, and it can be understood that when the first speech information is the same as the second speech information, the third recognition result obtained by the speech recognition model through recognizing the second speech information is completely the same as the first recognition result.
In a specific implementation, since the user could not find the recognition item corresponding to the content to be input in the first recognition result, the user may adjust the way the voice is uttered when speaking again according to the content to be input, so the second voice information differs somewhat from the first voice information; in this case, the third recognition result obtained by the speech recognition model recognizing the second voice information may differ from the first recognition result.
As an example, the electronic device may display only the first of the recognized recognition items to prompt the user. Suppose the user wants to input 'power on' by voice and utters the voice corresponding to 'power on' for the first time. The electronic device obtains the first voice information corresponding to this input, recognition yields 'power off' and 'power on', and since the electronic device displays only the first recognition item, 'power off' is displayed to the user. Because this result does not match the content 'power on' that the user intends to input, the user does not confirm the recognition result and utters the voice corresponding to 'power on' again. The electronic device obtains the second voice information corresponding to this second input; recognition again yields 'power off' and 'power on', so that, without the adjustment described below, 'power off' would again be displayed to the user.
Step S404: and adjusting at least two identification items in the third identification result based on the first identification result to obtain a second identification result.
If the second voice information is determined to be the repeated input aiming at the content to be input, the user cannot determine the identification item corresponding to the content to be input according to the first identification result.
Continuing the example in step S403: the third recognition result is again 'power off' and 'power on'. 'Power off' was already displayed on the basis of the first recognition result, and the user did not confirm the recognition result corresponding to 'power off', so the recognition result 'power off' is judged to be wrong, and the two recognition items are adjusted so that the recognition item corresponding to 'power on' is displayed to the user.
Specifically, when the electronic device displays only the first recognition item, the adjustment may be: compare the first recognition item in the third recognition result with the first recognition item in the first recognition result, and when the two are the same, promote the second recognition item in the third recognition result to the first position and lower the ranking of the original first recognition item.
Specifically, when the electronic device displays a preset number of recognition items, the adjustment may be: compare, position by position, the preset number of recognition items to be displayed in the third recognition result with the correspondingly ranked recognition items in the first recognition result to find the recognition items that are the same, move those recognition items toward the back of the ranking, and promote an equal number of recognition items that were not displayed into the display area.
In a specific implementation, the adjusting manner of adjusting the at least two recognition items in the third recognition result may be adjusting a matching degree between the recognition items and the second speech information, and further adjusting an arrangement order of the recognition items.
When the first identification item in the third identification result is the same as the first identification item in the first identification result, the method specifically includes the following steps: and according to a preset first algorithm, reducing the matching degree of the first recognition item in the third recognition result and the second voice information, and increasing the matching degree of the non-first recognition item in the third recognition result and the second voice information to obtain the second recognition result.
For example, if the matching degree between the recognized 'power off' and the second voice information is 0.92, and the matching degree between the recognized 'power on' and the second voice information is 0.88, the matching degree corresponding to the leading recognition item 'power off' is reduced by weighting according to the preset first algorithm: with a weight value of 0.9, the final matching degree of 'power off' and the second voice information is 0.92 × 0.9 = 0.828. The matching degree corresponding to the non-leading recognition item 'power on' is increased: with a weight value of 1.1, its final matching degree with the second voice information is 0.88 × 1.1 = 0.968. The matching degree corresponding to 'power off' is then 0.828 and that corresponding to 'power on' is 0.968, so 'power on' is adjusted to be the leading recognition item and 'power off' to be a non-leading one; the user can select 'power on' according to the content to be input, and subsequent operations are carried out according to the 'power on' recognition item.
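A short sketch of the adjustment just described, reproducing the numbers of the worked example; the 0.9/1.1 weight values come from that example, and the list-of-pairs representation of a recognition result is an assumption of the sketch.

```python
DEMOTE, PROMOTE = 0.9, 1.1   # weight values taken from the worked example above

def adjust_when_top_item_repeats(first_result, third_result):
    """If the leading item of the third result equals the leading item of the
    unconfirmed first result, demote it and promote every other item, then re-sort."""
    if third_result[0][0] != first_result[0][0]:
        return third_result
    adjusted = [(item, round(score * (DEMOTE if i == 0 else PROMOTE), 3))
                for i, (item, score) in enumerate(third_result)]
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return adjusted

first_result = [("power off", 0.92), ("power on", 0.88)]
third_result = [("power off", 0.92), ("power on", 0.88)]
print(adjust_when_top_item_repeats(first_result, third_result))
# -> [('power on', 0.968), ('power off', 0.828)]
```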
When the electronic device displays a preset number of recognition items in order of matching degree, and each of the preset number of recognition items, counted from the first recognition item of the third recognition result, is the same as the corresponding recognition item in the first recognition result, this step includes: according to a preset second algorithm, reducing the matching degree between the second voice information and the preset number of recognition items counted from the first recognition item of the third recognition result, and increasing the matching degree between the second voice information and the recognition items that follow the preset number of recognition items in the second recognition result.
For example, suppose the electronic device displays 2 recognition items and the recognition result of the second voice information contains the three items "power off", "start" and "power on", with matching degrees with the second voice information of 0.92, 0.88 and 0.82 respectively. The matching degree of the leading recognition item "power off" is reduced by weighting according to the preset second algorithm: with a weight value of 0.9, the final matching degree of "power off" and the second voice information is 0.92 × 0.9 = 0.828. The matching degree of the non-leading recognition item "start" is increased: with a weight value of 1.1, its final matching degree is 0.88 × 1.1 = 0.968. The matching degree of the non-leading recognition item "power on" is likewise increased: with a weight value of 1.1, its final matching degree is 0.82 × 1.1 = 0.902. Arranged by matching degree, the second recognition result is "start", "power on", "power off"; the electronic device displays the two recognition items "start" and "power on", and the user can select "power on" according to the content to be input, completing the voice input process.
Specifically, when the electronic device displays a preset number of identification items according to the sequence of the matching degree, and at least one identification item appears in the first identification result and the third identification result in the preset number of identification items, then this step includes: according to a preset third algorithm, acquiring at least one coincident recognition item in the preset number of recognition items in the third recognition result and the preset number of recognition items in the first recognition result, reducing the matching degree of the coincident recognition item and the second voice information, and increasing the matching degree of the recognition items after the preset number of recognition items in the second recognition result and the second voice information.
For example, suppose the electronic device displays 2 recognition items, the first recognition result obtained by recognizing the first voice information contains the four recognition items "power off", "open mirror", "power on" and "start", and the third recognition result obtained by recognizing the second voice information contains the three recognition items "start", "power off" and "power on", whose matching degrees with the second voice information are 0.90, 0.88 and 0.86 respectively. According to the preset third algorithm, the recognition item "power off", which appears both in the third recognition result and among the recognition items displayed for the first recognition result, is found and its matching degree with the second voice information is reduced, while the matching degrees of the recognition items "start" and "power on", which appear in the third recognition result but were not displayed for the first recognition result, are increased. With a weight value of 0.9, the final matching degree of "power off" and the second voice information is 0.88 × 0.9 = 0.792; with a weight value of 1.1, the final matching degree of "start" and the second voice information is 0.90 × 1.1 = 0.99; and with a weight value of 1.1, the final matching degree of "power on" and the second voice information is 0.86 × 1.1 = 0.946. Arranged by matching degree, the second recognition result is "start", "power on", "power off"; the electronic device displays the recognition items "start" and "power on", and the user can select the recognition item corresponding to the content to be input, completing the voice input process.
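The display-window variants above can be sketched as follows: recognition items of the third result that were already among the displayed items of the first result are demoted, and the remaining items are promoted, after which the list is re-sorted. The window size of 2 and the 0.9/1.1 weights follow the worked examples; the matching degrees assigned to the first result are placeholders, since only the identity of its displayed items matters here.

```python
DISPLAY_COUNT = 2            # preset number of displayed recognition items (from the examples)
DEMOTE, PROMOTE = 0.9, 1.1   # weight values taken from the worked examples

def adjust_display_window(first_result, third_result, display_count=DISPLAY_COUNT):
    """Demote items of the third result that coincide with the items displayed for the
    unconfirmed first result; promote the items that were not displayed; then re-sort."""
    shown_before = {item for item, _ in first_result[:display_count]}
    adjusted = [(item, round(score * (DEMOTE if item in shown_before else PROMOTE), 3))
                for item, score in third_result]
    adjusted.sort(key=lambda pair: pair[1], reverse=True)
    return adjusted

first_result = [("power off", 0.95), ("open mirror", 0.91),   # placeholder matching degrees
                ("power on", 0.87), ("start", 0.85)]
third_result = [("start", 0.90), ("power off", 0.88), ("power on", 0.86)]
print(adjust_display_window(first_result, third_result))
# -> [('start', 0.99), ('power on', 0.946), ('power off', 0.792)]
```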
It should be noted that the weight setting in the weighting algorithm may be set according to an actual situation, and a specific setting value is not limited in this embodiment.
It should be noted that in the two specific examples provided in this embodiment, a single weighted calculation of the matching values is enough to move a non-leading recognition item to the leading position or an undisplayed recognition item into the display positions. In a specific implementation, however, depending on how the weights are set and because the initial matching degrees of the recognition items lie within a similar range, the change produced by one weighting may be too small for the second recognition to complete the adjustment (moving a non-leading recognition item to the leading position, or an undisplayed recognition item into the display positions); in that case third voice information is input again and is treated as the voice information newly adjacent to the second voice information.
In summary, in the information processing method provided in this embodiment, analyzing the second voice information based on the first recognition result to obtain a second recognition result includes: recognizing the second voice information according to a preset speech recognition model to obtain a third recognition result, the third recognition result including at least two recognition items; and adjusting at least two recognition items in the third recognition result based on the first recognition result to obtain the second recognition result. With this method, when the electronic device recognizes the second voice information, it combines the result with the history information that the first recognition result was erroneous, adjusts the result of the second recognition, and can display to the user the recognition items of the third recognition result that differ from the first recognition result, so that the finally obtained second recognition result differs from the first recognition result. The history information is used effectively, recognition accuracy is improved, the user does not need to input voice many times for the same content, and the speed of voice input is increased.
As shown in fig. 5, a flowchart of embodiment 5 of an information processing method provided by the present application is implemented by the following steps:
step S501: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
step S502: analyzing the first voice information to obtain a first recognition result;
steps S501 to 502 are the same as steps S101 to 102 in embodiment 1, and this embodiment is not described in detail.
Step S503: when the information confirming the first recognition result is not received and second voice information is received, recognizing the second voice information according to a preset voice recognition model, and adjusting the matching degree of the recognized recognition items by utilizing the recognition items in the first recognition result;
first, it should be noted that in the process of recognizing voice information, the speech recognition model produces its matching results one after another, in order of matching degree from high to low; that is, the recognition items obtained by recognizing the voice information are not all obtained at the same time.
Specifically, after the recognition items are displayed, the user either confirms one of them or ignores them and re-inputs the voice information, so it must be ensured that at least part of the recognition items displayed for the second recognition result differ from the recognition items displayed for the first recognition result.
Specifically, the second voice information is recognized according to the preset speech recognition model. After a recognition item that falls within the display ordering is recognized, its matching degree with the second voice information is reduced if it matches a recognition item displayed in the first recognition result, and increased if it does not. Likewise, after a recognition item outside the display ordering is recognized, its matching degree with the second voice information is increased if it does not match a recognition item displayed in the first recognition result, and reduced if it does.
For example, suppose the electronic device displays 2 recognition items and the first recognition result obtained by the speech recognition model for the first voice information contains three recognition items, among them "power off" and "power on". When the first recognition item obtained for the second voice information is "start", with a matching degree of 0.90 with the second voice information, the matching degree of "start" is increased by weighting according to the preset fourth algorithm because "start" does not appear among the recognition items displayed for the first recognition result: with a weight value of 1.1, the final matching degree of "start" and the second voice information is 0.90 × 1.1 = 0.99. The speech recognition model continues to recognize the second voice information and obtains the second recognition item "power off", with a matching degree of 0.88; because "power off" appears among the recognition items displayed for the first recognition result, its matching degree is reduced: with a weight value of 0.9, the final matching degree of "power off" and the second voice information is 0.88 × 0.9 = 0.792. The speech recognition model then obtains the third recognition item "power on", with a matching degree of 0.86; because "power on" does not appear among the recognition items displayed for the first recognition result, its matching degree is increased: with a weight value of 1.1, the final matching degree of "power on" and the second voice information is 0.86 × 1.1 = 0.946.
It should be noted that, when recognizing a given piece of voice information, every increase of a recognition item's matching degree with the second voice information uses the same weight value, and every reduction likewise uses a single uniform weight value, but the weight value used for increasing differs from the weight value used for reducing.
Step S504: and sequencing the acquired identification items according to the adjusted matching degree to obtain the second identification result.
Specifically, after the matching degrees of the recognized recognition items are adjusted one by one in step S503, the recognition items are sorted according to the adjusted matching degrees in a preset sorting manner to obtain the second recognition result.
Following the example in step S503, the recognition items obtained by recognizing the second voice information are "start", "power off" and "power on"; arranging them in descending order of the adjusted matching degree gives "start", "power on" and "power off", that is, the second recognition result contains the three items in that order. The electronic device displays the two recognition items "start" and "power on", and the user can select "start" according to the content to be input, completing the voice input process.
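To make the adjustment in steps S503 and S504 concrete, the following Python sketch re-ranks the recognition items of the example above. It is only an illustration: the weight values 1.1 and 0.9, the function name adjust_and_rank, and the rounding to three decimals are assumptions made for readability, not part of the claimed method.

    # Minimal sketch of steps S503-S504: adjust each item's matching degree against the
    # items displayed from the first recognition result, then sort to form the second result.
    # Weight values and helper names are illustrative assumptions.

    BOOST = 1.1   # item was NOT displayed in the first recognition result
    DEMOTE = 0.9  # item WAS displayed in the first recognition result

    def adjust_and_rank(candidates, first_displayed, display_count=2):
        """candidates: (item, matching_degree) pairs produced by the speech recognition model
        for the second voice information, in descending order of matching degree."""
        adjusted = []
        for item, degree in candidates:
            weight = DEMOTE if item in first_displayed else BOOST
            adjusted.append((item, round(degree * weight, 3)))
        adjusted.sort(key=lambda pair: pair[1], reverse=True)   # step S504
        return adjusted[:display_count], adjusted

    displayed, second_result = adjust_and_rank(
        [("start", 0.90), ("power off", 0.88), ("power on", 0.86)],
        first_displayed={"power off"},
    )
    print(displayed)  # [('start', 0.99), ('power on', 0.946)]

Running the sketch on the worked example reproduces the ordering "start", "power on", "power off" described above.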
In summary, in the information processing method provided in this embodiment, analyzing the second voice information based on the first recognition result to obtain the second recognition result includes: recognizing the second voice information according to a preset voice recognition model and adjusting the matching degree of each recognized recognition item by using the recognition items in the first recognition result; and sorting the obtained recognition items according to the adjusted matching degrees to obtain the second recognition result. With this method, when the electronic device recognizes the second voice information, it adjusts the ordering of the recognition items in real time by combining the history information that the first recognition result was produced by the first recognition and proved to be erroneous, so that recognition items different from the first recognition result are shown to the user. The history information is thus used effectively, the recognition accuracy is improved, the user does not need to input voice many times for the same content, and the input speed of voice input is improved.
As shown in fig. 6, a flowchart of embodiment 6 of an information processing method provided by the present application is implemented by the following steps:
step S601: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
step S602: analyzing the first voice information to obtain a first recognition result;
step S603: when the information confirming the first recognition result is not received and second voice information is received, analyzing the second voice information based on the first recognition result to obtain a second recognition result;
Steps S601 to S603 are the same as steps S101 to S103 in embodiment 1 and are not described in detail in this embodiment.
Step S604: receiving confirmation information, wherein the confirmation information represents that a user determines that one identification item in the second identification result is a target identification item;
When any recognition item in the second recognition result corresponds to the content to be input, the user can confirm that item, and that item is the target recognition item.
Specifically, the confirmation may be performed in multiple ways. When the electronic device has physical keys, the recognition items in the second recognition result are displayed on the display screen of the electronic device, and the user selects and confirms one of the displayed recognition items through the keys to generate the corresponding confirmation information. Alternatively, when the electronic device has a touch screen, the recognition items in the second recognition result are displayed on the touch screen, and the user taps the selection touch key corresponding to the target recognition item to generate the corresponding confirmation information. Alternatively, only the first recognition item of the second recognition result is displayed on the screen, the user speaks "confirm", and the electronic device generates the corresponding confirmation information after receiving the voice information "confirm".
The manner of confirmation is not limited to the three types provided in the present embodiment, and may be other manners as long as one identification item in the second identification result can be confirmed.
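The three confirmation modes above can be pictured with a small dispatch routine. The sketch below is hypothetical: the event representation, the function name confirm_target_item, and the assumption that a key event carries the index of the selected item are illustrative choices, not details from this embodiment.

    # Hypothetical dispatch of the three confirmation modes described above.

    def confirm_target_item(second_result, mode, payload):
        """second_result: recognition items in display order; returns confirmation information."""
        if mode == "key":                      # physical key selects a displayed item by index
            target = second_result[payload]
        elif mode == "touch":                  # touch key carries the tapped item's text
            target = payload
        elif mode == "voice" and payload == "confirm":
            target = second_result[0]          # only the first item is shown on screen
        else:
            return None                        # no confirmation information is generated
        return {"confirmed": True, "target_item": target}

    print(confirm_target_item(["start", "power on"], "voice", "confirm"))
    # {'confirmed': True, 'target_item': 'start'}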
Step S605: and generating instruction information according to the confirmation information and the target identification item, and executing the instruction information.
When the content of the target recognition item corresponds to a piece of control information, the electronic device, upon receiving the confirmation information, generates instruction information according to the content of the target recognition item and the confirmation information.
For example, if the target recognition item is "shutdown", it correspondingly controls the electronic device to shut down; or, when the target recognition item is "pause" and the electronic device is currently playing a video, it correspondingly controls the player to pause.
Specifically, after the instruction information is generated, the electronic device executes the instruction information to control the corresponding program to execute the corresponding operation.
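As an illustration of steps S604 and S605, the sketch below maps a confirmed target recognition item to one piece of control information and executes it. The control table and the two placeholder operations are assumptions; a real device would call its own power-management or media-player interfaces.

    # Illustrative generation and execution of instruction information (steps S604-S605).

    def power_off():
        print("controlling the electronic device to shut down")

    def pause_playback():
        print("controlling the player to pause the current video")

    CONTROL_TABLE = {          # target recognition item -> control operation (placeholder)
        "shutdown": power_off,
        "pause": pause_playback,
    }

    def execute_instruction(confirmation):
        if not confirmation or not confirmation.get("confirmed"):
            return
        operation = CONTROL_TABLE.get(confirmation["target_item"])
        if operation is not None:              # instruction information generated from the item
            operation()                        # the electronic device executes the instruction

    execute_instruction({"confirmed": True, "target_item": "pause"})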
In summary, the information processing method provided in this embodiment further includes: receiving confirmation information, where the confirmation information indicates that the user has determined one recognition item in the second recognition result to be the target recognition item; and generating instruction information according to the confirmation information and the target recognition item and executing the instruction information. With this method, after the user determines that one recognition item in the second recognition result is the target recognition item, the electronic device generates instruction information from the content corresponding to the target recognition item and executes it, so the electronic device is controlled according to the voice input, and the control manner is simple and easy to use.
As shown in fig. 7, a flowchart of embodiment 7 of an information processing method provided by the present application is implemented by the following steps:
step S701: receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
step S702: analyzing the first voice information to obtain a first recognition result;
step S703: when the information confirming the first recognition result is not received and second voice information is received, analyzing the second voice information based on the first recognition result to obtain a second recognition result;
step S704: receiving confirmation information, wherein the confirmation information represents that a user determines that one identification item in the second identification result is a target identification item;
step S705: generating instruction information according to the confirmation information and the target identification item, and executing the instruction information;
Steps S701 to S705 are the same as steps S601 to S605 in embodiment 6 and are not described in detail in this embodiment.
Step S706: acquiring the first voice information, the second voice information and the target identification item;
It should be noted that, because the first voice information and the second voice information correspond to the same content to be input, the target recognition item should correspond to both of them; it was only because of problems such as the recognition accuracy of the voice recognition model or the user's accent that the target recognition item was not recognized when the voice recognition model recognized the first voice information. The voice recognition model therefore needs to be adaptively trained so that it better fits the user's way of speaking.
Therefore, after one recognition item in the second recognition result obtained from the second voice information is determined to be the target recognition item, the first voice information, the second voice information and the target recognition item are acquired for training the voice recognition model in the subsequent steps.
Step S707: training a preset voice recognition model in the electronic equipment based on the first voice information, the second voice information and the target recognition item.
Before a speech recognition model in an electronic device is used for the first time, the speech templates corresponding to the phrases in the model are set uniformly across all devices; therefore, to adapt to a user's speech habits, the speech recognition model needs to be adaptively trained.
The speech recognition model comprises phrases and corresponding speech templates.
For example, when the speech recognition model is used for Chinese speech input, the speech template may be the pronunciation corresponding to standard Mandarin.
Specifically, the voice template is adjusted according to the target recognition item, the first voice information and the second voice information, so that parameters such as the pronunciation pattern and the speaking rate better match the user's pronunciation. When the user inputs voice information again, the adaptively trained voice recognition model can recognize it quickly and accurately.
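The embodiment does not fix a particular adaptation algorithm, so the sketch below only illustrates the idea of step S707: both utterances are treated as labelled examples of the target recognition item and pulled into the corresponding speech template. The feature vectors, the moving-average update, and the learning rate are assumptions for illustration only.

    # Hedged sketch of adaptive training: nudge the target item's speech template toward the
    # acoustic features of the two utterances the user produced for it. numpy is used only
    # to stand in for whatever feature representation the model actually uses.

    import numpy as np

    def adapt_template(templates, target_item, first_feats, second_feats, rate=0.2):
        """templates: phrase -> feature vector; *_feats: features of the two voice inputs."""
        template = templates[target_item]
        for feats in (first_feats, second_feats):
            template = (1 - rate) * template + rate * feats   # move toward the user's voice
        templates[target_item] = template
        return templates

    templates = {"start": np.zeros(4)}
    adapt_template(templates, "start", np.full(4, 1.0), np.full(4, 0.8))
    print(templates["start"])   # template has shifted away from its uniform initial value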
In summary, after executing the instruction information, the information processing method provided in this embodiment further includes: acquiring the first voice information, the second voice information and the target recognition item; and training the preset voice recognition model in the electronic device based on the first voice information, the second voice information and the target recognition item. With this method, the voice recognition model is adaptively trained on the voice information input by the user, which improves the match between the voice recognition model and the user's pronunciation, so that when the user inputs voice information again it can be recognized quickly and accurately. The recognition accuracy is improved, and the voice input speed can be further increased.
Corresponding to the embodiment of the information processing method provided by the application, the application also provides an embodiment of the electronic equipment applying the information processing method.
As shown in fig. 8, a schematic structural diagram of an embodiment 1 of an electronic device provided in the present application is shown, where the electronic device may specifically be an electronic device such as a desktop, a notebook, a tablet computer, a mobile phone, a smart television, a smart watch, and a wearable device, and the electronic device is provided with a voice recognition function.
The electronic device can be realized by the following structure: a receiving module 801, a first analyzing module 802 and a second analyzing module 803;
the receiving module 801 is configured to receive first voice information, where the first voice information represents a voice sent by a user according to content to be input;
the first analysis module 802 is configured to analyze the first voice information to obtain a first recognition result;
the second analysis module 803 is configured to, when the information confirming the first recognition result is not received and a second voice message is received, analyze the second voice message based on the first recognition result to obtain a second recognition result;
the first voice information and the second voice information correspond to the same content to be input, and the second recognition result is different from the first recognition result.
In a specific implementation, the first analysis module and the second analysis module may be part of a structure of a processor of the electronic device, and the processor may specifically be a Central Processing Unit (CPU).
In summary, with the electronic device provided in this embodiment, after the user utters a first voice, the electronic device generates a first recognition result from the first voice information; the user does not confirm the first recognition result but continues to input second voice information, which indicates that the first recognition result is an erroneous result not confirmed by the user. The second voice information corresponds to the same content to be input as the first voice information, that is, the user repeatedly inputs voice information for the same content. Because the voice recognition model used to recognize the second voice is the same model that recognized the first voice information, it would otherwise produce the same result; therefore, when recognizing the second voice information, the electronic device combines the history information that the first recognition result was produced by the first recognition and proved to be erroneous, and adjusts the result of recognizing the second voice so that the finally obtained second recognition result is different from the first recognition result. The history information is used effectively, the recognition accuracy is improved, the user does not need to input voice many times for the same content, and the input speed of voice input is improved.
As shown in fig. 9, a schematic structural diagram of an embodiment 2 of an electronic device provided in the present application is shown, where the electronic device may be implemented by the following structure: a receiving module 901, a first analyzing module 902, a timing module 903, a first comparing module 904, a second comparing module 905 and a second analyzing module 906.
The structural functions of the receiving module 901, the first analyzing module 902, and the second analyzing module 906 are the same as the corresponding structures in embodiment 1, and are not described in detail in this embodiment.
The timing module 903 is configured to start timing when receiving the first voice message, and stop timing when receiving the second voice message, so as to obtain a timing value;
the first comparing module 904 is configured to compare the timing value with a preset time threshold to obtain a first comparison result;
the second comparing module 905 is configured to compare the first voice information with the second voice information to obtain a second comparison result;
and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice message and the second voice message are matched, determining that the first voice message and the second voice message correspond to the same content to be input, and triggering a second analysis module.
In summary, the electronic device provided in this embodiment judges two adjacent voice inputs; when they are consecutive repeated inputs for the same content to be input, it is determined that the first recognition result is an erroneous result that the user did not confirm. Because the voice recognition model used to recognize the second voice is the same model that recognized the first voice information, it would otherwise produce the same result; therefore, when recognizing the second voice information, the electronic device combines the history information that the first recognition result was produced by the first recognition and proved to be erroneous, and adjusts the result of recognizing the second voice so that the finally obtained second recognition result is different from the first recognition result. The history information is used effectively, the recognition accuracy is improved, the user does not need to input voice many times for the same content, and the input speed of voice input is improved.
As shown in fig. 10, a schematic structural diagram of an embodiment 3 of an electronic device provided in the present application is shown, where the electronic device may be implemented by the following structure: a receiving module 1001, a first analyzing module 1002, a timing module 1003, a first comparing module 1004, a second comparing module 1005 and a second analyzing module 1006.
The second comparing module 1005 includes a comparing unit 1007 and a determining unit 1008.
The structural functions of the receiving module 1001, the first analyzing module 1002, the timing module 1003, the first comparing module 1004, and the second analyzing module 1006 are the same as those of the corresponding structure in embodiment 2, and are not described in detail in this embodiment.
The comparing unit 1007 is configured to compare the first voice information with the second voice information according to a preset similarity rule to obtain a similarity value;
the determining unit 1008 is configured to determine that the first voice information is matched with the second voice information when the similarity value is greater than a preset threshold.
In summary, the electronic device provided in this embodiment compares the two pieces of voice information, and when the similarity value obtained by the comparison is greater than the preset threshold, determines that the two pieces of voice information match and correspond to the same content to be input. Based on this judgment, when it is determined that the user has consecutively repeated the input for the same content, the first recognition result can be determined to be an erroneous result that the user did not confirm, and when the electronic device recognizes the second voice information, it adjusts the result of recognizing the second voice by combining the history information that the first recognition result was produced by the first recognition and proved to be erroneous.
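A compact way to picture the decision made by the timing module, the two comparison modules and the determination unit is sketched below. The time threshold, the similarity threshold, and the use of difflib over transcribed strings are simplifying assumptions; the embodiment itself only requires some preset similarity rule, and a real device would compare the acoustic signals.

    # Illustrative repeat-input decision combining the timing value and the similarity value.

    import difflib
    import time

    TIME_THRESHOLD_S = 5.0        # assumed preset time threshold
    SIMILARITY_THRESHOLD = 0.8    # assumed preset similarity threshold

    def is_same_content(first_received_at, second_received_at, first_voice, second_voice):
        timing_value = second_received_at - first_received_at          # timing module
        within_time = timing_value < TIME_THRESHOLD_S                  # first comparison module
        similarity = difflib.SequenceMatcher(None, first_voice, second_voice).ratio()
        matched = similarity > SIMILARITY_THRESHOLD                    # second comparison module
        return within_time and matched     # both hold -> trigger the second analysis module

    t0 = time.monotonic()
    print(is_same_content(t0, t0 + 2.0, "kai ji", "kai ji"))   # True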
As shown in fig. 11, a schematic structural diagram of an embodiment 4 of an electronic device provided in the present application is shown, where the electronic device may be implemented by the following structure: a receiving module 1101, a first analyzing module 1102 and a second analyzing module 1103;
wherein, the second analysis module 1103 includes: a first recognition unit 1104 and a first adjustment unit 1105.
The structural functions of the receiving module 1101 and the first analyzing module 1102 are the same as those of embodiment 1, and are not described again in this embodiment.
The first recognition unit 1104 is configured to recognize the second speech information according to a preset speech recognition model to obtain a third recognition result, where the third recognition result includes at least two recognition items;
a first adjusting unit 1105, configured to adjust at least two recognition items in the third recognition result based on the first recognition result, to obtain a second recognition result.
Specifically, when the first identification item in the third identification result is the same as the first identification item in the first identification result, the first adjusting unit 1105 is specifically configured to: and according to a preset second algorithm, reducing the matching degree of the first recognition item in the third recognition result and the second voice information, and increasing the matching degree of the non-first recognition item in the third recognition result and the second voice information to obtain the second recognition result.
Specifically, the electronic device displays a preset number of recognition items in descending order of matching degree; when, starting from the first recognition item in the third recognition result, that preset number of recognition items are the same as the corresponding recognition items displayed from the first recognition result, the first adjusting unit 1105 is specifically configured to: according to a preset third algorithm, reduce the matching degree between those first preset-number recognition items in the third recognition result and the second voice information, and increase the matching degree between the recognition items after them and the second voice information, to obtain the second recognition result.
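For the second case handled by the first adjusting unit 1105, the sketch below demotes the whole block of displayed items when the top of the third recognition result repeats the first recognition result, and promotes the items behind it. The weight values and the guard condition are assumptions; the embodiment only specifies that a preset third algorithm lowers the former and raises the latter.

    # Sketch of the third-algorithm case: the first `display_count` items of the third
    # recognition result repeat the displayed first-result items, so that block is demoted
    # and the remaining items are promoted before re-sorting. Weights are illustrative.

    def adjust_repeated_block(third_result, first_displayed, display_count=2,
                              demote=0.9, boost=1.1):
        """third_result: (item, matching_degree) pairs in descending order of matching degree."""
        top_items = [item for item, _ in third_result[:display_count]]
        if top_items != list(first_displayed[:display_count]):
            return third_result                # this case does not apply; leave unchanged
        adjusted = []
        for rank, (item, degree) in enumerate(third_result):
            weight = demote if rank < display_count else boost
            adjusted.append((item, round(degree * weight, 3)))
        adjusted.sort(key=lambda pair: pair[1], reverse=True)
        return adjusted                        # this ordering forms the second recognition result

    print(adjust_repeated_block([("power off", 0.92), ("power on", 0.90), ("start", 0.85)],
                                ["power off", "power on"]))
    # [('start', 0.935), ('power off', 0.828), ('power on', 0.81)]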
In summary, with the electronic device provided in this embodiment, when the electronic device recognizes the second voice information, it combines the history information that the first recognition result was produced by the first recognition and proved to be erroneous, and adjusts the result of recognizing the second voice, so that recognition items in the third recognition result that differ from the first recognition result can be displayed to the user and the finally obtained second recognition result is different from the first recognition result. The history information is used effectively, the recognition accuracy is improved, the user does not need to input voice many times for the same content, and the input speed of voice input is improved.
As shown in fig. 12, a schematic structural diagram of an embodiment 5 of an electronic device provided in the present application is shown, where the electronic device may be implemented by the following structure: a receiving module 1201, a first analyzing module 1202 and a second analyzing module 1203;
wherein, the second analysis module 1203 includes: a second identifying unit 1204 and a second adjusting unit 1205.
The structural functions of the receiving module 1201 and the first analyzing module 1202 are the same as those of the embodiment 1, and are not described again in this embodiment.
The second recognition unit 1204 is configured to recognize the second speech information according to a preset speech recognition model, and adjust a matching degree of a recognized recognition item by using each recognition item in the first recognition result;
the second adjusting unit 1205 is configured to sort the obtained identification items according to the adjusted matching degree, so as to obtain the second identification result.
In summary, with the electronic device provided in this embodiment, when the electronic device recognizes the second voice information, it combines the history information that the first recognition result was produced by the first recognition and proved to be erroneous, and adjusts the ordering of the recognition items of the second voice recognition in real time, so that recognition items in the second recognition result that differ from the first recognition result can be displayed to the user, and the second recognition result is different from the first recognition result.
As shown in fig. 13, a schematic structural diagram of an embodiment 6 of an electronic device provided in the present application is shown, where the electronic device may be implemented by the following structure: a receiving module 1301, a first analyzing module 1302, a second analyzing module 1303, a confirming module 1304, and an instruction module 1305;
the structural functions of the receiving module 1301, the first analyzing module 1302, and the second analyzing module 1303 are the same as those of the embodiment 1, and are not described in detail in this embodiment.
The confirmation module 1304 is configured to receive confirmation information, where the confirmation information indicates that the user determines that one identification item in the second identification result is a target identification item;
the instruction module 1305 is configured to generate instruction information according to the confirmation information and the target identification item, and execute the instruction information.
In summary, with the electronic device provided in this embodiment, after the user determines that one recognition item in the second recognition result is the target recognition item, the electronic device generates instruction information from the content corresponding to the target recognition item and executes it, so the electronic device is controlled according to the voice input, and the control manner is simple and easy to implement.
As shown in fig. 14, a schematic structural diagram of an embodiment 7 of an electronic device provided in the present application is shown, where the electronic device may be implemented by the following structure: a receiving module 1401, a first analyzing module 1402, a second analyzing module 1403, a confirming module 1404, an instruction module 1405, an obtaining module 1406, and a training module 1407;
the receiving module 1401, the first analyzing module 1402, the second analyzing module 1403, the confirming module 1404, and the instruction module 1405 are consistent with the corresponding structural functions in embodiment 6, and are not described in detail in this embodiment.
The obtaining module 1406 is configured to obtain the first voice information, the second voice information, and the target identification item;
the training module 1407 is configured to train a preset speech recognition model in the electronic device based on the first speech information, the second speech information, and the target recognition item.
In summary, with the electronic device provided in this embodiment, the speech recognition model is adaptively trained on the speech information input by the user, which improves the match between the speech recognition model and the user's pronunciation, so that when the user inputs speech information again it can be recognized quickly and accurately. The recognition accuracy is improved, and the speech input speed can be further increased.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing detailed description is directed to an information processing method and an electronic device provided by the present application, and specific examples are applied herein to illustrate the principles and implementations of the present application, and the descriptions of the foregoing examples are only used to help understand the method and the core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (13)

1. An information processing method, characterized in that the method comprises:
receiving first voice information, wherein the first voice information represents voice sent by a user according to content to be input;
analyzing the first voice information to obtain a first recognition result;
when the information confirming the first recognition result is not received and second voice information is received, analyzing the second voice information based on the first recognition result to obtain a second recognition result;
the first voice information and the second voice information correspond to the same content to be input, and the second recognition result is different from the first recognition result;
wherein the analyzing the second voice information based on the first recognition result to obtain a second recognition result comprises:
recognizing the second voice information according to a preset voice recognition model to obtain a third recognition result, wherein the third recognition result comprises at least two recognition items;
adjusting at least two identification items in the third identification result based on the first identification result to obtain a second identification result;
wherein, when the first recognition item in the third recognition result is the same as the first recognition item in the first recognition result, the adjusting at least two recognition items in the third recognition result based on the first recognition result to obtain the second recognition result comprises:
and according to a preset second algorithm, reducing the matching degree of the first recognition item in the third recognition result and the second voice information, and increasing the matching degree of the non-first recognition item in the third recognition result and the second voice information to obtain the second recognition result.
2. The method of claim 1, wherein obtaining the first recognition result further comprises:
starting timing from the reception of the first voice message, stopping timing when receiving the second voice message, and obtaining a timing value;
comparing the timing value with a preset time threshold value to obtain a first comparison result;
comparing the first voice information with the second voice information to obtain a second comparison result;
and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice information and the second voice information are matched, determining that the first voice information and the second voice information correspond to the same content to be input, and executing the step of analyzing the second voice information based on the first recognition result to obtain a second recognition result.
3. The method of claim 2, wherein the comparing the first voice message with the second voice message to obtain a second comparison result comprises:
comparing the first voice information with the second voice information according to a preset similarity rule to obtain a similarity value;
and when the similarity value is larger than a preset threshold value, judging that the first voice information is matched with the second voice information.
4. The method of claim 1, wherein analyzing the second speech information based on the first recognition result to obtain a second recognition result comprises:
recognizing the second voice information according to a preset voice recognition model, and adjusting the matching degree of the recognized recognition items by utilizing the recognition items in the first recognition result;
and sequencing the acquired identification items according to the adjusted matching degree to obtain the second identification result.
5. The method of claim 1, wherein after analyzing the second speech information based on the first recognition result to obtain a second recognition result, further comprising:
receiving confirmation information, wherein the confirmation information represents that a user determines that one identification item in the second identification result is a target identification item;
and generating instruction information according to the confirmation information and the target identification item, and executing the instruction information.
6. The method of claim 5, wherein after the executing the instruction information, further comprising:
acquiring the first voice information, the second voice information and the target identification item;
and training a preset voice recognition model in the electronic equipment based on the first voice information, the second voice information and the target recognition item.
7. The method of claim 1, wherein the electronic device displays a preset number of identification items in order of matching degree, and when the preset number of identification items in the third identification result from a first identification item is the same as a corresponding identification item in the first identification result, the adjusting at least two identification items in the third identification result based on the first identification result to obtain a second identification result comprises:
according to a preset third algorithm, the matching degree of the preset number of recognition items and the second voice information from the first recognition item in the third recognition result is reduced, and the matching degree of the recognition items behind the preset number of recognition items and the second voice information in the second recognition result is increased.
8. An electronic device, comprising:
the receiving module is used for receiving first voice information, and the first voice information represents voice sent by a user according to the content to be input;
the first analysis module is used for analyzing the first voice information to obtain a first recognition result;
the second analysis module is used for analyzing the second voice information based on the first recognition result to obtain a second recognition result when the information confirming the first recognition result is not received and the second voice information is received;
the first voice information and the second voice information correspond to the same content to be input, and the second recognition result is different from the first recognition result;
wherein the second analysis module comprises:
the first recognition unit is used for recognizing the second voice information according to a preset voice recognition model to obtain a third recognition result, and the third recognition result comprises at least two recognition items;
a first adjusting unit, configured to adjust at least two identification items in the third identification result based on the first identification result, so as to obtain a second identification result;
wherein, when the first recognition item in the third recognition result is the same as the first recognition item in the first recognition result, the adjusting at least two recognition items in the third recognition result based on the first recognition result to obtain the second recognition result comprises:
and according to a preset second algorithm, reducing the matching degree of the first recognition item in the third recognition result and the second voice information, and increasing the matching degree of the non-first recognition item in the third recognition result and the second voice information to obtain the second recognition result.
9. The electronic device of claim 8, further comprising:
the timing module is used for starting timing from the reception of the first voice message and stopping timing when receiving the second voice message to obtain a timing value;
the first comparison module is used for comparing the timing value with a preset time threshold value to obtain a first comparison result;
the second comparison module is used for comparing the first voice information with the second voice information to obtain a second comparison result;
and when the first comparison result shows that the timing value is smaller than the time threshold value and the second comparison result shows that the first voice message and the second voice message are matched, determining that the first voice message and the second voice message correspond to the same content to be input, and triggering a second analysis module.
10. The electronic device of claim 9, wherein the second comparison module comprises:
the comparison unit is used for comparing the first voice information with the second voice information according to a preset similarity rule to obtain a similar value;
and the judging unit is used for judging that the first voice information is matched with the second voice information when the similarity value is larger than a preset threshold value.
11. The electronic device of claim 8, wherein the second analysis module comprises:
the second recognition unit is used for recognizing the second voice information according to a preset voice recognition model and adjusting the matching degree of the recognized recognition items by utilizing the recognition items in the first recognition result;
and the second adjusting unit is used for sequencing the acquired identification items according to the adjusted matching degree to obtain a second identification result.
12. The electronic device of claim 8, further comprising:
the confirmation module is used for receiving confirmation information, and the confirmation information represents that a user determines that one identification item in the second identification result is a target identification item;
and the instruction module is used for generating instruction information according to the confirmation information and the target identification item and executing the instruction information.
13. The electronic device of claim 12, further comprising:
the acquisition module is used for acquiring the first voice information, the second voice information and the target identification item;
and the training module is used for training a preset voice recognition model in the electronic equipment based on the first voice information, the second voice information and the target recognition item.
CN201410840387.1A 2014-12-30 2014-12-30 Information processing method and electronic equipment Active CN105810188B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410840387.1A CN105810188B (en) 2014-12-30 2014-12-30 Information processing method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410840387.1A CN105810188B (en) 2014-12-30 2014-12-30 Information processing method and electronic equipment

Publications (2)

Publication Number Publication Date
CN105810188A CN105810188A (en) 2016-07-27
CN105810188B true CN105810188B (en) 2020-02-21

Family

ID=56980845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410840387.1A Active CN105810188B (en) 2014-12-30 2014-12-30 Information processing method and electronic equipment

Country Status (1)

Country Link
CN (1) CN105810188B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107195300B (en) * 2017-05-15 2019-03-19 珠海格力电器股份有限公司 Sound control method and system
CN107195302A (en) * 2017-06-02 2017-09-22 努比亚技术有限公司 A kind of method of Voice command and corresponding system, terminal device
CN107240398B (en) * 2017-07-04 2020-11-17 科大讯飞股份有限公司 Intelligent voice interaction method and device
CN109119073A (en) * 2018-06-25 2019-01-01 福来宝电子(深圳)有限公司 Audio recognition method, system, speaker and storage medium based on multi-source identification
CN109308897B (en) * 2018-08-27 2022-04-26 广东美的制冷设备有限公司 Voice control method, module, household appliance, system and computer storage medium
CN110706691B (en) * 2019-10-12 2021-02-09 出门问问信息科技有限公司 Voice verification method and device, electronic equipment and computer readable storage medium
CN110838284B (en) * 2019-11-19 2022-06-14 大众问问(北京)信息科技有限公司 Method and device for processing voice recognition result and computer equipment
CN111199730B (en) * 2020-01-08 2023-02-03 北京小米松果电子有限公司 Voice recognition method, device, terminal and storage medium
CN111326140B (en) * 2020-03-12 2023-05-30 科大讯飞股份有限公司 Speech recognition result discriminating method, correcting method, device, equipment and storage medium
WO2023005362A1 (en) * 2021-07-30 2023-02-02 深圳传音控股股份有限公司 Processing method, processing device and storage medium
CN113314120B (en) * 2021-07-30 2021-12-28 深圳传音控股股份有限公司 Processing method, processing apparatus, and storage medium
CN113782023A (en) * 2021-09-26 2021-12-10 中电科思仪科技股份有限公司 Voice control method and system based on program control instruction
CN114842871A (en) * 2022-03-25 2022-08-02 青岛海尔科技有限公司 Voice data processing method and device, storage medium and electronic device
CN115798465B (en) * 2023-02-07 2023-04-07 天创光电工程有限公司 Voice input method, system and readable storage medium
CN117789706B (en) * 2024-02-27 2024-05-03 富迪科技(南京)有限公司 Audio information content identification method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1612208A (en) * 2003-10-30 2005-05-04 台达电子工业股份有限公司 Voice identifying method
CN1905007A (en) * 2005-07-27 2007-01-31 日本电气株式会社 Voice recognition system and method
CN1941077A (en) * 2005-09-27 2007-04-04 株式会社东芝 Apparatus and method speech recognition of character string in speech input
CN103000173A (en) * 2012-12-11 2013-03-27 优视科技有限公司 Voice interaction method and device
CN103106061A (en) * 2013-03-05 2013-05-15 北京车音网科技有限公司 Voice input method and device
CN103645876A (en) * 2013-12-06 2014-03-19 百度在线网络技术(北京)有限公司 Voice inputting method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3762327B2 (en) * 2002-04-24 2006-04-05 株式会社東芝 Speech recognition method, speech recognition apparatus, and speech recognition program

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1612208A (en) * 2003-10-30 2005-05-04 台达电子工业股份有限公司 Voice identifying method
CN1905007A (en) * 2005-07-27 2007-01-31 日本电气株式会社 Voice recognition system and method
CN1941077A (en) * 2005-09-27 2007-04-04 株式会社东芝 Apparatus and method speech recognition of character string in speech input
CN103000173A (en) * 2012-12-11 2013-03-27 优视科技有限公司 Voice interaction method and device
CN103106061A (en) * 2013-03-05 2013-05-15 北京车音网科技有限公司 Voice input method and device
CN103645876A (en) * 2013-12-06 2014-03-19 百度在线网络技术(北京)有限公司 Voice inputting method and device

Also Published As

Publication number Publication date
CN105810188A (en) 2016-07-27

Similar Documents

Publication Publication Date Title
CN105810188B (en) Information processing method and electronic equipment
US11900939B2 (en) Display apparatus and method for registration of user command
US10114809B2 (en) Method and apparatus for phonetically annotating text
CN110517685B (en) Voice recognition method and device, electronic equipment and storage medium
JP2019102063A (en) Method and apparatus for controlling page
Tinwala et al. Eyes-free text entry with error correction on touchscreen mobile devices
CN110534109B (en) Voice recognition method and device, electronic equipment and storage medium
US20180300542A1 (en) Drawing emojis for insertion into electronic text-based messages
US20190095430A1 (en) Speech translation device and associated method
CN110109541B (en) Multi-modal interaction method
US20140184542A1 (en) Electronice device, handwriting input recognition system, and method for recognizing handwritten input thereof
KR101474856B1 (en) Apparatus and method for generateg an event by voice recognition
US20210165624A1 (en) Content prioritization for a display array
CN108549493B (en) Candidate word screening method and related equipment
US20160092104A1 (en) Methods, systems and devices for interacting with a computing device
WO2017176470A2 (en) Faster text entry on mobile devices through user-defined stroke patterns
KR101447879B1 (en) Apparatus and method for selecting a control object by voice recognition
EP3010016B1 (en) Input information support apparatus, method for supporting input information, and input information support program
CN112684936A (en) Information identification method, storage medium and computer equipment
KR20130102702A (en) Multi-modal input device using handwriting and voice recognition and control method thereof
US11625545B2 (en) Systems and methods for improved conversation translation
US9613311B2 (en) Receiving voice/speech, replacing elements including characters, and determining additional elements by pronouncing a first element
JP5850886B2 (en) Information processing apparatus and method
US10474886B2 (en) Motion input system, motion input method and program
CN114678019A (en) Intelligent device interaction method and device, storage medium and electronic device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant