CN107305769B - Voice interaction processing method, device, equipment and operating system

Voice interaction processing method, device, equipment and operating system

Info

Publication number: CN107305769B
Authority: CN (China)
Prior art keywords: user, voice, input, information, guide
Legal status: Active (the legal status is an assumption and is not a legal conclusion; no legal analysis has been performed, and no representation is made as to the accuracy of the status listed)
Application number: CN201610250092.8A
Other languages: Chinese (zh)
Other versions: CN107305769A
Inventors: 郭云云, 蔡丽娟
Assignee (current and original): Zebra Network Technology Co Ltd (the listed assignee may be inaccurate)
Application filed by: Zebra Network Technology Co Ltd
Priority: CN201610250092.8A
Publications: CN107305769A (application), CN107305769B (grant)

Classifications

    • G: Physics
    • G10: Musical instruments; Acoustics
    • G10L: Speech analysis techniques or speech synthesis; speech recognition; speech or voice processing techniques; speech or audio coding or decoding
    • G10L 15/00: Speech recognition
    • G10L 15/26: Speech to text systems
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/221: Announcement of recognition results
    • G10L 2015/225: Feedback of the input speech

Landscapes

  • Engineering & Computer Science
  • Computational Linguistics
  • Health & Medical Sciences
  • Audiology, Speech & Language Pathology
  • Human Computer Interaction
  • Physics & Mathematics
  • Acoustics & Sound
  • Multimedia
  • User Interface Of Digital Computer

Abstract

The application provides a voice interaction processing method, apparatus, device, and operating system. The method includes: collecting voice information input by a user from a voice input interface; recognizing the voice information to obtain a voice recognition result; and, if the voice recognition result is a failure, waiting for the user to input voice information again while pushing input help information to the user on the voice input interface, so that the user can input voice information according to the input help information. By pushing input help information on the voice input interface while waiting for the user to re-input, the method, apparatus, device, and operating system improve the efficiency of voice interaction, simplify the user's operation, and improve the user experience.

Description

Voice interaction processing method, device, equipment and operating system
Technical Field
The present application relates to intelligent device processing technologies, and in particular, to a method, an apparatus, a device, and an operating system for processing voice interaction.
Background
With the continuous development of electronic information technology, user equipment has become increasingly powerful. A voice assistant is an application that implements or replaces part of the functions of user equipment through voice interaction. Because it greatly improves the convenience of operating user equipment in different scenarios, the voice assistant is used more and more widely across all kinds of user equipment.
In the prior art, after a user opens a voice assistant, the user can input voice information through a device such as a microphone, and the voice assistant performs corresponding processing according to the input. When voice input is abnormal, for example when the voice information input by the user cannot be correctly recognized, the voice assistant feeds back a response message such as "Sorry, I didn't hear that clearly." When the user's input fails to be recognized several times in a row, the voice assistant merely repeats this response, and the user can only perform the desired operation in some other way, for example by quitting the voice assistant and searching for the corresponding function in the display menu of the user equipment. This is cumbersome and inefficient, and the user experience is poor.
Disclosure of Invention
The application provides a voice interaction processing method, apparatus, device, and operating system, in order to solve the prior-art technical problem that operating user equipment through voice input is inefficient.
In one aspect, the present application provides a voice interaction processing method, including:
collecting voice information input by a user from a voice input interface;
recognizing the voice information to obtain a voice recognition result;
and if the voice recognition result is failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface so that the user can input the voice information according to the input help information.
In another aspect, the present application provides a voice interaction processing apparatus, including:
the acquisition module is used for acquiring voice information input by a user from the voice input interface;
the recognition module is used for recognizing the voice information to obtain a voice recognition result;
and the control module is used for waiting for the user to input the voice information again when the voice recognition result is failure, and pushing input help information to the user on the voice input interface so that the user can input the voice information according to the input help information.
In yet another aspect, the present application provides a user equipment, including: the device comprises a processor, a display device and a voice input device;
the display equipment is used for displaying a voice input interface to a user;
the voice input equipment is used for collecting voice information input by a user from the voice input interface;
the processor is coupled to the display device and the voice input device, and is configured to recognize the voice information to obtain a voice recognition result, wait for a user to re-input the voice information if the voice recognition result is a failure, and push input help information to the user on the voice input interface, so that the user inputs the voice information according to the input help information.
In yet another aspect, the present application provides a control apparatus for a vehicle, including an onboard voice input device, an onboard processor, and an onboard display device;
the onboard display device is used for displaying a voice input interface to a user;
the onboard voice input device is used for collecting voice information input by the user from the voice input interface;
the onboard processor is coupled to the onboard display device and the onboard voice input device, and is used for recognizing the voice information to obtain a voice recognition result; if the voice recognition result is a failure, waiting for the user to input voice information again while pushing input help information to the user on the voice input interface, so that the user can input voice information according to the input help information.
In another aspect, the present application provides a vehicle-mounted internet operating system, including:
the acquisition control unit is used for controlling the vehicle-mounted voice input equipment to acquire voice information input by a user from the voice input interface;
and the recognition control unit is used for recognizing the voice information to obtain a voice recognition result, waiting for the user to input the voice information again if the voice recognition result is failed, and controlling the vehicle-mounted display equipment to push input help information to the user on the voice input interface so that the user can input the voice information according to the input help information.
In the present application, voice information input by the user from the voice input interface is collected; when the voice recognition result is a failure, the system waits for the user to re-input voice information while pushing input help information to the user on the voice input interface, so that the user can input voice information according to the input help information. This improves the efficiency of voice interaction, simplifies the user's operation, and improves the user experience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a voice interaction processing method according to an embodiment of the present application;
fig. 2 is a schematic diagram of a voice input interface in a voice interaction processing method according to an embodiment of the present application;
fig. 3 is a schematic diagram illustrating that help information is displayed on a voice interaction interface in a voice interaction processing method according to an embodiment of the present application;
fig. 4 is a flowchart of a voice interaction processing method according to a second embodiment of the present application;
fig. 5 is a flowchart of a voice interaction processing method according to a third embodiment of the present application;
fig. 6 is a schematic diagram illustrating a function word displayed on a voice input interface in a voice interaction processing method according to a third embodiment of the present application;
fig. 7 is a schematic diagram illustrating exit information displayed on a voice input interface in a voice interaction processing method according to a third embodiment of the present invention;
fig. 8 is a schematic view illustrating text information displayed on a voice input interface in a voice interaction processing method according to a third embodiment of the present invention;
fig. 9 is a schematic diagram illustrating a response result displayed on a voice input interface in the voice interaction processing method according to the third embodiment of the present invention;
fig. 10 is a block diagram of a voice interaction processing apparatus according to a third embodiment of the present application;
fig. 11 is a block diagram illustrating a structure of a control module in a speech interaction processing apparatus according to a fourth embodiment of the present application;
fig. 12 is a block diagram of a user equipment provided in an embodiment of the present application;
fig. 13 is a block diagram of a control device according to a sixth embodiment of the present application;
fig. 14 is a block diagram illustrating a configuration of an in-vehicle internet operating system according to a seventh embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein is merely one type of association that describes an associated object, meaning that three relationships may exist, e.g., a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
The words "if", as used herein, may be interpreted as "at … …" or "at … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
It is also noted that the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that an article or system that comprises a list of elements includes not only those elements but may include other elements not expressly listed or inherent to such article or system. Without further limitation, an element preceded by "comprises a ..." does not exclude the presence of other like elements in the article or system that comprises the element.
Additionally, the term "vehicle" as used herein includes, but is not limited to, internal combustion engine vehicles, electric vehicles, electric motorcycles, electric mopeds, electric balance scooters, remote-controlled vehicles, small aircraft (e.g., unmanned aircraft, manned small aircraft, remote-controlled aircraft), and variations thereof.
Example one
The embodiment of the application provides a voice interaction processing method. Fig. 1 is a flowchart of a voice interaction processing method according to an embodiment of the present application. As shown in fig. 1, the method in this embodiment may include:
step 101, collecting voice information input by a user from a voice input interface.
Specifically, the method in this embodiment may be applied to a user equipment, and the user equipment may include but is not limited to: vehicle-mounted terminals, cell phones, computers, tablet devices, digital broadcast terminals, messaging devices, game consoles, medical devices, fitness devices, personal digital assistants, smart home devices, and the like.
The user equipment may be provided with a display device, such as a display screen or a touch screen, and the voice input interface may be displayed to a user on the display device. Fig. 2 is a schematic diagram of a voice input interface in a voice interaction processing method according to an embodiment of the present application. As shown in fig. 2, the voice input interface may be displayed to the user after the user turns on the user equipment or inputs a corresponding instruction.
Correspondingly, the user equipment may further include a voice collection device, such as a microphone. When the user enters the voice input interface, voice information indicating an operation the user wants to perform, such as "make a phone call to Wang" or "set the air conditioner temperature to 25 °C", may be input through the voice collection device.
Step 102, recognizing the voice information to obtain a voice recognition result.
After receiving the voice information input by the user, the voice information can be recognized to obtain a voice recognition result.
Specifically, the voice recognition result may be either success or failure. If voice information input by the user is not recognized within a preset time, the voice recognition result is determined to be a failure; for example, if the user does not speak within the preset time after the voice input interface is displayed, the result is a failure. The preset time can be set according to actual needs and may be, for example, 5 s.
Alternatively, if the meaning indicated by the voice information input by the user cannot be successfully recognized, the voice recognition result is determined to be a failure. For example, the user inputs a piece of speech, but the environment is too noisy or the user's pronunciation is not standard, so that the meaning of the input cannot be recognized; the voice recognition result is then considered a failure.
If the meaning represented by the voice information input by the user can be recognized, the voice recognition result is considered a success.
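The two failure cases can be made concrete with a short sketch. This is an illustration only, under assumed interfaces: audio_source.listen and asr_engine.transcribe are hypothetical names, not an API defined by the application.

```python
# Minimal sketch of the recognition step (steps 101-102); all names are assumptions.
LISTEN_TIMEOUT_S = 5  # the "preset time", e.g. 5 s as in the text

def recognize(audio_source, asr_engine):
    """Return a voice recognition result that is either success or failure."""
    audio = audio_source.listen(timeout=LISTEN_TIMEOUT_S)
    if audio is None:
        # Case 1: the user said nothing within the preset time.
        return {"status": "failure", "reason": "timeout"}
    meaning = asr_engine.transcribe(audio)
    if meaning is None:
        # Case 2: noise or unclear pronunciation; the meaning cannot be resolved.
        return {"status": "failure", "reason": "unrecognized"}
    return {"status": "success", "meaning": meaning}
```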
Step 103, if the voice recognition result is failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface, so that the user can input the voice information according to the input help information.
Optionally, if the voice recognition result is a failure, response information may be displayed on the voice input interface. As described above, a failed voice recognition result covers two situations: the user did not input voice information within the preset time, or the meaning of the voice information input by the user could not be recognized. The response information may therefore be set according to the reason for the failure.
For example, if the user did not input voice information within the preset time, the response message may be "Please input voice"; if the meaning of the input could not be recognized, the response message may be "Sorry, I didn't hear that clearly; please input again".
In this embodiment, while waiting for the user to re-input voice information, input help information may be pushed to the user on the voice input interface. The input help information may be a guide sentence or a function word.
A guide sentence is a sentence that guides the user's operation, for example "Will it rain today?" or "Help me navigate to the company". A function word is a word indicating a function of the user equipment: if the user equipment is a vehicle-mounted terminal, the function words may be "navigation", "rear-view mirror", "radio", and so on; if the user equipment is an air conditioner, the function words may be "wind direction", "air volume", "temperature", and so on; if the user equipment is a mobile phone, the function words may be "telephone", "short message", and so on.
The input help information can be pushed in various ways. In this embodiment, pushing input help information to the user on the voice input interface may include:
displaying the input help information to the user on the voice input interface, and/or playing the input help information to the user in voice form on the voice input interface.
Fig. 3 is a schematic diagram of input help information displayed on the voice interaction interface in the voice interaction processing method according to the first embodiment of the present application. As shown in fig. 3, in addition to the response message displayed after the recognition failure ("I didn't hear that; please say it again"), the input help information "Will it rain today?", "Help me navigate to the company", "The air volume is too high", and "Turn off the air conditioner" is displayed on the voice input interface.
Further, above the input help information on the voice input interface, a prompt message "You can tell me things like this" is displayed, prompting the user to input information similar to the input help information.
The input help information may be played aloud at the same time as it is displayed on the voice input interface. Alternatively, it may not be displayed at all and may be pushed to the user only in played voice form.
There may be one or more items of input help information, and they may be arranged at any suitable position on the voice input interface. The number, position, and specific content of the input help information can be set according to actual needs and are not limited to the arrangement shown in fig. 3.
In this embodiment, if the voice information input by the user is successfully recognized, corresponding processing may be performed according to the voice information. The processing method after successful identification belongs to the prior art and is not detailed here.
According to the voice interaction processing method provided by this embodiment, voice information input by the user from the voice input interface is collected; when the voice recognition result is a failure, the system waits for the user to input voice information again while pushing input help information to the user on the voice input interface, so that the user can input voice information according to the input help information. This improves the efficiency of voice interaction, simplifies the user's operation, and improves the user experience.
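Putting steps 101-103 together, a minimal sketch of the loop in fig. 1 might look like this; it reuses recognize() from the sketch above, and the ui helper methods are assumed names:

```python
# Illustrative sketch of the flow of fig. 1, under the same assumptions as above.
def voice_interaction(ui, audio_source, asr_engine, pick_help_info):
    while True:
        result = recognize(audio_source, asr_engine)  # steps 101-102
        if result["status"] == "success":
            return result["meaning"]  # hand off to normal processing
        # Step 103: wait for re-input while pushing input help information.
        ui.display("Sorry, I didn't hear that clearly. Please say it again.")
        for item in pick_help_info():  # guide sentences and/or function words
            ui.display(item)  # show on the voice input interface
            ui.speak(item)    # and/or play it in voice form
        # (a failure-count cap, described below, would bound this loop)
```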
Further, in step 103, if the voice recognition result is a failure, waiting for the user to re-input the voice information and pushing the input help information to the user on the voice input interface may include:
if the voice recognition result is a failure, determining the number of times that recognition of the voice information input by the user on the voice input interface has failed;
waiting for the user to input voice information again, and pushing input help information corresponding to that number of times to the user on the voice input interface.
As described above, when the voice information input by the user on the voice input interface repeatedly fails to be recognized, different input help information may be pushed to the user according to the number of recognition failures.
The correspondence between the number of failures and the input help information may be preset according to actual needs.
For example, guide sentences may be pushed to the user when the first and second recognition attempts fail, and function words may be pushed when the third and subsequent attempts fail.
Alternatively, guide sentences may be pushed when an odd-numbered recognition attempt fails, and function words when an even-numbered attempt fails.
Or, more specifically, guide sentence A may be pushed when the first recognition attempt fails, function word B when the second fails, function words C and D when the third fails, and so on.
If the number of recognition failures reaches a preset threshold, the voice input interface may be closed and voice interaction with the user stopped.
Once the voice information input by the user is successfully recognized, the subsequent normal flow can proceed according to that information, and no further input help information is pushed to the user.
When recognition of the user's voice input fails repeatedly, pushing different input help information according to the number of failures presents the help to the user in more forms and with more content, helping the user complete the voice input faster and improving the efficiency of voice interaction.
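One possible count-to-help mapping (following the first example above) can be sketched as follows; the threshold value and all names are assumptions:

```python
# Sketch: guide sentences for the first two failures, function words afterwards,
# and closing the interface once a preset threshold is reached.
MAX_FAILURES = 5  # hypothetical threshold; set according to actual needs

def help_for_failure(count, guide_sentences, function_words):
    if count >= MAX_FAILURES:
        return None             # caller closes the voice input interface
    if count <= 2:
        return guide_sentences  # e.g. "Help me navigate to the company"
    return function_words       # e.g. "navigation", "radio", "air conditioner"
```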
Example two
The second embodiment of the application provides a voice interaction processing method. This embodiment is based on the technical solution provided by the first embodiment and uses guide sentences as the input help information. Fig. 4 is a flowchart of a voice interaction processing method according to the second embodiment of the present application. As shown in fig. 4, the method in this embodiment may include:
step 201, collecting voice information input by a user from a voice input interface.
Step 202, recognizing the voice information to obtain a voice recognition result.
Steps 201 to 202 are similar to the specific implementation principle of steps 101 to 102 in the first embodiment, and are not described herein again.
Step 203, if the voice recognition result is failure, selecting at least one guide sentence from the guide sentence library.
Specifically, in this embodiment, the input help information displayed on the voice input interface after the recognition failure may be guide sentences, and there may be one or more of them.
The guide sentence may be selected from a library of guide sentences. The guidance sentence library may contain a plurality of guidance sentences related to various functions of the user equipment, and the guidance sentences displayed on the voice input interface may be randomly or sequentially selected from the guidance sentence library.
Step 204, waiting for the user to input the voice information again, and displaying the selected guide sentences to the user on the voice input interface so that the user can input voice information according to the guide sentences.
In this embodiment, the guidance sentence library may be formed as follows: acquiring function information of user equipment; and generating a plurality of guide sentences for each piece of functional information of the user equipment to form the guide sentence library.
For example, the user equipment includes eight functions, and one or more guide sentences are generated for each function information, and assuming that 10 corresponding guide sentences are generated according to each function information, eight function information correspond to 80 guide sentences, and the 80 guide sentences form the guide sentence library.
Accordingly, selecting at least one guide sentence from the guide sentence library may include: selecting one or more function information of the user equipment; and selecting at least one guide statement from the guide statements corresponding to the one or more functional information in the guide statement library.
Specifically, N pieces of function information (N is less than or equal to Y) may be selected from Y pieces of function information of the user equipment, and then X pieces of guide statements may be respectively selected from the guide statements corresponding to the N pieces of function information. Y, N, X can be set according to actual needs.
Assuming that Y, N, X are 8, 3 and 1, respectively, three of the eight pieces of function information of the user equipment are selected, and for the selected three pieces of function information, a corresponding guidance sentence is selected, respectively, and the selected three guidance sentences are displayed on the voice input interface.
Further, the guide sentences displayed on the voice input interface can be refreshed in batches at regular intervals, for example every 2 s, so that more guide sentences are shown to the user to guide the user toward recognizable voice input.
Further, the following principle may be followed when selecting guide sentences: newly selected guide sentences should, as far as possible, differ from previously displayed ones, so that as many different guide sentences as possible are presented.
According to the voice interaction processing method provided by this embodiment, guide sentences corresponding to the functions of the user equipment are selected from the guide sentence library and displayed on the voice input interface. This shows the user the range of voice information that can be input more comprehensively, guides the user to complete the desired operation faster, and further improves the user experience.
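The Y/N/X selection scheme above can be sketched as follows. The library structure, the random strategy, and all names are assumptions for illustration, not a prescribed implementation:

```python
import random

# The library maps each piece of function information to its guide sentences, e.g.
#   {"navigation": ["Help me navigate to the company", ...], "weather": [...], ...}
def select_guide_sentences(library, n_functions=3, x_per_function=1,
                           already_shown=frozenset()):
    # Pick N of the device's Y functions (N <= Y).
    chosen = random.sample(sorted(library), k=min(n_functions, len(library)))
    picked = []
    for fn in chosen:
        # Prefer sentences not shown before, per the principle above.
        fresh = [s for s in library[fn] if s not in already_shown]
        pool = fresh or library[fn]
        # Pick X guide sentences for this function.
        picked += random.sample(pool, k=min(x_per_function, len(pool)))
    return picked
```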
On the basis of the technical solution provided by the above embodiment, preferably, the method may further include:
if the voice information input by the user is successfully identified, searching a guide statement corresponding to the voice information in the guide statement library;
if the guide statement corresponding to the voice information is found, determining the times of successful recognition of the guide statement;
and if the times meet a preset condition, deleting the guide statement corresponding to the voice information from the guide statement library.
For example, if the voice information "Navigate to the company" input by the user is successfully recognized, it may be determined whether the guide sentence "Navigate to the company" exists in the guide sentence library. If it does, whether to delete it from the library is decided according to the number of times it has been successfully recognized. If the preset condition is "more than five times", then once the user has successfully input "Navigate to the company" six times, the guide sentence is deleted from the guide sentence library.
If the number of times a guide sentence has been successfully recognized meets the preset condition, the user has already input that sentence many times and no longer needs the corresponding guidance. Deleting such guide sentences from the library ensures that guidance the user has already mastered no longer appears on the voice input interface, improving the efficiency of the user's voice input.
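A brief sketch of this pruning rule, assuming an in-memory success counter and the library structure from the previous sketch (all names are illustrative):

```python
# The threshold of five follows the example above.
SUCCESS_THRESHOLD = 5
success_counts = {}  # guide sentence -> times it was successfully recognized

def on_recognition_success(text, library):
    for sentences in library.values():
        if text in sentences:
            success_counts[text] = success_counts.get(text, 0) + 1
            if success_counts[text] > SUCCESS_THRESHOLD:
                sentences.remove(text)  # guidance no longer needed
            break
```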
Example three
The third embodiment of the present application provides a voice interaction processing method. This embodiment is based on the technical solution provided by the first embodiment: guide sentences are used as the input help information when the first recognition attempt fails, and function words are used when the second attempt fails. Fig. 5 is a flowchart of a voice interaction processing method according to the third embodiment of the present application. As shown in fig. 5, the method in this embodiment may include:
step 301, collecting voice information input by a user from a voice input interface.
Step 302, recognizing the voice information to obtain a voice recognition result.
Step 303, if the voice recognition result is a failure, waiting for the user to re-input the voice information, and pushing a guide statement to the user on the voice input interface.
The specific implementation principle of steps 301 to 303 is similar to that of the foregoing embodiment, and is not described here again.
And step 304, receiving the voice information input again by the user.
And 305, recognizing the voice information input again by the user to obtain a voice recognition result.
Step 306, if the voice recognition result is a failure, waiting for the user to input voice information again, and pushing function words of the user equipment to the user on the voice input interface.
In this embodiment, after the first voice recognition failure, that is, in step 303, a guide sentence may be displayed to the user on the voice input interface while waiting for re-input. When the second voice recognition fails, that is, when the re-input voice information is also not successfully recognized, a function word may be displayed to the user on the voice input interface in step 306. The function word may be function information of the user equipment, such as "navigation" or "air conditioning".
The voice input interface in step 301 may be as shown in fig. 2. When the user turns on the corresponding function of the user equipment and enters the voice input interface, a welcome sentence such as "Hello! How can I help?" may be displayed; the welcome sentence may also be announced by voice while it is displayed.
Under the voice input interface shown in fig. 2, the system waits for the user to input voice information. If recognition of the user's input fails, guide sentences are pushed to the user on the voice input interface. The guide sentences displayed after the first recognition failure may be as shown in fig. 3: "Will it rain today?", "Help me navigate to the company", "The air volume is too high", and "Turn off the air conditioner". While the interface shown in fig. 3 is displayed, a response message such as "I didn't hear that clearly; please say it again" may also be broadcast to the user by voice, and the system waits for the user to input again.
When the voice information input by the user the second time still cannot be successfully recognized, function words corresponding to the user equipment can be displayed. Fig. 6 is a schematic diagram of function words displayed on the voice input interface in the voice interaction processing method according to the third embodiment of the present application. As shown in fig. 6, the function words displayed on the voice input interface include: navigation, music, radio, search, air conditioning, telephone calls, weather, and screen brightness. While the interface shown in fig. 6 is displayed, a voice response message such as "I didn't understand; you can tell me using the prompts below, or tap one directly" can also be broadcast to the user, and the system waits for the user to input again.
In practical applications, when the first recognition of the user's voice input fails, guide sentences can be displayed to prompt the user to input recognizable voice information; when the second recognition fails, function words can be displayed to remind the user of the functions of the user equipment, so that the user can input correct voice information with a shorter utterance based on the displayed function words, improving the user experience.
Further, when recognition of the user's voice input fails several times, guide sentences and function words may be displayed alternately. For example, on the basis of the above steps, guide sentences may be displayed on the third failure, function words on the fourth, guide sentences on the fifth, and so on. When recognition has failed a certain number of times, the voice input interface may be exited.
For example, when the voice information input by the user the third time still cannot be successfully recognized, exit information may be displayed to the user and the voice input interface closed. Fig. 7 is a schematic diagram of exit information displayed on the voice input interface in the voice interaction processing method according to the third embodiment. As shown in fig. 7, when the third recognition attempt still fails, the exit message "I can't understand the Martian you're speaking. See you next time!" is displayed. While the interface shown in fig. 7 is displayed, the exit message may be broadcast by voice at the same time, and the voice input interface is closed after the broadcast finishes, or after the exit information has been displayed for a certain time.
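Under the assumptions of the earlier sketches, the three-attempt flow of this embodiment might look as follows; recognize_once and the ui helpers are hypothetical names, not part of the application:

```python
# Sketch: guide sentences after the first failure, function words after the
# second, exit after the third; recognize_once() returns text or None.
def run_dialog(ui, recognize_once, guide_sentences, function_words, max_tries=3):
    for attempt in range(1, max_tries + 1):
        text = recognize_once()
        if text is not None:
            return text  # enter the normal flow (figs. 8 and 9)
        if attempt == max_tries:
            ui.display("I can't understand the Martian you're speaking. "
                       "See you next time!")
            ui.close()   # close the voice input interface
            return None
        # First failure: guide sentences; second failure: function words.
        ui.push(guide_sentences if attempt == 1 else function_words)
```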
When the user's voice input is successfully recognized, the normal flow begins. Specifically, the voice information may first be converted into text information and displayed on the voice input interface. Fig. 8 is a schematic diagram of text information displayed on the voice input interface in the voice interaction processing method according to the third embodiment. As shown in fig. 8, if the user says "What is the fastest way to Shanghai Jiao Tong University?", the corresponding text is displayed on the voice input interface after successful recognition.
Then, a response result for that query may be pushed to the user. Fig. 9 is a schematic diagram of a response result displayed on the voice input interface in the voice interaction processing method according to the third embodiment. As shown in fig. 9, the response result may include a plurality of options for the user to select, for example: "200 m, Shanghai Jiao Tong University Xuhui Campus", "210 m, Ninth People's Hospital affiliated to Shanghai Jiao Tong University School of Medicine", "13.4 km, Shanghai Jiao Tong University Minhang Campus", and "13.4 km, Shanghai Jiao Tong University Minhang Campus North Gate".
The user may select one of the options by tapping the screen or by voice; for example, saying "the first one" selects the Xuhui Campus 200 m from the current location.
When voice input under the interface shown in fig. 9 cannot be successfully recognized, a response such as "Sorry, I didn't hear that clearly; please say it again" or "Still didn't catch that; you can say 'the second one', 'previous page', or 'next page'" can be broadcast to the user. When repeated inputs cannot be successfully recognized, the exit message "I can't understand the Martian you're speaking; see you next time" can be broadcast, after which the voice input interface is closed.
According to the voice interaction processing method provided by this embodiment, different input help information is pushed to the user after the first and second recognition failures, guiding the user more comprehensively and improving the success rate of re-input.
On the basis of the technical solution provided by the above embodiment, preferably, after the input help information is displayed to the user on the voice input interface, input help information selected by the user by tapping the voice input interface may be received and processed accordingly.
When the user selects an item of input help information, this is equivalent to the user inputting it in voice form. For example, suppose three items of input help information are displayed on the voice input interface: "Navigate to the company", "Turn on the radio", and "Turn up the air conditioner temperature". If the user taps the guide sentence "Turn up the air conditioner temperature", the air conditioner temperature is raised accordingly. Selecting a desired operation by tapping the voice input interface thus has the same effect as inputting it by voice, which diversifies the user's input modes and brings convenience to the user.
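A tiny sketch of this equivalence, with handle_command as a hypothetical callback shared by both input paths:

```python
# Tapping a displayed help item converges on the same handler as a
# successful voice input.
def on_help_item_tapped(item_text, handle_command):
    handle_command(item_text)  # same effect as speaking item_text aloud

# e.g. tapping "Turn up the air conditioner temperature" raises the
# temperature exactly as if the user had said it.
```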
On the basis of the technical solution provided by the foregoing embodiment, preferably, the voice interaction processing method may further include:
if the voice information input by the user is successfully identified, processing according to the voice information input by the user and the user attribute information;
wherein the user attribute information includes at least one of:
the age, sex, occupation, address, location, preference and historical operating record of the user.
Specifically, before processing according to the voice information and user attribute information, the user account information may be determined from the user's login information or from the voiceprint of the user's voice input, and the user attribute information is then determined from the account information.
The correspondence between user account information and user attribute information can be stored on the user equipment, so that when a user logs in or inputs voice, the account information and then the attribute information are determined accordingly. Alternatively, the correspondence can be stored on a server, and after the account information is determined, the attribute information is obtained through interaction with the server. Of course, the account information may also be determined in other ways, such as by the user's fingerprint, which is not limited here.
After the user attribute information is acquired, when the voice information input by the user is successfully recognized, corresponding processing can be performed according to both the voice information and the user attribute information. The same voice information may be processed in different ways for different user attribute information.
For example, suppose the user attribute information indicates the user's preferences and the input voice information is "music". If user A prefers light music, light music can be played to user A directly; if user B prefers rock music, rock music can be played to user B directly; if user C has no recorded preference, the query "What music would you like to listen to?" can be sent to user C.
Performing processing according to both the voice information input by the user and the user attribute information makes the processing better match the user's attributes and meets the user's personalized needs.
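As a rough illustration of this attribute-aware handling (the profile keys and return strings are assumptions, not part of the application):

```python
# The same command is handled differently according to user attribute information.
def handle_with_profile(command, profile):
    if command == "music":
        preferred = profile.get("preference")
        if preferred:
            return f"play {preferred}"  # e.g. light music for A, rock for B
        return "ask: What music would you like to listen to?"
    return f"handle {command} by the normal flow"

profile_b = {"preference": "rock music"}  # looked up via the account information
print(handle_with_profile("music", profile_b))  # -> play rock music
```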
With the voice interaction processing methods provided by the above embodiments, input help information can be pushed to the user promptly when the user's voice input is not successfully recognized, guiding the user to correctly input the desired operation information and effectively improving interaction efficiency. The technical solutions are particularly effective when applied in a vehicle: when driving or riding, the user's environment is noisy, so voice input is not easily recognized, and it is inconvenient for the user to move closer to the microphone to raise the recognition rate, which makes the methods provided by the above embodiments especially suitable.
In practical applications, when a user drives on the road, the in-vehicle environment is often noisy, and the microphone is usually mounted on the steering wheel or the center console, at some distance from the user, so voice input is harder to recognize successfully than in other application scenarios. Moreover, for safety reasons it is inconvenient for the driver to lean toward the steering wheel or the center console to input voice while driving.
Further, when the first recognition of the user's voice input fails, input help information is pushed to the user on the voice input interface. If that information does not contain the operation the user wants, the user can input voice again; if the second recognition also fails, input help information can be pushed again, and its content can differ from the content pushed the first time. This gives the user a more comprehensive set of choices, guides the user to complete the desired operation faster, and avoids the impact on driving safety of repeatedly inputting voice information over a long period.
When the methods provided by the above embodiments are applied to other vehicles, such as airplanes, ships, and trains, similar beneficial effects can be obtained, providing great convenience to users.
Voice interaction processing apparatuses according to one or more embodiments of the present application are described in detail below. These apparatuses may be implemented in the infrastructure of a vehicle or a mobile terminal, or in an interactive system of a server and client devices. Those skilled in the art will appreciate that a voice interaction processing apparatus can be constructed from commercially available hardware components configured to perform the steps taught in this scheme. For example, the processor components (or processing modules, control modules, processors, controllers) may use single-chip microcomputers, microcontrollers, microprocessors, and similar components from Texas Instruments, Intel Corporation, ARM, and so on.
Example four
The fourth embodiment of the application provides a voice interaction processing device. Fig. 10 is a block diagram of a voice interaction processing apparatus according to a fourth embodiment of the present application. As shown in fig. 10, the apparatus in this embodiment may include:
the acquisition module 401 is configured to collect voice information input by a user from a voice input interface;
the recognition module 402 is configured to recognize the voice information to obtain a voice recognition result;
the control module 403 is configured to wait for the user to re-input the voice information when the voice recognition result is a failure, and push input help information to the user on the voice input interface, so that the user inputs the voice information according to the input help information.
This embodiment may be used to execute the voice interaction processing method described in the first embodiment; its specific implementation principle is similar to that of the foregoing embodiment and is not repeated here.
The voice interaction processing apparatus provided by this embodiment collects voice information input by the user from the voice input interface; when the voice recognition result is a failure, it waits for the user to input voice information again while pushing input help information to the user on the voice input interface, so that the user can input voice information according to the input help information. This improves voice interaction efficiency, simplifies the user's operation, and improves the user experience.
Further, the identifying module 402 may be specifically configured to:
if the voice information input by the user is not recognized within the preset time, determining that the voice recognition result is failure;
or,
and if the meaning represented by the voice information input by the user cannot be successfully recognized, determining that the voice recognition result is failure.
Further, the control module 403 may specifically be configured to:
if the voice recognition result is failure, determining the frequency of failure of recognizing the voice information input by the user on the voice input interface;
and waiting for the user to input the voice information again, and pushing the input help information corresponding to the times to the user on the voice input interface.
Fig. 11 is a block diagram of a control module in a speech interaction processing apparatus according to a fourth embodiment of the present application. As shown in fig. 11, the control module 403 may include a display pushing unit 4031 and/or a voice pushing unit 4032:
the display pushing unit 4031 is used for displaying input help information to a user on the voice input interface;
the voice pushing unit 4032 is used for playing the input help information in a voice form to the user on the voice input interface.
Further, the display and push unit 4031 may be specifically configured to:
displaying at least one guide sentence to a user on the voice input interface.
Further, the display and push unit 4031 may be specifically configured to:
selecting at least one guide sentence from the guide sentence library;
displaying the selected guidance statement to a user on the voice input interface.
Further, the control module 403 may further include a generation unit 4033:
the generating unit 4033 may be configured to: acquiring function information of user equipment before displaying at least one guide statement to a user on the voice input interface; and generating a plurality of guide sentences for each piece of functional information of the user equipment to form the guide sentence library.
Further, the display and push unit 4031 may be specifically configured to: selecting one or more function information of the user equipment;
and selecting at least one guide statement from the guide statements corresponding to the one or more functional information in the guide statement library.
Further, the generating unit 4033 may be further configured to:
if the voice information input by the user is successfully identified, searching a guide statement corresponding to the voice information in the guide statement library;
if the guide statement corresponding to the voice information is found, determining the times of successful recognition of the guide statement;
and if the times meet a preset condition, deleting the guide statement corresponding to the voice information from the guide statement library.
Further, the control module 403 may further include: a re-input unit 4034;
the re-input unit 4034 may be configured to:
after the help information is pushed and input to the user on the voice input interface, receiving voice information input again by the user;
recognizing the voice information re-input by the user to obtain a voice recognition result;
and if the voice recognition result is failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface, wherein the input help information is a functional word of the user equipment.
Further, the re-input unit 4034 may be further configured to:
after the input help information is displayed to the user on the voice input interface, receiving input help information selected by the user through clicking the voice input interface;
and processing according to the input help information selected by the user.
Further, the re-input unit 4034 may be further configured to:
if the voice information input by the user is successfully identified, processing according to the voice information input by the user and the user attribute information;
wherein the user attribute information includes at least one of:
the age, sex, occupation, address, location, preference and historical operating record of the user.
Further, the re-input unit 4034 may be further configured to:
before processing according to voice information input by a user and user attribute information, determining the user account information according to login information of the user, or determining the user account information according to voiceprint information of input voice of the user;
and determining the user attribute information according to the user account information.
Example five
The fifth embodiment of the application provides user equipment. Fig. 12 is a block diagram of a user equipment according to a fifth embodiment of the present application. As shown in fig. 12, the user equipment in this embodiment may include: a processor 501, a display device 502, and a voice input device 503;
the display device 502 is used for displaying a voice input interface to a user;
the voice input device 503 is configured to collect voice information input by a user from the voice input interface;
the processor 501, coupled to the display device 502 and the voice input device 503, is configured to recognize the voice information to obtain a voice recognition result, and if the voice recognition result is a failure, wait for the user to re-input the voice information, and push input help information to the user on the voice input interface, so that the user inputs the voice information according to the input help information.
The user equipment may be a vehicle-mounted terminal, a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like. The display device 502 may be a touch display, a liquid crystal display, etc., and the voice input device 503 may be a microphone, etc.
The user equipment provided in this embodiment may be configured to execute the driving record processing method described in each of the foregoing embodiments, and a specific implementation principle of the user equipment is similar to that in the foregoing embodiments, and details are not described here again.
The user equipment provided by this embodiment collects the voice information input by the user from the voice input interface, waits for the user to re-input the voice information when the voice recognition result is a failure, and meanwhile pushes input help information to the user on the voice input interface, so that the user can input the voice information according to the input help information, thereby improving voice interaction efficiency, simplifying user operation, and improving user experience.
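Tying the three components together, the processor's behavior reduces to the control loop below. Every name is an assumption for illustration; the embodiment does not define this API.

    # Sketch of the processor's loop on the user equipment: collect
    # audio, attempt recognition, and on failure keep the interface
    # open while pushing input help information to the user.
    def interaction_loop(mic, recognizer, display, help_library):
        while True:
            audio = mic.capture()                 # voice input device 503
            result = recognizer.recognize(audio)
            if result.ok:
                return result.text                # hand off for processing
            # Failure: wait for re-input and push help on the interface.
            display.show_help(help_library.select(n=3))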
Further, the processor 501 may push the input help information to the user by: controlling the display device 502 to display the input help information to the user on the voice input interface.
Further, the user equipment may further include: an audio output device coupled to the processor 501; the audio output device may be a speaker or the like.
Accordingly, the processor 501 may push the input help information to the user by controlling the audio output device to play the input help information in voice form to the user on the voice input interface.
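The display push and the voice push are two interchangeable output strategies for the same help information; a sketch of that reading, with the wrapper classes assumed for illustration:

    # Sketch: push input help either visually (display device 502) or
    # as synthesized speech (audio output device). Nothing here is a
    # real device API; the wrappers are assumptions for illustration.
    class DisplayPush:
        def __init__(self, display):
            self.display = display

        def push(self, help_items):
            self.display.render(help_items)  # show on the input interface

    class AudioPush:
        def __init__(self, speaker, tts):
            self.speaker, self.tts = speaker, tts

        def push(self, help_items):
            for item in help_items:
                self.speaker.play(self.tts.synthesize(item))

Keeping both behind one push interface lets the processor choose whichever channel suits the current context.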
Further, the processor 501 may be further configured to:
after the input help information is displayed to the user on the voice input interface, receiving input help information selected by the user through clicking the voice input interface; and processing according to the input help information selected by the user.
Further, the processor 501 may be further configured to:
if the voice information input by the user is successfully recognized, processing according to the voice information input by the user and the user attribute information; wherein the user attribute information includes at least one of: the age, sex, occupation, address, location, preference and historical operating record of the user.
Further, the user equipment may further include: a text input device coupled to the processor 501; the text input device can be a keyboard, keys and the like.
Accordingly, the processor 501 may be further configured to: before processing according to voice information input by a user and user attribute information, determining the user account information according to login information input by the user through the text input device, or determining the user account information according to voiceprint information of voice input by the user through the voice input device; and determining the user attribute information according to the user account information.
EXAMPLE six
The sixth embodiment of the present application provides a control apparatus for a vehicle. Fig. 13 is a block diagram of a control device according to the sixth embodiment of the present application. The control device may be integrated into a central control system of the vehicle, including but not limited to: vehicle equipment, control equipment attached after the vehicle leaves the factory, and the like. The control device may include: an onboard processor, an onboard display device, and an onboard voice input device, among other additional devices. These onboard devices are devices that can be arranged on a vehicle, such as an onboard display screen and an onboard voice input device.
Referring to fig. 13, the control device 900 may specifically include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The on-board processor may be the processing component 902 of fig. 13, the processing component 902 generally controlling overall operation of the control device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 902 may include one or more processors 920 to execute instructions to complete all or part of the steps of the voice interaction processing method according to any one of the first to third embodiments. Further, processing component 902 can include one or more modules that facilitate interaction between processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
Depending on the type of vehicle in which it is installed, the processing component 902 may be implemented using various Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors, or other electronic components, and may be used to perform the voice interaction processing methods described above.
The processing component 902 may be coupled to the onboard voice input device and the onboard display device described above via in-vehicle wiring or a wireless connection. According to the above scheme, the processing component 902 may be configured to recognize the voice information collected by the onboard voice input device to obtain a voice recognition result, and if the voice recognition result is a failure, wait for the user to re-input the voice information, and push the input help information to the user on the voice input interface, so that the user inputs the voice information according to the input help information.
The memory 904 is configured to store various types of data to support the operation of the control device 900. Examples of such data include instructions for any application or method operating on the control device 900, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 906 provides power to the various components of the control device 900. The power components 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the control device 900.
The multimedia component 908 may include the on-board display device. In some embodiments, the on-board display device may include a Liquid Crystal Display (LCD) and/or a Touch Panel (TP). If the on-board display device includes a touch panel, it may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. The on-board display device may be used to display a voice input interface to a user.
In some embodiments, the multimedia component 908 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 900 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 may include the onboard voice input device, which may include one or more of the following: a microphone or sound pick-up mounted on the console; a microphone or sound pick-up mounted on the steering wheel; a microphone or sound pick-up disposed on the steering rudder. The onboard voice input device can be used to collect voice information input by a user from the voice input interface.
In some embodiments, the audio component 910 may also include an onboard audio output device coupled to the onboard processor; accordingly, the onboard processor may push the input help information to the user by controlling the onboard audio output device to play the input help information in voice form to the user on the voice input interface.
I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor assembly 914 includes one or more sensors for providing status assessments of various aspects of the control device 900. For example, the sensor assembly 914 may detect an open/closed state of the device 900 and the relative positioning of components, such as a display and keypad of the control device 900. The sensor assembly 914 may also detect a change in position of the control device 900 or a component of the control device 900, the presence or absence of user contact with the control device 900, the orientation or acceleration/deceleration of the control device 900, and a change in temperature of the control device 900. The sensor assembly 914 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor assembly 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the control device 900 and other devices. The control device 900 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the control device 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
EXAMPLE seven
The seventh embodiment of the present application provides a vehicle-mounted internet operating system. Fig. 14 is a block diagram illustrating a configuration of a vehicle-mounted internet operating system according to the seventh embodiment of the present application. As shown in fig. 14, the operating system provided in this embodiment may include: an acquisition control unit 701 and a recognition control unit 702.
The acquisition control unit 701 is configured to control the vehicle-mounted voice input equipment to collect voice information input by a user from a voice input interface;
the recognition control unit 702 is configured to recognize the voice information to obtain a voice recognition result, wait for the user to input the voice information again if the voice recognition result is a failure, and control the vehicle-mounted display device to push input help information to the user on the voice input interface, so that the user can input the voice information according to the input help information.
Those skilled in the art will appreciate that the vehicle-mounted internet operating system is system software that runs directly on the user equipment or on the control equipment for the vehicle described in the foregoing embodiments, and that manages and controls the hardware of that equipment and the computer programs of the software resources referred to in the present application. The operating system is an interface between the user and the user equipment or the control equipment, and is also an interface between the hardware and other software.
The vehicle-mounted internet operating system can interact with other modules or functional equipment on a vehicle to control functions of the corresponding modules or functional equipment.
Specifically, taking the vehicle in the above embodiments as an example, based on the vehicle-mounted internet operating system provided by the present application and the development of vehicle communication technology, the vehicle is no longer isolated from the communication network; the vehicle and a service end can be interconnected to form a network, thereby forming a vehicle-mounted internet. The vehicle-mounted internet can provide voice communication services, positioning services, navigation services, mobile internet access, vehicle emergency rescue, vehicle data and management services, vehicle-mounted entertainment services, and the like.
The vehicle-mounted internet operating system provided in this embodiment may, through the acquisition control unit 701 and the recognition control unit 702, alone or in combination with other units, control corresponding components to execute the voice interaction processing method described in each of the foregoing embodiments; the specific implementation principle is similar to that of the foregoing embodiments and is not described again here.
The vehicle-mounted internet operating system provided by this embodiment controls the vehicle-mounted voice input device to collect the voice information input by the user from the voice input interface, waits for the user to re-input the voice information when the voice recognition result is a failure, and controls the vehicle-mounted display device to push input help information to the user on the voice input interface, so that the user can input the voice information according to the input help information, thereby improving voice interaction efficiency, simplifying user operation, and improving user experience.
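Structurally, the two units of the operating system can be read as thin controllers over the in-vehicle peripherals. The sketch below assumes such peripheral interfaces; none of the names come from the disclosure.

    # Sketch of the two operating-system units: one drives the
    # vehicle-mounted microphone, the other recognition and the display.
    class AcquisitionControlUnit:
        def __init__(self, vehicle_mic):
            self.vehicle_mic = vehicle_mic

        def collect(self):
            return self.vehicle_mic.capture()

    class RecognitionControlUnit:
        def __init__(self, recognizer, vehicle_display, help_library):
            self.recognizer = recognizer
            self.vehicle_display = vehicle_display
            self.help_library = help_library

        def process(self, audio):
            result = self.recognizer.recognize(audio)
            if not result.ok:
                # Failure: push help and wait for the next utterance.
                self.vehicle_display.show_help(self.help_library.select())
                return None
            return result.text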
EXAMPLE eight
An eighth embodiment of the present application provides a computer/processor-readable storage medium storing program instructions, the program instructions being configured to cause a computer/processor to execute:
collecting voice information input by a user from a voice input interface;
recognizing the voice information to obtain a voice recognition result;
and if the voice recognition result is failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface so that the user can input the voice information according to the input help information.
The computer/processor readable storage medium provided in this embodiment may be used to execute the voice interaction processing method described in the foregoing embodiments, and the specific implementation principle of the computer/processor readable storage medium is similar to that of the foregoing embodiments, and is not described here again.
The computer/processor readable storage medium provided by this embodiment collects the voice information input by the user from the voice input interface, waits for the user to input the voice information again when the voice recognition result is a failure, and pushes the input help information to the user on the voice input interface, so that the user can input the voice information according to the input help information, thereby improving the efficiency of voice interaction, simplifying the operation of the user, and improving the user experience.
The readable storage medium may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (30)

1. A voice interaction processing method is characterized by comprising the following steps:
collecting voice information input by a user from a voice input interface;
recognizing the voice information to obtain a voice recognition result;
if the voice recognition result is a failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface so that the user can input the voice information according to the input help information;
wherein the pushing input help information to the user on the voice input interface comprises:
selecting at least one guide sentence from the guide sentence library;
displaying the selected guide sentence to a user on the voice input interface;
the method further comprises the following steps:
if the voice information input by the user is successfully recognized, searching the guide sentence library for a guide sentence corresponding to the voice information;
if the guide sentence corresponding to the voice information is found, determining the number of times the guide sentence has been successfully recognized;
and if the number of times meets a preset condition, deleting the guide sentence corresponding to the voice information from the guide sentence library.
2. The method of claim 1, wherein the recognizing the voice information to obtain a voice recognition result comprises:
if the voice information input by the user is not recognized within a preset time, determining that the voice recognition result is a failure;
or,
if the meaning represented by the voice information input by the user cannot be successfully recognized, determining that the voice recognition result is a failure.
3. The method of claim 1, wherein if the voice recognition result is a failure, waiting for the user to re-input the voice information, and pushing input help information to the user on the voice input interface comprises:
if the voice recognition result is a failure, determining the number of times that recognition of the voice information input by the user on the voice input interface has failed;
and waiting for the user to input the voice information again, and pushing input help information corresponding to the number of times to the user on the voice input interface.
4. The method of claim 1, wherein pushing input help information to the user on the voice input interface comprises:
playing the input help information to the user in voice form on the voice input interface.
5. The method of claim 1, further comprising, prior to displaying at least one guide sentence to a user on the voice input interface:
acquiring function information of user equipment;
and generating a plurality of guide sentences for each piece of function information of the user equipment to form the guide sentence library.
6. The method of claim 5, wherein selecting at least one guide sentence from a library of guide sentences comprises:
selecting one or more function information of the user equipment;
and selecting at least one guide sentence from the guide sentences corresponding to the one or more pieces of function information in the guide sentence library.
7. The method according to any one of claims 1-6, further comprising, after pushing input help information to the user on the voice input interface:
receiving voice information input again by a user;
recognizing the voice information re-input by the user to obtain a voice recognition result;
and if the voice recognition result is a failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface, wherein the input help information is a functional word of the user equipment.
8. The method of any of claims 1-6, further comprising, after displaying the input help information to the user on the voice input interface:
receiving input help information selected by a user through clicking a voice input interface;
and processing according to the input help information selected by the user.
9. The method of any one of claims 1-6, further comprising:
if the voice information input by the user is successfully recognized, processing according to the voice information input by the user and the user attribute information;
wherein the user attribute information includes at least one of:
the age, sex, occupation, address, location, preference and historical operating record of the user.
10. The method of claim 9, further comprising, before processing according to the voice information input by the user and the user attribute information:
determining the user account information according to the login information of the user, or determining the user account information according to the voiceprint information of the input voice of the user;
and determining the user attribute information according to the user account information.
11. A speech interaction processing apparatus, comprising:
the acquisition module is used for acquiring voice information input by a user from the voice input interface;
the recognition module is used for recognizing the voice information to obtain a voice recognition result;
the control module is used for waiting for the user to input the voice information again when the voice recognition result is a failure, and pushing input help information to the user on the voice input interface so that the user can input the voice information according to the input help information;
the control module comprises a display pushing unit; the display pushing unit is used for:
selecting at least one guide sentence from the guide sentence library;
displaying the selected guide sentence to a user on the voice input interface;
the control module further comprises a generating unit, wherein the generating unit is used for:
if the voice information input by the user is successfully recognized, searching the guide sentence library for a guide sentence corresponding to the voice information;
if the guide sentence corresponding to the voice information is found, determining the number of times the guide sentence has been successfully recognized;
and if the number of times meets a preset condition, deleting the guide sentence corresponding to the voice information from the guide sentence library.
12. The apparatus according to claim 11, wherein the recognition module is specifically configured to:
if the voice information input by the user is not recognized within a preset time, determining that the voice recognition result is a failure;
or,
if the meaning represented by the voice information input by the user cannot be successfully recognized, determining that the voice recognition result is a failure.
13. The apparatus of claim 11, wherein the control module is specifically configured to:
if the voice recognition result is a failure, determining the number of times that recognition of the voice information input by the user on the voice input interface has failed;
and waiting for the user to input the voice information again, and pushing input help information corresponding to the number of times to the user on the voice input interface.
14. The apparatus of claim 11, wherein the control module further comprises a voice push unit:
the voice pushing unit is used for playing the input help information in a voice form to the user on the voice input interface.
15. The apparatus of claim 11, wherein the generating unit is further configured to: acquiring function information of user equipment before displaying at least one guide sentence to a user on the voice input interface; and generating a plurality of guide sentences for each piece of function information of the user equipment to form the guide sentence library.
16. The apparatus according to claim 15, wherein the display pushing unit is specifically configured to:
selecting one or more function information of the user equipment;
and selecting at least one guide sentence from the guide sentences corresponding to the one or more pieces of function information in the guide sentence library.
17. The apparatus of any one of claims 11-16, wherein the control module further comprises: a re-input unit;
the re-input unit is used for:
after the input help information is pushed to the user on the voice input interface, receiving voice information re-input by the user;
recognizing the voice information re-input by the user to obtain a voice recognition result;
and if the voice recognition result is a failure, waiting for the user to input the voice information again, and pushing input help information to the user on the voice input interface, wherein the input help information is a functional word of the user equipment.
18. The apparatus of claim 17, wherein the re-input unit is further configured to:
after the input help information is displayed to the user on the voice input interface, receiving input help information selected by the user through clicking the voice input interface;
and processing according to the input help information selected by the user.
19. The apparatus of claim 17, wherein the re-input unit is further configured to:
if the voice information input by the user is successfully recognized, processing according to the voice information input by the user and the user attribute information;
wherein the user attribute information includes at least one of:
the age, sex, occupation, address, location, preference and historical operating record of the user.
20. The apparatus of claim 19, wherein the re-input unit is further configured to:
before processing according to voice information input by a user and user attribute information, determining the user account information according to login information of the user, or determining the user account information according to voiceprint information of input voice of the user;
and determining the user attribute information according to the user account information.
21. A user device, comprising: the device comprises a processor, a display device and a voice input device;
the display equipment is used for displaying a voice input interface to a user;
the voice input equipment is used for collecting voice information input by a user from the voice input interface;
the processor is coupled to the display device and the voice input device, and is configured to recognize the voice information to obtain a voice recognition result, wait for a user to re-input the voice information if the voice recognition result is a failure, and push input help information to the user on the voice input interface, so that the user inputs the voice information according to the input help information;
the processor is specifically configured to:
selecting at least one guide sentence from the guide sentence library;
displaying the selected guide sentence to a user on the voice input interface;
the processor is further configured to:
if the voice information input by the user is successfully recognized, searching the guide sentence library for a guide sentence corresponding to the voice information;
if the guide sentence corresponding to the voice information is found, determining the number of times the guide sentence has been successfully recognized;
and if the number of times meets a preset condition, deleting the guide sentence corresponding to the voice information from the guide sentence library.
22. The user device of claim 21, wherein the processor pushes input help information to the user by: and controlling the display equipment to display input help information to a user on the voice input interface.
23. The user equipment of claim 21, further comprising: an audio output device coupled to the processor;
correspondingly, the processor pushes the input help information to the user by the following method: and controlling the audio output equipment to play the input help information in a voice form to the user on the voice input interface.
24. The user equipment of any of claims 21-23, wherein the processor is further configured to:
after the input help information is displayed to the user on the voice input interface, receiving input help information selected by the user through clicking the voice input interface;
and processing according to the input help information selected by the user.
25. The user equipment of any of claims 21-23, wherein the processor is further configured to:
if the voice information input by the user is successfully recognized, processing according to the voice information input by the user and the user attribute information;
wherein the user attribute information includes at least one of:
the age, sex, occupation, address, location, preference and historical operating record of the user.
26. The user equipment of claim 25, further comprising: a text input device coupled to the processor;
the processor is further configured to: before processing according to voice information input by a user and user attribute information, determining the user account information according to login information input by the user through the character input equipment, or determining the user account information according to voiceprint information of voice input by the user through the voice input equipment; and determining the user attribute information according to the user account information.
27. A control apparatus for a vehicle, characterized by comprising: an onboard processor, an onboard display device, and an onboard voice input device;
the onboard display device is used for displaying a voice input interface to a user;
the onboard voice input device is used for collecting voice information input by a user from the voice input interface;
the onboard processor is coupled to the onboard display device and the onboard voice input device and is used for recognizing the voice information to obtain a voice recognition result, waiting for the user to input the voice information again if the voice recognition result is a failure, and pushing input help information to the user on the voice input interface so that the user can input the voice information according to the input help information;
the onboard processor pushes input help information to the user by: controlling the onboard display device to display input help information to a user on the voice input interface;
the onboard processor is configured to:
selecting at least one guide sentence from the guide sentence library;
controlling the onboard display device to display the selected guide sentence to a user on the voice input interface;
the onboard processor is further configured to:
if the voice information input by the user is successfully recognized, searching the guide sentence library for a guide sentence corresponding to the voice information;
if the guide sentence corresponding to the voice information is found, determining the number of times the guide sentence has been successfully recognized;
and if the number of times meets a preset condition, deleting the guide sentence corresponding to the voice information from the guide sentence library.
28. The control apparatus according to claim 27, characterized by further comprising: an onboard audio output device coupled to the onboard processor;
accordingly, the on-board processor pushes input help information to the user by: and controlling the onboard audio output equipment to play the input help information in a voice form to the user on the voice input interface.
29. The control device of claim 27 or 28, wherein the onboard voice input device comprises one or more of:
a microphone or sound pick-up mounted on the console;
a microphone or sound pick-up mounted on the steering wheel;
a microphone or sound pick-up disposed on the steering rudder.
30. An in-vehicle internet operating system, comprising:
the acquisition control unit is used for controlling the vehicle-mounted voice input equipment to acquire voice information input by a user from the voice input interface;
the recognition control unit is used for recognizing the voice information to obtain a voice recognition result, waiting for the user to input the voice information again if the voice recognition result is a failure, and controlling the vehicle-mounted display equipment to push input help information to the user on the voice input interface so that the user can input the voice information according to the input help information;
the recognition control unit is specifically configured to:
selecting at least one guide sentence from the guide sentence library;
controlling the vehicle-mounted display equipment to display the selected guide sentence to a user on the voice input interface;
the recognition control unit is further configured to:
if the voice information input by the user is successfully recognized, searching the guide sentence library for a guide sentence corresponding to the voice information;
if the guide sentence corresponding to the voice information is found, determining the number of times the guide sentence has been successfully recognized;
and if the number of times meets a preset condition, deleting the guide sentence corresponding to the voice information from the guide sentence library.