CN107977183A

CN107977183A - Voice interaction method, device and equipment

Info

Publication number: CN107977183A
Application number: CN201711140428.6A
Authority: CN
Inventors: 李新征; 王磊; 安家雨
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd; Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2017-11-16
Filing date: 2017-11-16
Publication date: 2018-05-01

Abstract

The invention discloses a voice interaction method, device and equipment. The method includes: obtaining voice information that a user inputs to a target device and performing recognition processing on the voice information; performing semantic analysis on the recognition result to obtain the user's operation intention; detecting the validity of the operation intention and, if the operation intention is found to be valid, performing an information processing service according to the operation intention to obtain corresponding content data; and feeding the content data back to the user through the target device. By actively listening for and recognizing the operation intention in the user's voice information, the method proactively meets the user's voice interaction needs and solves the technical problem in the prior art that a user's voice request can be executed only after the user actively triggers a voice-interaction start event, which is inefficient and cumbersome.

Description

Voice interaction method, device and equipment

Technical field

The present invention relates to the field of intelligent search technology, and more particularly to a voice interaction method, device and equipment.

Background art

Artificial intelligence, abbreviated AI, is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Among these, speech recognition technology is one of the most important aspects of artificial intelligence.

Voice, as a natural interaction technology, is increasingly widely applied in products. However, current voice interaction systems all require the user to first actively trigger an event for interacting with the voice system, for example by actively speaking a wake-up word, before interaction with the system through voice commands can begin. This impairs convenience and the user experience and makes operation cumbersome.

Summary of the invention

The present invention provides a voice interaction method, device and equipment to solve the technical problem in the prior art that a user's voice request can be executed only after the user actively triggers a voice-interaction start event, which is inefficient and cumbersome.

An embodiment of the present invention provides a voice interaction method, including the following steps: obtaining voice information that a user inputs to a target device, and performing recognition processing on the voice information; performing semantic analysis on the recognition result to obtain the user's operation intention; detecting the validity of the operation intention and, if the operation intention is found to be valid, performing an information processing service according to the operation intention to obtain corresponding content data; and feeding the content data back to the user through the target device.

Another embodiment of the present invention provides a voice interaction device, including: a recognition processing module, configured to obtain voice information that a user inputs to a target device and to perform recognition processing on the voice information; a first acquisition module, configured to perform semantic analysis on the recognition result to obtain the user's operation intention; a detection module, configured to detect the validity of the operation intention; a second acquisition module, configured to perform, when the operation intention is found to be valid, an information processing service according to the operation intention to obtain corresponding content data; and a feedback module, configured to feed the content data back to the user through the target device.

A further embodiment of the present invention provides computer equipment, including a processor and a memory, wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the voice interaction method described in the above embodiment.

A further embodiment of the present invention provides a computer program product; when instructions in the computer program product are executed by a processor, the voice interaction method described in the above embodiment is implemented.

Yet another embodiment of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the voice interaction method described in the above embodiment is implemented.

The technical solutions provided by the embodiments of the present invention may have the following beneficial effects:

Voice information that a user inputs to a target device is obtained and recognition processing is performed on it; semantic analysis is performed on the recognition result to obtain the user's operation intention; the validity of the operation intention is detected and, if the operation intention is found to be valid, an information processing service is performed according to the operation intention to obtain corresponding content data; finally, the content data is fed back to the user through the target device. Thus, by actively listening for and recognizing the operation intention in the user's voice information, the user's voice interaction needs are proactively met, which solves the technical problem in the prior art that a user's voice request can be executed only after the user actively triggers a voice-interaction start event, which is inefficient and cumbersome.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a flow chart of a voice interaction method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of an interface that notifies the user of an invalid operation according to an embodiment of the present invention;

Fig. 3 is a flow chart of a voice interaction method according to another embodiment of the present invention;

Fig. 4 is a flow chart of a voice interaction method according to yet another embodiment of the present invention;

Fig. 5 is a flow chart of a voice interaction method according to still another embodiment of the present invention;

Fig. 6 is a structure diagram of a voice interaction device according to an embodiment of the present invention;

Fig. 7 is a structure diagram of a voice interaction device according to another embodiment of the present invention;

Fig. 8 is a structure diagram of a voice interaction device according to yet another embodiment of the present invention;

Fig. 9 is a structure diagram of a voice interaction device according to a further embodiment of the present invention;

Fig. 10 is a structure diagram of a voice interaction device according to a still further embodiment of the present invention; and

Fig. 11 is a structure diagram of computer equipment according to an embodiment of the present invention.

Detailed description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.

The voice interaction method, device and equipment of the embodiments of the present invention are described below with reference to the accompanying drawings.

To address the technical problem in the prior art that, when interacting with a voice system, the user must actively trigger an event for interacting with the voice system, which impairs convenience and the user experience, the present invention proposes a new voice interaction method that can proactively provide related services to the user without requiring the user to trigger such an event.

The voice interaction method of the embodiments of the present invention is described in detail below with reference to the accompanying drawings. It should be noted that the method may be executed by a hardware device, such as a smartphone, an intelligent robot, or a wearable device, that can be operated by voice and can provide information to the user through interface display or similar means.

Fig. 1 is a flow chart of a voice interaction method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:

Step 101: obtain voice information that a user inputs to a target device, and perform recognition processing on the voice information.

The target device is the hardware device that performs the voice interaction method. It may be a tablet computer, a personal digital assistant, a wearable device, or another hardware device that has a speech recognition function and can feed information back to the user; the wearable device may be a smart bracelet, a smart watch, smart glasses, or the like.

It should be emphasized that, to avoid missing voice information with which the user intends to interact with the target device, the operation of obtaining the voice information that the user inputs to the target device may run continuously once the target device is connected to the network and powered on. In other words, the target device always listens for voice information input by the user. The way in which the voice information is obtained depends on the hardware configuration of the target device; for example, it may be actively captured through a voice interface such as a microphone.

In actual operation, the user's environment may be noisy, or the received voice information may include an echo of sound produced by an application program running on the target device itself (for example, the sound of an audio/video application program playing media). To recognize the user's voice information accurately, speech noise-reduction techniques therefore need to be applied before recognition processing to reduce the influence of unrelated sounds on the voice information input by the user. For example, echo cancellation can be used to remove the sound output by the target device itself, and noise reduction can be used to reduce the influence of environmental noise.

Specifically, in recognizing the voice information, one possible implementation uses VAD (Voice Activity Detection, also called speech endpoint detection) to identify the effective voice information from the moment the user starts speaking to the moment the user stops. This technique can divide a piece of voice information into silence segments, transition segments, speech segments, and ending segments. A commonly used VAD technique is double-threshold endpoint detection based on short-time energy and zero-crossing rate. Endpoint detection is a basic step in speech recognition and speech processing and an active area of speech recognition research. Its main purpose is to distinguish speech from non-speech in the input; its main functions include automatic interruption, removing silent portions from the speech, extracting the effective speech from the input, and removing noise to enhance the speech. A speech recognition result is then obtained. Depending on the application scenario, the speech recognition result may take the form of a speech waveform or of text, which is not limited here.
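As a non-limiting illustration of the double-threshold endpoint detection mentioned above, the following is a minimal sketch in Python. The function names, frame sizes, and threshold values are assumptions chosen for illustration and are not taken from the embodiment. A high energy threshold locates frames that are confidently speech, a low energy threshold extends the segment outward, and the zero-crossing rate extends it by a few more frames to cover low-energy unvoiced sounds.

```python
from typing import List, Sequence, Tuple


def short_time_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)


def detect_speech(
    samples: Sequence[float],
    frame_len: int = 160,       # hypothetical: 10 ms at 16 kHz
    hop: int = 80,
    energy_high: float = 0.05,  # hypothetical thresholds
    energy_low: float = 0.01,
    zcr_min: float = 0.25,
    max_zcr_frames: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs, end exclusive."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, hop)]
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]
    segments: List[Tuple[int, int]] = []
    i, n = 0, len(frames)
    while i < n:
        if energy[i] < energy_high:
            i += 1
            continue
        # Frame i is confidently speech; never reach back into a prior segment.
        floor = segments[-1][1] if segments else 0
        start = end = i
        # Extend outward while the low energy threshold is still met.
        while start > floor and energy[start - 1] >= energy_low:
            start -= 1
        while end + 1 < n and energy[end + 1] >= energy_low:
            end += 1
        # Extend a few more frames where the zero-crossing rate is high.
        steps = 0
        while start > floor and steps < max_zcr_frames and zcr[start - 1] >= zcr_min:
            start -= 1
            steps += 1
        steps = 0
        while end + 1 < n and steps < max_zcr_frames and zcr[end + 1] >= zcr_min:
            end += 1
            steps += 1
        segments.append((start, end + 1))
        i = end + 1
    return segments
```

In practice the thresholds would typically be calibrated from the background noise level measured on the target device rather than fixed in advance.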

In an embodiment of the present invention, the voice information may be recognized locally on the target device; alternatively, to reduce the computing load on the processor of the target device, the voice information may be sent to a third-party device such as a cloud server for speech recognition.

Step 102: perform semantic analysis on the recognition result to obtain the user's operation intention.

Specifically, after the speech recognition result is obtained, semantic analysis is performed on the recognition result, and the user's operation intention is obtained from the result of the semantic analysis. The operation intention indicates the need that the user wishes to satisfy through voice interaction with the target device. Depending on the application scenario, semantic analysis may be performed on the recognition result in different ways to obtain the user's operation intention, as illustrated below:

First example:

Voice information corresponding to users' operation intentions is obtained in advance from a large amount of experimental data. For example, the voice information corresponding to the operation intention of opening an application program is "open application program A"; the voice information corresponding to the operation intention of enabling certain functions of an application program is "turn to the next page", "make the font a bit bigger", and so on; and the voice information corresponding to the operation intention of using an application program's information-providing function is "help me look up how many kinds of apples there are", and so on. The recognition result of the embodiment of the present invention is then matched by keyword against the pre-stored voice information corresponding to users' operation intentions; when the matching degree is high, the user's operation intention is considered to have been obtained.
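The keyword matching of the first example can be sketched as follows. This is only an illustration under stated assumptions: the intent names, the keyword lists, and the 0.75 threshold are hypothetical, and the matching degree is taken to be the fraction of a template's keywords that appear in the recognized text.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical pre-stored templates: intent name -> keywords that signal it.
TEMPLATES: Dict[str, List[str]] = {
    "open_application": ["open", "application"],
    "next_page": ["next", "page"],
    "enlarge_font": ["font", "bigger"],
    "search_information": ["look", "up"],
}


def tokenize(text: str) -> Set[str]:
    """Lower-case the recognized text and split it into word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_intent(recognized_text: str, threshold: float = 0.75) -> Optional[str]:
    """Return the best-matching intent, or None if the matching degree is too low."""
    tokens = tokenize(recognized_text)
    best_intent: Optional[str] = None
    best_score = 0.0
    for intent, keywords in TEMPLATES.items():
        score = sum(1 for k in keywords if k in tokens) / len(keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None
```

Returning `None` when no template reaches the threshold corresponds to the case in which no operation intention can be obtained from the recognition result.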

Second example:

A deep model may be trained in advance on a large amount of experimental data. The input of the deep model is the recognized voice information, and its output is the user's operation intention. The recognition result obtained in the present invention is then input into the pre-trained deep model to obtain the user's operation intention output by the model.

In practical applications, the voice information input by the user may include meaningless words such as interjections. Therefore, to obtain the user's operation intention more efficiently, irrelevant words may be filtered out and synonyms may be converted when semantic analysis is performed on the recognition result. Alternatively, in some application scenarios the user's pronunciation may be non-standard, for example because of a heavy dialect accent, so that the recognition result obtained from the voice information is ambiguous; in this case, dialect correction or similar processing may also be applied to the recognition result during semantic analysis.
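The filtering of irrelevant words and the conversion of synonyms described above can be sketched as a small normalization step applied before intent matching. The word lists here are hypothetical examples, and dialect correction is outside the scope of this sketch.

```python
import re
from typing import Dict, List, Set

# Hypothetical filler words and synonym table, for illustration only.
FILLER_WORDS: Set[str] = {"um", "uh", "er", "hmm", "well"}
SYNONYMS: Dict[str, str] = {"launch": "open", "start": "open", "app": "application"}


def normalize(recognized_text: str) -> str:
    """Drop filler words and map synonyms to a canonical word."""
    tokens: List[str] = re.findall(r"[a-z0-9']+", recognized_text.lower())
    kept = [t for t in tokens if t not in FILLER_WORDS]
    return " ".join(SYNONYMS.get(t, t) for t in kept)
```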

Step 103: detect the validity of the operation intention and, if the operation intention is found to be valid, perform an information processing service according to the operation intention to obtain corresponding content data.

Step 104: feed the content data back to the user through the target device.

Specifically, in some application scenarios the user may merely be speaking near the target device without any need to interact with it by voice. In that case, semantic analysis of the recognition result obtained from the user's voice information yields no operation intention; in other words, the user's operation intention is detected to be invalid. In other application scenarios, even if an operation intention is obtained, the target device may be unable to carry it out, because the application programs installed on the current target device, or their versions, differ from device to device. For example, the user's operation intention is to open a weather forecast application program, but no weather forecast application program is installed on the current target device. In such scenarios, an operation intention that exceeds the execution capability of the target device is likewise determined to be invalid.

Specifically, the validity of the operation intention is detected and, if the operation intention is found to be valid, an information processing service is performed according to the operation intention to obtain the corresponding content data. For example, if the valid operation intention is "search for the kinds of apples", the target device performs the information processing service according to that operation intention and obtains, through a browser, content data containing the results of searching for the kinds of apples.

In one embodiment of the present invention, if the operation intention is found to be invalid, invalid-operation prompt information is fed back to the target device, and the user is notified of the invalid operation by, for example, a text prompt or a voice prompt. To help the user understand why the operation is invalid, the prompt information may also include the reason. For example, as shown in Fig. 2, the target device notifies the user of the invalid operation in a pop-up window and shows the reason as "no application program related to weather forecasts is installed".
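The validity detection and the invalid-operation prompt described above can be sketched as follows. The `Intent` and `ValidityResult` types, their fields, and the application names are hypothetical and serve only to show the two invalid cases (no intention recognized, and an intention beyond the capability of the target device) together with the reason that would be shown to the user.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Intent:
    action: str                  # e.g. "open_application"
    required_app: Optional[str]  # application needed to carry out the action


@dataclass
class ValidityResult:
    valid: bool
    reason: str = ""             # shown to the user when the intention is invalid


def check_validity(intent: Optional[Intent], installed_apps: Set[str]) -> ValidityResult:
    """Classify an operation intention as valid or invalid for this device."""
    if intent is None:
        # The user was probably just talking near the device.
        return ValidityResult(False, "no operation intention was recognized")
    if intent.required_app is not None and intent.required_app not in installed_apps:
        # The intention exceeds what the device can currently execute.
        return ValidityResult(False, f"no {intent.required_app} application is installed")
    return ValidityResult(True)
```

A caller would proceed to the information processing service only when `valid` is true, and would otherwise display or speak `reason` as the prompt.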

Further, the content data is fed back to the user through the target device. The way in which the target device feeds the content data back depends on what the content data contains and, as analyzed above, the content data is related to the user's operation intention. For example, if the content data is a browser search result, the target device presents it to the user on a display interface; if the content data is audio, the target device plays it for the user through a device such as a speaker; and if the content data is chat content addressed to the user, voice chat messages can be played for the user through a speaker or the like.

Thus, the voice interaction method of the embodiment of the present invention can listen for and recognize the user's operation intention at the same time. In the prior art, a user who intends to interact with the target device by voice must first input a wake-up word such as "ding-dong, ding-dong" or "hi" to trigger the voice interaction function, and only then input the voice information "How is the weather today?". By contrast, in the embodiment of the present invention, once the voice information "How is the weather today?" input by the user is recognized, weather-related content data can be obtained. Operation is simple and the user experience is improved.

For example, when using Amazon Echo, the user must first say "Alexa" each time before a voice dialogue with the target device can be initiated, which seriously degrades the user experience in multi-turn dialogues. With the voice interaction technical solution of the present invention, a voice dialogue with the target device can be initiated directly, without a wake-up word such as "Alexa", which is convenient and efficient.

As another example, when chatting with the chatbot Siri, the user does not need to press a key on the target device to trigger the chat function, and only needs to say "Siri, have you eaten today?" to obtain Siri's voice reply, "No, have you?". This gives the user the experience of a real conversation and further reduces the unnatural and unreal feel of the robot.

In summary, in the voice interaction method of the embodiment of the present invention, voice information that a user inputs to a target device is obtained and recognition processing is performed on it; semantic analysis is performed on the recognition result to obtain the user's operation intention; the validity of the operation intention is detected and, if the operation intention is found to be valid, an information processing service is performed according to the operation intention to obtain corresponding content data; finally, the content data is fed back to the user through the target device. Thus, by actively listening for and recognizing the operation intention in the user's voice information, the user's voice interaction needs are proactively met, which solves the technical problem in the prior art that a user's voice request can be executed only after the user actively triggers a voice-interaction start event, which is inefficient and cumbersome.

It should be understood that, in the above embodiment, as long as the user's operation intention is detected to be valid, an information processing service is performed according to the operation intention to obtain the corresponding content data. This covers the case in which the application program currently running on the target device is application program A while the user's operation intention corresponds to a function provided by application program B; application program B can then be called to meet the user's need.

In practical applications, however, this may cause some operation intentions to be misrecognized or responded to by mistake. For example, the currently running application program is a shopping application program and the user inputs the voice information "It's freezing today, isn't it? How many degrees is it today, anyway? I'll buy a coat." After analyzing the speech recognition result, the target device might open a weather application program to provide weather information and also have the shopping application program recommend coats, whereas the user's operation intention is clearly to shop in the currently running shopping application program. Alternatively, the currently running application program is weather forecast program 1 and the user inputs "It's freezing today, isn't it? How many degrees is it today?"; after analyzing the speech recognition result, the target device might additionally open weather application program 2 to provide weather information, so that two functionally similar application programs run at the same time, which increases the operating load on the target device. Or the currently running application program is a music application program, and the operation intention derived from the input "It's freezing today, isn't it? How many degrees is it today?" is completely unrelated to the currently running application program; it may be merely part of the user's conversation and needs no response.

Therefore, to reduce the operating load on the target device and to improve the consistency between the fed-back content data and the user's operation intention, in an embodiment of the present invention the validity of the operation intention may also be detected with reference to, for example, the current running state of the target device.

Specifically, Fig. 3 is a flow chart of a voice interaction method according to another embodiment of the present invention. As shown in Fig. 3, step 102 further includes:

Step 201: obtain the characteristic information, sent by the target device, of the currently running application program.

The characteristic information of the application program includes information that uniquely identifies the application program, such as its icon information or its ID.

Step 202: detect the validity of the operation intention according to the characteristic information.

Specifically, the application program currently running on the target device can be identified from the characteristic information of the application program running in the foreground, and the function services corresponding to that application program can then be matched. Based on those function services, it is detected whether the user need corresponding to the operation intention exceeds the scope of the function services provided by the application program, or whether it is related to those function services. If it exceeds the scope or is unrelated, the operation intention is determined to be invalid; if it does not exceed the scope, or is related, the operation intention is determined to be valid.
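Steps 201 and 202 can be sketched as a lookup from the application's identifier to the function services it provides. The application IDs and function names below are hypothetical placeholders; the embodiment does not specify how the mapping is stored.

```python
from typing import Dict, Set

# Hypothetical mapping from application ID (the characteristic information)
# to the function services that the application provides.
APP_FUNCTIONS: Dict[str, Set[str]] = {
    "com.example.shopping": {"search_product", "add_to_cart", "pay"},
    "com.example.music": {"play", "pause", "next_track"},
}


def is_intent_valid_for_app(foreground_app_id: str, intent_function: str) -> bool:
    """Valid only if the foreground application provides the requested function."""
    return intent_function in APP_FUNCTIONS.get(foreground_app_id, set())
```

With this mapping, a weather query spoken while the music application is in the foreground is treated as invalid and is not answered.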

Thus, in an embodiment of the present invention, account is taken of the fact that the user's operation intention is usually related to the currently running application program. For example, if a shopping application program is currently running, the user's operation intention is very likely directed at the shopping application program. This further increases the response efficiency of the voice interaction service and improves the user experience.

In one embodiment of the present invention, to further improve the effectiveness of the voice interaction method, the validity of the operation intention may also be detected based on the functions of the interface that the application program is currently displaying, on the premise that the user's operation intention is usually related to the current running interface of the currently running application program. In this scenario, the characteristic information of the application program may be information that uniquely identifies the interface, such as an interface ID of the application program.

Further, Fig. 4 is a flow chart of a voice interaction method according to yet another embodiment of the present invention. As shown in Fig. 4, the above step 202 includes the following steps:

Step 301: obtain, according to the characteristic information, the application content currently provided by the target device, and mine context information from the application content.

The application content currently provided by the target device includes the service type currently provided by the application program, the content information it contains, and so on. Context information can be mined from the application content; it includes the positions and functions of the controls on the current running interface of the current application program, the menu content, the information content displayed, and so on. For example, when the application content of the currently running application program corresponds to a shopping-cart interface, the context information mined from it includes the order information in the shopping cart, the controls for selecting and deleting orders, the payment control, and the position of each control.

Step 302: perform association detection between the context information and the operation intention to determine the validity of the operation intention.

Specifically, association detection is performed between the context information and the operation intention to determine whether the user's operation intention can be carried out on the current interface of the current application program, or whether it is related to the application content of the currently running application program. If it can be carried out or is related, the operation intention is determined to be valid; otherwise, it is invalid.

For example, when the application content of the currently running application program corresponds to a shopping-cart interface, the voice information "Buy this coat" input by the user is associated with the order information in the current shopping cart, and the intention is recognized as paying for the coat in the shopping cart.
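The shopping-cart example above can be sketched as an association check between the operation intention and the mined context information. The `InterfaceContext` type, its fields, and the substring-based item match are assumptions made for illustration; the embodiment does not prescribe a particular association method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InterfaceContext:
    interface_id: str    # unique identifier of the current interface
    controls: List[str]  # actions offered by controls on the interface
    items: List[str]     # information content shown, e.g. cart entries


def associate(action: str, target: Optional[str], ctx: InterfaceContext) -> bool:
    """Valid if the interface offers the action and, when a target is named, shows it."""
    action_ok = action in ctx.controls
    target_ok = target is None or any(target.lower() in item.lower() for item in ctx.items)
    return action_ok and target_ok
```

For a cart interface offering select, delete, and pay controls and containing a coat, the intention to pay for the coat is associated successfully, whereas an intention to play music is not.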

As another example, continuing with the Amazon Echo scenario, if the target device is currently playing music, then inputs that control the music and queries related to the music are considered valid, and the target device can respond to them directly; queries unrelated to the music are considered invalid operation intentions.

In summary, in the voice interaction method of the embodiment of the present invention, the characteristic information, sent by the target device, of the currently running application program is obtained, and the validity of the operation intention is detected according to the characteristic information. Thus, by taking into account the content that the product can currently provide, it is determined whether the current voice input needs to be handled, which improves the accuracy of voice interaction responses.

Based on the above embodiments, it should be understood that in practical applications the target device may obtain voice information input by multiple users, or the obtained voice information may not come from a legitimate user, for example not from the owner of the device. In this case, to protect information security and the like, the legitimacy of the obtained voice information needs to be verified.

Fig. 5 is a flow chart of a voice interaction method according to still another embodiment of the present invention. As shown in Fig. 5, after the voice information that the user inputs to the target device is obtained in the above step 101, the method further includes:

Step 401: perform voiceprint processing on the voice information to obtain the user's voiceprint feature.

A voiceprint is the sound-wave spectrum carrying verbal information, as displayed by an electro-acoustic instrument. A voiceprint is not only specific to the individual but also relatively stable: after adulthood in particular, a person's voice remains relatively stable for a long time. Experiments show that whether a speaker deliberately imitates another person's voice and tone or speaks in a whisper, the voiceprint remains different, however lifelike the imitation. Therefore, in an embodiment of the present invention, voiceprint processing may be performed on the voice information, for example by extracting the spectral waveform, to obtain the user's voiceprint feature. The voiceprint feature may include the intensity, wavelength, frequency, and rhythm variation of the voice information.

Step 402, the device identification of target device is obtained, it is corresponding with device identification to inquire about the log-on message acquisition to prestore Register vocal print feature.

It is appreciated that storing the registration vocal print feature of validated user in log-on message in advance, and store target device Device identification and the correspondence of corresponding registration vocal print feature, such as, storage device identification and corresponding note in the server The correspondence of volume vocal print feature, but for example, storage is with being somebody's turn to do in target deviceThe corresponding registration vocal print feature of equipment, into And the device identification of target device is obtained, inquire about the log-on message to prestore and obtain registration vocal print feature corresponding with device identification.

Step 403: match the user's voiceprint feature against the registered voiceprint feature, and judge whether the user is a legitimate user in order to determine whether to perform semantic analysis.

Specifically, if the user's voiceprint feature matches the registered voiceprint feature, the current user is a legitimate user, and semantic analysis is performed on the voice information input by that user. Otherwise, the current user is not a legitimate user, and semantic analysis is not performed on the voice information input by that user.
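Steps 401 to 403 can be illustrated with a minimal sketch. The patent does not specify how voiceprint features are represented or compared, so everything below is an assumption made for illustration: the feature is taken to be a plain numeric vector (feature extraction itself is out of scope here), the registration information is a hypothetical in-memory dictionary keyed by device identifier, and matching uses cosine similarity against a hypothetical threshold of 0.95.

```python
import math
from typing import Dict, Optional, Sequence

# Hypothetical registration information: device identifier -> registered
# voiceprint feature vector. In the patent this may live on a server or
# on the target device itself.
REGISTERED_VOICEPRINTS: Dict[str, Sequence[float]] = {
    "device-001": [0.62, 0.35, 0.81, 0.12],
}


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lookup_registered_voiceprint(
    device_id: str,
    registry: Dict[str, Sequence[float]] = REGISTERED_VOICEPRINTS,
) -> Optional[Sequence[float]]:
    """Step 402: query the registration information by device identifier."""
    return registry.get(device_id)


def is_legitimate_user(
    user_feature: Sequence[float],
    device_id: str,
    threshold: float = 0.95,  # assumed value, not taken from the patent
) -> bool:
    """Step 403: semantic analysis should proceed only if this returns True."""
    registered = lookup_registered_voiceprint(device_id)
    if registered is None or len(registered) != len(user_feature):
        return False
    return cosine_similarity(user_feature, registered) >= threshold
```

A caller would run `is_legitimate_user(feature, device_id)` before semantic analysis and skip the analysis when it returns `False`. An unregistered device identifier is treated as a non-match, which is a conservative design choice rather than something the patent states.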

In conclusion the voice interactive method of the embodiment of the present invention, when vocal print meets chartered user and equipment letter Just start the intention analysis to user during breath, ensure that information security, improve practicality and the flexibility of interactive voice.

To implement the above embodiments, the present invention also provides a voice interaction device. Fig. 6 is a structural diagram of a voice interaction device according to one embodiment of the present invention. As shown in Fig. 6, the voice interaction device includes: a recognition processing module 100, a first acquisition module 200, a detection module 300, a second acquisition module 400, and a feedback module 500.

The recognition processing module 100 is configured to obtain the voice information input by a user to a target device and to perform recognition processing on the voice information.

The first acquisition module 200 is configured to perform semantic analysis on the recognition result to obtain the user's operation intent.

The detection module 300 is configured to detect the validity of the operation intent.

The second acquisition module 400 is configured to, when the operation intent is detected to be valid, perform an information processing service according to the operation intent to obtain corresponding content data.

The feedback module 500 is configured to feed the content data back to the user through the target device.

In one embodiment of the present invention, as shown in Fig. 7, on the basis of Fig. 6, the voice interaction device further includes a prompt module 600. The prompt module 600 is configured to feed invalid-operation prompt information back to the target device when the operation intent is detected to be invalid.
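The module structure of Fig. 6 and Fig. 7 can be sketched as a small pipeline. This is an illustrative sketch only: the class name `VoiceInteractionDevice`, its field names, and the choice to model each module as a plain callable are assumptions, not part of the patent. The sketch also assumes that an utterance with no recognizable intent is handled the same way as an invalid intent.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoiceInteractionDevice:
    """Hypothetical composition of modules 100 to 600 as callables."""

    recognize: Callable[[bytes], str]              # recognition processing module 100
    parse_intent: Callable[[str], Optional[str]]   # first acquisition module 200
    is_valid: Callable[[str], bool]                # detection module 300
    fetch_content: Callable[[str], str]            # second acquisition module 400
    feed_back: Callable[[str], None]               # feedback module 500
    prompt_invalid: Callable[[], None]             # prompt module 600

    def handle(self, audio: bytes) -> bool:
        """Run one utterance through the pipeline.

        Returns True if content data was fed back to the user, and False
        if the invalid-operation prompt was issued instead.
        """
        text = self.recognize(audio)
        intent = self.parse_intent(text)
        # Assumption: a missing intent is treated like an invalid one.
        if intent is None or not self.is_valid(intent):
            self.prompt_invalid()
            return False
        self.feed_back(self.fetch_content(intent))
        return True
```

Passing the modules in as callables keeps the control flow separate from any particular speech recognizer or content service, so each stage can be replaced by a stub when trying the flow out.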

It should be noted that the foregoing explanation of the voice interaction method embodiments also applies to the voice interaction device of this embodiment; details not disclosed in the voice interaction device embodiments of the present invention are not repeated here.

In summary, the voice interaction device of this embodiment obtains the voice information input by a user to a target device, performs recognition processing on the voice information, performs semantic analysis on the recognition result to obtain the user's operation intent, and detects the validity of the operation intent. If the operation intent is detected to be valid, the device performs an information processing service according to the operation intent to obtain corresponding content data, and finally feeds the content data back to the user through the target device. By actively listening for and identifying the operation intent in the user's voice information, the device actively meets the user's voice interaction needs. This solves the technical problem in the prior art that a user's voice request can be executed only after the user actively triggers a voice-interaction start event, which is inefficient and cumbersome.

Fig. 8 is a structural diagram of a voice interaction device according to another embodiment of the present invention. As shown in Fig. 8, on the basis of Fig. 6, the device further includes a third acquisition module 700, which is configured to obtain the feature information of the currently running application sent by the target device. In this embodiment, the detection module 300 is configured to detect the validity of the operation intent according to the feature information.

In one embodiment of the present invention, as shown in Fig. 9, the detection module 300 includes an acquisition unit 310, a mining unit 320, and a determination unit 330.

The acquisition unit 310 is configured to obtain, according to the feature information, the application content currently provided by the target device.

The mining unit 320 is configured to mine contextual information from the application content.

The determination unit 330 is configured to perform association detection between the contextual information and the operation intent to determine the validity of the operation intent.
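The three units of the detection module can be sketched as three small functions. The patent does not say how contextual information is mined or how the association is tested, so this sketch substitutes a deliberately simple technique: the context is the set of words appearing in the application's current content, and an intent counts as valid if it shares at least one word with that context. The `APP_CONTENT` table, the application identifiers, and all function names are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical mapping from application feature information (here just an
# application identifier) to the content that application currently offers.
APP_CONTENT: Dict[str, List[str]] = {
    "music_player": ["play song", "pause", "next track"],
    "weather": ["today forecast", "air quality"],
}


def acquire_app_content(feature_info: str) -> List[str]:
    """Acquisition unit 310: content currently provided by the target device."""
    return APP_CONTENT.get(feature_info, [])


def mine_context(app_content: List[str]) -> Set[str]:
    """Mining unit 320: assumed here to be the set of lower-cased content words."""
    return {word for item in app_content for word in item.lower().split()}


def is_intent_valid(intent: str, context: Set[str]) -> bool:
    """Determination unit 330: assumed keyword-overlap association test."""
    return any(word in context for word in intent.lower().split())
```

For example, with a music player in the foreground, an intent such as "play next song" overlaps the mined context and is treated as valid, while "book a flight" is not. A real system would likely use a richer association measure; keyword overlap is used here only to keep the sketch self-contained.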

In conclusion the voice interaction device of the embodiment of the present invention, obtains the current operation that target device is sent and applies journey The characteristic information of sequence, the validity of operation intention is detected according to characteristic information.Thus, the available content of combination product, judges Whether current phonetic entry, which needs, is handled, and improves the accuracy rate of interactive voice response.

Fig. 10 is a structural diagram of a voice interaction device according to yet another embodiment of the present invention. As shown in Fig. 10, the device further includes a fourth acquisition module 800, a query module 900, and a judgment module 1000.

The fourth acquisition module 800 is configured to perform voiceprint processing on the voice information to obtain the user's voiceprint feature.

The query module 900 is configured to obtain the device identifier of the target device and to query the pre-stored registration information to obtain the registered voiceprint feature corresponding to the device identifier.

The judgment module 1000 is configured to match the user's voiceprint feature against the registered voiceprint feature and to judge whether the user is a legitimate user in order to determine whether to perform semantic analysis.

In summary, the voice interaction device of this embodiment starts intent analysis for a user only when the voiceprint matches the registered user and device information, which ensures information security and improves the practicality and flexibility of voice interaction.

To implement the above embodiments, the present invention also provides a computer device. Fig. 11 is a structural diagram of a computer device according to one embodiment of the present invention. As shown in Fig. 11, the computer device includes a memory 21, a processor 22, and a computer program stored on the memory 21 and runnable on the processor 22.

The processor 22 implements the voice interaction method provided in the above embodiments when executing the program.

Further, the computer device also includes:

A communication interface 23, used for communication between the memory 21 and the processor 22.

The memory 21, used to store the computer program runnable on the processor 22.

The memory 21 may include high-speed RAM and may also include non-volatile memory, for example at least one magnetic disk memory.

The processor 22, used to implement the voice interaction method described in the above embodiments when executing the program.

If the memory 21, the processor 22, and the communication interface 23 are implemented independently, the communication interface 23, the memory 21, and the processor 22 can be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in Fig. 11, but this does not mean that there is only one bus or only one type of bus.

Optionally, in a specific implementation, if the memory 21, the processor 22, and the communication interface 23 are integrated on a single chip, they can communicate with one another through internal interfaces.

The processor 22 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.

To implement the above embodiments, the present invention also provides a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by a processor, the voice interaction method described in the above embodiments can be performed.

To implement the above embodiments, the present invention also provides a computer program product. When the instructions in the computer program product are executed by a processor, the voice interaction method described in the above embodiments is performed.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.

Any process or method description in a flowchart, or otherwise described herein, can be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing the steps of a custom logic function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order, depending on the functions involved. This should be understood by those skilled in the art to which the embodiments of the present invention belong.

The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logic functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, device, or apparatus (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, device, or apparatus and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, device, or apparatus. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that the parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the present invention.

Claims

1. A voice interaction method, characterized by comprising the following steps:

obtaining voice information input by a user to a target device, and performing recognition processing on the voice information;

performing semantic analysis on the recognition result to obtain the user's operation intent;

detecting the validity of the operation intent, and, if the operation intent is detected to be valid, performing an information processing service according to the operation intent to obtain corresponding content data;

feeding the content data back to the user through the target device.

2. The method of claim 1, characterized by further comprising:

obtaining feature information of the currently running application sent by the target device;

wherein detecting the validity of the operation intent comprises:

detecting the validity of the operation intent according to the feature information.

3. The method of claim 2, characterized in that detecting the validity of the operation intent according to the feature information comprises:

obtaining, according to the feature information, the application content currently provided by the target device, and mining contextual information from the application content;

performing association detection between the contextual information and the operation intent to determine the validity of the operation intent.

4. The method of claim 1, characterized in that, after detecting the validity of the operation intent, the method further comprises:

if the operation intent is detected to be invalid, feeding invalid-operation prompt information back to the target device.

5. The method of any one of claims 1-4, characterized in that, after obtaining the voice information input by the user to the target device, the method further comprises:

performing voiceprint processing on the voice information to obtain the user's voiceprint feature;

obtaining the device identifier of the target device, and querying pre-stored registration information to obtain the registered voiceprint feature corresponding to the device identifier;

matching the user's voiceprint feature against the registered voiceprint feature, and judging whether the user is a legitimate user in order to determine whether to perform semantic analysis.

6. A voice interaction device, characterized by comprising:

a recognition processing module, configured to obtain voice information input by a user to a target device and to perform recognition processing on the voice information;

a first acquisition module, configured to perform semantic analysis on the recognition result to obtain the user's operation intent;

a detection module, configured to detect the validity of the operation intent;

a second acquisition module, configured to, when the operation intent is detected to be valid, perform an information processing service according to the operation intent to obtain corresponding content data;

a feedback module, configured to feed the content data back to the user through the target device.

7. The device of claim 6, characterized by further comprising:

a third acquisition module, configured to obtain feature information of the currently running application sent by the target device;

wherein the detection module is specifically configured to detect the validity of the operation intent according to the feature information.

8. The device of claim 7, characterized in that the detection module comprises:

an acquisition unit, configured to obtain, according to the feature information, the application content currently provided by the target device;

a mining unit, configured to mine contextual information from the application content;

a determination unit, configured to perform association detection between the contextual information and the operation intent to determine the validity of the operation intent.

9. The device of claim 6, characterized by further comprising:

a prompt module, configured to feed invalid-operation prompt information back to the target device when the operation intent is detected to be invalid.

10. The device of any one of claims 6-9, characterized by further comprising:

a fourth acquisition module, configured to perform voiceprint processing on the voice information to obtain the user's voiceprint feature;

a query module, configured to obtain the device identifier of the target device and to query pre-stored registration information to obtain the registered voiceprint feature corresponding to the device identifier;

a judgment module, configured to match the user's voiceprint feature against the registered voiceprint feature and to judge whether the user is a legitimate user in order to determine whether to perform semantic analysis.

11. A computer device, characterized by comprising a processor and a memory;

wherein the processor runs a program corresponding to executable program code stored in the memory by reading the executable program code, so as to implement the voice interaction method of any one of claims 1-5.

12. A computer program product, characterized in that, when the instructions in the computer program product are executed by a processor, the voice interaction method of any one of claims 1-5 is implemented.

13. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that the voice interaction method of any one of claims 1-5 is implemented when the computer program is executed by a processor.