CN109461448A

CN109461448A - Voice interaction method and device

Info

Publication number: CN109461448A
Application number: CN201811508104.8A
Authority: CN
Inventors: 申慧丽
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd; Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2018-12-11
Filing date: 2018-12-11
Publication date: 2019-03-12

Abstract

The present invention proposes a voice interaction method and device. The method includes: acquiring a voice signal of a user; recognizing the voice signal and determining whether it contains a wake-up phrase; if the voice signal contains a wake-up phrase, switching the smart voice device from a sleep state to an awake state so as to carry out voice interaction with the user; and, when an acquired voice signal contains an ending phrase, stopping the voice interaction and switching the smart voice device from the awake state back to the sleep state. As a result, while interacting with the smart voice device the user only needs to say the wake-up phrase once and the ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.

Description

Voice interaction method and device

Technical field

The present invention relates to the technical field of voice interaction, and in particular to a voice interaction method and device.

Background art

At present, when a user interacts by voice with a smart voice device, each question the user asks must be preceded by a wake-up word: the user first says the wake-up word to wake the device, then asks the question, and then receives the answer given by the device. Consequently, the user has to say the wake-up word again and again during voice interaction. The whole interaction is fragmented and cumbersome and does not match ordinary conversational habits, which reduces the efficiency of voice interaction and degrades the user's experience.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, the first object of the present invention is to propose a voice interaction method that solves the problem of poor voice interaction efficiency in the prior art.

The second object of the present invention is to propose a voice interaction device.

The third object of the present invention is to propose another voice interaction device.

The fourth object of the present invention is to propose a non-transitory computer-readable storage medium.

The fifth object of the present invention is to propose a computer program product.

To achieve the above objects, an embodiment of the first aspect of the present invention proposes a voice interaction method, applied to a smart voice device, comprising:

acquiring a voice signal of a user;

recognizing the voice signal and determining whether the voice signal contains a wake-up phrase;

if the voice signal contains a wake-up phrase, switching the smart voice device from a sleep state to an awake state so as to carry out voice interaction with the user;

when an acquired voice signal contains an ending phrase, stopping the voice interaction and switching the smart voice device from the awake state to the sleep state.

Further, recognizing the voice signal and determining whether the voice signal contains a wake-up phrase comprises:

recognizing the voice signal to obtain the text content corresponding to the voice signal;

querying a wake-up dictionary according to the text content, and determining whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up dictionary;

if the text content contains the first word or the first sentence, determining that the voice signal contains a wake-up phrase.

Further, after the smart voice device is switched from the sleep state to the awake state, the method further comprises:

when an acquired voice signal does not contain an ending phrase, recognizing the acquired voice signal to obtain the corresponding first text content;

querying a question library according to the first text content to obtain a question that matches the first text content;

comparing the first text content with the question, and determining whether the first text content lacks a component;

if the first text content does not lack a component, determining the answer corresponding to the matched question as the response result for the first text content.

Further, after comparing the first text content with the question and determining whether the first text content lacks a component, the method further comprises:

if the first text content lacks a component, prompting the user to supply the component;

determining the question raised by the user by combining the newly acquired voice signal with the first text content;

determining the answer corresponding to the question raised by the user as the response result.

Further, determining the question raised by the user by combining the newly acquired voice signal with the first text content comprises:

recognizing the newly acquired voice signal to obtain the corresponding second text content;

determining whether the second text content is the component that the first text content lacks;

if the second text content is the component that the first text content lacks, determining the question raised by the user by combining the second text content with the first text content;

if the second text content is not the component that the first text content lacks, determining the question raised by the user according to the second text content.

In the voice interaction method of the embodiment of the present invention, a voice signal of a user is acquired; the voice signal is recognized to determine whether it contains a wake-up phrase; if the voice signal contains a wake-up phrase, the smart voice device is switched from the sleep state to the awake state so as to carry out voice interaction with the user; and when an acquired voice signal contains an ending phrase, the voice interaction is stopped and the smart voice device is switched from the awake state to the sleep state. Thus, while interacting with the smart voice device the user only needs to say the wake-up phrase once and the ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.

To achieve the above objects, an embodiment of the second aspect of the present invention proposes a voice interaction device, applied to a smart voice device, comprising:

an acquisition module, configured to acquire a voice signal of a user;

a recognition module, configured to recognize the voice signal and determine whether the voice signal contains a wake-up phrase;

a conversion processing module, configured to switch the smart voice device from a sleep state to an awake state when the voice signal contains a wake-up phrase, so as to carry out voice interaction with the user;

the conversion processing module being further configured to stop the voice interaction and switch the smart voice device from the awake state to the sleep state when an acquired voice signal contains an ending phrase.

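Further, the recognition module is specifically configured to: recognize the voice signal to obtain the text content corresponding to the voice signal; query a wake-up dictionary according to the text content, and determine whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up dictionary; and, if the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up phrase.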

Further, the device further comprises: a query module, a comparison module and a determination module;

the recognition module is further configured to, when an acquired voice signal does not contain an ending phrase, recognize the acquired voice signal to obtain the corresponding first text content;

the query module is configured to query a question library according to the first text content to obtain a question that matches the first text content;

the comparison module is configured to compare the first text content with the question and determine whether the first text content lacks a component;

the determination module is configured to, when the first text content does not lack a component, determine the answer corresponding to the matched question as the response result for the first text content.

Further, the device further comprises: a prompt module;

the prompt module is configured to prompt the user to supply the component when the first text content lacks a component;

the determination module is further configured to determine the question raised by the user by combining the newly acquired voice signal with the first text content;

the determination module is further configured to determine the answer corresponding to the question raised by the user as the response result.

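Further, the determination module is specifically configured to: recognize the newly acquired voice signal to obtain the corresponding second text content; determine whether the second text content is the component that the first text content lacks; if so, determine the question raised by the user by combining the second text content with the first text content; and, if not, determine the question raised by the user according to the second text content.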

In the voice interaction device of the embodiment of the present invention, a voice signal of a user is acquired; the voice signal is recognized to determine whether it contains a wake-up phrase; if the voice signal contains a wake-up phrase, the smart voice device is switched from the sleep state to the awake state so as to carry out voice interaction with the user; and when an acquired voice signal contains an ending phrase, the voice interaction is stopped and the smart voice device is switched from the awake state to the sleep state. Thus, while interacting with the smart voice device the user only needs to say the wake-up phrase once and the ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.

To achieve the above objects, an embodiment of the third aspect of the present invention proposes another voice interaction device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the voice interaction method described above when executing the program.

To achieve the above objects, an embodiment of the fourth aspect of the present invention proposes a computer-readable storage medium on which a computer program is stored, the program implementing the voice interaction method described above when executed by a processor.

To achieve the above objects, an embodiment of the fifth aspect of the present invention proposes a computer program product; when instructions in the computer program product are executed by a processor, the voice interaction method described above is implemented.

Additional aspects and advantages of the present invention will be set forth in part in the description that follows, will in part become apparent from that description, or may be learned through practice of the present invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a schematic flowchart of a voice interaction method provided by an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a voice interaction device provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements, or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they should not be construed as limiting it.

The voice interaction method and device of the embodiments of the present invention are described below with reference to the accompanying drawings.

Fig. 1 is a schematic flowchart of a voice interaction method provided by an embodiment of the present invention. As shown in Fig. 1, the voice interaction method is applied to a smart voice device and mainly comprises the following steps:

S101: Acquire a voice signal of a user.

The voice interaction method provided by the present invention is executed by a voice interaction device, which may be a hardware device with a voice interaction function or software installed on a hardware device. The hardware device is specifically a smart voice device, for example a smart player, smart water heater, smart heater, smart washing machine or similar device.

In this embodiment, the smart voice device may be provided with a microphone or similar component that remains in a voice signal acquisition state at all times, so as to acquire the user's voice signal in real time.

S102: Recognize the voice signal and determine whether the voice signal contains a wake-up phrase.

In this embodiment, the smart voice device may perform step S102 as follows: recognize the voice signal to obtain the text content corresponding to the voice signal; query a wake-up dictionary according to the text content, and determine whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up dictionary; and, if the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up phrase.

In this embodiment, the wake-up dictionary may contain multiple wake-up phrases. A wake-up phrase may be a wake-up sentence or a wake-up word. Correspondingly, after obtaining the text content corresponding to the voice signal, the smart voice device may split the text content into sentences, query the wake-up dictionary with the resulting sentences, and determine whether there is a first sentence that matches a wake-up phrase in the wake-up dictionary; it may also split the text content into words, query the wake-up dictionary with the resulting words, and determine whether there is a first word that matches a wake-up phrase in the wake-up dictionary.
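As an illustration only, the sentence-level and word-level lookup described above might be sketched in Python as follows. The dictionary entries, the function names and the punctuation-based splitting are assumptions made for this sketch; the embodiment does not prescribe a particular splitting method or particular dictionary content.

```python
import re

# Hypothetical wake-up dictionary; the embodiment does not list concrete entries.
WAKE_DICTIONARY = {"hello device", "assistant"}


def split_sentences(text: str) -> list[str]:
    """Split recognized text into lowercase sentences on end punctuation."""
    parts = re.split(r"[.!?;。！？；]+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def split_words(text: str) -> list[str]:
    """Split recognized text into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_wake_phrase(text: str, dictionary: set[str] = WAKE_DICTIONARY) -> bool:
    """Return True if any sentence or any word of the text is in the dictionary."""
    # Sentence-level lookup: a whole sentence equals a dictionary entry.
    if any(sentence in dictionary for sentence in split_sentences(text)):
        return True
    # Word-level lookup: a single word equals a dictionary entry.
    return any(word in dictionary for word in split_words(text))
```

With these assumed entries, "Hello device. What's the weather?" matches at the sentence level and "Assistant, play music" matches at the word level, while "Play some music" matches neither.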

S103: If the voice signal contains a wake-up phrase, switch the smart voice device from the sleep state to the awake state so as to carry out voice interaction with the user.

In this embodiment, when the smart voice device is in the awake state, it can directly acquire the user's voice signal, recognize it to obtain the question, and give the user a spoken response, without having to recognize a wake-up phrase again. In other words, until a voice signal containing an ending phrase is acquired, the user can carry out multiple rounds of question-and-answer interaction with the smart voice device.

It should be noted that there may be more than one user at this point. While the smart voice device is in the awake state, it can carry out voice interaction with multiple users until it acquires an ending phrase from any one of them.

Further, on the basis of the above embodiment, after the smart voice device is switched from the sleep state to the awake state, and while the user is interacting by voice with the smart voice device, the method may further comprise the following steps: when an acquired voice signal does not contain an ending phrase, recognizing the acquired voice signal to obtain the corresponding first text content; querying a question library according to the first text content to obtain a question that matches the first text content; comparing the first text content with the question and determining whether the first text content lacks a component; and, if the first text content does not lack a component, determining the answer corresponding to the matched question as the response result for the first text content.

In addition, if the first text content lacks a component, the user is prompted to supply the component; the question raised by the user is determined by combining the newly acquired voice signal with the first text content; and the answer corresponding to the question raised by the user is determined as the response result.

In this embodiment, the question library may contain standard questions and the answer corresponding to each question. A standard question is a question that does not lack any component, for example "help me find a nearby Chinese restaurant", which contains a distance range, a location and a search object. Correspondingly, a non-standard question is, for example, "help me find a restaurant", which lacks a distance range and whose search object is not specific enough. In this embodiment, standard questions can be set according to the user's actual needs. For example, the questions in the question library are defined as standard questions; a question that is not in the question library and that, compared with a question in the library, lacks certain components or has a component whose type is not specific enough is defined as a non-standard question.

It should be noted that, in this embodiment, when the first text content lacks a component, it may match multiple questions. Taking the first text content "help me find a restaurant" as an example, it may match the following questions in the question library: "help me find a nearby Chinese restaurant", "help me find a nearby Western restaurant", and so on. The fact that multiple questions are matched is itself an indication that the first text content lacks a component.
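One possible way to picture the comparison against the question library is the Python sketch below. It assumes, purely for illustration, that a text matches a standard question when its words appear in that question in the same order, and that the missing components are the remaining words of the question. The library entries, the answers and all function names are hypothetical and are not taken from the embodiment.

```python
# Hypothetical question library: standard question -> answer.
QUESTION_LIBRARY = {
    "help me find a nearby chinese restaurant": "Here are nearby Chinese restaurants.",
    "help me find a nearby western restaurant": "Here are nearby Western restaurants.",
}


def is_subsequence(short: list[str], long: list[str]) -> bool:
    """Return True if the words of `short` appear in `long` in the same order."""
    remaining = iter(long)
    return all(word in remaining for word in short)


def match_questions(text: str, library: dict[str, str] = QUESTION_LIBRARY) -> list[str]:
    """Return every standard question that the text could be a shortened form of."""
    words = text.lower().split()
    return [q for q in library if is_subsequence(words, q.split())]


def missing_components(text: str, question: str) -> list[str]:
    """Return the words of the standard question that the text does not contain."""
    present = set(text.lower().split())
    return [word for word in question.split() if word not in present]


def respond(text: str, library: dict[str, str] = QUESTION_LIBRARY):
    """Return (answer, None) for a complete question, else (None, missing words)."""
    matches = match_questions(text, library)
    if len(matches) == 1 and not missing_components(text, matches[0]):
        return library[matches[0]], None
    # Several matches, or words left over, indicate that components are missing.
    missing = sorted({w for q in matches for w in missing_components(text, q)})
    return None, missing
```

In this sketch "help me find a restaurant" matches both library entries, so `respond` returns no answer and reports the words a prompt could ask for; a result of `(None, [])` means that nothing in the library matched at all.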

In this embodiment, taking the first text content "help me find a restaurant" as an example, the first text content lacks a distance range and its search object is not specific enough. The smart voice device can therefore prompt the user by voice to supply a distance range and a more specific search object, so that the user issues a new voice signal in response to the prompt, for example "nearby Western restaurant". The smart voice device then acquires this new voice signal and, from the newly acquired voice signal and the first text content, determines that the question raised by the user is "help me find a nearby Western restaurant".

In this embodiment, the smart voice device may determine the question raised by the user from the newly acquired voice signal and the first text content as follows: recognize the newly acquired voice signal to obtain the corresponding second text content; determine whether the second text content is the component that the first text content lacks; if the second text content is the component that the first text content lacks, determine the question raised by the user by combining the second text content with the first text content; and, if the second text content is not the component that the first text content lacks, determine the question raised by the user according to the second text content.

For example, taking the first text content "help me find a restaurant" again, the first text content lacks a distance range and its search object is not specific enough, so the smart voice device can prompt the user by voice to supply a distance range and a more specific search object, and the user issues a new voice signal in response to the prompt. If the new voice signal is "please help me find a nearby traditional Chinese medicine hospital", the second text content corresponding to that voice signal is unrelated to the first text content. The first text content can then be discarded, and the second text content is taken directly as the question raised by the user.
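The two branches above can be illustrated with the following Python sketch. It assumes, for illustration only, that the follow-up counts as the missing component when it supplies every word the first text lacks relative to some standard question and contains no word foreign to that question. The list of standard questions and the function names are hypothetical.

```python
# Hypothetical standard questions; the embodiment does not enumerate them.
STANDARD_QUESTIONS = [
    "help me find a nearby chinese restaurant",
    "help me find a nearby western restaurant",
    "help me find a nearby hospital",
]


def is_subsequence(short: list[str], long: list[str]) -> bool:
    """Return True if the words of `short` appear in `long` in the same order."""
    remaining = iter(long)
    return all(word in remaining for word in short)


def resolve_question(first_text: str, second_text: str,
                     questions: list[str] = STANDARD_QUESTIONS) -> str:
    """Combine an incomplete first text with a follow-up, or fall back to the follow-up."""
    first = first_text.lower().split()
    second = set(second_text.lower().split())
    for question in questions:
        q_words = question.split()
        if not is_subsequence(first, q_words):
            continue  # the first text is not a shortened form of this question
        missing = {word for word in q_words if word not in first}
        # The follow-up is the missing component if it covers every missing
        # word and adds nothing outside the standard question.
        if missing and missing <= second <= set(q_words):
            return question
    # Otherwise the first text is discarded and the follow-up stands alone.
    return second_text.lower()
```

Here the follow-up "nearby western restaurant" completes "help me find a restaurant", whereas an unrelated follow-up about a hospital is returned unchanged as the new question.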

In this embodiment, after the smart voice device has determined the question raised by the user, it can work out the user's true intention from that question and obtain an answer matching the true intention from a question-and-answer library or a large-scale resource library. It should be noted that a "question" here is not limited to an actual interrogative sentence; it may also be a sentence that expresses a request, such as "please provide the details of the smart voice device".

S104: When an acquired voice signal contains an ending phrase, stop the voice interaction and switch the smart voice device from the awake state to the sleep state.

An ending phrase may be a sentence or a word. In this embodiment, the smart voice device may determine whether a voice signal contains an ending phrase as follows: recognize the acquired voice signal to obtain the corresponding text content; query an ending dictionary according to the text content, and determine whether the text content contains a second word or a second sentence that matches an ending phrase in the ending dictionary; and, if the text content contains the second word or the second sentence, determine that the voice signal contains an ending phrase. Examples of ending phrases include "OK, I'll stop chatting with you for now", "let's leave it here for now", "goodbye" and "bye-bye".
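Taken together, steps S101 to S104 amount to a two-state machine. The Python sketch below is one minimal way to express it, assuming that speech has already been converted to text and that simple substring checks stand in for the dictionary lookups. The class name, the phrase lists and the returned action strings are hypothetical.

```python
from enum import Enum


class DeviceState(Enum):
    SLEEP = "sleep"
    AWAKE = "awake"


# Hypothetical phrase lists standing in for the wake-up and ending dictionaries.
WAKE_PHRASES = ("hello device",)
END_PHRASES = ("goodbye", "bye-bye", "let's leave it here")


class VoiceDevice:
    """Two-state sketch: one wake-up phrase opens a session, one ending phrase closes it."""

    def __init__(self) -> None:
        self.state = DeviceState.SLEEP

    def handle(self, text: str) -> str:
        """Process one recognized utterance and return a description of the action taken."""
        text = text.lower()
        if self.state is DeviceState.SLEEP:
            # While asleep, only a wake-up phrase has any effect (S102, S103).
            if any(phrase in text for phrase in WAKE_PHRASES):
                self.state = DeviceState.AWAKE
                return "woke up"
            return "ignored"
        # While awake, every utterance is checked for an ending phrase (S104).
        if any(phrase in text for phrase in END_PHRASES):
            self.state = DeviceState.SLEEP
            return "went to sleep"
        # Any other utterance is treated as a question, with no wake-up phrase needed.
        return f"answering: {text}"
```

Feeding the utterances "what time is it", "Hello device", "what time is it", "play music", "Goodbye" and "play music" to one `VoiceDevice` in that order shows the intended behaviour: the first and last are ignored, and the two in the middle are answered without repeating the wake-up phrase.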

Fig. 2 is a schematic structural diagram of a voice interaction device provided by an embodiment of the present invention. As shown in Fig. 2, the voice interaction device is applied to a smart voice device and mainly comprises: an acquisition module 21, a recognition module 22 and a conversion processing module 23.

The acquisition module 21 is configured to acquire a voice signal of a user;

the recognition module 22 is configured to recognize the voice signal and determine whether the voice signal contains a wake-up phrase;

the conversion processing module 23 is configured to switch the smart voice device from the sleep state to the awake state when the voice signal contains a wake-up phrase, so as to carry out voice interaction with the user;

the conversion processing module 23 is further configured to stop the voice interaction and switch the smart voice device from the awake state to the sleep state when an acquired voice signal contains an ending phrase.

The voice interaction device provided by the present invention may be a hardware device with a voice interaction function or software installed on a hardware device. The hardware device is specifically a smart voice device, for example a smart player, smart water heater, smart heater, smart washing machine or similar device.

In this embodiment, the recognition module 22 is specifically configured to: recognize the voice signal to obtain the text content corresponding to the voice signal; query a wake-up dictionary according to the text content, and determine whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up dictionary; and, if the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up phrase.

Further, referring also to Fig. 3, on the basis of the embodiment shown in Fig. 2, the device may further comprise: a query module 24, a comparison module 25 and a determination module 26;

the recognition module 22 is further configured to, when an acquired voice signal does not contain an ending phrase, recognize the acquired voice signal to obtain the corresponding first text content;

the query module 24 is configured to query a question library according to the first text content to obtain a question that matches the first text content;

the comparison module 25 is configured to compare the first text content with the question and determine whether the first text content lacks a component;

the determination module 26 is configured to, when the first text content does not lack a component, determine the answer corresponding to the matched question as the response result for the first text content.

Further, referring also to Fig. 4, on the basis of the embodiment shown in Fig. 3, the device may further comprise: a prompt module 27;

the prompt module 27 is configured to prompt the user to supply the component when the first text content lacks a component;

the determination module 26 is further configured to determine the question raised by the user by combining the newly acquired voice signal with the first text content;

the determination module 26 is further configured to determine the answer corresponding to the question raised by the user as the response result.

In this embodiment, the determination module 26 may determine the question raised by the user from the newly acquired voice signal and the first text content as follows: recognize the newly acquired voice signal to obtain the corresponding second text content; determine whether the second text content is the component that the first text content lacks; if the second text content is the component that the first text content lacks, determine the question raised by the user by combining the second text content with the first text content; and, if the second text content is not the component that the first text content lacks, determine the question raised by the user according to the second text content.

Fig. 5 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention. The voice interaction device comprises:

a memory 1001, a processor 1002, and a computer program stored in the memory 1001 and executable on the processor 1002.

The processor 1002 implements the voice interaction method provided in the above embodiments when executing the program.

Further, the voice interaction device also comprises:

a communication interface 1003, used for communication between the memory 1001 and the processor 1002.

The memory 1001 is used to store the computer program executable on the processor 1002.

The memory 1001 may include high-speed RAM, and may also include non-volatile memory, for example at least one magnetic disk storage device.

The processor 1002 is used to implement the voice interaction method described in the above embodiments when executing the program.

If the memory 1001, the processor 1002 and the communication interface 1003 are implemented independently, the communication interface 1003, the memory 1001 and the processor 1002 can be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of representation, the bus is drawn as a single thick line in Fig. 5, but this does not mean that there is only one bus or only one type of bus.

Optionally, in a specific implementation, if the memory 1001, the processor 1002 and the communication interface 1003 are integrated on a single chip, they can communicate with one another through an internal interface.

The processor 1002 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.

The present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the program implementing the voice interaction method described above when executed by a processor.

The present invention also provides a computer program product; when instructions in the computer program product are executed by a processor, the voice interaction method described above is implemented.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", "some examples" or the like means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.

Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.

The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and can then be stored in a computer memory.

It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies well known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.

Those of ordinary skill in the art will understand that all or some of the steps carried by the methods of the above embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software function module. If the integrated module is implemented as a software function module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims

1. A voice interaction method, applied to a voice smart device, characterized by comprising:

obtaining a voice signal of a user;

if the voice signal includes a wake-up phrase, switching the voice smart device from a dormant state to a wake-up state, so as to perform a voice interaction operation with the user;

when an obtained voice signal includes a closing phrase, stopping the voice interaction operation and switching the voice smart device from the wake-up state to the dormant state.

2. The method according to claim 1, characterized in that identifying the voice signal and judging whether the voice signal includes a wake-up phrase comprises:

querying a wake-up lexicon according to the text content, and judging whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up lexicon;

if the first word or the first sentence is present in the text content, determining that the voice signal includes a wake-up phrase.

3. The method according to claim 1, characterized in that, after switching the voice smart device from the dormant state to the wake-up state, the method further comprises:

when an obtained voice signal does not include a closing phrase, recognizing the obtained voice signal to obtain corresponding first text content;

comparing the first text content with the question, and judging whether the first text content lacks a component;

if the first text content does not lack a component, determining the answer corresponding to the matched question as the response result corresponding to the first text content.

4. The method according to claim 3, characterized in that, after comparing the first text content with the question and judging whether the first text content lacks a component, the method further comprises:

5. The method according to claim 4, characterized in that determining the question raised by the user by combining the newly obtained voice signal and the first text content comprises:

if the second text content is the component lacking in the first text content, determining the question raised by the user by combining the second text content and the first text content;

if the second text content is not the component lacking in the first text content, determining the question raised by the user according to the second text content.

6. A voice interaction apparatus, applied to a voice smart device, characterized by comprising:

an obtaining module, configured to obtain a voice signal of a user;

a conversion processing module, configured to switch the voice smart device from a dormant state to a wake-up state when the voice signal includes a wake-up phrase, so as to perform a voice interaction operation with the user;

the conversion processing module being further configured to, when an obtained voice signal includes a closing phrase, stop the voice interaction operation and switch the voice smart device from the wake-up state to the dormant state.

7. The apparatus according to claim 6, characterized in that the identification module is specifically configured to,

8. The apparatus according to claim 6, characterized by further comprising: a query module, a comparison module, and a determining module;

the identification module being further configured to, when an obtained voice signal does not include a closing phrase, recognize the obtained voice signal to obtain corresponding first text content;

the query module being configured to query a question library according to the first text content to obtain a question that matches the first text content;

the comparison module being configured to compare the first text content with the question and judge whether the first text content lacks a component;

the determining module being configured to, when the first text content does not lack a component, determine the answer corresponding to the matched question as the response result corresponding to the first text content.

9. The apparatus according to claim 8, characterized by further comprising: a prompt module;

the determining module being further configured to determine the question raised by the user by combining the newly obtained voice signal and the first text content;

10. The apparatus according to claim 9, characterized in that the determining module is specifically configured to,

11. A voice interaction apparatus, characterized by comprising:

a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the voice interaction method according to any one of claims 1 to 5.

12. A non-transitory computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the voice interaction method according to any one of claims 1 to 5.

13. A computer program product, wherein, when instructions in the computer program product are executed by a processor, the voice interaction method according to any one of claims 1 to 5 is implemented.
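
As a rough illustration only, and not part of the claims, the state switching recited in claims 1, 2, and 6 can be sketched in Python as a two-state machine. The sketch assumes the voice signal has already been recognized into text. The names `State`, `VoiceDevice`, and `handle`, the two lexicons, the returned labels, and the use of plain substring matching for the lexicon query are all hypothetical and do not come from the original disclosure. The closing phrase in the sketch corresponds to the term rendered as "conclusion" in the abstract.

```python
from enum import Enum


class State(Enum):
    DORMANT = "dormant"
    AWAKE = "awake"


class VoiceDevice:
    """Toy model of the dormant/wake-up state switching (hypothetical names)."""

    def __init__(self, wake_lexicon, closing_lexicon):
        # Assumption: both lexicons are collections of lowercase phrases.
        self.wake_lexicon = set(wake_lexicon)
        self.closing_lexicon = set(closing_lexicon)
        self.state = State.DORMANT

    @staticmethod
    def _contains_any(text, lexicon):
        # Assumption: substring matching stands in for the lexicon query.
        lowered = text.lower()
        return any(phrase in lowered for phrase in lexicon)

    def handle(self, text):
        """Process one recognized utterance and return a label for the action taken."""
        if self.state is State.DORMANT:
            # While dormant, only a wake-up phrase changes anything.
            if self._contains_any(text, self.wake_lexicon):
                self.state = State.AWAKE
                return "woke"
            return "ignored"
        # While awake, a closing phrase ends the interaction.
        if self._contains_any(text, self.closing_lexicon):
            self.state = State.DORMANT
            return "slept"
        # Any other utterance is treated as part of the ongoing interaction.
        return "answered"


if __name__ == "__main__":
    device = VoiceDevice({"hello device"}, {"goodbye"})
    for utterance in ["what time is it", "hello device",
                      "what time is it", "and the weather", "goodbye"]:
        print(utterance, "->", device.handle(utterance))
```

In this sketch the user says the wake-up phrase once, asks any number of questions without repeating it, and ends the session with a single closing phrase, which is the behavior the abstract describes.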