CN109461448A - Voice interactive method and device - Google Patents

Info

Publication number
CN109461448A
Authority
CN
China
Prior art keywords
text
content
voice signal
voice
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811508104.8A
Other languages
Chinese (zh)
Inventor
申慧丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201811508104.8A
Publication of CN109461448A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention provides a voice interaction method and device. The method comprises: acquiring a voice signal of a user; recognizing the voice signal and determining whether it contains a wake-up phrase; if the voice signal contains a wake-up phrase, switching the voice smart device from a sleep state to an awake state so as to perform voice interaction with the user; and, when an acquired voice signal contains an ending phrase, stopping the voice interaction and switching the voice smart device from the awake state back to the sleep state. As a result, during voice interaction with the voice smart device the user only needs to say a wake-up phrase once and an ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.

Description

Voice interaction method and device
Technical field
The present invention relates to the technical field of voice interaction, and in particular to a voice interaction method and device.
Background
At present, when a user interacts by voice with a voice smart device, the user must say a wake-up word to wake the device before every question, then ask the question and obtain the answer the device provides. The user therefore has to say the wake-up word frequently, which makes the whole interaction fragmented and cumbersome, does not match ordinary conversational habits, and reduces both the efficiency of voice interaction and the user's experience.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, a first object of the present invention is to provide a voice interaction method that addresses the low efficiency of voice interaction in the prior art.
A second object of the present invention is to provide a voice interaction device.
A third object of the present invention is to provide another voice interaction device.
A fourth object of the present invention is to provide a non-transitory computer-readable storage medium.
A fifth object of the present invention is to provide a computer program product.
To achieve the above objects, an embodiment of the first aspect of the present invention provides a voice interaction method, applied to a voice smart device, comprising:
acquiring a voice signal of a user;
recognizing the voice signal and determining whether the voice signal contains a wake-up phrase;
if the voice signal contains a wake-up phrase, switching the voice smart device from a sleep state to an awake state so as to perform voice interaction with the user;
when an acquired voice signal contains an ending phrase, stopping the voice interaction and switching the voice smart device from the awake state to the sleep state.
Further, recognizing the voice signal and determining whether the voice signal contains a wake-up phrase comprises:
recognizing the voice signal to obtain the text content corresponding to the voice signal;
querying a wake-up lexicon according to the text content, and determining whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up lexicon;
if the text content contains the first word or the first sentence, determining that the voice signal contains a wake-up phrase.
Further, after the voice smart device is switched from the sleep state to the awake state, the method further comprises:
when an acquired voice signal does not contain an ending phrase, recognizing the acquired voice signal to obtain the corresponding first text content;
querying a question bank according to the first text content to obtain a question that matches the first text content;
comparing the first text content with the question, and determining whether the first text content is missing a component;
if the first text content is not missing a component, determining the answer corresponding to the matched question as the response result for the first text content.
Further, after comparing the first text content with the question and determining whether the first text content is missing a component, the method further comprises:
if the first text content is missing a component, prompting the user to supply the component;
determining the question raised by the user by combining the newly acquired voice signal with the first text content;
determining the answer corresponding to the question raised by the user as the response result.
Further, determining the question raised by the user by combining the newly acquired voice signal with the first text content comprises:
recognizing the newly acquired voice signal to obtain the corresponding second text content;
determining whether the second text content is the component missing from the first text content;
if the second text content is the component missing from the first text content, determining the question raised by the user by combining the second text content with the first text content;
if the second text content is not the component missing from the first text content, determining the question raised by the user from the second text content alone.
In the voice interaction method of the embodiments of the present invention, a voice signal of a user is acquired; the voice signal is recognized to determine whether it contains a wake-up phrase; if it does, the voice smart device is switched from the sleep state to the awake state so as to perform voice interaction with the user; and when an acquired voice signal contains an ending phrase, the voice interaction is stopped and the voice smart device is switched from the awake state to the sleep state. During voice interaction with the voice smart device, the user therefore only needs to say a wake-up phrase once and an ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.
To achieve the above objects, an embodiment of the second aspect of the present invention provides a voice interaction device, applied to a voice smart device, comprising:
an acquisition module, configured to acquire a voice signal of a user;
a recognition module, configured to recognize the voice signal and determine whether the voice signal contains a wake-up phrase;
a conversion processing module, configured to switch the voice smart device from a sleep state to an awake state when the voice signal contains a wake-up phrase, so as to perform voice interaction with the user;
the conversion processing module being further configured to stop the voice interaction and switch the voice smart device from the awake state to the sleep state when an acquired voice signal contains an ending phrase.
Further, the recognition module is specifically configured to:
recognize the voice signal to obtain the text content corresponding to the voice signal;
query a wake-up lexicon according to the text content, and determine whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up lexicon;
if the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up phrase.
Further, the device further comprises a query module, a comparison module, and a determination module;
the recognition module is further configured to, when an acquired voice signal does not contain an ending phrase, recognize the acquired voice signal to obtain the corresponding first text content;
the query module is configured to query a question bank according to the first text content to obtain a question that matches the first text content;
the comparison module is configured to compare the first text content with the question and determine whether the first text content is missing a component;
the determination module is configured to, when the first text content is not missing a component, determine the answer corresponding to the matched question as the response result for the first text content.
Further, the device further comprises a prompt module;
the prompt module is configured to prompt the user to supply the component when the first text content is missing a component;
the determination module is further configured to determine the question raised by the user by combining the newly acquired voice signal with the first text content;
the determination module is further configured to determine the answer corresponding to the question raised by the user as the response result.
Further, the determination module is specifically configured to:
recognize the newly acquired voice signal to obtain the corresponding second text content;
determine whether the second text content is the component missing from the first text content;
if the second text content is the component missing from the first text content, determine the question raised by the user by combining the second text content with the first text content;
if the second text content is not the component missing from the first text content, determine the question raised by the user from the second text content alone.
In the voice interaction device of the embodiments of the present invention, a voice signal of a user is acquired; the voice signal is recognized to determine whether it contains a wake-up phrase; if it does, the voice smart device is switched from the sleep state to the awake state so as to perform voice interaction with the user; and when an acquired voice signal contains an ending phrase, the voice interaction is stopped and the voice smart device is switched from the awake state to the sleep state. During voice interaction with the voice smart device, the user therefore only needs to say a wake-up phrase once and an ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.
To achieve the above objects, an embodiment of the third aspect of the present invention provides another voice interaction device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the voice interaction method described above when executing the program.
To achieve the above objects, an embodiment of the fourth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the program implements the voice interaction method described above when executed by a processor.
To achieve the above objects, an embodiment of the fifth aspect of the present invention provides a computer program product, wherein the voice interaction method described above is implemented when instructions in the computer program product are executed by a processor.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a voice interaction method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a voice interaction device provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.
The voice interaction method and device of the embodiments of the present invention are described below with reference to the drawings.
Fig. 1 is a schematic flowchart of a voice interaction method provided by an embodiment of the present invention. As shown in Fig. 1, the voice interaction method is applied to a voice smart device and mainly comprises the following steps:
S101: acquire a voice signal of a user.
The voice interaction method provided by the present invention is executed by a voice interaction device, which may be a hardware device with a voice interaction function or software installed on a hardware device. The hardware device is specifically a voice smart device, for example a smart player, a smart water heater, a smart heater, or a smart washing machine.
In this embodiment, the voice smart device may be provided with a microphone or similar component that remains in a voice-signal-collecting state so as to capture the user's voice signal in real time.
S102: recognize the voice signal and determine whether it contains a wake-up phrase.
In this embodiment, the voice smart device may perform step S102 as follows: recognize the voice signal to obtain the corresponding text content; query a wake-up lexicon according to the text content and determine whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up lexicon; and, if the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up phrase.
In this embodiment, the wake-up lexicon may contain multiple wake-up phrases, and a wake-up phrase may be a wake-up sentence or a wake-up word. Correspondingly, after the voice smart device obtains the text content corresponding to the voice signal, it may split the text content into sentences, query the wake-up lexicon with the resulting sentences, and determine whether there is a first sentence that matches a wake-up phrase in the lexicon; it may also split the text content into words, query the wake-up lexicon with the resulting words, and determine whether there is a first word that matches a wake-up phrase in the lexicon.
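The sentence-level and word-level lookups described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the lexicon entries, the punctuation-based sentence splitter, and the function names are all assumptions, and the input is text that a speech recognizer has already produced.

```python
import re

# Hypothetical wake-up lexicon; the patent does not list actual entries.
WAKE_LEXICON = {"hello assistant", "assistant"}


def split_sentences(text):
    """Split recognized text into clauses on common punctuation (an assumed rule)."""
    parts = re.split(r"[.!?,;。！？，；]+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def split_words(text):
    """Split recognized text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def contains_wake_phrase(text, lexicon=WAKE_LEXICON):
    """Return True if any sentence or any single word of the text is in the lexicon."""
    if any(sentence in lexicon for sentence in split_sentences(text)):
        return True
    return any(word in lexicon for word in split_words(text))
```

For example, `contains_wake_phrase("Hello assistant, what time is it?")` matches at the sentence level, while `contains_wake_phrase("Hey assistant")` matches only at the word level.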
S103: if the voice signal contains a wake-up phrase, switch the voice smart device from the sleep state to the awake state so as to perform voice interaction with the user.
In this embodiment, while the voice smart device is in the awake state, it can directly acquire the user's voice signal, recognize it to obtain the question, and give the user a spoken response, without recognizing a wake-up phrase again. In other words, until a voice signal containing an ending phrase is acquired, the user can carry on multiple rounds of question-and-answer interaction with the voice smart device.
It should be noted that there may be multiple users at this point. While the voice smart device is in the awake state, it can perform voice interaction with multiple users until it acquires an ending phrase from any one of them.
Further, on the basis of the above embodiment, after the voice smart device is switched from the sleep state to the awake state and while the user interacts with it by voice, the method may further comprise the following steps: when an acquired voice signal does not contain an ending phrase, recognize the acquired voice signal to obtain the corresponding first text content; query a question bank according to the first text content to obtain a question that matches the first text content; compare the first text content with the question and determine whether the first text content is missing a component; and, if the first text content is not missing a component, determine the answer corresponding to the matched question as the response result for the first text content.
In addition, if the first text content is missing a component, the user is prompted to supply it; the question raised by the user is determined by combining the newly acquired voice signal with the first text content; and the answer corresponding to the question raised by the user is determined as the response result.
In this embodiment, the question bank may contain standard questions and the answer corresponding to each question. A standard question is a question that is not missing any component, for example "help me find a nearby Chinese restaurant", which includes a distance range, a location, and a search object. By contrast, a non-standard question such as "help me find a restaurant" lacks a distance range, and the type of its search object is not specific enough. In this embodiment, standard questions can be set according to the user's actual needs. For example, the questions in the question bank are treated as standard questions, while questions that are not in the question bank and that, compared with the questions in the bank, lack certain components or have an insufficiently specific component type are treated as non-standard questions.
It should be noted that, in this embodiment, when the first text content is missing a component, it may match multiple questions. Taking the first text content "help me find a restaurant" as an example, it may match the following questions in the question bank: "help me find a nearby Chinese restaurant", "help me find a nearby Western restaurant", and so on. Matching multiple questions in this way also indicates that the first text content is missing a component.
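One way to picture the question-bank lookup and the missing-component check is the sketch below. The patent does not specify how matching is performed; the word-subset rule, the bank contents, the wording of the prompts, and the handling of the no-match case are assumptions made for illustration only.

```python
# Hypothetical question bank: standard question -> answer.
QUESTION_BANK = {
    "help me find a nearby chinese restaurant": "Here are the Chinese restaurants near you.",
    "help me find a nearby western restaurant": "Here are the Western restaurants near you.",
}


def tokens(text):
    """Lowercase whitespace tokenization (a simplification)."""
    return text.lower().split()


def match_questions(first_text, bank=QUESTION_BANK):
    """Return standard questions that contain every word of the text."""
    words = set(tokens(first_text))
    return [q for q in bank if words and words <= set(tokens(q))]


def missing_components(first_text, question):
    """Words of the standard question that are absent from the text."""
    words = set(tokens(first_text))
    return [w for w in tokens(question) if w not in words]


def respond(first_text, bank=QUESTION_BANK):
    """Return ("answer", text) for a complete question, else ("prompt", text)."""
    matches = match_questions(first_text, bank)
    if len(matches) == 1 and not missing_components(first_text, matches[0]):
        return ("answer", bank[matches[0]])
    if matches:
        # Several matches, or extra words in the match, signal a missing component.
        missing = sorted({w for q in matches for w in missing_components(first_text, q)})
        return ("prompt", "Please add: " + ", ".join(missing))
    return ("prompt", "Sorry, I did not find a matching question.")
```

With this bank, "help me find a restaurant" matches both standard questions and produces a prompt, whereas the full standard question returns its stored answer.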
In this embodiment, taking the first text content "help me find a restaurant" as an example, the first text content lacks a distance range and its search object is not specific. The voice smart device can therefore prompt the user by voice to supply a distance range and a more specific search object, so that the user speaks again in response to the prompt, for example "a nearby Western restaurant". The voice smart device then acquires this new voice signal and, from the newly acquired voice signal and the first text content, determines that the question raised by the user is "help me find a nearby Western restaurant".
In this embodiment, the voice smart device may determine the question raised by the user from the newly acquired voice signal and the first text content as follows: recognize the newly acquired voice signal to obtain the corresponding second text content; determine whether the second text content is the component missing from the first text content; if it is, determine the question raised by the user by combining the second text content with the first text content; if it is not, determine the question raised by the user from the second text content alone.
For example, taking the first text content "help me find a restaurant" again, the first text content lacks a distance range and its search object is not specific, so the voice smart device can prompt the user by voice to supply a distance range and a more specific search object, and the user speaks again in response to the prompt. If the user's new voice signal is "please help me find a nearby traditional Chinese medicine hospital", the second text content corresponding to that voice signal is unrelated to the first text content. The first text content can then be discarded and the second text content taken directly as the question raised by the user.
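The branch between "supplement the first text" and "treat the follow-up as a new question" might look like the following sketch. The rule used here (the follow-up must supply every word the first text was missing from some standard question) is an assumed stand-in for whatever component analysis a real system would use, and the bank contents are hypothetical.

```python
# Hypothetical standard questions, as in a question bank.
STANDARD_QUESTIONS = [
    "help me find a nearby chinese restaurant",
    "help me find a nearby western restaurant",
]


def resolve_question(first_text, second_text, questions=STANDARD_QUESTIONS):
    """Combine the follow-up with the first text if it fills the gap, else use it alone."""
    first = set(first_text.lower().split())
    second = set(second_text.lower().split())
    for question in questions:
        q_words = set(question.split())
        missing = q_words - first
        # The follow-up is a supplement when the first text fits this question
        # and the follow-up provides all of its missing words and nothing foreign.
        if first <= q_words and missing and missing <= second <= q_words:
            return question
    # Unrelated follow-up: discard the first text and use the second as the question.
    return second_text
```

Here the follow-up "nearby western restaurant" completes the first text into a standard question, while an unrelated follow-up about a hospital is returned unchanged as the new question.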
In this embodiment, after the voice smart device determines the question raised by the user, it can infer the user's true intent from that question and obtain an answer matching that intent from a question-and-answer library or a large-scale resource library. It should be noted that the "question" here is not limited to an actual interrogative sentence; it may also be a sentence that expresses a request, for example "please provide the details of the voice smart device".
S104: when an acquired voice signal contains an ending phrase, stop the voice interaction and switch the voice smart device from the awake state to the sleep state.
An ending phrase may be a sentence or a word. In this embodiment, the voice smart device may determine whether a voice signal contains an ending phrase as follows: recognize the acquired voice signal to obtain the corresponding text content; query an ending-phrase lexicon according to the text content and determine whether the text content contains a second word or a second sentence that matches an ending phrase in the lexicon; and, if the text content contains the second word or the second sentence, determine that the voice signal contains an ending phrase. Examples of ending phrases include "OK, I'll stop chatting with you for now", "let's stop here for now", "bye-bye", and "goodbye".
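Steps S101 to S104 together amount to a two-state machine. The sketch below illustrates that flow under simplifying assumptions: utterances arrive as already-recognized text, phrase detection is a plain substring check rather than a lexicon query over split sentences and words, and the phrase lists, replies, and class name are hypothetical.

```python
from enum import Enum


class State(Enum):
    SLEEP = "sleep"
    AWAKE = "awake"


WAKE_PHRASES = ("hello assistant",)           # hypothetical wake-up lexicon
END_PHRASES = ("goodbye", "let's stop here")  # hypothetical ending-phrase lexicon


class VoiceDevice:
    def __init__(self):
        self.state = State.SLEEP

    def handle(self, text):
        """Process one recognized utterance; return the device's reply or None."""
        lowered = text.lower()
        if self.state is State.SLEEP:
            if any(phrase in lowered for phrase in WAKE_PHRASES):
                self.state = State.AWAKE
                return "I'm listening."
            return None  # speech without a wake-up phrase is ignored while asleep
        if any(phrase in lowered for phrase in END_PHRASES):
            self.state = State.SLEEP
            return "Goodbye."
        # Awake and no ending phrase: answer without requiring another wake-up phrase.
        return f"(answer to: {text})"
```

After a single wake-up phrase, every subsequent utterance is answered directly until an ending phrase returns the device to the sleep state.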
In the voice interaction method of the embodiments of the present invention, a voice signal of a user is acquired; the voice signal is recognized to determine whether it contains a wake-up phrase; if it does, the voice smart device is switched from the sleep state to the awake state so as to perform voice interaction with the user; and when an acquired voice signal contains an ending phrase, the voice interaction is stopped and the voice smart device is switched from the awake state to the sleep state. During voice interaction with the voice smart device, the user therefore only needs to say a wake-up phrase once and an ending phrase once, which keeps the whole interaction coherent, matches ordinary conversational habits, and improves both the efficiency of voice interaction and the user's experience.
Fig. 2 is a schematic structural diagram of a voice interaction device provided by an embodiment of the present invention. As shown in Fig. 2, the voice interaction device is applied to a voice smart device and mainly comprises an acquisition module 21, a recognition module 22, and a conversion processing module 23.
The acquisition module 21 is configured to acquire a voice signal of a user;
the recognition module 22 is configured to recognize the voice signal and determine whether the voice signal contains a wake-up phrase;
the conversion processing module 23 is configured to switch the voice smart device from the sleep state to the awake state when the voice signal contains a wake-up phrase, so as to perform voice interaction with the user;
the conversion processing module 23 is further configured to stop the voice interaction and switch the voice smart device from the awake state to the sleep state when an acquired voice signal contains an ending phrase.
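The division of work among modules 21 to 23 can be illustrated with three small classes. This is only a structural sketch: the class and method names are invented for illustration, recognized text stands in for audio, and whole-utterance lookup replaces the sentence and word splitting the description mentions.

```python
class AcquisitionModule:
    """Stands in for module 21: hands out voice signals one at a time."""

    def __init__(self, utterances):
        self._pending = list(utterances)

    def acquire(self):
        return self._pending.pop(0) if self._pending else None


class RecognitionModule:
    """Stands in for module 22: checks a signal against the two lexicons."""

    def __init__(self, wake_lexicon, end_lexicon):
        self.wake_lexicon = set(wake_lexicon)
        self.end_lexicon = set(end_lexicon)

    def has_wake_phrase(self, signal):
        return signal.strip().lower() in self.wake_lexicon

    def has_end_phrase(self, signal):
        return signal.strip().lower() in self.end_lexicon


class ConversionModule:
    """Stands in for module 23: switches between the sleep and awake states."""

    def __init__(self):
        self.awake = False

    def update(self, has_wake, has_end):
        if not self.awake and has_wake:
            self.awake = True
        elif self.awake and has_end:
            self.awake = False
        return self.awake


def run(utterances):
    """Feed utterances through the three modules; return the state after each one."""
    acquisition = AcquisitionModule(utterances)
    recognition = RecognitionModule({"hello assistant"}, {"goodbye"})  # hypothetical lexicons
    conversion = ConversionModule()
    states = []
    while (signal := acquisition.acquire()) is not None:
        states.append(conversion.update(recognition.has_wake_phrase(signal),
                                        recognition.has_end_phrase(signal)))
    return states
```

Calling `run(["weather", "hello assistant", "weather", "goodbye", "weather"])` yields the awake flag after each utterance, showing that only the wake-up and ending phrases change the state.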
The voice interaction device provided by the present invention may be a hardware device with a voice interaction function or software installed on a hardware device. The hardware device is specifically a voice smart device, for example a smart player, a smart water heater, a smart heater, or a smart washing machine.
In this embodiment, the recognition module 22 is specifically configured to: recognize the voice signal to obtain the corresponding text content; query a wake-up lexicon according to the text content and determine whether the text content contains a first word or a first sentence that matches a wake-up phrase in the wake-up lexicon; and, if the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up phrase.
In this embodiment, the wake-up lexicon may contain multiple wake-up phrases, and a wake-up phrase may be a wake-up sentence or a wake-up word. Correspondingly, after the voice smart device obtains the text content corresponding to the voice signal, it may split the text content into sentences, query the wake-up lexicon with the resulting sentences, and determine whether there is a first sentence that matches a wake-up phrase in the lexicon; it may also split the text content into words, query the wake-up lexicon with the resulting words, and determine whether there is a first word that matches a wake-up phrase in the lexicon.
Further, with reference to Fig. 3, on the basis of the embodiment shown in Fig. 2, the device may further comprise a query module 24, a comparison module 25, and a determination module 26;
the recognition module 22 is further configured to, when an acquired voice signal does not contain an ending phrase, recognize the acquired voice signal to obtain the corresponding first text content;
the query module 24 is configured to query a question bank according to the first text content to obtain a question that matches the first text content;
the comparison module 25 is configured to compare the first text content with the question and determine whether the first text content is missing a component;
the determination module 26 is configured to, when the first text content is not missing a component, determine the answer corresponding to the matched question as the response result for the first text content.
Further, with reference to Fig. 4, on the basis of the embodiment shown in Fig. 3, the device may further comprise a prompt module 27;
the prompt module 27 is configured to prompt the user to supply the component when the first text content is missing a component;
the determination module 26 is further configured to determine the question raised by the user by combining the newly acquired voice signal with the first text content;
the determination module 26 is further configured to determine the answer corresponding to the question raised by the user as the response result.
It may include having each typical problem and the corresponding answer of problem in the present embodiment, in problem base.Wherein, standard Problem refers to the problem of not lacking ingredient, such as " me is helped to search the Chinese-style restaurant near lower ", includes apart from model in the problem Enclose, position location and search object.Corresponding, non-standard issue for example " helps me to search lower restaurant ", lack in the problem away from From range, and the type for searching object is not fine enough.In the present embodiment, typical problem can be carried out according to the actual needs of user Setting.For example, the problems in problem base is determined as typical problem;It is lacked by non-problems library, and compared with problem in problem base Few certain ingredients or the offending problem of some component type, are determined as non-standard issue.
In the present embodiment, by taking corresponding first content of text of problem is " me is helped to search lower restaurant " as an example, first text Lack distance range in content and lookup object is not fine, then speech-sound intelligent equipment can supplement distance range with voice prompting user With fine lookup object, so that user issues voice signal according to the prompt again, the voice signal issued again is for example " attached Close restaurant which serves Western food ", speech-sound intelligent equipment can obtain voice signal again at this time, according to the voice signal got again and First content of text determines the problem of user proposes " me is helped to search the restaurant which serves Western food near lower ".
In this embodiment, the determining module 26 may determine the question raised by the user from the newly acquired voice signal and the first text content as follows: recognize the newly acquired voice signal to obtain the corresponding second text content; judge whether the second text content is the component missing from the first text content; if it is, combine the second text content with the first text content to determine the question raised by the user; if it is not, determine the question raised by the user from the second text content alone.
For example, suppose the first text content corresponding to the question is "help me find a restaurant". The first text content lacks a distance range and its search object is not specific, so the intelligent voice device can prompt the user by voice to supply the distance range and a more specific search object, and the user issues another voice signal in response to the prompt. If the voice signal issued again by the user is "please help me find a traditional Chinese medicine hospital nearby", the second text content corresponding to that voice signal is unrelated to the first text content, so the first text content can be discarded and the second text content is directly determined to be the question raised by the user.
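The merge-or-replace decision described above can be sketched in a few lines. The specification does not say how the device decides whether the second text content is the missing component, so the rule used here is an assumption: a reply counts as a supplement only if it is a fragment with no request of its own (no `"intent"` key) and it fills at least one gap. The function name `resolve_question` and the dictionary layout are hypothetical.

```python
def resolve_question(first, second, gaps):
    """Return the components of the question the user is taken to be asking.

    first  -- components parsed from the first text content
    second -- components parsed from the second text content
    gaps   -- names of the components the first text content lacks
    """
    # Assumed rule: a supplement is a fragment that fills at least one gap.
    is_fragment = "intent" not in second
    fills_gap = any(name in second for name in gaps)
    if is_fragment and fills_gap:
        merged = dict(first)
        merged.update(second)  # the supplement overrides generic values
        return merged
    # Otherwise the first text content is discarded.
    return dict(second)


first = {"intent": "find", "search_object": "restaurant"}
gaps = ["distance_range", "search_object"]

# "a nearby Western restaurant": a fragment that fills both gaps.
supplement = {"distance_range": "nearby", "search_object": "Western restaurant"}
# "please help me find a traditional Chinese medicine hospital nearby": a new question.
unrelated = {"intent": "find", "distance_range": "nearby",
             "search_object": "traditional Chinese medicine hospital"}
```

With these inputs, the supplement is merged into the first question, while the unrelated reply replaces it entirely, which mirrors the two branches in the text.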
In this embodiment, after the intelligent voice device determines the question raised by the user, it can infer the user's true intention from that question and obtain an answer matching that intention from the question-and-answer base or from a massive resource library. It should be noted that a question here is not limited to a literal interrogative sentence; it may also be a sentence expressing a demand, for example "please provide the details of the intelligent voice device".
The voice interaction device of the embodiment of the present invention acquires the voice signal of the user; recognizes the voice signal and judges whether it contains a wake-up word; if the voice signal contains a wake-up word, switches the intelligent voice device from the dormant state to the wake-up state in order to carry out voice interaction with the user; and, when an acquired voice signal contains a closing phrase, stops the voice interaction and switches the intelligent voice device from the wake-up state back to the dormant state. As a result, while interacting by voice with the intelligent voice device, the user needs to say the wake-up word only once and the closing phrase only once, which keeps the whole voice interaction coherent, matches ordinary conversational habits, improves the efficiency of voice interaction and improves the user's voice interaction experience.
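The wake-up and closing behaviour summarized above amounts to a two-state machine. The following is a minimal sketch, assuming speech recognition has already produced text and that matching is a simple case-insensitive substring test against two small phrase sets; the class name `VoiceDevice`, the method `handle` and the returned labels are hypothetical.

```python
from enum import Enum


class State(Enum):
    DORMANT = "dormant"
    AWAKE = "awake"


class VoiceDevice:
    """Two-state sketch: wake once, answer many times, close once."""

    def __init__(self, wake_words, closing_phrases):
        self.wake_words = {w.lower() for w in wake_words}
        self.closing_phrases = {p.lower() for p in closing_phrases}
        self.state = State.DORMANT

    def handle(self, text):
        """Process one recognized utterance and return an action label."""
        lowered = text.lower()
        if self.state is State.DORMANT:
            # While dormant, only a wake-up word has any effect.
            if any(w in lowered for w in self.wake_words):
                self.state = State.AWAKE
                return "woken"
            return "ignored"
        # While awake, a closing phrase ends the session.
        if any(p in lowered for p in self.closing_phrases):
            self.state = State.DORMANT
            return "sleeping"
        # Any other utterance is treated as a question to answer.
        return "answer"


device = VoiceDevice(["hello device"], ["goodbye"])
actions = [device.handle(t) for t in (
    "help me find a restaurant",   # spoken before waking
    "Hello device",
    "help me find a restaurant",
    "a nearby Western restaurant",
    "Goodbye",
)]
```

Note that consecutive questions are answered without repeating the wake-up word, which is the point of keeping the device awake until a closing phrase is heard.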
Fig. 5 is a schematic structural diagram of another voice interaction device provided by an embodiment of the present invention. The voice interaction device includes:
A memory 1001, a processor 1002, and a computer program stored in the memory 1001 and executable on the processor 1002.
The processor 1002 implements the voice interaction method provided in the above embodiments when executing the program.
Further, the voice interaction device also includes:
A communication interface 1003, used for communication between the memory 1001 and the processor 1002.
The memory 1001, used to store the computer program executable on the processor 1002.
The memory 1001 may include high-speed RAM, and may also include non-volatile memory, for example at least one magnetic disk memory.
The processor 1002, used to implement the voice interaction method described in the above embodiments when executing the program.
If the memory 1001, the processor 1002 and the communication interface 1003 are implemented independently, they can be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, the bus is shown as a single thick line in Fig. 5, but this does not mean that there is only one bus or only one type of bus.
Optionally, in a specific implementation, if the memory 1001, the processor 1002 and the communication interface 1003 are integrated on a single chip, they can communicate with one another through an internal interface.
The processor 1002 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored; the program implements the voice interaction method described above when executed by a processor.
The present invention also provides a computer program product; when instructions in the computer program product are executed by a processor, the voice interaction method described above is implemented.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method described in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, which may for example be regarded as an ordered list of executable instructions for implementing logic functions, may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transport a program for use by, or in combination with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or, if necessary, otherwise suitably processing it, and can then be stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc or the like. Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.

Claims (13)

1. A voice interaction method, applied to an intelligent voice device, characterized by comprising:
Acquiring a voice signal of a user;
Recognizing the voice signal and judging whether the voice signal contains a wake-up word;
If the voice signal contains a wake-up word, switching the intelligent voice device from a dormant state to a wake-up state in order to carry out voice interaction with the user;
When an acquired voice signal contains a closing phrase, stopping the voice interaction and switching the intelligent voice device from the wake-up state to the dormant state.
2. The method according to claim 1, characterized in that recognizing the voice signal and judging whether the voice signal contains a wake-up word comprises:
Recognizing the voice signal to obtain text content corresponding to the voice signal;
Querying a wake-up dictionary according to the text content, and judging whether the text content contains a first word or first sentence that matches a wake-up word in the wake-up dictionary;
If the text content contains the first word or the first sentence, determining that the voice signal contains a wake-up word.
3. The method according to claim 1, characterized in that, after switching the intelligent voice device from the dormant state to the wake-up state, the method further comprises:
When an acquired voice signal does not contain a closing phrase, recognizing the acquired voice signal to obtain corresponding first text content;
Querying a question base according to the first text content to obtain a question that matches the first text content;
Comparing the first text content with the question, and judging whether the first text content lacks a component;
If the first text content does not lack a component, determining the answer corresponding to the matched question as the response result corresponding to the first text content.
4. The method according to claim 3, characterized in that, after comparing the first text content with the question and judging whether the first text content lacks a component, the method further comprises:
If the first text content lacks a component, prompting the user to supply the component;
Determining the question raised by the user by combining the newly acquired voice signal with the first text content;
Determining the answer corresponding to the question raised by the user as the response result.
5. The method according to claim 4, characterized in that determining the question raised by the user by combining the newly acquired voice signal with the first text content comprises:
Recognizing the newly acquired voice signal to obtain corresponding second text content;
Judging whether the second text content is the component missing from the first text content;
If the second text content is the component missing from the first text content, determining the question raised by the user by combining the second text content with the first text content;
If the second text content is not the component missing from the first text content, determining the question raised by the user according to the second text content.
6. A voice interaction device, applied to an intelligent voice device, characterized by comprising:
An acquisition module, configured to acquire a voice signal of a user;
A recognition module, configured to recognize the voice signal and judge whether the voice signal contains a wake-up word;
A conversion processing module, configured to switch the intelligent voice device from a dormant state to a wake-up state when the voice signal contains a wake-up word, in order to carry out voice interaction with the user;
The conversion processing module is further configured to stop the voice interaction and switch the intelligent voice device from the wake-up state to the dormant state when an acquired voice signal contains a closing phrase.
7. The device according to claim 6, characterized in that the recognition module is specifically configured to:
Recognize the voice signal to obtain text content corresponding to the voice signal;
Query a wake-up dictionary according to the text content, and judge whether the text content contains a first word or first sentence that matches a wake-up word in the wake-up dictionary;
If the text content contains the first word or the first sentence, determine that the voice signal contains a wake-up word.
8. The device according to claim 6, characterized by further comprising: a query module, a comparison module and a determining module;
The recognition module is further configured to recognize an acquired voice signal to obtain corresponding first text content when the acquired voice signal does not contain a closing phrase;
The query module is configured to query a question base according to the first text content to obtain a question that matches the first text content;
The comparison module is configured to compare the first text content with the question and judge whether the first text content lacks a component;
The determining module is configured to determine the answer corresponding to the matched question as the response result corresponding to the first text content when the first text content does not lack a component.
9. The device according to claim 8, characterized by further comprising: a prompt module;
The prompt module is configured to prompt the user to supply the component when the first text content lacks a component;
The determining module is further configured to determine the question raised by the user by combining the newly acquired voice signal with the first text content;
The determining module is further configured to determine the answer corresponding to the question raised by the user as the response result.
10. The device according to claim 9, characterized in that the determining module is specifically configured to:
Recognize the newly acquired voice signal to obtain corresponding second text content;
Judge whether the second text content is the component missing from the first text content;
If the second text content is the component missing from the first text content, determine the question raised by the user by combining the second text content with the first text content;
If the second text content is not the component missing from the first text content, determine the question raised by the user according to the second text content.
11. A voice interaction device, characterized by comprising:
A memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the voice interaction method according to any one of claims 1 to 5 when executing the program.
12. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the program implements the voice interaction method according to any one of claims 1 to 5 when executed by a processor.
13. A computer program product, characterized in that the voice interaction method according to any one of claims 1 to 5 is implemented when instructions in the computer program product are executed by a processor.
CN201811508104.8A 2018-12-11 2018-12-11 Voice interactive method and device Pending CN109461448A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811508104.8A CN109461448A (en) 2018-12-11 2018-12-11 Voice interactive method and device

Publications (1)

Publication Number Publication Date
CN109461448A true CN109461448A (en) 2019-03-12

Family

ID=65612904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811508104.8A Pending CN109461448A (en) 2018-12-11 2018-12-11 Voice interactive method and device

Country Status (1)

Country Link
CN (1) CN109461448A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075435A (en) * 2007-04-19 2007-11-21 深圳先进技术研究院 Intelligent chatting system and its realizing method
US20180082684A1 (en) * 2012-10-30 2018-03-22 Google Technology Holdings LLC Voice Control User Interface with Progressive Command Engagement
CN103744836A (en) * 2014-01-08 2014-04-23 苏州思必驰信息科技有限公司 Man-machine conversation method and device
CN105975511A (en) * 2016-04-27 2016-09-28 乐视控股(北京)有限公司 Intelligent dialogue method and apparatus
CN108109618A (en) * 2016-11-25 2018-06-01 宇龙计算机通信科技(深圳)有限公司 voice interactive method, system and terminal device
CN106653021A (en) * 2016-12-27 2017-05-10 上海智臻智能网络科技股份有限公司 Voice wake-up control method and device and terminal
CN107193978A (en) * 2017-05-26 2017-09-22 武汉泰迪智慧科技有限公司 A kind of many wheel automatic chatting dialogue methods and system based on deep learning
CN108766422A (en) * 2018-04-02 2018-11-06 青岛海尔科技有限公司 Response method, device, storage medium and the computer equipment of speech ciphering equipment
CN108962217A (en) * 2018-07-28 2018-12-07 华为技术有限公司 Phoneme synthesizing method and relevant device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110364143A (en) * 2019-08-14 2019-10-22 腾讯科技(深圳)有限公司 Voice awakening method, device and its intelligent electronic device
CN110364143B (en) * 2019-08-14 2022-01-28 腾讯科技(深圳)有限公司 Voice awakening method and device and intelligent electronic equipment
CN110600009A (en) * 2019-09-26 2019-12-20 柯利达信息技术有限公司 Intelligent voice interaction operation platform and interaction method
CN111009245A (en) * 2019-12-18 2020-04-14 腾讯科技(深圳)有限公司 Instruction execution method, system and storage medium
CN111009245B (en) * 2019-12-18 2021-09-14 腾讯科技(深圳)有限公司 Instruction execution method, system and storage medium
CN112739507A (en) * 2020-04-22 2021-04-30 南京阿凡达机器人科技有限公司 Interactive communication implementation method, equipment and storage medium
CN112739507B (en) * 2020-04-22 2023-05-09 南京阿凡达机器人科技有限公司 Interactive communication realization method, device and storage medium

Similar Documents

Publication Publication Date Title
CN109461448A (en) Voice interactive method and device
CN105741838B (en) Voice awakening method and device
CN107564518A (en) Smart machine control method, device and computer equipment
CN108228764A (en) A kind of single-wheel dialogue and the fusion method of more wheel dialogues
CN107977183A (en) voice interactive method, device and equipment
CN107704275A (en) Smart machine awakening method, device, server and smart machine
CN109448725A (en) A kind of interactive voice equipment awakening method, device, equipment and storage medium
CN109036393A (en) Wake-up word training method, device and the household appliance of household appliance
CN108154140A (en) Voice awakening method, device, equipment and computer-readable medium based on lip reading
WO2005009205A3 (en) System and method for self management of health using natural language interface
CN110298770A (en) A kind of recipe recommendation system
US20200265843A1 (en) Speech broadcast method, device and terminal
CN106504768A (en) Phone testing audio frequency classification method and device based on artificial intelligence
JP7158217B2 (en) Speech recognition method, device and server
CN107526826A (en) Phonetic search processing method, device and server
CN108091324A (en) Tone recognition methods, device, electronic equipment and computer readable storage medium
CN109754788A (en) A kind of sound control method, device, equipment and storage medium
CN108932944A (en) Coding/decoding method and device
CN109686368A (en) Voice wakes up response process method and device, electronic equipment and storage medium
CN105912111A (en) Method for ending voice conversation in man-machine interaction and voice recognition device
CN108304561B (en) A kind of semantic understanding method, equipment and robot based on finite data
CN111192082B (en) Product selling point analysis method, terminal equipment and computer readable storage medium
CN117253478A (en) Voice interaction method and related device
CN112242135A (en) Voice data processing method and intelligent customer service device
CN108962235A (en) Voice interactive method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190312
