CN108962235B - Voice interaction method and device

Info

Publication number: CN108962235B
Application number: CN201711446766.2A
Authority: CN
Inventors: 高慧湍; 韩伟; 李茂全; 李宝祥; 修铭徽
Original assignee: Beijing Orion Star Technology Co Ltd
Current assignee: Beijing Orion Star Technology Co Ltd
Priority date: 2017-12-27
Filing date: 2017-12-27
Publication date: 2021-09-17
Anticipated expiration: 2037-12-27
Also published as: CN108962235A

Abstract

The invention provides a voice interaction method and a voice interaction device. The method comprises: receiving a first content acquisition instruction, and acquiring content according to the first content acquisition instruction; if a second content acquisition instruction is received within a preset time period, judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or to related skill fields; and if they do, acquiring content according to the second content acquisition instruction, so that a user can express an intention through multiple content acquisition instructions. Because noise and other sounds in the surrounding environment are generally unrelated to the skill field of the user's content acquisition instruction, the voice device is prevented from mistakenly executing voice instructions from the surrounding environment, which improves voice interaction efficiency and the user's experience of using the voice device.

Description

Voice interaction method and device

Technical Field

The invention relates to the technical field of voice equipment, in particular to a voice interaction method and device.

Background

Current voice interaction methods fall mainly into two schemes. In the first, only one voice instruction is executed after each wake-up. In the second, any voice instruction received within a certain time period after each wake-up is executed. The first scheme requires the user to wake up the voice device frequently; in particular, when the user cannot express an intention through a single voice instruction, effective interaction between the user and the voice device is difficult to achieve. In the second scheme, because the voice device is generally used in an open environment with considerable noise and background sound, the voice device can easily execute a voice instruction from the surrounding environment by mistake. Effective interaction between the user and the voice device is therefore also difficult to achieve, and both the voice interaction efficiency and the user's experience of using the voice device are reduced.

Disclosure of Invention

The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.

Therefore, a first objective of the present invention is to provide a voice interaction method, which is used for solving the problems that the voice interaction efficiency is poor and the experience of the user using the voice device is affected in the prior art.

The second objective of the present invention is to provide a voice interaction device.

A third object of the invention is to propose an electronic device.

A fourth object of the invention is to propose a non-transitory computer-readable storage medium.

A fifth object of the invention is to propose a computer program product.

In order to achieve the above object, an embodiment of a first aspect of the present invention provides a voice interaction method, including:

receiving a first content acquisition instruction, and acquiring content according to the first content acquisition instruction;

if a second content acquisition instruction is received within a preset time period, judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field;

and if the second content acquisition instruction and the first content acquisition instruction are determined to belong to the same skill field or the related skill field, acquiring content according to the second content acquisition instruction.

Further, the method further comprises the following steps:

and if the second content acquisition instruction and the first content acquisition instruction are determined not to belong to the same skill field or the related skill field, not responding to the second content acquisition instruction.

Further, if it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, acquiring content according to the second content acquisition instruction specifically includes:

acquiring content according to the analysis result of the second content acquisition instruction and in combination with the analysis result of the first content acquisition instruction;

if it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field, acquiring content according to the second content acquisition instruction, specifically including:

and acquiring the content according to the analysis result of the second content acquisition instruction.

Further, the preset time period is determined according to the skill field to which the first content acquisition instruction belongs.

Further, determining whether the second content obtaining instruction and the first content obtaining instruction belong to the same skill field or a related skill field specifically includes:

determining a first skill field to which the first content acquisition instruction belongs and a second skill field to which the second content acquisition instruction belongs according to an instruction analysis result;

if the first skill field is the same as the second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field;

if the first skill field is different from the second skill field, inquiring a preset related field mapping rule, and determining a preset related skill field corresponding to the first skill field;

and if the preset related skill field comprises the second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field.

Further, before determining whether the second content obtaining instruction and the first content obtaining instruction belong to the same skill field or a related skill field, the method further includes:

determining that the second content acquisition instruction is not a wake-up instruction.

Further, the method further comprises the following step: responding to the wake-up instruction if the second content acquisition instruction is a wake-up instruction.

The voice interaction method provided by the embodiment receives a first content acquisition instruction, and acquires content according to the first content acquisition instruction; if a second content acquisition instruction is received within a preset time period, it judges whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or to related skill fields; and if they do, it acquires content according to the second content acquisition instruction, so that a user can express an intention through multiple content acquisition instructions. Because noise and other sounds in the surrounding environment are generally unrelated to the skill field of the user's content acquisition instruction, the embodiment prevents the voice device from mistakenly executing voice instructions from the surrounding environment, thereby improving voice interaction efficiency and the user's experience of using the voice device.

In order to achieve the above object, a second aspect of the present invention provides a voice interaction apparatus, including:

the acquisition module is used for receiving a first content acquisition instruction and acquiring content according to the first content acquisition instruction;

the judging module is used for judging whether a second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field when the second content acquisition instruction is received within a preset time period;

the acquisition module is further configured to acquire content according to the second content acquisition instruction when it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field.

Further, the device further comprises:

and the processing module is used for not responding to the second content acquisition instruction when the second content acquisition instruction and the first content acquisition instruction are determined not to belong to the same skill field or the related skill field.

Further, the acquisition module is specifically configured to,

when the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, acquiring content according to the analysis result of the second content acquisition instruction and in combination with the analysis result of the first content acquisition instruction;

and when the second content acquisition instruction and the first content acquisition instruction are determined to belong to the related skill field, acquiring content according to the analysis result of the second content acquisition instruction.

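Further, the judging module is specifically configured to: determine, according to an instruction analysis result, a first skill field to which the first content acquisition instruction belongs and a second skill field to which the second content acquisition instruction belongs; if the first skill field is the same as the second skill field, determine that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field; if the first skill field is different from the second skill field, query a preset related field mapping rule and determine a preset related skill field corresponding to the first skill field; and if the preset related skill field comprises the second skill field, determine that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field.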

Further, the judging module is further configured to determine that the second content acquisition instruction is not a wake-up instruction before judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field.

Further, the device further comprises:

and the response module is used for responding to the wake-up instruction when the second content acquisition instruction is a wake-up instruction.

The voice interaction device provided by the embodiment receives a first content acquisition instruction, and acquires content according to the first content acquisition instruction; if a second content acquisition instruction is received within a preset time period, it judges whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or to related skill fields; and if they do, it acquires content according to the second content acquisition instruction, so that a user can express an intention through multiple content acquisition instructions. Because noise and other sounds in the surrounding environment are generally unrelated to the skill field of the user's content acquisition instruction, the embodiment prevents the voice device from mistakenly executing voice instructions from the surrounding environment, thereby improving voice interaction efficiency and the user's experience of using the voice device.

To achieve the above object, a third aspect of the present invention provides an electronic device, including: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of voice interaction as described above when executing the program.

In order to achieve the above object, a fourth aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the voice interaction method as described above.

In order to achieve the above object, an embodiment of a fifth aspect of the present invention provides a computer program product, wherein the voice interaction method as described above is implemented when instructions in the computer program product are executed by a processor.

Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

Fig. 1 is a schematic flow chart of a voice interaction method according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a voice interaction apparatus according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.

The following describes a voice interaction method and apparatus according to an embodiment of the present invention with reference to the drawings.

Fig. 1 is a schematic flow chart of a voice interaction method according to an embodiment of the present invention. As shown in fig. 1, the voice interaction method includes the following steps:

S101, receiving a first content acquisition instruction, and acquiring content according to the first content acquisition instruction.

The execution main body of the voice interaction method provided by the invention is a voice interaction device, and the voice interaction device can be a background server corresponding to the voice equipment or the voice equipment. The voice device may be, for example, a smart sound box, a smart air conditioner, a smart washing machine, a smart television, or the like, which may perform voice interaction with a user and perform corresponding operations according to an instruction of the user.

In this embodiment, in the case that the voice interaction device is a background server corresponding to the voice device, the first content acquisition instruction may be acquired in a manner that, in the process of interacting between the voice device and the user, the voice instruction of the user is obtained by monitoring and then is directly sent to the background server. After the background server acquires the first content acquisition instruction, the background server can perform voice recognition on the first content acquisition instruction, acquire an analysis result of the first content acquisition instruction, and acquire content according to the analysis result of the first content acquisition instruction.

In this embodiment, when the voice interaction device is the voice device itself, the first content acquisition instruction may be a voice instruction of the user that the voice device captures by listening during its interaction with the user. After acquiring the first content acquisition instruction, the voice interaction device can perform voice recognition on it to obtain an analysis result of the first content acquisition instruction, and acquire content according to that analysis result.

In this embodiment, the content may be the result of responding to the first content acquisition instruction. For example, when the first content acquisition instruction is "I want to listen to Forgetting Water", the corresponding content may be "the Passerby A version of Forgetting Water"; for another example, when the first content acquisition instruction is "I want to listen to Logical Thinking", the corresponding content may be "episode 12 of Logical Thinking"; and when the first content acquisition instruction is "check the weather", the corresponding content may be "it is raining".

S102, if a second content acquisition instruction is received within a preset time period, judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field.

The preset time period is determined according to the skill field to which the first content acquisition instruction belongs. In this embodiment, before step S102 and after receiving the first content acquisition instruction, the voice interaction device may determine, according to the analysis result of the first content acquisition instruction, the first skill field to which the first content acquisition instruction belongs, determine the preset time period corresponding to the first skill field, start timing, and judge whether a second content acquisition instruction is received within the preset time period; if no second content acquisition instruction is received within the preset time period, the voice interaction ends.
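The per-field time window described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names, the window lengths in seconds, the fallback value, and the function names `window_for` and `within_window` are all assumptions made for the example, since the embodiment leaves the actual values open.

```python
from typing import Dict

# Hypothetical preset time periods (seconds) per skill field; values are assumptions.
FOLLOW_UP_WINDOW_S: Dict[str, float] = {
    "music": 10.0,
    "audio_program": 10.0,
    "weather": 5.0,
}
DEFAULT_WINDOW_S = 5.0  # assumed fallback for fields without an entry


def window_for(skill_field: str) -> float:
    """Return the preset time period for the first instruction's skill field."""
    return FOLLOW_UP_WINDOW_S.get(skill_field, DEFAULT_WINDOW_S)


def within_window(first_field: str, first_time: float, second_time: float) -> bool:
    """True if the second instruction arrives before the window for first_field ends."""
    elapsed = second_time - first_time
    return 0.0 <= elapsed <= window_for(first_field)
```

With these assumed values, a follow-up arriving 4 seconds after a weather query falls inside the window, while one arriving 6 seconds later does not.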

In this embodiment, in the case that the voice interaction apparatus is a background server corresponding to the voice device, after the voice interaction is finished, the voice interaction apparatus may send a stop interaction instruction to the voice device, so that the voice device does not receive the voice instruction any more until the voice device receives a wake-up instruction of a user, and after the wake-up operation is performed, the voice instruction is received again and sent to the voice interaction apparatus. Under the condition that the voice interaction device is a voice device, after the voice interaction is finished, the voice interaction device does not receive the voice instruction until the awakening instruction of the user is received, and after the awakening operation is carried out, the voice instruction of the user is received again.

In this embodiment, when the voice interaction device is a background server corresponding to the voice device, the first way for determining, by the voice interaction device, the first skill field to which the first content acquisition instruction belongs according to the analysis result of the first content acquisition instruction may be: inputting the analysis result of the first content acquisition instruction into a preset skill field model to obtain the probability that the analysis result belongs to each skill field; and determining the first skill field to which the first content acquisition instruction belongs according to the probability that the analysis result belongs to each skill field. The preset skill field model can be a skill field model obtained by training a large number of sentences or words corresponding to each skill field.
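As a rough sketch of the last step of this approach, the snippet below selects a skill field from the per-field probabilities that a trained model would output. The embodiment does not say how the probabilities are turned into a decision; choosing the highest-probability field is an assumption, and `pick_skill_field` is a hypothetical name. The model itself is not shown.

```python
from typing import Dict


def pick_skill_field(probabilities: Dict[str, float]) -> str:
    """Return the skill field with the highest probability (assumed decision rule)."""
    if not probabilities:
        raise ValueError("no skill field probabilities supplied")
    # max() over the keys, ranked by their probability values
    return max(probabilities, key=lambda field: probabilities[field])
```

For example, given probabilities of 0.7 for "music", 0.2 for "weather" and 0.1 for "taxi", the function selects "music".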

In a case that the voice interaction device is a background server corresponding to the voice device, another way for determining, by the voice interaction device, the first skill field to which the first content acquisition instruction belongs according to the analysis result of the first content acquisition instruction may be: performing word segmentation on the analysis result of the first content acquisition instruction to acquire a word segmentation result; comparing each word in the word segmentation result with the words in each skill field, and determining the number of the words belonging to each skill field in the word segmentation result; and determining a first skill field to which the first content acquisition instruction belongs according to the number of words belonging to each skill field in the word segmentation result.
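The word-counting approach above might look like the following sketch. It assumes the analysis result has already been segmented into words (real Chinese word segmentation is outside the scope of the example), and the vocabularies in `FIELD_VOCABULARY` are invented for illustration. Picking the field with the most matching words is one plausible reading of "according to the number of words".

```python
from collections import Counter
from typing import Dict, List, Optional, Set

# Hypothetical per-field vocabularies; a real system would use much larger word lists.
FIELD_VOCABULARY: Dict[str, Set[str]] = {
    "music": {"listen", "song", "play", "singer"},
    "weather": {"weather", "rain", "temperature"},
}


def count_field_words(words: List[str], vocabulary: Dict[str, Set[str]]) -> Counter:
    """Count how many segmented words fall into each skill field's vocabulary."""
    counts: Counter = Counter()
    for word in words:
        for field, field_words in vocabulary.items():
            if word in field_words:
                counts[field] += 1
    return counts


def field_by_word_count(
    words: List[str], vocabulary: Dict[str, Set[str]] = FIELD_VOCABULARY
) -> Optional[str]:
    """Return the field with the most matching words, or None if nothing matches."""
    counts = count_field_words(words, vocabulary)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

How ties between fields are resolved is not addressed by the embodiment; a production system would need an explicit rule for that case.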

Of course, the first skill area to which the first content acquisition instruction belongs may be determined in other ways, which are not illustrated here.

The implementation of determining the second skill area to which the second content acquisition instruction belongs may be the same as the implementation of determining the first skill area to which the first content acquisition instruction belongs, and will not be described in detail here.

In this embodiment, the preset time period corresponding to each skill field may be set according to actual needs, and is not specifically limited herein.

If the second content acquisition instruction is received within the preset time period, a first way for the voice interaction device to judge whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field may be as follows: determine, according to the instruction analysis results, the first skill field to which the first content acquisition instruction belongs and the second skill field to which the second content acquisition instruction belongs; if the first skill field is the same as the second skill field, determine that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field; if the first skill field is different from the second skill field, query a preset related field mapping rule and determine the preset related skill field corresponding to the first skill field; if the preset related skill field includes the second skill field, determine that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field; and if the preset related skill field does not include the second skill field, determine that the second content acquisition instruction and the first content acquisition instruction do not belong to the same skill field or the related skill field.

The preset related field mapping rule stores, for each skill field, the preset related skill fields corresponding to that skill field.
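The first way of judging the relationship, together with the mapping rule, can be sketched as below. The `Relation` enum, the `classify_relation` function and the contents of `RELATED_FIELDS` are illustrative assumptions; only the weather-to-taxi pairing is suggested by the examples in this description.

```python
from enum import Enum
from typing import Dict, Set


class Relation(Enum):
    SAME = "same"
    RELATED = "related"
    UNRELATED = "unrelated"


# Hypothetical related field mapping rule: skill field -> its preset related skill fields.
RELATED_FIELDS: Dict[str, Set[str]] = {
    "weather": {"taxi"},
    "music": {"audio_program"},
}


def classify_relation(
    first_field: str, second_field: str, mapping: Dict[str, Set[str]] = RELATED_FIELDS
) -> Relation:
    """Check for the same field first, then look the second field up in the mapping rule."""
    if first_field == second_field:
        return Relation.SAME
    if second_field in mapping.get(first_field, set()):
        return Relation.RELATED
    return Relation.UNRELATED
```

Note that the mapping in this sketch is directional: "taxi" is related to "weather" only when "weather" is the first skill field, which mirrors the lookup by first skill field described above.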

If the second content acquisition instruction is received within the preset time period, a second way for the voice interaction device to judge whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field may be as follows: determine, according to the instruction analysis results, the first skill field to which the first content acquisition instruction belongs and the second skill field to which the second content acquisition instruction belongs; query the preset related field mapping rule and determine the preset related skill field corresponding to the first skill field; if the preset related skill field includes the second skill field, determine that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field; if the preset related skill field does not include the second skill field, judge whether the first skill field is the same as the second skill field; if the two are the same, determine that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field; and if the two are different, determine that the second content acquisition instruction and the first content acquisition instruction do not belong to the same skill field or the related skill field.

Further, on the basis of the above embodiment, before judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field, the voice interaction device may first judge whether the second content acquisition instruction is a wake-up instruction; if the second content acquisition instruction is not a wake-up instruction, it judges whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or the related skill field; and if the second content acquisition instruction is a wake-up instruction, it responds to the wake-up instruction.
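The ordering of checks in this paragraph (wake-up test first, skill field test second) can be illustrated with a small dispatcher. The wake phrase, the function name `decide_action` and the string return values are assumptions for the sketch; the skill fields are taken as already determined.

```python
from typing import Dict, Set

WAKE_WORDS: Set[str] = {"hello speaker"}  # hypothetical wake phrase


def decide_action(
    text: str, first_field: str, second_field: str, related: Dict[str, Set[str]]
) -> str:
    """Return 'wake', 'respond' or 'ignore' for an instruction heard inside the window."""
    # A wake-up instruction is handled before any skill field comparison.
    if text.strip().lower() in WAKE_WORDS:
        return "wake"
    # Same skill field, or listed as related to the first instruction's field.
    if second_field == first_field or second_field in related.get(first_field, set()):
        return "respond"
    # Unrelated speech, for example background conversation, is not answered.
    return "ignore"
```

Checking for the wake-up instruction first means the user can always restart the interaction explicitly, even when the utterance would otherwise be filtered out as unrelated.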

S103, if the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field, acquiring content according to the second content acquisition instruction.

In this embodiment, if it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, the voice interaction device may acquire the content according to the analysis result of the second content acquisition instruction in combination with the analysis result of the first content acquisition instruction. If it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field, the voice interaction device may acquire the content according to the analysis result of the second content acquisition instruction alone.

For example, when the first content acquisition instruction is "I want to listen to Forgetting Water" and the second content acquisition instruction is "I want to listen to Liu Dehua", the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, and the corresponding content may be "Forgetting Water by Liu Dehua". When the first content acquisition instruction is "I want to listen to Logical Thinking" and the second content acquisition instruction is "episode 9", the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, and the corresponding content may be "episode 9 of Logical Thinking".

For another example, when the first content acquisition instruction is "check the weather" and the second content acquisition instruction is "call me a car, I need to take a taxi to the company", the second content acquisition instruction and the first content acquisition instruction belong to the related skill field, and the corresponding content may be "start the ride-hailing function", for example launching the ride-hailing software according to the current location, automatically entering the company address, and booking the trip.
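The difference between the two cases in step S103 can be sketched as follows, assuming each analysis result is represented as a dictionary of slots. That representation, the slot names, and the function `build_query` are assumptions for illustration; the description does not specify how analysis results are structured or combined.

```python
from typing import Dict


def build_query(
    first_parse: Dict[str, str], second_parse: Dict[str, str], same_field: bool
) -> Dict[str, str]:
    """Combine analysis results for the same field; use only the second for a related field."""
    if same_field:
        merged = dict(first_parse)
        merged.update(second_parse)  # slots from the newer instruction take precedence
        return merged
    return dict(second_parse)


# Same skill field: the program name from the first instruction is kept.
same = build_query({"program": "Logical Thinking"}, {"episode": "9"}, same_field=True)

# Related skill field: the weather query contributes nothing to the taxi request.
related = build_query({"query": "weather"}, {"destination": "company"}, same_field=False)
```

Letting the second instruction's slots override the first is one possible merge policy; it would also cover a user correcting an earlier value.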

In addition, it should be noted that the method further includes: if the second content acquisition instruction and the first content acquisition instruction are determined not to belong to the same skill field or the related skill field, the voice interaction device does not respond to the second content acquisition instruction, continues timing, and judges whether a third content acquisition instruction is received before a preset time period is reached; if the third content acquisition instruction is not received, the voice interaction is finished.
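The behaviour in the paragraph above, where unrelated instructions are skipped while timing continues, can be simulated over a list of timestamped instructions. The sketch assumes that the window is measured from the first instruction and is not restarted by later ones, and that the comparison is always made against the first instruction's skill field; the description does not settle either point. `run_session` and its parameters are hypothetical names.

```python
from typing import Dict, List, Set, Tuple


def run_session(
    first_field: str,
    first_time: float,
    window_s: float,
    later: List[Tuple[float, str]],
    related: Dict[str, Set[str]],
) -> List[str]:
    """Return the skill fields of follow-up instructions that would be answered.

    `later` holds (timestamp, skill_field) pairs sorted by timestamp.
    """
    deadline = first_time + window_s
    answered: List[str] = []
    for timestamp, field in later:
        if timestamp > deadline:
            break  # preset time period reached: the voice interaction ends
        if field == first_field or field in related.get(first_field, set()):
            answered.append(field)
        # Otherwise no response is given and timing simply continues.
    return answered
```

In a simulated session that starts with a weather query at time 0 and a 5 second window, an unrelated instruction at 1 second is ignored, a taxi request at 3 seconds is answered, and a taxi request at 6 seconds arrives too late.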

Fig. 2 is a schematic structural diagram of a voice interaction apparatus according to an embodiment of the present invention. As shown in Fig. 2, the apparatus includes an acquisition module 21 and a judging module 22.

The acquisition module 21 is configured to receive a first content acquisition instruction, and acquire content according to the first content acquisition instruction;

the judging module 22 is configured to, when a second content acquisition instruction is received within a preset time period, judge whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field;

the acquisition module 21 is further configured to acquire content according to the second content acquisition instruction when it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field.

The voice interaction device provided by the invention can be specifically a voice device or a background server corresponding to the voice device. The voice device may be, for example, a smart sound box, a smart air conditioner, a smart washing machine, a smart television, or the like, which may perform voice interaction with a user and perform corresponding operations according to an instruction of the user.

In the case that the voice interaction device is a background server corresponding to the voice device, the first content acquisition instruction may be acquired in a manner that, in the process of interaction between the voice device and the user, the voice instruction of the user is monitored and acquired and then directly sent to the background server. After the background server acquires the first content acquisition instruction, the background server can perform voice recognition on the first content acquisition instruction, acquire an analysis result of the first content acquisition instruction, and acquire content according to the analysis result of the first content acquisition instruction.

Under the condition that the voice interaction device is a voice device, after the voice interaction is finished, the voice interaction device does not receive the voice instruction until the awakening instruction of the user is received, and after the awakening operation is carried out, the voice instruction of the user is received again. Under the condition that the voice interaction device is a background server corresponding to the voice equipment, after the voice interaction is finished, the voice interaction device can send an interaction stopping instruction to the voice equipment, so that the voice equipment does not receive the voice instruction any more until the voice equipment receives a wake-up instruction of a user, and after the wake-up operation is carried out, the voice instruction is received again and sent to the voice interaction device.

Further, the determining module 22 may be specifically configured to determine, according to the instruction parsing result, a first skill field to which the first content obtaining instruction belongs and a second skill field to which the second content obtaining instruction belongs; if the first skill field is the same as the second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field; if the first skill field is different from the second skill field, inquiring a preset related field mapping rule, and determining a preset related skill field corresponding to the first skill field; and if the preset related skill field comprises a second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field. And if the preset related skill field does not comprise the second skill field, determining that the second content acquisition instruction and the first content acquisition instruction do not belong to the same skill field or the related skill field.

Further, the determining module 22 may be further configured to determine, according to the instruction parsing result, a first skill field to which the first content obtaining instruction belongs and a second skill field to which the second content obtaining instruction belongs; inquiring a preset related field mapping rule, and determining a preset related skill field corresponding to the first skill field; if the preset related skill field comprises a second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field; if the preset related skill field does not comprise a second skill field, judging whether the first skill field is the same as the second skill field, and if the first skill field is the same as the second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field; and if the first skill field is different from the second skill field, determining that the second content acquisition instruction and the first content acquisition instruction do not belong to the same skill field or the related skill field.

Further, the obtaining module 21 is specifically configured to: when it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, acquire content according to the parsing result of the second content acquisition instruction in combination with the parsing result of the first content acquisition instruction; and when it is determined that the second content acquisition instruction and the first content acquisition instruction belong to related skill fields, acquire content according to the parsing result of the second content acquisition instruction.

For example, when the first content acquisition instruction is "I want to listen to Water of Forgetfulness" and the second content acquisition instruction is "I want to listen to Liu Dehua", the two instructions belong to the same skill field, and the corresponding content may be "Liu Dehua's Water of Forgetfulness". When the first content acquisition instruction is "I want to listen to Logical Thinking" and the second content acquisition instruction is "episode 9", the two instructions belong to the same skill field, and the corresponding content may be "episode 9 of Logical Thinking". When the first content acquisition instruction is "check the weather" and the second content acquisition instruction is "call me a car to take me to the company", the two instructions belong to related skill fields, and the corresponding response may be to start the taxi-hailing function, for example by launching taxi-hailing software according to the current location, automatically entering the company address, and booking the trip.
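The way the parsing results are used in the two cases above can be sketched as follows. Representing a parsing result as a dictionary of slots is an assumption made for illustration, as are the function name `resolve_request` and the slot names `field`, `song`, `artist`, and `destination`.

```python
def resolve_request(first_parse, second_parse, relation):
    """Return the slots used to acquire content for the second instruction."""
    if relation == "same":
        # Same skill field: slots from the second instruction refine the first.
        return {**first_parse, **second_parse}
    if relation == "related":
        # Related skill field: the second instruction is handled on its own.
        return dict(second_parse)
    # Neither same nor related: the second instruction is not responded to.
    return None


first = {"field": "music", "song": "Water of Forgetfulness"}
second = {"field": "music", "artist": "Liu Dehua"}
merged = resolve_request(first, second, "same")
```

Here `merged` carries both the song title from the first instruction and the artist from the second, which corresponds to the combined request in the first example.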

Further, on the basis of the above embodiment, the determining module 22 is further configured to determine whether the second content acquisition instruction is a wake-up instruction before determining whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field; and, if the second content acquisition instruction is not a wake-up instruction, to determine whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field or a related skill field. In addition, the voice interaction device further comprises a response module configured to respond to the wake-up instruction when the second content acquisition instruction is a wake-up instruction.
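The ordering of the two checks above can be sketched as follows. The wake-up phrase in `WAKE_WORDS`, the function name, and the returned action labels are hypothetical; `same_or_related` stands for whatever skill-field judgement the device applies.

```python
WAKE_WORDS = {"hello device"}  # hypothetical wake-up phrase


def handle_second_instruction(text, same_or_related):
    """Check for a wake-up instruction before any skill-field judgement.

    `same_or_related` is a callable that returns True when the instruction
    belongs to the same skill field as the first one or to a related one.
    """
    if text.strip().lower() in WAKE_WORDS:
        # Wake-up instructions are answered directly; no field check is made.
        return "respond_to_wake_up"
    if same_or_related(text):
        return "acquire_content"
    return "ignore"
```

Checking for the wake-up instruction first means the skill-field judgement is never invoked for it.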

Further, on the basis of the above embodiment, the apparatus may further include a processing module configured to: when it is determined that the second content acquisition instruction and the first content acquisition instruction belong to neither the same skill field nor a related skill field, not respond to the second content acquisition instruction, continue timing, and determine whether a third content acquisition instruction is received before the preset time period elapses; and, if no third content acquisition instruction is received, end the voice interaction.
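The timing behaviour above can be sketched with a simulated event list instead of a real clock. The function name and the `(elapsed, text)` event format are hypothetical, and applying the same relevance judgement to the third and later instructions is an assumption of this sketch.

```python
def run_follow_up_window(events, window_seconds, is_relevant):
    """Process instructions that arrive during one preset time period.

    `events` is an iterable of (seconds_since_first_response, text) pairs in
    arrival order. Irrelevant instructions are skipped without restarting
    the timer. Returns the first relevant instruction text, or None if the
    period elapses without one.
    """
    for elapsed, text in events:
        if elapsed >= window_seconds:
            break  # the preset time period has been reached
        if is_relevant(text):
            return text
        # Neither same nor related skill field: do not respond, keep timing.
    return None  # no usable instruction arrived; the interaction ends


events = [(2.0, "tv noise"), (5.0, "episode 9"), (12.0, "late request")]
accepted = run_follow_up_window(events, 10.0, lambda t: t == "episode 9")
```

In this run the irrelevant instruction at 2 seconds is ignored without ending the session, and the instruction at 5 seconds is accepted because it still falls inside the 10-second period.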

Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device includes:

memory 1001, processor 1002, and computer programs stored on memory 1001 and executable on processor 1002.

The processor 1002, when executing the program, implements the voice interaction method provided in the above-described embodiments.

Further, the electronic device further includes:

a communication interface 1003 for communicating between the memory 1001 and the processor 1002.

A memory 1001 for storing computer programs that may be run on the processor 1002.

The memory 1001 may include high-speed RAM and may also include non-volatile memory, such as at least one disk memory.

The processor 1002 is configured to implement the voice interaction method according to the foregoing embodiment when executing the program.

If the memory 1001, the processor 1002, and the communication interface 1003 are implemented independently, the communication interface 1003, the memory 1001, and the processor 1002 may be connected to one another through a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in FIG. 3, but this does not mean that there is only one bus or one type of bus.

Optionally, in a specific implementation, if the memory 1001, the processor 1002, and the communication interface 1003 are integrated on one chip, the memory 1001, the processor 1002, and the communication interface 1003 may complete communication with each other through an internal interface.

The processor 1002 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.

The invention also provides a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method of voice interaction as described above.

The invention also provides a computer program product, wherein the instructions of the computer program product realize the voice interaction method when being executed by the processor.

In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, such references do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. A method of voice interaction, comprising:

receiving a first content acquisition instruction, and acquiring a response result of the first content acquisition instruction according to the first content acquisition instruction;

if a second content acquisition instruction is received within a preset time period, judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field;

if the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, acquiring a response result of the second content acquisition instruction according to the second content acquisition instruction in combination with the first content acquisition instruction; otherwise, acquiring the response result of the second content acquisition instruction according to the second content acquisition instruction without combining the first content acquisition instruction.

2. The method of claim 1, further comprising:

if the second content acquisition instruction and the first content acquisition instruction are determined not to belong to the same skill field, judging whether the second content acquisition instruction and the first content acquisition instruction belong to the related skill field;

and if the second content acquisition instruction and the first content acquisition instruction are determined to belong to the related skill field, acquiring a response result of the second content acquisition instruction according to the second content acquisition instruction.

3. The method of claim 2, further comprising:

and if the second content acquisition instruction and the first content acquisition instruction are determined not to belong to the related skill field, not responding to the second content acquisition instruction.

4. The method of claim 1, wherein the preset time period is determined according to the skill field to which the first content acquisition instruction belongs.

5. The method of claim 2,

judging whether the second content acquisition instruction and the first content acquisition instruction belong to the related skill field, specifically comprising:

querying a preset related field mapping rule, and determining a preset related skill field corresponding to a first skill field, wherein the first skill field is a skill field to which the first content acquisition instruction belongs;

and if the preset related skill field comprises a second skill field, determining that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field, wherein the second skill field is the skill field to which the second content acquisition instruction belongs.

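6. The method of claim 1, further comprising, prior to determining whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field:

determining that the second content acquisition instruction is not a wake-up instruction.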

7. The method of claim 6, further comprising:

and responding to the wake-up instruction if the second content acquisition instruction is a wake-up instruction.

8. A voice interaction apparatus, comprising:

the acquisition module is used for receiving a first content acquisition instruction and acquiring a response result of the first content acquisition instruction according to the first content acquisition instruction;

the judging module is used for judging whether a second content acquisition instruction and the first content acquisition instruction belong to the same skill field or not if the second content acquisition instruction is received within a preset time period;

the acquisition module is further configured to, if it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the same skill field, acquire a response result of the second content acquisition instruction according to the second content acquisition instruction in combination with the first content acquisition instruction; otherwise, acquire the response result of the second content acquisition instruction according to the second content acquisition instruction without combining the first content acquisition instruction.

9. The apparatus according to claim 8, wherein the judging module is further configured to judge whether the second content acquisition instruction and the first content acquisition instruction belong to related skill fields if it is determined that the second content acquisition instruction and the first content acquisition instruction do not belong to the same skill field;

the acquisition module is further configured to acquire a response result of the second content acquisition instruction according to the second content acquisition instruction if it is determined that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field.

10. The apparatus of claim 8, further comprising:

and the processing module is used for not responding to the second content acquisition instruction if the second content acquisition instruction and the first content acquisition instruction are determined not to belong to the related skill field.

11. The apparatus of claim 8, wherein the preset time period is determined according to the skill field to which the first content acquisition instruction belongs.

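12. The apparatus according to claim 9, wherein the judging module is specifically configured to:

query a preset related field mapping rule, and determine a preset related skill field corresponding to a first skill field, wherein the first skill field is the skill field to which the first content acquisition instruction belongs;

and, if the preset related skill field comprises a second skill field, determine that the second content acquisition instruction and the first content acquisition instruction belong to the related skill field, wherein the second skill field is the skill field to which the second content acquisition instruction belongs.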

13. The apparatus of claim 8, wherein the judging module is further configured to determine that the second content acquisition instruction is not a wake-up instruction before judging whether the second content acquisition instruction and the first content acquisition instruction belong to the same skill field.

14. The apparatus of claim 13, further comprising:

a response module configured to respond to the wake-up instruction if the second content acquisition instruction is a wake-up instruction.

15. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, which when executed by the processor implements the method of voice interaction according to any of claims 1-7.

16. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the voice interaction method of any one of claims 1-7.