CN110310636A

CN110310636A - Interaction control method, device, equipment and audio device

Info

Publication number: CN110310636A
Application number: CN201910550413.XA
Authority: CN
Inventors: 张向军
Original assignee: Goertek Inc
Current assignee: Goertek Inc
Priority date: 2019-06-24
Filing date: 2019-06-24
Publication date: 2019-10-08

Abstract

The invention discloses an interaction control method, device, equipment, and audio device. The method comprises: in response to a received target wake-up word, selecting, from a virtual role set, a target virtual role corresponding to the target wake-up word; invoking the target virtual role to receive a voice instruction from a user and process the voice instruction to obtain a corresponding instruction processing result; and playing the instruction processing result to the user through the target virtual role.

Description

Interaction control method, device, equipment and audio device

Technical field

The present invention relates to the technical field of interaction control, and more particularly to an interaction control method, device, equipment, and audio device.

Background technique

In recent years, with the development of artificial intelligence technology and device manufacturing technology, the adoption of artificial intelligence devices has risen substantially. For example, audio devices have become an indispensable household appliance for many families. They let users interact in natural language to obtain information, be entertained, control household appliances, and use other application services, giving users a new and convenient home experience.

However, when an existing audio device carries out voice interaction with a user, it personifies itself as a single virtual role to carry out the interaction (for example, as shown in Figure 1), and can support interaction with only a single user at a time. The audio device therefore cannot provide targeted services to multiple users and struggles to meet the needs of group users such as household users. Moreover, as the services supported by audio devices become more diverse and complex, the audio device needs a longer processing time to provide certain services (for example, some information retrieval requires a long time for retrieval and collation). The user then spends a long time waiting within a single interaction and cannot go on to obtain other services, which greatly degrades the user experience.

Summary of the invention

An object of the present invention is to provide a new technical solution for controlling interaction with an audio device.

According to a first aspect of the present invention, an interaction control method is provided, implemented by an audio device, comprising:

in response to a received target wake-up word, selecting, from a virtual role set, a target virtual role corresponding to the target wake-up word;

invoking the target virtual role to receive a voice instruction from a user, and processing the voice instruction to obtain a corresponding instruction processing result;

playing the instruction processing result to the user through the target virtual role.

According to a second aspect of the present invention, an interaction control device is provided, arranged on the audio device side, comprising:

a role selection unit, configured to select, in response to a received target wake-up word, a target virtual role corresponding to the target wake-up word from a virtual role set;

an instruction processing unit, configured to invoke the target virtual role to receive a voice instruction from a user, and to process the voice instruction to obtain a corresponding instruction processing result;

a result playing unit, configured to play the instruction processing result to the user through the target virtual role.

According to a third aspect of the present invention, interaction control equipment is provided, comprising:

a memory, configured to store executable instructions;

a processor, configured to run the interaction control equipment under the control of the executable instructions to perform the interaction control method according to the first aspect of the present invention.

According to a fourth aspect of the present invention, an audio device is provided, comprising:

the interaction control device according to the second aspect of the present invention, or the interaction control equipment according to the third aspect of the present invention.

According to one embodiment of the present disclosure, the audio device, in response to a received target wake-up word, selects from a virtual role set the target virtual role corresponding to the target wake-up word, invokes the target virtual role to receive a voice instruction from the user and process it to obtain a corresponding instruction processing result, and plays the instruction processing result to the user through the target virtual role. Multiple different virtual roles set in the audio device can thus interact with users in parallel, serving multiple users in parallel or providing multiple services to the same user in parallel. This avoids a single interaction with the user taking a long time, meets the user's need for parallel interaction with the audio device, and improves the user experience.

Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

Brief description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.

Fig. 1 is a block diagram showing an example hardware configuration of an audio device 1000 that can be used to implement embodiments of the present invention.

Fig. 2 shows a flowchart of the interaction control method of an embodiment of the present invention.

Fig. 3 shows a schematic diagram of an example of invoking virtual roles through independent threads according to an embodiment of the present invention.

Fig. 4 shows a schematic diagram of an example of terminating the invocation of a target virtual role according to an embodiment of the present invention.

Fig. 5 shows a schematic diagram of an example in which one user interacts with multiple virtual roles in an audio device.

Fig. 6 shows a schematic diagram of an example in which multiple users interact with multiple virtual roles in an audio device respectively.

Fig. 7 shows a schematic diagram of an example in which multiple users, each connected to the audio device through a device they are using, interact with multiple virtual roles in the audio device respectively.

Fig. 8 shows a block diagram of the interaction control device 3000 of an embodiment of the present invention.

Fig. 9 shows a block diagram of the interaction control equipment 4000 of an embodiment of the present invention.

Detailed description of the embodiments

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present invention.

The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its uses.

Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods, and devices should be considered part of the specification.

In all examples shown and discussed herein, any specific value should be interpreted as merely illustrative and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.

It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item is defined in one drawing, it need not be discussed further in subsequent drawings.

Fig. 1 is a block diagram showing a hardware configuration of an audio device 1000 that can implement embodiments of the present invention.

The audio device 1000 may be a smart speaker, smart earphones, or the like. As shown in Fig. 1, the audio device 1000 may include a processor 1100, a memory 1200, an interface device 1300, a communication device 1400, a display device 1500, an input device 1600, a loudspeaker 1700, a microphone 1800, and so on. The processor 1100 may be a central processing unit (CPU), a microprocessor (MCU), or the like. The memory 1200 includes, for example, ROM (read-only memory), RAM (random access memory), and non-volatile memory such as a hard disk. The interface device 1300 includes, for example, a USB interface and an earphone jack. The communication device 1400 is capable of wired or wireless communication, which may specifically include Wi-Fi, Bluetooth, and 2G/3G/4G/5G communication. The display device 1500 is, for example, a liquid crystal display or a touch display. The input device 1600 may include, for example, a touch screen, a keyboard, and motion-sensing input. A user can input a voice instruction through the microphone 1800. This triggers the audio device 1000 to process the voice instruction, with the processor 1100 running the audio device 1000 under the control of executable instructions stored in the memory 1200, and to play the processing result of the voice instruction to the user through the loudspeaker 1700.

The audio device shown in Fig. 1 is merely illustrative and is in no way intended to limit the invention, its application, or its uses. As applied in embodiments of the present invention, the memory 1200 of the audio device 1000 stores instructions that control the processor 1100 to perform any of the interaction control methods provided by the embodiments of the present invention. Those skilled in the art will appreciate that although Fig. 1 shows multiple components of the audio device 1000, the present invention may involve only some of them; for example, the audio device 1000 may involve only the processor 1100 and the memory 1200. A skilled person can design the instructions according to the solution disclosed herein. How instructions control a processor to operate is well known in the art and is not described in detail here.

This embodiment provides an interaction control method implemented by an audio device. The audio device is a speaker or earphone product built on artificial intelligence technology (such as intelligent voice technology) that can interact with users to provide corresponding application services, for example receiving a user's voice instruction to play a song, shop, or query weather information. In one example, the hardware configuration of the audio device may be as shown in Fig. 1.

As shown in Fig. 2, the interaction control method comprises steps S2100 to S2300.

Step S2100: in response to a received target wake-up word, select, from the virtual role set, the target virtual role corresponding to the target wake-up word.

In this embodiment, the audio device is provided with a virtual role set, which contains multiple virtual roles. A virtual role is a personified human-computer interaction object developed with computer technology and artificial intelligence technology. Each virtual role has a corresponding wake-up word, and different wake-up words wake up different virtual roles in the audio device. Combined with the subsequent steps of this embodiment, multiple different virtual roles can interact with users in parallel, serving multiple users in parallel or providing multiple services to the same user in parallel. This avoids a single interaction with the user taking a long time, meets the user's need for parallel interaction with the audio device, and improves the user experience.

In this embodiment, the target wake-up word corresponds to the target virtual role that the user wishes to invoke, and may be the target virtual role's role name, role identifier, or the like.

In a specific example, each virtual role in the virtual role set has a corresponding role identifier and wake-up permission. The wake-up permission corresponds to the voice features that a user group must have in order to be allowed to wake up the virtual role; for example, a wake-up permission may indicate that the user group with the corresponding voice features is allowed to wake up the virtual role. The voice features may include gender features, age features, and other features that a user's voice reveals.

In this example, step S2100 includes steps S2110 to S2130.

Step S2110: extract the voice features of the target wake-up word, and determine the corresponding target wake-up permission according to the voice features.

The target wake-up word is spoken by the user. The audio device can extract the voice features of the received target wake-up word to determine the corresponding target wake-up permission. For example, if the voice features of the target wake-up word show that the user's age feature is elderly and gender feature is male, the corresponding target wake-up permission is the permission that allows elderly male users to wake up a virtual role. Because each virtual role's wake-up permission corresponds to users' voice features, virtual roles with matching wake-up permissions can be customized for user groups with different voice features, better meeting users' individual needs.

In this example, the voice features of the target wake-up word can be extracted by any of a variety of speech recognition algorithms, which are not specifically limited here.

Step S2120: extract the semantic content of the target wake-up word, and determine the corresponding target role identifier according to the semantic content.

In this example, the semantic content of the target wake-up word can be extracted by any of a variety of semantic analysis algorithms, which are not specifically limited here.

By extracting the semantic content of the target wake-up word, the target role identifier of the target virtual role that the user wishes to wake up with the target wake-up word can be determined from the semantic content.

Step S2130: select, from the virtual role set, the virtual role that has the target wake-up permission and the target role identifier as the target virtual role.

Voice features are extracted from the target wake-up word spoken by the user to determine the corresponding target wake-up permission, and the virtual role that has both the target wake-up permission and the target role identifier is selected as the target virtual role. Setting a corresponding wake-up permission for each virtual role therefore restricts which user groups may wake it up, so that virtual roles can be customized for different user groups to meet users' individual needs. For example, for specific user groups with particular voice features, such as the elderly or children, a specific virtual role can be customized to carry out the interaction and meet the individual needs of that group.
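The selection in step S2130 can be illustrated with a minimal Python sketch. It assumes that steps S2110 and S2120 have already reduced the wake-up word to a permission label and a role identifier; the `VirtualRole` class, its field names, and `select_target_role` are hypothetical names introduced for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class VirtualRole:
    role_id: str          # role identifier, e.g. the role's name (hypothetical field)
    wake_permission: str  # label of the user group allowed to wake this role


def select_target_role(
    roles: Iterable[VirtualRole],
    target_role_id: str,
    target_permission: str,
) -> Optional[VirtualRole]:
    """Pick the role that matches both the role identifier and the wake-up permission."""
    for role in roles:
        if role.role_id == target_role_id and role.wake_permission == target_permission:
            return role
    # No role in the set may be woken by this speaker with this wake-up word.
    return None
```

Returning `None` when nothing matches is a design choice of this sketch; the patent does not say what the audio device does when no role satisfies both conditions.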

After the target virtual role is selected, the method proceeds to:

Step S2200: invoke the target virtual role to receive the user's voice instruction, and process the voice instruction to obtain a corresponding instruction processing result.

By invoking the target virtual role corresponding to the target wake-up word spoken by the user to receive and process the user's voice instruction, different virtual roles can be invoked to handle multiple voice instructions from one user or voice instructions from multiple users. Services are thus provided to multiple users in parallel, or multiple services are provided to the same user in parallel. This avoids a single interaction with the user taking a long time, meets the user's need for parallel interaction with the audio device, and improves the user experience.

In one example, invoking the target virtual role to receive the user's voice instruction and processing the voice instruction to obtain a corresponding instruction processing result may include steps S2210 to S2220.

Step S2210: obtain the role interaction record of the target virtual role.

A role interaction record records the historical interaction information between the corresponding virtual role and users. This information includes the voice instructions issued by the user, the virtual role's instruction processing results, voice messages that the virtual role issues on its own initiative, the user's voice responses, and so on, and can be recorded in chronological order to form the role interaction record. The record can be presented as a coherent context, similar in form to a chat log between the user and the virtual role.

In this example, a role interaction record can be kept for each virtual role in the virtual role set and stored locally. Each time a virtual role is invoked, its role interaction record can be read from local storage; with the record as context, the user's voice instruction can be recognized more accurately for processing.
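As a rough illustration of a locally stored, chronologically ordered role interaction record, the following Python sketch keeps one JSON file per virtual role. The class name, the JSON layout, and the file naming scheme are assumptions made for this example; the patent does not specify a storage format.

```python
import json
import time
from pathlib import Path


class RoleInteractionRecord:
    """Chat-log style history for one virtual role (hypothetical structure)."""

    def __init__(self, role_id, entries=None):
        self.role_id = role_id
        self.entries = list(entries or [])

    def add(self, speaker, text, timestamp=None):
        # Entries are appended in the order they occur, so the list stays chronological.
        self.entries.append({
            "time": time.time() if timestamp is None else timestamp,
            "speaker": speaker,
            "text": text,
        })

    def save(self, directory):
        # One file per role, named after the role identifier (an assumption of this sketch).
        path = Path(directory) / f"{self.role_id}.json"
        path.write_text(json.dumps(self.entries, ensure_ascii=False), encoding="utf-8")
        return path

    @classmethod
    def load(cls, directory, role_id):
        path = Path(directory) / f"{role_id}.json"
        if not path.exists():
            return cls(role_id)  # first invocation of this role: empty history
        return cls(role_id, json.loads(path.read_text(encoding="utf-8")))
```

On each invocation, the device would call `load` for the woken role, use `entries` as context when interpreting the new voice instruction, and `save` again after the exchange.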

Step S2220: invoke the independent thread corresponding to the target virtual role to receive the voice instruction, and process the voice instruction based on the role interaction record to obtain a corresponding instruction processing result.

In this example, each virtual role is invoked by calling its own independent thread, which handles the tasks to be executed when that virtual role is invoked. Giving each virtual role an independent thread allows virtual roles to be invoked independently and processed in parallel, which improves processing efficiency.

For example, processing a voice instruction usually means performing information retrieval for the specific content of the instruction and then providing the retrieved information to the user as the instruction processing result. The corresponding process of invoking a virtual role is divided into five stages: wake-up word reception, voice instruction reception, information retrieval, information-ready prompt, and information playback. Suppose there are three virtual roles, woken up by wake-up words A, B, and C respectively; these three virtual roles can then be invoked through three independent threads, as shown in Fig. 3.

By invoking the independent thread corresponding to the target virtual role to receive the voice instruction, the invocation is carried out independently and does not affect other virtual roles, which improves processing efficiency. At the same time, using the role interaction record allows the voice instruction to be recognized and processed more accurately, which improves processing performance.
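The one-thread-per-role arrangement described above can be sketched in Python with the standard `threading` and `queue` modules. This is only an illustration under simplifying assumptions: `retrieve` is a hypothetical stand-in for the information retrieval stage, each role appears at most once in `requests`, and the wake-up, prompt, and playback stages are left out.

```python
import queue
import threading
from typing import Callable, Dict, List, Tuple


def invoke_roles_in_parallel(
    requests: List[Tuple[str, str]],
    retrieve: Callable[[str], str],
) -> Dict[str, str]:
    """Run one thread per (role, instruction) pair and collect each role's result."""
    results = queue.Queue()  # thread-safe hand-off from worker threads

    def worker(role: str, instruction: str) -> None:
        # Information retrieval stage; a slow lookup here does not block other roles.
        results.put((role, retrieve(instruction)))

    threads = [
        threading.Thread(target=worker, args=(role, instruction))
        for role, instruction in requests
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All workers have finished, so the queue can be drained without blocking.
    collected: Dict[str, str] = {}
    while not results.empty():
        role, answer = results.get()
        collected[role] = answer
    return collected
```

For instance, `invoke_roles_in_parallel([("A", "weather"), ("B", "song")], lookup)` would run the hypothetical `lookup` for both roles concurrently, so a slow retrieval for role A does not delay role B.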

In another example, processing the voice instruction to obtain a corresponding instruction processing result may include steps S2211 to S2212.

Step S2211: determine the instruction processing permission of the target virtual role.

An instruction processing permission is the permission a virtual role has to process instructions. It may include an instruction processing scope (which voice instructions the role is allowed to process), an instruction processing mode (how each type of voice instruction, or a specific voice instruction, is processed), and so on.

Each virtual role in the virtual role set has a corresponding instruction processing permission. Different instruction processing permissions can be set for different virtual roles to customize their functions and meet users' diverse role interaction needs. For example, for a virtual role customized for children and teenagers, the instruction processing permission can be set to process only voice instructions that children and teenagers are allowed to issue, to restrict retrieval to a specific information base when processing voice instructions, and to filter harmful information out of the instruction processing result, so as to accurately meet the needs of children and teenagers.

In this example, the instruction processing permission of each virtual role in the virtual role set can be preset and stored locally. The preset can be a default setting or can be made in response to an external configuration operation by the user. After the target virtual role is determined, its instruction processing permission can be read from local storage.

Alternatively, after the target virtual role is determined, an inquiry can be issued to the user in real time, prompting the user to configure the instruction processing permission of the target virtual role in real time.

Step S2212: according to the instruction processing permission of the target virtual role, invoke the target virtual role to process the voice instruction and obtain a corresponding instruction processing result.

Invoking the target virtual role to process the voice instruction according to its instruction processing permission makes it possible to decide, under the configured permission, whether to process the voice instruction, how to process it, and so on. This customizes the function of the target virtual role and meets users' diverse needs for interacting with the audio device.
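One way to picture steps S2211 and S2212 is the Python sketch below, in which a permission consists of an allowed scope plus a simple word filter applied to the result. The `InstructionPermission` fields, the topic-based scope, and the word-level filtering are all assumptions chosen to keep the example small; the patent leaves the concrete form of the permission open.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Set


@dataclass
class InstructionPermission:
    allowed_topics: Set[str]                               # instruction processing scope
    blocked_words: Set[str] = field(default_factory=set)   # filtering applied to results


def process_with_permission(
    permission: InstructionPermission,
    topic: str,
    retrieve: Callable[[str], str],
) -> Optional[str]:
    """Process an instruction only if it falls within the role's permitted scope."""
    if topic not in permission.allowed_topics:
        return None  # outside the scope: this role does not handle the instruction
    result = retrieve(topic)  # `retrieve` is a hypothetical retrieval function
    # Processing mode of this sketch: drop blocked words from the retrieved text.
    kept = [word for word in result.split() if word.lower() not in permission.blocked_words]
    return " ".join(kept)
```

A role customized for children might, for example, allow only a "story" topic and block a list of unsuitable words, while a general-purpose role would carry a wider scope and an empty filter.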

In another example, processing the voice instruction to obtain a corresponding instruction processing result may include steps S2221 to S2222.

Step S2221: trigger identity verification of the user and obtain a corresponding identity verification result.

In this example, the audio device can issue a voice prompt to trigger identity verification of the user. The verification may include voiceprint verification of the user's voice, verification of a spoken password issued by the user, and so on. Alternatively, the audio device can send an identity verification prompt to a mobile terminal used by the user (such as a mobile phone or tablet computer) that is paired and connected with it via Wi-Fi, Bluetooth, or the like, prompting the user to verify their identity on the mobile terminal. In that case the verification may include fingerprint verification, face verification, password verification, gesture verification, and so on.

Step S2222: when the identity verification result indicates that verification has passed, invoke the target virtual role to process the voice instruction and obtain a corresponding instruction processing result.

In this example, when the target virtual role is to be invoked to process a voice instruction, identity verification of the user is triggered first, and the target virtual role is actually invoked to process the voice instruction only after verification passes. Only genuine, legitimate users are thus allowed to invoke the target virtual role, which protects legitimate users' right to interact with the audio device and prevents the security risks of a malicious third party interacting with it.
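The gating in steps S2221 and S2222 can be sketched as follows for the spoken-password case only. The sketch assumes the spoken password has already been transcribed to text, and the function names are hypothetical; voiceprint, fingerprint, and face verification would need dedicated components that are not shown.

```python
import hmac
from typing import Callable, Optional


def verify_spoken_password(transcribed: str, expected: str) -> bool:
    """Compare a transcribed spoken password with the stored one."""
    # Normalise case and surrounding whitespace, then compare in constant time.
    a = transcribed.strip().lower().encode("utf-8")
    b = expected.strip().lower().encode("utf-8")
    return hmac.compare_digest(a, b)


def process_if_verified(
    transcribed_password: str,
    expected_password: str,
    instruction: str,
    handler: Callable[[str], str],
) -> Optional[str]:
    """Invoke the role's handler only when identity verification passes."""
    if not verify_spoken_password(transcribed_password, expected_password):
        return None  # verification failed: the target virtual role is not invoked
    return handler(instruction)
```

Storing the expected password in plain text is a simplification of this example; a real device would more plausibly store a salted hash or rely on voiceprint matching.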

After the instruction processing result is obtained, the method proceeds to:

Step S2300: play the instruction processing result to the user through the target virtual role.

In this embodiment, each virtual role in the virtual role set can be given corresponding role features, which can be preset by system default or through external configuration by the user. Role features may include gender features, timbre features, tone features, and so on. Accordingly, the target virtual role also has role features, and the instruction processing result can be played to the user based on them. For example, if the role feature of the target virtual role is a sweet-voiced girl, the instruction processing result can be played in a voice that matches that character.

In one example, playing the instruction processing result to the user through the target virtual role may include steps S2310 to S2320.

Step S2310: after the instruction processing result is obtained, issue a voice prompt to the user, informing the user that the instruction processing result is available and prompting the user to issue a result play instruction.

In this example, the voice prompt may be a default prompt tone or voice message, or a personalized prompt tone or voice message set by the user.

Issuing a voice prompt after the instruction processing result is obtained lets the user decide, based on their current situation, whether to listen to the result, and therefore whether to issue a result play instruction.

Step S2320: in response to the received result play instruction, play the instruction processing result to the user.

After the instruction processing result for the user's voice instruction is obtained, a voice prompt is issued, and the result is played only after the user issues a result play instruction. The user can thus control, according to their current situation, whether the result is played. This avoids disturbing the user by playing the result directly in situations where playback is unsuitable (for example, when the user is engaged in a higher-priority interaction with another virtual role, or has suddenly entered a setting that requires silence), which would harm the user experience.
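The prompt-then-play flow of steps S2310 and S2320 amounts to holding the result until the user asks for it. A minimal Python sketch, assuming a hypothetical `speak(role_id, text)` callback that stands in for audio output in the role's voice:

```python
from typing import Callable, Dict


class ResultPlayer:
    """Holds finished results until the user issues a result play instruction."""

    def __init__(self, speak: Callable[[str, str], None]):
        self.speak = speak                 # hypothetical audio output callback
        self.pending: Dict[str, str] = {}  # role_id -> result waiting to be played

    def on_result_ready(self, role_id: str, result: str) -> None:
        # Step S2310: store the result and only prompt the user.
        self.pending[role_id] = result
        self.speak(role_id, "Your result is ready.")

    def on_play_instruction(self, role_id: str) -> bool:
        # Step S2320: play the stored result once the user asks for it.
        result = self.pending.pop(role_id, None)
        if result is None:
            return False  # nothing is waiting for this role
        self.speak(role_id, result)
        return True
```

Keeping pending results keyed by role lets several roles have results waiting at once, which fits the parallel interaction described in this embodiment; the prompt wording is a placeholder.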

In another example, playing the instruction processing result to the user through the target virtual role may include steps S2301 to S2303.

Step S2301: when another virtual role is already playing sound information to the user, obtain a play configuration indication.

The play configuration indication indicates how the instruction processing result is to be played, and may be either "mixed playback supported" or "mixed playback not supported". In this example, the play configuration indication can be a system default, or can be personalized through external configuration by the user, who chooses whether to support mixed playback according to their own preference.

This embodiment supports multiple virtual roles interacting in parallel with a single user or multiple users. When the target virtual role is about to play an instruction processing result, if another virtual role is already playing sound information to the user (which may include an instruction processing result obtained by the other virtual role or a voice message it issued on its own initiative), the play configuration indication can be obtained to determine how to play. This avoids playing the instruction processing result directly and disrupting the user's interactive experience.

Step S2302: when the play configuration indication is that mixed playback is supported, mix the sound information being played by the other virtual role with the instruction processing result to be played by the target virtual role, and then play the mixed audio to the user.

Step S2303: when the play configuration indication is that mixed playback is not supported, stop the other virtual role from playing its sound information, and play the instruction processing result through the target virtual role.

In this example, where the audio device supports multiple virtual roles interacting with users in parallel, whether the instruction processing result obtained by the target virtual role is mixed with sound information already being played by another virtual role is decided by whether the play configuration indication supports mixed playback. This avoids playing the instruction processing result directly and disrupting the user's interactive experience.
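The branch between steps S2302 and S2303 can be sketched in Python with audio represented as plain lists of float samples. Averaging overlapping samples is just one simple mixing method chosen for this illustration; the patent does not specify how the mixing is performed, and the function names are hypothetical.

```python
from itertools import zip_longest
from typing import List, Optional


def mix(a: List[float], b: List[float]) -> List[float]:
    """Mix two sample sequences by averaging, padding the shorter one with silence."""
    return [(x + y) / 2 for x, y in zip_longest(a, b, fillvalue=0.0)]


def choose_output(
    other_role_audio: Optional[List[float]],
    target_audio: List[float],
    support_mixing: bool,
) -> List[float]:
    """Decide what to play when the target role's result becomes ready."""
    if other_role_audio is None:
        return list(target_audio)  # no other role is playing: play directly
    if support_mixing:
        return mix(other_role_audio, target_audio)  # step S2302: mixed playback
    # Step S2303: the other role's audio is dropped and only the target result is played.
    return list(target_audio)
```

Averaging keeps the mixed signal within the original amplitude range at the cost of lowering the level of each source.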

The interaction control method shown in Fig. 2 has been described above with examples. In one example, in addition to the steps shown in Fig. 2, the interaction control method provided in this embodiment further includes:

Step S2400: if a new target wake-up word is received while the target virtual role is being invoked to receive the user's voice instruction, terminate the invocation of the target virtual role, and select and invoke a new target virtual role according to the new target wake-up word.

In this embodiment, the user can interact with the multiple virtual roles supported by the audio device. In practice, while the target virtual role is being invoked to receive the user's voice instruction, the reception may be interrupted by a new target wake-up word issued by the same user or another user. The voice instruction can then no longer be received in full and processed, and the new target wake-up word expresses a new interaction need. Terminating the invocation of the target virtual role at this point and selecting a new target virtual role according to the new target wake-up word ensures effective interaction between users and virtual roles in scenarios where the audio device supports multiple virtual roles.

For example, continuing the example shown in Fig. 3, the process of invoking a virtual role is divided into five stages: wake-up word reception, voice instruction reception, information retrieval, information-ready prompt, and information playback. Suppose the target wake-up word issued by the user is wake-up word A. If, while the virtual role corresponding to wake-up word A is being invoked to receive a voice instruction, the same user or another user issues a new target wake-up word, wake-up word B, the invocation of the virtual role corresponding to wake-up word A can be stopped and the virtual role corresponding to wake-up word B invoked to carry out the subsequent stages, such as voice instruction reception and information retrieval, as shown in Fig. 4. This ensures effective interaction between users and multiple virtual roles.
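The interruption behaviour of step S2400 can be modelled as a small router that tracks which role is currently receiving a voice instruction. The Python sketch below is an illustration only: `WakeWordRouter` and its methods are hypothetical names, and the later stages (retrieval, prompt, playback) are not modelled.

```python
from typing import Dict, List, Optional


class WakeWordRouter:
    """Tracks the role in the voice instruction reception stage and handles interruptions."""

    def __init__(self, roles_by_wake_word: Dict[str, str]):
        self.roles = dict(roles_by_wake_word)
        self.receiving_role: Optional[str] = None  # role now receiving an instruction
        self.terminated: List[str] = []            # roles whose invocation was cut short

    def on_wake_word(self, wake_word: str) -> Optional[str]:
        role = self.roles.get(wake_word)
        if role is None:
            return None  # unknown wake-up word: ignore it
        if self.receiving_role is not None and self.receiving_role != role:
            # Step S2400: a new wake-up word interrupts the role that was listening.
            self.terminated.append(self.receiving_role)
        self.receiving_role = role
        return role

    def on_instruction_received(self) -> Optional[str]:
        # The instruction was received in full; the role moves on to later stages
        # and can no longer be interrupted at the reception stage.
        role, self.receiving_role = self.receiving_role, None
        return role
```

With wake-up words A and B mapped to two roles, calling `on_wake_word("A")` and then `on_wake_word("B")` before any instruction is completed leaves the role for B listening and records the role for A as terminated, matching the scenario of Fig. 4.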

The interaction control method provided in this embodiment is further illustrated below with reference to Figs. 5, 6, and 7. In these examples, the audio device is a smart speaker.

Fig. 5 shows a schematic diagram of an example in which one user interacts with multiple virtual roles in a smart speaker. In Fig. 5, using the interaction control method provided in this embodiment, the user can wake up three corresponding virtual roles set in the smart speaker with the different wake-up words "Xiao A Xiao A", "Xiao B Xiao B", and "Xiao C Xiao C" and interact with them. For example, after waking up virtual role Xiao A with "Xiao A Xiao A" and giving it a voice instruction, the user can go on to wake up virtual role Xiao B with "Xiao B Xiao B" and give another voice instruction, and so on. The user can interact with multiple virtual roles in parallel, without having to wait for a single virtual role to finish processing one voice instruction and return its result before giving the next instruction. This saves the user waiting time during interaction, responds to the user's interaction needs more quickly, and improves the interactive experience between the user and the smart speaker.

Fig. 6 is a schematic diagram of an example in which multiple users each interact with multiple virtual roles in the smart speaker. In Fig. 6, using the interaction control method provided in this embodiment, two different users can each wake up one of two correspondingly configured virtual roles in the smart speaker through the different wake-up words "Xiao D, Xiao D" and "Xiao E, Xiao E" and interact with it. For example, after user 1 wakes up virtual role Xiao D, which corresponds to "Xiao D, Xiao D", and gives it a voice instruction, user 2 can wake up virtual role Xiao E, which corresponds to "Xiao E, Xiao E", and give it a voice instruction, and so on. Multiple users can thus interact independently with multiple virtual roles in the smart speaker. No user has to wait for a single virtual role to finish processing another user's voice instruction and obtain the instruction processing result before giving his or her own voice instruction. This saves users' waiting time during interaction, responds to users' interaction needs more quickly, and improves the interaction experience between users and the smart speaker.

Fig. 7 is a schematic diagram of an example in which multiple users connect to the smart speaker through their own devices and each interact with multiple virtual roles in the smart speaker. In Fig. 7, different users can connect electronic devices such as head-mounted devices, mobile phones, tablet computers and smart home products to the smart speaker via WiFi, Bluetooth or similar means. Through the electronic device connected to the smart speaker, and using the interaction control method provided in this embodiment, each user interacts separately with the multiple virtual roles provided in the smart speaker. The specific interaction process is similar to that of Fig. 6: no user has to wait for a single virtual role to finish processing another user's voice instruction and obtain the instruction processing result before giving his or her own voice instruction. This saves users' waiting time during interaction, responds to users' interaction needs more quickly, and improves the interaction experience between users and the smart speaker.

In this embodiment, an interaction control device 3000 is provided. As shown in Fig. 8, it comprises a role selection unit 3100, an instruction processing unit 3200 and a result playing unit 3300, and is used to implement the interaction control method provided in this embodiment.

The role selection unit 3100 is configured to, in response to a received target wake-up word, select from a virtual role set the target virtual role corresponding to the target wake-up word.

Optionally, each virtual role in the virtual role set has a corresponding role identifier and wake-up permission; the role selection unit 3100 is further configured to:

extract the voice feature of the target wake-up word, and determine the corresponding target wake-up permission according to the voice feature;

extract the semantic content of the target wake-up word, and determine the corresponding target role identifier according to the semantic content;

select, from the virtual role set, the virtual role having the target wake-up permission and the target role identifier as the target virtual role.
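The selection steps above can be sketched as follows. This is only an illustration under stated assumptions: the voice feature is reduced to a voiceprint label and the semantic content to the recognized wake-word text, and the two lookup tables, the `VirtualRole` class and `select_target_role` are hypothetical names, not part of the embodiment.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class VirtualRole:
    role_id: str          # role identifier
    wake_permission: str  # wake-up permission required to wake this role


# Hypothetical lookup tables standing in for speaker recognition and
# wake-word recognition, which the text does not specify.
VOICEPRINT_TO_PERMISSION = {"voiceprint-adult": "adult", "voiceprint-child": "child"}
WAKE_TEXT_TO_ROLE_ID = {"xiao a": "A", "xiao b": "B"}


def select_target_role(
    roles: Iterable[VirtualRole], voice_feature: str, semantic_content: str
) -> Optional[VirtualRole]:
    # Voice feature -> target wake-up permission.
    permission = VOICEPRINT_TO_PERMISSION.get(voice_feature)
    # Semantic content -> target role identifier.
    role_id = WAKE_TEXT_TO_ROLE_ID.get(semantic_content)
    # Choose the role that has both the permission and the identifier.
    for role in roles:
        if role.wake_permission == permission and role.role_id == role_id:
            return role
    return None
```

Returning `None` when no role matches is a design choice of the sketch; the embodiment does not state what happens when a speaker lacks the wake-up permission for the named role.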

The instruction processing unit 3200 is configured to call the target virtual role to receive the user's voice instruction and to process the voice instruction, obtaining the corresponding instruction processing result.

Optionally, the instruction processing unit 3200 is further configured to:

obtain the role interaction record of the target virtual role, the role interaction record being used to record the historical interaction information between the corresponding virtual role and the user;

call an independent thread corresponding to the target virtual role to receive the voice instruction and, based on the role interaction record, process the voice instruction to obtain the corresponding instruction processing result.
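A minimal sketch of one independent thread per virtual role, each with its own role interaction record, is given below. The `RoleWorker` class, its queues and the placeholder "processing" (which only uses the record to number the turn) are assumptions made for illustration; the embodiment does not specify how the thread or the record is implemented.

```python
import queue
import threading


class RoleWorker:
    """Hypothetical worker: one thread and one interaction record per role."""

    def __init__(self, name):
        self.name = name
        self.record = []              # role interaction record (history)
        self.inbox = queue.Queue()    # voice instructions for this role
        self.results = queue.Queue()  # instruction processing results
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            instruction = self.inbox.get()
            if instruction is None:   # sentinel: stop the worker
                break
            # Placeholder processing that consults the history of this role.
            turn = len(self.record) + 1
            self.record.append(instruction)
            self.results.put(f"{self.name}: {instruction} (turn {turn})")

    def submit(self, instruction):
        self.inbox.put(instruction)

    def stop(self):
        self.inbox.put(None)
        self.thread.join()
```

Because each role has its own thread and queue, a slow instruction given to one role does not block an instruction given to another, which is the parallel behaviour described for Figs. 5 and 6.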

Optionally, the instruction processing unit 3200 is further configured to:

determine the instruction processing authority of the target virtual role;

according to the instruction processing authority of the target virtual role, call the target virtual role to process the voice instruction and obtain the corresponding instruction processing result.

Optionally, the instruction processing unit 3200 is further configured to:

trigger identity authentication of the user and obtain the corresponding authentication result;

when the authentication result indicates that authentication has passed, call the target virtual role to process the voice instruction and obtain the corresponding instruction processing result.
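The two optional checks (instruction processing authority and identity authentication) can be illustrated together in a short sketch. The text presents them as separate options; applying both in sequence, the table contents, and the names `ROLE_AUTHORITY`, `VERIFIED_VOICEPRINTS`, `authenticate` and `handle` are all assumptions made for this example.

```python
# Hypothetical tables; the text does not specify how authority or
# verified identities are stored.
ROLE_AUTHORITY = {"A": {"music", "weather"}, "B": {"appliance_control"}}
VERIFIED_VOICEPRINTS = {"voiceprint-owner"}


def authenticate(voiceprint):
    """Stand-in for identity authentication; returns True when it passes."""
    return voiceprint in VERIFIED_VOICEPRINTS


def handle(role_id, instruction_type, voiceprint, processor):
    # Check 1: the role must have authority for this kind of instruction.
    if instruction_type not in ROLE_AUTHORITY.get(role_id, set()):
        return None
    # Check 2: the user must pass identity authentication.
    if not authenticate(voiceprint):
        return None
    # Both checks passed: let the role process the instruction.
    return processor(instruction_type)
```

An implementation that uses only one of the two options would simply drop the other check.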

The result playing unit 3300 is configured to play the instruction processing result to the user through the target virtual role.

Optionally, the result playing unit 3300 is further configured to:

after the instruction processing result is obtained, issue a voice prompt to the user, prompting the user to obtain the instruction processing result and triggering the user to issue a result play instruction;

in response to the received result play instruction, play the instruction processing result to the user.
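The prompt-then-play behaviour can be sketched as a small holder of pending results. The `ResultPlayer` class and the `spoken` list, which stands in for the loudspeaker output, are hypothetical and used only for illustration.

```python
class ResultPlayer:
    """Hypothetical result playing unit: prompt first, play on request."""

    def __init__(self):
        self.pending = {}  # role name -> result waiting to be played
        self.spoken = []   # stand-in for what is sent to the loudspeaker

    def on_result_ready(self, role, result):
        # Keep the result and only issue a voice prompt to the user.
        self.pending[role] = result
        self.spoken.append(f"{role}: your result is ready")

    def on_play_instruction(self, role):
        # Play the stored result when the user asks for it.
        result = self.pending.pop(role, None)
        if result is None:
            return False
        self.spoken.append(f"{role}: {result}")
        return True
```

Holding the result until the user asks for it lets the user keep interacting with other virtual roles while a long retrieval finishes.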

Optionally, the result playing unit 3300 is further configured to:

when another virtual role is already playing sound information to the user, obtain the playback configuration instruction;

when the playback configuration instruction indicates that mixed playback is supported, mix the sound information being played by the other virtual role with the instruction processing result to be played by the target virtual role, and then play the mixed audio to the user;

when the playback configuration instruction indicates that mixed playback is not supported, stop the other virtual role from playing sound information, and play the instruction processing result through the target virtual role.
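The two playback branches can be illustrated with a sketch that treats audio as plain lists of samples. Averaging the two signals is only one simple way of mixing and is an assumption of this example, as is the `play_result` function itself; the embodiment does not specify the mixing method.

```python
def play_result(current_audio, result_audio, mixing_supported):
    """Return the samples to play.

    current_audio: samples another virtual role is playing, or None.
    result_audio: samples of the target virtual role's processing result.
    """
    if current_audio is None:
        # No other role is playing: play the result directly.
        return list(result_audio)
    if mixing_supported:
        # Pad the shorter signal with silence, then average sample by sample.
        n = max(len(current_audio), len(result_audio))
        a = list(current_audio) + [0.0] * (n - len(current_audio))
        b = list(result_audio) + [0.0] * (n - len(result_audio))
        return [0.5 * (x + y) for x, y in zip(a, b)]
    # Mixing not supported: the other role's audio is dropped.
    return list(result_audio)
```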

Optionally, the interaction control device 3000 is further configured to:

when a new target wake-up word is received while the target virtual role is being called to receive the user's voice instruction, terminate the call to the target virtual role, and select and call a new target virtual role according to the new target wake-up word.

Those skilled in the art will appreciate that the interaction control device 3000 can be implemented in various ways. For example, the interaction control device 3000 may be implemented by configuring a processor with instructions. For example, the instructions may be stored in a ROM and, when the device starts, read from the ROM into a programmable device to implement the interaction control device 3000. For example, the interaction control device 3000 may be solidified into a dedicated device (such as an ASIC). The interaction control device 3000 may be divided into mutually independent units, or the units may be combined. The interaction control device 3000 may be implemented by one of the above implementations, or by a combination of two or more of them.

In this embodiment, the interaction control device 3000 is provided on the audio device side. It may be a software module provided in the audio device, a patch, plug-in or the like loaded into the audio device, or an application provided in a device that has established a connection with the audio device. In one example, the interaction control device 3000 may also be packaged as a software development kit (SDK) and run after being installed on the audio device.

In this embodiment, an interactive control equipment 4000 is also provided. As shown in Fig. 9, it comprises:

a memory 4100, configured to store executable instructions;

a processor 4200, configured to run the interactive control equipment 4000 under the control of the executable instructions so as to execute the interaction control method provided in this embodiment.

In this embodiment, the interactive control equipment 4000 is provided on the audio device side. It may be provided inside the audio device, or it may be a standalone device that has established a wired or wireless connection with the audio device.

In this embodiment, an audio device 5000 is also provided, comprising:

the interaction control device 3000 shown in Fig. 8, or the interactive control equipment 4000 shown in Fig. 9.

In this embodiment, the hardware configuration of the audio device 5000 may be as shown in Fig. 1. For example, the interaction control device 3000 is stored in the memory 1200 and loaded by the processor 1100 to implement the interaction control method of this embodiment; alternatively, executable instructions are stored in the memory 1200, and the processor 1100 implements the interaction control method of this embodiment under the control of the executable instructions. The audio device 5000 may include a smart speaker, a smart earphone, and the like.

The interaction control method, device, equipment and audio device provided in this embodiment have been described above with reference to the drawings. In response to a received target wake-up word, the audio device selects from a virtual role set the target virtual role corresponding to the target wake-up word, calls the target virtual role to receive the user's voice instruction and process it to obtain the corresponding instruction processing result, and plays the instruction processing result to the user through the target virtual role. Multiple different virtual roles provided in the audio device can thus interact with users in parallel, serving multiple users in parallel or providing multiple services to the same user in parallel. This avoids a single interaction with a user taking a long time, meets users' demand for parallel interaction with the audio device, and improves user experience.

The present invention may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement aspects of the present invention.

The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (for example, light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network such as the Internet, a local area network, a wide area network and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, a field-programmable gate array (FPGA) or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions, and the electronic circuitry can execute the computer-readable program instructions to implement aspects of the present invention.

Aspects of the present invention are described herein with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or another programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions that implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable data processing apparatus or another device, so that a series of operational steps is performed on the computer, other programmable data processing apparatus or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.

Various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application or their technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.

Claims

1. An interaction control method, characterized in that it is implemented by an audio device and comprises:

in response to a received target wake-up word, selecting, from a virtual role set, the target virtual role corresponding to the target wake-up word;

calling the target virtual role to receive the user's voice instruction, and processing the voice instruction to obtain the corresponding instruction processing result;

2. The method according to claim 1, wherein:

each virtual role in the virtual role set has a corresponding role identifier and wake-up permission;

the step of selecting, from the virtual role set, the target virtual role corresponding to the target wake-up word comprises:

selecting, from the virtual role set, the virtual role having the target wake-up permission and the target role identifier as the target virtual role.

3. The method according to claim 1, wherein calling the target virtual role to receive the user's voice instruction and processing the voice instruction to obtain the corresponding instruction processing result comprises:

obtaining the role interaction record of the target virtual role, the role interaction record being used to record the historical interaction information between the corresponding virtual role and the user;

calling an independent thread corresponding to the target virtual role to receive the voice instruction and, based on the role interaction record, processing the voice instruction to obtain the corresponding instruction processing result.

4. The method according to claim 1, wherein:

processing the voice instruction to obtain the corresponding instruction processing result comprises:

according to the instruction processing authority of the target virtual role, calling the target virtual role to process the voice instruction and obtain the corresponding instruction processing result;

and/or

when the authentication result indicates that authentication has passed, calling the target virtual role to process the voice instruction and obtain the corresponding instruction processing result.

5. The method according to claim 1, wherein playing the instruction processing result to the user through the target virtual role comprises:

after the instruction processing result is obtained, issuing a voice prompt to the user, prompting the user to obtain the instruction processing result and triggering the user to issue a result play instruction;

6. The method according to claim 1, wherein playing the instruction processing result to the user through the target virtual role comprises:

when the playback configuration instruction indicates that mixed playback is supported, mixing the sound information being played by the other virtual role with the instruction processing result to be played by the target virtual role, and then playing the mixed audio to the user;

when the playback configuration instruction indicates that mixed playback is not supported, stopping the other virtual role from playing sound information, and playing the instruction processing result through the target virtual role.

7. The method according to claim 1, wherein the method further comprises:

when a new target wake-up word is received while the target virtual role is being called to receive the user's voice instruction, terminating the call to the target virtual role, and selecting and calling a new target virtual role according to the new target wake-up word.

8. An interaction control device, characterized in that it is provided on the audio device side and comprises:

a role selection unit, configured to, in response to a received target wake-up word, select from a virtual role set the target virtual role corresponding to the target wake-up word;

an instruction processing unit, configured to call the target virtual role to receive the user's voice instruction and to process the voice instruction, obtaining the corresponding instruction processing result;

9. An interactive control equipment, characterized by comprising:

a memory, configured to store executable instructions;

a processor, configured to run the interactive control equipment under the control of the executable instructions so as to execute the interaction control method as claimed in any one of claims 1-7.

10. An audio device, characterized by comprising:

the interaction control device as claimed in claim 8, or the interactive control equipment as claimed in claim 9.