WO2019100738A1 - 多人参与的人机交互方法及装置 (Human-computer interaction method and device for multi-person participation) - Google Patents

多人参与的人机交互方法及装置 (Human-computer interaction method and device for multi-person participation)

Info

Publication number
WO2019100738A1
Authority
WO
WIPO (PCT)
Prior art keywords
interaction
user
priority
response
instruction
Prior art date
Application number
PCT/CN2018/096706
Other languages
English (en)
French (fr)
Inventor
周维
陈志刚
王智国
胡国平
Original Assignee
科大讯飞股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 科大讯飞股份有限公司 (iFLYTEK Co., Ltd.)
Publication of WO2019100738A1 publication Critical patent/WO2019100738A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality

Definitions

  • the invention relates to the field of human-computer interaction, and particularly relates to a human-computer interaction method and device for multi-person participation.
  • for more complex interaction scenarios involving multiple participants, the result is either one-by-one responses or arbitrary interruption, and it is difficult to achieve anthropomorphic effects such as intelligent responses regarding interaction targets and interaction order; as a result, wrong responses are often made to user requests, degrading the user experience.
  • the embodiments of the invention provide a human-computer interaction method and device for multi-person participation, to solve the problem that existing human-computer interaction solutions deliver a poor interaction experience and cannot achieve anthropomorphic, intelligent interaction because their response strategies are simplistic and crude.
  • a human-computer interaction method for multi-person participation comprises: receiving a current interaction instruction of a user; analyzing the current interaction instruction to obtain a user interaction intention; determining a priority of the interaction response; and responding to the user interaction according to the priority of the interaction response and the user interaction intention.
  • the interactive instruction is information of any one or more of the following forms: voice, action, button.
  • analyzing the current interaction instruction to obtain the user interaction intention comprises: performing voice recognition on the current voice information to obtain a recognition result; and performing semantic understanding according to the recognition result and the stored historical data to obtain the user interaction intention.
  • analyzing the current interaction instruction to obtain the user interaction intention further includes: determining, according to a pre-built association judgment model, the historical data associated with the current interaction instruction. Performing semantic understanding according to the recognition result and the stored historical data then comprises: performing semantic understanding according to the recognition result and the historical data information associated with the current interaction instruction, to obtain the user interaction intention.
  • the method further includes: determining the user information corresponding to the current interaction instruction and identifying the user ID corresponding to the historical data. Performing semantic understanding according to the recognition result and the stored historical data then comprises: performing semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intention.
  • determining the priority of the interaction response comprises: determining an interaction priority feature according to the user interaction intention, where the interaction priority feature comprises any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and determining the priority of the interaction response according to the interaction priority feature.
  • alternatively, determining the priority of the interaction response comprises: determining a user priority feature S_u according to the user information; determining an interaction priority feature according to the user interaction intention, where the interaction priority feature comprises any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and determining the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.
  • responding to the user interaction according to the priority of the interaction response and the user interaction intention comprises: if the priority of the interaction response is greater than a set threshold, interrupting the response to other interaction instructions and responding to the current interaction instruction; after the response to the current interaction instruction is completed, the interrupted task is re-executed.
  • responding to the user interaction according to the priority of the interaction response and the user interaction intention may also comprise: determining a response policy according to the priority of the interaction response and a pre-built policy model, and responding to the current interaction instruction according to the response policy.
  • a multi-person human-computer interaction device comprising:
  • An interaction information receiving module configured to receive a current interaction instruction of the user
  • An instruction parsing module configured to analyze the current interaction instruction to obtain a user interaction intention
  • a priority analysis module for determining the priority of the interactive response
  • the response module is configured to respond to the user interaction according to the priority of the interaction response and the user interaction intention.
  • the instruction parsing module includes:
  • a voice recognition unit configured to perform voice recognition on the current voice information, to obtain a recognition result
  • the semantic understanding unit is configured to perform semantic understanding according to the recognition result and the stored historical data to obtain a user interaction intention.
  • the instruction parsing module further includes:
  • An association determining unit configured to determine historical data associated with the current interaction instruction according to a pre-built association judgment model
  • the semantic understanding unit performs semantic understanding according to the recognition result and historical data information associated with the current interaction instruction to obtain a user interaction intention.
  • the device further comprises:
  • a user information obtaining module configured to determine user information corresponding to the current interaction instruction
  • the voice recognition unit is further configured to identify, according to the user information determined by the user information acquiring module, a user ID corresponding to the historical data;
  • the semantic understanding unit performs semantic understanding according to the recognition result, historical data corresponding to the user ID corresponding to the current interaction instruction, and historical data corresponding to other user IDs, to obtain a user interaction intention.
  • the priority analysis module is specifically configured to determine an interaction priority feature according to the user interaction intention, where the interaction priority feature includes any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and to determine the priority of the interaction response according to the interaction priority feature.
  • the priority analysis module comprises:
  • a user analysis unit configured to determine a user priority feature S_u according to the user information;
  • an instruction analysis unit configured to determine an interaction priority feature according to the user interaction intention, where the interaction priority feature includes any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;
  • a priority determining unit configured to determine the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.
  • the response module is further configured to determine whether the priority of the interaction response is greater than a set threshold, and if so, interrupt the response to other interaction instructions and respond to the current interaction instruction; after the response to the current interaction instruction is completed, the interrupted task is re-executed.
  • the response module is specifically configured to determine a response policy according to the priority of the interaction response and a pre-built policy model, and respond to the current interaction instruction according to the response policy.
  • the interactive instruction is information of any one or more of the following forms: voice, action, button.
  • a multi-person human-computer interaction device includes a processor, a memory and a receiving circuit connected to each other;
  • the receiving circuit is configured to receive a current interaction instruction of the user and send the instruction to the processor;
  • the memory is configured to store program instructions
  • the processor is configured to execute the program instructions to perform: receiving the current interaction instruction of the user from the receiving circuit; analyzing the current interaction instruction to obtain a user interaction intention; determining a priority of the interaction response; and responding to the user interaction according to the priority of the interaction response and the user interaction intention.
  • the receiving circuit comprises any one or more of the following:
  • a microphone for receiving an interactive instruction in the form of voice;
  • a sensor for receiving an interactive instruction in the form of a body action;
  • a touch screen for receiving an interactive instruction in the form of an action;
  • a button for receiving an interactive instruction in the form of a key press.
  • after receiving the current interaction instruction of the user, the human-computer interaction method and device for multi-person participation analyze the instruction to obtain the user interaction intention and then determine the priority of the interaction response, responding to the user interaction according to that priority and the user interaction intention. This not only ensures the accuracy of the response but also takes the priority of different interaction responses into account, so users get a better experience and the intelligence and anthropomorphism of human-computer interaction are improved.
  • FIG. 1 is a flow chart of a human-computer interaction method in which a plurality of people participate in an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a human-machine interaction apparatus in which a plurality of people participate in an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of an instruction parsing module in an embodiment of the present invention.
  • FIG. 4 is another schematic structural diagram of a human-machine interaction apparatus in which a plurality of people participate in an embodiment of the present invention
  • FIG. 5 is still another schematic structural diagram of a human-machine interaction apparatus in which a plurality of people participate in an embodiment of the present invention.
  • FIG. 1 is a flowchart of a human-computer interaction method for multi-person participation according to an embodiment of the present invention, which includes the following steps:
  • Step 101: Receive a current interaction instruction of the user.
  • the interaction instruction may be in one or more of the following forms: voice; actions such as gestures or other body movements; keys; and the like.
  • Step 102: Analyze the current interaction instruction to obtain a user interaction intention.
  • the analysis of interactive instructions mainly refers to the recognition and understanding of the instruction information.
  • for action-type interaction instructions, the corresponding action needs to be recognized and the user interaction intention corresponding to that action looked up. These correspondences are preset, so the machine only needs to recognize the action; similarly, for button interaction instructions, the user interaction intention corresponding to each button is also preset.
  • for voice interaction instructions, speech recognition and semantic understanding are required. Speech recognition can be realized with existing, common speech recognition technology.
  • when a voice interaction instruction is semantically understood, the current instruction can be understood with the help of multiple pieces of historical interaction data. That is, semantic understanding is performed according to the recognition result of the current interaction instruction and the stored historical data, and the user interaction intention is obtained. Of course, the user interaction intention may also be obtained through semantic understanding of the recognition result of the current interaction instruction alone.
  • the semantic understanding can adopt the existing mainstream semantic understanding technology based on the neural network model.
  • the input of the model is the current instruction vector information and the vector information of the multiple pieces of historical data, and the output of the model is the semantic understanding result, such as the service and keywords.
  • the current instruction vector information may be obtained according to the recognition result of the current instruction, for example, segmenting the recognition result, obtaining a word vector of each word, and using the sum of the word vectors as the current instruction vector information.
  • further, the historical data associated with the current interaction instruction may be determined from the stored historical data according to the pre-built association judgment model; semantic understanding is then performed according to the recognition result of the current interaction instruction and the historical data information associated with it, to obtain the user interaction intent.
  • the historical data information associated with the current interaction instruction may be a vector of the historical data associated with the current interaction instruction, or the semantic understanding result corresponding to that historical data, such as a service (booking a train ticket) and keywords (departure place, destination, time, and the like), which is not limited in this embodiment of the present invention.
  • the association judgment model may adopt a neural network model (such as a DNN or RNN), and its construction process is the same as that of a general neural network, roughly: collect a large amount of training data and label it; obtain the vector information of each pair of training data (i.e., a piece of historical data and the current interaction data); determine the topology of the model (the input is the vector information of the historical data and the current data, and the output is the association judgment result); and train the association judgment model with a common model training algorithm (such as the BP algorithm).
  • user identity information may also be considered, that is, the user information corresponding to the current interaction instruction is determined, and the user ID corresponding to each piece of historical data is identified.
  • the interaction history data of different users may be stored separately.
  • the interaction history data of different users may be stored together, and the user ID corresponding to each historical data may be added.
  • when determining the historical data associated with the current interaction instruction, both the previous rounds of historical data in time and the previous rounds of historical data of the current user may be considered simultaneously. This prevents the situation where, if the number of rounds M is small and many people are participating, the current user's own historical data is never used and the accuracy of the user's intention judgment is affected.
  • alternatively, when user identity information is considered, the association judgment may be skipped, and semantic understanding performed directly according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intent. Since the historical data of the user issuing the current interaction instruction and the historical data of other users are considered simultaneously, an accurate user interaction intention can still be obtained.
  • determining the user information corresponding to the current interaction instruction may be performed in various ways, for example, by performing voiceprint recognition on the current interaction instruction, or by acquiring other biometric information of the user (such as a face image or an iris image) when receiving the current interaction instruction and determining the user information from that biometric information.
  • Step 103: Determine the priority of the interaction response.
  • since the solution of the present invention targets human-computer interaction scenarios with multiple participants, in order to make the interaction more intelligent and anthropomorphic, the priority of the interaction response for the current interaction instruction is further considered; the priority determines the response strategy, and the user interaction is responded to in accordance with the corresponding response policy, for example in descending order of interaction response priority rather than in the chronological order in which the interaction instructions were received.
  • the priority of the interaction response may be determined according to the user interaction intention, according to the user information, or comprehensively according to both, where an interaction priority feature can be obtained from the user interaction intention and a user priority feature from the user information. The three cases are explained below.
  • (1) the interaction priority feature comprises any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb. The priority of the interaction response can be determined from any one or more of these three features, where:
  • the urgency of the request S_e refers to the sense of urgency expressed in the interaction instruction, judged mainly through keyword matching on time-urgency words such as "hurry up", "right away", and "quickly"; S_e takes the value 1 if such a word appears (urgency expressed) and 0 otherwise.
  • service timeliness S_t: different services have different timeliness, and services with higher timeliness have higher priority. Taking TV playback as an example, news and live broadcasts have higher priority than video on demand. The service timeliness value lies between 0 and 1; once the service type has been determined during semantic understanding, its timeliness is looked up from the pre-configured timeliness values of the service types.
  • interference with the current service C_disturb refers to how much the response would disturb the service currently being executed; the smaller the interference, the higher the response priority. For example, while a movie is playing, playing music would require stopping the movie, which is strong interference; a weather query can be shown in a small local window or a scrolling bar, which is weak interference; adding an alarm clock can be considered no interference. The same request can have multiple response modes (for example, a weather query can be shown full screen, played by voice, or shown in a scrolling bar), and the system selects a less intrusive response mode according to the current state.
  • an interference coefficient can be defined to measure the interference; its formula is given as an image in the original publication, with the following quantities: O_a ∈ {0,1} indicates whether audio playback is occupied; 0 ≤ O_s ≤ 1 indicates the proportion of the screen that is occupied; p ∈ [0,1] is a weight-allocation coefficient that apportions audio and screen occupancy differently for different products (if there is no display screen, p is 1); t_o and t_left are, respectively, the time that responding to the interaction instruction will take and the remaining time of the currently executing task. Thus, the less the audio and screen occupation, the smaller the interference coefficient; the longer the occupation time and the shorter the remaining time of the current task, the larger the interference coefficient. Note that if the system is currently idle, the audio and screen occupancy are both 0, and the instruction can be responded to directly.
  • the priority of the interactive response may be determined according to any one or more of the above three characteristics.
  • (2) the user priority feature S_u may first be determined according to the user information, and the priority of the interaction response is then determined according to the user priority feature S_u.
  • the level of the user may be determined according to the user information, and different user priority characteristics may be set for different levels of users.
  • different individuals may be distinguished by voiceprints and images, and divided into three categories: old people, adults, and children.
  • the priority can be set as elderly > adult > child, which can prevent a naughty child from disturbing the adults' viewing; the user levels can be preset by the system, or a setting interface can be provided so that the user can configure them on site according to the needs of the actual application.
  • (3) the priority of the interaction response may be determined according to the user priority feature S_u together with one or more of the three interaction priority features.
  • if the priority of the interaction response is determined only from the user information, step 103 may also be performed before step 101, which is not limited in this embodiment.
  • Step 104: Respond to the user interaction according to the priority of the interaction response and the user interaction intention.
  • the response policy may be determined according to the priority of the interaction response, and the current interaction instruction is responded to according to the response policy.
  • for example, a policy model may be pre-built; the policy model may adopt a neural network model, such as a DNN or CNN, its input may be the priority features, and its output is a response strategy, such as cutting off the current interaction or waiting for it to end.
  • the interaction instruction corresponding to the user interaction intention is responded to in order of priority of the interaction response.
  • the priority of the interaction response may be calculated as a linear weighting of the above priority features; the specific formula is given as an image in the original publication, with priority weights α, β, γ, and θ chosen according to the actual application, where α+β+γ=1.
  • further, a decision threshold for the response priority may be set; if the calculated priority of the interaction response is greater than the set threshold, the response to other interaction instructions is interrupted and the current interaction instruction is responded to. After the response to the current interaction instruction is completed, the interrupted task is re-executed.
  • the response policy has a strong flexibility, and the specific policy can be configured according to different products and applications, which is not limited in this embodiment of the present invention.
  • with the human-computer interaction method for multi-person participation, after the current interaction instruction of the user is received, the instruction is analyzed to obtain the user interaction intention, the priority of the interaction response is determined, and the user interaction is responded to according to that priority and the user interaction intention. This not only ensures the accuracy of the response but also takes the priority of different interaction responses into account, so users get a better experience and the intelligence and anthropomorphism of human-computer interaction are improved.
  • the embodiment of the present invention further provides a human-machine interaction device for multi-person participation, as shown in FIG. 2, which is a schematic structural diagram of the device.
  • the apparatus comprises:
  • the interaction information receiving module 201 is configured to receive a current interaction instruction of the user
  • the instruction parsing module 202 is configured to analyze the current interaction instruction to obtain a user interaction intention
  • a priority analysis module 203 configured to determine a priority of the interaction response
  • the response module 204 is configured to respond to the user interaction according to the priority of the interaction response and the user interaction intention.
  • the interactive instruction may be in a variety of different forms, such as a voice, an action, a button, and the like.
  • the interaction information receiving module 201 may include any one or more of the following:
  • a microphone for receiving an interactive instruction in the form of voice;
  • a sensor for receiving an interactive instruction in the form of a body action;
  • a touch screen for receiving an interactive instruction in the form of an action;
  • a button for receiving an interactive instruction in the form of a key press.
  • the interaction information receiving module 201 may also be a physical entity of another form, which is not limited in this embodiment of the present invention.
  • a specific structure of the instruction parsing module 202 includes: a speech recognition unit and a semantic understanding unit, wherein:
  • the voice recognition unit is configured to perform voice recognition on the current voice information, obtain a recognition result, and store the recognition result as a piece of historical data; the voice recognition may adopt a prior art;
  • the semantic understanding unit is configured to perform semantic understanding according to the recognition result and the stored historical data to obtain a user interaction intention, and specifically adopts a current mainstream neural network model-based semantic understanding technology, which is not described in detail.
  • another specific structure of the instruction parsing module 202 is shown in FIG. 3 and includes a speech recognition unit 221, a semantic understanding unit 222, and an association judgment unit 223 connected to the speech recognition unit 221 and the semantic understanding unit 222, respectively, where:
  • the voice recognition unit 221 is configured to perform voice recognition on the current voice information, obtain a recognition result, and store the recognition result as a piece of historical data;
  • the association determining unit 223 is configured to determine historical data associated with the current interaction instruction according to the pre-established association judgment model
  • the semantic understanding unit 222 is configured to perform semantic understanding according to the recognition result and historical data information associated with the current interaction instruction to obtain a user interaction intention.
  • the historical data information associated with the current interaction instruction may specifically be a vector of historical data associated with the current interaction instruction or a semantic understanding result corresponding to historical data associated with the current interaction instruction.
  • the semantic understanding can adopt the existing mainstream semantic understanding technology based on the neural network model, which will not be described in detail.
  • the association judgment model may adopt a neural network model (such as a DNN or RNN) and may be built by a corresponding association-judgment-model building module; the building process is the same as that of a general neural network, roughly: collect a large amount of training data and label it, obtain the vector information of each pair of training data (i.e., a piece of historical data and the current interaction data), determine the topology of the model (the input is the vector information of the historical data and the current data, and the output is the association judgment result), and train the association judgment model with a common model training algorithm (such as the BP algorithm).
  • the association judgment model construction module may be used as a part of the apparatus of the present invention, or may be independent of the apparatus of the present invention, and the embodiment of the present invention is not limited thereto.
  • FIG. 4 is another schematic structural diagram of a human-computer interaction device for multi-person participation according to an embodiment of the present invention.
  • the device further includes: a user information obtaining module 205, configured to determine user information corresponding to the current interactive instruction.
  • the instruction parsing module 202 includes a speech recognition unit 221 and a semantic understanding unit 222, where:
  • the voice recognition unit 221 not only performs voice recognition on the current voice information to obtain the recognition result, but also identifies, according to the user information determined by the user information acquiring module 205, the user ID corresponding to the historical data;
  • the semantic understanding unit 222 performs semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction (such as the previous several rounds of historical data or the historical data within a certain preceding period), and the historical data corresponding to other user IDs, to obtain the user interaction intent.
  • the user information obtaining module 205 can obtain the user information in multiple ways, for example, by performing voiceprint recognition on the current interaction instruction to determine the corresponding user information, or by acquiring other biometric information of the user when the interaction information receiving module 201 receives the current interaction instruction and determining the user information corresponding to the current interaction instruction from that biometric information.
  • the priority analysis module 203 may specifically determine the priority of the interaction response according to the user interaction intention and/or the user information.
  • the priority analysis module 203 is specifically configured to determine an interaction priority feature according to the user interaction intention, where the interaction priority feature includes any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and to determine the priority of the interaction response according to the interaction priority feature.
  • priority analysis module 203 can include the following units:
  • a user analysis unit configured to determine a user priority feature S_u according to the user information;
  • an instruction analysis unit configured to determine an interaction priority feature according to the user interaction intention, where the interaction priority feature includes any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;
  • a priority determining unit configured to determine the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.
  • the response module 204 may specifically determine a response policy according to the priority of the interaction response obtained by the priority analysis module 203, such as determining a response policy based on a pre-built policy model, according to the The response policy responds to the current interactive instruction.
  • the policy model may adopt a neural network model, such as DNN, CNN, etc., and may be constructed by a corresponding policy model building module.
  • the policy model building module may be used as a part of the device of the present invention, or may be independent of the device of the present invention.
  • the response module 204 may also respond to user interactions in descending order of interaction response priority; for example, the interaction instructions corresponding to the user interaction intentions are inserted into a task queue in descending order of response priority, and the interaction instructions in the task queue are responded to in sequence.
  • the response module 204 is further configured to determine whether the priority of the interaction response is greater than a set threshold, and if so, interrupt the response to other interaction instructions and respond to the current interaction instruction; after the response to the current interaction instruction is completed, the interrupted task is re-executed.
  • with the human-computer interaction device for multi-person participation, after the current interaction instruction of the user is received, the instruction is analyzed to obtain the user interaction intention, the priority of the interaction response is determined, and the user interaction is responded to according to that priority and the user interaction intention. This not only ensures the accuracy of the response but also takes the priority of different interaction responses into account, so users get a better experience and the intelligence and anthropomorphism of human-computer interaction are improved.
  • FIG. 5 is yet another schematic structural diagram of a human-computer interaction device for multi-person participation according to an embodiment of the present invention.
  • the system includes a processor 51, a memory 52, and a receiving circuit 53 that are connected to each other.
  • the receiving circuit 53 is configured to receive the current interactive command of the user and send it to the processor 51.
  • the memory 52 is used to store program instructions and can also be used to store data of the processor 51 during processing.
  • the processor 51 is configured to execute the program instructions to perform the multi-person participation human-computer interaction method in the above embodiment. For example, receiving the current interaction instruction of the user from the receiving circuit 53; analyzing the current interaction instruction to obtain a user interaction intention; determining a priority of the interaction response; and according to the priority of the interaction response and the user interaction intention, User interaction responds.
  • the receiving circuit 53 may include any one or more of the following:
  • a microphone for receiving an interactive instruction in the form of voice;
  • a sensor for receiving an interactive instruction in the form of a body action;
  • a touch screen for receiving an interactive instruction in the form of an action;
  • a button for receiving an interactive instruction in the form of a key press.
  • the human-machine interaction device can be any device with information processing capability such as a robot, a mobile phone, or a computer.
  • the processor 51 may also be referred to as a CPU (Central Processing Unit).
  • Processor 51 may be an integrated circuit chip with signal processing capabilities.
  • the processor 51 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
  • the various embodiments in the specification are described in a progressive manner, and similar parts of the various embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments.
  • the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
  • the device embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.

Abstract

A human-computer interaction method and device for multi-person participation. The method includes: receiving a current interaction instruction of a user (101); analyzing the current interaction instruction to obtain a user interaction intention (102); determining a priority of the interaction response (103); and responding to the user interaction according to the priority of the interaction response and the user interaction intention (104). The method can improve the accuracy and intelligence of human-computer interaction responses in multi-person scenarios and enhance the user experience.

Description

Human-computer interaction method and device for multi-person participation

[Technical Field]

The present invention relates to the field of human-computer interaction, and in particular to a human-computer interaction method and device for multi-person participation.

[Background]

With the continuous progress of artificial intelligence technology, human-computer interaction has made great strides and all kinds of interactive robots have sprung up; accordingly, people's pursuit of natural, anthropomorphic modes of human-computer interaction has grown ever stronger. Most existing human-computer interaction systems can still only handle relatively simple one-to-one interaction. Their general workflow is: receive the user's interaction instruction, analyze the instruction to obtain the user's interaction intention, and make a corresponding response according to that intention. Even devices that do support multi-person interaction merely interrupt the interaction with user A when user B joins, and then interact with user B. For more complex interaction scenarios with multiple participants, the result is either one-by-one responses or arbitrary interruption; it is difficult to achieve anthropomorphic effects such as intelligent responses regarding interaction targets and interaction order, so wrong responses are often made to user requests, degrading the user experience.

[Summary of the Invention]

Embodiments of the present invention provide a human-computer interaction method and device for multi-person participation, to solve the problem that existing human-computer interaction solutions deliver a poor interaction experience and cannot achieve anthropomorphic, intelligent interaction because their response strategies are simplistic and crude.

To this end, the present invention provides the following technical solutions:
A human-computer interaction method for multi-person participation, the method comprising:

receiving a current interaction instruction of a user;

analyzing the current interaction instruction to obtain a user interaction intention;

determining a priority of the interaction response;

responding to the user interaction according to the priority of the interaction response and the user interaction intention.

Preferably, the interaction instruction is information in any one or more of the following forms: voice, action, button.
Preferably, when the interaction instruction contains voice information, analyzing the current interaction instruction to obtain the user interaction intention comprises:

performing voice recognition on the current voice information to obtain a recognition result;

performing semantic understanding according to the recognition result and stored historical data to obtain the user interaction intention.

Preferably, analyzing the current interaction instruction to obtain the user interaction intention further comprises:

determining, according to a pre-built association judgment model, the historical data associated with the current interaction instruction;

and performing semantic understanding according to the recognition result and the stored historical data to obtain the user interaction intention comprises:

performing semantic understanding according to the recognition result and the historical data information associated with the current interaction instruction, to obtain the user interaction intention.
Preferably, the method further comprises:

determining the user information corresponding to the current interaction instruction, and identifying the user ID corresponding to the historical data;

and performing semantic understanding according to the recognition result and the stored historical data to obtain the user interaction intention comprises:

performing semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intention.

Preferably, determining the priority of the interaction response comprises:

determining an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;

determining the priority of the interaction response according to the interaction priority feature.
Preferably, determining the priority of the interaction response comprises:

determining a user priority feature S_u according to the user information;

determining an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;

determining the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.

Preferably, responding to the user interaction according to the priority of the interaction response and the user interaction intention comprises:

if the priority of the interaction response is greater than a set threshold, interrupting the response to other interaction instructions and responding to the current interaction instruction;

after the response to the current interaction instruction is completed, re-executing the interrupted task.

Preferably, responding to the user interaction according to the priority of the interaction response and the user interaction intention comprises:

determining a response policy according to the priority of the interaction response and a pre-built policy model;

responding to the current interaction instruction according to the response policy.
A human-computer interaction device for multi-person participation, the device comprising:

an interaction information receiving module, configured to receive a current interaction instruction of a user;

an instruction parsing module, configured to analyze the current interaction instruction to obtain a user interaction intention;

a priority analysis module, configured to determine a priority of the interaction response;

a response module, configured to respond to the user interaction according to the priority of the interaction response and the user interaction intention.

Preferably, when the interaction instruction contains voice information, the instruction parsing module comprises:

a voice recognition unit, configured to perform voice recognition on the current voice information to obtain a recognition result;

a semantic understanding unit, configured to perform semantic understanding according to the recognition result and stored historical data to obtain the user interaction intention.

Preferably, the instruction parsing module further comprises:

an association judgment unit, configured to determine, according to a pre-built association judgment model, the historical data associated with the current interaction instruction;

the semantic understanding unit performs semantic understanding according to the recognition result and the historical data information associated with the current interaction instruction, to obtain the user interaction intention.

Preferably, the device further comprises:

a user information acquiring module, configured to determine the user information corresponding to the current interaction instruction;

the voice recognition unit is further configured to identify, according to the user information determined by the user information acquiring module, the user ID corresponding to the historical data;

the semantic understanding unit performs semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intention.
Preferably, the priority analysis module is specifically configured to determine an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and to determine the priority of the interaction response according to the interaction priority feature.

Preferably, the priority analysis module comprises:

a user analysis unit, configured to determine a user priority feature S_u according to the user information;

an instruction analysis unit, configured to determine an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;

a priority determining unit, configured to determine the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.

Preferably, the response module is further configured to judge whether the priority of the interaction response is greater than a set threshold, and if so, interrupt the response to other interaction instructions and respond to the current interaction instruction; after the response to the current interaction instruction is completed, the interrupted task is re-executed.

Preferably, the response module is specifically configured to determine a response policy according to the priority of the interaction response and a pre-built policy model, and to respond to the current interaction instruction according to the response policy.

Preferably, the interaction instruction is information in any one or more of the following forms: voice, action, button.
A human-computer interaction device for multi-person participation, comprising a processor, a memory, and a receiving circuit that are connected to one another;

the receiving circuit is configured to receive a current interaction instruction of a user and send it to the processor;

the memory is configured to store program instructions;

the processor is configured to run the program instructions to perform:

receiving the current interaction instruction of the user from the receiving circuit;

analyzing the current interaction instruction to obtain a user interaction intention;

determining a priority of the interaction response;

responding to the user interaction according to the priority of the interaction response and the user interaction intention.

Preferably, the receiving circuit comprises any one or more of the following:

a microphone, for receiving interaction instructions in the form of voice;

a sensor, for receiving interaction instructions in the form of body actions;

a touch screen, for receiving interaction instructions in the form of actions;

a button, for receiving interaction instructions in the form of key presses.
With the human-computer interaction method and device for multi-person participation provided by the embodiments of the present invention, after a user's current interaction instruction is received, the instruction is analyzed to obtain the user interaction intention, the priority of the interaction response is determined, and the user interaction is responded to according to that priority and the user interaction intention. This not only ensures the accuracy of the response but, because the priorities of different interaction responses are considered, also gives users a better experience and improves the intelligence and anthropomorphism of human-computer interaction.
[Brief Description of the Drawings]

To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments recorded in the present invention, and those of ordinary skill in the art can also obtain other drawings from them.
FIG. 1 is a flowchart of a human-computer interaction method for multi-person participation according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a human-computer interaction device for multi-person participation according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of the instruction parsing module in an embodiment of the present invention;

FIG. 4 is another schematic structural diagram of a human-computer interaction device for multi-person participation according to an embodiment of the present invention;

FIG. 5 is yet another schematic structural diagram of a human-computer interaction device for multi-person participation according to an embodiment of the present invention.
[Detailed Description]

To enable those skilled in the art to better understand the solutions of the embodiments of the present invention, the embodiments are further described in detail below with reference to the drawings and implementations.
As shown in FIG. 1, which is a flowchart of a human-computer interaction method for multi-person participation according to an embodiment of the present invention, the method includes the following steps:

Step 101: Receive a current interaction instruction of a user.

The interaction instruction may be in one or more of the following forms: voice; actions such as gestures or other body movements; buttons; and so on.

Step 102: Analyze the current interaction instruction to obtain a user interaction intention.

Analyzing an interaction instruction mainly means recognizing and understanding the instruction information. For action-type instructions, the corresponding action must be recognized and the user interaction intention associated with that action looked up; these correspondences are preset, so the machine only needs to recognize the action. Likewise, for button instructions, the user interaction intention corresponding to each button is preset.
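As an illustration of such preset correspondences, the sketch below keeps a simple lookup table from recognized actions or buttons to intentions; the specific gestures, buttons, and intent names are hypothetical examples, since the patent only states that the mappings are preset:

```python
from typing import Optional

# Hypothetical preset action/button-to-intent mappings; the patent
# does not specify concrete entries, only that they are predefined.
PRESET_INTENTS = {
    ("action", "wave_hand"): "wake_up",
    ("action", "thumbs_up"): "confirm",
    ("button", "power"): "toggle_power",
    ("button", "volume_up"): "increase_volume",
}

def lookup_intent(kind: str, name: str) -> Optional[str]:
    """Return the preset user interaction intention, if any."""
    return PRESET_INTENTS.get((kind, name))

print(lookup_intent("action", "wave_hand"))  # -> "wake_up"
```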
The following description mainly takes voice interaction instructions as an example.

A voice interaction instruction requires voice recognition and semantic understanding. Voice recognition can be implemented with existing, common speech recognition technology.

As for semantic understanding, multi-person interaction often involves interleaved exchanges, so the user's intention is easily misunderstood. As shown in Table 1, in example A, user u1's request "What's the weather like there?" is unrelated to the previous turn by user u2 and instead relates to u1's own interaction history; in example B, user u1's interaction builds on user u2's, and the movie title must be inherited from u2's history.

Table 1

(Table 1 is provided as an image, PCTCN2018096706-appb-000001, in the original publication and is not reproduced here.)

Therefore, in the embodiments of the present invention, when a voice interaction instruction is semantically understood, the current instruction can be understood with the help of multiple pieces of historical interaction data: semantic understanding is performed according to the recognition result of the current instruction and the stored historical data, and the user interaction intention is obtained. Of course, the user interaction intention may also be obtained through semantic understanding of the recognition result of the current instruction alone.

It should be noted that semantic understanding can use existing mainstream semantic understanding technology based on neural network models. The input of the model is the current instruction vector information and the vector information of the multiple pieces of historical data, and the output of the model is the semantic understanding result, such as the service and keywords. The current instruction vector information can be obtained from the recognition result of the current instruction, for example, by segmenting the recognition result into words, obtaining the word vector of each word, and taking the sum of the word vectors as the current instruction vector information.
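A minimal sketch of that featurization step, assuming a pre-trained word-embedding table (the toy vocabulary and 4-dimensional vectors below are placeholders; a real system would load full-dimensional embeddings trained on a large corpus):

```python
import numpy as np

# Hypothetical pre-trained word embeddings (placeholder values).
EMBEDDINGS = {
    "weather": np.array([0.1, 0.3, 0.0, 0.2]),
    "there":   np.array([0.0, 0.1, 0.4, 0.1]),
    "how":     np.array([0.2, 0.0, 0.1, 0.3]),
}

def instruction_vector(recognized_words: list) -> np.ndarray:
    """Sum the word vectors of the segmented recognition result, as the
    description suggests, to form the current instruction vector."""
    dim = len(next(iter(EMBEDDINGS.values())))
    return sum((EMBEDDINGS.get(w, np.zeros(dim)) for w in recognized_words),
               start=np.zeros(dim))

vec = instruction_vector(["how", "weather", "there"])
```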
Further, in another embodiment of the present invention, the historical data associated with the current interaction instruction may first be determined from the stored historical data according to a pre-built association judgment model; semantic understanding is then performed according to the recognition result of the current instruction and the historical data information associated with it, and the user interaction intention is obtained. The historical data information associated with the current interaction instruction may be the vectors of the associated historical data, or the semantic understanding results corresponding to the associated historical data, such as a service (booking a train ticket) and keywords (departure place, destination, time, etc.); this is not limited in the embodiments of the present invention.

The association judgment model may adopt a neural network model (such as a DNN or RNN), and its construction process is the same as that of a general neural network, roughly: collect a large amount of training data and label it; obtain the vector information of each pair of training data (i.e., a piece of historical data and the current interaction data); determine the topology of the model (the input is the vector information of the historical data and the current data, and the output is the association judgment result); and train the association judgment model with a common training algorithm (such as the BP algorithm).
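A minimal sketch of such an association classifier in PyTorch, under the assumption that each training pair is the concatenation of a history vector and the current instruction vector with a 0/1 relevance label; the layer sizes and training details are illustrative, not specified by the patent:

```python
import torch
import torch.nn as nn

DIM = 4  # dimensionality of each instruction vector (assumed)

# Feed-forward association judgment model: input is the concatenated
# [history_vector, current_vector]; output is the probability that the
# history item is relevant to the current turn.
model = nn.Sequential(
    nn.Linear(2 * DIM, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
    nn.Sigmoid(),
)
loss_fn = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(history_vec: torch.Tensor, current_vec: torch.Tensor,
               label: float) -> float:
    """One supervised step on a (history, current, relevant?) pair;
    both inputs are 1-D float tensors of length DIM."""
    x = torch.cat([history_vec, current_vec]).unsqueeze(0)
    y = torch.tensor([[label]], dtype=torch.float32)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```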
Further, in another embodiment of the present invention, user identity information may also be considered: the user information corresponding to the current interaction instruction is determined, and the user ID corresponding to each piece of historical data is identified. To make later retrieval easier, the interaction history of different users may be stored separately; of course, the interaction history of different users may also be stored together, with the corresponding user ID attached to each record.

Accordingly, when the historical data associated with the current interaction instruction is determined with the pre-built association judgment model, both the previous rounds of historical data in time and the previous rounds of historical data of the current user may be considered simultaneously. This prevents the situation where, if the number of rounds M is small and many people are participating, the current user's own history is never used and the accuracy of the intention judgment suffers.

Of course, when user identity information is considered, the association judgment above may also be skipped, and semantic understanding performed directly according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intention. In this case, because the history of the user issuing the current instruction and the history of other users are both considered, an accurate user interaction intention can still be obtained.

It should be noted that the user information corresponding to the current interaction instruction can be determined in several ways, for example, by performing voiceprint recognition on the current interaction instruction, or by acquiring other biometric information of the user (such as a face image or an iris image) when receiving the current instruction and determining the user information from that biometric information.
Step 103: Determine the priority of the interaction response.

Since the solution of the present invention targets human-computer interaction scenarios with multiple participants, in order to make the interaction more intelligent and anthropomorphic, the embodiments of the present invention further consider the priority of the interaction response for the current interaction instruction, decide a response strategy according to that priority, and respond to user interactions according to the corresponding strategy; for example, responding in descending order of interaction response priority rather than in the chronological order in which the interaction instructions were received.

The priority of the interaction response may be determined from the user interaction intention, from the user information, or comprehensively from both, where an interaction priority feature can be obtained from the user interaction intention and a user priority feature can be obtained from the user information. The three cases are explained below.
(1) Determining the priority of the interaction response from the user interaction intention

An interaction priority feature is determined according to the user interaction intention, and includes any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb. In practice, the priority of the interaction response can be determined from any one or more of these three features, where:

Urgency of the request S_e: the sense of urgency expressed in the interaction instruction, judged mainly through keyword matching on time-urgency words such as "赶快" ("hurry up"), "马上" ("right away"), and "快点" ("quickly"); S_e takes the value 1 if such a word appears (urgency expressed), and 0 otherwise.

Service timeliness S_t: different services have different timeliness, and services with higher timeliness have higher priority. Taking TV playback as an example, news and live sports broadcasts have higher priority than video on demand. The service timeliness value lies between 0 and 1; once the service type has been determined during semantic understanding, its timeliness is looked up from the pre-configured timeliness values of the service types.

Interference with the current service C_disturb: the degree to which the response would disturb the service currently being executed; the smaller the interference, the higher the response priority. For example, while a movie is playing, playing music would require stopping the movie, which is strong interference; a weather query can be shown in a small local window or a scrolling bar, which is weak interference; adding an alarm clock can be considered no interference. The same request may have multiple response modes (for example, a weather query can be shown full screen, played by voice, or shown in a scrolling bar), and the system chooses the less intrusive mode according to the current state. For instance, an interference coefficient can be defined to measure the interference; its formula is given as an image (PCTCN2018096706-appb-000002) in the original publication, with the following quantities:

O_a ∈ {0,1} indicates whether audio playback is occupied; 0 ≤ O_s ≤ 1 indicates the proportion of the screen that is occupied; p ∈ [0,1] is a weight-allocation coefficient that apportions audio and screen occupancy differently for different products (if there is no display screen, p is 1); t_o and t_left are, respectively, the time that responding to the interaction instruction will take and the remaining time of the currently executing task. Thus, the less the audio and screen occupation, the smaller the interference coefficient; the longer the occupation time and the shorter the remaining time of the current task, the larger the interference coefficient. Note that if the system is currently idle, the audio and screen occupancy are both 0, and the instruction can be responded to directly.
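Since the published formula itself is only available as an image, the sketch below implements one plausible form consistent with the stated monotonicity, as an assumption rather than the patent's exact formula: occupancy weighted by p, scaled by the ratio of response time to remaining task time.

```python
def interference_coefficient(o_a: int, o_s: float, p: float,
                             t_o: float, t_left: float) -> float:
    """Hypothetical C_disturb = (p*O_a + (1-p)*O_s) * t_o / t_left:
    grows with audio/screen occupancy and response duration, shrinks
    as more of the current task remains. The published formula is an
    image and may differ in detail."""
    if o_a == 0 and o_s == 0.0:
        return 0.0  # system idle: can respond directly
    occupancy = p * o_a + (1.0 - p) * o_s
    return occupancy * (t_o / t_left)

# Weather query in a scroll bar while a movie plays: weak interference.
print(interference_coefficient(o_a=0, o_s=0.1, p=0.5, t_o=10, t_left=1800))
```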
In practical applications, the priority of the interaction response can be determined from any one or more of the above three features.

(2) Determining the priority of the interaction response from the user information

Specifically, a user priority feature S_u may first be determined according to the user information, and the priority of the interaction response determined according to S_u. For example, the user's level can be determined from the user information, and different user priority features set for users of different levels. In a TV application, for instance, different individuals can be distinguished by voiceprint and image and divided into three priority classes (elderly, adults, and children), with the priority set as elderly > adults > children, which prevents a naughty child from disturbing the adults' viewing. The user levels can be preset by the system, or a setting interface can be provided so that users can configure them on site according to the needs of the actual application.

(3) Determining the priority of the interaction response from both the user interaction intention and the user information

Specifically, the priority of the interaction response can be determined from the user priority feature S_u together with one or more of the three interaction priority features above.

It should be noted that if the priority of the interaction response is determined only from the user information, step 103 may also be performed before step 101; this embodiment does not limit this.
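For illustration of case (2) above, the preset user levels might be stored as a simple lookup table; the numeric values below are assumptions, since the description fixes only the ordering elderly > adults > children:

```python
# Hypothetical user priority features per level; only the ordering
# elderly > adult > child comes from the description.
USER_PRIORITY = {"elderly": 1.0, "adult": 0.7, "child": 0.3}

def user_priority_feature(level: str) -> float:
    return USER_PRIORITY.get(level, 0.5)  # default for unknown users
```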
Step 104: Respond to the user interaction according to the priority of the interaction response and the user interaction intention.

Specifically, a response policy may be determined according to the priority of the interaction response, and the current interaction instruction responded to according to that policy.

For example, a policy model may be built in advance; the policy model may adopt a neural network model such as a DNN or CNN, its input may be the priority features, and its output is a response strategy, such as cutting off the current interaction or waiting for it to end.

As another example, the interaction instructions corresponding to the user interaction intentions may be responded to in descending order of interaction response priority.

The priority of the interaction response can be calculated as a linear weighting of the above priority features; the formula is given as an image (PCTCN2018096706-appb-000003) in the original publication,

where α, β, γ, and θ are all priority weights whose values can be chosen according to the actual application, with α+β+γ=1.

Further, a decision threshold for the response priority can be set; if the calculated priority of the interaction response is greater than the set threshold, the response to other interaction instructions is interrupted and the current interaction instruction is responded to; after the response to the current interaction instruction is completed, the interrupted task is re-executed.
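The linear weighting formula is likewise only available as an image; the sketch below assumes one natural reading consistent with the description (θ weights the user feature, α, β, γ form a convex combination of the interaction features, and interference enters negatively) and also shows the threshold-interrupt rule:

```python
def response_priority(s_u: float, s_e: float, s_t: float, c_disturb: float,
                      alpha=0.4, beta=0.3, gamma=0.3, theta=0.5) -> float:
    """Hypothetical linear weighting; the patent states only that
    alpha, beta, gamma, theta are weights with alpha+beta+gamma == 1."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    return theta * s_u + alpha * s_e + beta * s_t + gamma * (1.0 - c_disturb)

def decide(priority: float, busy: bool, threshold: float = 0.8) -> str:
    """Threshold rule from the description: a sufficiently high-priority
    instruction interrupts the current response, which is resumed after."""
    if busy and priority > threshold:
        return "interrupt-current-then-resume"
    return "enqueue-by-priority"

p = response_priority(s_u=1.0, s_e=1.0, s_t=0.9, c_disturb=0.2)
print(decide(p, busy=True))  # -> "interrupt-current-then-resume"
```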
It should be noted that, in practical applications, response strategies are highly flexible, and the specific strategy can be configured for different products and application scenarios; the embodiments of the present invention do not limit this.

With the human-computer interaction method for multi-person participation provided by the embodiments of the present invention, after the user's current interaction instruction is received, the instruction is analyzed to obtain the user interaction intention, the priority of the interaction response is determined, and the user interaction is responded to according to that priority and the user interaction intention. This not only ensures the accuracy of the response but, because the priority levels of different interaction responses are considered, also gives users a better experience and improves the intelligence and anthropomorphism of human-computer interaction.
Correspondingly, an embodiment of the present invention further provides a human-computer interaction device for multi-person participation; FIG. 2 is a schematic structural diagram of the device.

In this embodiment, the device comprises:

an interaction information receiving module 201, configured to receive a current interaction instruction of a user;

an instruction parsing module 202, configured to analyze the current interaction instruction to obtain a user interaction intention;

a priority analysis module 203, configured to determine a priority of the interaction response;

a response module 204, configured to respond to the user interaction according to the priority of the interaction response and the user interaction intention.

Depending on the product and application, the interaction instruction may take many forms, such as voice, action, or button; correspondingly, the interaction information receiving module 201 may include any one or more of the following:

a microphone, for receiving interaction instructions in the form of voice;

a sensor, for receiving interaction instructions in the form of body actions;

a touch screen, for receiving interaction instructions in the form of actions;

a button, for receiving interaction instructions in the form of key presses.

Of course, the interaction information receiving module 201 may also be a physical entity of another form, which the embodiments of the present invention do not limit.

The following description takes the case where the interaction instruction contains voice information.

Correspondingly, one specific structure of the instruction parsing module 202 comprises a voice recognition unit and a semantic understanding unit, where:

the voice recognition unit performs voice recognition on the current voice information to obtain a recognition result, and stores the recognition result as a piece of historical data; voice recognition can use the prior art;

the semantic understanding unit performs semantic understanding according to the recognition result and the stored historical data to obtain the user interaction intention, and may specifically use current mainstream semantic understanding technology based on neural network models, which is not described in detail here.

Another specific structure of the instruction parsing module 202 is shown in FIG. 3 and comprises a voice recognition unit 221, a semantic understanding unit 222, and an association judgment unit 223 connected to the voice recognition unit 221 and the semantic understanding unit 222, respectively, where:

the voice recognition unit 221 performs voice recognition on the current voice information to obtain a recognition result, and stores the recognition result as a piece of historical data;

the association judgment unit 223 determines, according to a pre-built association judgment model, the historical data associated with the current interaction instruction;

the semantic understanding unit 222 performs semantic understanding according to the recognition result and the historical data information associated with the current interaction instruction to obtain the user interaction intention. The historical data information associated with the current interaction instruction may specifically be the vectors of the associated historical data, or the semantic understanding results corresponding to the associated historical data. Semantic understanding may use existing mainstream neural-network-based technology and is not described in detail here.

The association judgment model may adopt a neural network model (such as a DNN or RNN) and may be built by a corresponding association-judgment-model building module; the building process is the same as that of a general neural network, roughly: collect a large amount of training data and label it; obtain the vector information of each pair of training data (i.e., a piece of historical data and the current interaction data); determine the topology of the model (the input is the vector information of the historical data and the current data, the output is the association judgment result); and train the model with a common training algorithm (such as the BP algorithm). The association-judgment-model building module may be part of the device of the present invention or independent of it; the embodiments of the present invention do not limit this.
FIG. 4 is another schematic structural diagram of a human-computer interaction device for multi-person participation according to an embodiment of the present invention.

Unlike the embodiment shown in FIG. 2, in this embodiment the device further comprises a user information acquiring module 205, configured to determine the user information corresponding to the current interaction instruction.

Correspondingly, in this embodiment, the instruction parsing module 202 comprises a voice recognition unit 221 and a semantic understanding unit 222, where:

the voice recognition unit 221 not only performs voice recognition on the current voice information to obtain a recognition result, but also identifies, according to the user information determined by the user information acquiring module 205, the user ID corresponding to the historical data;

the semantic understanding unit 222 performs semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction (such as the previous several rounds of historical data or the historical data within a certain preceding period), and the historical data corresponding to other user IDs, to obtain the user interaction intention.

The user information acquiring module 205 can obtain the user information in multiple ways, for example, by performing voiceprint recognition on the current interaction instruction to determine the corresponding user information, or by acquiring other biometric information of the user when the interaction information receiving module 201 receives the current interaction instruction and determining the user information corresponding to the current instruction from that biometric information.
In the above device embodiments of the present invention, the priority analysis module 203 may determine the priority of the interaction response according to the user interaction intention and/or the user information.

For example, in one specific embodiment, the priority analysis module 203 is specifically configured to determine an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and to determine the priority of the interaction response according to the interaction priority feature.

As another example, another specific embodiment of the priority analysis module 203 may comprise the following units:

a user analysis unit, configured to determine a user priority feature S_u according to the user information;

an instruction analysis unit, configured to determine an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;

a priority determining unit, configured to determine the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.

In the above device embodiments of the present invention, the response module 204 may determine a response policy according to the priority of the interaction response obtained by the priority analysis module 203, for example based on a pre-built policy model, and respond to the current interaction instruction according to the response policy. The policy model may adopt a neural network model such as a DNN or CNN and may be built by a corresponding policy-model building module, which may be part of the device of the present invention or independent of it; the embodiments of the present invention do not limit this.

In the above device embodiments of the present invention, the response module 204 may also respond to user interactions in descending order of interaction response priority; for example, the interaction instructions corresponding to the user interaction intentions are inserted into a task queue in descending order of response priority, and the instructions in the task queue are responded to in sequence.
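A minimal sketch of such a priority-ordered task queue using Python's standard heapq module, as a max-priority queue via negated scores; this is an illustration, not the patent's implementation:

```python
import heapq
import itertools

_counter = itertools.count()  # tie-breaker keeps FIFO order among equals
_queue: list = []

def enqueue(instruction: str, priority: float) -> None:
    # heapq is a min-heap, so store the negated priority first.
    heapq.heappush(_queue, (-priority, next(_counter), instruction))

def next_instruction() -> str:
    """Pop the pending instruction with the highest response priority."""
    return heapq.heappop(_queue)[2]

enqueue("play music", 0.4)
enqueue("urgent weather warning", 0.95)
print(next_instruction())  # -> "urgent weather warning"
```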
Further, the response module 204 is also configured to judge whether the priority of the interaction response is greater than a set threshold, and if so, interrupt the response to other interaction instructions and respond to the current interaction instruction; after the response to the current interaction instruction is completed, the interrupted task is re-executed.

With the human-computer interaction device for multi-person participation provided by the embodiments of the present invention, after the user's current interaction instruction is received, the instruction is analyzed to obtain the user interaction intention, the priority of the interaction response is determined, and the user interaction is responded to according to that priority and the user interaction intention. This not only ensures the accuracy of the response but, because the priority levels of different interaction responses are considered, also gives users a better experience and improves the intelligence and anthropomorphism of human-computer interaction.
An embodiment of the present invention further provides yet another human-computer interaction device for multi-person participation; FIG. 5 is yet another schematic structural diagram of such a device.

In this embodiment, the system comprises a processor 51, a memory 52, and a receiving circuit 53 that are connected to one another. The receiving circuit 53 receives the user's current interaction instruction and sends it to the processor 51. The memory 52 stores program instructions and can also store data used by the processor 51 during processing. The processor 51 runs the program instructions to execute the human-computer interaction method for multi-person participation of the above embodiments, for example: receiving the user's current interaction instruction from the receiving circuit 53; analyzing the current interaction instruction to obtain a user interaction intention; determining a priority of the interaction response; and responding to the user interaction according to the priority of the interaction response and the user interaction intention.

Further, the receiving circuit 53 may include any one or more of the following:

a microphone, for receiving interaction instructions in the form of voice;

a sensor, for receiving interaction instructions in the form of body actions;

a touch screen, for receiving interaction instructions in the form of actions;

a button, for receiving interaction instructions in the form of key presses.

In addition, the human-computer interaction device may be any device with information processing capability, such as a robot, a mobile phone, or a computer. The processor 51 may also be called a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.

The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, the device embodiments are described relatively simply because they are substantially similar to the method embodiments, and the relevant parts can be found in the description of the method embodiments. The device embodiments described above are merely illustrative; units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.

The embodiments of the present invention have been described in detail above, and specific implementations have been used herein to explain the present invention; the description of the above embodiments is only intended to help understand the method and device of the present invention. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementations and application scope based on the ideas of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (20)

  1. A human-computer interaction method for multi-person participation, wherein the method comprises:
    receiving a current interaction instruction of a user;
    analyzing the current interaction instruction to obtain a user interaction intention;
    determining a priority of the interaction response;
    responding to the user interaction according to the priority of the interaction response and the user interaction intention.
  2. The method according to claim 1, wherein the interaction instruction is information in any one or more of the following forms: voice, action, button.
  3. The method according to claim 1, wherein, when the interaction instruction contains voice information, analyzing the current interaction instruction to obtain the user interaction intention comprises:
    performing voice recognition on the current voice information to obtain a recognition result;
    performing semantic understanding according to the recognition result and stored historical data to obtain the user interaction intention.
  4. The method according to claim 3, wherein analyzing the current interaction instruction to obtain the user interaction intention further comprises:
    determining, according to a pre-built association judgment model, the historical data associated with the current interaction instruction;
    and performing semantic understanding according to the recognition result and the stored historical data to obtain the user interaction intention comprises:
    performing semantic understanding according to the recognition result and the historical data information associated with the current interaction instruction, to obtain the user interaction intention.
  5. The method according to claim 3, wherein the method further comprises:
    determining the user information corresponding to the current interaction instruction, and identifying the user ID corresponding to the historical data;
    and performing semantic understanding according to the recognition result and the stored historical data to obtain the user interaction intention comprises:
    performing semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intention.
  6. The method according to claim 1, wherein determining the priority of the interaction response comprises:
    determining an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;
    determining the priority of the interaction response according to the interaction priority feature.
  7. The method according to claim 5, wherein determining the priority of the interaction response comprises:
    determining a user priority feature S_u according to the user information;
    determining an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;
    determining the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.
  8. The method according to claim 1, wherein responding to the user interaction according to the priority of the interaction response and the user interaction intention comprises:
    if the priority of the interaction response is greater than a set threshold, interrupting the response to other interaction instructions and responding to the current interaction instruction;
    after the response to the current interaction instruction is completed, re-executing the interrupted task.
  9. The method according to claim 1, wherein responding to the user interaction according to the priority of the interaction response and the user interaction intention comprises:
    determining a response policy according to the priority of the interaction response and a pre-built policy model;
    responding to the current interaction instruction according to the response policy.
  10. A human-computer interaction device for multi-person participation, wherein the device comprises:
    an interaction information receiving module, configured to receive a current interaction instruction of a user;
    an instruction parsing module, configured to analyze the current interaction instruction to obtain a user interaction intention;
    a priority analysis module, configured to determine a priority of the interaction response;
    a response module, configured to respond to the user interaction according to the priority of the interaction response and the user interaction intention.
  11. The device according to claim 10, wherein, when the interaction instruction contains voice information, the instruction parsing module comprises:
    a voice recognition unit, configured to perform voice recognition on the current voice information to obtain a recognition result;
    a semantic understanding unit, configured to perform semantic understanding according to the recognition result and stored historical data to obtain the user interaction intention.
  12. The device according to claim 11, wherein the instruction parsing module further comprises:
    an association judgment unit, configured to determine, according to a pre-built association judgment model, the historical data associated with the current interaction instruction;
    the semantic understanding unit performs semantic understanding according to the recognition result and the historical data information associated with the current interaction instruction, to obtain the user interaction intention.
  13. The device according to claim 11, wherein the device further comprises:
    a user information acquiring module, configured to determine the user information corresponding to the current interaction instruction;
    the voice recognition unit is further configured to identify, according to the user information determined by the user information acquiring module, the user ID corresponding to the historical data;
    the semantic understanding unit performs semantic understanding according to the recognition result, the historical data corresponding to the user ID of the current interaction instruction, and the historical data corresponding to other user IDs, to obtain the user interaction intention.
  14. The device according to claim 10, wherein
    the priority analysis module is specifically configured to determine an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb; and to determine the priority of the interaction response according to the interaction priority feature.
  15. The device according to claim 13, wherein the priority analysis module comprises:
    a user analysis unit, configured to determine a user priority feature S_u according to the user information;
    an instruction analysis unit, configured to determine an interaction priority feature according to the user interaction intention, the interaction priority feature including any one or more of the following: the urgency of the interaction request S_e, the service timeliness S_t, and the interference with the current service C_disturb;
    a priority determining unit, configured to determine the priority of the interaction response according to the user priority feature S_u, or according to the user priority feature S_u and the interaction priority feature.
  16. The device according to claim 10, wherein the response module is further configured to judge whether the priority of the interaction response is greater than a set threshold, and if so, interrupt the response to other interaction instructions and respond to the current interaction instruction; after the response to the current interaction instruction is completed, the interrupted task is re-executed.
  17. The device according to claim 10, wherein the response module is specifically configured to determine a response policy according to the priority of the interaction response and a pre-built policy model, and to respond to the current interaction instruction according to the response policy.
  18. The device according to claim 10, wherein the interaction instruction is information in any one or more of the following forms: voice, action, button.
  19. A human-computer interaction device for multi-person participation, comprising a processor, a memory, and a receiving circuit that are connected to one another;
    the receiving circuit is configured to receive a current interaction instruction of a user and send it to the processor;
    the memory is configured to store program instructions;
    the processor is configured to run the program instructions to perform:
    receiving the current interaction instruction of the user from the receiving circuit;
    analyzing the current interaction instruction to obtain a user interaction intention;
    determining a priority of the interaction response;
    responding to the user interaction according to the priority of the interaction response and the user interaction intention.
  20. The device according to claim 19, wherein the receiving circuit comprises any one or more of the following:
    a microphone, for receiving interaction instructions in the form of voice;
    a sensor, for receiving interaction instructions in the form of body actions;
    a touch screen, for receiving interaction instructions in the form of actions;
    a button, for receiving interaction instructions in the form of key presses.
PCT/CN2018/096706 2017-11-24 2018-07-23 多人参与的人机交互方法及装置 WO2019100738A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711195912.9A CN107831903B (zh) 2017-11-24 2017-11-24 多人参与的人机交互方法及装置
CN201711195912.9 2017-11-24

Publications (1)

Publication Number Publication Date
WO2019100738A1 true WO2019100738A1 (zh) 2019-05-31

Family

ID=61645537

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/096706 WO2019100738A1 (zh) 2017-11-24 2018-07-23 多人参与的人机交互方法及装置

Country Status (2)

Country Link
CN (1) CN107831903B (zh)
WO (1) WO2019100738A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022057870A1 (zh) * 2020-09-17 2022-03-24 华为技术有限公司 人机交互方法、装置和系统

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107831903B (zh) * 2017-11-24 2021-02-02 科大讯飞股份有限公司 多人参与的人机交互方法及装置
CN108536297A (zh) * 2018-03-29 2018-09-14 北京微播视界科技有限公司 用于多人的人机交互应用程序的实现方法和装置
CN108847225B (zh) * 2018-06-04 2021-01-12 上海智蕙林医疗科技有限公司 一种机场多人语音服务的机器人及其方法
CN110619870B (zh) * 2018-06-04 2022-05-06 佛山市顺德区美的电热电器制造有限公司 一种人机对话方法、装置、家用电器和计算机存储介质
CN108920172B (zh) * 2018-06-12 2021-12-14 思必驰科技股份有限公司 用于语音对话平台的程序发布和调用方法及系统
CN110689393B (zh) * 2018-07-06 2022-08-02 阿里巴巴集团控股有限公司 人机交互方法、设备、系统及存储介质
CN109408209A (zh) * 2018-09-27 2019-03-01 北京云迹科技有限公司 任务执行方法和装置
CN109710941A (zh) * 2018-12-29 2019-05-03 上海点融信息科技有限责任公司 基于人工智能的用户意图识别方法和装置
CN111724797A (zh) * 2019-03-22 2020-09-29 比亚迪股份有限公司 基于图像和声纹识别的语音控制方法、系统和车辆
US11580970B2 (en) * 2019-04-05 2023-02-14 Samsung Electronics Co., Ltd. System and method for context-enriched attentive memory network with global and local encoding for dialogue breakdown detection
CN110297544B (zh) * 2019-06-28 2021-08-17 联想(北京)有限公司 输入信息响应方法及装置、计算机系统和可读存储介质
CN111443801B (zh) * 2020-03-25 2023-10-13 北京百度网讯科技有限公司 人机交互方法、装置、设备及存储介质
CN112788004B (zh) * 2020-12-29 2023-05-09 上海掌门科技有限公司 一种通过虚拟会议机器人执行指令的方法、设备与计算机可读介质
CN112650489A (zh) * 2020-12-31 2021-04-13 北京猎户星空科技有限公司 业务控制方法、装置、计算机设备以及存储介质
CN113111066A (zh) * 2021-04-20 2021-07-13 长沙市到家悠享网络科技有限公司 一种数据库操作工单自动上线方法、装置、系统和计算机设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609089A (zh) * 2011-01-13 2012-07-25 微软公司 用于机器人和用户交互的多状态模型
WO2016157662A1 (ja) * 2015-03-31 2016-10-06 ソニー株式会社 情報処理装置、制御方法、およびプログラム
CN106445654A (zh) * 2016-08-31 2017-02-22 北京康力优蓝机器人科技有限公司 确定响应控制命令优先顺序的方法及装置
CN106569613A (zh) * 2016-11-14 2017-04-19 中国电子科技集团公司第二十八研究所 一种多模态人机交互系统及其控制方法
CN107831903A (zh) * 2017-11-24 2018-03-23 科大讯飞股份有限公司 多人参与的人机交互方法及装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685454A (zh) * 2008-09-28 2010-03-31 华为技术有限公司 人机交互方法及系统
CN102938030B (zh) * 2012-09-29 2015-11-18 周万荣 一种应用的使用权限设置及控制限定区域的方法和终端
US20140368537A1 (en) * 2013-06-18 2014-12-18 Tom G. Salter Shared and private holographic objects
CN104572133B (zh) * 2015-02-06 2020-05-08 上海莉莉丝科技股份有限公司 一种用于执行计算任务中多用户的操作的方法与设备
KR20170010494A (ko) * 2015-07-20 2017-02-01 엘지전자 주식회사 이동 단말기 및 그 제어 방법
CN105912128B (zh) * 2016-04-29 2019-05-24 北京光年无限科技有限公司 面向智能机器人的多模态交互数据处理方法及装置
CN107169034B (zh) * 2017-04-19 2020-08-04 畅捷通信息技术股份有限公司 一种多轮人机交互的方法及系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609089A (zh) * 2011-01-13 2012-07-25 微软公司 用于机器人和用户交互的多状态模型
WO2016157662A1 (ja) * 2015-03-31 2016-10-06 ソニー株式会社 情報処理装置、制御方法、およびプログラム
CN106445654A (zh) * 2016-08-31 2017-02-22 北京康力优蓝机器人科技有限公司 确定响应控制命令优先顺序的方法及装置
CN106569613A (zh) * 2016-11-14 2017-04-19 中国电子科技集团公司第二十八研究所 一种多模态人机交互系统及其控制方法
CN107831903A (zh) * 2017-11-24 2018-03-23 科大讯飞股份有限公司 多人参与的人机交互方法及装置

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022057870A1 (zh) * 2020-09-17 2022-03-24 华为技术有限公司 人机交互方法、装置和系统

Also Published As

Publication number Publication date
CN107831903A (zh) 2018-03-23
CN107831903B (zh) 2021-02-02

Similar Documents

Publication Publication Date Title
WO2019100738A1 (zh) 多人参与的人机交互方法及装置
US11238239B2 (en) In-call experience enhancement for assistant systems
KR102214972B1 (ko) 가변 레이턴시 디바이스 조정
US20230386462A1 (en) Reducing the need for manual start/end-pointing and trigger phrases
KR102100742B1 (ko) 디지털 어시스턴트 서비스의 원거리 확장
KR102523982B1 (ko) 자동화된 어시스턴트를 호출하기 위한 다이내믹 및/또는 컨텍스트-특정 핫 워드
KR102444165B1 (ko) 적응적으로 회의를 제공하기 위한 장치 및 방법
KR20200039030A (ko) 디지털 어시스턴트 서비스의 원거리 확장
CN110874716A (zh) 面试测评方法、装置、电子设备及存储介质
JP2023015054A (ja) 自動化アシスタントを呼び出すための動的および/またはコンテキスト固有のホットワード
US20230396848A1 (en) Predictive Media Routing
WO2020147380A1 (zh) 人机交互方法、装置、计算设备及计算机可读存储介质
WO2017166651A1 (zh) 语音识别模型训练方法、说话人类型识别方法及装置
JP2015517709A (ja) コンテキストに基づくメディアを適応配信するシステム
US20220131979A1 (en) Methods and systems for automatic queuing in conference calls
JP2023036574A (ja) 対話推薦方法、モデルの訓練方法、装置、電子機器、記憶媒体ならびにコンピュータプログラム
WO2021185113A1 (zh) 基于多分析任务的数据分析方法及电子设备
WO2016045468A1 (zh) 一种语音输入控制的方法、装置及终端
WO2024012501A1 (zh) 语音处理方法及相关装置、电子设备、存储介质
FI20185605A1 (en) Continuous verification of user identity in clinical trials via audio-based user interface
CN110196900A (zh) 用于终端的交互方法和装置
US11778277B1 (en) Digital item processing for video streams
CN113111197A (zh) 多媒体内容的推荐方法、装置、设备及存储介质
US20200089780A1 (en) Query-answering source for a user query
JP5850886B2 (ja) 情報処理装置及び方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18880514

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18880514

Country of ref document: EP

Kind code of ref document: A1