WO2021120631A1 - Intelligent interaction method and apparatus, and electronic device and storage medium - Google Patents

Intelligent interaction method and apparatus, and electronic device and storage medium

Info

Publication number
WO2021120631A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
intention
level
information
slot
Prior art date
Application number
PCT/CN2020/105636
Other languages
French (fr)
Chinese (zh)
Inventor
刘璐
臧磊
Original Assignee
深圳壹账通智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021120631A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Abstract

An intelligent interaction method and apparatus, and an electronic device and a storage medium. The intelligent interaction method comprises: an intelligent voice assistant acquiring sound information of a user (S1); verifying the identity of the user according to the sound information (S2); after the identity verification of the user is passed, the intelligent voice assistant starting an open domain dialogue, and identifying a user intention according to the open domain dialogue (S3); determining a service level according to the user intention (S4); conducting a closed domain dialogue according to the service level, and identifying key information in the closed domain dialogue (S5); acquiring a slot position value according to the key information, and filling a slot position (S6); and when the filled slot position meets a threshold value, executing an operation corresponding to the user intention (S7). In the method, a secure dialogue with a user can be conducted by means of an intelligent voice assistant, and an operation is executed after a dialogue intention is identified.

Description

Intelligent interaction method, apparatus, electronic device and storage medium
This application claims priority to Chinese patent application No. 201911319401.2, filed with the Chinese Patent Office on December 19, 2019 and entitled "Intelligent interaction method, apparatus, electronic device and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of computer technology, and in particular to an intelligent interaction method, apparatus, electronic device and storage medium.
Background
With the development of the artificial intelligence industry, intelligent voice assistants have become a relatively mature application area of artificial intelligence systems. In the prior art, intelligent voice assistants are usually deployed on mobile terminals: the user interacts by voice with the machine assistant through the terminal's voice assistant function, so that the assistant can carry out various operations on the mobile terminal under the user's voice control. However, the inventors realized that existing intelligent voice assistants recognize intent with low accuracy, which makes human-computer interaction less fluent.
Summary of the invention
In view of the above, it is necessary to provide an intelligent interaction method, apparatus, electronic device and storage medium that allow the intelligent voice assistant to hold a secure dialogue with the user and to perform an operation after accurately identifying the intention of the dialogue.
A first aspect of this application provides an intelligent interaction method, the method comprising:
an intelligent voice assistant acquiring voice information of a user;
verifying the user's identity according to the voice information;
after the user's identity is verified, the intelligent voice assistant starting an open-domain dialogue and identifying a user intention according to the open-domain dialogue;
determining a service level according to the user intention;
conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
obtaining a slot value according to the key information and filling a slot; and
when the filled slots meet a threshold, performing the operation corresponding to the user intention.
A second aspect of this application provides an intelligent interaction apparatus, the apparatus comprising:
an acquisition module, configured to acquire voice information of a user through an intelligent voice assistant;
a verification module, configured to verify the user's identity according to the voice information;
an identification module, configured to, after the user's identity is verified, have the intelligent voice assistant start an open-domain dialogue and identify a user intention according to the open-domain dialogue;
a determination module, configured to determine a service level according to the user intention;
the identification module being further configured to conduct a closed-domain dialogue according to the service level and identify key information in the closed-domain dialogue;
the acquisition module being further configured to obtain a slot value according to the key information and fill a slot; and
an execution module, configured to perform the operation corresponding to the user intention when the filled slots meet a threshold.
A third aspect of this application provides an electronic device, the electronic device comprising a processor configured to execute computer-readable instructions stored in a memory to implement the following steps:
an intelligent voice assistant acquiring voice information of a user;
verifying the user's identity according to the voice information;
after the user's identity is verified, the intelligent voice assistant starting an open-domain dialogue and identifying a user intention according to the open-domain dialogue;
determining a service level according to the user intention;
conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
obtaining a slot value according to the key information and filling a slot; and
when the filled slots meet a threshold, performing the operation corresponding to the user intention.
A fourth aspect of this application provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps:
an intelligent voice assistant acquiring voice information of a user;
verifying the user's identity according to the voice information;
after the user's identity is verified, the intelligent voice assistant starting an open-domain dialogue and identifying a user intention according to the open-domain dialogue;
determining a service level according to the user intention;
conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
obtaining a slot value according to the key information and filling a slot; and
when the filled slots meet a threshold, performing the operation corresponding to the user intention.
In summary, the intelligent interaction method, apparatus, electronic device and storage medium described in this application relate to the field of artificial intelligence. After the user's identity is verified, the intelligent voice assistant starts an open-domain dialogue, identifies the user intention according to the open-domain dialogue, and determines the service level according to the user intention. The voice assistant then conducts a closed-domain dialogue according to the service level, identifies key information in the closed-domain dialogue, obtains slot values according to the key information and fills the slots, and, when the filled slots meet the threshold, performs the operation corresponding to the user intention. This application can accurately identify the user intention and, after entering the closed-domain dialogue, enters the first-level service interface according to the user intention and conducts a question-and-answer exchange within that interface, so that tasks are executed more intelligently and human-computer communication is more interactive.
In addition, this application can handle cases where the voice instruction corresponding to the user intention contains multiple services at the same level or multiple services at different levels, and can guide the user when the user's expression is unclear until the entire closed-loop operation is completed.
Brief description of the drawings
Fig. 1 is a flowchart of the intelligent interaction method provided in Embodiment 1 of this application.
Fig. 2 is a functional module diagram of the intelligent interaction apparatus provided in Embodiment 2 of this application.
Fig. 3 is a schematic diagram of the electronic device provided in Embodiment 3 of this application.
The following detailed description further explains this application with reference to the above drawings.
Detailed description
To make the above objectives, features and advantages of this application clearer, the application is described in detail below with reference to the drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of this application. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative effort fall within the scope of protection of this application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by a person skilled in the technical field of this application. The terms used in the specification of this application are only for the purpose of describing specific embodiments and are not intended to limit the application.
Embodiment 1
Fig. 1 is a flowchart of the intelligent interaction method provided in Embodiment 1 of this application.
In this embodiment, the intelligent interaction method can be applied in an electronic device. For an electronic device that requires intelligent interaction, the intelligent interaction function provided by the method of this application can be integrated directly on the device, or a client implementing the method can be installed on it. Alternatively, the method can run on a server or similar device in the form of a software development kit (SDK) that exposes an interface for the intelligent interaction function; the electronic device or other devices then obtain the intelligent interaction function through that interface.
Fig. 1 shows the flowchart of the intelligent interaction method. Depending on requirements, the order of execution in the flowchart may be changed and some steps may be omitted.
Step S1: the intelligent voice assistant acquires the user's voice information.
In this embodiment, the intelligent interaction method is applied in an intelligent voice assistant, which may be a bank's intelligent voice assistant. A user handling business at a bank can interact with the bank's intelligent voice assistant directly. The assistant receives the user's speech through a microphone, so that it can identify the user and process banking business according to the user intention.
For example, when the user wants to query an account balance, the user first wakes up the intelligent voice assistant, and the user's voice information is acquired at the moment the assistant is woken up.
Step S2: verify the user's identity according to the voice information.
In this embodiment, voiceprint features are extracted from the voice information and matched against a pre-built voiceprint model. When the extracted voiceprint features match the pre-built voiceprint model, the user's identity verification is confirmed as passed; when they do not match, the verification is confirmed as failed.
Specifically, verifying the user's identity according to the voice information comprises two stages. In the voiceprint registration stage, voice samples of the user are input into the system, the Mel-frequency cepstral coefficients (MFCC) of the user's voice information are extracted, and a ResNet+GhostVLAD network is trained end to end to obtain the voiceprint features in the user's voice information and build the user's voiceprint model. In the voiceprint authentication stage, when the user wakes up the intelligent voice assistant in the near field with a wake-up word, the assistant acquires the user's voice information, extracts the voiceprint features from it, and matches them against the built voiceprint model to verify the user's identity. When the extracted voiceprint features match the built voiceprint model, the user is confirmed to be a legitimate user; when they do not match, the user is confirmed to be illegitimate.
In another embodiment, in the voiceprint registration stage, voice samples of the user are input into the system, the spectrum of the user's voice signal is extracted by short-time Fourier transform, and a ResNet+GhostVLAD network is then trained end to end to obtain the voiceprint features in the user's voice signal and build the user's voiceprint model.
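The short-time Fourier transform mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library so that it runs anywhere; a real system would more likely rely on an optimized numerical library. The function name `stft_magnitude`, the Hann window, and the frame and hop sizes are assumptions made for illustration and are not taken from the application.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum (frame_len // 2 + 1 bins) per frame."""
    # Hann window to reduce spectral leakage at the frame edges (assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT of the windowed frame; only non-negative frequencies are kept.
        spectra.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ])
    return spectra


# A pure tone with exactly 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spectrogram = stft_magnitude(tone)
```

Each row of `spectrogram` is the spectrum of one overlapping frame; a sequence of such rows is the kind of time-frequency input that a network such as the one named above could be trained on.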
For example, the user may speak the wake-up word and the ten digits 0-9 in the near field as voice samples, and the voiceprint features extracted from these samples are used to build the user's voiceprint model in the voiceprint registration stage. During identity verification, the user can then be asked to pronounce digits specified by the intelligent voice assistant in order to determine whether the user is legitimate. This effectively improves authentication accuracy and also guards against spoofing with a prior recording, which improves security.
Preferably, the method further comprises: when the number of times the extracted voiceprint features fail to match the built voiceprint model is greater than or equal to a preset number (for example, 3), enabling a password verification function.
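The matching and fallback logic described above can be sketched as follows. The application does not say how a voiceprint feature is compared with the voiceprint model, so cosine similarity against a single enrolled vector and the 0.8 threshold are assumptions; the class and method names are hypothetical. Only the preset failure count of 3 comes from the text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class VoiceprintVerifier:
    """Hypothetical verifier: matches embeddings and counts failed attempts."""

    def __init__(self, enrolled, threshold=0.8, max_failures=3):
        self.enrolled = enrolled          # voiceprint built at registration
        self.threshold = threshold        # assumed similarity cut-off
        self.max_failures = max_failures  # preset number of failures (e.g. 3)
        self.failures = 0

    def verify(self, embedding):
        """Return 'passed', 'failed', or 'password_required'."""
        if cosine_similarity(embedding, self.enrolled) >= self.threshold:
            self.failures = 0
            return "passed"
        self.failures += 1
        if self.failures >= self.max_failures:
            # Too many mismatches: switch to password verification.
            return "password_required"
        return "failed"
```

A caller would keep one verifier per session and switch to the password flow as soon as `verify` returns `"password_required"`.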
Step S3: after the user's identity is verified, the intelligent voice assistant starts an open-domain dialogue and identifies the user intention according to the open-domain dialogue.
In this embodiment, after the user's identity is verified, the intelligent voice assistant starts an open-domain dialogue, converts the voice information in the open-domain dialogue into text, and then performs intent recognition.
In this embodiment, a joint intent recognition and slot filling model may be used to identify the user intention in the open-domain dialogue. Specifically, the joint model comprises three layers. The first layer is a one-hot encoding of the question text. The second layer is a network combining a BLSTM and a CNN, used to learn a shared representation of semantic information and intent information. The third layer is a CRF layer that decodes the shared representation, and a unified loss function is used to learn the intent recognition task and the slot filling task jointly. The joint model one-hot encodes the question in the open-domain dialogue to obtain a sentence vector, feeds the sentence vector into the BLSTM to obtain a new sequence vector representation, processes that representation with the CNN to obtain a feature vector, and concatenates the feature vector with the sequence vector to obtain an output vector. The output vector is fed to the CRF layer, which jointly decodes the best label sequence. The label sequence is expressed by associating each character w_t of the question u with a BIO label, where B, I and O denote begin, inside (continuation) and outside (other) respectively. The input X is denoted w_1, w_2, …, w_n and the output labels Y are denoted s_1, s_2, …, s_n. For the joint model, an extra token is appended to the end of the input question and, correspondingly, an intent label is appended to the end of the output labels, giving new input and output sequences. The last hidden layer of the model contains a latent semantic representation of the entire input question, which is used to identify the intent of the question.
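To make the output format concrete, the sketch below decodes a joint label sequence of the kind described above: one BIO label per token, followed by an intent label aligned with the appended end token. It covers only the post-processing of the labels, not the BLSTM, CNN or CRF layers. The function name, the `<EOS>` token, the word-level tokens and the label names (`product`, `item`, `credit_card_query`) are hypothetical.

```python
def decode_joint_output(tokens, labels):
    """Split a joint label sequence into (intent, slots).

    `labels` holds one BIO label per token, with the intent label in the
    position of the end token appended to the input.
    """
    *slot_labels, intent = labels
    words = tokens[:-1]  # drop the appended end token
    slots, name, parts = [], None, []
    for word, label in zip(words, slot_labels):
        if label.startswith("B-"):
            if name:
                slots.append((name, " ".join(parts)))
            name, parts = label[2:], [word]   # begin a new slot
        elif label.startswith("I-") and name == label[2:]:
            parts.append(word)                # continue the current slot
        else:
            if name:
                slots.append((name, " ".join(parts)))
            name, parts = None, []            # outside any slot
    if name:
        slots.append((name, " ".join(parts)))
    return intent, slots


tokens = ["check", "my", "credit", "card", "bill", "<EOS>"]
labels = ["O", "O", "B-product", "I-product", "B-item", "credit_card_query"]
intent, slots = decode_joint_output(tokens, labels)
```

Here `intent` is `"credit_card_query"` and `slots` is `[("product", "credit card"), ("item", "bill")]`. For character-level Chinese input the parts would be joined without a space.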
In other embodiments, the user intention in the open-domain dialogue may be identified by one or a combination of the following: intent recognition based on rule templates, intent recognition based on statistical feature classification, intent recognition based on word vectors, intent recognition based on convolutional neural networks, and similar methods, which are not described further here.
Step S4: determine the service level according to the user intention.
In this embodiment, the service level is determined by querying a pre-established association table of intents and service levels. The correspondence between intents and service levels in the table can be established from the business logic and the knowledge base of the application field. For example, in banking applications, the correspondence can be established from the business logic and the knowledge base of the banking field.
For example, first-level services include the credit card service, the payment service and the loan service. The second-level services of the credit card service include the consumption bill, the repayment amount and the repayment date; the second-level services of the payment service include paying electricity bills, gas bills and phone bills; and the second-level services of the loan service include fast credit, cash credit and smart credit.
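The association table can be pictured as a simple lookup structure. The sketch below stores the banking example from the text as a dictionary; the layout, the English service names used as keys and the function name `resolve_service` are assumptions for illustration.

```python
# First-level services mapped to their second-level services (from the example above).
SERVICE_TABLE = {
    "credit card": ["consumption bill", "repayment amount", "repayment date"],
    "payment": ["electricity bill", "gas bill", "phone bill"],
    "loan": ["fast credit", "cash credit", "smart credit"],
}


def resolve_service(intent):
    """Return (first_level, second_level) for a recognized intent.

    second_level is None when the intent names only a first-level service.
    """
    if intent in SERVICE_TABLE:
        return intent, None
    for first_level, second_levels in SERVICE_TABLE.items():
        if intent in second_levels:
            return first_level, intent
    raise KeyError(f"no service registered for intent {intent!r}")
```

With this table, `resolve_service("gas bill")` yields `("payment", "gas bill")`, while `resolve_service("loan")` yields `("loan", None)`, signalling that the second-level service still has to be clarified.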
In this embodiment, the first-level service is strongly correlated with the user intention.
Step S5: conduct a closed-domain dialogue according to the service level, and identify key information in the closed-domain dialogue.
In this embodiment, a closed-domain dialogue is a dialogue conducted after the user intention has been identified in order to clarify the user's purpose (in other words, to clarify the details of the task). The key information is the information extracted from the lower-level services during the closed-domain dialogue. For example, if only first-level service information has been received, the voice assistant announces the second-level services that correspond to that first-level service, prompting the user to specify which second-level service is to be performed.
For example, when the voice assistant has received the first-level service "credit card" but no information about a second-level service of the credit card service, it issues a voice prompt such as "Would you like to view your consumption bill?", "Would you like to check your repayment amount?" or "Would you like to check your repayment date?". When the user replies to the prompt, the intelligent voice assistant determines the second-level service from the reply. In this way, the assistant enters the first-level service interface according to the user intention and conducts a question-and-answer exchange within it to obtain the second-level service the user wants performed, which makes task execution more intelligent.
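A minimal sketch of this clarification turn follows. It builds a single prompt from the second-level services of the credit card example and picks the service mentioned in the reply. Plain substring matching stands in for the key information recognition, which the application does not specify at this level; the names `SECOND_LEVEL`, `build_prompt` and `match_reply` are hypothetical.

```python
SECOND_LEVEL = {
    "credit card": ["consumption bill", "repayment amount", "repayment date"],
}


def build_prompt(first_level):
    """Compose one question listing the second-level services on offer."""
    options = SECOND_LEVEL[first_level]
    return ("Would you like to check the "
            + ", ".join(options[:-1]) + " or " + options[-1] + "?")


def match_reply(first_level, reply):
    """Return the first second-level service mentioned in the reply, or None."""
    text = reply.lower()
    for option in SECOND_LEVEL[first_level]:
        if option in text:
            return option
    return None
```

When `match_reply` returns `None`, the assistant would repeat or rephrase the prompt, which corresponds to guiding the user until the task details are clear.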
Step S6: obtain the slot value according to the key information and fill the slot.
Slot filling is the process of completing information so that the user intention can be turned into an explicit instruction. In this embodiment, slot values are obtained from the key information and the slots are then filled with those values. For example, when the text corresponding to the voice information collected by the voice assistant is "view the consumption bill of my credit card", the key information obtained is: me, credit card, and consumption bill. The intelligent voice assistant then obtains slot values from this key information and fills the slots.
Step S7: when the filled slots meet the threshold, perform the operation corresponding to the user intention.
In this embodiment, when the filled slots meet the threshold, the user intention is converted into a voice instruction and the intelligent voice assistant performs the operation according to that instruction. The threshold depends on the user intention. For example, when the user intention is a transfer, two parameters are required: the account number and the amount of the transfer. The corresponding threshold is therefore two filled slots. If either of the two slots has not been filled, the operation corresponding to the user intention cannot be performed.
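The threshold check for the transfer example can be sketched as below. The reading that the threshold is met when every required slot of the intent is filled follows the example in the text; the table `REQUIRED_SLOTS`, the slot names and the function names are hypothetical.

```python
# Slots that must be filled before an intent may be executed (assumed layout).
REQUIRED_SLOTS = {
    "transfer": ["account_number", "amount"],
}


def missing_slots(intent, filled):
    """List the required slots of `intent` that have no value yet."""
    return [name for name in REQUIRED_SLOTS[intent] if not filled.get(name)]


def ready_to_execute(intent, filled):
    """True when every required slot for the intent has been filled."""
    return not missing_slots(intent, filled)
```

The list returned by `missing_slots` also tells the assistant what to ask for next in the closed-domain dialogue.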
For example, when the text corresponding to the voice information collected by the intelligent voice assistant is "view my credit card", the user intention is identified as: credit card. The first-level service corresponding to the credit card is the credit card service. The intelligent voice assistant therefore enters the closed domain of the credit card service to conduct a dialogue, extracts slot information according to the slots, and calls the target interface. For example, it issues the voice prompt "Would you like to check your credit card consumption bill, repayment amount or repayment date, or whether anything is overdue?". When the voice assistant receives the user's reply "repayment amount", it queries the user's credit card consumption and answers the user according to the result, for example by announcing "You should repay 2033 yuan this month".
In addition, the intelligent voice assistant can call the target service interface directly to obtain information or perform an operation. For example, when the user's voice information is "check how much I need to repay on my credit card this month", the intelligent voice assistant directly calls the repayment balance interface among the second-level service interfaces to obtain the balance information, and then announces "You should repay 2033 yuan this month".
Preferably, before performing the operation corresponding to the user intention, the intelligent voice assistant issues a prompt for the user to confirm. For example, the assistant plays the task back to the user by voice before execution and performs the operation corresponding to the user intention only after receiving the user's confirmation. Whether execution succeeds or fails, the intelligent voice assistant reports the result back to the user.
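The confirm-then-execute behaviour can be sketched with two injected callables, so the example stays independent of any speech or banking interface. `ask_user` stands for playing a prompt and returning the user's reply, and `execute` for the call to the target service interface; both, along with the accepted confirmation words, are assumptions.

```python
def confirm_and_execute(task_description, ask_user, execute):
    """Ask for confirmation, run the task, and always report the outcome."""
    reply = ask_user(f"About to {task_description}. Do you confirm?")
    if reply.strip().lower() not in {"yes", "confirm", "ok"}:
        return "Cancelled: the operation was not confirmed."
    try:
        result = execute()
    except Exception as error:
        # Failure is reported to the user as well, as the text requires.
        return f"Failed: {error}"
    return f"Done: {result}"
```

Returning a message in every branch mirrors the requirement that the result is fed back to the user whether or not execution succeeds.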
Preferably, the intelligent voice assistant stores information authorized by the user and performs the corresponding operation according to the authorized information and the recognized user intention. Specifically: receiving information authorized by the user and storing the authorized information, where the authorized information includes account information (for example, a gas account); after the service level is determined according to the user's intention, conducting a closed-domain dialogue according to the service level and identifying the key information in the closed-domain dialogue; obtaining slot values according to the authorized information and the key information and filling the slots; and when the filled slots meet the threshold, performing the operation corresponding to the user's intention.
For example, when the intelligent voice assistant has stored the user's authorization to help family members pay gas bills, it retains that information: it remembers the authorization defined by the user and does not need to ask repeatedly. When the user says "Help my mother-in-law pay the gas bill", the intelligent voice assistant recognizes the intention: payment. According to the user's intention, the second-level service is determined to be: paying the gas bill. The key information identified in the closed-domain dialogue is "my mother-in-law (that is, the user to be paid for), pay the gas bill". The intelligent voice assistant can look up the mother-in-law's gas account from memory and pay her gas bill directly in the application.
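The use of stored authorization to fill slots can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the store layout, the slot names (`payee`, `fee_type`, `account`), and the placeholder account number are all assumptions introduced here.

```python
# Hypothetical store of user-authorized information: payee -> {fee type -> account}.
# The account number is a placeholder, not real data.
AUTHORIZED_ACCOUNTS = {
    "mother-in-law": {"gas": "GAS-0001"},
}

def fill_payment_slots(key_info, authorized=AUTHORIZED_ACCOUNTS):
    """Fill slots from dialogue key information plus stored authorization."""
    slots = {
        "payee": key_info.get("payee"),
        "fee_type": key_info.get("fee_type"),
        "account": None,
    }
    # Look up the account from memory instead of asking the user again.
    accounts = authorized.get(slots["payee"], {})
    slots["account"] = accounts.get(slots["fee_type"])
    return slots

slots = fill_payment_slots({"payee": "mother-in-law", "fee_type": "gas"})
print(slots)
```

If no authorization is stored for the payee, the `account` slot stays empty and the assistant would have to ask for it.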
Preferably, when the voice command corresponding to the user's intention includes multiple services of the same level, the execution order of those services is determined through the closed-domain dialogue, and the corresponding operations are performed in that order.
When the voice command corresponding to the user's intention includes two services of the same level, it is necessary to clarify which service interface the user actually wants to operate on. For example, when the user says "Help me check the credit card service and the loan service", the intelligent voice assistant prompts the user "Would you like to check the credit card service or the loan service first?" After learning from the user's reply that the credit card service should be checked first and then the loan service, the intelligent voice assistant first performs the credit card service query and then the loan service query.
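One simple way to derive an execution order from the user's clarifying reply is sketched below. The rule used here (order services by where they are first mentioned in the reply) is an assumption made for illustration; the application does not specify how the order is extracted from the dialogue.

```python
def order_parallel_services(requested, user_reply):
    """Order same-level services by their first mention in the user's reply.

    Services not mentioned in the reply keep their relative order at the end,
    because sorted() is stable.
    """
    def position(service):
        index = user_reply.find(service)
        return index if index >= 0 else len(user_reply)
    return sorted(requested, key=position)

requested = ["loan", "credit card"]
reply = "check the credit card first, then the loan"
print(order_parallel_services(requested, reply))
```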
Preferably, when the voice command corresponding to the user's intention includes multiple services of different levels, the user is prompted with the upper-level service corresponding to the lowest-level service among them, and all lower-level services contained in that upper-level service are then offered for the user to choose from. Specifically, when the voice command corresponding to the user's intention includes multiple services of different levels, the lowest-level service among them is identified according to the intention-to-service-level association table; the upper-level service corresponding to that lowest-level service is looked up; and all lower-level services contained in that upper-level service are presented for the user to choose from.
For example, when the voice command corresponding to the user's intention includes two services, one upper-level and one lower-level, the upper-level service of the two is identified, a voice prompt is issued to clarify to the user which second-level services that upper-level service contains, and the user's service need is confirmed again through the closed-domain dialogue. When the user says "Help me check smart credit under the credit card service", the intelligent voice assistant prompts the user "Do you want to check smart credit under the loan service? There is no smart loan service under the credit card service". When the user confirms the loan service query, the intelligent voice assistant then presents all lower-level services under the loan service (such as fast credit, smart credit, and cash credit) for the user to choose from.
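The lookup described above (find the lowest-level service, find its actual upper-level service, list that service's children) might look like the sketch below. The two-level `SERVICE_TREE` uses the service names from this description; the function names and data layout are assumptions for illustration only.

```python
# First-level service -> second-level services (names taken from the description).
SERVICE_TREE = {
    "credit card": ["consumption bill", "repayment amount", "repayment date"],
    "payment": ["electricity bill", "gas bill", "phone bill"],
    "loan": ["fast credit", "cash credit", "smart credit"],
}

def parent_of(service):
    """Return the first-level service that contains `service`, or None."""
    for primary, children in SERVICE_TREE.items():
        if service in children:
            return primary
    return None

def resolve_mixed_levels(mentioned):
    """Return (true parent of the lowest-level service, that parent's children)."""
    lowest = next((s for s in mentioned if parent_of(s) is not None), None)
    if lowest is None:
        return None, []
    parent = parent_of(lowest)
    return parent, SERVICE_TREE[parent]

# "smart credit under the credit card service" resolves to the loan service.
print(resolve_mixed_levels(["credit card", "smart credit"]))
```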
Preferably, when the filled slots do not meet the threshold, the intelligent voice assistant issues a voice prompt according to the missing slot values; when multiple slot values are missing, the intelligent voice assistant prompts for them in order and fills the missing slot values in order according to the user's replies; it then starts the task corresponding to the filled slots to perform the operation corresponding to the user's intention. In this way, when the filled slots do not meet the threshold, targeted questions can be asked about the missing slot values. When multiple slots need clarification, the questions must be asked in order, to ensure that the user's true slot values are obtained and that the intelligent voice assistant can start the task corresponding to the slots.
For example, when the user says "I want to make a payment", it is unclear which fee is to be paid and for whom, so the filled slots do not meet the threshold. The intelligent voice assistant then issues voice prompts according to the missing slot values, for example "Which fee would you like to pay?" and "Who are you paying for?" Because two slot values are currently missing, the intelligent voice assistant prompts in order. For example, it asks "Which fee would you like to pay?" and, on receiving the reply "the gas bill", fills the gas bill into the missing slot value; it then asks "Who are you paying for?" and, when the user replies "for my mother-in-law", fills "my mother-in-law" into the missing slot value; it then starts the task corresponding to the filled slots (that is, paying the gas bill for the mother-in-law), looks up the mother-in-law's gas account, and pays her gas bill directly in the application.
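The ordered prompting in this example can be sketched as a small loop. The slot names, the prompt wording, and the `ask` callback (which stands in for speech output plus recognition of the reply) are assumptions for illustration.

```python
# Hypothetical prompts, one per slot.
PROMPTS = {
    "fee_type": "Which fee would you like to pay?",
    "payee": "Who are you paying for?",
}

def fill_missing_slots(slots, ask, order=("fee_type", "payee")):
    """Prompt for each missing slot in a fixed order and fill it with the reply."""
    for name in order:
        if slots.get(name) is None:
            slots[name] = ask(PROMPTS[name])
    return slots

# Simulated user replies in place of real speech recognition.
replies = iter(["gas bill", "mother-in-law"])
filled = fill_missing_slots({"fee_type": None, "payee": None}, lambda prompt: next(replies))
print(filled)
```

Slots that already hold a value are skipped, so the assistant only asks about what is actually missing.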
In summary, the intelligent interaction method provided by this application includes: the intelligent voice assistant obtaining the user's voice information; verifying the user's identity according to the voice information; after the user's identity is verified, the intelligent voice assistant starting an open-domain dialogue and recognizing the user's intention from the open-domain dialogue; determining the service level according to the user's intention; conducting a closed-domain dialogue according to the service level and identifying the key information in the closed-domain dialogue; obtaining slot values according to the key information and filling the slots; and, when the filled slots meet the threshold, performing the operation corresponding to the user's intention. This application incorporates a user voiceprint recognition system: the bank's intelligent voice assistant obtains the user's voice information at the moment it is woken, and determines the user's identity after extracting the voiceprint features from the voice. When the user controls operations by voice, only voice verification is needed and no additional verification operation is required, which simplifies the operation flow and improves security. This application can accurately recognize the user's intention and, after entering the closed-domain dialogue, enters the first-level service interface according to the user's intention and conducts question-and-answer exchanges within that interface, so that tasks are performed more intelligently and human-computer communication is more interactive. In addition, this application can handle cases where the voice command corresponding to the user's intention includes multiple services of the same level or multiple services of different levels, and can guide the user when the user's expression is unclear until the whole closed-loop operation is completed.
The above are only specific implementations of this application, but the scope of protection of this application is not limited to them. Those of ordinary skill in the art may make improvements without departing from the inventive concept of this application, and all such improvements fall within the scope of protection of this application.
The functional modules and hardware structure of the electronic device that implements the above intelligent interaction method are described below with reference to FIG. 2 and FIG. 3, respectively.
Embodiment Two
FIG. 2 is a functional module diagram of a preferred embodiment of the intelligent interaction device of this application.
In some embodiments, the intelligent interaction device 20 runs in an electronic device. The intelligent interaction device 20 may include multiple functional modules composed of program code segments. The program code of each program segment in the intelligent interaction device 20 may be stored in a memory and executed by at least one processor to perform the intelligent interaction function.
In this embodiment, the intelligent interaction device 20 may be divided into multiple functional modules according to the functions it performs. The functional modules may include: an acquisition module 201, a verification module 202, a recognition module 203, a determination module 204, and an execution module 205. A module as referred to in this application is a series of computer program segments that is stored in a memory, can be executed by at least one processor, and performs a fixed function. In some embodiments, the functions of each module are detailed in the embodiments that follow.
The acquisition module 201 is configured to obtain the user's voice information through the intelligent voice assistant.
In this embodiment, the intelligent voice assistant may be a bank's intelligent voice assistant. When handling business at the bank, the user can interact directly with the bank's intelligent voice assistant. The intelligent voice assistant receives the user's speech through a microphone, so that it can identify the user and process banking services according to the user's intention.
For example, when the user needs to query an account balance, the user can first wake the intelligent voice assistant, and the user's voice information is obtained at the moment the intelligent voice assistant is woken.
The verification module 202 is configured to verify the user's identity according to the voice information.
In this embodiment, the verification module 202 is configured to: extract the voiceprint features from the voice information; match the extracted voiceprint features against a pre-built voiceprint model; when the extracted voiceprint features match the pre-built voiceprint model, confirm that user identity verification has passed; and when they do not match, confirm that user identity verification has failed.
Specifically, verifying the user's identity according to the voice information includes two stages. In the voiceprint registration stage, user voice samples are input to the system, the Mel-frequency cepstral coefficients (MFCC) of the user's voice information are extracted, and a ResNet+GhostVLAD network is trained end to end to obtain the voiceprint features of the user's voice information and build the user's voiceprint model. In the voiceprint authentication stage, when the user wakes the intelligent voice assistant with a wake word in the near field, the intelligent voice assistant obtains the user's voice information, extracts the voiceprint features from it, and matches the extracted voiceprint features against the built voiceprint model to verify the user's identity. When the extracted voiceprint features match the built voiceprint model, the user is confirmed to be a legitimate user; when they do not match, the user is confirmed not to be a legitimate user.
In another embodiment, in the voiceprint registration stage, user voice samples may be input to the system, the spectrum of the user's voice signal extracted by short-time Fourier transform, and a ResNet+GhostVLAD network trained end to end to obtain the voiceprint features in the user's voice signal and build the user's voiceprint model.
For example, the user may use the wake word and the ten digits 0-9 as voice samples in the near field; by extracting the voiceprint features from these samples, the user's voiceprint model can be built in the voiceprint registration stage. During identity verification, the user can then be asked to pronounce digits specified by the intelligent voice assistant to determine whether the user is a legitimate user. This can effectively improve authentication accuracy and also prevents fraud using a prior recording, improving security.
Preferably, the intelligent interaction device may also: enable a password verification function when the number of times the extracted voiceprint features fail to match the built voiceprint model is greater than or equal to a preset number (for example, 3).
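The matching and fallback logic can be sketched as follows. The description does not say how a voiceprint feature is compared with the model; cosine similarity between fixed-length embedding vectors with a 0.8 threshold is an assumption made here for illustration, as are the class and method names. Feature extraction (MFCC or spectrum plus the trained network) is outside the scope of this sketch.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class VoiceprintVerifier:
    """Compare an embedding with the enrolled one; fall back to a password after repeated failures."""

    def __init__(self, enrolled, threshold=0.8, max_failures=3):
        self.enrolled = enrolled          # embedding built at registration
        self.threshold = threshold        # assumed similarity cut-off
        self.max_failures = max_failures  # preset number of mismatches, e.g. 3
        self.failures = 0

    def verify(self, embedding):
        """Return 'passed', 'failed', or 'password_required'."""
        if cosine_similarity(self.enrolled, embedding) >= self.threshold:
            self.failures = 0
            return "passed"
        self.failures += 1
        if self.failures >= self.max_failures:
            return "password_required"
        return "failed"

verifier = VoiceprintVerifier(enrolled=[1.0, 0.0])
print(verifier.verify([0.9, 0.1]))
```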
The recognition module 203 is configured so that, after the user's identity is verified, the intelligent voice assistant starts an open-domain dialogue and the user's intention is recognized from the open-domain dialogue.
In this embodiment, after the user's identity is verified, the intelligent voice assistant starts an open-domain dialogue, converts the voice information in the open-domain dialogue into text, and then performs intention recognition.
In this embodiment, a joint intention recognition and slot filling model may be used to recognize the user's intention in the open-domain dialogue. Specifically, the joint model has three layers. The first layer is a one-hot encoding of the question text. The second layer is a network structure combining a BLSTM and a CNN, used to learn a shared representation of semantic information and intention information. The third layer is a CRF layer, which decodes the shared representation; a unified loss function is used to learn the intention recognition task and the slot filling task jointly. The joint model one-hot encodes the question in the open-domain dialogue to obtain a sentence vector, feeds the sentence vector into the BLSTM model to obtain a new sequence vector representation, processes that representation with the CNN model to obtain a feature vector, and concatenates the feature vector and the sequence vector to obtain an output vector. The output vector is fed to the CRF layer, which jointly decodes the best label sequence; the label sequence is expressed by associating each character w_t in the question u with a BIO label, where B, I, and O denote begin, inside (continue), and outside (other), respectively. The input X is written w_1, w_2, …, w_n and the output label sequence Y is written s_1, s_2, …, s_n. For the joint model, one extra label is appended to the end of the input question and, correspondingly, an intention marker is attached to the end of the output labels, giving a new input and a new output label sequence. The last hidden layer of the model contains a latent semantic representation of the whole input question, which is used for recognizing the question's intention.
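To make the output format concrete, the sketch below reads slots and the intention from a label sequence of the kind described: BIO labels per token, followed by one trailing intention marker. It covers only this post-processing step, not the BLSTM/CNN/CRF model itself; the label names (`B-fee_type`, `pay_fee`) and the `<EOS>` token are assumptions for illustration.

```python
def decode_bio(tokens, tags):
    """Collect (slot_name, text) pairs from a BIO label sequence.

    An I- label that does not continue the current slot is treated as outside.
    """
    slots, name, words = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if name:
                slots.append((name, " ".join(words)))
            name, words = tag[2:], [token]
        elif tag.startswith("I-") and name == tag[2:]:
            words.append(token)
        else:
            if name:
                slots.append((name, " ".join(words)))
            name, words = None, []
    if name:
        slots.append((name, " ".join(words)))
    return slots

def split_joint_output(tokens, tags):
    """Last label is the intention marker; the rest are per-token BIO labels."""
    return tags[-1], decode_bio(tokens[:-1], tags[:-1])

tokens = ["pay", "gas", "bill", "for", "my", "mother-in-law", "<EOS>"]
tags = ["O", "B-fee_type", "I-fee_type", "O", "B-payee", "I-payee", "pay_fee"]
print(split_joint_output(tokens, tags))
```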
In other embodiments, the user's intention in the open-domain dialogue may be recognized by one or a combination of methods such as rule-template-based intention recognition, statistical-feature-classification-based intention recognition, word-vector-based intention recognition, and convolutional-neural-network-based intention recognition, which are not described in detail here.
The determination module 204 is configured to determine the service level according to the user's intention.
In this embodiment, the service level is determined by querying a pre-established intention-to-service-level association table. In the association table, the correspondence between intentions and service levels can be established according to the business logic of the application field and the knowledge base of that field. For example, in banking applications, the correspondence between intentions and service levels can be established according to the business logic and the knowledge base of the banking field.
For example, first-level services include the credit card service, the payment service, and the loan service. The second-level services corresponding to the credit card service include the consumption bill, the repayment amount, and the repayment date; those corresponding to the payment service include paying electricity bills, gas bills, and phone bills; and those corresponding to the loan service include fast credit, cash credit, and smart credit.
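An association table of this kind could be represented as a flat lookup from intention to service level, built from the hierarchy above. The sketch below is one possible layout under that assumption; the record fields (`level`, `primary`) and function names are introduced here for illustration.

```python
# First-level service -> second-level services, as listed in the description.
SERVICE_TREE = {
    "credit card": ["consumption bill", "repayment amount", "repayment date"],
    "payment": ["electricity bill", "gas bill", "phone bill"],
    "loan": ["fast credit", "cash credit", "smart credit"],
}

def build_association_table(tree):
    """Flatten the hierarchy into intention -> {level, primary service}."""
    table = {}
    for primary, secondaries in tree.items():
        table[primary] = {"level": 1, "primary": primary}
        for secondary in secondaries:
            table[secondary] = {"level": 2, "primary": primary}
    return table

ASSOCIATION_TABLE = build_association_table(SERVICE_TREE)

def service_level(intention):
    """Return the table entry for an intention, or None if it is unknown."""
    return ASSOCIATION_TABLE.get(intention)

print(service_level("repayment amount"))
```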
In this embodiment, the first-level service is strongly correlated with the user's intention.
The recognition module 203 is further configured to conduct a closed-domain dialogue according to the service level and identify the key information in the closed-domain dialogue.
In this embodiment, a closed-domain dialogue is a dialogue conducted after the user's intention has been recognized, in order to clarify the user's purpose (also called clarifying the task details). The key information is the information extracted from the lower-level services during the closed-domain dialogue. For example, if only first-level service information is received, the intelligent voice assistant announces the second-level services corresponding to the received first-level service information, prompting the user to specify which second-level service is to be performed.
For example, when the intelligent voice assistant receives information that the first-level service is "credit card" but receives no message about a corresponding second-level service, it issues a voice prompt such as "Do you need to view your consumption bill?", "Do you need to check your repayment amount?", or "Do you need to check your repayment date?" When the user replies to the prompt, the intelligent voice assistant can determine the second-level service information from the reply. In this way, the first-level service interface can be entered according to the user's intention, and question-and-answer exchanges conducted within that interface to obtain the second-level service the user needs performed, making task execution more intelligent.
The acquisition module 201 is further configured to obtain slot values according to the key information and fill the slots.
Slot filling is the process of completing information so that the user's intention is turned into an explicit user instruction. In this embodiment, slot values are obtained according to the key information, and the slots are filled according to the slot values. For example, when the text corresponding to the voice information collected by the intelligent voice assistant is "View the consumption bill on my credit card", the key information obtained is: me, credit card, and consumption bill. The intelligent voice assistant then obtains slot values according to this key information and fills the slots.
The execution module 205 is configured to perform the operation corresponding to the user's intention when the filled slots meet the threshold.
In this embodiment, when the filled slots meet the threshold, the user's intention is converted into a voice command, and the intelligent voice assistant performs the operation according to the voice command. The threshold depends on the user's intention. For example, when the user intends to make a transfer, two parameters are required: the transfer account number and the transfer amount. The corresponding threshold is therefore two slots. If either of the two is not filled, the operation corresponding to the user's intention cannot be performed.
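The threshold check for the transfer example can be sketched as a gate in front of the action. The required-slot table, the slot names, and the `action` callback are assumptions for illustration; a real system would call the target service interface instead.

```python
# Hypothetical per-intention slot requirements; a transfer needs two slots.
REQUIRED_SLOTS = {"transfer": ("account_number", "amount")}

def missing_slots(intention, slots):
    """List required slots for the intention that have no value yet."""
    return [name for name in REQUIRED_SLOTS[intention] if slots.get(name) is None]

def execute_if_ready(intention, slots, action):
    """Run the action only when every required slot is filled."""
    missing = missing_slots(intention, slots)
    if missing:
        return {"executed": False, "missing": missing}
    return {"executed": True, "result": action(slots)}

# Placeholder action standing in for the real transfer interface.
describe = lambda s: f"transfer {s['amount']} to {s['account_number']}"
print(execute_if_ready("transfer", {"account_number": "ACC-1", "amount": None}, describe))
print(execute_if_ready("transfer", {"account_number": "ACC-1", "amount": 100}, describe))
```

The list of missing slots returned by the gate is what the assistant would use to decide which follow-up questions to ask.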
For example, when the text corresponding to the voice information collected by the intelligent voice assistant is "Check my credit card", the user's intention can be recognized as: credit card. The first-level service corresponding to this intention is the credit card service. The intelligent voice assistant then enters the closed domain of the credit card service to conduct a dialogue, extracts slot information according to the slots, and calls the target interface. For example, it issues the voice prompt "Would you like to check your credit card consumption bill, repayment amount, repayment date, or whether anything is overdue?" When the intelligent voice assistant receives the user's reply "repayment amount", it queries the user's credit card spending and replies to the user according to the query result. For example, the intelligent voice assistant announces "You should repay 2033 yuan this month".
In addition, the intelligent voice assistant can also directly call the target service interface to obtain information or perform an operation. For example, when the received user voice information is "Check how much I need to repay on my credit card this month", the intelligent voice assistant directly calls the repayment balance in the second-level service interface to obtain the balance information, and then announces by voice "You should repay 2033 yuan this month".
Preferably, before performing the operation corresponding to the user's intention, the intelligent voice assistant issues a prompt for the user to confirm. For example, before execution the intelligent voice assistant plays the task back to the user by voice for confirmation, and performs the operation corresponding to the user's intention after receiving the user's confirmation. Whether or not execution succeeds, the intelligent voice assistant feeds the result back to the user.
Preferably, the intelligent voice assistant stores information authorized by the user and performs the corresponding operation according to the authorized information and the recognized user intention. Specifically: receiving information authorized by the user and storing the authorized information, where the authorized information includes account information (for example, a gas account); after the service level is determined according to the user's intention, conducting a closed-domain dialogue according to the service level and identifying the key information in the closed-domain dialogue; obtaining slot values according to the authorized information and the key information and filling the slots; and when the filled slots meet the threshold, performing the operation corresponding to the user's intention.
For example, when the intelligent voice assistant has stored the user's authorization to help family members pay gas bills, it retains that information: it remembers the authorization defined by the user and does not need to ask repeatedly. When the user says "Help my mother-in-law pay the gas bill", the intelligent voice assistant recognizes the intention: payment. According to the user's intention, the second-level service is determined to be: paying the gas bill. The key information identified in the closed-domain dialogue is "my mother-in-law (that is, the user to be paid for), pay the gas bill". The intelligent voice assistant can look up the mother-in-law's gas account from memory and pay her gas bill directly in the application.
Preferably, when the voice command corresponding to the user's intention includes multiple services of the same level, the execution order of those services is determined through the closed-domain dialogue, and the corresponding operations are performed in that order.
When the voice command corresponding to the user's intention includes two services of the same level, it is necessary to clarify which service interface the user actually wants to operate on. For example, when the user says "Help me check the credit card service and the loan service", the intelligent voice assistant prompts the user "Would you like to check the credit card service or the loan service first?" After learning from the user's reply that the credit card service should be checked first and then the loan service, the intelligent voice assistant first performs the credit card service query and then the loan service query.
Preferably, when the voice command corresponding to the user's intention includes multiple services of different levels, the user is prompted with the upper-level service corresponding to the lowest-level service among them, and all lower-level services contained in that upper-level service are then offered for the user to choose from. Specifically, when the voice command corresponding to the user's intention includes multiple services of different levels, the lowest-level service among them is identified according to the intention-to-service-level association table; the upper-level service corresponding to that lowest-level service is looked up; and all lower-level services contained in that upper-level service are presented for the user to choose from.
例如,当用户意图对应的语音指令包含上下级的两个业务时,识别所述两个业务中的上级业务,并发出提示语音给用户澄清所述上级业务下所包含二级业务,并再次通过封闭域对话确认用户的业务需求。当用户说“帮我查看信用卡业务下的智能信贷”,所述智能语音助手就会提示用户“您是要查询贷款业务下的智能信贷吗?信用卡业务下没有智能贷款业务”。当用户确认查询贷款业务时,所述智能语音助手再给出所述贷款业务下的所有下级业务(如快速信贷、智能信贷和现金信贷)供用户选择。For example, when the voice command corresponding to the user's intention includes two services at the upper and lower levels, the upper-level services in the two services are identified, and a prompt voice is issued to the user to clarify the second-level services included under the upper-level services, and pass The closed domain dialogue confirms the user's business needs. When the user says "Help me check the smart credit under the credit card business", the smart voice assistant will prompt the user "Are you inquiring about the smart credit under the loan business? There is no smart loan business under the credit card business". When the user confirms to inquire about the loan business, the intelligent voice assistant then presents all the lower-level services (such as fast credit, smart credit, and cash credit) under the loan business for the user to choose.
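The lookup described above can be sketched with a small in-memory table. The table layout (`SERVICE_TABLE`, with a numeric `level` and a `parent` per service) and the helper names are assumptions for illustration; the application only states that an intention and service level association table exists.

```python
# Hypothetical association table: a larger level number means a lower-level service.
SERVICE_TABLE = {
    "credit card": {"level": 1, "parent": None},
    "loan": {"level": 1, "parent": None},
    "fast credit": {"level": 2, "parent": "loan"},
    "smart credit": {"level": 2, "parent": "loan"},
    "cash credit": {"level": 2, "parent": "loan"},
}


def children_of(parent):
    """Return every service whose upper-level service is `parent`."""
    return [name for name, info in SERVICE_TABLE.items() if info["parent"] == parent]


def options_for_mixed_levels(mentioned):
    """Find the lowest-level service mentioned, then list its parent's children."""
    lowest = max(mentioned, key=lambda name: SERVICE_TABLE[name]["level"])
    parent = SERVICE_TABLE[lowest]["parent"]
    return parent, children_of(parent)


parent, options = options_for_mixed_levels(["credit card", "smart credit"])
print(parent, options)
```

With the command from the example, the lowest-level service is smart credit, its upper-level service is loan, and the three loan sub-services are returned as the choices to offer.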
Preferably, when the filled slots do not meet the threshold, the intelligent voice assistant issues a voice prompt according to the slot values that are missing; when multiple slot values are missing, the intelligent voice assistant issues the voice prompts one after another and fills the missing slot values in order according to the user's replies; the task corresponding to the filled slots is then started to perform the operation corresponding to the user's intention. In this way, when the filled slots do not meet the threshold, targeted questions can be asked about the missing slot values. When several slots need clarification, the questions are asked in order, so that the user's actual slot values are obtained and the intelligent voice assistant can start the task corresponding to the slots.
For example, when the user says "I want to pay a bill", it is unclear which fee is to be paid and on whose behalf, so the filled slots do not meet the threshold. The intelligent voice assistant therefore issues voice prompts for the missing slot values, for example "Which fee would you like to pay?" and "Whose bill are you paying?" Because two slot values are missing, the prompts are issued in order. The intelligent voice assistant first asks "Which fee would you like to pay?" and, on receiving the reply "the gas bill", fills "gas bill" into the missing slot. It then asks "Whose bill are you paying?" and, when the user replies "my mother-in-law's", fills "my mother-in-law" into the remaining missing slot. Finally it starts the task corresponding to the filled slots (paying the gas bill on behalf of the mother-in-law), looks up the mother-in-law's gas account, and pays her gas bill directly in the application.
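The sequential prompting in this example can be sketched as a short loop. The slot names, the prompt wording, and the reading of "meets the threshold" as "the number of filled required slots reaches a minimum count" are assumptions made here; the application does not define the threshold precisely.

```python
# Hypothetical slot schema for the bill-payment example, in prompting order.
REQUIRED_SLOTS = ["fee_type", "payee"]
PROMPTS = {
    "fee_type": "Which fee would you like to pay?",
    "payee": "Whose bill are you paying?",
}


def slots_meet_threshold(slots, threshold=len(REQUIRED_SLOTS)):
    """Assumed threshold rule: enough required slots hold a non-empty value."""
    filled = sum(1 for name in REQUIRED_SLOTS if slots.get(name))
    return filled >= threshold


def fill_missing_slots(slots, ask):
    """Prompt for each missing slot in order; `ask` returns the user's reply."""
    filled = dict(slots)
    for name in REQUIRED_SLOTS:
        if not filled.get(name):
            filled[name] = ask(PROMPTS[name])
    return filled


# Scripted replies stand in for the user's spoken answers.
replies = iter(["gas bill", "my mother-in-law"])
slots = fill_missing_slots({}, lambda prompt: next(replies))
print(slots, slots_meet_threshold(slots))
```

Passing `ask` as a callable keeps the sketch independent of any particular speech interface; in a real assistant it would play the prompt and return the recognized reply.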
In summary, the intelligent interaction apparatus 20 provided by the present application includes an acquisition module 201, a verification module 202, a recognition module 203, a determination module 204, and an execution module 205. The acquisition module 201 is used to acquire the user's voice information through the intelligent voice assistant; the verification module 202 is used to verify the user's identity according to the voice information; the recognition module 203 is used to have the intelligent voice assistant start an open-domain dialogue after the user's identity is verified, and to recognize the user's intention according to the open-domain dialogue; the determination module 204 is used to determine the service level according to the user's intention; the recognition module 203 is further used to conduct a closed-domain dialogue according to the service level and to identify the key information in the closed-domain dialogue; the acquisition module 201 is further used to obtain slot values according to the key information and to fill the slots; and the execution module 205 is used to perform the operation corresponding to the user's intention when the filled slots meet the threshold. The present application incorporates a user voiceprint recognition system: the bank's intelligent voice assistant obtains the user's voice information at the moment it is woken up, and determines the user's identity after extracting the voiceprint features from the voice. When the user controls operations by voice, voice verification alone is sufficient and no additional verification step is required, which simplifies the operation flow and improves security. The present application can accurately recognize the user's intention and, after entering the closed-domain dialogue, enters the first-level service interface according to that intention and conducts question-and-answer exchanges within it, so that tasks are performed more intelligently and human-computer communication is more interactive. In addition, the present application can handle cases where the voice command corresponding to the user's intention contains multiple services of the same level or multiple services of different levels, and can guide the user when the user's wording is unclear, until the whole closed-loop operation is completed.
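The voiceprint check mentioned in this summary can be illustrated with a simple similarity test. The application does not say how extracted features are matched against the voiceprint model; comparing a feature vector with an enrolled vector by cosine similarity, and the 0.8 acceptance threshold, are assumptions chosen only to make the idea concrete.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector cannot match anything
    return dot / (norm_a * norm_b)


def verify_speaker(features, enrolled, threshold=0.8):
    """Assumed rule: verification passes when similarity reaches the threshold."""
    return cosine_similarity(features, enrolled) >= threshold


print(verify_speaker([1.0, 0.0], [1.0, 0.0]))  # identical vectors
print(verify_speaker([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

In practice the vectors would come from a speaker-embedding model and the threshold would be tuned on enrollment data; both are outside the scope of this sketch.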
The above integrated unit, when implemented in the form of software functional modules, may be stored in a computer-readable storage medium. The software functional modules are stored in a storage medium and include several instructions that cause a computer device (which may be a personal computer, a dual-screen device, a network device, or the like) or a processor to perform part of the methods described in the embodiments of the present application.
FIG. 3 is a schematic diagram of the electronic device provided in Embodiment 3 of the present application.
The electronic device 3 includes a memory 31, at least one processor 32, a computer program 33 stored in the memory 31 and executable on the at least one processor 32, at least one communication bus 34, and a database 35.
When the at least one processor 32 executes the computer program 33, the steps of the above intelligent interaction method embodiments are implemented.
By way of example, the computer program 33 may be divided into one or more modules/units, which are stored in the memory 31 and executed by the at least one processor 32 to carry out the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments describe the execution process of the computer program 33 in the electronic device 3.
The electronic device 3 may be a device on which applications are installed, such as a mobile phone, a tablet computer, or a personal digital assistant (PDA). Those skilled in the art will understand that FIG. 3 is merely an example of the electronic device 3 and does not limit it. The electronic device 3 may include more or fewer components than shown, may combine certain components, or may use different components; for example, it may also include input and output devices, network access devices, buses, and so on.
The at least one processor 32 may be composed of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The at least one processor 32 is the control unit of the electronic device 3. It connects the components of the electronic device 3 through various interfaces and lines, and performs the functions of the electronic device 3 and processes data, for example the intelligent interaction function, by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.
The memory 31 is used to store computer-readable instructions and various data, such as the intelligent interaction apparatus 20 installed in the electronic device 3, and provides high-speed, automatic access to programs and data while the electronic device 3 is running. The memory 31 includes volatile and non-volatile memory, such as random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable storage medium that can be used to carry or store data. The computer-readable storage medium may be non-volatile or volatile.
The memory 31 stores program code, and the at least one processor 32 can call the program code stored in the memory 31 to perform the related functions. For example, the modules shown in FIG. 2 (the acquisition module 201, the verification module 202, the recognition module 203, the determination module 204, and the execution module 205) are program code stored in the memory 31 and executed by the at least one processor 32, thereby implementing the functions of the modules to achieve intelligent interaction.
The acquisition module 201 is used to acquire the user's voice information through the intelligent voice assistant; the verification module 202 is used to verify the user's identity according to the voice information; the recognition module 203 is used to have the intelligent voice assistant start an open-domain dialogue after the user's identity is verified, and to recognize the user's intention according to the open-domain dialogue; the determination module 204 is used to determine the service level according to the user's intention; the recognition module 203 is further used to conduct a closed-domain dialogue according to the service level and to identify the key information in the closed-domain dialogue; the acquisition module 201 is further used to obtain slot values according to the key information and to fill the slots; and the execution module 205 is used to perform the operation corresponding to the user's intention when the filled slots meet the threshold.
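How these modules hand off to one another can be sketched as a single function that takes each stage as a callable. All parameter names and the stub behaviors in the usage example are hypothetical stand-ins; the sketch also treats "meets the threshold" as "every required slot is filled", which is an assumption.

```python
def run_interaction(audio, verify, recognize_intent, service_level,
                    extract_slots, required_slots, execute):
    """Run one pass of the flow; every callable is a hypothetical stand-in."""
    if not verify(audio):                    # verification module 202
        return "identity verification failed"
    intent = recognize_intent(audio)         # recognition module 203, open domain
    level = service_level(intent)            # determination module 204
    slots = extract_slots(intent, level)     # modules 203 and 201, closed domain
    if all(slots.get(name) for name in required_slots):
        return execute(intent, slots)        # execution module 205
    return "more information needed"


outcome = run_interaction(
    audio="pay my gas bill",
    verify=lambda audio: True,
    recognize_intent=lambda audio: "pay_bill",
    service_level=lambda intent: 1,
    extract_slots=lambda intent, level: {"fee_type": "gas bill", "payee": "me"},
    required_slots=["fee_type", "payee"],
    execute=lambda intent, slots: f"{intent}: {slots['fee_type']} for {slots['payee']}",
)
print(outcome)
```

Injecting the stages as callables mirrors the statement that the module division is only a logical one: each stage can be replaced without changing the overall order of operations.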
The database 35 is a repository built on the electronic device 3 that organizes, stores, and manages data according to a data structure. Databases are generally divided into three types: hierarchical, network, and relational. In this embodiment, the database 35 is used to store the user's voice information and the like.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is apparent to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-limiting. The scope of the present application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and scope of equivalents of the claims are intended to be embraced by the present application. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units, and the singular does not exclude the plural. Multiple units or apparatuses recited in the apparatus claims may also be implemented by a single unit or apparatus through software or hardware. Words such as "first" and "second" are used as names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or replaced with equivalents without departing from their spirit and scope.

Claims (20)

  1. An intelligent interaction method, wherein the intelligent interaction method comprises:
    acquiring, by an intelligent voice assistant, voice information of a user;
    verifying the user's identity according to the voice information;
    after the user's identity is verified, starting, by the intelligent voice assistant, an open-domain dialogue, and recognizing the user's intention according to the open-domain dialogue;
    determining a service level according to the user's intention;
    conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
    obtaining slot values according to the key information and filling slots; and
    when the filled slots meet a threshold, performing the operation corresponding to the user's intention.
  2. The intelligent interaction method according to claim 1, wherein the step of verifying the user's identity according to the voice information comprises:
    extracting voiceprint features from the voice information;
    matching the extracted voiceprint features against a pre-built voiceprint model;
    when the extracted voiceprint features match the pre-built voiceprint model, confirming that the user's identity verification has passed;
    when the extracted voiceprint features do not match the built voiceprint model, confirming that the user's identity verification has failed.
  3. The intelligent interaction method according to claim 1, wherein the service level is determined by querying a pre-established intention and service level association table, the intention and service level association table being a correspondence between intentions and service levels established according to the business logic of an application domain and the knowledge base of the application domain.
  4. The intelligent interaction method according to claim 1, wherein the method further comprises:
    receiving information authorized by the user and storing the authorized information, wherein the authorized information includes account information;
    after the service level is determined according to the user's intention, conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
    obtaining slot values according to the authorized information and the key information and filling slots; and
    when the filled slots meet the threshold, performing the operation corresponding to the user's intention.
  5. The intelligent interaction method according to claim 1, wherein, when the voice command corresponding to the user's intention contains multiple services of the same level, the execution order of the multiple services of the same level is determined according to the closed-domain dialogue, and the corresponding operations are performed according to the execution order.
  6. The intelligent interaction method according to claim 3, wherein the method further comprises:
    when the voice command corresponding to the user's intention contains multiple services of different levels, identifying the lowest-level service among the multiple services of different levels according to the intention and service level association table;
    looking up the upper-level service corresponding to the lowest-level service;
    presenting all lower-level services contained in the upper-level service for the user to choose from.
  7. The intelligent interaction method according to claim 1, wherein the method further comprises:
    when the filled slots do not meet the threshold, issuing, by the intelligent voice assistant, a voice prompt according to the slot values that are missing;
    when multiple slot values are missing, issuing, by the intelligent voice assistant, the voice prompts in order, and filling the missing slot values in order according to the user's replies;
    starting the task corresponding to the filled slots to perform the operation corresponding to the user's intention.
  8. An intelligent interaction apparatus, wherein the intelligent interaction apparatus comprises:
    an acquisition module, configured to acquire voice information of a user through an intelligent voice assistant;
    a verification module, configured to verify the user's identity according to the voice information;
    a recognition module, configured to have the intelligent voice assistant start an open-domain dialogue after the user's identity is verified, and to recognize the user's intention according to the open-domain dialogue;
    a determination module, configured to determine a service level according to the user's intention;
    the recognition module being further configured to conduct a closed-domain dialogue according to the service level and to identify key information in the closed-domain dialogue;
    the acquisition module being further configured to obtain slot values according to the key information and to fill slots; and
    an execution module, configured to perform the operation corresponding to the user's intention when the filled slots meet a threshold.
  9. An electronic device, wherein the electronic device comprises a processor, and the processor is configured to execute computer-readable instructions stored in a memory to implement the following steps:
    acquiring, by an intelligent voice assistant, voice information of a user;
    verifying the user's identity according to the voice information;
    after the user's identity is verified, starting, by the intelligent voice assistant, an open-domain dialogue, and recognizing the user's intention according to the open-domain dialogue;
    determining a service level according to the user's intention;
    conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
    obtaining slot values according to the key information and filling slots; and
    when the filled slots meet a threshold, performing the operation corresponding to the user's intention.
  10. The electronic device according to claim 9, wherein, when the processor executes the computer-readable instructions to implement the verifying of the user's identity according to the voice information, this specifically comprises:
    extracting voiceprint features from the voice information;
    matching the extracted voiceprint features against a pre-built voiceprint model;
    when the extracted voiceprint features match the pre-built voiceprint model, confirming that the user's identity verification has passed;
    when the extracted voiceprint features do not match the built voiceprint model, confirming that the user's identity verification has failed.
  11. The electronic device according to claim 9, wherein, when the processor executes the computer-readable instructions to implement the determining of the service level according to the user's intention, this specifically comprises:
    determining the service level by querying a pre-established intention and service level association table, wherein the intention and service level association table is a correspondence between intentions and service levels established according to the business logic of an application domain and the knowledge base of the application domain.
  12. The electronic device according to claim 9, wherein the processor executes the computer-readable instructions to further implement the following steps:
    receiving information authorized by the user and storing the authorized information, wherein the authorized information includes account information;
    after the service level is determined according to the user's intention, conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
    obtaining slot values according to the authorized information and the key information and filling slots; and
    when the filled slots meet the threshold, performing the operation corresponding to the user's intention.
  13. The electronic device according to claim 9, wherein, when the processor executes the computer-readable instructions to implement the performing of the operation corresponding to the user's intention when the filled slots meet the threshold, this specifically comprises:
    when the voice command corresponding to the user's intention contains multiple services of the same level, determining the execution order of the multiple services of the same level according to the closed-domain dialogue, and performing the corresponding operations according to the execution order.
  14. The electronic device according to claim 11, wherein the processor executes the computer-readable instructions to further implement the following steps:
    when the voice command corresponding to the user's intention contains multiple services of different levels, identifying the lowest-level service among the multiple services of different levels according to the intention and service level association table;
    looking up the upper-level service corresponding to the lowest-level service;
    presenting all lower-level services contained in the upper-level service for the user to choose from.
  15. The electronic device according to claim 9, wherein the processor executes the computer-readable instructions to further implement the following steps:
    when the filled slots do not meet the threshold, issuing, by the intelligent voice assistant, a voice prompt according to the slot values that are missing;
    when multiple slot values are missing, issuing, by the intelligent voice assistant, the voice prompts in order, and filling the missing slot values in order according to the user's replies;
    starting the task corresponding to the filled slots to perform the operation corresponding to the user's intention.
  16. A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    acquiring, by an intelligent voice assistant, voice information of a user;
    verifying the user's identity according to the voice information;
    after the user's identity is verified, starting, by the intelligent voice assistant, an open-domain dialogue, and recognizing the user's intention according to the open-domain dialogue;
    determining a service level according to the user's intention;
    conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
    obtaining slot values according to the key information and filling slots; and
    when the filled slots meet a threshold, performing the operation corresponding to the user's intention.
  17. The computer-readable storage medium according to claim 16, wherein, when the computer-readable instructions are executed by the processor to implement the verifying of the user's identity according to the voice information, this specifically comprises:
    extracting voiceprint features from the voice information;
    matching the extracted voiceprint features against a pre-built voiceprint model;
    when the extracted voiceprint features match the pre-built voiceprint model, confirming that the user's identity verification has passed;
    when the extracted voiceprint features do not match the built voiceprint model, confirming that the user's identity verification has failed.
  18. The computer-readable storage medium according to claim 16, wherein, when the computer-readable instructions are executed by the processor to implement the determining of the service level according to the user's intention, the following is specifically included:
    determining the service level by querying a pre-established intention-to-service-level association table, wherein the association table is a correspondence between intentions and service levels established according to the business logic of an application field and the knowledge base of the application field.
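As a rough illustration of the lookup described in this claim, the association table can be modelled as a plain mapping from intention to service level. The table contents, the intention names, the numeric levels, and the function name `service_level_for` are all hypothetical; the claim only states that such a table is built in advance from the business logic and knowledge base of the application field.

```python
from typing import Mapping, Optional

# Hypothetical association table (intention -> service level).
# In practice it would be derived from the application field's
# business logic and knowledge base.
INTENT_SERVICE_LEVEL = {
    "transfer": 1,
    "domestic_transfer": 2,
    "international_transfer": 2,
    "check_balance": 1,
}


def service_level_for(intent: str,
                      table: Mapping[str, int] = INTENT_SERVICE_LEVEL
                      ) -> Optional[int]:
    """Return the service level for an intention, or None if it is not listed."""
    return table.get(intent)
```

Returning `None` for an unknown intention is a design choice of this sketch; how an unrecognized intention is handled is not covered by the claim.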
  19. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    receiving information authorized by the user and storing the authorized information, wherein the authorized information includes account information;
    after the service level is determined according to the user's intention, conducting a closed-domain dialogue according to the service level, and identifying key information in the closed-domain dialogue;
    obtaining slot values according to the authorized information and the key information and filling slots; and
    when the filled slots meet a threshold, performing the operation corresponding to the user's intention.
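The slot-filling steps above can be sketched as follows. This is only one possible reading: the claim does not define what "meet a threshold" means, so the sketch assumes the threshold is a minimum number of filled slots. It also assumes that key information from the dialogue takes precedence over stored authorized information. The slot names and the functions `fill_slots` and `ready_to_execute` are hypothetical.

```python
from typing import Dict, Iterable, Mapping, Optional


def fill_slots(slot_names: Iterable[str],
               authorized_info: Mapping[str, str],
               key_info: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Fill each slot from dialogue key information, else from authorized information."""
    slots: Dict[str, Optional[str]] = {}
    for name in slot_names:
        value = key_info.get(name)
        if value is None:
            # Fall back to information the user authorized earlier (e.g. account data).
            value = authorized_info.get(name)
        slots[name] = value
    return slots


def ready_to_execute(slots: Mapping[str, Optional[str]], threshold: int) -> bool:
    """Assumed rule: execute once at least `threshold` slots hold a value."""
    filled = sum(1 for value in slots.values() if value is not None)
    return filled >= threshold
```

For a hypothetical transfer intention with slots `payer_account`, `payee`, and `amount`, the payer account could come from the authorized information while the payee and amount are collected during the closed-domain dialogue.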
  20. The computer-readable storage medium according to claim 18, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    when the voice command corresponding to the user's intention includes multiple services of different levels, identifying the lowest-level service among the multiple services according to the intention-to-service-level association table;
    querying the upper-level service corresponding to the lowest-level service;
    presenting all lower-level services included in the upper-level service for the user to choose from.
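The disambiguation steps of this claim can be sketched under some explicit assumptions. The claim does not define how levels are numbered or how the service hierarchy is stored, so this example assumes that a larger level number means a lower position in the hierarchy and that each service records at most one parent. The service names, the `SERVICE_LEVEL` and `PARENT` tables, and the function `options_for` are hypothetical.

```python
from typing import List, Mapping, Sequence

# Hypothetical tables: service -> level, and service -> upper-level service.
SERVICE_LEVEL = {
    "transfer": 1,
    "loan": 1,
    "domestic_transfer": 2,
    "international_transfer": 2,
}
PARENT = {
    "domestic_transfer": "transfer",
    "international_transfer": "transfer",
}


def options_for(services: Sequence[str],
                levels: Mapping[str, int] = SERVICE_LEVEL,
                parent: Mapping[str, str] = PARENT) -> List[str]:
    """List the sibling services to offer when a command mentions several levels."""
    # Step 1: pick the lowest-level service (assumed: largest level number).
    lowest = max(services, key=lambda s: levels[s])
    # Step 2: look up its upper-level service.
    upper = parent.get(lowest)
    if upper is None:
        return []
    # Step 3: collect every lower-level service under that upper-level service.
    return sorted(child for child, p in parent.items() if p == upper)
```

With these tables, a command that mentions both `transfer` and `domestic_transfer` would lead to both transfer sub-services being offered to the user.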
PCT/CN2020/105636 2019-12-19 2020-07-29 Intelligent interaction method and apparatus, and electronic device and storage medium WO2021120631A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911319401.2A CN111223485A (en) 2019-12-19 2019-12-19 Intelligent interaction method and device, electronic equipment and storage medium
CN201911319401.2 2019-12-19

Publications (1)

Publication Number Publication Date
WO2021120631A1 (en) 2021-06-24

Family

ID=70827894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105636 WO2021120631A1 (en) 2019-12-19 2020-07-29 Intelligent interaction method and apparatus, and electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN111223485A (en)
WO (1) WO2021120631A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113782035A (en) * 2021-09-10 2021-12-10 中国银行股份有限公司 Service processing method and device, electronic equipment and storage medium
WO2023042988A1 (en) * 2021-09-14 2023-03-23 Samsung Electronics Co., Ltd. Methods and systems for determining missing slots associated with a voice command for an advanced voice interaction
CN116662555A (en) * 2023-07-28 2023-08-29 成都赛力斯科技有限公司 Request text processing method and device, electronic equipment and storage medium
CN117556864A (en) * 2024-01-12 2024-02-13 阿里云计算有限公司 Information processing method, electronic device, and storage medium
CN117725185A (en) * 2024-02-06 2024-03-19 河北神玥软件科技股份有限公司 Intelligent dialogue generation method and system

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111223485A (en) * 2019-12-19 2020-06-02 深圳壹账通智能科技有限公司 Intelligent interaction method and device, electronic equipment and storage medium
CN111767384A (en) * 2020-07-08 2020-10-13 上海风秩科技有限公司 Man-machine conversation processing method, device, equipment and storage medium
CN111598577B (en) * 2020-07-24 2020-11-13 深圳市声扬科技有限公司 Resource transfer method, device, computer equipment and storage medium
CN111986024A (en) * 2020-08-25 2020-11-24 北京文思海辉金信软件有限公司 Transaction processing method and device and electronic terminal
CN112035623B (en) * 2020-09-11 2023-08-04 杭州海康威视数字技术股份有限公司 Intelligent question-answering method and device, electronic equipment and storage medium
CN112331185B (en) * 2020-11-10 2023-08-11 珠海格力电器股份有限公司 Voice interaction method, system, storage medium and electronic equipment
CN112740323B (en) * 2020-12-26 2022-10-11 华为技术有限公司 Voice understanding method and device
CN112820285A (en) * 2020-12-29 2021-05-18 北京搜狗科技发展有限公司 Interaction method and earphone equipment
CN113113012A (en) * 2021-04-15 2021-07-13 北京蓦然认知科技有限公司 Method and device for interaction based on collaborative voice interaction engine cluster
CN117334183A (en) * 2022-06-24 2024-01-02 华为技术有限公司 Voice interaction method, electronic equipment and voice assistant development platform
CN115064167B (en) * 2022-08-17 2022-12-13 广州小鹏汽车科技有限公司 Voice interaction method, server and storage medium
CN117059095B (en) * 2023-07-21 2024-04-30 广州市睿翔通信科技有限公司 IVR-based service providing method and device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776936A (en) * 2016-12-01 2017-05-31 上海智臻智能网络科技股份有限公司 Intelligent interaction method and system
CN107886948A (en) * 2017-11-16 2018-04-06 百度在线网络技术(北京)有限公司 Voice interaction method and device, terminal, server and readable storage medium
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Identity verification method, device, equipment and storage medium based on voiceprint recognition
CN109635085A (en) * 2018-06-05 2019-04-16 安徽省泰岳祥升软件有限公司 Management method of intelligent interaction process, multi-round dialogue method and device
US20190371329A1 (en) * 2016-06-27 2019-12-05 Amazon Technologies, Inc. Voice enablement and disablement of speech processing functionality
CN111223485A (en) * 2019-12-19 2020-06-02 深圳壹账通智能科技有限公司 Intelligent interaction method and device, electronic equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7137126B1 (en) * 1998-10-02 2006-11-14 International Business Machines Corporation Conversational computing via conversational virtual machine
CN103139404A (en) * 2013-01-25 2013-06-05 西安电子科技大学 System and method for generating interactive voice response display menu based on voice recognition
CN106101789B (en) * 2016-07-06 2020-04-24 深圳Tcl数字技术有限公司 Voice interaction method and device for terminal
CN109922213A (en) * 2019-01-17 2019-06-21 深圳壹账通智能科技有限公司 Data processing method, device, storage medium and terminal device for voice consultation
CN109671438A (en) * 2019-01-28 2019-04-23 武汉恩特拉信息技术有限公司 Device and method for providing auxiliary services using voice
CN110377720B (en) * 2019-07-26 2022-02-11 中国工商银行股份有限公司 Intelligent multi-round interaction method and system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190371329A1 (en) * 2016-06-27 2019-12-05 Amazon Technologies, Inc. Voice enablement and disablement of speech processing functionality
CN106776936A (en) * 2016-12-01 2017-05-31 上海智臻智能网络科技股份有限公司 Intelligent interaction method and system
CN107886948A (en) * 2017-11-16 2018-04-06 百度在线网络技术(北京)有限公司 Voice interaction method and device, terminal, server and readable storage medium
CN109635085A (en) * 2018-06-05 2019-04-16 安徽省泰岳祥升软件有限公司 Management method of intelligent interaction process, multi-round dialogue method and device
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Identity verification method, device, equipment and storage medium based on voiceprint recognition
CN111223485A (en) * 2019-12-19 2020-06-02 深圳壹账通智能科技有限公司 Intelligent interaction method and device, electronic equipment and storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113782035A (en) * 2021-09-10 2021-12-10 中国银行股份有限公司 Service processing method and device, electronic equipment and storage medium
WO2023042988A1 (en) * 2021-09-14 2023-03-23 Samsung Electronics Co., Ltd. Methods and systems for determining missing slots associated with a voice command for an advanced voice interaction
CN116662555A (en) * 2023-07-28 2023-08-29 成都赛力斯科技有限公司 Request text processing method and device, electronic equipment and storage medium
CN116662555B (en) * 2023-07-28 2023-10-20 成都赛力斯科技有限公司 Request text processing method and device, electronic equipment and storage medium
CN117556864A (en) * 2024-01-12 2024-02-13 阿里云计算有限公司 Information processing method, electronic device, and storage medium
CN117556864B (en) * 2024-01-12 2024-04-16 阿里云计算有限公司 Information processing method, electronic device, and storage medium
CN117725185A (en) * 2024-02-06 2024-03-19 河北神玥软件科技股份有限公司 Intelligent dialogue generation method and system
CN117725185B (en) * 2024-02-06 2024-05-07 河北神玥软件科技股份有限公司 Intelligent dialogue generation method and system

Also Published As

Publication number Publication date
CN111223485A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
WO2021120631A1 (en) Intelligent interaction method and apparatus, and electronic device and storage medium
CN106776936B (en) Intelligent interaction method and system
US9582757B1 (en) Scalable curation system
CN109428719B (en) Identity verification method, device and equipment
EP2784710A2 (en) Method and system for validating personalized account identifiers using biometric authentication and self-learning algorithms
CN107862005A (en) User intention recognition method and device
US8165887B2 (en) Data-driven voice user interface
WO2021109690A1 (en) Multi-type question smart answering method, system and device, and readable storage medium
CN109087639B (en) Method, apparatus, electronic device and computer readable medium for speech recognition
CN111540353B (en) Semantic understanding method, device, equipment and storage medium
CN112417128B (en) Method and device for recommending dialect, computer equipment and storage medium
CN109462482B (en) Voiceprint recognition method, voiceprint recognition device, electronic equipment and computer readable storage medium
CN110162675B (en) Method and device for generating answer sentence, computer readable medium and electronic device
CN111696558A (en) Intelligent outbound method, device, computer equipment and storage medium
CN107729549B (en) Robot customer service method and system including element extraction
US11604925B1 (en) Architecture for gazetteer-augmented named entity recognition
CN109637000A (en) Invoice inspection method and device, storage medium, electronic terminal
CN112131885A (en) Semantic recognition method and device, electronic equipment and storage medium
CN111797217B (en) Information query method based on FAQ matching model and related equipment thereof
CN111611358A (en) Information interaction method and device, electronic equipment and storage medium
CN115509485A (en) Filling-in method and device of business form, electronic equipment and storage medium
CN113707157B (en) Voiceprint recognition-based identity verification method and device, electronic equipment and medium
CN109087647A (en) Voiceprint recognition processing method, device, electronic equipment and storage medium
CN113132214B (en) Dialogue method, dialogue device, dialogue server and dialogue storage medium
CN112417412A (en) Bank account balance inquiry method, device and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application; Ref document number: 20903052; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase; Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established; Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26.10.2022)
122 Ep: pct application non-entry in european phase; Ref document number: 20903052; Country of ref document: EP; Kind code of ref document: A1