WO2021179483A1 - Intent recognition method, apparatus, device, and storage medium based on a loss function - Google Patents

Intent recognition method, apparatus, device, and storage medium based on a loss function

Info

Publication number
WO2021179483A1
Authority
WO
WIPO (PCT)
Prior art keywords
potential
intent
text
word segmentation
loss
Prior art date
Application number
PCT/CN2020/098833
Other languages
English (en)
French (fr)
Inventor
阮晓义
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021179483A1 publication Critical patent/WO2021179483A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • This application relates to the technical field of classification algorithms, and in particular to an intent recognition method, apparatus, device, and storage medium based on a loss function.
  • In multi-turn dialogue systems, natural language understanding (NLU) is a crucial step for the machine to obtain information from the user. Intent recognition on text is one of the most common applications of natural language understanding, but conventional intent recognition is limited to one sentence corresponding to one intent: a classifier assigns the user's utterance to one of several pre-designed categories, i.e., the multi-class approach.
  • In practice, however, a single user utterance often expresses more than one intent, and in that case recognizing only one intent cannot meet business needs. The dialogue system therefore needs a natural language understanding module that can recognize multiple user intents at the same time; that is, traditional single-intent recognition must be extended to multi-intent recognition.
  • Multi-intent recognition has long been a difficult problem in the industry. The usual approach is rule matching, whose principle is to manually design keywords for all intents in advance; if a sentence matches several keywords, it is considered to hit several intents. This approach suffers from poor scalability, heavy manual workload, and incomplete coverage.
  • The present application provides an intent recognition method, apparatus, device, and storage medium based on a loss function, so as to raise the confidence at the positions of the labels, lower the confidence at the remaining positions, and improve the model's ability to recognize multiple intents.
  • A first aspect of the embodiments of the present application provides an intent recognition method based on a loss function, including: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; calling the input layer of an improved machine learning model, FastText, to segment the text to be recognized, obtaining multiple word segmentation vectors; calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • A second aspect of the embodiments of the present application provides an intent recognition apparatus based on a loss function, including: an acquisition unit, configured to acquire a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; a word segmentation unit, configured to call the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; an averaging unit, configured to call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; a calculation unit, configured to call the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and a screening unit, configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • A third aspect of the embodiments of the present application provides an intent recognition device based on a loss function, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the loss-function-based intent recognition method described above, for example the following steps: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; calling the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • A fourth aspect of the embodiments of the present application provides a computer-readable storage medium that stores computer-readable instructions which, when executed by a processor, implement the steps of the loss-function-based intent recognition method described above, for example the following steps: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; calling the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • The embodiments of the present application raise the confidence at the positions of the labels, lower the confidence at the remaining positions, and improve the model's ability to recognize multiple intents.
  • FIG. 1 is a schematic diagram of an embodiment of the intent recognition method based on a loss function in an embodiment of the application;
  • FIG. 2 is a schematic diagram of another embodiment of the intent recognition method based on a loss function in an embodiment of the application;
  • FIG. 3 is a schematic diagram of an embodiment of the intent recognition apparatus based on a loss function in an embodiment of the application;
  • FIG. 4 is a schematic diagram of another embodiment of the intent recognition apparatus based on a loss function in an embodiment of the application;
  • FIG. 5 is a schematic diagram of an embodiment of the intent recognition device based on a loss function in an embodiment of the application.
  • This application provides an intent recognition method, apparatus, device, and storage medium based on a loss function, which directly compute the influence of each label in the text on the loss function and adjust the probability distribution of the text over all intents so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
  • the technical solution of the present application can be applied to the field of artificial intelligence, involving machine learning technology to improve the ability to recognize multiple intents.
  • Referring to FIG. 1, a flowchart of the intent recognition method based on a loss function provided by an embodiment of the present application specifically includes the following steps:
  • the server obtains the text to be recognized, and the text to be recognized is used to indicate at least one intention of the target user.
  • It can be understood that the executing entity of the present application may be an intent recognition apparatus based on a loss function, or a terminal or a server, which is not specifically limited here.
  • the embodiment of the present application takes the server as the execution subject as an example for description.
  • The server calls the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors. Specifically, the server segments the text to be recognized with a preset bag of words, obtaining multiple candidate words; the server calls a preset n-gram model to characterize the multiple candidate words, obtaining the model features of each candidate word; and the server feeds the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each corresponding to one candidate word.
  • The bag of words is a method that segments the entire corpus of the data set and then judges a sentence based on the frequency of each word.
  • The n-gram model is a language model (LM). A language model is a probability-based discriminative model: its input is a sentence (a sequence of words), and its output is the probability of that sentence, i.e., the joint probability of those words.
  • FastText uses character-level n-grams to represent a word. For "apple", assuming n is 3, its trigrams are "<ap", "app", "ppl", "ple", and "le>", where "<" marks the prefix and ">" the suffix; the word vector of "apple" can be represented by the superposition of the vectors of these five trigrams.
  • For the Chinese sentence "我爱你" ("I love you"), the bag-of-words features are "我", "爱", and "你", which are identical to those of the sentence "你爱我" ("you love me"). Adding 2-grams gives the first sentence the additional features "我-爱" and "爱-你", so the two sentences "我爱你" and "你爱我" can be distinguished.
  • The server calls the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining the document vector. Specifically, the server arranges the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence; the server feeds the sequence, in that order, into the hidden layer of the improved FastText for averaging; and the server takes the output of the hidden layer of the improved FastText as the document vector.
  • The server calls the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining the loss values of multiple potential intents.
  • Specifically, the server feeds the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors; the server calls a hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure that includes multiple potential intent labels; and the server computes the multiple potential intent labels through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents.
  • The server filters the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determines multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • The server uses the binary cross-entropy loss (BCELoss) function to compute the influence of each intent on the prediction separately, so that the resulting probability distribution takes large values on multiple intents and small values on the others, improving the model's ability to recognize multiple intents.
  • The embodiment of the application directly computes the influence of each label in the text on the loss function and adjusts the probability distribution of the text over all intents so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
  • Referring to FIG. 2, another flowchart of the intent recognition method based on a loss function provided by an embodiment of the present application specifically includes the following steps:
  • the server obtains the text to be recognized, and the text to be recognized is used to indicate at least one intention of the target user.
  • It can be understood that the executing entity of the present application may be an intent recognition apparatus based on a loss function, or a terminal or a server, which is not specifically limited here.
  • the embodiment of the present application takes the server as the execution subject as an example for description.
  • The server calls the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors. Specifically, the server segments the text to be recognized with a preset bag of words, obtaining multiple candidate words; the server calls a preset n-gram model to characterize the multiple candidate words, obtaining the model features of each candidate word; and the server feeds the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each corresponding to one candidate word.
  • The bag of words is a method that segments the entire corpus of the data set and then judges a sentence based on the frequency of each word.
  • The n-gram model is a language model (LM). A language model is a probability-based discriminative model: its input is a sentence (a sequence of words), and its output is the probability of that sentence, i.e., the joint probability of those words.
  • FastText uses character-level n-grams to represent a word. For "apple", assuming n is 3, its trigrams are "<ap", "app", "ppl", "ple", and "le>", where "<" marks the prefix and ">" the suffix; the word vector of "apple" can be represented by the superposition of the vectors of these five trigrams.
  • For the Chinese sentence "我爱你" ("I love you"), the bag-of-words features are "我", "爱", and "你", which are identical to those of the sentence "你爱我" ("you love me"). Adding 2-grams gives the first sentence the additional features "我-爱" and "爱-你", so the two sentences "我爱你" and "你爱我" can be distinguished.
  • The server calls the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining the document vector. Specifically, the server arranges the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence; the server feeds the sequence, in that order, into the hidden layer of the improved FastText for averaging; and the server takes the output of the hidden layer of the improved FastText as the document vector.
  • The server feeds the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors.
  • Negative sampling is used to speed up text classification. The main idea is, given a pair of words, to predict whether the pair is a context-target pair. For example, orange-juice is a target pair and is labeled 1, called a positive sample; orange-king is not a target pair and is called a negative sample.
  • Negative sampling reduces the number of words selected as negative samples: for large sample sets only 2-5 negative samples are used per positive sample during training, and for small sample sets only 5-20, which improves sampling efficiency.
  • The server calls a hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure that includes multiple potential intent labels.
  • With the standard classifier Softmax, computing the Softmax probability of one category requires normalizing over the probabilities of all categories, which is very time-consuming when the number of categories is large. The hierarchical classifier Hierarchical Softmax was therefore proposed: a Huffman tree is constructed from the category frequencies to replace the standard Softmax, and Hierarchical Softmax reduces the complexity from N to log N.
  • The original single-class loss of the FastText model only computes the loss contributed by the category where the label sits and ignores the loss at the remaining positions, and Softmax normalizes the confidences of all categories so that they sum to 1, which prevents the model from computing the loss of multi-label data.
  • The improved loss considers the loss at the positions labeled 1 and the loss at the positions labeled 0 at the same time; the total loss is their averaged sum, so after training the confidence at every position labeled 1 is pushed as high as possible and the confidence at every position labeled 0 as low as possible.
  • A threshold of 0.1 is typically set to separate the recognized intents.
  • The server filters the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determines multiple candidate intent labels of the text to be recognized based on the filtered loss values. Specifically, the server obtains the preset threshold; the server determines whether the loss value corresponding to each potential intent is greater than the threshold; and if a target loss value is greater than the threshold, the server determines the potential intent corresponding to the target loss value as a candidate intent label of the text to be recognized, obtaining multiple candidate intent labels.
  • The server labels the text to be recognized according to the multiple candidate intent labels.
  • The embodiment of the application directly computes the influence of each label in the text on the loss function and adjusts the probability distribution of the text over all intents so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
  • Referring to FIG. 3, an embodiment of the intent recognition apparatus based on a loss function in the embodiments of the present application includes:
  • the obtaining unit 301 is configured to obtain a text to be recognized, where the text to be recognized is used to indicate at least one intention of the target user;
  • the word segmentation unit 302 is configured to call the input layer of the improved machine learning model FastText to segment the text to be recognized to obtain multiple word segmentation vectors;
  • the averaging unit 303 is configured to call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors to obtain a document vector;
  • the calculation unit 304 is configured to call the cross-entropy loss function BCELoss for two classifications to calculate each potential intent in the document vector to obtain the loss values of multiple potential intents;
  • the screening unit 305 is configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • The embodiment of the application directly computes the influence of each label in the text on the loss function and adjusts the probability distribution of the text over all intents so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
  • another embodiment of the intent recognition apparatus based on the loss function in the embodiment of the present application includes:
  • the obtaining unit 301 is configured to obtain a text to be recognized, where the text to be recognized is used to indicate at least one intention of the target user;
  • the word segmentation unit 302 is configured to call the input layer of the improved machine learning model FastText to segment the text to be recognized to obtain multiple word segmentation vectors;
  • the averaging unit 303 is configured to call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors to obtain a document vector;
  • the calculation unit 304 is configured to call the cross-entropy loss function BCELoss for two classifications to calculate each potential intent in the document vector to obtain the loss values of multiple potential intents;
  • the screening unit 305 is configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  • the calculation unit 304 includes:
  • the negative sampling module 3041 is used to input the document vector into the improved FastText for negative sampling to obtain multiple sampling vectors;
  • An import module 3042 configured to call a hierarchical classifier to import the multiple sampling vectors into a tree structure to obtain a tree classification structure, where the tree classification structure includes a plurality of potential intent tags;
  • the calculation module 3043 is configured to calculate the multiple potential intent labels through the S-type threshold function Sigmoid and the cross-entropy loss function BCELoss for two classifications to obtain the loss values of the multiple potential intents.
  • the calculation module 3043 is specifically configured to: compute the probability values y′ of the multiple potential intent labels through the threshold function Sigmoid; and compute the loss values of the multiple potential intent labels according to the obtained true probability values y and the preset BCELoss formula, where the preset BCELoss formula is l = -y log y′ - (1-y) log(1-y′).
  • the screening unit 305 is specifically configured to: obtain the preset threshold; determine whether the loss value corresponding to each potential intent is greater than the threshold; and if a target loss value is greater than the threshold, determine the potential intent corresponding to the target loss value as a candidate intent label of the text to be recognized, obtaining multiple candidate intent labels.
  • the word segmentation unit 302 is specifically configured to: segment the text to be recognized with the preset bag of words, obtaining multiple candidate words; call the preset n-gram model to characterize the multiple candidate words, obtaining the model features of each candidate word; and feed the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each corresponding to one candidate word.
  • the averaging unit 303 is specifically configured to: arrange the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence; feed the word segmentation sequence, in that order, into the hidden layer of the improved FastText for averaging; and take the output of the hidden layer of the improved FastText as the document vector.
  • the intent recognition device based on the loss function further includes:
  • the tagging unit 306 is configured to tag the text to be recognized according to a plurality of candidate intent tags.
  • The embodiment of the application directly computes the influence of each label in the text on the loss function and adjusts the probability distribution of the text over all intents so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
  • FIGS. 3 and 4 above describe the intent recognition apparatus based on a loss function in the embodiments of the present application in detail from the perspective of modular functional entities; the intent recognition device based on a loss function in the embodiments of the present application is described in detail below from the perspective of hardware processing.
  • FIG. 5 is a schematic structural diagram of a loss function-based intent recognition device provided by an embodiment of the present application.
  • The intent recognition device 500 based on a loss function may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 501 (for example, one or more processors), a memory 509, and one or more storage media 508 (for example, one or more mass storage devices) storing application programs 507 or data 506.
  • the memory 509 and the storage medium 508 may be short-term storage or persistent storage.
  • the program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the intent recognition device based on the loss function.
  • the processor 501 may be configured to communicate with the storage medium 508, and execute a series of instruction operations in the storage medium 508 on the intent recognition device 500 based on the loss function.
  • The intent recognition device 500 based on a loss function may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, for example Windows Server, Mac OS X, Unix, Linux, FreeBSD, and so on.
  • the processor 501 can perform the functions of the acquisition unit 301, the word segmentation unit 302, the averaging unit 303, the calculation unit 304, and the screening unit 305 in the foregoing embodiments.
  • The processor 501 is the control center of the intent recognition device based on a loss function and performs processing according to the configured loss-function-based intent recognition method. Using various interfaces and lines, the processor 501 connects the parts of the whole device, and performs the device's various functions and processes data by running or executing the software programs and/or modules stored in the memory 509 and calling the data stored in the memory 509, thereby raising the confidence at the positions of the labels, lowering the confidence at other positions, and improving the model's ability to recognize multiple intents.
  • The storage medium 508 and the memory 509 are both carriers for storing data. In the embodiments of the present application, the storage medium 508 may refer to internal memory with a small storage capacity but high speed, while the memory 509 may be external storage with a large storage capacity but lower speed.
  • the memory 509 may be used to store software programs and modules.
  • the processor 501 executes various functional applications and data processing of the intent recognition device 500 based on the loss function by running the software programs and modules stored in the memory 509.
  • The memory 509 may mainly include a program storage area and a data storage area.
  • The program storage area may store an operating system and at least one application program required by a function (for example, calling the hidden layer of the improved FastText to superimpose and average multiple word segmentation vectors to obtain the document vector); the data storage area may store data created through the use of the intent recognition device based on a loss function (for example, the loss values of multiple potential intents).
  • the memory 509 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage devices.
  • In the embodiments of the present application, a computer-readable storage medium (which may also be called a storage medium, and so on) is also provided. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the intent recognition method described above can be implemented.
  • The computer-readable storage medium may be non-volatile or volatile.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Machine Translation (AREA)

Abstract

An intent recognition method, apparatus, device, and storage medium based on a loss function, relating to the field of artificial intelligence and used to directly compute the influence of each label in a text on the loss function, improving the model's ability to recognize multiple intents. The method includes: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user (101); calling the input layer of an improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors (102); calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector (103); calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents (104); and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values (105).

Description

Intent recognition method, apparatus, device, and storage medium based on a loss function
This application claims priority to the Chinese patent application with application number 202010156696.2, entitled "Intent recognition method, apparatus, device, and storage medium based on a loss function", filed with the Chinese Patent Office on March 9, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of classification algorithms, and in particular to an intent recognition method, apparatus, device, and storage medium based on a loss function.
Background
In multi-turn dialogue systems, natural language understanding (NLU) is a crucial step for the machine to obtain information from the user. Intent recognition on text is one of the most common applications of natural language understanding, but conventional intent recognition is limited to one sentence corresponding to one intent: a classifier assigns the user's utterance to one of several pre-designed categories, i.e., the multi-class approach. In practice, however, a single user utterance often expresses more than one intent, and in such cases recognizing only one intent cannot meet business needs. A dialogue system therefore needs a natural language understanding module that can recognize multiple user intents simultaneously; that is, traditional single-intent recognition must be extended to multi-intent recognition.
Multi-intent recognition has long been a difficult problem in the industry. The inventor found that the usual approach is rule matching, whose principle is to manually design keywords for all intents in advance: if a sentence matches several keywords, it is considered to hit several intents. This approach suffers from poor scalability, heavy manual workload, and incomplete coverage.
Summary
This application provides an intent recognition method, apparatus, device, and storage medium based on a loss function, which raise the confidence at the positions of the labels, lower the confidence at the remaining positions, and improve the model's ability to recognize multiple intents.
A first aspect of the embodiments of this application provides an intent recognition method based on a loss function, including: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; calling the input layer of an improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
A second aspect of the embodiments of this application provides an intent recognition apparatus based on a loss function, including: an acquisition unit, configured to acquire a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; a word segmentation unit, configured to call the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; an averaging unit, configured to call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; a calculation unit, configured to call the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and a screening unit, configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
A third aspect of the embodiments of this application provides an intent recognition device based on a loss function, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the intent recognition method based on a loss function described above, for example the following steps: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; calling the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
A fourth aspect of the embodiments of this application provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the intent recognition method based on a loss function described above, for example the following steps: acquiring a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user; calling the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors; calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector; calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents; and filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
The embodiments of this application raise the confidence at the positions of the labels, lower the confidence at the remaining positions, and improve the model's ability to recognize multiple intents.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an embodiment of the intent recognition method based on a loss function in an embodiment of this application;
FIG. 2 is a schematic diagram of another embodiment of the intent recognition method based on a loss function in an embodiment of this application;
FIG. 3 is a schematic diagram of an embodiment of the intent recognition apparatus based on a loss function in an embodiment of this application;
FIG. 4 is a schematic diagram of another embodiment of the intent recognition apparatus based on a loss function in an embodiment of this application;
FIG. 5 is a schematic diagram of an embodiment of the intent recognition device based on a loss function in an embodiment of this application.
Detailed Description
This application provides an intent recognition method, apparatus, device, and storage medium based on a loss function, which directly compute the influence of each label in a text on the loss function and adjust the probability distribution of the text over all intents so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
The technical solution of this application can be applied to the field of artificial intelligence and involves machine learning technology to improve the ability to recognize multiple intents.
To enable those skilled in the art to better understand the solution of this application, the embodiments of this application are described below with reference to the accompanying drawings.
The terms "first", "second", "third", "fourth", and the like (if any) in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be implemented in an order other than that illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units clearly listed, but may include other steps or units that are not clearly listed or that are inherent to the process, method, product, or device.
Referring to FIG. 1, a flowchart of the intent recognition method based on a loss function provided by an embodiment of this application specifically includes:
101. Acquire a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user.
The server acquires the text to be recognized, and the text to be recognized is used to indicate at least one intent of the target user.
It should be noted that different texts to be recognized may correspond to the same intent or to different intents. Different texts can express the same intent or intents; for example, "是否可以购买平安福" ("Can I buy Ping An Fu?") and "可以购买平安福产品吗" ("Is it possible to buy the Ping An Fu product?") express the same intent. Near-identical texts can also express different intents; for example, "我想买平安福" ("I want to buy Ping An Fu") and "我想买平安福?" ("I want to buy Ping An Fu?") express different intents.
It can be understood that the executing entity of this application may be an intent recognition apparatus based on a loss function, or a terminal or a server, which is not specifically limited here. The embodiments of this application take the server as the executing entity as an example.
102. Call the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors.
The server calls the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors. Specifically, the server segments the text to be recognized with a preset bag of words, obtaining multiple candidate words; the server calls a preset n-gram model to characterize the multiple candidate words, obtaining the model features of each candidate word; and the server feeds the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each corresponding to one candidate word.
The bag of words is a method that segments the entire corpus of the data set and then judges a sentence based on the frequency of each word. The n-gram model is a language model (LM); a language model is a probability-based discriminative model whose input is a sentence (a sequence of words) and whose output is the probability of that sentence, i.e., the joint probability of those words.
For example, FastText uses character-level n-grams to represent a word. For "apple", assuming n is 3, its trigrams are "<ap", "app", "ppl", "ple", and "le>", where "<" marks the prefix and ">" the suffix; the word vector of "apple" can be represented by the superposition of the vectors of these five trigrams. As another example, for the Chinese sentence "我爱你" ("I love you"), the bag-of-words features are "我", "爱", and "你", which are identical to those of the sentence "你爱我" ("you love me"). Adding 2-grams gives the first sentence the additional features "我-爱" and "爱-你", so the two sentences "我爱你" and "你爱我" can be distinguished, as the sketch below illustrates.
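The following minimal Python sketch (illustrative only; the function names are not from the patent) reproduces the two kinds of features just described: FastText-style character n-grams with "<" and ">" boundary markers, and word-level 2-gram features that tell "我爱你" apart from "你爱我":

    def char_ngrams(word, n=3):
        # FastText-style character n-grams with "<" prefix and ">" suffix markers.
        padded = "<" + word + ">"
        return [padded[i:i + n] for i in range(len(padded) - n + 1)]

    def word_ngrams(tokens, n=2):
        # Word-level n-gram features over an already segmented sentence.
        return ["-".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    print(char_ngrams("apple"))            # ['<ap', 'app', 'ppl', 'ple', 'le>']
    print(word_ngrams(["我", "爱", "你"]))   # ['我-爱', '爱-你']
    print(word_ngrams(["你", "爱", "我"]))   # ['你-爱', '爱-我'], now distinguishable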
It should be noted that, to improve recognition efficiency, low-frequency n-grams need to be filtered out.
103. Call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector.
The server calls the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining the document vector. Specifically, the server arranges the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence; the server feeds the word segmentation sequence, in that order, into the hidden layer of the improved FastText for averaging; and the server takes the output of the hidden layer of the improved FastText as the document vector.
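As a rough sketch (an assumption of this write-up, not the patent's own code), the superposition-and-average of step 103 amounts to a mean over the word segmentation vectors:

    import numpy as np

    def document_vector(segment_vectors):
        # Superimpose (stack) the word segmentation vectors and average them,
        # mimicking the hidden-layer averaging of step 103.
        return np.stack(segment_vectors).mean(axis=0)

    # e.g. five 100-dimensional word segmentation vectors -> one document vector
    doc_vec = document_vector([np.random.rand(100) for _ in range(5)])
    print(doc_vec.shape)  # (100,)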
104. Call the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining the loss values of multiple potential intents.
The server calls the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining the loss values of multiple potential intents. Specifically, the server feeds the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors; the server calls the hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure that includes multiple potential intent labels; and the server computes the multiple potential intent labels through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents.
105. Filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
The server filters the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and the preset threshold, and determines multiple candidate intent labels of the text to be recognized based on the filtered loss values. The server uses the binary cross-entropy loss (BCELoss) function to compute the influence of each intent on the prediction separately, so that the resulting probability distribution takes large values on multiple intents and small values on the others, improving the model's ability to recognize multiple intents.
In this embodiment of the application, the influence of each label in the text on the loss function is computed directly, and the probability distribution of the text over all intents is adjusted so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
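To make the pipeline of this embodiment concrete, the sketch below wires the pieces together in PyTorch under stated assumptions: mean-pooled embeddings stand in for the improved FastText hidden layer, one Sigmoid score per intent replaces Softmax, and BCEWithLogitsLoss plays the role of Sigmoid plus BCELoss. All sizes and names are illustrative, and the hierarchical tree and negative sampling of the later steps are omitted:

    import torch
    import torch.nn as nn

    class MultiIntentFastText(nn.Module):
        # Mean-pooled embeddings ("hidden layer") followed by one score per
        # potential intent label, trained with per-label binary cross entropy.
        def __init__(self, vocab_size=50000, dim=100, num_intents=20):
            super().__init__()
            self.embedding = nn.EmbeddingBag(vocab_size, dim, mode="mean")
            self.out = nn.Linear(dim, num_intents)

        def forward(self, token_ids, offsets):
            doc_vec = self.embedding(token_ids, offsets)  # document vector
            return self.out(doc_vec)                      # one logit per intent

    model = MultiIntentFastText()
    criterion = nn.BCEWithLogitsLoss()       # Sigmoid + BCELoss in one step
    tokens = torch.randint(0, 50000, (12,))  # token ids of two documents
    offsets = torch.tensor([0, 7])           # document boundaries in the batch
    targets = torch.zeros(2, 20)
    targets[0, 3] = 1                        # first document carries two intents
    targets[0, 5] = 1
    loss = criterion(model(tokens, offsets), targets)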
Referring to FIG. 2, another flowchart of the intent recognition method based on a loss function provided by an embodiment of this application specifically includes:
201. Acquire a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user.
The server acquires the text to be recognized, and the text to be recognized is used to indicate at least one intent of the target user.
It should be noted that different texts to be recognized may correspond to the same intent or to different intents. Different texts can express the same intent or intents; for example, "是否可以购买平安福" ("Can I buy Ping An Fu?") and "可以购买平安福产品吗" ("Is it possible to buy the Ping An Fu product?") express the same intent. Near-identical texts can also express different intents; for example, "我想买平安福" ("I want to buy Ping An Fu") and "我想买平安福?" ("I want to buy Ping An Fu?") express different intents.
It can be understood that the executing entity of this application may be an intent recognition apparatus based on a loss function, or a terminal or a server, which is not specifically limited here. The embodiments of this application take the server as the executing entity as an example.
202. Call the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors.
The server calls the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors. Specifically, the server segments the text to be recognized with a preset bag of words, obtaining multiple candidate words; the server calls a preset n-gram model to characterize the multiple candidate words, obtaining the model features of each candidate word; and the server feeds the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each corresponding to one candidate word.
The bag of words is a method that segments the entire corpus of the data set and then judges a sentence based on the frequency of each word. The n-gram model is a language model (LM); a language model is a probability-based discriminative model whose input is a sentence (a sequence of words) and whose output is the probability of that sentence, i.e., the joint probability of those words.
For example, FastText uses character-level n-grams to represent a word. For "apple", assuming n is 3, its trigrams are "<ap", "app", "ppl", "ple", and "le>", where "<" marks the prefix and ">" the suffix; the word vector of "apple" can be represented by the superposition of the vectors of these five trigrams. As another example, for the Chinese sentence "我爱你" ("I love you"), the bag-of-words features are "我", "爱", and "你", which are identical to those of the sentence "你爱我" ("you love me"). Adding 2-grams gives the first sentence the additional features "我-爱" and "爱-你", so the two sentences can be distinguished.
It should be noted that, to improve recognition efficiency, low-frequency n-grams need to be filtered out.
203. Call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector.
The server calls the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining the document vector. Specifically, the server arranges the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence; the server feeds the word segmentation sequence, in that order, into the hidden layer of the improved FastText for averaging; and the server takes the output of the hidden layer of the improved FastText as the document vector.
204. Feed the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors.
The server feeds the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors. Negative sampling is used to speed up text classification. The main idea is, given a pair of words, to predict whether it is a context-target pair. For example, orange-juice is a target pair and is labeled 1, called a positive sample; orange-king is not a target pair and is called a negative sample. Negative sampling reduces the number of words selected as negative samples: for large sample sets only 2-5 negative samples are used per positive sample during training, while for small sample sets only 5-20 are used, which improves sampling efficiency.
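A minimal sketch of this sampling scheme (the vocabulary, names, and uniform sampling are illustrative assumptions; FastText's own implementation draws negatives from a frequency-based distribution):

    import random

    def training_pairs(context, target, vocab, k):
        # One positive (context, target) pair labeled 1, plus k negative pairs
        # labeled 0; k is typically 2-5 for large corpora and 5-20 for small ones.
        negatives = random.sample([w for w in vocab if w != target], k)
        return [(context, target, 1)] + [(context, w, 0) for w in negatives]

    vocab = ["juice", "king", "book", "glass", "table", "man"]
    print(training_pairs("orange", "juice", vocab, k=3))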
205. Call the hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure, where the tree classification structure includes multiple potential intent labels.
The server calls the hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure that includes multiple potential intent labels.
With the standard classifier Softmax, computing the Softmax probability of one category requires normalizing over the probabilities of all categories, which is very time-consuming when the number of categories is large. The hierarchical classifier Hierarchical Softmax was therefore proposed: a Huffman tree is constructed from the category frequencies to replace the standard Softmax, and Hierarchical Softmax reduces the complexity from N to log N.
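The sketch below (with invented intent frequencies) builds such a frequency-based Huffman tree and reports each label's depth; frequent labels end up near the root, which is what brings the expected per-example cost down from N to about log N:

    import heapq

    def huffman_depths(freqs):
        # Each heap entry: (frequency, tie-breaker, {label: depth so far}).
        heap = [(f, i, {label: 0}) for i, (label, f) in enumerate(freqs.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            f1, _, d1 = heapq.heappop(heap)
            f2, _, d2 = heapq.heappop(heap)
            merged = {k: v + 1 for k, v in {**d1, **d2}.items()}  # one level deeper
            heapq.heappush(heap, (f1 + f2, tie, merged))
            tie += 1
        return heap[0][2]

    freqs = {"greet": 120, "buy": 60, "ask_price": 30, "complain": 15, "other": 5}
    print(huffman_depths(freqs))  # e.g. {'greet': 1, 'buy': 2, 'ask_price': 3, ...}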
206. Compute the multiple potential intent labels through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of multiple potential intents.
The server computes the multiple potential intent labels through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents. Specifically, the server computes the probability values y′ of the multiple potential intent labels through the threshold function Sigmoid; the server then computes the loss values of the multiple potential intent labels according to the obtained true probability values y and the preset BCELoss formula, where the preset BCELoss formula is l = -y log y′ - (1-y) log(1-y′).
For example, the original single-class loss of the FastText model only computes the loss contributed by the category where the label sits and ignores the loss at the remaining positions, and Softmax normalizes the confidences of all categories so that they sum to 1, which prevents the model from computing the loss of multi-label data. The improved loss computation of this application is l = -y log y′ - (1-y) log(1-y′), for example with
y = [0, 0, 1, 0, 1, 0]
y′ = [0.01, 0.19, 0.72, 0.69, 0.15, 0.03]
Here the loss at the positions labeled 1 and the loss at the positions labeled 0 are considered at the same time, and the total loss is their averaged sum; after training, the confidence at every position labeled 1 is pushed as high as possible and the confidence at every position labeled 0 as low as possible.
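A NumPy sketch of this improved loss on the example vectors above (equivalent in spirit to torch.nn.BCELoss; the epsilon clamp is an added numerical-stability assumption):

    import numpy as np

    def bce_loss(y, y_prob, eps=1e-12):
        # l = -y*log(y') - (1-y)*log(1-y'), evaluated at every label position
        # (the 1-labels and the 0-labels alike), then averaged.
        y = np.asarray(y, dtype=float)
        y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
        per_position = -y * np.log(y_prob) - (1 - y) * np.log(1 - y_prob)
        return per_position.mean()

    y      = [0, 0, 1, 0, 1, 0]
    y_prob = [0.01, 0.19, 0.72, 0.69, 0.15, 0.03]  # per-label Sigmoid outputs
    print(bce_loss(y, y_prob))  # the 0.69 and 0.15 positions dominate the loss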
It should be noted that the generated loss-function values of the potential intents do not sum to 1: multiple intents receive high confidence while the scores of the other intents are very small, so a threshold of 0.1 is generally sufficient to separate the recognized intents.
207. Filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and the preset threshold, and determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
The server filters the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and the preset threshold, and determines multiple candidate intent labels of the text to be recognized based on the filtered loss values. Specifically, the server obtains the preset threshold; the server determines whether the loss value corresponding to each potential intent is greater than the threshold; and if a target loss value is greater than the threshold, the server determines the potential intent corresponding to the target loss value as a candidate intent label of the text to be recognized, obtaining multiple candidate intent labels.
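A sketch of this screening step (the per-intent values here behave as Sigmoid confidences, consistent with the 0.1 threshold discussed above; the intent names are invented):

    def screen_intents(intent_scores, threshold=0.1):
        # Keep every potential intent whose score exceeds the preset threshold;
        # unlike an argmax over Softmax, several labels can pass at once.
        return [intent for intent, s in intent_scores.items() if s > threshold]

    scores = {"buy_product": 0.72, "ask_price": 0.69, "greet": 0.03}
    print(screen_intents(scores))  # ['buy_product', 'ask_price']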
208. Label the text to be recognized according to the multiple candidate intent labels.
The server labels the text to be recognized according to the multiple candidate intent labels.
In this embodiment of the application, the influence of each label in the text on the loss function is computed directly, and the probability distribution of the text over all intents is adjusted so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
The intent recognition method based on a loss function in the embodiments of this application has been described above; the intent recognition apparatus based on a loss function in the embodiments of this application is described below. Referring to FIG. 3, an embodiment of the intent recognition apparatus based on a loss function in the embodiments of this application includes:
an acquisition unit 301, configured to acquire a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user;
a word segmentation unit 302, configured to call the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors;
an averaging unit 303, configured to call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector;
a calculation unit 304, configured to call the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents;
a screening unit 305, configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
In this embodiment of the application, the influence of each label in the text on the loss function is computed directly, and the probability distribution of the text over all intents is adjusted so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
Referring to FIG. 4, another embodiment of the intent recognition apparatus based on a loss function in the embodiments of this application includes:
an acquisition unit 301, configured to acquire a text to be recognized, where the text to be recognized is used to indicate at least one intent of a target user;
a word segmentation unit 302, configured to call the input layer of the improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors;
an averaging unit 303, configured to call the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector;
a calculation unit 304, configured to call the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents;
a screening unit 305, configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
Optionally, the calculation unit 304 includes:
a negative sampling module 3041, configured to feed the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors;
an import module 3042, configured to call the hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure, where the tree classification structure includes multiple potential intent labels;
a calculation module 3043, configured to compute the multiple potential intent labels through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents.
Optionally, the calculation module 3043 is specifically configured to:
compute the probability values y′ of the multiple potential intent labels through the threshold function Sigmoid; and compute the loss values of the multiple potential intent labels according to the obtained true probability values y and the preset BCELoss formula, where the preset BCELoss formula is l = -y log y′ - (1-y) log(1-y′).
Optionally, the screening unit 305 is specifically configured to:
obtain the preset threshold; determine whether the loss value corresponding to each potential intent is greater than the threshold; and if a target loss value is greater than the threshold, determine the potential intent corresponding to the target loss value as a candidate intent label of the text to be recognized, obtaining multiple candidate intent labels.
Optionally, the word segmentation unit 302 is specifically configured to:
segment the text to be recognized with the preset bag of words, obtaining multiple candidate words; call the preset n-gram model to characterize the multiple candidate words, obtaining the model features of each candidate word; and feed the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each corresponding to one candidate word.
Optionally, the averaging unit 303 is specifically configured to:
arrange the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence; feed the word segmentation sequence, in that order, into the hidden layer of the improved FastText for averaging; and obtain the output of the hidden layer of the improved FastText as the document vector.
Optionally, the intent recognition apparatus based on a loss function further includes:
a labeling unit 306, configured to label the text to be recognized according to the multiple candidate intent labels.
In this embodiment of the application, the influence of each label in the text on the loss function is computed directly, and the probability distribution of the text over all intents is adjusted so that the loss at every label position is counted, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents.
FIGS. 3 and 4 above describe the intent recognition apparatus based on a loss function in the embodiments of this application in detail from the perspective of modular functional entities; the intent recognition device based on a loss function in the embodiments of this application is described in detail below from the perspective of hardware processing.
FIG. 5 is a schematic structural diagram of an intent recognition device based on a loss function provided by an embodiment of this application. The intent recognition device 500 based on a loss function may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPU) 501 (for example, one or more processors), a memory 509, and one or more storage media 508 (for example, one or more mass storage devices) storing application programs 507 or data 506. The memory 509 and the storage medium 508 may be short-term storage or persistent storage. The program stored in the storage medium 508 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the intent recognition device based on a loss function. Furthermore, the processor 501 may be configured to communicate with the storage medium 508 and execute, on the intent recognition device 500 based on a loss function, the series of instruction operations in the storage medium 508.
The intent recognition device 500 based on a loss function may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems 505, for example Windows Server, Mac OS X, Unix, Linux, FreeBSD, and so on. Those skilled in the art will understand that the structure of the intent recognition device based on a loss function shown in FIG. 5 does not constitute a limitation on the device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently. The processor 501 can perform the functions of the acquisition unit 301, the word segmentation unit 302, the averaging unit 303, the calculation unit 304, and the screening unit 305 in the foregoing embodiments.
The components of the intent recognition device based on a loss function are described below with reference to FIG. 5:
The processor 501 is the control center of the intent recognition device based on a loss function and performs processing according to the configured intent recognition method based on a loss function. Using various interfaces and lines, the processor 501 connects the parts of the whole device, and performs the device's various functions and processes data by running or executing the software programs and/or modules stored in the memory 509 and calling the data stored in the memory 509, thereby raising the confidence at the positions of the labels, lowering the confidence at the remaining positions, and improving the model's ability to recognize multiple intents. The storage medium 508 and the memory 509 are both carriers for storing data; in the embodiments of this application, the storage medium 508 may refer to internal memory with a small storage capacity but high speed, while the memory 509 may be external storage with a large storage capacity but lower speed.
The memory 509 may be used to store software programs and modules, and the processor 501 executes the various functional applications and data processing of the intent recognition device 500 based on a loss function by running the software programs and modules stored in the memory 509. The memory 509 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and at least one application program required by a function (for example, calling the hidden layer of the improved FastText to superimpose and average multiple word segmentation vectors to obtain the document vector), and the data storage area may store data created through the use of the intent recognition device based on a loss function (for example, the loss values of multiple potential intents). In addition, the memory 509 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. The intent recognition method program based on a loss function provided in the embodiments of this application and the received data stream are stored in the memory, and the processor 501 calls them from the memory 509 when needed.
In the embodiments of this application, a computer-readable storage medium (which may also be called a storage medium, and so on) is also provided. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the intent recognition method described above can be implemented. Optionally, the computer-readable storage medium may be non-volatile or volatile.
The above embodiments are only intended to illustrate the technical solution of this application, not to limit it. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features therein, and these modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (20)

  1. An intent recognition method based on a loss function, comprising:
    acquiring a text to be recognized, wherein the text to be recognized is used to indicate at least one intent of a target user;
    calling an input layer of an improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors;
    calling a hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector;
    calling a binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents;
    filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  2. The intent recognition method based on a loss function according to claim 1, wherein the calling the binary cross-entropy loss function BCELoss to compute each potential intent in the document vector to obtain the loss values of multiple potential intents comprises:
    feeding the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors;
    calling a hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure, wherein the tree classification structure comprises multiple potential intent labels;
    computing the multiple potential intent labels through a sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents.
  3. The intent recognition method based on a loss function according to claim 2, wherein the computing the multiple potential intent labels through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss to obtain the loss values of the multiple potential intents comprises:
    computing probability values y′ of the multiple potential intent labels through the threshold function Sigmoid;
    computing the loss values of the multiple potential intent labels according to obtained true probability values y and a preset BCELoss formula, wherein the preset BCELoss formula is l = -y log y′ - (1-y) log(1-y′).
  4. The intent recognition method based on a loss function according to claim 1, wherein the filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and the preset threshold and determining the multiple candidate intent labels of the text to be recognized based on the filtered loss values comprises:
    obtaining the preset threshold;
    determining whether the loss value corresponding to each potential intent is greater than the threshold;
    if a target loss value is greater than the threshold, determining the potential intent corresponding to the target loss value as a candidate intent label of the text to be recognized, obtaining multiple candidate intent labels.
  5. The intent recognition method based on a loss function according to claim 1, wherein the calling the input layer of the improved machine learning model FastText to segment the text to be recognized to obtain multiple word segmentation vectors comprises:
    segmenting the text to be recognized with a preset bag of words, obtaining multiple candidate words;
    calling a preset n-gram model to characterize the multiple candidate words, obtaining model features of each candidate word;
    feeding the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each word segmentation vector corresponding to one candidate word.
  6. The intent recognition method based on a loss function according to claim 1, wherein the calling the hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors to obtain the document vector comprises:
    arranging the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence;
    feeding the word segmentation sequence, in the word segmentation order, into the hidden layer of the improved FastText for averaging;
    obtaining an output result of the hidden layer of the improved FastText as the document vector.
  7. The intent recognition method based on a loss function according to any one of claims 1-6, wherein after the filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and the preset threshold and determining the multiple candidate intent labels of the text to be recognized based on the filtered loss values, the method further comprises:
    labeling the text to be recognized according to the multiple candidate intent labels.
  8. An intent recognition apparatus based on a loss function, comprising:
    an acquisition unit, configured to acquire a text to be recognized, wherein the text to be recognized is used to indicate at least one intent of a target user;
    a word segmentation unit, configured to call an input layer of an improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors;
    an averaging unit, configured to call a hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector;
    a calculation unit, configured to call a binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents;
    a screening unit, configured to filter the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and to determine multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  9. An intent recognition device based on a loss function, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:
    acquiring a text to be recognized, wherein the text to be recognized is used to indicate at least one intent of a target user;
    calling an input layer of an improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors;
    calling a hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector;
    calling a binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents;
    filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  10. The intent recognition device based on a loss function according to claim 9, wherein when the binary cross-entropy loss function BCELoss is called to compute each potential intent in the document vector to obtain the loss values of multiple potential intents, the following steps are specifically implemented:
    feeding the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors;
    calling a hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure, wherein the tree classification structure comprises multiple potential intent labels;
    computing the multiple potential intent labels through a sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents.
  11. The intent recognition device based on a loss function according to claim 10, wherein when the multiple potential intent labels are computed through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss to obtain the loss values of the multiple potential intents, the following steps are specifically implemented:
    computing probability values y′ of the multiple potential intent labels through the threshold function Sigmoid;
    computing the loss values of the multiple potential intent labels according to obtained true probability values y and a preset BCELoss formula, wherein the preset BCELoss formula is l = -y log y′ - (1-y) log(1-y′).
  12. The intent recognition device based on a loss function according to claim 9, wherein when the loss values of the multiple potential intents are filtered according to the loss value corresponding to each potential intent and the preset threshold and the multiple candidate intent labels of the text to be recognized are determined based on the filtered loss values, the following steps are specifically implemented:
    obtaining the preset threshold;
    determining whether the loss value corresponding to each potential intent is greater than the threshold;
    if a target loss value is greater than the threshold, determining the potential intent corresponding to the target loss value as a candidate intent label of the text to be recognized, obtaining multiple candidate intent labels.
  13. The intent recognition device based on a loss function according to claim 9, wherein when the input layer of the improved machine learning model FastText is called to segment the text to be recognized to obtain multiple word segmentation vectors, the following steps are specifically implemented:
    segmenting the text to be recognized with a preset bag of words, obtaining multiple candidate words;
    calling a preset n-gram model to characterize the multiple candidate words, obtaining model features of each candidate word;
    feeding the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each word segmentation vector corresponding to one candidate word.
  14. The intent recognition device based on a loss function according to claim 9, wherein when the hidden layer of the improved FastText is called to superimpose and average the multiple word segmentation vectors to obtain the document vector, the following steps are specifically implemented:
    arranging the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence;
    feeding the word segmentation sequence, in the word segmentation order, into the hidden layer of the improved FastText for averaging;
    obtaining an output result of the hidden layer of the improved FastText as the document vector.
  15. The intent recognition device based on a loss function according to any one of claims 9-14, wherein after the loss values of the multiple potential intents are filtered according to the loss value corresponding to each potential intent and the preset threshold and the multiple candidate intent labels of the text to be recognized are determined based on the filtered loss values, the following step is further implemented:
    labeling the text to be recognized according to the multiple candidate intent labels.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the following steps are implemented:
    acquiring a text to be recognized, wherein the text to be recognized is used to indicate at least one intent of a target user;
    calling an input layer of an improved machine learning model FastText to segment the text to be recognized, obtaining multiple word segmentation vectors;
    calling a hidden layer of the improved FastText to superimpose and average the multiple word segmentation vectors, obtaining a document vector;
    calling a binary cross-entropy loss function BCELoss to compute each potential intent in the document vector, obtaining loss values of multiple potential intents;
    filtering the loss values of the multiple potential intents according to the loss value corresponding to each potential intent and a preset threshold, and determining multiple candidate intent labels of the text to be recognized based on the filtered loss values.
  17. The computer-readable storage medium according to claim 16, wherein when the binary cross-entropy loss function BCELoss is called to compute each potential intent in the document vector to obtain the loss values of multiple potential intents, the following steps are specifically implemented:
    feeding the document vector into the improved FastText for negative sampling, obtaining multiple sampling vectors;
    calling a hierarchical classifier to import the multiple sampling vectors into a tree structure, obtaining a tree classification structure, wherein the tree classification structure comprises multiple potential intent labels;
    computing the multiple potential intent labels through a sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss, obtaining the loss values of the multiple potential intents.
  18. The computer-readable storage medium according to claim 17, wherein when the multiple potential intent labels are computed through the sigmoid threshold function Sigmoid and the binary cross-entropy loss function BCELoss to obtain the loss values of the multiple potential intents, the following steps are specifically implemented:
    computing probability values y′ of the multiple potential intent labels through the threshold function Sigmoid;
    computing the loss values of the multiple potential intent labels according to obtained true probability values y and a preset BCELoss formula, wherein the preset BCELoss formula is l = -y log y′ - (1-y) log(1-y′).
  19. The computer-readable storage medium according to claim 16, wherein when the input layer of the improved machine learning model FastText is called to segment the text to be recognized to obtain multiple word segmentation vectors, the following steps are specifically implemented:
    segmenting the text to be recognized with a preset bag of words, obtaining multiple candidate words;
    calling a preset n-gram model to characterize the multiple candidate words, obtaining model features of each candidate word;
    feeding the model features of each candidate word into the input layer of the improved machine learning model FastText to generate multiple word segmentation vectors, each word segmentation vector corresponding to one candidate word.
  20. The computer-readable storage medium according to claim 16, wherein when the hidden layer of the improved FastText is called to superimpose and average the multiple word segmentation vectors to obtain the document vector, the following steps are specifically implemented:
    arranging the multiple word segmentation vectors in word segmentation order, obtaining a word segmentation sequence;
    feeding the word segmentation sequence, in the word segmentation order, into the hidden layer of the improved FastText for averaging;
    obtaining an output result of the hidden layer of the improved FastText as the document vector.
PCT/CN2020/098833 2020-03-09 2020-06-29 Intent recognition method, apparatus, device, and storage medium based on a loss function WO2021179483A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010156696.2 2020-03-09
CN202010156696.2A CN111460806A (zh) 2020-03-09 Intent recognition method, apparatus, device, and storage medium based on a loss function

Publications (1)

Publication Number Publication Date
WO2021179483A1 true WO2021179483A1 (zh) 2021-09-16

Family

ID=71685581

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/098833 WO2021179483A1 (zh) 2020-03-09 2020-06-29 Intent recognition method, apparatus, device, and storage medium based on a loss function

Country Status (2)

Country Link
CN (1) CN111460806A (zh)
WO (1) WO2021179483A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113888143A (zh) * 2021-12-08 2022-01-04 畅捷通信息技术股份有限公司 Reconciliation statement data processing method, apparatus, and storage medium
US20220129633A1 (en) * 2020-10-23 2022-04-28 Target Brands, Inc. Multi-task learning of query intent and named entities
CN114661909A (zh) * 2022-03-25 2022-06-24 鼎富智能科技有限公司 Intent recognition model training method and apparatus, electronic device, and storage medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114818703B (zh) * 2022-06-28 2022-09-16 珠海金智维信息科技有限公司 Multi-intent recognition method and system based on the BERT language model and the TextCNN model
CN115600646B (zh) * 2022-10-19 2023-10-03 北京百度网讯科技有限公司 Language model training method, apparatus, medium, and device
CN116521822B (zh) * 2023-03-15 2024-02-13 上海帜讯信息技术股份有限公司 User intent recognition method and apparatus based on a 5G message multi-turn conversation mechanism

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165666A (zh) * 2018-07-05 2019-01-08 南京旷云科技有限公司 Multi-label image classification method, apparatus, device, and storage medium
CN109657230A (zh) * 2018-11-06 2019-04-19 众安信息技术服务有限公司 Named entity recognition method and apparatus fusing word vectors and part-of-speech vectors
CN110415086A (zh) * 2019-08-01 2019-11-05 信雅达系统工程股份有限公司 Intelligent wealth-management recommendation method based on features of users' sequential behavior
US20190377825A1 (en) * 2018-06-06 2019-12-12 Microsoft Technology Licensing Llc Taxonomy enrichment using ensemble classifiers
US20200005194A1 (en) * 2018-06-30 2020-01-02 Microsoft Technology Licensing, Llc Machine learning for associating skills with content
CN110728153A (zh) * 2019-10-15 2020-01-24 天津理工大学 Multi-class sentiment classification method based on model fusion
CN110851817A (zh) * 2019-10-29 2020-02-28 锐捷网络股份有限公司 Terminal type identification method and apparatus

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190377825A1 (en) * 2018-06-06 2019-12-12 Microsoft Technology Licensing Llc Taxonomy enrichment using ensemble classifiers
US20200005194A1 (en) * 2018-06-30 2020-01-02 Microsoft Technology Licensing, Llc Machine learning for associating skills with content
CN109165666A (zh) * 2018-07-05 2019-01-08 南京旷云科技有限公司 Multi-label image classification method, apparatus, device, and storage medium
CN109657230A (zh) * 2018-11-06 2019-04-19 众安信息技术服务有限公司 Named entity recognition method and apparatus fusing word vectors and part-of-speech vectors
CN110415086A (zh) * 2019-08-01 2019-11-05 信雅达系统工程股份有限公司 Intelligent wealth-management recommendation method based on features of users' sequential behavior
CN110728153A (zh) * 2019-10-15 2020-01-24 天津理工大学 Multi-class sentiment classification method based on model fusion
CN110851817A (zh) * 2019-10-29 2020-02-28 锐捷网络股份有限公司 Terminal type identification method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG, GUANGCI: "Short text classification based on FastText", ELECTRONIC DESIGN ENGINEERING, vol. 28, no. 3, 15 February 2020 (2020-02-15), CN, pages 98 - 101, XP009530349, ISSN: 1674-6236 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220129633A1 (en) * 2020-10-23 2022-04-28 Target Brands, Inc. Multi-task learning of query intent and named entities
US11934785B2 (en) * 2020-10-23 2024-03-19 Target Brands, Inc. Multi-task learning of query intent and named entities
CN113888143A (zh) * 2021-12-08 2022-01-04 畅捷通信息技术股份有限公司 Reconciliation statement data processing method, apparatus, and storage medium
CN114661909A (zh) * 2022-03-25 2022-06-24 鼎富智能科技有限公司 Intent recognition model training method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN111460806A (zh) 2020-07-28

Similar Documents

Publication Publication Date Title
WO2021179483A1 (zh) Intent recognition method, apparatus, device, and storage medium based on a loss function
US11734329B2 (en) System and method for text categorization and sentiment analysis
US11151175B2 (en) On-demand relation extraction from text
Cambria et al. Benchmarking multimodal sentiment analysis
WO2019153551A1 (zh) Article classification method and apparatus, computer device, and storage medium
WO2021212675A1 (zh) Adversarial example generation method and apparatus, electronic device, and storage medium
US10331737B2 (en) System for generation of a large-scale database of hetrogeneous speech
US20190013011A1 (en) Machine learning dialect identification
WO2022222300A1 (zh) Open relation extraction method and apparatus, electronic device, and storage medium
US8949211B2 (en) Objective-function based sentiment
US8620837B2 (en) Determination of a basis for a new domain model based on a plurality of learned models
CN111274394A (zh) Entity relation extraction method, apparatus, device, and storage medium
CN113055386B (zh) Method and apparatus for identifying and analyzing attack organizations
WO2018045646A1 (zh) Artificial-intelligence-based human-computer interaction method and apparatus
US11847423B2 (en) Dynamic intent classification based on environment variables
CN111353028B (zh) Method and apparatus for determining customer service script clusters
CN111930929A (zh) Article title generation method, apparatus, and computing device
US20230089308A1 (en) Speaker-Turn-Based Online Speaker Diarization with Constrained Spectral Clustering
CN107861948B (zh) Label extraction method, apparatus, device, and medium
CN107391565B (zh) Topic-model-based cross-lingual hierarchical taxonomy matching method
CN112101031B (zh) Entity recognition method, terminal device, and storage medium
CN112347760A (zh) Intent recognition model training method and apparatus, and intent recognition method and apparatus
WO2023065642A1 (zh) Corpus screening method, intent recognition model optimization method, device, and storage medium
WO2020172649A1 (en) System and method for text categorization and sentiment analysis
CN113434858A (zh) Malware family classification method based on disassembled code structure and semantic features

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20924696

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20924696

Country of ref document: EP

Kind code of ref document: A1