CN113722492A - Intention identification method and device - Google Patents
- Publication number
- CN113722492A (application CN202111054974.4A)
- Authority
- CN
- China
- Prior art keywords
- text
- word
- preprocessing
- subunit
- training sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses an intention identification method and device, comprising: acquiring a text to be recognized; preprocessing the text to be recognized to obtain a preprocessing result; performing feature extraction on the preprocessing result to obtain text features; and inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model trained on training samples, and the training samples are text information annotated with intent labels. By generating a text intention classifier, a personalized conversation style matched with the user is determined, and the intention information expressed by a user message is accurately predicted based on machine learning and deep learning algorithms. The accuracy and reliability of intention identification are thereby improved.
Description
Technical Field
The present invention relates to the field of information processing technologies, and in particular, to an intention recognition method and apparatus.
Background
Text classification can be described within a machine learning framework: the features of the data to be classified are matched against candidate categories, and the best match is selected as the classification result. Text classification comprises two processes: a learning process and a classification process.
The learning process requires training samples. In existing learning processes, however, the final recognition result is often inaccurate because the input data are scarce, the samples are non-standard, and the number of intention categories is large.
Disclosure of Invention
In view of the above problems, the present invention provides an intention identification method and apparatus, which achieve the purpose of improving the accuracy and reliability of identification.
In order to achieve the purpose, the invention provides the following technical scheme:
an intent recognition method comprising:
acquiring a text to be recognized;
preprocessing the text to be recognized to obtain a preprocessing result;
performing feature extraction on the preprocessing result to obtain text features;
and inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model trained on training samples, and the training samples are text information annotated with intent labels.
Optionally, the preprocessing the text to be recognized to obtain a preprocessing result includes:
performing word segmentation processing on the text to be recognized;
and performing stop-word removal on the segmented text to obtain a preprocessing result.
Optionally, the performing feature extraction on the preprocessing result to obtain a text feature includes:
performing feature extraction on the preprocessing result by using a bag-of-words model to obtain text features;
or, alternatively,
counting the occurrence frequency and the inverse text frequency index of each word in the preprocessing result;
determining text characteristics based on the occurrence frequency of each word and the inverse text frequency index;
or, alternatively,
and extracting word vector characteristics of the preprocessing result to obtain text characteristics.
Optionally, the method further comprises:
extracting visual vocabulary vectors from different types of images, wherein the vectors represent locally invariant feature points in the images;
aggregating all the feature point vectors to obtain a word list;
counting the occurrence frequency of each word in the word list in the image to obtain a numerical value vector;
and generating a bag of words model based on the numerical vectors.
Optionally, the method further comprises:
acquiring a training sample, wherein the training sample is text information annotated with an intent label;
performing text preprocessing on the training sample, and performing feature extraction on the preprocessed text to obtain a text expression vector;
and carrying out neural network model training on the text expression vector to obtain a target intention recognition model.
An intent recognition apparatus comprising:
the acquiring unit is used for acquiring a text to be recognized;
the preprocessing unit is used for preprocessing the text to be recognized to obtain a preprocessing result;
the feature extraction unit is used for extracting features of the preprocessing result to obtain text features;
and the recognition unit is used for inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model trained on training samples, and the training samples are text information annotated with intent labels.
Optionally, the pre-processing unit comprises:
the word segmentation subunit is used for carrying out word segmentation processing on the text to be recognized;
and the removal subunit is used for performing stop-word removal on the segmented text to obtain a preprocessing result.
Optionally, the feature extraction unit includes:
the first extraction subunit is used for performing feature extraction on the preprocessing result by using a bag-of-words model to obtain text features;
or, alternatively,
the statistical subunit is used for counting the occurrence frequency and the inverse text frequency index of each word in the preprocessing result;
the first determining subunit is used for determining text characteristics based on the occurrence frequency of each word and the inverse text frequency index;
or, alternatively,
and the second extraction subunit is used for extracting word vector characteristics of the preprocessing result to obtain text characteristics.
Optionally, the apparatus further comprises:
the vector extraction subunit is used for extracting visual vocabulary vectors from different types of images, wherein the vectors represent locally invariant feature points in the images;
the aggregation subunit is used for aggregating all the feature point vectors to obtain a word list;
the number counting subunit is used for counting the number of times of each word in the word list appearing in the image to obtain a numerical vector;
and the generating subunit is used for generating a bag-of-words model based on the numerical vectors.
Optionally, the apparatus further comprises:
the sample acquisition subunit is used for acquiring a training sample, wherein the training sample is text information annotated with an intent label;
the sample processing subunit is used for performing text preprocessing on the training sample and performing feature extraction on the preprocessed text to obtain a text expression vector;
and the model training subunit is used for carrying out neural network model training on the text expression vector to obtain a target intention recognition model.
Compared with the prior art, the invention provides an intention identification method and device, comprising: acquiring a text to be recognized; preprocessing the text to be recognized to obtain a preprocessing result; performing feature extraction on the preprocessing result to obtain text features; and inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model trained on training samples, and the training samples are text information annotated with intent labels. By generating a text intention classifier, a personalized conversation style matched with the user is determined, and the intention information expressed by a user message is accurately predicted based on machine learning and deep learning algorithms. The accuracy and reliability of intention identification are thereby improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a schematic flow chart of an intention identifying method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a text classifier training process according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an intention identifying apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first" and "second," and the like in the description and claims of the present invention and the above-described drawings are used for distinguishing between different objects, not for describing a particular order. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements, but may include steps or elements not listed.
The embodiment of the invention provides an intention recognition method, which is mainly used for text classification for behavior intention recognition of e-commerce users and relates to natural language processing of e-commerce user behavior. The method first collects massive samples from a web server and performs deep mining and analysis using big data technology. By generating a text intention classifier, a personalized conversation mode matched with the user is determined, and the intention information expressed by user messages is accurately predicted based on machine learning and deep learning algorithms, so that corresponding parameters can be obtained and control over the related feedback information is strengthened. The most pertinent intent can also be identified when only a small set of intent functions is required.
Referring to fig. 1, a flow chart of an intention identification method provided in an embodiment of the present invention is schematically illustrated, where the method may include the following steps:
s101, obtaining a text to be recognized.
The embodiment of the invention realizes intention identification, where an intention refers to the intent of a target object, for example the query intent of a target object in conversation with an intelligent customer service agent, or the browsing intent of a target object browsing related information. Correspondingly, the text to be recognized is any text on which intent recognition can be performed, such as a consultation text input by the user or webpage text information the user browses.
S102, preprocessing the text to be recognized to obtain a preprocessing result.
S103, extracting the features of the preprocessing result to obtain text features.
And S104, inputting the text features into a target intention recognition model to obtain an intention recognition result.
In the embodiment of the invention, the intention recognition result is obtained by automatically recognizing the relevant information through the target intention recognition model. The target intention recognition model is a neural network model trained on training samples, where the training samples are text information annotated with intent labels. Correspondingly, the text to be recognized is not input directly into the target intention recognition model. The text first needs to be preprocessed, that is, normalized, for example by removing stop words, repeated words and erroneously entered words, and it also needs to undergo word segmentation. Corresponding feature extraction is then performed to obtain the text features, which are expressed as vectors. After the text features are input into the target intention recognition model, the corresponding intention recognition result is obtained.
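Steps S101 to S104 can be strung together as a minimal sketch. The whitespace tokenizer, the tiny stop-word list and the `predict_intent` stub below are hypothetical stand-ins; a real system would use a Chinese segmenter and the trained target intention recognition model.

```python
# Minimal sketch of the S101-S104 pipeline (hypothetical stand-ins for the
# real segmenter, stop-word dictionary, and trained model).

STOP_WORDS = {"the", "a", "to", "of"}  # assumed sample stop-word list

def preprocess(text):
    # S102: word segmentation (whitespace here; Chinese needs a segmenter)
    # followed by stop-word removal.
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def extract_features(tokens):
    # S103: bag-of-words term counts serve as the text features.
    feats = {}
    for t in tokens:
        feats[t] = feats.get(t, 0) + 1
    return feats

def predict_intent(features):
    # S104: stand-in for the target intention recognition model; a real
    # system would feed the feature vector to a trained neural network.
    return "refund_query" if "refund" in features else "other"

text = "I want to apply for a refund"   # S101: text to be recognized
print(predict_intent(extract_features(preprocess(text))))  # → refund_query
```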
In order to clearly explain the embodiments of the present invention, the related terms will now be explained.
Intention recognition: extracting, from a sentence, the intent it expresses. It is essentially a text multi-classification problem: the sentences or queries we commonly utter are classified into corresponding intent classes.
Text classification: a supervised machine learning method for classifying a sentence or text document into one or more defined classes. It is a widely used natural language processing method that plays an important role in spam filtering, sentiment analysis, news article classification, and many other business problems.
Natural language processing: Natural Language Processing (NLP) is a science integrating linguistics, mathematics and computer science. Its core goal is to convert human natural language into computer-readable instructions; simply put, to make machines understand human language.
Text preprocessing: methods for addressing the high dimensionality of the feature space, semantic relatedness between features, and sparse feature distributions.
Stop words: words that contribute little to classification, mainly comprising certain adverbs, adjectives and conjunctions.
Bag-of-words model: the bag-of-words model (BoW, Bag of Words) does not consider the context between words in a text; it only considers the weights of all words (related to their frequency of occurrence in the text). This is similar to putting all the words into a bag: each word is independent and carries no semantic information.
TF-IDF: TF-IDF (term frequency-inverse document frequency) is a commonly used weighting technique for information retrieval and text mining. It is a statistical method for evaluating how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to its frequency in the corpus. Various forms of TF-IDF weighting are often applied by search engines as a measure or rating of the relevance between a document and a user query.
Term frequency (TF): TF refers to the number of times a given word appears in a document.
Inverse document frequency (IDF): the fewer the documents that contain a term, the higher the term's IDF, and the better the term's ability to distinguish between categories.
Intent recognition extracts the intent expressed by a sentence, and is therefore in essence also a multi-classification problem, handled in a manner similar to a classification model. The training of the text classifier for e-commerce user behavior text classification is shown in fig. 2: a training set is first obtained; text preprocessing, feature extraction and text representation are then performed through feature engineering; finally the classifier outputs the corresponding intention classification result.
It should be noted that, in the embodiment of the present invention, the processing procedure of the text to be recognized is consistent with the processing procedure of the training text in the training procedure of the target intention recognition model. In an embodiment of the present invention, the process of generating the target intention recognition model includes:
acquiring a training sample, wherein the training sample is text information annotated with an intent label;
performing text preprocessing on the training sample, and performing feature extraction on the preprocessed text to obtain a text expression vector;
and carrying out neural network model training on the text expression vector to obtain a target intention recognition model.
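The three training steps above can be sketched with a minimal single-layer softmax classifier trained by gradient descent on labelled text vectors. This toy numpy network is only a stand-in for the (deeper) neural network model of the invention; all data and hyperparameters are illustrative.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over each row of logits.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, n_classes, lr=0.5, epochs=200):
    # Single-layer softmax classifier: a stand-in for the patent's
    # neural network model trained on intent-labelled text vectors.
    W = np.zeros((X.shape[1], n_classes))
    Y = np.eye(n_classes)[y]                 # one-hot intent labels
    for _ in range(epochs):
        P = softmax(X @ W)
        W -= lr * X.T @ (P - Y) / len(X)     # cross-entropy gradient step
    return W

# Toy text-expression vectors with intent labels 0/1.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
y = np.array([0, 0, 1, 1])
W = train(X, y, n_classes=2)
pred = softmax(X @ W).argmax(axis=1)
print(pred.tolist())  # → [0, 0, 1, 1]
```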
Correspondingly, the process of processing the text input into the target recognition model also comprises the following steps: preprocessing and feature extraction.
In a possible implementation manner, preprocessing the text to be recognized to obtain a preprocessing result includes:
performing word segmentation processing on the text to be recognized;
and performing stop-word removal on the segmented text to obtain a preprocessing result.
The embodiment of the invention is mainly directed at preprocessing Chinese text. Word-level granularity is far better than character-level granularity as a feature: most classification algorithms do not consider word-order information, and a character-granularity representation loses too much n-gram information. Word segmentation is therefore required. Specifically, Chinese word segmentation methods mainly fall into two categories: dictionary-based Chinese segmentation and statistics-based Chinese segmentation.
The core of dictionary-based Chinese word segmentation is to first build a unified dictionary table. When a sentence needs to be segmented, it is split into pieces that are matched one by one against the dictionary: if a piece is found in the dictionary, segmentation of that piece succeeds; otherwise the sentence is re-split and matching continues until it succeeds. The dictionary, the splitting rules, and the matching order are therefore the core of the method.
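One common instance of the dictionary-plus-matching-order idea is forward maximum matching: at each position, take the longest dictionary entry that matches, falling back to a single character. The tiny dictionary below is a hypothetical sample; real systems use large lexicons.

```python
# Dictionary-based segmentation via forward maximum matching.

def fmm_segment(sentence, dictionary, max_len=4):
    words, i = [], 0
    while i < len(sentence):
        # Try the longest candidate first (the matching order is the core).
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + length]
            if length == 1 or piece in dictionary:
                words.append(piece)   # single chars always succeed
                i += length
                break
    return words

d = {"南京", "南京市", "长江", "大桥", "长江大桥", "市长"}
print(fmm_segment("南京市长江大桥", d))  # → ['南京市', '长江大桥']
```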
The statistics-based Chinese word segmentation method treats segmentation as a probability maximization problem: a sentence is split, and based on a corpus, the probability that adjacent characters form a word is counted; the more frequently adjacent characters co-occur, the higher the probability that they form a word. Segmentation is then carried out according to these probability values, so a complete corpus is important.
Understanding-based word segmentation method: this method achieves word recognition by having the computer simulate human understanding of the sentence. The basic idea is to perform syntactic and semantic analysis while segmenting, and to use syntactic and semantic information to resolve ambiguity. Such a system generally comprises three parts: a word segmentation subsystem, a syntactic-semantic subsystem, and a master control part. Under the coordination of the master control part, the word segmentation subsystem obtains syntactic and semantic information about words, sentences, and so on, to resolve segmentation ambiguity; that is, it simulates the human process of understanding a sentence. This method requires a large amount of linguistic knowledge and information.
The stop-word removal process mainly comprises building a stop-word dictionary, where the stop words mainly comprise certain adverbs, adjectives and conjunctions. Maintaining a stop-word list is in fact a feature extraction process, essentially part of feature selection.
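The maintained stop-word list is applied by filtering the segmented tokens through it, as in this small sketch; the five-word list is an assumed sample, not a full stop-word dictionary.

```python
# Stop-word removal: filter segmented tokens against a stop-word list.
# The list here is a tiny assumed sample of particles/adverbs.
STOP_WORDS = {"的", "了", "很", "和", "是"}

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

tokens = ["这件", "衣服", "很", "好看", "的"]
print(remove_stop_words(tokens))  # → ['这件', '衣服', '好看']
```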
In an embodiment of the present invention, when feature extraction is performed on a preprocessing result to obtain text features, the feature extraction can be implemented in three ways, including:
performing feature extraction on the preprocessing result by using a bag-of-words model to obtain text features;
or, alternatively,
counting the occurrence frequency and the inverse text frequency index of each word in the preprocessing result;
determining text characteristics based on the occurrence frequency of each word and the inverse text frequency index;
or, alternatively,
and extracting word vector characteristics of the preprocessing result to obtain text characteristics.
The creation process of the bag-of-words model comprises the following steps:
extracting visual vocabulary vectors from different types of images, wherein the vectors represent locally invariant feature points in the images;
aggregating all the feature point vectors to obtain a word list;
counting the occurrence frequency of each word in the word list in the image to obtain a numerical value vector;
and generating a bag of words model based on the numerical vectors.
Specifically, a dictionary library is established that contains all the words of the training corpus; each word corresponds to a unique identification number and is represented as a one-hot vector.
The document vector has the same dimension as the vocabulary, and the value at each position is the number of times the word at the corresponding position appears in the document; this is the bag-of-words (BOW) representation.
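The unique-id dictionary library and the resulting count vector can be sketched as follows; the two-document corpus is purely illustrative.

```python
# Bag-of-words: assign each corpus word a unique id, then represent a
# document as a count vector over those ids.

def build_vocab(corpus):
    vocab = {}
    for doc in corpus:
        for w in doc:
            vocab.setdefault(w, len(vocab))   # unique identification number
    return vocab

def bow_vector(doc, vocab):
    vec = [0] * len(vocab)
    for w in doc:
        if w in vocab:
            vec[vocab[w]] += 1                # occurrence count, not just presence
    return vec

corpus = [["buy", "phone"], ["buy", "buy", "case"]]
vocab = build_vocab(corpus)
print(bow_vector(corpus[1], vocab))  # → [2, 0, 1]
```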
The creation process of the bag-of-words model comprises the following steps:
(1) visual vocabulary vectors are extracted from images of different classes using the SIFT algorithm; these vectors represent locally invariant feature points in the images;
(2) all the feature point vectors are gathered together, visually similar vocabulary vectors are merged using the K-Means algorithm, and a vocabulary containing K visual words is constructed;
(3) the number of times each word in the vocabulary appears in an image is counted, so that the image is represented as a K-dimensional numerical vector.
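Steps (1) to (3) can be sketched end to end. The synthetic descriptors below stand in for SIFT feature vectors, and the tiny k-means loop stands in for a full K-Means implementation; both are illustrative only.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # Minimal k-means: cluster feature vectors into k visual words.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    # Step (3): count how often each visual word occurs in one image.
    labels = np.argmin(((descriptors[:, None] - centers) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(centers))

rng = np.random.default_rng(1)
# Synthetic stand-ins for SIFT descriptors from two image classes.
all_descriptors = np.vstack([rng.normal(0, 0.1, (30, 8)),
                             rng.normal(5, 0.1, (30, 8))])
centers = kmeans(all_descriptors, k=2)      # step (2): K-word vocabulary
image = rng.normal(5, 0.1, (10, 8))         # descriptors from one image
print(bow_histogram(image, centers))        # K-dimensional count vector
```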
In one possible implementation, text features can also be extracted based on TF-IDF, where term frequency (TF) refers to the number of times a given word appears in a document, and inverse document frequency (IDF) means that the fewer the documents containing a term, the higher the term's IDF, and the better the term's ability to distinguish between categories.
The two parameters TF and IDF together represent how important a word is in a text. TF is the frequency of a word within a document; generally, the more often a word occurs in a document, the more important it is. The BOW model, for example, uses occurrence counts as feature values, so more occurrences mean greater weight. The problem is that a word generally occurs more often in a long document than in a short one, which biases the feature values. TF thus reflects a word's importance within a document. IDF represents a word's importance across documents: if a word appears in very few documents, it is highly distinctive for those documents, its feature value is high, and its IDF value is high. IDFi = log(|D|/Ni), where |D| is the total number of documents and Ni is the number of documents in which word i appears; clearly, the smaller Ni is, the larger the IDF value.
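The TF and IDF definitions above combine into the usual weight tf * idf, with idf_i = log(|D|/N_i); a small sketch on a three-document toy corpus:

```python
import math

# TF-IDF per the formula above: tf = count of the word in the document,
# idf_i = log(|D| / N_i) with |D| documents total and N_i containing word i.

def tf_idf(word, doc, corpus):
    tf = doc.count(word)
    n_i = sum(1 for d in corpus if word in d)
    idf = math.log(len(corpus) / n_i) if n_i else 0.0
    return tf * idf

corpus = [["refund", "order"], ["order", "status"], ["order"]]
print(tf_idf("refund", corpus[0], corpus))  # rare word: weight log(3) ≈ 1.0986
print(tf_idf("order", corpus[0], corpus))   # appears everywhere → 0.0
```

A word occurring in every document gets IDF log(1) = 0, so it carries no distinguishing weight regardless of its TF, which is exactly the long-document bias correction described above.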
In another possible implementation manner, text feature extraction can be performed based on a feature extraction model of the word vector.
This method trains a neural-network-like model on a large text corpus to map each word to a vector of fixed dimension, typically between tens and hundreds of dimensions. Each vector represents a word, and the semantic and grammatical similarity of words is judged by the similarity between their vectors.
The commonly used word2vec mainly comprises the CBOW model and the skip-gram model. Both are in effect shallow three-layer networks consisting of an input layer, a projection layer and an output layer; compared with the NNLM they remove the hidden layer, which simplifies the model and greatly increases training speed, with a clear improvement in time efficiency while preserving grammatical expressiveness. word2vec trains on a large corpus so that each word is finally represented by a fixed-dimensional vector, and the semantic and grammatical similarity between words can be expressed through the similarity of their vectors.
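Once each word is mapped to a vector (for example by CBOW or skip-gram training), similarity between words is read off as similarity between vectors, commonly the cosine. The three-dimensional toy vectors below are hand-made for illustration, not trained embeddings.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity: the standard vector-similarity measure for
    # judging semantic closeness of word vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

vectors = {
    "buy":      np.array([0.9, 0.1, 0.0]),
    "purchase": np.array([0.8, 0.2, 0.1]),
    "weather":  np.array([0.0, 0.1, 0.9]),
}
# Semantically close words should score higher than unrelated ones.
print(cosine(vectors["buy"], vectors["purchase"]) >
      cosine(vectors["buy"], vectors["weather"]))  # → True
```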
After the target intention recognition model outputs the intention categories, a basis can be provided for the implementation of a subsequent reinforcement learning algorithm.
Text classification is a very typical problem in NLP. Traditional machine learning algorithms can produce a preliminary text classification of e-commerce user behavior, but it is not accurate enough and has obvious errors; applying deep learning algorithms can classify the intent of e-commerce user behavior text effectively. In this step, different types of deep learning algorithms can be applied to obtain a preliminary intention classification, laying a foundation for the subsequent reinforcement learning algorithm.
Common models that can be used for classification include the naive Bayes (NB) model, the random forest (RF) model, the SVM classification model, and the KNN classification model.
Common deep learning classification models include fastText, TextCNN and TextRNN.
The fastText principle: all word vectors in the sentence are averaged (in a sense this can be understood as a CNN with a single average-pooling operation), and the result is fed directly into a softmax layer for classification.
TextCNN: a CNN is used to extract key information from the sentence, in a manner similar to n-grams.
TextRNN: a bidirectional RNN (in practice a bidirectional LSTM) is used; in a sense it can capture variable-length bidirectional "n-gram" information.
TextRNN + Attention: the Attention mechanism is a common way of modelling long-range dependencies in natural language processing; it can intuitively show the contribution of each word to the result and has essentially become standard in Seq2Seq models. Since text classification can in a sense also be understood as a special Seq2Seq task, introducing the Attention mechanism is worth considering.
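The fastText averaging principle described above can be sketched as a single forward pass; the vocabulary, embeddings and softmax weights below are illustrative random values, not trained parameters:

```python
from math import exp
import random

random.seed(0)

EMB_DIM, NUM_CLASSES = 8, 3
vocab = ["refund", "order", "track", "cancel", "hello"]
# Hypothetical, untrained parameters; a real fastText model learns these.
embedding = {w: [random.uniform(-1, 1) for _ in range(EMB_DIM)] for w in vocab}
W = [[random.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(NUM_CLASSES)]

def softmax(zs):
    m = max(zs)  # subtract max for numerical stability
    exps = [exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def fasttext_forward(sentence):
    # Step 1: average the word vectors of the sentence.
    words = [w for w in sentence.split() if w in embedding]
    avg = [sum(embedding[w][i] for w in words) / len(words)
           for i in range(EMB_DIM)]
    # Step 2: feed the averaged vector directly into a softmax layer.
    logits = [sum(W[c][i] * avg[i] for i in range(EMB_DIM))
              for c in range(NUM_CLASSES)]
    return softmax(logits)

probs = fasttext_forward("track order")
assert abs(sum(probs) - 1.0) < 1e-9
```

Training the embedding matrix and softmax weights jointly (plus fastText's character n-gram features) is what the real model adds beyond this forward pass.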
Intention recognition has different processing requirements in different application scenarios. For example:
The rule method based on dictionaries and templates:
different intentions may have different domain dictionaries, such as book names, song names and product names. When a user query arrives, its intention is judged by the degree of matching or overlap between the query and each dictionary; the simplest rule is to assign the query to the domain whose dictionary has the highest overlap.
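The simplest dictionary-overlap rule can be sketched as follows; the domain dictionaries here are small hypothetical examples:

```python
# Hypothetical domain dictionaries; real systems would load much larger ones.
domain_dicts = {
    "music": {"song", "album", "singer", "play"},
    "shopping": {"price", "buy", "order", "discount"},
    "books": {"novel", "author", "chapter", "read"},
}

def classify_by_overlap(query_words):
    """Assign the query to the domain whose dictionary overlaps it most."""
    scores = {domain: len(words & set(query_words))
              for domain, words in domain_dicts.items()}
    return max(scores, key=scores.get)

assert classify_by_overlap(["play", "a", "song"]) == "music"
```

The rule is cheap and interpretable, but breaks down for queries whose words overlap several dictionaries equally, which is one motivation for the model-based methods below.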
Based on query click logs: in a service scenario such as a search engine, the user's intention can be obtained from click logs.
Discriminating the user's intention based on a classification model: since intention recognition is itself a classification problem, the method is essentially the same as that of general classification models.
The invention provides an intention identification method and device, comprising the following steps: acquiring a text to be recognized; preprocessing the text to be recognized to obtain a preprocessing result; performing feature extraction on the preprocessing result to obtain text features; and inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model obtained by training on a training sample, and the training sample is text information labeled with an intention label. Through the method of generating a text intention classifier, a personalized dialogue style matched with the user is determined, and the intention information expressed by a user message is accurately predicted based on machine learning and deep learning algorithms, improving the accuracy and reliability of intention identification.
In an embodiment of the present invention, intention recognition is one way of serving user behaviour on an e-commerce platform. The invention reads mass data through a web server, performs deep mining and analysis with big data technology, determines a personalized dialogue style matched with the user through the method of generating a text intention classifier, accurately predicts the intention information expressed by user messages based on machine learning and deep learning algorithms, obtains business indicators of interest, and strengthens risk control.
The invention provides an intention identification method, which comprises the following steps: acquiring a text to be recognized; preprocessing the text to be recognized to obtain a preprocessing result; performing feature extraction on the preprocessing result to obtain text features; and inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model obtained by training on a training sample, and the training sample is text information labeled with an intention label. Through the method of generating a text intention classifier, a personalized dialogue style matched with the user is determined, and the intention information expressed by a user message is accurately predicted based on machine learning and deep learning algorithms, improving the accuracy and reliability of intention identification.
Based on the foregoing embodiment, there is also provided an intention identifying apparatus in an embodiment of the present invention, referring to fig. 3, including:
an acquiring unit 10, configured to acquire a text to be recognized;
the preprocessing unit 20 is configured to preprocess the text to be recognized to obtain a preprocessing result;
a feature extraction unit 30, configured to perform feature extraction on the preprocessing result to obtain a text feature;
and the recognition unit 40 is configured to input the text features into a target intention recognition model to obtain an intention recognition result, where the target intention recognition model is a neural network model trained based on a training sample, and the training sample is text information labeled with an intention label.
Optionally, the pre-processing unit comprises:
the word segmentation subunit is used for carrying out word segmentation processing on the text to be recognized;
and the removal subunit is used for performing stop-word removal on the text after word segmentation to obtain a preprocessing result.
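A minimal sketch of the two preprocessing steps (word segmentation followed by stop-word removal), assuming whitespace-segmentable text and a small illustrative stop-word list; Chinese text would additionally require a dedicated word segmenter:

```python
# Hypothetical stop-word list; production systems use much larger lists
# (and a dedicated segmenter such as jieba for Chinese text).
STOP_WORDS = {"the", "a", "an", "is", "to", "my", "i"}

def preprocess(text):
    # Step 1: word segmentation (simple whitespace split for English).
    tokens = text.lower().split()
    # Step 2: stop-word removal.
    return [t for t in tokens if t not in STOP_WORDS]

assert preprocess("I want to cancel my order") == ["want", "cancel", "order"]
```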
Optionally, the feature extraction unit includes:
the first extraction subunit is used for performing feature extraction on the preprocessing result by using a bag-of-words model to obtain text features;
or, alternatively,
the statistical subunit is used for counting the occurrence frequency and the inverse text frequency index of each word in the preprocessing result;
the first determining subunit is used for determining text characteristics based on the occurrence frequency of each word and the inverse text frequency index;
or, alternatively,
and the second extraction subunit is used for extracting word vector characteristics of the preprocessing result to obtain text characteristics.
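The two statistics-based feature extraction options above (bag-of-words counts, and occurrence frequency combined with the inverse text frequency index, i.e. TF-IDF) can be sketched as follows on a toy corpus:

```python
from collections import Counter
from math import log

# Toy preprocessed corpus; real documents would be much longer.
docs = [["refund", "order", "refund"],
        ["track", "order", "status"],
        ["cancel", "order"]]

vocab = sorted({w for doc in docs for w in doc})

def bag_of_words(doc):
    """Bag-of-words: raw occurrence counts over a fixed vocabulary."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]

def tf_idf(doc):
    """Term frequency times inverse document frequency for each vocab word."""
    counts = Counter(doc)
    n_docs = len(docs)
    vec = []
    for w in vocab:
        tf = counts[w] / len(doc)                 # occurrence frequency
        df = sum(1 for d in docs if w in d)       # document frequency
        idf = log(n_docs / df)                    # inverse document frequency
        vec.append(tf * idf)
    return vec

bow = bag_of_words(docs[0])
```

Note how "order", which appears in every document, receives a TF-IDF weight of zero: words shared by all documents carry no discriminating information.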
Optionally, the apparatus further comprises:
the vector extraction subunit is used for extracting visual vocabulary vectors from different types of images, where the vectors represent locally invariant feature points in the images;
the aggregation subunit is used for aggregating all the feature point vectors to obtain a word list;
the number counting subunit is used for counting the number of times of each word in the word list appearing in the image to obtain a numerical vector;
and the generating subunit is used for generating a bag-of-words model based on the numerical vectors.
Optionally, the apparatus further comprises:
the sample acquisition subunit is used for acquiring a training sample, wherein the training sample is text information labeled with an intention label;
the sample processing subunit is used for performing text preprocessing on the training sample and performing feature extraction on the preprocessed text to obtain a text expression vector;
and the model training subunit is used for carrying out neural network model training on the text expression vector to obtain a target intention recognition model.
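The training flow described by these subunits (acquire labelled samples, preprocess, extract features, train a model) can be sketched end to end; a nearest-centroid classifier stands in here for the neural network model, and the labelled samples are hypothetical:

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical training samples: (text, intention label).
samples = [("where is my order", "order_query"),
           ("track order status", "order_query"),
           ("i want a refund", "refund"),
           ("refund the payment", "refund")]

vocab = sorted({w for text, _ in samples for w in text.split()})

def features(text):
    # Bag-of-words feature extraction over the training vocabulary.
    counts = Counter(text.split())
    return [counts[w] for w in vocab]

def train(samples):
    # "Training" here computes one centroid vector per intention label.
    sums = defaultdict(lambda: [0.0] * len(vocab))
    label_counts = Counter()
    for text, label in samples:
        vec = features(text)
        label_counts[label] += 1
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
    return {label: [s / label_counts[label] for s in sums[label]]
            for label in sums}

def predict(model, text):
    # Recognition: assign the label of the nearest centroid.
    vec = features(text)
    def dist(label):
        return sqrt(sum((a - b) ** 2 for a, b in zip(model[label], vec)))
    return min(model, key=dist)

model = train(samples)
assert predict(model, "refund my payment") == "refund"
```

The actual method trains a neural network on the text expression vectors; this sketch only shows the shape of the sample-to-model pipeline.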
The present invention provides an intention recognition apparatus including: the acquiring unit acquires a text to be recognized; the preprocessing unit preprocesses the text to be recognized to obtain a preprocessing result; the feature extraction unit performs feature extraction on the preprocessing result to obtain text features; and the recognition unit inputs the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model obtained by training on a training sample, and the training sample is text information labeled with an intention label. Through the method of generating a text intention classifier, a personalized dialogue style matched with the user is determined, and the intention information expressed by a user message is accurately predicted based on machine learning and deep learning algorithms, improving the accuracy and reliability of intention identification.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. An intent recognition method, comprising:
acquiring a text to be recognized;
preprocessing the text to be recognized to obtain a preprocessing result;
performing feature extraction on the preprocessing result to obtain text features;
and inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model obtained by training on a training sample, and the training sample is text information labeled with an intention label.
2. The method according to claim 1, wherein the preprocessing the text to be recognized to obtain a preprocessing result comprises:
performing word segmentation processing on the text to be recognized;
and performing stop-word removal on the text after word segmentation to obtain a preprocessing result.
3. The method according to claim 1, wherein the performing feature extraction on the preprocessing result to obtain a text feature comprises:
performing feature extraction on the preprocessing result by using a bag-of-words model to obtain text features;
or, alternatively,
counting the occurrence frequency and the inverse text frequency index of each word in the preprocessing result;
determining text characteristics based on the occurrence frequency of each word and the inverse text frequency index;
or, alternatively,
and extracting word vector characteristics of the preprocessing result to obtain text characteristics.
4. The method of claim 3, further comprising:
extracting visual vocabulary vectors from different types of images, wherein the vectors represent locally invariant feature points in the images;
aggregating all the feature point vectors to obtain a word list;
counting the occurrence frequency of each word in the word list in the image to obtain a numerical value vector;
and generating a bag of words model based on the numerical vectors.
5. The method of claim 1, further comprising:
acquiring a training sample, wherein the training sample is text information labeled with an intention label;
performing text preprocessing on the training sample, and performing feature extraction on the preprocessed text to obtain a text expression vector;
and carrying out neural network model training on the text expression vector to obtain a target intention recognition model.
6. An intention recognition apparatus, comprising:
the acquiring unit is used for acquiring a text to be recognized;
the preprocessing unit is used for preprocessing the text to be recognized to obtain a preprocessing result;
the feature extraction unit is used for extracting features of the preprocessing result to obtain text features;
and the recognition unit is used for inputting the text features into a target intention recognition model to obtain an intention recognition result, wherein the target intention recognition model is a neural network model obtained by training on a training sample, and the training sample is text information labeled with an intention label.
7. The apparatus of claim 6, wherein the pre-processing unit comprises:
the word segmentation subunit is used for carrying out word segmentation processing on the text to be recognized;
and the removal subunit is used for performing stop-word removal on the text after word segmentation to obtain a preprocessing result.
8. The apparatus of claim 6, wherein the feature extraction unit comprises:
the first extraction subunit is used for performing feature extraction on the preprocessing result by using a bag-of-words model to obtain text features;
or, alternatively,
the statistical subunit is used for counting the occurrence frequency and the inverse text frequency index of each word in the preprocessing result;
the first determining subunit is used for determining text characteristics based on the occurrence frequency of each word and the inverse text frequency index;
or, alternatively,
and the second extraction subunit is used for extracting word vector characteristics of the preprocessing result to obtain text characteristics.
9. The apparatus of claim 8, further comprising:
the vector extraction subunit is used for extracting visual vocabulary vectors from different types of images, where the vectors represent locally invariant feature points in the images;
the aggregation subunit is used for aggregating all the feature point vectors to obtain a word list;
the number counting subunit is used for counting the number of times of each word in the word list appearing in the image to obtain a numerical vector;
and the generating subunit is used for generating a bag-of-words model based on the numerical vectors.
10. The apparatus of claim 6, further comprising:
the sample acquisition subunit is used for acquiring a training sample, wherein the training sample is text information labeled with an intention label;
the sample processing subunit is used for performing text preprocessing on the training sample and performing feature extraction on the preprocessed text to obtain a text expression vector;
and the model training subunit is used for carrying out neural network model training on the text expression vector to obtain a target intention recognition model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111054974.4A CN113722492A (en) | 2021-09-09 | 2021-09-09 | Intention identification method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113722492A true CN113722492A (en) | 2021-11-30 |
Family
ID=78682968
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113722492A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334605A (en) * | 2018-02-01 | 2018-07-27 | 腾讯科技(深圳)有限公司 | File classification method, device, computer equipment and storage medium |
CN110147445A (en) * | 2019-04-09 | 2019-08-20 | 平安科技(深圳)有限公司 | Intension recognizing method, device, equipment and storage medium based on text classification |
CN111259625A (en) * | 2020-01-16 | 2020-06-09 | 平安科技(深圳)有限公司 | Intention recognition method, device, equipment and computer readable storage medium |
CN112287672A (en) * | 2019-11-28 | 2021-01-29 | 北京京东尚科信息技术有限公司 | Text intention recognition method and device, electronic equipment and storage medium |
CN112347232A (en) * | 2020-11-18 | 2021-02-09 | 武汉贝多多网络科技有限公司 | Method for carrying out intention recognition on object based on cloud computing |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114357994A (en) * | 2022-01-06 | 2022-04-15 | 京东科技信息技术有限公司 | Intention recognition processing and confidence degree judgment model generation method and device |
CN114860905A (en) * | 2022-04-24 | 2022-08-05 | 支付宝(杭州)信息技术有限公司 | Intention identification method, device and equipment |
CN114860905B (en) * | 2022-04-24 | 2024-08-27 | 支付宝(杭州)信息技术有限公司 | Intention recognition method, device and equipment |
CN116205601A (en) * | 2023-02-27 | 2023-06-02 | 开元数智工程咨询集团有限公司 | Internet-based engineering list rechecking and data statistics method and system |
CN116205601B (en) * | 2023-02-27 | 2024-04-05 | 开元数智工程咨询集团有限公司 | Internet-based engineering list rechecking and data statistics method and system |
CN117932043A (en) * | 2024-03-22 | 2024-04-26 | 杭州食方科技有限公司 | Dialogue style migration reply information display method, device, equipment and readable medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||