CN113468308B - Conversation behavior classification method and device and electronic equipment - Google Patents

Conversation behavior classification method and device and electronic equipment Download PDF

Info

Publication number
CN113468308B
CN113468308B (application CN202110736919.7A)
Authority
CN
China
Prior art keywords
dictionary
training data
label
word
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110736919.7A
Other languages
Chinese (zh)
Other versions
CN113468308A (en)
Inventor
简仁贤
吴文杰
苏畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Emotibot Technologies Ltd
Original Assignee
Emotibot Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Emotibot Technologies Ltd filed Critical Emotibot Technologies Ltd
Priority to CN202110736919.7A priority Critical patent/CN113468308B/en
Publication of CN113468308A publication Critical patent/CN113468308A/en
Application granted granted Critical
Publication of CN113468308B publication Critical patent/CN113468308B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a conversation behavior classification method and device and electronic equipment, wherein the method comprises the following steps: receiving a database and dialogue behavior training data, wherein the dialogue behavior training data is labeled with a plurality of labels; training on the Chinese words in the database with word, sense, and sememe information to obtain sememe word vectors; collecting the Chinese words and the sememe word vectors corresponding to them one by one to obtain a first dictionary; performing first word segmentation on the dialogue behavior training data to obtain a first segmentation result; comparing the first dictionary with the first segmentation result to obtain a second dictionary; training on the dialogue behavior training data by using the second dictionary to obtain a final classification model; and predicting the text to be tested by using the final classification model to obtain the labels of the text to be tested. The invention effectively realizes conversation behavior classification, enriches the initial semantic information, and overcomes the adverse effect of OOV words in short texts.

Description

Conversation behavior classification method and device and electronic equipment
Technical Field
The invention relates to the technical field of natural language processing, in particular to a conversation behavior classification method and device and electronic equipment in the technical field of text classification.
Background
Text classification is one of the important research areas of natural language processing, and dialog act classification is a form of short-text classification. A dialog act labels an utterance in a conversation according to its communicative meaning. Dialog-act labels can be attached to many common types of dialogue, usually line by line or statement by statement. Many researchers mark portions of a conversation with dialog-act labels in order to observe or model the conversation more accurately.
Currently, dialogue behavior classification techniques face several problems: 1) Most dialogue behavior classification models depend on pre-trained word vectors as initial semantic features, yet mainstream word vector algorithms can only reflect relations between word meanings, which is a significant limitation. 2) Because a short text contains less semantic information than a long text, the OOV (out-of-vocabulary) problem has a large impact. 3) When the number of labels is large, misjudgment often degrades the classification effect, and the semantic information contained in the labels goes unused.
Disclosure of Invention
The invention aims to provide a conversation behavior classification method, a conversation behavior classification device and electronic equipment, which enrich initial semantic information and overcome adverse effects of OOV in short texts.
The technical scheme for realizing the purpose is as follows:
the application provides a conversation behavior classification method, which comprises the following steps:
receiving a database and dialogue behavior training data, wherein the dialogue behavior training data are labeled with a plurality of labels;
training the Chinese words in the database by using words, semantics and sememes to obtain a sememe word vector;
collecting the Chinese words and the sememe word vectors corresponding to the Chinese words one by one to obtain a first dictionary;
performing first word segmentation processing on the dialogue behavior training data to obtain a first word segmentation result;
comparing the first dictionary with the first segmentation result to obtain a second dictionary;
training the dialogue behavior training data by using the second dictionary to obtain a final classification model;
and predicting the text to be detected by using the final classification model to obtain the label of the text to be detected.
In one embodiment, the comparing the first dictionary with the first segmentation result to obtain a second dictionary includes:
comparing the Chinese words in the first dictionary with the Chinese words in the first segmentation result to obtain the Chinese words outside the first dictionary;
adding the Chinese words outside the first dictionary and the initial word vectors corresponding to the Chinese words one by one into the first dictionary to obtain a second dictionary;
wherein the length of the initial word vector is the same as the length of the semantic word vector.
In one embodiment, the initial word vector is obtained by randomly initializing an array.
In an embodiment, the training the dialogue behavior training data by using the second dictionary to obtain a final classification model includes:
performing second word segmentation processing on the dialogue behavior training data to obtain a second word segmentation result;
searching for and extracting the corresponding text sememe word vectors in the second dictionary according to the Chinese words in the second word segmentation result;
performing third word segmentation processing on all the labels of the dialogue behavior training data to obtain a third word segmentation result;
searching for and extracting the corresponding label sememe word vectors in the second dictionary according to the Chinese words in the third word segmentation result;
multiplying the attention score of the label of each piece of dialogue behavior training data with all text sememe word vectors of that training data respectively, and then passing the result into a bidirectional long-short term memory network model for processing;
processing the result output by the bidirectional long-short term memory network model through a fully connected layer to obtain the probability score of the label of each piece of dialogue behavior training data;
taking the label with the probability score higher than a preset threshold value as a prediction result of the current classification model;
and comparing the prediction result of the current classification model with the real label, and reversely propagating and updating the parameters of the current classification model to obtain a final classification model.
In an embodiment, the attention score of the label of the dialogue behavior training data is obtained by multiplying all label sememe word vectors of the label by a parameter matrix in an attention layer.
In one embodiment, the processing the result output by the bidirectional long-short term memory network model through a fully connected layer to obtain a probability score of a label of each piece of dialogue behavior training data includes:
multiplying the output of the bidirectional long-short term memory network model by a parameter matrix in a fully connected layer, converting it into the probability score of the label of each piece of dialogue behavior training data.
In an embodiment, the predicting the text to be tested by using the final classification model to obtain the label of the text to be tested includes:
predicting the texts to be tested by using the final classification model to obtain the probability score of the label of each text to be tested;
and taking the label with the probability score higher than a preset threshold value as a prediction result of the final classification model.
In one embodiment, the predetermined threshold is 0.5;
the parameters of the current classification model include: word vectors, parameters in the attention layer, parameters in the bidirectional long-short term memory network model, and parameters of the fully connected layer.
The application provides a conversation action classification device, includes:
a receiving module, configured to receive a database and dialogue behavior training data, wherein the dialogue behavior training data are marked with a plurality of labels;
the word vector preparation module is used for training the Chinese words in the database by using words, semantics and sememes to obtain a sememe word vector;
the first dictionary module is used for collecting the Chinese words and the sememe word vectors corresponding to the Chinese words one by one to obtain a first dictionary;
the first word segmentation module is used for carrying out first word segmentation on the dialogue behavior training data to obtain a first word segmentation result;
the second dictionary module is used for comparing the first dictionary with the first segmentation result to obtain a second dictionary;
the classification model training module is used for training the dialogue behavior training data by using the second dictionary to obtain a final classification model;
and the dialogue behavior classification module is used for predicting the text to be detected by using the final classification model to obtain the label of the text to be detected.
The application provides an electronic device, the electronic device includes:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the above-described dialog behavior classification method.
According to the technical scheme provided by the embodiments of the application, compared with the mainstream Word2vec (Word to Vector) and GloVe (Global Vectors for Word Representation), the sememe word vector not only links similar words together but also links words that share sememes, so it contains richer semantic information; used as the initial semantic representation, it greatly improves the effect of the classification model. Meanwhile, OOV (out-of-vocabulary) words are randomly initialized and added to the dictionary, and new semantic representations for them are learned during training, which provides a flexible vocabulary-expansion approach and overcomes the adverse effect of OOV words in short texts. Moreover, the classification model incorporates the semantic information contained in the labels, establishes a relation between the input text and all the labels, and learns during training to adjust the attention paid to each label (i.e., the attention score) according to the input data, which facilitates more accurate class matching, so that continually adding labels and training causes essentially no fluctuation in model performance.
Drawings
FIG. 1 is a flowchart of a conversation activity classification method provided by an embodiment of the present application;
FIG. 2 is a flow chart of training a classification model in the present application;
FIG. 3 is a flowchart of a conversation activity classification method according to another embodiment of the present application;
fig. 4 is a block diagram of a dialogue acts classification apparatus according to an embodiment of the present application;
FIG. 5 is a block diagram of a classification model training module of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The invention will be further explained with reference to the drawings.
The classification of dialogue behaviors depends on a corresponding classification model, and the classification model needs to be trained. Pre-trained word vectors are used during training and are important as initial semantic features. However, mainstream word vector algorithms can only reflect relations between word meanings. For example, Word2vec (Word to Vector) and GloVe (Global Vectors for Word Representation) only link similar words together, but do not link words that share sememes, which is a significant limitation.
In addition, dialogue behavior classification is a form of short-text classification, and a short text contains less semantic information than a long text. Once a word falls outside the dictionary, it has no corresponding word vector representation, so the accuracy of the trained classification model is reduced and the final dialogue behavior classification result is affected.
Furthermore, in every application field, when the number of labels is large, misjudgment degrades the classification effect. Because the semantic information contained in the labels is not incorporated into the classification model, it cannot be utilized, and no relation can be established between the input text and the labels. As a result, once labels are continually added, continued training causes the model's performance to fluctuate, affecting the final dialogue behavior classification result.
To solve the above problems, the pre-trained word vectors are made to contain richer semantic information, improving the initial semantic representation; the adverse effect of OOV words in short texts is overcome; and the semantic information contained in the labels is fused into the classification model, improving its effect. The application provides a conversation behavior classification method, a conversation behavior classification device, an electronic device, and a computer-readable storage medium, which accurately and effectively realize conversation behavior classification. The invention can be realized by corresponding software, hardware, or a combination of software and hardware, and the embodiments of the invention are described in detail below.
Referring to fig. 1, the present embodiment provides a conversation activity classification method, including the following steps:
step S100, receiving a database and dialogue action training data. Wherein the dialogue behavior training data is labeled with a plurality of labels.
In this embodiment, a dictionary is established based on the vocabulary in the database. And taking the dialogue behavior training data and the marked labels as materials for training the classification model.
And S101, training the Chinese words in the database by using words, semantics and sememes to obtain a sememe word vector.
In this embodiment, sememe word vectors are used as the pre-trained word vectors; that is, three levels of information, namely word, sense, and sememe, are incorporated during word vector training, which improves representation capability. A sense is a paraphrase of a word; a sememe is the smallest, indivisible unit of meaning in linguistics. Taking the word "apple" as an example, it has two senses: "fruit" and "technology company". The senses are the paraphrases of "apple" as a fruit and as a technology company respectively, e.g.: a red, sweet botanical fruit / a company in the XXXX business founded by Jobs in a certain year.
Each word corresponds to a unique semantic word vector. Chinese words include single words.
By adopting the mode of expressing Chinese words by the semantic word vector, not only are similar words connected together, but also words related to the semantic word are connected together, so that the word vector contains richer semantic information, and the limitation of the current mainstream word vector is overcome.
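As a minimal sketch of this idea (not the patent's actual training procedure), a word's sememe word vector can be formed by combining the sememe vectors of its senses. The sememe inventory, the sense-to-sememe mapping, the embedding dimension, and the use of mean pooling below are all toy assumptions standing in for a real sememe resource and real pretrained vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding dimension

# Toy sememe inventory; a real system would draw sememes from a resource
# such as HowNet (assumption, not specified by the source).
sememe_vecs = {s: rng.normal(size=DIM)
               for s in ["fruit", "company", "eat", "technology"]}

# Each sense of a word is described by a set of sememes, e.g. the two
# senses of "apple" from the example above.
word_senses = {"apple": [["fruit", "eat"], ["company", "technology"]]}

def sememe_word_vector(word):
    """Average the sememe vectors within each sense, then average over
    senses, so the word vector reflects all of its sememes."""
    sense_vecs = [np.mean([sememe_vecs[s] for s in sense], axis=0)
                  for sense in word_senses[word]]
    return np.mean(sense_vecs, axis=0)

apple_vec = sememe_word_vector("apple")
```

Because "apple" now shares sememe components with both fruit-related and company-related words, its vector sits near both groups, which is the extra linkage the description attributes to sememe word vectors.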
And S102, summarizing the Chinese words and the semantic word vectors corresponding to the Chinese words one by one to obtain a first dictionary.
Step S103, performing first word segmentation processing on the dialogue behavior training data to obtain a first word segmentation result.
And step S104, comparing the first dictionary with the first segmentation result to obtain a second dictionary.
In this embodiment, the dialogue behavior training data is labeled with a plurality of labels. If a label is in English, it is first translated into Chinese. First, the Chinese words in the first dictionary are compared with the Chinese words in the first segmentation result to obtain the Chinese words outside the first dictionary; then those words, together with the initial word vectors corresponding to them one by one, are added to the first dictionary to obtain the second dictionary. Here, the Chinese words outside the first dictionary are those that appear in the first segmentation result but not in the first dictionary. Each initial word vector is obtained by randomly initializing an array and has the same length as a sememe word vector. The initial word vectors are continually updated as the classification model is trained.
Thus, the influence caused by the OOV problem in the prior art is reduced to the maximum extent.
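The dictionary expansion in steps S103-S104 can be sketched as follows. The specific tokens, the embedding dimension, and the random initializer are illustrative assumptions; only the comparison-and-expansion logic comes from the description:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # must match the length of the sememe word vectors

# First dictionary: database words with their pretrained sememe word vectors.
first_dict = {w: rng.normal(size=DIM) for w in ["你好", "还款", "日期"]}

# First segmentation result of the dialogue behavior training data
# (already tokenized here for brevity).
first_seg = ["你好", "还款", "逾期", "利息"]

# Words in the segmentation result but not in the first dictionary are OOV;
# each gets a randomly initialized vector of the same length, to be updated
# by backpropagation during training.
second_dict = dict(first_dict)
for w in first_seg:
    if w not in second_dict:
        second_dict[w] = rng.normal(size=DIM)
```

Since the OOV entries are trainable parameters like any other word vector, the model learns a usable representation for them instead of dropping them, which is the vocabulary-expansion idea the description credits with reducing the OOV problem.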
And step S105, training the dialogue behavior training data by using the second dictionary to obtain a final classification model.
In this embodiment, the current classification model needs to be trained, continually refined and updated, in preparation for the final classification of dialogue behaviors. The current classification model adopts the LRNN (Label-Embedded Recurrent Neural Network) model. Specifically, as shown in fig. 2, step S105 includes the following steps:
step S1051, carrying out second word segmentation processing on the dialogue behavior training data to obtain a second word segmentation result.
Step S1052, searching for and extracting the corresponding text sememe word vectors in the second dictionary according to the Chinese words in the second word segmentation result.
And step S1053, carrying out third word segmentation on all the labels of the dialogue behavior training data to obtain a third word segmentation result. If the label is represented by English, the label is translated into Chinese.
And step S1054, searching for and extracting the corresponding label sememe word vectors in the second dictionary according to the Chinese words in the third word segmentation result.
And step S1055, multiplying the attention score of the label of each piece of dialogue behavior training data by all text sememe word vectors of that training data, and then passing the result into a bidirectional long-short term memory network model for processing.
In this embodiment, the attention score of the label of the dialogue behavior training data is obtained by multiplying all label sememe word vectors of the label by a parameter matrix in the attention layer. That is, the vector representation of the label (i.e., the sememe word vectors of the words in the label) is multiplied by the parameter matrix in the attention layer to obtain the corresponding attention score. The attention score is part of the current classification model's parameters and is updated as training progresses (the parameter matrix is updated as the model learns, so the attention score is updated as well). All text sememe word vectors of the dialogue behavior training data together form the vector representation of that training data.
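The attention computation of step S1055 can be sketched as below. All shapes are toy values, and pooling the label's word vectors by averaging is an assumption (the description says only that all label sememe word vectors are multiplied by the attention parameter matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8   # toy embedding dimension
SEQ = 5   # number of tokens in one training utterance

# Label sememe word vectors (a label segmented into 2 Chinese words here).
label_vecs = rng.normal(size=(2, DIM))
# Parameter matrix of the attention layer (learned during training).
W_att = rng.normal(size=(DIM, DIM))

# Attention score of the label: label vectors times the attention
# parameters, pooled into a single vector (mean pooling is an assumption).
att_score = (label_vecs @ W_att).mean(axis=0)   # shape (DIM,)

# Text sememe word vectors of one utterance.
text_vecs = rng.normal(size=(SEQ, DIM))

# Weight every token vector by the label's attention score; this weighted
# sequence is what gets fed into the Bi-LSTM.
weighted = text_vecs * att_score                # broadcasts over tokens
```

Because W_att is a trainable parameter, backpropagation adjusts each label's attention score toward the tokens that actually signal that label, which is how the description ties input text and labels together.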
And S1056, processing the output of the bidirectional long-short term memory network (Bi-LSTM, Bidirectional Long Short-Term Memory) model through a fully connected (FC) layer to obtain the probability score of the label of each piece of dialogue behavior training data.
In this embodiment, the bidirectional long-short term memory network model is prior art, and its specific principle need not be described in detail. A fully connected layer can be understood as multiplying the output of the previous layer by a parameter matrix to convert it into the dimension required for the model's final output (i.e., the number of labels). Therefore, the output of the bidirectional long-short term memory network model is multiplied by the parameter matrix in the fully connected layer and converted into the probability score of the label of each piece of dialogue behavior training data.
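The fully connected projection of step S1056 can be sketched as below. The hidden size and label count are toy values, and using an independent sigmoid per label is an assumption consistent with each label receiving its own probability score that is later thresholded:

```python
import numpy as np

rng = np.random.default_rng(2)
HIDDEN = 16    # toy Bi-LSTM output size
N_LABELS = 4   # toy label count

# Final hidden representation of one utterance from the Bi-LSTM.
h = rng.normal(size=HIDDEN)
# Parameter matrix of the fully connected layer, mapping the hidden
# representation to one logit per label.
W_fc = rng.normal(size=(HIDDEN, N_LABELS))

# Multiply by the FC parameters, then squash each logit independently with
# a sigmoid so every label gets its own probability score (multi-label).
logits = h @ W_fc
scores = 1.0 / (1.0 + np.exp(-logits))
```

A sigmoid rather than a softmax keeps the label scores independent, so several labels can simultaneously exceed the threshold, matching the multi-label examples given later in the description.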
And step S1057, taking the labels whose probability scores are higher than a preset threshold as the prediction result of the current classification model. The preset threshold is user-defined, for example, 0.5.
And S1058, comparing the prediction result of the current classification model with the real label, and updating the parameters of the current classification model through back propagation to obtain the final classification model.
In this embodiment, the parameters of the current classification model include: the word vectors (a word vector has an initial value, i.e., a sememe word vector or the initial word vector described above, and is updated together with the other parameters during model training), the parameters in the attention layer, the parameters in the bidirectional long-short term memory network model, and the parameters in the fully connected layer. Back propagation is common knowledge in the field of machine learning and need not be described in detail.
Through the above method, the semantic information contained in the labels is fused into the classification model, a relation is established between the input text and the labels, and an effective and accurate final classification model is trained.
And S106, predicting the text to be detected by using the final classification model to obtain the label of the text to be detected.
In this embodiment, the final classification model is used to predict each text to be tested, obtaining the probability score of each label for that text; the labels whose probability scores are higher than a preset threshold are taken as the prediction result of the final classification model.
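The thresholding in step S106 can be sketched as below; the scores are hypothetical stand-ins for model output (the label names and first two values mirror the financial example given later in the description):

```python
# Hypothetical probability scores from the final classification model
# for one text to be tested.
scores = {
    "Cant_repay": 0.99995,
    "Lack_of_money": 0.98127,
    "Confirm_repayment_day": 0.03,
}

THRESHOLD = 0.5  # the preset threshold from the description

# Keep every label whose probability score exceeds the threshold; an empty
# list means no dialog act matched (reported as "None" in the examples).
predicted = [label for label, s in scores.items() if s > THRESHOLD]
```

Here two labels pass the threshold at once, illustrating the multi-label nature of the prediction.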
Through steps S100-S106, not only are similar words linked together, but words sharing sememes are also linked together, so the semantic information is richer. Meanwhile, OOV words are randomly initialized and added to the dictionary, overcoming the adverse effect of OOV words in short texts. Moreover, the semantic information contained in the labels is fused into the classification model, improving its effect.
Referring to fig. 3, the present application provides a conversation behavior classification method applied in the financial field.
In this embodiment of the application, based on dialogue behavior data from the financial field, the financial data labels are represented in English; the 290 labels are first translated into Chinese, represented as follows:
(Table of the financial data labels, shown as an image in the original publication.)
the method can be implemented according to the following steps:
and step S200, receiving a financial news database and financial field conversation behavior training data. The financial field dialogue behavior training data is labeled with the financial data labels.
Step S201, training Chinese words in a financial news database by using words, semantics and sememes to obtain a sememe word vector.
Step S202, the Chinese words in the financial news database and the sememe word vectors corresponding to them one by one are collected to obtain a first dictionary.
Step S203, performing first segmentation processing on the financial field dialogue behavior training data to obtain a first segmentation result.
And step S204, comparing the first dictionary with the first segmentation result to obtain a second dictionary.
In the embodiment, the Chinese words in the first dictionary are compared with the Chinese words in the first segmentation result to obtain the Chinese words outside the first dictionary; and then adding the Chinese words out of the first dictionary and the initial word vectors corresponding to the Chinese words one by one into the first dictionary to obtain the second dictionary.
And step S205, training the financial field dialogue behavior training data by using the second dictionary to obtain a final classification model. The method specifically comprises the following steps:
and carrying out second word segmentation on the dialogue behavior training data in the financial field to obtain a second word segmentation result.
And searching for and extracting the corresponding text sememe word vectors in the second dictionary according to the Chinese words in the second word segmentation result.
And performing third word segmentation processing on all the financial data labels to obtain a third word segmentation result.
And searching for and extracting the corresponding label sememe word vectors in the second dictionary according to the Chinese words in the third word segmentation result.
And multiplying the attention score of each financial data label with all text sememe word vectors of the financial field dialogue behavior training data respectively, and then passing the result into a bidirectional long-short term memory network model for processing.
And processing the result output by the bidirectional long-short term memory network model through a fully connected layer to obtain the probability score of each financial data label.
And taking the financial data label with the probability score higher than a preset threshold value as a prediction result of the current classification model.
And comparing the prediction result of the current classification model with the real label, and performing back propagation to update the parameters of the current classification model to obtain a final classification model.
And step S206, predicting the text to be tested by using the final classification model to obtain the financial data label of the text to be tested.
The final classification model is used to predict the text to be tested, obtaining a probability score for each financial data label; the financial data labels whose probability scores are higher than a preset threshold are taken as the prediction result of the classification model, i.e., the conversation behavior classification result of the text to be tested.
In the embodiment of the present application, for example, the sentence "I have no money recently and still cannot repay on time" is input into the final classification model, and the final classification model outputs the labels whose probability score exceeds the threshold of 0.5, obtaining:
Cant_repay 0.99995
Lack_of_money 0.98127
Accordingly, the dialogue behavior classification method accurately classifies dialogue behaviors in the financial field. For the input text "I have no money recently and still cannot repay on time", the probability score of the label "Cant_repay" is as high as 0.99995, and that of the label "Lack_of_money" is as high as 0.98127, so the dialogue behavior is accurately identified.
In this embodiment of the present application, for example, the sentence "is the repayment date the 25th?" is input into the final classification model, and the final classification model outputs the labels whose probability score exceeds the threshold of 0.5, obtaining:
Confirm_repayment_day 0.90940
As above, for the input text "is the repayment date the 25th?", the probability score of the label "Confirm_repayment_day" is as high as 0.90940, and the corresponding dialogue behavior is accurately identified.
In the embodiment of the present application, for example, the sentence "what coffee is good?" is input into the final classification model, and the final classification model outputs the labels whose probability score exceeds the threshold of 0.5, obtaining:
None
"None" here denotes an empty result: the dialogue behavior classification model trained on the financial dialogue scenario has no corresponding label for this example, so an empty result is returned. For the input text "what coffee is good?", no label reaches a probability score above the threshold of 0.5; that is, the application accurately identifies dialogue behaviors that do not belong to the financial scenario.
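The thresholding behavior shown in the three examples above can be sketched as a small helper; the function name and the hard-coded scores are illustrative only.

```python
def classify(probability_scores, threshold=0.5):
    """Return the labels whose probability score exceeds the threshold,
    or None when no label qualifies (out-of-domain input)."""
    result = {label: p for label, p in probability_scores.items() if p > threshold}
    return result or None

# Scores as produced by a trained model (values taken from the examples above).
print(classify({"Cant_repay": 0.99995, "Lack_of_money": 0.98127}))
print(classify({"Confirm_repayment_day": 0.90940}))
print(classify({"Cant_repay": 0.03}))  # no label above 0.5, so None
```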
The following are embodiments of the apparatus of the present application, which can be used to implement the above embodiments of the dialogue behavior classification method. For details not disclosed in the apparatus embodiments, please refer to the embodiments of the dialogue behavior classification method described above.
Referring to fig. 4, the present application provides a dialogue behavior classification apparatus, including: a receiving module 300, a word vector preparation module 301, a first dictionary module 302, a first word segmentation module 303, a second dictionary module 304, a classification model training module 305, and a dialogue behavior classification module 306.
A receiving module 300, configured to receive a database and dialogue acts training data, where the dialogue acts training data is labeled with a plurality of labels;
a word vector preparation module 301, configured to train the Chinese words in the database using word, sense, and sememe information to obtain sememe word vectors;
a first dictionary module 302, configured to summarize the Chinese words and the sememe word vectors corresponding to the Chinese words one to one, to obtain a first dictionary;
the first word segmentation module 303 is configured to perform first word segmentation processing on the dialogue behavior training data to obtain a first word segmentation result;
the second dictionary module 304 is configured to compare the first dictionary with the first word segmentation result to obtain a second dictionary;
a classification model training module 305, which trains the dialogue behavior training data by using the second dictionary to obtain a final classification model;
and the dialogue behavior classification module 306 is used for predicting the text to be tested by using the final classification model to obtain the label of the text to be tested.
In this embodiment, the word vector preparation module 301 incorporates three layers of information, namely word, sense, and sememe, into the word vector during training, thereby obtaining sememe word vectors. Each word corresponds to a unique sememe word vector.
In this embodiment, the second dictionary module 304 compares the Chinese words in the first dictionary with the Chinese words in the first word segmentation result to obtain the Chinese words outside the first dictionary, and then adds those out-of-dictionary Chinese words, together with their one-to-one corresponding initial word vectors, into the first dictionary to obtain the second dictionary. Each initial word vector is randomly initialized, and its length is the same as that of the sememe word vectors. The initial word vectors are continuously updated as the classification model is trained.
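A minimal sketch of this second-dictionary construction follows, under the assumption that word vectors are plain NumPy arrays; the function name and the 4-dimensional toy vectors are invented for illustration.

```python
import numpy as np

def build_second_dictionary(first_dictionary, segmentation_result, rng=None):
    """Add out-of-dictionary words with randomly initialized vectors whose
    length matches the sememe word vectors already in the first dictionary."""
    if rng is None:
        rng = np.random.default_rng(0)
    dim = len(next(iter(first_dictionary.values())))
    second = dict(first_dictionary)
    for word in segmentation_result:
        if word not in second:
            # Random initial vector; it keeps being updated during training.
            second[word] = rng.normal(size=dim)
    return second

first = {"还款": np.zeros(4), "日期": np.zeros(4)}
second = build_second_dictionary(first, ["还款", "逾期", "日期"])
print(sorted(second.keys() - first.keys()))  # ['逾期']
```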
In this embodiment, as shown in fig. 5, the classification model training module 305 specifically includes: a second word segmentation module 3051, a text sememe word vector module 3052, a third word segmentation module 3053, a label sememe word vector module 3054, a bidirectional long short-term memory network model processing module 3055, a fully connected layer processing module 3056, a model prediction result output module 3057, and a final classification model obtaining module 3058.
The second word segmentation module 3051 performs second word segmentation processing on the dialogue behavior training data to obtain a second word segmentation result.
The text sememe word vector module 3052 looks up and retrieves the corresponding text sememe word vectors in the second dictionary according to the Chinese words in the second word segmentation result.
The third word segmentation module 3053 performs third word segmentation processing on all the labels of the dialogue behavior training data to obtain a third word segmentation result.
The label sememe word vector module 3054 looks up and retrieves the corresponding label sememe word vectors in the second dictionary according to the Chinese words in the third word segmentation result.
The bidirectional long short-term memory network model processing module 3055 multiplies the attention score of the label of each piece of dialogue behavior training data with all the text sememe word vectors of that training data, and then passes the result into the bidirectional long short-term memory network model for processing.
The fully connected layer processing module 3056 processes the result output by the bidirectional long short-term memory network model through a fully connected layer to obtain the probability score of the label of each piece of dialogue behavior training data.
The model prediction result output module 3057 takes the labels whose probability score is higher than a preset threshold as the prediction result of the current classification model.
The final classification model obtaining module 3058 compares the prediction result of the current classification model with the true labels and updates the parameters of the current classification model through back-propagation to obtain the final classification model.
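The comparison-and-update step performed by the final classification model obtaining module can be sketched in heavily simplified form as one gradient step on a stand-in single-layer scorer with a multi-label binary cross-entropy loss; the real model would back-propagate through the attention layer, the BiLSTM, and the fully connected layer, and all names and data here are illustrative.

```python
import numpy as np

def bce_loss(probs, true_labels):
    """Multi-label binary cross-entropy between the predicted probability
    scores and the 0/1 true-label vector."""
    eps = 1e-9
    return -np.mean(true_labels * np.log(probs + eps)
                    + (1.0 - true_labels) * np.log(1.0 - probs + eps))

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 8))        # one pooled representation per label
w = rng.normal(size=8)             # stand-in for all trainable parameters
y = np.array([1.0, 1.0, 0.0])      # true labels of one training utterance

probs = 1.0 / (1.0 + np.exp(-(x @ w)))
loss_before = bce_loss(probs, y)

grad = x.T @ (probs - y) / len(y)  # gradient of BCE through the sigmoid
w -= 0.1 * grad                    # back-propagation parameter update
probs = 1.0 / (1.0 + np.exp(-(x @ w)))
loss_after = bce_loss(probs, y)
print(loss_after < loss_before)    # loss decreases after the update
```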
Referring to fig. 6, an electronic device 400 includes a processor 401 and a memory 402 for storing instructions executable by the processor 401. Wherein the processor 401 is configured to execute the dialog behavior classification method in any of the above embodiments.
The processor 401 may be an integrated circuit chip having signal processing capabilities. The processor 401 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; or it may be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.
The memory 402 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The memory 402 also stores one or more modules that are executed by the one or more processors 401 to implement the steps of the dialogue behavior classification method in the above embodiments.
An embodiment of the present application further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is executable by the processor 401 to perform the dialog behavior classification method in any of the above embodiments.
In the several embodiments provided in the present application, the disclosed apparatus and method may also be implemented in other manners. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory, a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The above embodiments are provided only to illustrate the present invention, not to limit it. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the present invention, so all equivalent technical solutions also fall within the scope of the present invention, which is defined by the claims.

Claims (5)

1. A conversation behavior classification method, comprising:
receiving a database and dialogue behavior training data, wherein the dialogue behavior training data is labeled with a plurality of labels;
training the Chinese words in the database using word, sense, and sememe information to obtain sememe word vectors;
summarizing the Chinese words and the sememe word vectors corresponding to the Chinese words one to one to obtain a first dictionary;
performing first word segmentation processing on the dialogue behavior training data to obtain a first word segmentation result;
comparing the first dictionary with the first segmentation result to obtain a second dictionary;
training the dialogue behavior training data by using the second dictionary to obtain a final classification model;
predicting the text to be tested by using the final classification model to obtain a label of the text to be tested;
wherein the training the dialogue behavior training data using the second dictionary to obtain a final classification model comprises:
performing second word segmentation processing on the dialogue behavior training data to obtain a second word segmentation result;
searching for and retrieving the corresponding text sememe word vectors in the second dictionary according to the Chinese words in the second word segmentation result;
performing third word segmentation processing on all the labels of the dialogue behavior training data to obtain a third word segmentation result;
searching for and retrieving the corresponding label sememe word vectors in the second dictionary according to the Chinese words in the third word segmentation result;
multiplying the attention score of the label of each piece of dialogue behavior training data with all the text sememe word vectors of the dialogue behavior training data, and then passing the result into a bidirectional long short-term memory network model for processing;
processing the result output by the bidirectional long short-term memory network model through a fully connected layer to obtain the probability score of the label of each piece of dialogue behavior training data;
taking the label with the probability score higher than a preset threshold value as a prediction result of the current classification model;
comparing the prediction result of the current classification model with a real label, and updating the parameters of the current classification model through back propagation to obtain a final classification model;
wherein the attention score of the label of the dialogue behavior training data is obtained by multiplying all the label sememe word vectors of the label by a parameter matrix in an attention layer;
the processing the result output by the bidirectional long short-term memory network model through a fully connected layer to obtain the probability score of the label of each piece of dialogue behavior training data comprises:
multiplying the output of the bidirectional long short-term memory network model by a parameter matrix in the fully connected layer, and converting the result into the probability score of the label of each piece of dialogue behavior training data;
the predicting the text to be tested by using the final classification model to obtain the label of the text to be tested comprises the following steps:
predicting the texts to be tested by using the final classification model to obtain the probability score of the label of each text to be tested;
taking the label with the probability score higher than a preset threshold value as a prediction result of the final classification model;
the preset threshold value is 0.5;
the parameters of the current classification model include: the word vectors, the parameters in the attention layer, the parameters in the bidirectional long short-term memory network model, and the parameters of the fully connected layer.
2. The conversation behavior classification method according to claim 1, wherein the comparing the first dictionary with the first word segmentation result to obtain a second dictionary comprises:
comparing the Chinese words in the first dictionary with the Chinese words in the first segmentation result to obtain the Chinese words outside the first dictionary;
adding the Chinese words outside the first dictionary, together with the initial word vectors corresponding to the Chinese words one to one, into the first dictionary to obtain the second dictionary;
wherein the length of the initial word vectors is the same as the length of the sememe word vectors.
3. The conversation behavior classification method according to claim 2, wherein the initial word vector is obtained by randomly initializing an array.
4. A conversation behavior classification apparatus based on the conversation behavior classification method according to any one of claims 1 to 3, comprising:
a receiving module, configured to receive a database and dialogue behavior training data, wherein the dialogue behavior training data is labeled with a plurality of labels;
a word vector preparation module, configured to train the Chinese words in the database using word, sense, and sememe information to obtain sememe word vectors;
a first dictionary module, configured to summarize the Chinese words and the sememe word vectors corresponding to the Chinese words one to one to obtain a first dictionary;
the first word segmentation module is used for carrying out first word segmentation on the dialogue behavior training data to obtain a first word segmentation result;
the second dictionary module is used for comparing the first dictionary with the first segmentation result to obtain a second dictionary;
the classification model training module is used for training the dialogue behavior training data by using the second dictionary to obtain a final classification model;
and the dialogue behavior classification module is used for predicting the text to be detected by using the final classification model to obtain the label of the text to be detected.
5. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the dialogue acts classification method of any of claims 1-3.
CN202110736919.7A 2021-06-30 2021-06-30 Conversation behavior classification method and device and electronic equipment Active CN113468308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110736919.7A CN113468308B (en) 2021-06-30 2021-06-30 Conversation behavior classification method and device and electronic equipment


Publications (2)

Publication Number Publication Date
CN113468308A CN113468308A (en) 2021-10-01
CN113468308B true CN113468308B (en) 2023-02-10

Family

ID=77876507


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109597988A (en) * 2018-10-31 2019-04-09 清华大学 The former prediction technique of vocabulary justice, device and electronic equipment across language
CN109614618A (en) * 2018-06-01 2019-04-12 安徽省泰岳祥升软件有限公司 Word treatment method and device outside collection based on multi-semantic meaning
CN109766431A (en) * 2018-12-24 2019-05-17 同济大学 A kind of social networks short text recommended method based on meaning of a word topic model
CN110119786A (en) * 2019-05-20 2019-08-13 北京奇艺世纪科技有限公司 Text topic classification method and device
CN110209818A (en) * 2019-06-04 2019-09-06 南京邮电大学 A kind of analysis method of Semantic-Oriented sensitivity words and phrases

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6904402B1 (en) * 1999-11-05 2005-06-07 Microsoft Corporation System and iterative method for lexicon, segmentation and language model joint optimization


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Training a Dialogue Act Tagger for Human-Human and Human-Computer Travel Dialogues; Rashmi Prasad et al.; SIGDIAL; 2002; full text *
Vector Representations of Words and Senses Leveraging a Human-Built Knowledge Base: The Case of HowNet; Sun Maosong et al.; Journal of Chinese Information Processing; 2016-11-15 (No. 06); full text *
Sememe Prediction for Definition Texts Based on Local Semantic Relevance; Du Jiaju et al.; Journal of Chinese Information Processing; 2020-05-15 (No. 05); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant