CN118377909B - Customer label determining method and device based on call content and storage medium - Google Patents
Customer label determining method and device based on call content and storage medium
- Publication number
- CN118377909B (application number CN202410804728.3A)
- Authority
- CN
- China
- Prior art keywords
- word
- semantic
- voice
- coding
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
The application discloses a method, a device, and a storage medium for determining a customer label based on call content. After recording permission authorization is obtained, call-process voice data between a service person and a customer is recorded through recording equipment. At the back end, an artificial-intelligence-based data processing and semantic understanding algorithm performs word-granularity global semantic encoding of the text content of the call-process voice data, so as to understand the semantics of the customer's speech and recognize the customer's emotional state, thereby judging whether the customer's emotion is positive, negative, or neutral. In this way, changes in the customer's emotion during the call can be identified intelligently, providing a basis for enterprise decision-making. For example, the customer's satisfaction with the service, or the customer's willingness to repay after a loan has been granted, can be judged from the customer's emotional state and how it changes, so that a more targeted strategy can be formulated.
Description
Technical Field
The present application relates to the field of intelligent recognition, and more particularly, to a method, apparatus, and storage medium for determining a client tag based on call content.
Background
In modern business environments, telephone service is one of the important ways in which enterprises operate. Call content processing refers to analyzing and processing telephone calls, customer service conversations, or other voice communication content. By analyzing call content, important information about customers' needs, problems, and attitudes can be obtained, which helps enterprises understand their customers better, provide more personalized services, and make corresponding decisions. For example, analysis of call content can be used to judge a customer's satisfaction with a service, or a customer's willingness to repay after a loan has been granted, so that a more targeted strategy can be formulated.
However, traditional call content processing methods usually rely on keyword matching, manual analysis, and similar approaches. They neglect analysis of the customer's emotion, which limits the enterprise's understanding of the customer's actual feelings and attitudes. In addition, traditional methods extract only part of the key information in the call content and cannot grasp the overall situation and context of the call, so enterprises lack a comprehensive basis for judging customer satisfaction and enthusiasm and for making more effective decisions.
Accordingly, a customer label determination scheme based on call content is desired.
Disclosure of Invention
The present application has been made to solve the above technical problems. Embodiments of the application provide a method, a device, and a storage medium for determining a customer label based on call content. After recording permission authorization is obtained, call-process voice data between a service person and a customer is recorded through recording equipment. At the back end, an artificial-intelligence-based data processing and semantic understanding algorithm performs word-granularity global semantic encoding of the text content of the call-process voice data, so as to understand the semantics of the customer's speech and recognize the customer's emotional state, thereby judging whether the customer's emotion is positive, negative, or neutral. In this way, changes in the customer's emotion during the call can be identified intelligently, providing a basis for enterprise decision-making. For example, the customer's satisfaction with the service, or the customer's willingness to repay after a loan has been granted, can be judged from the customer's emotional state and how it changes, so that a more targeted strategy can be formulated.
According to one aspect of the present application, there is provided a method for determining a customer label based on call content, including:
after recording permission authorization is obtained, recording call-process voice data between a service person and a customer through recording equipment, and storing the call-process voice data in a voice database;
retrieving the call-process voice data from the voice database;
performing speech recognition on the call-process voice data to convert it into a call-process speech-text recognition result, and then preprocessing the call-process speech-text recognition result to obtain a preprocessed call-process speech-text recognition result;
performing word-granularity word embedding encoding on the preprocessed call-process speech-text recognition result to obtain a sequence of voice content word-granularity embedded encoding vectors;
performing part-of-speech tagging on each voice content word in the sequence of voice content words, and one-hot encoding the part of speech of each voice content word, to obtain a sequence of semantic content word part-of-speech one-hot encoding vectors;
splicing, word by word, each pair of corresponding vectors in the sequence of voice content word-granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors, and then performing word-granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain a voice content global text semantic encoding feature; and
determining, based on the voice content global text semantic encoding feature, whether the customer's emotion is positive, negative, or neutral.
According to another aspect of the present application, there is provided a device for determining a customer label based on call content, comprising:
a data acquisition module, configured to record call-process voice data between a service person and a customer through recording equipment after recording permission authorization is obtained, and to store the call-process voice data in a voice database;
a data retrieval module, configured to retrieve the call-process voice data from the voice database;
a data preprocessing module, configured to perform speech recognition on the call-process voice data to convert it into a call-process speech-text recognition result, and then to preprocess the call-process speech-text recognition result to obtain a preprocessed call-process speech-text recognition result;
a voice content word-granularity embedded encoding module, configured to perform word-granularity word embedding encoding on the preprocessed call-process speech-text recognition result to obtain a sequence of voice content word-granularity embedded encoding vectors;
a one-hot encoding module, configured to perform part-of-speech tagging on each voice content word in the sequence of voice content words and to one-hot encode the part of speech of each voice content word, to obtain a sequence of semantic content word part-of-speech one-hot encoding vectors;
a word-granularity multi-scale semantic association encoding module, configured to splice, word by word, each pair of corresponding vectors in the sequence of voice content word-granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors, and then to perform word-granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain a voice content global text semantic encoding feature; and
a customer label determining module, configured to determine, based on the voice content global text semantic encoding feature, whether the customer's emotion is positive, negative, or neutral.
According to a further aspect of the present application, there is provided a computer storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the method for determining a customer label based on call content as described above.
Compared with the prior art, the method, device, and storage medium for determining a customer label based on call content provided by the application record call-process voice data between a service person and a customer through recording equipment after recording permission authorization is obtained, and introduce, at the back end, an artificial-intelligence-based data processing and semantic understanding algorithm that performs word-granularity global semantic encoding of the text content of the call-process voice data, so as to understand the semantics of the customer's speech and recognize the customer's emotional state, thereby judging whether the customer's emotion is positive, negative, or neutral. In this way, changes in the customer's emotion during the call can be identified intelligently, providing a basis for enterprise decision-making. For example, the customer's satisfaction with the service, or the customer's willingness to repay after a loan has been granted, can be judged from the customer's emotional state and how it changes, so that a more targeted strategy can be formulated.
Drawings
The above and other objects, features, and advantages of the present application will become more apparent from the following detailed description of embodiments of the application with reference to the accompanying drawings. The accompanying drawings provide a further understanding of the embodiments of the application and constitute a part of this specification; together with the embodiments, they serve to explain the application and do not limit it. In the drawings, like reference numerals generally refer to like parts or steps.
Fig. 1 is a flowchart of a method for determining a customer label based on call content according to an embodiment of the present application;
Fig. 2 is a system architecture diagram of a method for determining a customer label based on call content according to an embodiment of the present application;
Fig. 3 is a flowchart of the training phase of a method for determining a customer label based on call content according to an embodiment of the present application;
Fig. 4 is a flowchart of sub-step S4 of a method for determining a customer label based on call content according to an embodiment of the present application;
Fig. 5 is a flowchart of sub-step S6 of a method for determining a customer label based on call content according to an embodiment of the present application;
Fig. 6 is a block diagram of a device for determining a customer label based on call content according to an embodiment of the present application.
Detailed Description
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
As used in the specification and in the claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the systems and methods may use different modules.
Flowcharts are used in the present application to describe the operations performed by a system according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Other operations may also be added to these processes, or one or more operations may be removed from them.
Traditional call content processing methods usually rely on keyword matching, manual analysis, and similar approaches. They neglect analysis of the customer's emotion, which limits the enterprise's understanding of the customer's actual feelings and attitudes. In addition, traditional methods extract only part of the key information in the call content and cannot grasp the overall situation and context of the call, so enterprises lack a comprehensive basis for judging customer satisfaction and enthusiasm and for making more effective decisions. Accordingly, a customer label determination scheme based on call content is desired.
In the technical solution of the present application, a method for determining a customer label based on call content is provided. Fig. 1 is a flowchart of a method for determining a customer label based on call content according to an embodiment of the present application. Fig. 2 is a system architecture diagram of a method for determining a customer label based on call content according to an embodiment of the present application. As shown in Fig. 1 and Fig. 2, the method for determining a customer label based on call content according to an embodiment of the present application includes the following steps:
S1, after recording permission authorization is obtained, recording call-process voice data between a service person and a customer through recording equipment, and storing the call-process voice data in a voice database;
S2, retrieving the call-process voice data from the voice database;
S3, performing speech recognition on the call-process voice data to convert it into a call-process speech-text recognition result, and then preprocessing the call-process speech-text recognition result to obtain a preprocessed call-process speech-text recognition result;
S4, performing word-granularity word embedding encoding on the preprocessed call-process speech-text recognition result to obtain a sequence of voice content word-granularity embedded encoding vectors;
S5, performing part-of-speech tagging on each voice content word in the sequence of voice content words, and one-hot encoding the part of speech of each voice content word, to obtain a sequence of semantic content word part-of-speech one-hot encoding vectors;
S6, splicing, word by word, each pair of corresponding vectors in the sequence of voice content word-granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors, and then performing word-granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain a voice content global text semantic encoding feature;
S7, determining, based on the voice content global text semantic encoding feature, whether the customer's emotion is positive, negative, or neutral.
Specifically, in steps S1 and S2, after recording permission authorization is obtained, call-process voice data between a service person and a customer is recorded through recording equipment and stored in a voice database, and the call-process voice data is then retrieved from the voice database. In this way, a large amount of voice data can be subjected to semantic analysis, so that changes in the customer's emotion during the call can be identified intelligently, providing a basis for enterprise decision-making.
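As a rough illustration of steps S1 and S2, the following minimal sketch stores and retrieves call audio with Python's standard `sqlite3` module. The table layout, the function names, and the boolean `authorized` flag are assumptions made for this example only; the application does not specify how the voice database or the authorization check is implemented.

```python
import sqlite3


def open_voice_db(path=":memory:"):
    """Open (or create) a hypothetical voice database with one table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS call_audio ("
        "call_id TEXT PRIMARY KEY, "
        "customer_id TEXT NOT NULL, "
        "audio BLOB NOT NULL)"
    )
    return conn


def store_call_audio(conn, call_id, customer_id, audio, authorized):
    """S1: store a recording only if recording permission was granted."""
    if not authorized:
        raise PermissionError("recording permission has not been granted")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO call_audio (call_id, customer_id, audio) VALUES (?, ?, ?)",
            (call_id, customer_id, audio),
        )


def load_call_audio(conn, call_id):
    """S2: retrieve the raw audio bytes for one call, or None if absent."""
    row = conn.execute(
        "SELECT audio FROM call_audio WHERE call_id = ?", (call_id,)
    ).fetchone()
    return None if row is None else bytes(row[0])
```

Refusing to store anything when `authorized` is false keeps the permission check in a single place, ahead of any write to the database.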
Specifically, in step S3, speech recognition is first performed on the call-process voice data to convert it into a call-process speech-text recognition result, and the call-process speech-text recognition result is then preprocessed to obtain a preprocessed call-process speech-text recognition result. It should be understood that, in order to analyze the call-process voice data, judge the customer's emotional state and how it changes, and thereby identify whether the customer's emotion is positive, negative, or neutral so that a corresponding strategy can be made, the technical solution of the present application retrieves the call-process voice data from the voice database and performs speech recognition on it to obtain the call-process speech-text recognition result. Once the call content has been converted into text, it can be analyzed more fully and conveniently with natural language processing tools and algorithms. This helps enterprises better understand the semantic content and emotional tendency of the customer in the call-process voice data, so that important information such as customer needs and emotional states can be extracted to support enterprise decision-making. Spelling errors may occur when the call content is converted into text, which can affect subsequent text processing and analysis, and repeated words introduce redundant information, which can also affect understanding and analysis of the text content.
Therefore, in order to perform semantic analysis and emotion recognition on the call-process speech-text recognition result more accurately and so determine the customer's emotional state, the technical solution of the present application preprocesses the call-process speech-text recognition result to obtain the preprocessed call-process speech-text recognition result. In particular, the preprocessing includes removing stop words, correcting spelling errors, and removing repeated words. Correcting spelling errors ensures the accuracy of the text data and avoids misunderstanding or misinterpretation. Removing repeated words simplifies the text structure and reduces redundant information, making the text more concise and easier to process. A "stop word" is a word that carries little actual meaning or is unimportant in natural language processing, such as "is" or "in". Removing stop words reduces noise, improves text quality and readability, and helps highlight important information and keywords. These preprocessing steps improve the quality of the call-process speech-text recognition result and the efficiency and accuracy of the subsequent semantic analysis and emotion recognition.
In a specific example of the present application, performing speech recognition on the call-process voice data to convert it into a call-process speech-text recognition result, and then preprocessing the call-process speech-text recognition result to obtain a preprocessed call-process speech-text recognition result, includes: performing speech recognition on the call-process voice data to obtain the call-process speech-text recognition result; removing stop words from the call-process speech-text recognition result; correcting spelling errors in the call-process speech-text recognition result; and removing repeated words from the call-process speech-text recognition result.
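The three preprocessing operations listed above can be sketched as follows. This is a minimal illustration that assumes the transcript has already been split into tokens. The stop-word set and the correction table are hypothetical placeholders, and "repeated words" is interpreted here as immediately repeated tokens, which is one possible reading of the text.

```python
# Hypothetical resources for illustration; a real system would load
# language-specific stop-word lists and a proper spelling corrector.
STOP_WORDS = {"the", "a", "is", "in", "of", "to"}
CORRECTIONS = {"repaymnt": "repayment", "servise": "service"}


def preprocess(tokens):
    """Apply the preprocessing steps in the order given in the text."""
    tokens = [t.lower() for t in tokens]
    # 1. Remove stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # 2. Correct known spelling errors via a lookup table.
    tokens = [CORRECTIONS.get(t, t) for t in tokens]
    # 3. Remove immediately repeated words (assumed interpretation).
    deduped = []
    for t in tokens:
        if not deduped or deduped[-1] != t:
            deduped.append(t)
    return deduped
```

For instance, `preprocess(["The", "servise", "is", "very", "very", "good"])` yields `["service", "very", "good"]`.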
Specifically, step S4 performs word-granularity word embedding encoding on the preprocessed call-process speech-text recognition result to obtain the sequence of voice content word-granularity embedded encoding vectors. In particular, in one specific example of the present application, as shown in Fig. 4, S4 includes: S41, performing word segmentation on the preprocessed call-process speech-text recognition result to obtain a sequence of voice content words; and S42, passing the sequence of voice content words through a word embedding encoder based on a Word2Vec model to obtain the sequence of voice content word-granularity embedded encoding vectors.
Specifically, step S41 performs word segmentation on the preprocessed call-process speech-text recognition result to obtain a sequence of voice content words. The preprocessed call-process speech-text recognition result contains semantic information about the customer during the call, and this information plays an important role in semantic understanding and emotion recognition of the call content. This semantic information is composed of a plurality of words, and the words are associated with one another. Therefore, in order to better capture the meaning and context of each word in a sentence, and thereby facilitate the subsequent semantic analysis and emotion recognition of the call-process speech-text recognition result, the technical solution of the present application performs word segmentation on the preprocessed call-process speech-text recognition result to obtain the sequence of voice content words.
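To make the idea of word segmentation concrete, here is a minimal sketch for whitespace-delimited text using a regular expression. The pattern and the function name are assumptions for illustration. Transcripts in languages written without spaces, such as Chinese, would instead need a dictionary-based or model-based segmenter, which the application does not name.

```python
import re

# Matches runs of letters/digits, optionally followed by an apostrophe
# suffix such as "'t" or "'s" (an illustrative pattern, not a full tokenizer).
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def segment(text):
    """Split a transcript into a sequence of word tokens."""
    return TOKEN_PATTERN.findall(text)
```

For example, `segment("I can't pay this month, sorry.")` returns `["I", "can't", "pay", "this", "month", "sorry"]`.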
Specifically, step S42 passes the sequence of voice content words through a word embedding encoder based on a Word2Vec model to obtain the sequence of voice content word-granularity embedded encoding vectors. That is, the Word2Vec-based word embedding encoder embeds and encodes each word in the call-process speech-text recognition result, so that the semantic features of each word are extracted and the meanings of, and associations between, the words in the voice content can be better understood, which in turn supports the subsequent emotion analysis and content understanding. It is worth mentioning that Word2Vec is a commonly used word embedding model that maps words into a continuous vector space to capture semantic relationships between words. Word2Vec has two main architectures: skip-gram and continuous bag-of-words (CBOW). By learning the contexts of words in a large amount of text data, a Word2Vec model generates word embedding vectors that carry semantic information, so that words with similar meanings lie closer together in the vector space. These word embedding vectors can be used in various natural language processing tasks, such as text classification, emotion analysis, and named entity recognition, to improve model performance.
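The way skip-gram "learns the context of a word" can be illustrated by how its training pairs are formed: each word is paired with every neighbor inside a fixed window. The sketch below shows only this pair generation, not the training of the embedding itself. The function name and the default window size are assumptions for illustration.

```python
def skipgram_pairs(words, window=2):
    """Return (center, context) pairs for a skip-gram style objective."""
    pairs = []
    for i, center in enumerate(words):
        lo = max(0, i - window)               # left edge of the window
        hi = min(len(words), i + window + 1)  # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                        # a word is not its own context
                pairs.append((center, words[j]))
    return pairs
```

With `window=1`, the sequence `["a", "b", "c"]` produces `[("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]`. CBOW uses the same windows but predicts the center word from its context words rather than the reverse.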
It should be noted that, in other specific examples of the present application, the word-granularity word embedding encoding of the preprocessed call-process speech-text recognition result may be performed in other ways to obtain the sequence of voice content word-granularity embedded encoding vectors, for example: converting the preprocessed call-process voice data into text data; mapping each word to a high-dimensional real-valued vector to capture semantic relationships between words, where common word embedding models include Word2Vec, GloVe, and fastText; performing word embedding encoding on the segmented word sequence so that each word is represented by its corresponding word embedding vector; and padding the resulting sequences of word embedding encoding vectors so that they all have the same length.
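The final padding step can be sketched as follows, reading "the same length" as the same number of vectors per sequence. Zero-padding at the end and truncation of overlong sequences are assumptions for this example; the application does not state the padding value or position.

```python
def pad_sequence(vectors, max_len, dim):
    """Truncate or zero-pad a list of dim-sized vectors to max_len entries."""
    padded = [list(v) for v in vectors[:max_len]]
    # Append all-zero vectors until the sequence reaches max_len.
    padded.extend([0.0] * dim for _ in range(max_len - len(padded)))
    return padded
```

For example, `pad_sequence([[1.0, 2.0]], max_len=3, dim=2)` returns `[[1.0, 2.0], [0.0, 0.0], [0.0, 0.0]]`.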
Specifically, step S5 performs part-of-speech tagging on each voice content word in the sequence of voice content words and one-hot encodes the part of speech of each voice content word to obtain the sequence of semantic content word part-of-speech one-hot encoding vectors. When the customer's emotional state and its changes are judged by analyzing the call-process voice data, the part of speech of each word in the call-process speech-text recognition result plays an important auxiliary role in improving the accuracy of emotional state recognition, because the parts of speech of the words can reflect the customer's underlying emotional tone and speaking habits during the call. Therefore, in the technical solution of the present application, part-of-speech tagging is further performed on each voice content word in the sequence of voice content words, and the part of speech of each voice content word is one-hot encoded to obtain the sequence of semantic content word part-of-speech one-hot encoding vectors. It should be understood that part-of-speech tagging helps identify the grammatical role and meaning of each word in a sentence. Assigning a part-of-speech tag to each word makes the structure and meaning of the sentence easier to understand, which further improves the accuracy of semantic understanding and emotion recognition. In particular, part-of-speech tagging helps distinguish the roles of different words in a sentence, such as nouns, verbs, and adjectives. This supports a better understanding of the context and logical relationships of the sentence, improves the overall grasp and analysis of the voice content, and provides an accurate basis for the subsequent recognition of the customer's emotional state and analysis of its changes.
In addition, one-hot encoding converts the part of speech of each voice content word into vector form, which helps the model represent and learn the part-of-speech information in the call-process speech-text recognition result and improves its ability to understand text semantics and recognize emotional states.
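A minimal sketch of part-of-speech tagging followed by one-hot encoding is given below. The five-tag set and the toy lexicon are hypothetical stand-ins for a real part-of-speech tagger, which the application does not specify.

```python
# Hypothetical tag set and lookup table, for illustration only.
POS_TAGS = ["NOUN", "VERB", "ADJ", "ADV", "OTHER"]
TOY_LEXICON = {"service": "NOUN", "pay": "VERB", "good": "ADJ", "very": "ADV"}


def tag_words(words):
    """Assign a part-of-speech tag to each word (unknown words -> OTHER)."""
    return [TOY_LEXICON.get(w, "OTHER") for w in words]


def one_hot(tag):
    """Encode one tag as a vector with a single 1.0 at the tag's index."""
    vec = [0.0] * len(POS_TAGS)
    vec[POS_TAGS.index(tag)] = 1.0
    return vec


def pos_one_hot_sequence(words):
    """Tag every word and one-hot encode its part of speech."""
    return [one_hot(t) for t in tag_words(words)]
```

Each output vector has as many components as there are tags, so the one-hot dimension is fixed by the tag set rather than by the vocabulary.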
Specifically, step S6 splices, word by word, each pair of corresponding vectors in the sequence of voice content word-granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors, and then performs word-granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain the voice content global text semantic encoding feature. In particular, in one specific example of the present application, S6 includes: S61, splicing the sequence of voice content word-granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors word by word to obtain a sequence of voice content word semantic-part-of-speech fusion embedded encoding vectors; and S62, passing the sequence of voice content word semantic-part-of-speech fusion embedded encoding vectors through a word-granularity multi-scale semantic association encoder to obtain a voice content global text semantic encoding feature vector as the voice content global text semantic encoding feature.
Specifically, in S61, the sequence of speech content word granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors are spliced word by word to obtain the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors. It should be understood that the two sequences respectively contain the word granularity semantic encoding features and the part-of-speech encoding features of the preprocessed call process voice-text recognition result. The word-by-word splicing combines the semantic information and the part-of-speech information of each word, providing a richer and more comprehensive feature representation for each word and thereby helping to improve the performance of the model in the customer emotion analysis task.
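The word-by-word splicing of S61 can be sketched as a per-word concatenation of the two aligned vectors. The function name `splice_word_by_word` and the example values are illustrative assumptions, not taken from the application.

```python
def splice_word_by_word(embeddings, pos_one_hots):
    """Concatenate each word embedding with the one-hot vector of the same word."""
    if len(embeddings) != len(pos_one_hots):
        raise ValueError("both sequences must contain one vector per word")
    return [list(emb) + list(pos) for emb, pos in zip(embeddings, pos_one_hots)]


# Two words, each with a 3-dimensional embedding and a 3-dimensional one-hot vector.
embeddings = [[0.2, -0.1, 0.7], [0.5, 0.3, -0.4]]
pos_one_hots = [[1, 0, 0], [0, 1, 0]]
fused = splice_word_by_word(embeddings, pos_one_hots)
```

The fused vector of each word has a dimension equal to the embedding dimension plus the size of the part-of-speech inventory, and the number of vectors in the sequence is unchanged.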
Specifically, in S62, the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors is passed through a word granularity multi-scale semantic association encoder to obtain a speech content global text semantic encoding feature vector as the speech content global text semantic encoding feature. It should be understood that, after the word-granularity word embedding semantic encoding features and the part-of-speech one-hot encoding features of the preprocessed call process voice-text recognition result have been spliced and fused word by word, it is considered that contextual semantic association information exists between the words in the recognition result. Therefore, in order to grasp the overall situation and context of the call and to understand the semantic structure and emotional state changes of the whole text more comprehensively, in the technical scheme of the application the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors is further processed by a word granularity multi-scale semantic association encoder to obtain the speech content global text semantic encoding feature vector. The word granularity multi-scale semantic association encoder performs semantic association encoding and information fusion over the global text by using the similarity and distance relations between the speech content word semantic-part-of-speech fusion embedded encoding vectors corresponding to the words in the speech content.
In particular, since the similarity and distance relations among words reflect their contextual relationship and degree of association in the text, analyzing these semantic relations and using them in the association encoding of the speech content global text makes it possible to better understand the meaning and role of each word in different contexts, which improves the understanding of the global semantics of the text and, in turn, the accuracy of the customer emotional state recognition task. In a specific example of the present application, passing the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors through the word granularity multi-scale semantic association encoder to obtain the speech content global text semantic encoding feature vector as the speech content global text semantic encoding feature includes:
calculating the word granularity semantic similarity between every two speech content word semantic-part-of-speech fusion embedded encoding vectors in the sequence to obtain a word granularity semantic similarity vector composed of a plurality of word granularity semantic similarities;
calculating the number of feature vectors spaced between every two speech content word semantic-part-of-speech fusion embedded encoding vectors as a word distance count to obtain a word distance count vector composed of a plurality of word distance counts;
calculating the natural exponential function value of the word distance count vector position by position to obtain a word distance count class support vector;
determining the number of vectors in the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors to obtain a vector count;
multiplying the word granularity semantic similarity vector and the word distance count class support vector position by position, and dividing the resulting feature vector position by position by the vector count, to obtain a semantic enhancement factor vector;
performing weighted correction on the semantic enhancement factor vector and on the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors, respectively, based on trainable hyperparameters to obtain a corrected semantic enhancement factor vector and a sequence of corrected speech content word semantic-part-of-speech fusion embedded encoding vectors;
taking the corrected semantic enhancement factor at each position of the corrected semantic enhancement factor vector as a weight, and weighting each speech content word semantic-part-of-speech fusion embedded encoding vector in the sequence accordingly to obtain a sequence of first-scale speech content global text semantic association encoding feature vectors;
adding each group of corresponding first-scale speech content global text semantic association encoding feature vector and corrected speech content word semantic-part-of-speech fusion embedded encoding vector position by position to obtain a sequence of speech content global text semantic association feature vectors; and
concatenating all speech content global text semantic association feature vectors in the sequence to obtain the speech content global text semantic encoding feature vector.
More specifically, passing the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors through the word granularity multi-scale semantic association encoder to obtain the speech content global text semantic encoding feature vector as the speech content global text semantic encoding feature includes: processing the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors through the word granularity multi-scale semantic association encoder according to a word granularity multi-scale semantic association encoding formula to obtain a sequence of speech content global text semantic association feature vectors.
The quantities in the word granularity multi-scale semantic association encoding formula are: any two speech content word semantic-part-of-speech fusion embedded encoding vectors in the sequence, identified by their positions; the number of feature vectors spaced between these two vectors; the similarity between these two vectors; the total number of feature vectors in the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors; the trainable hyperparameters; and the speech content global text semantic association feature vector at each position of the output sequence. All speech content global text semantic association feature vectors in the sequence are then concatenated to obtain the speech content global text semantic encoding feature vector.
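The encoder steps described above can be sketched as follows. This is one possible reading under stated assumptions, not the formula of the application: cosine similarity is assumed as the similarity measure, the exponential is assumed to decay with the word distance, and the pairwise terms are assumed to be summed per position before division by the vector count. The names `association_encode`, `alpha` and `beta` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def association_encode(vectors, alpha=1.0, beta=1.0):
    """Weight each fused vector by its association with all other vectors."""
    count = len(vectors)  # vector count
    outputs = []
    for i, v_i in enumerate(vectors):
        factor = 0.0
        for j, v_j in enumerate(vectors):
            if i == j:
                continue
            similarity = cosine_similarity(v_i, v_j)  # assumed similarity measure
            gap = abs(i - j) - 1  # number of vectors spaced between the two
            support = math.exp(-gap)  # assumption: support decays with distance
            factor += similarity * support  # assumption: sum over the other words
        factor /= count  # semantic enhancement factor for position i
        weight = alpha * factor  # corrected enhancement factor
        corrected = [beta * x for x in v_i]  # corrected fused vector
        first_scale = [weight * x for x in v_i]  # weighted enhancement
        outputs.append([f + c for f, c in zip(first_scale, corrected)])
    # Concatenate all association feature vectors into one global feature vector.
    return [x for vector in outputs for x in vector]


encoded = association_encode([[1.0, 0.0], [1.0, 0.0]])
```

With `alpha` set to zero the enhancement term vanishes and the output reduces to the concatenation of the corrected input vectors, which makes the residual-style position-wise addition easy to check.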
It should be noted that, in other specific examples of the present application, the word-by-word splicing of each group of corresponding speech content word granularity embedded encoding vector and semantic content word part-of-speech one-hot encoding vector, followed by word granularity multi-scale semantic association encoding of the resulting sequence of encoding vectors to obtain the speech content global text semantic encoding feature, may also be carried out in other manners, for example: segmenting the speech content into words and mapping each word to a word embedding vector to obtain a sequence of word granularity embedded encoding vectors; performing part-of-speech tagging on each word in the semantic content and mapping each part of speech to a one-hot encoding vector to obtain a sequence of part-of-speech one-hot encoding vectors; splicing the word granularity embedded encoding vector of each word with the corresponding part-of-speech one-hot encoding vector to form a new vector sequence; and performing multi-scale semantic association encoding on the spliced sequence of encoding vectors to capture semantic information at different scales. After the word granularity multi-scale semantic association encoding, the resulting sequence of encoding vectors contains richer semantic information and associations and represents the global text semantic encoding feature of the speech content.
In particular, in step S7, whether the customer emotion is positive, negative or neutral is determined based on the speech content global text semantic encoding feature. In a specific example of the present application, the speech content global text semantic encoding feature vector is passed through a classifier-based customer emotion tag identifier to obtain a recognition result, which indicates whether the customer emotion is positive, negative or neutral. That is, classification is performed using the global text contextual semantic association encoding features of the speech content to identify the emotional state of the customer and thus determine whether the customer emotion is positive, negative or neutral. In this way, the emotional changes of the client during the call can be identified intelligently, providing a basis for enterprise decisions. For example, the customer's satisfaction with the service, or the customer's willingness to repay after a loan, can be determined according to the customer's emotional state and its changes, so that a more targeted strategy can be formulated.
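The classification of S7 can be illustrated with a minimal sketch, assuming a single linear layer followed by softmax; the application only states that the identifier is classifier-based, so the layer type, the `weights` and `biases` values and the function names here are hypothetical.

```python
import math

LABELS = ["positive", "negative", "neutral"]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify(feature, weights, biases):
    """Score each label with a linear layer and return the most probable one."""
    scores = [
        sum(w * x for w, x in zip(row, feature)) + bias
        for row, bias in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda k: probs[k])
    return LABELS[best], probs


# Hypothetical parameters; in practice they would come from the training stage.
feature = [0.9, -0.2]
weights = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0, 0.0]
label, probs = classify(feature, weights, biases)
```

The returned probabilities can also be kept alongside the label, for example to track how confidently the emotion is recognized over successive calls.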
It should be appreciated that the Word2Vec model-based word embedding encoder, the word granularity multi-scale semantic association encoder and the classifier-based customer emotion tag identifier need to be trained before the neural network model described above is used for inference. That is, the method for determining a client tag based on call content further includes a training stage for training the Word2Vec model-based word embedding encoder, the word granularity multi-scale semantic association encoder and the classifier-based customer emotion tag identifier.
Fig. 3 is a flowchart of a training phase of a method for determining a client tag based on call content according to an embodiment of the present application. As shown in fig. 3, the method for determining a client tag based on call content according to an embodiment of the present application includes a training phase comprising:
S110, obtaining training data, wherein the training data includes training call process voice data between a service person and a client recorded through recording equipment;
S120, first performing voice recognition on the training call process voice data to convert it into a training call process voice-text recognition result, and then preprocessing the training call process voice-text recognition result to obtain a preprocessed training call process voice-text recognition result;
S130, performing word-granularity-based word embedding encoding on the preprocessed training call process voice-text recognition result to obtain a sequence of training speech content word granularity embedded encoding vectors;
S140, performing part-of-speech tagging on each speech content word in the sequence of training speech content words and one-hot encoding the part of speech of each speech content word to obtain a sequence of training semantic content word part-of-speech one-hot encoding vectors;
S150, performing word-by-word splicing on the sequence of training speech content word granularity embedded encoding vectors and the sequence of training semantic content word part-of-speech one-hot encoding vectors to obtain a sequence of training speech content word semantic-part-of-speech fusion embedded encoding vectors;
S160, passing the sequence of training speech content word semantic-part-of-speech fusion embedded encoding vectors through the word granularity multi-scale semantic association encoder to obtain a training speech content global text semantic encoding feature vector;
S170, optimizing the training speech content global text semantic encoding feature vector to obtain an optimized training speech content global text semantic encoding feature vector;
S180, passing the optimized training speech content global text semantic encoding feature vector through the classifier-based customer emotion tag identifier to obtain a classification loss function value; and
S190, training the Word2Vec model-based word embedding encoder, the word granularity multi-scale semantic association encoder and the classifier-based customer emotion tag identifier based on the classification loss function value.
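The classification loss function value of S180 is not specified further in this text. A minimal sketch, assuming a cross-entropy loss over the three emotion labels (a common choice for classifiers, but an assumption here), is:

```python
import math

LABELS = ["positive", "negative", "neutral"]


def cross_entropy(probs, true_label):
    """Negative log-probability that the classifier assigns to the true label."""
    return -math.log(probs[LABELS.index(true_label)])


def mean_classification_loss(batch_probs, batch_labels):
    """Average the per-sample cross-entropy over a batch of training calls."""
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, batch_labels)]
    return sum(losses) / len(losses)


# Hypothetical predicted distributions for two training calls and their labels.
loss = mean_classification_loss(
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    ["positive", "negative"],
)
```

In S190 such a loss value would be propagated back through the classifier and both encoders by an optimizer; the choice of optimizer is likewise not stated in this text.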
In a preferred example, optimizing the training speech content global text semantic encoding feature vector and passing the optimized vector through the classifier-based customer emotion tag identifier specifically includes the steps of: multiplying the training speech content global text semantic encoding feature vector by the square root of its length and by the reciprocal of the square root of its 2-norm to obtain a training speech content global text semantic rotation offset feature vector; calculating the natural exponential function of the training speech content global text semantic rotation offset feature vector to obtain a training speech content global text semantic class offset prediction feature vector; multiplying the training speech content global text semantic encoding feature vector by its norm and by a weight hyperparameter to obtain a training speech content global text semantic boundary constraint feature vector; combining the training speech content global text semantic class offset prediction feature vector and the training speech content global text semantic boundary constraint feature vector position by position to obtain the optimized training speech content global text semantic encoding feature vector; and passing the optimized training speech content global text semantic encoding feature vector through the classifier-based customer emotion tag identifier to obtain the recognition result.
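The optimization in the preferred example can be sketched as follows. This is only one reading of the steps: the first step is assumed to be a scaling by the two scalar quantities, and the final position-wise operation, which this text does not name unambiguously, is assumed to be addition. The function name `optimize_feature` and the default `weight` value are hypothetical.

```python
import math


def optimize_feature(vector, weight=0.01):
    """Combine an exponential class-offset term with a norm-scaled boundary term."""
    length = len(vector)
    norm = math.sqrt(sum(x * x for x in vector))  # 2-norm of the feature vector
    if norm == 0.0:
        # The scaling is undefined for a zero vector; return it unchanged.
        return list(vector)
    scale = math.sqrt(length) / math.sqrt(norm)
    rotation_offset = [x * scale for x in vector]
    class_offset = [math.exp(x) for x in rotation_offset]
    boundary = [weight * norm * x for x in vector]
    # Assumption: the two feature vectors are combined by position-wise addition.
    return [c + b for c, b in zip(class_offset, boundary)]


optimized = optimize_feature([3.0, 4.0], weight=0.0)
```

Because the exponential grows quickly, an implementation along these lines would likely need the input feature values to stay in a small range, for example through normalization in the preceding encoder.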
In the above preferred example, the structural norm representation of the training speech content global text semantic encoding feature vector serves as the local canonical coordinates of each of its feature values; the class offset prediction direction of each feature value is determined from the global distribution representation of the vector; and the feature values are constrained by the bounding box of the feature value distribution of the vector. This strengthens the constraint on the training speech content global text semantic encoding feature vector under the global regression distribution, and thereby improves the training speed of the model and the accuracy of the recognition result obtained when the training speech content global text semantic encoding feature vector is passed through the classifier-based customer emotion tag identifier. As a result, the semantics of the client's call speech content can be understood more accurately and the client's emotional state can be identified, providing a basis for enterprise decisions.
In summary, the method for determining a client tag based on call content according to the embodiment of the application has been explained. After the recording permission authorization is obtained, call process voice data between a service person and a client is recorded through recording equipment, and an artificial intelligence based data processing and semantic understanding algorithm is introduced at the back end to perform word-granularity-based global semantic encoding of the text content of the call process voice data, so as to understand the semantics of the client's call speech content and identify the client's emotional state, and thus judge whether the customer emotion is positive, negative or neutral. In this way, the emotional changes of the client during the call can be identified intelligently, providing a basis for enterprise decisions. For example, the customer's satisfaction with the service, or the customer's willingness to repay after a loan, can be determined according to the customer's emotional state and its changes, so that a more targeted strategy can be formulated.
Further, a client tag determining device based on the call content is provided.
Fig. 6 is a block diagram of a client tag determination apparatus based on call content according to an embodiment of the present application. As shown in fig. 6, the client tag determination apparatus 300 based on call content according to an embodiment of the present application includes:
a data collection module 310, configured to record, after the recording permission authorization is obtained, call process voice data between a service person and a client through recording equipment, and to store the call process voice data in a voice database;
a data retrieving module 320, configured to retrieve the call process voice data from the voice database;
a data preprocessing module 330, configured to first perform voice recognition on the call process voice data to convert it into a call process voice-text recognition result, and then preprocess the call process voice-text recognition result to obtain a preprocessed call process voice-text recognition result;
a speech content word granularity embedded encoding module 340, configured to perform word-granularity-based word embedding encoding on the preprocessed call process voice-text recognition result to obtain a sequence of speech content word granularity embedded encoding vectors;
a one-hot encoding module 350, configured to perform part-of-speech tagging on each speech content word in the sequence of speech content words and to one-hot encode the part of speech of each speech content word to obtain a sequence of semantic content word part-of-speech one-hot encoding vectors;
a word granularity multi-scale semantic association encoding module 360, configured to perform word-by-word splicing on each group of corresponding speech content word granularity embedded encoding vector and semantic content word part-of-speech one-hot encoding vector in the two sequences, and then perform word granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain a speech content global text semantic encoding feature; and
a client tag determination module 370, configured to determine whether the customer emotion is positive, negative or neutral based on the speech content global text semantic encoding feature.
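The way the modules of the apparatus hand data to one another can be sketched as a simple chain of callables. The class `ClientTagDevice` and the three stand-in functions are hypothetical placeholders that collapse modules 330 to 370 into toy steps; they do not implement the actual speech recognition, encoding or classification.

```python
class ClientTagDevice:
    """Chains module callables in the order in which the apparatus lists them."""

    def __init__(self, modules):
        self.modules = list(modules)

    def run(self, data):
        """Feed the output of each module into the next one."""
        for module in self.modules:
            data = module(data)
        return data


# Hypothetical stand-ins for the real modules.
def recognize_and_preprocess(record):
    """Stand-in for speech recognition plus preprocessing."""
    return record["transcript"].lower()


def segment(text):
    """Stand-in for word segmentation."""
    return text.split()


def determine_tag(words):
    """Stand-in for the encoders and the emotion tag identifier."""
    return "negative" if "slow" in words else "neutral"


device = ClientTagDevice([recognize_and_preprocess, segment, determine_tag])
tag = device.run({"transcript": "Service is very slow"})
```

Keeping each module behind a plain callable interface matches the description's point that the apparatus may be realized as software modules, hardware modules or a separate device.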
As described above, the call content-based client tag determination apparatus 300 according to the embodiment of the present application may be implemented in various wireless terminals, for example, a server or the like having a call content-based client tag determination algorithm. In one possible implementation, the call content-based client tag determination apparatus 300 according to an embodiment of the present application may be integrated into the wireless terminal as a software module and/or a hardware module. For example, the call content-based client tag determination apparatus 300 may be a software module in the operating system of the wireless terminal, or may be an application developed for the wireless terminal; of course, the call content-based client tag determination apparatus 300 may also be one of a number of hardware modules of the wireless terminal.
Alternatively, in another example, the call content-based client tag determination apparatus 300 and the wireless terminal may be separate devices, and the call content-based client tag determination apparatus 300 may be connected to the wireless terminal through a wired and/or wireless network and transmit interaction information in an agreed data format.
Furthermore, an embodiment of the present application may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps of the call content-based client tag determination method according to the various embodiments of the present application described in the "exemplary method" section above in this specification.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (8)
1. A method for determining a client tag based on call content, comprising:
after the recording permission authorization is obtained, recording call process voice data between a service person and a client through recording equipment, and storing the call process voice data into a voice database;
retrieving the call process voice data from the voice database;
first performing voice recognition on the call process voice data to convert it into a call process voice-text recognition result, and then preprocessing the call process voice-text recognition result to obtain a preprocessed call process voice-text recognition result;
performing word-granularity-based word embedding encoding on the preprocessed call process voice-text recognition result to obtain a sequence of speech content word granularity embedded encoding vectors;
performing part-of-speech tagging on each speech content word in the sequence of speech content words, and one-hot encoding the part of speech of each speech content word in the sequence of speech content words, to obtain a sequence of semantic content word part-of-speech one-hot encoding vectors;
performing word-by-word splicing on each group of corresponding speech content word granularity embedded encoding vector and semantic content word part-of-speech one-hot encoding vector in the sequence of speech content word granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors, and then performing word granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain a speech content global text semantic encoding feature;
determining whether a customer emotion is positive, negative or neutral based on the speech content global text semantic encoding feature;
wherein performing word-by-word splicing on each group of corresponding speech content word granularity embedded encoding vector and semantic content word part-of-speech one-hot encoding vector in the sequence of speech content word granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors, and then performing word granularity multi-scale semantic association encoding on the resulting sequence of encoding vectors to obtain the speech content global text semantic encoding feature, comprises:
performing word-by-word splicing on the sequence of speech content word granularity embedded encoding vectors and the sequence of semantic content word part-of-speech one-hot encoding vectors to obtain a sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors;
passing the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors through a word granularity multi-scale semantic association encoder to obtain a speech content global text semantic encoding feature vector as the speech content global text semantic encoding feature;
wherein passing the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors through the word granularity multi-scale semantic association encoder to obtain the speech content global text semantic encoding feature vector as the speech content global text semantic encoding feature comprises:
calculating the word granularity semantic similarity between every two speech content word semantic-part-of-speech fusion embedded encoding vectors in the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors to obtain a word granularity semantic similarity vector composed of a plurality of word granularity semantic similarities;
calculating the number of feature vectors spaced between every two speech content word semantic-part-of-speech fusion embedded encoding vectors as a word distance count to obtain a word distance count vector composed of a plurality of word distance counts;
calculating the natural exponential function value of the word distance count vector position by position to obtain a word distance count class support vector;
determining the number of vectors in the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors to obtain a vector count;
multiplying the word granularity semantic similarity vector and the word distance count class support vector position by position, and dividing the resulting feature vector position by position by the vector count, to obtain a semantic enhancement factor vector;
performing weighted correction on the semantic enhancement factor vector and on the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors, respectively, based on trainable hyperparameters to obtain a corrected semantic enhancement factor vector and a sequence of corrected speech content word semantic-part-of-speech fusion embedded encoding vectors;
taking the corrected semantic enhancement factor at each position of the corrected semantic enhancement factor vector as a weight, and weighting each speech content word semantic-part-of-speech fusion embedded encoding vector in the sequence of speech content word semantic-part-of-speech fusion embedded encoding vectors accordingly to obtain a sequence of first-scale speech content global text semantic association encoding feature vectors;
adding each group of corresponding first-scale speech content global text semantic association encoding feature vector and corrected speech content word semantic-part-of-speech fusion embedded encoding vector position by position to obtain a sequence of speech content global text semantic association feature vectors;
and concatenating all speech content global text semantic association feature vectors in the sequence of speech content global text semantic association feature vectors to obtain the speech content global text semantic encoding feature vector.
2. The method for determining a client tag based on call contents according to claim 1, wherein first performing voice recognition on the call process voice data to convert it into a call process voice-text recognition result, and then preprocessing that result to obtain a preprocessed call process voice-text recognition result, comprises:
performing voice recognition on the call process voice data to obtain the call process voice-text recognition result;
removing stop words from the call process voice-text recognition result;
correcting spelling errors in the call process voice-text recognition result; and
removing repeated words from the call process voice-text recognition result.
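The three text-cleaning steps of claim 2 can be illustrated with a short Python sketch. The stop-word set, the spelling-correction table, and the choice to treat "repeated words" as immediately consecutive duplicates are all assumptions made for this example; the claim does not specify them.

```python
import re
from typing import List

# Hypothetical resources; a real system would load language-specific lists.
STOP_WORDS = {"um", "uh", "the", "a"}
SPELLING_FIXES = {"recieve": "receive", "adress": "address"}


def preprocess(transcript: str) -> List[str]:
    """Return the cleaned word list for one speech-to-text transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    # Step 1: remove stop words.
    words = [w for w in words if w not in STOP_WORDS]
    # Step 2: correct spelling errors by table lookup.
    words = [SPELLING_FIXES.get(w, w) for w in words]
    # Step 3: remove immediately repeated words ("not not" becomes "not").
    deduped: List[str] = []
    for w in words:
        if not deduped or deduped[-1] != w:
            deduped.append(w)
    return deduped
```

The order of the steps follows the claim; in practice spelling correction might run before stop-word removal so that misspelled stop words are also caught.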
3. The call content-based client tag determination method as claimed in claim 2, wherein performing word-granularity-based word-embedded encoding on the pre-processed call process voice-text recognition result to obtain a sequence of word-granularity-embedded encoding vectors of voice content comprises:
performing word segmentation on the preprocessed call process voice-text recognition result to obtain a voice content word sequence; and
passing the voice content word sequence through a word embedding encoder based on a Word2Vec model to obtain the sequence of voice content word granularity embedded coding vectors.
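As a rough illustration of claim 3, the sketch below segments a transcript and looks up one vector per word. The embedding table is a hypothetical stand-in for a trained Word2Vec model, whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated one), and the zero-vector fallback for unknown words is an assumption.

```python
from typing import Dict, List

# Hypothetical 3-dimensional table standing in for a trained Word2Vec model.
EMBEDDINGS: Dict[str, List[float]] = {
    "bill": [0.9, 0.1, 0.0],
    "wrong": [0.1, 0.8, 0.3],
    "thanks": [0.0, 0.2, 0.9],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumed fallback for out-of-vocabulary words


def segment(text: str) -> List[str]:
    """Whitespace segmentation, used here only as a placeholder."""
    return text.lower().split()


def embed(words: List[str]) -> List[List[float]]:
    """Map each word to its embedding vector, preserving word order."""
    return [list(EMBEDDINGS.get(w, UNKNOWN)) for w in words]
```

The output keeps one vector per word in the original order, which is what the later word-by-word splicing with part-of-speech vectors relies on.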
4. The call content-based client tag determination method of claim 3, wherein determining whether a client emotion is positive, negative or neutral based on the voice content global text semantic coding features comprises: passing the voice content global text semantic coding feature vector through a classifier-based customer emotion tag identifier to obtain a recognition result indicating whether the client emotion is positive, negative or neutral.
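The claim does not say which classifier is used. A minimal sketch, assuming a single linear layer followed by softmax over the three emotion labels, could look like this; the names `classify`, `weights` and `bias` are illustrative only.

```python
import math
from typing import List, Tuple

LABELS = ["positive", "negative", "neutral"]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature: List[float], weights: List[List[float]],
             bias: List[float]) -> Tuple[str, List[float]]:
    """Assumed linear-plus-softmax classifier; returns (label, probabilities)."""
    scores = [sum(w * x for w, x in zip(row, feature)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(scores)
    return LABELS[probs.index(max(probs))], probs
```

Here `weights` has one row per label, and the label with the highest probability is returned as the customer emotion tag.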
5. The call content-based client tag determination method of claim 4, further comprising a training step of training the Word2Vec model-based word embedding encoder, the word granularity multi-scale semantic association encoder and the classifier-based customer emotion tag identifier.
6. The conversation content based client tag determination method of claim 5 wherein the training step comprises:
acquiring training data, wherein the training data comprises training call process voice data between a service person and a client recorded by recording equipment;
performing voice recognition on the training call process voice data to convert it into a training call process voice-text recognition result, and then preprocessing that result to obtain a preprocessed training call process voice-text recognition result;
performing word-granularity-based word embedding encoding on the preprocessed training call process voice-text recognition result to obtain a sequence of training voice content word granularity embedded coding vectors;
tagging the part of speech of each voice content word in the training voice content word sequence and one-hot encoding the part of speech of each voice content word to obtain a sequence of training semantic content word part-of-speech one-hot coding vectors;
splicing, word by word, the sequence of training voice content word granularity embedded coding vectors with the sequence of training semantic content word part-of-speech one-hot coding vectors to obtain a sequence of training voice content word semantic-part-of-speech fusion embedded coding vectors;
passing the sequence of training voice content word semantic-part-of-speech fusion embedded coding vectors through the word granularity multi-scale semantic association encoder to obtain a training voice content global text semantic coding feature vector;
optimizing the training voice content global text semantic coding feature vector to obtain an optimized training voice content global text semantic coding feature vector;
passing the optimized training voice content global text semantic coding feature vector through the classifier-based customer emotion tag identifier to obtain a classification loss function value; and
training the Word2Vec model-based word embedding encoder, the word granularity multi-scale semantic association encoder and the classifier-based customer emotion tag identifier based on the classification loss function value.
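Two of the training steps above lend themselves to a small worked example: building the fusion vectors (part-of-speech one-hot encoding plus word-by-word splicing) and computing a classification loss. The tag set below is hypothetical, and cross-entropy is assumed as the loss because the claim only speaks of "a classification loss function value".

```python
import math
from typing import List

POS_TAGS = ["noun", "verb", "adj", "other"]  # hypothetical tag set


def one_hot_pos(tag: str) -> List[float]:
    """One-hot vector for a part-of-speech tag (all zeros if tag is unknown)."""
    return [1.0 if t == tag else 0.0 for t in POS_TAGS]


def splice(embeddings: List[List[float]], pos_tags: List[str]) -> List[List[float]]:
    """Concatenate each word embedding with its part-of-speech one-hot vector."""
    return [emb + one_hot_pos(tag) for emb, tag in zip(embeddings, pos_tags)]


def cross_entropy(probs: List[float], true_index: int) -> float:
    """Assumed loss: negative log-probability of the true emotion label."""
    return -math.log(probs[true_index])
```

With a 2-dimensional embedding and four tags, each fusion vector has six components. The loss value would then drive gradient updates of the embedding encoder, the association encoder and the classifier, which this sketch does not attempt to show.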
7. A client tag determination apparatus based on call content, comprising:
The data acquisition module is used for recording call process voice data between a service person and a client through recording equipment after obtaining recording permission authorization, and storing the call process voice data into the voice database;
The data calling module is used for retrieving the call process voice data from the voice database;
The data preprocessing module is used for first performing voice recognition on the call process voice data to convert it into a call process voice-text recognition result, and then preprocessing that result to obtain a preprocessed call process voice-text recognition result;
The voice content word granularity embedded coding module is used for performing word-granularity-based word embedding encoding on the preprocessed call process voice-text recognition result so as to obtain a sequence of voice content word granularity embedded coding vectors;
the one-hot encoding module is used for tagging the part of speech of each voice content word in the voice content word sequence and one-hot encoding the part of speech of each voice content word to obtain a sequence of semantic content word part-of-speech one-hot coding vectors;
the word granularity multi-scale semantic association coding module is used for splicing, word by word, each pair of corresponding voice content word granularity embedded coding vectors and semantic content word part-of-speech one-hot coding vectors from the sequence of voice content word granularity embedded coding vectors and the sequence of semantic content word part-of-speech one-hot coding vectors, and then performing word granularity multi-scale semantic association encoding on the resulting sequence of coding vectors so as to obtain voice content global text semantic coding features;
A client tag determination module for determining whether a client emotion is positive, negative or neutral based on the global text semantic coding feature of the voice content;
wherein splicing, word by word, each pair of corresponding voice content word granularity embedded coding vectors and semantic content word part-of-speech one-hot coding vectors from the sequence of voice content word granularity embedded coding vectors and the sequence of semantic content word part-of-speech one-hot coding vectors, and then performing word granularity multi-scale semantic association encoding on the resulting sequence of coding vectors to obtain the voice content global text semantic coding features, comprises:
Splicing, word by word, the sequence of voice content word granularity embedded coding vectors with the sequence of semantic content word part-of-speech one-hot coding vectors to obtain a sequence of voice content word semantic-part-of-speech fusion embedded coding vectors;
Passing the sequence of voice content word semantic-part-of-speech fusion embedded coding vectors through a word granularity multi-scale semantic association encoder to obtain a voice content global text semantic coding feature vector serving as the voice content global text semantic coding features;
wherein passing the sequence of voice content word semantic-part-of-speech fusion embedded coding vectors through the word granularity multi-scale semantic association encoder to obtain the voice content global text semantic coding feature vector serving as the voice content global text semantic coding features comprises:
Calculating the word granularity semantic similarity between every two voice content word semantic-part-of-speech fusion embedded coding vectors in the sequence of voice content word semantic-part-of-speech fusion embedded coding vectors to obtain a word granularity semantic similarity vector composed of a plurality of word granularity semantic similarities;
calculating the number of feature vectors separating every two voice content word semantic-part-of-speech fusion embedded coding vectors as a word distance count, to obtain a word distance count vector composed of a plurality of word distance counts;
Calculating, position-wise, an exponential function value with the natural constant as base for each word distance count in the word distance count vector to obtain a word distance count class support vector;
determining the number of vectors in the sequence of voice content word semantic-part-of-speech fusion embedded coding vectors to obtain a vector count;
multiplying the word granularity semantic similarity vector and the word distance count class support vector position-wise, and dividing the resulting feature vector position-wise by the vector count, to obtain a semantic enhancement factor vector;
Performing weighted correction on the semantic enhancement factor vector and on the sequence of voice content word semantic-part-of-speech fusion embedded coding vectors, respectively, based on trainable hyperparameters, so as to obtain a corrected semantic enhancement factor vector and a sequence of corrected voice content word semantic-part-of-speech fusion embedded coding vectors;
Taking the corrected semantic enhancement factor at each position in the corrected semantic enhancement factor vector as a weight, and weighting the corresponding voice content word semantic-part-of-speech fusion embedded coding vector in the sequence of voice content word semantic-part-of-speech fusion embedded coding vectors, so as to obtain a sequence of first-scale voice content global text semantic association coding feature vectors;
Adding, position-wise, each pair of corresponding vectors from the sequence of first-scale voice content global text semantic association coding feature vectors and the sequence of corrected voice content word semantic-part-of-speech fusion embedded coding vectors to obtain a sequence of voice content global text semantic association feature vectors;
And concatenating all voice content global text semantic association feature vectors in the sequence of voice content global text semantic association feature vectors to obtain the voice content global text semantic coding feature vector.
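The semantic enhancement factor described in the similarity, word-distance and exponential steps of this claim can be sketched as below. Several points are assumptions, because the claim leaves them open: cosine similarity is used as the similarity measure, the exponential is taken with a negative exponent so that distant words contribute less, and the pairwise products are summed per word so that each word ends up with exactly one factor.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def enhancement_factors(seq: List[List[float]]) -> List[float]:
    """One semantic enhancement factor per vector in the sequence."""
    n = len(seq)  # vector count
    factors = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i == j:
                continue
            gap = abs(i - j) - 1           # vectors lying between i and j
            support = math.exp(-gap)       # assumed decaying exponential
            total += cosine(seq[i], seq[j]) * support
        factors.append(total / n)          # divide by the vector count
    return factors
```

Under these assumptions, a word that is similar to its close neighbours receives a large factor, while a word unrelated to the rest of the utterance receives a factor near zero.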
8. A computer storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the call content based client tag determination method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410804728.3A CN118377909B (en) | 2024-06-21 | 2024-06-21 | Customer label determining method and device based on call content and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410804728.3A CN118377909B (en) | 2024-06-21 | 2024-06-21 | Customer label determining method and device based on call content and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118377909A (en) | 2024-07-23 |
CN118377909B (en) | 2024-08-27 |
Family
ID=91908835
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410804728.3A Active CN118377909B (en) | 2024-06-21 | 2024-06-21 | Customer label determining method and device based on call content and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118377909B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114691836A (en) * | 2022-04-24 | 2022-07-01 | 中国科学院空天信息创新研究院 | Method, device, equipment and medium for analyzing emotion tendentiousness of text |
CN115098634A (en) * | 2022-06-27 | 2022-09-23 | 重庆大学 | Semantic dependency relationship fusion feature-based public opinion text sentiment analysis method |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104063427A (en) * | 2014-06-06 | 2014-09-24 | 北京搜狗科技发展有限公司 | Expression input method and device based on semantic understanding |
CN112560503B (en) * | 2021-02-19 | 2021-07-02 | 中国科学院自动化研究所 | Semantic emotion analysis method integrating depth features and time sequence model |
CN115879958A (en) * | 2022-12-28 | 2023-03-31 | 吉林农业科技学院 | Foreign-involved sales call decision method and system based on big data |
CN117875652A (en) * | 2024-01-15 | 2024-04-12 | 南方电网数字平台科技(广东)有限公司 | Personnel allocation management system and method for online customer service |
CN117789971B (en) * | 2024-02-13 | 2024-05-24 | 长春职业技术学院 | Mental health intelligent evaluation system and method based on text emotion analysis |
2024-06-21: CN202410804728.3A filed (patent CN118377909B, status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN118377909A (en) | 2024-07-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN112528672B (en) | Aspect-level emotion analysis method and device based on graph convolution neural network | |
CN110377911B (en) | Method and device for identifying intention under dialog framework | |
CN112084337B (en) | Training method of text classification model, text classification method and equipment | |
CN113094578B (en) | Deep learning-based content recommendation method, device, equipment and storage medium | |
CN111930914B (en) | Problem generation method and device, electronic equipment and computer readable storage medium | |
CN112528637B (en) | Text processing model training method, device, computer equipment and storage medium | |
WO2022252636A1 (en) | Artificial intelligence-based answer generation method and apparatus, device, and storage medium | |
CN111241237A (en) | Intelligent question and answer data processing method and device based on operation and maintenance service | |
CN111223476B (en) | Method and device for extracting voice feature vector, computer equipment and storage medium | |
WO2021204017A1 (en) | Text intent recognition method and apparatus, and related device | |
CN110399472B (en) | Interview question prompting method and device, computer equipment and storage medium | |
CN111858854A (en) | Question-answer matching method based on historical dialogue information and related device | |
CN113705315A (en) | Video processing method, device, equipment and storage medium | |
CN112668333A (en) | Named entity recognition method and device, and computer-readable storage medium | |
CN112988970A (en) | Text matching algorithm serving intelligent question-answering system | |
CN115134660A (en) | Video editing method and device, computer equipment and storage medium | |
CN114004231A (en) | Chinese special word extraction method, system, electronic equipment and storage medium | |
CN116189039A (en) | Multi-modal emotion classification method and system for modal sequence perception with global audio feature enhancement | |
CN116304745A (en) | Text topic matching method and system based on deep semantic information | |
CN116361442A (en) | Business hall data analysis method and system based on artificial intelligence | |
CN114491023A (en) | Text processing method and device, electronic equipment and storage medium | |
CN114239607A (en) | Conversation reply method and device | |
CN114373443A (en) | Speech synthesis method and apparatus, computing device, storage medium, and program product | |
CN114003700A (en) | Method and system for processing session information, electronic device and storage medium | |
CN113486174A (en) | Model training, reading understanding method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||