CN111554304A - User tag obtaining method, device and equipment

User tag obtaining method, device and equipment

Info

Publication number
CN111554304A
CN111554304A
Authority
CN
China
Prior art keywords
user
preset
tag
label
voice data
Prior art date
Legal status
Pending
Application number
CN202010335882.2A
Other languages
Chinese (zh)
Inventor
蒋菱
王恩典
陈浩
曾甜玲
钟蔚伟
Current Assignee
China Citic Bank Corp Ltd
Original Assignee
China Citic Bank Corp Ltd
Priority date
Filing date
Publication date
Application filed by China Citic Bank Corp Ltd filed Critical China Citic Bank Corp Ltd
Priority to CN202010335882.2A
Publication of CN111554304A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G06F21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0281 Customer communication at a business location, e.g. providing product or service information, consulting
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/18 Artificial neural networks; Connectionist approaches
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Abstract

The invention provides a user tag obtaining method, device and equipment, and relates to the technical field of data processing. The method obtains the service voice data of a user, extracts corresponding feature information from the service voice data according to preset user tag types, and obtains the user's tags from the feature information using the preset tag identification model corresponding to each user tag type. Multiple different types of user tags can thus be obtained from a user's service voice data, the user's problems and characteristics can be grasped more comprehensively from the different tag types, and consequently better service quality can be provided and the user's service experience improved.

Description

User tag obtaining method, device and equipment
Technical Field
The invention relates to the technical field of data processing, and in particular to a user tag obtaining method, device and equipment.
Background
Customer service systems of financial service companies such as banks and credit card centers accumulate a great amount of user voice data, for example recordings of dialogues between users and customer service agents, and this voice data contains a large amount of user information. As user demands grow, refined and standardized management of those demands has become imperative for financial service companies, and the large amount of user voice data accumulated in a customer service system is often an important basis for managing users.
In the prior art, user voice data is generally used as follows: after a user reports a problem, the recording of the dialogue between the user and customer service is retrieved and analyzed manually to identify the user's problem and characteristics, so that a corresponding solution can be found.
However, in the prior art, the result of manually analyzing such dialogue records is one-dimensional: the user's problems and characteristics cannot be obtained comprehensively, so service quality cannot be improved and the user's service experience suffers.
Disclosure of Invention
The invention provides a user tag obtaining method, device and equipment that can obtain multiple kinds of user tags from a user's service voice data in a customer service system and comprehensively identify the user's problems and characteristics, thereby providing better service quality and improving the user's service experience.
In a first aspect, an embodiment of the present invention provides a method for obtaining a user tag, where the method includes: acquiring service voice data of a user; extracting corresponding characteristic information from the service voice data according to a preset user tag type; and acquiring the user label of the user according to the characteristic information and a preset label identification model corresponding to the user label type.
Optionally, the extracting, according to a preset user tag type, corresponding feature information from the service voice data includes: extracting voiceprint features from the service voice data according to a preset personality emotion label type; correspondingly, the obtaining the user tag of the user according to the feature information and the preset tag identification model corresponding to the user tag type includes: and acquiring the personality emotion label of the user according to the preset personality emotion label identification model corresponding to the voiceprint feature and the personality emotion label type.
Optionally, before the obtaining of the personality emotion label of the user according to the preset personality emotion label recognition model corresponding to the voiceprint feature and the personality emotion label type, the method further includes: obtaining a first sample set, wherein the first sample set comprises sample voiceprint features, and the sample voiceprint features are marked with corresponding personality emotion labels; and training a neural network by adopting the first sample set to obtain the preset personality emotion label recognition model.
Optionally, the extracting, according to a preset user tag type, corresponding feature information from the service voice data includes: converting the service voice data into text information; extracting a first text feature from the text information according to a preset appeal label type; correspondingly, the obtaining the user tag of the user according to the feature information and the preset tag identification model corresponding to the user tag type includes: and acquiring the appeal label of the user according to the first text feature and a preset appeal label identification model corresponding to the appeal label type.
Optionally, before the obtaining of the appeal tag of the user according to the first text feature and the preset appeal tag identification model corresponding to the appeal tag type, the method further includes: obtaining a second sample set, wherein the second sample set comprises sample first text features, and the sample first text features are labeled with corresponding appeal labels; and training a neural network by adopting the second sample set to obtain the preset appeal label identification model.
Optionally, the extracting, according to a preset user tag type, corresponding feature information from the service voice data includes: converting the service voice data into text information; extracting user experience problems provided by customer service personnel from the text information; extracting a second text feature according to a preset evaluation label type and the text information and the user experience problem; correspondingly, the obtaining the user tag of the user according to the feature information and the preset tag identification model corresponding to the user tag type includes: and acquiring the evaluation label of the user according to the second text feature and a preset evaluation label identification model corresponding to the evaluation label type.
Optionally, before the obtaining of the evaluation tag of the user according to the second text feature and the preset evaluation tag identification model corresponding to the evaluation tag type, the method further includes: obtaining a third set of samples, the third set of samples comprising: the sample second text features are marked with corresponding evaluation labels; and training a neural network by adopting the third sample set to obtain the preset evaluation label recognition model.
Optionally, the extracting voiceprint features from the service voice data includes: the service voice data is processed by pre-emphasis, framing, windowing, fast Fourier transform, Mel (MEL) filter bank analysis and discrete cosine transform in sequence to obtain the voiceprint feature vector.
In a second aspect, an embodiment of the present invention provides an apparatus for obtaining a user tag, where the apparatus includes: the acquisition module is used for acquiring service voice data of a user; the characteristic extraction module is used for extracting corresponding characteristic information from the service voice data according to a preset user label type; and the tag identification module is used for acquiring the user tag of the user according to the characteristic information and a preset tag identification model corresponding to the user tag type.
Optionally, the feature extraction module is specifically configured to extract voiceprint features from the service voice data according to a preset personality emotion tag type; correspondingly, the label identification module is specifically configured to obtain the personality emotion label of the user according to the preset personality emotion label identification model corresponding to the voiceprint feature and the personality emotion label type.
Optionally, the apparatus further comprises: the first training module is used for acquiring a first sample set before the label identification module acquires the personality emotion labels of the user according to the voiceprint characteristics and a preset personality emotion label identification model corresponding to the personality emotion label types, wherein the first sample set comprises sample voiceprint characteristics which are marked with corresponding personality emotion labels; and training a neural network by adopting the first sample set to obtain the preset personality emotion label recognition model.
Optionally, the feature extraction module is specifically configured to convert the service voice data into text information; extracting a first text feature from the text information according to a preset appeal label type; correspondingly, the tag identification module is specifically configured to obtain the appeal tag of the user according to the first text feature and the preset appeal tag identification model corresponding to the appeal tag type.
Optionally, the apparatus further comprises: the second training module is used for acquiring a second sample set before the label identification module acquires the appeal label of the user according to the first text feature and a preset appeal label identification model corresponding to the appeal label type, wherein the second sample set comprises a sample first text feature which is marked with a corresponding appeal label; and training a neural network by adopting the second sample set to obtain the preset appeal label identification model.
Optionally, the feature extraction module is specifically configured to convert the service voice data into text information; extracting user experience problems provided by customer service personnel from the text information; extracting a second text feature according to a preset evaluation label type and the text information and the user experience problem; correspondingly, the tag identification module is specifically configured to obtain the evaluation tag of the user according to the second text feature and a preset evaluation tag identification model corresponding to the evaluation tag type.
Optionally, the apparatus further comprises: a third training module, configured to obtain a third sample set before the tag identification module obtains the evaluation tag of the user according to the second text feature and the preset evaluation tag identification model corresponding to the evaluation tag type, where the third sample set includes: the sample second text features are marked with corresponding evaluation labels; and training a neural network by adopting the third sample set to obtain the preset evaluation label recognition model.
Optionally, the feature extraction module is specifically configured to perform pre-emphasis, framing, windowing, fast fourier transform, MEL filter analysis, and discrete cosine transform on the service voice data in sequence to obtain a voiceprint feature vector.
In a third aspect, an embodiment of the present invention provides a user tag obtaining apparatus, including: a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, when the user tag obtaining device runs, the processor and the storage medium communicate through the bus, and the processor executes the machine-readable instructions to execute the user tag obtaining method according to the first aspect.
In a fourth aspect, an embodiment of the present invention further provides a storage medium, where the storage medium stores a computer program, and the computer program is executed by a processor to perform the user tag obtaining method according to the first aspect.
The invention has the beneficial effects that:
according to the embodiments of the invention, the service voice data of a user is obtained, corresponding feature information is extracted from it according to preset user tag types, and the user's tags are obtained from the feature information using the preset tag identification model corresponding to each tag type. Multiple different types of user tags can therefore be obtained from the user's service voice data, the user's problems and characteristics can be grasped more comprehensively, and better service quality and an improved service experience can be provided.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be considered limiting of its scope; those skilled in the art can derive other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flow chart illustrating a user tag obtaining method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart illustrating voiceprint feature extraction according to an embodiment of the present invention;
Fig. 3 is another schematic flow chart illustrating a user tag obtaining method according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart illustrating DNN extraction of personality emotion labels according to an embodiment of the present invention;
Fig. 5 is another schematic flow chart illustrating a user tag obtaining method according to an embodiment of the present invention;
Fig. 6 is another schematic flow chart illustrating a user tag obtaining method according to an embodiment of the present invention;
Fig. 7 is a schematic flow chart illustrating extraction of appeal tags by the BERT model according to an embodiment of the present invention;
Fig. 8 is another schematic flow chart illustrating a user tag obtaining method according to an embodiment of the present invention;
Fig. 9 is a schematic flow chart illustrating extraction of user evaluation descriptions by the BERT model according to an embodiment of the present invention;
Fig. 10 is another schematic flow chart illustrating a user tag obtaining method according to an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a user tag obtaining apparatus according to an embodiment of the present invention;
Fig. 12 is another schematic structural diagram of a user tag obtaining apparatus according to an embodiment of the present invention;
Fig. 13 is another schematic structural diagram of a user tag obtaining apparatus according to an embodiment of the present invention;
Fig. 14 is another schematic structural diagram of a user tag obtaining apparatus according to an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of a user tag obtaining device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. The components of the embodiments of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. In the description of the present invention, it should also be noted that the terms "first", "second", "third", and the like are used for distinguishing the description, and are not intended to indicate or imply relative importance.
The embodiment of the invention provides a user tag obtaining method that can obtain multiple kinds of user tags from a user's service voice data in a customer service system and comprehensively identify the user's problems and characteristics, thereby providing better service quality and improving the user's service experience. The execution subject of the method may be a computer, a server, one or more processors, etc.; the invention is not limited herein.
Fig. 1 shows a flowchart of a user tag obtaining method according to an embodiment of the present invention.
As shown in fig. 1, the user tag obtaining method may include:
s101, service voice data of a user are obtained.
The service voice data may refer to recordings of dialogues between the user and customer service in the customer service system; such voice data contains a large amount of user information, such as: the personality of the user, the manner in which the user speaks, the language type of the user, the gender of the user, etc.
S102, extracting corresponding characteristic information from the service voice data according to a preset user label type.
Optionally, the preset user tag types may include: a personality emotion tag type, an appeal tag type, an evaluation tag type, etc. The feature information corresponding to the personality emotion tag type may be voiceprint features of the service voice data; the feature information corresponding to the appeal tag type may be text features of the text information converted from the voice; and the feature information corresponding to the evaluation tag type may be text features constructed by combining the text converted from the user's voice with the experience questions posed by customer service personnel.
S103, acquiring a user label of the user according to the characteristic information and a preset label identification model corresponding to the user label type.
Optionally, the preset tag identification models may include: a preset personality emotion tag identification model, a preset appeal tag identification model, and a preset evaluation tag identification model, corresponding one-to-one to the personality emotion, appeal, and evaluation tag types. Each preset tag identification model can be obtained by training a neural network on a sample set manually labeled in advance; after the feature information corresponding to each user tag type is input into the corresponding preset tag identification model, the different models output the user tag of each tag type, for example a personality emotion tag, an appeal tag, and an evaluation tag, respectively. The personality emotion tag reflects the user's personality and emotion, the appeal tag reflects problems the user encountered during service, and the evaluation tag reflects the user's evaluation of the service.
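To make the correspondence between user tag types, feature extraction and preset tag identification models concrete, a minimal Python sketch of steps S102-S103 follows. All function names and stub bodies here are illustrative assumptions for exposition, not interfaces defined by this application; the actual extractors and models are described in the embodiments below.

from typing import Callable, Dict, Tuple

# Stubs standing in for the feature extractors and preset tag identification
# models described in the embodiments below; real implementations would use
# the MFCC pipeline, DNN and BERT models of figs. 2-9.
def extract_voiceprint_features(voice): return voice      # fig. 2 pipeline
def extract_first_text_features(voice): return voice      # ASR text + one-hot
def extract_second_text_features(voice): return voice     # ASR text + agent question
def personality_model(features): return "calm"            # preset DNN model (stub)
def appeal_model(features): return "repayment appeal"     # preset BERT classifier (stub)
def evaluation_model(features): return "positive"         # preset BERT classifier (stub)

# Each preset user tag type maps to (feature extractor, tag identification model).
TAG_REGISTRY: Dict[str, Tuple[Callable, Callable]] = {
    "personality_emotion": (extract_voiceprint_features, personality_model),
    "appeal": (extract_first_text_features, appeal_model),
    "evaluation": (extract_second_text_features, evaluation_model),
}

def obtain_user_tags(service_voice_data) -> Dict[str, str]:
    tags = {}
    for tag_type, (extract, model) in TAG_REGISTRY.items():
        features = extract(service_voice_data)   # S102: extract feature information
        tags[tag_type] = model(features)         # S103: obtain the user tag
    return tags

print(obtain_user_tags("call_recording.wav"))     # hypothetical recording path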
As can be seen from the above, the embodiment of the present invention obtains the service voice data of a user, extracts corresponding feature information from it according to the preset user tag types, and obtains the user's tags from the feature information using the preset tag identification model corresponding to each tag type. Multiple different types of user tags can thus be obtained from the service voice data, the user's problems and characteristics can be grasped more comprehensively according to the different tag types, and better service quality and an improved service experience can be provided.
In addition, compared with the existing practice of manually analyzing users' service voice data, the embodiment of the invention can markedly improve analysis efficiency and save a large amount of labor cost.
Specific obtaining modes of the personality emotion label, the appeal label and the evaluation label are described below:
optionally, in an embodiment, the step of extracting corresponding feature information from the service voice data according to a preset user tag type may specifically include: and extracting voiceprint characteristics from the service voice data according to a preset personality emotion label type. Correspondingly, the step of obtaining the user tag of the user according to the preset tag identification model corresponding to the feature information and the user tag type may specifically include: and acquiring the personality emotion label of the user according to a preset personality emotion label identification model corresponding to the voiceprint characteristics and the personality emotion label type.
Wherein the voiceprint feature is a voiceprint feature vector.
Fig. 2 is a schematic flow chart illustrating voiceprint feature extraction according to an embodiment of the present invention.
Optionally, as shown in fig. 2, the step of extracting voiceprint features from the service voice data may specifically include: the service voice data is processed by pre-emphasis, framing, windowing, Fast Fourier Transform (FFT), MEL filter analysis and Discrete Cosine Transform (DCT) in sequence to obtain the voiceprint feature vector.
Specifically, the service voice data contains the corresponding audio signal. To eliminate the influence of the lips and vocal cords during utterance, compensate the high-frequency part of the audio signal suppressed by the vocal system, and highlight the high-frequency formants, a first-order difference may be applied to the service voice data to raise the amplitude of the high-frequency formants; that is, the audio signal is passed through the high-pass filter:
H(z) = 1 - k·z^(-1)
where k is the pre-emphasis coefficient, taking a value in [0, 1], usually 0.97; H(z) denotes the transfer function.
In the framing step, the audio signal is divided into frames of N samples each. To avoid excessive variation between two adjacent frames, adjacent frames overlap by M samples, with M < N. The signal is assumed stationary within each frame; a frame length of 20-30 ms is typically used, with an overlap ratio of 25%, 50% or 75%.
Further, the purpose of windowing is to reduce discontinuities in the audio signal and make the frame ends smooth enough to connect with the start point. A commonly used window function is the Hamming window. Assume the framed signal is S_n, where n = 0, 1, …, N-1 indexes the samples of a frame; with the standard Hamming window, the windowed signal S'_n can be expressed as:

S'_n = S_n · [0.54 - 0.46·cos(2πn / (N - 1))]
the FFT can convert N samples from the time domain to the frequency domain, and is used because it is a fast algorithm that implements a Discrete Fourier Transform (DFT), the formula is as follows:
Figure BDA0002466578260000111
wherein S iskFor the input audio signal, N is the sample point of the fourier transform.
In the MEL filter bank analysis, the frequency domain contains many redundant signals, and a MEL filter bank can condense the frequency-domain amplitudes according to the hearing range of the human ear. The human ear's perception of sound is not linear and is better described on a nonlinear logarithmic scale. The relationship between MEL frequency and signal frequency is:

Mel(f) = 2595 · log10(1 + f / 700)

where Mel(f) is the MEL frequency and f is the frequency of the audio signal in hertz.
After the MEL filter bank analysis, MEL-frequency cepstral coefficients (MFCCs) may be calculated by the DCT, in the standard form:

c(i) = √(2/N) · Σ_{j=1}^{N} m_j · cos(πi(j - 0.5) / N)

where N is the number of filter channels and m_j is the strength of the j-th MEL filter. In this way, the voiceprint feature vector of each sentence of dialogue in the user's service voice data can be obtained.
Fig. 3 is another schematic flow chart of a user tag obtaining method according to an embodiment of the present invention.
Optionally, as shown in fig. 3, before the obtaining of the personality emotion label of the user according to the preset personality emotion label identification model corresponding to the voiceprint feature and the personality emotion label type, the user label obtaining method may further include:
s301, acquiring a first sample set.
The first sample set comprises sample voiceprint features, and the sample voiceprint features are marked with corresponding personality emotion labels.
S302, training the neural network by adopting the first sample set to obtain a preset personality emotion label recognition model.
For example, the neural network used to train the preset personality emotion label recognition model may be a Deep Neural Network (DNN). In the first sample set, the personality emotion labels of the sample voiceprint features can be labeled manually in advance; for example, the training data may be labeled by business personnel based on business experience to obtain the first sample set.
Optionally, the personality emotion labels may include: irritable, calm, and impatient. It should be noted that in some embodiments there may be more types of emotion labels; the present invention does not limit the number of emotion label types, which may be increased or decreased according to the labels chosen by business personnel.
Fig. 4 shows a flowchart of DNN extracting a personality emotion label according to an embodiment of the present invention.
As shown in fig. 4, taking the DNN network structure as an example: after the DCT transformation, a 2560-dimensional voiceprint feature vector can be obtained, and the trained 3-layer DNN network is then used to obtain the personality emotion label of the user.
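A minimal PyTorch sketch of such a 3-layer DNN classifier follows; the hidden width and the three example labels are illustrative assumptions, while the 2560-dimensional input follows the text above.

import torch
import torch.nn as nn

class PersonalityEmotionDNN(nn.Module):
    def __init__(self, in_dim=2560, hidden=512, n_labels=3):
        super().__init__()
        self.net = nn.Sequential(            # 3 fully connected layers
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_labels),     # logits over personality emotion labels
        )

    def forward(self, x):
        return self.net(x)

model = PersonalityEmotionDNN()
voiceprint = torch.randn(1, 2560)            # one 2560-dim voiceprint feature vector
label_id = model(voiceprint).argmax(dim=-1)  # index of the predicted emotion label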
After the personality emotion label of the user is obtained, the user can be distributed to appropriate customer service personnel according to the personality emotion label of the user, so that communication between the user and the customer service personnel is more comfortable, and service experience of the user is improved.
Fig. 5 is a schematic flowchart illustrating a user tag obtaining method according to an embodiment of the present invention.
Optionally, as shown in fig. 5, in another embodiment, the step of extracting corresponding feature information from the service voice data according to a preset user tag type may further include:
and S501, converting the service voice data into text information.
For example, a speech to text engine may be employed to convert service speech data to text.
S502, extracting a first text feature from the text information according to a preset appeal label type.
Optionally, a one-hot encoding technique may be used to preprocess the text information to obtain a corresponding text vector, i.e. the first text feature.
Correspondingly, the step of obtaining the user tag of the user according to the preset tag identification model corresponding to the feature information and the user tag type may specifically include: and acquiring the appeal label of the user according to the first text feature and the preset appeal label identification model corresponding to the appeal label type.
Fig. 6 is a schematic flowchart illustrating a user tag obtaining method according to an embodiment of the present invention.
Optionally, as shown in fig. 6, before the step of obtaining the appeal tag of the user according to the preset appeal tag identification model corresponding to the first text feature and the appeal tag type, the user tag obtaining method may further include:
and S601, acquiring a second sample set.
The second set of samples may include sample first text features labeled with corresponding appeal tags.
And S602, training the neural network by adopting a second sample set, and acquiring a preset appeal label identification model.
For example: the preset appeal tag identification model may be a Bidirectional Encoder Representations from Transformers (BERT) model, which represents the different meanings of words through their context. The BERT model is a feature extraction model composed of bidirectional Transformers. The core idea of the attention mechanism used by the Transformer is to calculate the correlation of each word in a sentence to all the words in the sentence, on the view that these word-to-word correlations reflect, to some extent, the relevance and relative importance of the different words in the sentence. During pre-training of the BERT model, a masked language model can randomly mask some tokens of the input for pre-training; at the same time, a sentence-level next sentence prediction task is added, which randomly replaces some sentences and predicts whether the second sentence follows the first (isNext/notNext). Through these two tasks, BERT is optimized on large-scale unlabeled corpora, finally yielding the pre-trained BERT model.
In embodiments of the present invention, the BERT model may be based on the Google BERT multilingual pre-training model, with 12 layers, a hidden size of 768, 12 attention heads, and approximately 100 million parameters. The appeal recognition model performs a multi-class classification task on top of the pre-trained model using the users' utterances and is trained in a supervised manner; mislabeled samples are picked out from the corpora with low prediction confidence for re-labeling, and the model is fine-tuned. When the average error rate of the training predictions no longer drops significantly (after roughly 20 to 30 epochs), fine-tuning is stopped and the appeal recognition model is obtained.
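A hedged sketch of one supervised fine-tuning step with the Hugging Face transformers library follows. The checkpoint name, the shortened label list and the example utterance are illustrative assumptions; the actual appeal tag set is given further below.

import torch
from transformers import BertTokenizer, BertForSequenceClassification

APPEAL_LABELS = ["shopping", "cash", "repayment", "account",
                 "real-name authentication", "security policy",
                 "sales", "merchant withholding", "payment"]

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(APPEAL_LABELS))

# One labeled utterance (text converted from the service voice data)
batch = tokenizer(["I cannot repay this month's credit card bill"],
                  padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([APPEAL_LABELS.index("repayment")])

loss = model(**batch, labels=labels).loss   # multi-class supervised fine-tuning
loss.backward()                             # one optimizer step would follow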
Fig. 7 is a schematic flow chart illustrating extraction of an appeal tag by the BERT model according to the embodiment of the present invention.
As shown in fig. 7, after the text information is preprocessed with one-hot encoding to obtain the corresponding first text feature, the position features and segment (paragraph) features of the text may be added to jointly form the model input. The input then passes through a 12-layer attention network, a fully connected layer and a softmax layer to obtain the user's question; the model output may be the probability of each user appeal. The training data comes from business annotations, and the summarized user questions may also come from business experience.
Compared with traditional classification methods, the BERT model carries rich prior knowledge, so its capability to represent text is stronger and the classification tasks it completes are more accurate.
Taking credit card services as an example: by extracting users' appeal tags, the problems users encounter can be summarized, problems users may yet encounter while using their credit cards can be predicted, and users can even be served proactively, which increases user satisfaction and strengthens users' confidence in the card center.
Optionally, in an embodiment of the present invention, the appeal tag may include: shopping appeal, cash appeal, repayment appeal, account appeal, real-name authentication appeal, security policy appeal, sales appeal, merchant withholding appeal, payment appeal, and the like. In some embodiments, the types of the appealing tags may be more, and the present invention does not limit the types and the number of the appealing tags.
Fig. 8 is a schematic flowchart illustrating a user tag obtaining method according to an embodiment of the present invention.
Optionally, as shown in fig. 8, in another embodiment, the extracting corresponding feature information from the service voice data according to a preset user tag type may further include:
s801, converting the service voice data into text information.
The conversion method is the same as that in the previous embodiment, and is not described herein again.
S802, extracting user experience problems provided by customer service staff from the text information.
For example, the user experience question may refer to: during the dialogue between the user and the customer service person, a question the customer service person poses to the user about describing the user experience, such as: whether the user is satisfied with the service, whether the user has other service requirements, etc.
And S803, extracting a second text feature according to the preset evaluation label type and the text information and the user experience problem.
Correspondingly, the step of obtaining the user tag of the user according to the preset tag identification model corresponding to the feature information and the user tag type may specifically include: and acquiring the evaluation label of the user according to the second text feature and the preset evaluation label identification model corresponding to the type of the evaluation label.
Optionally, the present invention may also use the BERT algorithm to extract the user's description of the user experience question. Before extracting the user description with BERT, the extracted user experience question about the user's evaluation is first combined with the user dialogue to form a new text. BERT first adds a special classification token ([CLS]) in front of the user experience question, then concatenates the question and the passage, separating them with a special token ([SEP]), to form a new text sequence. The new text sequence is fed into BERT through segment embeddings and positional embeddings; finally, the final hidden states of BERT are converted into answer-span probabilities by a fully connected layer and the softmax function. Since the fine-tuned BERT can capture the relationship between a user experience question and a passage, the user experience descriptions related to user evaluations can be extracted from the dialogue between the user and the customer service person.
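The question-plus-passage span extraction just described can be sketched with the transformers question-answering head. The checkpoint name and example strings are illustrative assumptions, and an untuned checkpoint will not produce meaningful spans; this only shows the mechanics.

import torch
from transformers import BertTokenizer, BertForQuestionAnswering

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertForQuestionAnswering.from_pretrained("bert-base-multilingual-cased")

question = "Are you satisfied with our service?"   # experience question from the agent
passage = "The app works, but the hotline wait was far too long."  # user dialogue

# Builds [CLS] question [SEP] passage [SEP]; segment ids distinguish the two parts
inputs = tokenizer(question, passage, return_tensors="pt")
outputs = model(**inputs)

start = outputs.start_logits.argmax()              # most probable span start
end = outputs.end_logits.argmax()                  # most probable span end
span = tokenizer.decode(inputs["input_ids"][0][start:end + 1])
print(span)                                        # extracted evaluation description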
Fig. 9 is a schematic flow chart illustrating extraction of user evaluation description by the BERT model according to the embodiment of the present invention.
As shown in fig. 9, the model input first passes through a 14-layer attention network, then through a fully connected layer, and finally through a softmax layer, yielding the probabilities of the start position and end position of the user's description.
Fig. 10 is a schematic flowchart illustrating a user tag obtaining method according to an embodiment of the present invention.
Optionally, as shown in fig. 10, before the obtaining of the evaluation tag of the user according to the preset evaluation tag identification model corresponding to the second text feature and the evaluation tag type, the user tag obtaining method may further include:
and S1001, acquiring a third sample set.
The third set of samples may include: and the sample second text features are marked with corresponding evaluation labels.
S1002, training the neural network by adopting a third sample set, and obtaining a preset evaluation label recognition model.
Optionally, the preset evaluation tag identification model may still adopt a BERT model whose structure is basically consistent with that of the preset appeal tag identification model. After the user's description is converted into word vectors, text position features and segment (paragraph) features may be added to form the model input; the input first passes through a 12-layer attention network, then a fully connected layer, and finally a softmax layer to obtain the user's emotion classification, i.e. the evaluation tag. It should be noted that part of the emotion data still needs to be labeled in the early stage of training, and the labeling of emotional tendency still comes from the business department.
Alternatively, the evaluation label may include: positive evaluation, negative evaluation, and medium (neutral) evaluation.
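A minimal sketch of this 3-class evaluation classifier, reusing the sequence classification head from the appeal example above; the checkpoint name and example text are again illustrative assumptions, and the untrained head is for demonstration only.

from transformers import BertTokenizer, BertForSequenceClassification

RATINGS = ["positive", "negative", "medium"]
tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(RATINGS))

desc = "the hotline wait was far too long"   # user description from the span-extraction step
logits = model(**tokenizer(desc, return_tensors="pt")).logits
print(RATINGS[logits.argmax()])              # evaluation tag (demo output only)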
As described above, in order for the obtained user evaluation descriptions to play a greater role, after the user's description is extracted, the embodiment of the present invention may determine whether the emotional tendency of the description is positive, negative, or medium (neutral), so as to increase the efficiency with which customer service personnel find non-positive descriptions. Extracting users' evaluation tags makes it easier to see whether a user's evaluation of the card center is positive, negative or neutral, so that the service can be improved more quickly according to these evaluations and the service quality raised.
In summary, by analyzing users' service voice data, the embodiment of the invention can obtain user tags such as the personality emotion tag, the appeal tag and the evaluation tag of each user, thereby providing a data basis for business decisions and supporting the sound development of the business.
Based on the user tag obtaining method described in the foregoing embodiment, an embodiment of the present invention correspondingly provides a user tag obtaining apparatus, and fig. 11 illustrates a schematic structural diagram of the user tag obtaining apparatus provided in the embodiment of the present invention.
As shown in fig. 11, the user tag obtaining apparatus may include: an obtaining module 10, configured to obtain service voice data of a user; the feature extraction module 20 is configured to extract corresponding feature information from the service voice data according to a preset user tag type; and the tag identification module 30 is configured to obtain the user tag of the user according to the preset tag identification model corresponding to the feature information and the user tag type.
Optionally, the feature extraction module 20 may be specifically configured to extract voiceprint features from the service voice data according to a preset personality emotion tag type; accordingly, the tag identification module 30 may be specifically configured to obtain the personality emotion tag of the user according to the preset personality emotion tag identification model corresponding to the voiceprint feature and the personality emotion tag type.
Fig. 12 is a schematic structural diagram illustrating a user tag obtaining apparatus according to an embodiment of the present invention.
As shown in fig. 12, the user tag obtaining apparatus may further include: the first training module 40 may be configured to obtain a first sample set before the tag identification module 30 obtains the personality emotion tags of the user according to the voiceprint features and the preset personality emotion tag identification model corresponding to the personality emotion tag types, where the first sample set includes sample voiceprint features and the sample voiceprint features are labeled with corresponding personality emotion tags; and training the neural network by adopting the first sample set to obtain a preset personality emotion label recognition model.
Optionally, the feature extraction module 20 may be specifically configured to convert the service voice data into text information; extracting a first text feature from the text information according to a preset appeal label type; correspondingly, the tag identification module 30 is specifically configured to obtain the appeal tag of the user according to the preset appeal tag identification model corresponding to the first text feature and the appeal tag type.
Fig. 13 is another schematic structural diagram of a user tag obtaining apparatus according to an embodiment of the present invention.
As shown in fig. 13, the user tag obtaining apparatus may further include: the second training module 50 is configured to obtain a second sample set before the tag identification module 30 obtains the user appeal tag according to the first text feature and the preset appeal tag identification model corresponding to the appeal tag type, where the second sample set includes the sample first text feature, and the sample first text feature is labeled with a corresponding appeal tag; and training the neural network by adopting a second sample set to obtain a preset appeal label identification model.
Optionally, the feature extraction module 20 may be specifically configured to convert the service voice data into text information; extracting user experience problems provided by customer service personnel from the text information; extracting a second text characteristic according to a preset evaluation label type and according to text information and user experience problems; accordingly, the tag identification module 30 may be specifically configured to obtain the evaluation tag of the user according to the second text feature and the preset evaluation tag identification model corresponding to the evaluation tag type.
Fig. 14 is another schematic structural diagram of a user tag obtaining apparatus according to an embodiment of the present invention.
As shown in fig. 14, the user tag obtaining apparatus may further include: a third training module 60, configured to obtain a third sample set before the tag identification module 30 obtains the evaluation tag of the user according to the second text feature and the preset evaluation tag identification model corresponding to the evaluation tag type, where the third sample set includes: the sample second text features are marked with corresponding evaluation labels; and training the neural network by adopting a third sample set to obtain a preset evaluation label recognition model.
Optionally, the feature extraction module 20 may be specifically configured to sequentially perform pre-emphasis, framing, windowing, fast fourier transform, MEL filter analysis, and discrete cosine transform on the service voice data to obtain a voiceprint feature vector.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus described above may refer to the corresponding process of the method in the foregoing method embodiment, and is not described in detail herein.
An embodiment of the present invention further provides a user tag obtaining device, where the user tag obtaining device may be a computer, a server, or the like, and fig. 15 illustrates a schematic structural diagram of the user tag obtaining device provided in the embodiment of the present invention.
As shown in fig. 15, the user tag obtaining apparatus may include: a processor 100, a storage medium 200 and a bus (not labeled), wherein the storage medium 200 stores machine-readable instructions executable by the processor 100, when the user tag obtaining apparatus is operated, the processor 100 communicates with the storage medium 200 via the bus, and the processor 100 executes the machine-readable instructions to perform the user tag obtaining method as described in the foregoing method embodiments. The specific implementation and technical effects are similar, and are not described herein again.
Based on this, the embodiment of the present invention further provides a storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the user tag obtaining method as described in the foregoing method embodiment is executed. The specific implementation and technical effects are similar, and are not described herein again.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A user tag acquisition method is characterized by comprising the following steps:
acquiring service voice data of a user;
extracting corresponding characteristic information from the service voice data according to a preset user tag type;
and acquiring the user label of the user according to the characteristic information and a preset label identification model corresponding to the user label type.
2. The method of claim 1, wherein the extracting corresponding feature information from the service voice data according to a preset user tag type comprises:
extracting voiceprint features from the service voice data according to a preset personality emotion label type;
correspondingly, the obtaining the user tag of the user according to the feature information and the preset tag identification model corresponding to the user tag type includes:
and acquiring the personality emotion label of the user according to the preset personality emotion label identification model corresponding to the voiceprint feature and the personality emotion label type.
3. The method according to claim 2, wherein before obtaining the personality emotion label of the user according to the preset personality emotion label recognition model corresponding to the voiceprint feature and the personality emotion label type, the method further comprises:
obtaining a first sample set, wherein the first sample set comprises sample voiceprint features, and the sample voiceprint features are marked with corresponding personality emotion labels;
and training a neural network by adopting the first sample set to obtain the preset personality emotion label recognition model.
4. The method of claim 1, wherein the extracting corresponding feature information from the service voice data according to a preset user tag type comprises:
converting the service voice data into text information;
extracting a first text feature from the text information according to a preset appeal label type;
correspondingly, the obtaining the user tag of the user according to the feature information and the preset tag identification model corresponding to the user tag type includes:
and acquiring the appeal label of the user according to the first text feature and a preset appeal label identification model corresponding to the appeal label type.
5. The method of claim 4, wherein before obtaining the appeal tag of the user according to the preset appeal tag identification model corresponding to the first text feature and the appeal tag type, the method further comprises:
obtaining a second sample set, wherein the second sample set comprises sample first text features, and the sample first text features are labeled with corresponding appeal labels;
and training a neural network by adopting the second sample set to obtain the preset appeal label identification model.
6. The method of claim 1, wherein the extracting corresponding feature information from the service voice data according to a preset user tag type comprises:
converting the service voice data into text information;
extracting user experience problems provided by customer service personnel from the text information;
extracting a second text feature according to a preset evaluation label type and the text information and the user experience problem;
correspondingly, the obtaining the user tag of the user according to the feature information and the preset tag identification model corresponding to the user tag type includes:
and acquiring the evaluation label of the user according to the second text feature and a preset evaluation label identification model corresponding to the evaluation label type.
7. The method according to claim 6, wherein before the obtaining of the rating label of the user according to the second text feature and the preset rating label recognition model corresponding to the rating label type, the method further comprises:
obtaining a third set of samples, the third set of samples comprising: the sample second text features are marked with corresponding evaluation labels;
and training a neural network by adopting the third sample set to obtain the preset evaluation label recognition model.
8. The method of claim 2, wherein extracting voiceprint features from the service voice data comprises:
and the service voice data is processed by pre-emphasis, framing, windowing, fast Fourier transform, Mel (MEL) filter bank analysis and discrete cosine transform in sequence to obtain the voiceprint feature vector.
9. A user tag obtaining apparatus, the apparatus comprising: the acquisition module is used for acquiring service voice data of a user;
the characteristic extraction module is used for extracting corresponding characteristic information from the service voice data according to a preset user label type;
and the tag identification module is used for acquiring the user tag of the user according to the characteristic information and a preset tag identification model corresponding to the user tag type.
10. A user tag obtaining apparatus, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the user tag obtaining device is operated, the processor executing the machine-readable instructions to perform the user tag obtaining method according to any one of claims 1-8.
CN202010335882.2A 2020-04-25 2020-04-25 User tag obtaining method, device and equipment Pending CN111554304A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010335882.2A | 2020-04-25 | 2020-04-25 | User tag obtaining method, device and equipment

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202010335882.2A | 2020-04-25 | 2020-04-25 | User tag obtaining method, device and equipment

Publications (1)

Publication Number Publication Date
CN111554304A true CN111554304A (en) 2020-08-18

Family

ID=72003984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010335882.2A Pending CN111554304A (en) 2020-04-25 2020-04-25 User tag obtaining method, device and equipment

Country Status (1)

Country Link
CN (1) CN111554304A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107452405A (en) * 2017-08-16 2017-12-08 北京易真学思教育科技有限公司 A kind of method and device that data evaluation is carried out according to voice content
CN107895230A (en) * 2017-11-06 2018-04-10 广州杰赛科技股份有限公司 Customer service quality evaluating method and device
CN108319666A (en) * 2018-01-19 2018-07-24 国网浙江省电力有限公司电力科学研究院 A kind of electric service appraisal procedure based on multi-modal the analysis of public opinion
CN110322899A (en) * 2019-06-18 2019-10-11 平安银行股份有限公司 User's intelligent method for classifying, server and storage medium
CN110459210A (en) * 2019-07-30 2019-11-15 平安科技(深圳)有限公司 Answering method, device, equipment and storage medium based on speech analysis
CN110473549A (en) * 2019-08-21 2019-11-19 北京智合大方科技有限公司 A kind of voice dialogue analysis system, method and storage medium
CN110853649A (en) * 2019-11-05 2020-02-28 集奥聚合(北京)人工智能科技有限公司 Label extraction method, system, device and medium based on intelligent voice technology
CN110968671A (en) * 2019-12-03 2020-04-07 北京声智科技有限公司 Intent determination method and device based on Bert

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
姜囡 (Jiang Nan), 《语音信号识别技术与实践》 [Speech Signal Recognition Technology and Practice], Shenyang: Northeastern University Press, pages 208-210 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI738610B (en) * 2021-01-20 2021-09-01 橋良股份有限公司 Recommended financial product and risk control system and implementation method thereof
TWI741937B (en) * 2021-01-20 2021-10-01 橋良股份有限公司 Judgment system for suitability of talents and implementation method thereof
CN112818841A (en) * 2021-01-29 2021-05-18 北京搜狗科技发展有限公司 Method and related device for recognizing user emotion
CN113240444A (en) * 2021-06-18 2021-08-10 中国银行股份有限公司 Bank customer service seat recommendation method and device

Similar Documents

Publication Publication Date Title
CN111554304A (en) User tag obtaining method, device and equipment
CN106847292B (en) Method for recognizing sound-groove and device
CN112259106A (en) Voiceprint recognition method and device, storage medium and computer equipment
Hourri et al. A deep learning approach for speaker recognition
CN110457432A (en) Interview methods of marking, device, equipment and storage medium
CN110970036B (en) Voiceprint recognition method and device, computer storage medium and electronic equipment
CN109313892A (en) Steady language identification method and system
CN110136726A (en) A kind of estimation method, device, system and the storage medium of voice gender
CN110782902A (en) Audio data determination method, apparatus, device and medium
Avila et al. Automatic speaker verification from affective speech using Gaussian mixture model based estimation of neutral speech characteristics
US10522135B2 (en) System and method for segmenting audio files for transcription
Sekkate et al. Speaker identification for OFDM-based aeronautical communication system
Esmaili et al. An automatic prolongation detection approach in continuous speech with robustness against speaking rate variations
CN115831125A (en) Speech recognition method, device, equipment, storage medium and product
Nagaraja et al. Combination of features for multilingual speaker identification with the constraint of limited data
CN111400463A (en) Dialog response method, apparatus, device and medium
CN113782005B (en) Speech recognition method and device, storage medium and electronic equipment
CN114974255A (en) Hotel scene-based voiceprint recognition method, system, equipment and storage medium
CN115063155A (en) Data labeling method and device, computer equipment and storage medium
CN112052994A (en) Customer complaint upgrade prediction method and device and electronic equipment
Yadava et al. Improvements in spoken query system to access the agricultural commodity prices and weather information in Kannada language/dialects
CN114550741A (en) Semantic recognition method and system
Das et al. Investigating text-independent speaker verification systems under varied data conditions
Shome et al. A robust DNN model for text-independent speaker identification using non-speaker embeddings in diverse data conditions
Rajeswari et al. Speech Quality Enhancement Using Phoneme with Cepstrum Variation Features.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200818