CN114398512A - Big data-based voice portrait analysis method for communication operator business customer - Google Patents
- Publication number
- CN114398512A (application CN202110989375.5A)
- Authority
- CN
- China
- Prior art keywords
- user
- data
- voice
- service
- analysis method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/64—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/65—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0281—Customer communication at a business location, e.g. providing product or service information, consulting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Abstract
The invention discloses a big data-based voice portrait analysis method for communication operator industry customers, comprising the following steps: step 1, collecting voice data of calls between users and agents, and converting the voice data into text data; step 2, performing word segmentation and feature selection on the text data, building a feature vector for each segmented word, and modeling the data; step 3, automatically clustering the words according to the feature vector of each segmented word and, after clustering, marking the clustered words by class according to the semantic tag of each cluster; step 4, applying the voice service to the class marks to recognize the user's intention, and calculating user tag values through a tag model; and step 5, analyzing the user tags and outputting a user portrait set through multi-dimensional indicators. Beneficial effects: applying the tag model to users forms a multi-dimensional user portrait, so that an agent can learn about a customer before communicating with that customer and provide targeted service, which improves the customer's perception of the service and reduces the complaint rate.
Description
Technical Field
The invention relates to the field of communication, in particular to a big data-based voice portrait analysis method for communication operator industry customers.
Background
Against the background of the mobile internet, customers' demands for entertainment, emotional engagement, efficiency, and an excellent experience have become the core direction driving continuous innovation in technology, applications, terminals, and services, and the resulting transformation of enterprise business models has become a key driving force for service innovation. With the explosive growth of data volume and the maturing of big data technology, more and more customer behavior data can be captured, so that a user portrait can become a genuinely valuable portrait.
The method is based on users' massive voice data. It defines and develops portraits of customer appeals and behaviors, extends the information available on a user's marketing tendency, complaint tendency, consultation preference, product interest, handling behavior, and so on, describes the user's characteristics from all angles, and provides comprehensive data support for service and for market operation and maintenance activities. The method makes it possible to learn what customers are saying in a timely manner, improve service capability, and promote innovation in business processes.
At present, operators mostly build customer portraits from structured information. Such portraits cannot fully reflect a user's individuality and needs, and cannot give each user a personalized service experience during the service process. Meanwhile, the operator's traditional marketing and maintenance model suffers from unclear targets, direction, and rhythm, and the heavy investment of operation and maintenance resources results in waste.
Disclosure of Invention
The invention provides a big data-based voice portrait analysis method for communication operator business customers, which greatly improves the marketing success rate, reduces user maintenance cost, and lightens the workload of customer service personnel, so that agents understand the business and can provide targeted service personalized to each customer.
A big data-based voice portrait analysis method for communication operator business customers comprises the following steps:
step 3, automatically clustering the words according to the feature vector of each segmented word and, after clustering, marking the clustered words by class according to the semantic tag of each cluster;
step 5, analyzing the user tags and outputting a user portrait set through multi-dimensional indicators;
step 7, analyzing the user portrait set to generate a visual multi-dimensional report.
In step 1, the user voice data is cleaned and preprocessed before being converted into text data.
In step 1, ASR is used to transcribe the user recording data. Recognition uses a deep neural network acoustic model, and the recorded speech is transcribed into dialog text for semantic analysis.
In step 2, the data set comprises a user behavior database, a system database, a corpus, and a lexicon:
user behavior database: data on the services a user habitually uses and prefers;
system database: the user's basic information and basic service information;
corpus: a user periodic portrait model formed from historical dialog texts between the user and agents;
lexicon: the operator's product thesaurus and service thesaurus.
In step 2, word segmentation is performed with an HMM algorithm; feature selection is performed with the TF-IDF and LDA algorithms so that the text data can be computed on; feature vectors of the segmented words are built with the word2vec algorithm; the featurized data is modeled with a CNN algorithm; and user tag values are calculated through the data model and the classification model.
In step 3, the semantic tags used for class marking after the user dialog text has been segmented include: service acceptance, complaint, service consultation, service query, fault, and attitude semantic tags.
In step 4, user intention is recognized by a user intention classification model, which classifies user intention into the following types: high complaint risk, data-traffic sensitive, product sensitive, password sensitive, complaint, and marketing refusal.
In step 4, the user's tag model includes: a basic feature tag, a product requirement tag, a service feature tag, a consumption feature tag, a channel feature tag, a terminal preference tag, a user service evaluation tag, a location tag, and an internet content preference tag.
In step 4, user tags have four update levels (daily, weekly, monthly, and yearly), and the tags of different user portraits are updated according to their respective update requirements.
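As an illustration of the four update levels, the following minimal sketch checks whether a tag is due for a refresh. The interval lengths (30 days for monthly, 365 days for yearly), the name `is_due`, and the example dates are assumptions made for illustration and do not come from the method itself.

```python
from datetime import date, timedelta

# Assumed refresh interval for each update level (hypothetical values).
UPDATE_INTERVAL = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
    "yearly": timedelta(days=365),
}

def is_due(level, last_updated, today):
    """Return True when a tag at the given update level should be recomputed."""
    return today - last_updated >= UPDATE_INTERVAL[level]

# A weekly tag last refreshed 7 days ago is due; a monthly tag after 14 days is not.
weekly_due = is_due("weekly", date(2024, 1, 1), date(2024, 1, 8))
monthly_due = is_due("monthly", date(2024, 1, 1), date(2024, 1, 15))
```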
In step 5, the user portrait set includes a product portrait, a classification portrait, a complaint portrait, a consultation service portrait, a business tendency portrait, and a consumption tendency portrait.
The invention at least comprises the following beneficial effects:
By applying a tag model to users, the invention forms a multi-dimensional user portrait, so that an agent can learn about a customer before communicating with that customer and provide targeted service. This improves the customer's perception of the service, reduces the complaint rate, improves the marketing success rate, reduces user maintenance cost, and assists product operation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow block diagram of the big data-based voice portrait analysis method for communication operator business customers according to the present invention;
FIG. 2 is an application architecture diagram of the big data-based voice portrait analysis method for communication operator business customers according to the present invention;
FIG. 3 is a schematic diagram of the relationship between users and tags in the big data-based voice portrait analysis method for communication operator business customers according to the present invention;
FIG. 4 shows the user tag weight calculation formula of the big data-based voice portrait analysis method for communication operator business customers according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A big data-based voice portrait analysis method for communication operator business customers comprises the following steps:
step 3, automatically clustering the words according to the feature vector of each segmented word and, after clustering, marking the clustered words by class according to the semantic tag of each cluster;
step 5, analyzing the user tags and outputting a user portrait set through multi-dimensional indicators;
step 7, analyzing the user portrait set to generate a visual multi-dimensional report.
The distributed message system in step 2 comprises:
Portrait application: the customer voice portrait can be applied to every stage of marketing and service, specifically: service prediction, precise service, incoming-call maintenance, product configuration, and value improvement.
Data visualization: the user portrait data created by the analysis processing service is presented as multi-dimensional reports.
Analysis processing service: includes cross analysis, text classification, text clustering, and thematic analysis.
Cross analysis: structured fields of the data are selected for multi-dimensional cross analysis, so that the distribution, comparison, and trend of the analyzed subject data along known dimensions can be understood quickly.
Text classification: models built from keywords are matched against the detailed text of each record, and the records are automatically sorted into the corresponding sets, achieving rapid classification of massive data.
Text clustering: records are grouped automatically according to a semantic understanding of their detailed text, forming sets that were not known in advance and achieving rapid grouping of massive data.
Thematic analysis: the data is analyzed in depth at multiple levels by combining rule-based analysis and intelligent analysis, the root causes behind the analysis topic are investigated, and the results can be added to the corpus to accumulate knowledge.
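The keyword-based text classification described above can be sketched as follows. The keyword model, the category names, and the sample records are hypothetical; a real deployment would build the keywords from the operator's own thesauri.

```python
# Hypothetical keyword model: category -> keywords that signal it.
KEYWORD_MODEL = {
    "complaint": {"complaint", "dispute", "harassment"},
    "fault": {"fault", "outage"},
    "query": {"balance", "inquiry"},
}

def classify(text):
    """Return every category whose keywords appear in the record text."""
    tokens = set(text.lower().split())
    return sorted(cat for cat, words in KEYWORD_MODEL.items() if tokens & words)

records = [
    "broadband fault since yesterday",
    "balance inquiry and fee dispute",
    "thank you goodbye",
]

# Each record is placed into zero or more sets according to the keywords it matches.
classified = {rec: classify(rec) for rec in records}
```

A record may fall into several sets at once, and a record that matches no keyword stays unclassified, which is where text clustering can take over.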
The voice service includes: speech recognition, silence recognition, emotion recognition, scene segmentation, semantic understanding, and full-text transcription.
Speech recognition: converts voice call traffic into speaker-separated dialog pairs in real time;
silence recognition: identifies periods of silence from the user during the call;
emotion recognition: recognizes the emotions of the agent and the user;
scene segmentation: a call contains multiple scenes, which are divided from one another;
semantic understanding: understands the user's intention;
full-text transcription: transcribes call recordings into dialog text offline.
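The text does not say how silence is recognized. One simple and widely used approach, shown here purely as an assumption, is to flag frames whose root-mean-square energy falls below a threshold. The frame length, the threshold, and the sample values are illustrative only.

```python
import math

def silent_frames(samples, frame_len=4, threshold=0.05):
    """Flag each full frame as silent when its RMS energy is below the threshold."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        flags.append(rms < threshold)
    return flags

# Toy signal: speech, a quiet stretch, then speech again.
samples = [0.5, -0.4, 0.6, -0.5,
           0.01, -0.02, 0.0, 0.01,
           0.3, -0.3, 0.2, -0.4]
flags = silent_frames(samples)
silence_ratio = sum(flags) / len(flags)
```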
Basic data layer: includes the data set used for modeling.
Through the multi-dimensional reports of the user portrait, service prediction is provided before and while a human agent answers a call, so that service, marketing, and maintenance are carried out more precisely.
In this embodiment, in step 1, the user voice data is cleaned and preprocessed before being converted into text data. Cleaning is applied to the customer dialog recordings: empty audio is discarded, audio that meets the requirements is selected, the data is encoded and classified, outliers are removed, missing values are filled in, and duplicate values are removed.
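The cleaning rules above can be sketched on call metadata as follows. The `CallRecord` fields, the duration bounds, and the default used to fill a missing value are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CallRecord:
    call_id: str
    duration_s: float        # length of the recording in seconds
    region: Optional[str]    # may be missing in the raw data

def clean_records(records, min_s=1.0, max_s=3600.0, default_region="unknown"):
    """Drop empty or abnormal audio, remove duplicates, and fill missing fields."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec.duration_s < min_s or rec.duration_s > max_s:
            continue  # empty audio or an abnormal duration
        if rec.call_id in seen:
            continue  # duplicate of a record already kept
        seen.add(rec.call_id)
        region = rec.region if rec.region is not None else default_region
        cleaned.append(CallRecord(rec.call_id, rec.duration_s, region))
    return cleaned

raw = [
    CallRecord("c1", 120.0, "north"),
    CallRecord("c2", 0.0, "south"),    # empty audio
    CallRecord("c1", 120.0, "north"),  # duplicate
    CallRecord("c3", 45.5, None),      # missing region
]
cleaned = clean_records(raw)
```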
In this embodiment, in step 1, ASR transcribes the user recording data. Recognition uses a deep neural network acoustic model to transcribe the recorded speech into dialog text, and a semantic model is then applied to the transcribed speech to obtain the user's semantics.
In this embodiment, in step 2, the data set comprises a user behavior database, a system database, a corpus, and a lexicon:
user behavior database: data on the services a user habitually uses and prefers, such as frequently used websites and apps;
system database: the user's basic information and basic service information, such as gender, package, and monthly consumption;
corpus: a user periodic portrait model formed from historical dialog texts between the user and agents;
lexicon: the operator-industry product thesaurus and service thesaurus, with entries such as ice cream packages and happy shopping.
Example of the product thesaurus: 3G package, Internet financing, bee card, Xinlang V card, Baidu Shen card, unlimited additional product, 4G package, flow rate limitation unlimited, hungry card, Taobao smooth card, ant treasure card, flow rate limitation relieveable, M2M connection service, prepaid product package Mei Tuan card, drip orange card, Tengwang card, voice unlimited, external enterprise payment service, post-paid product package, beep li card, recruit card, drip Wang card, high-value old user smooth experience product, Wo wallet, nail card, enantio card, Jingdong strong card, smooth-crossing ice cream product.
Example of the service thesaurus: fixed-network basic voice service, internet access service, campus fusion, mobile internet access, data VPN, fixed-network electronic payment, wireless local telephone service, data and network element service, 2I fusion, three-way calling, short message service, video conference, public telephone service, ICT service, caller ID, incoming call restriction, voice VPN, fusion information service, telephone card service, advertisement service, incoming call highlight, call hold, personalized ring back tone, internet payment, unlimited fusion, limited fusion, caller display prohibition, call transfer, call center, and mobile phone media.
In this embodiment, in step 2, word segmentation is performed with an HMM algorithm, and feature selection is performed with the TF-IDF and LDA algorithms so that the text data can be computed on. Feature vectors of the segmented words are built with algorithms such as word2vec, Deeplearning4j, fastText, and LDA, and the text is classified and clustered; the classification and clustering results are used by the data clustering service and the text classification service. The featurized data can be modeled with algorithms such as fastText, CNN, and THUCTC, and user tag values are calculated through the data model and the classification model.
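HMM-based word segmentation is commonly done by tagging each character as B, M, E, or S (begin, middle, or end of a word, or a single-character word) and decoding the most likely tag sequence with the Viterbi algorithm. The sketch below follows that scheme. All probabilities are hand-set toy values, not trained parameters, and the floor used for unseen events is an assumed smoothing shortcut.

```python
import math

STATES = "BMES"  # Begin, Middle, End of a word, or Single-character word
START = {"B": 0.6, "S": 0.4}
TRANS = {
    "B": {"M": 0.2, "E": 0.8},
    "M": {"M": 0.3, "E": 0.7},
    "E": {"B": 0.5, "S": 0.5},
    "S": {"B": 0.5, "S": 0.5},
}
# Toy emission table: 流量 = data traffic, 的 = of, 套餐 = package.
EMIT = {
    "B": {"流": 0.4, "套": 0.4},
    "M": {},
    "E": {"量": 0.4, "餐": 0.4},
    "S": {"的": 0.5},
}
FLOOR = 1e-8  # assumed probability for anything missing from the tables

def logp(table, key):
    return math.log(table.get(key, FLOOR))

def viterbi(text):
    """Return the most likely B/M/E/S tag for each character of a non-empty text."""
    # Each column maps state -> (best log probability, previous state).
    best = [{s: (logp(START, s) + logp(EMIT[s], text[0]), None) for s in STATES}]
    for ch in text[1:]:
        column = {}
        for s in STATES:
            score, prev = max(
                (best[-1][p][0] + logp(TRANS[p], s), p) for p in STATES
            )
            column[s] = (score + logp(EMIT[s], ch), prev)
        best.append(column)
    # The final character must close a word, so only E or S may end the path.
    state = max("ES", key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]

def segment(text):
    """Split text into words by cutting after every E or S tag."""
    if not text:
        return []
    words, current = [], ""
    for ch, tag in zip(text, viterbi(text)):
        current += ch
        if tag in "ES":
            words.append(current)
            current = ""
    if current:
        words.append(current)
    return words

words = segment("流量的套餐")
```

In practice the start, transition, and emission probabilities would be estimated from a segmented corpus, and the operator's thesaurus would be consulted alongside the model.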
Introduction to the algorithms applied:
HMM algorithm: applied to word segmentation. A hidden Markov model determines the hidden parameters of a process from its observable parameters and then uses those parameters for further analysis.
TF-IDF algorithm: applied to feature selection. It evaluates how important a word is to one document in a document set or corpus. A word's importance increases in proportion to the number of times it appears in the document, but decreases with the frequency with which it appears across the corpus.
LDA algorithm: applied to feature selection, feature vector construction, and data clustering. It is a document topic generation model, also called a three-layer Bayesian probability model, with a three-layer structure of words, topics, and documents.
word2vec algorithm: applied to feature vector construction. It is a family of models for producing word vectors, each a shallow, two-layer neural network. A word2vec model maps each word to a vector that can represent word-to-word relations; the vector is taken from the hidden layer of the neural network.
Deeplearning4j: applied to feature vector construction. It is a framework that supports a wide range of deep learning algorithms and can implement word2vec.
fastText: applied to feature vector construction and data modeling. It is a tool for computing word vectors and classifying text.
CNN algorithm: applied to data modeling. After the data is vectorized, it is modeled with a CNN, which learns a mapping from input to output from a large number of voice dialog texts.
THUCTC: applied to data modeling. It automatically and efficiently trains, evaluates, and classifies with a user-defined text classification corpus.
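To illustrate how word vectors can represent word-to-word relations, the sketch below compares vectors by cosine similarity. The three-dimensional vectors are made-up values for illustration; real word2vec or fastText vectors are learned from the corpus and typically have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical word vectors, not output from a trained model.
VECTORS = {
    "package": [0.9, 0.1, 0.0],
    "plan": [0.8, 0.2, 0.1],
    "fault": [0.0, 0.2, 0.9],
}

def most_similar(word):
    """Return the other word whose vector points in the closest direction."""
    others = (w for w in VECTORS if w != word)
    return max(others, key=lambda w: cosine(VECTORS[word], VECTORS[w]))

nearest = most_similar("package")
```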
After the user tags are calculated, the weight of each user tag in the portrait is determined; the same user portrait has different tag weights in different scenarios.
User tag weights are calculated with the TF-IDF algorithm. For example, suppose there are 3 users and 5 tags (as shown in fig. 3). The relationship between tags and users also reflects, to some extent, the relationship among the tags. Let w(P, T) denote the number of times tag T is used to mark user P. TF(P, T) is the proportion of those markings among all tag markings of user P:
$$\mathrm{TF}(P, T) = \frac{w(P, T)}{\sum_{i} w(P, T_i)}$$
As shown in fig. 3, if user A is marked 6 times with tag a, 4 times with tag b, and 2 times with tag c, then the TF of tag a for user A is 6/(6+4+2) = 0.5.
The corresponding IDF(P, T) indicates how scarce tag T is among all tags, that is, how rarely the tag occurs. If a tag T occurs with small probability and is nevertheless used to mark a user, the relationship between that user and tag T is all the closer. The formula is:
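A common form consistent with this description, given here as an assumption rather than as the formula of the method itself, takes the logarithm of the ratio between all tag markings across all users and the markings that use tag T:

```latex
\mathrm{IDF}(P, T) = \log \frac{\sum_{j} \sum_{i} w(P_j, T_i)}{\sum_{j} w(P_j, T)},
\qquad
\mathrm{weight}(P, T) = \mathrm{TF}(P, T) \times \mathrm{IDF}(P, T)
```

Here w(P_j, T_i) is the number of times tag T_i marks user P_j, as defined for TF, so a tag that accounts for a small share of all markings receives a large IDF.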
The weight of a user's tag can then be obtained from TF and IDF. This weight does not yet take the service scenario into account. The user tag weight clearly also needs to consider the service scenario, how much time has elapsed, the number of times the user has generated the tag, and so on; the calculation formula is shown in fig. 4.
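The scenario-independent weight can be sketched as below. The counts for user A follow the example in the text; the counts for users B and C are invented so that there are 3 users and 5 tags. The IDF form used (logarithm of all markings divided by the markings of the given tag) is an assumption, and the scenario and time factors of fig. 4 are not modeled.

```python
import math

# marks[user][tag] = w(P, T), the number of times the tag marks the user.
marks = {
    "A": {"a": 6, "b": 4, "c": 2},  # from the example in the text
    "B": {"a": 3, "d": 1},          # hypothetical
    "C": {"b": 2, "e": 4},          # hypothetical
}

def tf(user, tag):
    """Share of the user's tag markings that use this tag."""
    return marks[user].get(tag, 0) / sum(marks[user].values())

def idf(tag):
    """Assumed scarcity measure; the tag must mark at least one user."""
    total = sum(sum(tags.values()) for tags in marks.values())
    tag_total = sum(tags.get(tag, 0) for tags in marks.values())
    return math.log(total / tag_total)

def weight(user, tag):
    return tf(user, tag) * idf(tag)

weights_for_a = {tag: weight("A", tag) for tag in marks["A"]}
```

Under these counts tag c, which marks only user A, has a higher IDF than tag a, which also marks user B, even though tag a has the higher TF for user A.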
In this embodiment, in step 3, the semantic tags used for class marking after the user dialog text has been segmented include: service acceptance, complaint, service consultation, service query, fault, and attitude semantic tags.
Examples of semantic tags are:
Service acceptance semantics include: data traffic package, main and auxiliary card service, new fixed telephone installation, package change, new installation integration, call transfer, special service function, package change integration, and emergency shutdown and startup.
Complaint semantics include: unable to receive short messages, agent service skill problems, payment not credited, services cannot be used normally, value-added fee disputes, unable to access the internet, real-name registration of customer data, certificate blacklist problems, data traffic fee disputes, unable to make calls, suspension due to harassment or fraud, delayed or failed activation caused by untimely service processing, and double capping of data traffic.
The business consultation semantics comprise: package consultation, number portability, voyage cloud disk, machine dismantling, international roaming, privilege and exclusive flow, charge storage and delivery machine, remote card supplementing, fixed-width charge, service password, point exchange, handling procedures, charge storage and delivery charge, remote number sale, broadband package year charge, junk information communication fraud consultation, electronic invoice, user passing, 2l package charge, product consultation, fixed-network delivery machine, voyage video, one-card charge and international long distance.
Service query semantics include: balance allowance inquiry, activity return and expiration time inquiry, local phone number inquiry, complex phone charge inquiry and arrearage reason inquiry, business validation inquiry, business hall information inquiry, point inquiry, and fixed-width account password inquiry.
The failure class semantics include: fixed line fault, system upgrade, broadband fault, large area fault.
The attitude class semantics include: bad attitude, friendly attitude, irritable temper, and mild attitude.
In this embodiment, in step 4, user intention is recognized by a user intention classification model, which classifies user intention into the following types: high complaint risk, data-traffic sensitive, product sensitive, password sensitive, complaint, and marketing refusal.
In this embodiment, in step 4, the user's tag model includes:
basic feature tag: describes customer attributes and the corresponding social relations from the perspective of a natural person;
product requirement tag: analyzes, from incoming-call voice data, information on the Unicom products the user has ordered, including contract plan participation and the customer's tendencies in choosing marketing activities;
service feature tag: analyzes the user's usage and calling circle from incoming-call consultation voice, data traffic, short messages, and the like;
consumption feature tag: describes the composition of the user's expenditure and income, account settlement, payment, and credit-related information;
channel feature tag: describes the channels used in customer service contacts and channel preference information;
terminal preference tag: describes the user's terminal usage and terminal preference information, obtained through incoming-call consultation and service handling;
user service evaluation tag: describes the customer's value and the customer's satisfaction with the service from the perspectives of marketing, maintenance, and the like;
location tag: records the user's movements and base station usage track;
internet content preference tag: classifies internet content to describe the customer's internet usage preferences.
For example, customer complaint information belongs to the user service evaluation tag, consultation information to the product requirement tag, tariff and historical bill information to the consumption feature tag, and customer age and gender to the basic feature tag.
The user's behavior data is mapped onto the user tag system through rules and weights: the semantic labels of the user's call behavior are mapped to the tag system, the tag model supplies the mapping rules and the weight of each mapping relation, and the feature tag values and tag weights are calculated to construct an accurate user portrait.
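As a rough illustration of this rule-and-weight mapping, the sketch below accumulates weights per tag value and picks the strongest value in each tag category. The rule table, its weights, and the function names are hypothetical; the source does not give concrete rules or weight values.

```python
from collections import defaultdict

# Hypothetical mapping rules: semantic label -> (tag category, tag value, weight).
# The weights are illustrative placeholders, not values from the source.
MAPPING_RULES = {
    "international roaming": ("product demand", "roaming package", 0.6),
    "package consultation": ("product demand", "package change", 0.8),
    "bad attitude": ("user service evaluation", "dissatisfied", 0.9),
}


def score_tags(semantic_labels):
    """Sum the rule weights per (category, value) over all observed labels."""
    scores = defaultdict(float)
    for label in semantic_labels:
        rule = MAPPING_RULES.get(label)
        if rule is None:
            continue  # no mapping rule exists for this label
        category, value, weight = rule
        scores[(category, value)] += weight
    return dict(scores)


def top_tag_per_category(scores):
    """Keep, for each tag category, the value with the highest total weight."""
    best = {}
    for (category, value), weight in scores.items():
        if category not in best or weight > best[category][1]:
            best[category] = (value, weight)
    return best


labels = ["package consultation", "international roaming", "package consultation", "unknown"]
scores = score_tags(labels)
best = top_tag_per_category(scores)
print(best)
```

Summing weights is only one possible aggregation; a decay by call age or a normalized score would fit the same structure.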
Constructing the user portrait requires first collecting and cleaning the customer voice data in steps 1 and 2, after which the user's tag values are calculated by a combination of algorithms. The customer voice data is collected through a distributed message system, a data set is established, and the conversations are segmented into words using the operator's lexicon.
In this embodiment, in step 4, different tag attributes of the operator's user tags have different update requirements. The user tags have four update levels (day, week, month, and year), and the tags of each user portrait are updated according to their respective update requirements.
Day-level tags: mood tags, such as irritable or easygoing;
Week-level tags: product demand tags, such as a flow package;
Month-level tags: package tags, such as 5G, flow package, and auxiliary card handling;
Year-level tags: customer base attribute tags, such as age.
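The four update levels can be sketched as a simple refresh schedule. This is a minimal illustration, assuming fixed intervals of 1, 7, 30, and 365 days; the interval lengths, the `TAG_LEVELS` assignment, and the `tags_due` function are hypothetical and not specified in the source.

```python
from datetime import date, timedelta

# Refresh interval per update level. The day counts for month and year are
# simplifying assumptions (30 and 365 days).
UPDATE_INTERVALS = {
    "day": timedelta(days=1),
    "week": timedelta(days=7),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}

# Tag-to-level assignment following the examples in the list above.
TAG_LEVELS = {
    "mood": "day",
    "product demand": "week",
    "package": "month",
    "customer base attribute": "year",
}


def tags_due(last_updated, today):
    """Return the tags whose refresh interval has elapsed since the last update."""
    due = []
    for tag, updated_on in last_updated.items():
        interval = UPDATE_INTERVALS[TAG_LEVELS[tag]]
        if today - updated_on >= interval:
            due.append(tag)
    return due


last_updated = {
    "mood": date(2021, 8, 25),
    "product demand": date(2021, 8, 22),
    "package": date(2021, 8, 1),
    "customer base attribute": date(2021, 1, 1),
}
due_today = tags_due(last_updated, date(2021, 8, 26))
print(due_today)
```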
In this embodiment, in step 5, the user portrait set includes a product portrait, a classification portrait, a complaint portrait, a consultation service portrait, a business tendency portrait, a consumption tendency portrait, a service channel preference portrait, and a consumption capability portrait.
In this scheme, business personnel construct the tag system based on business operation requirements, obtain the user's data from different systems (in particular the voice data), and calculate the user's tag values through the tag model, finally obtaining user portraits that differ across application layers. The user portrait set improves customer service perception, reduces the complaint rate, improves the marketing success rate, reduces user maintenance cost, and assists product operation.
Although embodiments of the present invention have been disclosed above, the foregoing description is only a preferred embodiment and is not intended to limit the invention. The invention can be applied in the various fields for which it is suited, and further modifications can readily be made by those skilled in the art; any modification, equivalent, or improvement made within the spirit and principle of the present invention therefore falls within its scope of protection.
Claims (10)
1. A big data-based voice portrait analysis method for communication operator business customers, characterized in that the method comprises the following steps:
step 1, collecting voice data of calls between a user and an agent, and converting the voice data into text data;
step 2, performing word segmentation and feature selection on the text data through a distributed message system based on the data set, forming a feature vector for each segmented word, and modeling the data;
step 3, automatically clustering the words according to the feature vector of each segmented word, and, after clustering, marking the clustered words with classification marks according to clustering semantic labels;
step 4, performing voice-service user intention recognition on the classification marks and calculating the user tag values through a tag model;
step 5, analyzing the user tags and outputting a user portrait set through multi-dimensional indexes;
step 6, verifying the accuracy of the user portrait through a verification model;
step 7, analyzing the user portrait set to generate a visual multi-dimensional report.
2. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 1, the user voice data needs to be cleaned and preprocessed before being converted into text data.
3. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 1, the voice recording data of the user is transcribed through ASR, the recognition uses an acoustic model based on a deep neural network, and the transcription of the speech into dialog text is completed for semantic analysis.
4. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 2, the data set includes: a user behavior database, a system database, a corpus and a lexicon,
a user behavior database: business habits and preference data used by users;
a system database: basic information and service basic information data of users;
corpus: a user cycle portrait model formed by historical dialogue texts of the user and the agent;
a word bank: operator business product thesaurus and service thesaurus.
5. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 2, word segmentation is performed through an HMM algorithm, feature selection is performed on the data through a TF-IDF algorithm and an LDA algorithm so that the text data can be computed, feature vectors of the segmented words are built through a word2vec algorithm, data modeling is performed on the featured data through a CNN algorithm, and the user tag values are calculated through the data model and a classification model.
6. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 3, the semantic tags used for classification marking after the user dialog text is segmented comprise: business acceptance semantic tags, complaint semantic tags, business consultation semantic tags, business query semantic tags, fault semantic tags and attitude semantic tags.
7. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 4, the user intention recognition is performed through a user intention classification model, and the user intention classification model classifies the user intention into a complaint high-risk type, a flow-sensitive type, a product-sensitive type, a password-sensitive type, a complaint type and a marketing-refusal type.
8. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 4, the tag model of the user includes: the system comprises a basic feature tag, a product requirement tag, a business feature tag, a consumption feature tag, a channel feature tag, a terminal preference tag, a user service evaluation tag, a position tag and an internet content preference tag.
9. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 4, the user tags include four update levels, namely a day level, a week level, a month level and a year level, and the tags of different user portraits are updated according to different update requirements.
10. The big data based voice portrait analysis method for communication carrier business customer as claimed in claim 1, characterized in that: in step 5, the user portrait set comprises a product portrait, a classification portrait, a complaint portrait, a consultation service portrait, a service tendency portrait and a consumption tendency portrait.
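The TF-IDF feature selection named in claim 5 can be illustrated with a minimal pure-Python sketch. It assumes the dialog text has already been segmented into tokens and uses the common `log(N / df)` form of the inverse document frequency, which is one variant among several; the `tf_idf` function and the sample calls are hypothetical and not taken from the patent.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute a TF-IDF weight for each token of each tokenized document.

    TF is the token's relative frequency within the document; IDF is
    log(N / df), where df is the number of documents containing the token.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each token once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            token: (count / total) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return weights


# Hypothetical segmented call transcripts.
calls = [
    ["broadband", "fault", "repair"],
    ["package", "consultation", "broadband"],
    ["package", "charge", "inquiry"],
]
weights = tf_idf(calls)
print(weights[0])
```

Tokens that occur in fewer calls, such as "fault", receive a higher weight than tokens shared across calls, such as "broadband", which is what makes the measure usable for selecting distinguishing features.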
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110989375.5A CN114398512A (en) | 2021-08-26 | 2021-08-26 | Big data-based voice portrait analysis method for communication operator business customer |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110989375.5A CN114398512A (en) | 2021-08-26 | 2021-08-26 | Big data-based voice portrait analysis method for communication operator business customer |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114398512A (en) | 2022-04-26 |
Family
ID=81225905
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110989375.5A Pending CN114398512A (en) | 2021-08-26 | 2021-08-26 | Big data-based voice portrait analysis method for communication operator business customer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114398512A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115623130A (en) * | 2022-12-19 | 2023-01-17 | 北京青牛技术股份有限公司 | Agent conversation service business distribution method and system |
CN115834940A (en) * | 2022-11-14 | 2023-03-21 | 浪潮通信信息系统有限公司 | IPTV/OTT end-to-end data reverse acquisition analysis method and system |
CN115907784A (en) * | 2022-11-01 | 2023-04-04 | 国网江苏省电力有限公司营销服务中心 | Method and system for identifying and actively early warning and notifying sensitive customers in electric power business hall |
CN116095619A (en) * | 2022-12-30 | 2023-05-09 | 天翼物联科技有限公司 | Industry short message channel self-selection method based on naive Bayes algorithm |
CN117745328A (en) * | 2023-12-29 | 2024-03-22 | 深圳市南方网通网络技术开发有限公司 | Multi-platform-based network marketing data processing method and system |
-
2021
- 2021-08-26 CN CN202110989375.5A patent/CN114398512A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114398512A (en) | Big data-based voice portrait analysis method for communication operator business customer | |
CN111488433B (en) | Artificial intelligence interactive system suitable for bank and capable of improving field experience | |
US9910845B2 (en) | Call flow and discourse analysis | |
CN109559221A (en) | Collection method, apparatus and storage medium based on user data | |
CN108764649A (en) | Insurance sales method for real-time monitoring, device, equipment and storage medium | |
CN108521525A (en) | Intelligent robot customer service marketing method and system based on user tag system | |
CN107330706A (en) | A kind of electricity battalion's customer service system and commercial operation pattern based on artificial intelligence | |
CN111383093A (en) | Intelligent overdue bill collection method and system | |
CN111539221B (en) | Data processing method and system | |
CN112507116A (en) | Customer portrait method based on customer response corpus and related equipment thereof | |
CN107071193A (en) | The method and apparatus of interactive answering system accessing user | |
WO2021022790A1 (en) | Active risk control method and system based on intelligent interaction | |
CN112200660B (en) | Bank counter business supervision method, device and equipment | |
CN107527240A (en) | A kind of operator's industry product Praise effect identification system and method | |
CN110009480A (en) | The recommended method in judicial collection path, device, medium, electronic equipment | |
CN109145050B (en) | Computing device | |
CN111541819A (en) | Harvesting accelerating method and system | |
CN114971017A (en) | Bank transaction data processing method and device | |
CN113887214A (en) | Artificial intelligence based wish presumption method and related equipment thereof | |
CN111062422B (en) | Method and device for identifying set-way loan system | |
CN115687754B (en) | Active network information mining method based on intelligent dialogue | |
CN112866491B (en) | Multi-meaning intelligent question-answering method based on specific field | |
CN113283979A (en) | Loan credit evaluation method and device for loan applicant and storage medium | |
CN113191882A (en) | Account checking reminding method and device, electronic equipment and medium | |
CN111192008A (en) | Self-help guide system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||