CN111507751A - Communication data-based clue scoring method - Google Patents

Info

Publication number
CN111507751A
Authority
CN
China
Prior art keywords
clue
clues
communication data
training
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010223418.4A
Other languages
Chinese (zh)
Inventor
杨植麟
杜羽伦
陈虞君
张宇韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruikelun Intelligent Technology Co ltd
Original Assignee
Beijing Ruikelun Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruikelun Intelligent Technology Co ltd
Priority to CN202010223418.4A
Publication of CN111507751A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/51 Centralised call answering arrangements requiring operator intervention, e.g. call or contact centers for telemarketing

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Finance (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the technical field of computer information, in particular to a clue scoring method based on communication data, which mainly comprises the following steps: step 1: collecting communication data; step 2: converting the communication data into interactive text; step 3: performing data preprocessing on the interactive text and the order data to obtain a plurality of clues and the labels corresponding to the clues, wherein the labels comprise positive example labels and negative example labels; step 4: preparing a natural language processing pre-training model; step 5: training and testing the pre-training model with the clues; step 6: generating a clue scoring result. By applying artificial intelligence technologies such as speech recognition, natural language processing and machine learning, the invention improves the clue conversion success rate of telemarketers, so that each telemarketer can more accurately contact potential customers with stronger intention, thereby improving the overall clue conversion rate and operating efficiency and enabling effective clues to be contacted more quickly.

Description

Communication data-based clue scoring method
Technical Field
The invention relates to the technical field of computer information, in particular to a clue scoring method based on communication data.
Background
Currently, in the field of telemarketing, sales and marketing personnel spend several hours per day following up on sales clues (leads) in a clue pool. How to screen out the clues that a salesperson should follow up first, so as to improve the final clue conversion rate, is an urgent problem.
The traditional telemarketing industry relies on a customer relationship management (CRM) system to select the clues to follow up: sales and marketing personnel laboriously enter clue-related data into the system and manually judge each clue's intention to place an order. Despite the best efforts of sales management departments, these systems suffer from problems such as data loss, manual entry errors, and data falsification, which make the final judgment of clue intention inaccurate or invalid. There is therefore a need for a general clue scoring method based on tamper-proof telephone recordings or communication text records.
Disclosure of Invention
The invention provides a clue scoring method based on communication data. By applying artificial intelligence technologies such as speech recognition, natural language processing and machine learning, it improves the clue conversion success rate of telemarketers, so that each telemarketer can more accurately contact potential customers with stronger intention, thereby improving the overall clue conversion rate and operating efficiency and enabling effective clues to be contacted more quickly.
To achieve this purpose, the invention provides the following technical solution: a clue scoring method based on communication data, which mainly comprises the following steps:
step 1: collecting communication data;
step 2: converting the communication data into interactive text;
step 3: performing data preprocessing on the interactive text and the order data to obtain a plurality of clues and the labels corresponding to the clues, wherein the labels comprise positive example labels and negative example labels;
step 4: preparing a natural language processing pre-training model;
step 5: training and testing the pre-training model with the clues;
step 6: generating a clue scoring result.
Preferably, step 3 further comprises the following steps:
step 31: sorting the interactive text of each clue by timestamp, and setting one or more evaluation time points;
step 32: for an evaluation time point, using the conversation text of each clue within the N1 days before the evaluation time point, and checking whether an order is formed within the N2 days after the evaluation time point;
step 33: extracting features for the clues, including but not limited to merging interactive text, segmenting interactive text, and adding additional features.
Preferably, step 5 further comprises the following steps:
step 51: randomly dividing the plurality of clues from step 3 into a training set and a test set;
step 52: fine-tuning a binary classification model built on the pre-training model using the training set;
step 53: testing the fine-tuned model on the test set, and selecting an optimal model.
The beneficial effects of the invention are as follows: by applying artificial intelligence technologies such as speech recognition, natural language processing and machine learning, the invention improves the clue conversion success rate of telemarketers and ensures that each telemarketer can more accurately contact potential customers with stronger intention, thereby improving the overall clue conversion rate and operating efficiency and ensuring that effective clues are contacted more quickly.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of model creation according to the present invention;
FIG. 2 is a flowchart illustrating an exemplary application of the optimal model of the present invention;
FIG. 3 is an example of determining positive and negative examples at an evaluation time point according to the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in the flowcharts of FIG. 1 and FIG. 2, a clue scoring method based on communication data includes the following steps:
step 1: collecting communication data;
step 2: converting the communication data into interactive text;
step 3: performing data preprocessing on the interactive text and the order data to obtain a plurality of clues and the labels corresponding to the clues, wherein the labels comprise positive example labels and negative example labels;
step 4: preparing a natural language processing pre-training model;
step 5: training and testing the pre-training model with the clues;
step 6: generating a clue scoring result.
Specifically, the method comprises the following steps. First step: collect the relevant data and store it in a computer database. The data includes communication records between salespeople and customers, such as telephone recordings and text chat records, and order data, such as name, alias, unit, etc. Because this data is ultimately used to generate the data for training and testing the final clue scoring model, the data fields contain at least the following information:
[Tables not reproduced: required data fields (original images RE-GDA0002563717050000031, RE-GDA0002563717050000041, RE-GDA0002563717050000042)]
For example, if the source of the communication record is a call recording, the data fields of the record are:
[Table not reproduced: call record data fields (original image RE-GDA0002563717050000051)]
The order data is:
[Table not reproduced: order data fields (original image RE-GDA0002563717050000052)]
Second step: convert the communication data into interactive text. The main purpose of this step is to transcribe the telephone recordings between customers and salespeople into structured text, so that natural language processing techniques can analyze its semantics and infer each clue's degree of intention to place an order.
[Figure not reproduced: original image RE-GDA0002563717050000053]
Third step: perform data preprocessing on the transcribed interactive text and the order data. This step includes several preprocessing tasks. First, the historical call transcripts of each clue are sorted by call timestamp, and one or more evaluation time points are set. Then, for an evaluation time point, the call text of each clue within the N1 days before the evaluation time point is used, and it is checked whether an order is formed within the N2 days after the evaluation time point. Clues with order information are treated as positive examples, and clues without order information are treated as negative examples. Finally, features are extracted from the positive and negative clues by methods including, but not limited to, merging call texts, tokenizing call text, segmenting text with a sliding window, and adding additional features. The final product of this step is the features of a plurality of clues and their corresponding positive and negative labels. As shown in FIG. 3:
Clue A should be labeled as a positive example, and the text of phone 2 and phone 3 should be merged.
Clue B is a negative example because it did not result in an order. The text of phone 2 and phone 3 should be merged.
Clue C is neither a positive nor a negative example because the order was placed outside the target number of days N2.
Clue D is neither a positive nor a negative example because there is no call data within the target number of days N1.
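The labeling rule illustrated by clues A to D can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `Call` and `build_example`, the newline used to merge call texts, and the handling of orders placed at or before the evaluation time point are assumptions that do not appear in the original.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Call:
    """One transcribed call of a clue (hypothetical record layout)."""
    timestamp: datetime
    text: str


def build_example(
    calls: List[Call],
    order_time: Optional[datetime],
    eval_time: datetime,
    n1_days: int,
    n2_days: int,
) -> Optional[Tuple[str, int]]:
    """Return (merged_text, label) for one clue, or None if the clue is excluded."""
    window_start = eval_time - timedelta(days=n1_days)
    window_end = eval_time + timedelta(days=n2_days)

    # Keep only the calls inside the N1-day look-back window, in time order.
    history = sorted(
        (c for c in calls if window_start <= c.timestamp <= eval_time),
        key=lambda c: c.timestamp,
    )
    if not history:
        return None  # like clue D: no call data within N1 days

    if order_time is None:
        label = 0  # like clue B: no order was formed
    elif order_time > window_end:
        return None  # like clue C: the order falls outside N2 days
    elif order_time <= eval_time:
        # Assumption: the source does not cover orders placed before the
        # evaluation time point, so such clues are skipped here.
        return None
    else:
        label = 1  # like clue A: order formed within N2 days

    # Merge the in-window call texts into one input text.
    merged_text = "\n".join(c.text for c in history)
    return merged_text, label
```

With N1 = 30 and N2 = 7, a clue whose first call lies 40 days before the evaluation time point contributes only its later calls to the merged text, which mirrors the merging of phone 2 and phone 3 in the examples above.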
Chinese word segmentation (tokenization) splits each Chinese sentence or paragraph into a sequence of indivisible minimum-unit symbols. The Chinese tokenizer of BERT may be used here for segmentation. Because transcribed texts are usually around 1000 words long, and a model with as many parameters as BERT does not handle ultra-long sequences well during training, a sliding window is currently needed to solve this problem. The sliding window divides a long text into several overlapping segments (for example, each segment has 256 words and the overlap is 128 words), and each segment is then input into the BERT model as an independent text for training. Finally, the results obtained from the independent texts are integrated.
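The sliding-window segmentation can be sketched as below, using the example sizes from the text (256-word segments with a 128-word overlap). The function names are hypothetical, and averaging the per-segment results is only one possible way to integrate them, since the original does not specify the integration rule.

```python
from typing import List, Sequence


def sliding_windows(
    tokens: Sequence[str], size: int = 256, overlap: int = 128
) -> List[List[str]]:
    """Split a token sequence into overlapping segments of at most `size` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    stride = size - overlap
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + size]))
        # Stop once the current window reaches the end of the text.
        if start + size >= len(tokens):
            break
        start += stride
    return windows


def combine_scores(segment_scores: Sequence[float]) -> float:
    """Integrate per-segment results by averaging (an assumed rule)."""
    return sum(segment_scores) / len(segment_scores)
```

For a 1000-token transcript these defaults yield seven segments starting at offsets 0, 128, 256, and so on, with the last segment shorter than 256 tokens.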
Additional features may also include follow-up cadence features, product information, user behavior features, customer attribute features, and so forth.
Fourth step: prepare a natural language processing pre-training model. The models that can be used here include, but are not limited to, XLNet and BERT.
Fifth step: train the clue scoring model. The pre-training model (XLNet or BERT) is fine-tuned with the prepared training data to train a binary classification model.
In this step, the data processed in the third step is randomly split by clue into a training set and a test set; preferably, 80% of the clue data is used as the training set and the other 20% as the test set. The training set is then used to fine-tune the binary classification model built on the pre-training model. Finally, the optimal model is selected according to model performance on the test set and saved.
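A minimal sketch of the random split by clue and of the final model selection is given below. It assumes that each clue has a unique identifier and that every candidate model has already been evaluated to a single test-set metric; `split_by_clue`, `select_best`, the fixed seed, and the choice of metric are illustrative assumptions, not details from the original.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_by_clue(
    clue_ids: Sequence[str], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Randomly split clue identifiers so that each clue lands in exactly one set."""
    unique_ids = sorted(set(clue_ids))  # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    cut = int(len(unique_ids) * train_fraction)
    return unique_ids[:cut], unique_ids[cut:]


def select_best(test_metrics: Dict[str, float]) -> str:
    """Pick the candidate model with the highest test-set metric (higher is better)."""
    return max(test_metrics, key=test_metrics.get)
```

Splitting on clue identifiers rather than on individual text segments keeps all segments of one clue in the same set, so the test set contains only clues the model has not seen during fine-tuning.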
Sixth step: generate the clue scoring result. For each clue to be scored, the relevant features of the clue are input into the optimal model obtained in the fifth step, and the value output by the model after the Softmax layer can be regarded as the clue's order-intention score.
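As a worked illustration of how a score is read off the Softmax layer, the sketch below converts the two output logits of a binary classifier into a probability for the positive class. It assumes that index 1 is the positive (order-forming) class; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def clue_score(logits: Sequence[float], positive_index: int = 1) -> float:
    """Use the softmax probability of the positive class as the clue score."""
    return softmax(logits)[positive_index]
```

Equal logits give a score of 0.5, and the score approaches 1 as the positive logit grows relative to the negative one.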
In practical applications, all clues in a clue pool should be sorted in descending order of intention score; the clues at the head of the ranking are the high-intention clues that sales or marketing staff should follow up.
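Ranking a clue pool by score can be sketched as follows. The mapping from clue identifier to score and the optional `top_k` cut-off are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def rank_clues(
    scores: Dict[str, float], top_k: Optional[int] = None
) -> List[Tuple[str, float]]:
    """Sort clues from highest to lowest intention score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

For example, `rank_clues({"a": 0.2, "b": 0.9, "c": 0.5}, top_k=2)` returns the two clues that should be followed up first.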
The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (3)

1. A clue scoring method based on communication data, characterized by comprising the following steps:
step 1: collecting communication data;
step 2: converting the communication data into interactive text;
step 3: performing data preprocessing on the interactive text and the order data to obtain a plurality of clues and the labels corresponding to the clues, wherein the labels comprise positive example labels and negative example labels;
step 4: preparing a natural language processing pre-training model;
step 5: training and testing the pre-training model with the clues;
step 6: generating a clue scoring result.
2. The clue scoring method based on communication data according to claim 1, characterized in that step 3 further comprises the following steps:
step 31: sorting the interactive text of each clue by timestamp, and setting one or more evaluation time points;
step 32: for an evaluation time point, using the conversation text of each clue within the N1 days before the evaluation time point, and checking whether an order is formed within the N2 days after the evaluation time point;
step 33: extracting features for the clues, including but not limited to merging interactive text, segmenting interactive text, and adding additional features.
3. The clue scoring method based on communication data according to claim 2, characterized in that step 5 further comprises the following steps:
step 51: randomly dividing the plurality of clues from step 3 into a training set and a test set;
step 52: fine-tuning a binary classification model built on the pre-training model using the training set;
step 53: testing the fine-tuned model on the test set, and selecting an optimal model.
CN202010223418.4A 2020-03-26 2020-03-26 Communication data-based clue scoring method Pending CN111507751A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010223418.4A CN111507751A (en) 2020-03-26 2020-03-26 Communication data-based clue scoring method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010223418.4A CN111507751A (en) 2020-03-26 2020-03-26 Communication data-based clue scoring method

Publications (1)

Publication Number Publication Date
CN111507751A true CN111507751A (en) 2020-08-07

Family

ID=71875866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010223418.4A Pending CN111507751A (en) 2020-03-26 2020-03-26 Communication data-based clue scoring method

Country Status (1)

Country Link
CN (1) CN111507751A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117827804A (en) * 2024-02-22 2024-04-05 北京仁科互动网络技术有限公司 Thread generation method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100332287A1 (en) * 2009-06-24 2010-12-30 International Business Machines Corporation System and method for real-time prediction of customer satisfaction
CN107944913A (en) * 2017-11-21 2018-04-20 重庆邮电大学 High potential user's purchase intention Forecasting Methodology based on big data user behavior analysis
CN110033294A (en) * 2018-01-12 2019-07-19 腾讯科技(深圳)有限公司 A kind of determination method of business score value, business score value determining device and medium
CN110728145A (en) * 2019-10-11 2020-01-24 集奥聚合(北京)人工智能科技有限公司 Method for establishing natural language understanding model based on recording conversation

Similar Documents

Publication Publication Date Title
CN109522556B (en) Intention recognition method and device
US9014363B2 (en) System and method for automatically generating adaptive interaction logs from customer interaction text
US9477752B1 (en) Ontology administration and application to enhance communication data analytics
CN113468296B (en) Model self-iteration type intelligent customer service quality inspection system and method capable of configuring business logic
CN109255027B (en) E-commerce comment sentiment analysis noise reduction method and device
CN102436483A (en) Video advertisement detecting method based on explicit type sharing subspace
CN110413998B (en) Self-adaptive Chinese word segmentation method oriented to power industry, system and medium thereof
CN112966082A (en) Audio quality inspection method, device, equipment and storage medium
CN111651497A (en) User label mining method and device, storage medium and electronic equipment
CN113626573B (en) Sales session objection and response extraction method and system
CN111782793A (en) Intelligent customer service processing method, system and equipment
CN110765776A (en) Method and device for generating return visit labeling sample data
CN113297365A (en) User intention determination method, device, equipment and storage medium
CN113505606B (en) Training information acquisition method and device, electronic equipment and storage medium
CN110196897A (en) A kind of case recognition methods based on question and answer template
CN110782221A (en) Intelligent interview evaluation system and method
CN111507751A (en) Communication data-based clue scoring method
CN111427996B (en) Method and device for extracting date and time from man-machine interaction text
CN117634471A (en) NLP quality inspection method and computer readable storage medium
CN116828109A (en) Intelligent evaluation method and system for telephone customer service quality
CN113657118B (en) Semantic analysis method, device and system based on call text
CN115658911A (en) Food safety standard associated knowledge map construction method and system
CN113723975A (en) System, method, device, processor and computer readable storage medium for realizing intelligent quality inspection processing in intelligent return visit service
CN114328902A (en) Text labeling model construction method and device
CN113094471A (en) Interactive data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200807
