CN114281995A

CN114281995A - Premium payment hastening call extraction and analysis method, device, equipment and medium

Info

Publication number: CN114281995A
Application number: CN202111613750.2A
Authority: CN
Inventors: 顾雷
Original assignee: Aia Life Insurance Co ltd
Current assignee: Aia Life Insurance Co ltd
Priority date: 2021-12-27
Filing date: 2021-12-27
Publication date: 2022-04-05

Abstract

The application provides a method, apparatus, device and medium for extracting and analyzing premium collection call scripts. Text data of premium collection call recordings is obtained and classified so as to select the dialog texts in the "no renewal intention" category; each sentence in those dialog texts is classified by topic so as to analyze the topic missing rate and the differences in topic order between the dialog texts of agents with different success rates; meaningless sentences are deleted, and keywords are extracted from each sentence to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates, and/or the dialog texts of agents with different success rates are scored under each topic so that high-scoring dialog text can be collected. The application makes it easier to locate excellent scripts and wording in the renewal service, thereby improving the success rate of agents' renewal calls.

Description

Method, apparatus, device and medium for extracting and analyzing premium collection call scripts

Technical Field

The application relates to the technical field of data processing, and in particular to a method, apparatus, device and medium for extracting and analyzing premium collection call scripts.

Background

Premium collection is an important part of customer service in the insurance industry. The way an agent communicates, and the skill with which the agent does so, directly affects whether the premium is collected, whether the customer is retained and keeps renewing, and therefore whether the company earns long-term income.

The traditional way of improving call scripts relies on experienced agents: their experience and techniques are summarized and recorded in light of the business logic, and the summary is then taught to other agents through training. However, abstract experience is difficult to summarize and record accurately, so producing the record takes a long time and the resulting summary is imprecise.

With the advancement of science and technology, and particularly with the introduction of natural language processing algorithms, many practitioners have attempted to use such algorithms to discover and summarize successful scripts. The main methods include automatically generating scripts, extracting tags from the content of an entire dialog, and classifying and scoring a dialog to determine whether it was successful. However, owing to the complexity of conversation content and the limitations of the prior art, these methods have various problems in application, such as long construction periods, difficulty in accurately locating successful scripts, difficulty in accurately finding out at which stage an agent's communication goes wrong, and difficulty in finding the differences between the conversations of different agents.

Therefore, how to automatically locate the parts of the conversation that excellent agents handle particularly well, and how to organize and refine successful scripts, determines whether the whole customer service team can raise its premium collection success rate, and is a problem the industry wants to solve.

Disclosure of Invention

In view of the above shortcomings of the prior art, the present application aims to provide a method, an apparatus, a device and a medium for extracting and analyzing premium collection call scripts, so as to solve at least one problem in the extraction and organization of such scripts in the prior art.

To achieve the above and other related objects, the present application provides a method for extracting and analyzing premium collection call scripts, including: acquiring text data of premium collection call recordings, and classifying the text data to select the dialog texts in the "no renewal intention" category; classifying each sentence in the dialog text by topic so as to analyze the topic missing rate and the differences in topic order between the dialog texts of agents with different success rates; and deleting meaningless sentences, then extracting keywords from each sentence to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates, and/or scoring the dialog texts of agents with different success rates under each topic so as to collect high-scoring dialog text.

In an embodiment of the present application, the method for classifying the text data to select the dialog texts in the "no renewal intention" category includes: segmenting training text pre-labeled with different classification labels into words and generating first word vectors using a Bert model; taking the first word vectors as the input of a TextCNN model to train the TextCNN model; and performing a preliminary screening classification of the text data with the trained TextCNN model, so as to select the dialog texts whose classification label is "no renewal intention".

In an embodiment of the present application, the method for classifying the topic of each sentence in the dialog text includes: segmenting each sentence of training text pre-labeled with different topic labels into words and generating second word vectors using a Bert model; feeding the second word vectors, one sentence at a time, into a sentence-level Bi-LSTM model to obtain a sentence vector for each sentence; inputting each sentence vector, together with the identity information of the speaker, into a dialog-level Bi-LSTM model in the order of the conversation to obtain a dialog vector for each sentence; converting the dialog vectors, using a fully connected SoftMax layer, into the probability that each sentence is the starting sentence of a topic paragraph; taking the probabilities as the input of a CRF layer and optimizing them to obtain corrected final probabilities; and training the Bi-LSTM model with the training text pre-labeled with different topic labels, so that the trained Bi-LSTM model classifies each sentence in the dialog text by topic.

In an embodiment of the present application, the topic categories include: opening remarks, stating the reason for the call, asking about the reason for non-renewal, stating the pros and cons, and closing remarks.

In an embodiment of the present application, the method for deleting meaningless sentences includes: segmenting all the dialog texts into words, and using a bag-of-words model to convert each sentence into a vector according to the number of occurrences of each word in it; labeling each sentence as meaningful or meaningless; and building a binary classification model with a logistic regression algorithm from the vectors and the labeling results, and classifying the sentences of all the dialog texts so as to remove the sentences classified as meaningless.

In an embodiment of the present application, the method for scoring the dialog texts of agents with different success rates under each topic so as to collect high-scoring dialog text includes: segmenting each sentence of training text pre-labeled with different score labels into words and generating third word vectors using a Bert model; inputting the third word vectors into a TextCNN model, one paragraph at a time, to obtain a score for each paragraph; and training the TextCNN model with the training text pre-labeled with different score labels, so that the trained TextCNN model scores all the dialog texts.

In an embodiment of the present application, the method includes: determining, according to whether the insurance policy for the current period was paid successfully, whether the call corresponding to the text data resulted in a renewal; determining the success rate of each agent from the number of text data items of that agent and their renewal results; and, when analyzing the dialog texts of agents with different success rates, selecting at least the dialog texts of agents above the average success rate and the dialog texts of agents below the average success rate.

To achieve the above and other related objects, the present application provides an apparatus for extracting and analyzing premium collection call scripts, including: an extraction module, configured to acquire text data of premium collection call recordings and classify the text data to select the dialog texts in the "no renewal intention" category; a classification module, configured to classify each sentence in the dialog text by topic so as to analyze the topic missing rate and the differences in topic order between the dialog texts of agents with different success rates; and an analysis module, configured to delete meaningless sentences, then extract keywords from each sentence to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates, and/or score the dialog texts of agents with different success rates under each topic so as to collect high-scoring dialog text.

To achieve the above and other related objects, the present application provides a computer apparatus, comprising: a memory, and a processor; the memory is to store computer instructions; the processor executes computer instructions to implement the method as described above.

To achieve the above and other related objects, the present application provides a computer readable storage medium storing computer instructions which, when executed, perform the method as described above.

In summary, according to the method, apparatus, device and medium for extracting and analyzing premium collection call scripts of the present application, text data of premium collection call recordings is acquired and classified to select the dialog texts in the "no renewal intention" category; each sentence in the dialog text is classified by topic so as to analyze the topic missing rate and the differences in topic order between the dialog texts of agents with different success rates; and meaningless sentences are deleted, after which keywords are extracted from each sentence to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates, and/or the dialog texts of agents with different success rates are scored under each topic so as to collect high-scoring dialog text.

The application has the following beneficial effects:

It makes it easier to locate excellent scripts and wording in the renewal service, thereby improving the success rate of agents' renewal calls.

Drawings

Fig. 1 is a schematic flow chart of a method for extracting and analyzing premium collection call scripts according to an embodiment of the present application.

Fig. 2 is a schematic diagram of the method for extracting and analyzing premium collection call scripts according to an embodiment of the present application.

Fig. 3 is a flow chart illustrating topic classification of dialog text according to an embodiment of the present application.

Fig. 4 is a schematic block diagram of an apparatus for extracting and analyzing premium collection call scripts according to an embodiment of the present application.

Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application is provided by way of specific examples, and other advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure herein. The present application is capable of other and different embodiments and its several details are capable of modifications and/or changes in various respects, all without departing from the spirit of the present application. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.

It should be noted that the drawings provided in the following embodiments are only schematic and illustrate the basic idea of the present application, and although the drawings only show the components related to the present application and are not drawn according to the number, shape and size of the components in actual implementation, the type, quantity and proportion of the components in actual implementation may be changed at will, and the layout of the components may be more complex.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used in this specification, specify the presence of stated features, steps, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition will occur only when a combination of elements, functions, steps or operations is inherently mutually exclusive in some way.

Traditional script extraction and organization depends mainly on agents summarizing and teaching their own experience, but abstract experience is not precise enough. Existing approaches that extract scripts with natural language processing algorithms cannot accurately locate excellent scripts, cannot accurately find out at which stage an agent's communication goes wrong, and cannot find the differences between the conversations of different agents.

To solve these problems, the application provides a method, apparatus, device and medium for extracting and analyzing premium collection call scripts. The method is aimed mainly at the renewal service of premium collection. It extracts scripts and applies classification, keyword extraction or scoring to comprehensively analyze the differences between the conversations of different agents, so as to find shortcomings or communication errors and to summarize excellent scripts. By continuously improving and expanding a script library, agents with a low success rate can be trained and shown the shortcomings of their scripts on each topic, which improves the renewal success rate.

Fig. 1 is a schematic flow chart of a method for extracting and analyzing premium collection call scripts according to an embodiment of the present application. As shown, the method comprises:

Step S101: acquiring text data of premium collection call recordings, and classifying the text data to select the dialog texts in the "no renewal intention" category.

Simply put, the audio data of outbound premium collection calls is obtained from a database, and the dialog in the audio is converted into text data. In addition, data such as the policy number, the renewal queue number, and the renewal result (whether the renewal succeeded) can be obtained from the database. The result of whether the insurance policy for the current period was paid successfully serves as the reference for the subsequent summarization of scripts.

For ease of understanding, the subsequent method steps may also refer to the method flow diagram as shown in fig. 2.

In an embodiment of the present application, the method for classifying the text data to select the dialog texts in the "no renewal intention" category includes:

A. segmenting training text pre-labeled with different classification labels into words and generating first word vectors using a Bert model;

B. taking the first word vectors as the input of a TextCNN model to train the TextCNN model;

C. performing a preliminary screening classification of the text data with the trained TextCNN model, so as to select the dialog texts whose classification label is "no renewal intention".

In short, the text data is first classified according to the conversation content. Observation shows that actual premium collection calls mainly fall into the following five cases: "the customer says the payment has already been made"; "the conversation is very short or the customer hangs up directly"; "the customer indicates an intention to renew but cannot pay on time for some reason"; "the agent cannot reach the customer and therefore calls the salesperson, who tries to make contact on the agent's behalf"; and "the customer has no intention to renew".

The text data is then labeled to indicate which of the above five categories each text belongs to, and a text classification model is built with Bert + TextCNN to perform a preliminary screening classification of the dialog content.

Bert (Bidirectional Encoder Representations from Transformers) is a pre-trained model proposed by Google in 2018, namely the encoder of a bidirectional Transformer.

The TextCNN model, proposed by Yoon Kim, uses a convolutional neural network to handle NLP problems. Compared with traditional NLP models such as RNN/LSTM, TextCNN can extract important features more efficiently.

It should be noted that, since the goal is to raise agents' collection success rate and to summarize successful scripts, the main task is to win back customers who have no intention to renew and then to mine the excellent scripts used to do so. Therefore only the "no renewal intention" dialogs are selected.

Specifically, to build the text classification model with Bert + TextCNN, the labeled dialog text data is segmented into words and word vectors are generated using the Bert model; the word vectors are used as the input of the TextCNN model, which is trained on the labeled text data; the trained model then performs a preliminary screening classification of all the text data to be analyzed, and the dialogs classified as "no renewal intention" are selected.
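The following is a minimal sketch of the forward pass of a TextCNN-style classifier over word vectors, offered only to illustrate the convolution, max-over-time pooling and SoftMax steps. It is not the implementation of the application: the word vectors here are random stand-ins for the Bert output, the filter and class weights are untrained random values, and the names `conv_max_pool`, `textcnn_forward` and `CATEGORIES` are hypothetical.

```python
import math
import random

# Hypothetical names for the five call categories described above.
CATEGORIES = ["already_paid", "short_or_hung_up", "intends_to_renew",
              "salesperson_contact", "no_renewal_intention"]

def conv_max_pool(embeddings, kernel, bias=0.0):
    """Slide one filter over consecutive tokens, apply ReLU, keep the maximum."""
    width = len(kernel)  # number of consecutive tokens the filter covers
    best = 0.0           # ReLU output is never negative
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        activation = bias + sum(
            w * x
            for kernel_row, token_vec in zip(kernel, window)
            for w, x in zip(kernel_row, token_vec)
        )
        best = max(best, activation)
    return best

def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def textcnn_forward(embeddings, kernels, class_weights):
    """One pooled feature per filter, then a linear layer and softmax."""
    features = [conv_max_pool(embeddings, k) for k in kernels]
    logits = [sum(w * f for w, f in zip(row, features)) for row in class_weights]
    return softmax(logits)

# Assumption: random vectors stand in for Bert word vectors; weights are untrained.
rng = random.Random(0)
DIM, N_TOKENS = 4, 6
embeddings = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(N_TOKENS)]
kernels = [[[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(width)]
           for width in (2, 3)]  # two filters, covering 2 and 3 tokens
class_weights = [[rng.uniform(-1, 1) for _ in kernels] for _ in CATEGORIES]

probs = textcnn_forward(embeddings, kernels, class_weights)
predicted = CATEGORIES[probs.index(max(probs))]
```

In a real system the weights would be learned from the labeled dialogs, and only dialogs predicted as "no renewal intention" would be passed to the later steps.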

Step S102: classifying each sentence in the dialog text by topic so as to analyze the topic missing rate and the differences in topic order between the dialog texts of agents with different success rates.

Preferably, the topic categories include, but are not limited to: opening remarks, stating the reason for the call, asking about the reason for non-renewal, stating the pros and cons, closing remarks, and the like. The main purpose is to divide each "no renewal intention" dialog text into several topic paragraphs so that, together with the final renewal results obtained from the database, the differences between agents with a high renewal success rate and agents with a low renewal success rate can be compared.

For example: a) Compare whether the missing rate of each topic paragraph differs between agents' dialog texts, for example the proportion of each agent's dialog texts that lack the "stating the pros and cons" topic paragraph.

b) Compare whether the main order in which topics are raised differs between agents' dialog texts. For example, the main topic order of agent A is "opening remarks" -> "stating the reason for the call" -> "asking about the reason for non-renewal" -> "stating the pros and cons" -> "closing remarks", whereas the main topic order of agent B is "opening remarks" -> "stating the reason for the call" -> "stating the pros and cons" -> "asking about the reason for non-renewal" -> "closing remarks".

In some examples, statistics on missing topics and on topic order reveal how each topic, or the order of topics, affects the success rate. This makes it possible both to standardize the topics a script requires and their order, and to determine the key topics that affect the success rate for use in the subsequent steps.
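The two comparisons above can be sketched as follows. This is an illustrative sketch only, assuming each call has already been reduced to the list of topic labels of its sentences; the topic identifiers, the helper names `topic_missing_rate`, `collapse` and `main_sequence`, and the sample calls are all hypothetical.

```python
from collections import Counter

def topic_missing_rate(calls, topic):
    """Share of an agent's calls in which `topic` never appears."""
    if not calls:
        return 0.0
    missing = sum(1 for topics in calls if topic not in topics)
    return missing / len(calls)

def collapse(topics):
    """Merge consecutive sentences with the same topic into one topic paragraph."""
    paragraphs = []
    for topic in topics:
        if not paragraphs or paragraphs[-1] != topic:
            paragraphs.append(topic)
    return tuple(paragraphs)

def main_sequence(calls):
    """Most frequent topic order across an agent's calls."""
    counts = Counter(collapse(topics) for topics in calls)
    return counts.most_common(1)[0][0]

# Hypothetical per-sentence topic labels for two agents.
calls_by_agent = {
    "agent_a": [
        ["opening", "reason_for_call", "non_renewal_reason", "non_renewal_reason",
         "pros_and_cons", "closing"],
        ["opening", "reason_for_call", "non_renewal_reason", "pros_and_cons", "closing"],
    ],
    "agent_b": [
        ["opening", "reason_for_call", "pros_and_cons", "non_renewal_reason", "closing"],
        ["opening", "reason_for_call", "closing"],
        ["opening", "reason_for_call", "pros_and_cons", "non_renewal_reason", "closing"],
    ],
}

missing = {agent: topic_missing_rate(calls, "pros_and_cons")
           for agent, calls in calls_by_agent.items()}
orders = {agent: main_sequence(calls) for agent, calls in calls_by_agent.items()}
```

With this sample data, agent B omits the pros-and-cons paragraph in one call out of three and usually raises it before asking about the reason for non-renewal, which is the kind of difference the comparison is meant to surface.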

In an embodiment of the present application, the method for classifying the topic of each sentence in the dialog text includes:

A. segmenting each sentence of training text pre-labeled with different topic labels into words and generating second word vectors using a Bert model;

B. feeding the second word vectors, one sentence at a time, into a sentence-level Bi-LSTM model to obtain a sentence vector for each sentence;

C. inputting each sentence vector, together with the identity information of the speaker, into a dialog-level Bi-LSTM model in the order of the conversation to obtain a dialog vector for each sentence;

D. converting the dialog vectors, using a fully connected SoftMax layer, into the probability that each sentence is the starting sentence of a topic paragraph;

E. taking the probabilities as the input of a CRF layer and optimizing them to obtain corrected final probabilities;

F. training the Bi-LSTM model with the training text pre-labeled with different topic labels, so that the trained Bi-LSTM model classifies each sentence in the dialog text by topic.

As shown in Fig. 3, the specific steps are, briefly, as follows:

a) Segment each sentence of the labeled dialog text into words, and generate word vectors using the Bert model.

b) Feed the word vectors, one sentence at a time, into the sentence-level Bi-LSTM model to obtain a sentence vector for each sentence.

BiLSTM is an abbreviation of Bi-directional Long Short-Term Memory, and is formed by combining a forward LSTM and a backward LSTM. Both LSTM and BiLSTM are often used to model context information in natural language processing tasks.

c) Input the sentence vector and the speaker identity information of each sentence into the dialog-level Bi-LSTM model in the order of the conversation, obtaining one vector that represents each sentence within the conversation.

d) Using a fully connected SoftMax layer, convert the vectors obtained in the previous step into the probability that each sentence is the starting sentence of a certain topic paragraph.

e) Take the probabilities obtained in the previous step as the input of a CRF layer, and optimize them to obtain the corrected final probabilities.

f) Train the model with the labeled text data, and use the trained model to classify the sentences of all the texts to be analyzed by topic.
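How a CRF-style layer can correct per-sentence SoftMax output may be illustrated with Viterbi decoding, sketched below. This is an assumption-laden sketch rather than the trained model of the application: each sentence is labeled directly with a topic index, the per-sentence probabilities are made-up values standing in for the SoftMax output, and the transition scores are hand-set (a trained CRF would learn them) to penalize returning to an earlier topic.

```python
import math

def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions[t][k]   : log-score of label k for sentence t
    transitions[j][k] : log-score of moving from label j to label k
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])
    back_pointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels), key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_j] + transitions[best_j][k] + emissions[t][k])
            pointers.append(best_j)
        score = new_score
        back_pointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical SoftMax output for four sentences over three topics
# (0 = opening, 1 = reason for call, 2 = closing). Sentence 3 is noisy.
softmax_probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.2, 0.7],
]
emissions = [[math.log(p) for p in row] for row in softmax_probs]

# Assumed transition scores: staying or moving forward is free,
# jumping back to an earlier topic is heavily penalized.
transitions = [[0.0 if k >= j else math.log(0.01) for k in range(3)]
               for j in range(3)]

greedy = [row.index(max(row)) for row in softmax_probs]  # per-sentence argmax
corrected = viterbi(emissions, transitions)
```

The per-sentence argmax yields `[0, 1, 0, 2]`, which jumps back to the opening topic in the middle of the call, whereas decoding with the transition scores yields `[0, 1, 1, 2]`.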

Step S103: deleting meaningless sentences, then extracting keywords from each sentence to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates, and/or scoring the dialog texts of agents with different success rates under each topic so as to collect high-scoring dialog text.

In one or more embodiments, the method for deleting meaningless sentences includes:

A. segmenting all the dialog texts into words, and using a bag-of-words model to convert each sentence into a vector according to the number of occurrences of each word in it;

B. labeling each sentence as meaningful or meaningless;

C. building a binary classification model with a logistic regression algorithm from the vectors and the labeling results, and classifying the sentences of all the dialog texts so as to remove the sentences classified as meaningless.

Simply put, the sentences in the "no renewal intention" dialog texts are manually labeled as meaningful or meaningless. A bag-of-words model and a logistic regression algorithm are then used to pick out the meaningful sentences in each "no renewal intention" dialog text and to exclude meaningless sentences such as "OK" and "thank you".

First, all the "no renewal intention" dialog texts are segmented into words, and a bag-of-words model converts each sentence into a vector according to the number of occurrences of each word in it, for example as shown in Table 1:

Table 1: Number of occurrences of each word in a sentence

Policy keeping	Continuous guarantee	Premium fee	Telephone set	Loss of power	Unfortunately	Quota guarantee
2	1	1	0	0	0	1

Then, a binary classification model is built with a logistic regression algorithm from the sentence vectors and the labeling results, all the sentences of the "no renewal intention" dialog texts are classified, and the meaningless sentences are removed.
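The bag-of-words and logistic regression steps can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the model of the application: the sentences are assumed to be already segmented into words, the tiny labeled set is invented for illustration, and the helper names (`build_vocab`, `to_counts`, `train_logreg`, `is_meaningful`) and the learning rate and epoch count are hypothetical choices.

```python
import math

def build_vocab(sentences):
    """Assign each distinct word a column index."""
    vocab = {}
    for tokens in sentences:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab

def to_counts(tokens, vocab):
    """Bag-of-words vector: number of occurrences of each vocabulary word."""
    vector = [0] * len(vocab)
    for token in tokens:
        if token in vocab:
            vector[vocab[token]] += 1
    return vector

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Logistic regression fitted with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - target
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Hypothetical labeled sentences (1 = meaningful, 0 = meaningless).
labeled = [
    (["policy", "renewal", "premium", "policy"], 1),
    (["ok"], 0),
    (["premium", "deducted", "card"], 1),
    (["thank", "you"], 0),
    (["renewal", "keeps", "coverage"], 1),
    (["mm", "ok"], 0),
]
vocab = build_vocab(tokens for tokens, _ in labeled)
X = [to_counts(tokens, vocab) for tokens, _ in labeled]
y = [label for _, label in labeled]
weights, bias = train_logreg(X, y)

def is_meaningful(tokens):
    x = to_counts(tokens, vocab)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= 0.5

dialog = [["ok"], ["policy", "renewal", "premium", "policy"],
          ["thank", "you"], ["premium", "deducted", "card"]]
kept = [tokens for tokens in dialog if is_meaningful(tokens)]
```

The first training sentence yields the counts 2, 1, 1 for "policy", "renewal" and "premium", in the same spirit as Table 1; after filtering, only the two content-bearing sentences remain in `kept`.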

On the one hand, after the meaningless sentences are deleted, keywords are extracted from each sentence so as to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates.

Preferably, in the dialog text of each topic from which meaningless sentences have been excluded, keywords are extracted from what the agent says using the TextRank algorithm. TextRank is a keyword extraction algorithm that does not depend on a background corpus: it can extract the keywords of a document by analyzing that single document alone. For example, keywords are extracted from the sentences remaining in each topic paragraph of each call.
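A compact sketch of TextRank keyword extraction is given below for illustration. It assumes the text is already segmented and filtered, builds an undirected co-occurrence graph with a small window, and ranks words with a PageRank-style iteration; the function name `textrank_keywords`, the window size, damping factor, iteration count and sample tokens are all assumptions made for this sketch.

```python
from collections import defaultdict

def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by PageRank on an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        # Link each word to the words that follow it within the window.
        for other in tokens[i + 1:i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Each word receives an equal share of every neighbor's score.
        scores = {
            word: (1 - damping) + damping * sum(
                scores[n] / len(neighbors[n]) for n in neighbors[word]
            )
            for word in neighbors
        }
    # Sort by score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda word: (-scores[word], word))
    return ranked[:top_k]

# Hypothetical segmented agent utterances within one topic paragraph.
tokens = ["premium", "card", "premium", "renewal",
          "premium", "coverage", "premium", "deadline"]
keywords = textrank_keywords(tokens)
```

Because "premium" co-occurs with every other word in this sample, it is ranked first; comparing such keyword lists between agents with high and low success rates is one way to examine differences in wording on the same topic.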

On the other hand, after the meaningless sentences are deleted, the dialog texts of agents with different success rates are scored under each topic so as to collect high-scoring dialog text.

Preferably, the steps specifically include:

A. segmenting each sentence of training text pre-labeled with different score labels into words and generating third word vectors using a Bert model;

B. inputting the third word vectors into a TextCNN model, one paragraph at a time, to obtain a score for each paragraph;

C. training the TextCNN model with the training text pre-labeled with different score labels, so that the trained TextCNN model scores all the dialog texts.

Briefly, what the agent says in the key topics of the dialog text is labeled with a score (e.g., 1-5 points). A text scoring model is then built with the Bert + TextCNN method to identify high-scoring dialog content within the key topics. The specific steps are as follows:

a) Segment each sentence of the score-labeled text into words, and generate word vectors using the Bert model.

b) Input the word vectors into the TextCNN model, one paragraph at a time, to obtain the score (e.g., 1-5 points) of each paragraph.

c) Train the model with the labeled text data, and use the trained model to score all the texts to be scored.
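Once each topic paragraph has a score, collecting the high-scoring text per topic is a simple grouping step, sketched below. The record layout, the function name `collect_high_scores`, the threshold of 4 points and the sample paragraphs are hypothetical; the scores themselves would come from the trained scoring model.

```python
from collections import defaultdict

def collect_high_scores(paragraphs, threshold=4):
    """Group paragraphs scoring at least `threshold` by topic, best first."""
    by_topic = defaultdict(list)
    for paragraph in paragraphs:
        if paragraph["score"] >= threshold:
            by_topic[paragraph["topic"]].append(paragraph)
    return {
        topic: [p["text"] for p in sorted(items, key=lambda p: -p["score"])]
        for topic, items in by_topic.items()
    }

# Hypothetical scored paragraphs (scores on the 1-5 scale mentioned above).
scored = [
    {"agent": "agent_a", "topic": "reason_for_call", "score": 5, "text": "script A1"},
    {"agent": "agent_b", "topic": "reason_for_call", "score": 2, "text": "script B1"},
    {"agent": "agent_c", "topic": "reason_for_call", "score": 4, "text": "script C1"},
    {"agent": "agent_a", "topic": "pros_and_cons", "score": 4, "text": "script A2"},
    {"agent": "agent_b", "topic": "pros_and_cons", "score": 3, "text": "script B2"},
]

library = collect_high_scores(scored)
```

The resulting dictionary maps each topic to its high-scoring paragraphs, which could then be reviewed and added to a script library.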

In addition, the method comprises: determining, according to whether the insurance policy for the current period was paid successfully, whether the call corresponding to the text data resulted in a renewal; determining the success rate of each agent from the number of text data items of that agent and their renewal results; and, when analyzing the dialog texts of agents with different success rates, selecting at least the dialog texts of agents above the average success rate and the dialog texts of agents below the average success rate.

In brief, the renewal result of each text data item is first determined from the database according to whether the insurance policy for the current period was paid successfully. The renewal success rate of each agent can then be determined from the number of that agent's "no renewal intention" conversations and their renewal results. The conversations of agents with a high renewal success rate have higher reference value, and extracting and summarizing them can raise the overall level of the scripts. For example, keywords may be extracted on the one hand, and on the other hand the higher-scoring dialog text for each topic may be collected as the excellent scripts for that topic. The conversations of agents with a low success rate also have reference value: comparison reveals which scripts are not good enough, and agents can be reminded in training to avoid them.
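The success-rate computation and the split around the average can be sketched as follows. This sketch assumes that the "average" is the mean of the per-agent success rates, which is one possible reading; the function names `success_rates` and `split_by_average` and the sample call records are hypothetical.

```python
from collections import defaultdict

def success_rates(calls):
    """Renewal success rate per agent from (agent, renewed) records."""
    totals, renewed_counts = defaultdict(int), defaultdict(int)
    for agent, renewed in calls:
        totals[agent] += 1
        renewed_counts[agent] += int(renewed)
    return {agent: renewed_counts[agent] / totals[agent] for agent in totals}

def split_by_average(rates):
    """Agents above and below the mean of the per-agent rates (an assumption)."""
    average = sum(rates.values()) / len(rates)
    above = sorted(agent for agent, rate in rates.items() if rate > average)
    below = sorted(agent for agent, rate in rates.items() if rate < average)
    return average, above, below

# Hypothetical "no renewal intention" calls: (agent, policy paid this period?).
calls = [
    ("agent_a", True), ("agent_a", True), ("agent_a", True), ("agent_a", False),
    ("agent_b", False), ("agent_b", True), ("agent_b", False), ("agent_b", False),
    ("agent_c", True), ("agent_c", False), ("agent_c", True), ("agent_c", False),
]

rates = success_rates(calls)
average, above, below = split_by_average(rates)
```

The dialog texts of the agents in `above` and `below` would then be the two groups compared in the topic, keyword and scoring analyses.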

In the application, by collecting the high-scoring paragraphs, the highlights of high-scoring scripts are extracted and summarized, and the script library is continuously improved and expanded. Meanwhile, agents with a low success rate are trained and shown the shortcomings of their scripts on each topic, so as to improve the renewal success rate.

For example, summarizing the high-scoring paragraphs shows that, in the "stating the reason for the call" paragraph, excellent agents generally say something like "Your policy payment did not go through, did it?" and then ask with concern, "Have you deposited the premium in the card?" Agents with a low success rate, by contrast, often only say "You have a critical illness policy that has not been renewed yet." By comparing the two and adding the successful cases to the script library, the scripts of agents with a low success rate can be improved and the renewal success rate raised.

In one or more embodiments, by continuously summarizing and accumulating successful scripts in the premium collection service, comparison shows that agents with a higher success rate generally use more professional yet easily understood language to win the customer back, and offer the customer several options. As a result of implementing the application, the overall premium renewal success rate in the first half of 2021 increased by 6 percent compared with the same period of 2020. Excellent scripts and wording in the renewal service can thus be located more easily, and the success rate of agents' renewal calls improved.

Fig. 4 is a schematic block diagram of an apparatus for extracting and analyzing premium collection call scripts according to an embodiment of the present application. As shown, the apparatus 400 includes:

an extraction module 401, configured to acquire text data of premium collection call recordings and classify the text data to select the dialog texts in the "no renewal intention" category;

a classification module 402, configured to classify each sentence in the dialog text by topic so as to analyze the topic missing rate and the differences in topic order between the dialog texts of agents with different success rates;

an analysis module 403, configured to delete meaningless sentences, then extract keywords from each sentence to analyze how keyword usage within the same topic differs between the dialog texts of agents with different success rates, and/or score the dialog texts of agents with different success rates under each topic so as to collect high-scoring dialog text.

It should be noted that, because the contents of information interaction, execution process, and the like between the modules/units of the apparatus are based on the same concept as the method embodiment described in the present application, the technical effect brought by the contents is the same as the method embodiment of the present application, and specific contents may refer to the description in the foregoing method embodiment of the present application, and are not described herein again.

It should be further noted that the division of the above apparatus into modules is only a division by logical function; in an actual implementation, the modules may be wholly or partially integrated into one physical entity or may be physically separate. The modules may all be implemented as software invoked by a processing element, or all be implemented in hardware; alternatively, some modules may be implemented as software invoked by a processing element and the others in hardware. For example, a given module may be a separately provided processing element, may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, with a processing element of the apparatus calling and executing the functions of that module. The other modules are implemented similarly. In addition, all or some of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be carried out by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.

For example, the above modules may be one or more integrated circuits configured to implement the above method, such as one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). As another example, when one of the above modules is implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. As a further example, these modules may be integrated together and implemented in the form of a System-on-a-Chip (SoC).

Fig. 5 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown, the computer device 500 includes: a memory 501 and a processor 502; the memory 501 is used for storing computer instructions; the processor 502 executes computer instructions to implement the method described in fig. 1.

In some embodiments, the computer device 500 may include one or more memories 501 and one or more processors 502; Fig. 5 shows one example.

In an embodiment of the present application, the processor 502 in the computer device 500 loads one or more instructions corresponding to processes of an application program into the memory 501 according to the steps described in fig. 1, and the processor 502 executes the application program stored in the memory 501, thereby implementing the method described in fig. 1.

The memory 501 may include a Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk memory. The memory 501 stores an operating system and operating instructions, executable modules or data structures, or a subset or an extended set thereof, wherein the operating instructions may include various instructions for implementing various operations. The operating system may include various system programs for implementing basic services and for handling hardware-based tasks.

The processor 502 may be a general-purpose processor, such as a Central Processing Unit (CPU) or a Network Processor (NP); it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In some specific applications, the various components of the computer device 500 are coupled together by a bus system, which may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For clarity of illustration, however, the various buses are collectively referred to as the bus system in Fig. 5.

In an embodiment of the present application, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the method described in fig. 1.

The present application may be embodied as a system, a method, and/or a computer program product at any possible level of technical detail. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement various aspects of the present application.

The computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for carrying out operations of the present application may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, integrated circuit configuration data, or source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, or the like, and procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), is personalized using state information of the computer-readable program instructions, and the electronic circuitry can then execute the computer-readable program instructions to implement aspects of the present application.

The application effectively overcomes various defects in the prior art and has high industrial utilization value.

The above embodiments merely illustrate the principles and effects of the present application and are not intended to limit it. Any person skilled in the art can modify or change the above embodiments without departing from the spirit and scope of the present application. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical ideas disclosed in the present application shall be covered by the claims of the present application.

Claims

1. A premium payment hastening call extraction and analysis method, the method comprising:

acquiring text data of premium payment hastening call recordings, and classifying the text data to select the dialog texts classified as showing no willingness to renew;

classifying the topic of each sentence in the dialog text, so as to analyze the topic missing rate and the topic order differences among the dialog texts of agents with different success rates;

and deleting meaningless sentences, extracting keywords from each sentence to analyze how keyword usage under the same topic differs among the dialog texts of agents with different success rates, and/or scoring the dialog texts of agents with different success rates under each topic to collect high-scoring dialog texts.
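
For illustration only, the topic missing rate and topic order comparison recited above might be computed as in the following Python sketch once every sentence carries a topic label. The label strings, the helper names `topic_missing_rate` and `topic_order`, and the sample dialogs are hypothetical; the labels merely stand in for the five topic categories listed in claim 4.

```python
# Hypothetical identifiers for the five topic categories.
TOPICS = ["opening", "call_reason", "non_renewal_reason", "pros_cons", "closing"]

def topic_missing_rate(dialogs, topics=TOPICS):
    """dialogs: list of dialogs, each a list of per-sentence topic labels.

    Returns {topic: fraction of dialogs in which the topic never appears}.
    """
    n = len(dialogs)
    if n == 0:
        return {t: 0.0 for t in topics}
    return {t: sum(t not in set(d) for d in dialogs) / n for t in topics}

def topic_order(dialog):
    """Order in which topics first appear in one dialog."""
    seen = []
    for label in dialog:
        if label not in seen:
            seen.append(label)
    return seen

# Toy labeled dialogs for agents above and below the average success rate.
high_dialogs = [
    ["opening", "call_reason", "non_renewal_reason", "pros_cons", "closing"],
    ["opening", "call_reason", "call_reason", "pros_cons", "closing"],
]
low_dialogs = [
    ["opening", "pros_cons", "closing"],
    ["call_reason", "opening", "closing"],
]
high_missing = topic_missing_rate(high_dialogs)
low_missing = topic_missing_rate(low_dialogs)
low_order = topic_order(low_dialogs[1])
```

Comparing `high_missing` with `low_missing` shows which topics the lower-success group tends to skip, and `topic_order` exposes dialogs whose topics appear in an unusual sequence.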

2. The method of claim 1, wherein classifying the text data to select the dialog texts classified as showing no willingness to renew comprises:

segmenting a training text pre-labeled with different classification labels into words, and generating first word vectors using a BERT model;

taking the first word vectors as the input of a TextCNN model to train the TextCNN model;

and performing preliminary screening and classification of the text data with the trained TextCNN model, so as to select the dialog texts whose classification label is no willingness to renew.

3. The method of claim 1, wherein classifying the topic of each sentence in the dialog text comprises:

segmenting each sentence of a training text pre-labeled with different topic labels into words, and generating second word vectors using a BERT model;

inputting the second word vectors, grouped by sentence, into a sentence-level Bi-LSTM model to obtain a sentence vector for each sentence;

inputting each sentence vector, together with the speaker identity information, into a Bi-LSTM model in dialog order to obtain a conversation vector corresponding to each sentence;

converting the conversation vector, using a fully connected SoftMax layer, into the probability that each sentence is the starting sentence of a topic paragraph;

taking the probability as the input of a CRF layer and optimizing the probability value to obtain a corrected final probability;

and training the Bi-LSTM model using the training text pre-labeled with different topic labels, so that the trained Bi-LSTM model classifies the topic of each sentence in the dialog text.
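
To illustrate why a CRF layer is placed after the SoftMax layer, the following Python sketch applies Viterbi decoding, one common way of using a linear-chain CRF at inference time, to per-sentence start probabilities. This is a simplification under stated assumptions: the function name `viterbi_topic_starts`, the sample probabilities, and the transition scores are hypothetical (in a trained CRF the transition scores are learned), and Viterbi returns the best label sequence rather than the corrected probabilities recited above.

```python
import math

def viterbi_topic_starts(start_probs, transition):
    """Pick the best start / non-start label sequence for a dialog.

    start_probs: per-sentence probability that the sentence begins a new
                 topic paragraph (the SoftMax output).
    transition:  2x2 list; transition[i][j] is the score for label i being
                 followed by label j (0 = continuation, 1 = topic start).
    """
    eps = 1e-12
    # Emission scores: (log P(continuation), log P(start)) for each sentence.
    emit = [(math.log(max(1 - p, eps)), math.log(max(p, eps))) for p in start_probs]
    if not emit:
        return []
    score = list(emit[0])
    back = []
    for e in emit[1:]:
        new_score, ptr = [], []
        for j in (0, 1):
            cands = [score[i] + transition[i][j] for i in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new_score.append(cands[best] + e[j])
        back.append(ptr)
        score = new_score
    # Trace back from the best final label.
    label = 0 if score[0] >= score[1] else 1
    path = [label]
    for ptr in reversed(back):
        label = ptr[label]
        path.append(label)
    return path[::-1]

probs = [0.9, 0.6, 0.1, 0.8]            # hypothetical SoftMax outputs
no_penalty = [[0.0, 0.0], [0.0, 0.0]]   # labels chosen independently
penalty = [[0.0, 0.0], [0.0, -2.0]]     # discourage two starts in a row
independent_labels = viterbi_topic_starts(probs, no_penalty)
corrected_labels = viterbi_topic_starts(probs, penalty)
```

With no transition penalty the second sentence is marked as a topic start because its probability exceeds 0.5; with the penalty on consecutive starts, the sequence-level decision treats it as a continuation of the first topic.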

4. The method of claim 1 or 3, wherein the topic categories comprise: opening remarks, stating the reason for the call, asking about the reason for not renewing, stating the pros and cons, and closing remarks.

5. The method of claim 1, wherein deleting meaningless sentences comprises:

segmenting all the dialog texts into words, and converting each sentence into a vector according to the number of occurrences of each word in the sentence using a bag-of-words model;

labeling each sentence as meaningful or meaningless;

and constructing a binary classification model with a logistic regression algorithm from the vectors and the labeling results of the dialog texts, and classifying the sentences of all the dialog texts so as to remove the sentences classified as meaningless.
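
The steps above can be sketched in plain Python as a bag-of-words vectorizer followed by a small logistic regression trained with stochastic gradient descent. This is a toy illustration, not the applicant's implementation: the helper names, the English sample sentences, the learning rate, and the epoch count are all assumptions, and a production system would more likely rely on an established machine learning library.

```python
import math

def build_vocab(sentences):
    """Map each distinct word to a column index."""
    vocab = {}
    for s in sentences:
        for w in s:
            vocab.setdefault(w, len(vocab))
    return vocab

def bag_of_words(sentence, vocab):
    """Count occurrences of each vocabulary word; unknown words are ignored."""
    vec = [0.0] * len(vocab)
    for w in sentence:
        if w in vocab:
            vec[vocab[w]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def is_meaningful(sentence, vocab, w, b):
    x = bag_of_words(sentence, vocab)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5

# Hypothetical segmented sentences: label 0 = meaningless, 1 = meaningful.
train = [["um", "ok"], ["yes", "right"], ["hello", "hi"],
         ["policy", "lapse"], ["pay", "installment"], ["premium", "due"]]
labels = [0, 0, 0, 1, 1, 1]
vocab = build_vocab(train)
w, b = train_logistic_regression([bag_of_words(s, vocab) for s in train], labels)

dialog = [["um", "hi"], ["policy", "due"], ["yes", "ok"]]
kept = [s for s in dialog if is_meaningful(s, vocab, w, b)]
```

Only the sentence carrying policy-related words survives the filter; the filler sentences are removed before keyword extraction.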

6. The method of claim 1, wherein scoring the dialog texts of agents with different success rates under each topic to collect high-scoring dialog texts comprises:

segmenting each sentence of a training text pre-labeled with different score labels into words, and generating third word vectors using a BERT model;

inputting the third word vectors, grouped by paragraph, into a TextCNN model to obtain a score for each paragraph;

and training the TextCNN model using the training text pre-labeled with different score labels, so as to score all the dialog texts with the trained TextCNN model.

7. The method according to claim 1, wherein the method further comprises:

determining whether the policy corresponding to each piece of text data has been renewed according to whether the premium of the insurance policy has been successfully paid as of the current date;

determining the success rate of each agent according to the number of pieces of text data of the agent and the corresponding renewal results;

and analyzing the dialog texts of agents with different success rates, selecting at least the dialog texts of agents above the average success rate and the dialog texts of agents below the average success rate.
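
A minimal Python sketch of the success rate computation and the split around the average is given below. The names `agent_success_rates` and `split_by_average` and the sample call records are hypothetical, and the average is assumed here to be the mean of the per-agent rates; the claim does not say whether an agent-level or call-level average is intended.

```python
from collections import defaultdict

def agent_success_rates(calls):
    """calls: iterable of (agent_id, renewed) pairs, one pair per call text.

    Returns {agent_id: renewed calls / total calls}.
    """
    totals = defaultdict(int)
    renewed = defaultdict(int)
    for agent_id, was_renewed in calls:
        totals[agent_id] += 1
        renewed[agent_id] += int(was_renewed)
    return {a: renewed[a] / totals[a] for a in totals}

def split_by_average(rates):
    """Return the mean rate and the agents strictly above and below it."""
    mean = sum(rates.values()) / len(rates)
    above = sorted(a for a, r in rates.items() if r > mean)
    below = sorted(a for a, r in rates.items() if r < mean)
    return mean, above, below

# Hypothetical call records: (agent, whether the policy was later renewed).
calls = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False),
    ("C", False), ("C", False), ("C", True), ("C", False),
]
rates = agent_success_rates(calls)
mean_rate, above_avg, below_avg = split_by_average(rates)
```

The dialog texts of the agents in `above_avg` and `below_avg` would then be passed to the topic and keyword comparisons.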

8. A premium payment hastening call extraction and analysis device, the device comprising:

an extraction module, configured to acquire text data of premium payment hastening call recordings and classify the text data to select the dialog texts classified as showing no willingness to renew;

a classification module, configured to classify the topic of each sentence in the dialog text, so as to analyze the topic missing rate and the topic order differences among the dialog texts of agents with different success rates;

and an analysis module, configured to delete meaningless sentences, extract keywords from each sentence to analyze how keyword usage under the same topic differs among the dialog texts of agents with different success rates, and/or score the dialog texts of agents with different success rates under each topic to collect high-scoring dialog texts.

9. A computer device, the device comprising: a memory and a processor; the memory is configured to store computer instructions; the processor executes the computer instructions to implement the method of any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon computer instructions which, when executed, perform the method of any one of claims 1 to 7.