CN117076612B

CN117076612B - Call center big data text mining system

Info

Publication number: CN117076612B
Application number: CN202311119951.6A
Authority: CN
Inventors: 金笑; 关旭; 马菁; 马小兰; 乔妮; 秦瀚文; 谢亮
Original assignee: Ningxia Hengxin Chuangda Data Technology Co ltd
Current assignee: Ningxia Hengxin Chuangda Data Technology Co ltd
Priority date: 2023-08-31
Filing date: 2023-08-31
Publication date: 2024-02-20
Anticipated expiration: 2043-08-31
Also published as: CN117076612A

Abstract

The invention discloses a call center big data text mining system, which relates to the field of call centers and comprises a data writing end, a data storage end, a retrieval recognition end and a client mobile end. The data writing end comprises a data identification module for distinguishing data types and a storage selection module for selecting the storage destination. The data storage end comprises a distinguishing storage module for classifying different data, a data extraction module for extracting data that meets the search requirements, a data mining module for further extracting data that may meet or assist the search requirements, and a data checking module for removing invalid extracted data. The retrieval recognition end comprises a voice recognition module that converts voice into text and a text translation module that supports conversion between multiple languages. The client mobile end comprises a style recognition module for analyzing the client's data preferences.

Description

Call center big data text mining system

Technical Field

The invention mainly relates to the technical field of call centers, in particular to a big data text mining system of a call center.

Background

A call center is a service mode commonly used by many online enterprises. It strengthens the connection between an enterprise and its clients, provides information services, helps clients solve some product problems online, and is an important part of sales promotion and after-sales service; it can increase client loyalty and improve client retention. As market competition intensifies, improving the business capability of a call center has become an important means of attracting customers. In a traditional call center, however, improving business capability requires staff to understand customer needs thoroughly, and training such staff is costly and slow. Many enterprises now use data search or even AI to improve business capability. As the number of customers grows and enterprises develop, enterprise databases become ever larger, and searching a huge database for data that meets customer needs, or even offering customers effective suggestions, becomes increasingly difficult. A call center big data text mining system is therefore needed to support call center services.

The call center big data text mining system disclosed in application number 201610937056.9 uses data mining algorithms as its core technology, builds business models with independent functions, and establishes a mapping between the business models and the data mining algorithms, thereby realizing a component-based data mining system architecture. Existing successful business models are integrated by means of the data mining algorithms, new business models are developed for specific business applications, and the flexibility with which the system can be combined with applications is increased.

The above patent document can retrieve a large amount of suitable data by establishing a business model of the enterprise, selecting a suitable large database, and applying various big data mining algorithms, but its complex data retrieval mode makes retrieval slow.

Disclosure of Invention

Based on this, the present invention aims to provide a call center big data text mining system, so as to solve the technical problems set forth in the background art.

The system comprises a data writing end, a data storage end, a retrieval recognition end and a client mobile end. The data writing end comprises a data identification module, which uses the C4.5 algorithm to make a basic qualitative determination of the data and judge which type the data belongs to, and a storage selection module, which sends the data to the storage hardware of the corresponding major class according to the type judged by the data identification module. The data storage end is a plurality of storage hardware of the call center, in which a certain number of feature storage spaces are divided according to the different types of features of the data. The data storage end comprises: a distinguishing storage module, which uses the TF-IDF algorithm to extract features from newly input data text, refines the category of the data, and stores the data in the feature storage space corresponding to the refined features in the hardware server; a data extraction module, which combines a decision tree algorithm with the K-Means algorithm to extract the data stored in the corresponding feature storage space that meets the search requirements; a data mining module, which uses the Apriori association algorithm to mine associated data that may have the required features within the whole major class of the same storage hardware, and which can use the extracted associated data to continue extracting secondary associated data in other storage hardware; and a data checking module, which uses a Bloom-filter algorithm to judge the similarity between the primary and secondary associated data and the data first extracted by the data extraction module, and removes the data with lower similarity. The retrieval recognition end comprises a voice recognition module that supports converting voice into text and a text translation module that supports conversion between languages. The client mobile end comprises a style recognition module for analyzing the client's preferences for data and a data pushing module for pushing the client's preferred data.

In the preferred embodiment, the data is classified into major classes by simple summarization and qualitative determination, and the storage selection module stores the classified data on the hardware disk of the corresponding major class, which makes it easy to select the extraction area at extraction time. The TF-IDF algorithm refines the data within a major class into data with a particular feature, and a storage area is selected according to that feature, so that a large amount of data with the same feature can be obtained quickly during extraction. The feature information required by the call center is analyzed, and the classified feature data is extracted by a clustering algorithm; the data extracted at this stage basically meets the requirement. An association algorithm then mines more deeply for data that is not listed in the corresponding feature area but has similar attributes or meaning, providing more comprehensive information. Finally, a similarity algorithm removes the large amount of data that does not have the required features, so that useful data is not buried under useless data.
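The TF-IDF refinement described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names `tf_idf` and `feature_space`, the sample records, and the rule of using the single highest-weighted term as the key of a feature storage space are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def feature_space(weights):
    """Hypothetical routing rule: the highest-weighted term names the feature storage space."""
    return max(weights, key=weights.get)


# Hypothetical records that already belong to one major class.
records = [
    "refund request for damaged router".split(),
    "router firmware upgrade failed".split(),
    "refund delayed after order cancellation".split(),
]
for record, w in zip(records, tf_idf(records)):
    print(feature_space(w), "<-", " ".join(record))
```

A term that occurs in every record receives a weight of zero, so it never selects a storage space; only terms that distinguish a record from its neighbours do.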

Preferably, the voice recognition module uses mature, commercially available voice recognition software to recognize the conversation during a call and convert it into text, and the text translation module uses mature translation software and is suited to serving overseas clients.

Preferably, the style recognition module counts the information types a client prefers from the client's searches. In the preferred embodiment, the style recognition module analyzes the client's data preferences and provides the corresponding data feature modules for the client, so that big data mining services can be provided quickly when the client needs call services, which can improve client satisfaction.
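As a rough illustration of counting preferred information types from a client's searches, the sketch below tallies a search log. The log layout, the `info_type` field and the function name `preferred_types` are hypothetical; the patent does not specify how searches are recorded or labeled.

```python
from collections import Counter


def preferred_types(search_log, top_n=2):
    """Count the information type of each past search and return the most frequent ones."""
    counts = Counter(entry["info_type"] for entry in search_log)
    return [info_type for info_type, _ in counts.most_common(top_n)]


# Hypothetical search history of one client.
search_log = [
    {"query": "router warranty", "info_type": "after-sales"},
    {"query": "refund status", "info_type": "billing"},
    {"query": "replace router", "info_type": "after-sales"},
]
print(preferred_types(search_log))  # ['after-sales', 'billing']
```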

Preferably, the data pushing module pushes suitable data to the client according to the client's data preferences. In the preferred embodiment, the data pushing module provides the client with data that was extracted in other search processes and matches the client's data preferences, so that computing power is saved while the client's data needs are met.

In summary, the invention has the following advantages:

By performing basic major-class classification and the necessary feature partitioning on input data, the system excludes most irrelevant data before data mining begins, which greatly narrows the mining range and greatly improves retrieval speed; the refined features of the input data also ensure that data extracted according to those features is valid data that meets the requirements.

The association mining algorithm supplies additional, more comprehensive required data, while the similarity elimination algorithm prevents the data distortion that mixed-in invalid information would otherwise cause.

By analyzing the client's data preferences, the data mined by the mining algorithm in other search scenarios can be compared with those preferences and the matching data pushed to the corresponding client, so that the mining computing power is used to the fullest while additional services are provided to the client, and unnecessary waste of computing power is avoided.
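The text names a Bloom-filter algorithm for the similarity screening but does not say how similarity is computed from it. One possible reading, offered here only as an assumption, is to load the tokens of the initially extracted data into a Bloom filter and score each associated record by the share of its tokens that the filter reports as probably present. The class `BloomFilter`, the functions `overlap_score` and `screen`, and the 0.5 threshold are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = bytearray(size)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive several bit positions by salting SHA-256 with a seed.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def overlap_score(reference, tokens):
    """Share of tokens that the reference filter reports as probably present."""
    if not tokens:
        return 0.0
    return sum(token in reference for token in tokens) / len(tokens)


def screen(initial_tokens, candidates, threshold=0.5):
    """Keep candidate token lists whose overlap with the initial data meets the threshold."""
    reference = BloomFilter()
    for token in initial_tokens:
        reference.add(token)
    return [c for c in candidates if overlap_score(reference, c) >= threshold]
```

Because a Bloom filter can return false positives, the score is an upper estimate of the true token overlap; a larger bit array lowers that error at the cost of memory.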

Drawings

FIG. 1 is an overall system flow diagram of the present invention;

FIG. 2 is a schematic diagram of a data writing module according to the present invention;

FIG. 3 is a schematic diagram of a data storage module according to the present invention;

FIG. 4 is a schematic diagram of a search recognition module according to the present invention;

FIG. 5 is a schematic diagram of a client mobile terminal module according to the present invention;

FIG. 6 is a schematic diagram of the algorithm support of the present invention.

Description of reference numerals: 10. data writing end; 20. data storage end; 30. retrieval recognition end; 40. client mobile end; 11. data identification module; 12. storage selection module; 21. distinguishing storage module; 22. data extraction module; 23. data mining module; 24. data checking module; 25. storage hardware; 251. feature storage space; 31. voice recognition module; 32. text translation module; 41. style recognition module; 42. data pushing module.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.

Examples

Referring to FIGS. 1, 2, 3, 4 and 6, a call center big data text mining system comprises a data writing end 10, a data storage end 20, a retrieval recognition end 30 and a client mobile end 40. The data writing end 10 comprises a data identification module 11, which uses the C4.5 algorithm to make a basic qualitative determination of the data and judge which type the data belongs to, and a storage selection module 12, which sends the data to the storage hardware 25 of the corresponding major class according to the type judged by the data identification module 11. The data storage end 20 is a plurality of storage hardware 25 of the call center, in which a certain number of feature storage spaces 251 are divided according to the different types of features of the data. The data storage end 20 comprises: a distinguishing storage module 21, which uses the TF-IDF algorithm to extract features from newly input data text, refines the category of the data, and stores the data in the feature storage space 251 corresponding to the refined features in the hardware server; a data extraction module 22, which combines a decision tree algorithm with the K-Means algorithm to extract the data stored in the corresponding feature storage space 251 that meets the search requirements; a data mining module 23, which uses the Apriori association algorithm to mine associated data that may have the required features within the whole major class of the same storage hardware 25, and which can use the extracted associated data to continue extracting secondary associated data in other storage hardware 25; and a data checking module 24, which uses a Bloom-filter algorithm to judge the similarity between the primary and secondary associated data and the data first extracted by the data extraction module 22, and removes the data with lower similarity. The retrieval recognition end 30 comprises a voice recognition module 31 that supports converting voice into text and a text translation module 32 that supports conversion between languages. The voice recognition module 31 uses mature, commercially available voice recognition software to recognize the conversation during a call and convert it into text, and the text translation module 32 uses mature translation software for serving overseas customers.

It should be noted that the data writing end 10 is used to accumulate the raw data of the call center. The C4.5 algorithm sorts the data into the corresponding major classes according to its features. The basis for distinguishing the major classes can be set according to the needs of the call center, on the principle that data in different major classes should have minimal correlation with each other; data with two or more attributes should be copied into the storage hardware 25 of every relevant major class. Primary associated data lies in the same major class as the initially extracted data, so it is generally strongly correlated, is quick to mine, and meets shallow data needs. Because the association algorithm is driven only by the initially extracted data, the data it extracts has a high probability of distortion, so the data checking module 24 is needed to reject data with a high degree of distortion.

Further, the data identification module 11 uses the C4.5 algorithm to make a basic qualitative determination of the data input as required and judges which type the data belongs to.

Further, the storage selection module 12 sends the data to the storage hardware 25 of the corresponding major class according to the data type judged by the C4.5 algorithm.

Further, the distinguishing storage module 21 uses the TF-IDF algorithm to extract the features in the newly input data text, refines the category of the data, and stores the refined data in the feature storage space 251 corresponding to the refined features in the hardware server.
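The major-class judgment above relies on the C4.5 algorithm, whose split criterion is the gain ratio (information gain divided by split information). The sketch below shows only that criterion on categorical attributes; it is not a full C4.5 tree (no recursion, pruning or continuous attributes), and the attribute names and class labels are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, labels, attribute):
    """C4.5 split criterion: information gain divided by split information."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    if split_info == 0:
        return 0.0  # the attribute has a single value and cannot split the data
    return (entropy(labels) - remainder) / split_info


def best_attribute(rows, labels):
    """Choose the attribute with the highest gain ratio as the next split."""
    return max(rows[0], key=lambda a: gain_ratio(rows, labels, a))


# Hypothetical labeled records: which major class does each belong to?
rows = [
    {"channel": "voice", "mentions_price": "yes"},
    {"channel": "voice", "mentions_price": "no"},
    {"channel": "chat", "mentions_price": "yes"},
    {"channel": "chat", "mentions_price": "no"},
]
labels = ["billing", "technical", "billing", "technical"]
print(best_attribute(rows, labels))  # mentions_price
```

In this toy data `mentions_price` separates the two classes perfectly, while `channel` carries no information about the class, so the former is chosen as the split.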

Further, when a customer uses the call service of the call center, the voice recognition module 31 converts the customer's voice into text in real time, and the text translation module 32 translates the text into the appropriate language as required and enters it into the search bar for searching.

Further, the data extraction module 22 combines the decision tree algorithm with the K-Means algorithm to extract the data stored in the corresponding feature storage space 251 that has the required features.

Further, the data mining module 23 uses the Apriori association algorithm to mine associated data that has the required features within the whole major class of the same storage hardware 25.

Further, the servers of the other major classes use the primary associated data to continue extracting secondary associated data through their own data mining modules 23.

Further, the data checking module 24 uses the Bloom-filter algorithm to judge the similarity between the primary and secondary associated data and the data first extracted by the data extraction module 22, eliminates the data with lower similarity, and outputs the remaining data.
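The association mining step in the sequence above names the Apriori algorithm. A compact, level-wise sketch of frequent-itemset mining is given below, treating the feature terms of each stored record as one transaction. The sample transactions and the support threshold are hypothetical, and the sketch stops at frequent itemsets without deriving association rules.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Keep only the candidates contained in at least min_support transactions.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    current = count({frozenset([item]) for item in items})
    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical feature terms of four stored records in one major class.
transactions = [
    {"router", "refund", "warranty"},
    {"router", "warranty"},
    {"router", "refund"},
    {"billing", "refund"},
]
for itemset, support in apriori(transactions, min_support=2).items():
    print(sorted(itemset), support)
```

Under this reading, a record found through the feature "router" would lead the mining module to also consider records carrying "warranty" or "refund", since those pairs occur together often enough to count as associated.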

Referring to FIGS. 1 and 5, the client mobile end 40 comprises a style recognition module 41 for analyzing the client's preferences for data and a data pushing module 42 for pushing data suitable for the client. The style recognition module 41 counts the client's preferred information types from the client's searches, and the data pushing module 42 pushes suitable data to the client according to the client's data preferences.

It should be noted that the client preference data recorded by the style recognition module 41 can be provided to the data pushing module 42 for analysis. While data is being mined, the data pushing module 42 checks the mined data at the same time and selectively pushes it when it matches a client's preferences.

Further, the style recognition module 41 analyzes the client's data preferences from the client's query services at the call center and the client's usual data searches, and sends the preference data to the data pushing module 42.

Further, when the data mining module 23 mines data, the data pushing module 42 compares the mined data with the recorded preferences and sends mined data that is of no use to the current call service to the clients who prefer that data, so as to make full use of the mining computing power.
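The comparison of mined by-product data against client preferences can be sketched as a simple match on information type. The record layout, the customer identifiers and the function name `push_targets` are assumptions; the patent does not define how a preference is represented or matched.

```python
def push_targets(byproducts, preferences):
    """Map each customer to the by-product records whose type matches a recorded preference."""
    pushes = {}
    for customer, liked_types in preferences.items():
        matches = [r["id"] for r in byproducts if r["info_type"] in liked_types]
        if matches:
            pushes[customer] = matches
    return pushes


# Hypothetical records mined during a call but not needed for that call.
byproducts = [
    {"id": "rec-17", "info_type": "after-sales"},
    {"id": "rec-42", "info_type": "billing"},
]
# Hypothetical preference sets produced by the style recognition module.
preferences = {
    "customer-a": {"after-sales"},
    "customer-b": {"promotions"},
}
print(push_targets(byproducts, preferences))  # {'customer-a': ['rec-17']}
```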

The working principle of the invention is as follows:

So that the call service can retrieve data when needed, data input as required is first given a basic qualitative determination by the data identification module 11 using the C4.5 algorithm, which judges which type the data belongs to. The storage selection module 12 sends the data to the storage hardware 25 of the corresponding major class according to the data type judged by the C4.5 algorithm. The distinguishing storage module 21 uses the TF-IDF algorithm to extract the features in the newly input data text, refines the category of the data, and stores the data in the feature storage space 251 corresponding to the refined features in the hardware server.

When a customer uses the call service of the call center, the voice recognition module 31 converts the customer's voice into text in real time, and the text translation module 32 translates the text into the appropriate language as required and enters it into the search bar for searching. The data extraction module 22 combines the decision tree algorithm with the K-Means algorithm to extract the data stored in the corresponding feature storage space 251 that has the required features. The data mining module 23 uses the Apriori association algorithm to mine associated data that has the required features within the whole major class of the same storage hardware 25, and the servers of the other major classes use the primary associated data to continue extracting secondary associated data through their own data mining modules 23. The data checking module 24 uses the Bloom-filter algorithm to judge the similarity between the primary and secondary associated data and the data first extracted by the data extraction module 22, rejects the data with lower similarity, and outputs the remaining data.

The style recognition module 41 analyzes the customer's data preferences from the customer's query services at the call center and the customer's usual data searches, and sends the preference data to the data pushing module 42. When the data mining module 23 mines data, the data pushing module 42 compares the mined data with the recorded preferences and sends mined data that is of no use to the current call service to the clients who prefer that data, so as to make full use of the mining computing power.
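The extraction step in the working principle above combines a decision tree with K-Means, but the text does not specify how the two are combined. The sketch below covers only the clustering half: records are represented as small feature vectors (an assumption), grouped with Lloyd's algorithm from caller-supplied starting centroids, and the cluster nearest to the query vector is returned. All names and sample values are hypothetical.

```python
import math


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids (kept deterministic)."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, clusters


def extract(query, centroids, clusters):
    """Return the cluster whose centroid lies closest to the query vector."""
    best = min(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    return clusters[best]


# Hypothetical 2-D feature vectors, e.g. weights of two refined features.
points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
centroids, clusters = k_means(points, centroids=[(1.0, 0.0), (0.0, 1.0)])
print(extract((1.0, 0.0), centroids, clusters))  # [(0.9, 0.1), (0.8, 0.2)]
```

Fixing the initial centroids keeps the example reproducible; a production system would more likely use randomized or k-means++ initialization.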

The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereto, and any modification made on the basis of the technical scheme according to the technical idea of the present invention falls within the protection scope of the present invention.

Claims

1. A call center big data text mining system, comprising a data writing end (10), a data storage end (20), a retrieval recognition end (30) and a client mobile end (40), characterized in that the data writing end (10) comprises a data identification module (11), which uses the C4.5 algorithm to make a basic qualitative determination of the data and judge which type the data belongs to, and a storage selection module (12), which sends the data to the storage hardware (25) of the corresponding major class according to the type of the data judged by the data identification module (11);

the data storage end (20) is a plurality of storage hardware (25) of the call center, in which a certain number of feature storage spaces (251) are divided according to the different types of features of the data, and comprises: a distinguishing storage module (21), which uses the TF-IDF algorithm to extract features from newly input data text, refines the category of the data, and stores the data in the feature storage space (251) corresponding to the refined features in the hardware server; a data extraction module (22), which combines a decision tree algorithm with the K-Means algorithm to extract the data stored in the corresponding feature storage space (251) that meets the search requirements; a data mining module (23), which uses the Apriori association algorithm to mine associated data that may have the required features within the whole major class of the same storage hardware (25), and which can use the extracted associated data to continue extracting secondary associated data in other storage hardware (25); and a data checking module (24), which uses the Bloom-filter algorithm to judge the similarity between the primary and secondary associated data and the data first extracted by the data extraction module (22), and removes the data with lower similarity;

the retrieval recognition end (30) comprises a voice recognition module (31) capable of supporting voice conversion into text, and a text translation module (32) capable of supporting each language conversion;

the client mobile terminal (40) comprises a style identification module (41) for analyzing preference bias of the client for data, and a data pushing module (42) for pushing preference data suitable for the client.

2. A call center big data text mining system according to claim 1, characterized in that said speech recognition module (31) employs commercially available sophisticated speech recognition software for recognizing talking information during a call and converting it into text information, and said text translation module (32) uses sophisticated translation software adapted to serve overseas clients.

3. A call center big data text mining system according to claim 1, characterized in that the style recognition module (41) counts the information types preferred by a customer from the customer's searches.

4. A call center big data text mining system according to claim 1, characterized in that the data pushing module (42) pushes suitable data to the customer according to the customer's data preferences.