Disclosure of Invention
The invention aims to provide a method, a device and a storage medium for expanding a user complaint database, which are used for mining potential complaint users and expanding the user complaint database by using signaling information and label information of the potential complaint users so as to improve the accuracy of network quality evaluation.
In a first aspect, an embodiment of the present invention provides a method for expanding a user complaint database, including:
acquiring communication service data of users without complaint records and users with complaint records;
backtracking the signaling information of the users in a cell based on the communication service data, and analyzing the signaling information to obtain analysis results corresponding to different signaling analysis indexes;
sequencing the analysis results generated by the communication service of each user according to the time sequence;
performing sample reconstruction on the sequenced analysis results by adopting a sliding window method to obtain a reconstructed sample;
extracting signaling features of the reconstructed samples based on deep learning;
determining potential complaint users according to the signaling characteristics of the reconstructed sample and the signaling characteristics of the users with complaint records by a K-NN algorithm;
and adding the signaling information and the label information of the potential complaint users to a user complaint database.
Further, the determining of potential complaint users according to the signaling characteristics of the reconstructed sample and the signaling characteristics of the users with complaint records by the K-NN algorithm specifically includes:
constructing neighborhoods through the K-NN algorithm, taking the signaling characteristics of the user with the complaint record as the center of a neighborhood graph and the signaling characteristics of the reconstructed sample as branches and leaves, to construct the neighborhood graph;
calculating a comprehensive value of the signaling characteristics of the reconstructed sample according to different weights of each signaling analysis index;
pruning the neighborhood graph according to the comprehensive value of the signaling characteristics of the reconstructed sample and a preset threshold value;
and determining potential complaint users according to the pruned neighborhood map.
Further, the obtaining of the communication service data of the user without the complaint record and the user with the complaint record specifically includes:
and acquiring communication service data of the VIP users without the complaint records and the users with the complaint records.
Further, the signaling information includes link setup time and radio link quality.
Further, the signaling analysis indicator includes: the success rate of Attach, the success rate of E-RAB establishment, the success rate of Service request, the success rate of Paging, the success rate of redirection, the success rate of context abnormal release, the success rate of TAU, the success rate of cell switching, the success rate of ping-pong TAU and the success rate of ping-pong cell switching.
Further, the signaling analysis indicator further includes: DNS access success rate, DNS time delay, TCP 1/3-way handshake success rate, TCP 1/3-way handshake delay, TCP 2/3-way handshake success rate, TCP 2/3-way handshake success delay, ISTP 1/4-way handshake success rate, ISTP 1/4-way handshake delay, ISTP 2/4-way handshake success rate, ISTP 2/4-way handshake delay, ISTP 3/4-way handshake success rate, ISTP 3/4-way handshake delay, HTTP response success rate, HTTP small session response delay, and HTTP large session response delay.
In a second aspect, an embodiment of the present invention provides an expansion apparatus for a customer complaint database, including:
the communication service data acquisition module is used for acquiring communication service data of users without complaint records and users with complaint records;
the backtracking module is used for backtracking the signaling information of the users in a cell based on the communication service data, and analyzing the signaling information to obtain analysis results corresponding to different signaling analysis indexes;
the sequencing module is used for sequencing the analysis result generated by the communication service of each user according to the time sequence;
the sample reconstruction module is used for performing sample reconstruction on the sequenced analysis results by adopting a sliding window method to obtain reconstructed samples;
the characteristic extraction module is used for extracting the signaling characteristics of the reconstructed sample based on deep learning;
the potential complaint user determining module is used for determining potential complaint users according to the signaling characteristics of the reconstructed sample and the signaling characteristics of the users with complaint records by a K-NN algorithm;
and the adding module is used for adding the signaling information and the label information of the potential complaint users into a user complaint database.
Further, the module for determining potential complaint users specifically includes:
the neighborhood graph construction unit is used for constructing neighborhoods through a K-NN algorithm, taking the signaling characteristics of the user with the complaint record as the center of a neighborhood graph and the signaling characteristics of the reconstructed sample as branches and leaves, to construct the neighborhood graph;
the comprehensive value calculating unit is used for calculating the comprehensive value of the signaling characteristics of the reconstructed sample according to different weights of each signaling analysis index;
the pruning unit is used for pruning the neighborhood graph according to the comprehensive value of the signaling characteristics of the reconstructed sample and a preset threshold value;
and the potential complaint user determining unit is used for determining potential complaint users according to the pruned neighborhood map.
Further, the obtaining of the communication service data of the user without the complaint record and the user with the complaint record specifically includes:
and acquiring communication service data of the VIP users without the complaint records and the users with the complaint records.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where, when the computer program runs, the apparatus where the computer-readable storage medium is located is controlled to execute the method for expanding a user complaint database as described above.
Compared with the prior art, the embodiments of the present invention acquire communication service data of users without complaint records and users with complaint records; backtrack the signaling information of the users in a cell based on the communication service data and analyze it to obtain analysis results corresponding to different signaling analysis indexes; sort the analysis results generated by the communication service of each user in time order; perform sample reconstruction on the sorted analysis results by a sliding window method to obtain reconstructed samples; extract signaling characteristics of the reconstructed samples based on deep learning; determine potential complaint users by a K-NN algorithm according to the signaling characteristics of the reconstructed samples and the signaling characteristics of the users with complaint records; and add the signaling information and the label information of the potential complaint users to a user complaint database. Mining the potential complaint users and expanding the user complaint database with their signaling information and label information can therefore improve the accuracy of end-to-end network service quality evaluation based on user complaint data.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be understood that the step numbers used herein are for convenience of description only and are not intended as limitations on the order in which the steps are performed.
It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The terms "comprises" and "comprising" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.
Example 1:
Referring to FIG. 1, an embodiment of the present invention provides a method for expanding a user complaint database, including:
S1, acquiring the communication service data of the users without complaint records and the users with complaint records.
S2, backtracking the signaling information of the users based on the communication service data, and analyzing the signaling information to obtain the analysis results corresponding to different signaling analysis indexes.
In this embodiment of the present invention, the signaling information includes, for example, link setup time and radio link quality. Analyzing the signaling information yields analysis results corresponding to different signaling analysis indexes. For example, if analysis of the link setup time shows that the link setup timed out, the Attach attempt is known to have failed, and the Attach success rate can be calculated. Radio link quality is evaluated with indexes such as RSCP (Received Signal Code Power, which can be understood as the signal strength of the pilot channel received by the terminal UE) and Ec/Io. For example, if RSCP < -105 dBm, a call drop or handover failure is likely to occur, and the handover success rate can be calculated.
Specifically, as shown in Table 1, the signaling analysis indicators include the following. Wireless side: the success rate of Attach, the success rate of E-RAB establishment, the success rate of Service request, the success rate of Paging, the success rate of redirection, the success rate of context abnormal release, the success rate of TAU, the success rate of cell switching, the success rate of ping-pong TAU and the success rate of ping-pong cell switching. Core network side: the DNS access success rate, the DNS time delay, the TCP1/3 handshake success rate, the TCP1/3 handshake delay, the TCP2/3 handshake success rate, the TCP2/3 handshake success delay, the ISCTP 1/4 handshake success rate, the ISCTP 1/4 handshake delay, the ISCTP 2/4 handshake success rate, the ISCTP 2/4 handshake delay, the ISCTP 3/4 handshake success rate, the ISCTP 3/4 handshake delay, the HTTP response success rate, the HTTP small session response delay and the HTTP large session response delay.
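By way of non-limiting illustration, the calculation of the Attach success rate from link setup times described above may be sketched as follows. The timeout value `ATTACH_TIMEOUT_S` and the function name `attach_success_rate` are hypothetical and are not specified in this embodiment; an attempt is simply assumed to have failed when its link setup time exceeds the timeout.

```python
from typing import Sequence

# Hypothetical timeout in seconds; the embodiment does not state a value.
ATTACH_TIMEOUT_S = 5.0


def attach_success_rate(setup_times_s: Sequence[float],
                        timeout_s: float = ATTACH_TIMEOUT_S) -> float:
    """Return the fraction of Attach attempts whose link setup did not time out."""
    if not setup_times_s:
        raise ValueError("no Attach attempts in this period")
    # An attempt counts as successful when setup finished within the timeout.
    successes = sum(1 for t in setup_times_s if t <= timeout_s)
    return successes / len(setup_times_s)


# Four attempts in one period; the third one exceeds the assumed timeout.
rate = attach_success_rate([0.8, 1.2, 6.5, 0.9])
print(rate)
```

The other success-rate indicators could be derived in the same manner from their own success criteria.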
S3, sorting the analysis results generated by the communication service of each user in time order, for example as shown in Table 1.
TABLE 1
S4, performing sample reconstruction on the sorted analysis results by adopting a sliding window method to obtain a reconstructed sample, for example as shown in FIG. 2.
It should be noted that, given the complexity of the wireless network environment and the lag of user complaints, the signaling analysis data is reconstructed and the reconstructed data is taken as a sample, so that each sample has temporal continuity.
Preferably, the sliding window size is 6 and the step size is 3.
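By way of non-limiting illustration, the sliding-window reconstruction with a window size of 6 and a step size of 3 may be sketched as follows. The function name `sliding_window_samples` is hypothetical, and discarding a trailing incomplete window is an assumption that this embodiment does not specify.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_window_samples(records: Sequence[T],
                           window: int = 6,
                           step: int = 3) -> List[List[T]]:
    """Cut a time-ordered sequence into overlapping samples of `window` records."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Assumption: a trailing window with fewer than `window` records is dropped.
    return [list(records[i:i + window])
            for i in range(0, len(records) - window + 1, step)]


# Twelve time-ordered analysis results, represented here by their period index.
periods = list(range(12))
samples = sliding_window_samples(periods)
print(samples)
```

With a step size smaller than the window size, consecutive samples share three periods, which is what gives each reconstructed sample its temporal continuity.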
S5, extracting the signaling characteristics of the reconstructed sample based on deep learning.
It should be noted that, because wireless signal propagation is affected by various interference and random factors, signal fluctuation and the like during information acquisition introduce a certain error into the acquired signaling information, for example as shown in Table 2.
TABLE 2
Attach success rate | E-RAB establishment success rate
90% | 99%
75% | 78%
90% | 99%
96% | 99%
95% | 98%
95% | 98%
In Table 2, the second data point is abnormal, while the subsequent data are all fairly normal. This is probably because certain terminals have access problems (for example, the "Antennagate" issue on the iPhone, or a terminal that, as held by its user, repeatedly fails when dialing out in the cell). Such a factor is not a fault of the network itself: it appears only occasionally in a given cell, and the user is likely to stay in that cell only briefly, so that a particular data point fluctuates strongly. To eliminate the influence of such factors on the network quality judgment, the reconstructed signaling data (the reconstructed sample) cannot be used directly as the feature data of complaint-user signaling and needs to be optimized first. The optimization is to smooth the data. Specifically, deep learning is used to average the data over a certain time span; for example, to obtain the data feature of period T, the data of periods T-2, T-1 and T can be averaged in a sliding-window manner, so that the data is leveled.
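By way of non-limiting illustration, the averaging over periods T-2, T-1 and T may be sketched with a plain trailing moving average applied to the Attach success rates of Table 2. This sketch is not the deep-learning extraction itself; it only shows the leveling effect. The function name `smooth_trailing` is hypothetical, and averaging over fewer periods at the start of the sequence is an assumption.

```python
from typing import List, Sequence


def smooth_trailing(values: Sequence[float], span: int = 3) -> List[float]:
    """Average each value with up to `span - 1` preceding values (T-2, T-1, T)."""
    if span <= 0:
        raise ValueError("span must be positive")
    smoothed = []
    for i in range(len(values)):
        # Assumption: the first periods use only the data available so far.
        window = values[max(0, i - span + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Attach success rates (in percent) from Table 2.
attach = [90, 75, 90, 96, 95, 95]
smoothed = smooth_trailing(attach)
print(smoothed)
```

The isolated dip to 75% in the second period is spread over its neighbours, so that no smoothed value falls below 82.5%.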
Performing CNN feature extraction on the reconstructed sample smooths the temporal features and thereby reduces the influence of random interference on the signaling sequence.
As for the spatial features, the Attach success rate and the E-RAB establishment success rate, for example, are generally correlated, but because the correlation may differ between cells or between scenes (hotels, railway stations, rural areas), its strength is not known in advance. The correlation between the Attach success rate and the E-RAB establishment success rate can be obtained by a convolutional neural network from the data collected in a sliding window. Performing feature extraction on the signaling data of users who have complained reveals where the signaling data features of the complaint users "converge".
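By way of non-limiting illustration, a single convolution kernel that spans several periods and both indicators may be sketched as follows, using the values of Table 2 as one reconstructed sample. The function name `conv_valid`, the kernel size and the fixed averaging weights are all hypothetical; in a trained convolutional neural network the kernel weights would be learned, and this embodiment does not specify the network structure.

```python
from typing import List, Sequence


def conv_valid(sample: Sequence[Sequence[float]],
               kernel: Sequence[Sequence[float]]) -> List[float]:
    """Slide a (periods x indicators) kernel along the time axis, then apply ReLU."""
    k_time, n_ind = len(kernel), len(kernel[0])
    if any(len(row) != n_ind for row in sample):
        raise ValueError("every period must carry one value per indicator")
    features = []
    for t in range(len(sample) - k_time + 1):
        # Each output mixes neighbouring periods (time) and both indicators (space).
        acc = sum(kernel[i][j] * sample[t + i][j]
                  for i in range(k_time) for j in range(n_ind))
        features.append(max(0.0, acc))
    return features


# One reconstructed sample: rows are periods, columns are
# (Attach success rate, E-RAB establishment success rate) from Table 2.
sample = [[0.90, 0.99], [0.75, 0.78], [0.90, 0.99],
          [0.96, 0.99], [0.95, 0.98], [0.95, 0.98]]
# Illustrative fixed weights covering three periods and two indicators.
kernel = [[1 / 6, 1 / 6]] * 3
features = conv_valid(sample, kernel)
print(features)
```

Because the kernel covers both columns, each output feature depends jointly on the two indicators, which is how a learned kernel can capture their correlation.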
S6, determining potential complaint users according to the signaling characteristics of the reconstructed sample and the signaling characteristics of the users with complaint records through a K-NN algorithm.
In the embodiment of the present invention, the determining, by using a K-NN algorithm, of potential complaint users according to the signaling characteristics of the reconstructed sample and the signaling characteristics of the users with complaint records specifically includes:
S61, constructing neighborhoods through a K-NN algorithm, taking the signaling characteristics of the user with complaint records as the center of a neighborhood graph and the signaling characteristics of the reconstructed sample as branches and leaves, to construct the neighborhood graph.
As shown in FIG. 3, a user with complaint records is referred to as a complaint user for short. The signaling features of the complaint user are taken as the center of the neighborhood graph, and the three features most similar to them are taken as its branches and leaves. Each of these three branches and leaves is then taken as a center in turn, the three features most similar to it are selected as its branches and leaves, and so on, until the final neighborhood graph is obtained.
The number three is not fixed; it is determined by the complaint preferences of the users in each area. Some users are sensitive to access and some are sensitive to handover, so the number is determined according to the actual conditions, and generally 3 to 5 are selected.
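By way of non-limiting illustration, the construction of the neighborhood graph may be sketched as follows. The Euclidean distance as the similarity measure, the `depth` limit on the expansion, and the names `k_nearest` and `build_neighborhood_graph` are assumptions; this embodiment specifies neither the distance measure nor a stopping condition.

```python
import math
from typing import Dict, List, Sequence


def k_nearest(center: str, candidates: List[str],
              features: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the k candidates whose features are closest to those of `center`."""
    return sorted(candidates,
                  key=lambda c: math.dist(features[center], features[c]))[:k]


def build_neighborhood_graph(features: Dict[str, Sequence[float]], root: str,
                             k: int = 3, depth: int = 2) -> Dict[str, List[str]]:
    """Grow a graph outward from the complaint user, k leaves per center."""
    graph: Dict[str, List[str]] = {}
    visited = {root}
    frontier = [root]
    for _ in range(depth):
        next_frontier: List[str] = []
        for node in frontier:
            candidates = [n for n in features if n not in visited]
            leaves = k_nearest(node, candidates, features, k)
            graph[node] = leaves
            visited.update(leaves)  # each sample is attached only once
            next_frontier.extend(leaves)
        frontier = next_frontier
    return graph


# Hypothetical feature vectors; "complaint" is a user with complaint records.
features = {"complaint": [0.75, 0.78], "u1": [0.76, 0.80], "u2": [0.74, 0.77],
            "u3": [0.80, 0.82], "u4": [0.95, 0.98], "u5": [0.96, 0.99]}
graph = build_neighborhood_graph(features, "complaint", k=2, depth=2)
print(graph)
```

In this sketch every remaining sample is eventually attached somewhere, including samples that are not very similar to the center, which is why the graph is subsequently pruned.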
S62, calculating a comprehensive value of the signaling characteristics of the reconstructed sample according to different weights of each signaling analysis index.
S63, pruning the neighborhood graph according to the comprehensive value of the signaling characteristics of the reconstructed sample and a preset threshold value.
S64, determining potential complaint users according to the pruned neighborhood graph.
S7, adding the signaling information and the label information of the potential complaint users into a user complaint database.
It should be noted that the construction of the neighborhood graph does not consider the different degrees of influence of the signaling analysis indicators. Therefore, in the embodiment of the present invention, the weight of each signaling analysis indicator is defined from manual experience, a comprehensive value over the different signaling analysis indicators is calculated as a weighted average, and the neighborhood graph is pruned by a threshold method. The pruned neighborhood graph, shown in FIG. 4, constitutes the determination of the potential complaint users. After the potential complaint users are determined, their signaling information and label information (complaint) are added to the user complaint database, so as to expand the complaint data.
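By way of non-limiting illustration, the weighted comprehensive value and the threshold pruning may be sketched as follows. The weights, the threshold, and the names `composite_value` and `prune` are hypothetical. The pruning direction is also an assumption: because the indicators here are success rates, a node is kept only when its comprehensive value is at or below the threshold, and the branches below a pruned node are discarded with it.

```python
from typing import Dict, List, Sequence, Set


def composite_value(feature: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of the signaling analysis indicators of one sample."""
    if len(feature) != len(weights) or sum(weights) <= 0:
        raise ValueError("need one positive-sum weight per indicator")
    return sum(w * x for w, x in zip(weights, feature)) / sum(weights)


def prune(graph: Dict[str, List[str]], features: Dict[str, Sequence[float]],
          weights: Sequence[float], root: str, threshold: float) -> Set[str]:
    """Walk the graph from the root and keep nodes at or below the threshold."""
    kept: Set[str] = set()
    stack = list(graph.get(root, []))
    while stack:
        node = stack.pop()
        if node in kept:
            continue
        # Assumption: a low comprehensive value indicates poor service quality.
        if composite_value(features[node], weights) <= threshold:
            kept.add(node)
            stack.extend(graph.get(node, []))  # descend only below kept nodes
    return kept


# Hypothetical neighborhood graph and features of the reconstructed samples.
graph = {"complaint": ["u2", "u1"], "u2": ["u3", "u4"], "u1": ["u5"]}
features = {"u1": [0.76, 0.80], "u2": [0.74, 0.77], "u3": [0.80, 0.82],
            "u4": [0.95, 0.98], "u5": [0.96, 0.99]}
weights = [0.6, 0.4]  # illustrative weights set from manual experience
potential = prune(graph, features, weights, "complaint", threshold=0.85)
print(sorted(potential))
```

The users remaining after pruning would then be labeled as complaints and added, together with their signaling information, to the user complaint database.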
To address the complex wireless network environment and the lag of user complaints, the embodiment of the present invention first reconstructs the signaling analysis data and takes the reconstructed data as samples, so that each sample has temporal continuity. CNN feature extraction is then performed on the complaint data to cope with the complexity of the wireless environment, which reduces the influence of random interference on the signaling sequence to a certain extent. Furthermore, in view of algorithm complexity, the embodiment does not apply the usual vector similarity method, that is, cosine similarity analysis on the vector matrix of extracted features, but instead constructs a neighborhood graph by the K-NN method, because computing the relevance between data with a graph model is fast and intuitive. On the basis of the constructed neighborhood graph, the different weights of the signaling analysis indicators are considered, a comprehensive value over the different indicators is calculated as a weighted average, and the neighborhood graph is pruned by a threshold method; the pruned neighborhood graph is the determination of the potential complaint users. Compared with the traditional pairwise comparison method, pruning the neighborhood graph is fast and simple.
Compared with the prior art, the method and the device have the advantages that the potential complaint users are mined, the user complaint database is expanded by the signaling information and the label information of the potential complaint users, and the accuracy of end-to-end network service quality evaluation based on the user complaint data can be improved.
As an example of the embodiment of the present invention, the obtaining of the communication service data of the user without the complaint record and the user with the complaint record specifically includes:
and acquiring communication service data of the VIP users without the complaint records and the users with the complaint records.
It should be noted that, since VIP users generally contribute 80% of the profit, only the service data of VIP users is acquired for the users without complaints. In this way, the users who contribute 80% of the profit can be retained at low cost, and the main task of network optimization is accomplished.
Example 2:
Referring to FIG. 5, an embodiment of the present invention provides an expanding apparatus for a user complaint database, including:
a communication service data acquisition module 1, configured to acquire communication service data of a user without a complaint record and a user with a complaint record;
the backtracking module 2 is configured to backtrack the signaling information of the users in a cell based on the communication service data, and to analyze the signaling information to obtain analysis results corresponding to different signaling analysis indexes;
the sequencing module 3 is used for sequencing the analysis results generated by the communication service of each user according to the time sequence;
the sample reconstruction module 4 is used for performing sample reconstruction on the sequenced analysis results by adopting a sliding window method to obtain reconstructed samples;
a feature extraction module 5, configured to extract signaling features of the reconstructed sample based on deep learning;
a potential complaint user determining module 6, configured to determine potential complaint users according to the signaling characteristics of the reconstructed sample and the signaling characteristics of the users with complaint records by using a K-NN algorithm;
and the adding module 7 is used for adding the signaling information and the label information of the potential complaint users to a user complaint database.
As an example of the embodiment of the present invention, the module for determining a potential complaint user specifically includes:
the neighborhood graph construction unit is used for constructing neighborhoods through a K-NN algorithm, taking the signaling characteristics of the user with the complaint record as the center of a neighborhood graph and the signaling characteristics of the reconstructed sample as branches and leaves, to construct the neighborhood graph;
the comprehensive value calculating unit is used for calculating the comprehensive value of the signaling characteristics of the reconstructed sample according to different weights of each signaling analysis index;
the pruning unit is used for pruning the neighborhood graph according to the comprehensive value of the signaling characteristics of the reconstructed sample and a preset threshold value;
and the potential complaint user determining unit is used for determining potential complaint users according to the pruned neighborhood map.
As an example of the embodiment of the present invention, the obtaining of the communication service data of the user without the complaint record and the user with the complaint record specifically includes:
and acquiring communication service data of the VIP users without the complaint records and the users with the complaint records.
Example 3:
The embodiment of the present invention provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, a device where the computer-readable storage medium is located is controlled to execute the method for expanding a user complaint database according to any one of the above embodiments.
It should be noted that all or part of the flow in the method according to the above embodiments of the present invention may also be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be further noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.