CN111401591A - Quality difference user determination method and device and readable medium - Google Patents
- Publication number
- CN111401591A (application CN201811540990.2A)
- Authority
- CN
- China
- Prior art keywords
- user
- quality difference
- complaint
- quality
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/01—Customer relationship services
Abstract
The invention discloses a method, a device, and a readable medium for determining a quality difference user, and belongs to the technical field of network optimization. The method acquires voice service quality difference data of a user to be identified, extracts a quality difference feature vector from the quality difference data, and determines whether the user to be identified is a quality difference user according to the quality difference feature vector and a trained quality difference user identification model, so that quality difference users can be determined accurately. In addition, because the quality difference user identification model is trained on the quality difference data of historical users, it incorporates the users' actual quality difference data and evaluates users more objectively, which helps achieve user-level network optimization and improve user care and user experience.
Description
Technical Field
The present invention relates to the field of network optimization technologies, and in particular, to a method, an apparatus, and a readable medium for determining a user with poor quality.
Background
At present, poor voice service quality is generally located in one of two ways: directly locating poor voice quality, or locating quality difference users.
Existing methods for locating poor voice quality comprise subjective evaluation and objective evaluation. In objective evaluation, a machine judges voice quality automatically by comparing the error between the original voice signal and the distorted voice signal. In subjective evaluation, human subjects assess voice quality: real users listen to the speech and then score it. This way of locating poor voice quality does not consider the user experience and cannot meet users' requirements.
Methods for locating quality difference users mostly either obtain user satisfaction information through telephone follow-up calls and survey feedback, or define a discrimination rule by means of data statistics, for example a rule based on count and duration indicators, and determine quality difference users according to that rule. The survey feedback method relies only on users' subjective feelings and does not consider their actual voice service quality; the rule-based method essentially relies on locating poor-quality voice and likewise does not incorporate the user experience.
Therefore, how to accurately determine quality difference users in combination with users' actual demands is one of the primary problems to be considered.
Disclosure of Invention
The embodiment of the invention provides a method and a device for determining a user with poor quality and a readable medium, which are used for accurately determining the user with poor quality.
In a first aspect, an embodiment of the present invention provides a method for determining a quality difference user, including:
acquiring voice service quality difference data of a user to be identified;
extracting a quality difference characteristic vector from the quality difference data;
and determining whether the user to be identified is the quality difference user or not according to the quality difference characteristic vector and a quality difference user identification model obtained through training, wherein the quality difference user identification model is obtained through training based on the quality difference data of the historical user.
In a second aspect, an embodiment of the present invention provides a quality-poor user determination apparatus, including:
an acquisition unit, configured to acquire voice service quality difference data of a user to be identified;
an extraction unit, configured to extract a quality difference feature vector from the quality difference data;
and the determining unit is used for determining whether the user to be identified is the quality difference user according to the quality difference characteristic vector and a quality difference user identification model obtained through training, wherein the quality difference user identification model is obtained through training based on the quality difference data of the historical user.
In a third aspect, an embodiment of the present invention provides a communication device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements any one of the quality difference user determination methods provided in this application.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the quality difference user determination methods provided in this application.
The invention has the beneficial effects that:
With the method, the device, and the readable medium for determining a quality difference user provided by the embodiments of the invention, voice service quality difference data of a user to be identified is acquired; a quality difference feature vector is extracted from the quality difference data; and whether the user to be identified is a quality difference user is determined according to the quality difference feature vector and a trained quality difference user identification model, so that quality difference users can be determined accurately. In addition, because the quality difference user identification model is trained on the quality difference data of historical users, it incorporates the users' actual quality difference data and evaluates users more objectively, which helps achieve user-level network optimization and improve user care and user experience.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic view of an application scenario of a quality difference user determination method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a quality difference user determination method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of determining whether a user to be identified is a quality difference user according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of obtaining the output result of each integrated decision tree sub-model according to an embodiment of the present invention;
FIG. 5a is a schematic diagram of the execution logic for training the quality difference user identification model according to an embodiment of the present invention;
FIG. 5b is a schematic flowchart of training the quality difference user identification model according to an embodiment of the present invention;
FIG. 6 is a schematic flowchart of determining training samples according to an embodiment of the present invention;
FIG. 7 is a schematic diagram illustrating the user group division effect according to an embodiment of the present invention;
FIG. 8a is a schematic flowchart of training each integrated decision tree sub-model using the training samples to form a complaint user identification model according to an embodiment of the present invention;
FIG. 8b is a schematic diagram of an implementation of an integrated decision tree model according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a quality difference user determination apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a communication device implementing a quality difference user determination method according to an embodiment of the present invention.
Detailed Description
The method, the device and the readable medium for determining the user with poor quality provided by the embodiment of the invention are used for accurately determining the user with poor quality.
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings of the specification, it being understood that the preferred embodiments described herein are merely for illustrating and explaining the present invention, and are not intended to limit the present invention, and that the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
In order to solve the problems in the prior art, an embodiment of the present invention provides a method for determining a quality difference user. Referring to the application scenario shown in FIG. 1, after the quality difference data of the user to be identified is obtained, it is input into the trained quality difference user identification model, and whether the user to be identified is a quality difference user is determined from the output result of the model. If the output result indicates that the user is a quality difference user, corresponding measures can be taken, such as displaying a quality difference event map for the user or providing user care. Based on the displayed quality difference event map, the cell-level causes of the quality difference events can be mined, and the cell's network can be optimized according to the mining result, thereby improving the voice service quality of the cell.
It should be noted that the voice service in the present invention may be, but is not limited to, a Voice over Long-Term Evolution (VoLTE) voice service.
Based on the application scenario illustrated in fig. 1, a method for determining a poor quality user provided according to an exemplary embodiment of the present invention is described with reference to fig. 2 to 10. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present invention, and the embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention may be applied to any scenario where applicable.
As shown in fig. 2, a flow chart of the method for determining the quality difference user according to the embodiment of the present invention may include the following steps:
S21: Acquire voice service quality difference data of the user to be identified.
In this step, a user whose voice service quality difference data has actually been measured on the live network may be taken as the user to be identified.
Optionally, the present invention fuses different types of quality difference data. For example, the quality difference data may include, but is not limited to, mean opinion score (MOS) quality difference records, call-drop and call-not-connected quality difference records, and complaint records.
Note that the complaint record in the present invention may be a complaint work order of the user or the like.
S22: Extract a quality difference feature vector from the quality difference data.
Specifically, the quality difference features that cause poor user perception may be extracted from the quality difference data and assembled into a quality difference feature vector. To select them, correlation analysis can be performed on each feature in the quality difference data, and the features with relatively high correlation are chosen as the quality difference features.
Optionally, the quality difference features proposed in the present invention are listed in Table 1:
TABLE 1
Number of days with bad events | Number of bad events | Average day interval | Average hour interval |
Average second interval | Number of cells | Maximum number of abnormal events in the same cell | Cell distribution entropy |
Days of onset of poor quality | Day distribution entropy | | |
It should be noted that the present application uses a trained quality difference user identification model to determine whether the user to be identified is a quality difference user, and the quality difference features must also be determined for the training samples used to train that model. The method for selecting quality difference features is therefore described in detail later, together with the training of the quality difference user identification model.
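As an illustration of how some of the Table 1 features might be computed, the following is a minimal sketch. The record layout (a list of `(timestamp, cell_id)` pairs per user), the function names, and the use of Shannon entropy with the natural logarithm are assumptions made for this example; the patent does not define the features at this level of detail.

```python
import math
from collections import Counter
from datetime import datetime


def distribution_entropy(counts):
    """Shannon entropy (natural log) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def extract_features(events):
    """Build a few Table 1 style features for one user.

    `events` is a hypothetical list of (timestamp, cell_id) pairs,
    one pair per quality difference event.
    """
    per_day = Counter(ts.date() for ts, _ in events)
    per_cell = Counter(cell for _, cell in events)
    return {
        "event_days": len(per_day),          # days with at least one event
        "event_count": len(events),          # total number of events
        "cell_count": len(per_cell),         # distinct cells involved
        "max_events_same_cell": max(per_cell.values(), default=0),
        "cell_entropy": distribution_entropy(list(per_cell.values())),
        "day_entropy": distribution_entropy(list(per_day.values())),
    }


# Made-up events for a single user.
events = [
    (datetime(2018, 11, 1, 9, 0), "cell_A"),
    (datetime(2018, 11, 1, 18, 30), "cell_A"),
    (datetime(2018, 11, 3, 12, 15), "cell_B"),
    (datetime(2018, 11, 7, 8, 45), "cell_A"),
]
features = extract_features(events)
```

A low cell entropy here means the events concentrate in few cells, which is the kind of signal the cell-level mining step could use.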
S23: Determine whether the user to be identified is a quality difference user according to the quality difference feature vector and the trained quality difference user identification model.
The quality difference user identification model is trained on the quality difference data of historical users.
Optionally, in the present invention, the quality difference user identification model includes a plurality of integrated decision tree sub-models, and step S23 may be implemented according to the method shown in FIG. 3, which includes the following steps:
S31: Obtain the output result of each integrated decision tree sub-model based on the quality difference feature vector and each integrated decision tree sub-model.
Specifically, the output result of each integrated decision tree sub-model is $f_k(x)$, $k \in [1, n]$, where $n$ is the number of integrated decision tree sub-models.
Optionally, step S31 may be implemented according to the flow shown in FIG. 4, which includes the following steps:
S41: Perform dimensionality reduction on the quality difference feature vector using a principal component analysis (PCA) algorithm.
PCA is a dimensionality reduction algorithm. Through a linear transformation it converts the raw data into a representation whose dimensions are linearly uncorrelated, so it can be used to extract the principal feature components of the data and is commonly applied to reduce the dimensionality of high-dimensional data.
Specifically, after the quality difference feature vector is obtained, it is normalized and then input into the PCA algorithm in matrix form. By setting the PCA parameters so that the feature dimension after reduction is 2, 85% of the principal components of the user's quality difference features can be retained, yielding the two-dimensional feature vector of the user to be identified.
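The normalization and two-dimensional PCA projection described above can be sketched with NumPy as follows. This is only a sketch under stated assumptions: min-max scaling is assumed because the text does not say which normalization is used, the function names are hypothetical, and the random matrix merely stands in for real quality difference features, so the retained-variance figure will not match the 85% reported in the text.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each feature column to [0, 1] (assumed normalization scheme)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span


def pca_reduce(X, n_components=2):
    """Project rows of X onto the top principal components via SVD.

    Returns the projected data and the fraction of variance retained.
    """
    Xc = X - X.mean(axis=0)                       # centre each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)         # variance ratio per component
    Z = Xc @ Vt[:n_components].T                  # coordinates in the new basis
    return Z, float(explained[:n_components].sum())


# Placeholder data: 50 users, 10 features each (matching the count in Table 1).
rng = np.random.default_rng(0)
X = rng.random((50, 10))
Z, retained = pca_reduce(min_max_normalize(X), n_components=2)
```

Each row of `Z` plays the role of a user's two-dimensional feature vector $(x, y)$, and `retained` can be checked against whatever variance target is required.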
S42: Input the dimensionality-reduced quality difference feature vector into each integrated decision tree sub-model to obtain the output result of each sub-model.
In this step, the two-dimensional feature vector obtained in step S41 is input into each integrated decision tree sub-model to obtain the output result $f'_k(x)$ of each sub-model.
S32: Determine the average of the output results of the integrated decision tree sub-models as the quality difference score of the user to be identified.
Specifically, the quality difference score $\mathrm{Score}'$ of the user to be identified may be determined according to the following formula:
$$\mathrm{Score}' = \frac{1}{n} \sum_{k=1}^{n} f'_k(x)$$
S33: If the quality difference score is greater than the quality difference score threshold, determine that the user to be identified is a quality difference user.
After the quality difference score of the user to be identified is determined from the above formula, whether the user is a quality difference user, that is, the output result of the quality difference user identification model, may be determined according to the following formula:
$$\text{output} = \begin{cases} 1, & \mathrm{Score}' > s' \\ 0, & \mathrm{Score}' \le s' \end{cases}$$
Specifically, if the quality difference score of the user to be identified is greater than the quality difference score threshold $s'$, the output result is 1, that is, the user to be identified is determined to be a quality difference user; otherwise, the output result is 0, that is, the user to be identified is not a quality difference user. It should be noted that the quality difference score threshold $s'$ in the present invention may be determined according to the actual situation.
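Steps S32 and S33 amount to averaging the sub-model outputs and comparing the average with the threshold. The sketch below illustrates this; the three lambda sub-models are made-up stand-ins for trained decision trees, and the function names and threshold values are chosen only for the example.

```python
def quality_difference_score(x, sub_models):
    """Average of the sub-model outputs for one two-dimensional vector x."""
    outputs = [f(x) for f in sub_models]
    return sum(outputs) / len(outputs)


def is_quality_difference_user(x, sub_models, threshold):
    """Return 1 if the score exceeds the threshold, otherwise 0."""
    return 1 if quality_difference_score(x, sub_models) > threshold else 0


# Hypothetical sub-models standing in for trained decision trees.
sub_models = [
    lambda x: 1 if x[0] < 0.1 else 0,
    lambda x: 1 if x[1] < 0.1 else 0,
    lambda x: 1 if x[0] + x[1] < 0.3 else 0,
]

x = (0.05, 0.2)
score = quality_difference_score(x, sub_models)      # outputs are 1, 0, 1
label_loose = is_quality_difference_user(x, sub_models, threshold=0.5)
label_strict = is_quality_difference_user(x, sub_models, threshold=0.8)
```

With two of the three sub-models voting 1, the score is 2/3, so the user is flagged at a threshold of 0.5 but not at 0.8.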
With the quality difference user determination method provided by the invention, voice service quality difference data of the user to be identified is acquired; a quality difference feature vector is extracted from the quality difference data; and whether the user to be identified is a quality difference user is determined according to the quality difference feature vector and a trained quality difference user identification model, so that quality difference users can be determined accurately. In addition, because the quality difference user identification model is trained on the quality difference data of historical users, it incorporates the users' actual quality difference data and evaluates users more objectively, which helps achieve user-level network optimization and improve user care and user experience.
Next, the training process of the quality difference user identification model provided by the present invention is described with reference to FIG. 5a.
Optionally, the quality difference user identification model in the invention is composed of a plurality of integrated decision tree sub-models, and it can be trained according to the method shown in FIG. 5b, which includes the following steps:
S51: Determine training samples based on the quality difference data of historical users.
The training samples in the invention include complaint user samples and non-complaint user samples.
Specifically, the training samples may be determined according to the method shown in fig. 6, including the following steps:
S61: Extract the feature vector of each user from the quality difference data of the historical users.
Specifically, a plurality of quality difference features that cause poor user perception can be extracted from the data of historical users, and correlation analysis is performed on each feature as follows. First, the users with complaint behavior among the historical users are identified and called candidate complaint users, and the remaining users are called candidate non-complaint users; a user is determined to be a candidate complaint user if the user has a complaint work order. Then the statistical distribution of candidate complaint users and candidate non-complaint users on each feature is determined, and the correlation coefficient between each feature and the complaint behavior is calculated. Finally, the features whose correlation coefficient is higher than the correlation threshold are determined to be quality difference features.
Optionally, the correlation coefficient between each feature and the occurrence of complaint behavior may be determined according to the following formula:
where ETA is the correlation coefficient; TSS is the total sum of squares, that is, the sum of squared deviations of the observed values $Y_{ij}$ of the explained variable (the feature values of a user feature) from their overall mean $\bar{Y}$ (the mean of that feature over all users); RSS is the sum of squared deviations of the observed values of the explained variable from the mean $\bar{Y}_i$ of the category $i$ to which they belong; $m$ is the number of categories of the feature; and $n_i$ is the number of feature values in the $i$-th category.
Based on the correlation coefficient formula, the most relevant and most discriminative quality difference features can be selected from the users' quality difference data and assembled into the quality difference feature vector; the selected features are those listed in Table 1.
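The formula for ETA is not reproduced in the text. Assuming it is the standard correlation ratio, $\mathrm{ETA}^2 = 1 - \mathrm{RSS}/\mathrm{TSS}$ with $\mathrm{TSS} = \sum_{i=1}^{m}\sum_{j=1}^{n_i}(Y_{ij}-\bar{Y})^2$ and $\mathrm{RSS} = \sum_{i=1}^{m}\sum_{j=1}^{n_i}(Y_{ij}-\bar{Y}_i)^2$, feature selection could be sketched as below. Treating the complaint label as the category variable, the function name, the feature names, and the threshold of 0.5 are all assumptions for illustration.

```python
def eta_squared(values, labels):
    """Correlation ratio ETA^2 = 1 - RSS / TSS between a numeric feature
    and a categorical label (assumed form of the omitted formula)."""
    overall_mean = sum(values) / len(values)
    tss = sum((v - overall_mean) ** 2 for v in values)
    if tss == 0:
        return 0.0  # a constant feature carries no information
    groups = {}
    for v, label in zip(values, labels):
        groups.setdefault(label, []).append(v)
    rss = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        rss += sum((v - group_mean) ** 2 for v in members)
    return max(0.0, 1.0 - rss / tss)


# Made-up data: 1 = candidate complaint user, 0 = candidate non-complaint user.
labels = [1, 1, 1, 0, 0, 0]
feature_values = {
    "event_count": [8, 9, 10, 1, 2, 3],   # separates the two groups well
    "cell_count": [1, 5, 9, 2, 5, 8],     # same mean in both groups
}

correlation_threshold = 0.5  # hypothetical threshold
scores = {name: eta_squared(vals, labels) for name, vals in feature_values.items()}
selected = [name for name, s in scores.items() if s > correlation_threshold]
```

In this toy data only `event_count` passes the threshold, because its within-group spread is small compared with the gap between complaint and non-complaint users.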
S62: Perform dimensionality reduction on the feature vectors of the users to obtain a two-dimensional scatter diagram.
In this step, individual differences among users are taken into account: a user's complaint behavior does not fully represent the degree of VoLTE quality difference. In other words, noisy users exist: subjective users are present among the users with complaint behavior, and silent users are present among the users without complaint behavior. The resulting user group division is shown in FIG. 7.
After the complaint users and non-complaint users are screened out, each feature in their quality difference feature vectors can be normalized, and the normalized vectors are input into the PCA algorithm as a feature matrix. By setting the PCA model parameter so that the feature dimension after reduction is 2, a two-dimensional feature vector is output for each complaint user and each non-complaint user. Each two-dimensional feature vector consists of two values, x and y, and a two-dimensional scatter diagram can be formed from the two-dimensional feature vectors of the complaint users and non-complaint users.
S63: Determine the users in the two-dimensional scatter diagram that meet the complaint user condition as complaint user samples.
Optionally, the complaint user condition in the present invention can be, but is not limited to: $0 \le x \le 0.1$ and $0 \le y \le 1$; or $0 \le x \le 0.3$ and $0 \le y \le 0.1$.
In this step, based on the two-dimensional scatter diagram generated in step S62, users whose x and y values fall within the region defined by $0 \le x \le 0.1$ and $0 \le y \le 1$, and users whose x and y values fall within the region defined by $0 \le x \le 0.3$ and $0 \le y \le 0.1$, can be determined as complaint user samples.
S64: Determine the users in the two-dimensional scatter diagram that meet the non-complaint user condition as non-complaint user samples.
Optionally, the non-complaint user condition in the present invention can be, but is not limited to: $x > 0.3$ and $0 \le y \le 1$; or $x > 0.1$ and $y > 0.3$.
In this step, based on the two-dimensional scatter diagram generated in step S62, users whose x and y values fall within the region defined by $x > 0.3$ and $0 \le y \le 1$, and users whose x and y values fall within the region defined by $x > 0.1$ and $y > 0.3$, may be determined as non-complaint user samples.
S65: Determine the complaint user samples and the non-complaint user samples as training samples.
The complaint user samples determined in step S63 and the non-complaint user samples determined in step S64 together form the training samples.
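The region-based sample selection of steps S63 to S65 can be sketched as a simple lookup on the two-dimensional coordinates. The sketch assumes the complaint regions are ($0 \le x \le 0.1$, $0 \le y \le 1$) or ($0 \le x \le 0.3$, $0 \le y \le 0.1$), and the non-complaint regions are ($x > 0.3$, $0 \le y \le 1$) or ($x > 0.1$, $y > 0.3$); the function name and the user points are hypothetical.

```python
def sample_label(x, y):
    """Classify a 2-D point as a training sample, or None if it is in neither region."""
    in_complaint = (0 <= x <= 0.1 and 0 <= y <= 1) or (0 <= x <= 0.3 and 0 <= y <= 0.1)
    if in_complaint:
        return "complaint"
    in_non_complaint = (x > 0.3 and 0 <= y <= 1) or (x > 0.1 and y > 0.3)
    if in_non_complaint:
        return "non_complaint"
    return None  # ambiguous users are left out of the training set


# Made-up two-dimensional feature vectors after PCA.
points = {
    "u1": (0.05, 0.8),
    "u2": (0.25, 0.05),
    "u3": (0.6, 0.5),
    "u4": (0.2, 0.2),
}

training_samples = {
    user: sample_label(x, y)
    for user, (x, y) in points.items()
    if sample_label(x, y) is not None
}
```

Users such as `u4`, who fall between the two regions, are excluded, which is one way to keep noisy users out of the training samples.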
S52: Train each integrated decision tree sub-model using the training samples to form the complaint user identification model.
In the invention, the complaint user samples are positive samples and the non-complaint user samples are negative samples.
Optionally, step S52 may be executed according to the flow shown in FIG. 8a, which includes the following steps:
S81: Divide the non-complaint users into a plurality of groups, where the ratio of the number of non-complaint users in each group to the total number of non-complaint users is the same as the ratio of the total number of complaint users to the total number of non-complaint users.
In the invention, complaint users serve as positive samples and non-complaint users as negative samples. In terms of data distribution, the negative samples far outnumber the positive samples, which makes it difficult for a trained model to classify accurately. To solve this problem, an integrated decision tree model is constructed on the basis of the traditional decision tree classification model. The invention provides a sample division scheme in which a plurality of decision tree base classifiers are constructed, namely the integrated decision tree sub-models of the invention, and the output of the model takes the classification results of all base classifiers into account. Through this ensemble property, the model improves the reliability of the classification result, indirectly resamples the samples, and effectively addresses the sample imbalance problem in the classification model.
Specifically, a ratio K between the total number of the complaint users and the total number of the non-complaint users in the present invention may be determined, and then the non-complaint users are divided into a plurality of groups based on the ratio, so as to ensure that the ratio between the number of the non-complaint users in each group and the total number of the non-complaint users is equal to the ratio K.
For example, if the ratio between the total number of complaint users and the total number of non-complaint users is determined to be 0.08%, the non-complaint users are grouped based on this ratio.
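The grouping arithmetic can be illustrated with a small sketch. Because each group's share of the non-complaint users equals K, each group holds as many users as there are complaint users, and the number of groups is roughly 1/K. The counts below (80 complaint users, 100,000 non-complaint users, giving K = 0.08%) are hypothetical, as is the function name; shuffling before the split and dropping a short remainder group are assumptions, since the text does not say how users are assigned to groups.

```python
import random


def split_into_balanced_groups(non_complaint, n_complaint, seed=0):
    """Split non-complaint users into groups of n_complaint users each.

    Assumptions (not stated in the source): users are shuffled before
    splitting, and a final group shorter than n_complaint is dropped.
    """
    if n_complaint <= 0:
        raise ValueError("n_complaint must be positive")
    users = list(non_complaint)
    random.Random(seed).shuffle(users)
    n_groups = len(users) // n_complaint
    return [users[i * n_complaint:(i + 1) * n_complaint]
            for i in range(n_groups)]


# Hypothetical counts reproducing the 0.08% example: 80 / 100,000 = 0.0008.
n_complaint_users = 80
non_complaint_ids = range(100_000)
ratio_k = n_complaint_users / len(non_complaint_ids)

groups = split_into_balanced_groups(non_complaint_ids, n_complaint_users)
```

With these counts the split yields 1250 groups of 80 users, so each sub-model later sees a balanced set of positive and negative samples.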
Optionally, based on the above-mentioned proportion of 0.08%, the implementation schematic diagram of the integrated decision tree model in the present invention may refer to fig. 8b, and the algorithm implementation flow may refer to the following process:
Step one: form a set $C$ from the complaint users and a set $U$ from the non-complaint users;
Step two: initialize the set of decision tree models $F = \{\}$;
Step three: divide the non-complaint users into $n$ groups, denoted $U = \{u_1, u_2, u_3, \ldots, u_n\}$, where the number of users in each group satisfies $|u_h| = |C|$;
Step four: for $h = 1, 2, 3, \ldots, n$, repeat the following:
1) take one group $u_h$ of non-complaint users together with the complaint users $C$, and train a CART decision tree sub-model $f_h$;
2) add the decision tree sub-model $f_h$ to the set $F$.
After all $n$ sub-models have been added, the quality difference user identification model $F = \{f_1, f_2, f_3, \ldots, f_n\}$ is obtained.
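The loop in steps one to four can be sketched as follows. This is a minimal illustration, not the patent's implementation: `train_ensemble` and `train_mean_threshold` are hypothetical names, and the trainer shown is a deliberately trivial placeholder rather than a CART tree. In practice a CART implementation (for example a decision tree classifier from a machine learning library) would be passed in as `train_fn`.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]
Model = Callable[[Sample], int]
Trainer = Callable[[List[Sample], List[int]], Model]


def train_ensemble(complaint: List[Sample],
                   groups: List[List[Sample]],
                   train_fn: Trainer) -> List[Model]:
    """Train one sub-model per non-complaint group (steps two to four)."""
    models: List[Model] = []                      # step two: F = {}
    for group in groups:                          # step four: h = 1..n
        features = list(complaint) + list(group)  # C together with u_h
        labels = [1] * len(complaint) + [0] * len(group)
        models.append(train_fn(features, labels)) # add f_h to F
    return models


def train_mean_threshold(features: List[Sample], labels: List[int]) -> Model:
    """Placeholder trainer (NOT a CART tree): thresholds feature 0 at the
    midpoint between the positive-class and negative-class means."""
    pos = [x[0] for x, y in zip(features, labels) if y == 1]
    neg = [x[0] for x, y in zip(features, labels) if y == 0]
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    cut = (pos_mean + neg_mean) / 2
    if pos_mean >= neg_mean:
        return lambda x: 1 if x[0] > cut else 0
    return lambda x: 1 if x[0] < cut else 0


# Hypothetical one-dimensional feature vectors.
complaint_users = [[0.9], [0.8]]
non_complaint_groups = [[[0.1], [0.2]], [[0.3], [0.0]]]
models = train_ensemble(complaint_users, non_complaint_groups,
                        train_mean_threshold)
```

Every sub-model reuses the full set of complaint users, while each group of non-complaint users is used exactly once, which is how the scheme balances the classes without discarding negative samples.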
S82: for each group of non-complaint users, train the corresponding integrated decision tree sub-model using the feature vectors of that group of non-complaint users and the feature vectors of the complaint users.
S83: obtain the complaint user identification model based on the trained integrated decision tree sub-models.
In the prediction process, the complaint user identification model determines the classification result of a user by means of a classification threshold $s$. For example, let the sample set of users to be predicted be $X = \{x_1, x_2, x_3, \ldots, x_l\}$, and for each user $x_j$ let the prediction of the $k$-th decision tree sub-model be $f_k(x_j)$. The final classification result $F(x_j)$ of user $x_j$ is then obtained by combining the predictions of all sub-models and comparing the result with $s$; experiments show that the classifier performs best when $0.8 < s < 0.9$.
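The combination rule is not spelled out at this point in the text. The following is a minimal sketch under the assumption that the sub-model outputs are averaged, as described elsewhere in the document for the determining unit, and that the mean is compared with the threshold s. The function names are hypothetical, and the default s = 0.85 is simply an arbitrary value inside the reported 0.8 to 0.9 range.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]
Model = Callable[[Sample], float]


def ensemble_score(models: List[Model], x: Sample) -> float:
    """Mean of the sub-model outputs f_k(x) over all sub-models."""
    if not models:
        raise ValueError("at least one sub-model is required")
    return sum(m(x) for m in models) / len(models)


def classify(models: List[Model], x: Sample, s: float = 0.85) -> int:
    """Return 1 (complaint user) if the mean output exceeds s, else 0.

    Assumption: a strict comparison against s; the source only states
    that a threshold s is used.
    """
    return 1 if ensemble_score(models, x) > s else 0
```

For instance, with ten sub-models of which nine predict 1, the mean is 0.9 and the user is classified as a complaint user at s = 0.85; with only eight predicting 1, the mean is 0.8 and the user is not.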
S53: perform regression migration processing on the trained complaint user identification model to obtain the quality difference user identification model.
The complaint user identification model obtained in step S52 is built on the feature differences between complaint users and non-complaint users and can classify them accurately. However, complaining is an "extreme expression" of poor voice quality, and complaint users are an "extreme population" among users with poor voice quality. The invention therefore quantizes the feature differences between complaint users and non-complaint users learned by the complaint user identification model into a continuous value. Using the properties of regression models in machine learning, the established complaint user identification model (an integrated decision tree classification model) is migrated to an integrated decision tree regression model, whose output is the quality difference score of the user. In this way, the feature differences between positive and negative samples learned by the original classifier are retained and quantized into a score representing the degree of poor voice quality. Then, according to the empirical proportion of users with poor voice quality provided by the operator, the decision threshold for complaint users is lowered to obtain the score that delimits users with poor voice quality.
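The text does not detail how the threshold is lowered. One plausible reading, sketched below under that assumption, is to rank users by their continuous quality difference score and pick the threshold so that the share of users scoring above it matches the operator's empirical proportion. The function names, the example scores, and the 30% proportion are all hypothetical.

```python
def threshold_for_proportion(scores, proportion):
    """Return a threshold such that about `proportion` of the users score
    strictly above it (fewer if several users tie at the threshold)."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    ranked = sorted(scores, reverse=True)
    # Number of users to flag, capped so that the index stays valid.
    k = min(round(len(ranked) * proportion), len(ranked) - 1)
    return ranked[k]


def flag_quality_difference_users(scores, threshold):
    """Flag users whose quality difference score exceeds the threshold."""
    return [score > threshold for score in scores]


# Hypothetical quality difference scores for ten users.
scores = [0.95, 0.90, 0.70, 0.40, 0.30, 0.20, 0.10, 0.05, 0.02, 0.01]
threshold = threshold_for_proportion(scores, proportion=0.3)
flags = flag_quality_difference_users(scores, threshold)
```

Here the threshold falls to 0.40, well below the 0.8 to 0.9 range reported for complaint classification, so the three highest-scoring users are flagged rather than only those who resemble complaint users most strongly.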
Based on the above description, the quality difference user identification model can be obtained; refer to the formulas in steps S32 and S33. For the specific task of locating quality difference users, an integrated decision tree classification model suited to the extreme skew of operators' complaint user data is established, and the idea of transfer learning is used to convert the identification of complaint users into the locating of quality difference users in a reasonable way.
With the quality difference user determination method provided by the invention, voice service quality difference data of the user to be identified is acquired, a quality difference feature vector is extracted from the quality difference data, and whether the user to be identified is a quality difference user is determined according to the quality difference feature vector and a trained quality difference user identification model. Because the quality difference user identification model is trained on the quality difference data of historical users, the accuracy of determining whether the user to be identified is a quality difference user can be improved.
In addition, the quality difference user identification model has the advantage of training itself and learning deep characteristics of the data, and it jointly considers the factors behind poor voice quality and the user experience. Only several quality difference features need to be input directly to achieve automatic locating of quality difference users, which avoids the complex manual data analysis of rule-based locating and improves generality.
Moreover, the quality difference data is analyzed at the user level, which facilitates user-level network optimization schemes and improves user care and user experience.
Based on the same inventive concept, an embodiment of the present invention further provides a quality difference user determination apparatus. Because the principle by which the apparatus solves the problem is similar to that of the quality difference user determination method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
As shown in fig. 9, a schematic structural diagram of a quality-difference user determining apparatus provided in an embodiment of the present invention includes:
an obtaining unit 91, configured to obtain voice service quality difference data of a user to be identified;
an extracting unit 92, configured to extract a quality difference feature vector from the quality difference data;
a determining unit 93, configured to determine whether the user to be identified is a quality difference user according to the quality difference feature vector and a quality difference user identification model obtained through training, where the quality difference user identification model is obtained through training based on quality difference data of a historical user.
Optionally, the quality difference user identification model comprises a plurality of integrated decision tree sub-models; in that case, the determining unit 93 is specifically configured to obtain an output result of each integrated decision tree sub-model based on the quality difference feature vector and each integrated decision tree sub-model; determine the average value of the output results of all the integrated decision tree sub-models as the quality difference score of the user to be identified; and, if the quality difference score is larger than a quality difference score threshold, determine that the user to be identified is a quality difference user.
Optionally, the quality difference data comprises at least one of: mean opinion score (MOS) quality difference records, dropped-call and call-not-connected quality difference records, and complaint records.
Optionally, the determining unit 93 is specifically configured to perform a dimensionality reduction on the quality difference feature vector by using a Principal Component Analysis (PCA) algorithm; and inputting the quality difference characteristic vectors subjected to the dimensionality reduction into each integrated decision tree to obtain an output result of each integrated decision tree.
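The dimensionality reduction step can be illustrated with a small, dependency-free sketch. The text only names PCA; computing the components by power iteration with Gram-Schmidt orthogonalization is a choice made here for brevity, and a library PCA routine would normally be used instead. The function names, the number of components, and the sample feature values are hypothetical.

```python
import math
import random


def _orthonormalize(v, basis):
    """Remove the parts of v along each basis vector, then normalise.
    Returns None when nothing (numerically) remains."""
    for b in basis:
        dot = sum(x * y for x, y in zip(v, b))
        v = [x - dot * y for x, y in zip(v, b)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 1e-12 else None


def pca_project(rows, n_components=2, iters=200, seed=0):
    """Project feature vectors onto their top principal components.

    Assumes at least two rows and n_components <= feature dimension.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred features (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        # Random start vector, kept orthogonal to components found so far.
        v = _orthonormalize([rng.gauss(0, 1) for _ in range(d)], components)
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            nxt = _orthonormalize(w, components)
            if nxt is None:   # no variance left in the remaining directions
                break
            v = nxt
        components.append(v)
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]


# Hypothetical three-dimensional quality difference feature vectors.
features = [[1.0, 2.0, 0.5], [2.0, 4.1, 0.4], [3.0, 6.2, 0.7], [4.0, 7.9, 0.5]]
reduced = pca_project(features)
```

Each reduced vector would then be passed to every integrated decision tree sub-model in place of the original feature vector.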
In one possible embodiment, the quality difference user identification model is composed of a plurality of integrated decision tree sub-models; the apparatus further comprises:
a model training unit 94, configured to determine training samples based on the quality difference data of the historical users, where the training samples include complaint user samples and non-complaint user samples; training each integrated decision tree sub-model by using the training samples to form a complaint user identification model, wherein the complaint user samples are positive samples, and the non-complaint user samples are negative samples; and carrying out regression migration processing on the complaint user identification model obtained by training to obtain the quality difference user identification model.
Optionally, the model training unit 94 is specifically configured to extract feature vectors of each user from the quality difference data of the historical users; performing dimensionality reduction on the feature vectors of all users to obtain a two-dimensional scatter diagram; determining users meeting the complaint user conditions in the two-dimensional scatter diagram as complaint user samples; determining users meeting the non-complaint user conditions in the two-dimensional scatter diagram as non-complaint user samples; and determining the complaint user sample and the non-complaint user sample as training samples.
Optionally, the model training unit 94 is specifically configured to divide the non-complaint users into a plurality of groups, and a ratio between the number of the non-complaint users in each group and the total number of the non-complaint users is the same as a ratio between the total number of the complaint users and the total number of the non-complaint users; aiming at each group of non-complaint users, training the corresponding integrated decision tree sub-models by using the feature vectors of the group of non-complaint users and the feature vectors of the complaint users; and obtaining a complaint user identification model based on each integrated decision tree sub-model obtained by training.
For convenience of description, the above parts are separately described as modules (or units) according to functional division. Of course, the functionality of the various modules (or units) may be implemented in the same or in multiple pieces of software or hardware in practicing the invention.
Based on the same technical concept, the embodiment of the present application further provides a communication device, which can implement the method in the foregoing embodiment.
Referring to fig. 10, a schematic structural diagram of a communication device according to an embodiment of the present invention is shown in fig. 10, where the communication device may include: a processor 1001, a memory 1002, a transceiver 1003, and a bus interface.
The processor 1001 is responsible for managing the bus architecture and general processing, and the memory 1002 may store data used by the processor 1001 in performing operations. The transceiver 1003 is used for receiving and transmitting data under the control of the processor 1001.
The bus architecture may include any number of interconnected buses and bridges, with one or more processors, represented by the processor 1001, and various circuits, represented by the memory 1002, being linked together. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface.
The process disclosed in the embodiment of the present invention may be applied to the processor 1001, or implemented by the processor 1001. In implementation, the steps of the signal processing flow may be completed by integrated logic circuits of hardware in the processor 1001 or by instructions in the form of software. The processor 1001 may be a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the quality difference user determination method disclosed by the embodiment of the present invention may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 1002, and the processor 1001 reads the information in the memory 1002 and completes the steps of the signal processing flow in combination with its hardware.
Specifically, the processor 1001 is configured to read the program in the memory 1002 and execute any step of any one of the methods described above.
Based on the same technical concept, the embodiment of the application also provides a computer storage medium. The computer-readable storage medium stores computer-executable instructions for causing the computer to perform any of the steps of any of the methods described above.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (14)
1. A quality difference user determination method, comprising:
acquiring voice service quality difference data of a user to be identified;
extracting a quality difference characteristic vector from the quality difference data;
and determining whether the user to be identified is the quality difference user or not according to the quality difference characteristic vector and a quality difference user identification model obtained through training, wherein the quality difference user identification model is obtained through training based on the quality difference data of the historical user.
2. The method of claim 1, wherein the quality difference user identification model comprises a plurality of integrated decision tree sub-models; and
determining whether the user to be identified is a quality difference user according to the quality difference feature vector and a quality difference user identification model obtained by training specifically comprises:
obtaining an output result of each integrated decision tree sub-model based on the quality difference feature vector and each integrated decision tree sub-model;
determining the average value of the output results of all the integrated decision tree sub-models as the quality difference score of the user to be identified;
and if the quality difference score is larger than a quality difference score threshold value, determining that the user to be identified is a quality difference user.
3. The method of claim 1, wherein the quality difference data comprises at least one of: mean opinion score (MOS) quality difference records, dropped-call and call-not-connected quality difference records, and complaint records.
4. The method of claim 2, wherein obtaining the output result of each integrated decision tree sub-model based on the quality difference feature vector and each integrated decision tree sub-model specifically comprises:
performing dimensionality reduction on the quality difference characteristic vector by using a Principal Component Analysis (PCA) algorithm;
and inputting the quality difference characteristic vectors subjected to the dimensionality reduction treatment into each integrated decision tree sub-model to obtain an output result of each integrated decision tree sub-model.
5. The method of claim 1, wherein the quality difference user identification model is composed of a plurality of integrated decision tree sub-models; and the quality difference user identification model is trained based on the quality difference data of the historical users according to the following method:
determining training samples based on the quality difference data of the historical users, wherein the training samples comprise complaint user samples and non-complaint user samples;
training each integrated decision tree sub-model by using the training samples to form a complaint user identification model, wherein the complaint user samples are positive samples, and the non-complaint user samples are negative samples;
and carrying out regression migration processing on the complaint user identification model obtained by training to obtain the quality difference user identification model.
6. The method of claim 5, wherein determining training samples based on the quality difference data of the historical users comprises:
extracting characteristic vectors of each user from the quality difference data of the historical users respectively;
performing dimensionality reduction on the feature vectors of all users to obtain a two-dimensional scatter diagram;
determining users meeting the complaint user conditions in the two-dimensional scatter diagram as complaint user samples;
determining users meeting the non-complaint user conditions in the two-dimensional scatter diagram as non-complaint user samples;
and determining the complaint user sample and the non-complaint user sample as training samples.
7. The method of claim 6, wherein training each of the integrated decision tree sub-models using the training samples to form a complaint user recognition model comprises:
dividing the non-complaint users into a plurality of groups, wherein the ratio of the number of the non-complaint users to the total number of the non-complaint users in each group is the same as the ratio of the total number of the complaint users to the total number of the non-complaint users;
aiming at each group of non-complaint users, training the corresponding integrated decision tree sub-models by using the feature vectors of the group of non-complaint users and the feature vectors of the complaint users;
and obtaining a complaint user identification model based on each integrated decision tree sub-model obtained by training.
8. A quality difference user determination apparatus, comprising:
an acquisition unit, configured to acquire voice service quality difference data of a user to be identified;
an extraction unit, configured to extract a quality difference feature vector from the quality difference data;
and the determining unit is used for determining whether the user to be identified is the quality difference user according to the quality difference characteristic vector and a quality difference user identification model obtained through training, wherein the quality difference user identification model is obtained through training based on the quality difference data of the historical user.
9. The apparatus of claim 8, wherein the quality difference user identification model comprises a plurality of integrated decision tree sub-models; and
the determining unit is specifically configured to obtain an output result of each integrated decision tree sub-model based on the quality difference feature vector and each integrated decision tree sub-model; determine the average value of the output results of all the integrated decision tree sub-models as the quality difference score of the user to be identified; and, if the quality difference score is larger than a quality difference score threshold, determine that the user to be identified is a quality difference user.
10. The apparatus of claim 8, wherein the quality difference data comprises at least one of: mean opinion score (MOS) quality difference records, dropped-call and call-not-connected quality difference records, and complaint records.
11. The apparatus of claim 9,
the determining unit is specifically configured to perform dimensionality reduction on the quality difference feature vector by using a Principal Component Analysis (PCA) algorithm; and inputting the quality difference characteristic vectors subjected to the dimensionality reduction into each integrated decision tree to obtain an output result of each integrated decision tree.
12. The apparatus of claim 8, wherein the quality difference user identification model is composed of a plurality of integrated decision tree sub-models; the apparatus further comprises:
the model training unit is used for determining training samples based on the quality difference data of historical users, wherein the training samples comprise complaint user samples and non-complaint user samples; training each integrated decision tree sub-model by using the training samples to form a complaint user identification model, wherein the complaint user samples are positive samples, and the non-complaint user samples are negative samples; and carrying out regression migration processing on the complaint user identification model obtained by training to obtain the quality difference user identification model.
13. A communication device comprising a memory, a processor and a computer program stored on the memory and executable on the processor; wherein the processor, when executing the program, implements the quality difference user determination method according to any one of claims 1 to 7.
14. A computer-readable storage medium, on which a computer program is stored, which program, when executed by a processor, carries out the steps of the quality difference user determination method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811540990.2A CN111401591A (en) | 2018-12-17 | 2018-12-17 | Quality difference user determination method and device and readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111401591A true CN111401591A (en) | 2020-07-10 |
Family
ID=71428236
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811540990.2A Pending CN111401591A (en) | 2018-12-17 | 2018-12-17 | Quality difference user determination method and device and readable medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401591A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112751710A (en) * | 2020-12-30 | 2021-05-04 | 科大国创云网科技有限公司 | Broadband user quality difference scaling method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106304180A (en) * | 2016-08-15 | 2017-01-04 | 中国联合网络通信集团有限公司 | A kind of method and device of the speech service quality determining user |
CN107818376A (en) * | 2016-09-13 | 2018-03-20 | 中国电信股份有限公司 | Customer loss Forecasting Methodology and device |
CN108377508A (en) * | 2017-11-28 | 2018-08-07 | 中国移动通信集团福建有限公司 | User's categorization of perception method and apparatus based on measurement report data |
CN108540320A (en) * | 2018-04-03 | 2018-09-14 | 南京华苏科技有限公司 | The appraisal procedure of user satisfaction is excavated based on signaling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200710 |