CN113868534A - Service processing method, service processing device, electronic device and storage medium - Google Patents
Service processing method, service processing device, electronic device and storage medium
- Publication number
- CN113868534A (application CN202111170511.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- service
- sample
- service data
- package
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The disclosure provides a service processing method, a service processing device, an electronic device and a computer readable storage medium, and belongs to the technical field of computers. The method comprises the following steps: acquiring original service data of a user; preprocessing the original service data to obtain service data to be processed; performing characteristic analysis according to initial characteristic data in the service data to be processed to construct target characteristic data; and processing the target characteristic data by adopting a pre-trained service package recommendation model, and determining a service package recommendation result of the user. The method and the device can quickly and accurately recommend the service package to the user.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a service processing method, a service processing apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of network technology, voice and network communication through terminal devices has become a routine part of daily life. To meet the personalized requirements of different users, a service platform usually offers a choice of service packages; for example, users can obtain call, short message, or network access services by subscribing to different communication packages. Recommending a suitable service package to each user is therefore an important problem.
In the prior art, services are usually recommended to a user by a clustering-style method; for example, the KNN (K-Nearest Neighbor) classification algorithm is used to determine a group of similar users, and a similar service package is then recommended to that group. However, with the explosive growth in the number of users and the volume of service data, the calculation amount and complexity of this method increase and the recommendation efficiency decreases; meanwhile, the accuracy of service package recommendation is difficult to guarantee, which affects the user experience. Therefore, how to recommend services to users accurately and efficiently is a problem to be solved urgently.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a service processing method, a service processing apparatus, an electronic device, and a computer-readable storage medium, which overcome the problems of large calculation amount and inaccurate recommendation of a service package in the conventional service processing method at least to a certain extent.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a service processing method, including: acquiring original service data of a user; preprocessing the original service data to obtain service data to be processed; performing characteristic analysis according to initial characteristic data in the service data to be processed to construct target characteristic data; and processing the target characteristic data by adopting a pre-trained service package recommendation model, and determining a service package recommendation result of the user.
In an exemplary embodiment of the disclosure, the business package recommendation model is trained by: acquiring original sample service data of a plurality of users and a service package label corresponding to each original sample service data; preprocessing the original sample service data to obtain sample service data to be processed; performing characteristic analysis according to initial sample characteristic data in the to-be-processed sample service data to construct target sample characteristic data; and training a service package recommendation model by adopting the target sample characteristic data and the service package label to obtain the service package recommendation model.
In an exemplary embodiment of the present disclosure, the preprocessing the original sample service data to obtain to-be-processed sample service data includes: performing data cleaning on a plurality of original sample service data; performing data equalization processing on the original sample service data after data cleaning based on the service package information included in each original sample service data to generate intermediate sample service data so as to equalize the service package information distribution of the plurality of original sample service data; and classifying the intermediate sample service data according to the data type of the intermediate sample service data to obtain the to-be-processed sample service data.
In an exemplary embodiment of the present disclosure, the performing data cleansing on a plurality of the original sample service data includes: and performing one or more data processing of removing repeated data, adjusting abnormal data and filling missing data on the original sample service data.
In an exemplary embodiment of the present disclosure, the performing feature analysis according to initial sample feature data in the to-be-processed sample service data to construct target sample feature data includes: acquiring correlation data of initial sample characteristic data and service package information in the to-be-processed sample service data; and constructing the target sample characteristic data according to the correlation data and the statistical information of the initial sample characteristic data.
In an exemplary embodiment of the disclosure, after constructing the target sample feature data, the method further comprises: acquiring importance ranking of the target sample characteristic data; and screening the target sample characteristic data according to the importance ranking, and determining the target sample characteristic data finally used for carrying out service package recommendation model training.
In an exemplary embodiment of the present disclosure, the training a business package recommendation model by using the target sample feature data and the business package label to obtain the business package recommendation model includes: respectively training a first model and a second model by adopting the target sample characteristic data and the service package label; and fusing the trained first model and the trained second model to obtain the business package recommendation model.
According to an aspect of the present disclosure, there is provided a service processing apparatus, including: the service data acquisition module is used for acquiring original service data of a user; the service data preprocessing module is used for preprocessing the original service data to obtain service data to be processed; the characteristic data construction module is used for carrying out characteristic analysis according to initial characteristic data in the service data to be processed and constructing target characteristic data; and the recommendation result determining module is used for processing the target characteristic data by adopting a pre-trained service package recommendation model and determining a service package recommendation result of the user.
In an exemplary embodiment of the disclosure, the service package recommendation model is configured to be trained by the following units: a sample data acquisition unit, used for acquiring original sample service data of a plurality of users and a service package label corresponding to each original sample service data; a sample data preprocessing unit, used for preprocessing the original sample service data to obtain to-be-processed sample service data; a sample characteristic data construction unit, used for performing characteristic analysis according to initial sample characteristic data in the to-be-processed sample service data and constructing target sample characteristic data; and a recommendation model obtaining unit, used for training a service package recommendation model by adopting the target sample characteristic data and the service package label so as to obtain the service package recommendation model.
In an exemplary embodiment of the present disclosure, the sample data preprocessing unit includes: the data cleaning subunit is used for performing data cleaning on the plurality of original sample service data; the data equalization processing subunit is configured to perform data equalization processing on the original sample service data after data cleaning based on the service package information included in each original sample service data, and generate intermediate sample service data, so that the service package information of the plurality of original sample service data is distributed in an equalization manner; and the data classification processing subunit is used for performing classification processing on the intermediate sample service data according to the data type of the intermediate sample service data to obtain the to-be-processed sample service data.
In an exemplary embodiment of the present disclosure, the data cleansing subunit is configured to perform one or more data processes of removing duplicate data, adjusting abnormal data, and filling missing data on a plurality of original sample service data.
In an exemplary embodiment of the present disclosure, the sample feature data construction unit includes: the correlation data acquisition subunit is used for acquiring correlation data of initial sample characteristic data and service package information in the to-be-processed sample service data; and the sample characteristic data constructing subunit is used for constructing the target sample characteristic data according to the correlation data and the statistical information of the initial sample characteristic data.
In an exemplary embodiment of the present disclosure, the service processing apparatus further includes: the sequencing information acquisition unit is used for acquiring importance sequencing of the target sample characteristic data after the target sample characteristic data is constructed; and the target sample characteristic data screening unit is used for screening the target sample characteristic data according to the importance ranking and determining the target sample characteristic data finally used for carrying out business package recommendation model training.
In an exemplary embodiment of the present disclosure, the recommendation model obtaining unit includes: the model training subunit is used for respectively training a first model and a second model by adopting the target sample characteristic data and the service package label; and the model fusion subunit is used for fusing the trained first model and the trained second model to obtain the service package recommendation model.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the method of any one of the above via execution of the executable instructions.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of the above.
Exemplary embodiments of the present disclosure have the following advantageous effects:
acquiring original service data of a user; preprocessing original service data to obtain service data to be processed; performing characteristic analysis according to initial characteristic data in the service data to be processed to construct target characteristic data; and processing the target characteristic data by adopting a pre-trained service package recommendation model to determine a service package recommendation result of the user. On one hand, the exemplary embodiment provides a new service processing method, which constructs new target feature data by preprocessing original service data and performing feature analysis, and can fully and comprehensively mine feature data in the original service data, so that higher prediction accuracy is achieved when the newly constructed target feature data is adopted for predicting service package recommendation; on the other hand, compared with the existing clustering mode, the exemplary embodiment processes the target feature data with higher effectiveness obtained by data processing through the pre-trained service package recommendation model, determines the service package recommendation result, reduces the calculation amount, and improves the efficiency of service processing.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
Fig. 1 schematically illustrates a system architecture diagram of a service processing method in the present exemplary embodiment;
Fig. 2 schematically illustrates a flow chart of a service processing method in the present exemplary embodiment;
Fig. 3 schematically illustrates a sub-flow chart of a service processing method in the present exemplary embodiment;
Fig. 4 schematically illustrates another sub-flow chart of a service processing method in the present exemplary embodiment;
Fig. 5 schematically illustrates a flow chart for determining a service package recommendation model in the present exemplary embodiment;
Fig. 6 schematically illustrates a block diagram of the structure of a service processing apparatus in the present exemplary embodiment;
Fig. 7 schematically illustrates an electronic device for implementing the above method in the present exemplary embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
An exemplary embodiment of the present disclosure first provides a service processing method, and an application scenario of the method of the present embodiment may be: the operator recommends an appropriate call package, traffic package or comprehensive package, etc. for the user.
Fig. 1 is a system architecture diagram illustrating an operating environment of the exemplary embodiment. Referring to Fig. 1, the system 100 may include a terminal device 110 and a server 120, which may be connected via a network. The terminal device 110 may be a personal computer, a server, or the like, and is configured to obtain original service data for data processing. The server 120 may be a virtual platform or a physical server, and is configured to receive data sent by the terminal device 110 and predict the service package recommendation using a pre-trained service package recommendation model.
It should be understood that the number of each device shown in Fig. 1 is only exemplary, and any number of terminal devices or servers may be provided according to actual needs.
Based on the above description, the method in this exemplary embodiment may be applied to the terminal device 110 shown in fig. 1, for example, the terminal device 110 obtains the original service data, processes the original service data, and directly predicts the service package recommendation result of the user; the method may also be applied to the server 120 shown in fig. 1, for example, obtaining original service data from the terminal device 110, processing the original service data, and predicting a service package recommendation result of the user, which is not specifically limited in this disclosure.
The present exemplary embodiment is further described with reference to fig. 2, and as shown in fig. 2, the service processing method may include the following steps S210 to S240:
Step S210, obtaining original service data of the user.
The user is a user for whom a service package recommendation is needed, for example, a user who subscribes to a service package of an operator. The original service data is unprocessed service data related to the current user obtained from a data source, and may include multiple types of data. For example, the original service data may include the type of service package used by the user, such as convergence, 4G (4th-Generation, fourth-generation mobile communication technology), or 3G (3rd-Generation, third-generation mobile communication technology), as well as the network online duration, the billed amount for the current month, the accumulated traffic of the current month, the time continuously exceeding the package, the local outgoing voice call duration, age, gender, complaint importance, and so on. It may also include the name of the package currently used by the user and the specific package contents, which is not specifically limited in this disclosure. In this exemplary embodiment, obtaining the original service data of the user may mean obtaining the original service data within a preset time period, for example, the most recent month, two months, or half year before the current time, as determined by actual needs.
Step S220, preprocessing the original service data to obtain service data to be processed.
After the original service data is obtained, the original service data may be preprocessed, where the preprocessing may refer to preliminary analysis of the original service data to facilitate subsequent extraction of feature data, and the specific preprocessing may include performing visualization analysis of feature distribution, data cleaning, data sampling, or data encoding processing on the original service data. The service data to be processed refers to service data obtained after preprocessing.
Step S230, performing characteristic analysis according to the initial characteristic data in the service data to be processed to construct target characteristic data.
The initial characteristic data is the original characteristic data of the service data to be processed, such as the user's phone charge amount for the current month or the traffic used, and can be extracted directly from the service data to be processed. The target characteristic data is new characteristic data obtained by processing, analyzing, or computing statistics on the initial characteristic data. For example, by judging whether the user's phone charge amounts within a preset time period are integers, new characteristic data can be determined that indicates whether the user is sensitive to phone charges. The characteristic analysis may include statistical analysis, judgment analysis, calculation analysis, and the like of the initial characteristic data, for example, computing the minimum, maximum, median, and average values of the data, or judging the correlation between the initial characteristic data and the user's habits.
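The statistical and judgment analysis described above can be illustrated with a minimal sketch. The function name `build_target_features`, the input list `monthly_charges`, and the output keys are hypothetical names chosen for this illustration; the disclosure does not prescribe them.

```python
from statistics import mean, median

def build_target_features(monthly_charges):
    """Derive new (target) features from a user's initial monthly charge amounts.

    `monthly_charges` is a hypothetical list of charge amounts, one per month
    of the preset time period.
    """
    if not monthly_charges:
        raise ValueError("monthly_charges must not be empty")
    # Judgment analysis: is every charge amount a whole number?
    all_integer = all(float(c).is_integer() for c in monthly_charges)
    # Statistical analysis: minimum, maximum, median, and average values.
    return {
        "charge_min": min(monthly_charges),
        "charge_max": max(monthly_charges),
        "charge_median": median(monthly_charges),
        "charge_mean": mean(monthly_charges),
        "charge_all_integer": all_integer,
    }

features = build_target_features([50.0, 50.0, 100.0])
```

Here `charge_all_integer` stands in for the "is the charge amount an integer" judgment; how that flag maps to charge sensitivity is left to the model.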
Step S240, processing the target characteristic data by adopting a pre-trained service package recommendation model, and determining a service package recommendation result of the user.
The service package recommendation model is a machine learning model that is trained in advance on sample service data of a large number of users and can be used to recommend service packages. Inputting the target characteristic data into the service package recommendation model yields the service package recommendation result of the user. A service package is a customized package combining different services, for example, a package that bundles fixed amounts of short messages, call time, and traffic. The service contents and service quantities included in different service packages may differ; for example, user A may subscribe to a service package with more call time, and user B may subscribe to a service package with more traffic. The service package recommendation result indicates the category of service package suitable for the current user; for example, if user A makes calls frequently, a service package category with more call time is more suitable for user A. The recommendation result may be a direct result, for example, a determined service package category, or a code or identifier that indicates which service package it is. It may be a single recommendation, for example, the service package most suitable for the current user, or several recommendations, for example, the top few service packages suitable for the current user. The recommendation result may also be an indirect data result, for example, the probability that each service package category is suitable for the current user, from which the one or more categories to recommend can be determined. This is not specifically limited in this disclosure.
The expression form of the service package recommendation result may be a numerical value, a vector, a matrix or the like, and may specifically reflect the name, specific content or prediction result of the package.
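As an illustration of how an indirect result (a probability per package category) might be turned into one or several direct recommendations, consider the following sketch. The function `recommend_packages`, the package identifiers, and the probability values are all hypothetical.

```python
def recommend_packages(probabilities, k=1):
    """Return the k package identifiers with the highest predicted probability.

    `probabilities` maps a hypothetical package identifier to the model's
    probability that the package suits the user (the "indirect" result).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [package for package, _ in ranked[:k]]

# Made-up probabilities for three made-up package categories.
scores = {"voice_heavy": 0.62, "data_heavy": 0.30, "combined": 0.08}
best = recommend_packages(scores)          # single recommendation
top_two = recommend_packages(scores, k=2)  # several recommendations
```

With `k=1` the function returns the single most suitable package; a larger `k` returns the top few.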
Based on the above description, in the present exemplary embodiment, the original service data of the user is acquired; preprocessing original service data to obtain service data to be processed; performing characteristic analysis according to initial characteristic data in the service data to be processed to construct target characteristic data; and processing the target characteristic data by adopting a pre-trained service package recommendation model to determine a service package recommendation result of the user. On one hand, the exemplary embodiment provides a new service processing method, which constructs new target feature data by preprocessing original service data and performing feature analysis, and can fully and comprehensively mine feature data in the original service data, so that higher prediction accuracy is achieved when the newly constructed target feature data is adopted for predicting service package recommendation; on the other hand, compared with the existing clustering mode, the exemplary embodiment processes the target feature data with higher effectiveness obtained by data processing through the pre-trained service package recommendation model, determines the service package recommendation result, reduces the calculation amount, and improves the efficiency of service processing.
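The flow of steps S210 to S240 can be sketched as a simple chain of functions. This is only an outline under stated assumptions: the three callables passed in, the field names, and the 500-minute threshold are hypothetical stand-ins, not the preprocessing, feature construction, or model of the disclosure.

```python
def recommend_service_package(raw_data, preprocess, build_features, model):
    """Chain steps S220 to S240 for one user's raw service data (obtained in S210)."""
    pending = preprocess(raw_data)        # S220: raw data -> data to be processed
    features = build_features(pending)    # S230: initial features -> target features
    return model(features)                # S240: pre-trained model -> recommendation

# Hypothetical stand-ins so the sketch runs end to end.
def drop_missing(raw):
    return {key: value for key, value in raw.items() if value is not None}

def usage_features(data):
    # The 500-minute threshold is an arbitrary illustrative value.
    return {"heavy_voice": data.get("voice_minutes", 0) > 500}

def toy_model(features):
    return "voice_heavy" if features["heavy_voice"] else "data_heavy"

result = recommend_service_package(
    {"voice_minutes": 820, "data_gb": None}, drop_missing, usage_features, toy_model
)
```

Passing the stages in as callables keeps the outline independent of whichever concrete preprocessing and model are eventually used.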
In an exemplary embodiment, as shown in Fig. 3, the service package recommendation model is trained through the following steps:
Step S310, acquiring original sample service data of a plurality of users and service package labels corresponding to the original sample service data;
Step S320, preprocessing the original sample service data to obtain to-be-processed sample service data;
Step S330, performing characteristic analysis according to initial sample characteristic data in the to-be-processed sample service data to construct target sample characteristic data;
Step S340, training a service package recommendation model by adopting the target sample characteristic data and the service package label to obtain the service package recommendation model.
The present exemplary embodiment may train the service package recommendation model by acquiring real service data of a large number of users as original sample service data. Like the original service data, the original sample service data includes multiple types of data, for example, the type of service package used by a user (such as convergence, 4G, or 3G), the network online duration, the charge amount of the current month, the accumulated traffic of the current month, the time continuously exceeding the package, the local voice call duration, the age, the gender, the complaint importance, the name of the currently used package, and the specific package content. To facilitate subsequent data processing, the original sample service data may be converted into vector or matrix form after being acquired. The service package label is the ground-truth type of service package suited to the original sample service data, that is, the type of service package that, for a given kind of service data, achieves a better effect, suits the user better, or gives a better user experience. In this exemplary embodiment, the service package label may be information such as the name or type of the service package, as long as it identifies which service package is meant. Preferably, all optional service packages are encoded, for example, by one-hot encoding the name or category of each service package, to obtain vectors representing the different service packages, which are used as the service package labels.
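The one-hot encoding of service package labels mentioned above can be sketched as follows. The helper `one_hot_labels` is a hypothetical name, and sorting the package names to fix the vector positions is an assumption made here for reproducibility.

```python
def one_hot_labels(package_names):
    """Map each distinct package name to a one-hot vector."""
    ordered = sorted(set(package_names))  # fixed order so the encoding is reproducible
    index = {name: i for i, name in enumerate(ordered)}
    encoding = {}
    for name in ordered:
        vector = [0] * len(ordered)
        vector[index[name]] = 1           # exactly one position is set per package
        encoding[name] = vector
    return encoding

encoding = one_hot_labels(["4G", "3G", "convergence", "4G"])
```

Each package type receives a vector whose length equals the number of distinct packages, with a single 1 marking that package.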
The exemplary embodiment may obtain a large amount of original sample service data and the corresponding service package labels from historical data, where the time period covered may be determined according to actual needs. For example, when a large amount of data is needed, about half a year of data may be obtained; when a more timely service package recommendation model is to be trained on a moderate amount of data, about two or three months of data may be used. This is not specifically limited in this disclosure.
Preprocessing the original sample service data means first performing preliminary data processing on it to obtain the to-be-processed sample service data, which facilitates the subsequent extraction of characteristic data. The to-be-processed sample service data can then be further processed to construct target sample characteristic data, which is used to train the service package recommendation model. The training process may include: the initial service package recommendation model takes the target sample characteristic data as input and outputs a service package recommendation result; the model parameters are adjusted so that the output recommendation result approaches the service package label, until the accuracy of the model reaches a certain standard, the calculation converges, or a similar condition is met, at which point training can be considered finished. In the exemplary embodiment, the data set may be divided in a 7:3 ratio into a training set and a test set: 70% of the original sample service data forms the training set used to train the service package recommendation model, and the remaining 30% forms the test set used to test it, so as to improve the prediction accuracy of the service package recommendation model.
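The 70%/30% division into training and test sets, and the accuracy check used to decide when training is good enough, can be sketched as below. The shuffling with a fixed seed, the function names, and the toy samples are assumptions for illustration; the disclosure does not state how the split is drawn.

```python
import random

def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into a training set and a test set."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(samples)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, actual):
    """Fraction of predictions that match the service package labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Toy (feature, label) pairs standing in for sample service data.
samples = [(i, "4G" if i % 2 else "3G") for i in range(10)]
train_set, test_set = split_train_test(samples)
```

The model would be fitted on `train_set` only, and `accuracy` would then be computed on predictions for `test_set`.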
In an exemplary embodiment, as shown in fig. 4, the step S320 may include the following steps:
Step S410, performing data cleaning on a plurality of original sample service data;
Step S420, based on the service package information included in each original sample service data, performing data equalization processing on the cleaned original sample service data to generate intermediate sample service data, so that the service package information of the plurality of original sample service data is evenly distributed;
Step S430, classifying the intermediate sample service data according to its data type to obtain the to-be-processed sample service data.
In the exemplary embodiment, data cleaning may be performed on the obtained plurality of original sample service data to ensure the validity of the data and to avoid data anomalies affecting the efficiency of subsequent service processing. Before data cleaning, the exemplary embodiment may further perform data analysis on the original sample service data. For example, the distribution of each item of feature data in the original data set may be visualized or analyzed using third-party Python packages such as Matplotlib (a 2D plotting library for Python), NumPy (Numerical Python, an open-source scientific computing library), and Pandas (a Python data analysis module), so as to observe intuitively which important feature data exist in the original training set and test set, to clarify the nature of the problem, and to lay the groundwork for the subsequent data processing.
Further, data cleansing may be performed on the original sample service data, and in an exemplary embodiment, the step S410 may include:
and performing one or more data processing of removing repeated data, adjusting abnormal data and filling missing data on the original sample service data.
A large amount of original sample service data may contain abnormal data or data that affects service processing efficiency. For example, several records may share the same user ID (Identity document); to improve the validity of the original sample service data and the data processing efficiency, such repeated data may be removed. Abnormal data may refer to erroneous data or data whose values deviate too far in either direction; the exemplary embodiment may delete the abnormal data, or fill or replace it with a specific value such as the mean or the median. When service data is missing, the missing data may be filled in any of several ways, so as to ensure the integrity and validity of the original sample service data. The exemplary embodiment may perform data cleaning in any one of the above manners or in a combination of them, and the present disclosure does not specifically limit this.
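As a non-limiting illustration, the three cleaning operations described above may be sketched as follows. The field names `user_id` and `monthly_fee`, the validity bound `max_valid`, and the choice of the mean for missing values and the median for abnormal values are assumptions made only for this example.

```python
from statistics import mean, median


def clean_records(records, key="user_id", field="monthly_fee", max_valid=1000.0):
    """Remove repeated users, then repair missing and abnormal values of one field."""
    # 1. Remove repeated data: keep the first record seen for each user ID.
    seen, unique = set(), []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            unique.append(dict(rec))

    # 2. Collect the values that are present and inside the assumed valid range.
    valid = [r[field] for r in unique
             if r[field] is not None and 0 <= r[field] <= max_valid]
    fill_missing = mean(valid)     # assumed filler for missing data
    fill_abnormal = median(valid)  # assumed replacement for abnormal data

    # 3. Fill missing data and adjust abnormal data.
    for r in unique:
        if r[field] is None:
            r[field] = fill_missing
        elif not 0 <= r[field] <= max_valid:
            r[field] = fill_abnormal
    return unique


raw = [
    {"user_id": 1, "monthly_fee": 30.0},
    {"user_id": 1, "monthly_fee": 30.0},     # repeated user ID
    {"user_id": 2, "monthly_fee": None},     # missing value
    {"user_id": 3, "monthly_fee": 50.0},
    {"user_id": 4, "monthly_fee": 99999.0},  # abnormal value
    {"user_id": 5, "monthly_fee": 40.0},
]
cleaned = clean_records(raw)
print([r["monthly_fee"] for r in cleaned])
```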
Service package information generally exists in the original sample service data. For example, different users subscribe to different service packages, and the distribution of the subscribed service package categories is unbalanced across users: many users may subscribe to service package A while few subscribe to service package B. If model training is performed with such unbalanced original sample service data, the model may be excessively biased toward a certain type of service package and therefore generalize poorly. To address this problem, the exemplary embodiment may perform data equalization processing on the cleaned original sample service data according to the service package information included in the original sample service data. The service package information may include the category of the service package in the original sample service data, so the number of samples in each service package category can be counted based on the service package information of each original sample service data. The data equalization processing may reduce the original sample service data of service package categories with many samples, and/or add original sample service data for service package categories with few samples, for example by interpolating between a sample and its neighboring data to form new data, thereby solving the problem of unbalanced distribution of service package information in the original sample service data.
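As a non-limiting illustration, adding data for under-represented service package categories by interpolating toward neighboring data may be sketched as follows. The function name, the use of the single nearest neighbor, the squared Euclidean distance, and the toy feature vectors are assumptions made only for this example; the disclosure does not fix a particular interpolation scheme.

```python
import random
from collections import Counter


def oversample_by_interpolation(features, labels, seed=0):
    """Add synthetic samples to minority categories until all categories are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    new_features, new_labels = list(features), list(labels)
    for label, count in counts.items():
        members = [f for f, l in zip(features, labels) if l == label]
        for _ in range(target - count):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            if others:
                # Nearest neighbor of the same category (squared Euclidean distance).
                neighbour = min(
                    others,
                    key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
                )
            else:
                neighbour = base
            # The new point lies on the segment between the sample and its neighbor.
            t = rng.random()
            new_features.append([a + t * (b - a) for a, b in zip(base, neighbour)])
            new_labels.append(label)
    return new_features, new_labels


# Hypothetical data: four users on package "A", two users on package "B".
features = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.2], [5.0, 5.0], [5.5, 4.5]]
labels = ["A", "A", "A", "A", "B", "B"]
balanced_features, balanced_labels = oversample_by_interpolation(features, labels)
print(Counter(balanced_labels))
```

Reducing the majority category instead (undersampling) would be the complementary option mentioned in the text; the sketch shows only the additive branch.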
In practical applications, some of the service data is discrete: for example, gender information is either male or female, and complaint information may be divided into 4 levels according to importance. Other data is continuous: for example, the previous month's data traffic may be a continuous value in the range of 0G to several tens of G (where G, gigabytes, measures the amount of data consumed by the terminal device when accessing the internet). Considering that the feature data of discrete service data and of continuous service data may call for different processing methods, the exemplary embodiment may classify the generated intermediate sample service data as discrete or continuous according to its data type and process each kind accordingly. Specifically, intermediate sample service data of a discrete type may be encoded; for example, since the name of a service package carries little significance in itself, one-hot encoding may be applied to discrete features such as the package name to indicate which service package is meant. Intermediate sample service data of a continuous type may be normalized so that it lies on a uniform scale, which avoids problems such as difficult convergence caused by differing data scales.
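As a non-limiting illustration, one-hot encoding of a discrete feature and normalization of a continuous feature may be sketched as follows. Min-max scaling is chosen here as one possible normalization; the package names and traffic values are hypothetical.

```python
def one_hot(values):
    """Return the sorted category list and one indicator vector per value."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def min_max_normalize(values):
    """Scale continuous values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


package_names = ["basic", "premium", "basic", "family"]  # discrete feature
traffic_gb = [0.0, 20.0, 5.0, 10.0]                      # continuous feature

categories, encoded = one_hot(package_names)
scaled = min_max_normalize(traffic_gb)
print(categories, encoded, scaled)
```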
In an exemplary embodiment, the step S330 may include the following steps:
acquiring correlation data of initial sample characteristic data and service package information in to-be-processed sample service data;
and constructing target sample characteristic data according to the correlation data and the statistical information of the initial sample characteristic data.
The exemplary embodiment can construct new target sample feature data for model training by comprehensively analyzing the initial sample feature data in the to-be-processed sample service data. Specifically, correlation data between the initial sample feature data in the to-be-processed sample service data and the service package information may be obtained first. The correlation data may be a numerical value representing the correlation, or image data capable of representing the correlation, and the present disclosure does not specifically limit this. For example, the exemplary embodiment may use the third-party libraries Seaborn (a Python graphics visualization package) and Matplotlib to draw a heat map of each item of initial sample feature data and the service package information in the to-be-processed sample service data as the correlation data, so as to learn the correlation between each item of initial feature data in the original sample service data and the service package used by the user. For example, a user who adopts a service package with a large 4G traffic allowance generally pays a higher fee, so the two present a positive correlation in the heat map.
In addition, a specific numerical value of the correlation may be determined from the correlation data and retained for subsequent processing. For example, initial sample feature data with a low correlation value may subsequently be deleted directly, so as to reduce the amount of feature data and improve data processing efficiency; in a scenario where the service package category is predicted, the number of payments has a low correlation with the service package category in the service package information and may be deleted in this way. Specifically, the initial feature data may be deleted according to a preset threshold, for example by deleting the roughly 30% of the feature data with the lowest correlation values, so as to reduce the data dimension.
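As a non-limiting illustration, computing a correlation value for each feature and deleting the least correlated share may be sketched as follows. The Pearson coefficient, the integer coding of the service package category, the column names, and the toy values are assumptions made only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / sqrt(vx * vy)


def drop_weakest(columns, target, drop_fraction=0.3):
    """Delete the given fraction of features with the lowest absolute correlation."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    n_drop = int(len(scores) * drop_fraction)
    weakest = sorted(scores, key=scores.get)[:n_drop]
    kept = {name: vals for name, vals in columns.items() if name not in weakest}
    return kept, scores


# Hypothetical initial features for six users.
columns = {
    "monthly_traffic_gb": [1, 2, 3, 4, 5, 6],
    "monthly_fee": [10, 22, 29, 41, 52, 60],
    "call_minutes": [100, 90, 200, 210, 300, 280],
    "payment_count": [2, 1, 2, 1, 2, 1],
}
package_tier = [1, 1, 2, 2, 3, 3]  # assumed integer coding of the package category

kept, scores = drop_weakest(columns, package_tier)
print(sorted(kept))
```

With four features and a 30% threshold, one feature is deleted; in this toy data it is the payment count, which matches the example given in the text.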
Further, target sample feature data may be constructed according to the obtained correlation data. The target sample feature data differs from the initial sample feature data: it may be new feature data constructed from the correlation calculation and from a business understanding of the initial features. For example, whether the telephone charge in the initial sample feature data is an integer may be judged, and new feature data indicating whether the user may have exceeded the current service package and whether the user is sensitive to the telephone charge may be determined from the result; or the minimum telephone charge over the past four months may be computed from the initial sample feature data to determine whether the user is suited to a cheaper service package; or the average unit price of traffic may be computed from the initial sample feature data to determine whether the user is suited to a high-traffic service package; and so on. The newly obtained feature data may be processed according to whether its data type is discrete or continuous. For example, discrete feature data may be represented by encoding, such as encoding sensitivity to the telephone charge as "1" and insensitivity as "0"; continuous feature data, such as the average unit price of traffic, may be encoded directly or encoded after normalization. This is described only by way of example; the content or representation of the encoding may be enriched according to actual needs, and the present disclosure does not specifically limit this. By constructing the target sample feature data from the correlation data and business understanding, the exemplary embodiment can improve the effectiveness of the data and make the optimization more manageable and controllable.
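As a non-limiting illustration, the three example features described above may be derived from one user record as follows. The record layout, the field names, and the output feature names are hypothetical; how the integer flag is interpreted as fee sensitivity is left to the business rules and is not decided by this sketch.

```python
def build_features(record):
    """Derive example target features from one user's initial feature data."""
    fees = record["fees_last_4_months"]  # oldest first, current month last
    current_fee = fees[-1]
    traffic = record["traffic_gb"]
    return {
        # 1 if the current telephone charge is a whole number, otherwise 0.
        "fee_is_integer": 1 if float(current_fee).is_integer() else 0,
        # Minimum telephone charge over the past four months.
        "min_fee_4m": min(fees),
        # Average unit price of traffic; 0.0 when no traffic was used.
        "avg_price_per_gb": current_fee / traffic if traffic > 0 else 0.0,
    }


user = {"fees_last_4_months": [58.0, 58.0, 63.5, 58.0], "traffic_gb": 29.0}
features_for_user = build_features(user)
print(features_for_user)
```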
Further, after the target sample feature data is constructed, the service processing method may further include:
acquiring importance ranking of target sample characteristic data;
and screening the target sample characteristic data according to the importance ranking, and determining the target sample characteristic data finally used for carrying out service package recommendation model training.
In practical applications, the constructed target sample feature data may include feature data of many dimensions or of multiple types. For example, when feature data is constructed from the service data of users of an operator's service packages, approximately 50 items of target sample feature data may be constructed, and these may include data with very little or no relation to service package recommendation, such as the user's number. To avoid the computational complexity caused by computing on invalid or irrelevant data, the exemplary embodiment may further screen the target sample feature data. Specifically, an importance ranking of all the obtained target sample feature data may be calculated with a random forest model; then either the target sample feature data ranked within a preset number of top positions is selected, or the target sample feature data ranked within a preset bottom percentage is deleted, for example low-ranked features such as the user's number of payments or user ID number. The target sample feature data is thereby further processed to improve its effectiveness, and the training process of the service package recommendation model may then be executed with the screened target sample feature data.
In an exemplary embodiment, the step S340 may include:
respectively training a first model and a second model by adopting target sample characteristic data and a service package label;
and fusing the trained first model and the trained second model to obtain a business package recommendation model.
To obtain a better model, the exemplary embodiment may use a fusion model. Specifically, an Xboost model and a LightGBM model, both of which are high-performing gradient boosting tree models, may be employed. During training, the target sample feature data and the corresponding service package labels may be input into a first model (e.g., an Xboost model) and a second model (e.g., a LightGBM model) respectively, and the training and testing processes are performed for each. Finally, a specific combination strategy, such as the Stacking method, may be adopted to fuse the first model and the second model into a fused model, that is, the service package recommendation model.
Fig. 5 is a flowchart illustrating a method for determining a business package recommendation model in the present exemplary embodiment, which may specifically include the following steps:
step S510, selecting a first model as an Xboost model;
step S520, selecting the second model as a LightGBM model;
step S530, training an Xboost model by adopting target sample characteristic data and a business package label;
step S540, training a LightGBM model by adopting target sample characteristic data and a service package label;
and step S550, fusing the Xboost model and the LightGBM model by adopting a Stacking method to obtain a service package recommendation model.
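As a non-limiting illustration, the Stacking strategy of steps S530 to S550 may be sketched as follows: the base models are trained on one part of the data, their predictions on a held-out part become the input of a meta-model, and the meta-model produces the final result. To keep the sketch self-contained, two simple threshold classifiers stand in for the Xboost and LightGBM models, the task is reduced to two classes, and the meta-model is a majority-vote lookup table; all class names, function names, and data here are assumptions made only for this example.

```python
from collections import Counter, defaultdict


class Stump:
    """Stand-in base model: thresholds one feature midway between the class means."""

    def __init__(self, feature_index):
        self.feature_index = feature_index

    def fit(self, X, y):
        i = self.feature_index
        zeros = [row[i] for row, label in zip(X, y) if label == 0]
        ones = [row[i] for row, label in zip(X, y) if label == 1]
        m0, m1 = sum(zeros) / len(zeros), sum(ones) / len(ones)
        self.threshold = (m0 + m1) / 2
        self.high_label = 1 if m1 >= m0 else 0
        return self

    def predict(self, X):
        i = self.feature_index
        return [self.high_label if row[i] > self.threshold else 1 - self.high_label
                for row in X]


class LookupMeta:
    """Stand-in meta-model: most frequent true label per tuple of base predictions."""

    def fit(self, base_preds, y):
        votes = defaultdict(Counter)
        for preds, label in zip(base_preds, y):
            votes[tuple(preds)][label] += 1
        self.table = {k: c.most_common(1)[0][0] for k, c in votes.items()}
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, base_preds):
        return [self.table.get(tuple(p), self.default) for p in base_preds]


def fit_stacking(X, y, base_models, meta_model, holdout_start):
    """Train base models on the first part, the meta-model on held-out predictions."""
    for model in base_models:
        model.fit(X[:holdout_start], y[:holdout_start])
    hold_preds = list(zip(*(m.predict(X[holdout_start:]) for m in base_models)))
    meta_model.fit(hold_preds, y[holdout_start:])
    return base_models, meta_model


def predict_stacking(X, base_models, meta_model):
    base_preds = list(zip(*(m.predict(X) for m in base_models)))
    return meta_model.predict(base_preds)


# Hypothetical rows: [monthly traffic, monthly fee]; label 1 marks the larger package.
X = [[1, 10], [2, 12], [8, 50], [9, 55], [1.5, 11], [8.5, 52], [2.5, 14], [7.5, 48]]
y = [0, 0, 1, 1, 0, 1, 0, 1]
bases, meta = fit_stacking(X, y, [Stump(0), Stump(1)], LookupMeta(), holdout_start=4)
stacked_prediction = predict_stacking([[3, 15], [9, 60]], bases, meta)
print(stacked_prediction)
```

Training the meta-model only on predictions for data the base models have not seen is what distinguishes Stacking from simply averaging the two models' outputs.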
The result file produced by the service package recommendation model obtained in the exemplary embodiment may be evaluated using the macro-averaged F1-Score (the F1 Score being a metric for classification problems).
First, for each service package category k, the following are counted: TP_k (samples of category k correctly predicted as category k), FP_k (samples of other categories wrongly predicted as category k), and FN_k (samples of category k wrongly predicted as other categories).
Secondly, the precision precision_k and the recall recall_k under each category are calculated from these counts as follows:
$$\mathrm{precision}_k = \frac{TP_k}{TP_k + FP_k}, \qquad \mathrm{recall}_k = \frac{TP_k}{TP_k + FN_k}$$
Then, the F1-Score under each category is calculated from the above results as follows:
$$F1_k = \frac{2 \cdot \mathrm{precision}_k \cdot \mathrm{recall}_k}{\mathrm{precision}_k + \mathrm{recall}_k}$$
Further, the final evaluation result is obtained by averaging the F1-Score over all categories, where n is the number of service package categories:
$$\text{F1-Score} = \frac{1}{n} \sum_{k=1}^{n} F1_k$$
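As a non-limiting illustration, the macro-averaged F1-Score described above may be computed as follows. The function name, the convention of scoring 0.0 when a denominator is zero, and the toy labels are assumptions made only for this example.

```python
def macro_f1(y_true, y_pred):
    """Average of the per-category F1 scores over all categories that occur."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for k in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == k and p == k)
        fp = sum(1 for t, p in pairs if t != k and p == k)
        fn = sum(1 for t, p in pairs if t == k and p != k)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)


# Hypothetical true and predicted service package categories for six users.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

Because each category contributes equally to the average, a model that ignores a rarely subscribed service package is penalized as heavily as one that ignores a popular package.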
The calculation results show that the exemplary embodiment achieves a score of 0.82 on the data set of existing China Unicom users released by the Unicom research institute. A comparison with several existing techniques is shown in the following table:
Table 1. Comparison of different methods for personalized service package recommendation
Method | F1-Score |
KNN | 0.67 |
CNN (convolutional neural network) | 0.79 |
Xboost | 0.76 |
LightGBM | 0.74 |
The present exemplary embodiment | 0.82 |
The final result shows that the exemplary embodiment has a clear advantage over traditional simple machine learning methods, with a substantially higher F1-Score, and that, with careful feature engineering, it also achieves a certain improvement over the convolutional neural network while reducing the computational complexity of system operation.
The exemplary embodiment of the present disclosure also provides a service processing apparatus. Referring to fig. 6, the apparatus 600 may include a service data obtaining module 610, configured to obtain original service data of a user; a service data preprocessing module 620, configured to preprocess the original service data to obtain service data to be processed; the feature data construction module 630 is configured to perform feature analysis according to initial feature data in the service data to be processed, and construct target feature data; and the recommendation result determining module 640 is configured to process the target feature data by using a pre-trained service package recommendation model, and determine a service package recommendation result of the user.
In an exemplary embodiment, the service package recommendation model is trained by the following units: a sample data acquisition unit, configured to acquire original sample service data of a plurality of users and a service package label corresponding to each original sample service data; a sample data preprocessing unit, configured to preprocess the original sample service data to obtain to-be-processed sample service data; a sample feature data construction unit, configured to perform feature analysis according to initial sample feature data in the to-be-processed sample service data and construct target sample feature data; and a recommendation model obtaining unit, configured to train a service package recommendation model using the target sample feature data and the service package label to obtain the service package recommendation model.
In an exemplary embodiment, the sample data preprocessing unit includes: the data cleaning subunit is used for cleaning data of a plurality of original sample service data; the data equalization processing subunit is used for performing data equalization processing on the original sample service data after data cleaning based on the service package information included in each original sample service data to generate intermediate sample service data so as to equalize the service package information distribution of the plurality of original sample service data; and the data classification processing subunit is used for performing classification processing on the intermediate sample service data according to the data type of the intermediate sample service data to obtain the to-be-processed sample service data.
In an exemplary embodiment, the data cleansing subunit is configured to perform one or more data processes of removing duplicate data, adjusting abnormal data, and filling missing data on the plurality of original sample service data.
In an exemplary embodiment, the sample feature data construction unit includes: the correlation data acquisition subunit is used for acquiring correlation data of initial sample characteristic data and service package information in the to-be-processed sample service data; and the sample characteristic data constructing subunit is used for constructing the target sample characteristic data according to the correlation data and the statistical information of the initial sample characteristic data.
In an exemplary embodiment, the service processing apparatus further includes: the sequencing information acquisition unit is used for acquiring importance sequencing of the target sample characteristic data after the target sample characteristic data is constructed; and the target sample characteristic data screening unit is used for screening the target sample characteristic data according to the importance ranking and determining the target sample characteristic data finally used for carrying out business package recommendation model training.
In an exemplary embodiment, the recommendation model obtaining unit includes: the model training subunit is used for respectively training the first model and the second model by adopting the target sample characteristic data and the service package label; and the model fusion subunit is used for fusing the trained first model and the trained second model to obtain the service package recommendation model.
The specific details of each module/unit in the above-mentioned apparatus have been described in detail in the embodiment of the method section, and the details that are not disclosed may refer to the contents of the embodiment of the method section, and therefore are not described herein again.
Exemplary embodiments of the present disclosure also provide an electronic device capable of implementing the above method.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module" or "system."
An electronic device 700 according to such an exemplary embodiment of the present disclosure is described below with reference to fig. 7. The electronic device 700 shown in fig. 7 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 7, electronic device 700 is embodied in the form of a general purpose computing device. The components of the electronic device 700 may include, but are not limited to: the at least one processing unit 710, the at least one memory unit 720, a bus 730 connecting different system components (including the memory unit 720 and the processing unit 710), and a display unit 740.
The memory unit 720 stores program code, which may be executed by the processing unit 710 such that the processing unit 710 performs the steps according to various exemplary embodiments of the present disclosure as described in the above-mentioned "exemplary methods" section of this specification. For example, the processing unit 710 may perform the steps shown in fig. 2, 3, 4, or 5, and so on.
The storage unit 720 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)721 and/or a cache memory unit 722, and may further include a read only memory unit (ROM) 723.
The memory unit 720 may also include programs/utilities 724 having a set (at least one) of program modules 725, such program modules 725 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 700 may also communicate with one or more external devices 800 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 700, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 700 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 750. Also, the electronic device 700 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via the network adapter 760. As shown, the network adapter 760 communicates with the other modules of the electronic device 700 via the bus 730. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the exemplary embodiments of the present disclosure.
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the terminal device.
Exemplary embodiments of the present disclosure also provide a program product for implementing the above method, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit according to an exemplary embodiment of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the following claims.
Claims (10)
1. A method for processing a service, comprising:
acquiring original service data of a user;
preprocessing the original service data to obtain service data to be processed;
performing characteristic analysis according to initial characteristic data in the service data to be processed to construct target characteristic data;
and processing the target characteristic data by adopting a pre-trained service package recommendation model, and determining a service package recommendation result of the user.
2. The method of claim 1, wherein the business package recommendation model is trained by:
acquiring original sample service data of a plurality of users and a service package label corresponding to each original sample service data;
preprocessing the original sample service data to obtain sample service data to be processed;
performing characteristic analysis according to initial sample characteristic data in the to-be-processed sample service data to construct target sample characteristic data;
and training a service package recommendation model by adopting the target sample characteristic data and the service package label to obtain the service package recommendation model.
3. The method of claim 2, wherein the preprocessing the original sample service data to obtain to-be-processed sample service data comprises:
performing data cleaning on a plurality of original sample service data;
performing data equalization processing on the original sample service data after data cleaning based on the service package information included in each original sample service data to generate intermediate sample service data so as to equalize the service package information distribution of the plurality of original sample service data;
and classifying the intermediate sample service data according to the data type of the intermediate sample service data to obtain the to-be-processed sample service data.
4. The method of claim 3, wherein the performing data cleaning on the plurality of original sample service data comprises:
and performing one or more data processing of removing repeated data, adjusting abnormal data and filling missing data on the original sample service data.
5. The method of claim 2, wherein the performing characteristic analysis according to the initial sample characteristic data in the to-be-processed sample service data to construct target sample characteristic data comprises:
acquiring correlation data between the initial sample characteristic data in the to-be-processed sample service data and the service package information;
and constructing the target sample characteristic data according to the correlation data and statistical information of the initial sample characteristic data.
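One way to read claim 5 is sketched below in Python: compute a correlation between each initial feature and a numeric encoding of the package, keep the features that correlate strongly enough, and derive a standardized version from their mean and standard deviation. Pearson correlation, the threshold `min_abs_corr`, the z-score construction, and the numeric `package_tier` encoding are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_target_features(columns, package_tier, min_abs_corr=0.3):
    """Keep features correlated with the package and standardize them.

    `columns` maps a feature name to its list of values; `package_tier` is a
    hypothetical numeric encoding of each user's package.
    """
    target = {}
    for name, values in columns.items():
        mean = sum(values) / len(values)
        std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        if std == 0:
            continue  # A constant feature carries no information.
        if abs(pearson(values, package_tier)) < min_abs_corr:
            continue  # Too weakly related to the package to keep.
        target[name + "_z"] = [(v - mean) / std for v in values]
    return target
```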
6. The method of claim 5, wherein after constructing the target sample characteristic data, the method further comprises:
acquiring an importance ranking of the target sample characteristic data;
and screening the target sample characteristic data according to the importance ranking to determine the target sample characteristic data finally used for training the service package recommendation model.
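The screening step of claim 6 can be sketched as a simple top-k selection. The claim does not say where the importance scores come from or how many features are kept, so the score dictionary (for example, importances reported by a tree-based model) and the cutoff `k` are assumptions.

```python
def select_top_features(importances, k):
    """Rank features by importance (descending) and keep the top k names."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

For instance, with hypothetical scores `{"data_gb": 0.5, "voice_min": 0.3, "age": 0.05, "sms": 0.15}` and `k=2`, only `data_gb` and `voice_min` would be passed on to model training.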
7. The method of claim 2, wherein the training a service package recommendation model using the target sample characteristic data and the service package label to obtain the service package recommendation model comprises:
respectively training a first model and a second model using the target sample characteristic data and the service package label;
and fusing the trained first model and the trained second model to obtain the service package recommendation model.
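Claim 7 does not specify how the two trained models are fused. A common option, assumed here purely for illustration, is weighted soft voting: average the per-package probabilities predicted by each model and recommend the package with the highest fused score. The package names and the `weight_a` parameter are hypothetical.

```python
def fuse_predictions(probs_a, probs_b, weight_a=0.5):
    """Fuse two models' class probabilities by weighted averaging.

    `probs_a` and `probs_b` map a package name to the probability predicted
    by the first and second model. Returns (best_package, fused_scores).
    """
    labels = set(probs_a) | set(probs_b)
    fused = {
        label: weight_a * probs_a.get(label, 0.0)
        + (1.0 - weight_a) * probs_b.get(label, 0.0)
        for label in labels
    }
    best = max(fused, key=fused.get)
    return best, fused
```

Other fusion schemes, such as stacking a meta-model on top of both outputs, would satisfy the claim wording equally well.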
8. A service processing apparatus, comprising:
a service data acquisition module configured to acquire original service data of a user;
a service data preprocessing module configured to preprocess the original service data to obtain to-be-processed service data;
a characteristic data construction module configured to perform characteristic analysis according to initial characteristic data in the to-be-processed service data and construct target characteristic data;
and a recommendation result determining module configured to process the target characteristic data using a pre-trained service package recommendation model and determine a service package recommendation result for the user.
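The four modules of claim 8 can be pictured as stages of a pipeline. The Python sketch below wires them together as plain callables; the class name, method names, and signatures are hypothetical and only show how data flows from one module to the next.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ServiceProcessingApparatus:
    """Hypothetical composition of the four claimed modules."""

    acquire: Callable[[str], dict]         # service data acquisition module
    preprocess: Callable[[dict], dict]     # service data preprocessing module
    build_features: Callable[[dict], list] # characteristic data construction module
    recommend: Callable[[list], str]       # recommendation result determining module

    def run(self, user_id: str) -> str:
        """Pass one user's data through all four modules in order."""
        raw = self.acquire(user_id)
        pending = self.preprocess(raw)
        features = self.build_features(pending)
        return self.recommend(features)
```

Keeping each module behind its own callable means the pre-trained model used by `recommend` can be replaced without touching acquisition or preprocessing.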
9. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111170511.4A CN113868534A (en) | 2021-10-08 | 2021-10-08 | Service processing method, service processing device, electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111170511.4A CN113868534A (en) | 2021-10-08 | 2021-10-08 | Service processing method, service processing device, electronic device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113868534A (en) | 2021-12-31 |
Family
ID=79001904
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111170511.4A Pending CN113868534A (en) | 2021-10-08 | 2021-10-08 | Service processing method, service processing device, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113868534A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114723520A (en) * | 2022-03-21 | 2022-07-08 | 中国联合网络通信集团有限公司 | Package recommendation method and device, electronic equipment and storage medium |
2021-10-08: CN application CN202111170511.4A (patent/CN113868534A/en), status: active, pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110070391B (en) | | Data processing method and device, computer readable medium and electronic equipment |
US20190333118A1 (en) | | Cognitive product and service rating generation via passive collection of user feedback |
CN112633962B (en) | | Service recommendation method and device, computer equipment and storage medium |
CN111797320B (en) | | Data processing method, device, equipment and storage medium |
CN113627566B (en) | | Phishing early warning method and device and computer equipment |
CN113015010B (en) | | Push parameter determination method, device, equipment and computer readable storage medium |
CN112241327A (en) | | Shared information processing method and device, storage medium and electronic equipment |
KR20230034332A (en) | | Interpretation evaluation for search queries |
CN111091460A (en) | | Data processing method and device |
CN113762421A (en) | | Training method of classification model, traffic analysis method, device and equipment |
CN112785089A (en) | | Agent service configuration method and device, electronic equipment and storage medium |
CN113762973A (en) | | Data processing method and device, computer readable medium and electronic equipment |
CN111709825A (en) | | Abnormal product identification method and system |
CN113868534A (en) | | Service processing method, service processing device, electronic device and storage medium |
CN109697224B (en) | | Bill message processing method, device and storage medium |
CN107644042B (en) | | Software program click rate pre-estimation sorting method and server |
CN117194779A (en) | | Marketing system optimization method, device and equipment based on artificial intelligence |
CN116304352A (en) | | Message pushing method, device, equipment and storage medium |
CN115376668B (en) | | Big data business analysis method and system applied to intelligent medical treatment |
CN115719183A (en) | | Power customer self-feedback service evaluation method and system based on weight dynamic grading |
CN116910341A (en) | | Label prediction method and device and electronic equipment |
CN115185606A (en) | | Method, device, equipment and storage medium for obtaining service configuration parameters |
CN115017362A (en) | | Data processing method, electronic device and storage medium |
CN114897607A (en) | | Data processing method and device for product resources, electronic equipment and storage medium |
WO2021115269A1 (en) | | User cluster prediction method, apparatus, computer device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||