CN110674100B - User demand prediction method and framework based on full-channel operation data - Google Patents


Info

Publication number
CN110674100B
Authority
CN
China
Prior art keywords
data
user
machine learning
channel
learning model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910928706.7A
Other languages
Chinese (zh)
Other versions
CN110674100A (en)
Inventor
李虎
曾毅峰
王之良
徐飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co ltd Credit Card Center
Original Assignee
Shanghai Pudong Development Bank Co ltd Credit Card Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co ltd Credit Card Center
Priority to CN201910928706.7A
Publication of CN110674100A
Application granted
Publication of CN110674100B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Fuzzy Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a user demand prediction method and framework based on full-channel operation data. The method comprises the following steps. Step 1: data collection, namely acquiring the users' raw operation data from every system of the full channel and storing it in the file system of a big data framework. Step 2: data processing, namely cleaning and processing the raw operation data acquired from every system of the full channel and then constructing user portrait data. Step 3: establishing a machine learning model, namely dividing the user portrait data into a training data set for training the machine learning model and a test data set for verifying it, and then training and verifying the model with these two data sets to obtain the final trained and verified machine learning model. Step 4: service recommendation, namely evaluating the final machine learning model, using it to predict the demand that corresponds to a user's raw operation data, and recommending the matching service. Compared with the prior art, the method is better suited to large volumes of data and provides a good user experience.

Description

User demand prediction method and framework based on full-channel operation data
Technical Field
The invention relates to the technical field of computers, in particular to a user demand prediction method and a user demand prediction framework based on full-channel operation data.
Background
As new technologies mature, advanced applications will emerge from the convergence of 5G, artificial intelligence and the Internet of Things, creating an intelligent, connected world that affects every individual, industry, society and economy. In particular, artificial intelligence has opened a new era for the potential of mobile applications. Mobile application developers have used artificial intelligence in their products for several years; Siri from Apple Inc. is one example. Machine learning is developing rapidly, and users now expect flexible algorithms that improve their experience.
Since 2013, the explosive growth of Internet finance has pushed big data to a new peak. The rapid development of Internet finance and consumer finance is now severely testing the traditional customer operation model of the banking industry, and new means of customer operation are urgently needed. The traditional customer operation model has the following shortcomings:
(1) Insufficient information coverage: the personal credit reference center of the central bank currently holds records on 860 million natural persons, but only 300 million of them have credit records;
(2) Insufficient information validity: credit records come mainly from financial institutions such as commercial banks and rural credit agencies, and have serious shortcomings in timeliness, comprehensiveness and hierarchy;
(3) There are many kinds of services, and customers cannot find the ones they need;
(4) The systems are not intelligent enough;
(5) The results of the customer experience cannot be fed back in time.
Disclosure of Invention
The present invention provides a method and architecture for predicting user requirements based on full-channel operation data in order to overcome the above-mentioned drawbacks of the prior art.
The purpose of the invention can be realized by the following technical scheme:
A user demand prediction method based on full-channel operation data comprises the following steps:
step 1: acquiring the users' raw operation data from every system of the full channel, and storing it in the file system of a big data framework;
step 2: cleaning and processing the raw operation data acquired from every system of the full channel, and then constructing user portrait data;
step 3: establishing a machine learning model, dividing the user portrait data into a training data set for training the machine learning model and a test data set for verifying it, and then training and verifying the model with these two data sets to obtain the final trained and verified machine learning model;
step 4: evaluating the final trained and verified machine learning model, then using it to predict the demand that corresponds to a user's raw operation data, and recommending the matching service.
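Steps 3 and 4 can be illustrated with a minimal sketch. It is not the patented model: the frequency-based "model", the function names and the sample rows are all assumptions chosen only to show the split, train, verify and recommend sequence in plain Python.

```python
import random
from collections import Counter, defaultdict

def split_dataset(rows, test_ratio=0.25, seed=0):
    """Shuffle the portrait rows and split them into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def train(rows):
    """Learn, for each channel, the service that was requested most often."""
    counts = defaultdict(Counter)
    for channel, service in rows:
        counts[channel][service] += 1
    return {channel: c.most_common(1)[0][0] for channel, c in counts.items()}

def accuracy(model, rows):
    """Share of rows whose service the model predicts correctly."""
    hits = sum(1 for channel, service in rows if model.get(channel) == service)
    return hits / len(rows)

# Hypothetical (channel, requested service) pairs standing in for portrait data.
rows = [("APP", "installment")] * 6 + [("IVR", "bill_query")] * 6

train_rows, test_rows = split_dataset(rows)   # step 3: divide the data
model = train(train_rows)                     # step 3: train
test_accuracy = accuracy(model, test_rows)    # step 3/4: verify and evaluate
recommended = model["APP"]                    # step 4: recommend a service
```

The held-out test set is never seen by `train`, which is what makes `test_accuracy` a meaningful check before the model is used for recommendation.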
Further, the full channel in step 1 includes an animation channel, an APP channel, a WeChat channel, an IVR (Interactive Voice Response) channel, a payment service window channel, and a CSR (Customer Service Representative) manual customer service channel.
Further, the user portrait data in step 2 includes customer basic information, customer transaction information, customer activity information and customer buried-point (event tracking) information.
Further, the method also comprises step 5: after the user receives the service corresponding to the predicted demand, collecting user feedback and using it to train and optimize the machine learning model.
Further, step 1 specifically comprises: using an ETL tool to acquire the users' raw operation data from every system of the full channel, and storing it in the file system of a big data framework.
Further, the big data framework file system in step 1 is the HDFS file system.
Further, the indexes of the model evaluation in step 4 include classification, regression, ranking, clustering, topic models and recommendation.
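As an illustration of how the four kinds of portrait information might be held together, the following sketch uses a Python dataclass. The field names, the `to_features` method and the sample values are assumptions; the patent does not specify a schema.

```python
from dataclasses import dataclass, field

@dataclass
class UserPortrait:
    """One customer's portrait; the layout is hypothetical."""
    customer_id: str
    basic: dict = field(default_factory=dict)           # customer basic information
    transactions: list = field(default_factory=list)    # customer transaction information (amounts)
    activities: list = field(default_factory=list)      # customer activity information
    tracked_events: list = field(default_factory=list)  # buried-point (event tracking) information

    def to_features(self):
        """Flatten the portrait into a numeric feature dictionary."""
        return {
            "age": self.basic.get("age", 0),
            "txn_count": len(self.transactions),
            "txn_total": sum(self.transactions),
            "activity_count": len(self.activities),
            "event_count": len(self.tracked_events),
        }

portrait = UserPortrait(
    customer_id="c001",
    basic={"age": 30},
    transactions=[120.0, 80.0],
    activities=["spring_promo"],
    tracked_events=["open_app", "view_bill", "view_installment"],
)
features = portrait.to_features()
```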
The invention also provides a framework for the user demand prediction method based on full-channel operation data, and the framework comprises the following modules:
the operation data acquisition module, which acquires raw data with an ETL tool and stores it in an HDFS file system;
the feature engineering module, which performs feature construction, feature extraction and feature selection on the raw data;
the model training module, which trains the machine learning model;
the model verification module, which verifies the machine learning model;
and the model application module, which runs the trained and verified machine learning model online.
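The way the five modules hand data to one another can be sketched as a chain of plain functions. Every function body below is a stand-in (no ETL tool or HDFS is involved), and the names and the `min_score` release gate are assumptions made for illustration.

```python
from collections import Counter

def collect():
    """Operation data acquisition: stand-in for an ETL job that writes to HDFS."""
    return [
        {"channel": "APP", "service": "installment"},
        {"channel": "APP", "service": "installment"},
        {"channel": "IVR", "service": "bill_query"},
    ]

def engineer(raw):
    """Feature engineering: reduce each record to a (feature, label) pair."""
    return [(r["channel"], r["service"]) for r in raw]

def train(samples):
    """Model training: remember the most frequent label overall."""
    return Counter(label for _, label in samples).most_common(1)[0][0]

def verify(model, samples):
    """Model verification: share of samples the model labels correctly.

    For brevity this reuses the training samples; a real system would
    verify on a held-out test set.
    """
    return sum(label == model for _, label in samples) / len(samples)

def build_and_deploy(min_score):
    """Model application: release the model only if verification passes."""
    samples = engineer(collect())
    model = train(samples)
    score = verify(model, samples)
    if score < min_score:
        raise ValueError(f"verification score {score:.2f} is below {min_score}")
    return model, score

deployed_model, score = build_and_deploy(min_score=0.6)
```

Keeping each module behind its own function boundary means any one of them can be swapped out without touching the others.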
Compared with the prior art, the invention has the following advantages:
(1) The invention adopts Hadoop-related technology. Hadoop is a big data processing framework that provides storage and computing services at scales ranging from a single server to clusters of thousands of servers. The Hadoop Distributed File System (HDFS) provides a big data storage service that spans multiple computers, while MapReduce provides a framework for parallel processing. Storing data in HDFS means that capacity is, in theory, unlimited, which removes the storage bottleneck of a traditional database. MapReduce provides a computing solution for massive data.
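The MapReduce idea can be shown in a single Python process. This sketch does not use the Hadoop API; it only mimics the map, shuffle and reduce phases on a handful of hypothetical operation records to count operations per channel.

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    """Map: emit one (channel, 1) pair for each operation record."""
    yield record["channel"], 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: sum the values collected for one key."""
    return key, sum(values)

# Hypothetical raw operation records.
records = [
    {"channel": "APP"},
    {"channel": "IVR"},
    {"channel": "APP"},
    {"channel": "WeChat"},
    {"channel": "APP"},
]

pairs = chain.from_iterable(map_phase(r) for r in records)
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

In a real cluster the map and reduce calls run on different machines over blocks of data, which is why each phase depends only on its own inputs.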
(2) The invention adopts machine learning (deep learning) related technology. As shown in fig. 2, in conventional programming a person inputs rules (that is, a program) and the data to be processed according to those rules, and the system outputs answers. In machine learning modeling, a person inputs data and the answers expected from that data, and the system outputs the rules, also called a model. These rules can then be applied to new data so that the computer generates answers autonomously; a machine learning system is trained rather than explicitly programmed. Many examples relating to a task are fed into the machine learning system, which finds statistical structure in them and eventually derives rules that automate the task.
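The contrast between the two paradigms can be made concrete with a toy example. The monthly-spend feature, the hand-written cut-off of 5000 and the labeled examples are all assumptions; the learner is a simple threshold search, not the deep learning model the invention refers to.

```python
def rule_based(monthly_spend):
    """Conventional programming: a person writes the rule by hand."""
    return monthly_spend > 5000

def learn_threshold(examples):
    """Machine learning: derive the rule from (spend, answer) examples.

    Tries each observed spend as a threshold and keeps the one that
    misclassifies the fewest examples.
    """
    candidates = sorted(spend for spend, _ in examples)

    def errors(threshold):
        return sum((spend > threshold) != answer for spend, answer in examples)

    return min(candidates, key=errors)

# Data plus the answers expected from it.
examples = [(1000, False), (2000, False), (3000, True), (8000, True)]
threshold = learn_threshold(examples)

def learned_rule(monthly_spend):
    """The rule produced by training, applied to new data."""
    return monthly_spend > threshold
```

Here the hand-written rule misses the customer who spends 3000, while the rule learned from the examples does not.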
Drawings
FIG. 1 is a diagram of an architecture for practicing the present invention;
FIG. 2 is a schematic diagram of a technical advantage of the present invention;
FIG. 3 is a flow chart of the method of the present invention;
FIG. 4 is a flowchart of a method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
Examples
Banks are currently developing rapidly in informatization and have accumulated a large amount of valuable business data and user data. Mining user demand by means of big data and recommending suitable products to users, so that operating returns grow while user disturbance and marketing costs fall, is an urgent priority for the credit card business of every bank.
Based on the concepts of Internet finance, big data and machine learning, the invention establishes a customer service recommendation system that mainly comprises data acquisition, feature engineering, model training, model verification and model application. The general structure is shown in figure 1,
which includes:
1. Operation data collection
An ETL tool is used to acquire raw data from each system and store it in an HDFS file system. The raw data needs to have the following characteristics:
a. The data must be representative.
b. For classification problems, the data must not be too skewed: the amounts of data in different classes should not differ by several orders of magnitude.
c. The magnitude of the data (the number of samples and the number of features) should be assessed and the memory consumption of model training estimated. If the amount of data is too large, consider reducing the training samples, reducing the dimensionality, or using a distributed machine learning system.
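Requirements b and c lend themselves to quick checks before training. The sketch below is one possible way to do so; the helper names are hypothetical, and the memory figure covers only a dense feature matrix of 8-byte values, not the working memory of any particular training algorithm.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Ratio between the largest and the smallest class counts."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

def estimated_memory_mb(n_samples, n_features, bytes_per_value=8):
    """Size of a dense feature matrix in megabytes (1 MB = 10**6 bytes)."""
    return n_samples * n_features * bytes_per_value / 10**6

# Hypothetical label distribution: 900 versus 100 samples.
labels = ["installment"] * 900 + ["bill_query"] * 100
ratio = imbalance_ratio(labels)

# One million samples with 200 features each.
memory_mb = estimated_memory_mb(1_000_000, 200)
```

A ratio of 9 is well under one order of magnitude, so this sample would satisfy requirement b; a ratio in the hundreds or thousands would call for resampling or reweighting.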
2. Feature engineering
Feature engineering comprises feature construction, feature extraction and feature selection from the raw data. Good feature engineering gets the most out of the raw data and can markedly improve the effect and performance of an algorithm; with good features, a simple model sometimes outperforms a complex one. Most of the time in data mining is spent on feature engineering, which is a basic and necessary step in machine learning.
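Two of these activities, construction and selection, can be sketched in a few lines. The column names are assumptions, and dropping zero-variance columns is only one simple selection criterion among many.

```python
from statistics import pvariance

def construct(row):
    """Feature construction: derive the average transaction amount."""
    out = dict(row)
    out["avg_txn"] = row["txn_total"] / row["txn_count"] if row["txn_count"] else 0.0
    return out

def select(rows):
    """Feature selection: drop features whose value never varies."""
    names = list(rows[0].keys())
    keep = [n for n in names if pvariance([r[n] for r in rows]) > 0]
    return [{n: r[n] for n in keep} for r in rows]

# Hypothetical raw rows; is_cardholder is constant and carries no information.
raw = [
    {"txn_total": 300.0, "txn_count": 3, "is_cardholder": 1},
    {"txn_total": 100.0, "txn_count": 2, "is_cardholder": 1},
    {"txn_total": 0.0, "txn_count": 0, "is_cardholder": 1},
]
features = select([construct(r) for r in raw])
```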
3. Model training
Training is an important step in machine learning and directly affects its results. Whatever the task, a model is needed, and a reasonable algorithm must be selected for the corresponding modeling so that the results are better and more accurate.
4. Model validation
The effectiveness of the model is verified with test data. Observing the misclassified samples and analyzing why the errors occur helps to find where the performance of the algorithm can be improved. Error analysis mainly examines the sources of error in the data, the features and the algorithm.
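A simple form of error analysis is to count which (actual, predicted) confusions occur most often on the test set. The labels and predictions below are made up for illustration.

```python
from collections import Counter

def error_report(y_true, y_pred):
    """Return accuracy and a count of each (actual, predicted) mistake."""
    mistakes = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    correct = len(y_true) - sum(mistakes.values())
    return correct / len(y_true), mistakes

# Hypothetical test labels and model predictions.
y_true = ["installment", "bill_query", "installment", "limit_raise", "bill_query"]
y_pred = ["installment", "installment", "installment", "bill_query", "installment"]

accuracy, mistakes = error_report(y_true, y_pred)
worst = mistakes.most_common(1)[0]   # the most frequent confusion
```

In this example the model most often predicts "installment" when the customer actually wanted "bill_query", which points the analyst at the features that should separate those two services.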
5. Model application
How the model performs online directly determines its success or failure. This covers not only accuracy and error, but also whether the running speed (time complexity), the resource consumption (space complexity) and the stability are acceptable.
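Accuracy and running speed can be recorded together when a model is exercised with live-like requests. The sketch below assumes a hypothetical `predict` function and request format; it measures wall-clock latency with `time.perf_counter` and leaves resource consumption and stability out of scope.

```python
import time

def evaluate_online(predict, requests, expected):
    """Run the model on each request and report accuracy and average latency."""
    start = time.perf_counter()
    outputs = [predict(r) for r in requests]
    elapsed = time.perf_counter() - start
    accuracy = sum(o == e for o, e in zip(outputs, expected)) / len(requests)
    return {"accuracy": accuracy, "avg_latency_s": elapsed / len(requests)}

def predict(request):
    """Hypothetical stand-in for the deployed model."""
    return "installment" if request["recent_views"] >= 2 else "bill_query"

requests = [
    {"recent_views": 3},
    {"recent_views": 0},
    {"recent_views": 2},
    {"recent_views": 1},
]
expected = ["installment", "bill_query", "bill_query", "bill_query"]

report = evaluate_online(predict, requests, expected)
```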
FIG. 3 is a flow chart of the method of the present invention, in which the "target model" is the output of the whole machine learning process and the core of full-channel customer demand prediction. The raw data of full-channel customer operations is summarized by day and, after simple cleaning, processing and storage, yields the customer portrait (also called feature) data (1). Reasonable sample data is selected from (1) to construct the training data set and test data set for machine learning; these two data sets are crucial to the accuracy of the model. The relevant specific information in the drawing is as follows:
Figure BDA0002219640600000051
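The daily summarization of raw operation data described for FIG. 3 can be sketched as a group-by on customer and calendar day. The record layout (customer_id, ISO 8601 timestamp, channel) is an assumption made for this example.

```python
from collections import defaultdict

def summarize_by_day(events):
    """Count each customer's operations per channel for every calendar day."""
    summary = defaultdict(lambda: defaultdict(int))
    for e in events:
        day = e["timestamp"][:10]   # date prefix of an ISO 8601 timestamp
        summary[(e["customer_id"], day)][e["channel"]] += 1
    return {key: dict(channels) for key, channels in summary.items()}

# Hypothetical raw operation events from several channels.
events = [
    {"customer_id": "c001", "timestamp": "2019-06-01T09:15:00", "channel": "APP"},
    {"customer_id": "c001", "timestamp": "2019-06-01T21:40:00", "channel": "APP"},
    {"customer_id": "c001", "timestamp": "2019-06-02T08:05:00", "channel": "IVR"},
    {"customer_id": "c002", "timestamp": "2019-06-01T10:00:00", "channel": "WeChat"},
]
daily = summarize_by_day(events)
```

Each (customer, day) entry can then be cleaned and merged into that customer's portrait data.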
The method focuses on the direction of user demand in different channels. It collects each user's behavior information in every channel (mainly through buried-point, that is event-tracking, collection) to form an independent customer portrait for each user, and organizes scattered unstructured data into the user's historical operation information. Combining this with the user's historical transaction behavior and other information, it establishes and trains a model. With the customer portrait as the input parameter, the model algorithm calculates which services each customer may need in a future period. After customers have been reached through each channel, the method collects their subsequent behavior information, which in turn provides sufficient data samples for training and optimizing the model. This forms a closed information loop of users, channels and model, achieves quasi-real-time cyclic prediction of customer behavior, and delivers accurately the services customers need most, as shown in fig. 4.
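The closed loop of recommendation, feedback and retraining can be illustrated with a toy model that simply counts accepted services per channel. The class, its methods and the default service are assumptions; an actual system would retrain a proper model on the collected samples.

```python
from collections import Counter, defaultdict

class FeedbackModel:
    """Toy model: recommends the service accepted most often on a channel."""

    def __init__(self):
        self.accepted = defaultdict(Counter)

    def recommend(self, channel, default="bill_query"):
        """Return the most accepted service, or a default before any feedback."""
        counts = self.accepted[channel]
        return counts.most_common(1)[0][0] if counts else default

    def feedback(self, channel, service, accepted):
        """Fold one piece of user feedback back into the model."""
        if accepted:
            self.accepted[channel][service] += 1

model = FeedbackModel()
first = model.recommend("APP")                       # no feedback yet
model.feedback("APP", "installment", accepted=True)
model.feedback("APP", "installment", accepted=True)
model.feedback("APP", "bill_query", accepted=True)
model.feedback("APP", "limit_raise", accepted=False)
second = model.recommend("APP")                      # reflects the feedback
```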
The method and architecture show clear benefits when results before and after going into production are compared: after the full-channel service demand prediction system was applied to the animation customer service channel, the installment transaction amount increased markedly compared with January to May 2018.
The method and architecture improve the hit rate compared with before going into production and provide a good user experience: with the full-channel service demand prediction system applied to the animation customer service channel, the animation channel interface is called 12,000 times a day and the hit rate stays at 74%. The number of interface calls is markedly higher than in January to May 2018, before the system went online, and user stickiness has improved: users are more willing to choose the animation channel to handle business when they have a service need.
The method and architecture provide personalized recommendation for different customers: the personalized recommendation service of the full-channel service demand prediction system recommends different service nodes to different customers and can meet each customer's individual needs. Before the system went online, the recommended service nodes were the same for every customer, so the needs of target customers could not be met. This led to the loss of target customers, which can cause irreparable loss to a credit card center.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A user demand prediction method based on full-channel operation data, characterized by comprising the following steps:
step 1: acquiring the users' raw operation data from every system of the full channel, and storing it in the file system of a big data framework;
step 2: cleaning and processing the raw operation data acquired from every system of the full channel, and then constructing user portrait data;
step 3: establishing a machine learning model, dividing the user portrait data into a training data set for training the machine learning model and a test data set for verifying it, and then training and verifying the model with these two data sets to obtain the final trained and verified machine learning model;
step 4: evaluating the final trained and verified machine learning model, then using it to predict the demand that corresponds to a user's raw operation data, and recommending the matching service.
2. The method according to claim 1, wherein the full channel in step 1 comprises an animation channel, an APP channel, a WeChat channel, an IVR channel, a payment service window channel and a CSR manual customer service channel.
3. The method according to claim 1, wherein the user portrait data in step 2 comprises customer basic information, customer transaction information, customer activity information and customer buried-point (event tracking) information.
4. The method for predicting user demand based on full-channel operation data according to claim 1, further comprising step 5: after the user receives the service corresponding to the predicted demand, collecting user feedback and using it to train and optimize the machine learning model.
5. The method for predicting user demand based on full-channel operation data according to claim 1, wherein step 1 specifically comprises: using an ETL tool to acquire the users' raw operation data from every system of the full channel, and storing it in the file system of a big data framework.
6. The method for predicting user demand based on full-channel operation data according to claim 1, wherein the big data framework file system in step 1 is an HDFS file system.
7. The method according to claim 1, wherein the model evaluation indexes in step 4 include classification, regression, ranking, clustering, topic models and recommendation.
8. A framework based on the user demand prediction method based on full-channel operation data according to any one of claims 1 to 7, characterized in that the framework comprises:
the operation data acquisition module, which acquires raw data with an ETL (extract, transform, load) tool and stores it in an HDFS (Hadoop Distributed File System) file system;
the feature engineering module, which performs feature construction, feature extraction and feature selection on the raw data;
the model training module, which trains the machine learning model;
the model verification module, which verifies the machine learning model;
and the model application module, which runs the trained and verified machine learning model online.
CN201910928706.7A 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data Active CN110674100B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910928706.7A CN110674100B (en) 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910928706.7A CN110674100B (en) 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data

Publications (2)

Publication Number Publication Date
CN110674100A CN110674100A (en) 2020-01-10
CN110674100B true CN110674100B (en) 2023-02-10

Family

ID=69079893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910928706.7A Active CN110674100B (en) 2019-09-28 2019-09-28 User demand prediction method and framework based on full-channel operation data

Country Status (1)

Country Link
CN (1) CN110674100B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111612366B (en) * 2020-05-27 2023-08-04 中国联合网络通信集团有限公司 Channel quality assessment method, channel quality assessment device, electronic equipment and storage medium
CN112561598A (en) * 2020-12-23 2021-03-26 中国农业银行股份有限公司重庆市分行 Customer loss prediction and retrieval method and system based on customer portrait
CN112598443A (en) * 2020-12-25 2021-04-02 山东鲁能软件技术有限公司 Online channel business data processing method and system based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MY195692A (en) * 2017-02-17 2023-02-04 Kasatria Analytics Sdn Bhd Computer Implemented System and Method for Customer Profiling Using Micro-Conversions Via Machine Learning
CN107423442B (en) * 2017-08-07 2020-09-25 火烈鸟网络(广州)股份有限公司 Application recommendation method and system based on user portrait behavior analysis, storage medium and computer equipment
CN109509040A (en) * 2019-01-03 2019-03-22 广发证券股份有限公司 Predict modeling method, marketing method and the device of fund potential customers

Also Published As

Publication number Publication date
CN110674100A (en) 2020-01-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Tie Jincheng

Inventor after: Li Hu

Inventor after: Zeng Yifeng

Inventor after: Wang Zhiliang

Inventor after: Xu Fei

Inventor before: Li Hu

Inventor before: Zeng Yifeng

Inventor before: Wang Zhiliang

Inventor before: Xu Fei
