CN114862506B - Financial product recommendation method based on deep reinforcement learning - Google Patents
- Publication number
- CN114862506B (application CN202210434003.0A)
- Authority
- CN
- China
- Prior art keywords
- product
- client
- customer
- distribution
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9027—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/06—Asset management; Financial planning or analysis
Abstract
The invention discloses a financial product recommendation method based on deep reinforcement learning, comprising the following steps: step 1, establishing a customer interest preference model to obtain click preference scores, risk preference scores and asset preference scores; step 2, mining historical data to form an ideal product investment distribution, and establishing an asset balance model according to the customer group to which the current customer belongs and the customer's holdings, to obtain a difference score between the current product distribution and the ideal product distribution; step 3, modeling the exploration of the customer's potential interests; step 4, using a deep reinforcement learning method to adaptively learn the fusion parameters for the score factors obtained in the preceding steps. After the scheme was implemented, the average click-through rate, purchase conversion rate and transaction amount of users improved substantially, and a personalized recommendation service is provided on the financial mall page of the mobile APP; through a continuously deepening understanding of customers, customer service satisfaction is improved and strong technical support is provided for the company's wealth management transformation.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence application, and particularly relates to a financial product recommendation method based on deep reinforcement learning.
Background
In recent years, China's wealth management market has developed rapidly. By the end of 2020, personal financial assets in China had reached 205 trillion yuan and the Internet wealth management market had reached 8.2 trillion yuan. At the same time, the main wealth management customer base keeps getting younger; the main Internet wealth management customers are 21 to 35 years old. Precision marketing has become a very important growth point for securities companies. For example, the securities industry's net income from agency sales of financial products has grown year by year, reaching 13.438 billion yuan in 2020, a year-on-year increase of 148.76%. The products marketed by the wealth management business mainly include asset management products, public offering funds, private funds, trusts and other categories. Precision marketing can bring three capabilities to securities companies. The first is traffic capability: securities companies are actively building technology platforms around the new trend of wealth management to raise their service level. The second is advisory and companionship capability: securities companies are moving their business fully online, greatly increasing customer stickiness by providing financial services and streaming content such as news and video, and offering customers advisory and companionship services. The third is product and investment capability, which is also the core competitiveness of a securities company: its financial products form a new system based on agency sales of public and private fund products and on broker asset management, with business forms such as broad asset allocation and dedicated fund accounts.
Precision marketing means starting from customers' needs and providing them with professional, intelligent and personalized services. On the one hand, based on the rich customer profile data of securities companies, customer preferences are identified effectively and each customer is offered individually tailored news, video and live-broadcast content; the customer's interaction with and feedback on that content in turn deepens the understanding of the customer. On the other hand, securities companies have a rich pool of products in many categories, including public offering funds, OTC products, trusts and private placement funds. Considering that customers differ in wealth management interests, risk preference and asset balance needs, the problem for precision marketing is how to help customers maximize long-term returns while taking these multidimensional factors into account, and whether intelligent means can recommend the personalized wealth management plan that best meets each customer's needs. A financial product recommendation method based on deep reinforcement learning therefore needs to be developed to solve these problems.
Disclosure of Invention
The invention aims to provide a financial product recommendation method based on deep reinforcement learning, so as to solve the problem that the personalized wealth management plan best suited to a customer's needs cannot be recommended intelligently.
In order to achieve the above purpose, the present invention provides the following technical solutions: a financial product recommendation method based on deep reinforcement learning comprises the following steps:
Step 1, establishing a customer interest preference model to obtain click preference scores, risk preference scores and asset preference scores;
Step 2, mining historical data to form an ideal product investment distribution, and establishing an asset balance model according to the customer group to which the current customer belongs and the customer's holdings, to obtain a difference score between the current product distribution and the ideal product distribution;
Step 3, establishing a customer potential interest exploration model, and using new product exploration modeling to mine the customer's unknown interests and obtain product exploration scores;
Step 4, using a deep reinforcement learning method to adaptively learn the fusion parameters for the score factors obtained in the preceding steps, and then ranking and recommending.
Preferably, the client interest preference modeling step includes:
A tree model building step: for active customers with many product click and purchase behaviors and many news click and reading behaviors, the click preference, purchase preference and risk preference of the active customers are learned through GBDT modeling, with the customer's click and purchase probabilities as the learning targets.
Preferably, the customer interest preference modeling step further includes a distillation learning step: the preferences of new customers and active customers are learned, the active customer most similar to the current new customer is found, and the new customer's preference is represented by the preference of that similar active customer. A Teacher Model is set up for active customers and trained on the similarity between active customers; its computed result is the similarity among all active wealth management customers. This result is passed through distillation extraction into a Student Model, which at this point does not use customer purchase behavior data as features; through distillation extraction, the old customer most similar to the current new customer is found.
Preferably, the step of modeling customer asset balance using the difference value between the current product distribution and the ideal product distribution includes: obtaining the difference value between the current product distribution and the ideal product distribution through a distribution difference formula (the formula image is not reproduced in this text), in which:
$p(c \mid g)$ represents the target distribution over product type $c$ for the customer group $g$ to which user $u$ belongs; $q(c \mid u)$ represents the holdings and recommendation distribution of user $u$ over product type $c$;
$C_{KL}$ represents the difference value between the current product distribution and the ideal product distribution; $u$ represents a user; $g$ represents a customer group; $c$ represents a product type;
the summation runs over the users in all groups;
$p$ represents the target distribution and $q$ represents the current distribution.
Preferably, obtaining the difference value between the current product distribution and the ideal product distribution further includes obtaining an optimal subset, as follows:
from the original set $Z = \{1, \dots, N\}$, $M$ items are selected to form a subset,
wherein $Z$ represents the set of candidate items, $N$ represents the number of candidate products, $M$ represents the number of products selected from the $N$ candidates to form the optimal subset, and an item is a product;
the optimal subset is $Y = \arg\max \det(L_Y)$,
wherein $Y$ represents the optimal subset and $\det(L_Y)$ represents the determinant score corresponding to the optimal subset $Y$;
$L$ is the constructed customer relevance matrix (its defining formula is not reproduced in this text), wherein $q_i$ represents the relevance score of the $i$-th candidate content to the customer and $D_{ij}$ represents the distance between candidate contents $i$ and $j$;
candidate content is added to the set by $Y \leftarrow Y \cup \{i\}$, that is, candidate content $i$ is added to the optimal subset $Y$.
Preferably, in step 3, the types of new product exploration include: type exploration, track (sector) exploration and new-issue exploration;
The new product exploration modeling method includes:
finding a pool of products with excellent performance; finding the seed customer profile of each product from customers' click and purchase behavior on that product; for the current customer, finding the product whose seed customer profile is closest to the customer's profile and using it for interest exploration, that is, recommending to the customer excellent products preferred by people similar to the customer as exploration products, so as to explore the customer's potential interests;
after excellent products of potential interest to the customer have been found, if the customer's returns are negative, exploration is stopped;
if the customer's financial returns reach the set value, the exploration intensity is increased.
Preferably, the method further comprises step 5: the score factors are fused exponentially according to the formula $a^{t_1} + b^{t_2} + \cdots$; the parameters $t_1$ and $t_2$ are optimized and learned continuously and adaptively with the goal of maximizing long-term return, yielding a model that maximizes long-term investment return; wherein $a$ represents the customer preference modeling factor value, $b$ represents the customer asset balance modeling factor value, and $t_1$ and $t_2$ are fusion parameters.
Preferably, the long-term return is obtained by learning a target function $Q(S, A)$, where $S$ represents the scene attributes and current asset status of the current customer, $A$ represents a combination of discrete values of the several fusion factor parameters, and $R$ represents the feedback, namely the short-term return after the customer purchases a product, or whether the customer purchases the product.
Preferably, the product data includes risk level, yield and holding amount, and a product profile is obtained by analyzing the product data, wherein the product profile contains at least 500 labels covering investment varieties and return analysis.
Preferably, the data of the client interest preference model, the asset balance model and the client potential interest exploration model are all used under the premise of client consent, and unauthorized client data are not collected and used.
The technical effects and advantages of the invention are as follows. Reinforcement learning takes a long-term view and focuses on the long-term return of decisions, whereas supervised learning generally treats each decision as a one-off problem and attends to short-term, immediate return; reinforcement learning therefore matches well the rational investor's focus on the long-term return of investment targets. Reinforcement learning solves the decision optimization problem for sequences of actions: it trains continuously on data obtained from the environment so as to respond accurately to the environment, and the continual change of a customer's portfolio strategy is exactly such a sequential decision problem. Model-free deep reinforcement learning does not need to model the environment or require large amounts of labeled data; it learns by itself through trial and error over actions.
Customer service satisfaction is improved. An intelligent recommendation system is built on a big data platform, with customer profiles, product profiles and a behavior database; using an artificial intelligence platform, customer interest preference modeling, customer asset balance modeling and customer potential interest exploration modeling are carried out within the system's recall, ranking and fusion layers. Based on the modeling results and on customers' historical purchase behavior and return data accumulated over the past 5 years, reinforcement learning samples are constructed with maximization of 6-month return as the final modeling target, and the system is continuously applied, explored, fed back and reinforced. The results are displayed to customers through the APP front end; compared with the previous manual pushing, the product click-through rate improved by more than 20%, the purchase conversion rate improved by about 3 times, and the customer conversion rate improved by more than 30%.
After the scheme was implemented, the average click-through rate, purchase conversion rate and transaction amount of users improved substantially. For the C-end service, in addition to the financial mall of the mobile APP, the recommendation results can be extended to modules such as the PC financial mall, the APP home page and APP news recommendation. For the B-end service, retention products are recommended to customers flagged by churn early warning, and corresponding products are recommended to customers with abnormal asset movements. Through the application of intelligent recommendation technology, stock investment customers are converted into wealth management customers, new technology brings in more young customers to open accounts, and customer service satisfaction is improved through a continuously deepening understanding of customers.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of the tree model of the present invention;
FIG. 3 is a schematic diagram of the distillation learning method according to the present invention;
FIG. 4 is a schematic diagram of a client preference model according to the present invention;
FIG. 5 is a schematic view of a structural framework of the present invention;
FIG. 6 is a schematic diagram of a new product exploration model structure of the present invention;
FIG. 7 is a schematic diagram of an intelligent recommendation system architecture according to the present invention;
FIG. 8 is a schematic diagram of the application of the smart product of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art from the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
The invention provides a financial product recommendation method based on deep reinforcement learning as shown in fig. 1 and 5, which comprises the following steps:
Step 1, establishing a customer interest preference model to obtain click preference scores, risk preference scores and asset preference scores; the customer interest preference modeling step, as shown in FIG. 2, FIG. 3 and FIG. 4, includes:
A tree model building step: for active customers with many product click and purchase behaviors and many news click and reading behaviors, the click preference, purchase preference and risk preference of the active customers are learned through GBDT modeling, with the customer's click and purchase probabilities as the learning targets;
A distillation learning step: the preferences of new customers and active customers are learned, the active customer most similar to the current new customer is found, and the new customer's preference is represented by the preference of that similar active customer. A Teacher Model is set up for active customers and trained on the similarity between active customers; its computed result is the similarity among all active wealth management customers. This result is passed through distillation extraction into a Student Model, which at this point does not use customer purchase behavior data as features; through distillation extraction, the old customer most similar to the current new customer is found;
In this embodiment, different customers have different product preferences, and traditional product-centred marketing cannot meet customers' personalized needs, so accurately identifying customer interests becomes the key problem. The approach is to distinguish old customers from new customers. For active customers with many product click and purchase behaviors and many news click and reading behaviors, the data is sufficient to support GBDT modeling, which learns their click preference, purchase preference and risk preference. New customers, that is, customers who have not yet bought wealth management products, have very little behavior and purchase information, so a distillation learning method is needed. The Teacher Model (middle part of the left side in FIG. 6) is trained on the similarity between active customers; all of its data are the lowest-level data features, so training achieves high learning accuracy, and its computed result is the similarity among all old wealth management customers. This result is then passed through distillation extraction into the Student Model in the middle part, whose left side is the new customer and whose right side is the active customer; the Student Model does not use data such as customer purchase behavior as features. Based on the current new customer, distillation extraction finds the most similar old customer, whose preference indirectly represents the new customer's preference, improving recommendation accuracy for new customers;
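The idea of representing a new customer by the most similar active customer can be illustrated with a minimal sketch. It is not the Teacher Model / Student Model distillation itself: a plain cosine-similarity nearest-neighbour lookup over profile features (with no purchase behavior features) stands in for the Student Model, and all names, feature vectors and preference values below are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar_active_customer(new_profile, active_profiles):
    """Return the id of the active customer whose profile is closest."""
    return max(active_profiles,
               key=lambda cid: cosine(new_profile, active_profiles[cid]))

def borrowed_preference(new_profile, active_profiles, active_preferences):
    """Represent a new customer by the preference of the nearest active one."""
    cid = most_similar_active_customer(new_profile, active_profiles)
    return cid, active_preferences[cid]

# Hypothetical profile features that exclude purchase behavior
# (for example age band, risk questionnaire score, asset band).
active_profiles = {"A": [0.9, 0.1, 0.3], "B": [0.2, 0.8, 0.5]}
# Hypothetical preference scores already learned for active customers.
active_preferences = {
    "A": {"click": 0.7, "risk": 0.2},
    "B": {"click": 0.4, "risk": 0.9},
}
new_customer = [0.25, 0.75, 0.4]

matched_id, preference = borrowed_preference(
    new_customer, active_profiles, active_preferences)
```

In a real system the similarity would come from the trained models rather than from raw cosine similarity; the sketch only shows how the matched customer's preference is reused.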
Step 2, mining historical data to form an ideal product investment distribution, and establishing an asset balance model according to the customer group to which the current customer belongs and the customer's holdings, to obtain a difference score between the current product distribution and the ideal product distribution. The step of modeling customer asset balance using the difference value between the current product distribution and the ideal product distribution includes: obtaining the difference value through a distribution difference formula (the formula image is not reproduced in this text), in which:
$p(c \mid g)$ represents the target distribution over product type $c$ for the customer group $g$ to which user $u$ belongs; $q(c \mid u)$ represents the holdings and recommendation distribution of user $u$ over product type $c$;
$C_{KL}$ represents the difference value between the current product distribution and the ideal product distribution; $u$ represents a user; $g$ represents a customer group; $c$ represents a product type;
the summation runs over the users in all groups;
$p$ represents the target distribution and $q$ represents the current distribution.
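The symbol definitions above (a score named C KL, a target distribution p, a current distribution q, and a sum over users) are consistent with a Kullback-Leibler divergence, although the formula itself is not reproduced in this text. The following is a minimal sketch under that assumption, taking the per-user score to be KL(p(.|g) || q(.|u)) over product types; the direction of the divergence, the smoothing constant `eps` and all data are assumptions.

```python
import math

def kl_difference(target, current, eps=1e-9):
    """KL(p || q) over product types.

    target:  dict product type -> p(c|g), ideal share for the group
    current: dict product type -> q(c|u), the user's holdings share
    eps keeps the logarithm finite when the user holds none of a type.
    """
    total = 0.0
    for product_type, p in target.items():
        if p <= 0.0:
            continue  # a zero-probability type contributes nothing
        q = max(current.get(product_type, 0.0), eps)
        total += p * math.log(p / q)
    return total

def total_difference(group_targets, user_groups, user_distributions):
    """Sum the per-user difference values over the users in all groups."""
    return sum(
        kl_difference(group_targets[user_groups[user]], q)
        for user, q in user_distributions.items()
    )

# Hypothetical example: one group whose ideal mix is half funds, half bonds.
group_targets = {"g1": {"fund": 0.5, "bond": 0.5}}
user_groups = {"u1": "g1", "u2": "g1"}
user_distributions = {
    "u1": {"fund": 0.5, "bond": 0.5},   # already balanced
    "u2": {"fund": 0.9, "bond": 0.1},   # concentrated in funds
}
balanced = kl_difference(group_targets["g1"], user_distributions["u1"])
skewed = kl_difference(group_targets["g1"], user_distributions["u2"])
overall = total_difference(group_targets, user_groups, user_distributions)
```

A larger value means the user's holdings are further from the group's ideal distribution, so a product that reduces the value after purchase would be favoured.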
Obtaining the difference value between the current product distribution and the ideal product distribution further includes obtaining an optimal subset, as follows:
from the original set $Z = \{1, \dots, N\}$, $M$ items are selected to form a subset,
wherein $Z$ represents the set of candidate items, $N$ represents the number of candidate products, $M$ represents the number of products selected from the $N$ candidates to form the optimal subset, and an item is a product;
the optimal subset is $Y = \arg\max \det(L_Y)$,
wherein $Y$ represents the optimal subset and $\det(L_Y)$ represents the determinant score corresponding to the optimal subset $Y$;
$L$ is the constructed customer relevance matrix (its defining formula is not reproduced in this text), wherein $q_i$ represents the relevance score of the $i$-th candidate content to the customer and $D_{ij}$ represents the distance between candidate contents $i$ and $j$;
candidate content is added to the set by $Y \leftarrow Y \cup \{i\}$, that is, candidate content $i$ is added to the optimal subset $Y$;
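The subset selection above can be sketched as a greedy loop that repeatedly adds the candidate giving the largest determinant. The entries of L are not given in this text, so the kernel form q_i * q_j * exp(-D_ij) used here is an assumption chosen so that relevant and mutually distant items score highly; the greedy loop is also only an approximation of the argmax, and the data are hypothetical.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if abs(a[pivot][k]) < 1e-12:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            result = -result  # a row swap flips the sign
        result *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return result

def build_kernel(relevance, distance):
    """Assumed kernel: L[i][j] = q_i * q_j * exp(-D_ij)."""
    n = len(relevance)
    return [[relevance[i] * relevance[j] * math.exp(-distance[i][j])
             for j in range(n)] for i in range(n)]

def greedy_select(kernel, m):
    """Greedily grow Y by the item i that maximises det(L restricted to Y + {i})."""
    selected = []
    remaining = set(range(len(kernel)))
    while remaining and len(selected) < m:
        def score(i):
            idx = selected + [i]
            return det([[kernel[r][c] for c in idx] for r in idx])
        best = max(sorted(remaining), key=score)
        selected.append(best)       # Y <- Y U {i}
        remaining.remove(best)
    return selected

# Hypothetical candidates: items 0 and 1 are near-duplicates, item 2 differs.
relevance = [0.9, 0.85, 0.6]
distance = [[0.0, 0.05, 2.0],
            [0.05, 0.0, 2.0],
            [2.0, 2.0, 0.0]]
chosen = greedy_select(build_kernel(relevance, distance), 2)
```

With these numbers the second pick is item 2 rather than the more relevant item 1, because item 1 is almost identical to the item already chosen; this is the diversity effect the determinant score is meant to capture.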
In this embodiment, the product data includes risk level, yield and holding amount, and a product profile is obtained by analyzing the product data, wherein the product profile contains at least 500 labels covering investment varieties and return analysis;
In this embodiment, customer asset balance modeling starts from portfolio balance theory: investors allocate their wealth across the available assets according to their own investment preferences and the principles of risk and return, forming an optimal portfolio. By mining historical data, the product combinations that perform well within a customer group are identified, forming the ideal product investment distribution. Suitable products are then recommended to the customer based on the customer's group and holdings, so that the distribution of the customer's product portfolio after purchase comes closer to the ideal distribution. In addition, when recommending products, provided the customer's interest preferences are met, more products that differ substantially from those already held are presented, to promote asset balance;
Step 3, establishing a customer potential interest exploration model, and using new product exploration modeling to mine the customer's unknown interests; the types of new product exploration include: type exploration, track (sector) exploration and new-issue exploration; as shown in FIG. 6,
The new product exploration modeling method includes:
finding a pool of products with excellent performance; finding the seed customer profile of each product from customers' click and purchase behavior on that product; for the current customer, finding the product whose seed customer profile is closest to the customer's profile and using it for interest exploration, that is, recommending to the customer excellent products preferred by people similar to the customer as exploration products, so as to explore the customer's potential interests;
after excellent products of potential interest to the customer have been found, if the customer's returns are negative, exploration is stopped;
if the customer's financial returns reach the set value, the exploration intensity is increased;
In this embodiment, customer interest preference modeling and asset balance modeling are both based on the customer's existing data; a recommendation system also needs an exploration function that probes the customer's unknown interests, so as to prevent information cocoons; new-product exploration modeling in the financial scenario follows the same idea as exploration modeling on the Internet, and high-performing financial products are shown to the customer on a trial basis; high-performing products are used because, even if the customer is not interested, the customer will still regard them as good products and will not take strong objection; first, a pool of high-performing products is found, and a Look-alike algorithm is used to derive a seed customer profile for each product from customers' click and purchase behavior on that product; for a given customer, the product whose seed customer profile is closest to the customer's current profile is found and used for interest exploration; that is, high-performing products preferred by people similar to the customer are recommended as the customer's exploration products, in order to explore the customer's potential interests; even after high-performing products of potential interest have been found, exploration is not carried out at all times; when a customer faces a large loss, the customer tends to invest in new products within familiar areas, and exploration is not appropriate; when a customer's recent financial return is very high, the customer is more receptive to new products, and the degree of exploration can be increased; the customer potential interest exploration intensity model on the right side of FIG. 6 helps determine when to increase the exploration intensity for a customer; customer interest exploration is therefore carried out with different intensities at different times and in different scenarios;
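The look-alike exploration described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the use of averaged feature vectors as seed customer profiles, cosine similarity as the closeness measure, and the names `seed_profile`, `pick_exploration_product` and `exploration_rate` together with their threshold values are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def seed_profile(buyer_vectors):
    """Assumed seed profile: the mean feature vector of the customers
    who clicked or purchased a given high-performing product."""
    n = len(buyer_vectors)
    return [sum(column) / n for column in zip(*buyer_vectors)]


def pick_exploration_product(customer_vec, seeds):
    """Return the product whose seed profile is closest to the customer."""
    return max(seeds, key=lambda pid: cosine(customer_vec, seeds[pid]))


def exploration_rate(recent_return, base=0.1, boosted=0.3, high_return=0.05):
    """Hypothetical gating rule: no exploration while the customer is at a
    loss, more exploration once the recent return reaches a set value."""
    if recent_return < 0:
        return 0.0
    if recent_return >= high_return:
        return boosted
    return base


# Hypothetical two-feature customer profiles.
seeds = {
    "fund_A": seed_profile([[1.0, 0.0], [0.8, 0.2]]),
    "fund_B": seed_profile([[0.0, 1.0], [0.1, 0.9]]),
}
print(pick_exploration_product([0.9, 0.1], seeds))  # fund_A
print(exploration_rate(-0.2), exploration_rate(0.08))  # 0.0 0.3
```

The gating function keeps the three cases of the text separate: loss, ordinary return, and a return that reaches the set value. In practice the thresholds would be learned or configured rather than fixed as here.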
Step 4, using a deep reinforcement learning method, performing adaptive learning of the fusion parameters on the score factors obtained in the preceding steps, and then ranking and recommending; the score factors are fused exponentially according to the formula $a^{t_1} + b^{t_2} + \cdots$; the parameters t1 and t2 are optimized and learned continuously and adaptively with the aim of maximizing long-term return, yielding the long-term investment return maximization model; wherein a represents the customer preference modeling factor value, b represents the customer asset balance modeling factor value, and t1 and t2 are fusion parameters; once the customer interest preference, asset balance and interest exploration modeling are complete, multidimensional scores of the customer for different products are available; in this embodiment, these multidimensional scores are accurately fused; the long-term return is obtained by learning the target function Q(S, A), wherein S represents the scenario attributes and asset status of the current customer, A represents a combination of discrete values of the fusion factor parameters, and R represents the feedback, namely the short-term return after the customer purchases a product, or whether the customer purchases the product;
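To make the fusion step concrete, the following Python sketch fuses two score factors with exponents (t1, t2) and learns which exponent combination to use. It deliberately swaps the deep network for a tabular Q-learning table so that the example stays short; the candidate exponent values, the reading of the fusion formula as a sum of powers, the state labels, and the class name `FusionQLearner` are assumptions, not details taken from the patent.

```python
import itertools
import random

# Assumed discrete values for each fusion parameter; an action A is a
# combination (t1, t2) of those values.
T_VALUES = [0.5, 1.0, 2.0]
ACTIONS = list(itertools.product(T_VALUES, repeat=2))


def fused_score(a, b, t1, t2):
    """Exponential fusion of preference factor a and asset balance factor b."""
    return a ** t1 + b ** t2


def rank_products(factors, t1, t2):
    """factors maps product id -> (a, b); returns ids, best score first."""
    return sorted(
        factors,
        key=lambda pid: fused_score(*factors[pid], t1, t2),
        reverse=True,
    )


class FusionQLearner:
    """Tabular Q-learning over (state, action) pairs, standing in for the
    deep Q network. States must be hashable, e.g. a scenario/asset label."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = {}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def value(self, state, action):
        return self.q.get((state, action), 0.0)

    def choose(self, state):
        """Epsilon-greedy choice of a fusion parameter combination."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda act: self.value(state, act))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update with reward R (e.g. purchase or
        short-term return feedback)."""
        best_next = max(self.value(next_state, act) for act in ACTIONS)
        target = reward + self.gamma * best_next
        old = self.value(state, action)
        self.q[(state, action)] = old + self.alpha * (target - old)


learner = FusionQLearner(epsilon=0.0)
state = ("app_home", "low_balance")  # hypothetical scenario and asset status
learner.update(state, (1.0, 2.0), reward=1.0, next_state=state)
t1, t2 = learner.choose(state)
print(rank_products({"x": (0.9, 0.1), "y": (0.2, 0.3)}, t1, t2))
```

The discount factor `gamma` is what lets the learner weigh later returns against the immediate reward, which is the sense in which the learned Q values target long-term rather than short-term benefit.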
FIG. 7 shows the recommendation application framework: the bottom layer is a traditional data warehouse and big data platform; the layer above it interfaces with an artificial intelligence platform that provides distributed computing power and algorithms; and the top layer is the intelligent recommendation system; the framework of the recommendation system is basically consistent with that of Internet recommendation systems and is divided into three layers, namely recall, ranking and fusion, each of which supports configuration by business operations; above these are the basic API service interface and a traditional batch data push mode; the terminal systems connected at the top include APP, PC, WeChat and other terminals; the users served also include investment advisers at the business branches and business operators at headquarters; this is a point of difference from Internet recommendation systems: recommendation results are also output to marketers, because many business branches assist in product marketing; as shown in FIG. 8, online and offline channels are combined, as are manual operation and intelligent algorithms; the online Internet scenario, which relies mainly on the intelligent algorithm, is primary, and the offline scenario is auxiliary; marketers can configure the products to be marketed to customers through a marketing platform; the product list seen by the customer is generated by combining the intelligently recommended products with the operation-configured products, and comprehensive analysis then produces a marketing analysis report; after viewing and analyzing the data, marketers gain a deeper understanding of customers and can then reach them offline by telephone or through WeChat;
The data of the client interest preference model, the asset balance model and the client potential interest exploration model are all used on the premise of client consent, and unauthorized client data are not collected and used.
Finally, it should be noted that: the foregoing description is only illustrative of the preferred embodiments of the present invention, and although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described, or equivalents may be substituted for elements thereof, and any modifications, equivalents, improvements or changes may be made without departing from the spirit and principles of the present invention.
Claims (9)
1. A financial product recommendation method based on deep reinforcement learning is characterized in that: the method comprises the following steps:
Step 1, establishing a customer interest preference model to obtain click preference scores, risk preference scores and asset preference scores;
Step 2, mining historical data to form an ideal product investment distribution, and establishing an asset balance model according to the customer group to which the current customer belongs and the holdings of that customer group, to obtain the difference score between the current product distribution and the ideal product distribution;
Step 3, establishing a customer potential interest exploration model, and using the new-product exploration model to mine the customer's unknown interests;
Step 4, performing adaptive learning of the fusion parameters on the score factors obtained in the preceding steps by using a deep reinforcement learning method, and then ranking and recommending; the step of modeling the customer asset balance by using the difference value between the current product distribution and the ideal product distribution comprises: obtaining the differentiation value between the current product distribution and the ideal product distribution through a distribution differentiation formula;
[distribution differentiation formula: not recoverable from this text]
wherein the terms of the formula represent, respectively: the target distribution of a product type C in the customer group G in which the user U is located; the holding and recommendation distribution of the user U for product type C; the difference value between the current product distribution and the ideal product distribution; and a summation over the users in all groups; U represents a user; G represents a customer group; C represents a product type.
2. The financial product recommendation method based on deep reinforcement learning of claim 1, wherein: the client interest preference model modeling step comprises the following steps:
building a tree model: click preferences, purchase preferences, risk preferences of active customers are learned by GBDT modeling.
3. The financial product recommendation method based on deep reinforcement learning of claim 1, wherein: the client interest preference model modeling step further comprises a distillation learning step: learning the preferences of new customers and active customers, finding the active customer most similar to the current new customer, and expressing the new customer's preferences by the preferences of that similar active customer; a Teacher Model is set up for active customers and trained on the similarity between active customers, the similarity of all active financial customers being computed as its result; the result is passed to a Student Model through distillation extraction, the Student Model not using customer purchase behavior data as a feature; and the existing customer most similar to the current new customer is found through distillation extraction.
4. The financial product recommendation method based on deep reinforcement learning of claim 1, wherein: obtaining the differentiation value between the current product distribution and the ideal product distribution further comprises obtaining an optimal subset, with the following steps:
selecting M products from the original product set Z = {1, …, N} to form an optimal subset Y;
the optimal subset is $Y^{*} = \arg\max_{Y \subseteq Z,\ |Y| = M} \det(L_Y)$;
wherein Y represents the optimal subset, and $\det(L_Y)$ represents the determinant score corresponding to the optimal subset Y;
L is the constructed customer relevance matrix, wherein $r_i$ represents the relevance score of the i-th candidate content to the customer, and $d_{ij}$ represents the distance between candidate contents i and j;
according to the candidate content joining formula, candidate content i is added to the optimal subset Y.
5. The financial product recommendation method based on deep reinforcement learning of claim 1, wherein: in step 3, the types of the new-product exploration model include: type exploration, track exploration and new development exploration;
the new-product exploration model operates as follows:
finding a pool of high-performing products; deriving a seed customer profile for each product from customers' click and purchase behavior on the product; finding, according to the current customer's profile, the product whose seed customer profile is closest to the customer, and performing interest exploration with it; recommending high-performing products preferred by people similar to the customer as the customer's exploration products; and exploring the customer's potential interests;
after high-performing products of potential interest to the customer have been found, stopping exploration if the customer's return is negative;
if the customer's financial return reaches the set value, increasing the degree of exploration.
6. The financial product recommendation method based on deep reinforcement learning of claim 1, wherein: the method further comprises step 5: performing exponential fusion according to the formula $a^{t_1} + b^{t_2} + \cdots$; optimizing the parameters t1 and t2, and learning continuously and adaptively with the aim of maximizing long-term return, to obtain the long-term investment return maximization model; wherein a represents the customer preference modeling factor value, b represents the customer asset balance modeling factor value, and t1 and t2 are fusion parameters.
7. The financial product recommendation method based on deep reinforcement learning of claim 6, wherein: the long-term return is obtained by learning the target function Q(S, A), where Q represents the expected maximum investment return, S represents the current customer's scenario attributes and asset status, and A represents a combination of discrete values of the fusion factor parameters.
8. The financial product recommendation method based on deep reinforcement learning of claim 4, wherein: the product data include risk level, yield and holding amount, and a product profile is obtained by analyzing the product data, wherein the investment variety and return analysis in the product profile comprise at least 500 labels.
9. The financial product recommendation method based on deep reinforcement learning of claim 1, wherein: the data of the client interest preference model, the asset balance model and the client potential interest exploration model are all used on the premise of client consent, and unauthorized client data are not collected and used.
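The optimal-subset step of claim 4 reads like a determinant-based diversity selection. The Python sketch below is one plausible reading and not the claimed formula: it greedily adds the candidate that maximizes the determinant of the kernel sub-matrix. The kernel form `r_i * r_j * exp(-d_ij^2 / (2 * sigma^2))`, the greedy procedure, and the function names `build_kernel` and `greedy_subset` are assumptions introduced for illustration.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def build_kernel(relevance, distance, sigma=1.0):
    """Assumed kernel: relevance scales the entry, distance lowers the
    similarity, so close (redundant) items shrink the determinant."""
    n = len(relevance)
    return [
        [
            relevance[i] * relevance[j]
            * math.exp(-distance[i][j] ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]


def greedy_subset(kernel, m):
    """Greedily pick m items, each time adding the candidate that gives
    the largest determinant of the selected sub-matrix."""
    selected = []
    candidates = set(range(len(kernel)))
    while len(selected) < m and candidates:
        def score(i):
            idx = selected + [i]
            return det([[kernel[r][c] for c in idx] for r in idx])

        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Products 0 and 1 are near-duplicates; product 2 is less relevant but distinct.
relevance = [1.0, 0.9, 0.5]
distance = [[0.0, 0.1, 3.0], [0.1, 0.0, 3.0], [3.0, 3.0, 0.0]]
print(greedy_subset(build_kernel(relevance, distance), 2))  # [0, 2]
```

In the example the second pick is product 2 rather than the more relevant product 1, because product 1 is almost identical to the already selected product 0 and therefore contributes very little to the determinant.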
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210434003.0A CN114862506B (en) | 2022-04-24 | 2022-04-24 | Financial product recommendation method based on deep reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114862506A (en) | 2022-08-05 |
CN114862506B (en) | 2024-06-14 |
Family
ID=82632984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210434003.0A Active CN114862506B (en) | 2022-04-24 | 2022-04-24 | Financial product recommendation method based on deep reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114862506B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115239442B (en) * | 2022-09-22 | 2023-01-06 | 湖南快乐通宝小额贷款有限公司 | Method and system for popularizing internet financial products and storage medium |
CN117171203B (en) * | 2023-09-04 | 2024-04-26 | 申万宏源证券有限公司 | SQL automatic generation method and system based on zero code reasoning engine |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110335057A (en) * | 2019-04-30 | 2019-10-15 | 广发证券股份有限公司 | A kind of fund Precision Marketing Method that machine learning is merged with artificial rule |
CN111612519A (en) * | 2020-04-13 | 2020-09-01 | 广发证券股份有限公司 | Method, device and storage medium for identifying potential customers of financial product |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109509040A (en) * | 2019-01-03 | 2019-03-22 | 广发证券股份有限公司 | Predict modeling method, marketing method and the device of fund potential customers |
CN111612632A (en) * | 2020-05-14 | 2020-09-01 | 中国工商银行股份有限公司 | User investment data processing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN114862506A (en) | 2022-08-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||