CN107798608A

CN107798608A - Investment product combination recommendation method and system

Info

Publication number: CN107798608A
Application number: CN201710976370.2A
Authority: CN
Inventors: 张桐; 肖奋溪
Original assignee: Shenzhen Fly Resistant Technology Co Ltd
Current assignee: Shenzhen Fly Resistant Technology Co Ltd
Priority date: 2017-10-19
Filing date: 2017-10-19
Publication date: 2018-03-13

Abstract

The invention discloses an investment product combination recommendation method, comprising: collecting user information and investment product information; establishing and training a reinforcement learning network model; obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model; obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information; and recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information. The invention also discloses an investment product combination recommendation system. The invention can recommend to a user a high-quality investment product combination suited to that user.

Description

Investment product combination recommendation method and system

Technical field

The present invention relates to the field of computer technology, and in particular to an investment product combination recommendation method and system.

Background

With social progress and economic development, investment and wealth management has increasingly become a need of everyone in society. Yet the means of investment and wealth management currently on the market are varied and investment products are numerous, so how to select suitable financial products has become a difficulty for people. There is at present no well-developed financial product recommendation system on the market, let alone one that recommends combinations of financial products.

Without a well-developed recommendation system for financial product combinations, ordinary consumers choose investment products blindly and tend to be either too aggressive or too conservative. Unable to weigh return against risk, they invest inefficiently.

Summary of the invention

In view of the problems in the prior art, the present invention provides an investment product combination recommendation method and system that can recommend to a user a high-quality investment product combination suited to that user.

The technical solutions proposed by the present invention to address the above technical problem are as follows:

In one aspect, the present invention provides an investment product combination recommendation method, comprising:

Collecting user information and investment product information;

Establishing and training a reinforcement learning network model;

Obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model;

Obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information;

Recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information.

Further, establishing and training the reinforcement learning network model specifically includes:

Obtaining historical user information, historical investment product information, and the historical return and risk information of investment products;

Establishing the reinforcement learning network model, inputting the historical user information into the model, and outputting an initial risk preference;

Obtaining a pre-recommended investment product combination according to the initial risk preference and the historical investment product information;

Returning the historical return and risk information of the pre-recommended investment product combination to the reinforcement learning network model to adjust the parameters of the model, until the state of the model is optimal.

Further, obtaining the investment product combination to recommend to the user according to the user risk preference and the investment product information specifically includes:

Combining different investment products according to the investment product information to generate a list of investment product combinations with different risk coefficients;

Performing cosine similarity matching between the user risk preference and the investment product combinations in the list to obtain the several combinations with the highest similarity, and taking the combination with the highest return among them as the investment product combination to recommend to the user.

Further, the reinforcement learning network model includes an actor (Actor) network;

Obtaining the user risk preference according to the user information and based on the trained reinforcement learning network model specifically includes:

Inputting the user information into the trained reinforcement learning network model, and outputting the user risk preference from the Actor network.

Further, the reinforcement learning network model also includes a critic (Critic) network;

Optimizing the reinforcement learning network model according to the actual return and risk information specifically includes:

Inputting the actual return and risk information into the reinforcement learning network model, and computing and outputting, by the Critic network, a reward value or penalty value for the investment product combination recommended to the user;

Inputting the reward value or penalty value into the Actor network and updating the parameters in the Actor network, so as to optimize the reinforcement learning network model.

Further, computing and outputting, by the Critic network, the reward value or penalty value for the investment product combination recommended to the user specifically includes:

Detecting, by the Critic network, whether the actual return and risk information match the user's satisfaction;

If they match, computing and outputting a reward value for the recommended investment product combination;

If they do not match, computing and outputting a penalty value for the recommended investment product combination.

Further, before establishing and training the reinforcement learning network model, the method also includes:

Normalizing the collected data and converting it into structured data stored in a database.

In another aspect, the present invention provides an investment product combination recommendation system, comprising:

An information collection module, for collecting user information and investment product information;

A model training module, for establishing and training a reinforcement learning network model;

A risk preference acquisition module, for obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model;

A recommendation module, for obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information; and

A model optimization module, for recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information.

Further, the recommendation module specifically includes:

An investment product combining unit, for combining different investment products according to the investment product information to generate a list of investment product combinations with different risk coefficients; and

An investment product combination recommendation unit, for performing cosine similarity matching between the user risk preference and the investment product combinations in the list to obtain the several combinations with the highest similarity, and taking the combination with the highest return among them as the investment product combination to recommend to the user.

Further, the reinforcement learning network model includes an actor (Actor) network and a critic (Critic) network;

The risk preference acquisition module is specifically used for:

Inputting the user information into the trained reinforcement learning network model, and outputting the user risk preference from the Actor network;

The model optimization module specifically includes:

A computation and output unit, for inputting the actual return and risk information into the reinforcement learning network model, and computing and outputting, by the Critic network, a reward value or penalty value for the recommended investment product combination; and

A parameter updating unit, for inputting the reward value or penalty value into the Actor network and updating the parameters in the Actor network, so as to optimize the reinforcement learning network model.

The beneficial effects of the technical solutions provided by the embodiments of the present invention are:

A reinforcement learning network model is established, the user risk preference is obtained from the collected user information and matched against demand, and a high-quality investment product combination suited to the user is recommended. After the user adopts the investment product combination, its actual return and risk information are fed back to the reinforcement learning network model, which is thereby continuously optimized, improving its matching precision and allowing it to adapt flexibly to the environment. For investors, this makes it possible to keep track of dynamic risk in the market at any time and to maximize returns.

Brief description of the drawings

To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of the investment product combination recommendation method provided by embodiment one of the present invention;

Fig. 2 is a schematic diagram of the investment product combination recommendation principle in the method provided by embodiment one of the present invention;

Fig. 3 is a schematic structural diagram of the investment product combination recommendation system provided by embodiment two of the present invention.

Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Embodiment one

This embodiment of the present invention provides an investment product combination recommendation method. Referring to Fig. 1, the method includes:

S1, collecting user information and investment product information;

S2, establishing and training a reinforcement learning network model;

S3, obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model;

S4, obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information;

S5, recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information.

It should be noted that in step S1 the collected user information includes the user's personal information and personal preference information, i.e. the investor's occupation, income, hobbies, savings, location, social circle, whether the investor owns a car, has loans, has medical coverage, or has insurance, social security information, personal credit record, and the like. The investment product information includes investment product attribute information and economic environment information: the attribute information includes the product's term, maximum risk, expected return, liquidation speed, stability, and so on, and the economic environment information includes stock index levels, currency exchange rates, crude oil prices, and so on.

It should be noted that once enough data has been collected, the data must be preprocessed, i.e. normalized, which includes: unifying the data to the same dimensionality, e.g. with a word2vec model the occupation "lawyer" may be represented as (0, 0.6, 0.5) and "student" as (0.7, 0.1, 0.3); unifying all information to the same time period, e.g. expressing all income per year and in RMB; and structuring the data, e.g. representing the user's personal information as {sex, occupation, income, etc.} and the investment product information as {average annual return, maximum annual return, maximum annual loss, whether it can be cashed out at any time, assessed risk coefficient, etc.}.
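The preprocessing described above can be sketched as follows. This is only an illustration under stated assumptions: the field names, the `UserRecord` type, the lookup table standing in for a word2vec model, and the monthly-to-annual conversion are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass

# Hypothetical stand-in for word2vec output; the two vectors are the
# examples quoted in the description.
OCCUPATION_VECTORS = {
    "lawyer": (0.0, 0.6, 0.5),
    "student": (0.7, 0.1, 0.3),
}

# Assumed conversion factors for unifying amounts to a yearly period.
PERIODS_PER_YEAR = {"year": 1, "month": 12}


@dataclass
class UserRecord:
    """Structured user information: {sex, occupation, income}."""
    sex: str
    occupation_vec: tuple
    annual_income_rmb: float


def to_annual(amount, period):
    """Convert an amount quoted per `period` into a per-year amount."""
    return amount * PERIODS_PER_YEAR[period]


def structure_user(raw):
    """Turn a raw collected record into a structured, normalized one."""
    return UserRecord(
        sex=raw["sex"],
        occupation_vec=OCCUPATION_VECTORS[raw["occupation"]],
        annual_income_rmb=to_annual(raw["income"], raw["income_period"]),
    )


record = structure_user(
    {"sex": "F", "occupation": "lawyer", "income": 20000, "income_period": "month"}
)
```

A real system would replace the lookup table with a trained embedding model and would also handle currency conversion, which this sketch leaves out.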

Further, in step S2, establishing and training the reinforcement learning network model specifically includes:

It should be noted that after the historical return and risk information of the pre-recommended investment product combination is returned to the reinforcement learning network model, the model computes and outputs a reward value or a penalty value, which is then fed back to the model to adjust its parameters. User information is then input to the adjusted model and training continues until the state of the reinforcement learning network model is optimal.

The reinforcement learning network model includes an actor (Actor) network and a critic (Critic) network. The Actor network is a fully connected neural network model. Its input has the same structure as the structured user information, so that user information can be fed in; its output consists of several classification outputs with different risk coefficients, so that the user risk preference corresponding to the analyzed user information can be output.
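As a minimal sketch of such an Actor network, the following NumPy code runs a forward pass through a fully connected network whose outputs are a few risk categories. The layer sizes, the ReLU and softmax choices, the five-feature input, and the function names are assumptions made for illustration; the patent only states that the network is fully connected and outputs classification items with different risk coefficients.

```python
import numpy as np


def init_actor(n_in, n_hidden, n_classes, seed=0):
    """Randomly initialize a two-layer fully connected network (assumed sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }


def actor_forward(params, x):
    """Map a structured user vector to probabilities over risk categories."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU layer
    logits = hidden @ params["W2"] + params["b2"]
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()  # softmax over the risk categories


# Hypothetical structured user vector with five normalized features.
user = np.array([0.0, 0.6, 0.5, 0.24, 1.0])
params = init_actor(n_in=5, n_hidden=16, n_classes=4)
probs = actor_forward(params, user)
risk_class = int(np.argmax(probs))  # index of the most likely risk category
```

With untrained weights the chosen category is arbitrary; training is what ties the output to the user's actual preference.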

After the reinforcement learning network model is established, initialize the Actor network f with parameters θ^π and the Critic network Q with parameters θ^Q, initialize the Actor target network parameters θ^π′ ← θ^π and the Critic target network parameters θ^Q′ ← θ^Q, initialize the policy model g with the Actor network, and initialize the history storage container B.

Then, in an outer loop of M iterations:

Initialize the space of all available actions (i.e. obtain the selectable user risk preference options);

Receive state information from the environment (i.e. obtain the user information).

In an inner loop of T iterations:

Obtain an action from the acquired information and the Actor network;

Store the action currently taken, together with the corresponding information obtained for it, in storage container B;

Sample a batch of records from storage container B;

Update the Critic network parameters by minimizing the loss function L(θ^Q), with the update formulas:

y_i = r_i + γ Q′(s_{i+1}, π′(s_{i+1} | θ^π′) | θ^Q′)

L(θ^Q) = (1/N) Σ_i (y_i − Q(s_i, a_i | θ^Q))^2

where y_i denotes the target output, r_i the reward value, γ the reward discount factor, π′(s_{i+1} | θ^π′) the policy function by which the Actor network selects the action a_{i+1} to execute in state s_{i+1}, Q′(s_{i+1}, a_{i+1} | θ^Q′) the maximum reward value obtainable by taking action a_{i+1} in state s_{i+1}, θ^Q′ the Critic target network parameters, θ^Q the Critic network parameters, and N the number of sampled records.
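The Critic update can be illustrated numerically. The sketch below assumes the standard temporal-difference form, in which the target is the reward plus the discounted target-network estimate for the next state and the loss is the mean squared difference from the Critic's prediction. The function names and the sample numbers are hypothetical, and the network evaluations themselves are replaced by plain arrays.

```python
import numpy as np


def td_targets(rewards, next_q_target, gamma):
    """Target output: reward plus discounted target-network value of the next state."""
    return rewards + gamma * next_q_target


def critic_loss(q_pred, targets):
    """Mean squared error between the targets and the Critic's predictions."""
    return float(np.mean((targets - q_pred) ** 2))


# A hypothetical batch of three records sampled from the storage container.
rewards = np.array([1.0, -1.0, 0.5])   # reward or penalty values
next_q = np.array([2.0, 0.0, 1.0])     # target Critic output for the next states
q_pred = np.array([2.5, -1.0, 1.0])    # current Critic output for (state, action)

y = td_targets(rewards, next_q, gamma=0.9)
loss = critic_loss(q_pred, y)
```

In a full implementation the Critic parameters would then be moved along the negative gradient of this loss by an optimizer, which is omitted here.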

Update the Actor network parameters using the sampled gradient, with the update formula:

∇_θ^π J ≈ (1/N) Σ_i ∇_θ^π [ Q(s_i, π(s_i | θ^π) | θ^Q) ]

where θ^π denotes the Actor network parameters, π(s | θ^π) the mapping function of the Actor network onto the action space in state s, ∇_θ^π [·] differentiation of the bracketed expression with respect to the weights θ^π, and N the number of sampled records.

In addition, the Critic target network parameters and the Actor target network parameters are updated as follows:

θ^Q′←τθ^Q+(1-τ)θ^Q′；

θ^π′←τθ^π+(1-τ)θ^π′；

where τ denotes the update coefficient, θ^Q and θ^π denote the Critic and Actor network parameters respectively, and θ^Q′ and θ^π′ denote the Critic and Actor target network parameters respectively.
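The two target-network updates have the same form and can be sketched with one helper. The dictionary layout of the parameters and the value τ = 0.1 are assumptions chosen only for the example.

```python
import numpy as np


def soft_update(target, online, tau):
    """Return tau * online + (1 - tau) * target for every parameter array."""
    return {name: tau * online[name] + (1.0 - tau) * target[name] for name in target}


# Hypothetical one-array parameter sets for a network and its target copy.
online = {"W": np.array([1.0, 2.0])}
target = {"W": np.array([0.0, 0.0])}

target = soft_update(target, online, tau=0.1)
```

A small τ makes the target networks trail the trained networks slowly, which is the usual reason for keeping separate target copies.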

After the parameters of the reinforcement learning network model are updated, the user's new state is input to the model and the above steps are repeated, iteratively updating the model parameters until the state of the reinforcement learning network model is optimal.

Further, in step S3, obtaining the user risk preference according to the user information and based on the trained reinforcement learning network model specifically includes:

Further, in step S4, obtaining the investment product combination to recommend to the user according to the user risk preference and the investment product information specifically includes:

It should be noted that the collected investment product information is expressed, per product, as the rate of return and maximum risk over the same period to form a product list. Following capital management principles, the products are then mixed into combinations with various risk coefficients, finally yielding a list of investment product combinations with different risk coefficients to be matched against the user risk preference.
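One way to picture the generation of the combination list is sketched below. The patent does not specify the capital management principles used, so this example simply assumes pairwise mixes at a few fixed weightings and takes the combination's return and risk coefficient as weighted averages. The product names, figures, and weightings are hypothetical.

```python
from itertools import combinations

# Hypothetical product list: annual rate of return and maximum risk per product.
products = {
    "deposit": {"annual_return": 0.02, "max_risk": 0.00},
    "bond_fund": {"annual_return": 0.04, "max_risk": 0.05},
    "equity_fund": {"annual_return": 0.09, "max_risk": 0.30},
}

# Assumed allocation weights applied to every pair of products.
WEIGHT_PAIRS = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.8)]


def build_combinations(products, weight_pairs):
    """Mix products pairwise and return the combinations sorted by risk."""
    combos = []
    for (name_a, a), (name_b, b) in combinations(products.items(), 2):
        for w_a, w_b in weight_pairs:
            combos.append({
                "weights": {name_a: w_a, name_b: w_b},
                # Weighted averages are a simplifying assumption.
                "annual_return": w_a * a["annual_return"] + w_b * b["annual_return"],
                "risk": w_a * a["max_risk"] + w_b * b["max_risk"],
            })
    return sorted(combos, key=lambda c: c["risk"])


combo_list = build_combinations(products, WEIGHT_PAIRS)
```

Treating combined risk as a weighted average ignores correlation between products; a production system would need a proper portfolio risk model.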

The user risk preference is given in the form of a risk coefficient. According to the user's requirements (e.g. flexible deposit and withdrawal, holding period), the user risk preference is converted into a vector representation so that cosine similarity matching can be performed against the investment product combinations in the combination list.

The cosine similarity is computed as follows:

cos θ = (a · b) / (‖a‖ ‖b‖)

where cos θ is the cosine similarity, a is the user risk preference vector, and b is the investment product combination vector.

After matching, the k (k ≥ 1) investment product combinations with the highest cosine similarity are obtained, and among these k combinations the one with the highest return is taken as the investment product combination to recommend to the user.
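The two-stage selection described above, top k by cosine similarity followed by highest return, can be sketched as follows. The three-component vectors (risk coefficient, access flexibility, holding period), the combination names, and the return figures are hypothetical, and the vectors are assumed to be non-zero.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def recommend(user_vec, combos, k):
    """Keep the k most similar combinations, then pick the highest return."""
    ranked = sorted(
        combos,
        key=lambda c: cosine_similarity(user_vec, c["vector"]),
        reverse=True,
    )
    return max(ranked[:k], key=lambda c: c["annual_return"])


user_vec = np.array([0.3, 1.0, 0.5])
combos = [
    {"name": "A", "vector": np.array([0.3, 1.0, 0.5]), "annual_return": 0.03},
    {"name": "B", "vector": np.array([0.4, 0.9, 0.5]), "annual_return": 0.05},
    {"name": "C", "vector": np.array([1.0, 0.0, 0.1]), "annual_return": 0.12},
]

best = recommend(user_vec, combos, k=2)
```

Here combination C has the highest return but is excluded in the first stage because its vector points in a very different direction from the user's, so the choice falls between A and B.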

Further, in step S5, optimizing the reinforcement learning network model according to the actual return and risk information specifically includes:

It should be noted that after an investment product combination is recommended to the user, it is added to the historical behavior record, and its actual profit and risk status are subsequently tracked. Periodically, the actual profit and risk information of the combination are input to the reinforcement learning network model for computation. If the actual profit and risk of the recommended combination match the user's ability to cope with contingencies, i.e. match the user's satisfaction, a reward value is output; if they deviate from it, i.e. do not match the user's satisfaction, a penalty value is output. The reward value or penalty value is fed back to the Actor network to update its parameters. Each optimization is fed back recursively in the form of the Bellman equation, and the network is updated continuously until every recommended investment product combination attains the highest efficiency.
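The reward or penalty decision can be sketched as a simple rule. The patent does not define how satisfaction is measured, so this example assumes the user is satisfied when the actual return reaches a minimum acceptable level and the actual risk stays within a tolerated maximum. The thresholds and the values +1 and -1 are hypothetical.

```python
def feedback_signal(actual_return, actual_risk, min_return, max_risk,
                    reward=1.0, penalty=-1.0):
    """Return a reward if the outcome matches the user's expectations, else a penalty."""
    satisfied = actual_return >= min_return and actual_risk <= max_risk
    return reward if satisfied else penalty


# Outcome within the user's assumed expectations: a reward is produced.
good = feedback_signal(actual_return=0.05, actual_risk=0.10,
                       min_return=0.03, max_risk=0.20)

# Return below the assumed minimum: a penalty is produced.
bad = feedback_signal(actual_return=0.01, actual_risk=0.10,
                      min_return=0.03, max_risk=0.20)
```

In the described method this signal is what the Critic network passes back to the Actor network; a graded value proportional to the deviation would be an alternative to the fixed two-valued signal used here.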

Fig. 2 is a schematic diagram of the investment product combination recommendation method provided by the embodiment of the present invention. Data collection and preprocessing are performed first to obtain the user information, which is input to the Actor network to output the user risk preference. Then, according to the user information, the cosine similarity between the user risk preference and the investment product combination data is computed, and the combination with the highest similarity and the highest return is taken as the investment product combination to recommend to the user. The actual return and risk information of the recommended combination are input to the Critic network, which computes a reward value or penalty value and feeds it back to the Actor network to update the Actor network parameters, thereby continuously optimizing the reinforcement learning network model.

The embodiment of the present invention establishes a reinforcement learning network model, obtains the user risk preference from the collected user information, matches it against demand, and recommends a high-quality investment product combination suited to the user. After the user adopts the investment product combination, its actual return and risk information are fed back to the reinforcement learning network model, which is thereby continuously optimized, improving its matching precision and allowing it to adapt flexibly to the environment. For investors, this makes it possible to keep track of dynamic risk in the market at any time and to maximize returns.

Embodiment two

This embodiment of the present invention provides an investment product combination recommendation system that can implement all the steps of the above investment product combination recommendation method. Referring to Fig. 3, the system includes:

An information collection module 1, for collecting user information and investment product information;

A model training module 2, for establishing and training a reinforcement learning network model;

A risk preference acquisition module 3, for obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model;

A recommendation module 4, for obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information; and

A model optimization module 5, for recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information.

Further, the recommendation module specifically includes:

The risk preference acquisition module is specifically used for:

The model optimization module specifically includes:

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the invention.

Claims

1. An investment product combination recommendation method, characterized by comprising:

Collecting user information and investment product information;

Establishing and training a reinforcement learning network model;

Obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model;

Obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information;

Recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information.
2. The investment product combination recommendation method according to claim 1, characterized in that establishing and training the reinforcement learning network model specifically includes:

Obtaining historical user information, historical investment product information, and the historical return and risk information of investment products;

Establishing the reinforcement learning network model, inputting the historical user information into the model, and outputting an initial risk preference;

Obtaining a pre-recommended investment product combination according to the initial risk preference and the historical investment product information;

Returning the historical return and risk information of the pre-recommended investment product combination to the reinforcement learning network model to adjust the parameters of the model, until the state of the model is optimal.
3. The investment product combination recommendation method according to claim 1, characterized in that obtaining the investment product combination to recommend to the user according to the user risk preference and the investment product information specifically includes:

Combining different investment products according to the investment product information to generate a list of investment product combinations with different risk coefficients;

Performing cosine similarity matching between the user risk preference and the investment product combinations in the list to obtain the several combinations with the highest similarity, and taking the combination with the highest return among them as the investment product combination to recommend to the user.
4. The investment product combination recommendation method according to claim 1, characterized in that the reinforcement learning network model includes an actor (Actor) network;

Obtaining the user risk preference according to the user information and based on the trained reinforcement learning network model specifically includes:

Inputting the user information into the trained reinforcement learning network model, and outputting the user risk preference from the Actor network.
5. The investment product combination recommendation method according to claim 4, characterized in that the reinforcement learning network model also includes a critic (Critic) network;

Optimizing the reinforcement learning network model according to the actual return and risk information specifically includes:

Inputting the actual return and risk information into the reinforcement learning network model, and computing and outputting, by the Critic network, a reward value or penalty value for the investment product combination recommended to the user;

Inputting the reward value or penalty value into the Actor network and updating the parameters in the Actor network, so as to optimize the reinforcement learning network model.
6. The investment product combination recommendation method according to claim 5, characterized in that computing and outputting, by the Critic network, the reward value or penalty value for the investment product combination recommended to the user specifically includes:

Detecting, by the Critic network, whether the actual return and risk information match the user's satisfaction;

If they match, computing and outputting a reward value for the recommended investment product combination;

If they do not match, computing and outputting a penalty value for the recommended investment product combination.
7. The investment product combination recommendation method according to claim 1, characterized in that, before establishing and training the reinforcement learning network model, the method also includes:

Normalizing the collected data and converting it into structured data stored in a database.
8. An investment product combination recommendation system, characterized by comprising:

An information collection module, for collecting user information and investment product information;

A model training module, for establishing and training a reinforcement learning network model;

A risk preference acquisition module, for obtaining a user risk preference according to the user information and based on the trained reinforcement learning network model;

A recommendation module, for obtaining an investment product combination to recommend to the user according to the user risk preference and the investment product information; and

A model optimization module, for recording the actual return and risk information after the user adopts the investment product combination, and optimizing the reinforcement learning network model according to the actual return and risk information.
9. The investment product combination recommendation system according to claim 8, characterized in that the recommendation module specifically includes:

An investment product combining unit, for combining different investment products according to the investment product information to generate a list of investment product combinations with different risk coefficients; and

An investment product combination recommendation unit, for performing cosine similarity matching between the user risk preference and the investment product combinations in the list to obtain the several combinations with the highest similarity, and taking the combination with the highest return among them as the investment product combination to recommend to the user.
10. The investment product combination recommendation system according to claim 8, characterized in that the reinforcement learning network model includes an actor (Actor) network and a critic (Critic) network;

The risk preference acquisition module is specifically used for:

Inputting the user information into the trained reinforcement learning network model, and outputting the user risk preference from the Actor network;

The model optimization module specifically includes:

A computation and output unit, for inputting the actual return and risk information into the reinforcement learning network model, and computing and outputting, by the Critic network, a reward value or penalty value for the recommended investment product combination; and

A parameter updating unit, for inputting the reward value or penalty value into the Actor network and updating the parameters in the Actor network, so as to optimize the reinforcement learning network model.