CN113298279A - Stage-distinguishing crowd funding progress prediction method - Google Patents

Stage-distinguishing crowd funding progress prediction method

Info

Publication number
CN113298279A
CN113298279A
Authority
CN
China
Prior art keywords
progress
crowd funding
time step
state
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010107306.2A
Other languages
Chinese (zh)
Inventor
Qi Liu (刘淇)
Jun Wang (王俊)
Hefu Zhang (章和夫)
Zhen Pan (潘镇)
Kai Zhang (张凯)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202010107306.2A
Publication of CN113298279A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/10 - Office automation; Time management
    • G06Q10/103 - Workflow collaboration or project management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 - Services
    • G06Q50/26 - Government or public services
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/951 - Indexing; Web crawling techniques

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a stage-distinguishing crowd funding progress prediction method, which comprises the following steps: crawling progress data of crowd funding projects, and extracting static features and dynamic features from the progress data; regarding users of a crowd funding platform as an agent and crowd funding projects as the environment, and training the agent by reinforcement learning in combination with the static features and dynamic features of the crowd funding projects; and judging, through the trained agent, the stage in which a crowd funding project to be predicted is located, and predicting the future progress of the crowd funding project with the corresponding strategy. The method can improve the accuracy of progress prediction.

Description

Stage-distinguishing crowd funding progress prediction method
Technical Field
The invention relates to the field of Internet crowd funding, and in particular to a stage-distinguishing crowd funding progress prediction method.
Background
Crowd funding progress prediction means predicting the daily progress (expressed as a percentage) of a given number of future days from the sequence of daily progress changes observed so far for a crowd funding project. The progress sequence carries domain-specific characteristics, and how to better integrate these characteristics into the progress prediction problem is an open and widely studied research question.
In current research work and patents, the following methods are mainly used for progress prediction in the crowd funding field:
1) Decomposition-synthesis prediction methods based on pure time series analysis.
Work based on pure time series analysis aims at mining the potentially different patterns of the sequence itself and synthesizing the final sequence from the predictions made within these patterns. Analysis in previous work verifies that decomposing the sequence and mapping it to different spaces can indeed help improve the accuracy of the predicted progress.
2) Sequence-to-sequence recurrent neural network prediction methods.
Progress prediction methods based on recurrent neural networks use the neural network structure to automatically extract dynamically changing characteristics (including comment information and progress changes) and add prior information specific to the crowd funding field as constraints. Experiments show that such methods are better suited to sequence prediction in the crowd funding field.
Although both kinds of methods consider the characteristics of crowd funding sequences to a certain extent and try to give different prediction results for different patterns, they do not fully consider the interaction process between users and crowd funding projects. For example, the progress at the current time may affect a user's decision, and the user's decision in turn affects the project's progress at the next time; the two influence each other in a dynamic interaction process.
Disclosure of Invention
The invention aims to provide a stage-distinguishing crowd funding progress prediction method which can improve the accuracy of progress prediction.
The purpose of the invention is realized by the following technical scheme:
a stage-differentiated crowd funding progress prediction method comprises the following steps:
crawling progress data of crowd funding projects, and extracting static features and dynamic features of the progress data;
the method comprises the steps that users of a crowd funding platform are regarded as an intelligent agent, crowd funding projects are regarded as environments, and the intelligent agent is trained in a reinforcement learning mode by combining static features and dynamic features of the crowd funding projects;
and judging the stage of the crowd funding project to be predicted through the trained intelligent agent, and predicting future progress of the crowd funding project by using a corresponding strategy.
According to the technical scheme provided by the invention, the interaction process between users and crowd funding projects in the crowd funding field is fully considered, stage-differentiated progress prediction can be realized, and the accuracy of progress prediction is greatly improved compared with the prior art. In addition, the prediction result helps a project publisher decide whether to start production of the crowd funded product in advance and to control the production schedule; for public-welfare crowd funding, the prediction also helps judge whether a project will succeed and roughly how many days it will need, which can ultimately produce social impact. Moreover, it can help a website or crowd funding platform recommend projects to suitable users in a personalized way.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a stage-differentiated crowd funding progress prediction method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the actor inside the agent according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the differences between the different sub-strategies after training.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a stage-distinguishing crowd funding progress prediction method, which mainly comprises the following steps:
1. and crawling progress data of crowd funding projects, and extracting static features and dynamic features of the progress data.
In the embodiment of the invention, progress data of crowd funding projects is crawled from a crowd funding platform (e.g., Indiegogo).
Progress data of a crowd funding project includes: the title, labels, project category, financing goal, project introduction, daily comments, daily progress changes, number of financed days, and number of remaining financing days. The title, labels, project category, financing goal and project introduction are static data; the daily comments, daily progress changes, number of financed days and number of remaining financing days are dynamic data.
The text information of the static data and dynamic data is converted into corresponding vectorized representations using an embedding model (Word2Vec). Then, static features are extracted from the vectorized representation of the static data, the daily comments and day information are extracted from the vectorized representation of the dynamic data as dynamic features, and the daily progress changes are extracted as the labels to be predicted (i.e. the progress sequence below). Finally, min-max normalization is applied to all the feature data.
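The pipeline just described could be sketched as follows. This is a minimal illustration assuming gensim's Word2Vec and a simple token-average for text fields; the function names, placeholder corpus and dimensions are illustrative, not from the patent.

import numpy as np
from gensim.models import Word2Vec

def embed_text(model, tokens, dim=100):
    # Average the Word2Vec vectors of known tokens; zero vector if none are known.
    vecs = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def min_max_normalize(x):
    # Min-max normalization onto [0, 1]; constant columns are left unscaled.
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / np.where(hi - lo > 0, hi - lo, 1.0)

# Train the embedding on tokenized project texts (titles, labels,
# introductions, daily comments), then build per-project / per-day features.
corpus = [["smart", "watch", "kickoff"], ["open", "source", "camera"]]  # placeholder
w2v = Word2Vec(sentences=corpus, vector_size=100, min_count=1)

static_x = embed_text(w2v, ["smart", "watch"])      # static features X_i
comment_e = embed_text(w2v, ["open", "camera"])     # comment feature e_i^t
days_d = np.array([3.0, 57.0])                      # (financed days, remaining days)
dynamic_c = np.concatenate([comment_e, days_d])     # one day's dynamic feature c_i^t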
A crowd funding project i is represented as a triple (X_i, C_i, P_i), where X_i represents the static features, and C_i and P_i are dynamic sequences representing the project's dynamic feature sequence and progress sequence, respectively.
Crowd funding progress prediction means that, given the static features X_i of crowd funding project i, the dynamic feature sequence {c_i^t, t = 1, ..., T - τ} and the project progress sequence {p_i^t, t = 1, ..., T - τ} of the first T - τ days, the project progress sequence {p_i^t, t = T - τ + 1, ..., T} of the next τ days is predicted. The dynamic feature vector c_i^t of day t of project i is composed of the feature vector e_i^t of the comment information in the progress data of the crowd funded project and the vector d_i^t formed from the day information (number of financed days, number of remaining financing days), with t = 1, ..., T - τ, where T is the total number of crowd funding days.
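For concreteness, the triple could be held in a small container such as the following sketch; the class and field names are illustrative assumptions, not terms from the patent.

from dataclasses import dataclass
import numpy as np

@dataclass
class CrowdfundingProject:
    static: np.ndarray    # X_i: static feature vector
    dynamic: np.ndarray   # C_i: (T - tau) x d matrix of daily dynamic features
    progress: np.ndarray  # P_i: daily progress percentages observed so far

    def observed_days(self) -> int:
        # Number of days for which progress has been observed (T - tau).
        return len(self.progress)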
2. Users of the crowd funding platform are regarded as an agent, crowd funding projects are regarded as the environment, and the agent is trained by reinforcement learning in combination with the static features and dynamic features of the crowd funding projects.
In the embodiment of the invention, the problem of predicting the future progress of crowd funding projects is modeled as a reinforcement learning problem.
The reinforcement learning quadruple is expressed as < S, A, P, R >; wherein:
S is the state space and comprises the state vector s of every time step. A GRU (gated recurrent unit) network is used to model state changes along the time dimension: after the GRU network is initialized with the static features, the input at each time step is the dynamic feature of that step, and the output hidden vector is shown to the agent as the state vector of the environment, i.e. s_t = h_t, where s denotes a state vector, h denotes the hidden vector output by the GRU network, and t denotes a time step, i.e. a certain day;
A is the action space, which comprises the action a of each time step. Considering that what is predicted is the project progress (as a percentage) at the next time step, and that the progress may exceed 100%, the action space is defined as a continuous space made up of all positive numbers, with a_t = p̂_{t+1}, where p̂_{t+1} denotes the estimated progress at the next time step;
R is the return function. Considering that an appropriate function needs to be selected according to the deviation between the predicted and true project progress, a monotonically decreasing, positive, continuously differentiable function is selected as the return function, and its output after a running average is taken as the final return value r. Since the goal of reinforcement learning is to maximize the cumulative return, an optimal strategy learned from such a return function minimizes the cumulative deviation.
Because the method in embodiments of the present invention is model-free, the state transition probability P is not explicitly defined herein.
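As a rough sketch of the state modeling just described, the following PyTorch module initializes a GRU hidden state from the static features and feeds in the daily dynamic features, so that the hidden vector plays the role of s_t = h_t. The module name, dimensions and the tanh initialization layer are assumptions for illustration.

import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    def __init__(self, static_dim, dyn_dim, hidden_dim):
        super().__init__()
        self.init_h = nn.Linear(static_dim, hidden_dim)  # h_0 from static features
        self.gru = nn.GRUCell(dyn_dim, hidden_dim)

    def forward(self, static_x, dyn_seq):
        # static_x: (B, static_dim); dyn_seq: (T, B, dyn_dim)
        h = torch.tanh(self.init_h(static_x))
        states = []
        for c_t in dyn_seq:          # one iteration per day
            h = self.gru(c_t, h)
            states.append(h)
        return torch.stack(states)   # (T, B, hidden_dim), with s_t = h_t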
In the embodiment of the invention, an actor-critic reinforcement learning framework is adopted, i.e. the agent internally comprises two components, an actor and a critic (also called the evaluator). The actor is used to learn an action strategy μ, i.e. a mapping from the state space to the action space, whose output is the final action a = μ_θ(s), where θ represents the parameters of the strategy μ. The critic is used to evaluate the action taken by the actor and outputs a value function Q^μ(s, a), the value of taking action a in state s under strategy μ (μ is omitted below when no particular strategy is specified), while learning itself by temporal difference.
It should be noted that the output of the critic in the training phase provides the direction of prediction for the actor; in the testing phase, progress can be predicted by the actor alone. In addition, the subscripts τ and t in the formulas of this patent all indicate the corresponding time step (i.e. day); parameters with subscripts τ or t are the parameters of that time step, e.g. a_τ and s_t denote the action at time step τ and the state at time step t. Where a time-step-related parameter carries no time step index, the parameter is used without needing to specify a particular time step; e.g. a and s here denote an action and a state.
The learning objective of the actor is to maximize the sum of expected future returns, namely:
L_fu = E_{s_t ∼ ρ^μ} [ Σ_{τ=t}^{T} γ^{τ-t} r(s_τ, μ_θ(s_τ)) ]
where μ_θ denotes the strategy μ with θ as parameters, E denotes expectation, ρ^μ is the distribution obeyed by the states s_t under strategy μ, γ denotes the discount factor of the cumulative computation, r(s_τ, μ_θ(s_τ)) is the immediate return obtained by taking action μ_θ(s_τ) in state s_τ according to strategy μ_θ, and T denotes the total number of crowd funding days.
The actor update target applicable to a deterministic policy is then:
∇_θ L_fu = E_{s_t ∼ ρ^μ} [ ∇_θ μ_θ(s_t) · ∇_a Q^μ(s_t, a_t) |_{a_t = μ_θ(s_t)} ]
where ∇_θ L_fu denotes the partial differential vector of L_fu with respect to θ, ∇_θ μ_θ(s_t) denotes the partial differential vector of μ_θ(s_t) with respect to θ, and ∇_a Q^μ(s_t, a_t) |_{a_t = μ_θ(s_t)} denotes the partial differential vector of Q^μ(s_t, a_t) with respect to a at a_t = μ_θ(s_t).
As the above equation shows, the actor is updated in the direction that maximizes the value estimated by the critic.
The critic learns itself from the expected deviation between its evaluation Q(s_t, a_t) at the current time step and the value re-estimated after the current immediate return is obtained, namely:
δ_t = r_t + γ Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t),
L_critic = E[ (δ_t)^2 ]
where r_t = r(s_t, a_t) denotes the immediate return at time step t, and δ_t is the deviation computed by the temporal difference method.
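A condensed single-transition sketch of these updates is given below, in the spirit of deterministic actor-critic methods. The network sizes, the Softplus output (which keeps the action positive, matching the action space defined above), the optimizers, and the absence of replay buffers or target networks are all simplifying assumptions.

import torch
import torch.nn as nn

state_dim = 64                       # assumed GRU hidden size
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, 1), nn.Softplus())   # positive action a_t
critic = nn.Sequential(nn.Linear(state_dim + 1, 64), nn.ReLU(),
                       nn.Linear(64, 1))                 # Q(s, a)
opt_actor = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_critic = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma = 0.99

def td_update(s, a, r, s_next):
    # Critic: minimize the squared temporal-difference error delta_t.
    with torch.no_grad():
        a_next = actor(s_next)
        target = r + gamma * critic(torch.cat([s_next, a_next], dim=-1))
    delta = target - critic(torch.cat([s, a], dim=-1))
    opt_critic.zero_grad()
    delta.pow(2).mean().backward()
    opt_critic.step()

    # Actor: ascend the critic's estimate at a = mu_theta(s).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
    opt_actor.zero_grad()
    actor_loss.backward()
    opt_actor.step()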
The above formulas show how the interaction between users of the crowd funding platform and crowd funding projects is modeled with a reinforcement learning framework, and how a progress prediction method that maximizes future returns is learned with the actor-critic framework.
Furthermore, the embodiment of the present invention requires that historical progress changes be learned accurately in order to predict future progress. For this purpose, the objective function of the actor is improved. Specifically:
At the current time step t, the deviation between the predicted progress p̂_τ and the true progress p_τ of all time steps before the current one is used as a loss function on past predictions:
L_pa = Σ_{τ=1}^{t-1} (p̂_τ - p_τ)^2
Meanwhile, considering that the progress of a crowd funding project increases monotonically, a penalty is imposed on the part of the predicted output at a larger time step (i.e. a later time step) that is smaller than the output at a smaller time step (i.e. an earlier time step), with the loss function:
L_reg = Σ_{τ=1}^{t-1} 1(p̂_{τ+1} < p̂_τ) · (p̂_τ - p̂_{τ+1})
where 1(·) denotes the indicator function;
The objective function of the actor is denoted as the weighted sum of the three parts L_fu, L_pa and L_reg, namely:
L_actor = L_fu - λ_1 L_pa - λ_2 L_reg
where λ_1 and λ_2 denote the weights of L_pa and L_reg, respectively.
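A sketch of how the two auxiliary terms could be computed is shown below. The exact functional forms (squared error for L_pa, a hinge on decreases for L_reg) and the default weights are assumptions consistent with, but not spelled out by, the text above.

import torch

def past_prediction_loss(pred, true):
    # L_pa: squared deviation between predicted and true progress, steps 1..t-1.
    return (pred - true).pow(2).sum()

def monotonicity_penalty(pred):
    # L_reg: penalize only predicted decreases (progress never shrinks).
    decrease = pred[:-1] - pred[1:]
    return torch.clamp(decrease, min=0.0).sum()

def actor_objective(l_fu, pred, true, lam1=0.5, lam2=0.5):
    # L_actor = L_fu - lambda1 * L_pa - lambda2 * L_reg, to be maximized.
    return l_fu - lam1 * past_prediction_loss(pred, true) - lam2 * monotonicity_penalty(pred)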
In addition, research shows that project progress sequences in the crowd funding field present an obvious U shape: investment behavior is denser at the start and end stages of a project, so the progress grows quickly, while investment behavior in the middle stage is sparser, so the progress grows slowly. Fully exploiting this important prior knowledge is also a difficulty addressed by the present invention. To make use of this sequence pattern, an option mechanism is used to improve the internal structure of the actor. Briefly, the actor determines the financing stage of the current project using a high-level strategy, and then selects a different low-level strategy, based on the input state, to give a more aggressive or a relatively conservative progress estimate.
Formally, ω denotes an option (which may be understood as a computation block), i.e. a triple ω = (I_ω, μ_ω, β_ω), where I_ω, μ_ω and β_ω respectively denote the initiation states of the option (a concept similar to the state s above), the low-level policy (a concept similar to the strategy μ above), and the termination probability function. In the embodiment of the present invention, the initiation states of all options are assumed to be the whole state space, i.e. I_ω = S. Here μ_ω denotes the strategy μ contained in ω, now automatically regarded as a low-level strategy, and in addition π(ω | s_t) denotes the high-level strategy, i.e. the probability of each option output for an input state s_t. The termination function β_ω denotes a mapping contained in ω from the state space to the interval [0, 1]. As shown in FIG. 2, when the state s_t is input at time step t, since option ω_{t-1} was selected at the previous time step, it terminates with the probability output by the termination function β(s_t, ω_{t-1}). If it terminates, a new option ω_t is selected according to the probabilities output by π(ω | s_t); otherwise ω_{t-1} is still used. Then an action, i.e. the predicted project progress, is output according to the low-level policy of ω_t, and the probability of termination is determined at the beginning of the next time step. The number of options shown in FIG. 2 is merely an example and not a limitation.
Those skilled in the art will appreciate that "low-level policy" and "high-level policy" are general terms of the option mechanism; they correspond to the different mapping functions described above.
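The option machinery inside the actor can be sketched as follows: at each step the previous option terminates with the probability given by β, and if it does, the high-level strategy π(ω | s_t) picks a new option, whose low-level strategy emits the progress estimate. The module layout, the two-option default, and the greedy re-selection (as used in the prediction phase described later) are illustrative assumptions.

import torch
import torch.nn as nn

class OptionActor(nn.Module):
    def __init__(self, state_dim, n_options=2):
        super().__init__()
        self.pi_high = nn.Sequential(nn.Linear(state_dim, n_options),
                                     nn.Softmax(dim=-1))      # pi(omega | s_t)
        self.beta = nn.Sequential(nn.Linear(state_dim, n_options),
                                  nn.Sigmoid())               # termination probs
        self.mu_low = nn.ModuleList([nn.Sequential(nn.Linear(state_dim, 1),
                                                   nn.Softplus())
                                     for _ in range(n_options)])  # low-level policies

    def step(self, s_t, prev_option):
        # Terminate the previous option with probability beta(s_t, prev_option).
        if torch.bernoulli(self.beta(s_t)[prev_option]).item() == 1:
            option = int(torch.argmax(self.pi_high(s_t)))  # pick the likeliest option
        else:
            option = prev_option
        progress_estimate = self.mu_low[option](s_t)       # action: p-hat_{t+1}
        return progress_estimate, option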
When the objective function L_actor is improved, the core idea is to treat the state-option tuple (s, ω) as an extended state. Accordingly, Q(s, a) (the value of taking action a in state s), μ_ω(s) (the low-level policy in state s) and β_ω(s) (the termination function in state s) become, after extension, Q(s, ω, a), μ(s, ω) and β(s, ω), respectively.
On the one hand, the critic can still update itself by temporal difference. However, since the current option terminates with a certain probability at the next time step, the expected value function of the next time step needs to be computed according to whether the current option terminates:
U(ω_t, s_{t+1}) = (1 - β(s_{t+1}, ω_t)) · Q(s_{t+1}, ω_t, a_{t+1}) + β(s_{t+1}, ω_t) · max_ω Q(s_{t+1}, ω, a_{t+1})
where β(s_{t+1}, ω_t) denotes the probability of terminating at the next time step: if the option does not terminate, the option of the current time step is still selected and the estimate is Q(s_{t+1}, ω_t, a_{t+1}); if it terminates, the maximum over all option estimates at the next time step, max_ω Q(s_{t+1}, ω, a_{t+1}), is selected instead of the current option's.
The critic updates its parameters by minimizing the deviation between the estimates before and after the current return is obtained, which is expressed as:
δ′_t = r_t + γ U(ω_t, s_{t+1}) - Q(s_t, ω_t, a_t),
L_critic′ = E[ (δ′_t)^2 ]
On the other hand, the update of the actor is divided into the update of the low-level strategy and the update of the termination function. The update of the low-level policy is derived directly from the state extension, i.e. the update of the L_fu part:
∇_θ L_fu = E_{(s_t, ω_t) ∼ ρ′^μ} [ ∇_θ μ_θ(s_t, ω_t) · ∇_a Q(s_t, ω_t, a_t) ]
where ρ′^μ denotes the distribution obeyed by the extended states (s_t, ω_t) under strategy μ, ∇_θ μ_θ(s_t, ω_t) denotes the partial differential vector of μ_θ(s_t, ω_t) (i.e. the action corresponding to the extended state) with respect to θ, and ∇_a Q(s_t, ω_t, a_t) denotes the partial differential vector of Q(s_t, ω_t, a_t) with respect to action a.
The update of the termination function is then given by:
∇ L_term = ∇ β(s_{t+1}, ω_t) · A(s_{t+1}, ω_t)
where A(s_{t+1}, ω_t) denotes the advantage of option ω_t in state s_{t+1}. The loss function L_term at the update of the termination function is expressed as:
L_term = β(s_{t+1}, ω_t) A(s_{t+1}, ω_t)
The final objective function of the actor is denoted as the weighted sum of the four parts L_fu, L_pa, L_reg and L_term, namely:
L_actor′ = L_fu - λ_1 L_pa - λ_2 L_reg - λ_3 L_term
where λ_3 denotes the weight of L_term.
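The option-aware critic target U and the termination loss can be sketched as below. Taking the advantage A as the gap between the current option's value and the best option's value is an assumption consistent with the max in the definition of U; critic_q is a placeholder for a critic network evaluated per option.

import torch

def u_target(critic_q, s_next, a_next, omega, beta_prob, n_options):
    # U = (1 - beta) * Q(s', omega, a') + beta * max_w Q(s', w, a')
    q_all = torch.stack([critic_q(s_next, w, a_next) for w in range(n_options)])
    return (1.0 - beta_prob) * q_all[omega] + beta_prob * q_all.max()

def termination_loss(critic_q, s_next, a_next, omega, beta_prob, n_options):
    # L_term = beta(s', omega) * A(s', omega); A is detached so only beta trains.
    q_all = torch.stack([critic_q(s_next, w, a_next) for w in range(n_options)])
    advantage = (q_all[omega] - q_all.max()).detach()
    return beta_prob * advantage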
It should be noted that the embodiment of the present invention does not specify how the high-level strategy is updated. In fact, the update of the high-level strategy may come from many reinforcement learning update methods, including but not limited to stochastic policy gradient updates, temporal difference methods, and planning methods.
3. The stage of the crowd funding project to be predicted is judged by the trained agent, and the future progress of the crowd funding project is predicted with the corresponding strategy.
First, when the number of options is set to 2, experiments verify that the two low-level strategies learned through the options are, respectively, relatively aggressive and relatively conservative. FIG. 3 illustrates the differences between the sub-strategies after training. The specific test process is as follows: all training projects are divided into 8 equal parts according to the financing period, the average value of the termination function over all time steps is computed for each part, and the result is plotted. One sub-strategy tends to be used at the initial and final stages of project financing, while the other tends to be used in the intermediate stage. Computing the average output of the two sub-strategies at the same time step shows that the former is larger, i.e. the sub-strategy used at the initial and final stages is more aggressive and suited to fast growth, while the other, relatively conservative, sub-strategy is suited to slow growth. This is consistent with the crowd funding sequence characteristics revealed by previous research.
Next, in the prediction phase, after the static and dynamic features of the crowd funding project to be predicted are extracted according to the method described in step 1, only the actor shown in FIG. 1 is used to predict progress. The corresponding state vector is determined from the static and dynamic features and input to the actor. During prediction, the option of the previous time step is randomly terminated according to the termination probability output at the previous time step. If it terminates, the current stage of the crowd funding project is judged by the high-level strategy: specifically, the state is input to the high-level strategy, which outputs the probabilities corresponding to the two options, and the option with the higher probability is selected. The stage to which the selected option applies is taken as the stage in which the project is judged to be. If the option corresponding to the fast-growth sub-strategy has the higher probability, this is equivalent to implicitly judging that the project is in its initial or final stage; conversely, if the option corresponding to the slow-growth sub-strategy has the higher probability, the high-level strategy considers the project to be in the intermediate financing stage. Both the fast-growth and slow-growth sub-strategies belong to the low-level strategies μ_ω defined above, and the relationship between the two strategies and the options is calibrated in advance, so the applicable stage can be determined directly once the option with the higher probability is selected. In addition, slow growth and fast growth are preset rates, with an evident fast-slow relationship between them. The specific division of project stages depends on the choice of a boundary value k. In this example, when the training projects are divided into 8 equal parts according to the financing period, the financing progress increases by more than 15% in the first 1/8 and the last 1/8 of the time, which are regarded as the initial and final stages, respectively; in the remaining intermediate parts the financing progress increases by less than 15%, and these are regarded as the intermediate stage. The growth of a project at the beginning and end is accordingly regarded as fast growth, and the growth in the middle as slow growth.
After an appropriate option is selected, the low-level strategy corresponding to the option outputs the estimated progress at the next time step. If the project to be predicted is at a past time step that has already been experienced, the true progress is used as part of the state representation; conversely, if a future progress change needs to be predicted, the estimate output at the previous time step is used as part of the state representation at the next time step (i.e. as part of the progress in the dynamic input of the next time step in FIG. 1). In this way, the change and trend of the crowd funding project over the coming days are predicted.
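The autoregressive rollout just described might look like the following sketch, reusing the encoder and actor modules sketched earlier. The convention that each day's dynamic input carries a progress slot in its last element, and that dynamic rows exist for future days (the day-count features are known in advance, unlike comments), are assumptions for illustration.

import torch

def rollout(encoder, actor, static_x, dyn_seq, true_progress, horizon):
    # dyn_seq: one row per day (observed + future), whose last element is the
    # progress slot; true values fill it for observed days, estimates afterwards.
    h = torch.tanh(encoder.init_h(static_x))        # static_x: (1, static_dim)
    preds, option = [], 0
    n_obs = len(true_progress)
    last_est = true_progress[-1]
    for t in range(n_obs + horizon):
        c_t = dyn_seq[t].clone()
        c_t[-1] = true_progress[t] if t < n_obs else last_est
        h = encoder.gru(c_t.unsqueeze(0), h)        # one GRU step, s_t = h_t
        est, option = actor.step(h.squeeze(0), option)
        last_est = est.squeeze()
        preds.append(last_est)
    return torch.stack(preds[n_obs:])               # estimates for the future days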
After the future trend of a crowd funding project is predicted within a certain error range, users of the crowd funding platform can better judge whether the project is worth investing in, and the project publisher can adjust the funding goal and funding period in time according to the future trend, making the most of time and resources. More importantly, the prediction result helps the project publisher decide whether to start production of the crowd funded product in advance and to control the production schedule; for public-welfare crowd funding, the prediction also helps judge whether a project will succeed and roughly how many days it will need, which can ultimately produce social impact. In addition, it can help a website or crowd funding platform recommend projects to suitable users in a personalized way.
According to the above scheme, a stage-differentiated progress prediction method is realized from the characteristic information of crowd funding projects using a hierarchical reinforcement learning framework and an improved objective function, and compared with the prior art the accuracy of the progress prediction results is greatly improved.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A stage-differentiated crowd funding progress prediction method, characterized by comprising the following steps:
crawling progress data of crowd funding projects, and extracting static features and dynamic features from the progress data;
regarding users of a crowd funding platform as an agent, regarding crowd funding projects as the environment, and training the agent by reinforcement learning in combination with the static features and dynamic features of the crowd funding projects; and
judging, through the trained agent, the stage in which the crowd funding project to be predicted is located, and predicting the future progress of the crowd funding project with the corresponding strategy.
2. The method of claim 1, wherein the crawling progress data of crowd funding projects and extracting static features and dynamic features of the progress data comprises:
the progress data of a crowd funding project includes: the title, labels, project category, financing goal, project introduction, daily comments, daily progress changes, number of financed days, and number of remaining financing days;
wherein the title, labels, project category, financing goal and project introduction are static data, and the daily comments, daily progress changes, number of financed days and number of remaining financing days are dynamic data;
the text information of the static data and dynamic data is converted into corresponding vectorized representations by an embedding model; static features are extracted from the vectorized representation of the static data; the daily comments and day information are extracted from the vectorized representation of the dynamic data as dynamic features; the daily progress changes are extracted as the progress sequence; and min-max normalization is applied to the extracted data.
3. The method of claim 1, wherein a crowd funding project i is represented as a triple (X_i, C_i, P_i), where X_i represents the static features, and C_i and P_i are dynamic sequences representing the project's dynamic feature sequence and progress sequence, respectively;
the crowd funding progress prediction means that, given the static features X_i of crowd funding project i, the dynamic feature sequence {c_i^t, t = 1, ..., T - τ} and the project progress sequence {p_i^t, t = 1, ..., T - τ} of the first T - τ days, the project progress sequence {p_i^t, t = T - τ + 1, ..., T} of the next τ days is predicted; the dynamic feature vector c_i^t of day t of project i is composed of the feature vector e_i^t of the comment information in the progress data of the crowd funded project and the vector d_i^t formed from the day information, with t = 1, ..., T - τ, where T is the total number of crowd funding days.
4. The stage-differentiated crowd funding progress prediction method according to claim 1, wherein the reinforcement learning quadruple is expressed as <S, A, P, R>, wherein:
S is the state space and comprises the state vector s of every time step; a GRU network is used to model state changes along the time dimension; after the GRU network is initialized with the static features, the input at each time step is the dynamic features, and the output hidden vector is shown to the agent as the state vector of the environment, i.e. s_t = h_t, where s denotes a state vector, h denotes the hidden vector output by the GRU network, and t denotes a time step, i.e. a certain day;
A is the action space, which comprises the action a of each time step; the action space is defined as a continuous space made up of all positive numbers, with a_t = p̂_{t+1}, where p̂_{t+1} denotes the estimated progress at the next time step and a denotes an action;
R is the return function, which is a monotonically decreasing, positive, continuously differentiable function, and its output after a running average is taken as the final return value r;
the state transition probability P is not explicitly defined.
5. The stage-differentiated crowd funding progress prediction method as claimed in claim 1 or 4, wherein an actor-critic reinforcement learning framework is adopted, i.e. the agent internally comprises two components, an actor and a critic; the actor is used to learn an action strategy μ, i.e. a mapping from the state space to the action space, whose output is the final action a = μ_θ(s), where θ represents the parameters of the strategy μ; the critic is used to evaluate the action taken by the actor and outputs a value function Q^μ(s, a), the value of taking action a in state s under strategy μ, while learning itself by temporal difference;
the learning objective of the actor is to maximize the sum of expected future returns, namely:
L_fu = E_{s_t ∼ ρ^μ} [ Σ_{τ=t}^{T} γ^{τ-t} r(s_τ, μ_θ(s_τ)) ]
where μ_θ denotes the strategy μ with θ as parameters, E denotes expectation, ρ^μ is the distribution obeyed by the states s_t under strategy μ, γ denotes the discount factor of the cumulative computation, r(s_τ, μ_θ(s_τ)) is the immediate return obtained by taking action μ_θ(s_τ) in state s_τ according to strategy μ_θ, and T denotes the total number of crowd funding days;
the actor update target applicable to a deterministic policy is then:
∇_θ L_fu = E_{s_t ∼ ρ^μ} [ ∇_θ μ_θ(s_t) · ∇_a Q^μ(s_t, a_t) |_{a_t = μ_θ(s_t)} ]
where ∇_θ L_fu denotes the partial differential vector of L_fu with respect to θ, ∇_θ μ_θ(s_t) denotes the partial differential vector of μ_θ(s_t) with respect to θ, and ∇_a Q^μ(s_t, a_t) |_{a_t = μ_θ(s_t)} denotes the partial differential vector of Q^μ(s_t, a_t) with respect to a at a_t = μ_θ(s_t);
the critic learns itself from the expected deviation between its evaluation Q(s_t, a_t) at the current time step and the value re-estimated after the current immediate return is obtained:
δ_t = r_t + γ Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t),
L_critic = E[ (δ_t)^2 ]
where r_t = r(s_t, a_t) denotes the immediate return at time step t, and δ_t is the deviation computed by the temporal difference method.
6. The stage-differentiated crowd funding progress prediction method of claim 5, further comprising improving the objective function of the actor:
at the current time step t, the deviation between the predicted progress p̂_τ and the true progress p_τ of all time steps before the current one is used as a loss function on past predictions:
L_pa = Σ_{τ=1}^{t-1} (p̂_τ - p_τ)^2
meanwhile, a penalty is imposed on the part of the predicted output at a larger time step that is smaller than the output at a smaller time step, with the loss function:
L_reg = Σ_{τ=1}^{t-1} 1(p̂_{τ+1} < p̂_τ) · (p̂_τ - p̂_{τ+1})
where 1(·) denotes the indicator function;
the objective function of the actor is denoted as the weighted sum of the three parts L_fu, L_pa and L_reg, namely:
L_actor = L_fu - λ_1 L_pa - λ_2 L_reg
where λ_1 and λ_2 denote the weights of L_pa and L_reg, respectively.
7. The stage-differentiated crowd funding progress prediction method of claim 6, further comprising improving the internal structure of the actor using an option mechanism:
ω denotes an option, ω = (I_ω, μ_ω, β_ω), where I_ω, μ_ω and β_ω respectively denote the initiation states, low-level policy and termination probability function of the option; the initiation states of all options are assumed to be the whole state space, i.e. I_ω = S; μ_ω denotes the strategy μ contained in ω, now regarded as a low-level strategy, and in addition π(ω | s_t) denotes the high-level strategy, i.e. the probability of each option output for an input state s_t; the termination function β_ω denotes a mapping contained in ω from the state space to the interval [0, 1]; when the state s_t is input at time step t, the option ω_{t-1} selected at the previous time step terminates with the probability output by the termination function β(s_t, ω_{t-1}); if it terminates, a new option ω_t is selected according to the probabilities output by π(ω | s_t), otherwise ω_{t-1} is still used; then an action, i.e. the predicted project progress, is output according to the low-level policy of ω_t, and the termination probability is determined at the beginning of the next time step;
when the state-option tuple (s, ω) is expressed as an extended state, Q(s, a), μ_ω(s) and β_ω(s) become, after extension, Q(s, ω, a), μ(s, ω) and β(s, ω), respectively;
updating the actor comprises updating the low-level strategy and updating the termination function; the update of the low-level policy is derived directly from the state extension, i.e. the update of the L_fu part:
∇_θ L_fu = E_{(s_t, ω_t) ∼ ρ′^μ} [ ∇_θ μ_θ(s_t, ω_t) · ∇_a Q(s_t, ω_t, a_t) ]
where ρ′^μ denotes the distribution obeyed by the extended states (s_t, ω_t) under strategy μ, ∇_θ μ_θ(s_t, ω_t) denotes the partial differential vector of μ_θ(s_t, ω_t) with respect to θ, and ∇_a Q(s_t, ω_t, a_t) denotes the partial differential vector of Q(s_t, ω_t, a_t) with respect to action a;
the update of the termination function is then given by:
∇ L_term = ∇ β(s_{t+1}, ω_t) · A(s_{t+1}, ω_t)
L_term = β(s_{t+1}, ω_t) A(s_{t+1}, ω_t)
where β(s_{t+1}, ω_t) denotes the probability of terminating at the next time step: if the option does not terminate, the option of the current time step is still selected and the estimate is Q(s_{t+1}, ω_t, a_{t+1}); if it terminates, the maximum over all option estimates at the next time step, max_ω Q(s_{t+1}, ω, a_{t+1}), is selected instead of the current option's;
the final objective function of the actor is denoted as the weighted sum of the four parts L_fu, L_pa, L_reg and L_term, namely:
L_actor′ = L_fu - λ_1 L_pa - λ_2 L_reg - λ_3 L_term
where λ_3 denotes the weight of L_term.
8. The method of claim 7, wherein, for the critic, since the current option terminates with a certain probability at the next time step, the critic computes the expected value function of the next time step according to whether the option terminates:
U(ω_t, s_{t+1}) = (1 - β(s_{t+1}, ω_t)) · Q(s_{t+1}, ω_t, a_{t+1}) + β(s_{t+1}, ω_t) · max_ω Q(s_{t+1}, ω, a_{t+1})
the critic still updates itself by temporal difference, which is expressed as:
δ′_t = r_t + γ U(ω_t, s_{t+1}) - Q(s_t, ω_t, a_t),
L_critic′ = E[ (δ′_t)^2 ]
CN202010107306.2A 2020-02-21 2020-02-21 Stage-distinguishing crowd funding progress prediction method Pending CN113298279A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107306.2A CN113298279A (en) 2020-02-21 2020-02-21 Stage-distinguishing crowd funding progress prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010107306.2A CN113298279A (en) 2020-02-21 2020-02-21 Stage-distinguishing crowd funding progress prediction method

Publications (1)

Publication Number Publication Date
CN113298279A true CN113298279A (en) 2021-08-24

Family

ID=77317424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010107306.2A Pending CN113298279A (en) 2020-02-21 2020-02-21 Stage-distinguishing crowd funding progress prediction method

Country Status (1)

Country Link
CN (1) CN113298279A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030074338A1 (en) * 2001-07-18 2003-04-17 Young Peter M. Control system and technique employing reinforcement learning having stability and learning phases
US20190065978A1 (en) * 2017-08-30 2019-02-28 Facebook, Inc. Determining intent based on user interaction data
CN108846747A (en) * 2018-05-24 2018-11-20 阿里巴巴集团控股有限公司 A kind of virtual resource based on block chain is delivered, crowd raises method and device
CN109191276A (en) * 2018-07-18 2019-01-11 北京邮电大学 A kind of P2P network loan institutional risk appraisal procedure based on intensified learning
CN110097225A (en) * 2019-05-05 2019-08-06 中国科学技术大学 Collaborative forecasting method based on sound state depth characterization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Jun Wang, Hefu Zhang, Qi Liu, Zhen Pan, Hanqing Tao: "Crowdfunding Dynamics Tracking: A Reinforcement Learning Approach", https://arxiv.org, pages 1-5 *
Xu Huanyun; Hu Xiaoyong: "Borrowing, Integration and Innovation: Multi-dimensional Directions for the Development of Educational Artificial Intelligence - Insights from AIED (2011-2018)", Open Education Research, no. 06 *

Similar Documents

Publication Publication Date Title
CN110490717B (en) Commodity recommendation method and system based on user session and graph convolution neural network
Vanegas et al. Inverse design of urban procedural models
Guo et al. A reinforcement learning decision model for online process parameters optimization from offline data in injection molding
US10824940B1 (en) Temporal ensemble of machine learning models trained during different time intervals
Behnamian et al. Development of a PSO–SA hybrid metaheuristic for a new comprehensive regression model to time-series forecasting
Rempe et al. Trace and pace: Controllable pedestrian animation via guided trajectory diffusion
CN110659411B (en) Personalized recommendation method based on neural attention self-encoder
CN112597392B (en) Recommendation system based on dynamic attention and hierarchical reinforcement learning
Zhang et al. Proximal policy optimization via enhanced exploration efficiency
CN109657800A (en) Intensified learning model optimization method and device based on parametric noise
CN113330462A (en) Neural network training using soft nearest neighbor loss
CN113449260A (en) Advertisement click rate prediction method, training method and device of model and storage medium
Lu et al. Double-track particle swarm optimizer for nonlinear constrained optimization problems
Zheng et al. Guided flows for generative modeling and decision making
CN113688306A (en) Recommendation strategy generation method and device based on reinforcement learning
CN113298279A (en) Stage-distinguishing crowd funding progress prediction method
Lei et al. A novel time-delay neural grey model and its applications
CN114648178B (en) Operation and maintenance strategy optimization method of electric energy metering device based on DDPG algorithm
CN115600009A (en) Deep reinforcement learning-based recommendation method considering future preference of user
CN113300884B (en) GWO-SVR-based step-by-step network flow prediction method
Hunter et al. Simulated nonlinear genetic and environmental dynamics of complex traits
CN110956528B (en) Recommendation method and system for e-commerce platform
CN113191527A (en) Prediction method and device for population prediction based on prediction model
Zand et al. Diffusion Models with Deterministic Normalizing Flow Priors
Hester Texplore: temporal difference reinforcement learning for robots and time-constrained domains

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination