CN111340669A - Crowd funding project initial stage financing performance prediction system - Google Patents

Crowd funding project initial stage financing performance prediction system

Publication number
CN111340669A
Authority
CN
China
Prior art keywords
item
financing
target
project
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010107299.6A
Other languages
Chinese (zh)
Inventor
陈恩红
刘淇
吴李康
李徵
张凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202010107299.6A priority Critical patent/CN111340669A/en
Publication of CN111340669A publication Critical patent/CN111340669A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a system for predicting the initial-stage financing performance of crowd funding projects. A graph neural network structure is used to model the competitive influence among projects and the evolution of the market environment, so that the model captures the environmental factors of the crowd funding market and thereby improves prediction accuracy. The system can also visually display the various information involved in the prediction process together with the final prediction result, which improves the user experience and makes it convenient for users to understand the relevant conditions of a crowd funding project.

Description

Crowd funding project initial stage financing performance prediction system
Technical Field
The invention relates to the fields of graph neural networks and online crowd funding, and in particular to a crowd funding project initial stage financing performance prediction system.
Background
The rise of online crowd funding in recent years has raised many valuable research problems, such as project success rate prediction, recommendation systems for crowd funding platforms, and dynamic tracking of crowd funding. Most existing research concerns the financing process after a project has started, yet in the crowd funding market the initial financing performance of a project is of great concern to both the initiator and the platform.
Evaluating the initial financing performance of a project before it starts can create considerable value. However, this prediction is more difficult and remains largely unexplored, because the market environment at the project's release time has a great impact on its initial investment.
At present, the crowd funding field has no dedicated equipment that can both predict this information accurately and visually display the various information and prediction results in the prediction process, so improvement is needed.
Disclosure of Invention
The invention aims to provide a crowd funding project initial stage financing performance prediction system which can visually display various information and prediction results in a prediction process.
The purpose of the invention is realized by the following technical scheme:
a crowd funding project initial stage financing performance prediction system comprises:
the static data preprocessing unit is used for processing the target project and the content information of other published projects before the target project pre-publishing time to obtain corresponding feature vectors;
the dynamic data acquisition unit is used for acquiring the financing time sequence of other published items before the pre-publishing time of the target item, and processing the financing time sequence through the embedding layer to obtain a corresponding time sequence vector;
the modeling and predicting unit is used for obtaining a competition pressure state vector suffered by the target project according to the feature vector of the target project, the feature vectors and the time sequence vectors of other published projects and combining the long-short term memory network and the attention network to model a project competition relationship; according to the feature vector of the target project, the feature vectors and the financing time sequence of other published projects, and in combination with the propagation tree structure modeling historical market environment, obtaining the environment state vector of the target project; predicting an initial financing result of the target project by using the competition pressure state vector of the target project and the environment state vector of the target project; the initial stage is within 24 hours;
the display unit is used for independently displaying the target item and the content information of other published items before the target item pre-publishing time, the processing result of the static data preprocessing unit, the financing time sequence acquired by the dynamic data acquisition unit and the initial financing result of the target item acquired by the modeling and prediction unit by dividing different display areas.
According to the technical scheme provided by the invention, a graph neural network structure is used to model the competitive influence among projects and the evolution of the market environment, so that the model captures the environmental factors of the crowd funding market and thereby improves prediction accuracy. The system can also visually display the various information involved in the prediction process together with the final prediction result, which improves the user experience and makes it convenient for users to understand the relevant conditions of a crowd funding project.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
Fig. 1 is a schematic diagram of a crowd funding project initial stage financing performance prediction system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a crowd funding project initial stage financing performance prediction system according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a propagation tree structure according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a crowd funding project initial stage financing performance prediction system, as shown in fig. 1, which mainly comprises:
the static data preprocessing unit is used for processing the target project and the content information of other published projects before the target project pre-publishing time to obtain corresponding feature vectors;
the dynamic data acquisition unit is used for acquiring the financing time sequence of other published items before the pre-publishing time of the target item, and processing the financing time sequence through the embedding layer to obtain a corresponding time sequence vector;
the modeling and predicting unit is used for obtaining a competition pressure state vector suffered by the target project according to the feature vector of the target project, the feature vectors and the time sequence vectors of other published projects and combining the long-short term memory network and the attention network to model a project competition relationship; according to the feature vector of the target project, the feature vectors and the financing time sequence of other published projects, and in combination with the propagation tree structure modeling historical market environment, obtaining the environment state vector of the target project; predicting an initial financing result of the target project by using the competition pressure state vector of the target project and the environment state vector of the target project;
the display unit is used for independently displaying the target item and the content information of other published items before the target item pre-publishing time, the processing result of the static data preprocessing unit, the financing time sequence acquired by the dynamic data acquisition unit and the initial financing result of the target item acquired by the modeling and prediction unit by dividing different display areas.
The system can be implemented by matching with related hardware, for example, the display unit can be implemented by matching with a display screen. The static data preprocessing unit, the dynamic data acquiring unit and the modeling and predicting unit can be implemented by cooperating with the processor, and at the same time, the static data preprocessing unit, the dynamic data acquiring unit and the modeling and predicting unit also include some necessary hardware devices, such as a storage device (providing a system operating space and a data space), a communication device (enabling the system to interact with the outside to acquire related information), and the like.
For ease of understanding, the following detailed description is directed to the above-described system.
In the embodiment of the invention, the initial financing performance that the system predicts mainly refers to the financing performance within 24 hours after the project is released. The financing amount itself cannot be used directly as the prediction target, because the same amount means different things for projects with different financing targets. The percentage of the financing amount relative to the target is therefore used as the prediction target and, to reduce the gap between the smallest and largest values, the invention applies the $\log_2(\cdot)$ function to this percentage to facilitate model prediction.
[Equation image not recovered: the prediction target obtained by applying $\log_2(\cdot)$ to the percentage $\alpha_i / g_i$.]
In the above formula, $\alpha_i$ denotes the amount financed by project i within the first 24 hours and $g_i$ denotes the financing target of the pre-release project i, so $\alpha_i / g_i$ is the percentage of the project's initial financing amount relative to its target.
Fig. 2 is a schematic diagram of the above system provided by the present invention.
First, the static data preprocessing unit.
In the embodiment of the invention, the main information of the data used by the crowd funding platform comprises: a project description, a project category, an initiator type, a current exchange rate, a target financing period, and a target financing amount.
Because the content information needs to be converted into vector form, the numerical fields in the content information are discretized and one-hot encoded; the text fields are processed with the document-to-vector (doc2vec) method from natural language processing to obtain corresponding vectors; and the vectors for the individual fields are concatenated to obtain the corresponding feature vector (the static content feature vector).
Preferably, before the doc2vec method is applied, the text data is first segmented into words, then all punctuation is deleted, all words are converted to lower case, and only words that occur more than 5 times are retained.
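The text cleaning and numeric discretization described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the regular-expression tokenizer (which only suits space-delimited text, whereas Chinese text would need a real word segmenter) and the bucket boundaries are all assumptions, and the doc2vec step itself is omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents, min_count=5):
    """Keep only words that occur more than `min_count` times in the corpus."""
    counts = Counter(tok for doc in documents for tok in tokenize(doc))
    return {word for word, n in counts.items() if n > min_count}


def clean_document(text, vocabulary):
    """Return the tokens of one description that survive the frequency filter."""
    return [tok for tok in tokenize(text) if tok in vocabulary]


def one_hot_bucket(value, boundaries):
    """Discretize a numeric field into len(boundaries) + 1 buckets, one-hot encoded.

    `boundaries` is a hypothetical ascending list of bucket edges.
    """
    index = sum(value >= b for b in boundaries)
    vector = [0] * (len(boundaries) + 1)
    vector[index] = 1
    return vector
```

The cleaned token lists would then be passed to a doc2vec model, and the resulting text vector concatenated with the one-hot vectors to form the static content feature vector.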
In this manner, a feature vector can be obtained for every project. In the embodiment of the invention the target project is denoted g and, because the invention involves model training, a set of target projects $G$ is also constructed. During training, all outcomes of the target projects are known. During testing, the content information of a target project is known but, since the project has not yet been published, its financing outcomes are unknown and must be predicted by the modeling and prediction unit described later. The set of other projects published before the pre-release time of the target project is denoted $\Psi$; project i and project j referred to later are published projects. The feature vectors of these projects are denoted $x_g$, $x_i$ and $x_j$ respectively.
Second, the dynamic data acquisition unit.
For a given target project g with pre-release time $T_g$, the corresponding environmental factors, i.e. contextual features, are the crowd's funding sequences for the other published projects on the market before time $T_g$.
For project i, the financing time sequence is:
$$S_i = \{(v_1, t_1), (v_2, t_2), \ldots, (v_{|S_i|}, t_{|S_i|})\}$$
In the above formula, $v$ denotes the investment amount, $t$ denotes the timestamp of the investment, the subscript is the index of the investment, and $|S_i|$ denotes the total number of investments.
The financing time sequence $S_i$ is processed by an embedding layer to obtain the corresponding time series vector $TS_i = [\xi_0, \xi_1, \ldots, \xi_{23}]$, which represents the time series of project i over the past 24 hours:
$$\xi_k = \log_2\Big(\sum v_l\Big)$$
In the above formula, the sum runs over $v_l \in S_i$ with $T_g - (k+1)\Delta \le t_l < T_g - k\Delta$, for $k = 0, 1, \ldots, 23$, where $\Delta$ denotes a time interval of 1 hour. This formula gives the amount financed by project i in each hour of the last 24 hours.
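The hourly bucketing above can be sketched in a few lines. The function name, the use of Unix-second timestamps and the treatment of empty hours are assumptions made for illustration; in particular the source does not say how an hour with no investment is handled, so this sketch maps it to 0.0 because $\log_2(0)$ is undefined.

```python
import math

HOUR = 3600  # Δ: one hour, assuming timestamps are Unix seconds


def hourly_series(investments, t_g, hours=24, delta=HOUR):
    """Return [xi_0, ..., xi_23] for one published project.

    investments: iterable of (amount, timestamp) pairs, i.e. the sequence S_i.
    Bucket k collects investments with t_g - (k+1)*delta <= t < t_g - k*delta.
    """
    totals = [0.0] * hours
    for amount, timestamp in investments:
        age = t_g - timestamp
        if 0 < age <= hours * delta:
            k = math.ceil(age / delta) - 1
            totals[k] += amount
    # Assumption: an hour with no investment maps to 0.0 instead of log2(0).
    return [math.log2(total) if total > 0 else 0.0 for total in totals]
```

Bucket 0 is the hour immediately before $T_g$, so the vector runs from the most recent hour backwards.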
Third, the modeling and prediction unit.
1. Project competition modeling Part (PCM).
Once a project is released, it faces competition from the market. To model the competitiveness between projects, for a project g to be released with pre-release time $T_g$, connecting edges are established between project g and the other projects running at time $T_g$. Projects with different content and different degrees of competitiveness are considered to affect the target project differently, and a graph attention network (GAT) is used to aggregate the competitiveness information of the other projects to express the competition pressure on the target project. The competitive strength of the other projects over a future period is predicted by modeling their historical time series with a long short-term memory network (LSTM). The specific implementation process is as follows:
First, to quantify the competitiveness of each competitor over a future period, a long short-term memory network (LSTM) predicts its initial financing state (i.e., within 24 hours) from the time series vector of each published project, and this state expresses its competitiveness:
$$c_i = \mathrm{LSTM}(TS_i), \quad i \in \Psi$$
In the above formula, $c_i$ denotes the predicted initial financing state of published project i, $TS_i$ denotes the time series vector of published project i, and $\Psi$ denotes the set of published projects running on the market at time $T_g$.
To limit the computational load on the platform, multiple target projects are trained in the model at the same time. To this end, the invention divides a day into 6 stages according to typical human work and rest schedules, namely "8:00-12:00", "12:00-14:00", "14:00-17:00", "17:00-20:00", "20:00-24:00" and "0:00-8:00", and then defines the target set
[Equation image not recovered: the target set.]
as containing the unpublished target projects that fall in the same time period of the same day. Meanwhile, to prevent the information leakage that is common in time series tasks, when the set $\Psi$ is obtained the pre-release times of the projects in the defined set are unified into
[Equation images not recovered: the unified pre-release time.]
where $T_i$ is the release time of project i. Time-series modeling with the LSTM is time-consuming when $\Psi$ is large. To address this, a pruning method selects from the set $\Psi$ the published projects that are most likely to compete with the target project in its early stage, that is, the projects that at time $T_g$ appear on the crowd funding platform in the just-funded section (containing projects newly created within the last three days) or in the category section matching the target project (containing projects of the same category). The selection is represented by an adjacency matrix:
[Equation image not recovered: definition of the adjacency matrix entries in terms of $C_i$, $C_j$, $T_i$ and $T_j$.]
In the above formula, one entry value indicates that project i and project j are connected by an edge and the other indicates that they are not; a mapping function maps each id in the set $\Psi$ of published projects to a column of the adjacency matrix; $C_i$ and $C_j$ denote the categories to which project i and project j belong; and $T_i$ and $T_j$ denote the pre-release times of project i and project j.
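As a rough illustration of the pruning rule described in prose (same category as the target, or newly created within the last three days), the sketch below filters a list of published projects. The dictionary keys, the function name and the hour-based time unit are hypothetical, and the exact adjacency-matrix definition of the source is not reproduced here.

```python
def prune_competitors(target, published, recent_days=3, day=24):
    """Select ids of published projects likely to compete with the target.

    Each project is a dict with hypothetical keys "id", "category" and
    "release_time" (hours). A project is kept if it was released before the
    target and is either in the target's category or was created within the
    last `recent_days` days before the target's pre-release time.
    """
    kept = []
    for project in published:
        age = target["release_time"] - project["release_time"]
        if age <= 0:
            continue  # not yet published at the target's pre-release time
        same_category = project["category"] == target["category"]
        recently_created = age <= recent_days * day
        if same_category or recently_created:
            kept.append(project["id"])
    return kept
```

Only the kept projects would then need an LSTM pass, which is where the saving in computation comes from.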
The pruning method reduces the number of time series to be modeled, reduces the amount of computation and reduces the noise in information aggregation. Because projects that are strongly competitive or whose content is similar to the target project have a large influence on it, a graph attention network is used to aggregate neighbor information for the target project g:
$$e_{gi} = V^{T}\,[W x_g \,\|\, W x_i]$$
$$\alpha_{gi} = \mathrm{softmax}_i(e_{gi}) = \frac{\exp(e_{gi})}{\sum_{k \in \mathcal{N}_g} \exp(e_{gk})}$$
In the above formulas, $x_g$ and $x_i$ denote the feature vectors of the target project g and project i respectively; $V$ and $W$ denote the mapping parameter matrices used in the attention mechanism, whose parameters are learned and optimized during model training; $\alpha_{gi}$ denotes the attention weight; $T$ denotes matrix transposition; $\|$ denotes vector concatenation; and $\mathcal{N}_g$ denotes the set of neighbor nodes of the target project, composed of published projects.
Finally, the competition pressure state vector of the target project is obtained:
[Equation image not recovered: the competition pressure state vector, aggregated over the neighbor nodes from the attention weights $\alpha_{gi}$, the mapping matrix $W_h$ and the predicted financing states.]
In the above process, $\alpha_{gi}$ is calculated from the static content feature vectors and $W_h$ denotes a mapping parameter matrix optimized by learning during training. Because the aggregation uses both the attention weights $\alpha_{gi}$ and the predicted financing states, the invention considers the financing capacity and the content of the projects at the same time.
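The attention-based aggregation can be sketched in plain Python as below. Two things are assumptions rather than facts from the source: the attention scores are normalized with a softmax over the neighbors (the usual graph attention choice), and the competition pressure vector is taken as the plain attention-weighted sum of the mapped financing states with no further activation. All function names are illustrative.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def attention_weights(x_g, neighbours, W, V):
    """Score each neighbour with V^T [W x_g || W x_i], then apply a softmax."""
    mapped_target = matvec(W, x_g)
    scores = [
        sum(v * z for v, z in zip(V, mapped_target + matvec(W, x_i)))
        for x_i in neighbours
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def competition_pressure(alphas, states, W_h):
    """Assumed aggregation: sum_i alpha_gi * (W_h @ state_i), no activation."""
    mapped = [matvec(W_h, state) for state in states]
    return [sum(a * m[d] for a, m in zip(alphas, mapped)) for d in range(len(W_h))]
```

Here `states` stands for the LSTM-predicted financing states of the neighbors, so content similarity enters through the weights and financing capacity through the states.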
2. Market environmental evolution modeling section (MET).
In fact, the market environment is the contextual environment of the project, so it is necessary to refer to the initial financing conditions of other projects in the historical market environment of the target project and to identify how the financing states of projects change as the market evolves. Because a market can release hundreds or thousands of projects within a few days, traditional chain-structured models for time series modeling are not suitable for this scenario: their effectiveness drops significantly as the time series grows. Meanwhile, if the financing states of other projects in the historical data were aggregated directly onto the target project, projects from different points in time would be placed on the same level, which is unreasonable in time series modeling. The invention therefore constructs a graph neural network that passes information over a propagation tree structure to model the whole historical market environment.
When modeling the historical market environment, the published projects are defined as nodes of a propagation tree, and the state of a published project is defined as:
$$h_j = [x_j \,\|\, r_j]$$
In the above formula, $x_j$ denotes the feature vector of project j and $r_j$ denotes the initial financing amount of project j (within its first 24 hours):
[Equation image not recovered: $r_j$, computed from the investments $(v_l, t_l) \in S_j$ made within $n_h \Delta$ of $T_j$.]
In the above formula, $T_j$ denotes the pre-release time of project j; $S_j$ denotes the financing time sequence of project j; $v_l$ denotes the amount of the l-th investment; $t_l$ denotes the timestamp of the l-th investment; and $n_h$ denotes the 24 hours of a day. The constraint $T_i - T_j > n_h \Delta$ applies, where $T_i$ denotes the pre-release time of project i and $\Delta$ denotes a time interval of 1 hour. Under this constraint project i can observe the initial financing state of project j: project j was released more than 1 day before project i, so its initial financing state is observable at that time, and j is defined as an observable node of i. If the number of historical days is $t_h$, the set of observable nodes is $\Phi_i = \{\, j \mid n_h \Delta < T_i - T_j < n_h t_h \Delta \,\}$.
The propagation tree is built as shown in part (a) of Fig. 3, which includes three nodes and their respective observable nodes. Three connecting edges exist, <a, g>, <b, g> and <b, a>, and each connecting edge spans more than 24 hours. If <b, g> is deleted, the depths of nodes a and b in the tree rooted at g are 1 and 2 respectively. The depth represents how far each node is from the pre-release time point $T_g$ of the target project g, and passing information from node b to a to g resembles passing information across time steps in an LSTM network.
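The observable-node set can be computed directly from the release times. The sketch below assumes timestamps in Unix seconds and a hypothetical default of 7 historical days; neither value is given by the source.

```python
HOUR = 3600  # Δ: one hour, assuming timestamps are Unix seconds


def observable_nodes(i, release_times, n_h=24, t_h=7, delta=HOUR):
    """Return the projects j with n_h*delta < T_i - T_j < n_h*t_h*delta.

    release_times maps a project id to its (pre-)release time. The default
    t_h of 7 historical days is an illustrative assumption.
    """
    t_i = release_times[i]
    return {
        j
        for j, t_j in release_times.items()
        if j != i and n_h * delta < t_i - t_j < n_h * t_h * delta
    }
```

A project released less than a day earlier is excluded because its first 24 hours of financing are not yet complete when project i is released.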
A more complex case is shown in part (b) of Fig. 3, and the above method is likewise applied to model the tree structure. In addition, the market environment is the contextual environment in the target project prediction task, and sampling at equal intervals can effectively prevent the model's performance from decaying when time series information is propagated over long periods. The invention therefore constructs the propagation tree so that the time intervals between the layers of each subtree are as close as possible, that is, the propagation path from each leaf node to the root node approximates a time series sampled at equal intervals. When the propagation tree structure is built, the projects newly released on each of the $t_h$ days are placed in the same layer of the tree. The financing states of all projects released on the day closest to the expected release time serve as nodes in the first layer below the root node of the propagation tree and are connected to the root node. The financing states of all projects released on the next closest day form the second layer, where each node is connected to the node closest to it in the first layer. The finally generated propagation tree structure is represented by an adjacency matrix $\Gamma$.
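The layer-by-layer construction can be sketched as follows. The function name, the hour-based time unit and the default number of historical days are assumptions. The source also does not say what happens when a day has no released project, so this sketch attaches the next layer to the closest non-empty shallower layer.

```python
def build_propagation_tree(root, t_root, release_times, t_h=7, day=24):
    """Return a {node: parent} map for the propagation tree rooted at `root`.

    Times are in hours in this sketch. A project released d whole days before
    the root (more than one day, fewer than t_h days) goes to layer d; each
    node is linked to the node nearest in time in the previous non-empty layer.
    """
    times = dict(release_times)
    times[root] = t_root
    layers = {}
    for node, released in release_times.items():
        age = t_root - released
        if day < age < t_h * day:  # observable-node constraint
            layers.setdefault(int(age // day), []).append(node)
    parent = {}
    previous = [root]
    for depth in sorted(layers):
        for node in layers[depth]:
            parent[node] = min(previous, key=lambda p: abs(times[p] - times[node]))
        previous = layers[depth]
    return parent
```

The adjacency matrix $\Gamma$ follows from this map by setting the entry for each (child, parent) pair.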
To prevent the attenuation of information propagated over greater depths, the invention uses a gated recurrent unit (GRU).
Before information propagation, the states of all nodes in the propagation tree are initialized:
[Equation images not recovered: the initial states of the target project g and of a published project i, built from $x_g$, $x_i$ and $r_i$.]
In the above formulas, $x_g$ and $x_i$ denote the feature vectors of the target project g and project i respectively, and $r_i$ denotes the initial financing amount of project i. Because nodes are not treated differently during information propagation, $v$ is used to refer to any node;
In each subsequent propagation step, information is aggregated between nodes. Each node aggregates information as follows:
$$a_v^{(t)} = \Gamma_{v:}\,\big[h_1^{(t-1)T}, \ldots, h_{|G \cup \Phi|}^{(t-1)T}\big]^{T} + b$$
where $\Gamma_{v:}$ denotes the row of the adjacency matrix $\Gamma$ that identifies the neighbor nodes of node v; $|G \cup \Phi|$ is the size of the set of state vectors of all nodes in the propagation tree, with $G$ the set of nodes to be predicted and $\Phi$ the set of all observable nodes; $h^{(t-1)T}$ denotes the transposed hidden state of a node at step (t-1), with the subscript giving the node index; and $b$ is a bias vector. The state of each node is then updated with a gated recurrent unit (GRU):
$$z_v^{(t)} = \sigma\big(W_z a_v^{(t)} + U_z h_v^{(t-1)}\big)$$
$$r_v^{(t)} = \sigma\big(W_r a_v^{(t)} + U_r h_v^{(t-1)}\big)$$
$$\tilde{h}_v^{(t)} = \tanh\big(W_1 a_v^{(t)} + U_1 (r_v^{(t)} \odot h_v^{(t-1)})\big)$$
$$h_v^{(t)} = (1 - z_v^{(t)}) \odot h_v^{(t-1)} + z_v^{(t)} \odot \tilde{h}_v^{(t)}$$
In the above formulas, $W_z$ and $U_z$ denote the trainable parameter matrices of the update gate, $W_r$ and $U_r$ denote the trainable parameter matrices of the reset gate, $W_1$ and $U_1$ denote the corresponding parameter matrices of the output layer, $\sigma$ is the sigmoid function and $\odot$ denotes element-wise multiplication.
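One propagation step can be sketched in plain Python as below. It assumes the standard GRU gating used in gated graph neural networks (update gate, reset gate, tanh candidate), which is an assumption because the source equation images are not available; the function names are illustrative and only the parameter names follow the text.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def aggregate(gamma_row, states, b):
    """a_v = sum_u Gamma[v][u] * h_u + b, for one node v."""
    return [
        sum(g * state[d] for g, state in zip(gamma_row, states)) + b[d]
        for d in range(len(b))
    ]


def gru_update(a, h, Wz, Uz, Wr, Ur, W1, U1):
    """Standard GRU update of hidden state h given the aggregated message a."""
    z = [sigmoid(p + q) for p, q in zip(matvec(Wz, a), matvec(Uz, h))]
    r = [sigmoid(p + q) for p, q in zip(matvec(Wr, a), matvec(Ur, h))]
    gated = [ri * hi for ri, hi in zip(r, h)]
    candidate = [math.tanh(p + q) for p, q in zip(matvec(W1, a), matvec(U1, gated))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]
```

Repeating `aggregate` followed by `gru_update` for every node, once per propagation step, moves information from the leaves of the tree towards the root.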
After $t = t_h$ propagation steps, the final state of the target project g is obtained:
[Equation image not recovered: the final state of the target project g after $t_h$ propagation steps.]
This final state is the environment state vector in which the target project is located.
3. Prediction part.
The competition pressure state vector of the target project and the environment state vector of the target project are combined and passed through a fully connected layer, whose activation function is the ReLU function, to predict the initial financing result of the target project:
[Equation image not recovered: the fully connected prediction layer applied to the combined competition pressure and environment state vectors.]
4. Joint training part.
In this embodiment, the parameters in the modeling and prediction unit are trained jointly, taking two losses into account:
The first loss, denoted $Loss_p$, is the mean absolute error (MAE) of the prediction:
[Equation image not recovered: $Loss_p$, the MAE between the predicted and real initial financing results of the target projects.]
In the above formula, $y_g$ is the real initial financing result of the target project g;
The second loss, denoted $Loss_l$, is the loss of the long short-term memory network in the calculation of the competition pressure state vector, that is, the MAE loss of the LSTM output that predicts competitor competitiveness in the competition module PCM. Its formula has the same form as $Loss_p$: the initial financing state of each published project i calculated by the long short-term memory network is mapped to a one-dimensional value $y'_i$, giving:
$$Loss_l = \frac{1}{|\Psi|} \sum_{i \in \Psi} |y_i - y'_i|$$
In the above formula, $y_i$ is the real initial financing result of published project i.
$Loss_p$ and $Loss_l$ are trained jointly with corresponding weight coefficients, and the loss function for training is:
[Equation image not recovered: the joint loss over $\Theta$, combining $Loss_p$ and $Loss_l$ with the weight coefficient $\eta$.]
where $\Theta$ denotes the set of model parameters to be trained and $\eta$ denotes a weight coefficient. The model parameters are updated with the stochastic gradient descent (SGD) algorithm, and the initial learning rate is set to 0.02.
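A minimal sketch of the joint objective and one SGD step is given below. How the two losses are weighted is not recoverable from the text, so the form `loss_p + eta * loss_l` and the default `eta` are assumptions for illustration only; the learning rate of 0.02 is the value stated above.

```python
def mean_absolute_error(targets, predictions):
    """MAE between real and predicted initial financing results."""
    return sum(abs(y - p) for y, p in zip(targets, predictions)) / len(targets)


def joint_loss(y_target, pred_target, y_published, pred_published, eta=0.5):
    """Assumed combination: Loss_p + eta * Loss_l (the weighting form is a guess)."""
    loss_p = mean_absolute_error(y_target, pred_target)
    loss_l = mean_absolute_error(y_published, pred_published)
    return loss_p + eta * loss_l


def sgd_step(parameters, gradients, learning_rate=0.02):
    """One plain stochastic gradient descent update."""
    return [p - learning_rate * g for p, g in zip(parameters, gradients)]
```

In practice the gradients would come from an automatic differentiation framework rather than being supplied by hand.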
According to the scheme of the embodiment of the invention, multiple kinds of metadata are fused and the modeling focuses on the market environment, so as to evaluate the initial-stage financing performance of a crowd funding project that has not yet started. This makes it possible to judge whether the pre-release time of the project is appropriate and helps the project achieve a better start. Meanwhile, the system is built on related hardware equipment to form a complete product, through which the various information in the prediction process and the final prediction result are visually displayed to the user, which improves the user experience and makes it convenient for the user to understand the relevant conditions of crowd funding projects.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, and can also be implemented by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A crowd funding project initial stage financing performance prediction system, characterized by comprising:
a static data preprocessing unit, used for processing the content information of the target project and of other projects published before the pre-publishing time of the target project to obtain corresponding feature vectors;
a dynamic data acquisition unit, used for acquiring the financing time sequences of other projects published before the pre-publishing time of the target project, and processing the financing time sequences through an embedding layer to obtain corresponding time series vectors;
a modeling and prediction unit, used for: modeling the project competition relationship by combining a long short-term memory network and an attention network, according to the feature vector of the target project and the feature vectors and time series vectors of the other published projects, to obtain a competition pressure state vector of the target project; modeling the historical market environment with a propagation tree structure, according to the feature vector of the target project and the feature vectors and financing time sequences of the other published projects, to obtain an environment state vector of the target project; and predicting the initial financing result of the target project by using the competition pressure state vector and the environment state vector of the target project; wherein the initial stage is within 24 hours;
a display unit, used for displaying, in separate display areas, the content information of the target project and of the other projects published before the pre-publishing time of the target project, the processing result of the static data preprocessing unit, the financing time sequences acquired by the dynamic data acquisition unit, and the initial financing result of the target project obtained by the modeling and prediction unit.
2. The crowd funding project initial stage financing performance prediction system as claimed in claim 1, wherein processing the content information of the target project and of other projects published before the pre-publishing time of the target project to obtain the corresponding feature vectors comprises:
discretizing the numerical types in the content information to obtain one-hot coded vectors; processing the text types by using a text-to-vector method from natural language processing to obtain corresponding vectors; and splicing the vectors corresponding to each type to obtain the corresponding feature vector;
the content information includes: a project description, a project category, an initiator type, a current exchange rate, a target financing period, and a target financing amount.
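The preprocessing described in claim 2 can be sketched as below. This is only an illustration under stated assumptions: the bucket boundaries, the helper names `one_hot_bucket` and `build_feature_vector`, and the choice of two numeric fields are hypothetical, and the text vector is assumed to be produced elsewhere by whatever text-to-vector method is used.

```python
def one_hot_bucket(value, boundaries):
    """Discretize a numeric value into len(boundaries) + 1 buckets, one-hot coded."""
    index = sum(1 for b in boundaries if value >= b)
    vec = [0.0] * (len(boundaries) + 1)
    vec[index] = 1.0
    return vec


def build_feature_vector(target_amount, duration_days, text_vector):
    """Splice the one-hot vectors of numeric fields with a precomputed text vector."""
    # Hypothetical bucket boundaries; the source does not specify them.
    amount_vec = one_hot_bucket(target_amount, [1_000, 10_000, 100_000])
    duration_vec = one_hot_bucket(duration_days, [15, 30, 60])
    return amount_vec + duration_vec + list(text_vector)


features = build_feature_vector(5000, 30, [0.1, -0.2])
```

Categorical fields such as the project category or initiator type could be one-hot coded directly and appended in the same way.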
3. The crowd funding project initial financing performance prediction system as claimed in claim 1, wherein the financing time sequence of project i issued before the target project pre-issuance time is:
Figure FDA0002388800550000011
in the above formula, $v$ represents the investment amount, $t$ represents the timestamp of the investment, the subscript is the index of the investment, and $|S_i|$ represents the total number of investments;
the financing time sequence $S_i$ is processed by the embedding layer to obtain the corresponding time series vector $TS_i$, where $TS_i = [\xi_0, \xi_1, \ldots, \xi_{23}]$ represents the time series of item i over the past 24 hours;
$\xi_k = \log_2\left(\sum v_l\right)$
in the above formula, $v_l \in S_i$, $T_g - (k+1)\Delta \le t_l < T_g - k\Delta$, $k = 0, 1, \ldots, 23$, and $\Delta$ denotes a time interval of 1 hour.
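The bucketing in claim 3 can be sketched as follows. The function name `time_series_vector`, the use of seconds for timestamps, and the handling of empty hours are assumptions: the formula does not say what happens when no investment falls into an hour, so this sketch maps an empty bucket to 0.0 because $\log_2 0$ is undefined.

```python
import math


def time_series_vector(investments, t_g, delta=3600.0, n_buckets=24):
    """investments: list of (amount, timestamp) pairs; t_g: release time of the target item."""
    ts = []
    for k in range(n_buckets):
        lo = t_g - (k + 1) * delta
        hi = t_g - k * delta
        # Sum the amounts v_l whose timestamps t_l satisfy lo <= t_l < hi.
        total = sum(v for v, t in investments if lo <= t < hi)
        # Assumption: an empty bucket maps to 0.0 instead of log2(0).
        ts.append(math.log2(total) if total > 0 else 0.0)
    return ts


ts = time_series_vector([(8.0, 99000.0), (8.0, 99500.0), (4.0, 95000.0)], t_g=100000.0)
```

Here index 0 covers the hour immediately before $T_g$, and index 23 covers the hour that ends 23 hours before $T_g$.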
4. The crowd funding project initial stage financing performance prediction system as claimed in claim 1, wherein modeling the project competition relationship by combining the long short-term memory network and the attention network, according to the feature vector of the target project and the feature vectors and time series vectors of other published projects, to obtain the competition pressure state vector of the target project comprises:
predicting the initial financing state based on the time sequence vector of the published item by using a long-short term memory network:
Figure FDA0002388800550000021
in the above formula, $TS_i$ represents the time series vector of published item i, $\Psi$ represents the set of published items that are running on the market at time $T_g$, and $T_g$ is the release time of the target item g;
using a pruning method, the published items in the set $\Psi$ that are most likely to compete with the target item at its initial stage are selected, i.e., the items that at time $T_g$ are in the newly funded section and in the same category section as the target item; the result is represented by an adjacency matrix:
Figure FDA0002388800550000022
in the above formula,
Figure FDA0002388800550000023
indicates that item i and item j have a connecting edge,
Figure FDA0002388800550000024
indicates that item i and item j do not have a connecting edge,
Figure FDA0002388800550000025
maps the id of an item in the set $\Psi$ of published items to a column of the adjacency matrix, $C_i$ and $C_j$ indicate the categories to which item i and item j belong, and $T_i$ and $T_j$ represent the pre-release times of item i and item j;
performing neighbor information aggregation on the target item g by using the graph attention network:
$e_{gi} = V^{T}[W x_g \,\|\, W x_i]$
Figure FDA0002388800550000026
in the above formula, $x_g$ and $x_i$ represent the feature vectors of the target item g and item i, respectively; V and W represent the mapping parameter matrices used in the attention mechanism, whose specific parameters are learned and optimized during model training; $\alpha_{gi}$ represents the attention weight; T is the matrix transpose symbol;
Figure FDA0002388800550000027
represents the set of neighbor nodes of the target item, composed of published items;
and finally, obtaining a competition pressure state vector of the target project:
Figure FDA0002388800550000031
in the above formula, $W_h$ represents a mapping parameter matrix learned and optimized during training.
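The attention score $e_{gi} = V^{T}[W x_g \| W x_i]$ from claim 4 can be sketched as below. The normalization formula for $\alpha_{gi}$ is given only as an image in this text, so the softmax over neighbors used here is an assumption borrowed from the usual graph attention formulation; the helper names `matvec` and `attention_weights` are hypothetical, and V is treated as a vector for simplicity.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_weights(x_g, neighbors, W, V):
    """Return one attention weight per neighbor feature vector."""
    wx_g = matvec(W, x_g)
    scores = []
    for x_i in neighbors:
        concat = wx_g + matvec(W, x_i)  # [W x_g || W x_i]
        scores.append(sum(v * c for v, c in zip(V, concat)))  # V^T [...]
    # Assumption: softmax normalization over the neighbor set.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


W = [[1.0, 0.0], [0.0, 1.0]]
V = [0.0, 0.0, 1.0, 0.0]
alpha = attention_weights([1.0, 1.0], [[0.0, 0.0], [math.log(3.0), 0.0]], W, V)
```

With these toy parameters the two scores are 0 and ln 3, so the weights come out as roughly 0.25 and 0.75.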
5. The crowd funding project initial financing performance prediction system according to claim 1, wherein the obtaining of the environmental state vector of the target project according to the feature vector of the target project, the feature vector of other published projects and the financing time sequence in combination with the propagation tree structure modeling of the historical market environment comprises:
when modeling a historical market environment, defining the published items as nodes of a propagation tree, and defining the state of the published items:
$h_j = [x_j \,\|\, r_j]$
in the above formula, $x_j$ represents the feature vector of item j, and $r_j$ represents the initial financing amount of item j:
Figure FDA0002388800550000032
in the above formula, $T_j$ represents the pre-release time of item j; $S_j$ represents the financing time sequence of item j; $v_l$ represents the amount of the l-th investment; $t_l$ represents the timestamp of the l-th investment; $n_h$ represents the 24 hours of a day; and there is a constraint: $T_i - T_j > n_h \Delta$, where $T_i$ represents the pre-release time of item i and $\Delta$ represents a time interval of 1 hour; that is, when item i is released, item j has already been released for more than 1 day, so that item i can observe the initial financing state of item j, and j is defined as an observable node of i;
if the number of historical days is $t_h$, then the set of observable nodes is: $\Phi_i = \{ j \mid n_h \Delta < T_i - T_j < n_h t_h \Delta \}$;
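The observable-node condition above can be sketched as a simple filter. The function name `observable_nodes`, the dictionary input, and the use of seconds for timestamps are assumptions made for illustration.

```python
def observable_nodes(t_i, release_times, t_h, n_h=24, delta=3600.0):
    """release_times: dict mapping item id to its pre-release time in seconds."""
    lower = n_h * delta        # one day
    upper = n_h * t_h * delta  # t_h days
    # Both inequalities are strict, as in the definition of the observable set.
    return {j for j, t_j in release_times.items() if lower < t_i - t_j < upper}


day = 86400.0
release_times = {"a": 9.5 * day, "b": 8 * day, "c": 3 * day, "d": 4 * day}
phi = observable_nodes(10 * day, release_times, t_h=7)
```

In this toy data, item "a" was released only half a day before item i and item "c" exactly seven days before, so neither is observable.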
building the propagation tree structure: within the $t_h$ days, the items newly released on the same day are arranged in the same layer of the tree; the financing states of all items released on the day closest to the predicted release time are used as tree nodes, arranged in the first layer below the root node of the propagation tree, and connected to the root node; the financing states of all items released on the next closest day form the second layer, and each of these nodes is connected to the node closest to it in the first layer; the finally generated propagation tree structure is represented by an adjacency matrix $\Gamma$;
before information propagation, the state of all nodes in the propagation tree is initialized:
Figure FDA0002388800550000033
Figure FDA0002388800550000034
in the above formula, $x_g$ and $x_i$ represent the feature vectors of the target item g and item i, respectively, and $r_i$ represents the initial financing amount of item i;
in each subsequent propagation process, the information aggregation mode of each node is as follows:
Figure FDA0002388800550000035
wherein
Figure FDA0002388800550000036
represents the neighbor nodes of node v in the adjacency matrix $\Gamma$; $|G \cup \Phi|$ is the size of the set of state vectors of all nodes in the propagation tree; G represents the set of nodes to be predicted; $\Phi$ is the set of all observable nodes; $h^{(t-1)T}$ represents the hidden state of a node at step (t-1), the subscript being the node sequence number; and b is a bias vector;
then, the state of each node is updated using a recurrent neural network:
Figure FDA0002388800550000041
Figure FDA0002388800550000042
Figure FDA0002388800550000043
Figure FDA0002388800550000044
in the above formula, $W_z$ and $U_z$ represent the trainable parameter matrices of the update gate, $W_r$ and $U_r$ represent the trainable parameter matrices of the reset gate, and $W_1$ and $U_1$ represent the corresponding parameter matrices of the output layer;
after $t = t_h$ propagation steps, the final state of the target item g is:
Figure FDA0002388800550000045
in the above formula,
Figure FDA0002388800550000046
is the environment state vector of the target item.
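The node state update in claim 5 names an update gate, a reset gate, and an output layer, which matches the standard gated recurrent unit. The exact equations are available only as images in this text, so the sketch below uses the textbook GRU form as an assumption and may differ from the claimed formulas in detail; the names `gru_update`, `matvec`, and `sigmoid` are hypothetical, and bias terms are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def gru_update(a, h, Wz, Uz, Wr, Ur, W1, U1):
    """a: aggregated neighbor message; h: previous hidden state of the node."""
    z = [sigmoid(p + q) for p, q in zip(matvec(Wz, a), matvec(Uz, h))]  # update gate
    r = [sigmoid(p + q) for p, q in zip(matvec(Wr, a), matvec(Ur, h))]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(p + q) for p, q in zip(matvec(W1, a), matvec(U1, rh))]
    # Blend the previous state with the candidate state.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


Z = [[0.0, 0.0], [0.0, 0.0]]
h_next = gru_update([1.0, 2.0], [0.4, -0.8], Z, Z, Z, Z, Z, Z)
```

With all-zero parameter matrices both gates equal 0.5 and the candidate state is zero, so the new state is half the previous one, which gives a quick sanity check of the blending step.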
6. The crowd-funding project initial-stage financing performance prediction system as claimed in claim 1, wherein the prediction of the initial financing result of the target project using the competition pressure state vector of the target project and the environmental state vector of the target project comprises:
the competition pressure state vector of the target item
Figure FDA0002388800550000047
and the environment state vector of the target item
Figure FDA0002388800550000048
are combined and passed through a fully connected layer whose activation function is the ReLU function, so as to predict the initial financing result of the target project:
Figure FDA0002388800550000049
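The prediction step in claim 6 can be sketched as below. How the two vectors are "combined" is not spelled out in this text, so concatenation is an assumption; the function name `predict_initial_financing`, the single-output layer, and the toy weights are likewise hypothetical.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def predict_initial_financing(c_g, e_g, weights, bias):
    """c_g: competition pressure state vector; e_g: environment state vector."""
    combined = list(c_g) + list(e_g)  # assumption: combination by concatenation
    # Fully connected layer with a single output unit and ReLU activation.
    return relu(sum(w * x for w, x in zip(weights, combined)) + bias)


y_hat = predict_initial_financing([1.0, 2.0], [3.0], [0.5, -1.0, 1.0], 0.25)
```

Because of the ReLU activation, the sketch never returns a negative prediction.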
7. The crowd funding project initial stage financing performance prediction system as claimed in claim 1 or 6, wherein the method further comprises: training the parameters in the modeling and prediction unit, wherein the training loss function is as follows:
Figure FDA00023888005500000410
in the above formula, Θ represents a parameter set to be trained of the model, and η represents a weight coefficient;
the loss term $Loss_p$ is calculated as follows:
Figure FDA00023888005500000411
in the above formula,
Figure FDA00023888005500000412
represents the set of target items in the training process, and g is a target item; $y_g$ is the real initial financing result of the target item g;
the loss term $Loss_l$ is the loss of the long short-term memory network in the process of calculating the competition pressure state vector; the initial financing state of published item i predicted by the long short-term memory network
Figure FDA00023888005500000413
is mapped to a one-dimensional value $y'_i$, that is:
Figure FDA0002388800550000051
in the above formula, $y_i$ is the real initial financing result of published item i.
CN202010107299.6A 2020-02-21 2020-02-21 Crowd funding project initial stage financing performance prediction system Pending CN111340669A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107299.6A CN111340669A (en) 2020-02-21 2020-02-21 Crowd funding project initial stage financing performance prediction system

Publications (1)

Publication Number Publication Date
CN111340669A (en) 2020-06-26

Family

ID=71181728

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010107299.6A Pending CN111340669A (en) 2020-02-21 2020-02-21 Crowd funding project initial stage financing performance prediction system

Country Status (1)

Country Link
CN (1) CN111340669A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107085800A (en) * 2017-04-24 2017-08-22 中国科学技术大学 Quantity optimization method is supplied based on many multi-products for raising platform
CN108830409A (en) * 2018-05-31 2018-11-16 中国科学技术大学 The donations behavior of platform is raised towards crowd and contributor keeps prediction technique
CN109492830A (en) * 2018-12-17 2019-03-19 杭州电子科技大学 A kind of mobile pollution source concentration of emission prediction technique based on space-time deep learning
US20190180358A1 (en) * 2017-12-11 2019-06-13 Accenture Global Solutions Limited Machine learning classification and prediction system
CN109902183A (en) * 2019-02-13 2019-06-18 北京航空航天大学 A kind of knowledge mapping embedding grammar based on various figure attention mechanism
CN110097225A (en) * 2019-05-05 2019-08-06 中国科学技术大学 Collaborative forecasting method based on sound state depth characterization

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIKANG WU et al.: "Estimating Early Fundraising Performance of Innovations via Graph-based Market Environment Model" *
陈肖华; 李元亨: "Research on predicting the financing results of crowd funding projects based on a BP neural network" *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200626