Disclosure of Invention
In view of the above problems, the invention aims to provide an investment management based cloud platform and an investment management method.
The purpose of the invention is realized by the following technical scheme:
A cloud platform based on investment management, and an investment management method, comprise a registration module, a login module, an information crawling module, an information storage module, a data screening module, a data analysis and management module and an investment management module. The registration module is used for registering accounts for new users. The login module is provided with a login interface through which a client logs in to the investment management cloud platform; the login interface comprises a name input field, a mobile phone number input field, an identity card number input field and a verification code field. The information crawling module adopts the Scrapy framework to crawl real-time data of a target investment platform, such as opening price, closing price and daily rate of return. The information storage module stores the crawled real-time data in an SQL database for further processing. The data screening module filters out data with low confidence according to the confidence of each data item, and screens out repeated, missing and similar data. The data analysis and management module comprises a time series unit, an overfitting analysis unit and a residual unit: the time series unit builds a time series model and fits it to the screened high-confidence data; the overfitting analysis unit constrains the fitted data with the Akaike information criterion in order to ensure the accuracy of investment prediction; and the residual unit narrows the error interval of the model by constructing a hypothesis test and an accurate model solution. The investment management module plans an investment scheme for the client according to the predicted data and gives suggestions on future investment trends.
Further, the registration module registers an account for a new user so as to establish a connection through which the platform serves the customer.
Furthermore, the login module is provided with a login interface comprising a name input field, a mobile phone number input field, an identity card number input field and a verification code field; the client enters the relevant information for security verification, which safeguards the client's operations, and the cloud platform stores the client's access frequency and access records.
Further, the information crawling module adopts the Scrapy framework to crawl real-time data of a target investment platform, such as opening price, closing price and daily rate of return. Because urls across the whole network are interlinked, an elite ant system algorithm is used to explore paths so that the crawler finds the corresponding target websites efficiently and with little time cost. The specific steps are as follows:
(1) assuming that there are L urls to be crawled in the network, and that the weighted adjacency matrix of the L urls is U, then:
$$U = \begin{bmatrix} 0 & u_{1,2} & u_{1,3} & \cdots & u_{1,L} \\ u_{2,1} & 0 & u_{2,3} & \cdots & u_{2,L} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ u_{L,1} & u_{L,2} & u_{L,3} & \cdots & 0 \end{bmatrix}$$
wherein $u_{c,d}$ denotes the weight of the c-th url site leading to the d-th url site; for example, $u_{1,2}$ is the weight of the 1st url site leading to the 2nd url site, and $u_{L,3}$ is the weight of the L-th url site leading to the 3rd url site. The weighted adjacency matrix U is known and symmetric, i.e. $u_{c,d} = u_{d,c}$, with $u_{c,d} \in [0, +\infty)$; the diagonal elements are 0, meaning that a crawler incurs no additional cost when exploring the same url node;
(2) let V crawlers crawl data simultaneously. Let $\tau_{c,d}(0) = o$ denote the initial pheromone from the c-th url to the d-th url, where o is a constant close to 0; likewise $\tau_{c,s}(0) = o$ denotes the initial pheromone from the c-th url to the s-th url. The probability $p^{v}_{c,d}(t)$ that crawler v transitions from the c-th url to the d-th url at time t is:
$$p^{v}_{c,d}(t) = \begin{cases} \dfrac{[\tau_{c,d}(t)]^{\alpha}\,[\eta_{c,d}]^{\beta}}{\sum_{s \in D_v(c)} [\tau_{c,s}(t)]^{\alpha}\,[\eta_{c,s}]^{\beta}}, & d \in D_v(c) \\ 0, & \text{otherwise} \end{cases}$$
in the formula, $D_v(c) \subseteq \{1, \dots, L\}$ is the set of urls that crawler v is allowed to select next; $\eta_{c,d}$ and $\eta_{c,s}$ are heuristic factors representing the crawler's expected degree of moving from the c-th url to the d-th (or s-th) url, typically determined by the time taken from the c-th url to that url; α and β represent the relative importance of the pheromone and of the heuristic factor, respectively;
(3) after the crawlers finish one traversal, the pheromone is updated as follows:
$$\tau_{c,d}(t+L) = (1-\rho)\,\tau_{c,d}(t) + \Delta\tau_{c,d}$$
where ρ (0 < ρ < 1) represents the pheromone evaporation coefficient on the path, 1 − ρ represents the pheromone persistence coefficient, and $\Delta\tau_{c,d}$ represents the pheromone increment of this iteration. For the elite ant system algorithm, let the best path found by the crawlers so far be $T_{best}$; the improved pheromone update is then:
$$\tau_{c,d}(t+L) = (1-\rho)\,\tau_{c,d}(t) + \Delta\tau_{c,d} + e\,\Delta\tau^{best}_{c,d}$$
wherein e is the parameter that adjusts the influence weight of $T_{best}$, and $\Delta\tau^{best}_{c,d}$ is given by the following formula:
$$\Delta\tau^{best}_{c,d} = \begin{cases} 1/L_{best}, & (c,d) \in T_{best} \\ 0, & \text{otherwise} \end{cases}$$
wherein $L_{best}$ is the length of the known optimal path $T_{best}$. Through iteration the crawlers can find the optimal path for exploring the url sites; after training, a crawler under this scheme can crawl real-time investment information quickly and accurately, bringing convenient service to clients.
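The transition rule and the elitist pheromone update in steps (2) and (3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the default values of `alpha`, `beta`, `rho` and `e`, and the choice of depositing `1 / path length` per crawler for the increment Δτ are assumptions, since the text does not specify how Δτ is computed.

```python
def transition_probs(c, allowed, tau, eta, alpha=1.0, beta=2.0):
    """Probability of moving from url c to each url d in `allowed` (step 2)."""
    weights = {d: (tau[c][d] ** alpha) * (eta[c][d] ** beta) for d in allowed}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


def path_length(path, U):
    """Sum of adjacency weights along consecutive urls of a path."""
    return sum(U[a][b] for a, b in zip(path, path[1:]))


def update_pheromone(tau, paths, U, best_path, rho=0.5, e=2.0):
    """Elitist update: evaporation, per-crawler deposit, then elite bonus (step 3)."""
    n = len(tau)
    for c in range(n):
        for d in range(n):
            tau[c][d] *= (1.0 - rho)  # evaporation, (1 - rho) * tau
    for path in paths:
        deposit = 1.0 / path_length(path, U)  # assumed form of the increment
        for a, b in zip(path, path[1:]):
            tau[a][b] += deposit
            tau[b][a] += deposit  # U is symmetric, so keep tau symmetric
    bonus = e / path_length(best_path, U)  # e * (1 / L_best) on edges of T_best
    for a, b in zip(best_path, best_path[1:]):
        tau[a][b] += bonus
        tau[b][a] += bonus
    return tau
```

Edges on the best path receive the extra `e / L_best` term every iteration, which is what biases later crawlers toward the best route found so far.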
Further, the information storage module stores the crawled real-time data in an SQL database for further processing.
Further, the data screening module filters out data with low confidence according to the confidence of each data item, and screens out repeated, missing and similar data. Assuming that m types of data, such as opening price, closing price and daily rate of return, are crawled at n unit times, and denoting the screened data matrix by A, we obtain:
$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\ \vdots & \vdots & & \vdots \\ a_{t,1} & a_{t,2} & \cdots & a_{t,m} \\ \vdots & \vdots & & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,m} \end{bmatrix}$$
wherein A is the screened data matrix, also called the sample matrix, and $a_{t,j}$ is the j-th data value of the t-th unit time, with $t \in [1, n]$ and $j \in [1, m]$; for example, $a_{1,1}$ is the 1st data value of the first unit time and $a_{n,m}$ is the m-th data value of the n-th unit time.
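A possible shape for the screening step is sketched below. The record layout (dictionaries with `time`, `confidence` and one key per data type), the threshold `min_confidence` and the function name `screen` are illustrative assumptions; the text does not say how confidence is computed or how "similar" data are detected, so only low-confidence, missing and exactly repeated rows are handled here.

```python
def screen(records, fields, min_confidence=0.8):
    """Build the sample matrix A: one row per unit time, one column per field."""
    seen = set()
    matrix = []
    for rec in sorted(records, key=lambda r: r["time"]):
        if rec.get("confidence", 0.0) < min_confidence:
            continue  # drop low-confidence data
        row = [rec.get(f) for f in fields]
        if any(v is None for v in row):
            continue  # drop rows with missing values
        key = (rec["time"], tuple(row))
        if key in seen:
            continue  # drop exact repeats
        seen.add(key)
        matrix.append(row)
    return matrix
```

Each surviving row corresponds to one row of A, with columns ordered as in `fields`.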
Further, the data analysis and management module comprises a time series unit, an overfitting analysis unit and a residual unit, and analyzes the screened samples. The samples form a numerical sequence in which index values are arranged in time order; this sequence is treated as a time series, and an autoregressive integrated moving average (ARIMA) model is used for predictive analysis of the investment information. The specific steps are as follows:
(1) the time series unit first judges the stationarity of the time series. Because the ARIMA model can be regarded as a linear combination of an autoregressive (AR) model and a moving average (MA) model, its stationarity depends on the AR and MA models; the time series is stationary as long as the ARIMA model satisfies the stationarity condition of the AR model, and only a stationary time series can be analyzed with the ARIMA regression model;
(2) the mathematical model after differencing can be further simplified by using the lag (backshift) operator, denoted by the symbol B, where $B a_t = a_{t-1}$:
for the first-order difference: $\Delta a_t = a_t - a_{t-1} = (1-B)\,a_t$;
for the second-order difference: $\Delta^2 a_t = \Delta(a_t - a_{t-1}) = (a_t - a_{t-1}) - (a_{t-1} - a_{t-2}) = a_t - 2a_{t-1} + a_{t-2} = (1 - 2B + B^2)\,a_t = (1-B)^2 a_t$;
for the investment management data of the cloud platform, at most 2 differences are generally used;
(3) build the ARIMA model, which according to its definition is written as:
wherein p is the order of the AR model, q is the order of the MA model, d is the number of differences with d ∈ {0, 1, 2}, r is the Pearson correlation coefficient, $y_t$ is the sample value at time t, C is a constant, $\beta_i$ is a regression coefficient, $\varepsilon_t$ is a multiplicative error coefficient, and $a_{n-t,j}$ is the j-th data value of the (n−t)-th unit time. The Pearson correlation coefficient r can be calculated as:
$$r = \frac{\mathrm{cov}(a_n, a_{n-t})}{\sqrt{\mathrm{var}(a_n)\,\mathrm{var}(a_{n-t})}}$$
wherein $\mathrm{var}(a_n)$ and $\mathrm{var}(a_{n-t})$ are the variances of the data at times n and n−t, and $\mathrm{cov}(a_n, a_{n-t})$ is their covariance;
(4) in order to obtain the orders p and q of the ARIMA model, the autocorrelation coefficient $\gamma_k$ and the partial autocorrelation coefficient $\phi_{k,p}$ of the sample are calculated. The autocorrelation coefficient $\gamma_k$ is:
$$\gamma_k = \frac{\mathrm{cov}(a_t, a_{t-k})}{\mathrm{var}(a_t)}$$
wherein $a_t$ is the sample data value at time t, $a_{t-k}$ is the sample data value at time t−k, $\mathrm{cov}(a_t, a_{t-k})$ is the covariance of $a_t$ and $a_{t-k}$, and $\mathrm{var}(a_t)$ is the variance of the data at time t. The partial autocorrelation coefficient $\phi_{k,p}$ is:
wherein $\gamma_{j-k}$ is the autocorrelation coefficient of the sample data at lag j−k, $\gamma_{p-k}$ is the autocorrelation coefficient of the sample data at lag p−k, and p is the order of the AR(p) model. The orders p and q are determined by analyzing the tailing-off and cut-off behavior of the autocorrelation and partial autocorrelation coefficients of the d-th-order differenced series;
(5) using the least squares method, regression equations are set up and solved for the regression coefficients $\beta_i$; the predicted value is:
wherein $a_{t-i}$ is the sample data value at time t−i, $\beta_i$ is the regression coefficient, and $\varepsilon_{t-i}$ is the multiplicative error coefficient at time t−i;
(6) in order to prevent overfitting, the overfitting analysis unit evaluates the model with the Akaike information criterion (AIC):
$$\mathrm{AIC} = 2m - 2\ln(\hat{L})$$
where m is the number of parameters and $\hat{L}$ is the estimated maximum of the likelihood function, which satisfies:
$$\hat{L} = \max_{\theta}\, p(a_t \mid \theta)$$
wherein $p(a_t \mid \theta)$ is the joint density function;
(7) after the overfitting analysis, an underfitting analysis is carried out on the model, for which a white noise test must be performed on the residuals. If the residuals are white noise, the selected model fully captures the regularity of the time series data, i.e. the model is acceptable; if the residuals are not white noise, part of the information has not been captured by the model, and the model must be corrected to capture it, which prevents underfitting. To ensure the precision of the predicted data, the residual unit calculates the residual sum of squares; if it exceeds a precision threshold Td, the procedure returns to step (1) and applies further differencing to the data. The residual sum of squares RSS is calculated as:
$$\mathrm{RSS} = \sum_{t=1}^{n} \left(a_t - \hat{a}_t\right)^2$$
where $a_t$ is the sample data value at time t and $\hat{a}_t$ is the corresponding predicted value.
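Two of the building blocks used in steps (2) and (4), differencing and the autocorrelation coefficient, can be sketched in a few lines. The function names are hypothetical, and the autocorrelation uses the common sample estimator (a single overall mean and variance for the whole series), which is an assumption about how the covariance and variance are estimated in practice.

```python
def difference(series, d=1):
    """Apply the difference operator (1 - B) to the series d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def autocorrelation(series, k):
    """Sample estimate of gamma_k = cov(a_t, a_{t-k}) / var(a_t)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n))
    return cov / var
```

For a quadratic trend such as `[1, 4, 9, 16, 25]`, two differences leave a constant series, which matches the remark that at most 2 differences are generally used.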
Further, the investment management module plans an investment scheme for the client according to the predicted data and sets investment scheme labels for the client, comprising a general investment label and a customized investment label. In order to find the investment service most suitable for the user, a support vector machine is adopted, and a cuckoo algorithm is used to optimize the penalty factor and the kernel function parameter of the support vector machine. The specific steps are as follows:
in the cuckoo algorithm, let $x_i(t)$ denote the position of the i-th bird nest in the population retained after the t-th update in Lévy flight mode, and let $X_i(t)$ denote the position of the i-th bird nest in the population retained after the t-th iteration update; $p_a$ denotes the discovery probability. For bird nest position $x_i(t)$, a random number rand between 0 and 1 is generated; when rand ≤ $p_a$, $X_i(t) = x_i(t)$; when rand > $p_a$, the value of $X_i(t)$ is determined in the following manner:
let $x_j(t)$ denote the position of the j-th bird nest in the population retained after the t-th update in Lévy flight mode. When bird nest position $x_j(t)$ satisfies $f(x_j(t)) < f(x_i(t))$, it is added to the set $M_i(t)$, where $M_i(t)$ denotes the set of bird nest positions in the population that are superior to bird nest position $x_i(t)$, and $f(x_j(t))$ and $f(x_i(t))$ denote the fitness function values of $x_j(t)$ and $x_i(t)$ respectively. The bird nest positions in $M_i(t)$ are sorted by their Euclidean distance to $x_i(t)$, from near to far, to form a sequence $Q_i(t)$, expressed as $Q_i(t) = \{x_{i,l}(t),\ l = 1, 2, \dots, n_i(t)\}$, where $x_{i,l}(t)$ denotes the l-th bird nest position in $Q_i(t)$ and $n_i(t)$ denotes the number of bird nest positions in $Q_i(t)$. Define $H_i(t)$ as the spatial detection coefficient of bird nest position $x_i(t)$; the expression of $H_i(t)$ is:
wherein $R_{i,l}(t)$ denotes the spatial radius of bird nest position $x_{i,l}(t)$ centered at bird nest position $x_i(t)$, with $R_{i,l}(t) = |x_{i,l}(t) - x_i(t)|$; $x_{i,n}(t)$ denotes the n-th bird nest position in sequence $Q_i(t)$, and $R_{i,n}(t)$ denotes its spatial radius centered at $x_i(t)$, with $R_{i,n}(t) = |x_{i,n}(t) - x_i(t)|$; $\bar{R}_{i,k}(t)$ denotes the mean of the spatial radii, centered at $x_i(t)$, of the first k bird nest positions in $Q_i(t)$, with $\bar{R}_{i,k}(t) = \frac{1}{k}\sum_{l=1}^{k} R_{i,l}(t)$; k is a given positive integer satisfying $k < n_i(t)$; α and β are weight coefficients satisfying α, β ∈ (0, 1) and α + β = 1;
let $J_i(t)$ denote the set of preferred bird nest positions in the population that participate in the random change of bird nest position $x_i(t)$; the preferred bird nest positions in $J_i(t)$ are determined using a parameter $k_i(t)$, specifically:
(1) the value of the parameter $k_i(t)$ is determined according to the spatial detection coefficient $H_i(t)$ of bird nest position $x_i(t)$:
in the formula, $k_i(t)$ denotes the local range control parameter of bird nest position $x_i(t)$ during the random change; $\tilde{H}(t)$ denotes the median of the spatial detection coefficients of the bird nest positions retained after the population's t-th update in Lévy flight mode, with $\tilde{H}(t) = \mathrm{median}\{H_i(t),\ i = 1, 2, \dots, N\}$, wherein median denotes the median function, $\lfloor \cdot \rfloor$ denotes rounding down, and N denotes the number of bird nests in the population;
(2) the first $k_i(t)$ bird nest positions in sequence $Q_i(t)$ are the preferred bird nest positions participating in the random change of $x_i(t)$, i.e. the first $k_i(t)$ bird nest positions of $Q_i(t)$ are selected into the set $J_i(t)$;
the bird nest position $x_i(t)$ is randomly changed in the following manner:
in the formula, $\chi_i(t)$ denotes the new bird nest position obtained by the random change of bird nest position $x_i(t)$, $rand_1$ is a randomly generated number between 0 and 1, and the two remaining nest-position terms are bird nest positions randomly selected from the set $J_i(t)$.
Let $f(\chi_i(t))$ denote the fitness function value of bird nest position $\chi_i(t)$; when $f(\chi_i(t)) \ge f(x_i(t))$, $X_i(t) = x_i(t)$; when $f(\chi_i(t)) < f(x_i(t))$, $X_i(t) = \chi_i(t)$.
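The parts of this procedure that are fully specified in the text, namely building the sequence $Q_i(t)$ of superior nests ordered by Euclidean distance and the final greedy acceptance of $\chi_i(t)$, can be sketched as below. The function names are hypothetical, nests are represented as coordinate tuples (for the support vector machine these would be a penalty factor and a kernel parameter), and lower fitness is assumed to be better, consistent with the inequality $f(x_j(t)) < f(x_i(t))$.

```python
import math


def superior_sequence(i, nests, fitness):
    """Sequence Q_i(t): nests with lower fitness than nest i, nearest first."""
    fi = fitness(nests[i])
    superior = [x for j, x in enumerate(nests) if j != i and fitness(x) < fi]
    return sorted(superior, key=lambda x: math.dist(x, nests[i]))


def accept(candidate, current, fitness):
    """Keep the randomly changed nest only if it strictly improves fitness."""
    return candidate if fitness(candidate) < fitness(current) else current
```

The random-change step that produces the candidate is left out on purpose, because its formula is not recoverable from the text.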
The invention has the following beneficial effects. Reliable information is retrieved with a crawler built on the Scrapy framework and stored in an SQL database, so that customers' historical records can be retrieved, stored and accessed at any time. Because network urls are interwoven, a purely breadth-first or depth-first search imposes a heavy load when the investment data change in real time. An elite ant system is therefore adopted: it avoids the original ant colony algorithm's need to iterate precisely over the pheromone generated by every ant, by letting the pheromone of the elite ants stand in for that of the original ants, which lightens the collection of investment data on the cloud platform and makes the service convenient and fast. Using the real-time dynamic feature data obtained by the crawler, an ARIMA model is constructed on the cloud platform server and fitted to the data by regression analysis for quantitative description and prediction, so that, on the premise of guaranteeing security, investments can be analyzed effectively, an investment scheme can be drawn up, and investment recommendations can be given.
Detailed Description
The invention is further described with reference to the following examples.
Referring to fig. 1, the invention provides an investment management based cloud platform and an investment management method, comprising a registration module, a login module, an information crawling module, an information storage module, a data screening module, a data analysis and management module and an investment management module. The registration module is used for registering accounts for new users. The login module is provided with a login interface through which a client logs in to the investment management cloud platform; the login interface comprises a name input field, a mobile phone number input field, an identity card number input field and a verification code field. The information crawling module adopts the Scrapy framework to crawl real-time data of a target investment platform, such as opening price, closing price and daily rate of return. The information storage module stores the crawled real-time data in an SQL database for further processing. The data screening module screens the data according to the confidence of each data item. The data analysis and management module comprises a time series unit, an overfitting analysis unit and a residual unit: the time series unit builds a time series model and fits it to the screened high-confidence data; the overfitting analysis unit constrains the fitted data with the Akaike information criterion in order to ensure the accuracy of investment prediction; and the residual unit narrows the error interval of the model by constructing a hypothesis test and an accurate model solution. The investment management module plans an investment scheme for the client according to the predicted data and gives suggestions on future investment trends.
Specifically, the registration module registers an account for a new user so as to establish a connection through which the platform serves the customer.
Specifically, the login module is provided with a login interface comprising a name input field, a mobile phone number input field, an identity card number input field and a verification code field; the client enters the relevant information for security verification, which safeguards the client's operations, and the cloud platform stores the client's access frequency and access records.
Specifically, the information crawling module adopts the Scrapy framework to crawl real-time data of the target investment platform, such as opening price, closing price and daily rate of return; the information storage module stores the crawled real-time data in an SQL database for further processing. The Scrapy framework consists of the Scrapy Engine, Scheduler, Downloader, Spider, Item Pipeline, Downloader Middlewares and Spider Middlewares components. The Scrapy Engine is the engine of the whole framework and is mainly responsible for communication among the Spider, Item Pipeline, Downloader and Scheduler, including the transmission of signals and data. The Scheduler receives all request urls and de-duplicates, orders and enqueues them. The Downloader downloads web page information from the network, completing the download of network resources. The Spider is the crawler: it processes all Responses, parses and extracts data from them, and provides the urls for the initial crawl. The Item Pipeline is the data pipeline, which receives Item data and performs post-processing such as data analysis, cleaning and storage. Downloader Middlewares is the download middleware, and Spider Middlewares is the crawler middleware.
Preferably, because urls across the whole network are interlinked, an elite ant system algorithm is used to explore paths so that the crawler finds the corresponding target websites efficiently and with little time cost. The steps for crawling real-time information are as follows:
(1) assuming that there are L urls to be crawled in the network, and that the weighted adjacency matrix of the L urls is U, then:
$$U = \begin{bmatrix} 0 & u_{1,2} & u_{1,3} & \cdots & u_{1,L} \\ u_{2,1} & 0 & u_{2,3} & \cdots & u_{2,L} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ u_{L,1} & u_{L,2} & u_{L,3} & \cdots & 0 \end{bmatrix}$$
wherein $u_{c,d}$ denotes the weight of the c-th url site leading to the d-th url site; for example, $u_{1,2}$ is the weight of the 1st url site leading to the 2nd url site, and $u_{L,3}$ is the weight of the L-th url site leading to the 3rd url site. The weighted adjacency matrix U is known and symmetric, i.e. $u_{c,d} = u_{d,c}$, with $u_{c,d} \in [0, +\infty)$; the diagonal elements are 0, meaning that a crawler incurs no additional cost when exploring the same url node;
(2) let V crawlers crawl data simultaneously. Let $\tau_{c,d}(0) = o$ denote the initial pheromone from the c-th url to the d-th url, where o is a constant close to 0; likewise $\tau_{c,s}(0) = o$ denotes the initial pheromone from the c-th url to the s-th url. The probability $p^{v}_{c,d}(t)$ that crawler v transitions from the c-th url to the d-th url at time t is:
$$p^{v}_{c,d}(t) = \begin{cases} \dfrac{[\tau_{c,d}(t)]^{\alpha}\,[\eta_{c,d}]^{\beta}}{\sum_{s \in D_v(c)} [\tau_{c,s}(t)]^{\alpha}\,[\eta_{c,s}]^{\beta}}, & d \in D_v(c) \\ 0, & \text{otherwise} \end{cases}$$
in the formula, $D_v(c) \subseteq \{1, \dots, L\}$ is the set of urls that crawler v is allowed to select next; $\eta_{c,d}$ and $\eta_{c,s}$ are heuristic factors representing the crawler's expected degree of moving from the c-th url to the d-th (or s-th) url, typically determined by the time taken from the c-th url to that url; α and β represent the relative importance of the pheromone and of the heuristic factor, respectively;
(3) after the crawlers complete one traversal, the pheromone is updated as follows:
$$\tau_{c,d}(t+L) = (1-\rho)\,\tau_{c,d}(t) + \Delta\tau_{c,d}$$
where ρ (0 < ρ < 1) denotes the pheromone evaporation coefficient on the path, 1 − ρ denotes the pheromone persistence coefficient, and $\Delta\tau_{c,d}$ denotes the pheromone increment of this iteration. For the elite ant system algorithm, let the best path found by the crawlers so far be $T_{best}$; the improved pheromone update is then:
$$\tau_{c,d}(t+L) = (1-\rho)\,\tau_{c,d}(t) + \Delta\tau_{c,d} + e\,\Delta\tau^{best}_{c,d}$$
wherein e is the parameter that adjusts the influence weight of $T_{best}$, and $\Delta\tau^{best}_{c,d}$ is given by the formula:
$$\Delta\tau^{best}_{c,d} = \begin{cases} 1/L_{best}, & (c,d) \in T_{best} \\ 0, & \text{otherwise} \end{cases}$$
wherein $L_{best}$ is the length of the known optimal path $T_{best}$. Through iteration the crawlers can find the optimal path for exploring the url sites; after training, a crawler under this scheme can crawl real-time investment information quickly and accurately, bringing convenient service to clients.
Specifically, the information storage module stores the crawled real-time data in an SQL database for further processing.
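As a minimal sketch of the storage step, the snippet below writes crawled rows into an SQL table using Python's built-in `sqlite3` module. SQLite, the table name `quotes`, the column names and the sample row are all illustrative assumptions; the text only says that an SQL database is used.

```python
import sqlite3


def store_quotes(conn, rows):
    """Create the table if needed and upsert (ts, symbol, open, close, return) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quotes ("
        "ts TEXT NOT NULL, symbol TEXT NOT NULL, "
        "open REAL, close REAL, daily_return REAL, "
        "PRIMARY KEY (ts, symbol))"
    )
    # Parameterized statements avoid building SQL from crawled strings.
    conn.executemany("INSERT OR REPLACE INTO quotes VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
store_quotes(conn, [("2024-01-02", "DEMO", 10.0, 10.5, 0.05)])  # made-up sample row
```

The composite primary key means that re-crawling the same timestamp and symbol replaces the earlier row instead of duplicating it.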
Specifically, the data screening module filters out data with low confidence according to the confidence of each data item, and screens out repeated, missing and similar data. Assuming that m types of data, such as opening price, closing price and daily rate of return, are crawled at n unit times, and denoting the screened data matrix by A, we obtain:
$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,m} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,m} \\ \vdots & \vdots & & \vdots \\ a_{t,1} & a_{t,2} & \cdots & a_{t,m} \\ \vdots & \vdots & & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,m} \end{bmatrix}$$
wherein A is the screened data matrix, also called the sample matrix, and $a_{t,j}$ is the j-th data value of the t-th unit time, with $t \in [1, n]$ and $j \in [1, m]$; for example, $a_{1,1}$ is the 1st data value of the first unit time and $a_{n,m}$ is the m-th data value of the n-th unit time.
Specifically, the data analysis and management module comprises a time series unit, an overfitting analysis unit and a residual unit, and analyzes the screened samples. The samples form a numerical sequence in which index values are arranged in time order; this sequence is treated as a time series, and an autoregressive integrated moving average (ARIMA) model is used for predictive analysis of the investment information.
Specifically, the time series unit builds a time series model and fits it to the screened high-confidence data; the overfitting analysis unit constrains the fitted data with the Akaike information criterion in order to ensure the accuracy of investment prediction. The specific steps are as follows:
(1) the time series unit first judges the stationarity of the time series. Because the ARIMA model can be regarded as a linear combination of an autoregressive (AR) model and a moving average (MA) model, its stationarity depends on the AR and MA models. Whether the AR(p) model is stationary is judged from the p solutions of its characteristic equation as follows:
a. if the moduli of all p solutions are less than 1, $\{y_t\}$ is stationary, i.e. the AR(p) model corresponding to $\{y_t\}$ is stationary;
b. if k of the p solutions have modulus equal to 1, $\{y_t\}$ is a k-th-order unit root process (it can be turned into a stationary time series by k-th-order differencing);
c. if the modulus of at least one of the p solutions is greater than 1, $\{y_t\}$ is an explosive process;
for the MA(q) model, MA(q) is generally stationary as long as q is a finite constant; in summary, the time series is stationary as long as the ARIMA model satisfies the stationarity condition of the AR(p) model;
preferably, only a stationary time series is analyzed with the ARIMA regression model;
(2) the mathematical model after differencing can be further simplified by using the lag (backshift) operator, denoted by the symbol B, where $B a_t = a_{t-1}$:
for the first-order difference: $\Delta a_t = a_t - a_{t-1} = (1-B)\,a_t$;
for the second-order difference: $\Delta^2 a_t = \Delta(a_t - a_{t-1}) = (a_t - a_{t-1}) - (a_{t-1} - a_{t-2}) = a_t - 2a_{t-1} + a_{t-2} = (1 - 2B + B^2)\,a_t = (1-B)^2 a_t$;
for the investment management data of the cloud platform, at most 2 differences are generally used;
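(3) build the ARIMA model, which according to its definition is written as:
wherein p is the order of the AR model, q is the order of the MA model, d is the number of differences with d ∈ {0, 1, 2}, r is the Pearson correlation coefficient, $y_t$ is the sample value at time t, C is a constant, $\beta_i$ is a regression coefficient, $\varepsilon_t$ is a multiplicative error coefficient, and $a_{n-t,j}$ is the j-th data value of the (n−t)-th unit time. The Pearson correlation coefficient r can be calculated as:
$$r = \frac{\mathrm{cov}(a_n, a_{n-t})}{\sqrt{\mathrm{var}(a_n)\,\mathrm{var}(a_{n-t})}}$$
wherein $\mathrm{var}(a_n)$ is the variance of the data at time n, $\mathrm{var}(a_{n-t})$ is the variance of the data at time n−t, and $\mathrm{cov}(a_n, a_{n-t})$ is the covariance of the data at time n and the data at time n−t. Expanding the variances and the covariance over the m data values of each unit time, with $\bar{a}_n = \frac{1}{m}\sum_{j=1}^{m} a_{n,j}$ and $\bar{a}_{n-t} = \frac{1}{m}\sum_{j=1}^{m} a_{n-t,j}$, the Pearson correlation coefficient r may be written as:
$$r = \frac{\sum_{j=1}^{m} (a_{n,j} - \bar{a}_n)(a_{n-t,j} - \bar{a}_{n-t})}{\sqrt{\sum_{j=1}^{m} (a_{n,j} - \bar{a}_n)^2}\ \sqrt{\sum_{j=1}^{m} (a_{n-t,j} - \bar{a}_{n-t})^2}}$$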
(4) in order to obtain the orders p and q of the ARIMA model, the autocorrelation coefficient $\gamma_k$ and the partial autocorrelation coefficient $\phi_{k,p}$ of the sample are calculated. The autocorrelation coefficient $\gamma_k$ is:
$$\gamma_k = \frac{\mathrm{cov}(a_t, a_{t-k})}{\mathrm{var}(a_t)}$$
wherein $a_t$ is the sample data value at time t, $a_{t-k}$ is the sample data value at time t−k, $\mathrm{cov}(a_t, a_{t-k})$ is the covariance of $a_t$ and $a_{t-k}$, and $\mathrm{var}(a_t)$ is the variance of the data at time t. The partial autocorrelation coefficient $\phi_{k,p}$ is:
wherein $\gamma_{j-k}$ is the autocorrelation coefficient of the sample data at lag j−k, $\gamma_{p-k}$ is the autocorrelation coefficient of the sample data at lag p−k, and p is the order of the AR(p) model. The orders p and q are determined by analyzing the tailing-off and cut-off behavior of the autocorrelation and partial autocorrelation coefficients of the d-th-order differenced series;
(5) using the least squares method, regression equations are set up and solved for the regression coefficients $\beta_i$; the predicted value is:
wherein $a_{t-i}$ is the sample data value at time t−i, $\beta_i$ is the regression coefficient, and $\varepsilon_{t-i}$ is the multiplicative error coefficient at time t−i;
(6) in order to prevent overfitting, the overfitting analysis unit evaluates the model with the Akaike information criterion (AIC):
$$\mathrm{AIC} = 2m - 2\ln(\hat{L})$$
where m is the number of parameters and $\hat{L}$ is the estimated maximum of the likelihood function, which satisfies:
$$\hat{L} = \max_{\theta}\, p(a_t \mid \theta)$$
wherein $p(a_t \mid \theta)$ is the joint density function;
(7) after the overfitting analysis, an underfitting analysis is carried out on the model, for which a white noise test must be performed on the residuals. If the residuals are white noise, the selected model fully captures the regularity of the time series data, i.e. the model is acceptable; if the residuals are not white noise, part of the information has not been captured by the model, and the model must be corrected to capture it, which prevents underfitting. To ensure the precision of the predicted data, the residual unit calculates the residual sum of squares; if it exceeds a precision threshold Td, the procedure returns to step (1) and applies further differencing to the data. The residual sum of squares RSS is calculated as:
$$\mathrm{RSS} = \sum_{t=1}^{n} \left(a_t - \hat{a}_t\right)^2$$
where $a_t$ is the sample data value at time t and $\hat{a}_t$ is the corresponding predicted value.
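The model-checking quantities of steps (6) and (7) can be computed as in the sketch below. The function names are hypothetical, and the Gaussian log-likelihood is an assumption (independent normal residuals with variance estimated as RSS / n); the text does not state which density is used.

```python
import math


def rss(actual, predicted):
    """Residual sum of squares between sample values and predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def gaussian_log_likelihood(actual, predicted):
    """Maximized log-likelihood under an assumed Gaussian residual model."""
    n = len(actual)
    sigma2 = rss(actual, predicted) / n  # maximum-likelihood variance estimate
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(num_params, log_likelihood):
    """Akaike information criterion: 2m - 2 ln(L_hat)."""
    return 2 * num_params - 2 * log_likelihood
```

Among candidate (p, d, q) settings, the one with the smallest AIC would be preferred, and its RSS would then be compared against the threshold Td.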
specifically, an investment scheme is drawn for a client according to predicted data, a client drawing-up investment scheme label is set, the client drawing-up investment scheme label comprises a general investment label and a customized investment label, in order to find an investment service which is most suitable for the user, a support vector machine is adopted, and a cuckoo algorithm is utilized to optimize a penalty factor and a kernel function parameter of the support vector machine, and the specific steps are as follows:
in the cuckoo algorithm, let x i (t) represents the position of the ith bird nest in the population which is reserved after the ith bird nest is updated by adopting a Laevir flight mode, X i (t) represents the position of the ith nest in the population that remains after the t iteration update, p a Indicating the probability of finding, the bird's nest position x i (t) randomly generating a random number rand between 0 and 1, when the random number rand is less than or equal to p a When it is, X i (t)=x i (t); when the random number rand > p a Then, X is determined in the following manner i Value of (t):
Let x_j(t) denote the position of the j-th bird nest in the population retained after the t-th update by Lévy flight. When the bird nest position x_j(t) satisfies f(x_j(t)) < f(x_i(t)), the bird nest position x_j(t) is added to the set M_i(t), where M_i(t) denotes the set of bird nest positions in the population that are better than the bird nest position x_i(t), f(x_j(t)) denotes the fitness function value of the bird nest position x_j(t), and f(x_i(t)) denotes the fitness function value of the bird nest position x_i(t). The bird nest positions in the set M_i(t) are sorted by their Euclidean distance to the bird nest position x_i(t), from nearest to farthest, to form a sequence Q_i(t), expressed as Q_i(t) = {x_{i,l}(t), l = 1, 2, ..., n_i(t)}, where x_{i,l}(t) denotes the l-th bird nest position in the sequence Q_i(t) and n_i(t) denotes the number of bird nest positions in the sequence Q_i(t). Define H_i(t) as the spatial detection coefficient of the bird nest position x_i(t); the expression of H_i(t) is:
where R_{i,l}(t) denotes the spatial radius of the bird nest position x_{i,l}(t) centered at the bird nest position x_i(t), and R_{i,l}(t) = |x_{i,l}(t) - x_i(t)|; x_{i,n}(t) denotes the n-th bird nest position in the sequence Q_i(t), and R_{i,n}(t) denotes the spatial radius of the bird nest position x_{i,n}(t) centered at the bird nest position x_i(t), with R_{i,n}(t) = |x_{i,n}(t) - x_i(t)|; the mean term denotes the mean of the spatial radii, centered at the bird nest position x_i(t), of the first k bird nest positions in the sequence Q_i(t); k is a given positive integer satisfying k < n_i(t); and α and β are weight coefficients satisfying α, β ∈ (0, 1) and α + β = 1;
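The construction of M_i(t), Q_i(t) and the spatial radii described above can be sketched as follows. This is a minimal illustration under stated assumptions: nest positions are tuples of floats, a lower fitness value counts as better (as in the condition f(x_j(t)) < f(x_i(t))), and all function names are hypothetical. The expression for H_i(t) itself is not reproduced, since it is not recoverable from the text; only the quantities that feed into it are computed.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two nest positions (tuples of floats)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def better_nests_sorted(nests, fitness, i):
    """Return Q_i: the nests with lower fitness than nest i, nearest first."""
    x_i = nests[i]
    f_i = fitness(x_i)
    # M_i: every other nest whose fitness value is strictly smaller.
    m_i = [x for j, x in enumerate(nests) if j != i and fitness(x) < f_i]
    return sorted(m_i, key=lambda x: euclidean(x, x_i))


def spatial_radii(q_i, x_i):
    """R_{i,l} = |x_{i,l} - x_i| for each nest in Q_i, in sequence order."""
    return [euclidean(x, x_i) for x in q_i]


def mean_radius_first_k(q_i, x_i, k):
    """Mean spatial radius of the first k nests in Q_i; requires 0 < k < n_i."""
    radii = spatial_radii(q_i, x_i)
    if not 0 < k < len(radii):
        raise ValueError("k must be a positive integer smaller than n_i")
    return sum(radii[:k]) / k
```

Because Q_i(t) is sorted from nearest to farthest, the radii returned by `spatial_radii` are non-decreasing, so the mean over the first k entries never exceeds the largest radius in the sequence.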
Let J_i(t) denote the set of better bird nest positions in the population that participate in the random change of the bird nest position x_i(t). A parameter k_i(t) is used to determine the bird nest positions in the set J_i(t), specifically:
(1) The value of the parameter k_i(t) is determined according to the spatial detection coefficient H_i(t) of the bird nest position x_i(t):
In the formula, k_i(t) denotes the local range control parameter used when the bird nest position x_i(t) is randomly changed; the median term denotes the median of the spatial detection coefficients of the bird nest positions retained after the population's t-th update by Lévy flight, where median denotes the function that takes the median; ⌊·⌋ denotes rounding down; and N denotes the number of bird nests in the population;
(2) The first k_i(t) bird nest positions in the sequence Q_i(t) are the better bird nest positions in the population that participate in the random change of the bird nest position x_i(t), that is, the first k_i(t) bird nest positions in the sequence Q_i(t) are added to the set J_i(t);
The bird nest position x_i(t) is then randomly changed in the following manner:
In the formula, χ_i(t) denotes the new bird nest position obtained by randomly changing the bird nest position x_i(t), rand_1 is a randomly generated random number between 0 and 1, and the two remaining terms denote bird nest positions randomly selected from the set J_i(t).
let f (x) i (t)) represents the bird nest position χ i (t) fitness function value, when f (χ) i (t))≥f(x i (t)) then X i (t)=x i (t) when f (χ) i (t))<f(x i (t)) then X i (t)=χ i (t)。
The invention has the following beneficial effects. Reliable information is retrieved with the Scrapy crawler framework and stored in an SQL database, so that the client can conveniently retrieve, store, and access historical records at any time. Because network URLs are interlinked, using only a breadth-first or depth-first search imposes an enormous load when investment data changes in real time. An elite ant system is therefore adopted, which overcomes the need in the earlier ant colony algorithm to iteratively compute the pheromone generated by every ant: the pheromone generated by the elite ants is used in place of that of the ordinary ants, which keeps the cloud platform's handling of investment data lightweight and makes the service convenient and fast. On a powerful cloud platform server, real-time dynamic feature data is crawled by the crawler, fitting and regression analysis is performed on the data, and an ARIMA model is constructed. The penalty factor and the kernel function parameter of the support vector machine are optimized with a cuckoo algorithm, which avoids the blindness of manual parameter selection and improves the classification precision of the support vector machine.
The traditional cuckoo algorithm suffers from insufficient local optimization precision and insufficient convergence speed, which can easily prevent it from obtaining the optimal parameters of the support vector machine. To improve the precision with which the cuckoo algorithm optimizes the support vector machine, the preferred embodiment improves the traditional cuckoo algorithm, with the aim of raising both its optimization precision and its convergence speed, specifically as follows. After the traditional cuckoo algorithm updates the bird nest positions by Lévy flight, the positions of some bird nests in the population are generally changed randomly, that is, two bird nest positions are randomly selected from the population and used to randomly change the current bird nest position. This way of changing is too random and lacks adaptivity, so it does little to improve local optimization precision or convergence speed. The preferred embodiment therefore selects two better bird nest positions in the population when randomly changing a bird nest position, which improves the convergence speed of the algorithm.
Further, to enhance the local optimization precision of the algorithm and keep it from falling into a local optimum, the preferred embodiment uses the defined spatial detection coefficient, during the random change of a bird nest position, to measure the degree of spatial overlap between that bird nest position and the better bird nest positions near it in the population. When the spatial detection coefficient of a bird nest position is small, the local space formed by the nearby better bird nest positions overlaps strongly with that bird nest position; the parameter k_i(t) is then made large, that is, more bird nest positions are selected from the sequence Q_i(t) to participate in the random change, which increases the diversity of the population. When the spatial detection coefficient is large, the overlap is small; the parameter k_i(t) is then made small, that is, fewer bird nest positions are selected from the sequence Q_i(t) to participate in the random change, so that the local search of the cuckoo is strengthened during the random change and the optimization precision of the algorithm is improved.
Quantitative description and prediction make it possible to analyze investments effectively, draw up an investment scheme, and give investment advice while ensuring safety.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit its scope of protection. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from their spirit and scope.