CN116934486A

CN116934486A - Decision evaluation method and system based on deep learning

Info

Publication number: CN116934486A
Application number: CN202311192874.7A
Authority: CN
Inventors: 陈守红
Original assignee: Shenzhen Gelonghui Information Technology Co ltd
Current assignee: Shenzhen Lanyu Feiyang Technology Co ltd
Priority date: 2023-09-15
Filing date: 2023-09-15
Publication date: 2023-10-24
Anticipated expiration: 2043-09-15
Also published as: CN116934486B

Abstract

The invention relates to the technical field of data processing and discloses a decision evaluation method and system based on deep learning, which are used to improve the efficiency and accuracy of deep-learning-based decision evaluation. The method comprises: performing feature extraction to obtain a financial feature set, and performing data extraction on historical financial data to obtain target basic surface data; performing potential vector mapping through an encoder, outputting an initial potential vector, and extracting index data to obtain an index data set; performing data state analysis to determine a target data state, and performing transaction strategy analysis on the target data state to obtain an initial transaction strategy; predicting user behavior to obtain initial user behavior, and constructing a value function to obtain a target value function; resampling the initial potential vector to obtain a target potential vector, inputting the target potential vector into a decoder for data reconstruction to obtain a target data sample, and updating the initial transaction strategy to obtain a target transaction strategy.

Description

Decision evaluation method and system based on deep learning

Technical Field

The invention relates to the technical field of data processing, in particular to a decision evaluation method and system based on deep learning.

Background

The financial market is a complex and dynamic system affected by many factors, such as the macroeconomic environment, company fundamentals (referred to below as basic surface data) and technical indicators. Traditional financial decision evaluation methods often rely too heavily on the subjective judgment of specialized analysts and lack intelligent, systematic means of analysis. Meanwhile, financial market data is large in volume and complex in character, and traditional statistical methods and rule-based transaction strategies struggle to fully mine the latent patterns and features in the data, which limits both the accuracy of decision making and the profitability of transaction strategies. Deep learning can extract useful features from large-scale financial data and build complex models that capture the nonlinear relationships of the market, thereby improving the accuracy of decision evaluation. Reinforcement learning enables an agent to learn an optimal strategy through interaction with the environment, so as to realize adaptive and intelligent transaction decisions.

However, in the prior art, financial market data is often imbalanced, for example in the ratio of positive to negative samples or in the uneven distribution of data between long-term stable markets and short-term volatile markets. This imbalance affects the generalization ability of the model and the accuracy of decision evaluation. In addition, deep learning models are prone to over-fitting when processing large-scale data, and their generalization performance is particularly limited when the number of data samples is small.

Disclosure of Invention

The invention provides a decision evaluation method and a decision evaluation system based on deep learning, which are used for improving the efficiency and the accuracy of the decision evaluation based on the deep learning.

The first aspect of the present invention provides a decision evaluation method based on deep learning, which comprises:

collecting historical financial data, and carrying out standardized processing on the historical financial data to obtain standardized financial data;

performing feature extraction on the standardized financial data to obtain a financial feature set, wherein the financial feature set comprises a moving average line, a relative strength index and Bollinger Bands, and performing basic surface data extraction on the historical financial data through the financial feature set to obtain target basic surface data;

inputting the financial feature set and the target basic surface data into a preset encoder for potential vector mapping, outputting an initial potential vector, and extracting index data of the initial potential vector to obtain an index data set of the initial potential vector;

performing data state analysis on the index data set, determining a target data state, and performing transaction strategy analysis on the target data state to obtain an initial transaction strategy;

predicting the user behavior of the initial transaction strategy through a reinforcement learning algorithm to obtain initial user behavior, and constructing a value function through the initial user behavior to obtain a target value function;

resampling the initial potential vector through the target value function to obtain a target potential vector, inputting the target potential vector into a preset decoder for data reconstruction to obtain a target data sample, and carrying out strategy updating on the initial transaction strategy through the target data sample to obtain a target transaction strategy.

With reference to the first aspect, in a first implementation manner of the first aspect of the present invention, the performing feature extraction on the standardized financial data to obtain a financial feature set, where the financial feature set includes a moving average line, a relative strength index and Bollinger Bands, and performing, by using the financial feature set, basic surface data extraction on the historical financial data to obtain target basic surface data, includes:

performing time window division on the standardized financial data to obtain a plurality of time windows corresponding to the standardized financial data;

screening the time windows to obtain a target time window, and calculating a moving average value through the target time window to obtain a moving average value set;

constructing an average line through the moving average value set to obtain the moving average line;

performing index data calculation on the standardized financial data based on the moving average line to obtain the relative strength index;

based on the moving average line and the relative strength index, carrying out Bollinger Band analysis on the standardized financial data to determine the Bollinger Bands;

carrying out data weight analysis on the moving average line, the relative strength index and the Bollinger Bands to determine a weight data set;

carrying out data weighted fusion on the moving average line, the relative strength index and the Bollinger Bands through the weight data set to obtain fusion characteristics;

and extracting basic surface data from the historical financial data based on the fusion characteristics to obtain target basic surface data.

With reference to the first aspect, in a second implementation manner of the first aspect of the present invention, the inputting the financial feature set and the target base surface data into a preset encoder to perform potential vector mapping, and outputting an initial potential vector, and simultaneously, performing index data extraction on the initial potential vector to obtain an index data set of the initial potential vector includes:

performing data conversion on the financial feature set and the target basic surface data to obtain target tensor data;

inputting the target tensor data into the encoder for forward propagation to obtain candidate potential vectors;

extracting vector average values of the candidate potential vectors to obtain candidate average values, and extracting vector standard deviation of the candidate potential vectors to obtain candidate standard deviation;

based on the candidate mean value and the candidate standard deviation, carrying out random sampling processing on the candidate potential vectors to obtain the initial potential vectors;

and extracting index data from the initial potential vector to obtain an index data set of the initial potential vector.

With reference to the second implementation manner of the first aspect, in a third implementation manner of the first aspect of the present invention, the extracting the index data from the initial potential vector to obtain an index data set of the initial potential vector includes:

acquiring a preset index data type, and performing data extraction logic analysis on data through the index data type to obtain data extraction logic;

constructing a data extraction function through the data extraction logic to obtain a corresponding target data extraction function;

and extracting index data of the initial potential vector through the target data extraction function to obtain an index data set of the initial potential vector.

With reference to the first aspect, in a fourth implementation manner of the first aspect of the present invention, the performing data state analysis on the index data set, determining a target data state, and performing transaction policy analysis on the target data state to obtain an initial transaction policy includes:

constructing a data line graph for the index data set to obtain a data line graph corresponding to the index data set;

carrying out data fluctuation analysis on the data line graph to obtain a data fluctuation trend, and simultaneously carrying out statistical feature analysis on the index data set to obtain corresponding statistical feature data;

carrying out data distribution analysis on the statistical characteristic data to determine data distribution information corresponding to the index data set;

based on the data fluctuation trend and the data distribution information, carrying out data state analysis on the index data set to obtain the target data state;

and carrying out transaction policy analysis on the target data state to obtain an initial transaction policy.

With reference to the first aspect, in a fifth implementation manner of the first aspect of the present invention, the predicting, by using a reinforcement learning algorithm, the user behavior of the initial transaction policy to obtain an initial user behavior, and performing a value function construction by using the initial user behavior to obtain a target value function, where the method includes:

performing state dimension analysis on the financial feature set and the basic surface data to obtain an initial state dimension;

performing dimension updating on the initial state dimension through a sparse representation algorithm to obtain a target state dimension;

constructing an action space based on the target state dimension to obtain a target action space;

based on the target action space, predicting the user behavior of the initial transaction strategy to obtain the initial user behavior;

and constructing a value function through the initial user behavior to obtain the target value function.

With reference to the first aspect, in a sixth implementation manner of the first aspect of the present invention, resampling the initial potential vector by the target value function to obtain a target potential vector, inputting the target potential vector into a preset decoder to perform data reconstruction to obtain a target data sample, and performing policy update on the initial transaction policy by using the target data sample to obtain a target transaction policy, where the method includes:

carrying out sampling parameter analysis on the initial potential vector through a value function approximation algorithm to obtain a sampling parameter set, wherein the sampling parameter set comprises the number of sampling times and the sampling step length;

resampling the initial potential vector through the target value function based on the sampling parameter set to obtain a target potential vector;

inputting the target potential vector into the decoder to perform data back propagation calculation through a back propagation algorithm to obtain the target data sample;

performing policy parameter matching on the target data sample to obtain a policy parameter set corresponding to the target data sample;

and carrying out strategy updating on the initial transaction strategy through the strategy parameter set to obtain the target transaction strategy.

The second aspect of the present invention provides a deep learning-based decision evaluation system, comprising:

resampling the initial potential vector through the target value function to obtain a target potential vector, inputting the target potential vector into a preset decoder for data reconstruction to obtain a target data sample, and carrying out strategy updating on the initial transaction strategy through the target data sample to obtain a target transaction strategy.

A third aspect of the present invention provides a decision evaluation device based on deep learning, comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the deep learning based decision evaluation device to perform the deep learning based decision evaluation method described above.

A fourth aspect of the present invention provides a computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the deep learning based decision evaluation method described above.

According to the technical scheme provided by the invention, standardizing the historical financial data brings different types of financial data into the same numerical range, which benefits the stability and convergence of subsequent feature extraction and model training. Extracting financial features such as the moving average line, the relative strength index and the Bollinger Bands captures the trend and fluctuation of the financial market. Extracting basic surface data from the historical financial data through the financial feature set yields richer company basic surface information. Mapping the financial feature set and the target basic surface data to potential vectors through the preset encoder converts the original data into a potential representation, so that the important features of the data are extracted. Extracting index data from the initial potential vector obtains information about data states and market characteristics from the potential space and provides more comprehensive guidance for the subsequent transaction strategy analysis. Analyzing the data state of the index data set reveals the state of the data and the market trend, so that the direction and target of the transaction strategy can be determined. Together, these steps further improve the efficiency and accuracy of deep-learning-based decision evaluation.

Drawings

FIG. 1 is a schematic diagram of an embodiment of a decision evaluation method based on deep learning in an embodiment of the present invention;

FIG. 2 is a flow chart of potential vector mapping by an encoder in an embodiment of the present invention;

FIG. 3 is a flowchart of extracting index data from initial potential vectors according to an embodiment of the present invention;

FIG. 4 is a flow chart of data state analysis of an index dataset according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of an embodiment of a deep learning based decision evaluation system in accordance with an embodiment of the present invention;

FIG. 6 is a schematic diagram of an embodiment of a decision evaluation device based on deep learning in an embodiment of the present invention.

Detailed Description

The embodiment of the invention provides a decision evaluation method and a decision evaluation system based on deep learning, which are used for improving the accuracy of the decision evaluation based on the deep learning.

The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.

For ease of understanding, a specific flow of an embodiment of the present invention is described below with reference to fig. 1, and one embodiment of a decision evaluation method based on deep learning in an embodiment of the present invention includes:

S101, acquiring historical financial data, and carrying out standardized processing on the historical financial data to obtain standardized financial data;

It will be appreciated that the execution subject of the present invention may be a decision evaluation system based on deep learning, or may be a terminal or a server, which is not limited herein. The embodiment of the invention is described with a server as the execution subject.

Specifically, the server selects an appropriate financial data provider or data source, taking into account factors such as data coverage, data quality, update frequency and cost. Using an appropriate tool or programming language (e.g., Python or R), the server obtains the financial data from the selected data source through an API or a web crawler. For example, the server uses the provider's API to obtain historical price and trading volume data for a stock. After acquisition, the data is typically cleaned, which includes removing invalid records and handling missing values and outliers. This step ensures the quality and accuracy of the data. The server then standardizes the data. Standardization converts data of different scales and magnitudes into the same range to facilitate subsequent analysis and modeling. Common methods include Z-score normalization and min-max normalization. Z-score normalization converts the data so that it has a mean of 0 and a standard deviation of 1, while min-max normalization rescales the data into a specified range. Through standardization, the server eliminates dimensional differences between different data, so that the data can be compared and analyzed on the same scale. For example, assume that the server obtains historical price data for a stock using the provider's API. The raw data are as follows:

The server calculates the mean and standard deviation:

The server performs Z-score normalization:

Through such processing, the server converts the original stock price data into standardized data having a mean of 0 and a standard deviation of 1.
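The two normalization methods named in step S101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the price list is hypothetical sample data, the function names `z_score` and `min_max` are chosen here for illustration, and the population standard deviation is assumed because the text does not say which variant is used.

```python
from statistics import fmean, pstdev


def z_score(values):
    """Rescale values to mean 0 and population standard deviation 1."""
    mean = fmean(values)
    std = pstdev(values)  # assumption: population standard deviation
    if std == 0:
        # A constant series carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def min_max(values, low=0.0, high=1.0):
    """Rescale values linearly into the range [low, high]."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [low for _ in values]
    scale = (high - low) / (largest - smallest)
    return [low + (v - smallest) * scale for v in values]


# Hypothetical closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 107.0]
standardized = z_score(prices)
scaled = min_max(prices)
```

After Z-score normalization the series has mean 0 and standard deviation 1 regardless of the original price level, which is what lets features measured in different units be compared on one scale.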

S102, carrying out feature extraction on the standardized financial data to obtain a financial feature set, wherein the financial feature set comprises a moving average line, a relative strength index and Bollinger Bands, and carrying out basic surface data extraction on the historical financial data through the financial feature set to obtain target basic surface data;

Specifically, the standardized financial data is divided into a plurality of time windows. These time windows are screened to select the target time windows, and the moving average within each target time window is calculated. The moving average smooths the stock price and reflects the price trend. Based on the moving average line, a relative strength index is calculated for comparing the relative performance of different target assets. Bollinger Band analysis is then performed using the moving average line and the relative strength index to determine the fluctuation interval and trend of the price. In this way, the server obtains a series of financial features. The server performs data weight analysis on the financial features and determines the weight of each feature according to actual application requirements and statistical analysis. These weights determine how much each feature contributes to the final decision. The moving average line, the relative strength index and the Bollinger Bands are then fused by weighting with the weight data set to obtain the fusion features. Because the fusion features take the weights of the different features into account, they describe the state and trend of the financial market more comprehensively. Based on the fusion features, the server extracts basic surface data from the historical financial data to obtain the target basic surface data. These basic surface data include information such as the long-term trend and volatility of the stock. Based on this data, the server performs further decision making and transaction strategy formulation. For example:

Suppose that the server has historical price data for a stock as follows:

The server divides the data into time windows of 5 trading days each, obtaining the following time windows:

Assuming that the server selects the target time windows W2, W4 and W6, it calculates the moving average within each target time window, with the following results:

Based on the moving average line, the server calculates the relative strength index, assuming a reference price of 120, and obtains the following results:

The server uses the moving average line and the relative strength index to perform the Bollinger Band analysis; assuming a standard deviation of 5, the following results are obtained:

The server performs data weight analysis on the moving average line, the relative strength index and the Bollinger Bands, assuming weights of 0.4, 0.3 and 0.3 respectively. Weighted fusion of the three features through the weight data set gives the following results:

The server extracts basic surface data from the historical financial data based on the fusion features and obtains the following results:

These basic surface data carry the information of the fusion features and help the server better understand the long-term trend and volatility of the stock, so that more informed investment decisions can be made.
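The feature pipeline of step S102 can be sketched as follows. Note the assumptions: the indicators use their common textbook definitions (simple moving average, a relative strength index built from average gains and losses, and Bollinger Bands at k standard deviations around the moving average), which may differ from the reference-price calculation described in the example above. The price series, the rescaling of each feature before fusion, and all function names are hypothetical; only the weights 0.4, 0.3 and 0.3 come from the text.

```python
from statistics import fmean, pstdev


def moving_average(prices, window):
    """Simple moving average, one value per complete window."""
    return [fmean(prices[i - window + 1:i + 1])
            for i in range(window - 1, len(prices))]


def rsi(prices, period):
    """Relative strength index over the last `period` price changes."""
    changes = [b - a for a, b in zip(prices[:-1], prices[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def bollinger(prices, window, k=2.0):
    """(lower, middle, upper) bands for the most recent window."""
    recent = prices[-window:]
    middle = fmean(recent)
    spread = k * pstdev(recent)
    return middle - spread, middle, middle + spread


def fuse(features, weights):
    """Weighted sum of features that were already brought to similar scales."""
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical closing prices, for illustration only.
prices = [118.0, 120.0, 119.0, 122.0, 121.0, 124.0, 123.0, 126.0, 125.0, 128.0]
ma = moving_average(prices, 5)
index = rsi(prices, 5)
lower, middle, upper = bollinger(prices, 5)

# Assumed rescaling so the three features are comparable before fusion.
band_position = (prices[-1] - lower) / (upper - lower)
fused = fuse([ma[-1] / prices[-1], index / 100.0, band_position], [0.4, 0.3, 0.3])
```

Because the three features live on different scales (price units, a 0 to 100 index, and a price band), some rescaling before the weighted sum is needed for the weights to be meaningful; the particular rescaling used here is only one possible choice.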

S103, inputting the financial feature set and the target basic surface data into a preset encoder for potential vector mapping, outputting an initial potential vector, and extracting index data of the initial potential vector to obtain an index data set of the initial potential vector;

Specifically, the server performs data conversion on the financial feature set and the target basic surface data and combines them into target tensor data. The target tensor data contains the information of all features and serves as the input of the deep learning model. The server inputs the target tensor data into the preset encoder for forward propagation. The encoder is a neural network model that maps the input data into a potential (latent) vector space, and its forward pass yields the candidate potential vectors. The server then extracts the vector mean and the vector standard deviation of the candidate potential vectors, so that useful information is obtained from their distribution: the candidate mean is obtained by averaging over all samples of the potential vector, and the candidate standard deviation is obtained by computing the standard deviation of the potential vector. Based on the candidate mean and the candidate standard deviation, the server performs random sampling on the candidate potential vectors to obtain the initial potential vector. Sampling introduces a degree of randomness, which makes the potential vectors more diverse and improves the generalization ability of the model. Finally, index data is extracted from the initial potential vector to obtain its index data set. The index data set may contain statistics of the potential vector, such as the maximum, minimum and average, as well as other indices designed for the specific requirements.
For example, assume that the server has the following financial feature set and target basic surface data: financial feature set = [feature 1, feature 2, feature 3, feature 4, feature 5], target basic surface data = [basic surface data 1, basic surface data 2, basic surface data 3]. The server performs data conversion on these data to obtain the target tensor data: target tensor data = [feature 1, feature 2, feature 3, feature 4, feature 5, basic surface data 1, basic surface data 2, basic surface data 3]. The server inputs the target tensor data into the preset encoder for forward propagation and obtains the candidate potential vector: candidate potential vector = [0.2, -0.4, 0.6, -0.3, 0.8]. The server extracts the vector mean and the vector standard deviation of the candidate potential vector: candidate mean = 0.18, candidate standard deviation ≈ 0.47 (population standard deviation). Based on the candidate mean and the candidate standard deviation, the server performs random sampling to obtain the initial potential vector: initial potential vector = 0.15. Index data is then extracted from the initial potential vector to obtain its index data set: index data set of the initial potential vector = [0.15].
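The mean, standard deviation and random sampling steps of S103 resemble the sampling used in variational autoencoders, where a sample is drawn as mean plus standard deviation times Gaussian noise. The sketch below is a hedged illustration of that idea only: the encoder itself is omitted, the candidate vector is the one from the example above, the function name and the fixed seed are hypothetical, and drawing one sample per element is an assumption (the example in the text reports a single value).

```python
import random
from statistics import fmean, pstdev


def sample_initial_vector(candidate, rng):
    """Draw a vector around the candidate's mean using its standard deviation."""
    mean = fmean(candidate)
    std = pstdev(candidate)  # assumption: population standard deviation
    # One Gaussian draw per element: z = mean + std * eps, with eps ~ N(0, 1).
    sampled = [mean + std * rng.gauss(0.0, 1.0) for _ in candidate]
    return sampled, mean, std


candidate = [0.2, -0.4, 0.6, -0.3, 0.8]  # candidate potential vector from the example
rng = random.Random(0)                   # fixed seed so the sketch is repeatable
initial, mean, std = sample_initial_vector(candidate, rng)
```

For this candidate vector the mean is 0.18 and the population standard deviation evaluates to about 0.47. Each call with a different random state yields a different initial potential vector, which is the source of the diversity the step relies on.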

The server acquires a preset index data type. The index data type is a predefined set of index names or index IDs that identify the specific indices to be extracted from the initial potential vector, for example "maximum", "minimum" and "average". The server then performs data extraction logic analysis through the index data type, designing the corresponding data extraction logic according to the definition of each index data type. This logic may locate elements at particular positions in the potential vector or perform a series of mathematical calculations to obtain the index data. The design of the data extraction logic takes the diversity of index data types into account, so that every required index can be extracted. The server builds a data extraction function from the data extraction logic. The data extraction function is a program or method that extracts the corresponding index data from the initial potential vector for a given index data type; constructing it means converting the data extraction logic obtained by the preceding analysis into an actual code implementation. The data extraction function can be an independent function or part of the deep learning model, depending on the actual application scenario. In this way the server obtains the target data extraction function, which may be used during the model training or prediction phase to extract index data from the initial potential vector.
For example, assume that the server has a deep learning based financial model that generates an initial potential vector, expressed as: initial potential vector = [0.2, -0.4, 0.6, -0.3, 0.8]. The server predefines some index data types, including "maximum" and "average". For the "maximum" type, the server finds the largest element in the potential vector, i.e., 0.8. For the "average" type, the server adds all elements of the potential vector and divides by the number of elements (5), obtaining an average of 0.18. In this embodiment, the server extracts the "maximum" and "average" from the initial potential vector according to the predefined index data types. In practice, the server may define more index data types and design corresponding data extraction logic to extract more useful information from the potential vector. The extracted index data may be used for subsequent decision making and transaction strategy formulation.
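One way to turn index data types into a data extraction function is a lookup table from type name to extraction logic. The sketch below is an illustration under assumptions: the table `EXTRACTORS` and the function `build_extraction_function` are hypothetical names, and only the three index types mentioned in the text are included.

```python
from statistics import fmean

# Hypothetical mapping from index data type to its extraction logic.
EXTRACTORS = {
    "maximum": max,
    "minimum": min,
    "average": fmean,
}


def build_extraction_function(index_types):
    """Return a function that extracts the requested indices from a vector."""
    unknown = [t for t in index_types if t not in EXTRACTORS]
    if unknown:
        raise ValueError(f"unsupported index data types: {unknown}")

    def extract(vector):
        return {t: EXTRACTORS[t](vector) for t in index_types}

    return extract


extract = build_extraction_function(["maximum", "average"])
index_data = extract([0.2, -0.4, 0.6, -0.3, 0.8])
```

With the vector from the example, `index_data["maximum"]` is 0.8 and `index_data["average"]` is 0.18 up to floating-point rounding. Adding a new index data type only requires a new entry in the table.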

S104, carrying out data state analysis on the index data set, determining a target data state, and carrying out transaction strategy analysis on the target data state to obtain an initial transaction strategy;

Specifically, the server builds a data line graph for the index data set. A line graph is a common form of data visualization that shows how an index changes over time. By drawing a line graph of the index data set, the server can see the fluctuation and the trend of the index directly. The server performs data fluctuation analysis on the line graph to obtain the data fluctuation trend, which indicates the degree and period of fluctuation of the index data and thus the stability and risk of the index. At the same time, the server performs statistical feature analysis on the index data set to obtain the corresponding statistical feature data, including the mean, variance, skewness and kurtosis of the index data. These statistics describe the distribution and dispersion of the index data. The server performs data distribution analysis on the statistical feature data and determines the data distribution information corresponding to the index data set, for example whether the index data follows a normal distribution or a skewed distribution. Based on the data fluctuation trend and the data distribution information, the server performs data state analysis on the index data set to obtain the target data state. The target data state summarizes the overall condition of the index data set, for example stable, strongly fluctuating, rising or falling. The server then performs transaction strategy analysis on the target data state to obtain the initial transaction strategy. The transaction strategy is formulated according to the target data state: for example, a conservative strategy is adopted in a stable stage, risk control measures are taken when fluctuation is large, and positions are increased in time when the trend is upward.
For example, assume that the server has an index data set for a stock that contains the closing prices of the past several months. The server first draws a line graph of the closing price data and finds that the stock price shows a rising trend. The server performs data fluctuation analysis and finds that the fluctuation range of the closing prices is gradually increasing while the overall trend remains upward. The server performs statistical feature analysis to obtain the mean, variance, skewness and other statistics of the closing prices. The mean gradually increases and the variance gradually widens, indicating that the rise of the stock price is accelerating but is accompanied by greater volatility. The server performs data distribution analysis on the statistical feature data and finds that the closing prices are positively skewed, indicating that large upward moves are more likely. Based on the data fluctuation trend and the data distribution information, the server performs data state analysis and concludes that the stock is in a rising trend with large fluctuation. The server formulates a transaction strategy based on this data state. Because the stock is in a rising trend, the server adopts a strategy of increasing the position to obtain a higher return. In view of the large volatility, however, the server also sets risk control measures, such as a stop-loss level, to prevent large losses.
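The analysis chain of step S104 (trend, statistical features, data state, initial strategy) can be sketched as follows. This is a minimal illustration under assumptions: the closing prices are hypothetical, the trend is estimated with a least-squares slope, and the thresholds `flat_tol` and `vol_threshold` as well as the strategy labels are arbitrary choices made here, not values from the patent.

```python
from statistics import fmean, pstdev


def describe(series):
    """Mean, standard deviation, skewness and excess kurtosis of a series."""
    mean, std = fmean(series), pstdev(series)
    if std == 0:
        return {"mean": mean, "std": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    z = [(x - mean) / std for x in series]
    return {
        "mean": mean,
        "std": std,
        "skewness": fmean([v ** 3 for v in z]),
        "kurtosis": fmean([v ** 4 for v in z]) - 3.0,
    }


def slope(series):
    """Least-squares slope of the series against its index (needs >= 2 points)."""
    n = len(series)
    x_mean, y_mean = (n - 1) / 2, fmean(series)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(series))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def data_state(series, flat_tol=0.01, vol_threshold=0.05):
    """Classify the series as (trend, volatility); thresholds are assumptions."""
    stats = describe(series)
    level = abs(stats["mean"]) or 1.0  # avoid dividing by a zero mean
    rel_slope = slope(series) / level
    if rel_slope > flat_tol:
        trend = "rising"
    elif rel_slope < -flat_tol:
        trend = "falling"
    else:
        trend = "flat"
    volatility = "high" if stats["std"] / level > vol_threshold else "low"
    return trend, volatility


def initial_strategy(trend, volatility):
    """Map a data state to a hypothetical initial transaction strategy."""
    action = {"rising": "increase position",
              "falling": "reduce position",
              "flat": "hold"}[trend]
    return {"action": action, "use_stop_loss": volatility == "high"}


# Hypothetical closing prices: rising, with widening swings.
closes = [100.0, 103.0, 101.0, 108.0, 106.0, 115.0, 112.0, 124.0]
stats = describe(closes)
state = data_state(closes)
strategy = initial_strategy(*state)
```

For this sample series the state is rising with high volatility, so the sketch proposes increasing the position while enabling a stop-loss, mirroring the reasoning of the example in the text.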

S105, predicting user behaviors of the initial transaction strategy through a reinforcement learning algorithm to obtain initial user behaviors, and constructing a value function through the initial user behaviors to obtain a target value function;

Specifically, the server performs state dimension analysis on the financial feature set and the basic surface data to obtain an initial state dimension. The state dimension is an abstract description of the state of the financial market and may include multiple dimensions such as stock price, trading volume and market rate. By analyzing the historical data of these dimensions, the server obtains the initial state dimension. The server then performs dimension updating on the initial state dimension through a sparse representation algorithm to obtain a target state dimension. A sparse representation algorithm performs dimension reduction or feature selection: it extracts the most important information from the original high-dimensional state and discards redundant information, so the server obtains a more compact and meaningful target state dimension. Based on the target state dimension, the server performs action space construction to obtain a target action space. The action space is the set of all actions that a transaction strategy can take, such as buy, sell and hold. The server defines the target action space according to the target state dimension and the transaction rules. Based on the target action space, the server performs user behavior prediction on the initial transaction strategy, that is, it uses a reinforcement learning algorithm to predict the action the user will take given the current state and the action space. This process simulates the decision-making behavior of the user in the financial market and thereby evaluates the effectiveness of the transaction strategy. From the predicted user behavior, the server constructs a value function to obtain a target value function. The value function is the core concept of reinforcement learning and evaluates the value of taking different actions in a given state.
Through the value function, the server evaluates and optimizes the transaction strategy to find the optimal action in each state. For example, assume that the server uses a reinforcement learning algorithm to optimize a stock trading strategy. The server first performs feature extraction on the historical financial data to obtain a financial feature set and basic surface data, and performs state dimension analysis on these features to obtain an initial state dimension. The server then performs dimension updating on the initial state dimension by using a sparse representation algorithm to obtain a target state dimension. Through the sparse representation algorithm, the server screens the most important dimensions out of the original high-dimensional state; for example, stock price, market rate and trading volume are selected as the target state dimensions. Based on the target state dimension, the server defines an action space, for example allowing the transaction strategy to buy, sell or hold. The server uses a reinforcement learning algorithm to predict user behavior. In each state, the server predicts which action the user will take based on the current state and the action space, for example that the user will choose to buy in a certain state. From the predicted user behavior, the server constructs a value function to obtain a target value function. The value function evaluates the effectiveness of the transaction strategy from the outcome of the user's actions, for example whether the user earns a return after buying.
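The text does not name a specific reinforcement learning algorithm. As one possible illustration, the sketch below uses tabular Q-learning, which is an assumption made here. The two-value state ("up" or "down", taken from the last price move), the reward definition and all hyperparameters are likewise hypothetical and far simpler than the state dimensions described above.

```python
import random

ACTIONS = ("buy", "sell", "hold")

def q_learning(prices, episodes=200, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a state-action value table Q[(state, action)] from a price series."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in ("up", "down") for a in ACTIONS}

    def state_at(t):
        # Hypothetical state: direction of the most recent price move.
        return "up" if prices[t] >= prices[t - 1] else "down"

    for _ in range(episodes):
        for t in range(1, len(prices) - 1):
            s = state_at(t)
            if rng.random() < epsilon:          # explore
                a = rng.choice(ACTIONS)
            else:                               # exploit the current estimate
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            change = prices[t + 1] - prices[t]
            reward = {"buy": change, "sell": -change, "hold": 0.0}[a]
            s_next = state_at(t + 1)
            best_next = max(q[(s_next, act)] for act in ACTIONS)
            # Standard Q-learning update toward reward + discounted best next value.
            q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
    return q

def predicted_action(q, state):
    """Action with the highest learned value in the given state."""
    return max(ACTIONS, key=lambda act: q[(state, act)])
```

On a steadily rising price series the learned table assigns the highest value to `buy` in the `up` state, which corresponds to predicting that the user will choose to buy in that state.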

S106, resampling the initial potential vector through a target value function to obtain a target potential vector, inputting the target potential vector into a preset decoder to reconstruct data to obtain a target data sample, and carrying out strategy updating on the initial transaction strategy through the target data sample to obtain a target transaction strategy.

The server uses a value function approximation algorithm to perform sampling parameter analysis on the initial potential vector and obtain a sampling parameter set. The sampling parameter set includes a number of samples and a sampling step size. These parameters guide the resampling process and determine how often and how far the vector is moved. Based on the sampling parameter set, the server resamples the initial potential vector through the target value function to obtain the target potential vector. Resampling means sampling repeatedly with the given step size so that the potential vector gradually approaches an optimal value under the guidance of the value function. Through resampling, the server obtains a better potential vector for optimizing the transaction strategy. The server then inputs the target potential vector into a preset decoder for data reconstruction to obtain a target data sample. The decoder is a neural network that maps potential vectors back into the original data space. The server obtains the reconstructed target data sample through the back propagation calculation of the decoder; the sample is generated from the target potential vector and reflects the data conditions corresponding to the optimized transaction strategy. The server performs strategy parameter matching on the target data sample to obtain the strategy parameter set corresponding to the target data sample. The strategy parameter set contains the specific parameter values required by the optimized transaction strategy, such as a buying threshold and a selling threshold. The server then updates the initial transaction strategy with the strategy parameter set to obtain the target transaction strategy. By applying the optimized strategy parameters to the initial transaction strategy, the server obtains an optimized target transaction strategy.
The optimization process is implemented by a deep learning model and a value function approximation algorithm to improve the effectiveness and stability of the transaction strategy. For example, suppose that the server has a deep-learning-based transaction strategy model with an initial potential vector of [0.2, -0.4, 0.6, -0.3, 0.8]. Using the value function approximation algorithm, the server sets the sampling parameter set to 10 samples with a sampling step size of 0.1 and resamples the initial potential vector accordingly. After 10 samples, the server obtains the target potential vector [0.21, -0.43, 0.59, -0.31, 0.79]. The target potential vector is input into the decoder for data reconstruction. The decoder maps the target potential vector back into the original data space through back propagation calculation and produces a target data sample, for example [100.21, 200.34, 150.59, 180.21, 250.79]. Strategy parameter matching is performed on the target data sample to obtain a strategy parameter set, for example a buying threshold of 100 and a selling threshold of 250. The initial transaction strategy is then updated with the strategy parameter set, so that the server obtains an optimized target transaction strategy with a buying threshold of 100 and a selling threshold of 250.
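The text does not give a resampling formula. One way to read "resampling under the guidance of a value function" is a simple stochastic search: perturb the potential (latent) vector by at most the sampling step size and keep a candidate only if the value function rates it higher. The sketch below follows that reading; `resample` and the example value function are assumptions made here, not the method of the invention.

```python
import random

def resample(latent, value_fn, num_samples=10, step=0.1, seed=0):
    """Move the latent vector toward higher value in at most num_samples steps."""
    rng = random.Random(seed)
    best = list(latent)
    best_value = value_fn(best)
    for _ in range(num_samples):
        # Propose a candidate within +/- step of the current best vector.
        candidate = [z + rng.uniform(-step, step) for z in best]
        value = value_fn(candidate)
        if value > best_value:  # accept only improvements
            best, best_value = candidate, value
    return best

def example_value(z):
    """Hypothetical value function used only for illustration."""
    return -sum(x * x for x in z)

initial = [0.2, -0.4, 0.6, -0.3, 0.8]
target = resample(initial, example_value, num_samples=10, step=0.1)
```

Because only improving candidates are accepted, the value of `target` is never lower than that of `initial`, and each component moves by at most `num_samples * step` in total.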

In the embodiment of the invention, standardizing the historical financial data brings different types of financial data into the same numerical range, which benefits the stability and convergence of subsequent feature extraction and model training. Extracting financial features such as the moving average line, the relative strength index and the Bollinger Bands captures the trend and fluctuation of the financial market. Extracting basic surface data from the historical financial data through the financial feature set yields richer basic surface information about the company. Performing potential vector mapping on the financial feature set and the target basic surface data through a preset encoder converts the original data into a potential representation and thereby extracts the important features of the data. Extracting index data from the initial potential vector obtains information about the data state and market characteristics from the potential space and provides more comprehensive guidance for the subsequent transaction strategy analysis. Analyzing the data state of the index data set reveals the state of the data and the market trend and determines the direction and target of the transaction strategy. Together, these steps further improve the efficiency and accuracy of the decision evaluation based on deep learning.

In a specific embodiment, the process of executing step S102 may specifically include the following steps:

(1) Performing time window division on the standardized financial data to obtain a plurality of time windows corresponding to the standardized financial data;

(2) Screening the multiple time windows to obtain a target time window, and calculating a moving average value through the target time window to obtain a moving average value set;

(3) Constructing an average line through the moving average value set to obtain a moving average line;

(4) Calculating index data of the standardized financial data based on the moving average line to obtain a relative strength index;

(5) Based on the moving average line and the relative strength index, performing Bollinger Band analysis on the standardized financial data to determine the Bollinger Bands;

(6) Carrying out data weight analysis on the moving average line, the relative strength index and the Bollinger Bands to determine a weight data set;

(7) Carrying out data weighted fusion on the moving average line, the relative strength index and the Bollinger Bands through the weight data set to obtain fusion characteristics;

(8) And extracting basic surface data from the historical financial data based on the fusion characteristics to obtain target basic surface data.

Specifically, the server performs time window division on the standardized financial data to obtain a plurality of time windows. A time window is a segment obtained by dividing the financial data at a fixed time interval, such as daily, weekly or monthly. The server screens a target time window out of these time windows according to the specific requirement, for example selecting the windows of the most recent year. The server calculates moving averages over the target time window to obtain a moving average value set. A moving average is the mean of the data within a time window and is used to smooth fluctuations in the data. After obtaining the moving average value set, the server constructs an average line by plotting the moving averages as a curve, called the moving average line, which makes the long-term trend of the data easier to observe. The server calculates index data of the standardized financial data based on the moving average line to obtain a relative strength index, a common technical indicator that measures the strength of price movements. After obtaining the moving average line and the relative strength index, the server performs Bollinger Band analysis to determine the Bollinger Bands. Bollinger Bands are a technical indicator of price volatility and consist of a middle band (the moving average line), an upper band and a lower band. The server then performs data weight analysis on the moving average line, the relative strength index and the Bollinger Bands to determine a weight data set; the weights reflect the importance of each indicator to the decision. Using the weight data set, the server performs data weighted fusion on the moving average line, the relative strength index and the Bollinger Bands to obtain fusion characteristics.
The fusion characteristics integrate the effects of multiple indicators and therefore characterize the financial data more fully. Based on the fusion characteristics, the server extracts basic surface data from the historical financial data to obtain target basic surface data. Basic surface data, such as profitability, are important factors for evaluating the value of a financial asset. For example, assume that a server processes the historical price data of a stock and extracts basic surface data to make a transaction decision. The server divides the price data into time windows of 5 days each and screens out the windows of the most recent year as the target time window. The moving average of the stock price over each time window is calculated, and these values are plotted as a moving average line. The relative strength index is calculated from the moving average line to measure the strength of the stock price. Bollinger Band analysis is then performed according to the moving average line and the relative strength index to obtain the upper band, the middle band and the lower band of the Bollinger Bands. The server determines the weights of the different indicators in the model according to the data weight analysis, for example a weight of 0.4 for the moving average line and 0.6 for the relative strength index. Using the weight data set, the server performs weighted fusion on the moving average line and the relative strength index to obtain the fusion characteristics. Based on the fusion characteristics, the server extracts the basic surface data of the stock, for example determining whether the stock is overbought or oversold, and thereby determines when to buy or sell.
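The indicator steps above can be sketched as follows. The sketch uses the common textbook definitions (simple moving average, a simple-average relative strength index rather than Wilder's smoothing, and Bollinger Bands at `k` standard deviations around the moving average); the text itself does not give formulas, so these definitions, the function names and the `fuse` helper are assumptions. `fuse` also assumes that its input series are aligned and already on comparable scales.

```python
import statistics

def moving_average(values, window):
    """Simple moving average over a sliding window."""
    return [statistics.fmean(values[i - window + 1:i + 1])
            for i in range(window - 1, len(values))]

def rsi(values, window):
    """Relative strength index from summed gains and losses over the last `window` changes."""
    out = []
    for i in range(window, len(values)):
        changes = [values[j] - values[j - 1] for j in range(i - window + 1, i + 1)]
        gain = sum(c for c in changes if c > 0)
        loss = -sum(c for c in changes if c < 0)
        out.append(100.0 if loss == 0 else 100.0 - 100.0 / (1.0 + gain / loss))
    return out

def bollinger(values, window, k=2.0):
    """(lower, middle, upper) bands; the middle band is the moving average."""
    bands = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1:i + 1]
        mid = statistics.fmean(chunk)
        sd = statistics.pstdev(chunk)
        bands.append((mid - k * sd, mid, mid + k * sd))
    return bands

def fuse(series_list, weights):
    """Weighted sum of aligned, equally long indicator series."""
    return [sum(w * s[t] for w, s in zip(weights, series_list))
            for t in range(len(series_list[0]))]
```

For instance, `fuse([ma_scaled, rsi_scaled], [0.4, 0.6])` would reproduce the 0.4 / 0.6 weighting from the example, where `ma_scaled` and `rsi_scaled` stand for hypothetical, already normalized indicator series.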

In a specific embodiment, as shown in fig. 2, the process of performing step S103 may specifically include the following steps:

S201, performing data conversion on the financial feature set and the target basic surface data to obtain target tensor data;

S202, inputting target tensor data into an encoder for forward propagation to obtain candidate potential vectors;

S203, extracting vector average values of candidate potential vectors to obtain candidate average values, and extracting vector standard deviation of the candidate potential vectors to obtain candidate standard deviation;

S204, carrying out random sampling processing on candidate potential vectors based on the candidate mean value and the candidate standard deviation to obtain initial potential vectors;

S205, extracting index data of the initial potential vector to obtain an index data set of the initial potential vector.

Specifically, the server first converts the financial feature set and the target basic surface data into target tensor data. The target tensor data is a multi-dimensional array that organizes the data in a fixed format so that it can be input to a neural network. For example, assume that the server has a financial feature set consisting of moving average line data, relative strength index data and Bollinger Band data, together with target basic surface data. The server combines these data in a fixed format to obtain the target tensor data, in which each dimension represents one feature or one item of basic surface data and each element is the specific value of that feature or item. The server inputs the target tensor data into the encoder for forward propagation to obtain candidate potential vectors. The encoder is a neural network model that maps input data into a potential (latent) vector space, typically a low-dimensional space, so that the potential vector is an abstract representation of the input data. After obtaining the candidate potential vectors, the server performs vector mean extraction and vector standard deviation extraction on them in order to describe their distribution: the mean represents the center of the distribution and the standard deviation represents its dispersion. The server then performs random sampling processing on the candidate potential vectors based on the candidate mean and the candidate standard deviation to obtain initial potential vectors. This random sampling can increase the robustness and diversity of the model. Finally, the server extracts index data from the initial potential vector to obtain the index data set of the initial potential vector.
This can be understood as extracting and organizing the information in the initial potential vector for subsequent data analysis and decision evaluation. For example, assume that the server uses a financial data set with several features, including the moving average line data, relative strength index data and Bollinger Band data of a stock. The server also has some basic surface data, such as the yield, market rate and market value of the stock. The server combines these data into target tensor data, in which each dimension corresponds to one feature or one item of basic surface data and each element is the corresponding value. For example, the target tensor data is a two-dimensional array of shape (batch_size, num_features), where batch_size is the size of the data batch and num_features is the number of features and basic surface data items. The server inputs the target tensor data into a pre-trained encoder for forward propagation to obtain candidate potential vectors. The encoder performs feature extraction and conversion on each data sample to obtain its potential vector representation. The server then extracts the vector mean and the vector standard deviation from the candidate potential vectors. Let the candidate potential vectors have shape (batch_size, latent_dim), where latent_dim is the dimension of the potential vector. The server calculates the mean and standard deviation of all potential vectors in the whole batch. Based on the candidate mean and the candidate standard deviation, the server performs random sampling processing on the candidate potential vectors to obtain initial potential vectors. Specifically, the server samples from a Gaussian distribution so that the sampled vectors as a whole conform to the distribution of the original potential vectors. The server then performs index data extraction on the initial potential vector.
This includes calculating certain statistical features in the potential vector, such as mean, variance, or maximum, etc., for subsequent decision making and policy making.
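Steps S202 to S205 can be sketched as follows. The encoder here is a toy linear map with hand-written weights, used only so the example runs without a deep learning framework; the real encoder is a trained neural network. All function names, the weight layout and the choice of extracted statistics are assumptions made for illustration.

```python
import random
import statistics

def encode(batch, weights):
    """Toy linear encoder: (batch_size, num_features) -> (batch_size, latent_dim).

    `weights` holds one list of num_features weights per latent dimension.
    """
    return [[sum(x * w for x, w in zip(row, unit)) for unit in weights]
            for row in batch]

def sample_initial_latent(candidates, seed=0):
    """Draw one latent vector from a Gaussian fitted per dimension over the batch."""
    rng = random.Random(seed)
    columns = list(zip(*candidates))            # one tuple per latent dimension
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    latent = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return latent, means, stds

def index_data(latent):
    """Index data set: simple statistics of the sampled latent vector."""
    return {
        "mean": statistics.fmean(latent),
        "variance": statistics.pvariance(latent),
        "max": max(latent),
    }
```

A batch of shape (2, 3) with two weight rows yields candidate potential vectors of shape (2, 2); the sampled initial potential vector then has `latent_dim = 2` entries.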

In a specific embodiment, as shown in fig. 3, the process of executing step S205 may specifically include the following steps:

S301, acquiring a preset index data type, and performing data extraction logic analysis on data through the index data type to obtain data extraction logic;

S302, constructing a data extraction function through data extraction logic to obtain a corresponding target data extraction function;

S303, extracting index data of the initial potential vector through a target data extraction function to obtain an index data set of the initial potential vector.

In particular, the server defines preset index data types, which may include different data types such as numbers, text, dates, etc. For example, the number type includes numerical data such as stock prices, transaction amounts, etc., the text type includes descriptive data such as company names, industry classifications, etc., and the date type includes data such as transaction dates. After the index data type is acquired, the server performs a data extraction logic analysis. And determining the data extraction rule and the processing mode of each index data type. For example, for digital type data, the server selects a simple extraction method, such as direct numeric value; for the data of the text type, text processing such as word segmentation, word vectorization and the like is carried out; for date type data, time series processing such as extracting data of a specific time period is performed. And constructing a corresponding data extraction function by the server according to the result of the data extraction logic analysis. The data extraction function is a function that processes raw data according to an index data type, which can convert the raw data into a format conforming to target data extraction logic. For example, for data of the numeric type, a simple function may be constructed to extract the numeric value; for text type data, a text processing function can be constructed to perform word segmentation and word vectorization; for date-type data, a time-series processing function may be constructed to extract data for a particular time period. And the server obtains the corresponding target data extraction function through the construction of the data extraction logic and the data extraction function. The target data extraction function is a comprehensive processing function, which classifies raw data according to the type of index data and converts the processed data into a target format. 
For example, for each index data in the set of financial features, the server extracts the corresponding index data by a target data extraction function, resulting in an index data set of initial potential vectors. For example, assume that the server has a set of financial characteristics including price, volume and market value for stocks, and target base data including company name and industry. The index data type preset by the server is a digital type and a text type. The server defines the data of the digital type as price, transaction amount and market value, and the data of the text type as company name and industry. And performing data extraction logic analysis. For data of the digital type, the server simply chooses to directly extract the value. For text type data, the server performs text processing, such as converting company names into word vector representations. The server builds a data extraction function. For example, for data of a numeric type, the server constructs a function to directly extract the numeric value. For text type data, the server builds a function to perform text processing to convert the company name into a word vector representation. The server obtains the target data extraction function. Through the function, the server extracts each index data in the financial feature set and the target basic surface data to obtain an index data set of the initial potential vector. For example, the server extracts price, trading volume, and market value of stocks as index data of a digital type, and extracts company name and industry as index data of a text type. These index data will be used for subsequent data analysis and decision evaluation.
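The type-driven extraction described above can be sketched as a dispatch table that maps each index data type to an extraction function. The field names, the sample record and the simple tokenization used in place of word vectorization are all hypothetical and only illustrate the structure.

```python
from datetime import date

def extract_number(value):
    """Numeric type: take the value directly."""
    return float(value)

def extract_text(value):
    """Text type: simple lowercase tokenization (a stand-in for word vectorization)."""
    return value.lower().split()

def extract_date(value):
    """Date type: parse an ISO date string for later time-series processing."""
    return date.fromisoformat(value)

EXTRACTORS = {"number": extract_number, "text": extract_text, "date": extract_date}

def build_extraction_function(type_by_field):
    """Build the target data extraction function from a field -> type mapping."""
    def extract(record):
        return {field: EXTRACTORS[kind](record[field])
                for field, kind in type_by_field.items()}
    return extract

# Hypothetical preset index data types and a sample record.
extract = build_extraction_function(
    {"price": "number", "company": "text", "trade_date": "date"}
)
sample = extract({"price": "12.5", "company": "Acme Holdings", "trade_date": "2023-09-15"})
```

Keeping the per-type logic in a table means a new index data type only needs one additional entry in `EXTRACTORS`.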

In a specific embodiment, as shown in fig. 4, the process of executing step S104 may specifically include the following steps:

S401, constructing a data line graph of the index data set to obtain a data line graph corresponding to the index data set;

S402, carrying out data fluctuation analysis on the data line graph to obtain a data fluctuation trend, and simultaneously carrying out statistical feature analysis on the index data set to obtain corresponding statistical feature data;

S403, carrying out data distribution analysis on the statistical characteristic data, and determining data distribution information corresponding to the index data set;

S404, carrying out data state analysis on the index data set based on the data fluctuation trend and the data distribution information to obtain a target data state;

S405, carrying out transaction strategy analysis on the target data state to obtain an initial transaction strategy.

Specifically, the index data set is sorted in chronological order, and a data line graph is drawn from the times and the corresponding values. The data line graph lets the server see how the data change over time and helps it discover trends and periodic changes. By performing fluctuation analysis on the data line graph, the server observes the fluctuation of the data, including peaks, troughs and fluctuation amplitude. At the same time, statistical feature analysis is performed, including the calculation of the mean, variance, standard deviation, skewness and kurtosis, to better describe the distribution and fluctuation characteristics of the data. After obtaining the statistical features of the data, the server performs data distribution analysis and visualizes the distribution of the data by drawing graphs such as a histogram and a density plot. The data distribution analysis helps the server interpret the skewness and kurtosis of the data and thus the shape of the distribution. Data state analysis is then performed by combining the data fluctuation trend and the data distribution information. The server judges from the fluctuation trend whether the data are in an oscillating state or a trending state, and judges from the data distribution information whether the data deviate from a normal distribution. The data state analysis helps the server identify market characteristics and potential risks. Based on the result of the data state analysis, the server formulates a corresponding transaction strategy. For example, if the data are in an oscillating state, the server adopts a counter-trend transaction strategy; if the data are in a trending state, the server adopts a trend-following transaction strategy. The transaction strategy is formulated by considering both the characteristics of the data and the actual situation of the market.
For example, assume that the server has a price index data set for a stock. The server first builds a data line graph of the price over time and finds that the stock price has shown an increasing trend over the past period. The server then performs data fluctuation analysis and statistical feature analysis. By calculating the fluctuation range and the statistical features of the price, the server finds that the stock price fluctuates strongly and is positively skewed. The server performs data distribution analysis and draws a histogram and a density plot of the price. From these graphs, the server sees that the price distribution deviates from a normal distribution and has a noticeable tail. In the data state analysis, the server combines the fluctuation trend and the distribution information of the data. The server finds that the stock price is in an increasing trend with large fluctuation and a skewed distribution, which indicates that the market is oscillating within a rising trend. According to the result of the data state analysis, the server formulates a corresponding transaction strategy. For example, a counter-trend transaction strategy may be adopted, that is, buying and selling when price fluctuations are large in order to profit during the oscillation phase. Alternatively, a trend-following transaction strategy may be adopted, that is, holding the stock while the price is rising in order to obtain a greater return from the upward trend. The choice between these transaction strategies depends on the actual market situation and the risk preference of the investor.
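The text does not say how an oscillating state is distinguished from a trending state. One simple option, used here purely as an assumption, is an efficiency ratio: the net price change divided by the total distance travelled. Values near 1 indicate a clean trend and values near 0 indicate oscillation. The `threshold` of 0.5 is a hypothetical cut-off.

```python
def efficiency_ratio(values):
    """Net change divided by the sum of absolute step changes, in [0, 1]."""
    path = sum(abs(b - a) for a, b in zip(values, values[1:]))
    if path == 0:  # flat series: treat as non-trending
        return 0.0
    return abs(values[-1] - values[0]) / path

def choose_strategy(values, threshold=0.5):
    """Trend-following for trending data, counter-trend for oscillating data."""
    if efficiency_ratio(values) >= threshold:
        return "trend_following"
    return "counter_trend"
```

A series that rises at every step has an efficiency ratio of 1 and selects the trend-following strategy, while a zigzag series that ends where it started has a ratio of 0 and selects the counter-trend strategy.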

In a specific embodiment, the process of executing step S105 may specifically include the following steps:

(1) Performing state dimension analysis on the financial feature set and the basic surface data to obtain an initial state dimension;

(2) Performing dimension updating on the initial state dimension through a sparse representation algorithm to obtain a target state dimension;

(3) Performing action space construction based on the target state dimension to obtain a target action space;

(4) Based on the target action space, predicting user behavior of the initial transaction strategy to obtain initial user behavior;

(5) And constructing a value function through user behaviors to obtain a target value function.

Specifically, the server performs state dimension analysis on the financial feature set and the basic surface data. The status dimension refers to the dimension of the financial data that reflects market conditions and characteristics. By counting and analyzing the financial data, the server finds the features that best characterize the market state and takes them as initial state dimensions. Sparse representation algorithms are a way to characterize data by way of linear combinations. The server uses a sparse representation algorithm to update and optimize the initial state dimension, and a more accurate and meaningful target state dimension is obtained. The sparse representation algorithm may learn a sparse representation of the data by minimizing the error, resulting in a more compact state dimension representation. After the target state dimension is obtained, the server builds an action space based on the target state dimension. The action space refers to all possible decision options in the decision system. In financial decisions, the action space may include different transaction policies, such as different transaction actions for buying, selling, holding, etc. The server predicts the actions that the user will take in a particular market state by a user behavior prediction model based on the target action space. These behaviors may be trained from historical data and models, or may be derived from reinforcement learning algorithms. The value function is a function that measures the effectiveness of a decision, which can help the server evaluate the benefits of taking different actions in a particular market state. Through user behavior prediction and actual market data, the server constructs a value function to measure the effects of different transaction strategies, so that the optimal transaction strategy is found. 
For example, assuming that the server is to make a stock exchange decision, the server selects a particular set of financial characteristics, such as stock prices and exchanges over a period of time, while adding some underlying data, such as corporate financial indicators, etc. These features constitute the server's financial feature set and base plane data. The server performs state dimension analysis on the financial feature set and the basic surface data to find features which can most reflect the market state, such as fluctuation of stock price, change of trading volume and the like. The server uses a sparse representation algorithm to update and optimize the initial state dimension to obtain a more compact and meaningful target state dimension, such as a stock price trend, a transaction amount fluctuation trend and the like. The server builds an action space based on the target state dimension, such as may choose to buy, sell, or hold stocks as the action space. Through the user behavior prediction model, the server predicts that in a particular market state, the user will choose to buy or sell stocks. The server builds a value function to measure the effect of different transaction strategies, and the server finds the optimal transaction strategy by comparing the value functions of the different strategies, so that a more intelligent decision is made.
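The text does not specify which sparse representation algorithm is used. As one illustration, the sketch below uses L1-regularized least squares (lasso) solved by coordinate descent, which drives the coefficients of uninformative state dimensions to exactly zero. The choice of lasso, the function names, the regularization strength `lam` and the toy data are all assumptions made here.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become exactly 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, iterations=50):
    """Minimize 0.5 * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(iterations):
        for j in range(n_features):
            rho = 0.0
            for row, target in zip(X, y):
                # Prediction from all features except j.
                partial = sum(row[k] * w[k] for k in range(n_features) if k != j)
                rho += row[j] * (target - partial)
            norm = sum(row[j] ** 2 for row in X)  # assumed non-zero
            w[j] = soft_threshold(rho, lam) / norm
    return w

def select_dimensions(names, w):
    """Keep the state dimensions whose coefficients are non-zero."""
    return [name for name, coef in zip(names, w) if coef != 0.0]

# Toy data: the target depends mainly on the second column.
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [2.1, -1.9, 1.9, -2.1]
w = lasso_coordinate_descent(X, y, lam=1.0)
selected = select_dimensions(["dim_a", "dim_b", "dim_c"], w)
```

In this toy example only `dim_b` keeps a non-zero coefficient, so it would be retained as the target state dimension while the other two are dropped.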

In a specific embodiment, the process of executing step S106 may specifically include the following steps:

(1) Performing sampling parameter analysis on the initial potential vector through a value function approximation algorithm to obtain a sampling parameter set, wherein the sampling parameter set comprises: the number of samples and the sampling step size;

(2) Resampling the initial potential vector through a target value function based on the sampling parameter set to obtain a target potential vector;

(3) Inputting the target potential vector into a decoder to perform data back propagation calculation through a back propagation algorithm to obtain a target data sample;

(4) Performing policy parameter matching on the target data sample to obtain a policy parameter set corresponding to the target data sample;

(5) Updating the initial transaction strategy through the policy parameter set to obtain the target transaction strategy.

Specifically, the server selects an appropriate value function approximation algorithm, such as the Monte Carlo method or importance sampling, and uses it to determine the sampling parameter set, which comprises the number of samples and the sampling step size. The server samples the initial potential vector the specified number of times, each time with the specified step size, and thereby obtains a plurality of target potential vectors. Each target potential vector represents the decision result under a different market condition. The target potential vectors are input into the decoder for data back propagation calculation to obtain target data samples. The decoder converts a potential vector back into the original data space, generating the data sample corresponding to that target potential vector. Policy parameter matching is then performed on each target data sample, that is, the policy parameter set corresponding to the target data sample is found. The policy parameter set may include parameters of the transaction strategy, such as buy timing, sell timing, and position size. Finally, the initial transaction strategy is updated through the policy parameter set to obtain the target transaction strategy. The strategy update adjusts the transaction strategy according to the target data samples and the policy parameters so as to achieve a better trading result.

For example, assume that the server uses the Monte Carlo method as the value function approximation algorithm and sets the number of samples to 10 and the sampling step size to 1. The server samples the initial potential vector 10 times with a step size of 1 and obtains 10 target potential vectors. The 10 target potential vectors are input into the decoder for data back propagation calculation, yielding 10 target data samples. Policy parameter matching is performed on the 10 target data samples to find the policy parameter set corresponding to each one, including parameters such as buy timing and sell timing. The initial transaction strategy is then updated according to each policy parameter set, giving 10 target transaction strategies, each adjusted to a different market state and set of policy parameters to achieve a better trading result.
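
The resampling and decoding steps can be sketched as follows, under several assumptions that the source does not specify: candidates are drawn by Gaussian perturbation scaled by the sampling step size, the value function scores each candidate through self-normalized (softmax) weights, and the decoder is a toy linear map evaluated in a single forward pass. The names `resample_latent`, `decode`, `value_fn`, and `decoder_weights` are hypothetical.

```python
import math
import random


def resample_latent(initial, value_fn, num_samples=10, step=1.0, seed=0):
    """Draw candidates around the initial vector and weight them by the value function."""
    rng = random.Random(seed)
    candidates = [[z + step * rng.gauss(0.0, 1.0) for z in initial] for _ in range(num_samples)]
    scores = [value_fn(c) for c in candidates]
    top = max(scores)
    # Softmax weights; subtracting the maximum keeps exp() numerically stable.
    raw = [math.exp(s - top) for s in scores]
    total = sum(raw)
    return candidates, [w / total for w in raw]


def decode(latent, weight_rows):
    """Toy linear decoder mapping a latent vector back to data space."""
    return [sum(w * z for w, z in zip(row, latent)) for row in weight_rows]


def value_fn(latent):
    """Hypothetical value function that prefers latent vectors near (1, 1)."""
    return -sum((z - 1.0) ** 2 for z in latent)


decoder_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 2-D latent space to 3-D data space
candidates, weights = resample_latent([0.0, 0.0], value_fn, num_samples=10, step=1.0)
samples = [decode(c, decoder_weights) for c in candidates]
best = max(range(len(candidates)), key=lambda i: weights[i])
print(len(samples), samples[best])
```

With 10 samples and a step size of 1, this mirrors the worked example: 10 target vectors and 10 decoded data samples, with the weights indicating which candidates the value function favors.
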

Through the above steps, standardizing the historical financial data brings different types of financial data into the same numerical range, which benefits the stability and convergence of subsequent feature extraction and model training. Extracting financial features such as the moving average line, the relative strength index, and Bollinger Bands captures the trend and fluctuation of the financial market. Extracting basic surface data from the historical financial data through the financial feature set yields richer company basic surface information. Mapping the financial feature set and the target basic surface data to potential vectors through a preset encoder converts the original data into a potential representation, thereby extracting the important features of the data. Extracting index data from the initial potential vector yields information about data states and market characteristics from the potential space and provides more comprehensive guidance for subsequent transaction strategy analysis. Analyzing the data state of the index data set reveals the state of the data and the market trend, so that the direction and target of the transaction strategy can be determined. Together, these steps further improve the efficiency and accuracy of deep learning-based decision evaluation.
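
The standardization and feature extraction mentioned above can be sketched with the standard library. The formulas are the common textbook definitions and are assumed rather than taken from the source: z-score standardization, a simple moving average, a relative strength index computed from plain sums of gains and losses (not Wilder's smoothed variant), and Bollinger Bands at a configurable number of population standard deviations. The price series and window sizes are hypothetical.

```python
import statistics


def standardize(values):
    """Z-score standardization: zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def moving_average(prices, window):
    """Simple moving average, one value per full window."""
    return [statistics.fmean(prices[i - window:i]) for i in range(window, len(prices) + 1)]


def rsi(prices, window):
    """Relative strength index over the last `window` price changes (needs window + 1 prices)."""
    changes = [b - a for a, b in zip(prices[-window - 1:-1], prices[-window:])]
    gains = sum(c for c in changes if c > 0)
    losses = -sum(c for c in changes if c < 0)
    if losses == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + gains / losses)


def bollinger_bands(prices, window, width=2.0):
    """Return (lower, middle, upper) bands for the most recent window."""
    recent = prices[-window:]
    middle = statistics.fmean(recent)
    spread = width * statistics.pstdev(recent)
    return middle - spread, middle, middle + spread


# Hypothetical closing prices.
prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
print(moving_average(prices, 3)[0])   # 11.0
print(round(rsi(prices, 5), 2))       # 83.33
print(bollinger_bands(prices, 3))
```
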

The deep learning-based decision evaluation method in the embodiment of the present invention is described above, and the deep learning-based decision evaluation system in the embodiment of the present invention is described below. Referring to Fig. 5, one embodiment of the deep learning-based decision evaluation system includes:

the collection module 501 is configured to collect historical financial data, and perform standardized processing on the historical financial data to obtain standardized financial data;

the extraction module 502 is configured to perform feature extraction on the standardized financial data to obtain a financial feature set, where the financial feature set includes a moving average line, a relative strength index, and Bollinger Bands, and to perform basic surface data extraction on the historical financial data through the financial feature set to obtain target basic surface data;

the mapping module 503 is configured to input the financial feature set and the target basic surface data into a preset encoder to perform potential vector mapping, output an initial potential vector, and extract index data of the initial potential vector to obtain an index data set of the initial potential vector;

the analysis module 504 is configured to perform data state analysis on the index data set, determine a target data state, and perform transaction policy analysis on the target data state to obtain an initial transaction policy;

the construction module 505 is configured to predict a user behavior of the initial transaction policy by using a reinforcement learning algorithm to obtain an initial user behavior, and perform value function construction by using the initial user behavior to obtain a target value function;

the sampling module 506 is configured to resample the initial potential vector according to the target value function to obtain a target potential vector, input the target potential vector into a preset decoder to perform data reconstruction to obtain a target data sample, and perform policy update on the initial transaction policy according to the target data sample to obtain a target transaction policy.

Through the cooperation of the above components, standardizing the historical financial data brings different types of financial data into the same numerical range, which benefits the stability and convergence of subsequent feature extraction and model training. Extracting financial features such as the moving average line, the relative strength index, and Bollinger Bands captures the trend and fluctuation of the financial market. Extracting basic surface data from the historical financial data through the financial feature set yields richer company basic surface information. Mapping the financial feature set and the target basic surface data to potential vectors through a preset encoder converts the original data into a potential representation, thereby extracting the important features of the data. Extracting index data from the initial potential vector yields information about data states and market characteristics from the potential space and provides more comprehensive guidance for subsequent transaction strategy analysis. Analyzing the data state of the index data set reveals the state of the data and the market trend, so that the direction and target of the transaction strategy can be determined. Together, these components further improve the efficiency and accuracy of deep learning-based decision evaluation.
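
The order in which modules 501 to 506 hand data to one another can be sketched as a simple chain of injected stages. This is only a structural illustration under assumed interfaces: the class name `DecisionEvaluationSystem`, the stage signatures, and the trivial stand-in stages are hypothetical and do not reflect the actual models of the embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DecisionEvaluationSystem:
    """Chains six stages in the order of modules 501 to 506; every stage is injected."""
    collect: Callable[[Any], Any]                    # 501: raw data -> standardized data
    extract: Callable[[Any], Any]                    # 502: standardized data -> features
    encode: Callable[[Any], Any]                     # 503: features -> latent vector
    analyze: Callable[[Any], Any]                    # 504: latent vector -> initial strategy
    build_value: Callable[[Any], Any]                # 505: initial strategy -> value function
    resample_update: Callable[[Any, Any, Any], Any]  # 506: produces the target strategy

    def run(self, raw):
        standardized = self.collect(raw)
        features = self.extract(standardized)
        latent = self.encode(features)
        strategy = self.analyze(latent)
        value_fn = self.build_value(strategy)
        return self.resample_update(latent, value_fn, strategy)


# Trivial stand-in stages, used only to show the data flow.
system = DecisionEvaluationSystem(
    collect=lambda raw: [float(x) for x in raw],
    extract=lambda data: {"mean": sum(data) / len(data)},
    encode=lambda feats: [feats["mean"]],
    analyze=lambda latent: "buy" if latent[0] > 0 else "sell",
    build_value=lambda strategy: (lambda z: z[0]),
    resample_update=lambda latent, value_fn, strategy: (strategy, value_fn(latent)),
)
print(system.run(["1", "3"]))  # ('buy', 2.0)
```
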

The deep learning-based decision evaluation system in the embodiment of the present invention is described in detail above with reference to Fig. 5 from the point of view of modularized functional entities, and the deep learning-based decision evaluation device in the embodiment of the present invention is described in detail below from the point of view of hardware processing.

Fig. 6 is a schematic structural diagram of a deep learning-based decision evaluation device 600 according to an embodiment of the present invention. The deep learning-based decision evaluation device 600 may vary considerably depending on its configuration or performance, and may include one or more processors (central processing units, CPU) 610, a memory 620, and one or more storage media 630 (e.g., one or more mass storage devices) storing applications 633 or data 632. The memory 620 and the storage medium 630 may be transitory or persistent storage. The program stored on the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations for the deep learning-based decision evaluation device 600. Still further, the processor 610 may be configured to communicate with the storage medium 630 and to execute, on the deep learning-based decision evaluation device 600, the series of instruction operations in the storage medium 630.

The deep learning-based decision evaluation device 600 may also include one or more power supplies 640, one or more wired or wireless network interfaces 650, one or more input/output interfaces 660, and/or one or more operating systems 631, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like. It will be appreciated by those skilled in the art that the device structure shown in Fig. 6 does not constitute a limitation of the deep learning-based decision evaluation device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.

The present invention also provides a deep learning-based decision evaluation device, which includes a memory and a processor, where the memory stores computer readable instructions that, when executed by the processor, cause the processor to execute the steps of the deep learning-based decision evaluation method in the above embodiments.

The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, and may also be a volatile computer readable storage medium, in which instructions are stored which, when executed on a computer, cause the computer to perform the steps of the deep learning based decision evaluation method.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.

The integrated units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium and comprising instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A decision evaluation method based on deep learning, the method comprising:

2. The deep learning-based decision evaluation method of claim 1, wherein the performing feature extraction on the standardized financial data to obtain a financial feature set, wherein the financial feature set includes a moving average line, a relative strength index, and Bollinger Bands, and performing basic surface data extraction on the historical financial data through the financial feature set to obtain target basic surface data, comprises:

3. The deep learning-based decision evaluation method according to claim 1, wherein the inputting the financial feature set and the target base surface data into a preset encoder to perform potential vector mapping, and outputting an initial potential vector, and simultaneously performing index data extraction on the initial potential vector to obtain an index data set of the initial potential vector, includes:

4. The deep learning-based decision evaluation method of claim 3, wherein the extracting the index data from the initial potential vector to obtain the index data set of the initial potential vector comprises:

5. The deep learning based decision evaluation method of claim 1, wherein the performing data state analysis on the index data set, determining a target data state, and performing transaction policy analysis on the target data state, obtaining an initial transaction policy, comprises:

6. The deep learning-based decision evaluation method according to claim 1, wherein the performing the user behavior prediction on the initial transaction policy by the reinforcement learning algorithm to obtain an initial user behavior, and performing the value function construction by the initial user behavior to obtain a target value function includes:

7. The deep learning-based decision evaluation method according to claim 1, wherein resampling the initial potential vector by the target value function to obtain a target potential vector, inputting the target potential vector to a preset decoder for data reconstruction to obtain a target data sample, and performing policy update on the initial transaction policy by the target data sample to obtain a target transaction policy, comprises:

8. A deep learning-based decision evaluation system, the deep learning-based decision evaluation system comprising:

the acquisition module is used for acquiring historical financial data and carrying out standardized processing on the historical financial data to obtain standardized financial data;

the extraction module is used for carrying out feature extraction on the standardized financial data to obtain a financial feature set, wherein the financial feature set comprises a moving average line, a relative strength index and Bollinger Bands, and for carrying out basic surface data extraction on the historical financial data through the financial feature set to obtain target basic surface data;

the mapping module is used for inputting the financial feature set and the target basic surface data into a preset encoder to perform potential vector mapping, outputting an initial potential vector, and extracting index data of the initial potential vector to obtain an index data set of the initial potential vector;

the analysis module is used for carrying out data state analysis on the index data set, determining a target data state, and carrying out transaction strategy analysis on the target data state to obtain an initial transaction strategy;

the construction module is used for predicting the user behavior of the initial transaction strategy through a reinforcement learning algorithm to obtain an initial user behavior, and constructing a value function through the initial user behavior to obtain a target value function;

the sampling module is used for resampling the initial potential vector through the target value function to obtain a target potential vector, inputting the target potential vector into a preset decoder for data reconstruction to obtain a target data sample, and carrying out strategy updating on the initial transaction strategy through the target data sample to obtain a target transaction strategy.

9. A deep learning-based decision evaluation device, characterized in that the deep learning-based decision evaluation device comprises: a memory and at least one processor, the memory having instructions stored therein;

the at least one processor invokes the instructions in the memory to cause the deep learning-based decision evaluation device to perform the deep learning-based decision evaluation method of any of claims 1-7.

10. A computer readable storage medium having instructions stored thereon, which when executed by a processor implement the deep learning based decision evaluation method of any of claims 1-7.