WO2024021354A1 - Model training method, price prediction method, terminal device and storage medium - Google Patents

Model training method, price prediction method, terminal device and storage medium

Info

Publication number
WO2024021354A1
Authority
WO
WIPO (PCT)
Prior art keywords
price
market
time period
data
social media
Prior art date
Application number
PCT/CN2022/129587
Other languages
French (fr)
Chinese (zh)
Inventor
吴胤旭
叶子
陈会
周亚雯
姜青山
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2024021354A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Definitions

  • This application relates to the field of financial technology, and in particular to a model training method based on a market price prediction model, a price prediction method, a terminal device and a computer storage medium.
  • the rise and fall of stocks can be predicted to a certain extent, thereby providing investment advice to small and medium-sized investors.
  • the conclusions of sentiment analysis of stock comments can also provide an analytical basis for institutional investors and stock analysts.
  • the predictive factors used in traditional quantitative investment are generally stock-related data or macro- and micro-economic data, and do not model the inefficiency of the market.
  • This application provides a model training method, a price prediction method, a terminal device and a computer storage medium based on a market price prediction model.
  • the model training method includes:
  • the data set includes stock market data and social media data for several time periods
  • the market price prediction model is trained using the prediction output of the market price prediction model to be trained to obtain the final market price prediction model.
  • the stock market data includes opening price, highest price, lowest price, closing price and/or trading volume.
  • the historical price technical indicators include moving averages of similarity and difference, simple moving averages, relative strength indicators and/or fund flow indicators;
  • the historical price technical indicators are calculated, including:
  • the historical social media market sentiment index is calculated based on the social media data of each time period, including:
  • comment texts lower than the preset threshold are defined as negative comment texts, and comment texts higher than the preset threshold are defined as positive comment texts;
  • the gradient sentiment divergence index of the social media data is calculated.
  • the model training method further includes:
  • the gradient bull market sentiment index of the social media data is calculated using the difference corresponding to all comment texts of the social media data and the number of all comment texts.
  • the correlation and merging of historical price technical indicators and historical social media market sentiment indexes in the same time period include:
  • the highest price in the same time period is correlated and merged with the simple moving average and the gradient bull sentiment index.
  • the obtained data set includes:
  • crawler technology or social media programming interfaces to collect social media data.
  • the price prediction method includes:
  • the market price prediction model is trained by the above-mentioned model training method.
  • a terminal device which includes a memory and a processor coupled to the memory;
  • the memory is used to store program data
  • the processor is used to execute the program data to implement the above-mentioned model training method and/or the above-mentioned price prediction method.
  • the computer storage medium is used to store program data.
  • the program data is executed by a computer, it is used to implement the above-mentioned model training method and/or the above-mentioned price prediction method.
  • the beneficial effects of this application are: the terminal device obtains the data set; calculates historical price technical indicators based on the stock market data of each time period; calculates the historical social media market sentiment index based on the social media data of each time period; associates and merges the historical price technical indicators and the historical social media market sentiment index of the same time period and inputs them as features into the market price prediction model to be trained; and trains the market price prediction model using the prediction output of the market price prediction model to be trained, to obtain the final market price prediction model.
  • the model training method of this application uses the continuous update characteristics of social media to reduce the time granularity of prediction and obtain close to real-time prediction capabilities. It also achieves better prediction results by combining market technical indicators.
  • Figure 1 is a schematic flow chart of an embodiment of a model training method based on a market price prediction model provided by this application;
  • Figure 2 is a schematic diagram of the overall flow of the model training method and its price prediction method based on the market price prediction model provided by this application;
  • Figure 3 is a schematic diagram of the closing price predicted by the market price prediction model provided by this application.
  • Figure 4 is a schematic framework diagram of an embodiment of the data set provided by this application.
  • Figure 5 is a schematic flow chart of an embodiment of a price prediction method based on a market price prediction model provided by this application;
  • Figure 6 is a schematic structural diagram of an embodiment of a terminal device provided by this application.
  • Figure 7 is a schematic structural diagram of an embodiment of a computer storage medium provided by this application.
  • This type of prediction technology makes use of market price data, and also obtains a large number of user comment texts through social media for sentiment analysis of big data to conduct market predictions.
  • a common model is a deep neural network.
  • this application aims to propose a neural network prediction method combined with social media sentiment analysis on the issue of financial market prediction. This method can effectively utilize data sets and improve prediction accuracy.
  • Figure 1 is a flow diagram of an embodiment of a model training method based on a market price prediction model provided by this application.
  • Figure 2 is a schematic diagram of the overall flow of the model training method based on a market price prediction model provided by this application and of its price prediction method.
  • the model training method based on the market price prediction model in the embodiment of this application specifically includes the following steps:
  • Step S11 Obtain a data set, where the data set includes stock market data and social media data for several time periods.
  • the data sources of stock market data in the data set include but are not limited to various securities firms and trading markets.
  • the collection method can use crawler technology or collect historical price data through the supplier's programming interface.
  • the stock market data must be K-line data with specified time granularity, and the fields must include at least: opening price, closing price, highest price, lowest price, and trading volume. For example, if you specify a time granularity of 30 minutes, then there will be a record every 30 minutes from the opening of the market, recording the opening price, closing price, highest price, lowest price, and trading volume within that time period.
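The K-line requirement above can be illustrated with a minimal sketch that buckets raw trades into fixed-width bars. The function name `build_klines`, the `(timestamp, price, volume)` trade format and the sample values are assumptions made for illustration; the source does not say how the K-line records are produced.

```python
from datetime import datetime, timedelta

def build_klines(trades, start, granularity_minutes=30):
    """Aggregate time-sorted (timestamp, price, volume) trades into OHLCV bars.

    `start` is the market open; bars are keyed by their start time.
    """
    step = timedelta(minutes=granularity_minutes)
    bars = {}
    for ts, price, volume in trades:
        # Index of the bar this trade falls into, counted from the open.
        idx = (ts - start) // step
        bar_start = start + idx * step
        bar = bars.get(bar_start)
        if bar is None:
            # First trade of the period sets open, high, low and close.
            bars[bar_start] = {"open": price, "high": price, "low": price,
                               "close": price, "volume": volume}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far in this period
            bar["volume"] += volume
    return bars

# Hypothetical trades: three in the first 30 minutes, one in the next bar.
open_time = datetime(2017, 9, 24, 0, 0)
raw = [(1, 100.0, 2), (10, 104.0, 1), (29, 101.0, 3), (31, 102.0, 5)]
trades = [(open_time + timedelta(minutes=m), p, v) for m, p, v in raw]
bars = build_klines(trades, open_time)
```

Keying bars by their start time keeps the later time-based merge with sentiment data straightforward.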
  • the data sources of social media data in the data set include but are not limited to various social media, such as Weibo, Tieba, Twitter, user comment data in trading markets, etc.
  • the collection method can use crawler technology or social media programming interfaces to collect historical social media data.
  • historical social media data needs to include at least two fields: publishing time and published text content.
  • Step S12 Calculate historical price technical indicators based on the stock market data in each time period.
  • the historical price technical indicators used include but are not limited to: one or more of the moving average of similarity and difference, simple moving average, relative strength indicator and fund flow indicator.
  • the terminal device can calculate the moving average of similarities and differences (MACD) based on the opening price of each time period; the simple moving average based on the highest price of each time period; the relative strength indicator based on the lowest price of each time period; and the money flow indicator based on the closing price of each time period.
  • MACD: Moving Average Convergence/Divergence
  • EMA26: slow exponential moving average
  • EMA12: fast exponential moving average
  • MACD = 2 × (DIF − DEA), where DIF is the fast line and DEA is the 9-day weighted moving average of DIF
  • the meaning of MACD is basically the same as that of the double moving average, that is, the current long and short status and the possible development trend of the stock price are represented by the dispersion and aggregation of the fast and slow moving averages, but it is more convenient to read.
  • Changes in MACD represent changes in market trends, and MACD at different K-line levels represents the buying and selling trend in the current level cycle.
  • the formula is:
  • MACD = EMA(12-period) − EMA(26-period)
  • EMA is the exponential average indicator.
  • EMA Exponential Moving Average
  • EXPMA exponential moving average
  • the exponential moving average is a moving average weighted by exponential decreasing.
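As a rough sketch of the MACD description above, the following computes the fast and slow EMAs, their difference (DIF), the smoothed DIF (DEA) and the bar value 2 × (DIF − DEA). The smoothing factor 2 / (period + 1) and seeding each EMA with the first observation are common conventions assumed here, not details confirmed by the source.

```python
def ema(values, period):
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]  # assumption: seed with the first observation
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd(prices, fast=12, slow=26, signal=9):
    """Return the (DIF, DEA, bar) series for a price series."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    # DIF: fast EMA minus slow EMA.
    dif = [f - s for f, s in zip(fast_ema, slow_ema)]
    # DEA: EMA of DIF over the signal period.
    dea = ema(dif, signal)
    # Bar value: twice the gap between DIF and DEA.
    bar = [2 * (d - e) for d, e in zip(dif, dea)]
    return dif, dea, bar

# A steadily rising series: the fast EMA stays above the slow EMA.
rising = [float(p) for p in range(1, 61)]
dif, dea, bar = macd(rising)
```

In an uptrend DIF is positive because the 12-period average reacts to new prices faster than the 26-period average.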
  • SMA is a simple moving average
  • the simple moving average is an arithmetic moving average, which is a simple and universal moving average.
  • the formula is:
  • 3. RSI is the relative strength indicator, which is the most famous oscillator in the futures market and stock market. The principle is to estimate the strength of the market trend by calculating the rise and fall of stock prices, and to predict the continuation or reversal of the trend on that basis. In effect, it shows the upward fluctuation of the stock price as a percentage of the total fluctuation. A large value means the market is in a strong state; a small value means the market is in a weak state.
  • the formula is:
  • n 14.
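The SMA and RSI formulas did not survive in the text above, so the following is only a sketch built from the prose: the SMA is an arithmetic mean over the last n values, and the RSI is the upward movement as a percentage of total movement over the last n price changes. The plain-sum RSI variant and the value 50 for a completely flat window are assumptions.

```python
def sma(values, n):
    """Simple moving average; None until n observations are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - n:i + 1]) / n)
    return out

def rsi(prices, n=14):
    """RSI: upward movement as a percentage of total movement over n changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    out = [None] * len(prices)
    for i in range(n, len(prices)):
        window = changes[i - n:i]  # the n changes ending at price i
        gains = sum(c for c in window if c > 0)
        losses = sum(-c for c in window if c < 0)
        total = gains + losses
        # Assumption: report a neutral 50 when prices did not move at all.
        out[i] = 50.0 if total == 0 else 100.0 * gains / total
    return out

example_sma = sma([1.0, 2.0, 3.0, 4.0], 2)
example_rsi = rsi([1.0, 2.0, 1.0, 2.0, 1.0], n=4)
```

Values near 100 indicate a window dominated by rises, and values near 0 a window dominated by falls.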
  • 4. MFI (Money Flow Index)
  • RSSI relative strength index
  • OOV popularity index
  • the MFI indicator can be used to measure the momentum of trading volume and investment interest, and changes in trading volume provide clues to future changes in stock prices, so the MFI indicator can help determine the trend of stock price changes.
  • the formula is:
  • n 14.
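The MFI formula is also missing from the text above. The sketch below follows the commonly used definition (typical price times volume, split into positive and negative flow over n periods), which is an assumption: the source states only that the indicator reflects trading-volume momentum and that n is 14.

```python
def mfi(highs, lows, closes, volumes, n=14):
    """Money flow index over n periods; None until enough data is available."""
    # Typical price: average of high, low and close for each period.
    typical = [(h + l + c) / 3 for h, l, c in zip(highs, lows, closes)]
    out = [None] * len(typical)
    for i in range(n, len(typical)):
        pos = neg = 0.0
        for j in range(i - n + 1, i + 1):
            flow = typical[j] * volumes[j]
            if typical[j] > typical[j - 1]:
                pos += flow  # money flowing in on an up period
            elif typical[j] < typical[j - 1]:
                neg += flow  # money flowing out on a down period
        total = pos + neg
        # Assumption: neutral 50 when the typical price never changed.
        out[i] = 50.0 if total == 0 else 100.0 * pos / total
    return out

# Tiny example with n=2: one up period (flow 2) and one down period (flow 1).
p = [1.0, 2.0, 1.0]
example_mfi = mfi(p, p, p, [1.0, 1.0, 1.0], n=2)
```

The ratio form used here is algebraically the same as 100 − 100 / (1 + positive flow / negative flow) whenever the negative flow is non-zero.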
  • the data in the current dataset can contain the following fields:
  • Step S13 Calculate the historical social media market sentiment index based on the social media data in each time period.
  • the terminal device obtains the comment texts of the social media data in each time period and obtains a sentiment score for each comment text; based on these scores, comment texts scoring below a preset threshold are defined as negative comment texts, and comment texts scoring above it are defined as positive comment texts; the gradient sentiment divergence index of the social media data is then calculated from all positive and negative comment texts in each time period.
  • the terminal device can also obtain the difference between each comment text of the social media data and the gradient sentiment divergence index, and use the differences corresponding to all comment texts, together with the number of comment texts, to calculate the gradient bullish sentiment index of the social media data.
  • the specific process of the terminal device calculating the historical social media market sentiment index is as follows:
  • the terminal device performs emotional scoring on the comment text of the social media data in each time period.
  • Tools used for emotional scoring include but are not limited to VADER, etc.
  • the terminal device performs an emotion score on each comment text and can obtain a value between -1 and 1, which represents the emotion of the comment text.
  • the terminal device may also define the comment text with the sentiment score between -1 and 0 as negative comment text, and the comment text with the sentiment score between 0 and 1 as positive comment text.
  • the terminal device uses the defined positive and negative comment texts to calculate the Gradient Bullish Sentiment Index (Small Granular Sentiment Bullish Index, SGSBI) and the Gradient Sentiment Divergence Index (Small Granular Sentimental Divergence Index, SGSDI).
  • SGSBI: Gradient Bullish Sentiment Index
  • SGSDI: Gradient Sentiment Divergence Index
  • D(t) is the sentiment index set of all comments in the t time period.
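The index formulas themselves are not reproduced in the text above, so the sketch below uses stand-in definitions and should not be read as the patented calculation. It assumes a log-ratio of positive to negative comment counts for the bullish index, and the root-mean-square spread of the scores around their mean for the divergence index. `scores` plays the role of D(t), and the threshold of 0 matches the positive/negative split described above.

```python
import math

def sentiment_indices(scores, threshold=0.0):
    """Return (bullish_index, divergence_index) for one time period.

    `scores` are per-comment sentiment scores in [-1, 1].
    Both formulas are illustrative stand-ins, not the source's definitions.
    """
    if not scores:
        return 0.0, 0.0
    positive = sum(1 for s in scores if s > threshold)
    negative = sum(1 for s in scores if s < threshold)
    # Log-ratio bullishness: above 0 when positive comments outnumber negative.
    bullish = math.log((1 + positive) / (1 + negative))
    # Divergence: root-mean-square spread of the scores around their mean.
    mean = sum(scores) / len(scores)
    divergence = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return bullish, divergence

mixed = sentiment_indices([0.5, 0.5, -0.5])
agreeing = sentiment_indices([0.8, 0.8])
```

Adding 1 to both counts keeps the ratio defined when a period has no negative (or no positive) comments.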
  • Step S14 Correlate and merge the historical price technical indicators and the historical social media market sentiment index in the same time period, and input them as features into the market price prediction model to be trained.
  • the terminal device associates and merges the price data, technical indicators and sentiment indicators of the same time period, matching them on the time-granularity interval to which each record belongs.
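The merge step can be sketched as an inner join on the period start time. The dictionary layout, the field names and the string keys below are hypothetical; the source says only that records of the same time period are associated and merged.

```python
def merge_features(price_rows, indicator_rows, sentiment_rows):
    """Inner-join three {period_start: {field: value}} mappings on the period."""
    merged = {}
    for period in sorted(price_rows):
        # Keep only periods present in all three sources.
        if period in indicator_rows and period in sentiment_rows:
            row = {}
            row.update(price_rows[period])
            row.update(indicator_rows[period])
            row.update(sentiment_rows[period])
            merged[period] = row
    return merged

prices = {"00:00": {"close": 101.0}, "00:30": {"close": 102.0}}
indicators = {"00:00": {"sma": 100.5}, "00:30": {"sma": 101.5}}
sentiment = {"00:00": {"bullish": 0.4}}  # no comments collected at 00:30
features = merge_features(prices, indicators, sentiment)
```

An inner join drops periods with no social media data; filling such periods with a neutral value instead would be an equally plausible design.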
  • the data set contains the following attributes:
  • Step S15 Use the prediction output of the market price prediction model to be trained to train the market price prediction model to obtain the final market price prediction model.
  • the terminal device inputs the above-mentioned combined data set as a feature and adds a prediction target column.
  • the prediction target column is the closing price at the next time granularity; see Figure 3 for details.
  • Figure 3 is a schematic diagram of the closing price predicted by the market price prediction model provided by this application.
  • the terminal device can input the data set from September 24, 2017 to September 27, 2017 into the market price prediction model.
  • the time granularity is set to 1 day, and the prediction result output by the market price prediction model is 2017. The closing price on September 29, 2019.
  • the terminal device can also input the data set from 00:00:00 on September 24, 2017 to 02:00:00 on September 24, 2017 into the market price prediction model.
  • with the time granularity set to 30 minutes, the prediction result output by the market price prediction model is the closing price at 02:30:00 on September 24, 2017.
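The 30-minute example above (five bars from 00:00 to 02:00 as input, the 02:30 close as target) can be sketched as a sliding-window construction. The function name `make_windows` and the row layout are assumptions for illustration.

```python
def make_windows(rows, window):
    """Build (feature_window, target) samples from time-sorted rows.

    Each sample pairs `window` consecutive rows with the closing price
    of the row that immediately follows them.
    """
    samples = []
    for end in range(window, len(rows)):
        samples.append((rows[end - window:end], rows[end]["close"]))
    return samples

# Seven hypothetical 30-minute bars starting at 00:00; index 5 is the 02:30 bar.
rows = [{"close": float(c)} for c in [1, 2, 3, 4, 5, 6, 7]]
samples = make_windows(rows, window=5)
first_features, first_target = samples[0]
```

The final `window` rows never serve as a complete input with a known target, so a series of N rows yields N − window samples.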
  • the terminal device can divide the data set into a training set and a test set; usually 80% is used as the training set and 20% as the test set. Figure 4 is a schematic framework diagram of an embodiment of the data set provided by this application.
  • the data set of the embodiment of this application includes data collected from 00:00:00 on September 24, 2017 to 23:59:59 on November 30, 2020.
  • the terminal device uses the data from 00:00:00 on September 24, 2017 to 23:59:59 on April 11, 2020 as the training set, and the data from 00:00:00 on April 12, 2020 to 23:59:59 on November 30, 2020 as the test set.
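A time-ordered split of this kind can be sketched with a cutoff timestamp. The helper name and the `(timestamp, payload)` sample format are assumptions; the dates are the ones given in the text, and the sketch checks that a cutoff at April 12, 2020 yields roughly the stated 80/20 proportion.

```python
from datetime import datetime, timedelta

def chronological_split(samples, cutoff):
    """Split (timestamp, payload) samples: before the cutoff trains, the rest tests."""
    train = [s for s in samples if s[0] < cutoff]
    test = [s for s in samples if s[0] >= cutoff]
    return train, test

# One placeholder sample per day across the collection window.
first = datetime(2017, 9, 24)
last = datetime(2020, 11, 30)
days = [(first + timedelta(days=i), None) for i in range((last - first).days + 1)]
train, test = chronological_split(days, datetime(2020, 4, 12))
share = len(train) / len(days)  # fraction of days in the training set
```

Splitting by time rather than at random keeps every test period strictly after the training data, which avoids leaking future prices into training.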
  • the training set in the embodiment of this application is only used to train the model.
  • a deep neural network model is used, such as a long short-term memory (LSTM) model.
  • the training parameters need to be adjusted according to the data situation.
  • more complex ensemble models can also be used; the training cycle may be longer, but the model performance is generally slightly better than that of a single deep neural network model.
  • the test set is used to test the trained market price prediction model.
  • the market price prediction model can use various regressors, including but not limited to linear regression, deep neural networks, and various integrated regressors.
  • mean absolute error (MAE)
  • mean square error (MSE)
  • root mean square error (RMSE)
  • coefficient of determination (R², R squared)
  • mean absolute percentage error (MAPE)
  • the three evaluation indicators of mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) are used to evaluate the model effect and compared with existing models.
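The three evaluation indicators named above have standard definitions, sketched here in plain Python. The sample values are made up for illustration and are not results from the source.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; y_true must not contain 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical closing prices and predictions.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
scores = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE are in price units, while MAPE is scale-free, which makes it easier to compare errors across assets with very different price levels.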
  • the experimental results consistently show that the method of this application predicts prices more accurately; it can be widely used to predict the prices of financial products and has broad application prospects.
  • Table 1 and Table 2 show the prediction error statistics of the two models at two different time granularities.
  • LSTM is the Long Short-Term Memory model
  • GRU is the Gated Recurrent Unit model.
  • the data for this simulation uses Bitcoin price data and #Bitcoin topic tweets on Twitter from September 24, 2017 to November 30, 2020. The following results were obtained using 80% training set and 20% test set.
  • the continuously updating nature of social media is used in place of news data, so the time granularity of prediction can be reduced further and near-real-time prediction becomes possible; this application also provides a new method for calculating a market sentiment index within a time period which, compared with current simple statistical methods, better reflects the actual market sentiment and improves the accuracy of the model.
  • this application further proposes a price prediction method based on the market price prediction model.
  • Figure 5 is a schematic flow chart of an embodiment of a price prediction method based on the market price prediction model provided by this application.
  • the price prediction method in the embodiment of this application specifically includes the following steps:
  • Step S21 Obtain the stock market data of the current time period and obtain all social comment data of the current time period.
  • Step S22 Calculate price technical indicators of the current time period based on the stock market data.
  • Step S23 Calculate the social media market sentiment index of the current time period based on all social comment data.
  • Step S24 Correlate and merge the price technical indicators and the social media market sentiment index of the current time period, and input them as features into the pre-trained market price prediction model.
  • Step S25 Based on the output of the market price prediction model, obtain the predicted price after the current time period.
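Steps S22 to S25 can be sketched as a small pipeline in which the indicator calculation, the sentiment calculation and the trained model are passed in as callables. All three callables and the toy stand-ins below are hypothetical placeholders, not the components described in this application.

```python
def predict_next_price(market_rows, comment_scores, indicator_fn, sentiment_fn, model):
    """Run steps S22 to S25 for the current period with injected components."""
    indicators = indicator_fn(market_rows)        # S22: price technical indicators
    sentiment = sentiment_fn(comment_scores)      # S23: sentiment index
    # S24: merge the current bar, indicators and sentiment into one feature row.
    features = {**market_rows[-1], **indicators, **sentiment}
    return model(features)                        # S25: predicted price

# Toy stand-ins so the sketch runs end to end.
def toy_indicators(rows):
    return {"sma": sum(r["close"] for r in rows) / len(rows)}

def toy_sentiment(scores):
    return {"bullish": sum(scores) / len(scores) if scores else 0.0}

def toy_model(features):
    return features["sma"] + features["bullish"]

bars_so_far = [{"close": 10.0}, {"close": 12.0}]
predicted = predict_next_price(bars_so_far, [0.5, -0.5],
                               toy_indicators, toy_sentiment, toy_model)
```

Passing the components in as arguments keeps the same pipeline usable for both training-time feature building and live prediction.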
  • Figure 6 is a schematic structural diagram of an embodiment of a terminal device provided by this application.
  • the terminal device 500 in the embodiment of the present application includes a processor 51, a memory 52, an input and output device 53, and a bus 54.
  • the processor 51, the memory 52, and the input and output device 53 are respectively connected to the bus 54.
  • the memory 52 stores program data.
  • the processor 51 is used to execute the program data to implement the model training method and/or the price prediction method described in the above embodiments.
  • the processor 51 may also be called a CPU (Central Processing Unit).
  • the processor 51 may be an integrated circuit chip with signal processing capabilities.
  • the processor 51 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general processor may be a microprocessor or the processor 51 may be any conventional processor or the like.
  • FIG. 7 is a schematic structural diagram of an embodiment of the computer storage medium provided by this application.
  • the computer storage medium 600 stores program data 61.
  • when executed by a processor, the program data 61 is used to implement the model training method and the price prediction method of the above embodiments.
  • the embodiments of the present application When the embodiments of the present application are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the application.
  • the aforementioned storage media include U disks, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Mathematical Physics (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Mathematical Optimization (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Mathematics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Algebra (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Operations Research (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Technology Law (AREA)

Abstract

Disclosed in the present application are a model training method based on a market price prediction model, a price prediction method based on a market price prediction model, and a terminal device and a computer storage medium. The model training method comprises: acquiring a data set; calculating a historical price technical index on the basis of stock market data of each time period; calculating a historical social media market sentiment index on the basis of social media data of each time period; associating and merging the historical price technical index and the historical social media market sentiment index in the same time period, and inputting same as features into a market price prediction model to be trained; and training the market price prediction model by using a prediction output of the market price prediction model to be trained, so as to obtain a final market price prediction model. By means of the model training method in the present application, the time granularity of prediction is reduced by means of the characteristic of continuous updating of social media, so as to obtain a capability of almost real-time prediction; and in view of a market technical index, a better prediction effect is achieved.

Description

Model training method, price prediction method, terminal device and storage medium
Technical field
This application relates to the field of financial technology, and in particular to a model training method based on a market price prediction model, a price prediction method, a terminal device and a computer storage medium.
Background
Throughout the history of currency, its form has evolved from commodity money and metal coinage to convertible banknotes and non-convertible credit money, and then to electronic and digital currency. Technological innovation has not only driven social development but also changed the form of currency. Over the past thirty years, the spread and development of Internet technology has greatly advanced changes in payment and settlement methods and has driven the development of e-commerce, e-government and the digital economy. Demand for paper money keeps shrinking, and electronic and digital money are taking its place. China is currently working to develop a modern economic system and stepping up its promotion of the digital economy; against this background, many scholars have turned their research attention to digital currency. In January 2017, China's central bank formally established the Digital Currency Research Institute. In April 2020, the legal digital currency independently designed and issued by the central bank began pilot testing in Suzhou, Shenzhen, Xiongan and Chengdu. As an emerging phenomenon, digital currency is receiving more and more attention. The gradual development of digital currency and the digital economy will also change the landscape of the financial industry: features such as virtualization will place higher and faster demands on the industry, financial technology will see more technical upgrades and updates, and research on how to apply artificial intelligence in the financial industry will have greater practical significance.
At the same time, with the development of financial technology, market price prediction models based on machine learning and artificial intelligence have attracted growing attention, and models that incorporate sentiment analysis are one of the mainstream research directions in this field. Because such models use both text and price data, extracting features from the text data becomes the main challenge. With the rapid development of Internet finance and the securities market, the number of new investors in China has repeatedly hit record highs, and social media such as stock forums, major financial forums and Weibo have become important channels through which stock market investors share information and find investment references. The theory of market inefficiency holds that stock prices cannot fully reflect stock value, that is, investor sentiment and stock price movements are correlated to some degree. Based on sentiment analysis of investors' stock comments in stock forums and time series analysis of stock price information, the rise and fall of stocks can be predicted to a certain extent, thereby providing investment advice to small and medium-sized investors. The conclusions of sentiment analysis of stock comments can also provide an analytical basis for institutional investors and stock analysts. The predictive factors used in traditional quantitative investment are generally stock-related data or macroeconomic and microeconomic data, and they do not model the inefficiency of the market.
Summary of the Invention
This application provides a model training method based on a market price prediction model, a price prediction method, a terminal device and a computer storage medium.
One technical solution adopted by this application is to provide a model training method based on a market price prediction model, the model training method including:
obtaining a data set, wherein the data set includes stock market data and social media data for several time periods;
calculating historical price technical indicators based on the stock market data of each time period;
calculating a historical social media market sentiment index based on the social media data of each time period;
associating and merging the historical price technical indicators and the historical social media market sentiment index of the same time period, and inputting them as features into the market price prediction model to be trained;
training the market price prediction model using the prediction output of the market price prediction model to be trained, to obtain the final market price prediction model.
The stock market data includes an opening price, a highest price, a lowest price, a closing price and/or a trading volume.
The historical price technical indicators include a moving average convergence/divergence (MACD), a simple moving average (SMA), a relative strength index (RSI) and/or a money flow index (MFI);
calculating the historical price technical indicators based on the stock market data of each time period includes:
calculating the MACD based on the opening price of each time period;
and/or calculating the SMA based on the highest price of each time period;
and/or calculating the RSI based on the lowest price of each time period;
and/or calculating the MFI based on the closing price of each time period.
Calculating the historical social media market sentiment index based on the social media data of each time period includes:
obtaining the comment texts of the social media data of each time period, and obtaining a sentiment score for each comment text;
based on the sentiment score of each comment text, defining comment texts scoring below a preset threshold as negative comment texts and comment texts scoring above the preset threshold as positive comment texts;
calculating the gradient sentiment divergence index of the social media data based on all positive comment texts and negative comment texts of the social media data of each time period.
After calculating the gradient sentiment divergence index of the social media data, the model training method further includes:
obtaining the difference between each comment text of the social media data and the gradient sentiment divergence index;
calculating the gradient bull market sentiment index of the social media data using the differences corresponding to all comment texts of the social media data and the number of all comment texts.
Associating and merging the historical price technical indicators and the historical social media market sentiment index of the same time period includes:
associating and merging the opening price of the same time period with the MACD and the gradient sentiment divergence index;
associating and merging the highest price of the same time period with the SMA and the gradient bull market sentiment index.
Obtaining the data set includes:
collecting stock market data using crawler technology or a supplier's programming interface;
and/or collecting social media data using crawler technology or a social media programming interface.
Another technical solution adopted by this application is to provide a price prediction method based on a market price prediction model, the price prediction method including:
obtaining the stock market data of the current time period and all social comment data of the current time period;
calculating price technical indicators of the current time period based on the stock market data;
calculating the social media market sentiment index of the current time period based on all the social comment data;
associating and merging the price technical indicators and the social media market sentiment index of the current time period, and inputting them as features into a pre-trained market price prediction model;
obtaining a predicted price for the time after the current time period based on the output of the market price prediction model;
wherein the market price prediction model is trained by the model training method described above.
Another technical solution adopted by this application is to provide a terminal device, the terminal device including a memory and a processor coupled to the memory;
wherein the memory is used to store program data, and the processor is used to execute the program data to implement the model training method described above and/or the price prediction method described above.
Another technical solution adopted by this application is to provide a computer storage medium, the computer storage medium being used to store program data which, when executed by a computer, implements the model training method described above and/or the price prediction method described above.
The beneficial effects of this application are as follows. The terminal device obtains a data set; calculates historical price technical indicators based on the stock market data of each time period; calculates a historical social media market sentiment index based on the social media data of each time period; associates and merges the historical price technical indicators and the historical social media market sentiment index of the same time period and inputs them as features into the market price prediction model to be trained; and trains the market price prediction model using the prediction output of the model to be trained, to obtain the final market price prediction model. By exploiting the continuously updated nature of social media, the model training method of this application narrows the time granularity of prediction and approaches real-time prediction, and by combining market technical indicators it achieves a better prediction effect.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic flowchart of an embodiment of the model training method based on a market price prediction model provided by this application;
Figure 2 is a schematic diagram of the overall flow of the model training method based on a market price prediction model and the corresponding price prediction method provided by this application;
Figure 3 is a schematic diagram of the market price prediction model provided by this application predicting the closing price;
Figure 4 is a schematic framework diagram of an embodiment of the data set provided by this application;
Figure 5 is a schematic flowchart of an embodiment of the price prediction method based on a market price prediction model provided by this application;
Figure 6 is a schematic structural diagram of an embodiment of the terminal device provided by this application;
Figure 7 is a schematic structural diagram of an embodiment of the computer storage medium provided by this application.
Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
In recent years, with the rapid development of Internet technology and the widespread popularity of social networks, a large amount of stock-related information has spread across the Internet. This real-time information is enormous in scale and contains important information related to the stock market. Stock price trend prediction based on Internet and social network information has gradually become a mainstream research direction. Meanwhile, traditional methods of stock price trend analysis and prediction do not take the scale and timeliness of data into account, and can no longer meet the requirements of stock market analysis and prediction on massive data in a networked environment.
Following this line of research, studies at universities in the United States and Germany have found that the rise and fall of individual stocks can be predicted by analyzing the large number of microblog messages posted on the social network Twitter. After tracking the stock prices of Starbucks, Coca-Cola and Nike, Pace University in the United States concluded that the daily movement of a company's stock price can be predicted from the company's popularity on social media.
Researchers at the Technical University of Munich in Germany predicted individual stock movements from the information contained in Twitter messages. The university had previously carried out a study in which sentiment analysis was used to analyze Twitter messages related to certain stocks and to determine whether those messages contained information such as "bullish", "bearish" or "hold".
This type of prediction technique uses market price data and, in addition, obtains a large volume of user comment text from social media for big-data sentiment analysis in order to predict the market. The model commonly used is a deep neural network.
On this basis, this application proposes, for the problem of financial market prediction, a neural network prediction method combined with social media sentiment analysis. The method makes effective use of the data set and improves prediction accuracy.
For details, refer to Figures 1 and 2. Figure 1 is a schematic flowchart of an embodiment of the model training method based on a market price prediction model provided by this application, and Figure 2 is a schematic diagram of the overall flow of the model training method and the corresponding price prediction method.
As shown in Figure 1, the model training method based on a market price prediction model in the embodiment of this application includes the following steps:
Step S11: Obtain a data set, where the data set includes stock market data and social media data for several time periods.
In the embodiment of this application, the sources of the stock market data in the data set include but are not limited to securities brokers and exchanges. Historical price data can be collected using crawler technology or through a supplier's programming interface.
The stock market data must be K-line (candlestick) data at a specified time granularity, and its fields must include at least the opening price, closing price, highest price, lowest price and trading volume. For example, if the specified time granularity is 30 minutes, one record is produced every 30 minutes from the market open, recording the opening price, closing price, highest price, lowest price and trading volume within that time period.
Therefore, the stock market data needs to contain the following fields:
Figure PCTCN2022129587-appb-000001
The sources of the social media data in the data set include but are not limited to social media such as Weibo, Tieba and Twitter, and user comment data from trading markets. Historical social media data can be collected using crawler technology or a social media programming interface. The historical social media data needs to include at least two fields: the publication time and the published text content.
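The record layout described in step S11 can be sketched as follows. This is only an illustrative sketch: the names `Bar`, `Post`, `period_start` and `group_posts` are hypothetical, and periods are counted from midnight here for simplicity, whereas the text counts them from the market open.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Bar:
    """One K-line record for a single time period (hypothetical layout)."""
    start: datetime
    open: float
    close: float
    high: float
    low: float
    volume: float


@dataclass
class Post:
    """One social media record: publication time and text content."""
    published: datetime
    text: str


def period_start(ts: datetime, minutes: int = 30) -> datetime:
    """Floor a timestamp to the start of its period, counted from midnight."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((ts - day).total_seconds() // 60)
    return day + timedelta(minutes=(elapsed // minutes) * minutes)


def group_posts(posts, minutes=30):
    """Group posts by the time period in which they were published."""
    groups = {}
    for post in posts:
        groups.setdefault(period_start(post.published, minutes), []).append(post)
    return groups
```

Grouping posts by the same period key that identifies each K-line record is what later allows price data and sentiment data to be joined per time period.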
Step S12: Calculate historical price technical indicators based on the stock market data of each time period.
In the embodiment of this application, the historical price technical indicators used include but are not limited to one or more of the moving average convergence/divergence (MACD), the simple moving average (SMA), the relative strength index (RSI) and the money flow index (MFI).
The terminal device can calculate the MACD based on the opening price of each time period, the SMA based on the highest price of each time period, the RSI based on the lowest price of each time period, and the MFI based on the closing price of each time period.
Specifically, the historical price technical indicators are calculated as follows:
1. MACD (Moving Average Convergence/Divergence) is developed from the double exponential moving average. The fast line DIF is obtained by subtracting the slow exponential moving average (EMA26) from the fast exponential moving average (EMA12), and the MACD bar is then obtained as 2 × (DIF - DEA), where DEA is the 9-day weighted moving average of DIF. MACD has essentially the same meaning as the double moving average: the divergence and convergence of the fast and slow averages characterize the current long/short state and the likely trend of the stock price, but MACD is easier to read. Changes in MACD represent changes in the market trend, and the MACD at a given K-line level represents the buying and selling trend in the cycle of that level. Its formula is:
MACD = EMA(12-period) - EMA(26-period)
where EMA (Exponential Moving Average), also called the EXPMA indicator, is a trend indicator: a moving average whose weights decrease exponentially.
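The MACD calculation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the EMA uses the common smoothing factor 2 / (period + 1) and is seeded with the first value, and DEA is computed as an EMA of DIF. The text does not specify these details, and the function names are hypothetical.

```python
def ema(values, period):
    """Exponential moving average; smoothing factor 2 / (period + 1) is assumed."""
    alpha = 2.0 / (period + 1)
    out = []
    for v in values:
        # Seed with the first value, then blend each new value with the previous EMA.
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return (DIF, DEA, MACD bar) series for a price series."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    dif = [f - s for f, s in zip(fast_ema, slow_ema)]   # fast line: EMA12 - EMA26
    dea = ema(dif, signal)                              # assumed: EMA of DIF
    bar = [2 * (d - e) for d, e in zip(dif, dea)]       # MACD bar: 2 x (DIF - DEA)
    return dif, dea, bar
```

In this embodiment the input series would be the opening price of each time period; the function itself accepts any price series.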
2. SMA is the simple moving average, that is, the arithmetic moving average, which is the simplest and most common moving average. Its formula is:
Figure PCTCN2022129587-appb-000002
where p_n is the n-th value and n is the moving window; in the embodiment of this application, n = 30.
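As a brief illustration of the simple moving average with window n, the arithmetic mean of the most recent n values can be computed as below. The function name `sma` and the choice to return `None` until a full window is available are assumptions made for this sketch.

```python
def sma(values, n=30):
    """n-period simple moving average; None until n values are available."""
    out = []
    window_sum = 0.0
    for i, v in enumerate(values):
        window_sum += v
        if i >= n:
            window_sum -= values[i - n]   # drop the value that left the window
        out.append(window_sum / n if i >= n - 1 else None)
    return out


print(sma([1, 2, 3, 4, 5], n=3))  # [None, None, 2.0, 3.0, 4.0]
```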
3. RSI is the relative strength index, the best-known oscillator in the futures and stock markets. It estimates the strength of a market trend from the magnitude of price rises and falls, and predicts on that basis whether the trend will continue or reverse. In effect it shows upward price movement as a percentage of total price movement: a large value indicates a strong market, and a small value indicates a weak market. Its formula is:
Figure PCTCN2022129587-appb-000003
where
Figure PCTCN2022129587-appb-000004
In the embodiment of this application, n = 14.
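Since the RSI formula itself is only available as a figure, the following sketch uses a commonly cited formulation that matches the verbal description above: upward movement as a percentage of total movement over the last n price changes. It should be read as an assumption, not as the exact formula of this application; returning 50 for a flat window is also a convention chosen here.

```python
def rsi(prices, n=14):
    """RSI over the last n price changes: 100 * gains / (gains + losses)."""
    out = [None] * len(prices)
    for i in range(n, len(prices)):
        gains = losses = 0.0
        for j in range(i - n + 1, i + 1):
            change = prices[j] - prices[j - 1]
            if change > 0:
                gains += change
            else:
                losses -= change          # accumulate the magnitude of declines
        total = gains + losses
        out[i] = 100.0 * gains / total if total else 50.0  # flat window: neutral
    return out
```

A strictly rising series gives 100, a strictly falling series gives 0, and equal upward and downward movement gives 50.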
4. MFI (Money Flow Index) is a combination of the relative strength index (RSI) and the on-balance volume indicator (OBV). MFI can be used to measure the momentum of trading volume and investor interest. Because changes in trading volume provide clues to future changes in the stock price, MFI can help determine the trend of stock price changes. Its formula is:
Figure PCTCN2022129587-appb-000005
where
Figure PCTCN2022129587-appb-000006
In the embodiment of this application, n = 14.
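The MFI formula is likewise only available as a figure. The sketch below follows the widely used textbook definition (typical price = (high + low + close) / 3, money flow = typical price × volume, MFI = 100 × positive flow / total flow over n periods). It is an assumption that this matches the formula of this application, and the function name is hypothetical.

```python
def mfi(highs, lows, closes, volumes, n=14):
    """Money Flow Index over n periods, using the textbook definition."""
    typical = [(h + l + c) / 3.0 for h, l, c in zip(highs, lows, closes)]
    out = [None] * len(typical)
    for i in range(n, len(typical)):
        pos = neg = 0.0
        for j in range(i - n + 1, i + 1):
            flow = typical[j] * volumes[j]
            if typical[j] > typical[j - 1]:
                pos += flow               # money flowing in
            elif typical[j] < typical[j - 1]:
                neg += flow               # money flowing out
        total = pos + neg
        out[i] = 100.0 * pos / total if total else 50.0  # no movement: neutral
    return out
```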
At this point, the data in the data set can contain the following fields:
Figure PCTCN2022129587-appb-000007
Step S13: Calculate the historical social media market sentiment index based on the social media data of each time period.
In the embodiment of this application, the terminal device obtains the comment texts of the social media data of each time period and obtains a sentiment score for each comment text; based on the sentiment score of each comment text, it defines comment texts scoring below a preset threshold as negative comment texts and comment texts scoring above the preset threshold as positive comment texts; and it calculates the gradient sentiment divergence index of the social media data based on all positive and negative comment texts of the social media data of each time period.
Further, the terminal device can obtain the difference between each comment text of the social media data and the gradient sentiment divergence index, and calculate the gradient bull market sentiment index of the social media data using the differences corresponding to all comment texts and the number of all comment texts.
Specifically, the terminal device calculates the historical social media market sentiment index as follows:
The terminal device scores the sentiment of each comment text in the social media data of each time period. Tools for sentiment scoring include but are not limited to VADER. Scoring a comment text yields a value between -1 and 1 that represents the sentiment of that comment text.
The terminal device can then define comment texts with a sentiment score between -1 and 0 as negative comment texts, and comment texts with a sentiment score between 0 and 1 as positive comment texts.
Specifically, the terminal device uses the positive and negative comment texts so defined to calculate the gradient bull market sentiment index (Small Granular Sentiment Bullish Index, SGSBI) and the gradient sentiment divergence index (Small Granular Sentimental Divergence Index, SGSDI).
The gradient sentiment divergence index is calculated as:
Figure PCTCN2022129587-appb-000008
where
Figure PCTCN2022129587-appb-000009
is the sum of the sentiment indices of all positive comments in time period t, and
Figure PCTCN2022129587-appb-000010
is the sum of the absolute values of the sentiment indices of all negative comments in time period t.
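The two quantities defined above can be computed per time period as in the following sketch. It covers only the classification of comments and the two sums; the index formulas themselves are given as figures and are not reproduced here. The function name, the default threshold of 0, and the exclusion of scores exactly equal to the threshold are assumptions.

```python
def split_sentiment(scores, threshold=0.0):
    """Return (positive_sum, negative_abs_sum) for the scores of one time period.

    Scores above the threshold count as positive comments, scores below it as
    negative comments; scores equal to the threshold are ignored (assumption).
    """
    positive_sum = sum(s for s in scores if s > threshold)
    negative_abs_sum = sum(abs(s) for s in scores if s < threshold)
    return positive_sum, negative_abs_sum


# Example with sentiment scores in [-1, 1] for one 30-minute period.
print(split_sentiment([0.5, -0.25, 0.25, 0.0, -0.75]))  # (0.75, 1.0)
```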
The gradient bull market sentiment index is calculated as:
Figure PCTCN2022129587-appb-000011
where D(t) is the set of sentiment indices of all comments in time period t.
Step S14: Associate and merge the historical price technical indicators and the historical social media market sentiment index of the same time period, and input them as features into the market price prediction model to be trained.
In the embodiment of this application, the terminal device associates and merges the price data, technical indicators and sentiment indices of the same time period according to the time granularity interval to which the data belong. The data set then contains the following attributes:
Figure PCTCN2022129587-appb-000012
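The association and merging in step S14 amounts to a join on the time period. A minimal sketch, assuming each input is a mapping from period key to a dictionary of fields (a hypothetical layout) and that only periods present in all three inputs are kept:

```python
def merge_features(bars, indicators, sentiment):
    """Inner-join three {period: {field: value}} mappings on the period key."""
    merged = []
    for period in sorted(bars):
        if period in indicators and period in sentiment:
            row = {"period": period}
            row.update(bars[period])
            row.update(indicators[period])
            row.update(sentiment[period])
            merged.append(row)
    return merged
```

Dropping periods that lack any of the three inputs is one possible policy; filling gaps (for example, a neutral sentiment value for periods without comments) would be another, and the text does not say which is used.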
Step S15: Train the market price prediction model using the prediction output of the market price prediction model to be trained, to obtain the final market price prediction model.
In the embodiment of this application, the terminal device takes the merged data set as the feature input and adds a prediction target column. In the market price prediction model of this embodiment, the prediction target is the closing price of the next time period. For details, refer to Figure 3, which is a schematic diagram of the market price prediction model provided by this application predicting the closing price.
As shown in Figure 3, the terminal device can input the data set from September 24, 2017 to September 27, 2017 into the market price prediction model with the time granularity set to 1 day, and the prediction result output by the model is the closing price on September 29, 2017. The terminal device can also input the data set from 00:00:00 to 02:00:00 on September 24, 2017 into the market price prediction model with the time granularity set to 30 minutes, and the prediction result output by the model is the closing price at 02:30:00 on September 24, 2017.
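Adding the prediction target column can be sketched as a sliding window over the merged rows: each sample holds the features of `window` consecutive periods, and its target is the closing price of the period that follows, as in the 30-minute example above. The function name and row layout are hypothetical.

```python
def make_samples(rows, window, feature_keys, target_key="close"):
    """Build (features, target) pairs from time-ordered rows.

    features: the selected fields of `window` consecutive periods
    target:   the closing price of the period immediately after the window
    """
    samples = []
    for end in range(window, len(rows)):
        x = [[row[k] for k in feature_keys] for row in rows[end - window:end]]
        y = rows[end][target_key]
        samples.append((x, y))
    return samples
```

With a window of 5 at 30-minute granularity, the records from 00:00 to 02:00 form one input and the 02:30 closing price is its target.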
For the data set in step S11, the terminal device can split the data set into a training set and a test set, typically using 80% as the training set and 20% as the test set, as shown in Figure 4, which is a schematic framework diagram of an embodiment of the data set provided by this application.
As shown in Figure 4, the data set of the embodiment of this application includes data collected from 00:00:00 on September 24, 2017 to 23:59:59 on November 30, 2020. The terminal device uses the data from 00:00:00 on September 24, 2017 to 23:59:59 on April 11, 2020 as the training set, and the data from 00:00:00 on April 12, 2020 to 23:59:59 on November 30, 2020 as the test set.
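The split shown in Figure 4 is chronological: the earliest 80% of the records form the training set and the latest 20% form the test set. A minimal sketch, assuming the rows are already sorted by time (the function name is hypothetical):

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into (training set, test set) without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


train, test = chronological_split(list(range(10)))
print(train, test)  # [0, 1, 2, 3, 4, 5, 6, 7] [8, 9]
```

Splitting by time rather than at random keeps every test record later than every training record, so the model is evaluated only on periods it has not seen.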
The training set in the embodiment of this application is used only to train the model. A deep neural network model is generally chosen, such as a Long Short-Term Memory (LSTM) model, and the training parameters need to be tuned to the data. More complex ensemble models can also be used; their training time may be longer, but they generally perform slightly better than a single deep neural network model. The test set is used to test the trained market price prediction model.
The market price prediction model can use various regressors, including but not limited to linear regression, deep neural networks and ensemble regressors. The evaluation metrics for the market price prediction model can include the mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), coefficient of determination (R², R squared) and mean absolute percentage error (MAPE). The market price prediction model is saved once each of these evaluation metrics falls within an acceptable range.
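The evaluation metrics listed above can be computed as in the following sketch, which uses their standard definitions. The function name is hypothetical; MAPE assumes no true value is zero, and R² assumes the true values are not all identical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, MAPE (in percent) and R^2 for two equal-length lists."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - (mse * n) / ss_tot                        # 1 - SS_res / SS_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}
```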
To verify the effectiveness and advancement of the technical approach proposed in this patent, extensive experiments were conducted on real Twitter and Bitcoin data sets to evaluate the performance of the proposed method.
The model was evaluated using three metrics, the mean absolute error (MAE), root mean square error (RMSE) and mean absolute percentage error (MAPE), and compared with existing models. The experimental results consistently show that the method of this patent predicts prices more accurately, can be widely applied to predicting the prices of financial products, and has broad application prospects.
Table 1 and Table 2 below report the prediction errors of two models at two different time granularities, where LSTM is the Long Short-Term Memory model and GRU is the Gated Recurrent Unit model. The simulation used Bitcoin price data and tweets under the #Bitcoin topic on Twitter from September 24, 2017 to November 30, 2020, with 80% of the data as the training set and 20% as the test set, and produced the following results.
Table 1: Model prediction performance at 30-minute time granularity
Figure PCTCN2022129587-appb-000013
Table 2: Model prediction performance at 1-day time granularity
Figure PCTCN2022129587-appb-000014
The embodiment of this application exploits the continuously updated nature of social media, using it in place of news data, so that the time granularity of prediction can be narrowed further and prediction approaches real time. This application also provides a new method for calculating the market sentiment index within a time period; compared with current simple statistical methods, it reflects market sentiment more specifically and improves model accuracy.
Based on the model training method of the above embodiment, this application further proposes a price prediction method based on a market price prediction model. For details, refer to Figure 5, which is a schematic flowchart of an embodiment of the price prediction method based on a market price prediction model provided by this application.
如图5所示,本申请实施例的价格预测方法具体包括以下步骤:As shown in Figure 5, the price prediction method in the embodiment of this application specifically includes the following steps:
步骤S21:获取当前时间段的股市数据,以及获取当前时间段的所有社交评论数据。Step S21: Obtain the stock market data of the current time period and obtain all social comment data of the current time period.
步骤S22:基于所述股市数据,计算所述当前时间段的价格技术指标。Step S22: Calculate price technical indicators of the current time period based on the stock market data.
步骤S23:基于所有社交评论数据,计算当前时间段的社交媒体市场情感指数。Step S23: Calculate the social media market sentiment index of the current time period based on all social comment data.
步骤S24:将当前时间段的价格技术指标和社交媒体市场情感指数进行关联合并,并作为特征输入到预先训练的市场价格预测模型。Step S24: Correlate and merge the price technical indicators and the social media market sentiment index of the current time period, and input them as features into the pre-trained market price prediction model.
步骤S25:基于市场价格预测模型的输出,获取当前时间段以后的预测价格。Step S25: Based on the output of the market price prediction model, obtain the predicted price after the current time period.
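Steps S21 to S25 can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the application's implementation: the `Candle` type, the function names, and the `model` callable are hypothetical. The only technical indicator shown is a simple moving average, and `placeholder_sentiment_index` is a plain mean that merely stands in for the application's sentiment index, whose formula is not reproduced in this section.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candle:
    """Market data for one time period (S21 input)."""
    open: float
    high: float
    low: float
    close: float
    volume: float


def simple_moving_average(values: Sequence[float], period: int) -> float:
    """Mean of the most recent `period` values."""
    if period <= 0 or len(values) < period:
        raise ValueError("need at least `period` values and period > 0")
    return sum(values[-period:]) / period


def placeholder_sentiment_index(scores: Sequence[float]) -> float:
    """Stand-in only: a plain mean, NOT the application's sentiment index."""
    return sum(scores) / len(scores) if scores else 0.0


def predict_next_price(
    candles: Sequence[Candle],
    sentiment_scores: Sequence[float],
    model: Callable[[List[float]], float],
    sma_period: int = 3,
) -> float:
    """Run S22 to S25 on data already collected in S21."""
    # S22: price technical indicator; the SMA is taken over period highs here.
    highs = [c.high for c in candles]
    technical = [simple_moving_average(highs, sma_period)]
    # S23: sentiment feature for the current period (placeholder formula).
    sentiment = [placeholder_sentiment_index(sentiment_scores)]
    # S24: merge both feature groups into one feature vector.
    features = technical + sentiment
    # S25: the pre-trained model maps the features to a predicted price.
    return model(features)
```

In practice `model` would wrap a trained LSTM or GRU; any callable that accepts the merged feature vector fits this outline.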
Refer next to Figure 6, which is a schematic structural diagram of an embodiment of the terminal device provided by this application. The terminal device 500 of this embodiment includes a processor 51, a memory 52, an input/output device 53, and a bus 54.
The processor 51, the memory 52, and the input/output device 53 are each connected to the bus 54. The memory 52 stores program data, and the processor 51 executes the program data to implement the model training method and/or the price prediction method described in the above embodiments.
In the embodiments of this application, the processor 51 may also be called a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capability. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or the processor 51 may be any conventional processor.
This application also provides a computer storage medium. Refer to Figure 7, which is a schematic structural diagram of an embodiment of the computer storage medium provided by this application. The computer storage medium 600 stores program data 61 which, when executed by a processor, implements the model training method and/or the price prediction method of the above embodiments.
When the embodiments of this application are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (10)

  1. A model training method based on a market price prediction model, characterized in that the model training method comprises:
    obtaining a data set, wherein the data set includes stock market data and social media data for several time periods;
    calculating historical price technical indicators based on the stock market data of each time period;
    calculating a historical social media market sentiment index based on the social media data of each time period;
    associating and merging the historical price technical indicators and the historical social media market sentiment index of the same time period, and inputting them as features into the market price prediction model to be trained;
    training the market price prediction model using the prediction output of the market price prediction model to be trained, to obtain the final market price prediction model.
  2. The model training method according to claim 1, characterized in that:
    the stock market data includes an opening price, a highest price, a lowest price, a closing price, and/or a trading volume.
  3. The model training method according to claim 2, characterized in that:
    the historical price technical indicators include a moving average convergence divergence (MACD), a simple moving average, a relative strength index, and/or a money flow index;
    the calculating historical price technical indicators based on the stock market data of each time period comprises:
    calculating the moving average convergence divergence based on the opening price of each time period;
    and/or, calculating the simple moving average based on the highest price of each time period;
    and/or, calculating the relative strength index based on the lowest price of each time period;
    and/or, calculating the money flow index based on the closing price of each time period.
  4. The model training method according to claim 1, characterized in that:
    the calculating a historical social media market sentiment index based on the social media data of each time period comprises:
    obtaining the comment texts of the social media data of each time period, and obtaining a sentiment score for each comment text;
    based on the sentiment score of each comment text, defining comment texts below a preset threshold as negative comment texts and comment texts above the preset threshold as positive comment texts;
    calculating a gradient sentiment divergence index of the social media data based on all positive comment texts and negative comment texts of the social media data of each time period.
  5. The model training method according to claim 4, characterized in that:
    after the calculating the gradient sentiment divergence index of the social media data, the model training method further comprises:
    obtaining the difference between each comment text of the social media data and the gradient sentiment divergence index;
    calculating a gradient bull market sentiment index of the social media data using the differences corresponding to all comment texts of the social media data and the number of all comment texts.
  6. The model training method according to claim 5, characterized in that:
    the associating and merging the historical price technical indicators and the historical social media market sentiment index of the same time period comprises:
    associating and merging the opening price of the same time period with the moving average convergence divergence and the gradient sentiment divergence index;
    associating and merging the highest price of the same time period with the simple moving average and the gradient bull market sentiment index.
  7. The model training method according to claim 1, characterized in that:
    the obtaining a data set comprises:
    collecting stock market data using crawler technology or a provider's programming interface;
    and/or, collecting social media data using crawler technology or a social media programming interface.
  8. A price prediction method based on a market price prediction model, characterized in that the price prediction method comprises:
    obtaining stock market data of the current time period, and obtaining all social comment data of the current time period;
    calculating price technical indicators of the current time period based on the stock market data;
    calculating a social media market sentiment index of the current time period based on all the social comment data;
    associating and merging the price technical indicators and the social media market sentiment index of the current time period, and inputting them as features into a pre-trained market price prediction model;
    obtaining a predicted price for the period after the current time period based on the output of the market price prediction model;
    wherein the market price prediction model is trained by the model training method according to any one of claims 1 to 7.
  9. A terminal device, characterized in that the terminal device comprises a memory and a processor coupled to the memory;
    wherein the memory is configured to store program data, and the processor is configured to execute the program data to implement the model training method according to any one of claims 1 to 7 and/or the price prediction method according to claim 8.
  10. A computer storage medium, characterized in that the computer storage medium is configured to store program data which, when executed by a computer, implements the model training method according to any one of claims 1 to 7 and/or the price prediction method according to claim 8.
PCT/CN2022/129587 2022-07-28 2022-11-03 Model training method, price prediction method, terminal device and storage medium WO2024021354A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210899612.3 2022-07-28
CN202210899612.3A CN115423499A (en) 2022-07-28 2022-07-28 Model training method, price prediction method, terminal device, and storage medium

Publications (1)

Publication Number Publication Date
WO2024021354A1 true WO2024021354A1 (en) 2024-02-01

Family

ID=84197284

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/129587 WO2024021354A1 (en) 2022-07-28 2022-11-03 Model training method, price prediction method, terminal device and storage medium

Country Status (2)

Country Link
CN (1) CN115423499A (en)
WO (1) WO2024021354A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116611696B (en) * 2023-07-19 2024-01-26 北京大学 Digital asset market risk prediction system based on time sequence analysis
CN117635179A (en) * 2023-07-25 2024-03-01 北京壹清能环科技有限公司 Carbon transaction price prediction method, device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778215A (en) * 2014-01-17 2014-05-07 北京理工大学 Stock market forecasting method based on sentiment analysis and hidden Markov fusion model
CN105022825A (en) * 2015-07-22 2015-11-04 中国人民解放军国防科学技术大学 Financial variety price prediction method capable of combining financial news mining and financial historical data
CN106384166A (en) * 2016-09-12 2017-02-08 中山大学 Deep learning stock market prediction method combined with financial news
CN113435204A (en) * 2021-02-02 2021-09-24 上海卡方信息科技有限公司 Stock price fluctuation prediction method based on news information
US11238535B1 (en) * 2017-09-14 2022-02-01 Wells Fargo Bank, N.A. Stock trading platform with social network sentiment

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103778215A (en) * 2014-01-17 2014-05-07 北京理工大学 Stock market forecasting method based on sentiment analysis and hidden Markov fusion model
CN105022825A (en) * 2015-07-22 2015-11-04 中国人民解放军国防科学技术大学 Financial variety price prediction method capable of combining financial news mining and financial historical data
CN106384166A (en) * 2016-09-12 2017-02-08 中山大学 Deep learning stock market prediction method combined with financial news
US11238535B1 (en) * 2017-09-14 2022-02-01 Wells Fargo Bank, N.A. Stock trading platform with social network sentiment
CN113435204A (en) * 2021-02-02 2021-09-24 上海卡方信息科技有限公司 Stock price fluctuation prediction method based on news information

Also Published As

Publication number Publication date
CN115423499A (en) 2022-12-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22952791

Country of ref document: EP

Kind code of ref document: A1