WO2018094777A1 - Method for real-time correlation analysis of securities trading - Google Patents

Method for real-time correlation analysis of securities trading

Info

Publication number
WO2018094777A1
WO2018094777A1 PCT/CN2016/109349 CN2016109349W WO2018094777A1 WO 2018094777 A1 WO2018094777 A1 WO 2018094777A1 CN 2016109349 W CN2016109349 W CN 2016109349W WO 2018094777 A1 WO2018094777 A1 WO 2018094777A1
Authority
WO
WIPO (PCT)
Prior art keywords
time
data
securities
real
analysis
Prior art date
Application number
PCT/CN2016/109349
Other languages
English (en)
French (fr)
Inventor
郑锐韬
李勇波
孙傲冰
张恒
季统凯
Original Assignee
国云科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 国云科技股份有限公司
Publication of WO2018094777A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Definitions

  • The invention relates to the field of big data processing and analysis, and provides a method for real-time correlation analysis of securities trading.
  • Real-time trading information in the securities market is abundant, high in volume, and generated by frequent transactions, and these transactions contain many linked movements. For short-term traders, the linkage between securities is a valuable reference: the sooner a trader learns of correlated movements, the sooner a buy or sell decision can be made in a fast-changing market, and the better the chance of profit. A person, however, has limited means of taking in information and can hardly absorb a large number of abnormal movements in a short time. Take A-shares on the Chinese securities market as an example: as of November 2016 there were about 3,000 stocks and 4 hours of trading per day, so sampling trading information every 3 seconds yields roughly 4.5 GB of data per day. Most of this data reflects ordinary movement with no abnormality or correlation and can be filtered out, but a person cannot process that much data in time. With this method, the automatic processing capability of a computer helps traders extract useful correlated abnormal-movement information from a large volume of trading information.
  • Real-time securities analysis involves large data volumes and strict real-time requirements. When the data volume is large, reading and writing through a traditional relational database falls far short of the required performance, so real-time requirements cannot be met.
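The daily volume quoted in the background can be sanity-checked with a little arithmetic. The sketch below takes its inputs from the text (about 3,000 stocks, 4 trading hours, one snapshot every 3 seconds, roughly 4.5 GB per day). The bytes-per-snapshot figure it derives is not stated in the source, and reading 1 GB as 10^9 bytes is an assumption.

```python
def snapshots_per_day(num_securities: int, trading_hours: int, interval_s: int) -> int:
    """Number of snapshots collected across all securities in one trading day."""
    per_security = trading_hours * 3600 // interval_s
    return num_securities * per_security


def implied_bytes_per_snapshot(total_bytes: float, snapshots: int) -> float:
    """Average record size implied by a daily total (derived, not from the source)."""
    return total_bytes / snapshots


if __name__ == "__main__":
    n = snapshots_per_day(3000, 4, 3)
    print(n)                                      # 14400000
    print(implied_bytes_per_snapshot(4.5e9, n))   # 312.5
```

At about 14.4 million snapshots per day, the quoted 4.5 GB corresponds to a few hundred bytes per snapshot, which is plausible for a quote record carrying five bid and five ask levels.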
  • The technical problem solved by the invention is to provide a MongoDB-based method for real-time correlation analysis of securities trading. The method overcomes the low query efficiency and slow response of traditional relational storage when analyzing large data volumes and, by using multithreading, acquires and analyzes a large amount of securities data within a short time.
  • The technical solution of the invention is a method comprising the following steps:
  • Step 1: Use MongoDB as the data store. Set up a standalone MongoDB instance, or a cluster formed of multiple MongoDB instances, to store and read securities data after real-time acquisition;
  • Step 2: Obtain the detailed information of each security, for use in real-time acquisition and analysis of securities data;
  • Step 3: Design a permanent storage table and a temporary storage table. Every batch of acquired data is stored in the permanent storage table for subsequent verification and analysis; the temporary storage table holds data for half an hour and supplies the data for short-term correlation analysis;
  • Step 4: Set an acquisition interval, acquire real-time securities data at each interval point, and after deduplication store the acquired data in both the permanent storage table and the temporary storage table;
  • Step 5: Query the data in the temporary storage table, analyze the behavior of each security over several time periods, and for the abnormalities that occur simultaneously within each time period, output the corresponding abnormal-movement type and security code, forming a real-time correlation analysis of securities that serves as guidance for trading.
  • The storage and read operations include: setting up a standalone MongoDB instance or a MongoDB cluster to store the historical data and temporary data produced by real-time acquisition; where the historical data is large, setting up a MongoDB cluster and reading the historical data by date partition.
  • The detailed information includes the code (with prefix), name, share capital, and major shareholders' holding ratio.
  • Step 3 specifically comprises:
  • Sub-step 1: Design a permanent storage table in MongoDB to hold the continuously accumulating history of real-time trading data, partitioned by date and time and placed in different data storage spaces. For a cluster deployment, hash storage on the security code can be added, so that the data of different securities is hash-distributed across multiple servers by security code;
  • Sub-step 2: For real-time analysis, design a temporary storage table in MongoDB holding the securities data of the last half hour. Use MongoDB's TTL index: create a TTL index on a time column of the temporary table with an expiry of half an hour, so that data in the temporary table is deleted automatically once it reaches that age.
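The two storage tables could be prepared roughly as follows. This is a minimal sketch, not the patent's implementation. It assumes `db` and `client` behave like a pymongo `Database` and `MongoClient`; the collection name `ticks_recent`, the field name `fetched_at`, and the namespace `market.ticks_history` are hypothetical.

```python
from datetime import datetime, timezone

HALF_HOUR_SECONDS = 30 * 60  # retention of the temporary table, per the text


def setup_temp_table(db, name="ticks_recent", time_field="fetched_at"):
    """Create the TTL index on the temporary collection.

    `db` is assumed to behave like a pymongo Database; the collection and
    field names are hypothetical.
    """
    coll = db[name]
    # A TTL index is a single-field index on a field that holds dates.
    coll.create_index([(time_field, 1)], expireAfterSeconds=HALF_HOUR_SECONDS)
    return coll


def shard_history_by_code(client, namespace="market.ticks_history"):
    """Hash-distribute the permanent collection by security code (cluster only)."""
    database = namespace.split(".", 1)[0]
    client.admin.command("enableSharding", database)
    client.admin.command("shardCollection", namespace, key={"code": "hashed"})


def make_tick(code, price, volume, now=None):
    """Build a document whose time field is a datetime, as the TTL index requires."""
    return {
        "code": code,
        "price": price,
        "volume": volume,
        "fetched_at": now or datetime.now(timezone.utc),
    }
```

MongoDB removes expired documents with a background task that runs about once a minute, so a document may remain slightly longer than thirty minutes; queries that need an exact window should still filter on the time field.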
  • Step 4 specifically comprises:
  • Sub-step 1: Call the real-time securities data interface using the previously obtained security codes and prefixes. For all securities, initialize the real-time data calls from the existing detailed information, run the real-time acquisition program, and send requests to the interface concurrently from multiple threads to obtain the data;
  • Sub-step 2: The real-time data acquired by the threads includes the day's opening price, highest price, lowest price, current price, trading volume, traded amount, the quantities and prices of the five best bid levels, and the quantities and prices of the five best ask levels. After obtaining the real-time information of a security, each acquisition thread parses it and checks whether it is identical to the most recently acquired information; if it is, the record already exists and is not saved again;
  • Sub-step 3: The deduplicated real-time data is saved in both the permanent storage table and the temporary storage table, for subsequent historical queries and real-time correlation analysis;
  • The acquisition interval can be configured as once every 3 seconds or every 5 seconds; a higher acquisition frequency places higher demands on storage and on system processing.
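The acquisition step, with its check against the most recent snapshot, can be sketched as below. The `fetch` callable stands in for the real-time data interface, whose actual API the text does not describe, and two plain lists stand in for the permanent and temporary tables; all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class Acquirer:
    """Polls a quote source and stores only snapshots that changed."""

    def __init__(self, fetch, permanent, temporary):
        self.fetch = fetch          # hypothetical callable: code -> dict of quote fields
        self.permanent = permanent  # lists standing in for the two storage tables
        self.temporary = temporary
        self.last_seen = {}         # code -> most recently stored snapshot

    def _store_if_new(self, code, now):
        snapshot = self.fetch(code)
        if self.last_seen.get(code) == snapshot:
            return False            # identical to the previous fetch: skip
        self.last_seen[code] = dict(snapshot)
        record = dict(snapshot, code=code, fetched_at=now)
        self.permanent.append(record)
        self.temporary.append(dict(record))
        return True

    def poll_once(self, codes, now, workers=8):
        """Fetch every code concurrently; return how many records were stored."""
        # Each code is handled by exactly one task, so no two threads
        # touch the same last_seen entry during a poll.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(lambda c: self._store_if_new(c, now), codes))
```

Calling `poll_once` every 3 or 5 seconds gives the behavior described: an unchanged quote produces no new record, while a changed quote is written to both stores. The fetch timestamp is kept out of the comparison so that it does not make every snapshot look new.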
  • The real-time abnormal-movement correlation analysis of Step 5 specifically comprises:
  • Sub-step 1: Divide all securities among multiple threads. Each thread analyzes several time periods and several abnormal-movement types, and the abnormal movements found are gathered into a single display area;
  • Sub-step 2: Within each thread, read the current trading information of its securities from the temporary table, and read the real-time trading information for each of the time periods. Analyze the rise, the fall, limit-up locks, limit-down locks, and turnover; within each time period, sort the securities by degree of abnormality and output the top-ranked ones to a unified collector program;
  • Sub-step 3: Each thread outputs the abnormal movements found within a given time interval to the unified collector program. The collector merges the data from all threads, including abnormal-movement type and security code, sorts it as a whole, and then, in sorted order, extracts for each time interval the correlated rises, falls, limit-up locks, limit-down locks, and rapid turnover, and sends them to the display area, thereby achieving correlation analysis of the securities in each time interval;
  • Sub-step 4: The time periods can be configured by the user as needed; further abnormality types can be added later; the thresholds for rise, fall, turnover rate, and so on can be configured by the user as needed; and every parameter can be adjusted to market conditions and tuned to its best setting.
  • The abnormalities can be configured as: rapid rise, rapid fall, limit-up lock, limit-down lock, and rapid increase in trading volume.
  • The time periods can be configured as: the previous 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, and 15 minutes.
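A per-window analysis of this kind might look like the following sketch. It covers only price rise and fall, not limit-up or limit-down locks or turnover. The window lengths come from the text; the 1% threshold, the label names, and the input layout (sorted timestamps in seconds with matching prices) are assumptions made for illustration.

```python
from bisect import bisect_right

WINDOWS_S = (30, 60, 120, 300, 600, 900)  # look-back periods listed in the text


def price_at_or_before(times, prices, t):
    """Latest known price at time t, or None if no sample is that old."""
    i = bisect_right(times, t) - 1
    return prices[i] if i >= 0 else None


def window_changes(times, prices, windows=WINDOWS_S):
    """Percentage change of the latest price against each look-back window."""
    now, last = times[-1], prices[-1]
    changes = {}
    for w in windows:
        ref = price_at_or_before(times, prices, now - w)
        if ref:  # skip windows with no reference price (or a zero price)
            changes[w] = (last - ref) / ref * 100.0
    return changes


def top_movers(series_by_code, window, n=3, threshold_pct=1.0):
    """Rank securities by absolute change in one window, keeping the top n."""
    rows = []
    for code, (times, prices) in series_by_code.items():
        change = window_changes(times, prices, (window,)).get(window)
        if change is not None and abs(change) >= threshold_pct:
            kind = "rapid_rise" if change > 0 else "rapid_fall"
            rows.append((code, kind, change))
    rows.sort(key=lambda r: abs(r[2]), reverse=True)
    return rows[:n]
```

Each returned row carries the security code and the movement type, which are the two items the text says are passed on for collection.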
  • Acquiring the real-time data and performing the correlation analysis are two separately running programs, each completing within the 3-second or 5-second acquisition interval. For a large number of securities, data acquisition and analysis are each multithreaded, and when a thread is assigned too many securities, the work can be split across more threads.
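Splitting the securities among threads and merging their results in a collector can be sketched as follows. The `change_of` callable is a hypothetical stand-in for whatever per-security metric a thread computes, and ranking by absolute value is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(codes, n_threads):
    """Split the codes into n_threads slices of nearly equal size."""
    return [codes[i::n_threads] for i in range(n_threads)]


def analyse_chunk(codes, change_of, top_n):
    """Work done by one thread: rank its own securities and keep the top few."""
    rows = [(code, change_of(code)) for code in codes]
    rows.sort(key=lambda r: abs(r[1]), reverse=True)
    return rows[:top_n]


def collect(codes, change_of, n_threads=4, top_n=3):
    """Collector: merge every thread's shortlist and rank the union."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        partials = pool.map(
            lambda part: analyse_chunk(part, change_of, top_n),
            chunk(codes, n_threads),
        )
        merged = [row for part in partials for row in part]
    merged.sort(key=lambda r: abs(r[1]), reverse=True)
    return merged[:top_n]
```

Because every thread forwards its own top `top_n`, the overall top `top_n` is always contained in the merged list, so the collector only has to sort a short list rather than every security.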
  • The method may further include Step 6: once the real-time trading data has been saved in the permanent table, a subsequent review analysis can examine trading correlations over larger time intervals; with a suitable model, market data from the last several days or longer can be analyzed, providing a broader correlation analysis of securities trading.
  • The beneficial effects of the invention are as follows:
  • 1. The method uses MongoDB's efficient storage, rich query support, and variety of index types, treating MongoDB as an efficient data store for storing and retrieving real-time trading data. It overcomes the low query efficiency and slow response of traditional relational storage in large-volume data analysis and, through multithreading, acquires and analyzes a large amount of securities data and produces results within a short time, providing an efficient way to capture abnormal movements promptly in a fast-changing market.
  • 2. The method selects MongoDB because it is a NoSQL database that is memory-based, keeping hot data in physical memory to speed up reads and writes, and because its TTL index quickly deletes data that has exceeded its time limit, managing stale data efficiently for real-time analysis. With good big-data processing performance, it supports efficient real-time correlation analysis of securities trading, captures sudden abnormal movements amid large trading volumes, and gives traders an efficient method of correlated abnormal-movement analysis to support trading decisions and reveal trading opportunities.
  • FIG. 1 is a flowchart of the functional components of the computer software system of the invention.
  • Referring to FIG. 1, the implementation of each step of the flow is described below:
  • Step 1: Set up the MongoDB environment, creating a standalone MongoDB instance or a cluster for storing and retrieving real-time data;
  • Step 2: On that MongoDB deployment, design a permanent storage table in which every batch of acquired data is stored for subsequent verification and analysis; also design a temporary storage table valid for half an hour and create a TTL index on it, so that MongoDB automatically deletes data older than half an hour. The temporary table supplies the data for short-term correlation analysis;
  • Step 3: Initialize the detailed information of each security, such as its code (with prefix), name, share capital, and major shareholders' holding ratio, for use during real-time acquisition and analysis;
  • Step 4: Start the program. Every 3 or 5 seconds, acquire real-time securities data using multiple threads, deduplicate it, and store it in both the permanent storage table and the temporary storage table;
  • Step 5: The analysis program, working from the timestamp of the shortest analysis interval, queries the temporary storage table from multiple threads and analyzes the behavior of each security over the past 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, 15 minutes, and so on. For abnormalities occurring simultaneously within a time interval, such as rapid rise, rapid fall, limit-up lock, limit-down lock, and rapid increase in trading volume, it outputs the corresponding abnormal-movement type and security code to a unified summary program. After collecting the types and codes from all threads, the summary program sorts them, applies the abnormal-movement thresholds, and sends the result to the display area, forming a real-time correlation analysis of securities that serves as guidance for trading;
  • Step 6: Once the real-time trading data has been saved in the permanent table, a subsequent review analysis can examine trading correlations over larger time intervals; with a suitable model, market data from the last several days or longer can be analyzed, providing a broader correlation analysis of securities trading.
  • A standalone MongoDB instance or a MongoDB cluster is set up to store the historical data and temporary data produced by real-time acquisition;
  • MongoDB is memory-based and keeps hot data in physical memory to speed up reads and writes, which is a great advantage for real-time securities analysis with strict latency requirements, so MongoDB is chosen for efficient real-time analysis;
  • Where the historical data is large, read efficiency can be improved by setting up a cluster and partitioning the historical data by date.
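The text does not say how the date partitioning is carried out. One simple possibility, shown here purely as an assumption, is one collection per trading day, so that a historical query only opens the collections for the days it needs. The `ticks_` prefix is hypothetical.

```python
from datetime import date, timedelta


def history_collection(day: date, prefix: str = "ticks_") -> str:
    """Name of the collection that holds one day's history."""
    return f"{prefix}{day:%Y%m%d}"


def collections_for_range(start: date, end: date, prefix: str = "ticks_"):
    """Collections covering an inclusive date range, oldest first."""
    days = (end - start).days
    return [history_collection(start + timedelta(days=d), prefix) for d in range(days + 1)]


if __name__ == "__main__":
    print(collections_for_range(date(2016, 11, 30), date(2016, 12, 2)))
```

A range query over several days then reads only the named collections instead of scanning the whole history.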
  • Calling the real-time data interface requires the security codes and prefixes, and the analysis requires information such as share capital and major shareholders' holdings. The securities information must therefore be initialized before the acquisition program runs, so that complete details are available for acquisition and analysis.
  • The two storage tables are created in MongoDB as follows:
  • Sub-step 1: Design a table in MongoDB for real-time securities trading data, used to store historical data that accumulates continuously. Because the historical data is large and must be read later, this permanent table is partitioned by date and time and placed in different data storage spaces. If deployed on a cluster, hash storage on the security code can be added, so that the data of different securities is hash-distributed across multiple servers by security code;
  • Sub-step 2: For real-time analysis, data from the last half hour is generally enough to establish the real-time picture, so a temporary table is created to hold real-time securities data. The temporary table uses MongoDB's TTL (time-to-live) index: a TTL index is created on a time column with an expiry of half an hour, and the table's data is deleted automatically once it reaches that age.
  • The securities data is acquired as follows:
  • Sub-step 1: For all securities, initialize the real-time data calls from the existing detailed information, and send requests to the interface concurrently from multiple threads to obtain the data;
  • Sub-step 2: The real-time information acquired by the threads includes the day's opening price, highest price, lowest price, current price, trading volume, traded amount, the quantities and prices of the five best bid levels, and the quantities and prices of the five best ask levels. After obtaining the real-time information of a security, each acquisition thread parses it and checks whether it is identical to the most recently acquired information; if it is, the record already exists and is not saved again;
  • Sub-step 3: The deduplicated real-time data is saved in both the permanent table and the temporary table, for subsequent historical queries and real-time correlation analysis;
  • Sub-step 4: The acquisition interval can be configured as once every 3 or 5 seconds. The frequency is chosen mainly by weighing data storage against the real-time requirements: a high acquisition frequency demands more of data storage and of system processing.
  • The real-time abnormal-movement correlation analysis is carried out as follows:
  • Sub-step 1: Divide all securities among multiple threads. Each thread analyzes several time intervals and several abnormal-movement types, and the abnormal movements found are gathered into a single display area;
  • Sub-step 2: Within each thread, read the current trading information of its securities from the temporary table and, working back from the timestamp obtained, read the real-time trading information from 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, and 15 minutes earlier. Analyze the rise, the fall, limit-up locks, limit-down locks, and turnover; within each time period, sort the securities with large rises, large falls, firm limit-up locks, firm limit-down locks, and sharply increased turnover, and output the top-ranked ones to a unified collector program;
  • Sub-step 3: After each thread has output the abnormal movements found within a given time interval, the collector program merges the data from all threads, sorts it as a whole, and then, in sorted order, extracts for each time interval the correlated rises, falls, limit-up locks, limit-down locks, and rapid turnover, and sends them to the display area, thereby achieving correlation analysis of the securities in each time interval;
  • Sub-step 4: The time periods can be configured by the user as needed; further abnormal-movement types can be added later; the thresholds for rise, fall, turnover rate, and so on can also be configured by the user as needed; and every parameter can be adjusted to market conditions and tuned to its best setting.
  • In the correlation analysis of real-time trading data, acquiring the data and analyzing the correlations run as two separate programs, and for a large number of securities each of them is multithreaded. Real-time analysis is achieved as long as acquisition and analysis complete within the 3-second or 5-second acquisition interval. The division of securities among threads therefore depends on how many securities each thread receives: if a thread has too many, the work is split across more threads so that acquisition and analysis finish within the interval.
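The thread count needed to finish within the acquisition interval can be estimated with integer arithmetic. In this sketch the per-security cost of 20 ms and the 80% headroom are illustrative assumptions, not figures from the source; only the 3,000 securities and the 3-second and 5-second intervals come from the text.

```python
def threads_needed(num_securities: int, ms_per_security: int,
                   interval_ms: int, headroom_pct: int = 80) -> int:
    """Smallest thread count that finishes all securities inside the interval.

    Only headroom_pct percent of the interval is budgeted, leaving slack
    for storage writes and scheduling jitter (an assumed safety margin).
    """
    budget_ms = interval_ms * headroom_pct // 100
    per_thread = max(1, budget_ms // ms_per_security)  # securities one thread can handle
    return -(-num_securities // per_thread)            # ceiling division


if __name__ == "__main__":
    print(threads_needed(3000, 20, 3000))  # 25
    print(threads_needed(3000, 20, 5000))  # 15
```

Under these assumptions, lengthening the interval from 3 to 5 seconds reduces the required threads from 25 to 15, which illustrates the trade-off between acquisition frequency and processing load.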
  • Once the real-time trading data has been saved in the permanent table, a subsequent review analysis can examine trading correlations over larger time intervals; with a suitable model, market data from the last several days or longer can be analyzed, providing a broader correlation analysis of securities trading.
  • MongoDB is memory-based and keeps hot data in physical memory to speed up reads and writes, which is a great advantage for real-time securities analysis with strict latency requirements. Using the automatic expiry of its TTL index, a temporary table is created and real-time trading data is correlated efficiently, providing a timely and effective way to capture sudden abnormal movements amid large trading volumes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

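The invention relates to the field of big data processing and analysis of software information, and provides a method that uses MongoDB to store real-time securities trading data and analyze it in real time. The method comprises: setting up a standalone MongoDB instance or a cluster formed of multiple MongoDB instances; obtaining the detailed information of each security; designing a permanent storage table and a temporary storage table valid for half an hour; setting an acquisition interval, acquiring real-time securities data at each interval point, and after deduplication storing it in both the permanent table and the temporary table; and querying the temporary table to analyze the behavior of each security over several time periods and, for the abnormalities that occur simultaneously within each time period, outputting the corresponding abnormal-movement type and security code, forming a real-time correlation analysis of securities. The method overcomes the low query efficiency and slow response of storage during large-volume data analysis and improves the timeliness of the customer's operations.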

Description

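Method for real-time correlation analysis of securities trading
Technical Field
The invention relates to the field of big data processing and analysis, and provides a method for real-time correlation analysis of securities trading.
Background Art
Real-time trading information in the securities market is abundant, high in volume, and generated by frequent transactions, and these transactions contain many linked movements. For short-term traders, the linkage between securities is a valuable reference: the sooner a trader learns of correlated movements, the sooner a buy or sell decision can be made in a fast-changing market, and the better the chance of profit. A person, however, has limited means of taking in information and can hardly absorb a large number of abnormal movements in a short time. Take A-shares on the Chinese securities market as an example: as of November 2016 there were about 3,000 stocks and 4 hours of trading per day, so sampling trading information every 3 seconds yields roughly 4.5 GB of data per day. Most of this data reflects ordinary movement with no abnormality or correlation and can be filtered out, but a person cannot process that much data in time. With this method, the automatic processing capability of a computer helps traders extract useful correlated abnormal-movement information from a large volume of trading information.
Real-time securities analysis involves large data volumes and strict real-time requirements. When the data volume is large, reading and writing through a traditional relational database falls far short of the required performance, so real-time requirements cannot be met.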
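Summary of the Invention
The technical problem solved by the invention is to provide a MongoDB-based method for real-time correlation analysis of securities trading. The method overcomes the low query efficiency and slow response of traditional relational storage when analyzing large data volumes and, by using multithreading, acquires and analyzes a large amount of securities data within a short time.
The technical solution by which the invention solves this problem is as follows.
The method comprises the following steps:
Step 1: Use MongoDB as the data store. Set up a standalone MongoDB instance, or a cluster formed of multiple MongoDB instances, to store and read securities data after real-time acquisition;
Step 2: Obtain the detailed information of each security, for use in real-time acquisition and analysis of securities data;
Step 3: Design a permanent storage table and a temporary storage table. Every batch of acquired data is stored in the permanent storage table for subsequent verification and analysis; the temporary storage table holds data for half an hour and supplies the data for short-term correlation analysis;
Step 4: Set an acquisition interval, acquire real-time securities data at each interval point, and after deduplication store the acquired data in both the permanent storage table and the temporary storage table;
Step 5: Query the data in the temporary storage table, analyze the behavior of each security over several time periods, and for the abnormalities that occur simultaneously within each time period, output the corresponding abnormal-movement type and security code, forming a real-time correlation analysis of securities that serves as guidance for trading.
The storage and read operations include: setting up a standalone MongoDB instance or a MongoDB cluster to store the historical data and temporary data produced by real-time acquisition; where the historical data is large, setting up a MongoDB cluster and reading the historical data by date partition.
The detailed information includes the code (with prefix), name, share capital, and major shareholders' holding ratio.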
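Step 3 specifically comprises:
Sub-step 1: Design a permanent storage table in MongoDB to hold the continuously accumulating history of real-time trading data, partitioned by date and time and placed in different data storage spaces. For a cluster deployment, hash storage on the security code can be added, so that the data of different securities is hash-distributed across multiple servers by security code;
Sub-step 2: For real-time analysis, design a temporary storage table in MongoDB holding the securities data of the last half hour. Use MongoDB's TTL index: create a TTL index on a time column of the temporary table with an expiry of half an hour, so that data in the temporary table is deleted automatically once it reaches that age.
Step 4 specifically comprises:
Sub-step 1: Call the real-time securities data interface using the previously obtained security codes and prefixes. For all securities, initialize the real-time data calls from the existing detailed information, run the real-time acquisition program, and send requests to the interface concurrently from multiple threads to obtain the data;
Sub-step 2: The real-time data acquired by the threads includes the day's opening price, highest price, lowest price, current price, trading volume, traded amount, the quantities and prices of the five best bid levels, and the quantities and prices of the five best ask levels. After obtaining the real-time information of a security, each acquisition thread parses it and checks whether it is identical to the most recently acquired information; if it is, the record already exists and is not saved again;
Sub-step 3: The deduplicated real-time data is saved in both the permanent storage table and the temporary storage table, for subsequent historical queries and real-time correlation analysis;
The acquisition interval can be configured as once every 3 seconds or every 5 seconds; a higher acquisition frequency places higher demands on storage and on system processing.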
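The real-time abnormal-movement correlation analysis specifically comprises:
Sub-step 1: Divide all securities among multiple threads. Each thread analyzes several time periods and several abnormal-movement types, and the abnormal movements found are gathered into a single display area;
Sub-step 2: Within each thread, read the current trading information of its securities from the temporary table, and read the real-time trading information for each of the time periods. Analyze the rise, the fall, limit-up locks, limit-down locks, and turnover; within each time period, sort the securities by degree of abnormality and output the top-ranked ones to a unified collector program;
Sub-step 3: Each thread outputs the abnormal movements found within a given time interval to the unified collector program. The collector merges the data from all threads, including abnormal-movement type and security code, sorts it as a whole, and then, in sorted order, extracts for each time interval the correlated rises, falls, limit-up locks, limit-down locks, and rapid turnover, and sends them to the display area, thereby achieving correlation analysis of the securities in each time interval;
Sub-step 4: The time periods can be configured by the user as needed; further abnormality types can be added later; the thresholds for rise, fall, turnover rate, and so on can be configured by the user as needed; and every parameter can be adjusted to market conditions and tuned to its best setting.
The abnormalities can be configured as: rapid rise, rapid fall, limit-up lock, limit-down lock, and rapid increase in trading volume.
The time periods can be configured as: the previous 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, and 15 minutes.
Acquiring the real-time data and performing the correlation analysis are two separately running programs, each completing within the 3-second or 5-second acquisition interval. For a large number of securities, data acquisition and analysis are each multithreaded, and when a thread is assigned too many securities, the work can be split across more threads.
The method may further include Step 6: once the real-time trading data has been saved in the permanent table, a subsequent review analysis can examine trading correlations over larger time intervals; with a suitable model, market data from the last several days or longer can be analyzed, providing a broader correlation analysis of securities trading.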
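The beneficial effects of the invention are as follows:
1. The method uses MongoDB's efficient storage, rich query support, and variety of index types, treating MongoDB as an efficient data store for storing and retrieving real-time trading data. It overcomes the low query efficiency and slow response of traditional relational storage in large-volume data analysis and, through multithreading, acquires and analyzes a large amount of securities data and produces results within a short time, providing an efficient way to capture abnormal movements promptly in a fast-changing market.
2. The method selects MongoDB because it is a NoSQL database that is memory-based, keeping hot data in physical memory to speed up reads and writes, and because its TTL index quickly deletes data that has exceeded its time limit, managing stale data efficiently for real-time analysis. With good big-data processing performance, it supports efficient real-time correlation analysis of securities trading, captures sudden abnormal movements amid large trading volumes, and gives traders an efficient method of correlated abnormal-movement analysis to support trading decisions and reveal trading opportunities.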
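Brief Description of the Drawings
The invention is further described below with reference to the drawing:
FIG. 1 is a flowchart of the functional components of the computer software system of the invention.
Detailed Description of the Embodiments
Referring to FIG. 1, a flowchart of the functional components of the computer software system of the invention, the implementation of each step is described below: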
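Step 1: Set up the MongoDB environment, creating a standalone MongoDB instance or a cluster for storing and retrieving real-time data;
Step 2: On that MongoDB deployment, design a permanent storage table in which every batch of acquired data is stored for subsequent verification and analysis; also design a temporary storage table valid for half an hour and create a TTL index on it, so that MongoDB automatically deletes data older than half an hour. The temporary table supplies the data for short-term correlation analysis;
Step 3: Initialize the detailed information of each security, such as its code (with prefix), name, share capital, and major shareholders' holding ratio, for use during real-time acquisition and analysis;
Step 4: Start the program. Every 3 or 5 seconds, acquire real-time securities data using multiple threads, deduplicate it, and store it in both the permanent storage table and the temporary storage table;
Step 5: The analysis program, working from the timestamp of the shortest analysis interval, queries the temporary storage table from multiple threads and analyzes the behavior of each security over the past 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, 15 minutes, and so on. For abnormalities occurring simultaneously within a time interval, such as rapid rise, rapid fall, limit-up lock, limit-down lock, and rapid increase in trading volume, it outputs the corresponding abnormal-movement type and security code to a unified summary program. After collecting the types and codes from all threads, the summary program sorts them, applies the abnormal-movement thresholds, and sends the result to the display area, forming a real-time correlation analysis of securities that serves as guidance for trading;
Step 6: Once the real-time trading data has been saved in the permanent table, a subsequent review analysis can examine trading correlations over larger time intervals; with a suitable model, market data from the last several days or longer can be analyzed, providing a broader correlation analysis of securities trading.
A standalone MongoDB instance or a MongoDB cluster is set up to store the historical data and temporary data produced by real-time acquisition;
MongoDB is memory-based and keeps hot data in physical memory to speed up reads and writes, which is a great advantage for real-time securities analysis with strict latency requirements, so MongoDB is chosen for efficient real-time analysis;
Where the historical data is large, read efficiency can be improved by setting up a cluster and partitioning the historical data by date.
Calling the real-time data interface requires the security codes and prefixes, and the analysis requires information such as share capital and major shareholders' holdings. The securities information must therefore be initialized before the acquisition program runs, so that complete details are available for acquisition and analysis.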
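The two storage tables are created in MongoDB as follows:
Sub-step 1: Design a table in MongoDB for real-time securities trading data, used to store historical data that accumulates continuously. Because the historical data is large and must be read later, this permanent table is partitioned by date and time and placed in different data storage spaces. If deployed on a cluster, hash storage on the security code can be added, so that the data of different securities is hash-distributed across multiple servers by security code;
Sub-step 2: For real-time analysis, data from the last half hour is generally enough to establish the real-time picture, so a temporary table is created to hold real-time securities data. The temporary table uses MongoDB's TTL (time-to-live) index: a TTL index is created on a time column with an expiry of half an hour, and the table's data is deleted automatically once it reaches that age.
The securities data is acquired as follows:
Sub-step 1: For all securities, initialize the real-time data calls from the existing detailed information, and send requests to the interface concurrently from multiple threads to obtain the data;
Sub-step 2: The real-time information acquired by the threads includes the day's opening price, highest price, lowest price, current price, trading volume, traded amount, the quantities and prices of the five best bid levels, and the quantities and prices of the five best ask levels. After obtaining the real-time information of a security, each acquisition thread parses it and checks whether it is identical to the most recently acquired information; if it is, the record already exists and is not saved again;
Sub-step 3: The deduplicated real-time data is saved in both the permanent table and the temporary table, for subsequent historical queries and real-time correlation analysis;
Sub-step 4: The acquisition interval can be configured as once every 3 or 5 seconds. The frequency is chosen mainly by weighing data storage against the real-time requirements: a high acquisition frequency demands more of data storage and of system processing.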
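The real-time abnormal-movement correlation analysis is carried out as follows:
Sub-step 1: Divide all securities among multiple threads. Each thread analyzes several time intervals and several abnormal-movement types, and the abnormal movements found are gathered into a single display area;
Sub-step 2: Within each thread, read the current trading information of its securities from the temporary table and, working back from the timestamp obtained, read the real-time trading information from 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, and 15 minutes earlier. Analyze the rise, the fall, limit-up locks, limit-down locks, and turnover; within each time period, sort the securities with large rises, large falls, firm limit-up locks, firm limit-down locks, and sharply increased turnover, and output the top-ranked ones to a unified collector program;
Sub-step 3: After each thread has output the abnormal movements found within a given time interval, the collector program merges the data from all threads, sorts it as a whole, and then, in sorted order, extracts for each time interval the correlated rises, falls, limit-up locks, limit-down locks, and rapid turnover, and sends them to the display area, thereby achieving correlation analysis of the securities in each time interval;
Sub-step 4: The time periods can be configured by the user as needed; further abnormal-movement types can be added later; the thresholds for rise, fall, turnover rate, and so on can also be configured by the user as needed; and every parameter can be adjusted to market conditions and tuned to its best setting.
In the correlation analysis of real-time trading data, acquiring the data and analyzing the correlations run as two separate programs, and for a large number of securities each of them is multithreaded. Real-time analysis is achieved as long as acquisition and analysis complete within the 3-second or 5-second acquisition interval. The division of securities among threads therefore depends on how many securities each thread receives: if a thread has too many, the work is split across more threads so that acquisition and analysis finish within the interval.
Once the real-time trading data has been saved in the permanent table, a subsequent review analysis can examine trading correlations over larger time intervals; with a suitable model, market data from the last several days or longer can be analyzed, providing a broader correlation analysis of securities trading.
MongoDB is memory-based and keeps hot data in physical memory to speed up reads and writes, which is a great advantage for real-time securities analysis with strict latency requirements. Using the automatic expiry of its TTL index, a temporary table is created and real-time trading data is correlated efficiently, providing a timely and effective way to capture sudden abnormal movements amid large trading volumes.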

Claims (10)

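  1. A method for real-time correlation analysis of securities trading, characterized by comprising the following steps:
    Step 1: using MongoDB as the data store, setting up a standalone MongoDB instance or a cluster formed of multiple MongoDB instances to store and read securities data after real-time acquisition;
    Step 2: obtaining the detailed information of each security, for use in real-time acquisition and analysis of securities data;
    Step 3: designing a permanent storage table and a temporary storage table, every batch of acquired data being stored in the permanent storage table for subsequent verification and analysis, and the temporary storage table being valid for half an hour and supplying the data for short-term correlation analysis;
    Step 4: setting an acquisition interval, acquiring real-time securities data at each interval point, and after deduplication storing the acquired data in both the permanent storage table and the temporary storage table;
    Step 5: querying the data in the temporary storage table, analyzing the behavior of each security over several time periods, and for the abnormalities that occur simultaneously within each time period, outputting the corresponding abnormal-movement type and security code, forming a real-time correlation analysis of securities that serves as guidance for trading.
  2. The method according to claim 1, characterized in that the storage and read operations include: storing and reading the historical data and temporary data; where the historical data is large, setting up a MongoDB cluster and reading the historical data by date partition;
    the detailed information includes the code, name, share capital, and major shareholders' holding ratio.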
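  3. The method according to claim 1, characterized in that Step 3 specifically comprises:
    Sub-step 1: designing a permanent storage table in MongoDB to hold the continuously accumulating history of real-time trading data, partitioned by date and time and placed in different data storage spaces; for a cluster deployment, hash storage on the security code can be added, so that the data of different securities is hash-distributed across multiple servers by security code;
    Sub-step 2: for real-time analysis, designing a temporary storage table in MongoDB holding the securities data of the last half hour, and using MongoDB's TTL index by creating a TTL index on a time column of the temporary table with an expiry of half an hour, so that data in the temporary table is deleted automatically once it reaches that age.
  4. The method according to claim 2, characterized in that Step 3 specifically comprises:
    Sub-step 1: designing a permanent storage table in MongoDB to hold the continuously accumulating history of real-time trading data, partitioned by date and time and placed in different data storage spaces; for a cluster deployment, hash storage on the security code can be added, so that the data of different securities is hash-distributed across multiple servers by security code;
    Sub-step 2: for real-time analysis, designing a temporary storage table in MongoDB holding the securities data of the last half hour, and using MongoDB's TTL index by creating a TTL index on a time column of the temporary table with an expiry of half an hour, so that data in the temporary table is deleted automatically once it reaches that age.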
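  5. The method according to any one of claims 1 to 4, characterized in that Step 4 specifically comprises:
    Sub-step 1: calling the real-time securities data interface using the previously obtained security codes and prefixes; for all securities, initializing the real-time data calls from the existing detailed information, running the real-time acquisition program, and sending requests to the interface concurrently from multiple threads to obtain the data;
    Sub-step 2: the real-time data acquired by the threads includes the day's opening price, highest price, lowest price, current price, trading volume, traded amount, the quantities and prices of the five best bid levels, and the quantities and prices of the five best ask levels; after obtaining the real-time information of a security, each acquisition thread parses it and checks whether it is identical to the most recently acquired information, and if it is, the record already exists and is not saved again;
    Sub-step 3: the deduplicated real-time data is saved in both the permanent storage table and the temporary storage table, for subsequent historical queries and real-time correlation analysis;
    the acquisition interval can be configured as once every 3 seconds or every 5 seconds, a higher acquisition frequency placing higher demands on storage and on system processing.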
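  6. The method according to any one of claims 1 to 4, characterized in that Step 5 specifically comprises:
    Sub-step 1: dividing all securities among multiple threads, each thread analyzing several time periods and several abnormal-movement types, and the abnormal movements found being gathered into a single display area;
    Sub-step 2: within each thread, reading the current trading information of its securities from the temporary table and the real-time trading information for each of the time periods; analyzing the rise, the fall, limit-up locks, limit-down locks, and turnover; and, within each time period, sorting the securities by degree of abnormality and outputting the top-ranked ones to a unified collector program;
    Sub-step 3: each thread outputs the abnormal movements found within a given time interval to the unified collector program; the collector merges the data from all threads, including abnormal-movement type and security code, sorts it as a whole, and then, in sorted order, extracts for each time interval the correlated rises, falls, limit-up locks, limit-down locks, and rapid turnover, and sends them to the display area, thereby achieving correlation analysis of the securities in each time interval;
    Sub-step 4: the time periods can be configured by the user as needed; further abnormality types can be added later; the thresholds for rise, fall, turnover rate, and so on can be configured by the user as needed; and every parameter can be adjusted to market conditions and tuned to its best setting;
    the abnormalities can be configured as: rapid rise, rapid fall, limit-up lock, limit-down lock, and rapid increase in trading volume;
    the time periods can be configured as: the previous 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, and 15 minutes;
    acquiring the real-time data and performing the correlation analysis are two separately running programs, each completing within the 3-second or 5-second acquisition interval; for a large number of securities, data acquisition and analysis are each multithreaded, and when a thread is assigned too many securities, the work can be split across more threads.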
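  7. The method according to claim 5, characterized in that Step 5 specifically comprises:
    Sub-step 1: dividing all securities among multiple threads, each thread analyzing several time periods and several abnormal-movement types, and the abnormal movements found being gathered into a single display area;
    Sub-step 2: within each thread, reading the current trading information of its securities from the temporary table and the real-time trading information for each of the time periods; analyzing the rise, the fall, limit-up locks, limit-down locks, and turnover; and, within each time period, sorting the securities by degree of abnormality and outputting the top-ranked ones to a unified collector program;
    Sub-step 3: each thread outputs the abnormal movements found within a given time interval to the unified collector program; the collector merges the data from all threads, including abnormal-movement type and security code, sorts it as a whole, and then, in sorted order, extracts for each time interval the correlated rises, falls, limit-up locks, limit-down locks, and rapid turnover, and sends them to the display area, thereby achieving correlation analysis of the securities in each time interval;
    Sub-step 4: the time periods can be configured by the user as needed; further abnormality types can be added later; the thresholds for rise, fall, turnover rate, and so on can be configured by the user as needed; and every parameter can be adjusted to market conditions and tuned to its best setting;
    the abnormalities can be configured as: rapid rise, rapid fall, limit-up lock, limit-down lock, and rapid increase in trading volume;
    the time periods can be configured as: the previous 30 seconds, 1 minute, 2 minutes, 5 minutes, 10 minutes, and 15 minutes;
    acquiring the real-time data and performing the correlation analysis are two separately running programs, each completing within the 3-second or 5-second acquisition interval; for a large number of securities, data acquisition and analysis are each multithreaded, and when a thread is assigned too many securities, the work can be split across more threads.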
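  8. The method according to any one of claims 1 to 4, characterized in that the method may further comprise step 6: after the real-time securities trading data has been saved in the permanent table, review (replay) analysis is subsequently performed on it, which can analyze the association relationships of real-time securities trading over larger time intervals; using a given model, market information from the last few days or longer can be analyzed, serving as association analysis of securities trading over a wider range.
  9. The method according to claim 5, characterized in that the method may further comprise step 6: after the real-time securities trading data has been saved in the permanent table, review (replay) analysis is subsequently performed on it, which can analyze the association relationships of real-time securities trading over larger time intervals; using a given model, market information from the last few days or longer can be analyzed, serving as association analysis of securities trading over a wider range.
  10. The method according to claim 7, characterized in that the method may further comprise step 6: after the real-time securities trading data has been saved in the permanent table, review (replay) analysis is subsequently performed on it, which can analyze the association relationships of real-time securities trading over larger time intervals; using a given model, market information from the last few days or longer can be analyzed, serving as association analysis of securities trading over a wider range.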
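The claims leave the model used for the later review analysis unspecified. As one possible illustration only, the sketch below counts how often two securities show the same abnormality type within the same coarse time bucket, using events that are assumed to have been derived from the permanent table; the function name `co_movement` and the event layout are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def co_movement(events: Iterable[Tuple[str, float, str]],
                bucket_seconds: int) -> Counter:
    """Count pairs of securities that moved abnormally in the same bucket.

    events: (code, timestamp_seconds, abnormality_type) tuples.
    Returns a Counter keyed by (code_a, code_b) with code_a < code_b.
    """
    buckets: Dict[Tuple[int, str], Set[str]] = defaultdict(set)
    for code, ts, kind in events:
        # Group by coarse time bucket and abnormality type
        buckets[(int(ts // bucket_seconds), kind)].add(code)

    pairs: Counter = Counter()
    for codes in buckets.values():
        for a, b in combinations(sorted(codes), 2):
            pairs[(a, b)] += 1
    return pairs
```

With a bucket of one hour or one day, `pairs.most_common()` lists the securities that most frequently moved together over the stored history, which is one simple reading of "association over larger time intervals".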
PCT/CN2016/109349 2016-11-25 2016-12-10 Method for association analysis of real-time securities trading WO2018094777A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611062583.6A CN106776837A (zh) 2016-11-25 2016-11-25 MongoDB-based method for association analysis of real-time securities trading
CN201611062583.6 2016-11-25

Publications (1)

Publication Number Publication Date
WO2018094777A1 true WO2018094777A1 (zh) 2018-05-31

Family

ID=58901801

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/109349 WO2018094777A1 (zh) 2016-11-25 2016-12-10 Method for association analysis of real-time securities trading

Country Status (2)

Country Link
CN (1) CN106776837A (zh)
WO (1) WO2018094777A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019436B (zh) * 2017-07-14 2021-04-09 北京国双科技有限公司 数据导入\导出方法和装置、数据表处理方法和装置
CN107862031A (zh) * 2017-11-02 2018-03-30 东软集团股份有限公司 业务处理方法、装置和服务器
CN108197301A (zh) * 2018-01-26 2018-06-22 北京博睿宏远数据科技股份有限公司 一种用于量化不同券商app行情刷新速度的方法
CN108470045B (zh) * 2018-03-06 2020-02-18 平安科技(深圳)有限公司 电子装置、数据链式归档的方法及存储介质
CN110278260A (zh) * 2019-06-17 2019-09-24 武汉灯塔之光科技有限公司 一种不同证券行情数据的转发录播方法、系统和装置
CN112835930A (zh) * 2021-03-03 2021-05-25 上海渠杰信息科技有限公司 一种数据库的查询方法及设备
CN114661771A (zh) * 2022-04-14 2022-06-24 广州经传多赢投资咨询有限公司 一种股票数据的存储与读取方法、设备以及可读存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1647901A1 (en) * 2004-10-15 2006-04-19 Samsung Electronics Co.,Ltd. System and method for collecting network performance data and storing it in a single relational table.
CN102024010A (zh) * 2010-06-04 2011-04-20 西本新干线股份有限公司 数据处理系统及其处理方法
CN103207919A (zh) * 2013-04-26 2013-07-17 北京亿赞普网络技术有限公司 一种MongoDB集群快速查询计算的方法及装置
CN104112010A (zh) * 2014-07-16 2014-10-22 深圳市国泰安信息技术有限公司 一种数据存储方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102609464A (zh) * 2012-01-16 2012-07-25 北京亿赞普网络技术有限公司 Mongodb分片联表查询方法及装置
WO2016032548A1 (en) * 2014-08-25 2016-03-03 Hewlett Packard Enterprise Development Lp Providing transactional support to a data storage system
CN105959169B (zh) * 2016-07-19 2019-09-17 中国银联股份有限公司 一种交易数据处理系统及方法

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636611A (zh) * 2018-10-19 2019-04-16 深圳平安财富宝投资咨询有限公司 清算配置信息的获取方法、服务器、存储介质及装置
CN111062810A (zh) * 2019-11-12 2020-04-24 上交所技术有限责任公司 适用于证券交易系统基于接口多维索引数据的处理方法
CN112750040A (zh) * 2021-01-13 2021-05-04 国泰君安证券股份有限公司 应用于内存交易系统实现线程安全的数据处理方法、系统、应用、装置、处理器及存储介质
CN112750040B (zh) * 2021-01-13 2023-07-14 国泰君安证券股份有限公司 应用于内存交易系统实现线程安全的数据处理方法、系统、应用、装置、处理器及存储介质

Also Published As

Publication number Publication date
CN106776837A (zh) 2017-05-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 16922479; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 16922479; Country of ref document: EP; Kind code of ref document: A1)