CN116737727B - Stock transaction data column type storage method and server based on tree structure - Google Patents

Stock transaction data column type storage method and server based on tree structure

Info

Publication number
CN116737727B
Authority
CN
China
Prior art keywords
data
stock
node
target
subdivision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311022942.5A
Other languages
Chinese (zh)
Other versions
CN116737727A (en)
Inventor
金基东
汤汝军
顾金国
易朝霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Kafang Information Technology Co ltd
Original Assignee
Hangzhou Chi Squared Distribution Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Chi Squared Distribution Information Technology Co ltd
Priority to CN202311022942.5A
Publication of CN116737727A
Application granted
Publication of CN116737727B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of data storage and discloses a tree-structure-based columnar storage method and server for stock transaction data. Timestamped original stock transaction data is collected and preprocessed to obtain first target data; the subdivision domain of the first target data is analyzed based on a fuzzy function, and the first target data is stored in the target tree structure of the corresponding subdivision domain; a frequent positive sequence pattern set is obtained based on stock support and stock co-occurrence; the time sequence patterns in the frequent positive sequence pattern set are mined to obtain a stock time positive sequence pattern set; and hash operation and data merging are performed in turn on the stock node data in the stock time positive sequence pattern set to obtain simplified stock node data, which is stored in the target tree structure according to node position information.

Description

Stock transaction data column type storage method and server based on tree structure
Technical Field
The invention relates to the technical field of data storage, and in particular to a tree-structure-based columnar storage method and server for stock transaction data.
Background
The rapid development of the Internet has led to the wide application of electronic management systems across industries. Electronic trading, with its advantages of high speed, low cost, and freedom from site limitations, meets the growing demand for trading volume. For users to follow stock market quotations in time, high demands are placed on how well a trading system stores stock transaction data;
for example, the Chinese patent with publication number CN113781220A discloses a distributed stock trading and matching system and method, in which the stocks a user is interested in are obtained from the user's daily behavior logs, the daily gain rate of those stocks is predicted from stock K-line charts, and the predicted daily gain rate is matched with the stocks of interest, thereby increasing the user's demand to buy stocks and raising the stock transaction rate;
such matching depends on a sound storage system for stock transaction data, but existing storage systems have the following problems:
1. Large-scale stock transaction data cannot be organized and stored efficiently, which wastes storage space and lowers storage efficiency.
2. Stock transaction data is voluminous and complex. Traditional data analysis methods struggle to extract its key patterns and trends, even simple functions cannot be realized, and querying and retrieving data is inefficient, which greatly inconveniences users. In addition, large-scale stock transaction data requires a large amount of storage space and places high storage demands on databases.
In view of this, the present invention provides a tree-structure-based columnar storage method for stock transaction data.
Disclosure of Invention
In order to overcome the above-mentioned drawbacks of the prior art, embodiments of the present invention provide a tree-structure-based columnar storage method and server for stock transaction data.
According to one aspect of the present invention, there is provided a tree-structure-based columnar storage method for stock transaction data, including the following steps:
S1: collecting timestamped original stock transaction data generated in stock trading, preprocessing the original stock transaction data to obtain first target data, analyzing the subdivision domain of the first target data based on a fuzzy function, and storing the first target data of each subdivision domain on a corresponding storage node;
S2: constructing a target tree structure for the first target data of each subdivision domain, wherein the storage node on which the first target data is stored is the target tree structure of the corresponding subdivision domain and the subdivision domain is the root node of that target tree structure; storing the first target data in the corresponding target tree structure to obtain the stock node data of each node, the stock node data comprising node position information, node data information, and backup node data information;
S3: generating stock sequence data from the stock node data according to the time sequence pattern corresponding to the timestamps; counting the stock support and the stock co-occurrence of all the stock sequence data, and obtaining a frequent positive sequence pattern set based on the stock support and the stock co-occurrence; and mining the time sequence patterns in the frequent positive sequence pattern set to obtain a stock time positive sequence pattern set;
S4: performing hash operation and data merging in turn on the stock node data in the stock time positive sequence pattern set to obtain simplified stock node data, and storing the simplified stock node data in the target tree structure according to the node position information.
In a preferred embodiment, the subdivision domain corresponding to the first target data is analyzed as follows:
performing data preprocessing on the original stock transaction data to obtain the first target data, the data preprocessing including, but not limited to, one or more of the following: filtering erroneous data, filtering duplicate data, and normalizing the data;
extracting subdivision domain features from the first target data, and converting the subdivision domain features into fuzzy domain variables;
calculating, from the fuzzy domain variables and through a membership function, the membership degree of the stock transaction data in each subdivision domain, and partitioning the membership degree to obtain the fuzzy numerical interval corresponding to each fuzzy domain variable;
characterizing the subdivision domain of the stock transaction data by the fuzzy numerical intervals to which the subdivision domain features are mapped.
In a preferred embodiment, acquiring the subdivision domain features includes:
extracting, with a preset feature extraction network model, the associated domain features of the subdivision domain corresponding to the first target data;
obtaining the subdivision domain features by applying a cross-validation model to the associated domain features and the first target data.
In a preferred embodiment, the backup node data information is a backup of the corresponding node data information.
In a preferred embodiment, the node position information of the root node is the position number of the root node; the first target data subdivided under the root node forms child nodes; the node position information of a child node consists of the position number of its parent node and the position number of the node itself; and the position number of a node is a symbol identifying the position of the node among the nodes of the same layer.
In a preferred embodiment, mining the time sequence patterns in the frequent positive sequence pattern set to obtain the stock time positive sequence pattern set includes:
counting the stock support and the stock co-occurrence of a time sequence pattern, wherein the stock support represents the total number of occurrences of the time sequence pattern across all stock sequence data, the stock co-occurrence represents the number of stock sequence data in which the time sequence pattern occurs, the time sequence pattern represents that a second time node occurs after a first time node, and the first time node and the second time node are any two time nodes in the frequent positive sequence pattern set;
when the stock support corresponding to any two time nodes is greater than a preset stock support threshold and the stock co-occurrence is greater than a preset stock co-occurrence threshold, taking the corresponding stock sequence data as an element of the stock time positive sequence pattern set.
In a preferred embodiment, the simplified stock node data is obtained as follows:
grouping the stock node data in the stock time positive sequence pattern set according to preset node data items, and calculating the weighted hash value of each stock node data using a hash algorithm and the weight values of the stock node data;
comparing the weighted hash values of all stock node data, and placing any two stock node data whose hash values have a Hamming distance smaller than a preset Hamming threshold into the same similar stock data set;
merging each similar stock data set into new stock node data, marking the new stock node data as simplified stock node data, updating the node position information of the stock node data, and storing the simplified stock node data at the updated node position information.
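The weighted hashing and Hamming-distance grouping described above can be sketched as follows. The patent does not name a hash algorithm or say how weights enter the hash, so this sketch assumes a SimHash-style fingerprint in which each data item votes on every bit with a per-item weight; `item_hash`, `weighted_hash`, `hamming`, `group_similar`, the 64-bit width, and the sample fields are all hypothetical.

```python
import hashlib
from typing import Dict, List

BITS = 64  # assumed fingerprint width, not specified in the source


def item_hash(text: str) -> int:
    """Stable 64-bit hash of one data item (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def weighted_hash(items: Dict[str, str], weights: Dict[str, float]) -> int:
    """SimHash-style fingerprint: each item votes on every bit with its weight."""
    votes = [0.0] * BITS
    for name, value in items.items():
        h = item_hash(f"{name}={value}")
        w = weights.get(name, 1.0)
        for bit in range(BITS):
            votes[bit] += w if (h >> bit) & 1 else -w
    return sum(1 << bit for bit in range(BITS) if votes[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def group_similar(hashes: Dict[str, int], threshold: int) -> List[List[str]]:
    """Greedy grouping: a node joins the first group whose first member is
    closer than the threshold, otherwise it starts a new group."""
    groups: List[List[str]] = []
    for node_id, h in hashes.items():
        for group in groups:
            if hamming(hashes[group[0]], h) < threshold:
                group.append(node_id)
                break
        else:
            groups.append([node_id])
    return groups


# Hypothetical node data items and weights, for illustration only.
weights = {"price": 2.0, "volume": 1.0, "side": 0.5}
nodes = {
    "n1": {"price": "10.50", "volume": "300", "side": "buy"},
    "n2": {"price": "10.50", "volume": "300", "side": "buy"},
}
hashes = {node_id: weighted_hash(items, weights) for node_id, items in nodes.items()}
groups = group_similar(hashes, threshold=4)
```

Identical node data always yields a Hamming distance of zero, so `n1` and `n2` land in one similar stock data set; how near-duplicates group depends on the chosen threshold and weights.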
In a preferred embodiment, updating the node position information of the stock node data includes:
acquiring the node position information corresponding to the node data information in the same similar stock data set;
obtaining the distance between the current node and the root node from the node position information, the distance being the sum of a first distance and a second distance, wherein the first distance is the distance from the parent node of the current node to the root node, and the second distance is the ratio of the position number of the current node to the total number of node positions under the parent node;
comparing all the node position information, and updating to the node position information whose distance to the root node is smallest.
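As a rough illustration of the distance rule above, the following sketch computes the distance recursively and picks the closest position. It assumes the root's own distance is 0, which the source does not state; `PositionedNode`, `sibling_count`, `distance_to_root`, and `merged_position` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class PositionedNode:
    position_number: int   # position of the node among its siblings
    sibling_count: int     # total number of node positions under the parent
    parent: Optional["PositionedNode"] = None


def distance_to_root(node: PositionedNode) -> float:
    """First distance (parent to root) plus second distance (position ratio)."""
    if node.parent is None:
        return 0.0  # assumption: the root is at distance 0 from itself
    return distance_to_root(node.parent) + node.position_number / node.sibling_count


def merged_position(nodes: List[PositionedNode]) -> PositionedNode:
    """Pick the node whose position is closest to the root."""
    return min(nodes, key=distance_to_root)


root = PositionedNode(position_number=1, sibling_count=1)
a = PositionedNode(position_number=1, sibling_count=4, parent=root)  # 0 + 1/4
b = PositionedNode(position_number=3, sibling_count=4, parent=root)  # 0 + 3/4
c = PositionedNode(position_number=1, sibling_count=2, parent=a)     # 1/4 + 1/2
target = merged_position([b, c, a])
```

Here `a` has the smallest distance, so the merged data would be stored at `a`'s position under this reading of the rule.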
In a preferred embodiment, the weight value of the stock node data is obtained as follows:
the weight value comprises a fixed weight value and a fluctuation weight value, wherein the fixed weight value is the reciprocal of the distance from the corresponding stock node data to the root node, and the fluctuation weight value is the difference between the maximum data fluctuation amplitude and a preset amplitude threshold; the fluctuation weight value is applied according to either of the following cases:
when the data fluctuation amplitude of the stock node data is greater than or equal to the preset amplitude threshold, increasing the weight value corresponding to the node data information;
when the data fluctuation amplitude of the stock node data is smaller than the preset amplitude threshold, reducing the weight value corresponding to the node data information.
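A minimal sketch of this weighting follows. The source does not say how the two components are combined, so adding them is an assumption made here; it has the convenient property that the signed difference raises the weight when the amplitude reaches the threshold and lowers it otherwise. `node_weight` and its parameters are hypothetical.

```python
from typing import Sequence


def node_weight(distance_to_root: float,
                amplitudes: Sequence[float],
                amplitude_threshold: float) -> float:
    """Fixed weight (1 / distance) plus signed fluctuation weight."""
    if distance_to_root <= 0:
        raise ValueError("distance to root must be positive for a reciprocal")
    fixed = 1.0 / distance_to_root
    # Non-negative when the largest amplitude reaches the threshold, else negative.
    fluctuation = max(amplitudes) - amplitude_threshold
    return fixed + fluctuation  # assumption: the two components are summed


volatile = node_weight(0.5, [0.02, 0.08], amplitude_threshold=0.05)
calm = node_weight(0.5, [0.01, 0.02], amplitude_threshold=0.05)
```

With the same distance to the root, the volatile node ends up with a larger weight than the calm one.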
According to another aspect of the present invention, there is provided a stock transaction data storage server, comprising:
a data acquisition module, which collects timestamped original stock transaction data generated in stock trading, preprocesses the original stock transaction data to obtain first target data, analyzes the subdivision domain of the first target data based on a fuzzy function, and stores the first target data of each subdivision domain on a corresponding storage node;
a tree structure construction module, which constructs a target tree structure for the first target data of each subdivision domain, wherein the storage node on which the first target data is stored is the target tree structure of the corresponding subdivision domain and the subdivision domain is the root node of that target tree structure, and which stores the first target data in the corresponding target tree structure to obtain the stock node data of each node, the stock node data comprising node position information, node data information, and backup node data information;
a simplified sequence generation module, which generates stock sequence data from the stock node data according to the time sequence pattern corresponding to the timestamps, counts the stock support and the stock co-occurrence of all the stock sequence data, obtains a frequent positive sequence pattern set based on the stock support and the stock co-occurrence, and mines the time sequence patterns in the frequent positive sequence pattern set to obtain a stock time positive sequence pattern set;
a data storage module, which performs hash operation and data merging in turn on the stock node data in the stock time positive sequence pattern set to obtain simplified stock node data, and stores the simplified stock node data in the target tree structure according to the node position information.
According to still another aspect of the present invention, there is provided an electronic apparatus including: a processor and a memory, wherein the memory stores a computer program for the processor to call;
the processor executes the above-described stock exchange data columnar storage method based on the tree structure by calling the computer program stored in the memory.
According to yet another aspect of the present invention, there is provided a computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the above-described tree-structure-based stock exchange data columnar storage method.
The tree-structure-based columnar storage method for stock transaction data has the following technical effects and advantages:
By storing data in a tree structure, the invention can effectively organize and store a large amount of stock transaction data; because the data is stored according to subdivision domain and node position information, specific stock node data can be located and retrieved quickly, which improves storage efficiency. By counting the stock support and stock co-occurrence of the stock sequence data, frequent positive sequence patterns can be mined, revealing important trends and patterns in stock trading, which is significant for formulating investment strategies and predicting market trends. The method can also capture temporal correlations in stock trading, helping analysts and investors find important time sequence patterns for market prediction and decision-making. Finally, hash operation and data merging simplify the original stock node data, which reduces storage space occupation and storage cost. The storage efficiency and data mining capability for stock transaction data are thereby improved, providing more effective support for stock analysis and decision-making.
Drawings
FIG. 1 is a diagram of a stock exchange data storage server of the present invention;
FIG. 2 is a schematic diagram of the tree-structure-based columnar storage method for stock transaction data according to the present invention;
FIG. 3 is a schematic diagram of a target tree structure according to the present invention;
FIG. 4 is a schematic diagram of an electronic device according to the present invention;
FIG. 5 is a schematic diagram of a computer readable storage medium according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to FIG. 1, this embodiment provides a stock transaction data storage server comprising a data acquisition module 1, a tree structure construction module 2, a simplified sequence generation module 3, and a data storage module 4, the modules being connected by wired and/or wireless connections to transmit data between them;
The data acquisition module 1 collects timestamped original stock transaction data generated in stock trading, preprocesses the original stock transaction data to obtain first target data, analyzes the subdivision domain of the first target data based on a fuzzy function, and stores the first target data of each subdivision domain on a corresponding storage node;
It should be noted that: the sources of stock transaction data include stock exchanges, financial data providers, and third-party APIs. Collecting stock transaction data from multiple sources improves the accuracy and completeness of the collected data to some extent; the drawback is that a large amount of identical data overlaps, so the original stock transaction data must subsequently be preprocessed to obtain relatively accurate and uniform first target data.
The subdivision domain of the first target data is analyzed based on the fuzzy function. Subdivision domains may be defined by, but are not limited to, one or more of the following: industry, market, security type (such as stocks, options, or futures), or other custom classification criteria; each piece of stock transaction data is stored on the corresponding storage node according to its subdivision domain. The timestamp information is retained during storage so that the data can later be analyzed, queried, and appended, which facilitates subsequent periodic updating and maintenance of the stock transaction data.
The subdivision domain corresponding to the first target data is analyzed as follows:
performing data preprocessing on the original stock transaction data to obtain the first target data, the data preprocessing including, but not limited to, one or more of the following: filtering erroneous data, filtering duplicate data, and normalizing the data;
extracting subdivision domain features from the first target data, and converting the subdivision domain features into fuzzy domain variables;
calculating, from the fuzzy domain variables and through a membership function, the membership degree of the stock transaction data in each subdivision domain, and partitioning the membership degree to obtain the fuzzy numerical interval corresponding to each fuzzy domain variable;
characterizing the subdivision domain of the stock transaction data by the fuzzy numerical intervals to which the subdivision domain features are mapped.
It should be noted that: the collected original stock transaction data is first preprocessed to ensure the accuracy and consistency of the data.
The preprocessing modes include:
Filtering erroneous data: data points containing errors or outliers, such as data with a negative price or an abnormally high transaction volume, are identified and excluded.
Filtering duplicate data: repeated transaction data is detected and deleted so that it does not affect the analysis results more than once.
Normalizing the data: the data is normalized so that data of different scales is unified into a specific range, eliminating deviations caused by differences in scale.
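The three preprocessing modes above can be sketched as a single pass over the records. The record fields (`symbol`, `timestamp`, `price`, `volume`), the `max_volume` cut-off, and the choice of min-max normalization are assumptions for illustration; the source does not fix any of them.

```python
from typing import Dict, List


def preprocess(records: List[Dict], max_volume: float) -> List[Dict]:
    """Filter erroneous and duplicate records, then min-max normalize price."""
    seen = set()
    cleaned: List[Dict] = []
    for record in records:
        # Erroneous data: negative price or abnormally high transaction volume.
        if record["price"] < 0 or record["volume"] > max_volume:
            continue
        # Duplicate data: same symbol, timestamp, price, and volume seen before.
        key = (record["symbol"], record["timestamp"], record["price"], record["volume"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(dict(record))
    if not cleaned:
        return cleaned
    # Normalization: rescale price into [0, 1] (min-max is an assumed choice).
    low = min(r["price"] for r in cleaned)
    high = max(r["price"] for r in cleaned)
    span = high - low
    for r in cleaned:
        r["price_norm"] = (r["price"] - low) / span if span else 0.0
    return cleaned


raw = [
    {"symbol": "AAA", "timestamp": 1, "price": 10.0, "volume": 100},
    {"symbol": "AAA", "timestamp": 1, "price": 10.0, "volume": 100},    # duplicate
    {"symbol": "AAA", "timestamp": 2, "price": -5.0, "volume": 100},    # negative price
    {"symbol": "AAA", "timestamp": 3, "price": 20.0, "volume": 10**9},  # abnormal volume
    {"symbol": "AAA", "timestamp": 4, "price": 30.0, "volume": 200},
]
first_target = preprocess(raw, max_volume=1_000_000)
```

Only the records at timestamps 1 and 4 survive, with normalized prices 0.0 and 1.0; the timestamp is kept on every record, as the embodiment requires.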
Features associated with the selected subdivision domain are extracted from the preprocessed stock transaction data; the specific feature extraction method depends on the selected subdivision domain. For example: for the subdivision domain of market index data, daily index values may be extracted; for the subdivision domain of individual stock data, features such as stock price and transaction volume may be extracted; for subdivision by industry, features identifying industries such as electronics, agriculture, biology, chemistry, and catering may be extracted. The definitions and categories of the subdivision domains follow the classification used by the current server.
The subdivision domain features are converted into fuzzy domain variables, and the membership degrees of the fuzzy domain variables in the different subdivision domains are calculated; by defining membership functions, the fuzzy domain variables quantify the relationship between the subdivision domain features and the subdivision domains. The membership function may be a Gaussian function, a trigonometric function, or another suitable functional form.
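To make the membership calculation concrete, here is a minimal sketch using the Gaussian form mentioned above. The domain names, their centres and widths, and the interval cut points are invented for illustration; `gaussian_membership`, `classify`, and `fuzzy_interval` are hypothetical helpers.

```python
import math
from typing import Dict, Tuple


def gaussian_membership(x: float, centre: float, sigma: float) -> float:
    """Gaussian membership degree in (0, 1], equal to 1 at the centre."""
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))


def classify(x: float, domains: Dict[str, Tuple[float, float]]) -> Tuple[str, Dict[str, float]]:
    """Return the domain with the highest membership and all membership degrees."""
    degrees = {name: gaussian_membership(x, c, s) for name, (c, s) in domains.items()}
    best = max(degrees, key=degrees.get)
    return best, degrees


def fuzzy_interval(degree: float, cuts: Tuple[float, float] = (0.33, 0.66)) -> str:
    """Partition a membership degree into labelled fuzzy numerical intervals."""
    if degree < cuts[0]:
        return "low"
    if degree < cuts[1]:
        return "medium"
    return "high"


# Hypothetical subdivision domains described by (centre, sigma) of one feature.
DOMAINS = {"low_volatility": (0.01, 0.01), "high_volatility": (0.05, 0.02)}
best, degrees = classify(0.012, DOMAINS)
interval = fuzzy_interval(degrees[best])
```

A feature value of 0.012 sits close to the first centre, so it is assigned to `low_volatility` with a membership degree in the "high" interval.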
A mapping table may be established that maps the subdivision domain features to the fuzzy numerical intervals, so that the meaning of each interval can be looked up in subsequent analysis and decision-making. The selection and definition of the subdivision domains, and the application scenario, may be adjusted and extended according to the specific situation and actual requirements.
Acquiring the subdivision domain features includes:
extracting, with a preset feature extraction network model, the associated domain features of the subdivision domain corresponding to the first target data;
obtaining the subdivision domain features by applying a cross-validation model to the associated domain features and the first target data.
It should be noted that: a pre-trained feature extraction network model is used, the model having been trained on a domain related to the subdivision domain. By inputting the first target data into the feature extraction network model, the associated domain features of the subdivision domain can be extracted.
The obtained associated domain features and the first target data are then trained and evaluated through the cross-validation model. Cross-validation is a technique that divides the data into training and validation sets and performs training and validation multiple times. Through cross-validation, the effect of the associated domain features in the corresponding subdivision domain can be evaluated, and the features that perform well on the first target data are selected.
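The feature selection by cross-validation described above might look like the sketch below. The patent does not specify the model being validated, so a simple nearest-centroid rule on a single feature stands in for it; `k_fold_indices`, `cv_score`, and the sample features and labels are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k interleaved (train, validation) pairs."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, folds[i]))
    return splits


def cv_score(feature: Sequence[float], labels: Sequence[str], k: int = 3) -> float:
    """Mean validation accuracy of a nearest-centroid rule on one feature."""
    scores = []
    for train, val in k_fold_indices(len(feature), k):
        # Per-label centroid of the feature, computed on the training part only.
        centroids = {
            lab: mean(feature[j] for j in train if labels[j] == lab)
            for lab in {labels[j] for j in train}
        }
        correct = 0
        for j in val:
            predicted = min(centroids, key=lambda lab: abs(feature[j] - centroids[lab]))
            correct += predicted == labels[j]
        scores.append(correct / len(val))
    return mean(scores)


labels = ["electronics", "agriculture"] * 3
informative = [0.9, 0.1, 0.8, 0.2, 0.95, 0.05]  # separates the two domains well
noisy = [0.1, 0.9, 0.8, 0.2, 0.3, 0.7]          # mixes the two domains
good_score = cv_score(informative, labels)
noisy_score = cv_score(noisy, labels)
```

A candidate feature whose cross-validated score is higher, such as `informative` here, would be kept as a subdivision domain feature.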
The tree structure construction module 2 constructs a target tree structure for the first target data of each subdivision domain, the target tree structure comprising a root node and at least one group of child nodes; the storage node on which the first target data is stored is the target tree structure of the corresponding subdivision domain, the subdivision domain is the root node of that target tree structure, and the first target data is stored in the corresponding target tree structure to obtain the stock node data of each node; the stock node data comprises node position information, node data information, and a plurality of backup node data information.
It should be noted that: in this embodiment a target tree structure is first constructed; the target tree structure may be a multi-way tree comprising a root node and multiple groups of child nodes, wherein the root node is a storage node and the initial parent node; the root node represents all the first target data in the corresponding subdivision domain; and a child node is first target data subdivided under its parent node;
the node position information of the root node is the position number of the root node; the node position information of a child node consists of the position number of its parent node and the position number of the node itself; the position number of a node is a symbol identifying the position of the node among the nodes of the same layer; and the distance from the current node to its parent node is denoted as the step length of the current node.
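One possible in-memory shape for such a node is sketched below. The class name `StockNode`, the use of 1-based sibling positions, and the default of two backup copies are assumptions; the source only requires that each node carry position information, data, and backups.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass(eq=False)
class StockNode:
    position_number: int                      # position among same-layer siblings
    data: Any = None                          # node data information
    backups: List[Any] = field(default_factory=list)  # backup node data information
    parent: Optional["StockNode"] = field(default=None, repr=False)
    children: List["StockNode"] = field(default_factory=list, repr=False)

    def add_child(self, data: Any, n_backups: int = 2) -> "StockNode":
        """Attach first target data as a child and keep independent backup copies."""
        child = StockNode(
            position_number=len(self.children) + 1,
            data=data,
            backups=[copy.deepcopy(data) for _ in range(n_backups)],
            parent=self,
        )
        self.children.append(child)
        return child

    @property
    def position_info(self) -> Tuple[int, ...]:
        """Root: its own position number. Child: (parent number, own number)."""
        if self.parent is None:
            return (self.position_number,)
        return (self.parent.position_number, self.position_number)


root = StockNode(position_number=1, data="electronics")  # subdivision domain as root
first = root.add_child({"symbol": "AAA", "price": 10.5})
second = root.add_child({"symbol": "BBB", "price": 7.2})
grandchild = second.add_child({"symbol": "BBB", "price": 7.3})
```

Because each backup is a deep copy, later damage to `data` does not propagate to the backups.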
Further, the backup node data information is a backup of the corresponding node data information. The purpose of storing backup node data information at the corresponding node position information is as follows: when the node data information at that node position is missing or invalid, the backup node data information of the node is extracted directly; if the extracted backup node data information is itself missing or invalid, the next backup node data information is extracted, and the missing historical data is restored from it; if several pieces of backup node data information are missing or invalid, the server has very probably been attacked and requires maintenance.
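The fallback order described above can be expressed as a short sketch. The validity check is left as a caller-supplied predicate because the source does not define what "invalid" means; `read_with_backups` and `AllCopiesInvalid` are hypothetical names.

```python
from typing import Any, Callable, Sequence


class AllCopiesInvalid(Exception):
    """Raised when the primary data and every backup are missing or invalid."""


def read_with_backups(primary: Any,
                      backups: Sequence[Any],
                      is_valid: Callable[[Any], bool]) -> Any:
    """Return the primary data if valid, otherwise the first valid backup."""
    if is_valid(primary):
        return primary
    for backup in backups:
        if is_valid(backup):
            return backup
    # Every copy failed: per the description this suggests a probable attack.
    raise AllCopiesInvalid("all copies missing or invalid; server needs maintenance")


def not_missing(value: Any) -> bool:
    return value is not None


recovered = read_with_backups(None, [None, {"price": 10.5}], not_missing)
try:
    read_with_backups(None, [None, None], not_missing)
    alarm = False
except AllCopiesInvalid:
    alarm = True
```

The exception is the point at which an operator alert or maintenance workflow would be triggered.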
The simplified sequence generation module 3 generates stock sequence data from the stock node data according to the time sequence pattern corresponding to each time stamp; counts the stock support degree and the stock co-occurrence degree of all the stock sequence data, and obtains a frequent positive sequence pattern set based on the stock support degree and the stock co-occurrence degree; and mines time sequence patterns in the frequent positive sequence pattern set to obtain a stock time positive sequence pattern set;
It should be noted that each item of stock sequence data is created by mapping each stock node to its time sequence pattern according to its time stamp, and the pattern assigned to each time stamp must be correct for the subsequent analysis to be valid. All stock sequence data is then traversed to count the support degree of each stock (the frequency with which it occurs in the sequences) and the co-occurrence degree between stocks (the frequency with which they occur in the same sequence);
based on the stock support and the stock co-occurrence, a method of association rule mining may be used to obtain a set of frequent positive sequence patterns. Mining methods such as Apriori algorithm or FP-growth algorithm find frequent positive sequence patterns in stock sequence data. These patterns are sequences that occur frequently and are of interest; further mining is performed from the frequent positive sequence pattern set to obtain a more compact stock sequence pattern set. Some data mining and machine learning techniques, such as sequence pattern mining, cluster analysis, or association rule mining, may be used to find more useful and valuable stock sequence patterns.
Mining time sequence patterns in the frequent positive sequence pattern set, the obtaining a stock time positive sequence pattern set comprising:
counting the stock support degree and the stock co-occurrence degree of a time sequence mode, wherein the stock support degree is used for representing the total number of occurrences of the time sequence mode in all stock sequence data, the co-occurrence degree is used for representing how many stock sequence data the time sequence mode occurs in, the time sequence mode represents that a second time node occurs after a first time node occurs, and the first time node and the second time node are any two time nodes in the frequent positive sequence mode set;
and under the condition that the stock support degree corresponding to any two time nodes is larger than a preset stock support degree threshold value and the stock co-occurrence degree is larger than the preset stock co-occurrence degree threshold value, taking the corresponding stock sequence data as elements in the stock time positive sequence mode set.
It should be noted that the above method yields the stock time positive sequence pattern set, in which every element satisfies the preset stock support threshold and the preset stock co-occurrence threshold, which ensures the importance and co-occurrence of the retained time sequence patterns. The two thresholds are obtained mainly through professional analysis and are adjusted according to the differences between the stock sequence data of different subdivision domains.
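The counting and threshold test described above can be sketched in Python. The sketch makes two assumptions that the text does not settle: a pattern "first time node, then second time node" is counted once for every ordered pair of positions in a sequence (the two nodes need not be adjacent), and the function returns the qualifying patterns rather than the sequences that contain them. The names `count_patterns` and `frequent_patterns` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Hashable, Sequence, Set, Tuple

Pattern = Tuple[Hashable, Hashable]  # (first time node, second time node)


def count_patterns(sequences: Sequence[Sequence[Hashable]]):
    """Return (support, co_occurrence) counters over ordered pairs of time nodes."""
    support: Counter = Counter()        # total occurrences across all sequences
    co_occurrence: Counter = Counter()  # number of sequences containing the pattern
    for seq in sequences:
        seen_here = set()
        # combinations() keeps input order, so each pair is (earlier, later).
        for first, second in combinations(seq, 2):
            support[(first, second)] += 1
            seen_here.add((first, second))
        co_occurrence.update(seen_here)
    return support, co_occurrence


def frequent_patterns(sequences, min_support: int, min_co_occurrence: int) -> Set[Pattern]:
    """Keep patterns whose support and co-occurrence both exceed their thresholds."""
    support, co_occurrence = count_patterns(sequences)
    return {
        p for p in support
        if support[p] > min_support and co_occurrence[p] > min_co_occurrence
    }


sequences = [["t1", "t2", "t3"], ["t1", "t2"], ["t2", "t1", "t2"]]
support, co_occurrence = count_patterns(sequences)
frequent = frequent_patterns(sequences, min_support=2, min_co_occurrence=2)
print(frequent)
```

In this toy input the pattern ("t1", "t2") occurs three times in total and in all three sequences, so it is the only pattern that exceeds both thresholds of 2.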
And the data storage module 4 sequentially performs hash operation and data combination on the stock node data in the stock time positive sequence mode set to obtain simplified stock node data, and stores the simplified stock node data in the target tree structure according to the node position information.
It should be noted that each item of stock node data in the stock time positive sequence pattern set is hashed using a hash function. The hash function maps the node data to a hash value that is used for subsequent data merging and storage.
The hash operation may employ a conventional hash function such as MD5, SHA-1, or SHA-256. The selected hash function should distribute values evenly so as to minimize hash collisions.
The hashed stock node data is then merged. Different merging strategies, such as simple adjacent-node merging, merging rules based on node attributes, or merging based on time windows, are selected according to the actual requirements and the design of the target tree structure, so as to support subsequent data analysis and query operations.
The acquisition logic of the simplified stock node data is as follows:
the method comprises the steps that data grouping is carried out on stock node data in a stock time positive sequence mode set according to preset node data items, and a weighted hash value corresponding to the stock node data is calculated by utilizing a hash algorithm and weight values of all the stock node data;
Comparing weighted hash values corresponding to all stock node data, and merging two stock node data with Hamming distance smaller than a preset Hamming threshold value between the hash values into the same similar stock data set;
combining the same item of similar stock data set into new stock node data, and marking the new stock node data as simplified stock node data; and updating node position information of the stock node data; the reduced stock node data is stored on the updated node location information.
It should be noted that the stock node data is grouped by the preset node data items and a weighted hash value is calculated for each item; similar stock data sets are then merged according to the Hamming distance and the Hamming threshold; finally, the simplified stock node data is obtained and stored at the updated node position information.
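The text does not specify how the weighted hash value is computed. One possible reading is a SimHash-style fingerprint, sketched below purely as an assumption: each data item of a node is hashed with SHA-256, and each item votes on every bit of the fingerprint with its weight. The 64-bit width, the per-item weights, the greedy grouping rule, and all function names are hypothetical.

```python
import hashlib
from typing import Dict, List

BITS = 64  # assumed fingerprint width


def item_hash(text: str) -> int:
    """64-bit integer taken from the first 8 bytes of a SHA-256 digest."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


def weighted_hash(items: Dict[str, float]) -> int:
    """SimHash-style fingerprint: each item votes on every bit with its weight."""
    votes = [0.0] * BITS
    for text, weight in items.items():
        h = item_hash(text)
        for bit in range(BITS):
            votes[bit] += weight if (h >> bit) & 1 else -weight
    return sum(1 << bit for bit in range(BITS) if votes[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def group_similar(fingerprints: List[int], threshold: int) -> List[List[int]]:
    """Greedy grouping: an index joins the first group whose first member is
    closer than the threshold; otherwise it starts a new group."""
    groups: List[List[int]] = []
    for idx, fp in enumerate(fingerprints):
        for group in groups:
            if hamming(fingerprints[group[0]], fp) < threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


# Toy fingerprints: the first two differ in one bit, the third in four bits.
print(group_similar([0b0000, 0b0001, 0b1111], threshold=2))
```

A cryptographic digest of a whole record changes completely when one field changes, so comparing such digests by Hamming distance says little about similarity; a locality-sensitive scheme like the one sketched here is one way to make the Hamming threshold meaningful.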
In addition, the character strings of the stock node data stored in the target tree structure have the same length, so the Hamming distance can be used to measure the degree of difference between stock node data: the equal-length character strings of two items of stock node data are compared position by position, and the number of positions at which they differ is the Hamming distance. Alternatively, an exclusive-or operation is performed on the two items of stock node data to obtain a new binary string, and the number of 1s in that string is the Hamming distance.
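Both ways of computing the Hamming distance can be shown side by side. The sketch below assumes the node data has already been encoded as equal-length binary strings; the function names are hypothetical.

```python
def hamming_by_position(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def hamming_by_xor(a: str, b: str) -> int:
    """XOR two equal-length binary strings and count the 1 bits in the result."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return bin(int(a, 2) ^ int(b, 2)).count("1")


x, y = "10110", "10011"
print(hamming_by_position(x, y), hamming_by_xor(x, y))  # both give 2
```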
Updating node location information of stock node data includes:
acquiring node position information corresponding to node data information in the same similar stock data set;
obtaining the distance between the current node and the root node based on the node position information; the distance between the current node and the root node is the sum of the first distance and the second distance; the first distance is the distance from the parent node of the current node to the root node; the second distance is the ratio of the position number of the current node to the total number of node positions under the parent node;
comparing the distances of all the node position information with each other, and taking the node position information with the smallest distance to the root node as the updated node position information.
By way of example: the root node is a storage node that serves as a directory node and is denoted $R$. A child node of the root node is denoted $A_i$, where $A_i$ is the $i$-th node under the root node, $i = 1, 2, \dots, n$, and $n$ is the total number of child node positions under the root node. The stock node data is denoted $B_j$, the $j$-th node under its parent node $A_i$, where $j = 1, 2, \dots, m$, and $m$ is the total number of child node positions under the parent node where the current stock node data is located. The distance from the current stock node data to its parent node is therefore the second distance $d_2$, calculated by the formula $d_2 = j / m$. By analogy, the distance from the current stock node data to the root node is obtained, and the node closest to the root node is set as the updated node position information.
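The distance rule (distance to root = parent's distance to root + position number / total positions under the parent) can be sketched in Python. The sketch assumes the root itself has distance 0 and represents each node by the list of (position number, total positions) pairs along its path from the root; the names `distance_to_root` and `closest_to_root` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Sequence, Tuple

# (position number under the parent, total number of positions under the parent)
Step = Tuple[int, int]


def distance_to_root(path: Sequence[Step]) -> Fraction:
    """Sum the ratio position/total over every level from the root to the node."""
    total = Fraction(0)  # assumed: the root is at distance 0 from itself
    for number, count in path:
        if not 1 <= number <= count:
            raise ValueError("position number must lie between 1 and the total")
        total += Fraction(number, count)
    return total


def closest_to_root(candidates: Dict[str, Sequence[Step]]) -> str:
    """Pick the candidate whose path gives the smallest distance to the root."""
    return min(candidates, key=lambda name: distance_to_root(candidates[name]))


candidates = {
    "A": [(2, 3), (1, 4)],  # 2/3 + 1/4 = 11/12
    "B": [(1, 3), (3, 4)],  # 1/3 + 3/4 = 13/12
}
print(closest_to_root(candidates))  # A
```

Exact fractions are used so that two paths with equal distance compare as equal, which floating-point sums would not guarantee.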
The above formulas are all dimensionless formulas used for numerical calculation. Each formula was obtained by collecting a large amount of data and running software simulations to approximate the real situation as closely as possible; the preset parameters and thresholds in the formulas are set by those skilled in the art according to the actual situation.
The acquisition logic of the weight value of the stock node data is as follows:
the weight value comprises a fixed weight value and a fluctuation weight value, wherein the fixed weight value is the reciprocal of the distance from corresponding stock node data to the root node, and the fluctuation weight value is the difference value between the maximum value of the data fluctuation amplitude and a preset amplitude threshold value; the fluctuation weight value is determined according to any one of the following methods:
when the data fluctuation range of the stock node data is larger than or equal to a preset amplitude threshold value, increasing a weight value corresponding to the node data information;
and when the data fluctuation range of the stock node data is smaller than a preset amplitude threshold value, reducing the weight value corresponding to the node data information.
It should be noted that the preset amplitude threshold is adjusted according to specific requirements and data characteristics in order to control how much the fluctuation weight value varies. The fixed weight value and the fluctuation weight value are combined to obtain the final weight value of the stock node data. The weight value of each item of node data thus comprises a fixed weight value, which reflects the position of the node data in the tree structure, and a fluctuation weight value, which is adjusted according to the degree of fluctuation of the data.
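The weight calculation can be sketched as follows. The fixed and fluctuation terms follow the definitions in the text; how the two are combined is not specified, so the plain sum used here is an assumption, and the function names are hypothetical.

```python
from typing import Sequence


def fixed_weight(distance_to_root: float) -> float:
    """Reciprocal of the node's distance to the root."""
    if distance_to_root <= 0:
        raise ValueError("distance to the root must be positive")
    return 1.0 / distance_to_root


def fluctuation_weight(amplitudes: Sequence[float], threshold: float) -> float:
    """Largest fluctuation amplitude minus the preset amplitude threshold.

    The result is non-negative when the amplitude reaches the threshold
    (raising the weight) and negative otherwise (lowering it).
    """
    return max(amplitudes) - threshold


def node_weight(distance_to_root: float, amplitudes: Sequence[float], threshold: float) -> float:
    """Assumed combination: sum of the fixed and fluctuation weights."""
    return fixed_weight(distance_to_root) + fluctuation_weight(amplitudes, threshold)


print(node_weight(2.0, [0.25, 0.75], threshold=0.5))  # 0.5 + 0.25 = 0.75
print(node_weight(2.0, [0.25], threshold=0.5))        # 0.5 - 0.25 = 0.25
```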
It should also be noted that, in the prior art, the weight value of stock node data is assigned directly, as follows: the weight value of each item of stock node data is pre-assigned according to the preset node data items, based on the importance, influence, or experience attached to each data item in the field, and is set mainly from prior knowledge or the advice of domain experts.
Statistical features such as the mean, variance, and standard deviation may also be used as weight values. A larger statistical feature value may be given a higher weight, indicating the importance of the attribute in the data;
Alternatively, weight values are currently calculated in many fields by machine learning: a model (such as a regression model or a decision tree) is trained on the existing data and the target variable to obtain the weight value of each node data item. Such models determine weight values from the relationship between the characteristics of the data and the target variable.
A suitable weight calculation method is selected according to the specific application scenario and the data characteristics, and the weight values are adjusted and optimized according to the complexity of the problem and the data characteristics; the adjustment and optimization aim to reduce the influence of the fluctuation weight value on the fixed weight value.
The explanation is made in connection with specific embodiments:
a mapping table is first created for characterizing the relationship between the first target data and the subdivision domain. Then creating target tree structures according to the subdivision areas, wherein each target tree structure corresponds to one subdivision area, and storing first target data in the corresponding subdivision area on a corresponding storage node, namely storing the first target data on the target tree structure in the corresponding subdivision area; each target tree structure represents a particular domain, including electronic, agricultural, biological, chemical, and dining domains;
for each target tree structure, the target tree structure may include a root node and multiple groups of child nodes, as shown in fig. 3, where the root node is a starting storage point on the target tree structure, a first-level type division may be performed on the root node to obtain a group of first-level child nodes, the root node is a parent node of the first-level child nodes, a second-level type division may also be performed on the first-level child nodes to obtain a group of second-level child nodes, the first-level child nodes are parent nodes of the second-level child nodes, and so on, so as to generate the target tree structure. The storage node of the first target data storage is a target tree structure corresponding to the subdivision field, the subdivision field is a root node of the current target tree structure, the first target data is stored in the corresponding target tree structure, and stock node data of the corresponding node are obtained.
A specific example: the root node is "dining"; the first-level child nodes are "fast food", "high-grade restaurant", and "cafe"; the second-level child nodes hold the stock code, name, historical data, and so on of the dining enterprises belonging to each first-level child node. With this configuration, the domain of a stock can be identified. To obtain stocks associated with the dining domain, the tree structure is traversed to find the dining node, and the corresponding stock nodes are read from its child nodes. In this way, stock information related to the dining domain is obtained, and likewise for other domains. In practical applications, child nodes can be added or modified as required.
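The dining example can be written out as a small tree and traversed. This is an illustrative sketch only: the `Node` class, the `collect_stocks` helper, and the stock codes and company names are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    data: Optional[dict] = None  # stock record, present only on stock nodes
    children: List["Node"] = field(default_factory=list)


def collect_stocks(node: Node) -> List[dict]:
    """Depth-first traversal that gathers every stock record under a node."""
    found = [node.data] if node.data is not None else []
    for child in node.children:
        found.extend(collect_stocks(child))
    return found


# One tree per subdivision domain; the domain name is the root node.
dining = Node("dining", children=[
    Node("fast food", children=[
        Node("stock", {"code": "X001", "name": "Example Fast Food Co"}),
    ]),
    Node("high-grade restaurant", children=[
        Node("stock", {"code": "X002", "name": "Example Fine Dining Co"}),
    ]),
    Node("cafe"),
])
forest: Dict[str, Node] = {"dining": dining}

codes = [stock["code"] for stock in collect_stocks(forest["dining"])]
print(codes)  # ['X001', 'X002']
```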
Example 2
Referring to fig. 2, for details not described in this embodiment, refer to the description of Example 1. This embodiment provides a tree-structure-based columnar storage method for stock exchange data, comprising the following steps:
S1: collecting original stock trade data with a time stamp in stock trade, preprocessing the original stock trade data to obtain first target data, analyzing the subdivision field of the first target data based on a fuzzy function, and storing the first target data of the subdivision field on a corresponding storage node;
The specific analysis process of the subdivision field corresponding to the first target data is as follows:
performing data preprocessing on the original stock trading data to obtain first target data; data preprocessing includes, but is not limited to, one or more of the following processing modes: filtering error data, filtering repeated data and normalizing processed data;
extracting subdivision field features based on the first target data, and converting the subdivision field features into fuzzy field variables;
calculating, from the fuzzy domain variables and through a membership function, the membership degree of the subdivision domain to which the stock trade data belongs, and dividing the membership degree to obtain the fuzzy numerical interval corresponding to each fuzzy domain variable;
characterizing the subdivision domain of the stock trade data by mapping the subdivision domain features onto the fuzzy numerical intervals.
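The membership step can be illustrated with a small Python sketch. The text does not name a membership function, so the triangular shape, the three interval labels, the per-domain feature scores in the range 0 to 1, and the rule of picking the domain with the highest membership degree are all assumptions made for illustration; the function names are hypothetical.

```python
from typing import Dict, Tuple


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 outside (left, right), rising to 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def membership_interval(degree: float) -> str:
    """Divide a membership degree into one of three assumed fuzzy intervals."""
    if degree < 1 / 3:
        return "low"
    if degree < 2 / 3:
        return "medium"
    return "high"


def classify(scores: Dict[str, float], shapes: Dict[str, Tuple[float, float, float]]) -> str:
    """Return the subdivision domain with the highest membership degree."""
    degrees = {domain: triangular(score, *shapes[domain]) for domain, score in scores.items()}
    return max(degrees, key=degrees.get)


shapes = {"dining": (0.0, 0.5, 1.0), "electronics": (0.0, 0.5, 1.0)}
scores = {"dining": 0.5, "electronics": 0.125}  # hypothetical feature scores
print(classify(scores, shapes))  # dining
```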
Acquiring the subdivision domain features includes:
extracting associated domain features of the subdivision domain corresponding to the first target data by adopting a preset feature extraction network model;
and obtaining subdivision domain features through a cross-validation model for the associated domain features and the first target data.
S2: constructing a target tree structure for the first target data of each subdivision domain, wherein the storage node of the first target data storage is a target tree structure of the corresponding subdivision domain, the subdivision domain is a root node of the current target tree structure, and the first target data is stored in the corresponding target tree structure to obtain stock node data of the corresponding node; the stock node data comprises node position information, node data information and backup node data information;
The backup node data information is a backup of the corresponding node data information.
The node position information of the root node is the position number of the root node. The node position information of a child node consists of the position number of the parent node of the current node and the position number of the node itself, where a node's position number is the symbol denoting its position among the nodes of the same layer.
S3: generating stock sequence data according to the time sequence mode corresponding to the time stamp by the stock node data; counting the stock support degree and the stock co-occurrence degree of all the stock sequence data, and obtaining a frequent positive sequence mode set based on the stock support degree and the stock co-occurrence degree; mining time sequence modes in the frequent positive sequence mode set to obtain a stock time positive sequence mode set;
mining time sequence patterns in the frequent positive sequence pattern set, the obtaining a stock time positive sequence pattern set comprising:
counting the stock support degree and the stock co-occurrence degree of a time sequence mode, wherein the stock support degree is used for representing the total number of occurrences of the time sequence mode in all stock sequence data, the co-occurrence degree is used for representing how many stock sequence data the time sequence mode occurs in, the time sequence mode represents that a second time node occurs after a first time node occurs, and the first time node and the second time node are any two time nodes in the frequent positive sequence mode set;
And under the condition that the stock support degree corresponding to any two time nodes is larger than a preset stock support degree threshold value and the stock co-occurrence degree is larger than the preset stock co-occurrence degree threshold value, taking the corresponding stock sequence data as elements in the stock time positive sequence mode set.
S4: and sequentially carrying out hash operation and data combination on the stock node data in the stock time positive sequence mode set to obtain simplified stock node data, and storing the simplified stock node data in a target tree structure according to node position information.
The acquisition logic of the simplified stock node data is as follows:
the method comprises the steps that data grouping is carried out on stock node data in a stock time positive sequence mode set according to preset node data items, and a weighted hash value corresponding to the stock node data is calculated by utilizing a hash algorithm and weight values of all the stock node data;
comparing weighted hash values corresponding to all stock node data, and merging two stock node data with Hamming distance smaller than a preset Hamming threshold value between the hash values into the same similar stock data set;
combining the same item of similar stock data set into new stock node data, and marking the new stock node data as simplified stock node data; and updating node position information of the stock node data; the reduced stock node data is stored on the updated node location information.
The acquisition logic of the weight value of the stock node data is as follows:
the weight value comprises a fixed weight value and a fluctuation weight value, wherein the fixed weight value is the reciprocal of the distance from corresponding stock node data to the root node, and the fluctuation weight value is the difference value between the maximum value of the data fluctuation amplitude and a preset amplitude threshold value; the fluctuation weight value is determined according to any one of the following methods:
when the data fluctuation range of the stock node data is larger than or equal to a preset amplitude threshold value, increasing a weight value corresponding to the node data information;
and when the data fluctuation range of the stock node data is smaller than a preset amplitude threshold value, reducing the weight value corresponding to the node data information.
Updating node location information of stock node data includes:
acquiring node position information corresponding to node data information in the same similar stock data set;
obtaining the distance between the current node and the root node based on the node position information; the distance between the current node and the root node is the sum of the first distance and the second distance; the first distance is the distance from the parent node of the current node to the root node; the second distance is the ratio of the position number of the current node to the total number of node positions under the parent node;
comparing the distances of all the node position information with each other, and taking the node position information with the smallest distance to the root node as the updated node position information.
The invention can effectively organize and store a large amount of stock transaction data in a tree structure storage mode, and can rapidly locate and search specific stock node data by storing the data according to the subdivision field and the node position information, thereby improving the storage efficiency of the data; the frequent positive sequence mode can be mined through counting the stock support and co-occurrence of the stock sequence data, so that important trends and modes in stock trading are revealed, and the method has important significance in making investment strategies and predicting market trends; the method can also capture the time correlation in stock exchange, help analysts and investors find important time series patterns, and be used for market prediction and decision making, and finally, the original stock node data can be simplified through hash operation and data merging. The simplification can reduce the occupation of the storage space and reduce the storage cost; thus, the storage efficiency and the data mining capability of the stock exchange data can be improved, and more effective support is provided for stock analysis and decision making.
Example 3
An electronic device is shown according to an exemplary embodiment, comprising: a processor and a memory, wherein the memory stores a computer program for the processor to call;
the processor executes the above-described tree-structure-based stock exchange data columnar storage method by calling the computer program stored in the memory.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device may vary considerably with configuration or performance, and may include one or more processors (Central Processing Units, CPU) and one or more memories, where at least one computer program is stored in the memories and is loaded and executed by the processors to implement the tree-structure-based stock exchange data columnar storage method provided in the foregoing method embodiments. The electronic device may also include other components for implementing the functions of the device, for example a wired or wireless network interface and an input-output interface, which are not described in detail here.
Example 4
FIG. 5 is a schematic diagram of a computer-readable storage medium 100 according to one embodiment of the present application. The computer-readable storage medium 100 has computer-readable instructions stored thereon. When the computer-readable instructions are executed by a processor, the tree-structure-based stock exchange data columnar storage method according to the embodiments of the present application described with reference to the above drawings may be performed. The storage medium 100 includes, but is not limited to, volatile memory and/or non-volatile memory. Volatile memory may include, for example, random access memory (RAM) and cache memory. Non-volatile memory may include, for example, read-only memory (ROM), hard disk, and flash memory.
In addition, according to embodiments of the present application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, the present application provides a non-transitory machine-readable storage medium storing machine-readable instructions executable by a processor to perform instructions corresponding to the method steps provided by the present application, such as: collecting original stock trade data with a time stamp in stock trade, preprocessing the original stock trade data to obtain first target data, analyzing the subdivision field of the first target data based on a fuzzy function, and storing the first target data of the subdivision field on a corresponding storage node; constructing a target tree structure for the first target data of each subdivision domain, wherein the storage node of the first target data storage is a target tree structure of the corresponding subdivision domain, the subdivision domain is a root node of the current target tree structure, and the first target data is stored in the corresponding target tree structure to obtain stock node data of the corresponding node; the stock node data comprises node position information, node data information and backup node data information; generating stock sequence data according to the time sequence mode corresponding to the time stamp by the stock node data; counting the stock support degree and the stock co-occurrence degree of all the stock sequence data, and obtaining a frequent positive sequence mode set based on the stock support degree and the stock co-occurrence degree; mining time sequence modes in the frequent positive sequence mode set to obtain a stock time positive sequence mode set; and sequentially carrying out hash operation and data combination on the stock node data in the stock time positive sequence mode set to obtain simplified stock node data, and storing the simplified stock node data in a target tree structure according to node position information.
The methods and apparatus, devices of the present application may be implemented in numerous ways. For example, the methods and apparatus, devices of the present application may be implemented by software, hardware, firmware, or any combination of software, hardware, firmware. The above-described sequence of steps for the method is for illustration only, and the steps of the method of the present application are not limited to the sequence specifically described above unless specifically stated otherwise. Furthermore, in some embodiments, the present application may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present application. Thus, the present application also covers a recording medium storing a program for executing the method according to the present application.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
It should be understood that determining B from A does not mean determining B from A alone; B can also be determined from A and/or other information.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in accordance with the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more sets of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Finally: the foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (9)

1. A tree-structure-based columnar storage method for stock transaction data, characterized by comprising the following steps:
S1: collecting timestamped original stock transaction data generated during stock trading, preprocessing the original stock transaction data to obtain first target data, determining the subdivision domain of the first target data based on a fuzzy function, and storing the first target data of each subdivision domain on a corresponding storage node;
S2: constructing a target tree structure for the first target data of each subdivision domain, wherein the storage node on which the first target data is stored is the target tree structure of the corresponding subdivision domain, and the subdivision domain is the root node of that target tree structure; storing the first target data in the corresponding target tree structure to obtain stock node data of the corresponding nodes; the stock node data comprises node position information, node data information and backup node data information;
S3: generating stock sequence data from the stock node data according to the time sequence pattern corresponding to the timestamps; counting the stock support degree and the stock co-occurrence degree over all the stock sequence data, and obtaining a frequent positive sequence pattern set based on the stock support degree and the stock co-occurrence degree; mining the time sequence patterns in the frequent positive sequence pattern set to obtain a stock time positive sequence pattern set;
S4: sequentially performing a hash operation and data merging on the stock node data in the stock time positive sequence pattern set to obtain simplified stock node data, and storing the simplified stock node data in the target tree structure according to the node position information;
wherein mining the time sequence patterns in the frequent positive sequence pattern set to obtain the stock time positive sequence pattern set comprises:
counting the stock support degree and the stock co-occurrence degree of each time sequence pattern, wherein the stock support degree represents the total number of occurrences of the time sequence pattern across all stock sequence data, the stock co-occurrence degree represents the number of stock sequence data in which the time sequence pattern occurs, the time sequence pattern represents that a second time node occurs after a first time node, and the first time node and the second time node are any two time nodes in the frequent positive sequence pattern set;
when the stock support degree corresponding to any two time nodes is greater than a preset stock support degree threshold and the stock co-occurrence degree is greater than a preset stock co-occurrence degree threshold, taking the corresponding stock sequence data as elements of the stock time positive sequence pattern set;
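As an illustration of the two statistics defined above, the following Python sketch counts the support degree and the co-occurrence degree of ordered pairs of time nodes and keeps the pairs that exceed both thresholds. It is a minimal sketch and not the patented implementation: the function names, the sample labels, and the choice to count every index pair (i < j) within a sequence as one occurrence are assumptions made here for illustration.

```python
from collections import Counter
from itertools import combinations


def count_pattern_stats(sequences):
    """Return (support, cooccurrence) counters keyed by ordered pair (a, b).

    support[(a, b)]      -> total occurrences of "a then b" over all sequences
    cooccurrence[(a, b)] -> number of sequences containing "a then b"
    """
    support = Counter()
    cooccurrence = Counter()
    for seq in sequences:
        pairs_in_seq = Counter()
        # Assumption: each index pair i < j with different labels is one occurrence.
        for i, j in combinations(range(len(seq)), 2):
            if seq[i] != seq[j]:
                pairs_in_seq[(seq[i], seq[j])] += 1
        support.update(pairs_in_seq)            # add per-sequence counts
        cooccurrence.update(set(pairs_in_seq))  # each pair counted once per sequence
    return support, cooccurrence


def frequent_ordered_patterns(sequences, min_support, min_cooccurrence):
    """Keep the pairs whose support and co-occurrence both exceed the thresholds."""
    support, cooccurrence = count_pattern_stats(sequences)
    return {
        pair
        for pair in support
        if support[pair] > min_support and cooccurrence[pair] > min_cooccurrence
    }


# Hypothetical time-node labels, for illustration only.
sequences = [
    ["open", "rise", "close"],
    ["open", "rise", "rise"],
    ["rise", "open"],
]
patterns = frequent_ordered_patterns(sequences, min_support=2, min_cooccurrence=1)
```

The sketch returns the qualifying patterns themselves; selecting the stock sequence data that contain them, as the claim describes, would be one further filtering pass over `sequences`.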
wherein the simplified stock node data is obtained as follows:
grouping the stock node data in the stock time positive sequence pattern set according to preset node data items, and calculating a weighted hash value for each stock node data using a hash algorithm and the weight values of the stock node data;
comparing the weighted hash values of all the stock node data, and placing any two stock node data whose hash values have a Hamming distance smaller than a preset Hamming threshold into the same similar stock data set;
merging each similar stock data set of the same item into new stock node data, and marking the new stock node data as simplified stock node data; updating the node position information of the stock node data; and storing the simplified stock node data at the updated node position information;
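The claim does not name the hash algorithm. One way to obtain hash values whose Hamming distance reflects similarity is a SimHash-style weighted fingerprint, sketched below in Python. The use of SimHash, of SHA-256 as the underlying hash, of per-item weights, and of greedy grouping against the first member of each group are all assumptions of this sketch rather than details taken from the patent.

```python
import hashlib


def simhash(weighted_items, bits=64):
    """SimHash-style fingerprint of (item, weight) pairs (assumed technique)."""
    totals = [0.0] * bits
    for item, weight in weighted_items:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for b in range(bits):
            # Each item votes +weight or -weight on every bit position.
            totals[b] += weight if (h >> b) & 1 else -weight
    return sum(1 << b for b in range(bits) if totals[b] > 0)


def hamming(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def group_similar(fingerprints, threshold):
    """Greedy grouping: a node joins the first group whose first member is
    closer than `threshold` in Hamming distance, else it starts a new group."""
    groups = []
    for node_id, fp in fingerprints.items():
        for group in groups:
            if hamming(fp, group["fp"]) < threshold:
                group["members"].append(node_id)
                break
        else:
            groups.append({"fp": fp, "members": [node_id]})
    return [g["members"] for g in groups]


# Hand-made 4-bit fingerprints with hypothetical node ids, for illustration.
groups = group_similar({"n1": 0b0000, "n2": 0b0001, "n3": 0b1111}, threshold=2)
```

Each resulting group plays the role of one similar stock data set; how the members of a group are merged into a single record is left open here, since the claim does not specify it.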
wherein updating the node position information of the stock node data comprises:
acquiring the node position information corresponding to the node data information in the same similar stock data set;
obtaining the distance between the current node and the root node based on the node position information, wherein the distance between the current node and the root node is the sum of a first distance and a second distance, the first distance is the distance from the parent node of the current node to the root node, and the second distance is the ratio of the position number of the current node to the total number of node positions under the parent node;
comparing all the node position information with one another, and taking the node position information with the smallest distance to the root node as the updated node position information.
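The recursive distance defined above can be written as d(node) = d(parent) + p / n, where p is the position number of the node and n is the total number of node positions under its parent. The Python sketch below evaluates this sum and picks the candidate closest to the root. The path representation, the 1-based position numbers, the root distance of 0, and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def distance_to_root(path):
    """Distance of a node from the root.

    `path` lists (position_number, total_positions_under_parent) for every
    level from the root's child down to the node itself. The root is assumed
    to have distance 0, so an empty path yields 0.
    """
    return sum(Fraction(pos, total) for pos, total in path)


def pick_merge_location(candidates):
    """Return the key of the candidate whose node lies closest to the root."""
    return min(candidates, key=lambda name: distance_to_root(candidates[name]))


# Hypothetical nodes from one similar stock data set.
candidates = {
    "A": [(1, 2), (3, 4)],  # 1/2 + 3/4 = 5/4
    "B": [(2, 2)],          # 2/2     = 1
    "C": [(1, 4)],          # 1/4
}
target = pick_merge_location(candidates)
```

Exact fractions are used so that ties between candidates are not decided by floating-point rounding.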
2. The tree-structure-based columnar storage method for stock transaction data according to claim 1, wherein determining the subdivision domain corresponding to the first target data comprises:
performing data preprocessing on the original stock transaction data to obtain the first target data, the data preprocessing including, but not limited to, one or more of the following: filtering erroneous data, filtering duplicate data, and normalizing the data;
extracting subdivision domain features from the first target data, and converting the subdivision domain features into fuzzy domain variables;
calculating, from the fuzzy domain variables through a membership function, the membership degree of the subdivision domain to which the stock transaction data belongs, and dividing the membership degree to obtain the fuzzy numerical interval corresponding to each fuzzy domain variable;
characterizing the subdivision domain of the stock transaction data based on the fuzzy numerical intervals to which the subdivision domain features map.
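To make the membership step concrete, the Python sketch below evaluates triangular membership functions for one normalized feature and assigns the record to the domain with the highest membership degree. The triangular shape, the three domain names and their breakpoints, and the arg-max assignment are assumptions made for illustration; the claim specifies neither the membership function nor the interval boundaries.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical subdivision domains over a feature normalized to [0, 1].
DOMAINS = {
    "low_volatility": (-0.5, 0.0, 0.5),
    "mid_volatility": (0.0, 0.5, 1.0),
    "high_volatility": (0.5, 1.0, 1.5),
}


def classify(x):
    """Return (best_domain, memberships) for a fuzzy domain variable x."""
    memberships = {name: triangular(x, *abc) for name, abc in DOMAINS.items()}
    best = max(memberships, key=memberships.get)
    return best, memberships


domain, degrees = classify(0.2)
```

With these breakpoints a value of 0.2 has non-zero membership in both the low and the mid domain, which is the overlap that distinguishes a fuzzy partition from a hard threshold.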
3. The tree-structure-based columnar storage method for stock transaction data according to claim 2, wherein obtaining the subdivision domain features comprises:
extracting associated domain features of the subdivision domain corresponding to the first target data using a preset feature extraction network model;
obtaining the subdivision domain features by applying a cross-validation model to the associated domain features and the first target data.
4. The tree-structure-based columnar storage method for stock transaction data according to claim 3, wherein the backup node data information is a backup of the corresponding node data information.
5. The tree-structure-based columnar storage method for stock transaction data according to claim 4, wherein the node position information of the root node is the position number of the root node; the first target data classified under the root node forms child nodes; the node position information of a child node consists of the position number of the parent node of the current node and the position number of the node itself; and the position number of a node is an identifier indicating its position among the nodes of the same layer.
6. The tree-structure-based columnar storage method for stock transaction data according to claim 5, wherein the weight value of the stock node data is obtained as follows:
the weight value comprises a fixed weight value and a fluctuation weight value, wherein the fixed weight value is the reciprocal of the distance from the corresponding stock node data to the root node, and the fluctuation weight value is the difference between the maximum value of the data fluctuation amplitude and a preset amplitude threshold; the fluctuation weight value is determined according to either of the following:
when the data fluctuation amplitude of the stock node data is greater than or equal to the preset amplitude threshold, increasing the weight value corresponding to the node data information;
when the data fluctuation amplitude of the stock node data is smaller than the preset amplitude threshold, decreasing the weight value corresponding to the node data information.
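A small Python sketch of the weight defined above follows. It assumes that the fixed and fluctuation parts are simply added, which the claim does not state explicitly; under that assumption the sign of the fluctuation part automatically raises the weight when the maximum amplitude reaches the threshold and lowers it otherwise. The function and parameter names are hypothetical.

```python
def node_weight(distance_to_root, amplitudes, amplitude_threshold):
    """Weight of a stock node: fixed part plus fluctuation part (assumed sum).

    fixed       = 1 / distance from the node to the root
    fluctuation = max(amplitudes) - amplitude_threshold
    """
    if distance_to_root <= 0:
        raise ValueError("distance_to_root must be positive for a non-root node")
    fixed = 1.0 / distance_to_root
    fluctuation = max(amplitudes) - amplitude_threshold
    return fixed + fluctuation


# Amplitude above the threshold raises the weight above the fixed part (0.5).
w_active = node_weight(2.0, [0.01, 0.05], amplitude_threshold=0.03)
# Amplitude below the threshold lowers it.
w_quiet = node_weight(2.0, [0.01], amplitude_threshold=0.03)
```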
7. A stock transaction data storage server for executing the tree-structure-based columnar storage method for stock transaction data according to any one of claims 1 to 6, wherein the server comprises:
a data acquisition module, configured to collect timestamped original stock transaction data generated during stock trading, preprocess the original stock transaction data to obtain first target data, determine the subdivision domain of the first target data based on a fuzzy function, and store the first target data of each subdivision domain on a corresponding storage node;
a tree structure construction module, configured to construct a target tree structure for the first target data of each subdivision domain, wherein the storage node on which the first target data is stored is the target tree structure of the corresponding subdivision domain, the subdivision domain is the root node of that target tree structure, and the first target data is stored in the corresponding target tree structure to obtain stock node data of the corresponding nodes; the stock node data comprises node position information, node data information and backup node data information;
a simplified sequence generation module, configured to generate stock sequence data from the stock node data according to the time sequence pattern corresponding to the timestamps; count the stock support degree and the stock co-occurrence degree over all the stock sequence data, and obtain a frequent positive sequence pattern set based on the stock support degree and the stock co-occurrence degree; and mine the time sequence patterns in the frequent positive sequence pattern set to obtain a stock time positive sequence pattern set;
a data storage module, configured to sequentially perform a hash operation and data merging on the stock node data in the stock time positive sequence pattern set to obtain simplified stock node data, and store the simplified stock node data in the target tree structure according to the node position information.
8. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program that can be called by the processor;
the processor performs the tree-structure-based columnar storage method for stock transaction data according to any one of claims 1 to 6 by calling the computer program stored in the memory.
9. A computer-readable storage medium, characterized in that instructions are stored thereon which, when executed on a computer, cause the computer to perform the tree-structure-based columnar storage method for stock transaction data according to any one of claims 1 to 6.
CN202311022942.5A 2023-08-15 2023-08-15 Stock transaction data column type storage method and server based on tree structure Active CN116737727B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311022942.5A CN116737727B (en) 2023-08-15 2023-08-15 Stock transaction data column type storage method and server based on tree structure

Publications (2)

Publication Number Publication Date
CN116737727A CN116737727A (en) 2023-09-12
CN116737727B true CN116737727B (en) 2023-12-01

Family

ID=87904776

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311022942.5A Active CN116737727B (en) 2023-08-15 2023-08-15 Stock transaction data column type storage method and server based on tree structure

Country Status (1)

Country Link
CN (1) CN116737727B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101419630A (en) * 2008-12-11 2009-04-29 中国科学院计算技术研究所 Top-k item digging method and system in data flow
CN104574153A (en) * 2015-01-19 2015-04-29 齐鲁工业大学 Method for quickly applying negative sequence mining patterns to customer purchasing behavior analysis
KR101564616B1 (en) * 2015-04-13 2015-11-02 주식회사 프로이트 Method for analyzing big data based on association rule
CN107783993A (en) * 2016-08-25 2018-03-09 阿里巴巴集团控股有限公司 The storage method and device of data
CN111209591A (en) * 2019-12-31 2020-05-29 浙江工业大学 Storage structure sorted according to time and quick query method
CN112837158A (en) * 2021-02-19 2021-05-25 苏州科知律信息科技有限公司 Stock data acquisition and storage method, device and system based on cloud computing technology
CN115757411A (en) * 2022-11-17 2023-03-07 企知道网络技术有限公司 Stock market information data management method, system, equipment and storage medium
CN115964501A (en) * 2021-10-11 2023-04-14 中国移动通信集团设计院有限公司 Data processing method and device, computing equipment and computer storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Erasable pattern mining based on tree structures with damped window over data streams;Yoonji Baek等;《Engineering Applications of Artificial Intelligence》;1-20 *
关联规则推荐的高效分布式计算框架;李昌盛;伍之昂;张璐;曹杰;;计算机学报(06);1218-1231 *

Also Published As

Publication number Publication date
CN116737727A (en) 2023-09-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20240430
Address after: 201100 floor 2, building 2, No. 1508, Kunyang Road, Minhang District, Shanghai
Patentee after: Shanghai Kafang Information Technology Co.,Ltd.
Country or region after: China
Address before: Room 1801, Building 3, No. 1186-1 Bin'an Road, Changhe Street, Binjiang District, Hangzhou City, Zhejiang Province, 310000
Patentee before: Hangzhou Chi-squared distribution Information Technology Co.,Ltd.
Country or region before: China