CN110209742B - Block chain based storage system and method classified according to data importance - Google Patents

Block chain based storage system and method classified according to data importance

Info

Publication number
CN110209742B
Authority
CN
China
Prior art keywords
data
storage
module
importance
routing information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910521926.8A
Other languages
Chinese (zh)
Other versions
CN110209742A (en)
Inventor
于辉
刘善武
李进
刘文敏
唐坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN201910521926.8A
Publication of CN110209742A
Application granted
Publication of CN110209742B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 Database-specific techniques

Abstract

The invention relates to a block chain based storage system and storage method classified according to data importance. The system comprises a metadata crawler module, a data importance judgment module, a data repeatability judgment module, a data storage routing information module, an SQL Server database cluster, and a block chain storage module. Data to be stored is routed through the data storage routing information module to either the SQL Server database cluster or the block chain storage module: the SQL Server database cluster stores non-important data, and the block chain storage module stores important data. The system and method store important data and non-important data separately while avoiding repeated storage, and use the openness, transparency, and non-deletability of the block chain so that important data in a mass data resource pool can be accessed efficiently, improving storage efficiency and the utilization rate of storage equipment.

Description

Block chain based storage system and method classified according to data importance
Technical Field
The invention relates to the technical field of block chains, and in particular to a block chain based storage system and storage method classified according to data importance.
Background
With the continuous development of cloud computing and big data technology, data volume is growing geometrically. As a result, data retrieval is becoming less efficient, and much important data is buried in huge amounts of garbage (non-important) data, which wastes storage resources and reduces the utilization efficiency of data resources.
Existing storage modes and storage systems do not judge the importance of data and do not store and retrieve important data effectively, and storing unimportant garbage data consumes a large amount of storage hardware.
Block chain technology is a new distributed infrastructure and computing paradigm that uses a chained block data structure to verify and store data, a distributed node consensus algorithm to generate and update data, cryptography to secure data transmission and access, and smart contracts composed of automated script code to program and operate on data. Block chain technology is open, transparent, tamper-resistant, and permanent in storage, so the authenticity and robustness of its data are high. The block chain is already widely used in the financial industry, where Bitcoin is the most representative of the various virtual currencies; it also has good application prospects in other industries such as logistics and real estate.
In contrast to traditional databases and other recording modes, the invention provides a block chain based storage system and storage method classified according to data importance.
Disclosure of Invention
To remedy the defects of the prior art, the invention provides a simple and efficient block chain based storage system and method classified according to data importance.
The invention is realized through the following technical solution:
A block chain based storage system classified according to data importance comprises a metadata crawler module, a data importance judgment module, a data repeatability judgment module, a data storage routing information module, an SQL Server database cluster, and a block chain storage module. The metadata crawler module is connected to the data storage routing information module through the data importance judgment module; data storage requirements reach the data storage routing information module through the data repeatability judgment module; and data to be stored reaches the SQL Server database cluster and the block chain storage module through the data storage routing information module. The SQL Server database cluster stores non-important data, and the block chain storage module stores important data.
The SQL Server database cluster comprises an SQL Server storage cluster management module and an SQL Server storage node. The SQL Server storage cluster management module manages the storage of non-important data in the SQL Server storage node and also monitors and manages the state of the SQL Server storage node.
The data storage routing information module is also connected to a data storage routing information backup module, and is connected to the block chain storage module through a smart contract module. The smart contract module automatically retrieves the important data and calculates its specific storage location in the block chain storage module.
The metadata crawler module crawls key data and important data sources in the resource's field from the configured data crawling sources, according to the resource type. The data importance judgment module calculates an importance value IMP and compares it with a data importance standard value QUA preset by the user: if IMP is greater than QUA, the data is judged to be important data; otherwise it is judged to be non-important data. The data storage routing information module stores the routing information records of stored data, which serve as index records. The data storage routing information backup module stores a backup of those routing information records and forms a primary/standby pair with the data storage routing information module; when a failure of the data storage routing information module affects overall function, service is actively switched to the backup module, which provides the same routing retrieval function and thereby improves the robustness of the system. The data repeatability judgment module compares a data storage requirement with the stored records of the data storage routing information module to judge whether the data is a duplicate, and if so, abandons the storage.
The data crawling sources are set and updated by the user and include search engines and thesis, academic, and journal websites.
The calculation formula of the importance value IMP of the resource Q is as follows:
Figure GDA0003110199370000021
where λ is a correction factor greater than 0 and less than 1; M is the number of important sources associated with resource Q, and M_total is the total number of important sources configured for the system; N is the number of times resource Q is referenced, and N_total is the number of times all resources crawled by the crawler are referenced; L is the number of times resource Q is retrieved by users, and L_total is the number of times all resources are retrieved by users; W is the number of resources already in the system that reference resource Q, and W_total is the total number of resources already in the system.
In the storage method of the block chain based storage system classified according to data importance, data repeatability and data importance are judged when data is stored, so repeated storage is avoided. According to the importance judgment result, the smart contract module is called and important data is stored using block chain technology, while non-important data is stored in the SQL Server database cluster. Important data and non-important data are thus stored separately, which improves the storage efficiency of important data and the resource utilization rate of storage equipment. Using the openness, transparency, and non-deletability of the block chain, important data in a mass data resource pool can be accessed efficiently, improving the utilization of storage resources and equipment.
The storage method of the block chain based storage system classified according to data importance comprises the following steps:
(1) after a user inputs a data storage requirement, the data repeatability judgment module checks whether the data storage routing information module already holds a record of the same data, and if so, the data storage requirement is abandoned;
(2) the data storage requirement for data judged to be non-repeated is passed to the data storage routing information module, which sends the data to be stored to the data importance judgment module; that module calculates the importance value IMP and compares it with the data importance standard value QUA preset by the user, judging the data to be important data if IMP is greater than QUA and non-important data otherwise;
(3) the data importance judgment module returns its result to the data storage routing information module; the smart contract module automatically retrieves important data and stores it in the block chain storage module, and non-important data is sent to the SQL Server database cluster for storage;
(4) the data storage routing information module stores the routing information record of the stored data for later lookup, and the data storage routing information backup module stores a backup of that record.
The beneficial effects of the invention are as follows: the block chain based storage system and storage method classified according to data importance store important data and non-important data separately while avoiding repeated storage, and use the openness, transparency, and non-deletability of the block chain so that important data in a mass data resource pool can be accessed efficiently, improving storage efficiency and the utilization rate of storage equipment.
Drawings
FIG. 1 is a schematic diagram of the block chain based storage system and method classified according to data importance according to the present invention.
Detailed Description
In order to make the technical problems to be solved, the technical solutions, and the advantageous effects of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and embodiments. The specific embodiments described herein merely illustrate the invention and are not intended to limit it.
The block chain based storage system classified according to data importance comprises a metadata crawler module, a data importance judgment module, a data repeatability judgment module, a data storage routing information module, an SQL Server database cluster, and a block chain storage module. The metadata crawler module is connected to the data storage routing information module through the data importance judgment module; data storage requirements reach the data storage routing information module through the data repeatability judgment module; and data to be stored reaches the SQL Server database cluster and the block chain storage module through the data storage routing information module. The SQL Server database cluster stores non-important data, and the block chain storage module stores important data.
The SQL Server database cluster comprises an SQL Server storage cluster management module and an SQL Server storage node. The SQL Server storage cluster management module manages the storage of non-important data in the SQL Server storage node and also monitors and manages the state of the SQL Server storage node.
The data storage routing information module is also connected to a data storage routing information backup module, and is connected to the block chain storage module through a smart contract module. The smart contract module automatically retrieves the important data and calculates its specific storage location in the block chain storage module.
The metadata crawler module crawls key data and important data sources in the resource's field from the configured data crawling sources, according to the resource type. The data importance judgment module calculates an importance value IMP and compares it with a data importance standard value QUA preset by the user: if IMP is greater than QUA, the data is judged to be important data; otherwise it is judged to be non-important data. The data storage routing information module stores the routing information records of stored data, which serve as index records. The data storage routing information backup module stores a backup of those routing information records and forms a primary/standby pair with the data storage routing information module; when a failure of the data storage routing information module affects overall function, service is actively switched to the backup module, which provides the same routing retrieval function and thereby improves the robustness of the system. The data repeatability judgment module compares a data storage requirement with the stored records of the data storage routing information module to judge whether the data is a duplicate, and if so, abandons the storage.
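The primary/standby arrangement between the routing information module and its backup module can be illustrated with a minimal Python sketch. The class and method names (RoutingStore, RoutingService, record, lookup) are hypothetical, the stores are in-memory stand-ins, and the failure signal is simulated with a flag; the patent does not specify how failure is detected or how the switch is carried out.

```python
class RoutingStoreDown(Exception):
    """Raised when a routing information store cannot serve requests."""


class RoutingStore:
    """In-memory stand-in for a routing information module (hypothetical)."""

    def __init__(self):
        self.records = {}    # data key -> storage location
        self.healthy = True  # simulated health flag (assumption)

    def put(self, key, location):
        if not self.healthy:
            raise RoutingStoreDown()
        self.records[key] = location

    def get(self, key):
        if not self.healthy:
            raise RoutingStoreDown()
        return self.records.get(key)


class RoutingService:
    """Writes to primary and backup; lookups fail over to the backup."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def record(self, key, location):
        # Every routing record is also written to the backup store,
        # so the backup can answer lookups after a switch-over.
        self.backup.put(key, location)
        try:
            self.primary.put(key, location)
        except RoutingStoreDown:
            pass  # the backup already holds the record

    def lookup(self, key):
        try:
            return self.primary.get(key)
        except RoutingStoreDown:
            # Primary failed: serve the same retrieval from the backup.
            return self.backup.get(key)
```

In this sketch the backup is written first so that a primary failure during a write never loses the record; that ordering is a design choice of the example, not something the text prescribes.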
The data crawling sources are set and updated by the user and include search engines and thesis, academic, and journal websites.
The calculation formula of the importance value IMP of the resource Q is as follows:
Figure GDA0003110199370000051
where λ is a correction factor greater than 0 and less than 1; M is the number of important sources associated with resource Q, and M_total is the total number of important sources configured for the system; N is the number of times resource Q is referenced, and N_total is the number of times all resources crawled by the crawler are referenced; L is the number of times resource Q is retrieved by users, and L_total is the number of times all resources are retrieved by users; W is the number of resources already in the system that reference resource Q, and W_total is the total number of resources already stored in the system.
The correction factor λ can be adjusted dynamically according to how the system is operating. When little data is stored in the system (the SQL Server database cluster and the block chain storage module), usage statistics for the stored resources may be inaccurate, so the value of λ can be increased appropriately; 1 - λ then becomes smaller, and the importance is calculated mainly from Internet data gathered by the metadata crawler module. As the system continues to operate and the stored data gradually increases, the value of λ can be reduced appropriately, which raises the weight of the latter half of the formula.
For example, the system can start with λ = 0.8 and reduce λ by 0.1 after each period of operation; once the system gradually reaches dynamic balance, the value of λ stabilizes.
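The adjustment of λ described above can be sketched as a simple schedule. The initial value 0.8 and the step 0.1 come from the example in the text; the floor of 0.3 and the number of periods are assumptions made only for illustration, since the text says no more than that λ eventually stabilizes. The function name lambda_schedule is hypothetical.

```python
def lambda_schedule(initial=0.8, step=0.1, floor=0.3, periods=8):
    """Return the correction factor used in each operating period.

    initial and step follow the example in the text; floor and periods
    are assumed values standing in for "the value tends to be stable".
    """
    values = []
    lam = initial
    for _ in range(periods):
        values.append(round(lam, 2))
        # Lower the factor each period, but never below the assumed floor,
        # so it stays strictly between 0 and 1.
        lam = max(floor, lam - step)
    return values


print(lambda_schedule())
```

With these assumed defaults the factor falls from 0.8 in steps of 0.1 and then holds at the floor, which shifts weight gradually toward the latter half of the formula as stored data accumulates.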
In the storage method of the block chain based storage system classified according to data importance, data repeatability and data importance are judged when data is stored, so repeated storage is avoided. According to the importance judgment result, the smart contract module is called and important data is stored using block chain technology, while non-important data is stored in the SQL Server database cluster. Important data and non-important data are thus stored separately, which improves the storage efficiency of important data and the resource utilization rate of storage equipment. Using the openness, transparency, and non-deletability of the block chain, important data in a mass data resource pool can be accessed efficiently, improving the utilization of storage resources and equipment.
The storage method of the block chain based storage system classified according to data importance comprises the following steps:
(1) after a user inputs a data storage requirement, the data repeatability judgment module checks whether the data storage routing information module already holds a record of the same data, and if so, the data storage requirement is abandoned;
(2) the data storage requirement for data judged to be non-repeated is passed to the data storage routing information module, which sends the data to be stored to the data importance judgment module; that module calculates the importance value IMP and compares it with the data importance standard value QUA preset by the user, judging the data to be important data if IMP is greater than QUA and non-important data otherwise;
(3) the data importance judgment module returns its result to the data storage routing information module; the smart contract module automatically retrieves important data and stores it in the block chain storage module, and non-important data is sent to the SQL Server database cluster for storage;
(4) the data storage routing information module stores the routing information record of the stored data for later lookup, and the data storage routing information backup module stores a backup of that record.
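Steps (1) to (4) can be walked through with a small Python sketch. Everything here is a toy stand-in: the StorageRouter class, its attribute names, and the use of a SHA-256 content hash to detect repeated data are assumptions (the text does not say how duplicates are recognized), and the importance value IMP is passed in already computed because the formula appears only as an image in this record.

```python
import hashlib


class StorageRouter:
    """Toy model of the four-step storage method (all names hypothetical)."""

    def __init__(self, qua):
        self.qua = qua             # user-preset importance standard value QUA
        self.routing = {}          # routing information records (index)
        self.routing_backup = {}   # backup copy of the routing records
        self.chain_store = []      # stand-in for the block chain storage module
        self.sql_store = []        # stand-in for the SQL Server database cluster

    def store(self, data: bytes, imp: float) -> str:
        # Assumption: a content hash identifies repeated data.
        key = hashlib.sha256(data).hexdigest()
        # Step 1: abandon the request if the data is already recorded.
        if key in self.routing:
            return "duplicate"
        # Step 2: important only if IMP is strictly greater than QUA.
        important = imp > self.qua
        # Step 3: route to the block chain store or the SQL store.
        if important:
            self.chain_store.append(data)
            target = "blockchain"
        else:
            self.sql_store.append(data)
            target = "sqlserver"
        # Step 4: record the routing information and back it up.
        self.routing[key] = target
        self.routing_backup[key] = target
        return target
```

A short usage example under the same assumptions:

```python
router = StorageRouter(qua=0.5)
print(router.store(b"paper-A", imp=0.9))  # stored as important data
print(router.store(b"paper-A", imp=0.9))  # repeated request is abandoned
print(router.store(b"note-B", imp=0.5))   # IMP not greater than QUA
```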

Claims (4)

1. A block chain based storage system classified according to data importance, comprising: a metadata crawler module, a data importance judgment module, a data repeatability judgment module, a data storage routing information module, an SQL Server database cluster, and a block chain storage module; the metadata crawler module is connected to the data storage routing information module through the data importance judgment module; a data storage requirement input by a user comprises a resource Q to be stored; the resource Q to be stored reaches the data storage routing information module through the data repeatability judgment module, and reaches the SQL Server database cluster and the block chain storage module through the data storage routing information module; the SQL Server database cluster stores non-important data, and the block chain storage module stores important data;
the SQL Server database cluster comprises an SQL Server storage cluster management module and an SQL Server storage node, wherein the SQL Server storage cluster management module manages the storage of non-important data in the SQL Server storage node and also monitors and manages the state of the SQL Server storage node;
the data storage routing information module is also connected to a data storage routing information backup module, and is connected to the block chain storage module through a smart contract module; the smart contract module automatically retrieves the important data and calculates its specific storage location in the block chain storage module;
the metadata crawler module is used for crawling key data and important data sources in the resource's field from the configured data crawling sources according to the type of the resource Q; the data importance judgment module is used for calculating an importance value IMP of the resource Q and comparing it with a data importance standard value QUA preset by the user, judging the resource Q to be important data if the calculated importance value IMP is greater than the data importance standard value QUA, and non-important data otherwise; the data storage routing information module is used for storing the routing information records of stored data, which serve as index records; the data storage routing information backup module is used for storing a backup of the routing information records and forms a primary/standby pair with the data storage routing information module; when a failure of the data storage routing information module affects overall function, service is actively switched to the data storage routing information backup module, which provides the same routing retrieval function as the data storage routing information module and thereby improves the robustness of the system; the data repeatability judgment module is used for comparing the data storage requirement with the stored records of the data storage routing information module to judge whether the resource Q is a duplicate, and if so, abandoning the storage;
the calculation formula of the importance value IMP of the resource Q is as follows:
Figure FDA0003110199360000011
wherein λ is a correction factor greater than 0 and less than 1; M is the number of important data sources associated with resource Q, and M_total is the total number of important data sources configured for the system; N is the number of times resource Q is referenced, and N_total is the number of times all resources in the important data sources associated with resource Q are referenced; L is the number of times resource Q is retrieved, and L_total is the number of times all resources in the important data sources associated with resource Q are retrieved; W is the number of resources in the SQL Server database cluster and the block chain storage module that reference resource Q, and W_total is the total number of resources in the SQL Server database cluster and the block chain storage module.
2. The block chain based storage system classified according to data importance according to claim 1, wherein: the data crawling sources are set and updated by the user and include search engines and thesis, academic, and journal websites.
3. A storage method of the block chain based storage system classified according to data importance according to claim 1 or 2, wherein: data repeatability and data importance are judged when data is stored, so that repeated storage is avoided; the smart contract module is called according to the importance judgment result, important data is stored using block chain technology, and non-important data is stored in the SQL Server database cluster, so that important data and non-important data are stored separately and the storage efficiency of important data and the resource utilization rate of storage equipment are improved; and the openness, transparency, and non-deletability of the block chain are used to achieve efficient access to important data in a mass data resource pool and to improve the utilization of storage resources and equipment.
4. The storage method according to claim 3, comprising the following steps:
(1) after a user inputs a data storage requirement, the data repeatability judgment module checks whether the data storage routing information module already holds a record of the same data, and if so, the data storage requirement is abandoned;
(2) the data storage requirement for data judged to be non-repeated is passed to the data storage routing information module, which sends the data to be stored to the data importance judgment module; that module calculates the importance value IMP and compares it with the data importance standard value QUA preset by the user, judging the data to be important data if IMP is greater than QUA and non-important data otherwise;
(3) the data importance judgment module returns its result to the data storage routing information module; the smart contract module automatically retrieves important data and stores it in the block chain storage module, and non-important data is sent to the SQL Server database cluster for storage;
(4) the data storage routing information module stores the routing information record of the stored data for later lookup, and the data storage routing information backup module stores a backup of that record.
CN201910521926.8A 2019-06-17 2019-06-17 Block chain based storage system and method classified according to data importance Active CN110209742B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910521926.8A CN110209742B (en) 2019-06-17 2019-06-17 Block chain based storage system and method classified according to data importance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910521926.8A CN110209742B (en) 2019-06-17 2019-06-17 Block chain based storage system and method classified according to data importance

Publications (2)

Publication Number Publication Date
CN110209742A CN110209742A (en) 2019-09-06
CN110209742B true CN110209742B (en) 2021-07-27

Family

ID=67793130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910521926.8A Active CN110209742B (en) 2019-06-17 2019-06-17 Block chain based storage system and method classified according to data importance

Country Status (1)

Country Link
CN (1) CN110209742B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111680321A (en) * 2020-05-20 2020-09-18 厦门区块链云科技有限公司 Block chain decentralized digital asset management system
CN112633881A (en) * 2020-12-24 2021-04-09 中付(深圳)技术服务有限公司 Transaction information storage and financial storage method based on traditional data and block chain

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930447A (en) * 2009-12-31 2010-12-29 北京中加国道科技有限公司 Retrieval system for network academic resources
CN103823900A (en) * 2014-03-17 2014-05-28 北京百度网讯科技有限公司 Information point significance determining method and device
CN104408174A (en) * 2014-12-12 2015-03-11 用友软件股份有限公司 Database routing device and method
CN106570074A (en) * 2016-10-14 2017-04-19 深圳前海微众银行股份有限公司 Distributed database system and implementation method thereof
CN108259622A (en) * 2018-02-07 2018-07-06 福建南威软件有限公司 A kind of trans-regional sharing method of electronics license data
CN109299188A (en) * 2018-08-21 2019-02-01 平安科技(深圳)有限公司 Data storage method using block chain, device and electronic equipment
CN109726966A (en) * 2019-01-11 2019-05-07 青岛华制智能互联科技有限公司 A kind of factory's material management system based on block chain

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW200407736A (en) * 2002-11-08 2004-05-16 Hon Hai Prec Ind Co Ltd System and method for classifying patents and displaying patent classification
CN106230952A (en) * 2016-08-05 2016-12-14 王楚 Monitor the big data storing platform network architecture
CN106815526A (en) * 2016-12-27 2017-06-09 苏州春禄电子科技有限公司 A kind of safety-type database storage system based on block chain technology
CN107704196B (en) * 2017-03-09 2020-03-27 深圳壹账通智能科技有限公司 Block chain data storage system and method
CN109379429A (en) * 2018-10-25 2019-02-22 龚玉环 A kind of multichain management method and system based on block chain

Also Published As

Publication number Publication date
CN110209742A (en) 2019-09-06

Similar Documents

Publication Publication Date Title
US7890488B2 (en) System and method for caching posting lists
US8266147B2 (en) Methods and systems for database organization
Santos et al. Real-time data warehouse loading methodology
CN106104525B (en) Event processing system
CN108600321A (en) A kind of diagram data storage method and system based on distributed memory cloud
CN110134714B (en) Distributed computing framework cache index method suitable for big data iterative computation
CN112015741A (en) Method and device for storing massive data in different databases and tables
CN103595805A (en) Data placement method based on distributed cluster
CN102722553A (en) Distributed type reverse index organization method based on user log analysis
CN110209742B (en) Block chain based storage system and method classified according to data importance
EP1883022A1 (en) Fast algorithms for computing semijoin reduction sequences
CN104679646A (en) Method and device for detecting defects of SQL (structured query language) code
CN109325001B (en) Method, device and equipment for deleting small files based on metadata server
CN117076466B (en) Rapid data indexing method for large archive database
US11449521B2 (en) Database management system
CN112241396B (en) Spark-based method and system for merging small files of Delta
Doulkeridis et al. On saying "enough already!" in MapReduce
Schuh et al. AIR: adaptive index replacement in Hadoop
US20130013824A1 (en) Parallel aggregation system
CN116126901A (en) Data processing method, device, electronic equipment and computer readable storage medium
CN111813833B (en) Real-time two-degree communication relation data mining method
CN101996246A (en) Method and system for instant indexing
CN114253938A (en) Data management method, data management device, and storage medium
CN111260452A (en) Method and system for constructing tax big data model
CN111949439B (en) Database-based data file updating method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant