CN113901131B - Index-based on-chain data query method and device - Google Patents

Index-based on-chain data query method and device

Publication number
CN113901131B
Authority
CN
China
Prior art keywords
data
abstract
operation record
index
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111027751.9A
Other languages
Chinese (zh)
Other versions
CN113901131A (en
Inventor
王红熳
刘明民
杨放春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202111027751.9A
Publication of CN113901131A
Application granted
Publication of CN113901131B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an index-based method and device for querying on-chain data. The method comprises: constructing an abstract dictionary tree index for the extracted data abstracts, packaging the data abstracts to be put on chain together with the constructed abstract dictionary tree index into a block structure, and putting the block on chain; constructing a data operation record for each operation on the original data, constructing or updating a data operation record chain index for the constructed records, packaging the data operation records and the data operation record chain index into a block structure, and putting the block on chain, wherein the data operation record chain links all operation records of the same data in a chain structure, and the head node address of the data operation record chain is stored in the corresponding node of the abstract dictionary tree; in response to an original data query request, retrieving the original data by searching the abstract dictionary tree; and, in response to a data operation record query request, retrieving the historical operation records of the data abstract by searching the abstract dictionary tree and the operation record chain.

Description

Index-based on-chain data query method and device
Technical Field
The invention relates to the technical fields of data retrieval, blockchain, and computing, and in particular to an index-based on-chain data query method and device.
Background
Blockchain technology is an emerging information technology that combines distributed data storage, cryptography, and other techniques, and is characterized by decentralization, openness, and tamper resistance. With the growing number of decentralized applications, blockchains have received a great deal of attention, and the demand for querying data stored in blockchains is increasing accordingly. The distributed storage and tamper resistance of blockchain technology make it well suited to schemes for secure data sharing and reliable data tracing. In a conventional blockchain, a data query is executed in two steps: first, all existing blocks in the blockchain are traversed sequentially; then all data records in each block are scanned, and each record is checked against the query condition. This query mode is inefficient and can hardly support blockchain application scenarios with frequent data queries, such as data sharing and data tracing.
At present, most research on blockchain applications, in China and abroad, focuses on using the existing properties of the blockchain to solve the problems of fair sharing and reliable storage in specific data sharing or tracing scenarios; few publications address the low efficiency of data query or data retrieval in blockchain sharing platforms.
Therefore, how to overcome the low query efficiency of conventional blockchains and achieve fast and efficient query and tracing of on-chain data is a problem to be solved.
Disclosure of Invention
In view of the problems in the prior art, the invention provides an index-based on-chain data query method and device that enable fast and efficient query and tracing of on-chain data (such as original data and its historical operation records).
In one aspect of the present invention, there is provided an index-based on-chain data query method, the method comprising the following steps:
An original data on-chain step, for extracting a data abstract from each piece of original data to be put on chain, constructing or updating the abstract dictionary tree index for the extracted data abstracts, packaging the data abstracts to be put on chain and the constructed abstract dictionary tree index into a block structure, and putting the block on chain, wherein the abstract dictionary tree is constructed using the hash values of the on-chain data abstracts distributed across different blocks as keys;
A data operation record generating and on-chain step, for constructing a data operation record for each original data operation, constructing or updating a data operation record chain index for the constructed data operation records, packaging the data operation records and the data operation record chain index into a block structure, and putting the block on chain, wherein the data operation record chain links all operation records of the same data in a chain structure, and the head node address of the data operation record chain is stored in the corresponding node of the abstract dictionary tree;
An original data query step, for extracting the hash value of the data abstract to be queried from the user's original data query request, obtaining the latest abstract dictionary tree root node from the latest block of the blockchain network, obtaining the storage address of the data abstract by searching the abstract dictionary tree, obtaining the data abstract from that storage address, and obtaining the original data by parsing the access address of the original data contained in the data abstract and accessing that address; and
A data operation record query step, for extracting the hash value of the associated data abstract from the user's data operation record query request, obtaining the latest abstract dictionary tree root node from the latest block of the blockchain network, obtaining the head node address of the operation record chain by searching the abstract dictionary tree with the extracted hash value, obtaining the storage addresses of all operation records of the data abstract by traversing the operation record chain from the obtained head node address, and obtaining the historical operation records of the data abstract from the blockchain network according to the obtained storage addresses.
In some embodiments of the present invention, the nodes of the abstract dictionary tree include index nodes and data nodes. An index node stores an index key and a pointer array, each pointer in the array pointing to the next node on a different path. A data node stores an index key, a pointer to the data node that corresponds to the previous version of the current data abstract in the abstract dictionary tree, the storage address of the data abstract corresponding to the index key of the current node, and the head node address of the operation record chain of the data abstract corresponding to the index key of the current node. The data operation record chain is a singly linked list whose elements are operation record arrays; it comprises index nodes and data nodes. An index node of the data operation record chain stores an index pointer and a data pointer array; a data node of the data operation record chain comprises at least some of the following fields: data abstract hash value, data operator, timestamp, and operation type.
In some embodiments of the present invention, the value of the index key in an index node is a common prefix of data abstract hash values, and the value of the index key in a data node is the complete hash value of a data abstract.
In some embodiments of the invention, the original data on-chain step includes: collecting the original data to be put on chain, and constructing a data abstract for each piece of original data; constructing an abstract dictionary tree index for each data abstract to be put on chain by constructing or updating the abstract dictionary tree; packaging the data abstracts to be put on chain and the abstract dictionary tree index data into a block structure, and serializing the block structure into a newly created block file; and distributing the newly created block file to, and storing it on, each node in the network.
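The packaging and serialization described above can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not specified by the source: SHA-256 as the hash function, JSON as the serialization format, the field names `contentHash`, `rawDataAddr`, `prevHash`, `abstracts` and `indexRoot`, and the use of a flat dictionary as a stand-in for the abstract dictionary tree index.

```python
import hashlib
import json

def make_abstract(raw: bytes, access_addr: str) -> dict:
    """Build a data abstract: a content hash plus the address of the raw data."""
    return {"contentHash": hashlib.sha256(raw).hexdigest(), "rawDataAddr": access_addr}

def abstract_hash(abstract: dict) -> str:
    """Hash of the abstract itself; this value would serve as the index key."""
    canonical = json.dumps(abstract, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def pack_block(prev_block_hash: str, abstracts: list, index_root: dict) -> bytes:
    """Package abstracts and index data into one block and serialize it."""
    block = {"prevHash": prev_block_hash, "abstracts": abstracts, "indexRoot": index_root}
    return json.dumps(block, sort_keys=True).encode("utf-8")

# Hypothetical input: one piece of raw data stored off chain.
abstracts = [make_abstract(b"contents of F", "store://file-F")]
# Flat dict used here only as a placeholder for the abstract dictionary tree.
index_root = {
    abstract_hash(a): {"abstractAddr": f"blk1/abs{i}"} for i, a in enumerate(abstracts)
}
block_file = pack_block("00" * 32, abstracts, index_root)
```

Sorting the JSON keys makes the serialized block deterministic, so every node that packages the same content would obtain the same bytes; distribution of the block file across the network is outside this sketch.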
In some embodiments of the present invention, constructing the abstract dictionary tree index for each data abstract to be put on chain includes: calculating the hash value of each data abstract and creating a data node for it, the index key of the data node being the hash value of the current data abstract; obtaining the current abstract dictionary tree, or constructing the abstract dictionary tree if none exists yet; determining whether each data abstract is newly added; if it is not newly added, obtaining the hash value of the previous version of the data abstract, searching the current abstract dictionary tree for the data node corresponding to that previous version, and assigning the address of that data node to the pointer of the current data node; if it is newly added, setting the pointer of the current data node to null; and inserting each newly created data node into the current abstract dictionary tree.
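As a minimal in-memory sketch of the insertion procedure just described, the following Python code creates a data node per abstract and links it to the node of its previous version. It makes several assumptions that the source does not state: SHA-256 as the hash function, one hexadecimal character per tree level, a dictionary in place of the pointer array, and in-place mutation of a single tree rather than a tree persisted in blocks. The helper names `digest_hash`, `lookup` and `insert` are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class DataNode:
    key: str                                  # complete hash of the data abstract
    preVersion: Optional["DataNode"] = None   # data node of the previous version
    abstractAddr: Optional[str] = None        # storage address of the abstract
    opRecordAddr: Optional[str] = None        # head of the operation record chain

@dataclass
class IndexNode:
    key: str = ""                             # common prefix of the hashes below
    next: Dict[str, object] = field(default_factory=dict)

def digest_hash(abstract: bytes) -> str:
    return hashlib.sha256(abstract).hexdigest()  # assumption: SHA-256

def lookup(root: IndexNode, key: str) -> Optional[DataNode]:
    """Follow one character per level; cost depends only on the key length."""
    node = root
    for ch in key:
        if not isinstance(node, IndexNode):
            return None
        node = node.next.get(ch)
    return node if isinstance(node, DataNode) else None

def insert(root: IndexNode, abstract: bytes, addr: str,
           prev_abstract: Optional[bytes] = None) -> DataNode:
    key = digest_hash(abstract)
    leaf = DataNode(key=key, abstractAddr=addr)
    if prev_abstract is not None:             # not newly added: link previous version
        leaf.preVersion = lookup(root, digest_hash(prev_abstract))
    node = root
    for i, ch in enumerate(key[:-1]):         # create index nodes along the path
        node = node.next.setdefault(ch, IndexNode(key=key[: i + 1]))
    node.next[key[-1]] = leaf
    return leaf

root = IndexNode()
v1 = insert(root, b"abstract v1", "blk7/abs0")
v2 = insert(root, b"abstract v2", "blk9/abs3", prev_abstract=b"abstract v1")
```

Because all hashes have the same length, a data node always sits at the same depth and never collides with an index node on the path.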
In some embodiments of the invention, the data operation record generating and on-chain step includes: monitoring operations on the on-chain original data, and constructing a data operation record for each original data operation; constructing a data operation record chain index for each data operation record to be put on chain by constructing or updating the operation record chain; packaging the operation records and index data to be put on chain into a block structure, and serializing the block structure into a newly created block file; and distributing the newly created block file to, and storing it on, each node in the network.
In some embodiments of the present invention, the original data query step includes: after receiving an original data query request from a user, extracting the hash value of the data abstract to be queried; obtaining the latest block file in the current network and extracting the abstract dictionary tree root node from it; obtaining the storage address of the data abstract to be queried by searching the abstract dictionary tree with the extracted hash value; obtaining the data abstract from the block according to that storage address; and extracting the original data access address from the data abstract and obtaining the original data by accessing that address;
the data operation record query step includes: extracting the hash value of the data abstract to be queried from the data operation record query request; obtaining the latest block file in the current network and extracting the abstract dictionary tree root node from it; obtaining the head address of the operation record chain of the data to be queried by searching the abstract dictionary tree with the extracted hash value; obtaining the storage addresses of all operation records of the data abstract by traversing the operation record chain; and obtaining the operation records from the blocks according to those storage addresses.
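The two query flows can be sketched in Python as follows. This is only an illustration under stated assumptions: a flat dictionary `digest_index` stands in for the abstract dictionary tree lookup, `storage` stands in for block storage addressed by strings, `raw_store` stands in for the off-chain location of the original data, and all addresses and the field name `rawDataAddr` are invented for the example. The field names `abstractAddr`, `opRecordAddr`, `preNode` and `records` follow the node definitions given in the description.

```python
# Hypothetical stand-in for the abstract dictionary tree: hash -> data node fields.
digest_index = {"h1": {"abstractAddr": "blk2/abs0", "opRecordAddr": "blk5/idx0"}}

# Hypothetical stand-in for on-chain block storage.
storage = {
    "blk2/abs0": {"rawDataAddr": "store://file-F"},
    "blk3/idx0": {"preNode": None, "records": ["blk3/op0"]},
    "blk5/idx0": {"preNode": "blk3/idx0", "records": ["blk5/op0", "blk5/op1"]},
    "blk3/op0": {"opType": "new"},
    "blk5/op0": {"opType": "changed"},
    "blk5/op1": {"opType": "inquired"},
}
raw_store = {"store://file-F": b"contents of F"}

def query_raw_data(abstract_hash):
    """Index lookup -> abstract address -> abstract -> raw data access address."""
    leaf = digest_index.get(abstract_hash)
    if leaf is None:
        return None
    abstract = storage[leaf["abstractAddr"]]
    return raw_store[abstract["rawDataAddr"]]

def query_operation_records(abstract_hash):
    """Index lookup -> chain head -> walk preNode pointers, collecting records."""
    leaf = digest_index.get(abstract_hash)
    if leaf is None:
        return []
    records, addr = [], leaf["opRecordAddr"]
    while addr is not None:
        node = storage[addr]
        records.extend(storage[a] for a in node["records"])
        addr = node["preNode"]
    return records

history = [r["opType"] for r in query_operation_records("h1")]
```

Since the chain head always refers to the most recent block containing operations on the data, the walk returns records from the newest block first; neither query scans blocks that are unrelated to the requested data.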
In some embodiments of the present invention, constructing a data operation record chain index for each data operation record to be put on chain by constructing or updating the operation record chain includes: classifying the constructed data operation records by the original data they operate on, so that the operation records of the same original data fall into one class; creating an index node for each operation class, and assigning the addresses of all operation records in the class to the data pointer array field of the index node; calculating the hash value of the data abstract corresponding to the current operation class, obtaining the current head address of the operation record chain of that data abstract by searching the abstract dictionary tree, and assigning that head address to the index pointer field of the current index node; and writing the address of the newly created index node into the corresponding node of the abstract dictionary tree as the new head node of the data operation record chain, thereby completing the index for the newly added data operation records.
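A minimal sketch of these four steps, assuming in-memory dictionaries for both the abstract dictionary tree leaves (`leaves`) and block storage (`storage`), and an invented address scheme of the form `blockId/opN` and `blockId/idxN`. The function name `index_block_records` is hypothetical.

```python
from collections import defaultdict

def index_block_records(leaves, storage, block_id, new_records):
    """Group a block's new operation records per data abstract and prepend
    one index node per group to that abstract's operation record chain."""
    groups = defaultdict(list)
    for i, rec in enumerate(new_records):
        addr = f"{block_id}/op{i}"              # hypothetical storage address
        storage[addr] = rec
        groups[rec["abstract"]].append(addr)    # step 1: classify by operated data
    for n, (abstract_hash, addrs) in enumerate(groups.items()):
        leaf = leaves[abstract_hash]            # step 3: leaf found via the tree
        node_addr = f"{block_id}/idx{n}"
        storage[node_addr] = {
            "preNode": leaf["opRecordAddr"],    # index pointer: old chain head
            "records": addrs,                   # step 2: data pointer array
        }
        leaf["opRecordAddr"] = node_addr        # step 4: new node becomes the head

leaves = {"hF": {"opRecordAddr": None}}
storage = {}
index_block_records(leaves, storage, "blk3", [{"abstract": "hF", "opType": "new"}])
index_block_records(
    leaves, storage, "blk5",
    [{"abstract": "hF", "opType": "changed"}, {"abstract": "hF", "opType": "inquired"}],
)
```

Prepending keeps each update local: only the new index node and one leaf of the abstract dictionary tree are written, while earlier index nodes stay untouched.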
In another aspect of the invention, there is provided an index-based on-chain data query device comprising a processor and a memory, the memory having computer instructions stored therein and the processor being configured to execute the computer instructions stored in the memory, wherein the device implements the steps of the method described above when the computer instructions are executed by the processor.
In yet another aspect of the invention, there is also provided a computer storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method as described above.
The index-based on-chain data query method and device of the invention address the low data query efficiency of conventional blockchains in blockchain-based data sharing and data tracing scenarios. Data abstracts and operation records are constructed for the original data and the operations on it and are stored on chain, and indexes are constructed for the original data abstracts and the data operation records in the blockchain by designing the abstract dictionary tree and the data operation record chain, thereby achieving fast and efficient query and tracing of the original data and its historical operation records.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
It will be appreciated by those skilled in the art that the objects and advantages that can be achieved with the present invention are not limited to the above-described specific ones, and that the above and other objects that can be achieved with the present invention will be more clearly understood from the following detailed description.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate and together with the description serve to explain the application. In the drawings:
FIG. 1 is a schematic diagram of the logical structure of a data operation record chain according to an embodiment of the present invention.
FIG. 2 shows the organization of on-chain data in a prior-art blockchain.
FIG. 3 is a schematic diagram of the relationship and organization of the data abstracts and the abstract dictionary tree in an embodiment of the invention.
FIG. 4 is a schematic diagram of the relationship and organization of the operation records and the operation record chain of original data F in the blockchain according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the storage structure of blocks in a blockchain.
FIG. 6 is a schematic block diagram of an index-based on-chain data access (query) device in an embodiment of the invention.
FIG. 7 is a schematic block diagram of an on-chain data query service module in an embodiment of the invention.
FIG. 8 is a schematic block diagram of a data on-chain service module in an embodiment of the present invention.
FIG. 9 is a schematic block diagram of an index retrieval module in an embodiment of the invention.
FIG. 10 is a schematic block diagram of an index building module in an embodiment of the present invention.
FIG. 11 is a schematic block diagram of a block query module in an embodiment of the invention.
FIG. 12 is a schematic block diagram of a block storage module in an embodiment of the invention.
FIG. 13 is an example of the logical structure of a dictionary tree.
FIG. 14 is an example of the logical structure of an abstract dictionary tree in an embodiment of the present invention.
FIG. 15 is a flowchart of an index-based on-chain data access method according to an embodiment of the present invention.
FIG. 16 is a schematic diagram of the original data on-chain flow according to an embodiment of the invention.
FIG. 17 is a schematic diagram of the original data query flow according to an embodiment of the present invention.
FIG. 18 is a schematic diagram of the data operation record construction and on-chain flow according to an embodiment of the present invention.
FIG. 19 is a flowchart of a method for querying data operation records according to an embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments and the accompanying drawings, in order to make the objects, technical solutions and advantages of the present invention more apparent. The exemplary embodiments of the present invention and the descriptions thereof are used herein to explain the present invention, but are not intended to limit the invention.
It should be noted here that, in order to avoid obscuring the present invention due to unnecessary details, only structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, while other details not greatly related to the present invention are omitted.
It should be emphasized that the term "comprises/comprising" when used herein is taken to specify the presence of stated features, elements, steps or components, but does not preclude the presence or addition of one or more other features, elements, steps or components.
Conventional blockchains have low query efficiency and can hardly meet the frequent data query requirements of data sharing or tracing scenarios. To address this, the embodiments of the invention construct data abstracts and operation records for the original data and the operations on it and put them on chain, and on this basis provide a design method for on-chain data indexes: indexes are constructed for the original data abstracts and the data operation records in the blockchain by designing an abstract dictionary tree and a data operation record chain, which speeds up retrieval of the on-chain data abstracts and operation records and thereby enables fast and efficient query and tracing of the original data and its historical operation records. Based on this design method, the invention further provides an index-based on-chain data access method and device, which detail the index-based on-chain data access flow, namely the on-chain and query flows for data abstracts and operation records.
The function and design method of the on-chain data index according to the embodiment of the present invention are described below. The embodiment of the invention designs two different index structures, an abstract dictionary tree and a data operation record chain, for two different query requirements: queries for data abstracts and queries for data operation records. The abstract dictionary tree provides a centralized index over all data abstracts in the blockchain, which speeds up queries for data abstracts; the data operation record chain links the operation records of the same data that are stored in different blocks, which speeds up queries for the operation records of a data abstract. The design method of the on-chain data index according to the embodiment of the present invention is described in detail below.
The design method of the on-chain data index in the embodiment of the invention comprises the following steps S1 and S2:
Step S1: constructing an abstract dictionary tree.
More specifically, in this step, the hash values of the on-chain data abstracts distributed across different blocks are used as keys to construct the abstract dictionary tree.
A dictionary tree (trie) is a special multi-way tree whose keys are typically character strings. The position of a node in the dictionary tree is determined by the content of its key, and the value associated with each key is typically stored in the node corresponding to the last character of the key. FIG. 13 shows an example of a dictionary tree built from the keys (am, bad, be, so).
The dictionary tree has the following advantages: 1) the cost of a lookup depends only on the length of the key being looked up; 2) for a given set of keys, the constructed dictionary tree is identical regardless of insertion order (construction consistency); 3) the dictionary tree supports local updates well: when a key changes, only the states of the nodes on the path from the root node to the node holding that key change, and the states of all other nodes in the tree remain unchanged.
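The first two properties can be checked with a small Python sketch that builds a plain dictionary tree from the example keys (am, bad, be, so). The nested-dict representation and the helper names `trie_insert`, `trie_get` and `build` are choices made for this illustration only; the stored values (upper-cased keys) are placeholders.

```python
from itertools import permutations

def trie_insert(root, key, value):
    """Walk one character per level, creating child nodes as needed."""
    node = root
    for ch in key:
        node = node.setdefault("children", {}).setdefault(ch, {})
    node["value"] = value  # the value lives at the node of the key's last character

def trie_get(root, key):
    """Lookup visits at most len(key) nodes, however many keys are stored."""
    node = root
    for ch in key:
        node = node.get("children", {}).get(ch)
        if node is None:
            return None
    return node.get("value")

def build(keys):
    root = {}
    for k in keys:
        trie_insert(root, k, k.upper())
    return root

keys = ["am", "bad", "be", "so"]
# Build the tree once for each of the 24 possible insertion orders.
tries = [build(order) for order in permutations(keys)]
same_structure = all(t == tries[0] for t in tries)
```

Here `same_structure` evaluates to `True`, which is the construction consistency property: the final tree depends only on the key set, not on the order of insertion.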
In the embodiment of the invention, the dictionary tree is constructed using the hash values of the on-chain data abstracts distributed across different blocks as keys. The construction consistency of the dictionary tree ensures that all network nodes hold the same data state, and the fact that lookup cost depends only on key length enables efficient retrieval of the on-chain data abstracts.
(1) Function of the abstract dictionary tree
In a conventional blockchain network, a blockchain is formed by linking a set of blocks through hash pointers, and the data on the blockchain network is stored in the blocks. Thus, when querying an on-chain data record, each block and each data record in each block must be traversed sequentially until the record being sought is found. Querying by sequential traversal is inefficient and cannot meet the on-chain data query requirements of many blockchain application scenarios.
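For comparison with the indexed approach, the sequential traversal described above can be sketched as follows. The block layout (a list of dictionaries with `height` and `records`) and the record field `id` are assumptions made for the example.

```python
def scan_chain(blocks, wanted_id):
    """Return (record, records_inspected) by scanning every block in order."""
    inspected = 0
    for block in blocks:                 # step 1: traverse all blocks in sequence
        for record in block["records"]:  # step 2: scan every record in the block
            inspected += 1
            if record["id"] == wanted_id:
                return record, inspected
    return None, inspected

# Hypothetical chain of three blocks holding five records in total.
chain = [
    {"height": 0, "records": [{"id": "a"}, {"id": "b"}]},
    {"height": 1, "records": [{"id": "c"}]},
    {"height": 2, "records": [{"id": "d"}, {"id": "e"}]},
]
found, cost = scan_chain(chain, "e")   # inspects all 5 records
missing, miss_cost = scan_chain(chain, "z")
```

The number of records inspected grows with the total amount of on-chain data, and a query for a record that does not exist always inspects every record.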
To address this problem, the embodiment of the invention designs the abstract dictionary tree: the hash value of each on-chain data abstract is calculated, and a dictionary tree is constructed using the hash values of all data abstracts as keys. This provides a centralized index over all data abstracts in the blockchain and thereby speeds up queries for data abstracts.
The nodes of the abstract dictionary tree designed by the embodiment of the invention are mainly divided into two types, namely index nodes (non-leaf nodes) and data nodes (leaf nodes). The storage structure of the nodes of the abstract dictionary tree will be described below.
(1) Index node data structure definition:
TABLE 1 Data structure of abstract dictionary tree index nodes
As shown in Table 1, an index node in the abstract dictionary tree contains two fields, "key" and "next". The "key" field is the index key; its value is a common prefix of data abstract hash values and is used to assist the query. The "next" field is a pointer array, each pointer in the array pointing to the next node on a different path.
(2) Data node data structure definition:
TABLE 2 Data structure of abstract dictionary tree data nodes
Field | Description
key | Index key; its value is the complete hash value of the data abstract
preVersion | Pointer field; its value is the address of the data node corresponding to the previous version of the data abstract
abstractAddr | Value field; its value is the storage address of the data abstract corresponding to the data abstract hash
opRecordAddr | Value field; its value is the storage address of the head node of the data operation record chain corresponding to the data abstract
As shown in Table 2, a data node in the abstract dictionary tree contains four fields: "key", "preVersion", "abstractAddr" and "opRecordAddr". "key" is the index key, and its value is the complete hash value of the data abstract. "preVersion" is a pointer whose value is the address of the data node that corresponds to the previous version of the current data abstract in the abstract dictionary tree; it is used to track updates of the data abstract. The "abstractAddr" field records the storage address of the data abstract corresponding to the key and is used to obtain the data abstract quickly. The "opRecordAddr" field records the address of the head node of the operation record chain of the data abstract corresponding to the key and is mainly used to obtain the historical operation records of the data abstract quickly.
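The two node types can be written down as Python dataclasses, shown here as an illustrative sketch rather than the storage format of the invention. The field names are taken from the description; the Python types are assumptions, and in particular the pointer array of the index node is modelled as a dictionary keyed by the next hash character. The sample values reuse the hashes and addresses of abstract_1 and abstract_2 from the example in Table 3.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union

@dataclass
class DataNode:
    """Leaf node of the abstract dictionary tree (fields as in Table 2)."""
    key: str                                  # complete hash of the data abstract
    preVersion: Optional["DataNode"] = None   # node of the previous version, if any
    abstractAddr: Optional[str] = None        # where the data abstract is stored
    opRecordAddr: Optional[str] = None        # head node of the operation record chain

@dataclass
class IndexNode:
    """Non-leaf node of the abstract dictionary tree (fields as in Table 1)."""
    key: str                                  # common prefix of the hashes below
    # Assumption: the pointer array is modelled as a dict keyed by next character.
    next: Dict[str, Union["IndexNode", DataNode]] = field(default_factory=dict)

leaf1 = DataNode(key="0x123456789a", abstractAddr="abstract_1", opRecordAddr="node_1")
leaf2 = DataNode(key="0x123456789b", preVersion=leaf1,
                 abstractAddr="abstract_2", opRecordAddr="node_2")
branch = IndexNode(key="0x123456789", next={"a": leaf1, "b": leaf2})
```

Following `preVersion` from `leaf2` reaches `leaf1`, which is how the update history of a data abstract can be traced without a separate lookup per version.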
(3) Abstract dictionary tree example:
The correspondence between the data abstracts and the abstract dictionary tree is shown in Table 3:
TABLE 3 Data abstract information
Data abstract address | Hash value | Data operation record chain head node address
abstract_1 | 0x123456789a | node_1
abstract_2 | 0x123456789b | node_2
abstract_3 | 0x123456788a | node_3
abstract_4 | 0x223456789a | node_4
Table 3 lists the four data abstracts in the network: the storage address of each data abstract (abstract_*), the hash value of the data abstract, and the address (node_*) of the head node of the data operation record chain corresponding to the data. Here abstract_2 is the updated version of abstract_1. The storage structure of the abstract dictionary tree corresponding to the data abstracts in Table 3 is shown in FIG. 14.
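To make the role of common prefixes concrete, the following Python sketch computes the prefixes shared by the four hash values of Table 3, which are candidates for index keys of index nodes. It does not reproduce the exact layout of FIG. 14; the variable names and the `pre_version` mapping are introduced only for this illustration.

```python
from os.path import commonprefix

# Table 3: data abstract hash -> (abstract address, operation chain head node)
table3 = {
    "0x123456789a": ("abstract_1", "node_1"),
    "0x123456789b": ("abstract_2", "node_2"),
    "0x123456788a": ("abstract_3", "node_3"),
    "0x223456789a": ("abstract_4", "node_4"),
}
hashes = list(table3)

prefix_1_2 = commonprefix(hashes[:2])    # "0x123456789"
prefix_1_2_3 = commonprefix(hashes[:3])  # "0x12345678"
prefix_all = commonprefix(hashes)        # "0x"

# abstract_2 is the updated version of abstract_1, so the preVersion pointer of
# its data node would reference the data node keyed by abstract_1's hash.
pre_version = {"0x123456789b": "0x123456789a"}
```

The longer the shared prefix, the deeper in the tree two hashes diverge: abstract_1 and abstract_2 differ only in the last character, whereas abstract_4 branches off immediately after "0x".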
Step S2: constructing a data operation record chain.
In many blockchain scenarios, users frequently need to query or trace on-chain data in order to track data operation records. In a conventional blockchain, each lookup of on-chain data is a sequential traversal, which is time-consuming and cannot support frequent queries of data history operation records. To solve this problem, the embodiment of the invention designs a data operation record chain that links all operation records of the same data in different blocks in a chain structure, and stores the address of the head node of the chain in the leaf node of the abstract dictionary tree that corresponds to the abstract of the original data. This establishes the association between a data abstract and its data operation records and speeds up queries for the historical operation records of the original data. The design (construction) of the data operation record chain is described below.
The data operation record chain is a singly linked list. Data in a linked list is represented by nodes, and each node consists of an element and a pointer: the element is the storage unit that holds the data, and the pointer is the address data that connects one node to another. The main advantage of the chain structure is that it is simple to implement while still connecting all the data in series.
In the invention, the element of a data operation record chain node is an operation record array that records all operation records of the same data within one block, which aggregates the operation records of the same data in the same block. The pointer of a data operation record chain node is the address of the previous node, which aggregates the operation records of the same data across different blocks. Through this chain structure, the embodiment of the invention logically achieves centralized storage of all operation records of the same data on the blockchain and enables efficient retrieval of the operation records of on-chain data.
The nodes of the data operation record chain designed by the embodiment of the invention are mainly divided into two types, namely index nodes and data nodes. The storage structure is as follows:
(1) Index node structure definition
TABLE 4 Index node structure of the data operation record chain
As shown in Table 4, the index node in the chain of data operation records contains two fields, preNode and records fields. Wherein, preNode field takes the value as the address of the previous index node of the data, which is used to link the history operation records of the same data among different blocks; the records field is a data pointer array, which takes the value of the physical address (i.e. the node address of the operation record data) stored in all operation records of the data in the current block, so as to aggregate the different operation records of the same data in the current block.
(2) Data node structure definition
TABLE 5 Data node structure of the data operation record chain
Field | Description
abstract | A hash value of the data abstract, which identifies the data abstract corresponding to the data operation record
operator | Data operator, representing the identity of the operator who performed this operation on the data abstract
timestamp | Time stamp, representing the time of this operation on the data abstract
opType | Operation type, indicating the type of this data operation (new, changed, queried)
As shown in Table 5, a data node in the data operation record chain contains four fields: abstract, operator, timestamp and opType. The abstract field records the hash value of the data abstract associated with the operation record; the operator field records the identity of the operator who performed the operation on the data abstract; the timestamp field records the time of the operation; and the opType field records the type of the operation performed on the data abstract.
(3) Data operation record chain example
The correspondence of the data summary operation record and the chain of data operation records is shown in the following table.
TABLE 6 Data sharing information table
Data summary operation record | Block number | Data summary | Operator | Operation type | Date
data_1 | i | 0x123456789a | A | New | 20210401
data_2 | i | 0x123456789a | B | Query | 20210402
data_3 | j | 0x123456789a | C | Query | 20210403
data_4 | j | 0x123456789a | D | Query | 20210404
Table 6 shows all operation records of the data digest 0x123456789a. The data digest has been operated on four times across the whole network: node A uploaded the data digest, and nodes B, C and D then obtained it by query. These four operation records are distributed across block i and block j, so the data operation record chain corresponding to all operation records of the data digest 0x123456789a is as shown in FIG. 1.
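By way of illustration only, the index node and data node structures and the example of Table 6 can be sketched in Python as follows. The field names preNode, records, abstract, operator, timestamp and opType come from the description; the flat `store` dictionary, the addresses `idx_i` and `idx_j`, and the function `history` are hypothetical stand-ins for on-chain storage and are not part of the described embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataNode:
    """One operation record (fields as listed in Table 5)."""
    abstract: str   # hash of the data abstract the record refers to
    operator: str   # identity of the operator
    timestamp: str  # time of the operation
    opType: str     # "new", "changed" or "queried"


@dataclass
class IndexNode:
    """One index node per block for a given data abstract."""
    preNode: Optional[str]                            # address of the previous index node
    records: List[str] = field(default_factory=list)  # addresses of the data nodes in this block


# Hypothetical flat address space standing in for on-chain storage (assumption).
store: Dict[str, object] = {
    "data_1": DataNode("0x123456789a", "A", "20210401", "new"),
    "data_2": DataNode("0x123456789a", "B", "20210402", "queried"),
    "data_3": DataNode("0x123456789a", "C", "20210403", "queried"),
    "data_4": DataNode("0x123456789a", "D", "20210404", "queried"),
    "idx_i": IndexNode(preNode=None, records=["data_1", "data_2"]),     # block i
    "idx_j": IndexNode(preNode="idx_i", records=["data_3", "data_4"]),  # block j
}


def history(head: Optional[str]) -> List[DataNode]:
    """Walk the chain from the head node (newest block) back to the oldest block."""
    out: List[DataNode] = []
    addr = head
    while addr is not None:
        node = store[addr]
        out.extend(store[r] for r in node.records)  # records aggregated within one block
        addr = node.preNode                         # jump to the previous block's index node
    return out
```

In this sketch the traversal visits one index node per block that contains records for the data abstract, rather than scanning every block, which is the point of the chain structure.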
The organization of the uplink data in an existing blockchain is shown in fig. 2. In an existing blockchain system, data is stored with the block as the basic logical unit. A block can be logically divided into a block header and a data body: the block header mainly stores basic information about the block, such as the hash value of the current block, the hash value of the previous block and the Merkle tree root node, while the actual uplink data is stored in the data body. However, existing blockchain systems impose no strict convention or constraint on how the uplink data is organized in the data body. The invention therefore constructs an index structure for the uplinked data abstracts and data operation records so that the original data and the data operation records can be queried quickly and efficiently.
In the embodiment of the invention, the relation and organization structure of the data digests and the abstract dictionary tree are shown in fig. 3, and from the view of the relation of data on the chain, the original data, the data digests and the data nodes in the abstract dictionary tree are in one-to-one correspondence, namely, for each original data, one data digest corresponding to each original data is stored in the blockchain, and one data node corresponding to each data digest exists in the abstract dictionary tree. In terms of the on-chain data logic storage structure, the blockchain does not directly store the uplink original data, but stores a data abstract and an abstract dictionary tree of the original data into a data body of a block, wherein the abstract dictionary tree is an index tree constructed according to hash values of the data abstract in the blockchain, and a root node address of the index tree is stored into a block head of the block so as to support quick acquisition of the abstract dictionary tree when the original data query operation is executed.
FIG. 4 illustrates the relationship and organization of operation records and operation record chains of raw data in a blockchain in an embodiment of the present invention. From the relation of data on the chain, the original data, the data abstract and the operation record chain have a one-to-one correspondence, namely, each original data corresponds to one data abstract and also corresponds to one operation record chain (for example, the original data has one operation record chain belonging to the original data); in addition, the index node of the operation record chain is in a one-to-many relationship with the data operation record, i.e. in the embodiment of the present invention, an index node of an operation record chain is constructed for all operation records of the same original data in a block (as shown in fig. 4, the first node of the operation record chain stores the storage addresses of all operation records of the data in the block). In terms of the on-chain data logic storage structure, the embodiment of the invention stores the data operation record and the operation record chain into the data body of the block, and simultaneously stores the head node address of the data operation record chain into the data node of the corresponding data abstract in the abstract dictionary tree, so that all the history operation records of the corresponding original data can be quickly queried through the data abstract hash.
FIG. 5 illustrates the logical organization and the storage of blocks in a blockchain network, i.e., how the blocks described at the logical level above are stored in the blockchain network. In terms of logical structure, the blocks in the blockchain are arranged in chronological order, and each block is linked to its predecessor by recording the hash address of the previous block. In terms of storage structure, each block in the blockchain is stored in a LevelDB database in the form of a file: each block corresponds to one block file, and each block file realizes the chain structure by storing the hash address of the previous block file.
The following describes the function of the index-based on-chain data efficient access device and the modules within the device.
FIG. 6 is a schematic block diagram illustrating an index-based on-chain data access device in accordance with an embodiment of the present invention. As shown in fig. 6, the index-based on-link data access device according to the embodiment of the present invention mainly includes the following modules: the system comprises a data uplink service module, a data query service module, an index construction module, an index retrieval module, a block storage module, a block query module and an infrastructure. The functions of each module are as follows:
(1) Data uplink service module: the entry point that provides users with the uplink service for original data and data operation records;
(2) Data query service module (or on-chain data query service module): the entry point that provides users with the query service for original data and data operation records;
(3) Index construction module: provides index construction for the two kinds of data to be uplinked, namely original data abstracts and data operation records;
(4) Index retrieval module: mainly provides retrieval over the two index structures, namely the abstract dictionary tree and the operation record chain;
(5) Block storage module: mainly implements block construction and storage; that is, the module receives the data to be uplinked and the data index, packages them into a block structure, serializes the block structure into a block file, and finally distributes the block file through the infrastructure for storage in the LevelDB databases of all nodes of the network;
(6) Block query module: mainly implements block acquisition and parsing; that is, the module first obtains the latest valid block file through the infrastructure, then deserializes the file into a block structure, and finally parses the block structure to obtain the uplink data and index data;
(7) Infrastructure: the underlying network communication module, i.e. an underlying distributed network system physically composed of nodes, which mainly handles the transmission of block files between the nodes in the network.
In some embodiments of the present invention, the data query service module may include an original data query component and a data operation record query component. The structure of the data query service module and the interaction logic between the module and external modules are shown in fig. 7. The original data query component provides the original data query service for the user. First, the component receives the abstract hash value of the original data input by the user and, through the abstract dictionary tree retrieval component in the index retrieval module of the on-chain data access device, obtains the node hit by that hash value in the abstract dictionary tree. It then extracts the storage address of the original data abstract from the hit node and obtains the data abstract from the uplink data in the block through that storage address. Finally, it extracts the access address of the original data from the data abstract and obtains the original data to be queried through that address.
The data operation record query component provides the data operation record query service for the user. First, the component receives a data operation record query request from the user and extracts the associated data abstract hash value from the request. Next, through the abstract dictionary tree retrieval component of the index retrieval module, it obtains the node hit by the data abstract hash value in the abstract dictionary tree and extracts the head node address of the data operation record chain from the hit node. It then obtains the storage addresses of all historical operation records of the data abstract through the operation record chain retrieval component in the index retrieval module. Finally, it obtains the corresponding data operation records from the uplink data in the blocks through these storage addresses, thereby providing the user with the query service for on-chain data operation records.
In some embodiments of the present invention, the data uplink service module may include two components: an original data uplink component and a data operation record uplink component. The structure of the data uplink service module and the interaction logic between the module and external modules are shown in fig. 8. The original data uplink component provides the original data uplink service for the user. First, the original data uplink component receives data uploaded by the user and extracts a data abstract; it then passes the data abstract to be uplinked to the index construction module and the block storage module for the subsequent construction of the index and the block. The data operation record uplink component provides the data operation record uplink service for the user. The data operation record uplink operation is triggered automatically when original data is stored or queried: the component first constructs the data operation record and then passes the data operation record to be uplinked to the index construction module and the block storage module for the subsequent construction of the index and the block.
In some embodiments of the present invention, the index retrieval module may comprise: the abstract dictionary tree retrieval component and the operation record chain retrieval component. The structure of the index searching module and the interaction logic between the index searching module and the external module are shown in fig. 9, and the abstract dictionary tree searching component is mainly used for searching the abstract dictionary tree. When the original data and operation record inquiring operation is executed, the abstract dictionary tree retrieving component obtains the node hit by the data abstract to be checked through retrieving the data abstract tree, and the upper module is supported to obtain the data abstract storage address or the first node address of the data abstract operation record chain from the hit node. The operation record chain searching component is mainly used for realizing the searching flow of the operation record chain. When the data operation record inquiry operation is executed, the operation record chain searching component acquires the storage addresses of all the historical operation records of the data abstract through searching the operation record chain corresponding to the data abstract.
In some embodiments of the present invention, the index construction module may include an abstract dictionary tree construction component and an operation record chain construction component. The structure of the index construction module and the interaction logic between the module and external modules are shown in fig. 10. The abstract dictionary tree construction component is mainly used to construct or update the abstract dictionary tree when original data is uplinked. When original data is being uplinked, the component collects the newly added data abstracts constructed by the data uplink service module, calculates a hash value for each of them, and then either constructs an abstract dictionary tree for them or inserts them into the existing abstract dictionary tree, thereby building an index for the newly added data abstracts. The operation record chain construction component is mainly used to construct or update the operation record chain when data operation records are uplinked. When data operation records are to be uplinked, the component collects the newly added data operation records constructed by the data uplink service module and either constructs an operation record chain for them or inserts them into the existing operation record chain, thereby building an index for the newly added data operation records.
In some embodiments of the present invention, the block query module may include: a block acquisition component and a block analysis component. The structure of the block query module and the interaction logic between the block query module and the external module are shown in fig. 11, and the block acquisition component is mainly used for realizing the functions of acquiring the block file and extracting the block, and the component acquires the latest legal block file in the current network through the infrastructure, and then deserializes the block file into a block structure in a deserializing mode, thereby assisting the user to execute the uplink or query operation. The block parsing component is mainly used for realizing the parsing function of the block structure, and the component parses the block structure output by the block acquisition component, separates the uplink data (original data abstract or operation record) and the index data (data abstract dictionary tree index or operation record chain index) in the block structure, thereby assisting a user to execute the uplink or query operation.
In some embodiments of the present invention, the block storage module may include: the block construction component, the local node block storage component and the other node block storage components, the structure of the block storage module and the interaction logic of the module with the external module are shown in fig. 12. The block construction component is mainly used for realizing the construction and serialization functions of blocks, and the component firstly receives data to be uplinked and data indexes, packages the data and the data indexes into a block structure, and then stores the block structure into a block file in a serialization mode, so that the aggregation of the uplinked data and the index data is realized. The local node block storage component is mainly used for realizing the local storage function of the newly built block files, the component can store the newly built block files of the block construction component into a local LevelDB database, and then the newly built block files are issued to other nodes through an infrastructure network facility so as to ensure that the block files among the nodes are kept consistent. The other node block storage component is mainly used for storing block files issued by other nodes, and the component can receive the block files issued by other nodes through an infrastructure and store the block files into a local LevelDB database after verifying the validity of the block files so as to keep the block files stored by all nodes in the network consistent.
The infrastructure in some embodiments of the invention is the underlying network communication module, i.e. an underlying distributed network system physically composed of nodes, which mainly provides the underlying support for communication between nodes in the blockchain network. The infrastructure module uses the basic network communication architecture of existing blockchain networks and mainly implements the transmission and synchronization of block files between nodes. This module is not described in detail herein.
The index-based data uplink method and the on-chain data query method of the present invention are described below.
According to the design method of the index structures of the abstract dictionary tree and the operation record chain and the on-chain data query device based on indexes, the on-chain data query method based on indexes can be realized through construction and use of the index structures of the abstract dictionary tree and the operation record chain. FIG. 15 is a flowchart of an index-based on-chain data access method according to an embodiment of the present invention. As shown in fig. 15, the method includes the steps of: step S110: a step of original data chaining; step S120: inquiring original data; step S130: generating and linking a data operation record; step S140: and a data operation record inquiring step.
Step S110, the original data is uplink based on the constructed abstract dictionary tree.
In the embodiment of the present invention, original data uplink means that the blockchain system collects the original data to be uplinked that users upload (the specific amount can be configured according to the block file size and the system collection time), extracts a data digest from each piece of original data and constructs or updates the abstract dictionary tree index for the extracted data digests, packages the data digests to be uplinked and the constructed abstract dictionary tree index into a block structure and serializes it into a block file, and finally distributes the newly built block file through the infrastructure for storage in the databases of the nodes of the network. As shown in fig. 16, the original data uplink method includes the following steps S111-S114:
Step S111, collecting the original data to be linked, and constructing a data abstract for each original data.
In the embodiment of the invention, considering that the original data to be uplinked has different formats and limited storage capacity of a single block in a block chain, the data abstract is preferably constructed according to a set field extraction rule of the original data to be uplinked, and the specific implementation steps are as follows:
(1.1) collecting a certain amount of original data to be uplink according to the set block size and the system collection time;
(1.2) extracting corresponding fields and values from each received original data according to the set original data uplink format, and constructing an uplink data digest. The original data uplink format mainly comprises data hash value, data provider address, data uploading time stamp, original data access address and other fields.
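As a minimal sketch of step (1.2), the field extraction could look as follows in Python. The description only names the kinds of fields (data hash value, data provider address, upload timestamp, original data access address); the concrete key names in `UPLINK_FORMAT`, the layout of the uploaded record, and the use of SHA-256 over a JSON encoding are assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict

# Assumed field-extraction rule: digest field name -> key in the uploaded record.
UPLINK_FORMAT = {
    "provider": "provider_address",
    "timestamp": "upload_time",
    "accessAddress": "access_url",
}


def build_digest(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Build the data digest to be uplinked from one uploaded record."""
    # Hash only the payload so the same content always yields the same data hash.
    payload = json.dumps(raw["content"], sort_keys=True).encode("utf-8")
    digest: Dict[str, Any] = {"dataHash": hashlib.sha256(payload).hexdigest()}
    # Copy the remaining fields according to the configured extraction rule.
    for field_name, source_key in UPLINK_FORMAT.items():
        digest[field_name] = raw[source_key]
    return digest
```

Only the small digest goes on chain in this sketch; the original data itself stays at the location named by the access address.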
Step S112, constructing or updating the abstract dictionary tree, and constructing an abstract dictionary tree index for each data abstract to be uplink.
In this step, the summary dictionary tree is constructed or updated, which means that the summary dictionary tree is updated according to the original data to be uplinked when the summary dictionary tree already exists, and a new summary dictionary tree is created according to the original data to be uplinked when the summary dictionary tree does not exist currently.
In the embodiment of the present invention, after the collection of the data to be uplinked is completed, an index is constructed for the data abstracts in this step by constructing or updating the abstract dictionary tree, in order to accelerate access to the data abstracts. The specific implementation steps are as follows:
(2.1) calculating a hash value of each data abstract and creating a data node for it, whose key is the hash value of the data abstract;
(2.2) acquiring a current abstract dictionary tree, judging whether the current abstract dictionary tree is empty, and if so, constructing the abstract dictionary tree;
that is, if there is no abstract dictionary tree yet (the current abstract dictionary tree is empty), the abstract dictionary tree is constructed so as to obtain the abstract dictionary tree; if the abstract dictionary tree exists currently, the current abstract dictionary tree is directly obtained.
(2.3) judging whether each data abstract is a newly added abstract; if not, obtaining the hash value of the previous version of the data abstract, searching the current abstract dictionary tree for the data node corresponding to that previous version, and assigning the address of that data node to the preVersion field of the current data node; if so, leaving the preVersion field empty;
(2.4) inserting each newly built data node into the current abstract dictionary tree, thereby constructing the data abstract dictionary tree index by updating the abstract dictionary tree.
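Steps (2.1) to (2.4) can be sketched in Python as below. This is an illustrative character-level trie, not the embodiment's actual layout: the class `TrieNode`, the field `digestAddr`, and the functions `find` and `insert` are hypothetical names, and an in-memory object reference stands in for the node address stored in preVersion.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    # The fields below are populated only on data nodes.
    key: Optional[str] = None                # hash value of the data abstract (2.1)
    digestAddr: Optional[str] = None         # storage address of the data abstract
    preVersion: Optional["TrieNode"] = None  # data node of the previous version, if any


def find(root: TrieNode, key: str) -> Optional[TrieNode]:
    """Follow the key one character at a time; return the data node or None."""
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return None
    return node if node.key == key else None


def insert(root: Optional[TrieNode], key: str, digest_addr: str,
           prev_key: Optional[str] = None) -> TrieNode:
    """Insert one data abstract and return the (possibly newly built) root."""
    if root is None:                    # (2.2) no tree yet: construct one
        root = TrieNode()
    # (2.3) link to the previous version when this is not a newly added abstract
    prev = find(root, prev_key) if prev_key is not None else None
    node = root
    for ch in key:                      # (2.4) insert the data node into the tree
        node = node.children.setdefault(ch, TrieNode())
    node.key, node.digestAddr, node.preVersion = key, digest_addr, prev
    return root
```

The cost of an insertion or lookup in such a structure depends on the length of the hash key rather than on the number of blocks in the chain.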
Step S113, packaging the data summary and index data to be uplink into a block structure, and then serializing the block structure into a file.
Considering that the original blockchain takes a block file as a storage form, the embodiment of the invention stores uplink data and indexes into a block structure, and then sequences the block structure into a file, and the specific implementation steps are as follows:
(3.1) integrating and packaging the data abstracts to be uplinked that were output in step S111 and the abstract dictionary tree index constructed in step S112 into a block structure;
(3.2) adding the necessary verification information to the packaged block structure, such as the hash value of the previous block, the hash value of the current block and the storage position (intra-block offset) of the root node of the abstract dictionary tree;
(3.3) serializing the block structure containing the uplinked data abstracts, the abstract dictionary tree index and the verification information into a file, wherein the file uses "block" as its suffix.
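A hedged sketch of steps (3.1) to (3.3) follows. The description does not specify the serialization format or the hash function, so JSON and SHA-256 are used here purely as assumptions, and the names `build_block`, `block_hash`, `serialize`, `deserialize`, `prevHash` and `trieRootOffset` are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def block_hash(block: Dict[str, Any]) -> str:
    """Hash everything in the block except the stored hash itself."""
    header = {k: v for k, v in block["header"].items() if k != "hash"}
    payload = json.dumps({"header": header, "body": block["body"]}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_block(prev_hash: str, digests: List[Dict[str, Any]],
                index: Dict[str, Any], root_offset: int) -> Dict[str, Any]:
    # (3.1) package the data abstracts and the index into one block structure
    block: Dict[str, Any] = {
        "header": {"prevHash": prev_hash, "trieRootOffset": root_offset},
        "body": {"digests": digests, "index": index},
    }
    # (3.2) add verification information: the hash of the current block
    block["header"]["hash"] = block_hash(block)
    return block


def serialize(block: Dict[str, Any]) -> bytes:
    """(3.3) Bytes that would be written to the block file."""
    return json.dumps(block, sort_keys=True).encode("utf-8")


def deserialize(data: bytes) -> Dict[str, Any]:
    """Inverse of serialize, as used on the query side."""
    return json.loads(data.decode("utf-8"))
```

A receiving node could recompute `block_hash` on the deserialized structure and compare it with the stored value as one simple validity check before storing the file.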
Step S114, distributing and storing the newly built block file to each node in the network through the infrastructure.
In a blockchain network, data is stored in each network node in the form of a blockfile for backup to achieve consensus of the data across the network. Similarly, in the embodiment of the present invention, the step of distributing and storing the block file also exists in the data uplink, which specifically includes: and distributing the block file generated in the step S113 to other nodes in the network through the basic block chain network, and storing the block file into a LevelDB database after the other nodes receive the block file and verify the validity.
Step S120, the original data query is performed based on the abstract dictionary tree.
In the embodiment of the present invention, original data query refers to the following process. After receiving an original data query request from a user, the system extracts the hash value of the data abstract to be queried, obtains the latest block file from the blockchain network and deserializes it into a block structure. It then searches the abstract dictionary tree with the extracted hash value to obtain the storage address of the data abstract, reads the data abstract directly from that storage address, and obtains the original data by parsing the access address of the original data from the data abstract and accessing it. The original data query flow includes:
Step S121, based on the original data query request, the latest block file in the current network is obtained and the abstract dictionary tree root node is extracted therefrom.
More specifically, in the step, after receiving an original data query request of a user, a hash value of a data abstract to be checked is extracted; and acquiring the latest block file in the current network and extracting the root node of the abstract dictionary tree from the latest block file.
In the invention, the data and the data index on the chain are stored in the form of blocks, and the blocks are stored in the nodes of the block chain network in the form of files, so that when the query operation is executed, the latest block file is acquired from the block chain network and the latest index information is acquired from the latest block file, and the method comprises the following specific implementation steps:
(1.1) judging whether the latest block file exists in the block files stored in the node, and if not, requesting to acquire the latest block file from the node with the latest block file through the infrastructure;
(1.2) deserializing the latest block file into a block structure by a block parsing component, and then obtaining the root node of the abstract dictionary tree from the block structure.
Step S122, retrieving the abstract dictionary tree based on the extracted data abstract hash value to obtain the data abstract storage address to be checked.
In the invention, the abstract dictionary tree is a tree index structure that records the storage positions of the data abstracts dispersed across the blocks, so the storage position of a data abstract can be found quickly by searching the abstract dictionary tree. The specific implementation steps are as follows:
(2.1) starting from the root node of the abstract dictionary tree, judging whether the current node is the data node corresponding to the hash of the data abstract to be queried, and if not, obtaining the next node according to the node index pointer;
(2.2) repeating the above step until the corresponding data node is found; if no such node exists, the data abstract has not been uplinked and the process ends; otherwise, continuing;
(2.3) parsing the data node found in the above step to obtain the storage address of the data abstract to be queried.
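The retrieval steps above can be illustrated with the following Python sketch. It assumes, for illustration only, that a node is a dictionary with a `children` map keyed by the next character of the hash, and that data nodes additionally carry `key` and `digestAddr`; these names and the functions `lookup` and `digest_address` are hypothetical.

```python
from typing import Any, Dict, Optional

Node = Dict[str, Any]


def lookup(root: Node, digest_hash: str) -> Optional[Node]:
    """(2.1)-(2.2): descend from the root until the data node is hit or the path ends."""
    node = root
    for ch in digest_hash:
        node = node["children"].get(ch)  # follow the node index pointer
        if node is None:
            return None                  # the data abstract has not been uplinked
    # An inner node on the path is not a hit; only a data node with the same key is.
    return node if node.get("key") == digest_hash else None


def digest_address(root: Node, digest_hash: str) -> Optional[str]:
    """(2.3): parse the hit node and return the storage address of the data abstract."""
    hit = lookup(root, digest_hash)
    return None if hit is None else hit["digestAddr"]
```

The same hit node could also expose the head address of the operation record chain, which is how the operation record query reuses this lookup.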
Step S123, the data abstract to be checked is obtained from the block according to the data abstract storage address.
According to the data summary storage address output in step S122, the data summary to be checked can be directly obtained from the corresponding block structure and returned.
In step S124, the original data access address in the data digest is extracted, and the original data is obtained by accessing the address.
Analyzing the data abstract obtained in the step S123, extracting the access address of the original data from the data abstract, and then obtaining the original data by accessing the address query.
Step S130, a data operation record generation and a chaining step.
In the embodiment of the invention, data operation records are constructed automatically by the system when operations such as original data uplink or query are executed. The generation and uplink of data operation records involves the system automatically constructing a data operation record for each original data operation, constructing an index for the operation records (constructing or updating the data operation record chain), packaging the operation records and index data into a block structure and serializing it into a block file, and distributing the newly built block file through the infrastructure for storage in the databases of the nodes of the network. The data operation record generation and uplink step specifically comprises the following steps:
Step S131, the operation behavior of the original data on the chain is monitored, and a data operation record is constructed for the original data operation.
In the invention, the data operation record uplink component monitors the operation behavior of the original data through a monitoring mechanism, when the original data uplink or inquiry operation event exists, the data operation record uplink component automatically collects the event and constructs a data operation record for the event according to a preset format, and the specific implementation steps are as follows:
(1.1) collecting a fixed number of raw data operation events in a fixed time period according to a set block size and a set listening time period;
(1.2) constructing a data operation record for each of the raw data operation events collected in the above steps according to the set data operation record uplink format. The data operation record uplink format mainly comprises the fields of an operated original data hash value, a data operator identity, data operation time and the like.
Step S132, a data operation record chain index is constructed for each data operation record to be uplink by constructing or updating the operation record chain.
In the present invention, after completing the collection of data operation events and the construction of data operation records, in order to accelerate the access of original data history operation records, the data operation record chain index needs to be constructed or updated for the data operation records output in step S131 by constructing or updating the operation record chain, and the specific implementation steps are as follows:
(2.1) classifying the data operation records output in the step S131 according to the operated original data, namely classifying all operation records of the same original data into one operation class, and executing the following operation on each original data operation class;
(2.2) creating an index node for each original data operation class, and assigning all operation record addresses (offset in a block) in the operation class to the records field of the index node;
(2.3) calculating the hash value of the data abstract corresponding to the data operation class, obtaining the current head address of the operation record chain of that data abstract by retrieving the abstract dictionary tree, and assigning this head address to the preNode field of the current index node;
(2.4) writing the address of the newly built index node into the corresponding data node of the abstract dictionary tree as the new head node of the data operation record chain, thereby completing the index construction for the newly added data operation records.
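Steps (2.1) to (2.4) can be sketched as follows. This is a simplified illustration, not the patented implementation: the abstract dictionary tree is replaced by a plain dictionary (digest_heads) that maps a data abstract hash to the current chain head address, addresses are modelled as hypothetical (block height, slot) pairs, and each record is assumed to already carry the hash of its data abstract.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical address layout: (block height, slot within the block).
# Index nodes and records are assumed to live in separate areas of a block.
Address = Tuple[int, int]


@dataclass
class ChainIndexNode:
    records: List[Address]        # addresses of the operation records in this block
    pre_node: Optional[Address]   # previous head of the operation record chain


def build_chain_index(records: List[Tuple[str, int]],
                      digest_heads: Dict[str, Address],
                      block_height: int) -> Dict[Address, ChainIndexNode]:
    """records holds (data abstract hash, record offset) pairs for the new block."""
    # (2.1) group the operation records by the data they operate on
    classes: Dict[str, List[Address]] = defaultdict(list)
    for digest_hash, offset in records:
        classes[digest_hash].append((block_height, offset))

    new_nodes: Dict[Address, ChainIndexNode] = {}
    for slot, (digest_hash, addrs) in enumerate(classes.items()):
        # (2.2) one index node per class, holding all record addresses
        # (2.3) link it to the old chain head, if any
        node = ChainIndexNode(records=addrs, pre_node=digest_heads.get(digest_hash))
        node_addr = (block_height, slot)
        new_nodes[node_addr] = node
        # (2.4) the new node becomes the head of the chain
        digest_heads[digest_hash] = node_addr
    return new_nodes
```

Because each new index node points back to the previous head, the chain grows by one node per block in which the data was operated on, and the head always refers to the most recent block.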
Step S133, packaging the operation record and index data to be uplink into a block structure, and then serializing the block structure into the newly built block file.
This step is very similar to the block construction and serialization step (step S113) in the original data uplink step, and will not be described here.
Step S134, the newly built block files are distributed and stored to all nodes in the network through the infrastructure.
This step is very similar to the block distribution storage method (step S114) in the original data uplink step, and will not be described here.
Step S140, a data operation record querying step.
In the embodiment of the present invention, the data operation record query is the process of: extracting the hash value of the data abstract to be checked from the user's data operation record query request; obtaining the latest block file from the blockchain network and deserializing it into a block structure; searching the abstract dictionary tree with the extracted hash value to obtain the head node address of the operation record chain; searching the operation record chain to obtain the storage addresses of all operation records of the data abstract; and finally obtaining all history operation records of the data abstract from the blockchain network according to the obtained storage addresses. Step S140 specifically includes the following steps:
Step S141, extracting the hash value of the data abstract to be checked based on the data operation record query request, acquiring the latest block file in the current network, and extracting the root node of the abstract dictionary tree.
This step is identical to step S121 of acquiring the latest block file and extracting the root node of the abstract dictionary tree in the original data query step, and will not be described here.
Step S142, retrieving the abstract dictionary tree based on the extracted hash value of the data abstract to obtain the head address of the operation record chain of the data to be checked.
In the invention, the data node of the abstract dictionary tree records not only the storage address of the data abstract, but also the first node address of the operation record chain corresponding to the data abstract, so that the storage position of the first node of the operation record chain can be obtained by searching the abstract dictionary tree, and the specific implementation steps are as follows:
(2.1) starting from the root node of the abstract dictionary tree, judging whether the current node is the data node corresponding to the hash value of the data abstract to be checked, and if not, acquiring the next node according to the node index pointer;
(2.2) repeating step (2.1) until the corresponding data node is found; if no such node exists, the data abstract has not been uplinked and the process ends; otherwise, execution continues;
(2.3) parsing the data node retrieved in the above steps to obtain the storage position of the head node of the operation record chain of the data abstract to be checked.
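The retrieval in steps (2.1) to (2.3) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses an uncompressed, character-by-character dictionary tree over a hexadecimal hash string rather than the common-prefix index keys described for the invention, and the names TrieNode, insert, find_data_node and find_chain_head are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Address = Tuple[int, int]  # hypothetical (block height, offset) pair


@dataclass
class TrieNode:
    key: str = ""                                   # prefix, or full hash for a data node
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    is_data: bool = False
    digest_address: Optional[Address] = None        # where the data abstract is stored
    chain_head: Optional[Address] = None            # head of the operation record chain


def insert(root: TrieNode, digest_hash: str, digest_address: Address,
           chain_head: Optional[Address] = None) -> None:
    """Helper used only to build a tree for the example."""
    node = root
    for i, ch in enumerate(digest_hash):
        node = node.children.setdefault(ch, TrieNode(key=digest_hash[: i + 1]))
    node.is_data = True
    node.digest_address = digest_address
    node.chain_head = chain_head


def find_data_node(root: TrieNode, digest_hash: str) -> Optional[TrieNode]:
    """Steps (2.1) and (2.2): walk from the root until the data node is found."""
    node, depth = root, 0
    while not (node.is_data and node.key == digest_hash):
        if depth == len(digest_hash):
            return None                  # hash exhausted: abstract was never uplinked
        nxt = node.children.get(digest_hash[depth])
        if nxt is None:
            return None                  # no pointer for this path
        node, depth = nxt, depth + 1
    return node


def find_chain_head(root: TrieNode, digest_hash: str) -> Optional[Address]:
    """Step (2.3): read the chain head address out of the data node."""
    node = find_data_node(root, digest_hash)
    return None if node is None else node.chain_head
```

In this sketch find_chain_head returns None both when the data abstract is absent and when it exists but has no operation records yet; a caller that needs to tell the two cases apart can call find_data_node directly.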
Step S143, all operation record storage addresses of the data abstract to be checked are obtained by searching the operation record chain.
In the embodiment of the invention, the operation record chain is a chain structure that links, through address pointers, the data operation records scattered across blocks. The storage positions of the operation records of a data abstract can be found quickly by searching the operation record chain. The specific implementation steps are as follows:
(3.1) starting from the first node of the operation record chain, traversing the index nodes according to the address pointer sequence, and executing the following steps for each node;
(3.2) resolving the records field of each index node, extracting the data operation record storage address from the records field and adding the data operation record storage address into the operation record result set;
(3.3) returning the complete data operation record storage address set;
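Steps (3.1) to (3.3) amount to a walk over a singly linked list. The sketch below assumes a hypothetical load_node callback that resolves an index node address to its node; in a real system this would read the node out of the corresponding block, which is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Address = Tuple[int, int]  # hypothetical (block height, offset) pair


@dataclass
class ChainIndexNode:
    records: List[Address]        # storage addresses of operation records
    pre_node: Optional[Address]   # address of the next (older) index node, if any


def collect_record_addresses(head: Optional[Address],
                             load_node: Callable[[Address], ChainIndexNode]) -> List[Address]:
    """Follow pre_node pointers from the head and gather every record address."""
    result: List[Address] = []
    addr = head
    while addr is not None:            # (3.1) traverse index nodes in pointer order
        node = load_node(addr)
        result.extend(node.records)    # (3.2) extract addresses from the records field
        addr = node.pre_node
    return result                      # (3.3) return the complete address set
```

The addresses come back newest block first, because the head of the chain is always the most recently created index node.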
Step S144, the operation records are obtained from the blocks according to the storage addresses.
According to the set of data operation record storage addresses output in step S143, all history operation records of the data abstract to be checked can be obtained directly from the corresponding block structures and returned.
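As a rough illustration of step S144, the sketch below assumes that each storage address is a hypothetical (block height, record offset) pair and that the deserialized blocks are available as a mapping from block height to a list of records. Neither layout is specified by the text.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

Address = Tuple[int, int]  # hypothetical (block height, record offset) pair
Record = TypeVar("Record")


def fetch_records(addresses: Sequence[Address],
                  blocks: Dict[int, List[Record]]) -> List[Record]:
    """Look up each address directly in its block; no block scan is needed."""
    return [blocks[height][offset] for height, offset in addresses]
```

Each lookup touches only the block named in the address, which is why the query cost depends on the number of operation records rather than on the length of the blockchain.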
The above steps of the embodiments of the present invention are not limited to the illustrated order, e.g., step S120 and step S130 may be completely interchanged.
In short, the organization of on-chain data abstracts is optimized by designing the abstract dictionary tree, which builds a centralized index over data abstracts distributed across blocks and enables efficient query of on-chain data abstracts; the organization of on-chain data operation records is optimized by designing the data operation record chain, which indexes the data operation records through a chain structure and enables efficient query of on-chain data operation records; and the invention further establishes the association from a data abstract to its data operation records by storing the head node address of the data operation record chain in the abstract dictionary tree, thereby improving the query efficiency of data operation records.
In summary, the above-mentioned index-based on-chain data access method according to the embodiment of the present invention has the following advantages:
(1) Aiming at the problem that the query requirement of the uplink data abstract cannot be met due to low query efficiency of the traditional block chain data, the embodiment of the invention constructs indexes for all the uplink data abstracts in the block chain by designing the abstract dictionary tree, and stores the index root node into the block head of the latest block so as to realize the efficient query of the data abstracts.
(2) Aiming at the problem that the low query efficiency of traditional blockchain data cannot meet the query requirements of uplinked data operation records, the embodiment of the invention designs the data operation record chain, links all operation records of the same data in a chain structure, and stores the head node address of the operation record chain into the corresponding node of the abstract dictionary tree (that is, the node of the data abstract associated with the operation records), thereby establishing the association from the data abstract to the data operation records and realizing efficient query of the data operation records.
(3) The embodiment of the invention provides an index-based efficient query method and device for on-chain data, provides a data query and search optimization idea and scheme, provides an overall design framework and flow of on-chain data abstract and data operation record query on the basis, and designs a blockchain system which is more fit with the actual application scene of the blockchain.
For queries on the original data, the invention builds an index over all data abstracts in the blockchain by designing the abstract dictionary tree, which accelerates the acquisition of on-chain data abstracts and enables fast and efficient query of the original data. For queries on the history operation records of data, the invention links all operation records of the same data in a chain structure by designing the data operation record chain, stores the head node address of the operation record chain into the corresponding node of the abstract dictionary tree (that is, the node of the data abstract associated with the operation records), and establishes the association from the data abstract to the data operation records, thereby improving the query efficiency of data operation records.
The index-based on-chain data efficient query device of the embodiment of the invention comprises a processor and a memory, wherein the memory is stored with computer instructions, the processor is used for executing the computer instructions stored in the memory, and the device realizes the steps of the index-based on-chain data query method when the computer instructions are executed by the processor.
Embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the index-based on-chain data efficient querying method described above. The computer readable storage medium may be a tangible storage medium such as an optical disk, a USB flash drive, a floppy disk, a hard drive, etc.
Those of ordinary skill in the art will appreciate that the various illustrative components, systems, and methods described in connection with the embodiments disclosed herein can be implemented as hardware, software, or a combination of both. Whether a particular implementation is hardware or software depends on the specific application of the solution and the design constraints. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave.
It should be understood that the invention is not limited to the particular arrangements and instrumentality described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. The method processes of the present invention are not limited to the specific steps described and shown; those skilled in the art may make various changes, modifications and additions, or change the order of the steps, after appreciating the spirit of the present invention.
In this disclosure, features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments and/or in combination with or instead of the features of the other embodiments.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations can be made to the embodiments of the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An index-based on-chain data query method, comprising the steps of:
An original data uplink step, which is used for extracting a data abstract for each piece of original data to be uplinked, constructing or updating an abstract dictionary tree index for the extracted data abstracts, packaging the data abstracts to be uplinked and the constructed abstract dictionary tree index into a block structure and performing uplink processing, wherein the abstract dictionary tree is constructed by taking the hash values of the uplinked data abstracts distributed in different blocks as key values;
A data operation record generating and chaining step, which is used for constructing a data operation record for an original data operation, constructing or updating a data operation record chain index for the constructed data operation record, packaging the data operation record and the data operation record chain index into a block structure and performing chaining processing, wherein the data operation record chain is designed to link all operation records of the same data in a chain structure, and the head node address of the data operation record chain is stored in a corresponding node in the abstract dictionary tree;
An original data query step, which is used for extracting the hash value of the data abstract to be searched from the original data query request of the user, obtaining the latest abstract dictionary tree root node from the latest block of the blockchain network, obtaining the abstract storage address by executing abstract dictionary tree retrieval, obtaining the data abstract according to the obtained storage address, and obtaining the original data by analyzing and accessing the access address of the original data in the data abstract; and
And a data operation record inquiring step, which is used for extracting the associated data abstract hash value from the data operation record inquiring request of the user, acquiring the latest abstract dictionary tree root node from the latest block of the blockchain network, retrieving and acquiring the head node address of the operation record chain by executing the abstract dictionary tree based on the extracted data abstract hash value, retrieving and acquiring the storage addresses of all operation records of the data abstract based on the acquired head node address of the operation record chain, and acquiring the historical operation records of the data abstract from the blockchain network according to the acquired storage addresses.
2. The method of claim 1, wherein:
The nodes of the abstract dictionary tree comprise index nodes and data nodes; the index node is stored with an index key and a pointer array, and each pointer in the pointer array points to the next node of different paths; the data node stores an index key, a pointer pointing to the address of the data node corresponding to the last version of the current data abstract in the abstract dictionary tree, the storage address of the data abstract corresponding to the index key of the current node and the first node address of the data abstract operation record chain corresponding to the index key of the current node;
The data operation record chain is of a single-chain table structure, the elements of the data operation record chain are operation record arrays, the data operation record chain comprises index nodes and data nodes, and index pointers and data pointer arrays are stored in the index nodes of the data operation record chain; the data operation record chain data node comprises at least part of the following field information: data digest hash value, data operator, timestamp, and operation type.
3. The method of claim 2, wherein the value of the index key in the index node is a common prefix of the hash value of the data digest, and wherein the value of the index key in the data node is a complete hash value of the data digest.
4. A method according to claim 2 or 3, wherein the raw data chaining step comprises:
collecting the original data to be uplink, and constructing a data abstract for each original data;
constructing an abstract dictionary tree index for each data abstract to be uplinked by constructing or updating the abstract dictionary tree;
packaging the data abstracts to be uplinked and the abstract dictionary tree index data into a block structure, and serializing the block structure into a newly built block file; and
And distributing and storing the newly built block file to each node in the network.
5. The method of claim 4, wherein constructing the summary dictionary tree index for each data summary to be uplinked by constructing or updating the summary dictionary tree comprises:
Calculating the hash value of each data abstract, and creating a data node for the hash value, wherein the value of an index key of the data node is the hash value of the current data abstract;
acquiring a current abstract dictionary tree, and constructing the abstract dictionary tree under the condition that the current abstract dictionary tree does not exist;
Determining whether each data abstract is a newly added abstract; if not, obtaining the hash value of the previous version of the data abstract, retrieving the current abstract dictionary tree to obtain the data node corresponding to that historical version of the data abstract, and assigning the address of that data node to the pointer of the current data node; if the abstract is newly added, setting the pointer of the current data node to empty; and inserting each newly built data node into the current abstract dictionary tree.
6. The method of claim 1, wherein the data operation record generating and linking step comprises:
monitoring the operation behavior of the original data on the chain, and constructing a data operation record for the original data operation;
constructing a data operation record chain index for each data operation record to be uplink by constructing or updating the operation record chain;
packaging the operation record and index data to be uplink into a block structure, and serializing the block structure into a newly built block file; and
And distributing and storing the newly built block file to each node in the network.
7. The method of claim 2, wherein:
The original data query step comprises the following steps: after receiving an original data query request of a user, extracting a hash value of a data abstract to be checked;
Obtaining the latest block file in the current network and extracting abstract dictionary tree root nodes from the latest block file;
retrieving the abstract dictionary tree based on the extracted hash value of the data abstract to be checked to obtain the storage address of the data abstract to be checked;
Acquiring a data abstract to be checked from the block according to the data abstract storage address; and
Extracting an original data access address in the data abstract, and acquiring the original data by accessing the address;
the data operation record inquiring step comprises the following steps:
extracting a hash value of the data abstract to be checked based on the data operation record inquiry request;
Obtaining the latest block file in the current network and extracting abstract dictionary tree root nodes from the latest block file;
Retrieving the abstract dictionary tree based on the extracted hash value of the data abstract to obtain the head address of the operation record chain of the data to be checked;
acquiring all operation record storage addresses of the data abstract to be checked through a retrieval operation record chain;
and acquiring the operation record from the block according to the storage address.
8. The method of claim 7, wherein constructing a data operation record chain index for each data operation record to be uplinked by constructing or updating the operation record chain comprises:
classifying the constructed data operation records according to the operated original data, so that the operation records of the same original data are classified into one type;
Creating an index node for each original data operation class, and assigning all operation record addresses in the operation class to a data pointer array field of the index node;
calculating a hash value of the data abstract corresponding to the current data operation class, acquiring an operation record chain head address of the data abstract through retrieving an abstract dictionary tree, and assigning the operation record chain head address to an index pointer field of the current index node;
And updating the newly-built index node address serving as a new head node of the data operation record chain into a corresponding data node of the abstract dictionary tree, thereby realizing the construction of the index of the newly-added data operation record chain.
9. An index-based on-chain data querying device comprising a processor and a memory, wherein the memory stores computer instructions and the processor is configured to execute the computer instructions stored in the memory, and the device implements the steps of the method according to any of claims 1 to 8 when the computer instructions are executed by the processor.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
CN202111027751.9A 2021-09-02 2021-09-02 Index-based on-chain data query method and device Active CN113901131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111027751.9A CN113901131B (en) 2021-09-02 2021-09-02 Index-based on-chain data query method and device

Publications (2)

Publication Number Publication Date
CN113901131A CN113901131A (en) 2022-01-07
CN113901131B true CN113901131B (en) 2024-06-07

Family

ID=79188488

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant