CN112416941A

CN112416941A - Block chain-based rapid data retrieval method and system

Info

Publication number: CN112416941A
Application number: CN202011369688.2A
Authority: CN
Inventors: 肖玉连
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-11-30
Filing date: 2020-11-30
Publication date: 2021-02-26

Abstract

The invention relates to the technical field of blockchain and discloses a blockchain-based rapid data retrieval method, which comprises the following steps: acquiring data to be stored, and encoding the data to be stored by using a pipeline-based data encoding scheme to obtain encoded data; storing the encoded data into a blockchain, and storing a copy of the data to be stored into a blockchain node by using a time-based copy storage method; slicing the data stored in the blockchain by using an erasure-code-based data slicing mode to obtain erasure-code-based slice data; constructing a time sequence index for the slice data by using a time-sequence-based index construction method; and quickly retrieving the data by using the time sequence index of the data. The invention also provides a blockchain-based rapid data retrieval system. The invention realizes rapid retrieval of data.

Description

Block chain-based rapid data retrieval method and system

Technical Field

The invention relates to the technical field of block chains, in particular to a block chain-based rapid data retrieval method and a block chain-based rapid data retrieval system.

Background

With the rapid popularization of social networks, intelligent hardware, the mobile internet and the internet of things, the value implied by big data can be realized to a greater extent, and a new era that pays more attention to data value and data openness is arriving. Accordingly, the fields of business, scientific research, public service and the like have all put forward urgent needs for the open sharing of big data. However, due to the lack of a safe and credible data sharing environment, big data is still stored and controlled by various government agencies, business enterprises, scientific research institutions and even individuals, so that data islands are formed, which seriously hinders the sharing and opening of big data.

The blockchain has attracted attention from various industries due to characteristics such as decentralized trust and full distribution, and its emergence makes it possible to break down big data sharing barriers and realize trusted data interconnection. The existing blockchain, used as a decentralized distributed shared database, requires each node to store the complete block data; as the number of nodes in the system and the complexity of transactions increase, nodes need more and more local storage space to store the block data, which forms a bottleneck for the blockchain in practical applications. Meanwhile, current blockchain schemes do not support temporal data processing, and sequential access based on block files in the blockchain prevents efficient query processing.

In view of this, how to optimize the storage manner of data in the block chain to achieve more efficient data retrieval is a problem to be urgently solved by those skilled in the art.

Disclosure of Invention

The invention provides a blockchain-based rapid data retrieval method, in which data is encoded with a pipeline-based data encoding scheme so that the encoded data is stored into a blockchain, and the locally stored block data is sliced so that the block data is reconstructed and the storage of the blockchain is optimized; meanwhile, a temporal index of the block data is established by using a time-based index construction algorithm, so that the amount of access to the block data and the database is reduced and more efficient data retrieval is realized.

In order to achieve the above object, the present invention provides a fast data retrieval method based on a block chain, which includes:

acquiring data to be stored, and encoding the data to be stored by using a pipeline-based data encoding scheme to obtain encoded data;

storing the encoded data into a blockchain, and storing a copy of the data to be stored into a blockchain node by using a time-based copy storage method;

slicing the data stored in the blockchain by using an erasure-code-based data slicing mode to obtain erasure-code-based slice data;

constructing a time sequence index for the slice data by using a time-sequence-based index construction method;

and quickly retrieving the data by using the time sequence index of the data.

Optionally, the encoding processing on the data to be stored by using the pipeline-based data encoding scheme includes:

the pipeline-based data encoding scheme means that the encoding and decoding calculations are run on different blocks in a pipelined manner, namely, for data to be stored $o_1, o_2, \dots, o_k$ and the corresponding storage blocks $h_1, h_2, \dots, h_n$:

During encoding, the corresponding storage blocks $h_i$ encode the data to be stored $o_1, o_2, \dots, o_k$ to obtain the respective encoded data $c_1, c_2, \dots, c_n$, where $n > k$;

During decoding, the storage block $h_1$ holding encoded data $c_1$ performs a decoding operation on $c_1$ to obtain a decoding intermediate block $i_1$ and sends $i_1$ to the storage block $h_2$ holding encoded data $c_2$; storage block $h_2$ performs a decoding operation on $c_2$ and $i_1$ to obtain a decoding intermediate block $i_2$; repeating these steps, storage block $h_n$ finally obtains the final decoded data $i_n$;

In one embodiment of the invention, a classical (8,4) systematic code is used to encode the data $O = (o_1, o_2, o_3, o_4)$ into 8-dimensional encoded data $C = (c_1, \dots, c_8)$, wherein the encoding algorithm uses finite field arithmetic.

After the encoding is finished, storage block $h_i$ stores encoded data $c_i$. In the data decoding process, each storage block $h_i$ sends its encoded data $c_i$ to a decoding node $n_i$, $i = 1, \dots, 8$; the decoding node $n_1$ that holds the first encoded data then performs a linear operation on $c_1$ and sends the result to the decoding node $n_2$ that holds the second encoded data $c_2$; $n_2$ combines the result obtained from $n_1$ with $c_2$ and sends the result to the decoding node $n_3$ that holds the third encoded data $c_3$. In this way the decoding process proceeds in a pipelined fashion, and the original data is finally obtained from decoding node $n_8$.
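The pipelined decoding described above can be illustrated with a minimal sketch. Several things here are assumptions rather than part of the source: the prime field GF(257) stands in for the unspecified finite field, the systematic (8,4) generator is built from identity rows plus Cauchy rows, the chain runs over any four surviving nodes instead of all eight, and every function name is hypothetical.

```python
P = 257  # prime modulus; stands in for the finite field the source leaves unspecified

def inv(a):
    """Multiplicative inverse modulo the prime P (Fermat's little theorem)."""
    return pow(a, P - 2, P)

def generator(k=4, n=8):
    """Systematic (n, k) generator: k identity rows, then n - k Cauchy rows."""
    identity = [[int(r == c) for c in range(k)] for r in range(k)]
    parity = [[inv(k + i - j) for j in range(k)] for i in range(n - k)]
    return identity + parity

def invert(mat):
    """Gauss-Jordan inversion of a square matrix modulo P."""
    size = len(mat)
    a = [row[:] + [int(r == c) for c in range(size)] for r, row in enumerate(mat)]
    for col in range(size):
        piv = next(r for r in range(col, size) if a[r][col])
        a[col], a[piv] = a[piv], a[col]
        scale = inv(a[col][col])
        a[col] = [v * scale % P for v in a[col]]
        for r in range(size):
            if r != col and a[r][col]:
                factor = a[r][col]
                a[r] = [(v - factor * w) % P for v, w in zip(a[r], a[col])]
    return [row[size:] for row in a]

def encode(data, G):
    """Each coded symbol c_i is a linear combination of the data symbols."""
    return [sum(g * o for g, o in zip(row, data)) % P for row in G]

def pipeline_decode(coded, G, nodes):
    """Decode through a chain of k nodes; node s only contributes coded[s]."""
    D = invert([G[s] for s in nodes])      # decoding coefficients
    intermediate = [0] * len(nodes)        # partial result passed along the chain
    for pos, s in enumerate(nodes):
        # Node s adds its own weighted symbol and forwards the intermediate block.
        intermediate = [(acc + D[row][pos] * coded[s]) % P
                        for row, acc in enumerate(intermediate)]
    return intermediate

G = generator()
data = [10, 20, 30, 40]
coded = encode(data, G)
decoded = pipeline_decode(coded, G, nodes=[1, 3, 4, 6])
```

Each node performs only a multiply-and-add on the intermediate block it receives, which is the sense in which the work is spread along the chain; the last node in the chain ends up holding the original data.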

Optionally, storing the copy of the data to be stored into the blockchain node by using the time-based copy storage method includes:

in detail, when a user uploads or recovers data for the first time, the invention stores a copy of the data in the last blockchain node participating in the encoding and decoding process, and treats the data as hot data, i.e., data that the user is likely to read again within a short time;

1) when the copy is stored, a deletion clock t and a threshold T are set and stored in the blockchain node together with the copy; the clock value t is zero at the beginning and increases with time; while the copy is kept, the computation overhead of a read is reduced to zero and the storage overhead is increased by four blocks, so the computation overhead is reduced at the cost of storage overhead;

2) if the user reads the data again while t < T, the copy of the data is obtained directly from that last blockchain node, and the clock value t is reset to zero;

3) if t ≥ T, the copy stored in the node is deleted and the storage space is released; that is, if the user has not read the object for longer than the threshold, the data is regarded as cold data, which the user is not expected to read for a long time, and its copy is deleted to release the storage space.
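Steps 1) to 3) can be sketched as follows. This is only an illustration under stated assumptions: the threshold T is held fixed, time is passed in explicitly instead of being read from a real clock, and the class name `HotCopy` and its methods are hypothetical.

```python
class HotCopy:
    """A data copy kept on the last node together with a deletion clock."""

    def __init__(self, data, threshold, now):
        self.data = data
        self.threshold = threshold   # threshold T
        self.last_reset = now        # clock value t is (now - last_reset)

    def read(self, now):
        """Return the copy if it is still hot, otherwise delete it and return None."""
        t = now - self.last_reset
        if self.data is not None and t < self.threshold:
            self.last_reset = now    # step 2: reset the clock to zero
            return self.data
        self.data = None             # step 3: cold data, release the storage space
        return None                  # the caller must decode from the coded blocks


copy = HotCopy(b"payload", threshold=10, now=0)
first = copy.read(now=5)     # t = 5 < T, served from the copy
second = copy.read(now=14)   # t = 9 < T, clock was reset at time 5
third = copy.read(now=24)    # t = 10 >= T, copy is deleted
```

A `None` result signals that the data has gone cold and has to be recovered through the decoding pipeline, after which a fresh copy could be stored again.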

The factors that determine the size of the threshold T include the time interval t between two accesses of the data by the user and the currently available network resources. To this end, the invention denotes the network's available computing resource value as C and its available storage space resource value as S, and updates the threshold each time the user accesses the data:

wherein:

t is the time interval between two accesses of the same data by the user;

$C_{old}$, $C_{new}$ are the network's available computing resource values at the user's first and second access of the data, respectively;

$S_{old}$, $S_{new}$ are the network's available storage space resource values at the user's first and second access of the data, respectively;

$\alpha, \beta \in [0, 1]$, $\alpha + \beta = 1$, where α and β respectively denote the importance of computing resources and storage resources to the blockchain network; in one embodiment of the present invention, $\alpha = 0.4$ and $\beta = 0.6$.

Optionally, the slicing the data stored in the blockchain by using the erasure code-based data slicing method includes:

1) the i-th block $B^i$ in the blockchain is equally divided into a total of k data slices;

2) performing a matrix multiplication of the data slices with a preset erasure-code-based coding matrix to obtain erasure-code-based slice data:

wherein:

is a matrix value in the coding matrix; in a specific embodiment of the present invention, the adopted coding matrix is a Cauchy-matrix-based coding matrix, and the Cauchy-matrix-based coding matrix is:

$x_i$, $y_i$ are elements of the Galois field, and $m > n$;

A k-th data slice of an ith block of the block chain;

the r-th erasure code based slice data for the ith block of the blockchain, where r>k；

3) because m is larger than n in the coding matrix of order m × n, the number of erasure-code-based slices is larger than the number of original data slices; the invention saves storage by deleting part of the erasure-code-based slices, and each node can choose how many slices to retain according to its local storage capacity; in a specific embodiment of the invention, in order to keep the distribution of data slices across the whole network stable, the nodes delete coded slices randomly; if the coding matrix G of the system is of order m × n, the original block data is divided into n data slices, a given node retains on average k data slices per block after encoding, and the corresponding storage space optimization efficiency is η, then:

4) a node that has encoded and pruned its slices no longer holds the complete transaction information in plain form, and when it needs complete block information it must acquire enough coded data slices from other nodes; network transmission occupies the node's network bandwidth, and a block can be completely reconstructed once the number of its coded data slices is greater than or equal to the number n of columns of the coding matrix; therefore, the minimum amount of data Q that must be transmitted by other nodes for the node to recover a block is:

$Q = (n - k) \cdot p$

wherein:

n is the number of data slices into which the original block data is divided;

k is the average number of data slices retained per block by a given node;

p is the amount of data in each slice.
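The slicing, random deletion and recovery steps above can be sketched end to end. The following is a hedged illustration, not the patented implementation: it assumes the prime field GF(257) instead of a binary Galois field, assumes Cauchy entries of the form $1/(x_i - y_j)$ with $x_i = n + i$ and $y_j = j$, and uses hypothetical helper names throughout.

```python
import random

P = 257  # prime modulus standing in for the Galois field (assumption)

def inv(a):
    """Multiplicative inverse modulo the prime P (Fermat's little theorem)."""
    return pow(a, P - 2, P)

def cauchy(m, n):
    """m x n Cauchy matrix with entries 1/(x_i - y_j), x_i = n + i, y_j = j."""
    return [[inv(n + i - j) for j in range(n)] for i in range(m)]

def invert(mat):
    """Gauss-Jordan inversion of a square matrix modulo P."""
    size = len(mat)
    a = [row[:] + [int(r == c) for c in range(size)] for r, row in enumerate(mat)]
    for col in range(size):
        piv = next(r for r in range(col, size) if a[r][col])
        a[col], a[piv] = a[piv], a[col]
        scale = inv(a[col][col])
        a[col] = [v * scale % P for v in a[col]]
        for r in range(size):
            if r != col and a[r][col]:
                factor = a[r][col]
                a[r] = [(v - factor * w) % P for v, w in zip(a[r], a[col])]
    return [row[size:] for row in a]

def encode(slices, G):
    """Multiply the coding matrix by the data slices, position by position."""
    width = len(slices[0])
    return [[sum(g * s[b] for g, s in zip(row, slices)) % P for b in range(width)]
            for row in G]

def reconstruct(coded, G, n):
    """Recover the n data slices from any n coded slices (dict: row -> slice)."""
    rows = sorted(coded)[:n]
    D = invert([G[r] for r in rows])
    width = len(coded[rows[0]])
    return [[sum(D[j][t] * coded[r][b] for t, r in enumerate(rows)) % P
             for b in range(width)] for j in range(n)]

def transfer_amount(n, k, p):
    """Q = (n - k) * p: minimum data fetched from peers to rebuild a block."""
    return (n - k) * p

# Usage: n = 4 data slices, m = 6 coded slices, the node keeps k = 2 of them.
block = list(b"blockchain!!")                 # 12 bytes -> 4 slices of p = 3 bytes
n, m, k, p = 4, 6, 2, 3
slices = [block[i * p:(i + 1) * p] for i in range(n)]
G = cauchy(m, n)
coded = encode(slices, G)
kept = dict(random.sample(list(enumerate(coded)), k))      # random deletion
missing = [r for r in range(m) if r not in kept][:n - k]   # fetched from peers
kept.update({r: coded[r] for r in missing})
recovered = reconstruct(kept, G, n)
```

Any n rows of a Cauchy matrix form an invertible matrix, which is why the node can rebuild the block no matter which slices it kept. Coded symbols in this sketch can take the value 256 and so do not fit in a byte; a production scheme would more likely work in GF(2^8).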

Optionally, the flow of the index construction method based on the time sequence is as follows:

1) setting an initial time $t_1$ as the start time of the time sequence index construction and $t_2$ as the start time of the next construction; for each slice data $k_i$, within the time range $[t_1, t_2]$, time is divided into non-overlapping adjacent time periods $\theta(k) = \{\theta_1, \theta_2, \dots, \theta_m\}$, where m is the number of divided time periods;

2) using $Get(\langle k_i, \theta_n \rangle)$ to acquire the state $\varepsilon(k_i, \theta_n)$ corresponding to the i-th slice data $k_i$ in time period $\theta_n$, and updating the current time sequence index state, i.e., adding the current data $data_{n+1}$ to $\varepsilon(k_i, \theta_n)$, while the data value corresponding to $k_i$ is updated from $data_n$ to $data_{n+1}$;

3) if $\theta_n$ satisfies the time division condition, $\langle (k_i, \theta_n), \varepsilon(k_i, \theta_n) \rangle$ is committed to a block file, the index is updated into the historical data, and a new time interval $\theta_{n+1}$ is created. In an embodiment of the present invention, the time division condition is a dynamic interval division condition, i.e., the size of the time interval is determined by both elapsed time and slice data amount: a fixed time interval value and a fixed slice data amount value are set, and the determination of the time interval must satisfy either of the following two conditions: first, when the time interval is equal to the fixed value, the number of slice data must be equal to or exceed the prescribed slice data amount; second, when the number of slice data is equal to the fixed value, the time interval must be equal to or greater than the fixed value. This avoids the situation where too much or too little index data is formed within a certain time period θ.
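The index construction flow can be sketched as below. This is a simplified reading rather than the source's implementation: the two division conditions are collapsed into "commit once both the minimum span and the minimum record count are reached", committed intervals are kept in a dictionary instead of block files, and the class and attribute names are hypothetical.

```python
class TemporalIndex:
    """Per-key time sequence index with dynamic interval division."""

    def __init__(self, min_span, min_count):
        self.min_span = min_span      # fixed time interval value
        self.min_count = min_count    # fixed slice data amount value
        self.open = {}                # key -> (start_time, records of the open interval)
        self.committed = {}           # key -> list of ((start, end), records)

    def put(self, key, ts, value):
        """Append a record to the open interval and commit it when it is full."""
        start, records = self.open.setdefault(key, (ts, []))
        records.append((ts, value))   # update the state of (key, current interval)
        long_enough = ts - start >= self.min_span
        full_enough = len(records) >= self.min_count
        if long_enough and full_enough:
            # Commit <(key, interval), state> and start a new interval next time.
            self.committed.setdefault(key, []).append(((start, ts), records))
            del self.open[key]


index = TemporalIndex(min_span=10, min_count=3)
index.put("k1", 0, "a")
index.put("k1", 11, "b")    # span reached, but only two records: stay open
index.put("k1", 12, "c")    # both conditions hold: commit interval (0, 12)
index.put("k1", 13, "d")    # opens the next interval
```

Requiring both conditions before a commit is what keeps an interval from holding too few records when data arrives slowly; bounding very dense intervals would need an additional cap that the sketch leaves out.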

Optionally, the process of performing fast retrieval on the data by using the time sequence index of the data includes:

if all data related to data k within the time interval τ is to be retrieved, the time intervals θ(k) corresponding to data k are first queried through a returned iterator, and the intervals of θ(k) that have a time sequence connection relation or a time sequence inclusion relation with the target query interval τ are computed; this set of intervals is denoted o(θ(k), τ), and the first θ and the last θ in o(θ(k), τ) are in a time sequence connection relation; the time sequence connection relation is defined as follows: for temporal intervals $\theta_i$ and $\theta_j$, if

then $\theta_i$ and $\theta_j$ are said to be in a time sequence connection relation; the time sequence inclusion relation is defined as follows: for temporal intervals $\theta_i$ and $\theta_j$, if

then $\theta_i$ and $\theta_j$ are said to be in a time sequence inclusion relation;

for each θ contained in o(θ(k), τ), a call on ⟨k, θ⟩ is executed, and the block file is parsed through the returned iterator; for an interval in a time sequence inclusion relation, the data parsed by the iterator is added directly to the result set, and for an interval in a time sequence connection relation, the data returned by the iterator is traversed to remove the data not in the interval τ; finally, the result set is output as the data retrieval result.
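The retrieval procedure can be sketched as follows. Because the formal definitions of the two relations did not survive in the text, the sketch assumes that "inclusion" means an indexed interval lies entirely inside τ and that "connection" means it only partially overlaps τ; the function name and the in-memory data layout are likewise hypothetical.

```python
def query(intervals, tau):
    """Return all records of one key whose timestamp lies inside tau.

    intervals: list of ((start, end), records), with records as (ts, value).
    tau: (lo, hi) target query interval, both ends inclusive.
    """
    lo, hi = tau
    result = []
    for (start, end), records in intervals:
        if end < lo or start > hi:
            continue                              # unrelated to tau: file is never read
        if lo <= start and end <= hi:
            result.extend(records)                # inclusion: take everything as is
        else:
            # connection (partial overlap): drop records outside tau
            result.extend(r for r in records if lo <= r[0] <= hi)
    return result


intervals = [
    ((0, 9), [(0, "a"), (5, "b"), (9, "c")]),
    ((10, 19), [(10, "d"), (15, "e")]),
    ((20, 29), [(20, "f"), (27, "g")]),
    ((30, 39), [(31, "h")]),
]
hits = query(intervals, (5, 22))
```

Only the first and last overlapping intervals need record-by-record filtering, which matches the statement that the first θ and the last θ of o(θ(k), τ) are the ones in a connection relation.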

In addition, to achieve the above object, the present invention further provides a block chain-based fast data retrieval system, including:

the data acquisition device is used for acquiring data to be stored and encoding the data to be stored by using a pipeline-based data encoding scheme;

the data processor is used for storing the encoded data into the blockchain and storing the copy of the data to be stored into a blockchain node by using a time-based copy storage method, and for slicing the data stored in the blockchain by using an erasure-code-based data slicing mode to obtain erasure-code-based slice data;

and the data retrieval device is used for constructing the time sequence index for the slice data by using the time-sequence-based index construction method, so that the data is quickly retrieved by using the time sequence index of the data.

In addition, to achieve the above object, the present invention also provides a computer readable storage medium, which stores thereon data retrieval program instructions, which are executable by one or more processors to implement the steps of the implementation method of fast data retrieval based on block chains as described above.

Compared with the prior art, the invention provides a block chain-based rapid data retrieval method, which has the following advantages:

Firstly, the invention provides a pipeline-based data encoding scheme. During encoding, the corresponding storage blocks encode the data to be stored $o_1, o_2, \dots, o_k$ to obtain the respective encoded data $c_1, c_2, \dots, c_n$, where $n > k$. During decoding, the storage block $h_1$ holding encoded data $c_1$ performs a decoding operation on $c_1$ to obtain a decoding intermediate block $i_1$ and sends $i_1$ to the storage block $h_2$ holding encoded data $c_2$; storage block $h_2$ performs a decoding operation on $c_2$ and $i_1$ to obtain a decoding intermediate block $i_2$; repeating these steps, storage block $h_n$ finally obtains the final decoded data $i_n$. In the process of pipeline encoding or decoding, no message is transmitted between different blocks, so the current block cannot obtain and does not know the information of other blocks; anonymity between different blocks in a blockchain node is thereby ensured, and if a certain block in a blockchain node is attacked, the data of the other blocks in the node is not affected, which ensures the security of the data stored in the blockchain.

The invention also provides a time-based copy storage method, in which a time threshold T is determined based on the available computing resource value C and the available storage space resource value S of the current blockchain network, and the time threshold T is updated each time a user accesses data in the blockchain:

wherein: T represents the threshold before the update; T' represents the updated threshold; $\alpha, \beta \in [0, 1]$, $\alpha + \beta = 1$, and α and β respectively represent the importance of computing resources and storage resources to the network. When a user uploads or recovers data for the first time, a copy of the data is stored in the last blockchain node participating in the encoding and decoding process, and the data is treated as hot data that the user is likely to read again within a short time. If the user reads the data again while the clock value t is less than T, the copy of the data is obtained directly from that last blockchain node and the clock value is reset to zero; in this process the computation overhead is reduced to zero and the storage overhead is increased by four blocks, so the blockchain computation overhead is reduced at the expense of storage overhead. If t ≥ T, the copy stored in the node is deleted and the storage space is released; that is, if the user has not read the object for longer than the threshold, the data is regarded as cold data, which the user is not expected to read for a long time, and its copy is deleted to release the storage space, so that the blockchain storage overhead is reduced and more efficient blockchain data storage is realized.

Because current blockchain schemes do not support temporal data processing, sequential access based on block files in the blockchain prevents efficient query processing. Therefore, the invention provides a time-sequence-based index construction method: an initial time $t_1$ is set as the start time of the time sequence index construction and $t_2$ as the start time of the next construction; for each slice data $k_i$, within the time range $[t_1, t_2]$, time is divided into non-overlapping adjacent time periods $\theta(k) = \{\theta_1, \theta_2, \dots, \theta_m\}$, where m is the number of divided time periods; $Get(\langle k_i, \theta_n \rangle)$ is used to acquire the state $\varepsilon(k_i, \theta_n)$ corresponding to the i-th slice data $k_i$ in time period $\theta_n$, and the current time sequence index state is updated, i.e., the current data $data_{n+1}$ is added to $\varepsilon(k_i, \theta_n)$ while the data value corresponding to $k_i$ is updated from $data_n$ to $data_{n+1}$; if $\theta_n$ satisfies the time division condition, $\langle (k_i, \theta_n), \varepsilon(k_i, \theta_n) \rangle$ is committed to a block file, the index is updated into the historical data, and a new time interval $\theta_{n+1}$ is created. In the data retrieval process, if all data related to data k within a time interval τ is to be retrieved, the time intervals θ(k) corresponding to data k are first queried through a returned iterator, and the intervals of θ(k) that have a time sequence connection relation or a time sequence inclusion relation with the target query interval τ are computed; this set of intervals is denoted o(θ(k), τ), and the first θ and the last θ in o(θ(k), τ) are in a time sequence connection relation; for each θ contained in o(θ(k), τ), a call on ⟨k, θ⟩ is executed and the block file is parsed through the returned iterator; for an interval in a time sequence inclusion relation, the data parsed by the iterator is added directly to the result set, and for an interval in a time sequence connection relation, the data returned by the iterator is traversed to remove the data not in the interval τ; finally, the result set is output as the data retrieval result. With a traditional retrieval method, querying the state of data k within the time interval $[t_1, t_2]$ requires accessing all block files in that time interval; with the time sequence index, only the files in o(θ(k), τ) need to be deserialized. That is, by building the time sequence index at reasonable time intervals, the number of file accesses can be greatly reduced and data retrieval efficiency effectively improved.

Drawings

Fig. 1 is a schematic flowchart of a block chain-based fast data retrieval method according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a block chain-based fast data retrieval system according to an embodiment of the present invention;

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Data is encoded with a pipeline-based data encoding scheme so that the encoded data is stored into a blockchain, and the locally stored block data is sliced so that the block data is reconstructed and the storage of the blockchain is optimized; meanwhile, a temporal index of the block data is established by using a time-based index construction algorithm, so that the amount of access to the block data and the database is reduced and more efficient data retrieval is realized. Fig. 1 is a schematic diagram illustrating a block chain-based fast data retrieval method according to an embodiment of the present invention.

In this embodiment, the fast data retrieval method based on the block chain includes:

S1, acquiring the data to be stored, and encoding the data to be stored by using a pipeline-based data encoding scheme to obtain encoded data.

Firstly, the invention acquires the data to be stored and encodes it by using a pipeline-based data encoding scheme, wherein the pipeline-based data encoding scheme means that the encoding and decoding calculations are run on different blocks in a pipelined manner, namely, for the data to be stored $o_1, o_2, \dots, o_k$ and the corresponding storage blocks $h_1, h_2, \dots, h_n$:

After the encoding is completed, storage block $h_i$ stores encoded data $c_i$. In the data decoding process, each storage block $h_i$ sends its encoded data $c_i$ to a decoding node $n_i$, $i = 1, \dots, 8$; the decoding node $n_1$ that holds the first encoded data then performs a linear operation on $c_1$ and sends the result to the decoding node $n_2$ that holds the second encoded data $c_2$; $n_2$ combines the result obtained from $n_1$ with $c_2$ and sends the result to the decoding node $n_3$ that holds the third encoded data $c_3$. In this way the decoding process proceeds in a pipelined fashion, and the original data is finally obtained from decoding node $n_8$.

S2, storing the encoded data into the blockchain, and storing the copy of the data to be stored into a blockchain node by using a time-based copy storage method.

Further, the invention stores the encoded data into the corresponding blockchain blocks; in one embodiment of the invention, the encoded data $c_1, c_2, \dots, c_n$ is stored into the corresponding blocks $h_i$, where $i = 1, \dots, n$;

further, the invention stores the copy of the data to be stored into a blockchain node by using a time-based copy storage method; in detail, when a user uploads or recovers data for the first time, the invention stores a copy of the data in the last blockchain node participating in the encoding and decoding process, and treats the data as hot data that the user is likely to read again within a short time;

the time-based copy storage method comprises the following steps:

wherein:

S3, slicing the data stored in the blockchain by using an erasure-code-based data slicing mode to obtain erasure-code-based slice data.

Further, the invention slices the data stored in the blockchain by using an erasure-code-based data slicing mode, wherein the erasure-code-based data slicing mode is as follows:

1) the i-th block $B^i$ in the blockchain is equally divided into a total of k data slices;

wherein:

$x_i$, $y_i$ are elements of the Galois field, and $m > n$;

A k-th data slice of an ith block of the block chain;

$Q = (n - k) \cdot p$

wherein:

n is the number of data slices into which the original block data is divided;

k is the average number of data slices retained per block by a given node;

p is the amount of data in each slice.

S4, constructing the time sequence index for the slice data by using a time-sequence-based index construction method.

Further, the invention constructs the time sequence index for the slice data by using a time-sequence-based index construction method, wherein the time-sequence-based index construction method comprises the following steps:

3) if $\theta_n$ satisfies the time division condition, $\langle (k_i, \theta_n), \varepsilon(k_i, \theta_n) \rangle$ is committed to a block file, the index is updated into the historical data, and a new time interval $\theta_{n+1}$ is created. In an embodiment of the present invention, the time division condition is a dynamic interval division condition, that is, the size of the time interval is determined by both elapsed time and slice data amount: a fixed time interval value and a fixed slice data amount value are set, and the determination of the time interval must satisfy either of the following two conditions: first, when the time interval is equal to the fixed value, the number of slice data must be equal to or exceed the prescribed slice data amount; second, when the number of slice data is equal to the fixed value, the time interval must be equal to or greater than the fixed value. This avoids the situation where too much or too little index data is formed within a certain time period θ.

S5, quickly retrieving the data by using the time sequence index of the data.

Furthermore, according to the temporal index of the data, the invention realizes rapid retrieval of the data in the blockchain;

then $\theta_i$ and $\theta_j$ are said to be in a time sequence inclusion relation;

The following describes an algorithm experiment that tests the method of the invention. The test environment of the algorithm is as follows: the operating system is Linux CentOS 6.9, and the memory is 16 GB; the comparison retrieval methods are a data retrieval method based on hash index storage, a data retrieval method based on inverted index storage, and an index-free data retrieval method.

In the algorithm experiment, 5 GB of data is collected, the comparison algorithms and the algorithm provided by the invention are used for storage and retrieval, and the time required to complete retrieval is used as the evaluation index of the data retrieval method.

According to the experimental results, the retrieval time of the data retrieval method based on hash index storage is 1.2 s, the retrieval time of the data retrieval method based on inverted index storage is 0.68 s, and the retrieval time of the index-free data retrieval method is 0.72 s.

The invention also provides a block chain-based rapid data retrieval system. Fig. 2 is a schematic diagram illustrating an internal structure of a block chain-based fast data retrieval system according to an embodiment of the present invention.

In the present embodiment, the block chain-based fast data retrieval system 1 at least includes a data acquisition device 11, a data processor 12, a data retrieval device 13, a communication bus 14, and a network interface 15.

The data acquisition device 11 may be a PC (Personal Computer), a terminal device such as a smart phone, a tablet Computer, or a mobile Computer, or may be a server.

The data processor 12 includes at least one type of readable storage medium including flash memory, hard disks, multi-media cards, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disks, optical disks, and the like. The data processor 12 may in some embodiments be an internal storage unit of the blockchain based fast data retrieval system 1, for example a hard disk of the blockchain based fast data retrieval system 1. The data processor 12 may also be an external storage device of the block chain based fast data retrieval system 1 in other embodiments, such as a plug-in hard disk provided on the block chain based fast data retrieval system 1, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the data processor 12 may also comprise both an internal storage unit and an external storage device of the blockchain based fast data retrieval system 1. The data processor 12 can be used not only to store application software installed in the block chain based fast data retrieval system 1 and various kinds of data, but also to temporarily store data that has been output or will be output.

The data retrieving device 13 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor or other data Processing chip in some embodiments, and is used for executing program codes stored in the data processor 12 or Processing data, such as data retrieving program instructions.

The communication bus 14 is used to enable connection communication between these components.

The network interface 15 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface), and is typically used to establish a communication link between the system 1 and other electronic devices.

Optionally, the system 1 may further comprise a user interface, which may comprise a Display (Display), an input unit such as a Keyboard (Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the block chain based fast data retrieval system 1 and for displaying a visualized user interface, among others.

While FIG. 2 only shows the fast data retrieval system 1 with components 11-15 and based on blockchain, those skilled in the art will appreciate that the structure shown in FIG. 1 does not constitute a limitation of the blockchain based fast data retrieval system 1 and may include fewer or more components than shown, or combine certain components, or a different arrangement of components.

In the embodiment of the system 1 shown in FIG. 2, the data processor 12 stores data retrieval program instructions; the steps performed when the data retrieval device 13 executes the data retrieval program instructions stored in the data processor 12 are the same as those of the block chain based fast data retrieval method and are not described again here.

Furthermore, an embodiment of the present invention also provides a computer-readable storage medium having stored thereon data retrieval program instructions, which are executable by one or more processors to implement the following operations:

It should be noted that the numbering of the above embodiments of the present invention is merely for description and does not represent the relative merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, apparatus, article, or method that includes the element.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A block chain-based fast data retrieval method is characterized in that the method comprises the following steps:

constructing a time sequence index by using an index construction method based on time sequence;

2. The method as claimed in claim 1, wherein the encoding process of the data to be stored by using the pipeline-based data encoding scheme includes:

When decoding, the storage block h₁ of the coded data c₁ performs a decoding operation on c₁ to obtain a decoded intermediate block i₁, and sends the decoded intermediate block i₁ to the storage block h₂ of the coded data c₂; the storage block h₂ performs a decoding operation on the coded data c₂ and i₁ to obtain a decoded intermediate block i₂; and so on, until the final storage block h_n obtains the final decoded data i_n.
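The pipelined decoding described above can be sketched as a chain in which each storage block combines its own coded data with the intermediate block received from its predecessor. This is only an illustrative sketch: the claim does not specify the decoding operation, so a byte-wise XOR is assumed here as a stand-in, and the names `decode_step` and `pipeline_decode` are hypothetical.

```python
from typing import Sequence


def decode_step(coded: bytes, prev_intermediate: bytes) -> bytes:
    """Hypothetical per-block decoding operation (assumed: byte-wise XOR).

    Stands in for the operation that storage block h_j applies to its coded
    data c_j and the intermediate block i_(j-1) received from h_(j-1).
    """
    return bytes(a ^ b for a, b in zip(coded, prev_intermediate))


def pipeline_decode(coded_blocks: Sequence[bytes]) -> bytes:
    """Pass an intermediate block along the chain h_1 -> h_2 -> ... -> h_n."""
    if not coded_blocks:
        raise ValueError("at least one coded block is required")
    # h_1 has no predecessor; an all-zero block makes i_1 equal to c_1 under XOR.
    intermediate = bytes(len(coded_blocks[0]))
    for coded in coded_blocks:
        # Each storage block decodes locally and forwards the result.
        intermediate = decode_step(coded, intermediate)
    return intermediate  # i_n, the final decoded data


final = pipeline_decode([b"\x01\x02", b"\x03\x04", b"\x05\x06"])
```

Only one intermediate block travels between neighbouring storage blocks, which is the property the pipeline arrangement relies on; the actual decoding arithmetic would replace `decode_step`.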

3. The method as claimed in claim 2, wherein the storing the copy of the data to be stored into the blockchain node by using the time-based copy storage method comprises:

1) when the copy is stored, a deletion clock t and a threshold value T are set simultaneously and stored in the blockchain node together with the copy; the clock value t is zero at the beginning and increases with time;

The threshold value T is determined by the time interval between two consecutive user accesses to the data and by the currently available network resources; with the value of the network's available computing resources denoted C and the value of its available storage space resources denoted S, the threshold value is updated each time the user accesses the data:

wherein:

T represents the threshold value before the update;

T' represents the updated threshold value;

α, β ∈ [0, 1] and α + β = 1; α = 0.4 and β = 0.6 are set.

4. The method as claimed in claim 3, wherein the slicing process for the data stored in the blockchain by using the erasure code based data slicing method comprises:

1) the i-th block Bⁱ in the block chain is equally divided into

A total of k data slices;

wherein:

is a matrix value in the coding matrix;

a k-th data slice of an ith block of the block chain;

3) Depending on its local storage capacity, each node randomly deletes coded slices; the poorer a node's local storage capacity, the more slice data it deletes;

4) because network transmission occupies a node's network bandwidth resources, and a block can be completely reconstructed once the number of its coded data slices is greater than or equal to the number n of columns of the coding matrix, the minimum data amount Q that other nodes must transmit for a node to recover a block is:

Q = (n - k) * p

wherein:

n is the number of data slices into which the original block data is divided;

k is the average number of data slices that a given node retains for each block;

p is the amount of data in each slice.
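As a worked illustration of the formula Q = (n - k) * p, the short sketch below computes the minimum transfer amount for one block. The function name `min_transfer_amount` and the sample numbers are assumptions chosen for illustration; they do not come from the specification.

```python
def min_transfer_amount(n: int, k: int, p: int) -> int:
    """Return Q = (n - k) * p.

    n: number of data slices the original block is divided into
    k: average number of slices the node still retains for the block
    p: amount of data in each slice (e.g. in bytes)
    """
    if p < 0 or not 0 <= k <= n:
        raise ValueError("expected 0 <= k <= n and p >= 0")
    return (n - k) * p


# Hypothetical figures: 10 slices per block, 6 retained locally, 4096 bytes each.
q = min_transfer_amount(n=10, k=6, p=4096)
```

With these sample figures the node is missing 4 slices, so Q is 4 * 4096 = 16384 bytes; a node that retains all n slices needs no transfer at all.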

5. The method according to claim 4, wherein the time-series-based index building method comprises the following steps:

3) If θ_n satisfies the time division condition, <(k_i, θ_n), ε(k_i, θ_n)> is committed to a block file, the index is updated into the historical data, and a new time interval θ_n+1 is created.
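The commit step above can be sketched as follows. This is a minimal sketch under stated assumptions: the time division condition is taken to be "the current interval has reached a fixed width", ε(k_i, θ_n) is taken to be the list of entries recorded for key k_i during θ_n, and an in-memory list stands in for the block file. The class `TimeSeriesIndex` and its fields are hypothetical names, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TimeSeriesIndex:
    width: float                    # assumed time division condition: fixed interval width
    start: float = 0.0              # start of the current interval theta_n
    pending: dict = field(default_factory=lambda: defaultdict(list))
    committed: list = field(default_factory=list)  # stand-in for the block file

    def append(self, key, timestamp, value):
        # Close every interval that the new timestamp has moved past.
        while timestamp - self.start >= self.width:
            self._commit()
        self.pending[key].append((timestamp, value))

    def _commit(self):
        theta = (self.start, self.start + self.width)
        for key, entries in self.pending.items():
            # Commit <(k_i, theta_n), epsilon(k_i, theta_n)>.
            self.committed.append(((key, theta), list(entries)))
        self.pending.clear()
        self.start += self.width    # create the new interval theta_(n+1)


idx = TimeSeriesIndex(width=10.0)
idx.append("k1", 1.0, "a")
idx.append("k1", 5.0, "b")
idx.append("k1", 12.0, "c")         # crosses the boundary, so theta_0 is committed
```

After the third call, the entries for the first interval sit in `committed` under the key `("k1", (0.0, 10.0))`, while the entry at timestamp 12.0 is still pending in the new interval.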

6. The method as claimed in claim 5, wherein the fast retrieving of data by using the time sequence index of data comprises:

to retrieve all data related to the data k within the time interval τ, first query the time intervals θ(k) corresponding to the data k through the returned iterator, and compute the intervals of θ(k) that have a time-sequence connection relation or a time-sequence containment relation with the target query interval τ; this set of intervals is denoted o(θ(k), τ), and the first θ and the last θ in o(θ(k), τ) are in a time-sequence connection relation;
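The selection of o(θ(k), τ) can be sketched as an interval filter. In this sketch the "connection" relation is assumed to mean that θ partially overlaps τ, and the "containment" relation that θ lies entirely inside τ; intervals are assumed to be half-open pairs (start, end). The function name `related_intervals` is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # assumed half-open interval [start, end)


def related_intervals(thetas: List[Interval], tau: Interval) -> List[Interval]:
    """Return, in time order, the intervals of thetas that overlap or lie inside tau."""
    lo, hi = tau
    # An interval is related to tau if it starts before tau ends
    # and ends after tau starts.
    return [(s, e) for (s, e) in sorted(thetas) if s < hi and e > lo]


thetas_k = [(0, 10), (10, 20), (20, 30), (30, 40)]
o = related_intervals(thetas_k, tau=(15, 32))
```

For this sample query the result is the three intervals (10, 20), (20, 30) and (30, 40): the first and last only partially overlap τ, while the middle one is fully contained in it, which matches the pattern described in the claim.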

7. A block chain based fast data retrieval system, the system comprising:

8. A computer readable storage medium having stored thereon data retrieval program instructions executable by one or more processors to implement the steps of the block chain based fast data retrieval method as claimed in any one of claims 1 to 6.