CN113742386B - Processing and analyzing method for block data of block chain account book

Info

Publication number: CN113742386B
Application number: CN202111293255.8A
Authority: CN
Inventors: 张卫平; 丁烨; 张浩宇; 李显阔
Original assignee: Global Digital Group Co Ltd
Current assignee: Global Numerical Technology Co ltd
Priority date: 2021-11-03
Filing date: 2021-11-03
Publication date: 2022-02-08
Anticipated expiration: 2041-11-03
Also published as: CN113742386A

Abstract

The present invention provides a method for processing and analyzing block data of a blockchain ledger. The method exploits the multiple distributed computing nodes of a blockchain system. A service node performs feature item analysis on the query request submitted by a requesting node to obtain at least one feature item of the request; the service node dispatches the at least one feature item to multiple nodes in the distributed system for block backtracking, obtaining the records in the relevant block ledgers whose feature values match the feature item of the request; the obtained ledger records are then cached in a distributed manner, generating multiple key-value pairs; finally, a relational database is generated by collecting the key-value pairs, and the final result is output to the requesting node. The invention improves the efficiency of converting non-relational data into relational data and outputs the result efficiently to the data requester, so that the statistical data of the ledger is presented more intuitively.

Description

Processing and analyzing method for block data of block chain account book

Technical Field

The present invention relates to the field of blockchain data processing. In particular, the present invention relates to a method for processing and analyzing block data of a block chain account book.

Background

With the advent of blockchain technology, blockchain applications have multiplied. In short, a blockchain is essentially a distributed ledger made up of blocks and a chain. A block is a container data structure in the public ledger that aggregates transaction information; it can be thought of as one page of an account book, recording the transactions that occur within a period of time. The chain can be understood as the binding that holds the pages together.

A blockchain is formed by linking a plurality of blocks in chronological order, and each block stores related information such as transactions and account numbers. Each block is like a paper account book in which the daily transactions of many people are recorded. To find out how many large expenses were incurred in the past year, all the account books of that year must be taken out and read from beginning to end to find the corresponding records; likewise, searching a stack of paper account books by keyword leaves no option but to read each one through. Data on the blockchain is discrete, and a more effective way of organizing it is needed to facilitate further query and analysis and to present the final data visually.

Among the related published technical solutions, publication number CN110928950 (A) synchronizes on-chain and off-chain bookkeeping to update the off-chain ledger in real time, thereby acquiring blockchain data dynamically so that the blockchain ledger can be processed further as early as possible; publication number US2021157798 (A1) proposes that a blockchain node obtains a data read/write request for a smart contract, parses the storage field that indicates the target object in the blockchain data set, and reads the data of that storage field directly, thereby improving reading efficiency; publication number US2021042272 (A1) proposes classifying blockchain data hierarchically so that the data is processed with a degree of priority, which indirectly describes a way of improving the efficiency of processing blockchain data. However, none of the current solutions provides a more intuitive method of processing and analyzing block data that yields the statistical data of block transactions directly.

Disclosure of Invention

The invention aims to provide a processing and analyzing method suitable for block chain account book data, and aims to more intuitively present statistical data of an account book.

The invention adopts the following technical scheme: a processing and analyzing method for block data of a block chain account book; the processing analysis method comprises the following steps:

the service node receives a query request from the request node;

the service node analyzes the characteristic items contained in the query request as target characteristic items;

the service node sends a target feature value or a target feature value range to the first node of at least one feature statistics organization responsible for the target feature item;

the first node dispatches a backtracking task to at least one characteristic statistical node under the characteristic statistical organization; the feature statistical node performs on-chain backtracking, extracts all the account book records of the target feature item, and caches the extracted records;

the first node counts the records of the target characteristic items and submits the statistical data to the service node;

the service node integrates the statistical data to form a statistical report and returns the report to the request node;

wherein the blockchain comprises a plurality of feature statistics organizations; each feature statistics organization comprises a plurality of feature statistics nodes; one feature statistics organization is responsible for the statistical affairs of one target feature item, and a designated member service provider belonging to the feature statistics organization issues digital certificates to the feature statistics nodes in that organization; a feature statistics node holding the digital certificate is allowed to access, and write to, the cache space of the other feature statistics nodes in the feature statistics organization;

the blockchain is provided with an external application program interface, and an application program communicates with the request node through the external application program interface; a client makes query requests and receives statistical data through the application program;

a first channel is provided in the blockchain; the first channel is used for validity-checked communication between the request node and the service node; after receiving the query request of the request node, the service node applies to the blockchain to verify the validity of the request node; once the validity of the request node is verified, the query request is submitted to the service node;

a second channel is provided in the blockchain; the service node and the feature statistics organizations communicate in the second channel, subject to validity verification; the service node sends a target feature item and a target feature value or feature value range to at least one feature statistics organization in the second channel; the service node receives the statistical data result returned by at least one feature statistics organization in the second channel;

a feature item template is recorded in the blockchain; the feature item template describes the plurality of feature dimensions expressed by the blockchain ledger; the feature item template is stored in a block of the blockchain and is accessible to all nodes of the blockchain; the feature item template is maintained by the plurality of feature statistics organizations, and only specified feature statistics nodes are allowed to modify it;

in the process of backtracking the block, the characteristic counting node extracts all account book records including the target characteristic item; after extracting an account book record, the characteristic statistic node caches the account book record;

when the characteristic statistic node caches the account book record, the caching mode comprises local caching and distributed caching on each characteristic statistic node in the characteristic statistic organization;

when a ledger record is cached, it is cached as non-relational data of the Key-Value pair <Key-Value> type, wherein the Key is the target feature item and the Value is the target feature value;

the feature statistics nodes locate the target feature item in the cache, extract the Key-Value pairs <Key-Value> that meet the condition given by the target feature value or target feature value range, and submit all qualifying Key-Value pairs <Key-Value> to the first node;

the first node converts all qualifying Key-Value pairs <Key-Value> into a relational data structure, and submits the converted relational data result to the service node.
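The screening and conversion steps above can be illustrated with a minimal sketch. It assumes, beyond what the text specifies, that each cached entry also carries an identifier of the ledger record it came from (`record_id`), so that qualifying pairs can be laid out as rows; the names `KVPair`, `select_pairs`, and `to_rows` are hypothetical and are not taken from the invention.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Optional, Union

Number = Union[int, float]


@dataclass(frozen=True)
class KVPair:
    """One cached entry: Key is the feature item, Value is the feature value."""
    record_id: str  # assumption: identifies the source ledger record
    key: str        # feature item, e.g. "amount"
    value: Number   # feature value found in the ledger record


def select_pairs(cache: Iterable[KVPair], item: str,
                 exact: Optional[Number] = None,
                 low: Optional[Number] = None,
                 high: Optional[Number] = None) -> list[KVPair]:
    """Keep pairs for `item` matching an exact value or an inclusive range."""
    selected = []
    for pair in cache:
        if pair.key != item:
            continue
        if exact is not None:
            if pair.value == exact:
                selected.append(pair)
        elif (low is None or pair.value >= low) and \
                (high is None or pair.value <= high):
            selected.append(pair)
    return selected


def to_rows(pairs: Iterable[KVPair]) -> list[tuple[str, str, Number]]:
    """Flatten qualifying pairs into (record_id, feature_item, value) rows."""
    return sorted((p.record_id, p.key, p.value) for p in pairs)


cache = [
    KVPair("tx-001", "amount", 120.0),
    KVPair("tx-002", "amount", 9800.0),
    KVPair("tx-003", "amount", 15000.0),
    KVPair("tx-003", "region", 7),
]
# Target feature item "amount", target feature value range: 5000 and above.
rows = to_rows(select_pairs(cache, "amount", low=5000))
```

In this sketch the filter runs over the cache only, so no block is read again once its records have been cached.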

The beneficial effects obtained by the invention are as follows:

1. The processing and analyzing method provided by the invention exploits the multiple distributed nodes of a blockchain to search for feature items on the chain in parallel and to backtrack concurrently, making effective use of the high concurrent computing capability of a distributed system.

2. After the target feature value of the target feature item is obtained, the method generates a non-relational Key-Value pair <Key-Value> from the feature item and the feature value, which improves the throughput of data conversion during caching.

3. The method screens and converts the non-relational data while the blockchain data is being cached, thereby improving the efficiency of converting non-relational data into relational data, so that the relational data is output and presented to the data requester efficiently.

4. The method adopts a modular design for its software and hardware, which facilitates future upgrading or replacement of the related software and hardware environments and reduces the cost of use.

Drawings

The invention will be further understood from the following description in conjunction with the accompanying drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the embodiments. Like reference numerals designate corresponding parts throughout the different views.

FIG. 1 is a schematic diagram of an organization structure of nodes of a distributed system according to the present invention;

FIG. 2 is a schematic flow chart illustrating a method for processing and analyzing block ledger data according to the present invention;

FIG. 3 is a schematic diagram of data exchange of each node on multiple channels in the blockchain according to the present invention;

FIG. 4 is a diagram illustrating feature value screening performed by a first node in a plurality of feature statistics nodes according to the present invention;

FIG. 5 is a diagram illustrating the statistical results of the relational database generated by the present invention.

Detailed Description

In order to make the technical solution and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the embodiments thereof; it should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. Other systems, methods, and/or features of the present embodiments will become apparent to those skilled in the art upon review of the following detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims. Additional features of the disclosed embodiments are described in, and will be apparent from, the detailed description that follows.

The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it is to be understood that if there is an orientation or positional relationship indicated by the terms "upper", "lower", "left", "right", etc. based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not intended to indicate or imply that the device or assembly referred to must have a specific orientation.

Example one:

a processing and analyzing method for block data of a block chain account book; the processing analysis method comprises the following steps:

the service node receives a query request from the request node;

the first node converts all qualifying Key-Value pairs <Key-Value> into a relational data structure, and submits the converted relational data result to the service node;

the first channel and the second channel are data exchange channels in the blockchain that are isolated from each other; where several departments, organizations, and data-privacy access policies must be accommodated, the use of multiple channels allows different nodes of the same blockchain to be assigned to different channels for data exchange, so that data transmitted between different groups of nodes is kept separate; therefore, preferably, the request node and the service node are assigned to the first channel, so that a request node of doubtful validity cannot communicate without authorization with the feature statistics nodes, and so that large-scale congestion or even paralysis of data interaction in the distributed system is avoided when many request nodes initiate query requests simultaneously or when the network is subjected to multiple concurrent attacks;

further, the blockchain network is managed by a plurality of organizations; each blockchain node has identity information showing that it belongs to a certain organization; each node is authenticated by a Certificate Authority (CA), which, according to the blockchain organization to which the node belongs, issues a digital certificate representing membership of that organization to express the node's organizational identity. The mapping between a node's identity information and its organization is provided by a Member Service Provider (MSP), which determines how a node is assigned to a particular role in a given organization and obtains the associated rights to access blockchain resources. A node can be owned by only one organization and can therefore be associated with only a single MSP;

here the CA is a certificate authority, which confirms the identity of a user by issuing a digital certificate based on asymmetric cryptography. With digital certificates, a node signs the data to be sent with its own private key when bookkeeping, and the receiver verifies the received data with the sender's public key; this verification confirms the correctness of the data and ensures its security;

furthermore, for each feature item, a specific organization is designated through the consensus mechanism of the blockchain to be responsible for that feature item; on the one hand, the service node can then allocate tasks that require block backtracking to the designated organization more efficiently, which improves the efficiency of block backtracking;

on the other hand, different organizations cannot interfere with one another by backtracking the same feature item, so that repeated operations and operation congestion are avoided in the distributed system;

further, the service node and the plurality of feature statistics organizations use the second channel for data interaction; preferably, if there are many feature statistics organizations and many feature statistics nodes, additional channels may be used so that each feature statistics organization communicates with the service node over a separate channel, thereby keeping information transmission in the channels effective and fast.
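The one-organization-per-feature-item rule described in this embodiment can be sketched as a simple routing table held by the service node. This is only an illustration under stated assumptions: the class names `FeatureStatsOrg` and `ServiceNode`, the method names, and the node identifiers are hypothetical, and the consensus step that designates an organization is reduced to a plain `designate` call.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureStatsOrg:
    name: str
    feature_item: str  # the single feature item this organization handles
    first_node: str    # node that receives tasks for the organization


class ServiceNode:
    def __init__(self) -> None:
        self._org_by_item: dict[str, FeatureStatsOrg] = {}

    def designate(self, org: FeatureStatsOrg) -> None:
        """Record the responsible organization; reject a second one per item."""
        if org.feature_item in self._org_by_item:
            raise ValueError(
                f"{org.feature_item!r} already has a responsible organization")
        self._org_by_item[org.feature_item] = org

    def dispatch(self, query: dict[str, tuple]) -> dict[str, tuple]:
        """Map each queried feature item to the first node that handles it."""
        plan = {}
        for item, condition in query.items():
            org = self._org_by_item[item]  # KeyError if nobody is responsible
            plan[org.first_node] = (item, condition)
        return plan


service = ServiceNode()
service.designate(FeatureStatsOrg("org-amount", "amount", "amount-node-0"))
service.designate(FeatureStatsOrg("org-region", "region", "region-node-0"))
# Each condition is a (low, high) range; None means unbounded.
plan = service.dispatch({"amount": (5000, None), "region": (3, 3)})
```

Because a feature item maps to exactly one organization, two organizations never receive a backtracking task for the same item, which is the duplication the text seeks to avoid.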

Example two:

this embodiment should be understood to include at least all of the features of any of the foregoing embodiments and further modifications thereon;

when data is to be presented to people visually and concretely, the main requirements are that the data be clearly arranged and logically complete; a traditional relational database uses two-dimensional tables in which each row is a data tuple and each column is a feature item as described herein; each row must have a primary key that uniquely identifies the row (tuple); a relational database therefore has a strict data structure logic that facilitates later processing and screening of the data; for the purpose of presentation, the data is therefore preferably stored in a relational database and presented to people after final formatting based on that database;

however, according to the blockchain technical characteristics, the information of the blockchain ledger exists on each block, and the ledger information stored by each block is only the record of a series of transactions that occur on the blockchain within a period of time in the generation phase of the block; therefore, if account book information on a plurality of blocks needs to be extracted, a plurality of pieces of extracted information are scattered information before being processed and analyzed, and the information lacks a data structure relationship and is huge in quantity;

further, based on the characteristics of a distributed system, a plurality of feature statistics nodes can backtrack, extract, and store with high concurrency while the block ledger is traversed; however, a traditional relational database must establish logical relationships between data and enforce a series of integrity constraints when storing, so its concurrency performance is low, which greatly reduces the efficiency of blockchain data statistics; therefore, in this embodiment, to suit the characteristics of blockchain ledger data, a non-relational database with a non-relational data structure is used: the initial data extracted by the feature statistics nodes is cached first and processed in the next step;
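As a rough illustration of concurrent backtracking, the sketch below splits a toy chain into block ranges and scans them in parallel. It makes several assumptions that the text does not: the chain is an in-memory list (`CHAIN`), each ledger record is a dictionary with a `tx` identifier, and worker threads stand in for what would be separate feature statistics nodes; all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy chain: each block is a list of ledger records (plain dicts).
CHAIN = [
    [{"tx": "a1", "amount": 50}, {"tx": "a2", "memo": "fee"}],
    [{"tx": "b1", "amount": 7000}],
    [{"tx": "c1", "region": 3}],
    [{"tx": "d1", "amount": 320}, {"tx": "d2", "amount": 12000}],
]


def backtrack(block_range, item):
    """Scan a range of block heights; keep records containing `item`."""
    found = []
    for height in block_range:
        for record in CHAIN[height]:
            if item in record:
                found.append((height, record["tx"], record[item]))
    return found


def split_ranges(chain_length, workers):
    """Divide block heights into contiguous ranges, at most one per worker."""
    step = -(-chain_length // workers)  # ceiling division
    return [range(start, min(start + step, chain_length))
            for start in range(0, chain_length, step)]


def concurrent_backtrack(item, workers=2):
    """Backtrack all ranges in parallel and merge hits in block order."""
    ranges = split_ranges(len(CHAIN), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda r: backtrack(r, item), ranges))
    return [hit for part in parts for hit in part]


hits = concurrent_backtrack("amount")
```

`pool.map` returns results in the order of the submitted ranges, so the merged list stays in block order even though the ranges are scanned concurrently.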

a non-relational database is a non-relational, distributed data storage system that generally does not guarantee compliance with the ACID principle; it does not need to parse traditional Structured Query Language (SQL), and its read/write performance can greatly exceed that of a relational database; common non-relational databases currently include HBase, Hypertable, MongoDB, and Cassandra;

non-relational data is stored as Key-Value pairs <Key-Value>; its structure is not fixed: each data tuple may have different fields and may add Key-Value pairs of its own as needed, without the fixed structure and fixed storage space of a relational database; reading or writing non-relational data involves no joined queries across multiple tables, and the operation is completed simply by reading or writing the value that corresponds to the key, which saves a large amount of computing resources and time; non-relational data and a non-relational database are therefore well suited as the cached data type and data structure while the blockchain ledger is traversed;

this embodiment comprises a step of converting the non-relational database into a relational database, that is, converting the non-relational data set held in the distributed node caches into the final relational database; the purpose is to give the final query result a clearer display logic when it is submitted to the request node;

the first node submits the screened Key-Value pairs <Key-Value> to the service node in one of two forms: it either submits the complete Key-Value pair data directly, or, to make effective use of the service node's cache space, submits only the storage addresses of the screened Key-Value pairs in the non-relational database, in which case the service node retrieves the screened Key-Value pair data itself over a connection in the second channel;

further, the service node records the screened Key-Value pairs <Key-Value> as a first data set; the service node creates a new, blank relational database and establishes a plurality of connections to it; the service node extracts the Key-Value pairs of the first data set one by one or in batches and, through the connections, writes the data of the first data set into the relational database according to a relational mapping;

further, the service node processes the relational database by sorting, charting, segmentation, and the like, to produce a final, more visual statistical result.
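One possible relational mapping for the first data set is sketched below using SQLite from the Python standard library. The sketch assumes that each Key-Value pair arrives with a ledger record identifier, that feature values are numeric, and that each feature item becomes one column; the table name `ledger_stats` and the function `build_relational_db` are hypothetical. A single in-memory connection is used for brevity, whereas the text describes several connections.

```python
import sqlite3

# First data set: (record_id, feature_item, feature_value) triples.
first_data_set = [
    ("tx-002", "amount", 9800.0),
    ("tx-003", "amount", 15000.0),
    ("tx-003", "region", 7),
    ("tx-002", "region", 2),
]


def build_relational_db(pairs):
    """Create a blank database; write one row per record, one column per item."""
    items = sorted({key for _, key, _ in pairs})
    if not items or not all(item.isidentifier() for item in items):
        raise ValueError("feature items must be non-empty, valid column names")
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{item}" REAL' for item in items)
    conn.execute(
        f"CREATE TABLE ledger_stats (record_id TEXT PRIMARY KEY, {columns})")
    for record_id, key, value in pairs:
        # The primary key uniquely identifies the row (tuple) for this record.
        conn.execute(
            "INSERT OR IGNORE INTO ledger_stats (record_id) VALUES (?)",
            (record_id,))
        conn.execute(
            f'UPDATE ledger_stats SET "{key}" = ? WHERE record_id = ?',
            (value, record_id))
    conn.commit()
    return conn


conn = build_relational_db(first_data_set)
# Sorting is now an ordinary relational query.
report = conn.execute(
    "SELECT record_id, amount, region FROM ledger_stats ORDER BY amount DESC"
).fetchall()
```

Column names are interpolated into the SQL text, so the sketch accepts only feature items that are valid identifiers; values always go through bound parameters.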

Example three:

this embodiment should be understood to include at least all of the features of any of the embodiments described above and further refinements thereto:

because the ledger records on earlier blocks of the blockchain cannot be tampered with, backtracking yields highly consistent results in a strict logical order regardless of when it is performed or which node performs it; therefore, for the ledger records of specific feature items obtained by the feature statistics nodes, an optimized caching mechanism can keep frequently used ledger records in the distributed cache, so that when the same ledger records are needed later they are extracted from the cache more efficiently, without backtracking the block data again;
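Because earlier blocks never change, a cache of extracted records only needs to be extended when new blocks appear. The following sketch shows that idea under assumptions not found in the text: a hypothetical `CachedBacktracker` class, an in-memory toy chain, and a `scan_block` callback that stands in for real on-chain backtracking.

```python
class CachedBacktracker:
    """Caches extracted records per feature item; scans only unseen blocks."""

    def __init__(self, scan_block):
        self._scan_block = scan_block  # callable(height, item) -> records
        self._records = {}             # feature item -> cached records
        self._next_height = {}         # feature item -> first unscanned block
        self.blocks_scanned = 0        # counter, for illustration only

    def records_for(self, item, chain_length):
        """Return all records for `item` in blocks [0, chain_length)."""
        start = self._next_height.get(item, 0)
        bucket = self._records.setdefault(item, [])
        for height in range(start, chain_length):
            bucket.extend(self._scan_block(height, item))
            self.blocks_scanned += 1
        self._next_height[item] = max(start, chain_length)
        return list(bucket)


CHAIN = [
    [{"tx": "a1", "amount": 50}],
    [{"tx": "b1", "region": 3}],
    [{"tx": "c1", "amount": 7000}],
]


def scan_block(height, item):
    return [record for record in CHAIN[height] if item in record]


tracker = CachedBacktracker(scan_block)
first = tracker.records_for("amount", len(CHAIN))   # scans three blocks
second = tracker.records_for("amount", len(CHAIN))  # served from the cache
CHAIN.append([{"tx": "d1", "amount": 12}])
third = tracker.records_for("amount", len(CHAIN))   # scans only the new block
```

The sketch relies on the chain length only ever growing; it does not handle eviction, which the priority scheme later in this embodiment addresses.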

in a distributed cache system composed of a plurality of feature statistical nodes, each feature statistical node is responsible for different feature items, and the hardware used for caching by each feature statistical node is different;

optionally, some of the feature statistics nodes have a large Random Access Memory (RAM) capacity, for example 64 GB, or 128 GB or more, and can use RAM as the main cache location; a server configured with quad-channel RAM can provide a RAM read/write bandwidth of 100 GB/s, and a server configured with dual-channel RAM can provide 50 GB/s;

secondly, some of the feature statistics nodes use a solid-state drive (SSD) as the main cache storage, which can provide a read/write speed of about 4 GB/s or more and a comparatively large cache capacity of more than 1 TB;

thirdly, some of the feature statistics nodes run only on older server configurations, for example with mechanical hard disks, whose caching speed is low and whose random read performance is poor;

therefore, in this embodiment, the processing analysis method of the present invention is optimized by optimizing a distributed cache mechanism, and the embodiment includes the following implementation steps:

S201: the local cache on the feature statistics node's own server is set as the preferred cache location;

S202: the first node of the feature statistics organization counts the number of times each ledger record already in the distributed cache has been extracted within a period of time, such as 24 hours or 48 hours, or within a certain number of queries, for example the past 10000 queries;

S203: the ledger records in the distributed cache are classified by heat, that is, by how often they are read; for example, the ledger records in the top 10% by read count are set as first priority; the ledger records ranked from 11% to 40% by read count are set as second priority; all other ledger records are set as third priority;

S204: the first-priority ledger records are copied to the feature statistics nodes that use RAM as cache hardware; the second-priority ledger records are copied to the feature statistics nodes that use SSD as cache hardware; the third-priority ledger records are cached under the default rule, according to the proximity principle of the current distributed cache system;

through this optimization, ledger records with high heat are preferentially placed in high-speed cache locations, which makes them efficient to read and speeds up conversion from the non-relational database to the relational database;

furthermore, the cache location of each ledger record is adjusted periodically according to its heat, until the ledger record is no longer cached, so that the limited distributed cache space is used optimally and the cache capacity required of each feature statistics node is reduced.
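The priority classification of steps S203 and S204 can be sketched as follows, using the example thresholds from the text (top 10%, then 11% to 40%, then the rest). The function name `classify_by_heat`, the tier labels, and the tie-breaking by record identifier are assumptions made for illustration.

```python
def classify_by_heat(read_counts):
    """Assign priority 1, 2, or 3 to each record by its rank in read count."""
    # Highest read count first; ties broken by record id for determinism.
    ranked = sorted(read_counts, key=lambda rid: (-read_counts[rid], rid))
    total = len(ranked)
    first_cut = (total * 10 + 99) // 100   # ceiling of 10% of the records
    second_cut = (total * 40 + 99) // 100  # ceiling of 40% of the records
    priorities = {}
    for position, record_id in enumerate(ranked):
        if position < first_cut:
            priorities[record_id] = 1
        elif position < second_cut:
            priorities[record_id] = 2
        else:
            priorities[record_id] = 3
    return priorities


# Assumed mapping from priority to cache hardware, following S204.
TIER_BY_PRIORITY = {1: "ram", 2: "ssd", 3: "default"}

# Read counts observed over the chosen window (for example, 24 hours).
read_counts = {f"r{i}": 100 - 10 * i for i in range(10)}
priorities = classify_by_heat(read_counts)
placement = {rid: TIER_BY_PRIORITY[p] for rid, p in priorities.items()}
```

Running the classification again at the end of each counting window would move records between tiers as their heat changes, which corresponds to the periodic adjustment described above.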

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.

Although the invention has been described above with reference to various embodiments, it should be understood that many changes and modifications may be made without departing from the scope of the invention. That is, the methods, systems, and devices discussed above are examples. Various configurations may omit, substitute, or add various procedures or components as appropriate. For example, in alternative configurations, the methods may be performed in an order different than that described, and/or various components may be added, omitted, and/or combined. Moreover, features described with respect to certain configurations may be combined in various other configurations, as different aspects and elements of the configurations may be combined in a similar manner. Further, elements therein may be updated as technology evolves, i.e., many elements are examples and do not limit the scope of the disclosure or claims.

Specific details are given in the description to provide a thorough understanding of the exemplary configurations including implementations. However, configurations may be practiced without these specific details, for example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the configurations. This description provides example configurations only, and does not limit the scope, applicability, or configuration of the claims. Rather, the foregoing description of the configurations will provide those skilled in the art with an enabling description for implementing the described techniques. Various changes may be made in the function and arrangement of elements without departing from the spirit or scope of the disclosure.

In conclusion, it is intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that these examples are illustrative only and are not intended to limit the scope of the invention. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims

1. A method for processing and analyzing block data of a blockchain ledger, wherein the method for processing and analyzing comprises:

The service node receives the query request from the requesting node;

The service node analyzes the feature item contained in the query request as the target feature item;

The service node sends the target feature value or the target feature value range to the first node responsible for at least one feature statistics organization of the target feature item;

The first node assigns the backtracking task to at least one feature statistics node under the feature statistics organization; the feature statistics node performs on-chain backtracking, extracts all the ledger records of the target feature item, and caches the extracted records;

The first node performs statistics on the ledger records of the target feature item according to the target feature value or target feature value range, and submits the statistical data to the service node;

The service node integrates the statistical data to form a statistical report, and returns the report to the requesting node;

Wherein the blockchain contains multiple feature statistics organizations; each feature statistics organization includes multiple feature statistics nodes; one feature statistics organization is responsible for the statistical affairs of one target feature item, and a designated member service provider belonging to the feature statistics organization issues digital certificates to the feature statistics nodes in that feature statistics organization; a feature statistics node that holds the digital certificate is allowed to access and write to the cache space of the other feature statistics nodes in the feature statistics organization.

2. The method for processing and analyzing block data of a blockchain ledger according to claim 1, wherein the blockchain has an external application program interface, and an application program communicates with the requesting node through the external application program interface; clients make query requests and receive statistical data through the application program.

3. The method for processing and analyzing block data of a blockchain ledger according to claim 2, wherein the blockchain has a first channel; the first channel is used for validity-checked communication between the requesting node and the service node; after receiving the query request from the requesting node, the service node applies to the blockchain to verify the validity of the requesting node; after the validity of the requesting node is verified, the query request is submitted to the service node.

4. The method for processing and analyzing block data of a blockchain ledger according to claim 3, wherein the blockchain has a second channel; the service node and a plurality of the feature statistics organizations communicate in the second channel, subject to validity verification; the service node sends the target feature item and the target feature value or feature value range to at least one of the feature statistics organizations in the second channel; the service node receives, in the second channel, the statistical data result returned by at least one of the feature statistics organizations.

5. The method for processing and analyzing block data of a blockchain ledger according to claim 1, wherein a feature item template is recorded in the blockchain; the feature item template is used to describe multiple feature dimensions expressed by the blockchain ledger; the feature item template is stored in a block of the blockchain and is accessible to all nodes of the blockchain; the feature item template is maintained by a plurality of the feature statistics organizations, and only specified feature statistics nodes are allowed to modify it.

6. The method for processing and analyzing block data of a blockchain ledger according to claim 5, wherein the feature statistics node extracts all ledger records that include the target feature item in the process of backtracking blocks; after extracting a ledger record, the feature statistics node caches the ledger record.

7. The method for processing and analyzing block data of a blockchain ledger according to claim 6, wherein, when the feature statistics node caches the ledger records, the caching mode includes local caching and distributed caching on each of the feature statistics nodes in the feature statistics organization.

8. The method for processing and analyzing block data of a blockchain ledger according to claim 7, wherein, when the ledger record is cached, it is cached as non-relational data of the key-value pair <Key-Value> type; wherein the Key is the target feature item, and the Value is the target feature value.

9. The method for processing and analyzing block data of a blockchain ledger according to claim 8, wherein a plurality of the feature statistics nodes locate the target feature item in the cache, extract the qualifying key-value pairs <Key-Value> according to the target feature value or target feature value range, and submit all qualifying key-value pairs <Key-Value> to the first node.

10. The method for processing and analyzing block data of a blockchain ledger according to claim 9, wherein the first node converts all the qualifying key-value pairs <Key-Value> into a relational data structure; the converted relational data result is submitted to the service node.