Disclosure of Invention
The invention aims to provide a processing and analysis method suited to blockchain ledger data, so that the statistical data of a ledger can be presented more intuitively.
The invention adopts the following technical scheme: a processing and analysis method for block data of a blockchain ledger, the method comprising the following steps:
the service node receives a query request from the request node;
the service node parses the feature items contained in the query request and takes them as target feature items;
the service node sends a target feature value or a target feature value range to a first node of at least one feature statistics organization that is responsible for the target feature item;
the first node dispatches a backtracking task to at least one feature statistics node under the feature statistics organization; the feature statistics node performs on-chain backtracking, extracts all ledger records containing the target feature item, and caches the extracted records;
the first node compiles statistics on the records of the target feature item and submits the statistical data to the service node;
the service node integrates the statistical data into a statistical report and returns the report to the request node;
wherein the blockchain comprises a plurality of feature statistics organizations; each feature statistics organization comprises a plurality of feature statistics nodes; each feature statistics organization is responsible for the statistics of one target feature item, and a designated member service provider belonging to the feature statistics organization issues digital certificates to the feature statistics nodes in that organization; a feature statistics node holding the digital certificate is allowed to access, and to write to, the cache space of the other feature statistics nodes in the same feature statistics organization;
the blockchain is provided with an external application program interface, through which an application program communicates with the request node; a client submits query requests and receives statistical data through the application program;
a first channel is provided in the blockchain; the first channel is used for communication, subject to legitimacy verification, between the request node and the service node; after receiving the query request from the request node, the service node applies to the blockchain to verify the legitimacy of the request node; once the legitimacy of the request node has been verified, the query request is submitted to the service node for processing;
a second channel is provided in the blockchain; the service node and the feature statistics organizations carry out authorized communication in the second channel; in the second channel, the service node sends the target feature item and the target feature value or feature value range to at least one feature statistics organization, and receives the statistical data result returned by the at least one feature statistics organization;
a feature item template is recorded in the blockchain; the feature item template describes a plurality of feature dimensions expressed by the blockchain ledger; the feature item template is stored in a block of the blockchain and is accessible to all nodes of the blockchain; the feature item template is maintained by the plurality of feature statistics organizations, and only designated feature statistics nodes are allowed to modify it;
while backtracking the blocks, the feature statistics node extracts all ledger records that include the target feature item; each time a ledger record is extracted, the feature statistics node caches it;
when the feature statistics node caches a ledger record, the caching modes include local caching and distributed caching across the feature statistics nodes in the feature statistics organization;
a ledger record is cached as non-relational data in the form of a key-value pair <Key-Value>, wherein the Key is the target feature item and the Value is the target feature value.
The feature statistics nodes locate the target feature item in the cache, extract the key-value pairs <Key-Value> that satisfy the target feature value or the target feature value range, and submit all matching key-value pairs to the first node;
the first node converts all matching key-value pairs <Key-Value> into a relational data structure and submits the converted relational data result to the service node.
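The caching, screening and conversion steps above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the record layout, the sample data, and the helper names `cache_feature`, `filter_pairs` and `to_rows` are all assumptions introduced only for illustration.

```python
# Hypothetical ledger records: each maps feature items to feature values.
LEDGER_RECORDS = [
    {"tx_id": "t1", "amount": 120, "region": "north"},
    {"tx_id": "t2", "amount": 45, "region": "south"},
    {"tx_id": "t3", "region": "north"},  # lacks the "amount" feature item
    {"tx_id": "t4", "amount": 300, "region": "east"},
]


def cache_feature(records, feature_item):
    """Cache one (Key, Value) pair for each record containing the feature item."""
    return [(feature_item, r[feature_item]) for r in records if feature_item in r]


def filter_pairs(cache, feature_item, value=None, value_range=None):
    """Select cached pairs matching an exact value or an inclusive value range."""
    matches = []
    for key, val in cache:
        if key != feature_item:
            continue
        if value is not None and val != value:
            continue
        if value_range is not None and not (value_range[0] <= val <= value_range[1]):
            continue
        matches.append((key, val))
    return matches


def to_rows(pairs):
    """Convert key-value pairs into relational-style rows with a primary key."""
    return [
        {"id": i, "feature_item": k, "feature_value": v}
        for i, (k, v) in enumerate(pairs, start=1)
    ]


cache = cache_feature(LEDGER_RECORDS, "amount")
rows = to_rows(filter_pairs(cache, "amount", value_range=(100, 400)))
```

Here a plain list stands in for the cache of a feature statistics node, and the rows stand in for the relational data result that the first node submits to the service node.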
The beneficial effects obtained by the invention are as follows:
1. The processing and analysis method provided by the invention exploits the fact that a blockchain has a plurality of distributed nodes: feature items are searched on the blockchain in parallel and backtracking is performed concurrently, making effective use of the high concurrent computing capability of a distributed system.
2. According to the processing and analysis method, after the target feature value of the target feature item is obtained, a non-relational key-value pair <Key-Value> is generated from the feature item and the feature value, which improves the throughput and conversion performance of the data during caching.
3. The processing and analysis method can screen and convert the non-relational data while the blockchain data is being cached, which improves the efficiency of converting non-relational data into relational data, so that the relational data can be output and presented efficiently to the data requester.
4. The processing and analysis method adopts a modular design for its software and hardware, which facilitates future upgrading or replacement of the related software and hardware environments and reduces the cost of use.
Detailed Description
In order to make the technical solution and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the embodiments thereof; it should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. Other systems, methods, and/or features of the present embodiments will become apparent to those skilled in the art upon review of the following detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the accompanying claims. Additional features of the disclosed embodiments are described in, and will be apparent from, the detailed description that follows.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it is to be understood that if there is an orientation or positional relationship indicated by the terms "upper", "lower", "left", "right", etc. based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not intended to indicate or imply that the device or assembly referred to must have a specific orientation.
Example one:
a processing and analysis method for block data of a blockchain ledger, the method comprising the following steps:
the service node receives a query request from the request node;
the service node parses the feature items contained in the query request and takes them as target feature items;
the service node sends a target feature value or a target feature value range to a first node of at least one feature statistics organization that is responsible for the target feature item;
the first node dispatches a backtracking task to at least one feature statistics node under the feature statistics organization; the feature statistics node performs on-chain backtracking, extracts all ledger records containing the target feature item, and caches the extracted records;
the first node compiles statistics on the records of the target feature item and submits the statistical data to the service node;
the service node integrates the statistical data into a statistical report and returns the report to the request node;
wherein the blockchain comprises a plurality of feature statistics organizations; each feature statistics organization comprises a plurality of feature statistics nodes; each feature statistics organization is responsible for the statistics of one target feature item, and a designated member service provider belonging to the feature statistics organization issues digital certificates to the feature statistics nodes in that organization; a feature statistics node holding the digital certificate is allowed to access, and to write to, the cache space of the other feature statistics nodes in the same feature statistics organization;
the blockchain is provided with an external application program interface, through which an application program communicates with the request node; a client submits query requests and receives statistical data through the application program;
a first channel is provided in the blockchain; the first channel is used for communication, subject to legitimacy verification, between the request node and the service node; after receiving the query request from the request node, the service node applies to the blockchain to verify the legitimacy of the request node; once the legitimacy of the request node has been verified, the query request is submitted to the service node for processing;
a second channel is provided in the blockchain; the service node and the feature statistics organizations carry out authorized communication in the second channel; in the second channel, the service node sends the target feature item and the target feature value or feature value range to at least one feature statistics organization, and receives the statistical data result returned by the at least one feature statistics organization;
a feature item template is recorded in the blockchain; the feature item template describes a plurality of feature dimensions expressed by the blockchain ledger; the feature item template is stored in a block of the blockchain and is accessible to all nodes of the blockchain; the feature item template is maintained by the plurality of feature statistics organizations, and only designated feature statistics nodes are allowed to modify it;
while backtracking the blocks, the feature statistics node extracts all ledger records that include the target feature item; each time a ledger record is extracted, the feature statistics node caches it;
when the feature statistics node caches a ledger record, the caching modes include local caching and distributed caching across the feature statistics nodes in the feature statistics organization;
a ledger record is cached as non-relational data in the form of a key-value pair <Key-Value>, wherein the Key is the target feature item and the Value is the target feature value.
The feature statistics nodes locate the target feature item in the cache, extract the key-value pairs <Key-Value> that satisfy the target feature value or the target feature value range, and submit all matching key-value pairs to the first node;
the first node converts all matching key-value pairs <Key-Value> into a relational data structure and submits the converted relational data result to the service node;
the first channel and the second channel are data exchange channels in the blockchain that are isolated from each other; where multiple departments, multiple organizations, differing data-privacy access policies and the like must be accommodated, the use of multiple channels allows different nodes in the same blockchain to be assigned to different channels for data exchange, so that the data transmitted among different groups of nodes is effectively kept separate; therefore, preferably, the request node and the service node are allocated to the first channel, which prevents a request node whose legitimacy is in doubt from communicating without authorization with the feature statistics nodes, and avoids the large-scale congestion or even paralysis of data interaction in the distributed system that could otherwise occur when many request nodes initiate query requests simultaneously or when the network is subjected to multiple concurrent attacks;
further, the blockchain network is managed by a plurality of organizations; each blockchain node has identity information belonging to a particular organization and is authenticated through a certificate authority (CA); according to the organization to which the node belongs, the certificate authority issues a digital certificate representing membership of that organization, which expresses the organizational identity of the node. The mapping between a node's identity information and its organization is provided by a member service provider (MSP), which determines how a node is assigned to a particular role in a given organization and obtains the associated rights to access blockchain resources. A node can be owned by only one organization, and can therefore be associated with only a single MSP;
here the CA is a certificate authority, which confirms the identity of a user by issuing a digital certificate based on asymmetric cryptography. With digital certificates, a node signs the data to be sent with its own private key when recording to the ledger, and the receiver verifies the received data with the sender's public key; verification by digital certificate can confirm the correctness of the data and safeguard its security;
furthermore, for different feature items, a specific organization is designated through a consensus mechanism in the blockchain to be responsible for each designated feature item, so that the service node can more efficiently allocate tasks requiring block backtracking to the designated organization, which on the one hand improves the efficiency of block backtracking;
on the other hand, different organizations do not interfere with one another by backtracking the same feature item, so that repeated operations and operation congestion are avoided in the distributed system;
further, the service node and the plurality of feature statistics organizations use the second channel for data interaction; preferably, if there are a large number of feature statistics organizations and feature statistics nodes, additional channels may be used to separate the communication between the service node and different feature statistics organizations, thereby improving the effectiveness and speed of information transmission in the channels.
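The assignment of one responsible organization per feature item, and the concurrent backtracking by that organization's nodes, can be sketched as follows. This is only an illustrative model under stated assumptions: the in-memory `CHAIN`, the routing tables `RESPONSIBLE_ORG` and `ORG_NODES`, and the functions `backtrack` and `dispatch` are hypothetical, and threads stand in for separate feature statistics nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chain: a list of blocks, each holding a list of ledger records.
CHAIN = [
    {"height": 0, "records": [{"amount": 10}, {"region": "north"}]},
    {"height": 1, "records": [{"amount": 25, "region": "south"}]},
    {"height": 2, "records": [{"region": "east"}, {"amount": 40}]},
    {"height": 3, "records": [{"amount": 5}]},
]

# Hypothetical routing tables: exactly one organization per feature item.
RESPONSIBLE_ORG = {"amount": "org_amount", "region": "org_region"}
ORG_NODES = {"org_amount": ["node_a1", "node_a2"], "org_region": ["node_r1"]}


def backtrack(node, blocks, feature_item):
    """One feature statistics node scans its share of blocks.

    `node` is only a label in this sketch; a real node would read its own
    copy of the ledger.
    """
    found = []
    for block in blocks:
        for record in block["records"]:
            if feature_item in record:
                found.append((block["height"], feature_item, record[feature_item]))
    return found


def dispatch(feature_item):
    """First-node role: split the chain among the organization's nodes."""
    org = RESPONSIBLE_ORG[feature_item]
    nodes = ORG_NODES[org]
    # Round-robin split so that each block is scanned by exactly one node.
    shares = [CHAIN[i::len(nodes)] for i in range(len(nodes))]
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        parts = pool.map(backtrack, nodes, shares, [feature_item] * len(nodes))
        merged = [hit for part in parts for hit in part]
    return org, sorted(merged)
```

Because each feature item maps to a single organization and each block is given to a single node, no record is extracted twice, which mirrors the aim of avoiding repeated operations.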
Example two:
this embodiment should be understood to include at least all of the features of any of the foregoing embodiments and further modifications thereon;
when data is to be presented to people in a visual and concrete way, the main requirements are that the data be clearly arranged and logically complete; a traditional relational database uses two-dimensional tables, in which each row is a data tuple and each column is a feature item as described herein; each row (tuple) is uniquely identified by a primary key; the relational database therefore has a strict logical data structure, which facilitates later processing and screening of the data; accordingly, for the purpose of presenting data, the data is preferably stored in a relational database and presented to people after final formatting based on that database;
however, owing to the technical characteristics of the blockchain, the ledger information is spread across the blocks, and the ledger information stored in each block is only the record of the series of transactions that occurred on the blockchain during the period in which that block was generated; therefore, if ledger information on a plurality of blocks needs to be extracted, the extracted items are scattered pieces of information before they are processed and analyzed, lacking any data-structure relationship and being very large in quantity;
further, based on the characteristics of a distributed system, a plurality of feature statistics nodes can be used to perform highly concurrent backtracking, extraction and storage while traversing the block ledger; however, a traditional relational database must establish logical relationships among the data and enforce a series of integrity constraints at storage time, so its concurrency performance is low, which would greatly reduce the efficiency of compiling statistics on blockchain data; therefore, in this embodiment, to suit the characteristics of blockchain ledger data, a non-relational database is formed using a non-relational data structure, and the initial data extracted by the feature statistics nodes is cached first and then processed in the next step;
the non-relational database refers to a non-relational, distributed data storage system that generally does not guarantee compliance with the ACID principle; it is characterized in that parsing of traditional Structured Query Language (SQL) is not required, and for simple key-based reads and writes its performance can greatly exceed that of a relational database; currently common non-relational databases include HBase, Hypertable, MongoDB, Cassandra, etc.;
the non-relational data is stored as key-value pairs <Key-Value>; its structure is not fixed, each data tuple can have different fields, and each data tuple can add its own key-value pairs as needed, so that the fixed structure and fixed space of a relational database are dispensed with; reading or writing the non-relational data requires no associated query across multiple tables, and the operation is completed simply by extracting or writing the value corresponding to the key, which saves a large amount of computing resources and time; therefore, non-relational data and a non-relational database are well suited to serve as the cached data type and data structure while the blockchain ledger is traversed;
this embodiment comprises a conversion step of converting the non-relational database into a relational database, the aim being to convert the non-relational data sets in the distributed node caches into a final relational database, so that clearer data display logic is provided when the final query result is submitted to the request node;
the first node submits the screened key-value pairs <Key-Value> to the service node either by submitting the complete key-value pair data directly or, in order to make effective use of the cache space of the service node, by submitting only the storage addresses of the screened key-value pairs in the non-relational database, in which case the service node itself retrieves the screened key-value pair data through a connection in the second channel;
further, the service node records the screened key-value pairs <Key-Value> as a first data set; the service node creates a new, empty relational database and establishes a plurality of connections to it; the service node extracts the key-value pairs of the first data set one by one or in batches and, according to a relational mapping, writes the data of the first data set into the relational database through the connections;
further, the service node performs processing such as sorting, charting and segmentation on the relational database to produce a final, more intuitive statistical result.
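The conversion of the first data set into a relational database can be sketched with Python's standard `sqlite3` module. This is a minimal illustration under assumptions: the table name `feature_stats`, its columns, and the sample pairs are hypothetical, and a single in-memory connection is used where the embodiment describes a plurality of connections.

```python
import sqlite3

# Hypothetical "first data set": the screened key-value pairs.
first_data_set = [("amount", 120), ("amount", 300), ("amount", 45)]

# A new, empty relational database (in memory for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feature_stats ("
    " id INTEGER PRIMARY KEY,"
    " feature_item TEXT NOT NULL,"
    " feature_value INTEGER NOT NULL)"
)

# Batch write using a simple relational mapping:
# Key -> feature_item column, Value -> feature_value column.
with conn:
    conn.executemany(
        "INSERT INTO feature_stats (feature_item, feature_value) VALUES (?, ?)",
        first_data_set,
    )

# Post-processing on the relational side: sorting and a per-item summary.
sorted_rows = conn.execute(
    "SELECT id, feature_item, feature_value"
    " FROM feature_stats ORDER BY feature_value"
).fetchall()
summary = conn.execute(
    "SELECT feature_item, COUNT(*), SUM(feature_value)"
    " FROM feature_stats GROUP BY feature_item"
).fetchall()
```

Once the pairs are in a table with a primary key, sorting and aggregation become ordinary SQL queries, which is the benefit the embodiment attributes to the relational form.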
Example three:
this embodiment should be understood to include at least all of the features of any of the embodiments described above and further refinements thereto:
because ledger records on earlier blocks of the blockchain cannot be tampered with, the results obtained are highly consistent and strictly ordered regardless of when backtracking is performed and which node performs it; therefore, for the ledger records of specific feature items obtained by the feature statistics nodes, frequently used ledger records can be retained in the distributed cache through an optimized caching mechanism, so that when the same ledger records are needed later they can be read from the cache more efficiently, without backtracking the block data again;
in a distributed cache system composed of a plurality of feature statistics nodes, each feature statistics node is responsible for different feature items, and the hardware used for caching differs from node to node;
optionally, some of the feature statistics nodes have a large random access memory (RAM) capacity, for example 64 GB, or 128 GB or more, and may use the RAM as the main cache location; a server configured with quad-channel RAM may provide a RAM read-write bandwidth of 100 GB/s, and a server configured with dual-channel RAM may provide a read-write bandwidth of 50 GB/s;
secondly, some of the feature statistics nodes use a solid-state drive (SSD) as the main storage, which can provide a read-write speed of about 4 GB/s or more and a comparatively large cache capacity of more than 1 TB;
thirdly, some of the feature statistics nodes are older servers equipped only with, for example, mechanical hard disks, whose caching speed is low and whose random read performance is poor;
therefore, in this embodiment, the processing and analysis method of the invention is improved by optimizing the distributed cache mechanism, through the following implementation steps:
S201: on the local server of each feature statistics node, the local cache is set as the preferred cache location;
S202: the first node of the feature statistics organization counts the number of times each ledger record already in the distributed cache has been extracted within a period of time, for example 24 hours or 48 hours, or within a certain number of queries, for example the past 10000 queries;
S203: the ledger records in the distributed cache are classified by popularity; for example, the top 10% of ledger records by read count are set as a first priority; the ledger records ranked from 11% to 40% by read count are set as a second priority; all other ledger records are set as a third priority;
S204: the first-priority ledger records are copied to feature statistics nodes that use RAM as cache hardware; the second-priority ledger records are copied to feature statistics nodes that use SSD as cache hardware; the third-priority ledger records are cached according to the default rules of the current distributed cache system, following the nearest-node principle;
through this optimization, ledger records with high popularity are preferentially placed in high-speed cache locations, which provides efficient reads and increases the speed of conversion from the non-relational database to the relational database;
furthermore, the cache location of each ledger record is adjusted periodically according to its popularity, until the record is no longer cached, so that the limited distributed cache space is used optimally and the cache capacity required of each feature statistics node is reduced.
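Steps S202 to S204 can be illustrated by a short Python sketch that ranks cached ledger records by read count and assigns each to a cache tier. The function name `assign_tiers`, the tier labels and the sample read counts are assumptions made for illustration; the 10% and 40% cut-offs are the example thresholds given in S203, and how fractional cut-offs and ties are handled is a choice of this sketch (rounding up, ties broken by record id).

```python
def assign_tiers(read_counts):
    """Map each record id to 'ram', 'ssd' or 'default' by read-count rank.

    read_counts: dict of record id -> number of reads in the counting window.
    """
    # Most-read records first; ties are broken by record id for determinism.
    ranked = sorted(read_counts, key=lambda rid: (-read_counts[rid], rid))
    n = len(ranked)
    first_cut = (n * 10 + 99) // 100   # ceil(10% of n): first priority
    second_cut = (n * 40 + 99) // 100  # ceil(40% of n): end of second priority
    tiers = {}
    for rank, rid in enumerate(ranked):
        if rank < first_cut:
            tiers[rid] = "ram"      # first priority: RAM-cached nodes
        elif rank < second_cut:
            tiers[rid] = "ssd"      # second priority: SSD-cached nodes
        else:
            tiers[rid] = "default"  # third priority: default placement
    return tiers


# Hypothetical read counts for ten cached ledger records.
read_counts = {f"rec{i}": 100 - 10 * i for i in range(10)}
tiers = assign_tiers(read_counts)
```

Re-running `assign_tiers` on fresh counts at each period corresponds to the periodic adjustment of cache locations described above.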
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Although the invention has been described above with reference to various embodiments, it should be understood that many changes and modifications may be made without departing from the scope of the invention. That is, the methods, systems, and devices discussed above are examples. Various configurations may omit, substitute, or add various procedures or components as appropriate. For example, in alternative configurations, the methods may be performed in an order different than that described, and/or various components may be added, omitted, and/or combined. Moreover, features described with respect to certain configurations may be combined in various other configurations, as different aspects and elements of the configurations may be combined in a similar manner. Further, elements therein may be updated as technology evolves, i.e., many elements are examples and do not limit the scope of the disclosure or claims.
Specific details are given in the description to provide a thorough understanding of the exemplary configurations including implementations. However, configurations may be practiced without these specific details, for example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the configurations. This description provides example configurations only, and does not limit the scope, applicability, or configuration of the claims. Rather, the foregoing description of the configurations will provide those skilled in the art with an enabling description for implementing the described techniques. Various changes may be made in the function and arrangement of elements without departing from the spirit or scope of the disclosure.
In conclusion, it is intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that these examples are illustrative only and are not intended to limit the scope of the invention. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.