Detailed Description
To help those skilled in the art better understand the technical solutions in the embodiments of the present specification, these solutions are described in detail below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present specification. All other embodiments that one of ordinary skill in the art can derive from the embodiments given herein are intended to fall within the scope of protection.
First, the blockchain-type ledger in the embodiments of the present specification is described. At a centralized database service provider, a blockchain-type ledger is generated as follows. As shown in fig. 1, which is a schematic flowchart of generating a blockchain-type ledger provided by an embodiment of this specification, the process includes:
S101, receiving data records to be stored, and determining the hash value of each data record, wherein each data record comprises a service attribute.
As mentioned above, the data records to be stored here may be various consumption records of individual users of a client, or may be business results, intermediate states, operation records, and the like generated by an application server when executing business logic based on user instructions. Specific business scenarios may include consumption records, audit logs, supply chains, government regulatory records, medical records, court records, and the like.
In embodiments of the present specification, the database server may provide services to an institution or an individual. Depending on the service scenario, the service attributes of a docking institution may include a user name, a user identity card number, a driving license number, a mobile phone number, a unique item number, and the like.
For example, when the docking institution is a supermarket, the data record may be a shipment record for a certain commodity, and the corresponding service attribute may be the character string "2201102334", in which "220" represents a company number, for example, a dairy company; "110" represents the Beijing area; "23" represents a commodity category, e.g., yogurt; and "34" represents the type, e.g., strawberry-flavored yogurt. In other words, the service attribute includes characters representing features of different granularity, arranged in a sequence defined by the user.
The service attribute and the service data that the user needs to store together constitute the data record to be stored. For example, the user may directly concatenate the service attribute and the service data to generate the data record, with a specified field (e.g., the header or the trailer) of the data record storing the service attribute.
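The composition of a data record can be illustrated with a minimal sketch. It assumes, purely for illustration, that the service attribute is a fixed 10-character header placed in front of the service data; the names ATTR_LEN, build_service_attribute, and build_record are hypothetical and do not appear in the embodiments.

```python
# Illustrative sketch only: a fixed-length header layout is an assumption.
ATTR_LEN = 10  # assumed header length, agreed in advance between the parties


def build_service_attribute(company: str, region: str, category: str, kind: str) -> str:
    """Concatenate the granularity fields in a user-defined order."""
    return company + region + category + kind


def build_record(service_attribute: str, service_data: bytes) -> bytes:
    """Splice the service attribute in front of the service data (header placement)."""
    if len(service_attribute) != ATTR_LEN:
        raise ValueError("service attribute must be exactly ATTR_LEN characters")
    return service_attribute.encode("ascii") + service_data


attr = build_service_attribute("220", "110", "23", "34")
record = build_record(attr, b"shipment: 40 cartons")
```

With this layout, the database server can recover the service attribute by reading the first ATTR_LEN bytes of each record without interpreting the service data.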
S103, when a preset blocking condition is reached, determining the data records to be written into a data block, and generating the Nth data block containing the hash value of the data block and the data records.
The preset blocking condition includes: the number of data records to be stored reaches a number threshold, for example, a new data block is generated every time one thousand data records are received, and those one thousand data records are written into the block; or, the time elapsed since the last blocking time reaches a time threshold, for example, a new data block is generated every 5 minutes, and the data records received within those 5 minutes are written into the block.
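The two blocking conditions can be sketched as follows. This is a minimal illustration that checks both thresholds together, which is an assumption (an embodiment may use only one of them); the class name BlockingPolicy and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlockingPolicy:
    max_records: int = 1000        # number threshold from the example above
    max_interval_s: float = 300.0  # time threshold: 5 minutes

    def should_block(self, pending: int, last_block_time: float, now: float) -> bool:
        """Return True when either preset blocking condition is reached."""
        reached_count = pending >= self.max_records
        reached_time = (now - last_block_time) >= self.max_interval_s
        return reached_count or reached_time


policy = BlockingPolicy()
```

Times are passed in explicitly (in seconds) so that the decision is deterministic and easy to check.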
N here refers to the sequence number of the data block. That is, in the embodiments of the present specification, the data blocks are arranged in the form of a blockchain, ordered by blocking time, and therefore have a strong temporal ordering. The block height of the data blocks increases monotonically in the order of blocking time. The block height may be a sequence number, in which case the block height of the Nth data block is N; the block height may also be generated in other ways.
When N is 1, the data block is the initial data block. The hash value and the block height of the initial data block are assigned in a preset manner. For example, the initial data block contains no data records, its hash value is any given hash value, and its block height blknum is 0; as another example, the trigger condition for generating the initial data block is the same as that of other data blocks, but the hash value of the initial data block is determined by hashing all of the content in the initial data block.
When N > 1, the content and hash value of the previous data block have already been determined, so the hash value of the current data block (the Nth data block) may be generated based on the hash value of the previous data block (i.e., the (N-1)th data block). For example, one possible way is to determine the hash value of each data record to be written into the Nth data block, build a Merkle tree from these hash values in the order in which the records are arranged in the block, concatenate the root hash value of the Merkle tree with the hash value of the previous data block, and apply the hash algorithm again to obtain the hash value of the current block. As another example, the data records may be concatenated in their order in the block and hashed to obtain a hash value of the data records as a whole; the hash value of the previous data block is then concatenated with this overall hash value, and a hash operation on the concatenated string yields the hash value of the data block.
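The first way of computing the hash value of the Nth data block can be sketched as below. The sketch assumes SHA-256 as the hash algorithm and assumes that an odd node at any tree level is paired with a copy of itself; neither choice is specified by the embodiments, and the function names are hypothetical.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: List[bytes]) -> bytes:
    """Fold the record hashes pairwise, in block order, up to a single root."""
    if not leaf_hashes:
        return sha256(b"")  # assumed convention for an empty block
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed padding rule: duplicate the last node
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_hash(prev_block_hash: bytes, records: List[bytes]) -> bytes:
    """Hash the Merkle root concatenated with the previous block's hash."""
    root = merkle_root([sha256(r) for r in records])
    return sha256(root + prev_block_hash)
```

Because the record hashes enter the tree in block order and the previous block's hash is part of the input, changing a record, reordering records, or changing the previous block each yields a different block hash.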
After the user successfully uploads the data, the hash value of the corresponding data record and the hash value of the data block in which it is located can be obtained and saved, and integrity verification can be initiated based on these hash values. Specifically, the verification recalculates the hash value of the data record and the hash value of the data block, and compares the recalculated values with the locally saved ones.
A data block generated in the above manner may include two parts: a block header and a block body. The block body may be used to store the plaintext of the data records, the hash values of the data records, and so on; the block header may be used to store metadata about the data block, such as the version number of the ledger, the hash value of the previous data block, the root hash value of the Merkle tree composed of the data records in the data block itself, the hash value of the data block itself, a state array recording the operation state of the data records, and the like. Fig. 2 is a schematic diagram of a block header of a data block according to an embodiment of the present specification.
With the above manner of generating data blocks, each data block is identified by its hash value, which is determined by the content and order of the data records in the data block and by the hash value of the previous data block. The user can initiate integrity verification based on the hash value of a data block at any time. Any modification to the content of a data block (including modification of the content or order of the data records in it) causes the hash value calculated during verification to differ from the hash value generated when the data block was created, so that verification fails. Tamper resistance is thereby achieved under a centralized deployment.
The integrity verification includes integrity verification of a single data block: a Merkle tree is reconstructed from the hash values of the data records in the data block, its root hash value is calculated, the hash value of the data block is recalculated from that root hash value and the hash value of the previous data block, and the result is compared for consistency with the previously saved hash value of the data block.
The integrity verification may also include integrity verification of several consecutive data blocks: for each data block, the hash value is recalculated from the Merkle tree root hash value stored in its block header and the hash value of the previous data block, and is compared with the previously saved hash value of that data block.
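Verification of several consecutive data blocks from their block headers can be sketched as follows. The sketch assumes SHA-256, assumes that the block hash is the hash of the Merkle root concatenated with the previous block's hash, and assumes none of the verified blocks is the initial data block (whose hash may be preset). BlockHeader and verify_chain are hypothetical names.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class BlockHeader:
    prev_hash: bytes    # hash value of the previous data block
    merkle_root: bytes  # root hash of the Merkle tree of this block's records
    block_hash: bytes   # saved hash value of this data block


def expected_hash(header: BlockHeader) -> bytes:
    """Recalculate the block hash from the values stored in the header."""
    return hashlib.sha256(header.merkle_root + header.prev_hash).digest()


def verify_chain(headers: List[BlockHeader]) -> bool:
    """Check each saved hash and the linkage between neighbouring blocks."""
    for i, header in enumerate(headers):
        if header.block_hash != expected_hash(header):
            return False
        if i > 0 and header.prev_hash != headers[i - 1].block_hash:
            return False
    return True
```

This check only needs the block headers; verifying a single block in full would additionally rebuild the Merkle root from the record hashes in the block body.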
In the foregoing manner, the user's data records are stored in the blockchain-type ledger in a tamper-resistant manner, but they are typically scattered across a plurality of data blocks. Based on this, embodiments of the present specification provide an index creating method for data records, applied to a centralized database service provider that stores data through a blockchain-type ledger, which facilitates the business processing of the docking institution and improves user experience. Fig. 3 is a schematic flowchart of the index creating method for data records provided in an embodiment of the present specification, which specifically includes the following steps:
S301, acquiring the service attribute in a data record in the ledger.
The specific location of the service attribute in the data record and the manner of obtaining it may be negotiated in advance between the database server and the docking institution. For example, when the data records provided by the docking institution are standard structured data records, the service attribute may be obtained from a specified offset in the data record, or from between a start position and an end position marked by specific characters; alternatively, when the data records provided by the docking institution are unstructured data, the docking institution may splice a header or trailer containing the service attribute directly onto each data record when uploading, and the database server can obtain the service attribute of each data record directly from that header or trailer.
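The two ways of obtaining the service attribute from a structured record can be sketched as follows. The marker characters "<" and ">" and both function names are assumptions chosen for illustration; the actual convention would be whatever the two parties negotiate.

```python
def attribute_by_offset(record: bytes, offset: int, length: int) -> str:
    """Read the service attribute from an agreed offset and length."""
    return record[offset:offset + length].decode("ascii")


def attribute_by_markers(record: bytes, start: bytes = b"<", end: bytes = b">") -> str:
    """Read the service attribute found between agreed start and end markers."""
    begin = record.index(start) + len(start)
    finish = record.index(end, begin)
    return record[begin:finish].decode("ascii")
```

The offset variant with offset 0 also covers the case where the attribute is spliced onto the record as a fixed-length header.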
S303, determining the location information of the data record in the ledger or the hash value of the data record.
In embodiments of the present specification, a data record may be uniquely identified by its hash value. As long as the hash value was obtained from the data record, the database server can always find the corresponding data record by traversing the blockchain-type ledger.
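Lookup by hash value through traversal can be sketched as below. The ledger is modelled, as an assumption, as a list of blocks indexed by block height, each block being a list of records, and SHA-256 is assumed as the record hash; find_by_hash is a hypothetical name.

```python
import hashlib
from typing import List, Optional, Tuple


def find_by_hash(ledger: List[List[bytes]], record_hash: bytes) -> Optional[Tuple[int, int, bytes]]:
    """Traverse every block and return (block height, sequence number, record)."""
    for height, block in enumerate(ledger):
        for seq, record in enumerate(block):
            if hashlib.sha256(record).digest() == record_hash:
                return height, seq, record
    return None  # no record in the ledger has this hash
```

The traversal is linear in the size of the ledger, which is one reason a precomputed index is useful.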
Meanwhile, as mentioned above, a blockchain ledger is composed of a plurality of data blocks, and a data block usually contains a plurality of data records. Therefore, in the embodiments of the present specification, the location information specifically refers to which data block in the ledger a data record is stored on, and at what location in the data block.
In other words, the location information of a data record may also uniquely identify the data record. The location information includes a block height of a data block in which the data record is located, and an offset in the located data block.
In the data blocks provided in the embodiments of the present specification, there are many ways to identify different data blocks, including hash values or block heights of the data blocks.
The hash value of a data block is obtained by hashing the hash value of the previous block together with the data records of the data block itself, and can be used to uniquely and unambiguously identify a data block. In the blockchain-type ledger, the block height of the first data block is generally 0 and increases by 1 for each subsequent data block; alternatively, the blocking time of a data block may be converted into a large, monotonically increasing integer (typically 12 to 15 digits) that serves as the block height of the data block. Thus, a data block typically has a definite block height.
As another example, once the data block to be written into the database is determined, the ordering of the data records in it is also fixed, so the sequence number of each data record in the data block is well defined. When the data records have a fixed length, the sequence number can also be used to determine the position of a data record within its data block. That is, the sequence number may also be used to indicate the offset.
Meanwhile, since a plurality of data records are usually included in one data block, the data records in the data block can be identified by the address offset of each data record in the data block. Obviously, the address offset of each data record is not the same in the same data block.
Of course, since the specific format of the data block can be customized in the manner provided in the embodiments of the present specification (for example, the metadata information and remark information included in the block header of the data block, the form taken by the block height of the data block, and the like), the content of the location information may also be different in different formats, which does not form a limitation to the present solution.
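For the fixed-length case, the relation between sequence number and address offset can be sketched as follows. RECORD_LEN, Location, and the two functions are hypothetical, and the 64-byte record length is an arbitrary assumption.

```python
from typing import NamedTuple

RECORD_LEN = 64  # assumed fixed record length in bytes


class Location(NamedTuple):
    block_height: int
    offset: int  # byte offset of the record inside the block body


def location_from_sequence(block_height: int, seq: int) -> Location:
    """With fixed-length records, the sequence number determines the offset."""
    return Location(block_height, seq * RECORD_LEN)


def read_record(block_body: bytes, loc: Location) -> bytes:
    """Slice one record out of the body of the block named by loc.block_height."""
    return block_body[loc.offset:loc.offset + RECORD_LEN]
```

With variable-length records, the offset would have to be stored explicitly rather than derived from the sequence number.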
S305, establishing a correspondence between the service attribute and the location information or the hash value, and writing the correspondence into an index with the service attribute as the primary key.
That is, the index is an inverted index, in which the primary key is a service attribute contained in the data records. Specifically, when no primary key in the index matches the service attribute, an index record with the service attribute as its primary key is created in the index table.
When a primary key in the index already matches the service attribute, the location information or the hash value is written into the index record with that primary key. Here, writing does not mean overwriting: the location information or hash value is appended to the value of the index record and sits alongside the other location information or hash values already in it.
Table 1 is an exemplary index table provided in the embodiments of the present specification. The Key is a specific value of the service attribute, and each tuple in the Value part is a piece of location information: the first element is the block height, and the second element is the sequence number of the data record in the data block. A data record can be uniquely determined by the block height and the sequence number. It is easily understood that one key may correspond to multiple pieces of location information in the index table.
TABLE 1
Key | Value
2201102324 | (2,08),(2,10),(300,89),(300,999)
2201102325 | (5,01),(8,22)
2301102324 | (3,08),(4,11)
…… | ……
According to the scheme provided by the embodiments of the present specification, for each data record written into the ledger, the service attribute of the data record and its storage location or hash value in the ledger are determined, a correspondence between the service attribute and the storage location or hash value is established, and an inverted index with the service attribute as the primary key is created. The database server does not need to know the user's business details, and data records can be counted, queried, and verified based on the service attribute through the index.
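The write rule for the inverted index (create the index record if the key is absent, otherwise append without overwriting) can be sketched as follows. The in-memory dictionary and the name write_index are assumptions for illustration; location information is modelled as a (block height, sequence number) pair.

```python
from typing import Dict, List, Tuple

Location = Tuple[int, int]  # (block height, sequence number in the block)


def write_index(index: Dict[str, List[Location]], attr: str, loc: Location) -> None:
    """Add one correspondence between a service attribute and a location."""
    if attr not in index:
        index[attr] = []     # no such primary key yet: create the index record
    index[attr].append(loc)  # append alongside existing values, never overwrite


index: Dict[str, List[Location]] = {}
for attr, loc in [
    ("2201102324", (2, 8)), ("2201102324", (2, 10)),
    ("2201102325", (5, 1)), ("2301102324", (3, 8)),
    ("2201102324", (300, 89)), ("2201102324", (300, 999)),
    ("2201102325", (8, 22)), ("2301102324", (4, 11)),
]:
    write_index(index, attr, loc)
```

After these writes the dictionary holds the same three keys and values as the exemplary index table, with sequence numbers stored as plain integers.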
Further, the primary keys in the index may also be sorted. Specifically, the primary keys in the index are arranged according to the order of the characters contained in the service attribute. For example, "2201102324" and "2201102325" are arranged in the index table in numerical order; if a new correspondence needs to be written into the index and its service attribute is "2201102326", it is inserted between "2201102325" and "2301102324" in table 1. In this way, data with similar service attributes sit closer together in the index, which facilitates clustered queries by users and improves efficiency.
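Keeping the primary keys sorted can be sketched with a binary search for the insertion point. The sketch assumes the keys are equal-length digit strings, so that character order and numerical order coincide; insert_sorted_key is a hypothetical name.

```python
import bisect
from typing import List


def insert_sorted_key(keys: List[str], new_key: str) -> int:
    """Insert new_key into the sorted key list if absent; return its position."""
    pos = bisect.bisect_left(keys, new_key)
    if pos == len(keys) or keys[pos] != new_key:
        keys.insert(pos, new_key)
    return pos


keys = ["2201102324", "2201102325", "2301102324"]
position = insert_sorted_key(keys, "2201102326")
```

Because keys that share a prefix are adjacent in a sorted list, a prefix query can be served by scanning one contiguous range of the index.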
After the index table is created, queries and statistics on the service attribute can be performed based on it. As shown in fig. 4, which is a schematic flowchart of a query method in a blockchain-type ledger provided in an embodiment of the present specification, applied to a centralized database service provider that stores data through the blockchain-type ledger, the process includes:
S401, receiving a query instruction which is sent by a client and contains a field to be queried.
For example, after table 1 is created, the user enters the query instruction Retrieve (220%, & v, FULL) to query the blockchain-type ledger for the data records related to company number "220". Here, 220% is the field to be queried, and % represents a wildcard.
In one embodiment, the user may also specify a matching length in the query instruction, for example, Retrieve (220110%, Length = 8, & v, FULL), which indicates a query for data records whose service attribute contains the string "220110" within its first 8 characters.
If the user wants finer query granularity, the user can enter a query instruction containing a longer field to be queried. For example, Retrieve (220110%, & v, FULL) indicates a query for the business data of company number 220 in the Beijing area. Because the characters in the service attribute are user-defined, the user can adjust the query granularity as needed to meet actual requirements.
S403, determining, in a pre-established index table, a service attribute that matches the field to be queried from the start, wherein the index table comprises correspondences between service attributes and location information, and the service attribute comprises characters representing features of different granularity.
Specifically, the database server performs the matching against index table 1. Matching proceeds character by character starting from the first character of the primary key in the index. If the user specifies a matching length in the query instruction, that length prevails; otherwise, the length of the field to be queried is used as the matching length.
When the user does not specify a matching length, exact matching is performed over the length of the field to be queried. For example, for the query instruction Retrieve (220%, & v, FULL), the index table is searched for primary keys whose first 3 characters are "220".
When the user specifies a matching length, matching is performed within the specified length. For example, for the query instruction Retrieve (220110%, Length = 8, & v, FULL), the index table is searched for primary keys that contain "220110" within their first 8 characters.
Further, the user may also perform an approximate query. For example, the user enters the query instruction Retrieve (220110%, & v, Proxi) to search for primary keys whose first 6 characters are similar to "220110". The approximation criterion may be preset in the database server; for example, characters are compared one by one, and if the number of differing characters does not exceed a preset value (for example, 1 character, or 0.1 times the length of the field to be queried, rounded up to an integer), the primary key is regarded as an approximate match. In this manner, other related data surrounding the query target may also be retrieved.
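The three matching modes can be sketched as follows. The sketch assumes the trailing wildcard % has already been stripped from the field to be queried and uses the "0.1 times the length, rounded up" criterion for the approximate mode; the function names are hypothetical.

```python
def match_full(key: str, field: str) -> bool:
    """Exact match over the length of the field, starting at the first character."""
    return key.startswith(field)


def match_within(key: str, field: str, length: int) -> bool:
    """Match when the field occurs within the first `length` characters of the key."""
    return field in key[:length]


def match_approx(key: str, field: str) -> bool:
    """Match when the key's head differs from the field in few enough characters."""
    head = key[:len(field)]
    if len(head) < len(field):
        return False
    allowed = (len(field) + 9) // 10  # ceil(0.1 * len(field)) in integer arithmetic
    differing = sum(a != b for a, b in zip(head, field))
    return differing <= allowed
```

Integer arithmetic is used for the rounding so that the allowed number of differing characters does not depend on floating-point error.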
S405, obtaining, in the index table, the location information or the hash value corresponding to the matched service attribute.
After the matching service attribute is determined, the corresponding location information or hash value may be obtained from the index table. Taking table 1 as an example, for the instruction Retrieve (220%, & v, FULL) entered by the user, the matched service attributes are "2201102324" and "2201102325", so the corresponding location information is: (2,08), (2,10), (300,89), (300,999), (5,01) and (8,22).
S407, querying and obtaining the corresponding data record from the blockchain-type ledger according to the location information or the hash value, wherein the data record contains the matched service attribute.
As previously mentioned, in a blockchain-type ledger, location information or a hash value always uniquely identifies a data record. Therefore, the database server can look up the corresponding data record based on the obtained location information or hash value. Because the granularity of the query is determined by the user's query instruction, the retrieved data records satisfy that instruction; that is, the service attribute in each retrieved data record always matches the field to be queried in the query instruction entered by the user.
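Steps S403 to S407 can be sketched end to end for the exact-prefix case. The ledger is modelled, as an assumption, as a mapping from block height to the list of records in that block, and the index as a mapping from service attribute to (block height, sequence number) pairs; retrieve is a hypothetical function and not the Retrieve instruction itself.

```python
from typing import Dict, List, Tuple


def retrieve(index: Dict[str, List[Tuple[int, int]]],
             ledger: Dict[int, List[bytes]],
             field: str) -> List[bytes]:
    """Match keys by prefix, collect their locations, then fetch the records."""
    results = []
    for key in sorted(index):
        if key.startswith(field):                 # S403: match from the start
            for height, seq in index[key]:        # S405: location information
                results.append(ledger[height][seq])  # S407: fetch from ledger
    return results


ledger = {2: [b"2201102324-a", b"2301102324-b"], 5: [b"2201102325-c"]}
index = {"2201102324": [(2, 0)], "2301102324": [(2, 1)], "2201102325": [(5, 0)]}
found = retrieve(index, ledger, "220")
```

Only the blocks named by the index are touched, so the query avoids traversing the whole ledger.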
According to the scheme provided by the embodiments of the present specification, the database server creates an index in advance according to the user-defined service attribute, matches service attributes at the query granularity given by the user instruction, obtains from the index the location information or hash values that match the instruction, and then retrieves the corresponding data records accordingly. The query requirements of the user at different granularities are thereby met, and user experience is improved.
Correspondingly, an embodiment of the present specification further provides a query apparatus in a blockchain-type ledger. As shown in fig. 5, which is a schematic structural diagram of the query apparatus in a blockchain-type ledger provided in the embodiment of the present specification, the apparatus is applied to a centralized database service provider that stores data in a blockchain-type ledger, and includes:
a receiving module 501, configured to receive a query instruction containing a field to be queried, which is sent by a client;
a determining module 503, configured to determine, in a pre-established index table, a service attribute that matches the field to be queried from the start, where the index table includes correspondences between service attributes and location information, and the service attribute includes characters representing features of different granularity;
an obtaining module 505, configured to obtain, in the index table, the location information or hash value corresponding to the matched service attribute;
a query module 507, configured to query and obtain the corresponding data record from the blockchain-type ledger according to the location information or the hash value, where the data record contains the matched service attribute.
Further, the determining module 503 takes, from the start of the service attribute, characters of the same length as the field to be queried, and determines a service attribute whose characters are the same as or similar to the field to be queried as a service attribute matching the field to be queried.
Further, the apparatus includes an index creating module 509, configured to establish the index in advance by: obtaining the service attribute in a data record in the ledger; determining the location information of the data record in the ledger or the hash value of the data record; and establishing a correspondence between the service attribute and the location information or the hash value, and writing the correspondence into an index with the service attribute as the primary key.
Further, the apparatus includes a data block generating module 511, configured to receive data records to be stored and determine the hash value of each data record, where each data record includes a service attribute; and, when a preset blocking condition is reached, determine the data records to be written into a data block and generate the Nth data block containing the hash value of the data block and the data records, which specifically includes: when N is 1, the hash value and the block height of the initial data block are assigned in a preset manner; and when N is greater than 1, the hash value of the Nth data block is determined according to the data records to be written into the data block and the hash value of the (N-1)th data block, and the Nth data block containing the hash value of the Nth data block and the data records is generated, where the block height of the data blocks increases monotonically in the order of blocking time.
Further, the preset blocking condition includes: the number of data records to be stored reaches a number threshold; or, the time elapsed since the last blocking time reaches a time threshold.
Further, the index creating module 509 arranges the primary keys in the index according to the order of the characters contained in the service attribute, and writes the correspondence into the sorted index with the service attribute as the primary key.
Embodiments of the present specification also provide a computer device, which at least includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the query method shown in fig. 4 when executing the program.
Fig. 6 is a schematic diagram illustrating a more specific hardware structure of a computing device according to an embodiment of the present disclosure, where the computing device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The input/output module may be configured as a component in the device (not shown) or may be external to the device to provide the corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Embodiments of the present specification also provide a computer-readable storage medium on which a computer program is stored, where the computer program is executed by a processor to implement the query method shown in fig. 4.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory media such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present specification can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may, in essence or in part, be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present specification.
The systems, methods, modules or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and reference may be made to the corresponding description of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative: the modules described as separate components may or may not be physically separate, and the functions of the modules may be implemented in one or more pieces of software and/or hardware when implementing the embodiments of the present specification. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
The foregoing is only a specific implementation of the embodiments of the present specification. It should be noted that those skilled in the art can make a number of improvements and refinements without departing from the principle of the embodiments of the present specification, and these improvements and refinements should also be regarded as falling within the protection scope of the embodiments of the present specification.