Detailed Description
To help those skilled in the art better understand the technical solutions in the embodiments of the present specification, these technical solutions are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present specification. All other embodiments that one of ordinary skill in the art can derive from the embodiments given herein are intended to fall within the scope of protection.
First, the centralized blockchain-type ledger in the embodiments of the present specification is described. The database server serves various institutions, and an institution can record the data generated between itself and third-party users (including other institutions or individuals) on the database server for storage. Fig. 1 is a schematic diagram of a system architecture according to an embodiment of the present specification. In this architecture, one enterprise may serve multiple users, and each user may query the database server through its corresponding enterprise.
For example, the institution that the database server interfaces with may be a financial product company, in which case the data records may be the financial records of individual users at that company; or the institution may be a government department, in which case the data records are expenditure details of the public projects managed by that department; or the institution may be a hospital, in which case the data records are the medical records of patients; or the institution may be a third-party payment institution, in which case the data records may be the payment records of individual users made through that institution; and so on.
In a centralized database server, a blockchain-type ledger is generated as follows. Fig. 2 is a schematic flowchart of generating a blockchain-type ledger provided by an embodiment of this specification, and the flow includes:
S201, receiving data records to be stored, and determining the hash value of each data record, wherein each data record comprises a service attribute and a number.
The data records to be stored here may be various consumption records of individual users of the client, and also may be business results, intermediate states, operation records, and the like generated by the application server when executing business logic based on instructions of the users. Specific business scenarios may include consumption records, audit logs, supply chains, government regulatory records, medical records, and the like.
The number here is a string of values determined by the client. Specifically, it may be determined in either of the following ways:
First, the number is specified by the user through the client. For example, the client interface provides a writable number field, the user fills in the number, and the client generates a data record containing the service attribute and the number filled in by the user. For instance, the user fills in numbers following the chronological order of his or her medical records, which yields a plurality of medical records that each contain a number. In this way, the numbers directly reflect the order of the data records uploaded by the user.
Second, the client determines the number based on preset service logic. For example, a counter used for numbering is preset in the client; each time the user uploads a data record through the client, the counter value is automatically incremented by 1 and written into the data record. In this way, the number reflects the order in which the user uploaded records on the client, and indirectly expresses the order of the data records uploaded by the user.
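The second, counter-based numbering scheme can be sketched as follows. This is only an illustration under stated assumptions: the class name NumberingClient, the record field names, and the three-digit zero padding are hypothetical and are not specified by the embodiments.

```python
import itertools


class NumberingClient:
    """Hypothetical client that stamps each uploaded record with a counter value."""

    def __init__(self, service_attribute: str, width: int = 3):
        self.service_attribute = service_attribute
        self.width = width                  # zero-padding width (an assumption)
        self._counter = itertools.count(1)  # advances by 1 on every upload

    def build_record(self, payload: str) -> dict:
        # The number is written into the record together with the service attribute.
        number = str(next(self._counter)).zfill(self.width)
        return {
            "service_attribute": self.service_attribute,
            "number": number,
            "payload": payload,
        }


client = NumberingClient("Bingli")
records = [client.build_record(p) for p in ("first visit", "follow-up", "discharge")]
numbers = [r["number"] for r in records]
```

Because the counter lives on the client, the numbers follow the upload order on that client regardless of the order in which the records later reach the ledger.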
S203, when a preset blocking condition is reached, determining the data records to be written into a data block, and generating the Nth data block containing the hash value of the data block and the data records.
The preset blocking condition includes: the number of data records to be stored reaches a number threshold, for example, a new data block is generated each time one thousand data records are received, and the one thousand data records are written into that block; or, the time elapsed since the last blocking time reaches a time threshold, for example, a new data block is generated every 5 minutes, and the data records received within those 5 minutes are written into that block.
N here refers to the sequence number of the data block. In other words, in the embodiments of the present specification, the data blocks are arranged in the form of a blockchain, ordered by blocking time, and have a strong timing characteristic. The block height of the data blocks increases monotonically with the order of blocking time. The block height may be a sequence number, in which case the block height of the Nth data block is N; the block height may also be generated in other ways.
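The two blocking conditions can be combined in a small policy object, as in the sketch below. The class name BlockingPolicy and its parameters are hypothetical; the injected clock is only there so that the time threshold can be exercised without waiting.

```python
import time


class BlockingPolicy:
    """Decides when a new data block should be generated (illustrative only)."""

    def __init__(self, max_records=1000, max_interval_s=300.0, clock=time.monotonic):
        self.max_records = max_records        # number threshold
        self.max_interval_s = max_interval_s  # time threshold, in seconds
        self.clock = clock
        self.last_block_time = clock()

    def should_cut_block(self, pending_count: int) -> bool:
        if pending_count >= self.max_records:
            return True
        return self.clock() - self.last_block_time >= self.max_interval_s

    def mark_block_cut(self) -> None:
        # Call after a block has been generated to restart the interval.
        self.last_block_time = self.clock()


now = [0.0]  # fake clock value, for demonstration
policy = BlockingPolicy(max_records=3, max_interval_s=300.0, clock=lambda: now[0])
not_yet = policy.should_cut_block(pending_count=2)
by_count = policy.should_cut_block(pending_count=3)  # number threshold reached
now[0] = 300.0
by_time = policy.should_cut_block(pending_count=1)   # time threshold reached
```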
When N is 1, the data block at this time is the initial data block. The hash value and the block height of the initial data block are given based on a preset mode. For example, the initial data block does not contain data records, the hash value is any given hash value, and the block height blknum is 0; for another example, the trigger condition for generation of the initial data block is consistent with the trigger conditions of other data blocks, but the hash value of the initial data block is determined by hashing all of the contents in the initial data block.
When N > 1, the content and hash value of the previous data block have already been determined, so the hash value of the current data block (the Nth data block) may be generated based on the hash value of the previous data block (that is, the (N-1)th data block). For example, one feasible way is to determine the hash value of each data record to be written into the Nth data block, generate a Merkle tree according to the order of the data records in the block, concatenate the root hash value of the Merkle tree with the hash value of the previous data block, and apply the hash algorithm again to generate the hash value of the current block. As another example, the data records may be concatenated in their order in the block and hashed to obtain the hash value of the data records as a whole; the hash value of the previous data block is then concatenated with this overall hash value, and a hash operation on the concatenated string yields the hash value of the data block.
A data block generated in the above manner may include two parts, a block header and a block body. The block body may be used to store the plaintext of the concatenated data, or the hash value of the concatenated data, and the like. The block header may be used to store metadata about the data block, such as the version number of the ledger, the hash value of the previous data block, the root hash value of the Merkle tree composed of the concatenated data in the data block itself, the hash value of the data block itself, a state array for recording the operated state of the concatenated data, and the like. Fig. 3 is a schematic diagram of a block header of a data block according to an embodiment of the present specification.
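The Merkle-tree variant of the block hash can be sketched as follows. The choice of SHA-256 and the rule of duplicating the last node on odd-sized levels are assumptions made for illustration; the embodiments do not fix a hash algorithm or a tree-padding rule.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: list) -> bytes:
    """Root hash of a Merkle tree built over leaf hashes in block order."""
    if not leaf_hashes:
        return sha256(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node (assumed padding rule)
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_hash(prev_block_hash: bytes, records: list) -> bytes:
    """Hash of the Nth block: H(merkle_root || hash of the (N-1)th block)."""
    root = merkle_root([sha256(r) for r in records])
    return sha256(root + prev_block_hash)


prev = sha256(b"initial block")
h = block_hash(prev, [b"record-a", b"record-b", b"record-c"])
h_reordered = block_hash(prev, [b"record-b", b"record-a", b"record-c"])
```

Since the leaves enter the tree in block order and the previous block hash is mixed in, changing a record, reordering records, or changing any earlier block all produce a different hash for the current block.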
After the user successfully uploads the data, the hash value of the corresponding data record and the hash value of the located data block can be obtained and stored, and integrity verification can be initiated based on the hash values. The specific verification mode is to recalculate the hash value of the data record and the hash value of the data block in the database, and compare the calculated hash values with those stored locally.
With the above manner of generating data blocks, each data block is identified by its hash value, and the hash value of a data block is determined by the content and order of the data records in the data block and by the hash value of the previous data block. A user can initiate verification based on the hash value of a data block at any time. Modifying any content in a data block (including the content or order of its data records) makes the hash value calculated during verification inconsistent with the hash value generated when the data block was created, so verification fails. Tamper resistance is thereby achieved under centralization.
When a blockchain-type ledger is verified, usually either a designated segment of consecutive data blocks is verified for integrity, or verification proceeds continuously from the initial data block. For each data block, the hash value of the previous data block is obtained, and the hash value of the data block is recalculated from its own data records and the hash value of the previous data block, using the same algorithm as was used when the hash value of the data block was generated. The recalculated value is then compared with the recorded hash value of the data block.
As mentioned above, although the data records have a strong timing characteristic when written into the ledger, the data records of a given user are usually scattered across multiple data blocks. For example, a hospital uploads medical records for all of its patients. Even if a user uploads medical records in the order in which they were generated, the records may end up scattered across multiple data blocks because of network delay and the like, and the order of the medical records in the ledger may not match the upload order. Queries are generally performed in ledger order, which is inconvenient for the user. For this reason, the embodiments of the present specification further provide an index creation method that meets the user's need for ordered queries.
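Continuous integrity verification over a segment of blocks can be sketched as below, using the simpler concatenation variant of the block hash. The dictionary layout of a block, the all-zero hash given to the initial block, and the function names are assumptions for illustration.

```python
import hashlib


def block_hash(prev_hash: str, records: list) -> str:
    # Hash the concatenated records, then hash (previous block hash + that hash).
    body_hash = hashlib.sha256("".join(records).encode()).hexdigest()
    return hashlib.sha256((prev_hash + body_hash).encode()).hexdigest()


def build_ledger(batches, initial_hash="0" * 64):
    """Initial block at height 0 with a preset hash, then one block per batch."""
    ledger = [{"height": 0, "records": [], "hash": initial_hash}]
    for records in batches:
        prev = ledger[-1]
        ledger.append({
            "height": prev["height"] + 1,
            "records": list(records),
            "hash": block_hash(prev["hash"], records),
        })
    return ledger


def verify_segment(ledger, start=1, end=None):
    """Return the height of the first block that fails verification, or None."""
    end = len(ledger) - 1 if end is None else end
    for height in range(start, end + 1):
        block = ledger[height]
        expected = block_hash(ledger[height - 1]["hash"], block["records"])
        if expected != block["hash"]:
            return height
    return None


ledger = build_ledger([["r1", "r2"], ["r3"], ["r4", "r5"]])
intact = verify_segment(ledger)          # no block fails
ledger[2]["records"][0] = "tampered"     # modify a record inside block 2
first_bad = verify_segment(ledger)       # block 2 no longer matches its hash
```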
As shown in fig. 4, fig. 4 is a schematic flowchart of an index creating method provided in the embodiment of the present specification, where the flowchart specifically includes the following steps:
S401, acquiring the service attribute and the number contained in a data record, wherein the number is determined by the client.
The location of the service attribute in a data record and the manner of acquiring it may be negotiated in advance between the database server and the interfacing institution. For example, when the data records provided by the interfacing institution are standard structured data records, the service attribute may be obtained at a specified offset within the data record, or from between a start position and an end position marked by specific characters. Alternatively, when the data records provided by the interfacing institution are unstructured data, the interfacing institution may prepend a header containing the service attribute and the number to each data record when uploading it, and the database server can then read the service attribute and the number of each data record directly from the header.
S403, determining the position information of the data record in the ledger, wherein the position information comprises the block height of the data block where the data record is located and the offset within that data block.
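For the unstructured case, reading the service attribute and number from a prepended header might look like the sketch below. The header layout "attribute:number" followed by a newline is purely an assumed convention; in practice the format is whatever the database server and the interfacing institution have agreed on.

```python
def split_header(raw: bytes, sep: bytes = b"\n", field_sep: bytes = b":"):
    """Split an uploaded record into (service_attribute, number, body).

    Assumes the uploader prepended a header of the form b"<attribute>:<number>\\n".
    """
    header, _, body = raw.partition(sep)
    attribute, found, number = header.partition(field_sep)
    if not found or not attribute or not number:
        raise ValueError("record does not start with an '<attribute>:<number>' header")
    return attribute.decode(), number.decode(), body


attribute, number, body = split_header(b"Bingli:001\n<scanned image bytes>")
```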
As mentioned above, a block-chained ledger is composed of a plurality of data blocks, and a data block usually contains a plurality of data records. Therefore, in the embodiments of the present specification, the location information specifically refers to which data block in the ledger a data record is stored on, and at what location in the data block.
In the data blocks provided in the embodiments of the present specification, there are many ways to identify different data blocks, including hash values or block heights of the data blocks.
The hash value of a data block is obtained by hashing the hash value of the previous block together with the data records of the data block itself, and can be used to uniquely and unambiguously identify a data block. In a blockchain-type ledger, the block height of the first data block is generally 0 and is increased by 1 for each subsequent data block; alternatively, the blocking time of a data block may be converted into a large, monotonically increasing integer (typically 12 to 15 digits) that serves as the block height of the data block. Thus, a data block typically has a definite block height.
As another example, within a given data block to be written into the database, the order of the data records is fixed, so the sequence number of each data record in the data block is also definite. When data records have a fixed length, the sequence number in the data block is enough to locate the data record within the block. That is, the sequence number in the data block may also be used to indicate the offset.
Meanwhile, since a plurality of data records are usually included in one data block, the data records in the data block can be identified by the address offset of each data record in the data block. Obviously, the address offset of each data record is not the same in the same data block.
Of course, since the specific format of the data block can be customized in the manner provided in the embodiments of the present specification (for example, the metadata information and remark information included in the block header of the data block, the form taken by the block height of the data block, and the like), the content of the location information may also be different in different formats, which does not form a limitation to the present solution.
S405, generating a combined field containing the service attribute and the number.
Specifically, the service attribute and the number may be directly spliced, and may be in the form of service attribute + number, or may be in the form of number + service attribute.
S407, establishing the corresponding relation between the merged field and the position information, and writing an index taking the merged field as a main key.
Table 1 is an exemplary index table provided in the embodiments of the present specification. The Key is a merged field containing the service attribute and the number. The Value is an array holding one piece of position information: the first element is the block height, and the second element is the sequence number of the data record within that data block. A data record can be uniquely determined by its block height and sequence number, so each key in the index table corresponds to exactly one piece of position information. In the example, "Bingli" is the service attribute and "001" is the number given by the client.
TABLE 1
Key | Value
Bingli001 | (2,08)
Bingli003 | (2,10)
Bingli002 | (3,99)
…… | ……
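Steps S401 to S407 can be sketched together as follows. The block layout (a dictionary with "height" and "records") and the use of the record's sequence number as its offset are assumptions for illustration; the positions in this small example therefore differ from the values shown in Table 1.

```python
def merged_field(service_attribute: str, number: str) -> str:
    # S405: splice the number after the service attribute.
    return service_attribute + number


def build_index(ledger) -> dict:
    """Map each merged field (primary key) to (block height, offset in block)."""
    index = {}
    for block in ledger:
        for offset, record in enumerate(block["records"]):
            # S401: service attribute and number; S403: position information.
            key = merged_field(record["service_attribute"], record["number"])
            if key in index:
                raise KeyError(f"duplicate primary key: {key}")
            index[key] = (block["height"], offset)  # S407
    return index


ledger = [
    {"height": 2, "records": [
        {"service_attribute": "Bingli", "number": "001"},
        {"service_attribute": "Bingli", "number": "003"},
    ]},
    {"height": 3, "records": [
        {"service_attribute": "Bingli", "number": "002"},
    ]},
]
index = build_index(ledger)
```

Rejecting a duplicate key is one possible design choice; it keeps the one-to-one correspondence between a key and a position that the index table relies on.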
In an embodiment, the database server may create the index synchronously: when a data record is received, the service attribute and the number are obtained directly by parsing and the merged field is generated, and the index is created at the same time as the data block is written into the ledger. Alternatively, the index need not be created immediately after a data block is written into the ledger. Instead, when the database has spare resources, the service attribute and the number are obtained for each data record in the newly written data blocks and the merged fields are generated, so that the index is created asynchronously. Asynchronous index creation helps save resources of the database server.
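The asynchronous variant can be sketched with a simple backlog of blocks that are indexed later. The class name AsyncIndexer and the explicit drain call are hypothetical; how the server detects spare resources is outside the scope of this sketch.

```python
from collections import deque


class AsyncIndexer:
    """Defers index creation until drain() is called (illustrative only)."""

    def __init__(self):
        self.index = {}
        self._backlog = deque()

    def on_block_written(self, block) -> None:
        # Write path: only remember the block, do no index work yet.
        self._backlog.append(block)

    def drain(self, max_blocks=None) -> int:
        """Index up to max_blocks pending blocks; return how many were processed."""
        done = 0
        while self._backlog and (max_blocks is None or done < max_blocks):
            block = self._backlog.popleft()
            for offset, record in enumerate(block["records"]):
                key = record["service_attribute"] + record["number"]
                self.index[key] = (block["height"], offset)
            done += 1
        return done


indexer = AsyncIndexer()
indexer.on_block_written(
    {"height": 2, "records": [{"service_attribute": "Bingli", "number": "001"}]}
)
pending_size = len(indexer.index)  # still empty: nothing indexed yet
processed = indexer.drain()        # run when the server has spare resources
```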
In one embodiment, the number may be concatenated after the service attribute, and a merge field containing the service attribute and the number may be generated. In this way, the merged fields may be sorted according to the numbers included in the merged fields, and the merged fields are sequentially written into the index using the merged fields as the primary key.
For example, the primary keys "Bingli001", "Bingli003" and "Bingli002" in Table 1 are sorted according to their numbers, so that the primary keys containing the same service attribute are arranged in the order "Bingli001", "Bingli002", "Bingli003". This order is based on the numbers determined by the client. In other words, the order of the user's data records is already reflected in the index.
According to the scheme provided by the embodiments of the present specification, for a data record written into the ledger, the service attribute and the number of the data record are determined, a merged field containing the service attribute and the number is generated, the storage position in the ledger is determined, and the correspondence between the merged field and the position information is established, so that the index contains the number information specified by the user.
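Sorting the primary keys of one service attribute by number can be sketched as follows, using the values of Table 1. The sketch assumes that the number is a string of decimal digits spliced directly after the service attribute; keys whose suffix is not numeric are skipped.

```python
def sort_index(index: dict, service_attribute: str) -> list:
    """Return (key, position) pairs for one service attribute, ordered by number."""
    n = len(service_attribute)
    entries = [
        (key, pos) for key, pos in index.items()
        if key.startswith(service_attribute) and key[n:].isdigit()
    ]
    return sorted(entries, key=lambda kv: int(kv[0][n:]))


table1 = {"Bingli001": (2, 8), "Bingli003": (2, 10), "Bingli002": (3, 99)}
ordered = sort_index(table1, "Bingli")
ordered_keys = [key for key, _ in ordered]
```

Comparing the numeric value rather than the raw string keeps the order correct even when numbers are not zero-padded to the same width.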
On the other hand, after the index table is created, the status query and statistics of the service attributes can be performed based on the index table. As shown in fig. 5, fig. 5 is a schematic flowchart of a data query method based on the foregoing index according to an embodiment of the present disclosure, where the method includes:
S501, receiving a query instruction containing a service attribute.
For example, a query request containing a specific value of the service attribute is received (in general, the query request may be sent in the form of an instruction). The query request may come from the interfacing institution or from a service user of the interfacing institution. The database server can then perform matching in the index table according to the specific value of the service attribute. For example, after Table 1 is created, the user inputs the query instruction Retrieve(bingli, &v, FULL) to the server through the client.
S503, according to the service attribute, matching is carried out from an index table, and the position information corresponding to each main key containing the service attribute in the index table is determined.
In this query manner, the database server first finds all primary keys containing the service attribute. In the example of Table 1, "Bingli001", "Bingli003", "Bingli002" and so on are obtained, and the position information corresponding to each primary key, namely (2,08), (2,10) and (3,99), is determined in turn.
S505, acquiring the corresponding data records from the ledger according to the position information, and returning the acquired data records to the sender of the query instruction.
In this manner, the data records are read in the order of the primary keys in the index, that is, the records at the positions of "Bingli001", "Bingli003" and "Bingli002" are read in that order, and the data records returned to the user are likewise arranged in the order "Bingli001", "Bingli003", "Bingli002".
As mentioned above, the primary keys in the index table can be reordered according to their numbers. When the primary keys are arranged in number order, the data records obtained in this manner are also arranged in number order, which meets the user's need.
In another query manner, when the database server obtains from the index table each primary key containing the service attribute in the instruction, it can also extract the number directly from the primary key. Based on the one-to-one correspondence between primary keys and data records, it then sorts the queried data records by number to obtain an ordered data record set (that is, the data records are arranged in the order of the numbers they contain) and returns the ordered data record set to the sender of the query instruction. In this manner, ordered querying of data records can be achieved even if the primary keys in the index table have not been reordered.
In addition, the user can directly input a query instruction containing a number. In this case, the database server determines the corresponding primary key according to the number and the service attribute. In an embodiment, the query instruction contains only one number, such as Retrieve(bingli, 001, &v); the corresponding primary key is then determined to be "Bingli001", and the position information (2,08) is found. In another embodiment, the query instruction may contain one or more number intervals, such as Retrieve(bingli, (001, 007), (100, 110), &v); the corresponding primary keys are then determined to be "Bingli001" to "Bingli007" and "Bingli100" to "Bingli110", and the data is still returned in number order, thereby implementing accurate and ordered queries in the blockchain-type ledger.
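Query-time sorting can be sketched as below: the index is left in ledger order, and the numbers are extracted from the matching primary keys when the query is served. The function name retrieve, the simplified block store (block height mapped to a list of records), and the assumption of purely numeric numbers are all illustrative.

```python
def retrieve(index: dict, blocks: dict, service_attribute: str) -> list:
    """Return the records of one service attribute, ordered by their numbers."""
    n = len(service_attribute)
    # S503: match primary keys containing the service attribute.
    matches = [
        (key, pos) for key, pos in index.items()
        if key.startswith(service_attribute) and key[n:].isdigit()
    ]
    # Sort by the number embedded in each primary key.
    matches.sort(key=lambda kv: int(kv[0][n:]))
    # S505: fetch each record from the ledger by (block height, offset).
    return [blocks[height][offset] for _, (height, offset) in matches]


blocks = {2: ["record for 001", "record for 003"], 3: ["record for 002"]}
index = {"Bingli001": (2, 0), "Bingli003": (2, 1), "Bingli002": (3, 0)}
result = retrieve(index, blocks, "Bingli")
```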
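Expanding number intervals into primary keys can be sketched as follows. The three-digit zero padding and the function names are assumptions; keys that do not exist in the index are simply skipped in this sketch.

```python
def expand_keys(service_attribute: str, intervals, width: int = 3) -> list:
    """Primary keys for inclusive number intervals, in number order."""
    keys = []
    for low, high in sorted(intervals):
        for n in range(low, high + 1):
            keys.append(f"{service_attribute}{n:0{width}d}")
    return keys


def lookup(index: dict, service_attribute: str, intervals) -> list:
    """(key, position) pairs for the keys in the intervals that exist in the index."""
    return [
        (key, index[key])
        for key in expand_keys(service_attribute, intervals)
        if key in index
    ]


keys = expand_keys("Bingli", [(1, 7), (100, 110)])
table1 = {"Bingli001": (2, 8), "Bingli003": (2, 10), "Bingli002": (3, 99)}
hits = lookup(table1, "Bingli", [(1, 2)])
```

A query with a single number is the special case of an interval whose two ends are equal, for example lookup(table1, "Bingli", [(1, 1)]).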
Correspondingly, an embodiment of the present specification further provides an index creating apparatus, which is applied to a centralized database server that stores data in a blockchain-type ledger. Fig. 6 is a schematic structural diagram of an index creating apparatus provided in an embodiment of the present specification, and the apparatus includes:
an obtaining module 601, configured to obtain a service attribute and a number included in a data record, where the number is determined by a client;
a position information determining module 603, configured to determine position information of the data record in the ledger, where the position information includes a block height of a data block where the data record is located and an offset in the located data block;
a generating module 605 for generating a merged field containing the service attribute and the number;
the writing module 607 establishes the corresponding relationship between the merged field and the location information, and writes an index using the merged field as a primary key.
Further, the generating module 605 concatenates the number after the service attribute, and generates a merge field containing the service attribute and the number.
Further, the generating module 605 sorts the merged fields according to the numbers contained in the merged fields, and sequentially writes the merged fields into the index using the merged fields as the primary key.
Further, the apparatus further includes a data block generating module 609, which receives data records to be stored and determines a hash value of each data record, where each data record includes a service attribute and a number; when a preset blocking condition is reached, it determines the data records to be written into a data block and generates the Nth data block containing the hash value of the data block and the data records, specifically comprising:
when N is 1, the hash value and the block height of the initial data block are given based on a preset mode;
and when N is greater than 1, determining the hash value of the Nth data block according to the hash values of the data records to be written in the data block and the (N-1) th data block, and generating the Nth data block comprising the hash value of the Nth data block and the data records, wherein the block height of the data block is monotonically increased based on the sequence of the blocking time.
Further, the preset blocking condition includes: the number of data records to be stored reaches a number threshold; alternatively, the time interval from the last chunking time reaches a time threshold.
In another aspect, an embodiment of the present specification further provides an apparatus for querying based on the foregoing index, as shown in fig. 7, fig. 7 is a schematic structural diagram of an apparatus for querying provided by an embodiment of the present specification, and includes:
an instruction receiving module 701, which receives a query instruction containing a service attribute;
a matching module 703, configured to perform matching from an index table according to the service attribute, and determine location information corresponding to each primary key that includes the service attribute in the index table;
the data obtaining module 705 obtains the corresponding data record from the ledger according to the location information, and returns the obtained data record to the sender of the query instruction.
Further, the apparatus further includes a number determining module 707, configured to determine, before acquiring a corresponding data record from the ledger according to the location information, a number included in each primary key including the service attribute in the index table, and correspondingly, the data acquiring module 705 sorts, according to the number, the queried data records to generate an ordered data record set; and returning the ordered data record set to a query instruction sender.
Further, the instruction receiving module 701 receives a query instruction containing a service attribute and a number; correspondingly, the matching module 703 determines a primary key according to the service attribute and the number, performs matching in the index table, and obtains from the index table the position information corresponding to that primary key.
Embodiments of the present specification further provide a computer device, which at least includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the index creation method shown in fig. 4 when executing the program.
Embodiments of the present specification further provide a computer device, which at least includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the data query method shown in fig. 5 when executing the program.
Fig. 8 is a schematic diagram illustrating a more specific hardware structure of a computing device according to an embodiment of the present disclosure, where the computing device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Embodiments of the present specification also provide a computer-readable storage medium on which a computer program is stored, where the computer program is executed by a processor to implement the index creation method shown in fig. 4.
Embodiments of the present specification also provide a computer-readable storage medium on which a computer program is stored, where the computer program is executed by a processor to implement the data query method shown in fig. 5.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
From the above description of the embodiments, it is clear to those skilled in the art that the embodiments of the present disclosure can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the embodiments of the present specification may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The systems, methods, modules or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts of the description of the method embodiment may be referred to. The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and when the embodiments of the present specification are implemented, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Some or all of the modules may also be selected according to actual needs to achieve the purpose of the scheme of the embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
The foregoing is only a specific implementation of the embodiments of the present specification. It should be noted that those skilled in the art can make a number of improvements and refinements without departing from the principle of the embodiments of the present specification, and these improvements and refinements shall also fall within the protection scope of the embodiments of the present specification.