CN112862613A - Transaction data processing method and device - Google Patents


Info

Publication number
CN112862613A
Authority
CN
China
Prior art keywords
sub
database
key value
preset
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110336785.XA
Other languages
Chinese (zh)
Inventor
何雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202110336785.XA
Publication of CN112862613A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04: Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23: Updating
    • G06F16/2365: Ensuring data consistency and integrity
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278: Data partitioning, e.g. horizontal or vertical partitioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Computer Security & Cryptography (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a transaction data processing method and device, and relates to the technical field of automatic program design. In one specific implementation, the method comprises: receiving a transaction data processing request, obtaining the message fields of the request, calling a preset mapping engine to identify the corresponding sub-database key value and partition key value, and calculating, through a consistent hash algorithm, the basket number where the data to be processed is located; reading a preset deployment configuration table, and determining the database and deployment unit where the data to be processed is located based on the sub-database key value and the basket number; and accessing that database and deployment unit and executing the transaction data processing request. The implementation thus splits data in two or more dimensions to fit the multi-level management structure of an enterprise, simplifies the processing logic of aggregation services, and achieves more flexible database and table sharding.

Description

Transaction data processing method and device
Technical Field
The invention relates to the technical field of automatic program design, in particular to a transaction data processing method and device.
Background
As enterprises grow, back-end systems must manage more and more data and increasingly complex services. A single database can no longer carry all of the data and services, so the data has to be split across multiple databases and tables (database and table sharding) in order to improve the capacity and processing capability of the whole system.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
Most existing database and table sharding schemes are one-dimensional: data is split once along a single dimension, and the split data is completely scattered. Such schemes cannot readily support the management-oriented business scenarios that the system handled before sharding, and they do not align with a company's management structure. For example, in the banking industry the provincial branch is generally the main management unit: each provincial branch manages its own business personnel and customers independently, and each produces its own management reports.
Disclosure of Invention
In view of this, embodiments of the present invention provide a transaction data processing method and device to solve the problems in the prior art. The method and device split data in two or more dimensions so as to fit the multi-level management structure of an enterprise, simplify the processing logic of aggregation services, and achieve more flexible database and table sharding.
To achieve the above object, according to one aspect of the embodiments of the present invention, a transaction data processing method is provided, including: receiving a transaction data processing request, obtaining the message fields of the request, calling a preset mapping engine to identify the corresponding sub-database key value and partition key value, and then calculating, through a consistent hash algorithm, the basket number where the data to be processed is located; reading a preset deployment configuration table, and determining the database and deployment unit where the data to be processed is located based on the sub-database key value and the basket number; and accessing that database and deployment unit and executing the transaction data processing request.
Optionally, before receiving the transaction data processing request, the method includes:
acquiring system information of a database to be created, calling a corresponding capacity evaluation component based on an attribute identifier in the system information, and determining the number of sub-databases and the number of sub-tables of the database to be created;
selecting a sub-database key and a partition key according to the service scenario in the system information, so that system data with the same sub-database key value is placed in the same sub-database;
and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is placed in the same sub-table.
Optionally, based on the attribute identifier in the system information, calling a corresponding capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created, including:
judging, according to the attribute identifier in the system information, whether the system is a migrated system; if so, calling a migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created; and if not, calling a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created.
Optionally, invoking a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created, including:
determining the number of databases the new system will deploy to obtain the number of sub-databases, calculating the single-database data volume from the predicted future data volume of the system, dividing the single-database data volume by the preset maximum single-table data volume, rounding up, and multiplying by the preset expansion threshold to obtain the number of sub-tables; and
calling the migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created includes:
determining the growth rate over a preset future period according to the data volume of the current system, calculating the total data volume, and calculating the number of sub-databases and the number of sub-tables based on a preset single-database data volume threshold, a preset single-table data volume threshold and a preset expansion threshold.
Optionally, obtaining the message fields of the request and calling a preset mapping engine to identify the corresponding sub-database key value and partition key value includes:
obtaining the message fields of the request and judging whether a partition key and a sub-database key exist in the message fields and are not empty; if so, directly extracting the sub-database key value and the partition key value; and if not, calling a preset mapping function or a preset index table to identify the corresponding sub-database key value and partition key value based on the message fields.
Optionally, the preset mapping function includes an intercept function and a splice function;
and calling the preset interception function or concatenation function to obtain the sub-database key value and the partition key value from the message fields.
Optionally, the method further comprises:
setting an identification policy field in which multiple rows of policies can be configured, wherein the policies include a mapping function, an index table, and direct extraction.
Optionally, comprising:
the mapping of the sub-database key and the partition key in the mapping engine is implemented through a SpringEL expression configured in the transaction rule table.
In addition, the invention also provides a transaction data processing device, which comprises an acquisition module, a storage module and a processing module. The acquisition module is used for receiving a transaction data processing request, obtaining the message fields of the request, calling a preset mapping engine to identify the corresponding sub-database key value and partition key value, and then calculating, through a consistent hash algorithm, the basket number where the data to be processed is located. The processing module is used for reading a preset deployment configuration table and determining the database and deployment unit where the data to be processed is located based on the sub-database key value and the basket number, and for accessing that database and deployment unit and executing the transaction data processing request.
Optionally, before the obtaining module receives the transaction data processing request, the obtaining module includes:
acquiring system information of a database to be created, calling a corresponding capacity evaluation component based on an attribute identifier in the system information, and determining the number of sub-databases and the number of sub-tables of the database to be created;
selecting a sub-database key and a partition key according to the service scenario in the system information, so that system data with the same sub-database key value is placed in the same sub-database;
and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is placed in the same sub-table.
Optionally, the obtaining module invokes a corresponding capacity evaluation component based on the attribute identifier in the system information, and determines the number of sub-databases and the number of sub-tables of the database to be created, including:
judging, according to the attribute identifier in the system information, whether the system is a migrated system; if so, calling a migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created; and if not, calling a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created.
Optionally, the obtaining module invokes a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created, including:
determining the number of databases the new system will deploy to obtain the number of sub-databases, calculating the single-database data volume from the predicted future data volume of the system, dividing the single-database data volume by the preset maximum single-table data volume, rounding up, and multiplying by the preset expansion threshold to obtain the number of sub-tables; and
calling the migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created includes:
determining the growth rate over a preset future period according to the data volume of the current system, calculating the total data volume, and calculating the number of sub-databases and the number of sub-tables based on a preset single-database data volume threshold, a preset single-table data volume threshold and a preset expansion threshold.
Optionally, the acquisition module obtaining the message fields of the request and calling a preset mapping engine to identify the corresponding sub-database key value and partition key value includes:
obtaining the message fields of the request and judging whether a partition key and a sub-database key exist in the message fields and are not empty; if so, directly extracting the sub-database key value and the partition key value; and if not, calling a preset mapping function or a preset index table to identify the corresponding sub-database key value and partition key value based on the message fields.
Optionally, the obtaining module is further configured to:
the mapping of the sub-database key and the partition key in the mapping engine is implemented through a SpringEL expression configured in the transaction rule table.
One embodiment of the above invention has the following advantages or benefits: the invention introduces a sub-database key to split data in two or more dimensions, which makes it easier for the system to manage data and to implement complex functions; the consistent hash algorithm is used to locate the basket number and system capacity is designed in advance, which reduces the cost of redistributing data during capacity expansion and limits the impact of redistribution on the data system; a flexible, configurable mapping mechanism for the sub-database key and the partition key ensures that the transaction functions of the system are not constrained by the data layout, greatly increasing the variety of functions the system can implement; and this configurable mapping mechanism for the sub-database key and the partition key is implemented through SpringEL expressions.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic view of a main flow of a transaction data processing method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of an example consistent hashing algorithm according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a drop node on a basket number according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an example of determining a banking key and a partition key according to an embodiment of the present invention;
FIG. 5 is a schematic illustration of application data distribution after a partition key and a library key are determined in accordance with an embodiment of the present invention;
FIG. 6 is a schematic view of a main flow of a transaction data processing method according to a second embodiment of the present invention;
FIG. 7 is a schematic diagram of the main modules of a transaction data processing apparatus according to an embodiment of the present invention;
FIG. 8 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 9 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a schematic view of a main flow of a transaction data processing method according to a first embodiment of the present invention. As shown in FIG. 1, the transaction data processing method includes:
Step S101, receiving a transaction data processing request, obtaining the message fields of the request, calling a preset mapping engine to identify the corresponding sub-database key value and partition key value, and then calculating, through a consistent hash algorithm, the basket number where the data to be processed is located.
In an embodiment, the consistent hash algorithm maps the entire hash value space onto a virtual ring, so that when a server is removed or added, the existing mapping between service requests and the servers that process them changes as little as possible. Database and table sharding addresses the performance degradation that an excessive data volume causes in a database and at the same time improves the throughput of the whole system: an originally independent database is split into several databases according to a given rule, and a large table is split into several tables, so that the data volume of each single database and single table is reduced and overall performance and capacity are improved. The sub-database key is the first-level identifier of the two-level database and table sharding algorithm; through it a group of sub-databases (or a single sub-database) can be located. The partition key is the second-level identifier of the two-level algorithm; applying the consistent hash algorithm to this field yields the basket number. The basket number is another name for the sub-table number. It is calculated by the consistent hash algorithm and corresponds to a point on the hash ring; baskets are gathered clockwise along the ring to form a database node.
In other embodiments, after the message fields of the request are obtained, it may be determined whether a partition key and a sub-database key exist in the message fields and are not empty; if so, the sub-database key value and the partition key value are extracted directly; if not, a preset mapping function or a preset index table is called to identify the corresponding sub-database key value and partition key value based on the message fields.
That is, if the message fields contain the partition key and the sub-database key and neither is empty, the names of the message fields that carry them are configured directly in the corresponding partition key identification policy field and sub-database key identification policy field of the transaction identification policy table. For example, Table 1 is a transaction identification policy table: the message of transaction code TX_ACC_INQ carries an ACC_NO field, and this field is itself the partition key, so the name of that transaction field is entered in the partition key identification policy field.
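TABLE 1
[Table 1, the transaction identification policy table, appears only as an image in the original publication: Figure BDA0002997943690000071]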
In addition, if the fields of the transaction message do not directly contain the sub-database key and the partition key, a preset mapping function (for example, an interception function or a concatenation function, which derives the sub-database key and the partition key from the fields of the transaction message) or a preset index table (which is read using one or more fields to find the sub-database key and the partition key) can be called. For example, for transaction code TX_UID_UPD in Table 1, the first 2 digits of the UID are the branch number and the following 15 digits are the account number, so the sub-database key value and the partition key value are intercepted from the UID field. When the sub-database key is located for transaction code TX_ACC_INQ, the policy is found to be configured as a comma-separated pair, with the transaction field before the comma and the index table type after it (see Table 2). The system uses the value of the transaction's ACC_NO field as the Key and ACC_BR_MAPPING as the Type to read the index table, and obtains TargetID 25; the value 25 is the sub-database key.
TABLE 2
Key            Type                 TargetID
250010010031   ACC_BR_MAPPING       25
15019230014    PHONE_ACC_MAPPING    250010010031
In addition, some transaction data processing requests may need to obtain the sub-database key value and the partition key value in more than one way. In that case, multiple rows of policies can be configured in the identification policy field (direct extraction, a preset mapping function, and an index table), and they are applied until the sub-database key value and the partition key value are identified.
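The identification strategies described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the mapping engine of the embodiment: the function names, the `POLICIES` layout, and the direct `BRANCH_NO` lookup tried first for TX_ACC_INQ are hypothetical and added only to show a multi-row policy; the index table rows and the UID layout (2-digit branch number followed by a 15-digit account number) follow Table 2 and the TX_UID_UPD example.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]
Strategy = Callable[[Message], Optional[str]]

# Index table rows taken from Table 2: (Key, Type) -> TargetID.
INDEX_TABLE = {
    ("250010010031", "ACC_BR_MAPPING"): "25",
    ("15019230014", "PHONE_ACC_MAPPING"): "250010010031",
}

def direct(field: str) -> Strategy:
    """Direct extraction: the message field itself carries the key value."""
    return lambda msg: msg.get(field) or None

def intercept(field: str, start: int, end: int) -> Strategy:
    """Mapping function: cut the key value out of a longer message field."""
    def resolve(msg: Message) -> Optional[str]:
        value = msg.get(field)
        return value[start:end] if value and len(value) >= end else None
    return resolve

def index_lookup(field: str, mapping_type: str) -> Strategy:
    """Index table: use the field value as Key and mapping_type as Type."""
    def resolve(msg: Message) -> Optional[str]:
        value = msg.get(field)
        return INDEX_TABLE.get((value, mapping_type)) if value else None
    return resolve

# Hypothetical policy table: one ordered list of strategies (rows) per key.
POLICIES: Dict[str, Dict[str, List[Strategy]]] = {
    "TX_ACC_INQ": {
        # Assumed extra row: try a BRANCH_NO field before the index table.
        "sub_db_key": [direct("BRANCH_NO"), index_lookup("ACC_NO", "ACC_BR_MAPPING")],
        "partition_key": [direct("ACC_NO")],
    },
    "TX_UID_UPD": {
        "sub_db_key": [intercept("UID", 0, 2)],
        "partition_key": [intercept("UID", 2, 17)],
    },
}

def identify_keys(tx_code: str, msg: Message) -> Tuple[str, str]:
    """Apply each policy row in order until both key values are identified."""
    found: Dict[str, str] = {}
    for key_name, strategies in POLICIES[tx_code].items():
        for strategy in strategies:
            value = strategy(msg)
            if value:
                found[key_name] = value
                break
        else:
            raise LookupError(f"no policy row resolved {key_name} for {tx_code}")
    return found["sub_db_key"], found["partition_key"]
```

With these assumptions, a TX_ACC_INQ message that carries only `ACC_NO = 250010010031` falls through the direct row and resolves its sub-database key to 25 via the index table, matching the example in the text.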
It is worth noting that the mapping of the sub-database key and the partition key in the mapping engine is implemented through SpringEL expressions configured in the transaction rule table. SpringEL (Spring Expression Language) is a powerful expression language that supports querying and manipulating an object graph at runtime, with features such as explicit method invocation and basic string templating.
In other embodiments, the basket number where the data to be processed is located is calculated using a consistent hash algorithm. As shown in FIG. 2, under the consistent hash algorithm each customer is mapped to a fixed position in a 32-bit space on the ring. N baskets are then marked on the ring (a basket number is a sub-table number), and each customer maps to exactly one basket in the clockwise or counterclockwise direction. The number of baskets is fixed (that is, the number of sub-tables is fixed); if the number of baskets changed, customer numbers would have to be remapped to baskets, causing a large amount of data movement. So that as little data as possible is migrated during later capacity expansion, nodes, that is, database sub-databases, are introduced above the basket number. Taking the loan system as an example (as shown in FIG. 3), one node may be allocated to one branch, that is, one sub-database. If the business volume of a branch grows explosively, its data can be moved and split across several sub-databases in units of baskets; the relationship between customer and basket stays unchanged, as little data as possible is migrated, and the application code is unaware of the change.
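The basket calculation can be illustrated with a short Python sketch. It assumes details the text does not specify: MD5 truncated to 32 bits as the hash function, basket positions derived from hypothetical labels such as `basket-7`, and a clockwise search direction. The class and variable names are illustrative only.

```python
import bisect
import hashlib

def ring_position(value: str) -> int:
    """Map a string to a position in a 32-bit ring (MD5 is an assumed choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")

class BasketRing:
    """A fixed set of baskets (sub-table numbers) placed on a 32-bit hash ring."""

    def __init__(self, basket_count: int) -> None:
        # Each basket gets one fixed position on the ring; the count never changes.
        self._points = sorted(
            (ring_position(f"basket-{n}"), n) for n in range(1, basket_count + 1)
        )
        self._positions = [position for position, _ in self._points]

    def basket_for(self, partition_key: str) -> int:
        """Walk clockwise from the key's position to the first basket on the ring."""
        index = bisect.bisect_left(self._positions, ring_position(partition_key))
        return self._points[index % len(self._points)][1]  # wrap around the ring

# Nodes (sub-databases) sit above the baskets: each basket belongs to one node.
ring = BasketRing(16)
basket_to_sub_db = {n: ("sub_db_a" if n <= 8 else "sub_db_b") for n in range(1, 17)}

basket = ring.basket_for("250010010031")
# Rebalancing moves a whole basket to another sub-database; the key-to-basket
# mapping computed by the ring is untouched, so only that basket's data moves.
basket_to_sub_db[basket] = "sub_db_c"
```

Because only the `basket_to_sub_db` assignment changes during rebalancing, the application keeps computing the same basket for the same partition key, which is the property the paragraph above relies on.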
Step S102, reading a preset deployment configuration table, and determining the database and deployment unit where the data to be processed is located based on the sub-database key value and the basket number.
In some embodiments, the core of the two-level database and table sharding strategy is to first locate the sub-database group and then locate the sub-database and sub-table by the basket number. The purpose of a sub-database group (one database or several) is to gather closely related data together, thereby avoiding unnecessary cross-database queries. As shown in Table 3, the data of branches 31, 32 and 33 is located in the sub-database of SPU1, which has 16 sub-tables, while the data of branch 44 is spread over two sub-databases, those of SPU2 and SPU3: the SPU2 sub-database holds only baskets 1 to 5, and the SPU3 sub-database holds baskets 6 to 16.
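TABLE 3
[Table 3, the deployment configuration table, appears only as images in the original publication: Figure BDA0002997943690000081, Figure BDA0002997943690000091]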
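The lookup in step S102 can be sketched as follows. The rows reproduce only what the text states about Table 3 (branches 31, 32 and 33 on SPU1 with baskets 1 to 16; branch 44 on SPU2 with baskets 1 to 5 and on SPU3 with baskets 6 to 16); the column layout, the `DeploymentRow` type and the database names are assumptions, since the table itself is available only as an image.

```python
from typing import FrozenSet, List, NamedTuple, Tuple

class DeploymentRow(NamedTuple):
    sub_db_keys: FrozenSet[str]  # branch numbers served by this row
    first_basket: int            # inclusive basket range held by the sub-database
    last_basket: int
    deployment_unit: str
    database: str                # hypothetical database name

DEPLOYMENT_TABLE: List[DeploymentRow] = [
    DeploymentRow(frozenset({"31", "32", "33"}), 1, 16, "SPU1", "db_spu1"),
    DeploymentRow(frozenset({"44"}), 1, 5, "SPU2", "db_spu2"),
    DeploymentRow(frozenset({"44"}), 6, 16, "SPU3", "db_spu3"),
]

def locate(sub_db_key: str, basket: int) -> Tuple[str, str]:
    """Return (deployment unit, database) for a sub-database key and basket number."""
    for row in DEPLOYMENT_TABLE:
        if sub_db_key in row.sub_db_keys and row.first_basket <= basket <= row.last_basket:
            return row.deployment_unit, row.database
    raise LookupError(f"no deployment row for key {sub_db_key!r}, basket {basket}")
```

The first level (sub-database key) narrows the search to a sub-database group, and the second level (basket range) selects the sub-database inside that group, so branch 44 resolves to SPU2 for basket 5 and to SPU3 for basket 6.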
Step S103, accessing the database and the deployment unit where the data to be processed is located, and executing the transaction data processing request.
In some embodiments, before step S101 is executed, the database of the system needs to be established and the system data stored in it, so that transaction data can be processed. The specific process includes: acquiring system information of the database to be created, calling a corresponding capacity evaluation component based on an attribute identifier in the system information, and determining the number of sub-databases and the number of sub-tables of the database to be created; selecting a sub-database key and a partition key according to the service scenario in the system information, so that system data with the same sub-database key value is placed in the same sub-database; and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is placed in the same sub-table.
It can be seen that, in the embodiment of the present invention, the sub-database key forms multiple hash rings; the partition key is then used with the consistent hash algorithm to locate a node on the ring, and keys are assigned to the same virtual node following the same directional rule, which forms the sub-databases. One hash ring may hold one or more sub-databases. Taking an e-commerce platform as an example, both the product ID and the store ID are needed, with the store ID as the sub-database key. Assuming there are 1024 sub-tables, the product's data still falls in the 6th sub-table (6 mod 1024 = 6), but the sub-database that holds it is no longer determined by the product ID alone; it is determined by the sub-database key (store ID) together with the product ID. If the store ID is 123, and store 123 belongs to the third sub-database group, which contains only one sub-database, the product's data ends up in the 6th sub-table of the third sub-database.
Preferably, whether the system is a migrated system can be judged according to the attribute identifier in the system information. If so, a migration capacity evaluation component is called to determine the number of sub-databases and the number of sub-tables of the database to be created; if not, a new system capacity evaluation component is called to determine them. When the migration capacity evaluation component is called, the growth rate over a preset future period is determined according to the current system data volume, the total data volume is calculated, and the number of sub-databases and the number of sub-tables are then calculated based on the preset single-database data volume threshold, the preset single-table data volume threshold and the preset expansion threshold. When the new system capacity evaluation component is called, the number of databases the new system will deploy gives the number of sub-databases; the single-database data volume is calculated from the predicted future data volume of the system; and the single-database data volume is divided by the preset maximum single-table data volume, rounded up, and multiplied by the preset expansion threshold to obtain the number of sub-tables.
A system designed with database and table sharding in this way can therefore reduce, or even fundamentally avoid, data movement and keep the system running stably. For a new system, capacity evaluation involves more considerations and is generally based on cost and on expectations for the future. For example, if the budget allows at most 4 databases to be deployed, the number of sub-databases can only be 4. If the system is expected to reach 1,000,000 user records, each database needs to hold about 250,000 records; if a single table is designed to hold no more than 100,000 records, the number of sub-tables is 250,000 / 100,000 = 2.5, rounded up to 3. This gives 4 sub-databases and 3 sub-tables. To make later capacity expansion easier, it is generally recommended to multiply the estimated number of sub-tables by an expansion threshold of 2, giving 4 sub-databases and 6 sub-tables. For a migrated system, capacity is generally estimated from the existing data volume and the growth expected over several years. For example, if the system currently holds 10,000,000 records and is expected to grow by 100% over five years, the total data volume is 20,000,000 records. Assuming that a single database optimally stores 5,000,000 records (the single-database data volume threshold) and a single table optimally stores 100,000 records (the single-table data volume threshold), 20,000,000 / 5,000,000 = 4 sub-databases and 5,000,000 / 100,000 = 50 sub-tables are required. Again, to make capacity expansion easier, it is generally recommended to multiply the estimated number of sub-tables by 2, giving 4 sub-databases and 100 sub-tables.
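The two capacity calculations can be written out as a short Python sketch. The function and parameter names are hypothetical, and rounding the sub-database count up for a migrated system is an assumption (the worked example divides evenly); the arithmetic otherwise follows the two examples above.

```python
import math
from typing import Tuple

def new_system_capacity(
    deployable_databases: int,
    predicted_records: int,
    max_records_per_table: int,
    expansion_threshold: int = 2,
) -> Tuple[int, int]:
    """New system: the budgeted database count fixes the number of sub-databases."""
    sub_databases = deployable_databases
    records_per_database = predicted_records / sub_databases
    sub_tables = math.ceil(records_per_database / max_records_per_table)
    return sub_databases, sub_tables * expansion_threshold

def migrated_system_capacity(
    current_records: int,
    growth_rate: float,
    records_per_database_threshold: int,
    records_per_table_threshold: int,
    expansion_threshold: int = 2,
) -> Tuple[int, int]:
    """Migrated system: size from current volume plus expected growth."""
    total_records = current_records * (1 + growth_rate)
    # Rounding up here is an assumption for totals that do not divide evenly.
    sub_databases = math.ceil(total_records / records_per_database_threshold)
    sub_tables = math.ceil(records_per_database_threshold / records_per_table_threshold)
    return sub_databases, sub_tables * expansion_threshold

# Worked examples from the text.
new_result = new_system_capacity(4, 1_000_000, 100_000)
migrated_result = migrated_system_capacity(10_000_000, 1.0, 5_000_000, 100_000)
```

Here `new_result` is `(4, 6)` and `migrated_result` is `(4, 100)`, matching the sub-database and sub-table counts derived in the paragraph above.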
As another embodiment, the invention can split data in two or even more dimensions. In a specific design, the most common aggregate query condition can be used as the sub-database key, so that an online aggregate query only needs to query the sub-database group where that sub-database key is located rather than all sub-databases at once. This is simple to implement and does not amplify the number of network requests; batch processing likewise only needs to handle one or a few specified sub-database files, which keeps it simple and controllable.
That is, the data structure is adapted to the management structure of the company so as to reduce, as far as possible, the aggregate queries caused by management-type transactions. A column that matches the organizational structure should be selected as the sub-database key. For example, a bank system generally takes the branch as the management unit, so the branch number is suggested as the sub-database key so that each branch's data is gathered together. An e-commerce platform mainly manages stores, and most management reports are produced with the store as the basic unit, so the store ID is suggested as the sub-database key. For the partition key, a field that spreads data evenly is suggested: the customer number in a bank system, and the commodity ID on an e-commerce platform. For example, as shown in fig. 4, taking a bank loan system as an example, the table AR_ACC_BASE uses the branch number field BRANCH_NO as the sub-database key and ACC_NO as the partition key. The distribution of the application data after the sub-database key and the partition key are determined is shown in fig. 5. Once the sub-database key and the partition key are specified, the application must put data with the same sub-database key into the same database, and put data whose partition key yields the same basket number under the consistent hash algorithm into the same sub-table.
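The placement rule described above can be illustrated with a small sketch. The hash function (CRC32), the basket count of 512, the branch-to-database assignment and the sample records are all assumptions made for illustration; the source only states that baskets are fixed in number and spread evenly over a 2^32 hash space.

```python
import zlib
from collections import defaultdict

BASKET_COUNT = 512      # assumed fixed basket count (the text asks for more than 500)
HASH_SPACE = 2 ** 32


def basket_number(partition_key: str) -> int:
    """Map a partition key onto one of BASKET_COUNT equal slices of [0, 2^32)."""
    h = zlib.crc32(partition_key.encode("utf-8"))   # unsigned 32-bit value
    return h * BASKET_COUNT // HASH_SPACE


def place(records, branch_to_db):
    """Group records by (database, sub-table) using BRANCH_NO and ACC_NO."""
    layout = defaultdict(list)
    for rec in records:
        database = branch_to_db[rec["BRANCH_NO"]]            # sub-database key
        table = f"AR_ACC_BASE_{basket_number(rec['ACC_NO'])}"  # partition key
        layout[(database, table)].append(rec)
    return layout


def databases_for_branch(layout, branch_no):
    """Databases an aggregate query on one branch would have to visit."""
    return {db for (db, _table), recs in layout.items()
            if any(r["BRANCH_NO"] == branch_no for r in recs)}


records = [
    {"BRANCH_NO": "0101", "ACC_NO": "ACC0001"},
    {"BRANCH_NO": "0101", "ACC_NO": "ACC0002"},
    {"BRANCH_NO": "0202", "ACC_NO": "ACC0003"},
]
branch_to_db = {"0101": "DB1", "0202": "DB2"}   # hypothetical assignment
layout = place(records, branch_to_db)
```

Because every record of a branch lands in the same database, `databases_for_branch(layout, "0101")` contains a single database, which is the property that keeps branch-level aggregate queries local.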
In summary, the invention solves the problem that data cannot be split along multiple dimensions in the prior art. At the same time, data is moderately aggregated according to service characteristics, which addresses the difficulty of aggregate queries over scattered data: the aggregate query condition can be used as the sub-database key to gather the data, so that aggregate queries become simple. For locating data, traditional schemes hard-code the mapping relation or require the transaction to carry or contain the partition key, whereas the invention supports Spring EL expressions to meet various mapping requirements and does not restrict the transaction input data or how it is used. During capacity expansion, the invention can migrate data easily without recalculating it. In addition, the invention can also split data vertically into homogeneous micro-service centers for management, and split data horizontally into sub-databases or sub-tables. Horizontal splitting divides the data of one table into different tables or databases according to a certain rule. Vertical splitting divides tables into different databases by functional module without violating database design principles.
Fig. 6 is a schematic view of a main flow of a transaction data processing method according to a second embodiment of the present invention, as shown in fig. 6, the transaction data processing method includes:
acquiring system information of a database to be created, and calling a corresponding capacity evaluation component, based on an attribute identifier in the system information, to determine the number of sub-databases and the number of sub-tables of the database to be created; selecting a sub-database key value and a partition key value according to the service scenario in the system information, so that system data with the same sub-database key value is put into the same sub-database; and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is put into the same sub-table. A transaction data processing request is then received, a message field of the request is acquired, a preset mapping engine is called to identify the corresponding sub-database key value and partition key value, and the basket number of the data to be processed is calculated through the consistent hash sub-table algorithm. Next, a preset deployment configuration table is read, and the database and the deployment unit where the data to be processed is located are obtained based on the sub-database key value and the basket number. Finally, the database and the deployment unit where the data to be processed is located are accessed and the transaction data processing request is executed; the basket number is used as the table name suffix, so that a physical table can be uniquely determined within a single database.
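The request-time part of this flow (keys, basket number, deployment configuration lookup, table name) might look like the sketch below. The layout of the deployment configuration table, the `DeploymentEntry` type, the unit names and the CRC32-based hash are assumptions for illustration; the source does not specify them.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentEntry:
    """One row of a hypothetical deployment configuration table."""
    sub_db_key: str
    basket_lo: int        # inclusive
    basket_hi: int        # inclusive
    database: str
    deployment_unit: str


def basket_number(partition_key: str, basket_count: int = 512) -> int:
    """Assumed stand-in for the consistent hash: equal slices of [0, 2^32)."""
    return zlib.crc32(partition_key.encode("utf-8")) * basket_count // 2 ** 32


def route(message, config, base_table="AR_ACC_BASE"):
    """Return (database, deployment unit, physical table) for one request."""
    sub_db_key = message["BRANCH_NO"]
    basket = basket_number(message["ACC_NO"])
    for entry in config:
        if (entry.sub_db_key == sub_db_key
                and entry.basket_lo <= basket <= entry.basket_hi):
            # The basket number is the table-name suffix within the database.
            return entry.database, entry.deployment_unit, f"{base_table}_{basket}"
    raise LookupError(f"no deployment entry for {sub_db_key!r}, basket {basket}")


config = [
    DeploymentEntry("0101", 0, 255, "DB1", "unit-a"),
    DeploymentEntry("0101", 256, 511, "DB1", "unit-b"),
    DeploymentEntry("0202", 0, 511, "DB2", "unit-c"),
]
database, unit, table = route({"BRANCH_NO": "0202", "ACC_NO": "ACC0003"}, config)
```

Keeping the lookup in a table rather than in code means that moving a range of baskets to another deployment unit only changes configuration rows, not the routing logic.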
That is, data records that are associated with each other (e.g., a customer table and a detail table) are packaged into a data tuple, and the tuple is put into the basket corresponding to the hash value calculated from its partition key. Each basket corresponds to a set of databases, i.e., it is allocated to a database virtual node, and the node where the basket is finally stored is determined by the actual node corresponding to that virtual node (e.g., virtual node B2 corresponds to actual node B). Separating the logical state of data distribution from its deployment state in this way reduces the complexity of data migration when nodes are expanded, decouples the logical state from the deployment state, and avoids managing sharded data at the record level. In the logical state, the data is divided into a fixed number (more than 500) of data baskets, such as BK1 and BK2, i.e., virtual nodes, which are evenly distributed over a hash space of 2^32. Each basket corresponds to a table family containing a group of data tables related by service logic, so later migration can export and import the tables directly without operating on individual data records.
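The decoupling of logical state and deployment state can be illustrated as two separate mappings. The sketch below assumes 512 baskets, four virtual nodes assigned round-robin, and an expansion that moves virtual node B2 to a new actual node C; all of these names and numbers are hypothetical.

```python
BASKET_COUNT = 512
VIRTUAL_NODES = ["A1", "A2", "B1", "B2"]

# Logical state: fixed once, never recomputed when nodes are added.
basket_to_vnode = {b: VIRTUAL_NODES[b % len(VIRTUAL_NODES)]
                   for b in range(BASKET_COUNT)}


def actual_node(basket, vnode_to_node):
    """Deployment state: resolve a basket to the actual node that stores it."""
    return vnode_to_node[basket_to_vnode[basket]]


def baskets_to_migrate(before, after):
    """Baskets whose actual node differs between two deployment states."""
    return [b for b in range(BASKET_COUNT)
            if actual_node(b, before) != actual_node(b, after)]


before = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
after = dict(before, B2="C")       # expansion: virtual node B2 moves to node C
moved = baskets_to_migrate(before, after)
print(len(moved))                  # 128
```

Only the baskets attached to the reassigned virtual node move, and each moves as a whole table family; no record is rehashed, since basket numbers depend only on the partition key.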
Fig. 7 is a schematic diagram of the main modules of a transaction data processing apparatus according to an embodiment of the present invention. As shown in fig. 7, the apparatus includes an acquisition module 701 and a processing module 702. The acquisition module 701 receives a transaction data processing request, acquires a message field of the request, calls a preset mapping engine to identify the corresponding sub-database key value and partition key value, and then calculates the basket number of the data to be processed through the consistent hash sub-table algorithm. The processing module 702 reads a preset deployment configuration table, obtains the database and the deployment unit where the data to be processed is located based on the sub-database key value and the basket number, accesses that database and deployment unit, and executes the transaction data processing request.
In some embodiments, before receiving the transaction data processing request, the acquisition module 701 is further configured for:
acquiring system information of a database to be created, calling a corresponding capacity evaluation component based on an attribute identifier in the system information, and determining the number of sub-databases and the number of sub-tables of the database to be created;
selecting a sub-database key value and a partition key value according to the service scenario in the system information, so that system data with the same sub-database key value is put into the same sub-database;
and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is put into the same sub-table.
In some embodiments, the acquisition module 701 calling the corresponding capacity evaluation component, based on the attribute identifier in the system information, to determine the number of sub-databases and the number of sub-tables of the database to be created includes:
judging, according to the attribute identifier in the system information, whether the system is a migrated system; if so, calling a migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created; and if not, calling a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created.
In some embodiments, the acquisition module 701 calling the new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created includes:
determining the number of databases deployed for the new system to obtain the number of sub-databases, obtaining the single-database data volume from the predicted future data volume of the system, dividing the single-database data volume by the preset maximum single-table data volume, rounding up, and multiplying by the preset expansion threshold to obtain the number of sub-tables; and
calling the migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created includes:
determining the growth rate over a preset future period according to the current system data volume, calculating the total data volume, and calculating the number of sub-databases and the number of sub-tables based on a preset single-database data volume threshold, a preset single-table data volume threshold and a preset expansion threshold.
In some embodiments, the acquisition module 701 acquiring the message field of the request and calling a preset mapping engine to identify the corresponding sub-database key value and partition key value includes:
acquiring a message field of the request, and judging whether a partition key and a sub-database key exist in the message field and are not empty; if so, directly extracting the sub-database key value and the partition key value; and if not, calling a preset mapping function or a preset index table to identify the corresponding sub-database key value and partition key value based on the message field.
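The fallback order described above (direct extraction first, then a mapping function or an index table) can be sketched as follows. The field names `CARD_NO` and `CUST_ID`, the card-number layout used by the truncation example, and the index table contents are hypothetical and serve only to illustrate the idea.

```python
def identify_keys(message, mapping_fn=None, index_table=None):
    """Return (sub-database key value, partition key value) for a message."""
    sub_db_key = message.get("BRANCH_NO")
    partition_key = message.get("ACC_NO")
    if sub_db_key and partition_key:
        # Both keys are present and non-empty: extract them directly.
        return sub_db_key, partition_key
    if mapping_fn is not None:
        mapped = mapping_fn(message)
        if mapped is not None:
            return mapped
    if index_table is not None and message.get("CUST_ID") in index_table:
        return index_table[message["CUST_ID"]]
    raise KeyError("cannot identify sub-database key and partition key")


def keys_from_card_no(message):
    """Truncation-style mapping: assume the first 4 characters are the branch."""
    card_no = message.get("CARD_NO")
    if not card_no:
        return None
    return card_no[:4], card_no[4:]


index_table = {"C001": ("0202", "ACC0003")}   # hypothetical index table

direct = identify_keys({"BRANCH_NO": "0101", "ACC_NO": "ACC0001"})
mapped = identify_keys({"CARD_NO": "0101ACC0002"}, mapping_fn=keys_from_card_no)
indexed = identify_keys({"CUST_ID": "C001"}, keys_from_card_no, index_table)
```

Passing the mapping function and index table as arguments keeps the identification strategy configurable per transaction, in the same spirit as configuring it through a rule table.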
In some embodiments, the acquisition module 701 is further configured such that:
the mapping relation for the sub-database key and the partition key in the mapping engine is implemented through a Spring EL expression configured in the transaction rule table.
It should be noted that the transaction data processing method and the transaction data processing apparatus of the present invention correspond to each other in their specific implementation, so the repeated content is not described again.
Fig. 8 shows an exemplary system architecture 800 of a transaction data processing method or a transaction data processing apparatus to which embodiments of the invention may be applied.
As shown in fig. 8, the system architecture 800 may include terminal devices 801, 802, 803, a network 804, and a server 805. The network 804 provides the medium for communication links between the terminal devices 801, 802, 803 and the server 805. The network 804 may include various types of connections, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 801, 802, 803 to interact with a server 805 over a network 804 to receive or send messages or the like. The terminal devices 801, 802, 803 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 801, 802, 803 may be various electronic devices having display screens and supporting web browsing, including but not limited to smart phones, tablets, laptop computers, desktop computers, and the like.
The server 805 may be a server that provides various services, for example a back-end management server that supports shopping websites browsed by users on the terminal devices 801, 802, 803 (by way of example only). The back-end management server may analyze and otherwise process received data, such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.
It should be noted that the transaction data processing method provided by the embodiment of the present invention is generally executed by the server 805, and accordingly, the transaction data processing apparatus is generally disposed in the server 805.
It should be understood that the number of terminal devices, networks, and servers in fig. 8 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 9, shown is a block diagram of a computer system 900 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the computer system 900 includes a Central Processing Unit (CPU)901 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM903, various programs and data necessary for the operation of the computer system 900 are also stored. The CPU901, ROM902, and RAM903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. A drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 910 as necessary, so that a computer program read out therefrom is installed into the storage section 908 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 901.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an acquisition module and a processing module. Wherein the names of the modules do not in some cases constitute a limitation of the module itself.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the device described in the above embodiments, or may exist separately without being incorporated into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: receive a transaction data processing request, acquire a message field of the request, call a preset mapping engine to identify the corresponding sub-database key value and partition key value, and calculate the basket number of the data to be processed through the consistent hash sub-table algorithm; read a preset deployment configuration table, and obtain the database and the deployment unit where the data to be processed is located based on the sub-database key value and the basket number; and access the database and the deployment unit where the data to be processed is located, and execute the transaction data processing request.
According to the technical scheme of the embodiment of the invention, data is split in two or more dimensions to fit an enterprise's multi-level management mode, the processing logic of aggregation services is simplified, and more flexible sub-database and sub-table processing is achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A transaction data processing method, comprising:
receiving a transaction data processing request, acquiring a message field of the request, calling a preset mapping engine to identify a corresponding sub-database key value and partition key value, and calculating, through a consistent hash sub-table algorithm, the basket number of the data to be processed;
reading a preset deployment configuration table, and obtaining the database and the deployment unit where the data to be processed is located based on the sub-database key value and the basket number;
and accessing the database and the deployment unit where the data to be processed is located, and executing the transaction data processing request.
2. The method of claim 1, further comprising, prior to receiving the transaction data processing request:
acquiring system information of a database to be created, calling a corresponding capacity evaluation component based on an attribute identifier in the system information, and determining the number of sub-databases and the number of sub-tables of the database to be created;
selecting a sub-database key value and a partition key value according to the service scenario in the system information, so that system data with the same sub-database key value is put into the same sub-database;
and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is put into the same sub-table.
3. The method of claim 2, wherein calling the corresponding capacity evaluation component based on the attribute identifier in the system information to determine the number of sub-databases and the number of sub-tables of the database to be created comprises:
judging, according to the attribute identifier in the system information, whether the system is a migrated system; if so, calling a migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created; and if not, calling a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created.
4. The method of claim 3, wherein calling the new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created comprises:
determining the number of databases deployed for the new system to obtain the number of sub-databases, obtaining the single-database data volume from the predicted future data volume of the system, dividing the single-database data volume by the preset maximum single-table data volume, rounding up, and multiplying by the preset expansion threshold to obtain the number of sub-tables; and
calling the migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created comprises:
determining the growth rate over a preset future period according to the current system data volume, calculating the total data volume, and calculating the number of sub-databases and the number of sub-tables based on a preset single-database data volume threshold, a preset single-table data volume threshold and a preset expansion threshold.
5. The method of claim 1, wherein acquiring the message field of the request and calling a preset mapping engine to identify the corresponding sub-database key value and partition key value comprises:
acquiring a message field of the request, and judging whether a partition key and a sub-database key exist in the message field and are not empty; if so, directly extracting the sub-database key value and the partition key value; and if not, calling a preset mapping function or a preset index table to identify the corresponding sub-database key value and partition key value based on the message field.
6. The method of claim 5, wherein the preset mapping function comprises a truncation function and a concatenation function;
and the preset truncation function or concatenation function is called to obtain the sub-database key value and the partition key value from the message field.
7. The method of claim 5, further comprising:
setting an identification strategy field to configure multiple strategies, wherein the strategies include a mapping function, an index table, and direct extraction.
8. The method according to any one of claims 1 to 7, wherein:
the mapping relation for the sub-database key and the partition key in the mapping engine is implemented through a Spring EL expression configured in the transaction rule table.
9. A transaction data processing apparatus, comprising:
an acquisition module, configured to receive a transaction data processing request, acquire a message field of the request, call a preset mapping engine to identify a corresponding sub-database key value and partition key value, and calculate, through a consistent hash sub-table algorithm, the basket number of the data to be processed;
a processing module, configured to read a preset deployment configuration table, obtain the database and the deployment unit where the data to be processed is located based on the sub-database key value and the basket number, access the database and the deployment unit where the data to be processed is located, and execute the transaction data processing request.
10. The apparatus of claim 9, wherein the acquisition module is further configured for, prior to receiving the transaction data processing request:
acquiring system information of a database to be created, calling a corresponding capacity evaluation component based on an attribute identifier in the system information, and determining the number of sub-databases and the number of sub-tables of the database to be created;
selecting a sub-database key value and a partition key value according to the service scenario in the system information, so that system data with the same sub-database key value is put into the same sub-database;
and obtaining the corresponding basket number from the partition key value through consistent hash calculation, so that system data with the same basket number is put into the same sub-table.
11. The apparatus of claim 10, wherein the acquisition module calling the corresponding capacity evaluation component based on the attribute identifier in the system information to determine the number of sub-databases and the number of sub-tables of the database to be created comprises:
judging, according to the attribute identifier in the system information, whether the system is a migrated system; if so, calling a migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created; and if not, calling a new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created.
12. The apparatus of claim 11, wherein the acquisition module calling the new system capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created comprises:
determining the number of databases deployed for the new system to obtain the number of sub-databases, obtaining the single-database data volume from the predicted future data volume of the system, dividing the single-database data volume by the preset maximum single-table data volume, rounding up, and multiplying by the preset expansion threshold to obtain the number of sub-tables; and
calling the migration capacity evaluation component to determine the number of sub-databases and the number of sub-tables of the database to be created comprises:
determining the growth rate over a preset future period according to the current system data volume, calculating the total data volume, and calculating the number of sub-databases and the number of sub-tables based on a preset single-database data volume threshold, a preset single-table data volume threshold and a preset expansion threshold.
13. The apparatus of claim 9, wherein the acquisition module acquiring the message field of the request and calling a preset mapping engine to identify the corresponding sub-database key value and partition key value comprises:
acquiring a message field of the request, and judging whether a partition key and a sub-database key exist in the message field and are not empty; if so, directly extracting the sub-database key value and the partition key value; and if not, calling a preset mapping function or a preset index table to identify the corresponding sub-database key value and partition key value based on the message field.
14. The apparatus according to any one of claims 9-13, wherein the acquisition module is further configured such that:
the mapping relation for the sub-database key and the partition key in the mapping engine is implemented through a Spring EL expression configured in the transaction rule table.
15. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.
16. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-9.
CN202110336785.XA 2021-03-29 2021-03-29 Transaction data processing method and device Pending CN112862613A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110336785.XA CN112862613A (en) 2021-03-29 2021-03-29 Transaction data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110336785.XA CN112862613A (en) 2021-03-29 2021-03-29 Transaction data processing method and device

Publications (1)

Publication Number Publication Date
CN112862613A true CN112862613A (en) 2021-05-28

Family

ID=75993167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110336785.XA Pending CN112862613A (en) 2021-03-29 2021-03-29 Transaction data processing method and device

Country Status (1)

Country Link
CN (1) CN112862613A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023103338A1 (en) * 2021-12-06 2023-06-15 深圳前海微众银行股份有限公司 Data processing method and apparatus, and device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190392047A1 (en) * 2018-06-25 2019-12-26 Amazon Technologies, Inc. Multi-table partitions in a key-value database
CN111078776A (en) * 2019-12-10 2020-04-28 北京明略软件系统有限公司 Data table standardization method, device, equipment and storage medium
CN111339088A (en) * 2020-02-21 2020-06-26 苏宁云计算有限公司 Database division and table division method, device, medium and computer equipment
CN111399851A (en) * 2020-06-06 2020-07-10 四川新网银行股份有限公司 Batch processing execution method based on distributed system

Similar Documents

Publication Publication Date Title
US8200705B2 (en) Method and apparatus for applying database partitioning in a multi-tenancy scenario
JP6542909B2 (en) File operation method and apparatus
US11496588B2 (en) Clustering layers in multi-node clusters
US11314770B2 (en) Database multiplexing architectures
US20190229984A1 (en) System and method for generic configuration management system application programming interface
US8402044B2 (en) Systems and methods for secure access of data
US20180129691A1 (en) Dynamic creation and maintenance of multi-column custom indexes for efficient data management in an on-demand services environment
CN110019080B (en) Data access method and device
US20120054182A1 (en) Systems and methods for massive structured data management over cloud aware distributed file system
CN107704202B (en) Method and device for quickly reading and writing data
CN107480205B (en) Method and device for partitioning data
US11586646B2 (en) Transforming data structures and data objects for migrating data between databases having different schemas
US11140220B1 (en) Consistent hashing using the power of k choices in server placement
WO2019205790A1 (en) Data operating method and device
CN112925859A (en) Data storage method and device
CN110287264A (en) Batch data update method, device and the system of distributed data base
WO2019226279A1 (en) Frequent pattern analysis for distributed systems
US20140095644A1 (en) Processing of write requests in application server clusters
CN112597126A (en) Data migration method and device
CN105447151A (en) Method for accessing distributed database, data source proxy apparatus and application server
CN109343962A (en) Data processing method, device and distribution service
CN112862613A (en) Transaction data processing method and device
CN110795419A (en) Method and device for dynamic database-based routing
CN113407108A (en) Data storage method and system
CN113704245A (en) Database main key generation method, sub-table positioning method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination