CN114443772A - Distributed data processing method, device, equipment and medium - Google Patents

Distributed data processing method, device, equipment and medium

Info

Publication number
CN114443772A
Authority
CN
China
Prior art keywords
sql
processing
rewriting
data
routing path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210113083.XA
Other languages
Chinese (zh)
Inventor
张磊
李田雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China
Priority to CN202210113083.XA
Publication of CN114443772A
Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/427 Parsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a distributed data processing method, apparatus, device, and medium. The method comprises: if an account information change event is detected, acquiring an SQL statement and information change parameters sent by a client; parsing the SQL statement to obtain an abstract syntax tree, and extracting the parameters required for sharding from the abstract syntax tree and the information change parameters; determining a routing path of the SQL statement according to the parameters required for sharding; and rewriting the SQL statement, according to the routing path, into a rewritten SQL statement executable on that path, and distributing the rewritten SQL statement to the data node corresponding to the routing path for processing. With this technical solution, data from the database can be processed quickly without occupying the application server cache and with little consumption of database performance, thereby completing the distributed processing of the data.

Description

Distributed data processing method, device, equipment and medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a medium for processing distributed data.
Background
With the advent of the big data era, the volume of data stored in systems has grown greatly. A traditional relational database easily hits a performance bottleneck when processing large amounts of data, which affects the throughput of the whole system. Complex scenarios in particular often involve join operations across multiple tables, which add further performance overhead, and in some scenarios the performance requirements of the system cannot be met.
At present, the common approach uses Java or another language: the data are retrieved from the database in a single query, and the balances for each time mode are then assembled by processing the records one by one along dimensions such as customer identifier and date.
However, this approach consumes considerable resources and may cause memory overflow. If balances are summarized over transaction data numbering in the hundreds of millions of records, a single query against the database can overflow memory, and processing the records one by one takes a long time.
Disclosure of Invention
The invention provides a distributed data processing method, apparatus, device, and medium that can quickly process data from a database without occupying the application server cache and with little consumption of database performance, thereby completing the distributed processing of the data.
According to an aspect of the present invention, there is provided a distributed data processing method, including:
if an account information change event is detected, acquiring an SQL statement and information change parameters sent by a client;
parsing the SQL statement to obtain an abstract syntax tree, and extracting the parameters required for sharding from the abstract syntax tree and the information change parameters;
determining a routing path of the SQL statement according to the parameters required for sharding;
and rewriting the SQL statement, according to the routing path, into a rewritten SQL statement executable on the routing path, and distributing the rewritten SQL statement to the data node corresponding to the routing path for processing.
Optionally, after rewriting the SQL statement into a rewritten SQL statement executable on the routing path and distributing it for processing, the method further includes:
merging the result sets of the distributed processing based on the abstract syntax tree, and feeding the merged result back to the client.
Optionally, the parameters required for sharding include at least one of a date, an amount, account number characteristic bits, and a processing institution code.
Optionally, determining a routing path of the SQL statement according to the parameters required for sharding includes:
determining the routing path of the SQL statement according to a preconfigured routing rule and at least one of the date, the amount, the account number characteristic bits, and the processing institution code included in the parameters required for sharding.
Optionally, the data node includes at least one database, and the database is any one of a MySQL database, a PostgreSQL database, an Oracle database, and an SQLServer database.
Optionally, after distributing the rewritten SQL statement to the data node corresponding to the routing path for processing, the method further includes:
storing the rewritten SQL statement in a target database, and splitting it into a single task, a task method, and a public method; the task method provides a daily average data calculation service when called, and the public method provides the start date of the period, the end date of the period, and the current-day information when called.
Optionally, the single task is used for calculating at least one of an accumulated deposit balance, a daily average deposit balance, an accumulated loan balance, and a daily average loan balance.
According to another aspect of the present invention, there is provided a distributed data processing apparatus, the apparatus comprising:
an acquisition module, configured to acquire an SQL statement and information change parameters sent by a client if an account information change event is detected;
a parameter extraction module, configured to parse the SQL statement to obtain an abstract syntax tree and to extract the parameters required for sharding from the abstract syntax tree and the information change parameters;
a routing path determining module, configured to determine a routing path of the SQL statement according to the parameters required for sharding;
and a distribution processing module, configured to rewrite the SQL statement, according to the routing path, into a rewritten SQL statement executable on the routing path, and to distribute the rewritten SQL statement to the data node corresponding to the routing path for processing.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the method for processing distributed data according to any of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions for causing a processor to implement the distributed data processing method according to any one of the embodiments of the present invention when the computer instructions are executed.
According to the technical solution of the embodiments of the invention, if an account information change event is detected, an SQL statement and information change parameters sent by a client are acquired; the SQL statement is parsed to obtain an abstract syntax tree, and the parameters required for sharding are extracted from the abstract syntax tree and the information change parameters; a routing path of the SQL statement is determined according to the parameters required for sharding; and the SQL statement is rewritten, according to the routing path, into a rewritten SQL statement executable on that path, which is distributed to the data node corresponding to the routing path for processing. With this technical solution, data from the database can be processed quickly without occupying the application server cache and with little consumption of database performance, thereby completing the distributed processing of the data.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a distributed data processing method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a distributed data processing method according to a second embodiment of the present invention;
Fig. 3 is an overall structure diagram of the stored procedures to which the distributed data processing method according to the second embodiment of the present invention is applied;
Fig. 4 is a schematic structural diagram of a distributed data processing apparatus according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device implementing a distributed data processing method according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Embodiment One
Fig. 1 is a flowchart of a distributed data processing method according to a first embodiment of the present invention. This embodiment is applicable to the processing of distributed data. The method may be performed by a distributed data processing apparatus, which may be implemented in hardware and/or software and may be configured in an electronic device with data processing capability. As shown in Fig. 1, the method includes:
S110, if an account information change event is detected, acquiring an SQL statement and information change parameters sent by the client.
The technical solution of this embodiment may be executed by a background server. Using the Apache ShardingSphere middleware together with database stored procedures, large volumes of bank transaction detail data from heterogeneous data sources, including Sybase and MySQL databases, can be processed quickly without occupying the application server cache and with little performance pressure on the databases. This completes the summarization of data across time modes and ensures high availability of the distributed database during summarization. In this embodiment, Apache ShardingSphere gives the database distributed storage capability through a data sharding scheme.
Apache ShardingSphere is an ecosystem of distributed database solutions consisting of three products: Sharding-JDBC, Sharding-Proxy, and Sharding-Sidecar. They provide standardized functions such as horizontal data scaling, distributed transactions, and distributed governance, and are applicable to diverse scenarios such as homogeneous Java environments, heterogeneous languages, and cloud-native deployments. With its continued exploration of query optimizers and distributed transaction engines, Apache ShardingSphere has gradually broken product boundaries and evolved toward a platform-level solution that integrates both types of access and stability.
The account information change event may be an event in which user account information changes in real time. For example, account information changes when a user makes a deposit or withdrawal. SQL is the standard query language for relational databases. The SQL statement sent by the client and acquired in this embodiment may be an SQL statement that the server back end generates from the user's deposit or withdrawal operation and that needs to be executed. The information change parameters may be the date, amount, account number, and so on of the change. In this embodiment, when the background server detects that the user's account information has changed, it acquires the SQL statement to be executed and parameters such as the date, amount, and account number of the change, as sent by the client, either by parsing the database protocol packet or through the Sharding-JDBC driver.
In this embodiment, task processing by the server may cover two kinds of tasks: daytime tasks and end-of-day tasks. Daytime tasks handle data that change in real time, and end-of-day tasks produce the data summarized for the day. The server can call the various single tasks through batch task processing.
S120, parsing the SQL statement to obtain an abstract syntax tree, and extracting the parameters required for sharding from the abstract syntax tree and the information change parameters.
An abstract syntax tree (AST) is a tree representation of the abstract syntactic structure of source code, in which each node represents a construct in the source code. The syntax is called abstract because the tree does not represent every detail of the concrete syntax; for example, nested parentheses are implicit in the tree structure and do not appear as nodes. An abstract syntax tree does not depend on the grammar of the source language, that is, on the context-free grammar used in the parsing stage. When a grammar is written, it is often transformed into an equivalent form (eliminating left recursion, backtracking, ambiguity, and so on), which introduces redundant components into parsing, adversely affects later stages, and can even tangle the stages together. Many compilers therefore construct the syntax tree as an independent structure, establishing a clear interface between the front end and the back end.
The parameters required for sharding may be key parameters extracted from the SQL statement for use in sharding. They may include the date, the amount, the account number characteristic bits, the branch code, and other parameters; the account number characteristic bits may be the first few digits of the account number. Taking the date as an example, the analysis determines whether the current record belongs to the current year or to a previous year, and the data are then divided into different shards by date. In this embodiment, the background server may parse the acquired SQL statement with a lexical analyzer and a syntax analyzer to obtain the abstract syntax tree (AST), and extract the key parameters required for sharding from the SQL statement and the information change parameters.
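The extraction step can be illustrated with a minimal Python sketch. It is not ShardingSphere's parser: instead of building a real AST, it scans the statement for simple `column = value` predicates, which is only a toy stand-in. The column names (`txn_date`, `amount`, `account_no`, `branch_code`) and the function name are hypothetical, chosen to mirror the parameters listed in the text.

```python
import re

# Hypothetical sharding columns, named after the parameters the text lists.
SHARDING_COLUMNS = {"txn_date", "amount", "account_no", "branch_code"}

# Matches `column = literal-or-placeholder`, or a bare `?` placeholder.
PREDICATE_RE = re.compile(r"\b(\w+)\s*=\s*('[^']*'|\d+(?:\.\d+)?|\?)|(\?)")


def extract_sharding_params(sql, bound_params=(), change_params=None):
    """Collect sharding-column values from equality predicates and change parameters."""
    # Start from the change parameters reported together with the event.
    found = {k: v for k, v in (change_params or {}).items() if k in SHARDING_COLUMNS}
    next_param = 0  # index of the next `?` placeholder, in statement order
    for match in PREDICATE_RE.finditer(sql):
        column, raw, bare_placeholder = match.groups()
        if bare_placeholder:
            next_param += 1  # placeholder outside a `column = ?` predicate
            continue
        if raw == "?":
            value = bound_params[next_param]
            next_param += 1
        elif raw.startswith("'"):
            value = raw[1:-1]  # strip the quotes of a string literal
        else:
            value = float(raw) if "." in raw else int(raw)
        if column.lower() in SHARDING_COLUMNS:
            found[column.lower()] = value
    return found
```

A production system would walk the AST produced by a full SQL parser; this sketch ignores range predicates, `IN` lists, and literals that themselves contain `=` or `?`.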
S130, determining the routing path of the SQL statement according to the parameters required for sharding.
The routing path may be understood as the path to the shard targeted by the SQL statement. In this embodiment, the routing path of the SQL statement may be determined according to at least one of the date, the amount, the account number characteristic bits, and the processing institution code among the parameters required for sharding.
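A routing decision of this kind might look like the following Python sketch. The rule itself is an assumption made for illustration (current year versus earlier years, combined with the parity of the first two account digits), and the node names such as `ds_current_0` are hypothetical; the source does not specify the actual rule.

```python
from datetime import date


def route(params, today):
    """Pick a data node name from the transaction date and the account prefix."""
    txn_year = date.fromisoformat(params["txn_date"]).year
    # Date rule: current-year data and earlier data live on different nodes.
    era = "current" if txn_year == today.year else "history"
    # Account number characteristic bits: here, the first two digits.
    bucket = int(params["account_no"][:2]) % 2
    return f"ds_{era}_{bucket}"
```

For example, a transaction dated 2022-01-28 on an account starting with `62`, routed on a day in 2022, would map to `ds_current_0` under this made-up rule.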
S140, rewriting the SQL statement, according to the routing path, into a rewritten SQL statement executable on the routing path, and distributing the rewritten SQL statement to the data node corresponding to the routing path for processing.
The data nodes may be the data nodes corresponding to the routing paths in the abstract syntax tree. A data node may comprise at least one database. In this embodiment, the background server may rewrite the SQL statement, according to the obtained routing path, into a rewritten SQL statement executable on that path, and distribute it to each data node corresponding to the routing path in the abstract syntax tree for processing.
In this embodiment, the executable SQL statements can be routed automatically to the corresponding data nodes according to the sharding algorithm preset by the user, so that multiple databases can be operated. The user can use the multiple databases managed by Apache ShardingSphere as if they were a single standalone database.
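One common form of rewriting is replacing a logical table name with the physical table name on each routed node. The Python sketch below shows that idea under simplifying assumptions: the function names and the table and node names are hypothetical, and a plain textual substitution stands in for the AST-based rewriting that a real middleware would perform.

```python
import re


def rewrite_sql(sql, logical_table, actual_table):
    """Replace whole-word occurrences of the logical table name."""
    pattern = re.compile(rf"\b{re.escape(logical_table)}\b", re.IGNORECASE)
    # A function replacement avoids any escape processing of the table name.
    return pattern.sub(lambda _match: actual_table, sql)


def plan_distribution(sql, logical_table, routes):
    """Return (data_node, rewritten_sql) pairs, one per routed target."""
    return [(node, rewrite_sql(sql, logical_table, table)) for node, table in routes]
```

Because the substitution is textual, it would also touch a string literal that happens to contain the table name; rewriting on the syntax tree avoids that problem.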
According to the technical solution of this embodiment of the invention, if an account information change event is detected, an SQL statement and information change parameters sent by a client are acquired; the SQL statement is parsed to obtain an abstract syntax tree, and the parameters required for sharding are extracted from the abstract syntax tree and the information change parameters; a routing path of the SQL statement is determined according to the parameters required for sharding; and the SQL statement is rewritten, according to the routing path, into a rewritten SQL statement executable on that path, which is distributed to the data node corresponding to the routing path for processing. With this technical solution, data from the database can be processed quickly without occupying the application server cache and with little consumption of database performance, thereby completing the distributed processing of the data.
Embodiment Two
Fig. 2 is a flowchart of a distributed data processing method according to a second embodiment of the present invention. This embodiment refines the first embodiment as follows: after rewriting the SQL statement into a rewritten SQL statement executable on the routing path and distributing it for processing, the method further includes merging the result sets of the distributed processing based on the abstract syntax tree, and feeding the merged result back to the client.
As shown in Fig. 2, the method includes:
S210, if an account information change event is detected, acquiring an SQL statement and information change parameters sent by the client.
S220, parsing the SQL statement to obtain an abstract syntax tree, and extracting the parameters required for sharding from the abstract syntax tree and the information change parameters.
S230, determining the routing path of the SQL statement according to the parameters required for sharding.
S240, rewriting the SQL statement, according to the routing path, into a rewritten SQL statement executable on the routing path, and distributing the rewritten SQL statement to the data node corresponding to the routing path for processing.
S250, merging the result sets of the distributed processing based on the abstract syntax tree, and feeding the merged result back to the client.
The distributed processing result may be the result obtained after the server distributes the executable SQL statement to the data node corresponding to the routing path for processing. In this embodiment, the background server may merge the result sets either in a streaming fashion or fully in memory, according to the abstract syntax tree, and return the merged result to the client as an encapsulated database protocol packet or a JDBC result set.
For example, in this embodiment, the daily average balances for different years may be separated according to the abstract syntax tree, calculated separately, and then merged for display to the client. Balance data summarization may combine the bank transaction details with the balances of each period as of the previous day to generate the period balance data for the current day.
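The two merging styles mentioned above, streaming and in-memory, can be sketched in Python as follows. This is an illustrative sketch rather than the middleware's merge engine; the row shape `(account_no, balance)` and both function names are assumptions.

```python
import heapq
from collections import defaultdict


def stream_merge(node_results, key):
    """Lazily merge per-node result sets that are each already sorted by `key`."""
    # heapq.merge holds only one pending row per node, so memory use stays small.
    return heapq.merge(*node_results, key=key)


def memory_merge_sum(node_results):
    """Load every row and sum balances per account, as a grouped SUM would."""
    totals = defaultdict(float)
    for rows in node_results:
        for account_no, balance in rows:
            totals[account_no] += balance
    return dict(totals)
```

Streaming merge suits ordered queries whose per-node results arrive sorted, while in-memory merge is the simpler choice when rows must be regrouped and the combined result is small enough to hold.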
According to the technical solution of this embodiment of the invention, if an account information change event is detected, an SQL statement and information change parameters sent by a client are acquired; the SQL statement is parsed to obtain an abstract syntax tree, and the parameters required for sharding are extracted from the abstract syntax tree and the information change parameters; a routing path of the SQL statement is determined according to the parameters required for sharding; the SQL statement is rewritten, according to the routing path, into a rewritten SQL statement executable on that path, which is distributed to the data node corresponding to the routing path for processing; and the result sets of the distributed processing are merged based on the abstract syntax tree, with the merged result fed back to the client. With this technical solution, large volumes of bank transaction detail data from heterogeneous database sources can be processed quickly without occupying the application server cache and with little consumption of database performance, so that data summarization is completed and high availability of the distributed database is ensured during summarization.
In this embodiment, optionally, the parameters required for sharding include at least one of a date, an amount, account number characteristic bits, and a processing institution code.
The account number characteristic bits may be the first few digits of the account number; the processing institution code may be the branch code of the processing institution. The parameters required for sharding in this embodiment may include at least one of the date, the amount, the first few digits of the account number, and the branch code of the processing institution.
With this scheme, data can be sharded according to key information, which makes data processing more convenient and faster.
In this embodiment, optionally, determining a routing path of the SQL statement according to the parameters required for sharding includes: determining the routing path of the SQL statement according to a preconfigured routing rule and at least one of the date, the amount, the account number characteristic bits, and the processing institution code included in the parameters required for sharding.
The preconfigured routing rule may match the shard key according to an algorithm preset by the user and calculate the routing path. In this embodiment, the background server may match the shard key, using the user's preset algorithm, against at least one of the date, the amount, the account number characteristic bits, and the processing institution code included in the parameters required for sharding, and calculate the routing path, thereby determining the routing path of the SQL statement.
For example, the date among the parameters required for sharding may determine which database is targeted; likewise, the first few digits of the account number may determine which database the account number points to and which product is involved, such as personal, debit card, or credit card products. The shard key determines how the documents of a collection are distributed among the shards of the cluster. A shard key is an indexed field or indexed compound field that exists in every document in the collection. The distributed database MongoDB partitions the data in a collection by ranges of shard key values. Each range defines a non-overlapping range of shard key values and is associated with a chunk. MongoDB attempts to distribute the chunks evenly among the shards in the cluster. The choice of shard key therefore directly affects how effectively the chunks are distributed.
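Range-based partitioning of this kind can be sketched with a sorted list of boundaries and a binary search. The Python example below is a simplified illustration, not MongoDB's implementation; the boundary dates and shard names are made up, and ISO-formatted date strings are assumed so that string order matches date order.

```python
from bisect import bisect_right

# Hypothetical chunk boundaries on a date shard key (lower bounds inclusive).
BOUNDS = ["2021-01-01", "2022-01-01"]
# One more chunk than boundaries: before 2021, during 2021, from 2022 onward.
CHUNK_TO_SHARD = ["shard_a", "shard_b", "shard_c"]


def shard_for(key):
    """Find the shard whose non-overlapping key range contains `key`."""
    return CHUNK_TO_SHARD[bisect_right(BOUNDS, key)]
```

Because the ranges never overlap, each key maps to exactly one chunk. A key that clusters heavily in one range, such as the current date, would send most writes to a single shard, which is why the choice of shard key matters.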
With this arrangement, SQL statements can be routed automatically to the corresponding data nodes according to the sharding algorithm preset by the user, so that multiple databases can be operated.
In this embodiment, optionally, the data node includes at least one database, and the database is any one of a MySQL database, a PostgreSQL database, an Oracle database, and an SQLServer database.
In this embodiment, the data node comprises at least one database, and the user can use the multiple databases managed by Apache ShardingSphere as if they were a single standalone database. Currently supported are MySQL, PostgreSQL, Oracle, and SQLServer databases, as well as any database that supports the SQL92 standard and the JDBC standard protocol.
With this scheme, processing can be based on database stored procedures, so that the application cache is not occupied and little database performance is consumed.
In this embodiment, optionally, after distributing to the data node corresponding to the routing path for processing, the method further includes: storing the SQL rewriting statements into a target database, and splitting the SQL rewriting statements into a single task, a task method and a public method; the task method is used for providing a daily average data calculation service under the condition that the task method is called, and the public method is used for providing a service of information of a starting day, an ending day and the current day of a period under the condition that the task method is called.
The task processing in the embodiment includes two tasks of day time and day end. The data which changes in real time are generated during the day, and the data which is summarized in one day is obtained at the end of the day. And various single tasks are processed through batch task processing. The task method can be used for providing the daily average data calculation service under the condition that the task method is called, and the public method can be used for providing the service of the information of the starting day, the ending day and the current day of the period under the condition that the task method is called.
The overall structure of the storage process in this embodiment is shown in fig. 3. The single task can be called by the end-of-day batch to process a specific task. Such as a data cleansing task, a credit accumulation balance task, a loan accumulation balance task, a daily or business balance task, etc. The single task is understood to mean that only the deposit cumulative balance is calculated, and the other only the loan cumulative balance is calculated, etc., each as a single task.
The task method can be called by a single-task stored procedure and processes balance data that shares the same logic across different time modes, according to the input parameters. For example, the method for processing the average daily deposit balance in a period is implemented in the following steps: 1) calling the public method to obtain parameters such as the start date, end date and number of interval days in the time mode; 2) deleting the current day's daily average deposit data from the intermediate table; 3) inserting the daily average deposit data for the time mode into the intermediate table; 4) deleting the current day's daily average deposit data from the target table; 5) inserting the intermediate table data into the target table.
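The five steps above can be sketched in Python against an in-memory SQLite database. This is only an illustration under stated assumptions, not the stored procedure of the embodiment: the table names (`daily_balance`, `avg_mid`, `avg_target`), the helper `month_to_date`, and the definition of the daily average as the sum of daily balances divided by the interval days are all hypothetical.

```python
import sqlite3
from datetime import date

# Hypothetical schema: one source table, one intermediate table, one target table.
SCHEMA = """
CREATE TABLE daily_balance (biz_date TEXT, account TEXT, balance REAL);
CREATE TABLE avg_mid    (biz_date TEXT, account TEXT, avg_balance REAL);
CREATE TABLE avg_target (biz_date TEXT, account TEXT, avg_balance REAL);
"""

def month_to_date(today: date):
    """Stand-in for the public method: start date, end date, interval days."""
    start = today.replace(day=1)
    return start, today, (today - start).days + 1

def refresh_daily_average(conn: sqlite3.Connection, today: date) -> None:
    start, end, days = month_to_date(today)                      # step 1
    day = today.isoformat()
    with conn:  # run steps 2-5 in one transaction
        conn.execute("DELETE FROM avg_mid WHERE biz_date = ?", (day,))       # step 2
        conn.execute(                                                        # step 3
            "INSERT INTO avg_mid (biz_date, account, avg_balance) "
            "SELECT ?, account, SUM(balance) * 1.0 / ? FROM daily_balance "
            "WHERE biz_date BETWEEN ? AND ? GROUP BY account",
            (day, days, start.isoformat(), end.isoformat()),
        )
        conn.execute("DELETE FROM avg_target WHERE biz_date = ?", (day,))    # step 4
        conn.execute(                                                        # step 5
            "INSERT INTO avg_target SELECT * FROM avg_mid WHERE biz_date = ?",
            (day,),
        )
```

Because each insert is preceded by a delete for the same day, rerunning the task for a day replaces that day's rows instead of duplicating them.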
The public method may be a method for obtaining common parameters, such as the current-day information of the current period, the start date of the current period, and the end date of the current period. It can also obtain the number of days used for the annual daily average and the number of days since the system went online.
Illustratively, the stored procedure code logic by which the public method obtains the current-day parameters is as follows:
[Figures BDA0003495413670000111 to BDA0003495413670000141: stored procedure code listing, available only as images in the original publication]
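As a rough illustration of what such a public method might compute, the following Python sketch derives the start date, end date and elapsed days of a period from the current day. It is not the stored procedure from the publication; the names `PeriodInfo` and `period_info` and the `month`/`quarter`/`year` modes are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class PeriodInfo:
    start: date        # first day of the period
    end: date          # last day of the period
    current: date      # the current business day
    elapsed_days: int  # days from start to current, inclusive

def _next_month_start(year: int, month: int) -> date:
    """First day of the month following (year, month)."""
    return date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)

def period_info(today: date, mode: str) -> PeriodInfo:
    """Return period boundaries for the hypothetical modes month/quarter/year."""
    if mode == "month":
        start = today.replace(day=1)
        nxt = _next_month_start(today.year, today.month)
    elif mode == "quarter":
        first_month = 3 * ((today.month - 1) // 3) + 1
        start = date(today.year, first_month, 1)
        nxt = _next_month_start(today.year, first_month + 2)
    elif mode == "year":
        start = date(today.year, 1, 1)
        nxt = date(today.year + 1, 1, 1)
    else:
        raise ValueError(f"unknown time mode: {mode}")
    end = nxt - timedelta(days=1)
    return PeriodInfo(start, end, today, (today - start).days + 1)
```

A task method could call `period_info(today, "quarter")` once and reuse the returned dates for both the delete and the insert statements.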
According to the scheme of this embodiment, the process of storing the SQL rewriting statements into the target database can be divided into logically independent stored procedures, namely a single-task stored procedure, a task-method stored procedure and a public-method stored procedure. Mutual calls among these stored procedures reduce operations on associated database tables and the code complexity, and speed up the summarizing and processing of balance data. In addition, each customer's balance at the end of the day, month, quarter and year, as well as other daily balance data, can be summarized without occupying the application server's cache and with little database performance consumed.
In this embodiment, optionally, the single task is used to calculate at least one of an accumulated deposit balance, a daily average deposit balance, an accumulated loan balance, and a daily average loan balance.
The single task in this embodiment may be used to calculate the cumulative deposit balance, the average daily deposit balance, the cumulative loan balance, the average daily loan balance, and the like. With this arrangement, single tasks are designed and invoked through the sharing technology, which provides support for multi-source data and for high availability.
Example three
Fig. 4 is a schematic structural diagram of a distributed data processing apparatus according to a third embodiment of the present invention. The device can execute the distributed data processing method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. As shown in fig. 4, the apparatus includes:
the obtaining module 410 is configured to, if an account information change event is detected, obtain an SQL statement and an information change parameter sent by a client;
a parameter extraction module 420, configured to parse the SQL statement to obtain an abstract syntax tree, and extract parameters required for fragmentation from the abstract syntax tree and the information change parameters;
a routing path determining module 430, configured to determine a routing path of the SQL statement according to the parameters required for fragmentation;
and the distribution processing module 440 is configured to rewrite the SQL statement into a path-executable SQL rewrite statement according to the routing path, and distribute the SQL rewrite statement to a data node corresponding to the routing path for processing.
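The flow through modules 420 to 440 can be sketched as a small Python pipeline. This is a toy under stated assumptions, not the apparatus itself: a regular expression stands in for real abstract-syntax-tree parsing, and the shard key `account_no`, the logical table `t_balance`, the node names `ds_0`/`ds_1`, and the rule "last digit of the account number modulo the node count" are all hypothetical.

```python
import re

NODE_COUNT = 2              # hypothetical number of data nodes
SHARD_KEY = "account_no"    # hypothetical sharding column
LOGICAL_TABLE = "t_balance" # hypothetical logical table name

def extract_shard_value(sql: str, change_params: dict):
    """Toy parameter extraction: a literal in the SQL wins, else the change parameters."""
    literal = re.search(rf"{SHARD_KEY}\s*=\s*'(\w+)'", sql)
    if literal:
        return literal.group(1)
    return change_params.get(SHARD_KEY)

def route(shard_value):
    """Return the shard indexes to visit; no shard value means every node."""
    if shard_value is None:
        return list(range(NODE_COUNT))
    return [int(shard_value[-1]) % NODE_COUNT]  # assumes the last character is a digit

def rewrite(sql: str, shard_index: int) -> str:
    """Replace the logical table name with the physical table of one shard."""
    pattern = rf"\b{re.escape(LOGICAL_TABLE)}\b"
    return re.sub(pattern, f"{LOGICAL_TABLE}_{shard_index}", sql)

def plan(sql: str, change_params: dict):
    """Pair each target data node with the statement it should execute."""
    indexes = route(extract_shard_value(sql, change_params))
    return [(f"ds_{i}", rewrite(sql, i)) for i in indexes]
```

A statement that carries no shard value is sent to every node, which is why a later result-merging step is needed before answering the client.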
Optionally, the apparatus further comprises a result set merging calculation module, configured to:
after the SQL statement is rewritten into a path-executable SQL rewriting statement according to the routing path and distributed for processing, perform result set merging calculation on the distribution processing results based on the abstract syntax tree, and feed the merged result back to the client.
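A minimal sketch of result set merging is given below. It assumes that inspection of the abstract syntax tree has already reduced the statement to one of three hypothetical kinds (`sum`, `ordered`, `plain`); the function name `merge_results` and these kinds are illustrative and do not come from the publication.

```python
import heapq
from itertools import islice

def merge_results(kind, shard_results, key=None, limit=None):
    """Combine per-node results into the single result returned to the client."""
    if kind == "sum":
        # Each node returns one partial aggregate; None means the node had no rows.
        return sum(v for v in shard_results if v is not None)
    if kind == "ordered":
        # Each node returns rows already sorted by `key`; merge them lazily
        # and stop after `limit` rows (None keeps all rows).
        return list(islice(heapq.merge(*shard_results, key=key), limit))
    if kind == "plain":
        # No aggregate and no ordering: concatenate the row lists.
        return [row for rows in shard_results for row in rows]
    raise ValueError(f"unknown merge kind: {kind}")
```

For example, a `SELECT SUM(balance)` sent to three nodes would be answered by `merge_results("sum", [100, None, 250])`, while an `ORDER BY ... LIMIT` query would use the `ordered` kind.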
Optionally, the parameters required for fragmentation include at least one of date, amount, account number characteristic bit and processing mechanism code.
Optionally, the routing path determining module 430 is specifically configured to:
determine the routing path of the SQL statement according to a preconfigured routing rule and at least one of the date, amount, account number characteristic bit and processing mechanism code included in the parameters required for fragmentation.
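One way to picture a preconfigured routing rule is as an ordered table that maps each fragmentation parameter to a function returning a data node name. The Python sketch below is only an assumption about how such a table could look: the priority order, the thresholds, the node names, and the reading of "processing mechanism code" as an institution code (`org_code`) are all hypothetical.

```python
from datetime import date

# Hypothetical rule table, checked in priority order.
ROUTING_RULES = [
    ("org_code",    lambda v: f"ds_org_{v[:2]}"),        # first two characters of the code
    ("account_bit", lambda v: f"ds_acct_{int(v) % 4}"),  # characteristic bit modulo 4
    ("date",        lambda v: f"ds_{v.year}"),           # one node per year
    ("amount",      lambda v: "ds_large" if v >= 1_000_000 else "ds_small"),
]

def route(params: dict) -> str:
    """Return the data node chosen by the first rule whose parameter is present."""
    for name, rule in ROUTING_RULES:
        if name in params:
            return rule(params[name])
    raise LookupError("no fragmentation parameter matches a routing rule")
```

Keeping the rules in data rather than in code means the routing behaviour can be changed by editing the table, without touching the routing function.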
Optionally, the data node includes at least one database, and the database is any one of a MySQL database, a PostgreSQL database, an Oracle database, and an SQLServer database.
Optionally, the apparatus further comprises: a storage splitting module to:
after the SQL rewriting statements are distributed to data nodes corresponding to a routing path for processing, the SQL rewriting statements are stored in a target database and are divided into three processes, namely a single task, a task method and a public method; the task method is used for providing a daily average data calculation service under the condition that the task method is called, and the public method is used for providing a service of information of a starting day, an ending day and the current day of a period under the condition that the task method is called.
Optionally, the single task is used for calculating at least one of an accumulated deposit balance, a daily average deposit balance, an accumulated loan balance, and a daily average loan balance.
The distributed data processing device provided by the embodiment of the invention can execute the distributed data processing method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
FIG. 5 illustrates a schematic diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM)12, a Random Access Memory (RAM)13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM)12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The processor 11 performs the various methods and processes described above, such as the distributed data processing method.
In some embodiments, the distributed data processing method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the distributed data processing method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the distributed data processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for processing distributed data, the method comprising:
if the account information change event is detected, acquiring an SQL statement and information change parameters sent by a client;
parsing the SQL statement to obtain an abstract syntax tree, and extracting parameters required for fragmentation from the abstract syntax tree and the information change parameters;
determining a routing path of the SQL statement according to the parameters required for fragmentation;
and rewriting the SQL statement into a path executable SQL rewriting statement according to the routing path, and distributing the SQL rewriting statement to a data node corresponding to the routing path for processing.
2. The method of claim 1, wherein after rewriting the SQL statement into a path executable SQL rewrite statement according to the routing path and distributing the processing, the method further comprises:
and performing result set merging calculation on the distribution processing result based on the abstract syntax tree, and feeding the merging result back to the client.
3. The method of claim 1, wherein the parameters required for fragmentation include at least one of date, amount, account number characteristic bits, and processing mechanism code.
4. The method according to claim 3, wherein determining the routing path of the SQL statement according to the parameters required for fragmentation includes:
determining the routing path of the SQL statement according to a preconfigured routing rule and at least one of the date, amount, account number characteristic bit and processing mechanism code included in the parameters required for fragmentation.
5. The method of claim 1, wherein the data node comprises at least one database, and wherein the database is any one of a MySQL database, a PostgreSQL database, an Oracle database, and a SQLServer database.
6. The method of claim 1, wherein after being distributed to the data nodes corresponding to the routing paths for processing, the method further comprises:
storing the SQL rewriting statements into a target database, and splitting the SQL rewriting statements into a single task, a task method and a public method; the task method is used for providing a daily average data calculation service under the condition that the task method is called, and the public method is used for providing a service of information of a starting day, an ending day and the current day of a period under the condition that the task method is called.
7. The method of claim 6, wherein the single task is used to calculate at least one of a cumulative deposit balance, a daily average deposit balance, a cumulative loan balance, and a daily average loan balance.
8. An apparatus for processing distributed data, the apparatus comprising:
the acquisition module is used for acquiring SQL statements and information change parameters sent by a client if an account information change event is detected;
the parameter extraction module is used for parsing the SQL statement to obtain an abstract syntax tree and extracting parameters required for fragmentation from the abstract syntax tree and the information change parameters;
a routing path determining module, configured to determine a routing path of the SQL statement according to the parameters required for fragmentation;
and the distribution processing module is used for rewriting the SQL statement into a path executable SQL rewriting statement according to the routing path and distributing the SQL rewriting statement to a data node corresponding to the routing path for processing.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of processing distributed data according to any one of claims 1 to 7.
10. A computer-readable storage medium storing computer instructions for causing a processor to perform the method of processing distributed data according to any one of claims 1 to 7 when executed.
CN202210113083.XA 2022-01-29 2022-01-29 Distributed data processing method, device, equipment and medium Pending CN114443772A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210113083.XA CN114443772A (en) 2022-01-29 2022-01-29 Distributed data processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210113083.XA CN114443772A (en) 2022-01-29 2022-01-29 Distributed data processing method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN114443772A true CN114443772A (en) 2022-05-06

Family

ID=81371877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210113083.XA Pending CN114443772A (en) 2022-01-29 2022-01-29 Distributed data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114443772A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117453578A (en) * 2023-12-25 2024-01-26 杭州云动智能汽车技术有限公司 NMEA sentence detection method and device, electronic equipment and storage medium
CN117453578B (en) * 2023-12-25 2024-04-19 杭州云动智能汽车技术有限公司 NMEA sentence detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination