CN113297326A - Data processing method and device, computer readable storage medium and processor - Google Patents

Info

Publication number
CN113297326A
Authority
CN
China
Prior art keywords
data
database
target data
storage medium
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110559412.9A
Other languages
Chinese (zh)
Inventor
朱纯磊
董爱军
张昭坤
袁继宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Postal Savings Bank of China Ltd
Original Assignee
Postal Savings Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device, a computer readable storage medium and a processor. The method comprises the following steps: obtaining a synchronization task of a first database and a second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database; performing data configuration according to the synchronization task to obtain configuration information, wherein the configuration information is used for determining an execution strategy of the synchronization task; acquiring the target data from the first database according to the configuration information; and storing the target data in an intermediate storage medium and synchronizing the target data from the intermediate storage medium to the second database. The invention solves the technical problem of low data interaction rate among different databases in the prior art.

Description

Data processing method and device, computer readable storage medium and processor
Technical Field
The invention relates to the technical field of databases, in particular to a data processing method and device, a computer readable storage medium and a processor.
Background
In the related art, Sqoop is mainly used for interaction between a relational database and a data warehouse. To cope with the massive data interaction requirements of daily business applications, the Sqoop approach requires database sharding and table sharding according to business requirements, with the data split across different databases.
Aiming at the problem of low data interaction rate among different databases in the prior art, no effective solution is provided at present.
Disclosure of Invention
The embodiment of the invention provides a data processing method and device, a computer readable storage medium and a processor, which are used for at least solving the technical problem of low data interaction rate between different databases in the prior art.
According to an aspect of an embodiment of the present invention, there is provided a data processing method, including: obtaining a synchronization task of a first database and a second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database; performing data configuration according to the synchronization task to obtain configuration information, wherein the configuration information is used for determining an execution strategy of the synchronization task; acquiring the target data from the first database according to the configuration information; and storing the target data in an intermediate storage medium and synchronizing the target data from the intermediate storage medium to the second database.
Further, the configuration information includes a metadata table, and performing data configuration according to the synchronization task to obtain the configuration information includes: acquiring data source information and database sharding information in the synchronization task; and performing data configuration according to the data source information and the database sharding information to obtain the metadata table.
Further, the configuration information includes a mapping table, and performing data configuration according to the synchronization task to obtain the configuration information includes: acquiring step information and data source information of the synchronization task, wherein the step information comprises subtasks for processing the target data; and performing data configuration according to the step information and the data source information to obtain the mapping table.
Further, the data source information comprises the synchronization direction from the first database to the second database, field information and the acquisition mode of the target data; the step information comprises any one or more of the following items: a target data cleaning step, an index creating step, and a target table renaming or rebuilding step.
Further, after synchronizing the target data from the intermediate storage medium to the second database, the method further comprises: judging whether the subtask is executed successfully; under the condition that all the subtasks are successfully executed, the synchronous task is determined to be successfully executed; and when any one subtask fails to execute, continuing to execute the subtask of the synchronous task from the failed subtask.
Further, the configuration information includes a task registry, and performing data configuration according to the synchronization task to obtain the configuration information includes: registering the synchronization task in the task registry.
Further, storing the target data in an intermediate storage medium, comprising: cleaning and converting the format of the target data to obtain a target data file; the target data file is stored in an intermediate storage medium.
According to another aspect of the embodiments of the present invention, there is also provided a data processing apparatus, including: the task acquisition module is used for acquiring a synchronization task of the first database and the second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database; the configuration module is used for carrying out data configuration according to the synchronous task to obtain configuration information; the data acquisition module is used for acquiring target data from the first database according to the configuration information; and the synchronization module is used for storing the target data in the intermediate storage medium and synchronizing the target data from the intermediate storage medium to the second database.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium that includes a stored program, wherein, when the program runs, the device in which the computer-readable storage medium is located is controlled to execute any one of the data processing methods described above.
According to another aspect of the embodiments of the present invention, there is also provided a processor for running a program, wherein the program, when running, executes any one of the data processing methods described above.
In the embodiment of the invention, a synchronization task of the first database and the second database is obtained, data configuration is performed according to the synchronization task to obtain configuration information, the target data is acquired from the first database according to the configuration information, the target data is stored in the intermediate storage medium, and the target data is synchronized from the intermediate storage medium to the second database. In this way, the sequence of data interaction steps can be arranged dynamically, and the intermediate storage medium buffers massive amounts of data, which prevents system crashes caused by excessive data volume, improves the data transmission rate between different databases, and thereby solves the technical problem of low data interaction rate between different databases in the prior art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method of processing data according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an alternative method of processing data according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with an embodiment of the present invention, there is provided an embodiment of a method for processing data, it being noted that the steps illustrated in the flowchart of the figure may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than that presented herein.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention, as shown in fig. 1, the method including the steps of:
Step S102, obtaining a synchronization task of the first database and the second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database.
Specifically, the first database and the second database are two different databases that need to exchange data, and the types of the first database and the second database may be the same or different. For example, the first database and the second database may both be relational databases or both be data warehouses, or the first database may be a relational database and the second database a data warehouse.
The target data is data which needs to be transmitted from the first database to the second database, and the synchronization task is used for transmitting the target data in the first database to the second database. In an alternative implementation, the target data is data that needs to be consumed in the terminal application, that is, data that a user needs to obtain through the terminal.
Step S104, performing data configuration according to the synchronization task to obtain configuration information, wherein the configuration information is used for determining an execution strategy of the synchronization task.
The execution strategy of the synchronization task can comprise the database sharding information, the data transmission direction, the execution order of the data transmission steps and the like. Through data configuration, the configuration information of a database can be flexibly extended or deleted and the data interaction order can be arranged dynamically, which improves the data interaction speed among different databases.
It should be noted that different data configurations can be flexibly performed for different synchronization tasks, and when a synchronization task is changed, only configuration information needs to be changed, and program codes do not need to be modified, thereby reducing the workload of developers.
Step S106, acquiring the target data from the first database according to the configuration information.
The configuration information may include identification information of the target data in the first database, for example, the identification information may be address information, and the target data may be obtained according to the address information of the target data in the first database.
Step S108, storing the target data in the intermediate storage medium, and synchronizing the target data from the intermediate storage medium to the second database.
The intermediate storage medium is independent of the storage medium where the first database is located and of the storage medium where the second database is located. It can be used to buffer and temporarily store the target data during transmission, which avoids system crashes caused by excessive data volume and improves the stability of data interaction between the databases. The intermediate storage medium also allows the target data to be processed repeatedly: for example, if a check of the data file sizes reveals inconsistent data, the transmitted data can be processed again. The intermediate storage medium may be an HDFS (Hadoop Distributed File System), a hard disk, or the like.
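Steps S106 and S108 can be sketched as follows. This is a minimal illustration and not the patented implementation: two in-memory SQLite databases stand in for the first and second databases, a CSV file in a temporary directory stands in for the intermediate storage medium, and the configuration keys (`task_name`, `source_table`, `target_table`, `fields`) are hypothetical names chosen for the example.

```python
import csv
import os
import sqlite3
import tempfile


def synchronize(source, target, config, staging_dir):
    """Run steps S106 and S108 for one synchronization task."""
    # The configuration is assumed to be trusted, so its table and field
    # names are interpolated directly into the SQL text.
    columns = ", ".join(config["fields"])
    staging_path = os.path.join(staging_dir, config["task_name"] + ".csv")

    # Step S106: acquire the target data from the first database.
    rows = source.execute(f"SELECT {columns} FROM {config['source_table']}")

    # Step S108, first half: store the target data in the intermediate medium.
    with open(staging_path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)

    # Step S108, second half: synchronize the staged data to the second database.
    placeholders = ", ".join("?" for _ in config["fields"])
    insert_sql = (
        f"INSERT INTO {config['target_table']} ({columns}) "
        f"VALUES ({placeholders})"
    )
    with open(staging_path, newline="") as handle:
        target.executemany(insert_sql, csv.reader(handle))
    target.commit()
    return staging_path


def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    target.execute("CREATE TABLE orders_copy (id INTEGER, amount REAL)")
    config = {
        "task_name": "orders_sync",
        "source_table": "orders",
        "target_table": "orders_copy",
        "fields": ["id", "amount"],
    }
    with tempfile.TemporaryDirectory() as staging_dir:
        synchronize(source, target, config, staging_dir)
    return target.execute("SELECT id, amount FROM orders_copy ORDER BY id").fetchall()
```

Because the staged file persists until the task finishes, the load into the second database can be repeated from the file without querying the first database again.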
In an alternative embodiment, the second database may be a relational database and the first database may be a data warehouse, and the user interacts directly with the front-end relational database through the terminal application. When the user issues a request to obtain target data in the data warehouse, a synchronization task for transmitting the target data from the data warehouse to the relational database is determined according to the request. The relational database and the data warehouse are then configured according to the synchronization task, wherein the configuration information comprises the identifiers of the relational database and the data warehouse, the data transmission direction and the execution steps of the data transmission. The target data is extracted from the data warehouse according to the configuration information and stored in the intermediate storage medium, and is then synchronized from the intermediate storage medium to the relational database, which can feed the target data back to the terminal application. Compared with the data interaction rate of 2M/s in the sqoop mode of the prior art, the data processing method in this embodiment raises the data transmission rate between the databases to 20M/s, and the transmission rate can reach 50M/s without considering bandwidth limitation; only 1 minute is needed to transmit 1G of data.
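As a rough check of the figures quoted above, the sketch below computes how long 1G of data takes at each rate. It assumes that "M/s" means megabytes per second and that 1G is 1024 megabytes; the text does not spell out either unit.

```python
ONE_GB_IN_MB = 1024  # assumption: 1G is taken as 1024 megabytes


def transfer_seconds(size_mb, rate_mb_per_s):
    """Time needed to move size_mb megabytes at a constant rate."""
    return size_mb / rate_mb_per_s


def transfer_table(rates=(2, 20, 50)):
    """Map each rate (assumed MB/s) to the seconds needed for 1G."""
    return {rate: round(transfer_seconds(ONE_GB_IN_MB, rate), 1) for rate in rates}
```

Under these assumptions the transfer takes 512 seconds at 2M/s, about 51 seconds at 20M/s and about 20 seconds at 50M/s, so the quoted figure of 1 minute is closest to the 20M/s rate.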
In this embodiment, a synchronization task of the first database and the second database is obtained, data configuration is performed according to the synchronization task to obtain configuration information, the target data is acquired from the first database according to the configuration information, the target data is stored in the intermediate storage medium, and the target data is synchronized from the intermediate storage medium to the second database. In this way, the sequence of data interaction steps can be arranged dynamically, and the intermediate storage medium buffers large amounts of data, which prevents system crashes caused by excessive data volume, improves the data transmission rate between different databases, and solves the technical problem of low data interaction rate between different databases in the prior art.
As an alternative embodiment, the configuration information includes a metadata table, and performing data configuration according to the synchronization task to obtain the configuration information includes: acquiring data source information and database sharding information in the synchronization task; and performing data configuration according to the data source information and the database sharding information to obtain the metadata table.
The data source information may include database names and types of the first database and the second database (for example, the database type may be a relational database, a Hive data warehouse, or the like), and connection information of the databases, and according to the data source information, connection between the first database and the second database may be implemented.
The database sharding information may describe how data is stored in blocks in the first database and the second database. The position of the target data in the first database can be determined from the database sharding information, so that the target data can be extracted from the first database. Specifically, the database sharding information covers the database shards, table shards and partitions in the first database and the second database, and sharding by database, table and partition improves data query efficiency.
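One possible shape for such a metadata table is sketched below. The record types, field names and the example connection string are all hypothetical; the text only states that the table is built from data source information (name, type, connection) and from sharding information (database shard, table shard, partition).

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DataSource:
    name: str        # database name
    db_type: str     # e.g. "relational" or "hive"
    connection: str  # connection information (hypothetical format)


@dataclass(frozen=True)
class ShardInfo:
    source_name: str
    shard_database: str
    shard_table: str
    partition: str


def build_metadata_table(sources, shards):
    """Join data source information with sharding information, one row per shard."""
    by_name = {source.name: source for source in sources}
    return [
        {**asdict(by_name[shard.source_name]), **asdict(shard)}
        for shard in shards
    ]


def locate(metadata_table, source_name, partition):
    """Find where the target data for one partition lives in a data source."""
    return [
        (row["shard_database"], row["shard_table"])
        for row in metadata_table
        if row["name"] == source_name and row["partition"] == partition
    ]


def demo():
    sources = [DataSource("warehouse", "hive", "hive://example-host/dw")]
    shards = [
        ShardInfo("warehouse", "dw_01", "orders_0", "2021-05"),
        ShardInfo("warehouse", "dw_02", "orders_1", "2021-06"),
    ]
    table = build_metadata_table(sources, shards)
    return locate(table, "warehouse", "2021-06")
```

Looking up a partition in the table returns the shard to read from, which is one way the position of the target data could be determined without hard-coding it in the program.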
As an optional embodiment, the configuration information includes a mapping table, and performing data configuration according to the synchronization task to obtain the configuration information includes: acquiring step information and data source information of the synchronization task, wherein the step information comprises subtasks for processing the target data; and performing data configuration according to the step information and the data source information to obtain the mapping table.
The step information determines the data interaction logic between the first database and the second database in the synchronization task. The status of each step is registered in the mapping table, so that, together with a scheduling system, the status of each node can be checked. The content of the data source information and the step information can be defined flexibly, and the execution order of data interaction between different databases can be arranged dynamically by changing the content of the mapping table, which improves the data transmission rate.
In an optional embodiment, the data source information includes the synchronization direction from the first database to the second database, field information, and the acquisition mode of the target data; the step information comprises any one or more of the following items: a target data cleaning step, an index creating step, and a target table renaming or rebuilding step.
The synchronization direction from the first database to the second database is the transmission direction of the target data, for example, from the relational database to the data warehouse, or from the data warehouse to the relational database. The acquisition mode of the target data covers how the target data is extracted from the first database (for example, by data stream replication or by high-speed extraction with a big data engine) and which target data items are extracted. The field information specifically refers to the fields that make up a record in the first database and the second database, and the data source information may also include the Structured Query Language (SQL) mode of the first database and the second database.
The step information may include the operation steps of cleaning data, creating an index, renaming a target table, reconstructing a table, etc. before and after data synchronization. The step information can be flexibly configured according to the requirements of users, and is not limited to the operation steps.
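The relationship between the step information and the mapping table can be illustrated with a short sketch. The row layout (`task`, `direction`, `order`, `step`, `status`) and the step names are assumptions made for this example; the point is that the execution order comes from the table content rather than from program code.

```python
def build_mapping_table(task_name, direction, step_names):
    """Create one mapping-table row per configured step, in execution order."""
    return [
        {
            "task": task_name,
            "direction": direction,
            "order": order,
            "step": step,
            "status": "PENDING",
        }
        for order, step in enumerate(step_names, start=1)
    ]


def run_steps(mapping_table, handlers):
    """Execute the steps in the configured order and register each status."""
    executed = []
    for row in sorted(mapping_table, key=lambda item: item["order"]):
        handlers[row["step"]]()
        row["status"] = "SUCCESS"
        executed.append(row["step"])
    return executed


def demo():
    log = []
    handlers = {
        name: (lambda name=name: log.append(name))
        for name in ("clean_target", "create_index", "rename_table")
    }
    # Reordering this list changes the execution order without touching code.
    table = build_mapping_table(
        "orders_sync",
        "warehouse->relational",
        ["clean_target", "rename_table", "create_index"],
    )
    run_steps(table, handlers)
    return log, [row["status"] for row in table]
```

Each handler here only records its own name; in a real system it would perform the cleaning, index creation or table renaming that the step describes.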
In an alternative embodiment, the first database is a data warehouse, the second database is a relational database, and the target data in the data warehouse is synchronized into the relational database as follows. Configuration is performed according to the synchronization task of the target data to obtain configuration information, which can comprise a metadata table and a mapping table. By reading the configuration information, the target data in the data warehouse is extracted at high speed with a big data engine, and the target data is cleaned, converted into the data file format required by the terminal application, and stored in an intermediate storage medium. The intermediate storage medium can comprise a disk and an HDFS cluster; it temporarily stores the target data and prevents the data transmission system from crashing due to excessive data volume. It should be noted that the HDFS cluster may be used not only for buffering the target data but also for checking it to ensure the consistency of data interaction, for example by checking the number and size of the data files, and the transmitted data can be processed again if the data is inconsistent. Finally, the target data file in the intermediate storage medium is synchronized to the relational database at high speed by data stream replication, completing the synchronization of the target data from the data warehouse to the relational database.
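The check on the number and size of the staged data files might look like the sketch below. A local temporary directory stands in for the HDFS cluster, and the manifest format (file name mapped to expected size in bytes) is an assumption made for the example.

```python
import os
import tempfile


def write_staged_files(directory, files):
    """Write a {name: bytes} mapping to disk and return a manifest of sizes."""
    manifest = {}
    for name, payload in files.items():
        with open(os.path.join(directory, name), "wb") as handle:
            handle.write(payload)
        manifest[name] = len(payload)
    return manifest


def check_staged_files(directory, manifest):
    """Return the names whose presence or size differs from the manifest."""
    problems = []
    staged = set(os.listdir(directory))
    for name, expected_size in manifest.items():
        path = os.path.join(directory, name)
        if name not in staged or os.path.getsize(path) != expected_size:
            problems.append(name)
    # Files that were never announced also count as an inconsistency.
    problems.extend(sorted(staged - set(manifest)))
    return problems


def demo():
    with tempfile.TemporaryDirectory() as directory:
        manifest = write_staged_files(
            directory, {"part-0.csv": b"1,a\n", "part-1.csv": b"2,b\n"}
        )
        clean = check_staged_files(directory, manifest)
        # Simulate a truncated transfer of the second file.
        with open(os.path.join(directory, "part-1.csv"), "wb") as handle:
            handle.write(b"2")
        damaged = check_staged_files(directory, manifest)
    return clean, damaged
```

Any file reported by `check_staged_files` would be a candidate for reprocessing before the data is loaded into the second database.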
In another alternative embodiment, the first database is a relational database, the second database is a data warehouse, and the target data in the relational database is synchronized into the data warehouse as follows. Configuration is performed according to the synchronization task of the target data to obtain configuration information, which can comprise a metadata table and a mapping table. By reading the configuration information, the target data in the relational database is copied at high speed by data stream replication, and the target data is cleaned, converted into the data file format required by the terminal application, and stored in an intermediate storage medium, which temporarily stores the target data and prevents the data transmission system from crashing due to excessive data volume. The target data file in the intermediate storage medium is then loaded into the data warehouse with a big data engine, completing the synchronization of the target data from the relational database to the data warehouse.
Through the above steps, the interaction rate between the relational database and the data warehouse is improved, so that both can serve as download data sources for the terminal application, which solves the problem of having a single download data source and effectively removes the need to fix the data file format in advance. Performing data configuration according to the synchronization task improves the flexibility, reliability and efficiency of data transmission between the data warehouse and the relational database and ensures extensibility in future applications.
As an alternative embodiment, after synchronizing the target data from the intermediate storage medium to the second database, the method further comprises: judging whether the subtask is executed successfully; under the condition that all the subtasks are successfully executed, the synchronous task is determined to be successfully executed; and when any one subtask fails to execute, continuing to execute the subtask of the synchronous task from the failed subtask.
The sub-tasks include sub-tasks for processing the target data in the step information, and the sub-tasks may be stored in a mapping table by configuration.
It should be noted that, in this embodiment, only when all the subtasks in the mapping table are successfully executed, the data transmission between the first database and the second database is successful, if any one of the subtasks in the mapping table fails to be executed, the data synchronization task fails this time, and the data synchronization task starts to be executed from the failed subtask (i.e., a breakpoint) in the next operation until all the subtasks in the mapping table are successfully executed. By setting an interactive checking mechanism for checking the subtasks, the method avoids re-executing all subtasks under the condition of failure of the synchronous task, and reduces the influence on the terminal application service in the data synchronization process.
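The breakpoint behaviour described above can be sketched as follows. The status values and the row layout are hypothetical; the sketch only shows that a rerun skips subtasks already marked successful and continues from the one that failed.

```python
def run_with_resume(mapping_table, handlers):
    """Run pending subtasks in order; return True only if all have succeeded."""
    for row in sorted(mapping_table, key=lambda item: item["order"]):
        if row["status"] == "SUCCESS":
            continue  # finished in an earlier run, so it is not repeated
        try:
            handlers[row["step"]]()
        except Exception:
            row["status"] = "FAILED"  # this row is the breakpoint for the next run
            return False
        row["status"] = "SUCCESS"
    return True


def demo():
    calls = {"clean": 0, "load": 0, "index": 0}

    def make_handler(step, fail_first=False):
        def handler():
            calls[step] += 1
            if fail_first and calls[step] == 1:
                raise RuntimeError(f"{step} failed")
        return handler

    handlers = {
        "clean": make_handler("clean"),
        "load": make_handler("load", fail_first=True),
        "index": make_handler("index"),
    }
    table = [
        {"order": order, "step": step, "status": "PENDING"}
        for order, step in enumerate(["clean", "load", "index"], start=1)
    ]
    first_run = run_with_resume(table, handlers)
    second_run = run_with_resume(table, handlers)
    return first_run, second_run, calls
```

In the demo the `load` subtask fails once: the first run stops there, and the second run resumes at `load` without repeating `clean`.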
As an optional embodiment, the configuration information includes a task registry, and performing data configuration according to the synchronization task to obtain the configuration information includes: registering the synchronization task in the task registry.
It should be noted that, in the data interaction between the first database and the second database, a plurality of synchronization tasks may be established, different synchronization tasks are used for transmission of different target data, the plurality of synchronization tasks are registered in the task registry one by one, and transmission of different target data is sequentially achieved for the plurality of synchronization tasks according to the above method steps.
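A task registry of this kind can be approximated with an ordered mapping from task name to configuration. The class and method names below are illustrative assumptions, not part of the described system.

```python
class TaskRegistry:
    """Keeps synchronization tasks in registration order."""

    def __init__(self):
        self._tasks = {}

    def register(self, name, config):
        if name in self._tasks:
            raise ValueError(f"task {name!r} is already registered")
        self._tasks[name] = config

    def run_all(self, runner):
        """Run every registered task in turn and collect the results."""
        return [(name, runner(name, config)) for name, config in self._tasks.items()]


def demo():
    registry = TaskRegistry()
    registry.register("orders_sync", {"target_table": "orders_copy"})
    registry.register("customers_sync", {"target_table": "customers_copy"})
    # The runner is a stand-in that reports which table a task would fill.
    return registry.run_all(lambda name, config: config["target_table"])
```

Python dictionaries preserve insertion order, so the tasks run one by one in the order in which they were registered.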
As an alternative embodiment, storing the target data in an intermediate storage medium includes: cleaning and converting the format of the target data to obtain a target data file; the target data file is stored in an intermediate storage medium.
It should be noted that the format of the target data file obtained after the cleaning and conversion is a format supported by the terminal application, so that the target data file can be directly used by the terminal application.
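The cleaning and conversion step can be illustrated as below. The cleaning rules (trim whitespace, drop rows without a key) and the JSON Lines output format are assumptions chosen for the example; the text does not say which format the terminal application expects.

```python
import json
import os
import tempfile


def clean_and_convert(rows, fields, path):
    """Clean raw rows and write them as a JSON Lines target data file."""
    written = 0
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            # Cleaning: trim whitespace in text values and skip rows whose
            # first field (treated here as the key) is missing.
            cleaned = [
                value.strip() if isinstance(value, str) else value
                for value in row
            ]
            if cleaned[0] in (None, ""):
                continue
            handle.write(json.dumps(dict(zip(fields, cleaned))) + "\n")
            written += 1
    return written


def demo():
    rows = [(" 1 ", " alice "), (None, "ghost"), ("2", "bob")]
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "target.jsonl")
        count = clean_and_convert(rows, ["id", "name"], path)
        with open(path, encoding="utf-8") as handle:
            records = [json.loads(line) for line in handle]
    return count, records
```

The resulting file is what would be placed in the intermediate storage medium before being synchronized to the second database.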
Fig. 2 is a schematic diagram of an optional data processing method according to an embodiment of the present invention. As shown in fig. 2, the big data interaction device 20 is configured to execute the data processing method and includes a data processing core engine, an intermediate storage medium, and a configuration module for operating on the configuration information. The data processing core engine is configured to implement data transmission between the database 21 and the database 23 according to the data processing method, and may be any one of MapReduce (the Hadoop processing engine), Spark (an in-memory fast processing engine), Flink (a stream processing engine), or another custom processing engine. The intermediate storage medium is configured to buffer and temporarily store data transmitted between the database 21 and the database 23. The application 22 and the application 24 may be applications of terminal devices: the application 22 may be a data generating application, and the application 24 may be a data consuming application (i.e. the party that uses and requires the data). Specifically, the data generated by the application 22 is stored in the database 21, and the application 24 obtains the data it needs to consume from the database 23. The database 21 and the database 23 interact through the big data interaction device 20, which transmits the data in the database 21 to the database 23 so that a user can obtain the required data from the application 24. Specifically, the method for transferring data from the database 21 to the database 23 may include the steps of:
Step S201, configuring the data generation information to be collected, such as the database 21, the application 22, and the logs and message queues of the application 22.
Step S202, configure the data consumption information of the database 23, the application 24, and the like.
Step S203, configuring configuration information in the big data interaction device 20, wherein the configuration information includes data processing arrangement logic.
Step S204, the data processing core engine stores the acquired and processed data into the intermediate storage medium according to the configuration information.
In step S205, the data processing core engine transmits the data in the intermediate storage medium to the database 23 for the application 24 to use.
In this embodiment, database, table and partition sharding and coarse-grained or fine-grained data extraction can be realized through configuration, which improves the data interaction speed. The order of the steps and the logic of the step information of the data interaction can be configured flexibly, so no extra program development is needed when the interaction between different databases is adjusted or changed, which reduces the development workload. The components of the big data interaction device 20 (such as the intermediate storage medium and the data processing core engine) are pluggable, so the configuration information is easy to modify and arrange manually, which reduces development and maintenance costs. The embodiment can be applied to data interaction between different databases in a CRM (Customer Relationship Management) system, where it can shorten the synchronization time between the databases and save system running time.
Example 2
According to an embodiment of the present invention, an embodiment of a data processing apparatus is provided. Fig. 3 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention. As shown in Fig. 3, the apparatus includes:
a task obtaining module 31, configured to obtain a synchronization task of the first database and the second database, where the synchronization task is used to synchronize target data of the first database to the second database; a configuration module 32, configured to perform data configuration according to the synchronization task to obtain configuration information; a data obtaining module 33, configured to obtain the target data from the first database according to the configuration information; and a synchronization module 34, configured to store the target data in the intermediate storage medium and synchronize the target data from the intermediate storage medium to the second database.
In this embodiment, the synchronization task of the first database and the second database is obtained, data configuration is performed according to the synchronization task to obtain configuration information, the target data is obtained from the first database according to the configuration information, and the target data is stored in the intermediate storage medium and then synchronized from the intermediate storage medium to the second database. In this way, the sequence of data interaction steps is arranged dynamically, and the intermediate storage medium buffers large amounts of data. This prevents system crashes caused by an excessive amount of data, improves the data transmission rate between different databases, and thereby solves the technical problem in the prior art that the data interaction rate between different databases is low.
As an alternative embodiment, the configuration information includes a metadata table, and the configuration module includes: a first acquisition submodule, configured to acquire data source information and sub-database information in the synchronization task; and a metadata configuration submodule, configured to perform data configuration according to the data source information and the sub-database information to obtain the metadata table.
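One possible reading of this configuration step is sketched below. It assumes that the sub-database information reduces to a shard count and that shard tables carry a two-digit numeric suffix; neither assumption is stated in the embodiment, and the class, function, and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """Hypothetical data source information for one logical table."""
    name: str
    source_db: str
    target_db: str


def build_metadata_table(source, shard_count):
    """Return one metadata row per shard of the source table (assumed layout)."""
    return [
        {
            "source_db": source.source_db,
            "target_db": source.target_db,
            "source_table": f"{source.name}_{index:02d}",  # assumed naming scheme
            "target_table": source.name,
        }
        for index in range(shard_count)
    ]
```

For example, `build_metadata_table(DataSource("customer", "crm_src", "crm_dst"), 3)` yields three rows whose source tables are `customer_00`, `customer_01`, and `customer_02`.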
As an alternative embodiment, the configuration information includes a mapping table, and the configuration module includes: a second acquisition submodule, configured to acquire step information and data source information of the synchronization task, where the step information includes subtasks for processing the target data; and a mapping table configuration submodule, configured to perform data configuration according to the step information and the data source information to obtain the mapping table.
As an optional embodiment, the data source information includes a synchronization direction from the first database to the second database, field information, and an acquisition mode of the target data; the step information includes any one or more of the following: a target data cleaning step, an index creation step, and a target table renaming or rebuilding step.
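The mapping table described above can be pictured as one row per subtask, numbered in execution order and carrying the data source information. The sketch below is illustrative only; the row layout, the `Step` enumeration values, and the dictionary keys are assumptions rather than the format used by the embodiment.

```python
from enum import Enum


class Step(Enum):
    """The three kinds of step named in the embodiment (identifiers are hypothetical)."""
    CLEAN_TARGET_DATA = "clean_target_data"
    CREATE_INDEX = "create_index"
    RENAME_OR_REBUILD_TARGET_TABLE = "rename_or_rebuild_target_table"


def build_mapping_table(task_id, data_source, steps):
    """Return one mapping row per subtask, numbered in execution order."""
    return [
        {
            "task_id": task_id,
            "sequence": position,
            "step": step.value,
            "direction": data_source["direction"],
            "fields": list(data_source["fields"]),
            "acquisition_mode": data_source["acquisition_mode"],
        }
        for position, step in enumerate(steps, start=1)
    ]
```

Reordering or dropping entries in `steps` changes the execution plan without touching program code, which is the flexibility the configuration-driven design aims for.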
As an alternative embodiment, the apparatus further includes: a judging module, configured to judge whether each subtask is executed successfully; a determining module, configured to determine that the synchronization task is executed successfully when all the subtasks are executed successfully; and a continued execution module, configured to continue executing the subtasks of the synchronization task from the failed subtask when any one of the subtasks fails to execute.
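The resume-from-failure behaviour can be sketched as follows. This is a minimal illustration under the assumption that each subtask is a named callable and that the names of completed subtasks are recorded between runs; the function names and the simulated one-time failure are hypothetical.

```python
def run_subtasks(subtasks, completed=None):
    """Run (name, callable) subtasks in order, skipping those already completed.

    Returns (success, completed_names). After a failure, the caller can pass
    completed_names back in so that execution continues from the failed subtask.
    """
    done = list(completed or [])
    for name, action in subtasks:
        if name in done:
            continue  # finished in an earlier run
        try:
            action()
        except Exception:
            return False, done  # stop at the first failed subtask
        done.append(name)
    return True, done


def demo():
    log = []
    pending_failures = {"create_index": 1}  # hypothetical: fails once, then succeeds

    def make(name):
        def action():
            if pending_failures.get(name, 0) > 0:
                pending_failures[name] -= 1
                raise RuntimeError(f"{name} failed")
            log.append(name)
        return action

    names = ("clean_target_data", "create_index", "rename_target_table")
    subtasks = [(name, make(name)) for name in names]
    first_ok, done = run_subtasks(subtasks)
    second_ok, done = run_subtasks(subtasks, completed=done)
    return first_ok, second_ok, log
```

In `demo`, the first run stops at the index creation subtask; the second run skips the cleaning subtask that already succeeded and continues from the failed one, so no subtask is executed twice.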
As an alternative embodiment, the configuration information includes a task registry, and the configuration module includes: a registration submodule, configured to register the synchronization task in the task registry.
As an alternative embodiment, the synchronization module includes: a cleaning submodule, configured to clean the target data and convert its format to obtain a target data file; and a storage submodule, configured to store the target data file in the intermediate storage medium.
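As an illustration of the cleaning and format conversion performed before storage, the sketch below trims whitespace, replaces missing values with empty strings, and serializes the records as delimited text. The cleaning rules, the `|` delimiter, and the function names are assumptions; the embodiment does not specify them. Writing the result to the intermediate storage medium is left out.

```python
import csv
import io


def clean_record(record):
    """Strip surrounding whitespace and map missing values to empty strings."""
    return ["" if value is None else str(value).strip() for value in record]


def to_target_data_file(records, delimiter="|"):
    """Return the cleaned records as delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for record in records:
        writer.writerow(clean_record(record))
    return buffer.getvalue()
```

Using the `csv` module rather than joining strings by hand means that a field containing the delimiter is quoted automatically, so the file can be parsed back without ambiguity.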
It should be noted that, reference may be made to the relevant description in embodiment 1 for alternative or preferred embodiments of this embodiment, and details are not described here again.
Example 3
According to an embodiment of the present invention, an embodiment of a computer-readable storage medium is provided. The computer-readable storage medium includes a stored program, wherein, when the program runs, the device in which the computer-readable storage medium is located is controlled to execute any one of the data processing methods described above.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: obtaining a synchronization task of a first database and a second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database; performing data configuration according to the synchronization task to obtain configuration information, wherein the configuration information is used for determining an execution strategy of the synchronization task; acquiring the target data from the first database according to the configuration information; and storing the target data in the intermediate storage medium and synchronizing the target data from the intermediate storage medium to the second database.
According to an embodiment of the present invention, an embodiment of a processor is also provided. The processor is configured to run a program, wherein the program, when run, executes any one of the data processing methods described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A data processing method, comprising:
obtaining a synchronization task of a first database and a second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database;
performing data configuration according to the synchronization task to obtain configuration information, wherein the configuration information is used for determining an execution strategy of the synchronization task;
acquiring the target data from the first database according to the configuration information;
storing the target data in an intermediate storage medium and synchronizing the target data from the intermediate storage medium to the second database.
2. The data processing method according to claim 1, wherein the configuration information includes a metadata table, and performing data configuration according to the synchronization task to obtain the configuration information comprises:
acquiring data source information and sub-database information in the synchronization task;
and performing data configuration according to the data source information and the sub-database information to obtain the metadata table.
3. The data processing method according to claim 1, wherein the configuration information includes a mapping table, and performing data configuration according to the synchronization task to obtain the configuration information comprises:
acquiring step information and data source information of the synchronization task, wherein the step information comprises subtasks for processing the target data;
and performing data configuration according to the step information and the data source information to obtain the mapping table.
4. The data processing method according to claim 3, wherein
the data source information comprises a synchronization direction from the first database to the second database, field information, and an acquisition mode of the target data;
the step information comprises any one or more of the following: a target data cleaning step, an index creation step, and a target table renaming or rebuilding step.
5. The data processing method according to claim 3, wherein after synchronizing the target data from the intermediate storage medium to the second database, the method further comprises:
judging whether each subtask is executed successfully;
determining that the synchronization task is executed successfully when all the subtasks are executed successfully;
and when any one of the subtasks fails to execute, continuing to execute the subtasks of the synchronization task from the failed subtask.
6. The data processing method according to claim 1, wherein the configuration information includes a task registry, and performing data configuration according to the synchronization task to obtain the configuration information comprises:
registering the synchronization task in the task registry.
7. The data processing method according to claim 1, wherein storing the target data in an intermediate storage medium comprises:
cleaning and converting the format of the target data to obtain a target data file;
and storing the target data file in the intermediate storage medium.
8. A data processing apparatus, comprising:
a task acquisition module, configured to acquire a synchronization task of a first database and a second database, wherein the synchronization task is used for synchronizing target data of the first database to the second database;
a configuration module, configured to perform data configuration according to the synchronization task to obtain configuration information;
a data acquisition module, configured to acquire the target data from the first database according to the configuration information;
and a synchronization module, configured to store the target data in an intermediate storage medium and synchronize the target data from the intermediate storage medium to the second database.
9. A computer-readable storage medium, comprising a stored program, wherein when the program runs, the computer-readable storage medium controls an apparatus to execute the data processing method according to any one of claims 1 to 7.
10. A processor, configured to execute a program, wherein the program executes to perform the data processing method according to any one of claims 1 to 7.
CN202110559412.9A 2021-05-21 2021-05-21 Data processing method and device, computer readable storage medium and processor Pending CN113297326A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110559412.9A CN113297326A (en) 2021-05-21 2021-05-21 Data processing method and device, computer readable storage medium and processor

Publications (1)

Publication Number Publication Date
CN113297326A true CN113297326A (en) 2021-08-24

Family

ID=77323745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110559412.9A Pending CN113297326A (en) 2021-05-21 2021-05-21 Data processing method and device, computer readable storage medium and processor

Country Status (1)

Country Link
CN (1) CN113297326A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150310082A1 (en) * 2014-04-24 2015-10-29 Luke Qing Han Hadoop olap engine
CN110362632A (en) * 2019-07-22 2019-10-22 无限极(中国)有限公司 A kind of method of data synchronization, device, equipment and computer readable storage medium
CN110737720A (en) * 2019-09-06 2020-01-31 苏宁云计算有限公司 DB2 database data synchronization method, device and system
CN111125065A (en) * 2019-12-24 2020-05-08 阳光人寿保险股份有限公司 Visual data synchronization method, system, terminal and computer readable storage medium
CN111324610A (en) * 2020-02-19 2020-06-23 深圳市融壹买信息科技有限公司 Data synchronization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination