CN116150236A - Data synchronization method and device, electronic equipment and computer readable storage medium - Google Patents

Info

Publication number
CN116150236A
CN116150236A CN202211228327.5A CN202211228327A CN116150236A CN 116150236 A CN116150236 A CN 116150236A CN 202211228327 A CN202211228327 A CN 202211228327A CN 116150236 A CN116150236 A CN 116150236A
Authority
CN
China
Prior art keywords
synchronization
data
synchronous
source
tool
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211228327.5A
Other languages
Chinese (zh)
Inventor
彭鹏
胡兵
林伟华
吴海英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a data synchronization method and apparatus, an electronic device, and a computer readable storage medium. The method includes: querying a source data source through a metadata query interface to obtain metadata, and obtaining target data to be synchronized from the metadata; generating and deploying, based on the tool type of a selected synchronization tool, a synchronization configuration file corresponding to the tool type; and synchronizing, by the synchronization tool, the target data to a target data source based on the synchronization configuration file. Embodiments of the disclosure can improve data synchronization efficiency.

Description

Data synchronization method and device, electronic equipment and computer readable storage medium
Technical Field
The disclosure relates to the field of computer technology, and in particular, to a data synchronization method and device, an electronic device, and a computer readable storage medium.
Background
With the rapid development of computer technology and the internet industry, the volume of data generated in various industries and in daily life, such as employee information, order records, and web page (Web) browsing records, keeps increasing. Because different databases have different functional characteristics, data can be managed in different respects through different databases. To make data management more efficient while keeping the data complete and consistent, the data needs to be synchronized between the different databases.
A common data synchronization method is to write Structured Query Language (SQL) statements when using a data synchronization tool, execute the SQL in a data source management tool to acquire the data to be synchronized, and write that data into a target storage medium in a specified manner. A scripting language can provide a certain degree of automation, but the development and testing workload remains large and data synchronization efficiency is low. How to perform data synchronization efficiently is therefore one of the active research problems in the field of data synchronization.
Disclosure of Invention
The disclosure provides a data synchronization method and device, electronic equipment and a computer readable storage medium, which can improve data synchronization efficiency.
In a first aspect, the present disclosure provides a data synchronization method, including: querying a source data source through a metadata query interface to obtain metadata, and obtaining target data to be synchronized from the metadata; generating and deploying, based on the tool type of a selected synchronization tool, a synchronization configuration file corresponding to the tool type; and synchronizing, by the synchronization tool, the target data to a target data source based on the synchronization configuration file.
In a second aspect, the present disclosure provides a data synchronization apparatus, including: a query module, configured to query a source data source through a metadata query interface to obtain metadata, and to obtain target data to be synchronized from the metadata; a configuration module, configured to generate and deploy, based on the tool type of a selected synchronization tool, a synchronization configuration file corresponding to the tool type; and a synchronization module, configured to synchronize, by the synchronization tool, the target data to a target data source based on the synchronization configuration file.
In a third aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, the one or more computer programs being executable by the at least one processor to enable the at least one processor to perform the data synchronization method described above.
In a fourth aspect, the present disclosure provides a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor/processing core, implements the data synchronization method described above.
In a fifth aspect, the present disclosure provides a computer program or a computer program product comprising a computer program storable in a computer readable storage medium, the computer program implementing the data synchronization method as described above when executed by a processor.
According to the data synchronization method provided by the embodiment of the disclosure, metadata is obtained by querying through the metadata query interface, the target data to be synchronized is obtained from that metadata, and synchronization configuration files for different types of synchronization tools are generated automatically, so that different types of synchronization tools can be accommodated. The data synchronization process does not require a data acquisition script to be developed and tested, which helps to reduce the workload of data synchronization and improve data synchronization efficiency.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
FIG. 1 is a flowchart of a data synchronization method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a data synchronization system according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of a data synchronization apparatus according to an embodiment of the present disclosure;
FIG. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Terms such as "connected" or "coupled" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the related art, the data synchronization method includes the steps of: developing a data acquisition script according to the data synchronization requirement, and creating an acquisition configuration file corresponding to the data acquisition script according to the specification of an acquisition tool under the condition that the data acquisition script passes the script test; saving the collection configuration file to a designated directory of a server where the collection tool is located, and starting a collection task; in the process of executing the acquisition task, the output log of the task and the running result can be checked, and whether the acquisition abnormality exists or not is judged manually; if the acquisition abnormality exists, repairing the data acquisition script or repairing the configuration file, and restarting the task; if no abnormality exists, the data synchronization is completed.
In the data synchronization method of the related art, the workload of developing and testing the data acquisition script is large. Moreover, common data synchronization tools do not support automatic generation of standardized acquisition configuration files, so the data synchronization requirement has to be analyzed manually and the synchronization configuration file written manually, which makes data synchronization inefficient.
According to the data synchronization method disclosed by the embodiment of the disclosure, the synchronization configuration files of different types of synchronization tools can be automatically generated through metadata, so that the different types of synchronization tools are adapted; and moreover, development and test of a data acquisition script are not needed, so that the workload of data synchronization is reduced, and the data synchronization efficiency is improved.
The data synchronization method according to the embodiments of the present disclosure may be performed by an electronic device such as a terminal device or a server. The terminal device may be a vehicle-mounted device, a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a wearable device, etc., and the method may be implemented by a processor invoking computer readable program instructions stored in a memory. The server may be an independent physical server, a server cluster consisting of multiple servers, or a cloud server capable of cloud computing.
FIG. 1 is a flowchart of a data synchronization method provided in an embodiment of the disclosure. Referring to FIG. 1, the method includes the following steps:
S110, querying a source data source through a metadata query interface to obtain metadata, and obtaining target data to be synchronized from the metadata.
In some embodiments, the target data that needs to be synchronized is the data to be synchronized; the source data source refers to a data source for providing data to be synchronized, and the target data source is a data source which needs to be data synchronized with the source data source. Data synchronization is a data processing manner of synchronizing data to be synchronized from a source data source to a target data source. Metadata is data for managing data, and may be understood as data for describing a data structure.
In some embodiments, the types of source data sources include: a relational database management system (Relational Database Management System, RDBMS) type and a system interface type. In some scenarios, RDBMS-type data sources are also referred to as traditional data sources or database-type data sources. In other words, the types of source data sources include a database type and a system interface type.
In some embodiments, the metadata includes a data source name, entity information, and corresponding entity attribute information; step S110 may specifically include the following steps:
S11, calling the metadata query interface, querying the source data source by using the data source name, and acquiring a plurality of pieces of corresponding candidate entity information and a plurality of pieces of candidate entity attribute information from the source data source that is found.
As an example, the metadata query interface is called, the data source to be mapped (the source data source) is queried by using the data source name, and the corresponding entity information and entity attribute information are acquired from the source data source that is found; the entity information and entity attribute information obtained for the data source name are taken as candidate information.
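As a non-limiting illustration of step S11, the sketch below reads table names and field names from a database. It uses an in-memory SQLite database from the Python standard library as a stand-in for a source data source such as MySQL; the function name `query_metadata` and the sample tables are hypothetical and are not part of the disclosure.

```python
import sqlite3
from typing import Dict, List


def query_metadata(conn: sqlite3.Connection) -> Dict[str, List[str]]:
    """Return {table name: [field names]} for every table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    metadata: Dict[str, List[str]] = {}
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        metadata[table] = [col[1] for col in columns]
    return metadata


# Hypothetical source data source with two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no TEXT, emp_name TEXT)")
conn.execute("CREATE TABLE department (dept_no TEXT)")

candidates = query_metadata(conn)
```

Here the table names play the role of candidate entity information and the column names play the role of candidate entity attribute information.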
S12, displaying a plurality of candidate entity information and a plurality of candidate entity attribute information through a page; the page comprises a plurality of first control elements for marking a plurality of candidate entity information and a plurality of second control elements for marking a plurality of candidate entity attribute information, wherein one candidate entity information corresponds to one first control element, and one candidate entity attribute information corresponds to one second control element.
As an example, the first control element and the second control element may each be a selection-class control, such as a check box or a multi-select tab.
S13, responding to operation instructions for a plurality of first control elements and a plurality of second control elements, and marking candidate entity information of the operated first control elements and candidate entity attribute information corresponding to the operated second control elements as target data to be synchronized.
In this embodiment, the candidate information obtained by the query may be displayed on a Web page, and the Web page provides an information selection function through selection-class controls, so that entity information and entity attribute information can be selected from the displayed candidate information. The target data that needs to be synchronized (i.e., the data to be synchronized) is determined from the entity information and entity attribute information selected on the Web page.
In the embodiment of the disclosure, the entity information and the entity attribute information that need to be synchronized are selected from the metadata through a Web page, and the synchronization configuration file can be generated automatically according to the type of the synchronization tool. This reduces configuration difficulty and avoids the errors that easily arise when data query statements are written by hand, thereby reducing the error rate of data acquisition and improving its accuracy.
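The marking of target data in step S13 might be sketched as follows. This is a minimal illustration only: the function `mark_target_data`, the dictionary layout, and the sample names are assumptions, and the operated control elements are represented simply as sets of checked names rather than as real page controls.

```python
from typing import Dict, List, Set


def mark_target_data(
    candidates: Dict[str, List[str]],
    checked_entities: Set[str],
    checked_attributes: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Keep only the entities and attributes whose controls were operated."""
    target: Dict[str, List[str]] = {}
    for entity, attributes in candidates.items():
        if entity not in checked_entities:
            continue
        wanted = checked_attributes.get(entity, set())
        # Preserve the order in which the metadata query returned the attributes.
        target[entity] = [name for name in attributes if name in wanted]
    return target


# Hypothetical candidate information returned by the metadata query.
candidates = {
    "employee": ["emp_no", "emp_name", "age"],
    "department": ["dept_no", "dept_name"],
}

target = mark_target_data(
    candidates,
    checked_entities={"employee"},
    checked_attributes={"employee": {"emp_no", "emp_name"}},
)
```

Because the selection is made from names returned by the metadata query, no query statement has to be typed by hand at this stage.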
In the embodiment of the disclosure, metadata can be extracted from data sources of both the database type and the system interface type, and the information in the metadata can be abstracted and summarized into three layers: data source name, entity information, and entity attribute information. For example, an information model of the metadata is constructed from these three layers, so that the metadata is managed at the levels of data source name, entity, and attribute.
From the foregoing, the types of source data sources include database types and system interface types; if the type of the source data source is a database type, the data source name is a database name, the entity information is a table name, and the entity attribute information is a field name in the corresponding table. As an example, if the source data source is MySQL database, the data source name is a database name, the entity information is a table name, the entity attribute information is a table field, and the attribute value is a field content.
If the type of the source data source is the system interface type, the data source name is the interface name, the entity information is the request structure information and the response structure information of the system interface, and the entity attribute information is the field information in the corresponding request structure and the field information in the corresponding response structure. The field information includes a field name and a value of the field. As an example, if the source data source is a system interface, the data source name is an interface name, the entity information is a response structure and a request structure of the interface, and the entity attribute information is a field in the request structure and the response structure; the attribute value is the field content.
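The three-layer information model described above can be sketched with plain data classes. The class names, the `source_type` labels, and the sample values are hypothetical; the sketch only shows how a database-type source and a system-interface-type source fit the same three layers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntityAttribute:
    """Third layer: a table field, or a field of a request/response structure."""
    name: str
    attr_type: str = "string"  # hypothetical type label


@dataclass
class Entity:
    """Second layer: a table, or a request/response structure."""
    name: str
    attributes: List[EntityAttribute] = field(default_factory=list)


@dataclass
class DataSourceMetadata:
    """First layer: a database name or an interface name."""
    name: str
    source_type: str  # "database" or "system_interface" (assumed labels)
    entities: List[Entity] = field(default_factory=list)


# Database type: database name -> table name -> field name.
db_meta = DataSourceMetadata(
    name="hr_db",
    source_type="database",
    entities=[Entity("employee", [EntityAttribute("emp_no"), EntityAttribute("emp_name")])],
)

# System interface type: interface name -> request/response structure -> field.
api_meta = DataSourceMetadata(
    name="acquire user information",
    source_type="system_interface",
    entities=[
        Entity("request", [EntityAttribute("EmpNo")]),
        Entity("response", [EntityAttribute("EmpId"), EntityAttribute("BaseInfo", "object")]),
    ],
)
```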
In some embodiments, each entity attribute has a particular type. For example, when the type of the source data source is the database type, the attribute type may be the field type of a table field; when the type of the source data source is the system interface type, the attribute type may be the field type of a field contained in the request structure or the response structure. Illustratively, the field type may be at least one of the following types: binary type, character type, date and time type, number type, currency type, etc. It should be understood that the data types may be user-defined according to actual situations, and the embodiments of the present disclosure are not limited in this respect.
For ease of understanding, the specific content of system interface metadata in the case where the type of source data source is a system interface type is described below through table 1.
Table 1 details of system interface metadata
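[Table 1 appears as an image in the original publication (Figure BDA0003880970390000051); its contents are described in the following paragraphs.]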
In Table 1, the interface name of the system interface is "acquire user information", which indicates that the system interface is used to acquire user information. The request structure information may include, for example: position identification (Position Id), position number (Position No), employee number (EmpNo), employee name (empName), and department number (depthno). The response structure information may include, for example: employee number (empNo), employee identification (EmpId), and employee basic information (BaseInfo). The employee basic information includes at least one of the following information items: company mailbox (Company Email), age (ages), political affiliation (politics), years of professional experience (jobyes), job grade (jobygrades), highest academic qualification (maxedoudeloma), major of the highest academic qualification (maxedousprigy), avatar path (avatar), department identification (Departid), employee post name (personesection), marital status (Maritalstatus), gender (Gender), and labor relationship status (Laborrelstat).
Illustratively, the department information may also include more detailed department names, such as a third department name (Deptthrename), a fourth department name (DeptFourname), a fifth department name (Deptfivename), a sixth department name (Deptsixname), etc., affiliated with a certain headquarters, for example R&D Department One, R&D Department Two, ..., R&D Department Six.
In table 1, when the type of the source data source is the system interface type, the data source name is the interface name, the entity information is the request structure information and the response structure information of the system interface, and the entity attribute information is the field information in the corresponding request structure and the field information in the corresponding response structure; the related attribute information may be, for example, interface link information.
It should be understood that the specific contents of the request structure information and the response structure information in table 1 are only schematic illustrations, and the specific contents may be determined according to actual situations. In addition, the request structure information and the response structure information of the system interface may further include more different information items, and specifically may be set in a user-defined manner according to actual needs, which is not specifically limited in the embodiments of the present disclosure.
The data synchronization method of the embodiment of the disclosure supports not only data synchronization in which the source data source is a traditional database, but also data synchronization in which the source data source is a system interface. In the related art, data synchronization tools support only database-type data synchronization and not interface-type data synchronization. By supporting the interface type as well, the data synchronization method of the present disclosure expands the application scenarios of data synchronization, so that the data synchronization requirements of more types of data sources can be satisfied.
In embodiments of the present disclosure, different types of source data sources may have respective unique attributes; in some embodiments, metadata further includes: related attribute information corresponding to the data source type. In the case where the source data source type is a database type, the related attribute information includes at least one of the following information items: database link information and database login information; in the case that the source data source type is a system interface type, the related attribute information includes at least one of the following information items: interface link information and interface configuration information.
Illustratively, when the source data source is a MySQL database, the relevant attribute information includes a database link (Uniform Resource Locator, URL) and login information such as a user name (username) and a password (password) required for logging in the database; when the source data source is a system interface, the related attribute information can include interface configuration such as timeout control, flow control restriction and the like; it should be understood that, related attribute information of different types of source data sources may be set in a customized manner according to actual situations, and embodiments of the present disclosure are not limited in particular.
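As a minimal sketch of how type-specific related attribute information might be kept apart, the snippet below checks a set of related attributes against the keys defined for each source type. The key names (`url`, `username`, `password`, `timeout_seconds`, `rate_limit_per_second`) are hypothetical; the disclosure names only the kinds of information, not concrete keys.

```python
from typing import Dict, Set

# Assumed key names per source type, for illustration only.
RELATED_ATTRIBUTE_KEYS: Dict[str, Set[str]] = {
    "database": {"url", "username", "password"},
    "system_interface": {"url", "timeout_seconds", "rate_limit_per_second"},
}


def unexpected_related_attributes(source_type: str, attrs: Dict[str, object]) -> Set[str]:
    """Return the keys in attrs that are not defined for the given source type."""
    allowed = RELATED_ATTRIBUTE_KEYS[source_type]
    return set(attrs) - allowed


# A database-type source should not carry interface configuration.
extra = unexpected_related_attributes(
    "database",
    {"url": "jdbc:mysql://localhost:3306/hr_db", "username": "sync", "timeout_seconds": 5},
)
```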
In the data synchronization method of the embodiment of the disclosure, for different types of source data sources, the information in the metadata is divided into the three layers of data source name, entity information, and entity attribute information and is managed at those three levels. In addition, the attribute information specific to each type of source data source can also be managed, so that data synchronization of diversified metadata from different types of source data sources is realized.
S120, based on the tool type of the selected synchronous tool, generating and deploying a synchronous configuration file corresponding to the tool type.
In some embodiments, the data synchronization method of embodiments of the present disclosure may be adapted to a variety of synchronization tools. The tool type in this step is used to indicate the different synchronization tools. By way of example, synchronization tools include, but are not limited to, at least one of the following types: the offline data synchronization tool/platform DataX, the distributed offline and real-time data synchronization tool FlinkX, the change data capture component based on the Flink distributed computing engine (Flink Change Data Capture, FlinkCDC), and the data transfer tool Sqoop.
DataX is an open-source offline synchronization tool for heterogeneous data sources, and can realize stable and efficient data synchronization among various heterogeneous data sources such as relational databases (MySQL, Oracle, and the like), the Hadoop Distributed File System (HDFS), the data warehouse tool Hive, the Open Data Processing Service (ODPS), the column-oriented distributed database HBase, and File Transfer Protocol (FTP) services. FlinkX is a distributed offline/real-time data synchronization plug-in based on Flink, and can realize efficient data synchronization among various heterogeneous data sources. FlinkCDC is a source component capable of directly reading full data and incremental change data from a relational database; through Change Data Capture (CDC) technology, it synchronizes the incremental change records of a source database (Source) to one or more destinations. Sqoop is an Apache data transfer tool used for transferring data between Hadoop or Hive and traditional relational databases. For example, data in a relational database may be imported into the HDFS of Hadoop by Sqoop, or data in the HDFS may be exported to a relational database.
It should be understood that the synchronization tools in embodiments of the present disclosure may also be of other different types; the specific selection and adjustment can be selected according to actual needs, and the embodiments of the present disclosure are not particularly limited.
In some embodiments, step S120 may specifically include the following steps.
S21, acquiring the selected index information of the target data source, calling a metadata query interface, and querying entity attribute information corresponding to the index information from metadata by using the index information.
The target data sources in embodiments of the present disclosure may also be of various types; such as a relational database, interface, or file management system, etc. Illustratively, the target data source may be a distributed, open source search and analysis engine (ElasticSearch, ES), the ES may be used to collect, store, search, analyze and visually manage data, and the ES may be adapted for use with various types of data, including text, digital, structured data, and unstructured data.
In some embodiments, taking an ES target data source as an example, a document in the ES is the smallest unit of searchable data, similar to a row record in a table of a relational database. Each document in the ES has a unique identifier, which can be set by a user or generated automatically by the system. The index information is used for indexing entity attribute information in the target data source, and a fast full-text search can be performed through the index of the ES so as to obtain the entity attribute information in the ES corresponding to the index. Compared with database-type and interface-type data sources, the ES is better suited to high-performance, relevance-based full-text retrieval; therefore, in some application scenarios of the embodiments of the present disclosure, data needs to be synchronized to the ES as the target data source.
S22, determining a one-to-one correspondence relationship between entity attribute information in the metadata and entity attribute information corresponding to the index information.
In this step, when data synchronization is performed, it is necessary to write the value of the entity attribute information in the metadata of the data source (for example, the value of the table field in the database) into the entity attribute information (for example, the corresponding field information) under certain index information of the target data source, and therefore, when data synchronization is performed between the source data source and the target data source, it is necessary to determine the one-to-one correspondence between the entity attribute information in the metadata (entity attribute information in the source data source) and the entity attribute information corresponding to the index information (entity attribute information in the target data source).
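One way to picture the one-to-one correspondence of step S22 is a small mapping builder that rejects any mapping in which a source field has no target field or two source fields share a target field. This is a sketch under assumptions: the function `build_field_mapping`, the `renames` argument, and the field names are hypothetical.

```python
from typing import Dict, List, Optional


def build_field_mapping(
    source_fields: List[str],
    target_fields: List[str],
    renames: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Map each source field to exactly one target field, or raise ValueError."""
    renames = renames or {}
    available = set(target_fields)
    used = set()
    mapping: Dict[str, str] = {}
    for src in source_fields:
        # Fall back to the same name when no explicit rename is given.
        dst = renames.get(src, src)
        if dst not in available:
            raise ValueError(f"no target field for source field {src!r}")
        if dst in used:
            raise ValueError(f"target field {dst!r} is mapped more than once")
        used.add(dst)
        mapping[src] = dst
    return mapping


# Source table fields mapped to fields under an index of the target data source.
mapping = build_field_mapping(
    ["emp_no", "emp_name"],
    ["empNo", "empName"],
    renames={"emp_no": "empNo", "emp_name": "empName"},
)
```

Rejecting duplicates at this point keeps two source fields from silently overwriting the same target field during synchronization.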
S23, according to the tool type of the synchronous tool, acquiring synchronous configuration parameters corresponding to the tool type.
In this step, synchronization configuration parameters for different types of synchronization tools may be generated. This may be implemented through a script language, i.e., a synchronization configuration script may be obtained for each type of synchronization tool. A script is an executable file written in a particular descriptive language according to a given format, and is also called a macro or a batch file. A script can typically be invoked and executed on demand by an application.
S24, generating a synchronous configuration file corresponding to the tool type according to the one-to-one correspondence and the synchronous configuration parameters, wherein the synchronous configuration file is used for recording by the synchronous configuration file: and the entity attribute information in the target data and the entity attribute information corresponding to the index information are in synchronous corresponding relation.
In this embodiment, compared with the related art, the data synchronization tool does not support automatic generation of the standardized acquisition configuration file, and can only manually analyze the data or manually write the configuration file according to the specification of the acquisition tool; the data synchronization method in the embodiment of the disclosure can automatically generate the synchronization configuration information conforming to the selected synchronization tool specification according to specific different types of synchronization tools.
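As a rough sketch of how a configuration file could be generated from the field correspondence and tool-specific parameters: the JSON layout, the parameter names (`channel`, `parallelism`), and the function `generate_sync_config` below are illustrative assumptions, not the actual configuration schema of DataX, FlinkX, or any other tool.

```python
import json

# Hypothetical tool-specific parameters; real tools define their own names and schema.
TOOL_PARAMETERS = {
    "datax": {"channel": 3},
    "flinkx": {"parallelism": 2},
}


def generate_sync_config(tool_type, field_mapping, index_name):
    """Render a configuration document for the given tool type as a JSON string."""
    if tool_type not in TOOL_PARAMETERS:
        raise ValueError(f"unsupported synchronization tool type: {tool_type}")
    config = {
        "tool": tool_type,
        "parameters": TOOL_PARAMETERS[tool_type],
        "target_index": index_name,
        # One entry per source/target attribute pair.
        "columns": [{"source": s, "target": t} for s, t in field_mapping.items()],
    }
    return json.dumps(config, indent=2)


config_text = generate_sync_config("datax", {"user_id": "id"}, "user_index")
```

Keeping the tool-specific parameters in a lookup table means a new tool type can be supported by adding one entry rather than writing a new script by hand.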
In the embodiment of the disclosure, after the target data to be synchronized is obtained from the metadata, a corresponding synchronization configuration file may be generated according to the type of the synchronization tool, and the synchronization configuration file may be deployed.
In some embodiments, after step S24, the method further includes: S25, deploying the synchronous configuration file to a designated file path; wherein the synchronization tool has access rights to the designated file path.
In this step, after the configuration file is generated, the configuration file may be deployed under a directory that may be accessed by the synchronization tool, so that, during data synchronization, the synchronization tool may call the synchronization configuration file, and synchronize the target data according to the corresponding relationship between the entity attribute information in the source data source and the entity attribute information in the target data source recorded by the synchronization configuration file, and the synchronization configuration parameter recorded by the synchronization configuration file.
In the embodiment of the disclosure, the synchronous configuration files are deployed under the path which can be accessed by the selected synchronous tool, so that unified management of the synchronous configuration information is realized, automatic generation and unified deployment of the synchronous configuration files are realized, the generation efficiency of the data synchronous information is improved, and the automatic management of the synchronous configuration information and the data synchronous efficiency are facilitated.
In some embodiments, in step S25, the step of deploying the synchronization configuration file to the designated file path may specifically include: mounting the synchronization configuration file to a local file directory in a file sharing mode, so that the synchronization configuration file can be accessed locally in a shared manner. In this embodiment, the target data may be synchronized to the target data source based on the synchronization configuration file by a synchronization tool. Specifically, this step may include: starting a plurality of pre-created synchronization instances of the synchronization tool, each synchronization instance being used for synchronizing target data; the plurality of synchronization instances access the synchronization configuration file through the local share, and the target data is synchronized to the target data source by the synchronization instances based on the synchronization configuration file.
In this embodiment, the shared path may be implemented by a shared mount. The shared path may be, for example, the mount directory of the designated file path. The synchronization configuration file is stored in the designated file path using network file system (Network File System, NFS) shared storage and is then mounted to a local device by shared mounting. The local device runs a plurality of instances of the synchronization tool, so that the instances all read the same synchronization configuration file to perform data synchronization, which improves data synchronization efficiency.
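The idea of several synchronization instances reading one shared configuration file can be sketched as follows. This is only an illustration under stated assumptions: a temporary directory stands in for the NFS mount point, threads stand in for separate tool instances, and the file name `sync_job.json` and the function `run_instance` are hypothetical.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_instance(instance_id, config_path):
    """Simulate one synchronization instance loading the shared configuration file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return instance_id, config["target_index"]


# A temporary directory stands in for the shared mount directory.
with tempfile.TemporaryDirectory() as shared_dir:
    config_path = Path(shared_dir) / "sync_job.json"
    config_path.write_text(json.dumps({"target_index": "user_index"}), encoding="utf-8")
    with ThreadPoolExecutor(max_workers=3) as pool:
        results = list(pool.map(lambda i: run_instance(i, config_path), range(3)))
```

Because every instance reads the same file, a change to the configuration only has to be deployed once.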
S130, synchronizing the target data to the target data source based on the synchronization configuration file through the synchronization tool.
In the embodiment of the disclosure, after metadata is obtained from a source data source through a metadata query interface, target data to be synchronized can be selected from the metadata, and corresponding synchronization configuration files can be generated and deployed according to the type of the selected synchronization tool, so that the synchronization tool is utilized and the synchronization configuration files are used for synchronizing the target data from the source data source to the target data source; according to the data synchronization method, the synchronization configuration files of different types of synchronization tools can be automatically generated through metadata, so that different types of synchronization tools can be adapted; and moreover, development and test of a data acquisition script are not needed, so that the workload of data synchronization is reduced, and the data synchronization efficiency is improved.
In some embodiments, step S130 may specifically include the following steps.
S31, generating synchronous tasks according to the received task scheduling expression.
As an example, the task scheduling expression may be a scheduled-task (cron) expression. A scheduled task is work that is carried out at an agreed time. As an example, starting data synchronization at 3 a.m. every day, which is a scheduled task, may be agreed in the synchronization configuration file. A cron expression indicates when a task is to be executed and is typically used to configure the trigger time of a scheduled task. The cron expression may be a string of six or seven sub-expressions (fields) separated by spaces. For example, the fields of a typical cron expression are, in order: seconds, minutes, hours, day of month, month, day of week, and (optionally) year. The scheduled task can be created from the content of the received cron scheduling expression.
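To make the field structure concrete, the following minimal sketch splits a cron expression into named fields. It assumes the Quartz-style field order (seconds first, optional year last); the function name `split_cron` is hypothetical, and the sketch does not validate the contents of each field.

```python
# Quartz-style field order (assumed); the seventh field, year, is optional.
QUARTZ_FIELDS = ["second", "minute", "hour", "day_of_month", "month", "day_of_week", "year"]


def split_cron(expression):
    """Split a six- or seven-field cron expression into a dict of named fields."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError("expected six or seven space-separated fields")
    return dict(zip(QUARTZ_FIELDS, parts))


# Under the assumed field order, this expression fires at 03:00:00 every day.
fields = split_cron("0 0 3 * * ?")
```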
S32, configuring task scheduling information according to the type of the source data source and the tool type.
As an example, the configuration of task scheduling information may be performed according to different types of data sources and different types of synchronization tools, such that data synchronization tasks are scheduled based on the task scheduling information.
S33, scheduling the synchronous tasks based on the task scheduling information.
S34, synchronizing the target data to the target data source through the synchronization tool according to the synchronization configuration file under the condition that the scheduled synchronization task is in a starting state.
Through the steps S31-S34, the synchronous task is scheduled according to the configured task scheduling expression, and when the task scheduling starts, a local synchronous tool is started according to the configuration file to synchronize target data to a target data source so as to synchronize the target data to be synchronized in the source data source of the corresponding type to the target data source, thereby realizing the automatic scheduling of the data synchronous task.
In some embodiments, the step S32 may specifically include: S41, determining a task execution mode supported by the synchronous tool according to the tool type, and setting task execution parameters corresponding to the task execution mode according to the type of the source data source; S42, generating task scheduling information corresponding to the task execution parameters.
In this embodiment, if the synchronization tool supports a distributed task execution manner, a distributed task scheduling platform may be used to schedule the synchronization task. Illustratively, the distributed task scheduling platform can be XXL-JOB, which is open source and offers rich task management functions, high performance, high availability, and other characteristics; if the synchronization tool does not support the distributed task execution mode, the task execution mode based on data slicing can be adopted to schedule the synchronization task.
It should be appreciated that the task execution manners supported by the different types of the synchronization tool may be obtained in advance, so as to implement flexible scheduling of the synchronization task according to the task execution manners supported by the types of the synchronization tool.
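A pre-obtained table of supported execution manners could be consulted as in the sketch below. The table contents are assumptions made for illustration (the disclosure states only that the open source version of DataX does not support distributed execution); the names `DISTRIBUTED_SUPPORT` and `choose_execution_mode` are hypothetical.

```python
# Hypothetical capability table, obtained in advance for each tool type.
DISTRIBUTED_SUPPORT = {
    "datax": False,   # open source version: no distributed execution
    "flinkx": True,   # assumed for illustration only
}


def choose_execution_mode(tool_type):
    """Return the task execution mode to use for the given synchronization tool type."""
    if tool_type not in DISTRIBUTED_SUPPORT:
        raise ValueError(f"unknown synchronization tool type: {tool_type}")
    return "distributed" if DISTRIBUTED_SUPPORT[tool_type] else "data_slicing"
```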
In some embodiments, the task execution manner includes a data-slicing-based task execution manner; in step S41, the step of setting the task execution parameters corresponding to the task execution mode according to the type of the source data source may specifically include the following steps.
S51, taking the paging parameter as a task execution parameter based on a task execution mode of data slicing under the condition that the source data source type is a database type; the paging parameter is used for indicating that the target data is segmented in a paging mode.
Illustratively, if the synchronization tool, such as the open source version of DataX, does not support distributed execution, slicing is performed by configuring a slicing (sharding) parameter, which may include the primary key ID of the data; the data is sliced according to the primary key ID to determine the data range of each slice.
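Slicing by primary key ID can be sketched as splitting an inclusive ID range into contiguous sub-ranges. The function `split_id_range` is a hypothetical helper, and equal-width ranges are an assumption; they balance the work only when IDs are roughly evenly distributed.

```python
def split_id_range(min_id, max_id, slice_count):
    """Split the inclusive range [min_id, max_id] into at most slice_count contiguous ranges."""
    total = max_id - min_id + 1
    if total <= 0 or slice_count <= 0:
        raise ValueError("empty ID range or non-positive slice count")
    size = -(-total // slice_count)  # ceiling division
    return [(lo, min(lo + size - 1, max_id)) for lo in range(min_id, max_id + 1, size)]


slices = split_id_range(1, 10, 3)
```

Each resulting pair is an inclusive `(low, high)` bound that one synchronization instance could use to restrict the rows it reads.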
S52, taking the data line identification and the data line creation time as task execution parameters based on a task execution mode of data slicing under the condition that the source data source type is a system interface type; the data line identification and the data line creation time are used for indicating that the target data is segmented through the data line identification and the data line creation time.
For example, if the synchronization tool supports a task execution manner based on data slicing, and the source data source type is a system interface type, the slicing parameters may include a primary key ID of a data line and a data line creation time; and carrying out data segmentation according to the main key ID of each row of data and the data row creation time, thereby determining the data range of each piece of data.
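Slicing by data line creation time can be sketched in a similar way, by dividing a time span into equal half-open windows. The helper `split_time_range` and the choice of equal-width windows are assumptions for illustration; how the row identifier is combined with the creation time is not specified here and is left out of the sketch.

```python
from datetime import datetime


def split_time_range(start, end, slice_count):
    """Split [start, end) into slice_count equal half-open creation-time windows."""
    if end <= start or slice_count <= 0:
        raise ValueError("empty time range or non-positive slice count")
    step = (end - start) / slice_count
    bounds = [start + step * i for i in range(slice_count)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


windows = split_time_range(datetime(2022, 1, 1), datetime(2022, 1, 2), 4)
```

Half-open windows avoid reading a row twice when its creation time falls exactly on a boundary.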
In the embodiment of the disclosure, the scheduling task and the slicing parameters can be configured according to the task execution mode supported by the synchronization tool and the type of the source data source, so that automatic adaptation to different types of synchronization tools and different types of source data sources can be realized, and the flexibility of the data synchronization method is improved; moreover, custom slicing addresses the problem that the open source versions of some synchronization tools do not support distributed execution: such tools are usually stand-alone versions, and through data slicing a stand-alone synchronization tool can process data in a distributed manner, which improves synchronization performance.
In some embodiments, the synchronization configuration file is configured to record a synchronization correspondence between entity attribute information in the target data and entity attribute information in the target data source; in the case that the scheduled synchronization task is in a start state, the step of synchronizing, in step S34, the target data to the target data source according to the synchronization configuration file by the synchronization tool may specifically include: and under the condition that the scheduled synchronous task is in a starting state, generating and sending a synchronous command to the proxy server according to the type of the synchronous tool.
The proxy server is used for starting a synchronization tool according to the synchronization command so as to acquire the synchronization corresponding relation through the synchronization tool, and executing task scheduling information so as to write the target data into the target data source according to the synchronization corresponding relation.
In this embodiment, a proxy (Agent) server is used to provide Agent services. In some embodiments, an Agent program is installed on the server on which the synchronization tool is deployed to provide an Agent service. The data synchronization system of the embodiment of the disclosure can establish a connection and interact with the synchronization tool through the Agent service, so as to send control signals, log acquisition instructions and the like to the synchronization tool. In the embodiment of the disclosure, when a synchronous task is started, a task start command is sent to the Agent server; after receiving the task start command, the Agent server connects to the synchronization tool through the Agent service, and the synchronization tool acquires the synchronous configuration file, acquires target data according to the synchronous configuration file, and writes the target data into the target data source. The Agent-based task scheduling framework can improve the timeliness of task execution and reduce network load.
In some embodiments, the data synchronization method further comprises: S61, in the process of synchronizing target data to a target data source based on the synchronization configuration file, log acquisition is carried out on the synchronization process through a proxy server; and S62, the collected log content is sent to a data interface of the synchronization tool, so that page display is carried out on the log content through the synchronization tool.
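Generating a start command according to the tool type might look like the sketch below. The command templates, paths, and the function `build_sync_command` are illustrative assumptions and should not be read as the documented command-line interfaces of the tools; the sketch only builds an argument list and does not execute anything.

```python
# Hypothetical command templates keyed by tool type (not the tools' documented CLIs).
COMMAND_TEMPLATES = {
    "datax": ["python", "{tool_home}/bin/datax.py", "{config_path}"],
    "flinkx": ["{tool_home}/bin/start_sync.sh", "--job", "{config_path}"],
}


def build_sync_command(tool_type, tool_home, config_path):
    """Build the argument list the Agent would run to start the synchronization tool."""
    if tool_type not in COMMAND_TEMPLATES:
        raise ValueError(f"unsupported synchronization tool type: {tool_type}")
    return [
        part.format(tool_home=tool_home, config_path=config_path)
        for part in COMMAND_TEMPLATES[tool_type]
    ]


command = build_sync_command("datax", "/opt/datax", "/mnt/sync/job.json")
```

Returning a list rather than a single string lets the Agent pass it to a process launcher without shell quoting issues.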
In the embodiment, the log file is monitored through the Agent service, and the log content can be uploaded to the synchronous platform in real time for log display. In the data synchronization process, the log is output to the page end, so that whether synchronization errors exist or not can be conveniently checked and judged, the retry or stop of the task can be automatically triggered according to a preset synchronization abnormality strategy and the error type, or the retry or stop of the task is manually triggered, thereby realizing the web management mode of the data synchronization process and realizing the functions of automatic retry, automatic scheduling and the like of the data synchronization process.
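A preset synchronization abnormality strategy keyed by error type could be sketched as follows. The error categories, the keyword matching, and the names `ABNORMALITY_POLICY`, `classify_error`, and `decide_action` are all hypothetical; a real policy would depend on the log format of the synchronization tool in use.

```python
# Hypothetical policy: which action to take for each recognised error type.
ABNORMALITY_POLICY = {"timeout": "retry", "connection": "retry", "mapping": "stop"}


def classify_error(log_line):
    """Return an error type for an error line, or None if the line reports no error."""
    lower = log_line.lower()
    if "error" not in lower:
        return None
    for error_type in ABNORMALITY_POLICY:
        if error_type in lower:
            return error_type
    return "unknown"


def decide_action(log_line, attempts, max_retries=3):
    """Decide whether to continue, retry, or stop the task based on one log line."""
    error_type = classify_error(log_line)
    if error_type is None:
        return "continue"
    if ABNORMALITY_POLICY.get(error_type) == "retry" and attempts < max_retries:
        return "retry"
    return "stop"
```

Unrecognised errors fall through to `"stop"`, which leaves the decision to a manual retry from the page.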
According to the data synchronization method disclosed by the embodiment of the invention, the synchronization configuration files of the synchronization tools of different types can be automatically generated through the metadata, so that the synchronization tools of different types are adapted, development and test of data acquisition scripts are not needed, the workload of data synchronization is reduced, and the data synchronization efficiency is improved; the synchronization tool used in the data synchronization method of the embodiment of the disclosure can support data synchronization of an interface type, thereby being beneficial to expanding application scenes of data synchronization and meeting data synchronization requirements of more data sources of different types; the synchronous tool supports automatic generation of synchronous configuration files corresponding to the types of the synchronous tools, so that writing and management of standardized acquisition configuration files of different types of synchronous tools are facilitated; and the data synchronization method of the embodiment of the disclosure can support a web management mode, and automatic retry and automatic scheduling can be realized in a data synchronization flow, thereby being beneficial to improving the efficiency and performance of the data synchronization method.
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; owing to space limitations, these combinations are not described in detail in the present disclosure. It will be appreciated by those skilled in the art that in the above-described methods of the embodiments, the particular order of execution of the steps should be determined by their function and possible inherent logic.
Based on the above-mentioned data synchronization method, the embodiment of the present application provides a data synchronization system, and referring to fig. 2, a schematic frame diagram of the data synchronization system according to an exemplary embodiment of the present disclosure is shown.
In the embodiment of the present disclosure, the data synchronization system may be implemented with a development framework (Spring Boot). Spring Boot is a development framework for Browser/Server (B/S) applications, is widely used in the Java technology stack, and can simplify the Web service development flow.
In fig. 2, the architecture of the data synchronization system includes a metadata management subsystem 10 and a synchronization platform subsystem 20.
As shown in fig. 2, the metadata management subsystem 10 includes a database metadata management module 11, an interface type metadata management module 12, and a metadata acquisition module 13; included in the synchronization platform subsystem 20 are: a map configuration module 21, a map file creation module 22, a file deployment module 23, a task scheduling module 24, a task initiation module 25, and a viewing module 26.
In some embodiments, the metadata management subsystem 10 is used to manage metadata in one or more types of source data sources that are preset. The types of source data sources include, for example, database types and system interface types; the database metadata management module 11 is used for managing metadata in a source data source of a database type, and the interface metadata management module 12 is used for managing metadata in a source data source of a system interface type.
Illustratively, the metadata includes a data source name, entity information, and corresponding entity attribute information. For a source data source of a database Type, the data source name is a database name, the entity information is a table (table) name, the corresponding entity attribute information is a field (Column) name, and the attribute Type is a field Type (Type); for a source data source of a system interface Type, the data source name is an interface name, an entity is a response structure and a request structure of the interface, the entity attribute information is field information in a corresponding request structure and field (field) information in a corresponding response structure, and the attribute Type is a field Type (Type). In some embodiments, view Objects (VOs) may be used for data transfer between business layers, and interface names of the business layers may be represented by names of the corresponding view objects.
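The metadata layout described above can be modeled with a few small data classes. This is a sketch only: the class names, attribute names, and sample values are hypothetical, and the same three-level structure (data source, entity, entity attribute) is assumed for both source types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str   # field (column) name
    type: str   # field type


@dataclass
class Entity:
    name: str   # table name, or request/response structure name
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class DataSourceMetadata:
    source_type: str   # "database" or "system_interface"
    name: str          # database name or interface name
    entities: List[Entity] = field(default_factory=list)


# Hypothetical database-type source: one table with two columns.
db_meta = DataSourceMetadata(
    "database", "user_db",
    [Entity("t_user", [Attribute("id", "bigint"), Attribute("name", "varchar")])],
)
# Hypothetical interface-type source: request and response structures as entities.
api_meta = DataSourceMetadata(
    "system_interface", "UserInfoVO",
    [Entity("request", [Attribute("userId", "long")]),
     Entity("response", [Attribute("userName", "string")])],
)
```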
The data synchronization method of the exemplary embodiment of the present disclosure is described below in conjunction with each module in the data synchronization system shown in fig. 2. In some embodiments, the data synchronization method may include the following steps.
S201, acquiring metadata.
Referring to fig. 2, a metadata collection module 13 may be used to perform metadata collection. The embodiment of the disclosure can provide two modes of automatic acquisition and manual input for acquiring and maintaining metadata, for example, acquiring metadata from the database metadata management module 11 in an automatic acquisition mode; alternatively, metadata may be obtained by means of manual entry for at least one of a database type source data source and a system interface type source data source.
In some embodiments, the acquired metadata is stored in a predetermined storage location of the metadata management subsystem, and a query function for the metadata is provided externally through a metadata query interface.
S202, establishing a corresponding relation of attribute information between a source data source and a target data source.
Referring to fig. 2, the mapping configuration module 21 is configured to obtain a correspondence relationship of attribute information between a source data source and a target data source. In some embodiments, the mapping configuration function is implemented by the synchronization platform subsystem 20 to create a correspondence (also referred to as a mapping) of attribute information between the source data source and the target data source. The mapping configuration function is configured according to the following configuration steps.
Specifically, the configuring step may include: selecting a data source to be configured, calling the metadata query interface, and querying the data source to be mapped by its system interface name, wherein in the current embodiment the data source is a system interface used for acquiring user information; the system interface returns response structure information, and all fields of the response structure returned by the system interface are displayed on a page; the page provides a field selection function, for example, specific fields to be synchronized can be selected through check boxes, and all fields of the response structure are selected by default; in response to a selection operation on the page, the selected field information is acquired as attribute information of the source data source, which can also be called data source fields; a target ES data source is then selected, an index of the ES data source is selected, the metadata query interface is called to query the attribute list under the index, and the attribute information in the attribute list under the index is displayed on the page, wherein this attribute information can also be called target fields. In some embodiments, the list of data source fields is kept consistent with the list of target fields to ensure a one-to-one correspondence between the fields.
S203, a synchronous configuration file is created.
Referring to fig. 2, the mapping file creating module 22 is configured to generate a synchronization configuration file according to a correspondence relationship of attribute information between a source data source and a target data source and a synchronization configuration parameter corresponding to a synchronization tool type.
In some embodiments, after the specific fields to be synchronized are determined, a type of synchronization tool is selected, such as DataX or FlinkX, and after the synchronization tool is selected, a synchronization configuration file (also referred to as a mapping file) corresponding to the tool type may be created; meanwhile, configuration specific to the type of the synchronization tool can be performed, for example, if the synchronization tool is DataX, the DataX-specific configuration may include parameters such as rate-limiting parameters and global concurrency. In some embodiments, the created configuration file may be in the form of a script file; the script file of the synchronization configuration file may be persisted locally on the device for storage and management, which facilitates script migration.
S204, deploying a synchronous configuration file.
Referring to FIG. 2, a file deployment module 23 is used to deploy a synchronization configuration file to a specified file path; wherein the synchronization tool has access to the specified file path.
In some embodiments, after the synchronization configuration file is generated, the synchronization configuration file needs to be deployed under a directory that can be accessed by the synchronization tool, and in the embodiments of the present disclosure, the synchronization configuration file may be shared and mounted in an NFS manner, so that multiple instances of the synchronization tool can share and access the configuration file, and by starting multiple synchronization instances, synchronization efficiency is improved.
S205, configuring a scheduling task and task execution parameters.
Referring to fig. 2, a task scheduling module 24 is used to perform configuration scheduling tasks and to configure the slicing parameters.
In some embodiments, the scheduling task is implemented using the open source distributed task scheduling platform XXL-JOB, and the scheduling task is created from a configured cron scheduling expression. When configuring the task execution parameters, as an example, if the open source version of the synchronization tool (e.g., DataX) does not support distributed execution, the target data to be synchronized is sliced by configuring slicing parameters. As a specific example, if the type of the source data source is a system interface type (i.e., an interface type data source), the slicing may be performed by paging; if the type of source data source is a database type (e.g., a relational data source), then the slicing may be performed by a data line identification and a data line creation time.
S206, starting a data synchronization task.
Referring to fig. 2, a task initiation module 25 is used to initiate a data synchronization task.
In some embodiments, the scheduling of the data synchronization task may be performed according to the configured cron expression, when the scheduled data synchronization task is started, an execution command is distributed to the proxy server according to the tool type of the selected synchronization tool, after the proxy server receives the execution command, the local synchronization tool may be started according to the configuration file, the target data to be synchronized may be obtained from the metadata, and the obtained target data may be written into the target data source.
S207, checking the synchronization log and checking the synchronization result.
Referring to fig. 2, a view module 26 is used to view the synchronization log and view the synchronization results.
Specifically, the proxy server monitors the log file, uploads the monitored log content to the display page provided by the metadata management subsystem 10 in real time, and performs log display and checks the operation result of the data synchronization task through the display page.
By the data synchronization method, common synchronization tools of different types can be adapted, so that the differences between the types of synchronization tools are shielded; when the synchronous task is configured, synchronous configuration scripts for different types of synchronization tools can be generated according to the corresponding relation of attribute information between a source data source and a target data source and the synchronous configuration parameters corresponding to the type of the synchronization tool; when the scheduling task is configured, slicing of the target data to be synchronized can be supported, which improves the processing performance of data synchronization using the synchronization tool.
In the process of executing the acquisition task, the output log of the task and the running result can be checked, and whether the acquisition abnormality exists or not is judged manually; if the acquisition abnormality exists, repairing the data acquisition script or repairing the configuration file, and restarting the task; if no abnormality exists, the data synchronization is completed; therefore, the system can automatically judge the synchronous abnormal condition to carry out corresponding treatment; the script file of the synchronous configuration file can be subjected to persistence storage, and the management of the data synchronization history information is performed through the synchronization log; in the embodiment of the disclosure, the life cycle of the synchronous task can be managed through a page form, such as task creation, task configuration and scheduling, partition parameter configuration, synchronous log checking, synchronous result checking and the like, so that multidirectional management of the life cycle of the synchronous task is realized, and the management efficiency of the life cycle information of the synchronous task is improved.
In addition, the disclosure further provides a data synchronization device, an electronic device, and a computer readable storage medium, each of which may be used to implement any one of the data synchronization methods provided in the disclosure; for the corresponding technical schemes and descriptions, refer to the corresponding descriptions in the method part, which are not repeated here.
Based on the above data synchronization method and the data synchronization system embodiment, the embodiment of the present application provides a data synchronization device, and referring to fig. 3, a block diagram of the data synchronization device provided by the embodiment of the present disclosure is provided. Referring to fig. 3, an embodiment of the present disclosure provides a data synchronization apparatus 300 including:
a query module 310, configured to query and obtain metadata from a source data source through a metadata query interface, and obtain target data to be synchronized from the metadata;
a configuration module 320, configured to generate and deploy a synchronization configuration file corresponding to the tool type based on the tool type of the selected synchronization tool;
and a synchronization module 330, configured to synchronize, by the synchronization tool, the target data to a target data source based on the synchronization profile.
In some embodiments, the metadata includes a data source name, entity information, and corresponding entity attribute information; the query module 310 is specifically configured to: calling a metadata query interface, querying a source data source by using a data source name, and acquiring a plurality of corresponding candidate entity information and a plurality of candidate entity attribute information from the searched source data source; displaying a plurality of candidate entity information and a plurality of candidate entity attribute information through a page; the page comprises a plurality of first control elements for marking the plurality of candidate entity information and a plurality of second control elements for marking the plurality of candidate entity attribute information; one candidate entity information corresponds to one first control element, and one candidate entity attribute information corresponds to one second control element; responding to operation instructions aiming at a plurality of first control elements and a plurality of second control elements, marking candidate entity information of the operated first control elements and candidate entity attribute information corresponding to the operated second control elements as target data needing to be synchronized.
In some embodiments, the types of source data sources include database types and system interface types; if the type of the source data source is a database type, the data source name is a database name, the entity information is a table name, and the entity attribute information is a field name in a corresponding table; if the type of the source data source is the system interface type, the data source name is the interface name, the entity information is the request structure information and the response structure information of the system interface, and the entity attribute information is the field information in the corresponding request structure and the field information in the corresponding response structure.
In some embodiments, metadata further includes: related attribute information corresponding to the data source type; in the case where the source data source type is a database type, the related attribute information includes at least one of the following information items: database link information and database login information; in the case that the source data source type is a system interface type, the related attribute information includes at least one of the following information items: interface link information and interface configuration information.
In some embodiments, the configuration module 320 is specifically configured to: obtain the selected index information of the target data source, call the metadata query interface, and use the index information to query the metadata for the entity attribute information corresponding to the index information; determine a one-to-one correspondence between the entity attribute information in the metadata and the entity attribute information corresponding to the index information; obtain, according to the tool type of the synchronization tool, the synchronization configuration parameters corresponding to that tool type; and generate, according to the one-to-one correspondence and the synchronization configuration parameters, a synchronization configuration file corresponding to the tool type, where the synchronization configuration file records the synchronization correspondence between the entity attribute information in the target data and the entity attribute information corresponding to the index information.
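As a rough sketch of this configuration step, the code below pairs source fields with target fields and merges the result with per-tool parameters into a JSON document. The tool names (`tool_a`, `tool_b`), the parameter keys, the JSON layout, and the positional pairing rule are all assumptions made for illustration; the disclosure does not specify how the correspondence is determined or what format the file takes.

```python
import json
from typing import Dict, List

# Hypothetical per-tool synchronization parameters; real tools define their own.
TOOL_PARAMS: Dict[str, Dict[str, object]] = {
    "tool_a": {"batch_size": 500, "format": "json"},
    "tool_b": {"batch_size": 1000, "format": "json"},
}


def build_mapping(source_fields: List[str], target_fields: List[str]) -> Dict[str, str]:
    """Pair fields by position (an assumption) and check the result is one-to-one."""
    if len(source_fields) != len(target_fields):
        raise ValueError("field lists must have the same length")
    if len(set(source_fields)) != len(source_fields):
        raise ValueError("duplicate source field")
    if len(set(target_fields)) != len(target_fields):
        raise ValueError("duplicate target field")
    return dict(zip(source_fields, target_fields))


def generate_config(tool_type: str, mapping: Dict[str, str]) -> str:
    """Render a configuration document for the given tool type."""
    if tool_type not in TOOL_PARAMS:
        raise ValueError(f"unknown tool type: {tool_type}")
    config = {
        "tool_type": tool_type,
        "parameters": TOOL_PARAMS[tool_type],
        "field_mapping": mapping,
    }
    return json.dumps(config, indent=2, sort_keys=True)


mapping = build_mapping(["id", "amount"], ["order_id", "order_amount"])
config_text = generate_config("tool_a", mapping)
```

Because only `TOOL_PARAMS` varies by tool, supporting another synchronization tool in this sketch means adding one dictionary entry rather than writing a new collection script.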
In some embodiments, the data synchronization device further includes a file deployment module, specifically configured to deploy the synchronization configuration file to a designated file path, where the synchronization tool has access rights to the designated file path. The file deployment module is specifically configured to mount the synchronization configuration file to a local file directory by means of file sharing, so that the synchronization configuration file can be accessed locally in a shared manner.
In some embodiments, the synchronization module 330 is specifically configured to: start a plurality of pre-created synchronization instances of the synchronization tool, each synchronization instance being used to synchronize target data; and have the plurality of synchronization instances access the synchronization configuration file locally in a shared manner and synchronize the target data to the target data source based on the synchronization configuration file.
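The idea of several instances reading one shared configuration file can be sketched as follows. Threads stand in for synchronization instances and a temporary directory stands in for the shared local directory; the round-robin division of rows among instances and the file name `sync_config.json` are assumptions, since the text does not say how work is divided.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List


def run_instance(config_path: Path, rows: List[Dict]) -> List[Dict]:
    """One instance: read the shared config, then rename fields in its rows."""
    mapping = json.loads(Path(config_path).read_text())["field_mapping"]
    return [
        {mapping[key]: value for key, value in row.items() if key in mapping}
        for row in rows
    ]


def sync_with_instances(config_path: Path, rows: List[Dict], instance_count: int) -> List[Dict]:
    """Deal rows out round-robin and run one worker per instance."""
    shares = [rows[i::instance_count] for i in range(instance_count)]
    with ThreadPoolExecutor(max_workers=instance_count) as pool:
        parts = pool.map(run_instance, [config_path] * instance_count, shares)
        return [row for part in parts for row in part]


# Stand-in for the shared local directory that every instance can read.
with tempfile.TemporaryDirectory() as shared_dir:
    config_path = Path(shared_dir) / "sync_config.json"
    config_path.write_text(json.dumps({"field_mapping": {"id": "order_id"}}))
    source_rows = [{"id": n, "unmapped": True} for n in range(6)]
    written = sync_with_instances(config_path, source_rows, instance_count=3)
```

Every instance reads the same file, so a regenerated configuration is picked up by all instances without redeploying each one separately.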
In some embodiments, the synchronization module 330 is specifically configured to: generate a synchronization task according to a received task scheduling expression; configure task scheduling information according to the type of the source data source and the tool type; schedule the synchronization task based on the task scheduling information; and, when the scheduled synchronization task is in a started state, synchronize the target data to the target data source through the synchronization tool according to the synchronization configuration file.
In some embodiments, when configuring the task scheduling information according to the type of the source data source and the tool type, the synchronization module 330 is specifically configured to: determine, according to the tool type, a task execution mode supported by the synchronization tool; set, according to the type of the source data source, task execution parameters corresponding to the task execution mode; and generate task scheduling information corresponding to the task execution parameters.
In some embodiments, the task execution mode includes a task execution mode based on data slicing. When setting the task execution parameters corresponding to the task execution mode according to the type of the source data source, the synchronization module 330 is specifically configured to: when the source data source type is the database type, use a paging parameter as the task execution parameter of the data-slicing-based task execution mode, where the paging parameter indicates that the target data is split by paging; and, when the source data source type is the system interface type, use a data line identifier and a data line creation time as the task execution parameters of the data-slicing-based task execution mode, where the data line identifier and the data line creation time indicate that the target data is split by data line identifier and data line creation time.
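The two slicing styles can be contrasted in a short sketch. The parameter names (`offset`, `limit`, `after`) and the cursor-style reading of "data line identifier plus creation time" are assumptions; the disclosure only states which values serve as slicing parameters for each source type.

```python
from typing import Dict, Iterator, List


def paging_slices(total_rows: int, page_size: int) -> Iterator[Dict[str, int]]:
    """Database source: one paging parameter set per slice."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total_rows, page_size):
        yield {"offset": offset, "limit": min(page_size, total_rows - offset)}


def cursor_slices(rows: List[Dict], slice_size: int) -> Iterator[Dict]:
    """System interface source: slice by (creation time, line identifier).

    Each slice starts after the last (created_at, id) pair of the previous one.
    """
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    cursor = None
    for start in range(0, len(ordered), slice_size):
        chunk = ordered[start:start + slice_size]
        yield {"after": cursor, "limit": len(chunk)}
        cursor = (chunk[-1]["created_at"], chunk[-1]["id"])


pages = list(paging_slices(total_rows=25, page_size=10))
sample_rows = [
    {"id": 2, "created_at": "2022-10-08T10:00"},
    {"id": 1, "created_at": "2022-10-08T10:00"},
    {"id": 3, "created_at": "2022-10-08T11:00"},
]
cursors = list(cursor_slices(sample_rows, slice_size=2))
```

Paging suits a database that can address rows by offset, while an interface that only returns records in creation order is more naturally resumed from the last record seen.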
In some embodiments, the synchronization configuration file records a synchronization correspondence between the entity attribute information in the target data and the entity attribute information in the target data source. When synchronizing the target data to the target data source through the synchronization tool according to the synchronization configuration file while the scheduled synchronization task is in a started state, the synchronization module 330 is specifically configured to: when the scheduled synchronization task is in a started state, generate a synchronization command according to the type of the synchronization tool and send it to a proxy server, where the proxy server starts the synchronization tool according to the synchronization command, so that the synchronization tool obtains the synchronization correspondence and executes the task scheduling information to write the target data into the target data source according to the synchronization correspondence.
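A minimal sketch of generating a tool-specific command and handing it to a proxy is given below. The command templates, the tool names, and the `RecordingProxy` class are hypothetical stand-ins; they do not correspond to the command line of any real synchronization tool, and the proxy here only records what it receives.

```python
import shlex
from typing import Dict, List

# Hypothetical command templates keyed by tool type; not real tool CLIs.
COMMAND_TEMPLATES: Dict[str, str] = {
    "tool_a": "run-tool-a --config {config}",
    "tool_b": "tool-b sync -f {config}",
}


def build_sync_command(tool_type: str, config_path: str) -> List[str]:
    """Render the command for a tool type as an argument list."""
    template = COMMAND_TEMPLATES.get(tool_type)
    if template is None:
        raise ValueError(f"unknown tool type: {tool_type}")
    # Quote the path so that spaces survive the split into arguments.
    return shlex.split(template.format(config=shlex.quote(config_path)))


class RecordingProxy:
    """Stand-in for the proxy server: records commands instead of running them."""

    def __init__(self) -> None:
        self.received: List[List[str]] = []

    def send(self, command: List[str]) -> None:
        self.received.append(command)


def dispatch(task_started: bool, tool_type: str, config_path: str, proxy: RecordingProxy) -> bool:
    """Send a command only when the scheduled task is in the started state."""
    if not task_started:
        return False
    proxy.send(build_sync_command(tool_type, config_path))
    return True


proxy = RecordingProxy()
skipped = dispatch(False, "tool_a", "/shared/sync config.json", proxy)
sent = dispatch(True, "tool_a", "/shared/sync config.json", proxy)
```

Routing the command through a proxy keeps the scheduler independent of where each synchronization tool is installed.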
In some embodiments, the data synchronization device 300 further comprises: a log acquisition module, configured to collect logs of the synchronization process through the proxy server while the target data is being synchronized to the target data source based on the synchronization configuration file; and a log viewing module, configured to send the collected log content to a data interface of the synchronization tool, so that the log content is displayed on a page through the synchronization tool.
According to the data synchronization device of the embodiments of the disclosure, after metadata is obtained from a source data source through the metadata query interface, the target data to be synchronized can be selected from the metadata, and a corresponding synchronization configuration file can be generated and deployed according to the type of the selected synchronization tool, so that the synchronization tool synchronizes the target data from the source data source to the target data source using the synchronization configuration file. Because synchronization configuration files for different types of synchronization tools can be generated automatically from the metadata, the device can adapt to different types of synchronization tools. Moreover, no data acquisition script needs to be developed or tested, which reduces the workload of data synchronization and improves its efficiency.
Based on the data synchronization method embodiments and the data synchronization device embodiments above, an embodiment of the disclosure provides an electronic device. FIG. 4 is a block diagram of an electronic device provided in an embodiment of the disclosure. The electronic device includes: at least one processor 401; at least one memory 402; and one or more I/O interfaces 403 connected between the processor 401 and the memory 402; wherein the memory 402 stores one or more computer programs executable by the at least one processor 401, and the one or more computer programs are executed by the at least one processor 401 to enable the at least one processor 401 to perform the data synchronization method described above.
The disclosed embodiments also provide a computer readable storage medium having a computer program stored thereon, wherein the computer program when executed by a processor/processing core implements the data synchronization method described above. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.
The disclosed embodiments also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when executed in a processor of an electronic device, performs the above-described data synchronization method.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable Compact Disc Read Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of the computer readable program instructions; the electronic circuitry can then execute the computer readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (14)

1. A method of data synchronization, comprising:
inquiring and obtaining metadata from a source data source through a metadata inquiry interface, and obtaining target data to be synchronized from the metadata;
generating and deploying a synchronous configuration file corresponding to a tool type based on the tool type of the selected synchronous tool;
and synchronizing, by the synchronization tool, the target data to a target data source based on the synchronization profile.
2. The method of claim 1, wherein the metadata includes a data source name, entity information, and corresponding entity attribute information; the querying and obtaining metadata from a source data source through the metadata query interface, and obtaining target data to be synchronized from the metadata, comprises:
invoking the metadata query interface, querying the source data source by using a data source name, and acquiring a plurality of corresponding candidate entity information and a plurality of candidate entity attribute information from the searched source data source;
displaying the plurality of candidate entity information and the plurality of candidate entity attribute information through a page; the page comprises a plurality of first control elements for marking the candidate entity information and a plurality of second control elements for marking the candidate entity attribute information; one candidate entity information corresponds to one first control element, and one candidate entity attribute information corresponds to one second control element;
responding to operation instructions aiming at a plurality of first control elements and a plurality of second control elements, marking candidate entity information of the operated first control elements and candidate entity attribute information corresponding to the operated second control elements as target data needing to be synchronized.
3. The method of claim 2, wherein the types of source data sources include a database type and a system interface type;
if the type of the source data source is a database type, the data source name is a database name, the entity information is a table name, and the entity attribute information is a field name in a corresponding table;
if the type of the source data source is the system interface type, the data source name is an interface name, the entity information is request structure information and response structure information of the system interface, and the entity attribute information is field information in the corresponding request structure and field information in the corresponding response structure.
4. A method according to claim 3, wherein the metadata further comprises: related attribute information corresponding to the data source type;
in the case that the source data source type is a database type, the related attribute information includes at least one of the following information items: database link information and database login information;
in the case that the source data source type is a system interface type, the related attribute information includes at least one of the following information items: interface link information and interface configuration information.
5. The method of claim 1, wherein the generating and deploying a synchronization profile corresponding to the tool type based on the tool type of the selected synchronization tool comprises:
acquiring selected index information of a target data source, calling the metadata query interface, and querying entity attribute information corresponding to the index information from the metadata by using the index information;
determining a one-to-one correspondence between entity attribute information in the metadata and entity attribute information corresponding to the index information;
according to the tool type of the synchronous tool, acquiring synchronous configuration parameters corresponding to the tool type;
generating a synchronous configuration file corresponding to the tool type according to the one-to-one correspondence and the synchronous configuration parameters, wherein the synchronous configuration file is used for recording the synchronous correspondence between the entity attribute information in the target data and the entity attribute information corresponding to the index information.
6. The method of claim 5, wherein the method further comprises:
deploying the synchronous configuration file to a designated file path; wherein the synchronization tool has access rights to the specified file path;
the deploying the synchronization configuration file to a designated file path includes: mounting the synchronous configuration file to a local file directory in a file sharing mode, so as to be used for sharing access to the synchronous configuration file locally;
the synchronizing, by the synchronization tool, the target data to a target data source based on the synchronization profile, including:
starting a plurality of synchronization instances of the synchronization tool which are created in advance; each synchronization instance is used for synchronizing target data;
and locally sharing and accessing the synchronous configuration file according to the synchronous instances, and synchronizing the target data to a target data source through the synchronous instances based on the synchronous configuration file.
7. The method of claim 1, wherein synchronizing, by the synchronization tool, the target data to a target data source based on the synchronization profile, comprises:
generating a synchronous task according to the received task scheduling expression;
configuring task scheduling information according to the type of the source data source and the tool type;
scheduling the synchronous task based on the task scheduling information;
and under the condition that the scheduled synchronous task is in a starting state, synchronizing the target data to the target data source according to the synchronous configuration file through the synchronous tool.
8. The method of claim 7, wherein said configuring task scheduling information according to the type of the source data source and the tool type comprises:
determining a task execution mode supported by the synchronous tool according to the tool type, and setting task execution parameters corresponding to the task execution mode according to the type of the source data source;
and generating task scheduling information corresponding to the task execution parameters.
9. The method of claim 8, wherein the task execution mode comprises a data-slicing-based task execution mode; setting task execution parameters corresponding to the task execution mode according to the type of the source data source, including:
under the condition that the source data source type is a database type, the paging parameter is used as the task execution parameter of the task execution mode based on the data slicing; the paging parameter is used for indicating that target data are segmented in a paging mode;
under the condition that the source data source type is a system interface type, taking a data line identifier and data line creation time as task execution parameters of the task execution mode based on the data slicing; the data line identification and the data line creation time are used for indicating that the target data is segmented through the data line identification and the data line creation time.
10. The method according to claim 7, wherein the synchronization profile is used for recording a synchronization correspondence between entity attribute information in the target data and entity attribute information in the target data source;
and under the condition that the scheduled synchronous task is in a starting state, synchronizing the target data to the target data source according to the synchronous configuration file by the synchronous tool, wherein the method comprises the following steps:
generating and sending a synchronous command to a proxy server according to the type of the synchronous tool under the condition that the scheduled synchronous task is in a starting state;
the proxy server is used for starting the synchronous tool according to the synchronous command so as to acquire the synchronous corresponding relation through the synchronous tool, and executing the task scheduling information so as to write the target data into the target data source according to the synchronous corresponding relation.
11. The method according to claim 1, wherein the method further comprises:
in the process of synchronizing the target data to a target data source based on the synchronization configuration file, carrying out log acquisition on the synchronization process through a proxy server;
and sending the collected log content to a data interface of the synchronization tool so as to display the log content on a page through the synchronization tool.
12. A data synchronization device, comprising:
the query module is used for obtaining metadata from a source data source through a metadata query interface and obtaining target data to be synchronized from the metadata;
the configuration module is used for generating and deploying a synchronous configuration file corresponding to the tool type based on the tool type of the selected synchronous tool;
and the synchronization module is used for synchronizing the target data to a target data source based on the synchronization configuration file through the synchronization tool.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the data synchronization method of any one of claims 1-11.
14. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the data synchronization method according to any one of claims 1-11.
CN202211228327.5A 2022-10-08 2022-10-08 Data synchronization method and device, electronic equipment and computer readable storage medium Pending CN116150236A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211228327.5A CN116150236A (en) 2022-10-08 2022-10-08 Data synchronization method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211228327.5A CN116150236A (en) 2022-10-08 2022-10-08 Data synchronization method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN116150236A true CN116150236A (en) 2023-05-23

Family

ID=86360688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211228327.5A Pending CN116150236A (en) 2022-10-08 2022-10-08 Data synchronization method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN116150236A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116567007A (en) * 2023-07-10 2023-08-08 长江信达软件技术(武汉)有限责任公司 Task segmentation-based micro-service water conservancy data sharing and exchanging method
CN116567007B (en) * 2023-07-10 2023-10-13 长江信达软件技术(武汉)有限责任公司 Task segmentation-based micro-service water conservancy data sharing and exchanging method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination