CN111241116A - Data synchronization method and device based on big data and electronic equipment - Google Patents

Data synchronization method and device based on big data and electronic equipment

Publication number
CN111241116A
Authority
CN
China
Prior art keywords
data
incremental
target
relational database
logs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010054869.XA
Other languages
Chinese (zh)
Other versions
CN111241116B (en)
Inventor
邓锻炼
徐�明
王华松
吕健均
苏樟杰
颜文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Ocs Information Technology Co ltd
Original Assignee
Guangzhou Ocs Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Ocs Information Technology Co ltd filed Critical Guangzhou Ocs Information Technology Co ltd
Priority to CN202010054869.XA priority Critical patent/CN111241116B/en
Publication of CN111241116A publication Critical patent/CN111241116A/en
Application granted granted Critical
Publication of CN111241116B publication Critical patent/CN111241116B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a big data-based data synchronization method, device, and electronic equipment. The incremental log corresponding to each relational database is synchronized to an open source stream processing platform, so the service data in the relational databases does not need to be extracted repeatedly. Because the open source stream processing platform detects whether the incremental logs have been updated, update information is obtained as soon as an incremental log is updated, which ensures the timeliness of data synchronization based on the updated incremental logs. Furthermore, the first incremental data and second incremental data corresponding to the first target incremental log and the second target incremental log are pulled and converted within the open source stream processing platform, which ensures the timeliness of data format conversion and therefore of data synchronization. Because data synchronization is performed based on the incremental logs, intrusion into the relational databases is reduced and their performance is preserved.

Description

Data synchronization method and device based on big data and electronic equipment
Technical Field
The invention relates to the technical field of data synchronization, in particular to a data synchronization method and device based on big data and electronic equipment.
Background
With the development of big data, data synchronization has become a very important link in data processing. For example, a large amount of an enterprise's business data is stored in various business system databases, which in some cases need to be synchronized. Common data synchronization methods include the following:
(1) Each data user extracts the required service data from each service system database for synchronization during off-peak business hours. However, this method suffers from repeated extraction.
(2) Data is extracted from all service system databases by a unified data warehouse platform using Sqoop extraction for synchronization. However, the timeliness of this method is poor, generally T + 1.
(3) Changed business data is obtained from the business system database and synchronized based on triggers or timestamps. However, this method is highly intrusive to the business system database and degrades its performance.
Disclosure of Invention
In order to overcome at least the above-mentioned deficiencies in the prior art, an object of the present invention is to provide a method, an apparatus and an electronic device for data synchronization based on big data.
In a first aspect of the embodiments of the present invention, a data synchronization method based on big data is provided, which is applied to an electronic device, where the electronic device communicates with a plurality of relational databases, and the method at least includes:
reading the incremental logs corresponding to each relational database from each relational database;
synchronizing the read incremental logs corresponding to each relational database to an open source stream processing platform;
detecting whether an updated first target incremental log exists among all incremental logs in the open source stream processing platform, and when the first target incremental log exists among the incremental logs in the open source stream processing platform, pulling first incremental data of the first target incremental log from the relational database corresponding to the first target incremental log and pulling second incremental data from the relational database corresponding to a second target incremental log, the second target incremental log being an incremental log in the open source stream processing platform other than the first target incremental log;
converting a first data format of the first incremental data and a second data format of the second incremental data into a set data format in the open source stream processing platform to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data; wherein the first target incremental data and the second target incremental data are synchronized data;
and outputting the first target data and the second target data.
In an alternative embodiment, the converting, in the open source stream processing platform, a first data format of the first incremental data and a second data format of the second incremental data into a set data format includes:
allocating corresponding data format conversion thread flows to the first incremental data and each second incremental data in the open source stream processing platform;
and running each data format conversion thread flow to realize data format conversion of the first incremental data and the second incremental data.
In an alternative embodiment, the running each data format conversion thread flow to realize data format conversion of the first incremental data and the second incremental data includes:
counting the process percentage of each data format conversion thread flow;
when a first process percentage reaching a set percentage exists in all the counted process percentages, releasing a first thread resource corresponding to a data format conversion thread flow corresponding to the first process percentage;
and loading the released first thread resource to a data format conversion thread flow corresponding to a second process percentage in all process percentages, wherein the second process percentage is the minimum value in all process percentages.
In an alternative embodiment, the method further comprises:
when a third process percentage which reaches a set percentage exists in all the counted process percentages, releasing a third thread resource corresponding to a data format conversion thread flow corresponding to the third process percentage;
and loading the released third thread resource to a data format conversion thread flow corresponding to a fourth process percentage in the all process percentages, wherein the fourth process percentage is the minimum value in the all process percentages except the second process percentage.
In an alternative embodiment, the method further comprises:
for each relational database, detecting in real time whether abnormal data exists in the incremental log corresponding to the relational database, and sending greeting information to the relational database at a first set time interval;
detecting whether response information fed back by the relational database is received within a second set duration after the greeting information is sent;
if no response information fed back by the relational database is received within the second set duration after the greeting information is sent, determining that the relational database is abnormal and outputting first early warning information;
and if response information fed back by the relational database is received within the second set duration after the greeting information is sent and abnormal data exists in the incremental log corresponding to the relational database, determining that the relational database is abnormal and outputting second early warning information.
In an alternative embodiment, the outputting the first target data and the second target data includes:
determining a data receiver corresponding to the first target data and the second target data;
determining the data receiving authority of the data receiver according to the identification information of the data receiver;
and sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority.
In an alternative embodiment, the sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority includes:
determining a first permission level corresponding to the data receiving authority;
determining a second permission level corresponding to all data in the first target data and a third permission level corresponding to all data in the second target data;
sending, to the data receiver, the data in the first target data whose second permission level is less than or equal to the first permission level;
and sending, to the data receiver, the data in the second target data whose third permission level is less than or equal to the second permission level.
In a second aspect of the embodiments of the present invention, there is provided a data synchronization apparatus based on big data, which is applied to an electronic device, where the electronic device communicates with a plurality of relational databases, and the apparatus at least includes:
the reading module is used for reading the incremental logs corresponding to each relational database from each relational database;
the synchronization module is used for synchronizing the read incremental logs corresponding to each relational database to the open source stream processing platform;
a detecting module, configured to detect whether there is an updated first target incremental log in all incremental logs in the open source stream processing platform, and when there is the first target incremental log in the incremental logs in the open source stream processing platform, pull first incremental data of the first target incremental log in a relational database corresponding to the first target incremental log and pull second incremental data of a relational database corresponding to a second target incremental log in the open source stream processing platform except the first target incremental log;
a conversion module, configured to convert, in the open source stream processing platform, a first data format of the first incremental data and a second data format of the second incremental data into a set data format, so as to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data; wherein the first target incremental data and the second target incremental data are synchronized data;
and the output module is used for outputting the first target data and the second target data.
In a third aspect of the embodiments of the present invention, an electronic device is provided, which includes a processor, and a memory and a bus connected to the processor; wherein, the processor and the memory complete mutual communication through the bus; the processor is used for calling the program instructions in the memory to execute the big data-based data synchronization method.
In a fourth aspect of the embodiments of the present invention, a readable storage medium is provided, on which a program is stored, and the program, when executed by a processor, implements the big data based data synchronization method described above.
The big data-based data synchronization method, device, and electronic equipment provided by the embodiments of the present invention synchronize the incremental log corresponding to each relational database to the open source stream processing platform, so the service data in the relational databases does not need to be extracted repeatedly. Because the open source stream processing platform detects whether the incremental logs have been updated, update information is obtained as soon as an incremental log is updated, which ensures the timeliness of data synchronization based on the updated incremental logs. Further, within the open source stream processing platform, the first incremental data and second incremental data are pulled based on the first target incremental log and the second target incremental log and undergo data format conversion, which ensures the timeliness of data format conversion and thus of data synchronization. Because data synchronization is performed based on the incremental logs, intrusion into the relational databases is reduced and their performance is preserved.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a data synchronization method based on big data according to an embodiment of the present invention.
Fig. 2 is a functional block diagram of a data synchronization apparatus based on big data according to an embodiment of the present invention.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.
Reference numerals:
200-big data-based data synchronization apparatus; 201-reading module; 202-synchronization module; 203-detection module; 204-conversion module; 205-output module;
300-electronic device; 301-processor; 302-memory; 303-bus.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In order to better understand the technical solutions of the present invention, the following detailed descriptions of the technical solutions of the present invention are provided with the accompanying drawings and the specific embodiments, and it should be understood that the specific features in the embodiments and the examples of the present invention are the detailed descriptions of the technical solutions of the present invention, and are not limitations of the technical solutions of the present invention, and the technical features in the embodiments and the examples of the present invention may be combined with each other without conflict.
Fig. 1 is a flowchart of a big data-based data synchronization method according to an embodiment of the present invention. The method is applied to an electronic device that communicates with a plurality of relational databases. Optionally, in this embodiment, the relational database may be MySQL, Oracle, or the like. The method may include the following:
Step S21, reading the incremental log corresponding to each relational database from each relational database.
Step S22, synchronizing the read incremental logs corresponding to each relational database to the open source stream processing platform.
Step S23, detecting whether an updated first target incremental log exists among all the incremental logs in the open source stream processing platform, and when the first target incremental log exists among the incremental logs in the open source stream processing platform, pulling first incremental data of the first target incremental log from the relational database corresponding to the first target incremental log and pulling second incremental data from the relational database corresponding to a second target incremental log, the second target incremental log being an incremental log in the open source stream processing platform other than the first target incremental log.
Step S24, converting, in the open source stream processing platform, a first data format of the first incremental data and a second data format of the second incremental data into a set data format, so as to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data.
Step S25, outputting the first target data and the second target data.
In step S23, the first target incremental data and the second target incremental data are synchronized data.
In this embodiment, the open source stream processing platform may be Kafka.
In this embodiment, the incremental logs may be extracted using any of several log extraction schemes, such as Logstash, Flume, and Filebeat.
In this embodiment, the set data format may be the JSON format, and accordingly, the first target data and the second target data may be UMS data.
It can be understood that, through steps S21 to S25, the incremental log corresponding to each relational database is synchronized to the open source stream processing platform, so the service data in the relational databases does not need to be extracted repeatedly. Because the open source stream processing platform detects whether the incremental logs have been updated, update information is obtained as soon as an incremental log is updated, which ensures the timeliness of data synchronization based on the updated incremental logs.
Furthermore, the first incremental data and second incremental data are pulled based on the first target incremental log and the second target incremental log and converted within the open source stream processing platform, which ensures the timeliness of data format conversion and thus of data synchronization. Because data synchronization is performed based on the incremental logs, intrusion into the relational databases is reduced and their performance is preserved.
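As a rough illustration of steps S21 to S25, the following minimal Python sketch uses an in-memory class as a stand-in for the open source stream processing platform. The names `StreamPlatform`, `sync_logs`, `pull_updated`, and `to_set_format`, as well as the record fields and sample data, are hypothetical; they are not taken from the patent or from any Kafka client API. The sketch also simplifies step S23: it returns whatever is new for each database rather than distinguishing first and second target incremental logs.

```python
import json
from collections import defaultdict


class StreamPlatform:
    """In-memory stand-in for the stream processing platform (hypothetical)."""

    def __init__(self):
        self.logs = defaultdict(list)    # database name -> incremental log entries
        self.offsets = defaultdict(int)  # database name -> entries already consumed

    def sync_logs(self, db_name, entries):
        # S22: append the incremental log entries read from one database.
        self.logs[db_name].extend(entries)

    def pull_updated(self):
        # S23: return only the entries added since the last pull, per database.
        updated = {}
        for db_name, entries in self.logs.items():
            start = self.offsets[db_name]
            if start < len(entries):
                updated[db_name] = entries[start:]
                self.offsets[db_name] = len(entries)
        return updated


def to_set_format(db_name, entry):
    # S24: convert one incremental record to the set data format (JSON here).
    payload = {"source": db_name, "op": entry["op"], "row": entry["row"]}
    return json.dumps(payload, sort_keys=True)


platform = StreamPlatform()
# S21 + S22: entries read from two databases' incremental logs (made-up data).
platform.sync_logs("orders_db", [{"op": "insert", "row": {"id": 1}}])
platform.sync_logs("users_db", [{"op": "update", "row": {"id": 7}}])

# S23 to S25: pull what changed, convert it, and output it.
first_batch = {db: [to_set_format(db, e) for e in entries]
               for db, entries in platform.pull_updated().items()}
second_batch = platform.pull_updated()  # nothing new has been synchronized
print(first_batch)
```

Tracking a per-database offset is one simple way to avoid re-reading entries that were already converted; a real deployment would rely on the platform's own consumer position tracking instead.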
To ensure real-time data format conversion, the efficiency of converting the first data format of the first incremental data and the second data format of the second incremental data into the set data format must be improved. To this end, in an alternative embodiment, the converting in step S24 of the first data format of the first incremental data and the second data format of the second incremental data into the set data format in the open source stream processing platform may specifically include the following:
Step S241, allocating corresponding data format conversion thread flows to the first incremental data and each second incremental data in the open source stream processing platform.
Step S242, running each data format conversion thread flow to implement data format conversion on the first incremental data and each second incremental data.
It can be understood that, through steps S241 to S242, parallel operation of each data format conversion thread flow can be realized, so as to improve efficiency of converting the first data format of the first incremental data and the second data format of the second incremental data into the set data format, thereby ensuring real-time performance of data format conversion.
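The parallel conversion in steps S241 and S242 can be sketched with Python's standard `concurrent.futures.ThreadPoolExecutor`. This is only an illustration under assumptions: the `convert` function, the batch names, and the sample records are hypothetical, and one worker thread per batch is used here as a simple reading of "allocating corresponding data format conversion thread flows".

```python
import json
from concurrent.futures import ThreadPoolExecutor


def convert(batch):
    # Convert every record in one batch of incremental data to JSON text.
    return [json.dumps(record, sort_keys=True) for record in batch]


# Made-up incremental data: one batch per source database.
batches = {
    "first": [{"id": 1, "op": "insert"}],
    "second_a": [{"id": 2, "op": "update"}, {"id": 3, "op": "delete"}],
    "second_b": [{"id": 4, "op": "insert"}],
}

# S241: one worker per batch; S242: run all conversions concurrently.
with ThreadPoolExecutor(max_workers=len(batches)) as pool:
    futures = {name: pool.submit(convert, batch) for name, batch in batches.items()}
    converted = {name: future.result() for name, future in futures.items()}

print(converted["first"])
```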
In specific implementation, to further improve the timeliness of data conversion, in an alternative implementation, the running in step S242 of each data format conversion thread flow to implement data format conversion on the first incremental data and each second incremental data may further include the following:
Step S2421, counting the process percentage (that is, the percentage of conversion completed) of each data format conversion thread flow.
Step S2422, when a first process percentage reaching the set percentage exists among all the counted process percentages, releasing the first thread resource corresponding to the data format conversion thread flow corresponding to the first process percentage.
Step S2423, loading the released first thread resource to the data format conversion thread flow corresponding to a second process percentage among all the process percentages, where the second process percentage is the minimum value among all the process percentages.
In this embodiment, through steps S2421 to S2423, the first thread resource corresponding to a data format conversion thread flow that has completed its data format conversion can be released and used to speed up the slowest data format conversion thread flow, which further improves the timeliness of data conversion.
In specific implementation, to keep all data format conversion thread flows synchronized as far as possible, the following may be further included on the basis of step S2423:
Step S2424, when a third process percentage reaching the set percentage exists among all the counted process percentages, releasing a third thread resource corresponding to the data format conversion thread flow corresponding to the third process percentage.
Step S2425, loading the released third thread resource to the data format conversion thread flow corresponding to a fourth process percentage among all the process percentages, where the fourth process percentage is the minimum value among all the process percentages except the second process percentage.
It can be understood that steps S2424 to S2425 prevent all released thread resources from being loaded into the same data format conversion thread flow, so that thread resources are loaded evenly and all data format conversion thread flows stay synchronized.
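The reallocation rule of steps S2421 to S2425 can be sketched as a pure function over two mappings. The function name `reallocate`, the representation of a "thread resource" as a worker count, and the sample numbers are all assumptions made for illustration; the source does not specify how thread resources are represented or moved.

```python
def reallocate(progress, workers, set_percentage=100.0):
    """Move workers from finished flows to the slowest unfinished flows.

    progress: flow name -> percent complete; workers: flow name -> worker count.
    Each released group goes to a different recipient, slowest first.
    """
    workers = dict(workers)  # do not mutate the caller's mapping
    # S2422 / S2424: flows that reached the set percentage release their workers.
    finished = [f for f in progress
                if progress[f] >= set_percentage and workers[f] > 0]
    # Unfinished flows ordered from slowest to fastest.
    recipients = sorted((f for f in progress if progress[f] < set_percentage),
                        key=lambda f: progress[f])
    # S2423 / S2425: the first release goes to the slowest flow, the next
    # release to the slowest flow that has not already received workers.
    for donor, recipient in zip(finished, recipients):
        workers[recipient] += workers[donor]
        workers[donor] = 0
    return workers


progress = {"a": 100.0, "b": 100.0, "c": 20.0, "d": 45.0, "e": 80.0}
workers = {"a": 2, "b": 1, "c": 1, "d": 1, "e": 1}
new_workers = reallocate(progress, workers)
print(new_workers)
```

In this made-up example, flow `a` hands its two workers to the slowest flow `c`, and flow `b` hands its worker to `d`, the slowest flow other than `c`, so the released resources are not all loaded onto one flow.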
In specific implementation, in order to ensure the data security of each relational database, the incremental log corresponding to each relational database needs to be monitored in real time, and for this purpose, on the basis of steps S21 to S25, the following may be further included:
Step S261, for each relational database, detecting in real time whether abnormal data exists in the incremental log corresponding to the relational database, and sending greeting information to the relational database at a first set time interval.
Step S262, detecting whether response information fed back by the relational database is received within a second set duration after the greeting information is sent.
Step S263, if no response information fed back by the relational database is received within the second set duration after the greeting information is sent, determining that the relational database is abnormal and outputting first warning information.
Step S264, if response information fed back by the relational database is received within the second set duration after the greeting information is sent and abnormal data exists in the incremental log corresponding to the relational database, determining that the relational database is abnormal and outputting second warning information.
In this embodiment, the first set time interval may be 30 s; in scenarios where data interaction is frequent, it may be shortened to 10 s.
In this embodiment, the second set duration may be 5 s; in specific implementation, it may also be extended appropriately according to the current network delay.
It can be understood that, through steps S261 to S264, the incremental log corresponding to each relational database can be regularly monitored in real time, so as to ensure the data security of each relational database.
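The decision logic of steps S262 to S264 can be sketched as follows. The function `check_database` and its return labels are hypothetical, and the sketch covers only the classification step; how the greeting information is sent and how abnormal data is recognized in the incremental log are not specified in the source and are left out. The default timeout of 5 s mirrors the example value given for the second set duration.

```python
def check_database(response_delay, log_has_abnormal_data, timeout=5.0):
    """Classify one probe round for a relational database.

    response_delay: seconds until the reply arrived, or None if no reply came.
    Returns "first_warning", "second_warning", or "ok".
    """
    # S262 / S263: no reply within the second set duration -> first warning.
    if response_delay is None or response_delay > timeout:
        return "first_warning"
    # S264: reply arrived in time, but the incremental log holds abnormal data.
    if log_has_abnormal_data:
        return "second_warning"
    return "ok"


results = [
    check_database(None, False),  # database never answered
    check_database(1.2, True),    # answered, but the log contains abnormal data
    check_database(0.4, False),   # answered, log is clean
]
print(results)
```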
In specific implementation, in order to ensure data security of the first target data and the second target data and avoid leakage of the first target data and the second target data, in step S25, the outputting the first target data and the second target data may specifically include the following:
Step S251, determining a data receiving party that receives the first target data and the second target data.
Step S252, determining the data receiving authority of the data receiving party according to the identification information of the data receiving party.
Step S253, sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority.
It can be understood that through steps S251 to S253, the data receiving authority of the data receiving party can be determined, so that the first target data and the second target data are partially transmitted according to the data receiving authority, and data with higher authority in the first target data and the second target data is prevented from being transmitted, thereby ensuring data security of the first target data and the second target data, and preventing leakage of the first target data and the second target data.
Optionally, in step S253, the sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority may specifically include the following:
determining a first permission level corresponding to the data receiving authority; determining a second permission level corresponding to all data in the first target data and a third permission level corresponding to all data in the second target data; sending, to the data receiver, the data in the first target data whose second permission level is less than or equal to the first permission level; and sending, to the data receiver, the data in the second target data whose third permission level is less than or equal to the second permission level.
It can be understood that through the above, data with higher authority in the first target data and the second target data can be prevented from being sent, so that data security of the first target data and the second target data is ensured, and leakage of the first target data and the second target data is avoided.
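A minimal sketch of permission-based filtering is given below. The function `filter_by_level`, the `level` field, and the sample records are hypothetical. The sketch makes one explicit assumption: both the first target data and the second target data are compared against the receiver's own permission level (the first level in the text). The text's final comparison is worded against the second level, and the sketch does not attempt to reproduce that wording.

```python
def filter_by_level(records, receiver_level):
    # Keep only records whose permission level does not exceed the receiver's.
    return [r for r in records if r["level"] <= receiver_level]


# Made-up target data; "level" is the permission level attached to each record.
first_target = [{"id": 1, "level": 1}, {"id": 2, "level": 3}]
second_target = [{"id": 3, "level": 2}, {"id": 4, "level": 2}]

receiver_level = 2  # assumed to be derived from the receiver's identification
sent_first = filter_by_level(first_target, receiver_level)
sent_second = filter_by_level(second_target, receiver_level)
print(sent_first, sent_second)
```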
In specific implementation, to further ensure the security of data output, not only must the data receiving authority of the data receiving party be considered, but the data receiving party must also undergo security verification. To this end, before the step of outputting the first target data and the second target data, the method may further include the following steps:
Determining a communication record form of the data receiving party corresponding to the first target data and the second target data.
Determining a plurality of communication parties extracted from the communication record form and the communication time at which the data receiving party established communication with each communication party.
For each of the plurality of communication parties, determining the safety factor of the target communication record between that communication party and the data receiving party.
Determining a weight value for each safety factor according to the communication time corresponding to that safety factor.
Performing a weighted summation of the safety factors according to their weight values to obtain a target safety factor corresponding to the data receiving party.
Judging whether the target safety factor reaches a set coefficient, and determining that the data receiving party passes security verification when the target safety factor reaches the set coefficient.
In this embodiment, the safety factor represents the data risk level when the data receiver communicates with a communication party: the higher the safety factor, the lower the data risk level.
In this embodiment, the earlier the communication time, the smaller the weight value, and the later the communication time, the larger the weight value.
In this embodiment, the set coefficient may be 0.9; it can be understood that the target safety factor is a value between 0 and 1.
It can be understood that, through the above, the communication parties of the data receiver can be analyzed and the target safety factor of the data receiver can be determined with more weight given to recent communications, so that whether the data receiver passes the security verification can be decided from the target safety factor and the security of data output can be ensured.
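The weighted summation above can be illustrated with a short sketch. The text only states that later communication times receive larger weights; the rank-based weights used here (normalized to sum to 1 so that the result stays between 0 and 1) are an assumption, as are the function names and the sample records.

```python
def target_safety_factor(records):
    """Weighted sum of safety factors, with later communications weighted more.

    records: list of (communication_time, safety_factor) pairs, where each
    safety factor lies in [0, 1]. Weights are rank-based (an assumption):
    the earliest record gets rank 1, the latest gets rank n.
    """
    if not records:
        raise ValueError("at least one communication record is required")
    ordered = sorted(records, key=lambda rec: rec[0])
    ranks = range(1, len(ordered) + 1)
    total = sum(ranks)
    return sum(rank / total * factor
               for rank, (_, factor) in zip(ranks, ordered))


def passes_verification(records, set_coefficient=0.9):
    """The receiver passes when the target safety factor reaches the set coefficient."""
    return target_safety_factor(records) >= set_coefficient


# Hypothetical communication records: (communication time, safety factor).
trusted = [(100, 0.8), (200, 0.9), (300, 1.0)]
risky = [(100, 0.5), (200, 0.9), (300, 1.0)]
```

With the set coefficient of 0.9 from the embodiment, `trusted` yields a target safety factor of about 0.93 and passes, while `risky` yields about 0.88 and does not.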
On the basis of the above, the embodiment of the present invention provides a data synchronization apparatus 200 based on big data. Fig. 2 is a functional block diagram of a big data based data synchronization apparatus 200 according to an embodiment of the present invention, where the big data based data synchronization apparatus 200 includes:
a reading module 201, configured to read an incremental log corresponding to each relational database from each relational database.
A synchronization module 202, configured to synchronize the read incremental logs corresponding to each relational database to the open source stream processing platform.
A detecting module 203, configured to detect whether there is an updated first target incremental log in all the incremental logs in the open source stream processing platform, and when there is the first target incremental log in the incremental logs in the open source stream processing platform, pull first incremental data of the first target incremental log in a relational database corresponding to the first target incremental log and pull second incremental data of a relational database corresponding to a second target incremental log in the open source stream processing platform except the first target incremental log.
A conversion module 204, configured to convert, in the open source stream processing platform, a first data format of the first incremental data and a second data format of the second incremental data into a set data format, so as to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data; wherein the first target incremental data and the second target incremental data are data after synchronization.
An output module 205, configured to output the first target data and the second target data.
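The cooperation of the five modules can be sketched as a small in-memory pipeline. This is a simplified illustration under stated assumptions: `InMemoryPlatform` is a hypothetical stand-in for the open source stream processing platform, the set data format is assumed to be JSON, and every log with new entries is pulled in the same way rather than distinguishing the first and second target incremental logs.

```python
import json


class InMemoryPlatform:
    """Hypothetical stand-in for the open source stream processing platform."""

    def __init__(self):
        self.logs = {}      # database name -> incremental log entries
        self.consumed = {}  # database name -> number of entries already pulled


def read_logs(databases):
    """Reading module: collect the incremental log of each relational database."""
    return {name: list(entries) for name, entries in databases.items()}


def synchronize(platform, logs):
    """Synchronization module: append entries not yet present on the platform."""
    for name, entries in logs.items():
        stored = platform.logs.setdefault(name, [])
        stored.extend(entries[len(stored):])


def detect_and_pull(platform):
    """Detection module: pull the entries added since the previous pull."""
    pulled = {}
    for name, entries in platform.logs.items():
        start = platform.consumed.get(name, 0)
        if start < len(entries):
            pulled[name] = entries[start:]
            platform.consumed[name] = len(entries)
    return pulled


def convert(pulled):
    """Conversion module: normalize every entry to one set format (JSON here)."""
    return [json.dumps({"source": name, "change": entry}, sort_keys=True)
            for name, entries in sorted(pulled.items())
            for entry in entries]


# Hypothetical incremental logs of two relational databases.
databases = {
    "mysql_orders": [{"op": "insert", "id": 1}],
    "pg_users": [{"op": "update", "id": 7}],
}
platform = InMemoryPlatform()
synchronize(platform, read_logs(databases))
output = convert(detect_and_pull(platform))  # output module would send this on
```

Because the platform tracks how many entries were already pulled, a second call to `detect_and_pull` returns nothing until new log entries are synchronized, which reflects the goal of not extracting the same business data repeatedly.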
The electronic device 300 includes a processor and a memory. The reading module 201, the synchronization module 202, the detecting module 203, the conversion module 204, the output module 205, and the like are all stored in the memory as program units, and the processor executes the program units stored in the memory to implement the corresponding functions.
The processor comprises a kernel, which calls the corresponding program unit from the memory. One or more kernels may be provided. By adjusting the kernel parameters, repeated extraction of business data from the relational databases can be avoided and the timeliness of data synchronization can be ensured, which reduces intrusion on the relational databases and thereby preserves their performance.
An embodiment of the present invention provides a readable storage medium, on which a program is stored, and the program, when executed by a processor, implements the big data based data synchronization method.
An embodiment of the present invention provides a processor configured to run a program, wherein the program, when run, executes the big data based data synchronization method as follows:
A1. a big data based data synchronization method applied to an electronic device, the electronic device communicating with a plurality of relational databases, the method at least comprising:
reading the incremental logs corresponding to each relational database from each relational database;
synchronizing the read incremental logs corresponding to each relational database to an open source stream processing platform;
detecting whether updated first target incremental logs exist in all incremental logs in the open source flow processing platform, and when the first target incremental logs exist in the incremental logs in the open source flow processing platform, pulling first incremental data of the first target incremental logs in a relational database corresponding to the first target incremental logs and pulling second incremental data of the open source flow processing platform corresponding to second target incremental logs except the first target incremental logs;
converting a first data format of the first incremental data and a second data format of the second incremental data into a set data format in the open source flow processing platform to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data; wherein the first target incremental data and the second target incremental data are synchronized data;
and outputting the first target data and the second target data.
A2. The method of A1, wherein converting, in the open source stream processing platform, a first data format of the first incremental data and a second data format of the second incremental data into a set data format comprises:
distributing corresponding data format conversion thread flows for the first incremental data and each second incremental data in the open source flow processing platform;
and running each data format conversion thread flow to realize data format conversion of the first incremental data and the second incremental data.
A3. The method according to A2, wherein the running each data format conversion thread flow to realize data format conversion of the first incremental data and the second incremental data includes:
counting the process percentage of each data format conversion thread flow;
when a first process percentage reaching a set percentage exists in all the counted process percentages, releasing a first thread resource corresponding to a data format conversion thread flow corresponding to the first process percentage;
and loading the released first thread resource to a data format conversion thread flow corresponding to a second process percentage in all process percentages, wherein the second process percentage is the minimum value in all process percentages.
A4. The method of A3, the method further comprising:
when a third process percentage which reaches a set percentage exists in all the counted process percentages, releasing a third thread resource corresponding to a data format conversion thread flow corresponding to the third process percentage;
and loading the released third thread resource to a data format conversion thread flow corresponding to a fourth process percentage in the all process percentages, wherein the fourth process percentage is the minimum value in the all process percentages except the second process percentage.
A5. The method of A1, the method further comprising:
for each relational database, detecting in real time whether abnormal data exists in the incremental log corresponding to the relational database, and sending greeting information to the relational database at a first set time interval;
detecting whether response information fed back by the relational database is received within a second set duration after the greeting information is sent;
when the response information fed back by the relational database is not received within the second set duration after the greeting information is sent, judging that the relational database is abnormal and outputting first early warning information;
and when the response information fed back by the relational database is received within the second set duration after the greeting information is sent, judging that the relational database is abnormal and outputting second early warning information if abnormal data exists in the incremental log corresponding to the relational database.
A6. The method of A1, wherein the outputting the first target data and the second target data comprises:
determining a data receiver corresponding to the first target data and the second target data;
determining the data receiving authority of the data receiver according to the identification information of the data receiver;
and sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority.
A7. The method according to A6, wherein the sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority includes:
determining a first authority level corresponding to the data receiving authority;
determining a second permission level corresponding to all data in the first target data and a third permission level corresponding to all data in the second target data;
sending data of which the second permission level is less than or equal to the first permission level in all data in the first target data to the data receiver;
and sending the data of which the third authority level is less than or equal to the second authority level in all the data in the second target data to the data receiver.
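The thread resource reallocation of A3 and A4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the representation of a thread resource as a thread count, and the default set percentage of 100 are assumptions. The first released resource goes to the flow with the lowest process percentage, and the next released resource goes to the next lowest one.

```python
def rebalance_threads(progress, threads, set_percentage=100.0):
    """Move threads from finished conversion flows to the slowest unfinished ones.

    progress: flow name -> process percentage (0 to 100)
    threads:  flow name -> number of threads currently assigned
    Returns a new allocation; the inputs are left unmodified.
    """
    allocation = dict(threads)
    # Flows that reached the set percentage and still hold threads.
    finished = [flow for flow, pct in progress.items()
                if pct >= set_percentage and allocation.get(flow, 0) > 0]
    # Unfinished flows, slowest first.
    pending = sorted((flow for flow, pct in progress.items()
                      if pct < set_percentage),
                     key=lambda flow: progress[flow])
    for index, flow in enumerate(finished):
        if not pending:
            break
        # First release goes to the minimum, the second to the next minimum.
        receiver = pending[index % len(pending)]
        allocation[receiver] = allocation.get(receiver, 0) + allocation[flow]
        allocation[flow] = 0
    return allocation


# Hypothetical state of five data format conversion thread flows.
progress = {"a": 100.0, "b": 100.0, "c": 20.0, "d": 45.0, "e": 70.0}
threads = {"a": 2, "b": 2, "c": 2, "d": 2, "e": 2}
new_allocation = rebalance_threads(progress, threads)
```

Here flows `a` and `b` have finished, so their threads are handed to `c` (the lowest percentage) and `d` (the lowest percentage apart from `c`), while `e` keeps its original allocation.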
In the embodiment of the present invention, as shown in Fig. 3, the electronic device 300 includes at least one processor 301, and at least one memory 302 and a bus 303 connected to the processor 301; the processor 301 and the memory 302 communicate with each other through the bus 303; the processor 301 is used to call program instructions in the memory 302 to perform the big data based data synchronization method described above. The electronic device 300 herein may be, for example, a PC, a PAD, or a mobile phone.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, electronic devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing electronic device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing electronic device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, an electronic device includes one or more processors (CPUs), memory, and a bus. The electronic device may also include input/output interfaces, network interfaces, and the like.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip. The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or electronic device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or electronic device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or electronic device in which the element is included.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A big data based data synchronization method applied to an electronic device, the electronic device communicating with a plurality of relational databases, the method at least comprising:
reading the incremental logs corresponding to each relational database from each relational database;
synchronizing the read incremental logs corresponding to each relational database to an open source stream processing platform;
detecting whether updated first target incremental logs exist in all incremental logs in the open source flow processing platform, and when the first target incremental logs exist in the incremental logs in the open source flow processing platform, pulling first incremental data of the first target incremental logs in a relational database corresponding to the first target incremental logs and pulling second incremental data of the open source flow processing platform corresponding to second target incremental logs except the first target incremental logs;
converting a first data format of the first incremental data and a second data format of the second incremental data into a set data format in the open source flow processing platform to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data; wherein the first target incremental data and the second target incremental data are synchronized data;
and outputting the first target data and the second target data.
2. The method of claim 1, wherein converting the first data format of the first incremental data and the second data format of the second incremental data into a set data format in the open source stream processing platform comprises:
distributing corresponding data format conversion thread flows for the first incremental data and each second incremental data in the open source flow processing platform;
and running each data format conversion thread flow to realize data format conversion of the first incremental data and the second incremental data.
3. The method of claim 2, wherein the running each data format conversion thread flow to realize data format conversion of the first incremental data and the second incremental data comprises:
counting the process percentage of each data format conversion thread flow;
when a first process percentage reaching a set percentage exists in all the counted process percentages, releasing a first thread resource corresponding to a data format conversion thread flow corresponding to the first process percentage;
and loading the released first thread resource to a data format conversion thread flow corresponding to a second process percentage in all process percentages, wherein the second process percentage is the minimum value in all process percentages.
4. The method of claim 3, further comprising:
when a third process percentage which reaches a set percentage exists in all the counted process percentages, releasing a third thread resource corresponding to a data format conversion thread flow corresponding to the third process percentage;
and loading the released third thread resource to a data format conversion thread flow corresponding to a fourth process percentage in the all process percentages, wherein the fourth process percentage is the minimum value in the all process percentages except the second process percentage.
5. The method according to any one of claims 1-4, further comprising:
for each relational database, detecting in real time whether abnormal data exists in the incremental log corresponding to the relational database, and sending greeting information to the relational database at a first set time interval;
detecting whether response information fed back by the relational database is received within a second set duration after the greeting information is sent;
when the response information fed back by the relational database is not received within the second set duration after the greeting information is sent, judging that the relational database is abnormal and outputting first early warning information;
and when the response information fed back by the relational database is received within the second set duration after the greeting information is sent, judging that the relational database is abnormal and outputting second early warning information if abnormal data exists in the incremental log corresponding to the relational database.
6. The method according to any one of claims 1-5, wherein the outputting the first target data and the second target data comprises:
determining a data receiver corresponding to the first target data and the second target data;
determining the data receiving authority of the data receiver according to the identification information of the data receiver;
and sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority.
7. The method of claim 6, wherein the sending at least part of the first target data or at least part of the second target data to the data receiver according to the data receiving authority comprises:
determining a first authority level corresponding to the data receiving authority;
determining a second permission level corresponding to all data in the first target data and a third permission level corresponding to all data in the second target data;
sending data of which the second permission level is less than or equal to the first permission level in all data in the first target data to the data receiver;
and sending the data of which the third authority level is less than or equal to the second authority level in all the data in the second target data to the data receiver.
8. A big data based data synchronization apparatus, applied to an electronic device, the electronic device communicating with a plurality of relational databases, the apparatus comprising at least:
the reading module is used for reading the incremental logs corresponding to each relational database from each relational database;
the synchronization module is used for synchronizing the read incremental logs corresponding to each relational database to the open source stream processing platform;
a detecting module, configured to detect whether there is an updated first target incremental log in all incremental logs in the open source stream processing platform, and when there is the first target incremental log in the incremental logs in the open source stream processing platform, pull first incremental data of the first target incremental log in a relational database corresponding to the first target incremental log and pull second incremental data of a relational database corresponding to a second target incremental log in the open source stream processing platform except the first target incremental log;
a conversion module, configured to convert, in the open source stream processing platform, a first data format of the first incremental data and a second data format of the second incremental data into a set data format, so as to obtain first target data corresponding to the first incremental data and second target data corresponding to the second incremental data; wherein the first target incremental data and the second target incremental data are synchronized data;
and the output module is used for outputting the first target data and the second target data.
9. An electronic device comprising a processor and a memory and bus connected to the processor; wherein, the processor and the memory complete mutual communication through the bus; the processor is used for calling the program instructions in the memory to execute the big data based data synchronization method of any one of the above claims 1-7.
10. A readable storage medium, having a program stored thereon, which when executed by a processor implements the big data based data synchronization method of any one of claims 1 to 7.
CN202010054869.XA 2020-01-17 2020-01-17 Data synchronization method and device based on big data and electronic equipment Active CN111241116B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010054869.XA CN111241116B (en) 2020-01-17 2020-01-17 Data synchronization method and device based on big data and electronic equipment

Publications (2)

Publication Number Publication Date
CN111241116A (en) 2020-06-05
CN111241116B (en) 2023-04-11

Family

ID=70865115

Country Status (1)

Country Link
CN (1) CN111241116B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005618A (en) * 2015-07-21 2015-10-28 杭州合众数据技术有限公司 Data synchronization method and system among heterogeneous databases
CN108399256A (en) * 2018-03-06 2018-08-14 北京慧萌信安软件技术有限公司 Heterogeneous database content synchronization method, device and middleware

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
庞秋奔;李银;: "基于Web Service多源异构系统增量同步的实现" *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant