CN115563218A - Method and system for dynamically synchronizing data based on Flink CDC technology - Google Patents

Method and system for dynamically synchronizing data based on Flink CDC technology

Info

Publication number
CN115563218A
Authority
CN
China
Prior art keywords
flink
data
cdc
data based
sql
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211293896.8A
Other languages
Chinese (zh)
Inventor
许光锋
薛圣旦
廖宁
邱锋兴
向春生
王璟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Anscen Network Technology Co ltd
Original Assignee
Xiamen Anscen Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Anscen Network Technology Co ltd
Priority to CN202211293896.8A
Publication of CN115563218A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2358 Change logging, detection, and notification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 Database-specific techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/82 Solving problems relating to consistency

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for dynamically synchronizing data based on Flink CDC technology, comprising the following steps: S1, configuring a backend program to rewrite the DebeziumDeserializationSchema method; S2, further configuring the functions deserialize and getProducedType; S3, enabling the binlog setting; S4, entering the Flink/bin directory, starting a Flink cluster with start-cluster.sh, and starting the SQL CLI client with sql-client.sh embedded; S5, when using Flink SQL, calling the DebeziumDeserializationSchema method rewritten in S1 in place of the original DebeziumDeserializationSchema method. By rewriting DebeziumDeserializationSchema to load fields dynamically, the changed table structure and data are obtained, which reduces operation and maintenance cost. Data synchronization consistency is improved: fields and formats no longer need to be selected manually, which reduces the field and data inconsistencies caused by manual intervention and by link interruptions on field changes. The table structure is loaded dynamically, which keeps the data synchronization link stable. The approach facilitates upgrading and extending legacy or lightweight projects, and its principle is simple, making it easy to develop and maintain.

Description

Method and system for dynamically synchronizing data based on Flink CDC technology
Technical Field
The invention belongs to the technical field of data synchronization, and particularly relates to a method and a system for dynamically synchronizing data based on a Flink CDC technology.
Background
The advent of big data technology has brought a new breakthrough in system performance, allowing hardware to scale horizontally so that performance and storage grow linearly. The core idea of big data technology is distribution: a large project is split into many small applications that call one another in a distributed manner, which improves the computing efficiency or storage capacity of the system.
After solving the technical difficulty of data security, distributed storage introduces a new technical problem: how to ensure data consistency across multiple copies. The current mainstream solution is to use CDC (Change Data Capture) to monitor and capture database changes (including INSERT, UPDATE, and DELETE operations on data or data tables), record these changes completely in the order in which they occur, and write them to message middleware for other services to subscribe to and consume.
However, CDC has a pain point: changes to the table structure make the data lake ingestion link difficult to maintain. For example, a user has a table that originally has two columns, id and name, and an address column is then added. The data in the newly added column cannot be synchronized into the data lake, and the ingestion link may even fail, which affects stability. Besides added columns, columns may also be deleted or have their types changed. After a field is deleted, the Flink task reports an error and exits, and it can be started normally only after the SQL is modified. A survey report by Fivetran found that the schema changes every month at 60% of companies and every week at 30%. This shows that essentially every company faces the data integration challenges brought by schema changes. For some services database schema changes may be frequent, and if every change requires modifying the SQL and restarting the Flink task, maintenance costs are high.
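To make the pain point concrete, the following minimal Python sketch contrasts a column list that is fixed when the job is written with a mapping that reads its columns from each captured row. It only illustrates the id/name/address example; the function names and row layout are hypothetical and this is not Flink code.

```python
# Hypothetical illustration of the id/name/address example; not Flink code.

# Columns fixed at the time the synchronization job was written.
DECLARED_COLUMNS = ["id", "name"]


def project_static(row: dict) -> dict:
    """Keep only the columns declared up front; unknown columns are lost."""
    return {col: row.get(col) for col in DECLARED_COLUMNS}


def project_dynamic(row: dict) -> dict:
    """Take the column set from the row itself, so new columns survive."""
    return dict(row)


# A row captured after the source table gained an `address` column.
changed_row = {"id": 1, "name": "alice", "address": "Xiamen"}

static_result = project_static(changed_row)
dynamic_result = project_dynamic(changed_row)

print(static_result)
print(dynamic_result)
```

With the static projection the address value never reaches the sink, which is the behavior the dynamic field loading is meant to avoid.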
In view of this, it is very meaningful to provide a method and a system for dynamically synchronizing data based on the Flink CDC technology.
Disclosure of Invention
In order to solve the problem that Flink SQL cannot synchronize the table structure when the structure of a database table changes, the invention provides a method and a system for dynamically synchronizing data based on Flink CDC technology.
In a first aspect, the present invention provides a method for dynamically synchronizing data based on the Flink CDC technology, wherein the method comprises the following steps:
S1, configuring a backend program to rewrite the DebeziumDeserializationSchema method;
S2, further configuring the functions deserialize and getProducedType;
S3, enabling the binlog setting;
S4, entering the Flink/bin directory, starting a Flink cluster with start-cluster.sh, and starting the SQL CLI client with sql-client.sh embedded;
S5, when using Flink SQL, calling the DebeziumDeserializationSchema method rewritten in S1 in place of the original DebeziumDeserializationSchema method.
Preferably, S1 further comprises: configuring the rewritten DebeziumDeserializationSchema method as a JsonStringDebeziumDeserializationSchema method that converts binlog data into JSON.
Further preferably, S1 further comprises: when configuring the rewritten DebeziumDeserializationSchema method, adding customized operations according to actual requirements.
Further preferably, S2 specifically comprises:
S21, configuring the function deserialize to implement the data conversion logic;
S22, configuring the function getProducedType to define the return type.
Preferably, two parameters are returned in S22: the first is a Boolean parameter indicating whether the data was modified or deleted; the second is the JSON converted from the binlog, which contains the fields and content after the schema change and is used to update the structure of the accessed table.
Further preferably, the method further comprises: S6, performing subsequent operations after the table structure and data are obtained.
Preferably, the method further comprises: S7, creating a result table for storing the result data, and writing the result into the target database.
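The roles of deserialize and getProducedType in S21 and S22 can be sketched as follows. This is a language-neutral illustration in Python, not the Java implementation described in the application. The event layout follows Debezium's change event envelope (op, before, after); the payload keys and the exact meaning of the Boolean flag are assumptions, since the text only says the flag marks modified or deleted data.

```python
import json
from typing import Tuple


def deserialize(change_event: dict) -> Tuple[bool, str]:
    """Convert one Debezium-style change event into (flag, json_string).

    Assumption: the flag is True for updates and deletes and False for
    inserts and snapshot reads.
    """
    op = change_event["op"]  # Debezium codes: c=create, u=update, d=delete, r=read
    # A delete carries its row image in "before"; other operations in "after".
    row = change_event["before"] if op == "d" else change_event["after"]
    payload = {
        "op": op,
        "fields": list(row.keys()),  # field list read from the event itself
        "data": row,
    }
    return op in ("u", "d"), json.dumps(payload, sort_keys=True)


def get_produced_type() -> str:
    """Describe the return type, mirroring the role of getProducedType."""
    return "Tuple[bool, str]"


# An update captured after the `address` column was added to the table.
event = {
    "op": "u",
    "before": {"id": 1, "name": "alice"},
    "after": {"id": 1, "name": "alice", "address": "Xiamen"},
}
flag, body = deserialize(event)
print(flag, body)
```

Because the field list is taken from each event rather than from a fixed table definition, a newly added column appears in the JSON without any change to the conversion code.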
In a second aspect, the present invention further provides a system for dynamically synchronizing data based on the Flink CDC technology, including:
a rewriting module, configured to configure a backend program to rewrite the DebeziumDeserializationSchema method;
a function configuration module, configured to configure the functions deserialize and getProducedType;
an enabling module, configured to enable the binlog setting;
a starting module, configured to enter Flink/bin and start a Flink cluster with start-cluster.sh, as in step S4;
a calling module, configured to call the DebeziumDeserializationSchema method rewritten in S1 in place of the original DebeziumDeserializationSchema method.
In a third aspect, an embodiment of the present invention provides an electronic device, including: one or more processors; storage means for storing one or more programs which, when executed by one or more processors, cause the one or more processors to carry out a method as described in any one of the implementations of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the method described in any implementation manner of the first aspect.
Compared with the prior art, the beneficial effects of the invention are as follows:
(1) In the technical scheme, DebeziumDeserializationSchema is rewritten to load fields dynamically, so that the changed table structure and data are obtained and operation and maintenance cost is reduced. Data synchronization consistency is improved: fields and formats no longer need to be selected manually, which reduces the field and data inconsistencies caused by manual intervention and by link interruptions on field changes.
(2) The invention loads the table structure dynamically and keeps the data synchronization link stable. It facilitates upgrading and extending legacy or lightweight projects, and its principle is simple, making it easy to develop and maintain.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain the principles of the invention. Other embodiments and many of the intended advantages of embodiments will be readily appreciated as they become better understood by reference to the following detailed description. The elements of the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding similar parts.
FIG. 1 is an exemplary device architecture diagram in which an embodiment of the present invention may be employed;
FIG. 2 is a flowchart illustrating a method for dynamically updating a data structure in real time during data synchronization according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a method for dynamically updating data structures in real time during data synchronization according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart illustrating an implementation of a method for dynamically updating a data structure in real time during data synchronization according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a system for dynamically updating data structures in real time during data synchronization according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer apparatus suitable for use with an electronic device to implement an embodiment of the invention.
Detailed Description
In the following detailed description, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. In this regard, directional terminology, such as "top," "bottom," "left," "right," "up," "down," etc., is used with reference to the orientation of the figures being described. Because components of embodiments can be positioned in a number of different orientations, the directional terminology is used for purposes of illustration and is in no way limiting. It is to be understood that other embodiments may be utilized and logical changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
Fig. 1 illustrates an exemplary system architecture 100 of a method for processing information or an apparatus for processing information to which embodiments of the present invention may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having communication functions, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background information processing server that processes verification request information transmitted by the terminal devices 101, 102, 103. The background information processing server may analyze and otherwise process the received verification request information and obtain a processing result (e.g., verification success information indicating that the verification request is a legitimate request).
It should be noted that the method for processing information provided by the embodiment of the present invention is generally executed by the server 105, and accordingly, the apparatus for processing information is generally disposed in the server 105. In addition, the method for sending information provided by the embodiment of the present invention is generally executed by the terminal equipment 101, 102, 103, and accordingly, the apparatus for sending information is generally disposed in the terminal equipment 101, 102, 103.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or may be implemented as a single piece of software or multiple pieces of software modules, which is not limited herein.
After the technical difficulty of data security is solved, distributed storage introduces a new technical problem: how to ensure data consistency across multiple copies. The current mainstream solution is to use CDC (Change Data Capture) to monitor and capture database changes (including INSERT, UPDATE, and DELETE operations on data or data tables), record these changes completely in the order in which they occur, and write them to message middleware for other services to subscribe to and consume.
However, CDC has a pain point: changes to the table structure make the data lake ingestion link difficult to maintain. For example, a user has a table that originally has two columns, id and name, and an address column is then added. The data in the newly added column cannot be synchronized into the data lake, and the ingestion link may even fail, which affects stability. Besides added columns, columns may also be deleted or have their types changed. After a field is deleted, the Flink task reports an error and exits, and it can be started normally only after the SQL is modified. A survey report by Fivetran found that the schema changes every month at 60% of companies and every week at 30%. This shows that essentially every company faces the data integration challenges brought by schema changes. For some services database schema changes may be frequent, and if every change requires modifying the SQL and restarting the Flink task, maintenance costs are high.
Therefore, to address the problem that Flink SQL cannot synchronize the table structure when the structure of a database table changes, the invention provides a method for dynamically updating the data structure in real time during data synchronization: DebeziumDeserializationSchema is rewritten to load fields dynamically, so that the changed table structure and data are obtained. This improves data synchronization consistency, keeps synchronization stable, and reduces maintenance cost.
The processing flow is as follows: DebeziumDeserializationSchema is rewritten to load fields dynamically, so that the newly added table structure and data are obtained. The change data is then cleaned, analyzed, and aggregated with SQL, for example with SELECT, WHERE, and GROUP BY. Finally, the result is output to Elasticsearch (ES), MongoDB, and Kafka.
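As an illustration of the cleaning and aggregation stage, the sketch below runs a SELECT with WHERE and GROUP BY over a few captured rows. SQLite is used here only as a runnable stand-in for the Flink SQL engine, and the table name user_changes and its rows are hypothetical.

```python
import sqlite3

# SQLite stands in for the Flink SQL engine purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_changes (id INTEGER, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO user_changes VALUES (?, ?, ?)",
    [
        (1, "alice", "Xiamen"),
        (2, "bob", "Xiamen"),
        (3, "carol", None),  # row removed by the WHERE clause (cleaning)
        (4, "dave", "Fuzhou"),
    ],
)

# Clean (WHERE), then aggregate (GROUP BY) the captured rows.
rows = conn.execute(
    """
    SELECT address, COUNT(*) AS user_count
    FROM user_changes
    WHERE address IS NOT NULL
    GROUP BY address
    ORDER BY address
    """
).fetchall()

print(rows)
conn.close()
```

In the described flow the aggregated rows would then be written to a sink such as Elasticsearch, MongoDB, or Kafka; that step is omitted here.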
Fig. 2 shows a method for dynamically synchronizing data based on the Flink CDC technology according to an embodiment of the present invention. As shown in fig. 2 and fig. 4, the method includes the following steps:
S1, configuring a backend program to rewrite the DebeziumDeserializationSchema method;
Specifically, in this embodiment, a custom DebeziumDeserializationSchema must first be implemented. Here an implementation named JsonStringDebeziumDeserializationSchema converts binlog data into JSON. In an actual service, customized operations can be added according to business requirements, for example sending a schema change notification to a message queue.
S2, further configuring the functions deserialize and getProducedType;
This specifically comprises: S21, configuring the function deserialize to implement the data conversion logic;
S22, configuring the function getProducedType to define the return type.
Two parameters are returned in S22: the first is a Boolean parameter indicating whether the data was modified or deleted; the second is the JSON converted from the binlog, which contains the fields and content after the schema change and is used to update the structure of the accessed table.
S3, enabling the binlog setting;
Specifically, in this embodiment, the following settings are configured in the [mysqld] section of the MySQL configuration file (Linux: /etc/my.cnf; Windows: my.ini): server-id=1, log-bin=mysql-bin, binlog_format=row, binlog-do-db=tablename. MySQL is then restarted.
S4, entering the Flink/bin directory, starting a Flink cluster with start-cluster.sh, and starting the SQL CLI client with sql-client.sh embedded;
S5, when using Flink SQL, calling the DebeziumDeserializationSchema method rewritten in S1 in place of the original DebeziumDeserializationSchema method.
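The customized operation mentioned for S1, sending a schema change notification to a message queue, could look roughly like the sketch below. It is a hypothetical Python illustration: an in-process queue.Queue stands in for a real message queue, and the function track_schema and the notification format are assumptions rather than part of the application.

```python
import json
import queue

# In-process queue standing in for a real message queue (assumption).
schema_notifications: "queue.Queue[str]" = queue.Queue()
_last_fields = {}  # table name -> field list seen most recently


def track_schema(table: str, row: dict) -> bool:
    """Remember the row's field list; publish a notice when it changes."""
    fields = list(row.keys())
    previous = _last_fields.get(table)
    _last_fields[table] = fields
    if previous is None or previous == fields:
        return False  # first sighting or unchanged schema: nothing to report
    added = [f for f in fields if f not in previous]
    removed = [f for f in previous if f not in fields]
    schema_notifications.put(
        json.dumps({"table": table, "added": added, "removed": removed})
    )
    return True


first = track_schema("users", {"id": 1, "name": "alice"})
second = track_schema("users", {"id": 2, "name": "bob", "address": "Xiamen"})
notice = json.loads(schema_notifications.get_nowait())
print(first, second, notice)
```

Comparing field lists per table keeps the check cheap; it detects added and removed columns but not type changes, which would need type information from the change event.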
Further, the previous mainstream solution uses CDC (Change Data Capture) to monitor and capture database changes (including INSERT, UPDATE, and DELETE operations on data or data tables), but changes to the table structure make the ingestion link difficult to maintain. The data in a newly added column cannot be synchronized into the data lake, and the ingestion link may even fail, which affects stability. After a field is deleted, the Flink task reports an error and exits, and it can be started normally only after the SQL is modified. Besides additions and deletions, type changes and similar modifications may also occur.
In a specific embodiment, referring to fig. 4, a specific implementation flow of the method is as follows:
S1, rewriting the DebeziumDeserializationSchema method in the backend; S2, enabling the binlog setting; S3, starting the SQL CLI client; S4, when using Flink SQL, calling the rewritten JsonStringDebeziumDeserializationSchema method in place of the DebeziumDeserializationSchema method; S5, after the table structure and data are obtained, writing the cleaning and aggregation operations; S6, creating a result table for storing the result data; S7, writing the result into the target database; S8, ending the process.
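Steps S6 and S7 of this flow can be sketched as follows. SQLite stands in for the target database, and the helper write_result is hypothetical. Widening the result table when a row carries a new field is an assumption about how a dynamically loaded structure might be applied, not something the text specifies.

```python
import json
import sqlite3


def write_result(conn: sqlite3.Connection, table: str, json_row: str) -> None:
    """Create or widen the result table to fit the row, then insert it."""
    row = json.loads(json_row)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (id INTEGER)")
    # Column names currently present in the result table.
    existing = {info[1] for info in conn.execute(f"PRAGMA table_info({table})")}
    for column in row:
        if column not in existing:
            # New field in the change data: add a matching column (assumption).
            conn.execute(f'ALTER TABLE {table} ADD COLUMN "{column}" TEXT')
    columns = ", ".join(f'"{c}"' for c in row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        list(row.values()),
    )


conn = sqlite3.connect(":memory:")
write_result(conn, "result_users", json.dumps({"id": 1, "name": "alice"}))
write_result(
    conn, "result_users", json.dumps({"id": 2, "name": "bob", "address": "Xiamen"})
)
rows = conn.execute(
    "SELECT id, name, address FROM result_users ORDER BY id"
).fetchall()
print(rows)
```

The table and column names are interpolated into SQL only because this is a sketch with trusted inputs; a real sink would validate or quote identifiers.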
Note that the binlog setting must be enabled in this flow; otherwise the changed table structure and data cannot be obtained. The flow of invoking data synchronization after the method is rewritten is shown in FIG. 3. The method obtains the changed table structure and data automatically, without manual intervention. This improves data synchronization consistency, keeps synchronization stable, and reduces maintenance cost.
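Since the flow depends on the binlog being enabled, a small pre-flight check can catch a missing setting early. The sketch below parses a configuration fragment with Python's configparser and reports which of the settings named for step S3 are absent. The fragment, the placeholder database name, and the function missing_binlog_settings are illustrative assumptions; a real my.cnf may contain directives such as !includedir that configparser cannot parse.

```python
import configparser

# Fragment mirroring the [mysqld] settings named for step S3; the database
# name is a placeholder.
MY_CNF = """
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog_format = row
binlog-do-db = tablename
"""

REQUIRED = {"server-id", "log-bin", "binlog_format"}


def missing_binlog_settings(cnf_text: str) -> set:
    """Return the required [mysqld] keys that the fragment does not set."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return set(REQUIRED)
    return REQUIRED - set(parser["mysqld"].keys())


print(missing_binlog_settings(MY_CNF))
print(sorted(missing_binlog_settings("[mysqld]\nserver-id = 1\n")))
```

The check only confirms that the keys are present in the file; whether the running server has picked them up still requires the restart described in S3.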
In the technical scheme, DebeziumDeserializationSchema is rewritten to load fields dynamically, so that the changed table structure and data are obtained and operation and maintenance cost is reduced. Data synchronization consistency is improved: fields and formats no longer need to be selected manually, which reduces the field and data inconsistencies caused by manual intervention and by link interruptions on field changes. The table structure is loaded dynamically, which keeps the data synchronization link stable. The approach facilitates upgrading and extending legacy or lightweight projects, and its principle is simple, making it easy to develop and maintain.
In a second aspect, an embodiment of the present invention further discloses a system for dynamically synchronizing data based on the Flink CDC technology, as shown in fig. 5, including:
the rewriting module 51, configured to configure a backend program to rewrite the DebeziumDeserializationSchema method;
the function configuration module 52, configured to configure the functions deserialize and getProducedType;
the enabling module 53, configured to enable the binlog setting;
the starting module 54, configured to enter Flink/bin and start a Flink cluster with start-cluster.sh, as in step S4;
the calling module 55, configured to call the DebeziumDeserializationSchema method rewritten in S1 in place of the original DebeziumDeserializationSchema method.
Referring now to FIG. 6, a block diagram of a computer apparatus 600 suitable for use with an electronic device (e.g., the server or terminal device shown in FIG. 1) to implement an embodiment of the invention is shown. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer apparatus 600 includes a Central Processing Unit (CPU) 601 and a Graphics Processing Unit (GPU) 602, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 603 or a program loaded from a storage section 609 into a Random Access Memory (RAM) 604. In the RAM 604, various programs and data necessary for the operation of the apparatus 600 are also stored. The CPU 601, GPU 602, ROM 603, and RAM 604 are connected to each other via a bus 605. An input/output (I/O) interface 606 is also connected to the bus 605.
The following components are connected to the I/O interface 606: an input section 607 including a keyboard, a mouse, and the like; an output section 608 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section 609 including a hard disk and the like; and a communication section 610 including a network interface card such as a LAN card, a modem, or the like. The communication section 610 performs communication processing via a network such as the internet. A drive 611 may also be connected to the I/O interface 606 as needed. A removable medium 612 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 611 as necessary, so that a computer program read out therefrom is installed into the storage section 609 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication section 610, and/or installed from the removable media 612. The computer programs, when executed by a Central Processing Unit (CPU) 601 and a Graphics Processor (GPU) 602, perform the above-described functions defined in the method of the present invention.
It should be noted that the computer readable medium of the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based devices that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The modules described may also be provided in a processor.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiment; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the method steps as described in the first aspect of the invention.
The foregoing description covers only the preferred embodiments of the invention and illustrates the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the present invention is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments formed by any combination of the above-mentioned features or their equivalents without departing from the scope of the invention as defined by the appended claims. For example, a technical solution may be formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present invention.

Claims (10)

1. A method for dynamically synchronizing data based on the Flink CDC technology, characterized by comprising the following steps:
S1, configuring a back-end program to rewrite the DebeziumDeserializationSchema method;
S2, further configuring the functions deserialize and getProducedType;
S3, opening the binlog setting;
S4, entering Flink/bin, starting a Flink cluster by using start-cluster.sh, and starting the SQL CLI client by using sql-client.sh embedded;
S5, when using Flink SQL, calling the DebeziumDeserializationSchema method rewritten in S1 to replace the original DebeziumDeserializationSchema method.
2. The method for dynamically synchronizing data based on the Flink CDC technology according to claim 1, wherein S1 further comprises: configuring the rewritten DebeziumDeserializationSchema method as a JsonStringDebeziumDeserializationSchema method, so as to convert binlog data into JSON.
3. The method for dynamically synchronizing data based on the Flink CDC technology according to claim 2, wherein S1 further comprises: when configuring the rewritten DebeziumDeserializationSchema method, adding customized operations according to actual requirements.
4. The method for dynamically synchronizing data based on the Flink CDC technology according to claim 1, wherein S2 specifically comprises:
S21, configuring the function deserialize to implement the data conversion logic;
S22, configuring the function getProducedType to define the returned type.
5. The method for dynamically synchronizing data based on the Flink CDC technology according to claim 4, wherein two parameters are returned in S22, specifically: the first is a Boolean parameter indicating whether the data has been modified or deleted; the second is the JSON converted from the binlog, which contains the fields and content after a schema change and updates the accessed table structure.
6. The method for dynamically synchronizing data based on the Flink CDC technology according to claim 1, further comprising:
S6, after the table structure and the data are obtained, performing the next operation.
7. The method for dynamically synchronizing data based on the Flink CDC technology according to claim 6, further comprising:
S7, creating a result table for storing the result data, and writing the result into the target database.
8. A system for dynamically synchronizing data based on the Flink CDC technology, characterized by comprising:
a rewriting module: used for configuring a back-end program to rewrite the DebeziumDeserializationSchema method;
a function configuration module: used for configuring the functions deserialize and getProducedType;
an opening module: used for opening the binlog setting;
a starting module: used for entering Flink/bin and starting a Flink cluster by using start-cluster.
a calling module: used for calling the DebeziumDeserializationSchema method rewritten in S1 to replace the original DebeziumDeserializationSchema method.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 7.
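By way of illustration only, and without limiting the claims, the conversion logic of steps S21 and S22 and the two returned values of claim 5 can be sketched in plain Java. This sketch does not use any Flink or Debezium classes. In Flink CDC, the `DebeziumDeserializationSchema<T>` interface declares `deserialize(SourceRecord, Collector<T>)` and `getProducedType()`; here a simple map stands in for the change record. The names `ChangeEventSketch`, `Result`, `toJson` and `escape` are hypothetical and are not taken from the invention. The operation codes `c`, `u` and `d` are assumed to follow Debezium's convention for insert, update and delete.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the rewritten deserialization logic (S1, S2).
final class ChangeEventSketch {

    // The two returned values of claim 5: a Boolean flag and the JSON text.
    static final class Result {
        final boolean modifiedOrDeleted;
        final String json;

        Result(boolean modifiedOrDeleted, String json) {
            this.modifiedOrDeleted = modifiedOrDeleted;
            this.json = json;
        }
    }

    // Assumption: op is "c" (insert), "u" (update) or "d" (delete).
    // The flag is true only for updates and deletes.
    static Result deserialize(String op, Map<String, Object> row) {
        boolean changed = "u".equals(op) || "d".equals(op);
        return new Result(changed, toJson(row));
    }

    // Serializes whatever columns the event carries, in insertion order,
    // so a column added by a schema change appears without code changes.
    static String toJson(Map<String, Object> row) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null) {
                sb.append("null");
            } else if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append('"').append(escape(v.toString())).append('"');
            }
        }
        return sb.append('}').toString();
    }

    // Minimal escaping: backslashes and double quotes only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "alice");
        row.put("email", null); // e.g. a column added by a later schema change
        Result r = deserialize("u", row);
        System.out.println(r.modifiedOrDeleted + " " + r.json);
    }
}
```

Because the JSON is built from the columns present in each event rather than from a fixed field list, a downstream step such as S6 could compare the JSON keys against the known table structure to detect added fields. That comparison is an assumption about one possible use and is not shown here.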
CN202211293896.8A 2022-10-21 2022-10-21 Method and system for dynamically synchronizing data based on Flink CDC technology Pending CN115563218A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211293896.8A CN115563218A (en) 2022-10-21 2022-10-21 Method and system for dynamically synchronizing data based on Flink CDC technology

Publications (1)

Publication Number Publication Date
CN115563218A true CN115563218A (en) 2023-01-03

Family

ID=84746090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211293896.8A Pending CN115563218A (en) 2022-10-21 2022-10-21 Method and system for dynamically synchronizing data based on Flink CDC technology

Country Status (1)

Country Link
CN (1) CN115563218A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination