[ summary of the invention ]
The technical problem to be solved by the invention is that the prior art lacks an effective data synchronization method for heterogeneous database scenarios: of the two synchronization modes in the prior art, one lacks real-time performance, and the other must parse database logs on demand and therefore occupies the computing resources and disk IO resources of the source-end database.
A further technical problem to be solved by the present invention is how to handle differences in the times at which different destinations create their subscriptions: a subscription request may arrive after a previous subscription synchronization has completed, and the invention addresses how to provide the corresponding subscription content to the new destination when the subscribed information stored on the synchronization server has already been deleted.
The invention adopts the following technical scheme:
In a first aspect, the present invention provides a method for data synchronization between heterogeneous relational databases, including:
a synchronization server acquires a source-end database log;
the synchronization server converts the database log into database language content supported by a destination according to the subscription information corresponding to the source-end database, and stores the database language content in a corresponding sector;
and the synchronization server sends a log content update notification message to the subscriber so that the destination can acquire the log content.
Preferably, the synchronization server stores subscription information, the subscription information including the database types of one or more destinations subscribing to the source-end database; in that case, the synchronization server converting the database log into database language content supported by a destination according to the subscription information corresponding to the source-end database, and storing the database language content in a corresponding sector, specifically includes:
the synchronization server converts the acquired source-end database log into database language content supported by a destination according to the database type of each destination subscribing to the source-end database;
when the content is stored in the corresponding sector, a counting item is added to the converted database language content of each type according to the number of subscribed destinations whose databases are of that same type; the counting item is updated each time a destination finishes acquiring the database language content, and when the state of the counting item matches the number of destinations, the database language content of that type is deleted from the corresponding sector.
Preferably, converting the database log into database language content supported by the destination according to the subscription information corresponding to the source-end database, and storing it in the corresponding sector, specifically includes:
according to one or more of the importance level of the destination, the total amount of subscribed data, and the number of subscribed destination devices contained in the subscription information, storing the converted database language content supported by the destination in hard disk partitions with different read speeds.
Preferably, the hard disk partitions with different reading speeds specifically include:
dividing the hard disk into a plurality of partitions along concentric circles of different diameters, wherein a partition of larger diameter has a faster read speed than a partition of smaller diameter.
Preferably, the subscription information of each source-end database and destination database is scanned periodically, and the method further includes:
if the number of subscribing destination databases identified for the same source-end database has decreased by more than a preset threshold, the storage location of the corresponding converted database language content supported by the destination is migrated from its initial-speed hard disk partition to a hard disk partition whose read speed is one level lower;
if the number of subscribing destination databases identified for the same source-end database has increased by more than the preset threshold, the storage location of the corresponding converted database language content supported by the destination is migrated from its initial-speed hard disk partition to a hard disk partition whose read speed is one level higher.
Preferably, if first database language content that was stored by the synchronization server after conversion into a form supported by the destination has been deleted, and the synchronization server then receives a new subscription request from a destination whose subscribed content is the same as the deleted first database language content, the method further includes:
the synchronization server sends a reverse subscription request to one or more destinations that had subscribed to the first database language content before it was deleted;
wherein the reverse subscription request includes information related to the first database language content;
and, after receiving the reverse subscription request, each of the one or more destinations that confirms that its locally stored content corresponding to the first database language content has been modified by the relevant database language returns a reverse subscription response to the synchronization server, so that the synchronization server can select, from among the responding destinations, a destination from which to obtain the first database language content.
Preferably, the information related to the language content of the first database includes:
one or more of a database ID, a table name, one or more database operation languages, a start time, and an end time.
Preferably, the database type of the destination includes one or more of the relational databases Oracle, DB2, Microsoft SQL Server and MySQL.
In a second aspect, the present invention further provides an apparatus for synchronizing data between heterogeneous relational databases, where the apparatus is configured to implement the method for synchronizing data between heterogeneous relational databases in the first aspect, and the apparatus includes:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform the method for data synchronization between heterogeneous relational databases of the first aspect.
In a third aspect, the present invention also provides a non-transitory computer storage medium storing computer-executable instructions for execution by one or more processors for performing the method for data synchronization between heterogeneous relational databases according to the first aspect.
The method for data synchronization between heterogeneous relational databases provided by the invention introduces a synchronization server that offers a subscription mechanism to destinations, so that the conversion of subscribed data for the different destination database types, and thus the heterogeneous database synchronization process, can be completed on the synchronization server in advance; real-time performance is ensured and the occupation of processing resources is reduced at the cost of a modest amount of storage space. Moreover, because the number of database types involved in real scenarios is very limited, completing the conversion of subscribed data for different database types in advance does not waste much storage.
Furthermore, the invention sets counting items according to the database type to which each subscribing destination belongs, thereby greatly reducing the storage space used on the synchronization server in application scenarios where most destinations are databases of the same type.
Furthermore, the invention also provides a technical solution of reverse subscription, so that after a subscribing destination misses synchronization content generated on the synchronization server, it can still obtain the corresponding synchronization content through the synchronization server, which improves the robustness of the overall architecture.
[ detailed description of the embodiments ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
Embodiment 1 of the present invention provides a method for data synchronization between heterogeneous relational databases. The relationship architecture is shown in fig. 1. In actual operation a plurality of subscribers may be present; for clarity of presentation, only a single source-end database is taken as an example (a device bearing the source-end database is also referred to as a source end in the embodiments of the present invention). In an actual implementation, the log capture module shown in fig. 1 may be a network interface that receives the log content actively reported by the source-end database side. In addition, a monitoring thread may be provided in each source-end database, in which case the log capture module communicates over the network with the corresponding monitoring thread and obtains the log content it reports. The log receiving module and the log parsing module shown in fig. 1 form one set of parsing mechanisms per destination database type, each set being reused for the destinations of that type; this does not exclude creating multiple log parsing threads concurrently to implement the log parsing module, especially when there are many concurrent tasks and the synchronization server supports concurrent data processing. It should be noted that the architecture diagram in fig. 1 is presented merely for ease of understanding and does not limit the specific implementation. As shown in fig. 2, the method includes:
in step 201, the synchronization server obtains a source database log.
The source-end database log records all database operations performed at the source end. Therefore, in the embodiment of the invention, if the synchronization process runs from the moment the source-end database starts operating, the source-end database log amounts to a comprehensive backup and synchronization of the source-end data content.
In step 202, the synchronization server converts the database log into the database language content supported by the destination according to the subscription information corresponding to the source database, and stores the database language content in the corresponding sector.
The database type of the destination includes one or more of the relational databases Oracle, DB2, Microsoft SQL Server and MySQL.
Still taking fig. 1 as an example, the subscriber and the destination database have a one-to-one correspondence: Oracle has an Oracle log receiving module and an Oracle log parsing module, MySQL has a MySQL log receiving module and a MySQL log parsing module, and so on. The one-to-one correspondence is needed because SQL syntax differs between databases; if the same log parsing module were used for all of them, the parsed SQL might run on MySQL but fail on Oracle. The log receiving module reads the database log data sent by the source end from the log capture module; the log parsing module then parses the log data, converts it into the corresponding SQL statements, and executes the SQL to write the data into the corresponding destination database.
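The need for one parsing module per destination type can be illustrated with identifier quoting alone. The following is a minimal sketch, not the parsing module of the embodiment: the `LogRecord` structure and the function names are hypothetical (the log format is not specified here), and only INSERT statements are rendered.

```python
from dataclasses import dataclass

# Hypothetical parsed form of one source-end log entry; the real log
# format is not specified in this description.
@dataclass(frozen=True)
class LogRecord:
    table: str
    columns: tuple
    values: tuple

# Identifier quoting differs per destination type: MySQL uses backticks by
# default, Oracle and DB2 use double quotes, SQL Server uses square brackets.
QUOTES = {
    "mysql": ("`", "`"),
    "oracle": ('"', '"'),
    "db2": ('"', '"'),
    "sqlserver": ("[", "]"),
}

def quote_ident(name: str, db_type: str) -> str:
    # Simplification: quote characters inside the name are not escaped.
    left, right = QUOTES[db_type]
    return f"{left}{name}{right}"

def quote_literal(value) -> str:
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Standard SQL escapes a single quote by doubling it.
    return "'" + str(value).replace("'", "''") + "'"

def render_insert(record: LogRecord, db_type: str) -> str:
    table = quote_ident(record.table, db_type)
    cols = ", ".join(quote_ident(c, db_type) for c in record.columns)
    vals = ", ".join(quote_literal(v) for v in record.values)
    return f"INSERT INTO {table} ({cols}) VALUES ({vals})"

record = LogRecord("table1", ("id", "name"), (1, "O'Brien"))
mysql_sql = render_insert(record, "mysql")
oracle_sql = render_insert(record, "oracle")
```

The statement rendered for MySQL uses backticks, which Oracle rejects, so the same parsed record has to be rendered once per destination type. A real converter would also have to handle data types, functions and identifier case (quoted identifiers are case-sensitive in Oracle), which this sketch leaves out.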
In step 203, the synchronization server sends a notification message of log content update to the subscriber, so that the destination can obtain the log content.
The method for data synchronization between heterogeneous relational databases provided by the invention introduces a synchronization server that offers a subscription mechanism to destinations, so that the conversion of subscribed data for the different destination database types, and thus the heterogeneous database synchronization process, can be completed on the synchronization server in advance; real-time performance is ensured and the occupation of processing resources is reduced at the cost of a modest amount of storage space. Moreover, because the number of database types involved in real scenarios is very limited, completing the conversion of subscribed data for different database types in advance does not waste much storage.
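Steps 201 to 203 can be sketched as a small in-memory model. This is an illustrative sketch under stated assumptions: the class `SyncServer`, its method names, and the callback-style notification are hypothetical, and an in-memory dictionary stands in for the sectors of the embodiment.

```python
from collections import defaultdict

class SyncServer:
    """Toy in-memory model of steps 201-203; all names are illustrative."""

    def __init__(self, converters):
        # converters: db_type -> function turning one log entry into a statement
        self.converters = converters
        self.subscribers = defaultdict(list)  # source_id -> [(db_type, notify)]
        self.store = defaultdict(list)        # (source_id, db_type) -> statements

    def subscribe(self, source_id, db_type, notify):
        self.subscribers[source_id].append((db_type, notify))

    def on_source_log(self, source_id, log_entries):
        # Step 201: the log entries have been received from the source end.
        db_types = {t for t, _ in self.subscribers[source_id]}
        # Step 202: convert once per subscribed destination type and store.
        for db_type in db_types:
            convert = self.converters[db_type]
            self.store[(source_id, db_type)].extend(convert(e) for e in log_entries)
        # Step 203: notify every subscriber that new content is available.
        for db_type, notify in self.subscribers[source_id]:
            notify(source_id, db_type)

    def fetch(self, source_id, db_type):
        return list(self.store[(source_id, db_type)])

# Placeholder converters; a real one would rewrite the statement per dialect.
converters = {
    "mysql": lambda entry: f"/* mysql */ {entry}",
    "oracle": lambda entry: f"/* oracle */ {entry}",
}
server = SyncServer(converters)
notified = []
server.subscribe("src1", "mysql", lambda s, t: notified.append((s, t)))
server.subscribe("src1", "mysql", lambda s, t: notified.append((s, t)))
server.subscribe("src1", "oracle", lambda s, t: notified.append((s, t)))
server.on_source_log("src1", ["UPDATE t SET a = 1"])
```

Although two MySQL destinations are subscribed, the log is converted for MySQL only once; each destination then fetches the stored result after being notified, which is where the saving in processing resources comes from.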
In this embodiment of the present invention, the synchronization server stores subscription information, the subscription information including the database types of one or more destinations subscribing to the source-end database (for example, one or more of the relational databases Oracle, DB2, Microsoft SQL Server and MySQL); in that case, the synchronization server converting the database log into database language content supported by the destination according to the subscription information corresponding to the source-end database, and storing the database language content in a corresponding sector, specifically includes:
the synchronization server converts the acquired source-end database log into database language content supported by a destination according to the database type of each destination subscribing to the source-end database;
when the content is stored in the corresponding sector, a counting item is added to the converted database language content of each type according to the number of subscribed destinations whose databases are of that same type; the counting item is updated each time a destination finishes acquiring the database language content, and when the state of the counting item matches the number of destinations, the database language content of that type is deleted from the corresponding sector.
In this preferred scheme of the embodiment of the invention, counting items are set according to the database type to which each subscribing destination belongs, so that the storage space used on the synchronization server is greatly reduced in application scenarios where most destinations are databases of the same type.
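The counting item behaves like a reference count kept per database type. A minimal sketch follows, assuming that each subscribed destination acquires the content exactly once; the class and method names are hypothetical and a dictionary stands in for the sector.

```python
class CountedContent:
    """Converted content shared by all destinations of one database type."""

    def __init__(self, statements, subscriber_count):
        self.statements = list(statements)
        self.expected = subscriber_count  # number of destinations of this type
        self.fetched = 0                  # the counting item

class TypeStore:
    def __init__(self):
        self._items = {}  # db_type -> CountedContent

    def put(self, db_type, statements, subscriber_count):
        self._items[db_type] = CountedContent(statements, subscriber_count)

    def acquire(self, db_type):
        """Return the content and update the counting item; delete the stored
        copy once the count matches the number of subscribed destinations."""
        item = self._items[db_type]
        item.fetched += 1
        if item.fetched >= item.expected:
            del self._items[db_type]
        return item.statements

    def __contains__(self, db_type):
        return db_type in self._items

store = TypeStore()
store.put("mysql", ["INSERT INTO t VALUES (1)"], subscriber_count=2)
first = store.acquire("mysql")
still_stored = "mysql" in store   # one destination has not fetched yet
second = store.acquire("mysql")
deleted = "mysql" not in store    # count matched, content removed
```

Only one copy is kept for all MySQL destinations, however many there are. A production version would presumably track which destination has fetched, so that a retry by the same destination is not counted twice.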
In implementing the present invention, a subscription request from a destination may also carry policy characteristics: the destination may subscribe only to data content modified at the source end, only to source-end data operations within a specific time period (usually a future one), only to data tables modified in association, and so on. To the synchronization server, such a request usually means opening up a new storage area that holds the data to be synchronized for the destination that made the customized request. Further, in the scenario of differentiated storage across sectors with different read speeds described later in this embodiment, the data to be synchronized for a customized request is stored by default in a sector with a low read speed, because such a request usually reflects a specific requirement of a small group of destinations.
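A customized subscription of this kind can be modelled as a filter over source-end operations. The sketch below is an assumption about how such a policy might be represented; `Operation`, `SubscriptionPolicy` and their fields are hypothetical names, and only the table and time-period criteria are shown.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class Operation:
    table: str
    kind: str            # e.g. "INSERT", "UPDATE", "DELETE"
    timestamp: datetime

@dataclass(frozen=True)
class SubscriptionPolicy:
    tables: Optional[FrozenSet[str]] = None  # None means all tables
    start: Optional[datetime] = None         # inclusive lower bound
    end: Optional[datetime] = None           # exclusive upper bound

    def matches(self, op: Operation) -> bool:
        if self.tables is not None and op.table not in self.tables:
            return False
        if self.start is not None and op.timestamp < self.start:
            return False
        if self.end is not None and op.timestamp >= self.end:
            return False
        return True

ops = [
    Operation("orders", "INSERT", datetime(2024, 1, 1, 9)),
    Operation("orders", "UPDATE", datetime(2024, 1, 2, 9)),
    Operation("users", "DELETE", datetime(2024, 1, 1, 10)),
]
policy = SubscriptionPolicy(
    tables=frozenset({"orders"}),
    start=datetime(2024, 1, 1),
    end=datetime(2024, 1, 2),
)
selected = [op for op in ops if policy.matches(op)]
```

Only the first operation satisfies both the table and the time-period criterion; the operations selected this way are what would be converted and placed in the storage area opened for the customized request.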
In this embodiment of the present invention, to further improve the synchronization efficiency of subscription content, converting the database log into database language content supported by the destination according to the subscription information corresponding to the source-end database, and storing it in the corresponding sector, specifically includes:
according to one or more of the importance level of the destination, the total amount of subscribed data, and the number of subscribed destination devices contained in the subscription information, storing the converted database language content supported by the destination in hard disk partitions with different read speeds. The hard disk partitions with different read speeds are obtained by dividing the hard disk into a plurality of partitions along concentric circles of different diameters, wherein a partition of larger diameter has a faster read speed than a partition of smaller diameter. Fig. 3 is a schematic diagram of three partitions storing subscription content, where the read-write speeds of partition 1, partition 2 and partition 3 decrease in that order.
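The choice among the three partitions of fig. 3 can be sketched as a simple scoring rule. The rule itself, the function name `choose_partition` and all threshold values are assumptions made for illustration; the embodiment names the three criteria but does not fix how they are weighed.

```python
def choose_partition(importance: int, total_bytes: int, subscriber_count: int,
                     *, vip_level: int = 3, small_bytes: int = 1 << 30,
                     many_subscribers: int = 10) -> int:
    """Return 1 (fastest, outermost), 2, or 3 (slowest, innermost).

    Each satisfied criterion adds one point; the thresholds are placeholder
    values, not taken from the embodiment.
    """
    score = 0
    score += int(importance >= vip_level)               # important destination
    score += int(total_bytes <= small_bytes)            # small amount of data
    score += int(subscriber_count >= many_subscribers)  # many subscribers
    if score >= 2:
        return 1
    if score == 1:
        return 2
    return 3

fast = choose_partition(importance=3, total_bytes=100, subscriber_count=20)
middle = choose_partition(importance=0, total_bytes=100, subscriber_count=1)
slow = choose_partition(importance=0, total_bytes=10**12, subscriber_count=1)
```

Outer tracks are faster because, at a constant rotation speed, more data passes under the head per revolution on a track of larger diameter, so the content read most often or by the most important destinations benefits most from partition 1.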
As shown in fig. 4, when many-to-one synchronization is used, each topic contains multiple partitions. Assume that the source end has 3 databases (database 1, database 2 and database 3), each with 3 tables: table1, table2 and table3. In a many-to-one scenario, the data of the same table in every source database generally has to be aggregated at one destination, so the topic is still the table name, and the number of partitions under each topic equals the number of source databases; the key is therefore database ID + table name. For example, an operation on table1 of database 1 is written to partition 1 of topic 1, an operation on table1 of database 2 is written to partition 2 of topic 1, and an operation on table1 of database n is written to partition n of topic 1; the other tables are handled in the same way. Fig. 3 and fig. 4 differ in granularity: in fig. 3, each item of subscription content shown in a partition (e.g., subscription content 1.1) represents the subscription content object of one subscription task, whereas in fig. 4 the granularity is finer, and the messages (e.g., message 1, message 2) can be understood as specific database operation statements.
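The routing by database ID + table name can be sketched as follows. The class `TopicStore` and its methods are hypothetical, and a dictionary of lists stands in for the partitions of fig. 4.

```python
from collections import defaultdict

class TopicStore:
    """One topic per table name, one partition per source database."""

    def __init__(self, tables):
        # Topics are numbered from 1 in the order the tables are given.
        self.topic_of = {table: i for i, table in enumerate(tables, start=1)}
        self.messages = defaultdict(list)  # (topic, partition) -> statements

    def write(self, database_id: int, table: str, statement: str):
        # Key = database ID + table name: the table selects the topic,
        # the database ID selects the partition inside that topic.
        key = (self.topic_of[table], database_id)
        self.messages[key].append(statement)
        return key

topics = TopicStore(["table1", "table2", "table3"])
k1 = topics.write(1, "table1", "INSERT INTO table1 VALUES (1)")
k2 = topics.write(2, "table1", "INSERT INTO table1 VALUES (2)")
k3 = topics.write(3, "table2", "DELETE FROM table2 WHERE id = 7")
```

Because operations from one source database on one table always land in the same partition, their relative order is preserved there, while a destination that aggregates table1 reads all partitions of topic 1.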
In the specific operation process, when the importance level of the destination is higher (for example, a VIP-level destination), the total amount of subscribed data is small, and/or the number of subscribed destination devices is large, the converted database language content is preferably stored in a partition with a faster read speed.
Although the content of the source-end database is unified, the subscription relationship may change; in particular, during a round of database upgrades or server replacement the subscription relationship may be short-lived. Some subscription relationships may also be time-limited, that is, the subscription covers only a certain period and is abandoned on expiry. For these reasons, the read-speed partition in which subscription content is stored is chosen with reference to the number of subscribers, as described above. Preferably, a further improvement may be made: the subscription information of each source-end database and destination database is scanned periodically, and the method further includes:
if the number of subscribing destination databases identified for the same source-end database has decreased by more than a preset threshold, the storage location of the corresponding converted database language content supported by the destination is migrated from its initial-speed hard disk partition to a hard disk partition whose read speed is one level lower;
if the number of subscribing destination databases identified for the same source-end database has increased by more than the preset threshold, the storage location of the corresponding converted database language content supported by the destination is migrated from its initial-speed hard disk partition to a hard disk partition whose read speed is one level higher. The preset threshold is an empirical value. In actual operation, more dimensions than the subscriber count alone may be considered, and the preset threshold may be set as a compromise between the maximum and minimum subscription numbers observed historically.
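The periodic scan can be sketched as a function applied to each stored item. The sketch reads the rule as "a change in subscriber count larger than the threshold moves the content by one level", which is one interpretation of the text above; the function name `rescan` and the three-level numbering (1 fastest, 3 slowest, as in fig. 3) are assumptions.

```python
FASTEST, SLOWEST = 1, 3  # partition levels as in fig. 3

def rescan(current_level: int, previous_count: int, current_count: int,
           threshold: int) -> int:
    """Return the partition level after one periodic scan."""
    delta = current_count - previous_count
    if delta < -threshold:
        # Subscribers dropped by more than the threshold: one level slower.
        return min(current_level + 1, SLOWEST)
    if delta > threshold:
        # Subscribers grew by more than the threshold: one level faster.
        return max(current_level - 1, FASTEST)
    return current_level

demoted = rescan(2, previous_count=10, current_count=3, threshold=5)
promoted = rescan(2, previous_count=3, current_count=10, threshold=5)
capped = rescan(1, previous_count=3, current_count=10, threshold=5)
unchanged = rescan(2, previous_count=10, current_count=8, threshold=5)
```

Moving by only one level per scan keeps a short-lived spike or dip in subscriptions from immediately sending content to the opposite end of the disk.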
In a normal subscription relationship, once the subscribers have acquired the subscription content, the synchronization server deletes its stored copy of that content, freeing space for other subscription tasks. In practice, however, a new destination may initiate a subscription request identical to a subscription task that has already completed, and existing synchronization servers cannot handle this case effectively: even if a subscription request is sent to the same source end again, the corresponding synchronized data may have been deleted, so the source-end synchronized data content cannot be provided to the new destination. For this scenario, the embodiment of the present invention further provides a preferred extension. Specifically, if first database language content that was stored by the synchronization server after conversion into a form supported by the destination has been deleted, and the synchronization server then receives a new subscription request from a destination whose subscribed content is the same as the deleted first database language content, the method further includes:
the synchronization server sends a reverse subscription request to one or more destinations that had subscribed to the first database language content before it was deleted;
wherein the reverse subscription request includes information related to the first database language content, the information including one or more of a database ID, a table name, one or more database operation languages, a start time, and an end time.
After receiving the reverse subscription request, each of the one or more destinations that confirms that its locally stored content corresponding to the first database language content has been modified by the relevant database language returns a reverse subscription response to the synchronization server, so that the synchronization server can select, from among the responding destinations, a destination from which to obtain the first database language content.
The embodiment of the present invention thus provides a technical solution of reverse subscription, so that after a subscribing destination misses synchronization content generated on the synchronization server, it can still obtain the corresponding synchronization content through the synchronization server, which improves the robustness of the overall architecture.
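The reverse subscription exchange can be sketched as follows. This is an illustration under several assumptions: the class and function names are hypothetical, a destination's confirmation is reduced to a lookup of what it still holds locally, times are plain ISO strings, and the server simply picks the first responder because the selection policy is not specified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ReverseSubscriptionRequest:
    # Information related to the first database language content.
    database_id: str
    table: str
    operations: tuple   # e.g. ("INSERT", "UPDATE")
    start: str
    end: str

class Destination:
    def __init__(self, name, held):
        self.name = name
        self._held = held  # request -> statements this destination can supply

    def on_reverse_subscription(self, request):
        """Return a response (here: the destination name) or None."""
        return self.name if request in self._held else None

    def supply(self, request):
        return self._held[request]

def recover(request, earlier_destinations):
    """Ask earlier subscribers and fetch the content from one responder."""
    responders = [d for d in earlier_destinations
                  if d.on_reverse_subscription(request) is not None]
    if not responders:
        return None
    chosen = responders[0]  # assumption: first responder is selected
    return chosen.supply(request)

request = ReverseSubscriptionRequest(
    "db1", "table1", ("INSERT",), "2024-01-01T00:00:00", "2024-01-02T00:00:00")
dest_a = Destination("A", held={})
dest_b = Destination("B", held={request: ["INSERT INTO table1 VALUES (1)"]})
recovered = recover(request, [dest_a, dest_b])
nothing = recover(request, [dest_a])
```

Here destination A cannot supply the content and stays silent, destination B responds, and the server obtains the deleted content from B for the late subscriber without going back to the source end.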
Example 2:
This embodiment takes fig. 5 as an example, which shows a cross-type subscription relationship. Compared with embodiment 1, it presents the applicable scenario directly by way of example; for the establishment of subscription relationships, subscription synchronization, and management of stored content, refer to embodiment 1, which is not repeated here.
In this embodiment, there are 2 tables, table1 and table2 (which may come from one source end or from different source ends), and each table has 1 partition. The destination side has 2 subscriber groups (each of which can be understood as consisting of a plurality of subscribers): subscriber group 1 consists of 2 Oracle subscribers, and subscriber group 2 consists of 2 MySQL subscribers. The messages in the partition of table1 are subscribed to by Oracle subscriber 1 and MySQL subscriber 2, each of which parses the data and writes it into its corresponding database; similarly, table2 is consumed by Oracle subscriber 2 and MySQL subscriber 1, which then write the data into their databases. In this way the data of table1 and table2 is written into both Oracle and MySQL. The same applies in scenarios with multiple partitions.
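The subscription relationship of fig. 5 can be written down as data and checked. The dictionary layout and the helper names below are assumptions made for illustration; the subscriber names follow the text above.

```python
# Fig. 5 as data: (table, partition) -> subscriber chosen inside each group.
ASSIGNMENT = {
    ("table1", 1): {"oracle": "oracle subscriber 1", "mysql": "mysql subscriber 2"},
    ("table2", 1): {"oracle": "oracle subscriber 2", "mysql": "mysql subscriber 1"},
}

def consumers(table, partition=1):
    """Subscribers that receive every message of the given partition."""
    return ASSIGNMENT[(table, partition)]

def is_balanced(assignment, groups=("oracle", "mysql")):
    """True if, within each group, no subscriber serves more than one partition."""
    for group in groups:
        chosen = [by_group[group] for by_group in assignment.values()]
        if len(chosen) != len(set(chosen)):
            return False
    return True

table1_consumers = consumers("table1")
balanced = is_balanced(ASSIGNMENT)
```

Each partition is read by exactly one subscriber from each group, so every message reaches both an Oracle and a MySQL database, while the work inside a group is spread over its subscribers.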
Example 3:
Fig. 6 is a schematic structural diagram of an apparatus for data synchronization between heterogeneous relational databases according to an embodiment of the present invention. The apparatus of the present embodiment includes one or more processors 21 and a memory 22. In fig. 6, one processor 21 is taken as an example.
The processor 21 and the memory 22 may be connected by a bus or other means, such as the bus connection in fig. 6.
The memory 22, which is a non-volatile computer-readable storage medium, can be used for storing a non-volatile software program and a non-volatile computer-executable program, such as the method for data synchronization between heterogeneous relational databases in embodiment 1. The processor 21 performs a method of data synchronization between heterogeneous relational databases by executing non-volatile software programs and instructions stored in the memory 22.
The memory 22 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 22 may optionally include memory located remotely from the processor 21, and these remote memories may be connected to the processor 21 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 22 and when executed by the one or more processors 21, perform the method for data synchronization between heterogeneous relational databases in the above embodiment 1, for example, perform the steps shown in fig. 2 and described above.
It should be noted that, because the information interaction and execution processes between the modules and units in the apparatus and system are based on the same concept as the method embodiments of the present invention, their details can be found in the description of the method embodiments and are not repeated here.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by associated hardware as instructed by a program, which may be stored on a computer-readable storage medium, which may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.