CN115952172A - Data matching method and device based on temporary table of database - Google Patents
- Publication number
- CN115952172A (application CN202310215543.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- data table
- temporary
- matching
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a data matching method and device based on database temporary tables, relating to the technical field of database processing. The method comprises: an acquisition step of acquiring a first data table of a source database, a second data table of a target database and a data matching range; and a judging step of judging whether the data volumes of the first data table and the second data table within the data matching range are both greater than a first threshold. If not, the data of the first data table within the data matching range is matched directly against the data of the second data table within the data matching range to obtain a matching result. If so, a first temporary data table is established in the source database, a second temporary data table and a third temporary data table are established in the target database, and matching is performed based on the first, second and third temporary data tables to obtain the matching result. The invention improves data matching efficiency and data security.
Description
Technical Field
The invention relates to the technical field of database processing, in particular to a data matching method and device based on a database temporary table.
Background
In the prior art, data matching generally uses a loop-based comparison: the tables in the two databases are compared one by one, which is slow and occupies a large amount of memory.
In addition, many current service scenarios depend on data synchronization and subscription. The links involved vary in length, and if some stage of a link loses part of the data, the service function may become unavailable; yet there is no means of verifying the accuracy of the data content. The main problems at present are:
1. Observability of synchronization tasks: there is no uniform log information to trace, and because the volume of changed data is large, not all data change logs can be stored.
2. Data loss is usually discovered only when a problem is investigated after the fact; detection is passive, and monitoring and early warning are lacking.
3. Link complexity: the number of synchronized data tables is large and the data volume is large, and no existing scheme can compare data differences between tables efficiently and accurately.
Therefore, how to match data tables efficiently and securely, while keeping the impact on system performance during matching to a minimum, is a technical challenge.
Disclosure of Invention
The present invention proposes the following technical solutions to address one or more technical defects in the prior art.
A data matching method based on a temporary table of a database comprises the following steps:
an acquisition step, namely acquiring a first data table of a source database, a second data table of a target database and a data matching range;
judging, namely judging whether the data volumes of the first data table and the second data table in the data matching range are both larger than a first threshold, if not, directly matching the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, and if so, performing temporary table matching;
and a temporary table matching step, namely establishing a first temporary data table in the source database, establishing a second temporary data table and a third temporary data table in the target database, and matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result.
Further, the operation of matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain the matching result is as follows: calculating the MD5 value of the data of the first data table within the data matching range and storing it in the first temporary data table; calculating the MD5 value of the data of the second data table within the data matching range and storing it in the second temporary data table; inserting the MD5 values in the first temporary data table into the third temporary data table; and performing a left join or an inner join on the second temporary data table and the third temporary data table to obtain the matching result.
Still further, the matching result includes at least one of: data that is the same in the first data table and the second data table; data that differs between the first data table and the second data table; data that is present in the first data table but missing from the second data table; and data that is present in the second data table but not in the first data table.
Furthermore, after the matching is completed, a diff thread, a missing thread, an extra thread and a match thread are initialized. The diff thread outputs data that differs between the first data table and the second data table; the missing thread outputs data that the second data table lacks relative to the first data table; the extra thread outputs data that the second data table has in addition to the first data table; and the match thread outputs data that is the same in both tables.
Furthermore, for the diff thread, missing thread, extra thread and match thread, a corresponding diff queue, missing queue, extra queue and match queue are initialized in memory to implement a producer-consumer relationship pool. The memory size occupied by the relationship pool is given by:
[Formula not recoverable: the expression and its symbols were lost in text extraction.]
where the index takes the values 1, 2, 3, 4, denoting the diff queue, missing queue, extra queue and match queue respectively, and the remaining symbols denote, for the corresponding queue: the memory size occupied by its producer-consumer relationship pool; the amount of data produced per unit time; the amount of data consumed per unit time; the total amount of data the queue needs to output; and the total time required to output that total amount of data.
The invention also provides a data matching device based on the temporary table of the database, which comprises:
the acquisition unit is used for acquiring a first data table of a source database, a second data table of a target database and a data matching range;
the judging unit is used for judging whether the data quantity of the first data table and the second data table in the data matching range is larger than a first threshold value, if not, directly matching the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, and if so, performing temporary table matching;
and the temporary table matching unit is used for establishing a first temporary data table in the source database, establishing a second temporary data table and a third temporary data table in the target database, and matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result.
Further, the operation of matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain the matching result is as follows: calculating the MD5 value of the data of the first data table within the data matching range and storing it in the first temporary data table; calculating the MD5 value of the data of the second data table within the data matching range and storing it in the second temporary data table; inserting the MD5 values in the first temporary data table into the third temporary data table; and performing a left join or an inner join on the second temporary data table and the third temporary data table to obtain the matching result.
Still further, the matching result includes at least one of: data that is the same in the first data table and the second data table; data that differs between the first data table and the second data table; data that is present in the first data table but missing from the second data table; and data that is present in the second data table but not in the first data table.
Furthermore, after the matching is completed, a diff thread, a missing thread, an extra thread and a match thread are initialized. The diff thread outputs data that differs between the first data table and the second data table; the missing thread outputs data that the second data table lacks relative to the first data table; the extra thread outputs data that the second data table has in addition to the first data table; and the match thread outputs data that is the same in both tables.
Further, for the diff thread, missing thread, extra thread and match thread, a corresponding diff queue, missing queue, extra queue and match queue are initialized in memory to implement a producer-consumer relationship pool. The memory size occupied by the relationship pool is given by:
[Formula not recoverable: the expression and its symbols were lost in text extraction.]
where the index takes the values 1, 2, 3, 4, denoting the diff queue, missing queue, extra queue and match queue respectively, and the remaining symbols denote, for the corresponding queue: the memory size occupied by its producer-consumer relationship pool; the amount of data produced per unit time; the amount of data consumed per unit time; the total amount of data the queue needs to output; and the total time required to output that total amount of data.
The present invention also proposes a computer-readable storage medium having stored thereon computer program code which, when executed by a computer, performs any of the methods described above.
The invention has the technical effects that: the invention discloses a data matching method, a device and a storage medium based on a temporary table of a database, wherein the method comprises the following steps: an acquisition step, namely acquiring a first data table of a source database, a second data table of a target database and a data matching range; judging, namely judging whether the data volumes of the first data table and the second data table in the data matching range are both larger than a first threshold, if not, directly matching the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, and if so, performing temporary table matching; and a temporary table matching step, namely establishing a first temporary data table in the source database, establishing a second temporary data table and a third temporary data table in the target database, and matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result. In the invention, the matching range of the data is appointed by a user, so that the matching of all data in all data tables is avoided, the calculated amount during data matching is reduced, and the data matching efficiency is improved. 
The use of temporary tables brings the following advantages. Space saving: a temporary table is dropped automatically after the client exits the session, so no residual data occupies database space. Privacy: a temporary table created by a client serves only a specific transaction; it is dedicated and private and need not be shared with other transactions. Efficiency: a temporary table created by a client is operated on, read and written independently, so processing speed and processing efficiency are higher. In the invention, the MD5 values of the data within the data matching range of the first and second data tables are calculated and written into the first and second temporary data tables respectively; the MD5 values in the first temporary data table of the source database (the source end) are inserted into the third temporary data table in the target database (the target end); and the MD5 values in the second and third temporary data tables are matched on the target end, thereby completing the matching of the data within the data matching range of the first and second data tables.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
Fig. 1 is a flowchart of a data matching method based on a temporary table of a database according to an embodiment of the present invention.
Fig. 2 is a block diagram of a data matching apparatus based on a temporary table of a database according to an embodiment of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 shows a database temporary table-based data matching method of the present invention, which includes:
In the obtaining step S101, a first data table of a source database, a second data table of a target database and a data matching range are obtained. The data matching range may be two data tables with the same table name specified by a user, two data tables with different table names, or rows or columns in the two data tables; for example, the 5th row of the first data table is matched against the 7th row of the second data table, or the 3rd column of the first data table against the 7th column of the second data table.
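One way to picture the data matching range is as a small value object that names the two tables and, optionally, the rows or columns to compare. The sketch below is only an illustration: the class name `MatchRange`, its fields and the `validate` rule (row and column selections must pair up one to one) are assumptions, not structures defined in the patent text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MatchRange:
    """Hypothetical description of what to compare (not defined in the source)."""

    source_table: str
    target_table: str
    # None means "all columns" / "all rows" of the table.
    source_columns: Optional[Sequence[str]] = None
    target_columns: Optional[Sequence[str]] = None
    source_rows: Optional[Sequence[int]] = None
    target_rows: Optional[Sequence[int]] = None

    def validate(self) -> None:
        """Assumed rule: selections must be given on both sides and pair up 1:1."""
        pairs = (
            ("columns", self.source_columns, self.target_columns),
            ("rows", self.source_rows, self.target_rows),
        )
        for what, src, dst in pairs:
            if (src is None) != (dst is None):
                raise ValueError(f"{what} must be given for both tables or neither")
            if src is not None and len(src) != len(dst):
                raise ValueError(f"{what} selections must have the same length")


# The example from the text: row 5 of the first table against row 7 of the second.
row_example = MatchRange("orders", "orders_copy", source_rows=[5], target_rows=[7])
row_example.validate()
```

Keeping the range explicit like this makes it straightforward to count only the rows inside the range before the threshold check in the judging step.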
A judging step S102, judging whether the data quantity of the first data table and the data quantity of the second data table in the data matching range are both larger than a first threshold value, if not, directly matching the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, and if so, performing temporary table matching;
a temporary table matching step S103, which is to establish a first temporary data table in the source database, establish a second temporary data table and a third temporary data table in the target database, and perform matching based on the first temporary data table, the second temporary data table, and the third temporary data table to obtain a matching result.
The method comprises the steps of firstly obtaining a first data table of a source database and a second data table of a target database and a data matching range, then judging whether the data quantity of the first data table and the data quantity of the second data table in the data matching range are both larger than a first threshold value, if not, directly matching the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, if so, establishing a first temporary data table in the source database, establishing a second temporary data table and a third temporary data table in the target database, and matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain the matching result. In the invention, the matching range of the data is specified by the user, so that the matching of all data in all data tables is avoided, the calculation amount during data matching is reduced, the data matching efficiency is improved, and the matching range of the data can be set by the user through a GUI (graphical user interface), a command line and the like. 
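The decision in the judging step can be sketched in a few lines. The function name `choose_strategy` and the string labels are illustrative assumptions; the only rule taken from the text is that temporary table matching is used when both data volumes exceed the first threshold, and direct matching otherwise.

```python
def choose_strategy(source_count: int, target_count: int, first_threshold: int) -> str:
    """Return which matching path to take (labels are hypothetical).

    Temporary table matching is chosen only when BOTH row counts within the
    data matching range are strictly greater than the first threshold.
    """
    if source_count > first_threshold and target_count > first_threshold:
        return "temporary_table"
    return "direct"


# One small table is enough to fall back to direct matching.
print(choose_strategy(10, 20_000, first_threshold=1_000))
print(choose_strategy(5_000, 20_000, first_threshold=1_000))
```

Note that a count equal to the threshold still takes the direct path, since the text requires the volumes to be larger than the threshold.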
In the invention, data is matched directly when the data volume is small and matched via temporary tables when the data volume is large, and temporary table matching is computed on MD5 values. Adopting temporary table matching achieves the following technical effects. Space saving: a temporary table is dropped automatically after the client exits the session, so no residual data occupies database space. Privacy: a temporary table created by a client serves only a specific transaction; it is dedicated and private and need not be shared with other transactions. Efficiency: a temporary table created by a client is operated on, read and written independently, so processing speed and processing efficiency are higher. This is another important inventive point of the invention.
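The session-scoped behaviour of temporary tables can be observed with a small experiment. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a stand-in; the patent does not name a database product, and the table name `tmp_md5` and the function name are assumptions. It checks that a temporary table is visible to the connection that created it, invisible to a second connection, and gone once the creating connection has closed.

```python
import os
import sqlite3
import tempfile


def temp_table_is_session_scoped() -> bool:
    """Check visibility and lifetime of a TEMP table (SQLite as a stand-in)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "target.db")
        session_a = sqlite3.connect(path)
        session_b = sqlite3.connect(path)
        try:
            session_a.execute(
                "CREATE TEMP TABLE tmp_md5 (row_key TEXT PRIMARY KEY, md5 TEXT)"
            )
            session_a.execute("INSERT INTO tmp_md5 VALUES ('1', 'abc')")
            session_a.commit()
            count = session_a.execute("SELECT COUNT(*) FROM tmp_md5").fetchone()[0]
            visible_in_a = count == 1
            try:
                session_b.execute("SELECT COUNT(*) FROM tmp_md5")
                visible_in_b = True
            except sqlite3.OperationalError:
                # "no such table": the other session cannot see it.
                visible_in_b = False
        finally:
            session_a.close()
            session_b.close()

        # A fresh session finds no trace of the table in the database file.
        session_c = sqlite3.connect(path)
        try:
            leftover = session_c.execute(
                "SELECT COUNT(*) FROM sqlite_master WHERE name = 'tmp_md5'"
            ).fetchone()[0]
        finally:
            session_c.close()
        return visible_in_a and not visible_in_b and leftover == 0
```

Other database systems implement temporary tables differently (for example, some keep the definition and discard only the rows), so the exact lifetime should be checked against the system actually used.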
In a further embodiment, the first temporary data table, the second temporary data table and the third temporary data table are set in the memory to be only accessible by the corresponding process which creates them, and other processes cannot access them, that is, the client (the source database, the client of the target database) creates the temporary table to serve only a specific transaction, and the table has special purpose and privacy, and does not need to be shared with other transactions, so that the security of data is improved, which is another important invention point of the present invention.
In a further embodiment, the operation of matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain the matching result is: calculating the MD5 value of the data of the first data table within the data matching range and storing it in the first temporary data table; calculating the MD5 value of the data of the second data table within the data matching range and storing it in the second temporary data table; inserting the MD5 values in the first temporary data table into the third temporary data table; and performing a left join, an inner join or a right join on the second temporary data table and the third temporary data table to obtain the matching result.
In the invention, the MD5 values of the data within the data matching range of the first and second data tables are calculated and written into the first and second temporary data tables respectively; the MD5 values in the first temporary data table of the source database (the source end) are inserted into the third temporary data table in the target database (the target end); and the MD5 values in the second and third temporary data tables are then matched on the target end, thereby completing the matching of the data within the data matching range of the first and second data tables.
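The three-table flow described above can be sketched with two SQLite connections standing in for the source and target databases. Several details here are assumptions rather than facts from the patent: the first column of each table is treated as a row key, the MD5 is taken over the remaining columns joined with a separator, and the names `tmp1`, `tmp2`, `tmp3`, `row_md5`, `fill_md5_temp_table` and `temp_table_match` are hypothetical.

```python
import hashlib
import sqlite3
from typing import List, Optional, Tuple


def row_md5(values: tuple) -> str:
    """MD5 over the column values (assumed encoding; NULL is hashed as '')."""
    joined = "\x1f".join("" if v is None else str(v) for v in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def fill_md5_temp_table(conn: sqlite3.Connection, data_table: str, temp_table: str) -> None:
    """Create a session-local temp table holding (row_key, md5) for data_table."""
    conn.execute(
        f"CREATE TEMP TABLE {temp_table} (row_key TEXT PRIMARY KEY, md5 TEXT NOT NULL)"
    )
    rows = conn.execute(f"SELECT * FROM {data_table}").fetchall()
    conn.executemany(
        f"INSERT INTO {temp_table} VALUES (?, ?)",
        [(str(r[0]), row_md5(r[1:])) for r in rows],  # assumption: column 0 is the key
    )


def temp_table_match(
    source: sqlite3.Connection,
    target: sqlite3.Connection,
    source_table: str,
    target_table: str,
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (row_key, source_md5, target_md5 or None) for every source row."""
    fill_md5_temp_table(source, source_table, "tmp1")  # first temporary table
    fill_md5_temp_table(target, target_table, "tmp2")  # second temporary table
    # Third temporary table: the source digests copied onto the target end.
    target.execute("CREATE TEMP TABLE tmp3 (row_key TEXT PRIMARY KEY, md5 TEXT NOT NULL)")
    target.executemany(
        "INSERT INTO tmp3 VALUES (?, ?)",
        source.execute("SELECT row_key, md5 FROM tmp1").fetchall(),
    )
    # Left join from the source digests: a NULL on the right means "missing".
    return target.execute(
        "SELECT s.row_key, s.md5, t.md5 FROM tmp3 AS s "
        "LEFT JOIN tmp2 AS t ON t.row_key = s.row_key ORDER BY s.row_key"
    ).fetchall()
```

Only the digests cross from the source end to the target end, so the join runs entirely on one database. Joining in the other direction (from `tmp2` to `tmp3`) would surface rows that exist only on the target end, and an inner join would keep only keys present on both sides.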
In a further embodiment, the matching result comprises at least one of: data that is the same in the first data table and the second data table; data that differs between the first data table and the second data table; data that is present in the first data table but missing from the second data table; and data that is present in the second data table but not in the first data table. Based on these matching results, data synchronization between the source end and the target end can be performed.
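The four kinds of matching result can be derived from two key-to-digest mappings. This is a minimal sketch; the function name `classify` and the dictionary layout are assumptions, and the category names simply reuse the diff, missing, extra and match labels that the text gives to its output threads.

```python
from typing import Dict, List


def classify(source_md5: Dict[str, str], target_md5: Dict[str, str]) -> Dict[str, List[str]]:
    """Sort row keys into match / diff / missing / extra (hypothetical helper)."""
    result: Dict[str, List[str]] = {"match": [], "diff": [], "missing": [], "extra": []}
    for key, digest in source_md5.items():
        if key not in target_md5:
            result["missing"].append(key)   # in the first table, absent from the second
        elif target_md5[key] == digest:
            result["match"].append(key)     # same key, same digest
        else:
            result["diff"].append(key)      # same key, different digest
    # Keys that exist only on the target side.
    result["extra"] = [key for key in target_md5 if key not in source_md5]
    for keys in result.values():
        keys.sort()
    return result
```

For synchronization, "missing" keys would typically be inserted on the target end, "diff" keys updated and "extra" keys reviewed or deleted, although the text does not prescribe a particular policy.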
In a further embodiment, after the matching is completed, a diff thread, a missing thread, an extra thread and a match thread are initialized. The diff thread outputs data that differs between the first data table and the second data table; the missing thread outputs data that the second data table lacks relative to the first data table; the extra thread outputs data that the second data table has in addition to the first data table; and the match thread outputs data that is the same in both tables. In the invention, these threads are initialized to run in parallel, so that the different categories of matching result are output concurrently and data output efficiency is improved, which is another important inventive point of the invention.
In a further embodiment, for the diff thread, missing thread, extra thread and match thread, a corresponding diff queue, missing queue, extra queue and match queue are initialized in memory to implement a producer-consumer relationship pool. The memory size occupied by the relationship pool is given by:
[Formula not recoverable: the expression and its symbols were lost in text extraction.]
where the index takes the values 1, 2, 3, 4, denoting the diff queue, missing queue, extra queue and match queue respectively, and the remaining symbols denote, for the corresponding queue: the memory size occupied by its producer-consumer relationship pool; the amount of data produced per unit time; the amount of data consumed per unit time; the total amount of data the queue needs to output; and the total time required to output that total amount of data.
In order to prevent data loss during output, the invention simulates a producer consumer working mode through the initialized corresponding queue so as to achieve the technical effects of data peak clipping, valley filling and decoupling.
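The producer-consumer arrangement with one queue and one output thread per result category can be sketched with Python's standard `queue` and `threading` modules. The function name `output_results`, the `max_queue_size` parameter and the use of in-memory lists as the output sink are assumptions made for illustration. A fixed queue bound is used here in place of the patent's memory-size formula, which is not reproduced.

```python
import queue
import threading
from typing import Dict, Iterable, List

CATEGORIES = ("diff", "missing", "extra", "match")
_DONE = object()  # sentinel telling a consumer that its queue is finished


def output_results(
    classified: Dict[str, Iterable[str]], max_queue_size: int = 100
) -> Dict[str, List[str]]:
    """Push each category through its own bounded queue to its own thread."""
    queues = {name: queue.Queue(maxsize=max_queue_size) for name in CATEGORIES}
    outputs: Dict[str, List[str]] = {name: [] for name in CATEGORIES}

    def consume(name: str) -> None:
        q = queues[name]
        while True:
            item = q.get()
            if item is _DONE:
                break
            outputs[name].append(item)  # stand-in for writing to a file or report

    workers = [
        threading.Thread(target=consume, args=(name,), name=f"{name}-thread")
        for name in CATEGORIES
    ]
    for worker in workers:
        worker.start()

    # Producer side: put() blocks while a queue is full, so a slow consumer
    # throttles the producer instead of dropping data.
    for name in CATEGORIES:
        for item in classified.get(name, ()):
            queues[name].put(item)
        queues[name].put(_DONE)

    for worker in workers:
        worker.join()
    return outputs
```

The bound on each queue is what smooths bursts: the producer can run ahead by at most `max_queue_size` items per category, and producer and consumers are otherwise decoupled.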
Fig. 2 shows a database temporary table-based data matching apparatus according to the present invention, which includes:
the obtaining unit 201 obtains a first data table of a source database and a second data table of a target database, and a data matching range, where the data matching range may be two data tables with the same table name specified by a user, or two data tables with different table names, or rows or columns in the two data tables, for example, a row 5 in the first data table matches a row 7 in the second data table, or a column 3 in the first data table matches a column 7 in the second data table.
The judging unit 202 is configured to judge whether the data amounts of the first data table and the second data table in the data matching range are both greater than a first threshold, if not, directly match the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, and if so, perform temporary table matching;
the temporary table matching unit 203 establishes a first temporary data table in the source database, establishes a second temporary data table and a third temporary data table in the target database, and performs matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result.
The method comprises the steps of firstly obtaining a first data table of a source database and a second data table of a target database and a data matching range, then judging whether the data quantity of the first data table and the data quantity of the second data table in the data matching range are both larger than a first threshold value, if not, directly matching the data of the first data table in the data matching range with the data of the second data table in the data matching range to obtain a matching result, if so, establishing a first temporary data table in the source database, establishing a second temporary data table and a third temporary data table in the target database, and matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain the matching result. In the invention, the matching range of the data is specified by the user, so that the matching of all data in all data tables is avoided, the calculation amount during data matching is reduced, the data matching efficiency is improved, and the matching range of the data can be set by the user through a GUI (graphical user interface), a command line and the like. 
In the invention, data is matched directly when the data volume is small and matched via temporary tables when the data volume is large, and temporary table matching is computed on MD5 values. Adopting temporary table matching achieves the following technical effects. Space saving: a temporary table is dropped automatically after the client exits the session, so no residual data occupies database space. Privacy: a temporary table created by a client serves only a specific transaction; it is dedicated and private and need not be shared with other transactions. Efficiency: a temporary table created by a client is operated on, read and written independently, so processing speed and processing efficiency are higher. This is another important inventive point of the invention.
In a further embodiment, the first temporary data table, the second temporary data table and the third temporary data table are set in the memory to be only accessible by the corresponding process which creates them, and other processes cannot access them, that is, the client (the source database, the client of the target database) creates the temporary table to serve only a specific transaction, and the table has special purpose and privacy, and does not need to be shared with other transactions, so that the security of data is improved, which is another important invention point of the present invention.
In a further embodiment, the operation of matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain the matching result is: calculating the MD5 value of the data of the first data table within the data matching range and storing it in the first temporary data table; calculating the MD5 value of the data of the second data table within the data matching range and storing it in the second temporary data table; inserting the MD5 values in the first temporary data table into the third temporary data table; and performing a left join, an inner join or a right join on the second temporary data table and the third temporary data table to obtain the matching result.
In the invention, the MD5 values of the data within the data matching range of the first and second data tables are calculated and written into the first and second temporary data tables respectively; the MD5 values in the first temporary data table of the source database (the source end) are inserted into the third temporary data table in the target database (the target end); and the MD5 values in the second and third temporary data tables are then matched on the target end, thereby completing the matching of the data within the data matching range of the first and second data tables.
In a further embodiment, the matching result comprises at least one of: data that is the same in the first data table and the second data table; data that differs between the first data table and the second data table; data that is present in the first data table but missing from the second data table; and data that is present in the second data table but not in the first data table. Based on these matching results, data synchronization between the source end and the target end can be performed.
In a further embodiment, after the matching is completed, a diff thread, a missing thread, an extra thread and a match thread are initialized. The diff thread outputs data that differs between the first data table and the second data table; the missing thread outputs data that the second data table lacks relative to the first data table; the extra thread outputs data that the second data table has in addition to the first data table; and the match thread outputs data that is the same in both tables. In the invention, these threads are initialized to run in parallel, so that the different categories of matching result are output concurrently and data output efficiency is improved, which is another important inventive point of the invention.
In a further embodiment, for the diff thread, missing thread, extra thread and match thread, a corresponding diff queue, missing queue, extra queue and match queue are initialized in memory to implement a producer-consumer relationship pool. The memory size occupied by the relationship pool is given by:
[Formula not recoverable: the expression and its symbols were lost in text extraction.]
where the index takes the values 1, 2, 3, 4, denoting the diff queue, missing queue, extra queue and match queue respectively, and the remaining symbols denote, for the corresponding queue: the memory size occupied by its producer-consumer relationship pool; the amount of data produced per unit time; the amount of data consumed per unit time; the total amount of data the queue needs to output; and the total time required to output that total amount of data.
To prevent data loss during output, the invention uses the initialized queues to emulate a producer-consumer working mode, which smooths load peaks and troughs (peak clipping and valley filling) and decouples data production from data output.
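The producer-consumer working mode can be sketched with a bounded `queue.Queue` from the Python standard library. The capacity of 8, the sentinel object and the function names are assumptions chosen for illustration; they do not reproduce the memory-size formula of the patent.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def producer(q, items):
    """Put every item on the queue, blocking while the queue is full."""
    for item in items:
        q.put(item)
    q.put(SENTINEL)


def consumer(q, out):
    """Take items off the queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        out.append(item)


def run_pipeline(items, capacity):
    """Move items from a producer to a consumer through a bounded queue."""
    q = queue.Queue(maxsize=capacity)
    out = []
    producer_thread = threading.Thread(target=producer, args=(q, items))
    consumer_thread = threading.Thread(target=consumer, args=(q, out))
    producer_thread.start()
    consumer_thread.start()
    producer_thread.join()
    consumer_thread.join()
    return out


diff_rows = list(range(1000))
delivered = run_pipeline(diff_rows, capacity=8)
print(len(delivered))
```

Because `put` blocks when the queue is full, a fast producer is throttled rather than dropping records, and the producer and consumer only share the queue, which is the decoupling described above.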
An embodiment of the present invention provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements the above method. The computer storage medium may be a hard disk, a DVD, a CD, a flash memory, or the like.
For convenience of description, the above device is described as being divided by function into various units, which are described separately. Of course, when the present application is implemented, the functions of the various units may be implemented in one or more pieces of software and/or hardware.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present application, or the portions thereof that contribute over the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, and that all such modifications and substitutions are intended to be covered by the claims of the invention.
Claims (10)
1. A data matching method based on a database temporary table, characterized by comprising the following steps:
an acquisition step of obtaining a first data table of a source database, a second data table of a target database, and a data matching range;
a judging step of determining whether the data volumes of the first data table and the second data table within the data matching range are both greater than a first threshold; if not, directly matching the data of the first data table within the data matching range against the data of the second data table within the data matching range to obtain a matching result; and if so, performing a temporary table matching step;
the temporary table matching step of establishing a first temporary data table in the source database, establishing a second temporary data table and a third temporary data table in the target database, and matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result.
2. The method according to claim 1, wherein the operation of matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result is: calculating the MD5 value of the data of the first data table within the data matching range and storing the MD5 value in the first temporary data table; calculating the MD5 value of the data of the second data table within the data matching range and storing the MD5 value in the second temporary data table; inserting the MD5 value in the first temporary data table into the third temporary data table; and performing a left join or an inner join on the second temporary data table and the third temporary data table to obtain the matching result.
3. The method of claim 2, wherein the matching result comprises at least one of: data that is identical in the first data table and the second data table; data that differs between the first data table and the second data table; data that is present in the first data table but missing from the second data table; and data that is present in the second data table but absent from the first data table.
4. The method of claim 3, wherein after the matching is completed, a diff thread, a missing thread, an extra thread and a match thread are initialized, the diff thread being used for outputting data that differs between the first data table and the second data table, the missing thread being used for outputting data that is present in the first data table but missing from the second data table, the extra thread being used for outputting data that is present in the second data table but absent from the first data table, and the match thread being used for outputting data that is identical in the first data table and the second data table.
5. The method of claim 4, wherein, for the diff, missing, extra and match threads, corresponding diff, missing, extra and match queues are initialized in memory to implement a producer-consumer relationship pool, the memory size occupied by the relationship pool being given by:
[Formula image not available in the extracted text.]
wherein the index values 1, 2, 3 and 4 denote the diff queue, the missing queue, the extra queue and the match queue, respectively, and the remaining symbols denote, for the corresponding queue: the memory size occupied by its producer-consumer relationship pool; the amount of data produced per unit time; the amount of data consumed per unit time; the total amount of data to be output; and the total time required to output that total amount of data.
6. A database temporary table based data matching device, comprising:
an acquisition unit configured to obtain a first data table of a source database, a second data table of a target database, and a data matching range;
a judging unit configured to determine whether the data volumes of the first data table and the second data table within the data matching range are both greater than a first threshold; if not, to directly match the data of the first data table within the data matching range against the data of the second data table within the data matching range to obtain a matching result; and if so, to perform temporary table matching;
a temporary table matching unit configured to establish a first temporary data table in the source database, establish a second temporary data table and a third temporary data table in the target database, and perform matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result.
7. The apparatus of claim 6, wherein the operation of matching based on the first temporary data table, the second temporary data table and the third temporary data table to obtain a matching result is: calculating the MD5 value of the data of the first data table within the data matching range and storing the MD5 value in the first temporary data table; calculating the MD5 value of the data of the second data table within the data matching range and storing the MD5 value in the second temporary data table; inserting the MD5 value in the first temporary data table into the third temporary data table; and performing a left join or an inner join on the second temporary data table and the third temporary data table to obtain the matching result.
8. The apparatus of claim 7, wherein the matching result comprises at least one of: data that is identical in the first data table and the second data table; data that differs between the first data table and the second data table; data that is present in the first data table but missing from the second data table; and data that is present in the second data table but absent from the first data table.
9. The apparatus of claim 8, wherein after the matching is completed, a diff thread, a missing thread, an extra thread and a match thread are initialized, the diff thread being used for outputting data that differs between the first data table and the second data table, the missing thread being used for outputting data that is present in the first data table but missing from the second data table, the extra thread being used for outputting data that is present in the second data table but absent from the first data table, and the match thread being used for outputting data that is identical in the first data table and the second data table.
10. The apparatus of claim 9, wherein, for the diff, missing, extra and match threads, corresponding diff, missing, extra and match queues are initialized in memory to implement a producer-consumer relationship pool, the memory size occupied by the relationship pool being given by:
[Formula image not available in the extracted text.]
wherein the index values 1, 2, 3 and 4 denote the diff queue, the missing queue, the extra queue and the match queue, respectively, and the remaining symbols denote, for the corresponding queue: the memory size occupied by its producer-consumer relationship pool; the amount of data produced per unit time; the amount of data consumed per unit time; the total amount of data to be output; and the total time required to output that total amount of data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310215543.4A CN115952172B (en) | 2023-03-08 | 2023-03-08 | Data matching method and device based on database temporary table |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310215543.4A CN115952172B (en) | 2023-03-08 | 2023-03-08 | Data matching method and device based on database temporary table |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115952172A (en) | 2023-04-11 |
CN115952172B (en) | 2023-05-26 |
Family
ID=85891154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310215543.4A Active CN115952172B (en) | 2023-03-08 | 2023-03-08 | Data matching method and device based on database temporary table |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115952172B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103530375A (en) * | 2013-10-15 | 2014-01-22 | 北京国双科技有限公司 | Method and device for data source matching |
CN107665233A (en) * | 2017-07-24 | 2018-02-06 | 上海壹账通金融科技有限公司 | Database data processing method, device, computer equipment and storage medium |
CN109408535A (en) * | 2018-09-28 | 2019-03-01 | 中国平安财产保险股份有限公司 | Big data quantity matching process, device, computer equipment and storage medium |
CN110347747A (en) * | 2019-06-14 | 2019-10-18 | 平安科技(深圳)有限公司 | Database data synchronic method, system, computer equipment and storage medium |
CN110865982A (en) * | 2019-11-19 | 2020-03-06 | 深信服科技股份有限公司 | Data matching method and device, electronic equipment and storage medium |
CN111949524A (en) * | 2020-08-03 | 2020-11-17 | 北京锐安科技有限公司 | Data interface testing method and device, server and storage medium |
US20200387354A1 (en) * | 2017-06-29 | 2020-12-10 | Beijing Qingying Machine Visual Technology Co., Ltd. | Two-dimensional data matching method, device and logic circuit |
CN113360503A (en) * | 2021-06-18 | 2021-09-07 | 建信金融科技有限责任公司 | Test data tracking method and device for distributed database |
WO2022083266A1 (en) * | 2020-10-19 | 2022-04-28 | 中兴通讯股份有限公司 | Data table synchronization method and apparatus, data exchange device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN115952172B (en) | 2023-05-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||