CN112988817A - Data comparison method, system, electronic equipment and storage medium - Google Patents
- Publication number
- CN112988817A (CN202110387928.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- comparison
- reference data
- task
- mode
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of data processing and provides a data comparison method, a data comparison system, an electronic device, and a storage medium. The data comparison method comprises: a task obtaining step of acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; a data extraction step of extracting, according to the paging extraction mode, reference data and comparison data that satisfy the query conditions from the reference data source and the comparison data source respectively, and establishing a mapping relationship between the reference data and the comparison data; and a data comparison step of comparing each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data. The invention enables differential comparison of massive data, allows data to be fetched flexibly as required, and improves both the efficiency of data comparison verification and the stability of the system.
Description
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data comparison method, system, electronic device, and storage medium.
Background
Data migration and data synchronization between different systems are very common in information systems. Before operations such as migration or synchronization, the data must be compared and verified to ensure that those operations are reliable.
Traditional data comparison verification checks data consistency by sampling, manual inspection, full-data verification, and similar means. Its coverage is very limited, it often causes a large amount of repeated work, it cannot be performed flexibly, and it lacks a systematic result report.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the invention and therefore may include information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
In view of this, the present invention provides a data comparison method, system, electronic device, and storage medium that enable differential comparison of massive data, allow data to be fetched flexibly as required, and improve both the efficiency of data comparison verification and the stability of the system.
One aspect of the present invention provides a data comparison method, including: a task obtaining step of acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; a data extraction step of extracting, according to the paging extraction mode, reference data and comparison data that satisfy the query conditions from the reference data source and the comparison data source respectively, and establishing a mapping relationship between the reference data and the comparison data; and a data comparison step of comparing each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
In some embodiments, a task generating step precedes the task obtaining step, and the task generating step includes: receiving comparison information configured for a main task and generating a message queue, wherein the comparison information configured for the main task comprises a reference data source, an access condition for data extraction, and a paging extraction mode; and receiving comparison information configured for a plurality of subtasks based on the main task and generating a plurality of consumers of the message queue, wherein the comparison information configured for each subtask comprises a comparison data source, a query condition based on the access condition, and a data comparison mode. The task obtaining step, the data extraction step, and the data comparison step are executed by each consumer; the plurality of consumers consume the messages of the message queue in parallel, and each message comprises the comparison information of the main task and the comparison information of the corresponding subtask.
In some embodiments, the access condition and the query condition are both stored as key-value pairs. In the access condition, the data characteristics of the data to be compared are the keys and the data range of the data to be compared is the value. In the query condition, variable names that satisfy the data characteristics are the keys and variable values that fall within the data range are the values.
In some embodiments, the data comparison step comprises: filtering out ignored fields from each group of mutually mapped reference data and comparison data according to the data comparison mode, wherein the data comparison mode is either a strict mode or a non-strict mode; comparing the target fields retained in each group of reference data and comparison data, and, when a target field of the comparison data differs from the target field of the corresponding reference data, applying a difference mark according to the type of difference; and generating a comparison result comprising the reference data, the comparison data, and the difference marks between the comparison data and the corresponding reference data.
In some embodiments, the ending condition of the paging extraction mode is a time cutoff condition or a number cutoff condition.
In some embodiments, a visualization step follows the data comparison step, and the visualization step includes: generating a visual report according to the comparison result and displaying the visual report on a corresponding page.
In some embodiments, the comparison results are stored in Elasticsearch.
Another aspect of the present invention provides a data comparison system, including: a task obtaining module for acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; a data extraction module for extracting, according to the paging extraction mode, reference data and comparison data that satisfy the query conditions from the reference data source and the comparison data source respectively, and establishing a mapping relationship between the reference data and the comparison data; and a data comparison module for comparing each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
Yet another aspect of the present invention provides an electronic device, comprising: a processor; a memory having executable instructions stored therein; wherein the executable instructions, when executed by the processor, implement the data comparison method of any of the above embodiments.
Yet another aspect of the present invention provides a computer-readable storage medium for storing a program which, when executed, implements the data comparison method of any of the above embodiments.
Compared with the prior art, the invention has the following beneficial effects:
The invention enables differential comparison of massive data. Each item of comparison information can be configured as required, so data can be fetched flexibly. The paging extraction mode greatly reduces the load on the server during data extraction, which improves the efficiency of data comparison verification and the stability of the system. The difference marks support high-performance querying of comparison results and make visual display easier.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram illustrating steps of a data comparison method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a primary task configuration interface in an embodiment of the invention;
FIG. 3 is a schematic diagram illustrating a subtask configuration interface in an embodiment of the invention;
FIG. 4 is a schematic diagram of a comparison result display interface according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating the invocation of various services of a data comparison method according to an embodiment of the present invention;
FIG. 6 is a block diagram of a data comparison system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram showing a structure of an electronic apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
The drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In addition, the flow shown in the drawings is only an exemplary illustration, and not necessarily includes all the steps. For example, some steps may be divided, some steps may be combined or partially combined, and the actual execution sequence may be changed according to the actual situation. It should be noted that features of the embodiments of the invention and of the different embodiments may be combined with each other without conflict.
The data comparison method enables differential comparison of large volumes of structured data. It acquires data, compares it, and displays the results according to configuration; it continuously compares and verifies data from different sources, stores the comparison results, and displays the differences.
Fig. 1 shows the main steps of the data comparison method in the embodiment. Referring to Fig. 1, the data comparison method includes: S110, a task obtaining step: acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; S120, a data extraction step: extracting, according to the paging extraction mode, reference data and comparison data that satisfy the query conditions from the reference data source and the comparison data source respectively, and establishing a mapping relationship between the reference data and the comparison data; and S130, a data comparison step: comparing each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
The data comparison method enables differential comparison of massive data. Each item of comparison information can be configured as required, so data can be fetched flexibly. The paging extraction mode greatly reduces the load on the server during data extraction, which improves the efficiency of data comparison verification and the stability of the system. The difference marks support high-performance querying of comparison results and make visual display easier.
In one embodiment, a task generating step precedes the task obtaining step, and the task generating step includes: receiving comparison information configured for a main task and generating a message queue, wherein the comparison information configured for the main task comprises a reference data source, an access condition for data extraction, and a paging extraction mode; and receiving comparison information configured for a plurality of subtasks based on the main task and generating a plurality of consumers of the message queue, wherein the comparison information configured for each subtask comprises a comparison data source, a query condition based on the access condition, and a data comparison mode. The task obtaining step, the data extraction step, and the data comparison step are executed by each consumer; the plurality of consumers consume the messages of the message queue in parallel, and each message comprises the comparison information of the main task and the comparison information of the corresponding subtask.
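The fan-out from one main task to several parallel consumers can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: an in-process `queue.Queue` and worker threads stand in for a real message broker, and the names `run_consumers`, `handler`, and the message keys `main` and `sub` are hypothetical.

```python
import queue
import threading
from typing import Callable, Dict, List


def run_consumers(main_task: Dict, subtasks: List[Dict],
                  handler: Callable[[Dict], object], workers: int = 2) -> List[object]:
    """Publish one message per subtask and consume the messages in parallel."""
    messages: "queue.Queue[Dict]" = queue.Queue()
    for sub in subtasks:
        # Each message carries the main-task info plus the info of one subtask.
        messages.put({"main": main_task, "sub": sub})

    results: List[object] = []
    lock = threading.Lock()

    def consume() -> None:
        while True:
            try:
                message = messages.get_nowait()
            except queue.Empty:
                return  # queue drained, this consumer stops
            outcome = handler(message)  # obtain task, extract data, compare
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because consumers finish in no guaranteed order, callers should not rely on the order of the returned results.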
Fig. 2 shows a main task configuration interface 200 in an embodiment. Referring to Fig. 2, the main task configuration interface 200 is used to configure the main task, i.e., the driving task, including the reference data source (data name) 210, the access condition 220, and the paging extraction mode (paging or not) 230. The data types selectable for the reference data source 210 are SQL Server, MySQL, and ES (Elasticsearch). The access condition 220 may be stored as key-value pairs, in which the data characteristics of the data to be compared are the keys and the data range of the data to be compared is the value, so that the range of data to be compared can be determined. The paging extraction mode 230 may use a time cutoff condition or a number cutoff condition as its end condition to achieve automatic paged access. A designated recipient may also be configured in the main task configuration interface 200, so that the relevant comparison result is pushed to that recipient after the main task is completed. In addition, information such as the access environment and the access path can be configured as needed. After configuration is completed, clicking Save generates the comparison information configured for the main task.
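Paged extraction with an end condition can be sketched as below. The sketch assumes the data source is exposed through a hypothetical `fetch_page(offset, limit)` callable, and it interprets the time cutoff as a wall-clock deadline and the number cutoff as a maximum record count; the source does not specify either interpretation.

```python
import time
from typing import Callable, Dict, Iterator, List, Optional


def extract_pages(fetch_page: Callable[[int, int], List[Dict]],
                  page_size: int,
                  max_records: Optional[int] = None,
                  deadline: Optional[float] = None) -> Iterator[List[Dict]]:
    """Yield pages until the source is exhausted or an end condition is met."""
    offset = 0
    while True:
        if deadline is not None and time.monotonic() >= deadline:
            return  # time cutoff condition (assumed: monotonic-clock deadline)
        if max_records is not None and offset >= max_records:
            return  # number cutoff condition (assumed: total record limit)
        limit = page_size
        if max_records is not None:
            limit = min(page_size, max_records - offset)
        page = fetch_page(offset, limit)
        if not page:
            return  # source exhausted
        yield page
        offset += len(page)
```

Fetching one bounded page at a time keeps each request against the data source small, which is the load-reduction effect the description attributes to the paging extraction mode.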
Table 1 shows a main task configuration information table, which is specifically as follows:
Fig. 3 shows a subtask configuration interface 300 in an embodiment. Referring to Fig. 2 and Fig. 3, the subtask configuration interface 300 is used to configure a subtask under the main task, including the comparison data source (data name) 310, the query condition 320, and the data comparison mode (whether strict mode is used) 330. A subtask performs one data comparison between the comparison data source 310 and the reference data source 210. The query condition 320 may be stored as key-value pairs, in which variable names that satisfy the data characteristics are the keys and variable values that fall within the data range are the values, so that the content of the data to be compared can be determined. In other words, the subtask substitutes concrete variables for the access condition 220 configured by the main task: variable names and values that conform to the main task's data characteristics and data range turn the access condition into a condition that can be queried directly. The ignore condition 340, that is, the set of ignored fields, differs between strict mode and non-strict mode. The system generates the ignore condition 340 automatically according to whether strict mode is selected, so that inherent differences between the two data sources do not interfere with the data comparison.
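One possible reading of the relationship between the access condition and the query condition is sketched below. It assumes numeric ranges, and the function name `build_query_condition` and the example field names are hypothetical; the source does not define the concrete data types or the substitution rules.

```python
from typing import Dict, Tuple

# Access condition (main task): data characteristic -> allowed (low, high) range.
AccessCondition = Dict[str, Tuple[int, int]]


def build_query_condition(access: AccessCondition,
                          variable_names: Dict[str, str],
                          values: Dict[str, int]) -> Dict[str, int]:
    """Turn an access condition into a directly queryable key-value condition."""
    query: Dict[str, int] = {}
    for characteristic, (low, high) in access.items():
        name = variable_names[characteristic]  # variable name for this characteristic
        value = values[characteristic]         # concrete value chosen by the subtask
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside the range [{low}, {high}]")
        query[name] = value
    return query


# Hypothetical example: the main task restricts order IDs to 1000..1999,
# and the subtask queries its own source with the variable "orderId".
condition = build_query_condition(
    {"order_id": (1000, 1999)}, {"order_id": "orderId"}, {"order_id": 1500}
)
```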
In the subtask configuration interface 300, a mapping relationship between the reference data and the comparison data may also be configured. When the reference fields of the reference data and the comparison fields of the comparison data both have specific field names, for example when the reference data and the comparison data come from a database (DB), the field names can be used as the mapping, and each time a reference field and a comparison field with the same field name are compared. When the reference data and the comparison data come from an API interface or an ES database, the reference fields and comparison fields cannot be mapped by field name; instead they may be mapped by their order in the data table, and each time the reference field and comparison field at the same position in the reference data table and the comparison data table are compared. Here the reference fields and comparison fields are the target fields that remain after the ignored fields have been filtered out. In addition, when the reference data and the comparison data are in a many-to-many relationship, several parameter names may be designated as a joint primary key that uniquely identifies a record, so that mapping and comparison can still be performed.
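The three mapping strategies described above (by field name, by field order, and by joint primary key) can be sketched as follows. All function names and the record layout are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Record = Dict[str, object]


def map_by_name(reference: Record, comparison: Record) -> List[Tuple[object, object]]:
    """Pair values whose field names match, e.g. when both sides come from a database."""
    return [(reference[name], comparison[name]) for name in reference if name in comparison]


def map_by_order(reference: Sequence[object],
                 comparison: Sequence[object]) -> List[Tuple[object, object]]:
    """Pair values by position when field names cannot be matched."""
    return list(zip(reference, comparison))


def pair_records(reference_rows: List[Record], comparison_rows: List[Record],
                 key_fields: Sequence[str]) -> List[Tuple[Record, Optional[Record]]]:
    """Pair records through a joint primary key built from several fields."""
    def key(row: Record) -> Tuple[object, ...]:
        return tuple(row[field] for field in key_fields)

    index = {key(row): row for row in comparison_rows}
    # A reference record without a counterpart is paired with None.
    return [(row, index.get(key(row))) for row in reference_rows]
```

Note that `map_by_order` silently stops at the shorter sequence, so a length check may be worth adding where a missing trailing field should count as a difference.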
Tables 2 and 3 show subtask configuration information tables, which are specifically as follows:
In the subtask configuration information table, the subtask method means that, when the structure of the data to be compared is complex and special conversion, calculation, or similar processing is required, another execution method can be called to perform the comparison.
Further, a general configuration of the data source may also be performed, and table 4 shows a data logic configuration information table, which is specifically as follows:
According to the configured main task and subtasks, the data acquired by each subtask can be compared according to the configuration by means of message consumption. During comparison, the relevant fields are ignored automatically and only the data of the target fields is compared. Specifically, the data comparison step comprises: filtering out ignored fields from each group of mutually mapped reference data and comparison data according to the data comparison mode, wherein the data comparison mode is either a strict mode or a non-strict mode; comparing the target fields retained in each group of reference data and comparison data, and, when a target field of the comparison data differs from the target field of the corresponding reference data, applying a difference mark according to the type of difference; and generating a comparison result comprising the reference data, the comparison data, and the difference marks between the comparison data and the corresponding reference data. Each time a subtask finishes executing, the state of the main task and the number of unexecuted subtasks are updated.
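The filter-then-compare logic of the data comparison step can be sketched as below. The ignored-field sets per mode and the difference type names (`value_mismatch`, `missing_in_comparison`, `missing_in_reference`) are assumptions made for illustration; the source does not enumerate them.

```python
from typing import Dict, FrozenSet

# Hypothetical ignore conditions: which fields each comparison mode skips.
IGNORED_FIELDS: Dict[str, FrozenSet[str]] = {
    "strict": frozenset(),
    "non_strict": frozenset({"updated_at"}),
}


def compare_group(reference: Dict[str, object], comparison: Dict[str, object],
                  mode: str) -> Dict[str, object]:
    """Compare one mapped group and mark each differing target field by type."""
    ignored = IGNORED_FIELDS[mode]
    target_fields = (set(reference) | set(comparison)) - ignored
    marks: Dict[str, str] = {}
    for field in sorted(target_fields):
        if field not in comparison:
            marks[field] = "missing_in_comparison"
        elif field not in reference:
            marks[field] = "missing_in_reference"
        elif reference[field] != comparison[field]:
            marks[field] = "value_mismatch"
    # The result keeps both records together with the difference marks.
    return {"reference": reference, "comparison": comparison, "diff_marks": marks}
```

Keeping the original records next to the marks lets a later display step show the differing values side by side without another query.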
After the data comparison step, a visual display step is further included, and the visual display step comprises the following steps: and generating a visual report according to the comparison result, and displaying the visual report to a corresponding page.
Fig. 4 shows a comparison result display interface 400 in the embodiment. Referring to Fig. 4, the comparison result display interface 400 displays the queried comparison result data visually. Comparison data that differs from the reference data is distinguished by difference marks 410, such as different colors or different patterns chosen according to the type of difference, so that the differences are shown intuitively.
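A report summary that groups differences by type and assigns each type a display color might look like the sketch below. The color table, the difference type names, and the `diff_marks` result layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical mapping from difference type to display color.
MARK_COLORS: Dict[str, str] = {
    "value_mismatch": "red",
    "missing_in_comparison": "orange",
    "missing_in_reference": "blue",
}


def summarize(results: List[Dict]) -> List[Dict[str, object]]:
    """Count difference marks by type and attach the color used to display them."""
    counts = Counter(mark for result in results for mark in result["diff_marks"].values())
    return [
        {"diff_type": diff_type, "count": count, "color": MARK_COLORS.get(diff_type, "gray")}
        for diff_type, count in sorted(counts.items())
    ]
```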
Table 5 shows a summary information table of the comparison results, which is as follows:
In the above summary information table of comparison results, the task ID refers to a group of data comparison tasks under the main task.
The compared source data and the comparison result data are stored in Elasticsearch as follows:
Furthermore, each service of the data comparison method can be exposed as an API so that it is easy to call. Fig. 5 is a schematic diagram of invoking the services of the data comparison method in the embodiment. Referring to Fig. 5, a user (e.g., a tester or developer) may invoke any of the main task service 510, the subtask service 520, the source data query service 530, the comparison service 540, and the comparison result query service 550 as needed to configure comparison information, fetch data, compare data, query results, and so on. The main task service 510 may be used for main task configuration, automatic paging setup, comparison task generation, message queue generation, and the like. The subtask service 520 may be used for subtask configuration, message consumption, and subtask queries that obtain the reference data and the comparison data. The source data query service 530 may be used to query source data from a DB database, an ES database, an API interface, and the like. The comparison service 540 may be used to compare data, mark differences, and store the comparison result. The comparison result query service 550 may be used to query the comparison result and display it visually.
In conclusion, the data comparison method extracts the data to be compared from the data source in the paging extraction mode, which greatly reduces the load on the server during source data extraction. A message queue is used: the consumer receives a message and uses the key-value pairs in the message as access parameters to extract data from the reference data source and the comparison data source for comparison. The message queue thereby enables distributed comparison verification, greatly improves the efficiency of comparison verification and the stability of the system, and makes the comparison of massive data practical. Switching between strict mode and non-strict mode allows specific fields to be ignored in a configurable manner, which removes the interference of normal differences that exist in real systems. Difference marks indicate the specific type of difference in the comparison data, or indicate that a difference has been ignored. The comparison result is automatically formatted, processed, and stored in ES, which makes it possible to compare and store massive data at the scale of millions of records and to query it with high performance, while filtering the results shown in the visual report improves verification efficiency.
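Formatting comparison results for storage in Elasticsearch can be sketched as building a request body for its bulk API, which accepts newline-delimited JSON in which each document is preceded by an action line. The index name, the document layout, and the use of the list position as document ID are assumptions; the sketch only builds the payload and does not send it.

```python
import json
from typing import Dict, Iterable


def to_bulk_payload(index_name: str, results: Iterable[Dict]) -> str:
    """Build a newline-delimited JSON body: one action line, then one document line."""
    lines = []
    for doc_id, result in enumerate(results):
        # Action line: index this document under an assumed ID (its position).
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        # Source line: the comparison result itself.
        lines.append(json.dumps(result))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

Batching many results into one bulk request avoids one round trip per document, which matters at the data volumes the description targets.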
The embodiment of the invention also provides a data comparison system, which can be used to implement the data comparison method described in any of the above embodiments. The features and principles of the data comparison method described in any of the above embodiments apply to the data comparison system embodiments below, and features and principles of data comparison that have already been explained are not repeated.
Fig. 6 shows the main modules of the data comparison system 600 in the embodiment. Referring to Fig. 6, the data comparison system 600 includes: a task obtaining module 610, configured to acquire comparison information of a comparison task, where the comparison information includes a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; a data extraction module 620, configured to extract, according to the paging extraction mode, reference data and comparison data that satisfy the query conditions from the reference data source and the comparison data source respectively, and to establish a mapping relationship between the reference data and the comparison data; and a data comparison module 630, configured to compare each group of mutually mapped reference data and comparison data and to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
Further, the data comparison system 600 may further include modules for implementing other process steps of the above embodiments of the data comparison method, and specific principles of each module may refer to the description of the above embodiments of the data comparison method, and will not be repeated here.
As described above, the data comparison system of the present invention can implement differentiated comparison of mass data; each item of comparison information can be configured as required, flexible access is realized, the pressure on a server during data extraction is greatly reduced through a paging extraction mode, distributed comparison verification is realized through a message queue, and the efficiency of data comparison verification and the stability of a system are improved; and high-performance query of comparison results is realized and visual display is facilitated through the difference marks.
The embodiment of the present invention further provides an electronic device, which includes a processor and a memory, where the memory stores executable instructions, and when the executable instructions are executed by the processor, the data comparison method described in any of the above embodiments is implemented.
As described above, the electronic device of the present invention can implement differential comparison of massive data. Each item of comparison information can be configured as required, so data can be fetched flexibly. The paging extraction mode greatly reduces the load on the server during data extraction, the message queue enables distributed comparison verification, and together these improve the efficiency of data comparison verification and the stability of the system. The difference marks support high-performance querying of comparison results and make visual display easier.
Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present invention. It should be understood that Fig. 7 shows the modules only schematically; the modules may be virtual software modules or actual hardware modules, and combining or splitting these modules, or adding other modules, all fall within the scope of the present invention.
As shown in fig. 7, electronic device 700 is embodied in the form of a general purpose computing device. The components of the electronic device 700 include, but are not limited to: at least one processing unit 710, at least one memory unit 720, a bus 730 connecting the different platform components (including memory unit 720 and processing unit 710), a display unit 740, etc.
The storage unit stores a program code, and the program code can be executed by the processing unit 710, so that the processing unit 710 executes the steps of the data comparison method described in any of the above embodiments. For example, processing unit 710 may perform the steps shown in fig. 1.
The storage unit 720 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)7201 and/or a cache memory unit 7202, and may further include a read only memory unit (ROM) 7203.
The memory unit 720 may also include programs/utilities 7204 having one or more program modules 7205, such program modules 7205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 700 may also communicate with one or more external devices 800, and the external devices 800 may be one or more of a keyboard, a pointing device, a bluetooth device, and the like. These external devices 800 enable a user to interactively communicate with the electronic device 700. The electronic device 700 may also be capable of communicating with one or more other computing devices, including routers, modems. Such communication may occur via an input/output (I/O) interface 750. Also, the electronic device 700 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via the network adapter 760. The network adapter 760 may communicate with other modules of the electronic device 700 via the bus 730. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms, to name a few.
The embodiment of the present invention further provides a computer-readable storage medium for storing a program, and when the program is executed, the data comparison method described in any of the above embodiments is implemented. In some possible embodiments, the various aspects of the present invention may also be implemented in the form of a program product, which includes program code for causing a terminal device to perform the data comparison method described in any of the above embodiments, when the program product is run on the terminal device.
As described above, the computer-readable storage medium of the present invention enables differentiated comparison of massive data. Each item of comparison information can be configured as required, which allows flexible access. The paging extraction mode greatly reduces the pressure on the server during data extraction, and the message queue enables distributed comparison verification, improving both the efficiency of data comparison verification and the stability of the system. The difference marks enable high-performance querying of comparison results and facilitate visual display.
Fig. 8 is a schematic structural diagram of a computer-readable storage medium of the present invention. Referring to Fig. 8, a program product 900 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may be run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer-readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device, for example through the Internet using an Internet service provider.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.
Claims (10)
1. A data comparison method, comprising:
a task obtaining step: acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions of data to be compared and a paging extraction mode;
a data extraction step: according to the paging extraction mode, respectively extracting reference data and comparison data meeting the query conditions from the reference data source and the comparison data source, and establishing a mapping relation between the reference data and the comparison data;
a data comparison step: comparing each group of mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
2. The data comparison method of claim 1, further comprising, before the task obtaining step, a task generating step, the task generating step comprising:
receiving comparison information configured by a main task, and generating a message queue, wherein the comparison information configured by the main task comprises a reference data source, an access condition for data extraction and a paging extraction mode;
receiving comparison information configured by a plurality of subtasks based on the main task, and generating a plurality of consumers of the message queue, wherein the comparison information configured by each subtask comprises a comparison data source, a query condition based on the access condition and a data comparison mode;
wherein the task obtaining step, the data extraction step and the data comparison step are executed by each consumer, the plurality of consumers consume a plurality of messages of the message queue in parallel, and each message comprises the comparison information of the main task and the comparison information of the corresponding subtask.
3. The data comparison method of claim 2, wherein the access condition and the query condition are both stored in key-value form;
in the access condition, the data characteristics of the data to be compared are taken as keys, and the data range of the data to be compared is taken as a value;
in the query conditions, variable names satisfying the data characteristics are used as keys, and variable values satisfying the data range are used as values.
4. The data comparison method of claim 2, wherein the data comparison step comprises:
filtering ignored fields in each group of the mapped reference data and comparison data according to the data comparison mode, wherein the data comparison mode comprises a strict mode and a non-strict mode;
comparing the target fields retained in each group of the reference data and the comparison data, and, when a target field of the comparison data differs from the target field of the corresponding reference data, marking the difference according to its difference type;
generating a comparison result comprising the reference data, the comparison data and the difference marks between the comparison data and the corresponding reference data.
5. The data comparison method of claim 1, wherein the ending condition of the paging extraction mode is a time cutoff condition or a quantity cutoff condition.
6. The data comparison method of claim 1, further comprising a visualization step after the data comparison step, the visualization step comprising:
generating a visual report according to the comparison result, and displaying the visual report on a corresponding page.
7. The data comparison method of claim 1, wherein the comparison result is stored in Elasticsearch.
8. A data comparison system, comprising:
a task obtaining module, configured to obtain comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions of data to be compared and a paging extraction mode;
a data extraction module, configured to extract, according to the paging extraction mode, reference data and comparison data meeting the query conditions from the reference data source and the comparison data source respectively, and to establish a mapping relation between the reference data and the comparison data;
a data comparison module, configured to compare each group of mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
9. An electronic device, comprising:
a processor;
a memory having executable instructions stored therein;
wherein the executable instructions, when executed by the processor, implement the data comparison method of any one of claims 1-7.
10. A computer-readable storage medium storing a program which, when executed, implements the data comparison method of any one of claims 1-7.
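The field filtering and difference marking recited in claim 4 can be sketched in Python as follows. The claims do not define the two modes or the difference types, so this sketch makes two labeled assumptions: strict mode compares every field, while non-strict mode drops the configured ignored fields; and the difference types are named `missing_field`, `type_mismatch` and `value_mismatch`. All function and field names are hypothetical.

```python
from typing import Dict, Iterable

Record = Dict[str, object]


def filter_fields(record: Record, ignored: Iterable[str], mode: str) -> Record:
    """Assumption: strict mode keeps every field; non-strict mode drops ignored fields."""
    if mode == "strict":
        return dict(record)
    ignored_set = set(ignored)
    return {name: value for name, value in record.items() if name not in ignored_set}


def compare_records(reference: Record, comparison: Record,
                    ignored: Iterable[str] = (), mode: str = "strict") -> Dict[str, object]:
    """Compare the retained target fields and mark each difference by its assumed type."""
    ref = filter_fields(reference, ignored, mode)
    comp = filter_fields(comparison, ignored, mode)
    marks: Dict[str, str] = {}
    for field, ref_value in ref.items():
        if field not in comp:
            marks[field] = "missing_field"
        elif type(comp[field]) is not type(ref_value):
            marks[field] = "type_mismatch"
        elif comp[field] != ref_value:
            marks[field] = "value_mismatch"
    # The result keeps both records alongside the difference marks.
    return {"reference": reference, "comparison": comparison, "diff": marks}


reference = {"id": 7, "amount": 100, "updated_at": "2021-04-12T10:00"}
comparison = {"id": 7, "amount": "100", "updated_at": "2021-04-12T10:05"}

strict_result = compare_records(reference, comparison, mode="strict")
relaxed_result = compare_records(reference, comparison, ignored=["updated_at"], mode="non-strict")
```

Under these assumptions the strict comparison marks both `amount` and `updated_at`, while the non-strict comparison marks only `amount`, because `updated_at` is filtered out before the fields are compared.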
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110387928.XA CN112988817B (en) | 2021-04-12 | 2021-04-12 | Data comparison method, system, electronic device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112988817A (en) | 2021-06-18 |
CN112988817B (en) | 2024-03-12 |
Family
ID=76337867
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110387928.XA Active CN112988817B (en) | 2021-04-12 | 2021-04-12 | Data comparison method, system, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112988817B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105354256A (en) * | 2015-10-22 | 2016-02-24 | 浪潮电子信息产业股份有限公司 | Data pagination query method and apparatus |
CN107944039A (en) * | 2017-12-07 | 2018-04-20 | 携程旅游网络技术(上海)有限公司 | Air ticket data transfer method, system, storage medium and electronic equipment |
CN108009207A (en) * | 2017-11-06 | 2018-05-08 | 东软集团股份有限公司 | Incremental data inquiry method and device, storage medium, electronic equipment |
CN109359060A (en) * | 2018-10-24 | 2019-02-19 | 北京奇虎科技有限公司 | Data pick-up method, apparatus calculates equipment and computer storage medium |
CN110309161A (en) * | 2019-06-06 | 2019-10-08 | 新华三大数据技术有限公司 | A kind of method of data synchronization, device and server |
CN110515974A (en) * | 2019-07-15 | 2019-11-29 | 金蝶软件(中国)有限公司 | Data pick-up method, apparatus, computer equipment and storage medium |
CN112256684A (en) * | 2020-10-23 | 2021-01-22 | 厦门悦讯信息科技股份有限公司 | Report generation method, terminal equipment and storage medium |
WO2021027363A1 (en) * | 2019-08-15 | 2021-02-18 | 平安科技(深圳)有限公司 | Data synchronization method and apparatus, computer device and storage medium |
CN112434087A (en) * | 2020-12-08 | 2021-03-02 | 中国人寿保险股份有限公司 | Cross-system data comparison method and device, electronic equipment and storage medium |
CN113900751A (en) * | 2021-09-29 | 2022-01-07 | 平安普惠企业管理有限公司 | Method, device, server and storage medium for synthesizing virtual image |
Also Published As
Publication number | Publication date |
---|---|
CN112988817B (en) | 2024-03-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||