CN112988817A - Data comparison method, system, electronic equipment and storage medium - Google Patents

Info

Publication number
CN112988817A
Authority
CN
China
Prior art keywords
data
comparison
reference data
task
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110387928.XA
Other languages
Chinese (zh)
Other versions
CN112988817B (en)
Inventor
张昌达
刘力
刘泽昕
黄书珽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Travel Network Technology Shanghai Co Ltd
Original Assignee
Ctrip Travel Network Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Travel Network Technology Shanghai Co Ltd
Priority to CN202110387928.XA
Publication of CN112988817A
Application granted
Publication of CN112988817B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data processing and provides a data comparison method, a data comparison system, an electronic device, and a storage medium. The data comparison method comprises: a task obtaining step of acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; a data extraction step of extracting, according to the paging extraction mode, reference data and comparison data meeting the query conditions from the reference data source and the comparison data source, respectively, and establishing a mapping relation between the reference data and the comparison data; and a data comparison step of comparing each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data. The invention enables differential comparison of mass data, allows data to be fetched flexibly as required, and improves the efficiency of data comparison verification and the stability of the system.

Description

Data comparison method, system, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data comparison method, system, electronic device, and storage medium.
Background
Data migration and data synchronization of different systems are very common in an information system, and before data migration, data synchronization and other operations, data needs to be compared and verified, so that the reliability of the migration, synchronization and other operations is ensured.
Traditional data comparison verification checks data consistency by sampling, manual inspection, total data verification, and similar means. Its coverage is very limited, it often involves a large amount of repeated work, it cannot be performed flexibly, and it lacks a systematic result report.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the invention and therefore may include information that does not constitute prior art that is already known to a person of ordinary skill in the art.
Disclosure of Invention
In view of this, the present invention provides a data comparison method, system, electronic device and storage medium, which can implement differentiated comparison of mass data, flexibly fetch data according to needs, and improve the efficiency of data comparison and verification and the stability of the system.
One aspect of the present invention provides a data comparison method, including: a task obtaining step: acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions of data to be compared and a paging extraction mode; a data extraction step: according to the paging extraction mode, respectively extracting reference data and comparison data meeting the query conditions from the reference data source and the comparison data source, and establishing a mapping relation between the reference data and the comparison data; data comparison step: and comparing each group of the mapped reference data with the comparison data to generate a comparison result containing the difference marks of the comparison data and the corresponding reference data.
In some embodiments, before the task obtaining step, a task generating step is further included, and the task generating step includes: receiving comparison information configured by a main task, and generating a message queue, wherein the comparison information configured by the main task comprises a reference data source, an access condition for data extraction and a paging extraction mode; receiving comparison information configured by a plurality of subtasks based on the main task, and generating a plurality of consumers of the message queue, wherein the comparison information configured by each subtask comprises a comparison data source, a query condition based on the access condition and a data comparison mode; the task obtaining step, the data extracting step and the data comparing step are respectively executed by each consumer, a plurality of consumers consume a plurality of messages of the message queue in parallel, and each message comprises comparing information of the main task and comparing information of the corresponding subtask.
In some embodiments, the access condition and the query condition are both stored as key-value pairs; in the access condition, a data characteristic of the data to be compared is used as the key and the data range of the data to be compared is used as the value; in the query condition, a variable name matching the data characteristic is used as the key and a variable value within the data range is used as the value.
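The key-value scheme can be illustrated with a minimal Python sketch. It assumes that a data range is an inclusive numeric interval and that a query condition is eventually turned into a parameterized filter; the field names `order_id` and `city_id`, the helper names, and the `?` placeholder style are hypothetical and are not specified by the source.

```python
from typing import Dict, List, Tuple

# Hypothetical access condition of a main task:
# key = data characteristic, value = inclusive range of that characteristic.
AccessCondition = Dict[str, Tuple[int, int]]
# Hypothetical query condition of a subtask:
# key = variable name matching a characteristic, value = concrete value.
QueryCondition = Dict[str, int]


def validate_query(access: AccessCondition, query: QueryCondition) -> None:
    """Raise ValueError if a query variable is unknown or out of range."""
    for name, value in query.items():
        if name not in access:
            raise ValueError(f"unknown data characteristic: {name}")
        low, high = access[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside [{low}, {high}]")


def to_where_clause(query: QueryCondition) -> Tuple[str, List[int]]:
    """Turn a validated query condition into a parameterized filter."""
    names = sorted(query)
    clause = " AND ".join(f"{name} = ?" for name in names)
    return clause, [query[name] for name in names]


access = {"order_id": (1000, 1999), "city_id": (1, 50)}
query = {"order_id": 1500, "city_id": 2}
validate_query(access, query)
clause, params = to_where_clause(query)
```

Validating the variable names against the access condition before building the filter keeps arbitrary keys out of the query text, while the values travel separately as parameters.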
In some embodiments, the data comparison step comprises: filtering out ignored fields in each group of mutually mapped reference data and comparison data according to the data comparison mode, wherein the data comparison mode comprises a strict mode and a non-strict mode; comparing the target fields retained in each group of reference data and comparison data, and, when a target field of the comparison data differs from the target field of the corresponding reference data, marking the difference according to the difference type; and generating a comparison result comprising the reference data, the comparison data, and the difference marks between the comparison data and the corresponding reference data.
In some embodiments, the ending condition of the paging extraction mode is a time cutoff condition or a number cutoff condition.
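A paged extraction loop with the two end conditions might look like the following Python sketch. The source does not say whether the time cutoff refers to wall-clock time or to a timestamp in the data; this sketch assumes a wall-clock deadline. The names `paged_rows`, `fetch_page`, `max_rows`, and `deadline` are hypothetical, and an in-memory list stands in for a real data source.

```python
from datetime import datetime
from typing import Callable, Dict, Iterator, List, Optional

Row = Dict[str, object]
FetchPage = Callable[[int, int], List[Row]]  # (offset, limit) -> rows


def paged_rows(fetch_page: FetchPage, page_size: int,
               max_rows: Optional[int] = None,
               deadline: Optional[datetime] = None) -> Iterator[Row]:
    """Yield rows page by page until the source is exhausted, the number
    cutoff (max_rows) is reached, or the time cutoff (deadline) has passed."""
    offset = 0
    emitted = 0
    while max_rows is None or emitted < max_rows:  # number cutoff condition
        if deadline is not None and datetime.now() >= deadline:
            return  # time cutoff condition (assumed to be wall-clock time)
        page = fetch_page(offset, page_size)
        if not page:
            return  # source exhausted
        offset += len(page)
        if max_rows is not None:
            page = page[:max_rows - emitted]  # trim the last page to the cutoff
        for row in page:
            emitted += 1
            yield row


# Usage with an in-memory stand-in for a data source.
table = [{"id": i} for i in range(10)]


def fetch(offset: int, limit: int) -> List[Row]:
    return table[offset:offset + limit]


first_seven = list(paged_rows(fetch, page_size=3, max_rows=7))
```

Because the function is a generator, only one page is held in memory at a time, which is the property the paging extraction mode relies on to limit load on the source server.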
In some embodiments, after the data comparison step, a visual display step is further included, and the visual display step includes: generating a visual report according to the comparison result, and displaying the visual report on a corresponding page.
In some embodiments, the comparison results are stored in Elasticsearch.
Another aspect of the present invention provides a data comparison system, including: a task obtaining module, configured to obtain comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; a data extraction module, configured to extract, according to the paging extraction mode, reference data and comparison data that meet the query conditions from the reference data source and the comparison data source, respectively, and to establish a mapping relation between the reference data and the comparison data; and a data comparison module, configured to compare each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
Yet another aspect of the present invention provides an electronic device, comprising: a processor; a memory having executable instructions stored therein; wherein the executable instructions, when executed by the processor, implement the data comparison method of any of the above embodiments.
Yet another aspect of the present invention provides a computer-readable storage medium for storing a program which, when executed, implements the data comparison method of any of the above embodiments.
Compared with the prior art, the invention has the beneficial effects that:
The invention enables differential comparison of mass data. Each item of comparison information can be configured as required, which allows flexible data access; the paging extraction mode greatly reduces the pressure on the server during data extraction, improving the efficiency of data comparison verification and the stability of the system; and the difference marks enable high-performance query of comparison results and facilitate visual display.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram illustrating steps of a data comparison method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a primary task configuration interface in an embodiment of the invention;
FIG. 3 is a schematic diagram illustrating a subtask configuration interface in an embodiment of the invention;
FIG. 4 is a schematic diagram of a comparison result display interface according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating the invocation of various services of a data comparison method according to an embodiment of the present invention;
FIG. 6 is a block diagram of a data alignment system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram showing a structure of an electronic apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
The drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
In addition, the flow shown in the drawings is only an exemplary illustration, and not necessarily includes all the steps. For example, some steps may be divided, some steps may be combined or partially combined, and the actual execution sequence may be changed according to the actual situation. It should be noted that features of the embodiments of the invention and of the different embodiments may be combined with each other without conflict.
The data comparison method enables differential comparison of large volumes of structured data: it acquires data, compares it, and displays results according to configuration, continuously compares and verifies data from different sources, stores the comparison results, and displays the differences.
Fig. 1 shows the main steps of the data comparison method in the embodiment. Referring to Fig. 1, the data comparison method in the embodiment includes: S110, a task obtaining step: acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions for the data to be compared, and a paging extraction mode; S120, a data extraction step: according to the paging extraction mode, extracting reference data and comparison data that meet the query conditions from the reference data source and the comparison data source, respectively, and establishing a mapping relation between the reference data and the comparison data; S130, a data comparison step: comparing each group of mutually mapped reference data and comparison data to generate a comparison result containing difference marks between the comparison data and the corresponding reference data.
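Steps S110 to S130 can be sketched end to end in Python as follows. This is an illustrative outline under simplifying assumptions, not the patented implementation: paging is omitted for brevity, data sources are modeled as functions over in-memory lists, rows are mapped by a single key field, and the names `ComparisonTask`, `key_field`, `make_source`, and the mark strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
Source = Callable[[Dict[str, object]], List[Row]]  # query condition -> rows


@dataclass
class ComparisonTask:
    """S110: the comparison information of one comparison task."""
    reference_source: Source
    comparison_source: Source
    query_condition: Dict[str, object]
    key_field: str  # hypothetical: field used to map reference rows to comparison rows


def run_task(task: ComparisonTask) -> List[dict]:
    # S120: extract from both sources and map rows by key.
    reference = {r[task.key_field]: r
                 for r in task.reference_source(task.query_condition)}
    comparison = {r[task.key_field]: r
                  for r in task.comparison_source(task.query_condition)}
    results = []
    # S130: compare each mapped pair and record a difference mark.
    for key in sorted(reference.keys() | comparison.keys()):
        ref, other = reference.get(key), comparison.get(key)
        if other is None:
            mark = "missing_in_comparison"
        elif ref is None:
            mark = "missing_in_reference"
        elif ref != other:
            mark = "different"
        else:
            mark = "same"
        results.append({"key": key, "reference": ref,
                        "comparison": other, "mark": mark})
    return results


def make_source(rows: List[Row]) -> Source:
    """In-memory stand-in for a database or API data source."""
    def source(condition: Dict[str, object]) -> List[Row]:
        return [r for r in rows
                if all(r.get(k) == v for k, v in condition.items())]
    return source


ref_rows = [{"id": 1, "city": "SH", "price": 100},
            {"id": 2, "city": "SH", "price": 80},
            {"id": 3, "city": "BJ", "price": 50}]
cmp_rows = [{"id": 1, "city": "SH", "price": 100},
            {"id": 2, "city": "SH", "price": 85}]
task = ComparisonTask(make_source(ref_rows), make_source(cmp_rows),
                      {"city": "SH"}, "id")
results = run_task(task)
```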
The data comparison method enables differential comparison of mass data. Each item of comparison information can be configured as required, which allows flexible data access; the paging extraction mode greatly reduces the pressure on the server during data extraction, improving the efficiency of data comparison verification and the stability of the system; and the difference marks enable high-performance query of comparison results and facilitate visual display.
In one embodiment, before the task obtaining step, a task generating step is further included, and the task generating step includes: receiving comparison information configured by a main task, and generating a message queue, wherein the comparison information configured by the main task comprises a reference data source, an access condition for data extraction, and a paging extraction mode; receiving comparison information configured by a plurality of subtasks based on the main task, and generating a plurality of consumers of the message queue, wherein the comparison information configured by each subtask comprises a comparison data source, a query condition based on the access condition, and a data comparison mode. The task obtaining step, the data extraction step, and the data comparison step are executed by each consumer; the plurality of consumers consume the messages of the message queue in parallel, and each message comprises the comparison information of the main task and the comparison information of the corresponding subtask.
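The producer and consumer arrangement can be sketched with Python's standard library. An in-process `queue.Queue` with worker threads stands in here for whatever message broker a real deployment would use; the source does not name one. The message layout (`main_task`, `subtask`) and the `handle` callback are hypothetical.

```python
import queue
import threading
from typing import Callable, List


def run_consumers(messages: List[dict], handle: Callable[[dict], str],
                  workers: int = 3) -> List[str]:
    """Put every message on a queue and let several consumers drain it in parallel."""
    q: "queue.Queue[dict]" = queue.Queue()
    for message in messages:
        q.put(message)

    results: List[str] = []
    lock = threading.Lock()

    def consume() -> None:
        while True:
            try:
                message = q.get_nowait()
            except queue.Empty:
                return  # nothing left to consume
            outcome = handle(message)
            with lock:  # protect the shared result list
                results.append(outcome)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Each message carries the main task's information plus one subtask's information.
main_task = {"reference_source": "orders_old", "paging": True}
subtasks = [{"comparison_source": "orders_new_a"},
            {"comparison_source": "orders_new_b"}]
messages = [{"main_task": main_task, "subtask": s} for s in subtasks]


def handle(message: dict) -> str:
    # Placeholder for the task obtaining, data extraction, and comparison steps.
    main, sub = message["main_task"], message["subtask"]
    return f'{main["reference_source"]} vs {sub["comparison_source"]}'


outcomes = sorted(run_consumers(messages, handle))
```

The completion order of the consumers is not deterministic, so the example sorts the outcomes before using them.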
Fig. 2 shows a main task configuration interface 200 in an embodiment. Referring to Fig. 2, the main task, i.e., the driving task, can be configured in the main task configuration interface 200, including configuring the reference data source (data name) 210, the access condition 220, and the paging extraction mode (whether to page) 230. The data source types selectable for the reference data source 210 are SQLSERVER, MYSQL, and ES. The access condition 220 can be stored as key-value pairs; in the access condition 220, a data characteristic of the data to be compared is used as the key and the data range of the data to be compared is used as the value, so that the data range of the data to be compared can be determined. The paging extraction mode 230 may use a time cutoff condition or a number cutoff condition as its end condition to achieve automatic paged access. In the main task configuration interface 200, a designated recipient may also be configured, so that the relevant comparison result is pushed to that recipient after the main task is completed. In addition, information such as the access environment and access path can be configured as needed, and after configuration is completed, clicking Save generates the comparison information configured by the main task.
Table 1 shows a main task configuration information table, which is specifically as follows:
[Table 1: main task configuration information table (image in the original publication, not reproduced)]
Fig. 3 shows a subtask configuration interface 300 in an embodiment. With reference to Fig. 2 and Fig. 3, a subtask under the main task can be configured in the subtask configuration interface 300, including configuring the comparison data source (data name) 310, the query condition 320, and the data comparison mode (whether the strict mode is used) 330. A subtask may be used to perform one data comparison between the comparison data source 310 and the reference data source 210. The query condition 320 may be stored as key-value pairs; in the query condition 320, variable names matching the data characteristics are used as keys and variable values within the data range are used as values, so that the data content of the data to be compared can be determined. In the subtask, the access condition 220 configured by the main task is replaced with variables to form the query condition 320; that is, using variable names and values that conform to the data characteristics and data range of the main task, the access condition is turned into a reference condition that can be queried directly. The ignore condition 340, i.e., the set of ignored fields, differs between the strict mode and the non-strict mode. The system generates the ignore condition 340 automatically according to whether the strict mode is selected, so that inherent differences between the two data sources do not interfere with the data comparison.
In the subtask configuration interface 300, a mapping relationship between the reference data and the comparison data may also be configured. When the reference fields of the reference data and the comparison fields of the comparison data all have specific field names, for example when the reference data and the comparison data come from a database (DB), the field names can be used as the mapping, and each time a reference field and a comparison field with the same field name are taken as a group for comparison. When the reference data and the comparison data come from an interface (API) or an ES database, so that the reference fields and the comparison fields cannot be mapped by field name, the mapping can follow the order of the reference fields and the comparison fields in the data tables, and each time a reference field and a comparison field at the same position in the reference data table and the comparison data table are compared as a group. Here the reference fields and the comparison fields are the target fields that remain after the ignored fields are filtered out. In addition, when the reference fields and the comparison fields are in a many-to-many relationship, several parameter names may be designated as a composite primary key that uniquely identifies one record for mapped comparison.
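The three mapping strategies described above (by field name, by field order, and by composite primary key) can be sketched in Python as follows. The helper names and the sample fields `hotel`, `room`, and `price` are hypothetical, and records are modeled as plain dictionaries or sequences.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def map_fields_by_name(reference: Row, comparison: Row) -> List[Tuple[object, object]]:
    """Pair the values of fields that have the same name in both records."""
    return [(reference[name], comparison[name])
            for name in reference if name in comparison]


def map_fields_by_order(reference: Sequence[object],
                        comparison: Sequence[object]) -> List[Tuple[object, object]]:
    """Pair values by position when field names are unavailable."""
    return list(zip(reference, comparison))


def index_by_composite_key(rows: List[Row],
                           key_fields: Sequence[str]) -> Dict[tuple, Row]:
    """Index records by a composite key so each key identifies exactly one record."""
    index: Dict[tuple, Row] = {}
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key in index:
            raise ValueError(f"composite key {key} is not unique")
        index[key] = row
    return index


ref_rows = [{"hotel": 1, "room": "A", "price": 100},
            {"hotel": 1, "room": "B", "price": 120}]
cmp_rows = [{"hotel": 1, "room": "B", "price": 120},
            {"hotel": 1, "room": "A", "price": 90}]

ref_index = index_by_composite_key(ref_rows, ["hotel", "room"])
cmp_index = index_by_composite_key(cmp_rows, ["hotel", "room"])
pairs = {key: map_fields_by_name(ref_index[key], cmp_index[key])
         for key in ref_index if key in cmp_index}
```

Indexing both sides by the composite key makes the pairing independent of the order in which the two sources return their records.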
Tables 2 and 3 show subtask configuration information tables, which are specifically as follows:
[Tables 2 and 3: subtask configuration information tables (images in the original publication, not reproduced)]
In the subtask configuration information tables, the subtask method means that, when the structure of the data to be compared is complex and special conversion, calculation, or the like is required, another execution method can be called to perform the comparison.
Further, a general configuration of the data source may also be performed, and table 4 shows a data logic configuration information table, which is specifically as follows:
[Table 4: data logic configuration information table (image in the original publication, not reproduced)]
According to the configured main task and subtasks, the data acquired by each subtask can be compared according to the configuration by way of message consumption. During comparison, the relevant fields are ignored automatically and only the data of the target fields is compared. Specifically, the data comparison step comprises: filtering out ignored fields in each group of mutually mapped reference data and comparison data according to the data comparison mode, wherein the data comparison mode comprises a strict mode and a non-strict mode; comparing the target fields retained in each group of reference data and comparison data, and, when a target field of the comparison data differs from the target field of the corresponding reference data, marking the difference according to the difference type; and generating a comparison result comprising the reference data, the comparison data, and the difference marks between the comparison data and the corresponding reference data. Each time a subtask finishes executing, the state of the main task and the number of unexecuted subtasks are updated.
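The filtering and marking logic of the data comparison step can be sketched in Python as follows. The source states that the system derives the ignored fields from the selected mode but does not enumerate them, nor does it list the difference types; the ignore lists and the mark names (`missing`, `extra`, `type_mismatch`, `value_mismatch`) below are therefore assumptions made for illustration.

```python
from typing import Dict

Row = Dict[str, object]

# Hypothetical ignore lists per data comparison mode.
IGNORED_FIELDS = {
    "strict": {"update_time"},
    "non_strict": {"update_time", "operator", "remark"},
}


def compare_pair(reference: Row, comparison: Row, mode: str) -> dict:
    """Compare one mapped pair of records and mark each differing target field."""
    ignored = IGNORED_FIELDS[mode]
    marks: Dict[str, str] = {}
    # Target fields are whatever remains after the ignored fields are filtered out.
    for field in sorted((reference.keys() | comparison.keys()) - ignored):
        if field not in comparison:
            marks[field] = "missing"
        elif field not in reference:
            marks[field] = "extra"
        elif reference[field] != comparison[field]:
            if type(reference[field]) is not type(comparison[field]):
                marks[field] = "type_mismatch"
            else:
                marks[field] = "value_mismatch"
    return {"reference": reference, "comparison": comparison,
            "difference_marks": marks}


ref_row = {"id": 1, "price": 100, "operator": "a", "update_time": "t1"}
cmp_row = {"id": 1, "price": "100", "operator": "b", "update_time": "t2"}

strict_result = compare_pair(ref_row, cmp_row, "strict")
loose_result = compare_pair(ref_row, cmp_row, "non_strict")
```

In this example the strict mode reports differences in both `operator` and `price`, while the non-strict mode ignores `operator` and reports only `price`.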
After the data comparison step, a visual display step is further included, and the visual display step comprises the following steps: and generating a visual report according to the comparison result, and displaying the visual report to a corresponding page.
Fig. 4 shows a comparison result display interface 400 in the embodiment, and referring to fig. 4, the comparison result display interface 400 may perform visual display according to the comparison result data of the query, and for the comparison data having a difference compared with the reference data, the comparison data may be distinguished by using difference marks 410 such as different colors and different patterns according to the difference type, so as to perform the difference display intuitively.
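One simple way to turn difference marks into a color-coded display is sketched below in Python, rendering an HTML table with the standard library only. The result layout, the mark names, and the color values are hypothetical; the source does not describe how the display interface is built.

```python
from html import escape
from typing import List, Sequence

# Hypothetical mapping from difference type to background color.
MARK_COLORS = {
    "value_mismatch": "#f8d7da",
    "type_mismatch": "#fff3cd",
    "missing": "#d6d8db",
}


def render_report(fields: Sequence[str], results: List[dict]) -> str:
    """Render comparison data as an HTML table, coloring cells that carry a mark."""
    header = "".join(f"<th>{escape(f)}</th>" for f in fields)
    body = []
    for result in results:
        cells = []
        for field in fields:
            mark = result["difference_marks"].get(field)
            color = MARK_COLORS.get(mark)
            style = f' style="background:{color}"' if color else ""
            value = escape(str(result["comparison"].get(field, "")))
            cells.append(f"<td{style}>{value}</td>")
        body.append("<tr>" + "".join(cells) + "</tr>")
    return "<table><tr>" + header + "</tr>" + "".join(body) + "</table>"


results = [{"comparison": {"id": 1, "price": 90},
            "difference_marks": {"price": "value_mismatch"}}]
report_html = render_report(["id", "price"], results)
```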
Table 5 shows a summary information table of the comparison results, which is as follows:
[Table 5: summary information table of comparison results (image in the original publication, not reproduced)]
In the above summary information table of comparison results, the task ID identifies a group of data comparison tasks under the main task.
The compared source data and the comparison result data are stored in Elasticsearch as follows:
[Storage structure of the source data and comparison result data in Elasticsearch (images in the original publication, not reproduced)]
Furthermore, each service of the data comparison method can be exposed as an API so that it is convenient to call. Fig. 5 is a schematic diagram of invoking the services of the data comparison method in the embodiment. Referring to Fig. 5, a user (e.g., a tester or developer) may invoke any of the main task service 510, the subtask service 520, the source data query service 530, the comparison service 540, and the comparison result query service 550 as needed to configure comparison information, fetch data, compare data, query results, and so on. The main task service 510 may be used for main task configuration, automatic paging setup, comparison task generation, declaring message queues, and the like. The subtask service 520 may be used for subtask configuration, consuming messages, and executing subtask queries to obtain reference data and comparison data. The source data query service 530 may be used for source data queries against DB databases, ES databases, API interfaces, and the like. The comparison service 540 may be used for comparing data, marking differences, and storing the comparison result. The comparison result query service 550 may be used for querying the comparison result and displaying it visually.
In conclusion, the data comparison method extracts the data to be compared from the data sources in the paging extraction mode, which can greatly reduce the pressure on the server when source data is extracted. A message queue is used: the consumer receives a message and uses the key values in the message as access parameters to extract data from the reference data source and the comparison data source for comparison. The message queue enables distributed comparison verification, greatly improves the efficiency of comparison verification and the stability of the system, and makes comparison of mass data practical. By switching between the strict mode and the non-strict mode, specific fields can be ignored in a configurable manner, which removes the interference of normal differences in an actual system. Difference marks indicate the specific difference type of the comparison data, or indicate that a difference is ignored. The comparison result is automatically formatted, processed, and stored in ES, which makes comparison and storage of mass data at the scale of millions of records possible and enables high-performance query; in addition, screening the results shown in the visual report improves verification efficiency.
The embodiment of the invention also provides a data comparison system, which can be used to implement the data comparison method described in any of the above embodiments. The features and principles of the data comparison method described in any of the above embodiments can be applied to the following data comparison system embodiments. In the following data comparison system embodiments, features and principles of data comparison that have already been explained are not repeated.
Fig. 6 shows the main modules of the data comparison system 600 in the embodiment, and referring to fig. 6, the data comparison system 600 in the embodiment includes: the task obtaining module 610 is configured to obtain comparison information of the comparison task, where the comparison information includes a reference data source, a comparison data source, a query condition of data to be compared, and a paging extraction mode; a data extraction module 620, configured to extract, according to the paging extraction mode, reference data and comparison data that meet the query condition from the reference data source and the comparison data source, respectively, and establish a mapping relationship between the reference data and the comparison data; the data comparing module 630 is configured to compare each set of the reference data and the comparison data mapped to each other, and generate a comparison result including a difference flag between the comparison data and the corresponding reference data.
Further, the data comparison system 600 may further include modules for implementing other process steps of the above embodiments of the data comparison method, and specific principles of each module may refer to the description of the above embodiments of the data comparison method, and will not be repeated here.
As described above, the data comparison system of the present invention can implement differentiated comparison of mass data; each item of comparison information can be configured as required, flexible access is realized, the pressure on a server during data extraction is greatly reduced through a paging extraction mode, distributed comparison verification is realized through a message queue, and the efficiency of data comparison verification and the stability of a system are improved; and high-performance query of comparison results is realized and visual display is facilitated through the difference marks.
The embodiment of the present invention further provides an electronic device, which includes a processor and a memory, where the memory stores executable instructions, and when the executable instructions are executed by the processor, the data comparison method described in any of the above embodiments is implemented.
As described above, the electronic device of the present invention enables differential comparison of mass data. Each item of comparison information can be configured as required, which allows flexible data access; the paging extraction mode greatly reduces the pressure on the server during data extraction, the message queue enables distributed comparison verification, and the efficiency of data comparison verification and the stability of the system are improved; and the difference marks enable high-performance query of comparison results and facilitate visual display.
Fig. 7 is a schematic structural diagram of an electronic device in an embodiment of the present invention. It should be understood that Fig. 7 only schematically illustrates the modules, which may be virtual software modules or actual hardware modules; combining or splitting these modules, or adding other modules, remains within the scope of the present invention.
As shown in Fig. 7, the electronic device 700 takes the form of a general-purpose computing device. Its components include, but are not limited to: at least one processing unit 710, at least one storage unit 720, a bus 730 connecting the different platform components (including the storage unit 720 and the processing unit 710), a display unit 740, and so on.
The storage unit 720 stores program code that can be executed by the processing unit 710, so that the processing unit 710 performs the steps of the data comparison method described in any of the above embodiments. For example, the processing unit 710 may perform the steps shown in Fig. 1.
The storage unit 720 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM) 7201 and/or a cache memory unit 7202, and may further include a read-only memory unit (ROM) 7203.
The storage unit 720 may also include a program/utility 7204 having one or more program modules 7205. Such program modules 7205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these, or some combination thereof, may include an implementation of a network environment.
The bus 730 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 700 may also communicate with one or more external devices 800, such as a keyboard, a pointing device, or a Bluetooth device. These external devices 800 enable a user to interact with the electronic device 700. The electronic device 700 may also communicate with one or more other computing devices, including routers and modems. Such communication may occur via an input/output (I/O) interface 750. The electronic device 700 may further communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 760. The network adapter 760 may communicate with other modules of the electronic device 700 via the bus 730. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.
The embodiment of the present invention further provides a computer-readable storage medium for storing a program, and when the program is executed, the data comparison method described in any of the above embodiments is implemented. In some possible embodiments, the various aspects of the present invention may also be implemented in the form of a program product, which includes program code for causing a terminal device to perform the data comparison method described in any of the above embodiments, when the program product is run on the terminal device.
As described above, the computer-readable storage medium of the present invention enables differentiated comparison of massive data sets. Each item of comparison information can be configured as required, which allows flexible access; paged extraction greatly reduces the load on the server during data extraction; distributing comparison and verification through a message queue improves both the efficiency of comparison and verification and the stability of the system; and the difference marks enable high-performance querying of comparison results and facilitate visual display.
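The message-queue distribution mentioned above can be sketched as follows. This is a minimal, single-process Python illustration in which the standard-library `queue.Queue` and worker threads stand in for a real message broker and its consumers; the function names `build_messages` and `consume_in_parallel`, the `"main"`/`"sub"` message layout, and the fixed consumer count are assumptions, not details taken from the specification.

```python
import queue
import threading
from typing import Callable, Dict, List


def build_messages(main_task: Dict, subtasks: List[Dict]) -> List[Dict]:
    """Pair the main task's comparison information with each subtask's."""
    return [{"main": main_task, "sub": sub} for sub in subtasks]


def consume_in_parallel(messages: List[Dict],
                        handle: Callable[[Dict], object],
                        consumers: int = 2) -> List[object]:
    """Run `handle` on every message using several consumer threads."""
    task_queue = queue.Queue()
    for message in messages:
        task_queue.put(message)

    results: List[object] = []
    results_lock = threading.Lock()

    def consumer() -> None:
        while True:
            try:
                message = task_queue.get_nowait()
            except queue.Empty:
                return  # queue drained; this consumer stops
            outcome = handle(message)
            with results_lock:  # protect the shared result list
                results.append(outcome)

    threads = [threading.Thread(target=consumer) for _ in range(consumers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

In a deployment, `handle` would perform the task obtaining, data extraction, and data comparison steps for one message. Results arrive in whatever order the consumers finish, so callers should not rely on their ordering.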
Fig. 8 is a schematic structural diagram of a computer-readable storage medium of the present invention. Referring to fig. 8, a program product 900 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer-readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device, for example through the Internet using an Internet service provider.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. A data comparison method, comprising:
a task obtaining step: acquiring comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions of data to be compared and a paging extraction mode;
a data extraction step: according to the paging extraction mode, respectively extracting reference data and comparison data meeting the query conditions from the reference data source and the comparison data source, and establishing a mapping relation between the reference data and the comparison data;
a data comparison step: comparing each group of the mapped reference data and comparison data to generate a comparison result containing the difference marks of the comparison data relative to the corresponding reference data.
2. The data comparison method of claim 1, wherein before the task obtaining step, a task generating step is further included, and the task generating step includes:
receiving comparison information configured by a main task, and generating a message queue, wherein the comparison information configured by the main task comprises a reference data source, an access condition for data extraction and a paging extraction mode;
receiving comparison information configured by a plurality of subtasks based on the main task, and generating a plurality of consumers of the message queue, wherein the comparison information configured by each subtask comprises a comparison data source, a query condition based on the access condition and a data comparison mode;
the task obtaining step, the data extraction step and the data comparison step are executed by each consumer, the plurality of consumers consume a plurality of messages of the message queue in parallel, and each message comprises the comparison information of the main task and the comparison information of the corresponding subtask.
3. The data comparison method of claim 2, wherein the access condition and the query condition are both stored as key-value pairs;
in the access condition, the data characteristics of the data to be compared are taken as keys, and the data range of the data to be compared is taken as a value;
in the query conditions, variable names satisfying the data characteristics are used as keys, and variable values satisfying the data range are used as values.
4. The data comparison method of claim 2, wherein the data comparison step comprises:
filtering ignored fields in each group of the mapped reference data and comparison data according to the data comparison mode, wherein the data comparison mode comprises a strict mode and a non-strict mode;
comparing the target fields retained in each group of the reference data and the comparison data, and, when a target field of the comparison data differs from the target field of the corresponding reference data, marking the difference according to its difference type;
generating a comparison result comprising the reference data, the comparison data and the difference marks of the comparison data relative to the corresponding reference data.
5. The data comparison method of claim 1, wherein the end condition of the paging extraction mode is a time cutoff condition or a quantity cutoff condition.
6. The data comparison method of claim 1, further comprising a visualization step after the data comparison step, the visualization step comprising:
generating a visual report according to the comparison result, and displaying the visual report on a corresponding page.
7. The data comparison method of claim 1, wherein the comparison result is stored in Elasticsearch.
8. A data comparison system, comprising:
a task obtaining module, used for obtaining comparison information of a comparison task, wherein the comparison information comprises a reference data source, a comparison data source, query conditions of data to be compared and a paging extraction mode;
a data extraction module, used for respectively extracting the reference data and the comparison data which meet the query conditions from the reference data source and the comparison data source according to the paging extraction mode, and establishing a mapping relation between the reference data and the comparison data;
a data comparison module, used for comparing each group of the mapped reference data and comparison data to generate a comparison result containing the difference marks of the comparison data relative to the corresponding reference data.
9. An electronic device, comprising:
a processor;
a memory having executable instructions stored therein;
wherein the executable instructions, when executed by the processor, implement the data comparison method of any one of claims 1-7.
10. A computer-readable storage medium storing a program which, when executed, implements the data comparison method of any one of claims 1-7.
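The key-value conditions of claim 3 and the mode-dependent field filtering of claim 4 can be illustrated with a short Python sketch. It rests on several assumptions that the claims do not state: that strict mode compares every field while non-strict mode drops a configured set of ignored fields, that a data range can be modeled as any container supporting `in`, and that the difference types are the three labels used below. All function and label names are hypothetical.

```python
from typing import Container, Dict, FrozenSet


def query_within_access(access: Dict[str, Container],
                        query: Dict[str, object]) -> bool:
    """Check that every query variable names an allowed characteristic
    and that its value falls inside that characteristic's data range."""
    return all(name in access and value in access[name]
               for name, value in query.items())


def target_fields(record: Dict[str, object], mode: str,
                  ignored: FrozenSet[str]) -> Dict[str, object]:
    """Assumed semantics: strict keeps all fields, non-strict drops ignored ones."""
    if mode == "strict":
        return dict(record)
    return {k: v for k, v in record.items() if k not in ignored}


def mark_differences(reference: Dict[str, object],
                     comparison: Dict[str, object],
                     mode: str,
                     ignored: FrozenSet[str] = frozenset()) -> Dict[str, str]:
    """Return a difference mark, keyed by field, for each differing target field."""
    ref = target_fields(reference, mode, ignored)
    comp = target_fields(comparison, mode, ignored)
    marks: Dict[str, str] = {}
    for field in sorted(ref.keys() | comp.keys()):
        if field not in comp:
            marks[field] = "missing_in_comparison"
        elif field not in ref:
            marks[field] = "extra_in_comparison"
        elif ref[field] != comp[field]:
            marks[field] = "value_mismatch"
    return marks
```

For example, with an access condition `{"city": {"SHA", "BJS"}}`, the query condition `{"city": "SHA"}` is accepted and `{"city": "CAN"}` is rejected. Storing the marks per field, next to the reference and comparison records, is one way to make the comparison result directly queryable by difference type.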
CN202110387928.XA 2021-04-12 2021-04-12 Data comparison method, system, electronic device and storage medium Active CN112988817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110387928.XA CN112988817B (en) 2021-04-12 2021-04-12 Data comparison method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110387928.XA CN112988817B (en) 2021-04-12 2021-04-12 Data comparison method, system, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN112988817A true CN112988817A (en) 2021-06-18
CN112988817B CN112988817B (en) 2024-03-12

Family

ID=76337867

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110387928.XA Active CN112988817B (en) 2021-04-12 2021-04-12 Data comparison method, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112988817B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105354256A (en) * 2015-10-22 2016-02-24 浪潮电子信息产业股份有限公司 Data pagination query method and apparatus
CN107944039A (en) * 2017-12-07 2018-04-20 携程旅游网络技术(上海)有限公司 Air ticket data transfer method, system, storage medium and electronic equipment
CN108009207A (en) * 2017-11-06 2018-05-08 东软集团股份有限公司 Incremental data inquiry method and device, storage medium, electronic equipment
CN109359060A (en) * 2018-10-24 2019-02-19 北京奇虎科技有限公司 Data pick-up method, apparatus calculates equipment and computer storage medium
CN110309161A (en) * 2019-06-06 2019-10-08 新华三大数据技术有限公司 A kind of method of data synchronization, device and server
CN110515974A (en) * 2019-07-15 2019-11-29 金蝶软件(中国)有限公司 Data pick-up method, apparatus, computer equipment and storage medium
CN112256684A (en) * 2020-10-23 2021-01-22 厦门悦讯信息科技股份有限公司 Report generation method, terminal equipment and storage medium
WO2021027363A1 (en) * 2019-08-15 2021-02-18 平安科技(深圳)有限公司 Data synchronization method and apparatus, computer device and storage medium
CN112434087A (en) * 2020-12-08 2021-03-02 中国人寿保险股份有限公司 Cross-system data comparison method and device, electronic equipment and storage medium
CN113900751A (en) * 2021-09-29 2022-01-07 平安普惠企业管理有限公司 Method, device, server and storage medium for synthesizing virtual image

Also Published As

Publication number Publication date
CN112988817B (en) 2024-03-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant