WO2022068316A1 - Data reconciliation method and apparatus, device, and storage medium - Google Patents

Data reconciliation method and apparatus, device, and storage medium

Info

Publication number
WO2022068316A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
accounting
content
reconciliation
accessed
Prior art date
Application number
PCT/CN2021/106203
Other languages
French (fr)
Chinese (zh)
Inventor
黄恒
谢永恒
万月亮
Original Assignee
北京锐安科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京锐安科技有限公司
Publication of WO2022068316A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Definitions

  • The present application relates to the field of computer information technology, for example, to a data reconciliation method, apparatus, device, and storage medium.
  • Data access is an indispensable link in the use of big data systems.
  • On the way from data access to data storage, the data of a data source undergoes a series of logical processing steps, such as data discarding, field filling, conversion, and deduplication.
  • During this logical processing, garbled data, unparseable data, data errors, and similar problems may occur, which make the data source before data access inconsistent with the data after data storage.
  • The present application provides a data reconciliation method, apparatus, device, and storage medium, so as to achieve the technical effect of accurately reconciling data before and after data access.
  • This application provides a data reconciliation method, which includes:
  • the first accounting content includes: the number of data bars and fingerprint information of the data to be accessed
  • the second accounting content includes: the number of data bars and fingerprint information of the access data.
  • The application also provides a data reconciliation device, the device comprising:
  • a data acquisition module configured to acquire the data to be accessed in the data source
  • a first accounting module configured to perform first accounting on the data to be accessed to obtain first accounting content
  • a data access module configured to access the data to be accessed to a target database to obtain access data
  • a second accounting module configured to perform second accounting for the access data to obtain second accounting content
  • a reconciliation module configured to determine that the reconciliation is successful when the first and second accounting contents are the same
  • the first accounting content includes: the number of data bars and fingerprint information of the data to be accessed
  • the second accounting content includes: the number of data bars and fingerprint information of the access data.
  • The present application also provides a computer device, wherein the computer device includes:
  • a storage device configured to store at least one program
  • at least one processor, wherein when the at least one program is executed by the at least one processor, the at least one processor implements the above data reconciliation method.
  • The present application also provides a computer-readable storage medium storing a computer program, wherein when the program is executed by a processor, the above data reconciliation method is implemented.
  • FIG. 1 is a flowchart of a data reconciliation method provided in Embodiment 1 of the present application.
  • FIG. 1a is a schematic diagram of a data reconciliation method provided in Embodiment 1 of the present application.
  • FIG. 1b is a schematic diagram of a data reconciliation method provided in Embodiment 1 of the present application.
  • FIG. 2 is a structural diagram of a data reconciliation device provided in Embodiment 2 of the present application.
  • FIG. 3 is a structural diagram of a computer device according to Embodiment 3 of the present application.
  • FIG. 1 is a flowchart of a data reconciliation method provided in Embodiment 1 of the present application. This embodiment can be applied to a scenario where data access is required.
  • The method can be executed by a data reconciliation device, which can be implemented in software and/or hardware and can be integrated into an electronic device that has storage and computing capabilities and can perform text processing.
  • The technical solution provided by this embodiment includes the following steps:
  • Step S110: Acquire the data to be accessed from the data source.
  • The data source is where the data to be accessed is acquired from, that is, the location where the required data to be accessed is stored.
  • The data source may be a collection of to-be-accessed data corresponding to multiple access tasks, or may be an access source, a database, or a file system of the to-be-accessed data.
  • The data structure of each data bar in the data source includes a unique identifier for that data bar, and the unique identifier does not change during the data access process.
  • The data to be accessed is the data that needs to be accessed, and it corresponds to an access task.
  • Acquiring the data to be accessed may consist of determining, according to the access task, which data needs to be accessed, and then acquiring that data from the data source.
  • An access task is a specified event that requires data access. For example, if an access task that starts at a set time is to access the data entered into the file system yesterday, then the data entered into the file system yesterday is determined to be the data to be accessed and is obtained from the file system.
  • For instance, if the access task is to access the data entered in the file system on September 15, 2020, the data entered in the file system on September 15, 2020 is obtained from the file system.
  • Step S120: Perform first accounting on the data to be accessed to obtain first accounting content.
  • The first accounting is performed on the acquired data to be accessed.
  • The first accounting records the data volume and data content of the data to be accessed.
  • The first accounting content is the data volume and data content information obtained by performing the first accounting on the data to be accessed.
  • The first accounting records the data to be accessed as the original data, so that after the access action is completed, the access data can be checked against the data to be accessed to judge whether the data bars and the content of the data fields have changed before and after access.
  • Performing the first accounting on the data to be accessed to obtain the first accounting content includes: acquiring the number of data bars of the data to be accessed; determining the core field corresponding to each data bar according to the data content corresponding to that data bar; and determining the fingerprint information of the core field corresponding to each data bar by using a message digest algorithm.
  • The data in the data source is stored sequentially by data bar, so the number of data bars of the data to be accessed obtained from the data source reflects the data volume of the data to be accessed, while the fingerprint information corresponding to each data bar reflects the data content of the data to be accessed.
  • The core field corresponding to each data bar is a field of that data bar whose data content is fixed.
  • The fingerprint information corresponding to each data bar is generated from the core field of that data bar by using a message digest algorithm. For example, Message-Digest Algorithm 5 (MD5) is used to generate an MD5 value from the core field corresponding to each data bar, and the fingerprint information generated for each data bar includes the unique identifier of the data bar and the MD5 value corresponding to the data bar.
  • The first accounting content of the data to be accessed includes a data bar accounting statement and a data fingerprint information record table, wherein the data bar accounting statement is used to record the number of data bars of the data to be accessed.
  • The data fingerprint information record table is used to record the fingerprint information corresponding to each data bar of the data to be accessed.
  • The fingerprint information generated for each data bar of the data to be accessed contains the unique identifier of that data bar because a single MD5 value obtained with the MD5 algorithm may correspond to multiple data bars. Combining the unique identifier of each data bar with its MD5 value therefore makes the fingerprint information generated by the message digest algorithm unique.
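  • The first accounting described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the `CORE_FIELDS` tuple, the `id` key, the field separator, and the function names are assumptions made for the example.

```python
import hashlib

# Hypothetical core fields, chosen only for illustration.
CORE_FIELDS = ("code", "name", "weight")


def fingerprint(record: dict, core_fields=CORE_FIELDS) -> tuple:
    """Return (unique identifier, MD5 hex digest of the core-field values)."""
    # Join the core-field values in a fixed order so the digest is deterministic.
    payload = "\x1f".join(str(record.get(f, "")) for f in core_fields)
    digest = hashlib.md5(payload.encode("utf-8")).hexdigest()
    # Pair the digest with the unique identifier, since two data bars
    # may share the same MD5 value.
    return record["id"], digest


def first_accounting(records: list) -> dict:
    """Record the number of data bars and the fingerprint of each data bar."""
    return {
        "count": len(records),
        "fingerprints": dict(fingerprint(r) for r in records),
    }
```

  • Because only the core fields feed the digest, adding or changing an extended field leaves the fingerprint of a data bar unchanged, while two data bars with identical core fields remain distinguishable by identifier.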
  • The first accounting content obtained by performing the first accounting on the data to be accessed includes a data bar accounting statement; a sample statement is shown in Table 1 below:
  • The data bar accounting statement of the first accounting content of the data to be accessed includes: accounting date, number of data bars, and accounting type.
  • The accounting date is the date when the data to be accessed was entered into the data source, and the accounting type indicates the source of the data.
  • The accounting type "1" in Table 1 indicates that the data comes from a data source, and the accounting type "2" indicates that the data comes from an access database.
  • The fingerprint information corresponding to each data bar of the data to be accessed is also recorded; the data fingerprint information record table is shown in Table 2 below:
  • The data fingerprint information record table of the first accounting content of the data to be accessed includes: date, data bar ID, accounting type, and data fingerprint information.
  • The date is the date when the fingerprint information is generated from the data to be accessed.
  • The data bar ID is the ID of each data bar of the data to be accessed, and the data bar ID does not change during the data access process.
  • The accounting type indicates the source of the data.
  • The data fingerprint information is the fingerprint information generated by the message digest algorithm from the core field corresponding to each data bar.
  • The data bar IDs "1" and "2" in Table 2 are only examples showing that the unique identifier of each data bar does not change before and after data access; they are not actual data bar IDs.
  • The accounting type "1" in Table 2 indicates that the data comes from a data source.
  • The accounting type "2" indicates that the data comes from an access database.
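  • As an illustration of the two record layouts described above, the following sketch models one row of the data bar accounting statement and one row of the data fingerprint information record table. The class and attribute names are assumptions, and the sample values are invented for the example rather than taken from Table 1 or Table 2.

```python
from dataclasses import dataclass
from enum import IntEnum


class AccountingType(IntEnum):
    DATA_SOURCE = 1      # type "1": the data comes from a data source
    ACCESS_DATABASE = 2  # type "2": the data comes from an access database


@dataclass(frozen=True)
class BarCountRecord:
    """One row of the data bar accounting statement."""
    accounting_date: str
    bar_count: int
    accounting_type: AccountingType


@dataclass(frozen=True)
class FingerprintRecord:
    """One row of the data fingerprint information record table."""
    date: str
    bar_id: int
    accounting_type: AccountingType
    fingerprint: str


# Invented sample rows: the same data bar ID recorded before and after access.
before = FingerprintRecord("2020-09-15", 1, AccountingType.DATA_SOURCE, "d41d8cd9")
after = FingerprintRecord("2020-09-15", 1, AccountingType.ACCESS_DATABASE, "d41d8cd9")
```

  • Keeping the accounting type in each row lets records from both sides share one table while remaining distinguishable.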
  • Determining the core field corresponding to each data bar according to the data content corresponding to each data bar includes: matching the core field corresponding to each data bar in a preset core field library according to the data content corresponding to that data bar.
  • The core field corresponding to each data bar is a field of that data bar whose data content is fixed.
  • The fields whose data content is fixed are the fields that do not change after logical processing during the data access process.
  • The preset core field library is a database that stores the fields that do not change after logical processing.
  • The access data of the access target database has the following structure:
  • FIG. 1a is a schematic diagram of a data reconciliation method provided in Embodiment 1 of the present application. As shown in FIG. 1a, the process of determining the core field of the data to be accessed is as follows:
  • A field whose data content is fixed for each data bar in the data to be accessed is a field that does not change after logical processing during the data access process.
  • id, code, name, weight, origin_name, origin_tel, dest_name, dest_tel, delivery_addr, send_time, receive_time, remark, create_time, and update_time are the core fields of the data content of the data bar.
  • The four fields dvtype, dvlongitude, dvlatitude, and input_time are extended fields.
  • After a data bar in the data to be accessed has undergone a series of logical processing steps during data access, its data content may be unchanged in substance while its structure has changed. Such a structural change does not affect normal use of the data bar, so fingerprint information can be generated from the core fields of the data content of the data bar to check the data content of the access data.
  • Matching is performed in the preset core field library according to the data content corresponding to each data bar, and the fields in the preset core field library that are successfully matched with the data content of a data bar are used as the core fields of the data content of the current data bar.
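  • A minimal sketch of the matching step, assuming the preset core field library can be represented as a set of field names. The library contents below are a small hypothetical subset, and `match_core_fields` is a name introduced only for this example.

```python
# Hypothetical subset of a preset core field library.
CORE_FIELD_LIBRARY = frozenset({"id", "code", "name", "weight", "remark"})


def match_core_fields(record: dict, library=CORE_FIELD_LIBRARY) -> dict:
    """Keep only the fields of a data bar that match the core field library."""
    # Sort the keys so the result does not depend on the field order
    # of the incoming data bar.
    return {key: record[key] for key in sorted(record) if key in library}


source_bar = {"id": 7, "code": "P-7", "name": "parcel", "weight": 1.2}
# After access, extended fields have been added and the field order changed.
stored_bar = {"dvtype": "gps", "weight": 1.2, "name": "parcel",
              "code": "P-7", "id": 7, "input_time": "2020-09-15"}
```

  • Here `match_core_fields(source_bar)` and `match_core_fields(stored_bar)` yield the same mapping, which is what allows a fingerprint built from core fields to survive structural changes to the data bar.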
  • Step S130: Access the data to be accessed into a target database to obtain access data.
  • The target database is the database that stores the access data after the data to be accessed has been accessed.
  • The access data is the data stored in the target database after being accessed.
  • During access, the data to be accessed undergoes a series of logical processing steps, such as data discarding, field filling, conversion, and deduplication.
  • The data to be accessed is stored in the target database after this logical processing, and the data stored in the target database is used as the access data.
  • Step S140: Perform second accounting on the access data to obtain second accounting content.
  • After being accessed, the data to be accessed is stored in the target database; the data stored in the target database is used as the access data, and the second accounting is performed on it.
  • The second accounting records the data volume and data content of the access data.
  • The second accounting content is the data volume and data content information obtained by performing the second accounting on the access data.
  • The second accounting is performed on the obtained access data.
  • The second accounting records the access data so that, after the access action is completed, the access data can be checked against the data to be accessed to judge whether the content of the data bars and data fields has changed before and after access.
  • Step S150: When the first accounting content and the second accounting content are the same, determine that the reconciliation is successful, wherein the first accounting content includes the number of data bars and the fingerprint information of the data to be accessed, and the second accounting content includes the number of data bars and the fingerprint information of the access data.
  • Determining whether the first accounting content and the second accounting content are the same means checking whether the number of data bars and the fingerprint information of the data to be accessed in the first accounting content are respectively consistent with the number of data bars and the fingerprint information of the access data in the second accounting content. The number of data bars and the fingerprint information of the data content of each data bar are used to check whether the data to be accessed is consistent with the access data. If they are consistent, the data before and after access is consistent, and the reconciliation is successful.
  • The first accounting content further includes: the identifier of each data bar of the data to be accessed.
  • The second accounting content further includes: the identifier of each data bar of the access data.
  • Determining that the reconciliation is successful includes: acquiring the identifier of each data bar of the data to be accessed in the first accounting content and the identifier of each data bar of the access data in the second accounting content; and determining that the reconciliation is successful if the access data contains a data bar with the same identifier as each data bar of the data to be accessed, the fingerprint information of data bars with the same identifier in the data to be accessed and in the access data is the same, and the number of data bars in the first accounting content is the same as the number of data bars in the second accounting content.
  • FIG. 1b is a schematic diagram of a data reconciliation method provided in Embodiment 1 of the present application. As shown in FIG. 1b, the data reconciliation process is as follows:
  • The first accounting content obtained in accounting module 1 is stored in the reconciliation module. After the first accounting content is obtained, the data to be accessed is stored in the target database through the data access service module. Taking the stored data as the access data, the second accounting is performed in accounting module 2, and the resulting second accounting content is stored in the reconciliation module. The reconciliation module then checks the first accounting content against the second accounting content, that is, whether the number of data bars and the fingerprint information of each data bar in the first accounting content are respectively consistent with the number of data bars and the fingerprint information of each data bar in the second accounting content. If they are consistent, the reconciliation is successful and the reconciliation is completed.
  • First, it is checked whether the number of data bars of the first accounting content is consistent with the number of data bars of the second accounting content. If the numbers are the same, then, because the unique identifier of each data bar does not change before and after data access, the unique identifiers of all data bars in the first accounting content and the second accounting content are matched one to one, and it is checked whether the fingerprint information of data bars with the same unique identifier is consistent. If the unique identifier of every data bar in the first accounting content can be found in the second accounting content, and the fingerprint information of data bars with the same unique identifier is consistent, the reconciliation is successful.
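  • The check described above can be sketched as follows, assuming each accounting content is held as a dictionary with a `count` entry and a `fingerprints` mapping from unique identifier to fingerprint. This layout and the function name `reconcile` are assumptions for illustration.

```python
def reconcile(first: dict, second: dict) -> bool:
    """Return True when the two accounting contents agree."""
    # Compare the number of data bars first; it is the cheapest check.
    if first["count"] != second["count"]:
        return False
    first_fp = first["fingerprints"]
    second_fp = second["fingerprints"]
    # Every identifier from before access must exist after access
    # and carry the same fingerprint.
    return all(
        bar_id in second_fp and second_fp[bar_id] == fp
        for bar_id, fp in first_fp.items()
    )
```

  • Matching by identifier rather than by position means the order in which data bars land in the target database does not affect the result.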
  • The data reconciliation method further includes: when the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, performing third accounting on the access data after a preset time has elapsed since the accounting time of the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content are different, adding 1 to the number of reconciliation failures; and repeatedly performing the next accounting on the access data after the preset time has elapsed since the accounting time of the previous accounting, to obtain the next accounting content, and adding 1 to the number of reconciliation failures when the first accounting content and the next accounting content are different, until the number of reconciliation failures is greater than or equal to the preset number of reconciliation failures, at which point it is determined that the reconciliation fails.
  • The preset time is the interval after which the access data is obtained again through a refresh.
  • The third accounting records the data volume and data content of the access data re-obtained by a refresh after the preset time has elapsed since the accounting time of the second accounting.
  • The third accounting content is the data volume and data content information obtained by performing the third accounting on the access data obtained after the refresh.
  • The preset number of reconciliation failures is the maximum number of reconciliation attempts allowed during the reconciliation process.
  • The number of reconciliation failures is the cumulative number of reconciliation failures during the reconciliation process.
  • If the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, data bars may have been lost during access, or data access may be incomplete.
  • For example, if the number of data bars in the first accounting content is 10,000 and the number of data bars in the second accounting content is 9,000, the remaining 1,000 data bars may still be in the process of being accessed or may have been lost during access.
  • The access data therefore needs to be refreshed after the preset time has elapsed since the accounting time of the second accounting, and the third accounting is performed on the refreshed access data to obtain the third accounting content.
  • If the first accounting content and the third accounting content are different, the number of reconciliation failures is incremented by 1. Waiting for the preset time before refreshing reduces the chance that data access is still incomplete after the refresh. Accumulating the number of reconciliation failures and comparing it with the preset number of reconciliation failures prevents the reconciliation time from being prolonged by too many refreshes.
  • After the access data is obtained again, the third accounting is performed on it to obtain the third accounting content, and the first accounting content is checked against the third accounting content, which avoids judging the data to be accessed and the access data inconsistent merely because data access was incomplete.
  • Exemplarily, the number of data bars in the first accounting content is 10,000 and the number of data bars in the second accounting content is 9,000, so the remaining 1,000 data bars may still be being accessed.
  • The fourth accounting is performed, and the number of data bars in the fourth accounting content is checked against the number of data bars in the first accounting content. If the number of data bars in the first accounting content is greater than the number of data bars in the fourth accounting content, the number of reconciliation failures is incremented by 1, and it is determined whether the number of reconciliation failures in this reconciliation is equal to the preset number of reconciliation failures. If it is, the current reconciliation fails and the reconciliation is stopped.
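  • The retry scheme bounded by a failure count can be sketched as follows. It assumes the caller has already found a shortfall in the second accounting, that an accounting content compares equal to another when counts and fingerprints match, and that `take_accounting` is a caller-supplied function that refreshes the access data and returns a new accounting content. All names are hypothetical.

```python
import time


def reconcile_with_retries(first, take_accounting, preset_time,
                           max_failures, sleep=time.sleep):
    """Re-run accounting until it matches `first` or failures reach the limit."""
    failures = 0
    while failures < max_failures:
        # Wait the preset time so in-flight data bars can finish arriving.
        sleep(preset_time)
        latest = take_accounting()
        if latest == first:
            return True   # reconciliation succeeded
        failures += 1     # one more failed reconciliation attempt
    return False          # preset number of failures reached
```

  • Passing `sleep` as a parameter is a design choice for the sketch: it lets the waiting be replaced by a no-op when the logic is exercised in isolation.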
  • The data reconciliation method further includes: when the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, performing third accounting on the access data after a preset time has elapsed since the accounting time of the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content are different, recording the reconciliation time according to the preset time; and repeatedly performing the next accounting on the access data after the preset time has elapsed since the accounting time of the previous accounting, to obtain the next accounting content, and recording the reconciliation time according to the preset time when the first accounting content and the next accounting content are different, until the reconciliation time is greater than or equal to the preset reconciliation time, at which point it is determined that the reconciliation fails.
  • The reconciliation time is the accumulated reconciliation time in the reconciliation process.
  • The preset reconciliation time is a reconciliation duration preset according to the access task. If the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, data bars may have been lost during access, or data access may be incomplete. Waiting for the preset time and reconciling again after refreshing the access data increases the overall reconciliation time, and multiple reconciliation attempts increase it greatly. Setting a preset reconciliation time avoids a large increase in reconciliation time caused by multiple reconciliation attempts when the volume of the data to be accessed and of the access data is large.
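  • The time-bounded variant can be sketched in the same style. The sketch assumes the reconciliation time is accumulated in steps of the preset time, as the text describes, and that `take_accounting` is a caller-supplied function returning the latest accounting content. The names are hypothetical.

```python
import time


def reconcile_with_time_budget(first, take_accounting, preset_time,
                               preset_reconciliation_time, sleep=time.sleep):
    """Re-run accounting until it matches `first` or the time budget is spent."""
    reconciliation_time = 0
    while reconciliation_time < preset_reconciliation_time:
        sleep(preset_time)
        if take_accounting() == first:
            return True
        # Record the reconciliation time according to the preset time.
        reconciliation_time += preset_time
    return False
```

  • With a preset time of 5 and a preset reconciliation time of 15 (in the same unit), this sketch makes at most three attempts before reporting failure.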
  • The data reconciliation method further includes: when the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, and comparing the identifier of each data bar of the data to be accessed in the first accounting content with the identifier of each data bar of the access data in the second accounting content shows that an intermediate data bar is missing, determining that the reconciliation fails.
  • If the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, data bars may have been lost during access, or data access may be incomplete.
  • The unique identifier of each data bar in the first accounting content is compared with the unique identifier of each data bar in the second accounting content. If the second accounting content contains fewer unique identifiers than the first accounting content, and the unique identifier of a data bar missing from the second accounting content lies in the middle of the unique identifiers of all data bars in the first accounting content, then the second accounting content is missing an intermediate data bar compared with the first accounting content.
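  • One possible reading of the intermediate data bar check is sketched below. It assumes that identifiers are orderable and that data bars arrive roughly in identifier order, so a missing identifier smaller than the largest identifier already stored indicates a lost data bar rather than one still in flight. The text does not spell this out, so both the interpretation and the function name are assumptions.

```python
def has_missing_intermediate(first_ids, second_ids) -> bool:
    """Return True if a data bar is missing from the middle of the stored range."""
    present = set(second_ids)
    missing = [bar_id for bar_id in first_ids if bar_id not in present]
    if not missing or not present:
        return False
    # Largest identifier that has already reached the target database.
    high_water_mark = max(present)
    # A missing identifier below that mark cannot still be "on its way".
    return any(bar_id < high_water_mark for bar_id in missing)
```

  • Under this reading, a shortfall at the tail leads to a retry, while a gap in the middle fails the reconciliation immediately without waiting.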
  • It is checked whether the number of data bars of the first accounting content is consistent with the number of data bars of the second accounting content. If the numbers are the same, then, because the unique identifier of each data bar does not change before and after access, the unique identifiers of all data bars in the first accounting content and the second accounting content are matched one to one, and the fingerprint information of data bars with the same unique identifier in the first accounting content and the second accounting content is checked. If the fingerprint information is consistent, the reconciliation is successful.
  • In this embodiment, the first accounting is performed on the data to be accessed to obtain the first accounting content; the data to be accessed is then accessed into the target database to obtain the access data; the second accounting is performed on the access data to obtain the second accounting content; and the first accounting content is checked against the second accounting content, that is, the number of data bars and fingerprint information of the data to be accessed are checked against the number of data bars and fingerprint information of the access data, so that the data before and after data access can be reconciled.
  • FIG. 2 is a structural diagram of a data reconciliation device provided in Embodiment 2 of the present application.
  • The data reconciliation device includes: a data acquisition module 21, configured to acquire the data to be accessed in the data source; a first accounting module 22, configured to perform first accounting on the data to be accessed to obtain first accounting content; a data access module 23, configured to access the data to be accessed into the target database to obtain access data; a second accounting module 24, configured to perform second accounting on the access data to obtain second accounting content; and a reconciliation module 25, configured to determine that the reconciliation is successful when the first accounting content and the second accounting content are the same; wherein the first accounting content includes the number of data bars and the fingerprint information of the data to be accessed, and the second accounting content includes the number of data bars and the fingerprint information of the access data.
  • The first accounting module 22 is configured to: obtain the number of data bars of the data to be accessed; determine the core field corresponding to each data bar according to the data content corresponding to that data bar; and use a message digest algorithm to determine the fingerprint information of the core field corresponding to each data bar.
  • The first accounting module 22 is configured to determine the core field corresponding to each data bar according to the data content corresponding to each data bar in the following manner: matching the core field corresponding to each data bar in the preset core field library according to the data content corresponding to that data bar.
  • The first accounting content further includes: an identifier of each data bar of the data to be accessed.
  • The second accounting content further includes: an identifier of each data bar of the access data.
  • The reconciliation module 25 is configured to: obtain the identifier of each data bar of the data to be accessed in the first accounting content and the identifier of each data bar of the access data in the second accounting content; and determine that the reconciliation is successful if the access data contains a data bar with the same identifier as each data bar of the data to be accessed, the fingerprint information of data bars with the same identifier in the data to be accessed and in the access data is the same, and the number of data bars in the first accounting content is the same as the number of data bars in the second accounting content.
  • The reconciliation module 25 is further configured to: when the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, perform third accounting on the access data after a preset time has elapsed since the accounting time of the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content are different, add 1 to the number of reconciliation failures; and repeatedly perform the next accounting on the access data after the preset time has elapsed since the accounting time of the previous accounting, to obtain the next accounting content, and add 1 to the number of reconciliation failures when the first accounting content and the next accounting content are different, until the number of reconciliation failures is greater than or equal to the preset number of reconciliation failures, at which point it is determined that the reconciliation fails.
  • The reconciliation module 25 is further configured to: when the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, perform third accounting on the access data after a preset time has elapsed since the accounting time of the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content are different, record the reconciliation time according to the preset time; and repeatedly perform the next accounting on the access data after the preset time has elapsed since the accounting time of the previous accounting, to obtain the next accounting content, and record the reconciliation time according to the preset time when the first accounting content and the next accounting content are different, until the reconciliation time is greater than or equal to the preset reconciliation time, at which point it is determined that the reconciliation fails.
  • The reconciliation module 25 is further configured to: when the number of data bars in the first accounting content is greater than the number of data bars in the second accounting content, and comparing the identifier of each data bar of the data to be accessed in the first accounting content with the identifier of each data bar of the access data in the second accounting content shows that an intermediate data bar is missing, determine that the reconciliation fails.
  • The data reconciliation device provided by this embodiment of the present application can execute the data reconciliation method provided by any embodiment of the present application, and has the corresponding functional modules and effects for executing the method.
  • FIG. 3 is a schematic structural diagram of a computer device according to Embodiment 3 of the present application.
  • Figure 3 shows a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present application.
  • the computer device 12 shown in FIG. 3 is only an example, and should not impose any limitations on the functions and scope of use of the embodiments of the present application.
  • computer device 12 takes the form of a general-purpose computing device.
  • Components of computer device 12 may include, but are not limited to, one or more processors or processing units 16 , system memory 28 , and a bus 18 connecting various system components including system memory 28 and processing unit 16 .
  • Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a graphics acceleration port, a processor, or a local bus using any of a variety of bus structures.
  • these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 12 includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 12, including both volatile and nonvolatile media, removable and non-removable media.
  • System memory 28 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) 30 and/or cache memory 32 .
  • Computer device 12 may include other removable/non-removable, volatile/non-volatile computer system storage media.
  • storage system 34 may be used to read and write to non-removable, non-volatile magnetic media (not shown in FIG. 3, commonly referred to as a "hard disk drive").
  • a magnetic disk drive may be provided for reading and writing to removable non-volatile magnetic disks, such as "floppy disks", and an optical disk drive may be provided for reading and writing to removable non-volatile optical disks, such as Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media.
  • each drive may be connected to bus 18 through one or more data media interfaces.
  • the memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the embodiments of the present application.
  • a program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each or a combination of these examples may include an implementation of a network environment.
  • Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
  • Computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any device (e.g., network card, modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22.
  • the computer device 12 may also communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through the network adapter 20. As shown in FIG. 3 , network adapter 20 communicates with other modules of computer device 12 via bus 18 .
  • the processing unit 16 executes a variety of functional applications and data processing by running the program stored in the system memory 28, for example, implementing the data reconciliation method provided by the embodiments of the present application. The method includes: acquiring the data to be accessed in the data source; performing a first accounting on the data to be accessed to obtain first accounting content; accessing the data to be accessed to the target database to obtain access data; performing a second accounting on the access data to obtain second accounting content; and if the first accounting content and the second accounting content are the same, determining that the reconciliation is successful; wherein the first accounting content includes: the number of data bars and fingerprint information of the data to be accessed, and the second accounting content includes: the number of data bars and fingerprint information of the access data.
  • Embodiment 4 of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, is used to execute a data reconciliation method.
  • the method includes: acquiring the data to be accessed in a data source; performing a first accounting on the data to be accessed to obtain first accounting content; accessing the data to be accessed to the target database to obtain access data; performing a second accounting on the access data to obtain second accounting content; and if the first accounting content and the second accounting content are the same, determining that the reconciliation is successful; wherein the first accounting content includes: the number of data bars and fingerprint information of the data to be accessed, and the second accounting content includes: the number of data bars and fingerprint information of the access data.
  • the computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. Examples (non-exhaustive list) of computer-readable storage media include: electrical connections having one or more wires, portable computer disks, hard disks, RAM, Read-Only Memory (ROM), erasable Erasable Programmable Read Only Memory (EPROM) or flash memory, optical fiber, CD-ROM, optical storage device, magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal in baseband or as part of a carrier wave, with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device .
  • Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a LAN or WAN, or may be connected to an external computer, such as through the Internet using an Internet service provider.
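
The count-based retry behaviour of the reconciliation module described earlier in this list (repeat the accounting after a preset interval and count each mismatch until a preset number of failures is reached) could be sketched roughly as follows. This is only an illustration under assumptions: the function name `reconcile_with_retries`, the `take_accounting` callback, and all parameter names are hypothetical and do not appear in the application.

```python
import time
from typing import Callable


def reconcile_with_retries(
    first_content: object,
    take_accounting: Callable[[], object],
    interval_s: float,
    max_failures: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-run the accounting on the access data until it matches or gives up.

    first_content   -- accounting content recorded for the data to be accessed
    take_accounting -- hypothetical callback returning the latest accounting
                       content of the access data
    interval_s      -- preset time between two consecutive accountings
    max_failures    -- preset number of reconciliation failures
    """
    failures = 0
    while failures < max_failures:
        sleep(interval_s)                 # wait the preset time
        if take_accounting() == first_content:
            return True                   # contents match: reconciliation succeeds
        failures += 1                     # mismatch: add 1 to the failure count
    return False                          # failure count reached the preset limit
```

The time-based variant described in the same list would accumulate the preset interval on each mismatch and stop once the accumulated reconciliation time reaches a preset reconciliation time, instead of counting failures.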

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data reconciliation method and apparatus, a device, and a storage medium. The data reconciliation method comprises: acquiring, from a data source, data to be linked (S110); performing a first accounting on the data to be linked to obtain first accounting content (S120); linking to a target database the data to be linked to obtain linked data (S130); performing a second accounting on the linked data to obtain second accounting content (S140); and if the first accounting content and the second accounting content are the same, determining that the reconciliation is successful (S150), the first accounting content comprising: the number of data pieces of the data to be linked and fingerprint information, and the second accounting content comprising: the number of the data pieces of the linked data and fingerprint information.

Description

Data reconciliation method, apparatus, device, and storage medium
This application claims priority to Chinese patent application No. 202011052704.5, filed with the China Patent Office on September 29, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer information technology, for example, to a data reconciliation method, apparatus, device, and storage medium.
Background
Data access is an indispensable step in the use of big data systems. On the way from data access to data storage, the data from a data source undergoes a series of logical processing operations, such as data discarding, field filling, conversion, and deduplication. During this processing, data may become garbled, fail to parse, or contain errors, so that the data in the data source before access is inconsistent with the data after it has been stored.
Summary
The present application provides a data reconciliation method, apparatus, device, and storage medium, so as to achieve the technical effect of accurately reconciling data before and after data access.
The present application provides a data reconciliation method, which includes:
acquiring data to be accessed from a data source;
performing a first accounting on the data to be accessed to obtain first accounting content;
accessing the data to be accessed to a target database to obtain access data;
performing a second accounting on the access data to obtain second accounting content;
determining that the reconciliation is successful in the case that the first accounting content and the second accounting content are the same;
wherein the first accounting content includes: the number of data bars and fingerprint information of the data to be accessed, and the second accounting content includes: the number of data bars and fingerprint information of the access data.
The present application also provides a data reconciliation apparatus, which includes:
a data acquisition module, configured to acquire data to be accessed from a data source;
a first accounting module, configured to perform a first accounting on the data to be accessed to obtain first accounting content;
a data access module, configured to access the data to be accessed to a target database to obtain access data;
a second accounting module, configured to perform a second accounting on the access data to obtain second accounting content;
a reconciliation module, configured to determine that the reconciliation is successful in the case that the first accounting content and the second accounting content are the same;
wherein the first accounting content includes: the number of data bars and fingerprint information of the data to be accessed, and the second accounting content includes: the number of data bars and fingerprint information of the access data.
The present application also provides a computer device, wherein the computer device includes:
at least one processor;
a storage device, configured to store at least one program;
when the at least one program is executed by the at least one processor, the at least one processor implements the above data reconciliation method.
The present application also provides a computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the above data reconciliation method.
Brief Description of the Drawings
FIG. 1 is a flowchart of a data reconciliation method provided in Embodiment 1 of the present application;
FIG. 1a is a schematic diagram of a data reconciliation method provided in Embodiment 1 of the present application;
FIG. 1b is a schematic diagram of the principle of a data reconciliation method provided in Embodiment 1 of the present application;
FIG. 2 is a structural diagram of a data reconciliation apparatus provided in Embodiment 2 of the present application;
FIG. 3 is a structural diagram of a computer device provided in Embodiment 3 of the present application.
Detailed Description
The present application is described below with reference to the accompanying drawings and embodiments.
The steps described in the method embodiments of the present application may be performed in a different order and/or in parallel.
Terms such as "first" and "second" in this application are used only to distinguish different apparatuses, modules, or units, and are not intended to limit the order of, or the interdependence between, the functions performed by these apparatuses, modules, or units.
Embodiment 1
FIG. 1 is a flowchart of a data reconciliation method provided in Embodiment 1 of the present application. This embodiment is applicable to scenarios that require data access. The method can be executed by a data reconciliation apparatus, which can be implemented in software and/or hardware and can be integrated into an electronic device that has storage and computing capabilities and can perform text processing. As shown in FIG. 1, the technical solution provided by this embodiment includes the following steps:
Step S110: acquire the data to be accessed from the data source.
In the embodiment of the present application, the data source is the origin from which the data to be accessed is acquired, that is, the location where the required data to be accessed is stored. The data source may be a collection of the data to be accessed for multiple access tasks, or may be the source end, a database, a file system, or the like of the data to be accessed. The data structure of each data bar in the data source contains a unique identifier of that data bar, and this unique identifier does not change during the data access process. The data to be accessed is the data that needs to be accessed, and it corresponds to an access task.
In the embodiment of the present application, acquiring the data to be accessed from the data source may consist of determining, according to the access task, which data needs to be accessed (the data to be accessed) and then acquiring that data from the data source. An access task is a specified event that requires data access. For example, if an access task started at a set time is to access the data entered into the file system yesterday, then the data entered into the file system yesterday is determined to be the data to be accessed and is acquired from the file system. Concretely, if the access task for yesterday's file system entries is started at the set time of 0:00 on September 16, 2020, the task is to access the data entered into the file system on September 15, 2020, and that data is acquired from the file system.
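The date arithmetic in the example above (a task started at 0:00 on September 16, 2020 selects the data entered on September 15, 2020) can be illustrated with a small sketch. The helper name `previous_day_window` is hypothetical, and the half-open window convention is an assumption made for illustration; the application does not specify how the selection is implemented.

```python
from datetime import datetime, timedelta
from typing import Tuple


def previous_day_window(task_start: datetime) -> Tuple[datetime, datetime]:
    """Return the half-open window [start, end) covering the calendar day
    before the day on which the access task starts."""
    day_end = task_start.replace(hour=0, minute=0, second=0, microsecond=0)
    day_start = day_end - timedelta(days=1)
    return day_start, day_end


# Task started at the set time 0:00 on September 16, 2020.
start, end = previous_day_window(datetime(2020, 9, 16, 0, 0))
# Records whose entry time t satisfies start <= t < end are the data to be accessed.
```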
Step S120: perform a first accounting on the data to be accessed to obtain first accounting content.
In the embodiment of the present application, after the data to be accessed is acquired from the data source, a first accounting is performed on it. The first accounting records the data volume and data content of the data to be accessed. The first accounting content is the data volume and data content information of the data to be accessed obtained from the first accounting.
The first accounting records the data to be accessed as the original data, so that after the access action is completed the access data can be checked against the data to be accessed to judge whether the data bars and the content of the data fields have changed during access.
Performing the first accounting on the data to be accessed to obtain the first accounting content includes: acquiring the number of data bars of the data to be accessed; determining the core field corresponding to each data bar according to the data content corresponding to each data bar; and determining the fingerprint information of the core field corresponding to each data bar by using a message digest algorithm.
In the embodiment of the present application, data in the data source is stored sequentially by data bar. The number of data bars of the data to be accessed acquired from the data source reflects the data volume of the data to be accessed, and the fingerprint information corresponding to the data bars reflects its data content. The core fields corresponding to each data bar are the fields of that data bar whose data content remains fixed.
In the embodiment of the present application, the fingerprint information corresponding to each data bar is generated by a message digest algorithm from the core fields corresponding to that data bar. For example, Message-Digest algorithm 5 (MD5) is used to generate an MD5 value from the core fields corresponding to each data bar, and the fingerprint information generated for each data bar includes the unique identifier of the data bar and the MD5 value corresponding to the data bar.
In the embodiment of the present application, the first accounting content of the data to be accessed includes a data bar count ledger and a data fingerprint information record table. The data bar count ledger records the number of data bars of the data to be accessed, and the data fingerprint information record table records the fingerprint information corresponding to each data bar of the data to be accessed. The fingerprint information generated for each data bar contains the unique identifier of that data bar because a single MD5 value produced by the MD5 algorithm may correspond to multiple data bars; combining the unique identifier of each data bar with its MD5 value makes the fingerprint information unique.
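As a rough illustration of the fingerprint described above (the unique identifier of the data bar combined with an MD5 value computed over its core fields), a minimal sketch using Python's standard `hashlib` might look as follows. The function name `fingerprint`, the dictionary representation of a data bar, and the way field values are joined before hashing are all assumptions made for illustration; the application does not specify them.

```python
import hashlib
from typing import List, Tuple


def fingerprint(record: dict, core_fields: List[str]) -> Tuple[object, str]:
    """Return (record ID, MD5 hex digest over the record's core-field values)."""
    # Assumption: values are joined in the given field order with a separator
    # that is unlikely to appear in the data (ASCII unit separator).
    payload = "\x1f".join(str(record[name]) for name in core_fields)
    digest = hashlib.md5(payload.encode("utf-8")).hexdigest()
    return record["id"], digest


# Fields outside the core set (such as input_time) do not affect the result.
before = {"id": 1, "code": "A1", "name": "box"}
after = {"id": 1, "code": "A1", "name": "box", "input_time": "2020-09-16"}
same = fingerprint(before, ["code", "name"]) == fingerprint(after, ["code", "name"])
```

Because the digest is computed only over core fields, fields added or rewritten during access do not change the fingerprint, which is what allows the comparison before and after access.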
Exemplarily, the first accounting content obtained by performing the first accounting on the data to be accessed includes a data bar count ledger, a sample of which is shown in Table 1 below:
Table 1
Figure PCTCN2021106203-appb-000001
In Table 1, the data bar count ledger of the first accounting content of the data to be accessed includes: accounting date, number of data bars, and accounting type. The accounting date is the date on which the data to be accessed was entered into the data source, and the accounting type indicates the source of the data. For example, in Table 1, accounting type "1" means that the data comes from the data source, and accounting type "2" means that the data comes from the access database.
Exemplarily, the fingerprint information corresponding to the data bars of the data to be accessed is recorded, and the data fingerprint information record table is shown in Table 2 below:
Table 2
Figure PCTCN2021106203-appb-000002
In Table 2, the data fingerprint information record table of the first accounting content of the data to be accessed includes: date, data bar ID, accounting type, and data fingerprint information. The date is the date on which the fingerprint information was generated for the data to be accessed; the data bar ID is the ID of each data bar of the data to be accessed, and it does not change during the data access process; the accounting type indicates the source of the data; and the data fingerprint information is the fingerprint information generated by the message digest algorithm from the core fields corresponding to each data bar. The data bar IDs "1" and "2" in Table 2 are not actual data bar IDs; they only illustrate that the unique identifier of each data bar does not change before and after data access. In Table 2, accounting type "1" means that the data comes from the data source, and accounting type "2" means that the data comes from the access database.
Determining the core field corresponding to each data bar according to the data content corresponding to each data bar includes: matching the core field corresponding to each data bar in a preset core field library according to the data content corresponding to each data bar.
In the embodiment of the present application, the core fields corresponding to each data bar are the fields of that data bar whose data content remains fixed, that is, fields that do not change after the logical processing performed during data access. The preset core field library is a database that stores the fields that do not change after logical processing.
Exemplarily, assume that the data to be accessed in the data source has the following structure:
Table name: t_express_info (express information table);
Fields:
id,code,name,weight,origin_name,origin_tel,dest_name,dest_tel,delivery_addr,send_time,receive_time,remark,create_time,update_time;
The access data stored in the target database has the following structure:
Table name: t_delivery (delivery management);
Fields:
id,dvcode,dvname,dvweight,dvoriginname,dvoriginphone,dvmanname,dvmanphone,dvaddress,dvmantime,dvreceivetime,dvremark,dvcreatetime,dvupdatetime,dvtype,dvlongitude,dvlatitude,input_time;
Exemplarily, FIG. 1a is a schematic diagram of a data reconciliation method provided in Embodiment 1 of the present application. As shown in FIG. 1a, the core fields of the data to be accessed are determined as follows:
The fields of each data bar in the data to be accessed whose data content remains fixed are the fields that do not change after logical processing during data access. In FIG. 1a, id, code, name, weight, origin_name, origin_tel, dest_name, dest_tel, delivery_addr, send_time, receive_time, remark, create_time, and update_time are the core fields of the data content of the data bar. The four fields dvtype, dvlongitude, dvlatitude, and input_time are fields added when the data to be accessed is accessed to the target database. After a data bar in the data to be accessed has gone through the series of logical processing operations of the access process, the substance of its data content may be unchanged even though its structure has changed. Such a structural change does not affect normal use of the data bar: fingerprint information can still be generated from the core fields of its data content and used to check the data content of the access data.
In the embodiment of the present application, matching is performed in the preset core field library according to the data content corresponding to each data bar, and the fields in the preset core field library that are successfully matched with the data content of a data bar are taken as the core fields of the data content of that data bar.
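The matching against the preset core field library can be sketched as a simple membership test. The sketch below assumes the library is a set of field names taken from the t_express_info example; the names `CORE_FIELD_LIBRARY` and `match_core_fields` are hypothetical, and a real library would presumably also need to relate source field names (for example `code`) to their target counterparts (for example `dvcode`), which the application does not detail.

```python
from typing import List

# Assumed contents of the preset core field library, taken from the
# t_express_info example: fields that logical processing does not change.
CORE_FIELD_LIBRARY = {
    "id", "code", "name", "weight", "origin_name", "origin_tel",
    "dest_name", "dest_tel", "delivery_addr", "send_time",
    "receive_time", "remark", "create_time", "update_time",
}


def match_core_fields(record: dict, library: set = CORE_FIELD_LIBRARY) -> List[str]:
    """Return the record's field names found in the library, in sorted order
    so that the result is stable regardless of field order in the record."""
    return sorted(name for name in record if name in library)


record = {"id": 7, "code": "SF001", "weight": 1.2, "input_time": "2020-09-16"}
core = match_core_fields(record)  # input_time is not in the library and is dropped
```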
Step S130: access the data to be accessed into a target database to obtain access data.
In this embodiment of the present application, the target database is the database that stores the access data once the data to be accessed has been accessed. The access data is the data stored in the target database after access.
In this embodiment of the present application, the data to be accessed undergoes a series of logical processing on its way into the target database, for example data discarding, field filling, conversion, and deduplication. After this processing the data is stored in the target database, and the data stored in the target database is taken as the access data.
Step S140: perform a second accounting on the access data to obtain second accounting content.
In this embodiment of the present application, the data to be accessed is stored in the target database after access, the data stored in the target database is taken as the access data, and a second accounting is performed on the access data. The second accounting records the data volume and data content of the access data. The second accounting content is the data volume and data content information of the access data obtained by the second accounting.
In this embodiment of the present application, the second accounting is performed on the access data after it has been obtained from the target database. The second accounting records the access data as it stands, so that once the access operation is complete the access data can be checked against the data to be accessed to determine whether the data records or the contents of their fields changed during access.
Step S150: determine that reconciliation succeeds when the first accounting content and the second accounting content are the same, where the first accounting content includes the number of data records and the fingerprint information of the data to be accessed, and the second accounting content includes the number of data records and the fingerprint information of the access data.
In this embodiment of the present application, checking whether the first accounting content and the second accounting content are the same means checking whether the number of data records and the fingerprint information of the data to be accessed in the first accounting content respectively match the number of data records and the fingerprint information of the access data in the second accounting content. The data to be accessed is thus compared with the access data by the number of data records and by the fingerprint information of each record's data content. If both the record counts and the fingerprint information match, the data to be accessed and the access data are consistent, that is, the data is the same before and after access, and this reconciliation succeeds.
The first accounting content further includes an identifier of each data record of the data to be accessed, and the second accounting content further includes an identifier of each data record of the access data. Determining that reconciliation succeeds when the first accounting content and the second accounting content are the same includes: obtaining the identifier of each data record of the data to be accessed from the first accounting content and the identifier of each data record of the access data from the second accounting content; and determining that reconciliation succeeds when, for every data record of the data to be accessed, the access data contains a data record with the same identifier, the data records with the same identifier in the data to be accessed and in the access data have the same fingerprint information, and the number of data records in the first accounting content equals the number of data records in the second accounting content.
By way of example, FIG. 1b is a schematic diagram of the data reconciliation method provided in Embodiment 1 of the present application. As shown in FIG. 1b, the data reconciliation process is as follows:
The data to be accessed is determined according to the access task and obtained from the data source. Accounting module 1 performs the first accounting on the obtained data to produce the first accounting content, which is then retrieved from accounting module 1 and stored in the reconciliation module. Once the first accounting content has been obtained, the data to be accessed is stored in the target database through the data access service module. The stored data is taken as the access data, and accounting module 2 performs the second accounting on it to produce the second accounting content, which is also stored in the reconciliation module. After the second accounting content has been stored, the reconciliation module checks the first accounting content against the second: it checks whether the number of data records and the fingerprint information of each data record in the first accounting content respectively match the number of data records and the fingerprint information of each data record in the second accounting content. If they match, reconciliation succeeds and this reconciliation is complete.
In this embodiment of the present application, the number of data records in the first accounting content is first checked against the number of data records in the second accounting content. If the two counts match, the unique identifiers of all data records in the first and second accounting content are put in one-to-one correspondence, relying on the fact that the unique identifier of a data record does not change during data access, and the fingerprint information of records with the same unique identifier is compared. If the unique identifier of every data record in the first accounting content can be found in the second accounting content, and the fingerprint information of records with the same unique identifier matches, reconciliation succeeds.
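The three-part check (record count, identifier correspondence, per-record fingerprint) can be sketched as below. The `AccountingContent` container and the `reconcile` function are hypothetical names introduced for illustration; the application does not prescribe a data layout for the accounting content.

```python
from dataclasses import dataclass, field


@dataclass
class AccountingContent:
    """Hypothetical container for one accounting: record count plus id -> fingerprint."""
    count: int
    fingerprints: dict = field(default_factory=dict)


def reconcile(first: AccountingContent, second: AccountingContent) -> bool:
    """Return True only if counts match and every source record has an identical twin."""
    # 1. The number of data records must be the same before and after access.
    if first.count != second.count:
        return False
    # 2. Every identifier in the first accounting must exist in the second,
    # 3. and records with the same identifier must carry the same fingerprint.
    for record_id, fp in first.fingerprints.items():
        if second.fingerprints.get(record_id) != fp:
            return False
    return True
```

Checking the count first is a cheap way to reject obviously incomplete loads before comparing fingerprints record by record.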
The data reconciliation method further includes: when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, performing a third accounting on the access data once a preset time has elapsed since the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content differ, incrementing the number of reconciliation failures by 1; and repeating the operation of performing the next accounting on the access data once the preset time has elapsed since the previous accounting, to obtain the next accounting content, and incrementing the number of reconciliation failures by 1 when the first accounting content and the next accounting content differ, until the number of reconciliation failures is greater than or equal to a preset number of reconciliation failures, at which point reconciliation is determined to have failed.
In this embodiment of the present application, the preset time is the length of the period after which the access data is re-obtained by refreshing. The third accounting records the data volume and data content of the access data re-obtained by refreshing once the preset time has elapsed since the second accounting. The third accounting content is the data volume and data content information of the access data obtained by that third accounting. The preset number of reconciliation failures is the configured maximum number of reconciliation attempts allowed during reconciliation. The number of reconciliation failures is the count of failures accumulated during reconciliation.
In this embodiment of the present application, if the number of data records in the first accounting content is greater than the number in the second accounting content, then either data records were lost during access or the access is not yet complete. For example, if the first accounting content contains 10,000 data records and the second contains 9,000, the remaining 1,000 records may still be in the process of being accessed or may have been lost during access. If the access is incomplete, the access data must be refreshed once the preset time has elapsed since the second accounting, and a third accounting is performed on the refreshed access data to obtain the third accounting content. If the first accounting content differs from the third accounting content, the number of reconciliation failures is incremented by 1. Waiting for the preset time reduces the chance that data access is still incomplete when the system refreshes. Accumulating the number of reconciliation failures and comparing it with the preset number of reconciliation failures prevents an excessive number of refreshes from prolonging the reconciliation. Re-obtaining the access data after a refresh, performing the third accounting on it, and checking the first accounting content against the third accounting content avoids reporting an inconsistency between the data to be accessed and the access data merely because the access had not finished.
By way of example, the first accounting content contains 10,000 data records and the second contains 9,000; the remaining 1,000 may still be in the process of being accessed. After an interval of 15 minutes the access data is refreshed and a third accounting is performed. If the third accounting content still contains 9,000 data records, this attempt fails, the number of reconciliation failures is incremented by 1, and it is determined whether the number of reconciliation failures is less than the preset number of reconciliation failures. If it is, the access data is refreshed after another 15 minutes and a fourth accounting is performed, and the number of data records in the fourth accounting content is checked against the number in the first accounting content. If the number of data records in the first accounting content is greater than the number in the fourth accounting content, the number of reconciliation failures is incremented by 1 again and compared with the preset number of reconciliation failures. If the two are equal, this reconciliation fails and reconciliation stops.
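The failure-count retry loop can be sketched as follows. For brevity this sketch compares record counts only, whereas the method compares the full accounting content; `count_stored`, `max_failures`, and the injectable `sleep` parameter are hypothetical names introduced here, not part of the application.

```python
import time
from typing import Callable


def reconcile_with_retries(
    first_count: int,
    count_stored: Callable[[], int],
    max_failures: int,
    interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Re-run the accounting at a fixed interval until counts match or failures run out."""
    current = count_stored()  # second accounting
    failures = 0
    while first_count > current and failures < max_failures:
        sleep(interval_seconds)   # wait for the preset time before refreshing
        current = count_stored()  # third, fourth, ... accounting
        if current != first_count:
            failures += 1         # this attempt still does not match
    return current == first_count
```

With the numbers from the example (10,000 source records, 9,000 stored, a 15-minute interval, and a preset limit of 2 failures), the loop performs a third and a fourth accounting and then reports failure if the stored count never reaches 10,000.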
The data reconciliation method further includes: when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, performing a third accounting on the access data once a preset time has elapsed since the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content differ, recording the reconciliation time according to the preset time; and repeating the operation of performing the next accounting on the access data once the preset time has elapsed since the previous accounting, to obtain the next accounting content, and recording the reconciliation time according to the preset time when the first accounting content and the next accounting content differ, until the reconciliation time is greater than or equal to a preset reconciliation time, at which point reconciliation is determined to have failed.
In this embodiment of the present application, the reconciliation time is the time accumulated over the reconciliation process. The preset reconciliation time is a reconciliation duration preset according to the access task. If the number of data records in the first accounting content is greater than the number in the second accounting content, then either data records were lost during access or the access is not yet complete. Waiting for the preset time and reconciling again after each refresh can lengthen the overall reconciliation. When the volumes of the data to be accessed and of the access data are large, repeated reconciliation attempts could lengthen it substantially, so a preset reconciliation time is introduced to bound the total time spent.
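A time-bounded variant of the retry loop can be sketched as follows. It is again simplified to record counts, and it accumulates the reconciliation time by adding the preset interval after each unsuccessful attempt, which is one reading of "recording the reconciliation time according to the preset time". The names `count_stored`, `max_seconds`, and `sleep` are assumptions for illustration.

```python
import time
from typing import Callable


def reconcile_within_time(
    first_count: int,
    count_stored: Callable[[], int],
    interval_seconds: float,
    max_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry at a fixed interval until counts match or the time budget is used up."""
    current = count_stored()  # second accounting
    elapsed = 0.0             # accumulated reconciliation time
    while first_count > current and elapsed < max_seconds:
        sleep(interval_seconds)
        current = count_stored()  # next accounting after the refresh
        if current != first_count:
            elapsed += interval_seconds  # record time spent on the failed attempt
    return current == first_count
```

Bounding by accumulated time rather than by attempt count lets the limit be chosen per access task, for instance a longer budget for a larger load.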
The data reconciliation method further includes: determining that reconciliation fails when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, and comparing the identifier of each data record of the data to be accessed in the first accounting content with the identifier of each data record of the access data in the second accounting content shows that an intermediate data record is missing.
In this embodiment of the present application, if the number of data records in the first accounting content is greater than the number in the second accounting content, then either data records were lost during access or the access is not yet complete. The unique identifier of each data record in the first accounting content is compared with the unique identifier of each data record in the second accounting content. If the second accounting content contains fewer unique identifiers than the first, and the identifiers missing from the second accounting content lie in the middle of the sequence of unique identifiers in the first accounting content, then the second accounting content is missing intermediate data records relative to the first.
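One way to detect a missing intermediate record is sketched below. The application does not define "middle position" precisely; this sketch assumes that records are stored in source order and treats a missing record as intermediate when some later record in source order has already been stored. The function name and that interpretation are assumptions.

```python
from typing import Hashable, Iterable, Set


def has_intermediate_gap(first_ids: Iterable[Hashable], second_ids: Set[Hashable]) -> bool:
    """Return True if a missing record is followed, in source order, by a stored record."""
    seen_missing = False
    for record_id in first_ids:
        if record_id not in second_ids:
            seen_missing = True   # this source record has not reached the target
        elif seen_missing:
            return True           # a later record was stored, so the gap is not a tail
    return False
```

Under this assumption, a gap at the tail (for example only the last records missing) may simply mean access is still in progress and is worth a retry, whereas a gap followed by stored records indicates loss and can be reported as a failure immediately.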
In this embodiment of the present application, the number of data records in the first accounting content is checked against the number of data records in the second accounting content. If the two counts match, the unique identifiers of all data records in the first and second accounting content are put in one-to-one correspondence, relying on the fact that the unique identifier of a data record does not change during access, and the fingerprint information of records with the same unique identifier is compared. If the unique identifier of every data record in the first accounting content can be found in the second accounting content, and the fingerprint information of records with the same unique identifier matches, reconciliation succeeds.
In the present application, a first accounting is performed on the data to be accessed to obtain the first accounting content; the data to be accessed is then accessed into the target database to obtain the access data; a second accounting is performed on the access data to obtain the second accounting content; and the first accounting content is checked against the second accounting content. The number of data records and the fingerprint information of the data to be accessed are thereby checked against those of the access data, achieving the technical effect of accurately reconciling the data before and after access.
Embodiment 2
FIG. 2 is a structural diagram of a data reconciliation apparatus provided in Embodiment 2 of the present application. As shown in FIG. 2, the apparatus includes: a data acquisition module 21, configured to obtain the data to be accessed from a data source; a first accounting module 22, configured to perform a first accounting on the data to be accessed to obtain first accounting content; a data access module 23, configured to access the data to be accessed into a target database to obtain access data; a second accounting module 24, configured to perform a second accounting on the access data to obtain second accounting content; and a reconciliation module 25, configured to determine that reconciliation succeeds when the first accounting content and the second accounting content are the same, where the first accounting content includes the number of data records and the fingerprint information of the data to be accessed, and the second accounting content includes the number of data records and the fingerprint information of the access data.
The first accounting module 22 is configured to: obtain the number of data records in the data to be accessed; determine the core fields corresponding to each data record according to the data content of that record; and determine the fingerprint information of the core fields of each data record using a message digest algorithm.
The first accounting module 22 is configured to determine the core fields corresponding to each data record according to its data content by matching the data content of each data record against a preset core field library.
The first accounting content further includes an identifier of each data record of the data to be accessed, and the second accounting content further includes an identifier of each data record of the access data. The reconciliation module 25 is configured to: obtain the identifier of each data record of the data to be accessed from the first accounting content and the identifier of each data record of the access data from the second accounting content; and determine that reconciliation succeeds when, for every data record of the data to be accessed, the access data contains a data record with the same identifier, the data records with the same identifier in the data to be accessed and in the access data have the same fingerprint information, and the number of data records in the first accounting content equals the number of data records in the second accounting content.
The reconciliation module 25 is further configured to: when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, perform a third accounting on the access data once a preset time has elapsed since the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content differ, increment the number of reconciliation failures by 1; and repeat the operation of performing the next accounting on the access data once the preset time has elapsed since the previous accounting, to obtain the next accounting content, and incrementing the number of reconciliation failures by 1 when the first accounting content and the next accounting content differ, until the number of reconciliation failures is greater than or equal to a preset number of reconciliation failures, at which point reconciliation is determined to have failed.
The reconciliation module 25 is further configured to: when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, perform a third accounting on the access data once a preset time has elapsed since the second accounting, to obtain third accounting content; when the first accounting content and the third accounting content differ, record the reconciliation time according to the preset time; and repeat the operation of performing the next accounting on the access data once the preset time has elapsed since the previous accounting, to obtain the next accounting content, and recording the reconciliation time according to the preset time when the first accounting content and the next accounting content differ, until the reconciliation time is greater than or equal to a preset reconciliation time, at which point reconciliation is determined to have failed.
The reconciliation module 25 is further configured to determine that reconciliation fails when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, and comparing the identifier of each data record of the data to be accessed in the first accounting content with the identifier of each data record of the access data in the second accounting content shows that an intermediate data record is missing.
The data reconciliation apparatus provided in this embodiment of the present application can execute the data reconciliation method provided in any embodiment of the present application, and has the functional modules and effects corresponding to executing that method.
Embodiment 3
FIG. 3 is a schematic structural diagram of a computer device provided in Embodiment 3 of the present application. FIG. 3 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present application. The computer device 12 shown in FIG. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 3, the computer device 12 takes the form of a general-purpose computing device. Components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the different system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 12 includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The computer device 12 may include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in FIG. 3, commonly called a "hard disk drive"). Although not shown in FIG. 3, a magnetic disk drive may be provided for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical disc drive may be provided for reading from and writing to a removable non-volatile optical disc, such as a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The memory 28 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program modules 42 may be stored, for example, in the memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
计算机设备12也可以与一个或多个外部设备14(例如键盘、指向设备、显 示器24等)通信,还可与一个或者多个使得用户能与该计算机设备12交互的设备通信,和/或与使得该计算机设备12能与一个或多个其它计算设备进行通信的任何设备(例如网卡,调制解调器等等)通信。这种通信可以通过输入/输出(Input/Output,I/O)接口22进行。并且,计算机设备12还可以通过网络适配器20与一个或者多个网络,例如局域网(Local Area Network,LAN),广域网(Wide Area Network,WAN)和/或公共网络,例如因特网,通信。如图3所示,网络适配器20通过总线18与计算机设备12的其它模块通信。应当明白,尽管图3中未示出,可以结合计算机设备12使用其它硬件和/或软件模块,包括但不限于:微代码、设备驱动器、冗余处理单元、外部磁盘驱动阵列、独立冗余磁盘阵列(Redundant Arrays of Independent Disks,RAID)系统、磁带驱动器以及数据备份存储系统等。 Computer device 12 may also communicate with one or more external devices 14 (eg, keyboard, pointing device, display 24, etc.), may also communicate with one or more devices that enable a user to interact with computer device 12, and/or communicate with Any device (eg, network card, modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22 . Furthermore, the computer device 12 may also communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through the network adapter 20. As shown in FIG. 3 , network adapter 20 communicates with other modules of computer device 12 via bus 18 . It should be understood that, although not shown in FIG. 3, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, redundant independent disks Array (Redundant Arrays of Independent Disks, RAID) systems, tape drives and data backup storage systems.
Processing unit 16 executes various functional applications and data processing by running programs stored in system memory 28, for example implementing the data reconciliation method provided by the embodiments of the present application. The method includes: acquiring data to be accessed from a data source; performing a first accounting on the data to be accessed to obtain first accounting content; accessing the data to be accessed into a target database to obtain accessed data; performing a second accounting on the accessed data to obtain second accounting content; and determining that reconciliation succeeds when the first accounting content and the second accounting content are the same. The first accounting content includes the number of data records of the data to be accessed and their fingerprint information, and the second accounting content includes the number of data records of the accessed data and their fingerprint information.
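The method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the applicant's implementation: the names `CORE_FIELDS`, `fingerprint`, `account`, and `reconcile` are hypothetical, records are modeled as dictionaries, the core fields are a fixed list rather than being matched from a preset core-field library, and MD5 stands in as one possible message-digest algorithm.

```python
import hashlib
from typing import Iterable, Mapping, Sequence, Tuple

# Hypothetical core fields; the application matches them from a preset library.
CORE_FIELDS: Sequence[str] = ("id", "name", "amount")


def fingerprint(record: Mapping[str, object],
                core_fields: Sequence[str] = CORE_FIELDS) -> str:
    """Digest the record's core fields in a fixed order (MD5 is an assumption)."""
    joined = "\x1f".join(f"{field}={record.get(field)}" for field in core_fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def account(records: Iterable[Mapping[str, object]]) -> Tuple[int, Tuple[str, ...]]:
    """One accounting pass: the record count and the sorted fingerprints."""
    prints = sorted(fingerprint(r) for r in records)
    return len(prints), tuple(prints)


def reconcile(source_records: Iterable[Mapping[str, object]],
              stored_records: Iterable[Mapping[str, object]]) -> bool:
    """Reconciliation succeeds when both accounting contents are identical."""
    return account(source_records) == account(stored_records)
```

Sorting the fingerprints makes the comparison independent of storage order, and hashing only the core fields means that changes to non-core fields during access (for example, field filling) do not cause a mismatch.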
Embodiment 4
Embodiment 4 of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs a data reconciliation method. The method includes: acquiring data to be accessed from a data source; performing a first accounting on the data to be accessed to obtain first accounting content; accessing the data to be accessed into a target database to obtain accessed data; performing a second accounting on the accessed data to obtain second accounting content; and determining that reconciliation succeeds when the first accounting content and the second accounting content are the same. The first accounting content includes the number of data records of the data to be accessed and their fingerprint information, and the second accounting content includes the number of data records of the accessed data and their fingerprint information.
The computer storage medium of the embodiments of the present application may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, RAM, read-only memory (ROM), erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a LAN or a WAN, or may be connected to an external computer, for example through the Internet using an Internet service provider.

Claims (10)

  1. A data reconciliation method, comprising:
    acquiring data to be accessed from a data source;
    performing a first accounting on the data to be accessed to obtain first accounting content;
    accessing the data to be accessed into a target database to obtain accessed data;
    performing a second accounting on the accessed data to obtain second accounting content; and
    determining that reconciliation succeeds when the first accounting content and the second accounting content are the same;
    wherein the first accounting content comprises the number of data records of the data to be accessed and their fingerprint information, and the second accounting content comprises the number of data records of the accessed data and their fingerprint information.
  2. The method according to claim 1, wherein performing the first accounting on the data to be accessed to obtain the first accounting content comprises:
    obtaining the number of data records of the data to be accessed;
    determining, according to the data content of each data record, the core field corresponding to that data record; and
    determining, using a message-digest algorithm, the fingerprint information of the core field corresponding to each data record.
  3. The method according to claim 2, wherein determining, according to the data content of each data record, the core field corresponding to that data record comprises:
    matching, in a preset core field library and according to the data content of each data record, the core field corresponding to that data record.
  4. The method according to claim 3, wherein the first accounting content further comprises an identifier of each data record of the data to be accessed, and the second accounting content further comprises an identifier of each data record of the accessed data;
    wherein determining that reconciliation succeeds when the first accounting content and the second accounting content are the same comprises:
    obtaining the identifier of each data record of the data to be accessed from the first accounting content and the identifier of each data record of the accessed data from the second accounting content; and
    determining that reconciliation succeeds when, for each data record of the data to be accessed, the accessed data contains a data record with the same identifier, the fingerprint information of data records with the same identifier in the data to be accessed and in the accessed data is the same, and the number of data records in the first accounting content is the same as the number of data records in the second accounting content.
  5. The method according to claim 4, further comprising:
    when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, performing a third accounting on the accessed data after a preset time has elapsed since the second accounting, to obtain third accounting content;
    when the first accounting content and the third accounting content are different, incrementing a reconciliation failure count by 1; and
    repeating the operations of performing a next accounting on the accessed data after the preset time has elapsed since the previous accounting to obtain next accounting content, and incrementing the reconciliation failure count by 1 when the first accounting content and the next accounting content are different, until the reconciliation failure count is greater than or equal to a preset reconciliation failure count, whereupon reconciliation is determined to have failed.
  6. The method according to claim 4, further comprising:
    when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, performing a third accounting on the accessed data after a preset time has elapsed since the second accounting, to obtain third accounting content;
    when the first accounting content and the third accounting content are different, recording a reconciliation time according to the preset time; and
    repeating the operations of performing a next accounting on the accessed data after the preset time has elapsed since the previous accounting to obtain next accounting content, and recording the reconciliation time according to the preset time when the first accounting content and the next accounting content are different, until the reconciliation time is greater than or equal to a preset reconciliation time, whereupon reconciliation is determined to have failed.
  7. The method according to claim 4, further comprising:
    determining that reconciliation fails when the number of data records in the first accounting content is greater than the number of data records in the second accounting content, and a comparison between the identifier of each data record of the data to be accessed in the first accounting content and the identifier of each data record of the accessed data in the second accounting content shows that an intermediate data record is missing.
  8. A data reconciliation apparatus, comprising:
    a data acquisition module, configured to acquire data to be accessed from a data source;
    a first accounting module, configured to perform a first accounting on the data to be accessed to obtain first accounting content;
    a data access module, configured to access the data to be accessed into a target database to obtain accessed data;
    a second accounting module, configured to perform a second accounting on the accessed data to obtain second accounting content; and
    a reconciliation module, configured to determine that reconciliation succeeds when the first accounting content and the second accounting content are the same;
    wherein the first accounting content comprises the number of data records of the data to be accessed and their fingerprint information, and the second accounting content comprises the number of data records of the accessed data and their fingerprint information.
  9. A computer device, comprising:
    at least one processor; and
    a storage apparatus, configured to store at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the data reconciliation method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the data reconciliation method according to any one of claims 1 to 7.
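The identifier-based comparison and the retry behavior recited in claims 4, 5, and 7 can be illustrated with a short Python sketch. It is only one possible reading, under assumptions that the claims do not state: identifiers are integers that increase in write order (so a gap below the highest stored identifier counts as a missing intermediate record), an accounting is modeled as a mapping from identifier to fingerprint, and all names (`Ledger`, `compare`, `reconcile_with_retry`, `take_account`, `max_failures`) are hypothetical.

```python
import time
from typing import Callable, Dict

Ledger = Dict[int, str]  # identifier -> fingerprint (integer, ordered IDs assumed)


def compare(first: Ledger, later: Ledger) -> str:
    """Return 'ok', 'pending' (trailing records not stored yet), or 'failed'."""
    if first == later:
        return "ok"  # same identifiers, fingerprints, and count
    if any(rid in later and later[rid] != fp for rid, fp in first.items()):
        return "failed"  # same identifier, different fingerprint
    if not set(later) <= set(first):
        return "failed"  # unexpected extra record (assumed to be a failure)
    missing = set(first) - set(later)
    if later and min(missing) < max(later):
        return "failed"  # an intermediate record is missing
    return "pending"


def reconcile_with_retry(first: Ledger,
                         take_account: Callable[[], Ledger],
                         interval_s: float = 1.0,
                         max_failures: int = 3,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Re-run the accounting at a fixed interval until success or too many failures."""
    status = compare(first, take_account())  # second accounting
    failures = 0
    while status == "pending":
        sleep(interval_s)  # wait the preset time before the next accounting
        status = compare(first, take_account())
        if status != "ok":
            failures += 1
            if failures >= max_failures:
                return False
    return status == "ok"
```

The time-based variant of claim 6 would follow the same loop but accumulate the elapsed interval and stop once it reaches a preset reconciliation time, rather than counting failures.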
PCT/CN2021/106203 2020-09-29 2021-07-14 Data reconciliation method and apparatus, device, and storage medium WO2022068316A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011052704.5 2020-09-29
CN202011052704.5A CN112162976A (en) 2020-09-29 2020-09-29 Data reconciliation method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022068316A1 true WO2022068316A1 (en) 2022-04-07

Family

ID=73860773

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/106203 WO2022068316A1 (en) 2020-09-29 2021-07-14 Data reconciliation method and apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN112162976A (en)
WO (1) WO2022068316A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117437076A (en) * 2023-12-19 2024-01-23 太平金融科技服务(上海)有限公司 Account checking method, device, equipment and medium based on account checking code

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112162976A (en) * 2020-09-29 2021-01-01 北京锐安科技有限公司 Data reconciliation method, device, equipment and storage medium
CN113377757B (en) * 2021-06-24 2023-08-25 杭州数梦工场科技有限公司 Data checking method and device, electronic equipment and machine-readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462568A (en) * 2014-12-26 2015-03-25 山东中创软件商用中间件股份有限公司 Data reconciliation method, device and system
CN107729553A (en) * 2017-11-07 2018-02-23 北京京东金融科技控股有限公司 System data account checking method and device, storage medium, electronic equipment
US20180357403A1 (en) * 2015-12-07 2018-12-13 Samsung Electronics Co., Ltd. Method, apparatus, and system for providing temporary account information
CN109711801A (en) * 2018-12-13 2019-05-03 平安普惠企业管理有限公司 A kind of Internetbank account checking method and device
CN110287200A (en) * 2019-07-02 2019-09-27 江苏满运软件科技有限公司 Account checking method, system, computer equipment and storage medium
CN112162976A (en) * 2020-09-29 2021-01-01 北京锐安科技有限公司 Data reconciliation method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10769101B2 (en) * 2018-08-23 2020-09-08 Oath Inc. Selective data migration and sharing
CN110298740A (en) * 2019-06-24 2019-10-01 深圳乐信软件技术有限公司 Data account checking method, device, equipment and storage medium
CN110443690A (en) * 2019-08-15 2019-11-12 深圳乐信软件技术有限公司 A kind of method, apparatus, server and the storage medium of variance data reconciliation
CN110765179A (en) * 2019-10-18 2020-02-07 京东数字科技控股有限公司 Distributed account checking processing method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117437076A (en) * 2023-12-19 2024-01-23 太平金融科技服务(上海)有限公司 Account checking method, device, equipment and medium based on account checking code
CN117437076B (en) * 2023-12-19 2024-03-19 太平金融科技服务(上海)有限公司 Account checking method, device, equipment and medium based on account checking code

Also Published As

Publication number Publication date
CN112162976A (en) 2021-01-01

Similar Documents

Publication Publication Date Title
WO2022068316A1 (en) Data reconciliation method and apparatus, device, and storage medium
CN109471851B (en) Data processing method, device, server and storage medium
WO2019019640A1 (en) Simulated processing method and apparatus for order information, and storage medium and computer device
WO2020134989A1 (en) Excel data import method and apparatus, and computer device and storage medium
US10089385B2 (en) Method and apparatus for asynchroinzed de-serialization of E-R model in a huge data trunk
CN109165209B (en) Data verification method, device, equipment and medium for object types in database
CN110659210A (en) Information acquisition method and device, electronic equipment and storage medium
CN111984303A (en) Transaction data processing method, device, equipment and storage medium
CN111258832B (en) Interface parameter verification method, device, equipment and medium
CN111694684B (en) Abnormal construction method and device of storage device, electronic device and storage medium
US11816163B2 (en) Systems and methods for improved transactional mainframes
CN110990346A (en) File data processing method, device, equipment and storage medium based on block chain
CN113010208A (en) Version information generation method, version information generation device, version information generation equipment and storage medium
CN113238940B (en) Interface test result comparison method, device, equipment and storage medium
CN115934537A (en) Interface test tool generation method, device, equipment, medium and product
CN114217790A (en) Interface scheduling method and device, electronic equipment and medium
CN112035159B (en) Configuration method, device, equipment and storage medium of audit model
CN111427902B (en) Metadata management method, device, equipment and medium based on lightweight database
CN109740027B (en) Data exchange method, device, server and storage medium
CN115620877A (en) Method, system, equipment and storage medium for uploading medical data to cloud platform
CN113852610A (en) Message processing method and device, computer equipment and storage medium
CN110134691B (en) Data verification method, device, equipment and medium
CN114449052B (en) Data compression method and device, electronic equipment and storage medium
CN113360454B (en) Memory snapshot file compression and decompression method and related device
CN110389862B (en) Data storage method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21873978

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21873978

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 22/09/2023)