WO2022089063A1 - Data verification method, apparatus, device, system and storage medium - Google Patents
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/51—Discovery or management thereof, e.g. service location protocol [SLP] or web services
Definitions
- the present application belongs to the field of data processing, and in particular, relates to a data verification method, apparatus, device, system and storage medium.
- business data can be obtained from the two systems involved in data verification. For example, business data within a completed day is obtained from system A and system B respectively, and the business data in system A and system B is compared one by one to see whether it is consistent, that is, whether there is a cross-system data inconsistency problem.
- this kind of data checking method cannot detect cross-system data inconsistencies in time.
- Embodiments of the present application provide a data verification method, apparatus, device, system, and storage medium, which can timely discover cross-system data inconsistency problems.
- an embodiment of the present application provides a data verification method, including: when a write operation occurs in each system data pool, generating and transmitting a data stream including the data associated with the write operation, where the data stream includes the primary key value of the data; dividing the data stream into at least one data area based on the fields of the data stream and preset area division rules, where each data area includes data streams corresponding to at least two system data pools; and, in each data area, checking the data streams corresponding to the at least two system data pools in the data area against each other according to the primary key values corresponding to the data streams, to determine whether the data of the at least two system data pools in the data area is consistent.
- an embodiment of the present application provides a data verification apparatus, including: a data stream generation module, configured to generate and transmit a data stream including the data associated with the write operation when a write operation occurs in each system data pool, where
- the data stream includes the primary key value of the data;
- an area division module, configured to divide the data stream into at least one data area based on the fields of the data stream and preset area division rules, where each data area includes data streams corresponding to at least two system data pools; and
- a checking module, configured to check, in each data area, the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, to determine whether the data of the at least two system data pools in the data area is consistent.
- an embodiment of the present application provides a data verification device, including: a processor and a memory storing computer program instructions; when the processor executes the computer program instructions, the data verification method of the first aspect is implemented.
- an embodiment of the present application provides a data verification system, including: a data stream device, configured to generate and transmit a data stream including the data associated with the write operation when a write operation occurs in each system data pool, where
- the data stream includes the primary key value of the data;
- a stream distribution device, configured to divide the data stream into at least one data area based on the fields of the data stream and preset area division rules, where each data area includes data streams corresponding to at least two system data pools; and
- a checking device, configured to check, in each data area, the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, to determine whether the data of the at least two system data pools in the data area is consistent.
- an embodiment of the present application provides a computer storage medium, where computer program instructions are stored thereon, and when the computer program instructions are executed by a processor, the data checking method of the first aspect is implemented.
- Embodiments of the present application provide a data verification method, apparatus, device, system, and storage medium, which generate a data stream including data associated with the write operation when a write operation occurs in each system data pool.
- the data stream is divided into at least one data area, and each data area includes data streams corresponding to at least two system data pools.
- the write operation is not limited by a time window, and the data can be checked in real time when it changes, so that cross-system data inconsistency problems can be found in time.
- FIG. 1 is a flowchart of an embodiment of a data verification method provided by the first aspect of the present application
- FIG. 2 is a flowchart of another embodiment of the data verification method provided by the first aspect of the present application.
- FIG. 3 is a flowchart of another embodiment of the data verification method provided by the first aspect of the present application.
- FIG. 4 is a schematic diagram of an example of a check window of a data area in an embodiment of the present application
- FIG. 5 is a flowchart of still another embodiment of the data verification method provided by the first aspect of the present application.
- FIG. 6 is a schematic structural diagram of an embodiment of the data verification apparatus provided in the second aspect of the present application.
- FIG. 7 is a schematic structural diagram of another embodiment of the data verification apparatus provided in the second aspect of the present application.
- FIG. 8 is a schematic structural diagram of another embodiment of the data verification apparatus provided by the second aspect of the present application.
- FIG. 9 is a schematic structural diagram of still another embodiment of the data verification apparatus provided by the second aspect of the present application.
- FIG. 10 is a schematic structural diagram of an embodiment of the data verification device provided by the third aspect of the application.
- FIG. 11 is a schematic structural diagram of an embodiment of the data verification system provided by the fourth aspect of the present application.
- a business may involve multiple systems, for example, a business is completed by the cooperation of multiple systems.
- the systems involved in the business each store the data of the business, and the data is checked between the multiple systems so that data inconsistencies between the systems can be found; measures can then be taken to ensure the smooth operation of each system.
- the amount of data that needs to be checked across systems is very large. Since the clocks of multiple systems may differ, in order to avoid missing data that should be checked, business data over a long period is generally obtained, such as the business data of a completed day, and the business data of the different systems is then checked one by one to determine whether there is a cross-system data inconsistency problem. In this case, however, an existing data inconsistency problem can only be found late, not in time.
- the present application provides a data verification method, apparatus, device, system and storage medium, which can transmit data in the form of a data stream (i.e., stream data) when a write operation occurs, and use the primary key value of the data in the data stream to check the data of different systems, so as to find cross-system data inconsistencies in time.
- the specific fields of business and data are not limited here.
- for a business in the transaction field, the business may be a transaction business, and the data of the business may be transaction flow data; the verification of the data is then the verification of the transaction flow data of the same transaction business, and the transaction details can be checked through data verification.
- the application scenarios of the embodiments of the present application are not limited to transaction scenarios, and other application scenarios that require data verification are also within the protection scope of the embodiments of the present application.
- a first aspect of the present application provides a data verification method, and the data verification method can be performed by a data verification apparatus, a data verification device or a data verification system; that is, the data verification method can be implemented by a single apparatus or device, or by a data verification system including multiple apparatuses or devices, which is not limited here.
- FIG. 1 is a flowchart of an embodiment of the data verification method provided by the first aspect of the present application. As shown in FIG. 1 , the data verification method may include steps S101 to S103.
- step S101 when a write operation occurs in each system data pool, a data stream including data associated with the write operation is generated and transmitted.
- the system data pool is used to store the data of the system, and specifically can be used to store the data of the business in the system.
- the system data pool can be used to store the flow data of the transaction business of the system.
- the system data pool may be set in the system, or may exist in the form of a database independently of the system, which is not limited here.
- Write operations are operations that may cause changes to data in the system data pool.
- write operations may include, but are not limited to, insert operations such as the insert operation, update operations such as the update operation, delete operations such as the delete and drop operations, create operations such as the create operation, and modification operations such as the alter operation, which are not limited here.
- the data associated with the write operation includes the data on which the write operation acts.
- a data stream is a collection of dynamic data that is not limited in time distribution and quantity.
- data streams are used to carry data.
- the data stream includes the primary key value of the data.
- the content of the primary key value of the data can be set according to the type of data, and is not limited here.
- the data includes transaction flow data, and the primary key value of the data may specifically include a transaction sequence number.
- the data stream may be transmitted inside the data verification apparatus or the data verification device.
- data streams may also be transmitted between the apparatuses or devices in the data verification system.
- step S102 the data stream is divided into at least one data area based on the fields of the data stream and the preset area division rule.
- the fields of the data stream can be set according to the content and type of the data.
- the fields of the data flow may include a system identification field, a primary key value field, a business status field, etc., which are not limited herein.
- the system ID field is used to represent the ID of the system corresponding to the system data pool.
- the primary key value field is used to characterize the primary key value of the data.
- the business status field is used to represent the status of the business corresponding to the data.
- the data streams corresponding to the data pools of each system can be divided into multiple groups, that is, divided into at least one data area, according to the area division rules.
- Each data area includes data streams corresponding to at least two system data pools.
- the data contained in the data stream can be checked in each data area.
- Each data area can correspond to the entry of the data stream, and the division of the data stream can be realized by setting the area division rules.
- the area division rules can be set according to work scenarios and work requirements, and are not limited here.
- the data area can be regarded as a data stream collection formed after the data stream is grouped. Fields of data streams in the same data area satisfy the same area division rules.
- data checking is performed between systems, that is, data checking is performed between system data pools.
- each data area may include data streams corresponding to two system data pools.
- a business involves three systems.
- the three systems are system A1, system A2, and system A3.
- the data of system A1 is stored in system data pool B1
- the data of system A2 is stored in system data pool B2
- the data of system A3 is stored in system data pool B3.
- the data of this business in system data pool B1, system data pool B2 and system data pool B3 should all change; however, the situation that the data of this business in one or more of the system data pools has not changed may also occur, which is not limited here.
- the fields of the data stream can reflect the system identification, the primary key value of the data, the business status, etc.
- the data stream corresponding to the system data pool B1 and the data stream corresponding to the system data pool B2 can be divided into the data area C1 through the area division rules.
- the data flow corresponding to the system data pool B2 and the data flow corresponding to the system data pool B3 are divided into a data area C2.
- in the data area C1, the data stream corresponding to the system data pool B1 and the data stream corresponding to the system data pool B2 can be checked against each other, and in the data area C2, the data stream corresponding to the system data pool B2 and the data stream corresponding to the system data pool B3 can be checked against each other.
- a data stream of a service corresponding to one system data pool may be divided into multiple data areas, or may be divided into one data area, which is not limited herein.
- the area division rule may define that when the value of the field D3 of the data stream is one of 0001, 0002, and 0003, the data stream is divided into the data area C3 through the entry 2008.
- the area division rule can define that when the value of the field D3 of the data stream is one of 0003 and 0004, the data stream is divided into the data area C4 through the entry 2009.
- the data stream whose field D3 value is 0003 will be divided into data area C3 and data area C4; the data stream whose field D3 value is 0001 will be divided into data area C3.
- the value of the field D3 of the data stream of the data area C3 satisfies the area division rule that the value of the field D3 of the data stream is one of 0001, 0002, and 0003.
- the value of the field D3 of the data stream of the data area C4 satisfies the area division rule that the value of the field D3 of the data stream is one of 0003 and 0004.
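The entry rules above (field D3 routing a stream into C3, C4, or both) can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation: the dictionary-based rule table and the function name are assumptions for the example.

```python
# Hypothetical sketch of area division rules: each rule maps the set of
# allowed values of field D3 to a data area (its entry). A data stream
# whose D3 value appears in several rules is divided into several data
# areas, mirroring how the value 0003 lands in both C3 and C4 above.

AREA_RULES = {
    "C3": {"0001", "0002", "0003"},  # entry for data area C3
    "C4": {"0003", "0004"},          # entry for data area C4
}

def divide_into_areas(stream: dict) -> list[str]:
    """Return the names of all data areas this data stream is divided into."""
    return [area for area, values in AREA_RULES.items()
            if stream.get("D3") in values]

# A stream whose field D3 is 0003 is divided into both C3 and C4:
print(divide_into_areas({"D3": "0003"}))  # ['C3', 'C4']
print(divide_into_areas({"D3": "0001"}))  # ['C3']
```

A real deployment would key the rules on configurable fields (system identifier, business status, etc.) rather than a single hard-coded field.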
- step S103 in each data area, according to the primary key values corresponding to the data streams, the data streams corresponding to the at least two system data pools in the data area are checked against each other to determine whether the data of the at least two system data pools in the data area is consistent.
- in each data area, the data streams corresponding to the at least two system data pools that have the same primary key value are checked against each other.
- if, in a data area, there is a data stream with a certain primary key value corresponding to one system data pool, but no data stream with that primary key value corresponding to another system data pool, it is determined that the data of the at least two system data pools is inconsistent, that is, a cross-system data inconsistency problem has occurred.
- if the data streams with the same primary key value exist for each system data pool and the data they carry is consistent, it is determined that the data of the at least two system data pools is consistent, that is, there is no cross-system data inconsistency; otherwise, it is determined that the data of the at least two system data pools is inconsistent, that is, a cross-system data inconsistency problem has occurred.
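The per-area check just described can be sketched in a few lines. This is an illustrative sketch only: the pool contents, the `amount` field and the function name are invented for the example, and each pool is simplified to a mapping from primary key value to the data carried by its stream.

```python
# Illustrative check of two system data pools within one data area.
# A primary key value present in one pool but missing from the other,
# or carrying different data, signals a cross-system data inconsistency.

def check_area(pool_a: dict, pool_b: dict) -> list[str]:
    """Return the primary key values whose data is inconsistent."""
    inconsistent = []
    for key in pool_a.keys() | pool_b.keys():       # union of primary keys
        if pool_a.get(key) != pool_b.get(key):      # missing or different
            inconsistent.append(key)
    return sorted(inconsistent)

b1 = {"000792": {"amount": 100}, "000982": {"amount": 55}}
b2 = {"000792": {"amount": 100}}  # the 000982 stream never reached B2

print(check_area(b1, b2))  # ['000982']
```

An empty result means the data of the two system data pools in this area is consistent.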
- the checking of data streams in multiple data areas may be performed in parallel.
- for example, there are three data areas, namely data area C1, data area C2 and data area C3.
- the checking of the data streams in data area C1, the checking of the data streams in data area C2, and the checking of the data streams in data area C3 may be performed in parallel.
- performing the checking of data streams in multiple data areas in parallel can speed up data checking and improve data checking efficiency.
- the checking of data streams in different data areas can be performed by different apparatuses, devices or modules, which is not limited herein.
- the data area can be increased or decreased according to specific needs, which improves the flexibility and scalability of data checking.
- the checking of the data streams in each data area may be performed in the memory, so as to further improve the speed of data checking, improve the efficiency of data checking, and reduce the resources occupied by the data checking.
- when a write operation occurs in each system data pool, a data stream including the data associated with the write operation is generated.
- the data stream is divided into at least one data area, and each data area includes data streams corresponding to at least two system data pools.
- in each data area, the data streams corresponding to the at least two system data pools in the data area are checked to determine whether the data in the at least two system data pools is consistent.
- the write operation is not limited by a time window, and the data can be checked in real time when it changes, so that cross-system data inconsistency problems can be found in time.
- the data checking method provided by the embodiments of the present application can shorten the time required to discover cross-system data inconsistencies to one minute or even less.
- FIG. 2 is a flowchart of another embodiment of the data verification method provided by the first aspect of the present application.
- step S101 in FIG. 1 may be refined into steps S1011 to S1013 in FIG. 2.
- the data verification method shown in FIG. 2 may further include step S104 .
- step S1011 the binary log of each system data pool is read, and the write operation of each system data pool is determined according to the binary log.
- the binary log is the BINLOG file, which is used to record changes to the database table structure and modification of table data. For example, the binary log records changes to the database table structure and operation statements for modifying table data. Based on the contents of the binary log, the write operations that occurred in the system data pool can be determined.
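Step S1011 can be sketched with simulated binlog statements. This is a rough illustration only: a real implementation would read each system data pool's actual BINLOG file (e.g. via a replication client), whereas here the log entries are plain strings invented for the example.

```python
# Simulated detection of write operations from binlog-style statements.
# Statements whose leading keyword is a write keyword (insert, update,
# delete, drop, create, alter) are treated as write operations.

WRITE_KEYWORDS = ("insert", "update", "delete", "drop", "create", "alter")

def extract_write_ops(statements: list[str]) -> list[str]:
    """Keep only the statements that represent write operations."""
    return [s for s in statements
            if s.strip().lower().startswith(WRITE_KEYWORDS)]

binlog = [
    "SELECT * FROM orders",                                   # read: ignored
    "INSERT INTO orders VALUES ('000993', '00')",             # write
    "UPDATE orders SET seqSt = '01' WHERE seqNo = '000982'",  # write
]
# Both write statements are kept; the SELECT is dropped.
print(extract_write_ops(binlog))
```

Each detected write operation would then feed step S1012, the generation of a data flow message.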
- step S1012 based on the write operation, a data flow message is generated.
- the data flow message is used to carry the data flow, and the specific format of the data flow message is not limited here.
- the data flow message may specifically be a JSON message.
- the data stream is carried by the data stream message, which facilitates the transmission of the data stream.
- in one example, the output format of a JSON message carrying a data stream includes the following fields:
- sysId can represent the system identifier
- seqNo and traceId can represent the primary key value of the data at different stages
- bussTp can represent the transaction type
- seqSt can represent the business status corresponding to the data.
- the data flow message may include both the data associated with the current write operation and the data associated with the last write operation with the same primary key value.
- the data with the same primary key value is the data corresponding to the same business.
- the data associated with the current write operation and the data associated with the previous write operation in the data flow message can reflect the change of the data, ensure that the correlation between the data before and after the change can be judged in subsequent processing, and allow determining, according to the change of the data, whether this data needs to be checked.
- for example, __before is used as the node label of the data associated with the previous write operation, to distinguish it from the data associated with the current write operation.
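Based only on the field names mentioned above (sysId, seqNo, traceId, bussTp, seqSt, and the __before node), a data flow message might be built as follows. All concrete values, and the exact nesting of __before, are assumptions for illustration; the patent does not fix them.

```python
import json

# Hypothetical JSON data flow message for one write operation. The
# top-level fields carry the data associated with the current write
# operation; the "__before" node carries the data associated with the
# previous write operation with the same primary key value.

message = {
    "sysId": "A1",           # system identifier
    "seqNo": "000993",       # primary key value of the data
    "traceId": "000993-t1",  # primary key value at another stage (invented)
    "bussTp": "pay",         # transaction type (invented)
    "seqSt": "00",           # business status after the current write
    "__before": {
        "seqSt": "01",       # business status before the current write
    },
}

print(json.dumps(message, indent=2))
```

A message in this shape is what the data flow component (e.g. Kafka) would transmit in step S1013.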
- step S1013 the data flow message is transmitted through the data flow component.
- Data flow packets can be transmitted one by one through the data flow component.
- the data streaming component may include components such as Kafka, which is not limited here.
- the data flow message may also be converted, through configuration, into a format that is more convenient for data verification, facilitating the execution of subsequent steps.
- step S104 in the case that one system data pool corresponds to multiple data streams with the same primary key value, the data stream whose fields meet a preset filter condition is retained.
- among the data streams obtained based on one system data pool, there may be multiple data streams corresponding to one business. The multiple data streams corresponding to a business need to be filtered so that only one data stream corresponding to this business participates in the data check, to avoid confusion in data verification.
- the primary key value corresponding to the data flow is the same, indicating that the business corresponding to the data flow is the same business.
- a filter condition can be set based on the meaning of each field of the data stream and the requirements of data checking, and one data stream is retained among the multiple data streams with the same primary key value through the filter condition. The retained data stream, whose fields meet the filter condition, can participate in the subsequent data verification process.
- the data stream includes a business status field.
- the business status field is used to represent the status of the business corresponding to the data of the data stream.
- the above filter condition may include that the business status field contains a target value in a preset value set, and that the business status field of the data stream is different from the business status field of the data stream corresponding to the last write operation.
- the preset value set includes at least one target value. The preset value set can be set according to work scenarios and work requirements, and is not limited herein.
- the value of the business status field being 01 indicates that the data of the data stream does not need to be checked for the time being, and the value of the business status field being 00 indicates that the data of the data stream needs to be checked.
- the preset value set includes a target value of 00.
- if the business status field of the data stream L1 includes the target value 00, and the business status field of the data stream corresponding to the last write operation is 01, the data stream L1 is retained.
- if the business status field of the data stream L2 includes the target value 00, but the business status field of the data stream corresponding to the last write operation is also 00, the data stream L2 is discarded.
- the filtering conditions are not limited to the above-mentioned contents, and the filtering conditions that can realize the filtering of multiple data streams with the same primary key value are all within the protection scope of the embodiments of the present application, and will not be described one by one here.
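The filter condition in this embodiment (status equals a target value in the preset set, and the status changed relative to the last write) can be sketched as follows. The field names follow the earlier message example and the function name is hypothetical.

```python
# Retain a data stream only if its business status field (seqSt) equals
# a target value in the preset value set AND differs from the status of
# the data stream of the last write operation (the __before node).

PRESET_VALUES = {"00"}  # preset value set with a single target value

def keep_stream(stream: dict) -> bool:
    """Apply the filter condition of the embodiment above."""
    current = stream.get("seqSt")
    previous = stream.get("__before", {}).get("seqSt")
    return current in PRESET_VALUES and current != previous

# Stream L1: status moved 01 -> 00, so it is retained.
print(keep_stream({"seqSt": "00", "__before": {"seqSt": "01"}}))  # True
# Stream L2: status stayed 00, so it is discarded.
print(keep_stream({"seqSt": "00", "__before": {"seqSt": "00"}}))  # False
```

Other filter conditions would swap in different fields or value sets without changing the overall shape.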
- FIG. 3 is a flowchart of another embodiment of the data verification method provided by the first aspect of the present application. The difference between FIG. 3 and FIG. 1 is that step S103 in FIG. 1 can be specifically refined into step S1031 and step S1032 in FIG. 3 .
- step S1031 in each data area, the data stream is divided into the check window according to the primary key value corresponding to the data stream.
- the data streams in different check windows have different primary key values, that is, data streams with the same primary key value are not divided into different check windows, and data streams with the same primary key value are divided into the same check window. Dividing the data stream into check windows enables hashing of the data stream.
- a certain check window of a certain data area includes data streams with the same primary key value corresponding to each system data pool corresponding to the data area.
- the data area C1 includes a data stream corresponding to the system data pool B1 and a data stream corresponding to the system data pool B2, and a check window in the data area C1 may include a data stream corresponding to the system data pool B1 with the same primary key value and A data stream corresponding to the system data pool B2, that is, a pair of data streams of the system data pool B1 and the system data pool B2 with the same primary key value is checked in each check window in the data area C1.
- step S1032 the data flow in the check window is checked.
- step S1032 it is checked whether the data carried by the data streams in the checking window are consistent.
- the granularity of the verification window is smaller than that of the data area. In some cases, when the duration of the data stream in the verification window exceeds the preset trigger duration, the verification of the data stream in the verification window is triggered. In other cases, when the number of data streams in the verification window reaches a preset trigger number, the verification of the data streams in the verification window is triggered. Since the data stream in the embodiment of the present application is triggered and generated by a write operation and is not limited by the time length, the granularity of the verification window can be very finely divided in terms of time or the number of data streams, thereby speeding up the data verification speed and improving the Data checking efficiency.
- the verification of the data streams in the verification window can be implemented in a standardized and pluggable way without case-by-case matching, which improves the flexibility of the development and design of data verification, and makes the addition and removal of check windows relatively flexible and easy to extend.
- in some cases, if the primary key value of an undivided data stream differs from those of the existing check windows, a new check window is generated, and the undivided data stream is divided into the new check window.
- when the time since the undivided data stream was divided into the new verification window exceeds the preset trigger duration, the verification of the data streams in the new verification window is triggered.
- the preset trigger duration can be set according to the work scenario and work requirements, and is not limited here.
- the preset trigger duration can be set by a timer. For example, when the timer count reaches the preset trigger duration, the verification of the data stream in the new verification window is triggered.
- FIG. 4 is a schematic diagram of an example of a check window of a data area in an embodiment of the present application.
- the existing check windows in the data area C1 include a check window D1, a check window D2 and a check window D3.
- the primary key value corresponding to the data stream in the verification window D1 is 000792
- the primary key value corresponding to the data stream in the verification window D2 is 000982
- the primary key value corresponding to the data stream in the verification window D3 is 000991.
- the primary key value corresponding to the data stream E1 is 000993, which is different from the primary key values corresponding to the data streams in the existing verification windows in the data area C1; therefore, a new check window D4 needs to be generated for the data stream E1, and the data stream E1 is divided into the check window D4.
- the preset trigger duration is set to 3 minutes.
- 3 minutes after the data stream E1 is divided into the verification window D4, the verification of the data streams in the verification window D4 is triggered.
- in other cases, if the primary key value of an undivided data stream is the same as that of an existing check window, the undivided data stream is divided into the existing check window.
- when the number of data streams in the existing verification window reaches the preset trigger number, the verification of the data streams in the existing verification window is triggered; when the number has not yet reached the preset trigger number, the check continues to wait.
- the number of preset triggers can be set according to the work scenario and work requirements, and is not limited here.
- the existing check windows in the data area C1 include a check window D1, a check window D2 and a check window D3.
- the primary key value corresponding to the data stream in the verification window D1 is 000792
- the primary key value corresponding to the data stream in the verification window D2 is 000982
- the primary key value corresponding to the data stream in the verification window D3 is 000991. If the data stream E2 in the data area C1 has not been divided into the check window, and the primary key value corresponding to the data stream E2 is 000991, the data stream E2 is divided into the check window D3.
- the preset trigger number is set to 2. Correspondingly, when the number of data streams in the verification window D3 reaches 2, the verification of the data streams in the verification window is triggered.
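The windowing behaviour above (one check window per primary key value, with a count trigger and a duration trigger) can be sketched as follows. The trigger values 2 and 3 minutes come from the examples; the data structures and names are assumptions for illustration.

```python
import time

# One check window per primary key value. A window's check is triggered
# either when it holds the preset trigger number of data streams (2 in
# the example above) or when it has been open longer than the preset
# trigger duration (3 minutes in the example above).

TRIGGER_COUNT = 2
TRIGGER_SECONDS = 3 * 60  # preset trigger duration of 3 minutes

windows: dict[str, dict] = {}  # primary key value -> window state

def add_stream(key: str, stream: dict, now: float) -> bool:
    """Divide a stream into its window; return True if a check fires."""
    window = windows.setdefault(key, {"streams": [], "opened": now})
    window["streams"].append(stream)
    return (len(window["streams"]) >= TRIGGER_COUNT
            or now - window["opened"] >= TRIGGER_SECONDS)

t0 = time.time()
print(add_stream("000991", {"sysId": "A2"}, t0))      # False: only 1 stream
print(add_stream("000991", {"sysId": "A3"}, t0 + 1))  # True: count reached 2
```

When a check fires, the streams in the window (same primary key value, different system data pools) would be handed to the per-window comparison of step S1032.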
- the verification of the data stream in the above-mentioned embodiment may specifically verify the value of the field of the data carried by the data stream, the number of the data stream in the verification window, etc., which is not limited herein.
- FIG. 5 is a flowchart of still another embodiment of the data verification method provided by the first aspect of the present application. The difference between FIG. 5 and FIG. 1 is that the data verification method shown in FIG. 5 may further include step S105 or step S106.
- in step S105, when it is determined that the data of the at least two system data pools in the data area are consistent, the value of a data verification success indicator is increased.
- the data of the at least two system data pools in the data area being consistent means that no cross-system data discrepancy problem has occurred, and the value of the data verification success indicator can be increased.
- the data verification success indicator is used to characterize the success rate of data verification; the larger its value, the higher the success rate of data verification.
- the data verification success indicator can provide a basis for cross-system data discrepancy analysis, alarms, risk prediction, etc., expanding the application scope of data verification.
- in step S106, when it is determined that the data of the at least two system data pools in the data area are inconsistent, the inconsistent data in the at least two system data pools in the data area is output.
- the data of the at least two system data pools in the data area being inconsistent means that a cross-system data discrepancy problem has occurred.
- the inconsistent data in the at least two system data pools in the data area is the data that caused the cross-system data discrepancy problem.
- the inconsistent data in the at least two system data pools in the data area can provide a basis for cross-system data discrepancy analysis, alarms, risk prediction, etc., expanding the application scope of data verification.
- FIG. 6 is a schematic structural diagram of an embodiment of the data verification apparatus provided in the second aspect of the present application.
- the data verification apparatus 200 may include a data stream generation module 201 , an area division module 202 and a verification module 203 .
- the data stream generation module 201 can be configured to generate and transmit a data stream including data associated with the write operation when a write operation occurs in each system data pool.
- the data stream includes the primary key value of the data.
- the area division module 202 may be configured to divide the data stream into at least one data area based on the fields of the data stream and the preset area division rule.
- Each data area includes data streams corresponding to at least two system data pools.
- the fields of the data streams of the same data area satisfy the same area division rule.
- the verification module 203 may be configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
- the verification of the data streams in multiple data areas is performed in parallel.
- a data stream including data associated with the write operation is generated.
- the data stream is divided into at least one data area, and each data area includes data streams corresponding to at least two system data pools.
- the data streams corresponding to the at least two system data pools are checked in the data area, so as to determine whether the data of the at least two system data pools are consistent.
- the write operation is not limited by any length of time, so the data can be verified in real time whenever it changes, and cross-system data discrepancy problems can thus be discovered in time.
- the data stream generation module 201 may be configured to: read the binary log of each system data pool and determine the write operations of each system data pool according to the binary log; generate, based on the write operation, a data stream message used to carry the data stream; and transmit the data stream message through a data streaming component.
- the data stream message includes the data associated with the current write operation and the data associated with the previous write operation that have the same primary key value.
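The description elsewhere gives example field names (sysId, seqNo, bussTp, seqSt) and a `__before` node for the previous write's data. The following is a hedged, illustrative construction of such a message; the exact message layout used by the patented system is an assumption here.

```python
import json

def build_stream_message(current, previous):
    """Build a JSON data stream message carrying the current write's data
    plus, under a "__before" node, the previous write's data with the
    same primary key (layout is illustrative, not normative)."""
    message = dict(current)
    message["__before"] = previous
    return json.dumps(message)

previous = {"sysId": "A1", "seqNo": "000993", "bussTp": "S33", "seqSt": "01"}
current = {"sysId": "A1", "seqNo": "000993", "bussTp": "S33", "seqSt": "00"}
msg = build_stream_message(current, previous)
decoded = json.loads(msg)
assert decoded["seqSt"] == "00" and decoded["__before"]["seqSt"] == "01"
```

A downstream consumer can compare the top-level fields with those under `__before` to decide whether the change makes the stream eligible for verification.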
- FIG. 7 is a schematic structural diagram of another embodiment of the data verification apparatus provided in the second aspect of the present application. The difference between FIG. 7 and FIG. 6 is that the data checking apparatus 200 shown in FIG. 7 may further include a screening module 204 .
- the filtering module 204 may be configured to, when one system data pool corresponds to multiple data streams with the same primary key value, retain the one data stream whose fields meet a preset filtering condition.
- the data stream includes a service state field
- the service state field is used to represent the state of the service corresponding to the data of the data stream.
- the filtering condition includes: the service status field includes a target value in a preset value set, and the service status field of the data stream is different from the service status field of the data stream corresponding to the previous write operation.
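The filtering condition just stated can be expressed as a small predicate. This is a sketch under assumptions: the field name `seqSt`, the `__before` node, and the preset value set `{"00"}` are illustrative, not specified normatively by the patent.

```python
TARGET_VALUES = {"00"}  # preset value set (assumption)

def passes_filter(stream):
    """Keep a stream only if its service status is a target value AND
    differs from the status recorded for the previous write operation."""
    status = stream["seqSt"]
    prev_status = stream.get("__before", {}).get("seqSt")
    return status in TARGET_VALUES and status != prev_status

kept = passes_filter({"seqSt": "00", "__before": {"seqSt": "01"}})
dropped = passes_filter({"seqSt": "00", "__before": {"seqSt": "00"}})
assert kept and not dropped
```

The second clause is what prevents a stream whose status did not actually change from entering verification twice.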
- FIG. 8 is a schematic structural diagram of still another embodiment of the data verification apparatus provided in the second aspect of the present application.
- the checking module 203 may include a window dividing unit 2031 and a checking unit 2032 .
- the window division unit 2031 may be configured to, in each data area, divide the data streams into verification windows according to the primary key values corresponding to the data streams.
- the primary key values of the data streams in different verification windows are different.
- the verification unit 2032 may be configured to verify the data streams within the verification window.
- in some examples, the window division unit 2031 may be configured to, when the primary key values of the data streams in the existing verification windows are different from the primary key value corresponding to an undivided data stream, generate a new verification window and divide the undivided data stream into the new verification window.
- the verification unit 2032 may be configured to trigger the verification of the data streams in the new verification window when the duration for which the undivided data stream has been divided into the new verification window exceeds a preset trigger duration.
- in other examples, the window division unit 2031 may be configured to, when the primary key value of the data streams in an existing verification window is the same as the primary key value corresponding to an undivided data stream, divide the undivided data stream into the existing verification window.
- the verification unit 2032 may be configured to trigger verification of the data streams in the existing verification window when the number of data streams in the existing verification window reaches a preset trigger number.
- FIG. 9 is a schematic structural diagram of still another embodiment of the data verification apparatus provided in the second aspect of the present application.
- the difference between FIG. 9 and FIG. 6 is that the data verification apparatus 200 shown in FIG. 9 may further include a processing module 205 .
- the processing module 205 may be configured to: when it is determined that the data of the at least two system data pools in the data area are consistent, increase the value of the data verification success indicator; and when it is determined that the data of the at least two system data pools in the data area are inconsistent, output the inconsistent data in the at least two system data pools in the data area.
- FIG. 10 is a schematic structural diagram of an embodiment of the data verification device provided by the third aspect of the application.
- the data checking apparatus 300 includes a memory 301 , a processor 302 , and a computer program stored on the memory 301 and executable on the processor 302 .
- the above-mentioned processor 302 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
- the memory 301 may include Read-Only Memory (ROM), Random Access Memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical or other physical/tangible memory storage device.
- thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software including computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the data verification method according to the embodiments of the present application.
- the processor 302 runs a computer program corresponding to the executable program code by reading the executable program code stored in the memory 301, so as to implement the data checking method in the above-mentioned embodiment.
- the data verification device 300 may further include a communication interface 303 and a bus 304 .
- the memory 301 , the processor 302 , and the communication interface 303 are connected through the bus 304 and complete the communication with each other.
- the communication interface 303 is mainly used to implement communication between modules, apparatuses, units, and/or devices in the embodiments of the present application. Input devices and/or output devices may also be accessed through the communication interface 303 .
- the bus 304 includes hardware, software, or both, and couples the components of the data verification device 300 to each other.
- by way of example and not limitation, the bus 304 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these.
- Bus 304 may include one or more buses, where appropriate. Although embodiments herein describe and illustrate a particular bus, this application contemplates any suitable bus or interconnect.
- FIG. 11 is a schematic structural diagram of an embodiment of the data verification system provided by the fourth aspect of the present application.
- the data checking system may include a data streaming device 41 , a distribution device 42 and a checking device 43 .
- the respective numbers of the data flow devices 41 , the distribution devices 42 and the verification devices 43 in the data verification system are not limited herein.
- the data stream device 41 can be used to generate and transmit a data stream including data associated with the write operation when a write operation occurs in each system data pool.
- the data stream includes the primary key value of the data.
- the distribution device 42 may be configured to divide the data stream into at least one data area based on the fields of the data stream and the preset area division rule.
- Each data area includes data streams corresponding to at least two system data pools.
- the verification device 43 may be configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
- the data flow device 41 , the distribution device 42 and the verification device 43 may also perform other steps in the data verification method in the above-mentioned embodiment.
- for details, please refer to the relevant description of the data verification method in the above embodiments, which will not be repeated here.
- a fifth aspect of the present application further provides a computer-readable storage medium on which a computer program is stored.
- when the computer program is executed by a processor, the data verification method in the above embodiments can be implemented and the same technical effect can be achieved; to avoid repetition, it is not repeated here.
- the above-mentioned computer-readable storage medium may include a non-transitory computer-readable storage medium, such as read-only memory (Read-Only Memory, referred to as ROM), random access memory (Random Access Memory, referred to as RAM), magnetic disk or optical disk etc., are not limited here.
- the processor may be, but is not limited to, a general-purpose processor, a dedicated processor, an application-specific processor, or a field programmable logic circuit. It should also be understood that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can also be implemented by dedicated hardware that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
Abstract
The present application discloses a data verification method, apparatus, device, system and storage medium, relating to the field of data processing. The method includes: when a write operation occurs in each system data pool, generating and transmitting a data stream including the data associated with the write operation, the data stream including the primary key value of the data; dividing the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area including the data streams corresponding to at least two system data pools; and in each data area, verifying the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent. According to the embodiments of the present application, cross-system data discrepancy problems can be discovered in time.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 202011167710.5, entitled "Data verification method, apparatus, device, system and storage medium" and filed on October 27, 2020, the entire contents of which are incorporated herein by reference.
The present application belongs to the field of data processing, and in particular relates to a data verification method, apparatus, device, system and storage medium.
As business complexity increases, a single business transaction may involve multiple systems. Correspondingly, business data needs to be verified across these systems, so that inconsistencies in the business data between systems can be discovered and measures can be taken for each system to guarantee its stable operation.
At present, business data can be obtained separately from the two systems participating in data verification. For example, the business data of a completed day is obtained from system A and system B respectively, and the business data in system A and system B are compared item by item to determine whether they are consistent, that is, whether a cross-system data discrepancy problem exists. However, such a data verification method cannot discover cross-system data discrepancy problems in time.
SUMMARY
Embodiments of the present application provide a data verification method, apparatus, device, system and storage medium, which can discover cross-system data discrepancy problems in time.
In a first aspect, an embodiment of the present application provides a data verification method, including: when a write operation occurs in each system data pool, generating and transmitting a data stream including the data associated with the write operation, the data stream including the primary key value of the data; dividing the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area including the data streams corresponding to at least two system data pools; and in each data area, verifying the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
In a second aspect, an embodiment of the present application provides a data verification apparatus, including: a data stream generation module configured to, when a write operation occurs in each system data pool, generate and transmit a data stream including the data associated with the write operation, the data stream including the primary key value of the data; an area division module configured to divide the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area including the data streams corresponding to at least two system data pools; and a verification module configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
In a third aspect, an embodiment of the present application provides a data verification device, including a processor and a memory storing computer program instructions, where the processor implements the data verification method of the first aspect when executing the computer program instructions.
In a fourth aspect, an embodiment of the present application provides a data verification system, including: a data stream device configured to, when a write operation occurs in each system data pool, generate and transmit a data stream including the data associated with the write operation, the data stream including the primary key value of the data; a distribution device configured to divide the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area including the data streams corresponding to at least two system data pools; and a verification device configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
In a fifth aspect, an embodiment of the present application provides a computer storage medium on which computer program instructions are stored, where the computer program instructions implement the data verification method of the first aspect when executed by a processor.
The embodiments of the present application provide a data verification method, apparatus, device, system and storage medium. When a write operation occurs in each system data pool, a data stream including the data associated with the write operation is generated. The data stream is divided into at least one data area, each data area including the data streams corresponding to at least two system data pools. The data streams corresponding to the at least two system data pools are verified in the data area to determine whether the data of the at least two system data pools are consistent. There is no need to set a time period for obtaining data; the generation of the data stream is triggered by the write operation, and the data stream is then divided and verified. The write operation is not limited by any length of time, so the data can be verified in real time whenever it changes, and cross-system data discrepancy problems can thus be discovered in time.
In order to describe the technical solutions of the embodiments of the present application more clearly, the accompanying drawings needed in the embodiments are briefly introduced below. For those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of an embodiment of the data verification method provided by the first aspect of the present application;
FIG. 2 is a flowchart of another embodiment of the data verification method provided by the first aspect of the present application;
FIG. 3 is a flowchart of yet another embodiment of the data verification method provided by the first aspect of the present application;
FIG. 4 is a schematic diagram of an example of verification windows of a data area in an embodiment of the present application;
FIG. 5 is a flowchart of still another embodiment of the data verification method provided by the first aspect of the present application;
FIG. 6 is a schematic structural diagram of an embodiment of the data verification apparatus provided by the second aspect of the present application;
FIG. 7 is a schematic structural diagram of another embodiment of the data verification apparatus provided by the second aspect of the present application;
FIG. 8 is a schematic structural diagram of yet another embodiment of the data verification apparatus provided by the second aspect of the present application;
FIG. 9 is a schematic structural diagram of still another embodiment of the data verification apparatus provided by the second aspect of the present application;
FIG. 10 is a schematic structural diagram of an embodiment of the data verification device provided by the third aspect of the present application;
FIG. 11 is a schematic structural diagram of an embodiment of the data verification system provided by the fourth aspect of the present application.
Features and exemplary embodiments of various aspects of the present application are described in detail below. In order to make the objectives, technical solutions and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present application, not to limit it. For those skilled in the art, the present application can be implemented without some of these specific details. The following description of the embodiments is provided only to give a better understanding of the present application by showing examples thereof.
As business complexity increases, a single business transaction may involve multiple systems; for example, it may be completed by multiple systems in cooperation. To ensure that the business is executed normally, every system involved stores the data of that business, and the data is verified between the systems so that data inconsistencies between the systems, that is, cross-system data discrepancy problems, can be discovered, and measures can be taken to guarantee the stable operation of each system.
The amount of data to be verified across systems is very large, and the clocks of the systems may differ. To avoid missing data in the verification, business data is generally obtained over a relatively long period of time; for example, the business data of each system for a completed day is obtained, and the business data of the different systems is verified item by item to determine whether a cross-system data discrepancy problem exists. In this case, however, if a data discrepancy problem does exist, it can only be discovered with a delay rather than in time.
The present application provides a data verification method, apparatus, device, system and storage medium that can, when a write operation occurs, transmit data in the form of data streams (i.e., stream data) and verify the data of different systems using the primary key values of the data in the data streams, so as to discover cross-system data discrepancy problems in time.
The specific fields of business and data are not limited here. For example, in the transaction field, the business may specifically be a transaction business, and the business data may specifically be transaction journal data; verifying the data then means verifying the transaction journal data of the same transaction business, and the matching of transaction details can be achieved through data verification. However, the application scenarios of the embodiments of the present application are not limited to transaction scenarios; other application scenarios requiring data verification also fall within the protection scope of the embodiments of the present application.
A first aspect of the present application provides a data verification method, which may be executed by a data verification apparatus, a data verification device or a data verification system; that is, the data verification method may be implemented by a single apparatus or device, or by a system including multiple apparatuses or devices, which is not limited here.
FIG. 1 is a flowchart of an embodiment of the data verification method provided by the first aspect of the present application. As shown in FIG. 1, the data verification method may include steps S101 to S103.
In step S101, when a write operation occurs in each system data pool, a data stream including the data associated with the write operation is generated and transmitted.
A system data pool is used to store the data of a system, and specifically the data of the business in the system. For example, in the transaction field, the system data pool may be used to store the journal data of the transaction business of the system. The system data pool may be set inside the system, or may exist in the form of a database independent of the system, which is not limited here. Multiple systems may participate in data verification, and each system may correspond to one system data pool; that is, multiple system data pools may participate in data verification.
A write operation is an operation that may cause the data in a system data pool to change. For example, write operations may include but are not limited to insert operations such as the insert operation, update operations such as the update operation, delete operations such as the delete and drop operations, create operations such as the create operation, and modification operations such as the alter operation, which are not limited here.
The data associated with a write operation includes the data on which the write operation acts. A data stream is a collection of dynamic data whose distribution in time and quantity is not limited. In the embodiments of the present application, data streams are used to carry data. The data stream includes the primary key value of the data. The content of the primary key value can be set according to the type of the data, which is not limited here. For example, where the data includes transaction journal data, the primary key value may specifically include the transaction journal number.
Where the data verification method is executed by a data verification apparatus or device, the data stream may be transmitted inside the apparatus or device. Where the data verification method is executed by a data verification system including multiple apparatuses or devices, the data stream may be transmitted between the apparatuses or devices in the system.
In step S102, the data stream is divided into at least one data area based on the fields of the data stream and a preset area division rule.
The fields of the data stream can be set according to the content and type of the data. For example, the fields of the data stream may include a system identifier field, a primary key value field, a service status field, etc., which are not limited here. The system identifier field is used to represent the identifier of the system corresponding to the system data pool. The primary key value field is used to represent the primary key value of the data. The service status field is used to represent the status of the business corresponding to the data.
According to the purpose of the data verification, the data streams corresponding to the system data pools can be divided into groups, that is, into at least one data area, through the area division rule. Each data area includes the data streams corresponding to at least two system data pools, and in each data area the data carried by the data streams can be verified. Each data area may correspond to an entry for data streams, and the division of the data streams is achieved by setting the area division rule.
The area division rule can be set according to the work scenario and work requirements, which are not limited here. A data area can be regarded as the set of data streams formed after the data streams are grouped. The fields of the data streams of the same data area satisfy the same area division rule. In some examples, data verification is performed pairwise between systems, that is, pairwise between system data pools; correspondingly, each data area may include the data streams corresponding to two system data pools.
For example, a business transaction involves three systems: system A1, system A2 and system A3. The data of system A1 is stored in system data pool B1, the data of system A2 in system data pool B2, and the data of system A3 in system data pool B3. When the data of the same business transaction changes, normally the data of that transaction in system data pool B1, system data pool B2 and system data pool B3 should all change; however, the data of that transaction in one or two of the system data pools may fail to change, which is not limited here. The fields of a data stream can reflect the system identifier, the primary key value of the data, the service status, etc. Through the area division rule, the data streams corresponding to system data pool B1 and system data pool B2 can be divided into data area C1, and the data streams corresponding to system data pool B2 and system data pool B3 into data area C2. The data streams corresponding to system data pools B1 and B2 can then be verified against each other in data area C1, and those corresponding to system data pools B2 and B3 in data area C2.
The data streams of one business transaction corresponding to one system data pool may be divided into multiple data areas or into one data area, which is not limited here. For example, an area division rule may specify that a data stream whose field D3 has a value of 0001, 0002 or 0003 is divided into data area C3 through entry 2008, and another rule may specify that a data stream whose field D3 has a value of 0003 or 0004 is divided into data area C4 through entry 2009. A data stream whose field D3 has the value 0003 is then divided into both data area C3 and data area C4, while a data stream whose field D3 has the value 0001 is divided into data area C3. The value of field D3 of the data streams in data area C3 satisfies the area division rule that the value of field D3 is one of 0001, 0002 and 0003, and the value of field D3 of the data streams in data area C4 satisfies the rule that the value of field D3 is one of 0003 and 0004.
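The D3/C3/C4 example above can be sketched as a simple rule-based router. This is an illustrative sketch; the rule table and entry comments are assumptions matching the example values, not a normative implementation.

```python
# Area division rules: each data area accepts streams whose field D3
# matches its rule. One stream may satisfy several rules and therefore
# be divided into several data areas.
AREA_RULES = {
    "C3": lambda s: s["D3"] in {"0001", "0002", "0003"},  # entry 2008
    "C4": lambda s: s["D3"] in {"0003", "0004"},          # entry 2009
}

def route(stream):
    """Return the list of data areas this stream is divided into."""
    return [area for area, rule in AREA_RULES.items() if rule(stream)]

assert route({"D3": "0003"}) == ["C3", "C4"]  # matches both rules
assert route({"D3": "0001"}) == ["C3"]        # matches only C3's rule
```

Because the rules are plain predicates over stream fields, data areas can be added or removed by editing the rule table, which mirrors the flexibility the description claims for area division.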
In step S103, in each data area, the data streams corresponding to the at least two system data pools in the data area are verified according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
Specifically, in each data area, the data streams corresponding to at least two system data pools that have the same primary key value are verified against each other. If a data area contains a data stream with a certain primary key value corresponding to one system data pool, but no data stream with that primary key value corresponding to another system data pool, it can be determined that the data of the at least two system data pools in the data area are inconsistent, that is, a cross-system data discrepancy problem has occurred. In the data area, the data of the data streams corresponding to the at least two system data pools with the same primary key value is verified: if the data is the same, it can be determined that the data of the at least two system data pools in the data area are consistent, that is, no cross-system data discrepancy problem has occurred; if the data is different, it can be determined that the data of the at least two system data pools in the data area are inconsistent, that is, a cross-system data discrepancy problem has occurred.
In some examples, the verification of the data streams in multiple data areas is performed in parallel. For example, after division there are three data areas: data area C1, data area C2 and data area C3. The verification of the data streams in data areas C1, C2 and C3 can be performed in parallel, which speeds up data verification and improves its efficiency. The verification in different data areas can be performed by different apparatuses, devices or modules, which is not limited here. Data areas can be added or removed according to specific requirements, improving the flexibility and scalability of data verification.
In some examples, the verification of the data streams in each data area can be performed in memory, to further increase the speed and efficiency of data verification and reduce the resources it occupies.
In the embodiments of the present application, when a write operation occurs in each system data pool, a data stream including the data associated with the write operation is generated. The data stream is divided into at least one data area, each data area including the data streams corresponding to at least two system data pools. The data streams corresponding to the at least two system data pools are verified in the data area to determine whether the data of the at least two system data pools are consistent. There is no need to set a time period for obtaining data; the generation of the data stream is triggered by the write operation, and the data stream is then divided and verified. The write operation is not limited by any length of time, so the data can be verified in real time whenever it changes, and cross-system data discrepancy problems can thus be discovered in time. Compared with current methods that need a day or longer to discover a cross-system data discrepancy problem, the data verification method provided by the embodiments of the present application can shorten the time needed to one minute or even less.
When the amount of data is large, since the generation of the data stream is triggered by the write operation and the data is verified in real time, higher performance requirements of data verification can be met compared with accumulating a large amount of data before verifying it.
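The verification logic just described (a missing counterpart or a payload mismatch both count as a discrepancy) can be sketched as follows. The field names (`seqNo`, `sysId`, `payload`) and the two-pool simplification are assumptions for illustration.

```python
from collections import defaultdict

def verify_area(streams):
    """Group a data area's streams by primary key and flag keys where the
    two pools' payloads are missing a counterpart or differ."""
    by_key = defaultdict(dict)                # primary key -> {pool: payload}
    for s in streams:
        by_key[s["seqNo"]][s["sysId"]] = s["payload"]
    inconsistent = {}
    for key, pools in by_key.items():
        values = list(pools.values())
        if len(values) < 2 or values[0] != values[1]:
            inconsistent[key] = pools         # missing counterpart or mismatch
    return inconsistent

streams = [
    {"sysId": "B1", "seqNo": "000792", "payload": {"amt": 10}},
    {"sysId": "B2", "seqNo": "000792", "payload": {"amt": 10}},
    {"sysId": "B1", "seqNo": "000982", "payload": {"amt": 7}},  # no B2 stream
]
bad = verify_area(streams)
assert "000792" not in bad and "000982" in bad
```

In the patented method this comparison happens per verification window rather than over a whole area at once, but the consistency test itself is the same.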
FIG. 2 is a flowchart of another embodiment of the data verification method provided by the first aspect of the present application. FIG. 2 differs from FIG. 1 in that step S101 in FIG. 1 may be refined into steps S1011 to S1013 in FIG. 2, and the data verification method shown in FIG. 2 may further include step S104.
In step S1011, the binary log of each system data pool is read, and the write operations of each system data pool are determined according to the binary log.
The binary log, i.e., the BINLOG file, is used to record changes to database table structures and modifications to table data. For example, the binary log records the operation statements that change the table structure and modify the table data. According to the content of the binary log, the write operations that have occurred in a system data pool can be determined.
In step S1012, a data stream message is generated based on the write operation.
The data stream message is used to carry the data stream, and its specific format is not limited here. In some examples, the data stream message may specifically be a JSON message. Carrying the data stream in a data stream message facilitates its transmission. For example, the output format of a JSON message carrying a data stream is as follows:
Here, sysId may represent the system identifier, seqNo and traceId may represent the primary key values of the data at different stages, bussTp may represent the transaction type, and seqSt may represent the service status corresponding to the data.
Since the data corresponding to the same business transaction may change, in order for the data stream to reflect the changes, in some examples the data stream message may include the data associated with the current write operation and the data associated with the previous write operation that have the same primary key value. Data with the same primary key value is data corresponding to the same business transaction. Through the data of the current write operation and of the previous write operation in the data stream message, the changes of the data can be reflected, ensuring that the correlation between the earlier and later data can be determined in subsequent processing, and that it can be decided, according to the changes, whether the data needs to be verified. For example, in the output format of the JSON message above, __before is used as the node label separating the data associated with the previous write operation from the data associated with the current write operation.
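As a loose illustration of "determining write operations from the binary log", the sketch below classifies logged statements by their leading keyword. Real binlog processing (e.g., MySQL's row-based binlog) is far more involved and binary, so this only mirrors the idea; all names here are assumptions.

```python
# Write-operation keywords named in the description (insert, update,
# delete, drop, create, alter).
WRITE_KEYWORDS = {"insert", "update", "delete", "drop", "create", "alter"}

def write_operations(log_statements):
    """Return (keyword, statement) pairs for the logged statements that
    represent write operations."""
    ops = []
    for stmt in log_statements:
        keyword = stmt.strip().split()[0].lower()
        if keyword in WRITE_KEYWORDS:
            ops.append((keyword, stmt))
    return ops

log = [
    "INSERT INTO journal VALUES ('000792', 10)",
    "SELECT * FROM journal",
    "UPDATE journal SET amt = 11 WHERE seq_no = '000792'",
]
ops = write_operations(log)
assert [k for k, _ in ops] == ["insert", "update"]
```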
In step S1013, the data stream message is transmitted through a data streaming component.
The data stream messages can be transmitted one by one through the data streaming component. The data streaming component may include components such as Kafka, which is not limited here.
In some examples, before step S102 is executed, the data stream message may also be converted into a format more convenient for data verification; for example, the data stream message may be converted into a Map format, and the data converted into the Map format participates in the subsequent steps, making it convenient to implement data verification through configuration.
In step S104, when one system data pool corresponds to multiple data streams with the same primary key value, the one data stream whose fields meet a preset filtering condition is retained.
Among the data streams obtained from a system data pool, one business transaction may correspond to multiple data streams. The multiple data streams corresponding to one transaction need to be filtered so that a single data stream corresponding to that transaction participates in the data verification, avoiding confusion in the verification. Data streams with the same primary key value correspond to the same business transaction. Specifically, the filtering condition can be set using the meaning of each field of the data stream and the requirements of the data verification, and one of the multiple data streams with the same primary key value is retained through the filtering condition. The retained data stream whose fields meet the filtering condition can participate in the subsequent data verification process.
In some examples, the data stream includes a service status field, which is used to represent the status of the business corresponding to the data of the data stream. The above filtering condition may include: the service status field includes a target value in a preset value set, and the service status field of the data stream is different from the service status field of the data stream corresponding to the previous write operation. The preset value set includes at least one target value and can be set according to the work scenario and work requirements, which are not limited here.
For example, a service status field value of 01 indicates that the data of the data stream does not need to be verified for the time being, and a value of 00 indicates that it does. The preset value set includes the target value 00. When the service status field of data stream L1 includes the target value 00 and the service status field of the data stream corresponding to the previous write operation is 01, data stream L1 is retained. When the service status field of data stream L2 includes the target value 00 but the service status field of the data stream corresponding to the previous write operation is also 00, data stream L2 is discarded.
The filtering conditions are not limited to the above; any filtering condition that can filter multiple data streams with the same primary key value falls within the protection scope of the embodiments of the present application, and examples are not given one by one here.
FIG. 3 is a flowchart of yet another embodiment of the data verification method provided by the first aspect of the present application. FIG. 3 differs from FIG. 1 in that step S103 in FIG. 1 may be refined into steps S1031 and S1032 in FIG. 3.
In step S1031, in each data area, the data streams are divided into verification windows according to the primary key values corresponding to the data streams.
The primary key values of the data streams in different verification windows are different; that is, data streams with the same primary key value are not divided into different verification windows but into the same verification window. Dividing the data streams into verification windows hashes the data streams. In some examples, a verification window of a data area includes the data streams, with the same primary key value, corresponding to each system data pool of that data area. For example, data area C1 includes the data streams corresponding to system data pool B1 and system data pool B2; one verification window in data area C1 may include one data stream corresponding to system data pool B1 and one corresponding to system data pool B2 with the same primary key value, that is, in each verification window of data area C1 one pair of data streams of system data pools B1 and B2 with the same primary key value is verified.
In step S1032, the data streams in the verification window are verified.
Specifically, in step S1032 it is verified whether the data carried by the data streams in the verification window is consistent. The granularity of the verification window is smaller than that of the data area. In some cases, when the duration for which a data stream has existed in the verification window exceeds a preset trigger duration, the verification of the data streams in the window is triggered. In other cases, when the number of data streams in the verification window reaches a preset trigger number, the verification is triggered. Since the data streams in the embodiments of the present application are generated upon write operations and are not limited by any length of time, the granularity of the verification window can be made very fine in terms of time or number of data streams, which speeds up data verification and improves its efficiency. Moreover, since the matching of data streams is already completed when they are divided into verification windows, the verification within a window requires no matching and can be implemented in a standardized, pluggable way, improving the flexibility of the development and design of data verification; verification windows can also be added and removed relatively flexibly, facilitating expansion.
In some examples, when the primary key values of the data streams in the existing verification windows are different from the primary key value corresponding to an undivided data stream, a new verification window is generated and the undivided data stream is divided into the new verification window. When the duration for which the undivided data stream has been divided into the new verification window exceeds a preset trigger duration, the verification of the data streams in the new verification window is triggered.
If the duration for which the undivided data stream has been in the new verification window exceeds the preset trigger duration and there is no data stream in the data area that can be verified against it, a data discrepancy problem may have occurred. The preset trigger duration can be set according to the work scenario and work requirements, which are not limited here, and can be implemented with a timer; for example, when the timer reaches the preset trigger duration, the verification of the data streams in the new verification window is triggered.
For example, FIG. 4 is a schematic diagram of an example of verification windows of a data area in an embodiment of the present application. As shown in FIG. 4, the existing verification windows of data area C1 include verification windows D1, D2 and D3. The primary key value corresponding to the data stream in verification window D1 is 000792, that in verification window D2 is 000982, and that in verification window D3 is 000991. If data stream E1 in data area C1 has not yet been divided into a verification window, and the primary key value corresponding to data stream E1 is 000993, then the primary key values of the data streams in the existing verification windows of data area C1 are all different from the primary key value corresponding to data stream E1; therefore, a new verification window D4 needs to be generated for data stream E1, and data stream E1 is divided into verification window D4. Suppose the preset trigger duration is 3 minutes; correspondingly, 3 minutes after data stream E1 is divided into verification window D4, the verification of the data streams in verification window D4 is triggered.
In other examples, when the primary key value of the data streams in an existing verification window is the same as the primary key value corresponding to an undivided data stream, the undivided data stream is divided into the existing verification window. When the number of data streams in the existing verification window reaches a preset trigger number, the verification of the data streams in the existing verification window is triggered; when the number has not reached the preset trigger number, verification continues to wait.
The preset trigger number can be set according to the work scenario and work requirements, which are not limited here.
For example, as shown in FIG. 4, the existing verification windows of data area C1 include verification windows D1, D2 and D3, whose data streams correspond to the primary key values 000792, 000982 and 000991 respectively. If data stream E2 in data area C1 has not yet been divided into a verification window and the primary key value corresponding to data stream E2 is 000991, data stream E2 is divided into verification window D3. Suppose the preset trigger number is 2; correspondingly, when the number of data streams in verification window D3 reaches 2, the verification of the data streams in that window is triggered.
The verification of the data streams in the above embodiments may specifically verify the values of the fields of the data carried by the data streams, the number of data streams in the verification window, etc., which are not limited here.
FIG. 5 is a flowchart of still another embodiment of the data verification method provided by the first aspect of the present application. FIG. 5 differs from FIG. 1 in that the data verification method shown in FIG. 5 may further include step S105 or step S106.
In step S105, when it is determined that the data of the at least two system data pools in the data area are consistent, the value of a data verification success indicator is increased.
The data of the at least two system data pools in the data area being consistent means that no cross-system data discrepancy problem has occurred, and the value of the data verification success indicator can be increased. The data verification success indicator is used to characterize the success rate of data verification; the larger its value, the higher the success rate. The data verification success indicator can provide a basis for cross-system data discrepancy analysis, alarms, risk prediction, etc., expanding the application scope of data verification.
In step S106, when it is determined that the data of the at least two system data pools in the data area are inconsistent, the inconsistent data in the at least two system data pools in the data area is output.
The data of the at least two system data pools in the data area being inconsistent means that a cross-system data discrepancy problem has occurred, and the inconsistent data in the at least two system data pools in the data area is the data that caused it. The inconsistent data can likewise provide a basis for cross-system data discrepancy analysis, alarms, risk prediction, etc., expanding the application scope of data verification.
It should be noted that, where the data verification method in the above embodiments is executed by a data verification apparatus or device, functions such as generating the data streams, dividing the data areas, dividing the verification windows and verifying the data can be implemented by different modules or units. Where the method is executed by a data verification system, these functions can be implemented by different apparatuses. The specific form of the entity executing the data verification method is not limited here.
A second aspect of the present application further provides a data verification apparatus. FIG. 6 is a schematic structural diagram of an embodiment of the data verification apparatus provided by the second aspect of the present application. As shown in FIG. 6, the data verification apparatus 200 may include a data stream generation module 201, an area division module 202 and a verification module 203.
The data stream generation module 201 may be configured to, when a write operation occurs in each system data pool, generate and transmit a data stream including the data associated with the write operation.
The data stream includes the primary key value of the data.
The area division module 202 may be configured to divide the data stream into at least one data area based on the fields of the data stream and a preset area division rule.
Each data area includes the data streams corresponding to at least two system data pools.
In some examples, the fields of the data streams of the same data area satisfy the same area division rule.
The verification module 203 may be configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
In some examples, the verification of the data streams in multiple data areas is performed in parallel.
In the embodiments of the present application, when a write operation occurs in each system data pool, a data stream including the data associated with the write operation is generated. The data stream is divided into at least one data area, each data area including the data streams corresponding to at least two system data pools. The data streams corresponding to the at least two system data pools are verified in the data area to determine whether the data of the at least two system data pools are consistent. There is no need to set a time period for obtaining data; the generation of the data stream is triggered by the write operation, and the data stream is then divided and verified. The write operation is not limited by any length of time, so the data can be verified in real time whenever it changes, and cross-system data discrepancy problems can thus be discovered in time.
In some examples, the data stream generation module 201 may be configured to: read the binary log of each system data pool and determine the write operations of each system data pool according to the binary log; generate, based on the write operation, a data stream message used to carry the data stream; and transmit the data stream message through a data streaming component.
In some examples, the data stream message includes the data associated with the current write operation and the data associated with the previous write operation that have the same primary key value.
FIG. 7 is a schematic structural diagram of another embodiment of the data verification apparatus provided by the second aspect of the present application. FIG. 7 differs from FIG. 6 in that the data verification apparatus 200 shown in FIG. 7 may further include a filtering module 204.
The filtering module 204 may be configured to, when one system data pool corresponds to multiple data streams with the same primary key value, retain the one data stream whose fields meet a preset filtering condition.
In some examples, the data stream includes a service status field used to represent the status of the business corresponding to the data of the data stream. The filtering condition includes: the service status field includes a target value in a preset value set, and the service status field of the data stream is different from the service status field of the data stream corresponding to the previous write operation.
FIG. 8 is a schematic structural diagram of yet another embodiment of the data verification apparatus provided by the second aspect of the present application. FIG. 8 differs from FIG. 6 in that the verification module 203 may include a window division unit 2031 and a verification unit 2032.
The window division unit 2031 may be configured to, in each data area, divide the data streams into verification windows according to the primary key values corresponding to the data streams.
The primary key values of the data streams in different verification windows are different.
The verification unit 2032 may be configured to verify the data streams in the verification window.
In some examples, specifically, the window division unit 2031 may be configured to, when the primary key values of the data streams in the existing verification windows are different from the primary key value corresponding to an undivided data stream, generate a new verification window and divide the undivided data stream into the new verification window.
The verification unit 2032 may be configured to trigger the verification of the data streams in the new verification window when the duration for which the undivided data stream has been divided into the new verification window exceeds a preset trigger duration.
In other examples, specifically, the window division unit 2031 may be configured to, when the primary key value of the data streams in an existing verification window is the same as the primary key value corresponding to an undivided data stream, divide the undivided data stream into the existing verification window.
The verification unit 2032 may be configured to trigger the verification of the data streams in the existing verification window when the number of data streams in the existing verification window reaches a preset trigger number.
FIG. 9 is a schematic structural diagram of still another embodiment of the data verification apparatus provided by the second aspect of the present application. FIG. 9 differs from FIG. 6 in that the data verification apparatus 200 shown in FIG. 9 may further include a processing module 205.
The processing module 205 may be configured to: when it is determined that the data of the at least two system data pools in the data area are consistent, increase the value of the data verification success indicator; and when it is determined that the data of the at least two system data pools in the data area are inconsistent, output the inconsistent data in the at least two system data pools in the data area.
A third aspect of the present application further provides a data verification device. FIG. 10 is a schematic structural diagram of an embodiment of the data verification device provided by the third aspect of the present application. As shown in FIG. 10, the data verification device 300 includes a memory 301, a processor 302, and a computer program stored on the memory 301 and executable on the processor 302.
In one example, the above processor 302 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
The memory 301 may include a read-only memory (ROM), a random access memory (RAM), a magnetic disk storage medium device, an optical storage medium device, a flash memory device, or an electrical, optical or other physical/tangible memory storage device. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software including computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the data verification method according to the embodiments of the present application.
The processor 302 runs the computer program corresponding to the executable program code by reading the executable program code stored in the memory 301, so as to implement the data verification method in the above embodiments.
In one example, the data verification device 300 may further include a communication interface 303 and a bus 304. As shown in FIG. 10, the memory 301, the processor 302 and the communication interface 303 are connected through the bus 304 and communicate with each other.
The communication interface 303 is mainly used to implement the communication between the modules, apparatuses, units and/or devices in the embodiments of the present application. Input devices and/or output devices may also be connected through the communication interface 303.
The bus 304 includes hardware, software or both, and couples the components of the data verification device 300 to each other. By way of example and not limitation, the bus 304 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Where appropriate, the bus 304 may include one or more buses. Although the embodiments of the present application describe and illustrate particular buses, the present application contemplates any suitable bus or interconnect.
A fourth aspect of the present application further provides a data verification system. FIG. 11 is a schematic structural diagram of an embodiment of the data verification system provided by the fourth aspect of the present application. As shown in FIG. 11, the data verification system may include a data stream device 41, a distribution device 42 and a verification device 43. The respective numbers of the data stream devices 41, distribution devices 42 and verification devices 43 in the data verification system are not limited here.
The data stream device 41 may be configured to, when a write operation occurs in each system data pool, generate and transmit a data stream including the data associated with the write operation.
The data stream includes the primary key value of the data.
The distribution device 42 may be configured to divide the data stream into at least one data area based on the fields of the data stream and a preset area division rule.
Each data area includes the data streams corresponding to at least two system data pools.
The verification device 43 may be configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
The data stream device 41, the distribution device 42 and the verification device 43 may also perform the other steps of the data verification method in the above embodiments; for details, please refer to the relevant description of the data verification method in the above embodiments, which will not be repeated here.
A fifth aspect of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the data verification method in the above embodiments can be implemented and the same technical effect can be achieved; to avoid repetition, it is not repeated here. The above computer-readable storage medium may include a non-transitory computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, which are not limited here.
It should be clear that the embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments can be referred to each other, and each embodiment focuses on its differences from the others. For the apparatus, device, system and computer-readable storage medium embodiments, reference may be made to the description of the method embodiments for the relevant parts. The present application is not limited to the particular steps and structures described above and shown in the figures. Those skilled in the art, after appreciating the spirit of the present application, may make various changes, modifications and additions, or change the order between steps. Also, for brevity, detailed descriptions of known method technologies are omitted here.
Aspects of the present application are described above with reference to flowcharts and/or block diagrams of the methods, apparatuses (systems) and computer program products according to the embodiments of the present application. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions executed via the processor of the computer or other programmable data processing apparatus enable the implementation of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. Such a processor may be, but is not limited to, a general-purpose processor, a dedicated processor, an application-specific processor or a field programmable logic circuit. It should also be understood that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can also be implemented by dedicated hardware that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
Those skilled in the art should understand that the above embodiments are all exemplary and not restrictive. Different technical features appearing in different embodiments can be combined to achieve beneficial effects. On the basis of studying the drawings, the specification and the claims, those skilled in the art should be able to understand and implement other varied embodiments of the disclosed embodiments. In the claims, the term "comprising" does not exclude other means or steps; the quantifier "a" or "an" does not exclude a plurality; and the terms "first" and "second" are used to denote names rather than any particular order. Any reference signs in the claims should not be construed as limiting the protection scope. The functions of several parts appearing in the claims may be implemented by a single hardware or software module. The appearance of certain technical features in different dependent claims does not mean that these features cannot be combined to achieve beneficial effects.
Claims (15)
- A data verification method, comprising: when a write operation occurs in each system data pool, generating and transmitting a data stream comprising the data associated with the write operation, the data stream comprising the primary key value of the data; dividing the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area comprising the data streams corresponding to at least two system data pools; and in each data area, verifying the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
- The method according to claim 1, wherein the generating and transmitting, when a write operation occurs in each system data pool, a data stream comprising the data associated with the write operation comprises: reading the binary log of each system data pool, and determining the write operations of each system data pool according to the binary log; generating, based on the write operation, a data stream message used to carry the data stream; and transmitting the data stream message through a data streaming component.
- The method according to claim 2, wherein the data stream message comprises the data associated with the current write operation and the data associated with the previous write operation that have the same primary key value.
- The method according to claim 1, wherein before the dividing the data stream into at least one data area based on the fields of the data stream and a preset area division rule, the method further comprises: when one system data pool corresponds to multiple data streams with the same primary key value, retaining the one data stream whose fields meet a preset filtering condition.
- The method according to claim 4, wherein the data stream comprises a service status field used to represent the status of the business corresponding to the data of the data stream, and the filtering condition comprises: the service status field comprises a target value in a preset value set, and the service status field of the data stream is different from the service status field of the data stream corresponding to the previous write operation.
- The method according to claim 1, wherein the verifying, in each data area according to the primary key values corresponding to the data streams, the data streams corresponding to the at least two system data pools in the data area comprises: in each data area, dividing the data streams into verification windows according to the primary key values corresponding to the data streams, wherein the primary key values of the data streams in different verification windows are different; and verifying the data streams in the verification window.
- The method according to claim 6, wherein the dividing the data streams into verification windows according to the primary key values corresponding to the data streams comprises: when the primary key values of the data streams in the existing verification windows are different from the primary key value corresponding to an undivided data stream, generating a new verification window, and dividing the undivided data stream into the new verification window; and the verifying the data streams in the verification window comprises: when the duration for which the undivided data stream has been divided into the new verification window exceeds a preset trigger duration, triggering the verification of the data streams in the new verification window.
- The method according to claim 6, wherein the dividing the data streams into verification windows according to the primary key values corresponding to the data streams comprises: when the primary key value of the data streams in an existing verification window is the same as the primary key value corresponding to an undivided data stream, dividing the undivided data stream into the existing verification window; and the verifying the data streams in the verification window comprises: when the number of data streams in the existing verification window reaches a preset trigger number, triggering the verification of the data streams in the existing verification window.
- The method according to claim 1, wherein after the verifying, according to the primary key values corresponding to the data streams, the data streams corresponding to the at least two system data pools in the data area, the method further comprises: when it is determined that the data of the at least two system data pools in the data area are consistent, increasing the value of a data verification success indicator; and when it is determined that the data of the at least two system data pools in the data area are inconsistent, outputting the inconsistent data in the at least two system data pools in the data area.
- The method according to claim 1, wherein the fields of the data streams of the same data area satisfy the same area division rule.
- The method according to claim 1, wherein the verification of the data streams in multiple data areas is performed in parallel.
- A data verification apparatus, comprising: a data stream generation module configured to, when a write operation occurs in each system data pool, generate and transmit a data stream comprising the data associated with the write operation, the data stream comprising the primary key value of the data; an area division module configured to divide the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area comprising the data streams corresponding to at least two system data pools; and a verification module configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
- A data verification device, comprising a processor and a memory storing computer program instructions, wherein the processor implements the data verification method according to any one of claims 1 to 11 when executing the computer program instructions.
- A data verification system, comprising: a data stream device configured to, when a write operation occurs in each system data pool, generate and transmit a data stream comprising the data associated with the write operation, the data stream comprising the primary key value of the data; a distribution device configured to divide the data stream into at least one data area based on the fields of the data stream and a preset area division rule, each data area comprising the data streams corresponding to at least two system data pools; and a verification device configured to, in each data area, verify the data streams corresponding to the at least two system data pools in the data area according to the primary key values corresponding to the data streams, so as to determine whether the data of the at least two system data pools in the data area are consistent.
- A computer storage medium on which computer program instructions are stored, wherein the computer program instructions implement the data verification method according to any one of claims 1 to 11 when executed by a processor.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011167710.5A CN112422635B (zh) | 2020-10-27 | 2020-10-27 | 数据核对方法、装置、设备、系统及存储介质 |
CN202011167710.5 | 2020-10-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022089063A1 true WO2022089063A1 (zh) | 2022-05-05 |
Family
ID=74841834
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/118146 WO2022089063A1 (zh) | 2020-10-27 | 2021-09-14 | 数据核对方法、装置、设备、系统及存储介质 |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN112422635B (zh) |
TW (1) | TWI802056B (zh) |
WO (1) | WO2022089063A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112422635B (zh) * | 2020-10-27 | 2023-05-23 | 中国银联股份有限公司 | 数据核对方法、装置、设备、系统及存储介质 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120271993A1 (en) * | 2011-04-19 | 2012-10-25 | International Business Machines Corporation | Virtual tape systems using physical tape caching |
CN103136276A (zh) * | 2011-12-02 | 2013-06-05 | 阿里巴巴集团控股有限公司 | 一种数据核对系统,方法及装置 |
US20160055190A1 (en) * | 2014-08-19 | 2016-02-25 | New England Complex Systems Institute, Inc. | Event detection and characterization in big data streams |
CN108647353A (zh) * | 2018-05-16 | 2018-10-12 | Koubei (Shanghai) Information Technology Co., Ltd. | Method and apparatus for real-time data verification |
CN109840837A (zh) * | 2017-11-27 | 2019-06-04 | Tenpay Payment Technology Co., Ltd. | Financial data processing method and apparatus, computer-readable medium and electronic device |
CN110196844A (zh) * | 2018-04-16 | 2019-09-03 | Tencent Technology (Shenzhen) Co., Ltd. | Data migration method, system and storage medium |
CN112422635A (zh) * | 2020-10-27 | 2021-02-26 | China UnionPay Co., Ltd. | Data verification method, apparatus, device, system and storage medium |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102571617B (zh) * | 2012-03-22 | 2015-04-01 | Shanghai Advanced Research Institute, Chinese Academy of Sciences | Streaming data server, streaming data transmission method and data transmission system |
CN103645963B (zh) * | 2013-12-26 | 2016-06-29 | Shenzhen DFT Technology Co., Ltd. | Storage system and data consistency check method therefor |
TWI607340B (zh) * | 2015-01-09 | 2017-12-01 | Chunghwa Telecom Co Ltd | Privacy data flow security and storage protection method and system |
CN106326219B (zh) * | 2015-06-16 | 2020-01-24 | Alibaba Group Holding Limited | Method, apparatus and system for verifying business system data |
CN106454767A (zh) * | 2015-08-05 | 2017-02-22 | ZTE Corporation | Service data synchronization method, apparatus and system |
US11550632B2 (en) * | 2015-12-24 | 2023-01-10 | Intel Corporation | Facilitating efficient communication and data processing across clusters of computing machines in heterogeneous computing environment |
CN110213071B (zh) * | 2018-04-16 | 2021-11-02 | Tencent Technology (Shenzhen) Co., Ltd. | Data verification method, apparatus, system, computer device and storage medium |
TW201947492A (zh) * | 2018-05-14 | E.SUN Commercial Bank, Ltd. | Operational data convergence system and method |
CN113553313B (zh) * | 2018-07-10 | 2023-12-05 | Advanced New Technologies Co., Ltd. | Data migration method and system, storage medium and electronic device |
US10795913B2 (en) * | 2018-10-11 | 2020-10-06 | Capital One Services, Llc | Synching and reading arrangements for multi-regional active/active databases |
CN109684350A (zh) * | 2018-12-15 | 2019-04-26 | Ping An Securities Co., Ltd. | Securities registration data verification method, apparatus, computer device and storage medium |
CN110046202B (zh) * | 2019-03-07 | 2023-05-26 | Naval University of Engineering, PLA | Real-time data management method for an integrated power system based on an in-memory key-value database |
CN110109824B (zh) * | 2019-04-09 | 2022-05-17 | Ping An Technology (Shenzhen) Co., Ltd. | Automatic regression testing method and apparatus for big data, computer device and storage medium |
CN110716813A (zh) * | 2019-09-17 | 2020-01-21 | Beike Technology Co., Ltd. | Data stream processing method and apparatus, readable storage medium and processor |
- 2020
  - 2020-10-27 CN CN202011167710.5A patent/CN112422635B/zh active Active
- 2021
  - 2021-09-14 WO PCT/CN2021/118146 patent/WO2022089063A1/zh active Application Filing
  - 2021-10-22 TW TW110139362A patent/TWI802056B/zh active
Also Published As
Publication number | Publication date |
---|---|
TW202217641A (zh) | 2022-05-01 |
CN112422635B (zh) | 2023-05-23 |
TWI802056B (zh) | 2023-05-11 |
CN112422635A (zh) | 2021-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9391831B2 (en) | Dynamic stream processing within an operator graph | |
CN112181614B (zh) | Task timeout monitoring method, apparatus, device, system and storage medium | |
WO2022089063A1 (zh) | Data verification method, apparatus, device, system and storage medium | |
CN110602056A (zh) | Service parameter transfer method and apparatus | |
CN108763071A (zh) | Web page testing method and terminal device | |
CN112087530B (zh) | Method, apparatus, device and medium for uploading data to a blockchain system | |
US10073938B2 (en) | Integrated circuit design verification | |
CN106909454B (zh) | Rule processing method and device | |
CN106649344B (zh) | Network log compression method and apparatus | |
CN110704226B (zh) | Data verification method, apparatus and storage medium | |
EP2829972B1 (en) | Method and apparatus for allocating stream processing unit | |
CN113923268B (zh) | Parsing method, device and storage medium for multi-version communication protocols | |
CN110704620B (zh) | Knowledge-graph-based method and apparatus for identifying identical entities | |
CN110516258A (zh) | Data verification method and apparatus, storage medium and electronic apparatus | |
CN109522915B (zh) | Virus file clustering method, apparatus and readable medium | |
WO2023015869A1 (zh) | Rate limiting control method, apparatus, device and storage medium | |
CN105245380B (zh) | Method and apparatus for identifying a message propagation pattern | |
CN115296913A (zh) | Rapid orchestration system adapted to Flink job rules | |
CN115774837A (zh) | Signal verification method, apparatus, device, medium, program product and vehicle | |
CN109857632B (zh) | Testing method, apparatus, terminal device and readable storage medium | |
CN109800823B (zh) | Clustering method and apparatus for POS terminals | |
CN112835934A (zh) | Query information collection method, apparatus, electronic device and storage medium | |
CN110018844A (zh) | Management method, apparatus and electronic device for decision-triggering schemes | |
CN117648718B (zh) | Data-source-based business object display method, apparatus, electronic device and medium | |
CN110391952A (zh) | Performance analysis method, apparatus and device | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
NENP | Non-entry into the national phase |
Ref country code: DE |
122 | Ep: pct application non-entry in european phase |
Ref document number: 21884793 Country of ref document: EP Kind code of ref document: A1 |