WO2024036829A1 - Data fusion method, device, equipment and storage medium - Google Patents
Data fusion method, device, equipment and storage medium
- Publication number
- WO2024036829A1 (PCT/CN2022/137357)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- subsequent
- transaction
- target
- compared
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
Definitions
- the embodiments of the present invention relate to the field of data processing technology, and in particular, to a data fusion method, device, equipment and storage medium.
- For a certain business service, data center A is set to provide the business service externally. When data center A suffers an equipment failure, service is usually switched to another data center, which continues to provide the business service. However, because of the synchronization delay between data centers, the data corresponding to the business service may not exist in the other data center, or the data there may be inconsistent with the data in data center A, so errors occur when the other data center provides the business service.
- a preset period including the switching time point is generally determined, and the data within the preset period is obtained from data center A as the source data.
- the source data is copied to other data centers.
- Embodiments of the present application provide a data fusion method, device, equipment and storage medium to ensure data integrity and continuity of the data center.
- embodiments of the present application provide a data fusion method, which includes:
- For any first data identifier, obtain the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as the first target data, and obtain the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as the second target data; update the second target data based on the first transaction status of the first target data and the second transaction status of the second target data.
- updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data includes:
- If the first transaction status in the first target data is not empty, the second transaction status in the second target data is not empty, and the first transaction status and the second transaction status are different, then, based on the preset transaction state machine, determine the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status;
- If the first state position is located after the second state position, the first target data is used to update the second target data.
- If the first transaction status and the second transaction status are the same, determine the transaction time point of the first target data and the transaction time point of the second target data;
- If the transaction time point of the first target data is later than the transaction time point of the second target data, the first target data is used to update the second target data.
- updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data includes:
- the second transaction status in the second target data is updated.
- determining the target subsequent transaction data chain based on the obtained M first subsequent identifiers and N second subsequent identifiers, as well as the M first subsequent state positions and N second subsequent state positions includes:
- the first subsequent transaction data corresponding to each of the M first subsequent status positions, and the N is used as the target subsequent transaction data;
- the target subsequent transaction data is sorted according to transaction time points to obtain the target subsequent transaction data chain.
- Group the first subsequent identifiers and the second subsequent identifiers that share the same subsequent identifier into one group to obtain at least one identifier matching group; use the first subsequent identifier in the identifier matching group as the first matching identifier, and use the second subsequent identifier in the identifier matching group as the second matching identifier;
- For any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; the subsequent transaction data corresponding to one of the first subsequent state position and the second subsequent state position is deleted;
- the target subsequent transaction data is sorted according to transaction time points to obtain the target subsequent transaction data chain.
- updating the second transaction status in the second target data based on the target subsequent transaction data chain includes:
- the second transaction status in the second target data is determined based on the transaction status corresponding to each target subsequent transaction data in the target subsequent transaction data chain.
- the data unique identifier includes an application service unique identifier and a central service unique identifier; it also includes:
- the data pair to be compared includes one first data to be compared and one second data to be compared;
- if the transaction time point of the first data to be compared in the data pair to be compared is earlier than the transaction time point of the second data to be compared in the data pair to be compared, use the first data to be compared in the data pair to update the second data to be compared in the data pair.
- the method further includes:
- For the first attribute identifier in the second target data determine whether the first attribute value corresponding to the first attribute identifier is within a preset range, and if not, add the second target data to the exception file;
- For the first attribute identifier in the second target data, determine the second attribute identifier associated with the first attribute identifier, and determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy the preset relationship; if so, the second target data is added to the exception file; the exception file is used for manual review.
- a data fusion device which includes:
- An acquisition module configured to acquire a plurality of first data to be compared within a preset period from the first data center, and to acquire a plurality of second data to be compared within the preset period from the second data center; the preset The period is determined based on the data center switching time point;
- a determination module configured to determine the same data unique identifier among the plurality of first data to be compared and the plurality of second data to be compared as the first data identifier;
- An update module configured to obtain, for any first data identifier, the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as the first target data, and the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as the second target data; and to update the second target data based on the first transaction status of the first target data and the second transaction status of the second target data.
- the update module is specifically used to:
- If the first transaction status in the first target data is not empty, the second transaction status in the second target data is not empty, and the first transaction status and the second transaction status are different, then, based on the preset transaction state machine, determine the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status;
- If the first state position is located after the second state position, the first target data is used to update the second target data.
- the update module is also used to:
- If the first transaction status and the second transaction status are the same, determine the transaction time point of the first target data and the transaction time point of the second target data;
- If the transaction time point of the first target data is later than the transaction time point of the second target data, the first target data is used to update the second target data.
- the update module is specifically used to:
- the second transaction status in the second target data is updated.
- the update module is specifically used to:
- the first subsequent transaction data corresponding to each of the M first subsequent status positions, and the N is used as the target subsequent transaction data;
- the target subsequent transaction data is sorted according to transaction time points to obtain the target subsequent transaction data chain.
- the update module is also used to:
- Group the first subsequent identifiers and the second subsequent identifiers that share the same subsequent identifier into one group to obtain at least one identifier matching group; use the first subsequent identifier in the identifier matching group as the first matching identifier, and use the second subsequent identifier in the identifier matching group as the second matching identifier;
- For any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; the subsequent transaction data corresponding to one of the first subsequent state position and the second subsequent state position is deleted;
- the target subsequent transaction data is sorted according to transaction time points to obtain the target subsequent transaction data chain.
- the update module is specifically used to:
- the second transaction status in the second target data is determined based on the transaction status corresponding to each target subsequent transaction data in the target subsequent transaction data chain.
- the data unique identifier includes an application service unique identifier and a central service unique identifier; the update module is also used to:
- the data pair to be compared includes one first data to be compared and one second data to be compared;
- if the transaction time point of the first data to be compared in the data pair to be compared is earlier than the transaction time point of the second data to be compared in the data pair to be compared, use the first data to be compared in the data pair to update the second data to be compared in the data pair.
- a verification module which is specifically used to:
- For the first attribute identifier in the second target data, determine the second attribute identifier associated with the first attribute identifier, and determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy the preset relationship; if so, the second target data is added to the exception file; the exception file is used for manual review.
- embodiments of the present application provide a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor.
- When the processor executes the program, the steps of the above data fusion method are implemented.
- embodiments of the present application provide a computer-readable storage medium that stores a computer program that can be executed by a computer device.
- When the program is run on the computer device, it causes the computer device to execute the steps of the above data fusion method.
- embodiments of the present application provide a computer program product.
- the computer program product includes a computer program stored on a computer-readable storage medium.
- the computer program includes program instructions. When the program instructions are executed by a computer device, the computer device is caused to perform the steps of the above data fusion method.
- In the embodiments of the present application, a plurality of first data to be compared within a preset period are obtained from the first data center, a plurality of second data to be compared within the preset period are obtained from the second data center, and the same data unique identifier among the plurality of first data to be compared and the plurality of second data to be compared is determined as the first data identifier.
- For any first data identifier, the first data to be compared corresponding to the first data identifier is obtained from the plurality of first data to be compared as the first target data, and the second data to be compared corresponding to the first data identifier is obtained from the plurality of second data to be compared as the second target data; the second target data is then updated based on the first transaction status of the first target data and the second transaction status of the second target data. Because the update is not decided simply by comparing the update time of the first target data with that of the second target data, but is based on the first and second transaction statuses, the sequence relationship of the transaction statuses in the transaction scenario is fully considered, which makes the updated second target data more accurate and ensures the data integrity and continuity of the second data center.
- Figure 1 is a schematic diagram of a system architecture provided by an embodiment of the present application.
- Figure 2 is a schematic flow chart of a data fusion method provided by an embodiment of the present application.
- Figure 3 is a schematic flow chart of a second target data updating method provided by an embodiment of the present application.
- Figure 4 is a schematic flow chart of a second target data updating method provided by an embodiment of the present application.
- FIG. 5 is a schematic structural diagram of a consumption service state machine provided by an embodiment of the present application.
- Figure 6 is a schematic flowchart of a second transaction status updating method provided by an embodiment of the present application.
- Figure 7 is a schematic flowchart of a method for determining a target subsequent transaction data chain provided by an embodiment of the present application.
- Figure 8 is a schematic flow chart of another data fusion method provided by the embodiment of the present application.
- Figure 9 is a schematic structural diagram of a data fusion device provided by an embodiment of the present application.
- Figure 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- FIG. 1 is an architecture diagram of a data fusion system applicable to the embodiment of the present application.
- the architecture diagram of the data fusion system at least includes a first data center 101 and a second data center 102.
- the first data center 101 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.
- the second data center 102 can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, Content Delivery Network (CDN), and big data and artificial intelligence platforms.
- the first data center 101 and the second data center 102 can be directly connected through wired or wireless means, or the connection can be established through an intermediate server.
- the first data center 101 provides external business services.
- when an equipment failure occurs in the first data center 101, service is switched to the second data center 102, and the second data center 102 continues to provide external business services.
- the data fusion system 104 in the second data center 102 obtains a plurality of first data to be compared within a preset period from the first data center, and obtains a plurality of second data to be compared within the preset period from the second data center; the preset period is determined based on the data center switching time point.
- the same data unique identifier among the plurality of first data to be compared and the plurality of second data to be compared is determined as the first data identifier.
- for any first data identifier, the first data to be compared corresponding to the first data identifier is obtained from the plurality of first data to be compared as the first target data, and the second data to be compared corresponding to the first data identifier is obtained from the plurality of second data to be compared as the second target data; based on the first transaction status of the first target data and the second transaction status of the second target data, the second target data is updated to obtain the second updated data.
- the embodiment of the present application provides a process of a data fusion method, as shown in Figure 2.
- the process is executed by the data fusion system in the second data center 102 shown in Figure 1 and includes the following steps:
- Step S201 Acquire a plurality of first data to be compared within a preset period from a first data center, and obtain a plurality of second data to be compared within a preset period from a second data center.
- the preset period is determined based on the data center switching time point.
- the preset duration can be the synchronization delay of the data center, or the sum of the synchronization delay and a specified delay.
- the starting point of the preset period is the data center switching time point minus the preset duration.
- the end point of the preset period is the data center switching time point plus the preset duration.
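- The window calculation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `preset_period`, its parameter names, and the example timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def preset_period(switch_time, sync_delay, specified_delay=timedelta(0)):
    """Return (start, end) of the preset period around the switching time point.

    The preset duration is the synchronization delay, optionally extended by
    a specified delay (both names are illustrative assumptions).
    """
    preset_duration = sync_delay + specified_delay
    return switch_time - preset_duration, switch_time + preset_duration


# Hypothetical values: switch at 10:00:00, 30 s sync delay, 10 s extra margin.
switch = datetime(2022, 12, 7, 10, 0, 0)
start, end = preset_period(switch, timedelta(seconds=30), timedelta(seconds=10))
print(start, end)
```

- Records whose transaction time point falls between `start` and `end` would then be fetched from both data centers for comparison.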
- Step S202 Determine the same data unique identifier among the plurality of first data to be compared and the plurality of second data to be compared as the first data identifier.
- the data unique identifier includes an application business unique identifier and a central business unique identifier.
- the application business unique identifier is determined by the business service and sent to the data center, so it does not differ between data centers; the central business unique identifier is determined by the data center, so it may differ between data centers.
- two data unique identifiers are the same when both the application business unique identifiers and the central business unique identifiers are the same.
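- Step S202 amounts to intersecting the identifier sets of the two data centers. The sketch below assumes each record is a dictionary with hypothetical keys `app_id` (application business unique identifier) and `center_id` (central business unique identifier); the field names and sample values are illustrative only.

```python
def data_key(record):
    """Composite data unique identifier: both parts must match."""
    return (record["app_id"], record["center_id"])


def common_identifiers(first_records, second_records):
    """Return identifiers present in both data centers (the first data identifiers)."""
    first_keys = {data_key(r) for r in first_records}
    second_keys = {data_key(r) for r in second_records}
    return first_keys & second_keys


# Hypothetical sample records from the first and second data centers.
first_records = [
    {"app_id": "A1", "center_id": "C1"},
    {"app_id": "A2", "center_id": "C7"},
]
second_records = [
    {"app_id": "A1", "center_id": "C1"},
    {"app_id": "A2", "center_id": "C9"},  # same app_id, different center_id
]
shared = common_identifiers(first_records, second_records)
print(shared)
```

- In this sketch the record with `app_id` "A2" is not matched, because its central business unique identifier differs between the two data centers.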
- the first data to be compared can be basic transaction data or subsequent transaction data.
- for example, the business service includes consumption and return operation steps, in which the data generated by the consumption operation is the basic transaction data, and the data generated by the return operation is the subsequent transaction data.
- When the first data to be compared is basic transaction data, it includes multiple attribute identifiers: application business unique identifier, central business unique identifier, transaction time point, transaction type, transaction status, transaction amount, payment amount, and accumulated returned amount.
- When the first data to be compared is subsequent transaction data, it includes the same attribute identifiers and additionally the application business unique identifier of the associated basic transaction data.
- the second data to be compared is similar to the first data to be compared, and will not be described in detail here.
- Step S203 For any first data identifier, obtain the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as the first target data, and obtain the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as the second target data; update the second target data based on the first transaction status of the first target data and the second transaction status of the second target data.
- the transaction status in the first target data is regarded as the first transaction status
- the transaction status in the second target data is regarded as the second transaction status.
- the transaction status can take different values, such as order success, payment success, order failure or payment failure.
- If the first transaction status is not empty and the second transaction status is not empty, the transaction sequence relationship between the first transaction status and the second transaction status is determined, and based on the transaction sequence relationship, the second target data is updated to obtain the second updated data.
- If the first transaction status is empty and the second transaction status is empty, determine M first subsequent transaction data corresponding to the first target data and N second subsequent transaction data corresponding to the second target data, and update the second target data based on the M first subsequent transaction data and the N second subsequent transaction data to obtain the second updated data.
- In the embodiments of the present application, a plurality of first data to be compared within a preset period are obtained from the first data center, a plurality of second data to be compared within the preset period are obtained from the second data center, and the same data unique identifier among the plurality of first data to be compared and the plurality of second data to be compared is determined as the first data identifier.
- For any first data identifier, the first data to be compared corresponding to the first data identifier is obtained from the plurality of first data to be compared as the first target data, and the second data to be compared corresponding to the first data identifier is obtained from the plurality of second data to be compared as the second target data; the second target data is then updated based on the first transaction status of the first target data and the second transaction status of the second target data. Because the update is not decided simply by comparing the update time of the first target data with that of the second target data, but is based on the first and second transaction statuses, the sequence relationship of the transaction statuses in the transaction scenario is fully considered, which makes the updated second target data more accurate and ensures the data integrity and continuity of the second data center.
- updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data includes the following two possible implementations:
- the first possible implementation manner for the situation where the first transaction status in the first target data is not empty, and the second transaction status in the second target data is not empty, specifically includes the following steps as shown in Figure 3:
- Step S301 Determine whether the first transaction status and the second transaction status are the same. If not, execute step S302; otherwise, execute step S305.
- Step S302 Based on the preset transaction state machine, determine the first state position corresponding to the first transaction state and the second state position corresponding to the second transaction state.
- Step S303 Determine whether the first state position is located after the second state position. If so, execute step S304; otherwise, end.
- Step S304 Use the first target data to update the second target data and end.
- Step S305 Determine whether the transaction time point of the first target data is later than the transaction time point of the second target data. If so, execute step S304; otherwise, end.
- When the first transaction status in the first target data is not empty and the second transaction status in the second target data is not empty, the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status are determined based on the preset transaction state machine, and the second target data is updated based on the positional relationship between the first state position and the second state position, so that the updated second target data is more accurate.
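- Steps S301 to S305 can be sketched as a single decision function. This is an illustration under stated assumptions: the `STATE_POSITION` table, the status names, and the record keys `status` and `time` are hypothetical, and the transaction time point is simplified to an integer timestamp.

```python
# Hypothetical state positions derived from a preset transaction state machine.
STATE_POSITION = {"ORDER_SUCCESS": 0, "PAYMENT_SUCCESS": 1, "RETURN_SUCCESS": 2}


def should_update(first, second, state_position=STATE_POSITION):
    """Decide whether the first target data should overwrite the second."""
    s1, s2 = first["status"], second["status"]
    if s1 is None or s2 is None:
        raise ValueError("this branch only handles non-empty transaction statuses")
    if s1 != s2:
        # S302/S303: compare positions in the state machine.
        return state_position[s1] > state_position[s2]
    # S305: same status, so fall back to the transaction time point.
    return first["time"] > second["time"]


def fuse(first, second, state_position=STATE_POSITION):
    """S304: return the updated second target data."""
    return dict(first) if should_update(first, second, state_position) else second


first = {"status": "PAYMENT_SUCCESS", "time": 100}
second = {"status": "ORDER_SUCCESS", "time": 120}
print(fuse(first, second))
```

- Note that in this example the first target data wins even though its timestamp is older, because its status lies later in the state machine; a purely time-based comparison would have kept the stale second record.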
- the second possible implementation manner for the situation where the first transaction status in the first target data is empty, and the second transaction status in the second target data is empty, specifically includes the following steps as shown in Figure 4:
- M subsequent transaction data associated with the first target data are determined from the plurality of subsequent transaction data as M first subsequent transaction data, and N subsequent transaction data associated with the second target data are determined from the plurality of subsequent transaction data as N second subsequent transaction data.
- Step S402 Determine the data unique identifiers corresponding to the M first subsequent transaction data as the first subsequent identifiers, and determine the data unique identifiers corresponding to the N second subsequent transaction data as the second subsequent identifiers.
- Step S403 Based on the preset transaction state machine, determine the first subsequent state positions corresponding to each of the M first subsequent transaction data, and determine the second subsequent state positions corresponding to each of the N second subsequent transaction data.
- the preset transaction state machine includes multiple transaction states and the state transition paths between them.
- the state transition path between each transaction state is related to the actual transaction sequence. The order of each transaction state can be determined based on the state transition path.
- the consumption business state machine includes multiple transaction states, namely: order success S1, payment success S2, cancellation success S3 and return success S4.
- the consumption business state machine includes multiple state transfer paths, namely: order V1, payment V2, cancellation V3 and return V4.
- the state transfer path between order success S1 and payment success S2 is payment V2.
- the state transfer path between order success S1 and cancellation success S3 is cancellation V3.
- the state transfer path between payment success S2 and cancellation success S3 is cancellation V3, and the state transfer path between payment success S2 and return success S4 is return V4.
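- The consumption business state machine can be written down as a transition table, from which an ordering of states can be derived. The text does not say how a state position is computed, so the sketch below assumes one plausible rule: a state's position is the length of the longest transfer path from order success S1. The function name is hypothetical, and order V1 is assumed to be the path that leads into S1, so it is not listed as a transition between two states.

```python
# (source state, destination state) -> state transfer path, as described above.
TRANSITIONS = {
    ("S1", "S2"): "V2",  # order success -> payment success via payment
    ("S1", "S3"): "V3",  # order success -> cancellation success via cancellation
    ("S2", "S3"): "V3",  # payment success -> cancellation success via cancellation
    ("S2", "S4"): "V4",  # payment success -> return success via return
}


def state_positions(transitions, initial="S1"):
    """Assumed rule: position = longest path length from the initial state.

    Only valid for an acyclic state machine; a cycle would never converge.
    """
    positions = {initial: 0}
    changed = True
    while changed:
        changed = False
        for src, dst in transitions:
            if src in positions and positions.get(dst, -1) < positions[src] + 1:
                positions[dst] = positions[src] + 1
                changed = True
    return positions


positions = state_positions(TRANSITIONS)
print(positions)
```

- Under this assumed rule, cancellation success S3 and return success S4 share the same position, so a real implementation would need an explicit tie-breaking policy for those two states.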
- Step S404 Determine the target subsequent transaction data chain based on the obtained M first subsequent identifiers and N second subsequent identifiers, as well as the M first subsequent state positions and N second subsequent state positions.
- the target subsequent transaction data chain consists of first subsequent transaction data and second subsequent transaction data.
- For example, suppose the first target data corresponds to two first subsequent transaction data, namely the first payment transaction data and the first return transaction data, and the second target data corresponds to one second subsequent transaction data, namely the second payment transaction data.
- Based on the first subsequent identifier and the first subsequent state position corresponding to each of the two first subsequent transaction data, and on the second subsequent identifier and the second subsequent state position corresponding to the one second subsequent transaction data, the second payment transaction data and the first return transaction data are determined as the target subsequent transaction data.
- Step S405: Update the second transaction status in the second target data based on the target subsequent transaction data chain.
- Specifically, updating the second transaction status in the second target data includes the following steps, as shown in Figure 6:
- Step S601: For any two adjacent target subsequent transaction data in the target subsequent transaction data chain, determine their first position relationship based on the preset transaction state machine.
- Specifically, first determine the transaction status of each of the two adjacent target subsequent transaction data; then, based on the preset transaction state machine, determine the state position of each of those transaction statuses; finally, determine the first position relationship from the two state positions.
- Step S602: Determine the second position relationship of the two adjacent target subsequent transaction data within the target subsequent transaction data chain.
- Step S603: If the first position relationship and the second position relationship are the same, determine the second transaction status in the second target data based on the transaction status of each target subsequent transaction data in the target subsequent transaction data chain. If they differ, submit the second target data for manual review.
- The attribute values corresponding to attribute identifiers in the second target data can also be determined based on the attribute values corresponding to the other attribute identifiers of each target subsequent transaction data in the target subsequent transaction data chain.
- For example, let the target subsequent transaction data chain be: second payment transaction data - first return transaction data. Based on the preset transaction state machine, the transaction status corresponding to the second payment transaction data is payment success S2, and the transaction status corresponding to the first return transaction data is return success S4. Therefore, the first position relationship of the two is: the second payment transaction data is before the first return transaction data.
- The second position relationship of the second payment transaction data and the first return transaction data in the target subsequent transaction data chain is likewise: the second payment transaction data is before the first return transaction data.
- Since the first position relationship and the second position relationship are the same, the second transaction status in the second target data is determined based on the transaction statuses of the second payment transaction data and the first return transaction data.
- In this embodiment, for the case where the first transaction status in the first target data is empty and the second transaction status in the second target data is empty, the target subsequent transaction data chain is determined based on the M first subsequent transaction data corresponding to the first target data and the N second subsequent transaction data corresponding to the second target data, and the second transaction status in the second target data is updated based on that chain. Because the update is based on the target subsequent transaction data chain, the updated second target data is more accurate.
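Steps S601 to S603 can be sketched as a consistency check over a time-sorted chain. This is a sketch under stated assumptions, not the applicant's implementation: the numeric `STATE_POSITION` table and the function name `resolve_second_status` are hypothetical, and adopting the status of the last item in a consistent chain is an assumption, because the text only says the status is determined "based on" the statuses in the chain.

```python
# Assumed numeric positions of the states in the consumption state machine.
# The source only says positions are derived from the preset state machine.
STATE_POSITION = {"S1": 0, "S2": 1, "S3": 2, "S4": 2}


def resolve_second_status(chain):
    """Check a chain of (name, transaction_status) pairs sorted by time.

    Returns (status, needs_manual_review).
    """
    for (_, earlier_status), (_, later_status) in zip(chain, chain[1:]):
        # First position relationship: order implied by the state machine.
        # Second position relationship: order in the chain (earlier item first).
        if STATE_POSITION[earlier_status] >= STATE_POSITION[later_status]:
            # The two relationships differ: hand the record to manual review.
            return None, True
    if not chain:
        return None, False
    # Assumption: the status of the last item in a consistent chain is adopted.
    return chain[-1][1], False
```

For the example chain (second payment with S2, then first return with S4) the two relationships agree, so the sketch returns S4 without requesting manual review.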
- Optionally, in step S404 above, determining the target subsequent transaction data chain includes the following two possible implementations:
- The first possible implementation, for the case where the M first subsequent identifiers and the N second subsequent identifiers have no subsequent identifier in common, includes the following steps:
- Take the first subsequent transaction data corresponding to each of the M first subsequent state positions and the second subsequent transaction data corresponding to each of the N second subsequent state positions as the target subsequent transaction data; then sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
- In this embodiment, when the M first subsequent identifiers and the N second subsequent identifiers have no subsequent identifier in common, the target subsequent transaction data chain is determined directly from the M first subsequent transaction data and the N second subsequent transaction data, which improves the efficiency of generating the chain.
- The second possible implementation, for the case where the M first subsequent identifiers and the N second subsequent identifiers have subsequent identifiers in common, includes the following steps, as shown in Figure 7:
- Step S701: Group each first subsequent identifier and second subsequent identifier that are the same into one group, obtaining at least one identifier matching group; take the first subsequent identifier in an identifier matching group as the first matching identifier, and the second subsequent identifier in that group as the second matching identifier.
- Step S702: For any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; of these two state positions, delete the subsequent transaction data corresponding to the later one.
- Step S703: Take the first subsequent transaction data corresponding to each of the remaining P first subsequent state positions and the second subsequent transaction data corresponding to each of the remaining Q second subsequent state positions as the target subsequent transaction data, where 0<=P<=M and 0<=Q<=N.
- Step S704: Sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
- In this embodiment, when the M first subsequent identifiers and the N second subsequent identifiers have subsequent identifiers in common, deleting the subsequent transaction data corresponding to the later of the first subsequent state position and the second subsequent state position ensures the accuracy of the remaining subsequent transaction data, and thus of the generated target subsequent transaction data chain.
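Both implementations of step S404 can be sketched in one function: with no shared identifiers nothing is deleted and the records are simply sorted. The record type `Subsequent`, its field names, and `build_chain` are hypothetical. The tie rule (equal state positions drop the first data center's copy) is an assumption chosen to match the example, which keeps the second payment transaction data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subsequent:
    identifier: str      # data unique identifier (subsequent identifier)
    state_position: int  # position of its status in the state machine
    time: str            # transaction time point, "HH:MM:SS"
    center: str          # "first" or "second" data center (illustration only)


def build_chain(first, second):
    """Merge M first and N second subsequent records into a time-sorted chain."""
    first_by_id = {r.identifier: r for r in first}
    second_by_id = {r.identifier: r for r in second}
    # S701: identifiers present on both sides form the identifier matching groups.
    for ident in first_by_id.keys() & second_by_id.keys():
        # S702: delete the record whose state position is later.
        # Assumption: on a tie the first center's copy is the one removed.
        if first_by_id[ident].state_position >= second_by_id[ident].state_position:
            del first_by_id[ident]
        else:
            del second_by_id[ident]
    # S703: the remaining P first and Q second records are the targets.
    targets = [*first_by_id.values(), *second_by_id.values()]
    # S704: sort by transaction time point to obtain the chain.
    return sorted(targets, key=lambda r: r.time)
```

With the example data (first payment and first return on one side, second payment on the other), the sketch yields the chain second payment, then first return.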
- Optionally, this application also provides two other data fusion methods:
- The first other data fusion method, for the case where the data unique identifiers of a first data to be compared and a second data to be compared are only partly the same, includes the following steps, as shown in Figure 8:
- Step S801: From the plurality of first data to be compared and the plurality of second data to be compared, determine at least one data pair to be compared whose application service unique identifiers differ and whose central service unique identifiers are the same.
- Each data pair to be compared includes one first data to be compared and one second data to be compared.
- Step S802: For each of the at least one data pair to be compared, if the transaction time point of the first data to be compared in the pair is earlier than the transaction time point of the second data to be compared in the pair, use the first data to be compared in the pair to update the second data to be compared in the pair.
- In this embodiment, this update method ensures the accuracy of the second data to be compared in the data pair to be compared.
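Steps S801 and S802 can be sketched as follows. The dictionary keys `app_id`, `central_id`, and `time`, and the function name `fuse_by_central_id`, are hypothetical; interpreting "update" as replacing the second record with a copy of the first is also an assumption.

```python
def fuse_by_central_id(first_records, second_records):
    """Return the second center's records after applying steps S801 and S802.

    Each record is a dict with 'app_id', 'central_id' and 'time' keys
    (assumed names for the two unique identifiers and the transaction time).
    """
    first_by_central = {r["central_id"]: r for r in first_records}
    updated = []
    for second in second_records:
        first = first_by_central.get(second["central_id"])
        if first is None or first["app_id"] == second["app_id"]:
            # S801: not a pair with different application identifiers
            # and the same central identifier, so it is left unchanged.
            updated.append(second)
        elif first["time"] < second["time"]:
            # S802: the first record is earlier, so it updates the second.
            updated.append(dict(first))
        else:
            updated.append(second)
    return updated
```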
- The second other data fusion method, for the case where a data unique identifier exists only among the plurality of first data to be compared, includes the following steps:
- Determine each data unique identifier that exists only among the plurality of first data to be compared as a second data identifier, and add the first data to be compared corresponding to the second data identifier to the second data center.
- In this embodiment, adding the first data to be compared corresponding to a second data identifier, which exists only in the first data center, to the second data center ensures the data integrity of the second data center.
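A minimal sketch of this second method, assuming records are dicts keyed by the hypothetical names `app_id` and `central_id` and that a record counts as missing when that identifier pair does not occur in the second data center:

```python
def add_missing(first_records, second_records):
    """Return the second center's records plus those found only in the first."""

    def key(record):
        # Data unique identifier: application identifier plus central identifier.
        return (record["app_id"], record["central_id"])

    present = {key(r) for r in second_records}
    only_in_first = [r for r in first_records if key(r) not in present]
    return list(second_records) + only_in_first
```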
- Optionally, in step S203 above, after the second target data is updated based on the first transaction status of the first target data and the second transaction status of the second target data, the following two possible verification implementations are also included:
- The first possible verification implementation: for a first attribute identifier in the second target data, determine whether the first attribute value corresponding to the first attribute identifier is within a preset range; if not, add the second target data to the exception file.
- For example, if the first attribute identifier is the payment amount, the payment amount value must be greater than or equal to 0.
- In this embodiment, the second target data is verified based on the relationship between the first attribute value corresponding to the first attribute identifier and the preset range, which ensures the accuracy of the second target data.
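The range check of the first verification implementation can be sketched as below. The function name `collect_exceptions`, the attribute key `payment_amount`, and the use of an in-memory list in place of the exception file are all assumptions made for illustration.

```python
def collect_exceptions(records, attribute, minimum=None, maximum=None):
    """Return the records whose attribute value lies outside the preset range.

    A bound of None means that side of the range is unbounded.
    """
    exception_file = []
    for record in records:
        value = record[attribute]
        too_low = minimum is not None and value < minimum
        too_high = maximum is not None and value > maximum
        if too_low or too_high:
            # Out of range: queue the record for manual review.
            exception_file.append(record)
    return exception_file
```

For the payment amount example, calling `collect_exceptions(records, "payment_amount", minimum=0)` would flag any record with a negative payment amount.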
- The second possible verification implementation: for a first attribute identifier in the second target data, determine the second attribute identifier associated with the first attribute identifier, and determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship; if so, add the second target data to the exception file.
- The exception file is used for manual review.
- For example, the first attribute identifier is the payment amount and the second attribute identifier is the transaction amount.
- The preset relationship between the payment amount and the transaction amount is: the payment amount value is less than or equal to the transaction amount value.
- In this embodiment, the second target data is verified based on the relationship between the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier, which ensures the accuracy of the second target data.
- The data fusion apparatus 900 includes:
- The acquisition module 901, configured to acquire a plurality of first data to be compared within a preset period from a first data center, and acquire a plurality of second data to be compared within the preset period from a second data center; the preset period is determined based on the data center switching time point;
- The determining module 902, configured to determine the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers;
- The update module 903, configured to, for any first data identifier, obtain the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as the first target data, and obtain the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as the second target data; and update the second target data based on the first transaction status of the first target data and the second transaction status of the second target data.
- Optionally, the update module 903 is specifically configured to:
- If the first transaction status in the first target data is not empty, the second transaction status in the second target data is not empty, and the first transaction status and the second transaction status are different, determine, based on the preset transaction state machine, the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status;
- If the first state position is after the second state position, use the first target data to update the second target data.
- Optionally, the update module 903 is further configured to:
- If the first transaction status and the second transaction status are the same, compare the transaction time point of the first target data with the transaction time point of the second target data;
- If the transaction time point of the first target data is later than the transaction time point of the second target data, use the first target data to update the second target data.
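- Optionally, the update module 903 is specifically configured to:
- If the first transaction status in the first target data is empty and the second transaction status in the second target data is empty, determine the M first subsequent transaction data corresponding to the first target data and the N second subsequent transaction data corresponding to the second target data, determine their subsequent identifiers and subsequent state positions, determine the target subsequent transaction data chain from them, and update the second transaction status in the second target data based on the target subsequent transaction data chain.
- Optionally, the update module 903 is specifically configured to:
- If the M first subsequent identifiers and the N second subsequent identifiers have no subsequent identifier in common, take the first subsequent transaction data corresponding to each of the M first subsequent state positions and the second subsequent transaction data corresponding to each of the N second subsequent state positions as the target subsequent transaction data;
- Sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.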
- Optionally, the update module 903 is further configured to:
- If the M first subsequent identifiers and the N second subsequent identifiers have subsequent identifiers in common, group each first subsequent identifier and second subsequent identifier that are the same into one group, obtaining at least one identifier matching group; take the first subsequent identifier in the identifier matching group as the first matching identifier, and the second subsequent identifier in the identifier matching group as the second matching identifier;
- For any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; of these two state positions, delete the subsequent transaction data corresponding to the later one;
- Take the first subsequent transaction data corresponding to each of the remaining P first subsequent state positions and the second subsequent transaction data corresponding to each of the remaining Q second subsequent state positions as the target subsequent transaction data, where 0<=P<=M and 0<=Q<=N;
- Sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
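- Optionally, the update module 903 is specifically configured to:
- For any two adjacent target subsequent transaction data in the target subsequent transaction data chain, determine their first position relationship based on the preset transaction state machine; determine their second position relationship in the target subsequent transaction data chain; if the first position relationship and the second position relationship are the same, determine the second transaction status in the second target data based on the transaction status corresponding to each target subsequent transaction data in the target subsequent transaction data chain.
- Optionally, the data unique identifier includes an application service unique identifier and a central service unique identifier; the update module 903 is further configured to:
- From the plurality of first data to be compared and the plurality of second data to be compared, determine at least one data pair to be compared whose application service unique identifiers differ and whose central service unique identifiers are the same; the data pair to be compared includes a first data to be compared and a second data to be compared;
- For the at least one data pair to be compared, if the transaction time point of the first data to be compared in the pair is earlier than the transaction time point of the second data to be compared in the pair, use the first data to be compared in the pair to update the second data to be compared in the pair.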
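- Optionally, a verification module 904 is also included, and the verification module 904 is specifically configured to:
- After the second target data is updated based on the first transaction status of the first target data and the second transaction status of the second target data, for a first attribute identifier in the second target data, determine whether the first attribute value corresponding to the first attribute identifier is within a preset range; if not, add the second target data to the exception file;
- For a first attribute identifier in the second target data, determine the second attribute identifier associated with the first attribute identifier, and determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship; if so, add the second target data to the exception file; the exception file is used for manual review.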
- The computer device may be a terminal or a server. As shown in Figure 10, it includes at least one processor 1001 and a memory 1002 connected to the at least one processor.
- The embodiments of this application do not limit the specific connection medium between the processor 1001 and the memory 1002.
- In Figure 10, the processor 1001 and the memory 1002 are connected by a bus as an example.
- The bus can be divided into an address bus, a data bus, a control bus, and so on.
- The memory 1002 stores instructions executable by the at least one processor 1001. By executing the instructions stored in the memory 1002, the at least one processor 1001 can perform the steps of the above data fusion method.
- The processor 1001 is the control center of the computer device. It can connect the various parts of the computer device through various interfaces and lines, and it performs data fusion by running or executing the instructions stored in the memory 1002 and calling the data stored in the memory 1002.
- Optionally, the processor 1001 may include one or more processing units.
- The processor 1001 may integrate an application processor and a modem processor.
- The application processor mainly handles the operating system, user interface, application programs, and so on, and the modem processor mainly handles wireless communications.
- It can be understood that the modem processor may also not be integrated into the processor 1001.
- In some embodiments, the processor 1001 and the memory 1002 can be implemented on the same chip; in other embodiments, they can be implemented on separate chips.
- The processor 1001 can be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application-specific integrated circuit (ASIC), a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this application.
- A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
- As a non-volatile computer-readable storage medium, the memory 1002 can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules.
- The memory 1002 may include at least one type of storage medium, for example flash memory, a hard disk, a multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, a magnetic disk, an optical disc, and so on.
- The memory 1002 may also be, without limitation, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- The memory 1002 in the embodiments of this application can also be a circuit or any other device capable of realizing a storage function, used to store program instructions and/or data.
- Embodiments of the present application provide a computer-readable storage medium that stores a computer program executable by a computer device.
- When the program is run on the computer device, it causes the computer device to execute the steps of the above data fusion method.
- Embodiments of the present application also provide a computer program product. The computer program product includes a computer program stored on a computer-readable storage medium.
- The computer program includes program instructions. When the program instructions are executed by a computer, the computer is caused to execute the steps of the above data fusion method.
- Those skilled in the art will understand that embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
- These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
- These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
Abstract
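A data fusion method, apparatus, device, and storage medium, relating to the field of data processing technology. The method includes: obtaining a plurality of first data to be compared within a preset period from a first data center, and obtaining a plurality of second data to be compared within the preset period from a second data center (S201); then determining the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers (S202). For any first data identifier, the first data to be compared corresponding to the first data identifier is obtained as first target data, and the second data to be compared corresponding to the first data identifier is obtained as second target data; the second target data is updated based on a first transaction status of the first target data and a second transaction status of the second target data (S203). The method takes full account of the order of the transaction statuses in a transaction scenario, so that the updated second target data is more accurate.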
Description
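Cross-Reference to Related Applications
This application claims priority to the Chinese patent application No. 202210996533.4, filed with the China Patent Office on August 19, 2022 and entitled "Data fusion method, apparatus, device, and storage medium", the entire contents of which are incorporated herein by reference.
Embodiments of the present invention relate to the field of data processing technology, and in particular to a data fusion method, apparatus, device, and storage medium.
With the rapid development of Internet technology, business systems keep growing in scale, and the losses caused by each technical failure can be immeasurable. To improve disaster recovery capability, current business systems generally adopt a multi-site active-active architecture: data centers are set up at different geographical locations, and each of them can provide business services externally. The data stored in the different data centers serve as backups of each other. Because there is a synchronization delay when data centers back up data, at any given point in time the data stored in different data centers are not completely consistent.
For a given business service, suppose data center A provides the service externally. When data center A suffers an equipment failure, the service is usually switched to another data center, which continues to provide it. However, because of the synchronization delay between data centers, the other data center may not hold the data corresponding to the service, or its data may be inconsistent with the data in data center A, which causes errors in the service provided by the other data center.
At present, a preset period that includes the switching time point is generally determined, the data within that period is obtained from data center A as source data, and the source data is copied to the other data center. When the source data is inconsistent with the data in the other data center, the decision is made by update time, and the data with the later update time is used for the update. This approach can lead to problems such as missing data and cannot guarantee the data integrity and continuity of the data center.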
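Summary of the Invention
Embodiments of the present application provide a data fusion method, apparatus, device, and storage medium, used to ensure the data integrity and continuity of a data center.
In one aspect, an embodiment of the present application provides a data fusion method, the method including:
obtaining a plurality of first data to be compared within a preset period from a first data center, and obtaining a plurality of second data to be compared within the preset period from a second data center; the preset period is determined based on the data center switching time point;
determining the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers;
for any first data identifier, obtaining the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as first target data, and obtaining the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as second target data; and updating the second target data based on a first transaction status of the first target data and a second transaction status of the second target data.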
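Optionally, updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data includes:
if the first transaction status in the first target data is not empty, the second transaction status in the second target data is not empty, and the first transaction status and the second transaction status are different, determining, based on a preset transaction state machine, a first state position corresponding to the first transaction status and a second state position corresponding to the second transaction status;
if the first state position is after the second state position, using the first target data to update the second target data.
Optionally, the method further includes:
if the first transaction status and the second transaction status are the same, comparing the transaction time point of the first target data with the transaction time point of the second target data;
if the transaction time point of the first target data is later than the transaction time point of the second target data, using the first target data to update the second target data.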
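Optionally, updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data includes:
if the first transaction status in the first target data is empty and the second transaction status in the second target data is empty, determining M first subsequent transaction data corresponding to the first target data and N second subsequent transaction data corresponding to the second target data, where M>=0 and N>=0;
determining the data unique identifier of each of the M first subsequent transaction data as a first subsequent identifier, and determining the data unique identifier of each of the N second subsequent transaction data as a second subsequent identifier;
based on the preset transaction state machine, determining the first subsequent state position corresponding to each of the M first subsequent transaction data, and determining the second subsequent state position corresponding to each of the N second subsequent transaction data;
determining a target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions;
updating the second transaction status in the second target data based on the target subsequent transaction data chain.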
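Optionally, determining the target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions, includes:
if the M first subsequent identifiers and the N second subsequent identifiers have no subsequent identifier in common, taking the first subsequent transaction data corresponding to each of the M first subsequent state positions and the second subsequent transaction data corresponding to each of the N second subsequent state positions as target subsequent transaction data;
sorting the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
Optionally, the method further includes:
if the M first subsequent identifiers and the N second subsequent identifiers have subsequent identifiers in common, grouping each first subsequent identifier and second subsequent identifier that are the same into one group to obtain at least one identifier matching group; taking the first subsequent identifier in the identifier matching group as a first matching identifier, and the second subsequent identifier in the identifier matching group as a second matching identifier;
for any identifier matching group, determining the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; of the first subsequent state position and the second subsequent state position, deleting the subsequent transaction data corresponding to the later state position;
taking the first subsequent transaction data corresponding to each of the remaining P first subsequent state positions and the second subsequent transaction data corresponding to each of the remaining Q second subsequent state positions as target subsequent transaction data, where 0<=P<=M and 0<=Q<=N;
sorting the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.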
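Optionally, updating the second transaction status in the second target data based on the target subsequent transaction data chain includes:
for any two adjacent target subsequent transaction data in the target subsequent transaction data chain, determining, based on the preset transaction state machine, a first position relationship corresponding to the two adjacent target subsequent transaction data;
determining a second position relationship of the two adjacent target subsequent transaction data in the target subsequent transaction data chain;
if the first position relationship and the second position relationship are the same, determining the second transaction status in the second target data based on the transaction status corresponding to each target subsequent transaction data in the target subsequent transaction data chain.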
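Optionally, the data unique identifier includes an application service unique identifier and a central service unique identifier; the method further includes:
from the plurality of first data to be compared and the plurality of second data to be compared, determining at least one data pair to be compared whose application service unique identifiers differ and whose central service unique identifiers are the same; the data pair to be compared includes a first data to be compared and a second data to be compared;
for the at least one data pair to be compared, if the transaction time point of the first data to be compared in the pair is earlier than the transaction time point of the second data to be compared in the pair, using the first data to be compared in the pair to update the second data to be compared in the pair.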
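Optionally, after updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data, the method further includes:
for a first attribute identifier in the second target data, determining whether the first attribute value corresponding to the first attribute identifier is within a preset range; if not, adding the second target data to an exception file;
for a first attribute identifier in the second target data, determining the second attribute identifier associated with the first attribute identifier, and determining whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship; if so, adding the second target data to the exception file; the exception file is used for manual review.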
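In one aspect, an embodiment of the present application provides a data fusion apparatus, the apparatus including:
an acquisition module, configured to obtain a plurality of first data to be compared within a preset period from a first data center, and obtain a plurality of second data to be compared within the preset period from a second data center; the preset period is determined based on the data center switching time point;
a determining module, configured to determine the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers;
an update module, configured to, for any first data identifier, obtain the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as first target data, and obtain the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as second target data; and update the second target data based on the first transaction status of the first target data and the second transaction status of the second target data.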
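Optionally, the update module is specifically configured to:
if the first transaction status in the first target data is not empty, the second transaction status in the second target data is not empty, and the first transaction status and the second transaction status are different, determine, based on the preset transaction state machine, the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status;
if the first state position is after the second state position, use the first target data to update the second target data.
Optionally, the update module is further configured to:
if the first transaction status and the second transaction status are the same, compare the transaction time point of the first target data with the transaction time point of the second target data;
if the transaction time point of the first target data is later than the transaction time point of the second target data, use the first target data to update the second target data.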
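Optionally, the update module is specifically configured to:
if the first transaction status in the first target data is empty and the second transaction status in the second target data is empty, determine the M first subsequent transaction data corresponding to the first target data and the N second subsequent transaction data corresponding to the second target data, where M>=0 and N>=0;
determine the data unique identifier of each of the M first subsequent transaction data as a first subsequent identifier, and determine the data unique identifier of each of the N second subsequent transaction data as a second subsequent identifier;
based on the preset transaction state machine, determine the first subsequent state position corresponding to each of the M first subsequent transaction data, and determine the second subsequent state position corresponding to each of the N second subsequent transaction data;
determine the target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions;
update the second transaction status in the second target data based on the target subsequent transaction data chain.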
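Optionally, the update module is specifically configured to:
if the M first subsequent identifiers and the N second subsequent identifiers have no subsequent identifier in common, take the first subsequent transaction data corresponding to each of the M first subsequent state positions and the second subsequent transaction data corresponding to each of the N second subsequent state positions as target subsequent transaction data;
sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
Optionally, the update module is further configured to:
if the M first subsequent identifiers and the N second subsequent identifiers have subsequent identifiers in common, group each first subsequent identifier and second subsequent identifier that are the same into one group to obtain at least one identifier matching group; take the first subsequent identifier in the identifier matching group as the first matching identifier, and the second subsequent identifier in the identifier matching group as the second matching identifier;
for any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; of the first subsequent state position and the second subsequent state position, delete the subsequent transaction data corresponding to the later state position;
take the first subsequent transaction data corresponding to each of the remaining P first subsequent state positions and the second subsequent transaction data corresponding to each of the remaining Q second subsequent state positions as target subsequent transaction data, where 0<=P<=M and 0<=Q<=N;
sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.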
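Optionally, the update module is specifically configured to:
for any two adjacent target subsequent transaction data in the target subsequent transaction data chain, determine, based on the preset transaction state machine, the first position relationship corresponding to the two adjacent target subsequent transaction data;
determine the second position relationship of the two adjacent target subsequent transaction data in the target subsequent transaction data chain;
if the first position relationship and the second position relationship are the same, determine the second transaction status in the second target data based on the transaction status corresponding to each target subsequent transaction data in the target subsequent transaction data chain.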
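Optionally, the data unique identifier includes an application service unique identifier and a central service unique identifier; the update module is further configured to:
from the plurality of first data to be compared and the plurality of second data to be compared, determine at least one data pair to be compared whose application service unique identifiers differ and whose central service unique identifiers are the same; the data pair to be compared includes a first data to be compared and a second data to be compared;
for the at least one data pair to be compared, if the transaction time point of the first data to be compared in the pair is earlier than the transaction time point of the second data to be compared in the pair, use the first data to be compared in the pair to update the second data to be compared in the pair.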
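Optionally, a verification module is also included, and the verification module is specifically configured to:
after the second target data is updated based on the first transaction status of the first target data and the second transaction status of the second target data, for a first attribute identifier in the second target data, determine whether the first attribute value corresponding to the first attribute identifier is within a preset range; if not, add the second target data to an exception file;
for a first attribute identifier in the second target data, determine the second attribute identifier associated with the first attribute identifier, and determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship; if so, add the second target data to the exception file; the exception file is used for manual review.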
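In one aspect, an embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of the above data fusion method are implemented.
In one aspect, an embodiment of the present application provides a computer-readable storage medium that stores a computer program executable by a computer device; when the program is run on the computer device, it causes the computer device to execute the steps of the above data fusion method.
In one aspect, an embodiment of the present application provides a computer program product, the computer program product including a computer program stored on a computer-readable storage medium; the computer program includes program instructions which, when executed by a computer device, cause the computer device to execute the steps of the above data fusion method.
In the embodiments of the present application, a plurality of first data to be compared within a preset period is obtained from a first data center, and a plurality of second data to be compared within the preset period is obtained from a second data center; the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared are then determined as first data identifiers. For any first data identifier, the first data to be compared corresponding to the first data identifier is obtained from the plurality of first data to be compared as first target data, and the second data to be compared corresponding to the first data identifier is obtained from the plurality of second data to be compared as second target data; the second target data is updated based on the first transaction status of the first target data and the second transaction status of the second target data. This application does not simply compare the update time of the first target data with the update time of the second target data. Instead, it updates the second target data based on the first transaction status of the first target data and the second transaction status of the second target data, taking full account of the order of the transaction statuses in the transaction scenario. The updated second target data is therefore more accurate, and the data integrity and continuity of the second data center are ensured.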
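To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic diagram of a system architecture provided by an embodiment of the present application;
Figure 2 is a schematic flowchart of a data fusion method provided by an embodiment of the present application;
Figure 3 is a schematic flowchart of a method for updating second target data provided by an embodiment of the present application;
Figure 4 is a schematic flowchart of a method for updating second target data provided by an embodiment of the present application;
Figure 5 is a schematic structural diagram of a consumption business state machine provided by an embodiment of the present application;
Figure 6 is a schematic flowchart of a method for updating a second transaction status provided by an embodiment of the present application;
Figure 7 is a schematic flowchart of a method for determining a target subsequent transaction data chain provided by an embodiment of the present application;
Figure 8 is a schematic flowchart of another data fusion method provided by an embodiment of the present application;
Figure 9 is a schematic structural diagram of a data fusion apparatus provided by an embodiment of the present application;
Figure 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
To make the purpose, technical solutions, and beneficial effects of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.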
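Refer to Figure 1, which is an architecture diagram of a data fusion system to which the embodiments of the present application apply. The architecture includes at least a first data center 101 and a second data center 102.
The first data center 101 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms.
The second data center 102 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms.
The first data center 101 and the second data center 102 may be connected directly in a wired or wireless manner, or may be connected through an intermediate server.
The first data center 101 provides business services externally. When the first data center 101 suffers an equipment failure, service is switched to the second data center 102, which continues to provide business services externally. When a data center switch occurs, the data fusion system 104 in the second data center 102 obtains a plurality of first data to be compared within a preset period from the first data center, and obtains a plurality of second data to be compared within the preset period from the second data center, where the preset period is determined based on the data center switching time point. It determines the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers. For any first data identifier, it obtains the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as first target data, and obtains the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as second target data. It then updates the second target data based on the first transaction status of the first target data and the second transaction status of the second target data, obtaining second updated data.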
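Based on the system architecture shown in Figure 1, an embodiment of the present application provides the flow of a data fusion method, as shown in Figure 2. The flow is executed by the data fusion system in the second data center 102 shown in Figure 1 and includes the following steps:
Step S201: Obtain a plurality of first data to be compared within a preset period from the first data center, and obtain a plurality of second data to be compared within the preset period from the second data center.
Specifically, the preset period is determined based on the data center switching time point. The preset duration may be the synchronization delay of the data centers, or the sum of the synchronization delay and a specified delay.
The start of the preset period is the data center switching time point minus the preset duration, and the end of the preset period is the data center switching time point plus the preset duration.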
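The window computation described above can be sketched with the standard `datetime` module. The function name `preset_period` and its parameter names are hypothetical, and the delay values used when calling it are illustrative only.

```python
from datetime import datetime, timedelta


def preset_period(switch_time, sync_delay, extra_delay=timedelta(0)):
    """Return (start, end) of the preset period around the switching time point.

    The preset duration is the synchronization delay, optionally plus a
    specified extra delay.
    """
    duration = sync_delay + extra_delay
    return switch_time - duration, switch_time + duration
```

For a switch at 10:00:00 with a 30-second synchronization delay and a 10-second specified delay, the period would run from 09:59:20 to 10:00:40.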
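Step S202: Determine the data unique identifiers that are the same among the plurality of first data to be compared and the plurality of second data to be compared as first data identifiers.
Specifically, the data unique identifier includes an application service unique identifier and a central service unique identifier. The application service unique identifier is determined by the business service and sent to the data center, and does not vary between data centers; the central service unique identifier is determined by the data center and may vary between data centers.
In this application, data unique identifiers being the same means that the application service unique identifiers are the same and the central service unique identifiers are the same.
The first data to be compared may be basic transaction data or subsequent transaction data. For example, a consumption scenario includes the operation steps consumption and return; the data generated by the consumption operation is basic transaction data, and the data generated by the return operation is subsequent transaction data.
When the first data to be compared is basic transaction data, it includes multiple attribute identifiers: application service unique identifier, central service unique identifier, transaction time point, transaction type, transaction status, transaction amount, payment amount, and cumulative returned amount.
When the first data to be compared is subsequent transaction data, it includes multiple attribute identifiers: application service unique identifier, central service unique identifier, transaction time point, transaction type, transaction status, transaction amount, payment amount, and cumulative returned amount, and it also includes the application service unique identifier of the associated basic transaction data.
The second data to be compared is similar to the first data to be compared and is not described again here.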
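The record layout and the identifier matching of step S202 can be sketched as follows. The class `TransactionRecord`, its field names, and the function `first_data_identifiers` are hypothetical; only the list of attributes comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransactionRecord:
    app_id: str                 # application service unique identifier
    central_id: str             # central service unique identifier
    time: str                   # transaction time point
    transaction_type: str
    status: Optional[str]       # transaction status; None when empty
    amount: float               # transaction amount
    paid: float                 # payment amount
    returned_total: float       # cumulative returned amount
    base_app_id: Optional[str] = None  # set only on subsequent transaction data


def first_data_identifiers(first, second):
    """Return identifier pairs present in both centers (step S202).

    Two data unique identifiers match only if both the application
    identifier and the central identifier are the same.
    """

    def identifiers(records):
        return {(r.app_id, r.central_id) for r in records}

    return identifiers(first) & identifiers(second)
```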
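Step S203: For any first data identifier, obtain the first data to be compared corresponding to the first data identifier from the plurality of first data to be compared as first target data, and obtain the second data to be compared corresponding to the first data identifier from the plurality of second data to be compared as second target data; update the second target data based on the first transaction status of the first target data and the second transaction status of the second target data.
Specifically, the transaction status in the first target data is taken as the first transaction status, and the transaction status in the second target data is taken as the second transaction status.
Within the same transaction scenario, the transaction status can take different values. For example, in a consumption scenario, the transaction status may be order success, payment success, order failure, payment failure, and so on.
If the first transaction status is not empty and the second transaction status is not empty, determine the transaction order of the first transaction status and the second transaction status, and update the second target data based on that order to obtain second updated data.
If the first transaction status is empty and the second transaction status is empty, determine the M first subsequent transaction data corresponding to the first target data and the N second subsequent transaction data corresponding to the second target data, and update the second target data based on the M first subsequent transaction data and the N second subsequent transaction data to obtain second updated data.
In the embodiments of the present application, a plurality of first data to be compared within a preset period is obtained from the first data center, and a plurality of second data to be compared within the preset period is obtained from the second data center; the data unique identifiers that are the same among them are then determined as first data identifiers. For any first data identifier, the corresponding first data to be compared is obtained as first target data and the corresponding second data to be compared is obtained as second target data; the second target data is updated based on the first transaction status of the first target data and the second transaction status of the second target data. This application does not simply compare the update times of the first target data and the second target data. Instead, it updates the second target data based on the two transaction statuses, taking full account of the order of the transaction statuses in the transaction scenario. The updated second target data is therefore more accurate, and the data integrity and continuity of the second data center are ensured.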
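Optionally, in step S203 above, updating the second target data based on the first transaction status of the first target data and the second transaction status of the second target data includes the following two possible implementations:
The first possible implementation, for the case where the first transaction status in the first target data is not empty and the second transaction status in the second target data is not empty, includes the following steps, as shown in Figure 3:
Step S301: Determine whether the first transaction status and the second transaction status are the same. If not, go to step S302; otherwise, go to step S305.
Step S302: Based on the preset transaction state machine, determine the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status.
Step S303: Determine whether the first state position is after the second state position. If so, go to step S304; otherwise, end.
Step S304: Use the first target data to update the second target data, and end.
Step S305: Determine whether the transaction time point of the first target data is later than the transaction time point of the second target data. If so, go to step S304; otherwise, end.
In this embodiment, for the case where the first transaction status in the first target data is not empty and the second transaction status in the second target data is not empty, the first state position corresponding to the first transaction status and the second state position corresponding to the second transaction status are determined based on the preset transaction state machine, and the second target data is updated based on the position relationship between the two, so that the updated second target data is more accurate.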
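The decision flow of steps S301 to S305 can be sketched as a single predicate. The status names, the numeric `STATE_POSITION` table, and the function name `should_update` are assumptions; the text only states that positions are derived from the preset transaction state machine.

```python
# Assumed positions of the statuses in a consumption state machine.
STATE_POSITION = {
    "order_success": 0,
    "payment_success": 1,
    "cancel_success": 2,
    "return_success": 2,
}


def should_update(first, second):
    """Decide whether the first target data should overwrite the second.

    `first` and `second` are dicts with non-empty 'status' and 'time' values.
    """
    if first["status"] != second["status"]:  # S301
        # S302 and S303: compare positions in the state machine.
        return STATE_POSITION[first["status"]] > STATE_POSITION[second["status"]]
    # S305: same status, so fall back to the transaction time point.
    return first["time"] > second["time"]
```

A caller would apply step S304 (use the first target data to update the second target data) only when `should_update` returns True.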
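In the second possible implementation, for the case where the first transaction state in the first target data is empty and the second transaction state in the second target data is empty, the method includes the following steps, shown in Figure 4:
Step S401: determine the M first subsequent transaction data items corresponding to the first target data and the N second subsequent transaction data items corresponding to the second target data, where M >= 0 and N >= 0.
Specifically, because subsequent transaction data carries the application-service unique identifier of its associated base transaction data, the M subsequent transaction data items associated with the first target data can be identified among the multiple subsequent transaction data items and taken as the M first subsequent transaction data items; likewise, the N subsequent transaction data items associated with the second target data can be identified and taken as the N second subsequent transaction data items.
Step S402: determine the unique data identifier of each of the M first subsequent transaction data items as a first subsequent identifier, and determine the unique data identifier of each of the N second subsequent transaction data items as a second subsequent identifier.
Step S403: based on the preset transaction state machine, determine a first subsequent state position for each of the M first subsequent transaction data items and a second subsequent state position for each of the N second subsequent transaction data items.
Specifically, different business scenarios have different preset transaction state machines; a consumption business, for example, has a consumption business state machine. A preset transaction state machine includes multiple transaction states and the state transition paths between them. The state transition paths follow the actual order of transactions, so the order of the transaction states can be determined from them.
For example, the consumption business state machine shown in Figure 5 includes four transaction states: order placed S1, payment succeeded S2, cancellation succeeded S3, and return succeeded S4. It includes four state transition paths: place order V1, pay V2, cancel V3, and return V4. The state transition path between order placed S1 and payment succeeded S2 is pay V2; between order placed S1 and cancellation succeeded S3 it is cancel V3; between payment succeeded S2 and cancellation succeeded S3 it is cancel V3; and between payment succeeded S2 and return succeeded S4 it is return V4.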
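The consumption business state machine of Figure 5 can be written down as a table of transitions, from which "state A comes before state B" follows by reachability. The sketch below is an assumption about how such an ordering might be derived; the names `TRANSITIONS` and `precedes` are illustrative. The path V1 is omitted because the text does not say which state it starts from.

```python
# Transitions listed in the text: (from_state, to_state) -> transition path.
TRANSITIONS = {
    ("S1", "S2"): "V2 pay",
    ("S1", "S3"): "V3 cancel",
    ("S2", "S3"): "V3 cancel",
    ("S2", "S4"): "V4 return",
}


def precedes(a, b, transitions=TRANSITIONS):
    """True if state b can be reached from state a through one or more transitions."""
    frontier, seen = [a], set()
    while frontier:
        state = frontier.pop()
        for src, dst in transitions:
            if src == state and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return b in seen
```

Under this reading, order placed S1 precedes return succeeded S4 (through S2), while cancellation succeeded S3 and return succeeded S4 are not ordered relative to each other.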
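Step S404: determine a target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions.
Specifically, the target subsequent transaction data chain is composed of first subsequent transaction data and second subsequent transaction data.
Suppose the first target data corresponds to two first subsequent transaction data items, namely first payment transaction data and first return transaction data, and the second target data corresponds to one second subsequent transaction data item, namely second payment transaction data.
Suppose that, from the first subsequent identifiers and first subsequent state positions of the two first subsequent transaction data items, and the second subsequent identifier and second subsequent state position of the one second subsequent transaction data item, the second payment transaction data and the first return transaction data are determined to be the target subsequent transaction data.
Suppose the transaction time point of the second payment transaction data is 10:00:00 and the transaction time point of the first return transaction data is 10:00:05. Sorting these two target subsequent transaction data items by transaction time point yields the target subsequent transaction data chain: second payment transaction data - first return transaction data.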
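The ordering step in this example is a plain sort by transaction time point. A minimal sketch, using the two items and times from the example; the dictionary layout is an assumption:

```python
from datetime import time

# The two target subsequent transaction data items from the example, deliberately unsorted.
target_items = [
    {"name": "first return transaction data", "time": time(10, 0, 5)},
    {"name": "second payment transaction data", "time": time(10, 0, 0)},
]

# Sort by transaction time point to obtain the chain.
chain = sorted(target_items, key=lambda item: item["time"])
chain_names = [item["name"] for item in chain]
```

After sorting, `chain_names` lists the second payment transaction data first and the first return transaction data second, matching the chain stated in the text.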
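Step S405: update the second transaction state in the second target data based on the target subsequent transaction data chain.
Specifically, updating the second transaction state in the second target data includes the following steps, shown in Figure 6:
Step S601: for any two adjacent target subsequent transaction data items in the target subsequent transaction data chain, determine a first positional relationship between them based on the preset transaction state machine.
Specifically, first determine the transaction state of each of the two adjacent target subsequent transaction data items; then, based on the preset transaction state machine, determine the state position of each of those transaction states; finally, determine the first positional relationship from the two state positions.
Step S602: determine a second positional relationship of the two adjacent target subsequent transaction data items in the target subsequent transaction data chain.
Step S603: if the first positional relationship and the second positional relationship are the same, determine the second transaction state in the second target data based on the transaction states of the target subsequent transaction data items in the chain. If the first positional relationship and the second positional relationship differ, submit the second target data for manual review.
In addition, the attribute values of other attribute identifiers in the second target data can be determined from the attribute values of the corresponding attribute identifiers of the target subsequent transaction data items in the chain.
For example, suppose the target subsequent transaction data chain is second payment transaction data - first return transaction data. Based on the preset transaction state machine, the transaction state of the second payment transaction data is determined to be payment succeeded S2 and the transaction state of the first return transaction data is determined to be return succeeded S4. The first positional relationship between them is therefore: the second payment transaction data is before the first return transaction data.
The second positional relationship of the second payment transaction data and the first return transaction data in the target subsequent transaction data chain is determined to be: the second payment transaction data is before the first return transaction data.
Because the first positional relationship and the second positional relationship are the same, the second transaction state in the second target data is determined from the transaction states of the second payment transaction data and the first return transaction data.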
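Steps S601 to S603 can be read as a consistency check between the time-ordered chain and the state machine. The sketch below makes several assumptions that the text does not settle: state positions are taken from a hypothetical `STATE_POSITION` mapping, two adjacent items with equal positions are treated as inconsistent, and the resulting second transaction state is taken to be the state of the last item in the chain.

```python
# Hypothetical positions for payment succeeded (S2) and return succeeded (S4).
STATE_POSITION = {"paid": 2, "returned": 4}


def resolve_second_state(chain, position=STATE_POSITION):
    """Return (consistent, state); consistent=False means 'send to manual review'."""
    for earlier, later in zip(chain, chain[1:]):
        # The chain order (second positional relationship) says 'earlier' comes first;
        # the state machine (first positional relationship) must agree.
        if not position[earlier["state"]] < position[later["state"]]:
            return False, None
    # Assumption: the final state in a consistent chain becomes the second transaction state.
    final_state = chain[-1]["state"] if chain else None
    return True, final_state


chain = [
    {"name": "second payment transaction data", "state": "paid"},
    {"name": "first return transaction data", "state": "returned"},
]
consistent, second_state = resolve_second_state(chain)
```

For the example chain the two relationships agree, so the check passes. Reversing the chain would make the state machine order contradict the chain order, and the record would be routed to manual review.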
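In this embodiment, for the case where the first transaction state in the first target data is empty and the second transaction state in the second target data is empty, a target subsequent transaction data chain is determined from the M first subsequent transaction data items corresponding to the first target data and the N second subsequent transaction data items corresponding to the second target data, and the second transaction state in the second target data is updated based on that chain, so the updated second target data is more accurate.
Optionally, in step S404 above, determining the target subsequent transaction data chain includes the following two possible implementations:
In the first possible implementation, for the case where no identifier is shared between the M first subsequent identifiers and the N second subsequent identifiers, the method includes the following steps:
First, take the first subsequent transaction data items corresponding to the M first subsequent state positions and the second subsequent transaction data items corresponding to the N second subsequent state positions as the target subsequent transaction data; then sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
In this embodiment, when no identifier is shared between the M first subsequent identifiers and the N second subsequent identifiers, the target subsequent transaction data chain is determined directly from the M first subsequent transaction data items and the N second subsequent transaction data items, which improves the efficiency of generating the target subsequent transaction data.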
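In the second possible implementation, for the case where the M first subsequent identifiers and the N second subsequent identifiers share at least one identifier, the method includes the following steps, shown in Figure 7:
Step S701: group each first subsequent identifier with the second subsequent identifier that is the same, obtaining at least one identifier matching group; take the first subsequent identifier in a matching group as a first matching identifier and the second subsequent identifier in that group as a second matching identifier.
Step S702: for any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; delete the subsequent transaction data corresponding to whichever of the two state positions is later.
Step S703: take the first subsequent transaction data items corresponding to the remaining P first subsequent state positions and the second subsequent transaction data items corresponding to the remaining Q second subsequent state positions as the target subsequent transaction data.
Step S704: sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
In this embodiment, when the M first subsequent identifiers and the N second subsequent identifiers share an identifier, deleting the subsequent transaction data whose state position is the later of the first and second subsequent state positions ensures the accuracy of the remaining subsequent transaction data, and hence the accuracy of the generated target subsequent transaction data chain.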
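Steps S701 to S704 can be sketched as a merge keyed by subsequent identifier. The code assumes each item is a dictionary with hypothetical keys `id`, `state`, `time` and `source`, and a hypothetical `position` mapping for state positions. The text does not say what happens when both items in a matching group have the same state position; the sketch keeps the second data center's copy in that case, which is an assumption chosen to agree with the payment and return example given for step S404.

```python
from datetime import time


def build_target_chain(first_items, second_items, position):
    """Merge subsequent transaction data from both data centers into a time-ordered chain."""
    first_by_id = {item["id"]: item for item in first_items}
    second_by_id = {item["id"]: item for item in second_items}
    kept = []
    for item_id in first_by_id.keys() | second_by_id.keys():
        a, b = first_by_id.get(item_id), second_by_id.get(item_id)
        if a is None or b is None:
            kept.append(a if a is not None else b)  # no matching group: keep as is
        elif position[a["state"]] < position[b["state"]]:
            kept.append(a)  # b has the later state position and is deleted (S702)
        else:
            kept.append(b)  # a is later, or tie (assumption): a is deleted
    return sorted(kept, key=lambda item: item["time"])  # S704


position = {"paid": 2, "returned": 4}  # hypothetical state positions
first_items = [
    {"id": "p", "state": "paid", "time": time(10, 0, 1), "source": "first"},
    {"id": "r", "state": "returned", "time": time(10, 0, 5), "source": "first"},
]
second_items = [{"id": "p", "state": "paid", "time": time(10, 0, 0), "source": "second"}]
chain = build_target_chain(first_items, second_items, position)
```

With this sample data the chain contains the second data center's payment item followed by the first data center's return item.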
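Optionally, this application also provides two other data fusion methods:
The first other data fusion method, for the case where the unique data identifiers of the first to-be-compared data and the second to-be-compared data are only partly the same, includes the following steps, shown in Figure 8:
Step S801: from the multiple first to-be-compared data items and the multiple second to-be-compared data items, determine at least one to-be-compared data pair whose application-service unique identifiers differ and whose center-service unique identifiers are the same.
Each to-be-compared data pair includes first to-be-compared data and second to-be-compared data.
Step S802: for the at least one to-be-compared data pair, if the transaction time point of the first to-be-compared data in the pair is earlier than the transaction time point of the second to-be-compared data in the pair, use the first to-be-compared data in the pair to update the second to-be-compared data in the pair.
In this embodiment, for to-be-compared data pairs whose application-service unique identifiers differ and whose center-service unique identifiers are the same, the above update method ensures the accuracy of the second to-be-compared data in the pair.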
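Steps S801 and S802 can be sketched as a lookup by center-service identifier followed by a time comparison. The record keys (`app_id`, `center_id`, `time`, `state`) are hypothetical, the center-service identifier is assumed to be unique within each data center's records, and "update" is assumed to mean overwriting every field of the second record in place.

```python
from datetime import time


def fuse_partial_matches(first_records, second_records):
    """Update second records that share a center id, but not an app id, with a first record."""
    second_by_center = {r["center_id"]: r for r in second_records}
    updated = []
    for first in first_records:
        second = second_by_center.get(first["center_id"])
        if second is None or second["app_id"] == first["app_id"]:
            continue  # not a pair in the sense of step S801
        if first["time"] < second["time"]:  # S802: the first record is earlier
            second.update(first)  # assumption: full overwrite, in place
            updated.append(first["center_id"])
    return updated


# Hypothetical sample data.
first_records = [
    {"app_id": "A1", "center_id": "C1", "time": time(10, 0, 0), "state": "paid"},
]
second_records = [
    {"app_id": "A2", "center_id": "C1", "time": time(10, 0, 3), "state": "order_placed"},
]
updated = fuse_partial_matches(first_records, second_records)
```

The function returns the center-service identifiers of the records it changed, which makes the effect of a run easy to audit.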
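The second other data fusion method, for the case of unique data identifiers that exist only in the multiple first to-be-compared data items, includes the following steps:
Take each unique data identifier that exists only in the multiple first to-be-compared data items as a second data identifier; obtain from the multiple first to-be-compared data items the first to-be-compared data corresponding to the second data identifier, and add that first to-be-compared data to the second data center.
In this embodiment, the first to-be-compared data corresponding to a second data identifier that exists only in the first data center is added to the second data center, which ensures the data integrity of the second data center.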
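Back-filling records that exist only in the first data center is a set difference over identifiers. In the sketch below the second data center is modelled as a plain list, and the keys `app_id` and `center_id` are hypothetical. How the application treats records that share only one of the two identifier parts is covered by the separate method for partly matching identifiers, so the sample data avoids that case.

```python
def record_key(record):
    """Unique data identifier as an (app id, center id) pair."""
    return (record["app_id"], record["center_id"])


def add_missing(first_records, second_records):
    """Append records found only in the first data center to the second; return them."""
    second_keys = {record_key(r) for r in second_records}
    missing = [r for r in first_records if record_key(r) not in second_keys]
    second_records.extend(missing)
    return missing


# Hypothetical sample data.
first_records = [
    {"app_id": "A1", "center_id": "C1"},
    {"app_id": "A3", "center_id": "C3"},  # exists only in the first data center
]
second_records = [{"app_id": "A1", "center_id": "C1"}]
missing = add_missing(first_records, second_records)
```

Running the function a second time adds nothing, since the identifiers are then present on both sides.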
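Optionally, in step S203 above, after the second target data is updated based on the first transaction state of the first target data and the second transaction state of the second target data, the method further includes the following two possible verification implementations:
In the first possible verification implementation, for a first attribute identifier in the second target data, determine whether the first attribute value corresponding to the first attribute identifier is within a preset range; if not, add the second target data to an exception file.
For example, the first attribute identifier is the payment amount, and the payment amount must be greater than or equal to 0.
In this embodiment, the second target data is verified according to the relationship between the first attribute value of the first attribute identifier and the preset range, which ensures the accuracy of the second target data.
In the second possible verification implementation, for a first attribute identifier in the second target data, determine a second attribute identifier associated with the first attribute identifier, and determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship; if so, add the second target data to the exception file.
The exception file is used for manual review.
For example, the first attribute identifier is the payment amount and the second attribute identifier is the transaction amount; the preset relationship that the payment amount and the transaction amount satisfy is: the payment amount is less than or equal to the transaction amount.
In this embodiment, the second target data is verified according to the relationship between the first attribute value of the first attribute identifier and the second attribute value of the second attribute identifier, which ensures the accuracy of the second target data.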
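The two verification checks can be sketched as follows. The field names and the use of integer amounts are hypothetical. One point needs care: the text says a record is added to the exception file when the preset relationship is satisfied, while the example relationship (payment amount less than or equal to transaction amount) reads like a validity condition. The sketch assumes a record is flagged when it falls outside the valid range or when the payment amount exceeds the transaction amount; that is one interpretation, not a statement of the claimed method.

```python
def find_problems(record):
    """Return a list of reasons why a record should go to the exception file."""
    problems = []
    # Range check: the payment amount must be greater than or equal to 0.
    if record["payment_amount"] < 0:
        problems.append("payment_amount out of range")
    # Cross-field check (assumed direction): payment must not exceed the transaction amount.
    if record["payment_amount"] > record["transaction_amount"]:
        problems.append("payment_amount exceeds transaction_amount")
    return problems


# Hypothetical updated second target data, amounts in minor currency units.
records = [
    {"id": "ok", "payment_amount": 900, "transaction_amount": 1000},
    {"id": "negative", "payment_amount": -1, "transaction_amount": 1000},
    {"id": "overpaid", "payment_amount": 1200, "transaction_amount": 1000},
]
exception_file = [r["id"] for r in records if find_problems(r)]
```

Collecting the reasons alongside each record, rather than a bare flag, would give the manual reviewer something concrete to act on.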
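Based on the same technical concept, an embodiment of this application provides a data fusion apparatus. As shown in Figure 9, the data fusion apparatus 900 includes:
an acquisition module 901, configured to obtain multiple first to-be-compared data items within a preset period from a first data center and multiple second to-be-compared data items within the preset period from a second data center, where the preset period is determined based on a data center switchover time point;
a determination module 902, configured to determine the unique data identifiers that are the same among the multiple first to-be-compared data items and the multiple second to-be-compared data items as first data identifiers;
an update module 903, configured to, for any first data identifier, obtain from the multiple first to-be-compared data items the first to-be-compared data corresponding to the first data identifier as first target data, and obtain from the multiple second to-be-compared data items the second to-be-compared data corresponding to the first data identifier as second target data; and update the second target data based on a first transaction state of the first target data and a second transaction state of the second target data.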
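Optionally, the update module 903 is specifically configured to:
if the first transaction state in the first target data is non-empty, the second transaction state in the second target data is non-empty, and the first transaction state and the second transaction state are different, determine, based on a preset transaction state machine, a first state position corresponding to the first transaction state and a second state position corresponding to the second transaction state;
if the first state position is after the second state position, update the second target data with the first target data.
Optionally, the update module 903 is further configured to:
if the first transaction state and the second transaction state are the same, compare the transaction time point of the first target data with the transaction time point of the second target data;
if the transaction time point of the first target data is later than the transaction time point of the second target data, update the second target data with the first target data.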
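Optionally, the update module 903 is specifically configured to:
if the first transaction state in the first target data is empty and the second transaction state in the second target data is empty, determine the M first subsequent transaction data items corresponding to the first target data and the N second subsequent transaction data items corresponding to the second target data, where M >= 0 and N >= 0;
determine the unique data identifier of each of the M first subsequent transaction data items as a first subsequent identifier, and determine the unique data identifier of each of the N second subsequent transaction data items as a second subsequent identifier;
based on a preset transaction state machine, determine a first subsequent state position for each of the M first subsequent transaction data items and a second subsequent state position for each of the N second subsequent transaction data items;
determine a target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions;
update the second transaction state in the second target data based on the target subsequent transaction data chain.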
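Optionally, the update module 903 is specifically configured to:
if no identifier is shared between the M first subsequent identifiers and the N second subsequent identifiers, take the first subsequent transaction data items corresponding to the M first subsequent state positions and the second subsequent transaction data items corresponding to the N second subsequent state positions as target subsequent transaction data;
sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
Optionally, the update module 903 is further configured to:
if the M first subsequent identifiers and the N second subsequent identifiers share an identifier, group each first subsequent identifier with the second subsequent identifier that is the same, obtaining at least one identifier matching group; take the first subsequent identifier in the identifier matching group as a first matching identifier and the second subsequent identifier in the identifier matching group as a second matching identifier;
for any identifier matching group, determine the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; delete the subsequent transaction data corresponding to whichever of the first subsequent state position and the second subsequent state position is later;
take the first subsequent transaction data items corresponding to the remaining P first subsequent state positions and the second subsequent transaction data items corresponding to the remaining Q second subsequent state positions as target subsequent transaction data, where 0 <= P <= M and 0 <= Q <= N;
sort the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.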
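Optionally, the update module 903 is specifically configured to:
for any two adjacent target subsequent transaction data items in the target subsequent transaction data chain, determine, based on the preset transaction state machine, a first positional relationship between the two adjacent target subsequent transaction data items;
determine a second positional relationship of the two adjacent target subsequent transaction data items in the target subsequent transaction data chain;
if the first positional relationship and the second positional relationship are the same, determine the second transaction state in the second target data based on the transaction states of the target subsequent transaction data items in the target subsequent transaction data chain.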
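Optionally, the unique data identifier includes an application-service unique identifier and a center-service unique identifier, and the update module 903 is further configured to:
from the multiple first to-be-compared data items and the multiple second to-be-compared data items, determine at least one to-be-compared data pair whose application-service unique identifiers differ and whose center-service unique identifiers are the same, where the to-be-compared data pair includes first to-be-compared data and second to-be-compared data;
for the at least one to-be-compared data pair, if the transaction time point of the first to-be-compared data in the pair is earlier than the transaction time point of the second to-be-compared data in the pair, use the first to-be-compared data in the pair to update the second to-be-compared data in the pair.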
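Optionally, the apparatus further includes a verification module 904, which is specifically configured to:
after the second target data is updated based on the first transaction state of the first target data and the second transaction state of the second target data, for a first attribute identifier in the second target data, determine whether the first attribute value corresponding to the first attribute identifier is within a preset range, and if not, add the second target data to an exception file;
for a first attribute identifier in the second target data, determine a second attribute identifier associated with the first attribute identifier, determine whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship, and if so, add the second target data to the exception file; the exception file is used for manual review.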
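Based on the same technical concept, an embodiment of this application provides a computer device, which may be a terminal or a server. As shown in Figure 10, it includes at least one processor 1001 and a memory 1002 connected to the at least one processor. This embodiment does not limit the specific connection medium between the processor 1001 and the memory 1002; in Figure 10, the processor 1001 and the memory 1002 are connected by a bus as an example. The bus may be divided into an address bus, a data bus, a control bus, and so on.
In this embodiment, the memory 1002 stores instructions executable by the at least one processor 1001, and by executing the instructions stored in the memory 1002 the at least one processor 1001 can perform the steps of the data fusion method described above.
The processor 1001 is the control center of the computer device. It can connect the parts of the computer device through various interfaces and lines, and performs data fusion by running or executing the instructions stored in the memory 1002 and calling the data stored in the memory 1002. Optionally, the processor 1001 may include one or more processing units, and may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 1001. In some embodiments the processor 1001 and the memory 1002 may be implemented on the same chip; in other embodiments they may be implemented on separate chips.
The processor 1001 may be a general-purpose processor, for example a central processing unit (CPU), a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be executed directly by a hardware processor, or by a combination of hardware and software modules in a processor.
As a non-volatile computer-readable storage medium, the memory 1002 can store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 1002 may include at least one type of storage medium, for example flash memory, a hard disk, a multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, a magnetic disk, an optical disc, and so on. The memory 1002 may be any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto. The memory 1002 in the embodiments of this application may also be a circuit or any other apparatus capable of providing a storage function, used to store program instructions and/or data.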
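Based on the same inventive concept, an embodiment of this application provides a computer-readable storage medium storing a computer program executable by a computer device; when the program runs on the computer device, it causes the computer device to perform the steps of the data fusion method described above.
Based on the same inventive concept, an embodiment of this application provides a computer program product. The computer program product includes a computer program stored on a computer-readable storage medium, and the computer program includes program instructions that, when executed by a computer, cause the computer to perform the steps of the data fusion method described above.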
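Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
This application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Obviously, those skilled in the art can make various changes and variations to this application without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of this application and their technical equivalents, this application is intended to cover them as well.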
Claims (13)
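- A data fusion method, characterized by comprising: obtaining multiple first to-be-compared data items within a preset period from a first data center, and obtaining multiple second to-be-compared data items within the preset period from a second data center, the preset period being determined based on a data center switchover time point; determining the unique data identifiers that are the same among the multiple first to-be-compared data items and the multiple second to-be-compared data items as first data identifiers; for any first data identifier, obtaining from the multiple first to-be-compared data items the first to-be-compared data corresponding to the first data identifier as first target data, and obtaining from the multiple second to-be-compared data items the second to-be-compared data corresponding to the first data identifier as second target data; and updating the second target data based on a first transaction state of the first target data and a second transaction state of the second target data.
- The method according to claim 1, characterized in that updating the second target data based on the first transaction state of the first target data and the second transaction state of the second target data comprises: if the first transaction state in the first target data is non-empty, the second transaction state in the second target data is non-empty, and the first transaction state and the second transaction state are different, determining, based on a preset transaction state machine, a first state position corresponding to the first transaction state and a second state position corresponding to the second transaction state; and if the first state position is after the second state position, updating the second target data with the first target data.
- The method according to claim 2, characterized by further comprising: if the first transaction state and the second transaction state are the same, comparing the transaction time point of the first target data with the transaction time point of the second target data; and if the transaction time point of the first target data is later than the transaction time point of the second target data, updating the second target data with the first target data.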
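- The method according to claim 1, characterized in that updating the second target data based on the first transaction state of the first target data and the second transaction state of the second target data comprises: if the first transaction state in the first target data is empty and the second transaction state in the second target data is empty, determining M first subsequent transaction data items corresponding to the first target data and N second subsequent transaction data items corresponding to the second target data, where M >= 0 and N >= 0; determining the unique data identifier of each of the M first subsequent transaction data items as a first subsequent identifier, and determining the unique data identifier of each of the N second subsequent transaction data items as a second subsequent identifier; based on a preset transaction state machine, determining a first subsequent state position for each of the M first subsequent transaction data items and a second subsequent state position for each of the N second subsequent transaction data items; determining a target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions; and updating the second transaction state in the second target data based on the target subsequent transaction data chain.
- The method according to claim 4, characterized in that determining the target subsequent transaction data chain based on the M first subsequent identifiers and N second subsequent identifiers obtained, together with the M first subsequent state positions and N second subsequent state positions, comprises: if no identifier is shared between the M first subsequent identifiers and the N second subsequent identifiers, taking the first subsequent transaction data items corresponding to the M first subsequent state positions and the second subsequent transaction data items corresponding to the N second subsequent state positions as target subsequent transaction data; and sorting the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
- The method according to claim 5, characterized by further comprising: if the M first subsequent identifiers and the N second subsequent identifiers share an identifier, grouping each first subsequent identifier with the second subsequent identifier that is the same, obtaining at least one identifier matching group; taking the first subsequent identifier in the identifier matching group as a first matching identifier and the second subsequent identifier in the identifier matching group as a second matching identifier; for any identifier matching group, determining the first subsequent state position corresponding to the first matching identifier and the second subsequent state position corresponding to the second matching identifier; deleting the subsequent transaction data corresponding to whichever of the first subsequent state position and the second subsequent state position is later; taking the first subsequent transaction data items corresponding to the remaining P first subsequent state positions and the second subsequent transaction data items corresponding to the remaining Q second subsequent state positions as target subsequent transaction data, where 0 <= P <= M and 0 <= Q <= N; and sorting the target subsequent transaction data by transaction time point to obtain the target subsequent transaction data chain.
- The method according to claim 4, characterized in that updating the second transaction state in the second target data based on the target subsequent transaction data chain comprises: for any two adjacent target subsequent transaction data items in the target subsequent transaction data chain, determining, based on the preset transaction state machine, a first positional relationship between the two adjacent target subsequent transaction data items; determining a second positional relationship of the two adjacent target subsequent transaction data items in the target subsequent transaction data chain; and if the first positional relationship and the second positional relationship are the same, determining the second transaction state in the second target data based on the transaction states of the target subsequent transaction data items in the target subsequent transaction data chain.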
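- The method according to claim 1, characterized in that the unique data identifier includes an application-service unique identifier and a center-service unique identifier, and the method further comprises: from the multiple first to-be-compared data items and the multiple second to-be-compared data items, determining at least one to-be-compared data pair whose application-service unique identifiers differ and whose center-service unique identifiers are the same, the to-be-compared data pair including first to-be-compared data and second to-be-compared data; and for the at least one to-be-compared data pair, if the transaction time point of the first to-be-compared data in the pair is earlier than the transaction time point of the second to-be-compared data in the pair, using the first to-be-compared data in the pair to update the second to-be-compared data in the pair.
- The method according to claim 1, characterized in that, after updating the second target data based on the first transaction state of the first target data and the second transaction state of the second target data, the method further comprises: for a first attribute identifier in the second target data, determining whether the first attribute value corresponding to the first attribute identifier is within a preset range, and if not, adding the second target data to an exception file; for a first attribute identifier in the second target data, determining a second attribute identifier associated with the first attribute identifier, determining whether the first attribute value corresponding to the first attribute identifier and the second attribute value corresponding to the second attribute identifier satisfy a preset relationship, and if so, adding the second target data to the exception file; the exception file being used for manual review.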
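- A data fusion apparatus, characterized by comprising: an acquisition module, configured to obtain multiple first to-be-compared data items within a preset period from a first data center and multiple second to-be-compared data items within the preset period from a second data center, the preset period being determined based on a data center switchover time point; a determination module, configured to determine the unique data identifiers that are the same among the multiple first to-be-compared data items and the multiple second to-be-compared data items as first data identifiers; and an update module, configured to, for any first data identifier, obtain from the multiple first to-be-compared data items the first to-be-compared data corresponding to the first data identifier as first target data, and obtain from the multiple second to-be-compared data items the second to-be-compared data corresponding to the first data identifier as second target data, and to update the second target data based on a first transaction state of the first target data and a second transaction state of the second target data.
- A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 9.
- A computer-readable storage medium, characterized in that it stores a computer program executable by a computer device, and when the program runs on the computer device, it causes the computer device to perform the steps of the method according to any one of claims 1 to 9.
- A computer program product, characterized in that the computer program product includes a computer program stored on a computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by a computer device, they cause the computer device to perform the steps of the method according to any one of claims 1 to 9.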
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210996533.4A CN115438723A (zh) | 2022-08-19 | 2022-08-19 | Data fusion method, apparatus, device and storage medium |
CN202210996533.4 | 2022-08-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024036829A1 (zh) | 2024-02-22 |
Family
ID=84242631
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/137357 WO2024036829A1 (zh) | Data fusion method, apparatus, device and storage medium | 2022-08-19 | 2022-12-07 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115438723A (zh) |
WO (1) | WO2024036829A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115438723A (zh) * | 2022-08-19 | 2022-12-06 | 中国银联股份有限公司 | Data fusion method, apparatus, device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
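CN103391311A (zh) * | 2013-06-24 | 2013-11-13 | 北京奇虎科技有限公司 | Method and system for verifying data consistency between multiple platforms |
CN103605703A (zh) * | 2013-11-08 | 2014-02-26 | 北京奇虎科技有限公司 | Method and system for detecting data consistency between multiple platforms |
WO2020259598A1 (zh) * | 2019-06-27 | 2020-12-30 | 网联清算有限公司 | Transaction data processing method, apparatus, device and system |
WO2021233049A1 (zh) * | 2020-05-20 | 2021-11-25 | 腾讯科技(深圳)有限公司 | Blockchain-based data processing method, apparatus, device and readable storage medium |
CN113837878A (zh) * | 2021-09-07 | 2021-12-24 | 中国银联股份有限公司 | Data comparison method, apparatus, device and storage medium |
CN115438723A (zh) * | 2022-08-19 | 2022-12-06 | 中国银联股份有限公司 | Data fusion method, apparatus, device and storage medium |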
-
2022
- 2022-08-19 CN CN202210996533.4A patent/CN115438723A/zh active Pending
- 2022-12-07 WO PCT/CN2022/137357 patent/WO2024036829A1/zh unknown
Also Published As
Publication number | Publication date |
---|---|
CN115438723A (zh) | 2022-12-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
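US8364636B2 (en) | Real time data replication | |
CN112597153B (zh) | Blockchain-based data storage method, apparatus and storage medium | |
CN109766330B (zh) | Data sharding method, apparatus, electronic device and storage medium | |
EP3869434A1 (en) | Blockchain-based data processing method and apparatus, device, and medium | |
WO2021051782A1 (zh) | Blockchain consensus method, apparatus and device | |
CN112015595B (zh) | Master-slave database switching method, computing device and storage medium | |
US20160044096A1 (en) | Scaling Up and Scaling Out of a Server Architecture for Large Scale Real-Time Applications | |
WO2024036829A1 (zh) | Data fusion method, apparatus, device and storage medium | |
US11044312B2 (en) | Storage segment server covered cache | |
CN115237444A (zh) | Version-number-based concurrency control method, apparatus, device and storage medium | |
US10671482B2 (en) | Providing consistency in a distributed data store | |
CN115033551A (zh) | Database migration method, apparatus, electronic device and storage medium | |
CN111694801A (zh) | Data deduplication method and apparatus applied to fault recovery | |
JP7416768B2 (ja) | Method, apparatus and system for non-disruptively upgrading a distributed coordination engine in a distributed computing environment | |
CN109710698B (zh) | Data aggregation method, apparatus, electronic device and medium | |
CN117082046A (zh) | Data upload method, apparatus, device and storage medium | |
CN114138182B (zh) | Cross-storage online cloning method, system and apparatus for distributed cloud disks | |
CN115562805A (zh) | Resource migration method, apparatus and electronic device | |
CN114116676A (zh) | Data migration method, apparatus, electronic device and computer-readable storage medium | |
CN115098231A (zh) | Cross-data-center transaction processing method, apparatus and device | |
CN114385657A (zh) | Data storage method, apparatus and storage medium | |
US11734230B2 (en) | Traffic redundancy deduplication for blockchain recovery | |
CN112860694B (zh) | Business data processing method, apparatus and device | |
CN118277344B (zh) | Inter-layer merging method and apparatus for storage nodes of a distributed key-value storage system | |
US20240118878A1 (en) | Method and system for determining optimization applicability on intermediate representation from program |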
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22955598; Country of ref document: EP; Kind code of ref document: A1 |