CN115037729A - Data aggregation method and device, electronic equipment and computer readable medium - Google Patents

Info

Publication number
CN115037729A
Authority
CN
China
Prior art keywords
data
operation time
time
segment
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210437349.6A
Other languages
Chinese (zh)
Inventor
李宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202210437349.6A priority Critical patent/CN115037729A/en
Publication of CN115037729A publication Critical patent/CN115037729A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information
    • H04L63/123 Applying verification of the received information received data contents, e.g. message integrity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/143 Termination or inactivation of sessions, e.g. event-controlled end of session
    • H04L67/145 Termination or inactivation of sessions, e.g. event-controlled end of session avoiding end of session, e.g. keep-alive, heartbeats, resumption message or wake-up for inactive or interrupted session
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/146 Markers for unambiguous identification of a particular session, e.g. session cookie or URL-encoding

Abstract

The invention discloses a data aggregation method, a data aggregation device, electronic equipment and a computer-readable medium, and relates to the technical field of big data acquisition. One embodiment of the method comprises: in response to a convergence task for the current data segment, cyclically executing the following steps until the data segments of the data set are determined to meet a set condition, and then outputting the final data set: acquiring the reference operation time and the reference data identifier corresponding to the data segment gathered last time, and determining that a data item whose operation time equals the reference operation time exists in the current data segment; acquiring, from the current data segment, the data items whose operation time equals the reference operation time and whose data identifier comes after the reference data identifier, together with the data items whose operation time is later than the reference operation time, as the data segment gathered this time; and integrating the data segment gathered this time into the data set according to a set data set rule. In this embodiment, the data set changes dynamically during the aggregation process.

Description

Data aggregation method and device, electronic equipment and computer readable medium
Technical Field
The invention relates to the technical field of big data acquisition, in particular to a data aggregation method and device, electronic equipment and a computer readable medium.
Background
In the design and development of a cross-provincial government affairs platform, the main problem facing business-logic development is how to aggregate the item data of multiple provinces. The conventional data aggregation scheme uses data snapshots: the data set at a certain moment is extracted as temporary static data, and the required data is then aggregated by cleaning and loading that temporary static data.
A data snapshot captures complete static data at a single moment. However, a data set affected by user behavior in a specific project is generally changing dynamically, and the snapshot approach ignores these changes.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data aggregation method, an apparatus, an electronic device, and a computer-readable medium. The method obtains the operation time and the data identifier of the last aggregated data segment, obtains the data segment to be aggregated this time from the current data segment based on that operation time and data identifier, and integrates the data segment aggregated this time into a data set. The data set changes dynamically during the aggregation process, and a complete data set can be obtained.
To achieve the above object, according to an aspect of an embodiment of the present invention, a data aggregation method is provided.
The data aggregation method of the embodiment of the invention comprises the following steps: responding to the convergence task of the current data segment, circularly executing the following steps until the data segment of the data set is determined to meet the set condition, and outputting a final data set:
acquiring the operation time and the data identification of the data fragment gathered last time as corresponding reference operation time and reference data identification, and determining that a data item with the operation time equal to the reference operation time exists in the current data fragment;
acquiring a data item with the operation time equal to the reference operation time and the data identifier behind the reference data identifier and a data item with the operation time later than the reference operation time from the current data segment as the data segment gathered at this time;
and integrating the data fragments gathered this time into the data set according to a set data set rule.
Optionally, after the step of obtaining the operation time and the data identifier of the last aggregated data segment as the corresponding reference operation time and reference data identifier, the method further includes:
judging whether a data item with the operation time equal to the reference operation time exists in the current data fragment;
and if the data item with the operation time equal to the reference operation time does not exist in the current data fragment, acquiring the data item with the operation time later than the reference operation time from the current data fragment as the data fragment gathered at this time.
Optionally, the data set rule is an ascending order rule with time as a main rule and data identification as a secondary rule;
the integrating the data segments gathered this time into the data set includes: and arranging the data fragments aggregated this time to the end of the data set.
Optionally, the method further comprises: and when the convergence task is interrupted, determining that the data segments of the data set do not meet the set conditions, and resuming the execution of the convergence task.
Optionally, the setting conditions are: the number of data segments of the data set is equal to the set number threshold.
Optionally, the method obtains the final data set by passing the reference operation time and the reference data identifier into a convergence interface as input parameters and calling the convergence interface cyclically.
Optionally, the current data segment is event data of a government event transmitted over an HTTP network.
To achieve the above object, according to another aspect of the embodiments of the present invention, a data convergence device is provided.
The data convergence device of the embodiment of the invention comprises: a response module, configured to, in response to the convergence task of the current data segment, cyclically execute the processing of the determining module, the first acquiring module and the integrating module until the data segments of the data set meet the set condition, and output a final data set; wherein:
the determining module is configured to obtain operation time and data identifier of a last aggregated data segment as corresponding reference operation time and reference data identifier, and determine that a data item whose operation time is equal to the reference operation time exists in the current data segment;
the first obtaining module is configured to obtain, from the current data segment, a data item whose operation time is equal to the reference operation time and whose data identifier is behind the reference data identifier, and a data item whose operation time is later than the reference operation time as a data segment to be aggregated this time;
and the integration module is used for integrating the data fragments aggregated this time into the data set according to a set data set rule.
Optionally, the apparatus further comprises: the judging module is used for judging whether a data item with the operation time equal to the reference operation time exists in the current data segment or not;
and the second acquisition module is used for acquiring a data item with the operation time later than the reference operation time from the current data fragment as the data fragment gathered at this time if the data item with the operation time equal to the reference operation time does not exist in the current data fragment.
Optionally, the data set rule is an ascending order rule with time as a main rule and data identification as a secondary rule;
the integration module is further configured to arrange the data segments aggregated this time to the end of the data set.
Optionally, the apparatus further comprises: and the recovery module is used for determining that the data segments of the data set do not meet the set conditions when the convergence task is interrupted, and recovering to execute the convergence task.
Optionally, the setting conditions are: the number of data segments of the data set is equal to the set number threshold.
Optionally, the device obtains the final data set by passing the reference operation time and the reference data identifier into a convergence interface as input parameters and calling the convergence interface cyclically.
Optionally, the current data segment is event data of a government event transmitted over an HTTP network.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided an electronic apparatus.
An electronic device according to an embodiment of the present invention includes: one or more processors; a storage device, configured to store one or more programs, which when executed by the one or more processors, cause the one or more processors to implement a data aggregation method according to an embodiment of the present invention.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a computer-readable medium.
A computer-readable medium of an embodiment of the present invention stores thereon a computer program, which when executed by a processor implements a data aggregation method of an embodiment of the present invention.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided a computer program product.
A computer program product according to an embodiment of the present invention includes a computer program, and when the computer program is executed by a processor, the computer program implements a data aggregation method according to an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: the operation time and the data identification of the data fragment converged last time are obtained, the data fragment converged this time is obtained from the current data fragment based on the obtained operation time and the data identification, the data fragment converged this time is further integrated into the data set, the data set dynamically changes in the convergence process, and a complete data set can be obtained.
The current data segment is preferentially split using the operation time; when the operation time alone cannot reflect the change of the data items, the operation time and the data identifier are used together to split the current data segment, so that a suitable data segment is obtained for this aggregation. An ascending order rule with time as the primary key and data identifier as the secondary key is used as the data set rule, so that the obtained data set can describe temporal and spatial changes.
After the convergence task is interrupted, it can be recovered automatically, achieving seamless (imperceptible) aggregation. Setting a condition for exiting the loop provides good support for incremental aggregation and for recovery from interrupted convergence tasks, and reduces the pressure the convergence task places on the server. Implementing the convergence logic as a convergence interface makes the implementation simple and fast, with a low development workload and high development efficiency. Government affairs data can be aggregated automatically over HTTP network communication; the approach is widely applicable and can support different development environments and security policies.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a diagram illustrating the main steps of a data aggregation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the main flow of a data aggregation method according to another embodiment of the present invention;
FIG. 3 is a system architecture diagram of a data aggregation method according to still another embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating the main flow of a data aggregation method according to still another embodiment of the present invention;
FIG. 5 is a schematic diagram of the main modules of a data aggregation device according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 7 is a block diagram of a computer system suitable for use with the electronic device to implement an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the technical solution of the present invention, the acquisition, storage, use and processing of data all comply with the relevant provisions of national laws and regulations.
As described in the background, a conventional data aggregation scheme is a data snapshot approach. A snapshot is a fully available copy of a given set of data that includes an image of the corresponding data at some point in time (i.e., the point in time at which the copy began). The snapshot may be a copy of the data it represents or a replica of the data.
The operating environment of the cross-provincial government affairs platform is a network interface based on the HTTP transmission protocol. Each data segment has an upper limit on transmitted content, so the network interface must be called multiple times, at different times, to obtain the complete data to be aggregated. Therefore, the data snapshot approach can only acquire data segments that are taken at an indefinite time, are limited in space (content), and satisfy established rules; it cannot integrate data segments that are changing dynamically. It also has a long response time and a high development cost.
In order to solve the above problem, the present embodiment provides a new data aggregation method, which is suitable for integrating data segments in dynamic changes and can obtain a complete data set. The following detailed description is made with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of main steps of a data aggregation method according to an embodiment of the present invention. As shown in fig. 1, the data aggregation method according to the embodiment of the present invention mainly includes the following steps:
Step S101: In response to a convergence task for the current data segment, acquire the operation time and the data identifier of the last converged data segment as the corresponding reference operation time and reference data identifier, and determine that a data item whose operation time equals the reference operation time exists in the current data segment. The current data segment may be a portion of data intercepted from the target data (such as network data) and may include one or more data items. When a data item changes, the time at which the change occurs is recorded; this is referred to herein as the operation time.
After receiving a convergence task of a current data segment, acquiring operation time and a data identifier of the last converged data segment, taking the acquired operation time as reference operation time, and taking the acquired data identifier as a reference data identifier. And searching the data items with the operation time equal to the reference operation time based on the operation time of each data item in the current data segment.
Step S102: Acquire, from the current data segment, the data items whose operation time equals the reference operation time and whose data identifier comes after the reference data identifier, together with the data items whose operation time is later than the reference operation time, as the data segment gathered this time. The data identifier is the unique identifier of a data item and may be used to indicate its storage location in the target data. In an embodiment, the larger the data identifier, the later the data item is stored in the target data.
To acquire the data items whose operation time equals the reference operation time and whose data identifier comes after the reference data identifier, the data items whose operation time equals the reference operation time may first be obtained from the current data segment, and the data items whose data identifier is larger than the reference data identifier are then selected from them. Acquiring the data items whose operation time is later than the reference operation time means acquiring, from the current data segment, the data items generated after the reference operation time.
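As a minimal sketch of the selection in step S102, the two conditions can be written as a filter over the current data segment. The `DataItem` type, its field names, and the use of integer timestamps are assumptions made for illustration; the text does not prescribe a representation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataItem:
    data_id: int         # hypothetical unique identifier; larger means stored later
    operation_time: int  # hypothetical timestamp of the last change


def select_segment(current: List[DataItem], ref_time: int, ref_id: int) -> List[DataItem]:
    """Pick the items to aggregate this time (step S102)."""
    # Items changed at the reference time but stored after the reference item.
    same_time = [
        item for item in current
        if item.operation_time == ref_time and item.data_id > ref_id
    ]
    # Items changed after the reference time.
    later = [item for item in current if item.operation_time > ref_time]
    return same_time + later


current_segment = [DataItem(1, 100), DataItem(2, 100), DataItem(3, 100), DataItem(4, 101)]
selected = select_segment(current_segment, ref_time=100, ref_id=2)
print([item.data_id for item in selected])
```

In this example items 1 and 2 are taken to have been aggregated last time, so only items 3 and 4 are selected.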
Step S103: Integrate the data segment gathered this time into the data set according to a set data set rule. The data set rule describes how data segments are integrated; in the embodiment, it may be an arrangement rule describing temporal and spatial (i.e., content) changes. The data segment gathered this time is arranged into the data set according to the data set rule.
Step S104: Judge whether the data segments of the data set meet the set condition. If not, execute step S101; otherwise, execute step S105. The condition for exiting the loop is preset. Each time a data segment is integrated into the data set, it is judged whether the data segments of the data set meet the set condition; if the set condition is not met, steps S101 to S104 are executed again in a loop.
Step S105: Output the final data set. If the data segments of the data set meet the set condition, the integration of the data set is complete and the final data set is output.
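The loop formed by steps S101 to S105 might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: items are modelled as `(operation_time, data_id)` tuples, `fetch` is a hypothetical callable returning the current data segment, the exit condition is taken to be a threshold on the number of aggregated segments, and empty segments are counted so that the loop always terminates.

```python
from typing import Callable, List, Optional, Tuple

Item = Tuple[int, int]  # (operation_time, data_id), hypothetical layout


def select(current: List[Item], ref: Optional[Item]) -> List[Item]:
    """S101-S102: keep items that come after the reference in (time, id) order."""
    if ref is None:  # first run: nothing aggregated yet (assumption)
        return sorted(current)
    ref_time, ref_id = ref
    return sorted(
        (t, i) for (t, i) in current
        if (t == ref_time and i > ref_id) or t > ref_time
    )


def aggregate(fetch: Callable[[], List[Item]], threshold: int) -> List[List[Item]]:
    data_set: List[List[Item]] = []
    ref: Optional[Item] = None
    while len(data_set) < threshold:    # S104: set condition not yet met
        segment = select(fetch(), ref)  # S101-S102: pick this round's segment
        data_set.append(segment)        # S103: integrate at the end of the data set
        if segment:
            ref = segment[-1]           # reference for the next round
    return data_set                     # S105: output the final data set


# The target data grows between calls, as it would while users keep working.
snapshots = iter([
    [(100, 1), (100, 2)],
    [(100, 1), (100, 2), (100, 3), (101, 4)],
    [(100, 1), (100, 2), (100, 3), (101, 4)],
])
result = aggregate(lambda: next(snapshots), threshold=3)
```

Each round only picks up what was added after the previous reference, so the second segment contains items 3 and 4 and the third segment is empty.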
Fig. 2 is a schematic diagram of a main flow of a data aggregation method according to a further embodiment of the present invention. As shown in fig. 2, the data aggregation method according to the embodiment of the present invention mainly includes the following steps:
Step S201: In response to a convergence task for the current data segment, acquire the operation time and the data identifier of the last converged data segment as the corresponding reference operation time and reference data identifier. It should be noted that, if the aggregation task is being executed for the first time, the operation time and the data identifier of the last aggregated data segment are null. If the aggregation task is not being executed for the first time, the operation time of the last aggregated data segment may be the latest operation time among the data items of that segment, and the data identifier may be the maximum data identifier among the data items of that segment.
Step S202: Judge whether a data item whose operation time equals the reference operation time exists in the current data segment. If no such data item exists, execute step S203; otherwise, execute step S204. The operation time records the time at which a data item changes. Because recorded time only moves forward, this way of recording accurately reflects the order in which data items are created and changed.
In the embodiment, the data provider's time can be used as the operation time, which avoids discrepancies caused by clock differences between different service systems.
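One way to derive the reference values from the last aggregated segment is sketched below. Items are assumed to be `(operation_time, data_id)` tuples. The text takes the maximum data identifier of the segment; this sketch takes the maximum identifier among the items that carry the latest operation time, which is an assumption and gives the same result whenever identifiers grow together with operation time.

```python
from typing import List, Optional, Tuple

Item = Tuple[int, int]  # (operation_time, data_id), hypothetical layout


def reference_of(last_segment: List[Item]) -> Tuple[Optional[int], Optional[int]]:
    """Return (reference operation time, reference data identifier)."""
    if not last_segment:
        # First execution: no previous segment, so both values are null.
        return None, None
    ref_time = max(t for t, _ in last_segment)
    # Assumption: restrict the identifier to items at the latest operation time.
    ref_id = max(i for t, i in last_segment if t == ref_time)
    return ref_time, ref_id


first_run = reference_of([])
later_run = reference_of([(100, 1), (100, 7), (101, 3)])
```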
Step S203: Acquire the data items whose operation time is later than the reference operation time from the current data segment as the data segment aggregated this time, and execute step S205. In this step, the current data segment is split using the operation time alone, and only the data items after the reference operation time are obtained.
Step S204: Acquire, from the current data segment, the data items whose operation time equals the reference operation time and whose data identifier comes after the reference data identifier, together with the data items whose operation time is later than the reference operation time, as the data segment aggregated this time, and execute step S205. In this step, the operation time and the data identifier are used together to split the current data segment: the data items whose operation time equals the reference operation time and whose data identifier is larger than the reference data identifier are obtained, along with the data items after the reference operation time.
When the current data segment is split using the operation time, the limit on the number of items (the spatial limit) serves as the boundary of the data segment. If the data provider's business system performs batch operations, the data at a segment boundary may be only part of a group of data items that share the same operation time.
In that case, some data items in the current data segment have the same operation time as the data segment aggregated last time. Therefore, the data identifier of the data segment aggregated this time can be added as a constraint on the boundary of the data segment aggregated next time. Once the data with the same operation time has been processed, that is, once the operation time of the data items differs from the reference operation time, acquisition continues with the data items after the reference operation time.
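The batch-operation boundary case can be made concrete with a small simulation. It is only a sketch: the target data, the per-segment limit of two items, the tuple layout `(operation_time, data_id)`, and the rule of stopping at the first empty segment are all assumptions chosen to keep the example short.

```python
from typing import Callable, List, Optional, Tuple

Item = Tuple[int, int]  # (operation_time, data_id), hypothetical layout

# A batch operation at time 100 touched four items; one more changed at 101.
TARGET: List[Item] = [(100, 1), (100, 2), (100, 3), (100, 4), (101, 5)]
LIMIT = 2  # assumed upper limit of items per data segment


def next_by_time(ref: Optional[Item]) -> List[Item]:
    """Split by operation time only: keep items strictly later than the reference time."""
    rows = [r for r in sorted(TARGET) if ref is None or r[0] > ref[0]]
    return rows[:LIMIT]


def next_by_time_and_id(ref: Optional[Item]) -> List[Item]:
    """Split by (time, id): tuple comparison covers both conditions of step S204."""
    rows = [r for r in sorted(TARGET) if ref is None or r > ref]
    return rows[:LIMIT]


def drain(next_segment: Callable[[Optional[Item]], List[Item]]) -> List[Item]:
    """Call next_segment repeatedly, feeding back the last item as the reference."""
    collected: List[Item] = []
    ref: Optional[Item] = None
    while True:
        segment = next_segment(ref)
        if not segment:
            return collected
        collected.extend(segment)
        ref = segment[-1]


time_only = drain(next_by_time)
time_and_id = drain(next_by_time_and_id)
```

With operation time alone, the segment boundary falls inside the batch at time 100 and items 3 and 4 are never collected; adding the data identifier as a secondary constraint collects all five items.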
Step S205: Integrate the data segment gathered this time into the data set according to a set data set rule. Acquiring a data set involves three elements: time, space (i.e., content), and the data set rule. In this embodiment, the data set rule is an ordering rule that can describe temporal and spatial variations, such as an ascending order rule with time as the primary key and data identifier as the secondary key. The data segment gathered this time is arranged at the end of the data set according to the data set rule.
In the aggregation process of the embodiment, newly changed data is arranged at the end of the data set regardless of whether it has already been acquired, so that data that changes during aggregation can still be captured by the aggregation task, which improves the timeliness of the aggregated data.
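A short sketch of the integration step, assuming items are `(operation_time, data_id, payload)` tuples: the new segment is sorted in ascending order by time and then by identifier, and appended to the end of the data set. The sketch does not deduplicate earlier versions of a changed item; whether to do so is not stated in the text and is left open here.

```python
from typing import List, Tuple

Item = Tuple[int, int, str]  # (operation_time, data_id, payload), hypothetical layout


def integrate(data_set: List[Item], segment: List[Item]) -> List[Item]:
    """Append the segment to the end of the data set in (time, id) ascending order."""
    ordered = sorted(segment, key=lambda item: (item[0], item[1]))
    data_set.extend(ordered)
    return data_set


data_set: List[Item] = [(100, 1, "v1"), (100, 2, "v1")]
# Item 1 was changed again at time 105 while aggregation was in progress.
integrate(data_set, [(105, 1, "v2"), (101, 3, "v1")])
```

The changed item 1 reappears at the end of the data set with its newer operation time, after item 3.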
Step S206: judging whether the data segments of the data set meet set conditions, and if the data segments of the data set do not meet the set conditions, executing step S201; otherwise, step S207 is executed. Since each loop needs to segment the current data segment, a condition for jumping out of the loop can be established.
In an embodiment, the condition may be: the number of data segments of the data set is equal to the set number threshold. If the number of data segments of the data set is less than the set number threshold, the steps S201 to S206 are repeatedly performed. The condition can be set to well support increment convergence, and the pressure of the convergence task on the server is reduced.
Step S207: and outputting the final data set. And if the number of the data segments of the data set is equal to the set number threshold, the integration of the data set is completed, and a final data set is output.
In a preferred embodiment, when the convergence task is interrupted, it is determined that the data segments of the data set do not satisfy the set condition, and execution of the convergence task is resumed. This gives the data aggregation scheme the ability to recover from task interruption, so that an interrupted aggregation task continues seamlessly after recovery.
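Interruption recovery could look roughly like the sketch below. It assumes that the segments aggregated so far are kept in a list that survives the interruption, that an interruption surfaces as a `ConnectionError`, and that `fetch` is a hypothetical callable returning the next segment in ascending `(operation_time, data_id)` order; the retry cap is an added safeguard, not something the text specifies.

```python
from typing import Callable, List, Optional, Tuple

Item = Tuple[int, int]  # (operation_time, data_id), hypothetical layout
Fetch = Callable[[Optional[Item]], List[Item]]


def run_task(fetch: Fetch, segments: List[List[Item]], threshold: int,
             max_retries: int = 3) -> List[List[Item]]:
    """Run or resume a convergence task; `segments` holds what was aggregated so far."""
    retries = 0
    while len(segments) < threshold:  # set condition not yet met
        # Reference = last item of the most recent non-empty segment.
        ref = next((seg[-1] for seg in reversed(segments) if seg), None)
        try:
            segment = fetch(ref)
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            continue  # interrupted: resume from the same reference
        segments.append(segment)
    return segments
```

Because the reference is recomputed from the segments already in the data set, a resumed task continues exactly where the interrupted one stopped, without re-fetching or skipping items.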
Fig. 3 is a system architecture diagram of a data aggregation method according to still another embodiment of the present invention. As shown in fig. 3, the data aggregation method of the present embodiment is applied to the event data aggregation for the government affairs of a plurality of provinces. The government affair matters can be administrative service matters provided by governments, related departments and public institutions to the affair transactor, such as business license transaction, tax registration, identity card transaction and the like. The event data for a government event may include event transactor information and event related information.
The item data of the cross-provincial government affairs platform is aggregated as follows: the government affairs administrator of each province adds a general item label to the government affairs items; the general item label defines which data contents of the item data need to be converged. A convergence interface is developed according to the convergence logic specified by the cross-provincial government affairs platform and is then registered on an open platform. The item directory system of the platform calls the convergence interface to integrate and aggregate the provincial data.
Fig. 4 is a schematic diagram of a main flow of a data aggregation method according to still another embodiment of the present invention. As shown in fig. 4, applying the data aggregation method of the present embodiment to item data aggregation for government matters of a plurality of provinces includes the steps of:
Step S401: In response to a convergence task for the current data segment, acquire the operation time and the data identifier of the last converged data segment as the corresponding reference operation time and reference data identifier. In this embodiment, the current data segment is event data of a government affairs item transmitted over an HTTP network. HTTP is the abbreviation of HyperText Transfer Protocol.
Step S402: Pass the reference operation time and the reference data identifier into the convergence interface as input parameters to call the convergence interface. It should be noted that, if the aggregation task is being executed for the first time, the reference operation time and the reference data identifier do not need to be passed.
Step S403: and executing the convergence logic defined in the convergence interface to obtain the data segments required by the current convergence. The convergence logic is the implementation process of step S102, or step S202 to step S204.
The data set corresponding to the event data of each province changes dynamically over time, can only be transmitted over an HTTP network, and is subject to an upper limit on capacity. Therefore, in this embodiment, each cyclic call to the convergence interface passes in the operation time and the data identifier of the data segment converged last time, and these are used to segment the current data segment. Each cyclic call to the convergence interface thus performs a new cut and sort of the current data segment.
Step S404: Integrate the data segment gathered this time into the data set according to a data set rule that arranges the data in ascending order, with the recording time as the primary key and the data identifier as the secondary key.
Step S405: judging whether the data segments of the data set meet the set conditions, if not, executing the step S401; otherwise, step S406 is performed.
Step S406: and outputting the final data set. In this embodiment, a plurality of time-shared data segments are integrated through a data set rule, and a complete data set is obtained.
The data aggregation method of the embodiment is based on the HTTP network, and data aggregation can be realized as long as the network is connected. In addition, the operation time and the data identification of the data fragment gathered last time are obtained through each calling, so that the appropriate data fragment gathered this time is collected, the applicability is wide, and different development environments and different safety strategies can be supported. In addition, the embodiment is simple and quick in implementation and low in workload of developers.
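The interface call of steps S402 and S403 might be sketched as follows. Everything named here is hypothetical: the base URL, the query parameter names `refOperationTime` and `refDataId`, the capacity `limit`, and the in-process `convergence_interface` function that stands in for the provincial HTTP endpoint. No network request is made; the sketch only shows how the input parameters are omitted on the first call and how the provider side applies them.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode

Item = Tuple[int, int]  # (operation_time, data_id), hypothetical layout


def build_call_url(base_url: str, ref: Optional[Item]) -> str:
    """Caller side: attach the reference values unless this is the first call."""
    if ref is None:
        return base_url
    ref_time, ref_id = ref
    query = urlencode({"refOperationTime": ref_time, "refDataId": ref_id})
    return base_url + "?" + query


def convergence_interface(target: List[Item], ref: Optional[Item], limit: int) -> List[Item]:
    """Provider side: cut and sort the current data, bounded by the capacity limit."""
    rows = sorted(r for r in target if ref is None or r > ref)
    return rows[:limit]


base = "http://example.invalid/converge"  # placeholder address
first_url = build_call_url(base, None)
next_url = build_call_url(base, (100, 2))
page = convergence_interface([(100, 1), (100, 2), (100, 3), (101, 4)], (100, 2), limit=2)
```

Keeping the reference values in the request means the provider needs no session state between calls, which fits the text's point that aggregation works wherever the HTTP network is reachable.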
Fig. 5 is a schematic diagram of main modules of a data aggregation device according to an embodiment of the present invention. As shown in fig. 5, the data aggregation apparatus 500 according to the embodiment of the present invention mainly includes:
the response module 501 is configured to, in response to the task of aggregating the current data segments, cyclically execute the processing procedures of the determination module, the first obtaining module, and the integration module until it is determined that the data segments of the data set satisfy the set condition, and output a final data set.
The current data segment may be a portion of data cut from the target data (e.g., network data) and may include one or more data items. Each time the determining module, the first obtaining module, and the integrating module have been executed, it is judged whether the data segments of the data set satisfy the set condition. When the set condition is not satisfied, the determining module, the first obtaining module, and the integrating module are executed again; when the set condition is satisfied, integration of the data set is complete and the final data set is output.
A determining module 502, configured to obtain an operation time and a data identifier of a data segment aggregated last time as a corresponding reference operation time and a corresponding reference data identifier, and determine that a data item whose operation time is equal to the reference operation time exists in the current data segment.
After an aggregation task for the current data segment is received, the operation time and the data identifier of the previously aggregated data segment are acquired; the acquired operation time is taken as the reference operation time, and the acquired data identifier is taken as the reference data identifier. Data items whose operation time is equal to the reference operation time are then searched for based on the operation time of each data item in the current data segment.
A first obtaining module 503, configured to obtain, from the current data segment, a data item whose operation time is equal to the reference operation time and whose data identifier is behind the reference data identifier, and a data item whose operation time is later than the reference operation time as the data segment aggregated this time.
To obtain the data items whose operation time is equal to the reference operation time and whose data identifier is after the reference data identifier, the data items whose operation time is equal to the reference operation time may first be obtained from the current data segment, and the data items whose data identifier is larger than the reference data identifier are then selected from among them. Obtaining the data items whose operation time is later than the reference operation time means obtaining, from the current data segment, the data items generated after the reference operation time.
An integrating module 504, configured to integrate the data segments aggregated this time into the data set according to a set data set rule. The data set rule describes how data segments are integrated and, in this embodiment, may be an arrangement rule describing changes in time and space (i.e., content). The data segments aggregated this time are arranged into the data set according to the data set rule.
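A single round through the first obtaining module and the integrating module might look like the sketch below. It is illustrative only: `DataItem`, `select_segment` and `integrate` are hypothetical names, operation times and data identifiers are modeled as integers, and the data set rule is assumed to be the ascending-order rule (time as primary key, identifier as secondary key) described for step S404.

```python
from typing import List, NamedTuple


class DataItem(NamedTuple):
    operation_time: int  # assumption: timestamp modeled as an integer
    data_id: int         # assumption: identifiers are orderable integers


def select_segment(current: List[DataItem], ref_time: int, ref_id: int) -> List[DataItem]:
    """Pick the items that have not yet been aggregated from the current segment."""
    # Items with the same operation time as the reference ...
    same_time = [it for it in current if it.operation_time == ref_time]
    # ... of which only those with a larger identifier are kept.
    after_ref = [it for it in same_time if it.data_id > ref_id]
    # Items generated after the reference operation time.
    later = [it for it in current if it.operation_time > ref_time]
    return after_ref + later


def integrate(data_set: List[DataItem], segment: List[DataItem]) -> List[DataItem]:
    """Append the sorted segment to the end of the data set."""
    ordered = sorted(segment)  # tuple order: operation_time first, then data_id
    if data_set and ordered and ordered[0] <= data_set[-1]:
        raise ValueError("segment overlaps items already in the data set")
    return data_set + ordered


data_set = [DataItem(5, 1), DataItem(5, 2)]
current = [DataItem(6, 1), DataItem(5, 3), DataItem(5, 2), DataItem(4, 9)]
segment = select_segment(current, ref_time=5, ref_id=2)
data_set = integrate(data_set, segment)
```

Because every selected item sorts strictly after the reference pair, appending the sorted segment to the end keeps the whole data set in ascending order without re-sorting it.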
In addition, the data aggregation device 500 according to the embodiment of the present invention may further include a judging module, a second obtaining module, and a recovery module (not shown in fig. 5). The judging module is configured to judge whether a data item whose operation time is equal to the reference operation time exists in the current data segment.
The second obtaining module is configured to, if no data item whose operation time is equal to the reference operation time exists in the current data segment, obtain from the current data segment the data items whose operation time is later than the reference operation time as the data segment aggregated this time. The recovery module is configured to, when the aggregation task is interrupted, determine that the data segments of the data set do not satisfy the set condition, and resume execution of the aggregation task.
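One way the recovery behavior could be realized is sketched below. The text does not say how the reference values are restored after an interruption; this sketch assumes they are re-derived from the last item already stored in the data set. The names `resume_cursor`, `fetch_after` and `SOURCE` are hypothetical, and items are modeled as (operation time, data identifier) pairs of integers.

```python
from typing import List, Optional, Tuple

Item = Tuple[int, int]  # (operation_time, data_id); integer fields are an assumption


def resume_cursor(data_set: List[Item]) -> Tuple[Optional[int], Optional[int]]:
    """Derive the reference values from the last item already in the data set."""
    if not data_set:
        return None, None  # nothing aggregated yet: behave like a first call
    return data_set[-1]


def fetch_after(source: List[Item], ref_time: Optional[int],
                ref_id: Optional[int], limit: int) -> List[Item]:
    """Hypothetical stand-in for one call to the aggregation interface."""
    ordered = sorted(source)
    if ref_time is not None:
        # Same time with a larger identifier, or a later time.
        ordered = [it for it in ordered if it > (ref_time, ref_id)]
    return ordered[:limit]


SOURCE: List[Item] = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1)]

# The task is interrupted after the first segment has been stored.
partial = fetch_after(SOURCE, None, None, limit=2)

# On recovery the set condition is treated as not met and aggregation continues
# from the last stored item, so nothing is duplicated or skipped.
ref_time, ref_id = resume_cursor(partial)
resumed = partial + fetch_after(SOURCE, ref_time, ref_id, limit=10)
```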
As can be seen from the above description, the operation time and the data identifier of the previously aggregated data segment are obtained, the data segment to be aggregated this time is obtained from the current data segment based on that operation time and data identifier, and the data segment aggregated this time is then integrated into the data set. Thus, even though the data set changes dynamically during aggregation, a complete data set can be obtained.
Fig. 6 illustrates an exemplary system architecture 600 to which the data aggregation method or the data aggregation apparatus of the embodiments of the present invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves as a medium for providing communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various types of connections, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server that provides various services, such as a background management server that processes an aggregation task transmitted by a user using the terminal devices 601, 602, and 603. The background management server can obtain the operation time and the data identifier of the data segment gathered last time, determine the data segment gathered this time, integrate the data segment gathered this time into a data set, and feed back a processing result (such as a final data set) to the terminal device.
It should be noted that the data aggregation method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the data aggregation apparatus is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The invention also provides an electronic device, a computer readable medium and a computer program product according to the embodiments of the invention.
The electronic device of the present invention includes: one or more processors; a storage device, configured to store one or more programs, which when executed by the one or more processors, cause the one or more processors to implement a data aggregation method according to an embodiment of the present invention.
The computer-readable medium of the present invention has stored thereon a computer program which, when executed by a processor, implements a data aggregation method of an embodiment of the present invention.
The computer program product of the present invention includes a computer program, and when the computer program is executed by a processor, the computer program implements a data aggregation method according to an embodiment of the present invention.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with the electronic device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including components such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a response module, a determination module, a first acquisition module, and an integration module. For example, the response module may be further described as a module that, in response to the task of aggregating the current data segments, executes processing procedures of the determination module, the first acquisition module, and the integration module in a loop until it is determined that the data segments of the data set satisfy the set condition, and outputs a final data set.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: in response to an aggregation task for the current data segment, cyclically executing the following steps until it is determined that the data segments of the data set satisfy the set condition, and outputting a final data set: acquiring the operation time and the data identifier of the previously aggregated data segment as the corresponding reference operation time and reference data identifier, and determining that a data item whose operation time is equal to the reference operation time exists in the current data segment; obtaining, from the current data segment, the data items whose operation time is equal to the reference operation time and whose data identifier is after the reference data identifier, together with the data items whose operation time is later than the reference operation time, as the data segment aggregated this time; and integrating the data segment aggregated this time into the data set according to a set data set rule.
According to the technical solution of the embodiment of the present invention, the operation time and the data identifier of the previously aggregated data segment are obtained, the data segment to be aggregated this time is obtained from the current data segment based on that operation time and data identifier, and the data segment aggregated this time is then integrated into the data set. Thus, even though the data set changes dynamically during aggregation, a complete data set can be obtained.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (17)

1. A data aggregation method, comprising:
responding to the convergence task of the current data segment, circularly executing the following steps until the data segment of the data set is determined to meet the set condition, and outputting a final data set:
acquiring the operation time and the data identification of the data fragment gathered last time as corresponding reference operation time and reference data identification, and determining that a data item with the operation time equal to the reference operation time exists in the current data fragment;
acquiring a data item with the operation time equal to the reference operation time and the data identifier behind the reference data identifier and a data item with the operation time later than the reference operation time from the current data segment as the data segment gathered at this time;
and integrating the data fragments aggregated this time into the data set according to a set data set rule.
2. The method of claim 1, wherein after the step of obtaining the operation time and the data identifier of the last aggregated data segment as the corresponding reference operation time and reference data identifier, the method further comprises:
judging whether a data item with the operation time equal to the reference operation time exists in the current data fragment;
and if the data item with the operation time equal to the reference operation time does not exist in the current data fragment, acquiring the data item with the operation time later than the reference operation time from the current data fragment as the data fragment gathered at this time.
3. The method of claim 1, wherein the data set rule is an ascending order rule with time as primary and data identification as secondary;
the integrating the data segments gathered this time into the data set includes:
and arranging the data fragments gathered this time to the end of the data set.
4. The method of claim 1, further comprising:
and when the convergence task is interrupted, determining that the data segments of the data set do not meet the set conditions, and resuming execution of the convergence task.
5. The method according to claim 4, wherein the setting condition is: the number of data segments of the data set is equal to the set number threshold.
6. The method according to any one of claims 1 to 5, wherein the method obtains the final data set by passing the reference operation time and the reference data identifier to an aggregation interface as input parameters and calling the aggregation interface in a loop.
7. The method according to claim 6, wherein the current data segment is transaction data of a government transaction transmitted over an HTTP network.
8. A data aggregation device, comprising:
the response module is used for responding to the convergence task of the current data segment, circularly executing the processing processes of the determining module, the first acquiring module and the integrating module until the data segment of the data set meets the set condition, and outputting a final data set; wherein,
the determining module is configured to obtain operation time and data identifier of a last aggregated data segment as corresponding reference operation time and reference data identifier, and determine that a data item whose operation time is equal to the reference operation time exists in the current data segment;
the first obtaining module is configured to obtain, from the current data segment, a data item whose operation time is equal to the reference operation time and whose data identifier is behind the reference data identifier, and a data item whose operation time is later than the reference operation time as a data segment to be aggregated this time;
and the integration module is used for integrating the data fragments aggregated this time into the data set according to a set data set rule.
9. The apparatus of claim 8, further comprising:
the judging module is used for judging whether a data item with the operation time equal to the reference operation time exists in the current data fragment or not;
and the second acquisition module is used for acquiring a data item with the operation time later than the reference operation time from the current data fragment as the data fragment gathered at this time if the data item with the operation time equal to the reference operation time does not exist in the current data fragment.
10. The apparatus of claim 8, wherein the data set rule is an ascending order rule with time as primary and data identification as secondary;
the integration module is further configured to arrange the data segments aggregated this time to the end of the data set.
11. The apparatus of claim 8, further comprising: a recovery module, configured to, when the aggregation task is interrupted, determine that the data segments of the data set do not satisfy the set condition, and resume execution of the aggregation task.
12. The apparatus according to claim 11, wherein the setting condition is: the number of data segments of the data set is equal to the set number threshold.
13. The apparatus according to any of claims 8 to 12, wherein the apparatus obtains the final data set by passing the reference operation time and the reference data identifier to an aggregation interface as input parameters and calling the aggregation interface in a loop.
14. The apparatus according to claim 13, wherein the current data segment is transaction data of a government transaction transmitted over an HTTP network.
15. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
16. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
17. A computer program product comprising a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1-7.
CN202210437349.6A 2022-04-21 2022-04-21 Data aggregation method and device, electronic equipment and computer readable medium Pending CN115037729A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210437349.6A CN115037729A (en) 2022-04-21 2022-04-21 Data aggregation method and device, electronic equipment and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210437349.6A CN115037729A (en) 2022-04-21 2022-04-21 Data aggregation method and device, electronic equipment and computer readable medium

Publications (1)

Publication Number Publication Date
CN115037729A true CN115037729A (en) 2022-09-09

Family

ID=83118583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210437349.6A Pending CN115037729A (en) 2022-04-21 2022-04-21 Data aggregation method and device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN115037729A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination