CN112764988A - Data segmentation acquisition method and device - Google Patents

Data segmentation acquisition method and device

Publication number
CN112764988A
Authority
CN
China
Prior art keywords
data
acquisition
record
node
collection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110021400.0A
Other languages
Chinese (zh)
Other versions
CN112764988B (en)
Inventor
郁强
黄刘军
王灵艳
刘燕
楼凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCI China Co Ltd
Original Assignee
CCI China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCI China Co Ltd
Priority to CN202110021400.0A
Publication of CN112764988A
Application granted
Publication of CN112764988B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity

Abstract

The invention provides a segmented data acquisition method that can be used to acquire source data from a plurality of platform servers. The source data to be acquired is divided into multiple segments of segmented data and acquired segment by segment, and a log table records a log record for each segmented acquisition task so that the segment nodes of the segmented data can be located accurately. This ensures that each acquisition continues from the segment node at which the previous acquisition ended, avoiding both missed and repeated acquisition of data. The scheme also handles the dependency relationships among the segmented data, ensuring the integrity of the acquired data.

Description

Data segmentation acquisition method and device
Technical Field
The invention relates to the field of data acquisition, in particular to a data sectional acquisition method and a data sectional acquisition device.
Background
To let a system administrator learn the usage status of a software system in time, specific data generated while the software system runs needs to be collected so that it can be counted and analyzed. In other words, data collection is the process of extracting valuable data from the target software system and putting it into a database in a structured format, and the consistency and integrity of the collected data greatly affect the results of the statistics and analysis.
However, current data acquisition methods, particularly those for mass data, have the following problems:
First, a data acquisition task that collects all data in a single pass is easily interrupted for various reasons, such as network interruption or network delay. Once the task is interrupted, the re-run acquisition task is prone to missing or repeating data, so that the collected data becomes inconsistent with the source data and data consistency is impaired. For mass data in particular, the acquisition task runs for a long time, and interruptions accordingly occur more often during the task.
Second, the data comes from a plurality of software systems, and certain data dependency relationships exist among the data. If data acquisition is interrupted, or if the data of different systems is acquired in segments, the acquired data is prone to being incomplete.
Disclosure of Invention
The invention aims to provide a data sectional acquisition method and a data sectional acquisition device, the data sectional acquisition method can be used for multi-system data acquisition, the consistency and the integrity of the data acquisition can be ensured, and meanwhile, compared with a traditional one-time full acquisition mode, the data acquisition method can reduce the pressure of a server and improve the efficiency of subsequent statistical analysis.
In order to achieve the above object, the present technical solution provides a data segment collection method, including the following steps:
acquiring a previous subsection acquisition log record corresponding to a previous acquisition node in an acquisition database, wherein the previous subsection acquisition log record at least records a subsection record starting node and a subsection record ending node of subsection data acquired by the previous acquisition node, and an acquisition node section is arranged between the current acquisition node and the previous acquisition node;
acquiring the first generated data record and the last generated data record in the platform database in the acquisition node section, and determining an acquisition starting point position in the following way:
if the previous sectional collection log record does not record the data of the data record, taking the data collection node of the data record generated firstly as the collection starting point position;
if the data of the previous collection log record and the last generated data record have an intersection, taking the segment record ending node in the previous collection log record as the collection starting point position;
and the current acquisition node starts to acquire the segmented data at the acquisition starting point position and records the segmented acquisition log record corresponding to the current acquisition node.
In a second aspect, an application scenario of the data segment collection method is provided.
In a third aspect, a data segment collecting device for operating a data segment collecting method is provided, which includes:
an acquisition database for storing segmented acquisition log records, wherein the segmented acquisition log records at least record a segment record start node and a segment record end node of the segmented data;
the platform database is used for storing data records;
the log acquisition unit is used for acquiring a previous subsection acquisition log record corresponding to a previous acquisition node in an acquisition database, the previous subsection acquisition log record at least records a subsection record starting node and a subsection record ending node of subsection data acquired by the previous acquisition node, and an acquisition node section is arranged between a current acquisition node and the previous acquisition node;
and the data record acquisition unit is used for acquiring the first generated data record and the last generated data record in the platform database in the acquisition node section and determining the acquisition starting point position in the following way:
if the previous sectional collection log record does not record the data of the data record, taking the data collection node of the data record generated firstly as the collection starting point position;
if the data of the previous collection log record and the last generated data record have an intersection, taking the segment record ending node in the previous collection log record as the collection starting point position;
and the current acquisition node starts to acquire the segmented data at the acquisition starting point position and records the segmented acquisition log record corresponding to the current acquisition node.
Compared with the prior art, the technical scheme has the following characteristics and beneficial effects:
1) The consistency of the collected data and the source data is ensured. In this scheme, data is acquired in segments, and the segmented data is located quickly by recording the node positions of the segmented data. This guarantees that acquisition starts from the position where the previous acquisition ended, so that data is neither missed nor acquired repeatedly. At the same time, data from the multiple platform servers can be stored in association with each other, which ensures data consistency.
2) The integrity of the collected data is ensured. The scheme handles the dependency relationships among the platform data, guarantees the order of data acquisition, and avoids one-sided data statistics.
3) The server pressure is reduced, and the efficiency of subsequent data analysis is improved. Compared with the traditional method of collecting all data at one time, the segmented data acquisition method reduces the pressure on the server, and collecting the data of each platform facilitates the subsequent aggregation and counting tasks and improves counting efficiency.
Drawings
Fig. 1 is a diagram of an operating system architecture for a data segment acquisition method according to the present invention.
Fig. 2 is a schematic diagram of a segmented acquisition.
Fig. 3 is an application display diagram of the data segment collection method of the present invention.
Fig. 4 is a schematic diagram of a method for confirming an acquisition start node by the data segment acquisition method of the present invention.
In the figure: 101 - client, 102 - platform server, 103 - platform database, 104 - acquisition server, 105 - acquisition database.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
According to the first aspect of the invention, a data segmented acquisition method is provided, which can be used for acquiring source data of one or more platform servers, dividing the source data to be acquired into a plurality of segments of segmented data for segmented acquisition, recording a log record corresponding to each segmented acquisition task by using a log table so as to accurately position segmented nodes of the segmented data, ensuring that each acquisition is continuously acquired from the segmented nodes of the segmented data acquired last time, and avoiding missing acquisition and repeated acquisition of the data; and the scheme processes the dependency relationship of the segmented data and ensures the integrity of the acquired data.
Because the scheme carries out sectional acquisition on the source data, the acquisition time of each section is short, and the probability of data acquisition task interruption is reduced compared with the traditional one-time data full acquisition scheme; even if the phenomenon of data acquisition interruption occurs in the scheme, the interruption position of the data can be accurately positioned when the data acquisition is resumed, so that the data can be continuously acquired, and the consistency of the acquired data and the source data is ensured. In addition, the scheme determines the acquired segmented data nodes based on the dependency relationship among the source data, fully considers the dependency relationship among the source data and ensures the integrity of the acquired data.
Fig. 1 shows the service framework of the data segmentation collection method provided by an embodiment of the present disclosure. The service framework includes a plurality of clients 101, a plurality of platform servers 102, a platform database 103, a collection server 104, and a collection database 105. The plurality of clients 101 establish communication connections with the platform servers 102, the platform servers 102 establish communication connections with the collection server 104, the platform servers 102 store data in the platform database 103, and the collection server 104 stores data in the collection database 105.
It is understood that in other embodiments of the present application, the service framework may include one client 101, a plurality of platform servers 102, a platform database 103, an acquisition server 104, and an acquisition database 105; or a plurality of clients 101, a platform server 102, a platform database 103, an acquisition server 104 and an acquisition database 105; or a client 101, a platform server 102, a platform database 103, an acquisition server 104, and an acquisition database 105, which is not limited in this application.
The client 101 corresponds to the user end. A user may use the client 101 to interact with the platform server 102 through a network or other communication protocol in order to receive or send messages, and the client 101 may be any electronic device that has a display screen and supports information interaction, including but not limited to a tablet computer, a desktop computer, and the like. The platform server 102 may be a server providing various kinds of service support. The platform server 102 returns data to the client 101 based on the user's request, and the source data generated by the interaction between the platform server 102 and the client 101 is stored in the platform database 103. The source data is recorded in the form of data records, where a data record at least includes: the data name, the data collection node, and the data content of the source data. The collection server 104 is the server that implements segmented data collection. It collects the source data in the platform database 103 in segments according to specific data characteristics to obtain collected data, and stores the collected data and the corresponding collection log records in the collection database 105, where a segmented collection log record at least includes: the collection node, the segmented collection task name corresponding to the collection node, and the segment record start node and segment record end node of the segmented data.
Correspondingly, an acquisition log table is built in the acquisition database 105, and the acquisition log record is stored in the acquisition log table.
The structure of the log table is shown in Table 1:
Table 1: Structure of the log table
Field name      Field type   Field description
name            String       Segment collection task name
startTime       Date         Segment record start time
endTime         Date         Segment record end time
operationTime   Date         Collection node
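As an illustration only, the log table could be created and written as follows. This is a minimal sketch assuming an SQLite store; the table name `acquisition_log`, the helper names `open_acquisition_db` and `write_log`, and the choice to store dates as ISO 8601 text are assumptions and do not come from the patent. Only the four field names follow the table structure given above.

```python
import sqlite3

# Hypothetical table name; the four column names follow the log table fields.
CREATE_LOG_TABLE = """
CREATE TABLE IF NOT EXISTS acquisition_log (
    name          TEXT NOT NULL,  -- segment collection task name
    startTime     TEXT NOT NULL,  -- segment record start time (ISO 8601 text)
    endTime       TEXT NOT NULL,  -- segment record end time (ISO 8601 text)
    operationTime TEXT NOT NULL   -- collection node at which the run happened
)
"""


def open_acquisition_db(path=":memory:"):
    """Open the (assumed) collection database and make sure the log table exists."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_LOG_TABLE)
    return conn


def write_log(conn, name, start_time, end_time, operation_time):
    """Store one segmented collection log record."""
    conn.execute(
        "INSERT INTO acquisition_log (name, startTime, endTime, operationTime) "
        "VALUES (?, ?, ?, ?)",
        (name, start_time, end_time, operation_time),
    )
    conn.commit()
```

Storing the timestamps in one fixed ISO 8601 format keeps string comparison consistent with chronological order, which simplifies later lookups of the previous record.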
In a first aspect of the present solution, based on the above service framework, the data segmentation acquisition method provided by the present solution includes the following steps:
acquiring a previous segment acquisition log record corresponding to a previous acquisition node in an acquisition database 105, wherein the previous segment acquisition log record comprises a segment acquisition task name corresponding to the previous acquisition node, a segment recording start node and a segment recording end node of the segment data, the previous acquisition node is the node closest to a current acquisition node, and an acquisition node segment is arranged between the current acquisition node and the previous acquisition node;
acquiring the first generated data record and the last generated data record in the platform database 103 in the acquisition node section, wherein the data record comprises the data name of the source data, the data acquisition node and the data content, and determining the acquisition starting point position in the following way:
if the previous sectional collection log record does not record the data of the data record, taking the data collection node of the data record generated firstly as the collection starting point position;
if the data of the previous collection log record and the last generated data record have an intersection, taking the segment record ending node in the previous collection log record as the collection starting point position;
and the current acquisition node starts to acquire the segmented data at the acquisition starting point position and records the segmented acquisition log record corresponding to the current acquisition node.
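The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: plain Python lists stand in for the platform database, the collection database and the log table, the function name `run_acquisition` is hypothetical, nodes are simple comparable values, and the previous segment record end node is treated as exclusive so that a record at that node is not collected twice.

```python
def run_acquisition(source, collected, log, task_name, current_node):
    """Run one segmented collection at current_node and return the number of records collected.

    source    -- list of (node, payload) records in the platform database (assumed shape)
    collected -- list that plays the role of the collection database
    log       -- list of dicts with the keys name, startTime, endTime, operationTime
    """
    # Only records generated up to the current collection node are visible.
    available = [r for r in source if r[0] <= current_node]
    if not available:
        return 0
    first_node = min(r[0] for r in available)
    last_node = max(r[0] for r in available)

    # Previous log record: same task name, closest collection node not after the current one.
    previous = [e for e in log
                if e["name"] == task_name and e["operationTime"] <= current_node]
    if previous:
        prev_end = max(previous, key=lambda e: e["operationTime"])["endTime"]
        start = prev_end
        batch = [r for r in available if r[0] > prev_end]  # continue after the previous end node
    else:
        start = first_node
        batch = list(available)  # nothing collected before: start at the first generated record

    if not batch:
        return 0  # everything up to last_node was already collected

    collected.extend(batch)
    log.append({"name": task_name, "startTime": start,
                "endTime": last_node, "operationTime": current_node})
    return len(batch)
```

Calling the function repeatedly with increasing collection nodes accumulates each source record exactly once, and repeating a call at the same node collects nothing.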
In this scheme, the platform database 103 stores source data generated by a plurality of clients 101, and the collection server 104 collects the source data in the platform database 103 in a segmented manner at intervals of collection nodes in the form of segmented data. In the embodiment of the present disclosure, the collection server 104 collects the source data satisfying a specific data characteristic from the platform database 103 in the manner mentioned above to obtain the collected data, so as to perform subsequent data analysis on the collected data.
Specifically, the collection server 104 collects corresponding segmented data at each collection node, and the segmented data collected by a plurality of collection nodes are collected to obtain all data meeting specific data characteristics. In the scheme, the acquisition node is selected as an acquisition time node or an acquisition data capacity node.
When the acquisition node is an acquisition time node, segmented acquisition is performed with time as the node. For example, if source data A spans 10:00-11:00, then 10:30 can be used as the segmentation node of source data A: the source data of 10:00-10:30 is collected first, and then the source data of 10:30-11:00 is collected.
When the acquisition node is an acquisition data capacity node, segmented acquisition is performed with data capacity as the node. For example, if the data capacity of source data A is 100 kb, then 50 kb is taken as the segmentation node of source data A: the source data of 0-50 kb is collected first, and then the source data of 50-100 kb is collected.
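The two kinds of segmentation node can be illustrated with one small helper. This is a sketch only; the function name `split_segments` and the fixed step size are assumptions, since the patent does not prescribe how the segmentation nodes are chosen.

```python
from datetime import datetime, timedelta


def split_segments(start, end, step):
    """Split the range from start to end into consecutive (segment_start, segment_end) pairs.

    Works for any values that support +, - and comparison, for example
    datetime with timedelta (time nodes) or integers (capacity nodes).
    """
    zero = step - step  # 0 for numbers, timedelta(0) for durations
    if step <= zero:
        raise ValueError("step must be positive")
    segments = []
    cursor = start
    while cursor < end:
        nxt = min(cursor + step, end)  # the last segment may be shorter than step
        segments.append((cursor, nxt))
        cursor = nxt
    return segments


# Time nodes: 10:00-11:00 split at 10:30.
time_segments = split_segments(
    datetime(2021, 1, 4, 10, 0), datetime(2021, 1, 4, 11, 0), timedelta(minutes=30)
)

# Capacity nodes: 100 kb split at 50 kb.
capacity_segments = split_segments(0, 100, 50)
```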
In addition, after the collection server 104 completes the segmented collection task of each collection node, a segmented collection log record is generated and stored in the collection database 105. The segmented collection log record includes the collection node, the segmented collection task name corresponding to the collection node, and the segment record start node and segment record end node of the segmented data. The segmented collection task name comprises the task name of the segmented collection task and the data name of the source data, so that the data name identifies the corresponding source data and the task name identifies the corresponding data collection task. The data collection task is set to collect source data that satisfies specific data characteristics. For example, if data collection task A is set to collect source data satisfying specific data characteristic X, the collection server 104 collects all source data B1, B2, and B3 satisfying specific data characteristic X, and the data names of the source data B1, B2, and B3 are recorded during the collection process.
In an embodiment of the present solution, before the previous segment collection log record corresponding to the previous collection node is obtained from the collection database 105, a data collection task is triggered at the current collection node, where the data collection task includes a task name and a collection instruction. The data collection task is to collect source data with specific data characteristics from the platform database 103, and the collection server 104 starts collecting the source data after receiving the collection instruction.
"Obtaining a previous segment collection log record corresponding to a previous collection node within the collection database 105" includes: performing a matching search that compares the task name with the segmented collection task name in the collection log records, and the previous collection node with the collection node in the segmented collection log records, and taking the segmented collection log record that satisfies both conditions as the previous segment collection log record.
In this scheme, the previous segmented collection log record can be looked up quickly by a matching query written in SQL (Structured Query Language).
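A lookup of this kind might look as follows. This is a sketch assuming an SQLite database; the table name `acquisition_log` and the function name `find_previous_log` are hypothetical, while the column names follow the log table fields (name, startTime, endTime, operationTime). It also assumes that collection nodes are stored as ISO 8601 text in a single format, so that string order matches time order.

```python
import sqlite3


def find_previous_log(conn, task_name, current_node):
    """Return the log record of the closest collection node before current_node, or None.

    Both matching conditions are applied in one query: the task name must match,
    and the record's collection node must precede the current one.
    """
    return conn.execute(
        """
        SELECT name, startTime, endTime, operationTime
        FROM acquisition_log
        WHERE name = ? AND operationTime < ?
        ORDER BY operationTime DESC
        LIMIT 1
        """,
        (task_name, current_node),
    ).fetchone()
```

Ordering by the collection node in descending order and limiting the result to one row selects the record closest to the current collection node.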
When the first generated data record and the last generated data record in the platform database 103 within the collection node segment are obtained, if the data corresponding to a data record has a dependency relationship, the last generated data record is determined according to the last generation node of the depended data on which that data depends; in this case the data is the dependent data. Specifically, it is judged whether the last generated data record has a dependency relationship. If it does, the last generation node of the depended data is used to determine the last generation node of the dependent data, and the last generated data record is obtained accordingly. If it does not, the data record corresponding to the last generation node in the collection node segment is obtained as the last generated data record.
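One possible reading of this rule is sketched below. The interpretation that the depended data's last generation node acts as an upper bound on the dependent data is an assumption, as are the function name `last_generated_record` and the `(data_name, generation_node)` record shape; the patent text only states that the depended data's last generation node determines that of the dependent data.

```python
def last_generated_record(records, depended_last_node=None):
    """Pick the record treated as 'last generated' within the collection node segment.

    records            -- list of (data_name, generation_node) tuples (assumed shape)
    depended_last_node -- last generation node of the depended data, or None when
                          the data has no dependency relationship
    """
    if depended_last_node is None:
        candidates = list(records)  # no dependency: consider every record in the segment
    else:
        # Assumed interpretation: dependent records generated after the depended
        # data's last node are left for a later collection run.
        candidates = [r for r in records if r[1] <= depended_last_node]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[1])
```

Under this reading, dependent data is never collected ahead of the data it depends on, which matches the ordering requirement on dependent and depended data.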
In this scheme there are a plurality of platform servers 102, and the platform servers 102 interact with a plurality of clients 101 to obtain source data, so dependency relationships exist among the source data in some of the platform servers 102. If the source of an attribute of the data corresponding to a data record is depended data, the data corresponding to the data record is considered to have a dependency relationship. That is, the dependency relationship means the following: when dependent data is collected into the collection database 105 and the value of some attribute of the dependent data is derived from the depended data, the depended data must be collected before the dependent data is saved. If a dependency exists, the time of the last collection node of the depended data is used as the reference.
"If the previous segment collection log record does not record the data of the data record, taking the data collection node of the first generated data record as the collection start position" includes: if the segmented collection task name in the previous segment collection log record does not match the data name of the source data, taking the data collection node of the first generated data record as the collection start position. This means that the collection server 104 has not previously collected the source data, so the source data is collected from the position at which it was generated.
"If the data of the previous collection log record and the last generated data record have an intersection, taking the segment record end node in the previous collection log record as the collection start position" includes: if the segment record end node in the previous collection log record is smaller than the current collection node and not greater than the data collection node of the latest data record, taking the segment record end node in the previous collection log record as the collection start position.
If the segment record end node in the previous collection log record is smaller than the current collection node, the collection server 104 has already collected part or all of the source data. If, in addition, the segment record end node in the previous collection log record is not greater than the data collection node of the latest data record, the collection server 104 has collected only part of the source data, and the segment record end node in the previous collection log record is then used as the collection start position.
Correspondingly, the determination rule implies that if the segment record end node in the previous collection log record is smaller than the current collection node and equal to the data collection node of the latest data record, no source data is collected, because the collection server 104 has already collected all of the source data.
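The determination rule can be condensed into a short function. This is a sketch under assumptions: the function name `acquisition_start` and its parameters are hypothetical, nodes are any mutually comparable values (timestamps or capacities), and returning `None` when the previous end node lies beyond the latest data record is a choice made here, since that case is not covered by the text.

```python
def acquisition_start(prev_end, first_node, last_node, current_node):
    """Return the collection start position, or None when nothing is left to collect.

    prev_end     -- segment record end node of the previous log record, or None if the
                    previous log record does not cover this source data
    first_node   -- data collection node of the first generated data record
    last_node    -- data collection node of the last generated (latest) data record
    current_node -- the current collection node
    """
    if prev_end is None:
        # The source data has not been collected before: start where it was generated.
        return first_node
    if prev_end < current_node and prev_end < last_node:
        # Only part of the source data was collected: resume at the previous end node.
        return prev_end
    # prev_end equals last_node: all source data has already been collected.
    # (The case prev_end > last_node is not described in the text; treated the same here.)
    return None
```

The equality case is separated from the "not greater than" case so that a fully collected source does not trigger an empty collection run.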
The above manner of determining the collection start position is illustrated in fig. 2: source data A belongs to the case "the segment record end node in the previous collection log record is smaller than the current collection node and equal to the data collection node of the latest data record"; source data B and source data C belong to the case "the segment record end node in the previous collection log record is smaller than the current collection node and not greater than the data collection node of the latest data record"; and source data D belongs to the case "if the previous segment collection log record does not record the data of the data record, the data collection node of the first generated data record is used as the collection start position".
"the current collection node starts collecting segmented data at the collection start point position" includes: the acquisition server 104 stores the acquired acquisition data in the acquisition database 105, and stores the acquisition log corresponding to the current acquisition node in the acquisition database 105. It is worth mentioning that the collected data is summarized with the segmented data that has been collected previously by the collection server 104, and data consistent with the source data is obtained.
In a second aspect, an application scenario of a data segment collection method is provided, where the data segment collection method can be applied to data collection of a platform server with participation of multiple application terminals, and an application scenario of the data segment collection method applied to "managing usage of an online conference system" is provided in the application scenario.
Specifically, the application scenarios are as follows: the manager needs to obtain "the conference scale situation of each enterprise in the last week during the use of the conference system platform". That is to say, the number of participants of each enterprise on a conference system platform in a set time period needs to be acquired, and the required source data includes: and in the set time period, meeting information of the enterprise and participant information corresponding to each meeting.
In the application scheme, the platform server 102 is a conference system platform, users log in and use an online conference system on their respective clients 101, participant information and conference information are stored in the platform database 103 as source data, and the acquisition server 104 acquires the source data meeting specific data characteristics from the platform database 103.
The scheme comprises the following steps:
setting a log table for storing segmented collection log records, wherein the information of a segmented collection log record comprises: the collection time, the segmented collection task name corresponding to the collection time, and the segment record start time and segment record end time of the segmented data;
triggering a data acquisition task, wherein the data acquisition task is used for acquiring the number of participants of each enterprise of the online conference in the last week, the data acquisition task is started to be executed, and data are acquired at intervals of acquisition time;
finding out a sectional acquisition log record corresponding to the latest acquisition time before the current acquisition time in a log table by executing SQL, and defining the sectional acquisition log record as lastLog;
obtaining, from the conference system platform, the earliest conference record between the latest collection time and the current collection time, defined as firstData, and the latest conference record, defined as lastData, with startTime in lastLog taken as the start time and the end time of lastData taken as the end time. Because a conference record needs to be associated with the participant data on the video equipment platform, this approach ensures that the conference data collected in segments contains the participant data and is complete.
If the lastLog does not have the conference record corresponding to the firstData, starting to collect the conference record at the conference start time of the firstData; if the endTime of lastLog is less than the current acquisition time and the endTime of lastLog is not greater than startTime of lastData, starting to acquire data from the endTime of lastLog; storing the collected data in the collection database 105 completes the collection operation of the segmented data.
Through timed task polling and repeated data collection tasks, the original data in the conference system is collected into the collection database, so that the data on both sides remains consistent. Finally, the conference data is counted by enterprise grouping through SQL to obtain the relationship data between enterprises and numbers of participants, which is returned to the front end and rendered on the page, yielding the statistical chart shown in fig. 3. Specifically, since all collected data satisfying the specific data characteristics within the set time period (set by the collection time) has already been stored in the collection database 105, the number of participants of an enterprise can be obtained by using SQL to aggregate the source data corresponding to that enterprise, and the relationship data can be displayed visually.
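The grouping statistic might be computed as in the following sketch. The schema is entirely hypothetical: the tables `meeting` and `participant`, their columns, and the function name `participants_per_enterprise` are assumptions made for illustration, and SQLite is assumed as the collection database. Timestamps are assumed to be stored as ISO 8601 text.

```python
import sqlite3


def participants_per_enterprise(conn, period_start, period_end):
    """Count participant entries per enterprise for meetings starting in [period_start, period_end)."""
    return conn.execute(
        """
        SELECT m.enterprise, COUNT(p.participant_id) AS participant_count
        FROM meeting AS m
        JOIN participant AS p ON p.meeting_id = m.meeting_id
        WHERE m.start_time >= ? AND m.start_time < ?
        GROUP BY m.enterprise
        ORDER BY m.enterprise
        """,
        (period_start, period_end),
    ).fetchall()
```

The result is a list of (enterprise, participant_count) rows that a front end could render as a chart. Note that this counts participations, so a person attending two meetings is counted twice; counting distinct people would need a different aggregate.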
In a third aspect, the present disclosure provides a data segmented acquisition apparatus, which performs data segmented acquisition by using the data segmented acquisition method, including:
the system comprises an acquisition database, a data storage module and a data processing module, wherein the acquisition database is used for storing segmented acquisition log records, and each segmented acquisition log record records at least a segment record start node and a segment record end node of the segmented data;
the platform database is used for storing data records;
the log acquisition unit is used for acquiring, from the acquisition database, a previous segmented acquisition log record corresponding to a previous acquisition node, wherein the previous segmented acquisition log record records at least a segment record start node and a segment record end node of the segmented data acquired at the previous acquisition node, and the span between the current acquisition node and the previous acquisition node is an acquisition node section;
and the data record acquisition unit is used for acquiring the first generated data record and the last generated data record in the platform database within the acquisition node section, and determining the acquisition start position as follows:
if the previous segmented acquisition log record does not record the data of the data record, the data acquisition node of the first generated data record is taken as the acquisition start position;
if the data of the previous segmented acquisition log record and the last generated data record intersect, the segment record end node in the previous segmented acquisition log record is taken as the acquisition start position;
and the current acquisition node starts to acquire the segmented data at the acquisition start position and records the segmented acquisition log record corresponding to the current acquisition node.
It should be noted that the data segment acquisition method executed by the data segment acquisition device is as described in the embodiments of the first aspect, and is not repeated here.
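The two units of the apparatus can be sketched over generic acquisition nodes, modelled here as integers so that a node can stand for a timestamp or a data-capacity offset. This is a hedged sketch only: SegmentLog, find_previous_log and acquisition_start are hypothetical names, and treating "no log with a matching task name" as the case where the previous log does not record the data is an assumption of this example. The equal case, in which nothing is collected, follows the dependent claims.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SegmentLog:
    """Hypothetical segmented acquisition log record."""
    task_name: str
    acquisition_node: int  # node at which this acquisition ran
    start_node: int        # segment record start node
    end_node: int          # segment record end node


def find_previous_log(
    logs: Iterable[SegmentLog], task_name: str, current_node: int
) -> Optional[SegmentLog]:
    """Log acquisition unit: latest log of this task before the current node."""
    candidates = [
        log for log in logs
        if log.task_name == task_name and log.acquisition_node < current_node
    ]
    return max(candidates, key=lambda log: log.acquisition_node, default=None)


def acquisition_start(
    prev_log: Optional[SegmentLog],
    first_node: int,
    last_node: int,
    current_node: int,
) -> Optional[int]:
    """Data record acquisition unit: pick the acquisition start position.

    first_node / last_node are the nodes of the first and last generated data
    records inside the acquisition node section.
    """
    if prev_log is None:
        # No previous log covers this data: start at the first generated record.
        return first_node
    if prev_log.end_node < current_node and prev_log.end_node < last_node:
        # Intersection with newer data: resume at the previous segment's end.
        return prev_log.end_node
    # Covers prev_log.end_node == last_node (nothing new) and any other case.
    return None
```

For example, with previous logs ending at nodes 8 and 17 for the same task and new records spanning nodes 18 to 28, the current run at node 30 would resume from node 17.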
The computer system of the server for implementing the data segment collection method of the present embodiment includes a central processing unit (CPU) that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) or a program loaded from a storage section into a Random Access Memory (RAM). The RAM also stores various programs and data necessary for system operation. The CPU, ROM, and RAM are connected to each other via a bus. An input/output (I/O) interface is also connected to the bus.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program containing program code for performing the data segment acquisition method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. The computer program performs the above-described functions defined in the system of the present invention when executed by a Central Processing Unit (CPU).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. The modules described in the embodiments of the present invention may be implemented by software, or may be implemented by hardware, and the described modules may also be disposed in a processor.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform the process steps corresponding to the data segment collection method. An electronic device is also provided, comprising: a processor; and a memory having stored therein computer program instructions which, when executed by the processor, cause the processor to perform the data segment acquisition method of the first aspect mentioned above.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A data segmentation acquisition method is characterized by comprising the following steps:
acquiring, from an acquisition database, a previous segmented acquisition log record corresponding to a previous acquisition node, wherein the previous segmented acquisition log record records at least a segment record start node and a segment record end node of the segmented data acquired at the previous acquisition node, and the span between the current acquisition node and the previous acquisition node is an acquisition node section;
acquiring the first generated data record and the last generated data record in the platform database within the acquisition node section, and determining an acquisition start position as follows:
if the previous segmented acquisition log record does not record the data of the data record, taking the data acquisition node of the first generated data record as the acquisition start position;
if the data of the previous segmented acquisition log record and the last generated data record intersect, taking the segment record end node in the previous segmented acquisition log record as the acquisition start position;
and the current acquisition node starts to acquire the segmented data at the acquisition start position and records the segmented acquisition log record corresponding to the current acquisition node.
2. The data segment collection method of claim 1, wherein "if the data of the previous segmented acquisition log record and the last generated data record intersect" comprises: the segment record end node in the previous segmented acquisition log record is smaller than the current acquisition node and not larger than the data acquisition node of the latest data record.
3. The data segment collection method of claim 1, wherein no data is collected if the segment record end node in the previous segmented acquisition log record is smaller than the current acquisition node and equal to the data acquisition node of the latest data record.
4. The data segment collection method of claim 1, wherein "if the previous segmented acquisition log record does not record the data of the data record" comprises: the segment acquisition task name in the previous segmented acquisition log record does not match the data name of the data record.
5. The method of claim 1, wherein acquiring the previous segmented acquisition log record corresponding to the previous acquisition node in the acquisition database comprises: triggering a data acquisition task at the current acquisition node, and searching for the previous segmented acquisition log record based on the data acquisition task and the current acquisition node.
6. The data segment collection method of claim 5, wherein the previous segmented acquisition log record is obtained by performing a matching search based on the task name of the data acquisition task and the segment acquisition task name in the segmented acquisition log record, and based on the previous acquisition node and the acquisition node in the segmented acquisition log record.
7. The method according to claim 1, wherein in the step of obtaining the first generated data record and the last generated data record in the platform database in the collection node segment, if the data corresponding to the data record has a dependency relationship, the last generated data record is determined according to the last generation node of the depended-on data on which the data depends.
8. The data segment collection method of claim 7, wherein the data corresponding to the data record is considered to have a dependency relationship if the depended data is from the attribute source of the data corresponding to the data record.
9. The data segment collection method of claim 1, wherein the collection node is a collection time node or a collection data capacity node.
10. The data segment collection method of claim 1, applied to data collection of a platform server with participation of multiple application terminals.
11. A data segment collection device, comprising:
the system comprises an acquisition database, a data storage module and a data processing module, wherein the acquisition database is used for storing segmented acquisition log records, and each segmented acquisition log record records at least a segment record start node and a segment record end node of the segmented data;
the platform database is used for storing data records;
the log acquisition unit is used for acquiring, from the acquisition database, a previous segmented acquisition log record corresponding to a previous acquisition node, wherein the previous segmented acquisition log record records at least a segment record start node and a segment record end node of the segmented data acquired at the previous acquisition node, and the span between the current acquisition node and the previous acquisition node is an acquisition node section;
and the data record acquisition unit is used for acquiring the first generated data record and the last generated data record in the platform database within the acquisition node section, and determining the acquisition start position as follows:
if the previous segmented acquisition log record does not record the data of the data record, the data acquisition node of the first generated data record is taken as the acquisition start position;
if the data of the previous segmented acquisition log record and the last generated data record intersect, the segment record end node in the previous segmented acquisition log record is taken as the acquisition start position;
and the current acquisition node starts to acquire the segmented data at the acquisition start position and records the segmented acquisition log record corresponding to the current acquisition node.
12. An electronic device, comprising:
a processor; and
a memory having stored therein computer program instructions which, when executed by the processor, cause the processor to perform a data segment acquisition method according to any one of claims 1-10.
CN202110021400.0A 2021-01-08 2021-01-08 Data segment acquisition method and device Active CN112764988B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110021400.0A CN112764988B (en) 2021-01-08 2021-01-08 Data segment acquisition method and device

Publications (2)

Publication Number Publication Date
CN112764988A true CN112764988A (en) 2021-05-07
CN112764988B CN112764988B (en) 2024-02-23

Family

ID=75700912

Country Status (1)

Country Link
CN (1) CN112764988B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant