CN113760885A - Incremental log processing method and device, electronic equipment and storage medium - Google Patents



Publication number
CN113760885A
Authority
CN
China
Prior art keywords
data packet
target data
target
analysis
analysis result
Prior art date
Legal status
Pending
Application number
CN202011152022.1A
Other languages
Chinese (zh)
Inventor
程志良
林德强
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202011152022.1A
Publication of CN113760885A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Abstract

Embodiments of the present disclosure provide an incremental log processing method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a plurality of target data packets corresponding to an incremental log, where each target data packet includes a sequence number identifier that represents the order in which the target data packet was acquired; determining, for each of the plurality of target data packets, a parsing unit corresponding to the target data packet; calling the parsing units corresponding to the target data packets to parse the target data packets in parallel to obtain a parsing result corresponding to each target data packet, where each parsing result includes the sequence number identifier of the corresponding target data packet; and outputting the plurality of parsing results in an output order determined by the plurality of sequence number identifiers.

Description

Incremental log processing method and device, electronic equipment and storage medium
Technical Field
The embodiment of the disclosure relates to the technical field of computers, and more particularly, to an incremental log processing method and apparatus, an electronic device, and a storage medium.
Background
A database incremental log contains various database table operation events, and each position in the log corresponds to one such event. Database table operation events may include insert operations, update operations, and delete operations.
With the development of Internet technology, services in the field of real-time synchronization of database changes provide incremental data subscription and consumption by parsing the database incremental log, for example database mirroring, real-time database backup, multi-level indexing, and service cache refreshing. Here, the database incremental log refers to the binary log (i.e., Binlog).
In implementing the concept of the present disclosure, the inventors found at least the following problem in the related art: the related art tends to delay data synchronization subscriptions.
Disclosure of Invention
In view of this, the present disclosure provides an incremental log processing method and apparatus, an electronic device, and a storage medium.
One aspect of the embodiments of the present disclosure provides an incremental log processing method, including: acquiring a plurality of target data packets corresponding to an incremental log, where each target data packet includes a sequence number identifier that represents the order in which the target data packet was acquired; determining, for each of the plurality of target data packets, a parsing unit corresponding to the target data packet; calling the parsing units corresponding to the target data packets to parse the target data packets in parallel to obtain a parsing result corresponding to each target data packet, where each parsing result includes the sequence number identifier of the corresponding target data packet; and outputting the plurality of parsing results in an output order determined by the plurality of sequence number identifiers.
Another aspect of the embodiments of the present disclosure provides an incremental log processing apparatus, including: an obtaining module, configured to obtain a plurality of target data packets corresponding to an incremental log, where each target data packet includes a sequence number identifier that represents the order in which the target data packet was obtained; a determining module, configured to determine, for each of the plurality of target data packets, a parsing unit corresponding to the target data packet; a calling module, configured to call the parsing units corresponding to the target data packets to parse the target data packets in parallel, so as to obtain a parsing result corresponding to each target data packet, where each parsing result includes the sequence number identifier of the corresponding target data packet; and an output module, configured to output the plurality of parsing results in an output order determined by the plurality of sequence number identifiers.
Another aspect of an embodiment of the present disclosure provides an electronic device including: one or more processors; a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described above.
Another aspect of embodiments of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to implement a method as described above.
Another aspect of embodiments of the present disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.
According to the embodiments of the present disclosure, a plurality of target data packets corresponding to an incremental log are obtained, where each target data packet includes a sequence number identifier that represents the order in which the target data packet was obtained; for each of the plurality of target data packets, a parsing unit corresponding to the target data packet is determined; the parsing units are called to parse the target data packets in parallel to obtain a parsing result corresponding to each target data packet, where each parsing result includes the sequence number identifier of the corresponding target data packet; and the plurality of parsing results are output in an output order determined by the plurality of sequence number identifiers. Because the time-consuming parsing process is carried out by calling a plurality of parsing units to parse the sequentially read target data packets in parallel, the capability of a multi-core processor can be fully used and the data processing speed is improved, which at least partially solves the technical problem that the related art tends to delay data synchronization subscriptions. In addition, because the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers, the parsing results are output in order.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an exemplary system architecture to which the incremental log processing method may be applied, according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a method of incremental log processing according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of another incremental log processing method according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram that schematically illustrates a method of incremental log processing, in accordance with an embodiment of the present disclosure;
FIG. 5 schematically illustrates a block diagram of an incremental log processing apparatus according to an embodiment of the present disclosure; and
FIG. 6 schematically illustrates a block diagram of an electronic device suitable for implementing the incremental log processing method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).
In the related art, the incremental logs are processed by sequential reading and serial parsing. In the process of implementing the present disclosure, the inventor finds that for a service scenario with a large data volume and a high real-time requirement, since sequential reading and serial parsing of the incremental log may be limited by the processing capability of a single thread (i.e., a single core), if the data volume exceeds the upper limit of the processing capability of the single thread, data delay of the synchronous subscription may be caused.
In order to solve the above problems in the related art, an embodiment of the present disclosure vertically splits the read-and-parse processing logic for a database incremental log into four stages: sequential reading, parsing unit allocation, parallel parsing, and sequential output. Specifically, the embodiment of the disclosure provides an incremental log processing method and apparatus, and an electronic device to which the method can be applied. In the sequential reading stage, a plurality of target data packets corresponding to the incremental log are obtained, where each target data packet includes a sequence number identifier that represents the order in which the target data packet was obtained. In the parsing unit allocation stage, for each of the plurality of target data packets, the parsing unit corresponding to the target data packet is determined. In the parallel parsing stage, the parsing units corresponding to the target data packets are called to parse the target data packets in parallel, and a parsing result corresponding to each target data packet is obtained. After the parallel parsing is finished, the sequential output stage begins, and the plurality of parsing results are output in an output order determined by the plurality of sequence number identifiers.
According to the embodiments of the present disclosure, the time-consuming parsing process is executed in parallel, which makes full use of a multi-core processor, increases processing capacity, and thereby mitigates the delay of data synchronization subscriptions.
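The four stages can be illustrated with a minimal Java sketch. It is not the implementation of the disclosure: the class name `IncrementalLogPipeline`, the use of strings as packets, the use of one single-thread executor as a stand-in for each parsing unit, and the use of the list index as the sequence number identifier are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical end-to-end sketch: read in order, allocate, parse in parallel, output in order. */
final class IncrementalLogPipeline {

    static List<String> process(List<String> packets, int unitCount,
                                Function<String, String> parser) {
        // Each single-thread executor stands in for one parsing unit (a queue plus a thread).
        List<ExecutorService> units = new ArrayList<>();
        for (int i = 0; i < unitCount; i++) {
            units.add(Executors.newSingleThreadExecutor());
        }
        try {
            // Sequential reading: the list index serves as the sequence number identifier.
            List<Future<String>> inFlight = new ArrayList<>();
            for (int seq = 0; seq < packets.size(); seq++) {
                String packet = packets.get(seq);
                ExecutorService unit = units.get(seq % unitCount);      // allocation
                inFlight.add(unit.submit(() -> parser.apply(packet)));  // parallel parsing
            }
            // Sequential output: collect results in sequence-number order.
            List<String> ordered = new ArrayList<>();
            for (Future<String> future : inFlight) {
                ordered.add(future.get());
            }
            return ordered;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a parsing result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("parsing failed", e.getCause());
        } finally {
            for (ExecutorService unit : units) {
                unit.shutdown();
            }
        }
    }
}
```

Waiting on the futures in submission order is one simple way to restore the acquisition order; it trades some latency for simplicity, because a slow early packet holds back later results that are already parsed.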
Fig. 1 schematically illustrates an exemplary system architecture 100 to which the incremental log processing method may be applied, according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired and/or wireless communication links, and so forth.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a shopping-like application, a web browser application, a search-like application, an instant messaging tool, a mailbox client, and/or social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background management server (for example only) that provides support for incremental data subscription services subscribed by users with the terminal devices 101, 102, 103. The background management server may analyze and otherwise process the received data such as the user subscription request, and feed back a processing result (e.g., an analysis result obtained or generated according to the user subscription request) to the terminal device.
It should be noted that the incremental log processing method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the incremental log processing apparatus provided by the embodiments of the present disclosure may be generally disposed in the server 105. The incremental log processing method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the incremental log processing apparatus provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
FIG. 2 schematically shows a flow chart of a method of incremental log processing according to an embodiment of the present disclosure.
As shown in fig. 2, the method includes operations S210 to S240.
In operation S210, a plurality of target data packets corresponding to the incremental log are obtained, where each target data packet includes a sequence number identifier, and each sequence number identifier is used to represent an order in which the target data packets are obtained.
In an embodiment of the present disclosure, the incremental log may refer to a database incremental log. The Socket programming interface provided by the JDK (Java Development Kit) may be called to read the incremental log in the database sequentially, so as to obtain a plurality of target data packets corresponding to the incremental log.
According to the embodiment of the present disclosure, each target data packet may include a sequence number identifier that represents the order in which the target data packet was acquired. The sequence number identifier is set for the following reason: when multiple parsing units parse in parallel, the execution order among the parsing units cannot be guaranteed, yet the parsing results must be output in order, so a sequence number identifier is set for each target data packet to provide a basis for the subsequent sequential output. In addition, each target data packet may further include a data packet header and a data packet body.
In operation S220, for each of a plurality of target packets, a parsing unit corresponding to the target packet is determined.
In the embodiment of the present disclosure, after the plurality of target data packets are obtained, a parsing unit for parsing the target data packet needs to be determined for each of the plurality of target data packets. So that the target data packets are distributed evenly across the parsing units, the parsing unit corresponding to a target data packet can be determined based on a balanced allocation policy. The balanced allocation policy may include an average allocation policy or a minimum waiting policy.
According to an embodiment of the present disclosure, each parsing unit may be configured to parse one target data packet or at least two target data packets, that is, different target data packets may be processed by the same parsing unit. The number of the parsing units for parsing the plurality of target packets is less than or equal to the number of the target packets.
In operation S230, the parsing units corresponding to the target data packets are called to parse the target data packets in parallel, and a parsing result corresponding to each target data packet is obtained, where each parsing result includes the sequence number identifier of the corresponding target data packet.
In the embodiment of the present disclosure, after the parsing unit for each target data packet has been determined, the parsing units may perform their parsing operations in parallel, that is, each parsing unit parses its corresponding target data packets concurrently with the others, so as to obtain a parsing result corresponding to each target data packet. Each parsing result may include a sequence number identifier, which represents the order in which the target data packet corresponding to the parsing result was obtained. In addition, each parsing result may further include a data packet header and a data packet body.
According to an embodiment of the present disclosure, because different databases may use different protocols, the logic for parsing data packets may differ between databases. In order to adapt to different databases, the embodiments of the present disclosure provide a parsing interface, so that a parsing unit can parse a target data packet by calling the parsing interface.
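As an illustration of such a parsing interface, the sketch below defines a hypothetical `PacketParser` interface with one toy implementation. The interface name, its single `parse` method, and the `Utf8TextParser` class are assumptions; the source does not specify the interface's shape, and a real implementation would decode the binary log protocol of a particular database.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical parsing interface: one implementation per database protocol. */
interface PacketParser {
    String parse(byte[] packetBody);
}

/** Toy implementation for illustration only: treats the packet body as UTF-8 text. */
final class Utf8TextParser implements PacketParser {
    @Override
    public String parse(byte[] packetBody) {
        return new String(packetBody, StandardCharsets.UTF_8);
    }
}
```

A parsing unit that holds a `PacketParser` reference can then be reused across databases by swapping the implementation.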
In operation S240, the plurality of parsing results are output in an output order determined by the plurality of sequence number identifiers.
In the embodiment of the present disclosure, in order to output the parsing results in order, the order determined by the plurality of sequence number identifiers may be taken as the output order, and the plurality of parsing results are output in that order. Because the sequence number identifiers are assigned according to the order in which the target data packets are acquired, and the target data packets are acquired sequentially, the sequence number identifiers are themselves ordered. Accordingly, outputting the parsing results in the order determined by the sequence number identifiers yields sequential output.
According to the embodiment of the disclosure, the output order of the parsing results matches the acquisition order of the corresponding target data packets, that is, the parsing result of a target data packet acquired earlier is output earlier.
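One possible way to realize this ordered output is a small reorder buffer that holds back results arriving ahead of their turn. The sketch below assumes that the sequence number identifiers are consecutive and increasing; the class `OrderedOutput` and its `offer` method are hypothetical names, not part of the disclosure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reorder buffer: releases parsing results strictly by sequence number. */
final class OrderedOutput {
    private final Map<Long, String> pending = new HashMap<>();
    private long nextToEmit;

    OrderedOutput(long firstSequenceNumber) {
        this.nextToEmit = firstSequenceNumber;
    }

    /** Accepts one parsing result and returns every result that is now ready, in order. */
    synchronized List<String> offer(long sequenceNumber, String result) {
        pending.put(sequenceNumber, result);
        List<String> ready = new ArrayList<>();
        // Emit as long as the next expected sequence number has already arrived.
        while (pending.containsKey(nextToEmit)) {
            ready.add(pending.remove(nextToEmit));
            nextToEmit++;
        }
        return ready;
    }
}
```

For example, if results 2 and 3 arrive before result 1, `offer` returns an empty list twice and then returns all three results together once result 1 arrives.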
According to the technical solution of the embodiments of the present disclosure, a plurality of target data packets corresponding to an incremental log are obtained, where each target data packet includes a sequence number identifier that represents the order in which the target data packet was obtained; for each of the plurality of target data packets, a parsing unit corresponding to the target data packet is determined; the parsing units are called to parse the target data packets in parallel to obtain a parsing result corresponding to each target data packet, where each parsing result includes the sequence number identifier of the corresponding target data packet; and the plurality of parsing results are output in an output order determined by the plurality of sequence number identifiers. Because the time-consuming parsing process is carried out by calling a plurality of parsing units to parse the sequentially read target data packets in parallel, the capability of a multi-core processor can be fully used and the data processing speed is improved, which at least partially solves the technical problem that the related art tends to delay data synchronization subscriptions. In addition, because the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers, the parsing results are output in order.
Optionally, on the basis of the above technical solution, acquiring the plurality of target data packets corresponding to the incremental log may include the following operations.
A plurality of original data packets corresponding to the incremental log are obtained. And setting a serial number identifier corresponding to each original data packet according to the sequence of obtaining each original data packet in the plurality of original data packets to obtain a target data packet corresponding to each original data packet.
In the embodiment of the present disclosure, in order to output the parsing result in sequence, a corresponding sequence number identifier may be set for each original data packet, where the sequence number identifier corresponding to each original data packet is determined by the sequence in which the original data packet is obtained. The plurality of sequence number identifications may be a set of consecutive and increasing sequence numbers.
For example, if the original data packet 1, the original data packet 2, the original data packet 3, and the original data packet 4 are sequentially obtained, a sequence number identifier corresponding to the original data packet 1 may be set to be 1, a sequence number identifier corresponding to the original data packet 2 may be set to be 2, a sequence number identifier corresponding to the original data packet 3 may be set to be 3, and a sequence number identifier corresponding to the original data packet 4 may be set to be 4.
According to the embodiment of the present disclosure, setting a corresponding sequence number identifier for each original data packet may be understood as performing encapsulation processing on the original data packet. The encapsulation process is as follows.
[Figure: encapsulation of an original data packet together with its sequence number identifier; image not reproduced]
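Because the original listing is not available in this text, the following Java sketch only illustrates what such an encapsulation could look like. The class names `TargetPacket` and `PacketEncapsulator` and the field layout are assumptions, and the counter starts at 1 to match the example above.

```java
/** Hypothetical target data packet: an original packet plus its sequence number identifier. */
final class TargetPacket {
    final long sequenceNumber; // order in which the original data packet was obtained
    final byte[] header;       // data packet header
    final byte[] body;         // data packet body

    TargetPacket(long sequenceNumber, byte[] header, byte[] body) {
        this.sequenceNumber = sequenceNumber;
        this.header = header;
        this.body = body;
    }
}

/** Assigns consecutive, increasing sequence numbers in acquisition order. */
final class PacketEncapsulator {
    private long nextSequenceNumber = 1; // assumption: numbering starts at 1

    /** Intended to be called from the single reading thread, once per original packet. */
    TargetPacket encapsulate(byte[] header, byte[] body) {
        return new TargetPacket(nextSequenceNumber++, header, body);
    }
}
```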
Optionally, on the basis of the above technical solution, determining the parsing unit corresponding to the target packet may include the following operations.
The parsing unit corresponding to the target data packet is determined based on a balanced allocation policy.
In the embodiment of the disclosure, so that the target data packets are distributed evenly across the parsing units, the parsing unit corresponding to a target data packet may be determined based on a balanced allocation policy. The balanced allocation policy may include an average allocation policy or a minimum waiting policy, and the appropriate policy may be selected according to the actual situation.
According to the embodiment of the disclosure, the average allocation policy is better suited to application scenarios in which the sizes of different target data packets do not differ much, for example, inventory information of products. The minimum waiting policy is better suited to application scenarios in which the sizes of different target data packets differ greatly, for example, description information of products.
Optionally, on the basis of the above technical solution, each parsing unit includes a cache queue, and the cache queue is used for caching target data packets.
Determining the parsing unit corresponding to the target data packet based on the balanced allocation policy may include the following operations.
The parsing unit corresponding to the target data packet is determined according to the number of cached data packets in each of the plurality of cache queues.
In the embodiment of the present disclosure, each parsing unit may include a cache queue for caching target data packets, that is, a target data packet that needs to be parsed may be added to the cache queue.
According to the embodiment of the disclosure, so that the target data packets are distributed evenly across the parsing units, the parsing unit corresponding to a target data packet may be determined based on the minimum waiting policy, that is, the number of data packets currently cached in each of the plurality of cache queues may be obtained. For each target data packet, the cache queue with the fewest cached data packets may be identified from these counts, and the parsing unit that contains this cache queue may be determined as the parsing unit corresponding to the target data packet.
According to the embodiment of the present disclosure, each time a parsing unit is determined for a target data packet, the number of cached data packets of the cache queue in that parsing unit needs to be updated, that is, the number may be increased by 1. The corresponding parsing unit may be determined for each of the plurality of target data packets in turn, following the acquisition order represented by the sequence number identifiers.
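The minimum waiting policy can be sketched as follows. This is an illustration under stated assumptions: the class `MinWaitAllocator` is a hypothetical name, parsing units are identified by a zero-based index, ties go to the lowest index, and the `onParsed` method that decrements a count when a packet has been parsed is an addition not described in the source.

```java
/** Hypothetical allocator for the minimum waiting policy. */
final class MinWaitAllocator {
    private final int[] cachedCounts; // packets currently cached per parsing unit

    MinWaitAllocator(int unitCount) {
        this.cachedCounts = new int[unitCount];
    }

    /** Picks the unit whose cache queue holds the fewest packets and counts the new packet. */
    synchronized int allocate() {
        int best = 0;
        for (int i = 1; i < cachedCounts.length; i++) {
            if (cachedCounts[i] < cachedCounts[best]) {
                best = i; // strictly fewer, so ties stay with the lowest index
            }
        }
        cachedCounts[best]++; // update the count after each allocation
        return best;
    }

    /** Assumed hook: called when a parsing unit has finished parsing one packet. */
    synchronized void onParsed(int unit) {
        cachedCounts[unit]--;
    }
}
```

With three units, the first four calls to `allocate` return 0, 1, 2, 0; if unit 2 then finishes its packet, the next call returns 2.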
Optionally, on the basis of the foregoing technical solution, each parsing unit has a corresponding unit number.
Determining the parsing unit corresponding to the target packet based on the balanced allocation policy may include the following operations.
A remainder result is determined by dividing the sequence number identifier included in the target data packet by the number of parsing units. The unit number that equals the remainder result is determined as the target number. The parsing unit corresponding to the target number is determined as the parsing unit corresponding to the target data packet.
In the embodiment of the present disclosure, so that the target data packets are distributed evenly across the parsing units, the parsing unit corresponding to a target data packet may be determined based on the average allocation policy, that is, the number of parsing units may be obtained. For each target data packet, the remainder obtained by dividing the sequence number identifier of the target data packet by the number of parsing units is taken as the remainder result, the unit number that equals the remainder result is determined as the target number, and the parsing unit corresponding to the target number is determined as the parsing unit corresponding to the target data packet.
Illustratively, for example, 3 parsing units i are included, the unit number corresponding to the parsing unit i is i, 1 ≦ i ≦ 3, and i is an integer. If the serial number identifier corresponding to the target packet 10 is 10, 10% 3 is equal to 1, that is, the parsing unit corresponding to the target packet 10 is determined as parsing unit 1, where% represents the remainder operator.
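The remainder-based allocation can be sketched in Java as follows. The class and method names are hypothetical. The sketch assumes unit numbers run from 0 to N-1 so that every possible remainder, including 0, maps to a unit; with units numbered from 1 as in the example above, a remainder of 0 would need an additional mapping that the text does not specify.

```java
/** Sketch of the average distribution policy: unit number = sequence number mod unit count. */
final class ModuloAllocator {
    /**
     * Returns the unit number for a packet.
     * Assumption: units are numbered 0 .. unitCount-1.
     */
    static int unitFor(long sequenceNumber, int unitCount) {
        if (unitCount <= 0) {
            throw new IllegalArgumentException("unitCount must be positive");
        }
        // floorMod keeps the result non-negative even for a negative sequence number
        return (int) Math.floorMod(sequenceNumber, (long) unitCount);
    }
}
```

For the example in the text, `ModuloAllocator.unitFor(10, 3)` evaluates to 1.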
Optionally, on the basis of the above technical solution, each parsing unit includes a parsing thread and a cache queue corresponding to the parsing thread.
After determining the parsing unit corresponding to the target packet, the method may further include the following operations.
The target data packet is added to the buffer queue in the parsing unit corresponding to the target data packet.
Invoking the parsing unit corresponding to the target data packet to parse the target data packet in parallel to obtain a parsing result corresponding to the target data packet may include the following operations.
The parsing thread corresponding to the target data packet obtains the target data packet from the buffer queue corresponding to the parsing thread. The parsing thread parses the target data packet in parallel to obtain the parsing result corresponding to the target data packet.
In an embodiment of the present disclosure, each parsing unit may include a parsing thread and a buffer queue, where the parsing thread may be used to parse target data packets, and the buffer queue may be used to buffer target data packets. The target data packets parsed by each parsing thread come from the buffer queue corresponding to that parsing thread.
According to the embodiment of the disclosure, after the parsing unit corresponding to the target data packet is determined, the target data packet may be added to the cache queue in the parsing unit corresponding to the target data packet, so that the parsing thread corresponding to the target data packet may obtain the target data packet from the cache queue corresponding to the parsing thread, and parse the target data packet in parallel, to obtain the parsing result corresponding to the target data packet.
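A parsing unit of this kind, one thread bound to one queue, might look like the following Java sketch. All names (`ParsingUnit`, `submit`, `start`, `stop`) are hypothetical, the parsing logic is passed in as a plain function, and the shared `results` sink is an assumption standing in for whatever collects the parsing results.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Sketch of one parsing unit: a cache queue plus the thread that drains it. */
final class ParsingUnit<P, R> {
    private final BlockingQueue<P> cacheQueue = new LinkedBlockingQueue<>();
    private final Thread parsingThread;

    /**
     * parser  - stand-in for the real parsing logic (hypothetical)
     * results - shared sink collecting the results of all units (hypothetical)
     */
    ParsingUnit(Function<P, R> parser, BlockingQueue<R> results) {
        this.parsingThread = new Thread(() -> {
            try {
                while (true) {
                    P packet = cacheQueue.take();      // blocks until a packet is buffered
                    results.put(parser.apply(packet)); // hand the result to the shared sink
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();    // stop() was called; exit the loop
            }
        });
        this.parsingThread.setDaemon(true);
    }

    void start() {
        parsingThread.start();
    }

    /** Adds a target packet to this unit's cache queue. */
    void submit(P packet) {
        cacheQueue.add(packet);
    }

    void stop() {
        parsingThread.interrupt();
    }
}
```

Because each unit has exactly one thread, packets inside a single unit are parsed in the order they were queued; ordering across units still has to be restored when the results are output.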
Optionally, on the basis of the above technical solution, the parsing thread parsing the target data packet in parallel to obtain the parsing result corresponding to the target data packet may include the following operation.
The parsing thread calls a parsing interface to parse the target data packet in parallel to obtain the parsing result corresponding to the target data packet.
In the embodiment of the present disclosure, in order to adapt to different databases, a parsing interface may be provided for the parsing thread, so that the parsing thread may call it when parsing a target data packet. The parsing interface may be LogEvent decode(Binlog), which is implemented as follows.
[Figure BDA0002739664010000121: listing of the parsing interface, provided as an image in the original publication]
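Since the original listing is only available as an image, the following Java sketch merely illustrates what an interface with the signature LogEvent decode(Binlog) could look like. The fields of `Binlog` and `LogEvent`, the interface name `LogParser`, and the `Utf8TextParser` implementation are all hypothetical placeholders and do not reproduce the listing of the disclosure.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical raw packet: payload bytes plus the sequence number set when it was read. */
final class Binlog {
    final long sequenceNumber;
    final byte[] payload;

    Binlog(long sequenceNumber, byte[] payload) {
        this.sequenceNumber = sequenceNumber;
        this.payload = payload;
    }
}

/** Hypothetical parsing result; it carries the sequence number of its packet. */
final class LogEvent {
    final long sequenceNumber;
    final String text;

    LogEvent(long sequenceNumber, String text) {
        this.sequenceNumber = sequenceNumber;
        this.text = text;
    }
}

/** The parsing interface: one implementation per database type can be plugged in. */
interface LogParser {
    LogEvent decode(Binlog binlog);
}

/** Placeholder implementation that treats the payload as UTF-8 text. */
final class Utf8TextParser implements LogParser {
    @Override
    public LogEvent decode(Binlog binlog) {
        String text = new String(binlog.payload, StandardCharsets.UTF_8);
        return new LogEvent(binlog.sequenceNumber, text);
    }
}
```

Keeping the sequence number on the result is what later allows the output stage to restore the original order.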
Optionally, on the basis of the above technical solution, after invoking an analysis unit corresponding to the target data packet to analyze the target data packet in parallel, and obtaining an analysis result corresponding to the target data packet, the method may further include the following operation.
The parsing result is added to a priority queue.
Outputting the plurality of parsing results in an output order determined by the plurality of sequence number identifications may include the following operations.
A sorting interface is called to process the priority queue, so that the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers.
In the embodiment of the disclosure, after a parsing result is obtained, it may be added to the priority queue, and the priority queue is processed by calling the sorting interface, so that the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers. The sorting interface may be LogEvent getLog().
Optionally, on the basis of the above technical solution, invoking the sorting interface to process the priority queue, so that the multiple parsing results are output according to the output order determined by the multiple sequence number identifiers, which may include the following operations.
An initial sequence number identifier is determined from the plurality of sequence number identifiers. The parsing result corresponding to the initial sequence number identifier is output, and a parsing result removal method is called to delete that parsing result from the priority queue. A parsing result obtaining method is called to obtain the parsing result located at the head of the priority queue. In a case where it is determined that the sequence number identifier corresponding to the parsing result at the head of the priority queue is not consecutive with the previous sequence number identifier, the parsing result obtaining method continues to be called until the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier, where the previous sequence number identifier is the sequence number identifier corresponding to the target parsing result, and the target parsing result is the parsing result most recently deleted from the priority queue. In a case where it is determined that the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier, the parsing result at the head of the priority queue is output, and the parsing result removal method is called to delete it from the priority queue.
In the embodiment of the present disclosure, the implementation logic of the sorting interface is as follows. An initial sequence number identifier may be determined among the plurality of sequence number identifiers, where the initial sequence number identifier is the sequence number identifier corresponding to the target data packet obtained first among the plurality of target data packets. After the initial sequence number identifier is determined, the parsing result corresponding to the initial sequence number identifier may be output, and the parsing result removal method, namely PriorityBlockingQueue.remove(), may be called to delete that parsing result from the priority queue.
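After the parsing result corresponding to the initial sequence number identifier is deleted from the priority queue, the parsing result obtaining method, which is a method of PriorityBlockingQueue, may be called to obtain the parsing result located at the head of the priority queue, and it may be determined whether the sequence number identifier corresponding to that parsing result is consecutive with the previous sequence number identifier.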
If it is determined that they are not consecutive, the parsing result obtaining method may continue to be called until a current sequence number identifier consecutive with the previous sequence number identifier is obtained, where the current sequence number identifier is the sequence number identifier corresponding to the parsing result currently located at the head of the priority queue, the previous sequence number identifier is the sequence number identifier corresponding to the target parsing result, and the target parsing result is the parsing result most recently deleted from the priority queue.
If it is determined that they are consecutive, the parsing result at the head of the priority queue may be output, and the parsing result removal method is called to delete it from the priority queue.
After the parsing result corresponding to the initial sequence number identifier is deleted from the priority queue by calling the parsing result removal method, it may be determined whether the priority queue is an empty queue. If the priority queue is not empty, the parsing result obtaining method may be called to obtain the parsing result located at the head of the priority queue, and it may be determined whether the sequence number identifier corresponding to that parsing result is consecutive with the previous sequence number identifier. If the priority queue is empty, the operation of outputting parsing results ends.
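The ordering logic can be sketched in Java on top of `java.util.concurrent.PriorityBlockingQueue`. The names `OrderedOutput`, `Result` and `drainReady` are hypothetical. Using `peek()` as the obtaining method is an assumption, since the text does not name that method; `remove()` is the head-removal method inherited from `AbstractQueue`. As a simplification, the initial result is not handled as a special case: the sketch starts with a "previous" sequence number one below the initial one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

/** Sketch: release parsing results strictly in sequence-number order. */
final class OrderedOutput {
    /** Hypothetical parsing result carrying the sequence number of its packet. */
    static final class Result {
        final long seq;
        final String body;

        Result(long seq, String body) {
            this.seq = seq;
            this.body = body;
        }
    }

    // Smallest sequence number sits at the head of the queue.
    private final PriorityBlockingQueue<Result> queue =
            new PriorityBlockingQueue<>(16, Comparator.comparingLong((Result r) -> r.seq));

    // Sequence number of the result most recently removed from the queue.
    private long previousSeq;

    OrderedOutput(long initialSeq) {
        this.previousSeq = initialSeq - 1;
    }

    /** Called by parsing threads, in any order. */
    void add(Result result) {
        queue.add(result);
    }

    /** Called by the single output thread: removes every result that is next in line. */
    List<Result> drainReady() {
        List<Result> out = new ArrayList<>();
        while (true) {
            Result head = queue.peek();                 // obtain the head without removing it
            if (head == null || head.seq != previousSeq + 1) {
                return out;                             // empty queue or a gap: stop for now
            }
            queue.remove();                             // delete the head from the queue
            previousSeq = head.seq;
            out.add(head);
        }
    }
}
```

If results 3 and 2 arrive before result 1, `drainReady()` returns nothing until result 1 has been added, and then returns all three in order.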
Optionally, on the basis of the above technical solution, continuing to call the parsing result obtaining method until the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier may include the following operation.
If the number of times the parsing result obtaining method has been called is equal to a count threshold, the parsing result obtaining method continues to be called after a preset time period has elapsed, until the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier.
In the embodiment of the present disclosure, in a case where it is determined that the sequence number identifier corresponding to the parsing result located at the head of the priority queue is not consecutive with the previous sequence number identifier, it may be determined whether the number of times the parsing result obtaining method has been called is equal to the count threshold. If so, in order to reduce system overhead, the parsing result obtaining method may continue to be called after the preset time period has elapsed, until a current sequence number identifier consecutive with the previous sequence number identifier is obtained, where the current sequence number identifier is the sequence number identifier corresponding to the parsing result currently located at the head of the priority queue. It should be noted that the specific values of the count threshold and the preset time period may be set according to the actual situation and are not specifically limited herein. For example, the count threshold is 100 and the preset time period is 1 millisecond.
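The delayed retry can be illustrated with a short Java sketch. The class `BackoffPoller`, its method and its parameters are hypothetical. Resetting the counter after each delay is an assumption, since the text does not say whether the call count restarts once the threshold has been reached.

```java
import java.util.function.BooleanSupplier;

/** Sketch: poll repeatedly, pausing briefly after every countThreshold failed calls. */
final class BackoffPoller {
    /**
     * Calls headIsNext until it returns true and reports how many calls were made.
     * headIsNext stands in for "obtain the head and check that it is consecutive".
     */
    static int pollUntil(BooleanSupplier headIsNext, int countThreshold, long delayMillis) {
        int calls = 0;
        int failuresSinceDelay = 0;
        while (true) {
            calls++;
            if (headIsNext.getAsBoolean()) {
                return calls;
            }
            failuresSinceDelay++;
            if (failuresSinceDelay == countThreshold) {
                failuresSinceDelay = 0;          // assumed: the count restarts after a delay
                try {
                    Thread.sleep(delayMillis);   // give the parsing threads time to catch up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
    }
}
```

With the example values from the text, the call would be `BackoffPoller.pollUntil(check, 100, 1)`, which spins for up to 100 checks and then sleeps for 1 millisecond before checking again.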
Optionally, on the basis of the above technical solution, acquiring a plurality of target data packets corresponding to the incremental log may include the following operations.
In a case where it is determined that the number of data packets currently being processed is less than a data packet number threshold, the plurality of target data packets corresponding to the incremental log are obtained. In a case where it is determined that the number of data packets currently being processed is greater than or equal to the data packet number threshold, obtaining the plurality of target data packets corresponding to the incremental log is suspended until the number of data packets currently being processed is less than the data packet number threshold.
In the embodiment of the disclosure, the whole incremental log processing procedure is asynchronous, that is, sequential reading, allocation of parsing units, parallel parsing and sequential output are mutually independent stages. If one stage is slow, a large amount of data backs up, and memory overflow or even service downtime may occur. To avoid this, flow control needs to be applied to the whole processing procedure.
In order to realize flow control, a data packet number threshold may be set in advance and used as the basis for deciding whether further target data packets may be processed. It is determined whether the number of data packets currently being processed is greater than or equal to the data packet number threshold. If so, obtaining the plurality of target data packets corresponding to the incremental log may be suspended until the number of data packets currently being processed is less than the data packet number threshold, at which point the plurality of target data packets corresponding to the incremental log may be obtained. The data packets currently being processed are those that are in any stage of the whole procedure of sequential reading, allocation of parsing units, parallel parsing and sequential output.
According to the embodiment of the disclosure, the Semaphore class provided by the JDK may be used to control the number of data packets in the whole processing procedure, so that the number of data packets currently being processed does not exceed the data packet number threshold. In the Semaphore class, calling the acquire() method obtains a permit, and calling the release() method releases a permit. Specifically, before sequential reading, that is, before the plurality of target data packets corresponding to the incremental log are obtained, the acquire() method is called, and the target data packets are obtained only after a permit has been obtained. Each time a parsing result is output, the release() method is called to return a permit. In this way, the number of data packets currently being processed in the whole processing procedure cannot exceed the data packet number threshold.
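A minimal Java sketch of this flow control with `java.util.concurrent.Semaphore` follows. The wrapper class `FlowControl` and its method names are hypothetical, and taking one permit per data packet is an assumption about the granularity. The non-blocking `tryBeforeRead()` is added only so the behavior can be observed without blocking.

```java
import java.util.concurrent.Semaphore;

/** Sketch: bound the number of in-flight packets with a counting semaphore. */
final class FlowControl {
    private final Semaphore permits;

    FlowControl(int packetThreshold) {
        // One permit per packet that may be in flight at the same time (assumed granularity).
        this.permits = new Semaphore(packetThreshold);
    }

    /** Call before reading a packet; blocks while the threshold is reached. */
    void beforeRead() throws InterruptedException {
        permits.acquire();
    }

    /** Non-blocking variant: returns false instead of waiting. */
    boolean tryBeforeRead() {
        return permits.tryAcquire();
    }

    /** Call after a parsing result has been output; returns the permit. */
    void afterOutput() {
        permits.release();
    }
}
```

With a threshold of 2, the third read attempt is refused (or blocks, in the `beforeRead()` variant) until one result has been output and its permit returned.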
FIG. 3 schematically illustrates a flow diagram of another incremental log processing method according to an embodiment of the present disclosure.
As shown in fig. 3, the method includes operations S301 to S316.
In operation S301, the number of packets currently being processed is acquired.
In operation S302, it is determined whether the number of data packets currently being processed is less than the data packet number threshold; if so, operation S303 is performed; if not, the process returns to operation S302.
In operation S303, a plurality of original data packets corresponding to the incremental log are acquired.
In operation S304, according to the sequence of obtaining each original data packet in the plurality of original data packets, a sequence number identifier corresponding to each original data packet is set, so as to obtain a target data packet corresponding to each original data packet.
In operation S305, for each target packet of a plurality of target packets, a parsing unit corresponding to the target packet is determined based on a balanced allocation policy, where each parsing unit includes a parsing thread and a buffer queue corresponding to the parsing thread.
In operation S306, the target packet is added to the buffer queue in the parsing unit corresponding to the target packet.
In operation S307, the parsing thread corresponding to the target packet obtains the target packet from the buffer queue corresponding to the parsing thread.
In operation S308, the parsing thread calls the parsing interface to parse the target data packet in parallel, so as to obtain a parsing result corresponding to the target data packet.
In operation S309, the parsing result is added to the priority queue.
In operation S310, an initial sequence number identification is determined from a plurality of sequence number identifications.
In operation S311, an analysis result corresponding to the initial serial number identifier is output, and an analysis result removing method is invoked to delete the analysis result corresponding to the initial serial number identifier from the priority queue.
In operation S312, it is determined whether the priority queue is an empty queue; if not, operation S313 is performed; if so, operation S314 is performed.
In operation S313, the parsing result obtaining method is called to obtain the parsing result located at the head of the priority queue, and operation S315 is performed.
In the embodiment of the present disclosure, the sequence number identifier corresponding to the parsing result located at the head of the priority queue is the current sequence number identifier.
In operation S314, the operation of outputting parsing results ends.
In operation S315, it is determined whether the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier; if not, the process returns to operation S313; if so, operation S316 is performed.
In the embodiment of the present disclosure, the previous sequence number identifier is the sequence number identifier corresponding to the target parsing result, and the target parsing result is the parsing result most recently deleted from the priority queue.
In operation S316, the parsing result at the head of the priority queue is output, the parsing result removal method is called to delete it from the priority queue, and the process returns to operation S312.
In order to better understand the technical solution of the embodiment of the present disclosure, the following description is made with reference to fig. 4. FIG. 4 schematically illustrates an incremental log processing method according to an embodiment of the present disclosure.
As shown in fig. 4, 5 parsing units are included, and each parsing unit includes a parsing thread and a buffer queue corresponding to the parsing thread. 5 original data packets corresponding to the incremental log are obtained in sequence, namely original data packet 1, original data packet 2, original data packet 3, original data packet 4 and original data packet 5, where the obtaining order of the 5 original data packets follows the direction shown by the arrow. According to the order in which each of the 5 original data packets is obtained, the sequence number identifier of original data packet 1 is set to 1 to obtain target data packet 1, the sequence number identifier of original data packet 2 is set to 2 to obtain target data packet 2, the sequence number identifier of original data packet 3 is set to 3 to obtain target data packet 3, the sequence number identifier of original data packet 4 is set to 4 to obtain target data packet 4, and the sequence number identifier of original data packet 5 is set to 5 to obtain target data packet 5.
According to the embodiment of the disclosure, a parsing unit is determined for each of the 5 target data packets, the correspondence between the target data packets and the parsing units is shown in fig. 4, and the 5 parsing units parse in parallel to obtain parsing result 1, parsing result 2, parsing result 3, parsing result 4 and parsing result 5. The 5 parsing results are output in the output order determined by the 5 sequence number identifiers, that is, parsing result 1, parsing result 2, parsing result 3, parsing result 4 and parsing result 5 are output in turn.
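The five-packet example can be traced end to end with the following Java sketch. It is an illustration under several assumptions rather than the implementation of the disclosure: each parsing unit is modeled as a single-thread executor (one thread with its own internal queue), packets are assigned by remainder because the mapping of fig. 4 is not described in the text, parsing is replaced by a string prefix, and the names `PipelineSketch`, `Parsed` and `run` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.Semaphore;

/** Sketch: sequential read, parallel parse, in-order output, bounded in-flight packets. */
final class PipelineSketch {
    static final class Parsed {
        final long seq;
        final String text;
        Parsed(long seq, String text) { this.seq = seq; this.text = text; }
    }

    static List<String> run(List<String> rawPackets, int unitCount, int packetThreshold) {
        if (unitCount <= 0 || packetThreshold <= 0) {
            throw new IllegalArgumentException("unitCount and packetThreshold must be positive");
        }
        ExecutorService[] units = new ExecutorService[unitCount];
        for (int i = 0; i < unitCount; i++) {
            units[i] = Executors.newSingleThreadExecutor(); // one thread + one queue per unit
        }
        PriorityBlockingQueue<Parsed> results =
                new PriorityBlockingQueue<>(16, Comparator.comparingLong((Parsed p) -> p.seq));
        Semaphore permits = new Semaphore(packetThreshold);
        List<String> output = new ArrayList<>();
        int total = rawPackets.size();

        // Output stage: emit the head only when it is the next sequence number.
        Thread writer = new Thread(() -> {
            long next = 1;
            while (next <= total && !Thread.currentThread().isInterrupted()) {
                Parsed head = results.peek();
                if (head != null && head.seq == next) {
                    results.remove();
                    output.add(head.text);
                    permits.release();   // one packet has left the pipeline
                    next++;
                } else {
                    Thread.yield();      // gap or empty queue: try again
                }
            }
        });
        writer.start();
        try {
            long seq = 0;
            for (String raw : rawPackets) {
                permits.acquire();       // flow control before reading the next packet
                long s = ++seq;          // sequence numbers start at 1, in read order
                units[(int) (s % unitCount)].execute(
                        () -> results.add(new Parsed(s, "parsed:" + raw)));
            }
            writer.join();               // wait until every result has been output
        } catch (InterruptedException e) {
            writer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        } finally {
            for (ExecutorService unit : units) {
                unit.shutdown();
            }
        }
        return output;
    }
}
```

Calling `PipelineSketch.run(packets, 5, 3)` with five packets returns the five stand-in results in read order, regardless of which unit finishes first, while never holding more than three packets in flight.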
According to the technical solution of the embodiment of the disclosure, a sequence number identifier is set for each original data packet according to the order in which each of the plurality of original data packets is obtained, so that a target data packet corresponding to each original data packet is obtained. For each of the plurality of target data packets, a parsing unit corresponding to the target data packet is determined, and that parsing unit is invoked to parse the target data packet in parallel to obtain a parsing result corresponding to the target data packet, where each parsing result includes the sequence number identifier representing the order in which the target data packet was obtained. The plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers. Because the time-consuming parsing process is handled by invoking a plurality of parsing units to parse, in parallel, the target data packets that were read sequentially, the capability of a multi-core hardware processor can be fully utilized and the data processing speed is improved. The technical problem that data synchronization and subscription are easily delayed when the related technology is adopted is therefore at least partially solved. In addition, because the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers, sequential output of the parsing results is achieved.
Meanwhile, the original data packets corresponding to the incremental log are obtained only when the number of data packets currently being processed is less than the data packet number threshold, so the number of data packets being processed in the whole processing procedure cannot exceed the data packet number threshold, and memory overflow or even service downtime is avoided.
FIG. 5 schematically shows a block diagram of an incremental log processing apparatus according to an embodiment of the present disclosure.
As shown in fig. 5, incremental log processing apparatus 500 includes an obtaining module 510, a determining module 520, a calling module 530, and an outputting module 540.
The obtaining module 510, the determining module 520, the calling module 530, and the outputting module 540 are communicatively coupled.
An obtaining module 510, configured to obtain a plurality of target data packets corresponding to the incremental log, where each target data packet includes a sequence number identifier, and each sequence number identifier is used to represent a sequence in which the target data packets are obtained.
A determining module 520, configured to determine, for each target data packet of the multiple target data packets, a parsing unit corresponding to the target data packet.
The invoking module 530 is configured to invoke an analysis unit corresponding to the target data packet to concurrently analyze the target data packet, so as to obtain an analysis result corresponding to the target data packet, where each analysis result includes a sequence number identifier used for representing a sequence of obtaining the target data packet.
An output module 540, configured to output the plurality of parsing results in the output order determined by the plurality of sequence number identifiers.
According to the technical solution of the embodiment of the disclosure, a plurality of target data packets corresponding to an incremental log are obtained, where each target data packet includes a sequence number identifier and each sequence number identifier represents the order in which the target data packet was obtained. For each of the plurality of target data packets, a parsing unit corresponding to the target data packet is determined, and that parsing unit is invoked to parse the target data packet in parallel to obtain a parsing result corresponding to the target data packet, where each parsing result includes the sequence number identifier representing the order in which the target data packet was obtained. The plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers. Because the time-consuming parsing process is handled by invoking a plurality of parsing units to parse, in parallel, the target data packets that were read sequentially, the capability of a multi-core hardware processor can be fully utilized and the data processing speed is improved. The technical problem that data synchronization and subscription are easily delayed when the related technology is adopted is therefore at least partially solved. In addition, because the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers, sequential output of the parsing results is achieved.
Optionally, the obtaining module 510 may include a first obtaining sub-module and a setting sub-module.
The first obtaining submodule is configured to obtain a plurality of original data packets corresponding to the incremental log.
The setting submodule is configured to set a sequence number identifier for each original data packet according to the order in which each of the plurality of original data packets is obtained, so as to obtain a target data packet corresponding to each original data packet.
Optionally, on the basis of the above technical solution, the determining module 520 may include a first determining submodule.
The first determining submodule is configured to determine the parsing unit corresponding to the target data packet based on a balanced allocation policy.
Optionally, on the basis of the above technical solution, each parsing unit includes a buffer queue, and the buffer queue is used for buffering the target packet.
The first determination submodule may include a first determination unit.
The first determining unit is used for determining the analyzing unit corresponding to the target data packet according to the number of the buffered data packets corresponding to each buffer queue in the plurality of buffer queues.
Optionally, on the basis of the foregoing technical solution, each parsing unit has a corresponding unit number.
The first determination submodule may include a second determination unit, a third determination unit, and a fourth determination unit.
The second determining unit is configured to determine a remainder result obtained by dividing the sequence number identifier included in the target data packet by the number of parsing units.
The third determining unit is configured to determine the unit number consistent with the remainder result as the target unit number.
The fourth determining unit is configured to determine the parsing unit corresponding to the target unit number as the parsing unit corresponding to the target data packet.
Optionally, on the basis of the above technical solution, each parsing unit includes a parsing thread and a cache queue corresponding to the parsing thread.
The incremental log processing apparatus 500 further includes a first adding module.
The first adding module is configured to add the target data packet to the buffer queue in the parsing unit corresponding to the target data packet.
The calling module 530 may include a second acquisition sub-module and a parsing sub-module.
The second obtaining submodule is configured so that the parsing thread corresponding to the target data packet obtains the target data packet from the buffer queue corresponding to the parsing thread.
The parsing submodule is configured so that the parsing thread parses the target data packet in parallel to obtain the parsing result corresponding to the target data packet.
Optionally, on the basis of the above technical solution, the parsing submodule may include a parsing unit.
The parsing unit is configured so that the parsing thread calls the parsing interface to parse the target data packet in parallel to obtain the parsing result corresponding to the target data packet.
Optionally, on the basis of the foregoing technical solution, the incremental log processing apparatus 500 may further include a second adding module.
The second adding module is configured to add the parsing result to the priority queue.
The output module 540 may include an output sub-module.
The output submodule is configured to call the sorting interface to process the priority queue, so that the plurality of parsing results are output in the output order determined by the plurality of sequence number identifiers.
Optionally, on the basis of the above technical solution, the output sub-module may include a fifth determining unit, a first output unit, a first invoking unit, a second invoking unit, and a second output unit.
The fifth determining unit is configured to determine an initial sequence number identifier from the plurality of sequence number identifiers.
The first output unit is configured to output the parsing result corresponding to the initial sequence number identifier and to call the parsing result removal method to delete that parsing result from the priority queue.
The first calling unit is configured to call the parsing result obtaining method to obtain the parsing result located at the head of the priority queue.
The second calling unit is configured to continue calling the parsing result obtaining method, in a case where it is determined that the sequence number identifier corresponding to the parsing result at the head of the priority queue is not consecutive with the previous sequence number identifier, until the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier, where the previous sequence number identifier is the sequence number identifier corresponding to the target parsing result, and the target parsing result is the parsing result most recently deleted from the priority queue.
The second output unit is configured to output the parsing result at the head of the priority queue, in a case where it is determined that the sequence number identifier corresponding to that parsing result is consecutive with the previous sequence number identifier, and to call the parsing result removal method to delete it from the priority queue.
Optionally, on the basis of the above technical solution, the second invoking unit may include an invoking subunit.
The calling subunit is configured to, if the number of times the parsing result obtaining method has been called is equal to the count threshold, continue calling the parsing result obtaining method after the preset time period has elapsed, until the sequence number identifier corresponding to the parsing result at the head of the priority queue is consecutive with the previous sequence number identifier.
Optionally, on the basis of the above technical solution, the obtaining module 510 may include a third obtaining sub-module and a pause sub-module.
The third obtaining submodule is configured to obtain a plurality of target data packets corresponding to the incremental log when the number of data packets currently being processed is determined to be less than a data packet number threshold.
The pause submodule is configured to pause obtaining the plurality of target data packets corresponding to the incremental log when the number of data packets currently being processed is determined to be greater than or equal to the data packet number threshold, until the number of data packets currently being processed falls below the data packet number threshold.
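The pause-and-resume rule of the third obtaining submodule and the pause submodule amounts to bounding the number of in-flight data packets. The following Python sketch shows one way this could be done with a condition variable; the class name `InFlightLimiter` and its methods are hypothetical, and gating each packet individually (rather than a batch of packets) is a simplification chosen for the example.

```python
import threading


class InFlightLimiter:
    """Blocks acquisition while too many packets are being processed (sketch)."""

    def __init__(self, max_in_flight):
        if max_in_flight < 1:
            raise ValueError("max_in_flight must be at least 1")
        self._max = max_in_flight          # the data packet number threshold
        self._count = 0                    # packets currently being processed
        self._cond = threading.Condition()

    def acquire(self):
        """Wait until the in-flight count is below the threshold, then claim a slot."""
        with self._cond:
            while self._count >= self._max:
                self._cond.wait()          # acquisition is paused here
            self._count += 1

    def release(self):
        """Call once a packet's result has been output."""
        with self._cond:
            self._count -= 1
            self._cond.notify()            # let a paused acquirer resume

    @property
    def in_flight(self):
        with self._cond:
            return self._count
```

A reader thread would call `acquire()` before fetching each packet and the output stage would call `release()` after emitting the corresponding result, so that slow analysis or output naturally throttles acquisition.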
Any number of modules, sub-modules, units, sub-units, or at least part of the functionality of any number thereof according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, sub-modules, units, and sub-units according to the embodiments of the present disclosure may be implemented at least partially as a hardware Circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a Circuit, or implemented by any one of three implementations of software, hardware, and firmware, or any suitable combination of any of them. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the disclosure may be at least partially implemented as a computer program module, which when executed may perform the corresponding functions.
For example, any plurality of the obtaining module 510, the determining module 520, the calling module 530 and the outputting module 540 may be combined and implemented in one module/unit/sub-unit, or any one of the modules/units/sub-units may be split into a plurality of modules/units/sub-units. Alternatively, at least part of the functionality of one or more of these modules/units/sub-units may be combined with at least part of the functionality of other modules/units/sub-units and implemented in one module/unit/sub-unit. According to an embodiment of the present disclosure, at least one of the obtaining module 510, the determining module 520, the calling module 530 and the outputting module 540 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware and firmware, or any suitable combination of any of the three. Alternatively, at least one of the obtaining module 510, the determining module 520, the calling module 530 and the outputting module 540 may be at least partially implemented as a computer program module, which when executed may perform a corresponding function.
It should be noted that the incremental log processing apparatus in the embodiments of the present disclosure corresponds to the incremental log processing method in the embodiments of the present disclosure. For details of the apparatus, refer to the description of the method, which is not repeated here.
FIG. 6 schematically shows a block diagram of a computer system suitable for implementing the above-described method according to an embodiment of the present disclosure. The computer system illustrated in FIG. 6 is only one example and should not impose any limitations on the scope of use or functionality of embodiments of the disclosure.
As shown in FIG. 6, a computer system 600 according to an embodiment of the present disclosure includes a processor 601, which can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 601 may also include onboard memory for caching purposes. The processor 601 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the disclosure.
In the RAM 603, various programs and data necessary for the operation of the system 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. The processor 601 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or RAM 603. It is to be noted that the programs may also be stored in one or more memories other than the ROM 602 and RAM 603. The processor 601 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the system 600 may also include an input/output (I/O) interface 605, which is also connected to the bus 604. The system 600 may also include one or more of the following components connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program, when executed by the processor 601, performs the above-described functions defined in the system of the embodiments of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to an embodiment of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples may include, but are not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the preceding. In the present disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 602 and/or RAM 603 described above and/or one or more memories other than the ROM 602 and RAM 603.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (14)

1. An incremental log processing method, comprising:
acquiring a plurality of target data packets corresponding to an incremental log, wherein each target data packet comprises a sequence number identifier, and each sequence number identifier represents the order in which the target data packet was acquired;
determining, for each of the plurality of target data packets, an analysis unit corresponding to the target data packet;
calling the analysis unit corresponding to the target data packet to analyze the target data packet in parallel to obtain an analysis result corresponding to the target data packet, wherein each analysis result comprises the sequence number identifier representing the order in which the target data packet was acquired; and
outputting a plurality of analysis results in an output order determined by the plurality of sequence number identifiers.
2. The method of claim 1, wherein the acquiring a plurality of target data packets corresponding to an incremental log comprises:
acquiring a plurality of original data packets corresponding to the incremental log; and
setting a sequence number identifier for each original data packet according to the order in which each of the plurality of original data packets was acquired, to obtain a target data packet corresponding to each original data packet.
3. The method of claim 1, wherein the determining an analysis unit corresponding to the target data packet comprises:
determining the analysis unit corresponding to the target data packet based on a balanced distribution policy.
4. The method of claim 3, wherein each analysis unit comprises a cache queue for caching target data packets;
the determining the analysis unit corresponding to the target data packet based on a balanced distribution policy comprises:
determining the analysis unit corresponding to the target data packet according to the number of cached data packets in each of the plurality of cache queues.
5. The method of claim 3, wherein each analysis unit has a corresponding unit number;
the determining the analysis unit corresponding to the target data packet based on a balanced distribution policy comprises:
determining a remainder obtained by dividing the sequence number identifier included in the target data packet by the number of analysis units;
determining the unit number equal to the remainder as a target unit number; and
determining the analysis unit corresponding to the target unit number as the analysis unit corresponding to the target data packet.
6. The method of claim 1, wherein each analysis unit comprises an analysis thread and a cache queue corresponding to the analysis thread;
after the determining an analysis unit corresponding to the target data packet, the method further comprises:
adding the target data packet to the cache queue in the analysis unit corresponding to the target data packet;
the calling the analysis unit corresponding to the target data packet to analyze the target data packet in parallel to obtain an analysis result corresponding to the target data packet comprises:
obtaining, by the analysis thread corresponding to the target data packet, the target data packet from the cache queue corresponding to the analysis thread; and
analyzing, by the analysis thread, the target data packet in parallel to obtain the analysis result corresponding to the target data packet.
7. The method of claim 6, wherein the analyzing, by the analysis thread, the target data packet in parallel to obtain the analysis result corresponding to the target data packet comprises:
calling, by the analysis thread, an analysis interface to analyze the target data packet in parallel to obtain the analysis result corresponding to the target data packet.
8. The method of claim 1, wherein after the calling the analysis unit corresponding to the target data packet to analyze the target data packet in parallel to obtain an analysis result corresponding to the target data packet, the method further comprises:
adding the analysis result to a priority queue;
the outputting a plurality of analysis results in an output order determined by the plurality of sequence number identifiers comprises:
calling a sorting interface to process the priority queue, so that the plurality of analysis results are output in the output order determined by the plurality of sequence number identifiers.
9. The method of claim 8, wherein the calling a sorting interface to process the priority queue, so that the plurality of analysis results are output in the output order determined by the plurality of sequence number identifiers, comprises:
determining an initial sequence number identifier from the plurality of sequence number identifiers;
outputting the analysis result corresponding to the initial sequence number identifier, and calling an analysis result removing method to delete the analysis result corresponding to the initial sequence number identifier from the priority queue;
calling an analysis result acquisition method to acquire the analysis result at the head of the priority queue, wherein a target analysis result is the analysis result at the head of the priority queue;
when the sequence number identifier corresponding to the analysis result at the head of the priority queue is determined not to be consecutive with a previous sequence number identifier, continuing to call the analysis result acquisition method until the sequence number identifier corresponding to the target analysis result is consecutive with the previous sequence number identifier, wherein the previous sequence number identifier is the sequence number identifier corresponding to the analysis result most recently deleted from the priority queue; and
when the sequence number identifier corresponding to the analysis result at the head of the priority queue is determined to be consecutive with the previous sequence number identifier, outputting the analysis result at the head of the priority queue, and calling the analysis result removing method to delete the analysis result at the head of the priority queue from the priority queue.
10. The method of claim 9, wherein the continuing to call the analysis result acquisition method until the sequence number identifier corresponding to the target analysis result is consecutive with the previous sequence number identifier comprises:
if the number of times the analysis result acquisition method has been called equals a count threshold, waiting for a preset duration and then continuing to call the analysis result acquisition method until the sequence number identifier corresponding to the target analysis result is consecutive with the previous sequence number identifier.
11. The method of claim 1, wherein the acquiring a plurality of target data packets corresponding to an incremental log comprises:
when the number of data packets currently being processed is determined to be less than a data packet number threshold, acquiring the plurality of target data packets corresponding to the incremental log; and
when the number of data packets currently being processed is determined to be greater than or equal to the data packet number threshold, suspending the acquisition of the plurality of target data packets corresponding to the incremental log until the number of data packets currently being processed is less than the data packet number threshold.
12. An incremental log processing apparatus, comprising:
an acquisition module, configured to acquire a plurality of target data packets corresponding to an incremental log, wherein each target data packet comprises a sequence number identifier, and each sequence number identifier represents the order in which the target data packet was acquired;
a determining module, configured to determine, for each of the plurality of target data packets, an analysis unit corresponding to the target data packet;
a calling module, configured to call the analysis unit corresponding to the target data packet to analyze the target data packet in parallel to obtain an analysis result corresponding to the target data packet, wherein each analysis result comprises the sequence number identifier representing the order in which the target data packet was acquired; and
an output module, configured to output a plurality of analysis results in an output order determined by the plurality of sequence number identifiers.
13. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-11.
14. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 11.
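As an illustration of the balanced distribution policies recited in claims 4 and 5, the following Python sketch dispatches sequence-numbered packets to per-unit cache queues. It is only a sketch under stated assumptions: the function names `pick_by_remainder`, `pick_shortest_queue` and `dispatch` are hypothetical, analysis units are assumed to be numbered from 0, and the queue-length policy of claim 4 is read here as "choose the unit whose cache queue holds the fewest packets", which the claim itself does not spell out.

```python
from collections import deque


def pick_by_remainder(sequence_number, unit_count):
    """Return the unit number equal to sequence_number mod unit_count.

    Assumes analysis units are numbered 0 .. unit_count - 1.
    """
    if unit_count < 1:
        raise ValueError("unit_count must be at least 1")
    return sequence_number % unit_count


def pick_shortest_queue(cache_queues):
    """Return the index of the cache queue holding the fewest packets.

    Assumption: "balanced" is read as "send to the least-loaded unit";
    ties go to the lowest index.
    """
    if not cache_queues:
        raise ValueError("at least one cache queue is required")
    return min(range(len(cache_queues)), key=lambda i: len(cache_queues[i]))


def dispatch(packets, unit_count, policy="remainder"):
    """Place (sequence_number, payload) packets into per-unit cache queues."""
    cache_queues = [deque() for _ in range(unit_count)]
    for sequence_number, payload in packets:
        if policy == "remainder":
            unit = pick_by_remainder(sequence_number, unit_count)
        else:
            unit = pick_shortest_queue(cache_queues)
        cache_queues[unit].append((sequence_number, payload))
    return cache_queues
```

The remainder policy needs no shared state beyond the sequence number, whereas the queue-length policy adapts to units that drain their queues at different speeds; in either case the sequence number travels with the packet so that results can be reordered on output.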
CN202011152022.1A 2020-10-23 2020-10-23 Incremental log processing method and device, electronic equipment and storage medium Pending CN113760885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011152022.1A CN113760885A (en) 2020-10-23 2020-10-23 Incremental log processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113760885A true CN113760885A (en) 2021-12-07

Family

ID=78785918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011152022.1A Pending CN113760885A (en) 2020-10-23 2020-10-23 Incremental log processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113760885A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102222071A (en) * 2010-04-16 2011-10-19 华为技术有限公司 Method, device and system for data synchronous processing
CN107807845A (en) * 2017-10-16 2018-03-16 昆仑智汇数据科技(北京)有限公司 A kind of incremented data parallel processing apparatus and method
CN107918621A (en) * 2016-10-10 2018-04-17 阿里巴巴集团控股有限公司 Daily record data processing method, device and operation system
CN109522316A (en) * 2018-11-02 2019-03-26 东软集团股份有限公司 Log processing method, device, equipment and storage medium
CN110175209A (en) * 2019-04-12 2019-08-27 中国人民财产保险股份有限公司 Incremental data synchronization method, system, equipment and storage medium
CN111259121A (en) * 2020-01-09 2020-06-09 深圳前海微众银行股份有限公司 Log processing method, device, equipment and computer readable storage medium
CN111290881A (en) * 2020-01-21 2020-06-16 上海达梦数据库有限公司 Data recovery method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US10908954B2 (en) Quality of service classes
US9497288B2 (en) Subscriber based priority of messages in a publisher-subscriber domain
CN113132489A (en) Method, device, computing equipment and medium for downloading file
CN111221638B (en) Concurrent task scheduling processing method, device, equipment and medium
CN107832143B (en) Method and device for processing physical machine resources
CN110928905B (en) Data processing method and device
CN110430142B (en) Method and device for controlling flow
CN112395067A (en) Task scheduling method, system, device and medium
CN111753065A (en) Request response method, system, computer system and readable storage medium
CN110851276A (en) Service request processing method, device, server and storage medium
CN110008187B (en) File transmission scheduling method, device, equipment and computer readable storage medium
CN115525400A (en) Method, apparatus and program product for managing multiple computing tasks on a batch basis
CN110825342B (en) Memory scheduling device and system, method and apparatus for processing information
CN113760885A (en) Incremental log processing method and device, electronic equipment and storage medium
CN115269063A (en) Process creation method, system, device and medium
US11307974B2 (en) Horizontally scalable distributed system for automated firmware testing and method thereof
CN114661415A (en) Scheduling method and computer system
CN109062706B (en) Electronic device, method for limiting inter-process communication thereof and storage medium
US10979359B1 (en) Polling resource management system
CN110764710A (en) Data access method and storage system of low-delay and high-IOPS
CN113407331A (en) Task processing method and device and storage medium
CN117395210B (en) Information transmission control method, equipment and storage medium based on rich media
CN111858002B (en) Concurrent processing method, system and device based on asynchronous IO
CN113259261B (en) Network flow control method and electronic equipment
CN115630033A (en) Log information processing method and device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination