CN115840633A - Log parallel processing method, system, storage medium and equipment - Google Patents

Info

Publication number
CN115840633A
Authority
CN
China
Prior art keywords
log
thread
identification number
group
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310140146.5A
Other languages
Chinese (zh)
Other versions
CN115840633B (en)
Inventor
王竹峰
谢超
许子文
强昌金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jishu Yunzhou Technology Co ltd
Original Assignee
Beijing Jishu Yunzhou Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jishu Yunzhou Technology Co ltd
Priority to CN202310140146.5A
Publication of CN115840633A
Application granted
Publication of CN115840633B
Legal status: Active
Anticipated expiration

Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a log parallel processing method, system, storage medium and electronic device. A log group generated by the current write node physical transaction is acquired, wherein the log group is composed of a plurality of serial log pages, each log page corresponds to a specific identification number, and log content is recorded in each log page. A thread number and the identification numbers are then acquired, each log page sequentially generated in time order is distributed according to its identification number to the target thread corresponding to the matched thread number, and an end identifier is generated at the tail of each target thread. Finally, the log content in each target thread is executed, and the current write node physical transaction is committed once all target threads have executed up to their corresponding end identifiers, which improves data processing speed and efficiency.

Description

Log parallel processing method, system, storage medium and equipment
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a log parallel processing method, system, storage medium and device.
Background
Data processing is a basic link of system engineering and automatic control, and runs through every field of social production and social life. The development of data processing technology, and the breadth and depth of its application, have greatly influenced the progress of human society.
In data processing, physical replication is an important form of master-slave replication. It relies mainly on the log information, namely log groups, that the database kernel generates when data is written, and the log groups are applied strictly in order: when one transaction ends it is committed, and when the next transaction is encountered it is started automatically.
Disclosure of Invention
Based on this, embodiments of the present invention provide a log parallel processing method, system, storage medium, and device, which aim to solve the problem in the prior art that master-slave physical replication usually uses a single replication thread, resulting in slow data processing and low efficiency.
A first aspect of an embodiment of the present invention provides a log parallel processing method, where the method includes:
acquiring a log group generated by a physical transaction of a current write node, wherein the log group is composed of a plurality of log pages which are in series, each log page corresponds to a specific identification number, and log content is recorded in each log page;
acquiring a thread number and the identification number, distributing the log pages sequentially generated according to a time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an ending identification at the tail of each target thread;
and executing the log content in each target thread, and committing the physical transaction of the current write node once all the target threads have executed up to their corresponding end identifiers.
Further, the step of acquiring the thread number and the identification number, distributing each log page sequentially generated according to the time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an end identification at the end of each target thread includes:
acquiring thread numbers and each identification number, and performing hash processing on each identification number and each thread number to obtain the thread numbers corresponding to each thread;
establishing a mapping model of the thread number and the identification number, wherein the mapping model is used for outputting the corresponding thread number when the identification number is input;
inputting the identification number into the mapping model to obtain the corresponding thread number, and distributing each log page to the corresponding target thread according to the thread number;
and generating an end identifier at the end of each target thread, wherein the end identifier is the END flag of an SQL statement.
Further, the step of acquiring the thread number and the identification number, distributing each log page sequentially generated according to the time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an end identification at the end of each target thread further includes:
acquiring the log group, and generating a target log group containing a target log page according to the log group;
acquiring thread numbers and all identification numbers in the target log group, and performing hash processing on all identification numbers in the target log group and the thread numbers to obtain the thread numbers corresponding to all threads;
establishing a mapping model of the thread number and each identification number in the target log group, wherein the mapping model is used for outputting the corresponding thread number when the identification number is input;
inputting the identification number into the mapping model to obtain the corresponding thread number, and distributing each log page to the corresponding target thread according to the thread number;
and at least one target log page in the target log group is distributed at the tail of each target thread, and the target log page is the end identifier of the corresponding target thread.
Further, the step of obtaining the log group and generating a target log group containing a target log page according to the log group includes:
sequentially acquiring first identification numbers corresponding to all the log pages in the log group to obtain a first identification number group;
copying the first identification number group, and performing duplication removal processing on the first identification numbers in the first identification number group to obtain a second identification number group;
adding the second identification number group to the first identification number group to form a third identification number group;
in the third identification number group, giving the corresponding log content to the log pages contained in the first identification number group;
and giving a preset type to the log pages contained in the second identification number group in the third identification number group, so as to finally obtain the target log group, wherein the preset type serves as the end identifier, and the log pages contained in the second identification number group are the target log pages.
Further, the preset type is the MLOG_COMMIT_MTR type.
Further, the step of executing the log content in each target thread and committing the physical transaction of the current write node once all target threads have executed up to their corresponding end identifiers includes:
sequentially acquiring the type of each log page in each target thread, and judging whether the type is the preset type;
if so, committing the write node sub-physical transaction matched with the corresponding target thread, wherein the write node sub-physical transactions together form the write node physical transaction.
Further, the step of copying the first identification number group and performing deduplication processing on the first identification number in the first identification number group to obtain a second identification number group includes:
acquiring the first identification number at the first position of the first identification number group, and putting it into a preset number group;
sequentially judging whether each first identification number other than the one at the first position in the first identification number group is the same as an identification number already in the preset number group;
and if not, adding the corresponding first identification number into the preset number group to obtain the second identification number group.
A second aspect of an embodiment of the present invention provides a log parallel processing system, where the system includes:
a log group acquisition module, configured to acquire a log group generated by a physical transaction of a current write node, wherein the log group is composed of a plurality of serial log pages, each log page corresponds to a specific identification number, and log content is recorded in each log page;
the distribution module is used for acquiring a thread number and the identification number, distributing each log page sequentially generated according to a time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an end identification at the tail of each target thread;
and the commit module is used for executing the log content in each target thread, and committing the physical transaction of the current write node once all the target threads have executed up to their corresponding end identifiers.
A third aspect of an embodiment of the present invention provides a computer-readable storage medium, including:
the readable storage medium stores one or more programs which, when executed by a processor, implement the log parallel processing method of the first aspect.
A fourth aspect of an embodiment of the present invention provides an electronic device, which includes a memory and a processor, wherein:
the memory is used for storing computer programs;
the processor is configured to implement the log parallel processing method of the first aspect when executing the computer program stored in the memory.
The invention provides a log parallel processing method, system, storage medium and device. A log group generated by a current write node physical transaction is acquired, wherein the log group is composed of a plurality of serial log pages, each log page corresponds to a specific identification number, and log content is recorded in each log page. A thread number and the identification numbers are then acquired, each log page sequentially generated in time order is distributed according to its identification number to the target thread corresponding to the matched thread number, and an end identifier is generated at the tail of each target thread, which completes the parallel distribution of the serial log pages. Finally, the log content in each target thread is executed, and the current write node physical transaction is committed once all target threads have executed up to their corresponding end identifiers, which improves data processing speed and efficiency.
Drawings
FIG. 1 is a flowchart illustrating an implementation of a log parallel processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of log distribution provided in the first embodiment of the present invention;
FIG. 3 is a schematic diagram of parallel processing of logs according to a second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a log parallel processing system according to a third embodiment of the present invention;
FIG. 5 is a block diagram of an electronic device according to a fourth embodiment of the present invention.
The following detailed description will be further described in conjunction with the above-identified drawing figures.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Several embodiments of the invention are presented in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like as used herein are for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
In data processing, master-slave replication can be synchronized in two ways. The first is logical replication, which is currently the mainstream approach and the officially supported one. It presupposes that the data of the master and slave nodes are physically separate copies, so to keep them consistent the same SQL statements must be executed independently on each node through logical replication. To support shared storage between master and slave nodes, a different replication method is needed, which, in contrast to logical replication, is called physical replication. Physical replication relies mainly on the log information generated by the database kernel when data is written. Two kinds of log are usually generated at write time: the Binlog and the Redo log. The Binlog is used for the logical replication mentioned above, because it generally consists of SQL that can be executed independently on each node. The Redo log is used for physical replication, because its records describe writes to the underlying log pages: a Redo log record can be understood as a write of a specified length to a log page, which is why it is called a physical log, and applying such records is called replaying the log.
In addition, Redo log records are generated in order and then stored in the log file in order. Because the Redo log is always appended at the end, its content is also logically sequential. It consists of log records, each made up of a number of bytes and defined by the database kernel, and relationships may exist between different log records. A run of consecutive log records forms a log group, that is, the log generated by one physical transaction and written to the file contiguously. A log group contains log records of several different types. The beginning of a log group marks the start of the current physical transaction, and the end of the log group marks its commit; the commit operation is mandatory.
From the above it follows that the prior art applies logs in a single thread, which results in slow data processing and low efficiency.
Example one
Referring to FIG. 1, a log parallel processing method according to an embodiment of the present invention is shown; the method specifically includes steps S01 to S03.
Step S01, a log group generated by the physical transaction of the current write node is obtained, wherein the log group is composed of a plurality of serial log pages, each log page corresponds to a specific identification number, and log content is recorded in each log page.
It should be noted that the log group contains the logs generated by all modifications that the write node made to different log pages during the physical transaction. Each log record corresponds to one log page, and each log page corresponds to one identification number (Page id), so the identification numbers in a log group may repeat or differ. For log pages with the same identification number, the log content of a later record builds on the log content of the earlier record: on a given log page, later log content can be applied only after the earlier log content has been applied. Log pages with different identification numbers in the log group are independent of each other.
It follows that when the log content of one log group is distributed across multiple threads, log pages with the same identification number must not be distributed to different threads. Threads run at different speeds, so if such pages were split across threads, the original order could not be guaranteed when their log content is applied. Only when all log pages with the same identification number in the log group are distributed to the same thread can their log content be applied in order, which in turn ensures the most basic requirement: that the data on the slave node is correct after the log has been applied in parallel.
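The constraint above can be checked mechanically. The following is a minimal sketch, not part of the patent: the function name `pages_split_across_threads` and the queue layout (a dict from thread number to a list of `(page_id, content)` pairs) are assumptions made for illustration. It reports every identification number that appears in more than one thread queue, which is exactly the situation the text rules out.

```python
def pages_split_across_threads(thread_queues):
    """Return the set of page ids that appear in more than one thread queue.

    thread_queues: {thread_no: [(page_id, content), ...]}  (hypothetical layout)
    """
    owner = {}      # page_id -> first thread seen holding that page
    split = set()   # page ids found on two or more threads
    for thread_no, queue in thread_queues.items():
        for page_id, _content in queue:
            if owner.setdefault(page_id, thread_no) != thread_no:
                split.add(page_id)
    return split


# Valid: every record for page 1 sits on thread 1.
valid = {1: [(1, "a"), (1, "c")], 2: [(5, "b")]}
# Invalid: page 1 is spread over threads 1 and 2, so ordering is not guaranteed.
invalid = {1: [(1, "a")], 2: [(1, "c"), (5, "b")]}
```

An empty result means the distribution respects the per-page ordering requirement; a non-empty result names the pages whose replay order could be violated.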
Step S02, acquiring a thread number and the identification number, distributing each log page sequentially generated according to the time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an end identifier at the tail of each target thread.
In this embodiment, FIG. 2 is a schematic diagram of log distribution provided by the first embodiment of the present invention. The identification number of each log page is hashed against the number of threads to obtain a mapping. Specifically, the log pages have identification numbers 1, 5, 7, 6 and 3, and the number of threads is 4. After hashing, as the figure shows, identification numbers 1 and 5 map to thread 1, identification number 6 maps to thread 2, identification numbers 3 and 7 map to thread 3, and thread 4 has no corresponding identification number and is an empty thread.
It should be noted, regarding how transaction start and commit are controlled, that when the master library generates the log, a log group carries no flag at its start position to mark the start of the transaction. There is usually only one END flag at the end position, namely the END flag of the SQL statement, which, unlike the other log types in the log group, has no relationship with any log page.
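The text does not say which hash function is used. As a sketch under that caveat, a plain modulo hash over the number of threads happens to reproduce the mapping described for FIG. 2, assuming that a remainder of 0 is assigned to the last thread. The names `thread_for_page` and `distribute` are hypothetical.

```python
THREAD_COUNT = 4  # number of threads in the FIG. 2 example


def thread_for_page(page_id, thread_count=THREAD_COUNT):
    """Map a page id to a 1-based thread number.

    Assumption: modulo hashing, with remainder 0 assigned to the last thread.
    """
    remainder = page_id % thread_count
    return remainder if remainder != 0 else thread_count


def distribute(page_ids, thread_count=THREAD_COUNT):
    """Append each page id, in generation order, to its thread's queue."""
    queues = {n: [] for n in range(1, thread_count + 1)}
    for page_id in page_ids:
        queues[thread_for_page(page_id, thread_count)].append(page_id)
    return queues


queues = distribute([1, 5, 7, 6, 3])
```

Because the thread is a pure function of the identification number, all log pages with the same identification number necessarily land on the same thread, and their relative order within that thread's queue is the order in which they were generated.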
Step S03, executing the log content in each target thread, and committing the physical transaction of the current write node once all target threads have executed up to their corresponding end identifiers.
In each thread, after the preceding log content has been applied, if the type of the next log to be applied is MLOG_COMMIT_MTR, the thread is deemed to need a commit and performs a transaction commit. As for starting a transaction: once the previous transaction has been committed, if the physical transaction used by the thread has not yet been started when a log is applied, it is started automatically.
To sum up, in the log parallel processing method of this embodiment, a log group generated by a current write node physical transaction is acquired, wherein the log group is composed of a plurality of serial log pages, each log page corresponds to a specific identification number, and log content is recorded in each log page. A thread number and the identification numbers are then acquired, each log page sequentially generated in time order is distributed according to its identification number to the target thread corresponding to the matched thread number, and an end identifier is generated at the tail of each target thread, which completes the parallel distribution of the serial log pages. Finally, the log content in each target thread is executed, and the current write node physical transaction is committed once all target threads have executed up to their corresponding end identifiers, which improves data processing speed and efficiency.
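The per-thread behaviour described here, commit on the end marker and implicit start on the next record, can be sketched as a small state machine. This is an illustration only: the class `WorkerTransaction`, the record layout `(record_type, page_id, content)` and the placeholder type `"CONTENT"` are assumptions, and only the marker name MLOG_COMMIT_MTR comes from the text.

```python
COMMIT_TYPE = "MLOG_COMMIT_MTR"  # end-marker type named in the text


class WorkerTransaction:
    """Hypothetical per-thread physical transaction state."""

    def __init__(self):
        self.active = False   # True while a transaction is open
        self.applied = []     # (page_id, content) applied so far
        self.commits = 0      # number of commits performed

    def apply(self, record):
        record_type, page_id, content = record
        if not self.active:
            # No explicit "begin" record exists, so start automatically.
            self.active = True
        if record_type == COMMIT_TYPE:
            # The end marker carries no content; it only triggers the commit.
            self.commits += 1
            self.active = False
        else:
            self.applied.append((page_id, content))


txn = WorkerTransaction()
for record in [("CONTENT", 1, "a"), ("CONTENT", 5, "b"), (COMMIT_TYPE, 1, None)]:
    txn.apply(record)
```

After the loop the transaction has applied two content records, committed once, and is closed again, so the next record seen by this thread would open a new transaction without any explicit start marker.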
Example two
The log parallel processing method provided by the second embodiment of the present invention differs from that of the first embodiment in the end identifier used.
Specifically, distributing an END flag to the tail of each target thread does solve the problem of committing the transaction logically, but at too high a cost. There may be many threads while the number of log pages in a group is small. If, at the end of a log group, the group's END flag is distributed to every thread, distribution efficiency is low and most END flags accomplish nothing, because the log pages of a group are likely to have reached only a few threads rather than all of them. The correct approach is to distribute the END flag only to those few threads.
To address this, one could record which threads each log group involves and, once the group is complete, distribute the END flag only to those threads. This is a slight improvement, but the distribution thread itself is single-threaded, and maintaining the thread information and then distributing END flags to each of those threads in one pass greatly reduces its efficiency.
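A small count makes the cost argument concrete. The sketch below is illustrative only: the thread count of 16 and the modulo hash are assumptions, and the page identification numbers are borrowed from the example used for the second embodiment. It compares how many END flags are sent when every thread receives one with how many threads the group actually touched.

```python
def involved_threads(page_ids, thread_count):
    """Set of thread slots that receive at least one log page (modulo hash assumed)."""
    return {page_id % thread_count for page_id in page_ids}


page_ids = [1, 5, 7, 1, 6, 5, 3, 5]   # identification numbers of one log group
thread_count = 16                      # hypothetical pool size

broadcast_flags = thread_count                                      # one END flag per thread
targeted_flags = len(involved_threads(page_ids, thread_count))      # only threads in use
```

Under these assumptions, broadcasting sends 16 flags while only 5 threads hold any work, so 11 flags have no effect. Computing the targeted set, however, is itself extra bookkeeping on the single distribution thread, which is the drawback the text goes on to address.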
Further, to better solve the problems of distribution efficiency and transaction commit, the log group is acquired and a target log group containing target log pages is generated from it. First, the first identification numbers corresponding to the log pages in the log group are acquired in sequence to obtain a first identification number group. For example, FIG. 3 is a schematic diagram of parallel processing of logs provided by the second embodiment of the present invention, in which the first identification number group is 1, 5, 7, 1, 6, 5, 3, 5. The numbers 1, 5, 7 and so on in this group are first identification numbers, and the log page corresponding to each first identification number contains log content.
Next, the first identification number group is copied and its first identification numbers are deduplicated to obtain a second identification number group. Specifically, the first identification number at the first position of the first identification number group, namely 1, is placed into a preset number group, so that the preset number group has the single member 1. The remaining first identification numbers are then compared in sequence with the identification numbers already in the preset number group, and any number not yet present is added. Thus 5 is compared with 1; since they differ, 5 is added, and the preset number group now holds 1 and 5. Then 7 is compared with 1 and 5; since it differs from both, 7 is added. Continuing in this way yields the preset number group 1, 5, 7, 6, 3, which is the final second identification number group.
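The deduplication walk-through above amounts to an order-preserving removal of duplicates. A minimal sketch follows; the function name `dedupe_preserving_order` is hypothetical, and the linear membership test mirrors the comparison against the preset number group described in the text rather than aiming for the fastest implementation.

```python
def dedupe_preserving_order(first_ids):
    """Build the second identification number group from the first.

    Each number is compared with those already kept and appended only if
    it is not present yet, so first-occurrence order is preserved.
    """
    preset = []
    for page_id in first_ids:
        if page_id not in preset:
            preset.append(page_id)
    return preset


first_ids = [1, 5, 7, 1, 6, 5, 3, 5]
second_ids = dedupe_preserving_order(first_ids)
```

For a large log group, tracking the kept numbers in a set alongside the list would avoid the repeated linear scans while producing the same result.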
The second identification number group is appended to the first identification number group to form a third identification number group, which contains the identification numbers 1, 5, 7, 1, 6, 5, 3, 5, 1, 5, 7, 6, 3. In the third identification number group, the log pages that come from the first identification number group are given their corresponding log content, and the log pages that come from the second identification number group are given a preset type, finally yielding the target log group. The preset type serves as the end identifier, and the log pages corresponding to the identification numbers in the second identification number group are the target log pages. Specifically, the preset type is the MLOG_COMMIT_MTR type, which marks the end position of a physical transaction. In other words, one MLOG_COMMIT_MTR log is generated for each distinct identification number in the log group; such a log needs to carry only the identification number and no specific log content. These MLOG_COMMIT_MTR logs are written together at the very end of the log group, so that the end position of the physical transaction consists of several MLOG_COMMIT_MTR logs. During distribution they can be treated as ordinary logs and distributed to specific threads according to their identification numbers. Because the identification numbers in these logs are generated, when the master library writes the log, from the log pages actually involved in the physical transaction, distribution is both accurate and fast.
Example three
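Putting the pieces together, the target log group can be sketched as the original records followed by one marker record per distinct identification number. This is an illustration under stated assumptions: the record layout `(record_type, page_id, content)`, the placeholder type `"CONTENT"`, the function names, and the modulo hash with remainder 0 mapped to the last thread are all hypothetical; only the type name MLOG_COMMIT_MTR and the example numbers come from the text.

```python
COMMIT_TYPE = "MLOG_COMMIT_MTR"  # preset type named in the text
CONTENT_TYPE = "CONTENT"         # hypothetical placeholder for ordinary log types


def build_target_log_group(log_group):
    """Append one commit marker per distinct page id to the end of the group."""
    records = [(CONTENT_TYPE, page_id, content) for page_id, content in log_group]
    # dict.fromkeys keeps first-occurrence order, giving the second id group.
    distinct_ids = list(dict.fromkeys(page_id for page_id, _ in log_group))
    records.extend((COMMIT_TYPE, page_id, None) for page_id in distinct_ids)
    return records


def distribute(records, thread_count):
    """Route every record, markers included, by its page id (modulo hash assumed)."""
    queues = {n: [] for n in range(1, thread_count + 1)}
    for record in records:
        page_id = record[1]
        thread_no = page_id % thread_count or thread_count
        queues[thread_no].append(record)
    return queues


log_group = [(pid, f"content-{i}") for i, pid in enumerate([1, 5, 7, 1, 6, 5, 3, 5])]
target = build_target_log_group(log_group)
queues = distribute(target, thread_count=4)
```

Since the markers are routed by the same rule as the content records, every thread that received work ends with at least one marker, and a thread that received nothing gets no marker at all. The distribution thread needs no separate bookkeeping of which threads were involved.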
Referring to fig. 4, fig. 4 is a schematic structural diagram of a log parallel processing system according to a third embodiment of the present invention, where the log parallel processing system 200 includes: a log group obtaining module 21, a distributing module 22 and a submitting module 23, wherein:
a log group obtaining module 21, configured to obtain a log group generated by a physical transaction of a current write node, where the log group is formed by multiple log pages in series, each log page corresponds to a specific identification number, and log content is recorded in each log page;
the distribution module 22 is configured to acquire a thread number and the identification number, distribute, according to the identification number, each log page sequentially generated according to a time sequence to a target thread corresponding to the matched thread number, and generate an end identifier at the end of each target thread, where the end identifier is the END flag of an SQL statement;
a commit module 23, configured to execute the log content in each target thread, and to commit the physical transaction of the current write node once all target threads have executed up to their corresponding end identifiers.
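The role of the commit module, running the target threads and committing only after all of them have reached their end identifiers, can be sketched with Python's standard `threading` module. This is a simplified illustration, not the patented implementation: the function `apply_in_parallel`, the queue layout and the `"CONTENT"` placeholder type are assumptions, and applying a record is reduced to appending it to a list.

```python
import threading

COMMIT_TYPE = "MLOG_COMMIT_MTR"  # end-identifier type named in the text


def apply_in_parallel(queues):
    """Run one worker per non-empty queue; report whether the group can commit.

    queues: {thread_no: [(record_type, page_id, content), ...]}  (hypothetical layout)
    """
    reached_end = {}   # thread_no -> True once its end identifier was seen
    applied = {}       # thread_no -> content records applied, in order
    lock = threading.Lock()

    def worker(thread_no, records):
        done, seen_end = [], False
        for record_type, page_id, content in records:
            if record_type == COMMIT_TYPE:
                seen_end = True
            else:
                done.append((page_id, content))
        with lock:
            applied[thread_no] = done
            reached_end[thread_no] = seen_end

    workers = [threading.Thread(target=worker, args=(n, q))
               for n, q in queues.items() if q]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # barrier: wait until every target thread has finished

    # The write node physical transaction commits only if all threads hit their end marker.
    return applied, all(reached_end.values())


queues = {
    1: [("CONTENT", 1, "a"), ("CONTENT", 5, "b"), (COMMIT_TYPE, 1, None)],
    2: [("CONTENT", 6, "c"), (COMMIT_TYPE, 6, None)],
    3: [],  # empty thread: no work, no marker
}
applied, group_committed = apply_in_parallel(queues)
```

Empty queues are skipped, so an idle thread neither delays the commit nor needs an end identifier of its own.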
Further, in other embodiments of the present invention, the distribution module 22 includes:
the first hash processing unit is used for acquiring thread numbers and the identification numbers, and performing hash processing on the identification numbers and the thread numbers to obtain the thread numbers corresponding to the threads;
a first mapping model establishing unit, configured to establish a mapping model between the thread number and the identification number, where the mapping model is used to output the corresponding thread number when the identification number is input;
the first distribution unit is used for inputting the identification number into the mapping model to obtain the corresponding thread number, and distributing each log page to the corresponding target thread according to the thread number;
and the end identifier generating unit is used for generating an end identifier at the end of each target thread, wherein the end identifier is the END flag of an SQL statement.
Further, in other embodiments of the present invention, the distribution module 22 further includes:
the target log group generating unit is used for acquiring the log group and generating a target log group containing a target log page according to the log group;
the second hash processing unit is used for acquiring thread numbers and all identification numbers in the target log group, and performing hash processing on all identification numbers in the target log group and the thread numbers to obtain the thread numbers corresponding to all threads;
a second mapping model establishing unit, configured to establish a mapping model between the thread number and each identification number in the target log group, where the mapping model is configured to output the corresponding thread number when the identification number is input;
the second distribution unit is used for inputting the identification number into the mapping model to obtain the corresponding thread number, and distributing each log page to the corresponding target thread according to the thread number;
where at least one target log page in the target log group is distributed to the end of each target thread, and that target log page serves as the end identifier of the corresponding target thread.
Further, in other embodiments of the present invention, the target log group generating unit includes:
the acquisition subunit is used for sequentially acquiring first identification numbers corresponding to the log pages in the log group to obtain a first identification number group;
the duplication removing subunit is configured to copy the first identification number group, and perform duplication removing processing on the first identification number in the first identification number group to obtain a second identification number group;
an adding subunit, configured to add the second identification number group to the first identification number group to form a third identification number group;
a first assigning subunit, configured to assign, within the third identification number group, the corresponding log content to the log pages contained in the first identification number group;
and a second assigning subunit, configured to assign a preset type to the log pages contained in the second identification number group within the third identification number group, where the preset type implements the function of the end identifier, the log pages contained in the second identification number group are the target log pages, and the preset type is the MLOG_COMMIT_MTR type.
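As a rough sketch of how the subunits above fit together, the following builds a target log group from a log group. The tuple layout, the `"DATA"` type label for ordinary pages, and the function name `build_target_log_group` are assumptions made for illustration; only the `MLOG_COMMIT_MTR` type name comes from the text.

```python
MLOG_COMMIT_MTR = "MLOG_COMMIT_MTR"  # preset type named in the text
DATA = "DATA"  # hypothetical label for ordinary log pages


def build_target_log_group(log_group):
    """log_group: list of (page_id, content) pairs in generation order.

    Returns a list of (page_id, page_type, content) tuples: the original
    pages followed by one marker page per distinct identification number.
    """
    # First identification number group: IDs in generation order.
    first_ids = [page_id for page_id, _ in log_group]
    # Second group: de-duplicated copy, first-seen order preserved.
    second_ids = list(dict.fromkeys(first_ids))
    # Third group = first group followed by second group.
    target = [(page_id, DATA, content) for page_id, content in log_group]
    target += [(page_id, MLOG_COMMIT_MTR, None) for page_id in second_ids]
    return target
```

Since each marker page reuses an identification number that already occurs in the log group, a mapping keyed on the identification number sends the marker to the same thread as the data pages for that number, and it arrives after them.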
Further, in other embodiments of the present invention, the submitting module 23 includes:
the judging unit is used for sequentially acquiring the types of the log pages in each target thread and judging whether the types are preset types or not;
and a committing unit, configured to commit the write node sub-physical transaction matched with the corresponding target thread when the type is judged to be the preset type, where the write node sub-physical transactions together form the write node physical transaction.
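The judging and committing units can be sketched as follows, under stated assumptions: "executing" a page is modeled as appending it to a shared list, "committing" a sub-transaction is modeled as recording the thread number in a set, and the names `run_thread` and `replay` are hypothetical.

```python
import threading

PRESET_TYPE = "MLOG_COMMIT_MTR"  # preset type named in the text


def run_thread(name, queue, applied, committed, lock):
    """Walk one target thread's pages in order, checking each page's type."""
    for page_id, page_type, content in queue:
        with lock:
            if page_type == PRESET_TYPE:
                # End identifier reached: commit this thread's sub-transaction.
                committed.add(name)
            else:
                # Ordinary page: execute (here: record) its log content.
                applied.append((page_id, content))


def replay(queues):
    """queues: {thread_number: [(page_id, page_type, content), ...]}.

    Returns the applied pages and whether every thread reached its end
    identifier, i.e. whether the whole physical transaction can commit.
    """
    applied, committed, lock = [], set(), threading.Lock()
    workers = [
        threading.Thread(target=run_thread,
                         args=(name, queue, applied, committed, lock))
        for name, queue in queues.items()
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return applied, committed == set(queues)
```

In this sketch the overall transaction is reported as committable only when every thread has seen a page of the preset type, mirroring the condition that all target threads execute up to their end identifiers.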
Further, in other embodiments of the present invention, the duplication removing subunit is specifically configured to acquire the first identification number located at the first position of the first identification number group, and place it in a preset number group;
sequentially judge whether each of the remaining first identification numbers in the first identification number group is the same as an identification number already present in the preset number group;
and if not, add the corresponding first identification number to the preset number group, thereby obtaining the second identification number group.
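The de-duplication steps above amount to an order-preserving scan. A minimal sketch, with the hypothetical function name `deduplicate` and a plain list standing in for the preset number group:

```python
def deduplicate(first_ids):
    """Return the second identification number group: the distinct
    identification numbers of first_ids in first-seen order."""
    if not first_ids:
        return []
    # Seed the preset number group with the number at the first position.
    preset = [first_ids[0]]
    # Compare every remaining number against what is already collected.
    for candidate in first_ids[1:]:
        if candidate not in preset:
            preset.append(candidate)
    return preset
```

The list membership test keeps the sketch close to the described steps; for long log groups, tracking seen numbers in a set alongside the list would avoid the repeated linear scans.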
Example four
Referring to fig. 5, a block diagram of an electronic device according to a fourth embodiment of the present invention is shown, which includes a memory 20, a processor 10, and a computer program 30 stored in the memory and running on the processor, where the processor 10 implements the log parallel processing method when executing the computer program 30.
The processor 10 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor or other data processing chip in some embodiments, and is used to execute program codes stored in the memory 20 or process data, such as executing an access restriction program.
The memory 20 includes at least one type of readable storage medium, which includes flash memory, hard disk, multimedia card, card type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, and the like. The memory 20 may in some embodiments be an internal storage unit of the electronic device, for example a hard disk of the electronic device. The memory 20 may also be an external storage device of the electronic device in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the electronic device. Further, the memory 20 may also include both an internal storage unit and an external storage device of the electronic apparatus. The memory 20 may be used not only to store application software and various types of data of the electronic device, but also to temporarily store data that has been output or is to be output.
It should be noted that the configuration shown in fig. 5 does not constitute a limitation of the electronic device, which may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components in other embodiments.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the log parallel processing method as described above.
Those of skill in the art will understand that the logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be viewed as implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (an electrical device) having one or more wires, a portable computer diskette (a magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: discrete logic circuits with logic gates for implementing logic functions on data states, application-specific integrated circuits with appropriate combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above examples show only some embodiments of the present invention, and although their description is specific and detailed, it should not be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method for parallel processing of logs, the method comprising:
acquiring a log group generated by a physical transaction of a current write node, wherein the log group is composed of a plurality of log pages which are in series, each log page corresponds to a specific identification number, and log content is recorded in each log page;
acquiring a thread number and the identification number, distributing the log pages sequentially generated according to a time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an ending identification at the tail of each target thread;
and executing the log content in each target thread until all the target threads are executed to the corresponding end identifier, and submitting the physical transaction of the current write node.
2. The method according to claim 1, wherein the step of acquiring the thread number and the identification number, distributing each log page sequentially generated in time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an end identifier at the end of each target thread comprises:
acquiring thread numbers and each identification number, and performing hash processing on each identification number and each thread number to obtain the thread numbers corresponding to each thread;
establishing a mapping model of the thread number and the identification number, wherein the mapping model is used for outputting the corresponding thread number when the identification number is input;
inputting the identification number into the mapping model to obtain the corresponding thread number, and distributing each log page to the corresponding target thread according to the thread number;
and generating an END identifier at the END of each target thread, wherein the END identifier is an END mark in an SQL statement.
3. The method according to claim 1, wherein the step of obtaining a thread number and the identification number, distributing each log page generated in sequence in time to a target thread corresponding to the matched thread number according to the identification number, and generating an end identifier at the end of each target thread further comprises:
acquiring the log group, and generating a target log group containing a target log page according to the log group;
acquiring thread numbers and all identification numbers in the target log group, and performing hash processing on all identification numbers in the target log group and the thread numbers to obtain the thread numbers corresponding to all threads;
establishing a mapping model of the thread number and each identification number in the target log group, wherein the mapping model is used for outputting the corresponding thread number when the identification number is input;
inputting the identification number into the mapping model to obtain the corresponding thread number, and distributing each log page to the corresponding target thread according to the thread number;
and the tail of each target thread is distributed with at least one target log page in the target log group, and the target log page is the corresponding end identifier of the target thread.
4. The method according to claim 3, wherein the step of obtaining the log group and generating a target log group including a target log page according to the log group comprises:
sequentially acquiring first identification numbers corresponding to all the log pages in the log group to obtain a first identification number group;
copying the first identification number group, and performing duplication removal processing on the first identification numbers in the first identification number group to obtain a second identification number group;
adding the second identification number group to the first identification number group to form a third identification number group;
in the third identification number group, the log page contained in the first identification number group is endowed with corresponding log content;
and giving a preset type to the log pages contained in the second identification number group in the third identification number group so as to finally obtain the target log group, wherein the preset type is used for realizing the function of finishing identification, and the log pages contained in the second identification number group are the target log pages.
5. The log parallel processing method according to claim 4, wherein the preset type is an MLOG_COMMIT_MTR type.
6. The method according to claim 5, wherein the step of performing the log content in each of the target threads until all the target threads have reached the corresponding end identifier, and then committing the current write node physical transaction comprises:
sequentially acquiring the types of the log pages in the target threads, and judging whether the types are preset types or not;
if yes, submitting the write node sub-physical transaction matched with the corresponding target thread, wherein the write node sub-physical transaction is used for forming the write node physical transaction.
7. The method of claim 5, wherein the copying the first identification number group and performing de-duplication processing on the first identification numbers in the first identification number group to obtain a second identification number group comprises:
acquiring the first identification number located at the first position of the first identification number group, and putting that first identification number into a preset number group;
sequentially judging whether each of the remaining first identification numbers in the first identification number group is the same as an identification number already present in the preset number group;
and if not, adding the corresponding first identification number into the preset number group to obtain the second identification number group.
8. A log parallel processing system, the system comprising:
a log group acquisition module, configured to acquire a log group generated by a physical transaction of a current write node, wherein the log group is composed of a plurality of log pages in series, each log page corresponds to a specific identification number, and log content is recorded in each log page;
the distribution module is used for acquiring a thread number and the identification number, distributing each log page sequentially generated according to a time sequence to a target thread corresponding to the matched thread number according to the identification number, and generating an end identification at the tail of each target thread;
and a committing module, configured to execute the log content in each target thread, and to commit the physical transaction of the current write node once all the target threads have executed up to their corresponding end identifiers.
9. A computer-readable storage medium, comprising:
the readable storage medium stores one or more programs which, when executed by a processor, implement the log parallel processing method according to any one of claims 1 to 7.
10. An electronic device, comprising a memory and a processor, wherein:
the memory is used for storing computer programs;
the processor is configured to implement the log parallel processing method according to any one of claims 1 to 7 when executing the computer program stored in the memory.
CN202310140146.5A 2023-02-21 2023-02-21 Log parallel processing method, system, storage medium and equipment Active CN115840633B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310140146.5A CN115840633B (en) 2023-02-21 2023-02-21 Log parallel processing method, system, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310140146.5A CN115840633B (en) 2023-02-21 2023-02-21 Log parallel processing method, system, storage medium and equipment

Publications (2)

Publication Number Publication Date
CN115840633A true CN115840633A (en) 2023-03-24
CN115840633B CN115840633B (en) 2023-04-21

Family

ID=85579954

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310140146.5A Active CN115840633B (en) 2023-02-21 2023-02-21 Log parallel processing method, system, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN115840633B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090106594A1 (en) * 2007-10-19 2009-04-23 International Business Machines Corporation Method and Device for Log Events Processing
CN103593257A (en) * 2012-08-15 2014-02-19 阿里巴巴集团控股有限公司 Data backup method and device
CN105868030A (en) * 2015-12-22 2016-08-17 乐视移动智能信息技术(北京)有限公司 Log data communication processing apparatus and method as well as mobile terminal
US20180144015A1 (en) * 2016-11-18 2018-05-24 Microsoft Technology Licensing, Llc Redoing transaction log records in parallel
CN108205476A (en) * 2017-12-27 2018-06-26 郑州云海信息技术有限公司 A kind of method and device of multithreading daily record output
CN108874588A (en) * 2018-06-08 2018-11-23 郑州云海信息技术有限公司 A kind of database instance restoration methods and device
CN111858503A (en) * 2020-06-04 2020-10-30 武汉达梦数据库有限公司 Parallel execution method and data synchronization system based on log analysis synchronization

Also Published As

Publication number Publication date
CN115840633B (en) 2023-04-21

Similar Documents

Publication Publication Date Title
EP3678346B1 (en) Blockchain smart contract verification method and apparatus, and storage medium
US9116903B2 (en) Method and system for inserting data records into files
CN103034659B (en) A kind of method and system of data de-duplication
US10338917B2 (en) Method, apparatus, and system for reading and writing files
CN109300036B (en) Bifurcation regression method and device of block chain network
US20200210399A1 (en) Signature-based cache optimization for data preparation
US20170109378A1 (en) Distributed pipeline optimization for data preparation
US9384202B1 (en) Gateway module to access different types of databases
CN108319711B (en) Method and device for testing transaction consistency of database, storage medium and equipment
US8347052B2 (en) Initializing of a memory area
US10740316B2 (en) Cache optimization for data preparation
DE102020121075A1 (en) Establishment and procedure for the authentication of software
US5822746A (en) Method for mapping a file specification to a sequence of actions
US8572048B2 (en) Supporting internal consistency checking with consistency coded journal file entries
CN105094711A (en) Method and device for achieving copy-on-write file system
CN115840633A (en) Log parallel processing method, system, storage medium and equipment
CN102495838B (en) Data processing method and data processing device
CN111190895B (en) Organization method, device and storage medium of column-type storage data
CN109542860B (en) Service data management method based on HDFS and terminal equipment
CN111752954B (en) Large-scale feature data storage method and device
CN112965939A (en) File merging method, device and equipment
CN114461605B (en) Transaction data multi-version implementation method, device and equipment of memory multi-dimensional database
CN112559457A (en) Data access method and device
Wan et al. NVLH: crash-consistent linear hashing for non-volatile memory
US11288447B2 (en) Step editor for data preparation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant