CN116755858A - Kafka data management method, device, computer equipment and storage medium - Google Patents

Kafka data management method, device, computer equipment and storage medium

Info

Publication number
CN116755858A
Authority
CN
China
Prior art keywords
disk
task
kafka data
kafka
partition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310745862.6A
Other languages
Chinese (zh)
Inventor
解培佩
陈奕宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202310745862.6A
Publication of CN116755858A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • G06F3/0676Magnetic disk device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/547Messaging middleware
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of data processing and provides a Kafka data management method, a Kafka data management device, computer equipment and a storage medium. The Kafka data management method comprises the following steps: monitoring the current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task; if a task in the current batch partition Kafka data migration task is not completed, dividing the incomplete task into the next batch partition Kafka data migration task; and executing each task in the next batch partition Kafka data migration task, so as to improve Kafka data management efficiency. The present application also relates to blockchain technology, and the Kafka data may be stored in blockchain nodes.

Description

Kafka data management method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a Kafka data management method, a Kafka data management device, a computer device, and a storage medium.
Background
Kafka is an open-source stream processing platform widely used as message middleware. It features message persistence, high throughput, distribution, multi-client support, real-time processing and the like; it mainly receives various data such as offline computation data, real-time computation data and log center data, and serves application scenarios such as log collection, website activity tracking, operation monitoring and real-time data stream synchronization. For example, in the financial domain, Kafka may be applied to electronic transaction data migration. However, as the Kafka cluster scale keeps growing and the data volume keeps increasing, problems such as delay in electronic transaction data migration become unavoidable, so data management efficiency is low.
Therefore, how to improve Kafka data management efficiency becomes a problem to be solved.
Disclosure of Invention
The application provides a Kafka data management method, a Kafka data management device, computer equipment and a storage medium, and aims to improve Kafka data management efficiency.
To achieve the above object, the present application provides a Kafka data management method comprising:
monitoring the current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task;
if the task in the current batch partition Kafka data migration task is not completed, dividing the incomplete task into a next batch partition Kafka data migration task;
and executing each task in the next batch partition Kafka data migration task.
In addition, to achieve the above object, the present application also provides a Kafka data management apparatus, comprising:
the monitoring module is used for monitoring the current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task;
the processing module is used for dividing the incomplete task into a next batch partition Kafka data migration task if the task is incomplete in the current batch partition Kafka data migration task;
and the execution module is used for executing each task in the next batch of partition Kafka data migration tasks.
In addition, to achieve the above object, the present application also provides a computer apparatus including a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute the computer program and implement the Kafka data management method as described above when the computer program is executed.
In addition, to achieve the above object, the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above-described Kafka data management method.
The application discloses a Kafka data management method, a device, computer equipment and a storage medium. The current batch partition Kafka data migration task corresponding to each disk is monitored, wherein each batch partition Kafka data migration task comprises at least one task; if a task in the current batch partition Kafka data migration task is not completed, the incomplete task is divided into the next batch partition Kafka data migration task, and each task in the next batch partition Kafka data migration task is executed. In this way, the next batch partition Kafka data migration task no longer needs to queue and wait, delay is reduced, and data management efficiency is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic flow chart of steps of a Kafka data management method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of steps of another Kafka data management method provided by an embodiment of the present application;
FIG. 3 is a schematic flow chart of steps of yet another Kafka data management method provided by an embodiment of the present application;
FIG. 4 is a schematic flow chart of steps of yet another Kafka data management method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a system layer architecture;
FIG. 6 is a schematic diagram of a system layer structure according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of a Kafka data management apparatus according to an embodiment of the present application;
fig. 8 is a schematic block diagram of a computer device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The flow diagrams depicted in the figures are merely illustrative: they do not necessarily include all of the elements and operations/steps, nor must the steps be performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so the actual order of execution may change according to the actual situation.
It is to be understood that the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
The embodiment of the application provides a Kafka data management method, a Kafka data management device, computer equipment and a storage medium, which are used for improving the Kafka data management efficiency.
Referring to fig. 1, fig. 1 is a flowchart of a Kafka data management method according to an embodiment of the application. The method can be applied to computer equipment, and the application scene of the method is not limited in the application. The following describes the Kafka data management method in detail, taking the application of the Kafka data management method to a computer device as an example.
As shown in fig. 1, the Kafka data management method specifically includes steps S101 to S103.
S101, monitoring a current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task.
In current practice, when partition Kafka data migration tasks are performed in batches, the next batch of Kafka data migration tasks is usually executed only after all tasks in the previous batch have been executed. As long as any task in the previous batch is not completed, the next batch keeps queuing, which causes efficiency problems and resource contention problems in data migration.
In order to improve the efficiency of data migration, in this embodiment, when partition Kafka data migration tasks are performed in batches, the current batch partition Kafka data migration task corresponding to each disk is monitored, and the execution status of each task in the current batch partition Kafka data migration task is determined, for example whether the task has completed or is still unfinished.
Each batch partition Kafka data migration task comprises at least one task. For example, suppose there are 16 partition Kafka data migration tasks, task 1 through task 16, divided into 4 batches of 4 tasks each: the first batch partition Kafka data migration task includes tasks 1, 2, 3 and 4, the second batch includes tasks 5, 6, 7 and 8, the third batch includes tasks 9, 10, 11 and 12, and the fourth batch includes tasks 13, 14, 15 and 16.
For example, for an electronic transaction business scenario in the financial field, electronic transaction data migration tasks of disk partitions are performed in batches based on Kafka. If each batch submits 4 partition electronic transaction data migration tasks, for example the current batch includes task 1, task 2, task 3 and task 4, then tasks 1 to 4 are monitored to determine whether each of them has been executed to completion.
S102, if tasks in the current batch partition Kafka data migration task are not completed, dividing the incomplete tasks into next batch partition Kafka data migration tasks.
For example, taking the example above, tasks 1, 2, 3 and 4 are monitored to determine whether they have been executed to completion; if all of them have completed, the next batch of partition tasks, namely tasks 5, 6, 7 and 8, is executed.
Illustratively, the tasks within each batch partition Kafka data migration task are executed in parallel. For example, once tasks 1, 2, 3 and 4 have all been executed to completion, tasks 5, 6, 7 and 8 are executed in parallel.
On the other hand, if any of tasks 1, 2, 3 and 4 is not completed, for example task 3 is unfinished, then the unfinished task 3 is divided into the next batch of partition tasks, so the next batch of partition tasks includes tasks 3, 5, 6, 7 and 8.
S103, executing each task in the next batch partition Kafka data migration task.
Continuing the example above, the unfinished task 3 is divided into the next batch of partition tasks, while tasks 1, 2 and 4 have all been executed to completion; the current batch is therefore treated as finished, so the next batch of partition tasks, namely tasks 5, 6, 7 and 8, no longer needs to queue. After task 3 has been successfully divided into the next batch of partition tasks, tasks 3, 5, 6, 7 and 8 are executed, illustratively in parallel.
By analogy, whenever a task in the current batch of partition tasks is not completed, the unfinished task is divided into the next batch of partition tasks and the next batch starts executing immediately. The next batch of partition tasks therefore does not need to queue, delay is reduced, the efficiency of data migration is improved, and the resource utilization rate is also improved.
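Illustratively, the batch rollover logic described above can be sketched as follows. This is a minimal sketch rather than the claimed implementation: the task names, the submit and is_finished callbacks and the polling interval are assumptions standing in for the actual migration submission and monitoring.
```python
import time
from typing import Callable, Iterable, List


def run_in_batches(tasks: Iterable[str],
                   batch_size: int,
                   submit: Callable[[str], None],
                   is_finished: Callable[[str], bool],
                   poll_interval: float = 1.0) -> None:
    """Submit partition migration tasks in batches; any task still unfinished
    when its batch is inspected is divided into the next batch instead of
    making that batch queue behind it."""
    pending: List[str] = list(tasks)
    in_flight: List[str] = []                 # submitted but not yet finished
    while pending or in_flight:
        new_tasks, pending = pending[:batch_size], pending[batch_size:]
        for task in new_tasks:
            submit(task)                      # e.g. hand the reassignment to Kafka
        in_flight += new_tasks
        time.sleep(poll_interval)             # give the current batch time to run
        # monitor the current batch: unfinished tasks roll into the next batch
        in_flight = [t for t in in_flight if not is_finished(t)]
```
With 16 tasks and a batch size of 4, an unfinished task 3 would simply be re-checked alongside tasks 5 to 8 in the next round, matching the example above.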
In some embodiments, as shown in fig. 2, the method further comprises steps S104 to S106.
S104, acquiring the current disk utilization rate of each disk through a preset Kafka data migration management component;
S105, determining an idle disk from the disks according to the current disk utilization rate of the disks, wherein the current disk utilization rate of the idle disk is lower than that of other disks;
S106, performing Kafka data migration based on the idle disk.
For the system application layer, a Broker is a Kafka server, namely a caching agent. After receiving a message from a producer, a Kafka Broker saves the message to disk; at the same time, the Broker responds to Consumers' message fetch requests and delivers messages to them. Brokers in a cluster may have load imbalance problems, such as unbalanced disk utilization, resulting in local hot spot problems.
To keep disk utilization balanced, a Kafka data migration management component, Rebalancer, is illustratively designed in advance, and the current disk utilization of each disk is obtained through the Rebalancer. The performance state of each disk is monitored, for example, by a Kafka monitoring component, Monitor, where the performance state of a disk includes its current disk utilization. Monitor reports the current disk utilization of each disk to Rebalancer, and Rebalancer obtains the current disk utilization of each disk from Monitor.
After the current disk utilization of each disk is obtained, the current disk utilizations of the disks are compared, the disk with low current disk utilization is determined to be an idle disk, and Kafka data migration is performed through the idle disk. Illustratively, the Rebalancer submits a data migration plan for the idle disk to the reassignment node of ZooKeeper, so that Kafka data migration is performed based on the idle disk.
For example, if the disks include disk1, disk2, disk3 and disk4, where the current disk utilization of disk1 is 60%, that of disk2 is 40%, that of disk3 is 80% and that of disk4 is 20%, then Rebalancer determines disk4 to be the idle disk, and Kafka data migration is performed through disk4.
By monitoring the performance state of each disk and performing Kafka data migration based on the idle disk, the overall utilization of the disks is kept balanced.
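Illustratively, the idle-disk selection can be sketched as follows. This is a minimal sketch: shutil.disk_usage and the mount-point names stand in for the Monitor component's utilization report, and the selection rule simply picks the least-utilized disk.
```python
import shutil
from typing import Dict, List


def current_disk_utilization(mount_points: List[str]) -> Dict[str, float]:
    """Stand-in for the Monitor report: used/total ratio per disk."""
    usage: Dict[str, float] = {}
    for mp in mount_points:
        total, used, _free = shutil.disk_usage(mp)
        usage[mp] = used / total
    return usage


def pick_idle_disk(utilization: Dict[str, float]) -> str:
    """Rebalancer-style choice: the disk whose current utilization is lowest."""
    return min(utilization, key=utilization.get)


# Example matching the description: disk4 (20%) is chosen as the idle disk.
utilization = {"disk1": 0.60, "disk2": 0.40, "disk3": 0.80, "disk4": 0.20}
assert pick_idle_disk(utilization) == "disk4"
```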
In some embodiments, as shown in fig. 3, the method further comprises steps S107 to S110.
S107, determining the number of partitions corresponding to the magnetic disks;
S108, determining whether redundant partitions exist on each disk according to the partition number of each disk;
S109, if redundant partitions exist, determining a target disk from the disks, wherein the number of the partitions corresponding to the target disk is lower than the number of the partitions corresponding to other disks;
S110, migrating the redundant partition to the target disk through a preset Kafka data migration management component.
For example, if the disks include disk1, disk2, disk3 and disk4, where disk1 and disk4 each hold 4 partitions and disk2 and disk3 each hold 2 partitions, then it is determined that the number of partitions corresponding to disk1 and disk4 is 4 and the number of partitions corresponding to disk2 and disk3 is 2.
Since the number of partitions corresponding to disk1 and disk4 is 4 while that corresponding to disk2 and disk3 is 2, disk1 and disk4 hold more partitions, so it is determined that redundant partitions exist on disk1 and disk4. For example, the last partition on each of disk1 and disk4 is determined to be the redundant partition.
In some embodiments, the determining whether there is a redundant partition on each of the disks according to the number of partitions of each of the disks includes:
if the partition number of the first disk is greater than the preset number, determining other partitions of the first disk except the preset number of partitions as redundant partitions, wherein the first disk is any one of the disks;
if the number of the partitions of the first disk is smaller than or equal to the preset number, determining that no redundant partition exists on the first disk.
For example, a preset number is set in advance as a reference for the number of partitions per disk; typically, 3 partitions per disk represents a relatively balanced state, so the preset number may be set to 3. It should be noted that the specific value of the preset number can be set flexibly according to the actual situation, and the application does not particularly limit it.
If the number of partitions of a disk is greater than the preset number, redundant partitions exist on that disk, and the partitions on the disk other than the preset number of partitions are determined to be redundant partitions. For example, if disk1 holds 4 partitions, there is one redundant partition on disk1, and the fourth partition on disk1 can be determined to be the redundant partition.
Conversely, if the number of partitions of a disk is smaller than or equal to the preset number, no redundant partition exists on that disk. For example, if disk3 holds 2 partitions, there is no redundant partition on disk3.
Since the number of partitions corresponding to disk2 and disk3 is 2, which is smaller than that of the other disks, disk2 and disk3 are determined to be the target disks.
Thereafter, the redundant partitions are migrated to the target disks by the Rebalancer. For example, the redundant partition of disk1 is migrated to disk2 and the redundant partition of disk4 is migrated to disk3.
As a result, disk1, disk2, disk3 and disk4 each hold 3 partitions, that is, the number of partitions corresponding to each of disk1, disk2, disk3 and disk4 is 3 and an equilibrium state is reached, so the overall utilization of the disks is kept as balanced as possible.
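Illustratively, the redundant-partition detection and migration plan can be sketched as follows. This is a minimal sketch under the assumptions above: the preset number of 3, the partition names and the plain dictionary standing in for the Rebalancer's view of the partition layout are illustrative.
```python
from typing import Dict, List, Tuple

PRESET_PARTITIONS = 3   # assumed preset number of partitions per disk


def find_redundant(layout: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Partitions beyond the preset number on a disk are treated as redundant."""
    return {disk: parts[PRESET_PARTITIONS:]
            for disk, parts in layout.items()
            if len(parts) > PRESET_PARTITIONS}


def plan_migration(layout: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Move each redundant partition to the disk currently holding the fewest partitions."""
    counts = {disk: len(parts) for disk, parts in layout.items()}
    plan: List[Tuple[str, str, str]] = []
    for source, redundant in find_redundant(layout).items():
        for partition in redundant:
            target = min(counts, key=counts.get)   # target disk: fewest partitions
            plan.append((partition, source, target))
            counts[source] -= 1
            counts[target] += 1
    return plan


# disk1 and disk4 hold 4 partitions, disk2 and disk3 hold 2, so one partition
# moves from disk1 to disk2 and one from disk4 to disk3.
layout = {"disk1": ["p0", "p1", "p2", "p3"], "disk2": ["p4", "p5"],
          "disk3": ["p6", "p7"], "disk4": ["p8", "p9", "p10", "p11"]}
print(plan_migration(layout))   # [('p3', 'disk1', 'disk2'), ('p11', 'disk4', 'disk3')]
```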
In some embodiments, as shown in fig. 4, the method further comprises steps S111 to S112.
S111, monitoring data states corresponding to a plurality of Brokers when Kafka data access is performed;
and S112, pulling the ready data of a first Broker if the data of the first Broker is ready first, wherein the first Broker is any one of the plurality of Brokers.
Similarly, for the system application layer, the Kafka Consumer pulls data in single-threaded mode. For example, assuming there are 3 Brokers, Broker-1, Broker-2 and Broker-3, when the Consumer first accesses Broker-1 for data, Broker-2 and Broker-3 will not be accessed in the meantime. If the data on Broker-1 is not ready while the data on Broker-2 or Broker-3 becomes ready during the wait, the response time increases and data access efficiency decreases.
In order to improve data access efficiency, when data access is performed based on Kafka, the data states corresponding to the plurality of Brokers are monitored, and it is determined whether the data corresponding to each Broker is ready.
Whichever of the plurality of Brokers has its data ready first, the ready data of that Broker is pulled. For example, if the data of the first Broker is ready first, the ready data of the first Broker is pulled. This reduces response time, improves data access efficiency, and resolves the delay-metric distortion caused by the Consumer consuming partitions unevenly.
Illustratively, after the ready data of the first Broker is pulled, corresponding data access success information is returned. The data access success information indicates that the data access succeeded, which further improves the user experience.
For example, taking an electronic transaction service scenario in the financial field as an example, when electronic transaction data access is performed, the electronic transaction data states corresponding to Broker-1, Broker-2 and Broker-3 are monitored to determine whether the electronic transaction data corresponding to each Broker is ready.
If the electronic transaction data corresponding to Broker-2 is ready first, the ready electronic transaction data on Broker-2 is pulled, so the Consumer does not keep waiting just because the electronic transaction data on Broker-1 is not ready; the response time is reduced and electronic transaction data access efficiency is improved.
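Illustratively, the "pull from whichever Broker is ready first" behaviour can be sketched as follows. This is a minimal sketch: Python threads and artificial readiness delays stand in for the monitoring of the Brokers' data states, and the Broker names are illustrative.
```python
import queue
import threading
import time


def broker_fetch(broker: str, ready_after: float, out: "queue.Queue[tuple]") -> None:
    """Stand-in for a Broker preparing data; reports to the queue once ready."""
    time.sleep(ready_after)
    out.put((broker, f"data from {broker}"))


def pull_first_ready(ready_delays: dict) -> tuple:
    """Monitor every Broker's data state and pull from whichever Broker is ready
    first, instead of blocking on one Broker whose data is not yet ready."""
    ready: "queue.Queue[tuple]" = queue.Queue()
    for broker, delay in ready_delays.items():
        threading.Thread(target=broker_fetch, args=(broker, delay, ready),
                         daemon=True).start()
    broker, data = ready.get()                              # first Broker whose data is ready
    print(f"data access success: pulled from {broker}")     # success information returned
    return broker, data


# Broker-2 becomes ready first, so its data is pulled without waiting on Broker-1.
pull_first_ready({"Broker-1": 0.5, "Broker-2": 0.1, "Broker-3": 0.3})
```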
In some embodiments, the method further comprises:
determining the total cache capacity after expanding the cache capacity of the cache area based on the Raid card;
and performing Kafka data read-write operation according to the total cache capacity.
Currently, as shown in fig. 5, when data is read and written, the Kafka Broker preferentially places the data in the PageCache (page cache) and synchronizes it to the corresponding disk when the cache capacity is insufficient. For example, as shown in fig. 5, the disks in the system layer include Disk1, Disk2 and Disk3. However, ordinary disks such as HDDs (Hard Disk Drives) suffer from insufficient random read/write performance, which manifests as increased delay and reduced throughput.
In order to solve this problem, as shown in fig. 6, in this embodiment a Raid (Redundant Array of Independent Disks) card is added to the system layer structure shown in fig. 5. The Raid card has its own cache, similar to the PageCache, and merges data into larger data blocks before writing them to disk, so random read/write performance is preserved.
For example, if the cache capacity corresponding to the PageCache is a first cache capacity and the cache capacity corresponding to the Raid card is a second cache capacity, then the cache area expanded by the added Raid card has a total cache capacity equal to the sum of the first cache capacity and the second cache capacity. When data is read and written, based on the expanded total cache capacity, the data is merged into a larger Block for the read/write operation; for example, when the buffered data reaches the total cache capacity, the Block is written to Disk.
Since each Block is larger than before, bandwidth throughput increases and delay decreases. The PageCache-based caching is improved and the random read/write performance of the disk is improved, so Kafka maintains good read/write performance even with large data volumes.
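Illustratively, the expanded write path can be sketched as follows. This is a minimal sketch: the cache sizes, the in-memory buffer and the write_block callback are illustrative assumptions; it only shows merging small writes into one Block sized to the combined PageCache and Raid-card cache capacity.
```python
from typing import Callable, List


class ExpandedWriteBuffer:
    """Merge small writes into one larger block sized to PageCache + Raid-card cache."""

    def __init__(self, pagecache_bytes: int, raid_cache_bytes: int,
                 write_block: Callable[[bytes], None]) -> None:
        self.total_capacity = pagecache_bytes + raid_cache_bytes   # expanded total cache
        self.write_block = write_block            # e.g. an append to the Kafka log segment
        self.buffered: List[bytes] = []
        self.buffered_bytes = 0

    def write(self, message: bytes) -> None:
        self.buffered.append(message)
        self.buffered_bytes += len(message)
        if self.buffered_bytes >= self.total_capacity:
            self.flush()

    def flush(self) -> None:
        if self.buffered:
            self.write_block(b"".join(self.buffered))   # one large sequential block
            self.buffered.clear()
            self.buffered_bytes = 0


# Example: a 4 KiB PageCache plus a 4 KiB Raid cache flushes in 8 KiB blocks.
buf = ExpandedWriteBuffer(4096, 4096, write_block=lambda b: print(f"flush {len(b)} bytes"))
for _ in range(16):
    buf.write(b"x" * 512)     # 16 * 512 B = 8 KiB, so exactly one flush occurs
buf.flush()
```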
Referring to fig. 7, fig. 7 is a schematic block diagram of a Kafka data management apparatus according to an embodiment of the present application, where the Kafka data management apparatus may be configured in a computer device to perform the aforementioned Kafka data management method.
As shown in fig. 7, the Kafka data management apparatus 1000 includes: a monitoring module 1001, a processing module 1002, and an execution module 1003.
The monitoring module 1001 is configured to monitor a current batch partition Kafka data migration task corresponding to each disk, where each batch partition Kafka data migration task includes at least one task;
a processing module 1002, configured to divide, if there is a task incompletion in the current batch partition Kafka data migration task, the outstanding task into a next batch partition Kafka data migration task;
and an execution module 1003, configured to execute each task in the next batch partition Kafka data migration task.
In one embodiment, the processing module 1002 is further configured to:
acquiring the current disk utilization rate of each disk through a preset Kafka data migration management component;
determining an idle disk from each disk according to the current disk utilization rate of each disk, wherein the current disk utilization rate of the idle disk is lower than the current disk utilization rates of other disks;
the execution module 1003 is further configured to:
and performing Kafka data migration based on the idle disk.
In one embodiment, the processing module 1002 is further configured to:
determining the number of partitions corresponding to the magnetic disks;
determining whether redundant partitions exist on each disk according to the partition number of each disk;
if redundant partitions exist, determining a target disk from each disk, wherein the number of partitions corresponding to the target disk is lower than the number of partitions corresponding to other disks;
and migrating the redundant partition to the target disk through a preset Kafka data migration management component.
In one embodiment, the processing module 1002 is further configured to:
if the partition number of the first disk is greater than the preset number, determining other partitions of the first disk except the preset number of partitions as redundant partitions, wherein the first disk is any one of the disks;
if the number of the partitions of the first disk is smaller than or equal to the preset number, determining that no redundant partition exists on the first disk.
In one embodiment, the monitoring module 1001 is further configured to:
monitoring data states corresponding to a plurality of Brokers when Kafka data access is performed;
the processing module 1002 is further configured to:
and pulling the ready data of a first Broker if the data of the first Broker is ready first, wherein the first Broker is any one of the plurality of Brokers.
In one embodiment, the processing module 1002 is further configured to:
and returning corresponding data access success information.
In one embodiment, the processing module 1002 is further configured to:
determining the total cache capacity after expanding the cache capacity of the cache area based on the Raid card;
the execution module 1003 is further configured to:
and performing Kafka data read-write operation according to the total cache capacity.
Wherein, each module in the above-mentioned Kafka data management apparatus 1000 corresponds to each step in the above-mentioned Kafka data management method embodiment, and the functions and implementation processes thereof are not described herein in detail.
The methods and apparatus of the present application are operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
By way of example, the methods, apparatus described above may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 8.
Referring to fig. 8, fig. 8 is a schematic block diagram of a computer device according to an embodiment of the present application.
Referring to fig. 8, the computer device includes a processor and a memory connected by a system bus, wherein the memory may include a non-volatile storage medium and an internal memory.
The processor is used to provide computing and control capabilities to support the operation of the entire computer device.
The internal memory provides an environment for the execution of a computer program in a non-volatile storage medium that, when executed by a processor, causes the processor to perform any of the Kafka data management methods.
It should be appreciated that the processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field-programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. Wherein the general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Wherein in one embodiment the processor is configured to run a computer program stored in the memory to implement the steps of:
monitoring the current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task;
if the task in the current batch partition Kafka data migration task is not completed, dividing the incomplete task into a next batch partition Kafka data migration task;
and executing each task in the next batch partition Kafka data migration task.
In one embodiment, the processor is further configured to implement:
acquiring the current disk utilization rate of each disk through a preset Kafka data migration management component;
determining an idle disk from each disk according to the current disk utilization rate of each disk, wherein the current disk utilization rate of the idle disk is lower than the current disk utilization rates of other disks;
and performing Kafka data migration based on the idle disk.
In one embodiment, the processor is further configured to implement:
determining the number of partitions corresponding to the magnetic disks;
determining whether redundant partitions exist on each disk according to the partition number of each disk;
if redundant partitions exist, determining a target disk from each disk, wherein the number of partitions corresponding to the target disk is lower than the number of partitions corresponding to other disks;
and migrating the redundant partition to the target disk through a preset Kafka data migration management component.
In one embodiment, when implementing the determining whether there is a redundant partition on each of the disks according to the number of partitions of each of the disks, the processor is configured to implement:
if the partition number of the first disk is greater than the preset number, determining other partitions of the first disk except the preset number of partitions as redundant partitions, wherein the first disk is any one of the disks;
if the number of the partitions of the first disk is smaller than or equal to the preset number, determining that no redundant partition exists on the first disk.
In one embodiment, the processor is further configured to implement:
monitoring data states corresponding to a plurality of Brokers when Kafka data access is performed;
and pulling the ready data of a first Broker if the data of the first Broker is ready first, wherein the first Broker is any one of the plurality of Brokers.
In one embodiment, the processor, after implementing the pulling of the ready data of the first Broker, is configured to implement:
and returning corresponding data access success information.
In one embodiment, the processor is further configured to implement:
determining the total cache capacity after expanding the cache capacity of the cache area based on the Raid card;
and performing Kafka data read-write operation according to the total cache capacity.
The embodiment of the application also provides a computer readable storage medium.
The computer readable storage medium of the present application has stored thereon a computer program which, when executed by a processor, implements the steps of the Kafka data management method as described above.
The computer readable storage medium may be an internal storage unit of the Kafka data management apparatus or the computer device according to the foregoing embodiments, for example, a hard disk or memory of the Kafka data management apparatus or the computer device. The computer-readable storage medium may also be an external storage device of the Kafka data management apparatus or computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital Card (SD Card) or a Flash Card provided on the Kafka data management apparatus or computer device.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralised database: a chain of data blocks generated in association by cryptographic means, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer and the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
While the application has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made and equivalents will be apparent to those skilled in the art without departing from the scope of the application.

Claims (10)

1. A Kafka data management method, the Kafka data management method comprising:
monitoring the current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task;
if the task in the current batch partition Kafka data migration task is not completed, dividing the incomplete task into a next batch partition Kafka data migration task;
and executing each task in the next batch partition Kafka data migration task.
2. The Kafka data management method of claim 1, wherein the method further comprises:
acquiring the current disk utilization rate of each disk through a preset Kafka data migration management component;
determining an idle disk from each disk according to the current disk utilization rate of each disk, wherein the current disk utilization rate of the idle disk is lower than the current disk utilization rates of other disks;
and performing Kafka data migration based on the idle disk.
3. The Kafka data management method of claim 1, wherein the method further comprises:
determining the number of partitions corresponding to the magnetic disks;
determining whether redundant partitions exist on each disk according to the partition number of each disk;
if redundant partitions exist, determining a target disk from each disk, wherein the number of partitions corresponding to the target disk is lower than the number of partitions corresponding to other disks;
and migrating the redundant partition to the target disk through a preset Kafka data migration management component.
4. The Kafka data management method according to claim 3, wherein said determining whether there is a redundant partition on each of said disks according to the number of partitions of each of said disks comprises:
if the partition number of the first disk is greater than the preset number, determining other partitions of the first disk except the preset number of partitions as redundant partitions, wherein the first disk is any one of the disks;
if the number of the partitions of the first disk is smaller than or equal to the preset number, determining that no redundant partition exists on the first disk.
5. The Kafka data management method of claim 1, wherein the method further comprises:
monitoring data states corresponding to a plurality of Brokers when Kafka data access is performed;
and pulling the ready data of a first Broker if the data of the first Broker is ready first, wherein the first Broker is any one of the plurality of Brokers.
6. The Kafka data management method according to claim 1, wherein said pulling the ready data of the first Broker comprises:
and returning corresponding data access success information.
7. The Kafka data management method according to any one of claims 1 to 6, further comprising:
determining the total cache capacity after expanding the cache capacity of the cache area based on the Raid card;
and performing Kafka data read-write operation according to the total cache capacity.
8. A Kafka data management apparatus, the Kafka data management apparatus comprising:
the monitoring module is used for monitoring the current batch partition Kafka data migration task corresponding to each disk, wherein each batch partition Kafka data migration task comprises at least one task;
the processing module is used for dividing the incomplete task into a next batch partition Kafka data migration task if the task is incomplete in the current batch partition Kafka data migration task;
and the execution module is used for executing each task in the next batch of partition Kafka data migration tasks.
9. A computer device, the computer device comprising a memory and a processor;
the memory is used for storing a computer program;
the processor for executing the computer program and for implementing the Kafka data management method according to any one of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the steps of the Kafka data management method according to any one of claims 1 to 7.
CN202310745862.6A 2023-06-20 2023-06-20 Kafka data management method, device, computer equipment and storage medium Pending CN116755858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310745862.6A CN116755858A (en) 2023-06-20 2023-06-20 Kafka data management method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310745862.6A CN116755858A (en) 2023-06-20 2023-06-20 Kafka data management method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116755858A true CN116755858A (en) 2023-09-15

Family

ID=87958694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310745862.6A Pending CN116755858A (en) 2023-06-20 2023-06-20 Kafka data management method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116755858A (en)

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination