CN116560966A - Data processing method and device for cluster monitor, cluster and medium - Google Patents

Data processing method and device for cluster monitor, cluster and medium

Info

Publication number
CN116560966A
Authority
CN
China
Prior art keywords
database
file
monitor
data
metadata log
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310821959.0A
Other languages
Chinese (zh)
Other versions
CN116560966B (en)
Inventor
刘鑫
王庆海
侯斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Application filed by Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310821959.0A
Publication of CN116560966A
Application granted
Publication of CN116560966B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1727 Details of free space management performed by the file system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the field of computers and discloses a data processing method and device for a cluster monitor, a monitor, a cluster, and a medium, which address the problem that calling only the compression interface of a database cannot always release occupied space. The method comprises: acquiring a database data file and a database metadata log file under the database directory of the monitor; determining a target file among a non-database file, the database data file, and the database metadata log file, the target file being a file whose occupation of the monitor database partition space exceeds a preset space threshold; when the target file is a database data file, compressing the database data file; when the target file is a database metadata log file, releasing the database metadata log file; and when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes. The invention can effectively release occupied space in the monitor database partition.

Description

Data processing method and device for cluster monitor, cluster and medium
Technical Field
The present invention relates to the field of computers, and in particular, to a data processing method and apparatus for a cluster monitor, a cluster, and a medium.
Background
An odd number of monitor (Mon) services are deployed in a distributed cluster to monitor the state of the cluster, and these services run on ordinary storage service nodes. Each monitor service stores its data in a local database. The directory holding the database can occupy a dedicated disk partition or can share a partition such as the system root partition. At deployment time, enough space must be reserved on the partition that holds the database: if the partition space is exhausted, the Mon service may exit because it can no longer store data, and if more than half of the Mon services exit, the cluster can no longer provide storage services externally.
Currently, when the occupied space or the occupied percentage of the Mon database partition exceeds a certain threshold, the Mon database is compressed. Compression calls the compression interface provided by the database, which compacts the database data files and deletes garbage data such as expired data versions. However, excessive occupation of the Mon database partition is not necessarily caused by a backlog of old data versions. Other causes are possible, for example an oversized database metadata file, or other files mistakenly placed in the directory. In these cases, calling only the compression interface provided by the database cannot release the occupied space.
Therefore, how to solve the above technical problem is a matter of concern to those skilled in the art.
Disclosure of Invention
An object of the present invention is to provide a data processing method and apparatus for a cluster monitor, a monitor, a distributed cluster, and a computer-readable storage medium that determine why the monitor database partition space is occupied and effectively release that space.
In order to solve the above technical problems, the present invention provides a data processing method of a cluster monitor, including:
acquiring a database data file and a database metadata log file under a database directory of a monitor;
determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold;
when the target file is a database data file, compressing the database data file;
when the target file is a database metadata log file, releasing the database metadata log file;
and when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes.
As one implementation, releasing the database metadata log file includes:
determining a first time for synchronizing database data files between monitors;
determining a second time to playback the database metadata log file when the monitor database is reopened;
determining a target release mode according to the magnitude relation between the first time and the second time;
and releasing the database metadata log file in a target release mode.
As one embodiment, determining the target release mode according to the magnitude relation between the first time and the second time includes:
when the first time is longer than the second time, determining that the target release mode is to re-open the monitor database;
and when the first time is not longer than the second time, determining that the target release mode is to reconstruct the monitor database.
As one implementation, when the target release mode is to reopen the monitor database, releasing the database metadata log file in the target release mode includes:
calling a close interface function and an open interface function of the monitor database to reopen the monitor database without stopping the monitor service, thereby releasing the database metadata log file.
As another implementation, when the target release mode is to reopen the monitor database, releasing the database metadata log file in the target release mode includes:
restarting the monitor service to release the database metadata log file.
As one implementation, when the target release mode is to reconstruct the monitor database, releasing the database metadata log file in the target release mode includes:
deleting a database data file and a database metadata log file under a monitor database directory;
synchronizing data from the other monitors to the reconstructed monitor database.
As one implementation, determining a first time for synchronizing database data files between monitors includes:
determining a first speed at which database data files are synchronized between monitors;
acquiring the size of a database data file;
the first time is determined based on the size of the database data file and the first speed.
As one embodiment, determining a second time to playback the database metadata log file when the monitor database is reopened includes:
determining a second speed of playback of the database metadata log file upon re-opening the monitor database;
acquiring the size of a database metadata log file;
and determining a second time according to the size of the database metadata log file and the second speed.
As one implementation, determining the first time based on the size of the database data file and the first speed includes:
determining the first time according to T1 = s1/v1;
where T1 is the first time, s1 is the size of the database data file, and v1 is the first speed.
As one embodiment, compressing the database data file comprises:
calling a compression function interface of the monitor database to compress the database data file.
As one implementation, transferring non-database files to a spare directory of other storage nodes includes:
counting non-database files which are not in a database file white list, and determining the sizes of all the non-database files;
judging whether the sizes of all the non-database files are larger than a first preset threshold value or not;
if the sizes of all the non-database files are larger than the first preset threshold value, packaging all the non-database files, and storing the packaged non-database files in a standby directory of other storage nodes;
and deleting the counted non-database files.
As one embodiment, the file name of the packaged non-database file includes a packaged timestamp.
As an embodiment, further comprising:
a standby directory is established at the other storage nodes.
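The packaging and transfer steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `package_and_remove`, the archive name pattern, and the use of a locally reachable path as the standby directory (for example a mounted directory of another node) are all hypothetical. The sketch also assumes that the counted files are deleted only after they have been archived.

```python
import os
import tarfile
import time

def package_and_remove(paths, standby_dir, size_threshold):
    """Archive `paths` into `standby_dir` if their total size exceeds the threshold.

    Returns the archive path, or None when the threshold is not exceeded.
    """
    total = sum(os.path.getsize(p) for p in paths)
    if total <= size_threshold:          # "first preset threshold" in the text
        return None
    os.makedirs(standby_dir, exist_ok=True)   # establish the standby directory
    stamp = time.strftime("%Y%m%d-%H%M%S")    # packaging timestamp in the name
    archive = os.path.join(standby_dir, f"non_db_files-{stamp}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for p in paths:
            tar.add(p, arcname=os.path.basename(p))
    for p in paths:
        os.remove(p)                     # delete only after the archive is closed
    return archive
```

Writing the archive completely before deleting the originals means a failure during packaging leaves the source files untouched.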
As one implementation, determining the target file includes:
judging whether the size of the database data file is larger than a second preset threshold value or not;
if the size of the database data file is larger than a second preset threshold value, determining the database data file as a target file;
judging whether the size of the database metadata log file is larger than a third preset threshold value or not;
if the size of the database metadata log file is larger than a third preset threshold value, determining the database metadata log file as a target file;
judging whether the size of the non-database file is larger than a fourth preset threshold value or not;
and if the size of the non-database file is larger than a fourth preset threshold value, determining the non-database file as the target file.
As an embodiment, further comprising:
judging whether the file size on the database file white list is larger than a fifth preset threshold value or not;
and if the file size on the database file white list is larger than a fifth preset threshold value, reporting alarm information.
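The whitelist check can be illustrated with a short Python sketch. The function name `check_whitelist_sizes` and the message format are hypothetical, and the sketch assumes the fifth preset threshold is applied to each whitelist file individually, which the text does not state explicitly.

```python
def check_whitelist_sizes(sizes, fifth_threshold):
    """Return one alarm message per whitelist file larger than the threshold.

    `sizes` maps a whitelist file name to its size in bytes.
    """
    return [
        f"whitelist file {name} is {size} bytes (limit {fifth_threshold})"
        for name, size in sorted(sizes.items())
        if size > fifth_threshold
    ]
```

An empty result means no alarm needs to be reported.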
As one implementation, before acquiring the non-database file, the database data file under the monitor database directory, and the database metadata log file, the method further includes:
identifying files under the monitor database directory to obtain the database data files and the database metadata log files.
As an embodiment, further comprising:
when data processing is triggered, the monitor sends an election message to the other monitors, the type of action to be executed in the election message being stop;
the monitor switches to an offline state, and processes the non-database file, the database data file and the database metadata log file in the offline state.
As an embodiment, further comprising:
the other monitors trigger an election according to the election message and generate a new quorum; the new quorum does not include the monitor;
when the monitor completes file processing in the offline state, the monitor leaves the offline state and probes the other monitors again;
when the election is complete, the monitor rejoins the new quorum.
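The offline maintenance sequence above can be modelled with a toy Python sketch. It is an assumption-laden illustration only: the `Cluster` class, its method names, and the state labels are hypothetical, and no real election protocol or network messaging is implemented.

```python
class Cluster:
    """Toy model of a monitor cluster that tracks only quorum membership."""

    def __init__(self, names):
        self.monitors = set(names)
        self.quorum = set(names)

    def handle_stop_election(self, sender):
        # Peers re-elect and form a new quorum that excludes the sender.
        self.quorum = self.monitors - {sender}

    def handle_rejoin(self, sender):
        # After a later election completes, the sender is back in the quorum.
        self.quorum = self.quorum | {sender}


def run_offline_maintenance(cluster, name, do_cleanup):
    """Walk one monitor through the stop / offline / probe / rejoin sequence."""
    states = ["online"]
    cluster.handle_stop_election(name)  # election message with action type "stop"
    states.append("offline")
    do_cleanup()                        # process files while outside the quorum
    states.append("probing")            # leave the offline state, probe the peers
    cluster.handle_rejoin(name)
    states.append("online")
    return states
```

The point of the sequence is that file processing happens only while the monitor is outside the quorum, so the remaining monitors keep serving the cluster during cleanup.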
The invention also provides a data processing device of the cluster monitor, which comprises:
the acquisition module is used for acquiring the non-database files, the database data files under the monitor database directory and the database metadata log files;
the determining module is used for determining the target file in the non-database file, the database data file and the database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold;
the calling module is used for compressing the database data file when the target file is the database data file;
the releasing module is used for releasing the database metadata log file when the target file is the database metadata log file;
and the transferring module is used for transferring the non-database file to the standby directory of other storage nodes when the target file is the non-database file.
The present invention also provides a monitor including:
a memory for storing a computer program;
and a processor for implementing the steps of any one of the data processing methods of the cluster monitor when executing the computer program.
The invention also provides a distributed cluster, which comprises the monitor described above.
The invention also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the data processing method of any of the above mentioned cluster monitors.
The invention provides a data processing method of a cluster monitor, which comprises the following steps: acquiring a database data file and a database metadata log file under a database directory of a monitor; determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold; when the target file is a database data file, compressing the database data file; when the target file is a database metadata log file, releasing the database metadata log file; and when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes.
The beneficial effects are that: the data processing method of the invention determines the files occupying larger partition space of the monitor database by acquiring the non-database files, the database data files and the database metadata log files, namely, determines the reason why the partition space of the monitor database is occupied, and then executes different processing actions on different files according to the determined types of the files so as to release the occupied space of the partition of the monitor database.
The present invention further provides an apparatus, monitor, distributed cluster, and computer-readable storage medium having the above advantages.
Drawings
To describe the embodiments of the invention or the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a method for processing data of a cluster monitor according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for releasing a database metadata log file according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a method for determining a first time for synchronizing database data files between monitors according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method for determining a second time for playback of a database metadata log file when a monitor database is reopened according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a second method for processing data of a cluster monitor according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method for processing data of a cluster monitor according to an embodiment of the present invention;
FIG. 7 is a flowchart of a method for processing data of a cluster monitor according to an embodiment of the present invention;
FIG. 8 is a flowchart of a method for processing data of a cluster monitor according to an embodiment of the present invention;
FIG. 9 is a block diagram of a data processing apparatus of a cluster monitor according to an embodiment of the present invention;
FIG. 10 is a block diagram of a monitor according to an embodiment of the present invention;
FIG. 11 is an application schematic diagram of a distributed cluster according to an embodiment of the present invention.
Detailed Description
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As described in the Background section, the current response to excessive occupation of the Mon database partition is to compress the Mon database. However, excessive occupation is not necessarily caused by a backlog of old data versions, in which case calling only the compression interface provided by the database cannot release the occupied space.
In view of the above, the present invention provides a data processing method of a cluster monitor. Referring to FIG. 1, the method includes:
Step S101: acquiring a database data file and a database metadata log file under the database directory of the monitor.
The non-database files include non-database files under the monitor database directory and non-database files under its parent directory. Files under the parent directory are included because they may reside on the same disk partition as the monitor database directory.
The monitor database is generally a key-value (KV) database such as LevelDB or RocksDB.
Database data files are typically files with the suffix sst, which store the database data. In both LevelDB and RocksDB, each sst file belongs to one level.
The database metadata log file may be a file prefixed with MANIFEST-, which records changes to database metadata. For some databases, such as LevelDB, the MANIFEST file keeps growing while the database instance runs and has no rollback mechanism, so invoking the compression function interface of the database does not release it.
The non-database files are files that are not on the database file whitelist and are neither database data files nor database metadata log files. The whitelist covers the files in the database directory other than the files with the suffix sst and the files with the prefix MANIFEST, such as CURRENT, IDENTITY, LOCK, keyring, done, kv _band, and store db. Files on the whitelist are normal functional files that the database or the Mon service needs in order to run; they generally occupy very little storage space and do not grow with the data volume or running time of the database.
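The file categories described above can be illustrated with a minimal Python sketch. The function name `classify_file`, the category labels, and the whitelist contents are assumptions for illustration; the whitelist here contains only a subset of the file names listed in the text.

```python
# Hypothetical whitelist: a subset of the functional file names listed above.
WHITELIST = {"CURRENT", "IDENTITY", "LOCK", "keyring", "done"}

def classify_file(name: str) -> str:
    """Return the category of a file found in or above the monitor database directory."""
    if name.endswith(".sst"):
        return "data"            # database data file
    if name.startswith("MANIFEST-"):
        return "metadata_log"    # database metadata log file
    if name in WHITELIST:
        return "whitelist"       # normal functional file, left in place
    return "non_database"        # everything else is a candidate for transfer
```

For example, `classify_file("000123.sst")` yields `"data"`, while an unrelated file such as `core.12345` falls into `"non_database"`.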
Step S102: determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold.
The target file is a file that occupies too much of the monitor database partition space, that is, a file that needs to be processed in order to release the occupied partition space. The present invention does not limit the preset space threshold, which may be chosen as needed.
As one implementation, determining the target file includes:
Step S1021: judging whether the size of the database data file is larger than a second preset threshold.
It should be noted that the present invention does not limit the second preset threshold, which may be set as needed.
Step S1022: if the size of the database data file is larger than the second preset threshold, determining the database data file as a target file.
Step S1023: if the size of the database data file is not larger than the second preset threshold, determining that the database data file is not a target file.
When the size of the database data file is larger than the second preset threshold, the database data file is oversized and occupies a large share of the monitor database partition space; otherwise, it occupies only a small share.
Step S1024: judging whether the size of the database metadata log file is larger than a third preset threshold.
It should be noted that the present invention does not limit the third preset threshold, which may be set as needed.
Step S1025: if the size of the database metadata log file is larger than the third preset threshold, determining the database metadata log file as a target file.
Step S1026: if the size of the database metadata log file is not larger than the third preset threshold, determining that the database metadata log file is not a target file.
When the size of the database metadata log file is larger than the third preset threshold, the database metadata log file is oversized and occupies a large share of the monitor database partition space; otherwise, it occupies only a small share.
Step S1027: judging whether the size of the non-database file is larger than a fourth preset threshold.
It should be noted that the present invention does not limit the fourth preset threshold, which may be set as needed.
Step S1028: if the size of the non-database file is larger than the fourth preset threshold, determining the non-database file as a target file.
Step S1029: if the size of the non-database file is not larger than the fourth preset threshold, determining that the non-database file is not a target file.
When the size of the non-database file is larger than the fourth preset threshold, the non-database file is oversized and occupies a large share of the monitor database partition space; otherwise, it occupies only a small share.
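Steps S1021 to S1029 amount to three independent threshold comparisons, which can be sketched in Python as follows. The `Thresholds` class, the category labels, and the function name `determine_targets` are hypothetical names introduced for illustration, and all sizes are assumed to be in bytes.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    data: int           # "second preset threshold", in bytes
    metadata_log: int   # "third preset threshold", in bytes
    non_database: int   # "fourth preset threshold", in bytes

def determine_targets(sizes, limits):
    """Return the file categories whose size is strictly above their threshold.

    `sizes` maps a category label to its total size in bytes.
    """
    targets = []
    for category in ("data", "metadata_log", "non_database"):
        if sizes.get(category, 0) > getattr(limits, category):
            targets.append(category)
    return targets
```

Because the comparisons are independent, more than one category can be a target in the same pass.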
Step S103: when the target file is a database data file, the database data file is compressed.
Compressing database data files will merge the different levels while deleting the old version of the data, freeing up space.
As one embodiment, compressing the database data file comprises:
calling the compression (compact) function interface of the monitor database to compress the database data file.
Step S104: when the target file is a database metadata log file, releasing the database metadata log file.
Step S105: when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes.
The standby directory may be, for example, a system log directory; the present invention does not specifically limit it.
It should be noted that steps S103, S104 and S105 may be performed in any order, all of which fall within the scope of the present invention.
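The per-type handling in steps S103 to S105 can be viewed as a dispatch from file category to action. The following Python sketch illustrates this under stated assumptions: the function name `process_targets`, the category labels, and the three callbacks are hypothetical stand-ins for the real compaction, release, and transfer operations.

```python
def process_targets(targets, compact, release_metadata_log, transfer_non_database):
    """Run the action matching each target category and return the categories handled."""
    actions = {
        "data": compact,                        # step S103
        "metadata_log": release_metadata_log,   # step S104
        "non_database": transfer_non_database,  # step S105
    }
    performed = []
    for category in targets:
        actions[category]()    # the order follows `targets`; any order is allowed
        performed.append(category)
    return performed
```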
The data processing method of the invention acquires the non-database files, the database data files and the database metadata log files, and determines which of them occupy a large share of the monitor database partition space; that is, it determines why the partition space is occupied. It then applies a different processing action to each type of file so as to release the occupied space and avoid Mon failures caused by the monitor database partition filling up.
Furthermore, the present invention has the following advantages: 1. Stability: Mon database occupancy can be kept at a low level over the long term, so the system runs stably. 2. Safety: serious problems caused by a full Mon partition, such as Mon failures or even an unavailable cluster, are avoided. 3. Low cost: the additional resource consumption introduced by the invention stays at a low level. 4. Compatibility: the technique can be integrated into existing distributed storage engines.
Based on the above embodiments, in one embodiment of the present invention, referring to FIG. 2, releasing the database metadata log file includes:
Step S201: determining a first time for synchronizing database data files between monitors.
The cluster comprises a plurality of monitors, and the monitors can synchronize data with one another.
If a monitor is missing data, it synchronizes the data from the other monitors at the beginning of an election.
Step S202: determining a second time for playing back the database metadata log file when the monitor database is reopened.
Step S203: determining a target release mode according to the magnitude relation between the first time and the second time.
As one embodiment, determining the target release mode according to the magnitude relation between the first time and the second time includes:
Step S2031: when the first time is longer than the second time, determining that the target release mode is to reopen the monitor database.
Step S2032: when the first time is not longer than the second time, determining that the target release mode is to reconstruct the monitor database.
In one embodiment of the present invention, before the target release mode is determined, the method may further include:
judging whether the first time is longer than the second time.
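The decision in steps S2031 and S2032 reduces to a single comparison, sketched below in Python. The function name `choose_release_mode` and the returned labels are hypothetical; both times are assumed to be expressed in the same unit.

```python
def choose_release_mode(t1: float, t2: float) -> str:
    """Pick the release mode from the estimated sync time t1 and playback time t2."""
    if t1 > t2:
        return "reopen"    # step S2031: playback is faster than a full resync
    return "rebuild"       # step S2032: covers t1 <= t2, including equality
```

Note that equal times select reconstruction, matching the "not longer than" wording of step S2032.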
Step S204: releasing the database metadata log file in the target release mode.
In one embodiment of the present invention, when the target release mode is to reopen the monitor database, the database metadata log file can be released in either of two ways.
As one implementation, releasing the database metadata log file in the target release mode includes:
calling a close interface function and an open interface function of the monitor database to reopen the monitor database without stopping the monitor service, thereby releasing the database metadata log file.
As another embodiment, releasing the database metadata log file in the target release mode includes:
restarting the monitor service to release the database metadata log file.
In another embodiment of the present invention, when the target release mode is to reconstruct the monitor database, releasing the database metadata log file in the target release mode includes:
deleting a database data file and a database metadata log file under a monitor database directory;
synchronizing data from the other monitors to the reconstructed monitor database.
The data, i.e., database data files and database metadata log files, are synchronized from the other monitors.
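The reconstruction path can be sketched in Python as a delete step followed by a synchronization step. This is a hedged illustration only: the function names are hypothetical, `sync_from_peers` is a placeholder callback standing in for whatever mechanism pulls data from the other monitors, and the sketch assumes the monitor is already out of the quorum so that nothing else is writing to the directory.

```python
import os

def clear_database_files(db_dir):
    """Delete *.sst and MANIFEST-* files in db_dir and return their names, sorted."""
    with os.scandir(db_dir) as entries:
        doomed = [
            e.name for e in entries
            if e.is_file()
            and (e.name.endswith(".sst") or e.name.startswith("MANIFEST-"))
        ]
    for name in doomed:
        os.remove(os.path.join(db_dir, name))
    return sorted(doomed)

def rebuild_database(db_dir, sync_from_peers):
    """Clear the local database files, then resynchronize from the other monitors."""
    removed = clear_database_files(db_dir)
    sync_from_peers(db_dir)   # hypothetical callback: pulls data from other monitors
    return removed
```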
Based on the above embodiments, in one embodiment of the present invention, referring to FIG. 3, determining a first time for synchronizing database data files between monitors includes:
Step S301: determining a first speed at which database data files are synchronized between monitors.
The first speed of synchronizing database data files between monitors may be estimated based on empirical data of cluster operation.
Step S302: the size of the database data file is obtained.
The size of the database data file is the total size of all database data files, which can be obtained when compression is performed.
Step S303: the first time is determined based on the size of the database data file and the first speed.
The determination formula of the first time is as follows:
T1=s1/v1 (1)
where T1 is the first time, s1 is the size of the database data file, and v1 is the first speed.
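As a worked illustration of formula (1), the following Python sketch sums the sizes of the database data files in a directory and divides by an assumed synchronization speed. The function names `total_size` and `first_time` are hypothetical, and the speed value in the example is made up for illustration rather than taken from the source.

```python
import os

def total_size(directory, suffix=".sst"):
    """Sum the sizes, in bytes, of all files in `directory` ending with `suffix`."""
    total = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                total += entry.stat().st_size
    return total

def first_time(s1, v1):
    """T1 = s1 / v1, in seconds when s1 is in bytes and v1 in bytes per second."""
    if v1 <= 0:
        raise ValueError("v1 must be positive")
    return s1 / v1
```

For example, with s1 = 2 GiB and an assumed v1 = 64 MiB/s, `first_time(2 * 1024**3, 64 * 1024**2)` gives 32.0 seconds.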
Based on the above embodiments, in one embodiment of the present invention, please refer to fig. 4, determining the second time for playing back the database metadata log file when the monitor database is re-opened includes:
step S401: a second speed of playback of the database metadata log file upon re-opening the monitor database is determined.
The second speed may be obtained by testing how quickly the database used by the monitor plays back the database metadata log file when the database is reopened.
For the LevelDB database, the database metadata log file records the metadata changes made while a database instance is running. When the database is opened again and a new database instance is created, the file is played back; the larger the file, the longer the playback takes. After playback is completed, the file is deleted and a new database metadata log file is created, which grows from 0. The time taken to reopen the database is therefore proportional to the size of the original database metadata log file.
Step S402: the size of the database metadata log file is obtained.
Step S403: and determining a second time according to the size of the database metadata log file and the second speed.
The determination formula of the second time is:
T2=s2/v2 (2)
where T2 is the second time, s2 is the size of the database metadata log file, and v2 is the second speed.
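The two estimates and the resulting choice of release manner can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function and field names (`choose_release_mode`, `ReleaseEstimate`) and the byte-per-second units are hypothetical, and the decision rule follows the embodiment in which the monitor database is reopened when the first time is greater than the second time and reconstructed otherwise.

```python
from dataclasses import dataclass


@dataclass
class ReleaseEstimate:
    sync_seconds: float    # T1 = s1 / v1
    replay_seconds: float  # T2 = s2 / v2
    mode: str              # "reopen" or "rebuild" (hypothetical labels)


def choose_release_mode(data_bytes, sync_bytes_per_s,
                        metadata_log_bytes, replay_bytes_per_s):
    """Estimate T1 and T2 and pick the release manner.

    data_bytes          -- s1, total size of the database data files
    sync_bytes_per_s    -- v1, estimated synchronization speed between monitors
    metadata_log_bytes  -- s2, size of the database metadata log file
    replay_bytes_per_s  -- v2, measured playback speed on reopen
    """
    if sync_bytes_per_s <= 0 or replay_bytes_per_s <= 0:
        raise ValueError("speeds must be positive")
    t1 = data_bytes / sync_bytes_per_s
    t2 = metadata_log_bytes / replay_bytes_per_s
    # Reopen when synchronizing would take longer than replaying the log;
    # otherwise rebuild the database from the other monitors.
    mode = "reopen" if t1 > t2 else "rebuild"
    return ReleaseEstimate(t1, t2, mode)
```

For example, with 10 GiB of data files synchronized at 100 MiB/s and a 2 GiB metadata log replayed at 50 MiB/s, T1 exceeds T2, so the sketch selects reopening.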
On the basis of any of the foregoing embodiments, in one embodiment of the present invention, referring to fig. 5, a data processing method of a cluster monitor includes:
step S501: and acquiring a database data file and a database metadata log file under the database directory of the monitor.
Step S502: determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold.
Step S503: when the target file is a database data file, the database data file is compressed.
Step S504: and when the target file is a database metadata log file, releasing the database metadata log file.
Step S505: and when the target file is a non-database file, counting the non-database files which are not in the white list of the database file, and determining the sizes of all the non-database files.
Step S506: and judging whether the sizes of all the non-database files are larger than a first preset threshold value.
It should be noted that the present invention does not specifically limit the first preset threshold, which may be set as needed.
Step S507: and if the sizes of all the non-database files are larger than a first preset threshold value, packaging all the non-database files, and storing the packaged non-database files to standby directories of other storage nodes.
The standby directory may be established before or during the execution of the data processing method.
As one implementation, packaging all non-database files includes: all non-database files are packaged using tar commands. However, the invention is not particularly limited in this regard, and in other embodiments of the invention packaging all non-database files includes: all non-database files are packaged using zip commands.
In one embodiment of the invention, the file name of the packaged non-database file includes the packaging timestamp, so that the packaging time can be read from the file name.
Step S508: and deleting the counted non-database files.
When the non-database files are deleted, all the counted non-database files are traversed and deleted in sequence.
Step S509: if the sizes of all the non-database files are not larger than the first preset threshold value, the non-database files do not need to be processed.
It should be noted that, for step S501, step S502, step S503 and step S504, reference may be made to the above embodiments; the details are not repeated here.
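Steps S505 to S509 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `archive_non_db_files` and the archive name prefix are hypothetical, the white list is modeled as a set of exact file names, and the standby directory of another storage node is represented by a local path (for example, a remote directory that is already mounted). Python's `tarfile` module stands in for the tar command.

```python
import os
import tarfile
import time


def archive_non_db_files(db_dir, whitelist, standby_dir, threshold_bytes):
    """Pack files not on the white list into a timestamped archive, then delete them.

    Returns the archive path, or None when the total size does not exceed
    the threshold and nothing needs to be done.
    """
    # S505: count the files that are not on the database file white list.
    candidates = [
        os.path.join(db_dir, name)
        for name in sorted(os.listdir(db_dir))
        if name not in whitelist and os.path.isfile(os.path.join(db_dir, name))
    ]
    total = sum(os.path.getsize(path) for path in candidates)

    # S506 / S509: no processing when the total size is within the threshold.
    if total <= threshold_bytes:
        return None

    # S507: package the files; the archive name carries the packaging timestamp.
    os.makedirs(standby_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = os.path.join(standby_dir, f"mon-non-db-{stamp}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        for path in candidates:
            tar.add(path, arcname=os.path.basename(path))

    # S508: traverse the counted files and delete them in sequence.
    for path in candidates:
        os.remove(path)
    return archive
```

Deleting only after the archive has been closed means a failure during packaging leaves the original files in place.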
On the basis of the foregoing embodiments, in one embodiment of the present invention, the data processing method of the cluster monitor may further include:
a standby directory is established at the other storage nodes.
The standby directory may be created under a system log directory, such as the /var/log directory. Typically, a distributed storage node mounts this directory on a larger disk partition when the local operating system is installed.
The process of establishing the standby directory may be performed before step S507 in the above-described embodiment.
On the basis of any of the foregoing embodiments, in one embodiment of the present invention, please refer to fig. 6, a data processing method of a cluster monitor includes:
step S601: and acquiring a database data file and a database metadata log file under the database directory of the monitor.
Step S602: determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold.
Step S603: when the target file is a database data file, the database data file is compressed.
Step S604: and when the target file is a database metadata log file, releasing the database metadata log file.
Step S605: and when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes.
Step S606: and judging whether the file size on the database file white list is larger than a fifth preset threshold value.
In the present invention, the fifth preset threshold is not limited, and may be set according to needs.
In this step, the file size on the database file whitelist refers to the total size of all files on the database file whitelist.
Step S607: and if the file size on the database file white list is larger than a fifth preset threshold value, reporting alarm information.
When the file size on the database file white list is larger than the fifth preset threshold, this indicates an abnormal situation that cannot be handled automatically. In this case, alarm information is reported so that operation and maintenance personnel can locate the cause and handle it manually.
Step S608: if the file size on the database file white list is not greater than the fifth preset threshold value, no processing is needed.
It should be noted that, for step S601, step S602, step S603, step S604 and step S605, reference may be made to the above embodiments; the details are not repeated here.
On the basis of any of the foregoing embodiments, in one embodiment of the present invention, referring to fig. 7, a data processing method of a cluster monitor includes:
step S701: and identifying files under the monitor database directory to obtain database data files and database metadata log files.
The files under the monitor database directory are identified by their file-name suffixes and prefixes, so as to distinguish database data files from database metadata log files and thereby determine which type of file is occupying the storage partition space.
Step S702: and acquiring a database data file and a database metadata log file under the database directory of the monitor.
Step S703: determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold.
Step S704: when the target file is a database data file, the database data file is compressed.
Step S705: and when the target file is a database metadata log file, releasing the database metadata log file.
Step S706: and when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes.
It should be noted that, for step S702, step S703, step S704, step S705 and step S706, reference may be made to the above embodiments; the details are not repeated here.
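The identification by prefix and suffix in step S701 can be sketched as follows. The naming rules used here are an assumption for illustration: they follow the usual LevelDB on-disk layout, in which table files end in `.ldb` (or `.sst` in older versions) and the manifest file is named `MANIFEST-` followed by a number, and they treat the manifest as the database metadata log file. The function names and category labels are hypothetical.

```python
def classify_file(name):
    """Return 'data', 'metadata_log', 'other_db' or 'non_db' for a file name."""
    if name.startswith("MANIFEST-"):
        return "metadata_log"          # assumed to be the metadata log file
    if name.endswith((".ldb", ".sst")):
        return "data"                  # table (database data) files
    if name in {"CURRENT", "LOCK", "LOG", "LOG.old"} or name.endswith(".log"):
        return "other_db"              # other files the database itself creates
    return "non_db"


def sizes_by_kind(entries):
    """Sum sizes per category for an iterable of (file name, size) pairs."""
    totals = {}
    for name, size in entries:
        kind = classify_file(name)
        totals[kind] = totals.get(kind, 0) + size
    return totals
```

The per-category totals are what a later step would compare against the preset thresholds to decide which type of file is the target file.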
On the basis of any of the foregoing embodiments, in one embodiment of the present invention, referring to fig. 8, a data processing method of a cluster monitor includes:
step S801: and acquiring a database data file and a database metadata log file under the database directory of the monitor.
Step S802: determining a target file in a non-database file, a database data file and a database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold.
Step S803: when the target file is a database data file, the database data file is compressed.
Step S804: and when the target file is a database metadata log file, releasing the database metadata log file.
Step S805: and when the target file is a non-database file, transferring the non-database file to a standby directory of other storage nodes.
Step S806: when triggering data processing, the monitor sends election messages to other monitors; the type of action performed in the election message is stop.
The triggering mode of the data processing can be manual triggering or automatic triggering.
The monitor is the main monitor and the other monitors are standby monitors; that is, the main monitor sends an election message to the standby monitors. The election message carries a parameter indicating the type of action to be executed, which in this step is stop.
Step S807: the state of the monitor is switched to an off-line state, and the non-database file, the database data file and the database metadata log file are processed in the off-line state.
After the monitor is switched to the off-line state, the monitor does not participate in election any more and does not accept connection requests of other services. The monitor processes the file in an offline state, and the specific processing mode adopts the processing mode described in the above embodiment according to the type of the file.
It should be noted that, for step S801, step S802, step S803, step S804 and step S805, reference may be made to the above embodiments; the details are not repeated here.
In this embodiment, the monitor may process the file in an offline state, so as to avoid the problem of backlog of Mon messages in the memory caused by long file processing time.
On the basis of the above embodiment, the data processing method of the cluster monitor may further include:
the other monitors trigger an election according to the election message and generate a new quorum (Quorum); the new quorum does not include the monitor;
when the monitor completes the file processing in the offline state, the monitor exits the offline state and re-probes the other monitors;
when the election is complete, the monitor rejoins the new quorum.
While the monitor is in the offline state, the other monitors hold an election; when the monitor finishes processing the files, it re-probes (probe) the other monitors.
When the election among the other monitors is complete, the monitor joins the Quorum.
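The offline processing flow described above can be sketched as a small state machine. This is an illustrative sketch only: the class, state and message field names are hypothetical, messages are recorded in a list rather than sent over a network, and the election among the other monitors is represented only by its result.

```python
from enum import Enum


class MonState(Enum):
    IN_QUORUM = "in_quorum"
    OFFLINE = "offline"
    PROBING = "probing"


class Monitor:
    def __init__(self, name, peers):
        self.name = name
        self.peers = list(peers)
        self.state = MonState.IN_QUORUM
        self.sent = []  # (peer, message) pairs, standing in for network sends

    def start_processing(self):
        """Send an election message with action type 'stop', then go offline."""
        for peer in self.peers:
            self.sent.append((peer, {"type": "election", "action": "stop",
                                     "from": self.name}))
        self.state = MonState.OFFLINE

    def accepts_connections(self):
        """An offline monitor does not accept connection requests."""
        return self.state is not MonState.OFFLINE

    def finish_processing(self):
        """Exit the offline state and re-probe the other monitors."""
        if self.state is not MonState.OFFLINE:
            raise RuntimeError("monitor is not offline")
        self.state = MonState.PROBING
        for peer in self.peers:
            self.sent.append((peer, {"type": "probe", "from": self.name}))

    def on_election_complete(self, quorum):
        """Rejoin the quorum formed by the other monitors."""
        if self.state is not MonState.PROBING:
            raise RuntimeError("monitor has not probed its peers")
        self.state = MonState.IN_QUORUM
        return sorted(set(quorum) | {self.name})
```

The file processing itself would run between `start_processing` and `finish_processing`, while the monitor is outside the quorum.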
The following describes a data processing apparatus of a cluster monitor according to an embodiment of the present invention, and the data processing apparatus of a cluster monitor described below and the data processing method of a cluster monitor described above may be referred to correspondingly.
Fig. 9 is a block diagram of a data processing apparatus of a cluster monitor according to an embodiment of the present invention, and referring to fig. 9, the data processing apparatus of a cluster monitor may include:
the acquiring module 100 is configured to acquire a non-database file, a database data file under a monitor database directory, and a database metadata log file;
a determining module 200, configured to determine a target file among the non-database file, the database data file, and the database metadata log file; the target file is a file occupying the partition space of the monitor database beyond a preset space threshold;
the calling module 300 is configured to compress the database data file when the target file is the database data file;
the releasing module 400 is configured to release the database metadata log file when the target file is the database metadata log file;
and the transferring module 500 is configured to transfer the non-database file to the spare directory of the other storage node when the target file is the non-database file.
The data processing apparatus of the cluster monitor of this embodiment is used to implement the foregoing data processing method of the cluster monitor, so its specific implementation can be found in the method embodiments above. For example, the acquiring module 100, the determining module 200, the calling module 300, the releasing module 400 and the transferring module 500 implement steps S101, S102, S103, S104 and S105 of the foregoing data processing method of the cluster monitor, respectively. For details, refer to the descriptions of the corresponding embodiments, which are not repeated here.
As one embodiment, the release module 400 includes:
a first determining sub-module for determining a first time between monitors at which database data files are synchronized;
a second determining sub-module for determining a second time to play back the database metadata log file when the monitor database is re-opened;
the third determining submodule is used for determining a target release mode according to the magnitude relation between the first time and the second time;
and the release sub-module is used for releasing the database metadata log file in a target release mode.
As an embodiment, the third determining submodule includes:
the first determining unit is used for determining that the target release mode is to re-open the monitor database when the first time is longer than the second time;
and the second determining unit is used for determining that the target release mode is to reconstruct the monitor database when the first time is not more than the second time.
As an implementation manner, when the target release manner is to reopen the monitor database, the release sub-module is specifically configured to: and calling a closing interface function and an opening interface function of the monitor database to reopen the monitor database under the condition that the monitor service is not stopped, and releasing the database metadata log file.
As an implementation manner, when the target release manner is to reopen the monitor database, the release sub-module is specifically configured to: the monitor service is restarted to release the database metadata log file.
As an embodiment, when the target release manner is to reconstruct the monitor database, the release submodule includes:
the deleting unit is used for deleting the database data file and the database metadata log file under the monitor database directory;
and the synchronization unit is used for synchronizing data from other monitors to the reconstructed monitor database.
As an embodiment, the first determining submodule includes:
a third determining unit for determining a first speed of synchronizing database data files between monitors;
the first acquisition unit is used for acquiring the size of the database data file;
and the fourth determining unit is used for determining the first time according to the size of the database data file and the first speed.
As an embodiment, the second determining submodule includes:
a fifth determining unit for determining a second speed of playback of the database metadata log file when the monitor database is re-opened;
the second acquisition unit is used for acquiring the size of the database metadata log file;
And a sixth determining unit, configured to determine a second time according to the size of the database metadata log file and the second speed.
As an embodiment, the fourth determining unit is specifically configured to:
determining a first time according to T1=s1/v1;
wherein T1 is a first time, s1 is a size of a database data file, and v1 is a first speed.
As one implementation, the calling module 300 is specifically configured to call a compression function interface of the monitor database and compress the database data file.
As one embodiment, the transfer module 500 includes:
the statistics sub-module is used for counting non-database files which are not in the database file white list and determining the sizes of all the non-database files;
the first judging submodule is used for judging whether the sizes of all the non-database files are larger than a first preset threshold value or not;
the packaging and storing sub-module is used for packaging all the non-database files if the sizes of all the non-database files are larger than a first preset threshold value, and storing the packaged non-database files to the standby directories of other storage nodes;
and the deleting sub-module is used for deleting the counted non-database files.
As one embodiment, when the packaging and storing sub-module performs packaging, the file name of the packaged non-database file includes a packaging timestamp.
As one embodiment, the transfer module 500 further includes:
and the establishing submodule is used for establishing standby directories at other storage nodes.
As one embodiment, the determining module 200 includes:
the second judging submodule is used for judging whether the size of the database data file is larger than a second preset threshold value or not;
a fourth determining submodule, configured to determine the database data file as a target file if the size of the database data file is greater than a second preset threshold;
the third judging sub-module is used for judging whether the size of the database metadata log file is larger than a third preset threshold value or not;
a fifth determining submodule, configured to determine the database metadata log file as a target file if the size of the database metadata log file is greater than a third preset threshold;
a fourth judging sub-module, configured to judge whether the size of the non-database file is greater than a fourth preset threshold;
and the sixth determining submodule is used for determining the non-database file as the target file if the size of the non-database file is larger than a fourth preset threshold value.
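The per-type threshold checks performed by the determining module, together with the handling selected for each target file, can be sketched as follows. The function names, category labels and action labels are hypothetical; the second, third and fourth preset thresholds are passed in as a dictionary keyed by file type.

```python
# Handling per target file type, as described in the method embodiments.
ACTIONS = {"data": "compress", "metadata_log": "release", "non_db": "transfer"}


def determine_targets(sizes, thresholds):
    """Return the file types whose total size exceeds their preset threshold."""
    return [kind for kind in ("data", "metadata_log", "non_db")
            if sizes.get(kind, 0) > thresholds[kind]]


def plan(sizes, thresholds):
    """Pair each target file type with the handling it calls for."""
    return [(kind, ACTIONS[kind]) for kind in determine_targets(sizes, thresholds)]
```

A type whose size equals its threshold is not a target, matching the "larger than" wording of the sub-modules.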
As an embodiment, the data processing apparatus of the cluster monitor may further include:
the judging module is used for judging whether the file size on the database file white list is larger than a fifth preset threshold value or not;
And the reporting module is used for reporting alarm information if the file size on the database file white list is larger than a fifth preset threshold value.
As an embodiment, the data processing apparatus of the cluster monitor may further include:
and the identification module is used for identifying the files under the monitor database directory so as to acquire database data files and database metadata log files.
As an embodiment, the data processing apparatus of the cluster monitor may further include:
the sending module is used for sending election messages to other monitors when the data processing is triggered; the type of the action executed in the election message is stop;
and the switching and processing module is used for switching the state of the monitor to an offline state and processing the non-database file, the database data file and the database metadata log file in the offline state.
As an embodiment, the data processing apparatus of the cluster monitor may further include:
the triggering module is used for the other monitors to trigger an election according to the election message and generate a new quorum; the new quorum does not include the monitor;
the exit module is used for exiting the offline state and re-probing the other monitors when the monitor completes the file processing in the offline state;
and the joining module is used for rejoining the monitor to the new quorum when the election is complete.
The monitor provided by the embodiment of the present invention is described below, and the monitor described below and the data processing method of the cluster monitor described above may be referred to correspondingly.
A monitor, please refer to fig. 10, comprising:
a memory 11 for storing a computer program;
a processor 12 for implementing the steps of the data processing method of the cluster monitor of any of the embodiments described above when executing a computer program.
The invention also provides a distributed cluster, which comprises the monitor of the embodiment.
The distributed cluster also includes other monitors; the total number of monitors may be set as appropriate.
Referring to fig. 11, a schematic diagram of an application of a distributed cluster is shown, in which the monitor monitors and manages the cluster state; the client interacts with the monitor to query the state; the client reads and writes data through object storage devices (Object-based Storage Device, OSD), and each object storage device interacts with a hard disk.
An odd number of Mon services are required for a distributed cluster, typically 3, 5 or 7 Mon services are deployed depending on the size of the distributed cluster. The Mon service is deployed on common storage nodes, and the distributed cluster has no special management or monitoring nodes. For example, a 5-node distributed storage cluster may be deployed, where 3 nodes among the 5 storage nodes may be selected as Mon nodes, each node deploying a Mon service.
The Mon services are interconnected through a network to form a small Mon distributed cluster. One Mon service is elected as the main Mon, and the remaining Mon services act as standby Mons. When a Mon fails, a failed Mon recovers, or Mon members are added or removed, a Mon election is triggered again and a new main Mon is selected.
Map data is shared among the Mon services through the Paxos protocol, and each Mon stores its data in a local database.
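One common reason for deploying an odd number of Mon services can be illustrated with majority arithmetic. This sketch assumes that a quorum requires a strict majority of the Mon services, which is the usual rule for Paxos-style protocols but is not stated explicitly in this description; the function names are hypothetical.

```python
def majority(n):
    """Smallest number of Mon services that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """Number of Mon services that can fail while a majority remains."""
    return n - majority(n)
```

Under this assumption, 3, 5 and 7 Mon services tolerate 1, 2 and 3 failures respectively, while 4 Mon services tolerate only 1, the same as 3, so an even count adds a service without adding fault tolerance.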
The following describes a computer readable storage medium provided in an embodiment of the present invention, and the computer readable storage medium described below and the data processing method of the cluster monitor described above may be referred to correspondingly.
A computer readable storage medium having a computer program stored thereon, which when executed by a processor performs the steps of the data processing method of a cluster monitor of any of the embodiments described above.
In this specification, the embodiments are described progressively; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The data processing method, apparatus, monitor, distributed cluster and computer readable storage medium of the cluster monitor provided by the present invention are described in detail above. The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present invention and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.

Claims (22)

1. A data processing method for a cluster monitor, comprising:
acquiring a database data file and a database metadata log file under a database directory of a monitor;
determining a target file in the non-database file, the database data file and the database metadata log file; the target file is a file occupying the partition space of the monitor database to exceed a preset space threshold;
when the target file is the database data file, compressing the database data file;
releasing the database metadata log file when the target file is the database metadata log file;
and when the target file is the non-database file, transferring the non-database file to a standby directory of other storage nodes.
2. The data processing method of a cluster monitor of claim 1, wherein said releasing the database metadata log file comprises:
determining a first time between monitors to synchronize the database data files;
determining a second time to play back the database metadata log file when the monitor database is re-opened;
Determining a target release mode according to the magnitude relation between the first time and the second time;
and releasing the database metadata log file in the target release mode.
3. The data processing method of the cluster monitor according to claim 2, wherein determining the target release pattern according to the magnitude relation between the first time and the second time comprises:
when the first time is greater than the second time, determining that the target release mode is to re-open the monitor database;
and when the first time is not greater than the second time, determining that the target release mode is to reconstruct the monitor database.
4. The data processing method of a cluster monitor of claim 3, wherein releasing the database metadata log file by the target release mode when the target release mode is to reopen the monitor database comprises:
and under the condition that the monitor service is not stopped, calling a closing interface function and an opening interface function of the monitor database to reopen the monitor database, and releasing the database metadata log file.
5. The data processing method of a cluster monitor of claim 3, wherein releasing the database metadata log file by the target release mode when the target release mode is to reopen the monitor database comprises:
Restarting the monitor service to release the database metadata log file.
6. The data processing method of a cluster monitor of claim 3, wherein releasing the database metadata log file by the target release manner when the target release manner is to reconstruct the monitor database comprises:
deleting the database data file and the database metadata log file under the monitor database directory;
synchronizing data from the other monitors to the reconstructed monitor database.
7. The data processing method of cluster monitors of claim 2, wherein determining a first time for synchronizing database data files between monitors comprises:
determining a first speed of synchronizing the database data files between monitors;
acquiring the size of the database data file;
and determining the first time according to the size of the database data file and the first speed.
8. The data processing method of a cluster monitor of claim 2, wherein determining a second time to play back the database metadata log file when the monitor database is reopened comprises:
Determining a second speed of playback of the database metadata log file when the monitor database is reopened;
acquiring the size of the database metadata log file;
and determining the second time according to the size of the database metadata log file and the second speed.
9. The data processing method of a cluster monitor of claim 7, wherein determining the first time based on the size of the database data file and the first speed comprises:
determining the first time according to T1=s1/v1;
wherein T1 is a first time, s1 is a size of a database data file, and v1 is a first speed.
10. The data processing method of a cluster monitor of claim 1, wherein said compressing said database data file comprises:
and calling a compression function interface of the monitor database and compressing the database data file.
11. The data processing method of a cluster monitor of claim 1, wherein transferring the non-database file to a spare directory of other storage nodes comprises:
counting the non-database files which are not in the database file white list, and determining the sizes of all the non-database files;
Judging whether the sizes of all the non-database files are larger than a first preset threshold value or not;
if the sizes of all the non-database files are larger than the first preset threshold value, packaging all the non-database files, and storing the packaged non-database files to standby directories of other storage nodes;
and deleting the counted non-database files.
12. The data processing method of a cluster monitor of claim 11, wherein the file name of the packaged non-database file includes a packaged timestamp.
13. The data processing method of a cluster monitor of claim 11, further comprising:
and establishing the standby directory at the other storage nodes.
14. The data processing method of a cluster monitor of claim 1, wherein determining the target file comprises:
judging whether the size of the database data file is larger than a second preset threshold value or not;
if the size of the database data file is larger than the second preset threshold value, determining the database data file as a target file;
judging whether the size of the database metadata log file is larger than a third preset threshold value or not;
If the size of the database metadata log file is larger than the third preset threshold value, determining the database metadata log file as a target file;
judging whether the size of the non-database file is larger than a fourth preset threshold value or not;
and if the size of the non-database file is larger than the fourth preset threshold value, determining the non-database file as a target file.
15. The data processing method of a cluster monitor of claim 1, further comprising:
judging whether the file size on the database file white list is larger than a fifth preset threshold value or not;
and if the file size on the database file white list is larger than the fifth preset threshold value, reporting alarm information.
16. The data processing method of a cluster monitor of claim 1, further comprising, before acquiring the non-database file, the database data file under the monitor database directory and the database metadata log file:
and identifying files under the monitor database directory to obtain database data files and the database metadata log files.
17. The data processing method of a cluster monitor of any one of claims 1 to 16, further comprising:
when data processing is triggered, sending, by the monitor, election messages to the other monitors, wherein the type of action carried in each election message is stop;
and switching the monitor to an offline state, and processing the non-database file, the database data file and the database metadata log file in the offline state.
18. The data processing method of a cluster monitor of claim 17, further comprising:
triggering election by the other monitors according to the election message and generating a new quorum; the new quorum does not include the monitor;
when the monitor completes the file processing in the offline state, exiting the offline state and detecting the other monitors again;
when the election is complete, the monitor rejoins the new quorum.
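The sequence in claims 17 and 18 can be modelled with a toy simulation. It is not the real election protocol: the class `Cluster`, the two-value `State` enum and the rule "every online monitor joins the quorum" are simplifying assumptions, used only to show that the monitor is outside the quorum while it processes files and inside it again afterwards.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, Set


class State(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


@dataclass
class Cluster:
    states: Dict[str, State]
    quorum: Set[str] = field(default_factory=set)

    def elect(self) -> None:
        # Toy election: every monitor that is not offline joins the quorum.
        self.quorum = {m for m, s in self.states.items() if s is State.ONLINE}


def process_offline(cluster: Cluster, mon: str,
                    work: Callable[[], None]) -> None:
    """Take `mon` out of the quorum, run `work`, then let it rejoin."""
    # Claim 17: the "stop" election message is modelled as going offline
    # and having the remaining monitors hold an election without `mon`.
    cluster.states[mon] = State.OFFLINE
    cluster.elect()

    # File processing happens while the monitor is outside the quorum.
    work()

    # Claim 18: leave the offline state and take part in a new election.
    cluster.states[mon] = State.ONLINE
    cluster.elect()
```

For example, with monitors `a`, `b` and `c`, calling `process_offline(cluster, "a", work)` runs `work` while the quorum is `{b, c}` and ends with all three monitors in the quorum.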
19. A data processing apparatus for a cluster monitor, comprising:
the acquisition module is used for acquiring the non-database files, the database data files under the monitor database directory and the database metadata log files;
the determining module is used for determining a target file among the non-database file, the database data file and the database metadata log file; the target file is a file whose occupation of the partition space of the monitor database exceeds a preset space threshold;
the calling module is used for compressing the database data file when the target file is the database data file;
the releasing module is used for releasing the database metadata log file when the target file is the database metadata log file;
and the transfer module is used for transferring the non-database file to the standby catalogue of other storage nodes when the target file is the non-database file.
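The division of work among the calling, releasing and transfer modules in claim 19 amounts to a dispatch on the type of the target file. The sketch below shows only that dispatch; the handlers record the action they would take instead of performing it, and the names `build_dispatcher` and the string keys are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]  # (operation, file path)


def build_dispatcher(journal: List[Action]) -> Callable[[str, str], None]:
    """Return a function that routes a target file to the matching module."""
    handlers: Dict[str, Callable[[str], None]] = {
        # Calling module: compress the database data file.
        "database_data": lambda p: journal.append(("compress", p)),
        # Releasing module: release the database metadata log file.
        "database_metadata_log": lambda p: journal.append(("release", p)),
        # Transfer module: move the file to another node's standby catalogue.
        "non_database": lambda p: journal.append(("transfer", p)),
    }

    def dispatch(kind: str, path: str) -> None:
        handlers[kind](path)

    return dispatch
```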
20. A monitor, comprising:
a memory for storing a computer program;
and a processor for implementing the steps of the data processing method of a cluster monitor according to any of claims 1 to 18 when executing said computer program.
21. A distributed cluster comprising the monitor of claim 20.
22. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, implements the steps of the data processing method of a cluster monitor as claimed in any of claims 1 to 18.
CN202310821959.0A 2023-07-06 2023-07-06 Data processing method and device for cluster monitor, cluster and medium Active CN116560966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310821959.0A CN116560966B (en) 2023-07-06 2023-07-06 Data processing method and device for cluster monitor, cluster and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310821959.0A CN116560966B (en) 2023-07-06 2023-07-06 Data processing method and device for cluster monitor, cluster and medium

Publications (2)

Publication Number Publication Date
CN116560966A (en) 2023-08-08
CN116560966B (en) 2023-09-19

Family

ID=87500401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310821959.0A Active CN116560966B (en) 2023-07-06 2023-07-06 Data processing method and device for cluster monitor, cluster and medium

Country Status (1)

Country Link
CN (1) CN116560966B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866350A (en) * 2015-05-27 2015-08-26 小米科技有限责任公司 Terminal partition space optimizing method, device and terminal
CN106227867A (en) * 2016-07-29 2016-12-14 努比亚技术有限公司 A kind of method and device of file management
CN106708928A (en) * 2016-11-17 2017-05-24 广州视源电子科技股份有限公司 File managing method and device

Also Published As

Publication number Publication date
CN116560966B (en) 2023-09-19

Similar Documents

Publication Publication Date Title
US11320991B2 (en) Identifying sub-health object storage devices in a data storage system
CN110209726B (en) Distributed database cluster system, data synchronization method and storage medium
CN106933843B (en) Database heartbeat detection method and device
CN110990432B (en) Device and method for synchronizing distributed cache clusters across machine room
CN110601903B (en) Data processing method and device based on message queue middleware
CN108173959B (en) Cluster storage system
CN102867035B (en) A kind of distributed file system cluster high availability method and device
US9652520B2 (en) System and method for supporting parallel asynchronous synchronization between clusters in a distributed data grid
CN108345617B (en) Data synchronization method and device and electronic equipment
CN107168970A (en) A kind of distributed file system HDFS management method, apparatus and system
CN112202853B (en) Data synchronization method, system, computer device and storage medium
CN108566291A (en) A kind of method of event handling, server and system
CN110351313B (en) Data caching method, device, equipment and storage medium
CN110333986B (en) Method for guaranteeing availability of redis cluster
CN104054076A (en) Data storage method, database storage node failure processing method and apparatus
CN112328702A (en) Data synchronization method and system
CN116560966B (en) Data processing method and device for cluster monitor, cluster and medium
CN107528703B (en) Method and equipment for managing node equipment in distributed system
CN108733808A (en) Big data software systems switching method, system, terminal device and storage medium
US9003018B2 (en) System and method for data set synchronization and replication
CN111552701A (en) Method for determining data consistency in distributed cluster and distributed data system
CN115695532A (en) Method, device and computer equipment for processing message by message middleware
CN114297182A (en) Industrial model data management method, device, equipment and readable storage medium
CN114301763A (en) Distributed cluster fault processing method and system, electronic device and storage medium
CN116170508A (en) Data processing method, terminal, system, equipment, medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant