CN110780821A - Optimization method and device of distributed storage system, server and storage medium - Google Patents

Optimization method and device of distributed storage system, server and storage medium

Info

Publication number
CN110780821A
Authority
CN
China
Prior art keywords
cluster
optimizing
optimization
storage system
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911030886.3A
Other languages
Chinese (zh)
Inventor
许宇峰
邓篪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Data Technology (shenzhen) Ltd By Share Ltd
Original Assignee
Data Technology (shenzhen) Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Data Technology (shenzhen) Ltd By Share Ltd filed Critical Data Technology (shenzhen) Ltd By Share Ltd
Priority to CN201911030886.3A priority Critical patent/CN110780821A/en
Publication of CN110780821A publication Critical patent/CN110780821A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides an optimization method, an optimization device, a server and a storage medium of a distributed storage system. The optimization method of the distributed storage system comprises the following steps: analyzing the cluster state of the cluster; evaluating distribution parameters of the cluster according to the cluster state; optimizing the cluster based on the distribution parameter. The cluster state is automatically analyzed for optimization, so that the performance of the storage system is optimized, and the data storage efficiency is improved.

Description

Optimization method and device of distributed storage system, server and storage medium
Technical Field
The embodiment of the invention relates to the technical field of distributed storage, in particular to an optimization method, an optimization device, a server and a storage medium of a distributed storage system.
Background
With the rapid development of the internet, the global data volume has grown explosively, and storing large amounts of data efficiently has become increasingly important.
At present, large amounts of data are generally stored in a Ceph-based distributed storage system. Ceph is based on the CRUSH algorithm, has no central node, can be scaled out almost without limit, and is therefore widely used. Because of this decentralization, however, the location of a PG that holds data may change whenever the CRUSH map changes.
However, when a distributed storage system stores a large amount of data and a fault occurs, such as a hard disk failure, a weakest-link ("short-board") effect appears: the weakest component limits the whole system, the performance of the storage system degrades, and the efficiency of storing data is reduced.
Disclosure of Invention
The embodiment of the invention provides an optimization method, an optimization device, a server and a storage medium of a distributed storage system, so as to realize the effect of optimizing the performance of the storage system and further improve the efficiency of data storage.
In a first aspect, an embodiment of the present invention provides an optimization method for a distributed storage system, including:
analyzing the cluster state of the cluster;
evaluating distribution parameters of the cluster according to the cluster state;
optimizing the cluster based on the distribution parameter.
Optionally, the evaluating the distribution parameter of the cluster according to the cluster state includes:
evaluating a first distribution state of the PG in the OSD according to the cluster state;
and evaluating a second distribution state of a special pool in the OSD according to the cluster state, wherein the special pool is an area for storing data information.
Optionally, the optimizing the cluster based on the distribution parameter includes:
acquiring one or more to-be-optimized OSDs whose occupied space is larger than a first threshold;
and adjusting the occupied space of the to-be-optimized OSDs to be lower than a second threshold.
Optionally, before the optimizing the cluster based on the distribution parameter, the method includes:
judging whether the cluster meets an optimization condition;
optimizing the cluster based on the distribution parameters if the cluster satisfies the optimization condition;
and if the cluster does not meet the optimization condition, judging whether the cluster meets the optimization condition again at preset time.
Optionally, the determining whether the cluster meets the optimization condition includes:
judging whether the cluster state is normal; and/or
judging whether the migration speed of the cluster meets a third threshold; and/or
judging whether the read-write speed of the cluster meets a fourth threshold; and/or
judging whether the number of to-be-optimized OSDs in the cluster meets a fifth threshold.
Optionally, the analyzing the cluster state of the cluster includes:
acquiring the usage record of the cluster at regular intervals;
acquiring the use time period of the cluster according to the usage record;
and analyzing the cluster state of the cluster during the non-use time period.
Optionally, after the optimizing the cluster based on the distribution parameter, the method includes:
judging whether the optimization times meet a sixth threshold value;
if the optimization times meet the sixth threshold, stopping optimization;
and if the optimization times do not meet the sixth threshold, optimizing the cluster again based on the distribution parameters until the optimization times meet the sixth threshold.
In a second aspect, an embodiment of the present invention provides an apparatus for optimizing a distributed storage system, including:
the analysis module is used for analyzing the cluster state of the cluster;
the evaluation module is used for evaluating the distribution parameters of the clusters according to the cluster states;
an optimization module to optimize the cluster based on the distribution parameter.
In a third aspect, an embodiment of the present invention provides a server, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the optimization method of the distributed storage system described in any embodiment of the invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the optimization method of the distributed storage system according to any embodiment of the present invention.
The embodiment of the invention analyzes the cluster state of the cluster, evaluates the distribution parameters of the cluster according to the cluster state, and optimizes the cluster based on the distribution parameters. This addresses the problem that degraded storage system performance reduces the efficiency of data storage: the performance of the storage system is optimized, and the efficiency of data storage is improved accordingly.
Drawings
Fig. 1 is a schematic flowchart of an optimization method of a distributed storage system according to a first embodiment of the present invention;
Fig. 2 is a schematic flowchart of an optimization method of a distributed storage system according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an optimization apparatus of a distributed storage system according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a server according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.
Furthermore, the terms "first," "second," and the like may be used herein to describe various orientations, actions, steps, elements, or the like, but the orientations, actions, steps, or elements are not limited by these terms. These terms are only used to distinguish one direction, action, step or element from another direction, action, step or element. For example, a first distribution state may be referred to as a second distribution state, and similarly, the second distribution state may be referred to as the first distribution state, without departing from the scope of the present application. The first distribution state and the second distribution state are both distribution states, but they are not the same distribution state. The terms "first", "second", etc. are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Example one
Fig. 1 is a flowchart of an optimization method for a distributed storage system according to an embodiment of the present invention, which is applicable to a scenario in which the distributed storage system is optimized, where the method may be executed by an optimization apparatus of the distributed storage system, and the apparatus may be implemented in a software and/or hardware manner and may be integrated on a server.
As shown in fig. 1, a method for optimizing a distributed storage system according to an embodiment of the present invention includes:
S110, analyzing the cluster state of the cluster.
In a distributed storage system, the cluster aggregates the storage space of a plurality of storage devices into one storage pool that provides a uniform access interface and management interface to application servers. The cluster state refers to the current state information of the cluster. Specifically, the cluster state includes, but is not limited to, whether the cluster is currently in use, the cluster environment, and the like. The cluster environment includes, but is not limited to, whether a hard disk has failed, whether an operating system has crashed, whether a network outage has occurred, whether space is unevenly distributed, and whether newly added hard disks are incompatible.
Optionally, the step may specifically include: acquiring the usage record of the cluster at regular intervals; acquiring the use time period of the cluster according to the usage record; and analyzing the cluster state of the cluster during the non-use time period.
The usage record of the cluster is the record of the user using the cluster to store data information. Specifically, the usage record may include, without limitation, a usage time record, a usage status record, and the like. The usage habits of the user are derived from the usage record, which in turn gives the use time period of the cluster. The non-use time period is the complement of the use time period, i.e., any time outside the use time period. For example, if the usage record shows that the user stores data in the cluster between 8:00 and 22:00 every day and does not use it after 22:00, the use time period of the cluster is 8:00-22:00. Once the use time period is known, the cluster state can be analyzed outside it, i.e., during the non-use time period, so that the efficiency with which the cluster reads data is not affected.
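As an illustration of deriving the non-use time period from a usage record, the following minimal Python sketch treats the record as a list of access timestamps and reports the hours of the day in which no access was seen. The function names and the hour-level granularity are assumptions made for this example; the patent does not specify a record format.

```python
from datetime import datetime

def usage_hours(records):
    """Return the set of hours of day (0-23) in which any usage was recorded."""
    return {ts.hour for ts in records}

def non_use_hours(records):
    """Return the hours of day with no recorded usage, in ascending order."""
    used = usage_hours(records)
    return [hour for hour in range(24) if hour not in used]

def in_non_use_period(now, records):
    """True if `now` falls in an hour of day that had no recorded usage."""
    return now.hour in non_use_hours(records)

# Hypothetical usage record: one access per hour from 08:15 through 21:15,
# which corresponds to the 8:00-22:00 use time period in the text.
records = [datetime(2019, 10, 28, hour, 15) for hour in range(8, 22)]
idle = non_use_hours(records)
```

With this sample record, the analysis of step S110 would be scheduled only when `in_non_use_period` returns true, for example at 23:00.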
S120, evaluating the distribution parameters of the cluster according to the cluster state.
The distribution parameters refer to the parameters and distribution states of the modules in the cluster, such as PGs and OSDs. Examples include, without limitation, the distribution state of PGs in the OSDs, the distribution state of a special pool in the OSDs, and whether PGs are distributed uniformly across the OSDs. An OSD (Object Storage Device) is one of a large number of units in the cluster that are responsible for storing and maintaining data. The special pool refers to an area for storing data information. A PG (Placement Group) is the unit used for organization and location mapping when an object is stored; it is a virtual concept and does not correspond to a specific entity. Preferably, the distribution parameters include a first distribution state of the PGs in the OSDs and a second distribution state of the special pool in the OSDs.
Optionally, this step may include: evaluating a first distribution state of the PG in the OSD according to the cluster state; and evaluating a second distribution state of a special pool in the OSD according to the cluster state, wherein the special pool is an area for storing data information.
Wherein, the first distribution state refers to the distribution state of the PG in the OSD. The second distribution state refers to the distribution state of the special pool in the OSD. The special pool refers to an area for storing data information, such as a metadata pool, and the like, and is not limited herein. The metadata pool is an area in which information describing attributes of data is stored. And evaluating to obtain a first distribution state and a second distribution state so as to optimize the cluster.
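One way to picture the two distribution states is as per-OSD PG counts: once over all PGs (first distribution state) and once restricted to the PGs of a special pool (second distribution state). The sketch below is an assumption-laden illustration, not the patent's method: the input mapping, the function names, and the use of the coefficient of variation as an evenness measure are all hypothetical. PG identifiers are assumed to follow the `pool.index` form.

```python
from collections import Counter
from statistics import mean, pstdev

def pg_count_per_osd(pg_to_osds, osd_ids, pool=None):
    """Count how many PGs map to each OSD, optionally for one pool only."""
    counts = Counter({osd: 0 for osd in osd_ids})  # keep OSDs that hold no PGs
    for pg_id, osds in pg_to_osds.items():
        if pool is not None and pg_id.split(".")[0] != pool:
            continue
        counts.update(osds)
    return dict(counts)

def imbalance(counts):
    """Coefficient of variation of the counts; 0.0 means perfectly even."""
    values = list(counts.values())
    avg = mean(values)
    return pstdev(values) / avg if avg else 0.0

# Hypothetical mapping of PG id -> OSDs holding its copies (pool "2" plays
# the role of the special pool, e.g. a metadata pool).
pg_to_osds = {
    "1.0": [0, 1], "1.1": [0, 2], "1.2": [0, 1],
    "2.0": [1, 2], "2.1": [2, 0],
}
first_state = pg_count_per_osd(pg_to_osds, [0, 1, 2])
second_state = pg_count_per_osd(pg_to_osds, [0, 1, 2], pool="2")
```

A larger `imbalance` value for either state would indicate that the corresponding distribution is less uniform and is a candidate for optimization.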
S130, optimizing the cluster based on the distribution parameters.
According to the distribution parameters, it is determined whether the average percentage of occupied space across the OSDs in the cluster is too large, whether the percentage of occupied space of a particular OSD is too large, or whether an OSD is damaged, and the cluster is optimized accordingly. Specifically, when the percentage of occupied space of a single OSD is too large, data in that OSD may be migrated to other OSDs, so that the percentages of occupied space of all OSDs in the cluster are close or approximately equal. When the average percentage of occupied space over all OSDs is too large while the individual percentages are already close to one another, or when an OSD in the cluster is damaged, a new OSD can be added (replacing the damaged OSD where applicable) so that the cluster can still meet its data storage needs.
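The choice between migrating data and adding an OSD can be sketched as a small decision function. This is only an illustration under stated assumptions: the 80% limit, the action names, and the function signature are hypothetical and do not come from the patent.

```python
def choose_action(usage_pct, damaged_osds, limit_pct=80):
    """Pick an optimization action from per-OSD usage percentages.

    usage_pct: dict of OSD id -> percentage of occupied space.
    damaged_osds: collection of OSD ids known to be damaged.
    limit_pct: assumed limit above which usage counts as "too large".
    """
    if damaged_osds:
        return "add_osd"          # a damaged OSD is replaced by a new one
    values = list(usage_pct.values())
    average = sum(values) / len(values)
    if average > limit_pct:
        return "add_osd"          # the whole cluster is short of space
    if max(values) > limit_pct:
        return "migrate"          # only some OSDs are too full
    return "none"

# Hypothetical readings: one hot OSD, evenly full cluster, damaged OSD.
hot = choose_action({0: 90, 1: 40, 2: 35}, set())
full = choose_action({0: 85, 1: 88, 2: 90}, set())
broken = choose_action({0: 40, 1: 45}, {1})
```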
Optionally, the step may specifically include: acquiring one or more to-be-optimized OSDs whose occupied space is larger than a first threshold; and adjusting the occupied space of the to-be-optimized OSDs to be lower than a second threshold.
The first threshold is a threshold for determining whether the occupied space of an OSD is too large. The first threshold may be expressed as a percentage, for example 50%, or as a specific value; preferably, it is expressed as a percentage. When the occupied space of an OSD is larger than the first threshold, the remaining space of that OSD is insufficient, and the OSD is a to-be-optimized OSD. The optimization process adjusts the occupied space of the to-be-optimized OSD to be lower than the second threshold. The second threshold is a threshold for determining whether the remaining space of an OSD is sufficient. The second threshold may likewise be a percentage or a specific value; preferably, it takes the same form as the first threshold. In this embodiment, the OSDs are adjusted up or down in small increments by internal data migration of the cluster, i.e., part of the data of the to-be-optimized OSDs is migrated to other OSDs, so that objects are distributed as uniformly as possible across the PGs and the PGs are distributed uniformly across the OSDs. Optionally, the to-be-optimized OSDs may be determined by a calculation written in Python, a general-purpose programming language that is also commonly used for developing web crawlers.
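The two-threshold adjustment can be sketched in Python as a greedy plan that moves data in small increments from each to-be-optimized OSD to the currently emptiest OSD. Everything concrete here is an assumption for illustration: usage is modelled as integer percentages of equally sized OSDs, the thresholds 60 and 50 are sample values, and the function names are hypothetical. A real cluster would move PGs through its own rebalancing mechanisms rather than abstract percentage points.

```python
def select_overfull(usage_pct, first_threshold):
    """Return OSD ids whose usage exceeds first_threshold, fullest first."""
    over = [osd for osd, pct in usage_pct.items() if pct > first_threshold]
    return sorted(over, key=lambda osd: usage_pct[osd], reverse=True)

def plan_migration(usage_pct, first_threshold, second_threshold, step=1):
    """Greedily move `step` percentage points at a time to the emptiest OSD.

    Returns (moves, final_usage), where moves is a list of (source, target).
    Assumes all OSDs have the same capacity, so percentages are comparable.
    """
    usage = dict(usage_pct)  # work on a copy
    moves = []
    for src in select_overfull(usage, first_threshold):
        while usage[src] >= second_threshold:
            dst = min(usage, key=usage.get)
            if dst == src or usage[dst] + step >= second_threshold:
                break  # no room left anywhere; the cluster needs more OSDs
            usage[src] -= step
            usage[dst] += step
            moves.append((src, dst))
    return moves, usage

# Hypothetical cluster: OSD 0 is above the first threshold of 60%.
usage_pct = {0: 72, 1: 35, 2: 40, 3: 30}
moves, final_usage = plan_migration(usage_pct, first_threshold=60,
                                    second_threshold=50)
```

Choosing the emptiest OSD as the target at every step is what keeps the resulting percentages close to one another; other target-selection policies are equally possible.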
Specifically, during data migration the corresponding PG may be in the active, remapped, and backfilling states at the same time. The remapping of the PG and the reading and writing of data between OSDs take place inside the cluster and do not affect the read-write operations of the client. Preferably, the number of PGs is 20 × (number of OSDs) / (number of copies).
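As a worked example of the preferred PG count, the following sketch reads the rule as 20 multiplied by the number of OSDs and divided by the number of copies. That reading is an interpretation of the sentence above, and the function name and the rounding down to an integer are assumptions of this example.

```python
def preferred_pg_count(osd_count, copy_count, factor=20):
    """PG count under the assumed rule: factor * OSDs / copies, rounded down."""
    if osd_count <= 0 or copy_count <= 0:
        raise ValueError("osd_count and copy_count must be positive")
    return factor * osd_count // copy_count

# Hypothetical cluster with 12 OSDs and 3 copies of each object.
pg_count = preferred_pg_count(osd_count=12, copy_count=3)
```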
Optionally, after the optimization is completed, the usage record is cleared, and a new usage record is recorded again, so as to ensure that the next optimization time is not affected by the previous usage record.
According to the technical scheme of this embodiment, the cluster state of the cluster is analyzed, the distribution parameters of the cluster are evaluated according to the cluster state, and the cluster is optimized based on the distribution parameters. Because the cluster state is analyzed and the cluster is optimized continuously while the cluster is in service, the cluster can always store data with good performance, which optimizes the performance of the storage system and improves the efficiency of data storage. In addition, the cluster state is analyzed and optimized automatically, so no manual intervention is needed and operation and maintenance costs are reduced. Furthermore, because analysis and optimization take place during the non-use time of the cluster, they require no additional resources and do not affect the data reading efficiency of the cluster.
Example two
Fig. 2 is a flowchart illustrating an optimization method of a distributed storage system according to a second embodiment of the present invention. The embodiment is further refined in the technical scheme, and is suitable for a scene of optimizing the distributed storage system. The method may be performed by an optimization device of the distributed storage system, which may be implemented in software and/or hardware, and may be integrated on a server.
As shown in fig. 2, the optimization method for a distributed storage system according to the second embodiment of the present invention includes:
S210, analyzing the cluster state of the cluster.
In a distributed storage system, the cluster aggregates the storage space of a plurality of storage devices into one storage pool that provides a uniform access interface and management interface to application servers. The cluster state refers to the current state information of the cluster.
S220, evaluating the distribution parameters of the cluster according to the cluster state.
The distribution parameters refer to the parameters and distribution states of the modules in the cluster, such as PGs and OSDs. Examples include, without limitation, the distribution state of PGs in the OSDs, the distribution state of a special pool in the OSDs, and whether PGs are distributed uniformly across the OSDs.
S230, judging whether the cluster meets the optimization condition.
In this embodiment, the optimization condition refers to a condition for optimizing the cluster without affecting the data stored in the cluster. Optionally, the step may specifically include: judging whether the cluster state is normal or not; and/or judging whether the migration speed of the cluster meets a third threshold value; and/or judging whether the read-write speed of the cluster meets a fourth threshold value; and/or judging whether the number of the OSD to be optimized in the cluster meets a fifth threshold value.
Optimizing the cluster only when the cluster state is normal avoids data loss. The migration speed meets the third threshold when it is less than or equal to a specific value; preferably, this means a migration speed of 0. Optimizing the cluster only when the migration speed is 0 ensures the correctness of the data stored in the cluster. The read-write speed of the cluster meets the fourth threshold when it is smaller than a specific value. Optimizing the cluster only when the read-write speed is below the fourth threshold ensures that the data storage service provided by the cluster is not affected during optimization. A to-be-optimized OSD is an OSD that needs to be optimized; preferably, it is an OSD whose occupied space is greater than the first threshold. Because cluster optimization consumes processing resources of the cluster, the cluster is optimized only when the number of to-be-optimized OSDs is larger than the fifth threshold, which reduces operation and maintenance costs. If the cluster meets the optimization condition, step S240 is executed and the cluster is optimized based on the distribution parameters; if the cluster does not meet the optimization condition, step S250 is executed and whether the cluster meets the optimization condition is judged again at a preset time.
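The optimization condition of step S230 can be sketched as a single predicate over a snapshot of the cluster. The description allows the four checks to be combined with "and/or"; this example assumes that all four must hold. The `ClusterSnapshot` type, its field names, and the default threshold values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ClusterSnapshot:
    healthy: bool            # cluster state is normal
    migration_speed: float   # data currently being migrated, bytes per second
    io_speed: float          # client read-write traffic, bytes per second
    overfull_osds: int       # number of to-be-optimized OSDs

def meets_optimization_condition(snapshot, third=0.0, fourth=10e6, fifth=0):
    """Return True when every check described for step S230 passes.

    third: migration speed must be less than or equal to this value.
    fourth: read-write speed must be smaller than this value.
    fifth: number of to-be-optimized OSDs must be larger than this value.
    """
    return (
        snapshot.healthy
        and snapshot.migration_speed <= third
        and snapshot.io_speed < fourth
        and snapshot.overfull_osds > fifth
    )

# Hypothetical snapshots: an idle cluster with two full OSDs, and a busy one.
idle_cluster = ClusterSnapshot(True, 0.0, 2e6, 2)
busy_cluster = ClusterSnapshot(True, 0.0, 50e6, 2)
```

A deployment that wants only some of the checks could drop the corresponding terms, which matches the "and/or" wording of the description.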
S240, optimizing the cluster based on the distribution parameters.
According to the distribution parameters, it is determined whether the average percentage of occupied space across the OSDs in the cluster is too large, whether the percentage of occupied space of a particular OSD is too large, or whether an OSD is damaged, and the cluster is optimized accordingly. Specifically, when the percentage of occupied space of a single OSD is too large, data in that OSD may be migrated to other OSDs, so that the percentages of occupied space of all OSDs in the cluster are close or approximately equal. When the average percentage of occupied space over all OSDs is too large while the individual percentages are already close to one another, or when an OSD in the cluster is damaged, a new OSD can be added (replacing the damaged OSD where applicable) so that the cluster can still meet its data storage needs.
S250, judging whether the cluster meets the optimization condition again at the preset time.
In this embodiment, the preset time may be a time set in advance, or may be a time determined according to the usage record of the cluster. Preferably, the preset time is a non-use time period of the cluster. And judging whether the cluster meets the optimization condition again within the preset time so as to optimize the cluster.
Optionally, after step S250, the method may include: judging whether the optimization times meet a sixth threshold value; if the optimization times meet the sixth threshold, stopping optimization; and if the optimization times do not meet the sixth threshold, optimizing the cluster again based on the distribution parameters until the optimization times meet the sixth threshold.
Whether the cluster needs further adjustment is judged according to the sixth threshold. The optimization count meets the sixth threshold when it is not less than the sixth threshold. Optimizing the cluster multiple times helps ensure a better optimization result.
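The interplay between the condition check, the re-check at a preset time, and the optimization count can be sketched as one control loop. The callables `check`, `optimize`, and `wait` are placeholders supplied by the caller, and the `max_checks` safety limit is an assumption added so that the sketch always terminates; neither is part of the patent text.

```python
def run_optimization(check, optimize, wait, sixth_threshold, max_checks=100):
    """Optimize until the optimization count reaches sixth_threshold.

    check(): returns True when the cluster meets the optimization condition.
    optimize(): performs one optimization pass.
    wait(): blocks until the next preset time before checking again.
    Returns the number of optimization passes actually performed.
    """
    optimizations = 0
    checks = 0
    while optimizations < sixth_threshold and checks < max_checks:
        checks += 1
        if check():
            optimize()
            optimizations += 1
        else:
            wait()  # condition not met: try again at the preset time
    return optimizations

# Hypothetical run: the condition fails twice among five checks.
outcomes = iter([False, True, True, False, True])
log = []
performed = run_optimization(
    check=lambda: next(outcomes),
    optimize=lambda: log.append("optimize"),
    wait=lambda: log.append("wait"),
    sixth_threshold=3,
)
```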
According to the technical scheme of the embodiment of the invention, the cluster state of the cluster is analyzed; evaluating distribution parameters of the cluster according to the cluster state; and optimizing the cluster based on the distribution parameters, continuously analyzing the cluster state in the using process of the cluster, and continuously optimizing the cluster, so that the cluster can always store data with better performance, the performance of a storage system is optimized, and the technical effect of improving the efficiency of data storage is achieved.
Example three
Fig. 3 is a schematic structural diagram of an optimization apparatus of a distributed storage system according to a third embodiment of the present invention, where this embodiment is applicable to a scenario of optimizing the distributed storage system, and the apparatus may be implemented in a software and/or hardware manner and may be integrated on a server.
As shown in fig. 3, the optimization apparatus of the distributed storage system provided in this embodiment may include an analysis module 310, an evaluation module 320, and an optimization module 330, where:
an analysis module 310, configured to analyze a cluster state of the cluster;
an evaluation module 320, configured to evaluate a distribution parameter of the cluster according to the cluster status;
an optimization module 330 configured to optimize the cluster based on the distribution parameter.
Optionally, the analysis module 310 includes:
the first acquisition unit is used for acquiring the usage record of the cluster at regular intervals, and for acquiring the use time period of the cluster according to the usage record;
and the analysis unit is used for analyzing the cluster state of the cluster in the non-use time period.
Optionally, the evaluation module 320 includes:
the first evaluation unit is used for evaluating a first distribution state of the PG in the OSD according to the cluster state;
and the second evaluation unit is used for evaluating a second distribution state of a special pool in the OSD according to the cluster state, wherein the special pool is an area for storing data information.
Optionally, the optimizing module 330 includes:
the second obtaining unit is used for obtaining one or more to-be-optimized OSDs whose occupied space is larger than the first threshold;
and the adjusting unit is used for adjusting the occupied space of the to-be-optimized OSDs to be lower than the second threshold.
Optionally, the apparatus further comprises:
the judging module is used for judging whether the cluster meets the optimization condition; optimizing the cluster based on the distribution parameters if the cluster satisfies the optimization condition; and if the cluster does not meet the optimization condition, judging whether the cluster meets the optimization condition again at preset time.
Optionally, the judging module is specifically configured for judging whether the cluster state is normal; and/or
judging whether the migration speed of the cluster meets a third threshold; and/or
judging whether the read-write speed of the cluster meets a fourth threshold; and/or
judging whether the number of to-be-optimized OSDs in the cluster meets a fifth threshold.
The optimization device of the distributed storage system provided by the embodiment of the invention can execute the optimization method of the distributed storage system provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. Reference may be made to the description of any method embodiment of the invention not specifically described in this embodiment.
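The module structure of the apparatus (analysis module 310, evaluation module 320, optimization module 330) can be sketched as three small classes chained by a coordinating object. The class names mirror the module names, but the method names, the shape of the raw input, and the fact that the optimization module merely reports the to-be-optimized OSDs are assumptions made for this illustration.

```python
class AnalysisModule:
    """Analyzes the cluster state from raw per-OSD readings."""
    def analyze(self, raw_osds):
        return {"healthy": all(o["up"] for o in raw_osds), "osds": raw_osds}

class EvaluationModule:
    """Evaluates distribution parameters from the cluster state."""
    def evaluate(self, state):
        return {o["id"]: o["used_pct"] for o in state["osds"] if o["up"]}

class OptimizationModule:
    """Selects the OSDs that need optimization based on the parameters."""
    def __init__(self, first_threshold):
        self.first_threshold = first_threshold

    def optimize(self, params):
        return sorted(osd for osd, pct in params.items()
                      if pct > self.first_threshold)

class OptimizationApparatus:
    """Chains the three modules in the order analysis, evaluation, optimization."""
    def __init__(self, analysis, evaluation, optimization):
        self.analysis = analysis
        self.evaluation = evaluation
        self.optimization = optimization

    def run(self, raw_osds):
        state = self.analysis.analyze(raw_osds)
        params = self.evaluation.evaluate(state)
        return self.optimization.optimize(params)

# Hypothetical readings for three OSDs, one of which is down.
raw = [
    {"id": 0, "up": True, "used_pct": 72},
    {"id": 1, "up": True, "used_pct": 35},
    {"id": 2, "up": False, "used_pct": 90},
]
apparatus = OptimizationApparatus(AnalysisModule(), EvaluationModule(),
                                  OptimizationModule(first_threshold=60))
to_optimize = apparatus.run(raw)
```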
Example four
Fig. 4 is a schematic structural diagram of a server according to a fourth embodiment of the present invention. FIG. 4 illustrates a block diagram of an exemplary server 612 suitable for use in implementing embodiments of the present invention. The server 612 shown in fig. 4 is only an example, and should not bring any limitation to the function and the scope of the use of the embodiments of the present invention.
As shown in Fig. 4, the server 612 is in the form of a general-purpose server. The components of server 612 may include, but are not limited to: one or more processors 616, a storage device 628, and a bus 618 that couples the various system components including the storage device 628 and the processors 616.
Bus 618 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
The server 612 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by server 612 and includes both volatile and nonvolatile media, removable and non-removable media.
Storage device 628 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 630 and/or cache memory 632. Server 612 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 634 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, and commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk such as a Compact Disc Read-Only Memory (CD-ROM), Digital Video Disc Read-Only Memory (DVD-ROM) or other optical media may be provided. In such cases, each drive may be connected to bus 618 by one or more data media interfaces. Storage device 628 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 640 having a set (at least one) of program modules 642 may be stored, for example, in storage device 628. Such program modules 642 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these, or some combination thereof, may include an implementation of a network environment. The program modules 642 generally perform the functions and/or methods of the described embodiments of the present invention.
The server 612 may also communicate with one or more external devices 614 (e.g., a keyboard, a pointing device, a display 624, etc.), with one or more devices that enable a user to interact with the server 612, and/or with any devices (e.g., a network card, a modem, etc.) that enable the server 612 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 622. Further, server 612 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via network adapter 620. As shown in FIG. 4, the network adapter 620 communicates with the other modules of the server 612 via the bus 618. It should be appreciated that, although not shown, other hardware and/or software modules may be used in conjunction with the server 612, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems, among others.
The processor 616 executes various functional applications and data processing by running programs stored in the storage device 628, for example, implementing an optimization method of a distributed storage system provided by any embodiment of the present invention, which may include:
analyzing the cluster state of the cluster;
evaluating distribution parameters of the cluster according to the cluster state;
optimizing the cluster based on the distribution parameter.
According to the technical solution of this embodiment of the invention, the cluster state of the cluster is analyzed, distribution parameters of the cluster are evaluated according to the cluster state, and the cluster is optimized based on the distribution parameters. Because the cluster state is continuously analyzed and the cluster is continuously optimized while the cluster is in use, the cluster can keep storing data with good performance, which optimizes the performance of the storage system and improves the efficiency of data storage.
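The three steps above (analyze, evaluate, optimize) can be pictured with a minimal, purely in-memory sketch. It is not the implementation of this embodiment: the cluster is modeled as a hypothetical dictionary mapping an OSD identifier to its PG count, the "distribution parameters" are assumed to be the mean and standard deviation of that count, and "optimizing" only reports which OSDs hold more PGs than the mean. All function names are illustrative.

```python
from statistics import mean, pstdev


def analyze_cluster_state(cluster):
    """Step 1: capture the cluster state (here, a copy of the PG count per OSD)."""
    return {"pg_per_osd": dict(cluster)}


def evaluate_distribution(state):
    """Step 2: derive distribution parameters from the cluster state.

    Assumption: the parameters are the mean and the population standard
    deviation of the per-OSD PG counts.
    """
    counts = list(state["pg_per_osd"].values())
    return {"mean_pgs": mean(counts), "stdev_pgs": pstdev(counts)}


def optimize_cluster(state, params):
    """Step 3: pick OSDs holding more PGs than the mean as rebalancing candidates."""
    return sorted(
        osd for osd, count in state["pg_per_osd"].items()
        if count > params["mean_pgs"]
    )


def run_once(cluster):
    """Run analyze -> evaluate -> optimize one time."""
    state = analyze_cluster_state(cluster)
    params = evaluate_distribution(state)
    return optimize_cluster(state, params)
```

For example, `run_once({"osd.0": 10, "osd.1": 20, "osd.2": 30})` returns `['osd.2']`, the only OSD above the mean of 20 PGs. Continuous optimization would amount to calling `run_once` repeatedly while the cluster is in use.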
Example five
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements a method for optimizing a distributed storage system, where the method includes:
analyzing the cluster state of the cluster;
evaluating distribution parameters of the cluster according to the cluster state;
optimizing the cluster based on the distribution parameter.
The computer-readable storage medium of embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or terminal. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
According to the technical solution of this embodiment of the invention, the cluster state of the cluster is analyzed, distribution parameters of the cluster are evaluated according to the cluster state, and the cluster is optimized based on the distribution parameters. Because the cluster state is continuously analyzed and the cluster is continuously optimized while the cluster is in use, the cluster can keep storing data with good performance, which optimizes the performance of the storage system and improves the efficiency of data storage.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for optimizing a distributed storage system, comprising:
analyzing the cluster state of the cluster;
evaluating distribution parameters of the cluster according to the cluster state;
optimizing the cluster based on the distribution parameter.
2. The method of optimizing a distributed storage system according to claim 1, wherein said evaluating distribution parameters of said cluster according to said cluster state comprises:
evaluating a first distribution state of PGs in the OSDs according to the cluster state;
and evaluating a second distribution state of a special pool in the OSDs according to the cluster state, wherein the special pool is an area for storing data information.
3. The method of optimizing a distributed storage system according to claim 1, wherein the optimizing the cluster based on the distribution parameter comprises:
acquiring one or more OSDs to be optimized whose occupied space is larger than a first threshold;
and adjusting the occupied space of the OSDs to be optimized to be lower than a second threshold.
4. The method of optimizing a distributed storage system according to claim 1, further comprising, prior to said optimizing the cluster based on the distribution parameters:
judging whether the cluster meets an optimization condition;
optimizing the cluster based on the distribution parameters if the cluster satisfies the optimization condition;
and if the cluster does not meet the optimization condition, judging again at a preset time whether the cluster meets the optimization condition.
5. The method of optimizing a distributed storage system according to claim 4, wherein said determining whether the cluster satisfies an optimization condition comprises:
judging whether the cluster state is normal; and/or
judging whether the migration speed of the cluster meets a third threshold value; and/or
judging whether the read-write speed of the cluster meets a fourth threshold value; and/or
judging whether the number of OSDs to be optimized in the cluster meets a fifth threshold value.
6. The method of optimizing a distributed storage system according to claim 1, wherein said analyzing the cluster state of the cluster comprises:
acquiring the use record of the cluster at regular intervals;
acquiring the use time periods of the cluster according to the use record;
and analyzing the cluster state of the cluster during a non-use period.
7. The method of optimizing a distributed storage system according to claim 1, further comprising, after said optimizing the cluster based on the distribution parameter:
judging whether the number of optimizations performed meets a sixth threshold value;
if the number of optimizations meets the sixth threshold, stopping optimization;
and if the number of optimizations does not meet the sixth threshold, optimizing the cluster again based on the distribution parameters until the number of optimizations meets the sixth threshold.
8. An optimization apparatus for a distributed storage system, comprising:
the analysis module is used for analyzing the cluster state of the cluster;
the evaluation module is used for evaluating the distribution parameters of the clusters according to the cluster states;
an optimization module to optimize the cluster based on the distribution parameter.
9. A server, comprising:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the optimization method for the distributed storage system of any of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of optimizing a distributed storage system according to any one of claims 1 to 7.
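As an illustration of how the threshold-based steps of claims 3 and 7 could fit together, the following sketch simulates them on in-memory data. It is an assumption-laden example rather than the claimed implementation: OSD usage is a hypothetical dictionary of used-space percentages, all OSDs are assumed to have equal capacity, and data is moved in fixed `step` increments to whichever OSD is currently least used. The function and parameter names are illustrative.

```python
def optimize_until_count(usage, first_threshold, second_threshold,
                         sixth_threshold, step=5):
    """Rebalance a copy of `usage` for exactly `sixth_threshold` rounds.

    usage: dict mapping OSD id -> used space in percent (integers).
    In each round, every OSD above `first_threshold` is treated as an OSD
    to be optimized and is drained until it is below `second_threshold`.
    """
    usage = dict(usage)  # work on a copy; the caller's data is untouched
    for _ in range(sixth_threshold):
        # Claim 3, first step: acquire the OSDs above the first threshold.
        to_optimize = [osd for osd, used in usage.items() if used > first_threshold]
        for osd in to_optimize:
            # Claim 3, second step: bring usage below the second threshold.
            while usage[osd] >= second_threshold:
                target = min(usage, key=usage.get)  # least-used OSD
                if target == osd:
                    break  # nowhere better to move data
                usage[osd] -= step
                usage[target] += step
    return usage
```

Rounds in which no OSD exceeds the first threshold do nothing, so the loop still ends once the optimization count reaches the sixth threshold, as claim 7 describes. A practical system would likely stop early in that case, which is a design choice outside what the claims state.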
CN201911030886.3A 2019-10-28 2019-10-28 Optimization method and device of distributed storage system, server and storage medium Pending CN110780821A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911030886.3A CN110780821A (en) 2019-10-28 2019-10-28 Optimization method and device of distributed storage system, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911030886.3A CN110780821A (en) 2019-10-28 2019-10-28 Optimization method and device of distributed storage system, server and storage medium

Publications (1)

Publication Number Publication Date
CN110780821A true CN110780821A (en) 2020-02-11

Family

ID=69386992

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911030886.3A Pending CN110780821A (en) 2019-10-28 2019-10-28 Optimization method and device of distributed storage system, server and storage medium

Country Status (1)

Country Link
CN (1) CN110780821A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107391633A (en) * 2017-06-30 2017-11-24 北京奇虎科技有限公司 Data-base cluster Automatic Optimal processing method, device and server
CN107817950A (en) * 2017-10-31 2018-03-20 新华三技术有限公司 A kind of data processing method and device
CN109933285A (en) * 2019-02-26 2019-06-25 新华三技术有限公司成都分公司 The data balancing method and device of distributed storage

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113271323A (en) * 2020-02-14 2021-08-17 中移(苏州)软件技术有限公司 Cluster capacity expansion method and device and storage medium
CN111758086A (en) * 2020-05-22 2020-10-09 长江存储科技有限责任公司 Method for refreshing mapping table of SSD
CN111797508A (en) * 2020-06-12 2020-10-20 中冶建筑研究总院有限公司 Steel roof truss safety real-time evaluation method based on monitoring technology
CN111797508B (en) * 2020-06-12 2024-03-01 中冶建筑研究总院有限公司 Real-time evaluation method for safety of steel roof truss based on monitoring technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200211