CN111949217A - Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof - Google Patents

Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof

Publication number
CN111949217A
CN111949217A CN202010850653.4A CN202010850653A CN111949217A CN 111949217 A CN111949217 A CN 111949217A CN 202010850653 A CN202010850653 A CN 202010850653A CN 111949217 A CN111949217 A CN 111949217A
Authority
CN
China
Prior art keywords
disk
storage
data
cache
master
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010850653.4A
Other languages
Chinese (zh)
Inventor
马金祥
郭亮
胡明
陈杰
张达
李建东
周逵
米保军
董瑞柯
撖伟
黄春旭
郑建忠
胡晓宇
张志标
唐延恺
冯志超
周国贞
黄立薇
林永昌
仇爱超
李华勇
林益溪
郭健海
郭雷
王兴龙
牛春营
李立霞
陈雪发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SGIS Songshan Co Ltd
Original Assignee
SGIS Songshan Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SGIS Songshan Co Ltd
Priority to CN202010850653.4A
Publication of CN111949217A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0607 Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662 Virtualisation aspects
    • G06F3/0664 Virtualisation aspects at device level, e.g. emulation of a storage device or system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention provide a super-fusion all-in-one machine and a software-defined storage (SDS) processing method and system thereof. The method comprises: fragmenting a file to be stored into a plurality of object data; obtaining the identifier of the logical placement group to which each object belongs by applying a hash algorithm and a modulo operation to the storage pool to which the virtual disk belongs and the identifier of the object data; passing the logical placement group identifier to a storage location algorithm, which returns the corresponding master and slave storage devices, wherein the master and slave storage devices comprise a cache disk and a persistent storage disk; writing the object data to the cache disks of the master and slave storage devices; returning a write-success message to the upper-layer service system; and flushing the data in the cache disk to the persistent storage disk. The method can improve at least one of reliability, availability, and access efficiency.

Description

Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof
Technical Field
The invention relates to the technical field of distributed storage, and in particular to a super-fusion all-in-one machine and a software-defined storage (SDS) processing method and system thereof.
Background
With the spread of the internet and the emergence of the Internet of Things, the volume of information on the network has grown explosively. The market research firm IDC predicts that the annual growth rate of global data volume will remain at about 50%, with global data volume reaching 40 ZB in 2020. China's data volume is expected to reach 8.6 ZB, about 21% of the global total. As application vendors' businesses expand rapidly, they need more flexible ways to scale server, network, and storage resources; a service rollout cycle of several months falls far short of what business growth requires.
As the internet has developed, apps and application service providers have proliferated. Service providers require an enterprise data center architecture that can flexibly allocate and reclaim server computing resources, expand storage space easily and, more importantly, be built and expanded on demand. Traditional virtualization platforms and traditional distributed storage have therefore been combined into the Hyper-Converged Infrastructure (HCI), in which computing, network, and storage converge on x86 servers.
The main components of HCI are virtualization and distributed storage. Virtualization provides the computing and network resources of the service virtual machines, while distributed storage provides storage resources and ensures data reliability. The two components are fused on the same set of x86 servers, which removes the complexity of the traditional three-layer architecture and avoids the management and expansion difficulties of traditional centralized storage.
Software-defined storage (SDS) is one of the main trends in the evolution of contemporary data center technology. Over the last 10 years, the SDS ecosystem has gradually matured, building on the open architectures accumulated across industries and, in particular, on the successful experience of large-scale internet operators with IT infrastructure.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art: current software defined storage SDS has at least one of the problems of low system reliability, low availability, and inefficient access.
Disclosure of Invention
The embodiments of the invention provide a super-fusion all-in-one machine and a software-defined storage (SDS) processing method and system thereof, with the aim of improving at least one of reliability, availability, and access efficiency.
To achieve the above object, in a first aspect, an embodiment of the present invention provides a software-defined storage (SDS) processing method for a super-fusion all-in-one machine, including:
fragmenting a file to be stored into a plurality of object data;
obtaining the identifier of the logical placement group to which the object data belongs through a hash algorithm and a modulo operation, according to the storage pool to which the virtual disk belongs and the identifier of the object data;
passing the logical placement group identifier to a storage location algorithm, which returns the corresponding master and slave storage devices, wherein the master and slave storage devices comprise a cache disk and a persistent storage disk;
writing the object data to the cache disks of the master and slave storage devices;
returning a write-success message to the upper-layer service system;
and flushing the data in the cache disk to the persistent storage disk.
In some possible embodiments, the cache disk is an SSD, and the persistent storage disk is an HDD, the method further comprising: caching hot data in the SSD to handle random I/O concurrent accesses; and automatically adjusting the read-write cache proportion of the SSD according to the service load.
In some possible embodiments, the method further comprises: when any copy data of any one data block fails to be read, the data is read from other copies and then rewritten into the copy for recovery, so that the total number of the data copies is not reduced.
In some possible embodiments, the method further comprises: when data inconsistency is caused by the failure of a node or a disk, the copy fragments on different nodes are compared through a self-checking mechanism, the data failure is found, and data recovery is started.
In some possible embodiments, the method further comprises: detecting whether a bad disk exists, automatically isolating the bad disk when the bad disk is detected, and automatically reconstructing a data copy on the bad disk in parallel; the magnetic disk or the cache disk is provided with a hard disk positioning lamp.
In some possible embodiments, the method further comprises: and detecting whether the storage pool has a slow disk or not according to the average reading rate, the average writing rate and the average I/O delay level, and if the storage pool has the slow disk, giving an alarm and transferring the data in the slow disk to a hot standby disk.
In a second aspect, an embodiment of the present invention provides a software-defined storage (SDS) processing system for a super-fusion all-in-one machine, including:
a fragmentation module, configured to fragment a file to be stored into a plurality of object data;
a hash module, configured to obtain the identifier of the logical placement group to which the object data belongs through a hash algorithm and a modulo operation, according to the storage pool to which the virtual disk belongs and the identifier of the object data;
a storage location module, configured to receive the logical placement group identifier and obtain the corresponding master and slave storage devices according to a storage location algorithm, wherein the master and slave storage devices comprise a cache disk and a persistent storage disk;
a writing module, configured to write the object data to the cache disks of the master and slave storage devices;
a feedback module, configured to return a write-success message to the upper-layer service system;
and a flush module, configured to flush the data in the cache disk to the persistent storage disk.
In some possible embodiments, the storage pool is a combination of an SSD and an HDD, and the system further includes an intelligent cache module, configured to cache hot data in the cache disk to handle concurrent random I/O access and to automatically adjust the read/write cache ratio of the cache disk according to the service load.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements any of the above software-defined storage (SDS) processing methods for a super-fusion all-in-one machine.
In a fourth aspect, an embodiment of the present invention provides a super-fusion all-in-one machine, which includes: one or more processors; and storage means for storing one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the above software-defined storage (SDS) processing methods for a super-fusion all-in-one machine.
The technical scheme has the following beneficial effects: the software defined storage processing method and the system of the super-fusion all-in-one machine can enable the local SSD and the HDD on a plurality of physical machines to form a virtual storage pool, share the storage load by a plurality of servers and locate the storage information by using the position server, thereby not only improving the reliability, the availability and the access efficiency of the system, but also being easy to expand.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 shows a schematic diagram of virtualized storage of a hyper-convergence all-in-one machine according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the business logic architecture of a system for software defined storage in accordance with an embodiment of the present invention.
Fig. 3 is a flowchart of a software definition storage SDS processing method of the super-fusion all-in-one machine according to the embodiment of the present invention.
Fig. 4 is a functional block diagram of a software definition storage SDS processing system of a super-fusion all-in-one machine according to an embodiment of the present invention.
Fig. 5 is a functional block diagram of a super-fusion all-in-one machine according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The words "a", "an" and "the" as used herein are also intended to include the plural forms unless the context clearly dictates otherwise. Furthermore, the terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a schematic diagram of virtualized storage of a hyper-convergence all-in-one machine according to an embodiment of the present invention. Fig. 1 shows the overall flow of data from the service virtual machine, through front-end and back-end scheduling in the virtual machine hypervisor layer (also referred to as the virtual machine monitor), to the point where the data is finally persisted. The steps are as follows:
step one, a virtual machine (VM) issues a business data write;
step two, the front-end driver sends the data write request to the control domain DOM0 (domain0) on the virtual machine monitor;
step three, after receiving the data write request, DOM0 calls the back-end driver to respond to the request;
step four, DOM0 obtains, through privileged access, the virtual machine memory that needs to be written to disk, and then sends the request to the distributed storage of the virtual machine monitor through the block device driver;
step five, the storage mapping component of the monitor determines the physical or logical storage unit (OSD) to which the data will be written, and the data write begins;
step six, when the write request reaches the master storage device OSD (1), the master OSD forwards the request to the slave storage device OSD (2a); at the same time, the master OSD starts writing the journal to its SSD (2b) and waits for the write confirmation (3a);
step seven, after receiving the instruction from the master OSD, the slave OSD synchronously writes the journal to its SSD (3b) and then waits for the write confirmation (4);
step eight, after receiving its write confirmation, the slave OSD returns a write-success message to the master OSD (5); once the master OSD has received both success signals of the consistent write (3a and 5), it feeds a write-success signal back to the upper layer (6);
step nine, the data write process ends; the data is finally persisted to disk by a flush operation that is triggered according to how long the journal has been held in the SSD and the ratio of new to old journal entries.
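The acknowledgement rule in steps six to eight can be illustrated with a minimal Python sketch. The `OSD` class, its `journal` list, and the `replicated_write` function are hypothetical names introduced only for illustration; the sketch writes the two journals one after the other rather than concurrently, and it omits rollback of a journal entry when the other side fails.

```python
from dataclasses import dataclass, field


@dataclass
class OSD:
    """Hypothetical stand-in for a storage device with an SSD journal."""
    name: str
    journal: list = field(default_factory=list)  # entries journaled on the SSD
    healthy: bool = True

    def write_journal(self, obj_id: str, data: bytes) -> bool:
        # Returns True as the write confirmation once the entry is journaled.
        if not self.healthy:
            return False
        self.journal.append((obj_id, data))
        return True


def replicated_write(master: OSD, slave: OSD, obj_id: str, data: bytes) -> bool:
    """Report success to the upper layer only if both journals confirm."""
    slave_ack = slave.write_journal(obj_id, data)    # steps 2a, 3b, 4, 5
    master_ack = master.write_journal(obj_id, data)  # steps 2b, 3a
    return master_ack and slave_ack                  # step 6
```

The upper layer sees success only when both confirmations arrive, which is the strong-consistency condition described in step eight.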
FIG. 2 is a diagram illustrating the business logic architecture of a system for software defined storage in accordance with an embodiment of the present invention. In the system, the local SSDs and HDDs on a plurality of physical machines form a virtual storage pool, a plurality of x86 servers share the storage load, and a position server locates the storage information, which improves the reliability, availability, and access efficiency of the system and makes it easy to expand.
As shown in Fig. 2, after being fragmented, the file is written to the underlying HDDs in multiple copies, and the copies are spread across different nodes, which avoids data loss caused by a single point of failure.
Specifically, the file is first fragmented, for example into a plurality of objects.
The identifier of the logical placement group (LPG id) to which each object belongs is then obtained through a hash algorithm and a modulo operation, based on the storage pool to which the virtual disk belongs and the object identifier (object id).
The logical placement group identifier is then passed to the storage location algorithm, which returns the corresponding master and slave storage devices (OSDs).
After the data has been written to the cache disk (SSD) of both the master and slave OSDs (a strongly consistent write, which ensures that both writes succeed), a write-success message is returned to the upper-layer service system.
Finally, the data in the SSD is flushed to the persistent storage disk (HDD) according to a certain rule.
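The first stage of this flow, fragmenting a file into objects, might look like the following Python sketch. The function name `split_into_objects`, the object identifier format, and the 4 MiB default object size are assumptions made for illustration; the description does not specify them.

```python
def split_into_objects(file_id: str, data: bytes, object_size: int = 4 * 1024 * 1024):
    """Split a byte string into fixed-size objects, each with a derived object id.

    The id format "<file_id>.<index in hex>" is a hypothetical convention.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    objects = []
    for index, offset in enumerate(range(0, len(data), object_size)):
        object_id = f"{file_id}.{index:08x}"
        objects.append((object_id, data[offset:offset + object_size]))
    return objects
```

Concatenating the object payloads in index order reproduces the original file, so the fragmentation itself loses no information.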
Fig. 3 is a flowchart of a software definition storage SDS processing method of the super-fusion all-in-one machine according to the embodiment of the present invention. As shown in fig. 3, the method includes:
S310: Fragment the file to be stored into a plurality of object data.
S320: Obtain the identifier of the logical placement group to which the object data belongs through a hash algorithm and a modulo operation, according to the storage pool to which the virtual disk belongs and the identifier of the object data.
Logical placement group: after the local SSDs and HDDs on a plurality of physical machines are paired to form a storage pool, the software-defined storage processing method and system logically divide the storage pool into a plurality of logical placement groups. The logical placement group acts as an intermediate layer between the data and the disks: each logical placement group corresponds to several actual HDD disks (OSDs). When data is written, the logical placement group is first determined from the data content, and the OSDs to be written are then obtained from the logical placement group.
In this step, the modulo operation (%) works as follows: assuming there are N nodes, the location of the logical placement group is hash(obj_id) % N, where obj_id is the identifier of the object. This approach is simple and its advantage is clear: random hashing distributes the data uniformly across the N nodes, which eliminates hot spots and balances the load.
As an example, all logical placement groups constitute a logical placement group pool. Assume a logical placement group pool named rbd with 256 logical placement groups, numbered 0x0, 0x1, ..., 0xF, 0x10, 0x11, ..., 0xFE, 0xFF.
For two objects named bar and foo, a random hash is computed over their object names:
HASH('bar') = 0x3E0A4162
HASH('foo') = 0x7FE391A0
HASH('bar') = 0x3E0A4162
The random hash converts each object name into a hexadecimal number. The first and third lines above are identical, which shows that the same object name always produces the same result, even though the output looks like a random number.
The remainder is then taken. Dividing this number by the total number of logical placement groups, 256 (0x100), leaves a remainder in [0x0, 0xFF], which identifies one of the 256 logical placement groups:
0x3E0A4162 % 0x100 = 0x62
0x7FE391A0 % 0x100 = 0xA0
As can be seen from the above, the object bar is stored in logical placement group 0x62, and the object foo is stored in logical placement group 0xA0.
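The hash-then-remainder mapping can be sketched in Python as follows. The description does not name the hash function, so `zlib.crc32` is used here purely as a stand-in, and the function name `placement_group_id` and the way the pool identifier is combined with the object identifier are likewise assumptions. The remainder is taken modulo the number of logical placement groups, 256 (0x100), so the result always lies in [0x0, 0xFF].

```python
import zlib


def placement_group_id(pool_id: int, object_id: str, pg_count: int = 256) -> int:
    """Map (pool, object id) to a logical placement group index in [0, pg_count)."""
    # Assumption: the pool id is mixed into the key by simple concatenation.
    key = f"{pool_id}:{object_id}".encode("utf-8")
    # zlib.crc32 is a stand-in for the unspecified "random hash".
    return zlib.crc32(key) % pg_count


# The worked values from the text, reduced modulo 256:
bar_group = 0x3E0A4162 % 256  # 0x62
foo_group = 0x7FE391A0 % 256  # 0xA0
```

Because 256 is a power of two, the remainder is simply the lowest byte of the hash value, which is why 0x3E0A4162 maps to 0x62 and 0x7FE391A0 maps to 0xA0.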
S330: Pass the logical placement group identifier to the storage location algorithm, which returns the corresponding master and slave storage devices (OSDs), wherein the master and slave storage devices comprise a cache disk and a persistent storage disk.
The storage location algorithm is a scalable pseudo-random data distribution algorithm for software-defined storage. It controls how data is distributed and can distribute data efficiently and stably across a structured cluster. The storage location algorithm has the following features: a decentralized architecture with no metadata server, so read and write performance does not degrade as the cluster grows; in the same environment, similar inputs produce uncorrelated results, while identical inputs always produce the same result; data is distributed as evenly as possible across all hard disks of every node in the cluster; and when the number of storage targets changes because nodes are added or removed, the amount of data migrated within the cluster is kept to a minimum.
In one embodiment, this step includes the following sub-steps (the process of selecting an OSD for a logical placement group, not shown in the figure):
S331: A logical placement group identifier (LPG_ID) is given as the input to the storage location algorithm.
S332: StoreLocation(LPG_ID, OSD_ID, r) produces a random number (it may be based on any pseudo-random algorithm that takes three inputs). In this step, LPG_ID, OSD_ID, and r together serve as the three inputs to StoreLocation, which yields a hexadecimal output; r is a constant.
S333: For every OSD, the random number computed for its OSD_ID is multiplied by the OSD's weight to obtain a product. OSDs may differ in storage capacity, and the capacity of each OSD is used as its weight, expressed in TB: for example, a 4 TB OSD has a weight of 4 and an 800 GB OSD has a weight of 0.8.
S334: The OSD with the largest product is selected.
S335: The logical placement group is stored on the OSD with the largest product.
To select a further OSD, the random numbers are recomputed with r + 1 and multiplied by the weight of each OSD, and the OSD with the largest product is again selected. If it differs from the OSD already chosen, it becomes the second OSD; if it is the same, the selection is repeated with r + 2, and so on, until two different OSDs (the master and slave OSDs) have been selected.
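Sub-steps S331 to S335 and the retry with r + 1, r + 2 can be sketched in Python as below. The text allows any three-input pseudo-random function, so the SHA-256 based `store_location` here is only one possible choice; the starting value r = 0, the `max_tries` cap, and the function names are assumptions added for illustration.

```python
import hashlib


def store_location(lpg_id: int, osd_id: int, r: int) -> int:
    """Deterministic pseudo-random draw from three inputs (hash choice is an assumption)."""
    digest = hashlib.sha256(f"{lpg_id}:{osd_id}:{r}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def select_osds(lpg_id: int, weights: dict, replicas: int = 2, max_tries: int = 100) -> list:
    """Pick distinct OSDs by repeatedly taking the largest weight * draw.

    `weights` maps OSD_ID -> weight in TB (e.g. 4.0 for 4 TB, 0.8 for 800 GB).
    The first OSD chosen is treated as the master, the next as the slave.
    """
    if replicas > len(weights):
        raise ValueError("not enough OSDs for the requested number of replicas")
    chosen = []
    for r in range(max_tries):  # r, r + 1, r + 2, ...
        best = max(weights, key=lambda osd_id: weights[osd_id] * store_location(lpg_id, osd_id, r))
        if best not in chosen:
            chosen.append(best)
        if len(chosen) == replicas:
            return chosen
    raise RuntimeError("could not select enough distinct OSDs")
```

Because every draw depends only on its three inputs, any node can recompute the same master and slave OSDs for a given logical placement group without consulting a metadata server, and larger-capacity OSDs win the comparison proportionally more often.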
S340: Write the object data to the cache disk (SSD) of the master and slave OSDs.
S350: Return a write-success message to the upper-layer service system.
S360: Flush the data in the SSD to the persistent storage disk (HDD).
The software defined storage processing method and the system of the ultra-fusion all-in-one machine can enable local SSD and HDD on a plurality of physical machines to form a virtual storage pool, utilize a plurality of x86 servers to share storage load, and utilize a position server to position storage information, thereby not only improving the reliability, the availability and the access efficiency of the system, but also being easy to expand.
In some possible embodiments, the storage pool is formed by combining an SSD and an HDD, and the method further includes: caching hot data in the SSD to cope with random I/O concurrent access; and automatically adjusting the read-write cache proportion of the SSD according to the service load.
To achieve high performance, the architecture and data path of the software-defined storage processing method and system are designed and optimized specifically for block storage; the read and write paths are kept very simple and consume as few resources as possible.
The storage pool combines SSDs and HDDs to take full advantage of SSD performance. An intelligent caching algorithm caches hot data on the fast SSD to handle concurrent random I/O, and the SSD's read/write cache ratio is adjusted automatically with the service load. For example, in a read-heavy, write-light scenario, the software-defined storage processing method and system can use most of the SSD space as a read cache for hot data, which greatly improves the responsiveness of front-end services.
As an example, in a super-convergence cluster in which each node is configured with two 400 GB SSDs, each node can deliver more than 10,000 IOPS in a mixed read-write scenario, and performance can scale linearly as single-node configurations are upgraded or node servers are added to the cluster.
In some possible embodiments, the method further comprises: when any copy data of any one data block fails to be read, the data is read from other copies and then rewritten into the copy for recovery, so that the total number of the data copies is not reduced.
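The read-and-repair behaviour described here can be sketched in a few lines of Python. In this illustration a replica whose read fails is modelled as `None`, and the function name `read_with_repair` is hypothetical; a real system would detect failures through I/O errors or checksums.

```python
def read_with_repair(copies: list) -> bytes:
    """Return the block, rewriting any unreadable copy from a readable one.

    `copies` holds one entry per replica of the same data block;
    None models a copy whose read failed.
    """
    good = next((c for c in copies if c is not None), None)
    if good is None:
        raise IOError("all copies of the block failed to read")
    for index, copy in enumerate(copies):
        if copy is None:
            copies[index] = good  # rewrite the failed copy so the copy count is preserved
    return good
```

After the call, every entry in `copies` holds valid data again, so the total number of data copies is not reduced.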
In some possible embodiments, the method further comprises: when data inconsistency is caused by the failure of a node or a disk, the copy fragments on different nodes are compared through a self-checking mechanism, the data failure is found, and data recovery is started.
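One way to picture the self-check is to compare a digest of each replica fragment across nodes, as in the Python sketch below. The description only states that the fragments are compared; treating the most common digest as the correct one is an assumption of this sketch, as are the names used. With only two copies that disagree, a majority cannot be formed and a real system would need another source of truth, such as a stored checksum.

```python
import hashlib
from collections import Counter


def find_inconsistent_replicas(fragments: dict) -> list:
    """Return the nodes whose fragment differs from the majority.

    `fragments` maps node name -> bytes of the same replica fragment.
    Assumption: the most common digest is taken to be the correct one.
    """
    digests = {node: hashlib.sha256(data).hexdigest() for node, data in fragments.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(node for node, digest in digests.items() if digest != majority_digest)
```

Any node returned by this check would then be a candidate for the data recovery step.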
In some possible embodiments, the method further comprises: detecting whether a bad disk exists, automatically isolating the bad disk when one is detected, and automatically rebuilding the data copies on the bad disk in parallel; the disk or cache disk has a hard disk locator light.
In some possible embodiments, the method further comprises: detecting whether the storage pool contains a slow disk according to the average read rate, the average write rate and the average I/O latency level, and, if a slow disk is found, raising an alarm and migrating the data on the slow disk to a hot spare disk. Specifically, parameters such as the cumulative total number of read-write accesses and the total time consumed by I/O accesses in the system are recorded. From these indicators, the average read rate, average write rate, average I/O latency level, etc. of the hard disk can be calculated. The average I/O latency (svc_tm), i.e. the average service time of each device I/O operation in milliseconds, is the cumulative time consumed by I/O accesses over a period divided by the number of I/O accesses in that period. It can be used to evaluate the I/O performance of the specified partition over that period.
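The average-latency calculation can be sketched as follows. This is an illustration only: the `DiskCounters` type, the function names and the 50 ms threshold are assumptions, and the sketch uses only the latency indicator, not the read and write rates that the text also mentions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DiskCounters:
    io_count: int      # cumulative number of I/O accesses
    io_time_ms: float  # cumulative time consumed by I/O accesses, in ms


def average_latency_ms(start: DiskCounters, end: DiskCounters) -> float:
    """Cumulative I/O time in the period divided by I/O count in the period."""
    ios = end.io_count - start.io_count
    if ios <= 0:
        return 0.0  # idle disk: no latency to report
    return (end.io_time_ms - start.io_time_ms) / ios


def find_slow_disks(samples: Dict[str, Tuple[DiskCounters, DiskCounters]],
                    threshold_ms: float = 50.0) -> List[str]:
    """Return names of disks whose average latency exceeds the threshold."""
    return sorted(name for name, (start, end) in samples.items()
                  if average_latency_ms(start, end) > threshold_ms)
```

A caller would sample the counters of each disk at the start and end of a period, pass the pairs to `find_slow_disks`, and then raise the alarm and start the migration to the hot spare disk for every name returned.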
Examples of the present invention are described in more detail below:
(1) High reliability
In a real environment, it must be assumed that hardware is not absolutely reliable: a disk may be damaged, a server may go down, a network may fail, and so on. To handle these unpredictable hardware errors and ensure data integrity and service availability, the software-defined storage processing method and system compensate for the data reliability and availability problems caused by unreliable hardware through a series of software-level reliability designs, such as a fully redundant design.
The software-defined storage processing method and system use the following mechanisms to ensure high data reliability:
A policy-based multi-copy redundancy mechanism: data and its copies are stored across hard disks, storage nodes and racks; a strongly consistent replication technique keeps the data copies consistent, so that the reliability and availability of the data are not affected even if one super-fusion node or an entire rack goes down.
A read repair mechanism: when reading one copy of a data block fails, the data is read from another copy and then rewritten to the failed copy, so that the total number of data copies is not reduced.
Automatic data reconstruction: when a node or disk failure causes data inconsistency, an internal self-checking mechanism compares the copy fragments on different nodes, automatically discovers the data fault, starts the data recovery mechanism, and recovers the data in the background.
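The comparison of copy fragments across nodes can be sketched as below. The source does not say how fragments are compared; this sketch assumes SHA-256 digests and a majority vote, and the function name `find_inconsistent_replicas` is hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def find_inconsistent_replicas(fragments: Dict[str, bytes]) -> List[str]:
    """Return the nodes whose copy differs from the majority copy.

    fragments maps a node name to the bytes of its copy of one fragment.
    Assumption: the most common digest is treated as the correct one.
    """
    digests = {node: hashlib.sha256(data).hexdigest()
               for node, data in fragments.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(node for node, digest in digests.items()
                  if digest != majority_digest)
```

The nodes returned would then be handed to the background recovery mechanism. With only two copies a majority vote cannot decide which copy is correct, so a real system would need an additional signal such as a stored checksum or version number.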
(2) High stability
The software-defined storage processing method and system ensure data availability by adopting a decentralized architecture, a multi-pair local mechanism and strong consistency. In terms of performance, all files are fragmented based on the storage location algorithm, the fragments are distributed evenly across all disks of the whole cluster, and all disks can serve I/O simultaneously.
The architecture of the software-defined storage processing method and system is designed so that, even when the SSDs are fully loaded, all underlying HDDs can serve I/O simultaneously; performance is not limited by any one or several HDDs, which keeps cluster performance as stable as possible.
(3) Super-fusion architecture HCI data persistence to disk
The software-defined storage processing method and system first fragment the file, dividing it into a plurality of objects; the logical placement group is then obtained through a hash algorithm and a modulo operation according to the Pool to which the virtual disk belongs and the object_id; the value of the logical placement group is then passed to the storage location algorithm, which returns the corresponding master OSD and slave OSD; after the data has been written to the cache disk SSD corresponding to the master and slave OSDs (a strongly consistent write, which ensures that both the master OSD and the slave OSD are written successfully), a write-success message is returned to the upper-layer service system; finally, the software-defined storage processing method and system flush the data in the SSD to the persistent storage HDD according to a certain rule.
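The write path above can be sketched end to end as follows. This is an illustration under stated assumptions, not the patented implementation: SHA-256 stands in for the unspecified hash algorithm, `locate_osds` is a simple round-robin stand-in for the storage location algorithm, and the `Osd` class, the object naming scheme and the default sizes are hypothetical.

```python
import hashlib
from typing import Dict, List


def split_into_objects(data: bytes, object_size: int) -> List[bytes]:
    """Fragment a file into fixed-size objects."""
    return [data[i:i + object_size] for i in range(0, len(data), object_size)]


def placement_group(pool_id: int, object_id: str, pg_count: int) -> int:
    """Hash the pool and object identifiers, then take the modulo."""
    digest = hashlib.sha256(f"{pool_id}:{object_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % pg_count


class Osd:
    """Hypothetical OSD with an SSD cache and an HDD backing store."""

    def __init__(self, name: str):
        self.name = name
        self.ssd_cache: Dict[str, bytes] = {}
        self.hdd: Dict[str, bytes] = {}

    def write_cache(self, object_id: str, data: bytes) -> bool:
        self.ssd_cache[object_id] = data
        return True

    def flush(self) -> None:
        """Flush cached objects from the SSD to the HDD."""
        self.hdd.update(self.ssd_cache)
        self.ssd_cache.clear()


def locate_osds(pg_id: int, osds: List[Osd], copies: int = 2) -> List[Osd]:
    """Stand-in placement: the first entry is the master, the rest are slaves."""
    start = pg_id % len(osds)
    return [osds[(start + i) % len(osds)] for i in range(copies)]


def write_file(name: str, data: bytes, pool_id: int, osds: List[Osd],
               pg_count: int = 128, object_size: int = 4) -> bool:
    """Return True (write success) only if every copy reached an SSD cache."""
    for index, chunk in enumerate(split_into_objects(data, object_size)):
        object_id = f"{name}.{index}"
        pg_id = placement_group(pool_id, object_id, pg_count)
        targets = locate_osds(pg_id, osds)
        # Strongly consistent write: master and slave must both succeed.
        if not all(osd.write_cache(object_id, chunk) for osd in targets):
            return False
    return True
```

In this sketch the acknowledgement to the upper layer is the return value of `write_file`, and `Osd.flush` would be called later by whatever rule governs flushing from SSD to HDD; that rule is not described in the source.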
(4) Technical effects of the super-fusion architecture HCI
The super-fusion architecture integrates virtualization and storage in x86 servers and does not use centralized storage to store data, which eliminates the implementation and operation-and-maintenance complexity of the traditional architecture;
the distributed storage pool of the super-fusion architecture HCI is not wrapped in multiple layers of virtualization, so the data write path is shorter and the performance loss is greatly reduced;
the HCI uses SSDs as the I/O cache space, which is much larger than the 8 GB or 16 GB cache of traditional storage, and its performance is more stable under high load;
each x86 server acts as both a storage controller head and a storage path, giving better performance in concurrent I/O scenarios;
data is distributed in multiple copies, and the copies of the data are always stored on hard disks of other servers, which ensures that data is not lost when a single server fails;
the intelligent storage location algorithm keeps the space utilization of all hard disks as consistent as possible and avoids serious imbalance;
the super-fusion cluster supports single-node capacity expansion; data is automatically rebalanced after expansion with minimal data migration; bad disks are automatically isolated and data copies are automatically rebuilt in parallel, ensuring that the data keeps its complete number of copies;
hard disk locator lights, SSD lifetime monitoring, slow disk detection, data reconstruction flow control and the like are supported;
the super-fusion architecture HCI can flexibly meet the dynamic IT resource requirements of today's growing enterprises, can be built and expanded on demand, and offers high availability, high performance, high reliability and high stability.
Fig. 4 is a functional block diagram of a software definition storage SDS processing system 400 of a hyper-converged infrastructure machine according to an embodiment of the present invention. As shown in fig. 4, the processing system 400 includes:
the fragmentation module 410 is configured to perform fragmentation processing on a file to be stored, and divide the file into a plurality of object data;
the hash module 420 is configured to obtain an identifier of the logical placement group to which the virtual disk belongs through a hash algorithm and modulo according to the identifier of the storage pool to which the virtual disk belongs and the identifier of the object data;
a storage location module 430, configured to obtain an identifier of the logical placement group, and obtain a corresponding master OSD and a slave OSD according to a storage location algorithm;
a write module 440, configured to write the object data into a cache disk SSD corresponding to the master OSD and the slave OSD;
the feedback module 450 is configured to return information that writing is successful to the upper layer service system;
and the flushing module 460 is used for flushing the data in the SSD to the disk of the persistent storage HDD.
In some embodiments, the storage pool is a combination of an SSD and an HDD, and the system further includes an intelligent cache module for caching hot data in the SSD to handle random I/O concurrent accesses; and automatically adjusting the read-write cache proportion of the SSD according to the service load.
In some embodiments, the system further comprises a copy recovery module, configured to: when reading any copy of any data block fails, read the data from another copy and rewrite it to the failed copy for recovery, so that the total number of data copies is not reduced.
In some embodiments, the system further comprises a read repair module: when data inconsistency is caused by node or disk failure, the copy fragments on different nodes are compared through a self-checking mechanism, data failure is found, and data recovery is started.
In some embodiments, the system further comprises a bad disk detection module, configured to detect whether a bad disk exists, automatically isolate the bad disk when one is detected, and automatically rebuild the data copies on the bad disk in parallel; the disk or cache disk has a hard disk locator light.
In some embodiments, the system further comprises a slow disk detection module, configured to detect whether the storage pool contains a slow disk according to the average read rate, the average write rate and the average I/O latency level, and, if a slow disk is found, to raise an alarm and migrate the data on the slow disk to a hot spare disk.
Based on the same inventive concept, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the software-defined storage SDS processing method of any one of the above-mentioned hyper-fusion all-in-one machines.
The embodiment of the present invention further provides a super-fusion all-in-one machine, as shown in fig. 5, including one or more processors 501, a communication interface 502, a memory 503 and a communication bus 504, where the processors 501, the communication interface 502 and the memory 503 communicate with one another through the communication bus 504. The memory 503 is used for storing a computer program; the processor 501, when executing the program stored in the memory 503, implements any one of the software-defined storage SDS processing methods described above. The communication bus mentioned for the electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus. The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

Claims (10)

1. A software definition storage SDS processing method of a hyper-fusion all-in-one machine is characterized by comprising the following steps:
the method comprises the steps of carrying out fragmentation processing on a file to be stored, and dividing the file into a plurality of object data;
obtaining the identifier of the logical placement group to which the virtual disk belongs through a Hash algorithm and a modulus according to the identifier of the storage pool to which the virtual disk belongs and the identifier of the object data;
transmitting the identification of the logic placement group to a storage location algorithm, and obtaining corresponding master and slave storage equipment by the storage location algorithm, wherein the master and slave storage equipment comprises a cache disk and a persistent storage disk;
writing the object data into a cache disk corresponding to the master-slave storage equipment;
returning the information of successful writing to the service system of the upper layer;
and flushing the data in the cache disk to a persistent storage disk.
2. The method of claim 1, wherein the cache disk is an SSD and the persistent storage disk is an HDD, the method further comprising: caching hot data into the cache disk to cope with random I/O concurrent access; and automatically adjusting the read-write cache proportion of the cache disk according to the service load.
3. The method of claim 1, further comprising: when any copy data of any one data block fails to be read, the data is read from other copies and then rewritten into the copy for recovery, so that the total number of the data copies is not reduced.
4. The method of claim 1, further comprising: when data inconsistency is caused by the failure of a node or a disk, the copy fragments on different nodes are compared through a self-checking mechanism, the data failure is found, and data recovery is started.
5. The method of claim 1, further comprising: detecting whether a bad disk exists, isolating the bad disk when the bad disk is detected, and automatically reconstructing the data copy on the bad disk in parallel; the magnetic disk or the cache disk is provided with a hard disk positioning lamp.
6. The method of claim 1, further comprising: and detecting whether the storage pool has a slow disk or not according to the average reading rate, the average writing rate and the average I/O delay level, and if the storage pool has the slow disk, giving an alarm and transferring the data in the slow disk to a hot standby disk.
7. A Software Definition Storage (SDS) processing system of a hyper-fusion all-in-one machine is characterized by comprising:
the fragmentation module is used for carrying out fragmentation processing on the file to be stored and dividing the file into a plurality of object data;
the hash module is used for obtaining the belonged logic placement group identification through a hash algorithm and a modulus according to the storage pool to which the virtual disk belongs and the identification of the object data;
the storage location module is used for acquiring the logical placement group identifier and obtaining corresponding master and slave storage equipment according to a storage location algorithm, wherein the master and slave storage equipment comprises a cache disk and a persistent storage disk;
the writing module is used for writing the object data into a cache disk corresponding to the master storage device and the slave storage device;
the feedback module is used for returning the information of successful writing to the service system of the upper layer;
and the flushing module is used for flushing the data in the cache disk into a persistent storage disk.
8. The system of claim 7, further comprising an intelligent cache module to cache hot data into the cache disk to handle random I/O concurrent accesses; and the read-write cache proportion of the cache disk is automatically adjusted according to the service load.
9. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements a method for processing a software-defined storage SDS for a hyper-fusion all-in-one machine as claimed in any of claims 1-6.
10. A super-fusion all-in-one machine, characterized by comprising: one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the Software Defined Storage (SDS) processing method of the hyper-fusion all-in-one machine of any of claims 1-6.
CN202010850653.4A 2020-08-21 2020-08-21 Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof Pending CN111949217A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010850653.4A CN111949217A (en) 2020-08-21 2020-08-21 Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010850653.4A CN111949217A (en) 2020-08-21 2020-08-21 Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof

Publications (1)

Publication Number Publication Date
CN111949217A true CN111949217A (en) 2020-11-17

Family

ID=73359499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010850653.4A Pending CN111949217A (en) 2020-08-21 2020-08-21 Super-fusion all-in-one machine and software definition storage SDS processing method and system thereof

Country Status (1)

Country Link
CN (1) CN111949217A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103503414A (en) * 2012-12-31 2014-01-08 华为技术有限公司 Computing storage integration cluster system
CN105549915A (en) * 2015-12-25 2016-05-04 曙光信息产业(北京)有限公司 Bad disk block isolation method and system
US20210216245A1 (en) * 2018-07-10 2021-07-15 Here Data Technology Method of distributed data redundancy storage using consistent hashing
CN110427308A (en) * 2019-07-26 2019-11-08 新华三技术有限公司成都分公司 A kind of hard disk localization method, device, electronic equipment and storage medium
CN111031096A (en) * 2019-11-15 2020-04-17 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Distributed storage system construction method based on mimicry defense
CN111158587A (en) * 2019-12-10 2020-05-15 南京道熵信息技术有限公司 Distributed storage system based on storage pool virtualization management and data read-write method
CN111339192A (en) * 2020-02-21 2020-06-26 深圳供电局有限公司 Distributed edge computing data storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘荣辉: "大数据架构技术与实例分析", 东北师范大学出版社 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116302680A (en) * 2023-01-15 2023-06-23 北京志凌海纳科技有限公司 Recovery system and method for reducing fault influence of super fusion system
CN116302680B (en) * 2023-01-15 2024-01-23 北京志凌海纳科技有限公司 Recovery system and method for reducing fault influence of super fusion system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201117