CN109992204A - Data storage method and device - Google Patents
Data storage method and device
- Publication number
- CN109992204A (application number CN201910209554.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- copy
- storage
- storage pool
- hot spot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1004—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's to protect a block of data words, e.g. CRC or checksum
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0644—Management of space entities, e.g. partitions, extents, pools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45579—I/O management, e.g. providing access to device drivers or storage
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure provides a data storage method and device. When hot spot data is stored, the primary copy of the hot spot data is stored in the SSD storage pool of a copy storage pool and the subordinate copies are stored in the HDD storage pool of the copy storage pool, according to a preset copy storage strategy. When non-hot-spot data is stored, it is stored in the HDD storage pool according to a preset erasure code storage strategy. In this way, when each VDI virtual machine starts, the primary copy of the hot spot data is read from the SSD storage pool, which satisfies the large volume of storage I/O requests and effectively solves the VDI boot storm problem, while improving SSD utilization and effectively reducing SSD cost.
Description
Technical field
The present disclosure relates to the technical field of data storage, and in particular to a data storage method and device.
Background art
VDI (Virtual Desktop Infrastructure) refers to a method of providing remote desktop services to users by running virtual user desktops in virtual machines in a data center. Currently, when a large number of users request remote desktop services at the same time, a large number of virtual machines start simultaneously within a short time, which causes very intensive storage I/O (Input/Output) access behavior and results in a VDI boot storm. When a VDI boot storm occurs, if the distributed storage system cannot handle such an intensive I/O load well, it will eventually be unable to provide the VDI service, and the user's VDI experience will deteriorate sharply.
Summary of the invention
In order to overcome the above deficiency in the prior art, the purpose of the present disclosure is to provide a data storage method and device to solve or mitigate the above problem.
To achieve the above goal, the embodiments of the present disclosure adopt the following technical solutions:
In a first aspect, the present disclosure provides a data storage method applied to a data storage system. The data storage system includes a copy storage pool and an erasure code storage pool; the copy storage pool includes an SSD storage pool and an HDD storage pool, and the erasure code storage pool also includes the HDD storage pool. The method comprises:
obtaining data to be stored, where the data to be stored includes hot spot data shared for reading and writing by multiple VDI virtual machines and non-hot-spot data, and where, when each VDI virtual machine starts, the I/O load of reading the hot spot data is greater than the I/O load of reading the non-hot-spot data;
storing, according to a preset copy storage strategy, the primary copy of the hot spot data in the SSD storage pool of the copy storage pool, and storing the subordinate copies in the HDD storage pool of the copy storage pool; and
storing the non-hot-spot data in the erasure code storage pool according to a preset erasure code storage strategy.
In a possible embodiment, the step of storing, according to the preset copy storage strategy, the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool comprises:
dividing the hot spot data into multiple hot spot data blocks according to a preset data block size, and mapping each hot spot data block into a corresponding data management group;
storing, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool, and storing the subordinate copies in the HDD storage pool of the copy storage pool.
In a possible embodiment, the step of storing, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool comprises:
for each data management group, according to the copy quantity and the copy storage rule configured by the preset copy storage strategy, using the CRUSH algorithm to determine, from the SSD storage pool of the copy storage pool, the primary OSD corresponding to the primary copy of each hot spot data block in the data management group, and to determine, from the HDD storage pool of the copy storage pool, at least one subordinate OSD corresponding to the subordinate copies of each hot spot data block in the data management group, where the sum of the numbers of primary copies and subordinate copies equals the copy quantity, and the copy storage rule includes the storage mapping relationship between primary copies and the SSD storage pool and the storage mapping relationship between subordinate copies and the HDD storage pool;
according to the storage mapping relationship between primary copies and the SSD storage pool and the storage mapping relationship between subordinate copies and the HDD storage pool, storing the primary copy of each hot spot data block in the data management group in the primary OSD, and storing each subordinate copy in a corresponding subordinate OSD.
In a possible embodiment, the step of storing the non-hot-spot data in the erasure code storage pool according to the preset erasure code storage strategy comprises:
dividing the non-hot-spot data into multiple non-hot-spot data blocks according to the preset data block size, and mapping each non-hot-spot data block into a corresponding data management group;
for each data management group, encoding each non-hot-spot data block in the data management group according to the original data fragment quantity m, the redundant data fragment quantity n, and the erasure code storage rule configured by the preset erasure code storage strategy, to obtain m original data fragments and n redundant data fragments of each non-hot-spot data block, where the erasure code storage rule includes the storage mapping relationship between original data fragments and redundant data fragments and the HDD storage pool;
using the CRUSH algorithm to determine, from the HDD storage pool of the erasure code storage pool, the OSDs corresponding to the m original data fragments and the n redundant data fragments, where m + n = k, m, n and k are positive integers, and m > n;
according to the storage mapping relationship between original data fragments and redundant data fragments and the HDD storage pool, storing the m original data fragments and the n redundant data fragments in the corresponding OSDs.
In a possible embodiment, the method further comprises:
when each VDI virtual machine starts, reading the primary copy of each data management group corresponding to the hot spot data from the SSD storage pool to complete the start-up of the VDI desktop.
In a second aspect, the embodiments of the present disclosure further provide a data storage device applied to a data storage system. The data storage system includes a copy storage pool and an erasure code storage pool; the copy storage pool includes an SSD storage pool and an HDD storage pool, and the erasure code storage pool also includes the HDD storage pool. The device comprises:
a data acquisition module, configured to obtain data to be stored, where the data to be stored includes hot spot data shared for reading and writing by multiple VDI virtual machines and non-hot-spot data, and where, when each VDI virtual machine starts, the I/O load of reading the hot spot data is greater than the I/O load of reading the non-hot-spot data;
a hot spot data storage module, configured to store the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool according to the preset copy storage strategy; and
a non-hot-spot data storage module, configured to store the non-hot-spot data in the erasure code storage pool according to the preset erasure code storage strategy.
In a third aspect, the embodiments of the present disclosure further provide a readable storage medium storing a computer program, and the computer program, when executed, implements the above data storage method.
Compared with the prior art, the present disclosure has the following advantages:
In the data storage method and device provided by the present disclosure, when hot spot data is stored, the primary copy of the hot spot data is stored in the SSD storage pool of the copy storage pool and the subordinate copies are stored in the HDD storage pool of the copy storage pool according to the preset copy storage strategy. In addition, when non-hot-spot data is stored, it is stored in the HDD storage pool according to the preset erasure code storage strategy. In this way, when each VDI virtual machine starts, the primary copy of the hot spot data is read from the SSD storage pool, which satisfies the large volume of storage I/O requests, effectively solves the VDI boot storm problem, improves SSD utilization and effectively reduces SSD cost.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present disclosure and therefore should not be regarded as limiting the scope; for those of ordinary skill in the art, other related drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic structural block diagram of the data storage system provided by an embodiment of the present disclosure;
Fig. 2 is the first flow diagram of the data storage method provided by an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of the storage process of hot spot data provided by an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of the storage process of non-hot-spot data provided by an embodiment of the present disclosure;
Fig. 5 is the second flow diagram of the data storage method provided by an embodiment of the present disclosure;
Fig. 6 is the first functional block diagram of the data storage device provided by an embodiment of the present disclosure;
Fig. 7 is the second functional block diagram of the data storage device provided by an embodiment of the present disclosure.
Detailed description of embodiments
The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The components of the embodiments of the present disclosure, as generally described and illustrated in the drawings herein, may be arranged and designed in a variety of different configurations.
Therefore, the following detailed description of the embodiments of the present disclosure provided in the drawings is not intended to limit the claimed scope of the present disclosure, but merely represents selected embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present disclosure.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings.
VDI can bring many benefits to an IT department, such as simpler system administration and more centralized security and data protection. As noted in the background above, one major problem with VDI is the VDI boot storm. A VDI boot storm occurs when a large number of VDI virtual machines start simultaneously within a short time (for example, between 8 and 9 a.m.); the large volume of storage I/O requests generated in this way can easily bring a distributed storage system to a halt. The inventors found during research that the storage I/O of VDI usually has inherent characteristics. For example, a virtual machine issues intensive read I/O requests at start-up, while after start-up finishes, its storage I/O usually remains low. Typically, a VDI virtual machine generates 50-100 IOPS (Input/Output Operations Per Second) at start-up, dominated by read I/O operations; once it runs steadily after start-up, the average IOPS drops to 5-10. Therefore, how to design a distributed storage system that can meet the intensive storage I/O demand generated by a VDI boot storm is a major challenge in this field.
On this basis, the inventors further found during research that a distributed storage system architecture designed to meet the IOPS demand of a VDI boot storm usually requires high cost. For example, in order to improve the IOPS processing capability of a distributed storage system, more physical disks need to be added so that the I/O load can be distributed over more disks. This means that the distributed storage system will have a large amount of disk capacity exceeding actual demand, resulting in unnecessary cost overhead.
Currently, in order to solve the VDI boot storm problem, an SSD (Solid State Disk) solution may be considered. Compared with traditional mechanical disks, SSDs have much higher I/O storage performance. For example, the maximum IOPS of a traditional 15000-rpm SAS (Serial Attached SCSI) disk is about 200, while an SSD can easily reach thousands of IOPS. Of course, the improvement in SSD I/O performance also comes with higher cost. If the distributed storage system used for VDI desktops is built entirely from SSDs, the cost to the user is obviously very high.
For this reason, after careful study, the inventors considered that a small number of SSDs could be used to carry the large volume of storage I/O generated during a VDI boot storm. With such a design, a large number of lower-cost SAS or SATA disks can be used to meet the capacity requirement of distributed storage, while a small number of SSDs meet the performance requirement of the I/O load during a VDI boot storm. To this end, the industry generally adopts the following schemes.
In scheme one, the master image data of the VDI virtual machines and its copy files are stored in the SSD storage pool. The master image data can be read by all VDI virtual machines in a shared manner and can be understood as the master copy of the desktop operating system of all deployed VDI desktops; each VDI virtual machine keeps a writable snapshot to store its own modifications to the master image separately.
When a VDI virtual machine starts, most of the disk activity comes from the master image data, that is, the master image data contains most of the operating system files and application files of the VDI virtual machines. Therefore, by storing the master image data and its copy files in the SSD storage pool, the VDI boot storm can be eliminated. Meanwhile, other ordinary data such as user data files and snapshots can be stored on a lower-tier storage pool (such as SAS or SATA).
However, the inventors found through careful study that when scheme one stores the master image data in the SSD storage pool, multiple copies of the master image data (for example 3 copies or 4 copies) are also stored in the SSD storage pool, which reduces the utilization of the SSD storage pool. For example, for a 100 TB SSD storage pool, if the master image data is stored with 3 copies, the available capacity is only about 33 TB (one third of 100 TB). Since SSDs are expensive, the cost control of scheme one is still poor.
In scheme two, SSDs are used as a cache layer placed in front of the storage pool composed of SAS and SATA disks, so that all storage I/O of the VDI virtual machines passes through the SSD cache layer before reaching the back-end storage pool composed of SAS and SATA disks. With this design, the frequently accessed hot spot data can be identified by the SSD cache layer and cached automatically, so that hot spot data can be read directly from the SSD cache layer instead of from the back-end storage pool. When multiple VDI virtual machines start together within a short time, the master image data, being hot spot data, will be cached in the SSD cache layer, which can eliminate the VDI boot storm.
However, the inventors found through careful study that in scheme two, before the master image data enters the SSD cache layer, it still has to be read from the back-end storage pool first, so the virtual desktops of the VDI virtual machines that start earlier take longer to boot. For the user, the actual experience is that the virtual desktop sometimes starts slowly and sometimes starts quickly, and a consistently fast start-up experience cannot be achieved.
For this reason, based on the above technical problems, the inventors of the present disclosure propose the following technical solutions to solve or mitigate the above problems. It should be noted that the defects of the above prior-art schemes are results obtained by the inventors through practice and careful study; therefore, the discovery process of the above problems and the solutions proposed below in the embodiments of the present disclosure for the above problems should all be regarded as contributions made by the inventors to the present disclosure in the course of the invention.
Referring to Fig. 1, which is a schematic structural block diagram of the data storage system 100 provided by an embodiment of the present disclosure, the data storage system 100 may be a distributed storage cluster including an SSD storage pool and an HDD (Hard Disk Drive) storage pool. The SSD storage pool and the HDD storage pool together form a copy storage pool, so that data can be stored in copy form in both the SSD storage pool and the HDD storage pool. In addition, the HDD storage pool also forms an erasure code storage pool, so that data can be stored in the HDD storage pool in erasure-coded form. The data storage system 100 can perform storage management on the copy storage pool and the erasure code storage pool respectively through a distributed storage management tool.
The SSD storage pool includes the SSD disks of all storage nodes, and the HDD storage pool includes the HDD disks of all storage nodes. It is worth noting that, in a specific design, the SSD disks and HDD disks in each storage node can be configured according to a design ratio (for example 1:10).
When defining the SSD storage pool and the HDD storage pool, a unique identifier is first defined for each OSD (Object Storage Device) disk of the distributed storage cluster. Then, two classes of logical hosts are defined: HDD hosts and SSD hosts, so that an HDD host contains HDD disks and an SSD host contains SSD disks. Next, the SSD storage pool and the HDD storage pool are defined, so that the SSD storage pool contains the SSD hosts defined above and the HDD storage pool contains the HDD hosts defined above. This defines the OSD disks contained in the SSD storage pool and the HDD storage pool, as well as the tree-like containment relationship from OSD disks to the corresponding hosts and from hosts to the corresponding storage pools, namely the cluster topology information of the distributed storage cluster.
Optionally, the HDD disks may include, but are not limited to, ATA (Advanced Technology Attachment) disks, SATA (Serial ATA) disks, SCSI (Small Computer System Interface) disks, SAS (Serial Attached SCSI) disks and the like, and this embodiment places no limitation on this.
An OSD is mainly used to store data, replicate data, balance data, and recover data, to perform heartbeat checks with other OSDs, and to report status changes to the management software of the distributed storage cluster. Usually, one disk corresponds to one OSD, which manages and controls the storage of that disk.
Optionally, LUN (Logical Unit Number) resources can also be configured in the SSD storage pool and the HDD storage pool and mounted to the VDI virtual machines, so that the VDI virtual machines use these LUN resources to store related data. One LUN can correspond to multiple of the aforementioned OSDs; each OSD is equivalent to a physical disk, and the LUN can be regarded as the logical storage medium composed of these physical disks.
In detail, a LUN is an individual storage unit on the distributed storage cluster that can be recognized by an application server; one LUN can be regarded as a usable disk. For example, in a Linux system, it has a corresponding device name under the /dev/dsk directory; in a Windows system, a LUN corresponds to a drive similar to local disk D, local disk E or local disk F.
The data storage method shown in Fig. 2 is described in detail below; this data storage method is executed by the data storage system 100 shown in Fig. 1. It should be understood that, in other embodiments, the order of some steps of the data storage method of this embodiment can be exchanged according to actual needs, or some of the steps can be omitted or deleted. The detailed steps of the data storage method are described below.
Step S110: obtain data to be stored.
In this embodiment, the data to be stored may include hot spot data shared for reading and writing by multiple VDI virtual machines and non-hot-spot data. When each VDI virtual machine starts, the I/O load of reading the hot spot data is greater than the I/O load of reading the non-hot-spot data.
For example, the hot spot data may include the master image data of each VDI virtual machine, such as the operating system data and system application data necessary for running the VDI desktop; intensive storage I/O requests for these data are initiated when each VDI virtual machine starts.
The non-hot-spot data may include ordinary user data unrelated to the daily use of the operating system, such as file data, audio and video data, and third-party application data; storage I/O requests for these data are initiated according to user demand after each VDI virtual machine has started.
Step S120: determine the storage pool for the data to be stored according to the data type of the data to be stored.
In detail, for hot spot data, the hot spot data can be divided into multiple hot spot data blocks according to a preset data block size, each hot spot data block is mapped into a corresponding data management group, and then, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group is stored in the SSD storage pool of the copy storage pool and the subordinate copies are stored in the HDD storage pool of the copy storage pool.
As a possible implementation, referring to Fig. 3, the hot spot data can be cut into hot spot data blocks of a configured fixed length (for example 4 MB). For each hot spot data block, a data block identifier is generated from the data identifier of the hot spot data and the offset of the data block within the hot spot data, and the hot spot data block is then mapped into a corresponding data management group using a hash algorithm based on its data block identifier.
In detail, taking a Ceph cluster as an example of the distributed storage cluster, the mapping from hot spot data to hot spot data blocks and the mapping from hot spot data blocks to data management groups are described below. A data management group is a combination of data and a set of data attributes and can manage itself. A data management group is a logical collection of several data blocks; to ensure data reliability, each data block can be replicated to multiple OSDs, where an OSD is composed of a physical disk drive that already carries a Linux file system and an OSD service. Once an application accesses the Ceph cluster and performs a write operation, each data block in a data management group is stored in the OSDs in the form of objects. Normally, to guarantee the safety and reliability of data storage, after the hot spot data is cut into multiple hot spot data blocks, each hot spot data block is stored in the multiple OSDs corresponding to its data management group. For ease of understanding, a specific implementation of the above mapping process is illustrated below:
Mapping from hot spot data to hot spot data blocks: when a piece of hot spot data is written, it is first cut up, that is, hot spot data of arbitrary size is cut into one or more hot spot data blocks of uniform size that can be efficiently managed by RADOS (Reliable Autonomic Distributed Object Store, the distributed object storage layer), so that the serial processing of a single piece of hot spot data can be turned into parallel processing of multiple hot spot data blocks. After the hot spot data is cut into multiple hot spot data blocks, each hot spot data block obtains a unique hot spot data block identifier by linear mapping; the identifier may include the identifier of the hot spot data to be written and the sequence number of the hot spot data block. For example, assume that the unique identifier of a hot spot data block is oid; oid can be composed of ino and ono, where ino is the metadata of the hot spot data to be operated on, namely the data identifier of the hot spot data, and ono is the sequence number of the hot spot data block generated by cutting the hot spot data.
Mapping from hot spot data blocks to data management groups PG (Placement Group): the mapping from a hot spot data block to a PG can be realized by a hash algorithm: Hash(oid) & mask --> PG_ID. For example, the static hash function specified by Ceph is first used to calculate the hash value of the aforementioned oid, mapping the oid to a pseudo-random number with an approximately uniform distribution; then the pseudo-random number is bitwise ANDed with the mask to obtain PG_ID. If the given total number of PGs is m (for example, m can be an integer power of 2), the value of the mask is m-1. The mask is the mask value used by the hash algorithm when computing the mapping from hot spot data blocks to PGs and indicates the range in which the final result of the hash calculation falls. In one possible implementation, the mask can be the total number of PGs minus one. For example, if the total number of PGs is 1024, the PG numbers are 0-1023, so the final result of the mapping should fall within 0-1023, and mask = 1023 is used. Converting the pseudo-random number produced by the hash algorithm and the mask into binary and performing a bitwise AND better ensures that, when the number of hot spot data blocks is large, they are mapped uniformly over the PGs.
A PG is a conceptual container of hot spot data blocks; it exists virtually in the Ceph distributed data storage system 100 and is used to organize and map the storage locations of hot spot data blocks. Usually, one PG is responsible for organizing several hot spot data blocks (depending on the storage scale, each PG may be responsible for organizing thousands to tens of thousands of data blocks), while a hot spot data block can only be mapped to one PG, i.e. the mapping between PGs and hot spot data blocks is one-to-many. By setting the number of PGs reasonably, the uniformity of the data mapping can be ensured. Ceph can thus manage the contained hot spot data blocks uniformly in units of PGs. In addition, a PG may have different addresses in different distributed storage systems.
On the above basis, for each data management group, according to the copy quantity and copy storage rule configured by the preset copy storage strategy, the CRUSH (Controlled Replication Under Scalable Hashing, a pseudo-random data distribution) algorithm is used to determine, from the SSD storage pool of the copy storage pool, the primary OSD corresponding to the primary copy of each hot spot data block in the data management group, and to determine, from the HDD storage pool of the copy storage pool, at least one subordinate OSD corresponding to the subordinate copies of each hot spot data block in the data management group. The sum of the numbers of primary copies and subordinate copies equals the copy quantity, and the copy storage rule includes the storage mapping relationship between primary copies and the SSD storage pool and the storage mapping relationship between subordinate copies and the HDD storage pool.
Then, according to the storage mapping relationship between primary copies and the SSD storage pool and the storage mapping relationship between subordinate copies and the HDD storage pool, the primary copy of each hot spot data block in the data management group is stored in the primary OSD, and each subordinate copy is stored in a corresponding subordinate OSD.
For example, with a copy quantity of 3, i.e. one primary copy, subordinate copy 1 and subordinate copy 2, and with a copy storage rule that stores the primary copy in the SSD storage pool and the subordinate copies in the HDD storage pool, for a data management group A, the CRUSH algorithm can determine from the SSD storage pool of the copy storage pool the OSD1 corresponding to the primary copy of each hot spot data block in data management group A, and determine from the HDD storage pool of the copy storage pool the OSD2 corresponding to subordinate copy 1 and the OSD3 corresponding to subordinate copy 2 of each hot spot data block in data management group A. Then, the primary copy of data management group A is stored in OSD1, subordinate copy 1 is stored in OSD2, and subordinate copy 2 is stored in OSD3.
The mapping from data management group A to OSDs described above is completed by the CRUSH algorithm. Taking 3 copies as an example, each hot spot data block in data management group A needs to be mapped to 3 OSDs, namely one copy OSD in the SSD storage pool and two copy OSDs in the HDD storage pool. Given the identifier of data management group A (PG_ID), the identifier of an OSD (OSD_ID) and a specific constant r, PG_ID, OSD_ID and r can be substituted into the formula CRUSH_HASH(PG_ID, OSD_ID, r) to obtain a pseudo-random number draw; the weight of each OSD is then multiplied by its draw to obtain a product value for each OSD. The OSD with the largest product value is selected as the copy OSD corresponding to each hot spot data block in data management group A. If the number of copies N is greater than 1, r = r + 1 is used and the above steps are repeated to choose another copy OSD for each hot spot data block in data management group A. If the two chosen copy OSDs are identical, r = r + 2 is used and the calculation is repeated, until all 3 copy OSDs corresponding to each hot spot data block in data management group A are obtained.
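The weighted draw-and-retry selection described above can be sketched in Python as follows (not part of the patent; the SHA-1-based draw is a stand-in for CRUSH_HASH, and the sketch selects within a single pool, whereas the patent draws the primary OSD from the SSD storage pool and the subordinate OSDs from the HDD storage pool):

import hashlib

def crush_hash(pg_id: int, osd_id: int, r: int) -> float:
    # Stand-in for CRUSH_HASH: a deterministic pseudo-random draw in [0, 1).
    digest = hashlib.sha1(f"{pg_id}:{osd_id}:{r}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2 ** 64

def select_osds(pg_id: int, osd_weights: dict, copies: int) -> list:
    # osd_weights maps OSD_ID -> weight; pick `copies` distinct OSDs with the largest weight * draw.
    chosen, r = [], 0
    while len(chosen) < copies:
        best = max(osd_weights, key=lambda osd: osd_weights[osd] * crush_hash(pg_id, osd, r))
        if best not in chosen:
            chosen.append(best)
        r += 1  # redraw on a collision or for the next copy (a simplification of the r + 2 collision rule above)
    return chosen

print(select_osds(pg_id=37, osd_weights={1: 1.0, 2: 1.0, 3: 0.5}, copies=3))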
In a possible embodiment, again taking a Ceph cluster as the distributed storage cluster, the preset copy storage strategy can be defined in the following way:
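(The rule listing itself is not reproduced in the source text; the following reconstruction is based on the explanation below, with the id and the min_size/max_size values as illustrative assumptions for the 3-copy example.)

rule ssd-primary{
id 1
type replicated
min_size 3
max_size 3
step take ssd_pool
step chooseleaf firstn 1 type host
step emit
step take hdd_pool
step chooseleaf firstn -1 type host
step emit
}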
In the definition above, ssd-primary is the name of the preset copy storage strategy; id is the number of ssd-primary; type replicated indicates storage in copy form; min_size and max_size respectively indicate the minimum allowable copy quantity and the maximum allowable copy quantity (the definition above takes 3 copies as an example); step take indicates the storage pool from which OSDs are selected, and in this storage strategy OSDs are first chosen from the SSD storage pool ssd_pool; step chooseleaf firstn 1 type host indicates that 1 OSD is chosen; step emit indicates that the selection result is output; the second step take indicates that OSDs are then chosen from the HDD storage pool hdd_pool, where -1 indicates choosing the remaining number of OSDs (taking 3 copies as an example, after 1 OSD has been chosen for the primary copy, 2 more OSDs need to be chosen for the subordinate copies); the final emit outputs the selection result.
By defining the above preset copy storage strategy, as shown in Fig. 3, under a 3-copy configuration the primary copy of each hot spot data block is stored in one OSD in the SSD storage pool, and the remaining 2 subordinate copies are stored in OSDs in the HDD storage pool.
In the other case, for non-hot-spot data, the non-hot-spot data is divided into multiple non-hot-spot data blocks according to the preset data block size, each non-hot-spot data block is mapped into a corresponding data management group, and each non-hot-spot data block in each data management group is stored in the erasure code storage pool according to the preset erasure code storage strategy.
As a possible implementation, referring to Fig. 4, the non-hot-spot data can be cut into non-hot-spot data blocks of a configured fixed length (for example 4 MB). For each non-hot-spot data block, a data block identifier is generated from the data identifier of the non-hot-spot data and the offset of the data block within the non-hot-spot data, and the non-hot-spot data block is then mapped into a corresponding data management group using a hash algorithm based on its data block identifier. For the detailed implementation of this process, reference can be made to the foregoing illustration of the mapping from hot spot data to hot spot data blocks and from hot spot data blocks to data management groups, which is not repeated here.
On the above basis, for each data management group, each non-hot-spot data block in the data management group can be encoded according to the original data fragment quantity m, the redundant data fragment quantity n, and the erasure code storage rule configured by the preset erasure code storage strategy, to obtain m original data fragments and n redundant data fragments for each non-hot-spot data block, where the erasure code storage rule includes the storage mapping relationship between original data fragments and redundant data fragments and the HDD storage pool.
Then, the CRUSH algorithm is used to determine, from the HDD storage pool of the erasure code storage pool, the OSDs corresponding to the m original data fragments and the n redundant data fragments, where m + n = k, m, n and k are positive integers, and m > n.
Finally, according to the storage mapping relationship between original data fragments and redundant data fragments and the HDD storage pool, the m original data fragments and n redundant data fragments are stored in the corresponding OSDs.
An erasure code (for example a Reed-Solomon code) is a forward error correction technique. Compared with a multi-copy scheme, erasure codes can achieve higher data reliability with lower data redundancy. For example, n redundant data fragments can be added to m original data fragments, and the original data can be recovered from any m of the m + n data fragments. That is, if any number of data fragments up to n fail, the original data can still be recovered from the remaining data fragments. In detail, taking a non-hot-spot data block as the input data, the erasure code divides the non-hot-spot data block into m data fragments regarded as a vector D = (D1, D2, ..., Dm); the encoded data of these m data fragments is regarded as the vector (D1, D2, ..., Dm, C1, C2, ..., Cn), and the erasure code encoding can be regarded as a matrix operation.
For example, to store data x and data y with 3 copies, six pieces of data x, x, x, y, y, y need to be stored; if erasure coding is used instead, only three pieces of data x, y, z need to be stored, where for example z = 2x + 3y. If data x is then lost, it can be recovered as x = (z - 3y) / 2.
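The toy z = 2x + 3y code above can be checked directly with a few lines of Python (an illustration only; real erasure codes such as Reed-Solomon perform the same kind of linear algebra over a finite field):

x, y = 7, 11
z = 2 * x + 3 * y            # the single redundant piece stored alongside x and y

# Suppose x is lost: recover it from the surviving pieces y and z.
recovered_x = (z - 3 * y) / 2
assert recovered_x == x      # 7.0 == 7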
Based on the foregoing description, take m = 4 and n = 2, with an erasure code storage rule that stores both the original data fragments and the redundant data fragments in the HDD storage pool. For a data management group B, each non-hot-spot data block in data management group B is encoded to obtain 4 original data fragments and 2 redundant data fragments, namely original data fragment 1, original data fragment 2, original data fragment 3, original data fragment 4, redundant data fragment 1 and redundant data fragment 2. Then, the CRUSH algorithm is used to determine, from the HDD storage pool of the erasure code storage pool, the OSDs corresponding to original data fragments 1-4 and redundant data fragments 1-2. Finally, original data fragments 1-4 and redundant data fragments 1-2 are stored in the corresponding OSDs.
In a possible embodiment, again taking a Ceph cluster as the distributed storage cluster, the preset erasure code storage strategy can be defined as follows:
rule hdd-erasure{
id 2
type erasure
min_size 4
max_size 6
step take hdd_pool
step choose indep 0 type osd
step emit
}
In the definition above, hdd-erasure is the name of the preset erasure code storage strategy; id is the number of hdd-erasure; type erasure indicates that data is stored in erasure-coded form; min_size and max_size respectively indicate the minimum and maximum allowable numbers of erasure code fragments, and in this storage strategy the erasure code is configured as (4, 2); step take indicates the storage pool from which OSDs are selected, and in this storage strategy OSDs are chosen from the HDD storage pool hdd_pool; step choose indep 0 type osd indicates choosing the default number of OSDs, namely 6 OSDs in this storage strategy; step emit indicates that the selection result is output.
By defining the above preset erasure code storage strategy, as shown in Fig. 4, under the (4, 2) erasure code configuration, 6 OSDs are chosen from the HDD storage pool, and the 4 original data fragments and 2 redundant data fragments of each non-hot-spot data block are stored in the corresponding OSDs.
On the basis of the above definitions, by applying the defined preset copy storage strategy and preset erasure code storage strategy, different data can be stored into different logical storage pools. That is, hot spot data is placed in the copy storage pool according to the preset copy storage strategy, and non-hot-spot data is placed in the erasure code storage pool according to the preset erasure code storage strategy.
First, the copy storage pool is defined as follows:
ceph osd pool create image_pool replicated ssd-primary
In the definition above, a copy storage pool named image_pool is created; the replicated parameter indicates that this logical storage pool is a copy storage pool for storing hot spot data, and the ssd-primary parameter refers to the preset copy storage strategy defined above.
Next, the erasure code storage pool is defined as follows:
ceph osd pool create data_pool erasure hdd-erasure
In the definition above, an erasure code storage pool named data_pool is created; the erasure parameter indicates that this logical storage pool is an erasure code storage pool for storing non-hot-spot data, and the hdd-erasure parameter refers to the preset erasure code storage strategy defined above.
Further, referring to Fig. 5, the data storage method provided by this embodiment may also include the following step:
Step S130: when each VDI virtual machine starts, read the primary copy of each data management group corresponding to the hot spot data from the SSD storage pool to complete the start-up of the VDI desktop.
In this embodiment, when each VDI virtual machine starts, the main I/O requests are read I/O requests. Since the primary copy of the hot spot data (the master image data) is stored in the SSD storage pool, the VDI desktop can be started by reading the primary copy of each data management group corresponding to the hot spot data from the SSD storage pool, which guarantees high storage I/O performance and solves the VDI boot storm problem. In other periods, since the read/write I/O demand on the distributed storage cluster is not high, the non-hot-spot data is stored in the HDD storage pool in erasure-coded form, thereby reducing SSD cost.
Based on the above design, when hot spot data is stored, it is divided into multiple hot spot data blocks according to the preset data block size, each hot spot data block is mapped into a corresponding data management group, and, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group is stored in the SSD storage pool of the copy storage pool and the subordinate copies are stored in the HDD storage pool of the copy storage pool. In addition, when non-hot-spot data is stored, it is divided into multiple non-hot-spot data blocks according to the preset data block size, each non-hot-spot data block is mapped into a corresponding data management group, and each non-hot-spot data block of each data management group is stored in the HDD storage pool according to the preset erasure code storage strategy. In this way, when each VDI virtual machine starts, the primary copy of the hot spot data is read from the SSD storage pool to satisfy the large volume of storage I/O requests, which effectively solves the VDI boot storm problem, improves SSD utilization and effectively reduces SSD cost.
Further, Fig. 6 shows a functional block diagram of the data storage device 200 provided by an embodiment of the present disclosure; the functions realized by the data storage device 200 correspond to the steps of the above data storage method shown in Fig. 2. The data storage device 200 can be understood as the component that realizes the functions of the present disclosure under the control of the above data storage system 100. As shown in Fig. 6, the data storage device 200 may include a data acquisition module 210, a hot spot data storage module 220 and a non-hot-spot data storage module 230. The functions of the functional modules of the data storage device 200 are described in detail below.
The data acquisition module 210 is configured to obtain data to be stored; the data to be stored includes hot spot data shared for reading and writing by multiple VDI virtual machines and non-hot-spot data, and when each VDI virtual machine starts, the I/O load of reading the hot spot data is greater than the I/O load of reading the non-hot-spot data. It can be understood that the data acquisition module 210 can be used to execute the above step S110; for the detailed implementation of the data acquisition module 210, reference can be made to the above description of step S110.
The hot spot data storage module 220 is configured to store the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool according to the preset copy storage strategy. It can be understood that the hot spot data storage module 220 can be used to execute the above step S120; for the detailed implementation of the hot spot data storage module 220, reference can be made to the above description of step S120.
The non-hot-spot data storage module 230 is configured to store the non-hot-spot data in the erasure code storage pool according to the preset erasure code storage strategy. It can be understood that the non-hot-spot data storage module 230 can be used to execute the above step S120; for the detailed implementation of the non-hot-spot data storage module 230, reference can be made to the above description of step S120.
In a possible embodiment, the hot spot data storage module 220 can store the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool in the following way:
dividing the hot spot data into multiple hot spot data blocks according to the preset data block size, and mapping each hot spot data block into a corresponding data management group;
storing, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool, and storing the subordinate copies in the HDD storage pool of the copy storage pool.
In a possible embodiment, the hot spot data storage module 220 can store the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool, and the subordinate copies in the HDD storage pool of the copy storage pool, in the following way:
for each data management group, according to the copy quantity and copy storage rule configured by the preset copy storage strategy, using the CRUSH algorithm to determine, from the SSD storage pool of the copy storage pool, the primary OSD corresponding to the primary copy of each hot spot data block in the data management group, and to determine, from the HDD storage pool of the copy storage pool, at least one subordinate OSD corresponding to the subordinate copies of each hot spot data block in the data management group, where the sum of the numbers of primary copies and subordinate copies equals the copy quantity, and the copy storage rule includes the storage mapping relationship between primary copies and the SSD storage pool and the storage mapping relationship between subordinate copies and the HDD storage pool;
according to the storage mapping relationship between primary copies and the SSD storage pool and the storage mapping relationship between subordinate copies and the HDD storage pool, storing the primary copy of each hot spot data block in the data management group in the primary OSD, and storing each subordinate copy in a corresponding subordinate OSD.
In a possible embodiment, the non-hot data storage module 230 may store the non-hot data in the erasure code storage pool in the following manner:
dividing the non-hot data into multiple non-hot data blocks according to the preset data block size, and mapping each non-hot data block to a corresponding data management group;
for each data management group, encoding each non-hot data block in the data management group according to the original data fragment quantity m, the redundant data fragment quantity n and the erasure code storage rule configured by the preset erasure code storage strategy, to obtain m original data fragments and n redundant data fragments of each non-hot data block, where the erasure code storage rule includes the storage mapping relationship between the original and redundant data fragments and the HDD storage pool;
using the CRUSH algorithm to determine, from the HDD storage pool of the erasure code storage pool, the OSDs corresponding to the m original data fragments and the n redundant data fragments, where m + n = k, m, n and k are positive integers, and m > n;
according to the storage mapping relationship between the original and redundant data fragments and the HDD storage pool, storing the m original data fragments and the n redundant data fragments on the corresponding OSDs.
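For illustration only, the following sketch shows the fragment layout this encoding step produces, using a single XOR parity fragment (that is, n = 1 and m = 4). An actual erasure code storage pool would normally use a Reed-Solomon style code that tolerates the loss of any n fragments, so this is a stand-in for the encoding, not the encoding itself; the function names are hypothetical.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode_block(block: bytes, m: int = 4):
    """Split a block into m equal-size original data fragments plus one XOR parity fragment (n = 1)."""
    if len(block) % m:
        block += b"\x00" * (m - len(block) % m)   # pad so the block divides evenly into m fragments
    size = len(block) // m
    fragments = [block[i * size:(i + 1) * size] for i in range(m)]
    parity = reduce(xor_bytes, fragments)          # any single lost fragment can be rebuilt from the rest
    return fragments, [parity]

def rebuild_fragment(fragments, parity_fragments, lost_index):
    """Recover one lost original data fragment from the surviving fragments and the parity fragment."""
    survivors = [f for i, f in enumerate(fragments) if i != lost_index]
    return reduce(xor_bytes, survivors + parity_fragments)

originals, parities = encode_block(b"cold backup data" * 100, m=4)
assert rebuild_fragment(originals, parities, lost_index=2) == originals[2]
```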
In a possible embodiment, referring further to Fig. 7, the data storage device 200 may also include a data read module 240, configured to read, when each VDI virtual machine starts, the primary copy of each data management group corresponding to the hot spot data from the SSD storage pool, so as to complete the startup of the VDI desktop. It can be understood that the data read module 240 may be used to execute step S130 described above; for the detailed implementation of the data read module 240, reference may be made to the description of step S130.
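Purely as a sketch of this boot-time read path, the snippet below fetches the primary copy of every hot spot block from its SSD-pool OSD when a VDI virtual machine starts. The lookup tables and the read_from_osd placeholder are hypothetical, standing in for the data management group mapping and the OSD read that the data read module 240 would actually perform.

```python
# Hypothetical lookup tables built when the hot spot data was written:
# block name -> data management group -> primary OSD in the SSD storage pool.
BLOCK_TO_GROUP = {"vdi-golden-image.0": 42, "vdi-golden-image.1": 7}
GROUP_TO_PRIMARY_SSD_OSD = {42: "osd.1", 7: "osd.0"}

def read_from_osd(osd: str, block_name: str) -> bytes:
    """Placeholder for the actual object read against an OSD; returns dummy data here."""
    return b""

def boot_vdi_desktop(block_names):
    """On VDI virtual machine startup, read each hot spot block's primary copy from the SSD pool."""
    image = bytearray()
    for name in block_names:
        group = BLOCK_TO_GROUP[name]
        primary_osd = GROUP_TO_PRIMARY_SSD_OSD[group]
        image.extend(read_from_osd(primary_osd, name))
    return bytes(image)

boot_vdi_desktop(["vdi-golden-image.0", "vdi-golden-image.1"])
```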
In the embodiments provided by the present disclosure, it should be understood that the disclosed device and method may also be implemented in other ways. The device and method embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions and operations of systems, methods and computer program products according to multiple embodiments of the present disclosure. In this regard, each box in a flowchart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present disclosure may be integrated together to form an independent part, each module may exist separately, or two or more modules may be integrated to form an independent part. Alternatively, the functions may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present disclosure are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wire (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wirelessly (such as infrared, radio or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device such as an electronic device, server or data center containing one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD) or a semiconductor medium (for example, a Solid State Disk (SSD)).
It should be noted that, in this document, the terms "comprise", "include" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements, but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further limitations, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
It is obvious to those skilled in the art that the present disclosure is not limited to the details of the above exemplary embodiments, and that the present disclosure may be implemented in other specific forms without departing from its spirit or essential characteristics. Therefore, from whatever point of view, the embodiments are to be regarded as exemplary and non-restrictive, and the scope of the present disclosure is defined by the appended claims rather than by the above description. All changes falling within the meaning and scope of equivalents of the claims are therefore intended to be embraced by the present disclosure. Any reference signs in the claims shall not be construed as limiting the claims involved.
Claims (10)
1. A data storage method, applied to a data storage system, wherein the data storage system comprises a copy storage pool and an erasure code storage pool, the copy storage pool comprises an SSD storage pool and an HDD storage pool, and the erasure code storage pool also comprises the HDD storage pool, the method comprising:
obtaining data to be stored, wherein the data to be stored comprises hot spot data and non-hot data to be shared for reading and writing by multiple VDI virtual machines, and, when each VDI virtual machine starts, the I/O load of reading the hot spot data is greater than the I/O load of reading the non-hot data;
storing, according to a preset copy storage strategy, the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool; and
storing the non-hot data in the erasure code storage pool according to a preset erasure code storage strategy.
2. The data storage method according to claim 1, wherein the step of storing, according to the preset copy storage strategy, the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool comprises:
dividing the hot spot data into multiple hot spot data blocks according to a preset data block size, and mapping each hot spot data block to a corresponding data management group;
storing, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool.
3. The data storage method according to claim 2, wherein the step of storing, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool comprises:
for each data management group, according to the copy quantity and the copy storage rule configured by the preset copy storage strategy, using a CRUSH algorithm to determine, from the SSD storage pool of the copy storage pool, the primary OSD corresponding to the primary copy of each hot spot data block in the data management group, and to determine, from the HDD storage pool of the copy storage pool, at least one secondary OSD corresponding to the subordinate copies of each hot spot data block in the data management group, wherein the sum of the numbers of the primary copy and the subordinate copies is equal to the copy quantity, and the copy storage rule comprises the storage mapping relationship between the primary copy and the SSD storage pool and the storage mapping relationship between the subordinate copies and the HDD storage pool;
according to the storage mapping relationship between the primary copy and the SSD storage pool and the storage mapping relationship between the subordinate copies and the HDD storage pool, storing the primary copy of each hot spot data block in the data management group on the primary OSD, and storing each subordinate copy on a corresponding secondary OSD.
4. The data storage method according to claim 1, wherein the step of storing the non-hot data in the erasure code storage pool according to the preset erasure code storage strategy comprises:
dividing the non-hot data into multiple non-hot data blocks according to a preset data block size, and mapping each non-hot data block to a corresponding data management group;
for each data management group, encoding each non-hot data block in the data management group according to the original data fragment quantity m, the redundant data fragment quantity n and the erasure code storage rule configured by the preset erasure code storage strategy, to obtain m original data fragments and n redundant data fragments of each non-hot data block, wherein the erasure code storage rule comprises the storage mapping relationship between the original and redundant data fragments and the HDD storage pool;
using the CRUSH algorithm to determine, from the HDD storage pool of the erasure code storage pool, the OSDs corresponding to the m original data fragments and the n redundant data fragments, wherein m + n = k, m, n and k are positive integers, and m > n;
according to the storage mapping relationship between the original and redundant data fragments and the HDD storage pool, storing the m original data fragments and the n redundant data fragments on the corresponding OSDs.
5. The data storage method according to any one of claims 1 to 4, wherein the method further comprises:
when each VDI virtual machine starts, reading the primary copy of each data management group corresponding to the hot spot data from the SSD storage pool, so as to complete the startup of the VDI desktop.
6. A data storage device, applied to a data storage system, wherein the data storage system comprises a copy storage pool and an erasure code storage pool, the copy storage pool comprises an SSD storage pool and an HDD storage pool, and the erasure code storage pool also comprises the HDD storage pool, the device comprising:
a data acquisition module, configured to obtain data to be stored, wherein the data to be stored comprises hot spot data and non-hot data to be shared for reading and writing by multiple VDI virtual machines, and, when each VDI virtual machine starts, the I/O load of reading the hot spot data is greater than the I/O load of reading the non-hot data;
a hot spot data storage module, configured to store, according to a preset copy storage strategy, the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool; and
a non-hot data storage module, configured to store the non-hot data in the erasure code storage pool according to a preset erasure code storage strategy.
7. The data storage device according to claim 6, wherein the hot spot data storage module stores the primary copy of the hot spot data in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool in the following manner:
dividing the hot spot data into multiple hot spot data blocks according to a preset data block size, and mapping each hot spot data block to a corresponding data management group;
storing, according to the preset copy storage strategy, the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool.
8. The data storage device according to claim 7, wherein the hot spot data storage module stores the primary copy of each hot spot data block in each data management group in the SSD storage pool of the copy storage pool and the subordinate copies in the HDD storage pool of the copy storage pool in the following manner:
for each data management group, according to the copy quantity and the copy storage rule configured by the preset copy storage strategy, using a CRUSH algorithm to determine, from the SSD storage pool of the copy storage pool, the primary OSD corresponding to the primary copy of each hot spot data block in the data management group, and to determine, from the HDD storage pool of the copy storage pool, at least one secondary OSD corresponding to the subordinate copies of each hot spot data block in the data management group, wherein the sum of the numbers of the primary copy and the subordinate copies is equal to the copy quantity, and the copy storage rule comprises the storage mapping relationship between the primary copy and the SSD storage pool and the storage mapping relationship between the subordinate copies and the HDD storage pool;
according to the storage mapping relationship between the primary copy and the SSD storage pool and the storage mapping relationship between the subordinate copies and the HDD storage pool, storing the primary copy of each hot spot data block in the data management group on the primary OSD, and storing each subordinate copy on a corresponding secondary OSD.
9. The data storage device according to claim 6, wherein the non-hot data storage module stores the non-hot data in the erasure code storage pool in the following manner:
dividing the non-hot data into multiple non-hot data blocks according to a preset data block size, and mapping each non-hot data block to a corresponding data management group;
for each data management group, encoding each non-hot data block in the data management group according to the original data fragment quantity m, the redundant data fragment quantity n and the erasure code storage rule configured by the preset erasure code storage strategy, to obtain m original data fragments and n redundant data fragments of each non-hot data block, wherein the erasure code storage rule comprises the storage mapping relationship between the original and redundant data fragments and the HDD storage pool;
using the CRUSH algorithm to determine, from the HDD storage pool of the erasure code storage pool, the OSDs corresponding to the m original data fragments and the n redundant data fragments, wherein m + n = k, m, n and k are positive integers, and m > n;
according to the storage mapping relationship between the original and redundant data fragments and the HDD storage pool, storing the m original data fragments and the n redundant data fragments on the corresponding OSDs.
10. The data storage device according to any one of claims 6 to 9, wherein the device further comprises:
a data read module, configured to read, when each VDI virtual machine starts, the primary copy of each data management group corresponding to the hot spot data from the SSD storage pool, so as to complete the startup of the VDI desktop.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910209554.5A CN109992204A (en) | 2019-03-19 | 2019-03-19 | Date storage method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109992204A true CN109992204A (en) | 2019-07-09 |
Family
ID=67129209
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910209554.5A Pending CN109992204A (en) | 2019-03-19 | 2019-03-19 | Date storage method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109992204A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106095807A (en) * | 2016-05-31 | 2016-11-09 | 中广天择传媒股份有限公司 | A kind of distributed file system correcting and eleting codes buffer storage and caching method thereof |
CN107422989A (en) * | 2017-07-27 | 2017-12-01 | 深圳市云舒网络技术有限公司 | A kind of more copy read methods of Server SAN systems and storage architecture |
CN108121510A (en) * | 2017-12-19 | 2018-06-05 | 紫光华山信息技术有限公司 | OSD choosing methods, method for writing data, device and storage system |
CN108196978A (en) * | 2017-12-22 | 2018-06-22 | 新华三技术有限公司 | Date storage method, device, data-storage system and readable storage medium storing program for executing |
CN108052655A (en) * | 2017-12-28 | 2018-05-18 | 新华三技术有限公司 | Data write and read method |
CN108287669A (en) * | 2018-01-26 | 2018-07-17 | 平安科技(深圳)有限公司 | Date storage method, device and storage medium |
CN108920100A (en) * | 2018-06-25 | 2018-11-30 | 重庆邮电大学 | Read-write model optimization and isomery copy combined method based on Ceph |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110531936A (en) * | 2019-08-29 | 2019-12-03 | 西安交通大学 | The crop type storage organization and method of distributed correcting and eleting codes mixing storage based on multi storage |
CN110531936B (en) * | 2019-08-29 | 2021-05-28 | 西安交通大学 | Distributed erasure code mixed storage forest type storage structure and method based on multiple storage media |
CN110809030A (en) * | 2019-10-17 | 2020-02-18 | 浙江大华技术股份有限公司 | Network flow control method and device, coder-decoder and storage device |
CN111240591A (en) * | 2020-01-03 | 2020-06-05 | 苏州浪潮智能科技有限公司 | Operation request processing method of storage equipment and related device |
CN111414271A (en) * | 2020-03-17 | 2020-07-14 | 上海爱数信息技术股份有限公司 | Storage method based on self-adaptive storage redundancy strategy |
CN111414271B (en) * | 2020-03-17 | 2023-10-13 | 上海爱数信息技术股份有限公司 | Storage method based on self-adaptive storage redundancy strategy |
CN112148219A (en) * | 2020-09-16 | 2020-12-29 | 北京优炫软件股份有限公司 | Design method and device for ceph type distributed storage cluster |
CN112363674A (en) * | 2020-11-12 | 2021-02-12 | 新华三技术有限公司成都分公司 | Data writing method and device |
CN112363674B (en) * | 2020-11-12 | 2022-04-22 | 新华三技术有限公司成都分公司 | Data writing method and device |
CN113778341A (en) * | 2021-09-17 | 2021-12-10 | 北京航天泰坦科技股份有限公司 | Distributed storage method and device for remote sensing data and remote sensing data reading method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109992204A (en) | Date storage method and device | |
JP7312251B2 (en) | Improving available storage space in systems with various data redundancy schemes | |
CN111158587B (en) | Distributed storage system based on storage pool virtualization management and data read-write method | |
US7792882B2 (en) | Method and system for block allocation for hybrid drives | |
JP5112003B2 (en) | Storage device and data storage method using the same | |
US8090792B2 (en) | Method and system for a self managing and scalable grid storage | |
CN104850598B (en) | A kind of real-time data base back-up restoring method | |
US8135907B2 (en) | Method and system for managing wear-level aware file systems | |
JP5411250B2 (en) | Data placement according to instructions to redundant data storage system | |
US7290102B2 (en) | Point in time storage copy | |
US20200117362A1 (en) | Erasure coding content driven distribution of data blocks | |
CN103635900B (en) | Time-based data partitioning | |
US7840657B2 (en) | Method and apparatus for power-managing storage devices in a storage pool | |
US10691354B1 (en) | Method and system of disk access pattern selection for content based storage RAID system | |
US9996557B2 (en) | Database storage system based on optical disk and method using the system | |
US20110145528A1 (en) | Storage apparatus and its control method | |
US20080263089A1 (en) | Transaction-Based Storage System and Method That Uses Variable Sized Objects to Store Data | |
US11151056B2 (en) | Efficient virtualization layer structure for a data storage system | |
US20190243553A1 (en) | Storage system, computer-readable recording medium, and control method for system | |
JP2004213064A (en) | Raid device and logic device expansion method therefor | |
US8825653B1 (en) | Characterizing and modeling virtual synthetic backup workloads | |
CN1770088A (en) | Incremental backup operations in storage networks | |
CN1770114A (en) | Copy operations in storage networks | |
CN103562914A (en) | Resource efficient scale-out file systems | |
US20200133555A1 (en) | Mechanisms for performing accurate space accounting for volume families |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication |  |
SE01 | Entry into force of request for substantive examination |  |
RJ01 | Rejection of invention patent application after publication |  |
Application publication date: 20190709 |