CN105242884B - An automatic tiering storage system - Google Patents

An automatic tiering storage system

Info

Publication number
CN105242884B
Authority
CN
China
Prior art keywords
tier0
tier1
data
storage system
automatic zoning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510696499.9A
Other languages
Chinese (zh)
Other versions
CN105242884A (en)
Inventor
赵祯龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN201510696499.9A
Publication of CN105242884A
Application granted
Publication of CN105242884B
Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automatic tiering storage system comprising: a high-performance storage tier, Tier0, which holds high-performance replicas; an ordinary-performance storage tier, Tier1, which holds ordinary-performance replicas; a monitor, which collects data object access information and system performance information from the storage system; a scheduler, which maintains the data objects stored on Tier0 and, based on a scheduling policy, sends push commands to Tier1 so that Tier1 transfers data objects to Tier0; and a proxy node, which provides a push interface in the external proxy service. A one-way data path from Tier1 to Tier0 exists between the two tiers; it is opened after Tier1 receives a push command from the scheduler and carries the one-way transfer of data objects from Tier1 to Tier0. The invention addresses the poor real-time behavior of current automatic tiering storage systems, improves read performance for hot data, and reduces unnecessary wear on SSDs.

Description

An automatic tiering storage system
Technical field
The present invention relates to the field of storage system technology, and in particular to an automatic tiering storage system.
Background
The design goal of automatic storage tiering is to exploit the performance and cost differences between hard disks of different rotational speeds. In recent years, as flash-based solid state drives (SSD, Solid State Drive) have matured and become widespread in storage systems, their input/output operations per second (IOPS, Input/Output Operations Per Second) have improved substantially over hard disk drives (HDD, Hard Disk Drive), making them an ideal choice for automatic storage tiering.
Automatic storage tiering can analyze metrics such as data access frequency, creation time, last access time, or response time, and place data with different characteristics on different tiers. It is an important technology in today's high-end storage systems.
However, automatic storage tiering still faces the following three challenges:
First, automatic tiering is a passive technology: its data migration policy is derived from historical trends rather than from the real-time state.
Second, wear shortens the service life of an SSD, so one must consider how to reduce the frequency of SSD wear and how to protect data when an SSD fails.
Third, access monitoring, statistical analysis, and data migration all consume computing resources. The traditional remedy is to set time windows in which the system is allowed to perform statistical analysis and data migration, so as to avoid peak access periods. This approach makes the poor real-time behavior of automatic tiering even worse.
Therefore, the challenges of automatic storage tiering add great complexity to the development of high-performance distributed storage systems and seriously impair the timeliness and effectiveness of tiered storage.
Summary of the invention
To solve the above technical problems, the present invention provides an automatic tiering storage system that addresses the poor real-time behavior of current automatic tiering storage systems, improves read performance for hot data, and reduces unnecessary wear on SSDs.
To achieve this object, the present invention provides an automatic tiering storage system comprising: a high-performance storage tier, Tier0, which holds high-performance replicas; an ordinary-performance storage tier, Tier1, which holds ordinary-performance replicas; a monitor, which collects data object access information and system performance information from the storage system; a scheduler, which maintains the data objects stored on Tier0 and, based on a scheduling policy, sends push commands to Tier1 so that Tier1 transfers data objects to Tier0; and a proxy node, which provides a push interface in the external proxy service. A one-way data path from Tier1 to Tier0 exists between Tier0 and Tier1; it is opened after Tier1 receives a push command from the scheduler and carries the one-way transfer of data objects from Tier1 to Tier0.
Further, the replica parameters of Tier0 include: maximum redundancy M, the maximum number of replicas that Tier0 can hold in the system; configured redundancy m, the number of replicas pushed into Tier0, with m ≤ M; and replica slot, the position in Tier0 where replicas are placed. The replica parameters of Tier1 include: redundancy N, the number of replicas in Tier1.
Further, Tier0 is operated through a REST API that includes: a CREATE operation, which creates an object; a GET operation, which reads a data object; a REMOVE operation, which deletes a data object; and a CLEAN operation, which clears data objects from the storage system. REMOVE is invoked either when the scheduler actively deletes data from Tier0, or by a callback request sent to the scheduler after a data object in Tier1 has been reclaimed by garbage collection. CLEAN clears data objects from the storage system over a range specified by a uniform resource locator.
Further, the proxy node queries the replicas in Tier0 and Tier1 for the requested object data and, after selecting a replica in Tier0 or Tier1, sends the object access information to the monitor through the push interface.
Further, selecting a replica in Tier0 or Tier1 specifically comprises: querying whether the replica in Tier0 is available; if it is available, selecting the replica in Tier0; if it is not available, selecting the replica in Tier1.
Further, the scheduling policy includes: hot data identification, data heat maintenance, and the data replacement policy.
Further, the monitor obtains data object access information and system performance information from the storage system through the push interface of the proxy node; the scheduler pushes object data whose access count exceeds a set value into Tier0 and, according to hot data identification, increases the number of replicas of hot data.
Further, the monitor and the scheduler are placed on a separate node outside the storage tiers and the proxy node, and the monitor is connected to the storage system's monitoring system by sharing the monitoring system's database.
Further, connecting the monitor to the storage system's monitoring system by sharing the monitoring system's database specifically comprises: the storage system has a monitoring interface implemented with statsd; while the storage system is running, when a Hypertext Transfer Protocol request is made for an object, a stub function is inserted that sends monitoring data to the monitoring interface of the monitoring system over the User Datagram Protocol, and the monitor obtains the monitoring data from the monitoring system's database.
Further, the scheduler and the monitor run in single-node mode and are protected by high availability.
Compared with the prior art, the present invention takes full advantage of the real-time nature of data pushing and collects storage system performance data and access information to schedule replicas dynamically. It thereby addresses the poor real-time behavior of current automatic tiering storage systems, improves hot data access performance, and reduces unnecessary wear on SSDs, advancing the development of mass data storage system architecture.
Other features and advantages of the invention are set out in the description that follows; in part they will be apparent from the description or may be learned by practicing the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the technical solution of the invention and form part of the specification. Together with the embodiments of this application, they serve to explain the technical solution and do not limit it.
Fig. 1 is an architecture diagram of the automatic tiering storage system in one embodiment of the present invention.
Fig. 2 is a schematic diagram of the Tier0 access interface in one embodiment of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, embodiments of the invention are described in detail below with reference to the drawings. Note that, provided they do not conflict, the embodiments in this application and the features in those embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Fig. 1 is an architecture diagram of the automatic tiering storage system in one embodiment of the present invention. As shown in Fig. 1, the system comprises a high-performance storage tier Tier0, an ordinary-performance storage tier Tier1, a monitor, a scheduler, and a proxy node; the monitor and the scheduler may reside in the same device.
Tier0 holds the high-performance replicas;
Specifically, the replica parameters of Tier0 include:
Maximum redundancy M, the maximum number of replicas that Tier0 can hold in the system;
Configured redundancy m, the number of replicas pushed into Tier0, with m ≤ M;
Replica slot, the position in Tier0 where replicas are placed. All replicas in Tier0 are placed in replica slots; a slot either holds m replicas or is empty.
The replica parameters of Tier0 are then set, for example M = 1 and m = 1, meaning that Tier0 has one replica slot.
Tier1 holds the ordinary-performance replicas;
Specifically, the replica parameters of Tier1 include:
Redundancy N, the number of replicas in Tier1.
The replica parameters of Tier1 are then set, for example N = 3, meaning that three replicas are placed in Tier1.
The monitor collects data object access information and system performance information from the storage system;
Specifically, the object access information comes from the proxy node and consists mainly of information about accesses directed at the storage service.
The scheduler is responsible for maintaining the storage state of objects on Tier0 and, based on the scheduling policy, sends data push requests to Tier1;
Specifically, this includes using storage service requests to push data to Tier0, removing data from Tier0, and maintaining the object list on the Tier0 nodes. The scheduling policy mainly includes hot data identification, data heat maintenance, and the data replacement policy.
The proxy node provides the push (PUSH) interface in the external proxy service.
In this automatic tiering storage system, the only path between Tier0 and Tier1 is a one-way data path from Tier1 to Tier0. The path is opened after Tier1 receives a PUSH command; otherwise Tier0 and Tier1 do not communicate.
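As an illustration only, the replica parameters and the one-way path described above could be modeled as in the following Python sketch. It is not part of the patent disclosure: the class names, method names, and the in-memory dictionaries are assumptions chosen for brevity, and a real deployment would move data between storage nodes rather than between Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Tier0:
    max_redundancy: int = 1         # M: most replicas Tier0 can hold per object
    configured_redundancy: int = 1  # m: replicas actually pushed, with m <= M
    slots: dict = field(default_factory=dict)  # one replica slot per object name

    def __post_init__(self):
        if not 0 < self.configured_redundancy <= self.max_redundancy:
            raise ValueError("configured redundancy m must satisfy 0 < m <= M")

    def receive(self, name, data):
        # A slot holds m replicas of the object; an absent key is an empty slot.
        self.slots[name] = [data] * self.configured_redundancy


@dataclass
class Tier1:
    redundancy: int = 3             # N: replicas kept in Tier1
    objects: dict = field(default_factory=dict)

    def put(self, name, data):
        self.objects[name] = [data] * self.redundancy

    def push(self, name, tier0):
        # Hypothetical handler for a PUSH command from the scheduler. This is
        # the only Tier1 -> Tier0 transfer; Tier0 never writes back to Tier1.
        tier0.receive(name, self.objects[name][0])
```

With the example values from the text (M = 1, m = 1, N = 3), `Tier1.put` stores three replicas and a later `push` fills the single Tier0 slot with one replica.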
Tier0 is operated through a REST application programming interface (REST API), shown in Fig. 2, which includes:
The CREATE operation, which creates an object;
The GET operation, which reads an object;
The REMOVE operation, which deletes an object;
Note that this deletion differs semantically from DELETE in Tier1: REMOVE deletes the object from the system outright rather than dereferencing it. REMOVE is invoked either when the scheduler actively deletes data from Tier0, or by a callback request sent to the scheduler after an object in Tier1 has been reclaimed by garbage collection (GC, Garbage Collection);
The CLEAN operation, which clears data from the system over a range specified by a uniform resource locator (URL, Uniform Resource Locator).
These four REST API operations show that Tier0 is used only to optimize read performance and plays no part in writes. The data in this tier comes entirely from PUSH requests that the scheduler sends to Tier1.
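The semantics of the four operations can be sketched with a small in-memory stand-in. This is a hedged illustration, not the patent's implementation: the class name, the use of path prefixes to stand for the URL-specified range, and the return values are all assumptions.

```python
class Tier0Store:
    """In-memory stand-in for the four Tier0 operations (names are hypothetical)."""

    def __init__(self):
        self._objects = {}

    def create(self, path, data):
        # CREATE: store an object under its path.
        self._objects[path] = data

    def get(self, path):
        # GET: return the object, or None when Tier0 holds no replica of it.
        return self._objects.get(path)

    def remove(self, path):
        # REMOVE: drop the object outright; True if something was removed.
        return self._objects.pop(path, None) is not None

    def clean(self, prefix):
        # CLEAN: drop every object whose path falls under the given prefix
        # (assumed here to be how a URL specifies the range to clear).
        doomed = [p for p in self._objects if p.startswith(prefix)]
        for p in doomed:
            del self._objects[p]
        return len(doomed)
```

Because none of these operations writes through to Tier1, losing or clearing Tier0 contents in this model only removes a faster read path.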
Before requesting an object's data, the proxy node first queries all replicas of the object and starts the transfer only after selecting a readable replica. If the replica in Tier0 is available, it is selected first; if Tier0 does not hold the requested object, the proxy node reads the replica in Tier1. Because Tier0 is used only to optimize reads, it does not need to store any metadata, and the complexity of maintaining metadata consistency does not arise.
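The read path just described amounts to a Tier0-first lookup with a Tier1 fallback. A minimal sketch follows, assuming each tier can be represented as a dictionary and that `report_access` is a hypothetical callback standing in for the push interface to the monitor:

```python
def read_object(name, tier0, tier1, report_access):
    """Serve a read from Tier0 when possible, otherwise from Tier1."""
    data = tier0.get(name)
    tier = "tier0"
    if data is None:
        # No usable Tier0 replica: fall back to the Tier1 replicas.
        data = tier1[name]
        tier = "tier1"
    # Tell the monitor which object was read and which tier served it.
    report_access(name, tier)
    return data
```

Reporting every access, including those served from Tier1, is what would let a scheduler notice objects that are becoming hot before they have a Tier0 replica.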
Tracking and statistical analysis of access behavior, as well as data migration, all consume computing resources. So that monitoring and scheduling do not interfere with normal access, the monitor and the scheduler are placed on a separate node. At deployment, the monitor can be connected to the storage system's monitoring system, for example by sharing the monitoring system's database. The storage system itself has a monitoring interface, implemented with statsd: at key points in system operation, such as when a Hypertext Transfer Protocol (HTTP, Hyper Text Transfer Protocol) request is made for an object, a stub function is inserted that sends monitoring data to the monitoring system or the monitor over the User Datagram Protocol (UDP, User Datagram Protocol). Using UDP keeps the network overhead of monitoring very small. The scheduler can access Tier0 and Tier1 through the REST API; these REST API calls are control-type accesses and have very little effect on the service access load.
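A stub of this kind can be very small. The sketch below formats a counter in the statsd plain-text line format (`<name>:<value>|c`) and sends it as a single UDP datagram; the metric naming scheme and the function names are assumptions, and the patent does not specify which metrics are emitted.

```python
import socket


def format_counter(metric, value=1):
    # statsd plain-text counter line: "<name>:<value>|c"
    return f"{metric}:{value}|c".encode("ascii")


def report_access(sock, address, object_name):
    # Fire-and-forget UDP datagram: no reply is awaited, so the request
    # path is not blocked by the monitoring system.
    sock.sendto(format_counter(f"object.access.{object_name}"), address)
```

A caller would create the socket once with `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)` and pass the host and port of the statsd listener as `address`. Since UDP gives no delivery guarantee, occasional lost samples are the price of the low overhead.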
The scheduler and the monitor are implemented in single-node mode and hold state, so they need high availability (HA, High Availability) protection. If the state in the scheduler is lost, it is only necessary to locate the smallest range in which data correctness cannot be guaranteed and send a CLEAN command to empty the corresponding part of Tier0. Because the storage system is content-addressed, data in Tier0 requires almost no consistency maintenance, and losing data there only costs some read performance without affecting data correctness. If the node hosting the monitor and the scheduler goes down, Tier0 merely retains some historical data and cannot obtain the latest access data; the storage system itself remains available.
Compared with traditional cache management, the automatic tiering storage system of the present invention has at least the following advantages:
First, a traditional cache system is "best-effort", so data access and the update of data in the high-speed device are "synchronous". In automatic storage tiering, data access does not directly affect what is placed in the high-speed device; instead, the data to be placed there is decided after access statistics have been collected and computed, so the process is "asynchronous".
Second, automatic storage tiering focuses on optimizing access at the global level. Once hot data has been identified, it is written to the high-speed storage device by "push". Compared with a traditional cache system, which "pulls" data after a cache miss, this makes the optimization of access more targeted and also reduces how often the high-speed device is erased and rewritten, extending SSD service life.
The present invention provides an automatic tiering storage system. In its architecture, storage is divided into tiers according to performance characteristics, and the monitor, the scheduler, and the PUSH interface in the external service cooperate to schedule data across tiers globally. Performance collection and tier scheduling take place at run time: the monitor collects run-time performance data of the storage and gathers statistics on object access, and the scheduler pushes frequently accessed objects to the high-performance storage tier. Dynamic replica management is applied: based on the access information collected at run time, the number of replicas of hot data is increased to improve concurrent read performance for those objects.
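The patent names the three parts of the scheduling policy (hot data identification, data heat maintenance, replacement) without fixing their algorithms. The following sketch shows one possible combination and should be read purely as an assumption: a fixed access threshold identifies hot objects, heat decays by a constant factor each period, and the coldest resident object is replaced when Tier0 is full. All names and parameters are hypothetical.

```python
class HeatScheduler:
    """Toy scheduler: threshold-based hot data detection with decaying heat."""

    def __init__(self, threshold, capacity, decay=0.5):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.threshold = threshold  # heat an object must exceed to count as hot
        self.capacity = capacity    # number of objects Tier0 may hold
        self.decay = decay          # multiplier applied to all heat each period
        self.heat = {}
        self.in_tier0 = set()

    def record_access(self, name):
        self.heat[name] = self.heat.get(name, 0.0) + 1.0

    def end_period(self):
        """Return (pushes, removals) for this period, then cool all objects."""
        pushes, removals = [], []
        candidates = sorted(
            (n for n, h in self.heat.items()
             if h > self.threshold and n not in self.in_tier0),
            key=lambda n: self.heat[n],
            reverse=True,
        )
        for name in candidates:
            if len(self.in_tier0) >= self.capacity:
                coldest = min(self.in_tier0, key=lambda n: self.heat.get(n, 0.0))
                if self.heat.get(coldest, 0.0) >= self.heat[name]:
                    continue  # every resident object is at least as hot
                self.in_tier0.remove(coldest)
                removals.append(coldest)  # would become a REMOVE on Tier0
            self.in_tier0.add(name)
            pushes.append(name)           # would become a PUSH command to Tier1
        self.heat = {n: h * self.decay for n, h in self.heat.items()}
        return pushes, removals
```

The sketch does not cover raising the replica count of hot objects; under the replica parameters defined earlier, that step would be bounded by the maximum redundancy M.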
The present invention takes full advantage of the real-time nature of data pushing, of dynamic replica scheduling based on run-time data analysis, and of its method for collecting storage system performance data and access information. With the advantages described above, and in contrast to optimizing data access performance with a traditional cache system, the invention improves hot data access performance while addressing the poor real-time behavior of current automatic tiering storage systems and, to some extent, reducing unnecessary wear on SSDs. The method proposed for this system is equally applicable to other distributed storage systems. The invention therefore has high technical and practical value in large-scale distributed object storage systems.
Although the embodiments are disclosed above, they are described only to make the invention easier to understand and do not limit it. Any person skilled in the art to which the invention pertains may make modifications and changes in the form and details of implementation without departing from the spirit and scope disclosed herein, but the scope of patent protection of the invention remains as defined by the appended claims.

Claims (8)

1. An automatic tiering storage system, characterized in that it comprises:
a high-performance storage tier Tier0, which holds high-performance replicas;
an ordinary-performance storage tier Tier1, which holds ordinary-performance replicas;
a monitor, which collects data object access information and system performance information from the storage system;
a scheduler, which maintains the data objects stored on Tier0 and, based on a scheduling policy, sends push commands to Tier1 so that Tier1 transfers data objects to Tier0;
a proxy node, which provides a push interface in the external proxy service;
wherein a one-way data path from Tier1 to Tier0 exists between Tier0 and Tier1, and the one-way data path is opened after Tier1 receives a push command from the scheduler, to carry the one-way transfer of data objects from Tier1 to Tier0;
the scheduling policy includes: hot data identification, data heat maintenance, and the data replacement policy;
the monitor obtains data object access information and system performance information from the storage system through the push interface of the proxy node, and the scheduler pushes object data whose access count exceeds a set value into Tier0 and, according to hot data identification, increases the number of replicas of hot data.
2. The automatic tiering storage system according to claim 1, characterized in that the replica parameters of Tier0 include: maximum redundancy M, the maximum number of replicas that Tier0 can hold in the system; configured redundancy m, the number of replicas pushed into Tier0, with m ≤ M; and replica slot, the position in Tier0 where replicas are placed;
the replica parameters of Tier1 include: redundancy N, the number of replicas in Tier1.
3. The automatic tiering storage system according to claim 1, characterized in that Tier0 is operated through a REST API that includes: a CREATE operation, which creates an object; a GET operation, which reads a data object; a REMOVE operation, which deletes a data object; and a CLEAN operation, which clears data objects from the storage system, wherein
the REMOVE operation is invoked either when the scheduler actively deletes data from Tier0, or by a callback request sent to the scheduler after a data object in Tier1 has been reclaimed by garbage collection;
the CLEAN operation clears data objects from the storage system over a range specified by a uniform resource locator.
4. The automatic tiering storage system according to claim 1, characterized in that the proxy node queries the replicas in Tier0 and Tier1 for the requested object data and, after selecting a replica in Tier0 or Tier1, sends the object access information to the monitor through the push interface.
5. The automatic tiering storage system according to claim 4, characterized in that selecting a replica in Tier0 or Tier1 specifically comprises:
querying whether the replica in Tier0 is available; if it is available, selecting the replica in Tier0; if it is not available, selecting the replica in Tier1.
6. The automatic tiering storage system according to claim 1, characterized in that the monitor and the scheduler are placed on a separate node outside the storage tiers and the proxy node, and the monitor is connected to the storage system's monitoring system by sharing the monitoring system's database.
7. The automatic tiering storage system according to claim 6, characterized in that connecting the monitor to the storage system's monitoring system by sharing the monitoring system's database specifically comprises:
the storage system has a monitoring interface implemented with statsd;
while the storage system is running, when a Hypertext Transfer Protocol request is made for an object, a stub function is inserted that sends monitoring data to the monitoring interface of the monitoring system over the User Datagram Protocol, and the monitor obtains the monitoring data from the monitoring system's database.
8. The automatic tiering storage system according to claim 7, characterized in that the scheduler and the monitor run in single-node mode and are protected by high availability.
CN201510696499.9A 2015-10-23 2015-10-23 A kind of storage system of AUTOMATIC ZONING Active CN105242884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510696499.9A CN105242884B (en) 2015-10-23 2015-10-23 A kind of storage system of AUTOMATIC ZONING

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510696499.9A CN105242884B (en) 2015-10-23 2015-10-23 A kind of storage system of AUTOMATIC ZONING

Publications (2)

Publication Number Publication Date
CN105242884A CN105242884A (en) 2016-01-13
CN105242884B true CN105242884B (en) 2018-10-16

Family

ID=55040547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510696499.9A Active CN105242884B (en) 2015-10-23 2015-10-23 A kind of storage system of AUTOMATIC ZONING

Country Status (1)

Country Link
CN (1) CN105242884B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463514B (en) * 2017-08-16 2021-06-29 郑州云海信息技术有限公司 Data storage method and device
CN109344077A (en) * 2018-10-24 2019-02-15 郑州云海信息技术有限公司 RestAPI characteristic test method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102508789A (en) * 2011-10-14 2012-06-20 浪潮电子信息产业股份有限公司 Grading storage method for system
CN103713861A (en) * 2014-01-09 2014-04-09 浪潮(北京)电子信息产业有限公司 File processing method and system based on hierarchical division
CN104102454A (en) * 2013-04-07 2014-10-15 杭州信核数据科技有限公司 Method for automatically realizing hierarchical storage and system for managing hierarchical storage

Also Published As

Publication number Publication date
CN105242884A (en) 2016-01-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant