US11954076B2 - Hierarchical storage management system, hierarchical storage control apparatus, hierarchical storage management method and program - Google Patents

Hierarchical storage management system, hierarchical storage control apparatus, hierarchical storage management method and program

Info

Publication number
US11954076B2
Authority
US
United States
Prior art keywords
data
data center
power consumption
stored data
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US17/629,462
Other versions
US20220222220A1 (en
Inventor
Tomonori Iino
Atsushi Sakurai
Yuriko Tanaka
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION. Assignment of assignors' interest (see document for details). Assignors: TANAKA, Yuriko; IINO, Tomonori; SAKURAI, Atsushi
Publication of US20220222220A1
Application granted
Publication of US11954076B2
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/185 Hierarchical storage management [HSM] systems, e.g. file migration or policies thereof
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates to a technique for optimizing data storage and operation by arranging data stored in a plurality of data centers based on conditions determined by a business operator.
  • a conventional hierarchical storage management system has large-capacity storage configured by using SSDs, HDDs, and magnetic tapes in accordance with the number of reference counts made to stored data and the access speed (write, read) of a storage medium. Data with a high number of reference counts is automatically stored in an SSD to achieve a higher access speed (Non Patent Literature 1).
  • in addition, there is a content distribution system that can shorten download time by providing a content cache server at the boundary between a user area and a public network and downloading content from the cache server close to the accessing user (Non Patent Literature 2).
  • the present invention has been made with the foregoing in view, and it is an object to provide a technique capable of automatically selecting a storage medium that matches an operation policy of data from a plurality of storage media disposed in a plurality of data centers.
  • a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, wherein the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
  • a technique capable of automatically selecting a storage medium that matches an operation policy of data from a plurality of storage media disposed in a plurality of data centers.
  • FIG. 1 illustrates a configuration of a hierarchical storage management system.
  • FIG. 2 illustrates a configuration of a hierarchical storage.
  • FIG. 3 illustrates a configuration of a hierarchical storage control apparatus.
  • FIG. 4 is a diagram illustrating an example of a hardware configuration of the apparatus.
  • FIG. 5 is a diagram illustrating a structure of a data center information table.
  • FIG. 6 is a diagram illustrating a structure of a storage medium information table.
  • FIGS. 7(a) to 7(c) are diagrams illustrating structures of various tables.
  • FIG. 8 is a diagram illustrating a structure of an operation policy table.
  • FIG. 9 is a diagram illustrating a structure of a stored data management table.
  • FIG. 10 is a flowchart illustrating calculation performed by a hierarchical storage control apparatus.
  • FIG. 11 is a diagram illustrating a data center arrangement according to an embodiment.
  • FIG. 12 illustrates a configuration of a hierarchical storage control system according to the embodiment.
  • FIG. 13 is a diagram illustrating a data center information table according to the embodiment.
  • FIG. 14 is a diagram illustrating a storage medium information table according to the embodiment.
  • FIGS. 15(a) to 15(c) are diagrams illustrating various tables according to the embodiment.
  • FIG. 16 is a diagram illustrating an operation policy table according to the embodiment.
  • FIG. 17 is a diagram illustrating a stored data management table according to the embodiment.
  • FIG. 18 is a diagram illustrating a calculation example.
  • FIG. 19 is a diagram illustrating a calculation example.
  • FIG. 20 is a diagram illustrating a calculation example.
  • FIG. 21 is a diagram illustrating a calculation result.
  • the present embodiment describes a technique for automatically selecting, for individual data to be stored in a plurality of data centers, a storage medium that matches an operation policy of the data, by referring to the location conditions (construction cost, electricity charges), the data reference frequency and the communication speed of the network, and the type of the storage medium storing the data and the installation location of the storage medium.
  • This technique reduces unnecessary power consumption and cost, and contributes to lower power consumption (better energy efficiency) and lower cost in cloud-type data centers and virtualized networks (NW), as well as to improved QoS.
  • the technique will be specifically described.
  • FIG. 1 illustrates a configuration of a hierarchical storage management system according to the present embodiment.
  • the hierarchical storage management system according to the present embodiment includes a hierarchical storage control apparatus 20 and a plurality of data centers 30, each connected to a network 10.
  • a user 50 is connected to the network 10.
  • the “user 50” is, for example, a client terminal used by a user.
  • a hierarchical storage 40 is disposed in each of the data centers 30.
  • a data center operator or a CDN operator stores its own data or data of the user 50 in the hierarchical storage 40 disposed in any one of the data centers 30.
  • the hierarchical storage 40 in the plurality of data centers 30 and the hierarchical storage control apparatus 20 are connected via at least one network 10 so that large-scale storage can be provided.
  • a storage medium of the hierarchical storage 40 in the data center 30 located near an urban area where the land price is high is configured mainly by a high-speed storage medium such as an SSD and stores data that has a high reference frequency and requires a short delay time.
  • a storage medium of the hierarchical storage 40 in the data center 30 located in a suburban area where the land price is low is configured mainly by a plurality of storage media with a low speed such as a magnetic tape to achieve an ultra-high capacity and stores data that has a low reference frequency and allows delay.
  • the data is downloaded from the hierarchical storage 40 in which the data is stored to the user 50.
  • FIG. 2 illustrates a configuration of the hierarchical storage 40 disposed in the data center 30.
  • the hierarchical storage 40 includes a storage unit 410, constituted of a plurality of storage media 420#1 to 420#n, and a management unit 430.
  • the individual storage medium 420 is, for example, an SSD (flash memory), an HDD (magnetic disk), an optical disk, a magnetic tape, or the like.
  • the management unit 430 checks the input and output of data and detects a reference source area and the number of cumulative reference counts when stored data is referred to. The detected information is notified to the hierarchical storage control apparatus 20 and managed therein.
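The bookkeeping described for the management unit 430 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `ReferenceTracker`, its methods, and the area strings are all hypothetical, and the notification to the hierarchical storage control apparatus 20 is left out.

```python
from collections import Counter


class ReferenceTracker:
    """Hypothetical helper: counts references to stored data per source area."""

    def __init__(self):
        # data number -> Counter mapping reference source area -> count
        self._by_area = {}

    def record(self, data_number, area):
        """Register one reference to `data_number` coming from `area`."""
        self._by_area.setdefault(data_number, Counter())[area] += 1

    def total(self, data_number):
        """Cumulative number of references to the data."""
        return sum(self._by_area.get(data_number, Counter()).values())

    def top_area(self, data_number):
        """Most frequent reference source area, or None if never referenced."""
        counts = self._by_area.get(data_number)
        return counts.most_common(1)[0][0] if counts else None


tracker = ReferenceTracker()
tracker.record(2, "South Kanto")
tracker.record(2, "South Kanto")
tracker.record(2, "Hokkaido")
```

The two values returned by `total` and `top_area` correspond to the cumulative reference count and the most frequent reference source area that the text says are reported to the control apparatus.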
  • FIG. 3 illustrates a configuration of the hierarchical storage control apparatus 20.
  • the hierarchical storage control apparatus 20 includes a calculation unit 210, a storage unit 220, and a timer 230.
  • the storage unit 220 stores a data center information table 2210, a storage medium information table 2220, a transmission line information table 2230, a calculation interval table 2240, an execution log table 2250, an operation policy table 2260, and a stored data management table 2270.
  • the timer 230 holds the current date and time.
  • the content of each table and the content of calculation performed by the calculation unit 210 will be described below.
  • the functions of the hierarchical storage control apparatus 20 can be implemented, for example, by causing a computer to execute a program.
  • the functions of the hierarchical storage control apparatus 20 can be implemented by executing a program corresponding to processing performed by the hierarchical storage control apparatus 20 by using hardware resources such as a CPU and a memory built in a computer.
  • the above program can be recorded on a computer-readable recording medium (portable memory or the like) to be stored or distributed.
  • the above program can be provided through a network such as the Internet or e-mail.
  • FIG. 4 is a diagram illustrating an example of a hardware configuration of the above computer.
  • the computer illustrated in FIG. 4 includes a drive device 1000, an auxiliary storage device 1002, a memory device 1003, a CPU 1004, an interface device 1005, a display device 1006, an input device 1007, etc. connected to each other by a bus B.
  • the program for implementing the processing by the computer is provided, for example, by a recording medium 1001 such as a CD-ROM, a memory card, or the like.
  • the program is installed in the auxiliary storage device 1002 from the recording medium 1001 via the drive device 1000.
  • the program does not necessarily need to be installed from the recording medium 1001 and may be downloaded from another computer via the network.
  • the auxiliary storage device 1002 stores the installed program and also stores necessary files, data, etc.
  • the memory device 1003 reads and stores the program from the auxiliary storage device 1002.
  • the CPU 1004 implements the functions of the hierarchical storage control apparatus 20 in accordance with the program stored in the memory device 1003.
  • the interface device 1005 is used as an interface for connecting to a network and functions as input means and output means via the network.
  • the display device 1006 displays a GUI (graphical user interface) or the like in accordance with the program.
  • the input device 1007 includes a keyboard, a mouse, buttons, a touch panel, or the like and is used to input various operation instructions.
  • FIG. 5 illustrates a structure of the data center information table 2210.
  • the data center information table 2210 stores the unique number, name, and location (address or latitude/longitude) of a data center included in the present hierarchical storage management system, the unit price of the electricity charge of the power supplied to the data center, and the construction cost per rack.
  • Each information item is input manually by an administrator or automatically when the hierarchical storage 40 is newly added to (or eliminated from) the present hierarchical storage management system or when any one of the information items is changed.
  • FIG. 6 illustrates a structure of the storage medium information table 2220.
  • the storage medium information table 2220 stores the unique number assigned to the storage medium 420 by the hierarchical storage control apparatus 20, reading time, capacity, power consumption during standby and during reading, lifetime, acquisition price, etc.
  • Each information item is input manually by an administrator or automatically when the storage medium is newly added (or eliminated) or when any one of the information items is changed.
  • FIG. 7(a) illustrates a structure of the transmission line information table 2230.
  • the transmission line information table 2230 stores the unique number assigned by the present hierarchical storage management system to an individual transmission line connecting the data centers 30, the data center numbers of the data centers located at both ends of the transmission line, and the communication speed of the transmission line.
  • Each information item is input manually by an administrator or automatically when a new transmission line is established or when any one of the information items is changed.
  • the transmission line may be a dedicated line for the operator of the hierarchical storage management system or a public line.
  • FIG. 7(b) illustrates a structure of the calculation interval table 2240.
  • the calculation interval table 2240 stores intervals (for example, one year, one month) for the calculation determined by the data center operator or the administrator to be performed periodically.
  • the calculation interval is updated when the administrator performs an input to the hierarchical storage management system.
  • FIG. 7(c) illustrates a structure of the execution log table 2250. As illustrated in FIG. 7(c), the execution log table 2250 stores the dates and times at which the calculation was executed in the past.
  • FIG. 8 illustrates a structure of the operation policy table 2260.
  • the operation policy table 2260 stores the operation policy of the present hierarchical storage management system.
  • the data center operator or the administrator determines the rank of the delay time, power consumption, and cost, and stores these information items in association with a corresponding policy number.
  • FIG. 9 illustrates a structure of the stored data management table 2270.
  • the stored data management table 2270 manages all the data stored in the present hierarchical storage management system in cooperation with the management unit 430 of the hierarchical storage 40 in each of the data centers 30.
  • a record is added to the stored data management table 2270 to record the data number of the data, the data size, the number of the storage medium storing the data, the number of reference counts, the most frequent reference source area, the policy number freely set by the administrator, and the communication speed, power consumption, and cost obtained by the calculation, which will be described below.
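One way to picture a record of the stored data management table 2270 is as a small data class. The sketch below is only an illustration: the class name, field names, and types are assumptions, and the three calculated fields start empty because the text says they are filled in by the later calculation. The sample values for data number 2 (1 TB, storage medium "11", policy number 4) are taken from the example in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredDataRecord:
    """Hypothetical layout of one row of the stored data management table."""
    data_number: int
    size_bytes: int
    medium_number: str                        # storage medium holding the data
    reference_count: int = 0
    top_reference_area: Optional[str] = None  # most frequent reference source area
    policy_number: Optional[int] = None       # set by the administrator
    # Filled in by the periodic calculation:
    communication_time: Optional[float] = None
    annual_power: Optional[float] = None
    annual_cost: Optional[float] = None


# Data number 2 from the example: 1 TB on storage medium "11", policy number 4.
record = StoredDataRecord(data_number=2, size_bytes=10**12,
                          medium_number="11", policy_number=4)
```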
  • the calculation unit 210 compares the latest calculation execution date and time in the execution log table 2250 with the date and time of the timer 230, and when the calculation interval stored in the calculation interval table 2240 has elapsed, the calculation unit 210 starts a calculation. In addition, the calculation unit 210 stores the date and time when the calculation is started in the execution log table 2250.
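The trigger condition above can be sketched in a few lines. This is an illustration under assumptions: the function name `calculation_due` is hypothetical, the execution log is modeled as a plain list of `datetime` values, "one month" is approximated by a 30-day `timedelta`, and running immediately when the log is empty is a choice the text does not specify.

```python
from datetime import datetime, timedelta


def calculation_due(execution_log, now, interval):
    """Return True when `interval` has elapsed since the latest logged run."""
    if not execution_log:
        return True  # assumption: run at once if nothing has been logged yet
    return now - max(execution_log) >= interval


execution_log = [datetime(2019, 6, 1), datetime(2019, 7, 1)]
interval = timedelta(days=30)   # stand-in for a one-month calculation interval
now = datetime(2019, 8, 1)      # value held by the timer

if calculation_due(execution_log, now, interval):
    execution_log.append(now)   # log the start time, then run the calculation
```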
  • the calculation unit 210 performs the following processing for each of all the data managed in the stored data management table 2270 : the calculation unit 210 calculates the annual power consumption needed for storing the data by using the following formula (1), calculates the annual cost needed for storing the data by using the following formula (2), and calculates the communication speed from the data center storing the data to the most frequent reference source area by using the following formula (3). Since time needed for reading and transmitting data is used as the communication speed in the present embodiment, the communication speed may be referred to as communication time instead. In addition, while the annual value is used in the present example, a value for a period other than one year may be used.
  • $PU_{year} = T_{read} \times F_{read} \times P_{read} + (8760 - T_{read} \times F_{read}) \times P_{idle}$   formula (1)
  • $PU_{year}$: annual data storage power consumption
  • $T_{read}$: reading time
  • $F_{read}$: reference frequency
  • $P_{read}$: power consumption during reading
  • $P_{idle}$: power consumption during standby
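Formula (1) adds the energy used while reading to the energy used on standby for the rest of the year. A minimal sketch follows, assuming the reading time is expressed in hours (8760 is the number of hours in a 365-day year) and that both power values use the same unit; the function name and the sample numbers are illustrative and not taken from the patent.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def annual_power_consumption(t_read_h, f_read, p_read, p_idle):
    """Reading energy plus standby energy over one year.

    t_read_h: time for one read, in hours (assumed unit)
    f_read:   number of reads per year
    p_read:   power drawn while reading
    p_idle:   power drawn on standby
    """
    reading_hours = t_read_h * f_read
    return reading_hours * p_read + (HOURS_PER_YEAR - reading_hours) * p_idle


# Illustrative values only: 0.5 h per read, 20 reads a year.
pu_year = annual_power_consumption(t_read_h=0.5, f_read=20, p_read=10, p_idle=1)
```

With these made-up inputs the medium reads for 10 hours and idles for 8750 hours, so standby dominates the total, which is the usual situation for rarely referenced data.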
  • $C_{year} = PU_{year} \times Charge_{power} + (Charge_{footprint} \times Size_{data}) \div Density_{storage} + (Charge_{media} \times Size_{data}) \div (Capacity_{media} \times Lifetime_{media})$   formula (2)
  • $C_{year}$: annual data storage cost
  • $PU_{year}$: data storage power consumption
  • $Charge_{power}$: electricity charge
  • $Charge_{footprint}$: space cost
  • $Size_{data}$: data size
  • $Density_{storage}$: storage medium recording density
  • $Charge_{media}$: unit price of the storage medium
  • $Capacity_{media}$: capacity of the storage medium
  • $Lifetime_{media}$: lifetime of the storage medium
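Formula (2) combines three cost terms: electricity, floor space, and the storage medium itself. The sketch below rests on an assumption about the operators, which are garbled in this text: the space term is read as (space cost × data size) ÷ recording density, and the media term as (unit price × data size) ÷ (capacity × lifetime), which is the reading that is dimensionally consistent with an annual cost. The function name, parameter units, and sample values are illustrative.

```python
def annual_storage_cost(pu_year, charge_power, charge_footprint, size_data,
                        density_storage, charge_media, capacity_media,
                        lifetime_media):
    """Annual cost of keeping one data item on a given medium (assumed form)."""
    power_cost = pu_year * charge_power
    # Space needed scales with data size and shrinks with recording density.
    space_cost = (charge_footprint * size_data) / density_storage
    # Share of the medium's price, spread over its capacity and lifetime.
    media_cost = (charge_media * size_data) / (capacity_media * lifetime_media)
    return power_cost + space_cost + media_cost


# Illustrative values only.
c_year = annual_storage_cost(pu_year=100, charge_power=20,
                             charge_footprint=1000, size_data=1,
                             density_storage=50, charge_media=600,
                             capacity_media=60, lifetime_media=5)
```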
  • $T_{DL} = T_{read} + T_{1}$   formula (3)
  • $T_{DL}$: data download time from the data center to the reference source
  • $T_{read}$: reading time
  • $T_{1}$: communication speed (communication time) of the NW
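Formula (3) is a plain sum of the time to read the data from the medium and the time to send it over the network. In the sketch below, deriving both times from the data size and a throughput figure is an assumption made for illustration; the function name and the 1 Gbit/s line rate are not from the patent, while the 1 TB size and 200 MB/s read rate echo the calculation example in the text.

```python
def download_time(size_bytes, read_rate_bps, line_rate_bps):
    """T_DL = T_read + T_1, in seconds; both rates are in bytes per second."""
    t_read = size_bytes / read_rate_bps   # time to read from the medium
    t_1 = size_bytes / line_rate_bps      # time to cross the network
    return t_read + t_1


# 1 TB read at 200 MB/s, sent over an assumed 1 Gbit/s (125 MB/s) line.
t_dl = download_time(size_bytes=10**12,
                     read_rate_bps=200 * 10**6,
                     line_rate_bps=125 * 10**6)
```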
  • the calculation unit 210 stores the annual power consumption, the annual cost, and the communication speed from the data center to the most frequent reference source area, which have been calculated in S1, in the stored data management table 2270 for each data item.
  • the power consumption, the cost, and the communication speed are calculated for all the data, and subsequently the determination, etc. in S3, which will be described below, is performed.
  • alternatively, the repetitive processing of “calculation, determination, change” may be performed per data item until the operation policy is satisfied.
  • the calculation unit 210 compares, for each data item, the values calculated in S1 (the annual power consumption, the annual storage cost, and the communication speed) with the values set for the policy number corresponding to the data in the operation policy table 2260, and determines whether all the values satisfy the corresponding values of the operation policy. When all the values of all the data satisfy the corresponding values in the respective operation policies, the processing ends.
  • when any data does not satisfy its operation policy, the data center base and the storage medium corresponding to that data are changed.
  • the data center base may not be changed, and only the storage medium may be changed.
  • the change is not particularly limited.
  • the change may be made by increasing (or decreasing) the data center number/storage medium number. After the change has been made, the calculation is performed on the assumption that the data is stored in a changed storage medium.
  • when the operation policy is satisfied after the change, the calculation unit 210 transfers the data to the storage medium of the data center selected at that time.
  • the transfer of data from one data center to another data center can be implemented by instructing the management unit 430 of the hierarchical storage 40 in the relevant data center.
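The "calculation, determination, change" loop described above can be sketched as a search over candidate storage media. Everything named here is hypothetical: the `Policy` class models the operation policy as three upper bounds, `select_medium` tries the current medium first and then the others in order, and the metric values are made up so that the example runs. The patent does not fix the order in which candidates are tried.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Assumed form of an operation policy: upper bounds on three metrics."""
    max_time: float
    max_power: float
    max_cost: float


def satisfies(policy, metrics):
    """metrics is a (communication time, annual power, annual cost) tuple."""
    time, power, cost = metrics
    return (time <= policy.max_time and power <= policy.max_power
            and cost <= policy.max_cost)


def select_medium(current, candidates, evaluate, policy):
    """Keep `current` if it meets the policy, else the first candidate that does.

    Returns None when no candidate satisfies the policy.
    """
    ordered = [current, *[c for c in candidates if c != current]]
    for medium in ordered:
        if satisfies(policy, evaluate(medium)):
            return medium
    return None


# Made-up metrics per medium: (communication time, annual power, annual cost).
metrics_by_medium = {
    "HDD-11": (13000.0, 900.0, 2500.0),
    "SSD-21": (6000.0, 400.0, 4000.0),
    "TAPE-71": (40000.0, 50.0, 300.0),
}
policy = Policy(max_time=20000.0, max_power=500.0, max_cost=5000.0)

chosen = select_medium("HDD-11", metrics_by_medium,
                       metrics_by_medium.__getitem__, policy)
needs_transfer = chosen is not None and chosen != "HDD-11"
```

In this made-up case the current medium exceeds the power bound, the next candidate meets all three bounds, and `needs_transfer` marks that the data would then be moved by instructing the management unit of the relevant data center.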
  • FIG. 11 illustrates the locations of the data centers in the present example. As illustrated in FIG. 11, four data centers (South Kanto, North Kanto, Joshinetsu, Hokkaido) are located in the eastern Japan area.
  • FIG. 12 illustrates a configuration of the hierarchical storage management system of the present example.
  • each data center is provided with the hierarchical storage 40 .
  • a configuration of a storage medium in the individual hierarchical storage 40 is as illustrated in FIG. 11.
  • the hierarchical storage control apparatus 20 is provided in the South Kanto area. The hierarchical storage control apparatus 20 is connected to each data center via a public network.
  • FIG. 13 illustrates the data center information table 2210 of the present example.
  • the data center information table 2210 stores the name, location, unit price of the electricity charge, and construction cost per rack of each data center.
  • the construction cost per rack is a value obtained by dividing the total construction cost by the number of accommodated racks.
  • FIG. 14 illustrates the storage medium information table 2220 of the present example. As illustrated in FIG. 14, the storage medium information table 2220 stores information about the storage accommodated in each data center.
  • FIG. 15(a) illustrates the transmission line information table 2230 of the present example.
  • the transmission line number in the transmission line information table 2230 in FIG. 15(a) corresponds to a number assigned to the individual transmission line in FIG. 12.
  • the South Kanto and Joshinetsu are connected by a dedicated line, and the South Kanto and Hokkaido are also connected by a dedicated line.
  • the South Kanto and the North Kanto are connected via a public network, instead of a dedicated line.
  • the data centers may be directly connected to one another by a dedicated line or may be connected via a public network.
  • FIG. 15(b) illustrates the calculation interval table 2240 of the present example.
  • the hierarchical storage control apparatus 20 performs the calculation and rearranges the stored data at intervals of one month.
  • the calculation interval stored in the calculation interval table 2240 is set to one month.
  • FIG. 15(c) illustrates the execution log table 2250 of the present example.
  • the execution log table 2250 stores the calculation time in the past. In the present example, the calculation is performed on the first of every month.
  • FIG. 16 illustrates the operation policy table 2260 of the present example.
  • the business operator determines a communication speed (communication time) needed for downloading the data, power consumption needed for storing the data, and a cost based on the construction cost, the acquisition price of the storage, the electricity charge, etc. and assigns an operation policy number to each operation policy in the operation policy table 2260.
  • data having a data size of 300 GB is stored in the SSD “21” of the North Kanto data center as data number 1
  • data of 1 TB is stored in the HDD “11” of the South Kanto data center as data number 2
  • data of 500 MB is stored in the magnetic tape “71” of Hokkaido data center as data number 3.
  • the users who have uploaded the above data have set policy numbers 5, 4, and 1 for data numbers 1, 2, and 3, respectively.
  • the number of reference counts and the most frequent reference source areas at the time when the first calculation is performed are as illustrated in FIG. 17.
  • the calculation described with reference to the flowchart in FIG. 10 starts on the first day of the following month of the data storage.
  • the calculation performed for the data of the data number “2” will be described as an example.
  • the calculation unit 210 calculates the annual power consumption needed for storing the data of the data number “2” by using the formula (1).
  • FIG. 18 illustrates the content of the calculation.
  • $F_{read}$ (annual reference frequency) is 20 times
  • $P_{read}$: power consumption of a single reading of 60 T
  • the total capacity of the storage medium is 60 T. Therefore, $T_{read} \times F_{read} \times P_{read}$ (annual power consumption needed for reading) is $((1\,\mathrm{T[B]} / 200\,\mathrm{M[B/s]}) / 3600) \times 20 \times 10 / 60$, as illustrated in FIG. 18.
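The reading term quoted above can be evaluated directly. The sketch below simply reproduces the arithmetic of the expression as written; the unit of the factor 10 (the reading power for the 60 T medium) is not stated in this text, so the result is left unitless, and the variable names are illustrative.

```python
data_size_bytes = 1e12     # 1 T[B], size of data number 2
read_rate = 200e6          # 200 M[B/s]
f_read = 20                # annual reference frequency

# Reading time for one access, converted from seconds to hours.
t_read_hours = (data_size_bytes / read_rate) / 3600

# Reading power scaled to the 1 T share of the 60 T medium: 10 / 60.
reading_term = t_read_hours * f_read * 10 / 60

print(round(reading_term, 2))  # prints 4.63
```

One read of 1 TB at 200 MB/s takes 5000 seconds, or about 1.39 hours, so 20 reads a year amount to roughly 27.8 hours of reading before the power factor is applied.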
  • the calculation unit 210 calculates the annual cost needed for storing the data of the data number “2” by using the formula (2).
  • FIG. 19 illustrates the content of the calculation in detail.
  • the calculation unit 210 calculates the communication speed (communication time) from the data center storing the data to the most frequent reference source area by using the formula (3).
  • FIG. 20 illustrates the details.
  • $T_{read}$: reading time
  • $T_{1}$: communication time of the NW
  • the calculation unit 210 stores the annual power consumption, the annual cost, and the communication speed from the corresponding data center to the most frequent reference source area, which have been calculated as described above, in the stored data management table 2270.
  • FIG. 21 illustrates the portion of the data number 2 in the stored data management table 2270 after the above calculation results have been stored.
  • the calculation unit 210 acquires the communication speed, the power consumption, and the cost corresponding to the policy number “4” set for the data of the data number 2 from the operation policy table 2260.
  • the calculation unit 210 compares these acquired values with the communication speed, the power consumption, and the cost calculated above.
  • the calculation unit 210 determines that the storage medium HDD “11” currently storing the data does not satisfy the operation policy. Thus, the calculation unit 210 changes the storage medium and performs the calculation, and the calculation is continued until the values are within the range of the policy number “4”.
  • the data center and the type of storage medium can be automatically selected for the data to be stored in accordance with the communication speed between the data centers and the power consumption and the cost needed for storing the data set in advance by the business operator.
  • the power consumption and the cost can be reduced, and this leads to reductions of the electricity charge and environmental load as well as an improvement in QoS.
  • the present description discloses at least the hierarchical storage management system, the hierarchical storage control apparatus, the hierarchical storage management method, and the program in the following items.
  • a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, wherein the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
  • the hierarchical storage management system wherein, when the calculation unit determines that the calculated power consumption, cost, and communication time do not satisfy the operation policy set for the data, the calculation unit changes a storage medium storing the data and performs the processing on an assumption that the data is stored in a changed storage medium.
  • the hierarchical storage management system according to item 1 or 2, wherein the calculation unit calculates power consumption needed for storing the data by calculating a sum of power consumption for data reading from the storage medium storing the data and power consumption during standby,
  • a hierarchical storage management method used in a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, wherein the hierarchical storage control apparatus performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
  • a hierarchical storage control apparatus used in a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, wherein the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

In a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a National Stage application under 35 U.S.C. § 371 of International Application No. PCT/JP2019/030082, having an International Filing Date of Jul. 31, 2019. The disclosure of the prior application is considered part of the disclosure of this application, and is incorporated in its entirety into this application.
TECHNICAL FIELD
The present invention relates to a technique for optimizing data storage and operation by arranging data stored in a plurality of data centers based on conditions determined by a business operator.
BACKGROUND ART
A conventional hierarchical storage management system has large-capacity storage configured by using SSDs, HDDs, and magnetic tapes in accordance with the number of references made to stored data and the access speed (write, read) of a storage medium. Data with a high reference count is automatically stored in an SSD to achieve a higher access speed (Non Patent Literature 1).
In addition, there is a content distribution system that can shorten download time by providing a content cache server at the boundary between a user area and a public network and downloading a content from the cache server close to an accessing user (Non Patent Literature 2).
Furthermore, thin clients, software-defined storage employing virtualization, etc. have become widespread, and data is managed in data centers without ensuring storage in client terminals of the users (Non Patent Literature 3).
CITATION LIST Non Patent Literature
[NPL 1] https://www.atmarkit.co.jp/ait/articles/1106/27/news109.html
[NPL 2] https://blog.redbox.ne.jp/what-is-cdn.html
[NPL 3] https://www.atmarkit.co.jp/ait/articles/1409/29/news130.htm
SUMMARY OF THE INVENTION Technical Problem
In recent years, instead of holding and managing software, data, etc. on their own computer hardware, users have come to manage them on a server in a data center connected to a network. With the increasing capacity and speed of communication networks, the spread of SNS, the revision of the Electronic Document Law, etc., there has been an increasing demand for storing large volumes of data with different purposes on the network.
Accordingly, a large amount of data of various sizes is stored in the data center for an extended period of time. Among various types of data, there is data for which delay is not allowed, data for which delay is allowed but power consumption and costs need to be reduced, etc. However, to store each of the data having such various operation policies in an appropriate storage medium, the processing needs to be performed manually in the prior art, and thus takes a great deal of time and effort.
The present invention has been made with the foregoing in view, and it is an object to provide a technique capable of automatically selecting a storage medium that matches an operation policy of data from a plurality of storage media disposed in a plurality of data centers.
Means for Solving the Problem
According to the disclosed technique, there is provided a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, wherein the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
Effects of the Invention
According to the disclosed technique, there is provided a technique capable of automatically selecting a storage medium that matches an operation policy of data from a plurality of storage media disposed in a plurality of data centers.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 illustrates a configuration of a hierarchical storage management system.
FIG. 2 illustrates a configuration of a hierarchical storage.
FIG. 3 illustrates a configuration of a hierarchical storage control apparatus.
FIG. 4 is a diagram illustrating an example of a hardware configuration of the apparatus.
FIG. 5 is a diagram illustrating a structure of a data center information table.
FIG. 6 is a diagram illustrating a structure of a storage medium information table.
FIGS. 7(a) to 7(c) are diagrams illustrating structures of various tables.
FIG. 8 is a diagram illustrating a structure of an operation policy table.
FIG. 9 is a diagram illustrating a structure of a stored data management table.
FIG. 10 is a flowchart illustrating calculation performed by a hierarchical storage control apparatus.
FIG. 11 is a diagram illustrating a data center arrangement according to an embodiment.
FIG. 12 illustrates a configuration of a hierarchical storage control system according to the embodiment.
FIG. 13 is a diagram illustrating a data center information table according to the embodiment.
FIG. 14 is a diagram illustrating a storage medium information table according to the embodiment.
FIGS. 15(a) to 15(c) are diagrams illustrating various tables according to the embodiment.
FIG. 16 is a diagram illustrating an operation policy table according to the embodiment.
FIG. 17 is a diagram illustrating a stored data management table according to the embodiment.
FIG. 18 is a diagram illustrating a calculation example.
FIG. 19 is a diagram illustrating a calculation example.
FIG. 20 is a diagram illustrating a calculation example.
FIG. 21 is a diagram illustrating a calculation result.
DESCRIPTION OF EMBODIMENTS
Hereinafter, an embodiment of the present invention (the present embodiment) will be described with reference to the drawings. The embodiment described below is merely an example. An embodiment to which the present invention is applied is not limited to the following embodiment.
The present embodiment describes a technique for automatically selecting, for individual data to be stored in a plurality of data centers, a storage medium that matches an operation policy of the data, by referring to the location conditions (construction cost, electricity charges), the data reference frequency and the communication speed of the network, and the type of the storage medium storing the data and the installation location of the storage medium. This technique reduces unnecessary power consumption and cost and contributes to reductions in power consumption (improved energy-saving properties) and cost in a cloud-type data center and a virtualized NW, as well as to improved QoS. Hereinafter, the technique will be specifically described.
(Overall System Configuration)
FIG. 1 illustrates a configuration of a hierarchical storage management system according to the present embodiment. As illustrated in FIG. 1 , the hierarchical storage management system according to the present embodiment includes a hierarchical storage control apparatus 20 and a plurality of data centers 30 each connected to a network 10. In addition, a user 50 is connected to the network 10. The “user 50” is, for example, a client terminal used by a user.
As illustrated in FIG. 1 , a hierarchical storage 40 is disposed in each of the data centers 30. A data center operator or a CDN operator stores its own data or data of the user 50 in the hierarchical storage 40 disposed in any one of the data centers 30.
The hierarchical storage 40 in the plurality of data centers 30 and the hierarchical storage control apparatus 20 are connected via at least one network 10 so that large-scale storage can be provided.
For example, a storage medium of the hierarchical storage 40 in the data center 30 located near an urban area where the land price is high is configured mainly by a high-speed storage medium such as an SSD and stores data that has a high reference frequency and requires a short delay time. In contrast, a storage medium of the hierarchical storage 40 in the data center 30 located in a suburban area where the land price is low is configured mainly by a plurality of storage media with a low speed such as a magnetic tape to achieve an ultra-high capacity and stores data that has a low reference frequency and allows delay.
When the user 50 refers to data, the data is downloaded from the hierarchical storage 40 in which the data is stored to the user 50.
(Configuration of Hierarchical Storage 40)
FIG. 2 illustrates a configuration of the hierarchical storage 40 disposed in the data center 30. As illustrated in FIG. 2 , the hierarchical storage 40 includes a storage unit 410, constituted of a plurality of storage media 420 #1 to 420 #n, and a management unit 430.
The individual storage medium 420 is, for example, an SSD (flash memory), an HDD (magnetic disk), an optical disk, a magnetic tape, or the like. The management unit 430 checks the input and output of data and detects a reference source area and the number of cumulative reference counts when stored data is referred to. The detected information is notified to the hierarchical storage control apparatus 20 and managed therein.
(Configuration of Hierarchical Storage Control Apparatus 20)
FIG. 3 illustrates a configuration of the hierarchical storage control apparatus 20. As illustrated in FIG. 3 , the hierarchical storage control apparatus 20 includes a calculation unit 210, a storage unit 220, and a timer 230.
The storage unit 220 stores a data center information table 2210, a storage medium information table 2220, a transmission line information table 2230, a calculation interval table 2240, an execution log table 2250, an operation policy table 2260, and a stored data management table 2270.
The timer 230 holds current date and time. The content of each table and the content of calculation performed by the calculation unit 210 will be described below.
(Hardware Configuration Example)
The functions of the hierarchical storage control apparatus 20 can be implemented, for example, by causing a computer to execute a program.
That is, the functions of the hierarchical storage control apparatus 20 can be implemented by executing a program corresponding to processing performed by the hierarchical storage control apparatus 20 by using hardware resources such as a CPU and a memory built in a computer. The above program can be recorded on a computer-readable recording medium (portable memory or the like) to be stored or distributed. In addition, the above program can be provided through a network such as the Internet or e-mail.
FIG. 4 is a diagram illustrating an example of a hardware configuration of the above computer. The computer illustrated in FIG. 4 includes a drive device 1000, an auxiliary storage device 1002, a memory device 1003, a CPU 1004, an interface device 1005, a display device 1006, an input device 1007, etc. connected to each other by a bus B.
The program for implementing the processing by the computer is provided, for example, by a recording medium 1001 such as a CD-ROM, a memory card, or the like. When the recording medium 1001 storing the program is set in the drive device 1000, the program is installed in the auxiliary storage device 1002 from the recording medium 1001 via the drive device 1000. However, the program does not necessarily need to be installed from the recording medium 1001 and may be downloaded from another computer via the network. The auxiliary storage device 1002 stores the installed program and also stores necessary files, data, etc.
When the program is instructed to start, the memory device 1003 reads and stores the program from the auxiliary storage device 1002. The CPU 1004 implements the functions of the hierarchical storage control apparatus 20 in accordance with the program stored in the memory device 1003. The interface device 1005 is used as an interface for connecting to a network and functions as input means and output means via the network. The display device 1006 displays a GUI (graphical user interface) or the like in accordance with the program. The input device 1007 includes a keyboard, a mouse, buttons, a touch panel, or the like and is used to input various operation instructions.
(Description of Tables)
Next, the tables stored in the storage unit 220 of the hierarchical storage control apparatus 20 will be described.
FIG. 5 illustrates a structure of the data center information table 2210. As illustrated in FIG. 5 , the data center information table 2210 stores the unique number, name, and location (address or latitude/longitude) of a data center included in the present hierarchical storage management system, the unit price of the electricity charge of the power supplied to the data center, and the construction cost per rack. Each information item is input manually by an administrator or automatically when the hierarchical storage 40 is newly added to (or eliminated from) the present hierarchical storage management system or when any one of the information items is changed.
FIG. 6 illustrates a structure of the storage medium information table 2220. As illustrated in FIG. 6 , the storage medium information table 2220 stores the unique number assigned to the storage medium 420 by the hierarchical storage control apparatus 20, reading time, capacity, power consumption during standby and during reading, lifetime, acquisition price, etc. Each information item is input manually by an administrator or automatically when the storage medium is newly added (or eliminated) or when any one of the information items is changed.
FIG. 7(a) illustrates a structure of the transmission line information table 2230. As illustrated in FIG. 7(a), the transmission line information table 2230 stores the unique number assigned by the present hierarchical storage management system to an individual transmission line connecting between the data centers 30, the data center numbers of the data centers located at both ends of the transmission line, and the communication speed of the transmission line. Each information item is input manually by an administrator or automatically when a new transmission line is established or when any one of the information items is changed. The transmission line may be a dedicated line for the operator of the hierarchical storage management system or a public line.
FIG. 7(b) illustrates a structure of the calculation interval table 2240. As illustrated in FIG. 7(b), the calculation interval table 2240 stores intervals (for example, one year, one month) for the calculation determined by the data center operator or the administrator to be performed periodically. The calculation interval is updated when the administrator performs an input to the hierarchical storage management system.
FIG. 7(c) illustrates a structure of the execution log table 2250. As illustrated in FIG. 7(c), the execution log table 2250 stores the calculation execution date and time in the past.
FIG. 8 illustrates a structure of the operation policy table 2260. As illustrated in FIG. 8 , the operation policy table 2260 stores the operation policy of the present hierarchical storage management system. The data center operator or the administrator determines the rank of the delay time, power consumption, and cost, and stores these information items in association with a corresponding policy number.
FIG. 9 illustrates a structure of the stored data management table 2270. As illustrated in FIG. 9 , the stored data management table 2270 manages all the data stored in the present hierarchical storage management system in cooperation with the management unit 430 of the hierarchical storage 40 in each of the data centers 30. When new data is stored in the storage medium 420 in the storage unit 410, a record is added to the stored data management table 2270. The record holds the data number of the data, the data size, the number of the storage medium storing the data, the number of reference counts, the most frequent reference source area, the policy number freely set by the administrator, and the communication speed, power consumption, and cost obtained by the calculation, which will be described below.
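As an illustration of how two of these tables might be represented in software, the following is a minimal sketch in Python. The class and field names, the units noted in the comments, the placeholder area name, and the use of `None` for values not yet calculated are all assumptions made for this sketch; the description above names the columns but does not prescribe identifiers or units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperationPolicy:
    """One row of the operation policy table (hypothetical field names)."""
    policy_number: int
    max_communication_time: float  # assumed unit: seconds
    max_power_consumption: float   # assumed unit: Wh per year
    max_cost: float                # assumed unit: yen per year


@dataclass
class StoredData:
    """One row of the stored data management table (hypothetical field names)."""
    data_number: int
    data_size: float               # assumed unit: bytes
    storage_medium_number: int
    reference_count: int
    top_reference_area: str
    policy_number: int
    # Filled in by the calculation; None until the first calculation has run.
    communication_time: Optional[float] = None
    power_consumption: Optional[float] = None
    cost: Optional[float] = None


# A record as it might look right after new data is stored:
# the calculated columns are still empty.
record = StoredData(
    data_number=2,
    data_size=1e12,
    storage_medium_number=11,
    reference_count=0,
    top_reference_area="area-A",   # placeholder, not taken from the figures
    policy_number=4,
)
```

Keeping the calculated columns optional mirrors the order of events in the description: the record is created when the data is stored, and the calculated values are written later.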
(Processing Operation of Hierarchical Storage Control Apparatus 20)
Hereinafter, the details of the calculation processing performed by the calculation unit 210 of the hierarchical storage control apparatus 20 will be described with reference to a flowchart in FIG. 10 .
The calculation unit 210 compares the latest calculation execution date and time in the execution log table 2250 with the date and time of the timer 230, and when the calculation interval stored in the calculation interval table 2240 has elapsed, the calculation unit 210 starts a calculation. In addition, the calculation unit 210 stores the date and time when the calculation is started in the execution log table 2250.
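The start condition above can be sketched as follows in Python. The function names are hypothetical, the execution log is modeled as a list whose last element is the latest run, and the interval is modeled as a fixed-length `timedelta` (a calendar interval such as "one month" would need separate handling). Starting immediately when the log is empty is also an assumption of this sketch.

```python
from datetime import datetime, timedelta
from typing import List


def calculation_due(last_run: datetime, now: datetime, interval: timedelta) -> bool:
    """Return True when at least `interval` has elapsed since the last logged run."""
    return now - last_run >= interval


def start_if_due(execution_log: List[datetime], now: datetime,
                 interval: timedelta) -> bool:
    """Log `now` and return True if a calculation should start, else return False."""
    # Assumption: with an empty log, the first calculation starts immediately.
    if execution_log and not calculation_due(execution_log[-1], now, interval):
        return False
    # The start date and time is recorded in the execution log.
    execution_log.append(now)
    return True
```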
In S1 (step 1) in FIG. 10 , the calculation unit 210 performs the following processing for each of all the data managed in the stored data management table 2270: the calculation unit 210 calculates the annual power consumption needed for storing the data by using the following formula (1), calculates the annual cost needed for storing the data by using the following formula (2), and calculates the communication speed from the data center storing the data to the most frequent reference source area by using the following formula (3). Since time needed for reading and transmitting data is used as the communication speed in the present embodiment, the communication speed may be referred to as communication time instead. In addition, while the annual value is used in the present example, a value for a period other than one year may be used.
$$PU_{year} = T_{read} \times F_{read} \times P_{read} + (8760 - T_{read} \times F_{read}) \times P_{idle}$$  formula (1)
$PU_{year}$: annual data storage power consumption
$T_{read}$: reading time
$F_{read}$: reference frequency
$P_{read}$: power consumption during reading
$P_{idle}$: power consumption during standby
$$C_{year} = PU_{year} \times Charge_{power} + (Charge_{footprint} \times Size_{data}) \div Density_{storage} + (Charge_{media} \times Size_{data}) \div (Capacity_{media} \times Lifetime_{media})$$  formula (2)
$C_{year}$: annual data storage cost
$PU_{year}$: data storage power consumption
$Charge_{power}$: electricity charge
$Charge_{footprint}$: space cost
$Size_{data}$: data size
$Density_{storage}$: storage medium recording density
$Charge_{media}$: unit price of the storage medium
$Capacity_{media}$: capacity of the storage medium
$Lifetime_{media}$: lifetime of the storage medium
$$T_{DL} = T_{read} + T_{1}$$  formula (3)
$T_{DL}$: data download time from the data center to the reference source
$T_{read}$: reading time
$T_{1}$: communication speed (communication time) of the NW
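Formulas (1) to (3) translate directly into code. The following is a minimal sketch in Python; the function and parameter names are hypothetical, and the units (hours for the reading time in formula (1), a consistent size unit for data size and capacity) are assumptions of this sketch rather than something the formulas fix.

```python
HOURS_PER_YEAR = 8760


def annual_power(t_read_h: float, f_read: float,
                 p_read: float, p_idle: float) -> float:
    """Formula (1): reading power plus standby power over one year."""
    reading_hours = t_read_h * f_read
    return reading_hours * p_read + (HOURS_PER_YEAR - reading_hours) * p_idle


def annual_cost(pu_year: float, charge_power: float,
                charge_footprint: float, size_data: float,
                density_storage: float, charge_media: float,
                capacity_media: float, lifetime_media: float) -> float:
    """Formula (2): electricity cost + space cost + storage medium cost."""
    power_cost = pu_year * charge_power
    space_cost = charge_footprint * size_data / density_storage
    media_cost = charge_media * size_data / (capacity_media * lifetime_media)
    return power_cost + space_cost + media_cost


def download_time(t_read: float, t_1: float) -> float:
    """Formula (3): reading time plus network communication time."""
    return t_read + t_1
```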
In S2, the calculation unit 210 stores the annual power consumption, the annual cost, and the communication speed from the data center to the most frequent reference source area, which have been calculated in S1, in the stored data management table 2270 per data.
In the present example, first, the power consumption, the cost, and the communication speed are calculated for each of all the data, and subsequently, determination, etc. in S3, which will be described below, are performed. Alternatively, however, repetitive processing of “calculation, determination, change” (until the operation policy is satisfied) per data may be performed.
In S3, the calculation unit 210 compares the resultant values (the annual power consumption, the annual storage cost, and the communication speed) calculated in S1 with values set for the policy number corresponding to the data in the operation policy table 2260 per data and determines whether all the values satisfy the corresponding values of the operation policy. When all the values of all the data satisfy the corresponding values in the respective operation policies, the processing ends.
When one or more pieces of data have a value that does not satisfy the corresponding operation policy, the processing of S4 through S8 is performed on each of them.
In S4, the data center base and the storage medium corresponding to the data are changed. Alternatively, the data center base may not be changed, and only the storage medium may be changed. How the change is performed is not particularly limited. For example, the change may be made by increasing (or decreasing) the data center number/storage medium number. After the change has been made, the calculation is performed on the assumption that the data is stored in a changed storage medium.
The processing of S4 through S7 is then repeated until the determination in S7 (the same determination as in S3) becomes Yes. The content of the calculation in S5 is the same as that in S1.
When the determination in S7 becomes Yes (when the operation policy is satisfied), the calculation unit 210 transfers the data to the storage medium of the data center at that time. The transfer of data from one data center to another data center can be implemented by instructing the management unit 430 of the hierarchical storage 40 in the relevant data center.
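The loop of S3 through S8 for a single piece of data can be sketched as follows in Python. This is only an illustration under stated assumptions: the names `Metrics`, `satisfies`, and `find_medium` are hypothetical, "satisfying" a policy is taken to mean that no calculated value exceeds its policy value, candidates are tried in the order given, and returning `None` when no candidate satisfies the policy is a choice of this sketch, since the description does not cover that case.

```python
from typing import Callable, Iterable, NamedTuple, Optional


class Metrics(NamedTuple):
    communication_time: float
    power_consumption: float
    cost: float


def satisfies(values: Metrics, policy: Metrics) -> bool:
    """Assumed rule: every calculated value is within its policy value."""
    return all(v <= limit for v, limit in zip(values, policy))


def find_medium(current: int, candidates: Iterable[int],
                evaluate: Callable[[int], Metrics],
                policy: Metrics) -> Optional[int]:
    """Return a storage medium number that satisfies the policy, or None."""
    # S3: check the medium that currently stores the data.
    if satisfies(evaluate(current), policy):
        return current
    # S4 to S7: change the medium, recalculate, and check again.
    for medium in candidates:
        if medium == current:
            continue
        if satisfies(evaluate(medium), policy):
            # S8 would transfer the data to this medium.
            return medium
    return None
```

Here `evaluate` stands for the calculation of S1 and S5 applied to a given storage medium, so the same function serves both the initial check and each retry.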
EXAMPLE
Hereinafter, a more specific example will be described. FIG. 11 illustrates the locations of the data centers in the present example. As illustrated in FIG. 11 , four data centers (South Kanto, North Kanto, Joshinetsu, Hokkaido) are located in the eastern Japan area.
FIG. 12 illustrates a configuration of the hierarchical storage management system of the present example. As illustrated in FIG. 12 , each data center is provided with the hierarchical storage 40. A configuration of a storage medium in the individual hierarchical storage 40 is as illustrated in FIG. 11 . In the present example, the hierarchical storage control apparatus 20 is provided in the South Kanto area. The hierarchical storage control apparatus 20 is connected to each data center via a public network.
FIG. 13 illustrates the data center information table 2210 of the present example. As illustrated in FIG. 13 , the data center information table 2210 stores the name, location, unit price of the electricity charge, and construction cost per rack of each data center. The construction cost per rack is a value obtained by dividing the total construction cost by the number of accommodated racks.
FIG. 14 illustrates the storage medium information table 2220 of the present example. As illustrated in FIG. 14 , the storage medium information table 2220 stores information about the storage accommodated in each data center.
FIG. 15(a) illustrates the transmission line information table 2230 of the present example. The transmission line number in the transmission line information table 2230 in FIG. 15(a) corresponds to a number assigned to the individual transmission line in FIG. 12 .
In the present example, taking South Kanto as the reference point, South Kanto and Joshinetsu are connected by a dedicated line, and South Kanto and Hokkaido are also connected by a dedicated line. South Kanto and North Kanto are connected via a public network instead of a dedicated line. Alternatively, the data centers may be directly connected to one another by a dedicated line or may be connected via a public network.
FIG. 15(b) illustrates the calculation interval table 2240 of the present example. In the present example, the hierarchical storage control apparatus 20 performs the calculation and rearranges the stored data at intervals of one month. Thus, the calculation interval stored in the calculation interval table 2240 is set to one month.
FIG. 15(c) illustrates the execution log table 2250 of the present example. As described above, the execution log table 2250 stores the calculation time in the past. In the present example, the calculation is performed on the first of every month.
FIG. 16 illustrates the operation policy table 2260 of the present example. The business operator determines a communication speed (communication time) needed for downloading the data, power consumption needed for storing the data, and a cost based on the construction cost, the acquisition price of the storage, the electricity charge, etc. and assigns an operation policy number to each operation policy in the operation policy table 2260.
For example, as an operation policy assuming data for which latency is not allowed, achieving low delay regardless of the power consumption or the cost is set as a condition of policy number “1”. In addition, for example, as an operation policy assuming a large amount of data having a low reference frequency, having the smallest cost is set as a condition of policy number “3”.
Hereinafter, an example of the detailed processing performed by the calculation unit 210 of the present example will be described.
First, it is assumed that data is uploaded as illustrated in the stored data management table 2270 in FIG. 17 . That is, the following example case will be considered: data having a data size of 300 GB is stored in the SSD “21” of the North Kanto data center as data number 1, data of 1 TB is stored in the HDD “11” of the South Kanto data center as data number 2, and data of 500 MB is stored in the magnetic tape “71” of the Hokkaido data center as data number 3.
The users who have uploaded the above data have assigned policy numbers 5, 4, and 1 to data numbers 1, 2, and 3, respectively. In addition, the number of reference counts and the most frequent reference source areas at the time when the first calculation is performed are as illustrated in FIG. 17 .
The calculation described with reference to the flowchart in FIG. 10 starts on the first day of the month following the data storage. In the present example, the calculation performed for the data of the data number “2” will be described as an example.
The calculation unit 210 calculates the annual power consumption needed for storing the data of the data number “2” by using the formula (1). FIG. 18 illustrates the content of the calculation.
With regard to the data of the data number “2” stored in the storage medium of the storage medium number 11, $T_{read}$ (the time of a single data reading, in hours per read) is (1 T[B]/200 M[B/s])/3600 based on FIG. 14 . $F_{read}$ (annual reference frequency) is 20 times, $P_{read}$ (power consumption during reading for the whole 60 T medium) is 10 [W], and the total capacity of the storage medium is 60 T, so the 1 T of data accounts for 1/60 of the medium. Therefore, $T_{read} \times F_{read} \times P_{read}$ (annual power consumption needed for reading) is ((1 T[B]/200 M[B/s])/3600)×20×10/60, as illustrated in FIG. 18 .
Further, based on FIG. 14 , since $P_{idle}$ (power consumption during standby) is 5 [W], the annual power consumption needed for standby is (8760−((1 T[B]/200 M[B/s])/3600)×20)×5/60, as illustrated in FIG. 18 . The sum of the above resultant values is 730, as illustrated in FIG. 18 , which is the annual power consumption.
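The arithmetic of this power calculation can be checked with a short Python sketch. It assumes decimal prefixes (1 T = 10^12, 1 M = 10^6) and that both the reading term and the standby term are scaled by the data's 1/60 share of the 60 T medium; the variable names are hypothetical. Under these assumptions the total comes to roughly 732, which is consistent with the value of 730 given in the text if that value is rounded.

```python
# Values quoted in the worked example; decimal prefixes are an assumption.
data_size_bytes = 1e12         # 1 T[B]
read_rate_bytes_per_s = 200e6  # 200 M[B/s]
f_read = 20                    # reads per year
p_read_w = 10.0                # W during reading, for the whole medium
p_idle_w = 5.0                 # W during standby, for the whole medium
share = 1 / 60                 # 1 T of data on a 60 T medium
HOURS_PER_YEAR = 8760

# Reading time of one read, converted from seconds to hours.
t_read_h = data_size_bytes / read_rate_bytes_per_s / 3600

reading = t_read_h * f_read * p_read_w * share
standby = (HOURS_PER_YEAR - t_read_h * f_read) * p_idle_w * share
pu_year = reading + standby

print(round(reading, 2), round(standby, 2), round(pu_year, 2))
```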
Further, the calculation unit 210 calculates the annual cost needed for storing the data of the data number “2” by using the formula (2). FIG. 19 illustrates the content of the calculation in detail.
Since $Charge_{power}$ (electricity charge) is 20 [yen/Wh], $PU_{year} \times Charge_{power} = 730 \times 20 = 14600$ [yen].
In the present example, since $Charge_{footprint}$ (space cost) is 2,000,000/2 and $Density_{storage}$ (storage medium recording density) is 60 T, $(Charge_{footprint} \times Size_{data}) \div Density_{storage} = 2{,}000{,}000/2 \times 1/60 = 16667$ [yen].
Further, since $Charge_{media}$ (unit price of the storage medium) = 5,000,000, $Capacity_{media}$ (capacity of the storage medium) = 60, and $Lifetime_{media}$ (lifetime of the storage medium) = 4, $(Charge_{media} \times Size_{data}) \div (Capacity_{media} \times Lifetime_{media}) = 5{,}000{,}000 \times 1/(60 \times 4) = 20833$ [yen].
Therefore, by summing up these resultant values, $C_{year} = 52100$ [yen] is obtained.
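The three cost terms and their sum can be reproduced with the following Python sketch, which simply plugs in the values quoted above. The variable names are hypothetical, and the data size is expressed in the same unit (T) as the medium capacity.

```python
# Values quoted in the worked example for data number 2.
pu_year = 730                     # annual power consumption from the text
charge_power = 20                 # yen per Wh
charge_footprint = 2_000_000 / 2  # space cost
size_data = 1                     # data size in T
density_storage = 60              # recording density in T
charge_media = 5_000_000          # unit price of the storage medium
capacity_media = 60               # capacity in T
lifetime_media = 4                # lifetime in years

power_cost = pu_year * charge_power
space_cost = charge_footprint * size_data / density_storage
media_cost = charge_media * size_data / (capacity_media * lifetime_media)
c_year = power_cost + space_cost + media_cost

print(power_cost, round(space_cost), round(media_cost), round(c_year))
```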
Further, the calculation unit 210 calculates the communication speed (communication time) from the data center storing the data to the most frequent reference source area by using the formula (3). FIG. 20 illustrates the details.
In the present example, since $T_{read}$ (reading time) = 1 TB ÷ 200 MB/s = 5000 s and $T_{1}$ (communication time of the NW) = 0, $T_{DL}$ = 5000 s.
The calculation unit 210 stores the annual power consumption, the annual cost, and the communication speed from the corresponding data center to the most frequent reference source area, which have been calculated as described above, in the stored data management table 2270. FIG. 21 illustrates the portion of the data number 2 in the stored data management table 2270 after the above calculation results have been stored.
The calculation unit 210 acquires the communication speed, the power consumption, and the cost corresponding to the policy number “4” set for the data of the data number 2 from the operation policy table 2260. The calculation unit 210 compares these acquired values with the communication speed, the power consumption, and the cost calculated above.
Since the communication speed, the power consumption, and the cost all exceed the threshold values of the policy number “4”, the calculation unit 210 determines that the storage medium HDD “11” currently storing the data does not satisfy the operation policy. Thus, the calculation unit 210 changes the storage medium and performs the calculation, and the calculation is continued until the values are within the range of the policy number “4”.
Effects of the Embodiment
According to the technique in the present embodiment described above, when certain data is stored in any one of the plurality of data centers, the data center and the type of storage medium can be automatically selected for the data to be stored in accordance with the communication speed between the data centers and the power consumption and the cost needed for storing the data set in advance by the business operator. As a result, the power consumption and the cost can be reduced, and this leads to reductions of the electricity charge and environmental load as well as an improvement in QoS.
SUMMARY OF THE EMBODIMENT
The present description discloses at least the hierarchical storage management system, the hierarchical storage control apparatus, the hierarchical storage management method, and the program in the following items.
(Item 1)
A hierarchical storage management system, including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage, wherein the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
(Item 2)
The hierarchical storage management system according to item 1, wherein, when the calculation unit determines that the calculated power consumption, cost, and communication time do not satisfy the operation policy set for the data, the calculation unit changes a storage medium storing the data and performs the processing on an assumption that the data is stored in a changed storage medium.
(Item 3)
The hierarchical storage management system according to item 1 or 2, wherein the calculation unit calculates power consumption needed for storing the data by calculating a sum of power consumption for data reading from the storage medium storing the data and power consumption during standby,
calculates a cost needed for storing the data by calculating a sum of a cost of power consumption of the storage medium, a cost of installing the storage medium, and a cost of acquiring the storage medium, and
calculates communication time for transferring the data from a data center to a reference source area based on a reading speed of the storage medium and a communication speed of a transmission line between the data center in which the storage medium is installed and the reference source area.
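As a non-limiting illustration, the three calculations of item 3 may be sketched as follows. The field names, the units (watts, MB/s, kWh, seconds), the one-year accounting period, the electricity price parameter, and the choice to add the read time and the line transfer time are assumptions made for this sketch; item 3 states only which quantities are summed or used.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365  # assumed accounting period of one year


@dataclass(frozen=True)
class Medium:
    # All fields and units are hypothetical.
    read_power_w: float     # power drawn while reading
    standby_power_w: float  # power drawn during standby
    read_speed_mb_s: float  # reading speed of the storage medium
    install_cost: float     # cost of installing the storage medium
    acquire_cost: float     # cost of acquiring the storage medium


def annual_power_kwh(m: Medium, size_mb: float, reads_per_year: float) -> float:
    """Sum of power consumption for data reading and power consumption during standby."""
    read_hours = (size_mb / m.read_speed_mb_s) * reads_per_year / 3600
    # Assumption: standby power is charged for the whole year.
    return (m.read_power_w * read_hours
            + m.standby_power_w * HOURS_PER_YEAR) / 1000


def annual_cost(m: Medium, power_kwh: float, price_per_kwh: float) -> float:
    """Sum of the cost of power consumption, installation, and acquisition."""
    return power_kwh * price_per_kwh + m.install_cost + m.acquire_cost


def comm_time_s(m: Medium, size_mb: float, line_speed_mb_s: float) -> float:
    """Time to read the data and send it over the transmission line.
    Assumption: the two stages run one after the other, so the times add."""
    return size_mb / m.read_speed_mb_s + size_mb / line_speed_mb_s
```

If reading and transmission overlap in practice, the slower of the two speeds would bound the transfer instead, so the additive form above should be read as one possible model rather than the calculation used by the calculation unit.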
(Item 4)
A hierarchical storage management method used in a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage,
wherein the hierarchical storage control apparatus performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
(Item 5)
A hierarchical storage control apparatus used in a hierarchical storage management system including: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage,
wherein the hierarchical storage control apparatus includes a calculation unit that performs processing for obtaining, for individual data managed by the hierarchical storage control apparatus, a storage medium in a data center that satisfies an operation policy by calculating power consumption needed for storing the data, a cost needed for storing the data, and communication time for transferring the data from a data center to a reference source area and by comparing the calculated power consumption, cost, and communication time with the operation policy set for the data.
(Item 6)
A program that causes a computer to function as the calculation unit in the hierarchical storage control apparatus according to Item 5.
While the present embodiment has thus been described, the present invention is not limited to the above specific embodiment, and various variations and modifications may be made without departing from the scope of the present invention.
REFERENCE SIGNS LIST
    • 10 Network
    • 20 Hierarchical storage control apparatus
    • 30 Data center
    • 40 Hierarchical storage
    • 50 User
    • 210 Calculation unit
    • 220 Storage unit
    • 230 Timer
    • 420 Storage medium
    • 410 Storage unit
    • 430 Management unit
    • 2210 Data center information table
    • 2220 Storage medium information table
    • 2230 Transmission line information table
    • 2240 Calculation interval table
    • 2250 Execution log table
    • 2260 Operation policy table
    • 2270 Stored data management table
    • 1000 Drive device
    • 1001 Recording medium
    • 1002 Auxiliary storage device
    • 1003 Memory device
    • 1004 CPU
    • 1005 Interface device
    • 1006 Display device
    • 1007 Input device

Claims (9)

The invention claimed is:
1. A hierarchical storage management system comprising: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage for a plurality of data centers, wherein the hierarchical storage control apparatus is configured to:
for a plurality of stored data in each of the plurality of data centers:
obtain characteristics of the stored data;
determine a power consumption required for the stored data in a respective data center by calculating a sum of the power consumption by reading from the storage medium the stored data, power consumption during standby, annual data storage power consumption, a reading time for reading the stored data, and a frequency amount with which the stored data is annually accessed;
determine a cost required for the stored data in the respective data center;
determine a communication speed for transferring the stored data from the respective data center to a frequent device for downloading the stored data;
determine, for the stored data, whether (i) the power consumption, (ii) the cost required, and (iii) the communication speed, satisfy a policy from the obtained characteristics of the stored data, the policy describing criteria for data to be stored in the respective data center; and
in response to determining at least one of (i) the power consumption, (ii) the cost required, and (iii) the communication speed, does not satisfy the policy:
identify another data center of the plurality of data centers for storing the stored data from the respective data center;
determine, for the stored data, whether (i) the power consumption, (ii) the cost required, and (iii) the communication speed, satisfy another policy at the other data center, the other policy describing criteria for data to be stored in the other data center; and
in response to determining (i) the power consumption, (ii) the cost required, and (iii) the communication speed do satisfy the policy at the other data center, transmit the stored data from the respective data center to the other data center for storage.
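As a non-limiting illustration, the decision flow recited in claim 1 may be sketched as follows. The names `Metrics`, `Policy`, and `choose_data_center`, the data center labels, the form of the policy (upper bounds on power consumption and cost, a lower bound on communication speed), and the order in which other data centers are examined are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Metrics:
    power: float  # power consumption required for the stored data
    cost: float   # cost required for the stored data
    speed: float  # communication speed toward the downloading device


@dataclass(frozen=True)
class Policy:
    # Hypothetical criteria for data to be stored in one data center.
    max_power: float
    max_cost: float
    min_speed: float

    def accepts(self, m: Metrics) -> bool:
        return (m.power <= self.max_power
                and m.cost <= self.max_cost
                and m.speed >= self.min_speed)


def choose_data_center(current: str,
                       metrics: Dict[str, Metrics],
                       policies: Dict[str, Policy]) -> Optional[str]:
    """Return the data center that should hold the stored data.

    metrics[dc] holds the values calculated on the assumption that the data
    is stored in dc; policies[dc] holds the policy applied at dc."""
    if policies[current].accepts(metrics[current]):
        return current  # policy satisfied, no transfer needed
    for dc, m in metrics.items():
        if dc != current and policies[dc].accepts(m):
            return dc   # the stored data would be transmitted to this data center
    return None         # no other data center satisfies its policy
```

A result that differs from `current` corresponds to the transmitting step of the claim; the actual transfer of the stored data is outside the scope of this sketch.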
2. The hierarchical storage management system according to claim 1, wherein the hierarchical storage control apparatus is configured to:
determine the cost required for the stored data in the respective data center by calculating a sum of a cost of power consumption of the storage medium, a cost of installing the storage medium, and a cost of acquiring the storage medium; and
determine the communication speed for transferring the stored data from the respective data center to a frequent device based on a reading speed of the storage medium of the respective data center and a communication speed of a transmission line between the respective data center in which the storage medium is installed and the frequent device.
3. The hierarchical storage management system according to claim 1, wherein the hierarchical storage control apparatus is configured to utilize an elapsed time of a timer to initiate processing for the plurality of stored data in each of the plurality of data centers.
4. The hierarchical storage management system according to claim 3, wherein the timer elapses on a monthly basis.
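As a non-limiting illustration, the timer-based trigger of claims 3 and 4 may be sketched as follows. The function name `is_due` and the approximation of a monthly interval as 30 days are assumptions made for this sketch; the claims state only that an elapsed time of a timer initiates the processing and that the timer elapses on a monthly basis.

```python
from datetime import datetime, timedelta

# Assumption: "monthly" is approximated here as a fixed 30-day interval.
MONTHLY = timedelta(days=30)


def is_due(last_run: datetime, now: datetime,
           interval: timedelta = MONTHLY) -> bool:
    """Return True when the interval has elapsed since the last execution,
    meaning processing for the stored data should be initiated."""
    return now - last_run >= interval
```

A caller would record the time of each execution and call `is_due` with that value before starting the next round of calculations.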
5. The hierarchical storage management system according to claim 1, wherein the hierarchical storage control apparatus is configured to:
in response to determining at least one of (i) the power consumption, (ii) the cost required, and (iii) the communication speed, does not satisfy the policy:
identify another storage medium at the respective data center for storing the stored data;
determine whether the power consumption, the cost required, and the communication speed for the stored data satisfy the policy at the other storage medium in the respective data center; and
in response to determining (i) the power consumption, (ii) the cost required, and (iii) the communication speed do satisfy the policy at the other storage medium in the respective data center, transmit the stored data from a first storage medium of the respective data center to the other storage medium of the respective data center.
6. A hierarchical storage management method used in a hierarchical storage management system comprising: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage for a plurality of data centers, the method comprising:
for a plurality of stored data in each of the plurality of data centers:
obtaining characteristics of the stored data;
determining a power consumption required for the stored data in a respective data center by calculating a sum of the power consumption by reading from the storage medium the stored data, power consumption during standby, annual data storage power consumption, a reading time for reading the stored data, and a frequency amount with which the stored data is annually accessed;
determining a cost required for the stored data in the respective data center;
determining a communication speed for transferring the stored data from the respective data center to a frequent device for downloading the stored data;
determining, for the stored data, whether (i) the power consumption, (ii) the cost required, and (iii) the communication speed, satisfy a policy from the obtained characteristics of the stored data, the policy describing criteria for data to be stored in the respective data center; and
in response to determining at least one of (i) the power consumption, (ii) the cost required, and (iii) the communication speed, does not satisfy the policy:
identifying another data center of the plurality of data centers for storing the stored data from the respective data center;
determining, for the stored data, whether (i) the power consumption, (ii) the cost required, and (iii) the communication speed, satisfy another policy at the other data center, the other policy describing criteria for data to be stored in the other data center; and
in response to determining (i) the power consumption, (ii) the cost required, and (iii) the communication speed do satisfy the other policy at the other data center, transmitting the stored data from the respective data center to the other data center for storage.
7. The hierarchical storage management method according to claim 6, comprising:
determining the cost required for the stored data in the respective data center by calculating a sum of a cost of power consumption of the storage medium, a cost of installing the storage medium, and a cost of acquiring the storage medium; and
determining the communication speed for transferring the stored data from the respective data center to a frequent device based on a reading speed of the storage medium of the respective data center and a communication speed of a transmission line between the respective data center in which the storage medium is installed and the frequent device.
8. A hierarchical storage control apparatus used in a hierarchical storage management system comprising: a hierarchical storage that is provided in an individual data center and has at least one storage medium; and a hierarchical storage control apparatus that manages at least one hierarchical storage for a plurality of data centers, wherein the hierarchical storage control apparatus is configured to:
for a plurality of stored data in each of the plurality of data centers:
obtain characteristics of the stored data;
determine a power consumption required for the stored data in a respective data center by calculating a sum of the power consumption by reading from the storage medium the stored data, power consumption during standby, annual data storage power consumption, a reading time for reading the stored data, and a frequency amount with which the stored data is annually accessed;
determine a cost required for the stored data in the respective data center;
determine a communication speed for transferring the stored data from the respective data center to a frequent device for downloading the stored data;
determine, for the stored data, whether (i) the power consumption, (ii) the cost required, and (iii) the communication speed, satisfy a policy from the obtained characteristics of the stored data, the policy describing criteria for data to be stored in the respective data center; and
in response to determining at least one of (i) the power consumption, (ii) the cost required, and (iii) the communication speed, does not satisfy the policy:
identify another data center of the plurality of data centers for storing the stored data from the respective data center;
determine, for the stored data, whether (i) the power consumption, (ii) the cost required, and (iii) the communication speed, satisfy another policy at the other data center, the other policy describing criteria for data to be stored in the other data center; and
in response to determining (i) the power consumption, (ii) the cost required, and (iii) the communication speed do satisfy the other policy at the other data center, transmit the stored data from the respective data center to the other data center for storage.
9. The hierarchical storage control apparatus according to claim 8, wherein the hierarchical storage control apparatus is configured to:
determine the cost required for the stored data in the respective data center by calculating a sum of a cost of power consumption of the storage medium, a cost of installing the storage medium, and a cost of acquiring the storage medium; and
determine the communication speed for transferring the stored data from the respective data center to a frequent device based on a reading speed of the storage medium of the respective data center and a communication speed of a transmission line between the respective data center in which the storage medium is installed and the frequent device.

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/030082 WO2021019746A1 (en) 2019-07-31 2019-07-31 Hierarchical storage management system, hierarchical storage control device, hierarchical storage management method, and program

Publications (2)

Publication Number Publication Date
US20220222220A1 US20220222220A1 (en) 2022-07-14
US11954076B2 true US11954076B2 (en) 2024-04-09

Family

ID=74229478

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/629,462 Active US11954076B2 (en) 2019-07-31 2019-07-31 Hierarchical storage management system, hierarchical storage control apparatus, hierarchical storage management method and program

Country Status (3)

Country Link
US (1) US11954076B2 (en)
JP (1) JP7176639B2 (en)
WO (1) WO2021019746A1 (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012093992A (en) * 2010-10-27 2012-05-17 Ejworks Corp Data center controlling system, data center controlling apparatus and program
JP2013016111A (en) * 2011-07-06 2013-01-24 Panasonic Corp Data center system, operation evaluation device, and program of operation evaluation device

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6330572B1 (en) * 1998-07-15 2001-12-11 Imation Corp. Hierarchical data storage management
US20020008250A1 (en) * 2000-02-02 2002-01-24 Esin Terzioglu Memory module with hierarchical functionality
US20050033757A1 (en) * 2001-08-31 2005-02-10 Arkivio, Inc. Techniques for performing policy automated operations
US20050055519A1 (en) * 2003-09-08 2005-03-10 Stuart Alan L. Method, system, and program for implementing retention policies to archive records
US20050246386A1 (en) * 2004-02-20 2005-11-03 George Sullivan Hierarchical storage management
US20060069886A1 (en) * 2004-09-28 2006-03-30 Akhil Tulyani Managing disk storage media
US20070136397A1 (en) * 2005-12-09 2007-06-14 Interdigital Technology Corporation Information life-cycle management architecture for a device with infinite storage capacity
US20070179990A1 (en) * 2006-01-31 2007-08-02 Eyal Zimran Primary stub file retention and secondary retention coordination in a hierarchical storage system
US20070250838A1 (en) * 2006-04-24 2007-10-25 Belady Christian L Computer workload redistribution
US20090144393A1 (en) * 2007-11-29 2009-06-04 Yutaka Kudo Method and apparatus for locating candidate data centers for application migration
US20100325273A1 (en) * 2007-11-29 2010-12-23 Hitachi, Ltd. Method and apparatus for locating candidate data centers for application migration
US20140298349A1 (en) * 2008-04-21 2014-10-02 Adaptive Computing Enterprises, Inc. System and Method for Managing Energy Consumption in a Compute Environment
US20110040937A1 (en) * 2009-08-11 2011-02-17 International Business Machines Corporation Hierarchical storage management for database systems
US20200026784A1 (en) * 2018-07-18 2020-01-23 International Business Machines Corporation Preventing inefficient recalls in a hierarchical storage management (hsm) system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
[No Author Listed] [online], "Part 1 Mechanism of CDN (What kind of technology can CDN do?)," Cash shop blog CDN / WEB high-speed blog, May 18, 2015, retrieved from URL <https://blog.redbox.ne.jp/what-is-cdn.html>, 29 pages (with English Translation).
Katsurashima, "Systematic understanding of storage virtualization (4): Understanding automatic storage tiering (1/3)," ITmedia Inc., Jun. 27, 2011, retrieved from URL <https://www.atmarkit.co.jp/ait/articles/1106/27/news109.html>, 7 pages (with English Translation).
Miki et al., "Basic knowledge of storage in the "offensive IT" era, 1st What is Software Defined Storage?" ITmedia Inc., Sep. 29, 2014, retrieved from URL <https://atmarkit.itmedia.co.jp/ait/articles/1409/29/news130.html>, 9 pages (with English Translation).

Also Published As

Publication number Publication date
JPWO2021019746A1 (en) 2021-02-04
JP7176639B2 (en) 2022-11-22
US20220222220A1 (en) 2022-07-14
WO2021019746A1 (en) 2021-02-04

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:IINO, TOMONORI;SAKURAI, ATSUSHI;TANAKA, YURIKO;SIGNING DATES FROM 20211014 TO 20211020;REEL/FRAME:060276/0109

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STCF Information on status: patent grant

Free format text: PATENTED CASE