WO2018038740A1 - Method and apparatus to control data copy based on correlations between number of copied data and application output - Google Patents


Info

Publication number
WO2018038740A1
Authority
WO
WIPO (PCT)
Prior art keywords
entries
copy pair
output
storage area
data
Prior art date
Application number
PCT/US2016/049031
Other languages
French (fr)
Inventor
Yasutaka Kono
Original Assignee
Hitachi, Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd. filed Critical Hitachi, Ltd.
Priority to PCT/US2016/049031 priority Critical patent/WO2018038740A1/en
Publication of WO2018038740A1 publication Critical patent/WO2018038740A1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0614 Improving the reliability of storage systems
    • G06F 3/0617 Improving the reliability of storage systems in relation to availability
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/16 Error detection or correction of the data by redundancy in hardware
    • G06F 11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F 11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F 11/2094 Redundant storage or storage space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F 3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/0671 In-line storage system
    • G06F 3/0683 Plurality of storage devices
    • G06F 3/0689 Disk arrays, e.g. RAID, JBOD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F 11/14 Error detection or correction of the data by redundancy in operation
    • G06F 11/1402 Saving, restoring, recovering or retrying
    • G06F 11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery

Definitions

  • the present disclosure relates generally to storage system management and, more specifically, to systems and methods of managing data copies.
  • aspects of the present disclosure include a management computer connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected.
  • the management computer can include a processor, configured to manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and apply the copy pair condition to the copy pair.
  • aspects of the present disclosure further include a computer program having instructions for executing a process for a management computer connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected.
  • The instructions can include managing a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; creating a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and applying the copy pair condition to the copy pair.
  • the instructions can be stored on a non-transitory computer readable medium.
  • aspects of the present disclosure further include a system, which can involve a plurality of servers, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected; a plurality of storage areas; and a management computer connected to a plurality of servers and a plurality of storage areas.
  • the management computer can include a processor, configured to manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and apply the copy pair condition to the copy pair.
  • aspects of the present disclosure further include a system which can involve means connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected.
  • the system can further include means for managing a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; means for creating a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and means for applying the copy pair condition to the copy pair.
  • FIG. 1(a) illustrates an example of a logical configuration of the system in which the method and apparatus of the example implementations described herein may be applied.
  • FIG. 1(b) illustrates a logical configuration of the IT infrastructure of FIG. 1(a), in accordance with an example implementation.
  • FIG. 1(c) illustrates a physical configuration of the IT environment, in accordance with an example implementation.
  • FIG. 2 illustrates the configurations of management computer, in accordance with an example implementation.
  • FIG. 3 illustrates an example of Extract, Transform and Load (ETL) Rule Table, in accordance with an example implementation.
  • FIG. 4 illustrates Database Query Logs Table, in accordance with an example implementation.
  • FIG. 5 illustrates the physical storage table, in accordance with an example implementation.
  • FIG. 6 illustrates a storage volume table, in accordance with an example implementation.
  • FIG. 7 illustrates a physical server table, in accordance with an example implementation.
  • FIG. 8 illustrates a virtual server table, in accordance with an example implementation.
  • FIG. 9 illustrates a mapping table, in accordance with an example implementation.
  • FIG. 10 illustrates a copy pair table, in accordance with an example implementation.
  • FIG. 11 illustrates a copy condition table, in accordance with an example implementation.
  • FIG. 12 illustrates a graphical user interface (GUI) of self-service portal, in accordance with an example implementation.
  • FIG. 13 illustrates a flowchart of management program, for deploying an application, in accordance with an example implementation.
  • FIG. 14 illustrates a flowchart of management program for creating a copy condition, in accordance with an example implementation.
  • FIG. 15 illustrates an example configuration created by management program to deploy an application, in accordance with an example implementation.
  • FIG. 16 illustrates a flowchart of management program for resyncing copy pair to copy data which is created, modified or deleted in a data source to a target data store, in accordance with an example implementation.
  • FIG. 17 shows an example of multiple cycles of FIG. 14, in accordance with an example implementation.
  • FIG. 18 illustrates another example of multiple cycles of FIG. 14, in accordance with an example implementation.
  • the example implementations described herein are directed to a management program configured to copy data, execute Extract, Transform and Load (ETL) and deploy an application while considering correlations between the number of input data to the application and occurrence of changes in its output.
  • the management program also sets priorities of data copy based on the correlations.
  • the management program does not copy data if it determines that the output of the application would not be changed even if the data is copied.
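This gating check can be sketched in a few lines (an illustrative sketch only; the function and parameter names are not from the disclosure):

```python
def should_resync(pending_entries: int, entries_to_change_output: int) -> bool:
    """Copy (resync) only when enough new entries have accumulated that
    the application's output could plausibly change."""
    return pending_entries >= entries_to_change_output

# Suppose the output was observed to change only after 1000 new entries:
should_resync(400, 1000)   # too few entries; the copy is skipped
should_resync(1200, 1000)  # enough entries; the copy is worthwhile
```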
  • FIG. 1(a) illustrates an example of a logical configuration of the system in which the method and apparatus of the invention may be applied.
  • IT environment 1000 can include management program 1200, Self-Service Portal 1100, IT infrastructure management UI 1400 and IT infrastructure 1500.
  • Application developer 1010 develops applications and deploys them onto the IT environment 1000 via Self-Service Portal 1100.
  • IT infrastructure administrator 1030 manages the IT environment 1000 via IT infrastructure management UI 1400.
  • FIG. 1(b) illustrates a logical configuration of the IT infrastructure 1500 of FIG. 1(a), in accordance with an example implementation.
  • IT infrastructure 1500 involves one or more servers and/or storage arrays.
  • Applications 1523 and 1533 are running on Virtual Machines (VMs) 1522 and 1532, which are running on Hypervisors 1521 and 1531, respectively.
  • Application 1523 uses Storage Volume 1511 of Storage Array 1510, and Application 1533 uses Storage Volumes 1512 and 1513 of Storage Array 1510.
  • Storage Control Program 1514 runs on Storage Array 1510 and controls I/O from applications to storage volumes.
  • Applications 1544 and 1545 are running on VMs 1542 and 1543, which are running on Hypervisor 1541.
  • Application 1544 uses Storage Volumes 1551 and 1552 of Storage VM 1555, and Application 1545 uses Storage Volume 1553 of Storage VM 1556.
  • Storage Control Programs 1557 and 1558 run on Storage VMs 1555 and 1556, respectively, and these programs control I/O from applications to storage volumes.
  • FIG. 1(c) illustrates a physical configuration of the IT environment, in accordance with an example implementation.
  • IT environment 1000 involves management computer 2000, servers 3000, storage arrays 4000, management network 5000 and data network 6000.
  • Servers 3000 and storage arrays 4000 are connected via data network 6000.
  • This network can be a LAN (Local Area Network), but is not limited thereto, and other networks may be utilized according to the desired implementation.
  • Management computer 2000, servers 3000 and storage arrays 4000 are connected via management network 5000.
  • Management network 5000 may also be LAN, but is not limited thereto.
  • Although management network 5000 and data network 6000 are separated in this example, they can be a single converged network in accordance with a desired implementation.
  • management computer 2000 and servers 3000 are separated, but other implementations may also be utilized according to the desired implementation.
  • any server can host a management program.
  • servers 3000 and storage arrays 4000 are separated; however, other implementations may also be utilized according to the desired implementation.
  • servers and storages arrays can be converged into one system.
  • the storage arrays 4000 can provide a plurality of storage areas.
  • one or more servers from the plurality of servers 3000 can be configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected, as described in more detail below.
  • FIG. 2 illustrates the configurations of management computer 2000, in accordance with an example implementation.
  • management computer 2000 may also be in the form of a management server.
  • Management Network Interface 2100 is an interface to connect the management computer 2000 to the management network 5000.
  • I/O Input and output
  • Local Disk 2400 contains management program 2410, ETL (Extract, Transform and Load) rule table 2420 and database query logs table 2430.
  • Management program 2410 is loaded to Memory 2500 and executed by processor 2200.
  • Processor 2200 can be in the form of a physical processor or central processing unit (CPU) that is configured to execute instructions from memory 2500, or as directly input into the processor 2200.
  • the procedure of the management program 2410 is disclosed below.
  • Management program 2410 is the same entity as management program 1200 in FIG. 1(a).
  • ETL rule table 2420 and database query logs table 2430 are loaded to Memory 2500 and used by the management program 2410, of which further description is provided below.
  • Memory 2500 contains storage array table 2510, storage volume table 2520, physical server table 2530, virtual server table 2540, mapping table 2550, copy pair table 2560 and application characteristics table 2570, of which further description is provided below.
  • the processor 2200 can be configured to load the management program 2410 from memory 2500 and thereby configured to manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output, as described in FIG. 14; and apply the copy pair condition to the copy pair.
  • Copy pair conditions are described with respect to FIG. 11.
  • the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application until a first occurrence of the change in output is detected; wherein the copy pair condition involves conducting a copy when a number of new entries to be applied to the target storage area meets or exceeds the number of entries applied to the data until the first occurrence of the change in output is detected, as described in FIG. 14.
  • the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of an average number of entries applied to the data over occurrences of changes in output, wherein the copy pair condition involves conducting a copy when a number of new entries to be applied to the target storage area meets or exceeds the average number of entries, as described in FIG. 14.
  • the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of a minimum number of entries applied to the data for the change in the output to occur, wherein the copy pair condition involves conducting a copy when a number of new entries to be applied to the target storage area meets or exceeds the minimum number of entries, as described in FIG. 14.
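The three threshold variants above (entries until the first change, the average per change, and the minimum per change) could be derived from observed history roughly as follows; the observation values are hypothetical:

```python
def copy_thresholds(entries_per_change: list[int]) -> dict:
    """Given the number of entries that were applied before each observed
    change in the application's output, derive the three candidate
    copy pair thresholds described in the text."""
    return {
        "first": entries_per_change[0],    # entries until the first change
        "average": sum(entries_per_change) / len(entries_per_change),
        "minimum": min(entries_per_change),  # lowest threshold: copies soonest
    }

# Hypothetical run: output changed after 1200, then 800, then 1000 entries.
t = copy_thresholds([1200, 800, 1000])
```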
  • the data associated with the target storage area is stored in the target storage area.
  • the data associated with the target storage area can be stored in the target storage area 155 A in FIG. 15.
  • the data can also be stored in another storage area and processed through an Extract, Transform and Load (ETL) process, as described in FIG. 15 with respect to storage area 1552A.
  • the processor 2200 can be configured to load the management program 2410 from memory 2500 and thereby configured to manage a plurality of copy pairs and a plurality of copy pair conditions, each of the plurality of copy pair conditions associated with a corresponding one of the plurality of copy pairs, each of the plurality of copy pair conditions associated with a priority based on an associated number of entries to cause a change to an associated output, such that ones of the plurality of copy pair conditions having a fewer associated number of entries are given higher priority, as described in FIG. 11.
  • Processor 2200 can also be configured to execute a resynchronization process for each of the plurality of copy pairs based on the associated priority, as described in FIG. 16.
  • FIG. 3 illustrates an example of ETL Rule Table 2420, in accordance with an example implementation.
  • Column 2421 shows identifications of ETL rules.
  • Column 2422 shows queries to extract data.
  • Column 2423 shows rules to transform data.
  • Column 2424 shows methods to load data.
  • Each row shows an ETL rule.
  • row 242A shows a rule which extracts data via a Structured Query Language (SQL) query, transforms the extracted data into structured data named "sales_data" and loads the transformed data to a data store named "datastore-1" through the Hypertext Transfer Protocol (HTTP) POST method.
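A row of this table might be represented as a small record; the SQL text below is a hypothetical placeholder, since the disclosure does not give the actual query:

```python
from dataclasses import dataclass

@dataclass
class EtlRule:
    """One row of ETL Rule Table 2420 (FIG. 3)."""
    rule_id: str         # column 2421: identification of the ETL rule
    extract_query: str   # column 2422: query used to extract data
    transform_rule: str  # column 2423: rule to transform data
    load_method: str     # column 2424: method used to load data

# Row 242A, approximately as described in the text:
rule_242a = EtlRule(
    rule_id="242A",
    extract_query="SELECT * FROM sales",   # hypothetical SQL query
    transform_rule="sales_data",           # name of the structured data
    load_method="HTTP POST to datastore-1",
)
```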
  • FIG. 4 illustrates Database Query Logs Table 2430, in accordance with an example implementation.
  • Column 2431 shows the application identifiers (IDs).
  • Column 2432 shows logs of database query issued by the applications.
  • Each row shows examples of logs issued by an application.
  • row 243A shows that an application having the value 01 as its ID issued some INSERT queries and DELETE queries.
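Counting the data-modifying queries in such a log (column 2432) is one plausible way to estimate how many entries have been applied to a data source; a minimal sketch, with hypothetical log lines:

```python
def count_mutations(log_lines: list[str]) -> int:
    """Count INSERT/UPDATE/DELETE queries in an application's query log,
    as a proxy for the number of entries applied to the data."""
    mutating = ("insert", "update", "delete")
    return sum(1 for line in log_lines
               if line.strip().lower().startswith(mutating))

logs_app_01 = [
    "INSERT INTO sales VALUES (101, 'alice')",
    "DELETE FROM sales WHERE id = 3",
    "SELECT * FROM sales",            # reads do not add entries
]
count_mutations(logs_app_01)  # counts the INSERT and the DELETE
```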
  • FIG. 5 illustrates the physical storage table 2510, in accordance with an example implementation.
  • This table is created in the memory 2500 by management program 2410.
  • Column 2511 shows identifications of storage arrays.
  • Column 2512 shows types of storage arrays. The type can be either "Storage Array", which means a hardware-based storage array, or "Storage VM", which means a virtual machine based storage array, in this example implementation.
  • Column 2513 shows processors of the storage arrays.
  • Column 2514 shows ports of the storage arrays.
  • Column 2515 shows cache resources of the storage arrays.
  • Column 2516 shows pools of resources (typically capacities) of the storage arrays.
  • Each row shows configurations of a storage array. For example, row 251A shows configurations of storage array 01.
  • the storage array is hardware-based and has two processors with 32 cores each, 8Gbps of ports A, B, C and D, 160GB of cache C-01 and 128GB of cache C-02, 300TB of pools Pool-01 and Pool-02, and 500TB of pools Pool-03 and Pool-04, but is not limited thereto, and other configurations can also be utilized in accordance with the desired implementation.
  • Rows 251B and 251C show configurations of storage VMs 02 and 03.
  • FIG. 6 illustrates a storage volume table 2520, in accordance with an example implementation.
  • This table is created in the memory 2500 by management program 2410.
  • Column 2521 shows identifiers of storage arrays owning storage volumes.
  • Column 2522 shows identifiers of storage volumes.
  • Column 2523 shows the capacities of each storage volume.
  • Column 2524 shows identifiers of pools from which storage volumes are carved.
  • Each row 252A, 252B, 252C, 252D, 252E, 252F, 252G shows the configuration of each storage volume.
  • row 252A shows a configuration of storage volume 01 of storage array 01. This storage volume has 10TB of capacity, carved from Pool-01.
  • FIG. 7 illustrates a physical server table 2530, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410.
  • Column 2531 shows identifiers of physical servers.
  • Column 2532 shows numbers of cores and types of central processing unit (CPU) of each physical server.
  • Column 2533 shows capacities of memory resources of each physical server.
  • Column 2534 shows ports of each physical server.
  • Each row 253A, 253B, 253C, 253D shows the configuration of each physical server.
  • row 253A shows a configuration of physical server 01.
  • the physical server has 12 cores of Normal CPU, 32GB of memory, and 4Gbps of ports A and B.
  • FIG. 8 illustrates a virtual server table 2540, in accordance with an example implementation.
  • This table is created in the memory 2500 by management program 2410.
  • Column 2541 shows identifications of the virtual servers.
  • Column 2542 shows identifiers of the physical servers on which the virtual servers are running.
  • Column 2543 shows numbers of CPU cores assigned to each virtual server.
  • Column 2544 shows capacities of memory resources assigned to each virtual server.
  • Column 2545 shows ports assigned to each virtual server.
  • Each row 254A, 254B, 254C, 254D shows the configuration of each virtual server.
  • row 254A shows a configuration of virtual server 01. This virtual server is hosted on physical server 01, has 2 CPU cores, 4GB of memory and 4Gbps of port A.
  • FIG. 9 illustrates a mapping table 2550, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410.
  • Column 2551 shows identifiers of applications.
  • Column 2552 shows names of the applications.
  • Column 2553 shows identifiers of virtual servers on which the applications are running.
  • Column 2554 shows identifiers of ports of the virtual servers.
  • Column 2555 shows identifiers of storage arrays or storage VMs.
  • Column 2556 shows identifiers of ports of the storage arrays or storage VMs.
  • Column 2557 shows identifiers of storage volumes.
  • Each row 255A, 255B, 255C, 255D shows an end-to-end mapping between an application and storage volumes.
  • row 255A shows that application 01 whose name is "Database-A" is running on virtual server 01 and storage volume 01 of storage array 01 is allocated to this application via storage port A and virtual server port A.
  • FIG. 10 illustrates a copy pair table 2560, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410.
  • Column 2561 shows identifiers of copy pairs.
  • Column 2562 shows identifiers of source storage arrays or storage VMs.
  • Column 2563 shows identifiers of source storage volumes.
  • Column 2564 shows identifiers of target storage arrays or storage VMs.
  • Column 2565 shows identifiers of target storage volumes.
  • Column 2566 shows statuses of copy pairs.
  • Each row 256A, 256B shows a copy pair between a source storage volume and a target storage volume.
  • row 256A shows a copy pair between storage volume 01 of storage array 01 and storage volume 01 of storage array 02. Its status is "Paired", which means that data contained in storage volume 01 of storage array 01 is being copied to storage volume 01 of storage array 02.
  • the status of copy pair 02 shown in row 256B is "Suspended", which means that data copy between storage volume 01 of storage array 01 and storage volume 02 of storage array 02 is suspended and thus data contained in these two storage volumes may be different.
  • FIG. 11 illustrates a copy condition table 2570, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410.
  • Column 2571 shows identifiers of copy pairs.
  • Column 2572 shows identifiers of ETL rules assigned to each copy pair.
  • Column 2573 shows copy pair conditions, which include conditions to activate each copy pair.
  • Column 2574 shows priorities assigned to each copy pair.
  • Each row 257A, 257B shows a condition to activate each copy pair. For example, row 257A shows that copy pair 01 is activated if the delta of the number of "purchasers" in "sales_data" is equal to or larger than 1000.
  • Row 257B shows that copy pair 02 is activated if the delta of the number of "purchasers" in "purchase_history" is equal to or larger than 1. If these two conditions are satisfied at the same time, copy pair 02 is activated first because copy pair 02 has a higher priority than copy pair 01.
  • the copy pair conditions can be based on the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output.
  • the number of new entries required to invoke the copy pair condition is "1000", which can be based on a number of entries required to change the output for the data of "sales_data".
  • priority can be assigned based on the number of entries required to change the output for the data, wherein fewer entries can be assigned a higher priority, as shown by copy condition 257B having a higher priority than copy condition 257A.
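Evaluating rows 257A and 257B could look like the following sketch; the condition tuples mirror columns 2571, 2573 and 2574, and the observed delta values are hypothetical:

```python
def pairs_to_activate(conditions, deltas):
    """Return the copy pairs whose delta condition is met, ordered so
    that the highest priority (smallest priority number) comes first."""
    met = [(pair_id, priority)
           for pair_id, field, threshold, priority in conditions
           if deltas.get(field, 0) >= threshold]
    return [pair_id for pair_id, _ in sorted(met, key=lambda m: m[1])]

conditions = [
    ("01", "sales_data", 1000, 2),     # row 257A
    ("02", "purchase_history", 1, 1),  # row 257B: higher priority
]
observed = {"sales_data": 1500, "purchase_history": 3}
pairs_to_activate(conditions, observed)  # pair 02 is activated before pair 01
```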
  • FIG. 12 illustrates a GUI 1100-A of self-service portal 1100, in accordance with an example implementation.
  • This GUI is used when application developer 1010 deploys an application.
  • the application developer selects and/or inputs a data source 1110-A, an extract method 1120-A, a transform rule 1130-A, a load method 1140-A, and an application image 1150-A to be deployed.
  • An application image can be an image of a VM or a container image, but is not limited thereto, and other application images can be applied according to the desired implementation. If the OK button 1160-A is clicked, management program 2410 copies data contained within the selected data source, executes ETL processes and deploys the selected application.
  • FIG. 13 illustrates a flowchart of management program for deploying an application, in accordance with an example implementation.
  • the flow for the management program begins at 10010.
  • Management program 2410 receives a request for deploying an application from the self- service portal 1100. Parameters as shown in FIG. 12 are passed to Management program 2410.
  • the Management program 2410 stores received ETL information into ETL rule table.
  • the Management program 2410 identifies storage volumes containing data of the specified data source. This can be done by comparing the selected value of data source 1110-A with values of application ID 2551 of Mapping Table 2550.
  • the Management program 2410 judges if a target storage specified by Load Method 1140-A exists or not. If the result is Yes then the process proceeds to 10070. If the result is No then the process proceeds to 10060.
  • the Management program 2410 creates a storage VM which can be used as a target storage, and a storage volume.
  • Management program 2410 refers to configuration information necessary for creating a storage VM in Physical Server Table 2530. Configuration information of the created storage VM is stored in Virtual Server Table 2540.
  • Management program 2410 configures a copy pair between the source storage volume identified in step 10040 and a target storage volume which has been existing or created at 10060.
  • Management program 2410 refers to configuration information necessary for configuring a copy pair (e.g. capacity of each storage volume) in Storage Array Table 2510 and Storage Volume Table 2520.
  • Management program 2410 executes initial data copy between the source storage volume and the target storage volume.
  • Management program 2410 suspends the data copy.
  • Management program 2410 stores copy pair information into Copy Pair Table 2560.
  • Management program 2410 creates a VM, deploys the same type of database as the specified data source and allocates the target storage volume to the created database. This process enables the management program 2410 to read data copied from the source storage volume to the target storage volume via the same access method, such as SQL.
  • Management program 2410 extracts data from the created database, transforms the data and loads transformed data into the target data store according to the specified ETL information.
  • Management program 2410 invokes the "create copy condition" sub-sequence. This sub-sequence can be done asynchronously.
  • Management program 2410 creates a VM and deploys an instance of specified application image.
  • Management program 2410 quits the application deployment process.
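The deployment flow above can be summarized as an ordered step list. In the sketch below, only the step numbers the text states explicitly (10040, 10050, 10060, 10070, 10130) are certain; the remaining numbers are inferred from the flowchart's numbering and should be treated as assumptions:

```python
def deployment_steps(target_storage_exists: bool) -> list[str]:
    """Ordered steps of the FIG. 13 flow; the storage-VM creation step
    runs only when the target storage specified by the load method
    does not yet exist."""
    steps = [
        "store ETL information into the ETL rule table",
        "10040 identify source storage volumes",
        "10050 check whether the target storage exists",
    ]
    if not target_storage_exists:
        steps.append("10060 create a storage VM and a storage volume")
    steps += [
        "10070 configure the copy pair",
        "execute the initial data copy",
        "suspend the data copy",
        "store copy pair information",
        "create a VM and deploy a database on the target volume",
        "extract, transform and load into the target data store",
        "10130 invoke the 'create copy condition' sub-sequence (async)",
        "create a VM and deploy the application instance",
    ]
    return steps

len(deployment_steps(target_storage_exists=True))  # 11 steps when the target exists
```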
  • FIG. 14 illustrates a flowchart of management program 2410 for creating a copy condition, in accordance with an example implementation.
  • the flow illustrated in FIG. 14 is the procedure from the flow at 10130.
  • One purpose of executing this sub-sequence is to identify correlations between the number of input data to the application and the occurrence of changes in its output, and to create a copy condition based on the correlation to make data copy efficient.
  • Management program 2410 creates a temporary VM and deploys a temporary application instance of the specified application image.
  • Management program 2410 deploys a proxy for the target data store and binds it to the temporary application instead of the target data store. This proxy has the same data access interface as that of the target data store, and a limit on the number of data entries to be returned can be set.
  • Management program 2410 sets an initial limitation of number of data to be returned to the temporary application instance.
  • Management program 2410 judges whether the number of data contained in the target data store is equal to or larger than the limitation. If the result is Yes, then the process proceeds to 11060. If the result is No, then the process proceeds to 11090.
  • Management program 2410 runs the temporary instance of the application.
  • Management program 2410 compares the output of the application with the output in the previous execution of the same application and checks if the output has changed. If it is the first execution (thereby resulting in no previous output), then management program 2410 concludes that the output is changed.
  • Management program 2410 increases the limitation for the number of data to be returned to the temporary application instance.
  • Management program 2410 identifies a range of number of data in which the output of the application is not changed.
  • Management program 2410 creates a copy condition and stores it into Copy Condition Table 2570.
  • Management program 2410 sets priorities based on the range if there are multiple copy pairs which have the same source storage volume.
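The probe loop at 11030 through 11080 can be sketched as follows. This is a minimal illustration of the sub-sequence, assuming hypothetical `proxy.set_limit` and `run_application` interfaces that stand in for Proxy 15BA and the temporary application instance; these names are not part of the disclosure.

```python
def probe_cycles(proxy, run_application, limits, total_entries):
    """Return a list of (limit, output, changed) tuples, one per cycle.

    Each cycle raises the limit on the proxy (11030/11080), runs the
    temporary application instance (11050), and compares its output
    with the previous run (11060).
    """
    results = []
    previous_output = None
    for limit in limits:
        if total_entries < limit:      # 11040: not enough data in the data store
            break
        proxy.set_limit(limit)          # cap the number of entries returned
        output = run_application()      # run the temporary instance
        # The first execution has no previous output, so it counts as changed.
        changed = (previous_output is None) or (output != previous_output)
        results.append((limit, output, changed))
        previous_output = output
    return results
```

A usage example mirroring FIG. 17 would feed limits 200, 1000, 5000 against a data store of 5000 entries and record at which limit the output first changes.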
  • the copy condition can be based on the desired implementation and is not particularly limited thereto.
  • the copy condition can be set based on the termination of a cycle if a change has occurred. That is, cycles as described in FIGS. 16 and 17 are traversed until one cycle causes a change in the output, upon which the number of data entries associated with the cycle is used as a copy condition. For example, if the cycle associated with the change in output involved the input of 1000 data entries for the change in output to occur, the copy condition can be set to execute the copy operation when 1000 new data entries are received.
  • all of the cycles can be executed until completion, and then an average is taken. That is, cycles as described in FIGS. 16 and 17 are traversed until the maximum number of data entries is reached, upon which the average number of data entries associated with the cycles in which changes occurred is used as a copy condition. For example, if the average number of data entries between cycles in which changes occur is 1000 data entries, the copy condition can be set to execute the copy operation when 1000 new data entries are received.
  • all of the cycles can be executed until completion, and then a minimum value is taken. That is, cycles as described in FIGS. 16 and 17 are traversed until the maximum number of data entries is reached, upon which the cycle associated with the smallest number of data entries at which changes occurred is used as a copy condition. For example, if one of the cycles associated with a change indicates that the number of data entries causing the change is 100 data entries and such a value is the smallest among all of the other cycles in which changes occurred, the copy condition can be set to execute the copy operation when 100 new data entries are received.
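The three strategies above (first change, average over changes, minimum over changes) can be sketched as a single function. This is a hedged illustration, not the disclosed implementation; `cycles` is assumed to be a list of (number-of-entries, output-changed) pairs collected from the probe cycles of FIG. 14.

```python
def copy_condition(cycles, strategy="first"):
    """Derive the number of new data entries that triggers a copy operation.

    cycles: list of (num_entries, output_changed) pairs, in probe order.
    strategy: "first"   - entries at the first cycle whose output changed
              "average" - average entries over all cycles with a change
              "minimum" - smallest entries among cycles with a change
    """
    changed = [n for n, did_change in cycles if did_change]
    if not changed:
        return None  # output never changed; no condition can be derived
    if strategy == "first":
        return changed[0]
    if strategy == "average":
        return sum(changed) // len(changed)
    if strategy == "minimum":
        return min(changed)
    raise ValueError("unknown strategy: %s" % strategy)
```

With cycles that changed at 100 and 1000 entries, "first" and "minimum" both yield 100 while "average" yields 550, matching the intent of the three alternatives described above.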
  • Management program 2410 terminates the process.
  • FIG. 15 illustrates an example configuration created by management program 2410 to deploy an application, in accordance with an example implementation.
  • storage array 1510A contains Storage Control Program 1512A and Storage Array 1511A.
  • Server 1520A includes Database-A 1523A, VM 1522A and Hypervisor 1521A.
  • Server 1530A includes Analytics-A application 1533A, VM 1532A and Hypervisor 1531A.
  • Server 1540A includes Analytics-A application 1543A, VM 1542A, and Hypervisor 1541A.
  • Server 1550A includes Storage Control Program 1557A, Database-A 1558A, ETL 1559A, Datastore-1 15AA, Proxy 15BA, VMs 1554A, 1555A, 1556A, Hypervisor 1553A, and storage volumes 1551A and 1552A.
  • Management program 2410 executes the following processes in accordance with the flow diagrams of FIGS. 13, 14 and 16.
  • storage volume 1552A can alternately be removed, with all operations conducted on storage volume 1551A instead, depending on the desired implementation.
  • the data associated with the target storage area can be stored in the target storage area 1551A in FIG. 15, thereby implementing the server 1550A without another storage area 1552A.
  • operations can be conducted on Database-A 1558A without the ETL operation 1559A, thereby removing the need for ETL 1559A, Datastore-1 15AA and Proxy 15BA.
  • the present disclosure is not limited thereto, and any combinations of the elements of ETL 1559A, Datastore-1 15AA, Proxy 15BA, and the other storage area 1552A may be used or omitted depending on the desired implementation.
  • FIG. 16 illustrates a flowchart of management program 2410 for resyncing a copy pair to copy data which is created, modified or deleted in a data source to a target data store, in accordance with an example implementation.
  • Management program 2410 monitors each data source, retrieves query logs and stores them into Database Query Logs Table 2430.
  • Management program 2410 checks if a copy condition is satisfied based on Database Query Logs Table 2430, ETL Rule Table 2420 and Copy Condition Table 2570.
  • Management program 2410 judges whether there are any copy pairs whose copy condition is satisfied. If the result is Yes, then the process proceeds to 12050. If the result is No, then the process proceeds to 12020.
  • Management program 2410 resyncs the copy pair to transfer uncopied data from the data source to the target data store.
  • the copy operation can be conducted according to the priority of the data set as illustrated in FIG. 11. For example, the data entries obtained from 257B of FIG. 11 have higher priority than the data set of 257A, so the uncopied data of 257B is copied before the uncopied data set of 257A.
  • the data sets can be copied in a serial manner (e.g., uncopied data entries of higher priority are copied first until completion before initiation of a copy operation on uncopied data entries of lower priority), however the present disclosure is not limited thereto, and other implementations may be utilized according to the desired implementation (e.g., bandwidth sharing schemes for copying in parallel).
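A minimal sketch of the priority-ordered serial resync described above, under the assumption that each copy pair carries the number of entries from its copy condition (as in rows 257A and 257B of FIG. 11); `resync` is a stand-in for the storage-side operation and is not defined in the disclosure.

```python
def resync_by_priority(copy_pairs, resync):
    """Resync copy pairs serially, highest priority first.

    copy_pairs: list of dicts with 'pair_id' and 'condition_entries'.
    A condition derived from fewer entries reflects a stronger
    input/output correlation and therefore gets higher priority.
    """
    ordered = sorted(copy_pairs, key=lambda p: p["condition_entries"])
    for pair in ordered:          # serial: each copy runs to completion
        resync(pair["pair_id"])   # before the next pair is started
    return [p["pair_id"] for p in ordered]
```

For example, a pair whose output changed after a single entry (like row 257B) would be resynced before a pair whose output only changed after 1000 entries (like row 257A).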
  • Management program 2410 suspends the data copy.
  • Management program 2410 extracts data from the database 1558A, transforms the data and loads transformed data into the target data store 15AA.
  • the management program copies data, executes ETL and deploys an application while considering correlations between the number of input data to the application and the occurrence of changes in its output. The management program does not copy data if it determines that the output of the application would not be changed even if the data is copied. The management program also sets priorities of data copy based on the correlations. Through this example implementation, the management program can efficiently control the data copy necessary for a deployed application.
  • FIG. 17 shows an example of multiple cycles of FIG. 14, in accordance with an example implementation. Specifically, FIG. 17 illustrates an example flow from 11050 to 11080 from FIG. 14 as applied to the elements described in FIG. 15. The number of data stored in storage volume 1552A is 5000 in this example.
  • the output of Analytics-A application 1543A is "A".
  • a value of 200 is set to Proxy 15BA. This means that the Proxy 15BA returns only 200 data of the total 5000 data to Analytics-A application 1543A.
  • the output of Analytics-A application 1543A is "A".
  • Management program 2410 identifies that the output of Analytics-A application 1543A is not changed.
  • a value of 1000 is set to Proxy 15BA. This means that the Proxy 15BA returns only 1000 data of total 5000 data to Analytics-A application 1543A.
  • the output of Analytics-A application 1543A is "B".
  • Management program 2410 identifies that the output of Analytics-A application 1543A is changed. Management program 2410 identifies that the output of Analytics-A application is not changed if the number of its input data is between 0 and 999. As a result, management program 2410 creates a copy condition as shown in row 257A in FIG. 11.
  • An example of an Analytics-A application that has a weak correlation between the number of input data to the application and the occurrence of changes in its output is an application that analyzes sales data of a product (e.g., a book) to identify a group with large purchasing power. A small change in the number of input sales data would not affect the result of the application.
  • FIG. 18 illustrates another example of multiple cycles of FIG. 14, in accordance with an example implementation. Specifically, FIG. 18 illustrates an example flow from 11050 to 11080 from FIG. 14 as applied to the elements described in FIG. 15.
  • the application is Analytics-B application 1543B, which is different from the one in the previous example shown in FIG. 17.
  • the number of data stored in storage volume 1552A is 100 in this example.
  • a value of 1 is set to Proxy 15BA. This means that the Proxy 15BA returns only 1 data of the total 100 data to Analytics-B application 1543B.
  • The output of Analytics-B application 1543B is "A".
  • a value of 2 is set to Proxy 15BA. This means that the Proxy 15BA returns only 2 data of total 100 data to Analytics-B application 1543B.
  • the output of Analytics-B application 1543B is "B".
  • Management program 2410 identifies that the output of Analytics-B application 1543B is changed.
  • a value of 5 is set to Proxy 15BA. This means that the Proxy 15BA returns only 5 data of the total 100 data to Analytics-B application 1543B.
  • the output of Analytics-B application 1543B is "E".
  • Management program 2410 identifies that the output of Analytics-B application 1543B is changed.
  • Management program 2410 identifies that the output of Analytics-B application is changed whenever the number of its input data changes.
  • management program 2410 creates a copy condition as shown in row 257B in FIG. 11.
  • An example of an Analytics-B application that has a strong correlation between the number of input data to the application and the occurrence of changes in its output is an application that analyzes the purchase history data of a customer to identify purchasing trends and preferences of the user.
  • Even one most recent purchase record is very important for the application, as it has a big impact on the result of the application.
  • Example implementations may also relate to an apparatus for performing the operations herein.
  • This apparatus may be specially constructed for the required purposes, or it may include one or more general-purpose computers selectively activated or reconfigured by one or more computer programs.
  • Such computer programs may be stored in a computer readable medium, such as a computer-readable storage medium or a computer-readable signal medium.
  • a computer-readable storage medium may involve tangible mediums such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible or non-transitory media suitable for storing electronic information.
  • a computer readable signal medium may include mediums such as carrier waves.
  • the algorithms and displays presented herein are not inherently related to any particular computer or other apparatus.
  • Computer programs can involve pure software implementations that involve instructions that perform the operations of the desired implementation.
  • the operations described above can be performed by hardware, software, or some combination of software and hardware.
  • Various aspects of the example implementations may be implemented using circuits and logic devices (hardware), while other aspects may be implemented using instructions stored on a machine-readable medium (software), which if executed by a processor, would cause the processor to perform a method to carry out implementations of the present application.
  • some example implementations of the present application may be performed solely in hardware, whereas other example implementations may be performed solely in software.
  • the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways.
  • the methods may be executed by a processor, such as a general purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.

Abstract

Example implementations disclosed herein are directed to systems and methods for a management program that copies data, executes ETL and deploys an application while considering correlations between the number of input data to the application and the occurrence of changes in its output. In example implementations, the management program does not copy data if it determines that the output of the application would not be changed even if the data is copied. The management program may also set priorities of data copy based on the correlations.

Description

METHOD AND APPARATUS TO CONTROL DATA COPY BASED ON CORRELATIONS BETWEEN NUMBER OF COPIED DATA AND APPLICATION
OUTPUT
BACKGROUND
Field
[1] The present disclosure relates generally to storage system management, and more specifically, toward systems and methods of managing data copies.
Related Art
[2] In the related art, there are methods and apparatuses that manage object-based tiering in storage systems. In an example related art implementation, there is a storage system configured to acquire information about where database tables are stored from an application, and to use the information to manage tiering of data. An example of such a related art implementation is described, for example, in U.S. Patent No. 8,464,003, incorporated herein by reference in its entirety for all purposes.
[3] In the related art, there are storage systems configured to replicate data to other storage systems for the purposes of providing high availability. An example of such a related art implementation can be found, for example, in U.S. Patent No. 8,943,286, incorporated herein by reference in its entirety for all purposes.
SUMMARY
[4] Aspects of the present disclosure include a management computer connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected. The management computer can include a processor, configured to manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and apply the copy pair condition to the copy pair. [5] Aspects of the present disclosure further include a computer program having instructions for executing a process for a management computer connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected. The instructions can include managing a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; creating a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and applying the copy pair condition to the copy pair. The instructions can be stored on a non-transitory computer readable medium.
[6] Aspects of the present disclosure further include a system, which can involve a plurality of servers, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected; a plurality of storage areas; and a management computer connected to the plurality of servers and the plurality of storage areas. The management computer can include a processor, configured to manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and apply the copy pair condition to the copy pair.
[7] Aspects of the present disclosure further include a system which can involve means connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected. The system can further include means for managing a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; means for creating a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and means for applying the copy pair condition to the copy pair. BRIEF DESCRIPTION OF DRAWINGS
[8] FIG. 1(a) illustrates an example of a logical configuration of the system in which the method and apparatus of the example implementations described herein may be applied.
[9] FIG. 1(b) illustrates a logical configuration of the IT infrastructure of FIG. 1(a), in accordance with an example implementation.
[10] FIG. 1(c) illustrates a physical configuration of the IT environment, in accordance with an example implementation.
[11] FIG. 2 illustrates the configurations of management computer, in accordance with an example implementation.
[12] FIG. 3 illustrates an example of Extract, Transform and Load (ETL) Rule Table, in accordance with an example implementation.
[13] FIG. 4 illustrates Database Query Logs Table, in accordance with an example implementation.
[14] FIG. 5 illustrates the physical storage table, in accordance with an example implementation.
[15] FIG. 6 illustrates a storage volume table, in accordance with an example implementation.
[16] FIG. 7 illustrates a physical server table, in accordance with an example implementation.
[17] FIG. 8 illustrates a virtual server table, in accordance with an example implementation.
[18] FIG. 9 illustrates a mapping table, in accordance with an example implementation.
[19] FIG. 10 illustrates a copy pair table, in accordance with an example implementation.
[20] FIG. 11 illustrates a copy condition table, in accordance with an example implementation.
[21] FIG. 12 illustrates a graphical user interface (GUI) of self-service portal, in accordance with an example implementation.
[22] FIG. 13 illustrates a flowchart of management program for deploying an application, in accordance with an example implementation.
[23] FIG. 14 illustrates a flowchart of management program for creating a copy condition, in accordance with an example implementation.
[24] FIG. 15 illustrates an example configuration created by management program to deploy an application, in accordance with an example implementation.
[25] FIG. 16 illustrates a flowchart of management program for resyncing copy pair to copy data which is created, modified or deleted in a data source to a target data store, in accordance with an example implementation.
[26] FIG. 17 shows an example of multiple cycles of FIG. 14, in accordance with an example implementation.
[27] FIG. 18 illustrates another example of multiple cycles of FIG. 14, in accordance with an example implementation.
DETAILED DESCRIPTION
[28] The following detailed description provides further details of the figures and example implementations of the present application. Reference numerals and descriptions of redundant elements between figures are omitted for clarity. Terms used throughout the description are provided as examples and are not intended to be limiting. For example, the use of the term "automatic" may involve fully automatic or semi-automatic implementations involving user or administrator control over certain aspects of the implementation, depending on the desired implementation of one of ordinary skill in the art practicing implementations of the present application. Selection can be conducted by a user through a user interface or other input means, or can be implemented through a desired algorithm. Example implementations as described herein can be utilized either singularly or in combination and the functionality of the example implementations can be implemented through any means according to the desired implementations. [29] In the related art implementations, there is a lack of consideration concerning correlations between input data to the application and changes in the output, and there is no storage system management based on such correlations. The example implementations described herein are directed to a management program configured to copy data, execute Extract, Transform and Load (ETL) and deploy an application while considering correlations between the number of input data to the application and the occurrence of changes in its output. In example implementations, the management program also sets priorities of data copy based on the correlations. In example implementations, the management program does not copy data if it determines that the output of the application would not be changed even if the data is copied.
[30] In an example implementation described below, there is a management program that controls data copy while considering correlations between the number of input data to the application and occurrence of changes in its output.
[31] FIG. 1(a) illustrates an example of a logical configuration of the system in which the method and apparatus of the invention may be applied.
[32] IT environment 1000 can include management program 1200, Self-Service Portal
1100, Application Image Repository 1300, Information Technology (IT) Infrastructure Management User Interface (UI) 1400, and IT infrastructure 1500. Application developer 1010 develops applications and deploys them onto the IT environment 1000 via Self-Service Portal 1100. IT infrastructure administrator 1030 manages the IT environment 1000 via IT infrastructure management UI 1400.
[33] FIG. 1 (b) illustrates a logical configuration of the IT infrastructure 1500 of FIG.
1(a), in accordance with an example implementation. IT infrastructure 1500 involves one or more servers and/or storage arrays. In the example of FIG. 1(b), there is one storage array (1510) and four servers (1520, 1530, 1540 and 1550); however, any number of storage arrays and servers may be utilized according to the desired implementation, and the present disclosure is not limited thereto. Applications 1523 and 1533 are running on Virtual Machines (VMs) 1522 and 1532, which are running on Hypervisors 1521 and 1531 respectively. Application 1523 uses Storage Volume 1511 of Storage Array 1510 and Application 1533 uses Storage Volumes 1512 and 1513 of Storage Array 1510. Storage Control Program 1514 runs on Storage Array 1510 and controls I/O from applications to storage volumes. Applications 1544 and 1545 are running on VMs 1542 and 1543, which are running on Hypervisor 1541. Application 1544 uses Storage Volumes 1551 and 1552 of Storage VM 1555 and Application 1545 uses Storage Volume 1553 of Storage VM 1556. Storage Control Programs 1557 and 1558 run on Storage VMs 1555 and 1556, and these programs control I/O from applications to storage volumes.
[34] FIG. 1(c) illustrates a physical configuration of the IT environment, in accordance with an example implementation. IT environment 1000 involves management computer 2000, servers 3000, storage arrays 4000, management network 5000 and data network 6000. Servers 3000 and storage arrays 4000 are connected via data network 6000. This network can be LAN (Local Area Network) but it is not limited thereto, and other networks may be utilized according to the desired implementation. Management computer 2000, servers 3000 and storage arrays 4000 are connected via management network 5000. Management network 5000 may also be LAN, but is not limited thereto. Though management network 5000 and data network 6000 are separated in this example, they can be a single converged network in accordance with a desired implementation. In this example implementation, management computer 2000 and servers 3000 are separated, but other implementations may also be utilized according to the desired implementation. For example, any server can host a management program. Further, in this example, servers 3000 and storage arrays 4000 are separated, however, other implementations may also be utilized according to the desired implementation. For example, servers and storages arrays can be converged into one system. The storage arrays 4000 can provide a plurality of storage areas. Further, one or more servers from the plurality of servers 3000 can be configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected, as described in more detail below.
[35] FIG. 2 illustrates the configurations of management computer 2000, in accordance with an example implementation. Depending on the desired implementation, management computer 2000 may also be in the form of a management server. Management Network Interface 2100 is an interface to connect the management computer 2000 to the management network 5000. Input and output (I/O) device 2300 can involve any I/O user interface such as a monitor, a keyboard and a mouse. Local Disk 2400 contains management program 2410, ETL (Extract, Transform and Load) rule table 2420 and database query logs table 2430. Management program 2410 is loaded to Memory 2500 and executed by processor 2200. Processor 2200 can be in the form of a physical processor or central processing unit (CPU) that is configured to execute instructions from memory 2500, or as directly input into the processor 2200. The procedure of the management program 2410 is disclosed below. Management program 2410 is the same entity as management program 1200 in FIG. 1(a). ETL rule table 2420 and database query logs table 2430 are loaded to Memory 2500 and used by the management program 2410, of which further description is provided below. Memory 2500 contains storage array table 2510, storage volume table 2520, physical server table 2530, virtual server table 2540, mapping table 2550, copy pair table 2560 and copy condition table 2570, of which further description is provided below.
[36] In an example implementation, the processor 2200 can be configured to load the management program 2410 from memory 2500 and thereby configured to manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output, as described in FIG. 14; and apply the copy pair condition to the copy pair. Copy pair conditions are described with respect to FIG. 11.
[37] In an example implementation, the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application until a first occurrence of the change in output is detected; wherein the copy pair condition involves conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the number of entries applied to the data until the first occurrence of the change in output is detected, as described in FIG. 14.
[38] In an example implementation, the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of an average number of entries applied to the data over occurrences of changes in output, wherein the copy pair condition involves conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the average number of entries, as described in FIG. 14.
[39] In an example implementation, the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of a minimum number of entries applied to the data for the change in the output to occur, wherein the copy pair condition involves conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the minimum number of entries, as described in FIG. 14.
[40] In example implementations, the data associated with the target storage area is stored in the target storage area. For example, the data associated with the target storage area can be stored in the target storage area 1551A in FIG. 15. However, the data can also be stored in another storage area and processed through an Extract, Transform and Load (ETL) process, as described in FIG. 15 with respect to storage area 1552A.
[41] In an example implementation, the processor 2200 can be configured to load the management program 2410 from memory 2500 and thereby configured to manage a plurality of copy pairs and a plurality of copy pair conditions, each of the plurality of copy pair conditions associated with a corresponding one of the plurality of copy pairs, each of the plurality of copy pair conditions associated with a priority based on an associated number of entries to cause a change to an associated output, such that ones of the plurality of copy pair conditions having a fewer associated number of entries are given higher priority, as described in FIG. 11. Processor 2200 can also be configured to execute a resynchronization process for each of the plurality of copy pairs based on the associated priority, as described in FIG. 16.
[42] FIG. 3 illustrates an example of ETL Rule Table 2420, in accordance with an example implementation. Column 2421 shows identifications of ETL rules. Column 2422 shows queries to extract data. Column 2423 shows rules to transform data. Column 2424 shows methods to load data. Each row shows an ETL rule. For example, row 242A shows a rule which extracts data via a Structured Query Language (SQL) query, transforms the extracted data into structured data named "sales_data" and loads the transformed data to a data store named "datastore-1" through the Hypertext Transfer Protocol (HTTP) POST method.
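Row 242A might be represented as a record like the following sketch. The field names are assumptions chosen to mirror columns 2421 through 2424, and the SQL text is an invented placeholder, since the exact query is not given here.

```python
# Hypothetical in-memory representation of one row of ETL Rule Table 2420.
etl_rule = {
    "rule_id": "242A",                        # column 2421: rule identification
    "extract_query": "SELECT * FROM sales",   # column 2422: placeholder SQL query
    "transform_rule": "sales_data",           # column 2423: target structured form
    "load_method": {                          # column 2424: how to load the data
        "protocol": "HTTP POST",
        "target": "datastore-1",
    },
}
```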
[43] FIG. 4 illustrates Database Query Logs Table 2430, in accordance with an example implementation. Column 2431 shows the application identifiers (IDs). Column 2432 shows logs of database queries issued by the applications. Each row shows examples of logs issued by an application. For example, row 243A shows that an application having a value of 01 as its ID issued some Insert queries and Delete queries.
[44] FIG. 5 illustrates the physical storage table 2510, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2511 shows identifications of storage arrays. Column 2512 shows types of storage arrays. The type can be either "Storage Array", which means a hardware-based storage array, or "Storage VM", which means a virtual machine based storage array, in this example implementation. Column 2513 shows processors of the storage arrays. Column 2514 shows ports of the storage arrays. Column 2515 shows cache resources of the storage arrays. Column 2516 shows pools of resources (typically capacities) of the storage arrays. Each row shows configurations of a storage array. For example, row 251A shows configurations of storage array 01. The storage array is hardware-based and has two processors with 32 cores each, 8Gbps of ports A, B, C and D, 160GB of cache C-01 and 128GB of cache C-02, 300TB of pools Pool-01 and Pool-02, and 500TB of pools Pool-03 and Pool-04, but is not limited thereto, and other configurations can also be utilized in accordance with the desired implementation. Rows 251B and 251C show configurations of storage VMs 02 and 03.
[45] FIG. 6 illustrates a storage volume table 2520, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2521 shows identifiers of storage arrays owning storage volumes. Column 2522 shows identifiers of storage volumes. Column 2523 shows the capacities of each storage volume. Column 2524 shows identifiers of pools from which storage volumes are carved. Each row 252A, 252B, 252C, 252D, 252E, 252F, 252G shows the configuration of each storage volume. For example, row 252A shows a configuration of storage volume 01 of storage array 01. This storage volume has 10TB of capacity, carved from Pool-01.
[46] FIG. 7 illustrates a physical server table 2530, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2531 shows identifiers of physical servers. Column 2532 shows numbers of cores and types of central processing unit (CPU) of each physical server. Column 2533 shows capacities of memory resources of each physical server. Column 2534 shows ports of each physical server. Each row 253A, 253B, 253C, 253D shows the configuration of each physical server. For example, row 253A shows a configuration of physical server 01. The physical server has 12 cores of Normal CPU, 32GB of memory, and 4Gbps of ports A and B.
[47] FIG. 8 illustrates a virtual server table 2540, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2541 shows identifications of the virtual servers. Column 2542 shows identifiers of the physical servers on which the virtual servers are running. Column 2543 shows numbers of CPU cores assigned to each virtual server. Column 2544 shows capacities of memory resources assigned to each virtual server. Column 2545 shows ports assigned to each virtual server. Each row 254A, 254B, 254C, 254D shows the configuration of each virtual server. For example, row 254A shows a configuration of virtual server 01. This virtual server is hosted on physical server 01, has 2 CPU cores, 4GB of memory and 4Gbps of port A.
[48] FIG. 9 illustrates a mapping table 2550, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2551 shows identifiers of applications. Column 2552 shows names of the applications. Column 2553 shows identifiers of virtual servers on which the applications are running. Column 2554 shows identifiers of ports of the virtual servers. Column 2555 shows identifiers of storage arrays or storage VMs. Column 2556 shows identifiers of ports of the storage arrays or storage VMs. Column 2557 shows identifiers of storage volumes.
[49] Each row 255A, 255B, 255C, 255D shows an end-to-end mapping between an application and storage volumes. For example, row 255A shows that application 01 whose name is "Database-A" is running on virtual server 01 and storage volume 01 of storage array 01 is allocated to this application via storage port A and virtual server port A.
[50] FIG. 10 illustrates a copy pair table 2560, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2561 shows identifiers of copy pairs. Column 2562 shows identifiers of source storage arrays or storage VMs. Column 2563 shows identifiers of source storage volumes. Column 2564 shows identifiers of target storage arrays or storage VMs. Column 2565 shows identifiers of target storage volumes or storage VMs. Column 2566 shows statuses of copy pairs.
[51] Each row 256A, 256B shows a copy pair between a source storage volume and a target storage volume. For example, row 256A shows a copy pair between storage volume 01 of storage array 01 and storage volume 01 of storage array 02. Its status is "Paired", which means that data contained in storage volume 01 of storage array 01 is being copied to storage volume 01 of storage array 02. The status of copy pair 02 shown in row 256B is "Suspended", which means that data copy between storage volume 01 of storage array 01 and storage volume 02 of storage array 02 is suspended and thus data contained in these two storage volumes may be different.
[52] FIG. 11 illustrates a copy condition table 2570, in accordance with an example implementation. This table is created in the memory 2500 by management program 2410. Column 2571 shows identifiers of copy pairs. Column 2572 shows identifiers of ETL rules assigned to each copy pair. Column 2573 shows copy pair conditions, which include conditions to activate each copy pair. Column 2574 shows priorities assigned to each copy pair. Each row 257A, 257B shows a condition to activate each copy pair. For example, row 257A shows that copy pair 01 is activated if the delta of the number of "purchasers" in "sales_data" is equal to or larger than 1000. Row 257B shows that copy pair 02 is activated if the delta of the number of "purchasers" in "purchase_history" is equal to or larger than 1. If these two conditions are satisfied at the same time, copy pair 02 is activated first because copy pair 02 has higher priority than copy pair 01. As shown in FIG. 11, the copy pair conditions can be based on the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output. In the entry of 257A, the number of new entries required to invoke the copy pair condition is "1000", which can be based on the number of entries required to change the output for the data of "sales_data". Further, priority can be assigned based on the number of entries required to change the output for the data, wherein fewer entries can be assigned a higher priority, as shown by the priority of copy condition 257B being higher than that of copy condition 257A.
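The priority rule of FIG. 11 — fewer required entries means higher priority — can be sketched as follows. This is a minimal illustration, not the patented implementation; the dictionary fields and threshold values mirror rows 257A and 257B but are otherwise assumed:

```python
# Illustrative copy-pair conditions corresponding to rows 257A and 257B.
copy_conditions = [
    {"pair": "01", "data": "sales_data", "threshold": 1000},     # row 257A
    {"pair": "02", "data": "purchase_history", "threshold": 1},  # row 257B
]

# Sort ascending by entry-count threshold; priority 1 is the highest,
# so the condition that fires on fewer new entries is resynced first.
for priority, cond in enumerate(
    sorted(copy_conditions, key=lambda c: c["threshold"]), start=1
):
    cond["priority"] = priority
```

With these values, copy pair 02 (threshold 1) receives priority 1 and copy pair 01 (threshold 1000) receives priority 2, matching the activation order described above.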
[53] FIG. 12 illustrates a GUI 1100-A of self-service portal 1100, in accordance with an example implementation. This GUI is used when application developer 1010 deploys an application. The application developer selects and/or inputs a data source 1110-A, an extract method 1120-A, a transform rule 1130-A, a load method 1140-A, and an application image 1150-A to be deployed. An application image can be an image of a VM or a container image, but is not limited thereto, and other application images can be applied according to the desired implementation. If the OK button 1160-A is clicked, management program 2410 copies data contained within the selected data source, executes ETL processes and deploys the selected application. If the cancel button 1170-A is clicked, management program 2410 cancels the application deployment process.
[54] FIG. 13 illustrates a flowchart of the management program for deploying an application, in accordance with an example implementation. The flow for the management program begins at 10010. At 10020, Management program 2410 receives a request for deploying an application from the self-service portal 1100. Parameters as shown in FIG. 12 are passed to Management program 2410. At 10030, the Management program 2410 stores the received ETL information into the ETL rule table. At 10040, the Management program 2410 identifies storage volumes containing data of the specified data source. This can be done by comparing the selected value of data source 1110-A with values of application ID 2551 of Mapping Table 2550. At 10050, the Management program 2410 judges if a target storage specified by Load Method 1140-A exists or not. If the result is Yes, then the process proceeds to 10070. If the result is No, then the process proceeds to 10060.
[55] At 10060, the Management program 2410 creates a storage VM which can be used as a target storage, and a storage volume. Management program 2410 refers to configuration information necessary for creating a storage VM in Physical Server Table 2530. Configuration information of the created storage VM is stored in Virtual Server Table 2540. At 10070, Management program 2410 configures a copy pair between the source storage volume identified at 10040 and a target storage volume which already exists or was created at 10060. Management program 2410 refers to configuration information necessary for configuring a copy pair (e.g. the capacity of each storage volume) in Storage Array Table 2510 and Storage Volume Table 2520.
[56] At 10080, Management program 2410 executes the initial data copy between the source storage volume and the target storage volume. At 10090, Management program 2410 suspends the data copy. At 10100, Management program 2410 stores copy pair information into Copy Pair Table 2560. At 10110, Management program 2410 creates a VM, deploys the same type of database as the specified data source and allocates the target storage volume to the created database. This process facilitates the management program 2410 to read data copied from the source storage volume to the target storage volume via the same access method, such as SQL. At 10120, Management program 2410 extracts data from the created database, transforms the data and loads the transformed data into the target data store according to the specified ETL information. At 10130, Management program 2410 invokes the "create copy condition" sub-sequence. This sub-sequence can be done asynchronously. At 10140, Management program 2410 creates a VM and deploys an instance of the specified application image. At 10150, Management program 2410 quits the application deployment process.
[57] FIG. 14 illustrates a flowchart of management program 2410 for creating a copy condition, in accordance with an example implementation. The flow illustrated in FIG. 14 is the procedure from the flow at 10130. One purpose of executing this sub-sequence is to identify correlations between the number of input data to the application and the occurrence of changes in its output, and to create a copy condition based on the correlation to make data copy efficient.
[58] The flow begins at 11010. At 11020, Management program 2410 creates a temporary VM and deploys a temporary application instance of the specified application image. At 11030, Management program 2410 deploys a proxy for the target data store and binds it to the temporary application instead of the target data store. This proxy has the same data access interface as the one of the target data store, and a limitation for the number of data to be returned can be set. At 11040, Management program 2410 sets an initial limitation of the number of data to be returned to the temporary application instance. At 11050, Management program 2410 judges if the number of data contained in the target data store is equal to or larger than the limitation or not. If the result is Yes, then the process proceeds to 11060. If the result is No, then the process proceeds to 11090.
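The proxy deployed at 11030 exposes the same read interface as the target data store but caps how many records it returns. The sketch below is a hypothetical illustration of that idea; the class names, the `fetch_all` method, and the record counts are all assumptions, not the patent's interfaces:

```python
class DataStoreProxy:
    """Proxy with the same read interface as the target data store,
    but limited to returning at most `limit` records (step 11030/11040)."""

    def __init__(self, datastore, limit):
        self.datastore = datastore  # any object exposing fetch_all()
        self.limit = limit          # settable cap on returned records

    def fetch_all(self):
        # Return only the first `limit` records of the underlying store.
        return self.datastore.fetch_all()[: self.limit]

class FakeStore:
    """Stand-in for Datastore-1 holding n synthetic records."""
    def __init__(self, n):
        self._rows = list(range(n))
    def fetch_all(self):
        return self._rows

# Mirror the FIG. 17 example: 5000 records total, initial limitation of 100.
proxy = DataStoreProxy(FakeStore(5000), limit=100)
```

The temporary application instance would then read through `proxy` exactly as it would read the real data store, seeing only the capped subset.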
[59] At 11060, Management program 2410 runs the temporary instance of the application. At 11070, Management program 2410 compares the output of the application with the output in the previous execution of the same application and checks if the output has changed. If it is the first execution (thereby resulting in no previous output), then management program 2410 concludes that the output is changed. At 11080, Management program 2410 increases the limitation for the number of data to be returned to the temporary application instance.
[60] At 11090, Management program 2410 identifies a range of the number of data in which the output of the application is not changed. At 11100, Management program 2410 creates a copy condition and stores it to Copy Condition Table. Management program 2410 sets priorities based on the range if there are multiple copy pairs which have the same source storage volume. The copy condition can be based on the desired implementation and is not particularly limited thereto.
[61] In an example implementation, the copy condition can be set based on the termination of a cycle if a change has occurred. That is, cycles as described in FIGS. 16 and 17 are traversed until one cycle causes a change in the output, upon which the number of data entries associated with the cycle is used as a copy condition. For example, if the cycle associated with the change in output involved the input of 1000 data entries for the change in output to occur, the copy condition can be set to execute the copy operation when 1000 new data entries are received.
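The cycle-until-first-change strategy can be sketched as a probing loop: raise the proxy limitation stepwise, rerun the application, and take the first record count at which the output differs from the previous run as the threshold. This is an assumed illustration — `find_threshold`, the `run_app` callable, and the step size of 100 are hypothetical, not the patented procedure:

```python
def find_threshold(run_app, total, step):
    """Return the number of visible records at which the application output
    first changes, probing in increments of `step` up to `total` records.
    run_app(n) must return the application output when only n records are visible."""
    previous = None
    limit = step
    while limit <= total:
        output = run_app(limit)
        if previous is not None and output != previous:
            return limit  # first cycle in which the output changed
        previous = output
        limit += step
    return None  # output never changed within the available data

# Hypothetical application mirroring FIG. 17: the output flips from "A" to "B"
# once at least 1000 records are visible, out of 5000 total.
threshold = find_threshold(lambda n: "B" if n >= 1000 else "A", total=5000, step=100)
```

With the FIG. 17-like behavior above, the loop returns 1000, which would then be stored as the copy condition threshold (row 257A).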
[62] In another example implementation, all of the cycles can be executed until completion, and then an average is taken. That is, cycles as described in FIGS. 16 and 17 are traversed until the maximum number of data entries is reached, upon which the average number of data entries associated with the cycles in which changes occurred is used as a copy condition. For example, if the average number of data entries between cycles in which changes occur is 1000 data entries, the copy condition can be set to execute the copy operation when 1000 new data entries are received.
[63] In another example implementation, all of the cycles can be executed until completion, and then a minimum value is taken. That is, cycles as described in FIGS. 16 and 17 are traversed until the maximum number of data entries is reached, upon which the cycle associated with the smallest number of data entries at which changes occurred is used as a copy condition. For example, if one of the cycles associated with a change indicates that the number of data entries causing the change is 100 data entries and such a value is the smallest among all of the other cycles in which changes occurred, the copy condition can be set to execute the copy operation when 100 new data entries are received.
[64] Variations of such example implementations may also be employed, and the present implementations are not limited to the examples provided above. Other implementations may be utilized in accordance with the desired implementation.
[65] At 11110, Management program 2410 terminates the process.
[66] FIG. 15 illustrates an example configuration created by management program
2410 to deploy an application, in accordance with an example implementation. In the illustration of FIG. 15, boxes with bold lines are created by management program 2410. The larger arrows indicate data flow from a storage volume of a data source (1511A) to the target data store (15AA). In the example of FIG. 15, storage array 1510A contains Storage Control Program 1512A and Storage Array 1511A. Server 1520A includes Database-A 1523A, VM 1522A and Hypervisor 1521A. Server 1530A includes Analytics-A application 1533A, VM 1532A and Hypervisor 1531A. Server 1540A includes Analytics-A application 1543A, VM 1542A, and Hypervisor 1541A. Server 1550A includes Storage Control Program 1557A, Database-A 1558A, ETL 1559A, Datastore-1 15AA, Proxy 15BA, VMs 1554A, 1555A, 1556A, Hypervisor 1553A, and storage volumes 1551A and 1552A. Management program 2410 executes the following processes in accordance with the flow diagrams of FIGS. 13, 14 and 16.
[67] 1. Creates storage VM 1554A and storage volume 1551A.
[68] 2. Copies data from storage volume 1511A to storage volume 1551A, and deploys Storage Control Program 1557A.
[69] 3. Creates VM 1555A and deploys Database-A 1558A and an ETL tool 1559A.
[70] 4. Creates VM 1556A and a storage volume 1552A, and deploys Datastore-1 15AA and Proxy 15BA.
[71] 5. Executes ETL.
[72] 6. Creates VM 1532A and deploys Analytics-A application 1533A.
[73] 7. Creates VM 1542A and deploys Analytics-A application 1543A.
[74] 8. Creates a copy condition between storage volume 1511A and storage volume 1551A by executing Analytics-A application 1543A with a limited amount of data contained in storage volume 1552A.
[75] 9. Controls data copy between storage volume 1511A and storage volume 1551A according to the created copy condition.
[76] Variations of this configuration may be applied according to the desired implementation, and the present disclosure is not limited to the example of FIG. 15. For example, storage volume 1552A can alternately be removed, with all operations conducted on storage volume 1551A instead, depending on the desired implementation. In such a variation, the data associated with the target storage area can be stored in the target storage area 1551A in FIG. 15, thereby implementing the server 1550A without another storage area 1552A. In such an example implementation, operations can be conducted on Database-A 1558A without the ETL operation 1559A, thereby removing the need for ETL 1559A, Datastore-1 15AA and Proxy 15BA. However, the present disclosure is not limited thereto, and any combinations of the elements of ETL 1559A, Datastore-1 15AA, Proxy 15BA, and the other storage area 1552A may be used or omitted depending on the desired implementation.
[77] FIG. 16 illustrates a flowchart of management program 2410 for resyncing a copy pair to copy data which is created, modified or deleted in a data source to a target data store, in accordance with an example implementation.
[78] The procedure begins at 12010. At 12020, Management program 2410 monitors each data source, retrieves query logs and stores them into Database Query Logs Table 2430. At 12030, Management program 2410 checks if a copy condition is satisfied based on Database Query Logs Table 2430, ETL Rule Table 2420 and Copy Condition Table 2570. At 12040, Management program 2410 judges if there are any copy pairs whose copy condition is satisfied or not. If the result is Yes, then the process proceeds to 12050. If the result is No, then the process proceeds to 12020.
[79] At 12050, Management program 2410 resyncs the copy pair to transfer uncopied data from the data source to the target data store. The copy operation can be conducted according to the priority of the data set as illustrated in FIG. 11. For example, the data entries obtained from 257B of FIG. 11 have higher priority than the data set of 257A, so the uncopied data of 257B is copied before the uncopied data set of 257A. In an example implementation, the data sets can be copied in a serial manner (e.g. uncopied data entries of higher priority are copied over first until completion before initiation of a copy operation on uncopied data entries of lower priority), but the present disclosure is not limited thereto, and other implementations may be utilized according to the desired implementation (e.g., bandwidth sharing schemes for copying in parallel).
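The serial, priority-ordered resynchronization described above can be sketched as a simple sort over the satisfied copy pairs. The function name and the tuple layout are assumptions for illustration, not the patented implementation:

```python
def resync_order(satisfied_pairs):
    """Given (pair_id, priority) tuples for copy pairs whose conditions are
    satisfied, return the pair IDs in resync order: lower priority number
    (i.e. higher priority) first, as in the serial scheme of paragraph [79]."""
    return [pair_id for pair_id, _ in sorted(satisfied_pairs, key=lambda p: p[1])]

# Mirror FIG. 11: pair 02 has priority 1, pair 01 has priority 2,
# so pair 02 is resynced to completion before pair 01 begins.
order = resync_order([("01", 2), ("02", 1)])
```

A parallel, bandwidth-sharing variant would instead allocate copy bandwidth proportionally rather than consuming this list strictly in order.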
[80] At 12060, Management program 2410 suspends the data copy. At 12070,
Management program 2410 extracts data from the database 1558A, transforms the data and loads the transformed data into the target data store 15AA.
[81] In this example implementation, the management program copies data, executes ETL and deploys an application while considering correlations between the number of input data to the application and the occurrence of changes in its output. The management program does not copy data if it determines that the output of the application would not be changed even if the data were copied. The management program also sets priorities of data copy based on the correlations. Through this example implementation, the management program can efficiently control the data copy necessary for a deployed application.
[82] FIG. 17 shows an example of multiple cycles of FIG. 14, in accordance with an example implementation. Specifically, FIG. 17 illustrates an example flow from 11050 to 11080 from FIG. 14 as applied to the elements described in FIG. 15. The number of data stored in storage volume 1552A is 5000 in this example.
[83] In the first cycle, a value of 100 is set to Proxy 15BA. This means that the Proxy
15BA returns only 100 data of the total 5000 data to Analytics-A application 1543A. The output of Analytics-A application 1543A is "A". In the second cycle, a value of 200 is set to Proxy 15BA. This means that the Proxy 15BA returns only 200 data of the total 5000 data to Analytics-A application 1543A. The output of Analytics-A application 1543A is "A". Management program 2410 identifies that the output of Analytics-A application 1543A is not changed. In the tenth cycle, a value of 1000 is set to Proxy 15BA. This means that the Proxy 15BA returns only 1000 data of the total 5000 data to Analytics-A application 1543A. The output of Analytics-A application 1543A is "B". Management program 2410 identifies that the output of Analytics-A application 1543A is changed. Management program 2410 identifies that the output of Analytics-A application is not changed if the number of its input data is between 0 and 999. As a result, management program 2410 creates a copy condition like the one shown in row 257A in FIG. 11. An example of an Analytics-A application that has a weak correlation between the number of input data to the application and the occurrence of changes in its output is an application that analyzes sales data of a product (e.g. a book) to identify a large purchasing power group. A small change in the number of input sales data would not affect the result of the application.
[84] FIG. 18 illustrates another example of multiple cycles of FIG. 14, in accordance with an example implementation. Specifically, FIG. 18 illustrates an example flow from 11050 to 11080 from FIG. 14 as applied to the elements described in FIG. 15. The application is Analytics-B application 1543B, which is different from the one in the previous example shown in FIG. 17. The number of data stored in storage volume 1552A is 100 in this example.
[85] In the first cycle, a value of 1 is set to Proxy 15BA. This means that the Proxy
15BA returns only 1 data of the total 100 data to Analytics-B application 1543B. The output of Analytics-B application 1543B is "A". In the second cycle, a value of 2 is set to Proxy 15BA. This means that the Proxy 15BA returns only 2 data of the total 100 data to Analytics-B application 1543B. The output of Analytics-B application 1543B is "B". Management program 2410 identifies that the output of Analytics-B application 1543B is changed. In the fifth cycle, a value of 5 is set to Proxy 15BA. This means that the Proxy 15BA returns only 5 data of the total 100 data to Analytics-B application 1543B. The output of Analytics-B application 1543B is "E". Management program 2410 identifies that the output of Analytics-B application 1543B is changed. Management program 2410 identifies that the output of Analytics-B application changes whenever the number of its input data changes. As a result, management program 2410 creates a copy condition like the one shown in row 257B in FIG. 11.
[86] An example of an Analytics-B application that has a strong correlation between the number of input data to the application and the occurrence of changes in its output is an application that analyzes purchase history data of a customer to identify purchasing trends and preferences of the user. Even one most recent purchase record is very important for the application, as it has a big impact on the result of the application.
[87] Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations within a computer. These algorithmic descriptions and symbolic representations are the means used by those skilled in the data processing arts to convey the essence of their innovations to others skilled in the art. An algorithm is a series of defined steps leading to a desired end state or result. In example implementations, the steps carried out require physical manipulations of tangible quantities for achieving a tangible result.
[88] Unless specifically stated otherwise, as apparent from the discussion, it is appreciated that throughout the description, discussions utilizing terms such as "processing," "computing," "calculating," "determining," "displaying," or the like, can include the actions and processes of a computer system or other information processing device that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other information storage, transmission or display devices.
[89] Example implementations may also relate to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may include one or more general-purpose computers selectively activated or reconfigured by one or more computer programs. Such computer programs may be stored in a computer readable medium, such as a computer-readable storage medium or a computer-readable signal medium. A computer-readable storage medium may involve tangible mediums such as, but not limited to optical disks, magnetic disks, read-only memories, random access memories, solid state devices and drives, or any other types of tangible or non-transitory media suitable for storing electronic information. A computer readable signal medium may include mediums such as carrier waves. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Computer programs can involve pure software implementations that involve instructions that perform the operations of the desired implementation.
[90] Various general-purpose systems may be used with programs and modules in accordance with the examples herein, or it may prove convenient to construct a more specialized apparatus to perform desired method steps. In addition, the example implementations are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the example implementations as described herein. The instructions of the programming language(s) may be executed by one or more processing devices, e.g., central processing units (CPUs), processors, or controllers.
[91] As is known in the art, the operations described above can be performed by hardware, software, or some combination of software and hardware. Various aspects of the example implementations may be implemented using circuits and logic devices (hardware), while other aspects may be implemented using instructions stored on a machine-readable medium (software), which if executed by a processor, would cause the processor to perform a method to carry out implementations of the present application. Further, some example implementations of the present application may be performed solely in hardware, whereas other example implementations may be performed solely in software. Moreover, the various functions described can be performed in a single unit, or can be spread across a number of components in any number of ways. When performed by software, the methods may be executed by a processor, such as a general purpose computer, based on instructions stored on a computer-readable medium. If desired, the instructions can be stored on the medium in a compressed and/or encrypted format.
[92] Moreover, other implementations of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the teachings of the present application. Various aspects and/or components of the described example implementations may be used singly or in any combination. It is intended that the specification and example implementations be considered as examples only, with the true scope and spirit of the present application being indicated by the following claims.

Claims

What is claimed is:
1. A management computer connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected, the management computer comprising:
a processor, configured to:
manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas;
create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and
apply the copy pair condition to the copy pair.
2. The management computer of claim 1, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application until a first occurrence of the change in output is detected;
wherein the copy pair condition comprises conducting a copy pair upon a number of new entries to be applied to the target storage area meeting or exceeding the number of entries applied to the data until the first occurrence of the change in output is detected.
3. The management computer of claim 1, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of an average number of entries applied to the data over occurrences of changes in output, wherein the copy pair condition comprises conducting a copy pair upon a number of new entries to be applied to the target storage area meeting or exceeding the average number of entries.
4. The management computer of claim 1, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of a minimum number of entries applied to the data for the change in the output to occur, wherein the copy pair condition comprises conducting a copy pair upon a number of new entries to be applied to the target storage area meeting or exceeding the minimum number of entries.
5. The management computer of claim 1, wherein the processor is configured to manage a plurality of copy pairs and a plurality of copy pair conditions, each of the plurality of copy pair conditions associated with a corresponding one of the plurality of copy pairs, each of the plurality of copy pair conditions associated with a priority based on an associated number of entries to cause a change to an associated output, such that ones of the plurality of copy pair conditions having a fewer associated number of entries are given higher priority;
wherein the processor is configured to execute a resynchronization process for each of the plurality of copy pairs based on the associated priority.
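Claim 5 orders resynchronization across multiple copy pairs so that pairs whose output changes after fewer entries, i.e. the more sensitive pairs, are resynchronized first. A minimal sketch of that priority ordering (class and field names are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass
class CopyPairCondition:
    pair_id: str
    entries_to_change: int  # entries needed to cause a change in the associated output

def resync_order(conditions):
    # Fewer entries to cause an output change means higher priority,
    # so sort ascending by that count and resynchronize in that order.
    return [c.pair_id for c in sorted(conditions, key=lambda c: c.entries_to_change)]
```

A stable sort preserves the existing order among pairs whose conditions have the same entry count, so ties resolve deterministically.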
6. A non-transitory computer readable medium, storing instructions for executing a process for a management computer connected to a plurality of servers and a plurality of storage areas, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected, the instructions comprising:
managing a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas;
creating a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and
applying the copy pair condition to the copy pair.
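The three steps recited in claim 6 (manage the copy pair, create a condition from the observed entry count, apply the condition) can be sketched as a minimal control object; all names here are illustrative assumptions, not the claimed implementation:

```python
class CopyPairManager:
    """Illustrative sketch of the claimed manage/create/apply steps."""

    def __init__(self):
        self.pairs = {}       # pair_id -> (source_area, target_area)
        self.conditions = {}  # pair_id -> entry-count threshold

    def manage_copy_pair(self, pair_id, source_area, target_area):
        self.pairs[pair_id] = (source_area, target_area)

    def create_condition(self, pair_id, entries_until_output_change):
        # Threshold derived from how many entries were applied to the
        # target's data before the application's output changed.
        self.conditions[pair_id] = entries_until_output_change

    def should_copy(self, pair_id, pending_new_entries):
        # Applying the condition: copy once pending entries reach the threshold.
        return pending_new_entries >= self.conditions[pair_id]
```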
7. The non-transitory computer readable medium of claim 6, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application until a first occurrence of the change in output is detected; wherein the copy pair condition comprises conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the number of entries applied to the data until the first occurrence of the change in output is detected.
8. The non-transitory computer readable medium of claim 6, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of an average number of entries applied to the data over occurrences of changes in output,
wherein the copy pair condition comprises conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the average number of entries.
9. The non-transitory computer readable medium of claim 6, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of a minimum number of entries applied to the data for the change in the output to occur,
wherein the copy pair condition comprises conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the minimum number of entries.
10. The non-transitory computer readable medium of claim 6, wherein the instructions further comprise managing a plurality of copy pairs and a plurality of copy pair conditions, each of the plurality of copy pair conditions associated with a corresponding one of the plurality of copy pairs, each of the plurality of copy pair conditions associated with a priority based on an associated number of entries to cause a change to an associated output, such that ones of the plurality of copy pair conditions having a fewer associated number of entries are given higher priority; and
executing a resynchronization process for each of the plurality of copy pairs based on the associated priority.
11. A system, comprising: a plurality of servers, wherein a server from the plurality of servers is configured to manage an application that executes data associated with a target storage area from the plurality of storage areas to generate an output from which a change to the output is detected; a plurality of storage areas; and
a management computer connected to a plurality of servers and a plurality of storage areas, the management computer comprising:
a processor, configured to:
manage a copy pair between a source storage area from the plurality of storage areas and the target storage area from the plurality of storage areas; create a copy pair condition for the copy pair based on a number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output; and apply the copy pair condition to the copy pair.
12. The system of claim 11, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application until a first occurrence of the change in output is detected;
wherein the copy pair condition comprises conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the number of entries applied to the data until the first occurrence of the change in output is detected.
13. The system of claim 11, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of an average number of entries applied to the data over occurrences of changes in output,
wherein the copy pair condition comprises conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the average number of entries.
14. The system of claim 11, wherein the number of entries applied to the data associated with the target storage area from the plurality of storage areas to cause the change to the output is determined from the execution of the application and a determination of a minimum number of entries applied to the data for the change in the output to occur, wherein the copy pair condition comprises conducting a copy pair when a number of new entries to be applied to the target storage area meets or exceeds the minimum number of entries.
15. The system of claim 11, wherein the processor is configured to manage a plurality of copy pairs and a plurality of copy pair conditions, each of the plurality of copy pair conditions associated with a corresponding one of the plurality of copy pairs, each of the plurality of copy pair conditions associated with a priority based on an associated number of entries to cause a change to an associated output, such that ones of the plurality of copy pair conditions having a fewer associated number of entries are given higher priority;
wherein the processor is configured to execute a resynchronization process for each of the plurality of copy pairs based on the associated priority.
PCT/US2016/049031 2016-08-26 2016-08-26 Method and apparatus to control data copy based on correlations between number of copied data and application output WO2018038740A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2016/049031 WO2018038740A1 (en) 2016-08-26 2016-08-26 Method and apparatus to control data copy based on correlations between number of copied data and application output

Publications (1)

Publication Number Publication Date
WO2018038740A1 true WO2018038740A1 (en) 2018-03-01

Family

ID=61245335

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/049031 WO2018038740A1 (en) 2016-08-26 2016-08-26 Method and apparatus to control data copy based on correlations between number of copied data and application output

Country Status (1)

Country Link
WO (1) WO2018038740A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240103981A1 (en) * 2022-09-28 2024-03-28 Hitachi, Ltd. Automatic copy configuration

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050154576A1 (en) * 2004-01-09 2005-07-14 Hitachi, Ltd. Policy simulator for analyzing autonomic system management policy of a computer system
US20070101082A1 (en) * 2005-10-31 2007-05-03 Hitachi, Ltd. Load balancing system and method
US20080059735A1 (en) * 2006-09-05 2008-03-06 Hironori Emaru Method of improving efficiency of replication monitoring
WO2016024994A1 (en) * 2014-08-15 2016-02-18 Hitachi, Ltd. Method and apparatus to virtualize remote copy pair in three data center configuration

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16914365

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16914365

Country of ref document: EP

Kind code of ref document: A1