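WO2013136520A1 - Information processing system and control method for information processing system - Google Patents
Information processing system and control method for information processing system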
- Publication number
- WO2013136520A1 (PCT/JP2012/056922)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- data structure
- unit
- stored
- processing
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/81—Indexing, e.g. XML tags; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2056—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring
- G06F11/2058—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant by mirroring using more than 2 mirrored copies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/0631—Configuration or reconfiguration of storage systems by allocating resources to storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/80—Database-specific techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/845—Systems in which the redundancy can be transformed in increased performance
Definitions
- The present invention relates to an information processing system, and a control method for the information processing system, that can be applied to the processing of semi-structured data.
- The amount of data processed by information processing systems has been increasing year by year as the fields to which they are applied expand.
- The types of data range from conventional business data to real-world information such as that obtained through sensor technology.
- There is an active movement to gain new insights into business and society by analyzing log data generated during the operation of information processing systems, data that has until now been regarded as having little value.
- Large volumes of data, including such log data, are called big data, and information processing systems that serve as the platform for large-volume data processing are required to process data faster than ever before.
- The data processed so far has generally been data that can be stored in a relational database.
- Such data is called structured data.
- A relational database is well suited to searching and extracting data, but loading data into the database so that it can be searched and extracted takes considerable time and effort.
- When a CSV file is read sequentially from disk, the data in the file is accessed sequentially in the row direction.
- In a CSV file, related information such as a time stamp, a name that distinguishes each record, and various attribute values is generally stored as one record on one line. Therefore, when the CSV file is read sequentially, the data of each record can be read sequentially.
- Patent Document 1 exists as a patent document disclosing a similar technique.
- When input data such as structured data or semi-structured data is loaded into the information processing system, the input data is stored in the column store and converted into a data structure suited to column-direction access.
- The conversion is performed so that access in the column direction becomes sequential access.
- Semi-structured data also needs to be analyzed on the distributed processing platforms that have come into use in recent years. Applying the column-store method shown in Non-Patent Document 1 and Patent Document 1 in that setting raises the following problem. When a column store is used in an information processing system built on a distributed processing platform, the system must hold additional data in order to store the data anew in the column store. Meanwhile, an information processing system built on a distributed processing platform already stores data redundantly in order to improve the fault tolerance of the data. Taken together, applying a column store to such a system requires extra data on top of the redundant copies.
- One aspect of the present invention is an information processing system that stores data and analyzes the stored data in response to a request from an external device. The system comprises: a plurality of data storage units that provide storage areas for the data; a plurality of data structure operation units, each associated with one of the data storage units, that perform a predetermined operation on the data structure of the data stored in that data storage unit; a plurality of data redundancy units, each associated with one of the data storage units, that transmit data stored in that data storage unit to another data storage unit; a data arrangement determining unit that determines in which of the plurality of data storage units the data requested from the external device is to be stored; a data redundancy determining unit that causes the data redundancy units to transmit data to the data storage units; a data structure operation determining unit that causes each data structure operation unit to operate on the data structure of the data stored in the data storage unit; a data structure management information holding unit for holding data stored in a plurality of
- The data structure of the data related to the storage request is determined by referring to the data structure management information for that data held in the data structure management information holding unit.
- The data redundancy determination unit refers to the data structure management information holding unit and instructs the data redundancy unit to create a copy of the data related to the storage request and transmit it, so that the created copy is stored.
- The data structure operation determination unit refers to the data structure management information recorded in the data structure management information holding unit and transmits, to the data structure operation unit, an instruction to perform a data structure operation on the data stored in the corresponding data storage unit.
- The analysis processing unit executes the analysis processing, in accordance with the contents of the request, on either the data after the data structure operation or the data that has not been operated on, stored in one of the data storage units.
- Another aspect of the present invention is a method for controlling the information processing system.
- An information processing system, and a method for controlling the information processing system, are provided that can execute semi-structured data analysis at high speed while maintaining the fault tolerance of the data, without holding extra data in the information processing system.
- FIG. 1 is a diagram illustrating a configuration example of an information processing system 10 in the first embodiment.
- FIG. 2 is a diagram illustrating a configuration example of the computer definition file 40.
- FIG. 3 is a diagram illustrating a configuration example of the policy definition file 50.
- FIG. 4 is a diagram illustrating a configuration example of the analysis request setting screen 61.
- FIG. 5 is a diagram illustrating a configuration example of the data structure management table 70.
- FIG. 6 is a schematic diagram showing an example of data conversion processing in the present embodiment.
- FIG. 7 is an example of a sequence diagram illustrating processing when a file is copied to the information processing system 10.
- FIG. 8 is a flowchart for explaining a processing example of the data arrangement determining unit 100.
- FIG. 9 is a flowchart for explaining a processing example of the data redundancy determining unit 200.
- FIG. 10 is a flowchart for explaining a processing example of the data structure conversion determination unit 300.
- FIG. 11 is a flowchart for explaining a processing example of the data structure conversion unit 1200.
- FIG. 12 is a flowchart for explaining a processing example of the data structure reverse conversion unit 1300.
- FIG. 13 is a sequence diagram illustrating a processing example when an analysis request is received in the first embodiment.
- FIG. 14 is a flowchart for explaining a processing example of the analysis execution place determination unit 500.
- FIG. 15 is a sequence diagram illustrating a processing example when a failure occurs in the processing computer 21.
- FIG. 16 is a flowchart for explaining a processing example of the data restoration determination unit 600.
- FIG. 17 is a diagram illustrating a configuration example of the information processing system 10 according to the second embodiment.
- FIG. 18 is a diagram illustrating a configuration example of the data structure / statistical information management table 70A.
- FIG. 19 is a sequence diagram illustrating a processing example when an analysis request is received in the second embodiment.
- FIG. 20 is a flowchart for explaining a processing example of the statistical information recording unit 700.
- FIG. 21 is a sequence diagram illustrating a processing example when changing the retention ratio according to the second embodiment.
- FIG. 22 is a flowchart for explaining a processing example of the retention ratio change determination unit 800.
- FIG. 23 is a diagram illustrating a configuration example of the information processing system 10 according to the third embodiment.
- FIG. 24 is a diagram illustrating a configuration example of the conversion rule definition file 90.
- FIG. 25 is a sequence diagram for explaining a processing example when changing the retention ratio in the third embodiment.
- FIG. 26 is a flowchart for explaining a processing example of the load information notification unit 1500.
- FIG. 27 is a flowchart for explaining a processing example of the retention ratio change determination unit 800 in the third embodiment.
- FIG. 28 is a flowchart for explaining a processing example of the data structure conversion method determination unit 900.
- FIG. 1 is a diagram illustrating an overall configuration of an information processing system 10 according to a first embodiment of the present invention.
- The system 10 includes a management computer 20 and processing computers 1 to 3 (21-1 to 21-3).
- In FIG. 1, as an example of the present embodiment, the system 10 has three processing computers, but four or more may be provided.
- The management computer 20 and the processing computers 21-1 to 21-3 each include a central processing unit 30-0 to 30-3, a main storage device 31-0 to 31-3, a secondary storage device 32-0 to 32-3, a network interface 33-0 to 33-3, an input device 34-0 to 34-3, and an output device 35-0 to 35-3, which are connected to one another by buses 36-0 to 36-3.
- A storage request computer 24 and an analysis request computer 25 each have a central processing unit 30-4 to 30-5, a main storage device 31-4 to 31-5, a secondary storage device 32-4 to 32-5, a network interface 33-4 to 33-5, an input device 34-4 to 34-5, and an output device 35-4 to 35-5, which are connected to one another by buses 36-4 to 36-5.
- the central processing units 30-0 to 30-5 are, for example, a central processing unit (CPU) or a microprocessing unit (MPU).
- the main storage devices 31-0 to 31-5 are, for example, random access memory (RAM), read only memory (ROM), or the like.
- the secondary storage devices 32-0 to 32-5 are, for example, hard disk drives (HDD), semiconductor disks (Solid State Disk, SSD), and the like.
- the network interfaces 33-0 to 33-5 are, for example, Ethernet network interface cards (NIC).
- The management computer 20, the processing computers 21-1 to 21-3, the storage request computer 24, and the analysis request computer 25 are communicably connected to one another by the network 37 via the network interfaces 33-0 to 33-5 of the respective computers.
- the input devices 34-0 to 34-5 for using the present system 10 are configured by devices such as a keyboard and a mouse, for example.
- the output devices 35-0 to 35-5 are composed of devices such as a liquid crystal monitor, for example.
- The main storage device 31-0 of the management computer 20 stores a data arrangement determining unit 100, a data redundancy determining unit 200, a data structure conversion determining unit 300, an analysis request receiving unit 400, an analysis execution location determining unit 500, and a data restoration determination unit 600.
- These processing units are realized by the central processing unit 30-0 executing a software program corresponding to the function of each processing unit, but can also be realized as hardware.
- In the following, each processing unit, realized by the central processing unit 30-0 executing the corresponding program, is described as the subject that performs each process. When a processing unit is realized in hardware, that processing unit itself performs the process.
- a data structure management table 70 is stored in the main storage device 31-0 of the management computer 20.
- The data arrangement determining unit 100 determines to which processing computer 21 the storage request computer 24 should first transmit a file, and instructs the storage request computer 24 accordingly.
- the file that the storage request computer 24 first transmits to the system 10 is referred to as “original file”.
- The data redundancy determination unit 200 issues file redundancy instructions for the files stored in the processing computers 21 so that the file redundancy defined in the policy definition file 50 (see FIG. 3) is satisfied.
- For the files made redundant in the system 10 by the data redundancy determination unit 200, the data structure conversion determination unit 300 instructs data structure conversion processing or data structure reverse conversion processing on the redundant files so that the retention ratio of the data structures defined in the policy definition file 50 (see FIG. 3 for details) is satisfied. Details of the data structure conversion process and the data structure reverse conversion process will be described later.
- the analysis request reception unit 400 receives the analysis request transmitted from the analysis request computer 25 and performs a process of calling the analysis execution location determination unit 500.
- The analysis execution place determination unit 500 analyzes the analysis request from the analysis request computer 25 received by the analysis request reception unit 400, uses the settings made on the analysis request setting screen 61 (see FIG. 4) to determine the processing computer 21 that is to process the request, and instructs that computer to perform the analysis process.
- The data restoration determination unit 600 instructs each processing computer 21 to restore files as necessary in order to maintain the file redundancy defined in the policy definition file 50 (see FIG. 3).
- the secondary storage device 32-0 of the management computer 20 stores a computer definition file 40 (see FIG. 2), a policy definition file 50 (see FIG. 3), and an analysis request setting file 60. Details of each element will be described later.
- The main storage device 31-1 of the processing computer 1 (21-1) stores a data storage unit 1000, a data redundancy unit 1100, a data structure conversion unit 1200, a data structure reverse conversion unit 1300, and an analysis processing unit 1400.
- These processing units are realized by the central processing unit 30-1 executing a software program corresponding to the function of each processing unit, but can also be realized as hardware.
- In the following, each processing unit, realized by the central processing unit 30-1 executing the corresponding program, is described as the subject that performs each process. When a processing unit is realized in hardware, that processing unit itself performs the process.
- the data storage unit 1000 receives the file transmitted from the storage request computer 24 and executes processing for storing it in the secondary storage device 32-1 of the processing computer 1 (21-1).
- In response to an instruction issued by the data redundancy determining unit 200 of the management computer 20, the data redundancy unit 1100 transmits the file stored in the secondary storage device 32-1 of the processing computer 1 (21-1) to another processing computer.
- the data structure conversion unit 1200 and the data structure reverse conversion unit 1300 execute a process of converting or reverse converting the data structure of the file stored in the secondary storage device 32-1 of the processing computer 1 (21-1).
- the data structure conversion unit 1200 receives a file name as input, and performs processing for converting the data structure of the file into the data structure of the converted file.
- a file converted and output by the data structure conversion unit 1200 is referred to as a “converted file”.
- the data structure reverse conversion unit 1300 receives the converted file and outputs an original file obtained by reverse conversion of the data structure. Specific examples of conversion and inverse conversion will be described later.
- For CSV (Comma-Separated Values) files, the data structure conversion unit 1200 performs transposition processing from row data to column data in the CSV file.
- The data structure reverse conversion unit 1300 performs transposition processing from column data back to row data in the CSV file.
- The analysis processing unit 1400 receives an instruction issued by the analysis execution location determination unit 500 of the management computer 20, parses the query included in the analysis request, executes the analysis on the data in the file stored in the secondary storage device 32-1 of the processing computer 1 (21-1), and returns the analysis result to the analysis request computer 25.
- A language similar to SQL (Structured Query Language) is used as the query language, but any data analysis language can be applied to the present invention.
- the analysis processing unit 1400 executes a process of analyzing a query used in the present system 10 and returning a result with respect to the CSV file stored in the secondary storage device 32-1.
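To illustrate why a column-oriented aggregate can benefit from the transposed layout, the following is a minimal Python sketch. It is not taken from the patent: the sample data, the function names `sum_column_original` and `sum_column_converted`, and the use of Python's `csv` module are assumptions made for illustration only. Summing one column of a row-oriented CSV has to visit every line, while the same sum over a transposed CSV needs only the one line that holds that column.

```python
import csv
import io

# Hypothetical sample data: time stamp, name, value (one record per line).
ORIGINAL_CSV = "t1,a,10\nt2,b,20\nt3,c,30\n"
# The same data with rows and columns transposed (one column per line).
CONVERTED_CSV = "t1,t2,t3\na,b,c\n10,20,30\n"


def sum_column_original(text: str, col: int) -> float:
    """Sum column `col` of a row-oriented CSV; every line is visited."""
    return sum(float(row[col]) for row in csv.reader(io.StringIO(text)))


def sum_column_converted(text: str, col: int) -> float:
    """Sum column `col` of a transposed CSV; only line `col` is summed."""
    for index, row in enumerate(csv.reader(io.StringIO(text))):
        if index == col:
            return sum(float(value) for value in row)
    raise IndexError(f"column {col} not found")
```

Both functions return the same total for the same logical column, so an analysis processing unit could in principle answer the query from whichever layout is stored locally.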
- the secondary storage devices 32-1 to 32-3 of the processing computers 21-1 to 21-3 store files 80-1 to 80-3 managed by the system 10.
- the files 80-1 to 80-3 managed by the system 10 are stored as either the original file or the converted file whose data structure has been converted. Note that the number of files stored in each of the processing computers 21-1 to 21-3 is not limited to one, and there may be a plurality of files.
- FIG. 2 shows a configuration example of the computer definition file 40.
- the computer definition file 40 is stored in the secondary storage device 32-0 of the management computer 20.
- In the computer definition file 40, the processing computers managed by the management computer 20 are designated (entries 41 to 43).
- In this example the host names (processing computer 1 to 3) of the processing computers 21-1 to 21-3 are used to designate them, but any other method that can uniquely identify each of the processing computers 21-1 to 21-3, such as an IP address, can also be used.
- FIG. 3 shows a configuration example of the policy definition file 50.
- This policy definition file 50 is stored in the secondary storage device 32-0 of the management computer 20.
- The policy definition file 50 defines a file redundancy 51, which indicates how many copies of each file are retained in the system for redundancy.
- It also defines a retention ratio 52 of original files to converted files, which determines the proportion in which each data structure is retained. For example, if the file redundancy is defined as 3 and the retention ratio of original file to converted file is defined as 2:1, the management computer 20 refers to the policy definition file 50 and, for each file stored in the system 10, distributes a total of three files, two original files before conversion and one converted file after conversion, across the processing computers 21-1 to 21-3.
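The arithmetic behind the example (redundancy 3 with a 2:1 ratio giving two original files and one converted file) can be sketched in Python as follows. The function name `copies_per_structure` is hypothetical, and rejecting a redundancy that the ratio does not divide evenly is an assumption of this sketch; the text here does not say how such a case is handled.

```python
from typing import Tuple


def copies_per_structure(redundancy: int,
                         ratio_original: int,
                         ratio_converted: int) -> Tuple[int, int]:
    """Return (number of original files, number of converted files).

    Assumption: the redundancy must split evenly according to the ratio;
    otherwise a ValueError is raised.
    """
    if redundancy < 1 or ratio_original < 0 or ratio_converted < 0:
        raise ValueError("redundancy must be >= 1 and ratios non-negative")
    total = ratio_original + ratio_converted
    if total == 0 or (redundancy * ratio_original) % total != 0:
        raise ValueError("redundancy cannot be split evenly by this ratio")
    originals = redundancy * ratio_original // total
    return originals, redundancy - originals
```

For the values in the example, `copies_per_structure(3, 2, 1)` yields two original files and one converted file.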
- FIG. 4 shows a screen example of an analysis request setting screen 61 for editing the analysis request setting file 60 stored in the secondary storage device 32-0 of the management computer 20.
- Input / output of this screen is executed via the input device 34-0 and output device 35-0 of the management computer 20.
- Input / output processing for this screen can also be performed via the input device and output device of another computer.
- For each query that can be specified when an analysis request is issued from the analysis request computer 25 to the system 10, a user of the system 10 can explicitly specify whether the converted file is to be used.
- In the example of FIG. 4, the query 62, which calculates the total of the values in a specific column, and the query 63, which calculates the average of a specific column, are defined to use the converted file for analysis of any file.
- The query 64, which retrieves the values of specific columns, is defined to use the converted file for analysis of any file when the number of columns is 1 or more and 4 or less.
- In other cases, the original file is used.
- the user can freely add or delete a query to the list 65 for registering the query using the converted file by using the add button 66 or the delete button 67.
- the queries 62 to 64 defined on the screen are merely examples, and the present invention is not limited to these. In this example, an SQL-based query language is used, but other general query languages can also be used.
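The registered-query list described above can be modeled as a small rule table. The following Python sketch is illustrative only: the class `ConvertedFileRule`, the query kinds `"sum"`, `"avg"` and `"select"`, and the function `use_converted_file` are hypothetical names, and the actual format of the analysis request setting file 60 is not given in this text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConvertedFileRule:
    """One registered entry: a query kind and an optional column-count range."""
    query_kind: str
    min_columns: Optional[int] = None
    max_columns: Optional[int] = None


# Hypothetical rules mirroring queries 62 to 64 in the example.
RULES: List[ConvertedFileRule] = [
    ConvertedFileRule("sum"),           # total of a specific column
    ConvertedFileRule("avg"),           # average of a specific column
    ConvertedFileRule("select", 1, 4),  # retrieval of 1 to 4 columns
]


def use_converted_file(query_kind: str, num_columns: int,
                       rules: Optional[List[ConvertedFileRule]] = None) -> bool:
    """Return True if a registered rule says to use the converted file."""
    for rule in RULES if rules is None else rules:
        if rule.query_kind != query_kind:
            continue
        if rule.min_columns is not None and num_columns < rule.min_columns:
            continue
        if rule.max_columns is not None and num_columns > rule.max_columns:
            continue
        return True
    # No rule matched: fall back to the original file.
    return False
```

Under these assumed rules, a retrieval of five columns or an unregistered query kind falls back to the original file.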
- FIG. 5 is a diagram illustrating a configuration example of the data structure management table 70.
- the data structure management table 70 is stored in the main storage device 31-0 of the management computer 20.
- Columns 71-1 to 71-3 of the data structure management table 70 show, for each of the processing computers 21-1 to 21-3, the files held in its secondary storage device 32-1 to 32-3 and their formats.
- Rows 72-A to 72-C of the data structure management table 70 show, for each file, which of the processing computers 21-1 to 21-3 hold it and in which storage format.
- In this example, the holding state of the files placed on the processing computers 21-1 to 21-3 is shown for three processing computers 21-1 to 21-3 and three types of files.
- “0” indicates that the original file before conversion is held.
- “1” indicates that the converted file after conversion is held. If a processing computer 21-1 to 21-3 does not hold the file, this is indicated by “-1”.
- For example, column 71-1 shows that the processing computer 1 stores, in its secondary storage device 32-1, the original files before conversion for files A and C and the converted file after conversion for file B.
- Row 72-A shows that, for file A, the original file before conversion is stored by the processing computer 1 and the processing computer 2 and the converted file after conversion is stored by the processing computer 3, in their respective secondary storage devices 32-1 to 32-3.
- The data structure management table 70 represents the files held by the processing computers 21-1 to 21-3 and their data structures, and its items 73-A1 to 73-C3 can be updated for all of the processing computers 21-1 to 21-3.
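As a rough illustration, the table can be held as a nested mapping from file name to processing computer to a format code. In the Python sketch below, row A and the entries for processing computer 1 follow the description above; the remaining entries for files B and C, and the helper names `holders` and `files_on`, are hypothetical and chosen only so that each file has three copies.

```python
from typing import Dict, List

ORIGINAL, CONVERTED, NOT_HELD = 0, 1, -1

# File name -> processing computer -> format code.
TABLE: Dict[str, Dict[str, int]] = {
    "A": {"processing computer 1": ORIGINAL,
          "processing computer 2": ORIGINAL,
          "processing computer 3": CONVERTED},
    "B": {"processing computer 1": CONVERTED,
          "processing computer 2": ORIGINAL,   # hypothetical entry
          "processing computer 3": ORIGINAL},  # hypothetical entry
    "C": {"processing computer 1": ORIGINAL,
          "processing computer 2": CONVERTED,  # hypothetical entry
          "processing computer 3": ORIGINAL},  # hypothetical entry
}


def holders(table: Dict[str, Dict[str, int]], file_name: str,
            structure: int) -> List[str]:
    """Row lookup: which computers hold `file_name` in the given format."""
    return [host for host, code in table[file_name].items()
            if code == structure]


def files_on(table: Dict[str, Dict[str, int]], host: str) -> Dict[str, int]:
    """Column lookup: files held by `host` and their format codes."""
    return {name: row[host] for name, row in table.items()
            if row.get(host, NOT_HELD) != NOT_HELD}
```

A row lookup of this kind is what a component choosing where to run an analysis would need, and a column lookup is what would be needed to see which files are affected when one processing computer fails.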
- FIG. 6 shows an example of a file assumed to be processed in the present embodiment and its format.
- The original file 81 transmitted from the storage request computer 24 to the system 10 is a CSV file in which one line is treated as one record and the values of each record are separated by commas.
- the data structure conversion unit 1200 executes a data structure conversion process 83 that is a conversion process for transposing and storing the rows and columns of the CSV file.
- the converted file 82 in which the rows and columns of the original file are transposed by the data structure conversion unit 1200 is also a CSV file.
- The data structure reverse conversion unit 1300 executes a data structure reverse conversion process 84, a conversion process that transposes the rows and columns again for the converted file 82, whose rows and columns were transposed by the data structure conversion unit 1200, and stores the result.
- That is, after the data structure conversion unit 1200 applies the data structure conversion process 83 to the original file 81 and obtains the converted file 82, applying the data structure reverse conversion process 84 of the data structure reverse conversion unit 1300 yields the original file 81 again. Details of the data structure conversion process 83 and the data structure reverse conversion process 84 will be described later.
- In the present embodiment, CSV files are used as the original file and the converted file, and the data structure conversion process and the data structure reverse conversion process are described using a processing example in which the rows and columns of the CSV file are transposed. Other conversion / reverse conversion combinations can also be applied, for example an uncompressed file and a compressed file as the original file and the converted file, with compression processing as the data structure conversion process and expansion processing as the data structure reverse conversion process, or a decrypted file and an encrypted file as the original file and the converted file, with encryption processing as the data structure conversion process and decryption processing as the data structure reverse conversion process.
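The row/column transposition described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `transpose_csv` is hypothetical, and the sketch assumes that every record has the same number of fields. Because transposition is its own inverse, the same function serves as both the conversion and the reverse conversion in this sketch.

```python
import csv
import io


def transpose_csv(text: str) -> str:
    """Transpose the rows and columns of CSV text.

    Assumption: all records have the same number of fields.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all records must have the same number of fields")
    out = io.StringIO()
    # zip(*rows) yields one tuple per column of the input.
    csv.writer(out, lineterminator="\n").writerows(zip(*rows))
    return out.getvalue()


original = "t1,a,10\nt2,b,20\n"
converted = transpose_csv(original)   # one line per column
restored = transpose_csv(converted)   # applying it again restores the rows
```

Here `converted` is `"t1,t2\na,b\n10,20\n"` and `restored` equals `original`.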
- FIG. 7 is an example of a sequence diagram for explaining processing when a file is copied from the storage request computer 24 to the system 10.
- When copying a file to the system 10, the storage request computer 24 first issues a file storage request to the management computer 20 (S2001).
- The data arrangement determining unit 100 of the management computer 20 receives the file storage request (S2001) from the storage request computer 24, executes the data arrangement determination process (S2002), and notifies the storage request computer 24 of the host name of the processing computer 21 that is to be the file storage destination (S2003).
- the data arrangement determining unit 100 selects the processing computer 1 (21-1) as the file storage destination processing computer 21. Details of the data arrangement determining process (S2002) executed by the data arrangement determining unit 100 will be described later.
- the storage request computer 24 receives the contents of the storage destination computer instruction (S2003) from the management computer 20, and transmits the file to the processing computer 1 (21-1) (S2004).
- The processing computer 1 (21-1) receives the file transmitted by the storage request computer 24 and, through the data storage process (S2005) executed by the data storage unit 1000, stores the file in the secondary storage device 32-1 of the processing computer 1 (21-1).
- the file stored at this time is hereinafter identified as the “original file”.
- the processing computer 1 (21-1) transmits a storage completion notification (S2006 to S2007) to the management computer 20 and the storage request computer 24.
- After receiving the storage completion notification (S2007) from the processing computer 1 (21-1), the management computer 20 instructs that the original file be made redundant (S2009) through the data redundancy determination processing S2008 executed by the data redundancy determining unit 200. Details of the data redundancy determination processing S2008 will be described later.
- A data redundancy instruction S2009 including the host name of the processing computer 21 that is the data redundancy destination is transmitted to the processing computer 1 (21-1), and the processing computer 1 (21-1) executes the data redundancy processing S2010.
- The data redundancy processing S2010 copies the instructed file to another processing computer 21.
- The data redundancy instruction specifies the processing computer 21 that is the transmission destination of the original file.
- In this example, the original file stored in the secondary storage device 32-1 of the processing computer 1 (21-1) is transmitted to the processing computer 2 (21-2) and the processing computer 3 (21-3).
- After receiving the original file transmitted from the processing computer 1 (21-1), the processing computer 2 (21-2) and the processing computer 3 (21-3) each perform data storage processing (S2013 to S2014) and store the original file in their secondary storage devices 32-2 to 32-3.
- the processing computer 2 (21-2) and the processing computer 3 (21-3) transmit storage completion notifications S2015 to S2016 to the management computer 20.
- After receiving the storage completion notifications S2015 to S2016 from the processing computer 2 (21-2) and the processing computer 3 (21-3), the management computer 20 issues an instruction to convert the data structure through the data structure conversion determination processing S2017. Details of the data structure conversion determination processing S2017 will be described later.
- The data structure conversion determination processing S2017 specifies the processing computer 21 that is to execute the data structure conversion process.
- In this example, a data structure conversion instruction S2018 including the file name of the file to be subjected to data structure conversion processing is transmitted to the processing computer 3 (21-3).
- the processing computer 3 (21-3) executes the data structure conversion process S2019.
- the data structure conversion processing S2019 converts the data structure of the file stored in the secondary storage device 32-3 of the processing computer 3 (21-3) based on the file name included in the data structure conversion instruction S2018. The original file is replaced with the converted file. Details of the data structure conversion processing S2019 will be described later.
- the processing computer 3 (21-3) transmits a conversion completion notification S2020 to the management computer 20. Thus, the processing when the storage request computer 24 copies a file to the system 10 is completed.
- FIG. 8 is a flowchart illustrating an example of the data arrangement determination process S2002.
- the data arrangement determining unit 100 receives a file saving request from the storage request computer 24 (S101). This file saving request corresponds to the file saving request S2001 in FIG.
- Next, a list of processing computers 21 is read from the computer definition file 40 shown in FIG. 2 (S102).
- From the list of processing computers 21 that has been read, the data arrangement determining unit 100 randomly selects one computer to which the storage request computer 24 is to transmit the file first (S103).
- the computer selected here corresponds to the processing computer 1 (21-1) in FIG.
- the data arrangement determining unit 100 transmits a storage destination computer instruction to the storage request computer 24 (S104).
- the storage destination computer instruction here corresponds to the storage destination computer instruction S2003 in FIG.
- the data arrangement determination process S2002 executed by the data arrangement determination unit 100 ends (S105).
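- The random selection in S102 to S104 can be illustrated with a minimal Python sketch. The function name, the host names, and the representation of the computer definition file 40 as a plain list are assumptions made for illustration only and do not appear in the specification.

```python
import random


def determine_storage_destination(processing_computers, rng=random):
    """Pick one processing computer at random (corresponds to S103).

    `processing_computers` stands in for the list read from the computer
    definition file 40 in S102; its format here is an assumption.
    """
    if not processing_computers:
        raise ValueError("the computer definition list is empty")
    return rng.choice(processing_computers)


if __name__ == "__main__":
    hosts = ["proc1", "proc2", "proc3"]  # hypothetical host names
    # The returned host name would be sent back to the requester (S104).
    print(determine_storage_destination(hosts))
```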
- FIG. 9 is a flowchart illustrating an example of the data redundancy determination process S2008.
- The data redundancy determination unit 200 receives the storage completion notification transmitted from the processing computer 21 selected by the data arrangement determination unit 100 (S201).
- the storage completion notification here corresponds to the storage completion notification S2007 in FIG.
- the redundancy 51 of the file to be saved is read from the policy definition file 50 shown in FIG. 3 (S202).
- FIG. 10 is a flowchart showing an example of the data structure conversion determination process S2017.
- In the data structure conversion determination process S2017, it is determined, for the files made redundant by the data redundancy determination process S2008 and the data redundancy process S2010, whether data structure conversion is required according to the definitions in the policy definition file 50 shown in FIG. 3, and the conversion is instructed where required.
- the process will be described with reference to FIG.
- When the data structure conversion determination unit 300 starts the process in S300, it receives all of the storage completion notifications returned by the data storage processes S2013 to S2014, which each processing computer 21 executes as a result of the data redundancy determination process S2008 and the data redundancy process S2010 (S301).
- the storage completion notification here corresponds to the storage completion notifications S2015 to S2016 in FIG.
- the data structure conversion determination unit 300 reads the file redundancy 51 and the original file / converted file retention ratio 52 from the policy definition file 50 shown in FIG. 3 (S302).
- The data structure conversion determination unit 300 calculates the target retention number of the converted file from the redundancy 51 of the file read in S302 and the retention ratio 52 of the original file / converted file (S303). Specifically, the target retention number can be calculated as a × c / (b + c), with the fractional part rounded down, where the file redundancy is a and the retention ratio (original file : converted file) is b:c.
- The data structure conversion determination unit 300 refers to the data structure management table 70 illustrated in FIG. 5 and repeats S304 to S309 until the current number of converted files held matches the target retention number calculated in S303.
- When the current number of converted files held is smaller than the target retention number, the data structure conversion determination unit 300 executes S306: it selects one processing computer 21 that stores the original file and instructs conversion of the data structure (S306).
- This data structure conversion instruction includes the name of the file to be converted.
- When the current number of converted files held is larger than the target retention number, the data structure conversion determination unit 300 executes S307: it selects one processing computer 21 that stores the converted file and instructs reverse conversion of the data structure (S307).
- This data structure reverse conversion instruction includes the name of the file to be reverse converted.
- The data structure conversion determination unit 300 then updates the data structure management table 70 (S308) and returns to S304 until the current number of converted files held matches the target retention number calculated in S303.
- For example, when the redundancy 51 of the file to be stored recorded in the policy definition file 50 is 3 and the retention ratio 52 of the original file / converted file is 2:1, at the time of the first S305 the number of retained original files is 3 and the number of retained converted files is 0.
- The target retention number of the converted file obtained in S303 is 1. Therefore, by executing S306 and converting one original file to the converted file, the current number of converted files held becomes 1, which coincides with the target retention number, so the processing ends (S310).
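- The target calculation of S303 and the loop of S304 to S309 can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary used in place of the data structure management table 70, the host names, and the rule of picking the first host in sorted order (the text does not say how the computer is chosen in the first embodiment) are all hypothetical.

```python
def target_converted_count(redundancy, ratio_original, ratio_converted):
    """S303: a * c / (b + c), fractional part rounded down."""
    return (redundancy * ratio_converted) // (ratio_original + ratio_converted)


def plan_conversions(structures, target):
    """Return (instructions, final_state) for one file.

    `structures` maps a host name to "original" or "converted" and stands in
    for one row of the data structure management table 70 (assumed format).
    """
    state = dict(structures)
    if not 0 <= target <= len(state):
        raise ValueError("target must be between 0 and the number of copies")
    instructions = []
    while True:
        converted = sorted(h for h, s in state.items() if s == "converted")
        originals = sorted(h for h, s in state.items() if s == "original")
        if len(converted) == target:        # loop end condition
            return instructions, state
        if len(converted) < target:         # too few converted files: S306
            host = originals[0]             # selection rule is an assumption
            state[host] = "converted"
            instructions.append((host, "convert"))
        else:                               # too many converted files: S307
            host = converted[0]
            state[host] = "original"
            instructions.append((host, "reverse_convert"))
        # Updating `state` plays the role of the table update in S308.


if __name__ == "__main__":
    target = target_converted_count(3, 2, 1)
    row = {"pc1": "original", "pc2": "original", "pc3": "original"}
    print(target, plan_conversions(row, target))
```

- With redundancy 3 and retention ratio 2:1 the sketch yields a target of one converted file and a single conversion instruction, which matches the worked example in the text.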
- FIG. 11 is a flowchart illustrating an example of the data structure conversion process S2019.
- In the data structure conversion process S2019, a converted file in which rows and columns are transposed is created from the CSV-format original file shown in FIG. 6 and saved.
- the processing will be described with reference to FIG.
- the data structure conversion unit 1200 receives a data structure conversion instruction from the management computer 20 (S1201).
- the data structure conversion instruction here corresponds to S2018 in FIG.
- the data structure conversion instruction includes the file name of the file to be converted. This file is called the original file, and the file generated by the data structure conversion process is called the converted file.
- The data structure conversion unit 1200 opens the original file and creates an empty converted file (S1202). Next, the data structure conversion unit 1200 repeats the processing S1203 to S1205, which performs transposition for each column of the original file. In this processing, the values in one column are read in order, converted into one comma-delimited CSV line in the order of reading, and appended to the converted file (S1204). After executing the processing from S1203 to S1205 for every column, the unit closes the original file and the converted file (S1206), replaces the original file with the converted file (S1207), and terminates the processing (S1208). As described with reference to FIG. 6, transposing the rows and columns of a CSV file as the data structure conversion process is only an example in the present embodiment, and the present invention is not limited to this.
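- The column-by-column transposition of S1203 to S1205 can be illustrated with a short Python sketch that works on CSV text held in memory. The function name is hypothetical, and file opening, closing, and replacement (S1202, S1206, S1207) are left out for brevity; a rectangular CSV is assumed.

```python
import csv
import io


def transpose_csv(text):
    """Return CSV text whose rows are the columns of the input CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of columns")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for col in range(width):                            # loop S1203 to S1205
        writer.writerow([row[col] for row in rows])     # S1204: one column -> one line
    return out.getvalue()


if __name__ == "__main__":
    original = "id,price,qty\n1,100,3\n2,250,1\n"
    converted = transpose_csv(original)
    print(converted)
    # Transposing the converted text again gives back the original text,
    # which is the idea behind the reverse conversion process.
    print(transpose_csv(converted) == original)
```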
- FIG. 12 is a flowchart showing an example of the data structure reverse conversion process.
- The data structure reverse conversion process is a process for creating and storing an original file by transposing the columns and rows of the CSV-format converted file.
- the processing will be described with reference to FIG.
- the data structure reverse conversion unit 1300 receives a data structure reverse conversion instruction from the management computer 20 (S1301).
- The data structure reverse conversion instruction includes the file name of the file to be reverse converted. This file is called the converted file, and the file generated by the data structure reverse conversion process is called the original file.
- the data structure reverse conversion unit 1300 opens the converted file and creates an empty original file (S1302).
- Next, the data structure reverse conversion unit 1300 repeats the processing S1303 to S1305, which performs transposition for each column of the converted file.
- In this processing, the values in one column are read in order, converted into one comma-delimited CSV line in the order of reading, and appended to the original file (S1304).
- The data structure reverse conversion unit 1300 then closes the converted file and the original file (S1306), replaces the converted file with the original file (S1307), and ends the processing (S1308).
- the process of transposing columns and rows in the CSV format as the data structure reverse conversion process is an example in the present embodiment, and the present invention is not limited to this.
- FIG. 13 is an example of a sequence diagram when the system 10 receives an analysis request from the analysis request computer 25.
- First, the analysis request computer 25 transmits an analysis request S2101 including a query to the management computer 20.
- The management computer 20 receives the analysis request from the analysis request computer 25 through the analysis request reception process S2102 and determines the computer that performs the analysis through the analysis execution location determination process S2103. Details of the analysis execution location determination processing S2103 will be described later.
- The management computer 20 transmits an analysis instruction S2104 to the processing computer 21 determined in the analysis execution location determination processing S2103.
- In this example, the analysis execution location determination process S2103 selects the processing computer 3 (21-3), but the present invention is not limited to this.
- On receiving the analysis instruction S2104, the processing computer 3 (21-3) performs the analysis in the analysis processing S2105 and transmits the analysis result S2106 to the analysis request computer 25.
- In this way, the analysis request computer 25 obtains the analysis result S2106 for the analysis request S2101.
- FIG. 14 is a flowchart illustrating an example of the analysis execution location determination process S2103.
- When the analysis execution location determination unit 500 starts processing in S500, it reads the list of queries that use the converted file from the analysis request setting file 60, which can be edited on the analysis request setting screen 61 (S501).
- Next, it determines whether the query received in the analysis request reception process S2102 is included in the list of queries using the converted file read in S501 (S502). If it is determined that the query is included in the list (S502, included), the analysis execution location determination unit 500 proceeds to the process of S503. If it is determined that the query is not included in the list (S502, not included), the process proceeds to S504.
- In S503, the analysis execution location determination unit 500 selects one processing computer 21 in which the converted file is stored from the data structure management table 70 shown in FIG. 5 (S503).
- In S504, the analysis execution location determination unit 500 selects one processing computer 21 in which the original file is stored from the data structure management table 70 shown in FIG. 5 (S504). If there are a plurality of candidate computers in either S503 or S504, one computer is selected at random from among them.
- The analysis execution location determination unit 500 instructs the computer selected in either S503 or S504 to perform analysis processing (S505) and ends the processing (S506).
- the analysis instruction here corresponds to S2104 in FIG.
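- The branching of S501 to S505 can be sketched in Python as follows. The query names, the set standing in for the list read from the analysis request setting file 60, and the dictionary standing in for one row of the data structure management table 70 are assumptions for illustration.

```python
import random


def choose_analysis_host(query, converted_queries, placements, rng=random):
    """Select the processing computer that should run `query`.

    converted_queries: queries registered as using the converted file (S501).
    placements: host name -> "original" or "converted" for the target file.
    """
    # S502: is the received query in the list of converted-file queries?
    wanted = "converted" if query in converted_queries else "original"
    # S503 / S504: collect the hosts holding the file in the wanted structure.
    candidates = [h for h, s in placements.items() if s == wanted]
    if not candidates:
        raise LookupError(f"no processing computer holds the {wanted} file")
    # Several candidates: choose one at random, as the text describes.
    return rng.choice(candidates)


if __name__ == "__main__":
    placements = {"pc1": "original", "pc2": "original", "pc3": "converted"}
    converted_queries = {"column_sum"}  # hypothetical query name
    print(choose_analysis_host("column_sum", converted_queries, placements))
    print(choose_analysis_host("row_lookup", converted_queries, placements))
```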
- FIG. 15 is an example of a sequence diagram showing file restoration processing when a failure occurs in the processing computer 21 in the system 10.
- This sequence diagram shows an example in which, when a failure occurs in the processing computer 2 (21-2), a file is restored to the processing computer 2′ (21-2′) in order to guarantee file redundancy.
- the processing will be described with reference to FIG.
- the management computer 20 periodically executes a failure detection process S2201 to check whether a failure has occurred in each processing computer 21 described in the computer definition file 40 shown in FIG.
- the processing computer 1 (21-1)
- the processing computer 2 (21-2)
- the processing computer 3 (21-3)
- the processing computer 2 ′ (21-2 ′)
- The failure detection process S2201 can be realized by existing methods such as the following, and a detailed description is therefore omitted here.
- Examples include alive monitoring based on whether network communication to each computer is possible, and alive monitoring in which the management computer 20 refers to management information collected and notified by each processing computer.
- When the management computer 20 detects a failure of a processing computer 21 through the failure detection process S2201, there is a high possibility that the redundancy of the files stored in the system 10 has been lost, so restoration processing needs to be executed. In order to determine whether restoration processing is necessary, the management computer 20 executes the data restoration determination processing S2202.
- In the data restoration determination process S2202 executed by the data restoration determination unit 600, it is confirmed, for each file stored in the faulty processing computer 21 detected in the failure detection process S2201, whether redundancy has been lost. If the redundancy has been lost, the file is restored to another processing computer 21.
- In this example, the data restoration determination unit 600 finds that the file lost due to the failure is held in the processing computer 3 (21-3).
- The file is therefore transmitted from the processing computer 3 (21-3) to the processing computer 2′ (21-2′).
- The specific procedure for restoring the file consists of a data redundancy processing instruction together with a data structure conversion instruction or a data structure reverse conversion instruction. Details will be described later.
- FIG. 16 is a flowchart illustrating an example of the data restoration determination process S2202.
- In the data restoration determination process S2202, in addition to instructing restoration, processing for instructing the data structure conversion or reverse conversion is also executed.
- the processing will be described with reference to FIG.
- The data restoration determination unit 600 refers to the data structure management table 70 shown in FIG. 5 and acquires all the information on the files stored in the processing computer 21 in which the failure has occurred (S601). Next, the processing from S602 to S611 is repeated for all files for which information has been acquired. In the loop from S602 to S611, it is determined for each file whether the file redundancy 51 defined in FIG. 3 is satisfied (S603). If it is determined that the redundancy 51 is satisfied (S603, satisfied), the data restoration determination unit 600 does not perform the subsequent processing and returns to the processing of S602. On the other hand, if it is determined that the redundancy 51 is not satisfied (S603, not satisfied), the data restoration determination unit 600 proceeds to the process of S604.
- The data restoration determination unit 600 selects one computer as the restoration source computer from among the computers that hold the file to be restored, regardless of whether they hold it as an original file or a converted file (S604). When there are a plurality of candidate computers, one is selected at random. Further, for the file to be restored, one computer is selected as the restoration destination computer from among the computers that hold neither the original file nor the converted file (S605). Again, when there are a plurality of candidate computers, one is selected at random from among them.
- the data restoration determination unit 600 instructs the restoration source computer selected in S604 to perform data redundancy processing of the restoration target file to the restoration destination computer selected in S605 (S606).
- the data redundancy instruction here corresponds to the data redundancy instruction S2203 in FIG. Through the processing so far, the redundancy 51 of the file to be restored can be restored to the state before the failure.
- Next, the data structure of the file that was stored in the computer in which the failure has occurred is compared with the data structure of the file copied in steps S604 to S606 (S607).
- If the two data structures are the same, the data restoration determination unit 600 proceeds to the process of S611.
- If the two data structures differ, the data restoration determination unit 600 proceeds to the processing of S608 and confirms the data structure of the restoration source file selected in S604 (S608).
- If it is determined that the file is an original file (S608, original file), the data restoration determination unit 600 proceeds to the processing of S609 and instructs the restoration destination computer selected in S605 to perform data structure conversion processing (S609). On the other hand, if it is determined that the file is a converted file (S608, converted file), the data restoration determination unit 600 instructs the restoration destination computer selected in S605 to perform data structure reverse conversion processing (S610). Through the processing up to this point, the data structure retention ratio 52 of the file to be restored can be restored to the state before the failure.
- If a processing computer 21 that stores the restoration target data with the same data structure as that held in the processing computer 21 in which the failure has occurred is selected as the restoration source computer, the data conversion process at the restoration destination can be omitted.
- Accordingly, the management computer 20 may refer to the data structure management table 70 and select, as the restoration source, a processing computer 21 that holds the data to be restored with the same data structure as that of the data stored in the processing computer 21 where the failure has occurred.
- The data restoration determination unit 600 repeats S602 to S611 for all the files stored in the computer in which the failure has occurred, then updates the data structure management table 70 with the latest information (S612) and ends the processing (S613).
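- The restoration decisions of S601 to S611 can be sketched in Python as follows. The table layout, host names, and instruction tuples are assumptions made for illustration. The sketch uses the optional variant described above, preferring a restoration source with the same data structure as the lost copy, and it takes the first candidate in sorted order where the text selects at random.

```python
def plan_restoration(table, failed, all_hosts, redundancy):
    """Return a list of (file, steps) needed after `failed` goes down.

    table: file name -> {host: "original" | "converted"} (assumed stand-in
    for the data structure management table 70).
    """
    plan = []
    for name, holders in table.items():              # loop S602 to S611
        if failed not in holders:
            continue
        lost = holders[failed]
        alive = {h: s for h, s in holders.items() if h != failed}
        if len(alive) >= redundancy:                 # S603: redundancy still met
            continue
        if not alive:
            raise LookupError(f"no surviving copy of {name}")
        same = sorted(h for h, s in alive.items() if s == lost)
        source = same[0] if same else sorted(alive)[0]          # S604
        free = sorted(h for h in all_hosts if h != failed and h not in alive)
        if not free:
            raise LookupError(f"no restoration destination for {name}")
        dest = free[0]                                           # S605
        steps = [("copy", source, dest)]                         # S606
        if alive[source] != lost:                                # S607
            if alive[source] == "original":                      # S608
                steps.append(("convert", dest))                  # S609
            else:
                steps.append(("reverse_convert", dest))          # S610
        plan.append((name, steps))
    return plan


if __name__ == "__main__":
    table = {"A": {"pc1": "original", "pc2": "original", "pc3": "converted"}}
    hosts = ["pc1", "pc2", "pc3", "pc2x"]  # "pc2x" is a hypothetical spare host
    print(plan_restoration(table, "pc3", hosts, redundancy=3))
```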
- In the first embodiment, the retention ratio 52 of the original file / converted file is statically defined in the policy definition file 50 shown in FIG. 3, and the retention ratio 52 could not be changed dynamically during system operation.
- As a result, analysis requests are processed depending on the initial definition of the retention ratio 52 of the original file and the converted file.
- Consequently, the load may be biased toward a specific processing computer 21, making it difficult to improve the performance of the entire system 10. Therefore, in the present embodiment, statistical information related to analysis requests is recorded, and the data structure retention ratio is changed dynamically based on the recorded statistical information.
- FIG. 17 shows a configuration example of the system 10 according to the second embodiment.
- a statistical information recording unit 700 and a holding ratio change determination unit 800 are added to the management computer 20.
- A data structure / statistical information management table 70A, which is an extended version of the data structure management table 70, is also added.
- FIG. 18 shows a configuration example of the data structure / statistical information management table 70A.
- In the table 70A, fields (74-A to 74-C) for recording the current retention ratio of the original file / converted file for each file and fields (75-A to 75-C) for recording the number of references to each data structure in analysis processing are added to the data structure management table 70 shown in FIG. 5.
- For example, the column 71-1 indicates that the processing computer 1 holds the original file before conversion for the files A and C and the converted file after conversion for the file B.
- The row 72-A indicates that the processing computers 1 and 2 hold the file A as the original file before conversion, and the processing computer 3 holds it as the converted file after conversion.
- The column 74, consisting of the fields 74-A to 74-C for recording the retention ratio, represents the retention ratio of the original file and the converted file for each file.
- For example, 2:1 is recorded in the cell 74-A.
- The column 75, consisting of the fields 75-A to 75-C for recording the number of references, indicates the number of references to the original file and the number of references to the converted file for each file.
- For example, 2:6 is recorded in the cell 75-A to indicate that file A has been referenced twice in total as the original file and six times in total as the converted file.
- The reference number 75 is recorded by accumulating, for each data structure, the number of references during the operation period of the information processing system 10, but other recording modes, such as recording the number of references per most recent unit time, may be adopted.
- the updating method of each item shown in the cells 73-A1 to 73-C3 is the same as the method shown in FIG.
- To update the fields 74-A to 74-C for recording the holding ratio, the number of original files and the number of converted files are counted for each corresponding row and recorded in the corresponding field among 74-A to 74-C in the format (number of original files):(number of converted files).
- a method of updating the fields 75-A to 75-C for recording the reference number will be described later with reference to FIG.
- FIG. 19 shows an example of a sequence diagram when the analysis request computer 25 transmits an analysis request to the system 10 in the present embodiment. FIG. 19 differs from FIG. 13 shown in the first embodiment in that a statistical information recording process S2307 is added as a process of the management computer 20 after the analysis execution place determination process S2303.
- the processing computer 3 (21-3) executes the analysis processing S2305.
- The management computer 20 executes the statistical information recording process S2307 and records, in the data structure / statistical information management table 70A shown in FIG. 18, the fact that the analysis request transmitted from the analysis request computer 25 was executed by the processing computer 3 (21-3). Details of the statistical information recording process S2307 will be described later.
- FIG. 20 is a flowchart illustrating an example of the statistical information recording process S2307.
- the statistical information recording unit 700 updates the reference number field of the data structure / statistical information management table 70A from the processing computer 21 determined in the analysis execution place determination processing S2303 executed immediately before and the file used in the analysis processing S2305. Execute the process.
- When the statistical information recording unit 700 starts the process in S700, it first selects, from the data structure / statistical information management table 70A, the column of the processing computer 21 selected in the analysis execution location determination process S2303 (S701). Next, the row of the file used in the analysis request is selected from the selected column (S702).
- When the value of the selected cell is 0, that is, the file is held as an original file, the statistical information recording unit 700 proceeds to the process of S704 (S704, 0).
- When the value of the selected cell is 1, that is, the file is held as a converted file, the statistical information recording unit 700 proceeds to the process of S705 (S705, 1).
- In S704, the statistical information recording unit 700 increments the value on the left side of the reference number field of the selected row by 1 (S704).
- In S705, the statistical information recording unit 700 increments the value on the right side of the reference number field of the selected row by 1 (S705).
- The statistical information recording unit 700 then ends the process.
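- The reference count update of S701 to S705 can be sketched in Python as follows. The dictionaries standing in for the table 70A and the cell encoding (0 for an original file, 1 for a converted file, read from the branch labels above) are assumptions for illustration.

```python
def record_reference(structure_table, reference_counts, host, file_name):
    """Increment the reference count for the data structure that was used.

    structure_table: file name -> {host: 0 or 1}; 0 is assumed to mean the
        host holds the original file and 1 the converted file.
    reference_counts: file name -> (original_refs, converted_refs), i.e. the
        left and right sides of the reference number field.
    """
    cell = structure_table[file_name][host]          # S701, S702
    original_refs, converted_refs = reference_counts.get(file_name, (0, 0))
    if cell == 0:
        original_refs += 1                           # S704: left side
    else:
        converted_refs += 1                          # S705: right side
    reference_counts[file_name] = (original_refs, converted_refs)
    return reference_counts[file_name]


if __name__ == "__main__":
    table = {"A": {"pc1": 0, "pc2": 0, "pc3": 1}}
    counts = {"A": (2, 5)}
    # An analysis of file A executed on pc3 uses the converted file.
    print(record_reference(table, counts, "pc3", "A"))
```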
- FIG. 21 shows an example of a sequence diagram in the case where the data structure retention ratio is changed for the file selected in the retention ratio change determination process S2401.
- The timing at which the holding ratio change determination process S2401 is performed can be set arbitrarily. For example, a configuration in which the user can manually cause the holding ratio change determination unit 800 to execute it, or a configuration in which the management computer 20 causes the holding ratio change determination unit 800 to execute it periodically, can be applied to the present invention. Details of the retention ratio change determination process S2401 will be described later. Further, the data structure conversion instruction S2402 and the data structure conversion process S2403 in the sequence diagram are merely examples; depending on the determination made in the holding ratio change determination process S2401, a data structure reverse conversion instruction and a data structure reverse conversion process may be performed instead.
- The management computer 20 first executes the retention ratio change determination process S2401.
- In this example, it is decided to change the current retention ratio of 2:1 to 1:2, the new value closest to the reference number of 2:6.
- Accordingly, an instruction to convert the data structure is given for the original file of file A stored in the processing computer 2 (21-2).
- The data structure conversion instruction S2402 is transmitted to the processing computer 2 (21-2).
- The processing computer 2 (21-2) that has received the data structure conversion instruction S2402 executes the data structure conversion process S2403 on the original file of file A specified by the data structure conversion instruction and obtains the converted file of file A.
- the data structure conversion process S2403 is the same as the process shown in FIG.
- the processing computer 2 (21-2) transmits a conversion completion notification S2404 to the management computer 20 and ends the processing.
- As a result, the state in which the holding ratio of the original file and the converted file for file A was 2:1 is changed to the new holding ratio of 1:2, which is closer to the reference number of 2:6.
- FIG. 22 is a flowchart illustrating an example of the retention ratio change determination process S2401.
- The holding ratio change determination unit 800 checks, for all the files stored in the system 10, whether the holding ratio of the original file and the converted file needs to be changed. If a change is necessary, it determines a new holding ratio and instructs the processing computer 21 to convert or reverse convert the data structure, thereby changing the retention ratio of the original file and the converted file. Hereinafter, the process will be described with reference to FIG.
- The retention ratio change determination unit 800 reads the file redundancy 51 from the policy definition file 50 shown in FIG. 3 (S801).
- Here, the file redundancy that has been read is denoted by e.
- S802 to S815 are repeated for all the files registered in the data structure / statistical information management table 70A shown in FIG.
- the retention ratio (74-A to 74-C) and the reference number (75-A to 75-C) of each file are read from the data structure / statistical information management table 70A (S803).
- Here, the read retention ratio is denoted by a:b,
- and the read reference number is denoted by c:d.
- The retention ratio change determination unit 800 determines whether the read retention ratio and the reference number show the same tendency (S804).
- An example of a specific determination method is as follows. The value of a / (a + b) is compared with the value of c / (c + d); if the difference is 1 / (a + b) or more, it is determined that the tendencies of the holding ratio and the reference number are different. On the other hand, if the difference is less than 1 / (a + b), it is determined that the tendencies of the retention ratio and the reference number are the same.
- the determination method of the retention ratio and the reference number tendency shown above is merely an example, and the present invention is not limited to this.
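- The example determination method for S804 can be written as a small Python function. Exact rational arithmetic is used so that a difference exactly equal to 1 / (a + b) is not misjudged because of floating-point rounding. The function name and the handling of a file that has never been referenced (c + d = 0, treated here as "no difference") are assumptions not stated in the text.

```python
from fractions import Fraction


def tendency_differs(a, b, c, d):
    """Return True if retention ratio a:b and reference number c:d differ.

    The tendencies differ when |a/(a+b) - c/(c+d)| >= 1/(a+b).
    """
    if a + b == 0:
        raise ValueError("the retention ratio must describe at least one copy")
    if c + d == 0:
        return False  # assumption: no references yet, so nothing to adapt to
    held = Fraction(a, a + b)
    referenced = Fraction(c, c + d)
    return abs(held - referenced) >= Fraction(1, a + b)


if __name__ == "__main__":
    # Retention ratio 2:1 against reference number 2:6, as for file A.
    print(tendency_differs(2, 1, 2, 6))
    # Retention ratio 2:1 against reference number 5:3.
    print(tendency_differs(2, 1, 5, 3))
```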
- If the tendencies are determined to be the same, the retention ratio change determination unit 800 skips the subsequent processing up to the loop end (S815).
- If the tendencies are determined to be different, the retention ratio change determination unit 800 proceeds to the process of S805.
- The retention ratio change determination unit 800 determines a new retention ratio close to the reference number c:d (S805, S806).
- The new retention ratio to be determined is denoted by a′:b′ (a′ and b′ are non-negative integers).
- Specifically, the non-negative integer value a′ that minimizes the absolute value of the difference between a′ / e and c / (c + d) is obtained, and b′ is then e − a′, so that a′ + b′ equals the redundancy e.
- The determination method described above is merely an example, and the present invention is not limited to this.
- For example, when the redundancy (e) of the file is 3, the retention ratio (a:b) of the original file / converted file is 2:1, and the reference number (c:d) is 1:2, the new retention ratio a′:b′ is 1:2.
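- The choice of the new retention ratio in S805 and S806 can be sketched in Python as follows. The function name is hypothetical, b′ is taken as e − a′ so that the two parts add up to the redundancy e, and when two values of a′ are equally close the smaller one is chosen, which is a tie-breaking assumption the text does not specify.

```python
from fractions import Fraction


def new_retention_ratio(e, c, d):
    """Return (a_new, b_new) with a_new + b_new == e.

    a_new is the integer in 0..e that minimizes |a_new/e - c/(c+d)|.
    """
    if e <= 0:
        raise ValueError("redundancy e must be positive")
    if c + d == 0:
        raise ValueError("the reference number must contain at least one reference")
    share = Fraction(c, c + d)  # fraction of references to the original file
    a_new = min(range(e + 1), key=lambda a: abs(Fraction(a, e) - share))
    return a_new, e - a_new


if __name__ == "__main__":
    # Redundancy 3 and reference number 1:2, as in the example in the text.
    print(new_retention_ratio(3, 1, 2))
    # Redundancy 3 and reference number 2:6, as for file A.
    print(new_retention_ratio(3, 2, 6))
```

- Both calls give the ratio 1:2, in line with the examples in the text. The second element of the result is the target retention number of the converted file used in S807.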
- the retention ratio change determination unit 800 calculates the target retention number of the converted file (S807).
- the target holding number here is the same as b ′ calculated in S806.
- The retention ratio change determination unit 800 refers to the data structure / statistical information management table 70A shown in FIG. 18 and repeats the processing from S808 to S814 until the current retention number of the converted file matches the target retention number calculated in S807. If the current retention number does not match the target retention number, the two numbers are then compared (S809).
- If the current retention number of the converted file is smaller than the target retention number, the retention ratio change determination unit 800 performs data structure conversion processing from the original file to the converted file (S810). Specifically, a target computer is randomly selected from the processing computers 21 storing the original file so that the newly determined retention ratio is obtained, and a data structure conversion instruction is transmitted to it. Thereafter, the data structure / statistical information management table 70A is updated to the latest information (S811), and the process returns to S808 and repeats until the current and target retention numbers of the converted file match.
- If the current retention number of the converted file is larger than the target retention number, the retention ratio change determination unit 800 performs data structure reverse conversion processing from the converted file to the original file (S812). Specifically, a target computer is randomly selected from the processing computers 21 storing the converted file so that the newly determined retention ratio is obtained, and a data structure reverse conversion instruction is transmitted to it. Thereafter, the data structure / statistical information management table 70A is updated to the latest information (S813), and the process returns to S808 and repeats until the current and target retention numbers of the converted file match. When the current retention number of the converted file matches the target retention number, the process ends (S816).
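The loop from S808 to S814 can be sketched as below. This is an illustrative simulation under assumptions: the management table is reduced to a hypothetical dictionary mapping a computer name to the format it holds, and sending an instruction is modeled by updating that dictionary directly.

```python
import random


def rebalance(formats: dict[str, str], target_converted: int,
              rng: random.Random) -> list[tuple[str, str]]:
    """Convert or reverse-convert randomly chosen copies until the number of
    converted copies equals the target. `formats` is updated in place and the
    list of (computer, instruction) pairs that would be sent is returned."""
    if not 0 <= target_converted <= len(formats):
        raise ValueError("target must lie between 0 and the number of copies")
    instructions = []
    while True:
        current = sum(1 for f in formats.values() if f == "converted")
        if current == target_converted:          # numbers match: finish
            return instructions
        if current < target_converted:           # too few converted copies
            src, dst, op = "original", "converted", "convert"
        else:                                    # too many converted copies
            src, dst, op = "converted", "original", "reverse_convert"
        candidates = sorted(name for name, f in formats.items() if f == src)
        chosen = rng.choice(candidates)          # random target computer
        instructions.append((chosen, op))
        formats[chosen] = dst                    # table update after the instruction


table = {"pc1": "original", "pc2": "original", "pc3": "converted"}
print(rebalance(table, 2, random.Random(0)), table)
```

Passing the random generator in explicitly keeps the sketch reproducible; a real system would wait for a completion notice before updating the table.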
- As described above, the statistical information recording unit 700, the retention ratio change determination unit 800, and the data structure / statistical information management table 70A realize the proposed configuration, in which the retention ratio of the original file / converted file is changed dynamically based on the statistical information.
- Using the load information of each computer when dynamically changing the retention ratio realizes a more efficient retention ratio change.
- As a new method of data structure conversion and reverse conversion processing, a method of copying a file that is already in the post-conversion format from a processing computer 21 that holds it is adopted.
- FIG. 23 is a diagram illustrating a configuration example of the system 10 according to the third embodiment.
- The difference in the third embodiment is that the main storage device 31-0 of the management computer 20 includes the retention ratio change determination unit 800A and that a data structure conversion method determination unit 900 is added.
- A conversion rule definition file 90 is added to the secondary storage device 32-0 of the management computer 20.
- A load information notification unit 1500 and a data replacement unit 1600 are added to each processing computer (21-1 to 21-3).
- FIG. 24 is an example of the conversion rule definition file 90 shown in FIG.
- The conversion rule definition file 90 defines threshold conditions on load information of the processing computer 21, such as the CPU usage rate and the network usage rate, together with the processing to be performed when all conditions are satisfied.
- In this example, threshold conditions are defined for the CPU usage rate and the network usage rate, and the processing (91 to 94) to be performed when both are satisfied is described.
- The values shown in the definition are merely examples, and the conditions can be defined more finely.
- Although the CPU usage rate and the network usage rate are defined as conditions here, the conditions are not limited to these, and other load information of the processing computer 21 can also be defined.
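A rule table of this kind could look like the sketch below. The thresholds and the mapping from load ranges to "conversion" or "copy" are hypothetical values chosen for illustration, since the actual contents of FIG. 24 are not reproduced in this text; the names `Rule`, `RULES`, and `derive_process` are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    cpu_range: tuple[float, float]   # inclusive CPU usage range in percent
    net_range: tuple[float, float]   # inclusive network usage range in percent
    process: str                     # "conversion" or "copy"


# Hypothetical thresholds: a busy CPU with an idle network favors copying
# an already converted file instead of converting locally.
RULES = [
    Rule((0, 50), (0, 50), "conversion"),
    Rule((0, 50), (50, 100), "conversion"),
    Rule((50, 100), (0, 50), "copy"),
    Rule((50, 100), (50, 100), "conversion"),
]


def derive_process(cpu: float, net: float, rules: list[Rule] = RULES) -> str:
    """Return the process of the first rule whose conditions are all satisfied."""
    for rule in rules:
        cpu_ok = rule.cpu_range[0] <= cpu <= rule.cpu_range[1]
        net_ok = rule.net_range[0] <= net <= rule.net_range[1]
        if cpu_ok and net_ok:
            return rule.process
    raise LookupError("no rule matches the given load information")


print(derive_process(90.0, 10.0))
```

Because the ranges share their boundary values, the first matching rule wins; adding further load metrics would mean adding one more range per rule.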
- FIG. 25 is an example of a sequence diagram in which the retention ratio change determination unit 800A and the data structure conversion method determination unit 900 in this embodiment execute the retention ratio change determination process S2501 and the data structure conversion method determination process S2508, respectively. It shows both the case of instructing execution of the data structure conversion process S2512 (S2510) and the case of instructing execution of the data replacement process S2527 (S2520). Whether to instruct execution of the data structure conversion process S2512 or of the data replacement process S2527 is determined in the data structure conversion method determination process S2508, details of which will be described later.
- The data structure conversion instruction S2511 and the data structure conversion process S2512 in the sequence diagram are merely examples; depending on the results of the retention ratio change determination process S2501 and the data structure conversion method determination process S2508, they may instead be a data structure reverse conversion instruction and data structure reverse conversion processing. Details of the retention ratio change determination process S2501, the data structure conversion method determination process S2508, and the load information notification processes S2504 to S2505 will be described later. Further, the data structure conversion process S2512 is the same as that in FIG.
- The processing illustrated in the sequence diagram of FIG. 25 will be described based on a specific example. It is assumed that three processing computers 21 are registered in the system 10 and that the retention ratio change determination process S2501 determines that the retention ratio is to be changed for a certain file A. At this time, as shown in line 72-A of the data structure / statistical information management table 70A shown in FIG. 18, the original file of file A is stored in the processing computers 1 and 2, and the converted file is stored in the processing computer 3. For file A, the retention ratio of the original file and the converted file is 2:1, as described in cell 74-A in FIG. 18, and the reference number of the original file and the converted file is 2:6, as described in cell 75-A in FIG. 18.
- The management computer 20 executes the retention ratio change determination process S2501.
- In this example, it is decided to change the current retention ratio 2:1 to a new retention ratio of 1:2, which is the closest to the reference number 2:6. To change the current retention ratio 2:1 to the new retention ratio 1:2, the management computer 20 transmits load information notification instructions S2502 to S2503 to the processing computer 1 (21-1) and the processing computer 2 (21-2). The file being processed in the retention ratio change determination process S2501 is hereinafter referred to as the target file.
- Upon receiving the load information notification instructions S2502 to S2503, the processing computer 1 (21-1) and the processing computer 2 (21-2) execute the load information notification processing S2504 to S2505 and transmit the load information S2506 to S2507 to the management computer 20.
- The load information can include, for example, information such as the CPU usage rate and the network usage rate described above.
- After receiving the load information S2506 to S2507 of the processing computer 1 (21-1) and the processing computer 2 (21-2), the management computer 20 executes the data structure conversion method determination processing S2508.
- The data structure conversion method determination processing S2508 leads to one of two patterns: the case where the data structure conversion processing is selected (S2510) and the case where the data replacement processing is selected (S2520). Each case is described separately below.
- After receiving the storage completion notification S2525, the management computer 20 transmits to the processing computer 2 (21-2) a data replacement instruction S2526 for the converted file received from the processing computer 3 (21-3) and the original file originally held by the processing computer 2 (21-2).
- The processing computer 2 (21-2) that has received the data replacement instruction S2526 executes the data replacement processing S2527 for the converted file and the original file. The original file to be replaced, which was originally held in the secondary storage device (32-2) of the processing computer 2 (21-2), is overwritten with the converted file received from the processing computer 3 (21-3), and a replacement completion notification S2528 is transmitted to the management computer 20.
- As a result, the state in which the retention ratio of the original file and the converted file is 2:1 is changed to the new retention ratio 1:2, which is closer to the reference number 2:6.
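The overwrite step of the data replacement processing S2527 can be illustrated with a small file-level sketch. It assumes, purely for illustration, that the received converted file and the original file are ordinary files on the same file system of the processing computer; the function name `replace_stored_file` is hypothetical.

```python
import os
from pathlib import Path


def replace_stored_file(original: Path, received: Path) -> None:
    """Overwrite the stored original file with the received converted file.

    os.replace renames `received` onto `original`, replacing it in one step
    when both paths are on the same file system (an assumption here).
    """
    if not received.is_file():
        raise FileNotFoundError(received)
    os.replace(received, original)
```

After the call only one copy remains on the computer, so the number of copies per computer is unchanged while the stored format switches; sending the replacement completion notice would follow this step.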
- FIG. 26 is a flowchart illustrating an example of the load information notification process.
- The load information notification unit 1500 receives a request from the management computer 20 and notifies the management computer 20 of the load information of the processing computer 21 on which the unit is executed.
- When the load information notification unit 1500 starts processing in S1500, it receives a load information notification instruction from the management computer 20 (S1501) and acquires the load information of the processing computer 21 (S1502).
- The load information of the processing computer 21 may include, for example, the above-described CPU usage rate and network usage rate, the disk usage rate, and so on, but is not limited to these.
- After acquiring the load information of the processing computer 21, the load information notification unit 1500 notifies the management computer of the acquired load information (S1503) and ends the processing (S1504).
- FIG. 27 is a flowchart showing an example of the retention ratio change determination processing S2501.
- The retention ratio change determination unit 800A according to the present embodiment differs from the retention ratio change determination unit 800 according to the second embodiment in that it transmits a load information notification instruction to the processing computer 21.
- The retention ratio change determination unit 800A calculates the target retention number of the converted file (S807) and then calculates the absolute value f of the difference between the current retention number of the converted file and the target retention number (S850). Taking file A on line 72-A of the data structure / statistical information management table 70A shown in FIG. 18 as an example, the current retention number of the converted file is 1 and the target retention number is 2 from the results of S805 to S807, so f calculated in S850 is 1. Next, the current retention number of the converted file is compared with the target retention number (S851). When the current retention number of the converted file matches the target retention number, the retention ratio change determination unit 800A proceeds to the process of S815 (S851, match).
- When the current retention number of the converted file is smaller than the target retention number, the retention ratio change determination unit 800A refers to the data structure / statistical information management table 70A and transmits the load information notification instruction to all the processing computers 21 storing the original file (S852).
- When the current retention number of the converted file is larger than the target retention number, the retention ratio change determination unit 800A refers to the data structure / statistical information management table 70A and transmits the load information notification instruction to all the processing computers 21 storing the converted file (S853).
- The load information of each processing computer 21 returned in response to the load information notification instruction transmitted here, and the absolute value f of the difference between the current retention number of the converted file and the target retention number calculated in S850, are used in the data structure conversion method determination processing.
- FIG. 28 is a flowchart illustrating an example of the data structure conversion method determination processing S2508.
- For every file whose retention ratio change was determined in S802 to S815 by the retention ratio change determination unit 800A in the present embodiment, the data structure conversion method determination unit 900 uses the load information to select, from among the processing computers 21 that were instructed in either S852 or S853 to notify their load information, a method for efficiently changing the retention ratio of the file's data structures, and issues the corresponding instruction.
- the processing will be described with reference to FIG.
- The data structure conversion method determination unit 900 repeats the steps S901 to S915 for all files whose retention ratios are to be changed as determined by the retention ratio change determination unit 800A in the present embodiment. First, when starting the process in S900, the data structure conversion method determination unit 900 receives, for the file whose retention ratio is to be changed, all the load information that the retention ratio change determination unit 800A instructed the processing computers 21 to send (S902).
- The load information received in S902 corresponds to the load information S2506 to S2507 in the sequence diagram.
- Using the value of f determined in S850 of the retention ratio change determination processing shown in FIG. 27 and the CPU usage rate information included in the load information of the processing computers 21 received in S902, the f processing computers 21 with the lowest CPU usage rates are selected (S903).
- A computer selected here is called an instruction target computer, and the original file or converted file stored in it is called a conversion target file.
- A file obtained by applying data structure conversion processing or data structure reverse conversion processing to a conversion target file is referred to as a post-conversion file. Specifically, if the conversion target file is an original file, the post-conversion file is a converted file, and if the conversion target file is a converted file, the post-conversion file is an original file.
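The selection in S850 and S903 can be sketched as follows. The function name and the dictionary of CPU usage rates are hypothetical, and breaking ties by computer name is an assumption added only to make the result deterministic.

```python
def select_instruction_targets(current_converted: int, target_converted: int,
                               cpu_usage: dict[str, float]) -> list[str]:
    """Pick the f least CPU-loaded computers among those that reported load.

    f is the absolute difference between the current and target numbers of
    converted copies (S850); the f lowest CPU usage rates win (S903).
    """
    f = abs(current_converted - target_converted)
    ranked = sorted(cpu_usage, key=lambda name: (cpu_usage[name], name))
    return ranked[:f]


# File A in the example: 1 converted copy now, 2 wanted, so f = 1.
print(select_instruction_targets(1, 2, {"pc1": 70.0, "pc2": 20.0}))
```

Only the computers that were asked for load information in S852 or S853 appear in `cpu_usage`, so the selection never picks a computer holding the wrong format.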
- The data structure conversion method determination unit 900 repeats S904 to S914 for all the instruction target computers. First, it checks whether the post-conversion file of the conversion target file stored in the instruction target computer is stored in another computer (S905). The check in S905 can be performed with reference to the data structure / statistical information management table 70A. If the post-conversion file is stored in a computer other than the instruction target computer, the data structure conversion method determination unit 900 proceeds to the processing of S906 (S905, exists). If the post-conversion file is not stored anywhere other than the instruction target computer, the data structure conversion method determination unit 900 proceeds to the process of S910 (S905, does not exist).
- In S906, the data structure conversion method determination unit 900 derives one of the processes 91 to 94 from the load information of the instruction target computer acquired in S902 and the conversion rule definition file 90. If the derived process is "conversion", the data structure conversion method determination unit 900 proceeds to the process of S910 (S906, conversion). If the derived process is "copy", it proceeds to the process of S907 (S906, copy).
- In the conversion case, the data structure conversion method determination unit 900 confirms the data structure of the conversion target file (S910). If the conversion target file is an original file, the unit instructs the instruction target computer to convert the data structure so that the original file becomes a converted file (S911). If the conversion target file is a converted file, the unit instructs the instruction target computer to reverse-convert the data structure so that the converted file becomes an original file (S912).
- FIG. 25 exemplifies the processing when a data structure conversion instruction is issued, and the data structure conversion instruction in S911 corresponds to S2511 in that figure. After issuing the instruction in either S911 or S912, the data structure conversion method determination unit 900 proceeds to the process of S913.
- In the copy case, the processing flow corresponds to S2520 to S2528 in the sequence diagram.
- The data structure conversion method determination unit 900 selects one computer that stores the post-conversion file of the conversion target file checked in S905 (S907).
- The data structure conversion method determination unit 900 instructs the selected computer to perform data redundancy toward the instruction target computer (S908).
- The data redundancy instruction here corresponds to S2521 in the sequence diagram.
- The processing computer that has received the data redundancy instruction transmits the post-conversion file of the conversion target file to the instruction target computer.
- The instruction target computer that has received the post-conversion file of the conversion target file performs data storage processing and transmits a storage completion notification to the management computer.
- The management computer instructs the instruction target computer to perform data replacement processing (S909).
- The data replacement instruction here corresponds to S2526 in the sequence diagram. Thereafter, the process proceeds to S913.
- After completing the processing in S907 to S909 or S910 to S912, the data structure conversion method determination unit 900 updates the recorded contents of the data structure / statistical information management table 70A with the latest information (S913). Thereafter, the processing returns to S904 and S901, and after the processing loops have been repeated under the predetermined conditions (S914 and S915), the processing ends (S916).
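The branching from S905 through S912 can be condensed into one decision function. This is a sketch under assumptions: the function name, the argument names, and the returned labels are hypothetical, and the derived process is taken as an input rather than computed from a rule file.

```python
def decide_method(target_format: str, post_conversion_elsewhere: bool,
                  derived_process: str) -> str:
    """Return the instruction for one instruction target computer.

    target_format: "original" or "converted" (format of the conversion target file)
    post_conversion_elsewhere: result of the S905 check
    derived_process: "conversion" or "copy" (result of S906)
    """
    if target_format not in ("original", "converted"):
        raise ValueError(f"unknown format: {target_format}")
    if derived_process not in ("conversion", "copy"):
        raise ValueError(f"unknown process: {derived_process}")
    # Copying is possible only if some other computer already holds the
    # post-conversion file; otherwise local conversion is the only option.
    if post_conversion_elsewhere and derived_process == "copy":
        return "copy_and_replace"      # S907 to S909
    if target_format == "original":
        return "convert"               # S911
    return "reverse_convert"           # S912


print(decide_method("original", True, "copy"))
```

Whatever branch is taken, the management table would be updated afterwards (S913) before the next instruction target computer is handled.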
- The retention ratio of the original file / converted file can be changed dynamically.
- An information processing system and an information processing method are provided that can analyze semi-structured data at high speed while maintaining the fault tolerance of the data, without holding extra data in the information processing system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Quality & Reliability (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
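In addition, a dedicated file system is provided so that the distributed processing platform can run at high speed. Because this file system assumes the use of commodity hardware, it achieves fault tolerance for files by holding the files to be processed redundantly on multiple computers.
When a column store is used in an information processing system built on a distributed processing platform, the system must hold extra data equal to the amount of data stored anew in the column store. Meanwhile, an information processing system built on a distributed processing platform stores data redundantly in order to increase the fault tolerance of the data. For these two reasons, applying a column store to an information processing system built on a distributed processing platform raises the problem that extra data must be held.
Another aspect of the present invention is a control method of the information processing system.
[First Embodiment]
FIG. 1 is a diagram illustrating the overall configuration of the information processing system 10 according to the first embodiment of the present invention.
When a file is stored in the system 10, the data placement determination unit 100 determines to which processing computer 21 the storage request computer 24 should first transmit the file, and instructs the storage request computer 24 accordingly. Hereinafter, the file that the storage request computer 24 first transmits to the system 10 is called the "original file".
The data saving unit 1000 receives the file transmitted from the storage request computer 24 and saves it in the secondary storage device 32-1 of the processing computer 1 (21-1).
Next, the present invention will be described with reference to its second embodiment. This embodiment shows an example of a system that can also dynamically change the retention ratio of the original file / converted file using statistical information on analysis requests.
Next, a third embodiment of the present invention will be described. This embodiment shows an example of a system that can also change the retention ratio of the original file / converted file dynamically and efficiently, using statistical information on analysis requests and the load information of each processing computer.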
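20 Management computer
21 Processing computer
24 Storage request computer
25 Analysis request computer
40 Computer definition file
50 Policy definition file
60 Analysis request setting file
70 Data structure management table
70A Data structure / statistical information management table
80 File
90 Conversion rule definition file
100 Data placement determination unit
200 Data redundancy determination unit
300 Data structure conversion determination unit
400 Analysis request reception unit
500 Analysis execution location determination unit
600 Data restoration determination unit
700 Statistical information recording unit
800 Retention ratio change determination unit
800A Retention ratio change determination unit
900 Data structure conversion method determination unit
1000 Data saving unit
1100 Data redundancy unit
1200 Data structure conversion unit
1300 Data structure reverse conversion unit
1400 Analysis processing unit
1500 Load information notification unit
1600 Data replacement unit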
Claims (15)
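- An information processing system that stores data and analyzes the stored data in response to requests from an external device, the information processing system comprising:
a plurality of data storage units, each providing a storage area for the data;
a plurality of data structure operation units, each associated with one of the data storage units and executing a predetermined operation on the data structure of the data stored in that data storage unit;
a plurality of data redundancy units, each associated with one of the data storage units and transmitting any of the data stored in that data storage unit to another of the data storage units;
a data placement determination unit that determines in which of the plurality of data storage units the data requested by the external device is to be stored;
a data redundancy determination unit that causes the plurality of data redundancy units to transmit any of the data to another of the data storage units;
a data structure operation determination unit that causes each of the data structure operation units to operate on the data structure of the data stored in the data storage unit;
a data structure management information holding unit that holds data structure management data, which is information on the data stored in the plurality of data storage units and on the data structure of the data; and
an analysis processing unit that accepts, from the external device, an analysis request for the data stored in the data storage units and executes analysis processing of the data pertaining to the analysis request,
wherein the data placement determination unit acquires, for the data for which a storage request has been received from the external device, the preset number of copies to be stored in the information processing system and the data structure of the data to be stored, refers to the data structure management information of the corresponding data held in the data structure management information holding unit, determines the data storage unit in which the data pertaining to the storage request is to be stored, and notifies the external device,
the data redundancy determination unit refers to the data structure management information holding unit and instructs the data redundancy unit to create a replica of the data pertaining to the storage request and to transmit the created replica to the data storage unit that is its storage destination,
the data structure operation determination unit refers to the data structure management information recorded in the data structure management information holding unit and transmits, to the data structure operation unit, an instruction to execute a data operation on the data stored in any of the data storage units in which a replica is stored, and
the analysis processing unit executes the analysis processing, according to the content of the analysis request, using either the data after the data structure operation or the data whose data structure has not been operated on, stored in any of the data storage units.
- The information processing system according to claim 1,
wherein the data structure operation determination unit causes any of the plurality of data structure operation units to execute the data structure operation processing on the data pertaining to the storage request, in accordance with the data structure management information recorded in the data structure management information holding unit for the data pertaining to the storage request from the external device.
- The information processing system according to claim 1,
wherein the analysis execution location determination unit determines, based on the content of the analysis request from the external device, whether to use, for the data to be analyzed, the data having the data structure after the data operation or the data having the data structure that has not been operated on, and, according to the determination result, refers to the data structure management information holding unit and executes analysis processing on the data stored in any of the data storage units that stores the corresponding data, either after the data structure operation or with the data structure not operated on.
- The information processing system according to claim 1,
wherein, when a failure in any of the data storage units is detected, the information processing system refers to the data structure management information held in the data structure management information holding unit, acquires the data that was stored in the failed data storage unit and its data structure, and causes the data that was stored in the failed data storage unit to be restored, using the data stored in any of the data storage units other than the failed data storage unit, by the data redundancy unit and the data structure operation unit associated with that data storage unit.
- The information processing system according to claim 1, further comprising:
a statistical information recording unit that records, for each data structure, the number of references to the data to be analyzed, based on the analysis requests accepted from the external device; and
a retention ratio change determination unit that issues, for each piece of the data, an instruction to change the retention ratio of the data between the data structure after the data structure operation and the data structure that has not been operated on,
wherein the data structure management information holding unit further records the number of references and the retention ratio for each piece of the data, and
the retention ratio change determination unit determines, using the per-data-structure retention ratio of the data stored in the data storage units and the per-data-structure number of references recorded by the statistical information recording unit, whether to change the retention ratio of the data structures of each piece of the data, determines a new retention ratio when it determines that a change is to be made, and causes the data redundancy unit and the data structure operation unit to execute the operation on the data structure or the restoration of the data structure after the operation.
- The information processing system according to claim 5, further comprising:
a data structure change method determination unit that, when the data structure of the data stored in a data storage unit is to be changed, determines whether to operate on the data structure of that data or to acquire the data having the desired data structure stored in another of the data storage units, and instructs the data redundancy unit associated with the relevant data storage unit;
load information notification units, each associated with one of the data storage units and notifying another of the data storage units of load information on that data storage unit; and
data replacement units, each associated with one of the data storage units and replacing the data stored in that data storage unit with the data received from another of the data storage units,
wherein the retention ratio change determination unit determines, using the per-data-structure number of references and the per-data-structure retention ratio of the data targeted by analysis requests from the external device, as recorded in the data structure management information holding unit by the statistical information recording unit, whether to change the retention ratio of the data structures of each piece of the data, determines a new retention ratio based on the number of references when it determines that a change is to be made, and instructs that load information on each of the data storage units be notified so that the data structure change method determination unit can determine whether to have the data structure of the data operated on or to have a replica of the data stored in any of the data storage units created and data replacement executed,
the data structure change method determination unit determines, using the load information received for each of the data storage units, whether to have the data structure operated on or to have the data redundancy unit and the data replacement unit create a replica of the data and execute data replacement, and
the data replacement unit receives the data transmitted by another of the data redundancy units and replaces the data stored in the data storage unit with it.
- The information processing system according to any one of claims 1 to 6,
wherein the data structure operation unit includes a data structure conversion unit that converts a reversibly convertible source file defined according to a certain rule into a converted file by applying conversion processing, and a data structure reverse conversion unit that converts a reversibly convertible converted file defined according to the certain rule into the source file by applying reverse conversion processing.
- The information processing system according to claim 7,
wherein the data structure is a comma-separated values format, the data structure conversion unit executes transposition processing that swaps the row values and the column values of the data, and the data structure reverse conversion unit executes reverse transposition processing that swaps the row values and the column values of the transposed data.
- The information processing system according to claim 7,
wherein the data structure conversion unit compresses the data using an appropriate algorithm, and the data structure reverse conversion unit decompresses the compressed data using the algorithm.
- A control method of an information processing system that stores data and analyzes the stored data in response to requests from an external device,
wherein the information processing system comprises:
a plurality of data storage units, each providing a storage area for the data;
a plurality of data structure operation units, each associated with one of the data storage units and executing a predetermined operation on the data structure of the data stored in that data storage unit;
a plurality of data redundancy units, each associated with one of the data storage units and transmitting any of the data stored in that data storage unit to another of the data storage units;
a data placement determination unit that determines in which of the plurality of data storage units the data requested by the external device is to be stored;
a data redundancy determination unit that causes the plurality of data redundancy units to transmit any of the data to another of the data storage units;
a data structure operation determination unit that causes each of the data structure operation units to operate on the data structure of the data stored in the data storage unit;
a data structure management information holding unit that holds data structure management data, which is information on the data stored in the plurality of data storage units and on the data structure of the data; and
an analysis processing unit that accepts, from the external device, an analysis request for the data stored in the data storage units and executes analysis processing of the data pertaining to the analysis request,
and wherein the data placement determination unit acquires, for the data for which a storage request has been received from the external device, the preset number of copies to be stored in the information processing system and the data structure of the data to be stored, refers to the data structure management information of the corresponding data held in the data structure management information holding unit, determines the data storage unit in which the data pertaining to the storage request is to be stored, and notifies the external device,
the data redundancy determination unit refers to the data structure management information holding unit and instructs the data redundancy unit to create a replica of the data pertaining to the storage request and to transmit the created replica to the data storage unit that is its storage destination,
the data structure operation determination unit refers to the data structure management information recorded in the data structure management information holding unit and transmits, to the data structure operation unit, an instruction to execute a data operation on the data stored in any of the data storage units in which a replica is stored, and
the analysis processing unit executes the analysis processing, according to the content of the analysis request, using either the data after the data structure operation or the data whose data structure has not been operated on, stored in any of the data storage units.
- The control method of an information processing system according to claim 10,
wherein the data structure operation determination unit causes any of the plurality of data structure operation units to execute the data structure operation processing on the data pertaining to the storage request, in accordance with the data structure management information recorded in the data structure management information holding unit for the data pertaining to the storage request from the external device.
- The control method of an information processing system according to claim 10,
wherein the analysis execution location determination unit determines, based on the content of the analysis request from the external device, whether to use, for the data to be analyzed, the data having the data structure after the data operation or the data having the data structure that has not been operated on, and, according to the determination result, refers to the data structure management information holding unit and executes analysis processing on the data stored in any of the data storage units that stores the corresponding data, either after the data structure operation or with the data structure not operated on.
- The control method of an information processing system according to claim 10,
wherein, when a failure in any of the data storage units is detected, the data structure management information held in the data structure management information holding unit is referred to, the data that was stored in the failed data storage unit and its data structure are acquired, and the data that was stored in the failed data storage unit is restored, using the data stored in any of the data storage units other than the failed data storage unit, by the data redundancy unit and the data structure operation unit associated with that data storage unit.
- The control method of an information processing system according to claim 10,
wherein the information processing system further comprises:
a statistical information recording unit that records, for each data structure, the number of references to the data to be analyzed, based on the analysis requests accepted from the external device; and
a retention ratio change determination unit that issues, for each piece of the data, an instruction to change the retention ratio of the data between the data structure after the data structure operation and the data structure that has not been operated on,
the data structure management information holding unit further records the number of references and the retention ratio for each piece of the data, and
the retention ratio change determination unit determines, using the per-data-structure retention ratio of the data stored in the data storage units and the per-data-structure number of references recorded by the statistical information recording unit, whether to change the retention ratio of the data structures of each piece of the data, determines a new retention ratio when it determines that a change is to be made, and causes the data redundancy unit and the data structure operation unit to execute the operation on the data structure or the restoration of the data structure after the operation.
- The control method of an information processing system according to claim 14,
wherein the information processing system further comprises:
a data structure change method determination unit that, when the data structure of the data stored in a data storage unit is to be changed, determines whether to operate on the data structure of that data or to acquire the data having the desired data structure stored in another of the data storage units, and instructs the data redundancy unit associated with the relevant data storage unit;
load information notification units, each associated with one of the data storage units and notifying another of the data storage units of load information on that data storage unit; and
data replacement units, each associated with one of the data storage units and replacing the data stored in that data storage unit with the data received from another of the data storage units,
the retention ratio change determination unit determines, using the per-data-structure number of references and the per-data-structure retention ratio of the data targeted by analysis requests from the external device, as recorded in the data structure management information holding unit by the statistical information recording unit, whether to change the retention ratio of the data structures of each piece of the data, determines a new retention ratio based on the number of references when it determines that a change is to be made, and instructs that load information on each of the data storage units be notified so that the data structure change method determination unit can determine whether to have the data structure of the data operated on or to have a replica of the data stored in any of the data storage units created and data replacement executed,
the data structure change method determination unit determines, using the load information received for each of the data storage units, whether to have the data structure operated on or to have the data redundancy unit and the data replacement unit create a replica of the data and execute data replacement, and
the data replacement unit receives the data transmitted by another of the data redundancy units and replaces the data stored in the data storage unit with it.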
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/358,616 US9317205B2 (en) | 2012-03-16 | 2012-03-16 | Information processing system and control method thereof |
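PCT/JP2012/056922 WO2013136520A1 (ja) | 2012-03-16 | 2012-03-16 | Information processing system and control method for information processing system |
JP2014504599A JP5735702B2 (ja) | 2012-03-16 | 2012-03-16 | Information processing system and control method for information processing system |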
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2012/056922 WO2013136520A1 (ja) | 2012-03-16 | 2012-03-16 | Information processing system and control method for information processing system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013136520A1 (ja) | 2013-09-19 |
Family
ID=49160485
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/056922 WO2013136520A1 (ja) | Information processing system and control method for information processing system | 2012-03-16 | 2012-03-16 |
Country Status (3)
Country | Link |
---|---|
US (1) | US9317205B2 (ja) |
JP (1) | JP5735702B2 (ja) |
WO (1) | WO2013136520A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015172823A (ja) * | 2014-03-11 | 2015-10-01 | 株式会社電通国際情報サービス | Information processing device, information processing method, and program |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6056453B2 (ja) * | 2012-12-20 | 2017-01-11 | 富士通株式会社 | Program, data management method, and information processing device |
CN104375959A (zh) * | 2014-12-01 | 2015-02-25 | 浪潮集团有限公司 | Method for implementing data protection using NVDIMM on a PowerPC cloud storage platform |
US9785792B2 (en) * | 2016-03-04 | 2017-10-10 | Color Genomics, Inc. | Systems and methods for processing requests for genetic data based on client permission data |
US10733476B1 (en) | 2015-04-20 | 2020-08-04 | Color Genomics, Inc. | Communication generation using sparse indicators and sensor data |
JP6550448B2 (ja) * | 2017-12-18 | 2019-07-24 | ヤフー株式会社 | Data management device, data management method, and program |
EP3983906A4 (en) * | 2019-06-17 | 2023-07-19 | Umwelt (Australia) Pty. Limited | DATA EXTRACTION PROCESS |
JP2023136323A (ja) * | 2022-03-16 | 2023-09-29 | 株式会社日立製作所 | Storage system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3024619B2 (ja) | 1997-11-20 | 2000-03-21 | 三菱電機株式会社 | File management method |
US7487395B2 (en) * | 2004-09-09 | 2009-02-03 | Microsoft Corporation | Method, system, and apparatus for creating an architectural model for generating robust and easy to manage data protection applications in a data protection system |
JP2006106985A (ja) * | 2004-10-01 | 2006-04-20 | Hitachi Ltd | Computer system, storage device, and storage management method |
US7904756B2 (en) * | 2007-10-19 | 2011-03-08 | Oracle International Corporation | Repair planning engine for data corruptions |
-
2012
- 2012-03-16 WO PCT/JP2012/056922 patent/WO2013136520A1/ja active Application Filing
- 2012-03-16 US US14/358,616 patent/US9317205B2/en not_active Expired - Fee Related
- 2012-03-16 JP JP2014504599A patent/JP5735702B2/ja not_active Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
EMC CORPORATION: "Critical Mass Innovation Architecture White Paper", GREENPLUM DATABASE, August 2010 (2010-08-01), pages 1 - 17, Retrieved from the Internet <URL:http://www.greenplum.com> * |
MASAYUKI MATSUSHITA: "GreenplumDB ga Motsu Kosokuka Kino to Work Load Kanri - Kayosei Kino", BIG DATA NO LETHAL WEAPON! TETTEI KAISEKI GREENPLUMDB, SHOEISHA CO., LTD, 1 February 2012 (2012-02-01), Retrieved from the Internet <URL:http://enterprisezine.jp/dbonline/detail/3720> * |
Also Published As
Publication number | Publication date |
---|---|
US20140331084A1 (en) | 2014-11-06 |
US9317205B2 (en) | 2016-04-19 |
JP5735702B2 (ja) | 2015-06-17 |
JPWO2013136520A1 (ja) | 2015-08-03 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12871156; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2014504599; Country of ref document: JP; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 14358616; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 12871156; Country of ref document: EP; Kind code of ref document: A1 |