JP2020107010A

JP2020107010A - Information processing program, information processor and information processing method

Info

Publication number: JP2020107010A
Application number: JP2018243928A
Authority: JP
Inventors: 政和川崎; Masakazu Kawasaki; 芳隆末廣; Yoshitaka Suehiro; 正雄友藤; Masao Tomofuji; 義弘安岡; Yoshihiro Yasuoka; 樹一山田; Kiichi Yamada
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-12-27
Filing date: 2018-12-27
Publication date: 2020-07-09
Anticipated expiration: 2038-12-27
Also published as: JP7174245B2

Abstract

To provide an information processing program, an information processing device, and an information processing method that can flatten the data size of each file without distributing records having the same key across different files. SOLUTION: Data is divided into a plurality of records. Based on the key included in each of the divided records, each record is sequentially classified into the record group corresponding to its key. When the number of keys corresponding to the record groups exceeds the number of files, the file with the smallest file size is identified among a plurality of files randomly extracted from the files corresponding to the number of files. When the file size of the identified file is equal to or less than the average file size of the files corresponding to the number of files, the classified records for which no record including the same key has yet been output to a file are output to the identified file. SELECTED DRAWING: Figure 11

Description

The present invention relates to an information processing program, an information processing device, and an information processing method.

For example, a business operator that provides services to users (hereinafter also simply referred to as a business operator) builds and runs an information processing system that processes data owned by a user (hereinafter also referred to as processing target data) into a form suitable for parallel processing.

Specifically, such an information processing system divides the plurality of records included in the processing target data into a number of parts according to the multiplicity of the parallel processing, such that the file sizes of the files containing the divided data (hereinafter also simply referred to as files) are flattened, that is, made roughly equal. This allows the information processing system to reduce the time required for the processing (parallel processing) performed on the processing target data (see, for example, Patent Documents 1 to 4).

Patent Document 1: JP 2012-118669 A
Patent Document 2: JP 2013-156960 A
Patent Document 3: International Publication No. 2016/178312
Patent Document 4: JP 2018-036885 A

However, when the processing target data is divided by a method such as the above, the information processing system may distribute a plurality of records having the same key to different files. Therefore, for example, when processing must be performed for each set of records having the same key after the processing target data has been divided, the business operator may be unable to reduce the time required for the processing performed on the processing target data.

Therefore, in one aspect, an object of the present invention is to provide an information processing program, an information processing device, and an information processing method that make it possible to flatten the data size of each file without distributing records having the same key across different files.

In one aspect of the embodiment, a computer is caused to execute a process including: upon receiving input of data, dividing the data into a plurality of records; sequentially classifying each of the plurality of records into the record group corresponding to the key included in that record, based on the keys included in the divided records; when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a file that differs for each record group corresponding to the key included in the record; when the number of keys corresponding to the record groups exceeds the number of files, outputting those classified records for which a record including the same key has already been output to a file to a file that differs for each record group corresponding to the key included in the record; identifying, among a plurality of files randomly extracted from the files corresponding to the number of files, the file having the smallest file size; and, when the file size of the identified file is equal to or less than the average file size of the files corresponding to the number of files, outputting to the identified file those classified records for which no record including the same key has yet been output to a file.

According to one aspect, it is possible to flatten the data size of each file without distributing records having the same key across different files.

FIG. 1 is a diagram illustrating the configuration of the information processing system 10.
FIG. 2 is a diagram illustrating the hardware configuration of the information processing device 1.
FIG. 3 is a block diagram of the functions of the information processing device 1.
FIG. 4 is a flowchart illustrating an outline of the file output process according to the first embodiment.
FIG. 5 is a flowchart illustrating an outline of the file output process according to the first embodiment.
FIG. 6 is a diagram illustrating an outline of the file output process according to the first embodiment.
FIG. 7 is a diagram illustrating an outline of the file output process according to the first embodiment.
FIG. 8 is a diagram illustrating an outline of the file output process according to the first embodiment.
FIG. 9 is a diagram illustrating an outline of the file output process according to the first embodiment.
FIG. 10 is a diagram illustrating an outline of the file output process according to the first embodiment.
FIG. 11 is a diagram illustrating an outline of the file output process according to the first embodiment.
FIG. 12 is a flowchart illustrating details of the file output process according to the first embodiment.
FIG. 13 is a flowchart illustrating details of the file output process according to the first embodiment.
FIG. 14 is a flowchart illustrating details of the file output process according to the first embodiment.
FIG. 15 is a flowchart illustrating details of the file output process according to the first embodiment.
FIG. 16 is a flowchart illustrating details of the file output process according to the first embodiment.
FIG. 17 is a flowchart illustrating details of the file output process according to the first embodiment.
FIG. 18 is a diagram illustrating a specific example of the records divided in the process of S22.
FIG. 19 is a diagram illustrating a specific example of the records output to a file in the process of S43.
FIG. 20 is a diagram illustrating a specific example of the records output to a file in the process of S43.
FIG. 21 is a diagram illustrating a specific example of the records output to a file in the process of S32.
FIG. 22 is a diagram illustrating a specific example of the records before the process of S45 is performed.
FIG. 23 is a diagram illustrating a specific example of the records output to a file in the process of S45.

[Configuration of information processing system]
First, the configuration of the information processing system 10 will be described. FIG. 1 is a diagram illustrating the configuration of the information processing system 10.

The information processing system 10 illustrated in FIG. 1 includes an information processing device 1, an information processing device 2, and an operation terminal 3. The operation terminal 3 can access the information processing device 1 via a network (not shown).

The operation terminal 3 is, for example, one or more PCs (Personal Computers), and is a terminal with which the business operator performs various operations on the information processing device 1.

The information processing device 2 is, for example, one or more physical machines on which an information processing system constructed by a user (hereinafter also referred to as an external system) runs. Specifically, the information processing device 2 runs, for example, a system that performs predetermined processing on processing target data prepared in advance by the user. In the following description, it is assumed that the information processing device 2 (external system) performs parallel processing with a predetermined multiplicity.

The information processing device 1 is, for example, one or more physical machines, and processes the processing target data to be handled by the information processing device 2 into a form suitable for parallel processing. Specifically, the information processing device 1 divides, for example, the plurality of records included in the processing target data into a number of parts according to the multiplicity of the parallel processing. The information processing device 1 then transmits the plurality of files generated by dividing the processing target data to the information processing device 2.

This allows the information processing device 1 to reduce the time required for the processing (parallel processing) performed in the information processing device 2.

However, when the processing target data is divided so that the file sizes of the files are flattened, for example, the information processing device 1 may distribute a plurality of records having the same key to different files. Therefore, for example, when processing must be performed for each set of records having the same key after the processing target data has been divided, the information processing device 1 may be unable to reduce the time required for the processing performed in the information processing device 2.

Therefore, when the information processing device 1 according to the present embodiment receives input of processing target data, it divides the processing target data into a plurality of records. Then, based on the key included in each of the divided records, the information processing device 1 sequentially classifies each of the records into the record group corresponding to the key included in that record.

Subsequently, when the number of keys corresponding to the record groups into which the records have been classified does not exceed a preset number of files (hereinafter also simply referred to as the number of files), the information processing device 1 outputs each of the classified records to a file that differs for each record group corresponding to the key included in the record.

On the other hand, when the number of keys corresponding to the record groups into which the records have been classified exceeds the preset number of files, the information processing device 1 outputs those classified records for which a record including the same key has already been output to a file to a file that differs for each record group corresponding to the key included in the record.

In this case, the information processing device 1 also identifies the file with the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files (hereinafter also referred to as all files). Then, when the file size of the identified file is equal to or less than the average of the file sizes of all files, the information processing device 1 outputs to the identified file those classified records for which no record including the same key has yet been output to a file.

That is, among the files randomly selected from all files, the information processing device 1 identifies the file whose current file size is the smallest as a candidate for the file to which a record having a new key is output. Then, when the file size of the identified file is equal to or less than the average size of all files, the information processing device 1 determines that the file sizes of the randomly selected files are not skewed relative to all files, and outputs the record having the new key to the identified file.

As a result, the information processing device 1 can flatten the data size of each file without distributing records having the same key across different files. Therefore, the information processing device 2 can reduce the time required to process the processing target data even when processing must be performed for each set of records having the same key.

[Hardware configuration of information processing system]
Next, the hardware configuration of the information processing system 10 will be described. FIG. 2 is a diagram illustrating the hardware configuration of the information processing device 1.

As shown in FIG. 2, the information processing device 1 includes a CPU 101 that is a processor, a memory 102, an external interface (I/O unit) 103, and a storage medium 104. These units are connected to one another via a bus 105.

The storage medium 104 has, for example, a program storage area (not shown) that stores a program 110 for performing a process of outputting the plurality of records included in the processing target data to the files (hereinafter also referred to as the file output process). The storage medium 104 also has, for example, a storage unit 130 (hereinafter also referred to as the information storage area 130) that stores information used when performing the file output process. The storage medium 104 may be, for example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive).

The CPU 101 executes the program 110 loaded from the storage medium 104 into the memory 102 to perform the file output process.

The external interface 103 communicates with, for example, the information processing device 2 and the operation terminal 3.

[Functions of information processing system]
Next, the functions of the information processing system 10 will be described. FIG. 3 is a block diagram of the functions of the information processing device 1.

As shown in FIG. 3, in the information processing device 1, hardware such as the CPU 101 and the memory 102 cooperates organically with the program 110 to realize various functions including a data receiving unit 111, a data dividing unit 112, a record classification unit 113, a key determination unit 114, a record output unit 115, an average calculation unit 116, a file identification unit 117, and a file output unit 118.

In addition, as shown in FIG. 3, the information processing device 1 stores in the information storage area 130, for example, a plurality of files 131 containing the plurality of records (records after division) included in the processing target data, and file number information 132, which is information indicating the number of files set in advance by the business operator or a user. In the following description, it is assumed that the file number information 132 is stored in the information storage area 130 in advance.

The data receiving unit 111 receives input of the processing target data. Specifically, the data receiving unit 111 receives, for example, processing target data that the business operator has input via the operation terminal 3.

When the data receiving unit 111 receives input of the processing target data, the data dividing unit 112 divides the data into a plurality of records. Specifically, the data dividing unit 112 divides the processing target data received by the data receiving unit 111 record by record.

Based on the key included in each of the plurality of records divided by the data dividing unit 112, the record classification unit 113 sequentially classifies each of the records into the record group corresponding to the key included in that record.

The key determination unit 114 determines whether the number of keys corresponding to the record groups into which the record classification unit 113 has classified the records exceeds the number corresponding to the file number information 132 stored in the information storage area 130.

When the key determination unit 114 determines that the number of keys corresponding to the record groups into which the record classification unit 113 has classified the records does not exceed the number corresponding to the file number information 132, the record output unit 115 outputs each of the records classified by the record classification unit 113 to a file 131 that differs for each record group corresponding to the key included in the record.

When the key determination unit 114 determines that the number of keys corresponding to the record groups into which the record classification unit 113 has classified the records exceeds the number corresponding to the file number information 132, the record output unit 115 outputs those records classified by the record classification unit 113 for which a record including the same key has already been output to a file 131 to a file 131 that differs for each record group corresponding to the key included in the record.

When the key determination unit 114 determines that the number of keys corresponding to the record groups into which the record classification unit 113 has classified the records exceeds the number corresponding to the file number information 132, the average calculation unit 116 calculates the average of the file sizes (hereinafter also referred to as the average size) of the files 131, whose number corresponds to the file number information 132.

When the key determination unit 114 determines that the number of keys corresponding to the record groups into which the record classification unit 113 has classified the records exceeds the number corresponding to the file number information 132, the file identification unit 117 identifies the file 131 with the smallest file size among a plurality of files 131 randomly extracted from the files 131, whose number corresponds to the file number information 132.

Then, when the file size of the file 131 identified by the file identification unit 117 is equal to or less than the average size calculated by the average calculation unit 116, the record output unit 115 outputs, to the file 131 identified by the file identification unit 117, those records classified by the record classification unit 113 for which no record including the same key has yet been output to a file 131.
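As an illustration only, the check carried out by the average calculation unit 116 and the file identification unit 117 might be sketched in Python as follows. The function name `pick_candidate`, the `sample_size` parameter, and the representation of the file sizes as a list of integers are assumptions made for this sketch; the embodiment does not specify how many files are extracted at random.

```python
import random
from typing import Optional, Sequence


def pick_candidate(sizes: Sequence[int], sample_size: int,
                   rng: random.Random) -> Optional[int]:
    """Return the index of a file that may receive a new key, or None.

    sizes[i] is the current size of file i (hypothetical representation).
    None means the sampled minimum is above the average, so the caller
    would draw a new random sample and try again.
    """
    # Randomly extract several files from all files.
    sample = rng.sample(range(len(sizes)), min(sample_size, len(sizes)))
    # Among the extracted files, identify the one with the smallest size.
    smallest = min(sample, key=lambda i: sizes[i])
    # Accept it only if it is not larger than the average over all files.
    average = sum(sizes) / len(sizes)
    return smallest if sizes[smallest] <= average else None
```

Because the smallest file overall can never be larger than the average, repeated sampling always has at least one acceptable file available to find.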

The file output unit 118 transmits (outputs) each file 131 to the information processing device 2. Specifically, the file output unit 118 transmits the files 131 to the information processing device 2 in descending order of file size, for example.
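The transmission order described for the file output unit 118 amounts to a descending sort by file size. A minimal sketch, assuming the file sizes are available as a mapping from a file label to a size (the function name `send_order` and the labels are hypothetical):

```python
from typing import Dict, List


def send_order(file_sizes: Dict[str, int]) -> List[str]:
    """Return file labels sorted from the largest file to the smallest."""
    return sorted(file_sizes, key=lambda name: file_sizes[name], reverse=True)


# Example with three hypothetical files.
print(send_order({"131a": 6, "131b": 2, "131c": 4}))  # ['131a', '131c', '131b']
```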

[Outline of first embodiment]
Next, an outline of the first embodiment will be described. FIGS. 4 and 5 are flowcharts illustrating an outline of the file output process according to the first embodiment. FIGS. 6 to 11 are diagrams illustrating an outline of the file output process according to the first embodiment.

As shown in FIG. 4, the information processing device 1 waits until it receives input of processing target data (NO in S1). Specifically, the data receiving unit 111 waits, for example, until the business operator inputs processing target data via the operation terminal 3.

When input of processing target data is received (YES in S1), the information processing device 1 divides the data received in the process of S1 into a plurality of records (S2).

Subsequently, based on the key included in each of the plurality of records divided in the process of S2, the information processing device 1 sequentially classifies each of the records into the record group corresponding to the key included in that record (S3).

After that, the information processing device 1 determines whether the number of keys corresponding to the record groups into which the records were classified in the process of S3 exceeds the preset number of files (S4). Specifically, the information processing device 1 refers to the file number information 132 stored in the information storage area 130 and determines whether the number of keys corresponding to the record groups into which the records were classified in the process of S3 exceeds the number corresponding to the file number information 132.

As a result, when it is determined that the number of keys corresponding to the record groups into which the records were classified in the process of S3 does not exceed the preset number of files (NO in S5), the information processing device 1 outputs each of the records classified in the process of S3 to a file 131 that differs for each record group corresponding to the key included in the record (S6).

On the other hand, when it is determined that the number of keys corresponding to the record groups into which the records were classified in the process of S3 exceeds the preset number of files (YES in S5), the information processing device 1, as shown in FIG. 5, outputs those records classified in the process of S3 for which a record including the same key has already been output to a file 131 to a file 131 that differs for each record group corresponding to the key included in the record (S11).

In this case, the information processing device 1 then identifies the file 131 with the smallest file size among a plurality of randomly extracted files 131 (S12).

After that, the information processing device 1 determines whether the file size of the file 131 identified in the process of S12 is equal to or less than the average file size of the files 131 corresponding to the preset number of files (S13).

As a result, when it is determined that the file size of the file 131 identified in the process of S12 is equal to or less than the average file size of the files 131 corresponding to the preset number of files (YES in S14), the information processing device 1 outputs, to the file 131 identified in the process of S12, those records classified in the process of S3 for which no record including the same key has yet been output to a file 131 (S15).

On the other hand, when it is determined that the file size of the file 131 identified in the process of S12 is not equal to or less than the average file size of the files 131 corresponding to the preset number of files (NO in S14), the information processing device 1 performs, for example, the processes from S12 onward again.

That is, in this case, the information processing device 1 once again identifies, for example, the output destination file 131 for those records classified in the process of S3 for which no record including the same key has yet been output to a file 131.

As a result, the information processing device 1 can flatten the data size of each file 131 without distributing records having the same key across different files 131. Therefore, the information processing device 2 can reduce the time required to process the processing target data even when processing must be performed for each set of records having the same key. A specific example of the outline of the file output process according to the first embodiment is described below.
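The flow outlined above can be sketched end to end in Python as follows. This is a minimal sketch under stated assumptions rather than the implementation of the embodiment: the `(key, payload)` record layout, the size measure (key length plus payload length), the `sample_size` default, and the function name `distribute` are all hypothetical, and files are held in memory as lists.

```python
import random
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (key, payload); this layout is an assumption


def distribute(records: Iterable[Record], file_count: int,
               sample_size: int = 2, seed: int = 0) -> List[List[Record]]:
    """Assign records to at most file_count files, keeping each key in one file."""
    rng = random.Random(seed)
    files: List[List[Record]] = []    # contents of each output file
    sizes: List[int] = []             # running size of each output file
    key_to_file: Dict[str, int] = {}  # key -> index of the file that holds it

    for key, payload in records:              # records are handled in order (S3)
        size = len(key) + len(payload)        # assumed size measure
        if key in key_to_file:
            # A record with this key was already output: reuse its file (S6, S11).
            idx = key_to_file[key]
        elif len(files) < file_count:
            # New key while unused files remain: give it a file of its own (S6).
            files.append([])
            sizes.append(0)
            idx = len(files) - 1
        else:
            # New key after all files are in use (S12 to S15).
            average = sum(sizes) / file_count
            while True:
                sample = rng.sample(range(file_count),
                                    min(sample_size, file_count))
                idx = min(sample, key=lambda i: sizes[i])
                if sizes[idx] <= average:     # otherwise draw a new sample
                    break
        key_to_file[key] = idx
        files[idx].append((key, payload))
        sizes[idx] += size
    return files
```

The retry loop ends with probability 1, since at least one file is never larger than the average; a production version would likely bound the number of retries.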

［ファイル出力処理の概略の具体例］ [Specific example of outline of file output process]

図６から図１１は、ファイル出力処理の概略の具体例を説明する図である。以下、ファイル数情報１３２が示す数が５であるものとして説明を行う。また、以下、情報処理装置２（外部システム）における並列処理の多重度が３であるものとして説明を行う。 FIGS. 6 to 11 are diagrams illustrating specific examples of the outline of the file output process. Hereinafter, the description will be made assuming that the number indicated by the file number information 132 is five. Further, in the following description, it is assumed that the multiplicity of parallel processing in the information processing device 2 (external system) is 3.

図６に示す例において、処理対象データには、それぞれ主キーがＤであるレコード（以下、単にレコードＤとも表記する）と、主キーがＥであるレコード（以下、単にレコードＥとも表記する）、主キーがＡであるレコード（以下、単にレコードＡとも表記する）、主キーがＨであるレコード（以下、単にレコードＨとも表記する）、主キーがＤであるレコード、主キーがＧであるレコード（以下、単にレコードＧとも表記する）、主キーがＡであるレコード、主キーがＤであるレコード、主キーがＨであるレコード、及び、主キーがＣであるレコード（以下、単にレコードＣとも表記する）が順に含まれている。 In the example shown in FIG. 6, the processing target data contains, in order, a record whose primary key is D (hereinafter also simply referred to as record D), a record whose primary key is E (hereinafter also simply referred to as record E), a record whose primary key is A (hereinafter also simply referred to as record A), a record whose primary key is H (hereinafter also simply referred to as record H), a record whose primary key is D, a record whose primary key is G (hereinafter also simply referred to as record G), a record whose primary key is A, a record whose primary key is D, a record whose primary key is H, and a record whose primary key is C (hereinafter also simply referred to as record C).

具体的に、情報処理装置１は、この場合、図６に示すように、処理対象データ（「処理対象データ」の枠内のデータ）をレコードごとに分割する。 Specifically, in this case, the information processing device 1 divides the processing target data (data within the frame of “processing target data”) into each record, as shown in FIG.

続いて、情報処理装置１は、例えば、分割したレコード（「分割後レコード」の枠内のデータ）の分類を行う。ここで、図６に示す例では、レコードを含むファイル１３１がまだ存在していない。そのため、情報処理装置１は、例えば、図７に示すように、レコードＤを出力する新たなファイル１３１ａを生成し、「分割後レコード」の枠内における１行目のレコードであるレコードＤを出力する。続いて、情報処理装置１は、例えば、図８に示すように、レコードＥを出力する新たなファイル１３１ｂを生成し、「分割後レコード」の枠内における２行目のレコードであるレコードＥを出力する。 Subsequently, the information processing device 1 classifies, for example, the divided records (the data within the “divided records” frame). Here, in the example shown in FIG. 6, no file 131 containing a record exists yet. Therefore, as shown in FIG. 7 for example, the information processing device 1 generates a new file 131a for record D and outputs to it record D, which is the record on the first line within the “divided records” frame. Subsequently, as shown in FIG. 8 for example, the information processing device 1 generates a new file 131b for record E and outputs to it record E, which is the record on the second line within the “divided records” frame.

その後、情報処理装置１は、例えば、図９に示すように、レコードＡ、レコードＨ及びレコードＧを出力するファイル１３１ｃ、１３１ｄ及び１３１ｅをそれぞれ生成し、「分割後レコード」の枠内における５行目及び８行目のレコードであるレコードＤをファイル１３１ａに出力し、３行目及び７行目のレコードであるレコードＡをファイル１３１ｃに出力し、４行目及び９行目のレコードであるレコードＨをファイル１３１ｄに出力し、６行目のレコードであるレコードＧをファイル１３１ｅに出力する。 After that, as shown in FIG. 9 for example, the information processing device 1 generates files 131c, 131d, and 131e for record A, record H, and record G, respectively. It then outputs record D, which is the record on the 5th and 8th lines within the “divided records” frame, to the file 131a; record A, which is the record on the 3rd and 7th lines, to the file 131c; record H, which is the record on the 4th and 9th lines, to the file 131d; and record G, which is the record on the 6th line, to the file 131e.

ここで、「分割後レコード」の枠内における１０行目のレコードは、出力先のファイル１３１がまだ生成されていないレコードＣである。しかしながら、ファイル数情報１３２が示す数が５であるため、情報処理装置１は、新たなファイル１３１をこれ以上生成することができない。 Here, the record on the 10th line within the “divided record” frame is the record C for which the output destination file 131 has not yet been generated. However, since the number indicated by the file number information 132 is 5, the information processing device 1 cannot generate a new file 131 any more.

そこで、情報処理装置１は、例えば、生成済のファイル１３１（ファイル１３１ａからファイル１３１ｅ）の中から、ランダムにファイル１３１ｂとファイル１３１ｃとを選択する。そして、情報処理装置１は、例えば、選択したファイル１３１ｂ及びファイル１３１ｃのうち、ファイルサイズが最小であるファイル１３１ｂを特定する。その後、情報処理装置１は、特定したファイル１３１ｂのファイルサイズが全ファイル１３１のファイルサイズの平均よりも小さいと判定した場合、例えば、図１０に示すように、「分割後レコード」の枠内における１０行目のレコードであるレコードＣを、特定したファイル１３１ｂに出力する。 Therefore, the information processing apparatus 1 randomly selects the file 131b and the file 131c from the generated files 131 (file 131a to file 131e), for example. Then, the information processing apparatus 1 specifies, for example, the file 131b having the smallest file size from the selected files 131b and 131c. After that, when the information processing apparatus 1 determines that the file size of the specified file 131b is smaller than the average of the file sizes of all the files 131, for example, as shown in FIG. The record C, which is the record on the 10th line, is output to the specified file 131b.

すなわち、情報処理装置１は、この場合、他のキーを有するレコードが既に出力されているファイル１３１のうち、ファイルサイズが他のファイル１３１よりも相対的に小さいと判定できるファイル１３１ｂを特定する。そして、情報処理装置１は、特定したファイル１３１ｂに対してレコードＣ（新たなキーを有するレコード）を出力する。 That is, in this case, the information processing device 1 specifies the file 131b that can be determined to have a file size relatively smaller than that of the other file 131 among the files 131 in which the records having the other keys have already been output. Then, the information processing device 1 outputs the record C (record having a new key) to the specified file 131b.
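The selection in this example can be checked with a short calculation. The following is only an illustrative sketch: it assumes that file size is measured as a record count (the description does not state the unit), it uses the "equal to or smaller than the average" condition of S14, and the names `sizes` and `pick_target` are hypothetical.

```python
from statistics import mean

# Record counts per file after the first nine records of FIG. 6 are written
# (assumption: size is counted in records; no byte sizes are given in the text).
sizes = {"131a": 3, "131b": 1, "131c": 2, "131d": 2, "131e": 1}

def pick_target(sizes, candidates):
    """Return the smallest candidate if it is at or below the overall average, else None."""
    smallest = min(candidates, key=lambda name: sizes[name])
    return smallest if sizes[smallest] <= mean(sizes.values()) else None

# The two files that the example assumes were drawn at random.
target = pick_target(sizes, ["131b", "131c"])
print(target)  # 131b
```

Here the average is 9 / 5 = 1.8 records, so the file 131b (1 record) qualifies. Had the files 131a and 131c been drawn instead, the smaller of the two (2 records) would exceed the average and the selection would be repeated.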

その後、情報処理装置１は、例えば、図１１に示すように、ファイルサイズが大きいファイル１３１から順に、外部システムのアプリケーション（１）から（３）に出力する。そして、アプリケーション（１）から（３）は、例えば、ファイル１３１ａに含まれるレコード、ファイル１３１ｃに含まれるレコード及びファイル１３１ｄに含まれるレコードに対する処理をそれぞれ開始する。 After that, as shown in FIG. 11 for example, the information processing device 1 outputs the files 131 to applications (1) to (3) of the external system in descending order of file size. Applications (1) to (3) then start processing, for example, the records included in the file 131a, the records included in the file 131c, and the records included in the file 131d, respectively.

これにより、情報処理装置２（外部システム）は、ファイルサイズの大きいファイル１３１から順に処理を開始することが可能になる。そのため、情報処理装置２は、処理対象データに対する処理の所要時間をより短縮させることが可能になる。 As a result, the information processing device 2 (external system) can start processing in order from the file 131 with the largest file size. Therefore, the information processing device 2 can further reduce the time required for processing the processing target data.

［第１の実施の形態の詳細］ [Details of First Embodiment]

次に、第１の実施の形態の詳細について説明する。図１２から図１７は、第１の実施の形態におけるファイル出力処理の詳細を説明するフローチャート図である。また、図１８から図２３は、第１の実施の形態におけるファイル出力処理の詳細を説明する図である。 Next, details of the first embodiment will be described. FIGS. 12 to 17 are flowcharts illustrating details of the file output process according to the first embodiment. FIGS. 18 to 23 are diagrams for explaining the details of the file output process in the first embodiment.

情報処理装置１のデータ受付部１１１は、図１２に示すように、処理対象データの入力を受け付けるまで待機する（Ｓ２１のＮＯ）。 As shown in FIG. 12, the data reception unit 111 of the information processing device 1 waits until the input of the processing target data is received (NO in S21).

そして、処理対象データの入力を受け付けた場合（Ｓ２１のＹＥＳ）、情報処理装置１のデータ分割部１１２は、Ｓ２１の処理で入力を受け付けたデータを複数のレコードに分割する（Ｓ２２）。以下、Ｓ２２の処理で分割したレコードの具体例について説明を行う。 Then, when the input of the data to be processed is received (YES in S21), the data dividing unit 112 of the information processing device 1 divides the data received in the process of S21 into a plurality of records (S22). Hereinafter, a specific example of the record divided in the process of S22 will be described.

［Ｓ２２の処理で分割したレコードの具体例］ [Specific Example of Records Divided in S22]

図１８は、Ｓ２２の処理で分割したレコードの具体例を説明する図である。 FIG. 18 is a diagram illustrating a specific example of the records divided in the process of S22.

図１８に示すレコードは、各レコードを識別する「項番」と、各社員の社員番号が記憶される「社員番号」と、各社員が属する部門の部門コードが記憶される「部門コード」と、各社員の氏名が記憶される「氏名」とを項目として有する。また、図１８に示すレコードは、各社員の性別が記憶される「性別」と、各社員の年齢が記憶される「年齢」と、各社員の役職が記憶される「役職」とを項目として有する。なお、以下、図１８に示すレコードに含まれる項目のうち、「部門コード」に記憶された情報が主キーであるものとして説明を行う。 The records shown in FIG. 18 have, as items, an “item number” that identifies each record, an “employee number” that stores the employee number of each employee, a “department code” that stores the department code of the department to which each employee belongs, and a “name” that stores the name of each employee. The records shown in FIG. 18 also have, as items, “gender”, which stores the gender of each employee, “age”, which stores the age of each employee, and “position”, which stores the job title of each employee. In the following description, it is assumed that, among the items included in the records shown in FIG. 18, the information stored in the “department code” is the primary key.

具体的に、図１８に示すレコードにおいて、「項番」が「１」である情報には、「社員番号」として「１０４５６」が記憶され、「部門コード」として「Ｄ部門」が記憶され、「氏名」として「山田一郎」が記憶されている。また、「項番」が「１」である情報には、「性別」として「男」が記憶され、「年齢」として「４１（歳）」が記憶され、「役職」として「課長」が記憶されている。 Specifically, in the records shown in FIG. 18, the information whose “item number” is “1” stores “10456” as the “employee number”, “D department” as the “department code”, and “Ichiro Yamada” as the “name”. The information whose “item number” is “1” also stores “male” as the “gender”, “41 (years old)” as the “age”, and “section manager” as the “position”.

また、図１８に示すレコードにおいて、「項番」が「２」である情報には、「社員番号」として「０８４５１」が記憶され、「部門コード」として「Ｅ部門」が記憶され、「氏名」として「鈴木二郎」が記憶されている。また、「項番」が「２」である情報には、「性別」として「男」が記憶され、「年齢」として「５２（歳）」が記憶され、「役職」として「部長」が記憶されている。 Further, in the records shown in FIG. 18, the information whose “item number” is “2” stores “08451” as the “employee number”, “E department” as the “department code”, and “Jiro Suzuki” as the “name”. The information whose “item number” is “2” also stores “male” as the “gender”, “52 (years old)” as the “age”, and “general manager” as the “position”.

さらに、図１８に示すレコードにおいて、「項番」が「３」である情報には、「社員番号」として「１３４０５」が記憶され、「部門コード」として「Ａ工場」が記憶され、「氏名」として「木村花子」が記憶されている。また、「項番」が「３」である情報には、「性別」として「女」が記憶され、「年齢」として「３６（歳）」が記憶され、「役職」として「なし」が記憶されている。図１８に含まれる他の情報についての説明は省略する。 Furthermore, in the records shown in FIG. 18, the information whose “item number” is “3” stores “13405” as the “employee number”, “A factory” as the “department code”, and “Hanako Kimura” as the “name”. The information whose “item number” is “3” also stores “female” as the “gender”, “36 (years old)” as the “age”, and “none” as the “position”. Descriptions of the other information included in FIG. 18 are omitted.

なお、以下、「部門コード」に「Ｄ部門」が記憶されたレコード、「Ｅ部門」が記憶されたレコード、「Ａ工場」が記憶されたレコード、「Ｈ工場」が記憶されたレコード、「Ｇ部門」が記憶されたレコード及び「Ｃ部門」が記憶されたレコードのそれぞれは、図１１等で説明したレコードＤ、レコードＥ、レコードＡ、レコードＨ、レコードＧ及びレコードＣにそれぞれ対応するものとして説明を行う。そのため、以下、各レコードをそれぞれレコードＤ、レコードＥ、レコードＡ、レコードＨ、レコードＧ及びレコードＣと表記する。 In the following description, the records in which “D department”, “E department”, “A factory”, “H factory”, “G department”, and “C department” are stored as the “department code” are assumed to correspond to record D, record E, record A, record H, record G, and record C described with reference to FIG. 11 and other figures, respectively. These records are therefore referred to below as record D, record E, record A, record H, record G, and record C, respectively.

図１２に戻り、情報処理装置１のレコード分類部１１３は、Ｓ２２の処理で分割したレコードを１つ取得する（Ｓ２３）。具体的に、レコード分類部１１３は、図１８で説明したレコードのうち、「項番」が「１」であるレコードを取得する。 Returning to FIG. 12, the record classification unit 113 of the information processing device 1 acquires one record divided in the process of S22 (S23). Specifically, the record classification unit 113 acquires the record whose "item number" is "1" among the records described in FIG.

そして、レコード分類部１１３は、Ｓ２３の処理で取得したレコードを、そのレコードに含まれるキーに対応するレコード群に分類する（Ｓ２４）。 Then, the record classification unit 113 classifies the record acquired in the process of S23 into a record group corresponding to the key included in the record (S24).

その後、情報処理装置１のキー判定部１１４は、Ｓ２４の処理で分類したレコード群に含まれるレコードがファイル１３１に出力済であるか否かを判定する（Ｓ２５）。 After that, the key determination unit 114 of the information processing device 1 determines whether the records included in the record group classified in the process of S24 have been output to the file 131 (S25).

その結果、図１３に示すように、Ｓ２４の処理で分類したレコード群に含まれるレコードがファイル１３１に出力済であると判定した場合（Ｓ３１のＹＥＳ）、情報処理装置１のレコード出力部１１５は、Ｓ２３の処理で取得したレコードを、そのレコードに含まれるキーに対応するレコード群に対応するファイル１３１に出力する（Ｓ３２）。 As a result, as shown in FIG. 13, when it is determined that a record included in the record group classified in the process of S24 has already been output to a file 131 (YES in S31), the record output unit 115 of the information processing device 1 outputs the record acquired in the process of S23 to the file 131 corresponding to the record group for the key included in that record (S32).

具体的に、例えば、Ｓ２３の処理で取得したレコードがレコードＤである場合であって、レコードＤに対応するファイル１３１ａが生成済である場合、レコード出力部１１５は、Ｓ２３の処理で取得したレコードをそのままファイル１３１ａに出力する。 Specifically, for example, when the record acquired in the process of S23 is record D and the file 131a corresponding to record D has already been generated, the record output unit 115 outputs the record acquired in the process of S23 to the file 131a as is.

続いて、Ｓ２４の処理で分類したレコード群に含まれるレコードがファイル１３１に出力済でないと判定した場合（Ｓ３１のＮＯ）、情報処理装置１のファイル特定部１１７は、図１４に示すように、Ｓ２４の処理でレコードを分類したレコード群のそれぞれに対応するキーの数が、情報格納領域１３０に記憶されたファイル数情報１３２に対応する数を超えているか否かを判定する（Ｓ４１）。 Next, when it is determined that no record included in the record group classified in the process of S24 has been output to a file 131 (NO in S31), the file identifying unit 117 of the information processing device 1 determines, as shown in FIG. 14, whether the number of keys corresponding to the record groups into which records have been classified in the process of S24 exceeds the number corresponding to the file number information 132 stored in the information storage area 130 (S41).

その結果、キーの数がファイル数情報１３２に対応する数を超えていないと判定した場合（Ｓ４２のＮＯ）、レコード出力部１１５は、Ｓ２３の処理で取得したレコードを、新たなファイル１３１に出力する（Ｓ４３）。 As a result, when it is determined that the number of keys does not exceed the number corresponding to the file number information 132 (NO in S42), the record output unit 115 outputs the record acquired in the process of S23 to a new file 131 (S43).

具体的に、例えば、Ｓ２３の処理で取得したレコードがレコードＤである場合であって、レコードＤに対応するファイル１３１ａが生成済でない場合、レコード出力部１１５は、レコードＤに対応するファイル１３１ａを情報格納領域１３０に新たに生成する。そして、レコード出力部１１５は、Ｓ２３の処理で取得したレコードをファイル１３１ａに出力する。以下、Ｓ３２及びＳ４３の処理のさらなる具体例について説明を行う。 Specifically, for example, when the record acquired in the process of S23 is record D and the file 131a corresponding to record D has not yet been generated, the record output unit 115 newly generates the file 131a corresponding to record D in the information storage area 130. The record output unit 115 then outputs the record acquired in the process of S23 to the file 131a. Further specific examples of the processes of S32 and S43 are described below.
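The branch structure of S31 to S43 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `route_record`, the `key_to_file` dictionary, and the `file_N` naming scheme are all hypothetical, and files are represented only by their names.

```python
def route_record(key, key_to_file, max_files):
    """Return (file_name, needs_flattening) for a record with the given key.

    key_to_file maps each key already written to its file name and is
    updated in place when a new file is created.
    """
    if key in key_to_file:                      # S31 YES: same key already written
        return key_to_file[key], False          # S32: reuse that file
    if len(key_to_file) < max_files:            # S42 NO: a new file is still allowed
        name = f"file_{len(key_to_file) + 1}"   # hypothetical naming scheme
        key_to_file[key] = name                 # S43: create the file for this key
        return name, False
    return None, True                           # S42 YES: defer to the flattening process (S44)

# Key sequence of FIG. 6 with a file limit of 5.
key_to_file = {}
results = [route_record(k, key_to_file, max_files=5) for k in "DEAHDGADHC"]
print(results[-1])  # (None, True): record C needs the flattening process
```

In this sketch the caller would register the key in `key_to_file` once the flattening process has chosen a file, so that later records with the same key follow the S32 path and stay in the same file.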

［Ｓ３２及びＳ４３の処理でファイルに出力したレコードの具体例］ [Specific example of records output to files in the processes of S32 and S43]

図１９及び図２０は、Ｓ４３の処理でファイルに出力したレコードの具体例を説明する図である。また、図２１は、Ｓ３２の処理でファイルに出力したレコードの具体例を説明する図である。 FIGS. 19 and 20 are diagrams illustrating a specific example of the record output to the file in the process of S43. In addition, FIG. 21 is a diagram illustrating a specific example of the record output to the file in the process of S32.

具体的に、例えば、図１８で説明したレコードのうち、「項番」が「１」であるレコード（レコードＤ）がＳ２３の処理において取得された場合、レコード出力部１１５は、「項番」が「１」であるレコードを、新たに生成したファイル１３１ａに出力する（Ｓ４３）。そして、この場合、ファイル１３１ａには、図１９に示すように、「項番」が「１」であるレコードのみが含まれる状態になる。 Specifically, for example, when the record whose “item number” is “1” (record D) among the records described with reference to FIG. 18 is acquired in the process of S23, the record output unit 115 outputs the record whose “item number” is “1” to the newly generated file 131a (S43). In this case, as shown in FIG. 19, the file 131a contains only the record whose “item number” is “1”.

次に、例えば、図１８で説明したレコードのうち、「項番」が「２」であるレコード（レコードＥ）がＳ２３の処理において取得された場合、レコード出力部１１５は、「項番」が「２」であるレコードを、新たに生成したファイル１３１ｂに出力する（Ｓ４３）。そして、この場合、ファイル１３１ｂには、図２０（Ｂ）に示すように、「項番」が「２」であるレコードのみが含まれる状態になる。なお、この場合、ファイル１３１ａには、図２０（Ａ）に示すように、「項番」が「１」であるレコードのみが引き続き含まれている。 Next, for example, when the record whose “item number” is “2” (record E) among the records described with reference to FIG. 18 is acquired in the process of S23, the record output unit 115 outputs the record whose “item number” is “2” to the newly generated file 131b (S43). In this case, as shown in FIG. 20(B), the file 131b contains only the record whose “item number” is “2”. Note that, as shown in FIG. 20(A), the file 131a still contains only the record whose “item number” is “1”.

その後、例えば、図１８で説明したレコードのうち、「項番」が「５」であるレコード（レコードＤ）がＳ２３の処理において取得された場合、レコード出力部１１５は、「項番」が「５」であるレコードを、既に生成されているファイル１３１ａに出力する。そして、この場合、ファイル１３１ａには、図２１（Ａ）に示すように、「項番」が「１」であるレコードと「項番」が「５」であるレコードとが含まれる状態になる。なお、この場合、ファイル１３１ｂには、図２１（Ｂ）に示すように、「項番」が「２」であるレコードのみが含まれており、ファイル１３１ｃには、図２０（Ｃ）に示すように、「項番」が「３」であるレコードのみが含まれており、ファイル１３１ｄには、図２０（Ｄ）に示すように、「項番」が「４」であるレコードのみが含まれている。 After that, for example, when the record whose “item number” is “5” (record D) among the records described with reference to FIG. 18 is acquired in the process of S23, the record output unit 115 outputs the record whose “item number” is “5” to the already generated file 131a. In this case, as shown in FIG. 21(A), the file 131a contains the record whose “item number” is “1” and the record whose “item number” is “5”. Note that, in this case, the file 131b contains only the record whose “item number” is “2”, as shown in FIG. 21(B); the file 131c contains only the record whose “item number” is “3”, as shown in FIG. 20(C); and the file 131d contains only the record whose “item number” is “4”, as shown in FIG. 20(D).

図１４に戻り、キーの数がファイル数情報１３２に対応する数を超えていると判定した場合（Ｓ４２のＹＥＳ）、情報処理装置１は、平坦化処理を行う（Ｓ４４）。 Returning to FIG. 14, when it is determined that the number of keys exceeds the number corresponding to the file number information 132 (YES in S42), the information processing device 1 performs the flattening process (S44).

すなわち、Ｓ２４の処理でレコードを分類したレコード群に対応するファイル１３１がまだ生成されていない場合であって、Ｓ２４の処理でレコードを分類したレコード群のそれぞれに対応するキーの数が既に上限に達している場合、情報処理装置１は、Ｓ２４の処理で分類したレコードを出力するための新たなファイル１３１を生成することができない。 That is, when the file 131 corresponding to the record group in which the records are classified in the process of S24 has not been generated yet, and the number of keys corresponding to each of the record groups in which the records are classified in the process of S24 has already reached the upper limit. If the number has reached, the information processing device 1 cannot generate a new file 131 for outputting the records classified in the process of S24.

そこで、情報処理装置１は、この場合、情報処理装置２（外部システム）における処理の平坦化を図りながらレコードの出力を行うことができるファイル１３１を、Ｓ２４の処理で分類したレコードの出力先のファイル１３１として特定する平坦化処理を行う。以下、平坦化処理について説明を行う。 Therefore, in this case, the information processing device 1 performs a flattening process that identifies, as the output destination file 131 for the record classified in the process of S24, a file 131 to which the record can be output while keeping the processing in the information processing device 2 (external system) flattened. The flattening process is described below.

［平坦化処理］ [Flattening process]

図１５から図１７は、平坦化処理を説明するフローチャート図である。 FIGS. 15 to 17 are flowcharts illustrating the flattening process.

情報処理装置１の平均算出部１１６は、図１５に示すように、情報格納領域１３０に記憶されたファイル１３１（生成済の全ファイル１３１）のファイルサイズの平均サイズを算出する（Ｓ５１）。 As shown in FIG. 15, the average calculation unit 116 of the information processing device 1 calculates the average file size of the files 131 (all generated files 131) stored in the information storage area 130 (S51).

そして、ファイル特定部１１７は、例えば、後述するＳ５４以降の処理が実行された回数を示すカウンタ（図示しない）と、Ｓ２４の処理で分類したレコードの出力先のファイル１３１を示す情報（図示しない）とを初期化する（Ｓ５２）。 Then, the file identifying unit 117 initializes, for example, a counter (not shown) indicating the number of times the processing from S54 onward, described later, has been executed, and information (not shown) indicating the output destination file 131 for the record classified in the process of S24 (S52).

続いて、ファイル特定部１１７は、情報格納領域１３０に記憶されたファイル１３１のうちの１つをランダムに選択する（Ｓ５３）。 Subsequently, the file identifying unit 117 randomly selects one of the files 131 stored in the information storage area 130 (S53).

そして、ファイル特定部１１７は、Ｓ５６の処理等を全ファイル１３１について実行したか否かを判定する（Ｓ５４）。 Then, the file identifying unit 117 determines whether or not the processing of S56 and the like have been executed for all the files 131 (S54).

その結果、全ファイル１３１について実行していないと判定した場合（Ｓ５５のＮＯ）、ファイル特定部１１７は、Ｓ５３の処理または後述するＳ６６の処理で最後に選択したファイル１３１の現在のファイルサイズが、Ｓ５３の処理またはＳ６６の処理で選択済のファイル１３１のファイルサイズの中で最小であるか否かを判定する（Ｓ５６）。 As a result, when it is determined that all the files 131 have not been executed (NO in S55), the file identification unit 117 determines that the current file size of the file 131 selected last in the process of S53 or the process of S66 described later is It is determined whether or not the file size of the file 131 selected in the process of S53 or S66 is the smallest (S56).

そして、図１６に示すように、最後に選択したファイル１３１の現在のファイルサイズが最小であると判定した場合（Ｓ６１のＹＥＳ）、ファイル特定部１１７は、Ｓ５３の処理またはＳ６６の処理で最後に選択したファイル１３１を、Ｓ２４の処理で分類したレコードの出力先のファイル１３１として選択する（Ｓ６２）。 Then, as shown in FIG. 16, when it is determined that the current file size of the last selected file 131 is the smallest (YES in S61), the file identifying unit 117 selects the file 131 last selected in the process of S53 or the process of S66 as the output destination file 131 for the record classified in the process of S24 (S62).

一方、最後に選択したファイル１３１の現在のファイルサイズが最小でないと判定した場合（Ｓ６１のＮＯ）、ファイル特定部１１７は、Ｓ６２の処理を行わない。 On the other hand, when it is determined that the current file size of the finally selected file 131 is not the smallest (NO in S61), the file identifying unit 117 does not perform the process of S62.

その後、ファイル特定部１１７は、カウンタに１を追加する（Ｓ６３）。そして、ファイル特定部１１７は、カウンタが示す数が予め定められたループ回数に到達したか否かを判定する（Ｓ６４）。 After that, the file identifying unit 117 adds 1 to the counter (S63). Then, the file identification unit 117 determines whether or not the number indicated by the counter has reached a predetermined loop count (S64).

その結果、カウンタが示す数がループ回数に到達していない場合（Ｓ６５のＮＯ）、ファイル特定部１１７は、例えば、Ｓ５３の処理またはＳ６６の処理で最後に選択したファイル１３１の次のファイル１３１を選択する（Ｓ６６）。 As a result, when the number indicated by the counter has not reached the loop count (NO in S65), the file identifying unit 117 selects, for example, the file 131 next to the file 131 last selected in the process of S53 or the process of S66. Select (S66).

具体的に、ファイル特定部１１７は、例えば、生成順序がＳ５３の処理またはＳ６６の処理で最後に選択したファイル１３１の次であったファイル１３１の選択を行う。なお、ファイル特定部１１７は、例えば、Ｓ５３の処理またはＳ６６の処理において生成順序が最後であるファイル１３１の選択を行っていた場合、ここでは、生成順序が最初であるファイル１３１の選択を行うものであってよい。 Specifically, the file identifying unit 117 selects, for example, the file 131 that was generated immediately after the file 131 last selected in the process of S53 or the process of S66. Note that, when the file 131 selected in the process of S53 or the process of S66 was the last one in the generation order, the file identifying unit 117 may here select the file 131 that is first in the generation order.

そして、情報処理装置１は、Ｓ６６の処理の後、Ｓ５４以降の処理を再度行う。 Then, the information processing apparatus 1 performs the processing of S54 and thereafter again after the processing of S66.

一方、カウンタが示す数がループ回数に到達している場合（Ｓ６５のＹＥＳ）、ファイル特定部１１７は、図１７に示すように、Ｓ６２の処理で選択した出力先のファイル１３１のファイルサイズが、Ｓ５１の処理で算出した平均サイズ以下であるか否かを判定する（Ｓ７１）。 On the other hand, when the number indicated by the counter has reached the number of loops (YES in S65), the file identification unit 117 determines that the file size of the output destination file 131 selected in the process of S62 is as shown in FIG. It is determined whether the average size is equal to or smaller than the average size calculated in the process of S51 (S71).

その結果、Ｓ６２の処理で選択した出力先のファイル１３１のファイルサイズが、Ｓ５１の処理で算出した平均サイズ以下であると判定した場合（Ｓ７２のＹＥＳ）、情報処理装置１は、平坦化処理を終了する。 As a result, when it is determined that the file size of the output destination file 131 selected in the process of S62 is equal to or smaller than the average size calculated in the process of S51 (YES in S72), the information processing device 1 performs the flattening process. finish.

これにより、情報処理装置１は、情報処理装置２（外部システム）における処理の平坦化を図りながらレコードの出力を行うことができるファイル１３１を、Ｓ２４の処理で分類したレコードの出力先のファイル１３１として特定することが可能になる。 As a result, the information processing device 1 can identify, as the output destination file 131 for the record classified in the process of S24, a file 131 to which the record can be output while keeping the processing in the information processing device 2 (external system) flattened.

一方、Ｓ６２の処理で選択した出力先のファイル１３１のファイルサイズが、Ｓ５１の処理で算出した平均サイズ以下でないと判定した場合（Ｓ７２のＮＯ）、情報処理装置１は、Ｓ６６以降の処理を再度行う。 On the other hand, when it is determined that the file size of the output destination file 131 selected in the process of S62 is not less than or equal to the average size calculated in the process of S51 (NO in S72), the information processing device 1 performs the processes in S66 and subsequent processes again. To do.

すなわち、Ｓ６２の処理で選択したファイル１３１のファイルサイズが全ファイル１３１の平均サイズ以下でない場合とは、Ｓ５３の処理やＳ６６の処理において選択したファイル１３１のファイルサイズが全ファイル１３１の中で偏っていた場合である。そのため、情報処理装置１は、この場合、Ｓ６６以降の処理を再度行い、Ｓ２４の処理で分類したレコードの出力先のファイル１３１の選択をさらに行う。 That is, the case where the file size of the file 131 selected in the process of S62 is not equal to or smaller than the average size of all the files 131 is the case where the file sizes of the files 131 selected in the process of S53 or the process of S66 were skewed relative to all the files 131. In this case, the information processing device 1 therefore performs the processing from S66 onward again and continues selecting the output destination file 131 for the record classified in the process of S24.

そして、Ｓ５５の処理において、全ファイル１３１について実行していると判定した場合（Ｓ５５のＹＥＳ）、情報処理装置１は、平坦化処理を終了する。 Then, in the process of S55, when it is determined that all the files 131 are executed (YES in S55), the information processing device 1 ends the flattening process.
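The flow of S51 to S72 can be approximated by the single loop below. This is a hedged sketch rather than the flowchart itself: the function name `choose_output_file`, the `loop_count` parameter name, and the injectable `rng` argument are assumptions made for illustration, and the separate branches of FIGS. 15 to 17 are folded into one `while` loop.

```python
import random

def choose_output_file(sizes, loop_count, rng=random):
    """Pick the file that receives a record whose key has no file yet.

    sizes: current file sizes, listed in file creation order.
    loop_count: number of files examined before the average check (S64).
    Returns the index of the chosen file.
    """
    average = sum(sizes) / len(sizes)          # S51: average over all files
    examined = 0                               # S52: counter
    target = None                              # S52: output destination
    current = rng.randrange(len(sizes))        # S53: random starting file
    while examined < len(sizes):               # S54/S55: stop once all files were examined
        if target is None or sizes[current] < sizes[target]:
            target = current                   # S56/S62: smallest file seen so far
        examined += 1                          # S63
        if examined >= loop_count and sizes[target] <= average:
            break                              # S65 YES and S72 YES
        current = (current + 1) % len(sizes)   # S66: next file in creation order, wrapping
    return target

print(choose_output_file([3, 1, 2, 2, 1], loop_count=2))
```

Because the smallest of all files can never exceed the average, the loop always ends with a file at or below the average size, even when the first `loop_count` files examined happen to be large ones.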

図１４に戻り、レコード出力部１１５は、Ｓ２４の処理で分類したレコードを、Ｓ４４の処理で選択した出力先のファイル１３１（Ｓ６２の処理で最後に選択した出力先のファイル１３１）に出力する（Ｓ４５）。以下、Ｓ４５の処理の具体例について説明を行う。 Returning to FIG. 14, the record output unit 115 outputs the records classified in the process of S24 to the output destination file 131 selected in the process of S44 (the output destination file 131 finally selected in the process of S62) ( S45). Hereinafter, a specific example of the process of S45 will be described.

［Ｓ４５の処理でファイルに出力したレコードの具体例］ [Specific Example of Record Output to File in the Process of S45]

図２２は、Ｓ４５の処理が行われる前のレコードの具体例を説明する図である。また、図２３は、Ｓ４５の処理でファイルに出力したレコードの具体例を説明する図である。 FIG. 22 is a diagram illustrating a specific example of a record before the process of S45 is performed. In addition, FIG. 23 is a diagram illustrating a specific example of the record output to the file in the process of S45.

具体的に、図２２に示す例において、ファイル１３１ａには、図２２（Ａ）に示すように、図１８で説明したレコードのうち、「項番」が「１」であるレコードと「項番」が「５」であるレコードと「項番」が「８」であるレコードとが含まれており、ファイル１３１ｂには、図２２（Ｂ）に示すように、図１８で説明したレコードのうち、「項番」が「２」であるレコードが含まれている。また、図２２に示す例において、ファイル１３１ｃには、図２２（Ｃ）に示すように、図１８で説明したレコードのうち、「項番」が「３」であるレコードと「項番」が「７」であるレコードとが含まれており、ファイル１３１ｄには、図２２（Ｄ）に示すように、図１８で説明したレコードのうち、「項番」が「４」であるレコードと「項番」が「９」であるレコードとが含まれている。さらに、図２２に示す例において、ファイル１３１ｅには、図２２（Ｅ）に示すように、図１８で説明したレコードのうち、「項番」が「６」であるレコードが含まれている。 Specifically, in the example shown in FIG. 22, the file 131a contains, as shown in FIG. 22(A), the records whose “item number” is “1”, “5”, and “8” among the records described with reference to FIG. 18, and the file 131b contains, as shown in FIG. 22(B), the record whose “item number” is “2”. Further, in the example shown in FIG. 22, the file 131c contains, as shown in FIG. 22(C), the records whose “item number” is “3” and “7”, and the file 131d contains, as shown in FIG. 22(D), the records whose “item number” is “4” and “9”. Furthermore, in the example shown in FIG. 22, the file 131e contains, as shown in FIG. 22(E), the record whose “item number” is “6”.

すなわち、図２２に示す例は、ファイル１３１ａからファイル１３１ｅまでのそれぞれ（ファイル数情報１３２に対応する数のファイル１３１のそれぞれ）が既に生成済であり、各ファイル１３１のそれぞれにレコードが出力されている状態を示している。 That is, in the example shown in FIG. 22, each of the files 131a to 131e (each of the files 131 of the number corresponding to the file number information 132) has already been generated, and a record is output to each of the files 131. It shows the state.

そして、図２２に示す状態において、図１８で説明したレコードのうち、「項番」が「１０」であるレコード（レコードＣ）がＳ２３の処理において取得された場合、ファイル１３１ｂには、図２３に示すように、「項番」が「２」であるレコード（レコードＥ）だけでなく、「項番」が「１０」であるレコード（レコードＣ）が含まれる状態になる。 Then, in the state shown in FIG. 22, when the record whose “item number” is “10” (record C) among the records described with reference to FIG. 18 is acquired in the process of S23, the file 131b comes to contain, as shown in FIG. 23, not only the record whose “item number” is “2” (record E) but also the record whose “item number” is “10” (record C).

図１３に戻り、レコード出力部１１５は、Ｓ４５の処理の後、Ｓ２２の処理で分割したレコードの全てがＳ２３の処理において取得済であるか否かを判定する（Ｓ３３）。すなわち、レコード出力部１１５は、Ｓ２２の処理で分割したレコードの全てがいずれかのファイル１３１に出力済であるか否かを判定する。なお、レコード出力部１１５は、Ｓ３２の処理またはＳ４３の処理が行われた場合についても同様に、Ｓ３３の処理を行う。 Returning to FIG. 13, after the process of S45, the record output unit 115 determines whether all the records divided in the process of S22 have been acquired in the process of S23 (S33). That is, the record output unit 115 determines whether all the records divided in the process of S22 have been output to any of the files 131. The record output unit 115 also performs the process of S33 in the same manner when the process of S32 or the process of S43 is performed.

その結果、Ｓ２２の処理で分割したレコードの全てがＳ２３の処理において取得済でないと判定した場合（Ｓ３４のＮＯ）、情報処理装置１は、Ｓ２３以降の処理を再度実行する。 As a result, when it is determined that all the records divided in the process of S22 have not been acquired in the process of S23 (NO in S34), the information processing device 1 executes the processes of S23 and subsequent processes again.

一方、Ｓ２２の処理で分割したレコードの全てがＳ２３の処理において取得済であると判定した場合（Ｓ３４のＹＥＳ）、ファイル出力部１１８は、情報格納領域に記憶されたファイル１３１を、ファイルサイズが大きいファイル１３１から順に、情報処理装置２（外部システム）に対して出力する。 On the other hand, if it is determined that all the records divided in the process of S22 have been acquired in the process of S23 (YES in S34), the file output unit 118 determines that the file 131 stored in the information storage area has the file size of The large files 131 are sequentially output to the information processing device 2 (external system).

これにより、情報処理装置２（外部システム）は、ファイルサイズの大きいファイル１３１から順に処理を開始することが可能になる。そのため、情報処理装置２は、処理対象データに対する処理の所要時間を短縮させることが可能になる。 As a result, the information processing device 2 (external system) can start processing in order from the file 131 with the largest file size. Therefore, the information processing device 2 can shorten the time required for processing the data to be processed.
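The effect of handing over larger files first can be illustrated with a small simulation. The sketch below rests on assumptions that are not stated in the description: processing time is taken to be proportional to file size, each file goes to whichever application becomes idle first, and the function name `simulate_dispatch` and the sample sizes are hypothetical.

```python
import heapq

def simulate_dispatch(file_sizes, workers, largest_first=True):
    """Return the finish time when each file goes to the first idle worker.

    Assumption: processing time is proportional to file size.
    """
    order = sorted(file_sizes, reverse=True) if largest_first else list(file_sizes)
    busy_until = [0] * workers                 # min-heap of worker finish times
    heapq.heapify(busy_until)
    for size in order:
        start = heapq.heappop(busy_until)      # earliest idle worker
        heapq.heappush(busy_until, start + size)
    return max(busy_until)

sizes = [1, 2, 2, 2, 3]   # hypothetical sizes, listed smallest first
print(simulate_dispatch(sizes, workers=3))                       # 4
print(simulate_dispatch(sizes, workers=3, largest_first=False))  # 5
```

With a multiplicity of 3, starting with the largest files finishes at time 4, whereas handing over the same files smallest first leaves the largest file for last and finishes at time 5.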

このように、本実施の形態における情報処理装置１は、処理対象データの入力を受け付けた際に、その処理対象データを複数のレコードに分割する。そして、情報処理装置１は、分割した複数のレコードのそれぞれに含まれるキーに基づいて、複数のレコードに含まれるレコードのそれぞれを、各レコードに含まれるキーに対応するレコード群に順次分類する。 As described above, when the information processing device 1 according to the present embodiment receives the input of the processing target data, the processing target data is divided into a plurality of records. Then, the information processing device 1 sequentially classifies each of the records included in the plurality of records into a record group corresponding to the key included in each of the records, based on the keys included in each of the plurality of divided records.

続いて、レコードが分類されたレコード群のそれぞれに対応するキーの数が予め設定されたファイル数を超えていない場合、情報処理装置１は、分類されたレコードのそれぞれを、各レコードに含まれるキーに対応するレコード群ごとにそれぞれ異なるファイル１３１に出力する。 Subsequently, when the number of keys corresponding to each of the record groups into which the records are classified does not exceed the preset number of files, the information processing device 1 includes each of the classified records in each record. Each record group corresponding to the key is output to a different file 131.

On the other hand, when the number of keys corresponding to the record groups into which the records have been classified exceeds the number of files, the information processing device 1 outputs the classified records for which a record including the same key has already been output to a file 131, to a different file 131 for each record group corresponding to the key included in the record.

In addition, when the number of keys corresponding to the record groups into which the records have been classified exceeds the number of files, the information processing device 1 identifies the file 131 having the smallest file size among a plurality of files 131 randomly extracted from all the files 131. Then, when the file size of the identified file 131 is less than or equal to the average file size of all the files 131, the information processing device 1 outputs, to the identified file 131, the classified records for which no record including the same key has been output to a file 131.

That is, among the files 131 randomly selected from all the files 131, the information processing device 1 identifies the file 131 whose current file size is the smallest as the candidate file 131 to which records having a new key are output. Then, when the file size of the identified file 131 is less than or equal to the average size of all the files 131, the information processing device 1 determines that the file sizes of the randomly selected files 131 are not skewed relative to all the files 131, and outputs the records having the new key to the identified file 131.

As a result, the information processing device 1 can flatten the data sizes of the files 131 without distributing records having the same key across different files 131. Therefore, the information processing device 2 can shorten the time required to process the processing target data.
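The file selection summarized above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `assign_records`, the `(key, payload)` record shape, the use of payload length as file size, and the default of two sampled files are all assumptions (the source says only that a plurality of files is randomly extracted). The retry when the sampled minimum exceeds the average follows Supplementary Note 2.

```python
import random


def assign_records(records, num_files, sample_size=2, rng=None):
    """Assign (key, payload) records to files, keeping each key in one file."""
    rng = rng or random.Random()
    files = [[] for _ in range(num_files)]
    sizes = [0] * num_files          # current size of each file
    key_to_file = {}                 # key -> index of the file holding it
    for rec_key, payload in records:
        idx = key_to_file.get(rec_key)
        if idx is None:              # first record with this key
            if len(key_to_file) < num_files:
                idx = len(key_to_file)   # key count within file count: own file
            else:
                while True:
                    sampled = rng.sample(range(num_files), sample_size)
                    idx = min(sampled, key=lambda i: sizes[i])
                    if sizes[idx] <= sum(sizes) / num_files:
                        break        # candidate is not above the average
            key_to_file[rec_key] = idx
        files[idx].append((rec_key, payload))
        sizes[idx] += len(payload)
    return files, sizes, key_to_file


# Hypothetical input: four keys, three files.
records = [("a", b"xxxx"), ("b", b"yy"), ("c", b"z"), ("a", b"xx"), ("d", b"ww")]
files, sizes, key_to_file = assign_records(records, num_files=3, rng=random.Random(0))
```

The retry loop terminates with probability 1 because at least one file, the smallest overall, is never larger than the average. Comparing only a random sample instead of scanning every file is the design choice the text describes; the average check guards against a sample that happens to contain only large files.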

Note that the number indicated by the file number information 132 is preferably an integer multiple of the number corresponding to the multiplicity of parallel processing in the information processing device 2.

As a result, the information processing device 2 can reduce the differences in execution end time among the parallel processes. Therefore, the information processing device 2 can further shorten the time required to process the processing target data.
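A short calculation illustrates why an integer multiple helps. It assumes, for illustration only, that all files are of equal size (the ideal outcome of flattening) so that processing proceeds in rounds of equal length; the function name `idle_slots_in_last_round` is hypothetical.

```python
import math


def idle_slots_in_last_round(num_files, multiplicity):
    """Workers left idle in the final round, assuming equal-size files."""
    rounds = math.ceil(num_files / multiplicity)
    return rounds * multiplicity - num_files


print(idle_slots_in_last_round(8, 4))  # file count is a multiple of 4: prints 0
print(idle_slots_in_last_round(9, 4))  # file count is not a multiple of 4: prints 3
```

With 8 files and a multiplicity of 4, every parallel process is busy in both rounds. With 9 files, a third round is needed in which three of the four processes sit idle.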

The embodiments described above are summarized in the following supplementary notes.

(Supplementary Note 1)
An information processing program that causes a computer to execute a process comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files; and
when the file size of the identified file is less than or equal to the average file size of the files corresponding to the number of files, outputting, to the identified file, the classified records for which no record including the same key has been output to the files.

(Supplementary Note 2)
The information processing program according to Supplementary Note 1, wherein,
in the process of outputting the records that have not been output to the files, when the file size of the identified file is not less than or equal to the average file size of the files into which the records have been classified, the identifying process and the process of outputting the records that have not been output to the files are performed again.

(Supplementary Note 3)
An information processing program that causes a computer to execute a process comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting those of the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files;
when the file size of the identified file is less than or equal to the average file size of the files into which the records have been classified, outputting, to the identified file, those of the classified records for which no record including the same key has been output to the files; and
when all of the plurality of records have been output to the files corresponding to the number of files, outputting the files corresponding to the number of files to the outside.

(Supplementary Note 4)
The information processing program according to Supplementary Note 3, wherein
the process of outputting to the outside includes:
determining an output order of the files corresponding to the number of files based on the file size of each file; and
outputting each of the files corresponding to the number of files in accordance with the determined output order.

(Supplementary Note 5)
The information processing program according to Supplementary Note 4, wherein
the determining process sets the output order of the files corresponding to the number of files to descending order of file size.

(Supplementary Note 6)
The information processing program according to Supplementary Note 5, wherein
the outside is an external program that performs a predetermined number of parallel processes.

(Supplementary Note 7)
The information processing program according to Supplementary Note 6, wherein
the number of files is an integer multiple of the predetermined number.

(Supplementary Note 8)
An information processing device comprising:
a data division unit that, upon receiving input of data, divides the data into a plurality of records;
a record classification unit that sequentially classifies each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
a record output unit that, when the number of keys corresponding to the record groups does not exceed a preset number of files, outputs each of the classified records to a different file for each record group corresponding to the key included in the record, and that, when the number of keys corresponding to the record groups exceeds the number of files, outputs the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record; and
a file identification unit that identifies the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files, wherein
when the file size of the identified file is less than or equal to the average file size of the files corresponding to the number of files, the record output unit outputs, to the identified file, the classified records for which no record including the same key has been output to the files.

(Supplementary Note 9)
An information processing device comprising:
a data division unit that, upon receiving input of data, divides the data into a plurality of records;
a record classification unit that sequentially classifies each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
a record output unit that, when the number of keys corresponding to the record groups does not exceed a preset number of files, outputs each of the classified records to a different file for each record group corresponding to the key included in the record, and that, when the number of keys corresponding to the record groups exceeds the number of files, outputs those of the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record; and
a file identification unit that identifies the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files, wherein
when the file size of the identified file is less than or equal to the average file size of the files into which the records have been classified, the record output unit outputs, to the identified file, those of the classified records for which no record including the same key has been output to the files,
the information processing device further comprising a file output unit that, when all of the plurality of records have been output to the files corresponding to the number of files, outputs the files corresponding to the number of files to the outside.

(Supplementary Note 10)
The information processing device according to Supplementary Note 9, wherein
the file output unit:
determines an output order of the files corresponding to the number of files based on the file size of each file; and
outputs each of the files corresponding to the number of files in accordance with the determined output order.

(Supplementary Note 11)
An information processing method comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files; and
when the file size of the identified file is less than or equal to the average file size of the files corresponding to the number of files, outputting, to the identified file, the classified records for which no record including the same key has been output to the files.

(Supplementary Note 12)
An information processing method comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting those of the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files;
when the file size of the identified file is less than or equal to the average file size of the files into which the records have been classified, outputting, to the identified file, those of the classified records for which no record including the same key has been output to the files; and
when all of the plurality of records have been output to the files corresponding to the number of files, outputting the files corresponding to the number of files to the outside.

(Supplementary Note 13)
The information processing method according to Supplementary Note 12, wherein
the process of outputting to the outside includes:
determining an output order of the files corresponding to the number of files based on the file size of each file; and
outputting each of the files corresponding to the number of files in accordance with the determined output order.

1: Information processing device
2: Information processing system
3: Operating terminal
10: Information processing system

Claims

An information processing program that causes a computer to execute a process comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files; and
when the file size of the identified file is less than or equal to the average file size of the files corresponding to the number of files, outputting, to the identified file, the classified records for which no record including the same key has been output to the files.

The information processing program according to claim 1, wherein,
in the process of outputting the records that have not been output to the files, when the file size of the identified file is not less than or equal to the average file size of the files into which the records have been classified, the identifying process and the process of outputting the records that have not been output to the files are performed again.

An information processing program that causes a computer to execute a process comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting those of the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files;
when the file size of the identified file is less than or equal to the average file size of the files into which the records have been classified, outputting, to the identified file, those of the classified records for which no record including the same key has been output to the files; and
when all of the plurality of records have been output to the files corresponding to the number of files, outputting the files corresponding to the number of files to the outside.

The information processing program according to claim 3, wherein
the process of outputting to the outside includes:
determining an output order of the files corresponding to the number of files based on the file size of each file; and
outputting each of the files corresponding to the number of files in accordance with the determined output order.

The information processing program according to claim 4, wherein
the determining process sets the output order of the files corresponding to the number of files to descending order of file size.

The information processing program according to claim 5, wherein
the outside is an external program that performs a predetermined number of parallel processes.

The information processing program according to claim 6, wherein
the number of files is an integer multiple of the predetermined number.

An information processing device comprising:
a data division unit that, upon receiving input of data, divides the data into a plurality of records;
a record classification unit that sequentially classifies each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
a record output unit that, when the number of keys corresponding to the record groups does not exceed a preset number of files, outputs each of the classified records to a different file for each record group corresponding to the key included in the record, and that, when the number of keys corresponding to the record groups exceeds the number of files, outputs the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record; and
a file identification unit that identifies the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files, wherein
when the file size of the identified file is less than or equal to the average file size of the files corresponding to the number of files, the record output unit outputs, to the identified file, the classified records for which no record including the same key has been output to the files.

An information processing device comprising:
a data division unit that, upon receiving input of data, divides the data into a plurality of records;
a record classification unit that sequentially classifies each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
a record output unit that, when the number of keys corresponding to the record groups does not exceed a preset number of files, outputs each of the classified records to a different file for each record group corresponding to the key included in the record, and that, when the number of keys corresponding to the record groups exceeds the number of files, outputs those of the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record; and
a file identification unit that identifies the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files, wherein
when the file size of the identified file is less than or equal to the average file size of the files into which the records have been classified, the record output unit outputs, to the identified file, those of the classified records for which no record including the same key has been output to the files,
the information processing device further comprising a file output unit that, when all of the plurality of records have been output to the files corresponding to the number of files, outputs the files corresponding to the number of files to the outside.

An information processing method comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files; and
when the file size of the identified file is less than or equal to the average file size of the files corresponding to the number of files, outputting, to the identified file, the classified records for which no record including the same key has been output to the files.

An information processing method comprising:
upon receiving input of data, dividing the data into a plurality of records;
sequentially classifying each record of the plurality of records into a record group corresponding to the key included in that record, based on the key included in each of the divided records;
when the number of keys corresponding to the record groups does not exceed a preset number of files, outputting each of the classified records to a different file for each record group corresponding to the key included in the record;
when the number of keys corresponding to the record groups exceeds the number of files, outputting those of the classified records for which a record including the same key has already been output to the files, to a different file for each record group corresponding to the key included in the record;
identifying the file having the smallest file size among a plurality of files randomly extracted from the files corresponding to the number of files;
when the file size of the identified file is less than or equal to the average file size of the files into which the records have been classified, outputting, to the identified file, those of the classified records for which no record including the same key has been output to the files; and
when all of the plurality of records have been output to the files corresponding to the number of files, outputting the files corresponding to the number of files to the outside.