JP2021125181A - Information processing method

Info

Publication number: JP2021125181A
Application number: JP2020020543A
Authority: JP
Inventors: 政典澤; Masanori Sawa
Original assignee: NEC Corp; NEC Solution Innovators Ltd
Current assignee: NEC Corp; NEC Solution Innovators Ltd
Priority date: 2020-02-10
Filing date: 2020-02-10
Publication date: 2021-08-30
Anticipated expiration: 2040-02-10
Also published as: JP7452840B2

Abstract

To provide an information processing method, an information processing device, and a program that prevent a backup time delay when a file name or a directory name is changed.

SOLUTION: In a system, an information processing device (backup source device 100) obtains first information in which a path name of data that is stored in a backup destination region and unique information that is uniquely given to each piece of the data are associated with each other; obtains second information in which a path name of data that is stored in a backup source region and unique information that is uniquely given to each piece of the data are associated with each other; compares the obtained first information and second information; and performs processing based on a comparison result for the path name of the data stored in the backup destination region before duplicating the data stored in the backup source region, in the backup destination region.

SELECTED DRAWING: Figure 2

Description

The present invention relates to an information processing method, an information processing device, and a program.

Backups are sometimes performed file by file between file systems.

Such backups can be performed using, for example, the "rsync" command on UNIX (registered trademark) or the "robocopy" command on Windows (registered trademark). Patent Document 1 is one example that uses the "rsync" and "robocopy" commands. Non-Patent Document 1 describes the "rsync" command, and Non-Patent Document 2 describes the "robocopy" command.

Japanese Unexamined Patent Publication No. 2012-83880

rsync [online], [retrieved January 17, 2020], Internet <URL: https://rsync.samba.org/>
Microsoft [online], [retrieved January 17, 2020], Internet <URL: https://docs.microsoft.com/ja-jp/windows-server/administration/windows-commands/robocopy>

In methods that use the "rsync" or "robocopy" command as described in Patent Document 1, Non-Patent Document 1, and Non-Patent Document 2, every time a backup source file is renamed or its directory is changed, the file no longer exists at the corresponding path in the backup destination, so the file is deleted and recopied. As a result, merely changing a file name or a directory name causes data in the backup destination area to be deleted and recopied, which makes the backup needlessly slow.

Therefore, an object of the present invention is to provide an information processing method, an information processing device, and a program that solve the problem that the backup time may become longer when a file name or a directory name is changed.

To achieve this object, an information processing method according to one embodiment of the present invention is configured such that
an information processing device
acquires first information that associates the path name of data stored in a backup destination area with unique information uniquely given to each piece of data,
acquires second information that associates the path name of data stored in a backup source area with unique information uniquely given to each piece of data,
collates the acquired first information with the acquired second information, and
before replicating the data stored in the backup source area to the backup destination area, performs processing based on the collation result for the path name of the data stored in the backup destination area.

An information processing device according to another embodiment of the present invention is configured to have:
an acquisition unit that acquires first information that associates the path name of data stored in a backup destination area with unique information uniquely given to each piece of data, and second information that associates the path name of data stored in a backup source area with unique information uniquely given to each piece of data;
a collation unit that collates the first information acquired by the acquisition unit with the second information; and
a preprocessing unit that, before the data stored in the backup source area is replicated to the backup destination area, performs processing based on the collation result for the path name of the data stored in the backup destination area.

A program according to another embodiment of the present invention is a program for causing an information processing device to realize:
an acquisition unit that acquires first information that associates the path name of data stored in a backup destination area with unique information uniquely given to each piece of data, and second information that associates the path name of data stored in a backup source area with unique information uniquely given to each piece of data;
a collation unit that collates the first information acquired by the acquisition unit with the second information; and
a preprocessing unit that, before the data stored in the backup source area is replicated to the backup destination area, performs processing based on the collation result for the path name of the data stored in the backup destination area.

Configured as described above, the present invention makes it possible to provide an information processing method, an information processing device, and a program that solve the problem that the backup time may become longer when a file name or a directory name is changed.

A diagram showing an example of the configuration of the entire system according to the first embodiment of the present invention.
A block diagram showing an example of the configuration of the backup source device shown in FIG. 1.
A diagram showing an example of the index L.
A diagram showing an example of the index B.
A diagram for explaining an example of the collation processing by the index collation unit.
A diagram for explaining an example of the preprocessing by the preprocessing unit.
A block diagram showing an example of the configuration of the backup destination device.
A flowchart showing an example of the operation of the backup source device in the first embodiment of the present invention.
A diagram showing an example of the hardware configuration of the information processing device in the second embodiment of the present invention.
A block diagram showing an example of the configuration of the information processing device in the second embodiment of the present invention.

[First Embodiment]
The first embodiment of the present invention will be described with reference to FIGS. 1 to 8. FIG. 1 shows an example of the configuration of the entire system according to the first embodiment of the present invention. FIG. 2 is a block diagram showing an example of the configuration of the backup source device 100. FIG. 3 is a diagram showing an example of the index L. FIG. 4 is a diagram showing an example of the index B. FIG. 5 is a diagram for explaining an example of the collation processing by the index collation unit 120. FIG. 6 is a diagram for explaining an example of the preprocessing by the preprocessing unit 130. FIG. 7 is a block diagram showing an example of the configuration of the backup destination device 200. FIG. 8 is a flowchart showing an example of the operation of the backup source device 100.

The first embodiment of the present invention describes a system that copies files from a backup source to a backup destination. As described later, the backup source device 100 generates the index L, which is the second information, based on the path names and checksums of the files in the backup source area, and generates the index B, which is the first information, based on the path names and checksums of the files in the backup destination area. The backup source device 100 then collates the generated index L with the index B. Based on the collation result, it performs predetermined preprocessing and then copies files under a no-overwrite condition. Because the preprocessing based on the collation result is performed first, the backup source device 100 suppresses the needless deletion and recopying that would otherwise follow a change of file name or directory name. As a result, the delay in backup time after a file name or directory name change can be suppressed even when, for example, RAID, snapshots, or deduplication-capable storage is not introduced.

FIG. 1 shows an example of the configuration of the entire system. Referring to FIG. 1, the system has, for example, a backup source device 100 and a backup destination device 200. As shown in FIG. 1, the backup source device 100 and the backup destination device 200 are connected so that they can communicate with each other.

The backup source device 100 is an information processing device that stores data such as files. The backup source device 100 may be, for example, a personal computer, a tablet, or a smartphone.

FIG. 2 shows an example of the configuration of the backup source device 100. Referring to FIG. 2, the backup source device 100 has, for example, an index generation unit 110, an index collation unit 120, a preprocessing unit 130, and a file copy unit 140. The backup source device 100 also has a storage device 150. Backup source area files 151 are stored in the backup source area formed in the storage device 150.

The backup source device 100 has, for example, an arithmetic unit such as a CPU (Central Processing Unit) and a storage device. For example, the backup source device 100 realizes the processing units described above when the arithmetic unit executes a program stored in the storage device.

The index generation unit 110 generates the index L based on the backup source area files 151 stored in the storage device 150. The index generation unit 110 also generates the index B based on the backup destination area files 221 stored in the storage device 220, which is described later.

For example, the index generation unit 110 acquires, for each backup source area file 151 stored in the backup source area formed in the storage device 150, the path name and a checksum calculated from the file contents. The index generation unit 110 then generates the index L from the acquired path names and checksums.

FIG. 3 shows an example of the index L generated by the index generation unit 110. Referring to FIG. 3, the index L associates the path name of each file with the checksum of that file. For example, the first line of FIG. 3 indicates that the checksum of the file with the path name "/directoryA/directoryC/file.text" is "62".
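
As an illustration of an index that associates path names with checksums, the following is a minimal sketch in Python. The function names `file_checksum` and `build_index`, the use of SHA-256 as the checksum, and the choice of paths relative to the area root are assumptions made for this example; the description does not specify a checksum algorithm or an index format.

```python
import hashlib
import os


def file_checksum(path, chunk_size=1 << 20):
    """Return a hex digest computed from the file contents.

    SHA-256 is an assumption; the description only says "checksum".
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root):
    """Map each file's path (relative to root, '/'-separated) to its checksum."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = "/" + os.path.relpath(full, root).replace(os.sep, "/")
            index[rel] = file_checksum(full)
    return index
```

Running `build_index` once on the backup source area and once on the backup destination area would yield dictionaries playing the roles of the index L and the index B, respectively.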

The index generation unit 110 also acquires the path names and checksums of the backup destination area files 221 from the backup destination device 200, and generates the index B from the acquired path names and checksums.

FIG. 4 shows an example of the index B generated by the index generation unit 110. Referring to FIG. 4, the index B, like the index L, associates the path name of each file with the checksum of that file. For example, the first line of FIG. 4 indicates that the checksum of the file with the path name "/directoryA/directoryB/file.text" is "23".

The checksum may be calculated in advance from the file contents, or the index generation unit 110 may be configured to calculate it from the file contents. The index generation unit 110 may also be configured to receive an index B generated by the backup destination device 200. In other words, the index generation unit 110 may be configured to receive an index B generated by an external device.

The index collation unit 120 collates the index L generated by the index generation unit 110 with the index B. For example, the index collation unit 120 performs this collation by checking, for each combination of path name and checksum included in the index B, whether the same path name and the same checksum are included in the index L.

FIG. 5 shows an example of the collation processing by the index collation unit 120. Referring to FIG. 5, for the combination of path name "/directoryA/directoryB/file.txt" and checksum "23" included in the index B, the index collation unit 120 checks whether the same path name or checksum is included in the index L. In the case shown in FIG. 5, the index L contains neither the path name "/directoryA/directoryB/file.txt" nor the checksum "23". The index collation unit 120 therefore determines that, for this combination, the number of checksum matches is 0 and the number of path name matches is also 0.

Similarly, for the combination of path name "/directoryA/directoryC/file.txt" and checksum "40" included in the index B, the index collation unit 120 checks whether the same path name or checksum is included in the index L. In the case shown in FIG. 5, the index L contains the path name "/directoryA/directoryC/file.txt" but not the checksum "40". The index collation unit 120 therefore determines that, for the combination of path name "/directoryA/directoryC/file.txt" and checksum "40", the number of checksum matches is 0 and the number of path name matches is 1.

Similarly, for the combination of path name "/directoryD/directoryE/file.txt" and checksum "55" included in the index B, the index collation unit 120 checks whether the same path name or checksum is included in the index L. In the case shown in FIG. 5, the index L contains the checksum "55" but not the path name "/directoryD/directoryE/file.txt". The index collation unit 120 therefore determines that, for this combination, the number of checksum matches is 1 and the number of path name matches is 0.

Similarly, for the combination of path name "/directoryF/directoryG/file.txt" and checksum "77" included in the index B, the index collation unit 120 checks whether the same path name or checksum is included in the index L. In the case shown in FIG. 5, the index L contains both the path name "/directoryF/directoryG/file.txt" and the checksum "77". The index collation unit 120 therefore determines that, for this combination, the number of checksum matches is 1 and the number of path name matches is 1.

Similarly, for the combination of path name "/directoryH/directoryI/file.txt" and checksum "13" included in the index B, the index collation unit 120 checks whether the same path name or checksum is included in the index L. In the case shown in FIG. 5, the index L contains the checksum "13" twice but does not contain the path name "/directoryH/directoryI/file.txt". The index collation unit 120 therefore determines that, for this combination, the number of checksum matches is 2 and the number of path name matches is 0.

Similarly, for the combination of path name "/directoryA/directoryH/file.txt" and checksum "39" included in the index B, the index collation unit 120 checks whether the same path name or checksum is included in the index L. In the case shown in FIG. 5, the index L contains the path name "/directoryA/directoryH/file.txt" and contains the checksum "39" twice. The index collation unit 120 therefore determines that, for this combination, the number of checksum matches is 2 and the number of path name matches is 1.

As described above, the index collation unit 120 collates the index L with the index B by checking, for each combination of path name and checksum included in the index B, whether the same path name and the same checksum are included in the index L. In other words, for each combination of path name and checksum included in the index B, the index collation unit 120 determines two counts against the index L: the number of path name matches, which is the number of entries whose path name matches, and the number of checksum matches, which is the number of entries whose checksum matches.
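
The counting described above can be sketched as follows. This is only an illustrative Python sketch: the function name `collate` and the dictionary representation of the indexes are assumptions, and the sample data are reconstructed from the values quoted in the description of FIG. 5 (the checksum "62" for the directoryC entry of the index L is taken from the description of FIG. 3 and is likewise an assumption about FIG. 5).

```python
from collections import Counter

# Sample indexes reconstructed from the text (path name -> checksum).
INDEX_B = {
    "/directoryA/directoryB/file.txt": "23",
    "/directoryA/directoryC/file.txt": "40",
    "/directoryD/directoryE/file.txt": "55",
    "/directoryF/directoryG/file.txt": "77",
    "/directoryH/directoryI/file.txt": "13",
    "/directoryA/directoryH/file.txt": "39",
}
INDEX_L = {
    "/directoryA/directoryC/file.txt": "62",  # assumed value
    "/directoryD/directoryZ/file.txt": "55",
    "/directoryF/directoryG/file.txt": "77",
    "/directoryA/directoryZ/file.txt": "13",
    "/directoryA/directoryY/file.txt": "13",
    "/directoryA/directoryH/file.txt": "39",
    "/directoryJ/directoryJ/file.txt": "39",
}


def collate(index_b, index_l):
    """For each entry of index B, return (checksum matches, path name matches) in index L."""
    checksum_counts = Counter(index_l.values())
    result = {}
    for path, checksum in index_b.items():
        path_matches = 1 if path in index_l else 0
        result[path] = (checksum_counts[checksum], path_matches)
    return result


if __name__ == "__main__":
    for path, counts in collate(INDEX_B, INDEX_L).items():
        print(path, counts)
```

Counting the checksums of the index L once with `Counter` avoids rescanning the index L for every entry of the index B.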

The preprocessing unit 130 performs processing according to the result of the collation by the index collation unit 120. For example, the preprocessing unit 130 performs processing according to the collation result for the path name of each backup destination area file 221 stored in the backup destination area. The processing by the preprocessing unit 130 is performed before the file copy by the file copy unit 140.

For example, when there is a combination for which the index collation unit 120 has determined that the number of checksum matches is 0 (whether the number of path name matches is 0 or 1), the preprocessing unit 130 deletes the backup destination area file 221 corresponding to that combination from the backup destination area. In the case shown in FIG. 5, the combination of path name "/directoryA/directoryB/file.txt" and checksum "23" has 0 checksum matches and 0 path name matches, and the combination of path name "/directoryA/directoryC/file.txt" and checksum "40" has 0 checksum matches and 1 path name match. Therefore, as shown in FIG. 6, the preprocessing unit 130 deletes from the backup destination area the backup destination area files 221 corresponding to these two combinations. In this way, the preprocessing unit 130 deletes from the backup destination area every backup destination area file 221 whose combination the index collation unit 120 has determined to have 0 checksum matches.

Further, when there is a combination for which the index collation unit 120 has determined that the number of checksum matches is 1 and the number of path name matches is 0, the preprocessing unit 130 moves the corresponding backup destination area file 221, within the backup destination area, to the path given in the index L. In the case shown in FIG. 5, the combination of path name "/directoryD/directoryE/file.txt" and checksum "55" has 1 checksum match and 0 path name matches. Therefore, as shown in FIG. 6, the preprocessing unit 130 moves the backup destination area file 221 corresponding to this combination so that it has the path of the corresponding entry in the index L. That is, the preprocessing unit 130 moves the file to "/directoryD/directoryZ/file.txt", the path name associated with the checksum "55" in the index L.

Further, when there is a combination for which the index collation unit 120 has determined that the number of checksum matches is 1 and the number of path name matches is 1, the preprocessing unit 130 performs no preprocessing on the corresponding backup destination area file 221. In the case shown in FIG. 5, the combination of path name "/directoryF/directoryG/file.txt" and checksum "77" has 1 checksum match and 1 path name match. Therefore, as shown in FIG. 6, the preprocessing unit 130 leaves the backup destination area file 221 corresponding to this combination as it is.

Further, when there is a combination with multiple checksum matches, for example one for which the index collation unit 120 has determined that the number of checksum matches is 2 and the number of path name matches is 0, the preprocessing unit 130 duplicates the corresponding backup destination area file 221 as an operation confined to the backup destination area and then changes the paths. In the case shown in FIG. 5, the combination of path name "/directoryH/directoryI/file.txt" and checksum "13" has 2 checksum matches and 0 path name matches. Therefore, as shown in FIG. 6, the preprocessing unit 130 duplicates the backup destination area file 221 corresponding to this combination within the backup destination area. The preprocessing unit 130 then moves the resulting files so that each has the path of a corresponding entry in the index L. In the case shown in FIG. 6, the preprocessing unit 130 moves one backup destination area file 221 to the path name "/directoryA/directoryZ/file.txt" and the other to the path name "/directoryA/directoryY/file.txt". In this way, when there are multiple checksum matches and 0 path name matches, the preprocessing unit 130 duplicates the file according to the number of checksum matches and moves the files according to the multiple path names included in the index L.

また、例えば、前処理部１３０は、インデックス照合部１２０がチェックサム一致数２、パス名一致数１と判断した組み合わせがある場合、当該組み合わせに対応するバックアップ先領域内ファイル２２１を、バックアップ先領域内で閉じた処理として複製した後、パスの変更を行う。例えば、図５で示す場合、パス名「/directoryA/directoryH/file.txt」チェックサム「39」の組み合わせは、チェックサム一致数２、パス名一致数１の組み合わせである。従って、前処理部１３０は、図６で示すように、パス名「/directoryA/directoryH/file.txt」チェックサム「39」の組み合わせに対応するバックアップ先領域内ファイル２２１を、バックアップ先領域において複製する。そして、前処理部１３０は、複製したバックアップ先領域内ファイル２２１が対応するインデックスＬのパスとなるようにファイル移動する。例えば、図６で示す場合、前処理部１３０は、複製の後、一方のバックアップ先領域内ファイル２２１のパス名が「/directoryJ/directoryJ/file.txt」となるようにファイル移動する。また、前処理部１３０は、パス名が一致しているバックアップ先領域内ファイル２２１についてはファイル移動しない。このように、前処理部１３０は、チェックサム一致数が複数ありパス名が一致するものもある場合、チェックサム一致数に応じた複製処理を行うとともに、インデックスＬに含まれるパス名に応じたファイル移動処理をパス名が一致しないファイルに対して行う。 Further, for example, when the index collation unit 120 determines that the number of checksum matches is 2 and the number of path name matches is 1, the preprocessing unit 130 uses the file 221 in the backup destination area corresponding to the combination as the backup destination area. After duplicating as a closed process inside, change the path. For example, in the case shown in FIG. 5, the combination of the path name “/directoryA/directoryH/file.txt” and the checksum “39” is a combination of the checksum match number 2 and the path name match number 1. Therefore, as shown in FIG. 6, the preprocessing unit 130 duplicates the file 221 in the backup destination area corresponding to the combination of the path name “/directoryA/directoryH/file.txt” and the checksum “39” in the backup destination area. do. Then, the preprocessing unit 130 moves the file so that the duplicated file 221 in the backup destination area becomes the path of the corresponding index L. For example, in the case shown in FIG. 6, after duplication, the preprocessing unit 130 moves the file so that the path name of the file 221 in one of the backup destination areas is "/directoryJ/directoryJ/file.txt". Further, the preprocessing unit 130 does not move the file 221 in the backup destination area having the same path name. 
In this way, when there are a plurality of checksum matches and some of the path names match, the preprocessing unit 130 performs duplication processing according to the number of checksum matches and performs file movement processing, according to the path names included in the index L, on the files whose path names do not match.

例えば、前処理部１３０は、以上説明したように、インデックス照合部１２０による照合の結果に応じた処理を行う。つまり、前処理部１３０は、チェックサム一致数の方がパス名一致数よりも多い場合、照合結果に応じた処理として、少なくともパス名を変更するためのファイル移動処理を行う。また、前処理部１３０は、チェックサム一致数が複数ある場合、照合結果に応じた処理として、バックアップ先領域内で閉じた処理として複製する。 For example, as described above, the preprocessing unit 130 performs processing according to the result of the collation by the index collation unit 120. That is, when the number of checksum matches is larger than the number of path name matches, the preprocessing unit 130 performs, as processing according to the collation result, at least file movement processing for changing the path name. Further, when there are a plurality of checksum matches, the preprocessing unit 130 duplicates the file, as processing according to the collation result, in a process closed within the backup destination area.
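The preprocessing rules described above can be illustrated with the following Python sketch. It is not taken from the embodiment: the function name plan_preprocessing, the representation of the index B and the index L as dictionaries mapping a path name to a checksum, and the tuple format of the planned operations are all assumptions made for illustration. The sketch only plans copy and move operations inside the backup destination area, copies directly to the final path name instead of duplicating and then moving, and leaves deletions and path names that already exist in the destination to other processing.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (kind, from_path, to_path); kind is "copy" or "move". Hypothetical format.
Operation = Tuple[str, str, str]


def plan_preprocessing(index_b: Dict[str, str], index_l: Dict[str, str]) -> List[Operation]:
    """Plan operations that stay inside the backup destination area."""
    paths_l = defaultdict(list)  # checksum -> path names in index L
    for path, checksum in index_l.items():
        paths_l[checksum].append(path)
    paths_b = defaultdict(list)  # checksum -> path names in index B
    for path, checksum in index_b.items():
        paths_b[checksum].append(path)

    copies: List[Operation] = []
    moves: List[Operation] = []
    for checksum, dest_paths in paths_b.items():
        wanted = paths_l.get(checksum, [])
        # Path names required by index L that do not exist in the destination yet.
        needed = sorted(p for p in wanted if p not in index_b)
        # Destination files whose path name is no longer used for this content.
        spare = sorted(p for p in dest_paths if p not in wanted)
        # A spare file is renamed to a needed path name.
        moves.extend(("move", s, n) for s, n in zip(spare, needed))
        # Remaining needed path names are served by duplication in the destination.
        source = sorted(dest_paths)[0]
        copies.extend(("copy", source, n) for n in needed[len(spare):])
    # Copies run first so that every copy source still exists when it is read.
    return copies + moves
```

With an index B holding only "/directoryA/directoryH/file.txt" (checksum "39") and an index L holding that path name plus "/directoryJ/directoryJ/file.txt" with the same checksum, the sketch plans a single copy inside the destination, which corresponds to the case of 2 checksum matches and 1 path name match.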

ファイルコピー部１４０は、前処理部１３０による処理の後、必要なファイルコピーを行う。例えば、ファイルコピー部１４０は、記憶装置１５０に形成されたバックアップ元領域からバックアップ先領域に対して、上書きなし（つまり、既存ファイルがあればコピーしない）条件で、ファイルコピーを実施する。上述したように前処理部１３０が前処理を行っているため、ファイルコピー部１４０は、前回のバックアップからファイルの内容が変更されたファイル、または、新規に作成されたファイルについて、ファイルコピーすることになる。 The file copy unit 140 performs the necessary file copies after the processing by the preprocessing unit 130. For example, the file copy unit 140 copies files from the backup source area formed in the storage device 150 to the backup destination area under a no-overwrite condition (that is, a file is not copied if a file already exists at the destination). Because the preprocessing unit 130 has performed the preprocessing as described above, the file copy unit 140 ends up copying only files whose contents have changed since the previous backup and newly created files.

記憶装置１５０は、ディスク装置などの記憶装置である。記憶装置１５０には、バックアップ元領域が形成されている。図２で示すように、バックアップ元領域には、バックアップ元領域内ファイル１５１が格納されている。 The storage device 150 is a storage device such as a disk device. A backup source area is formed in the storage device 150. As shown in FIG. 2, the file 151 in the backup source area is stored in the backup source area.

以上が、バックアップ元装置１００の構成の一例である。 The above is an example of the configuration of the backup source device 100.

バックアップ先装置２００は、ファイルなどのデータを記憶する情報記憶装置である。バックアップ先装置２００は、例えば、ＵＳＢ（Universal Serial Bus）メモリ、外付けハードディスク、ディスク装置などであって構わない。 The backup destination device 200 is an information storage device that stores data such as files. The backup destination device 200 may be, for example, a USB (Universal Serial Bus) memory, an external hard disk, a disk device, or the like.

図７は、バックアップ先装置２００の構成の一例を示している。図７を参照すると、バックアップ先装置２００は、データ制御部２１０を有している。また、バックアップ先装置２００は、記憶装置２２０を有している。記憶装置２２０に形成されたバックアップ先領域内には、バックアップ先領域内ファイル２２１が格納されている。 FIG. 7 shows an example of the configuration of the backup destination device 200. Referring to FIG. 7, the backup destination device 200 has a data control unit 210. Further, the backup destination device 200 has a storage device 220. The file 221 in the backup destination area is stored in the backup destination area formed in the storage device 220.

例えば、バックアップ先装置２００は、ＣＰＵやコントローラＩＣ（Integrated Circuit）などの演算装置と、記憶装置と、を有している。例えば、バックアップ先装置２００は、記憶装置に格納されたプログラムを演算装置が実行することで、上述した処理部を実現する。 For example, the backup destination device 200 has an arithmetic unit such as a CPU and a controller IC (Integrated Circuit), and a storage device. For example, the backup destination device 200 realizes the above-mentioned processing unit by executing the program stored in the storage device by the arithmetic unit.

データ制御部２１０は、バックアップ元装置１００からの指示に基づいて、記憶装置２２０に格納されているバックアップ先領域内ファイル２２１を制御する。例えば、データ制御部２１０は、バックアップ元装置１００からの指示に基づいて、記憶装置２２０に格納されているバックアップ先領域内ファイル２２１のコピーを行ったり、削除を行ったり、ファイル移動を行ったりする。 The data control unit 210 controls the file 221 in the backup destination area stored in the storage device 220 based on instructions from the backup source device 100. For example, based on an instruction from the backup source device 100, the data control unit 210 copies, deletes, or moves the file 221 in the backup destination area stored in the storage device 220.

記憶装置２２０は、メモリ、ディスクなどの記憶装置である。記憶装置２２０には、バックアップ先領域が形成されている。図７で示すように、バックアップ先領域には、バックアップ先領域内ファイル２２１が格納されている。 The storage device 220 is a storage device such as a memory or a disk. A backup destination area is formed in the storage device 220. As shown in FIG. 7, the file 221 in the backup destination area is stored in the backup destination area.

以上が、バックアップ先装置２００の構成の一例である。 The above is an example of the configuration of the backup destination device 200.

続いて、図８を参照して、バックアップ元からバックアップ先へファイルコピーを行う際のバックアップ元装置１００の動作の一例について説明する。 Subsequently, an example of the operation of the backup source device 100 when copying a file from the backup source to the backup destination will be described with reference to FIG.

図８を参照すると、バックアップ元装置１００のインデックス生成部１１０は、バックアップ元領域内に格納されたバックアップ元領域内ファイル１５１のパス名、チェックサムに基づいて、インデックスＬを生成する。また、インデックス生成部１１０は、バックアップ先領域内に格納されたバックアップ先領域内ファイル２２１のパス名、チェックサムに基づいて、インデックスＢを生成する。このように、インデックス生成部１１０は、インデックスＬとインデックスＢを生成する（ステップＳ１０１）。 Referring to FIG. 8, the index generation unit 110 of the backup source device 100 generates the index L based on the path name and checksum of the file 151 in the backup source area stored in the backup source area. Further, the index generation unit 110 generates the index B based on the path name and checksum of the file 221 in the backup destination area stored in the backup destination area. In this way, the index generation unit 110 generates the index L and the index B (step S101).
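As an illustration of step S101, the following Python sketch builds an index that maps each path name to a checksum. The function name build_index, the dictionary representation, and the choice of SHA-256 as the checksum are assumptions made for this example; the embodiment only requires a checksum calculated from the file contents.

```python
import hashlib
from pathlib import Path
from typing import Dict


def build_index(root: str) -> Dict[str, str]:
    """Map each file's path name (relative to root) to a checksum of its contents."""
    base = Path(root)
    index: Dict[str, str] = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            # SHA-256 is an assumed checksum; whole-file reads keep the sketch short.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            index["/" + path.relative_to(base).as_posix()] = digest
    return index
```

Calling build_index once on the backup source area and once on the backup destination area would yield counterparts of the index L and the index B, respectively.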

インデックス照合部１２０は、インデックス生成部１１０が生成したインデックスＬとインデックスＢとを照合する（ステップＳ１０２）。例えば、インデックス照合部１２０は、インデックスＢに含まれるパス名、チェックサムの組み合わせごとに、インデックスＬと照合することで、パス名が一致する数を示すパス名一致数とチェックサムが一致する数を示すチェックサム一致数とを確認する。 The index collation unit 120 collates the index L generated by the index generation unit 110 with the index B (step S102). For example, for each combination of a path name and a checksum included in the index B, the index collation unit 120 collates the combination with the index L to determine the path name match count, which indicates the number of matching path names, and the checksum match count, which indicates the number of matching checksums.
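The collation of step S102 can be sketched in Python as follows. The function name collate and the dictionary representation of the indexes are assumptions. The sketch also assumes that the two counts are taken independently: the path name match count is the number of entries in the index L with the same path name (0 or 1 when path names are dictionary keys), and the checksum match count is the number of entries in the index L with the same checksum.

```python
from collections import Counter
from typing import Dict, Tuple


def collate(index_b: Dict[str, str], index_l: Dict[str, str]) -> Dict[str, Tuple[int, int]]:
    """For each path name in index B, return (path name matches, checksum matches)."""
    checksum_counts = Counter(index_l.values())
    result: Dict[str, Tuple[int, int]] = {}
    for path, checksum in index_b.items():
        path_matches = 1 if path in index_l else 0
        # Counter returns 0 for checksums that do not occur in index L.
        result[path] = (path_matches, checksum_counts[checksum])
    return result
```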

前処理部１３０は、インデックス照合部１２０による照合の結果に応じた前処理を行う（ステップＳ１０３）。例えば、前処理部１３０は、パス名一致数とチェックサム一致数とに応じた前処理を行う。 The pre-processing unit 130 performs pre-processing according to the result of collation by the index collation unit 120 (step S103). For example, the preprocessing unit 130 performs preprocessing according to the number of path name matches and the number of checksum matches.

ファイルコピー部１４０は、前処理部１３０による処理の後、必要なファイルコピーを行う（ステップＳ１０４）。例えば、ファイルコピー部１４０は、記憶装置１５０に形成されたバックアップ元領域からバックアップ先領域に対して、上書きなし（つまり、既存ファイルがあればコピーしない）条件で、ファイルコピーを実施する。 The file copy unit 140 makes a necessary file copy after the processing by the preprocessing unit 130 (step S104). For example, the file copy unit 140 performs file copy from the backup source area formed in the storage device 150 to the backup destination area under the condition that there is no overwriting (that is, if there is an existing file, the file is not copied).
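The no-overwrite copy of step S104 could look like the following Python sketch. The function name copy_without_overwrite and the use of local directories for both areas are assumptions made for illustration; an actual backup destination might be reached through a different interface.

```python
import shutil
from pathlib import Path
from typing import List


def copy_without_overwrite(source_root: str, dest_root: str) -> List[str]:
    """Copy files from source_root to dest_root, skipping files that already exist."""
    src_base, dst_base = Path(source_root), Path(dest_root)
    copied: List[str] = []
    for src in sorted(src_base.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_base)
        dst = dst_base / rel
        if dst.exists():
            continue  # no-overwrite condition: the existing file is kept
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied
```

Because existing files are skipped, only the files that the preprocessing left missing from the backup destination area are actually transferred.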

以上が、バックアップ元からバックアップ先へファイルコピーを行う際のバックアップ元装置１００の動作の一例である。 The above is an example of the operation of the backup source device 100 when copying a file from the backup source to the backup destination.

このように、バックアップ元装置１００は、インデックス照合部１２０と、前処理部１３０と、ファイルコピー部１４０と、を有している。このような構成により、ファイルコピー部１４０は、インデックス照合部１２０による照合の結果に応じて前処理部１３０が前処理を行った後に、必要なファイルコピーを行うことが出来る。その結果、前処理としてファイル移動などを事前に行うことが可能となり、ファイル名やディレクトリ名の変更を行った際に生じるファイル削除・コピーを抑制することが可能となる。これにより、バックアップ時間が遅くなるおそれがある、という課題を解決することが出来る。 As described above, the backup source device 100 has an index collation unit 120, a preprocessing unit 130, and a file copy unit 140. With such a configuration, the file copy unit 140 can perform necessary file copy after the preprocessing unit 130 performs the preprocessing according to the collation result by the index collation unit 120. As a result, it is possible to move files in advance as preprocessing, and it is possible to suppress file deletion / copying that occurs when a file name or directory name is changed. This can solve the problem that the backup time may be delayed.

具体的には、例えば、前処理部１３０は、チェックサム一致数が１でありパス名一致数が０である場合、前処理として、インデックスＬのパスにファイル移動するよう構成されている。このように前処理を行っておくと、例えば、前回バックアップ実施後にパス名だけを変更した際などにおいて、無駄にファイルを削除して無駄なファイルコピーが生じる事態を抑制することが出来る。 Specifically, for example, the preprocessing unit 130 is configured to move a file to the path of index L as preprocessing when the number of checksum matches is 1 and the number of path name matches is 0. By performing the preprocessing in this way, for example, when only the path name is changed after the previous backup is performed, it is possible to suppress a situation in which the file is deleted unnecessarily and a useless file copy occurs.

また、例えば、前処理部１３０は、チェックサム一致数が複数ある場合、バックアップ先領域内に閉じた処理として対象ファイルを複製した後、必要に応じてインデックスＬのパスにファイル移動するよう構成されている。このように前処理を行っておくと、転送処理を抑制することが可能となり、バックアップ時間を高速にすることが可能となる。 Further, for example, when there are a plurality of checksum matches, the preprocessing unit 130 is configured to duplicate the target file as a process closed within the backup destination area and then, as necessary, move the file to the path in the index L. Performing the preprocessing in this way reduces transfer processing and allows the backup to complete faster.

なお、ファイルコピー部１４０は、ファイルコピーを行う際に、インデックスＬなどを活用しても構わない。例えば、インデックスＬにチェックサムが同一のファイルが複数含まれる場合、ファイルコピー部１４０は、バックアップ先領域に対して１つのファイルを転送した後、バックアップ先領域内で閉じた処理として、必要な分転送したファイルを複製するよう構成しても構わない。 The file copy unit 140 may make use of the index L or the like when copying files. For example, when the index L contains a plurality of files with the same checksum, the file copy unit 140 may be configured to transfer one of the files to the backup destination area and then duplicate the transferred file as many times as needed, as a process closed within the backup destination area.
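A minimal Python sketch of this variation is shown below. The function name plan_transfers, the operation labels "transfer" and "copy_in_destination", and the input format (path names missing from the backup destination area mapped to their checksums) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_transfers(missing: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Transfer one file per checksum, then duplicate it inside the destination."""
    groups = defaultdict(list)  # checksum -> path names still to be created
    for path, checksum in sorted(missing.items()):
        groups[checksum].append(path)
    operations: List[Tuple[str, str, str]] = []
    for paths in groups.values():
        first = paths[0]
        # Only the first file of each group crosses from source to destination.
        operations.append(("transfer", first, first))
        for other in paths[1:]:
            operations.append(("copy_in_destination", first, other))
    return operations
```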

また、本実施形態においては、ファイル（データ）ごとに一意に与えられるユニークな情報としてチェックサムを用いる場合について説明した。しかしながら、本発明は、例えば、ファイルの更新日時などの時刻ベースの識別子をチェックサムの代わりに用いるよう構成しても構わない。このように、ユニーク情報は、本実施形態において説明した以外の情報であっても構わない。 Further, in the present embodiment, a case where a checksum is used as unique information uniquely given for each file (data) has been described. However, the present invention may be configured to use a time-based identifier, such as a file modification date, instead of a checksum. As described above, the unique information may be information other than that described in the present embodiment.

また、システムの構成は、本実施形態において説明した場合に限定されない。例えば、バックアップ元装置１００は、複数台の情報処理装置から構成されても構わない。また、バックアップ元装置１００は、例えば、外部装置が生成したインデックスＬとインデックスＢを取得して照合するなど、本実施形態で説明した機能のうちの一部のみを有していても構わない。 Further, the configuration of the system is not limited to the case described in the present embodiment. For example, the backup source device 100 may be composed of a plurality of information processing devices. Further, the backup source device 100 may have only a part of the functions described in the present embodiment, for example, acquiring and collating the index L and the index B generated by the external device.

また、前処理部１３０は、バックアップ先領域内ファイル２２１をバックアップ先領域から削除する際、削除する処理の代わりに、当該ファイルに削除フラグを追加する、または、当該ファイルを世代前フォルダなどの他のフォルダに移動させる、処理などを行うよう構成しても構わない。例えば、/directoryZ/file.txtが削除対象のファイルであるとする。この場合、前処理部１３０は、例えば、上記ファイル（/directoryZ/file.txt）を削除せず、/bak_2019-10-10-12-00-00/directoryZ/file.txtなどに移動させるよう構成することが出来る。上記のように構成することで、例えば、世代前ファイルを入手したい場合に、削除フラグ付きまたは世代前フォルダ内のファイルをバックアップ先からバックアップ元にコピーすることで、世代前のファイルに復元することが可能となる。なお、上記のように構成する場合、バックアップ元装置１００などは、バックアップ先領域の容量に基づいて、削除フラグが追加されているファイルや世代前フォルダ中のファイルなどを削除するよう構成しても構わない。 Further, when deleting a file 221 in the backup destination area from the backup destination area, the preprocessing unit 130 may be configured to, instead of deleting the file, add a deletion flag to the file or move the file to another folder such as a previous-generation folder. For example, suppose that /directoryZ/file.txt is the file to be deleted. In this case, the preprocessing unit 130 can be configured, for example, to move the file (/directoryZ/file.txt) to /bak_2019-10-10-12-00-00/directoryZ/file.txt or the like without deleting it. With this configuration, when a previous-generation file is wanted, for example, it can be restored by copying the file that has the deletion flag, or that is in the previous-generation folder, from the backup destination to the backup source. In this configuration, the backup source device 100 or the like may be configured to delete the files to which the deletion flag has been added, the files in the previous-generation folder, and the like, based on the capacity of the backup destination area.
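The move to a previous-generation folder can be sketched in Python as follows. The function name move_to_generation_folder and its parameters are hypothetical; the folder name pattern follows the example path /bak_2019-10-10-12-00-00/ given above.

```python
import shutil
from datetime import datetime
from pathlib import Path


def move_to_generation_folder(dest_root: str, path_name: str, when: datetime) -> Path:
    """Move a file under a timestamped bak_ folder instead of deleting it."""
    base = Path(dest_root)
    relative = path_name.lstrip("/")
    # Folder name such as bak_2019-10-10-12-00-00, as in the example path above.
    target = base / when.strftime("bak_%Y-%m-%d-%H-%M-%S") / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(base / relative), str(target))
    return target
```

Restoring a previous-generation file would then amount to copying it back from the timestamped folder to the backup source.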

また、例えば、本実施形態において説明したシステムにリストア機能を追加する場合などにおいて、バックアップ先領域で同じチェックサムかつパス名不一致のファイルについて、重複排除的に処理するよう構成しても構わない。例えば、前処理部１３０は、チェックサムが一致するファイルを重複してバックアップ先領域に記憶しないように、重複排除処理を行うことが出来る。 Further, for example, when a restore function is added to the system described in the present embodiment, files having the same checksum and path name mismatch in the backup destination area may be configured to be deduplicated. For example, the preprocessing unit 130 can perform deduplication processing so that files with matching checksums are not duplicated and stored in the backup destination area.
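As a rough illustration of the deduplication idea, the following Python sketch finds path names in the backup destination area that hold identical content. The function name find_redundant_copies and the dictionary representation of the index B are assumptions, and the embodiment does not specify how a redundant copy would be recorded once its data is no longer stored twice (for example, as a reference to the kept file), so the sketch only identifies candidates.

```python
from collections import defaultdict
from typing import Dict, List


def find_redundant_copies(index_b: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each kept path name to the other path names holding the same checksum."""
    groups = defaultdict(list)  # checksum -> path names, in sorted order
    for path in sorted(index_b):
        groups[index_b[path]].append(path)
    # The first path name of each group is kept; the rest are redundant copies.
    return {paths[0]: paths[1:] for paths in groups.values() if len(paths) > 1}
```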

［第２の実施形態］ [Second Embodiment]
次に、本発明の第２の実施形態について、図９、図１０を参照して説明する。図９、図１０は、ストレージ装置５００の構成の一例を示している。 Next, a second embodiment of the present invention will be described with reference to FIGS. 9 and 10. FIGS. 9 and 10 show an example of the configuration of the storage device 500.

図９は、情報処理装置３００のハードウェア構成の一例を示している。図９を参照すると、情報処理装置３００は、１台又は複数台の情報処理装置にて構成されており、一例として、以下のようなハードウェア構成を有している。
・ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）３０１（演算装置）
・ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）３０２（記憶装置）
・ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）３０３（記憶装置）
・ＲＡＭ３０３にロードされるプログラム群３０４
・プログラム群３０４を格納する記憶装置３０５
・情報処理装置外部の記録媒体３１０の読み書きを行うドライブ装置３０６
・情報処理装置外部の通信ネットワーク３１１と接続する通信インタフェース３０７
・データの入出力を行う入出力インタフェース３０８
・各構成要素を接続するバス３０９
FIG. 9 shows an example of the hardware configuration of the information processing device 300. Referring to FIG. 9, the information processing device 300 is composed of one or more information processing devices and has, as an example, the following hardware configuration.
- CPU (Central Processing Unit) 301 (arithmetic unit)
- ROM (Read Only Memory) 302 (storage device)
- RAM (Random Access Memory) 303 (storage device)
- Program group 304 loaded into the RAM 303
- Storage device 305 that stores the program group 304
- Drive device 306 that reads from and writes to a recording medium 310 external to the information processing device
- Communication interface 307 that connects to a communication network 311 external to the information processing device
- Input/output interface 308 for data input and output
- Bus 309 connecting the components

また、情報処理装置３００は、プログラム群３０４をＣＰＵ３０１が取得して当該ＣＰＵ３０１が実行することで、図１０に示す取得部３２１、照合部３２２、前処理部３２３、としての機能を実現することが出来る。なお、プログラム群３０４は、例えば、予め記憶装置３０５やＲＯＭ３０２に格納されており、必要に応じてＣＰＵ３０１がＲＡＭ３０３などにロードして実行する。また、プログラム群３０４は、通信ネットワーク３１１を介してＣＰＵ３０１に供給されてもよいし、予め記録媒体３１０に格納されており、ドライブ装置３０６が該プログラムを読み出してＣＰＵ３０１に供給してもよい。 Further, the information processing device 300 can realize the functions of the acquisition unit 321, the collation unit 322, and the preprocessing unit 323 shown in FIG. 10 by having the CPU 301 acquire and execute the program group 304. The program group 304 is, for example, stored in advance in the storage device 305 or the ROM 302, and the CPU 301 loads it into the RAM 303 or the like and executes it as needed. The program group 304 may also be supplied to the CPU 301 via the communication network 311, or may be stored in advance in the recording medium 310 so that the drive device 306 reads the program and supplies it to the CPU 301.

なお、図９は、情報処理装置３００のハードウェア構成の一例を示しており、情報処理装置３００のハードウェア構成は上述した場合に限定されない。例えば、情報処理装置３００は、ドライブ装置３０６を有さないなど、上述した構成の一部から構成されてもよい。 Note that FIG. 9 shows an example of the hardware configuration of the information processing apparatus 300, and the hardware configuration of the information processing apparatus 300 is not limited to the above case. For example, the information processing device 300 may be configured from a part of the above-described configuration, such as not having the drive device 306.

取得部３２１は、バックアップ先領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第１情報と、バックアップ元領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第２情報と、を取得する。 The acquisition unit 321 acquires first information, in which the path names of the data stored in the backup destination area are associated with unique information uniquely given to each piece of data, and second information, in which the path names of the data stored in the backup source area are associated with unique information uniquely given to each piece of data.

照合部３２２は、取得部３２１が取得した第１情報と第２情報とを照合する。 The collation unit 322 collates the first information acquired by the acquisition unit 321 with the second information.

前処理部３２３は、バックアップ元領域に格納されたデータをバックアップ先領域に複製する際に、照合部３２２が照合した結果に基づく前処理を行う。 When duplicating the data stored in the backup source area to the backup destination area, the preprocessing unit 323 performs preprocessing based on the collation result of the collation unit 322.

このように、情報処理装置３００は、照合部３２２と前処理部３２３を有している。このような構成により、前処理部３２３は、バックアップ元領域に格納されたデータをバックアップ先領域に複製する際に、照合部３２２が照合した結果に基づく前処理を行うことが出来る。これにより、例えば、パス名を変更するデータ移動処理を前処理として行うことなどが可能となる。その結果、例えば、ファイル名やディレクトリ名の変更を行っただけの場合、前処理としてパス名を変更しておくことで、無駄にデータの削除・再コピーが生じる事態を抑制すること可能となり、バックアップ時間の遅延を抑制することが可能となる。 As described above, the information processing apparatus 300 has a collating unit 322 and a preprocessing unit 323. With such a configuration, the preprocessing unit 323 can perform preprocessing based on the collation result of the collation unit 322 when duplicating the data stored in the backup source area to the backup destination area. This makes it possible, for example, to perform a data movement process for changing the path name as a preprocess. As a result, for example, when only changing the file name or directory name, changing the path name as a pre-processing makes it possible to suppress the situation where data is deleted or re-copied unnecessarily. It is possible to suppress the delay in backup time.

なお、上述した情報処理装置３００は、当該情報処理装置３００に所定のプログラムが組み込まれることで実現できる。具体的に、本発明の他の形態であるプログラムは、情報処理装置３００に、バックアップ先領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第１情報と、バックアップ元領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第２情報と、を取得する取得部３２１と、取得部３２１が取得した第１情報と第２情報とを照合する照合部３２２と、バックアップ元領域に格納されたデータをバックアップ先領域に複製する際に、照合部３２２が照合した結果に基づく前処理をデータの複製前に行う前処理部３２３と、を実現するためのプログラムである。 The information processing device 300 described above can be realized by incorporating a predetermined program into the information processing device 300. Specifically, a program according to another aspect of the present invention is a program for causing the information processing device 300 to realize: an acquisition unit 321 that acquires first information, in which the path names of the data stored in the backup destination area are associated with unique information uniquely given to each piece of data, and second information, in which the path names of the data stored in the backup source area are associated with unique information uniquely given to each piece of data; a collation unit 322 that collates the first information and the second information acquired by the acquisition unit 321; and a preprocessing unit 323 that, when the data stored in the backup source area is duplicated to the backup destination area, performs preprocessing based on the result of the collation by the collation unit 322 before the data is duplicated.

また、上述した情報処理装置３００により実行される情報処理方法は、情報処理装置３００が、バックアップ先領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第１情報を取得し、バックアップ元領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第２情報を取得し、取得した第１情報と第２情報とを照合し、バックアップ元領域に格納されたデータをバックアップ先領域に複製する際に、照合結果に基づく前処理をデータの複製前に行う、という方法である。 Further, the information processing method executed by the information processing device 300 described above is a method in which the information processing device 300 acquires first information in which the path names of the data stored in the backup destination area are associated with unique information uniquely given to each piece of data, acquires second information in which the path names of the data stored in the backup source area are associated with unique information uniquely given to each piece of data, collates the acquired first information with the acquired second information, and, when the data stored in the backup source area is duplicated to the backup destination area, performs preprocessing based on the collation result before the data is duplicated.

上述した構成を有する、プログラム、又は、情報処理方法、の発明であっても、上記情報処理装置３００と同様の作用・効果を有するために、上述した本発明の目的を達成することが出来る。 Even the invention of the program or the information processing method having the above-mentioned configuration can achieve the above-mentioned object of the present invention because it has the same action and effect as the above-mentioned information processing apparatus 300.

＜付記＞ <Appendices>
上記実施形態の一部又は全部は、以下の付記のようにも記載されうる。以下、本発明における情報処理方法などの概略を説明する。但し、本発明は、以下の構成に限定されない。 Part or all of the above embodiments may also be described as in the following appendices. An outline of the information processing method and the like according to the present invention is given below. However, the present invention is not limited to the following configurations.

（付記１）
情報処理装置が、
バックアップ先領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第１情報を取得し、
バックアップ元領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第２情報を取得し、
取得した前記第１情報と前記第２情報とを照合し、
バックアップ元領域に格納されたデータをバックアップ先領域に複製する前に、バックアップ先領域に格納されたデータのパス名に対する照合結果に基づく処理を行う
情報処理方法。
（付記２）
請求項１に記載の情報処理方法であって、
照合結果に基づく処理として、パス名を変更するデータ移動処理を行う
情報処理方法。
（付記３）
付記１または付記２に記載の情報処理方法であって、
前記第１情報に含まれるパス名とユニーク情報との組み合わせごとに、同一のパス名、ユニーク情報が前記第２情報に含まれるか否か確認する照合を行う
情報処理方法。
（付記４）
付記１から付記３までのいずれか１項に記載の情報処理方法であって、
前記第１情報と前記第２情報とを照合することで、前記第１情報に含まれるパス名とユニーク情報との組み合わせごとに、前記第２情報においてパス名が一致する数を示すパス名一致数とチェックサムが一致する数を示すチェックサム一致数とを算出する
情報処理方法。
（付記５）
付記４に記載の情報処理方法であって、
照合結果に基づく処理として、前記パス名一致数と前記チェックサム一致数とに応じた処理を行う
情報処理方法。
（付記６）
付記５に記載の情報処理方法であって、
前記チェックサム一致数の方が前記パス名一致数よりも多い場合、照合結果に基づく処理として、バックアップ先領域に格納されているデータのパス名を変更するデータ移動処理を行う
情報処理方法。
（付記７）
付記５または付記６に記載の情報処理方法であって、
前記チェックサム一致数が複数である場合、照合結果に基づく処理として、バックアップ先領域内でデータをコピーする処理を少なくとも行う
情報処理方法。
（付記８）
付記１から付記７までのいずれか１項に記載の情報処理方法であって、
前記ユニーク情報は、データの内容に基づいて算出されるチェックサムである
情報処理方法。
（付記９）
付記１から付記８までのいずれか１項に記載の情報処理方法であって、
バックアップ先領域に格納されているデータに基づいて前記第１情報を生成し、バックアップ元領域に格納されているデータに基づいて前記第２情報を生成し、
生成した前記第１情報と、生成した前記第２情報と、を照合する
情報処理方法。
（付記１０）
付記１から付記９までのいずれか１項に記載の情報処理方法であって、
照合結果に基づく処理を行った後、バックアップ先領域に存在しないデータをバックアップ元領域からバックアップ先領域に複製する
情報処理方法。
（付記１１）
付記１から付記１０までのいずれか１項に記載の情報処理方法であって、
照合結果に基づく処理としてデータを削除する処理を行う際、データの削除の代わりに、データに削除フラグを付与する、または、データを世代前フォルダに移動させる、処理を行う
情報処理方法。
（付記１２）
付記１から付記１１までのいずれか１項に記載の情報処理方法であって、
照合結果に基づく処理として、前記ユニーク情報が一致するデータを重複してバックアップ先領域に記憶しない重複排除処理を行う
情報処理方法。
（付記１３）
バックアップ先領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第１情報と、バックアップ元領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第２情報と、を取得する取得部と、
前記取得部が取得した前記第１情報と前記第２情報とを照合する照合部と、
バックアップ元領域に格納されたデータをバックアップ先領域に複製する前に、バックアップ先領域に格納されたデータのパス名に対する照合結果に基づく処理を行う前処理部と、
を有する
情報処理装置。
（付記１４）
情報処理装置に、
バックアップ先領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第１情報と、バックアップ元領域に格納されているデータのパス名と、データごとに一意に与えられるユニーク情報と、を対応づけた第２情報と、を取得する取得部と、
前記取得部が取得した前記第１情報と前記第２情報とを照合する照合部と、
バックアップ元領域に格納されたデータをバックアップ先領域に複製する前に、バックアップ先領域に格納されたデータのパス名に対する照合結果に基づく処理を行う前処理部と、
を実現するためのプログラム。 (Appendix 1)
An information processing method in which an information processing device:
acquires first information that associates the path names of data stored in a backup destination area with unique information uniquely given to each piece of data;
acquires second information that associates the path names of data stored in a backup source area with unique information uniquely given to each piece of data;
collates the acquired first information with the acquired second information; and
before replicating the data stored in the backup source area to the backup destination area, performs processing based on the collation result on the path names of the data stored in the backup destination area.
(Appendix 2)
The information processing method according to claim 1, wherein
data movement processing that changes a path name is performed as the processing based on the collation result.
(Appendix 3)
The information processing method according to Appendix 1 or Appendix 2, wherein,
for each combination of a path name and unique information included in the first information, collation is performed to check whether the same path name and unique information are included in the second information.
(Appendix 4)
The information processing method according to any one of Appendices 1 to 3, wherein,
by collating the first information with the second information, a path name match count indicating the number of matching path names in the second information and a checksum match count indicating the number of matching checksums are calculated for each combination of a path name and unique information included in the first information.
(Appendix 5)
The information processing method according to Appendix 4, wherein
processing according to the path name match count and the checksum match count is performed as the processing based on the collation result.
(Appendix 6)
The information processing method according to Appendix 5, wherein,
when the checksum match count is larger than the path name match count, data movement processing that changes the path name of data stored in the backup destination area is performed as the processing based on the collation result.
(Appendix 7)
The information processing method according to Appendix 5 or Appendix 6, wherein,
when the checksum match count is two or more, at least processing that copies data within the backup destination area is performed as the processing based on the collation result.
(Appendix 8)
The information processing method according to any one of Appendices 1 to 7, wherein
the unique information is a checksum calculated based on the contents of the data.
(Appendix 9)
The information processing method according to any one of Appendices 1 to 8, wherein
the first information is generated based on the data stored in the backup destination area, the second information is generated based on the data stored in the backup source area, and
the generated first information is collated with the generated second information.
(Appendix 10)
The information processing method according to any one of Appendices 1 to 9, wherein,
after the processing based on the collation result is performed, data that does not exist in the backup destination area is replicated from the backup source area to the backup destination area.
(Appendix 11)
The information processing method according to any one of Appendices 1 to 10, wherein,
when data is to be deleted as the processing based on the collation result, processing that adds a deletion flag to the data or moves the data to a previous-generation folder is performed instead of deleting the data.
(Appendix 12)
The information processing method according to any one of Appendices 1 to 11, wherein
deduplication processing, in which data with matching unique information is not stored redundantly in the backup destination area, is performed as the processing based on the collation result.
(Appendix 13)
An information processing device comprising:
an acquisition unit that acquires first information that associates the path names of data stored in a backup destination area with unique information uniquely given to each piece of data, and second information that associates the path names of data stored in a backup source area with unique information uniquely given to each piece of data;
a collation unit that collates the first information and the second information acquired by the acquisition unit; and
a preprocessing unit that, before the data stored in the backup source area is replicated to the backup destination area, performs processing based on the collation result on the path names of the data stored in the backup destination area.
(Appendix 14)
A program for causing an information processing device to realize:
an acquisition unit that acquires first information that associates the path names of data stored in a backup destination area with unique information uniquely given to each piece of data, and second information that associates the path names of data stored in a backup source area with unique information uniquely given to each piece of data;
a collation unit that collates the first information and the second information acquired by the acquisition unit; and
a preprocessing unit that, before the data stored in the backup source area is replicated to the backup destination area, performs processing based on the collation result on the path names of the data stored in the backup destination area.

なお、上記各実施形態及び付記において記載したプログラムは、記憶装置に記憶されていたり、コンピュータが読み取り可能な記録媒体に記録されていたりする。例えば、記録媒体は、フレキシブルディスク、光ディスク、光磁気ディスク、及び、半導体メモリ等の可搬性を有する媒体である。 The programs described in each of the above embodiments and appendices may be stored in a storage device or recorded in a computer-readable recording medium. For example, the recording medium is a portable medium such as a flexible disk, an optical disk, a magneto-optical disk, and a semiconductor memory.

以上、上記各実施形態を参照して本願発明を説明したが、本願発明は、上述した実施形態に限定されるものではない。本願発明の構成や詳細には、本願発明の範囲内で当業者が理解しうる様々な変更をすることが出来る。 Although the present invention has been described above with reference to each of the above embodiments, the present invention is not limited to the above-described embodiments. Various changes that can be understood by those skilled in the art can be made to the structure and details of the present invention within the scope of the present invention.

１００バックアップ元装置
１１０インデックス生成部
１２０インデックス照合部
１３０前処理部
１４０ファイルコピー部
１５０記憶装置
１５１バックアップ元領域内ファイル
２００バックアップ先装置
２１０データ制御部
２２０記憶装置
２２１バックアップ先領域内ファイル
３００情報処理装置
３０１ＣＰＵ
３０２ＲＯＭ
３０３ＲＡＭ
３０４プログラム群
３０５記憶装置
３０６ドライブ装置
３０７通信インタフェース
３０８入出力インタフェース
３０９バス
３１０記録媒体
３１１通信ネットワーク
３２１取得部
３２２照合部
３２３前処理部

100 Backup source device
110 Index generation unit
120 Index collation unit
130 Preprocessing unit
140 File copy unit
150 Storage device
151 File in backup source area
200 Backup destination device
210 Data control unit
220 Storage device
221 File in backup destination area
300 Information processing device
301 CPU
302 ROM
303 RAM
304 Program group
305 Storage device
306 Drive device
307 Communication interface
308 Input/output interface
309 Bus
310 Recording medium
311 Communication network
321 Acquisition unit
322 Collation unit
323 Preprocessing unit

Claims

Information processing device
Acquire the first information that associates the path name of the data stored in the backup destination area with the unique information uniquely given for each data.
Acquire the second information that associates the path name of the data stored in the backup source area with the unique information uniquely given for each data.
The acquired first information and the second information are collated with each other.
An information processing method that performs processing based on the collation result for the path name of the data stored in the backup destination area before replicating the data stored in the backup source area to the backup destination area.

The information processing method according to claim 1,
wherein data movement processing that changes the path name is performed as the processing based on the collation result.

The information processing method according to claim 1 or 2,
wherein, for each combination of path name and unique information included in the first information, it is checked whether the second information includes the same path name and the same unique information.

The information processing method according to any one of claims 1 to 3,
wherein, by collating the first information with the second information, a path name match count indicating the number of matching path names in the second information and a checksum match count indicating the number of matching checksums in the second information are calculated for each combination of path name and unique information included in the first information.
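
As an illustration only, the two counts in the claim above could be computed as in the following minimal Python sketch. It assumes each piece of information is a list of (path name, checksum) pairs and reads "matching path names" as entries of the second information with an equal path name; the function name `match_counts` and the sample data are hypothetical and do not come from the publication.

```python
from collections import Counter


def match_counts(first_info, second_info):
    """Map each (path, checksum) pair of first_info to a tuple
    (path name match count, checksum match count) taken over second_info."""
    # Assumption: both arguments are iterables of (path, checksum) pairs.
    path_counts = Counter(path for path, _ in second_info)
    checksum_counts = Counter(checksum for _, checksum in second_info)
    return {
        (path, checksum): (path_counts[path], checksum_counts[checksum])
        for path, checksum in first_info
    }


# Hypothetical data: a.txt was renamed and b.txt was copied in the backup source.
first_info = [("docs/a.txt", "c1"), ("docs/b.txt", "c2")]
second_info = [
    ("docs/renamed.txt", "c1"),
    ("docs/b.txt", "c2"),
    ("docs/b_copy.txt", "c2"),
]
counts = match_counts(first_info, second_info)
```

In this example the renamed file has no path name match but one checksum match, while the copied file has one path name match and two checksum matches.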

The information processing method according to claim 4,
wherein processing according to the path name match count and the checksum match count is performed as the processing based on the collation result.

The information processing method according to claim 5,
wherein, when the checksum match count is larger than the path name match count, data movement processing that changes the path name of data stored in the backup destination area is performed as the processing based on the collation result.
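
The move condition in the claim above can be sketched as follows. This is a simplified illustration, not the method of the embodiments: it handles only the simplest case covered by the condition (no path name match, at least one checksum match), and the function name `plan_moves` and the sample paths are hypothetical.

```python
def plan_moves(first_info, second_info):
    """Return (old_path, new_path) renames to apply in the backup destination
    before copying, so that renamed files need not be transferred again."""
    # Assumption: both arguments are lists of (path, checksum) pairs.
    source_paths = {path for path, _ in second_info}
    dest_paths = {path for path, _ in first_info}
    claimed = set()  # source paths already chosen as a rename target
    moves = []
    for old_path, checksum in first_info:
        if old_path in source_paths:
            continue  # the path name still exists in the source: no rename
        targets = [
            path
            for path, source_checksum in second_info
            if source_checksum == checksum
            and path not in dest_paths
            and path not in claimed
        ]
        if targets:  # at least one checksum match and no path name match
            claimed.add(targets[0])
            moves.append((old_path, targets[0]))
    return moves


# Hypothetical data: a.txt moved to another directory, gone.txt was deleted.
moves = plan_moves(
    [("old/a.txt", "c1"), ("keep.txt", "c2"), ("gone.txt", "c3")],
    [("new/a.txt", "c1"), ("keep.txt", "c2")],
)
```

Only the file whose contents reappear under a new path name is scheduled for a rename; the deleted file and the unchanged file produce no move.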

The information processing method according to claim 5 or 6,
wherein, when the checksum match count is two or more, at least processing that copies data in the backup destination area is performed as the processing based on the collation result.

The information processing method according to any one of claims 1 to 7,
wherein the unique information is a checksum calculated based on the contents of the data.

The information processing method according to any one of claims 1 to 8,
wherein the first information is generated based on the data stored in the backup destination area, the second information is generated based on the data stored in the backup source area,
and the generated first information is collated with the generated second information.
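
One way to generate such information from stored data is sketched below, assuming the unique information is a content checksum. SHA-256 is an assumption made for this example (the claims do not name an algorithm), and `file_checksum` and `build_index` are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path, chunk_size=1 << 20):
    """Checksum computed from file contents only (SHA-256 is assumed here)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root):
    """Associate each file's path name (relative to root) with its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


# Two files with identical contents receive the same checksum under different paths.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "docs").mkdir()
    (Path(tmp) / "docs" / "a.txt").write_bytes(b"same content")
    (Path(tmp) / "b.txt").write_bytes(b"same content")
    index = build_index(tmp)
```

Running `build_index` once on the backup destination area and once on the backup source area would yield the two sets of information to be collated.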

The information processing method according to any one of claims 1 to 9,
wherein, after the processing based on the collation result is performed, data that does not exist in the backup destination area is replicated from the backup source area to the backup destination area.

The information processing method according to any one of claims 1 to 10,
wherein, when processing that deletes data is to be performed as the processing based on the collation result, a deletion flag is assigned to the data or the data is moved to a previous-generation folder instead of deleting the data.
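
The second alternative in the claim above (moving data to a previous-generation folder rather than deleting it) might look like the following sketch. The folder name `.previous_generation` and the function name `retire_instead_of_delete` are hypothetical choices for this example.

```python
import shutil
import tempfile
from pathlib import Path


def retire_instead_of_delete(dest_root, rel_path, generation_dir=".previous_generation"):
    """Move a file that would be deleted into a previous-generation folder
    under the backup destination root, keeping its relative path."""
    dest_root = Path(dest_root)
    source = dest_root / rel_path
    target = dest_root / generation_dir / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target


# Demonstration in a temporary directory standing in for the backup destination.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "docs" / "old.txt"
    original.parent.mkdir()
    original.write_text("no longer in the backup source")
    kept = retire_instead_of_delete(tmp, "docs/old.txt")
    original_exists = original.exists()
    kept_rel = kept.relative_to(tmp).as_posix()
    kept_text = kept.read_text()
```

Because the file is moved rather than removed, its contents remain available in the destination if a later backup run needs the same data again.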

The information processing method according to any one of claims 1 to 11,
wherein deduplication processing, in which data with matching unique information is not stored in duplicate in the backup destination area, is performed as the processing based on the collation result.

An information processing device comprising:
an acquisition unit that acquires first information that associates the path name of data stored in a backup destination area with unique information uniquely given to each piece of data, and second information that associates the path name of data stored in a backup source area with unique information uniquely given to each piece of data;
a collation unit that collates the first information acquired by the acquisition unit with the second information; and
a preprocessing unit that performs, before the data stored in the backup source area is replicated to the backup destination area, processing based on the collation result on the path name of the data stored in the backup destination area.

A program for causing an information processing device to realize:
an acquisition unit that acquires first information that associates the path name of data stored in a backup destination area with unique information uniquely given to each piece of data, and second information that associates the path name of data stored in a backup source area with unique information uniquely given to each piece of data;
a collation unit that collates the first information acquired by the acquisition unit with the second information; and
a preprocessing unit that performs, before the data stored in the backup source area is replicated to the backup destination area, processing based on the collation result on the path name of the data stored in the backup destination area.