KR101310253B1 - Hash data creation method and hash data comparison system and method

Info

Publication number: KR101310253B1
Application number: KR1020110111296A
Authority: KR
Inventors: 장성국; 유광희; 성주현; 진혜진; 이윤형
Original assignee: (주)네오위즈게임즈
Priority date: 2011-10-28
Filing date: 2011-10-28
Publication date: 2013-09-24
Also published as: CN102945241A; KR20130046746A; WO2013062223A1; TW201319929A

Abstract

The present application relates to a hash technique for data files. A hash data comparison system according to an embodiment of the disclosed technology compares original files with one another using hash data that includes file information and a hash value. The hash data comparison system includes a file information generator, a hash generator, and a controller. The file information generator checks the attributes of an original file and generates file information for it. The hash generator calculates a hash value by applying a hash function algorithm to at least a portion of the original file. The controller generates hash data for the original file that includes the file information and the hash value, and adds parity information to it. According to the disclosed technology, whether two files differ can be determined before their hash values are compared, so the full hash values of differing files need not be compared and files can be compared more quickly.

Description

Hash data generation method, hash data comparison system and method {HASH DATA CREATION METHOD AND HASH DATA COMPARISON SYSTEM AND METHOD}

The present application relates to a hash technique for data files and, more particularly, to a hash data generation method and a hash data comparison system and method that use a hash data structure in which characteristic information unique to an original file is used together with a hash value so that files can be compared more quickly.

Comparison of data, and of files in particular, is used in a wide range of operations. For example, it is indispensable when an operating system checks whether a file has changed, or when a patch file is compared with an original file in order to apply a patch.

Conventional file comparison techniques include comparing the entire contents of the files, assigning version information to a file and checking that version, and applying a hash function to the files and comparing the results.

Comparing entire files is rarely used because the amount of data to compare is large and the process is slow overall. Comparing files by version information has the drawback that, if the contents of a file change but its version information is not updated, the contents and the version information diverge and the comparison becomes unreliable.

For this reason, file contents are in most cases compared by applying a hash function to each file and comparing the resulting hash values. However, a conventional comparison that relies on the hash value alone has the problem that generating the hash value of a large file requires considerable computing resources and time.

The present application seeks to provide a hash data structure for file comparison with which files can be compared easily using fewer resources.

The present application also seeks to provide a method of generating the hash data structure and a method of comparing hash data structures, so that files can be compared more quickly.

The present application further seeks to provide a hash comparison system that compares files efficiently using the hash data structure for file comparison.

Among the embodiments, a hash data structure includes file information about attributes of an original file, composed of a predetermined number of data bits, and a hash value for the original file, composed of a specific number of data bits, wherein the data bits of the hash value immediately follow the data bits of the file information.

In one embodiment, the file information may include at least one of a size value of the original file, first partial data including the first data of the original file, and second partial data including the last data of the original file.

In one embodiment, the hash data structure may further include a structure header containing structure information for each of the file information and the hash value included in the hash data structure.

In one embodiment, the hash data structure may further include parity information for the hash data structure, and the parity information may include a first parity bit for the file information and a second parity bit for the hash value.

Among the embodiments, a hash data generation method generates, for each original file, hash data to be used for comparison. The hash data generation method includes (a) checking the attributes of an original file and generating, based on the checked attributes, file information composed of a predetermined number of data bits, (b) calculating a hash value by applying a hash algorithm to at least a portion of the original file, and (c) generating hash data by appending the hash value directly to the file information.
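By way of illustration only, steps (a) to (c) might be sketched as follows. The 8-byte size field, the 4-byte length of the first and last partial data, the zero padding for short files, the choice of SHA-1, and the function names are all assumptions made for this sketch; the application does not fix any of them.

```python
import hashlib
import struct

EDGE_LEN = 4  # assumed length, in bytes, of the first and last partial data


def make_file_info(data: bytes, edge_len: int = EDGE_LEN) -> bytes:
    """Step (a): build fixed-width file information from the file's attributes."""
    size_field = struct.pack(">Q", len(data))            # file size, 8 bytes (assumed width)
    head = data[:edge_len].ljust(edge_len, b"\x00")      # first partial data, zero-padded
    tail = data[-edge_len:].rjust(edge_len, b"\x00")     # last partial data, zero-padded
    return size_field + head + tail


def make_hash_data(data: bytes) -> bytes:
    """Steps (a) to (c): file information immediately followed by the hash value."""
    file_info = make_file_info(data)                     # step (a)
    hash_value = hashlib.sha1(data).digest()             # step (b), SHA-1 chosen as an example
    return file_info + hash_value                        # step (c)
```

With these assumed widths, every hash data record is 8 + 4 + 4 + 20 = 36 bytes long, and the first 16 bytes can be compared without touching the hash value.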

In one embodiment, step (a) may include identifying at least one of the size, name, and format of the original file, first partial data including the first data of the original file, and second partial data including the last data of the original file, and generating the file information so that it includes at least one of these items.

In one embodiment, the hash data generation method may further include (d) generating a hash parity bit for the hash data.

In one embodiment, step (d) may include generating a first parity bit for the file information, generating a second parity bit for the hash value, and generating the hash parity bit by concatenating the first and second parity bits.

Among the embodiments, a hash data generation method generates, for each original file, hash data to be used for comparison. The hash data generation method includes (a) generating a structure header containing structure information for each of the file information and the hash value included in the hash data structure, (b) checking the attributes of an original file and generating, based on the checked attributes, file information composed of a predetermined number of data bits, (c) calculating a hash value by applying a hash algorithm to at least a portion of the original file, and (d) generating hash data by appending the hash value directly to the file information.

Among the embodiments, a hash data comparison method compares two original files with each other using hash data that includes file information and a hash value. The hash data comparison method includes (a) identifying the two pieces of hash data respectively associated with the two original files, (b) comparing the two pieces of file information contained in the two pieces of hash data, and (c) if the two pieces of file information are identical, comparing the two hash values contained in the two pieces of hash data and, if these are also identical, determining that the two original files are the same file.
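The two-stage comparison of steps (b) and (c) can be sketched as below. This is a minimal sketch that assumes the file information is only an 8-byte file size and that SHA-256 is used; `INFO_LEN`, `make_hash_data`, and `same_file` are hypothetical names introduced for the example.

```python
import hashlib

INFO_LEN = 8  # assumed: file information is an 8-byte big-endian file size


def make_hash_data(data: bytes) -> bytes:
    """Hash data = file information followed directly by the hash value."""
    return len(data).to_bytes(INFO_LEN, "big") + hashlib.sha256(data).digest()


def same_file(hd_a: bytes, hd_b: bytes, info_len: int = INFO_LEN) -> bool:
    """Compare two pieces of hash data, file information first."""
    # Step (b): differing file information settles the question immediately.
    if hd_a[:info_len] != hd_b[:info_len]:
        return False
    # Step (c): only when the file information matches are the hash values compared.
    return hd_a[info_len:] == hd_b[info_len:]
```

Files of different sizes are rejected after comparing 8 bytes; the 32-byte hash values are examined only for files whose file information agrees.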

In one embodiment, the file information may include at least one of the size, name, and format of the corresponding original file, first partial data including the first data of that file, and second partial data including the last data of that file.

In one embodiment, step (b) may include comparing the data bits of the two pieces of file information bit by bit.

In one embodiment, step (b) may include identifying, for each of the two pieces of file information, at least one of the size, name, and format of the corresponding original file, the first partial data including the first data of that file, and the second partial data including the last data of that file, as contained in the file information, and comparing the identified items with each other.

Among the embodiments, a hash data comparison method compares two original files with each other using hash data that includes file information, a hash value, and a structure header containing structure information for each of the file information and the hash value. The hash data comparison method includes (a) comparing the structure headers for the two original files to check whether the two pieces of hash data have the same structure, (b) if they have the same structure, comparing the pieces of file information respectively associated with the two original files, and (c) if the pieces of file information are identical, comparing the hash values respectively associated with the original files and, if these are also identical, determining that the two original files are the same file.

Among the embodiments, a hash data comparison system compares original files with one another using hash data that includes file information and a hash value. The hash data comparison system includes a file information generator, a hash generator, and a controller. The file information generator checks the attributes of an original file and generates file information for it. The hash generator calculates a hash value by applying a hash function algorithm to at least a portion of the original file. The controller generates hash data for the original file that includes the file information and the hash value.

In one embodiment, the hash data comparison system may further include a hash file manager that stores the generated hash data and maintains information about the original file associated with each stored hash value.

In one embodiment, the controller may determine whether first and second original files are identical by sequentially comparing their file information and then their hash values.

In one embodiment, the controller may generate a structure header containing identification information for the file information and the hash value, and may generate the hash data so that it includes the structure header, the file information, and the hash value.

In one embodiment, the controller may sequentially compare the structure header, the file information, and the hash value of the first and second original files and, if all of them are identical, determine that the first and second original files are the same file.

In one embodiment, the controller may generate parity bits for the hash data, including parity bits calculated separately for the file information and for the hash value.

According to the disclosed technology, whether two files differ can be determined before their hash values are compared, so the full hash values of differing files need not be compared and files can be compared more quickly.

In addition, according to the disclosed technology, parity information made up of a parity for the file information and a parity for the hash value can be used to check whether the file information and the hash data structure are each correctly formed.

FIG. 1 is a reference diagram illustrating one embodiment of a hash data structure according to the disclosed technology.
FIG. 2 is a reference diagram illustrating another embodiment of a hash data structure according to the disclosed technology.
FIG. 3 is a reference diagram illustrating yet another embodiment of a hash data structure according to the disclosed technology.
FIG. 4 is a block diagram illustrating one embodiment of a hash comparison system according to the disclosed technology.
FIG. 5 is a flowchart illustrating one embodiment of a hash data generation method that may be performed in the hash comparison system of FIG. 4.
FIG. 6 is a flowchart illustrating another embodiment of a hash data generation method that may be performed in the hash comparison system of FIG. 4.
FIG. 7 is a flowchart illustrating one embodiment of a hash data comparison method that may be performed in the hash comparison system of FIG. 4.
FIG. 8 is a flowchart illustrating another embodiment of a hash data comparison method that may be performed in the hash comparison system of FIG. 4.
FIG. 9 is a block diagram illustrating another embodiment of a hash comparison system according to the disclosed technology.

The description of the disclosed technology is merely a set of embodiments given for structural or functional explanation, so the scope of the disclosed technology should not be construed as limited by the embodiments described herein. That is, because the embodiments may be modified in various ways and may take various forms, the scope of the disclosed technology should be understood to include equivalents capable of realizing its technical idea.

The terms used in the present application should be understood as follows.

Terms such as "first" and "second" are used to distinguish one component from another, and the scope of rights should not be limited by these terms. For example, a first component may be named a second component, and similarly a second component may be named a first component.

When a component is described as being "connected" to another component, it may be connected to that component directly, but it should be understood that other components may exist in between. In contrast, when a component is described as being "directly connected" to another component, it should be understood that no other component exists in between. Other expressions describing relationships between components, such as "between" and "directly between", or "adjacent to" and "directly adjacent to", should be interpreted in the same way.

Singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise. Terms such as "include" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Identifiers attached to the steps (for example, a, b, c) are used for convenience of explanation and do not describe the order of the steps. Unless a specific order is clearly stated in context, the steps may occur in an order different from the one given. That is, the steps may be performed in the stated order, substantially simultaneously, or in the reverse order.

Unless otherwise defined, all terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the disclosed technology belongs. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and should not be interpreted as having ideal or excessively formal meanings unless explicitly so defined in the present application.

In the following description, an original file means a file to which the hash data structure is applied. Like hash values in general, the disclosed technology provides a hash data structure whose value is independent for each original file.

FIG. 1 is a reference diagram illustrating one embodiment of a hash data structure according to the disclosed technology.

Referring to FIG. 1, the hash data structure 100 consists of file information 110 and a hash value 120. More specifically, the hash data structure 100 may be formed so that the bits of the hash value immediately follow the data bits of the file information 110 for the original file.

The file information 110 may include a size value 111 of the original file, partial data including the first data of the original file (hereinafter, first partial data) 112, and partial data including the last data of the original file (hereinafter, second partial data) 113. Depending on the embodiment, the file information 110 may consist of at least one of these three items 111 to 113.

Depending on the embodiment described later, the file information 110 may have different lengths within one system or across several systems. That is, it need not consist of a fixed number of data bits; its size may be set to a predetermined number of data bits according to the system's configuration or needs.

The size value 111 of the original file is data representing the size of the original file.

The first partial data 112 is the portion of the original file that extends for a predetermined length from its first bit, and the second partial data 113 is the portion of the original file that extends for a predetermined length back from its last bit. The lengths of the first and second partial data 112 and 113 may be determined differently for each file comparison system, so the disclosed technology is not limited to any particular length.

The hash value 120 is the data obtained by applying a hash algorithm to the original file. In one embodiment, the hash value 120 may have a specific number of bits. That is, whereas the elements included in the file information 110 and their sizes may vary, the hash value 120 may be limited to a specific size (number of data bits), such as a standardized size. For example, it may be 160 bits for the SHA-0 or SHA-1 algorithm, 256 or 224 bits for the SHA-256 or SHA-224 algorithm, and 512 or 384 bits for the SHA-512 or SHA-384 algorithm. In other words, the hash value 120 may consist of a specified number of data bits even when it is used in one system or across several systems, depending on the embodiment. Because the hash value is preferably determined according to a standard, it may be limited to a specified size.
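The digest sizes listed above can be checked against a standard library. The following short sketch uses Python's `hashlib`, which is an illustrative choice and not part of the application; SHA-0 is omitted because `hashlib` does not provide it.

```python
import hashlib

# Digest size in bits for each SHA variant named in the text (SHA-0 excluded).
ALGORITHMS = ("sha1", "sha224", "sha256", "sha384", "sha512")
DIGEST_BITS = {name: hashlib.new(name).digest_size * 8 for name in ALGORITHMS}

for name, bits in DIGEST_BITS.items():
    print(f"{name}: {bits} bits")
```

Because each algorithm always yields a digest of the same length, the hash value field of the hash data structure can be given a fixed width once the algorithm is chosen.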

When files are compared, the file information 110 should be compared before the hash value 120. For example, to find file A among files A to C, the file information of file A is compared against that of files A to C, which can identify file A. In such a case the file can be found from the file information 110 alone, so the hash values need not be compared, and the desired file is found more quickly and with fewer resources.

FIG. 2 is a reference diagram illustrating another embodiment of a hash data structure according to the disclosed technology. The hash data structure shown in FIG. 2 adds a structure header 130 to the embodiment of FIG. 1.

The structure header 130 contains information about the structure of the file information 110 and the hash value 120. For example, the structure header 130 may contain the total number of bits of the file information 110 and the total number of bits of the hash value 120.

In one embodiment, the structure header 130 may contain information about the hash function that produced the hash value 120, for example an indication such as SHA-0 or SHA-1 of the hash used to calculate the hash value 120.

In one embodiment, the file information 110 may include only some of the three items 111 to 113 shown, and the structure header 130 may indicate which items the file information 110 contains.

For example, suppose that the file size information and the first and second partial data are identified as A, B, and C respectively, that the file size information has a fixed size of 2 bytes, and that the structure header 130 is "6AB". Here the "6" in the structure header 130 is the total number of bytes of the file information 110, and "AB" indicates that the file information 110 consists of the file size information 111 and the first partial data 112. The first partial data 112 therefore occupies the remaining 4 bytes.
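A parser for headers of the form used in this example might look like the sketch below. The textual encoding (decimal byte count followed by the letters A, B, C) follows the "6AB" example; the rule that bytes left over after the 2-byte size field are divided equally between B and C when both are present is an assumption of this sketch, as is the function name `parse_structure_header`.

```python
import re

SIZE_FIELD_BYTES = 2  # from the example: file size information is fixed at 2 bytes


def parse_structure_header(header: str) -> dict[str, int]:
    """Return the byte length of each file information component named in the header."""
    match = re.fullmatch(r"(\d+)([ABC]+)", header)
    if match is None:
        raise ValueError(f"unrecognized structure header: {header!r}")
    remaining = int(match.group(1))      # total bytes of the file information
    fields = match.group(2)

    layout: dict[str, int] = {}
    if "A" in fields:                    # A = file size information (fixed width)
        layout["A"] = SIZE_FIELD_BYTES
        remaining -= SIZE_FIELD_BYTES

    partial = [f for f in fields if f in "BC"]   # B, C = first / second partial data
    if partial:
        # Assumption: the remaining bytes are shared equally between B and C.
        share, extra = divmod(remaining, len(partial))
        if share <= 0 or extra:
            raise ValueError("remaining bytes cannot be divided among the partial data")
        layout.update({field: share for field in partial})
    elif remaining != 0:
        raise ValueError("byte count does not match the listed components")
    return layout
```

For the header "6AB" this yields 2 bytes for A and 4 bytes for B, matching the example in the text.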

The embodiment of FIG. 2 allows the disclosed hash data structure 100 to be used even when file information 110 of different lengths is used within a single system, because the structure header 130 makes it possible to identify the bits belonging to each component of the hash data structure 100.

FIG. 3 is a reference diagram illustrating yet another embodiment of a hash data structure according to the disclosed technology. The hash data structure shown in FIG. 3 adds parity information 140 to the embodiment of FIG. 1.

The parity information 140 contains parity values for the hash data structure 100.

In one embodiment, the parity information 140 may consist of (i) a parity bit for the file information 110 and (ii) a parity bit for the hash value 120. Because the disclosed technology can complete a comparison using the file information 110 alone, the parity values are kept separate for each part.
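As an illustration of keeping the two parities separate, the sketch below computes one even-parity bit per part. The application does not specify the parity scheme, so the single even-parity bit and the function names are assumptions.

```python
def parity_bit(data: bytes) -> int:
    """Even-parity bit: 1 when the number of set bits in data is odd, otherwise 0."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones & 1


def parity_info(file_info: bytes, hash_value: bytes) -> tuple[int, int]:
    """Return (first parity bit for the file information, second parity bit for the hash value)."""
    return parity_bit(file_info), parity_bit(hash_value)


def file_info_is_intact(file_info: bytes, first_parity: int) -> bool:
    """Check the file information on its own, without reading the hash value."""
    return parity_bit(file_info) == first_parity
```

Since the first parity bit covers only the file information, a comparison that ends at the file information stage can still verify the part of the record it actually used.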

도 3의 실시예는 파일의 전송 등이 발생하는 경우에 보다 효율적으로 오류 체크를 수행하며 파일들을 비교할 수 있다.The embodiment of FIG. 3 may perform error checking more efficiently and compare files when file transfer or the like occurs.

도 4는 개시된 기술에 따른 해시 비교 시스템의 일 실시예를 설명하는 구성도이다.4 is a schematic diagram illustrating an embodiment of a hash comparison system according to the disclosed technology.

해시 비교 시스템(200)은 파일 정보 생성부(210), 해시 생성부(220), 해시 파일 관리부(230) 및 제어부(250)를 포함한다. 일 실시예에서, 해시 비교 시스템(200)은 원본 파일 관리부(240)를 더 포함할 수 있다.The hash comparison system 200 includes a file information generator 210, a hash generator 220, a hash file manager 230, and a controller 250. In one embodiment, the hash comparison system 200 may further include an original file manager 240.

The file information generator 210 may check the attributes of an original file and generate file information for that file. Here, the attributes of the original file may include its size, name, format, and a portion of its data bits (for example, a predetermined length measured from the first data bit or from the last data bit).

In one embodiment, the file information generator 210 may read a preset length of data from the beginning and from the end of the original file's data bits to generate the first and second partial data described above. The preset length may correspond to the sizes of the first and second partial data in the hash data structure.
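The generation of file information can be illustrated with a minimal Python sketch. It assumes the file contents are already available as bytes, works at byte rather than bit granularity, and uses a hypothetical field layout (an 8-byte size followed by the two partial-data fields); `PART_LEN` and `make_file_info` are illustrative names that do not appear in the source.

```python
import struct

PART_LEN = 16  # hypothetical preset length of each partial-data field, in bytes


def make_file_info(data: bytes, part_len: int = PART_LEN) -> bytes:
    """Return size (8 bytes) + first partial data + second partial data."""
    # First partial data: the leading bytes of the file, zero-padded if short.
    head = data[:part_len].ljust(part_len, b"\x00")
    # Second partial data: the trailing bytes of the file, zero-padded if short.
    tail = data[-part_len:].ljust(part_len, b"\x00")
    # Fixed-length result, so two file informations can be compared directly.
    return struct.pack(">Q", len(data)) + head + tail
```

Padding to a fixed length is a design choice of this sketch; it keeps every file information the same size so that the boundary between file information and hash value is known in advance.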

The hash generator 220 may generate a hash value by applying a hash function to the original file. It may use a hash function specific to the individual system, or a standardized hash function such as one defined by the Secure Hash Algorithm (SHA).

In one embodiment, the hash generator 220 may provide a plurality of hash functions and generate the hash value of the original file using the specific hash function requested by the controller 250.

In one embodiment, the hash generator 220 may generate the hash value from only a part of the original file. For example, when the size of the original file is greater than or equal to a certain value, the hash generator 220 may hash only a portion of the original file corresponding to a preset size. As another example, the hash generator 220 may hash only the portion of the original file that does not belong to the first and second partial data.
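The hash generator options described above (a selectable algorithm, partial hashing of large files, and skipping the partial-data regions) might be sketched as follows. The thresholds, the byte-level granularity, and the name `hash_file` are assumptions made for illustration; only Python's standard `hashlib` is used.

```python
import hashlib

SIZE_THRESHOLD = 1 << 20  # hypothetical: contents of 1 MiB or more are hashed partially
SAMPLE_LEN = 64 * 1024    # hypothetical preset size of the hashed portion
PART_LEN = 16             # hypothetical length of the first/second partial data


def hash_file(data: bytes, algorithm: str = "sha256",
              skip_partial: bool = False) -> bytes:
    """Hash all or part of the file contents with the requested algorithm."""
    if skip_partial:
        # Hash only the bytes not already captured as first/second partial data.
        if len(data) > 2 * PART_LEN:
            data = data[PART_LEN:len(data) - PART_LEN]
        else:
            data = b""
    if len(data) >= SIZE_THRESHOLD:
        # Large input: hash only a preset-size leading portion.
        data = data[:SAMPLE_LEN]
    return hashlib.new(algorithm, data).digest()
```

Hashing only a portion trades collision resistance for speed, which is why this sketch applies it only above a size threshold.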

The hash file manager 230 may manage original files and their corresponding hash files (structures). For example, it may store a hash file and maintain information (such as link information) about the original file that matches that hash file.

The original file manager 240 may store original files and maintain a history for each of them. For example, if a hash comparison on file A shows that it is the same file with changed content, the original file manager 240 may store file A together with its hash history.

The controller 250 controls the overall operation of the hash comparison system 200 in order to generate hash data structures or to compare original files with each other.

In one embodiment, the controller 250 may generate a hash data structure (file) for an original file. More specifically, the controller 250 provides a given original file to the file information generator 210 and the hash generator 220, and generates the hash data structure from the file information and hash value it receives in response. Embodiments of this generation process are described in more detail below with reference to FIGS. 5 and 6.

In one embodiment, the controller 250 may compare two original files using their hash data structures. A hash data structure according to the disclosed technology is divided into file information and a hash value, and the comparison exploits this structural feature. More specifically, the controller 250 parses the hash data structures of the original files being compared and first uses the file information to check whether the original files are the same file. If they are, the controller 250 then uses the hash values to check whether the files have the same content. Because the hash values are compared only when the file information already matches, the comparison can be completed faster.

In one embodiment, when comparing file information, the controller 250 may compare the data bits constituting the file information bit by bit. In another embodiment, the controller 250 may first identify each element of the file information and then compare the identified elements with each other. That is, for each file information, it may identify at least one of the size, name, format, first partial data, and second partial data of the corresponding original file, and compare each identified element individually.

In one embodiment, the controller 250 may provide a generated hash file and the associated original file information to the hash file manager 230 so that the hash file is managed there. The controller 250 has the hash file manager 230 store the generated hash file and, when another operation such as a hash comparison is requested, retrieves the hash file for the relevant original file from the hash file manager 230 and performs the requested operation.

In one embodiment, the controller 250 may control the original file manager 240 to generate a history for an original file. For example, a patch history may be required when the same original file is patched. In such a case, if a comparison of original files (i) confirms from the file information that they are the same original file and (ii) shows from the hash values that the content has changed, the controller 250 may provide the original file and information about its hash data structure to the original file manager 240 so that a history is generated.

In one embodiment, the controller 250 may generate a structure header for the hash data structure. More specifically, on receiving the file information and the hash value from the file information generator 210 and the hash generator 220, respectively, the controller 250 may generate a structure header that allows the file information and the hash value to be identified within the hash data structure. For example, the controller 250 may generate a structure header containing information such as which elements are included in the file information 110, the data length of each element, and the length of the hash value. In this embodiment, when comparing hash data structures, the controller 250 first parses the structure header to separate the file information from the hash value, then checks from the file information whether the two original files being compared are the same file, and, if they are, compares the hash values to determine whether the content has changed.

In one embodiment, the controller 250 may generate parity information and append it to the hash data structure. More specifically, the controller 250 may generate a parity bit for the file information and a parity bit for the hash value, and generate parity information containing both parity bits. This embodiment is applicable when, for example, hash data structures are transferred between different systems. Because the parity bits are computed separately for the file information and the hash value, the parity check can be performed faster when hash data structures are compared.
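A minimal sketch of separate parity bits might look like the following. It assumes even parity and packs the two bits into a single byte; the source specifies neither choice, and the function names are hypothetical.

```python
def parity_bit(data: bytes) -> int:
    """Even-parity bit: 1 if the number of set bits in data is odd, else 0."""
    acc = 0
    for byte in data:
        acc ^= byte  # XOR folds all bytes into one; bit parity is preserved
    return bin(acc).count("1") & 1


def make_parity_info(file_info: bytes, hash_value: bytes) -> bytes:
    """Pack both parity bits into one byte: bit 1 = file info, bit 0 = hash."""
    return bytes([(parity_bit(file_info) << 1) | parity_bit(hash_value)])


def file_info_ok(file_info: bytes, parity_info: bytes) -> bool:
    """Check only the file-information parity, without touching the hash value."""
    return parity_bit(file_info) == (parity_info[0] >> 1) & 1
```

Keeping the two bits separate means a comparison that ends at the file-information stage only needs `file_info_ok`, and never has to read the hash value.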

FIG. 5 is a flowchart illustrating an embodiment of a hash data generation method that may be performed in the hash comparison system of FIG. 4.

Referring to FIG. 5, the file information generator 210 may check the attributes of the original file under the control of the controller 250 (step S510). The attributes are the data collected in order to generate the file information and, as described above, may be the file size, file name, file format, first or second partial data, and so on.

The file information generator 210 may generate file information based on the identified attributes of the original file (step S520). When hash data are compared, the file information is used to check whether the two original files being compared are the same file. As described above, the file information may include at least one of the file size, the first partial data, and the second partial data. It may also include the file name or file format. The file information generator 210 provides the generated file information to the controller 250.

The hash generator 220 may generate a hash value corresponding to the original file under the control of the controller 250 (step S530). In one embodiment, the hash generator 220 may provide various hash algorithms and generate the hash value of the original file using the algorithm requested by the controller 250. In one embodiment, the hash generator 220 may generate the hash value from only a part of the original file under the control of the controller 250. The hash generator 220 provides the generated hash value to the controller 250.

The controller 250 may generate hash data using the file information and the hash value (step S540). The controller 250 may generate the hash data by concatenating the data bits of the hash value immediately after the data bits of the file information. In this embodiment, the controller 250 knows in advance which bits, counting from the first bit, belong to the file information. Accordingly, when instructing the file information generator 210 or the hash generator 220 to generate the file information or the hash value, the controller 250 may include the size of the requested data in the request.
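Step S540 can be sketched as a plain concatenation with a boundary that is agreed in advance. The 40-byte length and the function names below are assumptions for illustration, not values taken from the source.

```python
from typing import Tuple

FILE_INFO_LEN = 40  # hypothetical fixed length of the file information, in bytes


def make_hash_data(file_info: bytes, hash_value: bytes) -> bytes:
    """Concatenate file information and hash value into one hash data record."""
    if len(file_info) != FILE_INFO_LEN:
        raise ValueError("file information has unexpected length")
    return file_info + hash_value


def split_hash_data(hash_data: bytes) -> Tuple[bytes, bytes]:
    """Recover (file_info, hash_value) using the length known in advance."""
    return hash_data[:FILE_INFO_LEN], hash_data[FILE_INFO_LEN:]
```

Because the boundary is fixed, no header is needed in this variant; the trade-off is that every record in the system must use the same file-information layout.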

FIG. 6 is a flowchart illustrating another embodiment of a hash data generation method that may be performed in the hash comparison system of FIG. 4. The embodiment of FIG. 6 generates hash data using the structure header described above. Because it adds certain steps to the embodiment of FIG. 5, steps that are the same as or correspond to those of FIG. 5 are described only briefly.

Referring to FIG. 6, the controller 250 may determine in advance the elements to be included in the file information (step S610). That is, the controller 250 may predetermine the types of elements to be included in the file information, the size of each element, and so on, and maintain this information about the composition of the file information. The controller 250 may then request the file information generator 210 to generate file information, including information about the determined elements in the request.

The file information generator 210 may generate file information under the control of the controller 250. That is, the file information generator 210 may check the attributes of the original file (step S620), generate file information using the identified attributes (step S630), and provide it to the controller 250.

The hash generator 220 may generate the hash value of the original file under the control of the controller 250 (step S640) and provide it to the controller 250.

The controller 250 may generate a structure header for the file information and the hash value (step S650). As described above, the structure header contains information about the structure of the hash data. A structure header is used because the disclosed technology separates the file information from the hash value within the hash data when performing a comparison. In one embodiment, the controller 250 may generate the structure header before the file information and the hash value are generated. That is, if the controller 250 specifies the composition of the file information and the hash value (for example, the elements of the file information, the sizes of those elements, and the size of the hash value) when requesting their generation, it can generate the structure header without first receiving the file information and the hash value. In another embodiment, the controller 250 may generate the structure header after receiving the file information and the hash value. That is, when the file information generator 210 and the hash generator 220 generate the file information and the hash value on their own, the controller 250 may receive each of them and then generate the structure header.

Once the structure header has been generated, the controller 250 may generate hash data based on the structure header, the file information, and the hash value (step S660).
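The following sketch shows one way steps S650 and S660 could be realized. The header encoding (an element count, then a one-byte element code and two-byte length per element, then a two-byte hash length) is entirely hypothetical, since the source does not define a wire format; all names are illustrative.

```python
import struct

# Hypothetical element codes; the source does not define an encoding.
SIZE, NAME, FORMAT, FIRST_PART, LAST_PART = range(5)


def make_structure_header(elements, hash_len):
    """elements: list of (code, length) pairs describing the file information."""
    header = struct.pack(">B", len(elements))
    for code, length in elements:
        header += struct.pack(">BH", code, length)
    return header + struct.pack(">H", hash_len)


def parse_structure_header(buf):
    """Return (elements, hash_len, header_len) decoded from the start of buf."""
    (count,) = struct.unpack_from(">B", buf, 0)
    offset = 1
    elements = []
    for _ in range(count):
        elements.append(struct.unpack_from(">BH", buf, offset))
        offset += 3  # 1-byte code + 2-byte length
    (hash_len,) = struct.unpack_from(">H", buf, offset)
    return elements, hash_len, offset + 2


def make_hash_data_with_header(elements, file_info, hash_value):
    """Step S660: structure header, then file information, then hash value."""
    header = make_structure_header(elements, len(hash_value))
    return header + file_info + hash_value
```

Note that `make_structure_header` needs only the element list and the hash length, which matches the embodiment in which the header is generated before the file information and hash value exist.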

FIG. 7 is a flowchart illustrating an embodiment of a hash data comparison method that may be performed in the hash comparison system of FIG. 4. The hash data comparison method of FIG. 7 corresponds to the hash data generation method of FIG. 5.

Referring to FIG. 7, the controller 250 may select the hash data associated with each of the two original files to be compared (step S710). In an embodiment that includes the hash file manager 230, the controller 250 may obtain the hash data for the two original files by requesting them from the hash file manager 230.

The controller 250 may check the composition of the two selected hash data (step S720). That is, the controller 250 may determine which part of each hash data is the file information and which part is the hash value.

The controller 250 may compare the file information contained in the two hash data to make a first check of whether the two original files are the same file (step S730). For example, when the file information includes the file name, file length, and the like, the controller 250 can first use the file information to check whether the two original files are the same file and only then check the file content. In other words, the disclosed technology first verifies the identity of the objects, that is, whether the two files being compared are the same object, and, if they are, completes the comparison by verifying whether the content of the objects is identical.

If the file information is identical (step S740, yes), the controller 250 may compare the hash values associated with the two original files (step S750).

If the hash values are also identical (step S760, yes), the two original files may be determined to be the same file (step S770).

If the file information differs (step S740, no) or the hash values differ (step S760, no), the two original files may be determined to be different files.

In the steps above, the controller 250 may compare file information or hash values by comparing the data bits of the items being compared. Consequently, when two files are already determined to be different from the file information alone, the number of data bits to be compared is significantly reduced. The disclosed technology is therefore particularly efficient when a comparison must be performed in a 1:N relationship, such as searching among many files for a file identical to a given original file.
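The flow of FIG. 7 can be sketched in a few lines. The sketch assumes hash data without a structure header, with the file information occupying a fixed, hypothetical 40 bytes at the front of each record.

```python
FILE_INFO_LEN = 40  # hypothetical fixed length of the file information, in bytes


def same_file(hash_data_a: bytes, hash_data_b: bytes) -> bool:
    """Two-stage comparison following steps S720 to S770."""
    # S720: split each record at the boundary known in advance.
    info_a, hash_a = hash_data_a[:FILE_INFO_LEN], hash_data_a[FILE_INFO_LEN:]
    info_b, hash_b = hash_data_b[:FILE_INFO_LEN], hash_data_b[FILE_INFO_LEN:]
    # S730/S740: a file-information mismatch ends the comparison early.
    if info_a != info_b:
        return False
    # S750/S760: only now are the hash values compared.
    return hash_a == hash_b
```

When most candidates differ in size or in their first or last bytes, the function returns at the first check and the hash values are never examined.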

FIG. 8 is a flowchart illustrating another embodiment of a hash data comparison method that may be performed in the hash comparison system of FIG. 4. The hash data comparison method of FIG. 8 corresponds to the hash data generation method of FIG. 6, and the hash data used in FIG. 8 additionally contain a structure header. Steps of this embodiment that are the same as or correspond to those of FIG. 7 are therefore described only briefly.

Referring to FIG. 8, the controller 250 may select the hash data associated with each of the two original files to be compared (step S810).

The controller 250 may locate and parse the structure header of each of the two selected hash data (step S820). As described above, the structure header includes the contents and length of the file information contained in the hash data, the length of the hash value, and so on, so parsing it makes it possible to distinguish the components of the hash data.

The controller 250 may compare the structure headers of the two hash data and, if the structure headers are identical (step S830, yes), separate the file information and the hash value contained in each hash data (step S840).

The controller 250 may compare the file information contained in the two hash data to make a first check of whether the two original files are the same file (step S850).

If the file information is identical (step S860, yes), the controller 250 may compare the hash values associated with the two original files (step S870).

If the hash values are also identical (step S880, yes), the two original files may be determined to be the same file (step S890).

If the structure headers differ (step S830, no), the file information differs (step S860, no), or the hash values differ (step S880, no), the two original files may be determined to be different files.
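The three-stage flow of FIG. 8 might be sketched as follows. For brevity the sketch assumes a simplified, hypothetical structure header that carries only two 16-bit lengths (file information length and hash value length); the source describes a richer header that also lists the file-information elements.

```python
import struct

HEADER = struct.Struct(">HH")  # hypothetical: file-info length, hash length


def pack_hash_data(file_info: bytes, hash_value: bytes) -> bytes:
    """Build hash data as header + file information + hash value."""
    return HEADER.pack(len(file_info), len(hash_value)) + file_info + hash_value


def same_file(a: bytes, b: bytes) -> bool:
    """Header, then file information, then hash value (steps S830 to S890)."""
    # S830: records with different structure headers are treated as different.
    if a[:HEADER.size] != b[:HEADER.size]:
        return False
    # S840: the shared header tells us where each component lies.
    info_len, hash_len = HEADER.unpack_from(a)
    start = HEADER.size
    # S850/S860: compare file information first.
    if a[start:start + info_len] != b[start:start + info_len]:
        return False
    # S870/S880: compare hash values last.
    h = start + info_len
    return a[h:h + hash_len] == b[h:h + hash_len]
```

Since the header is only a few bytes, the S830 check is the cheapest of the three and filters out records built with a different layout before any field is interpreted.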

The embodiment of FIG. 8 uses the structure header to distinguish the file information and the hash value that make up the hash data. This embodiment can improve efficiency in systems where the file information and hash value are configured differently from one hash data to another. In addition, because a mismatch between structure headers in step S830 already indicates that the files differ, the identity of the files can be determined more quickly and accurately, and the comparison can be performed efficiently.

FIG. 9 is a block diagram illustrating another embodiment of a hash comparison system according to the disclosed technology. The hash comparison system of FIG. 9 is an embodiment applicable to comparing files in a 1:N relationship. It first compares only the file information, forms a primary comparison group consisting only of files whose file information matches, and then compares hash values only for the files in the primary comparison group.

Referring to FIG. 9, the hash comparison system 200 includes a file information generator 210, a hash generator 220, a controller 250, and a hash comparator 260. In one embodiment, the hash comparison system 200 may further include at least one of a hash file manager 230 and an original file manager 240. In describing the embodiment of FIG. 9, components that are the same as or correspond to those of FIG. 4 are omitted or described only briefly.

The controller 250 may select, from a target file group B, the files that are identical to an original file A. To do so, the controller 250 may select the hash data associated with every file in target file group B, select the hash data of original file A, and compare them. In this comparison, the controller 250 may divide each hash data into file information and a hash value and first compare only the file information. That is, the controller 250 may compare the file information of original file A with the file information of the target files in target file group B and collect the target files whose file information matches into a primary comparison group. The controller 250 may then use the hash comparator 260 to compare the hash values of the target files in the primary comparison group with the hash value of original file A and identify the identical files.

The hash comparator 260 may compare only hash values under the control of the controller 250. In the disclosed example, providing a hash comparator 260 so that the hash value comparison is performed separately allows the comparison to be carried out more efficiently when a search in a 1:N relationship is required.
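The 1:N search of FIG. 9 can be sketched as a filter followed by a hash comparison. The sketch assumes each hash data has already been split into a `(file_info, hash_value)` pair and that target file group B is a mapping from file name to such a pair; both representations and the function names are illustrative.

```python
def primary_group(source_info, candidates):
    """First pass: keep only candidates whose file information matches."""
    return [name for name, (info, _) in candidates.items() if info == source_info]


def find_identical(source, candidates):
    """Second pass: compare hash values only within the primary comparison group."""
    source_info, source_hash = source
    return [name for name in primary_group(source_info, candidates)
            if candidates[name][1] == source_hash]
```

For example, with original file A given as `(b"i1", b"h1")` and a group containing one exact match, one file with the same file information but a different hash, and one with different file information, only the first two enter the primary comparison group and only the exact match is returned.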

Although the present application has been described above with reference to preferred embodiments, those skilled in the art will understand that the present application may be modified and changed in various ways without departing from the spirit and scope of the present application as set forth in the claims below.

Claims

delete

A method of generating hash data for comparison for each original file, the method comprising:
(a) checking the attributes of the original file and generating file information consisting of predetermined data bits based on the identified attributes;
(b) calculating a hash value by applying a hash algorithm to at least a portion of the original file; and
(c) generating hash data by concatenating the hash value to the file information and including parity information.

The method of claim 5, wherein step (a) comprises:
identifying at least one of a size, a name, and a format of the original file, first partial data including first data of the original file, and second partial data including last data of the original file; and
generating the file information including at least one of the size, the name, and the format of the original file, the first partial data including the first data of the original file, and the second partial data including the last data of the original file.

Claim 7 was abandoned upon payment of the registration fee.

The method of claim 5, wherein step (c) comprises:
(d) generating, as the parity information, a hash parity bit for the hash data.

Claim 8 was abandoned when the registration fee was paid.

The method of claim 7, wherein step (d) comprises:
generating a first parity bit for the file information;
generating a second parity bit for the hash value; and
generating the hash parity bit by concatenating the first and second parity bits.

A method of generating hash data for comparison for each original file, the method comprising:
(a) generating a structure header including structure information for each of the file information and the hash value included in the structure of the hash data;
(b) checking the attributes of the original file and generating file information consisting of predetermined data bits based on the identified attributes;
(c) calculating a hash value by applying a hash algorithm to at least a portion of the original file; and
(d) generating hash data by concatenating the hash value to the file information,
wherein the file information includes at least one of a size, a name, and a format of the corresponding original file, first partial data including first data of the original file, and second partial data including last data of the original file.

A hash data comparison method for comparing two original files with each other using hash data including file information and a hash value, the method comprising:
(a) identifying two hash data associated with each of the two original files;
(b) comparing the two file informations included in each of the two hash data with each other; and
(c) comparing the two hash values included in each of the two hash data if the two file informations are the same, and determining the two original files to be the same file if the two hash values are the same,
wherein the file information includes at least one of a size, a name, and a format of the corresponding original file, first partial data including first data of the original file, and second partial data including last data of the original file.

Claim 11 has been deleted.

Claim 12 was abandoned upon payment of a registration fee.

The method of claim 10, wherein step (b) comprises comparing the data bits of the two pieces of file information bit by bit.

Claim 13 was abandoned upon payment of a registration fee.

The method of claim 10, wherein step (b) comprises:
identifying, for each of the two pieces of file information, at least one of the size, name, and format of the original file, first partial data including the first data of the original file, and second partial data including the last data of the original file, as included in the file information; and
comparing the identified at least one of the size, name, format, first partial data, and second partial data between the two pieces of file information.

A hash data comparison method for comparing two original files with each other using hash data including file information, a hash value, and a structure header including structure information about each of the file information and the hash value, the method comprising:
(a) comparing the structure headers of the two original files to determine whether the hash data have the same structure;
(b) comparing the file information associated with each of the two original files if the hash data have the same structure; and
(c) comparing the hash values associated with the two original files with each other if the file information is identical, and determining the two original files to be the same file if the hash values are identical,
wherein the file information includes at least one of a size, a name, and a format of the corresponding original file, first partial data including the first data of the original file, and second partial data including the last data of the original file.
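
The header-first comparison recited above might look like the following Python sketch. It assumes a hypothetical 4-byte structure header that holds the byte lengths of the file information and the hash value, and it assumes both inputs are well formed and at least as long as the header. The claims do not specify this layout.

```python
import struct

HEADER = struct.Struct(">HH")  # assumed header: file-info length, hash length


def compare_hash_data(a: bytes, b: bytes) -> bool:
    """Compare structure header, then file information, then hash value."""
    head_a, head_b = a[:HEADER.size], b[:HEADER.size]
    if head_a != head_b:
        return False  # different structure: stop before reading any field
    info_len, hash_len = HEADER.unpack(head_a)
    info_end = HEADER.size + info_len
    if a[HEADER.size:info_end] != b[HEADER.size:info_end]:
        return False  # file information differs: skip the hash comparison
    return a[info_end:info_end + hash_len] == b[info_end:info_end + hash_len]
```

Checking the header first ensures that the later field offsets mean the same thing in both hash data before any field is compared.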

A hash data comparison system for comparing original files using hash data including file information and a hash value, the system comprising:
a file information generation unit that checks attributes of each of the original files and generates file information on each of the original files;
a hash generation unit configured to calculate a hash value by applying a hash function algorithm to at least a portion of each of the original files; and
a control unit that generates hash data for each original file, the hash data including the file information and the hash value,
wherein the control unit compares the file information and the hash values of the original files to determine whether the original files are identical.

Claim 16 was abandoned upon payment of a registration fee.

The system of claim 15, further comprising a hash file manager that stores the generated hash data and maintains information on the original file associated with the stored hash value.
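
One way to picture the hash file manager is as a lookup table from stored hash data to the original files it was generated from. The Python sketch below is an assumption-laden illustration only: the class name `HashFileManager`, its methods, and the in-memory dictionary are hypothetical, and the source does not describe how the manager stores its data.

```python
from collections import defaultdict
from typing import DefaultDict, List


class HashFileManager:
    """Store hash data and remember which original files it belongs to."""

    def __init__(self) -> None:
        # Maps stored hash data to the names of the associated original files.
        self._files: DefaultDict[bytes, List[str]] = defaultdict(list)

    def add(self, hash_data: bytes, original_name: str) -> None:
        """Record that original_name produced hash_data."""
        self._files[hash_data].append(original_name)

    def originals(self, hash_data: bytes) -> List[str]:
        """Return the original files associated with hash_data, if any."""
        # .get avoids creating an empty entry for unknown hash data.
        return list(self._files.get(hash_data, []))
```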

Claim 17 has been deleted.

Claim 18 was abandoned upon payment of a registration fee.

The system of claim 15, wherein the control unit generates a structure header including identification information about the file information and the hash value, and generates the hash data including the structure header, the file information, and the hash value.

Claim 19 was abandoned upon payment of a registration fee.

The system of claim 18, wherein the control unit sequentially compares the structure header, the file information, and the hash value of each of the original files, and determines the original files to be the same file if all of them are identical.

Claim 20 was abandoned upon payment of a registration fee.

The system of claim 15, wherein the control unit generates a parity bit for the hash data, the parity bit including parity bits calculated for each of the file information and the hash value.