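WO2017146337A1 - Method and apparatus for archiving a database, and method and apparatus for searching an archived database - Google Patents
Method and apparatus for archiving a database, and method and apparatus for searching an archived database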
- Publication number
- WO2017146337A1 (PCT/KR2016/011463)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- group
- record
- compression
- records
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G06F11/1446—Point-in-time backing up or restoration of persistent data
- G06F11/1448—Management of the data involved in backup or backup restore
- G06F11/1451—Management of the data involved in backup or backup restore by selection of backup contents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/113—Details of archiving
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
Definitions
- The present invention relates to a method and apparatus for archiving a database and a method and apparatus for searching an archived database, and more particularly, to a method of archiving a database using a data compression technique, a method of searching the compressed and archived database, and apparatuses therefor.
- An object of the present invention is to provide a method and apparatus for archiving a database, which can reduce the capacity of a database by classifying records of a database according to a predetermined criterion, and compressing and archiving each classified record.
- Another object of the present invention is to provide a more efficient database search method and apparatus by searching the archived database in parallel.
- The database archiving method provided by the present invention includes: selecting, from an original table to be archived, at least one record group including a plurality of records, based on selection information on at least one of time and field values; storing, in a compression table, group compression data generated by compressing each of the at least one selected record group, together with the selection information corresponding to the group compression data; and deleting the plurality of records included in the at least one selected record group from the original table.
- The storing in the compression table may include, for each of the at least one selected record group: storing data of the plurality of records included in the record group in a buffer; compressing the data stored in the buffer to generate the group compression data; acquiring the selection information corresponding to the generated group compression data; and storing the group compression data, together with the selection information, in the same record of the compression table.
- When an excess record group is divided into a plurality of record groups, the storing in the compression table may further include storing a serial number assigned to each of the divided record groups in the compression table.
- The database archiving apparatus includes the following units:
- a data selection unit that selects, from an original table to be archived, at least one record group including a plurality of records, based on selection information on at least one of time and field values;
- a data compression unit that compresses each of the at least one selected record group to generate group compression data; and
- a DB management unit that stores the group compression data and the selection information corresponding to the group compression data in a compression table, and deletes the plurality of records included in the at least one selected record group from the original table.
- For each of the at least one selected record group, the data compression unit may store data of the plurality of records included in the record group in a buffer and compress the data stored in the buffer to generate the group compression data.
- The DB management unit may acquire the selection information corresponding to the generated group compression data and store the group compression data, together with the selection information, in the same record of the compression table.
- If the at least one selected record group contains an excess record group in which the number of records exceeds a threshold value, the data selection unit may divide the excess record group into a plurality of record groups in which the number of records is less than or equal to the threshold value.
- The DB management unit may further store the serial number assigned to each of the divided record groups in the compression table.
- The method of searching an archived database includes: receiving a search condition for searching for records in a compression table that includes selection information on at least one of time and field values and group compression data generated by compressing a plurality of records corresponding to the selection information;
- determining the number of DB search processes for processing the search of the records in parallel, based on at least one of the performance of the computer on which the search is performed and the number of the group compression data corresponding to the selection information satisfying the search condition; and performing, in parallel, a search for records that satisfy the search condition based on the determined number of DB search processes.
- The determining of the number of DB search processes includes: collecting computer performance information on at least one of the number of CPUs included in the computer, the capacity of a memory, and the input/output speed of a storage device; determining, among the group compression data stored in the compression table, the number of the group compression data corresponding to the selection information satisfying the received search condition; and determining the number of DB search processes for processing the search of the records in parallel, based on at least one of the collected computer performance information and the determined number of group compression data.
- The performing of the search in parallel includes: allocating at least one group compression data to each of the determined number of DB search processes, based on the number of the group compression data corresponding to the selection information satisfying the search condition; and, for each DB search process, decompressing the allocated at least one group compression data and searching for records that satisfy the search condition, in parallel.
- The performing of the search in parallel may be further based on table structure information, which is information on the type, size, order, and name of the fields included in the original table archived in the compression table.
- the DB search process may perform a search using a process or thread allocated to each DB search process.
- The apparatus for searching an archived database includes: a receiving unit that receives a search condition for searching for records in a compression table that includes selection information on at least one of time and field values and group compression data generated by compressing a plurality of records corresponding to the selection information;
- a search preparation unit that determines the number of DB search processes for processing the search of the records in parallel, based on at least one of the performance of the computer on which the search is performed and the number of the group compression data corresponding to the selection information satisfying the search condition; and
- a parallel search unit that performs, in parallel, a search for records that satisfy the search condition based on the determined number of DB search processes.
- The search preparation unit may collect computer performance information on at least one of the number of CPUs included in the computer, the capacity of a memory, and the input/output speed of a storage device; determine, among the group compression data stored in the compression table, the number of the group compression data corresponding to the selection information satisfying the received search condition; and determine the number of DB search processes for processing the search of the records in parallel, based on at least one of the collected computer performance information and the determined number of group compression data.
- The parallel search unit may allocate at least one group compression data to each of the determined number of DB search processes, based on the number of the group compression data corresponding to the selection information satisfying the search condition, and may perform, in parallel, the decompression of the group compression data allocated to each DB search process and the search for records satisfying the search condition.
- The parallel search unit may perform the search further based on table structure information, which is information on the type, size, order, and name of the fields included in the original table archived in the compression table.
- the DB search process may perform a search using a process or thread allocated to each DB search process.
- According to the present invention, data stored in a database is classified according to search frequency, importance, and the like, and is then compressed and archived according to the classification result, which significantly reduces the storage used by the database and maximizes the search efficiency of the archived data.
- FIG. 1 is a flowchart illustrating a database archiving method according to an embodiment of the present invention.
- FIG. 2 is a flowchart illustrating a method of storing selection information in a compression table according to an embodiment of the present invention.
- FIG. 3 is a flowchart illustrating a search method of an archived database according to an embodiment of the present invention.
- FIG. 4 is a flowchart illustrating a method of determining the number of DB search processes according to an embodiment of the present invention.
- FIG. 5 is a diagram illustrating a database archiving apparatus according to an embodiment of the present invention.
- FIG. 6 is a diagram illustrating an apparatus for searching an archived database according to an embodiment of the present invention.
- FIGS. 7 and 8 are diagrams for explaining a compression table according to an embodiment of the present invention.
- Terms such as first, second, A, and B may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.
- For example, the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component.
- FIG. 1 is a flowchart illustrating a database archiving method according to an embodiment of the present invention.
- In step S110, the database archiving apparatus selects at least one record group including a plurality of records from the original table to be archived, based on selection information on at least one of time and field values.
- a table is a unit that forms a basic structure for storing data in a database, and an original table may be a table to be archived in order to save capacity among a plurality of tables included in the database.
- the database archiving apparatus may select at least one record group including a plurality of records based on the selection information for at least one of a predetermined time and a field value in the original table.
- the selection information may be information about time, information about a specific field value included in the original table, or information including both.
- the selection information on time may be information for selecting records on a monthly basis using a field for time included in the original table.
- the selection information on the field value may be information for selecting a record according to the field value of the field by using a predetermined field included in the original table.
- the selection information for both time and field values may be information for selecting records by using a field for time and a predetermined field together.
- the field value included in the selection information may be determined as the field value of the field where the most frequent search occurs in the original table.
- the reason for selecting records using the most frequently searched fields is that the efficiency can be maximized when searching the archived database after being archived.
- A record group is a group composed of a plurality of records classified using the selection information among all records included in the original table. At least one record group may be generated according to the selection criterion. If necessary, record groups may be generated from only a subset of the records in the original table rather than all of them. For example, record groups can be created to archive only the records before 2015 in the original table.
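The selection step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `select_record_groups`, the dictionary-based records, the field names, and the ISO `YYYY-MM-DD` date strings are all assumptions made for the example.

```python
from collections import defaultdict

def select_record_groups(records, time_field=None, value_fields=(), before=None):
    """Group records by month of `time_field` and/or by the given field values.

    `before` (an ISO date string, hypothetical) limits archiving to older
    records and requires `time_field` to be set.
    """
    groups = defaultdict(list)
    for rec in records:
        if before is not None and rec[time_field] >= before:
            continue  # newer records stay in the original table
        key = []
        if time_field is not None:
            key.append(rec[time_field][:7])  # "YYYY-MM" prefix = monthly selection
        key.extend(rec[f] for f in value_fields)
        groups[tuple(key)].append(rec)
    return dict(groups)

records = [
    {"doc_no": 1, "date": "2014-01-15", "col1": 1000},
    {"doc_no": 2, "date": "2014-01-20", "col1": 1000},
    {"doc_no": 3, "date": "2014-02-03", "col1": 2000},
    {"doc_no": 4, "date": "2015-03-09", "col1": 1000},
]
groups = select_record_groups(
    records, time_field="date", value_fields=("col1",), before="2015-01-01"
)
```

Each dictionary key plays the role of the selection information for one record group; the 2015 record is left out because only records before 2015 are selected.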
- the number of records included in one record group may be determined by comprehensively analyzing and reviewing the total number of records included in the original table, the performance of a computer searching a database, and a search condition pattern of a database.
- If the selected record groups include an excess record group in which the number of records exceeds a threshold, the database archiving apparatus may divide the excess record group into a plurality of record groups in which the number of records is less than or equal to the threshold.
- For example, the threshold, which is the maximum number of records that one record group may contain, may be set to 100,000. If a selected record group contains more records than the threshold, searching it may overload the computer and make the search process inefficient.
- In this case, the database archiving apparatus may divide the excess record group into two record groups of 100,000 records each and one record group of 50,000 records, for a total of three record groups (250,000 records in all).
- a serial number (e.g., 1,2,3,4, ...) can be assigned to each of the separated plurality of record groups and further stored in the serial number field of the compression table. In this case, even when searching the archived database, the search can be performed by distinguishing each record group. This will be described later in detail with reference to FIG. 7.
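The division of an excess record group and the assignment of serial numbers can be illustrated with a short sketch. The function name `split_record_group` and the list-based record group are assumptions for the example; only the threshold of 100,000 and the 100,000 / 100,000 / 50,000 split come from the text.

```python
def split_record_group(records, threshold=100_000):
    """Split a record group into chunks of at most `threshold` records.

    Returns (serial_number, chunk) pairs, with serial numbers starting at 1.
    """
    starts = range(0, len(records), threshold)
    return [
        (serial, records[start:start + threshold])
        for serial, start in enumerate(starts, start=1)
    ]

excess_group = list(range(250_000))  # stand-in for 250,000 records
parts = split_record_group(excess_group)
sizes = [len(chunk) for _, chunk in parts]
serials = [serial for serial, _ in parts]
```

A group that is already within the threshold comes back as a single chunk with serial number 1, so the same code path can be used for every record group.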
- In step S120, the database archiving apparatus stores, in a compression table, the group compression data generated by compressing each of the at least one selected record group, together with the selection information corresponding to the group compression data.
- the compression table is a table in which archived data is stored by compressing an original table in record group units.
- the compression table may include a field for storing group compression data generated by compression for each record group and at least one field for storing selection information corresponding to the group compression data.
- the group compression data may be binary data generated by compressing each of the classified record groups. A detailed process of generating the group compression data and storing the group compression data in the compression table will be described later in detail with reference to FIGS. 2, 7 and 8.
- In step S130, the database archiving apparatus deletes the plurality of records included in the selected at least one record group from the original table.
- the database archiving method has an effect of dramatically reducing the capacity of the database by archiving the database through a compression procedure.
- In addition, archiving the original table by time or by frequently searched field values maximizes search efficiency when the archived data is later retrieved.
- FIG. 2 is a flowchart illustrating a method of storing selection information in a compression table according to an embodiment of the present invention.
- the process of storing the selection information in the compression table may be performed for each of the at least one selected record group.
- In step S210, the database archiving apparatus stores data of a plurality of records included in the record group in a buffer.
- the size of the buffer in which the data of the plurality of records is stored may be determined based on the table structure (number, type and size of fields) of the original table and the threshold of the records included in the record group.
- the database archiving apparatus may sequentially read all of the records included in the record group and the field values of the records, and sequentially store them in the buffer.
- In step S220, the database archiving apparatus compresses the data stored in the buffer to generate group compression data.
- the group compression data may be a binary result generated by compressing data of a record group stored in a buffer.
- a lossless compression algorithm such as ZIP, CTW, LZ77, or LZW may be used.
- In step S230, the database archiving apparatus acquires the selection information corresponding to the generated group compression data.
- For example, the group compression data may have been generated from a plurality of records whose selection information corresponds to February 2015. In this case, the selection information corresponding to the group compression data may be February 2015.
- In step S240, the database archiving apparatus stores the generated group compression data, together with the acquired selection information, in the same record of the compression table.
- the compression table may include a field for storing group compression data in binary form and at least one field for storing selection information. That is, the generated group compressed data may be stored in a field for storing the compressed binary data, and the selection information corresponding to the group compressed data may be distributed and stored in the at least one field.
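Steps S210 to S240 can be sketched as below. This is an illustration under stated assumptions: `zlib` (DEFLATE) stands in for the lossless compressor, records are serialized as JSON lines, SQLite plays the role of the database, and the table and column names (`compression_table`, `month`, `serial`, `data`) are hypothetical.

```python
import json
import sqlite3
import zlib

def store_group(conn, month, serial, records):
    """Buffer, compress, and store one record group with its selection info."""
    # S210: read the records in order and append them to a buffer.
    buffer = bytearray()
    for rec in records:
        buffer += json.dumps(rec).encode("utf-8") + b"\n"
    # S220: compress the buffer into binary group compression data.
    blob = zlib.compress(bytes(buffer))
    # S230/S240: store the blob in the same row as its selection information.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO compression_table (month, serial, data) VALUES (?, ?, ?)",
            (month, serial, blob),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compression_table (month TEXT, serial INTEGER, data BLOB)")
records = [{"doc_no": i, "date": "2015-02-10"} for i in range(1, 4)]
store_group(conn, "2015-02", 1, records)
row = conn.execute("SELECT month, serial, data FROM compression_table").fetchone()
restored = [json.loads(line) for line in zlib.decompress(row[2]).splitlines()]
```

In a full archiving run, the deletion of the same records from the original table (step S130) would normally be placed in the same transaction as the insert, so that a failure cannot leave records both archived and deleted, or neither.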
- FIGS. 7 and 8 are diagrams for explaining a compression table. The structure of the compression table is as follows.
- Referring to FIG. 7, the original table 710 includes a Date field 714 with respect to time.
- The original table 710 is classified based on the field value of the Date field 714, which is the selection information on time, and then, for each classification result, the group compression data 726 and the selection information 722 on time corresponding to the group compression data may be stored in the compression table 720.
- The number of records whose Date field 714 value is 2002.01 can be counted by referring to the field values of the Doc.No. field 712.
- If that number exceeds the threshold, the record group can be divided into two record groups containing 100,000 and 90,000 records, respectively (190,000 records in total).
- Group compression data 726 corresponding to the two divided record groups are generated, serial numbers 724 of 1 and 2 are assigned to the generated group compression data 726, and the serial numbers and group compression data are stored together in the compression table 720.
- Referring to FIG. 8, the original table 810 includes a Date field 814 with respect to time, and a Col1 field 816 and a Col2 field 818, which are frequently searched fields.
- The original table 810 is classified based on the field values of the Date field 814, the Col1 field 816, and the Col2 field 818, which are the selection information on time and field values, and then, for each classification result, the group compression data 825 and the selection information 821, 822, and 823 corresponding to the group compression data may be stored in the compression table 820.
- That is, referring to the field values of the Doc.No. field 812, the records in which the value of the Date field 814 is 2002.01, the Col1 field 816 is 1000, and the Col2 field 818 is A are the 60,000 records from 90,001 to 150,000.
- These 60,000 records form one record group, which can be compressed into the group compression data 825 and stored together with the corresponding selection information 821, 822, and 823.
- This method of storing the selection information in the compression table stores the group compression data and the corresponding selection information in the same record of the compression table, so that the group compression data can later be found efficiently using only the selection information.
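A compression table shaped like the one described for FIG. 8 can be sketched in SQLite to show how group compression data is located from selection information alone. The column names, the index, the placeholder blobs, and the helper `find_groups` are all assumptions for illustration; the patent does not prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compression_table ("
    "date TEXT, col1 INTEGER, col2 TEXT, serial INTEGER, data BLOB)"
)
# Hypothetical index on the selection-information columns.
conn.execute("CREATE INDEX idx_selection ON compression_table (date, col1, col2)")
conn.executemany(
    "INSERT INTO compression_table VALUES (?, ?, ?, ?, ?)",
    [
        ("2002.01", 1000, "A", 1, b"blob-a"),  # placeholder blobs, not real data
        ("2002.01", 1000, "B", 1, b"blob-b"),
        ("2002.02", 1000, "A", 1, b"blob-c"),
    ],
)

def find_groups(conn, date=None, col1=None, col2=None):
    """Return the blobs whose selection information matches the given values."""
    clauses, params = [], []
    for column, value in (("date", date), ("col1", col1), ("col2", col2)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = " AND ".join(clauses) or "1"  # no filter selects every group
    sql = (
        f"SELECT data FROM compression_table WHERE {where} "
        "ORDER BY date, col1, col2, serial"
    )
    return [row[0] for row in conn.execute(sql, params)]

matching = find_groups(conn, date="2002.01", col2="A")
```

Only the rows returned here need to be decompressed, which is why choosing frequently searched fields as selection information narrows the work done at search time.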
- FIG. 3 is a flowchart illustrating a search method of an archived database according to an embodiment of the present invention.
- First, the search apparatus of the database receives a search condition for searching for a record desired by a user in a compression table that includes selection information on at least one of time and field values and group compression data generated by compressing a plurality of records corresponding to the selection information.
- the received search condition may be a search condition in the form of a structured query language (SQL) statement. That is, the search apparatus of the database may receive a search condition for searching for a record desired by a user in the form of an SQL statement in a compression table in which group compression data and selection information corresponding to the group compression data are stored.
- The user may generate a search condition for searching for a record in the original table without knowing that the compression table will be searched, and the search apparatus of the database may receive the generated search condition.
- In step S320, the search apparatus of the database determines the number of DB search processes for processing the search of records in parallel, based on at least one of the performance of the computer on which the search is performed and the number of group compression data corresponding to the selection information satisfying the search condition.
- the DB retrieval process refers to a single process of retrieving records from an archived database. Therefore, if a record search is processed in parallel, it can be understood that there exist a plurality of DB search processes, which proceed simultaneously.
- the number of DB retrieval processes is determined based on the performance of the computer because the process of retrieving records from the compressed group compression data by each DB retrieval process can place a heavy load on the computer.
- The number of DB retrieval processes is determined based on the number of group compression data corresponding to the selection information satisfying the retrieval condition because that number is ultimately related to the amount or range of the retrieval.
- In step S330, the search apparatus of the database performs, in parallel, a search for records that satisfy the search condition based on the determined number of DB search processes.
- The search apparatus of the database may prepare as many DB search processes as previously determined, divide the search range among the DB search processes, and perform the search of records in parallel.
- The search apparatus of the database may allocate at least one group compression data to each of the DB search processes, and search the records in parallel based on the allocated group compression data.
- For example, given four DB search processes, the search apparatus of the database may allocate two group compression data to each of two of the processes and one group compression data to each of the remaining two (six group compression data in total). The four DB search processes can then search, in parallel, the one or two group compression data allocated to each.
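One simple way to obtain the allocation in this example is round-robin assignment. The sketch below is only one possible policy, and the name `allocate_groups` is hypothetical; the text does not say how the allocation is computed.

```python
def allocate_groups(group_ids, n_processes):
    """Distribute group compression data over DB search processes round-robin."""
    assignments = [[] for _ in range(n_processes)]
    for index, group_id in enumerate(group_ids):
        assignments[index % n_processes].append(group_id)
    return assignments

# Six group compression data shared among four DB search processes.
assignments = allocate_groups([1, 2, 3, 4, 5, 6], 4)
loads = [len(a) for a in assignments]
```

Round-robin keeps the per-process counts within one of each other, which matches the two-two-one-one split described above.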
- each DB search process may be performed by decompressing the allocated group compression data and storing the decompressed group data in a buffer and searching a record satisfying a search condition from the data stored in the buffer.
- The search apparatus of the database may search for records that satisfy the search condition further based on table structure information, which is information on the type, size, order, and name of the fields included in the original table archived in the compression table.
- That is, if the search apparatus of the database knows the table structure information, i.e., the type, size, order, and name of the fields in the original table, it can perform the search more easily based on that information.
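To illustrate why table structure information helps, the sketch below decodes a decompressed buffer of fixed-size records using field names, types, sizes, and order. The layout in `TABLE_STRUCTURE`, the use of Python's `struct` format codes, and the fixed-width encoding are assumptions for this example; the patent does not specify the buffer format.

```python
import struct

# Hypothetical table structure: (field name, struct format code) in field order.
# "I" is a 4-byte unsigned integer, "7s" a 7-byte string such as b"2002.01".
TABLE_STRUCTURE = [("doc_no", "I"), ("date", "7s"), ("col1", "I")]

# "<" selects little-endian layout without padding between fields.
record_format = struct.Struct("<" + "".join(code for _, code in TABLE_STRUCTURE))

def decode_buffer(buffer):
    """Split a decompressed buffer into records using the table structure."""
    names = [name for name, _ in TABLE_STRUCTURE]
    return [dict(zip(names, values)) for values in record_format.iter_unpack(buffer)]

buffer = record_format.pack(1, b"2002.01", 1000) + record_format.pack(2, b"2002.01", 2000)
decoded = decode_buffer(buffer)
hits = [rec for rec in decoded if rec["col1"] == 1000]
```

With the field sizes and order known, each record boundary and each field offset follows directly from the structure, so a search condition on a named field can be evaluated without any per-record parsing metadata.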
- the DB search process may perform a search using a process or thread allocated to each DB search process.
- Since each DB search process must search records in parallel, one child process or thread can be allocated to each DB search process to perform the search.
- Whether a child process or a thread is allocated to each DB search process may be determined by what the CPU type and OS type of the computer on which the search is performed support.
- For example, if the number of DB search processes is determined to be six, six child processes may be allocated, one to each DB search process, and each of the six child processes may search, in parallel, the records in the group compression data allocated to its DB search process.
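The parallel search can be sketched with one worker per DB search process. The example below assumes threads via `concurrent.futures.ThreadPoolExecutor`, `zlib` for decompression, and JSON-lines records; these choices and the names `search_worker`, `parallel_search`, and `make_blob` are illustrative, not taken from the patent. A process pool could be substituted where child processes are preferred.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor

def search_worker(blobs, predicate):
    """One DB search process: decompress each allocated blob and filter records."""
    matches = []
    for blob in blobs:
        buffer = zlib.decompress(blob)  # decompressed group data held in a buffer
        for line in buffer.splitlines():
            record = json.loads(line)
            if predicate(record):
                matches.append(record)
    return matches

def parallel_search(assignments, predicate):
    """Run one worker per assignment list and merge the results in order."""
    with ThreadPoolExecutor(max_workers=len(assignments)) as pool:
        parts = pool.map(search_worker, assignments, [predicate] * len(assignments))
        return [record for part in parts for record in part]

def make_blob(records):
    """Build test group compression data (helper for this example only)."""
    return zlib.compress("\n".join(json.dumps(r) for r in records).encode("utf-8"))

blobs = [
    make_blob([{"doc_no": n, "col1": 1000 if n % 2 else 2000} for n in range(s, s + 3)])
    for s in (1, 4, 7)
]
assignments = [[blobs[0], blobs[1]], [blobs[2]]]  # two DB search processes
hits = parallel_search(assignments, lambda r: r["col1"] == 1000)
```

Here the search condition is a plain Python predicate; translating an SQL condition into such a predicate is outside the scope of this sketch.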
- The search method of the archived database has the effect of searching records in parallel based on the number of DB search processes determined according to the performance of the computer and the range of the database search.
- FIG. 4 is a flowchart illustrating a method of determining the number of DB search processes according to an embodiment of the present invention.
- the search apparatus of the database collects computer performance information about at least one of the number of CPUs included in the computer, the capacity of the memory, and the input / output speed of the storage device.
- the performance of the computer can be determined by the performance of the CPU, memory and storage included in the computer where the retrieval of the record is performed.
- The performance of the CPU may be determined by the number of CPUs mounted on the computer, the clock frequency, the cache size, and the number of cores of each CPU.
- the performance of the memory may be determined by the capacity and the operating clock frequency.
- the performance of the storage device may be determined by the input / output speed.
- the search apparatus of the database may collect computer performance information including information about at least one of a CPU, a memory, and a storage device of a computer on which a record search is performed.
- the search apparatus of the database determines the number of group compression data corresponding to the selection information satisfying the received search condition among the group compression data stored in the compression table.
- That is, the number of group compression data whose selection information satisfies the search condition may be determined.
- In step S430, the search apparatus of the database determines the number of DB search processes for processing the search of records in parallel, based on at least one of the collected computer performance information and the determined number of group compression data.
- The search apparatus of the database may comprehensively analyze the collected computer performance information and the number of group compression data to determine the number of DB search processes for processing the search of records in parallel.
- For example, if the collected computer performance information is very good, the number of DB search processes may be determined according to the number of group compression data. Conversely, if the computer performance is limited, the number of DB search processes may be determined according to the computer performance information even if the determined number of group compression data is large.
- In this way, the number of DB search processes may be determined based on the performance of the computer on which the search is performed and the number of group compression data corresponding to the selection information satisfying the search condition.
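One plausible reading of this rule is to cap the number of DB search processes by both the machine's resources and the number of matching group compression data. The sketch below is a heuristic under that assumption; the function name, the memory-per-worker parameter, and the use of `os.cpu_count()` are not from the patent.

```python
import os

def decide_process_count(n_groups, cpu_count=None, memory_bytes=None,
                         bytes_per_worker=None):
    """Pick a number of DB search processes (hypothetical heuristic).

    The count never exceeds the number of matching group compression data,
    the number of CPUs, or the number of workers that fit in memory.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    limit = cpus
    if memory_bytes is not None and bytes_per_worker:
        limit = min(limit, max(1, memory_bytes // bytes_per_worker))
    return max(1, min(limit, n_groups))

strong_machine = decide_process_count(6, cpu_count=16)   # bounded by group count
weak_machine = decide_process_count(100, cpu_count=4)    # bounded by CPU count
memory_bound = decide_process_count(
    100, cpu_count=8, memory_bytes=2 * 1024**3, bytes_per_worker=1024**3
)
```

On a capable machine the group count is the binding limit, while on a constrained machine the CPU or memory limit applies even when many group compression data match the search condition.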
- FIG. 5 is a diagram illustrating a database archiving apparatus according to an embodiment of the present invention.
- the database archiving apparatus 500 includes a data selection unit 510, a data compression unit 520, and a DB manager 530.
- The database archiving apparatus 500 may be installed on the computer that hosts the database, or on a computer connected to the database over a network.
- the data selecting unit 510 selects at least one record group including a plurality of records from the original table to be archived based on selection information on at least one of time and field values.
- the data compression unit 520 compresses the selected at least one record group for each record group to generate group compression data.
- the DB manager 530 stores the group compression data and selection information corresponding to the group compression data in a compression table, and deletes a plurality of records included in the selected at least one record group from the original table.
- For each of the at least one selected record group, the data compression unit 520 may store data of the plurality of records included in the record group in a buffer and compress the data stored in the buffer to generate the group compression data.
- The DB manager 530 may acquire the selection information corresponding to the generated group compression data, and store the group compression data, together with the selection information, in the same record of the compression table.
- If the at least one selected record group contains an excess record group in which the number of records exceeds a threshold, the data selection unit 510 may further divide the excess record group into a plurality of record groups in which the number of records is less than or equal to the threshold, and the DB manager 530 may further store the serial numbers assigned to each of the divided record groups in the compression table.
- FIG. 6 is a diagram illustrating an apparatus for searching an archived database according to an embodiment of the present invention.
- an apparatus 600 for searching an archived database includes a receiver 610, a search preparation unit 620, and a parallel search unit 630.
- The search apparatus 600 of the archived database may be installed on the computer that hosts the database, or on a computer connected to the database over a network.
- The receiving unit 610 receives a search condition for searching for a record desired by a user in a compression table that includes selection information on at least one of time and field values and group compression data generated by compressing a plurality of records corresponding to the selection information.
- The search preparation unit 620 determines the number of DB search processes for processing the record search in parallel, based on at least one of the performance of the computer on which the search is performed and the number of group compression data corresponding to the selection information satisfying the received search condition.
- The search preparation unit 620 may collect computer performance information on at least one of the number of CPUs included in the computer, the capacity of the memory, and the input/output speed of the storage device, and may determine, among the group compression data stored in the compression table, the number of group compression data corresponding to the selection information satisfying the received search condition.
- the parallel search unit 630 performs a parallel search of records that satisfy the received search condition based on the determined number of DB search processes.
- the parallel search unit 630 allocates at least one group compression data to each of the determined number of DB search processes based on the number of group compression data corresponding to the selection information satisfying the search condition. For example, the decompression of at least one group compressed data allocated to each DB search process and a search of a record satisfying a search condition may be performed in parallel.
- the parallel search unit 630 may further be based on table structure information which is information on the type, size, order, and name of fields included in the original table archived as the compression table.
- the DB search process may perform a search using a process or thread allocated to each DB search process.
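The search flow of the apparatus 600 can be sketched in the same spirit. Everything named here is an assumption made for illustration: the in-memory `COMPRESSION_TABLE`, the JSON-plus-zlib encoding, the round-robin allocation, and in particular the `decide_worker_count` heuristic, which only looks at the CPU count and the number of matching groups (the source also mentions memory capacity and storage I/O speed but gives no formula).

```python
import json
import os
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_group(day, serial, records):
    """Build one hypothetical compression-table row: (selection info, serial, blob)."""
    return (day, serial, zlib.compress(json.dumps(records).encode("utf-8")))


COMPRESSION_TABLE = [
    make_group("2016-02-25", 0, [{"id": 1, "user": "kim"}, {"id": 2, "user": "lee"}]),
    make_group("2016-02-25", 1, [{"id": 3, "user": "kim"}]),
    make_group("2016-02-26", 0, [{"id": 4, "user": "kim"}]),
]


def decide_worker_count(n_groups, cpu_count=None):
    """Assumed heuristic: no more workers than CPUs or matching groups."""
    cpus = cpu_count or os.cpu_count() or 1
    return max(1, min(cpus, n_groups))


def search_groups(groups, predicate):
    """Decompress each allocated group and keep the records that match."""
    hits = []
    for _day, _serial, blob in groups:
        records = json.loads(zlib.decompress(blob).decode("utf-8"))
        hits.extend(r for r in records if predicate(r))
    return hits


def parallel_search(table, day, predicate, cpu_count=None):
    # Step 1: use the selection information to skip groups that cannot match.
    candidates = [g for g in table if g[0] == day]
    if not candidates:
        return []
    # Step 2: decide how many search workers to run.
    n_workers = decide_worker_count(len(candidates), cpu_count)
    # Step 3: allocate the candidate groups round-robin, one share per worker.
    shares = [candidates[i::n_workers] for i in range(n_workers)]
    # Step 4: decompress and filter each share in parallel (threads here).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(lambda share: search_groups(share, predicate), shares)
        merged = [r for hits in results for r in hits]
    return sorted(merged, key=lambda r: r["id"])


found = parallel_search(
    COMPRESSION_TABLE, "2016-02-25", lambda r: r["user"] == "kim"
)
```

Threads are used here because they keep the sketch short; a process pool would be the other option the description allows, at the cost of requiring a picklable predicate instead of a lambda.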
- The above-described embodiments of the present invention can be written as a program executable on a computer, and can be implemented on a general-purpose digital computer that runs the program from a computer-readable recording medium.
- The computer-readable recording medium may include a magnetic storage medium (for example, a ROM, a floppy disk, a hard disk, etc.) and an optically readable medium (for example, a CD-ROM, DVD, etc.).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (16)
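- A database archiving method comprising: selecting, from an original table to be archived, at least one record group including a plurality of records, based on selection information on at least one of time and field values; for each of the selected at least one record group, storing, in a compression table, group compression data generated by compressing the record group and the selection information corresponding to the group compression data; and deleting, from the original table, the plurality of records included in the selected at least one record group.
- The method of claim 1, wherein storing the selection information in the compression table comprises, for each of the selected at least one record group: storing the data of the plurality of records included in the record group in a buffer; compressing the data stored in the buffer to generate the group compression data; acquiring the selection information corresponding to the generated group compression data; and storing the group compression data and the selection information in the same record of the compression table.
- The method of claim 1, further comprising, when an excess record group whose number of records exceeds a threshold exists among the selected at least one record group, dividing the excess record group into a plurality of record groups each having a number of records less than or equal to the threshold, wherein storing the selection information in the compression table further comprises storing, in the compression table, a serial number assigned to each of the divided record groups.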
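- A database archiving apparatus comprising: a data selection unit configured to select, from an original table to be archived, at least one record group including a plurality of records, based on selection information on at least one of time and field values; a data compression unit configured to generate group compression data by compressing each of the selected at least one record group; and a DB manager configured to store the group compression data and the selection information corresponding to the group compression data in a compression table, and to delete, from the original table, the plurality of records included in the selected at least one record group.
- The apparatus of claim 4, wherein the data compression unit, for each of the selected at least one record group, stores the data of the plurality of records included in the record group in a buffer and compresses the data stored in the buffer to generate the group compression data, and wherein the DB manager acquires the selection information corresponding to the generated group compression data and stores the group compression data and the selection information in the same record of the compression table.
- The apparatus of claim 5, wherein, when an excess record group whose number of records exceeds a threshold exists among the selected at least one record group, the data selection unit further divides the excess record group into a plurality of record groups each having a number of records less than or equal to the threshold, and wherein the DB manager further stores, in the compression table, a serial number assigned to each of the divided record groups.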
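- A method of searching an archived database, the method comprising: receiving a search condition for retrieving records desired by a user from a compression table that includes selection information on at least one of time and field values and group compression data generated by compressing a plurality of records corresponding to the selection information; determining the number of DB search processes for processing the record search in parallel, based on at least one of the performance of the computer on which the search is performed and the number of group compression data corresponding to the selection information satisfying the search condition; and performing, in parallel, a search for records satisfying the search condition based on the determined number of DB search processes.
- The method of claim 7, wherein determining the number of DB search processes for parallel processing comprises: collecting computer performance information on at least one of the number of CPUs included in the computer, the capacity of the memory, and the input/output speed of the storage device; determining, among the group compression data stored in the compression table, the number of group compression data corresponding to the selection information satisfying the received search condition; and determining the number of DB search processes for processing the record search in parallel, based on at least one of the collected computer performance information and the determined number of group compression data.
- The method of claim 7, wherein performing the search for records satisfying the search condition in parallel comprises: allocating at least one group compression data to each of the determined number of DB search processes, based on the number of group compression data corresponding to the selection information satisfying the search condition; and performing in parallel, for each DB search process, decompression of the allocated at least one group compression data and a search for records satisfying the search condition.
- The method of claim 7, wherein performing the search for records satisfying the search condition in parallel is further based on table structure information, which is information on the type, size, order, and name of the fields included in the original table archived as the compression table.
- The method of claim 7, wherein each DB search process performs the search using a process or thread allocated to that DB search process.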
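- An apparatus for searching an archived database, the apparatus comprising: a receiver configured to receive a search condition for retrieving records desired by a user from a compression table that includes selection information on at least one of time and field values and group compression data generated by compressing a plurality of records corresponding to the selection information; a search preparation unit configured to determine the number of DB search processes for processing the record search in parallel, based on at least one of the performance of the computer on which the search is performed and the number of group compression data corresponding to the selection information satisfying the search condition; and a parallel search unit configured to perform, in parallel, a search for records satisfying the search condition based on the determined number of DB search processes.
- The apparatus of claim 12, wherein the search preparation unit collects computer performance information on at least one of the number of CPUs included in the computer, the capacity of the memory, and the input/output speed of the storage device, determines, among the group compression data stored in the compression table, the number of group compression data corresponding to the selection information satisfying the received search condition, and determines the number of DB search processes for processing the record search in parallel based on at least one of the collected computer performance information and the determined number of group compression data.
- The apparatus of claim 12, wherein the parallel search unit allocates at least one group compression data to each of the determined number of DB search processes, based on the number of group compression data corresponding to the selection information satisfying the search condition, and performs in parallel, for each DB search process, decompression of the allocated at least one group compression data and a search for records satisfying the search condition.
- The apparatus of claim 12, wherein the parallel search unit further bases the search on table structure information, which is information on the type, size, order, and name of the fields included in the original table archived as the compression table.
- The apparatus of claim 12, wherein each DB search process performs the search using a process or thread allocated to that DB search process.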
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP16891741.7A EP3422204A4 (en) | 2016-02-26 | 2016-10-13 | METHOD AND APPARATUS FOR ARCHIVING A DATABASE AND METHOD AND APPARATUS FOR SEARCHING AN ARCHIVED DATABASE |
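JP2018543247A JP6638821B2 (ja) | 2016-02-26 | 2016-10-13 | Method and apparatus for archiving a database, and method and apparatus for searching an archived database |
CN201680081603.6A CN108701134A (zh) | 2016-02-26 | 2016-10-13 | Method and apparatus for archiving a database, and method and apparatus for searching an archived database |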
AU2016394743A AU2016394743A1 (en) | 2016-02-26 | 2016-10-13 | Method and apparatus for archiving database, and method and apparatus for searching archived database |
US16/077,208 US11030050B2 (en) | 2016-02-26 | 2016-10-13 | Method and device of archiving database and method and device of retrieving archived database |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160023401A KR101663547B1 (ko) | 2016-02-26 | 2016-02-26 | Method and apparatus for archiving a database, and method and apparatus for searching an archived database |
KR10-2016-0023401 | 2016-02-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017146337A1 (ko) | 2017-08-31 |
Family ID: 57145318
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2016/011463 WO2017146337A1 (ko) | Method and apparatus for archiving a database, and method and apparatus for searching an archived database | 2016-02-26 | 2016-10-13 |
Country Status (7)
Country | Link |
---|---|
US (1) | US11030050B2 (ko) |
EP (1) | EP3422204A4 (ko) |
JP (1) | JP6638821B2 (ko) |
KR (1) | KR101663547B1 (ko) |
CN (1) | CN108701134A (ko) |
AU (1) | AU2016394743A1 (ko) |
WO (1) | WO2017146337A1 (ko) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110874417B (zh) | 2018-09-04 | 2024-04-16 | 华为技术有限公司 | Method and apparatus for data retrieval |
CN111090652B (zh) * | 2019-12-20 | 2023-05-23 | 山大地纬软件股份有限公司 | Data archiving method and apparatus for a horizontally scalable archive database |
US11907713B2 (en) | 2019-12-28 | 2024-02-20 | Intel Corporation | Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator |
KR102559290B1 (ko) * | 2020-01-06 | 2023-07-26 | 주식회사 아미크 | Hybrid cloud-based real-time data archiving method and system |
US11676066B2 (en) * | 2020-01-17 | 2023-06-13 | Western Digital Technologies, Inc. | Parallel model deployment for artificial intelligence using a primary storage system |
KR102256814B1 (ko) | 2020-09-10 | 2021-05-27 | 주식회사 아미크 | Method and system for selecting target data |
CN113111032B (zh) * | 2021-04-20 | 2022-03-08 | 河南水利与环境职业学院 | Data archiving method and system for an archive management system |
CN113791742B (zh) * | 2021-11-18 | 2022-03-25 | 南湖实验室 | High-performance data lake system and data storage method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2010287024A (ja) * | 2009-06-11 | 2010-12-24 | Yaskawa Information Systems Co Ltd | Archive system, search program for an archive system, and search method using an archive system |
JP2011048679A (ja) * | 2009-08-27 | 2011-03-10 | Nec Corp | Storage system, management method, and program |
JP2013065224A (ja) * | 2011-09-20 | 2013-04-11 | Kddi Corp | Mail archive system |
KR20140072929A (ko) * | 2012-11-16 | 2014-06-16 | 현대중공업 주식회사 | Method for automating the execution of archiving tasks |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9477729B2 (en) * | 2004-02-20 | 2016-10-25 | Informatica Llc | Domain based keyword search |
US8832045B2 (en) | 2006-04-07 | 2014-09-09 | Data Storage Group, Inc. | Data compression and storage techniques |
US8229902B2 (en) * | 2006-11-01 | 2012-07-24 | Ab Initio Technology Llc | Managing storage of individually accessible data units |
US9767098B2 (en) * | 2012-08-08 | 2017-09-19 | Amazon Technologies, Inc. | Archival data storage system |
EP2937794B1 (en) * | 2014-04-22 | 2016-08-17 | DataVard GmbH | Method and system for archiving digital data |
2016
Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2016-02-26 | KR | KR1020160023401A | KR101663547B1 (ko) | Active, IP Right Grant |
2016-10-13 | CN | CN201680081603.6A | CN108701134A (zh) | Not active, Withdrawn |
2016-10-13 | AU | AU2016394743A | AU2016394743A1 (en) | Not active, Abandoned |
2016-10-13 | US | US16/077,208 | US11030050B2 (en) | Active |
2016-10-13 | EP | EP16891741.7A | EP3422204A4 (en) | Not active, Withdrawn |
2016-10-13 | JP | JP2018543247A | JP6638821B2 (ja) | Active |
2016-10-13 | WO | PCT/KR2016/011463 | WO2017146337A1 (ko) | Active, Application Filing |
Non-Patent Citations (2)
Title |
---|
KIM, JU CHEOL: "What is Archiving?", STORAGETEC, 10 February 2004 (2004-02-10), XP055413199, Retrieved from the Internet <URL:http://blog.naver.com/rainow/40000828618> * |
See also references of EP3422204A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP3422204A4 (en) | 2020-01-22 |
CN108701134A (zh) | 2018-10-23 |
KR101663547B1 (ko) | 2016-10-07 |
JP6638821B2 (ja) | 2020-01-29 |
AU2016394743A1 (en) | 2018-08-30 |
US11030050B2 (en) | 2021-06-08 |
JP2019512125A (ja) | 2019-05-09 |
US20190026189A1 (en) | 2019-01-24 |
EP3422204A1 (en) | 2019-01-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
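WO2017146337A1 (ko) | Method and apparatus for archiving a database, and method and apparatus for searching an archived database | |
WO2017146338A1 (ko) | Method and apparatus for archiving a database that generates index information, and method and apparatus for searching an archived database that includes index information | |
TWI518530B (zh) | Repeated data processing methods, devices and systems | |
WO2021107211A1 (ko) | In-memory database-based time-series data management system | |
WO2021141294A1 (ko) | Data archiving method and system using hybrid storage of data | |
WO2019156309A1 (ko) | Key-value-based data access apparatus and method using the internal parallelism of a flash storage device | |
JP6527462B2 (ja) | Compression device, compression method, recording medium, and decompression device | |
WO2012050252A1 (ko) | System and method for automatically generating a large-capacity classifier by dynamic combination of classifiers | |
WO2013136418A1 (ja) | Log management computer and log management method | |
WO2010123168A1 (ko) | Database management method and system | |
JP2012203865A (ja) | Search device, search system, method, and program | |
US9183320B2 (en) | Data managing method, apparatus, and recording medium of program, and searching method, apparatus, and medium of program | |
JP2016521402A (ja) | Data organization and high-speed retrieval | |
US20020065793A1 (en) | Sorting system and method executed by plural computers for sorting and distributing data to selected output nodes | |
CN107391769B (zh) | Index query method and apparatus | |
WO2021141292A1 (ko) | Hybrid cloud-based real-time data archiving method and system | |
WO2012046904A1 (ko) | Apparatus and method for providing search information based on multiple resources | |
WO2012030049A2 (ko) | Apparatus and method for classifying similar documents with a dynamic threshold applied | |
CN111045994A (zh) | File classification and retrieval method and system based on a KV database | |
JP6103021B2 (ja) | Data generation method, apparatus, and program; search processing method, apparatus, and program | |
WO2015020422A1 (ko) | Method and apparatus for high-speed similarity measurement using a histogram | |
WO2022097881A1 (ko) | Apparatus and method for detecting a target file based on network packet analysis | |
US11734282B1 (en) | Methods and systems for performing a vectorized delete in a distributed database system | |
KR102529704B1 (ko) | Method and apparatus for processing data of an in-memory database | |
US20240061823A1 (en) | Memory-frugal index design in storage engine |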
Legal Events
Code | Title | Description |
---|---|---|
ENP | Entry into the national phase | Ref document number: 2018543247; Country of ref document: JP; Kind code of ref document: A |
NENP | Non-entry into the national phase | Ref country code: DE |
ENP | Entry into the national phase | Ref document number: 2016394743; Country of ref document: AU; Date of ref document: 20161013; Kind code of ref document: A |
WWE | WIPO information: entry into national phase | Ref document number: 2016891741; Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: 2016891741; Country of ref document: EP; Effective date: 20180926 |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 16891741; Country of ref document: EP; Kind code of ref document: A1 |