WO2015178554A1

WO2015178554A1 - Apparatus and method for managing data source using compression scheme

Info

Publication number: WO2015178554A1
Application number: PCT/KR2014/011164
Authority: WO
Inventors: 한민호
Original assignee: 에스케이플래닛 주식회사
Priority date: 2014-05-22
Filing date: 2014-11-20
Publication date: 2015-11-26
Also published as: KR20150134718A

Abstract

Disclosed is an apparatus and method for managing a data source using a compression scheme. The present invention can generate a hash table in which each of elements corresponding to data sources is divided into a plurality of fields and stored, determine the number of compression data sources to be compressed from among the data sources using the number of times of inquiries of the fields, compress only the compression data sources from among the data sources, and process a structured query language inquiring request of a user using a cache in which the hash table and the data sources are stored.

Description

Apparatus and method for data source management using compression method

The present invention relates to a data source management apparatus and method using a compression scheme that can overcome the spatial limitations of the existing cache memory and manage data based on the frequency of use of the data stored in the cache.

The present invention claims the benefit of the filing date of Korean Patent Application No. 10-2014-0061843, filed May 22, 2014, the entire contents of which are incorporated herein.

On high-traffic platforms, anywhere from dozens to millions of Structured Query Language (SQL) statements are sent to the database every day. When the same structured query is sent to the database repeatedly, the database responds faster than it did to the first transmission, because the database block was cached in the database server's memory when the query was first sent. However, the database management system (DBMS) still checks the syntax and semantics of each structured query and optimizes its execution plan with the optimizer, which slows the response.

Because of these operations, the data source, which is the response value returned after a structured query is sent, is usually cached, and some databases support such caching themselves. However, with this approach the volume of data is very large, and when many Data Manipulation Language (DML) statements occur the cached data is deleted from the cache, which lowers performance. In addition, if every request is served by the database, the database may be unable to handle the traffic, and there is a high risk of failure.

To reduce such database traffic, queries are served using a caching technique. In this technique, if the query target is not in cache memory at the time of the initial query, the data source retrieved from the database is stored in cache memory; subsequent requests for the same structured query then read the data source from cache memory without querying the database again.

Prior art documents include Korean Laid-Open Patent Publication No. 10-2006-0117006, published May 28, 2008 (name: memory and method for compressing and managing data).

An object of the present invention is to provide a quick response to a structured query request by compressing data sources based on their frequency of use and storing them in the cache server, so that as much data as possible is kept in the cache server.

Another object of the present invention is to manage the data sources maintained in the cache server according to their frequency of use, so that frequently used or recently used data sources are kept in the cache server for as long as possible, thereby increasing the efficiency of the database management system.

The data source management apparatus according to the present invention for achieving the above object comprises a hash table generation unit for generating a hash table in which each of the elements corresponding to the data sources are divided into a plurality of fields; A compression target calculation unit for determining the number of compressed data sources to be compressed among the data sources using the number of inquiries among the fields; A data source compressor for compressing only the compressed data sources of the data sources; And a query processing unit configured to process a structured query query request of a user using a cache in which the hash table and the data sources are stored.

In this case, the query processing unit may use the hash table to determine whether a data source corresponding to the structured query request exists in the cache among the data sources, and may process the structured query request using a database depending on the result.

In this case, when the data source corresponding to the structured query query request does not exist in the cache, the query processing unit provides the user with a database block corresponding to the structured query query request using the database. The database block may be stored in the cache in a form corresponding to the data sources.

In this case, the query processing unit may decompress the data source and provide it to the user according to whether the data source corresponding to the structured query query request is included in the compressed data sources.

At this time, the data source management apparatus may further include a data source decompression unit for decompressing the compressed data sources.

At this time, based on the hash table sorted in descending order of the number of inquiries, the compression target calculation unit may determine the number of compressed data sources using the remaining data sources, excluding the upper data sources on the hash table that account for a predetermined percentage of the total number of inquiries.

In this case, the data source management apparatus may further include a data source updater for deleting at least one data source of which the number of inquiries has not increased during the predetermined period of the data sources from the cache.

At this time, the hash table generation unit may update the hash table for each predetermined period based on the at least one structured query query request requested during the predetermined period.

At this time, the compression target calculation unit may determine the number of the compressed data sources by using the number of inquiries of the updated hash table every predetermined period.

In this case, when the structured query table is changed, the hash table generator may search for and change data sources corresponding to the structured query table among the data sources by using a table name among the fields.

In addition, the data source management method according to the present invention comprises the steps of: generating a hash table storing each of the elements corresponding to the data sources divided into a plurality of fields; Determining the number of compressed data sources of the data sources to be compressed using the number of lookups among the fields; Compressing only the compressed data sources of the data sources; Processing the user's structured query query request using a cache in which the hash table and the data sources are stored.

In this case, the processing may include determining, using the hash table, whether a data source corresponding to the structured query request exists in the cache among the data sources, and processing the structured query request using the database according to the determination result.

In this case, the processing may include providing the user with a database block corresponding to the structured query query request using the database when the data source does not exist in the cache. The database block may be stored in the cache in a form corresponding to the data sources.

In this case, the processing may decompress the data source and provide it to the user depending on whether the data source corresponding to the structured query query request is included in the compressed data sources.

In this case, the data source management method may further include decompressing the compressed data sources.

In this case, the determining may determine the number of compressed data sources, based on the hash table sorted in descending order of the number of lookups, using the remaining data sources excluding the upper data sources on the hash table that account for a predetermined percentage of the total number of lookups.

At this time, the data source management method may further include the step of deleting at least one data source of the data source for which the number of inquiries has not increased during a predetermined period of time from the cache.

In this case, the generating may update the hash table for each predetermined period based on the at least one structured query query request requested during the predetermined period.

In this case, the determining may determine the number of the compressed data sources by using the number of inquiries of the updated hash table every predetermined period of time.

According to the present invention, a data source compression scheme is used to maintain a large amount of data sources in the cache server, thereby providing a faster response when requesting a structured query query.

In addition, the present invention can determine the data sources maintained in the cache server based on the frequency of use of the data sources, thereby maintaining the frequently used or recently used data sources in the cache server to quickly process structured query queries.

FIG. 1 is a block diagram illustrating an apparatus for managing a data source according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating a data source management system according to an embodiment of the present invention.

FIG. 3 illustrates a hash table according to an embodiment of the present invention.

FIG. 4 is a view showing a reverse index table that can be searched by the table name of the hash table according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating a data source management method according to an embodiment of the present invention.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. However, in the following description and the accompanying drawings, detailed descriptions of well-known functions or configurations that may obscure the subject matter of the present invention will be omitted. In addition, it should be noted that like elements are denoted by the same reference numerals as much as possible throughout the drawings.

The terms and words used in the specification and claims below should not be construed as limited to their ordinary or dictionary meanings. Based on the principle that inventors may appropriately define the concepts of terms in order to explain their own invention in the best way, they should be interpreted with meanings and concepts consistent with the technical spirit of the present invention. Therefore, the embodiments described in this specification and the configurations shown in the drawings are only the most preferred embodiments of the present invention and do not represent all of its technical ideas; it should be understood that various equivalents and variations capable of replacing them may exist at the time of the present application. In addition, terms such as “first” and “second” are used to describe various components only to distinguish one component from another, and are not used to limit the components.

Referring to FIG. 1, the data source management apparatus 100 according to an exemplary embodiment of the present invention may include a hash table generator 110, a compression target calculation unit 120, a data source compressor 130, a query processor 140, a data source decompressor 150, and a data source updater 160.

The hash table generator 110 may generate a hash table in which each element corresponding to the data sources is divided into a plurality of fields and stored. To provide fast lookups when searching for a data source, the hash table may be keyed by a value generated by applying a hash function such as MD5 or SHA1 to the structured query text. In addition, the hash table may be kept sorted by the number of inquiries.
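As a minimal sketch of how such a key value might be derived, the following Python function hashes the structured query text with MD5 or SHA1 using the standard hashlib module. The function name make_cache_key and the whitespace normalization step are assumptions made for illustration; the specification does not describe either.

```python
import hashlib


def make_cache_key(sql_text: str, algorithm: str = "sha1") -> str:
    """Return a hex digest of the SQL text for use as a hash-table key."""
    # Assumption of this sketch: collapse runs of whitespace so that
    # trivially different spellings of one statement share a key.
    normalized = " ".join(sql_text.split())
    return hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()


# Hypothetical query text, used only to show the call.
key = make_cache_key("SELECT name FROM users  WHERE id = 7")
```

Because the digest has a fixed length regardless of the query length, it can serve as a compact lookup key even for long statements.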

At this time, based on at least one structured query query request requested for a predetermined period, the hash table may be updated for each predetermined period. For example, when the preset period is set to 24 hours, the number of inquiries of the corresponding data source may be increased according to the structured query query request requested for 24 hours. The sort order of the data sources in the hash table may be changed according to the changed lookup frequency, and the compression state of each of the data sources may also be changed.

In this case, when the structured query table is changed, the data sources corresponding to the structured query table among the data sources may be searched and changed using the table name among the fields. For example, you can construct a reverse index table by analyzing structured queries using table names as key values. Such a reverse index table may be used to delete from the cache when a data manipulation word (DML) occurs in the structured query table.
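The reverse index described above can be sketched roughly as follows. The regular expression used to pull table names out of a query is a deliberately naive assumption (a real system would likely use a SQL parser), and the names build_reverse_index and invalidate_table are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Naive table-name extraction, assumed for illustration only.
TABLE_PATTERN = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE
)


def build_reverse_index(cached_queries: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each table name to the cache keys of the queries that read it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for key, sql_text in cached_queries.items():
        for table in TABLE_PATTERN.findall(sql_text):
            index[table.lower()].add(key)
    return index


def invalidate_table(table: str, index: Dict[str, Set[str]], cache: dict) -> None:
    """Delete every cached data source that depends on a table changed by DML."""
    for key in index.pop(table.lower(), set()):
        cache.pop(key, None)
```

In this sketch, a DML statement against one table removes only the cache entries that reference that table, leaving unrelated data sources in place.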

The compression target calculator 120 may determine the number of compressed data sources to be compressed among the data sources by using the number of inquiries among the fields. By determining compressed data sources using the number of inquiries, data sources with high usage frequency can be kept uncompressed, and data sources with relatively low usage frequency can be compressed and stored.

In this case, based on the hash table sorted in descending order of the number of inquiries, the number of compressed data sources may be determined using the remaining data sources, excluding the top data sources on the hash table that account for a predetermined percentage of the total number of inquiries. For example, the number of inquiries may be summed in the sort order of the hash table, and the number of compressed data sources may be determined from the lower data sources, excluding the data sources ranked within the first 80 percent of the total number of inquiries. The reason for applying this criterion is that the frequently used top 20 percent of data sources can generate 80 percent of the traffic.
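One way to read this criterion is sketched below: the lookup counts are ranked from highest to lowest, a running sum is accumulated until it reaches the given percentage of all lookups, and everything after that point is counted as a compression target. The function name count_compression_targets is hypothetical, and treating the entry that crosses the threshold as uncompressed is an assumption of this sketch, not something the specification states.

```python
from typing import Iterable


def count_compression_targets(touch_counts: Iterable[int], keep_percent: int = 80) -> int:
    """Count data sources outside the top group covering keep_percent of lookups."""
    ranked = sorted(touch_counts, reverse=True)  # highest lookup count first
    total = sum(ranked)
    running = 0
    kept = 0
    for count in ranked:
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= keep_percent * total:
            break
        running += count
        kept += 1  # assumption: the entry that crosses the threshold stays uncompressed
    return len(ranked) - kept


# Hypothetical lookup counts for six cached data sources.
targets = count_compression_targets([50, 25, 10, 8, 4, 3])
```

With these sample counts, the first three entries together cover 85 of the 100 lookups, so the remaining three entries would be treated as compression targets.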

In this case, the number of compressed data sources may be determined using the number of inquiries in the hash table updated at each preset period. For example, the number of inquiries in the hash table may change with the structured query requests received during the preset period. Accordingly, the number of compressed data sources, which is determined from the number of inquiries in the hash table, may also be newly determined using the hash table updated at each preset period.

The data source compressor 130 may compress only the compressed data sources among the data sources. Among the data sources stored in the cache, data sources may be compressed according to the number of compressed data sources determined by the compression target calculator 120. The reason for compressing and storing data sources is to increase the speed of structured query responses by storing as many data sources as possible in the cache. In addition, the compression scheme is used because compressing and decompressing a data source with a central processing unit (CPU) can be faster than performing disk input/output or querying a database. In addition, when compressed, about 10 times more information can be stored in the cache.
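The specification does not name a compression algorithm or a serialization format. Purely as an illustration, the sketch below serializes a result set as JSON and compresses it with zlib from the Python standard library; both choices, and the function names, are assumptions. The achieved ratio depends heavily on the data, so the sample rows here should not be read as confirming any particular figure.

```python
import json
import zlib


def compress_data_source(rows: list) -> bytes:
    """Serialize a result set and compress it in memory."""
    payload = json.dumps(rows, separators=(",", ":")).encode("utf-8")
    return zlib.compress(payload, 6)  # 6 is zlib's usual speed/size trade-off


def decompress_data_source(blob: bytes) -> list:
    """Restore the original result set from its compressed form."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical, highly repetitive result set used only to show the round trip.
rows = [{"id": i, "status": "active", "region": "KR"} for i in range(1000)]
blob = compress_data_source(rows)
ratio = len(json.dumps(rows)) / len(blob)
```

Both operations run entirely on the CPU and in memory, which is the property the paragraph above relies on when comparing compression against disk input/output or a database query.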

The query processor 140 may process the structured query query request of the user using a cache in which a hash table and data sources are stored. When a specific structured query query request is received by the user, a data source corresponding to the structured query query request may be searched using a hash table, and the searched data source may be provided to the user from the cache.

In this case, the structured query query request may be processed using a database according to whether a data source corresponding to the structured query query request among data sources exists in the cache using a hash table. For example, if a data source corresponding to the structured query query request does not exist among the data sources stored in the cache, the data corresponding to the structured query query request may be queried from the database.

At this time, if a data source corresponding to the structured query request does not exist in the cache, the database block corresponding to the structured query request may be provided to the user using the database, and the database block may be stored in the cache in a form corresponding to the data sources. In general, a structured query request that has been made once may occur several more times around the same time. Therefore, when a data source that was not in the cache is requested, the database block can be provided through the database and stored in the cache in the form of a data source, so that the next occurrence of the same query request can be answered quickly.

The data source decompressor 150 may decompress the compressed data sources. For example, the data source decompressor 150 may restore a data source that was previously stored as a compressed data source to the decompressed state according to the hash table updated every predetermined period. In addition, even when the data source corresponding to the structured query request is a compressed data source stored in the cache, the compressed data source may be decompressed and provided to the user.
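The request flow described in the preceding paragraphs (look up the hash table, fall back to the database on a miss, store the result, and decompress on a hit when needed) can be sketched as follows. The class QueryCache, its entry layout, and the query_database callable are hypothetical; zlib and SHA1 are stand-ins chosen for the sketch.

```python
import hashlib
import zlib
from typing import Callable, Dict


class QueryCache:
    """Cache-aside lookup keyed by a hash of the SQL text (illustrative only)."""

    def __init__(self, query_database: Callable[[str], bytes]) -> None:
        self._query_database = query_database  # hypothetical database accessor
        self._entries: Dict[str, dict] = {}

    @staticmethod
    def _key(sql_text: str) -> str:
        return hashlib.sha1(sql_text.encode("utf-8")).hexdigest()

    def get(self, sql_text: str) -> bytes:
        key = self._key(sql_text)
        entry = self._entries.get(key)
        if entry is None:
            # Cache miss: ask the database, then keep the result for next time.
            data = self._query_database(sql_text)
            self._entries[key] = {"data": data, "compressed": False, "touch_count": 1}
            return data
        entry["touch_count"] += 1  # every hit raises the lookup count
        if entry["compressed"]:
            return zlib.decompress(entry["data"])  # decompress before answering
        return entry["data"]

    def compress(self, sql_text: str) -> None:
        """Mark one cached data source as compressed and shrink its payload."""
        entry = self._entries[self._key(sql_text)]
        if not entry["compressed"]:
            entry["data"] = zlib.compress(entry["data"])
            entry["compressed"] = True
```

In this sketch the caller never sees whether an entry is stored compressed; the cache returns the same bytes either way, and the database is contacted only on the first request for a given query.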

The data source updater 160 may delete at least one data source of which the number of inquiries has not increased during a predetermined period of the data sources from the cache. For example, data sources for which the number of inquiries fall within the lower 10 percent during the preset period or for which the number of inquiries have not increased during the predetermined period may be deleted from the cache. In general, when managing data sources on a cache, an expiration time is given to each data source, and data sources whose expiration time expires are deleted from the cache. However, this method can cause a large burden on the database because many data sources expire at the same time and a query is requested again using the database. In addition, in the case of frequently used data sources, similar problems may occur because similar usage patterns may occur.
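A minimal sketch of the count-based deletion rule, assuming the lookup counts are snapshotted at the start and end of each period. The function name select_evictions is hypothetical, and only the "count has not increased" condition is implemented; the lower-10-percent variant mentioned above is left out.

```python
from typing import Dict, List


def select_evictions(previous_counts: Dict[str, int], current_counts: Dict[str, int]) -> List[str]:
    """Return keys whose lookup count did not increase during the period."""
    return [
        key
        for key, count in current_counts.items()
        # Keys missing from the previous snapshot are treated as starting at 0.
        if count <= previous_counts.get(key, 0)
    ]


# Hypothetical snapshots taken one period apart.
stale = select_evictions({"a": 5, "b": 2, "c": 0}, {"a": 9, "b": 2, "c": 1, "d": 0})
```

Because deletion depends on observed use rather than a fixed expiration time, entries that are still being requested are not all dropped at the same moment.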

By managing the data sources stored in the cache with the data source management apparatus 100 as described above, a larger number of data sources can be stored in the cache and frequently used data sources can be kept for a long time, thereby improving the processing speed of the database management system.

Referring to FIG. 2, a data source management system according to an embodiment of the present invention may include a data source management apparatus 200, a cache 210, a database 220, and a structured query transmission server 230.

The data source management apparatus 200 may generate a hash table in which each element corresponding to the data sources is divided into a plurality of fields and stored. To provide fast lookups when searching for a data source, the hash table may be keyed by a value generated by applying a hash function such as MD5 or SHA1 to the structured query text. In addition, the hash table may be kept sorted by the number of inquiries.

In this case, when the structured query table is changed, the data sources corresponding to the structured query table among the data sources may be searched and changed using the table name among the fields. For example, you can construct a reverse index table by analyzing structured queries using table names as key values. The reverse index table may be used to delete from the cache 210 when a data manipulation word (DML) occurs in the structured query table.

In addition, the data source management apparatus 200 may determine the number of compressed data sources to be compressed among the data sources by using the number of inquiries among the fields. By determining compressed data sources using the number of inquiries, data sources with high usage frequency can be kept uncompressed, and data sources with relatively low usage frequency can be compressed and stored.

In addition, the data source management apparatus 200 may compress only the compressed data sources among the data sources. Among the data sources stored in the cache 210, data sources may be compressed according to the number of compressed data sources determined by the compression target calculator 120. The reason for compressing and storing data sources is to improve the speed of structured query responses by storing as many data sources as possible in the cache 210. In addition, the compression scheme is used because compressing and decompressing a data source with a central processing unit can be faster than performing disk input/output or querying the database 220. In addition, when compressed, about 10 times more information can be stored in the cache 210.

In addition, the data source management apparatus 200 may process the structured query request of the user using the cache 210 in which the hash table and the data sources are stored. When a specific structured query request is received from the user, the hash table may be used to search for a data source corresponding to the structured query request, and the retrieved data source may be provided to the user from the cache 210.

In this case, the structured query query request may be processed using the database 220 according to whether a data source corresponding to the structured query query request among data sources exists in the cache 210 using the hash table. For example, if no data source corresponding to the structured query query request exists among the data sources stored in the cache 210, the data corresponding to the structured query query request may be queried from the database 220.

In this case, when the data source corresponding to the structured query request does not exist in the cache 210, the database 220 may be used to provide the user with a database block corresponding to the structured query request, and the database block may be stored in the cache 210 in a form corresponding to the data sources. In general, a structured query request that has been made once may occur several more times around the same time. Therefore, when a data source that does not exist in the cache 210 is requested, the database block can be provided through the database 220 and stored in the cache 210 in the form of a data source, so that the next occurrence of the same query request can be answered quickly.

In addition, the data source management apparatus 200 may decompress the compressed data sources. For example, the data source decompressor 150 may restore a data source that was previously stored as a compressed data source to the decompressed state according to the hash table updated every predetermined period. In addition, even when the data source corresponding to the structured query request is a compressed data source stored in the cache 210, the compressed data source may be decompressed and provided to the user.

In addition, the data source management apparatus 200 may delete from the cache 210 at least one data source whose number of inquiries has not increased during a predetermined period of data sources. For example, data sources for which the number of inquiries belong to the lower 10 percent during the predetermined period or for which the number of inquiries have not increased at all during the predetermined period may be deleted on the cache 210. In general, when managing a data source on the cache 210, an expiration time is given to each data source, so that data sources whose expiration time has expired are deleted from the cache 210. However, this method may cause a large burden on the database 220 because many data sources expire at the same time and a request is made again using the database 220. In addition, in the case of frequently used data sources, similar problems may occur because similar usage patterns may occur.

The cache 210 may store the data sources and the hash table. The data sources stored in the cache 210 may be data sources for which a structured query request has been made at least once from the structured query transmission server 230. The reason for providing the data sources using the cache 210 is that, if a query is sent to the database 220 every time a structured query request is made, the database 220 may be unable to handle the traffic and a failure may occur. Therefore, in order to process a structured query request from the structured query transmission server 230, it may first be checked whether a data source corresponding to the structured query request is stored in the cache 210. At this time, the information on the data sources stored in the cache 210 can be looked up using the hash table. If the data source corresponding to the structured query request does not exist in the cache 210, the database 220 may be queried to provide the corresponding database block to the structured query transmission server 230.

As described above, the database 220 may store data for processing the structured query query request in the form of a database block.

The structured query transmission server 230 may deliver the structured query query request requested by the user to the cache.

By using such a data source management system, the data source for a structured query desired by a user can be quickly retrieved and provided to the user.

FIG. 3 illustrates a hash table according to an embodiment of the present invention.

Referring to FIG. 3, it can be seen that the hash table 310 according to an embodiment of the present invention divides and stores the element 330 of the data source into a plurality of columns 320.

For example, the columns 320 may include Rank, which orders the entries by Touch Count (the number of lookups); Key, which is a value created by applying a hash function such as MD5 or SHA1 to the structured query; Touch Count, which shows the number of lookups; Update Time, which records the time at which the data source was changed; and Compress SQL, which holds the compressed data source as a binary string.
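As a rough illustration of one row of such a table, the dataclass below mirrors the listed columns and adds the table name column mentioned in connection with FIG. 4. The Python field names, the field types, and the helper assign_ranks are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HashTableRow:
    key: str             # Key: hash of the structured query text
    table_name: str      # table the query reads, used by the reverse index
    touch_count: int     # Touch Count: number of lookups
    update_time: float   # Update Time: when the data source last changed
    compress_sql: bytes  # Compress SQL: compressed data source as bytes
    rank: int = 0        # Rank: position when sorted by touch_count


def assign_ranks(rows: List[HashTableRow]) -> List[HashTableRow]:
    """Sort rows by lookup count, highest first, and number them from 1."""
    ordered = sorted(rows, key=lambda row: row.touch_count, reverse=True)
    for position, row in enumerate(ordered, start=1):
        row.rank = position
    return ordered
```

Re-running assign_ranks after each update period would reflect changed lookup counts in the Rank column.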

The hash table 310 may be used to query the data sources stored in the cache, and the frequently used data sources may be kept in the cache for a long time by managing the data sources to be kept in the cache using the number of inquiries.

Referring to FIG. 4, the reverse index table 410, which can be searched by the table name of the hash table according to an embodiment of the present invention, may arrange the key values of the data sources by the value of the table name, which is one of the columns of the hash table. The reverse index table 410 may be used to delete data sources corresponding to the structured query table from the cache when data manipulation occurs in a specific structured query table.

Referring to FIG. 5, the data source management method according to an embodiment of the present invention may generate a hash table in which each element corresponding to the data sources is divided into a plurality of fields and stored (S510). To provide fast lookups when searching for a data source, the hash table may be keyed by a value generated by applying a hash function such as MD5 or SHA1 to the structured query text. In addition, the hash table may be kept sorted by the number of inquiries.

In addition, the data source management method according to an embodiment of the present invention may determine the number of compressed data sources to be compressed among the data sources by using the number of inquiries among the fields (S520). By determining compressed data sources using the number of inquiries, data sources with high usage frequency can be kept uncompressed, and data sources with relatively low usage frequency can be compressed and stored.

In addition, the data source management method according to an embodiment of the present invention may compress only the compressed data sources among the data sources (S530). Among the data sources stored in the cache, data sources may be compressed according to the number of compressed data sources determined by the compression target calculation unit. The reason for compressing and storing data sources is to increase the speed of structured query responses by storing as many data sources as possible in the cache. In addition, the compression scheme is used because compressing and decompressing a data source with a central processing unit can be faster than performing disk input/output or querying a database. In addition, when compressed, about 10 times more information can be stored in the cache.

In addition, the data source management method according to an embodiment of the present invention may process a structured query query request of a user by using a cache in which a hash table and data sources are stored (S540). When a specific structured query query request is received by the user, a data source corresponding to the structured query query request may be searched using a hash table, and the searched data source may be provided to the user from the cache.

Although not shown in FIG. 5, the data source management method according to an embodiment of the present invention may decompress the compressed data sources. For example, the data source decompressor 150 may restore a data source that was previously stored as a compressed data source to the decompressed state according to the hash table updated every predetermined period. In addition, even when the data source corresponding to the structured query request is a compressed data source stored in the cache, the compressed data source may be decompressed and provided to the user.

In addition, although not shown in FIG. 5, the data source management method according to an embodiment of the present invention may delete at least one data source of which the number of inquiries has not increased during a predetermined period of data sources from the cache. For example, data sources for which the number of inquiries fall within the lower 10 percent during the preset period or for which the number of inquiries have not increased during the predetermined period may be deleted from the cache. In general, when managing data sources on a cache, an expiration time is given to each data source, and data sources whose expiration time expires are deleted from the cache. However, this method can cause a large burden on the database because many data sources expire at the same time and a query is requested again using the database. In addition, in the case of frequently used data sources, similar problems may occur because similar usage patterns may occur.

By using such a data source management method, data sources are compressed and stored in a cache based on a frequency of use, and thus, a large amount of data can be kept in a cache to enable a rapid processing of a structured query query request.

The data source management method according to the present invention can be implemented in the form of program instructions that can be executed by various computer means and recorded in a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. Program instructions recorded on the medium may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and any type of hardware device specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. Such hardware devices may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

As described above, the data source management apparatus and method using the compression scheme according to the present invention are not limited to the configurations and methods of the embodiments described above; all or some of the embodiments may be selectively combined so that various modifications can be made.

According to the present invention, a hash table may be generated in which each element corresponding to a data source is divided into a plurality of fields and stored, and the number of compression data sources to be compressed among the data sources may be determined using the number of inquiries among the fields. In addition, only the compression data sources among the data sources may be compressed, and a user's structured query request may be processed using the hash table and a cache storing the data sources. Furthermore, by keeping as many data sources as possible in the cache server through the data source compression scheme, it is possible to respond more quickly to structured query requests while reducing the cost of expanding the cache server.
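The request-processing flow summarized above could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the class DataSourceCache, the use of the query text itself as the hash table key, the query_database callable, and the choice to drop entries when a table changes are all hypothetical, and zlib stands in for whatever compression scheme is actually used. Only the table name and number of inquiries fields are taken from the text.

```python
import zlib
from typing import Callable, Dict, List


class DataSourceCache:
    """Hash table of per-query fields plus a cache of data source bytes."""

    def __init__(self, query_database: Callable[[str], bytes]):
        self.query_database = query_database       # hypothetical DB accessor
        self.table: Dict[str, Dict[str, object]] = {}  # key -> fields
        self.cache: Dict[str, bytes] = {}              # key -> data source

    def process(self, sql: str, table_name: str) -> bytes:
        entry = self.table.get(sql)
        if entry is None:
            # Cache miss: answer from the database and store the result.
            data = self.query_database(sql)
            self.cache[sql] = data
            self.table[sql] = {"table_name": table_name,
                               "inquiries": 1,
                               "compressed": False}
            return data
        # Cache hit: count the inquiry and decompress only if needed.
        entry["inquiries"] += 1
        data = self.cache[sql]
        return zlib.decompress(data) if entry["compressed"] else data

    def compress(self, sql: str) -> None:
        entry = self.table[sql]
        if not entry["compressed"]:
            self.cache[sql] = zlib.compress(self.cache[sql])
            entry["compressed"] = True

    def invalidate_table(self, table_name: str) -> List[str]:
        # Assumed policy: drop entries of a changed table so that the next
        # request for them is answered from the database again.
        stale = [k for k, e in self.table.items()
                 if e["table_name"] == table_name]
        for k in stale:
            del self.table[k]
            del self.cache[k]
        return stale
```

A caller would wrap its real database access in query_database; repeated requests for the same query are then served from the cache, with decompression applied transparently for entries that were compressed.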

Claims

1. A data source management apparatus comprising:

a hash table generator configured to generate a hash table in which each element corresponding to a data source is divided into a plurality of fields and stored;

a compression target calculation unit configured to determine, using the number of inquiries among the fields, the number of compression data sources to be compressed among the data sources;

a data source compressor configured to compress only the compression data sources among the data sources; and

a query processing unit configured to process a user's structured query request using the hash table and a cache in which the data sources are stored.

2. The apparatus according to claim 1, wherein the query processing unit uses the hash table to determine whether a data source corresponding to the structured query request exists in the cache among the data sources, and processes the structured query request using a database according to the result.

3. The apparatus according to claim 2, wherein, if no data source corresponding to the structured query request exists in the cache, the query processing unit provides the user with a database block corresponding to the structured query request using the database, and stores the database block in the cache in a form corresponding to the data sources.

4. The apparatus according to claim 2, wherein the query processing unit decompresses the data source and provides it to the user according to whether the data source corresponding to the structured query request is included in the compression data sources.

5. The apparatus according to claim 4, further comprising a data source decompressor configured to decompress the compression data sources.

6. The apparatus according to claim 1, wherein the compression target calculation unit determines the number of compression data sources using the data sources that remain after excluding, based on the hash table sorted in ascending order, the upper data sources on the hash table that correspond to a predetermined percentage of the total number of inquiries.

7. The apparatus according to claim 1, further comprising a data source update unit configured to delete from the cache at least one data source, among the data sources, whose number of inquiries has not increased during a predetermined period.

8. The apparatus according to claim 7, wherein the hash table generator updates the hash table at every predetermined period based on at least one structured query request made during the predetermined period.

9. The apparatus according to claim 8, wherein the compression target calculation unit determines the number of compression data sources at every predetermined period using the number of inquiries of the updated hash table.

10. The apparatus according to claim 8, wherein, when a structured query table is changed, the hash table generator uses a table name among the fields to look up and change the data sources, among the data sources, that correspond to the structured query table.

11. A data source management method comprising:

generating a hash table in which each element corresponding to a data source is divided into a plurality of fields and stored;

determining, using the number of inquiries among the fields, the number of compression data sources to be compressed among the data sources;

compressing only the compression data sources among the data sources; and

processing a user's structured query request using the hash table and a cache in which the data sources are stored.

12. The method according to claim 11, wherein the processing step comprises:

determining, using the hash table, whether a data source corresponding to the structured query request exists in the cache among the data sources; and

processing the structured query request using a database according to the determination result.

13. The method according to claim 12, wherein the processing step comprises:

if the determination shows that the data source does not exist in the cache, providing the user with a database block corresponding to the structured query request using the database; and

storing the database block in the cache in a form corresponding to the data sources.

14. The method according to claim 12, wherein the processing step comprises decompressing the data source and providing it to the user according to whether the data source corresponding to the structured query request is included in the compression data sources.

15. The method according to claim 14, further comprising decompressing the compression data sources.

16. The method according to claim 11, wherein the determining step comprises determining the number of compression data sources using the data sources that remain after excluding, based on the hash table sorted in ascending order, the upper data sources on the hash table that correspond to a predetermined percentage of the total number of inquiries.

17. The method according to claim 11, further comprising deleting from the cache at least one data source, among the data sources, whose number of inquiries has not increased during a predetermined period.

18. The method according to claim 17, wherein the generating step comprises updating the hash table at every predetermined period based on at least one structured query request made during the predetermined period.

19. The method according to claim 18, wherein the determining step comprises determining the number of compression data sources at every predetermined period using the number of inquiries of the updated hash table.

20. A computer-readable recording medium having recorded thereon a program for executing the method of claim 11.