KR101693687B1

KR101693687B1 - Method for compressing database by column unit

Info

Publication number: KR101693687B1
Application number: KR1020160023065A
Authority: KR
Inventors: 양희정
Original assignee: 주식회사 티맥스데이터
Priority date: 2016-02-26
Filing date: 2016-02-26
Publication date: 2017-01-06

Abstract

According to an embodiment of the present invention, a method for compressing a database on a column-by-column basis is disclosed. The method is performed by a computing device that includes at least one processor and a main memory storing instructions executable by the processor. The method may comprise the steps of: determining a first number of rows for which a compression unit is to be generated, for at least one column located in a table of the database; generating a first compression unit by compressing the determined first number of rows; comparing the size of the first compression unit with the size of an extent; and determining, based on a result of the comparison, a second number of rows for which a compression unit is to be generated for the at least one column.

Description

METHOD FOR COMPRESSING DATABASE BY COLUMN UNIT

The present invention relates to databases and, more particularly, to the compression of data in a database.

Recently, technologies for large-scale database management systems have been advancing. These include big data processing, which handles large structured or unstructured data sets that exceed the capacity of existing database management tools to collect, store, manage, and analyze, and which extracts value from such data and analyzes the results.

Because most of the data in a database that holds large volumes of material consists of character strings, compressing the data before storing it can achieve high compression efficiency. More data can then be stored in a smaller storage space, which can improve the cost-effectiveness of the database.

However, storing data in compressed form has the drawback that performance may be degraded. Thus, there is a need in the art for a solution that can compress and store the data of a database while minimizing performance degradation.

U.S. Patent No. 8,583,692

The present invention has been devised in response to the background art described above, and is intended to provide a solution capable of compressing and storing a database while minimizing performance degradation.

According to an embodiment of the present invention for achieving the above object, a method for column-wise compression of a database is disclosed, the method being performed by a computing device that includes one or more processors and a main memory storing instructions executable by the processors. The column-wise compression method may comprise the steps of: determining a first number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database; generating a first compression unit by compressing the determined first number of rows; comparing the size of the first compression unit with the size of an extent; and determining, based on the comparison result, a second number of rows for which a compression unit is to be generated for the one or more columns.

Alternatively, the extent may comprise a set of data blocks and may have a predetermined size.

Alternatively, the method may further comprise the steps of: determining that an overflow has occurred if the comparison shows that the size of the first compression unit exceeds the size of the extent; and, when an overflow is determined to have occurred, determining whether information on the number of rows at which an underflow occurred exists.

Alternatively, when no information on the number of rows at which the underflow occurred exists, the step of determining, based on the comparison result, the second number of rows for which a compression unit is to be generated for the one or more columns may comprise determining, as the second number, the result of dividing the first number by a first predetermined number.

Alternatively, when information on the number of rows at which the underflow occurred exists, the step of determining, based on the comparison result, the second number of rows for which a compression unit is to be generated for the one or more columns may comprise determining, as the second number, the result of subtracting from the first number the difference between the first number and the number of rows at which the underflow occurred, divided by a second predetermined number.

Alternatively, the method may further comprise the steps of: determining that an underflow has occurred if the comparison shows that the size of the first compression unit is equal to or less than a predetermined ratio of the size of the extent; and, when an underflow is determined to have occurred, determining whether information on the number of rows at which an overflow occurred exists.

Alternatively, when no information on the number of rows at which the overflow occurred exists, the step of determining, based on the comparison result, the second number of rows for which a compression unit is to be generated for the one or more columns may comprise determining, as the second number, the result of multiplying the first number by a third predetermined number.

Alternatively, the third predetermined number may comprise the maximum size that the compression unit can have divided by the size of the first compression unit.

Alternatively, when information on the number of rows at which the overflow occurred exists, the step of determining, based on the comparison result, the second number of rows for which a compression unit is to be generated for the one or more columns may comprise determining, as the second number, the result of adding to the first number the difference between the first number and the number of rows at which the overflow occurred, divided by a fourth predetermined number.

According to another embodiment of the present invention, a database server that provides column-wise compression of a database is disclosed. The database server may include one or more processors and a memory storing instructions executable by the one or more processors. The one or more processors may include: a row number determination module that determines a first number and a second number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database; a compression unit generation module that generates a first compression unit by compressing the determined first number of rows; and a compression unit size comparison module that compares the size of the first compression unit with the size of an extent.

According to yet another embodiment of the present invention, a computer program stored in a computer readable medium and including a plurality of instructions executed by one or more processors is disclosed. The computer program may include: an instruction for determining a first number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database; an instruction for generating a first compression unit by compressing the determined first number of rows; an instruction for comparing the size of the first compression unit with the size of an extent; and an instruction for determining, based on the comparison result, a second number of rows for which a compression unit is to be generated for the one or more columns.

The present invention can provide a solution capable of compressing and storing a database while minimizing performance degradation.

FIG. 1 is a block diagram of a database server that provides column-wise compression of a database according to an embodiment of the present invention.
FIG. 2 is an exemplary diagram illustrating a compression unit of a database according to an embodiment of the present invention.
FIG. 3 is a flowchart of a column-wise compression method of a database according to an embodiment of the present invention.
FIG. 4 is a simplified, general schematic diagram of an exemplary computing environment in which embodiments of the present invention may be implemented.

Various embodiments are now described with reference to the drawings, in which like reference numerals are used throughout to refer to like elements. In this specification, various descriptions are presented to provide an understanding of the present invention. It will be apparent, however, that the embodiments may be practiced without these specific descriptions. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing the embodiments.

The terms "component," "module," "system," and the like, as used herein, refer to a computer-related entity, hardware, firmware, software, a combination of software and hardware, or software in execution. For example, a component may be, but is not limited to, a process running on a processor, a processor, an object, a thread of execution, a program, and/or a computer. For example, both an application running on a computing device and the computing device itself may be components. One or more components may reside within a processor and/or a thread of execution, and a component may be localized on one computer or distributed between two or more computers. Such components may also execute from various computer readable media having various data structures stored thereon. Components may communicate via local and/or remote processes, for example in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system or a distributed system, and/or data transmitted by way of a signal to other systems over a network such as the Internet).

The description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features presented herein.

FIG. 1 is a block diagram of a database server that provides column-wise compression of a database according to an embodiment of the present invention.

The database server 100 according to an embodiment of the present invention may include one or more processors, a main memory 150, and a persistent storage device 190. The one or more processors may include a row number determination module 110, a compression unit generation module 130, and a compression unit size comparison module 150.

The database is organized in the form of a table that includes one or more rows, and each row consists of one or more columns. In an embodiment of the present invention, column-wise compression may include gathering the same column from a plurality of rows to form a compression unit and applying compression column by column.
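As a minimal illustration of the column-wise grouping described above, the following Python sketch gathers the values of each column across a set of rows and compresses each column separately. The function name `group_by_column`, the sample rows, the NUL separator, and the use of `zlib` are assumptions made for illustration only; the description does not prescribe a particular compression algorithm or storage layout.

```python
import zlib
from typing import List, Sequence


def group_by_column(rows: Sequence[Sequence[str]]) -> List[List[str]]:
    """Gather the values of each column across the given rows."""
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of columns")
    return [[row[c] for row in rows] for c in range(width)]


# Hypothetical sample table: three rows, three columns.
rows = [
    ("1", "Kim", "Seoul"),
    ("2", "Lee", "Busan"),
    ("3", "Park", "Seoul"),
]

columns = group_by_column(rows)

# Compress each column on its own (zlib is an illustrative choice).
compressed = [zlib.compress("\x00".join(col).encode()) for col in columns]
```

Values of the same column tend to resemble one another more than values of the same row do, which is one reason compressing column by column can help.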

The memory 150 is configured to store various types of data to support the operation of the server 100. The memory 150 may also temporarily store the compression unit 200, and may store information on the number of rows at which an underflow occurred and information on the number of rows at which an overflow occurred. The memory 150 may be implemented using any type of volatile or non-volatile memory device, such as dynamic random access memory (DRAM), static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), or erasable programmable read-only memory (EPROM), or a combination thereof.

The persistent storage device 190 is configured to store the data of the database. The persistent storage device 190 may include storage media other than the memory 150 of the server 100, such as a hard disk drive (HDD) or a solid state drive (SSD). The memory 150 or the persistent storage device 190 may include a plurality of data blocks. An extent may comprise a set of data blocks.

The row number determination module 110 may determine the number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database. In order to store the database in compressed form, the row number determination module 110 may determine the number of rows to be included in a compression unit, and may thereby determine the rows from which the compression unit generation module 130 generates the compression unit. After the compression unit generation module 130 has generated the first compression unit, the row number determination module 110 may again determine the number of rows for which a compression unit is to be generated, based on the comparison result of the compression unit size comparison module 150.

When the compression unit size comparison module 150 determines that the size of the first compression unit exceeds the size of the extent (that is, determines an overflow), the row number determination module 110 may set the number of rows to be included in the next compression unit (the second number of rows) to be smaller than the first number of rows.

In addition, when the compression unit size comparison module 150 determines that the size of the first compression unit is equal to or less than a predetermined ratio of the extent size (that is, determines an underflow), the row number determination module 110 may set the number of rows to be included in the next compression unit (the second number of rows) to be larger than the first number of rows.

The performance of the database can improve when the whole of one compression unit is contained in a single extent. Accordingly, when a compression unit contains so many rows that it exceeds the size of the extent, the number of rows included in the compression unit can be reduced so that the size of the compression unit falls to or below the size of the extent. Conversely, when a compression unit contains so few rows that a large amount of empty space remains in the extent holding it, the efficiency of the database may also decline, so the number of rows included in the compression unit can be increased so that the compression unit is not too small relative to the size of the extent.

When an overflow occurs because the size of the first compression unit exceeds the size of the extent, the row number determination module 110 may reduce the number of rows to be included in a compression unit so that the compression unit generation module 130 generates a smaller compression unit. In doing so, the row number determination module 110 may determine the second number of rows to be included in the subsequent compression unit in different ways, depending on whether information exists on the number of rows at which an underflow occurred when a compression unit was previously generated.

When no information on the number of rows at which an underflow occurred exists, the row number determination module 110 may determine, as the second number of rows, the result of dividing the first number of rows by a first predetermined number.

More specifically, if the first number of rows included in the first compression unit is denoted $N_1$, the second number of rows from which a new compression unit is to be generated is denoted $N_2$, and the first predetermined number is denoted $x_1$, the two row counts may have the relationship of Equation 1.

$$N_2 = \frac{N_1}{x_1} \qquad \text{(Equation 1)}$$

Here, the first predetermined number may be any positive number. For example, the first predetermined number may be 2. The first predetermined number (x1) may be a number sufficient to prevent the overflow from recurring when the next compression unit is generated.

When information on the number of rows at which the underflow occurred exists, the row number determination module 110 may determine, as the second number of rows, the result of subtracting from the first number of rows the difference between the first number and the number of rows at which the underflow occurred, divided by a second predetermined number.

More specifically, if the first number of rows included in the first compression unit is denoted $N_1$, the second number of rows from which a new compression unit is to be generated is denoted $N_2$, the second predetermined number is denoted $x_2$, and the number of rows at which the underflow occurred is denoted $N_{\text{under}}$, these may have the relationship of Equation 2.

$$N_2 = N_1 - \frac{N_1 - N_{\text{under}}}{x_2} \qquad \text{(Equation 2)}$$

Here, the second predetermined number may be any positive number. For example, the second predetermined number may be 2.
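The two overflow cases described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `next_rows_after_overflow`, the use of `None` to represent "no underflow information exists", the truncation to an integer, and the lower bound of one row are illustrative choices that the description does not specify. The default value 2 for both predetermined numbers follows the examples given in the text.

```python
from typing import Optional


def next_rows_after_overflow(first_count: int,
                             underflow_rows: Optional[int],
                             x1: float = 2,
                             x2: float = 2) -> int:
    """Return a smaller row count to try after an overflow.

    first_count    -- rows in the compression unit that overflowed
    underflow_rows -- row count of an earlier underflow, or None if unknown
    """
    if underflow_rows is None:
        # No underflow information: divide the first number by x1.
        second = first_count / x1
    else:
        # Underflow information exists: step back toward the underflow
        # point by (first_count - underflow_rows) / x2.
        second = first_count - (first_count - underflow_rows) / x2
    # Assumption: row counts are whole numbers and at least one row is kept.
    return max(1, int(second))


halved = next_rows_after_overflow(1000, None)    # 1000 / 2
between = next_rows_after_overflow(1000, 600)    # 1000 - (1000 - 600) / 2
```

With both predetermined numbers equal to 2, the second case lands midway between the row count that underflowed and the one that overflowed, which resembles a bisection step.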

When an underflow occurs because the size of the first compression unit is equal to or less than a predetermined ratio of the size of the extent, the row number determination module 110 may increase the number of rows to be included in a compression unit so that the compression unit generation module 130 generates a larger compression unit. In doing so, the row number determination module 110 may determine the second number of rows to be included in the subsequent compression unit in different ways, depending on whether information exists on the number of rows at which an overflow occurred when a compression unit was previously generated.

When no information on the number of rows at which an overflow occurred exists, the row number determination module 110 may determine, as the second number, the result of multiplying the first number of rows by a third predetermined number.

More specifically, if the first number of rows included in the first compression unit is denoted $N_1$, the second number of rows from which a new compression unit is to be generated is denoted $N_2$, and the third predetermined number is denoted $x_3$, these may have the relationship of Equation 3.

$$N_2 = N_1 \times x_3 \qquad \text{(Equation 3)}$$

Here, the third predetermined number may be any positive number. For example, the third predetermined number may be the maximum size that a compression unit can have (MaxBytesInCU) divided by the size of the first compression unit (BytesInCurrentCU), that is, $x_3 = \text{MaxBytesInCU} / \text{BytesInCurrentCU}$. The third predetermined number (x3) may be a number sufficient to prevent the underflow from recurring when the next compression unit is generated.

When information on the number of rows at which the overflow occurred exists, the row number determination module 110 may determine, as the second number, the result of adding to the first number the difference between the first number of rows and the number of rows at which the overflow occurred, divided by a fourth predetermined number.

More specifically, if the first number of rows included in the first compression unit is denoted $N_1$, the second number of rows from which a new compression unit is to be generated is denoted $N_2$, the fourth predetermined number is denoted $x_4$, and the number of rows at which the overflow occurred is denoted $N_{\text{over}}$, these may have the relationship of Equation 4.

$$N_2 = N_1 + \frac{N_{\text{over}} - N_1}{x_4} \qquad \text{(Equation 4)}$$

Here, the fourth predetermined number may be any positive number. For example, the fourth predetermined number may be 2.
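The two underflow cases described above can be sketched in Python in the same spirit. The function name `next_rows_after_underflow`, the use of `None` for "no overflow information exists", and the truncation to an integer are assumptions for illustration. The third predetermined number is computed as MaxBytesInCU divided by BytesInCurrentCU, following the example in the text, and the fourth defaults to 2. The overflow row count is assumed to be larger than the current row count, so the difference is taken as overflow rows minus current rows.

```python
from typing import Optional


def next_rows_after_underflow(first_count: int,
                              overflow_rows: Optional[int],
                              max_bytes_in_cu: int,
                              bytes_in_current_cu: int,
                              x4: float = 2) -> int:
    """Return a larger row count to try after an underflow.

    first_count         -- rows in the compression unit that underflowed
    overflow_rows       -- row count of an earlier overflow, or None if unknown
    max_bytes_in_cu     -- maximum size a compression unit can have
    bytes_in_current_cu -- size of the compression unit just generated
    """
    if overflow_rows is None:
        # No overflow information: scale by x3 = MaxBytesInCU / BytesInCurrentCU.
        x3 = max_bytes_in_cu / bytes_in_current_cu
        second = first_count * x3
    else:
        # Overflow information exists: step toward the overflow point
        # by (overflow_rows - first_count) / x4.
        second = first_count + (overflow_rows - first_count) / x4
    # Assumption: row counts are whole numbers.
    return int(second)


scaled = next_rows_after_underflow(1000, None, 100_000, 50_000)
between = next_rows_after_underflow(1000, 2000, 100_000, 50_000)
```

Scaling by the size ratio assumes that the compressed size grows roughly in proportion to the number of rows; real data may compress better or worse as more rows are added, which is why a later attempt can still overflow.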

The compression unit generation module 130 may generate a compression unit by compressing the number of rows determined by the row number determination module 110. The compression unit may contain the contents of the columns corresponding to those rows of the database. The compression unit generation module 130 may compress the data using any compression algorithm. The compression unit generation module 130 may record, in the compression unit header 230, information for reading the data contained in the compression unit and information for identifying the data contained in the compression unit.

The compression unit size comparison module 150 may compare the size of the generated compression unit with the size of the extent. The compression unit size comparison module 150 may determine an overflow when the size of the compression unit exceeds the size of the extent. When an overflow occurs, the data contained in one compression unit is written across two or more extents, so the performance of the database in responding to data retrieval queries such as OLAP (online analytical processing) queries, or to queries such as OLTP (online transaction processing) queries, may be degraded. Accordingly, when an overflow occurs, the compression unit size comparison module 150 may cause the row number determination module 110 to determine an appropriate number of rows (the second number) so that the size of the compression unit generated by the compression unit generation module 130 does not exceed the size of the extent.

In addition, the compression unit size comparison module 150 may determine an underflow when the size of the compression unit is equal to or less than a predetermined ratio of the size of the extent. The predetermined ratio may be, for example, 95%. In this example, if the size of the extent is 100 KB and the size of the compression unit is 95 KB or less, the compression unit size comparison module 150 may determine that an underflow has occurred. The extent size, compression unit size, and predetermined ratio given here are only examples, and the present invention is not limited thereto. When an underflow occurs, an extent contains a large amount of empty space, so the storage efficiency of the database may be degraded. Accordingly, when an underflow occurs, the compression unit size comparison module 150 may cause the row number determination module 110 to determine an appropriate number of rows (the second number) so that the size of the compression unit generated by the compression unit generation module 130 is not too small relative to the size of the extent.
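The overflow and underflow tests described above can be sketched as a single comparison function in Python. The function name `classify`, the label strings, and the integer-percent parameter are assumptions for illustration. The 95% ratio and the 100 KB extent follow the example in the text, with 1 KB taken as 1024 bytes. Integer arithmetic is used so that the boundary case (exactly 95 KB) is evaluated without floating-point rounding.

```python
KB = 1024  # assumption: 1 KB = 1024 bytes


def classify(cu_bytes: int, extent_bytes: int, underflow_percent: int = 95) -> str:
    """Compare a compression unit's size with the extent size."""
    if cu_bytes > extent_bytes:
        # The unit would spill into a second extent.
        return "overflow"
    if cu_bytes * 100 <= extent_bytes * underflow_percent:
        # The unit leaves too much of the extent empty.
        return "underflow"
    return "fit"


too_big = classify(101 * KB, 100 * KB)
too_small = classify(95 * KB, 100 * KB)
just_right = classify(97 * KB, 100 * KB)
```

Only a unit whose size falls strictly above the ratio and at or below the extent size is accepted without adjusting the row count.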

According to an embodiment of the present invention, the database server 100 can determine an appropriate number of rows to include in a compression unit. If a compression unit contains too many rows and exceeds the extent, the performance of the database for data retrieval queries may decline. If a compression unit contains too few rows and empty space is left in the extent storing it, the storage space efficiency of the database may decrease. Moreover, the rows included in each compression unit may contain different data, so the appropriate number of rows for a compression unit may not be constant. Accordingly, the database server 100 according to an embodiment of the present invention can determine, each time a compression unit is generated, an appropriate compression unit size that minimizes performance degradation and increases storage space efficiency. In addition, the database server 100 according to an embodiment of the present invention can reset the number of rows to be included in a compression unit so that underflows or overflows do not recur, thereby reducing the computing power required to store the database in compressed form.
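To show how the pieces described above might interact, the following Python sketch repeatedly compresses the leading rows of one column and adjusts the row count until the compression unit fits a single extent. It is a rough sketch under several assumptions, not the claimed implementation: `zlib` stands in for "any compression algorithm", the maximum compression unit size is taken to be the extent size, all predetermined numbers other than the third are fixed at 2 with integer division, and the function name `build_compression_unit`, the attempt limit, and the fallback to the last underflowing unit are illustrative choices.

```python
import zlib
from typing import Optional, Sequence, Tuple


def build_compression_unit(values: Sequence[str], first_count: int,
                           extent_bytes: int, underflow_percent: int = 95,
                           max_attempts: int = 32) -> Tuple[bytes, int]:
    """Compress leading rows of one column into a unit that fits one extent.

    Returns (compressed_bytes, rows_used).
    """
    if not values:
        raise ValueError("no rows to compress")
    underflow_rows: Optional[int] = None      # row count of the last underflow
    overflow_rows: Optional[int] = None       # row count of the last overflow
    best: Optional[Tuple[bytes, int]] = None  # last unit that fit in the extent
    n = max(1, min(first_count, len(values)))

    for _ in range(max_attempts):
        payload = zlib.compress("\n".join(values[:n]).encode())
        size = len(payload)
        if size > extent_bytes:                                # overflow
            overflow_rows = n
            if underflow_rows is None:
                nxt = n // 2                                   # divide by 2
            else:
                nxt = n - (n - underflow_rows) // 2            # step back halfway
        elif size * 100 <= extent_bytes * underflow_percent:   # underflow
            underflow_rows = n
            best = (payload, n)
            if overflow_rows is None:
                nxt = n * extent_bytes // size                 # scale by size ratio
            else:
                nxt = n + (overflow_rows - n) // 2             # step forward halfway
        else:
            return payload, n                                  # fits well enough
        nxt = max(1, min(nxt, len(values)))
        if nxt == n:
            break                                              # cannot adjust further
        n = nxt

    if best is None:
        raise ValueError("no row count produced a unit that fits in one extent")
    return best


# Hypothetical column of 20,000 string values and an 8 KB extent.
values = [f"customer-{i:06d}" for i in range(20000)]
payload, rows_used = build_compression_unit(values, first_count=1000,
                                            extent_bytes=8192)
```

Remembering the last overflow and underflow row counts lets each retry narrow the search range instead of repeating the same mistake, which is the behavior the paragraph above credits with reducing the computing power spent on compression.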

FIG. 2 is an exemplary diagram illustrating a compression unit of a database according to an embodiment of the present invention.

The compression unit 200 may be stored in an extent that includes a plurality of data blocks. The compression unit 200 may comprise a plurality of data blocks and may include a header 210 of each data block, a compression unit header 230, and data 250. An extent is a set of data blocks and may be part of the persistent storage 190 or the memory 150.

The data block header 210 may include information for identifying the data recorded in the corresponding data block, information on the type of the data, and the like.

The compression unit header 230 may include information for decompressing and reading the data contained in the compression unit 200, information on the compression algorithm, information on the lengths of the columns included in the compression unit, information for identifying the data included in the compression unit, and the like.
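The layout described for FIG. 2 can be pictured as a small set of record types. The following is only an illustrative sketch: the class names and every field name are assumptions chosen to mirror the description, not structures defined in the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataBlockHeader:
    """Stands in for data block header 210 (field names are hypothetical)."""
    block_id: int    # identifies the data recorded in the block
    data_type: str   # type of the data held in the block


@dataclass
class CompressionUnitHeader:
    """Stands in for compression unit header 230 (field names are hypothetical)."""
    algorithm: str             # which compression algorithm was used
    column_lengths: List[int]  # lengths of the columns in the unit
    unit_id: int               # identifies the data in the unit


@dataclass
class CompressionUnit:
    """Stands in for compression unit 200, stored across the blocks of an extent."""
    block_headers: List[DataBlockHeader]
    header: CompressionUnitHeader
    data: bytes                # compressed payload, corresponding to data 250

    def size(self) -> int:
        # Only the payload is counted here; real headers would add overhead.
        return len(self.data)
```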

The data 250 may be the data of particular columns and rows of a data table. A single compression unit 200 may also contain the data of all columns of the determined rows, in order to improve response performance for queries such as OLTP queries. For example, in a database with 5 columns and 100 rows, if the number of rows included in a compression unit is 5, the compression unit may contain the data from row 1, column 1 (1,1) through row 5, column 5 (5,5).
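The example above can be checked with a short sketch that enumerates the (row, column) cells covered by one compression unit. The function name `unit_cells` and the 1-based indexing are assumptions made to match the notation of the example; no particular on-disk ordering is implied.

```python
from typing import List, Tuple


def unit_cells(first_row: int, row_count: int, column_count: int) -> List[Tuple[int, int]]:
    """List the (row, column) cells that one compression unit covers.

    Rows and columns are numbered from 1, as in the example in the text.
    """
    return [
        (row, col)
        for row in range(first_row, first_row + row_count)
        for col in range(1, column_count + 1)
    ]


# Table with 5 columns; the first compression unit holds 5 rows.
cells = unit_cells(first_row=1, row_count=5, column_count=5)
```

Here `cells` starts at (1, 1), ends at (5, 5), and contains 25 entries, one per cell of the five selected rows.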

FIG. 3 is a flowchart of a method of column-by-column compression of a database according to an embodiment of the present invention.

The method of column-by-column compression of a database may be performed in a computing device that includes one or more processors and a main memory storing instructions executable by the processors.

The row count determination module 110 may determine a first number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database (310). The database is organized in table form and includes one or more rows, and each row consists of one or more columns. The row count determination module 110 may determine the number of rows subject to compression unit generation so that the compression unit generation module 130 compresses the data contained in the first number of rows among the one or more rows.

The compression unit generation module 130 may compress the determined first number of rows to generate a first compression unit (320). The compression unit generation module 130 may compress the first number of rows using any compression algorithm.
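Step 320 can be sketched as follows. Because the text allows any compression algorithm, this sketch simply assumes zlib from the Python standard library; the function name `build_compression_unit`, the string-valued cells, and the separator bytes are likewise assumptions made for illustration.

```python
import zlib
from typing import List, Sequence

FIELD_SEP = "\x1f"    # separates values within one column (assumed encoding)
COLUMN_SEP = b"\x1e"  # terminates each column (assumed encoding)


def build_compression_unit(rows: List[Sequence[str]], row_count: int) -> bytes:
    """Compress the first `row_count` rows, laid out column by column."""
    selected = rows[:row_count]
    # Transpose so that values of the same column sit next to each other,
    # which tends to help general-purpose compressors.
    columns = list(zip(*selected))
    payload = b"".join(
        FIELD_SEP.join(column).encode("utf-8") + COLUMN_SEP for column in columns
    )
    return zlib.compress(payload)


rows = [("a", "x"), ("b", "y"), ("c", "z")]
unit = build_compression_unit(rows, row_count=2)
```

The length of the returned bytes is the compression unit size that would then be compared against the extent size in step 330.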

The compression unit size comparison module 150 may compare the size of the generated first compression unit with the size of an extent (330). The compression unit size comparison module 150 may determine that an overflow has occurred when the size of the first compression unit exceeds the size of the extent. The compression unit size comparison module 150 may also determine that an underflow has occurred when the size of the first compression unit is less than or equal to a preset ratio of the extent size.

The row count determination module 110 may determine, based on the comparison result, a second number of rows for which a compression unit is to be generated, for the one or more columns (340). When the comparison by the compression unit size comparison module 150 indicates an overflow, the row count determination module 110 may set the number of rows to be included in the next compression unit (the second number of rows) to be smaller than the first number of rows. When the comparison indicates an underflow, the row count determination module 110 may set the number of rows to be included in the next compression unit (the second number of rows) to be larger than the first number of rows. In doing so, the row count determination module 110 may determine the second number in different ways depending on whether information exists on the number of rows at which an underflow previously occurred or on the number of rows at which an overflow previously occurred.
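The four ways of choosing the second number can be sketched from the arithmetic recited in the dependent claims (dividing the first number on overflow, scaling it by the ratio of maximum unit size to actual unit size on underflow, or moving part of the way toward a previously recorded underflow or overflow row count). The function name, parameter names, use of integer division, the floor of one row, and the default divisors of 2 are all assumptions; the claims only call these values "predetermined numbers".

```python
from typing import Optional


def next_row_count(
    first: int,
    outcome: str,                          # "overflow", "underflow", or "ok"
    unit_size: int,                        # size of the first compression unit
    max_unit_size: int,                    # largest size a unit may have
    underflow_rows: Optional[int] = None,  # row count that previously underflowed
    overflow_rows: Optional[int] = None,   # row count that previously overflowed
    k1: int = 2,                           # first predetermined number (assumed)
    k2: int = 2,                           # second predetermined number (assumed)
    k4: int = 2,                           # fourth predetermined number (assumed)
) -> int:
    """Pick the number of rows for the next compression unit."""
    if outcome == "overflow":
        if underflow_rows is None:
            second = first // k1
        else:
            # Step back toward the row count known to be too small.
            second = first - (first - underflow_rows) // k2
    elif outcome == "underflow":
        if overflow_rows is None:
            # Third predetermined number = max_unit_size / unit_size.
            second = first * max_unit_size // unit_size
        else:
            # Step forward toward the row count known to be too large.
            second = first + (overflow_rows - first) // k4
    else:
        second = first
    return max(1, second)
```

For instance, with a first number of 1000 rows, an overflow with no recorded underflow would yield 500 rows under these assumed divisors, while an overflow with a recorded underflow at 600 rows would yield 800 rows. When both an underflow and an overflow row count are known, the result stays between them, which resembles a bisection search.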

FIG. 4 is a brief, general schematic diagram of an exemplary computing environment in which embodiments of the present invention may be implemented.

Although the present invention has been described above generally in the context of computer-executable instructions that can run on one or more computers, those skilled in the art will recognize that the present invention can also be implemented in combination with other program modules and/or as a combination of hardware and software.

Generally, program modules include routines, programs, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Those skilled in the art will also appreciate that the methods of the present invention can be practiced with other computer system configurations, including single-processor or multiprocessor computer systems, minicomputers, and mainframe computers, as well as personal computers, handheld computing devices, microprocessor-based or programmable consumer electronics, and the like, each of which can operate in connection with one or more associated devices.

The described embodiments of the present invention may also be practiced in distributed computing environments where certain tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

A computer typically includes a variety of computer-readable media. Any medium accessible by a computer can be a computer-readable medium, and such media include volatile and non-volatile media, transitory and non-transitory media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may include computer storage media. Computer storage media include volatile and non-volatile, transitory and non-transitory, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital video disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be accessed by a computer and used to store the desired information.

(Deleted)

An exemplary environment 1100 for implementing various aspects of the present invention, including a computer 1102, is shown. The computer 1102 includes a processing unit 1104, a system memory 1106, and a system bus 1108. The system bus 1108 couples system components including, but not limited to, the system memory 1106 to the processing unit 1104. The processing unit 1104 may be any of a variety of commercially available processors. Dual-processor and other multiprocessor architectures may also be employed as the processing unit 1104.

The system bus 1108 may be any of several types of bus structure that may further interconnect to a memory bus, a peripheral bus, and a local bus using any of a variety of commercially available bus architectures. The system memory 1106 includes read-only memory (ROM) 1110 and random access memory (RAM) 1112. A basic input/output system (BIOS) is stored in non-volatile memory 1110 such as ROM, EPROM, or EEPROM; the BIOS contains the basic routines that help transfer information between elements within the computer 1102, such as during startup. The RAM 1112 may also include high-speed RAM such as static RAM for caching data.

The computer 1102 also includes an internal hard disk drive (HDD) 1114 (e.g., EIDE, SATA), which may also be configured for external use in a suitable chassis (not shown); a magnetic floppy disk drive (FDD) 1116 (e.g., to read from or write to a removable diskette 1118); and an optical disk drive 1120 (e.g., to read a CD-ROM disk 1122, or to read from or write to other high-capacity optical media such as DVD). The hard disk drive 1114, magnetic disk drive 1116, and optical disk drive 1120 may be connected to the system bus 1108 by a hard disk drive interface 1124, a magnetic disk drive interface 1126, and an optical drive interface 1128, respectively. The interface 1124 for external drive implementations includes at least one or both of Universal Serial Bus (USB) and IEEE 1394 interface technologies.

These drives and their associated computer-readable media provide non-volatile storage of data, data structures, computer-executable instructions, and so forth. For the computer 1102, the drives and media accommodate the storage of any data in a suitable digital format. Although the description of computer-readable media above refers to an HDD, a removable magnetic diskette, and removable optical media such as a CD or DVD, those skilled in the art will appreciate that other types of computer-readable media, such as zip drives, magnetic cassettes, flash memory cards, cartridges, and the like, may also be used in the exemplary operating environment, and that any such media may contain computer-executable instructions for performing the methods of the present invention.

A number of program modules may be stored in the drives and the RAM 1112, including an operating system 1130, one or more application programs 1132, other program modules 1134, and program data 1136. All or portions of the operating system, applications, modules, and/or data may also be cached in the RAM 1112. Those skilled in the art will appreciate that the present invention can be implemented with various commercially available operating systems or combinations of operating systems.

A user can enter commands and information into the computer 1102 through one or more wired/wireless input devices, for example, a keyboard 1138 and a pointing device such as a mouse 1140. Other input devices (not shown) may include a microphone, an IR remote control, a joystick, a game pad, a stylus pen, a touch screen, and the like. These and other input devices are often connected to the processing unit 1104 through an input device interface 1142 that is coupled to the system bus 1108, but may be connected by other interfaces such as a parallel port, an IEEE 1394 serial port, a game port, a USB port, or an IR interface.

A monitor 1144 or other type of display device is also connected to the system bus 1108 via an interface such as a video adapter 1146. In addition to the monitor 1144, a computer typically includes other peripheral output devices (not shown) such as speakers and printers.

The computer 1102 may operate in a networked environment using logical connections, via wired and/or wireless communications, to one or more remote computers such as remote computer(s) 1148. The remote computer(s) 1148 can be a workstation, a server computer, a router, a personal computer, a portable computer, a microprocessor-based entertainment appliance, a peer device, or another common network node, and typically includes many or all of the elements described for the computer 1102, although, for brevity, only a memory storage device 1150 is illustrated. The logical connections depicted include wired/wireless connectivity to a local area network (LAN) 1152 and/or larger networks, for example, a wide area network (WAN) 1154. Such LAN and WAN networking environments are commonplace in offices and companies and facilitate enterprise-wide computer networks such as intranets, all of which may connect to a global computer network, for example, the Internet.

When used in a LAN networking environment, the computer 1102 is connected to the local network 1152 through a wired and/or wireless communication network interface or adapter 1156. The adapter 1156 may facilitate wired or wireless communication to the LAN 1152, which also includes a wireless access point installed thereon for communicating with the wireless adapter 1156. When used in a WAN networking environment, the computer 1102 can include a modem 1158, be connected to a communications server on the WAN 1154, or have other means for establishing communications over the WAN 1154, such as by way of the Internet. The modem 1158, which can be internal or external and a wired or wireless device, is connected to the system bus 1108 via the serial port interface 1142. In a networked environment, program modules described for the computer 1102, or portions thereof, can be stored in the remote memory/storage device 1150. It will be appreciated that the network connections shown are exemplary and that other means of establishing a communications link between the computers can be used.

The computer 1102 is operable to communicate with any wireless device or entity operatively disposed in wireless communication, for example, a printer, a scanner, a desktop and/or portable computer, a portable data assistant (PDA), a communications satellite, any piece of equipment or location associated with a wirelessly detectable tag, and a telephone. This includes at least Wi-Fi and Bluetooth wireless technologies. Thus, the communication may be a predefined structure, as with a conventional network, or simply an ad hoc communication between at least two devices.

Wi-Fi (Wireless Fidelity) allows connection to the Internet and the like without wires. Wi-Fi is a wireless technology, similar to that used in a cell phone, that enables such devices, for example, computers, to send and receive data indoors and outdoors, anywhere within the range of a base station. Wi-Fi networks use radio technologies called IEEE 802.11 (a, b, g, etc.) to provide secure, reliable, and fast wireless connectivity. Wi-Fi can be used to connect computers to each other, to the Internet, and to wired networks (which use IEEE 802.3 or Ethernet). Wi-Fi networks operate in the unlicensed 2.4 and 5 GHz radio bands, for example, at an 11 Mbps (802.11b) or 54 Mbps (802.11a) data rate, or in products that contain both bands (dual band).

Those of ordinary skill in the art will understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced in the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.

Those of ordinary skill in the art will appreciate that the various illustrative logical blocks, modules, processors, means, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, as various forms of program or design code (referred to herein, for convenience, as "software"), or as a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Those of ordinary skill in the art may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as departing from the scope of the present invention.

The various embodiments presented herein may be implemented as a method, an apparatus, or an article of manufacture using standard programming and/or engineering techniques. The term "article of manufacture" includes a computer program, carrier, or media accessible from any computer-readable device. For example, computer-readable media include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips), optical disks (e.g., CD, DVD), smart cards, and flash memory devices (e.g., EEPROM, cards, sticks, key drives). In addition, the various storage media presented herein include one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" includes, but is not limited to, wireless channels and various other media capable of storing, holding, and/or carrying instruction(s) and/or data.

It is to be understood that the specific order or hierarchy of steps in the processes presented is one example of exemplary approaches. Based on design priorities, the specific order or hierarchy of steps in the processes may be rearranged within the scope of the present invention. The accompanying method claims present the elements of the various steps in a sample order and are not meant to be limited to the specific order or hierarchy presented.

The description of the presented embodiments is provided to enable any person of ordinary skill in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those of ordinary skill in the art, and the generic principles defined herein may be applied to other embodiments without departing from the scope of the present invention. Thus, the present invention is not limited to the embodiments presented herein but is to be accorded the widest scope consistent with the principles and novel features presented herein.

Claims

1. A method of column-by-column compression of a database, performed in a computing device comprising one or more processors and a main memory storing instructions executable by the processors, the method comprising:
determining a first number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database;
compressing the determined first number of rows to generate a first compression unit;
comparing a size of the first compression unit with a size of an extent;
determining that an overflow has occurred when the size of the first compression unit exceeds the size of the extent;
determining, when it is determined that an overflow has occurred, whether information on a number of rows at which an underflow occurred exists; and
determining, based on a result of the comparison, a second number of rows for which a compression unit is to be generated, for the one or more columns.

2. The method according to claim 1,
wherein the extent comprises a set of data blocks.

3. The method according to claim 1,
wherein, when no information on the number of rows at which an underflow occurred exists,
determining the second number of rows for which a compression unit is to be generated, for the one or more columns, based on the result of the comparison comprises:
determining, as the second number, a result of dividing the first number by a first predetermined number.

4. The method according to claim 1,
wherein, when information on the number of rows at which an underflow occurred exists,
determining the second number of rows for which a compression unit is to be generated, for the one or more columns, based on the result of the comparison comprises:
determining, as the second number, a result of subtracting from the first number a value obtained by dividing the difference between the first number and the number of rows at which the underflow occurred by a second predetermined number.

5. A method of column-by-column compression of a database, performed in a computing device comprising one or more processors and a main memory storing instructions executable by the processors, the method comprising:
determining a first number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database;
compressing the determined first number of rows to generate a first compression unit;
comparing a size of the first compression unit with a size of an extent;
determining that an underflow has occurred when the size of the first compression unit is less than or equal to a predetermined ratio of the size of the extent;
determining, when it is determined that an underflow has occurred, whether information on a number of rows at which an overflow occurred exists; and
determining, based on a result of the comparison, a second number of rows for which a compression unit is to be generated, for the one or more columns.

6. The method according to claim 5,
wherein, when no information on the number of rows at which an overflow occurred exists,
determining the second number of rows for which a compression unit is to be generated, for the one or more columns, based on the result of the comparison comprises:
determining, as the second number, a result of multiplying the first number by a third predetermined number.

7. The method according to claim 6,
wherein the third predetermined number is the maximum size that a compression unit can have divided by the size of the first compression unit.

8. The method according to claim 5,
wherein, when information on the number of rows at which an overflow occurred exists,
determining the second number of rows for which a compression unit is to be generated, for the one or more columns, based on the result of the comparison comprises:
determining, as the second number, a result of adding to the first number a value obtained by dividing the difference between the first number and the number of rows at which the overflow occurred by a fourth predetermined number.

9. A database server that provides column-by-column compression of a database, the database server comprising:
one or more processors; and
a memory storing instructions executable on the one or more processors,
wherein the one or more processors comprise:
a row count determination module that determines a first number of rows for which a compression unit is to be generated, for one or more columns located in a table of the database, and determines a second number of rows for which a compression unit is to be generated based at least in part on information on a number of rows at which an underflow occurred;
a compression unit generation module that compresses the determined first number of rows to generate a first compression unit; and
a compression unit size comparison module that compares a size of the first compression unit with a size of an extent and determines that an overflow has occurred when the size of the first compression unit exceeds the size of the extent.

As a database server that provides column-by-column compression of a database,
The database server comprising:
One or more processors; and
A memory for storing instructions executable on the one or more processors,
Wherein the one or more processors comprise:
A row number determination module that determines a first number of rows for which a compression unit is to be generated for one or more columns located in a table of the database and, when determining a second number of rows for which a compression unit is to be generated, determines the second number based at least in part on information related to the number of rows at which an overflow occurred;
A compression unit generation module that compresses the determined first number of rows to generate a first compression unit; and
A compression unit size comparison module that compares the size of the first compression unit with the size of an extent and determines that an underflow has occurred when the size of the first compression unit is equal to or less than a predetermined ratio of the size of the extent;
A database server that provides column-by-column compression of the database.
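
The two size checks recited for the compression unit size comparison module in the server claims above can be sketched as a single function. This is only an illustration: the names `SizeCheck` and `compare_with_extent`, the `FIT` label for the case where neither condition holds, and the restriction of the ratio to the range 0 to 1 are assumptions, not terms from the claims.

```python
from enum import Enum


class SizeCheck(Enum):
    OVERFLOW = "overflow"    # compression unit is larger than the extent
    UNDERFLOW = "underflow"  # compression unit is at or below the ratio threshold
    FIT = "fit"              # hypothetical label: neither condition holds


def compare_with_extent(unit_size: int, extent_size: int,
                        underflow_ratio: float) -> SizeCheck:
    """Classify a compression unit size against the extent size."""
    if extent_size <= 0:
        raise ValueError("extent_size must be positive")
    if not 0.0 <= underflow_ratio <= 1.0:
        raise ValueError("underflow_ratio is assumed to lie in [0, 1]")
    if unit_size > extent_size:
        return SizeCheck.OVERFLOW
    if unit_size <= extent_size * underflow_ratio:
        return SizeCheck.UNDERFLOW
    return SizeCheck.FIT


if __name__ == "__main__":
    extent = 1_048_576  # example extent size in bytes (assumed value)
    for size in (1_100_000, 800_000, 400_000):
        print(size, compare_with_extent(size, extent, 0.5).value)
```

A unit whose size equals the extent size is not an overflow under this reading, because the claim requires the size to exceed the extent.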

23. A computer program stored in a computer readable medium, comprising a plurality of instructions executed by one or more processors,
The computer program comprising:
Instructions for determining a first number of rows to generate a compression unit for one or more columns located in a table of a database;
Instructions for compressing the determined first number of rows to generate a first compression unit;
Instructions for comparing the size of the first compression unit with the size of an extent;
Instructions for determining that an overflow has occurred if, as a result of the comparison, the size of the first compression unit exceeds the size of the extent;
Instructions for determining, when it is determined that an overflow has occurred, whether information related to the number of rows at which an underflow occurred exists; and
Instructions for determining, based on the comparison result, a second number of rows for which a compression unit is to be generated for the one or more columns;
A computer program stored on a computer readable medium.

A computer program stored in a computer readable storage medium, the computer program comprising a plurality of instructions executed by one or more processors,
The computer program comprising:
Instructions for determining a first number of rows to generate a compression unit for one or more columns located in a table of a database;
Instructions for compressing the determined first number of rows to generate a first compression unit;
Instructions for comparing the size of the first compression unit with the size of an extent;
Instructions for determining that an underflow has occurred if the size of the first compression unit is equal to or less than a predetermined ratio of the size of the extent;
Instructions for determining, when it is determined that an underflow has occurred, whether information related to the number of rows at which an overflow occurred exists; and
Instructions for determining, based on the comparison result, a second number of rows for which a compression unit is to be generated for the one or more columns;
A computer program stored on a computer readable storage medium.
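
The sequence of instructions in the two computer program claims above (determine a first row count, compress, compare with the extent, check for previously recorded overflow or underflow information, determine a second row count) can be sketched end to end as follows. This is a rough illustration under several assumptions: `zlib` stands in for whatever compressor the database uses, all function names are hypothetical, and the adjustment rules (halving or doubling when no earlier information exists, stepping by a fraction of the gap when it does) are illustrative choices that the claims do not specify.

```python
import hashlib
import zlib
from typing import List, Optional, Tuple


def compress_rows(rows: List[bytes]) -> bytes:
    """Stand-in compressor (assumption): newline-join the values, apply zlib."""
    return zlib.compress(b"\n".join(rows))


def sample_column(n: int) -> List[bytes]:
    """Deterministic demo values (hypothetical data, not from the patent)."""
    return [hashlib.sha256(str(i).encode()).hexdigest()[:16].encode()
            for i in range(n)]


def settle_row_count(column: List[bytes], extent_size: int,
                     underflow_ratio: float, first_rows: int,
                     divisor: int = 2, max_attempts: int = 32) -> Tuple[int, int]:
    """Return (row_count, compressed_size) after adjusting the row count."""
    if not column or divisor <= 0:
        raise ValueError("column must be non-empty and divisor positive")
    overflow_rows: Optional[int] = None   # last row count that overflowed
    underflow_rows: Optional[int] = None  # last row count that underflowed
    rows = max(1, min(first_rows, len(column)))
    for _ in range(max_attempts):
        size = len(compress_rows(column[:rows]))
        if size > extent_size:                       # overflow
            overflow_rows = rows
            if underflow_rows is not None:           # underflow info exists
                candidate = rows - max(1, (rows - underflow_rows) // divisor)
            else:                                    # assumption: halve
                candidate = rows // 2
        elif size <= extent_size * underflow_ratio:  # underflow
            underflow_rows = rows
            if overflow_rows is not None:            # overflow info exists
                candidate = rows + (overflow_rows - rows) // divisor
            else:                                    # assumption: double
                candidate = rows * 2
        else:
            break                                    # neither condition holds
        candidate = max(1, min(candidate, len(column)))
        if candidate == rows:
            break                                    # no further adjustment possible
        rows = candidate
    return rows, len(compress_rows(column[:rows]))


if __name__ == "__main__":
    column = sample_column(5000)
    print(settle_row_count(column, extent_size=8192,
                           underflow_ratio=0.5, first_rows=100))
```

Recording the most recent overflow and underflow row counts lets later adjustments stay between the two, which is why each branch first checks whether the opposite kind of information already exists.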