KR102177792B1

KR102177792B1 - A system that displays large amounts of data on a chart without memory capacity limitation by using a binary file storage structure per column

Info

Publication number: KR102177792B1
Application number: KR1020190022401A
Authority: KR
Inventors: 곽기영; 권오주
Original assignee: 주식회사 퍼즐시스템즈
Priority date: 2019-02-26
Filing date: 2019-02-26
Publication date: 2020-11-11
Also published as: KR20200104035A

Abstract

The present invention relates to a system that displays large amounts of data on a chart without memory capacity limitation by using a binary file storage structure per column. Data retrieved from a database is written as column-unit binary files and stored on disk rather than in memory. When the data stored on disk is read, the entire data set is not loaded into memory at once; it is read record by record and displayed on the chart, so no memory overflow error occurs no matter how large the data retrieved from the DBMS (Database Management System) is.
According to the present invention, the record index information corresponding to each coordinate on a chart drawn from a large amount of data retrieved from a DBMS is written as column-unit binary files and stored on disk. When record information is requested for the coordinates of a specific position on the chart, a hash function structure is applied that uses the position coordinates on the chart as the key and the record index information as the value, so the record index information for those coordinates can be found quickly. In addition, after the record index information is obtained, the information corresponding to each record index is looked up in and loaded from the column-unit binary files stored on disk, so no memory overflow error occurs and the chart can be displayed without memory capacity limitation.

Description

A SYSTEM THAT DISPLAYS LARGE AMOUNTS OF DATA ON A CHART WITHOUT MEMORY CAPACITY LIMITATION BY USING A BINARY FILE STORAGE STRUCTURE PER COLUMN

The present invention relates to a system that displays large amounts of data on a chart without memory capacity limitation, using a binary file storage structure per column to process, without being constrained by memory capacity, the large amounts of data received as query results in environments such as big data and cloud computing.

Conventionally, when data is queried from a server database, an XML string is converted into a DataSet through an API, stored on a web server, and then transmitted to the user's computer, or a PHP or JSP file is converted into JSON through an API, stored on the web server, and then transmitted to the user's computer (3-tier method). Alternatively, the data queried from the server database is transmitted directly to the user's computer, where the XML string is converted into a DataSet through an API and stored, or a PHP or JSP file is converted into JSON through an API and stored (2-tier method).

When data is queried from a database with these conventional methods, the result is produced as a text stream in XML, JSON, or CSV format, so it becomes larger than the number of bytes needed to represent the original values.

Consequently, when all of the data queried in this way is loaded into memory, it uses more memory than the original size of the actual values.
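The size difference can be illustrated with a small Python sketch. The column name `amount`, the three sample values, and the choice of a 4-byte little-endian integer as the binary format are assumptions made for illustration; the text above does not specify them.

```python
import json
import struct

# Hypothetical sample column of three integer values.
values = [1234567890, 987654321, 1122334455]

# Text encodings carry digits, delimiters, and field names.
as_json = json.dumps({"rows": [{"amount": v} for v in values]}).encode("utf-8")
as_csv = ("amount\n" + "\n".join(str(v) for v in values)).encode("utf-8")

# Fixed-width binary: three 4-byte integers, 12 bytes in total.
as_binary = struct.pack("<3i", *values)

print("json bytes:  ", len(as_json))
print("csv bytes:   ", len(as_csv))
print("binary bytes:", len(as_binary))
```

The text forms grow with the number of digits and the markup around each value, while the binary form stays at the fixed width of the data type.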

As a result, a memory overflow error occurs because of the limit on the size of the memory region that a process can access, which depends on the type of operating system (32-bit or 64-bit).

For reference, the total memory available on a 32-bit operating system is 4 GB, and the total memory available on a 64-bit operating system is 8 GB to 2 TB. A 32-bit application can use 2 to 3 GB of memory, a 64-bit application can use 8 TB, and a single object can use up to 2 GB.

FIG. 1 shows a general data query method that uses an XML DataSet. ① First, the query is executed in the DBMS. ② The middle tier (web server) loads the query result (Result Set) data into memory in the form of an XML DataSet. If the per-process memory capacity of the middle tier (web server) is exceeded during this step, a memory overflow error occurs. ③ The client loads the query result (Result Set) data it received into memory in the form of an XML DataSet. If the per-process memory capacity of the client is exceeded during this step, a memory overflow error occurs. ④ Finally, the XML DataSet in memory is bound to the Chart control and displayed as a chart. During binding, a further memory overflow error may occur depending on the data binding method of the Chart control.

As described above, when a large amount of data is queried in the conventional way in a big data or cloud computing environment, the per-process memory capacity is exceeded and a memory overflow error occurs, and the query itself also takes a considerably long time.

In addition, when a large amount of data is shown as a chart and, after the chart is drawn, the user wants to see detailed information about a specific position on the chart, the records corresponding to that position must be found by searching memory. Managing a large amount of data in memory carries the risk of memory overflow, and sequentially searching memory for the information that has the given position coordinates consumes considerable CPU resources and takes a considerably long time.

Registered Patent 10-1530441 (Column-based data processing method and device)

The present invention was devised to solve the problems described above. Its object is to provide a system that displays large amounts of data on a chart without memory capacity limitation by using a binary file storage structure per column: data retrieved from a database is written as column-unit binary files and stored on disk rather than in memory, and when the data stored on disk is read, the entire data set is not loaded into memory at once but is read record by record and displayed on the chart, so that no memory overflow error occurs no matter how large the data retrieved from the DBMS (Database Management System) is.

A further object is to provide such a system in which the large original data retrieved from the DBMS is not loaded into memory. Instead, the record index information corresponding to each coordinate on the chart is written as column-unit binary files and stored on disk, and when record information is requested for the coordinates of a specific position on the chart, a hash function structure that uses the position coordinates on the chart as the key and the record index information as the value is applied, so that the record index information for those coordinates can be found quickly.

To achieve the above objects, the present invention comprises: a column binary file storage unit that writes the table (Result Set) data received as a query result from a DBMS (Database Management System) as column-unit binary files and stores them on disk; a column binary file reader unit that reads cell values one at a time from each per-column binary file stored on disk by the column binary file storage unit, composes a record (row), and loads it into memory record by record; and a chart display unit that uses the cell values corresponding to the X coordinate and the Y coordinate in the record loaded into memory by the column binary file reader unit to display the record visually, in the form of a pixel, at the position on the chart corresponding to those X and Y coordinates.

According to the present invention configured as above, the following effects can be achieved.

Because the large original data retrieved from the DBMS (Database Management System) is not loaded into memory but is written as column-unit binary files and stored on disk, the risk of memory overflow that arises from managing large amounts of retrieved data in memory is eliminated at its source.

In addition, the record index information corresponding to each coordinate on a chart drawn from the large amount of data retrieved from the DBMS is written as column-unit binary files and stored on disk. When record information is requested for the coordinates of a specific position on the chart, a hash function structure that uses the position coordinates on the chart as the key and the record index information as the value is applied, so the record index information for those coordinates can be found quickly. Furthermore, after the record index information is obtained, the information corresponding to each record index is looked up in and loaded from the column-unit binary files stored on disk, so no memory overflow error occurs and the chart can be displayed without memory capacity limitation.

FIG. 1 shows a general data query method that uses an XML DataSet according to the prior art.
FIG. 2 shows a method according to the present invention in which per-column binary files are stored on disk and then loaded into memory record by record, so that almost no memory is used.
FIG. 3 is a block diagram illustrating the functional units that store a large amount of data retrieved from a database as per-column binary files according to one embodiment of the present invention (first embodiment).
FIG. 4 is a block diagram illustrating the functional units that store a large amount of data retrieved from a database as per-column binary files according to another embodiment of the present invention (second embodiment).
FIG. 5 describes the basic type, which does not reduce capacity, and the first to fourth capacity reduction policies, which do reduce capacity, applied when per-column binary files are generated according to the present invention.
FIG. 6 is a conceptual diagram of the per-column binary file structure that keeps the size of the original format without reducing capacity when a table (Result Set) is written as column-unit binary files and stored on disk according to one embodiment of the present invention.
FIG. 7 is a conceptual diagram of the per-column binary file structure that converts the format of each original column according to a capacity reduction policy suited to the nature of the cell data values in that column when a table (Result Set) is written as column-unit binary files and stored on disk according to another embodiment of the present invention.
FIG. 8 is a conceptual diagram of the structure of the header generated when a column-unit binary file is created according to the one embodiment and the other embodiment of the present invention.
FIG. 9 shows an example comparing the size of a column after conversion with the size of the original column before conversion when a capacity reduction policy is applied according to another embodiment of the present invention.
FIG. 10 is a block diagram showing the functions of the system according to the present invention that displays large amounts of data on a chart without memory capacity limitation by using a binary file storage structure per column.
FIG. 11 is a conceptual diagram of the structure according to the present invention in which the record index information corresponding to each coordinate displayed on the chart is written as column-unit binary files and stored on disk.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings.

However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms.

The embodiments in this specification are provided so that the disclosure of the present invention is complete and so that the scope of the invention is fully conveyed to a person of ordinary skill in the art to which the present invention pertains.

The present invention is defined only by the scope of the claims.

Accordingly, in some embodiments, well-known components, well-known operations, and well-known techniques are not described in detail, in order to avoid obscuring the present invention.

Throughout the specification, the same reference numerals refer to the same components, and the terms used (mentioned) in this specification are for describing the embodiments and are not intended to limit the present invention.

In this specification, the singular form also includes the plural form unless the wording specifically states otherwise, and components and operations referred to as "including (or having)" do not exclude the presence or addition of one or more other components and operations.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) are used with the meaning commonly understood by a person of ordinary skill in the art to which the present invention pertains.

Terms defined in commonly used dictionaries are not interpreted in an idealized or excessive manner unless they are explicitly defined.

Hereinafter, preferred embodiments of the present invention are described with reference to the accompanying drawings.

In the present invention, a record has the same meaning as a row of a table, and a column has the same meaning as a column of a table. One column consists of one or more cells. A Result Set is the result data (table) obtained by executing a query in the DBMS.

When an intermediate server sits between the upper server and the client (3-tier method), the structure according to the present invention for storing a large amount of data retrieved from a database as per-column binary files is installed and operates on the client as well as on the intermediate server. In this case, when the intermediate server receives the query result from the DBMS, it generates per-column binary files and stores them on disk, and the client receives the per-column binary file storage structure from the intermediate server as it is and stores it on disk.

When the configuration consists only of a server and a client (2-tier method), the structure according to the present invention for storing a large amount of data retrieved from a database as per-column binary files is installed and operates on the client. In this case, when the client receives the query result from the DBMS, it generates per-column binary files and stores them on disk.

Referring to FIGS. 10 and 11, the system (300) according to the present invention that displays large amounts of data on a chart without memory capacity limitation by using a binary file storage structure per column includes a column binary file storage unit (110, 210), a column binary file reader unit (120, 220), and a chart display unit (330).

The column binary file storage unit (110) and the column binary file reader unit (120) are described in detail in the first embodiment below, and the column binary file storage unit (210) and the column binary file reader unit (220) are described in detail in the second embodiment below.

The column binary file storage unit (110, 210) writes the table (Result Set) data received as a query result from the DBMS (Database Management System) as column-unit binary files and stores them on disk.

The column binary file reader unit (120, 220) reads cell values one at a time from each per-column binary file stored on disk by the column binary file storage unit (110, 210), composes a record (row), and loads it into memory record by record.
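The storage and reader units described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `SCHEMA`, `write_columns`, and `iter_records`, the `.bin` file extension, and the fixed-width cell formats are all assumptions, and the per-column file header shown in FIG. 8 is omitted.

```python
import os
import struct
import tempfile

# Hypothetical schema: column name -> struct format of one fixed-width cell.
SCHEMA = {"x": "<i", "y": "<i", "value": "<d"}

def write_columns(rows, directory):
    """Append every cell of every row to the binary file of its column."""
    files = {n: open(os.path.join(directory, n + ".bin"), "wb") for n in SCHEMA}
    try:
        for row in rows:  # rows may be a stream; only one row is held at a time
            for name, fmt in SCHEMA.items():
                files[name].write(struct.pack(fmt, row[name]))
    finally:
        for f in files.values():
            f.close()

def iter_records(directory):
    """Yield one record at a time by reading one cell from each column file."""
    files = {n: open(os.path.join(directory, n + ".bin"), "rb") for n in SCHEMA}
    try:
        while True:
            record = {}
            for name, fmt in SCHEMA.items():
                size = struct.calcsize(fmt)
                chunk = files[name].read(size)
                if len(chunk) < size:  # end of the column files
                    return
                record[name] = struct.unpack(fmt, chunk)[0]
            yield record
    finally:
        for f in files.values():
            f.close()

# Small round trip with made-up sample rows.
with tempfile.TemporaryDirectory() as tmp:
    rows = [{"x": 75, "y": 50, "value": 1.5}, {"x": 76, "y": 50, "value": 2.25}]
    write_columns(rows, tmp)
    loaded = list(iter_records(tmp))
```

Because `iter_records` is a generator, a caller that plots each record as it arrives only ever holds a single record in memory, whatever the size of the column files.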

The chart display unit (330) uses the cell values corresponding to the X coordinate and the Y coordinate in the record loaded into memory by the column binary file reader unit (120, 220) to display the record visually, in the form of a pixel, at the position on the chart corresponding to those X and Y coordinates. That is, among the columns of the record, the values that serve as the X coordinate and the Y coordinate determine the position on the chart at which the record is drawn as a pixel.

The chart display unit (330) reads all of the queried records, or all records within a range specified by the user, and displays them on the chart. One pixel displayed on the chart represents one record.

Specifically, the chart display unit (330) includes a two-dimensional array index number management unit (331), an array structure conversion unit (332), a record index creation unit (333), and a record index reader unit (334).

For example, suppose the column binary file reader unit (120, 220) reads cell values one at a time from the per-column binary files stored on disk, composes records, and loads them into memory record by record, and the chart display unit (330) displays 3 million records on the chart in sequence. The records are displayed at various positions on the two-dimensional coordinates, and a plurality of records may be displayed overlapping at the same X and Y coordinates.

FIG. 11 shows an example in which a total of 14 records are displayed overlapping at coordinates (75, 50). Conventionally, to identify the records plotted at coordinates (75, 50), the table had to be searched sequentially for values with an X coordinate of 75 and a Y coordinate of 50. This took a considerably long time, and because a large amount of data had to be managed in memory for the search, the risk of memory overflow was always present.

Referring to FIG. 11, the two-dimensional array index number management unit (331) assigns record index (row index) numbers in the order in which records are visually displayed as pixels at the positions on the chart corresponding to their X and Y coordinates, and stores one or more record index numbers for each pixel position of the two-dimensional array that corresponds to the X and Y coordinates.

In the present invention, when the column binary file reader unit (120, 220) reads from disk column by column, composes records, and loads them into memory record by record, the two-dimensional array index number management unit (331) assigns record index numbers in the order in which the records are visually displayed as pixels and manages the one or more record index numbers that belong to each pixel position.
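A rough sketch of this bookkeeping, assuming the X and Y values of each record have already been scaled to integer pixel coordinates. The function name `build_pixel_index` and the sample points are hypothetical.

```python
def build_pixel_index(points, width, height):
    """Map each pixel position to the record index numbers plotted there.

    `points` yields (x, y) pixel coordinates in plotting order; the record
    index number is simply the position of the record in that order.
    """
    grid = [[[] for _ in range(width)] for _ in range(height)]
    for record_index, (x, y) in enumerate(points):
        if 0 <= x < width and 0 <= y < height:
            grid[y][x].append(record_index)
    return grid

# Made-up sample: records 0 and 2 overlap on the same pixel.
grid = build_pixel_index([(75, 50), (76, 50), (75, 50)], width=100, height=100)
print(grid[50][75])  # [0, 2]
print(grid[50][76])  # [1]
```

Only small integer index numbers are kept per pixel; the record contents themselves stay in the column files on disk.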

The array structure conversion unit (332) converts the two-dimensional array into a one-dimensional array, converts the X and Y coordinates of the two-dimensional array into an index number of the one-dimensional array, and stores one or more record index numbers for each index number of the one-dimensional array.

In FIG. 11, the coordinates (75, 50) of the two-dimensional array are converted to the one-dimensional array and assigned a new one-dimensional index number. The record indexes of the 14 records are stored at the index of the one-dimensional array that corresponds to coordinates (75, 50).
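The text does not state the conversion formula. One common choice, assumed here purely for illustration, is row-major order, where the one-dimensional index is `y * width + x`. The function names and the chart width of 100 pixels are hypothetical.

```python
def to_flat_index(x, y, width):
    """Row-major conversion of a 2D pixel position to a 1D index (assumed)."""
    return y * width + x

def from_flat_index(index, width):
    """Inverse conversion back to (x, y)."""
    y, x = divmod(index, width)
    return x, y

flat = to_flat_index(75, 50, width=100)
print(flat)                              # 5075
print(from_flat_index(flat, width=100))  # (75, 50)
```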

The record index creation unit (333) creates two files. For the record index indicator file ([imifilename_hash].imi.index), it places, for each index number of the one-dimensional array, the address value indicating the start position in the record index data file ([imifilename_hash].imi.data) of the corresponding record index numbers together with the count of those record index numbers into one cell of uniform size, arranges the cells sequentially, converts all cell data to binary values in order, and stores the resulting file on disk. For the record index data file ([imifilename_hash].imi.data), it converts the record index numbers corresponding to each index number of the one-dimensional array sequentially into binary values, in units of bytes of the specified size, and stores the resulting file on disk.

In the example of FIG. 11, the record index indicator file ([imifilename_hash].imi.index) has a 4-byte header and the record index data file ([imifilename_hash].imi.data) has a 2-byte header. The 4-byte header of the record index indicator file (Type(2), Size(2)) stores, in 1 byte each, the format of the address value indicating the start position in the record index data file and the format of the count of record index numbers, and then, again in 1 byte each, the size of the address value format and the size of the count format. The 2-byte header of the record index data file (Type(2)) stores the format of the record index numbers.
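The two files could be written along the following lines. This is a sketch under stated assumptions, not the layout used by the patentee: the type code `TYPE_UINT32`, the use of 4-byte unsigned integers for addresses, counts, and record index numbers, the little-endian byte order, the file names, and the sample index numbers are all hypothetical. Only the header lengths (4 bytes and 2 bytes) and the cell contents (start address plus count) come from the description above.

```python
import os
import struct
import tempfile

CELL = struct.Struct("<II")   # one indicator cell: (start address, count), assumed widths
ENTRY = struct.Struct("<I")   # one record index number, assumed width
TYPE_UINT32 = 1               # hypothetical type code

def write_record_index(flat_lists, index_path, data_path):
    """flat_lists[i] holds the record index numbers of 1D array index i."""
    with open(index_path, "wb") as idx, open(data_path, "wb") as dat:
        # 4-byte header: two 1-byte type codes, then two 1-byte sizes.
        idx.write(struct.pack("<BBBB", TYPE_UINT32, TYPE_UINT32, 4, 4))
        # 2-byte header: type of the record index numbers.
        dat.write(struct.pack("<H", TYPE_UINT32))
        for record_indices in flat_lists:
            # The address is the byte offset in the data file where this run starts.
            idx.write(CELL.pack(dat.tell(), len(record_indices)))
            for number in record_indices:
                dat.write(ENTRY.pack(number))

with tempfile.TemporaryDirectory() as tmp:
    index_path = os.path.join(tmp, "demo.imi.index")
    data_path = os.path.join(tmp, "demo.imi.data")
    # Four neighbouring pixels holding 14, 10, 5 and 11 records (made-up numbers).
    lists = [list(range(0, 14)), list(range(14, 24)),
             list(range(24, 29)), list(range(29, 40))]
    write_record_index(lists, index_path, data_path)
    with open(index_path, "rb") as f:
        f.seek(4)  # skip the header
        first_cell = CELL.unpack(f.read(CELL.size))
        second_cell = CELL.unpack(f.read(CELL.size))
    index_size = os.path.getsize(index_path)
    data_size = os.path.getsize(data_path)

print(first_cell)  # (2, 14)
```

With a 2-byte data file header, the first run of record index numbers starts at byte offset 2, which agrees with the address value 2 given for the first cell in FIG. 11.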

When a specific pixel on the chart is selected, the record index reader unit (334) uses the index number of the one-dimensional array that corresponds to the X and Y coordinates of that pixel to obtain, from the record index indicator file ([imifilename_hash].imi.index) stored on disk, the address value indicating the start position of the record index numbers in the record index data file ([imifilename_hash].imi.data) and the count of record index numbers, and then fetches that number of record index numbers from the record index data file ([imifilename_hash].imi.data).

In FIG. 11, when the user clicks coordinates (75, 50) to request the records, the position of the corresponding cell in the record index indicator file ([imifilename_hash].imi.index) can be found from the index number of the one-dimensional array that corresponds to coordinates (75, 50). In the record index indicator file of FIG. 11, the address value 2, which indicates the start position in the record index data file, and the count 14 of record index numbers are obtained from the first cell, and then 14 record index numbers are fetched from the record index data file ([imifilename_hash].imi.data).
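The lookup described above amounts to two seeks, one per file. The sketch below assumes the same hypothetical layout as before (4-byte indicator header, cells of two 4-byte unsigned integers, 4-byte record index numbers, little-endian); the function name, file names, and sample numbers are made up for illustration.

```python
import os
import struct
import tempfile

INDEX_HEADER = 4              # indicator file header length in bytes
CELL = struct.Struct("<II")   # (start address, count), assumed widths
ENTRY_SIZE = 4                # assumed size of one record index number

def read_record_indices(flat_index, index_path, data_path):
    """Return the record index numbers stored for one 1D array index."""
    with open(index_path, "rb") as idx:
        # Cells have a uniform size, so the cell position is computed directly.
        idx.seek(INDEX_HEADER + flat_index * CELL.size)
        address, count = CELL.unpack(idx.read(CELL.size))
    with open(data_path, "rb") as dat:
        dat.seek(address)
        raw = dat.read(count * ENTRY_SIZE)
    return list(struct.unpack("<%dI" % count, raw))

with tempfile.TemporaryDirectory() as tmp:
    index_path = os.path.join(tmp, "demo.imi.index")
    data_path = os.path.join(tmp, "demo.imi.data")
    with open(index_path, "wb") as f:
        f.write(bytes([1, 1, 4, 4]))   # header with hypothetical codes
        f.write(CELL.pack(2, 3))       # cell 0: 3 numbers starting at byte 2
        f.write(CELL.pack(14, 2))      # cell 1: 2 numbers starting at byte 14
    with open(data_path, "wb") as f:
        f.write(struct.pack("<H", 1))  # 2-byte header
        f.write(struct.pack("<5I", 10, 11, 12, 20, 21))
    first = read_record_indices(0, index_path, data_path)
    second = read_record_indices(1, index_path, data_path)

print(first)   # [10, 11, 12]
print(second)  # [20, 21]
```

Neither file is loaded in full, and the cost of a lookup does not depend on how many records the chart contains, which is the point of keying the structure on the pixel position.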

FIG. 11 shows an example in which 14 records are plotted at coordinates (75, 50), 10 at (76, 50), 5 at (77, 50), and 11 at (78, 50).
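The lookup described for FIG. 11 can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name, the little-endian byte order, the order of the four header bytes, the reading of address values as byte offsets from the start of the data file, and the 4-byte width of each record index number are all assumptions. Only the idea of a 4-byte indicator header holding two 1-byte formats and two 1-byte sizes comes from the text.

```python
def read_record_indices(index_path, data_path, cell_no, idx_size=4):
    """Return the record index numbers stored for one chart cell.

    cell_no is the one-dimensional array index of the clicked pixel;
    idx_size (bytes per record index number) is an assumed value.
    """
    with open(index_path, "rb") as f:
        # Assumed header order: address type, count type, address size, count size.
        _addr_type, _count_type, addr_size, count_size = f.read(4)
        # Each indicator cell holds (address, count) back to back.
        f.seek(4 + cell_no * (addr_size + count_size))
        addr = int.from_bytes(f.read(addr_size), "little")
        count = int.from_bytes(f.read(count_size), "little")
    with open(data_path, "rb") as f:
        f.seek(addr)  # assumed: byte offset counted from the start of the file
        raw = f.read(count * idx_size)
    return [int.from_bytes(raw[i:i + idx_size], "little")
            for i in range(0, len(raw), idx_size)]
```

Only the indicator cell and the requested run of record index numbers are read, so memory use depends on the number of records at one pixel rather than on the size of the whole index.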

The column binary file reader unit (120, 220) receives the one or more record index numbers fetched by the record index reader unit 334. For each record index number, it reads the corresponding cell value from each column-unit binary file stored on disk, one cell at a time, to assemble a record (row). It assembles records in turn for all received record index numbers and passes them to the chart display unit 330, which displays the record information corresponding to the record index numbers.

The column binary file storage unit (110, 210) and the column binary file reader unit (120, 220) are described in detail in the first and second embodiments below.

[First Embodiment]

The first embodiment is a structure that preserves the size of the original format when the table (result set) returned by a DBMS query is stored as per-column binary files.

Referring to FIGS. 2, 3, 5, 6, and 8, the structure 100 for storing large-volume data retrieved from a database as per-column binary files according to one embodiment of the present invention includes a column binary file storage unit 110 and a column binary file reader unit 120.

For the table (result set) returned by a query to the DBMS (Database Management System), the column binary file storage unit 110 handles numeric and character columns differently. For a numeric column, it converts all cell data of the column, in order, into binary values at intervals equal to the cell size of the original column format in the result set, generates one binary file per column, and stores it on disk. For a character column, it converts each cell into binary values occupying as many bytes as the number of characters in that cell, generates the binary file for that column, and stores it on disk.

Numeric columns include types such as big int, small int, int, and float, and character columns include types such as Nvarchar, varchar, and char. Various other numeric and character column types are also covered.

The column binary file reader unit 120 reads cell values one at a time from each per-column binary file stored on disk by the column binary file storage unit 110, assembles the record (row) at a given position, and loads it into memory one record at a time. Users can issue a variety of requests and queries against the large volume of data stored on disk. Because each request is served by loading data into memory record by record, only minimal memory is used and memory overflow does not occur.

Specifically, the column binary file storage unit 110 includes a fixed-length column binary storage unit 111 and a variable-length column binary storage unit 112.

The fixed-length column binary storage unit 111 converts all cell data of a numeric column, in order, into binary values at intervals equal to the cell size of the original column format in the table (result set), generates a single binary file, and stores it on disk.

Referring to FIG. 6, Col1 of the table (result set) is written starting with an 8-byte header, after which each cell's data is converted in order into binary values at 8-byte intervals, matching the big int type, to produce the binary file ColumnFile0001.conbin, which is stored on disk. Col2 starts with an 8-byte header followed by each cell's data at 2-byte intervals, matching the small int type, and is stored as ColumnFile0002.conbin. Col3 starts with an 8-byte header followed by each cell's data at 4-byte intervals, matching the int type, and is stored as ColumnFile0003.conbin. Col4 starts with an 8-byte header followed by each cell's data at 8-byte intervals, matching the float type, and is stored as ColumnFile0004.conbin. The binary file names are examples only, and the column formats of the original table are kept as they are.

The variable-length column binary storage unit 112 generates two files for a character column. For the cell indicator file, it places in one fixed-size cell both the size in bytes corresponding to the number of characters in a cell of the character column and the address value indicating where that cell's characters start in the cell data file. It lays these cells out consecutively, converts all of them in order into binary values, and stores the file on disk. For the cell data file, it converts the contents of each cell sequentially into binary values occupying as many bytes as the number of characters in the cell, and stores the file on disk.

Referring to FIG. 6, Col5, a character column, consists of two binary files: ColumnFile0005.conbin, the cell indicator file, and ColumnFile0005.conbin.data, the cell data file.

For example, for 'A…C', the second cell of Col5, the second cell position of the cell indicator file holds 15, the number of characters, and the address value 28, the position at which 'A…C' starts in the cell data file. In the cell data file, the strings of the original cells of the character column are stored consecutively, byte by byte.

When 'A..F', the third cell of the character column, is converted to binary values and stored, the third cell position of the cell indicator file stores the address value 43, the position at which 'A..F' starts in the cell data file, and 5, the number of characters. In the cell data file, 'A..F' is stored immediately after the string of the second cell.

In this way, the binary file ColumnFile0005.conbin, the cell indicator file, and the binary file ColumnFile0005.conbin.data, the cell data file, are generated and stored on disk.
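The address arithmetic in the Col5 example can be checked with a short sketch. The function name and the sample strings are hypothetical. The first cell's length of 26 bytes is not stated in the text and is chosen here only so that the second cell starts at address 28. The sketch also assumes that addresses are byte offsets counted from the start of the cell data file, including its 2-byte header.

```python
def indicator_entries(cells, data_header_size=2, encoding="utf-8"):
    """Return (address, byte_length) for each cell string, in order."""
    entries = []
    address = data_header_size  # assumed: addresses count from the file start
    for text in cells:
        length = len(text.encode(encoding))
        entries.append((address, length))
        address += length  # the next cell starts right after this one
    return entries


# Hypothetical cells of 26, 15 and 5 bytes.
cells = ["A" * 26, "B" * 15, "C" * 5]
print(indicator_entries(cells))  # [(2, 26), (28, 15), (43, 5)]
```

With these lengths the second and third entries reproduce the values 28/15 and 43/5 quoted for FIG. 6.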

The fixed-length column binary storage unit 111 includes a fixed-length header configuration unit 111a.

Referring to FIG. 6, the fixed-length header configuration unit 111a generates a header for each column when the per-column binary files are generated. The header stores the original format of the numeric column as defined in the table (result set), the generation format of the per-column binary file, and the size per cell used when generating the per-column binary file. The example of FIG. 6 shows the case in which every numeric column has the same 8-byte header.

Here, the original format of the numeric column and the generation format of the per-column binary file have the same value, and the size per cell stored for the per-column binary file equals the size of the original format.

In FIG. 6, the fixed-length header configuration unit 111a generates an 8-byte header for each column when generating the per-column binary files ColumnFile0001.conbin, ColumnFile0002.conbin, ColumnFile0003.conbin, and ColumnFile0004.conbin.

Referring to FIG. 8, in the "fixed-length basic" case the header consists of a 2-byte Fetch Type, P Type1 and P Type2 at 1 byte each, P Size1 and P Size2 at 1 byte each, and Reserved1 and Reserved2 at 1 byte each, for a total of 8 bytes.

Here, Fetch Type indicates the original format of the retrieved table (result set), P Type1 and P Type2 indicate the format in which the column binary file is written according to the present invention, and P Size1 and P Size2 indicate the size per cell (in bytes) of the column binary file according to the present invention. Reserved1 and Reserved2 are reserved for future use.

The fixed-length column binary storage unit 111 converts the header stored by the fixed-length header configuration unit 111a into binary values. It then converts the cell data of the original column in the table (result set), in order, into binary values at intervals equal to the per-cell size stored in that header, generates a single binary file per column that combines the header and the data, and stores it on disk.
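A fixed-length column file of this kind could be written as in the sketch below. The numeric type codes, the little-endian byte order, the field order inside the header (taken from the order in which the fields are listed for FIG. 8), and the use of 0 for the unused P Type2 and P Size2 fields are assumptions made for illustration. The text does not specify them.

```python
import struct

# Hypothetical type codes; the source does not list actual code values.
TYPE_CODES = {"bigint": 1, "smallint": 2, "int": 3, "float": 4}
# struct formats matching the cell sizes in FIG. 6 (8, 2, 4 and 8 bytes).
FORMATS = {"bigint": "<q", "smallint": "<h", "int": "<i", "float": "<d"}


def pack_header(fetch_type, p_type1, p_type2, p_size1, p_size2,
                reserved1=0, reserved2=0):
    """Pack the 8-byte header: Fetch Type (2 bytes) plus six 1-byte fields."""
    return struct.pack("<H6B", fetch_type, p_type1, p_type2,
                       p_size1, p_size2, reserved1, reserved2)


def write_fixed_column(path, col_type, values):
    """Write header + cells at a fixed per-cell interval for one column."""
    fmt = FORMATS[col_type]
    code = TYPE_CODES[col_type]
    size = struct.calcsize(fmt)
    with open(path, "wb") as f:
        # Original format and generated format are the same in this embodiment.
        f.write(pack_header(code, code, 0, size, 0))
        for v in values:
            f.write(struct.pack(fmt, v))
```

For example, `write_fixed_column("ColumnFile0001.conbin", "bigint", [10, 20, 30])` would produce a 32-byte file: 8 bytes of header followed by three 8-byte cells.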

In FIG. 6, the numeric columns Col1 to Col4 are stored as fixed-length binary files, whereas the character column Col5 is stored as variable-length binary files, as described below. In addition, a numeric column produces one binary file per original column, whereas a character column produces two binary files per original column: a cell indicator file and a cell data file.

The variable-length column binary storage unit 112 includes a cell indicator header configuration unit 112a and a cell data header configuration unit 112b.

Referring to FIG. 6, ColumnFile0005.conbin is the cell indicator file and ColumnFile0005.conbin.data is the cell data file.

The cell indicator header configuration unit 112a generates a header when the cell indicator file is generated. The header stores the original format of the character column as defined in the table (result set), the generation format of the cell indicator file, and the size per cell used when generating the cell indicator file.

Referring to FIG. 8, the "variable-length basic" entry shows the 8-byte First File Header generated by the cell indicator header configuration unit 112a.

Fetch Type indicates the original format of the retrieved result set, P Type1 and P Type2 indicate the generation format of the cell indicator file, and P Size1 and P Size2 indicate the size per cell of the cell indicator file.

Specifically, P Type1 indicates the integer format used for the position in the data file, P Type2 indicates the integer format used for the number of characters to read, P Size1 indicates the size of the data-file-position integer format, and P Size2 indicates the size (in bytes) of the character-count integer format. Reserved1 and Reserved2 are reserved for future use.

FIG. 6 shows an example in which each cell of the cell indicator file is 6 bytes. These 6 bytes hold the integer indicating the position in the data file and the number of characters to read from the data file. Note that the position stored here is a value counted starting from the 2-byte header of the data file.

The variable-length column binary storage unit 112 converts the header stored by the cell indicator header configuration unit 112a into binary values. It then converts, in order, the size in bytes corresponding to the number of characters in each cell of the character column and the address value indicating where each cell's characters start in the cell data file into binary values, at intervals equal to the per-cell size stored in that header. The result is a single binary file combining the cell indicator header and the cell indicator data, which is stored on disk.

In the example of FIG. 6, the size in bytes corresponding to the number of characters in each cell and the address value indicating where each cell's characters start in the cell data file are converted in order into binary values at 6-byte intervals, the size per cell, to generate a single binary file combining the cell indicator header and the cell indicator data, which is stored on disk.

Referring to FIG. 6, the cell data header configuration unit 112b generates a header when the cell data file is generated. The header stores the original format of the character column as defined in the table (result set).

Referring to FIG. 8, the "variable-length basic" entry under Second File Header shows the 2-byte header structure generated by the cell data header configuration unit 112b. The header of the cell data file may be 2 bytes or 4 bytes; FIG. 6 shows a 2-byte example. The header generated by the cell data header configuration unit 112b contains the original format of the result set, and when a character column is returned from the cell data binary file stored on disk, it is returned in that original format.

The variable-length column binary storage unit 112 converts the header stored by the cell data header configuration unit 112b into binary values, converts the characters of each cell of the character column into binary values occupying as many bytes as the number of characters in the cell, generates a single binary file combining the cell data header and the cell data, and stores it on disk.

The cell data file holds the original string data of the result set, converted sequentially into binary values. To return a value, the start position of the cell's characters and the number of characters to read are first looked up in the cell indicator file described above. That many bytes are then read from the cell data file, converted to the original format, and returned.
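The pair of files for a character column could be produced roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the 6-byte indicator cell is split into a 4-byte address and a 2-byte length, values are little-endian, strings are encoded as UTF-8, type fields are left as 0, and addresses are byte offsets from the start of the data file including its 2-byte header. None of these details are fixed by the text.

```python
import struct


def write_char_column(indicator_path, data_path, cells, fetch_type=0):
    """Write a cell indicator file and a cell data file for one character column."""
    with open(indicator_path, "wb") as ind, open(data_path, "wb") as dat:
        # 8-byte indicator header: Fetch Type (2), P Type1, P Type2,
        # P Size1 = 4 (address bytes), P Size2 = 2 (length bytes), 2 reserved.
        ind.write(struct.pack("<H6B", fetch_type, 0, 0, 4, 2, 0, 0))
        # 2-byte data header holding the original format code.
        dat.write(struct.pack("<H", fetch_type))
        address = 2  # first string starts right after the data header
        for text in cells:
            raw = text.encode("utf-8")
            ind.write(struct.pack("<IH", address, len(raw)))  # one 6-byte cell
            dat.write(raw)  # strings are stored back to back
            address += len(raw)
```

Because every indicator cell has the same size, the indicator file supports direct seeking by record number even though the strings themselves vary in length.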

The column binary file reader unit 120 includes a fixed-length column binary reader unit 121 and a variable-length column binary reader unit 122.

When reading the N-th record value from the binary file of a numeric column stored on disk, the fixed-length column binary reader unit 121 computes (header size of the binary file + N × per-cell size of the original column format defined in the header of the binary file), moves to the position of the N-th record, and reads as many bytes as the cell size of the original column format.

Referring to FIG. 6, when the third record value is read from ColumnFile0001.conbin, a numeric column, computing (8-byte header size + 2 × 8-byte per-cell size) = 24 bytes gives the position of the third record. The reader moves there and reads as many bytes as the cell size of the original column format.

Note that record indexes start from 0, so the index of the third record is 2.
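The offset rule for fixed-length columns can be written as a short sketch. The function names, the little-endian layout, and the assumption that the per-cell size (P Size1) sits at byte offset 4 of the 8-byte header are illustrative choices, not details confirmed by the text.

```python
import struct

HEADER_SIZE = 8  # header size used in the FIG. 6 example


def cell_offset(n, cell_size, header_size=HEADER_SIZE):
    """Byte offset of the record with 0-based index n."""
    return header_size + n * cell_size


def read_fixed_cell(path, n, fmt="<q"):
    """Read one cell of a fixed-length column file without loading the file."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
        cell_size = header[4]  # assumed position of P Size1 in the header
        f.seek(cell_offset(n, cell_size))
        return struct.unpack(fmt, f.read(cell_size))[0]


print(cell_offset(2, 8))  # 24, the third record of ColumnFile0001.conbin
```

The caller passes a `struct` format that matches the column type, for example `"<q"` for an 8-byte integer or `"<d"` for an 8-byte float.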

When reading the N-th record value from the binary files of a character column stored on disk, the variable-length column binary reader unit 122 computes (header size of the cell indicator file + N × (size of the cell-data-file-position integer format defined in the header + size of the character-count integer format)) and moves to the position of the N-th record in the cell indicator file. It reads as many bytes as the size of the position integer format to obtain the start address of the N-th record in the cell data file, then reads as many bytes as the size of the character-count integer format to obtain the number of bytes to read from the cell data file. Finally, it reads that many bytes from the cell data file, starting at the start address of the N-th record.

Referring to FIG. 6, when the third record value is read from the binary files of the character column, the reader computes (8-byte header size of the cell indicator file ColumnFile0005.conbin + 2 × (size of the cell-data-file-position integer format defined in the header + size of the character-count integer format)). In the example of FIG. 6 the two sizes sum to 6 bytes, so the result is 8 + 2 × 6 = 20 bytes.

At that position, the reader reads as many bytes as the size of the position integer format to obtain '43', the start address of this record in the cell data file ColumnFile0005.conbin.data. It then reads as many bytes as the size of the character-count integer format to obtain '5', the number of bytes to read from the cell data file. Finally, it reads 5 bytes from the cell data file, starting at that start address.
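The two-step read for a character column might look like the sketch below. As before, this is an illustration under assumptions: the function name is hypothetical, integers are little-endian, P Size1 and P Size2 are taken from byte offsets 4 and 5 of the 8-byte indicator header, the stored address is a byte offset from the start of the data file, and the text is UTF-8.

```python
def read_char_cell(indicator_path, data_path, n, encoding="utf-8"):
    """Return the string of the record with 0-based index n."""
    with open(indicator_path, "rb") as ind:
        header = ind.read(8)
        pos_size, len_size = header[4], header[5]  # assumed header offsets
        # Offset = header size + N x (position size + count size).
        ind.seek(8 + n * (pos_size + len_size))
        address = int.from_bytes(ind.read(pos_size), "little")
        length = int.from_bytes(ind.read(len_size), "little")
    with open(data_path, "rb") as dat:
        dat.seek(address)  # assumed: offset counted from the file start
        return dat.read(length).decode(encoding)
```

With a 4-byte position and a 2-byte count, `n = 2` seeks to byte 8 + 2 × 6 = 20 of the indicator file, matching the FIG. 6 walkthrough.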

[Second Embodiment]

The second embodiment is a structure in which, when the table (result set) returned by a DBMS query is stored as per-column binary files, a capacity reduction policy is applied so that the data is stored in less space than the original column format requires. Applying a capacity reduction policy can considerably reduce the space used on disk. Detailed descriptions of the parts that overlap with the first embodiment are omitted.

Referring to FIGS. 4, 5, 7, 8, and 9, the structure 200 for storing large-volume data retrieved from a database as per-column binary files according to another embodiment of the present invention includes a column binary file storage unit 210 and a column binary file reader unit 220, and may further include a capacity reduction policy information unit 230.

For the table (result set) returned by a query to the DBMS (Database Management System), the column binary file storage unit 210 applies a different capacity-reducing conversion format to each numeric column of the original table, according to the nature of the cell data values in that column. It converts all cell data of the column, in order, into binary values at intervals equal to the cell size of the converted format, generates one binary file per column, and stores it on disk.

The column binary file reader unit 220 obtains cell values one at a time from each per-column binary file stored on disk by the column binary file storage unit 210, assembles the record (row) at a given position, and loads it into memory one record at a time.

The capacity reduction policy information unit 230 stores, for each column, the capacity reduction policy matched to the nature of that column's cell data values, so that the binary files generated from the one or more numeric columns of the returned table (result set) are smaller. The details of each capacity reduction policy are described below.

Capacity reduction policies apply only to numeric columns, not to character columns. Here, the nature of the cell data values in a column refers to the column type (big int, float, int, small int, and so on), the degree of duplication among the cell values in the column, the relationship to cell values in other columns, and the like. The capacity reduction policy suited to the nature of each column's cell data values may be matched by an administrator.

The column binary file storage unit 210 extracts the cell data values column by column from the received table (result set), consults the capacity reduction policy information unit 230, and, according to the capacity reduction policy matched to each column, generates a per-column binary file that is smaller than the original column format would require and stores it on disk.

The capacity reduction policy information unit 230 includes a first capacity reduction policy 231, which reduces the capacity of the original column by converting the column's format to a column type of a designated size that is smaller than the original column type of the table (result set) but can still hold the actual maximum data value of the cells in that column.

Referring to FIGS. 5, 7, 8, and 9, "format conversion" corresponds to the first capacity reduction policy 231. The Col1 column of the table (result set) is of type bigint (8 bytes), but its actual maximum value fits in an int (4 bytes), so format conversion is applied to shrink each cell from 8 bytes to 4 bytes. The data can thus be stored on disk at half its original size.
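The idea behind format conversion can be illustrated with a helper that picks the narrowest signed integer type able to hold a column's actual values. Note that the text describes the policy as being matched per column, possibly by an administrator. Choosing the type automatically from the data, as done here, is an assumption made for the sake of a runnable example, and the names and the little-endian layout are hypothetical.

```python
import struct

# Candidate integer formats, narrowest first: (name, struct format).
INT_TYPES = [("smallint", "<h"), ("int", "<i"), ("bigint", "<q")]


def narrowest_int_type(values):
    """Return (name, struct format) of the smallest type that holds all values."""
    lo, hi = min(values), max(values)
    for name, fmt in INT_TYPES:
        bits = struct.calcsize(fmt) * 8
        if -(1 << (bits - 1)) <= lo and hi <= (1 << (bits - 1)) - 1:
            return name, fmt
    raise OverflowError("values do not fit in 8 bytes")


def pack_narrowed(values):
    """Pack all cells at the reduced per-cell size."""
    name, fmt = narrowest_int_type(values)
    return name, b"".join(struct.pack(fmt, v) for v in values)
```

A bigint column whose largest value is 2,000,000 would be packed as int, so each cell takes 4 bytes instead of 8.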

The column binary file storage unit 210 includes a first capacity reduction header configuration unit 211, which generates a header storing the original format of the column as defined in the table (result set), the column format converted by the first capacity reduction policy 231 (that is, the generation format of the per-column binary file), and the size of the converted format.

The first capacity reduction header configuration unit 211 builds an 8-byte header: Fetch Type stores the original format of the column as defined in the result set, P Type stores the converted format, P Size stores the size of the converted format, and Reserved is left for future use. Fetch Type, P Type, P Size, and Reserved are 2 bytes each.

The column binary file storage unit 210 converts the header generated by the first capacity reduction header configuration unit 211 into binary values. Following the column format and size produced by the first capacity reduction policy 231, it then converts the cell data of the original column in the table (result set), in order, into binary values at per-cell intervals smaller than those of the original column format, generates a single binary file combining the header and the data, and stores it on disk.

The capacity reduction policy information unit 230 includes a second capacity reduction policy 232. When the original column type of the table (result set) is floating point (float), this policy splits each cell's data value into an integer part and a fractional part, designates a first type large enough to hold the integer part and a second type large enough to hold the fractional part, and converts the original column format into a composite column type whose size is the combined size of the first and second types, thereby reducing the capacity of the original column.

Referring to FIGS. 5, 7, 8 and 9, the composite format conversion corresponds to the second capacity reduction policy 232. The Col2 column of the table (Result Set) is float (8 bytes), but the maximum values of the integer part and of the fractional part (when the fractional part is converted to an integer) can each be represented in 2 bytes. Applying the composite format conversion therefore stores each cell in 4 bytes in total, which reduces the amount of data stored on the disk. The space of one cell is divided in two: the integer part is stored in the first type, sized to hold the integer part, and the fractional part is stored in the second type, sized to hold the fractional part.

The column binary file storage unit 210 includes a second capacity reduction header configuration unit 212 that generates a header storing the following: the original type of the column as defined in the table (Result Set); the first type, sized to hold the integer part under the second capacity reduction policy 232 that governs how the per-column binary file is generated; the second type, sized to hold the fractional part under the same policy; the size of the first type; and the size of the second type.

Referring to FIG. 8, the 2-byte Fetch Type stores the original type from the table (Result Set); the 1-byte P Type1 stores the first type, sized to hold the integer part; the 1-byte P Type2 stores the second type, sized to hold the fractional part; the 1-byte P Size1 stores the size of the first type; the 1-byte P Size2 stores the size of the second type; and Reserved1 stores the precision of the fractional part.

The column binary file storage unit 210 converts the header generated by the second capacity reduction header configuration unit 212 into binary values. It then converts the integer part and the fractional part of each cell of the original column in the table (Result Set) into binary values, using the column type and per-cell size produced by the second capacity reduction policy 232, which is smaller than the original column type of the table. The header and the data are combined into a single binary file and stored on the disk.
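As a rough illustration of the composite format conversion, the following Python sketch stores each float cell as a 2-byte integer part and a 2-byte fractional part, preceded by a header with the fields named above. Several details are assumptions not stated in the description: the type codes, the little-endian byte order, a 1-byte Reserved1 followed by one padding byte to reach 8 bytes, and the restriction to non-negative values.

```python
import struct

# Hypothetical type codes; the description does not define them.
TYPE_FLOAT64 = 3
TYPE_UINT16 = 4

# Fetch Type (2), P Type1 (1), P Type2 (1), P Size1 (1), P Size2 (1),
# Reserved1 = fractional precision (assumed 1 byte), 1 padding byte (assumed).
HEADER_FORMAT = "<HBBBBBx"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
CELL_FORMAT = "<HH"  # integer part, fractional part as an integer
CELL_SIZE = struct.calcsize(CELL_FORMAT)


def encode_split_float_column(values, precision):
    """Encode non-negative floats as (integer part, scaled fractional part)."""
    scale = 10 ** precision
    header = struct.pack(
        HEADER_FORMAT, TYPE_FLOAT64, TYPE_UINT16, TYPE_UINT16, 2, 2, precision
    )
    cells = bytearray()
    for v in values:
        int_part, frac_part = divmod(round(v * scale), scale)
        cells += struct.pack(CELL_FORMAT, int_part, frac_part)
    return header + bytes(cells)


def decode_split_float_cell(blob, n):
    """Rebuild the n-th float from its two stored parts."""
    precision = struct.unpack(HEADER_FORMAT, blob[:HEADER_SIZE])[5]
    offset = HEADER_SIZE + n * CELL_SIZE
    int_part, frac_part = struct.unpack(CELL_FORMAT, blob[offset:offset + CELL_SIZE])
    return int_part + frac_part / 10 ** precision


blob = encode_split_float_column([12.34, 0.05, 655.0], precision=2)
```

Each cell takes 4 bytes here, matching the 8-byte to 4-byte reduction described for Col2.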

The capacity reduction policy information unit 230 includes a third capacity reduction policy 233. When each cell value of an original column in the table (Result Set) can be obtained from a formula whose variable is the cell value at the same record (Row) position in another column, this policy replaces all cell data of the original column with the formula itself, stored in a single cell of a predetermined size, thereby reducing the size of the original column.

Referring to FIGS. 5, 7, 8 and 9, the formula conversion corresponds to the third capacity reduction policy 233. The Col3 column of the table (Result Set) is int (4 bytes), but a formula operation on the Col5 column shows that the relation Col3 = 2 * Col5 + 5 holds, so the expression {Col5} * 2 + 5 is converted to postfix notation and the formula conversion is applied. Because every value of Col3 can be computed from Col5 with this formula, only the formula needs to be stored in place of the original column data of Col3.

In the example of FIG. 7, applying the formula conversion stores only the expression {Col5} * 2 + 5 in a 10-byte space in addition to the 8-byte header, which reduces an original column of 400 million bytes to 10 bytes.

The column binary file storage unit 210 includes a third capacity reduction header configuration unit 213 that generates a header storing the original type of the column as defined in the table (Result Set), the format in which the per-column binary file is generated, and the per-cell size used when the per-column binary file is generated.

The column binary file storage unit 210 converts the header generated by the third capacity reduction header configuration unit 213 into binary values, converts a space of the predetermined size containing the formula defined in the third capacity reduction policy 233 ({Col5} * 2 + 5 in the example of FIG. 7) into binary values, and combines the header and the data into a single binary file stored on the disk. The other column and the formula are specified by the user.
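The way a stored postfix formula could reproduce a Col3 value from the Col5 value of the same record can be sketched as follows. The token representation and the function name are hypothetical; the description states only that the expression {Col5} * 2 + 5 is converted to postfix notation, not how it is encoded in the 10-byte space.

```python
import operator

# Binary operators supported by this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_postfix(tokens, row):
    """Evaluate a postfix formula; {Name} tokens read that column from row."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        elif tok.startswith("{") and tok.endswith("}"):
            stack.append(row[tok[1:-1]])  # column reference, e.g. {Col5}
        else:
            stack.append(int(tok))  # integer literal
    return stack.pop()


# {Col5} * 2 + 5 in postfix order (hypothetical token form).
POSTFIX = ["{Col5}", "2", "*", "5", "+"]

col3 = eval_postfix(POSTFIX, {"Col5": 10})
```

With Col5 equal to 10, the stack-based evaluation computes 10 * 2 and then adds 5, so no Col3 cell data needs to be read from disk.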

The capacity reduction policy information unit 230 includes a fourth capacity reduction policy 234. This policy removes duplicates among the cell values of an original column in the table (Result Set) and creates the ColumnFile0005.conbin.Dictionary binary file, which corresponds to a dictionary column made up only of distinct values. It also creates the ColumnFile0005.conbin binary file, which corresponds to an index column of a designated size that is smaller than the original column type of the table (Result Set) and stores, for each original cell value, its index in the dictionary column. Together these reduce the size of the original column.

Referring to FIGS. 5, 7, 8 and 9, the Column Dictionary corresponds to the fourth capacity reduction policy 234. The Col4 column of the table (Result Set) is int (4 bytes), but it has fewer than 200 distinct values. The Column Dictionary conversion is therefore applied: duplicates in the cell data are removed to create a Dictionary file, and the hash value (Index Key) into that Dictionary is stored in a colbin file.

Referring to FIG. 7, the first entry of ColumnFile0005.conbin, 0, points to 36 in ColumnFile0005.conbin.Dictionary, and the second entry, 1, points to 42. The third entry, 0, points to 36 again because its value is a duplicate that was removed from the dictionary. The fourth entry, 2, points to 366, which moved up one position after the duplicate was removed.

The column binary file storage unit 210 includes a fourth capacity reduction index header configuration unit 214a that generates a header storing the original type of the column as defined in the table (Result Set), the index column type designated by the fourth capacity reduction policy 234 that governs how the per-column binary file is generated, and the size of the index column type.

Referring to FIG. 8, the First File Header (8 bytes) shows the layout of the header generated by the fourth capacity reduction index header configuration unit 214a. Fetch Type stores the original type of the data (Result Set), P Type stores the Dictionary Key (Index) type, and P Size stores the size of the Dictionary Key (Index) type.

The column binary file storage unit 210 converts the header generated by the fourth capacity reduction index header configuration unit 214a into binary values. Using the index column type and size designated by the fourth capacity reduction policy 234, it converts the dictionary-column index of each original cell value into binary values in turn, and combines the header and the data into a single index binary file stored on the disk.

The column binary file storage unit 210 includes a fourth capacity reduction dictionary header configuration unit 214b that generates a header storing the dictionary column type and the size of the dictionary column type.

Referring to FIG. 8, the Second File Header (2 or 4 bytes) shows the layout of the header generated by the fourth capacity reduction dictionary header configuration unit 214b. The header stores the type of the Dictionary values and the size of that type.

Under the fourth capacity reduction policy 234, duplicates are removed from the column defined in the table (Result Set) so that only distinct values remain, arranged sequentially. These values are converted into binary values in turn at the per-cell size of the dictionary column, and the header and the data are combined into a single dictionary binary file stored on the disk.
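The Column Dictionary conversion can be sketched in Python using the values 36, 42, 36, 366 from the FIG. 7 example. This is a simplified illustration: both file headers are omitted, and the 1-byte index cells, 4-byte dictionary cells, and little-endian byte order are assumptions.

```python
def build_column_dictionary(values):
    """Return (distinct values in first-seen order, per-cell dictionary index)."""
    dictionary = []
    position = {}  # value -> index in dictionary
    indexes = []
    for v in values:
        if v not in position:
            position[v] = len(dictionary)
            dictionary.append(v)
        indexes.append(position[v])
    return dictionary, indexes


dictionary, indexes = build_column_dictionary([36, 42, 36, 366])

# Index column body: 1 byte per cell is enough for fewer than 256 distinct values.
index_bytes = bytes(indexes)

# Dictionary column body: each distinct value keeps the original 4-byte int size.
dictionary_bytes = b"".join(
    v.to_bytes(4, "little", signed=True) for v in dictionary
)

# Reading a cell back means following its index into the dictionary.
restored = [dictionary[i] for i in indexes]
```

The saving grows with the number of records, since each repeated value costs only one index byte instead of a full 4-byte cell.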

The column binary file reader unit 220 interprets the capacity reduction policy stored for each column in the capacity reduction policy information unit 230, obtains cell values one at a time from the numeric column binary files stored on the disk, converts them to the original column type stored in the header of the column binary file, constructs the record at a given position, and loads it into memory.

The column binary file reader unit 220 may read the P Type and P Size values stored in the header of each per-column binary file on the disk, use them to locate and read the cell data stored on the disk, convert the data to the original column type given by Fetch Type, construct the record at a given position, and load it into memory.
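A minimal sketch of reading a single cell without loading the whole column is given below. It assumes the 8-byte header of four 2-byte fields described for the first capacity reduction policy, little-endian signed integer cells, and that the N-th cell sits at header size + N × P Size; the type codes are hypothetical, and an in-memory `io.BytesIO` object stands in for the file on disk.

```python
import io
import struct

HEADER_FORMAT = "<HHHH"  # Fetch Type, P Type, P Size, Reserved (2 bytes each)
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def read_cell(f, n):
    """Seek to the n-th cell and return (fetch_type, value)."""
    f.seek(0)
    fetch_type, _p_type, p_size, _reserved = struct.unpack(
        HEADER_FORMAT, f.read(HEADER_SIZE)
    )
    # Only p_size bytes are read, regardless of how large the column is.
    f.seek(HEADER_SIZE + n * p_size)
    raw = f.read(p_size)
    # Python ints have no fixed width; a real reader would widen the value
    # to the original type identified by fetch_type at this point.
    return fetch_type, int.from_bytes(raw, "little", signed=True)


# Stand-in for a column file: header (codes 1 and 2 are hypothetical) + 3 cells.
column_file = io.BytesIO(
    struct.pack(HEADER_FORMAT, 1, 2, 1, 0) + bytes([12, 249, 100])
)

second_cell = read_cell(column_file, 1)
```

The same function works on a file opened with `open(path, "rb")`, since it relies only on `seek` and `read`.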

The present invention has been described in detail with reference to preferred embodiments. However, those skilled in the art to which the present invention pertains can practice it in other specific forms without changing its technical idea or essential features, so the embodiments described above should be understood as illustrative in all respects and not limiting.

The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

100, 200...structure for storing data as per-column binary files
110, 210...column binary file storage unit
111...fixed-length column binary storage unit
111a...fixed-length header configuration unit
112...variable-length column binary storage unit
112a...cell indicator header configuration unit
112b...cell data header configuration unit
120, 220...column binary file reader unit
121...fixed-length column binary reader unit
122...variable-length column binary reader unit
211...first capacity reduction header configuration unit
212...second capacity reduction header configuration unit
213...third capacity reduction header configuration unit
214a...fourth capacity reduction index header configuration unit
214b...fourth capacity reduction dictionary header configuration unit
230...capacity reduction policy information unit
231...first capacity reduction policy
232...second capacity reduction policy
233...third capacity reduction policy
234...fourth capacity reduction policy
300...system that displays large amounts of data on a chart without memory capacity limitation
330...chart display unit
331...two-dimensional array index number management unit
332...array structure conversion unit
333...record index creation unit
334...record index reader unit

Claims

A column binary file storage unit that generates per-column binary files from table data received as the result of a data query to a database management system (DBMS) and stores them on a disk;
A column binary file reader unit that reads cell values one at a time from the per-column binary files stored on the disk by the column binary file storage unit, constructs a record, and loads it into memory record by record; and
A chart display unit that uses the cell values corresponding to the X coordinate and the Y coordinate in the record loaded into memory by the column binary file reader unit to display a pixel at the position corresponding to those X and Y coordinates on the chart,
wherein the chart display unit comprises:
A two-dimensional array index number management unit that assigns record index numbers in the order in which pixels are displayed at the positions corresponding to the X and Y coordinates on the chart, and stores one or more record index numbers for each pixel position in the two-dimensional array corresponding to the X and Y coordinates;
An array structure conversion unit that converts the two-dimensional array into a one-dimensional array, converts the X and Y coordinates of the two-dimensional array into an index number of the one-dimensional array, and stores the one or more record index numbers in association with that index number of the one-dimensional array;
A record index creation unit that arranges, sequentially and in cells of equal size, the address value indicating the start position in a record index data file of the record index numbers corresponding to each index number of the one-dimensional array together with the count of those record index numbers, converts all cell data in turn into binary values to generate a record index indicator file stored on the disk, and converts the record index numbers corresponding to each index number of the one-dimensional array in turn into binary values of a specified byte size to generate the record index data file stored on the disk; and
A record index reader unit that, when a specific pixel on the chart is designated, uses the index number of the one-dimensional array corresponding to the X and Y coordinates of the pixel to obtain, from the record index indicator file stored on the disk, the address value indicating the start position in the record index data file and the count of record index numbers, and retrieves that many record index numbers from the record index data file; the whole forming a system that displays large amounts of data on a chart without memory capacity limitation using a per-column binary file storage structure.

(deleted)

The system according to claim 1,
wherein the column binary file reader unit receives the one or more record index numbers obtained by the record index reader unit, reads the cell values corresponding to each record index number one at a time from the per-column binary files stored on the disk to construct a record, constructs records in turn for all received record index numbers, and passes them to the chart display unit, and
the chart display unit displays the record information corresponding to the record index numbers, in the system that displays large amounts of data on a chart without memory capacity limitation using a per-column binary file storage structure.

A column binary file storage unit that generates per-column binary files from table data received as the result of a data query to a database management system (DBMS) and stores them on a disk;
A column binary file reader unit that reads cell values one at a time from the per-column binary files stored on the disk by the column binary file storage unit, constructs a record, and loads it into memory record by record; and
A chart display unit that uses the cell values corresponding to the X coordinate and the Y coordinate in the record loaded into memory by the column binary file reader unit to display a pixel at the position corresponding to those X and Y coordinates on the chart,
wherein the column binary file storage unit comprises:
A fixed-length column binary storage unit that, for a numeric column of the table (Result Set), converts all cell data of the column in turn into binary values at the per-cell size of the original column type, generates a single binary file, and stores it on the disk; and
A variable-length column binary storage unit that, for a character column, arranges sequentially and in cells of equal size the byte size corresponding to the number of characters in each cell together with the address value indicating the start position of that cell's characters in a cell data file, converts all cell data in turn into binary values to generate a cell indicator file stored on the disk, and converts the characters of each cell in turn into binary values of the byte size corresponding to the number of characters to generate the cell data file stored on the disk,
and wherein the column binary file reader unit comprises:
A fixed-length column binary reader unit that, when reading the value of the Nth record from the binary file of a numeric column stored on the disk, moves to the Nth record position, given by (header size of the binary file + N × per-cell size of the original column format defined in the header of the binary file), and reads as many bytes as the per-cell size of the original column format; and
A variable-length column binary reader unit that, when reading the value of the Nth record from the binary file of a character column stored on the disk, moves to the Nth record position, given by (header size of the cell indicator file + N × (size of the cell data file position integer format defined in the header of the binary file + size of the character count integer format)), reads as many bytes as the size of the cell data file position integer format to find the start address of the Nth record in the cell data file, reads as many bytes as the size of the character count integer format to obtain the byte size corresponding to the number of characters to read, and reads that many bytes from the start address of the Nth record in the cell data file; the whole forming a system that displays large amounts of data on a chart without memory capacity limitation using a per-column binary file storage structure.

(deleted)

A column binary file storage unit that generates per-column binary files from table data received as the result of a data query to a database management system (DBMS) and stores them on a disk;
A column binary file reader unit that reads cell values one at a time from the per-column binary files stored on the disk by the column binary file storage unit, constructs a record, and loads it into memory record by record; and
A chart display unit that uses the cell values corresponding to the X coordinate and the Y coordinate in the record loaded into memory by the column binary file reader unit to display a pixel at the position corresponding to those X and Y coordinates on the chart,
wherein the column binary file storage unit,
for the received table (Result Set), applies to each numeric column a capacity reduction conversion format chosen according to the characteristics of the cell data values belonging to that column, converts all cell data of the column in turn into binary values at the per-cell size of the converted format, and generates a binary file for each column stored on the disk,
the system further comprising a capacity reduction policy information unit that stores, matched to each column, a capacity reduction policy suited to the characteristics of the cell data values belonging to that column, in order to reduce the size of the binary files generated when one or more numeric columns of the table (Result Set) received as the result of the data query are converted into binary files,
wherein the column binary file storage unit extracts the cell data values of each column from the received table (Result Set) and, by referring to the capacity reduction policy information unit, generates for each column a binary file smaller than the original column type of the table according to the capacity reduction policy matched to that column and stores it on the disk; the whole forming a system that displays large amounts of data on a chart without memory capacity limitation using a per-column binary file storage structure.

(deleted)

The system according to claim 6,
wherein the column binary file reader unit
interprets the capacity reduction policy stored for each column in the capacity reduction policy information unit, obtains cell values one at a time from the numeric column binary files stored on the disk, converts them to the original column format stored in the header of the column binary file, constructs the record at a given position, and loads it into memory, in the system that displays large amounts of data on a chart without memory capacity limitation using a per-column binary file storage structure.