KR100533883B1

KR100533883B1 - Techniques for Graphics Data Processing and Transformation of Linear Expressions, Designed for Graphics Processor

Info

Publication number: KR100533883B1
Application number: KR10-2003-0037962A
Authority: KR
Inventors: 임인성; 오진상
Original assignee: 학교법인 서강대학교
Priority date: 2003-06-12
Filing date: 2003-06-12
Publication date: 2005-12-07
Also published as: KR20040107730A

Abstract

The present invention relates to a graphics data processing method and a linear expression transformation method that allow linear expressions to be implemented efficiently on a graphics processor.

In the graphics data processing method according to the present invention, when an arbitrary linear expression E of the form given in the equation below is input to a graphics processor, the method comprises: a transformation step of finding a transformed linear expression that is equivalent to the linear expression and has minimum cost; and a shader coding step of applying a minimum number of vertex instructions to the transformed linear expression.

The linear expression transformation method comprises: a target setting step of setting a target cost; a matrix transformation step of obtaining an arbitrary transformed linear expression E' of the linear expression E; a cost calculation step of calculating the cost of computing the linear expression E and the cost of computing the transformed linear expression E'; a cost difference calculation step of obtaining the difference between the cost of computing E' and the cost of computing E; a linear expression reset step of resetting the transformed linear expression E' as the new linear expression E when the cost difference is less than 0, or is likely to fall below 0 as the process continues; and a target achievement step of repeating the matrix transformation step through the linear expression reset step until the cost of computing the linear expression E reaches the target cost.

Description

Graphics Data Processing and Transformation of Linear Expressions, Designed for Graphics Processor

The present invention relates to a graphics data processing method and a linear expression transformation method for a graphics processor, and more particularly to a graphics data processing method and a linear expression transformation method that allow linear expressions to be implemented efficiently on a graphics processor.

Graphics processors, which are now becoming commonplace in the graphics field, are evolving away from a fixed pipeline toward designs that users can program directly. The shader technology of these processors makes it possible to implement a variety of rendering effects in real time, and allows many computations that were previously possible only on the CPU to be carried out on the graphics processor (GPU) as well.

In the conventional fixed pipeline, the user could change only parameters, so only a limited range of rendering effects could be obtained. To overcome this drawback, there have been efforts to replace each part of the rendering pipeline with code programmed directly by the user. A representative example is the RenderMan shading language.

The RenderMan shading language belongs to a software rendering system and has the advantage that shading and lighting computations can be manipulated as the user wishes. Because of its complexity, however, this shading approach is used mainly in offline rendering, to produce film-quality or photorealistic images. The programmable graphics processors that have appeared recently provide shaders in hardware, so that a variety of shading effects can be obtained in real time.

The most important feature of these programmable graphics processors is that they provide a vertex shader and a pixel shader. That is, the programmer can freely implement per-vertex and per-pixel operations. With a graphics processor, techniques that previously required a great deal of CPU computation, such as bump mapping, environment mapping, and reflection mapping, can be performed very flexibly in shaders. Recent research has therefore focused on using shaders to bring many advanced rendering techniques, previously performed offline, into real time, and the use of the pixel shader in particular, with its very fast execution time, is increasing rapidly.

Various techniques for ray tracing and volume rendering that use the pixel shader have been studied and proposed. To date, however, most published work has concerned the pixel shader, and research that uses the vertex shader remains scarce.

Meanwhile, the linear expression, the simplest form of mathematical expression, is widely used to formulate problems not only in computer graphics but in many engineering fields. In particular, linear expressions of the form y = Ax + b, which correspond to an affine transform, play an important role in many applications in computer graphics. Evaluating such a linear expression is dominated by the cost of the matrix operation, so an efficient method of computation is needed, especially when the matrix is very large and sparse.

Until now, such matrix operations have been performed mainly by the CPU, but methods of performing matrix operations on a graphics processor have recently been developed. There is accordingly a need for a method of optimizing shader code so that matrix operations can be implemented efficiently on the vertex shader provided by the graphics processor.

An object of the present invention, devised to meet the above need, is to provide a method of optimizing shader code so that applications that use linear expressions of affine-transform form can be implemented efficiently on the vertex shader provided by a graphics processor.

To achieve the above object, the graphics data processing method according to the present invention is characterized by comprising, when an arbitrary linear expression E of the form given in the equation below is input to a graphics processor:

a transformation step of finding a transformed linear expression that is equivalent to the linear expression and has minimum cost; and

a shader coding step of applying a minimum number of vertex instructions to the transformed linear expression.

[Equation]

$$E:\ a = MV + A$$

Here, the linear expression E is a formula that yields a when V is given; M and A are a constant matrix and a constant vector, respectively, and V and a are vectors of variables.

The present invention also provides a computer-readable recording medium on which is recorded a program for causing a graphics processor to execute the graphics data processing method described above.

The linear expression transformation method for a graphics processor according to the present invention is characterized by comprising, when an arbitrary linear expression E of the form given in the equation below is input to the graphics processor:

a target setting step of setting a target cost;

a matrix transformation step of obtaining an arbitrary transformed linear expression E' of the linear expression E;

a cost calculation step of calculating the cost of computing the linear expression E and the cost of computing the transformed linear expression E';

a cost difference calculation step of obtaining the difference between the cost of computing the transformed linear expression E' and the cost of computing the linear expression E;

a linear expression reset step of resetting the transformed linear expression E' as the new linear expression E when the cost difference is less than 0, or is likely to fall below 0 as the process continues; and

a target achievement step of repeating the matrix transformation step through the linear expression reset step until the cost of computing the linear expression E reaches the target cost.

[Equation]

$$E:\ a = MV + A$$

The present invention also provides a computer-readable recording medium on which is recorded a program for causing a graphics processor to execute the linear expression transformation method described above.

Hereinafter, a graphics data processing method and a linear expression transformation method for a graphics processor according to an embodiment of the present invention are described in detail with reference to the accompanying drawings.

Before describing the present invention, the vertex shader to which it is applied is described first.

The vertex shader replaces the per-vertex operations that precede rasterization in the conventional real-time graphics pipeline; its main purpose is to perform transformation and lighting computations. Because the vertex shader provides machine-level SIMD (Single Instruction Multiple Data) instructions, parallel processing is possible. Vertex shader instructions are 4-wide SIMD instructions that operate on registers consisting of four fields.

Vertex shader instructions include MUL, ADD, MAD, DP3, and DP4. MUL performs four multiplications at once; ADD performs four additions at once; MAD performs four multiplications and four additions at once; DP3 performs three multiplications and two additions at once; and DP4 performs four multiplications and three additions at once.
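The operation counts listed above can be checked with a small plain-Python model of the five instructions. This is only an illustrative sketch: registers are modeled as lists of four numbers, and DP3 and DP4 return a single scalar, whereas real hardware writes the result into destination register components. The function names simply mirror the instruction mnemonics.

```python
# Illustrative model of the 4-wide vertex instructions named in the text.
# A register is a list of four numbers (fields x, y, z, w).

def MUL(a, b):
    """Component-wise product: four multiplications."""
    return [x * y for x, y in zip(a, b)]

def ADD(a, b):
    """Component-wise sum: four additions."""
    return [x + y for x, y in zip(a, b)]

def MAD(a, b, c):
    """Multiply-add a * b + c: four multiplications and four additions."""
    return [x * y + z for x, y, z in zip(a, b, c)]

def DP3(a, b):
    """3-component dot product: three multiplications and two additions."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def DP4(a, b):
    """4-component dot product: four multiplications and three additions."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

r0 = [1.0, 2.0, 3.0, 4.0]
r1 = [5.0, 6.0, 7.0, 8.0]
```

With these two sample registers, `MAD(r0, r1, r0)` adds `r0` to the component-wise product of `r0` and `r1` in a single modeled instruction.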

On a vertex shader, these instructions can be used to simplify the evaluation of a linear expression. Graphics processors such as the NVIDIA GeForce3 and GeForce4, the ATI RADEON 8500, and their successors are in common use. On these processors, both the number of vertex instructions that can be used within one vertex shader and the number of available registers are limited, so the given resources must be used efficiently. Moreover, because the number of instructions is closely related to computation time, reducing the number of instructions is highly significant.

Because the vertex shader operates on all four fields of a register at once, evaluating a linear expression requires the ability to operate selectively on the four data items. For this purpose the vertex shader provides swizzling, which makes it possible to select only the desired items among the four in a register, or to reorder them, before the operation. Using swizzling incurs no performance penalty.
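As a rough illustration of swizzling, the sketch below selects or reorders the four fields of a register by a pattern string such as `"wzyx"` or `"yyyy"`. The helper name `swizzle` and the string-based pattern are assumptions made for this example; they imitate the `.xyzw` suffix notation of vertex shader assembly but are not an actual API.

```python
# Hypothetical helper that imitates a source-register swizzle.

def swizzle(reg, pattern):
    """Return the fields of a 4-field register selected by a pattern like 'xxyz'."""
    index = {"x": 0, "y": 1, "z": 2, "w": 3}
    return [reg[index[c]] for c in pattern]

v = [10.0, 20.0, 30.0, 40.0]
reordered = swizzle(v, "wzyx")   # reverse the field order
broadcast = swizzle(v, "yyyy")   # replicate one field into all four lanes
```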

The present invention can be applied to any graphics processor whose vertex shader supports 4-wide SIMD computation, the MOV, ADD, MUL, MAD, and DP4 instructions, and swizzling.

In the graphics field, linear expressions of affine-transform form such as Equation 1 occur very frequently. The present invention aims to express such a linear expression with the most optimized set of instructions, using the vertex shader provided by a programmable graphics processor.

The linear expression E is a formula that yields a when V is given; M and A are a constant matrix and a constant vector, respectively, and V and a are vectors of variables. E in Equation 1 above can be written in the matrix form of Equation 2 below.

Because SIMD instructions in the vertex shader operate on units of four data items, the linear expression E is partitioned into groups of four, and the remainder is padded with zeros so that the size of E becomes a multiple of 4. This yields the zero-padded matrix form of Equation 2 above, in which the size of the matrix M is 4n × 4n (4n = m + α, 0 ≤ α < 4).
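The padding step can be sketched as follows, assuming M is an m × m matrix stored as a list of rows and A is a list of m numbers. The function name `pad_to_multiple_of_4` is hypothetical; it returns the padded matrix, the padded constant vector, and the number α of added rows and columns.

```python
# Sketch of zero-padding a linear expression so that its size is a multiple of 4.

def pad_to_multiple_of_4(M, A):
    m = len(M)
    size = -(-m // 4) * 4            # smallest multiple of 4 that is >= m
    alpha = size - m                 # 0 <= alpha < 4
    padded_M = [row + [0.0] * alpha for row in M]
    padded_M += [[0.0] * size for _ in range(alpha)]
    padded_A = A + [0.0] * alpha
    return padded_M, padded_A, alpha
```

For a 5 × 5 matrix this gives an 8 × 8 matrix with α = 3, so n = 2 blocks per side.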

Let the 4 × 4 submatrices of the matrix M be denoted M_ij (i, j = 0, 1, 2, …, n-1). Expanding the expression that is actually computed with these 4 × 4 submatrices gives Equation 3 below.

The expanded expression must now be converted into the SIMD instructions of a vertex program. First, each 4 × 4 submatrix M_ij of the full linear expression is multiplied by the corresponding four-dimensional variable vector V. The resulting partial products S_ij are then added to the constant vector A_i to obtain the final result R_i. This is expressed in Equation 4 below.

The computation of the linear expression E is thus rewritten to fit 4-wide SIMD instructions, as in Equation 5 below. Here S_ij, A_i, and R_i are all four-dimensional vectors.
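The block-wise evaluation described above (partial products S_ij summed over j and added to A_i to give R_i) can be sketched in plain Python. Equations 3 to 5 are not reproduced in this text, so the code follows only the prose description; the helper names `block`, `matvec4`, and `evaluate_blocked` are hypothetical, and M is assumed to be already padded to size 4n × 4n.

```python
# Sketch: evaluate a = M V + A block by block, four rows at a time.

def block(M, i, j):
    """Return the 4 x 4 submatrix M_ij."""
    return [row[4 * j:4 * j + 4] for row in M[4 * i:4 * i + 4]]

def matvec4(B, v):
    """Partial product S_ij of a 4 x 4 block and a 4-vector."""
    return [sum(B[r][c] * v[c] for c in range(4)) for r in range(4)]

def evaluate_blocked(M, V, A):
    n = len(M) // 4
    result = []
    for i in range(n):
        R_i = A[4 * i:4 * i + 4]                 # start from the constant part A_i
        for j in range(n):
            S_ij = matvec4(block(M, i, j), V[4 * j:4 * j + 4])
            R_i = [x + y for x, y in zip(R_i, S_ij)]
        result.extend(R_i)
    return result
```

The result agrees with a direct row-by-row evaluation of MV + A; the block form only changes the order in which the products are grouped.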

Equation 5 above is a form rewritten so that it is easy to compute with the 4-wide SIMD instructions of the vertex shader. When the constant matrix M is sparse, the linear expression can be computed with far fewer instructions if the multiplications and additions involving zeros are skipped. The present invention therefore proposes a method of optimizing shader code so as to minimize the number of instructions used to compute a given linear expression.

FIG. 1 is a flowchart of the graphics data processing method according to the present invention. The method consists of a transformation step and a shader coding step. The transformation step transforms the given linear expression into a different, more efficient, equivalent linear expression; the shader coding step computes the transformed linear expression using the minimum number of SIMD instructions.

In other words, the transformation step finds a better form of the linear expression that is equivalent to the given one and requires fewer instructions, and the shader coding (calculation) step finds the most optimized SIMD instructions needed to compute the transformed linear expression found in the transformation step.

First, when the linear expression E and the final target cost are input to the graphics processor (S11), the target cost is set (S12).

Next, the cost of computing the linear expression E is calculated (S13). Here, the cost of computing E means the number of vertex instructions needed to evaluate E.

The process of calculating the cost C(E) of the linear expression E is as follows. First, as in Equation 2 above, the remainder of E is padded with zeros so that the size of E is a multiple of 4, and E is partitioned into 4 × 4 submatrices.

Then the cost C(E) of the linear expression E is calculated using Equation 6 below.

Here, C(E) is the cost of the linear expression. The first term is, for the (i,j)-th submatrix, the smaller of two values: the maximum number of non-zero elements in any one row, and the number of rows that contain a non-zero element. The second term is the number of submatrices in which the maximum number of non-zero elements in a row exceeds the number of rows containing a non-zero element, minus 1.
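The cost described above can be sketched as a small function. Equation 6 itself is not reproduced in this text, so the code is only one reading of the verbal description: it sums, over all 4 × 4 submatrices, the smaller of the two counts, and then adds the number of submatrices whose per-row maximum exceeds their non-zero row count, minus 1. Clamping that second term at zero when there are no such submatrices is an assumption of this sketch, as are the names `block_counts` and `cost`.

```python
# Sketch of the instruction-count cost C(E), following the prose description.

def block_counts(B):
    """Return (max non-zeros in any row, number of rows with a non-zero)."""
    nonzeros_per_row = [sum(1 for x in row if x != 0) for row in B]
    c = max(nonzeros_per_row)
    r = sum(1 for k in nonzeros_per_row if k > 0)
    return c, r

def cost(M):
    """M is a 4n x 4n matrix given as a list of rows."""
    n = len(M) // 4
    product_cost = 0
    row_major_blocks = 0
    for i in range(n):
        for j in range(n):
            B = [row[4 * j:4 * j + 4] for row in M[4 * i:4 * i + 4]]
            c, r = block_counts(B)
            product_cost += min(c, r)
            if c > r:                     # block is better served by dot products
                row_major_blocks += 1
    # Assumption: the "minus 1" term is not allowed to go below zero.
    return product_cost + max(row_major_blocks - 1, 0)
```

Under this reading, a 4 × 4 identity block costs 1 (one non-zero per row), while an all-zero block contributes nothing.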

The reason the cost C(E) of the linear expression E is calculated as in Equation 6 is explained later.

Next, a transformed linear expression E' is obtained from the linear expression E (S14). The number of vertex instructions needed to implement E depends closely on the form of its matrix M and vector A, so transforming the given E to change the form of M and A also changes the cost. To find the linear expression with the lowest cost, the present invention obtains transformed linear expressions E' that are equivalent to E.

As shown in FIG. 2, the transformed linear expression E' is obtained from the linear expression E of Equation 1 by exchanging row i and row j in the matrix M and in the vectors a, V, and A, and exchanging column i and column j in the matrix M.
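The exchange rule can be checked numerically. The sketch below applies the row and column exchange to M, V, and A and confirms that the transformed expression produces the same values as the original, only with entries i and j of the result swapped. The function names `exchange` and `evaluate` are hypothetical, and FIG. 2 is not reproduced here, so the code follows only the sentence above.

```python
# Sketch: exchanging rows/columns i and j yields an equivalent linear expression.

def exchange(M, V, A, i, j):
    M2 = [row[:] for row in M]
    M2[i], M2[j] = M2[j], M2[i]              # exchange rows i and j of M
    for row in M2:
        row[i], row[j] = row[j], row[i]      # exchange columns i and j of M
    V2, A2 = V[:], A[:]
    V2[i], V2[j] = V2[j], V2[i]              # exchange entries i and j of V
    A2[i], A2[j] = A2[j], A2[i]              # exchange entries i and j of A
    return M2, V2, A2

def evaluate(M, V, A):
    """Direct evaluation of a = M V + A."""
    return [sum(m * v for m, v in zip(row, V)) + a for row, a in zip(M, A)]

M = [[1, 2, 0], [0, 3, 4], [5, 0, 6]]
V = [1, 2, 3]
A = [10, 20, 30]
a = evaluate(M, V, A)                         # [15, 38, 53]
a_swapped = evaluate(*exchange(M, V, A, 0, 2))  # [53, 38, 15]
```

The values are unchanged; only their positions move, which is why the exchange is free to rearrange the non-zero pattern of M without altering the result.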

Next, the cost C(E') is calculated by applying Equation 6 to the transformed linear expression E' (S15). The difference ΔC between the cost C(E') obtained in step S15 and the cost C(E) obtained in step S13 is calculated (S16), and p is set to the smaller of 1 and a probability value computed from ΔC (S17). If p is smaller than a random real number between 0 and 1 (S18), the process returns to step S14; otherwise, the transformed linear expression E' is reset as the new linear expression E (S19).

That is, when ΔC is less than 0, the probability value is greater than 1, so p is set to 1; in step S18, p is then never smaller than a random real number between 0 and 1, and the process always proceeds to step S19. When ΔC is greater than 0, the probability value is less than 1 and approaches 0 as ΔC grows, so p may be larger or smaller than the randomly chosen real number. If p is larger than the chosen number, the process proceeds to step S19; otherwise it returns to step S14.

To summarize: if the cost C(E') of the transformed linear expression is lower than the cost C(E) of the linear expression, or the probabilistic test judges that accepting it is likely to lead to a lower cost, E' is set as the new linear expression E. If instead the cost C(E') is likely to end up higher than C(E), a new transformed linear expression E' is generated.
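The acceptance test of steps S17 to S19 can be sketched as below. The formula for the probability value is not reproduced in this text; the sketch assumes the Metropolis form exp(-ΔC / T) used in simulated annealing, which matches the described behavior (greater than 1 for negative ΔC, approaching 0 as ΔC grows). The temperature parameter T, the function names, and passing the random number `u` explicitly are all assumptions of this example.

```python
import math

# Sketch of the acceptance test, assuming a Metropolis-style probability value.

def acceptance_probability(delta_c, temperature):
    """p = min(1, exp(-delta_c / temperature)); the exp form is an assumption."""
    if delta_c <= 0:
        return 1.0                     # the probability value is >= 1, so p = 1
    return math.exp(-delta_c / temperature)

def accept(delta_c, temperature, u):
    """u is a random real number in [0, 1); reject only when p < u."""
    return not (acceptance_probability(delta_c, temperature) < u)
```

An improving move (ΔC < 0) is always accepted, while a worsening move is accepted only when the drawn number `u` does not exceed p.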

When the cost of the newly set linear expression E is less than or equal to the target cost (S20), the target cost is updated (S21). If the number of target cost updates is then greater than or equal to the maximum number of iterations (S22), this linear expression E is taken as the minimum-cost transformed linear expression, and it is computed by applying the optimal vertex instructions to it (S23).

If, on the other hand, the cost of the linear expression E is greater than the target cost in step S20, or the number of target cost updates is smaller than the maximum number of iterations in step S22, the process returns to step S14 and a new transformed linear expression E' is obtained.
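The overall flow of steps S13 to S22 can be outlined as a generic search loop. This is a loose sketch under several assumptions that the text does not specify: the probability value is taken to be exp(-ΔC / T) with a fixed temperature T, the target cost is lowered to one below the current cost at each update, a step cap guards against endless looping, and the best expression seen is returned. The cost and neighbor functions are passed in as parameters; the inversion-count cost and the `swap_two` neighbor used in the demonstration are stand-ins and are not the patent's cost or exchange rule.

```python
import math
import random

def optimize(expr, cost, neighbor, target, max_updates,
             temperature=1.0, max_steps=10_000, seed=0):
    rng = random.Random(seed)
    current, current_cost = expr, cost(expr)             # S13
    best, best_cost = current, current_cost
    updates = 0
    for _ in range(max_steps):                           # step cap: an assumption
        candidate = neighbor(current, rng)               # S14
        candidate_cost = cost(candidate)                 # S15
        delta = candidate_cost - current_cost            # S16
        p = 1.0 if delta <= 0 else math.exp(-delta / temperature)  # S17
        if p < rng.random():                             # S18: rejected, try again
            continue
        current, current_cost = candidate, candidate_cost  # S19
        if current_cost < best_cost:
            best, best_cost = current, current_cost
        if current_cost <= target:                       # S20
            target = current_cost - 1                    # S21: assumed update rule
            updates += 1
            if updates >= max_updates:                   # S22
                break
    return best, best_cost

# Stand-in problem for demonstration only.
def inversions(seq):
    return sum(1 for i in range(len(seq))
               for j in range(i + 1, len(seq)) if seq[i] > seq[j])

def swap_two(seq, rng):
    i, j = rng.sample(range(len(seq)), 2)
    out = list(seq)
    out[i], out[j] = out[j], out[i]
    return out

best, best_cost = optimize([3, 2, 1, 0], inversions, swap_two,
                           target=2, max_updates=3)
```

In an application to the method described here, `cost` would count vertex instructions and `neighbor` would apply a random row and column exchange.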

Next, the reason the cost of the linear expression E is calculated as in Equation 6 is explained, together with the shader coding step.

To code Equation 5 with the minimum number of SIMD instructions, a minimum-cost computation scheme must be defined both for the partial products, which correspond to process 1 of Equation 4, and for the vector sums, which correspond to process 2.

Consider the partial products first. The product S_ij of the 4 × 4 matrix M_ij and a vector V of four variables is written out in Equation 7 below.

That is, computing S_ij requires 16 multiplications and 12 additions in total. If some elements of M_ij are zero, however, the number of operations decreases. In addition, because vertex instructions are SIMD instructions, the total number of instructions decreases when each instruction performs as many operations as possible. The total number of instructions needed for a computation is the cost of that computation.

The present invention proposes a minimum-cost computation scheme that depends on the shape of M_ij, where the shape of M_ij means the pattern in which its non-zero elements are distributed.

Depending on the shape of M_ij, the present invention computes the linear expression by column-major multiplication or by row-major multiplication. FIG. 3 illustrates the column-major multiplication method, and FIG. 4 illustrates the row-major multiplication method.

Referring first to FIG. 3, the column-major multiplication method uses the MAD and MUL instructions to compute S_ij. If the maximum number of non-zero elements in any one row is C_E(i,j), S_ij is computed with one MUL instruction and C_E(i,j) - 1 MAD instructions. The total number of instructions used is therefore C_E(i,j).
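One way the instruction count C_E(i,j) could be reached is sketched below. FIG. 3 is not reproduced in this text, so the packing scheme is an assumption: the k-th non-zero constant of every row is gathered into one constant register, and a per-lane swizzle picks the matching component of V, so that one MUL followed by MAD instructions accumulates the result. The function name `column_major` is hypothetical.

```python
# Sketch of column-major multiplication for one 4 x 4 block B and a 4-vector v.

def column_major(B, v):
    # For each row, list its non-zero constants with their column indices.
    packed = [[(x, c) for c, x in enumerate(row) if x != 0] for row in B]
    count = max(len(p) for p in packed)          # C_E(i,j)
    acc = [0, 0, 0, 0]
    instructions = []
    for k in range(count):
        # Constant register: k-th non-zero of each row (0 where a row has none).
        const = [p[k][0] if k < len(p) else 0 for p in packed]
        # Swizzled variable register: the component each lane needs.
        picked = [v[p[k][1]] if k < len(p) else v[0] for p in packed]
        if k == 0:
            acc = [a * b for a, b in zip(const, picked)]              # MUL
            instructions.append("MUL")
        else:
            acc = [a * b + s for a, b, s in zip(const, picked, acc)]  # MAD
            instructions.append("MAD")
    return acc, instructions
```

For a block whose busiest row holds two non-zero elements, the sketch issues one MUL and one MAD, in line with the count stated above.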

The row-major multiplication method, shown in FIG. 4, computes S_ij using the DP4, DP3, and MUL instructions; it is the usual form of matrix multiplication by inner products. If the number of rows of the submatrix M_ij that contain a non-zero element is r_E(i,j), S_ij is computed with r_E(i,j) DP4 instructions. The total number of instructions used is therefore r_E(i,j).
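The row-major scheme can be sketched in the same style: one dot product per row that contains a non-zero element. For simplicity the sketch always counts a DP4, although the text notes that DP3 or MUL may also be used; the function name `row_major` is hypothetical and FIG. 4 is not reproduced here.

```python
# Sketch of row-major multiplication for one 4 x 4 block B and a 4-vector v.

def row_major(B, v):
    result = [0, 0, 0, 0]
    instructions = []
    for r, row in enumerate(B):
        if any(x != 0 for x in row):
            # One DP4 writes the dot product into component r of the result.
            result[r] = sum(a * b for a, b in zip(row, v))
            instructions.append("DP4")
    return result, instructions
```

A block with two non-zero rows therefore costs two instructions regardless of how many non-zero elements those rows contain.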

Each S_ij can be computed by either of the two methods described above. To reduce cost, S_ij is computed by whichever method requires fewer instructions. That is, submatrices with C_E(i,j) < r_E(i,j) are computed by column-major multiplication, and submatrices with C_E(i,j) > r_E(i,j) are computed by row-major multiplication.

Submatrices with C_E(i,j) = r_E(i,j) are computed by column-major multiplication, for the following reason. In computing the linear expression E, after S_ij has been computed, a further vector sum with A_i must be performed using the ADD instruction; if S_ij and A_i are both zero vectors, this vector sum is unnecessary. With column-major multiplication, when an already computed vector exists, using a MAD instruction in place of the initial MUL instruction performs the vector sum with that vector at the same time, so no separate ADD instruction is needed and the number of instructions is reduced. For this reason, as stated above, submatrices with C_E(i,j) = r_E(i,j) are computed by column-major multiplication.

That is, the cost required to compute any partial product S_ij is the smaller of C_E(i,j) and r_E(i,j), as expressed in Equation 8 below.

For an S_ij computed by row-major multiplication, that is, for a submatrix with C_E(i,j) > r_E(i,j), an additional ADD instruction is needed to add S_ij and A_i. This vector-sum cost applies only to submatrices computed by row-major multiplication, and it is expressed in Equation 9 below.

Here, the count that appears in Equation 9 is the number of S_ij computed by row-major multiplication.

The cost of computing the given linear expression E is therefore the sum of the cost of the partial products and the cost of the vector sums described above, and Equation 6 expresses this sum.

FIG. 5 shows the constant vector of the linear expression before (a) and after (b) the present invention is applied to the fourth-order Runge-Kutta method. The cost before applying the invention is 80, whereas the cost after applying it is 56, and the linear expression can be computed using 120 vertex instructions.

FIG. 6 shows the constant vector of the linear expression before (a) and after (b) the present invention is applied to the Gauss-Seidel method. The cost before applying the invention is 112; after applying it, the cost falls to 72.

FIG. 7 shows the constant vectors of the linear expression before (a) and after (b) applying the present invention to the computation of the two-dimensional wave equation. The cost before applying the present invention is 82, whereas the cost after applying it is reduced to 63.

Although the invention has been described above on the basis of preferred embodiments, these embodiments are intended to illustrate the invention, not to limit it. It will be apparent to those skilled in the art to which the invention pertains that various changes, modifications, or adjustments can be made to the above embodiments without departing from the technical spirit of the invention. Therefore, the scope of protection of the invention is limited only by the appended claims and should be construed as including all such changes, modifications, and adjustments.

According to the present invention, when a linear expression is computed with a graphics vertex shader, the number of instructions can be minimized so that the computation uses optimized SIMD instructions. This increases processing speed and allows limited resources to be used more efficiently.

FIG. 1 is an operation flowchart showing the graphics data processing method according to the present invention;

FIG. 2 is a diagram showing the rule for obtaining a transformed linear expression E';

FIG. 3 is a diagram illustrating the column-first product method;

FIG. 4 is a diagram illustrating the row-first product method;

FIG. 5 is a diagram showing the constant vectors of the linear expression before (a) and after (b) applying the method according to the present invention to the 4th-order Runge-Kutta method;

FIG. 6 is a diagram showing the constant vectors of the linear expression before (a) and after (b) applying the method according to the present invention to the Gauss-Seidel method; and

FIG. 7 is a diagram showing the constant vectors of the linear expression before (a) and after (b) applying the method according to the present invention to the computation of the two-dimensional wave equation.

Claims

delete

A graphics data processing method in which, when an arbitrary linear expression E of the form shown in the equation below is input to a graphics processor, a transformed linear expression that is equivalent to the linear expression E and has the minimum cost is found and the minimum number of vertex instructions is applied to it, the method comprising:

a goal setting step in which the graphics processor sets a target cost for computing the linear expression E;

a matrix transformation step in which the graphics processor obtains an arbitrary transformed linear expression E' equivalent to the linear expression E;

a cost calculation step in which the graphics processor calculates the cost of computing the linear expression E and the cost of computing the transformed linear expression E';

a cost difference calculation step in which the graphics processor calculates the difference between the cost of computing the transformed linear expression E' and the cost of computing the linear expression E;

a linear expression reset step in which the graphics processor resets the transformed linear expression E' as the new linear expression E when the cost difference is less than zero, or when there is a high probability that it will become less than zero as the process continues;

a target achievement step in which the graphics processor repeats the matrix transformation step through the linear expression reset step until the cost of computing the linear expression E reaches the target cost;

and a shader coding step in which the graphics processor applies the minimum number of vertex instructions to the minimum-cost linear expression obtained as a result of the target achievement step.

[Equation]

Here, the linear expression E is a formula for obtaining a when V is given, where M is a constant matrix, A is a constant vector, and V and a are vectors representing variables.
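As a non-limiting illustration, the iterative steps of this claim can be sketched in Python. The claim does not specify how the probability in the reset step is computed, so the sketch substitutes a simulated-annealing style acceptance rule with a cooling temperature; that rule, the parameter defaults, and the use of "less than or equal to" for reaching the target are assumptions. The names `search_to_target`, `cost`, and `transform` are hypothetical, and the cost and transformation functions are supplied by the caller.

```python
import math
import random


def search_to_target(expr, cost, transform, target_cost,
                     max_steps=10_000, temperature=1.0, cooling=0.999,
                     rng=None):
    """Repeat transform / cost / reset until the target cost is reached.

    Returns the expression held when the loop ends and its cost.
    """
    rng = rng or random.Random(0)
    current, current_cost = expr, cost(expr)
    for _ in range(max_steps):
        if current_cost <= target_cost:         # target achievement
            break
        candidate = transform(current, rng)     # matrix transformation step
        candidate_cost = cost(candidate)        # cost calculation step
        delta = candidate_cost - current_cost   # cost difference step
        # Reset step: always accept an improvement. Otherwise accept with a
        # probability that shrinks as the temperature cools (assumed rule).
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
        temperature = max(temperature * cooling, 1e-9)
    return current, current_cost
```

Passing the cost and transformation as functions keeps the loop independent of how the linear expression is represented; a caller might, for example, pass a function that exchanges two randomly chosen indices as `transform`.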

The method of claim 3, wherein

the goal setting step through the target achievement step are repeated while the target cost is updated, until the number of target cost updates reaches a predetermined value.

A cost calculation step in which the graphics processor calculates the cost of computing the linear expression E;

A matrix transformation step in which the graphics processor obtains all transformed linear expressions E' of the linear expression E;

A transformed-expression cost calculation step in which the graphics processor calculates the cost of computing each of the transformed linear expressions E';

A search step in which the graphics processor compares the cost of computing the linear expression E with the costs of computing all of the transformed linear expressions E' and finds the transformed linear expression having the minimum cost;

And a shader coding step in which the graphics processor applies the minimum number of vertex instructions to the minimum-cost linear expression obtained as a result of the search step.

[Equation]
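As a non-limiting illustration, the exhaustive variant of this claim can be sketched in Python. The sketch assumes that "all transformed linear expressions" means every simultaneous reordering of the rows and columns of a square matrix M (with A reordered the same way), which is what repeated index exchanges can reach. The names `permute_expression` and `exhaustive_search` are hypothetical, and the cost function is supplied by the caller.

```python
from itertools import permutations


def permute_expression(M, A, perm):
    """Apply one reordering to the rows and columns of M and to A."""
    M2 = [[M[pi][pj] for pj in perm] for pi in perm]
    A2 = [A[pi] for pi in perm]
    return M2, A2


def exhaustive_search(M, A, cost):
    """Return (cost, M', A', perm) for the cheapest reordering found."""
    # Start from the untransformed expression E.
    best = (cost(M), M, A, tuple(range(len(M))))
    for perm in permutations(range(len(M))):
        M2, A2 = permute_expression(M, A, perm)
        c = cost(M2)
        if c < best[0]:  # keep the first reordering with a strictly lower cost
            best = (c, M2, A2, perm)
    return best
```

The number of reorderings grows factorially with the size of M, so a search of this kind is practical only for small expressions.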

The method according to claim 3 or 5,

wherein the cost calculation step divides the linear expression into 4 × 4 submatrices and calculates the cost by the following equation.

[Equation]

Here, C(E) is the cost of the linear expression E. For the (i, j)-th submatrix, the per-submatrix cost is the smaller of the maximum number of nonzero elements in a row and the number of rows having nonzero elements. The count term is the number of submatrices in which the maximum number of nonzero elements in a row is greater than the number of rows having nonzero elements (a value obtained by subtracting 1 from …).

The method of claim 6,

wherein the cost calculation step fills the remaining portion with '0' so that the size of the linear expression is a multiple of four, and then divides it into 4 × 4 submatrices.
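As a non-limiting illustration, the zero filling and division into 4 × 4 submatrices can be sketched in Python. The matrix is assumed to be given as a list of equal-length rows, and the names `pad_to_multiple_of_4` and `split_into_4x4` are hypothetical.

```python
def pad_to_multiple_of_4(M):
    """Return a copy of M padded with zeros to dimensions divisible by 4."""
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    rows = (n_rows + 3) // 4 * 4  # round up to the next multiple of 4
    cols = (n_cols + 3) // 4 * 4
    padded = [list(row) + [0] * (cols - n_cols) for row in M]
    padded += [[0] * cols for _ in range(rows - n_rows)]
    return padded


def split_into_4x4(M):
    """Map each block index (i, j) to its 4x4 submatrix (list of 4 rows)."""
    padded = pad_to_multiple_of_4(M)
    n_cols = len(padded[0]) if padded else 0
    blocks = {}
    for bi in range(0, len(padded), 4):
        for bj in range(0, n_cols, 4):
            blocks[(bi // 4, bj // 4)] = [row[bj:bj + 4]
                                          for row in padded[bi:bi + 4]]
    return blocks
```

For example, a 5 × 5 matrix is padded to 8 × 8 and yields four submatrices, three of which consist mostly of filler zeros.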

The method according to claim 3 or 5,

wherein the matrix transformation step exchanges rows i and j in the matrix M and in the vectors a, V, and A, and exchanges columns i and j in the matrix M.
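As a non-limiting illustration, the following Python sketch shows why such an exchange yields an equivalent expression. It assumes the linear expression has the form a = M V + A with a square matrix M; the equation itself is not reproduced in this text, so that form is an assumption based on the symbol definitions. The names `matvec_plus` and `swap_indices` are hypothetical.

```python
def matvec_plus(M, V, A):
    """Evaluate a = M V + A for list-based M, V, and A (assumed form of E)."""
    return [sum(m * v for m, v in zip(row, V)) + a0
            for row, a0 in zip(M, A)]


def swap_indices(M, V, A, i, j):
    """Return copies of M, V, and A with indices i and j exchanged."""
    M2 = [list(row) for row in M]
    M2[i], M2[j] = M2[j], M2[i]          # exchange rows i and j of M
    for row in M2:
        row[i], row[j] = row[j], row[i]  # exchange columns i and j of M
    V2, A2 = list(V), list(A)
    V2[i], V2[j] = V2[j], V2[i]          # exchange entries i and j of V
    A2[i], A2[j] = A2[j], A2[i]          # exchange entries i and j of A
    return M2, V2, A2
```

Evaluating the exchanged expression gives the original result vector a with entries i and j exchanged, so no information is lost; only the placement of the nonzero constants within the 4 × 4 submatrices changes.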

The method according to claim 3 or 5,

wherein the shader coding step

divides the minimum-cost linear expression into 4 × 4 submatrices,

and, for each submatrix, codes the shader by the column-first product method when the number of columns having nonzero elements is less than or equal to the number of rows having nonzero elements, and codes the shader by the row-first product method when the number of columns having nonzero elements is greater than the number of rows having nonzero elements.
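As a non-limiting illustration, the per-submatrix choice in this claim can be sketched in Python. The sketch follows the wording of the claim as translated here, comparing the number of columns having nonzero elements with the number of rows having nonzero elements. The function name `choose_product_method` and the returned labels are hypothetical.

```python
def choose_product_method(block):
    """Pick a product method for one 4x4 block given as a list of 4 rows."""
    # Columns that contain at least one nonzero entry
    nonzero_cols = sum(1 for j in range(4)
                       if any(row[j] != 0 for row in block))
    # Rows that contain at least one nonzero entry
    nonzero_rows = sum(1 for row in block if any(x != 0 for x in row))
    if nonzero_cols <= nonzero_rows:
        return "column-first"
    return "row-first"
```

Under this rule a block whose nonzero entries fill a single row is coded by the row-first product method, while a block whose nonzero entries fill a single column is coded by the column-first product method.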

A computer-readable recording medium on which is recorded a program for executing a graphics data processing method in which, when an arbitrary linear expression E of the form shown in the equation below is input to a graphics processor, a transformed linear expression that is equivalent to the linear expression E and has the minimum cost is found and the minimum number of vertex instructions is applied to it,

the graphics data processing method comprising:

and a shader coding step in which the graphics processor applies the minimum number of vertex instructions to the minimum-cost linear expression obtained as a result of the target achievement step.

[Equation]

A linear expression transformation method for finding, when an arbitrary linear expression E is input to a graphics processor, a transformed linear expression that is equivalent to the linear expression E and has the minimum cost, the method comprising:

and a target achievement step in which the matrix transformation step through the linear expression reset step are repeated until the cost of computing the linear expression E reaches the target cost.

[Equation]

The method of claim 11,

wherein the goal setting step through the target achievement step are repeated while the target cost is updated, until the number of target cost updates reaches a predetermined value.

The method of claim 11,

wherein the cost calculation step divides the linear expression into 4 × 4 submatrices and calculates the cost by the following equation.

[Equation]

The method of claim 13,

wherein the cost calculation step fills the remaining portion with '0' so that the size of the linear expression is a multiple of four, and then divides it into 4 × 4 submatrices.

The method of claim 11,

wherein the matrix transformation step exchanges rows i and j in the matrix M and in the vectors a, V, and A, and exchanges columns i and j in the matrix M.

A computer-readable recording medium on which is recorded a program for executing a linear expression transformation method that, when an arbitrary linear expression E of the form shown in the equation below is input to a graphics processor, finds a transformed linear expression that is equivalent to the linear expression E and has the minimum cost,

the linear expression transformation method comprising:

and a target achievement step in which the matrix transformation step through the linear expression reset step are repeated until the cost of computing the linear expression E reaches the target cost.

[Equation]