KR20080096306A

KR20080096306A - Compiling method and system for a rule-based optimal placement of scaling shifts

Info

Publication number: KR20080096306A
Application number: KR1020070041603A
Authority: KR
Inventors: 백윤흥; 박상현; 조두산; 김태송
Original assignee: 재단법인서울대학교산학협력재단
Priority date: 2007-04-27
Filing date: 2007-04-27
Publication date: 2008-10-30

Abstract

A compiling method and system for the optimal placement of scaling shifts are provided, which generate code efficiently by removing the adverse effects caused by inserting scaling shift operations. A compiler (108) is divided into a generation means (210) and a back end (220). The generation means analyzes the input source code (109) and produces intermediate code, which is an internal representation, and the back end produces object code (110) from the intermediate code. The generation means comprises a lexical analysis means (211), a parsing means (212), a semantic analysis means (213), and an intermediate code generation means (214). The lexical analysis means separates the necessary tokens from the source code. The parsing means arranges the tokens according to a given grammar. The semantic analysis means assigns meaning to the syntactic structure found by the parsing means. The intermediate code generation means generates intermediate code from the source code that has passed through the semantic analysis means. The back end includes a transforming means (221) and a translating means (222). The transforming means transforms the intermediate code according to rewrite rules, and the translating means generates object code by selecting appropriate target instructions and allocating registers to them.

Description

Compiling Method And System For A Rule-based Optimal Placement Of Scaling Shifts

FIG. 1 is a structural diagram of a hardware environment that may be used to implement a preferred embodiment of the present invention.

FIG. 2 is a block diagram of a compiler according to a preferred embodiment of the present invention.

FIG. 3 shows an embodiment of a directed acyclic graph according to the present invention.

FIG. 4 shows examples of various rewrite rules that modify a pattern of a directed acyclic graph.

FIG. 5 shows an embodiment of rewrite rules according to the present invention that modify a pattern of a directed acyclic graph.

FIG. 6 shows an embodiment of rewrite rules according to the present invention that modify a pattern of a directed acyclic graph.

FIG. 7 is a graph showing, as a percentage, the reduction in execution time in an embodiment according to the present invention.

FIG. 8 is a graph showing, as a percentage, the reduction in code size in an embodiment according to the present invention.

The present invention relates to a compiling method and system that find the optimal placement of scaling shifts based on rules. More specifically, it relates to a compiling method and system that transform intermediate code, generated from a series of programming language statements containing fixed-point source code, according to predetermined rewrite rules that modify patterns of a directed acyclic graph containing scaling shift operation nodes.

For a program written in a high-level language to run on a processor, it must be translated into a language that the processor can understand directly. A program that performs this translation is called a compiler. For example, when the source language is a high-level language such as Pascal or COBOL and the target language is assembly or machine language, the program that translates between them is a compiler.

The program given as input to compilation is called the source program, and the language in which it is written is called the source language. The translated output is called the object program, and the language in which it is written is called the object language or target language.

Computers represent numbers in either floating-point or fixed-point form. The floating-point form represents a real number as an exponent part and a mantissa part.

Most floating-point processors store floating-point values in memory using an IEEE standard format. For example, the 64-bit IEEE 754 format stores the sign in the most significant bit (bit 63), the exponent in the next 11 bits (bit 62 to bit 52), and the significant digits (mantissa) in the remaining bits (bit 51 to bit 0).
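The bit layout described above can be checked with a short sketch. The helper name `decode_double` is an assumption introduced here for illustration and is not part of the patent; it only uses Python's standard `struct` module to reinterpret a double as a 64-bit integer.

```python
import struct


def decode_double(value):
    """Split a 64-bit IEEE 754 double into its (sign, exponent, mantissa) bit fields."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63                    # bit 63
    exponent = (bits >> 52) & 0x7FF      # bits 62..52 (11 bits, stored with a bias of 1023)
    mantissa = bits & ((1 << 52) - 1)    # bits 51..0
    return sign, exponent, mantissa


print(decode_double(1.0))
print(decode_double(-2.0))
```

The exponent field is returned in its stored (biased) form, so a value of 1023 corresponds to a true exponent of zero.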

However, although floating-point calculations are essential in most programs and occur in large numbers, the way a processor stores real numbers is far more complex than the way it stores integers, so floating-point operations are very slow.

One method devised to overcome this drawback and process such calculations faster is fixed-point arithmetic. Fixed-point arithmetic is a numerical technique that stores the integer part and the fractional part of a real number in an integer type, so that real-number operations are carried out with integer variables.
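As a minimal sketch of this idea, the following stores real numbers as integers with an implied binary point. The choice of 15 fractional bits and the names `FRACTION_BITS`, `to_fixed`, `to_float`, and `fixed_mul` are assumptions made for illustration; the patent does not prescribe a particular format.

```python
FRACTION_BITS = 15  # assumed format: 15 bits after the binary point


def to_fixed(x):
    """Convert a real number to its integer fixed-point representation."""
    return int(round(x * (1 << FRACTION_BITS)))


def to_float(q):
    """Convert a fixed-point integer back to a real number."""
    return q / (1 << FRACTION_BITS)


def fixed_mul(a, b):
    """Multiply two fixed-point values using integer operations only."""
    # The raw product carries 2 * FRACTION_BITS fractional bits,
    # so a right shift restores the original scale.
    return (a * b) >> FRACTION_BITS


half = to_fixed(0.5)
quarter = to_fixed(0.25)
print(to_float(fixed_mul(half, quarter)))
```

The right shift inside `fixed_mul` is an example of the kind of scaling shift that the rest of the description is concerned with.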

In general, fixed-point processors are less expensive than floating-point processors. High-volume, low-cost DSP systems, which prioritize low power consumption and low price, therefore mainly use fixed-point processors.

However, the dynamic range and precision of a fixed-point processor are often very limited. Programmers have spent a great deal of time keeping numeric accuracy and performance at acceptable levels within this limited dynamic range and precision, and as a result programming fixed-point processors has always been painful work.

Programmers therefore typically first adopt a floating-point processor to verify their design and algorithm, and later implement the verified algorithm on a fixed-point processor by converting the floating-point data types into equivalent fixed-point data types.

As the first step of floating-point to fixed-point conversion (FFC), programmers must determine the dynamic range and precision required for each variable in the code.

Based on the dynamic range and precision found in this way, programmers must insert scaling shift operations into the code. An essential part of the conversion is deciding where to insert these scaling shift operations.

The placement of the scaling shift operations strongly affects two important factors, the signal-to-quantization-noise ratio (SQNR) and overflow, and these two factors determine the numeric accuracy of the resulting fixed-point code.
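The trade-off between quantization error and overflow can be illustrated with a small sketch. The 8-bit word limit `WORD_MAX` and the function names are assumptions chosen to keep the numbers small; they do not come from the patent.

```python
WORD_MAX = 127  # assumed signed 8-bit word, used only for illustration


def shift_early(a, b, n):
    """Scale the operand before multiplying: smaller intermediate, more quantization error."""
    return (a >> n) * b


def shift_late(a, b, n):
    """Scale after multiplying: more accurate, but the full product must fit in the word."""
    return (a * b) >> n


def overflows(value):
    """Report whether a value falls outside the assumed signed 8-bit range."""
    return value > WORD_MAX or value < -WORD_MAX - 1


a, b, n = 39, 5, 2           # exact result of a * b / 2**n is 48.75
print(shift_early(a, b, n))  # discards the low bits of a before the multiply
print(shift_late(a, b, n))   # keeps them, but the product 195 exceeds WORD_MAX
print(overflows(a * b), overflows((a >> n) * b))
```

With these inputs the late shift is closer to the exact value but needs an intermediate that overflows the assumed word, while the early shift stays in range at the cost of accuracy.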

Therefore, during the conversion, programmers must rigorously perform static analysis or simulation to measure, for every variable, the run-time value range that is used to obtain the variable's exact dynamic range and precision.

Performing the entire conversion by hand is very time-consuming and error-prone. According to empirical studies, this manual process accounts for about one third of the total implementation time.

To relieve programmers of this burden, many researchers have developed tools, such as Autoscaler and FRIDGE, that efficiently automate the floating-point to fixed-point conversion process.

However, none of these tools consider the harmful effects that the scaling shift operations newly added to the fixed-point code have on the compiler's code generation.

Because of these harmful effects, existing compilers fail to generate, from the source code, composite instructions such as multiply-add (MAC) or dot operations, which are very useful on DSP processors.

To solve the above problems, an object of the present invention is to provide a compiler and a compiling method that generate efficient code by removing the adverse effects caused by inserting scaling shift operations.

In particular, an object is to improve compiler performance by modifying patterns of the directed acyclic graph (DAG) in intermediate code, which can be represented as a DAG containing scaling shift operation nodes, so that the compiler can generate DSP-specific composite instructions.

To achieve the above object, the present invention provides a compiling method comprising: a generation step of generating, from a series of programming language statements containing fixed-point source code with scaling shift operations, intermediate code that can be represented as a directed acyclic graph (DAG) containing nodes; a transformation step of transforming the intermediate code according to predetermined rewrite rules that use algebraic transformation to modify patterns of the DAG containing scaling shift operation nodes and nodes of other algebraic operations; and a translation step of translating the transformed intermediate code into object code.

First, a compiler according to a preferred embodiment of the present invention is described with reference to FIGS. 1 and 2.

FIG. 1 is a structural diagram of a hardware environment that may be used to implement a preferred embodiment of the present invention. In a typical hardware environment, a computer 100 may include a processor 101, a memory 102, a data storage device 103, a data communication device 104, an input device 105, a display device 106, and the like.

The data storage device 103 may include a hard disk, a floppy disk, a CD-ROM disk drive, a USB memory, and the like, and the data communication device 104 may include a modem, a network interface, and the like. The input device 105 may include a keyboard, a mouse, and the like, and the display device 106 may include a CRT or LCD monitor and the like.

The memory 102 is a storage means within the computer 100 and may store an operating system 107, a compiler 108, source code 109, object code 110, rewrite rules 111, and the like.

The computer 100 operates under the control of an operating system 107 such as WINDOWS™, MACINTOSH™, or UNIX™. The operating system 107 is booted into the memory 102 of the computer 100 for execution when the computer 100 is powered on or reset. The operating system 107 then controls the execution of one or more computer programs, such as the compiler 108.

The compiler 108 analyzes source code 109 containing one or more programming statements. The source code 109 is typically stored as a text file on the data storage device 103 or entered by a programmer through the input device 105. The compiler 108 generates the object code 110 from the source code 109 using the rewrite rules 111.

In the present invention, the source code 109 is preferably fixed-point code written in a high-level language and converted from floating-point code by a tool such as Autoscaler or FRIDGE, but it is not necessarily limited to this.

FIG. 2 is a block diagram of a compiler according to a preferred embodiment of the present invention.

The compiler 108 is broadly divided into a generation means 210 and a back end 220. When the source code 109 is input to the compiler 108, the generation means 210 analyzes the source code 109 and generates intermediate code, which is an internal representation, and the back end 220 generates the object code 110 from the intermediate code.

The generation means 210 is the front end and includes a lexical analysis means 211, a parsing means 212, a semantic analysis means 213, and an intermediate code generation means 214.

The lexical analysis means 211 separates the necessary tokens from the source code 109. A token is a unit of meaning that has been tagged.

For example, given the statement "x=a+b;", the lexical analysis means 211 can split it into six tokens with arbitrarily chosen token names: ID("x"), OP(ASSIGN), ID("a"), OP(PLUS), ID("b"), and SEMICOLON.
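A tokenizer that reproduces this six-token split might look like the sketch below. It is an illustration only: the table `TOKEN_SPEC` and the function `tokenize` are hypothetical names, and the sketch handles just the four symbol kinds that appear in the example statement.

```python
import re

# Token patterns for the example statement only (assumed, not from the patent).
TOKEN_SPEC = [
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split a source string into (kind, value) tokens."""
    tokens = []
    # Note: finditer silently skips characters that match no pattern.
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue                                 # whitespace carries no meaning here
        if kind == "ID":
            tokens.append(("ID", match.group()))     # identifiers keep their spelling
        elif kind in ("ASSIGN", "PLUS"):
            tokens.append(("OP", kind))              # operators are tagged by name
        else:
            tokens.append((kind, None))
    return tokens


print(tokenize("x=a+b;"))
```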

The parsing means 212 arranges the tokens separated by the lexical analysis means 211 according to a given grammar. That is, the parsing means 212 interprets the source code 109 that has passed through the lexical analysis means 211 and places the tokens in their proper positions.

The semantic analysis means 213 assigns actual meaning to the syntactic structure found by the parsing means 212. For example, in "int x; float y; float z=x+y;", y and z are variables of type float, but x is a variable of type int. This information cannot be obtained by parsing, because in the parsed state x, y, and z are all treated simply as variables or symbols. The semantic analysis means 213 processes this information.
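One way to picture this step is a symbol table that records each declared type and is consulted when an expression is analyzed. The sketch below assumes C-like promotion of int to float in mixed additions; the function `annotate_add` and the dictionary layout are hypothetical and serve only to illustrate the idea.

```python
def annotate_add(symbols, left, right):
    """Return (result_type, operands_to_convert) for the expression left + right."""
    left_type, right_type = symbols[left], symbols[right]
    # Assumed rule: a mixed int/float addition is carried out in float.
    result_type = "float" if "float" in (left_type, right_type) else "int"
    # Operands whose declared type differs from the result type need a conversion.
    to_convert = [name for name in (left, right) if symbols[name] != result_type]
    return result_type, to_convert


# Symbol table built from the declarations "int x; float y; float z".
symbols = {"x": "int", "y": "float", "z": "float"}
print(annotate_add(symbols, "x", "y"))
```

The parser alone sees only three identifiers; it is the symbol table lookup that reveals that x must be converted before the addition.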

The intermediate code generation means 214 generates intermediate code from the source code 109 that has passed through the semantic analysis means 213. Intermediate code means code in a form similar to assembly language, or compiled information that the compiler holds internally.

In a compiler, intermediate code can generally be represented as a tree or a directed acyclic graph (DAG) containing nodes, where each node represents one instruction.

The back end 220 includes a transforming means 221 and a translating means 222.

The transforming means 221 analyzes the intermediate code generated by the generation means 210 and performs code optimization. Optimization is the process of applying program transformations to the intermediate code so that the object code 110 runs faster on the target machine and occupies less storage.

The transforming means 221 of the present invention transforms the intermediate code generated by the generation means 210 according to the rewrite rules stored in the memory 102. A rewrite rule is a predetermined rule that uses algebraic transformation to modify a pattern of the directed acyclic graph containing a scaling shift operation node in the intermediate code.

In a compiler, the translating means 222 generally selects appropriate target instructions for the intermediate code optimized by the transforming means 221 and then allocates registers to generate the object code 110.

In intermediate code, one node represents one instruction, but a real target machine can often perform two operations with a single composite instruction. The translating means 222 therefore generates, according to the type of target machine, efficient object code 110 that includes composite instructions from the intermediate code.
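The following sketch illustrates why node adjacency matters for composite instructions. It is a toy instruction selector under stated assumptions: expressions are encoded as nested tuples, the emitted mnemonics are generic pseudo-instructions rather than the ZSP400 instruction set, and the function `select` is a hypothetical name.

```python
def select(node, out):
    """Emit pseudo-instructions for a tuple-encoded expression and return the result operand."""
    if not isinstance(node, tuple):
        return str(node)                       # leaf: a register name or a constant
    op = node[0]
    # Composite pattern: an add whose second operand is directly a mul.
    if op == "add" and isinstance(node[2], tuple) and node[2][0] == "mul":
        acc = select(node[1], out)
        a = select(node[2][1], out)
        b = select(node[2][2], out)
        out.append(f"mac {acc}, {a}, {b}")     # one instruction covers two nodes
        return acc
    left = select(node[1], out)
    right = select(node[2], out)
    out.append(f"{op} {left}, {right}")        # result overwrites the left operand
    return left


adjacent = []
select(("add", "acc", ("mul", "a", "b")), adjacent)

separated = []
select(("add", "acc", ("shr", ("mul", "a", "b"), 1)), separated)

print(adjacent)
print(separated)
```

When a shift node sits between the multiplication and the addition, the composite pattern no longer matches and three separate instructions are emitted instead of one.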

The present invention is characterized in that it uses algebraic transformation to restructure the directed acyclic graph of the intermediate code produced by a conventional compiler into a more favorable form, so that the compiler 108 generates optimized code.

In the present invention, a set of rewrite rules 111 is defined, and the patterns of the directed acyclic graph are transformed automatically according to these rules so that the compiler 108 can generate, from the intermediate code, object code that is functionally equivalent to the conventional object code but more favorable.

The following examines what happens because of the scaling shift operations inserted into fixed-point code converted from floating-point code, and describes the features of the present invention in that context.

[Table 1] is part of C code, written in ordinary floating-point form, that implements an IIR (Infinite Impulse Response) filter, a well-known DSP filter.

w(n) = x(n) - ai1*w(n-1) - ai2*w(n-2)
y(n) = bi0*w(n) + bi1*w(n-1) + bi2*w(n-2)

[Table 2] is the fixed-point C code converted from the code of [Table 1] by Autoscaler. [Table 2] shows that the conversion inserts many scaling shift operations.

w(n) = (x(n) - multf(ai1,w(n-1))>>1 - multf(ai2,w(n-2))>>2)<<3
y(n) = (multf(bi0,w(n))>>1 + multf(bi1,w(n-1))>>2 + multf(bi2,w(n-2)))<<1

[Table 3] below is assembly code for the ZSP400 processor generated directly from the code of [Table 2] by a conventional compiler. The assembly code in [Table 3] shows that a conventional compiler fails to exploit several patterns in the code of [Table 2] that could easily be translated into the DSP's special composite instructions.

mul.a r4, r6; shra r0, 1; sub r2, r0; mul.a r5, r7; shra r0, 2; sub r2, r0; shla r2, 3; mul.a r2, r8; shra r0, 1; mul.b r6, r9; shra r2, 2; add r0, r2; shra r0, 1; mac.a r7, r10; shla r0, 1;

FIG. 3 shows an embodiment of a directed acyclic graph according to the present invention. FIG. 3(a) is part of the directed acyclic graph 310 representing the intermediate code that a conventional compiler generates from [Table 3].

FIG. 3(b) is a directed acyclic graph 320 whose pattern has been modified from the graph 310 of FIG. 3(a) by the present invention but which is functionally equivalent to the graph 310. In FIG. 3(b), the node 321 of the multiplication operation "*" and the node 323 of the subtraction operation "-" become adjacent as a result of applying the rewrite rules of the present invention, so the compiler can generate the composite instruction "mac" from these two nodes.

[Table 4] is the object code generated from the intermediate code represented by the directed acyclic graph of FIG. 3(b). The object code in [Table 4], generated from intermediate code transformed by the present invention, is 20% smaller in code size than [Table 3], which was produced by the conventional approach.

shra r4, 1; nmac.b r4, r6; shrla r2, 2; nmac.b r5, r7; shrla r2, 1; mul.a r2, r8; shra r6, 1; mac.a r6, r9; shra r0, 2; mac.a r7, r10; shla r0, 1;

FIG. 4 shows examples of various rewrite rules that modify a pattern of a directed acyclic graph, and FIG. 5 shows rewrite rules according to the present invention that modify a pattern of a directed acyclic graph. The rewrite rules based on algebraic transformation are described below.

Algebraic transformation has been used in many areas, such as compiler optimization and high-level system design. Given an arbitrary directed acyclic graph, finding the optimal transformation of the graph under given constraints is well known to be a very hard problem.

The present invention therefore applies predetermined rewrite rules to various subgraphs of the directed acyclic graph to form an optimized pattern step by step.

As shown in FIG. 4, several different rewrite rules may exist for a single source pattern. The object code produced by applying these different rules performs the same function as the original object code, but the rules differ in their effects on the signal-to-quantization-noise ratio and overflow in the output code.

Accordingly, a preferred embodiment of the present invention provides rewrite rules that exclude rules with undesirable effects, by accurately predicting the effect of each transformation of the intermediate code.

FIGS. 5 and 6 show rewrite rules according to the present invention, each of which includes handling of a scaling shift operation node.

For a given directed acyclic graph, the number of rewrite rules grows exponentially with the size of the pattern in each rule. The rewrite rules according to a preferred embodiment of the present invention are therefore restricted, as shown in FIG. 5, to contain at most two operator nodes around the central scaling shift operation node, because composite instructions are generally generated from at most three adjacent nodes of the directed acyclic graph representing the intermediate code.

As shown in FIG. 5, the present invention classifies the algebraic operators into three classes: "+" for addition, "x" for multiplication, and "<<" or ">>" for scaling shifts. In FIG. 5, the symbol "⊙" denotes an algebraic operator arbitrarily chosen from addition "+" and multiplication "x".

FIG. 5 classifies the patterns to which the rewriting method applies into three cases according to the relative positions of these operators. The first case, FIG. 5(1), applies when two scaling shift operation nodes are adjacent to each other (510).

For the rewrite rule of FIG. 5(1), Equation 1 shows that the intermediate code before and after applying the rule performs the same function, and such adjacent scaling shift operation nodes can be merged safely without harmful effects on the signal-to-quantization-noise ratio or overflow.

Likewise, the expression B=(A<<nx)>>ny can be simplified to B=A<<(nx-ny) according to the rewrite rule of FIG. 5(1).
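The simplification of B=(A<<nx)>>ny can be sketched as a rewrite on a tuple-encoded graph node. This is an illustration under assumptions, not the patent's implementation: the tuple encoding and the names `merge_shl_shr` and `evaluate` are hypothetical, a negative net shift is rendered here as a right shift, and Python's unbounded integers are used, so word-size overflow of the intermediate A<<nx is not modeled.

```python
def merge_shl_shr(node):
    """Rewrite ("shr", ("shl", A, nx), ny) into a single net shift of A."""
    if (isinstance(node, tuple) and node[0] == "shr"
            and isinstance(node[1], tuple) and node[1][0] == "shl"):
        _, (_, operand, nx), ny = node
        net = nx - ny
        if net > 0:
            return ("shl", operand, net)
        if net < 0:
            return ("shr", operand, -net)   # assumed handling when ny exceeds nx
        return operand                      # the two shifts cancel out
    return node                             # pattern does not match; leave unchanged


def evaluate(node, env):
    """Evaluate a tuple-encoded shift expression with variable values taken from env."""
    if isinstance(node, str):
        return env[node]
    op, operand, amount = node
    value = evaluate(operand, env)
    return value << amount if op == "shl" else value >> amount


before = ("shr", ("shl", "A", 3), 1)
after = merge_shl_shr(before)
print(after, evaluate(before, {"A": 5}), evaluate(after, {"A": 5}))
```

The rewritten node uses one shift instead of two and evaluates to the same value as the original for every integer input in this unbounded-integer model.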

The second case is shown in FIG. 5 (2.1) to (2.4). It applies when a scaling shift operation node is adjacent to the node of a multiplication operator "x" and lies between the "x" operator node and a "⊙" operator node.

If the processor has a composite instruction made up of a multiplication "x" and a "⊙" operation, the four rewrite rules of the second case move the scaling shift operation node elsewhere so that the "x" node and the "⊙" node become adjacent, allowing the compiler to generate the composite instruction.
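As a rough illustration of what moving a shift out from between a multiplication and an addition can look like, the sketch below uses one simple algebraic identity. It is an assumption chosen for illustration and is not claimed to be any of the rules (2.1) to (2.4), whose equations appear only in the figures; the function names are hypothetical, and Python's unbounded integers hide the overflow risk that the extra left shift of c would carry on a fixed-width machine.

```python
def shift_between(a, b, c, n):
    """Original shape: the shift sits between the multiply and the add."""
    return ((a * b) >> n) + c


def shift_moved(a, b, c, n):
    """Rewritten shape: the multiply feeds the add directly, and the shift comes last."""
    # c is pre-scaled by n bits so that the final right shift restores it exactly.
    return ((a * b) + (c << n)) >> n


print(shift_between(7, 5, 3, 2), shift_moved(7, 5, 3, 2))
```

In the rewritten form the multiplication and the addition are adjacent, which is the shape a multiply-add instruction pattern can match, while the pre-scaling of c shows why such a move can change the overflow behavior.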

Equation 2 proves that the rewrite rule of FIG. 5 (2.1) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the multiplication operation on P, Q, and the result of applying a scaling shift operation to A (521). In this case, the signal-to-quantization noise ratio does not change.

Equation 3 proves that the rewrite rule of FIG. 5 (2.2) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the multiplication operation on A, P, and Q (522). In this case, the signal-to-quantization noise ratio increases.

Equation 4 proves that the rewrite rule of FIG. 5 (2.3) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the multiplication operation on A, C, and the result of applying a scaling shift operation to B (523). In this case, the signal-to-quantization noise ratio decreases.

Equation 5 proves that the rewrite rule of FIG. 5 (2.4) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the multiplication operation on A, P, and Q (524). In this case, the signal-to-quantization noise ratio decreases.

The third case is shown in (3.1) and (3.2) of FIG. 5 and in (3.3) to (3.8) of FIG. 6. It applies when the node of a scaling shift operation is adjacent to the node of the addition operator "+" and lies between the node of the "+" operator and the node of the "⊙" operator.

If the processor has a compound instruction made up of an addition "+" operation and a "⊙" operation, the eight rewrite rules of the third case move the scaling shift node elsewhere so that the "+" node and the "⊙" node become adjacent, which lets the compiler generate the compound instruction.

Equation 6 proves that the rewrite rule of FIG. 5 (3.1) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the addition operation on P, Q, and the result of applying a scaling shift operation to A (531). In this case, the signal-to-quantization noise ratio decreases.

Equation 7 proves that the rewrite rule of FIG. 5 (3.2) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the addition operation on P, Q, and the result of applying a scaling shift operation to A (532). In this case, the signal-to-quantization noise ratio does not change.

Equation 8 proves that the rewrite rule of FIG. 6 (3.3) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the addition operation on A, P, and Q (633). In this case, the signal-to-quantization noise ratio does not change.

Equation 9 proves that the rewrite rule of FIG. 6 (3.4) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the "⊙" operation and the addition operation on P, Q, and the result of applying a scaling shift operation to A (634). In this case, the signal-to-quantization noise ratio decreases.

Equation 10 proves that the rewrite rule of FIG. 6 (3.5) is functionally equivalent. From the intermediate code transformed by the rewrite rule of FIG. 6 (3.5), the translating means 222 can generate a compound instruction from the "⊙" operation and the addition operation on P, the result of applying a scaling shift operation to A, and the result of applying a scaling shift operation to B (635). In this case, the signal-to-quantization noise ratio decreases.

Equation 11 proves that the rewrite rule of FIG. 6 (3.6) is functionally equivalent. From the intermediate code transformed by the rewrite rule of FIG. 6 (3.6), the translating means 222 can generate a compound instruction from the multiplication operation and the addition operation on D, E, F, and G (636). In this case, the signal-to-quantization noise ratio increases.

Equation 12 proves that the rewrite rule of FIG. 6 (3.7) is functionally equivalent. From the intermediate code transformed by the rewrite rule of FIG. 6 (3.7), the translating means 222 can generate a compound instruction from the multiplication operation and the addition operation on D, E, and the result of applying a scaling shift operation to B (637). In this case, the signal-to-quantization noise ratio increases.

Equation 13 proves that the rewrite rule of FIG. 6 (3.8) is functionally equivalent. From the intermediate code transformed by this rule, the translating means 222 can generate a compound instruction from the multiplication operation and the addition operation on D, E, and the result of applying a scaling shift operation to B (638). In this case, the signal-to-quantization noise ratio increases.

The transforming means 221 checks whether a node pattern contained in the rewrite rules 111 stored in the memory 102 appears in the directed acyclic graph of the intermediate code generated by the intermediate code generating means 214. For each pattern that matches, it transforms the graph into a functionally equivalent directed acyclic graph on the basis of algebraic properties.
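The matching and transformation just described can be sketched for the simplest rule, the merging of adjacent scaling shifts (510). This is an illustrative sketch under stated assumptions, not the patented implementation: the `Node` class, its field names, and the signed `amount` convention (positive for left, negative for right) are all hypothetical. As a conservative choice of this sketch, a right shift followed by a left shift is left alone, because merging that pair would not reproduce the original bit pattern exactly.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical intermediate-code node: op is "var", "shift", "mul", "add" or "sub"."""
    op: str
    args: Tuple["Node", ...] = ()
    name: str = ""      # used by "var" nodes
    amount: int = 0     # used by "shift" nodes: > 0 left, < 0 right


def match_adjacent_shifts(node: Node) -> Optional[Node]:
    """Return the merged node if `node` is a shift whose operand is a shift, else None."""
    if node.op != "shift" or node.args[0].op != "shift":
        return None
    inner = node.args[0]
    if inner.amount < 0 and node.amount > 0:
        return None  # right-then-left drops low bits; not merged in this sketch
    return Node("shift", inner.args, amount=inner.amount + node.amount)


def rewrite(node: Node) -> Node:
    """Bottom-up pass: rewrite the operands first, then try the pattern on this node."""
    rebuilt = Node(node.op, tuple(rewrite(a) for a in node.args), node.name, node.amount)
    merged = match_adjacent_shifts(rebuilt)
    return merged if merged is not None else rebuilt
```

For example, a tree for `((A << 3) >> 2) x B` is rewritten into one for `(A << 1) x B`, removing one shift node from the graph.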

To reduce the complexity of pattern matching, when two or more rewrite rules 111 can be applied to the same intermediate code at once, the transforming means 221 preferably applies the rule with the higher priority first, according to a predetermined priority criterion. A preferred embodiment of the priority criterion is shown in Table 5 below.

[Table 5]
Priority | Rewrite rule | Precision and computation
1 | 636 | Precision ↑, operation count ↓
2 | 510 | Operation count ↓
3 | 522, 637, 638 | Precision ↑
4 | 521, 532, 633 | No change
5 | 523, 524, 531, 634, 635 | Precision ↓

In Table 5, priorities are assigned according to two criteria: precision and computation. Precision is evaluated by the value of the signal-to-quantization noise ratio, and computation by the number of nodes in the directed acyclic graph.
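The text does not say how the signal-to-quantization noise ratio is computed. The sketch below assumes the standard definition, 10·log10(signal power / noise power) in decibels, and uses a hypothetical `quantize` helper that truncates a value to a given number of fractional bits. It only illustrates why keeping more fractional bits (a smaller right shift) yields a higher ratio.

```python
import math
from typing import Sequence


def quantize(x: float, frac_bits: int) -> float:
    """Truncate x to a fixed-point value with `frac_bits` fractional bits (hypothetical helper)."""
    scale = 1 << frac_bits
    return math.floor(x * scale) / scale


def sqnr_db(reference: Sequence[float], quantized: Sequence[float]) -> float:
    """Signal-to-quantization noise ratio in dB, using the standard power-ratio definition."""
    signal = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    if noise == 0:
        return math.inf  # no quantization error at all
    return 10 * math.log10(signal / noise)


# Illustrative input only; these samples are not taken from the patent's benchmarks.
samples = [0.1 * i for i in range(1, 20)]
coarse = sqnr_db(samples, [quantize(s, 4) for s in samples])
fine = sqnr_db(samples, [quantize(s, 12) for s in samples])
```

Here `fine` exceeds `coarse`, which is the sense in which a rewrite that discards fewer low-order bits "raises precision".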

For example, both the rewrite rule 523 of FIG. 5 (2.3) and the rewrite rule 638 of FIG. 6 (3.8) can be applied to the directed acyclic graph 310 of FIG. 3(a). If the transforming means 221 removes the scaling shift node 312 from between the multiplication node 311 and the subtraction node 313, the translating means 222 can generate the compound instruction mac.

In this case, however, Table 5 gives the rewrite rule 638 of FIG. 6 (3.8) a higher priority than the rewrite rule 523 of FIG. 5 (2.3), so the transforming means 221 transforms the intermediate code according to the rewrite rule 638.
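The choice between rules 523 and 638 amounts to a table lookup. A minimal sketch follows, with the priorities copied from Table 5; the dictionary and function names are hypothetical, and ties within one priority level are broken by input order, which the text does not specify.

```python
from typing import Iterable

# Priority of each rewrite rule as listed in Table 5 (1 is applied first).
PRIORITY = {
    636: 1,
    510: 2,
    522: 3, 637: 3, 638: 3,
    521: 4, 532: 4, 633: 4,
    523: 5, 524: 5, 531: 5, 634: 5, 635: 5,
}


def pick_rule(applicable: Iterable[int]) -> int:
    """Return the applicable rule with the best (lowest-numbered) priority."""
    return min(applicable, key=lambda rule: PRIORITY[rule])
```

With this table, `pick_rule([523, 638])` selects 638, matching the example above.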

According to a preferred embodiment of the present invention, the transforming means 221 applies the rewrite rules 111 stored in the memory 102 repeatedly until no rule can be applied to the intermediate code any longer.
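Repeated application until no rule matches is a fixed-point loop. The sketch below is one way to write such a driver, under simplifying assumptions that are not from the patent: the "intermediate code" is reduced to a tuple of signed shift amounts applied in sequence (positive for left, negative for right), each rule is a hypothetical function that returns the rewritten code or `None` when it does not apply, and rules are tried in priority order.

```python
from typing import Callable, Optional, Sequence, Tuple

Code = Tuple[int, ...]
Rule = Callable[[Code], Optional[Code]]


def merge_first_pair(shifts: Code) -> Optional[Code]:
    """Merge the first adjacent pair of shifts, skipping right-then-left pairs."""
    for i in range(len(shifts) - 1):
        a, b = shifts[i], shifts[i + 1]
        if not (a < 0 and b > 0):
            return shifts[:i] + (a + b,) + shifts[i + 2:]
    return None


def drop_zero(shifts: Code) -> Optional[Code]:
    """Remove one shift by zero, which has no effect."""
    if 0 in shifts:
        i = shifts.index(0)
        return shifts[:i] + shifts[i + 1:]
    return None


def apply_until_fixpoint(code: Code, rules: Sequence[Rule], max_steps: int = 1000) -> Code:
    """Apply the first matching rule, restart, and stop when no rule applies."""
    for _ in range(max_steps):
        for rule in rules:
            rewritten = rule(code)
            if rewritten is not None:
                code = rewritten
                break
        else:
            return code  # no rule matched: fixed point reached
    raise RuntimeError("rewrite rules did not converge")
```

Both toy rules shorten the tuple, so the loop always terminates; the `max_steps` guard is only a safety net for rule sets that lack such a property.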

FIG. 7 is a graph showing the percentage reduction in execution time in one embodiment of the present invention, and FIG. 8 is a graph showing the percentage reduction in code size in one embodiment of the present invention. The experimental results of a preferred embodiment, which demonstrate the efficiency of the present invention, are described below.

A preferred embodiment of the present invention was implemented in the ZSP compiler. For the experiment, two sets of executables for the ZSP400 processor were generated as benchmark code from floating-point code provided by DSPstone.

The first set, executable 1, contains object code 110 generated by a conventional compiler from fixed-point source code 109 that an autoscaler converted from the floating-point code.

The second set, executable 2, contains object code 110 generated by the compiler 108 according to the present invention from fixed-point source code 109 that an autoscaler converted from the floating-point code.

For both sets of executables, execution time was measured by counting cycles in a simulator, and code size was measured with a utility tool. The reductions in execution time and code size attributable to the preferred embodiment were calculated as percentages using Equation 14 and are shown in FIG. 7 and FIG. 8, respectively.
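Equation 14 is not reproduced in this passage. The sketch below assumes the usual relative-reduction formula, (baseline - optimized) / baseline x 100, where the baseline is executable 1 and the optimized value is executable 2; the function name is hypothetical.

```python
def reduction_percent(baseline: float, optimized: float) -> float:
    """Percentage by which `optimized` is smaller than `baseline` (assumed form of Equation 14)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - optimized) / baseline
```

Under this assumption, a benchmark that takes 1000 cycles with the conventional compiler and 785 cycles with the rewriting compiler (hypothetical numbers, not measured values) would be reported as a 21.5 % reduction. The same function applies to code size in bytes or words.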

FIG. 7 shows the performance improvement that results from the reduced execution time, on the ZSP400 processor, of the executables generated by the preferred embodiment of the present invention. FIG. 7 indicates that performance improved by up to 21.5 % and by 12.7 % on average.

FIG. 8 shows that the preferred embodiment of the present invention can reduce the code size of the executables. FIG. 8 indicates that code size decreased by up to 16.7 % and by 10 % on average.

In FIG. 7 and FIG. 8, the vertical axis shows the percentage reduction in execution time and in code size, respectively, and the horizontal axis lists the benchmark operations, such as convolution and averaging. The two figures thus compare the performance of the executables across several operations.

The compiling method of the present invention can also be embodied as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system. Examples include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, as well as implementations in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed fashion.

As described above, the present invention solves the problem that a conventional compiler could not generate compound instructions because of the scaling shift operations inserted into fixed-point code converted from floating-point code for a DSP system. As a result, the execution time on the target processor and the code size of an executable containing the object code generated by the compiler can be reduced considerably. The present invention is therefore expected to further promote the development and use of fixed-point DSP processors.

Claims

A compiling method comprising: a generating step of generating, from a series of programming language statements comprising fixed-point source code with scaling shift operations, intermediate code that can be represented as a directed acyclic graph (DAG) containing nodes;

a transforming step of transforming the intermediate code according to a predetermined rewrite rule that modifies, by algebraic transformation, a pattern of the directed acyclic graph including a node of the scaling shift operation and a node of another algebraic operation; and

a translating step of translating the transformed intermediate code into object code.

The compiling method of claim 1, wherein, when the pattern of the directed acyclic graph is one in which the node of the scaling shift operation is adjacent to the node of another scaling shift operation, the rewrite rule modifies the pattern of the directed acyclic graph so that the nodes of the two scaling shift operations are merged into the node of a single scaling shift operation.

The compiling method of claim 1, wherein, when the pattern of the directed acyclic graph is one in which the node of the scaling shift operation is adjacent to a first node relating to one of addition and multiplication and is located between the first node and a second node relating to one of addition and multiplication, the rewrite rule modifies the pattern of the directed acyclic graph so that the first node and the second node become adjacent.

The compiling method of claim 1, wherein the transforming step is repeated until the rewrite rule can no longer be applied.

The compiling method of claim 1, wherein, in the transforming step, when two or more rewrite rules can be applied, the rewrite rule with the higher priority is applied first according to a predetermined priority criterion.

The compiling method of claim 5, wherein the priority criterion includes at least one of the value of a signal-to-quantization noise ratio (SQNR) and the number of nodes in the pattern of the directed acyclic graph.

A compiler comprising: generating means for generating, from a series of programming language statements comprising fixed-point source code with scaling shift operations, intermediate code that can be represented as a directed acyclic graph (DAG) containing nodes;

storage means for storing a predetermined rewrite rule that modifies, by algebraic transformation, a pattern of the directed acyclic graph including a node of the scaling shift operation and a node of another algebraic operation;

transforming means for transforming the intermediate code according to the rewrite rule stored in the storage means; and

translating means for translating the transformed intermediate code into object code.

The compiler of claim 7, wherein, when the pattern of the directed acyclic graph is one in which the node of the scaling shift operation is adjacent to the node of another scaling shift operation, the rewrite rule modifies the pattern of the directed acyclic graph so that the nodes of the two scaling shift operations are merged into the node of a single scaling shift operation.

The compiler of claim 7, wherein, when the pattern of the directed acyclic graph is one in which the node of the scaling shift operation is adjacent to a first node relating to one of addition and multiplication and is located between the first node and a second node relating to one of addition and multiplication, the rewrite rule modifies the pattern of the directed acyclic graph so that the first node and the second node become adjacent.

The compiler of claim 7, wherein, when two or more rewrite rules can be applied, the transforming means applies the rewrite rule with the higher priority first according to a predetermined priority criterion.

The compiler of claim 7, wherein the transforming means repeatedly transforms the intermediate code according to the rewrite rule until the rewrite rule can no longer be applied.

A recording medium containing a program for realizing the compilation method according to any one of claims 1 to 6.