WO2024065866A1 - Intermediate representation method and apparatus for computational graph compilation - Google Patents

Intermediate representation method and apparatus for computational graph compilation

Info

Publication number
WO2024065866A1
WO2024065866A1 (PCT/CN2022/124002)
Authority
WO
WIPO (PCT)
Prior art keywords
tensor
variable
pointing
pointer set
tensor variable
Prior art date
Application number
PCT/CN2022/124002
Other languages
English (en)
French (fr)
Inventor
王宏升
潘爱民
陈�光
Original Assignee
之江实验室
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 之江实验室 filed Critical 之江实验室
Priority to US18/071,958 priority Critical patent/US20240104016A1/en
Publication of WO2024065866A1 publication Critical patent/WO2024065866A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/40 Transformation of program code
    • G06F 8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0475 Generative networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/20 Processor architectures; Processor configuration, e.g. pipelining

Definitions

  • the present invention relates to the field of computer systems based on specific computing models, and in particular to an intermediate representation method and device for computing graph compilation.
  • existing graph compilation technologies for neural network computing do not yet include compilation techniques that analyze which tensor variables in the tensor flow of a computational graph point to the same memory address. As a result, existing compilation technologies place high demands on hardware memory resources.
  • the purpose of the present invention is to provide an intermediate representation method and device for computational graph compilation to overcome the deficiencies in the prior art.
  • the present invention provides the following technical solutions:
  • the present invention discloses an intermediate representation method for computing graph compilation, comprising the following steps:
  • Step 1 Compile the neural network into a computational graph for neural network calculation
  • Step 2 Build a node for each tensor variable in the computation graph.
  • Step 3 Associating the nodes representing the tensor variables in the computation graph to the set of pointers pointing to the tensor variables;
  • Step 4 Analyze the constraint relationship between tensor variables in the calculation graph, including the following sub-steps:
  • Step 4.1 define the constraint representation of the address assignment operation between tensor variables in the computation graph
  • Step 4.2 Define the constraint representation of the assignment operation between tensor variables in the computation graph
  • Step 4.3 define the constraint representation of the tensor variable pointer set loading operation in the computation graph
  • Step 4.4 define the constraint representation of the storage operation of the tensor variable pointer set in the computation graph
  • Step 5 Iteratively construct the topology graph of the intermediate representation based on the constraint relationship of the computational graph tensor variables, including the following sub-steps:
  • Step 5.1 Construct the propagation process of the intermediate representation of the computational graph based on the constraint representation of the assignment operation between tensor variables;
  • Step 5.2 constructing the loading process of the tensor variable pointer set in the intermediate representation of the computation graph based on the constraint representation of the tensor variable pointer set loading operation;
  • Step 5.3 Construct the storage process of the tensor variable pointer set in the intermediate representation of the computational graph based on the constraint representation of the tensor variable pointer set storage operation;
  • Step 6 Analyze the tensor variables with different aliases pointing to the same memory location based on the intermediate representation and allocate registers for them.
  • the step 4.1 defines a constraint representation of the address assignment operation between tensor variables in the computation graph, which refers to a constraint representation method of assigning the address of tensor variable b to tensor variable a, specifically: if there is a relationship between tensor variable b and tensor variable a in which the address of tensor variable b is assigned to tensor variable a, then a basic constraint relationship is defined between tensor variable b and tensor variable a: the set containing tensor variable b is included in the set of pointers pointing to tensor variable a.
  • the step 4.2 defines the constraint representation of the assignment operation between tensor variables in the computational graph, which refers to a constraint representation method for assigning tensor variable b to tensor variable a, specifically: if there is a relationship between tensor variable b and tensor variable a in which tensor variable b is assigned to tensor variable a, then a constraint relationship of the assignment operation is defined between tensor variable b and tensor variable a: the set of pointers pointing to tensor variable b is included in the set of pointers pointing to tensor variable a.
  • the step 4.3 defines the constraint representation of the tensor variable pointer set loading operation in the computational graph, which refers to a constraint representation method for loading the elements in the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a, specifically: if tensor a and tensor b satisfy that the elements in the pointer set pointing to tensor variable b are assigned to tensor variable a, and the pointer set pointing to tensor variable b contains the tensor variable t element, then it is defined that there is a constraint relationship between tensor variables a and b: the constraint relationship of the operation of loading the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a is defined as the pointer set pointing to tensor variable t is included in the pointer set pointing to tensor variable a.
  • the step 4.4 defines the constraint representation of the storage operation of the pointer set of the tensor variable in the computational graph, which refers to a constraint representation method of storing the pointer set pointing to tensor variable b into the pointer sets of the elements of the pointer set pointing to tensor variable a, specifically: if tensor variable b and tensor variable a satisfy that the pointer set pointing to tensor variable b is assigned to the elements of the pointer set pointing to tensor variable a, and the pointer set pointing to tensor variable a contains the element tensor variable t, then a constraint relationship is defined between tensors a and b: the constraint relationship of storing the pointer set pointing to tensor variable b into the pointer sets of the elements of the pointer set pointing to tensor variable a is defined as: the pointer set pointing to tensor variable b is included in the pointer set P(t) of each element t of the pointer set pointing to tensor variable a.
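As a sketch, the four constraint kinds of steps 4.1-4.4 can be modeled as simple records. This is a hypothetical encoding for illustration; the field names ("target", "source") are assumptions, while the variable roles follow the definitions above.

```python
# Hypothetical encoding of the four constraint kinds defined in steps 4.1-4.4.
from dataclasses import dataclass

@dataclass(frozen=True)
class AddressOf:  # a = &b : {b} is included in P(a)   (step 4.1)
    target: str   # a
    source: str   # b

@dataclass(frozen=True)
class Copy:       # a = b : P(b) is included in P(a)   (step 4.2)
    target: str
    source: str

@dataclass(frozen=True)
class Load:       # a = *b : for every t in P(b), P(t) is included in P(a)  (step 4.3)
    target: str
    source: str

@dataclass(frozen=True)
class Store:      # *a = b : for every t in P(a), P(b) is included in P(t)  (step 4.4)
    target: str
    source: str
```

Each constraint later contributes either an initial pointer-set element or an inclusion edge in the topological graph of step 5.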
  • step 5.1 refers to propagating the set containing tensor variables along the direction of the edge of the computation graph based on the constraint representation, and the specific sub-steps are as follows:
  • Step 5.1.1 construct the intermediate representation of the assignment operation in the computation graph: For the constraint representation of the assignment operation in the computation graph, if there is a relationship between tensor variable b and tensor variable a that assigns tensor variable b to tensor variable a, then there is a constraint representation between tensor variable b and tensor variable a: the set of pointers pointing to tensor variable b is included in the set of pointers pointing to tensor variable a;
  • Step 5.1.2 constructing a topological graph based on the constraint representation of the assignment operation, specifically: for a constraint relationship that the pointer set pointing to the tensor variable b is included in the pointer set pointing to the tensor variable a, based on the constraint relationship, generating an edge in the topological graph from the node of the pointer set of the tensor variable b to the node of the pointer set of the tensor variable a;
  • the propagation process of tensor variables of assignment operations in the above topology graph is as follows: the execution flow of the computational graph passes through the constraint relationship edge representing the assignment operation, and the constraint relationship edge is propagated from the pointer set node pointing to the tensor variable b to the pointer set node pointing to the tensor variable a, and the tail node pointer of the constraint relationship also points to the set containing tensor variables pointed to by the pointer of the starting node of the constraint relationship edge.
  • step 5.2 refers to loading the elements in the pointer set pointing to the tensor variable b into the pointer set pointing to the tensor variable a, and the elements in the pointer set pointing to the tensor variable b are propagated along the direction of the edge of the computation graph based on the constraint representation, and the specific sub-steps are as follows:
  • Step 5.2.1 construct the intermediate representation of the loading operation in the computational graph: for the constraint representation of the loading operation, if tensor variable b and tensor variable a satisfy that the elements in the pointer set pointing to tensor variable b are assigned to tensor variable a, and the pointer set pointing to tensor variable b contains the element tensor variable t, then the constraint representation of loading the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a is: the pointer set pointing to tensor variable t is included in the pointer set pointing to tensor variable a;
  • Step 5.2.2 constructing a topological graph based on the constraint representation of the loading operation, specifically: for an element of the pointer set of the tensor variable t whose constraint relationship is that the pointer set pointing to the tensor variable b is included in the pointer set pointing to the tensor variable a, based on the constraint relationship, generating an edge in the topological graph from a node of the pointer set of the tensor variable t to a node of the pointer set of the tensor variable a;
  • the propagation process of the tensor variables of the loading operation in the above topology graph is: the execution flow of the computational graph passes through the constraint relationship edge representing the loading operation, and the constraint relationship edge is propagated from the pointer set node of the corresponding element to the pointer set node pointing to the tensor variable a.
  • step 5.3 refers to storing the pointer set pointing to the tensor variable b into the pointer set of the elements of the pointer set pointing to the tensor variable a, and the pointer set pointing to the tensor variable b is propagated along the direction of the edge of the computation graph based on the constraint representation, and the specific sub-steps are as follows:
  • Step 5.3.1 construct an intermediate representation of the storage operation in the computation graph: For the constraint representation of the storage operation in the computation graph, if there exists a constraint representation that assigns the pointer set pointing to the tensor variable b to the element of the pointer set pointing to the tensor variable a and the pointer set pointing to the tensor variable a contains the tensor variable t element, then the pointer set pointing to the tensor variable b is stored in the pointer set of the element of the pointer set pointing to the tensor variable a: the pointer set pointing to the tensor variable b is contained in the pointer set of the element t of the pointer set pointing to the tensor variable a;
  • Step 5.3.2 constructing a topological graph based on the constraint representation of the storage operation, specifically: for a constraint relationship that the pointer set pointing to the tensor variable b is included in the pointer set pointing to the tensor variable t, and based on the constraint relationship, generating an edge in the topological graph from a node of the pointer set of the tensor variable b to a node of the pointer set of the tensor variable t;
  • the propagation process of the tensor variables of the storage operation in the above topology graph is: the execution flow of the computational graph passes through the constraint relationship edge representing the storage operation, and the constraint relationship edge is propagated from the pointer set node pointing to the tensor variable b to the pointer set node corresponding to the element t in the pointer set pointing to the tensor variable a.
  • step 6 is specifically as follows: if it is found that in the topological graph of the intermediate representation, the set of pointers pointing to a tensor variable contains tensor variables with different aliases, since their memory addresses are the same, these tensor variables with different aliases are regarded as the same tensor variable, and the same free register is allocated to these tensor variables with different aliases.
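The alias check described above can be sketched as follows, assuming the points-to sets of the intermediate representation have already been computed; a pointer set containing more than one tensor-variable name marks those names as aliases of one memory location (the dictionary shape and the names `p`, `q`, `x`, `y`, `z` are illustrative assumptions).

```python
# Sketch of the alias check in step 6: a pointer set with more than one
# tensor-variable name yields a group of aliases that may share a register.
def find_alias_groups(points_to):
    """points_to: dict mapping a variable name to the set of names it points to."""
    groups = set()
    for targets in points_to.values():
        if len(targets) > 1:
            groups.add(frozenset(targets))
    return groups

# Hypothetical example: some pointer set contains both x and z, so x and z
# are aliases of the same memory location.
groups = find_alias_groups({"p": {"x", "z"}, "q": {"y"}})
```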
  • the present invention discloses an intermediate representation device for computational graph compilation, the device comprising a memory and one or more processors, the memory storing executable code, and the one or more processors, when executing the executable code, implementing the above intermediate representation method for computational graph compilation.
  • the present invention provides an intermediate representation method and device for computational graph compilation, provides an analysis method for tensor variables with alias relationships pointing to the same memory location in the computational graph, and stores the tensor variables with alias relationships pointing to the same memory location in the computational graph in the same register after analysis.
  • the intermediate representation method for computational graph compilation proposed by the present invention optimizes the compilation efficiency of tensor variables pointing to the same memory location in the computational graph, and reduces the demand for hardware memory resources when the computational graph is executed, while improving the execution efficiency of the computational graph when it is running.
  • researchers and engineering users use the intermediate representation method and device for computational graph compilation to optimize the model, improve the compilation efficiency of the computational graph, and promote the development of the application of deep neural network models.
  • FIG. 1 is a schematic diagram of a constraint representation of an address assignment operation in a computation graph according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a constraint representation of an assignment operation in an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of a constraint representation of a tensor variable pointer set loading operation in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a constraint representation of a tensor variable pointer set storage operation in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a propagation process for constructing an intermediate representation of a computation graph in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a process of loading a set of tensor variable pointers in an intermediate representation of a computation graph in an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a storage process for constructing a set of tensor variable pointers in an intermediate representation of a computation graph in an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of an intermediate representation of an address assignment operation in an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of an intermediate representation of an assignment operation in an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of an intermediate representation of a storage operation in an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of an intermediate representation of a load operation in an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of an intermediate representation of a load operation in an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of the overall architecture of the intermediate representation method in an embodiment of the present invention.
  • FIG. 14 is a schematic diagram of a device according to an embodiment of the present invention.
  • An embodiment of the present invention provides an intermediate representation method for computing graph compilation.
  • the architecture diagram of the intermediate representation method is shown in FIG13 , and includes the following steps:
  • Step 1 Compile the neural network into a computational graph for neural network calculation
  • Step 2 Build a node for each tensor variable v in the computation graph.
  • Step 3 Associate the node representing the tensor variable v in the computation graph with a pointer set P(v) pointing to the tensor variable v;
  • Step 4 Analyze the constraint relationship between tensor variables in the calculation graph, including the following sub-steps:
  • Step 4.1 define the constraint representation of the address assignment operation between tensor variables in the computation graph
  • Step 4.2 Define the constraint representation of the assignment operation between tensor variables in the computation graph
  • Step 4.3 define the constraint representation of the tensor variable pointer set loading operation in the computation graph
  • Step 4.4 define the constraint representation of the storage operation of the tensor variable pointer set in the computation graph
  • Step 5 Iteratively construct the topology graph of the intermediate representation based on the constraint relationship of the computational graph tensor variables, including the following sub-steps:
  • Step 5.1 Construct the propagation process of the intermediate representation of the computational graph based on the constraint representation of the assignment operation between tensor variables;
  • Step 5.2 constructing the loading process of the tensor variable pointer set in the intermediate representation of the computation graph based on the constraint representation of the tensor variable pointer set loading operation;
  • Step 5.3 construct the storage process of the tensor variable pointer set in the intermediate representation of the computational graph based on the constraint representation of the tensor variable pointer set storage operation;
  • Step 6 Analyze the tensor variables with different aliases pointing to the same memory location based on the intermediate representation and allocate registers for them.
  • the step 4.1 defines the constraint representation of the address assignment operation in the computation graph.
  • the constraint representation of the address assignment operation is: {b} ⊆ P(a). Figure 1 shows the constraint representation process of the address assignment operation in the computation graph.
  • the step 4.2 defines the constraint representation of the assignment operation in the computation graph.
  • the constraint representation of the assignment operation is: P(b) ⊆ P(a). Figure 2 shows the constraint representation process of the assignment operation in the computation graph.
  • the step 4.3 defines the constraint representation of the tensor variable pointer set loading operation in the computation graph.
  • the constraint representation of the tensor variable pointer set loading operation refers to the constraint representation method of loading the elements in the pointer set P(b) pointing to the tensor variable b to the pointer set P(a) pointing to the tensor variable a.
  • the constraint representation of the loading operation is: P(t) ⊆ P(a), for each t ∈ P(b). Figure 3 shows the constraint representation process of the tensor variable pointer set loading operation in the computation graph.
  • the step 4.4 defines the constraint representation of the storage operation of the tensor variable pointer set in the computation graph.
  • the constraint representation of the storage operation of the tensor variable pointer set refers to the constraint representation method of storing the pointer set P(b) pointing to the tensor variable b into the pointer set of the elements of the pointer set P(a) pointing to the tensor variable a.
  • the constraint relationship of the storage operation is:
  • tensors a and b have a constraint relationship: the constraint relationship of storing the pointer set P(b) pointing to tensor variable b in the pointer set of the element of the pointer set P(a) pointing to tensor variable a is defined as the pointer set pointing to tensor variable b is included in the pointer set P(t) of the element t of the pointer set pointing to tensor variable a.
  • the constraint of the storage operation is expressed as the constraint relationship: P(b) ⊆ P(t), for each t ∈ P(a).
  • Figure 4 shows the constraint representation process of the storage operation of the tensor variable pointer set in the computation graph.
  • the step 5.1 constructs the propagation process of the intermediate representation of the computational graph.
  • the propagation process of constructing the intermediate representation of the computational graph refers to the propagation of a set of tensor variables along the direction of the edge of the computational graph based on the constraint representation.
  • the modeling process of the intermediate representation propagation process is as follows:
  • Constraint representation construction process: for the constraint relationship P(b) ⊆ P(a), the pointer set pointing to tensor variable b is included in the pointer set pointing to tensor variable a.
  • the graph construction process of the constraint representation means that a topological graph based on the constraint relationship will generate an edge from the node P(b) of the pointer set of the tensor variable b to the node P(a) of the pointer set of the tensor variable a.
  • the constraint relationship edge is propagated from the pointer set node P(b) pointing to tensor variable b to the pointer set node P(a) pointing to tensor variable a, that is, the tail node pointer of the constraint relationship also points to the set of tensor variables pointed to by the pointer of the starting node of the constraint relationship edge.
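The propagation along a constraint edge can be sketched in a few lines, assuming pointer sets are plain Python sets keyed by variable name and edges are (source, destination) pairs; the names are illustrative, not from the patent.

```python
# Sketch of the propagation of step 5.1: an assignment constraint a = b
# yields an edge from the pointer-set node P(b) to P(a), and the set at the
# edge's tail node flows into the set at its head node.
def propagate(points_to, edges):
    """edges: iterable of (src, dst) pairs; returns True if any set grew."""
    changed = False
    for src, dst in edges:
        new = points_to.setdefault(src, set()) - points_to.setdefault(dst, set())
        if new:
            points_to[dst] |= new
            changed = True
    return changed

# Hypothetical example: b points to x, and the edge P(b) -> P(a) carries x
# into the pointer set of a.
pts = {"b": {"x"}, "a": set()}
propagate(pts, [("b", "a")])
```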
  • Figure 5 shows the propagation process of constructing the intermediate representation of the computation graph.
  • the step 5.2 constructs the process of loading the tensor variable pointer set in the intermediate representation of the computation graph.
  • the process of loading the tensor variable pointer set in the intermediate representation of the computation graph is to load the elements in the pointer set P(b) pointing to the tensor variable b into the pointer set P(a) pointing to the tensor variable a, and the elements in the pointer set P(b) pointing to the tensor variable b are propagated along the direction of the edge of the computation graph based on the constraint representation.
  • the modeling process of the intermediate representation loading process is as follows:
  • Constraint representation construction process: for the constraint relationship P(t) ⊆ P(a) with t ∈ P(b) (that is, t is an element contained in the pointer set pointing to tensor variable b), the graph construction process of the constraint representation means that the topological graph, based on the constraint relationship, generates an edge from the node of the pointer set of tensor variable t to the node of the pointer set of tensor variable a. Figure 6 shows the process of loading the tensor variable pointer set in the intermediate representation of the computation graph.
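The edge derivation for a loading constraint can be sketched as below, assuming each load constraint a = *b is encoded as a pair (a, b); the encoding and names are assumptions for illustration.

```python
# Sketch of step 5.2: for a load constraint a = *b, every element t currently
# in P(b) contributes an edge from the pointer-set node of t to that of a.
def add_load_edges(points_to, edges, loads):
    """loads: iterable of (a, b) pairs encoding a = *b; returns the new edges."""
    fresh = set()
    for a, b in loads:
        for t in points_to.get(b, set()):
            if (t, a) not in edges:
                fresh.add((t, a))
    edges |= fresh
    return fresh

# Hypothetical example: P(b) contains t, so loading a = *b adds the edge
# P(t) -> P(a) to the topological graph.
edges = set()
add_load_edges({"b": {"t"}}, edges, [("a", "b")])
```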
  • the step 5.3 constructs a storage process of a set of tensor variable pointers in the intermediate representation of the computation graph.
  • the storage process of constructing a set of tensor variable pointers in the intermediate representation of the computation graph refers to storing a set of pointers P(b) pointing to tensor variable b into a set of pointers to elements of a set of pointers P(a) pointing to tensor variable a, and the set of pointers P(b) pointing to tensor variable b is propagated along the direction of the edge of the computation graph based on the constraint representation.
  • the modeling process of the intermediate representation storage process is as follows:
  • the constraint representation of storing the pointer set P(b) pointing to tensor variable b into the pointer sets of the elements of the pointer set P(a) pointing to tensor variable a is: P(b) ⊆ P(t), for each t ∈ P(a);
  • that is, the pointer set of tensor variable b is included in the pointer set P(t) of each element t of the pointer set pointing to tensor variable a.
  • Constraint representation construction process: for the constraint relationship P(b) ⊆ P(t) with t ∈ P(a) (that is, t is an element contained in the pointer set pointing to tensor variable a), the graph construction process of the constraint representation means that the topological graph, based on the constraint relationship, generates an edge from the node of the pointer set of tensor variable b to the node of the pointer set of tensor variable t.
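Symmetrically to the loading case, the edge derivation for a storage constraint can be sketched as below, assuming each store constraint *a = b is encoded as a pair (a, b); the encoding and names are assumptions for illustration.

```python
# Sketch of step 5.3: for a store constraint *a = b, every element t currently
# in P(a) contributes an edge from the pointer-set node of b to that of t.
def add_store_edges(points_to, edges, stores):
    """stores: iterable of (a, b) pairs encoding *a = b; returns the new edges."""
    fresh = set()
    for a, b in stores:
        for t in points_to.get(a, set()):
            if (b, t) not in edges:
                fresh.add((b, t))
    edges |= fresh
    return fresh

# Hypothetical example: P(a) contains t, so storing *a = b adds the edge
# P(b) -> P(t) to the topological graph.
edges = set()
add_store_edges({"a": {"t"}}, edges, [("a", "b")])
```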
  • FIG. 7 shows the storage process of constructing the tensor variable pointer set in the intermediate representation of the computation graph.
  • the step 6 is specifically as follows: if it is found that in the topological graph of the intermediate representation, the pointer set pointing to a tensor variable contains tensor variables with different aliases, since their memory addresses are the same, these tensor variables with different aliases are regarded as the same tensor variable, and the same free register is allocated to these tensor variables with different aliases.
  • the first step is to build a node for each tensor variable in the computation graph, which contains a set of pointers to tensor variables.
  • the computation graph is as follows:
  • b = &a: indicates that pointer variable b points to the address of tensor variable a, where &a means taking the memory address of tensor variable a;
  • b = a: indicates that pointer variable b points to tensor variable a;
  • *b = a: indicates that the elements of pointer variable b point to tensor variable a, where *b means dereferencing the pointer b, that is, obtaining the elements contained in the pointer set pointing to tensor variable b;
  • b = *a: indicates that pointer variable b points to the elements of pointer variable a.
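A small classifier for the four statement forms listed above can be sketched as follows; the textual "lhs = rhs" syntax is purely illustrative, not a notation from the patent.

```python
# Hypothetical classifier for the four computation-graph statement forms.
def classify(stmt):
    """Classify a statement such as 'b = &a', 'b = a', '*b = a', or 'b = *a'."""
    lhs, rhs = (s.strip() for s in stmt.split("="))
    if lhs.startswith("*"):
        return "store"        # *b = a
    if rhs.startswith("&"):
        return "address-of"   # b = &a
    if rhs.startswith("*"):
        return "load"         # b = *a
    return "copy"             # b = a
```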
  • the second step is to analyze the constraint relationship between tensor variables in the calculation graph.
  • the constraint relationship of storing the pointer set P(b) pointing to tensor variable b into the pointer set P(t) of element t of the pointer set P(a) pointing to tensor variable a is defined as: the pointer set P(b) pointing to tensor variable b is included in the pointer set P(t) of element t of the pointer set pointing to tensor variable a, that is, P(b) ⊆ P(t) with t ∈ P(a).
  • the constraint relationship of loading the pointer set P(b) pointing to tensor variable b into the pointer set P(a) pointing to tensor variable a is defined as: the pointer set P(t) of each element t contained in the pointer set P(b) pointing to tensor variable b is included in the pointer set pointing to tensor variable a, that is, P(t) ⊆ P(a) with t ∈ P(b).
  • Step 3 Iteratively construct the topology graph of the intermediate representation based on the constraint relationships of the computation graph tensor variables, repeating the construction until the structure of the topology graph of the intermediate representation no longer changes.
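The iterative construction described above can be sketched as a compact fixpoint loop: seed the pointer sets from the address assignment constraints, repeatedly derive edges from the load/store constraints using the current sets, and propagate pointer sets along all edges until nothing changes. The driver function and its input encoding are assumptions; the constraint semantics follow the definitions above.

```python
# Fixpoint sketch of the iterative topology-graph construction.
from collections import defaultdict

def solve(address_of, copies, loads, stores):
    points_to = defaultdict(set)
    for a, b in address_of:                 # a = &b : b in P(a)
        points_to[a].add(b)
    edges = {(b, a) for a, b in copies}     # a = b : edge P(b) -> P(a)
    changed = True
    while changed:
        changed = False
        for a, b in loads:                  # a = *b : edge P(t) -> P(a), t in P(b)
            for t in set(points_to[b]):
                if (t, a) not in edges:
                    edges.add((t, a)); changed = True
        for a, b in stores:                 # *a = b : edge P(b) -> P(t), t in P(a)
            for t in set(points_to[a]):
                if (b, t) not in edges:
                    edges.add((b, t)); changed = True
        for src, dst in list(edges):        # propagate sets along all edges
            new = points_to[src] - points_to[dst]
            if new:
                points_to[dst] |= new; changed = True
    return points_to

# Hypothetical example: p = &x; q = &z; p = q.
# After the fixpoint, P(p) contains both x and z, so x and z alias.
pts = solve([("p", "x"), ("q", "z")], [("p", "q")], [], [])
```

The loop terminates because pointer sets and edge sets only ever grow and are bounded by the finite number of tensor variables.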
  • the first round of iteration constructs the intermediate representation topology graph:
  • Figure 8 shows the intermediate representation process of the address assignment operation.
  • the second round of iterations constructs the intermediate representation topology graph:
  • Analyze tensor variables with different aliases pointing to the same memory location based on the intermediate representation: from analysis of the topological graph of the intermediate representation, it is found that the pointer set pointing to tensor variable x contains the tensor variables x and z as elements. Therefore, tensor variables x and z have an alias relationship in the computational graph, and their memory addresses are the same, indicating that x and z are the same tensor variable.
  • Allocate registers for the tensor variables in the computation graph: because the memory addresses of tensor variables x and z are the same, only one free register needs to be allocated for tensor variables x and z.
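The register-allocation consequence can be sketched as follows: each alias group shares one register, and every remaining tensor variable gets its own. The register numbering and variable names are illustrative assumptions.

```python
# Sketch of allocating registers after alias analysis: one register per alias
# group, one register per remaining variable.
def assign_registers(alias_groups, variables):
    reg, next_reg = {}, 0
    for group in alias_groups:
        for v in group:
            reg[v] = next_reg       # all aliases share the same register
        next_reg += 1
    for v in variables:
        if v not in reg:
            reg[v] = next_reg       # non-aliased variables get their own
            next_reg += 1
    return reg

# Hypothetical example: x and z alias, so they share a register; y does not.
regs = assign_registers([{"x", "z"}], ["x", "z", "y"])
```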
  • an embodiment of the present invention further provides an intermediate representation device for computational graph compilation, which also includes a memory and one or more processors, wherein executable code is stored in the memory, and when the one or more processors execute the executable code, they are used to implement the intermediate representation method for computational graph compilation in the above embodiment.
  • An embodiment of an intermediate representation device for computing graph compilation of the present invention can be applied to any device with data processing capabilities, and the arbitrary device with data processing capabilities can be a device or apparatus such as a computer.
  • the device embodiment can be implemented by software, by hardware, or by a combination of software and hardware. Taking software implementation as an example, the device in the logical sense is formed by the processor of the device with data processing capabilities where it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them. At the hardware level, Figure 14 shows a hardware structure diagram of a device with data processing capabilities where the intermediate representation device for computational graph compilation of the present invention is located.
  • any device with data processing capabilities where the device in the embodiment is located can also include other hardware according to the actual function of the arbitrary device with data processing capabilities, which will not be repeated here.
  • For details of the implementation process of the functions and effects of each unit in the above apparatus, refer to the implementation process of the corresponding steps in the above method; they are not repeated here.
  • Since the apparatus embodiment basically corresponds to the method embodiment, the relevant parts can refer to the description of the method embodiment.
  • The apparatus embodiment described above is only illustrative: the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of the present invention. Those of ordinary skill in the art can understand and implement it without creative effort.
  • An embodiment of the present invention further provides a computer-readable storage medium on which a program is stored. When the program is executed by a processor, it implements the intermediate representation method for computational graph compilation of the above embodiment.
  • The computer-readable storage medium may be an internal storage unit, such as a hard disk or memory, of any device with data processing capabilities described in any of the aforementioned embodiments.
  • The computer-readable storage medium may also be an external storage device of such a device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card, or a flash card provided on the device.
  • The computer-readable storage medium may also include both the internal storage unit and the external storage device of the device.
  • The computer-readable storage medium is used to store the computer program and the other programs and data required by the device, and may also be used to temporarily store data that has been output or is to be output.

Abstract

The present invention discloses an intermediate representation method for compiling computation graphs, comprising the following steps: step 1, compiling a neural network into a computation graph for neural network computation; step 2, constructing a node for each tensor variable in the computation graph; step 3, associating the node representing a tensor variable in the computation graph with the pointer set pointing to that tensor variable; step 4, analyzing the constraint relationships between tensor variables in the computation graph; step 5, iteratively constructing the topological graph of the intermediate representation based on the constraint relationships of the tensor variables of the computation graph; step 6, based on the intermediate representation, analyzing tensor variables with different aliases that point to the same memory location, and allocating registers for them. The present invention provides an analysis method for aliased tensor variables in the computation graph that point to the same memory location, and the proposed intermediate representation method for compiling computation graphs optimizes the compilation efficiency of tensor variables pointing to the same memory location.

Description

Intermediate Representation Method and Apparatus for Compiling Computation Graphs
This application claims the priority benefit of Chinese patent application No. 202211177783.1, entitled "Intermediate Representation Method and Apparatus for Compiling Computation Graphs", filed with the China National Intellectual Property Administration on September 27, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computer systems based on specific computational models, and in particular to an intermediate representation method and apparatus for compiling computation graphs.
Background
With the deployment of neural network models in recent years, techniques for compiling computation graphs have become increasingly important. At present, existing graph compilation techniques for neural network computation still lack a compilation technique that analyzes the tensor variables in a computation graph's tensor flow that point to the same memory address; existing compilation techniques therefore place high demands on the memory resources of the hardware.
Summary
The purpose of the present invention is to provide an intermediate representation method and apparatus for compiling computation graphs, so as to overcome the deficiencies of the prior art.
To achieve the above purpose, the present invention provides the following technical solution:
The present invention discloses an intermediate representation method for compiling computation graphs, comprising the following steps:
Step 1: compiling a neural network into a computation graph for neural network computation;
Step 2: constructing a node for each tensor variable in the computation graph;
Step 3: associating the node representing a tensor variable in the computation graph with the pointer set pointing to said tensor variable;
Step 4: analyzing the constraint relationships between tensor variables in the computation graph, comprising the following sub-steps:
Step 4.1: defining the constraint representation of the address assignment operation between tensor variables in the computation graph;
Step 4.2: defining the constraint representation of the assignment operation between tensor variables in the computation graph;
Step 4.3: defining the constraint representation of the load operation on tensor-variable pointer sets in the computation graph;
Step 4.4: defining the constraint representation of the store operation on tensor-variable pointer sets in the computation graph;
Step 5: iteratively constructing the topological graph of the intermediate representation based on the constraint relationships of the tensor variables of the computation graph, comprising the following sub-steps:
Step 5.1: constructing the propagation process of the intermediate representation of the computation graph based on the constraint representation of the assignment operation between tensor variables;
Step 5.2: constructing the load process of tensor-variable pointer sets in the intermediate representation based on the constraint representation of the load operation on tensor-variable pointer sets;
Step 5.3: constructing the store process of tensor-variable pointer sets in the intermediate representation based on the constraint representation of the store operation on tensor-variable pointer sets;
Step 6: based on the intermediate representation, analyzing tensor variables with different aliases that point to the same memory location, and allocating registers for them.
Preferably, step 4.1, defining the constraint representation of the address assignment operation between tensor variables in the computation graph, refers to the constraint representation method of assigning the address of tensor variable b to tensor variable a, specifically: if tensor variable b and tensor variable a have the relation that the address of b is assigned to a, then a basic constraint relationship is defined between b and a: the set containing tensor variable b is included in the pointer set pointing to tensor variable a.
Preferably, step 4.2, defining the constraint representation of the assignment operation between tensor variables in the computation graph, refers to the constraint representation method of assigning tensor variable b to tensor variable a, specifically: if b and a have the relation that b is assigned to a, then the constraint relationship of the assignment operation is defined between b and a: the pointer set pointing to b is included in the pointer set pointing to a.
Preferably, step 4.3, defining the constraint representation of the load operation on tensor-variable pointer sets in the computation graph, refers to the constraint representation method of loading the elements of the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a, specifically: if tensors a and b satisfy that an element of the pointer set pointing to b is assigned to a, and the pointer set pointing to b contains the element tensor variable t, then a constraint relationship is defined between a and b: the constraint relationship of loading the pointer set pointing to b into the pointer set pointing to a is that the pointer set pointing to t is included in the pointer set pointing to a.
Preferably, step 4.4, defining the constraint representation of the store operation on tensor-variable pointer sets in the computation graph, refers to the constraint representation method of storing the pointer set pointing to tensor variable b into the pointer sets of the elements of the pointer set pointing to tensor variable a, specifically: if b and a satisfy that the pointer set pointing to b is assigned to the elements of the pointer set pointing to a, and the pointer set pointing to a contains the element tensor variable t, then a constraint relationship is defined between a and b: the constraint relationship of storing the pointer set pointing to b into the pointer sets of the elements of the pointer set pointing to a is that the pointer set pointing to b is included in the pointer set of element t of the pointer set pointing to a.
Preferably, step 5.1 refers to the sets containing tensor variables propagating along the direction of the constraint-representation-based edges of the computation graph, with the following sub-steps:
Step 5.1.1: construct the intermediate representation of the assignment operation in the computation graph: for the constraint representation of the assignment operation, if tensor variable b and tensor variable a have the relation that b is assigned to a, then b and a have the constraint representation: the pointer set pointing to b is included in the pointer set pointing to a;
Step 5.1.2: construct the topological graph based on the constraint representation of the assignment operation, specifically: for the constraint that the pointer set pointing to b is included in the pointer set pointing to a, the topological graph generates an edge from the node of the pointer set of b to the node of the pointer set of a;
The propagation process of the tensor variables of the assignment operation in the above topological graph is: when the execution flow of the computation graph passes through the constraint edge representing the assignment operation, the edge propagates from the pointer-set node of b to the pointer-set node of a, and the pointer of the tail node of the constraint also points to the set of tensor variables pointed to by the pointer of the start node of the constraint edge.
Preferably, step 5.2 refers to loading the elements of the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a, with the elements of the pointer set pointing to b propagating along the direction of the constraint-representation-based edges of the computation graph, with the following sub-steps:
Step 5.2.1: construct the intermediate representation of the load operation in the computation graph: for the constraint representation of the load operation, if b and a satisfy that an element of the pointer set pointing to b is assigned to a, and the pointer set pointing to b contains the element tensor variable t, then the constraint representation of loading the pointer set pointing to b into the pointer set pointing to a is: the pointer set pointing to t is included in the pointer set pointing to a;
Step 5.2.2: construct the topological graph based on the constraint representation of the load operation, specifically: for the constraint that the pointer set of tensor variable t, an element of the pointer set pointing to b, is included in the pointer set pointing to a, the topological graph generates an edge from the node of the pointer set of t to the node of the pointer set of a;
The propagation process of the tensor variables of the load operation in the above topological graph is: when the execution flow of the computation graph passes through the constraint edge representing the load operation, the edge propagates from the pointer-set node of the corresponding element to the pointer-set node of a.
Preferably, step 5.3 refers to storing the pointer set pointing to tensor variable b into the pointer sets of the elements of the pointer set pointing to tensor variable a, with the pointer set pointing to b propagating along the direction of the constraint-representation-based edges of the computation graph, with the following sub-steps:
Step 5.3.1: construct the intermediate representation of the store operation in the computation graph: for the constraint representation of the store operation, if b and a satisfy that the pointer set pointing to b is assigned to the elements of the pointer set pointing to a, and the pointer set pointing to a contains the element tensor variable t, then the constraint representation of storing the pointer set pointing to b into the pointer sets of the elements of the pointer set pointing to a is: the pointer set pointing to b is included in the pointer set of element t of the pointer set pointing to a;
Step 5.3.2: construct the topological graph based on the constraint representation of the store operation, specifically: for the constraint that the pointer set pointing to b is included in the pointer set pointing to t (t being an element of the pointer set pointing to a), the topological graph generates an edge from the node of the pointer set of b to the node of the pointer set of t;
The propagation process of the tensor variables of the store operation in the above topological graph is: when the execution flow of the computation graph passes through the constraint edge representing the store operation, the edge propagates from the pointer-set node of b to the pointer-set node of the corresponding element t of the pointer set pointing to a.
Preferably, step 6 is specifically: if it is found in the topological graph of the intermediate representation that the pointer set pointing to some tensor variable contains tensor variables with different aliases, then, since their memory addresses are the same, these differently aliased tensor variables are regarded as the same tensor variable and are allocated the same free register.
The present invention also discloses an intermediate representation apparatus for compiling computation graphs, the apparatus comprising a memory and one or more processors, executable code being stored in the memory, and the one or more processors, when executing the executable code, implementing the above intermediate representation method for compiling computation graphs.
Beneficial effects of the present invention: the intermediate representation method and apparatus for compiling computation graphs of the present invention provide an analysis method for aliased tensor variables in the computation graph that point to the same memory location, and, after analysis, store such aliased tensor variables in the same register. The proposed intermediate representation method optimizes the compilation efficiency of tensor variables pointing to the same memory location, reduces the demand on hardware memory resources when the computation graph executes, and improves the execution efficiency of the computation graph at run time. In the process of developing algorithm models, researchers and engineering practitioners can use the described intermediate representation method and apparatus to optimize their models, improving the compilation efficiency of computation graphs and promoting the practical deployment of deep neural network models.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the constraint representation of the address assignment operation in the computation graph in an embodiment of the present invention;
Figure 2 is a schematic diagram of the constraint representation of the assignment operation in an embodiment of the present invention;
Figure 3 is a schematic diagram of the constraint representation of the load operation on tensor-variable pointer sets in an embodiment of the present invention;
Figure 4 is a schematic diagram of the constraint representation of the store operation on tensor-variable pointer sets in an embodiment of the present invention;
Figure 5 is a schematic diagram of the propagation process of constructing the intermediate representation of the computation graph in an embodiment of the present invention;
Figure 6 is a schematic diagram of the load process of tensor-variable pointer sets in the intermediate representation of the computation graph in an embodiment of the present invention;
Figure 7 is a schematic diagram of the store process of tensor-variable pointer sets in the intermediate representation of the computation graph in an embodiment of the present invention;
Figure 8 is a schematic diagram of the intermediate representation of the address assignment operation in an embodiment of the present invention;
Figure 9 is a schematic diagram of the intermediate representation of the assignment operation in an embodiment of the present invention;
Figure 10 is a schematic diagram of the intermediate representation of the store operation in an embodiment of the present invention;
Figure 11 is a schematic diagram of the intermediate representation of the load operation in an embodiment of the present invention;
Figure 12 is a schematic diagram of the intermediate representation of the load operation in an embodiment of the present invention;
Figure 13 is a schematic diagram of the overall architecture of the intermediate representation method in an embodiment of the present invention;
Figure 14 is a schematic diagram of the apparatus in an embodiment of the present invention.
Detailed Description
To make the purpose, technical solution and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood, however, that the specific embodiments described here are only intended to explain the present invention and are not intended to limit its scope. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the present invention.
An embodiment of the present invention provides an intermediate representation method for compiling computation graphs; the architecture of the method is shown in Figure 13, and the method comprises the following steps:
Step 1: compiling a neural network into a computation graph for neural network computation;
Step 2: constructing a node for each tensor variable v in the computation graph;
Step 3: associating the node representing tensor variable v in the computation graph with a pointer set P(v) pointing to said tensor variable v;
Step 4: analyzing the constraint relationships between tensor variables in the computation graph, comprising the following sub-steps:
Step 4.1: defining the constraint representation of the address assignment operation between tensor variables in the computation graph;
Step 4.2: defining the constraint representation of the assignment operation between tensor variables in the computation graph;
Step 4.3: defining the constraint representation of the load operation on tensor-variable pointer sets in the computation graph;
Step 4.4: defining the constraint representation of the store operation on tensor-variable pointer sets in the computation graph;
Step 5: iteratively constructing the topological graph of the intermediate representation based on the constraint relationships of the tensor variables of the computation graph, comprising the following sub-steps:
Step 5.1: constructing the propagation process of the intermediate representation of the computation graph based on the constraint representation of the assignment operation between tensor variables;
Step 5.2: constructing the load process of tensor-variable pointer sets in the intermediate representation based on the constraint representation of the load operation on tensor-variable pointer sets;
Step 5.3: constructing the store process of tensor-variable pointer sets in the intermediate representation based on the constraint representation of the store operation on tensor-variable pointer sets;
Step 6: based on the intermediate representation, analyzing tensor variables with different aliases that point to the same memory location, and allocating registers for them.
Step 4.1 defines the constraint representation of the address assignment operation in the computation graph, i.e., the constraint representation method of assigning the address of tensor variable b to tensor variable a. If tensor variable b and tensor variable a have the relation that the address of b is assigned to a, e.g. a = &b, then a basic constraint relationship is defined between b and a: the set containing tensor variable b is included in the pointer set pointing to tensor variable a. The constraint representation of the address assignment operation is:
a = &b  ⇒  {b} ⊆ P(a)
Figure 1 shows the constraint representation process of the address assignment operation in the computation graph.
Step 4.2 defines the constraint representation of the assignment operation in the computation graph, i.e., the constraint representation method of assigning tensor variable b to tensor variable a. If tensor variable b and tensor variable a have the relation that b is assigned to a, e.g. a = b, then the constraint relationship of the assignment operation is defined between b and a: the pointer set pointing to b is included in the pointer set pointing to a. The constraint representation of the assignment operation is:
a = b  ⇒  P(b) ⊆ P(a)
Figure 2 shows the constraint representation process of the assignment operation in the computation graph.
Step 4.3 defines the constraint representation of the load operation on tensor-variable pointer sets in the computation graph, i.e., the constraint representation method of loading the elements of the pointer set P(b) pointing to tensor variable b into the pointer set P(a) pointing to tensor variable a. The constraint representation of the load operation is:
If tensors a and b satisfy that an element of the pointer set P(b) is assigned to a, and P(b) contains the element tensor variable t, i.e. a = *b and t ∈ P(b), then a constraint relationship exists between a and b: the constraint relationship of loading the pointer set P(b) into the pointer set P(a) is that the pointer set pointing to t is included in the pointer set pointing to a:
a = *b and t ∈ P(b)  ⇒  P(t) ⊆ P(a)
Figure 3 shows the constraint representation process of the load operation on tensor-variable pointer sets in the computation graph.
Step 4.4 defines the constraint representation of the store operation on tensor-variable pointer sets in the computation graph, i.e., the constraint representation method of storing the pointer set P(b) pointing to tensor variable b into the pointer sets of the elements of the pointer set P(a) pointing to tensor variable a. The constraint relationship of the store operation is:
If tensor variable b and tensor variable a satisfy that P(b) is assigned to the elements of the pointer set pointing to a, and the pointer set pointing to a contains the element tensor variable t, e.g. *a = b and t ∈ P(a), then a constraint relationship exists between a and b: the constraint relationship of storing P(b) into the pointer sets of the elements of P(a) is that the pointer set pointing to b is included in the pointer set P(t) of element t of the pointer set pointing to a. The constraint representation of the store operation is:
*a = b and t ∈ P(a)  ⇒  P(b) ⊆ P(t)
Figure 4 shows the constraint representation process of the store operation on tensor-variable pointer sets in the computation graph.
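The four constraint forms of steps 4.1–4.4 can be sketched in code. The following is an illustrative Python sketch, not part of the claimed method; the encoding of computation-graph statements as tagged tuples and the function name `constraint` are assumptions made for the example.

```python
def constraint(stmt):
    """Map one computation-graph statement to its inclusion constraint.

    Statements are tagged tuples (the tags are illustrative):
      ("addr", a, b)  for a = &b   ->  {b} ⊆ P(a)
      ("copy", a, b)  for a = b    ->  P(b) ⊆ P(a)
      ("load", a, b)  for a = *b   ->  P(t) ⊆ P(a) for each t ∈ P(b)
      ("store", a, b) for *a = b   ->  P(b) ⊆ P(t) for each t ∈ P(a)
    """
    kind, a, b = stmt
    if kind == "addr":
        return f"{{{b}}} ⊆ P({a})"
    if kind == "copy":
        return f"P({b}) ⊆ P({a})"
    if kind == "load":
        return f"∀t∈P({b}): P(t) ⊆ P({a})"
    if kind == "store":
        return f"∀t∈P({a}): P({b}) ⊆ P(t)"
    raise ValueError(f"unknown statement kind: {kind}")
```

For example, `constraint(("addr", "y", "x"))` renders the constraint generated by y = &x.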
Step 5.1 constructs the propagation process of the intermediate representation of the computation graph, which refers to the sets containing tensor variables propagating along the direction of the constraint-representation-based edges of the computation graph. The modeling of the propagation process of the intermediate representation is as follows:
(i) Intermediate representation of the computation graph: for the constraint representation of the assignment operation, if tensor variable b and tensor variable a have the relation that b is assigned to a, e.g. a = b, then b and a have the constraint representation P(b) ⊆ P(a), meaning the pointer set pointing to b is included in the pointer set pointing to a.
(ii) Graph construction from the constraint representation: for the constraint that the pointer set pointing to b is included in the pointer set pointing to a, P(b) ⊆ P(a), the topological graph based on the constraint generates an edge from the pointer-set node P(b) of tensor variable b to the pointer-set node P(a) of tensor variable a.
(iii) Propagation of tensor variables: since the pointer set pointing to b is included in the pointer set pointing to a, all tensor elements contained in P(b) flow into P(a). Therefore, whenever the execution flow of the computation graph passes through a constraint edge representing an assignment operation, the edge propagates from node P(b) to node P(a); that is, the pointer of the tail node of the constraint also points to the set of tensor variables pointed to by the pointer of the start node of the constraint edge. Figure 5 shows the propagation process of constructing the intermediate representation of the computation graph.
Step 5.2 constructs the load process of tensor-variable pointer sets in the intermediate representation of the computation graph, which refers to loading the elements of the pointer set P(b) pointing to tensor variable b into the pointer set P(a) pointing to tensor variable a, with the elements of P(b) propagating along the direction of the constraint-representation-based edges of the computation graph. The modeling of the load process of the intermediate representation is as follows:
(i) Intermediate representation of the computation graph: for the constraint representation of the load operation, if tensor variable b and tensor variable a satisfy that an element of P(b) is assigned to a, and P(b) contains the element tensor variable t, i.e. a = *b and t ∈ P(b), then the constraint representation of loading P(b) into P(a) is P(t) ⊆ P(a) with t ∈ P(b), meaning the pointer set pointing to t (an element t ∈ P(b) of the pointer set pointing to b) is included in the pointer set pointing to a.
(ii) Graph construction from the constraint representation: for the constraint P(t) ⊆ P(a) with t ∈ P(b), the topological graph based on the constraint generates an edge from the pointer-set node of tensor variable t to the pointer-set node of tensor variable a.
(iii) Propagation of tensor variables: since P(t) ⊆ P(a) with t ∈ P(b), the pointer sets of all tensor elements contained in P(b) flow into P(a). Therefore, whenever the execution flow of the computation graph passes through a constraint edge representing a load operation, the edge propagates from the corresponding element's pointer-set node P(t) to the pointer-set node P(a). Figure 6 shows the load process of tensor-variable pointer sets in the intermediate representation of the computation graph.
Step 5.3 constructs the store process of tensor-variable pointer sets in the intermediate representation of the computation graph, which refers to storing the pointer set P(b) pointing to tensor variable b into the pointer sets of the elements of the pointer set P(a) pointing to tensor variable a, with P(b) propagating along the direction of the constraint-representation-based edges of the computation graph. The modeling of the store process of the intermediate representation is as follows:
(i) Intermediate representation of the computation graph: for the constraint representation of the store operation, if tensor variable b and tensor variable a satisfy that P(b) is assigned to the elements of P(a), and P(a) contains the element tensor variable t, e.g. *a = b and t ∈ P(a), then the constraint representation of storing P(b) into the pointer sets of the elements of P(a) is P(b) ⊆ P(t) with t ∈ P(a), meaning the pointer set pointing to b is included in the pointer set P(t) of element t of the pointer set pointing to a.
(ii) Graph construction from the constraint representation: for the constraint that P(b) is included in the pointer set of tensor variable t (an element t ∈ P(a) of the pointer set pointing to a), P(b) ⊆ P(t) with t ∈ P(a), the topological graph based on the constraint generates an edge from the pointer-set node of tensor variable b to the pointer-set node of tensor variable t.
(iii) Propagation of tensor variables: since P(b) ⊆ P(t) with t ∈ P(a), the pointer-set node of b flows to the pointer-set node of t. Therefore, whenever the execution flow of the computation graph passes through a constraint edge representing a store operation, the edge propagates from node P(b) to the pointer-set node P(t) of the corresponding element t of the pointer set pointing to a. Figure 7 shows the store process of tensor-variable pointer sets in the intermediate representation of the computation graph.
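Taken together, steps 5.1–5.3 amount to propagating points-to sets along the inclusion edges of the topological graph until nothing changes. A minimal Python sketch of that fixed-point propagation follows; the data layout (a dict of sets plus a list of edges) is an assumption for illustration, not the patent's data structure:

```python
from collections import defaultdict

def propagate(pts, edges):
    """Propagate points-to sets along inclusion edges to a fixed point.

    pts:   dict mapping a tensor variable v to P(v), the set of variables
           that pointers to v may reference.
    edges: iterable of (src, dst) pairs, each meaning P(src) ⊆ P(dst).
    """
    changed = True
    while changed:
        changed = False
        for src, dst in edges:
            missing = pts[src] - pts[dst]  # elements not yet propagated
            if missing:
                pts[dst] |= missing
                changed = True
    return pts
```

For the assignment w = x of the worked example below, the single edge P(x) ⊆ P(w) carries the set {z} from P(x) into P(w), matching the propagation shown for the assignment operation.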
Step 6 is specifically: if it is found in the topological graph of the intermediate representation that the pointer set pointing to some tensor variable contains tensor variables with different aliases, then, since their memory addresses are the same, these differently aliased tensor variables are regarded as the same tensor variable and are allocated the same free register.
Specifically, the process of the intermediate representation method for compiling computation graphs is as follows:
Step one: construct, for each tensor variable in the computation graph, a node for the pointer set pointing to that tensor variable. The computation graph is as follows:
y = &x;
x = &z;
w = x;
*w = y;
x = *w;
The meanings of the operations in the computation graph are as follows:
b = &a: pointer variable b points to the address of tensor variable a, where &a takes the memory address of tensor variable a;
b = a: pointer variable b points to tensor variable a;
*b = a: the elements of pointer variable b point to tensor variable a, where *b dereferences the pointer to tensor variable b, i.e., obtains the elements contained in the pointer set pointing to tensor variable b;
b = *a: pointer variable b points to the elements of pointer variable a.
Step two: analyze the constraint relationships between the tensor variables in the computation graph:
y = &x  ⇒  {x} ⊆ P(y)
x = &z  ⇒  {z} ⊆ P(x)
w = x  ⇒  P(x) ⊆ P(w)
*w = y  ⇒  t ∈ P(w) and P(y) ⊆ P(t)
x = *w  ⇒  t ∈ P(w) and P(t) ⊆ P(x)
The meanings of the constraint relationships between the tensor variables in the computation graph are as follows:
(1) Address assignment operation a = &b ⇒ {b} ⊆ P(a): if tensor variable b and tensor variable a have the relation that the address of b is assigned to a, e.g. a = &b, then a basic constraint relationship is defined between b and a: the set {b} containing tensor variable b is included in the pointer set P(a) pointing to tensor variable a, i.e. {b} ⊆ P(a).
(2) Assignment operation a = b ⇒ P(b) ⊆ P(a): if tensor variable b and tensor variable a have the relation that b is assigned to a, e.g. a = b, then the constraint relationship of the assignment operation is defined between b and a: the pointer set P(b) is included in the pointer set P(a), i.e. P(b) ⊆ P(a).
(3) Store operation *a = b ⇒ t ∈ P(a) and P(b) ⊆ P(t): if the pointer set pointing to tensor variable a contains the element tensor variable t, and the pointer set P(b) is assigned to the elements of the pointer set pointing to a, i.e. *a = b and t ∈ P(a), then a constraint relationship exists between a and b: the constraint relationship of storing P(b) into the pointer set P(t) of element t of P(a) is that P(b) is included in P(t), i.e. P(b) ⊆ P(t) with t ∈ P(a).
(4) Load operation a = *b ⇒ t ∈ P(b) and P(t) ⊆ P(a): if the pointer set pointing to tensor variable b contains the element tensor variable t, i.e. t ∈ P(b), and the element t of P(b) is assigned to a, i.e. a = *b with t ∈ P(b), then a constraint relationship exists between a and b: the constraint relationship of loading the pointer set P(b) into the pointer set P(a) is that the pointer set P(t) of the element t contained in P(b) is included in P(a), i.e. P(t) ⊆ P(a) with t ∈ P(b).
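As a sketch of step two, the base points-to facts and copy edges for the example program can be derived mechanically from the address assignment and assignment rules. The tagged-tuple encoding of the five statements is an assumption for illustration; load and store constraints are resolved later, during the iterative construction of the topological graph:

```python
# Example program: y = &x; x = &z; w = x; *w = y; x = *w
program = [("addr", "y", "x"), ("addr", "x", "z"), ("copy", "w", "x"),
           ("store", "w", "y"), ("load", "x", "w")]

base = {}      # variable v -> initial P(v), from address assignments
edges = set()  # (src, dst) meaning P(src) ⊆ P(dst), from plain assignments
for kind, a, b in program:
    if kind == "addr":    # a = &b  ->  {b} ⊆ P(a)
        base.setdefault(a, set()).add(b)
    elif kind == "copy":  # a = b   ->  P(b) ⊆ P(a)
        edges.add((b, a))
# base holds {x} ⊆ P(y) and {z} ⊆ P(x); edges holds P(x) ⊆ P(w)
```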
Step three: iteratively construct the topological graph of the intermediate representation based on the constraint relationships of the tensor variables of the computation graph; that is, the intermediate representation topological graph is constructed iteratively according to the constraint relationships of the tensor variables until the structure of the topological graph no longer changes.
First iteration of constructing the intermediate representation topological graph, comprising the following process:
(1) Construct the intermediate representation of the address assignment operations: from the address assignments y = &x and x = &z, the analysis yields the constraints {x} ⊆ P(y) and {z} ⊆ P(x) among the tensor variables x, y and z. Therefore the topological structure in which the set containing x is included in the pointer-set node of y is constructed and, likewise, the topological structure in which the set containing z is included in the pointer-set node of x. Figure 8 shows the intermediate representation process of the address assignment operations.
(2) Construct the intermediate representation of the assignment operation: from the assignment w = x, the analysis yields the constraint P(x) ⊆ P(w) between tensor variables x and w. Therefore the topological structure in which the pointer-set node of x is included in the pointer-set node of w is constructed; that is, execution-flow information flows from the pointer-set node of x to the pointer-set node of w.
(3) Propagation of the tensor variables of the assignment operation: after the assignment step, the set {z} containing tensor variable z is propagated to the pointer-set node P(w) of w, so z ∈ P(w) holds. Figure 9 shows the intermediate representation process of the assignment operation.
(4) Construct the intermediate representation of the store operation: from the store operation *w = y, since the pointer-set node P(w) contains the element z, the analysis yields that the pointer-set node of y flows to the pointer-set node of z: P(y) ⊆ P(z) with z ∈ P(w).
(5) Propagation of the tensor variables of the store operation: after the store step, the pointer-set node P(y) flows to the pointer-set node P(z); moreover, since x ∈ P(y), i.e. the pointer-set node of y contains the element tensor variable x, the set {x} containing x is also propagated from node P(y) to node P(z), giving x ∈ P(z). Figure 10 shows the intermediate representation process of the store operation.
(6) Construct the intermediate representation of the load operation: from the load operation x = *w, since the pointer-set node P(w) contains the element z, the analysis yields that the pointer-set node of z flows to the pointer-set node of x: P(z) ⊆ P(x) with z ∈ P(w).
(7) Propagation of the tensor variables of the load operation: after the load step, the pointer-set node P(z) flows to the pointer-set node P(x); moreover, since x ∈ P(z), i.e. the pointer-set node of z contains the element tensor variable x, the set {x} containing x is also propagated from node P(z) to node P(x), and then onward to node P(w); therefore x ∈ P(x) and x ∈ P(w). Figure 11 shows the intermediate representation process of the load operation.
Second iteration of constructing the intermediate representation topological graph, comprising the following process: since the first iteration added the element x to the pointer node P(x) of tensor variable x and to the pointer node P(w) of tensor variable w, the computation operations involving *x and *w in which P(x) and P(w) participate need to be updated iteratively. Because only *w participates in operations in the computation graph, the operations *w = y and x = *w need to be updated.
(i) For the operation x = *w: after the first iteration, P(w) has been updated to P(w) = {z, x}, so the operation primitives involving the pointer-set node P(w) are updated as follows: from the load operation x = *w, since the only update to the constraints of P(w) is the addition of x ∈ P(w), we obtain P(x) ⊆ P(x) with x ∈ P(w). Because P(x) ⊆ P(x) means that the pointer-set node P(x) flows to itself, the intermediate representation graph does not need to be updated.
(ii) For the operation *w = y: after the first iteration, P(w) has been updated to P(w) = {z, x}, so the operation primitives involving the pointer-set node P(y) are updated as follows: from the store operation *w = y, since the only update to the constraints of P(w) is the addition of x ∈ P(w), we obtain P(y) ⊆ P(x) with x ∈ P(w). Because P(y) ⊆ P(x) means that the pointer-set node P(y) flows to the pointer-set node P(x), the intermediate representation graph is updated by adding an edge propagating from node P(y) to node P(x). Figure 12 shows this intermediate representation process.
After the second iteration updates the intermediate representation topological graph, the structure of the topological graph no longer changes, which completes the process of compiling the computation graph into an intermediate representation based on the constraint relationships of its tensor variables.
Analyze, based on the intermediate representation, tensor variables with different aliases that point to the same memory location: the analysis of the topological graph of the intermediate representation shows that the pointer set pointing to tensor variable x contains the elements x and z, so tensor variables x and z in the computation graph have an alias relationship; the memory addresses of x and z are the same, indicating that x and z are the same tensor variable.
Allocate registers for the tensor variables in the computation graph: because the memory addresses of tensor variables x and z are the same, only one free register needs to be allocated for x and z.
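The whole worked example, constraint generation, the two rounds of iteration, and the alias conclusion, can be reproduced with a small inclusion solver in the style of Andersen's points-to analysis, which the described analysis closely resembles. This is an illustrative sketch under that assumption (function and variable names are made up for the example), not the patented implementation:

```python
from collections import defaultdict

def solve(program):
    """Inclusion-based points-to analysis over tagged statements."""
    pts = defaultdict(set)    # pts[v] = P(v)
    edges = defaultdict(set)  # edges[v] = nodes u with P(v) ⊆ P(u)
    for kind, a, b in program:
        if kind == "addr":    # a = &b  ->  b ∈ P(a)
            pts[a].add(b)
        elif kind == "copy":  # a = b   ->  P(b) ⊆ P(a)
            edges[b].add(a)
    changed = True
    while changed:
        changed = False
        # resolve load/store constraints against the current points-to sets
        for kind, a, b in program:
            if kind == "load":      # a = *b: P(t) ⊆ P(a) for each t ∈ P(b)
                for t in list(pts[b]):
                    if a not in edges[t]:
                        edges[t].add(a)
                        changed = True
            elif kind == "store":   # *a = b: P(b) ⊆ P(t) for each t ∈ P(a)
                for t in list(pts[a]):
                    if t not in edges[b]:
                        edges[b].add(t)
                        changed = True
        # propagate points-to sets along the inclusion edges
        for src in list(edges):
            for dst in list(edges[src]):
                missing = pts[src] - pts[dst]
                if missing:
                    pts[dst] |= missing
                    changed = True
    return pts

# y = &x; x = &z; w = x; *w = y; x = *w
program = [("addr", "y", "x"), ("addr", "x", "z"), ("copy", "w", "x"),
           ("store", "w", "y"), ("load", "x", "w")]
points_to = solve(program)
# x and z both end up in P(x): they alias the same memory location,
# so a single free register serves both of them
```

At the fixed point, P(x) contains both x and z, reproducing the conclusion above that x and z are aliases and need only one register.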
Referring to Figure 14, an embodiment of the present invention further provides an intermediate representation apparatus for compiling computation graphs, which includes a memory and one or more processors; executable code is stored in the memory, and when the one or more processors execute the executable code, they implement the intermediate representation method for compiling computation graphs of the above embodiment.
The embodiment of the intermediate representation apparatus for compiling computation graphs of the present invention can be applied to any device with data processing capabilities, such as a computer. The apparatus embodiment can be implemented by software, by hardware, or by a combination of software and hardware. Taking software implementation as an example, the apparatus in the logical sense is formed by the processor of the device on which it resides reading the corresponding computer program instructions from non-volatile storage into memory and running them. At the hardware level, Figure 14 is a hardware structure diagram of a device with data processing capabilities on which the intermediate representation apparatus of the present invention resides; in addition to the processor, memory, network interface and non-volatile storage shown in Figure 14, the device on which the apparatus of the embodiment resides may also include other hardware according to its actual function, which is not described in detail here. The implementation process of the functions and effects of each unit in the above apparatus is detailed in the implementation process of the corresponding steps of the above method and is not repeated here.
Since the apparatus embodiment basically corresponds to the method embodiment, refer to the description of the method embodiment for the relevant parts. The apparatus embodiment described above is only illustrative: the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of the present invention, which those of ordinary skill in the art can understand and implement without creative effort.
An embodiment of the present invention further provides a computer-readable storage medium on which a program is stored; when the program is executed by a processor, it implements the intermediate representation method for compiling computation graphs of the above embodiment.
The computer-readable storage medium may be an internal storage unit, such as a hard disk or memory, of any device with data processing capabilities described in any of the foregoing embodiments. The computer-readable storage medium may also be an external storage device of such a device, such as a plug-in hard disk, Smart Media Card (SMC), SD card, or flash card provided on the device. Further, the computer-readable storage medium may include both the internal storage unit and the external storage device of the device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the device, and may also be used to temporarily store data that has been output or is to be output.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

  1. An intermediate representation method for compiling computation graphs, characterized in that the intermediate representation method comprises the following steps:
    Step 1: compiling a neural network into a computation graph for neural network computation;
    Step 2: constructing a node for each tensor variable in the computation graph;
    Step 3: associating the node representing a tensor variable in the computation graph with the pointer set pointing to said tensor variable;
    Step 4: analyzing the constraint relationships between tensor variables in the computation graph, comprising the following sub-steps:
    Step 4.1: defining the constraint representation of the address assignment operation between tensor variables in the computation graph;
    Step 4.2: defining the constraint representation of the assignment operation between tensor variables in the computation graph;
    Step 4.3: defining the constraint representation of the load operation on tensor-variable pointer sets in the computation graph;
    Step 4.4: defining the constraint representation of the store operation on tensor-variable pointer sets in the computation graph;
    Step 5: iteratively constructing the topological graph of the intermediate representation based on the constraint relationships of the tensor variables of the computation graph, comprising the following sub-steps:
    Step 5.1: constructing the propagation process of the intermediate representation of the computation graph based on the constraint representation of the assignment operation between tensor variables;
    Step 5.2: constructing the load process of tensor-variable pointer sets in the intermediate representation based on the constraint representation of the load operation on tensor-variable pointer sets;
    Step 5.3: constructing the store process of tensor-variable pointer sets in the intermediate representation based on the constraint representation of the store operation on tensor-variable pointer sets;
    Step 6: based on the intermediate representation, analyzing tensor variables with different aliases that point to the same memory location, and allocating registers for them.
  2. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 4.1, defining the constraint representation of the address assignment operation between tensor variables in the computation graph, refers to the constraint representation method of assigning the address of tensor variable b to tensor variable a, specifically: if tensor variable b and tensor variable a have the relation that the address of b is assigned to a, then a basic constraint relationship is defined between b and a: the set containing tensor variable b is included in the pointer set pointing to tensor variable a.
  3. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 4.2, defining the constraint representation of the assignment operation between tensor variables in the computation graph, refers to the constraint representation method of assigning tensor variable b to tensor variable a, specifically: if tensor variable b and tensor variable a have the relation that b is assigned to a, then the constraint relationship of the assignment operation is defined between b and a: the pointer set pointing to tensor variable b is included in the pointer set pointing to tensor variable a.
  4. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 4.3, defining the constraint representation of the load operation on tensor-variable pointer sets in the computation graph, refers to the constraint representation method of loading the elements of the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a, specifically: if tensors a and b satisfy that an element of the pointer set pointing to b is assigned to a, and the pointer set pointing to b contains the element tensor variable t, then a constraint relationship is defined between a and b: the constraint relationship of loading the pointer set pointing to b into the pointer set pointing to a is that the pointer set pointing to t is included in the pointer set pointing to a.
  5. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 4.4, defining the constraint representation of the store operation on tensor-variable pointer sets in the computation graph, refers to the constraint representation method of storing the pointer set pointing to tensor variable b into the pointer sets of the elements of the pointer set pointing to tensor variable a, specifically: if tensor variables b and a satisfy that the pointer set pointing to b is assigned to the elements of the pointer set pointing to a, and the pointer set pointing to a contains the element tensor variable t, then a constraint relationship is defined between a and b: the constraint relationship of storing the pointer set pointing to b into the pointer sets of the elements of the pointer set pointing to a is that the pointer set pointing to b is included in the pointer set of element t of the pointer set pointing to a.
  6. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 5.1 refers to the sets containing tensor variables propagating along the direction of the constraint-representation-based edges of the computation graph, with the following sub-steps:
    Step 5.1.1: constructing the intermediate representation of the assignment operation in the computation graph: for the constraint representation of the assignment operation, if tensor variable b and tensor variable a have the relation that b is assigned to a, then b and a have the constraint representation: the pointer set pointing to b is included in the pointer set pointing to a;
    Step 5.1.2: constructing the topological graph based on the constraint representation of the assignment operation, specifically: for the constraint that the pointer set pointing to b is included in the pointer set pointing to a, the topological graph generates an edge from the node of the pointer set of b to the node of the pointer set of a;
    The propagation process of the tensor variables of the assignment operation in the above topological graph is: when the execution flow of the computation graph passes through the constraint edge representing the assignment operation, the edge propagates from the pointer-set node of b to the pointer-set node of a, and the pointer of the tail node of the constraint also points to the set of tensor variables pointed to by the pointer of the start node of the constraint edge.
  7. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 5.2 refers to loading the elements of the pointer set pointing to tensor variable b into the pointer set pointing to tensor variable a, with the elements of the pointer set pointing to b propagating along the direction of the constraint-representation-based edges of the computation graph, with the following sub-steps:
    Step 5.2.1: constructing the intermediate representation of the load operation in the computation graph: for the constraint representation of the load operation, if tensor variables b and a satisfy that an element of the pointer set pointing to b is assigned to a, and the pointer set pointing to b contains the element tensor variable t, then the constraint representation of loading the pointer set pointing to b into the pointer set pointing to a is: the pointer set pointing to t is included in the pointer set pointing to a;
    Step 5.2.2: constructing the topological graph based on the constraint representation of the load operation, specifically: for the constraint that the pointer set of tensor variable t, an element of the pointer set pointing to b, is included in the pointer set pointing to a, the topological graph generates an edge from the node of the pointer set of t to the node of the pointer set of a;
    The propagation process of the tensor variables of the load operation in the above topological graph is: when the execution flow of the computation graph passes through the constraint edge representing the load operation, the edge propagates from the pointer-set node of the corresponding element to the pointer-set node of a.
  8. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 5.3 refers to storing the pointer set pointing to tensor variable b into the pointer sets of the elements of the pointer set pointing to tensor variable a, with the pointer set pointing to b propagating along the direction of the constraint-representation-based edges of the computation graph, with the following sub-steps:
    Step 5.3.1: constructing the intermediate representation of the store operation in the computation graph: for the constraint representation of the store operation, if tensor variables b and a satisfy that the pointer set pointing to b is assigned to the elements of the pointer set pointing to a, and the pointer set pointing to a contains the element tensor variable t, then the constraint representation of storing the pointer set pointing to b into the pointer sets of the elements of the pointer set pointing to a is: the pointer set pointing to b is included in the pointer set of element t of the pointer set pointing to a;
    Step 5.3.2: constructing the topological graph based on the constraint representation of the store operation, specifically: for the constraint that the pointer set pointing to b is included in the pointer set pointing to t (t being an element of the pointer set pointing to a), the topological graph generates an edge from the node of the pointer set of b to the node of the pointer set of t;
    The propagation process of the tensor variables of the store operation in the above topological graph is: when the execution flow of the computation graph passes through the constraint edge representing the store operation, the edge propagates from the pointer-set node of b to the pointer-set node of the corresponding element t of the pointer set pointing to a.
  9. The intermediate representation method for compiling computation graphs according to claim 1, characterized in that step 6 is specifically: if it is found in the topological graph of the intermediate representation that the pointer set pointing to some tensor variable contains tensor variables with different aliases, then, since their memory addresses are the same, these differently aliased tensor variables are regarded as the same tensor variable and are allocated the same free register.
  10. An intermediate representation apparatus for compiling computation graphs, characterized in that the apparatus comprises a memory and one or more processors, executable code being stored in the memory, and the one or more processors, when executing the executable code, implementing the intermediate representation method for compiling computation graphs according to any one of claims 1 to 9.
PCT/CN2022/124002 2022-09-27 2022-10-09 Intermediate representation method and apparatus for compiling computation graphs WO2024065866A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/071,958 US20240104016A1 (en) 2022-09-27 2022-11-30 Intermediate Representation Method and Apparatus for Compiling Computation Graphs

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211177783.1 2022-09-27
CN202211177783.1A CN115756474A (zh) Intermediate representation method and apparatus for compiling computation graphs

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/071,958 Continuation US20240104016A1 (en) 2022-09-27 2022-11-30 Intermediate Representation Method and Apparatus for Compiling Computation Graphs

Publications (1)

Publication Number Publication Date
WO2024065866A1 true WO2024065866A1 (zh) 2024-04-04

Family

ID=85350265

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/124002 WO2024065866A1 (zh) 2022-09-27 2022-10-09 一种用于计算图编译的中间表示方法及装置

Country Status (2)

Country Link
CN (1) CN115756474A (zh)
WO (1) WO2024065866A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117008916B * 2023-07-06 2024-08-20 清华大学 Tensor program optimization method and apparatus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200110984A1 (en) * 2018-10-09 2020-04-09 Hewlett Packard Enterprise Development Lp Avoiding cycles in neural networks
US20200319861A1 (en) * 2019-04-02 2020-10-08 Graphcore Limited Compiling a Program from a Graph
CN114186687A * 2022-02-17 2022-03-15 之江实验室 Intermediate representation method and apparatus for neural network model computation
CN114492772A * 2021-11-16 2022-05-13 阿里云计算有限公司 Neural network tensor shape tracking method and computing platform
CN114936099A * 2022-07-25 2022-08-23 之江实验室 Graph optimization method and apparatus for neural network computation

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200110984A1 (en) * 2018-10-09 2020-04-09 Hewlett Packard Enterprise Development Lp Avoiding cycles in neural networks
US20200319861A1 (en) * 2019-04-02 2020-10-08 Graphcore Limited Compiling a Program from a Graph
CN114492772A * 2021-11-16 2022-05-13 阿里云计算有限公司 Neural network tensor shape tracking method and computing platform
CN114186687A * 2022-02-17 2022-03-15 之江实验室 Intermediate representation method and apparatus for neural network model computation
CN114936099A * 2022-07-25 2022-08-23 之江实验室 Graph optimization method and apparatus for neural network computation

Also Published As

Publication number Publication date
CN115756474A (zh) 2023-03-07

Similar Documents

Publication Publication Date Title
US9996394B2 (en) Scheduling accelerator tasks on accelerators using graphs
WO2024021192A1 (zh) Graph optimization method and apparatus for neural network computation
Acar et al. Adaptive functional programming
US11900113B2 (en) Data flow processing method and related device
JP2007528059A (ja) ソフトウェアのモデル化、抽象、および分析のためのシステムと方法
WO2024065867A1 (zh) Memory optimization method and apparatus for neural network compilation
Rawat et al. Resource conscious reuse-driven tiling for GPUs
WO2023093185A1 (zh) Data flow method and apparatus for neural network computation
US20080229297A1 (en) Method and system for reducing memory reference overhead associated with treadprivate variables in parallel programs
Shun Shared-memory parallelism can be simple, fast, and scalable
WO2024065866A1 (zh) Intermediate representation method and apparatus for compiling computation graphs
CN114330735A (zh) Method, electronic device and computer program product for processing machine learning models
US20240104016A1 (en) Intermediate Representation Method and Apparatus for Compiling Computation Graphs
WO2023221626A1 (zh) Memory allocation method and apparatus
WO2023082901A1 (zh) Optimization method and apparatus for compiling computation graphs
WO2020238348A1 (zh) Block verification method, apparatus and device
Zaki et al. Implementation, scheduling, and adaptation of partial expansion graphs on multicore platforms
CN115269205B (zh) Memory optimization method and apparatus for neural network computation
US11762641B2 (en) Allocating variables to computer memory
US10732946B2 (en) Simulation-based code duplication
WO2024065869A1 (zh) Instruction execution method and apparatus for graph computation
US20150082443A1 (en) System to automate compliance with licenses of software third-party content
US11922152B2 (en) Workload oriented constant propagation for compiler
US20240104341A1 (en) Memory optimization method and apparatus for neural network compilation
Coelho et al. ACQuA: A Parallel Accelerator Architecture for Pure Functional Programs

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22960470

Country of ref document: EP

Kind code of ref document: A1