CN113011585B - Compiling optimization method, system, equipment and storage medium for eliminating splicing operator - Google Patents


Info

Publication number: CN113011585B
Authority: CN (China)
Prior art keywords: operator, splicing, array, address information, splicing operator
Legal status: Active (the legal status is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Application number: CN202110295853.2A
Other languages: Chinese (zh)
Other versions: CN113011585A
Inventors: 谭黎敏, 田承雷, 宋捷
Assignee (current and original): Shanghai Xijing Technology Co ltd
Application filed by Shanghai Xijing Technology Co ltd
Priority to CN202110295853.2A; published as CN113011585A, granted and published as CN113011585B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods
    • G06N 3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/04 — Architecture, e.g. interconnection topology

Abstract

The invention provides a compilation optimization method, system, device and storage medium for eliminating a splicing (concat) operator. The method comprises the following steps: searching a neural network model for a splicing operator to be eliminated; acquiring the address information of the splicing operator's output array; acquiring the address information of the splicing operator's input arrays; updating the address information of the input arrays according to the address information of the output array, so that the address information of the input arrays, when combined, corresponds to the address information of the output array; and deleting the splicing operator from the neural network model. By eliminating the splicing operator at compile time, the invention reduces the model size, frees the running time of the neural network model from the execution time of the splicing operator, and accelerates the inference speed of the neural network model.

Description

Compiling optimization method, system, equipment and storage medium for eliminating splicing operator
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a compiling optimization method, system, device, and storage medium for eliminating a splicing operator.
Background
A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to units within a local receptive field; it performs excellently on large-scale image processing. It includes convolution layers (convolutional layers) and pooling layers. Convolutional neural networks have been widely used for image classification, object recognition, and object tracking.
Because the inference of convolutional neural networks requires enormous computation, dedicated AI (Artificial Intelligence) processing chips have emerged. A model usually needs to be converted and optimized before it can run on a dedicated chip, a process carried out by an AI compiler. The optimization stage focuses on reducing the model size and shortening the run time. Optimization mainly comprises the following:
1. operator optimization
2. Graph optimization
3. Model compression
An important technique in computational-graph optimization is operator fusion: by merging operators, it reduces both the amount of computation and the number of memory accesses.
Operator fusion is based on observation of deep-learning topology patterns. Deep-learning operators can be divided into two categories:
Computation-intensive operators, such as convolution and fully connected layers, which perform heavy arithmetic at runtime.
Memory-intensive operators, such as ReLU and splice (concat), which access memory frequently at runtime.
In a typical deep-learning model, computation-intensive and memory-intensive operators generally appear together, for example "Conv + ReLU". Taking a GPU (Graphics Processing Unit) as an example, the two operators can be combined into one composite operator: after the GPU has performed Conv, it performs ReLU directly in video memory, reducing interaction with main memory.
A stitching operator (concat) is a common operator in neural networks that joins multiple input tensors along a specified axis. It belongs to the memory-intensive category: its cost is dominated by memory-access time. On every hardware platform, whenever the splicing operation is performed, its execution time is determined by the amount of data copied and the available memory bandwidth.
For memory-intensive operators, an AI compiler can reduce memory accesses by fusing adjacent operators. However, the stitching operator is generally used to fuse features from different layers, and its input arrays are usually far apart both in the computational graph and in actual memory, so the adjacency condition for fusion is not satisfied. Conventional memory-intensive operator fusion therefore cannot be applied to the stitching operator.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a compilation optimization method, system, device and storage medium for eliminating the splicing operator, which accelerate the inference speed of a neural network model by eliminating the splicing operator at compile time.
The embodiment of the invention provides a compiling optimization method for eliminating a splicing operator, which comprises the following steps:
s100: searching a splicing operator to be eliminated in the neural network model;
s200: acquiring address information of an output array of the splicing operator;
s300: acquiring address information of an input array of the splicing operator;
s400: updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
s500: and deleting the splicing operator in the neural network model.
In some embodiments, the address information of the output array of the splice operator includes a start address of the output array, and the address information of the input array of the splice operator includes the start address of the input array and an array length;
In the step S400, updating the address information of the input array of the splicing operator includes the following steps:
and taking the starting address of the output array of the splicing operator as the starting address of the first input array, wherein the starting address of each input array is equal to the sum of the starting address of the previous input array and the array length of the previous input array except for the first input array.
In some embodiments, before the step S400 uses the start address of the output array of the splicing operator as the start address of the first input array, the method further includes the following steps:
and sequencing the input arrays according to the splicing sequence of the splicing operators to the input arrays.
In some embodiments, the step S100: searching a splicing operator to be eliminated in a neural network model, comprising the following steps:
traversing an operator list of the neural network model, and searching for uneliminated splicing operators;
and taking the searched splicing operator as the splicing operator to be eliminated.
In some embodiments, the step S200: the method for obtaining the address information of the output array of the splicing operator comprises the following steps: acquiring the DDR offset address of the output array of the splicing operator according to the operator parameters of the neural network model;
The step S300: the method for obtaining the address information of the input array of the splicing operator comprises the following steps: and acquiring the DDR offset address and the array length of the input array of the splicing operator according to the operator parameters of the neural network model.
In some embodiments, the step S400: updating the address information of the input array of the splicing operator, comprising the following steps:
acquiring a splicing sequence of the splicing operator to an input array according to operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
and sequentially updating the DDR offset addresses of the input arrays according to the ordering sequence of the input arrays, so that the DDR offset addresses of the input arrays of the splicing operator are combined and correspond to the DDR offset addresses of the output arrays of the splicing operator.
In some embodiments, the step of sequentially updating the DDR offset addresses of the input arrays according to the sorting order of the input arrays includes the steps of:
for the first input array, updating the DDR offset address of the input array into the DDR offset address of the output array of the splicing operator;
for a subsequent input array other than the first input array, the DDR offset address of the input array is updated to the DDR offset address of the previous input array plus the array length of the previous input array.
In some embodiments, the step S500: after deleting the splicing operator in the neural network model, the method further comprises the following steps:
traversing an operator list of the neural network model, and judging whether an undeleted splicing operator exists or not;
if yes, selecting the splicing operator which is not eliminated as the splicing operator to be eliminated, and then continuing to step S200;
if not, judging whether other compiling optimization tasks exist, if so, executing the other compiling optimization tasks, and if not, compiling the neural network model to obtain an executable file which can be run by the chip.
The embodiment of the invention also provides a compiling and optimizing system for eliminating the splicing operator, which is used for realizing the compiling and optimizing method for eliminating the splicing operator, and comprises the following steps:
the splicing operator searching module is used for searching splicing operators to be eliminated in the neural network model;
the address information acquisition module is used for acquiring the address information of the output array of the splicing operator and acquiring the address information of the input array of the splicing operator;
the address information updating module is used for updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
And the splicing operator deleting module is used for deleting the splicing operator in the neural network model.
In some embodiments, the address information of the output array of the splice operator includes a start address of the output array, and the address information of the input array of the splice operator includes the start address of the input array and an array length;
the address information updating module updates the address information of the input array of the splicing operator by adopting the following steps:
and taking the starting address of the output array of the splicing operator as the starting address of the first input array, wherein the starting address of each input array is equal to the sum of the starting address of the previous input array and the array length of the previous input array except for the first input array.
In some embodiments, the method further comprises a network algorithm compiling module, and the splicing operator searching module is used for searching splicing operators to be eliminated in the neural network model by adopting the following steps:
traversing an operator list of the neural network model, and searching whether an undeleted splicing operator exists;
if yes, taking the searched splicing operator as the splicing operator to be eliminated;
if not, the network algorithm compiling module judges whether other compiling optimization tasks exist, if so, the network algorithm compiling module executes the other compiling optimization tasks, and if not, the network algorithm compiling module compiles the neural network model to obtain an executable file which can be run by a chip.
In some embodiments, the address information obtaining module is configured to obtain, according to operator parameters of the neural network model, a DDR offset address of an output array of the splicing operator, and obtain, according to operator parameters of the neural network model, a DDR offset address and an array length of an input array of the splicing operator;
the address information updating module is used for updating the address information of the input array of the splicing operator by adopting the following steps:
acquiring a splicing sequence of the splicing operator to an input array according to operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
for the first input array, updating the DDR offset address of the input array into the DDR offset address of the output array of the splicing operator;
for a subsequent input array other than the first input array, the DDR offset address of the input array is updated to the DDR offset address of the previous input array plus the array length of the previous input array.
The embodiment of the invention also provides compiling and optimizing equipment for eliminating the splicing operator, which comprises the following steps:
a processor;
a memory having stored therein executable instructions of the processor;
Wherein the processor is configured to perform the steps of the compile optimization method of the eliminate splice operator via execution of the executable instructions.
The embodiment of the invention also provides a computer readable storage medium for storing a program, which when being executed by a processor, realizes the steps of the compiling optimization method for eliminating the splicing operator.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The compiling optimization method, system, equipment and storage medium for eliminating the splicing operator have the following beneficial effects:
according to the invention, the input and input address information is updated according to the address information of the output array of the splicing operator in compiling, so that the address information of the input array of the splicing operator is combined and then corresponds to the address information of the output array of the splicing operator, and the splicing function is realized through updating the address information without setting the splicing operator alone, thereby eliminating the splicing operator in the neural network model through compiling, optimizing the model size, ensuring that the running time of the neural network model is not limited by the execution time of the splicing operator, and accelerating the reasoning speed of the neural network model.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings.
FIG. 1 is a flow chart of a compilation optimization method of eliminating a splice operator according to an embodiment of the present invention;
FIG. 2 is a functional schematic of a splice operator according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a portion of a model prior to elimination of a stitching operator according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a portion of a model after elimination of a stitching operator according to an embodiment of the present invention;
FIG. 5 is a flow chart of updating address information of an input array according to an embodiment of the present invention;
FIG. 6 is a flow chart of a loop cancellation stitching operator of an embodiment of the present invention;
FIG. 7 is a schematic diagram of a compilation optimization system that eliminates stitching operators, in accordance with an embodiment of the present invention;
FIG. 8 is a schematic diagram of a compiling optimization device for eliminating a splicing operator according to an embodiment of the present invention;
fig. 9 is a schematic structural view of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only and not necessarily all steps are included. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
As shown in fig. 1, an embodiment of the present invention provides a compiling optimization method for eliminating a splicing operator, including the following steps:
s100: searching a splicing operator to be eliminated in the neural network model;
s200: acquiring address information of an output array of the splicing operator;
s300: acquiring address information of an input array of the splicing operator;
S400: updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
s500: and deleting the splicing operator in the neural network model.
According to the compilation optimization method for eliminating the splicing operator, the splicing operator to be eliminated is first found in step S100; the address information of the output array and of the input arrays is then obtained in steps S200 and S300 respectively; and in step S400 the input address information is updated at compile time according to the address information of the output array, so that the address information of the input arrays, when combined, corresponds to the address information of the output array. The splicing function is realized by updating address information, without a separate splicing operator, so the required splicing behavior is preserved after the splicing operator is deleted in step S500. The method thus eliminates the splicing operator through compilation, reduces the model size, frees the running time of the neural network model from the execution time of the splicing operator, and accelerates the inference speed of the neural network model.
The algorithm of the neural network model to be compiled comprises a plurality of operators, and the neural network model comprises an operator list, operator parameters and weight data. Wherein the operator list includes each operator included in the model. For a splice operator, its operator parameters include at least the parameters of its input array and the parameters of its output array.
In this embodiment, the step S100: searching a splicing operator to be eliminated in a neural network model, comprising the following steps:
traversing an operator list of the neural network model, and searching for uneliminated splicing operators;
and taking the searched splicing operator as a splicing operator to be eliminated, and executing subsequent steps S200-S500 on the splicing operator to be eliminated.
The splicing operator (Concat) is used to splice two or more arrays; executing it essentially performs a memory copy inside the chip, so it belongs to the memory-intensive category. FIG. 2 is a functional schematic of the splice operator. Its inputs are two arrays, Array1 and Array2, and its output is Array3; that is, the operator splices the two input arrays Array1 and Array2 to obtain the output array Array3. The size of Array3 is the size of Array1 plus the size of Array2: the front part of Array3 is identical to Array1 and the rear part is identical to Array2. The splice operator performs no arithmetic during execution.
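The pure data-movement semantics of the splice operator can be illustrated with a minimal Python sketch (list-based, purely to show the semantics, not the chip implementation):

```python
def concat(array1, array2):
    # Equivalent to the splice operator: no arithmetic, only placing
    # the inputs back to back to form the output.
    return array1 + array2

array1 = [1, 2, 3]
array2 = [4, 5]
array3 = concat(array1, array2)
# The front part of array3 equals array1, the rear part equals array2,
# and its size is len(array1) + len(array2).
```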
Each array corresponds to a segment of memory, and its structure is expressed as a pair (start, len),
where start represents the start address of the array and len represents the length of the array.
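As a hypothetical sketch, the array structure described here — a start address plus a length — might be written as follows (field names follow the surrounding text):

```python
from dataclasses import dataclass

@dataclass
class Array:
    start: int  # start address of the array (e.g. a DDR offset)
    len: int    # length of the array
```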
In this embodiment, the address information of the output array of the concatenation operator includes a start address of the output array, and the address information of the input array of the concatenation operator includes the start address of the input array and an array length. The address information update for the input array is mainly based on the start address of the output array and the array length of the input array.
In the step S400, updating the address information of the input array of the splicing operator includes the following steps:
the starting address of the output array of the splicing operator is taken as the starting address of the first input array, the starting address of each input array is equal to the sum of the starting address of the previous input array and the array length of the previous input array except the first input array, so that the effect of splicing each input array is realized by updating address information, the spliced input array can be directly used for the input of the next operator, and the function of the splicing operator can be realized under the condition that the splicing operator is not used.
Specifically, the start address OutArray.start of the splicing operator's output array OutArray is acquired in step S200; the start addresses InArray1.start, InArray2.start, …, InArrayN.start of the N input arrays InArray1, InArray2, …, InArrayN, and their lengths InArray1.len, InArray2.len, …, InArrayN.len, are acquired in step S300.
In step S400, the start addresses of the input arrays are updated as follows:
the start address of the first input array: InArray1.start = OutArray.start;
the start address of the i-th input array: InArray(i).start = InArray(i-1).start + InArray(i-1).len, where i is a positive integer with 2 ≤ i ≤ N.
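Under this update rule — the first input takes the output's start address, and each later input starts where the previous one ends — a small numeric walk-through (illustrative values only: three input arrays of lengths 100, 50 and 30, output starting at offset 0x2000):

```python
out_start = 0x2000       # OutArray.start
in_lens = [100, 50, 30]  # InArray1.len, InArray2.len, InArray3.len

starts = []
for i, length in enumerate(in_lens):
    if i == 0:
        # InArray1.start = OutArray.start
        starts.append(out_start)
    else:
        # InArray(i).start = InArray(i-1).start + InArray(i-1).len
        starts.append(starts[i - 1] + in_lens[i - 1])

# The three inputs now tile the output region contiguously:
# starts == [0x2000, 0x2000 + 100, 0x2000 + 150]
```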
Since the splicing operator splices the input arrays in a fixed order, the start addresses must be updated in that specific order in step S400, to ensure that the resulting output array can be correctly used as the input of the post operator. Specifically, in this embodiment, before the start address of the output array is taken as the start address of the first input array in step S400, the method further includes: sorting the input arrays according to the order in which the splicing operator splices them. The splicing order of the input arrays can be obtained from the operator parameters of the splicing operator in the neural network model.
As shown in figs. 3 and 4, a splicing operator that splices two input arrays is described here as an example. As shown in fig. 3, the input Array1 is the output of pre-operator 1, the input Array2 is the output of pre-operator 2, and the splicing operator splices Array1 and Array2 into the output Array3, which is the input of the post operator. As shown in fig. 4, the start addresses of Array1 and Array2 are updated in the neural network model: the start address of Array1 is updated to the start address of Array3, and the start address of Array2 is updated to the start address of Array1 plus the array length of Array1. The output of pre-operator 1, the output of pre-operator 2 and the input of the post operator then occupy the same array, laid out in sequence, so the actual effect of the splicing operator is achieved without the operator itself being present.
In this embodiment, the start address of the array is represented by a DDR (Double Data Rate, one type of memory) offset address of the array on the target machine after compiling, and step S200: the method for obtaining the address information of the output array of the splicing operator comprises the following steps: and acquiring the DDR offset address of the output array of the splicing operator according to the operator parameters of the neural network model. The present invention is not limited to this, and other methods of expressing the initial address are also possible, which fall within the scope of the present invention.
The step S300: the method for obtaining the address information of the input array of the splicing operator comprises the following steps: and acquiring the DDR offset address and the array length of the input array of the splicing operator according to the operator parameters of the neural network model.
As shown in fig. 5, in this embodiment, the step S400: updating the address information of the input array of the splicing operator, comprising the following steps:
s410: acquiring a splicing sequence of the splicing operator on the input array according to operator parameters of the neural network model, for example, when a plurality of input arrays exist, arranging the input arrays into InArray1 and InArray2 … … in sequence according to the splicing sequence;
s420: sequencing the input arrays according to the splicing sequence;
s430: and sequentially updating the DDR offset addresses of the input arrays according to the ordering sequence of the input arrays, so that the DDR offset addresses of the input arrays of the splicing operator are combined and correspond to the DDR offset addresses of the output arrays of the splicing operator.
Specifically, the step S430 includes the steps of:
s431: for the first input array, updating the DDR offset address of the input array to be the DDR offset address of the output array of the splicing operator, namely InArray1. Start=OutArray1. Start;
S432: for the subsequent input arrays except the first input array, the DDR offset address of the input array is updated to be the DDR offset address of the previous input array plus the array length of the previous input array, namely the start address InArray (i) of the ith input array, namely start=InArray (i-1), start+InArray (i-1), len, i is a positive integer which is more than or equal to 2 and less than or equal to n.
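Steps S410–S432 can be condensed into one routine. The following is a sketch under the assumption that each input carries its splice-order index and its DDR offset; the dictionary keys and function name are illustrative, not taken from the patent:

```python
def update_input_offsets(inputs, out_ddr_offset):
    """inputs: list of dicts {'order': splice position, 'ddr': DDR offset, 'len': length}.
    Rewrites each input's DDR offset in place so the inputs tile the output region."""
    inputs.sort(key=lambda a: a['order'])  # S410/S420: sort by splicing order
    for i, arr in enumerate(inputs):
        if i == 0:
            arr['ddr'] = out_ddr_offset    # S431: first input takes the output's offset
        else:
            prev = inputs[i - 1]
            arr['ddr'] = prev['ddr'] + prev['len']  # S432: previous offset + previous length
    return inputs
```

After the call, the inputs are contiguous in DDR and together cover exactly the output array's region, so the splice operator itself is no longer needed.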
In this embodiment, the step S500: deleting a splicing operator in the neural network model, and specifically comprises deleting the splicing operator in an operator list of the neural network model and deleting operator parameters of the splicing operator in the neural network model.
As shown in fig. 6, in this embodiment, the step S500: after deleting the splicing operator in the neural network model, the method further comprises the following steps:
s610: traversing an operator list of the neural network model, and judging whether an undeleted splicing operator exists or not;
if so, S620: selecting an uneliminated splicing operator as the splicing operator to be eliminated, and executing steps S200-S500 on it, so as to eliminate the operator while preserving its function;
if not, then step S630 is continued: judging whether other compilation optimization tasks exist, such as compilation optimization of convolution operators or of fully connected operators. If other tasks exist, continue with step S640: execute them; if not, continue with step S650: compile the neural network model into an executable file that the chip can run. The executable file run in the chip then contains no splicing operator, which reduces the model size and shortens the running time of the model on the chip. The data format of the executable file varies with the requirements of each chip; the aim is to compile the operator parameters, input data, and so on of the neural network into a format the chip can recognize.
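The loop of fig. 6 amounts to repeatedly finding a concat in the operator list, retargeting its inputs, and deleting it. A hypothetical sketch over a toy operator-list representation (the dictionary layout is an assumption for illustration):

```python
def eliminate_concats(op_list):
    """op_list: list of operator dicts; a concat operator is
    {'type': 'concat', 'inputs': [{'ddr', 'len'}, ...] in splicing order,
     'output': {'ddr': DDR offset}}."""
    while True:
        # S610: traverse the list for an undeleted splicing operator
        concat = next((op for op in op_list if op['type'] == 'concat'), None)
        if concat is None:
            break  # none left; later compilation tasks would run here
        base = concat['output']['ddr']   # S200: output array's DDR offset
        for arr in concat['inputs']:     # S400: update each input's offset
            arr['ddr'] = base
            base += arr['len']
        op_list.remove(concat)           # S500: delete the splicing operator
    return op_list
```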
As shown in fig. 7, the embodiment of the present invention further provides a compiling optimization system for eliminating a splicing operator, for implementing the compiling optimization method for eliminating a splicing operator, where the system includes:
a splicing operator searching module M100, configured to search the neural network model for a splicing operator to be eliminated, in this embodiment by traversing an operator list of the neural network model;
the address information acquisition module M200 is used for acquiring the address information of the output array of the splicing operator and acquiring the address information of the input array of the splicing operator;
the address information updating module M300 is configured to update address information of an input array of the splicing operator according to address information of an output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
a splicing operator deleting module M400, configured to delete the splicing operator in the neural network model, specifically, to delete the splicing operator from the operator list of the neural network model and to delete the operator parameters of the splicing operator from the neural network model.
According to the compiling optimization system for eliminating the splicing operator, the splicing operator searching module M100 first searches for the splicing operator to be eliminated; the address information obtaining module M200 then obtains the address information of the output array and of the input arrays; and during compilation the address information updating module M300 updates the address information of the input arrays according to the address information of the output array of the splicing operator, so that the address information of the input arrays of the splicing operator, after being combined, corresponds to the address information of the output array. The splicing function is thus realized by updating address information, without setting a separate splicing operator, so the required splicing function is still realized after the splicing operator deleting module M400 deletes the splicing operator. The system therefore eliminates the splicing operator in the neural network model at compile time, optimizes the model size, ensures that the running time of the neural network model is not limited by the execution time of the splicing operator, and accelerates the inference speed of the neural network model.
In this embodiment, the address information of the output array of the concatenation operator includes a start address of the output array, and the address information of the input array of the concatenation operator includes the start address of the input array and an array length.
The address information updating module M300 updates the address information of the input array of the splicing operator by adopting the following steps:
sequencing the input arrays according to the splicing sequence of the splicing operators on the input arrays;
the starting address of the output array of the splicing operator is taken as the starting address of the first input array; except for the first input array, the starting address of each input array equals the starting address of the previous input array plus the array length of the previous input array. The effect of splicing the input arrays is thus achieved by updating address information alone: the spliced input arrays can be used directly as the input of the next operator, and the function of the splicing operator is realized without using a splicing operator.
In this embodiment, the system further includes a network algorithm compiling module, configured to compile the neural network model into an executable file that can be run by the chip, where a format of data in the executable file is a data format that can be recognized by the chip.
Specifically, the splicing operator searching module M100 is configured to search a neural network model for a splicing operator to be eliminated by adopting the following steps:
Traversing an operator list of the neural network model, and searching whether an undeleted splicing operator exists;
if yes, taking the searched splicing operator as the splicing operator to be eliminated;
if not, the network algorithm compiling module judges whether other compiling optimization tasks exist, if so, the other compiling optimization task executing module executes the other compiling optimization tasks, and if not, the network algorithm compiling module compiles the neural network model to obtain an executable file which can be run by a chip.
In this embodiment, the starting address of the array is represented by the DDR offset address of the array on the target machine after compiling, but the present invention is not limited thereto, and other starting address expression methods are also possible, which fall within the scope of the present invention. The address information obtaining module M200 is configured to obtain, according to operator parameters of the neural network model, a DDR offset address of an output array of the splicing operator, and obtain, according to operator parameters of the neural network model, a DDR offset address and an array length of an input array of the splicing operator.
The address information updating module M300 is configured to update address information of the input array of the concatenation operator by adopting the following steps:
acquiring a splicing sequence of the splicing operator to an input array according to operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
for the first input array, updating the DDR offset address of the input array into the DDR offset address of the output array of the splicing operator;
for a subsequent input array other than the first input array, the DDR offset address of the input array is updated to the DDR offset address of the previous input array plus the array length of the previous input array.
After the address information updating module M300 updates the starting addresses of the input arrays, the effect on the structure of the neural network model is as shown in figs. 3 and 4: the outputs of the operators preceding the original splicing operator and the input of the operator following it are now the same array, and because the input arrays are arranged contiguously in sequence, the actual effect of the splicing operator is achieved and the splicing operator can be deleted.
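A toy demonstration of why contiguous placement makes the splicing operator redundant: once each producer operator writes its result at the rewritten DDR offset, the consumer reading the output region sees the concatenated data with no copy step. The simulated-DDR `bytearray`, the offsets, and the sample chunks below are invented for illustration.

```python
ddr = bytearray(64)                # simulated DDR on the target machine
out_start = 16                     # DDR offset of the concat output array
chunks = [b"AAAA", b"BB", b"CCC"]  # data written by the producer operators

# the producers write at the rewritten, back-to-back offsets
offset = out_start
for data in chunks:
    ddr[offset:offset + len(data)] = data
    offset += len(data)

# the consumer reads the output region and gets the concatenation directly
result = bytes(ddr[out_start:out_start + sum(len(d) for d in chunks)])
```

No concat kernel ever runs; the "splice" exists only in how the addresses were assigned.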
The embodiment of the invention also provides compiling and optimizing equipment for eliminating the splicing operator, which comprises a processor; a memory having stored therein executable instructions of the processor; wherein the processor is configured to perform the steps of the compile optimization method of the eliminate splice operator via execution of the executable instructions.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "platform."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 8. The electronic device 600 shown in fig. 8 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 8, the electronic device 600 is in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one memory unit 620, a bus 630 connecting the different system components (including the memory unit 620 and the processing unit 610), a display unit 640, etc.
Wherein the storage unit stores program code that is executable by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above section on the compiling optimization method for eliminating a splicing operator. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The memory unit 620 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In the compiling and optimizing device for eliminating the splicing operator, the steps of the compiling and optimizing method for eliminating the splicing operator are realized when the program in the memory is executed by the processor, so that the device can obtain the technical effects of the compiling and optimizing method for eliminating the splicing operator.
The embodiment of the invention also provides a computer-readable storage medium for storing a program which, when executed by a processor, implements the steps of the compiling optimization method for eliminating the splicing operator. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section on the compiling optimization method for eliminating a splicing operator.
Referring to fig. 9, a program product 800 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be executed on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
The program in the computer storage medium is executed by the processor to implement the steps of the method for compiling and optimizing the elimination splicing operator, so that the computer storage medium can also obtain the technical effects of the method for compiling and optimizing the elimination splicing operator.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims (9)

1. A compiling optimization method for eliminating a splicing operator is characterized by comprising the following steps:
s100: searching a splicing operator to be eliminated in the neural network model;
s200: acquiring address information of an output array of the splicing operator;
s300: acquiring address information of an input array of the splicing operator;
s400: updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
s500: deleting a splicing operator in the neural network model;
the step S400: updating the address information of the input array of the splicing operator, comprising the following steps:
Acquiring a splicing sequence of the splicing operator to an input array according to operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
sequentially updating the DDR offset addresses of the input arrays according to the ordering sequence of the input arrays, so that the DDR offset addresses of the input arrays of the splicing operator are combined and correspond to the DDR offset addresses of the output arrays of the splicing operator;
the DDR offset addresses of the input arrays are updated in sequence according to the ordering sequence of the input arrays, and the DDR offset addresses of the input arrays are updated in sequence, and the method comprises the following steps:
for the first input array, updating the DDR offset address of the input array into the DDR offset address of the output array of the splicing operator;
for a subsequent input array other than the first input array, the DDR offset address of the input array is updated to the DDR offset address of the previous input array plus the array length of the previous input array.
2. The method for optimizing compilation of elimination splicing operators according to claim 1, wherein the step S100: searching a splicing operator to be eliminated in a neural network model, comprising the following steps:
Traversing an operator list of the neural network model, and searching for uneliminated splicing operators;
and taking the searched splicing operator as the splicing operator to be eliminated.
3. The method for optimizing compilation of elimination splicing operators according to claim 1, wherein the step S200: the method for obtaining the address information of the output array of the splicing operator comprises the following steps: acquiring the DDR offset address of the output array of the splicing operator according to the operator parameters of the neural network model;
the step S300: the method for obtaining the address information of the input array of the splicing operator comprises the following steps: and acquiring the DDR offset address and the array length of the input array of the splicing operator according to the operator parameters of the neural network model.
4. The method for optimizing compilation of elimination splicing operators according to claim 1, wherein the step S500: after deleting the splicing operator in the neural network model, the method further comprises the following steps:
traversing an operator list of the neural network model, and judging whether an undeleted splicing operator exists or not;
if yes, selecting the splicing operator which is not eliminated as the splicing operator to be eliminated, and then continuing to step S200;
if not, judging whether other compiling optimization tasks exist, if so, executing the other compiling optimization tasks, and if not, compiling the neural network model to obtain an executable file which can be run by the chip.
5. A compiling optimization system for eliminating a splicing operator, characterized in that it is adapted to implement the compiling optimization method for eliminating a splicing operator according to any one of claims 1 to 4, said system comprising:
the splicing operator searching module is used for searching splicing operators to be eliminated in the neural network model;
the address information acquisition module is used for acquiring the address information of the output array of the splicing operator and acquiring the address information of the input array of the splicing operator;
the address information updating module is used for updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
the splicing operator deleting module is used for deleting the splicing operator in the neural network model;
the address information updating module is used for updating the address information of the input array of the splicing operator by adopting the following steps:
acquiring a splicing sequence of the splicing operator to an input array according to operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
For the first input array, updating the DDR offset address of the input array into the DDR offset address of the output array of the splicing operator;
for a subsequent input array other than the first input array, the DDR offset address of the input array is updated to the DDR offset address of the previous input array plus the array length of the previous input array.
6. The system of claim 5, further comprising a network algorithm compiling module, wherein the splice operator lookup module is configured to lookup a splice operator to be eliminated in a neural network model by:
traversing an operator list of the neural network model, and searching whether an undeleted splicing operator exists;
if yes, taking the searched splicing operator as the splicing operator to be eliminated;
if not, the network algorithm compiling module judges whether other compiling optimization tasks exist, if so, the network algorithm compiling module executes the other compiling optimization tasks, and if not, the network algorithm compiling module compiles the neural network model to obtain an executable file which can be run by a chip.
7. The system according to claim 6, wherein the address information obtaining module is configured to obtain, according to operator parameters of the neural network model, a DDR offset address of an output array of the splicing operator, and obtain, according to operator parameters of the neural network model, a DDR offset address and an array length of an input array of the splicing operator.
8. A compilation optimization device that eliminates splice operators, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform, via execution of the executable instructions, the steps of the compiling optimization method for eliminating a splicing operator according to any one of claims 1 to 4.
9. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the steps of the compiling optimization method for eliminating a splicing operator according to any one of claims 1 to 4.
CN202110295853.2A 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator Active CN113011585B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110295853.2A CN113011585B (en) 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110295853.2A CN113011585B (en) 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator

Publications (2)

Publication Number Publication Date
CN113011585A CN113011585A (en) 2021-06-22
CN113011585B true CN113011585B (en) 2023-09-26

Family

ID=76403198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110295853.2A Active CN113011585B (en) 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator

Country Status (1)

Country Link
CN (1) CN113011585B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661301B (en) * 2022-05-24 2022-09-06 深圳思谋信息科技有限公司 Graphics processing unit compiling method, device, compiling acceleration library and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102244518A (en) * 2010-05-10 2011-11-16 百度在线网络技术(北京)有限公司 System and method for realizing parallel decompression of hardware
CN109657782A (en) * 2018-12-14 2019-04-19 北京中科寒武纪科技有限公司 Operation method, device and Related product
WO2019128475A1 (en) * 2017-12-29 2019-07-04 中兴通讯股份有限公司 Method and device for training data, storage medium, and electronic device
CN110659728A (en) * 2019-09-24 2020-01-07 上海寒武纪信息科技有限公司 Neural network optimization method and device, computer equipment and storage medium
CN111401511A (en) * 2019-09-24 2020-07-10 上海寒武纪信息科技有限公司 Data processing method and device, computer equipment and storage medium
CN111523652A (en) * 2019-02-01 2020-08-11 阿里巴巴集团控股有限公司 Processor, data processing method thereof and camera device
CN112463159A (en) * 2020-11-25 2021-03-09 安徽寒武纪信息科技有限公司 Compiling method, compiling device, electronic equipment and storage medium
CN112463160A (en) * 2020-11-25 2021-03-09 安徽寒武纪信息科技有限公司 Compiling method, compiling device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8275209B2 (en) * 2008-10-10 2012-09-25 Microsoft Corporation Reduced DC gain mismatch and DC leakage in overlap transform processing
US11188820B2 (en) * 2017-09-08 2021-11-30 International Business Machines Corporation Deep neural network performance analysis on shared memory accelerator systems
US20200012924A1 (en) * 2018-07-03 2020-01-09 Sandisk Technologies Llc Pipelining to improve neural network inference accuracy

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102244518A (en) * 2010-05-10 2011-11-16 百度在线网络技术(北京)有限公司 System and method for realizing parallel decompression of hardware
WO2019128475A1 (en) * 2017-12-29 2019-07-04 中兴通讯股份有限公司 Method and device for training data, storage medium, and electronic device
CN109657782A (en) * 2018-12-14 2019-04-19 北京中科寒武纪科技有限公司 Operation method, device and Related product
CN111523652A (en) * 2019-02-01 2020-08-11 阿里巴巴集团控股有限公司 Processor, data processing method thereof and camera device
CN110659728A (en) * 2019-09-24 2020-01-07 上海寒武纪信息科技有限公司 Neural network optimization method and device, computer equipment and storage medium
CN111401511A (en) * 2019-09-24 2020-07-10 上海寒武纪信息科技有限公司 Data processing method and device, computer equipment and storage medium
CN112463159A (en) * 2020-11-25 2021-03-09 安徽寒武纪信息科技有限公司 Compiling method, compiling device, electronic equipment and storage medium
CN112463160A (en) * 2020-11-25 2021-03-09 安徽寒武纪信息科技有限公司 Compiling method, compiling device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A survey of FPGA design for AI era; Zhengjie Li et al.; Journal of Semiconductors; 20200229; full text *
A data cache structure and management mechanism in a reconfigurable system oriented to radar applications (一种面向雷达应用可重构系统中的数据缓存结构和管理机制); Liu Bo et al.; Journal of Shanghai Jiao Tong University; 20170531; full text *

Also Published As

Publication number Publication date
CN113011585A (en) 2021-06-22

Similar Documents

Publication Publication Date Title
US10534590B2 (en) Dynamic recompilation techniques for machine learning programs
CN112579063B (en) Acceleration method for exploring optimization space in deep learning compiler
US8533680B2 (en) Approximating finite domains in symbolic state exploration
US5778212A (en) Interprocedural analysis user interface
JP2755154B2 (en) Program conversion processing device and program conversion processing method
US6983458B1 (en) System for optimizing data type definition in program language processing, method and computer readable recording medium therefor
US10901715B1 (en) Lazy compilation and kernel fusion in dynamic computation graphs
US20200249925A1 (en) On-demand loading of dynamic scripting language code for reduced memory usage
US20090125893A1 (en) Method and apparatus for managing variable assignments in a program
US20150331683A1 (en) Automatic Selection Of An Abstract Data Type
CN113641413B (en) Target model loading updating method and device, readable medium and electronic equipment
CN1271890A (en) Method and device for treating abnormal as normal control flow
WO2023197554A1 (en) Model reasoning acceleration method and apparatus, and electronic device and storage medium
CN116170300B (en) Data processing method, electronic equipment and medium for determining abnormal log information
CN110598855A (en) Deep learning model generation method, device, equipment and storage medium
US20090144528A1 (en) Method for running native code across single or multi-core hybrid processor achitecture
CN113011585B (en) Compiling optimization method, system, equipment and storage medium for eliminating splicing operator
US8458679B2 (en) May-constant propagation
CN115809063B (en) Storage process compiling method, system, electronic equipment and storage medium
US20110271265A1 (en) Method of automatic generation of executable code for multi-core parallel processing
CN114356964A (en) Data blood margin construction method and device, storage medium and electronic equipment
US8825561B2 (en) Method and system of determining a prioritized list of users related to a given goal
JP7344259B2 (en) Pattern transformation methods, apparatus, electronic devices, computer storage media and computer program products in deep learning frameworks
CN113626035B (en) Neural network compiling method facing RISC-V equipment based on TVM
US20220207427A1 (en) Method for training data processing model, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: Room 503-3, 398 Jiangsu Road, Changning District, Shanghai 200050

Applicant after: Shanghai Xijing Technology Co.,Ltd.

Address before: Room 503-3, 398 Jiangsu Road, Changning District, Shanghai 200050

Applicant before: SHANGHAI WESTWELL INFORMATION AND TECHNOLOGY Co.,Ltd.

GR01 Patent grant
GR01 Patent grant