CN113011585A - Compiling optimization method, system, equipment and storage medium for eliminating splicing operator - Google Patents


Info

Publication number: CN113011585A (granted as CN113011585B)
Application number: CN202110295853.2A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 谭黎敏, 田承雷, 宋捷
Current and original assignee: Shanghai Westwell Information Technology Co Ltd
Application filed by Shanghai Westwell Information Technology Co Ltd
Legal status: Granted; currently active
Prior art keywords: operator, splicing, array, splicing operator, input

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/04 - Architecture, e.g. interconnection topology


Abstract

The invention provides a compiling optimization method, system, device, and storage medium for eliminating the splicing operator. The method comprises the following steps: searching the neural network model for a splicing operator to be eliminated; obtaining the address information of the output array of the splicing operator; obtaining the address information of the input arrays of the splicing operator; updating the address information of the input arrays according to the address information of the output array, so that the address ranges of the input arrays, taken together, coincide with the address range of the output array; and deleting the splicing operator from the neural network model. By eliminating splicing operators from the neural network model at compile time, the invention reduces the model size, frees the model's running time from the execution time of the splicing operators, and accelerates inference of the neural network model.

Description

Compiling optimization method, system, equipment and storage medium for eliminating splicing operator
Technical Field
The invention relates to the technical field of data processing, in particular to a compiling optimization method, a compiling optimization system, compiling optimization equipment and a storage medium for eliminating splicing operators.
Background
A Convolutional Neural Network (CNN) is a feed-forward neural network whose artificial neurons respond to stimuli within a limited receptive field; it performs well on large-scale image processing. It contains convolutional layers and pooling layers. Convolutional neural networks are widely used for image classification, object recognition, and target tracking.
Because inference with a convolutional neural network requires a huge amount of computation, dedicated AI (Artificial Intelligence) processing chips have emerged. A model usually needs to be transformed and optimized before it can run on such a chip; the tool that performs this transformation is called an AI compiler. Its optimization stage focuses on reducing model size and run time, and mainly comprises the following aspects:
1. operator optimization
2. Graph optimization
3. Model compression
An important technique in computation-graph optimization is operator fusion: by combining operators, both the amount of computation and the amount of memory access are reduced.
Operator fusion is based on observations of deep-learning topology patterns. Deep learning operators fall into two categories:
Computation-intensive operators, such as convolution and fully connected layers, which perform a large number of arithmetic operations at run time.
Memory-access-intensive operators, such as ReLU and concatenation, which require frequent memory accesses at run time.
In a typical deep learning model, computation-intensive and memory-access-intensive operators usually appear together, e.g. "Conv + ReLU". Taking a GPU (Graphics Processing Unit) as an example, the two operators can be fused into one composite operator: after the GPU finishes the convolution, it applies ReLU directly in video memory, reducing interaction with main memory.
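As a toy sketch (not from the patent) of why "Conv + ReLU" fusion helps, the fused version below applies ReLU to each convolution output as it is produced, so no intermediate result needs to be written back to a separate buffer:

```python
# Toy 1-D convolution, unfused vs. fused with ReLU (illustrative names only).
def conv1d(x, k):
    n = len(k)
    return [sum(x[i + j] * k[j] for j in range(n)) for i in range(len(x) - n + 1)]

def fused_conv_relu(x, k):
    n = len(k)
    # ReLU is applied to each element as it is computed: the intermediate
    # convolution result never exists as a separate array.
    return [max(0, sum(x[i + j] * k[j] for j in range(n)))
            for i in range(len(x) - n + 1)]

assert conv1d([1, -2, 3, -4], [1, 1]) == [-1, 1, -1]
assert fused_conv_relu([1, -2, 3, -4], [1, 1]) == [0, 1, 0]
```

On a real GPU the saving is the round trip to main memory between the two kernels, which this sketch only mimics structurally.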
The concatenation operator (concat), here called the splicing operator, is a commonly used operator in neural networks; it joins its input tensors along a specified axis. It belongs to the memory-access-intensive category and mainly consumes memory-access time: on any hardware platform, the execution time of a splicing operation is governed by memory bandwidth and proportional to the amount of data moved.
For memory-access-intensive operators, the AI compiler reduces memory accesses by fusing adjacent operators. The splicing operator, however, is generally used to fuse features from different layers, so its input arrays are usually far apart both in the computation graph and in actual memory, and the adjacency condition is not met. The conventional fusion method for memory-access-intensive operators therefore cannot be applied to the splicing operator.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a compiling optimization method, system, device, and storage medium for eliminating the splicing operator, in which splicing operators in the neural network model are eliminated at compile time, accelerating inference of the neural network model.
The embodiment of the invention provides a compiling optimization method for eliminating a splicing operator, which comprises the following steps:
s100: searching a splicing operator to be eliminated in the neural network model;
s200: acquiring address information of an output array of the splicing operator;
s300: acquiring address information of an input array of the splicing operator;
s400: updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
s500: the splice operator is deleted in the neural network model.
In some embodiments, the address information of the output array of the splicing operator comprises a start address of the output array, and the address information of the input array of the splicing operator comprises a start address of the input array and an array length;
in step S400, updating the address information of the input array of the splicing operator includes the following steps:
the start address of the output array of the splicing operator is taken as the start address of the first input array; for each input array other than the first, the start address equals the start address of the previous input array plus the array length of the previous input array.
In some embodiments, before the step S400 uses the start address of the output array of the splicing operator as the start address of the first input array, the method further includes the following steps:
and sequencing the input arrays according to the splicing sequence of the splicing operator to the input arrays.
In some embodiments, the step S100: searching for a splicing operator to be eliminated in a neural network model, comprising the following steps:
traversing an operator list of the neural network model, and searching for an unremoved splicing operator;
and taking the searched splicing operator as the splicing operator to be eliminated.
In some embodiments, the step S200: acquiring address information of an output array of the splicing operator, including: acquiring a DDR offset address of an output array of the splicing operator according to the operator parameter of the neural network model;
the step S300: acquiring address information of an input array of the splicing operator, wherein the address information comprises: and acquiring the DDR offset address and the array length of the input array of the splicing operator according to the operator parameter of the neural network model.
In some embodiments, the step S400: updating the address information of the input array of the splicing operator, comprising the following steps:
acquiring the splicing sequence of the splicing operator to the input array according to the operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
and sequentially updating the DDR offset addresses of the input arrays according to the sorting sequence of the input arrays, so that the DDR offset addresses of the input arrays of the splicing operator correspond to the DDR offset addresses of the output arrays of the splicing operator after being combined.
In some embodiments, sequentially updating the DDR offset addresses of the input arrays according to the sorting order of the input arrays includes the following steps:
for the first input array, updating the DDR offset address of the input array as the DDR offset address of the output array of the splicing operator;
and for the subsequent input arrays except the first input array, updating the DDR offset address of the input array to be the DDR offset address of the previous input array plus the array length of the previous input array.
In some embodiments, the step S500: after the splicing operator is deleted from the neural network model, the method further comprises the following steps:
traversing an operator list of the neural network model, and judging whether an unremoved splicing operator still exists;
if so, selecting the splicing operator which is not eliminated as the splicing operator to be eliminated, and continuing to the step S200;
if not, judging whether other compiling optimization tasks exist, if so, executing the other compiling optimization tasks, and if not, compiling the neural network model to obtain an executable file which can be operated by the chip.
The embodiment of the invention also provides a compiling optimization system for eliminating the splicing operator, which is used for realizing the compiling optimization method for eliminating the splicing operator, and the system comprises the following steps:
the splicing operator searching module is used for searching a splicing operator to be eliminated in the neural network model;
the address information acquisition module is used for acquiring the address information of the output array of the splicing operator and acquiring the address information of the input array of the splicing operator;
the address information updating module is used for updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
and the splicing operator deleting module is used for deleting the splicing operator in the neural network model.
In some embodiments, the address information of the output array of the splicing operator comprises a start address of the output array, and the address information of the input array of the splicing operator comprises a start address of the input array and an array length;
the address information updating module updates the address information of the input array of the splicing operator by adopting the following steps:
the start address of the output array of the splicing operator is taken as the start address of the first input array; for each input array other than the first, the start address equals the start address of the previous input array plus the array length of the previous input array.
In some embodiments, the method further comprises a network algorithm compiling module, and the splicing operator searching module is configured to search for a splicing operator to be eliminated in the neural network model by using the following steps:
traversing an operator list of the neural network model, and searching whether an unremoved splicing operator exists or not;
if so, taking the searched splicing operator as the splicing operator to be eliminated;
if not, the network algorithm compiling module judges whether other compiling and optimizing tasks exist, if so, the other compiling and optimizing tasks are executed, and if not, the network algorithm compiling module compiles the neural network model to obtain an executable file which can be operated by a chip.
In some embodiments, the address information obtaining module is configured to obtain, according to an operator parameter of the neural network model, a DDR offset address of an output array of the splicing operator, and obtain, according to the operator parameter of the neural network model, a DDR offset address and an array length of an input array of the splicing operator;
the address information updating module is used for updating the address information of the input array of the splicing operator by adopting the following steps:
acquiring the splicing sequence of the splicing operator to the input array according to the operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
for the first input array, updating the DDR offset address of the input array as the DDR offset address of the output array of the splicing operator;
and for the subsequent input arrays except the first input array, updating the DDR offset address of the input array to be the DDR offset address of the previous input array plus the array length of the previous input array.
The embodiment of the present invention further provides a compiling and optimizing device for eliminating a splicing operator, including:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform, via execution of the executable instructions, the steps of the compiling optimization method for eliminating the splicing operator.
The embodiment of the invention also provides a computer-readable storage medium for storing a program, and the program realizes the steps of the compiling optimization method for eliminating the splicing operator when being executed by a processor.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The compiling optimization method, the compiling optimization system, the compiling optimization equipment and the compiling optimization storage medium for eliminating the splicing operator have the following beneficial effects:
according to the method, the input and input address information is updated according to the address information of the output array of the splicing operator in the compiling process, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined, the splicing function is realized through the updating of the address information without independently setting the splicing operator, the splicing operator in the neural network model is eliminated through the compiling process, the model size is optimized, the running time of the neural network model is not limited by the execution time of the splicing operator any more, and the reasoning speed of the neural network model is accelerated.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings.
FIG. 1 is a flow diagram of a compilation optimization method for eliminating splice operators according to an embodiment of the present invention;
FIG. 2 is a functional diagram of a splice operator according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a partial structure of a model before a stitching operator is eliminated according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a partial structure of a model after the removal of the stitching operator according to an embodiment of the present invention;
FIG. 5 is a flowchart of updating address information of an input array according to an embodiment of the present invention;
FIG. 6 is a flow diagram of a loop elimination stitching operator according to one embodiment of the present invention;
FIG. 7 is a block diagram of a compilation optimization system that eliminates concatenation operators according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a compiling optimization device for eliminating a splicing operator according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the steps. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
As shown in fig. 1, an embodiment of the present invention provides a compiling optimization method for eliminating a splicing operator, including the following steps:
s100: searching a splicing operator to be eliminated in the neural network model;
s200: acquiring address information of an output array of the splicing operator;
s300: acquiring address information of an input array of the splicing operator;
s400: updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
s500: the splice operator is deleted in the neural network model.
In the compiling optimization method for eliminating the splicing operator, the splicing operator to be eliminated is first found in step S100; the address information of the output array and of the input arrays is then obtained in steps S200 and S300; and in step S400 the address information of the input arrays is updated according to that of the output array, so that the address ranges of the input arrays, taken together, coincide with the address range of the output array. The splicing function is thereby realized purely through address updates, with no separate splicing operator, so the required splicing behavior is preserved after the operator is deleted in step S500. The method thus eliminates the splicing operator from the neural network model at compile time, reduces the model size, frees the model's running time from the execution time of the splicing operator, and accelerates inference of the neural network model.
The algorithm of the neural network model to be compiled comprises a plurality of operators, and the neural network model comprises an operator list, operator parameters and weight data. Wherein, the operator list comprises each operator included in the model. For the splicing operator, the operator parameters at least comprise the parameters of the input array and the parameters of the output array.
In this embodiment, the step S100: searching for a splicing operator to be eliminated in a neural network model, comprising the following steps:
traversing an operator list of the neural network model, and searching for an unremoved splicing operator;
and taking the searched splicing operator as the splicing operator to be eliminated, and executing the subsequent steps S200-S500 on the splicing operator to be eliminated.
The splicing operator (Concat) splices two or more arrays; during its execution, the chip essentially performs a memory copy, which is why the operator belongs to the memory-access-intensive category. FIG. 2 illustrates the function of the splicing operator. Its inputs are the two arrays Array1 and Array2, and its output is Array3; that is, the operator splices Array1 and Array2 to obtain Array3. The size of Array3 equals the size of Array1 plus the size of Array2; the front portion of Array3 is identical to Array1, and the back portion is identical to Array2. The splicing operator performs no arithmetic during execution.
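The behavior in FIG. 2 can be sketched as follows (a minimal illustration; the array contents are hypothetical):

```python
# Splicing (concat): the output array is Array1 followed by Array2.
array1 = [10, 20, 30]        # input Array1
array2 = [40, 50]            # input Array2
array3 = array1 + array2     # output Array3

assert array3[:len(array1)] == array1            # front portion equals Array1
assert array3[len(array1):] == array2            # back portion equals Array2
assert len(array3) == len(array1) + len(array2)  # size is the sum of the inputs
```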
Each array corresponds to a segment of memory and is described by a structure of the form { start, len }, where start represents the start address of the array and len represents the length of the array.
In this embodiment, the address information of the output array of the splicing operator includes the start address of the output array, and the address information of each input array includes the start address of the input array and the array length. Updating the address information of the input arrays mainly means recomputing each input array's start address from the output array's start address and the input arrays' lengths.
In step S400, updating the address information of the input array of the splicing operator includes the following steps:
the start address of the output array of the splicing operator is taken as the start address of the first input array; for each subsequent input array, the start address equals the start address of the previous input array plus the length of the previous input array. The effect of splicing the input arrays is thus achieved purely by updating address information, and the spliced result can be fed directly to the next operator, so the function of the splicing operator is realized without using the operator itself.
Specifically, the start address OutArray.start of the output array OutArray of the splicing operator is obtained in step S200; the start addresses InArray1.start, InArray2.start, ..., InArray(N).start and the lengths InArray1.len, InArray2.len, ..., InArray(N).len of the N input arrays InArray1, InArray2, ..., InArray(N) are obtained in step S300.
In step S400, the start addresses of the input arrays are updated as follows:
the start address of the first input array: InArray1.start = OutArray.start;
the start address of the i-th input array: InArray(i).start = InArray(i-1).start + InArray(i-1).len, where i is a positive integer with 2 ≤ i ≤ N.
Because the splicing operator splices the input arrays in a fixed order, the start addresses in step S400 must be updated in that same order, ensuring that the resulting output array can be used correctly as input to the subsequent operator. Specifically, in this embodiment, before the start address of the output array is taken as the start address of the first input array in step S400, the method further includes: sorting the input arrays according to the order in which the splicing operator splices them. This splicing order can be obtained from the operator parameters of the splicing operator in the neural network model.
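The sort-then-assign procedure described above can be sketched as follows (a minimal illustration; the dictionary representation and the `order` field are assumptions for this sketch, not the patent's internal structures):

```python
def splice_by_address_update(out_start, inputs):
    """Sort the input arrays by their splicing order, then assign each a
    start address so that together they tile the output array's region."""
    inputs = sorted(inputs, key=lambda a: a["order"])  # splicing order from operator params
    addr = out_start
    for arr in inputs:
        arr["start"] = addr      # first input at the output's start address,
        addr += arr["len"]       # each later input right after its predecessor
    return inputs

arrays = [
    {"order": 2, "start": 0x3000, "len": 8},
    {"order": 1, "start": 0x2000, "len": 12},
]
splice_by_address_update(0x1000, arrays)
# The order-1 array now starts at 0x1000, the order-2 array at 0x100C.
assert [a["start"] for a in sorted(arrays, key=lambda a: a["order"])] == [0x1000, 0x100C]
```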
As shown in fig. 3 and 4, a splicing operator that splices two input arrays is taken as an example. As shown in fig. 3, the input Array1 of the splicing operator is the output of preceding operator 1, and the input Array2 is the output of preceding operator 2; after Array1 and Array2 are spliced by the splicing operator, the output Array3 is obtained and used as the input of the subsequent operator. As shown in fig. 4, the start addresses of Array1 and Array2 are updated in the neural network model: the start address of Array1 becomes the start address of Array3, and the start address of Array2 becomes the start address of Array1 plus the length of Array1. The output of preceding operator 1, the output of preceding operator 2, and the input of the subsequent operator thus form one contiguous array, so the effect of the splicing operator is achieved without actually executing a splicing operator.
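The fig. 3 to fig. 4 rewrite for the two-input case can be sketched as follows (the addresses and lengths are made up for illustration; `start`/`len` follow the array structure described earlier):

```python
# Before elimination: Array1 and Array2 live elsewhere in memory.
array3_start = 0x1000                     # start address of output Array3
array1 = {"start": 0x2000, "len": 12}     # output of preceding operator 1
array2 = {"start": 0x3000, "len": 8}      # output of preceding operator 2

# After elimination: Array1 is placed at Array3's start address and
# Array2 immediately after Array1, so the subsequent operator reads
# one contiguous array and no copy is ever performed.
array1["start"] = array3_start
array2["start"] = array1["start"] + array1["len"]

assert array1["start"] == 0x1000
assert array2["start"] == 0x100C          # 0x1000 + 12
```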
In this embodiment, the start address of an array is represented by the DDR (Double Data Rate synchronous dynamic random access memory, i.e., main memory) offset address of the compiled array on the target machine, and the step S200 of obtaining the address information of the output array of the splicing operator includes: obtaining the DDR offset address of the output array according to the operator parameters of the neural network model. The DDR offset address is merely one representation of the start address; the invention is not limited to it, and other representations of the start address also fall within the scope of the invention.
The step S300: acquiring address information of an input array of the splicing operator, wherein the address information comprises: and acquiring the DDR offset address and the array length of the input array of the splicing operator according to the operator parameter of the neural network model.
As shown in fig. 5, in this embodiment, the step S400 of updating the address information of the input arrays of the splicing operator includes the following steps:
S410: acquiring the splicing order of the input arrays from the operator parameters of the neural network model; for example, when there are multiple input arrays, they are arranged in splicing order as InArray1, InArray2, and so on;
S420: sorting the input arrays according to the splicing order;
S430: sequentially updating the DDR offset addresses of the input arrays in the sorted order, so that the DDR offset address ranges of the input arrays, taken together, coincide with the DDR offset address range of the output array of the splicing operator.
Specifically, the step S430 includes the following steps:
S431: for the first input array, updating its DDR offset address to the DDR offset address of the output array of the splicing operator, i.e., InArray1.start = OutArray.start;
S432: for each input array after the first, updating its DDR offset address to the DDR offset address of the previous input array plus the array length of the previous input array, i.e., InArray(i).start = InArray(i-1).start + InArray(i-1).len, where i is a positive integer with 2 ≤ i ≤ N.
In this embodiment, the step S500 of deleting the splicing operator from the neural network model specifically includes: deleting the splicing operator from the operator list of the neural network model, and deleting the operator parameters of the splicing operator from the neural network model.
As shown in fig. 6, in this embodiment, after the splicing operator is deleted from the neural network model in step S500, the method further comprises the following steps:
S610: traversing the operator list of the neural network model, and judging whether any splicing operator remains that has not been eliminated;
if so, S620: selecting an uneliminated splicing operator as the splicing operator to be eliminated, and then executing steps S200-S500 on it, so that the splicing operator is eliminated while its function is preserved;
if not, continuing with step S630: judging whether other compiling optimization tasks exist, such as compiling optimization of convolution operators or of fully-connected operators. If so, continuing with step S640: executing the other compiling optimization tasks; if no other compiling optimization task exists, continuing with step S650: compiling the neural network model to obtain an executable file that can be run by a chip. The executable file running in the chip therefore contains no splicing operator, which reduces the size of the model and shortens its running time on the chip. The data format in the executable file differs according to the requirements of the particular chip; the purpose is to compile the operator parameters, input data, and the like of the neural network into a format recognized by the chip.
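The control flow of steps S610-S650 amounts to a loop over the operator list. The sketch below is an illustrative assumption: operators are modeled as plain dictionaries and the address-rebasing work of steps S200-S500 is elided to a comment, since only the traversal logic is being shown.

```python
# Illustrative sketch of steps S610-S650: repeatedly find and eliminate
# splicing (concat) operators until none remain, then move on to other
# compilation passes. Operator representation is a simplifying assumption.
def eliminate_concats(op_list):
    removed = 0
    while True:
        concat = next((op for op in op_list if op["type"] == "concat"), None)
        if concat is None:
            break  # S630: no splicing operator left; run remaining passes
        # Steps S200-S500 would rebase the input addresses here, then:
        op_list.remove(concat)  # S500: delete from the operator list
        removed += 1
    return removed

ops = [{"type": "conv"}, {"type": "concat"}, {"type": "fc"}, {"type": "concat"}]
n = eliminate_concats(ops)
# n == 2; ops now contains only the conv and fc operators
```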
As shown in fig. 7, an embodiment of the present invention further provides a compiling and optimizing system for eliminating a splicing operator, which is used to implement the compiling and optimizing method for eliminating a splicing operator, where the system includes:
a splicing operator searching module M100, configured to search a splicing operator to be eliminated in the neural network model, in this embodiment, the splicing operator to be eliminated is searched by traversing an operator list of the neural network model;
an address information obtaining module M200, configured to obtain address information of an output array of the splicing operator, and obtain address information of an input array of the splicing operator;
the address information updating module M300 is configured to update the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
a splicing operator deleting module M400, configured to delete a splicing operator in the neural network model, specifically, to delete the splicing operator in an operator list of the neural network model and delete an operator parameter of the splicing operator in the neural network model.
In the above compiling optimization system for eliminating a splicing operator, the splicing operator to be eliminated is first found by the splicing operator searching module M100; the address information of the output array and of the input arrays is then obtained by the address information obtaining module M200; at compile time, the address information updating module M300 updates the address information of the input arrays according to the address information of the output array, so that the address information of the input arrays of the splicing operator, after being combined, corresponds to the address information of the output array. The splicing function is thus realized purely by updating address information, so no separate splicing operator is needed, and the required splicing function remains intact after the splicing operator is deleted by the splicing operator deleting module M400. The system therefore eliminates the splicing operator from the neural network model at compile time, reduces the size of the model, frees the running time of the neural network model from the execution time of the splicing operator, and accelerates the inference speed of the neural network model.
In this embodiment, the address information of the output array of the splicing operator includes a start address of the output array, and the address information of the input array of the splicing operator includes a start address of the input array and an array length.
The address information updating module M300 updates the address information of the input array of the splicing operator by the following steps:
sorting the input arrays according to the splicing sequence of the splicing operator to the input arrays;
the start address of the output array of the splicing operator is taken as the start address of the first input array; for each input array other than the first, the start address equals the start address of the previous input array plus the array length of the previous input array. The effect of splicing the input arrays is thus achieved by updating address information alone, and the spliced data can be used directly as the input of the next operator, so the function of the splicing operator is realized without using the splicing operator.
In this embodiment, the system further includes a network algorithm compiling module, configured to compile the neural network model into an executable file that can be executed by the chip, where a format of data in the executable file is a data format that can be recognized by the chip.
Specifically, the splicing operator searching module M100 is configured to search a splicing operator to be eliminated in the neural network model by using the following steps:
traversing an operator list of the neural network model, and searching whether an unremoved splicing operator exists or not;
if so, taking the searched splicing operator as the splicing operator to be eliminated;
if not, the network algorithm compiling module judges whether other compiling optimization tasks exist; if so, they are executed by the corresponding task execution modules; if not, the network algorithm compiling module compiles the neural network model to obtain an executable file that can be run by a chip.
In this embodiment, the start address of an array is expressed as the DDR offset address of the compiled array on the target machine, but the invention is not limited thereto; other representations of the start address are also possible and all fall within the scope of the invention. The address information obtaining module M200 is configured to obtain the DDR offset address of the output array of the splicing operator, and the DDR offset address and array length of each input array of the splicing operator, from the operator parameters of the neural network model.
The address information updating module M300 is configured to update the address information of the input array of the splicing operator by adopting the following steps:
acquiring the splicing sequence of the splicing operator to the input array according to the operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
for the first input array, updating the DDR offset address of the input array as the DDR offset address of the output array of the splicing operator;
and for the subsequent input arrays except the first input array, updating the DDR offset address of the input array to be the DDR offset address of the previous input array plus the array length of the previous input array.
After the start addresses of the input arrays are updated by the address information updating module M300, the effect on the structure of the neural network model is as shown in fig. 3 and fig. 4: the output of the pre-operator and the input of the post-operator of the original splicing operator now refer to the same array, and since the input arrays are arranged contiguously in splicing order, the actual effect of the splicing operator is achieved and the splicing operator can be deleted.
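The effect described above — predecessors writing directly into disjoint slices of the successor's buffer — can be illustrated with a small stdlib analogy. This is an assumption for explanation only, not the patent's implementation: `memoryview` slices of one `bytearray` stand in for the rebased input arrays sharing the output array's DDR region.

```python
# Analogy for figs. 3-4: once each pre-operator writes into its slice of
# the shared buffer, the "spliced" result exists without any copy step.
buf = bytearray(6)          # backing storage for the output array
out = memoryview(buf)       # output array of the (eliminated) concat
in1, in2 = out[:4], out[4:] # rebased input "arrays": views, not copies

in1[:] = b"\x01\x02\x03\x04"  # pre-operator 1 writes its output
in2[:] = b"\x05\x06"          # pre-operator 2 writes its output
# buf now holds 01 02 03 04 05 06 -- the concatenation, with no concat op
```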
The embodiment of the invention also provides a compiling optimization device for eliminating a splicing operator, which comprises a processor; and a memory having stored therein executable instructions of the processor; wherein the processor is configured to perform the steps of the compiling optimization method for eliminating a splicing operator via execution of the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Thus, various aspects of the invention may be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "platform."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 8. The electronic device 600 shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 8, the electronic device 600 is embodied in the form of a general purpose computing device. The components of the electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 that connects the various system components (including the storage unit 620 and the processing unit 610), a display unit 640, and the like.
Wherein the storage unit stores program code, which can be executed by the processing unit 610, so that the processing unit 610 executes the steps according to various exemplary embodiments of the present invention described in the above compiling optimization method for eliminating a splicing operator section of the present specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The storage unit 620 may include readable media in the form of volatile memory units, such as a random access memory unit (RAM)6201 and/or a cache memory unit 6202, and may further include a read-only memory unit (ROM) 6203.
The memory unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 630 may be one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 600 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 650. Also, the electronic device 600 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 via the bus 630. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In the compiling and optimizing device for eliminating the splicing operator, when the program in the memory is executed by the processor, the step of the compiling and optimizing method for eliminating the splicing operator is realized, so that the device can also obtain the technical effect of the compiling and optimizing method for eliminating the splicing operator.
The embodiment of the invention also provides a computer-readable storage medium for storing a program, and the program realizes the steps of the compiling optimization method for eliminating the splicing operator when being executed by a processor. In some possible embodiments, the various aspects of the present invention may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the present invention described in the above-mentioned compilation optimization method section of the elimination splice operator of the present specification, when the program product is executed on the terminal device.
Referring to fig. 9, a program product 800 for implementing the above method according to an embodiment of the present invention is described. It may employ a portable compact disc read-only memory (CD-ROM) including program code, and may be run on a terminal device such as a personal computer. However, the program product of the present invention is not limited in this regard; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
When being executed by a processor, the program in the computer storage medium implements the steps of the compiling optimization method for eliminating the splicing operator, so that the computer storage medium can also obtain the technical effect of the compiling optimization method for eliminating the splicing operator.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (14)

1. A compiling optimization method for eliminating a splicing operator is characterized by comprising the following steps:
s100: searching a splicing operator to be eliminated in the neural network model;
s200: acquiring address information of an output array of the splicing operator;
s300: acquiring address information of an input array of the splicing operator;
s400: updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
s500: the splice operator is deleted in the neural network model.
2. The compiling optimization method for eliminating a splicing operator according to claim 1, wherein the address information of the output array of the splicing operator comprises a start address of the output array, and the address information of the input array of the splicing operator comprises a start address of the input array and an array length;
in step S400, updating the address information of the input array of the splicing operator includes the following steps:
and taking the initial address of the output array of the splicing operator as the initial address of the first input array, wherein the initial address of each input array except the first input array is equal to the sum of the initial address of the previous input array and the array length of the previous input array.
3. The compiling and optimizing method for eliminating a splicing operator according to claim 2, wherein in the step S400, before taking the start address of the output array of the splicing operator as the start address of the first input array, the method further comprises the following steps:
and sequencing the input arrays according to the splicing sequence of the splicing operator to the input arrays.
4. The compilation optimization method for eliminating splicing operators according to claim 1, wherein the step S100: searching for a splicing operator to be eliminated in a neural network model, comprising the following steps:
traversing an operator list of the neural network model, and searching for an unremoved splicing operator;
and taking the searched splicing operator as the splicing operator to be eliminated.
5. The compilation optimization method for eliminating splicing operators according to claim 1, wherein the step S200: acquiring address information of an output array of the splicing operator, including: acquiring a DDR offset address of an output array of the splicing operator according to the operator parameter of the neural network model;
the step S300: acquiring address information of an input array of the splicing operator, wherein the address information comprises: and acquiring the DDR offset address and the array length of the input array of the splicing operator according to the operator parameter of the neural network model.
6. The compilation optimization method for eliminating splicing operators according to claim 4, wherein the step S400: updating the address information of the input array of the splicing operator, comprising the following steps:
acquiring the splicing sequence of the splicing operator to the input array according to the operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
and sequentially updating the DDR offset addresses of the input arrays according to the sorting sequence of the input arrays, so that the DDR offset addresses of the input arrays of the splicing operator correspond to the DDR offset addresses of the output arrays of the splicing operator after being combined.
7. The compiling and optimizing method for eliminating a splicing operator according to claim 6, wherein the step of sequentially updating the DDR offset addresses of the input arrays according to the sorting order of the input arrays comprises the steps of:
for the first input array, updating the DDR offset address of the input array as the DDR offset address of the output array of the splicing operator;
and for the subsequent input arrays except the first input array, updating the DDR offset address of the input array to be the DDR offset address of the previous input array plus the array length of the previous input array.
8. The compilation optimization method for eliminating splicing operators according to claim 1, wherein the step S500: after the splicing operator is deleted from the neural network model, the method further comprises the following steps:
traversing an operator list of the neural network model, and judging whether an unremoved splicing operator still exists;
if so, selecting the splicing operator which is not eliminated as the splicing operator to be eliminated, and continuing to the step S200;
if not, judging whether other compiling optimization tasks exist, if so, executing the other compiling optimization tasks, and if not, compiling the neural network model to obtain an executable file which can be operated by the chip.
9. A compilation optimization system for eliminating a splicing operator, wherein the compilation optimization method for eliminating the splicing operator is implemented according to any one of claims 1 to 8, and the system comprises:
the splicing operator searching module is used for searching a splicing operator to be eliminated in the neural network model;
the address information acquisition module is used for acquiring the address information of the output array of the splicing operator and acquiring the address information of the input array of the splicing operator;
the address information updating module is used for updating the address information of the input array of the splicing operator according to the address information of the output array of the splicing operator, so that the address information of the input array of the splicing operator corresponds to the address information of the output array of the splicing operator after being combined;
and the splicing operator deleting module is used for deleting the splicing operator in the neural network model.
10. The compiling optimization system for eliminating a splicing operator according to claim 9, wherein the address information of the output array of the splicing operator comprises a start address of the output array, and the address information of the input array of the splicing operator comprises a start address of the input array and an array length;
the address information updating module updates the address information of the input array of the splicing operator by adopting the following steps:
and taking the initial address of the output array of the splicing operator as the initial address of the first input array, wherein the initial address of each input array except the first input array is equal to the sum of the initial address of the previous input array and the array length of the previous input array.
11. The compiling and optimizing system for eliminating a splicing operator according to claim 9 further comprising a network algorithm compiling module, wherein the splicing operator searching module is configured to search a neural network model for a splicing operator to be eliminated by:
traversing an operator list of the neural network model, and searching whether an unremoved splicing operator exists or not;
if so, taking the searched splicing operator as the splicing operator to be eliminated;
if not, the network algorithm compiling module judges whether other compiling and optimizing tasks exist, if so, the other compiling and optimizing tasks are executed, and if not, the network algorithm compiling module compiles the neural network model to obtain an executable file which can be operated by a chip.
12. The compiling and optimizing system for eliminating the splicing operator according to claim 11, wherein the address information obtaining module is configured to obtain a DDR offset address of an output array of the splicing operator according to the operator parameter of the neural network model, and obtain a DDR offset address and an array length of an input array of the splicing operator according to the operator parameter of the neural network model;
the address information updating module is used for updating the address information of the input array of the splicing operator by adopting the following steps:
acquiring the splicing sequence of the splicing operator to the input array according to the operator parameters of the neural network model, and sequencing the input array according to the splicing sequence;
for the first input array, updating the DDR offset address of the input array as the DDR offset address of the output array of the splicing operator;
and for the subsequent input arrays except the first input array, updating the DDR offset address of the input array to be the DDR offset address of the previous input array plus the array length of the previous input array.
13. A compilation optimization device that eliminates concatenation operators, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the compiling optimization method for eliminating a splicing operator of any of claims 1 to 8 via execution of the executable instructions.
14. A computer-readable storage medium storing a program, wherein the program, when executed by a processor, implements the steps of the compiling optimization method for eliminating a splicing operator of any of claims 1 to 8.
CN202110295853.2A 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator Active CN113011585B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110295853.2A CN113011585B (en) 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator


Publications (2)

Publication Number Publication Date
CN113011585A true CN113011585A (en) 2021-06-22
CN113011585B CN113011585B (en) 2023-09-26

Family

ID=76403198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110295853.2A Active CN113011585B (en) 2021-03-19 2021-03-19 Compiling optimization method, system, equipment and storage medium for eliminating splicing operator

Country Status (1)

Country Link
CN (1) CN113011585B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661301A (en) * 2022-05-24 2022-06-24 深圳思谋信息科技有限公司 Graphics processing unit compiling method, device, compiling acceleration library and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100092098A1 (en) * 2008-10-10 2010-04-15 Microsoft Corporation Reduced dc gain mismatch and dc leakage in overlap transform processing
CN102244518A (en) * 2010-05-10 2011-11-16 百度在线网络技术(北京)有限公司 System and method for realizing parallel decompression of hardware
US20190080232A1 (en) * 2017-09-08 2019-03-14 International Business Machines Corporation Deep neural network perforance analysis on shared memory accelerator systems
CN109657782A (en) * 2018-12-14 2019-04-19 北京中科寒武纪科技有限公司 Operation method, device and Related product
WO2019128475A1 (en) * 2017-12-29 2019-07-04 中兴通讯股份有限公司 Method and device for training data, storage medium, and electronic device
CN110659728A (en) * 2019-09-24 2020-01-07 上海寒武纪信息科技有限公司 Neural network optimization method and device, computer equipment and storage medium
US20200012924A1 (en) * 2018-07-03 2020-01-09 Sandisk Technologies Llc Pipelining to improve neural network inference accuracy
CN111401511A (en) * 2019-09-24 2020-07-10 上海寒武纪信息科技有限公司 Data processing method and device, computer equipment and storage medium
CN111523652A (en) * 2019-02-01 2020-08-11 阿里巴巴集团控股有限公司 Processor, data processing method thereof and camera device
CN112463159A (en) * 2020-11-25 2021-03-09 安徽寒武纪信息科技有限公司 Compiling method, compiling device, electronic equipment and storage medium
CN112463160A (en) * 2020-11-25 2021-03-09 安徽寒武纪信息科技有限公司 Compiling method, compiling device, electronic equipment and storage medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ZHENGJIE LI et al.: "A survey of FPGA design for AI era", Journal of Semiconductors, 29 February 2020
LIU Bo et al.: "A data cache structure and management mechanism in a reconfigurable system for radar applications", Journal of Shanghai Jiao Tong University, 31 May 2017


Also Published As

Publication number Publication date
CN113011585B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
US10534590B2 (en) Dynamic recompilation techniques for machine learning programs
JP2755154B2 (en) Program conversion processing device and program conversion processing method
CN112579063B (en) Acceleration method for exploring optimization space in deep learning compiler
US8533680B2 (en) Approximating finite domains in symbolic state exploration
US5355494A (en) Compiler for performing incremental live variable analysis for data-parallel programs
US5854932A (en) Compiler and method for avoiding unnecessary recompilation
US6983458B1 (en) System for optimizing data type definition in program language processing, method and computer readable recording medium therefor
US9696974B2 (en) Graph-based model for type systems
US5778212A (en) Interprocedural analysis user interface
US20080288915A1 (en) Determining destinations of a dynamic branch
US7353503B2 (en) Efficient dead code elimination
US20200249925A1 (en) On-demand loading of dynamic scripting language code for reduced memory usage
US10228920B2 (en) Automatic selection of an abstract data type
US9201692B2 (en) System and method for generating a plan to complete a task in computing environment
Loogen et al. Distributed implementation of programmed graph reduction
US8752056B2 (en) Running native code across single or multi-core hybrid processor architecture
WO2023197554A1 (en) Model reasoning acceleration method and apparatus, and electronic device and storage medium
CN110598855A (en) Deep learning model generation method, device, equipment and storage medium
CN115809063B (en) Storage process compiling method, system, electronic equipment and storage medium
CN113011585B (en) Compiling optimization method, system, equipment and storage medium for eliminating splicing operator
CN112506523A (en) BERT model optimization method and system, electronic device and storage medium
US20110271265A1 (en) Method of automatic generation of executable code for multi-core parallel processing
US5515535A (en) System and method for parallel variable optimization
CN114356964A (en) Data blood margin construction method and device, storage medium and electronic equipment
WO2000022523A1 (en) Apparatus and method for program optimizing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 503-3, 398 Jiangsu Road, Changning District, Shanghai 200050

Applicant after: Shanghai Xijing Technology Co.,Ltd.

Address before: Room 503-3, 398 Jiangsu Road, Changning District, Shanghai 200050

Applicant before: SHANGHAI WESTWELL INFORMATION AND TECHNOLOGY Co.,Ltd.
GR01 Patent grant