CN115858648A - Database generation method, data stream segmentation method, device, equipment and medium

Info

Publication number: CN115858648A
Application number: CN202211518362.0A
Authority: CN
Inventors: 孙铁力; 潘佳诚; 张亚林
Original assignee: Shanghai Enflame Technology Co ltd
Current assignee: Shanghai Enflame Technology Co ltd
Priority date: 2022-11-29
Filing date: 2022-11-29
Publication date: 2023-03-28

Abstract

The invention discloses a database generation method, a data stream segmentation method, a device, equipment and a medium, wherein the database generation method comprises the following steps: generating an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams and segmentation modes and costs corresponding to the initial tensor shapes, and the costs corresponding to the initial tensor shapes are smaller than a tolerable cost threshold; carrying out at least one dimension size reduction on the initial tensor shape in at least one dimension until the reduced tensor shape and the corresponding cost thereof do not meet the linear relation, and recording the tensor shape before the last reduction as a boundary tensor shape; generating a boundary database based on the boundary tensor shapes; the initial database and the boundary database are used for carrying out tensor shape matching on the input data stream, and determining the segmentation mode of the input data stream according to the matching result. The embodiment of the invention can accurately predict the cost of the data stream when each segmentation mode is applied, and is beneficial to determining the better segmentation mode of each data stream.

Description

Database generation method, data stream segmentation method, device, equipment and medium

Technical Field

The invention relates to the technical field of data processing, in particular to a database generation method, a data stream segmentation method, a device, equipment and a medium.

Background

An AI chip, such as a DLA (Deep Learning Accelerator), has a theoretical computing performance once its design is complete, that is, the best computing capability it can reach under ideal conditions. The actual computing performance of the AI chip is determined by its data storage manner (e.g., single-level or multi-level storage), its bandwidth, its theoretical computing capability, and its routing allocation manner. Current mainstream DLAs have a complex storage hierarchy (memory hierarchy). As shown in FIG. 1, taking three-level storage as an example, each DLA includes a plurality of clusters, each cluster includes a plurality of computing units, and each computing unit includes one or more multiplier arrays. From top to bottom, the storage capacity of each storage level gradually decreases, while the bandwidth and the access speed gradually increase. When a data stream is input to the DLA, the input data passes through the storage levels from top to bottom, and the output data passes through them from bottom to top.

For the input {in0, in1, in2, …} and the output {out0, out1, out2, …} of a DLA, the splits {in0_i, in1_i, in2_i, …} and {out0_i, out1_i, out2_i, …} stored at the i-th storage level constitute a data stream segmentation and handling strategy. For a given input and output there is an exponential number of such strategies. If an existing search strategy is used to find the optimal data stream handling strategy, the cost of every strategy must be modeled, so the model is highly complex and its accuracy is difficult to guarantee. Therefore, in the prior art it is difficult to accurately predict the cost incurred when each segmentation mode is applied to the tensor shape of the current data stream, and difficult to determine a better segmentation mode with lower cost and better performance for that tensor shape.

Disclosure of Invention

The invention provides a database generation method, a data stream segmentation method, a device, equipment and a medium, which are used for accurately predicting the cost of each segmentation mode applied to a data stream and are beneficial to determining the better segmentation mode of each data stream.

In a first aspect, an embodiment of the present invention provides a database generation method, including:

generating an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams, and a segmentation mode and a cost corresponding to the initial tensor shapes, wherein the cost corresponding to the initial tensor shapes is less than a tolerable cost threshold;

performing dimension reduction on the initial tensor shape for at least one dimension at least once, stopping reduction until the reduced tensor shape and the corresponding cost thereof do not meet the linear relation, and recording the tensor shape before the final reduction as a boundary tensor shape;

generating a boundary database based on a boundary tensor shape obtained by dimension size reduction on each dimension; the boundary database comprises a plurality of boundary tensor shapes, and a segmentation mode and cost corresponding to the boundary tensor shapes;

the initial database and the boundary database are used for carrying out tensor shape matching on an input data stream, and determining the segmentation mode of the input data stream according to a matching result.

Optionally, the obtaining of the boundary tensor shape includes:

performing one dimension size reduction in any dimension for any initial tensor shape;

for the current dimension, calculating the size ratio of the dimension size after reduction to the dimension size before reduction; calculating the cost ratio of the cost corresponding to the reduced tensor shape to the cost corresponding to the tensor shape before reduction;

judging whether the size ratio and the cost ratio are equal or not;

if so, continuing to perform dimension size reduction on the reduced tensor shape until the size ratio and the cost ratio obtained after a reduction are not equal, and recording the tensor shape before the last reduction as the boundary tensor shape.

Optionally, continued dimensional size reduction is performed based on the current dimension or other dimensions.

Optionally, if the size ratio is not equal to the cost ratio, performing dimension size reduction on the current initial tensor shape in other dimensions to obtain a boundary tensor shape corresponding to the current initial tensor shape.

Optionally, if the size ratio and the cost ratio obtained by correspondingly reducing the size of the current initial tensor shape in each dimension are not equal to each other, it is determined that the current initial tensor shape does not have a corresponding boundary tensor shape.
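
The optional steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `halve`, `is_linear`, `find_boundary` and `toy_cost` are hypothetical, each reduction halves the dimension, an odd dimension size stops the reduction, and the cost function is a made-up stand-in for a measured cost.

```python
from fractions import Fraction
from typing import Callable, Optional, Tuple

Shape = Tuple[int, ...]


def halve(shape: Shape, dim: int) -> Optional[Shape]:
    """Halve one dimension; return None when the size is odd (assumption: stop there)."""
    if shape[dim] % 2 != 0:
        return None
    return shape[:dim] + (shape[dim] // 2,) + shape[dim + 1:]


def is_linear(before: Shape, after: Shape, dim: int,
              cost: Callable[[Shape], int]) -> bool:
    """Compare the size ratio and the cost ratio (both taken as after / before)."""
    size_ratio = Fraction(after[dim], before[dim])
    cost_ratio = Fraction(cost(after), cost(before))
    return size_ratio == cost_ratio


def find_boundary(initial: Shape, cost: Callable[[Shape], int]) -> Optional[Shape]:
    """Return one boundary tensor shape of `initial`, or None if none exists."""
    for dim in range(len(initial)):      # fall back to other dimensions in turn
        current = initial
        while True:
            reduced = halve(current, dim)
            if reduced is None or not is_linear(current, reduced, dim, cost):
                break                    # `current` is the shape before the last reduction
            current = reduced
        if current != initial:           # at least one reduction kept the relation
            return current
    return None                          # every dimension failed on its first reduction


def toy_cost(shape: Shape) -> int:
    """Hypothetical cost: work is padded up to a 16 x 8 hardware tile."""
    return max(shape[0], 16) * max(shape[1], 8)


print(find_boundary((64, 32), toy_cost))  # (16, 32)
print(find_boundary((16, 8), toy_cost))   # None
```

With this toy cost, halving the first dimension of (64, 32) stays linear down to 16 and breaks at 8, so (16, 32) is recorded; (16, 8) breaks on the first reduction in both dimensions and therefore has no boundary tensor shape.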

In a second aspect, an embodiment of the present invention further provides a database generation apparatus, including:

the initial database generating module is used for generating an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams, and segmentation modes and costs corresponding to the initial tensor shapes, wherein the costs corresponding to the initial tensor shapes are smaller than a tolerable cost threshold;

a boundary tensor shape screening module, configured to perform dimension reduction on the initial tensor shape in at least one dimension at least once, stop reducing until the reduced tensor shape and the corresponding cost thereof do not satisfy a linear relationship, and record the tensor shape before the last reduction as a boundary tensor shape;

the boundary database generation module is used for generating a boundary database based on a boundary tensor shape obtained by dimension size reduction on each dimension; the boundary database comprises a plurality of boundary tensor shapes, and segmentation modes and costs corresponding to the boundary tensor shapes;

In a third aspect, an embodiment of the present invention further provides a data stream segmentation method, which is implemented based on the initial database and the boundary database provided in any embodiment of the present invention, and the data stream segmentation method includes:

acquiring a current tensor shape of an input data stream;

and if the initial database and/or the boundary database have/has a target tensor shape matched with the current tensor shape, segmenting the input data stream by adopting a segmentation mode of the target tensor shape.

Optionally, if the shape of the target tensor is not unique, estimating an estimated cost generated when the input data stream is segmented by adopting a segmentation mode of the shape of the target tensor according to the cost of the shape of the target tensor; and selecting a segmentation mode of the shape of the target tensor with the lowest estimated cost to segment the input data stream.

Optionally, the matching of the target tensor shape to the current tensor shape comprises:

the first matching condition is as follows: the target tensor shape is the same as the current tensor shape;

or,

the second matching condition is as follows: the number of dimensions of the target tensor shape is the same as the number of dimensions of the current tensor shape, and the dimension size of each dimension of the current tensor shape is an integral multiple of the dimension size of the corresponding dimension of the target tensor shape.
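
The two matching conditions can be written as small predicates. The sketch below assumes a tensor shape is represented as a tuple of positive integers; the function names are hypothetical.

```python
from typing import Tuple

Shape = Tuple[int, ...]


def matches_first_condition(current: Shape, target: Shape) -> bool:
    """First matching condition: the two tensor shapes are identical."""
    return current == target


def matches_second_condition(current: Shape, target: Shape) -> bool:
    """Second matching condition: same number of dimensions, and every
    dimension of `current` is an integral multiple of that of `target`."""
    if len(current) != len(target):
        return False
    return all(c % t == 0 for c, t in zip(current, target))


print(matches_first_condition((64, 32), (64, 32)))    # True
print(matches_second_condition((128, 96), (64, 32)))  # True  (multiples 2 and 3)
print(matches_second_condition((128, 96), (64, 64)))  # False (96 is not a multiple of 64)
```

Note that an identical shape also satisfies the second condition with a multiple of 1 in every dimension, which is why the order in which the conditions are tried matters.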

Optionally, the selecting process of the shape of the target tensor comprises:

judging whether a first target tensor shape meeting a first matching condition exists in the initial database or not; if so, taking the first target tensor shape as the target tensor shape;

if the first target tensor shape does not exist in the initial database, judging whether a second target tensor shape meeting a first matching condition exists in the boundary database or not; if so, taking the second target tensor shape as the target tensor shape;

if the second target tensor shape does not exist in the boundary database, judging whether a third target tensor shape meeting a second matching condition exists in the initial database or not; if so, taking the third target tensor shape as the target tensor shape;

if the third target tensor shape does not exist in the initial database, judging whether a fourth target tensor shape meeting the second matching condition exists in the boundary database or not; and if so, taking the fourth target tensor shape as the target tensor shape.
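
The four-stage selection process above can be sketched as a prioritized lookup. The sketch assumes each database is a dictionary keyed by tensor shape; the record layout and all names are hypothetical. It returns every match found at the first stage that yields any, so that a non-unique result can be resolved afterwards by estimated cost.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]
Record = Tuple[str, float]  # (segmentation mode, cost); hypothetical encoding


def same_shape(current: Shape, target: Shape) -> bool:
    return current == target


def integer_multiple(current: Shape, target: Shape) -> bool:
    return len(current) == len(target) and all(
        c % t == 0 for c, t in zip(current, target))


def select_targets(current: Shape,
                   initial_db: Dict[Shape, Record],
                   boundary_db: Dict[Shape, Record]) -> List[Shape]:
    """Try the four stages in order and return the matches of the first hit."""
    stages = [
        (initial_db, same_shape),         # first condition, initial database
        (boundary_db, same_shape),        # first condition, boundary database
        (initial_db, integer_multiple),   # second condition, initial database
        (boundary_db, integer_multiple),  # second condition, boundary database
    ]
    for db, condition in stages:
        hits = [shape for shape in db if condition(current, shape)]
        if hits:
            return hits
    return []


initial_db = {(64, 32): ("mode_a", 40.0), (128, 64): ("mode_b", 150.0)}
boundary_db = {(16, 32): ("mode_a", 10.0), (64, 8): ("mode_a", 10.0)}

print(select_targets((64, 32), initial_db, boundary_db))   # [(64, 32)]
print(select_targets((16, 32), initial_db, boundary_db))   # [(16, 32)]
print(select_targets((128, 96), initial_db, boundary_db))  # [(64, 32)]
print(select_targets((48, 64), initial_db, boundary_db))   # [(16, 32)]
print(select_targets((7, 7), initial_db, boundary_db))     # []
```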

In a fourth aspect, an embodiment of the present invention further provides a data stream segmentation apparatus, including:

a tensor shape acquisition module for acquiring a current tensor shape of the input data stream;

and the segmentation mode endowing module is used for segmenting the input data stream by adopting the segmentation mode of the target tensor shape when the target tensor shape matched with the current tensor shape exists in an initial database and/or a boundary database.

In a fifth aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor to enable the at least one processor to execute the database generation method provided by any embodiment of the present invention and/or the data stream splitting method provided by any embodiment of the present invention.

In a sixth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores computer instructions, and the computer instructions are configured to, when executed, cause a processor to implement the database generation method provided in any embodiment of the present invention and/or the data stream splitting method provided in any embodiment of the present invention.

According to the database generation method provided by the embodiment of the invention, the set of initial tensor shapes whose cost is less than the tolerable cost threshold forms the initial database, which ensures that each initial tensor shape performs well. Dimension size reduction is then carried out on each initial tensor shape in each dimension to obtain a large number of boundary tensor shapes that satisfy the linear relation of cost; these form the boundary database, which ensures that the cost of any tensor shape whose size is a multiple of a boundary tensor shape can be measured accurately. The tensor shapes in the initial database and the boundary database both serve as reference tensor shapes for tensor shape matching of input data streams. First, for any unknown input data stream, as long as a matching reference tensor shape exists, the cost of applying the segmentation mode of that reference tensor shape to the input data stream can be accurately estimated from the cost of the reference tensor shape, without modeling each input data stream separately. Second, because each initial tensor shape performs well, the performance of the boundary tensor shapes that are linearly related to it is also ensured to a certain extent; when a matching reference tensor shape exists, its segmentation mode is directly assigned to the input data stream, which is equivalent to reusing the data stream of the smaller tensor shape to ensure the performance of the larger tensor shape. The finally selected segmentation and handling mode therefore performs well and makes full use of the computing power of the chip.
Therefore, compared with the prior art, the embodiment of the invention can accurately predict the cost of a data stream when each segmentation mode is applied, and helps to determine a better segmentation mode for each data stream.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of a prior art DLA multilevel memory structure;

FIG. 2 is a schematic flowchart of a database generation method according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of another database generation method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram illustrating a correspondence relationship between sizes of tensor shapes and cost linearity according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a correspondence between actual cost linearity and predicted cost linearity of each tensor shape according to an embodiment of the present invention;

FIG. 6 is a schematic flowchart of a data stream segmentation method according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a process for finding a target tensor shape according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a database generation apparatus according to an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of a data stream segmentation apparatus according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or described herein. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.

The embodiment of the invention provides a database generation method. The generated database serves as the basis for a generalizable data stream segmentation and handling strategy, and the method is suitable for AI chips with any storage hierarchy that need to process data streams during computation. The method can be executed by a database generation device, which can be implemented in hardware and/or software and can be configured in an electronic device. FIG. 2 is a schematic flowchart of a database generation method according to an embodiment of the present invention. As shown in FIG. 2, the database generation method includes:

S110, generating an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams, and segmentation modes and costs corresponding to the initial tensor shapes, wherein the costs corresponding to the initial tensor shapes are smaller than a tolerable cost threshold.

The segmentation mode corresponding to a tensor shape comprises the segmentation mode of that tensor shape in each storage level of the AI chip. The cost corresponding to a tensor shape includes, but is not limited to, the time and power consumed in segmenting and carrying the data stream. The initial database forms part of the reference database used when selecting a segmentation mode for an unknown input data stream. It is constructed as a set of tensor shapes of data streams whose cost is less than the tolerable cost threshold, so that the cost corresponding to the initial tensor shape of each initial data stream is small and the computing power of the AI chip can be fully used. Because the initial data streams used as references perform well, the boundary tensor shapes obtained by dimension size reduction from the initial tensor shapes also perform well, which in turn ensures the performance of the segmentation modes of both the initial tensor shapes and the boundary tensor shapes.

Specifically, the initial database may be generated as follows. A plurality of initial data streams are randomly generated, and the segmentation search space of each initial data stream in the target chip is obtained. Invalid segmentation modes in the search space are filtered out successively according to a plurality of segmentation constraint conditions, which yields a plurality of alternative segmentation modes. The initial data stream is compiled with each alternative segmentation mode to obtain measured data for that mode. From the measured data, the target segmentation mode with the minimum cost for the initial data stream in the target chip is determined among the alternative segmentation modes. Finally, the initial tensor shape of each target segmentation mode and its corresponding cost are recorded to generate the initial database.

Illustratively, taking matrix multiplication implemented on the storage structure of FIG. 1 as an example, the input of the chip includes two data streams, a left operand and a right operand, and the output of the chip includes one data stream, the calculation result. The tensor shapes of the input left operand, the input right operand and the output are specified in advance. After the left operand and the right operand are input to the DLA, the DLA determines a corresponding segmentation mode in each storage level. That is, while the input data passes through the three storage levels to reach the multiplier array for calculation, and the calculation results are returned step by step to the DLA for merging and storage, the segmentation modes of the left operand, the right operand and the calculation result in the cluster, in the computing unit and in the multiplier array need to be determined respectively.

The preferred segmentation mode of each data stream may be obtained as follows. All segmentation modes of each data stream of the operator at each storage level are enumerated without considering whether they are reasonable. Segmentation modes that do not satisfy the constraint conditions, such as hardware constraints and software stack constraints, are then filtered out step by step, and the few that remain serve as alternative segmentation modes. The input data is segmented according to each remaining alternative segmentation mode and the computing performance of the operator on the AI chip is actually measured; based on the measurements, the segmentation mode with better performance at each storage level is selected. The better segmentation mode and the cost corresponding to the tensor shape of each data stream of the operator are stored in the initial database for later use.
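
The enumerate, filter, measure and select procedure can be sketched as below. This is a heavily simplified illustration under several assumptions that are not in the original text: a single storage level, a segmentation mode encoded as a tile size per dimension, and a `measure` callable standing in for compiling and running on the chip. All names and the toy constraint and cost are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Shape = Tuple[int, ...]
Split = Tuple[int, ...]  # tile size per dimension (hypothetical encoding)


def divisors(n: int) -> List[int]:
    return [d for d in range(1, n + 1) if n % d == 0]


def search_space(shape: Shape) -> List[Split]:
    """All tilings that divide every dimension evenly."""
    return list(product(*(divisors(d) for d in shape)))


def best_split(shape: Shape,
               constraints: Iterable[Callable[[Shape, Split], bool]],
               measure: Callable[[Shape, Split], int]) -> Optional[Tuple[Split, int]]:
    checks = list(constraints)
    candidates = [s for s in search_space(shape)
                  if all(ok(shape, s) for ok in checks)]   # filter invalid modes
    if not candidates:
        return None
    costs = {s: measure(shape, s) for s in candidates}      # "measured" cost per mode
    best = min(costs, key=costs.get)
    return best, costs[best]


def build_initial_db(shapes: Iterable[Shape], constraints, measure,
                     threshold: int) -> Dict[Shape, Tuple[Split, int]]:
    db = {}
    for shape in shapes:
        result = best_split(shape, constraints, measure)
        if result is not None and result[1] < threshold:    # keep tolerable costs only
            db[shape] = result
    return db


def fits_buffer(shape: Shape, split: Split) -> bool:
    """Toy hardware constraint: a tile holds at most 64 elements."""
    return split[0] * split[1] <= 64


def toy_measure(shape: Shape, split: Split) -> int:
    """Toy cost: per-tile overhead of 10 plus one unit per element."""
    tiles = (shape[0] // split[0]) * (shape[1] // split[1])
    return tiles * (10 + split[0] * split[1])


db = build_initial_db([(8, 8), (16, 16)], [fits_buffer], toy_measure, threshold=100)
print(db)  # {(8, 8): ((8, 8), 74)}
```

In this toy setting the shape (16, 16) is dropped because its best cost of 296 exceeds the threshold, which mirrors the requirement that only tensor shapes with tolerable cost enter the initial database.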

S120, performing at least one dimension size reduction on the initial tensor shape in at least one dimension, stopping the reduction when the reduced tensor shape and its corresponding cost no longer satisfy the linear relation, and recording the tensor shape before the last reduction as a boundary tensor shape.

Each tensor shape comprises at least one dimension, and each dimension comprises at least one element. Dimension size reduction reduces the number of elements of the initial tensor shape in a certain dimension. Illustratively, halving the number of elements in one dimension of the initial tensor shape once corresponds to one dimension size reduction of the initial tensor shape. A tensor shape and its corresponding cost satisfy a linear relationship when, after one dimension size reduction, the cost is reduced by the same factor as the tensor shape.
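
As a worked illustration of the linear relationship (the symbols and the numbers below are introduced here for illustration only and do not appear in the original text), let a tensor shape $S$ have size $d_k$ in dimension $k$ and cost $C(S)$, and let $S'$ be the shape obtained by reducing that dimension to $d_k'$:

```latex
\[
r_{\text{size}} = \frac{d_k'}{d_k}, \qquad
r_{\text{cost}} = \frac{C(S')}{C(S)}, \qquad
\text{linear} \iff r_{\text{size}} = r_{\text{cost}} .
\]
% Halving a dimension of size 64 gives r_size = 32/64 = 1/2.
% If the cost falls from 40 to 20, r_cost = 20/40 = 1/2: the relation holds.
% If the cost only falls from 40 to 26, r_cost = 26/40 = 13/20 \neq 1/2: the relation fails.
```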

This step is equivalent to reducing each initial tensor shape along each dimension direction and collecting the set of minimum-size boundary tensor shapes that lie on the boundary of the linear relationship between size and cost. The performance of a boundary tensor shape may be lower than that of the initial tensor shape, but as a linear boundary of the initial tensor shape its size is smaller. As long as the tensor shape size of an input data stream is an integral multiple of the boundary tensor shape size, the cost of the input data stream and the cost corresponding to the boundary tensor shape conform to a linear multiple relation, so the cost of the input data stream can be accurately estimated. In effect, the initial database is expanded by a boundary database formed of a larger number of smaller boundary tensor shapes, and the two databases together form the reference database used for tensor shape matching of input data streams. While accurate cost estimation is preserved, the probability that a reference tensor shape matches an unknown input tensor shape is improved. The large number of boundary tensor shapes greatly enriches the number and the structure of the reference tensor shapes, and widens the application range of both the reference database and the subsequent data stream segmentation method.

An n-dimensional initial tensor shape generally corresponds to more than n boundary tensor shapes in the n-dimensional space. Illustratively, for an n-dimensional initial tensor shape, reducing it to the linear boundary by always performing dimension size reduction along a single dimension yields n boundary tensor shapes, one per dimension. Switching to another dimension during the reduction process yields further boundary tensor shapes different from these n. Even when the same dimensions are selected for reduction, different reduction scales (multiples) in any of those dimensions correspond to different boundary tensor shapes.
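
To illustrate why one initial tensor shape can yield more than n boundary tensor shapes, the sketch below explores every order of halving reductions. It rests on an interpretation that the original text does not state explicitly: a reduced shape is counted as a boundary tensor shape as soon as a further halving in some dimension breaks the equality of size ratio and cost ratio. The function names and the toy cost are hypothetical.

```python
from fractions import Fraction
from typing import Callable, Set, Tuple

Shape = Tuple[int, ...]


def enumerate_boundaries(initial: Shape,
                         cost: Callable[[Shape], int]) -> Set[Shape]:
    """Collect boundary shapes reachable by halving along any order of dimensions."""
    boundaries: Set[Shape] = set()
    seen = {initial}
    stack = [initial]
    while stack:
        shape = stack.pop()
        for dim in range(len(shape)):
            if shape[dim] % 2 != 0:
                continue  # assumption: odd sizes are not reduced further
            reduced = shape[:dim] + (shape[dim] // 2,) + shape[dim + 1:]
            size_ratio = Fraction(reduced[dim], shape[dim])
            cost_ratio = Fraction(cost(reduced), cost(shape))
            if size_ratio == cost_ratio:
                if reduced not in seen:       # still linear: keep exploring
                    seen.add(reduced)
                    stack.append(reduced)
            elif shape != initial:            # linearity breaks after a reduced shape
                boundaries.add(shape)
    return boundaries


def toy_cost(shape: Shape) -> int:
    """Hypothetical cost: work is padded up to a 16 x 8 hardware tile."""
    return max(shape[0], 16) * max(shape[1], 8)


print(sorted(enumerate_boundaries((64, 32), toy_cost)))
# [(16, 8), (16, 16), (16, 32), (32, 8), (64, 8)]
```

For this two-dimensional example, reducing along a single dimension gives (16, 32) and (64, 8), while mixing dimensions adds (16, 16), (16, 8) and (32, 8), for five boundary tensor shapes in total.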

S130, generating a boundary database based on the boundary tensor shape obtained by dimension size reduction on each dimension; the boundary database comprises a plurality of boundary tensor shapes and segmentation modes and costs corresponding to the boundary tensor shapes; the initial database and the boundary database are used for tensor shape matching of the input data stream, and the segmentation mode of the input data stream is determined according to the matching result.

If the tensor shape of the input data stream is the same as, or an integral multiple of, a tensor shape in the reference database, the two can be considered matched. In that case the cost of the input data stream can be estimated from the cost of the matched tensor shape, without complex processing such as modeling the input data stream. Illustratively, when the tensor shape of the input data stream matches several tensor shapes in the reference database, a tensor shape belonging to the initial database can be selected preferentially and its segmentation mode assigned to the input data stream. Because the tensor shapes in the initial database are all obtained by actual measurement, the performance of their segmentation modes is better ensured than that of boundary tensor shapes obtained by reduction, so the input data stream receives a segmentation mode with a lower cost. Alternatively, when several tensor shapes match, the estimated costs of segmenting the input data stream with the segmentation mode of each matched tensor shape can be sorted, and the segmentation mode with the minimum estimated cost is assigned to the input data stream.
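
The cost estimate and the minimum-cost choice can be sketched as follows. The text only states that the costs follow a linear multiple relation; scaling the reference cost by the product of the per-dimension multiples is an assumption made here, and the names and the record layout are hypothetical.

```python
from math import prod
from typing import Dict, Tuple

Shape = Tuple[int, ...]
Record = Tuple[str, int]  # (segmentation mode, cost); hypothetical encoding


def estimate_cost(current: Shape, reference: Shape, reference_cost: int) -> int:
    """Scale the reference cost by how many reference shapes fit in `current`.
    Assumes `current` is an integral multiple of `reference` in every dimension."""
    multiple = prod(c // r for c, r in zip(current, reference))
    return reference_cost * multiple


def choose_mode(current: Shape, matches: Dict[Shape, Record]) -> Tuple[str, int]:
    """Pick the matched reference shape with the lowest estimated cost."""
    estimates = {shape: estimate_cost(current, shape, record[1])
                 for shape, record in matches.items()}
    best = min(estimates, key=estimates.get)
    return matches[best][0], estimates[best]


matches = {(64, 32): ("mode_a", 10), (16, 32): ("mode_b", 3)}
print(choose_mode((128, 64), matches))  # ('mode_a', 40)
```

Here the reference shape (64, 32) fits four times into (128, 64) for an estimate of 40, while (16, 32) fits sixteen times for an estimate of 48, so the first segmentation mode is chosen.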

According to the database generation method provided by the embodiment of the invention, the set of initial tensor shapes whose cost is less than the tolerable cost threshold forms the initial database, which ensures that each initial tensor shape performs well. Dimension size reduction is then carried out on each initial tensor shape in each dimension to obtain a large number of boundary tensor shapes that satisfy the linear relation of cost; these form the boundary database, which ensures that the cost of any tensor shape whose size is a multiple of a boundary tensor shape can be measured accurately. The tensor shapes in the initial database and the boundary database both serve as reference tensor shapes for tensor shape matching of input data streams. First, for any unknown input data stream, as long as a matching reference tensor shape exists, the cost of applying the segmentation mode of that reference tensor shape to the input data stream can be accurately estimated from the cost of the reference tensor shape, without modeling each input data stream separately. Second, because each initial tensor shape performs well, the performance of the boundary tensor shapes that are linearly related to it is also ensured to a certain extent; when a matching reference tensor shape exists, its segmentation mode is directly assigned to the input data stream, which is equivalent to reusing the data stream of the smaller tensor shape to ensure the performance of the larger tensor shape. The finally selected segmentation and handling mode therefore performs well and makes full use of the computing power of the chip.
Therefore, compared with the prior art, the embodiment of the invention can accurately predict the cost of a data stream when each segmentation mode is applied, and helps to determine a better segmentation mode for each data stream.

The procedure of performing dimension size reduction on each initial tensor shape is described in detail below with reference to FIG. 3, but the present invention is not limited thereto. FIG. 3 is a schematic flowchart of another database generation method provided in an embodiment of the present invention. Referring to FIG. 3, the database generation method includes:

S200, generating an initial database.

S201, selecting an unreduced initial tensor shape from an initial database.

S202, for the selected initial tensor shape, selecting any one unreduced dimension and performing a first dimension size reduction to obtain a primary reduced tensor shape.

This step corresponds to performing one dimension size reduction on any initial tensor shape in any dimension, for example, by halving the number of elements in one dimension of the selected initial tensor shape.

S203, judging whether the size ratio and the cost ratio between the initial tensor shape and the primary reduced tensor shape are equal or not; if yes, executing S204; if not, S211 is executed.

Here, the size ratio is, for the dimension currently being reduced, the ratio of the dimension size (number of elements) of the tensor shape after reduction (the primary reduced tensor shape in this step) to the dimension size (number of elements) of the tensor shape before reduction (the initial tensor shape in this step). The cost ratio is the ratio of the cost corresponding to the tensor shape after reduction to the cost corresponding to the tensor shape before reduction.

If the size ratio between the initial tensor shape and the primary reduced tensor shape is equal to the cost ratio, the two shapes have a linear cost relationship, and further dimension size reduction can be performed on the primary reduced tensor shape. If the size ratio is not equal to the cost ratio, no linear cost relationship exists between the primary reduced tensor shape and the initial tensor shape; that is, for the selected initial tensor shape, no corresponding boundary tensor shape exists when the current dimension is used as the first dimension to be reduced. In that case, the first reduction needs to be attempted on another dimension of the initial tensor shape, and it is judged whether the primary reduced tensor shape obtained in that dimension has a linear relationship with the initial tensor shape.
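The comparison made in S203 can be sketched as follows. This is a minimal illustration only: the function name `is_cost_linear` and the relative tolerance `rel_tol` are assumptions introduced here (the text speaks of the two ratios being "equal"; measured costs would presumably need some tolerance).

```python
import math


def is_cost_linear(size_before, size_after, cost_before, cost_after, rel_tol=0.05):
    """Return True when the cost shrinks in the same proportion as the reduced dimension.

    rel_tol is a hypothetical tolerance, not a value taken from the source.
    """
    size_ratio = size_after / size_before   # dimension size after / before reduction
    cost_ratio = cost_after / cost_before   # cost after / before reduction
    return math.isclose(size_ratio, cost_ratio, rel_tol=rel_tol)
```

For example, halving a dimension from 64 to 32 while the cost drops from 100 to 50 counts as linear, whereas a drop from 100 to only 80 does not.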

S204, continuing to reduce the dimension size of the primary reduced tensor shape in the current dimension or in another dimension.

When the primary reduced tensor shape and the initial tensor shape satisfy the linear relationship, dimension size reduction can be continued on the primary reduced tensor shape to find its linear boundary in each dimension direction.

S205, judging whether the size ratio and the cost ratio before and after reduction are equal; if yes, returning to execute S204; if not, go to S206.

In this step, the two ratios are calculated between the tensor shape before the current reduction and the tensor shape after the current reduction, so as to judge whether the two still satisfy the linear relationship. If the size ratio and the cost ratio are not equal, the tensor shape obtained by the current reduction does not satisfy the linear cost condition, which means that the reduction in the current dimension direction has reached the linear boundary, and the reduction is stopped.

S206, recording the tensor shape before the last reduction as the boundary tensor shape, and adding the boundary tensor shape into the boundary database.

Wherein, the last reduction is the reduction after which the size ratio and the cost ratio are unequal for the first time. That is, before this reduction, the tensor shape is large enough to cover the hardware running time, so the shape size and the cost are, on the whole, directly proportional; after this reduction, the tensor shape is too small to cover the hardware running time, and the cost no longer decreases by much even if the dimension size is reduced by a multiple. Therefore, the cost of a tensor shape obtained by further reducing the boundary tensor shape cannot be accurately estimated.
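The loop formed by S204 to S206 along a single dimension can be sketched as below. Everything named here is hypothetical: `toy_cost` is a stand-in cost model (work proportional to the element count, floored by a fixed overhead that plays the role of the hardware running time), halving is only one possible reduction step, and returning the smallest reached shape when the dimension can no longer be halved is an assumption the text does not address.

```python
def toy_cost(shape, overhead=64.0):
    """Hypothetical cost model: element count, but never below a fixed overhead."""
    elements = 1
    for size in shape:
        elements *= size
    return max(float(elements), overhead)


def find_boundary_along(shape, dim, cost_fn, rel_tol=1e-6):
    """Halve shape[dim] until the cost stops scaling linearly.

    Returns the shape before the last (non-linear) reduction, or None when the
    very first reduction already breaks linearity (the S211 branch).
    """
    current = tuple(shape)
    reduced_once = False
    while current[dim] % 2 == 0:
        candidate = current[:dim] + (current[dim] // 2,) + current[dim + 1:]
        size_ratio = candidate[dim] / current[dim]
        cost_ratio = cost_fn(candidate) / cost_fn(current)
        if abs(size_ratio - cost_ratio) > rel_tol * size_ratio:
            return current if reduced_once else None
        current = candidate
        reduced_once = True
    # Assumption: if the dimension cannot be halved further, keep the last shape.
    return current if reduced_once else None
```

With the toy model, `find_boundary_along((4, 64), 1, toy_cost)` stops at `(4, 16)`, because halving once more leaves the cost pinned at the overhead.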

Fig. 4 is a schematic diagram illustrating the correspondence between the size of a tensor shape and the linearity of its cost according to an embodiment of the present invention. Referring to fig. 4, taking the product of the dimension sizes as the size of the whole tensor shape, the size of the tensor shapes stays unchanged or decreases in the right-to-left direction; the closer the cost linearity is to 1, the closer the cost of the tensor shape is to a linear relationship with the tensor shape before reduction. As can be seen from fig. 4, when the size of the tensor shape is reduced to a certain degree, the linearity of the corresponding cost deviates from 1. In fig. 4, [12_56_64_256_1] can be used as the boundary tensor shape. Taking the input scale of the boundary tensor shape as a threshold, the cost increases linearly when the size of the tensor shape increases; when the size of the tensor shape decreases, the cost no longer behaves linearly because of characteristics such as hardware pipelining.

S207, judging whether the reduction processing in each dimension direction has been completed for the primary reduced tensor shape; if yes, executing S208; if not, returning to S204.

The primary reduced tensor shape having completed the reduction processing in each dimension direction means that: all boundary tensor shapes have been found for the primary reduced tensor shape; or, the number of boundary tensor shapes found for the primary reduced tensor shape has reached a number threshold.

Exemplarily, the dimension selected when the initial tensor shape is reduced for the first time is taken as the current dimension. Subsequent reduction processes for the primary reduced tensor shape include, but are not limited to, the following. First, the dimension size is continuously reduced in the current dimension until the linear condition is no longer met, yielding one boundary tensor shape. Second, the dimension size is continuously reduced in any single dimension other than the current dimension until the linear condition is no longer met, yielding one boundary tensor shape per dimension. Third, the dimension size is reduced in different dimensions in turn until the linear condition is no longer met, yielding further boundary tensor shapes different from those of the first two cases.

S208, judging whether the reduction processing of each dimension direction is finished aiming at the initial tensor shape; if yes, executing S209; if not, the process returns to the step S202.

The initial tensor shape having completed the reduction processing in each dimension direction means that: the initial tensor shape has undergone the first reduction in each dimension to obtain the respective primary reduced tensor shapes, and each of these has completed the reduction processing in each dimension direction. If so, every boundary tensor shape corresponding to the initial tensor shape has been added to the boundary database, and another unreduced initial tensor shape in the initial database can be selected for reduction processing.

S209, judging whether all the initial tensor shapes in the initial database complete reduction processing; if yes, executing S210; if not, the process returns to the step S201.

S210, ending the reduction to obtain a boundary database.

S211, judging whether every dimension of the selected initial tensor shape has undergone one dimension size reduction; if yes, executing S212; if not, returning to S202.

After S203 determines that the size ratio and the cost ratio between the initial tensor shape and the primary reduced tensor shape are unequal, the judgment of S211 is performed first. If none of the primary reduced tensor shapes obtained by reducing each dimension of the initial tensor shape once has a linear relationship with the initial tensor shape, the initial tensor shape after reduction cannot cover the hardware running time; that is, no corresponding boundary tensor shape exists for the current initial tensor shape, and the initial tensor shape can be directly discarded.

It should be noted that, when no dimension of the initial tensor shape yields a linear relationship after one dimension size reduction, it may be that the initial tensor shape itself cannot cover the hardware pipeline time, or that the initial tensor shape itself is exactly a boundary tensor shape. Treating every such initial tensor shape as having no corresponding boundary tensor shape and discarding it directly simplifies the judgment steps without affecting the richness of the reference database. This is because the reference database already contains every initial tensor shape through the initial database. For an initial tensor shape that lies exactly on the linear boundary, discarding it without judging whether it is a linear boundary only reduces the number of tensor shapes in the boundary database; it does not change the set of reference tensor shapes contained in the reference database, and does not affect the selection of a segmentation mode for an unknown tensor shape. In addition, not placing such an initial tensor shape in the boundary database avoids duplicate tensor shapes in the reference database and simplifies the subsequent tensor shape matching process.

S212, discarding the initial tensor shape.

In the embodiment of the present invention, S200 to S212 give a process of constructing the boundary database based on the initial tensor shapes. By reducing the dimension size of each initial tensor shape in each dimension direction and finding the set of minimum shapes (boundary tensor shapes) whose cost remains linear, a series of boundary tensor shapes and their costs can be generated for each initial tensor shape in the initial database, which greatly increases the number of reference tensor shapes.
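One way to picture the overall S200 to S212 flow is a traversal that tries a halving in every dimension of every shape reached through linear reductions. The sketch below is an interpretation under stated assumptions, not the claimed procedure: `toy_cost` is a hypothetical cost model, halving is the assumed reduction step, and the optional number threshold of S207 is omitted. In line with the note on S211, the initial tensor shape itself is never placed in the boundary set.

```python
def toy_cost(shape, overhead=64.0):
    """Hypothetical cost model: element count, but never below a fixed overhead."""
    elements = 1
    for size in shape:
        elements *= size
    return max(float(elements), overhead)


def build_boundary_database(initial_shapes, cost_fn, rel_tol=1e-6):
    """Collect boundary shapes reachable from each initial shape by halving."""
    boundary = set()
    for initial in initial_shapes:
        initial = tuple(initial)
        stack, seen = [initial], {initial}
        while stack:
            shape = stack.pop()
            for dim, size in enumerate(shape):
                if size % 2:            # cannot be halved evenly, skip this dimension
                    continue
                reduced = shape[:dim] + (size // 2,) + shape[dim + 1:]
                cost_ratio = cost_fn(reduced) / cost_fn(shape)
                if abs(cost_ratio - 0.5) <= rel_tol:
                    # Still linear: keep reducing from the smaller shape.
                    if reduced not in seen:
                        seen.add(reduced)
                        stack.append(reduced)
                elif shape != initial:
                    # Linearity broke: the shape before this reduction is a boundary.
                    boundary.add(shape)
    return boundary
```

With the toy model, the single initial shape `(4, 64)` yields the three boundary shapes `(4, 16)`, `(2, 32)` and `(1, 64)`, which illustrates how one measured shape can contribute several reference shapes.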

To verify the applicability of the above reference database, the inventors measured and predicted the cost of a number of tensor shapes; the correspondence between the actual cost linearity and the predicted cost linearity of each tensor shape is shown in fig. 5. Illustratively, in fig. 5, the solid line represents the actual cost linearity and the dashed line represents the predicted cost linearity. As can be seen from fig. 5, in the linear interval the predicted curve substantially coincides with the actual curve, while in the nonlinear interval the predicted curve deviates more from the actual curve. This indicates that the boundary tensor shapes in the embodiment are reasonably selected, and that cost predictions for other tensor shapes larger than a linear boundary tensor shape are more reliable. For unknown tensor shapes, the prediction accuracy in the linear interval can reach 90% to 95% or more. Therefore, it is reliable to use the boundary tensor shapes as reference tensor shapes for tensor shape matching of the input data stream.

The embodiment of the invention also provides a data stream segmentation method which is realized based on the reference databases, namely the initial database and the boundary database, provided by the above embodiments. The method is suitable for the processing requirement of an AI chip containing any storage hierarchical structure on the data stream during calculation, and can be executed by a data stream segmentation device which can be realized in a hardware and/or software mode and can be configured in electronic equipment. Fig. 6 is a schematic flow chart of a data stream segmentation method according to an embodiment of the present invention. Referring to fig. 6, the data stream segmentation method includes:

S310, acquiring the current tensor shape of the input data stream.

S320, if a target tensor shape matched with the current tensor shape exists in the initial database and/or the boundary database, segmenting the input data stream by adopting the segmentation mode of the target tensor shape.

The initial tensor shapes in the initial database are tensor shapes which are obtained through actual operation, are low in cost and high in performance, and whether the tensor shapes in the initial database are matched with the current tensor shapes or not can be preferentially judged in actual application.

The initial database and the boundary database jointly form a reference database. If a reference tensor shape in the reference database and the current tensor shape meet either of the following matching conditions, the reference tensor shape can be regarded as a target tensor shape matched with the current tensor shape. Specifically, the matching conditions include a first matching condition: the target tensor shape is the same as the current tensor shape; or a second matching condition: the number of dimensions of the target tensor shape is the same as that of the current tensor shape, and the dimension size of each dimension of the current tensor shape is an integer multiple of the dimension size of the corresponding dimension of the target tensor shape. Compared with the second matching condition, a target tensor shape meeting the first matching condition is more similar to the current tensor shape, which better ensures that the segmentation mode suits the current tensor shape. Therefore, a tensor shape satisfying the first matching condition can be preferentially selected as the target tensor shape.
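The two matching conditions can be expressed compactly as below. The function name `match_kind` and the labels `"exact"` and `"multiple"` are illustrative choices made here, and shapes are assumed to be tuples of positive integers.

```python
def match_kind(reference, current):
    """Classify how a reference tensor shape matches the current tensor shape.

    Returns "exact" for the first matching condition, "multiple" for the second,
    and None when neither condition holds.
    """
    reference, current = tuple(reference), tuple(current)
    if len(reference) != len(current):
        return None                      # both conditions need equal dimension counts
    if reference == current:
        return "exact"                   # first matching condition
    if all(c % r == 0 for r, c in zip(reference, current)):
        return "multiple"                # second matching condition
    return None
```

Checking the first condition before the second mirrors the stated preference for an identical shape over an integer-multiple shape.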

According to the data stream segmentation method provided by the embodiment of the invention, a segmentation mode that is better for the input data stream can be found quickly and accurately by matching against the initial database and the boundary database, and the computing power of the chip can be fully exerted. This solves the problems in the prior art that an efficient segmentation mode for each input data stream cannot be identified accurately and effectively, and that the result depends heavily on the experience of developers.

The following describes the process of finding the shape of the target tensor in detail with reference to fig. 7, but the present invention is not limited thereto. Fig. 7 is a schematic flowchart of finding the shape of the target tensor according to an embodiment of the present invention. Referring to fig. 7, in one embodiment, optionally, finding the target tensor shape comprises:

S410, acquiring the current tensor shape of the input data stream.

S420, judging whether a first target tensor shape meeting a first matching condition exists in the initial database; if yes, go to S460; if not, go to S430.

First, a first matching condition is judged according to an initial database, and if the first target tensor shape is found, the first target tensor shape can be directly used as the target tensor shape. The shape of the current tensor of the input data stream is the same as that of the first target tensor, the segmentation can be directly carried out in a segmentation mode of the shape of the first target tensor, and the cost is clear and known and is low. Therefore, the input data stream can be ensured to have low cost and excellent performance to the greatest extent when the segmentation mode is adopted.

S430, judging whether a second target tensor shape meeting the first matching condition exists in the boundary database; if yes, go to S460; if not, S440 is performed.

If the second target tensor shape is found under the condition that the first target tensor shape is not found, the second target tensor shape can be directly used as the target tensor shape. The current tensor shape of the input data stream is the same as the second target tensor shape, the segmentation can be directly performed in the segmentation mode of the second target tensor shape, and the cost is also definitely known. Therefore, the input data stream can be ensured to have low cost and excellent performance when the segmentation mode is adopted.

S440, judging whether a third target tensor shape meeting a second matching condition exists in the initial database; if yes, go to S460; if not, go to S450.

However, since the current tensor shapes of input data streams are varied, even if the initial database and the boundary database already contain many tensor shapes, the probability that the current tensor shape is exactly the same as a reference tensor shape is in practice not high. Searching with the second matching condition is therefore equivalent to widening the search condition. Although part of the running performance may be sacrificed, as long as the condition "current tensor shape % reference tensor shape = 0" is satisfied in every dimension, the cost of the current tensor shape can still be accurately predicted. Therefore, compared with the prior art, which searches for a segmentation mode by relying on developer experience or on complex modeling, a segmentation mode that is better for the input data stream can still be found efficiently and accurately.

In the case where the first target tensor shape and the second target tensor shape are not found, if the third target tensor shape is found, the third target tensor shape may be directly used as the target tensor shape. The current tensor shape size of the input data stream is integral multiple of the shape size of the third target tensor, the segmentation mode of the shape of the third target tensor can be endowed to the current tensor shape, and the segmentation times can be increased in the dimension of the current tensor shape with the multiple relation of two times or more. This is equivalent to multiplexing the data stream of smaller shapes (target tensor shapes) to guarantee the performance of larger shapes (current tensor shapes).

S450, judging whether a fourth target tensor shape meeting a second matching condition exists in the boundary database or not; if yes, go to S460; if not, go to S470.

If the fourth target tensor shape is found under the condition that none of the three types of target tensor shapes is found, the fourth target tensor shape can be directly used as the target tensor shape. The current tensor shape size of the input data stream is an integer multiple of the fourth target tensor shape size, and the current tensor shape may be assigned a slicing mode of the fourth target tensor shape.

S460, determining the target tensor shape.

S470, judging that the target tensor shape does not exist.

Through S410 to S470, the embodiment of the invention provides the steps of searching for the target tensor shape: the target tensor shape is searched for in the initial database and the boundary database in descending order of matching degree and of the reliability of the performance after segmentation, so as to find, as far as possible, the optimal segmentation mode for the current tensor shape.
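The four-stage lookup of S420 to S450 can be sketched as an ordered list of (database, condition) pairs that is scanned until one stage yields a match. The names `find_target_shapes` and `_is_multiple`, the stage labels, and the representation of each database as a plain list of shape tuples are all assumptions made for illustration.

```python
def _is_multiple(reference, current):
    """Second matching condition: same rank, every size an integer multiple."""
    return len(reference) == len(current) and all(
        c % r == 0 for r, c in zip(reference, current)
    )


def find_target_shapes(current, initial_db, boundary_db):
    """Return (stage label, matching shapes) for the first stage that matches, else None."""
    current = tuple(current)
    stages = [
        ("initial/exact", initial_db, lambda ref: tuple(ref) == current),          # S420
        ("boundary/exact", boundary_db, lambda ref: tuple(ref) == current),        # S430
        ("initial/multiple", initial_db, lambda ref: _is_multiple(ref, current)),  # S440
        ("boundary/multiple", boundary_db, lambda ref: _is_multiple(ref, current)),  # S450
    ]
    for label, database, matches in stages:
        found = [tuple(ref) for ref in database if matches(ref)]
        if found:
            return label, found          # S460: target tensor shape determined
    return None                          # S470: no target tensor shape exists
```

A stage may return more than one candidate; choosing among them is a separate step and is left to the caller in this sketch.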

On the basis of the foregoing embodiments, optionally, if the shape of the target tensor is not unique, estimating, according to the cost of the shape of the target tensor, an estimated cost generated when the input data stream is segmented by using a segmentation method of the shape of the target tensor; and a segmentation mode of the shape of the target tensor with the lowest estimated cost is selected to segment the input data stream, so that the performance of the selected segmentation method is ensured to be optimal as far as possible.
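A possible reading of this selection step is sketched below. The text does not state how the estimated cost is derived from the cost of the target tensor shape; the sketch assumes, based on the linear cost relation, that it scales with the product of the per-dimension multiples. The names `estimated_cost` and `pick_cheapest`, and the candidate mapping from shape to `(split_mode, cost)`, are hypothetical.

```python
from math import prod


def estimated_cost(reference, reference_cost, current):
    """Assumed linear scaling: reference cost times the product of per-dimension multiples."""
    multiples = [c // r for r, c in zip(reference, current)]
    return reference_cost * prod(multiples)


def pick_cheapest(current, candidates):
    """candidates maps a matched reference shape to (split_mode, measured_cost)."""
    best_shape = min(
        candidates,
        key=lambda ref: estimated_cost(ref, candidates[ref][1], current),
    )
    split_mode, cost = candidates[best_shape]
    return best_shape, split_mode, estimated_cost(best_shape, cost, current)
```

For a current shape `(8, 16)` with candidates `(4, 16)` at cost 10 and `(8, 8)` at cost 12, the estimates are 20 and 24, so the segmentation mode of `(4, 16)` would be selected.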

In addition to the above embodiments, optionally, because the tensor shape of a data stream, such as an initial data stream or an input data stream, can be chosen relatively freely, the AI chip may process the data stream at a given storage level using an auto-padding strategy, so that the size of each dimension of the padded data stream is an integer multiple of the size of the storage unit of that level. That is, the tensor shape of the padded data stream is a standard tensor shape whose size is an integer multiple of the size of the storage unit tensor shape (split shape), so that the data stream can be transported in a simple loop, which ensures the regularity and efficiency of data stream transport. Illustratively, the padding may be data padding of the original tensor shape, specifically padding the original tensor shape to a specified size, for example, padding an original tensor shape of the form M × K directly to a standard tensor shape of the form W × V, where W is greater than M and V is greater than K. Alternatively, the original tensor shape may be padded by a predetermined amount, for example, padding an original tensor shape of the form M × K to a standard tensor shape of the form (M + 1) × (K + 1). Illustratively, if the size of the storage unit tensor shape in a certain dimension is 5 and the size of the tensor shape to be processed in that dimension is 13, the auto-padding strategy can pad the size in that dimension to 15, so that the integer-multiple relationship is achieved with as little padding as possible.
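The minimal-padding case in the last example (13 padded to 15 for a storage unit size of 5) amounts to rounding each dimension up to the next multiple of the storage unit size. The helper names `pad_to_multiple` and `pad_shape` below are illustrative and do not come from the source.

```python
def pad_to_multiple(size, unit):
    """Round size up to the nearest multiple of unit using integer arithmetic."""
    return -(-size // unit) * unit


def pad_shape(shape, unit_shape):
    """Pad every dimension of shape up to a multiple of the matching storage unit size."""
    return tuple(pad_to_multiple(s, u) for s, u in zip(shape, unit_shape))
```

Here `pad_to_multiple(13, 5)` gives 15, and a dimension that is already a multiple of the unit size is left unchanged.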

Specifically, taking a data stream with NHWCiCo dimensions as an example, the padding strategy needs to guarantee that: N is a multiple of hN; H is a multiple of hH; W is a multiple of hW; Ci is a multiple of hCi; and Co is a multiple of hCo, where hN, hH, hW, hCi and hCo are the sizes of the storage unit, that is, of the split shape, in the respective dimensions. In addition, the AI chip needs to keep hardware behavior under auto-padding as consistent as possible so that the linear relationship of the tensor shapes remains reliable. Although the auto-padding strategy loses part of the computing power, it facilitates the transport of data streams; in practical application, the padding strategy can be adopted as appropriate by weighing computing power against transport regularity. For a specific tensor shape, a segmentation with as little auto-padding as possible can be sought, so that its performance is close to that of a segmentation without auto-padding.

If, neglecting hardware pipeline time, the cost of the tensor shape NHWCiCo is known to be a under the configuration {hN, hH, hW, hCi, hCo, …}, then the cost a' of the unknown tensor shape N'H'W'Ci'Co' can be calculated using the following equation:

where Ceil represents the operation that returns the smallest integer greater than or equal to the specified expression.
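The equation is not reproduced in this text. As a rough sketch only, and purely as an assumption consistent with the surrounding definitions (a linear cost relation, per-dimension storage unit sizes, and the Ceil operation), the cost could be taken to scale with the number of storage-unit tiles needed to cover the shape. The names `tile_count` and `scaled_cost` are hypothetical, and the actual equation of the embodiment may differ.

```python
from math import prod


def tile_count(shape, unit):
    """Number of storage-unit tiles covering shape: product of Ceil(size / unit size)."""
    return prod(-(-size // u) for size, u in zip(shape, unit))


def scaled_cost(known_shape, known_cost, unit, new_shape):
    """Assumed relation: cost is proportional to the tile count (pipeline time neglected)."""
    return known_cost * tile_count(new_shape, unit) / tile_count(known_shape, unit)
```

Under this assumption, a shape `(4, 8)` with cost 40 and unit sizes `(2, 4)` covers 4 tiles, so a shape `(5, 8)`, which covers 6 tiles after rounding up, would be estimated at cost 60.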

It should be noted that the data stream segmentation method provided by the embodiment of the present invention is applicable to storage hierarchies with various structures. To reduce bandwidth and to achieve a balance in computation and bandwidth, a multi-level storage structure may be utilized to provide data multiplexing for each level of storage and computation. Further, to facilitate data handling and data storage, the cut-off size between different storage levels may be in a multiple relationship.

The embodiment of the invention also provides a database generation device, which is used for executing the database generation method provided by any embodiment of the invention and has corresponding functional modules and beneficial effects of the execution method. Fig. 8 is a schematic structural diagram of a database generation apparatus according to an embodiment of the present invention, and referring to fig. 8, the database generation apparatus includes: an initial database generation module 110, a boundary tensor shape filtering module 120, and a boundary database generation module 130.

The initial database generation module 110 is configured to generate an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams, and segmentation modes and costs corresponding to the initial tensor shapes, wherein the costs corresponding to the initial tensor shapes are smaller than a tolerable cost threshold. The boundary tensor shape filtering module 120 is configured to perform dimension size reduction on the initial tensor shape in at least one dimension at least once, stop the reduction when the reduced tensor shape and its corresponding cost no longer satisfy the linear relationship, and record the tensor shape before the last reduction as the boundary tensor shape. The boundary database generation module 130 is configured to generate a boundary database based on the boundary tensor shapes obtained by performing dimension size reduction in each dimension; the boundary database comprises a plurality of boundary tensor shapes and segmentation modes and costs corresponding to the boundary tensor shapes. The initial database and the boundary database are used for carrying out tensor shape matching on the input data stream, and determining the segmentation mode of the input data stream according to the matching result.

On the basis of the foregoing embodiments, optionally, the boundary tensor shape filtering module 120 is specifically configured to: perform a single dimension size reduction in any dimension for any initial tensor shape; for the current dimension, calculate the size ratio of the dimension size after reduction to the dimension size before reduction; calculate the cost ratio of the cost corresponding to the reduced tensor shape to the cost corresponding to the tensor shape before reduction; judge whether the size ratio and the cost ratio are equal; if so, continue to reduce the dimension size of the reduced tensor shape until the size ratio and the cost ratio obtained after a reduction are not equal, and record the tensor shape before the last reduction as a boundary tensor shape; and if not, reduce the dimension size of the current initial tensor shape in another dimension to obtain a boundary tensor shape corresponding to the current initial tensor shape. If the size ratio and the cost ratio obtained after the dimension size reduction are unequal for every dimension of the current initial tensor shape, it is judged that the current initial tensor shape has no corresponding boundary tensor shape.

The embodiment of the invention also provides a data stream segmentation device, which is used for executing the data stream segmentation method provided by any embodiment of the invention and has corresponding functional modules and beneficial effects of the execution method. Fig. 9 is a schematic structural diagram of a data stream segmentation apparatus provided in an embodiment of the present invention, and referring to fig. 9, the data stream segmentation apparatus includes: tensor shape acquisition module 210 and slicing mode assignment module 220.

The tensor shape acquisition module 210 is configured to acquire a current tensor shape of the input data stream. The segmentation mode assigning module 220 is configured to segment the input data stream by using a segmentation mode of a target tensor shape when the target tensor shape matched with the current tensor shape exists in the initial database and/or the boundary database.

On the basis of the foregoing embodiments, optionally, the data stream segmentation apparatus further includes: the cost estimation module is used for estimating estimated cost generated when the input data stream is segmented by adopting a segmentation mode of the target tensor shape according to the cost of the target tensor shape when the target tensor shape is not unique; and selecting a segmentation mode of the shape of the target tensor with the lowest estimated cost to segment the input data stream.

On the basis of the foregoing embodiments, optionally, the matching of the target tensor shape with the current tensor shape includes: the first matching condition is: the shape of the target tensor is the same as the shape of the current tensor; or, the second matching condition: the number of dimensions of the target tensor shape is the same as the number of dimensions of the current tensor shape, and the dimension size of each dimension of the current tensor shape is integral multiple of the dimension size of the corresponding dimension of the target tensor shape.

On the basis of the foregoing embodiments, optionally, the segmentation mode assignment module 220 is specifically configured to: judging whether a first target tensor shape meeting a first matching condition exists in an initial database or not; if so, taking the first target tensor shape as a target tensor shape; if the first target tensor shape does not exist in the initial database, judging whether a second target tensor shape meeting a first matching condition exists in the boundary database or not; if so, taking the second target tensor shape as a target tensor shape; if the second target tensor shape does not exist in the boundary database, judging whether a third target tensor shape meeting a second matching condition exists in the initial database or not; if so, taking the third target tensor shape as a target tensor shape; if the third target tensor shape does not exist in the initial database, judging whether a fourth target tensor shape meeting a second matching condition exists in the boundary database or not; if so, the fourth target tensor shape is taken as the target tensor shape.

The embodiment of the invention also provides an electronic device. Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 10, the electronic device includes at least one processor 410 and a memory 420 communicatively coupled to the at least one processor 410. The memory 420 stores a computer program executable by the at least one processor 410; when the computer program is executed by the at least one processor 410, the at least one processor 410 can execute the database generation method provided by any embodiment of the present invention and/or the data stream segmentation method provided by any embodiment of the present invention. In addition, the electronic device may further include an input device 430 and an output device 440. The processor 410, the memory 420, the input device 430 and the output device 440 in the electronic device may be connected by a bus or by other means; fig. 10 takes the bus connection as an example.

The memory 420 serves as a computer-readable storage medium for storing software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the database generation method and/or the data stream segmentation method in the embodiment of the present invention (for example, the initial database generation module 110, the boundary tensor shape filtering module 120, and the boundary database generation module 130 in the database generation apparatus, and/or the tensor shape acquisition module 210 and the segmentation mode assignment module 220 in the data stream segmentation apparatus). The processor 410 executes various functional applications and data processing of the electronic device by running the software programs, instructions and modules stored in the memory 420, that is, implements the database generation method and/or the data stream segmentation method described above.

The memory 420 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 420 may further include memory located remotely from the processor 410, which may be connected to the electronic device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 430 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the electronic device. The output device 440 may include a display device such as a display screen.

Embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the database generation method and/or the data stream segmentation method.

Of course, in the storage medium containing computer-executable instructions provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the database generation method and/or the data stream segmentation method provided by any embodiment of the present invention.

From the above description of the embodiments, it will be clear to those skilled in the art that the present invention can be implemented by software plus the necessary general-purpose hardware, and can of course also be implemented by hardware alone, although the former is the better implementation in many cases. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, may be embodied in the form of a software product. The software product can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.

It should be noted that, in the embodiments of the database generation apparatus and the data stream segmentation apparatus, the units and modules are divided only according to functional logic; the division is not limited to the one described above, as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for the convenience of distinguishing them from each other, and are not used to limit the protection scope of the present invention.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.

The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A database generation method, comprising:

generating an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams, and segmentation modes and costs corresponding to the initial tensor shapes, wherein the costs corresponding to the initial tensor shapes are smaller than a tolerable cost threshold;

generating a boundary database based on the shape of the boundary tensor obtained by dimension size reduction on each dimension; the boundary database comprises a plurality of boundary tensor shapes, and segmentation modes and costs corresponding to the boundary tensor shapes;

2. The database generation method according to claim 1, wherein the obtaining of the boundary tensor shape includes:

performing one-dimensional size reduction in any dimension for any initial tensor shape;

judging whether the size ratio and the cost ratio are equal;

3. The database generation method of claim 2, wherein the continued dimension reduction is based on the current dimension or other dimensions.

4. The database generation method according to claim 2, wherein if the size ratio is not equal to the cost ratio, dimension reduction is performed on the current initial tensor shape in other dimensions to obtain a boundary tensor shape corresponding to the current initial tensor shape.

5. The database generation method according to claim 4, wherein if the size ratio and the cost ratio obtained by respectively performing one-time dimension reduction on each dimension on the current initial tensor shape are not equal, it is determined that the current initial tensor shape does not have a corresponding boundary tensor shape.
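The boundary search described in claims 2 to 5 can be illustrated with a minimal Python sketch. It is not the claimed implementation: halving a dimension as the "one-time reduction", comparing each reduced shape against the shape immediately before it, the tolerance used for "equal", and the names `shrink`, `is_linear`, `find_boundary_shape` and `toy_cost` are all assumptions made for illustration. The cost function is supplied by the caller and must return a non-zero value.

```python
import math
from typing import Callable, Optional, Tuple

Shape = Tuple[int, ...]


def shrink(shape: Shape, dim: int) -> Optional[Shape]:
    """Reduce one dimension once. Halving is an assumption, not from the source."""
    if shape[dim] < 2:
        return None  # this dimension cannot be reduced any further
    return shape[:dim] + (shape[dim] // 2,) + shape[dim + 1:]


def is_linear(before: Shape, after: Shape,
              cost_of: Callable[[Shape], float], rel_tol: float = 1e-6) -> bool:
    """True when the size ratio and the cost ratio are (approximately) equal."""
    size_ratio = math.prod(after) / math.prod(before)
    cost_ratio = cost_of(after) / cost_of(before)
    return math.isclose(size_ratio, cost_ratio, rel_tol=rel_tol)


def find_boundary_shape(initial: Shape,
                        cost_of: Callable[[Shape], float]) -> Optional[Shape]:
    """Keep reducing while some dimension preserves the linear relation.

    Returns the last shape that still satisfied the relation, or None when
    the very first reduction fails in every dimension (the case of claim 5).
    """
    current, progressed = initial, False
    while True:
        for dim in range(len(current)):
            candidate = shrink(current, dim)
            if candidate is not None and is_linear(current, candidate, cost_of):
                current, progressed = candidate, True
                break  # accepted; start the next round from the reduced shape
        else:
            # No dimension could be reduced while keeping the relation.
            return current if progressed else None


def toy_cost(shape: Shape) -> float:
    """Hypothetical cost model: linear in size, with a fixed floor of 64."""
    return float(max(math.prod(shape), 64))


print(find_boundary_shape((16, 16), toy_cost))  # (4, 16)
print(find_boundary_shape((4, 4), toy_cost))    # None
```

With the toy cost model, the shape (16, 16) can be halved twice along the first dimension before the cost stops scaling with size, so (4, 16) is recorded; for (4, 4) every first reduction already breaks the relation, so no boundary shape is produced.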

6. A database generation apparatus, comprising:

the initial database generation module is used for generating an initial database; the initial database comprises initial tensor shapes of a plurality of initial data streams, and a segmentation mode and a cost corresponding to the initial tensor shapes, wherein the cost corresponding to the initial tensor shapes is less than a tolerable cost threshold;

the boundary tensor shape screening module is used for reducing the dimension of the initial tensor shape in at least one dimension for at least one time, stopping reducing the initial tensor shape until the reduced tensor shape and the corresponding cost of the initial tensor shape do not meet the linear relation, and recording the tensor shape before the final reduction as the boundary tensor shape;

the initial database and the boundary database are used for tensor shape matching of the input data stream, and the segmentation mode of the input data stream is determined according to the matching result.

7. A method for segmenting data streams, which is implemented based on the initial database and the boundary database of any one of claims 1 to 5, and comprises the following steps:

acquiring a current tensor shape of an input data stream;

and if the initial database and/or the boundary database has a target tensor shape matched with the current tensor shape, segmenting the input data stream by adopting a segmentation mode of the target tensor shape.

8. The method according to claim 7, wherein if the target tensor shape is not unique, an estimated cost of segmenting the input data stream with the segmentation mode of each target tensor shape is estimated according to the cost of that target tensor shape; and the segmentation mode of the target tensor shape with the lowest estimated cost is selected to segment the input data stream.
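The selection among several matched target tensor shapes can be sketched as follows. This is a hedged illustration only: the `Entry` record, the function names, the mode labels, and in particular the rule of scaling the stored cost by the ratio of element counts are assumptions. The claim states only that the estimate is derived from the cost of the target tensor shape.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Optional, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class Entry:
    """One database record: tensor shape, segmentation mode, measured cost."""
    shape: Shape
    mode: str      # hypothetical label for a segmentation mode
    cost: float


def estimate_cost(entry: Entry, current_shape: Shape) -> float:
    """Assumed estimate: scale the stored cost by the size ratio."""
    return entry.cost * prod(current_shape) / prod(entry.shape)


def pick_segmentation(candidates: List[Entry],
                      current_shape: Shape) -> Optional[Entry]:
    """Return the matched entry with the lowest estimated cost, if any."""
    if not candidates:
        return None
    return min(candidates, key=lambda e: estimate_cost(e, current_shape))


# Two entries assumed to have already matched the input shape (8, 16).
matched = [
    Entry(shape=(4, 16), mode="split_rows", cost=64.0),
    Entry(shape=(8, 8), mode="split_cols", cost=48.0),
]
best = pick_segmentation(matched, (8, 16))
print(best.mode)  # split_cols
```

Here the estimates are 128.0 for `split_rows` and 96.0 for `split_cols`, so the second mode is chosen.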

9. The method according to claim 7, wherein the matching the target tensor shape with the current tensor shape comprises:

the first matching condition is: the target tensor shape is the same as the current tensor shape;

or,

10. The method according to claim 9, wherein selecting the target tensor shape comprises:

if the third target tensor shape does not exist in the initial database, judging whether a fourth target tensor shape meeting the second matching condition exists in the boundary database; and if so, taking the fourth target tensor shape as the target tensor shape.
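The fallback order in this step, consulting the initial database first and the boundary database only if nothing matches, can be sketched as below. The matching predicate is passed in by the caller because the second matching condition is not reproduced in this text; the `divides` predicate in the example is a placeholder and not the claimed condition. All names and mode labels are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Shape = Tuple[int, ...]
Database = Dict[Shape, str]  # tensor shape -> segmentation mode label


def lookup(current: Shape, initial_db: Database, boundary_db: Database,
           matches: Callable[[Shape, Shape], bool]) -> Optional[Tuple[str, Shape]]:
    """Search the initial database first, then fall back to the boundary one.

    Returns (database name, matched shape), or None when neither database
    holds a shape accepted by the caller-supplied predicate.
    """
    for name, db in (("initial", initial_db), ("boundary", boundary_db)):
        for shape in db:
            if matches(shape, current):
                return name, shape
    return None


def divides(target: Shape, current: Shape) -> bool:
    """Placeholder predicate (an assumption): same rank, each dim divides evenly."""
    return len(target) == len(current) and all(
        c % t == 0 for t, c in zip(target, current))


initial_db = {(6, 6): "mode_a"}
boundary_db = {(4, 16): "mode_b"}
print(lookup((8, 32), initial_db, boundary_db, divides))
```

In this example nothing in the initial database is accepted for (8, 32), so the entry (4, 16) from the boundary database is returned.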

11. A data stream segmentation apparatus, comprising:

12. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the database generation method of any of claims 1-5 and/or the data stream segmentation method of any of claims 7-10.

13. A computer-readable storage medium, having stored thereon computer instructions for causing a processor to execute the database generation method according to any one of claims 1-5 and/or the data stream segmentation method according to any one of claims 7-10.