WO2022266888A1 - Congestion prediction model training method, image processing method and apparatus - Google Patents

Congestion prediction model training method, image processing method and apparatus

Info

Publication number
WO2022266888A1
WO2022266888A1 · PCT/CN2021/101860 · CN2021101860W
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
congestion
training
layer
layers
Application number
PCT/CN2021/101860
Other languages
French (fr)
Chinese (zh)
Inventor
李栋 (Li Dong)
王超 (Wang Chao)
张锐 (Zhang Rui)
刘武龙 (Liu Wulong)
黄宇 (Huang Yu)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2021/101860 (WO2022266888A1)
Priority to CN202180099695.1A (CN117561515A)
Publication of WO2022266888A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 — Computer-aided design [CAD]
    • G06F30/30 — Circuit design
    • G06F30/39 — Circuit design at the physical level
    • G06F30/394 — Routing
    • G06F30/3947 — Routing global

Definitions

  • The present application relates to the technical field of Electronic Design Automation (EDA), and in particular to a congestion prediction model training method, an image processing method and a device.
  • EDA: Electronic Design Automation
  • Congestion Prediction
  • The goal of congestion prediction is to estimate, during global placement (Global Placement, GP), the routing congestion level of the chip from the current cell (Cell) placement positions, thereby providing an optimization basis for global placement: the placer (Placer) can spread the cells in heavily congested areas to reduce the local layout congestion and thus the overall congestion of the chip. In essence, congestion prediction estimates, for each grid (Grid) on the rasterized chip, the difference between the routing track demand (Routing Track Demand) and the given routing track capacity (Routing Track Capacity).
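  • The per-grid congestion described above is an elementwise difference. The following minimal sketch (demand and capacity values invented for a hypothetical 4×4 rasterized chip) illustrates the computation:

```python
import numpy as np

# Hypothetical 4x4 rasterized chip: per-grid routing track demand and capacity.
demand = np.array([[3, 5, 2, 1],
                   [6, 8, 4, 2],
                   [2, 3, 1, 0],
                   [1, 2, 0, 0]])
capacity = np.full((4, 4), 4)  # uniform routing track capacity per grid

# Congestion is the per-grid difference between demand and capacity;
# positive values mark overflowed (congested) grids the placer should relieve.
congestion = demand - capacity
overflow = np.clip(congestion, 0, None)

print(congestion)
print("total overflow:", overflow.sum())
```

Grids with positive congestion are the ones a placer would target by spreading cells.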
  • Existing congestion prediction methods have the following limitations: low prediction accuracy, long computation time, and difficulty in achieving both accuracy and speed at once.
  • The embodiments of the present application disclose a congestion prediction model training method, an image processing method and a device, which can reduce the time consumed by congestion prediction while improving the accuracy of semiconductor chip congestion prediction.
  • The embodiment of the present application discloses a congestion prediction model training method. The method includes: dividing a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; determining M first feature maps corresponding to each prediction layer, where the M first feature maps respectively describe M chip features of the prediction layer and M is a positive integer; and adding the M first feature maps corresponding to each prediction layer in the K semiconductor chips to a data set and using the data set to train a congestion prediction model.
  • In the embodiments of the present application, the metal layers in each semiconductor chip can be divided into at least two prediction layers in different ways, so that some or all of the feature data of the different metal layers within the same prediction layer exhibit relatively strong correlation and consistency; each prediction layer contains at least one metal layer.
  • Each semiconductor chip is divided into at least two prediction layers. Since the feature data (i.e., feature maps) of the metal layers contained in the same prediction layer exhibit strong correlation and consistency, this avoids, on the one hand, the loss of accuracy that arises in the prior art when feature data of metal layers with different trends are mixed together for lack of layering. On the other hand, because the feature data of different prediction layers exhibit different trends, a congestion prediction model trained on the feature data of the different prediction layers can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model has refined identification and prediction capabilities. In summary, the training method of the embodiments of the present application can effectively improve the prediction accuracy and generalization ability of the model.
  • The above method further includes: performing global routing on the K semiconductor chips, obtaining a real congestion map corresponding to each prediction layer from the K semiconductor chips after global routing, and adding the real congestion maps corresponding to the prediction layers of the K semiconductor chips to the data set.
  • The real congestion map of each prediction layer is computed from the K semiconductor chips after global routing, so that it can later be compared with the predicted congestion map of the prediction layer to adjust the parameters of the congestion prediction model and thereby obtain an optimal congestion prediction model.
  • The above-mentioned division of the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules across them.
  • In the embodiments of the present application, the multiple metal layers in each semiconductor chip can be divided into at least two prediction layers according to the manufacturing process of each metal layer and/or the distribution of functional modules on each metal layer.
  • For example, the metal layers in each semiconductor chip can be divided into two prediction layers according to whether they contain macro cells: one prediction layer contains both macro cells and standard cells, while the other contains only standard cells. Alternatively, the plurality of metal layers can be divided into at least two prediction layers according to the differences in the amount of routing resources of each metal layer, such that the metal layers within each prediction layer have comparable amounts of routing resources.
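  • As a loose illustration of the macro-cell criterion, the metal layers of a chip might be partitioned as follows (the layer names, attributes and values are invented for this sketch, not taken from the application):

```python
# Hypothetical per-layer records; "has_macro" marks layers occupied by macro cells.
layers = [
    {"name": "M1", "has_macro": True,  "tracks": 100},
    {"name": "M2", "has_macro": True,  "tracks": 120},
    {"name": "M3", "has_macro": False, "tracks": 200},
    {"name": "M4", "has_macro": False, "tracks": 210},
]

# Prediction layer 1: metal layers containing macro cells (and standard cells).
macro_layer = [l["name"] for l in layers if l["has_macro"]]
# Prediction layer 2: metal layers containing only standard cells.
non_macro_layer = [l["name"] for l in layers if not l["has_macro"]]

print(macro_layer, non_macro_layer)
```

The same record structure could instead be grouped on the `tracks` field to realize the routing-resource criterion.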
  • The resulting model can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model has refined identification and prediction capabilities.
  • Determining the M first feature maps corresponding to each prediction layer includes: obtaining M second feature maps corresponding to each metal layer in the prediction layer, where the M second feature maps respectively describe M chip features of that metal layer; and generating the M first feature maps of the prediction layer from the M second feature maps of each of its metal layers, where the first feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • Obtaining the first feature map describing a given chip feature from the second feature maps describing that feature in each metal layer may include: computing, for corresponding pixels of the second feature maps describing the same chip feature in each metal layer, a weighted average, maximum value, minimum value or similar statistic, and taking the result as the pixel value of the corresponding pixel in the first feature map describing that chip feature.
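  • A minimal sketch of this per-pixel aggregation, assuming the second feature maps of one chip feature (e.g. pin density) are stacked into a numpy array; the map values and weights are purely illustrative:

```python
import numpy as np

# Second feature maps for one chip feature on the three metal layers of one
# prediction layer, shape (num_metal_layers, H, W).
layer_maps = np.array([
    [[0.2, 0.4], [0.6, 0.8]],   # metal layer 1
    [[0.4, 0.4], [0.2, 1.0]],   # metal layer 2
    [[0.0, 0.1], [0.1, 0.3]],   # metal layer 3
])

# First feature map of the prediction layer: per-pixel weighted average
# across the metal layers (illustrative weights); max is an alternative.
weights = np.array([0.5, 0.3, 0.2])
first_avg = np.tensordot(weights, layer_maps, axes=1)
first_max = layer_maps.max(axis=0)

print(first_avg)
print(first_max)
```

Either `first_avg` or `first_max` could serve as the first feature map, depending on which statistic best preserves the feature's trend across the layers.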
  • Each metal layer corresponds to M second feature maps, which respectively describe M chip features of that metal layer.
  • The first feature map describing a given chip feature, obtained by weighted averaging, taking the maximum value or similar methods, can accurately represent that chip feature across the metal layers; that is, the first feature map has good correlation and consistency with the second feature maps describing the same chip feature in each metal layer. This avoids the situation where, because the metal layers in the same prediction layer differ greatly from one another, the resulting first feature map would differ markedly from the second feature maps describing the same chip feature in each metal layer.
  • Through the above layering method and the above way of determining the first feature maps, first feature maps that accurately reflect the chip features of the metal layers in each prediction layer can be obtained. Therefore, after the congestion prediction model is trained with the first feature maps corresponding to the different prediction layers, the resulting model can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model trained according to the embodiments of the present application has refined identification and prediction capabilities.
  • The real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the prediction layer, and the first vertical real congestion map is obtained from the second vertical real congestion maps of those metal layers.
  • Obtaining the first horizontal real congestion map of a prediction layer from the second horizontal real congestion maps of its metal layers may include: computing the mean, a weighted average, the maximum value or a similar statistic over the second horizontal real congestion maps of the metal layers. Those skilled in the art may also derive the first horizontal real congestion map from the second horizontal real congestion maps in other ways, which is not limited in this application. The first vertical real congestion map of a prediction layer is obtained in the same way as the first horizontal real congestion map, which is not repeated here.
  • Since the real congestion maps of the metal layers within a prediction layer have strong correlation and consistency, the real congestion map of the prediction layer obtained from them is well consistent with the real congestion map of each of its metal layers; that is, it accurately reflects the congestion level of the prediction layer. This ensures the prediction accuracy of a congestion prediction model trained with the real congestion maps of the prediction layers.
  • Adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and using the data set to train the congestion prediction model includes: iteratively training the congestion prediction model with the data set, where each training iteration includes inputting the M first feature maps corresponding to any prediction layer in the data set into the congestion prediction model to obtain the predicted congestion map of that prediction layer, and updating the congestion prediction model based on the predicted congestion map and the real congestion map of that prediction layer.
  • In the embodiments of the present application, the congestion prediction model is obtained through multiple rounds of training. In each training iteration, the model input is the feature data of one prediction layer (its M first feature maps). Since the feature data of each prediction layer accurately reflect the corresponding features of its metal layers, and the feature data of different prediction layers differ considerably, the prediction model can perform refined congestion prediction on the feature data of the different prediction layers and output predicted congestion maps that accurately reflect the congestion level of each prediction layer. The model parameters are then updated from the predicted and real congestion maps of each prediction layer, so that the trained model has high prediction accuracy and strong generalization ability.
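  • The iteration described above can be sketched with a toy stand-in for the model: a per-pixel linear combination of the M first feature maps, fitted by gradient descent on synthetic data. A real implementation would use a neural network (e.g. a CNN); all shapes, weights and data here are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "congestion prediction model": predicted map = sum_m w[m] * feature_map[m].
M, H, W = 3, 8, 8
true_w = np.array([0.7, -0.2, 0.5])   # hidden weights used to fabricate data

# Synthetic data set: (M first feature maps, real congestion map) per layer.
dataset = []
for _ in range(16):
    feats = rng.normal(size=(M, H, W))
    real = np.tensordot(true_w, feats, axes=1)   # stand-in real congestion map
    dataset.append((feats, real))

w = np.zeros(M)
lr = 0.05
for epoch in range(200):                          # iterative training
    for feats, real in dataset:
        pred = np.tensordot(w, feats, axes=1)     # predicted congestion map
        err = pred - real
        # Gradient of the mean squared error w.r.t. w, averaged over pixels.
        grad = np.tensordot(feats, err, axes=([1, 2], [0, 1])) / (H * W)
        w -= lr * grad                            # update from pred vs. real map

print(np.round(w, 3))
```

The fitted weights recover the hidden ones, mirroring how per-iteration comparison of predicted and real congestion maps drives the parameter update.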
  • The prediction layers contained in each of the above semiconductor chips are a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. Adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and using the data set to train the congestion prediction model includes: training the first congestion prediction model with the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and training the second congestion prediction model with the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • In the embodiments of the present application, the first congestion prediction model can be trained with the feature data corresponding to the macro-unit layer and the second congestion prediction model with the feature data corresponding to the non-macro-unit layer, yielding one model that predicts accurately from macro-unit-layer features and another that predicts accurately from non-macro-unit-layer features, i.e., improving the prediction accuracy of the models. In addition, using the two models to make predictions for the two prediction layers can improve the speed of congestion prediction for the same semiconductor chip.
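  • The two-model arrangement can be pictured as a simple dispatch; the model functions below are placeholders (in practice each would be a separately trained network), and the layer-kind labels are invented for the sketch:

```python
# Placeholder stand-ins for the two trained congestion prediction models.
def first_model(feature_maps):       # trained on macro-unit-layer data
    return {"model": "first", "n_features": len(feature_maps)}

def second_model(feature_maps):      # trained on non-macro-unit-layer data
    return {"model": "second", "n_features": len(feature_maps)}

def predict(prediction_layer_kind, feature_maps):
    """Route a prediction layer's feature maps to the matching model."""
    model = first_model if prediction_layer_kind == "macro" else second_model
    return model(feature_maps)

print(predict("macro", ["pin_density", "net_density"]))
print(predict("non_macro", ["pin_density"]))
```

Because the two calls are independent, the macro and non-macro layers of one chip can also be predicted concurrently, which is where the speed benefit comes from.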
  • The above M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
  • The M chip features corresponding to each metal layer and prediction layer may also include chip features other than the four listed above, which is not limited in the present application.
  • The above M chip features may include one or more of pin density, net connection density, module mask, or amount of routing resources, which reflect the functions of the chip and the characteristics of the on-chip devices. The first feature maps of the chip features corresponding to each prediction layer are obtained, and the prediction model is trained on the M first feature maps that accurately reflect the chip's functions and on-chip device characteristics, thereby improving the prediction accuracy of the trained model.
  • The embodiment of the present application discloses an image processing method. The method includes: determining M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, where the chip to be predicted includes at least two prediction layers and M is a positive integer; and processing the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain the predicted congestion map of each prediction layer. The congestion prediction model is obtained by training on a data set that includes training data corresponding to the training prediction layers contained in each of a plurality of training semiconductor chips; the training data of each training prediction layer includes M first training feature maps, which respectively describe M chip features of the training prediction layer, and a real congestion map, which describes the real congestion level of the training prediction layer. Each training semiconductor chip includes at least two training prediction layers, and each training prediction layer includes at least one metal layer.
  • Since the model obtained through training has better prediction accuracy and generalization ability, a more accurate predicted congestion map can be obtained for each prediction layer of the semiconductor chip to be predicted; that is, the accuracy of the predicted congestion maps is improved, which makes it convenient to use the accurate per-layer congestion predictions during chip production to optimize the chip layout accordingly.
  • The above method further includes: aggregating the predicted congestion maps corresponding to all the prediction layers of the semiconductor chip to be predicted to obtain the predicted congestion map of the chip to be predicted.
  • Since the predicted congestion map of each prediction layer has high accuracy, the predicted congestion map of the chip to be predicted, obtained by aggregating the per-layer predicted congestion maps, also describes the congestion level of the chip with high accuracy, which makes it convenient to use it during chip production to optimize the chip layout.
  • The predicted congestion map of each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map. Aggregating the predicted congestion maps of the prediction layers to obtain the predicted congestion map of the semiconductor chip to be predicted includes: using a hierarchical aggregation operator to aggregate the vertical predicted congestion maps of the prediction layers into a reference vertical predicted congestion map, using the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps of the prediction layers into a reference horizontal predicted congestion map, and using a directional aggregation operator to aggregate the reference vertical and reference horizontal predicted congestion maps into the predicted congestion map of the chip; or, using the directional aggregation operator to aggregate the vertical and horizontal predicted congestion maps of each prediction layer into a reference predicted congestion map for that layer, and then using the hierarchical aggregation operator to aggregate the per-layer reference predicted congestion maps into the predicted congestion map of the chip.
  • In the embodiments of the present application, the hierarchical aggregation operator and the directional aggregation operator can be used to aggregate the predicted congestion maps of the prediction layers into the predicted congestion map of the chip to be predicted.
  • The operations performed by the hierarchical aggregation operator and the directional aggregation operator include, but are not limited to, taking a weighted average, maximum value or minimum value over the predicted congestion maps.
  • Since the predicted congestion map of each prediction layer obtained from the prediction model has high accuracy, the predicted congestion map of the chip to be predicted, obtained after applying the hierarchical and directional aggregation operators whose specific operations are determined by the concrete scenario, also has high accuracy.
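  • As a sketch, taking the maximum as both the hierarchical and the directional aggregation operator (the map values are invented), the two aggregation orders described above can be written as:

```python
import numpy as np

# Hypothetical predicted congestion maps: two prediction layers, each with a
# horizontal and a vertical map of shape (H, W) = (2, 2).
h_maps = np.array([[[0.1, 0.9], [0.3, 0.2]],    # layer 1, horizontal
                   [[0.4, 0.2], [0.8, 0.1]]])   # layer 2, horizontal
v_maps = np.array([[[0.5, 0.1], [0.2, 0.6]],    # layer 1, vertical
                   [[0.0, 0.3], [0.7, 0.2]]])   # layer 2, vertical

layer_agg = lambda maps: maps.max(axis=0)   # hierarchical aggregation (max)
dir_agg = np.maximum                        # directional aggregation (max)

# Order 1: aggregate across layers first, then across directions.
chip_map_1 = dir_agg(layer_agg(h_maps), layer_agg(v_maps))

# Order 2: aggregate across directions per layer first, then across layers.
chip_map_2 = layer_agg(np.maximum(h_maps, v_maps))

print(chip_map_1)
```

With the maximum as both operators the two orders coincide; with weighted averages they generally differ, which is why the operators and the order are chosen according to the concrete scenario.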
  • The real congestion map corresponding to each training prediction layer is obtained from the K training semiconductor chips after global routing.
  • The training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers of the chip according to their manufacturing process or the distribution of functional modules.
  • The M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in the training prediction layer, where the first training feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • The real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the training prediction layer, and the first vertical real congestion map is obtained from their second vertical real congestion maps.
  • The training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. The first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps; the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • The M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
  • The present application discloses a congestion prediction model training device, which includes: a layering unit, configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; a determination unit, configured to determine M first feature maps corresponding to each prediction layer, where the M first feature maps respectively describe M chip features of the prediction layer and M is a positive integer; and a training unit, configured to take the M first feature maps and the real congestion maps corresponding to the prediction layers contained in each of the K semiconductor chips as a data set and to train the congestion prediction model with the data set, where the real congestion map of each prediction layer describes the real congestion level of that prediction layer.
  • The above-mentioned training unit is further configured to: obtain the real congestion map corresponding to each prediction layer from the K semiconductor chips after global routing, and add the real congestion maps corresponding to the prediction layers of the K semiconductor chips to the data set.
  • The above layering unit is specifically configured to: divide the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules.
  • The above determination unit is specifically configured to: obtain M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps respectively describe M chip features of the metal layer; and generate the M first feature maps of each prediction layer from the M second feature maps of each metal layer, where the first feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • The real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the prediction layer, and the first vertical real congestion map is obtained from their second vertical real congestion maps.
  • The above training unit is specifically configured to: iteratively train the congestion prediction model with the data set, where each training iteration includes processing the M first feature maps corresponding to any prediction layer in the data set with the congestion prediction model to obtain the predicted congestion map of that prediction layer, and updating the congestion prediction model based on the predicted congestion map and the real congestion map of that prediction layer.
  • The prediction layers contained in each of the above semiconductor chips are a macro-unit layer and a non-macro-unit layer. In the aspect of training the congestion prediction model with the data set, the above training unit is specifically configured to: train the first congestion prediction model with the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and train the second congestion prediction model with the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • The above M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
  • The present application discloses an image processing device, which includes: a determination unit, configured to determine M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, where the chip to be predicted includes at least two prediction layers and M is a positive integer; and a processing unit, configured to process the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain the predicted congestion map of each prediction layer. The congestion prediction model is obtained by training on a data set that includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data of each training prediction layer includes M first training feature maps, which respectively describe M chip features of the training prediction layer, and a real congestion map, which describes the real congestion level of the training prediction layer. Each training semiconductor chip includes at least two training prediction layers, and each training prediction layer includes at least one metal layer.
  • The above device further includes: an aggregation unit, configured to aggregate the predicted congestion maps corresponding to all the prediction layers of the semiconductor chip to be predicted to obtain the predicted congestion map of the chip to be predicted.
  • The predicted congestion map of each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map. The above aggregation unit is specifically configured to: use a hierarchical aggregation operator to aggregate the vertical predicted congestion maps of the prediction layers into a reference vertical predicted congestion map; use the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps of the prediction layers into a reference horizontal predicted congestion map; and use a directional aggregation operator to aggregate the reference vertical and reference horizontal predicted congestion maps into the predicted congestion map of the semiconductor chip to be predicted; or, use the directional aggregation operator to aggregate the vertical and horizontal predicted congestion maps of each prediction layer into a reference predicted congestion map for that layer, and then use the hierarchical aggregation operator to aggregate the per-layer reference predicted congestion maps into the predicted congestion map of the semiconductor chip to be predicted.
  • The real congestion map corresponding to each training prediction layer is obtained from the K training semiconductor chips after global routing.
  • The training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers of the chip according to their manufacturing process or the distribution of functional modules.
  • The M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in the training prediction layer, where the first training feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • The real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the training prediction layer, and the first vertical real congestion map is obtained from their second vertical real congestion maps.
• the plurality of training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps; the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • the M chip features include one or more of pin density, network connection density, module mask, or amount of routing resources.
• the present application discloses a chip system, characterized in that the chip system includes at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected through lines, and instructions are stored in the memory; when the instructions are executed by the processor, the method described in any one of the first aspect and/or the second aspect is implemented.
• the present application discloses a terminal device, characterized in that the terminal device includes the chip system described in the third aspect above, and a discrete device coupled to the chip system.
• the present application discloses a computer-readable storage medium, characterized in that the computer-readable storage medium stores program instructions; when the program instructions are run on a processor, the method described in any one of the first aspect and/or the second aspect is implemented.
  • the present application discloses a computer program product, which is characterized in that, when the computer program product is run on a terminal, the method described in any one of the first aspect and/or the second aspect is implemented.
  • FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a network model provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a chip hardware structure provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another system architecture provided by an embodiment of the present application.
• FIG. 5 is a schematic flowchart of a method for training a congestion prediction model provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of hierarchical division of a semiconductor chip provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of an image processing method provided in an embodiment of the present application.
  • Fig. 8 is a schematic diagram of the spatial relationship between a first feature map and a fourth feature map provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a congestion prediction provided by an embodiment of the present application.
  • Fig. 10 is a schematic structural diagram of a model training device provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an image processing device provided by an embodiment of the present application.
  • Fig. 12 is a schematic diagram of the hardware structure of a model training device in the embodiment of the present application.
  • FIG. 13 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application.
• Embodiments of the present application can be applied to image processing tasks, for example, to congestion prediction (Congestion Prediction) in the physical design stage of chip electronic design automation (Electronic Design Automation, EDA): the congestion degree of a chip is predicted based on the feature maps (feature data) of the semiconductor chip, thereby providing an optimization basis for the global placement, so that the placer (Placer) can push apart the cells in severely congested areas, reducing the layout congestion of those areas and thereby reducing the overall congestion of the chip.
• the images in the embodiments of the present application may be static images (or called static pictures) or dynamic images (or called dynamic pictures); for example, the images in the present application may be videos or dynamic pictures, or may be static images or photographs.
  • the present application collectively refers to static images or dynamic images as images in the following embodiments.
• the congestion prediction described above is only one specific scenario to which the method of the embodiment of the present application is applied; the method of the embodiment of the present application is not limited to this scenario when applied.
  • the method in the embodiment of the present application can also be similarly applied to other fields, for example, speech recognition and natural language processing, etc., which is not limited in the embodiment of the present application.
• the training method of the congestion prediction model provided in the embodiment of the present application involves computer vision processing, and can be specifically applied to data processing methods such as data training, machine learning, and deep learning: symbolic and formalized intelligent information modeling, extraction, preprocessing, and training are performed on the training data (such as the first feature maps in the present application) to finally obtain a trained congestion prediction model. The image processing method provided in the embodiment of the present application can then use the above trained congestion prediction model: the input data (such as the feature maps in this application) are input into the trained congestion prediction model to obtain the output data (such as the predicted congestion map of the chip to be predicted in this application).
• the congestion prediction model training method and the image processing method provided in the embodiments of this application are inventions based on the same idea, and can also be understood as two parts of one system, or two stages of one overall process, such as a model training phase and a model application phase.
  • the embodiment of the present application involves a large number of related applications of neural networks.
• a neural network can be composed of neural units, and a neural unit can refer to an operation unit that takes x_s and an intercept 1 as input; the output of the operation unit can be: h_{W,b}(x) = f(Σ_s W_s · x_s + b)
  • W s is the weight of x s
  • b is the bias of the neuron unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal. The output signal of this activation function can be used as the input of the next convolutional layer.
  • the activation function may be a sigmoid function.
  • a neural network is a network formed by connecting many of the above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
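The single-neural-unit computation described above can be sketched as follows, using the sigmoid activation mentioned in the text; the weights and inputs are illustrative.

```python
import math

def neural_unit(xs, ws, b):
    """Output of a single neural unit: f(sum_s W_s * x_s + b), with sigmoid f
    converting the input signal into an output signal."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation function

# With zero weights and zero bias the unit outputs sigmoid(0) = 0.5:
assert neural_unit([1.0, 2.0], [0.0, 0.0], 0.0) == 0.5
```

The returned activation value can then serve as the input of a unit in the next layer, matching the description above.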
  • a deep neural network also known as a multi-layer neural network
• DNN can be understood as a neural network with many hidden layers; there is no special metric for how many layers count as "many" here.
  • the neural network inside DNN can be divided into three categories: input layer, hidden layer, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the layers in the middle are all hidden layers.
  • the layers are fully connected, that is, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
• the coefficient from the kth neuron of the (L-1)th layer to the jth neuron of the Lth layer is defined as W_jk^L. It should be noted that the input layer has no W parameter.
• more hidden layers make the network more capable of describing complex situations in the real world; theoretically speaking, a model with more parameters has higher complexity and greater "capacity", which means that it can complete more complex learning tasks.
  • Training the deep neural network is the process of learning the weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (the weight matrix formed by the vector W of many layers).
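A minimal sketch of a fully connected forward pass, using the indexing convention described above (a coefficient W_jk links neuron k of one layer to neuron j of the next); all numeric values are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, weight_matrices, biases):
    """Pass input x through fully connected layers; weight_matrices[l][j][k]
    is the coefficient from neuron k of layer l to neuron j of layer l+1."""
    a = x
    for W, b in zip(weight_matrices, biases):
        a = [sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(row, a)) + b_j)
             for row, b_j in zip(W, b)]
    return a

# One hidden layer of two neurons feeding one output neuron (toy values):
W = [[[1.0, -1.0], [0.5, 0.5]], [[1.0, 1.0]]]
b = [[0.0, 0.0], [0.0]]
out = forward([1.0, 2.0], W, b)
assert len(out) == 1 and 0.0 < out[0] < 1.0
```

Training, as stated above, amounts to learning every entry of these weight matrices.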
  • Convolutional neural network (CNN, convolutional neuron network) is a deep neural network with a convolutional structure.
  • a convolutional neural network consists of a feature extractor consisting of a convolutional layer and a subsampling layer.
  • the feature extractor can be seen as a filter, and the convolution process can be seen as using a trainable filter to convolve with an input image or convolutional feature map.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • a neuron can only be connected to some adjacent neurons.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units.
  • Neural units of the same feature plane share weights, and the shared weights here are convolution kernels.
• Shared weights can be understood as a way to extract image information that is independent of position. The underlying principle is that the statistics of one part of an image are the same as those of other parts, which means that image information learned in one part can also be used in another part; therefore, the same learned image information can be used for all positions on the image.
• multiple convolution kernels can be used to extract different image information. Generally, the more convolution kernels there are, the richer the image information reflected by the convolution operation.
  • the convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training process of the convolutional neural network.
  • the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
• the convolutional neural network can use the error back propagation (back propagation, BP) algorithm to correct the parameter values in the initial super-resolution model during training, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller. Specifically, the input signal is passed forward until the output generates an error loss, and the parameters in the initial super-resolution model are updated by back-propagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation movement dominated by error loss, aiming to obtain the parameters of the optimal super-resolution model, such as the weight matrix.
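The error-driven update described above can be illustrated with a one-weight gradient-descent toy example; this is a sketch of the general principle only, not the specific backpropagation implementation of any model in this application, and all values are illustrative.

```python
# Minimal sketch of error-driven parameter updates in the spirit of
# backpropagation: gradient descent on a squared-error loss for one weight.

def train_single_weight(x, target, w=0.0, lr=0.1, steps=200):
    for _ in range(steps):
        pred = w * x                # forward pass
        error = pred - target       # error-loss signal at the output
        grad = error * x            # d(0.5 * error**2) / dw
        w -= lr * grad              # update toward smaller error loss
    return w

w = train_single_weight(x=2.0, target=6.0)
assert abs(w * 2.0 - 6.0) < 1e-6   # the error loss has converged
```

In a real network the same error signal is propagated backward through every layer to update all weight matrices at once.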
  • the pixel value of the image can be a red-green-blue (RGB) color value, and the pixel value can be a long integer representing the color.
• the pixel value is 256*Red+100*Green+76*Blue, where Blue represents the blue component, Green the green component, and Red the red component. In each color component, the smaller the value, the lower the brightness; the larger the value, the higher the brightness.
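The long-integer pixel value given above can be computed directly; the coefficients are taken verbatim from the text.

```python
def pixel_value(red, green, blue):
    """Long-integer pixel value as given in the text: 256*Red + 100*Green + 76*Blue."""
    return 256 * red + 100 * green + 76 * blue

assert pixel_value(1, 1, 1) == 432   # 256 + 100 + 76
assert pixel_value(0, 0, 0) == 0     # all components dark
```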
  • the pixel values may be grayscale values.
  • FIG. 1 is a schematic structural diagram of a system architecture 100 provided by an embodiment of the present application.
  • the data collection device 160 is used to collect training data.
  • the training data includes first feature maps and real congestion maps corresponding to all prediction layers.
• after collecting the training data, the data collection device 160 stores the training data in the database 130, and the training device 120 trains the target model 101 (i.e., the congestion prediction model in the embodiment of the present application) based on the training data maintained in the database 130.
• the target model 101 can be used to implement the image processing method provided by the embodiment of the present application, that is, the M first feature maps corresponding to each prediction layer of the chip to be predicted are input to the target model 101 after preprocessing, to obtain the predicted congestion map corresponding to each prediction layer.
  • the target model 101 in the embodiment of the present application may specifically be a congestion prediction model.
  • the congestion prediction model is obtained through at least one training.
  • the training data maintained in the database 130 may not all be collected by the data collection device 160, but may also be received from other devices.
• the training device 120 does not necessarily train the target model 101 entirely based on the training data maintained by the database 130; it is also possible to obtain training data from the cloud or elsewhere for model training, and the above description should not be taken as a limitation on the embodiments of this application.
• the target model 101 trained by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 1; the execution device may be a terminal, or may be a server or a cloud device.
• the execution device 110 is equipped with an input/output (input/output, I/O) interface 112 for data interaction with external devices, and a user can input data to the I/O interface 112 through the client device 140.
  • the input data may include a first feature map corresponding to each prediction layer of the chip to be predicted.
• when the execution device 110 preprocesses the input data, or when the calculation module 111 of the execution device 110 performs calculation and other related processing, the execution device 110 can call the data, codes, etc. in the data storage system 150 for corresponding processing, and the correspondingly processed data and instructions may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing result, such as the predicted congestion map of each prediction layer of the chip to be predicted (or the corresponding predicted congestion map of the chip to be predicted) obtained above, to the client device 140 to provide to the user.
  • the training device 120 can generate corresponding target models 101 based on different training data for different goals or different tasks, and the corresponding target models 101 can be used to achieve the above-mentioned goals or complete the above-mentioned tasks, thereby Provide the user with the desired result.
  • the user can manually specify the input data, and the manual specification can be operated through the interface provided by the I/O interface 112 .
• the client device 140 can automatically send the input data to the I/O interface 112. If the client device 140 is required to obtain the user's authorization before automatically sending the input data, the user can set the corresponding permission in the client device 140.
  • the user can view the results output by the execution device 110 on the client device 140, and the specific presentation form may be specific ways such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal, collecting the input data input to the I/O interface 112 as shown in the figure and the output results of the output I/O interface 112 as new sample data, and storing them in the database 130 .
• the client device 140 may not be used for collection; instead, the I/O interface 112 directly takes the input data input to the I/O interface 112 as shown in the figure and the output results of the I/O interface 112 as new sample data, and stores them in the database 130.
• FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation; for example, in FIG. 1, the data storage system 150 is an external memory relative to the execution device 110, while in other cases, the data storage system 150 may also be placed in the execution device 110.
  • the target model 101 is obtained by training according to the training device 120.
• the target model 101 can be obtained by training based on the training method of the congestion prediction model in the embodiment of the present application; specifically, the congestion prediction model provided in the embodiment of the present application can be a convolutional neural network, a generative adversarial network, a variational autoencoder, or a semantic segmentation neural network, which is not specifically limited in this solution.
  • the convolutional neural network is a deep neural network with a convolutional structure and a deep learning (DL) architecture.
• the deep learning architecture refers to performing multiple levels of learning at different levels of abstraction through machine learning algorithms.
  • CNN is a feed-forward artificial neural network in which individual neurons can respond to images input into it.
  • a convolutional neural network (CNN) 200 may include an input layer 210 , a convolutional/pooling layer 220 (where the pooling layer is optional), and a neural network layer 230 .
• the convolutional layer/pooling layer 220 may include layers 221-226. For example: in one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer; in another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 221 may include many convolution operators, which are also called kernels, and their role in image processing is equivalent to a filter for extracting specific information from the input image matrix.
• the convolution operator is essentially a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is usually moved along the horizontal direction on the input image one pixel after another (or two pixels after two pixels, depending on the value of the stride), completing the work of extracting specific features from the image.
  • the size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image.
• during the convolution operation, the weight matrix extends over the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolutional output with a single depth dimension; in most cases, however, instead of a single weight matrix, multiple weight matrices of the same size (row × column), that is, multiple matrices of the same shape, are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolution image, where the dimension can be understood as determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract image edge information, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to filter unwanted noise in the image.
• the multiple weight matrices have the same size (row × column), so the feature maps extracted by them are also of the same size; the extracted feature maps of the same size are then combined to form the output of the convolution operation.
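A small sketch of how multiple same-size kernels each yield one output plane, with the stacked planes forming the depth dimension of the convolved image as described above; kernel values are illustrative (stride 1, no padding).

```python
# Hedged sketch: each same-size kernel produces one output plane, and
# stacking the planes forms the depth dimension of the convolved image.
# Kernel values are illustrative only.

def conv2d_single(image, kernel):
    """Valid 2-D convolution (correlation) of one kernel over one plane."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def conv2d_multi(image, kernels):
    """One output plane per kernel; the stack's length is the depth dimension."""
    return [conv2d_single(image, k) for k in kernels]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
edge_kernel = [[1, -1], [1, -1]]     # crude horizontal-edge detector
identity_kernel = [[1, 0], [0, 0]]   # copies the top-left pixel of each window
planes = conv2d_multi(image, [edge_kernel, identity_kernel])
assert len(planes) == 2              # depth equals the number of kernels
assert planes[1] == [[1, 2], [4, 5]]
```

As stated above, different kernels extract different information (edges, colors, noise suppression) from the same positions of the input.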
• the weight values in these weight matrices need to be obtained through extensive training in practical applications, and each weight matrix formed by the trained weight values can be used to extract information from the input image, so that the convolutional neural network 200 can make correct predictions.
  • the initial convolutional layer (such as 221) often extracts more general features, which can also be referred to as low-level features;
  • the features extracted by the later convolutional layers (such as 226) become more and more complex, such as features such as high-level semantics, and features with higher semantics are more suitable for the problem to be solved.
• each of layers 221-226 shown in the convolutional layer/pooling layer 220 in Figure 2 can be a convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling an input image to obtain an image of a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of maximum pooling.
  • the operators in the pooling layer should also be related to the size of the image.
  • the size of the image output after being processed by the pooling layer may be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
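The two pooling operators can be sketched on a 4×4 input with non-overlapping 2×2 windows; each output pixel is the average or the maximum of its sub-region, as described above.

```python
# Sketch of the two pooling operators on a 4x4 input with 2x2 windows and
# stride 2: each output pixel summarizes one sub-region of the input.

def pool2x2(image, op):
    """Apply op (e.g. max, or a mean function) to each 2x2 window, stride 2."""
    return [[op([image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1]])
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]

img = [[1, 2, 5, 6],
       [3, 4, 7, 8],
       [9, 10, 13, 14],
       [11, 12, 15, 16]]

mx = pool2x2(img, max)                           # max pooling
avg = pool2x2(img, lambda v: sum(v) / len(v))    # average pooling
assert mx == [[4, 8], [12, 16]]                  # largest value per window
assert avg == [[2.5, 6.5], [10.5, 14.5]]         # mean value per window
```

The 2×2 output is smaller than the 4×4 input, matching the description of pooling reducing image size.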
• after processing by the convolutional layer/pooling layer 220, the convolutional neural network 200 is not yet able to output the required output information, because, as mentioned earlier, the convolutional layer/pooling layer 220 only extracts features and reduces the parameters brought by the input image. To generate the final output information (the required class information or other relevant information), the convolutional neural network 200 needs to use the neural network layer 230 to generate one output or a group of outputs whose number equals the required number of classes. Therefore, the neural network layer 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 2) and an output layer 240; the parameters contained in the multiple hidden layers may be pre-trained based on the related training data of a specific task type.
• the output layer 240 has a loss function similar to categorical cross entropy and is specifically used to calculate the prediction error.
  • the convolutional neural network 200 shown in FIG. 2 is only an example of a convolutional neural network, and in specific applications, the convolutional neural network may also exist in the form of other network models.
  • a chip hardware structure provided by the embodiment of the present application is introduced below.
  • FIG. 3 is a chip hardware structure provided by an embodiment of the present invention, and the chip includes a neural network processor 50 .
  • the chip can be set in the execution device 110 shown in FIG. 1 to complete the computing work of the computing module 111 .
  • the chip can also be set in the training device 120 shown in FIG. 1 to complete the training work of the training device 120 and output the target model 101 .
  • the algorithms of each layer in the convolutional neural network shown in Figure 2 can be implemented in the chip shown in Figure 3 .
  • the neural network processor NPU 50 is mounted on the main CPU (Host CPU) as a coprocessor, and the tasks are assigned by the Host CPU.
  • the core part of the NPU is the operation circuit 503, and the controller 504 controls the operation circuit 503 to extract data in the memory (weight memory or input memory) and perform operations.
  • the operation circuit 503 includes multiple processing units (process engine, PE).
  • arithmetic circuit 503 is a two-dimensional systolic array.
  • the arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition.
  • arithmetic circuit 503 is a general-purpose matrix processor.
  • the operation circuit fetches the data corresponding to the matrix B from the weight memory 502, and caches it in each PE in the operation circuit.
  • the operation circuit takes the data of matrix A from the input memory 501 and performs matrix operation with matrix B, and the obtained partial or final results of the matrix are stored in the accumulator 508 (accumulator).
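The matrix operation described above (matrix A from the input memory multiplied with matrix B from the weight memory, with partial results accumulated per output cell) amounts to an ordinary matrix multiplication; a plain-Python sketch of that accumulation, with illustrative values:

```python
# Sketch of the operation-circuit semantics: partial products of A and B are
# accumulated cell by cell, mirroring the role of the accumulator 508.

def matmul_accumulate(A, B):
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]            # accumulator for partial results
    for i in range(n):
        for j in range(m):
            for p in range(k):
                acc[i][j] += A[i][p] * B[p][j]   # accumulate one partial product
    return acc

A = [[1, 2], [3, 4]]   # data from the input memory
B = [[5, 6], [7, 8]]   # weights from the weight memory
assert matmul_accumulate(A, B) == [[19, 22], [43, 50]]
```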
  • the vector computing unit 507 can further process the output of the computing circuit, such as vector multiplication, vector addition, exponent operation, logarithmic operation, size comparison and so on.
  • the vector calculation unit 507 can be used for network calculations of non-convolution/non-FC layers in neural networks, such as pooling (Pooling), batch normalization (batch normalization), local response normalization (local response normalization), etc. .
  • vector computation unit 507 can store a vector of processed outputs to unified memory 506 .
  • the vector calculation unit 507 may apply a non-linear function to the output of the operation circuit 503, such as a vector of accumulated values, to generate activation values.
  • the vector computation unit 507 generates normalized values, merged values, or both.
  • the vector of processed outputs can be used as an activation input to arithmetic circuitry 503, for example for use in subsequent layers in a neural network.
  • the unified memory 506 is used to store input data and output data.
• the direct memory access controller (direct memory access controller, DMAC) 505 transfers the input data in the external memory to the input memory 501 and/or the unified memory 506, stores the weight data in the external memory into the weight memory 502, and stores the data in the unified memory 506 into the external memory.
  • a bus interface unit (bus interface unit, BIU) 510 is configured to implement interaction between the main CPU, DMAC and instruction fetch memory 509 through the bus.
  • An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504.
  • the controller 504 is configured to call the instruction cached in the instruction fetch memory 509 to control the operation process of the operation accelerator.
  • the unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch memory 509 are all on-chip memory
  • the external memory is a memory outside the NPU
• the external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM), a high bandwidth memory (high bandwidth memory, HBM), or other readable and writable memory.
  • each layer in the convolutional neural network shown in FIG. 2 can be performed by the operation circuit 503 or the vector calculation unit 507 .
• the training device 120 in FIG. 1 introduced above can execute the steps of the congestion prediction model training method in the embodiment of the present application, and the execution device 110 in FIG. 1 can execute the steps of the image processing method in the embodiment of the present application; the neural network model shown in FIG. 2 and the chip shown in FIG. 3 can also be used to execute the steps of the image processing method of the embodiment of the present application, and the chip shown in FIG. 3 can also be used to execute the steps of the congestion prediction model training method in the embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a system architecture 300 provided in an embodiment of the present application.
  • the system architecture includes a local device 301, a local device 302, an execution device 210, and a data storage system 250; wherein, the local device 301 and the local device 302 are connected to the execution device 210 through a communication network.
  • Execution device 210 may be implemented by one or more servers.
  • the execution device 210 may be used in cooperation with other computing devices, such as data storage, routers, load balancers and other devices.
  • Execution device 210 may be arranged on one physical site, or distributed on multiple physical sites.
  • the execution device 210 may use the data in the data storage system 250 or call the program code in the data storage system 250 to implement the congestion prediction model training method or the image processing method in the embodiment of the present application.
  • the execution device 210 may perform the following process:
  • a congestion prediction model can be trained through the execution device 210 above, and the congestion prediction model can be used for image processing, speech processing, and natural language processing, etc., for example, the congestion prediction model can be used to implement the congestion prediction method in the embodiment of the present application.
  • the execution device 210 can be built into an image processing device through the above process, and the image processing device can be used for image processing (for example, it can be used to realize the congestion prediction of the semiconductor chip in the embodiment of the present application).
  • Each local device can represent a variety of computing devices, such as personal computers, computer workstations, smartphones, tablets, and so on.
  • Each user's local device can interact with the execution device 210 through any communication mechanism/communication standard communication network, and the communication network can be a wide area network, a local area network, a point-to-point connection, etc., or any combination thereof.
• the local device 301 and the local device 302 obtain the relevant parameters of the congestion prediction model from the execution device 210, deploy the congestion prediction model on the local device 301 and the local device 302, and use the congestion prediction model to perform congestion prediction on the chip to be predicted, to obtain the predicted congestion map of the chip to be predicted.
• the trained congestion prediction model can also be directly deployed on the execution device 210; the execution device 210 obtains the characteristic data of the chip to be predicted from the local device 301 and the local device 302, and uses the trained congestion prediction model to perform congestion prediction on the chip to be predicted, to obtain the predicted congestion map of the chip to be predicted.
•   the local device 301 and the local device 302 obtain the relevant parameters of the image processing device from the execution device 210, deploy the image processing device on the local device 301 and the local device 302, and use the image processing device to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
  • the image processing device can be directly deployed on the execution device 210.
•   the execution device 210 obtains the characteristic data of the chip to be predicted from the local device 301 and the local device 302, and uses the image processing device to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
•   the above execution device 210 may also be a cloud device, in which case the execution device 210 may be deployed on the cloud; or the above execution device 210 may be a terminal device, in which case the execution device 210 may be deployed on the user terminal side; the embodiment of the present application does not limit this.
  • FIG. 5 is a schematic flowchart of a congestion prediction model training method 500 provided in an embodiment of the present application. The method includes but is not limited to the following steps:
  • Step S510 Divide the plurality of metal layers into at least two prediction layers; wherein, the plurality of metal layers are metal layers included in each of the K semiconductor chips, and K is a positive integer.
  • each semiconductor chip is divided into at least two prediction layers; wherein, each prediction layer includes at least one metal layer.
•   FIG. 6 is a schematic diagram of the hierarchical division of a semiconductor chip provided by an embodiment of the present application. As shown in FIG. 6, the semiconductor chip may comprise a plurality of metal layers (from top to bottom, metal layer 1-1 through metal layer N-B).
  • each prediction layer contains at least one metal layer, for example, prediction layer 1 may contain metal layer 1-1...metal layer 1-A, and prediction layer N may contain metal layer N-1...metal layer N-B; wherein, A and B are positive integers, and N is an integer greater than or equal to 2.
•   the above-mentioned division of the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process or the functional module distribution of the metal layers in each semiconductor chip.
•   the division into prediction layers is based on the manufacturing process of the metal layers or on whether the metal layers contain the same functional module (Module); that is, on the same semiconductor chip, metal layers with similar manufacturing processes can be divided into the same prediction layer, or metal layers containing the same functional modules can be divided into the same prediction layer.
•   the manufacturing process of each metal layer can be characterized by its routing track capacity, that is, the number of routing tracks (Routing track) on the metal layer; the more advanced a metal layer's manufacturing process, the greater its amount of routing resources.
  • the amount of routing resources on the metal layer of the 7nm manufacturing process is greater than the amount of routing resources on the metal layer of the 14nm manufacturing process.
  • Functional modules refer to some hardware structures in the metal layer, for example, macro cell (Macro Cell) layer or registers, etc.
•   metal layers whose amounts of routing resources differ by no more than a preset threshold are divided into the same prediction layer, and the preset threshold can be determined according to the specific application scenario. Assume that a semiconductor chip contains 6 metal layers, the amounts of routing resources of the 6 metal layers are 15, 15, 14, 10, 10 and 2 respectively (measured in routing tracks), and the preset threshold is 2; the 6 metal layers can then be divided into three prediction layers. Specifically, the three metal layers whose routing resource amounts are 15, 15 and 14 can be divided into one prediction layer; the two metal layers whose routing resource amounts are 10 can be divided into another prediction layer; and the metal layer whose routing resource amount is 2 forms its own prediction layer.
•   alternatively, the multiple metal layers in the same semiconductor chip can be divided into at least two prediction layers according to the functional module distribution.
•   metal layers containing macro cell layers in the same semiconductor chip can be divided into one prediction layer and metal layers not containing macro cell layers into another; or metal layers containing registers in the same semiconductor chip can be divided into one prediction layer and metal layers not containing registers into another.
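•   The threshold-based grouping in the example above (capacities 15, 15, 14, 10, 10 and 2 with threshold 2) might be sketched as follows; the helper name and list encoding are illustrative assumptions, not code from this application:

```python
def group_metal_layers(track_capacities, threshold):
    """Group metal layers (listed top to bottom by routing-track capacity)
    into prediction layers: a layer joins the current group when its capacity
    differs from the group's first layer by at most `threshold`."""
    groups = [[0]]  # each group holds metal-layer indices
    for i in range(1, len(track_capacities)):
        if abs(track_capacities[i] - track_capacities[groups[-1][0]]) <= threshold:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups

# The six-layer example from the text yields three prediction layers.
print(group_metal_layers([15, 15, 14, 10, 10, 2], 2))  # [[0, 1, 2], [3, 4], [5]]
```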
  • Step S520 Determine M first feature maps corresponding to each of the prediction layers; wherein, the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer.
  • M chip features used for congestion prediction may be determined according to specific application scenarios.
  • Obtain chip-related data including netlist, macromodule location, transistor location, transistor pin location, and routing resources, etc.
  • M first feature maps corresponding to the M chip features of each prediction layer are calculated based on the above chip-related data, and each chip feature of the prediction layer corresponds to a first feature map describing the feature of the chip.
  • the above M chip features may include one or more of pin density, network connection density, module mask, or amount of routing resources.
•   each metal layer in a given prediction layer has similar chip features; for example, within the same prediction layer, the pin densities of the metal layers are similar.
•   determining the M first feature maps corresponding to each of the prediction layers includes: obtaining M second feature maps corresponding to each metal layer in each of the prediction layers, wherein the M second feature maps are respectively used to describe the M chip features of each metal layer; and generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each of the prediction layers; wherein the first feature map describing any chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature on each metal layer.
•   each metal layer included in each semiconductor chip corresponds to the aforementioned M chip features; that is, the chip features of each metal layer may include one or more of pin density, network connection density, module mask, or amount of routing resources. Among them, pin density and module mask are not directional, while the amount of routing resources and the network connection density are directional: on a given metal layer, the amount of routing resources is either horizontal or vertical, as is the network connection density.
•   the amount of routing resources is specifically the number of routing tracks on each metal layer; since the routing tracks on each metal layer are directional, either horizontal or vertical, the amount of routing resources is directional accordingly.
•   the network connection density refers to the number of windings per unit area; since the windings run in the above-mentioned routing tracks, the network connection density is likewise horizontal or vertical. It should be understood that, for a directional chip feature, the first feature map describing that feature is also directional; for example, when the amount of routing resources is horizontal, the first feature map describing the amount of routing resources is also for the horizontal direction.
•   obtaining the M second feature maps corresponding to each metal layer in each of the prediction layers includes: performing feature extraction based on the wiring data of each metal layer in each prediction layer to obtain the M second feature maps corresponding to each metal layer.
•   generating the M first feature maps corresponding to each prediction layer based on the M second feature maps of each metal layer includes: for the same chip feature, obtaining, based on the second feature maps describing that chip feature on each metal layer, a first feature map describing that chip feature, where the first feature map is one of the M first feature maps corresponding to each prediction layer.
•   specifically, the corresponding pixels on the second feature maps describing the same chip feature on each metal layer in the same prediction layer can be combined by weighted averaging, taking the maximum value, or taking the minimum value, to obtain the pixel value of the corresponding pixel on the first feature map for that chip feature. Performing this operation for every pixel yields the pixel value of each pixel on the first feature map, that is, the first feature map describing that chip feature. The specific operation used to obtain the first feature map is not limited in this application.
•   for example, for a prediction layer containing four metal layers, the first feature map corresponding to the pin density of the prediction layer is determined as follows: first, based on the wiring data of the four metal layers, four second feature maps respectively describing the pin densities of the four metal layers are obtained; then the pixel values at the same position in the four second feature maps are combined by weighted averaging, taking the maximum value, or taking the minimum value, to obtain the pixel value at that position in the first feature map describing the pin density of the prediction layer. Every pixel in the four second feature maps is processed in this manner to obtain the first feature map corresponding to the pin density of the prediction layer.
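•   The pixel-wise combination of second feature maps into one first feature map could look like the following sketch, using plain-Python grids in place of real feature maps (the function name and the uniform weighting are assumptions):

```python
def aggregate_feature_maps(second_maps, mode="mean"):
    """Combine the second feature maps of the metal layers in one prediction
    layer into a first feature map, pixel by pixel. `second_maps` is a list
    of equally sized H x W grids (lists of lists of numbers)."""
    h, w = len(second_maps[0]), len(second_maps[0][0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [m[i][j] for m in second_maps]
            if mode == "mean":      # weighted average with uniform weights
                out[i][j] = sum(vals) / len(vals)
            elif mode == "max":
                out[i][j] = max(vals)
            else:                   # "min"
                out[i][j] = min(vals)
    return out

# Two 2x2 pin-density maps averaged into one first feature map.
print(aggregate_feature_maps([[[1, 2], [3, 4]], [[3, 4], [5, 6]]]))
# [[2.0, 3.0], [4.0, 5.0]]
```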
•   for any one metal layer, a preset operation can be used to obtain the second feature map describing a first chip feature of that metal layer. The preset operation may be: first determine the prediction layer where the metal layer is located, and then determine the second feature map describing the first chip feature on that metal layer based on the second feature maps describing the first chip feature on the other metal layers in that prediction layer; for example, weighted averaging, taking the maximum value or minimum value, or other processing can be applied to the corresponding pixels of those second feature maps to obtain the second feature map describing the first chip feature of the metal layer.
  • each metal layer corresponds to M second feature maps
  • the M second feature maps are used to respectively describe M chip features of each metal layer.
•   because the metal layers in the same prediction layer have similar chip features, the first feature map obtained by weighted averaging, taking the maximum or minimum value, or other such methods can accurately represent that chip feature across the metal layers; that is, the first feature map describing a chip feature has good correlation and consistency with the second feature maps describing that chip feature on each metal layer. This avoids the situation in which, because the metal layers grouped into the same prediction layer differ greatly, the resulting first feature map differs substantially from the second feature maps describing the same chip feature on each metal layer.
•   a first feature map that accurately reflects the chip features of the metal layers in each prediction layer can be obtained through the above-mentioned layering method and first-feature-map determination method; therefore, after the congestion prediction model is trained with the first feature maps corresponding to the different prediction layers, the obtained model can effectively identify feature data with different trends and make corresponding predictions based on the identified features. That is, the model trained by the embodiment of this application has refined recognition and prediction capabilities.
  • Step S530 Determine a data set based on the M first feature maps and real congestion maps corresponding to the prediction layer included in each of the K semiconductor chips, and use the data set to train a congestion prediction model; wherein, The real congestion map corresponding to each of the prediction layers is used to describe the real congestion degree of each of the prediction layers.
  • the degree of congestion refers to the difference between the routing resource demand and the routing resource amount.
  • the amount of routing resources refers to the number of routing tracks.
•   the routing resource requirement refers to the number of windings required to connect all the netlists. The required windings run in the routing tracks, so the difference between the routing resource requirement and the amount of routing resources is the degree of congestion. For example, if the amount of routing resources is 10 but 12 windings are needed to connect all the netlists, 2 windings must share tracks with other windings; that is, the congestion degree is 2.
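•   As a minimal sketch of this definition (hypothetical names; per-grid track demand and capacity are taken as given):

```python
def congestion_degree(demand, capacity):
    """Congestion degree of one grid: routing-track demand minus the given
    routing-track capacity; a positive value means windings must share tracks."""
    return demand - capacity

def congestion_grid(demand_map, capacity_map):
    """Element-wise congestion over the rasterized chip; negative entries
    indicate spare routing capacity."""
    return [[d - c for d, c in zip(dr, cr)]
            for dr, cr in zip(demand_map, capacity_map)]

# e.g. a grid needing 12 windings with capacity 10 has congestion degree 2
print(congestion_degree(12, 10))               # 2
print(congestion_grid([[12, 8]], [[10, 10]]))  # [[2, -2]]
```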
  • the above method further includes: performing global routing on the K semiconductor chips, and obtaining a real congestion map corresponding to each prediction layer according to the K semiconductor chips after global routing ; adding the real congestion map corresponding to each of the prediction layers in the K semiconductor chips to the data set.
•   chip design can be divided into two stages: chip layout and global routing (Global routing).
•   the chip layout stage mainly determines, for each metal layer on the chip, the netlist, the positions of the macro modules, the transistors and the transistor pins, and the amount of routing resources, etc.; in the global routing stage, the metal wires are mainly wound in the routing tracks corresponding to the amount of routing resources.
•   obtaining the real congestion map corresponding to each of the prediction layers based on the K semiconductor chips after global routing includes: after global routing is performed on a semiconductor chip, the number of windings on the chip, that is, the routing track demand, can be determined; the real congestion map of each prediction layer is then calculated based on the routing resource demand and the amount of routing resources.
  • the image processing method (also called the congestion prediction method) in the embodiment of the present application is mainly used in the chip layout stage.
•   the chip congestion degree is predicted based on this method so that the chip layout can be adjusted accordingly.
•   the real congestion map corresponding to each of the prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each of the prediction layers, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each prediction layer.
  • the real congestion map corresponding to each metal layer in each semiconductor chip is obtained, and the real congestion map corresponding to each metal layer includes a second horizontal real congestion map and a second vertical real congestion map; wherein, the second horizontal real congestion map is used to describe the degree of congestion of the metal layer in the horizontal direction, and the second vertical real congestion map is used to describe the degree of congestion of the metal layer in the vertical direction.
  • the real congestion map corresponding to each metal layer is calculated based on the routing resource requirement and routing resource amount of each metal layer after global routing is performed.
•   the specific process of obtaining the first horizontal real congestion map corresponding to a prediction layer based on the second horizontal real congestion maps corresponding to the metal layers in that prediction layer can follow the above-described process of determining the first feature map corresponding to the prediction layer, that is, the process of obtaining a first feature map describing a chip feature from the second feature maps describing that chip feature on each metal layer, and will not be repeated here.
  • the specific process of determining the first vertical real congestion map corresponding to the prediction layer is the same as that of the first horizontal real congestion map, and will not be repeated here.
  • the size of the M first feature maps corresponding to each prediction layer is the same as that of the real congestion map.
•   within each prediction layer, the real congestion maps corresponding to the metal layers have strong correlation and consistency; therefore, the real congestion map of each prediction layer obtained by the method of the embodiment of the present application is well consistent with the real congestion maps of the metal layers in that prediction layer. That is, a real congestion map that accurately reflects the congestion degree of each prediction layer can be obtained, which ensures the prediction accuracy of the congestion prediction model subsequently trained with the real congestion maps of the prediction layers.
•   adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to the data set, and using the data set to train the congestion prediction model, includes: using the data set to perform iterative training on the congestion prediction model; wherein each iteration of training includes: using the congestion prediction model to process the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
•   each of the above training iterations includes: determining a single-iteration training sample from the M first feature maps and the 2 real congestion maps corresponding to any prediction layer.
  • a single iteration training sample contains M third feature maps and 2 target real congestion maps.
  • Input the M third feature maps into the congestion prediction model to obtain the predicted congestion map output by the model; determine the prediction error based on the predicted congestion map and the two target real congestion maps.
  • the model parameters in the congestion prediction model are updated using the gradient descent method or other backpropagation algorithms.
•   training stops when a preset condition is met; the preset condition may be that the number of training iterations reaches a preset number, that the prediction error is less than or equal to a preset error, or another feasible condition, which is not limited in the present application.
  • the aforementioned congestion prediction model may be a model such as a generative adversarial neural network, a variational autoencoder, a semantic segmentation neural network, etc., which is not limited in this application.
•   select M third feature maps from the same arbitrary region on the M first feature maps corresponding to the prediction layer, and select two target real congestion maps from the same region on the two real congestion maps corresponding to the prediction layer; the size of the M third feature maps and of the two target real congestion maps is equal to the target size. The M third feature maps and the two target real congestion maps serve as the training sample for a single iteration.
  • the target size is the size of the input image allowed by the congestion prediction model, and the target size may be smaller than or equal to the size of the first feature map.
•   when the target size equals the size of the first feature maps, the M first feature maps are used directly as the above M third feature maps, and the two real congestion maps are used as the above two target real congestion maps; that is, the single-iteration training sample then consists of the M first feature maps and the two real congestion maps.
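•   The same-region sampling described above might be sketched as follows (a hypothetical helper: it crops one random window of the target size from all M first feature maps and both real congestion maps, so every grid in the sample covers the same chip region):

```python
import random

def sample_crop(feature_maps, congestion_maps, target_h, target_w):
    """Build a single-iteration training sample: crop the SAME random window
    from every first feature map (giving M third feature maps) and from both
    real congestion maps (giving 2 target real congestion maps)."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    top = random.randint(0, h - target_h)
    left = random.randint(0, w - target_w)
    crop = lambda m: [row[left:left + target_w] for row in m[top:top + target_h]]
    return [crop(m) for m in feature_maps], [crop(c) for c in congestion_maps]
```

When the target size equals the feature-map size, the crop is the identity, matching the case above where the first feature maps themselves serve as the sample.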
  • the prediction layers included in each semiconductor chip are respectively a macro-unit layer and a non-macro-unit layer
  • the congestion prediction model includes a first congestion prediction model and a second congestion prediction model
•   adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to the data set, and using the data set to train the congestion prediction model, includes: using the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps to train the first congestion prediction model; and using the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps to train the second congestion prediction model.
  • the multiple metal layers in each semiconductor chip are divided into two predictive layers, namely a macro-unit layer and a non-macro-unit layer.
  • the model structures of the first congestion prediction model and the second congestion prediction model are completely the same, and initial model parameters may be the same or different.
•   using the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps to train the first congestion prediction model includes: determining a single-iteration training sample from the M first feature maps and the real congestion maps corresponding to any macro-unit layer, and then training the first congestion prediction model with the single-iteration training sample.
•   the determination process of the single-iteration training samples used to train the first congestion prediction model can refer to the determination process of the single-iteration training samples of the aforementioned congestion prediction model, which will not be repeated here; the specific training process of the first congestion prediction model may be the same as the training process of the congestion prediction model in the foregoing embodiment, and details are not repeated here.
  • the training process of the second congestion prediction model is the same as the training process of the first congestion prediction model, and will not be repeated here.
  • FIG. 7 is a schematic flowchart of an image processing method 700 provided in the embodiment of the present application. The method includes but is not limited to the following steps:
  • Step S710 Determine M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted; wherein, the semiconductor chip to be predicted includes at least two prediction layers, and M is a positive integer;
  • Step S720 using the congestion prediction model to process the M first feature maps corresponding to each of the prediction layers, to obtain a predicted congestion map corresponding to each of the prediction layers;
•   the congestion prediction model is obtained after training on a data set; the data set includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data corresponding to each training prediction layer includes M first training feature maps and a real congestion map; the M first training feature maps are respectively used to describe the M chip features of each training prediction layer, and the real congestion map is used to describe the real congestion degree of each training prediction layer; each training semiconductor chip includes at least two training prediction layers, each training prediction layer includes at least one metal layer, and K is a positive integer.
  • a sliding window method may be used to obtain M fourth feature maps for a single prediction by the model from the M first feature maps.
  • the size of each fourth feature map is the target size, as shown in Figure 8, the length and width of the target size can be E and F respectively, E and F are positive integers, and the unit can be a pixel.
  • the first feature map shown in FIG. 8 is any one of the M first feature maps corresponding to the prediction layer.
•   the first feature map may contain D fourth feature maps, and the width of the overlapping portion between any two adjacent fourth feature maps is G, where G is an integer greater than or equal to zero, in pixels. That is, each of the M first feature maps contains D fourth feature maps.
  • the M fourth feature maps at the same region on the M first feature maps are used as input data for a single prediction of the model.
  • the M first feature maps corresponding to each prediction layer contain a total of D sets of input data for congestion prediction, and each set of input data corresponds to a specific area on the first feature map, and also corresponds to a specific area on the prediction layer. area, D is a positive integer.
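•   A sliding-window split like the one above (windows of the target size E x F, adjacent windows overlapping by G pixels) might be sketched as follows; the helper name and the border-padding rule are assumptions:

```python
def sliding_windows(feature_map, win_h, win_w, overlap):
    """Split one first feature map into fourth feature maps of size
    win_h x win_w (the target size E x F); adjacent windows overlap by
    `overlap` (G) pixels, and the bottom/right border is always covered.
    Each window keeps its (top, left) position so its prediction can later
    be placed back into the full map."""
    h, w = len(feature_map), len(feature_map[0])
    step_h, step_w = win_h - overlap, win_w - overlap
    tops = list(range(0, h - win_h + 1, step_h))
    lefts = list(range(0, w - win_w + 1, step_w))
    if tops[-1] != h - win_h:       # pad the scan so the border is included
        tops.append(h - win_h)
    if lefts[-1] != w - win_w:
        lefts.append(w - win_w)
    return [((t, l), [row[l:l + win_w] for row in feature_map[t:t + win_h]])
            for t in tops for l in lefts]

# A 4x4 map with 2x2 windows and overlap G=0 yields D=4 windows.
windows = sliding_windows([[j for j in range(4)] for _ in range(4)], 2, 2, 0)
print(len(windows))  # 4
```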
•   using the aforementioned congestion prediction model to process the M first feature maps corresponding to each prediction layer to obtain the predicted congestion map corresponding to each prediction layer includes: sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models to obtain the predicted congestion map corresponding to each group of input data, yielding D groups of predicted congestion maps in total.
•   each group of predicted congestion maps includes a horizontal predicted congestion map and a vertical predicted congestion map. The horizontal predicted congestion maps in the D groups are stitched together to obtain the horizontal predicted congestion map corresponding to each prediction layer; the vertical predicted congestion maps in the D groups are stitched together to obtain the vertical predicted congestion map corresponding to each prediction layer.
  • each pixel point in the overlapping part corresponds to two pixel values on the two predicted congestion maps
  • the pixel value of each pixel after splicing can be determined by means of weighted average or maximum value of two pixel values corresponding to each pixel.
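•   The splicing step could be sketched as follows (hypothetical names; each patch carries its top-left position, and overlapping pixels are averaged here, though the text also allows taking the maximum):

```python
def stitch_predictions(patches, out_h, out_w):
    """Stitch patch-level predicted congestion maps back into one full map.
    `patches` pairs each patch's (top, left) position with its grid; a pixel
    covered by several patches receives the average of the overlapping values."""
    total = [[0.0] * out_w for _ in range(out_h)]
    count = [[0] * out_w for _ in range(out_h)]
    for (top, left), patch in patches:
        for i, row in enumerate(patch):
            for j, v in enumerate(row):
                total[top + i][left + j] += v
                count[top + i][left + j] += 1
    return [[total[i][j] / count[i][j] for j in range(out_w)]
            for i in range(out_h)]

# Two 2x2 patches overlapping in the middle column of a 2x3 map.
print(stitch_predictions([((0, 0), [[1, 1], [1, 1]]),
                          ((0, 1), [[3, 3], [3, 3]])], 2, 3))
# [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
```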
•   sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: sequentially inputting the D groups of input data into one trained congestion prediction model to obtain the predicted congestion map corresponding to each group of input data; or inputting each of the D groups of input data into one of multiple congestion prediction models for parallel prediction to obtain the predicted congestion map corresponding to each group of input data; wherein the model structure and parameters of each of the multiple congestion prediction models are the same as those of the trained congestion prediction model.
  • the parallel prediction method can greatly save the time of congestion prediction and improve efficiency.
•   inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: when the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer, inputting the D groups of input data corresponding to the macro-unit layer into one or more first congestion prediction models to obtain the predicted congestion map corresponding to the macro-unit layer; similarly, the D groups of input data corresponding to the non-macro-unit layer can be input into one or more second congestion prediction models to obtain the predicted congestion map corresponding to the non-macro-unit layer.
  • the above method further includes: aggregating the predicted congestion graphs corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion graph corresponding to the semiconductor chip to be predicted.
  • the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer in the semiconductor chip to be predicted are aggregated to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
•   the predicted congestion map corresponding to each of the prediction layers includes a vertical predicted congestion map and a horizontal predicted congestion map; aggregating the predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted includes: using a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; using the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and using a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted. Alternatively, the directional aggregation operator is used first to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer, after which the reference predicted congestion maps are aggregated with the hierarchical aggregation operator.
  • the above-mentioned hierarchical aggregation operator may be an operation such as taking an average value or taking a maximum value.
  • any two corresponding pixel points on the predicted congestion map subjected to hierarchical aggregation are subjected to operations such as averaging or maximum value to obtain the pixel value of the corresponding pixel point on the aggregated predicted congestion map.
  • the above-mentioned directional aggregation operator may also be an operation such as taking an average value or a maximum value, which is not limited in this application.
  • the specific operation process of the directional aggregation operator please refer to the operation process corresponding to the hierarchical aggregation operator, which will not be repeated here.
  • the semiconductor chip to be predicted contains at least two prediction layers; the directional aggregation operator may be used to aggregate the horizontal predicted congestion map and the vertical predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map for each prediction layer, and the hierarchical aggregation operator may then be used to aggregate the reference predicted congestion maps of the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
  • the manner in which the hierarchical aggregation operator and the directional aggregation operator are used to aggregate the predicted congestion maps corresponding to the prediction layers into the predicted congestion map corresponding to the semiconductor chip to be predicted is not limited in this application.
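As a concrete illustration of the two aggregation orders described above, the sketch below uses NumPy mean as the hierarchical operator and element-wise maximum as the directional operator; the function names, operator choices, and toy maps are assumptions for illustration, since the application deliberately leaves the operators open:

```python
import numpy as np

def aggregate_layers(maps, op=np.mean):
    # Hierarchical aggregation: combine per-prediction-layer maps
    # pixel-wise (here: mean), yielding one reference map.
    return op(np.stack(maps), axis=0)

def aggregate_directions(h_map, v_map, op=np.maximum):
    # Directional aggregation: combine the horizontal and vertical
    # maps pixel-wise (here: maximum) into a single congestion map.
    return op(h_map, v_map)

# Two prediction layers, each with horizontal/vertical predicted maps.
h_maps = [np.array([[0.2, 0.8]]), np.array([[0.4, 0.6]])]
v_maps = [np.array([[0.1, 0.9]]), np.array([[0.5, 0.3]])]

# Order 1: aggregate over layers first, then over directions.
ref_h = aggregate_layers(h_maps)
ref_v = aggregate_layers(v_maps)
chip_map_1 = aggregate_directions(ref_h, ref_v)

# Order 2: aggregate over directions first, then over layers.
per_layer = [aggregate_directions(h, v) for h, v in zip(h_maps, v_maps)]
chip_map_2 = aggregate_layers(per_layer)
```

Note that the two orders are not generally interchangeable when the operators differ (mean then max versus max then mean), which is consistent with the application leaving the choice of order open.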
  • the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
  • the training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers in each training semiconductor chip according to their manufacturing process or functional module distribution.
  • the M first training feature maps corresponding to each of the training prediction layers are obtained from the M second feature maps corresponding to each metal layer in each of the training prediction layers; the first training feature map used to describe any chip feature among the M first training feature maps is obtained based on the second feature maps describing that chip feature in the respective metal layers.
  • the real congestion map corresponding to each of the training prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the training prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
  • the plurality of training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • the M chip features include one or more of pin density, network connection density, module mask, or amount of routing resources.
  • FIG. 9 is a schematic flowchart of a congestion prediction provided by an embodiment of the present application.
  • the congestion prediction process of a semiconductor chip is specifically as follows: the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer.
  • the M first feature maps corresponding to the macro-unit layer and the non-macro-unit layer are respectively determined according to the methods in the foregoing embodiments.
  • the first congestion prediction model is used to process the M first feature maps corresponding to the macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the macro-unit layer; the second congestion prediction model is used to process the M first feature maps corresponding to the non-macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the non-macro-unit layer.
  • the horizontal and vertical predicted congestion maps of the two prediction layers are then aggregated to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
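The prediction flow of FIG. 9 above can be sketched as follows. The model objects and all function names are placeholders, since the application does not fix a model architecture; simple callables stand in for the two trained congestion prediction models, and max/mean stand in for the unspecified directional and hierarchical aggregation operators:

```python
import numpy as np

def predict_chip_congestion(macro_feats, std_feats, model_macro, model_std):
    """Two-model flow of FIG. 9: the macro-unit layer and the
    non-macro-unit layer are predicted separately, then aggregated."""
    # Each model maps M feature maps -> (horizontal, vertical) congestion maps.
    h_macro, v_macro = model_macro(macro_feats)
    h_std, v_std = model_std(std_feats)
    # Aggregate over directions per layer (max), then over layers (mean).
    macro_map = np.maximum(h_macro, v_macro)
    std_map = np.maximum(h_std, v_std)
    return (macro_map + std_map) / 2.0

# Stand-in "models": average the feature maps and split into two directions.
def dummy_model(feats):
    m = np.mean(np.stack(feats), axis=0)
    return m, m * 0.5

feats = [np.ones((4, 4)), np.zeros((4, 4))]   # M = 2 toy feature maps
chip_map = predict_chip_congestion(feats, feats, dummy_model, dummy_model)
```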
  • FIG. 10 is a schematic structural diagram of a model training device 1000 provided by an embodiment of the present application.
  • the device 1000 may include a layering unit 1010, a determination unit 1020, and a training unit 1030, each of which is described in detail as follows.
  • a layering unit 1010, configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips, and K is a positive integer; a determination unit 1020, configured to determine M first feature maps corresponding to each of the prediction layers, where the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer;
  • a training unit 1030 configured to add M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and use the data set to train a congestion prediction model.
  • the above-mentioned training unit is further configured to: obtain a real congestion map corresponding to each of the prediction layers based on the K semiconductor chips after global routing; and add the real congestion maps corresponding to the prediction layers to the data set.
  • the above layering unit is specifically configured to: divide the plurality of metal layers into at least two prediction layers according to the manufacturing process or functional module distribution of the metal layers in each of the semiconductor chips.
  • the above determination unit is specifically configured to: obtain M second feature maps corresponding to each metal layer in each of the prediction layers, where the M second feature maps are respectively used to describe the M chip features of each metal layer; and generate the M first feature maps corresponding to each of the prediction layers based on the M second feature maps of each metal layer, where the first feature map used to describe any chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature in the respective metal layers.
  • the real congestion map corresponding to each of the prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each prediction layer.
  • the above training unit is specifically configured to: use the data set to iteratively train the congestion prediction model, where each iteration of training includes: using the congestion prediction model to process the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
  • the prediction layers contained in each of the above semiconductor chips are respectively a macro-unit layer and a non-macro-unit layer; in terms of using the data set to train the congestion prediction model, the above training unit is specifically configured to: use the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps to train a first congestion prediction model; and use the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps to train a second congestion prediction model.
  • the above M chip features include one or more of pin density, network connection density, module mask or amount of routing resources.
  • for details of each unit, reference may also be made to the corresponding descriptions of the method embodiments shown in FIG. 5 and FIG. 7 .
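One iteration of the training procedure described above (predict from the M first feature maps of a prediction layer, then update the model from the real congestion map) can be sketched as follows. A deliberately simple per-pixel linear model with gradient-descent updates stands in for the unspecified network; the weights, learning rate, and mean-squared-error loss are illustrative assumptions, not the application's method:

```python
import numpy as np

def train_step(weights, feature_maps, real_map, lr=0.1):
    """One iteration: predict a congestion map from the M feature maps,
    then update the model based on the real (ground-truth) congestion map."""
    x = np.stack(feature_maps)                 # (M, H, W)
    pred = np.tensordot(weights, x, axes=1)    # weighted sum of feature maps
    err = pred - real_map
    # Gradient of mean((pred - real)^2) with respect to each weight.
    grad = np.array([2.0 * np.mean(err * xi) for xi in x])
    return weights - lr * grad, float(np.mean(err ** 2))

rng = np.random.default_rng(0)
feats = [rng.random((8, 8)) for _ in range(3)]         # M = 3 feature maps
true_w = np.array([0.5, -0.2, 0.8])
real = np.tensordot(true_w, np.stack(feats), axes=1)   # synthetic ground truth

w = np.zeros(3)
losses = []
for _ in range(500):
    w, loss = train_step(w, feats, real)
    losses.append(loss)
```

On this synthetic data the loss decreases monotonically toward zero and the weights recover the generating coefficients, which is all the sketch is meant to show.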
  • FIG. 11 is a schematic structural diagram of an image processing apparatus 1100 provided in an embodiment of the present application.
  • the apparatus 1100 includes a determining unit 1110 and a processing unit 1120 .
  • a determination unit 1110, configured to determine M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer; a processing unit 1120, configured to use a congestion prediction model to process the M first feature maps corresponding to each of the prediction layers to obtain a predicted congestion map corresponding to each of the prediction layers, where the congestion prediction model is obtained by training on a data set.
  • the data set includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data corresponding to each of the training prediction layers includes M first training feature maps and a real congestion map, where the M first training feature maps are respectively used to describe the M chip features of each of the training prediction layers, and the real congestion map is used to describe the real congestion of each of the training prediction layers.
  • each of the training semiconductor chips includes at least two training prediction layers, and each of the training prediction layers includes at least one metal layer.
  • the above device further includes: an aggregation unit, configured to aggregate the predicted congestion maps corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
  • the predicted congestion map corresponding to each of the prediction layers includes a vertical predicted congestion map and a horizontal predicted congestion map.
  • the above aggregation unit is specifically configured to: use a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to each of the prediction layers to obtain a reference vertical predicted congestion map; use the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to each of the prediction layers to obtain a reference horizontal predicted congestion map; and use a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or, use the directional aggregation operator to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each of the prediction layers to obtain a reference predicted congestion map corresponding to each of the prediction layers, and use the hierarchical aggregation operator to aggregate the reference predicted congestion maps to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
  • the training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers in each training semiconductor chip according to their manufacturing process or functional module distribution.
  • the M first training feature maps corresponding to each of the training prediction layers are obtained from the M second feature maps corresponding to each metal layer in each of the training prediction layers; the first training feature map used to describe any chip feature among the M first training feature maps is obtained based on the second feature maps describing that chip feature in the respective metal layers.
  • the real congestion map corresponding to each of the training prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the training prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
  • the plurality of training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • the M chip features include one or more of pin density, network connection density, module mask, or amount of routing resources.
  • the image processing apparatus 1100 may be used to perform the corresponding steps of the image processing method 700 described in FIG. 7 , which are not repeated here.
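Among the chip features listed above (pin density, network connection density, module mask, amount of routing resources), pin density is the simplest to illustrate. The sketch below rasterizes pin coordinates onto the grid of the rasterized-chip setting; the function name, grid size, and normalization are assumptions for illustration, not the application's defined procedure:

```python
import numpy as np

def pin_density_map(pins, grid=(4, 4), die=(1.0, 1.0)):
    """Rasterize pin coordinates into a per-grid pin-density feature map.
    `pins` is a list of (x, y) positions on the die; the die is divided
    into grid[0] x grid[1] cells."""
    gx, gy = grid
    density = np.zeros((gy, gx))
    for x, y in pins:
        col = min(int(x / die[0] * gx), gx - 1)
        row = min(int(y / die[1] * gy), gy - 1)
        density[row, col] += 1
    return density / max(len(pins), 1)   # normalize to a fraction of all pins

pins = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9), (0.8, 0.85)]
fmap = pin_density_map(pins)
```

The other listed features would be built analogously, with per-grid counts or masks replacing the pin counts.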
  • FIG. 12 is a schematic diagram of a hardware structure of a model training device 1200 provided by an embodiment of the present application.
  • the model training apparatus 1200 shown in FIG. 12 includes a memory 1201 , a processor 1202 , a communication interface 1203 and a bus 1204 .
  • the memory 1201 , the processor 1202 , and the communication interface 1203 are connected to each other through a bus 1204 .
  • the memory 1201 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 1201 may store a program. When the program stored in the memory 1201 is executed by the processor 1202, the processor 1202 and the communication interface 1203 are used to execute each step of the method for training the congestion prediction model in the embodiment of the present application.
  • the processor 1202 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, configured to execute related programs to realize the functions required by the units in the congestion prediction model training device of the embodiment of the present application, or to execute the congestion prediction model training method of the method embodiment of the present application.
  • the processor 1202 may also be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the method for training the congestion prediction model of the present application may be completed by an integrated logic circuit of hardware in the processor 1202 or instructions in the form of software.
  • the above-mentioned processor 1202 may also be a general-purpose processor, a digital signal processor (digital signal processor, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 1201; the processor 1202 reads the information in the memory 1201 and, in combination with its hardware, completes the functions required by the units included in the congestion prediction model training device of the embodiment of the present application, or executes the congestion prediction model training method of the method embodiment of the present application.
  • the communication interface 1203 implements communication between the apparatus 1200 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver. For example, training data can be obtained through the communication interface 1203 .
  • the bus 1204 may include a pathway for transferring information between various components of the device 1200 (eg, memory 1201 , processor 1202 , communication interface 1203 ).
  • FIG. 13 is a schematic diagram of a hardware structure of an image processing apparatus 1300 provided by an embodiment of the present application.
  • the image processing apparatus 1300 may be a computer, a mobile phone, a tablet computer or other possible terminal devices, which is not limited in this application.
  • the image processing apparatus 1300 shown in FIG. 13 (the apparatus 1300 may specifically be a computer device) includes a memory 1301 , a processor 1302 , a communication interface 1303 and a bus 1304 .
  • the memory 1301 , the processor 1302 , and the communication interface 1303 are connected to each other through a bus 1304 .
  • the memory 1301 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 1301 may store programs, and when the programs stored in the memory 1301 are executed by the processor 1302, the processor 1302 and the communication interface 1303 are used to execute various steps of the image processing method of the embodiment of the present application.
  • the processor 1302 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, configured to execute related programs to realize the functions required by the units in the image processing device of the embodiment of the present application, or to execute the image processing method of the method embodiment of the present application.
  • the processor 1302 may also be an integrated circuit chip with signal processing capability. In the implementation process, each step of the image processing method of the present application may be completed by an integrated logic circuit of hardware in the processor 1302 or instructions in the form of software.
  • the above-mentioned processor 1302 may also be a general-purpose processor, a digital signal processor (digital signal processor, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 1301; the processor 1302 reads the information in the memory 1301 and, in combination with its hardware, completes the functions required by the units included in the image processing device of the embodiment of the present application, or executes the image processing method of the method embodiment of the present application.
  • the communication interface 1303 implements communication between the apparatus 1300 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver. For example, training data can be obtained through the communication interface 1303 .
  • the bus 1304 may include pathways for transferring information between various components of the device 1300 (eg, memory 1301 , processor 1302 , communication interface 1303 ).
  • although the device 1200 and the device 1300 shown in FIG. 12 and FIG. 13 only show a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the device 1200 and the device 1300 also include other devices necessary for proper operation. Meanwhile, according to specific needs, those skilled in the art should understand that the device 1200 and the device 1300 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art should understand that the device 1200 and the device 1300 may include only the devices necessary to realize the embodiment of the present application, and not all the devices shown in FIG. 12 or FIG. 13 .
  • the above apparatus 1200 is equivalent to the training device 120 in FIG. 1
  • the apparatus 1300 is equivalent to the execution device 110 in FIG. 1 .
  • the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the embodiment of the present application also provides a chip system; the chip system includes at least one processor, a memory, and an interface circuit, where the memory, the interface circuit, and the at least one processor are interconnected by wires, and instructions are stored in the memory; when the instructions are executed by the processor, the methods described above in FIG. 5 and/or FIG. 7 are implemented.
  • an embodiment of the present application also provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a network device, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
  • the embodiment of the present application further provides a computer program product; when the computer program product runs on a terminal, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division manners.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), a magnetic disk, an optical disc, or other media that can store program codes.


Abstract

A congestion prediction model training method, and an image processing method and apparatus. The training method comprises: dividing a plurality of metal layers into at least two prediction layers, wherein the plurality of metal layers are metal layers included in each of K semiconductor chips, and K is a positive integer; determining M first feature maps corresponding to each prediction layer, wherein the M first feature maps are respectively used for describing M chip features of each prediction layer, and M is a positive integer; and adding the M first feature maps corresponding to each prediction layer in the K semiconductor chips to a data set, and training a congestion prediction model by using the data set. By means of the method, the time consumption of congestion prediction can be reduced while the congestion prediction accuracy of the semiconductor chips is improved.

Description

Congestion prediction model training method, image processing method and apparatus

Technical Field
The present application relates to the technical field of electronic design automation (Electronic Design Automation, EDA), and in particular to a congestion prediction model training method, an image processing method, and an apparatus.
Background Art
Congestion prediction (Congestion Prediction), as an important link in chip EDA physical design, runs through the entire design flow. Whether a placement solution is congested directly determines indicators such as the chip's delay and thermal power consumption. The goal of congestion prediction is: during global placement (Global Placement, GP), to estimate the routing congestion level of the chip according to the current cell (Cell) placement positions, so as to provide an optimization basis for global placement, enabling the placer (Placer) to scatter the cells in heavily congested areas and reduce the placement congestion of those areas, thereby reducing the overall congestion of the chip. Its essence is to predict, on the rasterized chip, the difference between the routing resource demand (Routing Tracks Demand) of each grid (Grid) and the given routing resource amount (Routing Track Capacity).
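The per-grid definition at the end of the paragraph above (demand minus given capacity on a rasterized chip) can be written directly; the array names and toy values are illustrative:

```python
import numpy as np

def congestion_map(demand, capacity):
    """Per-grid congestion: routing-track demand minus routing-track
    capacity; positive values mark over-congested grids."""
    return demand - capacity

demand = np.array([[3, 7], [10, 2]])      # tracks demanded per grid
capacity = np.array([[5, 5], [5, 5]])     # tracks available per grid
cmap = congestion_map(demand, capacity)
overflow = np.clip(cmap, 0, None)         # keep only over-congested grids
```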
Limited by the huge scale and complex structure of chips, existing congestion prediction methods have the following limitations: low congestion prediction accuracy, long congestion computation time, and difficulty in balancing accuracy against prediction time.
Summary of the Invention
The embodiment of the present application discloses a congestion prediction model training method, an image processing method, and an apparatus; the congestion prediction method can reduce the time consumption of congestion prediction while improving the accuracy of semiconductor chip congestion prediction.
In a first aspect, an embodiment of the present application discloses a congestion prediction model training method. The method includes: dividing a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; determining M first feature maps corresponding to each of the prediction layers, where the M first feature maps are respectively used to describe M chip features of each of the prediction layers and M is a positive integer; and adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and using the data set to train a congestion prediction model.
It should be understood that the embodiment of the present application may divide the metal layers in each semiconductor chip into at least two prediction layers in different manners, so that part or all of the feature data of the different metal layers within the same prediction layer exhibit strong correlation and consistency; each prediction layer contains at least one metal layer.
It can be seen that, in the embodiment of the present application, by grouping the metal layers in a semiconductor chip, each semiconductor chip is divided into at least two prediction layers. Since the feature data (that is, the feature maps) corresponding to the metal layers contained in each prediction layer exhibit strong correlation and consistency, on the one hand this avoids the mutual interference, present in the prior art where no layering is performed, of training on widely differing feature data from different metal layers; on the other hand, when the feature data corresponding to different prediction layers are used separately to train the congestion prediction model, because the feature data in different prediction layers exhibit different trends, the resulting model can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model has refined identification and prediction capabilities. In summary, the training method in the embodiment of the present application can effectively improve the prediction accuracy and generalization ability of the model.
In a feasible implementation manner, the above method further includes: performing global routing on the K semiconductor chips, and obtaining a real congestion map corresponding to each prediction layer according to the K semiconductor chips after global routing; and adding the real congestion map corresponding to each prediction layer of the K semiconductor chips to the data set.
It can be seen that, in the embodiments of the present application, the real congestion map of each prediction layer is computed from the K semiconductor chips after global routing, so that the real congestion map can subsequently be compared with the predicted congestion map of each prediction layer to adjust the parameters of the congestion prediction model and thereby obtain an optimal congestion prediction model.
In a feasible implementation manner, dividing the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules on them.
It can be seen that, in the embodiments of the present application, the plurality of metal layers in each semiconductor chip can be divided into at least two prediction layers according to the manufacturing process of each metal layer and/or the distribution of functional modules on each metal layer. For example, the metal layers in each semiconductor chip may be divided into two prediction layers according to whether macro cells are present: one prediction layer contains both macro cells and standard cells, while the other contains only standard cells. Alternatively, the plurality of metal layers may be divided into at least two prediction layers based on differences in the amount of routing resources of each metal layer, such that the metal layers within each prediction layer have comparable amounts of routing resources. Dividing the metal layers into different prediction layers in this way makes the feature data of the metal layers within the same prediction layer highly consistent, while the feature data corresponding to different prediction layers differ considerably. Therefore, after the congestion prediction model is trained with the feature data corresponding to each prediction layer, the obtained model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model has refined recognition and prediction capabilities.
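The two division criteria above can be sketched as follows. This is a hypothetical illustration only: the layer attributes (`has_macro`, `routing_capacity`) and the single-cut grouping heuristic are assumptions for the example, not a layout fixed by this application.

```python
# Hypothetical sketch: the layer attributes and grouping heuristic are
# illustrative assumptions; the application does not fix a data layout.
def split_into_prediction_layers(metal_layers, by="macro"):
    """Group metal layers into prediction layers.

    metal_layers: list of dicts with keys 'name', 'has_macro' (bool),
                  'routing_capacity' (routing tracks per grid).
    by: 'macro'    -> two groups: layers with macro cells vs. without;
        'capacity' -> group layers whose routing resources are comparable.
    """
    if by == "macro":
        macro = [m for m in metal_layers if m["has_macro"]]
        non_macro = [m for m in metal_layers if not m["has_macro"]]
        return [g for g in (macro, non_macro) if g]
    # 'capacity': sort by routing capacity and split at the largest gap
    # between consecutive layers, so each prediction layer contains
    # metal layers with comparable amounts of routing resources.
    layers = sorted(metal_layers, key=lambda m: m["routing_capacity"])
    gaps = [layers[i + 1]["routing_capacity"] - layers[i]["routing_capacity"]
            for i in range(len(layers) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [layers[:cut], layers[cut:]]
```

Either criterion yields at least two prediction layers whose member metal layers carry similar feature data, which is the property the training method relies on.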
In a feasible implementation manner, determining the M first feature maps corresponding to each prediction layer includes: obtaining M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps are respectively used to describe the M chip features of that metal layer; and generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each prediction layer, where the first feature map describing any given chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature in each metal layer.
It should be understood that obtaining the first feature map describing a given chip feature from the second feature maps describing that chip feature in each metal layer may include: taking a weighted average, the maximum value, or the minimum value of the corresponding pixels of the second feature maps that describe the same chip feature across the metal layers, so as to obtain the pixel values of the corresponding pixels of the first feature map describing that chip feature.
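The pixel-wise reduction just described can be written compactly with numpy. This is a minimal sketch under the stated options (mean, weighted average, maximum, minimum); the function name and array layout are assumptions for illustration.

```python
import numpy as np

# Minimal sketch of the pixel-wise aggregation described above; the
# reduction modes and the weighting scheme are illustrative assumptions.
def aggregate_feature_maps(second_maps, mode="mean", weights=None):
    """Combine per-metal-layer second feature maps into one first
    feature map for the prediction layer.

    second_maps: array of shape (num_metal_layers, H, W), one rasterized
                 map per metal layer for the same chip feature.
    mode: 'mean' (optionally weighted via `weights`), 'max', or 'min'.
    """
    maps = np.asarray(second_maps, dtype=float)
    if mode == "mean":
        # np.average reduces over the metal-layer axis; with `weights`
        # it computes the weighted average per pixel.
        return np.average(maps, axis=0, weights=weights)
    if mode == "max":
        return maps.max(axis=0)
    if mode == "min":
        return maps.min(axis=0)
    raise ValueError(f"unknown mode: {mode}")
```

The same helper also covers the reduction used later for the per-layer real congestion maps, since that step applies the same class of pixel-wise operations.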
It can be seen that, in the embodiments of the present application, each metal layer corresponds to M second feature maps, which respectively describe the M chip features of that metal layer. With the above layering scheme, within each prediction layer, the second feature maps that the metal layers contribute for the same chip feature have strong correlation and consistency, so the first feature map describing that chip feature, obtained by taking the mean, maximum, or minimum as described above, can accurately characterize that chip feature across the metal layers; that is, the first feature map describing a given chip feature has good correlation and consistency with the second feature map describing that chip feature in each metal layer. This avoids the situation in which the second feature maps describing the same chip feature differ greatly across the metal layers of one prediction layer, so that the resulting first feature map would deviate substantially from the second feature map of each individual metal layer. In summary, for any given chip feature, the above layering scheme and the above way of determining the first feature map yield a first feature map that accurately reflects that chip feature on every metal layer of each prediction layer. Therefore, after the congestion prediction model is trained with the first feature maps corresponding to each prediction layer, the obtained model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model trained by the embodiments of the present application has refined recognition and prediction capabilities.
In a feasible implementation manner, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in that prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in that prediction layer.
It should be understood that obtaining the first horizontal real congestion map of a prediction layer from the second horizontal real congestion maps of its metal layers may include: taking the mean, a weighted average, or the maximum value of the second horizontal real congestion maps corresponding to the metal layers to obtain the first horizontal real congestion map of that prediction layer. In addition, those skilled in the art may obtain the first horizontal real congestion map from the second horizontal real congestion maps in other ways, which is not limited in the present application. Likewise, the first vertical real congestion map of a prediction layer is obtained in the same way as the first horizontal real congestion map, and details are not repeated here.
It can be seen that, in the embodiments of the present application, with the above layering scheme, the real congestion maps of the metal layers within each prediction layer have strong correlation and consistency. Therefore, the real congestion map of each prediction layer determined in the above way is well consistent with the real congestion maps of the metal layers in that prediction layer; that is, a real congestion map that accurately reflects the congestion degree of each prediction layer can be obtained, which in turn ensures the prediction accuracy of the congestion prediction model subsequently trained with the real congestion map of each prediction layer.
In a feasible implementation manner, adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to a data set and training the congestion prediction model with the data set includes: performing iterative training of the congestion prediction model with the data set, where each training iteration includes: inputting the M first feature maps corresponding to any prediction layer in the data set into the congestion prediction model to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
It can be seen that, in the embodiments of the present application, the congestion prediction model may be obtained through multiple rounds of training. In each training iteration, the model input is the feature data of one prediction layer (its M first feature maps). Since the feature data of each prediction layer accurately reflect the corresponding features of the metal layers in that prediction layer, and the feature data corresponding to different prediction layers differ considerably, the prediction model can perform congestion prediction in a refined manner based on the feature data of different prediction layers, obtaining a predicted congestion map that accurately reflects the congestion degree of each prediction layer. The model parameters are then updated based on the predicted congestion map and the real congestion map of each prediction layer, so that the trained model has high prediction accuracy and strong generalization ability.
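One training iteration as described above can be sketched as follows. Note the assumptions: a toy linear per-pixel model with mean-squared-error loss stands in for the congestion prediction model, which in practice would be a neural network; only the predict-then-update structure of the iteration is taken from the text.

```python
import numpy as np

# Toy sketch of one training iteration. A linear per-pixel model stands
# in for the congestion prediction model; the loss and learning rate
# are illustrative assumptions.
def train_step(weights, first_feature_maps, real_congestion_map, lr=0.01):
    """One iteration: predict a congestion map from the M first feature
    maps of one prediction layer, then update the model from the error
    against that layer's real congestion map.

    weights: shape (M,), one coefficient per chip feature.
    first_feature_maps: shape (M, H, W).
    real_congestion_map: shape (H, W).
    Returns (updated_weights, predicted_congestion_map).
    """
    maps = np.asarray(first_feature_maps, dtype=float)
    pred = np.tensordot(weights, maps, axes=1)   # (H, W) predicted map
    err = pred - real_congestion_map             # per-grid error
    # Gradient of the mean-squared error with respect to each weight.
    grad = np.tensordot(maps, err, axes=([1, 2], [0, 1])) * (2.0 / err.size)
    return weights - lr * grad, pred
```

Iterating this step over the prediction layers of the data set drives the predicted congestion maps toward the real ones, which is the update loop the implementation manner describes.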
In a feasible implementation manner, the prediction layers contained in each semiconductor chip are a macro-cell layer and a non-macro-cell layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and training the congestion prediction model with the data set includes: training the first congestion prediction model with the first feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion maps; and training the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion maps.
It can be seen that, in the embodiments of the present application, when the plurality of metal layers in each semiconductor chip are divided into a macro-cell layer and a non-macro-cell layer (that is, into two prediction layers), the feature data corresponding to the macro-cell layer and the non-macro-cell layer differ considerably. The first congestion prediction model can therefore be trained with the feature data corresponding to the macro-cell layer, and the second congestion prediction model with the feature data corresponding to the non-macro-cell layer, yielding one prediction model that predicts accurately from macro-cell-layer feature data and another that predicts accurately from non-macro-cell feature data; that is, the prediction accuracy of the models is improved. In addition, using the two models to predict different prediction layers of the same semiconductor chip simultaneously can speed up congestion prediction for that chip.
In a feasible implementation manner, the above M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
It should be understood that the M chip features corresponding to each metal layer and each prediction layer may also include chip features other than the above four, which is not limited in the present application.
It can be seen that, in the embodiments of the present application, the above M chip features may include one or more of pin density, net connection density, module mask, or amount of routing resources, which reflect chip functionality and on-chip devices. By obtaining the first feature maps of the chip features corresponding to each prediction layer and training the prediction model with M first feature maps that accurately reflect chip functionality and on-chip device characteristics, the prediction accuracy of the trained prediction model is improved.
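As a concrete illustration of one listed feature, a pin-density map can be obtained by counting pins per grid cell on the rasterized chip. The grid size, the pin coordinate format, and the max-normalization are assumptions for this sketch, not details fixed by the application.

```python
import numpy as np

# Illustrative sketch of the pin-density chip feature: count pins per
# grid cell on the rasterized chip. Grid size, coordinate format, and
# the normalization are assumptions for the example.
def pin_density_map(pins, chip_w, chip_h, grid=(8, 8)):
    """pins: list of (x, y) pin coordinates in chip units.
    Returns a (rows, cols) map of pin counts per grid cell,
    normalized by the maximum so values lie in [0, 1]."""
    rows, cols = grid
    density = np.zeros((rows, cols))
    for x, y in pins:
        r = min(int(y / chip_h * rows), rows - 1)  # clamp edge pins
        c = min(int(x / chip_w * cols), cols - 1)
        density[r, c] += 1
    peak = density.max()
    return density / peak if peak > 0 else density
```

The other listed features (net connection density, module mask, routing resource amount) would be rasterized onto the same grid in an analogous way, so that each prediction layer contributes M aligned feature maps.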
In a second aspect, an embodiment of the present application discloses an image processing method, including: determining M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer; and processing the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain a predicted congestion map corresponding to each prediction layer. The congestion prediction model is obtained by training on a data set; the data set includes training data respectively corresponding to the training prediction layers contained in each of a plurality of training semiconductor chips; the training data corresponding to each training prediction layer includes M first training feature maps and a real congestion map; the M first training feature maps are respectively used to describe the M chip features of that training prediction layer; the real congestion map is used to describe the real congestion degree of that training prediction layer; each training semiconductor chip includes at least two training prediction layers; and each training prediction layer includes at least one metal layer.
It should be understood that the M first feature maps corresponding to each prediction layer are determined in the same way as the M first training feature maps of each training prediction layer, and details are not repeated here.
It can be seen that, in the embodiments of the present application, when congestion prediction is performed with the congestion prediction model trained by the model training method of the first aspect, the trained model has good prediction accuracy and generalization ability, so a more accurate predicted congestion map can be obtained for each prediction layer of the semiconductor chip to be predicted; that is, the accuracy of the predicted congestion map is improved, which in turn facilitates correspondingly optimizing the chip layout during chip production by using the accurate predicted congestion map obtained for each prediction layer.
In a feasible implementation manner, the above method further includes: aggregating the predicted congestion maps corresponding to all the prediction layers in the semiconductor chip to be predicted, to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
It can be seen that, in the embodiments of the present application, since the predicted congestion map corresponding to each prediction layer has high accuracy, the predicted congestion map describing the congestion degree of the chip to be predicted, obtained by aggregating the predicted congestion maps of the prediction layers, also has high accuracy, which in turn facilitates correspondingly optimizing the chip layout during chip production by using the predicted congestion map of the chip to be predicted.
In a feasible implementation manner, the predicted congestion map corresponding to each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map; aggregating the predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted includes: aggregating the vertical predicted congestion maps corresponding to the prediction layers with a hierarchical aggregation operator to obtain a reference vertical predicted congestion map, aggregating the horizontal predicted congestion maps corresponding to the prediction layers with the hierarchical aggregation operator to obtain a reference horizontal predicted congestion map, and aggregating the reference vertical predicted congestion map and the reference horizontal predicted congestion map with a directional aggregation operator to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or aggregating the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer with the directional aggregation operator to obtain a reference predicted congestion map corresponding to each prediction layer, and aggregating the reference predicted congestion maps corresponding to the prediction layers with the hierarchical aggregation operator to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
It can be seen that, in the embodiments of the present application, for the chip to be predicted, the hierarchical aggregation operator and the directional aggregation operator can be used to aggregate the predicted congestion maps corresponding to the prediction layers, obtaining the predicted congestion map of the chip to be predicted. The operations of the hierarchical aggregation operator and the directional aggregation operator include, but are not limited to, taking a weighted average, the maximum value, or the minimum value of the predicted congestion maps. Since the predicted congestion map corresponding to each prediction layer obtained by the prediction model has high accuracy, once the specific operations of the hierarchical aggregation operator and the directional aggregation operator are determined according to the specific scenario, the predicted congestion map of the chip to be predicted obtained via the aggregation operators also has high accuracy.
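The two aggregation orders can be sketched as follows. As an assumption for illustration, both operators are shown as element-wise maxima; per the text they could equally be weighted averages or minima, and the function names are invented for this sketch.

```python
import numpy as np

# Sketch of the two aggregation orders. Both operators are shown as
# element-wise maxima purely for illustration; weighted averages or
# minima are equally valid choices per the description.
def hierarchical_agg(per_layer_maps):
    """Aggregate one directional congestion map across prediction layers."""
    return np.max(np.asarray(per_layer_maps), axis=0)

def directional_agg(h_map, v_map):
    """Combine horizontal and vertical congestion into a single map."""
    return np.maximum(h_map, v_map)

def chip_congestion(h_maps, v_maps, order="layers_first"):
    """h_maps / v_maps: (num_prediction_layers, H, W) predicted maps."""
    if order == "layers_first":
        # First across prediction layers, then across directions.
        return directional_agg(hierarchical_agg(h_maps),
                               hierarchical_agg(v_maps))
    # First across directions within each layer, then across layers.
    refs = [directional_agg(h, v) for h, v in zip(h_maps, v_maps)]
    return hierarchical_agg(refs)
```

With the maximum used for both operators the two orders give identical results; with mixed operators (for example an average across layers and a maximum across directions) the two orders generally differ, which is why the implementation manner lists them as alternatives.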
In a feasible implementation manner, the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
In a feasible implementation manner, the training prediction layers contained in each training semiconductor chip are obtained by division based on the manufacturing process of the metal layers in that training semiconductor chip or the distribution of functional modules on them.
In a feasible implementation manner, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer, where the first training feature map describing any given chip feature among the M first training feature maps is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation manner, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in that training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in that training prediction layer.
In a feasible implementation manner, during each training iteration of the congestion prediction model, the congestion prediction model is updated by means of the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer; the predicted congestion map corresponding to that training prediction layer is obtained by inputting the M first training feature maps corresponding to it into the congestion prediction model.
In a feasible implementation manner, the plurality of training prediction layers are divided into a macro-cell layer and a non-macro-cell layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion maps.
In a feasible implementation manner, the M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
It should be understood that, for the beneficial effects of the above embodiments describing the training process of the congestion prediction model, reference may be made to the beneficial effects of the corresponding training method in the first aspect, and details are not repeated here.
In a third aspect, the present application discloses a training apparatus for a congestion prediction model, including: a layering unit, configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; a determining unit, configured to determine M first feature maps corresponding to each prediction layer, where the M first feature maps are respectively used to describe M chip features of each prediction layer and M is a positive integer; and a training unit, configured to use, as a data set, the M first feature maps and the real congestion map corresponding to each prediction layer contained in each of the K semiconductor chips, and to train a congestion prediction model with the data set, where the real congestion map corresponding to each prediction layer is used to describe the real congestion degree of that prediction layer.
In a feasible implementation manner, the training unit is further configured to: obtain a real congestion map corresponding to each prediction layer based on the K semiconductor chips after global routing; and add the real congestion map corresponding to each prediction layer of the K semiconductor chips to the data set.
In a feasible implementation manner, the layering unit is specifically configured to divide the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules on them.
In a feasible implementation manner, the determining unit is specifically configured to: obtain M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps are respectively used to describe the M chip features of that metal layer; and generate, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each prediction layer, where the first feature map describing any given chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation manner, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in that prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in that prediction layer.
In a feasible implementation manner, in terms of training the congestion prediction model with the data set, the training unit is specifically configured to perform iterative training of the congestion prediction model with the data set, where each training iteration includes: processing the M first feature maps corresponding to any prediction layer in the data set with the congestion prediction model to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
In a feasible implementation manner, the prediction layers contained in each semiconductor chip are a macro-cell layer and a non-macro-cell layer; in terms of training the congestion prediction model with the data set, the training unit is specifically configured to: train the first congestion prediction model with the first feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion maps; and train the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion maps.
在一种可行的实施方式中,上述M个芯片特征包括引脚密度、网络连接密度、模块掩 码或绕线资源量中的一个或多个。In a feasible implementation manner, the above M chip features include one or more of pin density, network connection density, module mask or amount of routing resources.
第四方面,本申请公开了一种图像处理装置,该装置包括:确定单元,用于确定待预测半导体芯片中每个预测层对应的M个第一特征图;其中,所述待预测半导体芯片包括至少两个所述预测层,所述M为正整数;处理单元,用于利用拥塞预测模型对每个所述预测层对应的M个第一特征图进行处理,得到每个所述预测层对应的预测拥塞图;其中,所述拥塞预测模型是通过数据集进行训练后得到的,所述数据集包括K个训练半导体芯片中每个训练半导体芯片包含的训练预测层分别对应的训练数据,每个所述训练预测层对应的训练数据包括M个第一训练特征图和真实拥塞图,所述M个第一训练特征图分别用于描述每个所述训练预测层的M个芯片特征,所述真实拥塞图用于描述每个所述训练预测层的真实拥塞程度,每个所述训练半导体芯片包括至少两个所述训练预测层,每个所述训练预测层包括至少一个金属层。In a fourth aspect, the present application discloses an image processing device, which includes: a determination unit, configured to determine M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted; wherein, the semiconductor chip to be predicted Including at least two prediction layers, where M is a positive integer; a processing unit configured to use a congestion prediction model to process the M first feature maps corresponding to each prediction layer to obtain each prediction layer Corresponding predicted congestion map; wherein, the congestion prediction model is obtained after training through a data set, and the data set includes training data respectively corresponding to the training prediction layer contained in each of the K training semiconductor chips, The training data corresponding to each of the training prediction layers includes M first training feature maps and real congestion maps, and the M first training feature maps are respectively used to describe M chip features of each of the training prediction layers, The real congestion map is used to describe the real congestion level of each training prediction layer, each of the training semiconductor chips includes at least two training prediction layers, and each of the training prediction layers includes at least one metal layer.
在一种可行的实施方式中,上述装置还包括:聚合单元,用于对所述待预测半导体芯片中所有预测层对应的预测拥塞图进行聚合,得到所述待预测半导体芯片对应的预测拥塞图。In a feasible implementation manner, the above device further includes: an aggregation unit, configured to aggregate the predicted congestion graphs corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion graph corresponding to the semiconductor chip to be predicted .
在一种可行的实施方式中,每个所述预测层对应的预测拥塞图包含垂直预测拥塞图和水平预测拥塞图;上述聚合单元具体用于:利用层级聚合算子对每个所述预测层对应的垂直预测拥塞图进行聚合,得到参考垂直预测拥塞图;利用所述层级聚合算子对每个所述预测层对应的水平预测拥塞图进行聚合,得到参考水平预测拥塞图;利用方向性聚合算子对所述参考垂直预测拥塞图和所述参考水平预测拥塞图进行聚合,得到所述待预测半导体芯片对应的预测拥塞图;或,利用所述方向性聚合算子对每个所述预测层对应的垂直预测拥塞图和水平预测拥塞图进行聚合,得到每个所述预测层对应的参考预测拥塞图;利用所述层级聚合算子对每个所述预测层对应的参考预测拥塞图进行聚合,得到所述待预测半导体芯片对应的预测拥塞图。In a feasible implementation manner, the predicted congestion map corresponding to each of the prediction layers includes a vertical predicted congestion map and a horizontal predicted congestion map; the above aggregation unit is specifically configured to: use a hierarchical aggregation operator to Aggregating the corresponding vertical predicted congestion graphs to obtain a reference vertical predicted congestion graph; using the hierarchical aggregation operator to aggregate the horizontal predicted congestion graphs corresponding to each of the prediction layers to obtain a reference horizontal predicted congestion graph; using directional aggregation The operator aggregates the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or, using the directional aggregation operator to Aggregating the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each layer to obtain a reference predicted congestion map corresponding to each of the predicted layers; using the hierarchical aggregation operator to perform aggregated to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
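The two aggregation orders just described can be sketched as follows. In this illustrative numpy sketch both the hierarchical aggregation operator and the directional aggregation operator are assumed, for simplicity, to be an element-wise maximum; the actual operators of this application may differ.

```python
import numpy as np

def aggregate_chip(v_maps, h_maps):
    """Aggregate per-prediction-layer vertical/horizontal predicted
    congestion maps into one chip-level map, in both orders described
    above. v_maps, h_maps: arrays of shape (num_layers, H, W)."""
    # Order 1: hierarchical aggregation per direction, then directional aggregation
    ref_v = v_maps.max(axis=0)               # reference vertical predicted congestion map
    ref_h = h_maps.max(axis=0)               # reference horizontal predicted congestion map
    chip_a = np.maximum(ref_v, ref_h)

    # Order 2: directional aggregation per layer, then hierarchical aggregation
    per_layer = np.maximum(v_maps, h_maps)   # reference predicted map of each layer
    chip_b = per_layer.max(axis=0)
    return chip_a, chip_b

rng = np.random.default_rng(1)
v = rng.random((3, 4, 4))
h = rng.random((3, 4, 4))
a, b = aggregate_chip(v, h)
assert np.allclose(a, b)   # with element-wise max the two orders coincide
```

With an element-wise maximum the two orders give the same result; with other conceivable operators (for example a weighted sum over levels) they generally differ, which is presumably why the two variants are listed separately.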
In a feasible implementation, the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
In a feasible implementation, the training prediction layers contained in each training semiconductor chip are obtained by division based on the manufacturing process or the functional module distribution of the metal layers in each training semiconductor chip.
In a feasible implementation, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer; among the M first training feature maps, the first training feature map describing any chip feature is obtained based on the second feature maps describing that chip feature in the metal layers.
In a feasible implementation, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
In a feasible implementation, in each training iteration of the congestion prediction model, the congestion prediction model is updated with the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer; the predicted congestion map corresponding to that training prediction layer is obtained by inputting the M first training feature maps corresponding to that training prediction layer into the congestion prediction model.
In a feasible implementation, the multiple training prediction layers are divided into macro-cell layers and non-macro-cell layers, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. The first congestion prediction model is obtained through training with the first training feature maps corresponding to the macro-cell layers in the data set and the corresponding real congestion maps; the second congestion prediction model is obtained through training with the first training feature maps corresponding to the non-macro-cell layers in the data set and the corresponding real congestion maps.
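The split into two models implies a simple dispatch at inference time: feature maps of macro-cell layers go to the first model and those of non-macro-cell layers to the second. A minimal sketch follows; both model functions here are hypothetical stand-ins, not the trained networks of this application.

```python
import numpy as np

def predict(layers):
    """Dispatch each prediction layer to the congestion prediction model
    trained for its type. 'layers' is a list of (is_macro, feature_maps)
    pairs; the two models below are illustrative stand-ins."""
    model_macro = lambda f: f.mean(axis=0)       # stand-in for the first model
    model_non_macro = lambda f: f.max(axis=0)    # stand-in for the second model
    out = []
    for is_macro, feats in layers:
        model = model_macro if is_macro else model_non_macro
        out.append(model(feats))                 # one predicted congestion map per layer
    return out

feats = np.ones((2, 4, 4))
maps = predict([(True, feats), (False, 2 * feats)])
assert maps[0].shape == (4, 4) and maps[1].shape == (4, 4)
assert float(maps[1].max()) == 2.0
```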
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or routing resource amount.
In a fifth aspect, this application discloses a chip system, where the chip system includes at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected by lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the method according to any one of the first aspect and/or the second aspect is implemented.
In a sixth aspect, this application discloses a terminal device, where the terminal device includes the chip system according to the fifth aspect, and a discrete device coupled to the chip system.
In a seventh aspect, this application discloses a computer-readable storage medium, where the computer-readable storage medium stores program instructions, and when the program instructions are run on a processor, the method according to any one of the first aspect and/or the second aspect is implemented.
In an eighth aspect, this application discloses a computer program product, where when the computer program product is run on a terminal, the method according to any one of the first aspect and/or the second aspect is implemented.
Description of drawings
The accompanying drawings used in the embodiments of this application are introduced below.
FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of this application;
FIG. 2 is a schematic structural diagram of a network model provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a chip hardware structure provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of another system architecture provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of a congestion prediction model training method provided by an embodiment of this application;
FIG. 6 is a schematic diagram of the hierarchical division of a semiconductor chip provided by an embodiment of this application;
FIG. 7 is a schematic flowchart of an image processing method provided by an embodiment of this application;
FIG. 8 is a schematic diagram of the spatial relationship between a first feature map and a fourth feature map provided by an embodiment of this application;
FIG. 9 is a schematic flowchart of congestion prediction provided by an embodiment of this application;
FIG. 10 is a schematic structural diagram of a model training apparatus provided by an embodiment of this application;
FIG. 11 is a schematic structural diagram of an image processing apparatus provided by an embodiment of this application;
FIG. 12 is a schematic diagram of the hardware structure of a model training apparatus in an embodiment of this application;
FIG. 13 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of this application.
Detailed description
The embodiments of this application are described below with reference to the accompanying drawings in the embodiments of this application.
The embodiments of this application can be applied to image processing tasks, for example, to congestion prediction in the physical design stage of chip electronic design automation (Electronic Design Automation, EDA), that is, predicting the congestion degree of a chip based on the feature maps (feature data) of a semiconductor chip, thereby providing an optimization basis for global placement, so that the placer can scatter the cells in heavily congested regions, reduce the placement congestion of those regions, and thereby reduce the overall congestion of the chip.
It should be understood that the images in the embodiments of this application may be static images (or referred to as static pictures) or dynamic images (or referred to as dynamic pictures); for example, the images in this application may be videos or dynamic pictures, or the images in this application may also be static pictures or photos. For ease of description, static images and dynamic images are collectively referred to as images in the following embodiments of this application.
In addition, the congestion prediction introduced above is merely one specific scenario to which the method of the embodiments of this application is applied; the method of the embodiments of this application is not limited to the foregoing scenario, and can be applied to any scenario that requires image processing. Alternatively, the method in the embodiments of this application can also be similarly applied to other fields, such as speech recognition and natural language processing, which is not limited in the embodiments of this application.
The method provided by this application is described below from the model training side and the model application side.
The congestion prediction model training method provided by the embodiments of this application involves computer vision processing, and can be specifically applied to data processing methods such as data training, machine learning, and deep learning, to perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, and the like on training data (such as the first feature maps in this application), finally obtaining a trained congestion prediction model. Moreover, the image processing method provided by the embodiments of this application can use the above trained congestion prediction model: input data (such as the feature maps in this application) is input into the trained congestion prediction model to obtain output data (such as the predicted congestion map of the chip to be predicted in this application). It should be noted that the congestion prediction model training method and the image processing method provided by the embodiments of this application are inventions arising from the same conception, and can also be understood as two parts of one system, or two stages of one overall process: a model training stage and a model application stage.
The embodiments of this application involve many applications related to neural networks. To better understand the solutions of the embodiments of this application, related terms and concepts in the fields of neural networks and computer vision that may be involved in the embodiments of this application are first introduced below.
(1) Neural network
A neural network can be composed of neural units. A neural unit can be an operation unit that takes x_s and an intercept of 1 as inputs, and the output of the operation unit can be:
h_{W,b}(x) = f(W^T x) = f( sum_{s=1..n} W_s * x_s + b )
where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, used to introduce a nonlinear characteristic into the neural network to convert the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer. The activation function may be a sigmoid function. A neural network is a network formed by joining many such single neural units together, that is, the output of one neural unit can be the input of another neural unit. The input of each neural unit can be connected to the local receptive field of the previous layer to extract the features of the local receptive field; the local receptive field may be a region composed of several neural units.
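The single neural unit above can be written out directly; the sketch below uses the sigmoid activation mentioned in the text, and the input values are arbitrary illustrative numbers.

```python
import math

def neural_unit(xs, ws, b):
    """Output of a single neural unit: f(sum_s W_s * x_s + b),
    with a sigmoid activation f."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation

out = neural_unit([1.0, 2.0], [0.5, -0.25], 0.0)
assert out == 0.5          # z = 0.5 - 0.5 + 0 = 0, and sigmoid(0) = 0.5
assert 0.0 < out < 1.0     # sigmoid output always lies in (0, 1)
```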
(2) Deep neural network
A deep neural network (deep neural network, DNN), also known as a multi-layer neural network, can be understood as a neural network with many hidden layers; there is no particular metric for "many" here. Divided by the positions of the different layers, the neural network inside a DNN can be divided into three categories: input layer, hidden layer, and output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all the layers in between are hidden layers. The layers are fully connected, that is, any neuron in the i-th layer is necessarily connected to any neuron in the (i+1)-th layer. Although a DNN looks complicated, the work of each layer is actually not complicated; simply put, it is the following linear relational expression: y = α(W·x + b), where x is the input vector, y is the output vector, b is the offset vector, W is the weight matrix (also called coefficients), and α() is the activation function. Each layer simply performs this operation on the input vector x to obtain the output vector y. Because a DNN has many layers, there are also many coefficients W and offset vectors b. These parameters are defined in the DNN as follows, taking the coefficient W as an example: assume that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as W^3_24, where the superscript 3 represents the layer in which the coefficient W is located, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4. In summary, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W^L_jk. It should be noted that the input layer has no W parameters. In a deep neural network, more hidden layers enable the network to better characterize complex situations in the real world. Theoretically, a model with more parameters has higher complexity and larger "capacity", which means that it can complete more complex learning tasks. Training a deep neural network is the process of learning the weight matrices, and its ultimate goal is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors W of many layers).
(3) Convolutional neural network
A convolutional neural network (CNN, convolutional neural network) is a deep neural network with a convolutional structure. A convolutional neural network contains a feature extractor composed of convolutional layers and sub-sampling layers. The feature extractor can be regarded as a filter, and the convolution process can be regarded as convolving a trainable filter with an input image or a convolutional feature map. A convolutional layer is a neuron layer in a convolutional neural network that performs convolution processing on an input signal. In a convolutional layer of a convolutional neural network, a neuron may be connected to only some of the neurons in adjacent layers. A convolutional layer usually contains several feature maps, and each feature map may be composed of some neural units arranged in a rectangle. Neural units of the same feature map share weights, and the shared weights are the convolution kernel. Sharing weights can be understood as meaning that the way image information is extracted is independent of position. The underlying principle is that the statistics of one part of an image are the same as those of other parts, which means that image information learned in one part can also be used in another part, so the same learned image information can be used for all positions on the image. In the same convolutional layer, multiple convolution kernels can be used to extract different image information; generally, the more convolution kernels there are, the richer the image information reflected by the convolution operation.
A convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training of the convolutional neural network. In addition, a direct benefit of sharing weights is reducing the connections between the layers of the convolutional neural network while also reducing the risk of overfitting.
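Weight sharing can be made concrete with a minimal "valid" 2-D convolution: the same kernel weights are applied at every position of the image. The averaging kernel below is an arbitrary illustrative choice.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation form) with a single
    shared kernel: the same weights are applied at every position."""
    kh, kw = kernel.shape
    H = image.shape[0] - kh + 1
    W = image.shape[1] - kw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((2, 2)) / 4.0                   # shared 2x2 averaging kernel
feat = conv2d(img, k)
assert feat.shape == (3, 3)
assert feat[0, 0] == (0 + 1 + 4 + 5) / 4.0  # average of the top-left 2x2 patch
```

Using several different kernels on the same input yields several feature maps, one per kernel, which is the "multiple convolution kernels" case described above.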
(4) Loss function
In the process of training a deep neural network, because it is hoped that the output of the deep neural network is as close as possible to the value that is really desired to be predicted, the predicted value of the current network can be compared with the really desired target value, and the weight vectors of each layer of the neural network are then updated according to the difference between the two (of course, there is usually an initialization process before the first update, that is, parameters are pre-configured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vectors are adjusted to make the prediction lower, and adjustment continues until the deep neural network can predict the really desired target value or a value very close to it. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the loss function or objective function, an important equation for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the deep neural network becomes a process of reducing this loss as much as possible.
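A common concrete choice of loss function, used here purely as an illustration, is the mean-squared error: a worse prediction produces a higher loss value.

```python
def mse_loss(pred, target):
    """Mean-squared-error loss: a larger output means a larger difference
    between the predicted values and the target values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

near = mse_loss([1.0, 2.0], [1.1, 1.9])   # predictions close to the targets
far = mse_loss([1.0, 2.0], [3.0, 0.0])    # predictions far from the targets
assert near < far                          # a worse prediction yields a higher loss
```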
(5) Back propagation algorithm
A convolutional neural network can use the error back propagation (back propagation, BP) algorithm to correct the values of the parameters in the initial super-resolution model during training, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller. Specifically, forward propagation of the input signal up to the output produces an error loss, and the parameters in the initial super-resolution model are updated by back-propagating the error loss information, so that the error loss converges. The back propagation algorithm is a back propagation movement dominated by the error loss, aiming to obtain the optimal parameters of the super-resolution model, such as the weight matrices.
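A minimal numerical sketch of the idea, propagating the error loss back into parameter updates until the loss converges, is shown below. It uses a hypothetical one-parameter linear model y = w*x + b rather than the convolutional model of the text; for this one-layer case the back-propagated gradients can be written in closed form.

```python
def backprop_step(w, b, xs, ys, lr=0.1):
    """One error-back-propagation update for the model y = w * x + b:
    the error loss is propagated back into gradients on w and b, which
    are then adjusted so the loss decreases."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

def loss(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]   # underlying rule: y = 2x + 1
w, b = 0.0, 0.0
for _ in range(500):
    w, b = backprop_step(w, b, xs, ys)
assert loss(w, b, xs, ys) < 1e-3            # the error loss has converged
```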
(6) Pixel value
The pixel value of an image may be a red-green-blue (RGB) color value, and the pixel value may be a long integer representing a color. For example, a pixel value is 256*Red+100*Green+76*Blue, where Blue represents the blue component, Green represents the green component, and Red represents the red component. For each color component, a smaller value indicates lower brightness, and a larger value indicates higher brightness. For a grayscale image, the pixel value may be a grayscale value.
The system architecture provided by the embodiments of this application is introduced below.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a system architecture 100 provided by an embodiment of this application. As shown in the system architecture 100, the data collection device 160 is configured to collect training data; in the embodiments of this application, the training data includes the first feature maps and the real congestion maps corresponding to all prediction layers.
After collecting the training data, the data collection device 160 stores the training data in the database 130, and the training device 120 performs training based on the training data maintained in the database 130 to obtain the target model 101 (that is, the congestion prediction model in the embodiments of this application).
Embodiment 1 below describes in more detail how the training device 120 obtains the target model 101 based on the training data. The target model 101 can be used to implement the image processing method provided by the embodiments of this application: the M first feature maps corresponding to each prediction layer of the chip to be predicted are input into the target model 101 after related preprocessing, and the predicted congestion map corresponding to each prediction layer is obtained. The target model 101 in the embodiments of this application may specifically be a congestion prediction model; in the embodiments provided by this application, the congestion prediction model is obtained through at least one pass of training. It should be noted that, in practical applications, the training data maintained in the database 130 is not necessarily all collected by the data collection device 160, and may also be received from other devices. It should also be noted that the training device 120 does not necessarily train the target model 101 entirely based on the training data maintained in the database 130, and may also obtain training data from the cloud or elsewhere for model training; the foregoing description should not be construed as a limitation on the embodiments of this application.
根据训练设备120训练得到的目标模型101可以应用于不同的系统或设备中,如应用 于图1所示的执行设备110,执行设备110可以是终端,如平板电脑,笔记本电脑,手机终端,车载终端等,还可以是服务器或者云端等。在附图1中,执行设备110配置有输入/输出(input/output,I/O)接口112,用于与外部设备进行数据交互,用户可以通过客户设备140向I/O接口112输入数据,输入数据在本申请实施例中可以包括待预测芯片各预测层对应的第一特征图。The target model 101 trained according to the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. A terminal, etc., may also be a server or a cloud. In accompanying drawing 1, execution equipment 110 is equipped with input/output (input/output, I/O) interface 112, is used for carrying out data interaction with external equipment, the user can input data to I/O interface 112 through client equipment 140, In this embodiment of the present application, the input data may include a first feature map corresponding to each prediction layer of the chip to be predicted.
在执行设备110对输入数据进行预处理,或者在执行设备110的计算模块111执行计算等相关的处理过程中,执行设备110可以调用数据存储系统150中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统150中。When the execution device 110 preprocesses the input data, or in the calculation module 111 of the execution device 110 performs calculation and other related processing, the execution device 110 can call the data, codes, etc. in the data storage system 150 for corresponding processing , the correspondingly processed data and instructions may also be stored in the data storage system 150 .
最后,I/O接口112将处理结果,如上述得到的待预测芯片各预测层的预测拥塞图(或待预测芯片对应的预测拥塞图)返回给客户设备140,从而提供给用户。Finally, the I/O interface 112 returns the processing result, such as the predicted congestion map of each prediction layer of the chip to be predicted (or the corresponding predicted congestion map of the chip to be predicted) obtained above, to the client device 140 to provide to the user.
值得说明的是,训练设备120可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型101,该相应的目标模型101即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。It is worth noting that the training device 120 can generate corresponding target models 101 based on different training data for different goals or different tasks, and the corresponding target models 101 can be used to achieve the above-mentioned goals or complete the above-mentioned tasks, thereby Provide the user with the desired result.
In the case shown in FIG. 1, the user can manually specify the input data, and this manual specification can be operated through an interface provided by the I/O interface 112. In another case, the client device 140 can automatically send input data to the I/O interface 112; if the user's authorization is required for the client device 140 to automatically send the input data, the user can set the corresponding permission in the client device 140. The user can view the result output by the execution device 110 on the client device 140, and the specific form of presentation may be display, sound, action, or another specific means. The client device 140 can also serve as a data collection terminal, collecting the input data fed to the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, as new sample data and storing them in the database 130. Of course, collection may also bypass the client device 140: the I/O interface 112 may directly store the input data fed to the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, in the database 130 as new sample data.
It is worth noting that FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationships between the devices, components, modules, and the like shown in the figure do not constitute any limitation. For example, in FIG. 1 the data storage system 150 is external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed inside the execution device 110.
As shown in FIG. 1, the target model 101 is obtained by training with the training device 120. In this embodiment of the present application, the target model 101 may be obtained by training based on the congestion prediction model training method of the embodiments of the present application. Specifically, the congestion prediction model provided in the embodiments of the present application may be a convolutional neural network, a generative adversarial network, a variational autoencoder, a semantic segmentation neural network, or another model; this solution imposes no specific limitation in this respect.
As described in the introduction to basic concepts above, a convolutional neural network is a deep neural network with a convolutional structure and is a deep learning (DL) architecture. A deep learning architecture refers to performing multiple levels of learning at different levels of abstraction through machine learning algorithms. As a deep learning architecture, a CNN is a feed-forward artificial neural network in which the individual neurons can respond to images input into it.
As shown in FIG. 2, a convolutional neural network (CNN) 200 may include an input layer 210, convolutional/pooling layers 220 (where the pooling layers are optional), and neural network layers 230.
Convolutional/pooling layers 220:
Convolutional layer:
As shown in FIG. 2, the convolutional/pooling layers 220 may include, for example, layers 221-226. In one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer. In another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can serve as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
The following takes the convolutional layer 221 as an example to introduce the inner working of a single convolutional layer.
The convolutional layer 221 may include many convolution operators. A convolution operator, also called a kernel, plays a role in image processing equivalent to a filter that extracts specific information from the input image matrix. A convolution operator can essentially be a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is typically moved across the input image along the horizontal direction one pixel at a time (or two pixels at a time, depending on the value of the stride), thereby extracting a specific feature from the image. The size of the weight matrix should be related to the size of the image. Note that the depth dimension of the weight matrix is the same as the depth dimension of the input image, and during the convolution operation the weight matrix extends through the entire depth of the input image. Therefore, convolving with a single weight matrix produces a convolutional output with a single depth dimension; in most cases, however, a single weight matrix is not used, and instead multiple weight matrices of the same size (rows x columns), i.e., multiple matrices of the same shape, are applied. The outputs of the weight matrices are stacked to form the depth dimension of the convolved image, where this dimension can be understood as being determined by the "multiple" mentioned above. Different weight matrices can be used to extract different features of the image; for example, one weight matrix is used to extract image edge information, another to extract a specific color of the image, and yet another to blur unwanted noise in the image. The multiple weight matrices have the same size (rows x columns), so the feature maps extracted by them also have the same size; the extracted feature maps of the same size are then combined to form the output of the convolution operation.
In practical applications, the weight values in these weight matrices need to be obtained through extensive training. The weight matrices formed by the trained weight values can be used to extract information from the input image, enabling the convolutional neural network 200 to make correct predictions.
When the convolutional neural network 200 has multiple convolutional layers, the initial convolutional layers (e.g., 221) often extract more general features, which may also be called low-level features. As the depth of the convolutional neural network 200 increases, the features extracted by later convolutional layers (e.g., 226) become increasingly complex, such as high-level semantic features; features with higher-level semantics are more applicable to the problem to be solved.
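As an illustration only (not part of the embodiment), the depth-stacking behavior of multiple same-size kernels described above can be sketched in NumPy; the image size, kernel count, and kernel values here are arbitrary:

```python
import numpy as np

def conv2d_multi_kernel(image, kernels, stride=1):
    """Valid cross-correlation of one multi-channel image with several
    same-size kernels; each kernel spans the full input depth, and the
    per-kernel outputs are stacked to form the output depth dimension."""
    h, w, depth = image.shape
    num_k, kh, kw, kd = kernels.shape
    assert kd == depth  # kernel depth must match input depth
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.zeros((out_h, out_w, num_k))
    for k in range(num_k):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw, :]
                out[i, j, k] = np.sum(patch * kernels[k])
    return out

img = np.random.rand(8, 8, 3)          # 8x8 image, depth 3
kernels = np.random.rand(4, 3, 3, 3)   # 4 kernels of size 3x3, depth 3
feat = conv2d_multi_kernel(img, kernels)
print(feat.shape)                      # (6, 6, 4): depth = number of kernels
```

The output depth equals the number of kernels, matching the statement above that the stacked outputs form the depth dimension of the convolved image.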
Pooling layer:
Since it is often necessary to reduce the number of training parameters, pooling layers often need to be introduced periodically after convolutional layers. Among the layers 221-226 illustrated in the convolutional/pooling layers 220 in FIG. 2, one convolutional layer may be followed by one pooling layer, or multiple convolutional layers may be followed by one or more pooling layers. In image processing, the sole purpose of a pooling layer is to reduce the spatial size of the image. A pooling layer may include an average pooling operator and/or a max pooling operator for sampling the input image to obtain an image of smaller size. The average pooling operator computes the average of the pixel values within a specific range of the image as the result of average pooling. The max pooling operator takes the pixel with the largest value within a specific range as the result of max pooling. In addition, just as the size of the weight matrix in a convolutional layer should be related to the image size, the operators in a pooling layer should also be related to the size of the image. The size of the image output by a pooling layer may be smaller than the size of the image input to it; each pixel in the output image represents the average or maximum value of the corresponding sub-region of the input image.
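A minimal sketch of the two pooling operators just described, assuming non-overlapping windows (the 4x4 input values are invented for illustration):

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows; each output pixel
    is the max (or mean) of the corresponding sub-region of the input."""
    h, w = image.shape
    out = image[:h - h % size, :w - w % size]
    out = out.reshape(h // size, size, w // size, size)
    return out.max(axis=(1, 3)) if mode == "max" else out.mean(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 8., 3., 2.],
              [7., 6., 1., 0.]])
print(pool2d(x, 2, "max"))    # [[4. 8.] [9. 3.]]
print(pool2d(x, 2, "mean"))   # [[2.5 6.5] [7.5 1.5]]
```

Both operators halve the spatial size here, consistent with the statement that the pooled output is smaller than the input.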
Neural network layers 230:
After processing by the convolutional/pooling layers 220, the convolutional neural network 200 is not yet able to output the required output information. As described above, the convolutional/pooling layers 220 only extract features and reduce the parameters introduced by the input image. To generate the final output information (the required class information or other relevant information), however, the convolutional neural network 200 needs to use the neural network layers 230 to generate one output, or a group of outputs whose number equals the number of required classes. Therefore, the neural network layers 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 2) and an output layer 240; the parameters contained in the multiple hidden layers may be obtained by pre-training on training data relevant to the specific task type.
After the multiple hidden layers in the neural network layers 230, the last layer of the entire convolutional neural network 200 is the output layer 240. The output layer 240 has a loss function similar to categorical cross-entropy, specifically used to compute the prediction error. Once the forward propagation of the entire convolutional neural network 200 (propagation in the direction from 210 to 240 in FIG. 2) is completed, backpropagation (propagation in the direction from 240 to 210 in FIG. 2) begins to update the weight values and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 200 and the error between the result output by the convolutional neural network 200 through the output layer and the ideal result.
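The forward/backward cycle described above can be illustrated with a toy example that is not the embodiment's model: a single softmax output layer trained with cross-entropy loss, where forward propagation computes the loss and backpropagation updates the weights and biases to reduce it. All data and hyperparameters here are invented:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                 # 8 samples, 4 features
y = rng.integers(0, 3, size=8)              # labels for 3 classes
W = rng.normal(scale=0.1, size=(4, 3))
b = np.zeros(3)

losses = []
for step in range(500):
    p = softmax(X @ W + b)                  # forward propagation
    losses.append(-np.log(p[np.arange(8), y]).mean())
    grad = p.copy()                         # cross-entropy gradient at output
    grad[np.arange(8), y] -= 1
    grad /= 8
    W -= 0.1 * (X.T @ grad)                 # backpropagation: update weights
    b -= 0.1 * grad.sum(axis=0)             # and biases to reduce the loss

print(losses[-1] < losses[0])               # True: training reduces the loss
```

Repeating the forward/backward cycle drives the loss down, which is exactly the role of the output layer's loss function and backpropagation described above.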
It should be noted that the convolutional neural network 200 shown in FIG. 2 is only one example of a convolutional neural network; in specific applications, the convolutional neural network may also exist in the form of other network models.
The following introduces a chip hardware structure provided by an embodiment of the present application.
FIG. 3 shows a chip hardware structure provided by an embodiment of the present invention; the chip includes a neural network processor 50. The chip may be provided in the execution device 110 shown in FIG. 1 to complete the computation work of the computing module 111. The chip may also be provided in the training device 120 shown in FIG. 1 to complete the training work of the training device 120 and output the target model 101. The algorithms of all layers in the convolutional neural network shown in FIG. 2 can be implemented in the chip shown in FIG. 3.
The neural network processor (NPU) 50 is mounted on the host CPU as a coprocessor, and tasks are assigned by the host CPU. The core part of the NPU is the operation circuit 503; the controller 504 controls the operation circuit 503 to fetch data from memory (the weight memory or the input memory) and perform operations.
In some implementations, the operation circuit 503 internally includes multiple processing engines (PEs). In some implementations, the operation circuit 503 is a two-dimensional systolic array. The operation circuit 503 may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the operation circuit 503 is a general-purpose matrix processor.
For example, suppose there are an input matrix A, a weight matrix B, and an output matrix C. The operation circuit fetches the data corresponding to matrix B from the weight memory 502 and caches it on each PE in the operation circuit. The operation circuit fetches the matrix A data from the input memory 501 and performs a matrix operation with matrix B; partial or final results of the resulting matrix are stored in the accumulator 508.
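As an illustrative sketch (not a model of the actual circuit), the accumulate-as-you-go matrix multiplication C = A x B described above can be expressed as follows, with a plain list of lists standing in for the accumulator 508:

```python
def matmul_accumulate(A, B):
    """C = A x B computed by streaming rank-1 updates: at each step t a
    column of A is combined with a row of B, and the partial products
    build up in an accumulator until the final result is complete."""
    n, k = len(A), len(B)
    m = len(B[0])
    acc = [[0.0] * m for _ in range(n)]   # accumulator for partial results
    for t in range(k):                    # stream one A-column / B-row pair
        for i in range(n):
            for j in range(m):
                acc[i][j] += A[i][t] * B[t][j]
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_accumulate(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

After the last streamed pair, the accumulator holds the final matrix, mirroring how partial results live in the accumulator 508 until complete.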
The vector computation unit 507 can further process the output of the operation circuit, such as vector multiplication, vector addition, exponential operations, logarithmic operations, and size comparison. For example, the vector computation unit 507 can be used for network computations of non-convolutional/non-FC layers in a neural network, such as pooling, batch normalization, and local response normalization.
In some implementations, the vector computation unit 507 can store the processed output vector to the unified memory 506. For example, the vector computation unit 507 may apply a nonlinear function to the output of the operation circuit 503, such as a vector of accumulated values, to generate activation values. In some implementations, the vector computation unit 507 generates normalized values, merged values, or both. In some implementations, the processed output vector can be used as an activation input to the operation circuit 503, for example for use in a subsequent layer of the neural network.
The unified memory 506 is used to store input data and output data.
A direct memory access controller (DMAC) 505 directly transfers input data in the external memory to the input memory 501 and/or the unified memory 506, stores weight data in the external memory into the weight memory 502, and stores data in the unified memory 506 into the external memory.
A bus interface unit (BIU) 510 is used to implement interaction between the host CPU, the DMAC, and the instruction fetch buffer 509 through a bus.
An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504.
The controller 504 is used to call the instructions cached in the instruction fetch buffer 509 to control the working process of the operation accelerator.
Generally, the unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch buffer 509 are all on-chip memories, while the external memory is memory outside the NPU. The external memory may be double data rate synchronous dynamic random access memory (DDR SDRAM), high bandwidth memory (HBM), or other readable and writable memory.
The operations of each layer in the convolutional neural network shown in FIG. 2 may be performed by the operation circuit 503 or the vector computation unit 507.
The training device 120 in FIG. 1 introduced above can execute the steps of the congestion prediction model training method in the embodiments of the present application, and the execution device 110 in FIG. 1 can execute the steps of the image processing method in the embodiments of the present application. The neural network model shown in FIG. 2 and the chip shown in FIG. 3 can also be used to execute the steps of the image processing method of the embodiments of the present application, and the chip shown in FIG. 3 can also be used to execute the steps of the congestion prediction model training method in the embodiments of the present application.
As shown in FIG. 4, FIG. 4 is a schematic structural diagram of a system architecture 300 provided by an embodiment of the present application. The system architecture includes a local device 301, a local device 302, an execution device 210, and a data storage system 250, where the local device 301 and the local device 302 are connected to the execution device 210 through a communication network.
The execution device 210 may be implemented by one or more servers. Optionally, the execution device 210 may be used in cooperation with other computing devices, such as data storage, routers, and load balancers. The execution device 210 may be arranged at one physical site or distributed across multiple physical sites. The execution device 210 may use the data in the data storage system 250, or call the program code in the data storage system 250, to implement the congestion prediction model training method or the image processing method of the embodiments of the present application.
Specifically, the execution device 210 may perform the following process:
Dividing a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers included in each of K semiconductor chips, N is an integer greater than 1, and K is a positive integer; determining M first feature maps corresponding to each of the prediction layers, where the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer; and adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and training a congestion prediction model using the data set.
A congestion prediction model can be trained through the above execution device 210. The congestion prediction model can be used for image processing, speech processing, natural language processing, and the like; for example, the congestion prediction model can be used to implement the congestion prediction method in the embodiments of the present application.
Alternatively, through the above process, the execution device 210 can be built into an image processing apparatus, which can be used for image processing (for example, for implementing the congestion prediction of semiconductor chips in the embodiments of the present application).
Users can operate their respective user devices (for example, the local device 301 and the local device 302) to interact with the execution device 210. Each local device can represent any of a variety of computing devices, such as personal computers, computer workstations, smartphones, and tablet computers.
Each user's local device can interact with the execution device 210 through a communication network of any communication mechanism/communication standard; the communication network may be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
In one implementation, the local device 301 and the local device 302 obtain the relevant parameters of the congestion prediction model from the execution device 210, deploy the congestion prediction model on the local device 301 and the local device 302, and use the congestion prediction model to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In another implementation, the trained congestion prediction model can be directly deployed on the execution device 210; the execution device 210 obtains the feature data of the chip to be predicted from the local device 301 and the local device 302, and uses the trained congestion prediction model to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In one implementation, the local device 301 and the local device 302 obtain the relevant parameters of the image processing apparatus from the execution device 210, deploy the image processing apparatus on the local device 301 and the local device 302, and use the image processing apparatus to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In another implementation, the image processing apparatus can be directly deployed on the execution device 210; the execution device 210 obtains the feature data of the chip to be predicted from the local device 301 and the local device 302, and uses the image processing apparatus to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In other words, the above execution device 210 may also be a cloud device, in which case the execution device 210 may be deployed in the cloud; alternatively, the above execution device 210 may also be a terminal device, in which case the execution device 210 may be deployed on the user terminal side. The embodiments of the present application impose no limitation in this respect.
The congestion prediction model training method and the image processing method (for example, the congestion prediction method in EDA) of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Please refer to FIG. 5. FIG. 5 is a schematic flowchart of a congestion prediction model training method 500 provided by an embodiment of the present application. The method includes but is not limited to the following steps:
Step S510: Divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers included in each of K semiconductor chips, and K is a positive integer.
Specifically, the metal layers included in each semiconductor chip are divided into at least two prediction layers, where each prediction layer includes at least one metal layer. Please refer to FIG. 6, which is a schematic diagram of the hierarchical division of a semiconductor chip provided by an embodiment of the present application. As shown in FIG. 6, the semiconductor chip may include multiple metal layers (from top to bottom: metal layer 1-1 ... metal layer N-B), and the semiconductor chip can be divided into at least two prediction layers (from top to bottom: prediction layer 1 ... prediction layer N). Each prediction layer includes at least one metal layer; for example, prediction layer 1 may include metal layer 1-1 ... metal layer 1-A, and prediction layer N may include metal layer N-1 ... metal layer N-B, where A and B are positive integers and N is an integer greater than or equal to 2.
In a feasible implementation, the above division of the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process or the functional module distribution of the metal layers in each semiconductor chip.
Specifically, the prediction layers are partitioned according to the manufacturing process of the metal layers or according to whether the metal layers contain the same functional module. That is, on the same semiconductor chip, metal layers with similar manufacturing processes can be assigned to the same prediction layer, or metal layers containing the same functional module can be assigned to the same prediction layer. The manufacturing process of each metal layer can be characterized by its routing track capacity, which is specifically the number of routing tracks on the metal layer. The more advanced the manufacturing process of a metal layer, the larger its routing track capacity; for example, the routing track capacity of a metal layer under a 7 nm manufacturing process is larger than that of a metal layer under a 14 nm manufacturing process. Functional modules refer to certain hardware structures in a metal layer, such as a macro cell layer or registers.
For example, when the multiple metal layers in the same semiconductor chip are divided into at least two prediction layers according to differences in the routing track capacity of each metal layer, metal layers whose routing track capacities differ by no more than a preset threshold can be assigned to the same prediction layer; the preset threshold can be determined according to the specific application scenario. Suppose a semiconductor chip contains 6 metal layers with routing track capacities of 15, 15, 14, 10, 10, and 2 tracks respectively, and the preset threshold is 2; in this case, the 6 metal layers can be divided into 3 prediction layers. Specifically, the three metal layers with routing track capacities of 15, 15, and 14 are assigned to one prediction layer; the two metal layers with a routing track capacity of 10 are assigned to another prediction layer; and the metal layer with a routing track capacity of 2 is assigned to its own prediction layer.
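The grouping in the example above can be sketched as follows. Note that the exact grouping rule is an assumption for illustration (here, a layer joins the current group if its capacity is within the threshold of that group's first layer, scanning capacities in descending order); the embodiment does not prescribe a specific clustering procedure:

```python
def group_by_capacity(capacities, threshold):
    """Assign metal layers to prediction layers: scanning routing track
    capacities in descending order, a layer joins the current group when
    its capacity differs from the group's first (largest) capacity by no
    more than the threshold; otherwise it starts a new group."""
    groups = []
    for cap in sorted(capacities, reverse=True):
        if groups and groups[-1][0] - cap <= threshold:
            groups[-1].append(cap)
        else:
            groups.append([cap])
    return groups

print(group_by_capacity([15, 15, 14, 10, 10, 2], threshold=2))
# [[15, 15, 14], [10, 10], [2]]
```

With the capacities and threshold from the example, this reproduces the three prediction layers described above.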
For example, when the multiple metal layers in the same semiconductor chip are divided into at least two prediction layers according to the functional modules contained in each metal layer, the division may be based on whether the layers contain a macro cell layer or contain registers. For example, the metal layers in the same semiconductor chip that contain a macro cell layer can be assigned to one prediction layer and the metal layers that contain non-macro-cell layers to another prediction layer; or the metal layers in the same semiconductor chip that contain registers can be assigned to one prediction layer and the metal layers that do not contain registers to another prediction layer.
It can be seen that dividing the multiple metal layers into different prediction layers in the above manner makes the feature data of the metal layers within the same prediction layer highly consistent, while the feature data corresponding to different prediction layers differ substantially. After the congestion prediction model is trained with the feature data corresponding to each prediction layer, the resulting model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model has refined recognition and prediction capabilities.
Step S520: Determine M first feature maps corresponding to each prediction layer, where the M first feature maps are respectively used to describe M chip features of the prediction layer, and M is a positive integer.
Specifically, the M chip features used for congestion prediction may be determined according to the specific application scenario. Chip-related data is obtained, including the netlist, macro-module positions, transistor positions, transistor pin positions, routing resource amounts, and so on. Based on this chip-related data, the M first feature maps corresponding to the M chip features of each prediction layer are computed; each chip feature of a prediction layer corresponds to one first feature map that describes that chip feature.
In a feasible implementation, the M chip features may include one or more of pin density, network connection density, module mask, and routing resource amount.
It should be understood that the above chip features are merely specific examples listed in the embodiments of this application; those skilled in the art may also use other chip features to describe the prediction layers. In addition, after layering according to the above criteria, the metal layers within each prediction layer have similar chip features; for example, within the same prediction layer, the pin density of one metal layer is similar to that of another metal layer.
In a feasible implementation, determining the M first feature maps corresponding to each prediction layer includes: obtaining M second feature maps corresponding to each metal layer in the prediction layer, where the M second feature maps are respectively used to describe the M chip features of that metal layer; and generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to the prediction layer, where the first feature map describing any given chip feature is obtained from the second feature maps that describe that chip feature in the respective metal layers.
Specifically, each metal layer of a semiconductor chip corresponds to the above M chip features; that is, the chip features of each metal layer may include one or more of pin density, network connection density, module mask, and routing resource amount. Among these, pin density and module mask are non-directional, while routing resource amount and network connection density are directional: within the same metal layer, the routing resource amount is either horizontal or vertical, and likewise the network connection density is either horizontal or vertical.
The routing resource amount is specifically the number of routing tracks on each metal layer. Since the routing tracks on each metal layer are directional, running either horizontally or vertically, the routing resource amount is correspondingly directional. The network connection density refers to the number of wires per unit area; since the wires run along the routing tracks, the network connection density is also directional, that is, correspondingly horizontal or vertical. It should be understood that for a directional chip feature, the first feature map describing that feature is correspondingly directional; for example, when the routing resource amount is horizontal, the first feature map describing the routing resource amount is also horizontal.
Further, obtaining the M second feature maps corresponding to each metal layer in each prediction layer includes: performing feature extraction based on the wiring data of each metal layer in the prediction layer to obtain the M second feature maps corresponding to that metal layer. Generating the M first feature maps corresponding to each prediction layer based on the M second feature maps of each metal layer includes: for a given chip feature, obtaining the first feature map describing that feature from the second feature maps that describe the same feature on the respective metal layers; this first feature map is one of the M first feature maps corresponding to the prediction layer. Specifically, a weighted average, maximum, or minimum is taken over the corresponding pixels of the second feature maps that describe the same chip feature on the metal layers of the prediction layer, yielding the value of the corresponding pixel in the first feature map for that feature. Performing this operation for every pixel of the second feature maps yields the pixel values of all pixels of the first feature map, that is, the first feature map describing that chip feature.
It should be understood that, in addition to the above weighted average, maximum, or minimum, those skilled in the art may process the second feature maps describing the same chip feature on the metal layers in other ways to obtain the first feature map describing that feature; this application is not limited in this respect.
For example, when a prediction layer contains four metal layers and the M chip features include pin density, the first feature map corresponding to the pin density of the prediction layer is determined as follows: first, four second feature maps respectively describing the pin densities of the four metal layers are obtained based on the wiring data of each of the four metal layers; then the pixel values at the same position in the four second feature maps are combined by weighted averaging, taking the maximum, taking the minimum, or the like, to obtain the pixel value at that position in the first feature map describing the pin density of the prediction layer. Processing every pixel of the four second feature maps in this manner yields the first feature map corresponding to the pin density of the prediction layer.
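The per-pixel aggregation described above can be sketched with NumPy. This is an illustrative sketch under assumptions: the function name, the use of plain arrays as feature maps, and the uniform default weights are not from the patent.

```python
import numpy as np

def aggregate_feature_maps(second_maps, mode="mean", weights=None):
    """Combine the second feature maps of a prediction layer's metal
    layers (all the same H x W size) into one first feature map by a
    per-pixel weighted average, maximum, or minimum."""
    stack = np.stack(second_maps, axis=0)  # shape (num_layers, H, W)
    if mode == "mean":                     # weighted average per pixel
        w = np.ones(len(second_maps)) if weights is None else np.asarray(weights, float)
        return np.tensordot(w / w.sum(), stack, axes=1)
    if mode == "max":
        return stack.max(axis=0)
    if mode == "min":
        return stack.min(axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Pin-density example: four metal layers in one prediction layer
maps = [np.full((2, 2), v) for v in (1.0, 2.0, 3.0, 6.0)]
print(aggregate_feature_maps(maps, "mean"))  # every pixel is 3.0
print(aggregate_feature_maps(maps, "max"))   # every pixel is 6.0
```

The same helper would apply unchanged to the horizontal and vertical real congestion maps discussed later, since they are aggregated layer-to-prediction-layer in the same way.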
It should be noted that when a metal layer lacks a first chip feature among the M chip features, that is, when no real second feature map describing that feature exists for the metal layer, a preset operation may be used to obtain a second feature map describing the first chip feature of that metal layer. The preset operation may be: first determining the prediction layer in which the metal layer is located, and then determining a second feature map describing the first chip feature of the metal layer based on the second feature maps that describe the first chip feature of the other metal layers in that prediction layer; for example, a weighted average, maximum, minimum, or other processing may be applied to the corresponding pixels of those second feature maps to obtain the second feature map describing the first chip feature of the metal layer.
It can be seen that, in this embodiment of the present application, each metal layer corresponds to M second feature maps that respectively describe its M chip features. With the above layering, within each prediction layer the second feature maps of the metal layers for a given chip feature are strongly correlated and consistent, so the first feature map obtained by averaging, taking the maximum, or taking the minimum accurately characterizes that chip feature across the metal layers; that is, the first feature map describing a chip feature agrees well with the second feature maps describing that feature in each metal layer. This avoids the situation in which large differences among the metal layers of a prediction layer would make the resulting first feature map deviate substantially from the second feature maps of the individual metal layers. In summary, for a given chip feature, the above layering and first-feature-map construction yield a first feature map that accurately reflects that feature on the metal layers of each prediction layer. Therefore, after the congestion prediction model is trained with the first feature maps corresponding to the different prediction layers, the resulting model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model trained in this embodiment has refined recognition and prediction capabilities.
Step S530: Determine a data set based on the M first feature maps and the real congestion map corresponding to each prediction layer of each of the K semiconductor chips, and train a congestion prediction model with the data set, where the real congestion map corresponding to each prediction layer describes the real congestion degree of that prediction layer.
Here, the congestion degree is the difference between the routing resource demand and the routing resource amount. The routing resource amount is the number of routing tracks. The routing resource demand is the number of wires required to connect all nets of the netlist; since the required wires occupy routing tracks, the difference between the routing resource demand and the routing resource amount is the congestion degree. For example, if the routing resource demand is 10, ten wires are needed to connect all the nets; if the routing resource amount is 8, that is, there are eight routing tracks, then two wires must share tracks with other wires, and the congestion degree is 2.
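Per the definition above, the congestion value of each grid cell is simply demand minus capacity. A minimal sketch follows; the function name is an assumption, and negative values (spare capacity) are left as-is since the patent only defines the difference.

```python
import numpy as np

def congestion_map(demand, capacity):
    """Real congestion map: per-grid routing-track demand minus the
    available routing-track amount (positive values mean overflow)."""
    return np.asarray(demand, float) - np.asarray(capacity, float)

# The example from the text: demand 10 tracks, capacity 8 tracks -> congestion 2
print(congestion_map([[10]], [[8]]))  # [[2.]]
```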
In a feasible implementation, the method further includes: performing global routing on the K semiconductor chips, obtaining the real congestion map corresponding to each prediction layer from the K globally routed semiconductor chips, and adding the real congestion maps corresponding to the prediction layers of the K semiconductor chips to the data set.
Chip design can be divided into two stages: chip layout and global routing. The chip layout stage mainly determines, for each metal layer of the chip, the netlist, macro-module positions, transistor positions, transistor pin positions, routing resource amounts, and so on. The global routing stage mainly winds the metal wires onto the routing tracks corresponding to the routing resource amounts.
Specifically, obtaining the real congestion map corresponding to each prediction layer from the K globally routed semiconductor chips includes: after global routing is performed on a semiconductor chip, determining the number of wires on the chip, that is, the routing track demand; and then computing the real congestion map of each prediction layer from the routing resource demand and the routing resource amount.
The image processing method (which may also be called a congestion prediction method) in the embodiments of this application is mainly used in the chip layout stage: before global routing is performed, the congestion degree of the chip is predicted with this method so that the chip layout can be adjusted accordingly.
In a feasible implementation, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in the prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in the prediction layer.
Specifically, the real congestion map corresponding to each metal layer in each semiconductor chip is obtained. It includes a second horizontal real congestion map, which describes the congestion degree of the metal layer in the horizontal direction, and a second vertical real congestion map, which describes the congestion degree of the metal layer in the vertical direction. The real congestion map corresponding to each metal layer is computed, after global routing, from the routing resource demand and the routing resource amount of that metal layer.
Optionally, the specific process of obtaining the first horizontal real congestion map of a prediction layer from the second horizontal real congestion maps of its metal layers corresponds to the process described above for determining the first feature maps of a prediction layer, that is, obtaining the first feature map describing a chip feature from the second feature maps corresponding to that feature on the metal layers; details are not repeated here. Similarly, the process of determining the first vertical real congestion map of a prediction layer is the same as that of the first horizontal real congestion map and is not repeated here.
It should be understood that the M first feature maps and the real congestion map corresponding to each prediction layer have the same size.
It can be seen that, in this embodiment of the present application, with the above layering, the real congestion maps of the metal layers within each prediction layer are strongly correlated and consistent, so the real congestion map of each prediction layer obtained in this embodiment agrees well with the real congestion maps of its metal layers. In other words, a real congestion map accurately reflecting the congestion degree of each prediction layer can be obtained, which in turn ensures the prediction accuracy of the congestion prediction model subsequently trained with the real congestion maps of the prediction layers.
In a feasible implementation, adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and training the congestion prediction model with the data set includes: iteratively training the congestion prediction model with the data set, where each training iteration includes: processing, by the congestion prediction model, the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
Specifically, each training iteration includes: determining a single-iteration training sample from the M first feature maps and the two real congestion maps corresponding to any prediction layer, where the single-iteration training sample contains M third feature maps and two target real congestion maps; inputting the M third feature maps into the congestion prediction model to obtain the predicted congestion map output by the model; determining the prediction error based on the predicted congestion map and the two target real congestion maps; and updating the model parameters of the congestion prediction model according to the prediction error by gradient descent or another back-propagation algorithm. Finally, whether the training process satisfies a preset condition is judged: if the preset condition is satisfied, the training process of the congestion prediction model ends; otherwise, the next training iteration begins. The preset condition may be that the number of training iterations is greater than or equal to a preset number, that the prediction error is less than or equal to a preset error, or another feasible condition; this application is not limited in this respect. The congestion prediction model may be a generative adversarial network, a variational autoencoder, a semantic segmentation network, or another model; this application is not limited in this respect either.
The process of determining the single-iteration training sample is as follows:
M third feature maps are cropped from the same arbitrary region of the M first feature maps corresponding to the prediction layer, and two target real congestion maps are cropped from that same region of the two real congestion maps corresponding to the prediction layer; the sizes of the M third feature maps and the two target real congestion maps are equal to a target size. The M third feature maps and the two target real congestion maps form the single-iteration training sample. The target size is the input image size allowed by the congestion prediction model and may be smaller than or equal to the size of the first feature maps.
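The same-region cropping above can be sketched as follows. The function name, the random choice of crop origin, and the use of NumPy arrays as maps are assumptions for illustration.

```python
import numpy as np

def crop_training_sample(first_maps, real_maps, target_h, target_w, rng=None):
    """Crop the same region from the M first feature maps and the two
    real congestion maps of a prediction layer, producing the M third
    feature maps and two target real congestion maps of one iteration."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = first_maps[0].shape
    top = int(rng.integers(0, h - target_h + 1))
    left = int(rng.integers(0, w - target_w + 1))
    crop = lambda m: m[top:top + target_h, left:left + target_w]
    return [crop(m) for m in first_maps], [crop(m) for m in real_maps]

first_maps = [np.arange(64.0).reshape(8, 8) for _ in range(3)]  # M = 3
real_maps = [np.zeros((8, 8)), np.ones((8, 8))]                 # horizontal + vertical
thirds, targets = crop_training_sample(first_maps, real_maps, 4, 4)
print(thirds[0].shape, targets[0].shape)  # (4, 4) (4, 4)
```

When the target size equals the first-feature-map size, the crop is the whole map, matching the special case noted below.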
Further, when the target size equals the size of the M first feature maps corresponding to the prediction layer, the M first feature maps themselves serve as the M third feature maps, and the two real congestion maps corresponding to the prediction layer serve as the two target real congestion maps; that is, the single-iteration training sample then consists of the M first feature maps and the two real congestion maps.
In a feasible implementation, the prediction layers contained in each semiconductor chip are a macro-cell layer and a non-macro-cell layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. Adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and training the congestion prediction model with the data set includes: training the first congestion prediction model with the first feature maps corresponding to the macro-cell layers in the data set and the corresponding real congestion maps; and training the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layers in the data set and the corresponding real congestion maps.
Optionally, the multiple metal layers of each semiconductor chip are divided into two prediction layers: a macro-cell layer and a non-macro-cell layer. The first congestion prediction model and the second congestion prediction model have identical model structures; their initial model parameters may be the same or different.
Specifically, training the first congestion prediction model with the first feature maps corresponding to the macro-cell layers in the data set and the corresponding real congestion maps includes: determining a single-iteration training sample from the M first feature maps and real congestion maps corresponding to any macro-cell layer, and then training the first congestion prediction model with that sample. The determination of the single-iteration training sample for the first congestion prediction model follows the determination process described above for the congestion prediction model, and the specific training process of the first congestion prediction model is the same as the training process of the congestion prediction model in the above embodiment; details are not repeated here.
Similarly, the training process of the second congestion prediction model corresponds to that of the first congestion prediction model and is not repeated here.
Referring to FIG. 7, FIG. 7 is a schematic flowchart of an image processing method 700 provided by an embodiment of this application. The method includes but is not limited to the following steps:
Step S710: Determine M first feature maps corresponding to each prediction layer of a semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer.
Specifically, for the manner of determining the M first feature maps corresponding to each prediction layer, refer to the detailed description of the embodiment shown in FIG. 5; details are not repeated here.
Step S720: Process the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain a predicted congestion map corresponding to each prediction layer.
The congestion prediction model is obtained by training with a data set that includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips. The training data corresponding to each training prediction layer includes M first training feature maps, which respectively describe the M chip features of the training prediction layer, and a real congestion map, which describes the real congestion degree of the training prediction layer. Each training semiconductor chip includes at least two training prediction layers, each training prediction layer includes at least one metal layer, and K is a positive integer.
Optionally, when the size of the first feature maps is larger than the target size of the congestion prediction model's input image, a sliding window may be used to extract, from the M first feature maps, the M fourth feature maps used by the model for a single prediction. Each fourth feature map has the target size; as shown in FIG. 8, the length and width of the target size may be E and F respectively, where E and F are positive integers measured in pixels.
Specifically, the composition of the model input data used by the congestion prediction model for a single prediction is described below with reference to FIG. 8:
The first feature map shown in FIG. 8 is any one of the M first feature maps corresponding to a prediction layer. As shown in FIG. 8, the first feature map may contain D fourth feature maps, and the width of the overlap between any two adjacent fourth feature maps is G, where G is an integer greater than or equal to zero, measured in pixels. That is, each of the M first feature maps contains D fourth feature maps. The M fourth feature maps taken from the same region of the M first feature maps form the input data for a single prediction of the model. In summary, the M first feature maps corresponding to each prediction layer contain D groups of input data for congestion prediction, where each group of input data corresponds to a specific region of the first feature maps, and hence to a specific region of the prediction layer, and D is a positive integer.
Optionally, processing the M first feature maps corresponding to each prediction layer with the congestion prediction model to obtain the predicted congestion map corresponding to the prediction layer includes: inputting the D groups of input data corresponding to the prediction layer into one or more congestion prediction models in turn, obtaining the predicted congestion maps corresponding to the groups of input data, D groups of predicted congestion maps in total, where each group of predicted congestion maps includes one horizontal predicted congestion map and one vertical predicted congestion map; stitching the horizontal predicted congestion maps of the D groups to obtain the horizontal predicted congestion map corresponding to the prediction layer; and stitching the vertical predicted congestion maps of the D groups to obtain the vertical predicted congestion map corresponding to the prediction layer. It should be noted that when two predicted congestion maps are stitched, each pixel in their overlapping region corresponds to two pixel values, one from each map; the pixel value after stitching may be determined by, for example, taking a weighted average or the maximum of the two values.
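The tile-and-stitch procedure can be sketched as follows. This is a simplified sketch assuming square windows, uniform averaging in overlaps, and NumPy arrays; the function names are illustrative, not from the patent.

```python
import numpy as np

def tile_positions(length, win, overlap):
    """Window start offsets covering [0, length) with the given overlap G."""
    step = win - overlap
    starts = list(range(0, max(length - win, 0) + 1, step))
    if starts[-1] + win < length:  # make sure the last window reaches the edge
        starts.append(length - win)
    return starts

def stitch(windows, positions, full_shape, win):
    """Average-blend predicted windows back into one predicted congestion map."""
    acc = np.zeros(full_shape)
    cnt = np.zeros(full_shape)
    for pred, (top, left) in zip(windows, positions):
        acc[top:top + win, left:left + win] += pred
        cnt[top:top + win, left:left + win] += 1.0
    return acc / cnt               # overlapping pixels become the mean of the tiles

H = W = 6; win = 4; overlap = 2    # E = F = 4, G = 2 in the text's notation
positions = [(t, l) for t in tile_positions(H, win, overlap)
                    for l in tile_positions(W, win, overlap)]
windows = [np.ones((win, win)) for _ in positions]  # stand-ins for model outputs
out = stitch(windows, positions, (H, W), win)
print(out.shape, float(out.min()), float(out.max()))  # (6, 6) 1.0 1.0
```

With constant stand-in tiles the stitched map is constant, confirming the overlap blending introduces no seams; in practice `windows` would hold the model's horizontal (or vertical) predicted congestion tiles.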
Further, optionally, sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: sequentially inputting the D groups of input data into a single trained congestion prediction model to obtain the predicted congestion map corresponding to each group of input data; or inputting each of the D groups of input data into one of a plurality of congestion prediction models for parallelized prediction, obtaining the predicted congestion map corresponding to each group, where each of the plurality of congestion prediction models has the same model structure and parameters as the trained congestion prediction model.
It can be seen that, in the embodiments of the present application, the parallelized prediction approach can greatly reduce the time required for congestion prediction and improve efficiency.
Optionally, sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: when the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer, inputting the D groups of input data corresponding to the macro-unit layer into one or more first congestion prediction models to obtain the predicted congestion map corresponding to the macro-unit layer, and inputting the D groups of input data corresponding to the non-macro-unit layer into one or more second congestion prediction models to obtain the predicted congestion map corresponding to the non-macro-unit layer.
Specifically, for the training process of the above congestion prediction model, reference may be made to the embodiment described in FIG. 5, which is not repeated here.
In a feasible implementation, the above method further includes: aggregating the predicted congestion maps corresponding to all prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
Specifically, the vertical predicted congestion maps and horizontal predicted congestion maps corresponding to the prediction layers of the semiconductor chip to be predicted are aggregated to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
In a feasible implementation, the predicted congestion map corresponding to each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map, and aggregating the predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted includes: using a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; using the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and using a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or, using the directional aggregation operator to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer, and then using the hierarchical aggregation operator to aggregate the reference predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
Optionally, the above hierarchical aggregation operator may be an operation such as taking the mean or the maximum. Specifically, for any two predicted congestion maps being hierarchically aggregated, the corresponding pixels are combined by taking the mean or the maximum, yielding the pixel value of the corresponding pixel in the aggregated predicted congestion map. Similarly, the directional aggregation operator may also be an operation such as taking the mean or the maximum, which is not limited in this application. For the specific operation of the directional aggregation operator, reference may be made to the operation of the hierarchical aggregation operator, which is not repeated here.
In summary, the semiconductor chip to be predicted contains at least two prediction layers. The directional aggregation operator may first be used to aggregate the horizontal and vertical predicted congestion maps corresponding to each prediction layer into a reference predicted congestion map for that layer, after which the hierarchical aggregation operator aggregates the reference predicted congestion maps of all layers into the predicted congestion map corresponding to the semiconductor chip to be predicted. Alternatively, the hierarchical aggregation operator may first aggregate the horizontal predicted congestion maps of all prediction layers into a reference horizontal predicted congestion map and the vertical predicted congestion maps into a reference vertical predicted congestion map, after which the directional aggregation operator aggregates the two reference maps into the predicted congestion map corresponding to the semiconductor chip to be predicted. Other aggregation orders using the hierarchical aggregation operator and the directional aggregation operator may also be adopted to aggregate the predicted congestion maps corresponding to the prediction layers, which is not limited in this application.
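The two aggregation orders above can be sketched with a single element-wise operator, since the text allows both the hierarchical (across layers) and directional (horizontal/vertical) operators to be mean or maximum. The per-layer maps below are illustrative values, not data from the embodiment.

```python
import numpy as np

def aggregate(maps, op="max"):
    """Element-wise aggregation of same-shape congestion maps; serves as
    both the hierarchical and the directional aggregation operator."""
    stack = np.stack(maps)
    return stack.mean(axis=0) if op == "mean" else stack.max(axis=0)

# Illustrative horizontal/vertical maps for two prediction layers.
h_maps = [np.array([[0.1, 0.5]]), np.array([[0.3, 0.2]])]
v_maps = [np.array([[0.4, 0.1]]), np.array([[0.2, 0.6]])]

# Order 1: hierarchical first (across layers), then directional (H vs V).
chip_a = aggregate([aggregate(h_maps), aggregate(v_maps)])
# Order 2: directional first (per layer), then hierarchical.
chip_b = aggregate([aggregate([h, v]) for h, v in zip(h_maps, v_maps)])
```

With the maximum operator the two orders coincide, which is consistent with the text leaving the aggregation order open.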
In a feasible implementation, the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
In a feasible implementation, the training prediction layers contained in each training semiconductor chip are obtained by partitioning based on the manufacturing process or functional module distribution of the metal layers in each training semiconductor chip.
In a feasible implementation, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer, where, among the M first training feature maps, the first training feature map describing any given chip feature is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
In a feasible implementation, during each training iteration of the congestion prediction model, the congestion prediction model is updated using the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer, where the predicted congestion map corresponding to that training prediction layer is obtained by inputting its M first training feature maps into the congestion prediction model.
In a feasible implementation, the plurality of training prediction layers are divided into macro-unit layers and non-macro-unit layers, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro-unit layers in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro-unit layers in the data set and the corresponding real congestion maps.
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or the amount of routing resources.
Specifically, for the process of training the congestion prediction model with the training semiconductor chips, reference may be made to the detailed description of the embodiment shown in FIG. 5, which is not repeated here.
Please refer to FIG. 9, which is a schematic flowchart of congestion prediction provided by an embodiment of the present application. As shown in FIG. 9, the congestion prediction process for a semiconductor chip is as follows: the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer; the M first feature maps corresponding to the macro-unit layer and the non-macro-unit layer are determined respectively according to the methods of the foregoing embodiments; the first congestion prediction model processes the M first feature maps corresponding to the macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the macro-unit layer, and the second congestion prediction model processes the M first feature maps corresponding to the non-macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the non-macro-unit layer; finally, the hierarchical aggregation operator and the directional aggregation operator aggregate the horizontal and vertical predicted congestion maps of the macro-unit layer together with the horizontal and vertical predicted congestion maps of the non-macro-unit layer to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
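The FIG. 9 flow can be summarized as a short sketch: two per-layer predictors, directional aggregation of each layer's horizontal/vertical maps, then hierarchical aggregation across the two layers. The stub models and the choice of maximum as both operators are assumptions for illustration only.

```python
import numpy as np

def predict_chip_congestion(macro_feats, nonmacro_feats,
                            model_macro, model_nonmacro):
    """End-to-end flow of FIG. 9: per-layer prediction, then directional
    (H/V) and hierarchical (across layers) aggregation, here via max."""
    h_macro, v_macro = model_macro(macro_feats)      # macro-unit layer
    h_non, v_non = model_nonmacro(nonmacro_feats)    # non-macro-unit layer
    per_layer = [np.maximum(h_macro, v_macro),       # directional aggregation
                 np.maximum(h_non, v_non)]
    return np.maximum.reduce(per_layer)              # hierarchical aggregation

# Stub predictors standing in for the trained first/second models.
stub_macro = lambda f: (np.full((2, 2), 0.3), np.full((2, 2), 0.5))
stub_non = lambda f: (np.full((2, 2), 0.7), np.full((2, 2), 0.2))
chip_map = predict_chip_congestion(None, None, stub_macro, stub_non)
```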
Specifically, for the congestion prediction process of the semiconductor chip to be predicted, reference may be made to the corresponding processes in the embodiments of FIG. 5 and FIG. 7, which are not repeated here.
The methods of the embodiments of the present application have been described in detail above; the apparatuses of the embodiments of the present application are provided below.
Please refer to FIG. 10, which is a schematic structural diagram of a model training apparatus 1000 provided by an embodiment of the present application. The apparatus 1000 may include a layering unit 1010, a determining unit 1020, and a training unit 1030, each of which is described in detail as follows.
The layering unit 1010 is configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer. The determining unit 1020 is configured to determine the M first feature maps corresponding to each prediction layer, where the M first feature maps respectively describe M chip features of each prediction layer and M is a positive integer. The training unit 1030 is configured to add the M first feature maps corresponding to each prediction layer of the K semiconductor chips to a data set, and to train a congestion prediction model with the data set.
In a feasible implementation, the training unit is further configured to: obtain the real congestion map corresponding to each prediction layer based on the K semiconductor chips after global routing; and add the real congestion map corresponding to each prediction layer of the K semiconductor chips to the data set.
In a feasible implementation, the layering unit is specifically configured to divide the plurality of metal layers into at least two prediction layers according to the manufacturing process or functional module distribution of the metal layers in each semiconductor chip.
In a feasible implementation, the determining unit is specifically configured to: obtain the M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps respectively describe the M chip features of each metal layer; and generate the M first feature maps corresponding to each prediction layer based on the M second feature maps of each metal layer, where, among the M first feature maps, the first feature map describing any given chip feature is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each prediction layer.
In a feasible implementation, in terms of training the congestion prediction model with the data set, the training unit is specifically configured to iteratively train the congestion prediction model with the data set, where each training iteration includes: processing the M first feature maps corresponding to any prediction layer in the data set with the congestion prediction model to obtain the predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
In a feasible implementation, the prediction layers contained in each semiconductor chip are a macro-unit layer and a non-macro-unit layer; in terms of training the congestion prediction model with the data set, the training unit is specifically configured to: train the first congestion prediction model with the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps; and train the second congestion prediction model with the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or the amount of routing resources.
It should be noted that the implementation of each unit may also refer to the corresponding descriptions of the method embodiments shown in FIG. 5 and FIG. 7.
Please refer to FIG. 11, which is a schematic structural diagram of an image processing apparatus 1100 provided by an embodiment of the present application. The apparatus 1100 includes a determining unit 1110 and a processing unit 1120.
The determining unit 1110 is configured to determine the M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer. The processing unit 1120 is configured to process the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain the predicted congestion map corresponding to each prediction layer, where the congestion prediction model is obtained through training with a data set; the data set includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data corresponding to each training prediction layer includes M first training feature maps and a real congestion map; the M first training feature maps respectively describe M chip features of each training prediction layer, and the real congestion map describes the real congestion level of each training prediction layer; each training semiconductor chip includes at least two training prediction layers, and each training prediction layer includes at least one metal layer.
In a feasible implementation, the apparatus further includes an aggregation unit configured to aggregate the predicted congestion maps corresponding to all prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
In a feasible implementation, the predicted congestion map corresponding to each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map, and the aggregation unit is specifically configured to: use a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; use the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and use a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or use the directional aggregation operator to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer, and use the hierarchical aggregation operator to aggregate the reference predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
In a feasible implementation, the training prediction layers contained in each training semiconductor chip are obtained by partitioning based on the manufacturing process or functional module distribution of the metal layers in each training semiconductor chip.
In a feasible implementation, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer, where, among the M first training feature maps, the first training feature map describing any given chip feature is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
In a feasible implementation, during each training iteration of the congestion prediction model, the congestion prediction model is updated using the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer, where the predicted congestion map corresponding to that training prediction layer is obtained by inputting its M first training feature maps into the congestion prediction model.
In a feasible implementation, the plurality of training prediction layers are divided into macro-unit layers and non-macro-unit layers, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro-unit layers in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro-unit layers in the data set and the corresponding real congestion maps.
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or the amount of routing resources.
Specifically, the image processing apparatus 1100 may be used to perform the corresponding steps of the image processing method 700 described in FIG. 7, which are not repeated here.
Please refer to FIG. 12, which is a schematic diagram of the hardware structure of a model training apparatus 1200 provided by an embodiment of the present application. The model training apparatus 1200 shown in FIG. 12 (which may specifically be a computer device) includes a memory 1201, a processor 1202, a communication interface 1203, and a bus 1204, where the memory 1201, the processor 1202, and the communication interface 1203 are communicatively connected to each other through the bus 1204.
The memory 1201 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1201 may store a program; when the program stored in the memory 1201 is executed by the processor 1202, the processor 1202 and the communication interface 1203 are used to perform the steps of the congestion prediction model training method of the embodiments of the present application.
The processor 1202 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, used to execute related programs so as to implement the functions required by the units in the congestion prediction model training apparatus of the embodiments of the present application, or to perform the congestion prediction model training method of the method embodiments of the present application.
The processor 1202 may also be an integrated circuit chip with signal processing capability. In an implementation, the steps of the congestion prediction model training method of the present application may be completed by integrated logic circuits in hardware or by instructions in the form of software in the processor 1202. The processor 1202 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in connection with the embodiments of the present application may be directly embodied as being performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1201; the processor 1202 reads the information in the memory 1201 and, in combination with its hardware, completes the functions required by the units included in the congestion prediction model training apparatus of the embodiments of the present application, or performs the congestion prediction model training method of the method embodiments of the present application.
The communication interface 1203 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the apparatus 1200 and other devices or communication networks. For example, training data may be obtained through the communication interface 1203.
The bus 1204 may include a path for transferring information between the components of the apparatus 1200 (for example, the memory 1201, the processor 1202, and the communication interface 1203).
Referring to FIG. 13, FIG. 13 is a schematic diagram of the hardware structure of an image processing apparatus 1300 provided by an embodiment of the present application. The image processing apparatus 1300 may be a computer, a mobile phone, a tablet computer, or another possible terminal device, which is not limited in the present application. The image processing apparatus 1300 shown in FIG. 13 (the apparatus 1300 may specifically be a computer device) includes a memory 1301, a processor 1302, a communication interface 1303, and a bus 1304, where the memory 1301, the processor 1302, and the communication interface 1303 are communicatively connected to one another through the bus 1304.
The memory 1301 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1301 may store a program; when the program stored in the memory 1301 is executed by the processor 1302, the processor 1302 and the communication interface 1303 are configured to perform the steps of the image processing method of the embodiments of the present application.
The processor 1302 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs so as to implement the functions to be performed by the units in the image processing apparatus of the embodiments of the present application, or to execute the image processing method of the method embodiments of the present application.
The processor 1302 may also be an integrated circuit chip with signal processing capabilities. In an implementation process, each step of the image processing method of the present application may be completed by an integrated logic circuit of hardware in the processor 1302 or by instructions in the form of software. The processor 1302 may further be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1301; the processor 1302 reads the information in the memory 1301 and, in combination with its hardware, completes the functions to be performed by the units included in the image processing apparatus of the embodiments of the present application, or executes the image processing method of the method embodiments of the present application.
The communication interface 1303 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the apparatus 1300 and other devices or communication networks. For example, training data may be obtained through the communication interface 1303.
The bus 1304 may include a path for transferring information between the components of the apparatus 1300 (for example, the memory 1301, the processor 1302, and the communication interface 1303).
It should be noted that although the apparatus 1200 and the apparatus 1300 shown in FIG. 12 and FIG. 13 show only a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the apparatus 1200 and the apparatus 1300 further include other components necessary for normal operation. Meanwhile, according to specific needs, those skilled in the art should understand that the apparatus 1200 and the apparatus 1300 may further include hardware components implementing other additional functions. In addition, those skilled in the art should understand that the apparatus 1200 and the apparatus 1300 may alternatively include only the components necessary for implementing the embodiments of the present application, rather than all the components shown in FIG. 12 or FIG. 13.
It can be understood that the apparatus 1200 corresponds to the training device 120 in FIG. 1, and the apparatus 1300 corresponds to the execution device 110 in FIG. 1. Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as going beyond the scope of the present application.
An embodiment of the present application further provides a chip system. The chip system includes at least one processor, a memory, and an interface circuit, where the memory, the interface circuit, and the at least one processor are interconnected through lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the methods described in FIG. 5 and/or FIG. 7 are implemented.
An embodiment of the present application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a network device, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
An embodiment of the present application further provides a computer program product; when the computer program product runs on a terminal, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the above functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above descriptions are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (33)

  1. A method for training a congestion prediction model, characterized in that the method comprises:
    dividing a plurality of metal layers into at least two prediction layers, wherein the plurality of metal layers are the metal layers comprised in each of K semiconductor chips, and K is a positive integer;
    determining M first feature maps corresponding to each of the prediction layers, wherein the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer;
    adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and training a congestion prediction model by using the data set.
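Purely as a hedged illustration of the flow recited in claim 1 (grouping metal layers into prediction layers, computing per-layer feature maps, assembling a data set), the following Python sketch may help; the grouping rule, function names, grid size, and zero-valued feature maps are invented placeholders, not the claimed implementation:

```python
import numpy as np

def split_into_prediction_layers(metal_layers, groups):
    # groups is a hypothetical grouping rule, e.g. [[0, 1], [2, 3]] yields two
    # prediction layers; the claim leaves the division criterion open (claim 3
    # mentions manufacturing process or functional module distribution).
    return [[metal_layers[i] for i in idx] for idx in groups]

def feature_maps_for_layer(prediction_layer, m_features, grid=(4, 4)):
    # One grid-shaped map per chip feature (pin density, connection density, ...);
    # zero-valued placeholders stand in for really extracted features.
    return np.stack([np.zeros(grid) for _ in range(m_features)])

def build_dataset(chips, groups, m_features=3):
    dataset = []
    for metal_layers in chips:  # K chips, each a list of metal layers
        for pred_layer in split_into_prediction_layers(metal_layers, groups):
            dataset.append(feature_maps_for_layer(pred_layer, m_features))
    return dataset

chips = [["M1", "M2", "M3", "M4"]] * 2               # K = 2 toy chips
data = build_dataset(chips, groups=[[0, 1], [2, 3]])
print(len(data), data[0].shape)                      # 4 (3, 4, 4)
```

With K = 2 chips and two prediction layers per chip, the data set holds four entries of M = 3 feature maps each.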
  2. The method according to claim 1, characterized in that the method further comprises:
    performing global routing on the K semiconductor chips, and obtaining, according to the K semiconductor chips after global routing, a real congestion map corresponding to each of the prediction layers;
    adding the real congestion map corresponding to each of the prediction layers in the K semiconductor chips to the data set.
  3. The method according to claim 1 or 2, characterized in that the dividing the plurality of metal layers into at least two prediction layers comprises:
    dividing the plurality of metal layers into at least two prediction layers according to a manufacturing process or a functional module distribution of the metal layers in each of the semiconductor chips.
  4. The method according to any one of claims 1 to 3, characterized in that the determining M first feature maps corresponding to each of the prediction layers comprises:
    obtaining M second feature maps corresponding to each metal layer in each of the prediction layers, wherein the M second feature maps are respectively used to describe the M chip features of each metal layer;
    generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each of the prediction layers, wherein a first feature map that describes any chip feature among the M first feature maps is obtained based on the second feature maps that describe said chip feature in each metal layer.
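Claim 4 only requires that the first feature map of a given feature be derived from the per-metal-layer second feature maps of that same feature; as an illustrative sketch, element-wise summation across metal layers is one assumed (not claimed) choice of derivation:

```python
import numpy as np

def first_from_second(second_maps):
    # second_maps: shape (L, M, H, W) -- M second feature maps for each of the
    # L metal layers in one prediction layer.  Summation over the metal-layer
    # axis yields one first feature map per chip feature (an assumed operator).
    return second_maps.sum(axis=0)  # shape (M, H, W)

second = np.ones((3, 2, 4, 4))       # L=3 metal layers, M=2 features, 4x4 grid
first = first_from_second(second)
print(first.shape, first[0, 0, 0])   # (2, 4, 4) 3.0
```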
  5. The method according to any one of claims 2 to 4, characterized in that the real congestion map corresponding to each of the prediction layers comprises a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers comprises a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion map corresponding to each metal layer in each of the prediction layers, and the first vertical real congestion map is obtained based on the second vertical real congestion map corresponding to each metal layer in each of the prediction layers.
  6. The method according to any one of claims 2 to 5, characterized in that the adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set and training a congestion prediction model by using the data set comprises:
    performing iterative training on the congestion prediction model by using the data set, wherein each iteration of training comprises:
    processing, by using the congestion prediction model, the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to said prediction layer;
    updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to said prediction layer.
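The iterate-predict-update cycle of claim 6 can be sketched as follows; the linear per-feature weighting, learning rate, and mean-squared-error update are assumptions for illustration only, since the claim does not fix the model family or loss:

```python
import numpy as np

def train_step(weights, features, true_map, lr=0.1):
    # Predict a congestion map as a weighted sum of the M first feature maps,
    # then update the weights from the prediction error (mean-squared-error
    # gradient, up to a constant factor absorbed into the learning rate).
    pred = np.tensordot(weights, features, axes=1)            # (H, W)
    err = pred - true_map
    grad = np.tensordot(features, err, axes=([1, 2], [0, 1])) / err.size
    return weights - lr * grad, float((err ** 2).mean())

M, H, W = 3, 4, 4
rng = np.random.default_rng(0)
features = rng.random((M, H, W))      # M first feature maps of one prediction layer
true_map = features.sum(axis=0)       # fabricated "real congestion map" target
w, losses = np.zeros(M), []
for _ in range(200):                  # each pass is one iteration of claim 6
    w, loss = train_step(w, features, true_map)
    losses.append(loss)
print(losses[0] > losses[-1])         # True: the loss shrinks over iterations
```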
  7. The method according to any one of claims 2 to 5, characterized in that the prediction layers comprised in each semiconductor chip are respectively a macro cell layer and a non-macro cell layer, and the congestion prediction model comprises a first congestion prediction model and a second congestion prediction model; the adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set and training a congestion prediction model by using the data set comprises:
    training the first congestion prediction model by using the first feature maps corresponding to the macro cell layers in the data set and the corresponding real congestion maps;
    training the second congestion prediction model by using the first feature maps corresponding to the non-macro cell layers in the data set and the corresponding real congestion maps.
  8. The method according to any one of claims 1 to 7, characterized in that the M chip features comprise one or more of pin density, network connection density, module mask, or routing resource amount.
  9. An image processing method, characterized in that the method comprises:
    determining M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, wherein the semiconductor chip to be predicted comprises at least two prediction layers, and M is a positive integer;
    processing, by using a congestion prediction model, the M first feature maps corresponding to each of the prediction layers to obtain a predicted congestion map corresponding to each of the prediction layers;
    wherein the congestion prediction model is obtained after training with a data set, the data set comprises training data respectively corresponding to each training prediction layer in K training semiconductor chips, the training data corresponding to each training prediction layer comprises M first training feature maps and a real congestion map, the M first training feature maps are respectively used to describe M chip features of each training prediction layer, the real congestion map is used to describe the real congestion level of each training prediction layer, each training semiconductor chip comprises at least two training prediction layers, and each training prediction layer comprises at least one metal layer.
  10. The method according to claim 9, characterized in that the method further comprises:
    aggregating the predicted congestion maps corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
  11. The method according to claim 10, characterized in that the predicted congestion map corresponding to each of the prediction layers comprises a vertical predicted congestion map and a horizontal predicted congestion map, and the aggregating the predicted congestion maps corresponding to each of the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted comprises:
    aggregating, by using a hierarchical aggregation operator, the vertical predicted congestion maps corresponding to each of the prediction layers to obtain a reference vertical predicted congestion map; aggregating, by using the hierarchical aggregation operator, the horizontal predicted congestion maps corresponding to each of the prediction layers to obtain a reference horizontal predicted congestion map; and aggregating, by using a directional aggregation operator, the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted;
    or
    aggregating, by using the directional aggregation operator, the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each of the prediction layers to obtain a reference predicted congestion map corresponding to each of the prediction layers; and aggregating, by using the hierarchical aggregation operator, the reference predicted congestion maps corresponding to each of the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
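Both aggregation orderings of claim 11 can be sketched side by side; the element-wise maximum used here for both the hierarchical and the directional operator is an assumed placeholder (the claim does not specify the operators), and with this particular choice the two orderings happen to coincide:

```python
import numpy as np

def hierarchical_agg(maps):
    # Aggregate across prediction layers (element-wise maximum, an assumed operator).
    return np.max(np.stack(maps), axis=0)

def directional_agg(vertical, horizontal):
    # Merge vertical and horizontal maps (element-wise maximum, an assumed operator).
    return np.maximum(vertical, horizontal)

v_maps = [np.array([[0.2, 0.8]]), np.array([[0.5, 0.1]])]  # per-layer vertical maps
h_maps = [np.array([[0.9, 0.3]]), np.array([[0.4, 0.6]])]  # per-layer horizontal maps

# Ordering 1: aggregate layers first, then directions.
chip_a = directional_agg(hierarchical_agg(v_maps), hierarchical_agg(h_maps))
# Ordering 2: aggregate directions first, then layers.
chip_b = hierarchical_agg([directional_agg(v, h) for v, h in zip(v_maps, h_maps)])
print(np.array_equal(chip_a, chip_b))  # True here, since max commutes with max
```

With other operator choices (e.g. summation for one and maximum for the other) the two orderings generally produce different chip-level maps, which is why the claim recites them as alternatives.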
  12. The method according to any one of claims 9 to 11, characterized in that the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
  13. The method according to any one of claims 9 to 12, characterized in that the training prediction layers comprised in each training semiconductor chip are obtained by division based on a manufacturing process or a functional module distribution of the metal layers in each training semiconductor chip.
  14. The method according to any one of claims 9 to 13, characterized in that the M first training feature maps corresponding to each training prediction layer are obtained according to M second feature maps corresponding to each metal layer in each training prediction layer, wherein a first training feature map that describes any chip feature among the M first training feature maps is obtained based on the second feature maps that describe said chip feature in each metal layer.
  15. The method according to any one of claims 9 to 14, characterized in that the real congestion map corresponding to each training prediction layer comprises a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer comprises a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion map corresponding to each metal layer in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion map corresponding to each metal layer in each training prediction layer.
  16. The method according to any one of claims 9 to 15, characterized in that, in each iteration of training of the congestion prediction model, the congestion prediction model is updated by using a predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to said training prediction layer, wherein the predicted congestion map corresponding to said training prediction layer is obtained by inputting the M first training feature maps corresponding to said training prediction layer into the congestion prediction model.
  17. The method according to any one of claims 9 to 16, characterized in that the training prediction layers are divided into macro cell layers and non-macro cell layers, and the congestion prediction model comprises a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro cell layers in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro cell layers in the data set and the corresponding real congestion maps.
  18. The method according to any one of claims 9 to 17, characterized in that the M chip features comprise one or more of pin density, network connection density, module mask, or routing resource amount.
  19. An apparatus for training a congestion prediction model, characterized in that the apparatus comprises:
    a layering unit, configured to divide a plurality of metal layers into at least two prediction layers, wherein the plurality of metal layers are the metal layers comprised in each of K semiconductor chips, and K is a positive integer;
    a determining unit, configured to determine M first feature maps corresponding to each of the prediction layers, wherein the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer;
    a training unit, configured to add the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and train a congestion prediction model by using the data set.
  20. The apparatus according to claim 19, characterized in that the training unit is further configured to:
    obtain, based on the K semiconductor chips after global routing, a real congestion map corresponding to each of the prediction layers;
    add the real congestion map corresponding to each of the prediction layers in the K semiconductor chips to the data set.
  21. The apparatus according to claim 19 or 20, characterized in that the layering unit is specifically configured to:
    divide the plurality of metal layers into at least two prediction layers according to a manufacturing process or a functional module distribution of the metal layers in each of the semiconductor chips.
  22. The apparatus according to any one of claims 19 to 21, characterized in that the determining unit is specifically configured to:
    obtain M second feature maps corresponding to each metal layer in each of the prediction layers, wherein the M second feature maps are respectively used to describe the M chip features of each metal layer;
    generate, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each of the prediction layers, wherein a first feature map that describes any chip feature among the M first feature maps is obtained based on the second feature maps that describe said chip feature in each metal layer.
  23. The apparatus according to any one of claims 20 to 22, characterized in that the real congestion map corresponding to each of the prediction layers comprises a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers comprises a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion map corresponding to each metal layer in each of the prediction layers, and the first vertical real congestion map is obtained based on the second vertical real congestion map corresponding to each metal layer in each of the prediction layers.
  24. The apparatus according to any one of claims 20 to 23, characterized in that, in the aspect of training the congestion prediction model by using the data set, the training unit is specifically configured to:
    perform iterative training on the congestion prediction model by using the data set, wherein each iteration of training comprises:
    processing, by using the congestion prediction model, the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to said prediction layer;
    updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to said prediction layer.
  25. The apparatus according to any one of claims 20-23, wherein the prediction layers comprised in each semiconductor chip are a macro-cell layer and a non-macro-cell layer, respectively; in the aspect of training the congestion prediction model with the data set, the training unit is specifically configured to:
    train the first congestion prediction model with the first feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion map; and
    train the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion map.
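The split training of claim 25 amounts to routing each sample to the model matching its layer type. A sketch; the `is_macro` sample key and the callback shape are hypothetical names, not taken from the claims:

```python
def train_per_layer_type(dataset, macro_step, non_macro_step):
    """Dispatch each training sample to the model for its layer type:
    macro-cell layers feed the first congestion prediction model,
    non-macro-cell layers feed the second one."""
    for sample in dataset:
        step = macro_step if sample["is_macro"] else non_macro_step
        step(sample["feature_maps"], sample["real_congestion"])

# Demo: count how many samples reach each (stubbed-out) model.
calls = {"macro": 0, "non_macro": 0}

def counting_step(kind):
    def step(feature_maps, real_congestion):
        calls[kind] += 1   # a real step would run one training iteration
    return step

dataset = [
    {"is_macro": True,  "feature_maps": None, "real_congestion": None},
    {"is_macro": False, "feature_maps": None, "real_congestion": None},
    {"is_macro": False, "feature_maps": None, "real_congestion": None},
]
train_per_layer_type(dataset, counting_step("macro"), counting_step("non_macro"))
print(calls)   # {'macro': 1, 'non_macro': 2}
```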
  26. The apparatus according to any one of claims 19-25, wherein the M chip features comprise one or more of pin density, net connection density, module mask, or routing resource amount.
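Of the listed chip features, pin density is the most self-contained to illustrate: rasterize the pin coordinates onto the grid and count pins per cell. A sketch; the normalisation to [0, 1] is an assumption, not specified by the claims:

```python
import numpy as np

def pin_density_map(pin_xy, chip_w, chip_h, grid=(4, 4)):
    """Rasterize pin coordinates into a per-grid pin-density feature map
    (one of the M chip features). pin_xy: (N, 2) array of pin positions."""
    pin_xy = np.asarray(pin_xy, dtype=float)
    hist, _, _ = np.histogram2d(
        pin_xy[:, 0], pin_xy[:, 1],
        bins=list(grid), range=[[0, chip_w], [0, chip_h]])
    return hist / max(hist.max(), 1.0)   # normalise to [0, 1] (assumed)

# Three pins on a 4x4 chip rasterized into a 4x4 grid:
pins = [(0.5, 0.5), (0.6, 0.4), (3.5, 3.5)]
dm = pin_density_map(pins, 4.0, 4.0, grid=(4, 4))
print(dm[0, 0], dm[3, 3])   # 1.0 0.5
```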
  27. An image processing apparatus, wherein the apparatus comprises:
    a determining unit, configured to determine M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted; wherein the semiconductor chip to be predicted comprises at least two prediction layers, and M is a positive integer; and
    a processing unit, configured to process, with a congestion prediction model, the M first feature maps corresponding to each prediction layer to obtain a predicted congestion map corresponding to each prediction layer;
    wherein the congestion prediction model is obtained through training on a data set, the data set comprises training data respectively corresponding to the training prediction layers comprised in each of K training semiconductor chips, the training data corresponding to each training prediction layer comprises M first training feature maps and a real congestion map, the M first training feature maps respectively describe M chip features of each training prediction layer, the real congestion map describes the real congestion level of each training prediction layer, each training semiconductor chip comprises at least two training prediction layers, and each training prediction layer comprises at least one metal layer.
  28. The apparatus according to claim 27, wherein the apparatus further comprises:
    an aggregation unit, configured to aggregate the predicted congestion maps corresponding to all prediction layers in the semiconductor chip to be predicted to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
  29. The apparatus according to claim 28, wherein the predicted congestion map corresponding to each prediction layer comprises a vertical predicted congestion map and a horizontal predicted congestion map, and the aggregation unit is specifically configured to:
    aggregate, with a hierarchical aggregation operator, the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; aggregate, with the hierarchical aggregation operator, the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and aggregate, with a directional aggregation operator, the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted;
    or,
    aggregate, with the directional aggregation operator, the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer; and aggregate, with the hierarchical aggregation operator, the reference predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
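Claim 29 leaves both operators abstract. A sketch assuming each is a per-grid maximum (an assumption, not the claimed definition); with that choice, the two claimed orderings (aggregate across layers first, or across directions first) coincide, which the demo checks:

```python
import numpy as np

def hierarchical_agg(layer_maps):
    """Hierarchical aggregation operator (assumed: per-grid maximum
    across the prediction layers, axis 0)."""
    return np.max(layer_maps, axis=0)

def directional_agg(h_map, v_map):
    """Directional aggregation operator (assumed: per-grid maximum of
    the horizontal and vertical congestion maps)."""
    return np.maximum(h_map, v_map)

# Two prediction layers on a 1x2 grid, per-direction predicted congestion.
h_layers = np.array([[[0.2, 0.8]], [[0.4, 0.1]]])
v_layers = np.array([[[0.9, 0.3]], [[0.5, 0.2]]])

# First alternative: layers per direction, then directions.
route_a = directional_agg(hierarchical_agg(h_layers),
                          hierarchical_agg(v_layers))

# Second alternative: directions per layer, then layers.
per_layer = np.array([directional_agg(h, v)
                      for h, v in zip(h_layers, v_layers)])
route_b = hierarchical_agg(per_layer)

print(np.allclose(route_a, route_b))   # True: max commutes, so both orders agree
```

With maximum for both operators the two orderings are interchangeable; other choices (e.g. sum for one operator) would make the order matter.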
  30. A chip system, wherein the chip system comprises at least one processor, a memory, and an interface circuit, the memory, the interface circuit, and the at least one processor are interconnected through lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the method according to any one of claims 1-18 is implemented.
  31. A terminal device, wherein the terminal device comprises the chip system according to claim 30, and a discrete device coupled to the chip system.
  32. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, and when the program instructions are run on a processor, the method according to any one of claims 1-18 is implemented.
  33. A computer program product, wherein, when the computer program product is run on a terminal, the method according to any one of claims 1-18 is implemented.
PCT/CN2021/101860 2021-06-23 2021-06-23 Congestion prediction model training method, image processing method and apparatus WO2022266888A1 (en)

Priority Applications (2)

PCT/CN2021/101860 (WO2022266888A1, filed 2021-06-23): Congestion prediction model training method, image processing method and apparatus
CN202180099695.1A (CN117561515A, filed 2021-06-23): Congestion prediction model training method, image processing method and device


Publications (1)

Publication Number: WO2022266888A1 (A1)
Publication Date: 2022-12-29

Family

ID=84545031

Country Status (2)

CN: CN117561515A
WO: WO2022266888A1

Cited By (1)

CN117787171A * (苏州异格技术有限公司; priority date 2023-12-25; publication date 2024-03-29): Multi-discriminant-based FPGA congestion prediction method and device for CGAN image conversion (* cited by examiner)

Citations (5)

(* cited by examiner)

CN104636546A * (武汉理工大学; priority 2015-01-23; published 2015-05-20): Analog circuit wiring assessment method based on non-uniform grid
US20200257770A1 * (International Business Machines Corporation; priority 2019-02-13; published 2020-08-13): Predicting routability of interconnects
CN112233115A * (西安国微半导体有限公司; priority 2020-12-11; published 2021-01-15): Deep learning-based wiring violation prediction method after placement and readable storage medium
US20210081509A1 * (Taiwan Semiconductor Manufacturing Company Limited; priority 2019-09-16; published 2021-03-18): Integrated circuit layout validation using machine learning
CN112711930A * (西安国微半导体有限公司; priority 2020-12-24; published 2021-04-27): Wire mesh distribution-based routability-driven global layout method and device



Also Published As

CN117561515A (published 2024-02-13)


Legal Events

121: Ep: the EPO has been informed by WIPO that EP was designated in this application (ref document number 21946388; country of ref document: EP; kind code of ref document: A1)
WWE: WIPO information: entry into national phase (ref document number 202180099695.1; country of ref document: CN)
NENP: Non-entry into the national phase (ref country code: DE)