WO2022266888A1 - Congestion prediction model training method, image processing method and apparatus - Google Patents

Congestion prediction model training method, image processing method and apparatus

Info

Publication number
WO2022266888A1
WO2022266888A1 · PCT/CN2021/101860 · CN2021101860W
Authority
WO
WIPO (PCT)
Prior art keywords
prediction
congestion
training
layer
layers
Application number
PCT/CN2021/101860
Other languages
French (fr)
Chinese (zh)
Inventor
李栋 (Li Dong)
王超 (Wang Chao)
张锐 (Zhang Rui)
刘武龙 (Liu Wulong)
黄宇 (Huang Yu)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to PCT/CN2021/101860 (WO2022266888A1)
Priority to CN202180099695.1A (CN117561515A)
Publication of WO2022266888A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 — Computer-aided design [CAD]
    • G06F30/30 — Circuit design
    • G06F30/39 — Circuit design at the physical level
    • G06F30/394 — Routing
    • G06F30/3947 — Routing global

Definitions

  • The present application relates to the technical field of Electronic Design Automation (EDA), and in particular to a congestion prediction model training method, an image processing method and a device.
  • EDA: Electronic Design Automation
  • Congestion Prediction
  • The goal of congestion prediction is to estimate, during global placement (Global Placement, GP), the routing congestion level of the chip from the current cell (Cell) placement positions, thereby providing an optimization basis for global placement: the placer (Placer) can spread the cells in heavily congested areas to reduce the local layout congestion and thus the overall congestion of the chip. In essence, congestion prediction estimates, for each grid (Grid) on the rasterized chip, the difference between the routing track demand (Routing Track Demand) and the given routing track capacity (Routing Track Capacity).
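  • The per-grid congestion described above is an elementwise difference. The following minimal sketch (demand and capacity values invented for a hypothetical 4×4 rasterized chip) illustrates the computation:

```python
import numpy as np

# Hypothetical 4x4 rasterized chip: per-grid routing track demand and capacity.
demand = np.array([[3, 5, 2, 1],
                   [6, 8, 4, 2],
                   [2, 3, 1, 0],
                   [1, 2, 0, 0]])
capacity = np.full((4, 4), 4)  # uniform routing track capacity per grid

# Congestion is the per-grid difference between demand and capacity;
# positive values mark overflowed (congested) grids the placer should relieve.
congestion = demand - capacity
overflow = np.clip(congestion, 0, None)

print(congestion)
print("total overflow:", overflow.sum())
```

Grids with positive congestion are the ones a placer would target by spreading cells.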
  • Existing congestion prediction methods have the following limitations: low prediction accuracy, long computation time, and difficulty in achieving both accuracy and speed at once.
  • The embodiments of the present application disclose a congestion prediction model training method, an image processing method and a device, which can reduce the time consumed by congestion prediction while improving the accuracy of semiconductor chip congestion prediction.
  • The embodiment of the present application discloses a congestion prediction model training method. The method includes: dividing a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; determining M first feature maps corresponding to each prediction layer, where the M first feature maps respectively describe M chip features of the prediction layer and M is a positive integer; and adding the M first feature maps corresponding to each prediction layer in the K semiconductor chips to a data set and using the data set to train a congestion prediction model.
  • In the embodiments of the present application, the metal layers in each semiconductor chip can be divided into at least two prediction layers in different ways, so that some or all of the feature data of the different metal layers within the same prediction layer exhibit relatively strong correlation and consistency; each prediction layer contains at least one metal layer.
  • Each semiconductor chip is divided into at least two prediction layers. Since the feature data (i.e., feature maps) of the metal layers contained in the same prediction layer exhibit strong correlation and consistency, this avoids, on the one hand, the loss of accuracy that arises in the prior art when feature data of metal layers with different trends are mixed together for lack of layering. On the other hand, because the feature data of different prediction layers exhibit different trends, a congestion prediction model trained on the feature data of the different prediction layers can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model has refined identification and prediction capabilities. In summary, the training method of the embodiments of the present application can effectively improve the prediction accuracy and generalization ability of the model.
  • The above method further includes: performing global routing on the K semiconductor chips, obtaining a real congestion map corresponding to each prediction layer from the K semiconductor chips after global routing, and adding the real congestion maps corresponding to the prediction layers of the K semiconductor chips to the data set.
  • The real congestion map of each prediction layer is computed from the K semiconductor chips after global routing, so that it can later be compared with the predicted congestion map of the prediction layer to adjust the parameters of the congestion prediction model and thereby obtain an optimal congestion prediction model.
  • The above-mentioned division of the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules across them.
  • In the embodiments of the present application, the multiple metal layers in each semiconductor chip can be divided into at least two prediction layers according to the manufacturing process of each metal layer and/or the distribution of functional modules on each metal layer.
  • For example, the metal layers in each semiconductor chip can be divided into two prediction layers according to whether they contain macro cells: one prediction layer contains both macro cells and standard cells, while the other contains only standard cells. Alternatively, the plurality of metal layers can be divided into at least two prediction layers according to the differences in the amount of routing resources of each metal layer, such that the metal layers within each prediction layer have comparable amounts of routing resources.
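  • As a loose illustration of the macro-cell criterion, the metal layers of a chip might be partitioned as follows (the layer names, attributes and values are invented for this sketch, not taken from the application):

```python
# Hypothetical per-layer records; "has_macro" marks layers occupied by macro cells.
layers = [
    {"name": "M1", "has_macro": True,  "tracks": 100},
    {"name": "M2", "has_macro": True,  "tracks": 120},
    {"name": "M3", "has_macro": False, "tracks": 200},
    {"name": "M4", "has_macro": False, "tracks": 210},
]

# Prediction layer 1: metal layers containing macro cells (and standard cells).
macro_layer = [l["name"] for l in layers if l["has_macro"]]
# Prediction layer 2: metal layers containing only standard cells.
non_macro_layer = [l["name"] for l in layers if not l["has_macro"]]

print(macro_layer, non_macro_layer)
```

The same record structure could instead be grouped on the `tracks` field to realize the routing-resource criterion.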
  • The resulting model can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model has refined identification and prediction capabilities.
  • Determining the M first feature maps corresponding to each prediction layer includes: obtaining M second feature maps corresponding to each metal layer in the prediction layer, where the M second feature maps respectively describe M chip features of that metal layer; and generating the M first feature maps of the prediction layer from the M second feature maps of each of its metal layers, where the first feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • Obtaining the first feature map describing a given chip feature from the second feature maps describing that feature in each metal layer may include: computing, for corresponding pixels of the second feature maps describing the same chip feature in each metal layer, a weighted average, maximum value, minimum value or similar statistic, and taking the result as the pixel value of the corresponding pixel in the first feature map describing that chip feature.
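  • A minimal sketch of this per-pixel aggregation, assuming the second feature maps of one chip feature (e.g. pin density) are stacked into a numpy array; the map values and weights are purely illustrative:

```python
import numpy as np

# Second feature maps for one chip feature on the three metal layers of one
# prediction layer, shape (num_metal_layers, H, W).
layer_maps = np.array([
    [[0.2, 0.4], [0.6, 0.8]],   # metal layer 1
    [[0.4, 0.4], [0.2, 1.0]],   # metal layer 2
    [[0.0, 0.1], [0.1, 0.3]],   # metal layer 3
])

# First feature map of the prediction layer: per-pixel weighted average
# across the metal layers (illustrative weights); max is an alternative.
weights = np.array([0.5, 0.3, 0.2])
first_avg = np.tensordot(weights, layer_maps, axes=1)
first_max = layer_maps.max(axis=0)

print(first_avg)
print(first_max)
```

Either `first_avg` or `first_max` could serve as the first feature map, depending on which statistic best preserves the feature's trend across the layers.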
  • Each metal layer corresponds to M second feature maps, which respectively describe M chip features of that metal layer.
  • The first feature map describing a given chip feature, obtained by weighted averaging, taking the maximum value or similar methods, can accurately represent that chip feature across the metal layers; that is, the first feature map has good correlation and consistency with the second feature maps describing the same chip feature in each metal layer. This avoids the situation where, because the metal layers in the same prediction layer differ greatly from one another, the resulting first feature map would differ markedly from the second feature maps describing the same chip feature in each metal layer.
  • Through the above layering method and the above way of determining the first feature maps, first feature maps that accurately reflect the chip features of the metal layers in each prediction layer can be obtained. Therefore, after the congestion prediction model is trained with the first feature maps corresponding to the different prediction layers, the resulting model can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model trained according to the embodiments of the present application has refined identification and prediction capabilities.
  • The real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the prediction layer, and the first vertical real congestion map is obtained from the second vertical real congestion maps of those metal layers.
  • Obtaining the first horizontal real congestion map of a prediction layer from the second horizontal real congestion maps of its metal layers may include: computing the mean, a weighted average, the maximum value or a similar statistic over the second horizontal real congestion maps of the metal layers. Those skilled in the art may also derive the first horizontal real congestion map from the second horizontal real congestion maps in other ways, which is not limited in this application. The first vertical real congestion map of a prediction layer is obtained in the same way as the first horizontal real congestion map, which is not repeated here.
  • Since the real congestion maps of the metal layers within a prediction layer have strong correlation and consistency, the real congestion map of the prediction layer obtained from them is well consistent with the real congestion map of each of its metal layers; that is, it accurately reflects the congestion level of the prediction layer. This ensures the prediction accuracy of a congestion prediction model trained with the real congestion maps of the prediction layers.
  • Adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and using the data set to train the congestion prediction model includes: iteratively training the congestion prediction model with the data set, where each training iteration includes inputting the M first feature maps corresponding to any prediction layer in the data set into the congestion prediction model to obtain the predicted congestion map of that prediction layer, and updating the congestion prediction model based on the predicted congestion map and the real congestion map of that prediction layer.
  • In the embodiments of the present application, the congestion prediction model is obtained through multiple rounds of training. In each training iteration, the model input is the feature data of one prediction layer (its M first feature maps). Since the feature data of each prediction layer accurately reflect the corresponding features of its metal layers, and the feature data of different prediction layers differ considerably, the prediction model can perform refined congestion prediction on the feature data of the different prediction layers and output predicted congestion maps that accurately reflect the congestion level of each prediction layer. The model parameters are then updated from the predicted and real congestion maps of each prediction layer, so that the trained model has high prediction accuracy and strong generalization ability.
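  • The iteration described above can be sketched with a toy stand-in for the model: a per-pixel linear combination of the M first feature maps, fitted by gradient descent on synthetic data. A real implementation would use a neural network (e.g. a CNN); all shapes, weights and data here are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "congestion prediction model": predicted map = sum_m w[m] * feature_map[m].
M, H, W = 3, 8, 8
true_w = np.array([0.7, -0.2, 0.5])   # hidden weights used to fabricate data

# Synthetic data set: (M first feature maps, real congestion map) per layer.
dataset = []
for _ in range(16):
    feats = rng.normal(size=(M, H, W))
    real = np.tensordot(true_w, feats, axes=1)   # stand-in real congestion map
    dataset.append((feats, real))

w = np.zeros(M)
lr = 0.05
for epoch in range(200):                          # iterative training
    for feats, real in dataset:
        pred = np.tensordot(w, feats, axes=1)     # predicted congestion map
        err = pred - real
        # Gradient of the mean squared error w.r.t. w, averaged over pixels.
        grad = np.tensordot(feats, err, axes=([1, 2], [0, 1])) / (H * W)
        w -= lr * grad                            # update from pred vs. real map

print(np.round(w, 3))
```

The fitted weights recover the hidden ones, mirroring how per-iteration comparison of predicted and real congestion maps drives the parameter update.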
  • The prediction layers contained in each of the above semiconductor chips are a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. Adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and using the data set to train the congestion prediction model includes: training the first congestion prediction model with the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and training the second congestion prediction model with the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • In the embodiments of the present application, the first congestion prediction model can be trained with the feature data corresponding to the macro-unit layer and the second congestion prediction model with the feature data corresponding to the non-macro-unit layer, yielding one model that predicts accurately from macro-unit-layer features and another that predicts accurately from non-macro-unit-layer features, i.e., improving the prediction accuracy of the models. In addition, using the two models to make predictions for the two prediction layers can improve the speed of congestion prediction for the same semiconductor chip.
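  • The two-model arrangement can be pictured as a simple dispatch; the model functions below are placeholders (in practice each would be a separately trained network), and the layer-kind labels are invented for the sketch:

```python
# Placeholder stand-ins for the two trained congestion prediction models.
def first_model(feature_maps):       # trained on macro-unit-layer data
    return {"model": "first", "n_features": len(feature_maps)}

def second_model(feature_maps):      # trained on non-macro-unit-layer data
    return {"model": "second", "n_features": len(feature_maps)}

def predict(prediction_layer_kind, feature_maps):
    """Route a prediction layer's feature maps to the matching model."""
    model = first_model if prediction_layer_kind == "macro" else second_model
    return model(feature_maps)

print(predict("macro", ["pin_density", "net_density"]))
print(predict("non_macro", ["pin_density"]))
```

Because the two calls are independent, the macro and non-macro layers of one chip can also be predicted concurrently, which is where the speed benefit comes from.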
  • The above M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
  • The M chip features corresponding to each metal layer and prediction layer may also include chip features other than the four listed above, which is not limited in the present application.
  • The above M chip features may include one or more of pin density, net connection density, module mask, or amount of routing resources, which reflect the functions of the chip and the characteristics of the on-chip devices. The first feature maps of the chip features corresponding to each prediction layer are obtained, and the prediction model is trained on the M first feature maps that accurately reflect the chip's functions and on-chip device characteristics, thereby improving the prediction accuracy of the trained model.
  • The embodiment of the present application discloses an image processing method. The method includes: determining M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, where the chip to be predicted includes at least two prediction layers and M is a positive integer; and processing the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain the predicted congestion map of each prediction layer. The congestion prediction model is obtained by training on a data set that includes training data corresponding to the training prediction layers contained in each of a plurality of training semiconductor chips; the training data of each training prediction layer includes M first training feature maps, which respectively describe M chip features of the training prediction layer, and a real congestion map, which describes the real congestion level of the training prediction layer. Each training semiconductor chip includes at least two training prediction layers, and each training prediction layer includes at least one metal layer.
  • Since the model obtained through training has better prediction accuracy and generalization ability, a more accurate predicted congestion map can be obtained for each prediction layer of the semiconductor chip to be predicted; that is, the accuracy of the predicted congestion maps is improved, which makes it convenient to use the accurate per-layer congestion predictions during chip production to optimize the chip layout accordingly.
  • The above method further includes: aggregating the predicted congestion maps corresponding to all the prediction layers of the semiconductor chip to be predicted to obtain the predicted congestion map of the chip to be predicted.
  • Since the predicted congestion map of each prediction layer has high accuracy, the predicted congestion map of the chip to be predicted, obtained by aggregating the per-layer predicted congestion maps, also describes the congestion level of the chip with high accuracy, which makes it convenient to use it during chip production to optimize the chip layout.
  • The predicted congestion map of each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map. Aggregating the predicted congestion maps of the prediction layers to obtain the predicted congestion map of the semiconductor chip to be predicted includes: using a hierarchical aggregation operator to aggregate the vertical predicted congestion maps of the prediction layers into a reference vertical predicted congestion map, using the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps of the prediction layers into a reference horizontal predicted congestion map, and using a directional aggregation operator to aggregate the reference vertical and reference horizontal predicted congestion maps into the predicted congestion map of the chip; or, using the directional aggregation operator to aggregate the vertical and horizontal predicted congestion maps of each prediction layer into a reference predicted congestion map for that layer, and then using the hierarchical aggregation operator to aggregate the per-layer reference predicted congestion maps into the predicted congestion map of the chip.
  • In the embodiments of the present application, the hierarchical aggregation operator and the directional aggregation operator can be used to aggregate the predicted congestion maps of the prediction layers into the predicted congestion map of the chip to be predicted.
  • The operations performed by the hierarchical aggregation operator and the directional aggregation operator include, but are not limited to, taking a weighted average, maximum value or minimum value over the predicted congestion maps.
  • Since the predicted congestion map of each prediction layer obtained from the prediction model has high accuracy, the predicted congestion map of the chip to be predicted, obtained after applying the hierarchical and directional aggregation operators whose specific operations are determined by the concrete scenario, also has high accuracy.
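  • As a sketch, taking the maximum as both the hierarchical and the directional aggregation operator (the map values are invented), the two aggregation orders described above can be written as:

```python
import numpy as np

# Hypothetical predicted congestion maps: two prediction layers, each with a
# horizontal and a vertical map of shape (H, W) = (2, 2).
h_maps = np.array([[[0.1, 0.9], [0.3, 0.2]],    # layer 1, horizontal
                   [[0.4, 0.2], [0.8, 0.1]]])   # layer 2, horizontal
v_maps = np.array([[[0.5, 0.1], [0.2, 0.6]],    # layer 1, vertical
                   [[0.0, 0.3], [0.7, 0.2]]])   # layer 2, vertical

layer_agg = lambda maps: maps.max(axis=0)   # hierarchical aggregation (max)
dir_agg = np.maximum                        # directional aggregation (max)

# Order 1: aggregate across layers first, then across directions.
chip_map_1 = dir_agg(layer_agg(h_maps), layer_agg(v_maps))

# Order 2: aggregate across directions per layer first, then across layers.
chip_map_2 = layer_agg(np.maximum(h_maps, v_maps))

print(chip_map_1)
```

With the maximum as both operators the two orders coincide; with weighted averages they generally differ, which is why the operators and the order are chosen according to the concrete scenario.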
  • The real congestion map corresponding to each training prediction layer is obtained from the K training semiconductor chips after global routing.
  • The training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers of the chip according to their manufacturing process or the distribution of functional modules.
  • The M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in the training prediction layer, where the first training feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • The real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the training prediction layer, and the first vertical real congestion map is obtained from their second vertical real congestion maps.
  • The training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. The first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps; the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • The M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
  • The present application discloses a congestion prediction model training device, which includes: a layering unit, configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; a determination unit, configured to determine M first feature maps corresponding to each prediction layer, where the M first feature maps respectively describe M chip features of the prediction layer and M is a positive integer; and a training unit, configured to take the M first feature maps and the real congestion maps corresponding to the prediction layers contained in each of the K semiconductor chips as a data set and to train the congestion prediction model with the data set, where the real congestion map of each prediction layer describes the real congestion level of that prediction layer.
  • The above-mentioned training unit is further configured to: obtain the real congestion map corresponding to each prediction layer from the K semiconductor chips after global routing, and add the real congestion maps corresponding to the prediction layers of the K semiconductor chips to the data set.
  • The above layering unit is specifically configured to: divide the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules.
  • The above determination unit is specifically configured to: obtain M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps respectively describe M chip features of the metal layer; and generate the M first feature maps of each prediction layer from the M second feature maps of each metal layer, where the first feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • The real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the prediction layer, and the first vertical real congestion map is obtained from their second vertical real congestion maps.
  • The above training unit is specifically configured to: iteratively train the congestion prediction model with the data set, where each training iteration includes processing the M first feature maps corresponding to any prediction layer in the data set with the congestion prediction model to obtain the predicted congestion map of that prediction layer, and updating the congestion prediction model based on the predicted congestion map and the real congestion map of that prediction layer.
  • The prediction layers contained in each of the above semiconductor chips are a macro-unit layer and a non-macro-unit layer. In the aspect of training the congestion prediction model with the data set, the above training unit is specifically configured to: train the first congestion prediction model with the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and train the second congestion prediction model with the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • The above M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
  • The present application discloses an image processing device, which includes: a determination unit, configured to determine M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, where the chip to be predicted includes at least two prediction layers and M is a positive integer; and a processing unit, configured to process the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain the predicted congestion map of each prediction layer. The congestion prediction model is obtained by training on a data set that includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data of each training prediction layer includes M first training feature maps, which respectively describe M chip features of the training prediction layer, and a real congestion map, which describes the real congestion level of the training prediction layer. Each training semiconductor chip includes at least two training prediction layers, and each training prediction layer includes at least one metal layer.
  • The above device further includes: an aggregation unit, configured to aggregate the predicted congestion maps corresponding to all the prediction layers of the semiconductor chip to be predicted to obtain the predicted congestion map of the chip to be predicted.
  • The predicted congestion map of each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map. The above aggregation unit is specifically configured to: use a hierarchical aggregation operator to aggregate the vertical predicted congestion maps of the prediction layers into a reference vertical predicted congestion map; use the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps of the prediction layers into a reference horizontal predicted congestion map; and use a directional aggregation operator to aggregate the reference vertical and reference horizontal predicted congestion maps into the predicted congestion map of the semiconductor chip to be predicted; or, use the directional aggregation operator to aggregate the vertical and horizontal predicted congestion maps of each prediction layer into a reference predicted congestion map for that layer, and then use the hierarchical aggregation operator to aggregate the per-layer reference predicted congestion maps into the predicted congestion map of the semiconductor chip to be predicted.
  • The real congestion map corresponding to each training prediction layer is obtained from the K training semiconductor chips after global routing.
  • The training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers of the chip according to their manufacturing process or the distribution of functional modules.
  • The M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in the training prediction layer, where the first training feature map describing any given chip feature is obtained from the second feature maps describing that chip feature in each metal layer.
  • The real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in the training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained from the second horizontal real congestion maps of the metal layers in the training prediction layer, and the first vertical real congestion map is obtained from their second vertical real congestion maps.
• the plurality of training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps; the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • the M chip features include one or more of pin density, network connection density, module mask, or amount of routing resources.
• the present application discloses a chip system, characterized in that the chip system includes at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected through lines, and instructions are stored in the memory; when the instructions are executed by the processor, the method described in any one of the first aspect and/or the second aspect is implemented.
• the present application discloses a terminal device, characterized in that the terminal device includes the chip system described in the third aspect above, and a discrete device coupled to the chip system.
• the present application discloses a computer-readable storage medium, characterized in that the computer-readable storage medium stores program instructions; when the program instructions are run on a processor, the method described in any one of the first aspect and/or the second aspect is implemented.
  • the present application discloses a computer program product, which is characterized in that, when the computer program product is run on a terminal, the method described in any one of the first aspect and/or the second aspect is implemented.
  • FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a network model provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a chip hardware structure provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another system architecture provided by an embodiment of the present application.
• FIG. 5 is a schematic flowchart of a method for training a congestion prediction model provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of hierarchical division of a semiconductor chip provided by an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of an image processing method provided in an embodiment of the present application.
  • Fig. 8 is a schematic diagram of the spatial relationship between a first feature map and a fourth feature map provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a congestion prediction provided by an embodiment of the present application.
  • Fig. 10 is a schematic structural diagram of a model training device provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an image processing device provided by an embodiment of the present application.
  • Fig. 12 is a schematic diagram of the hardware structure of a model training device in the embodiment of the present application.
  • FIG. 13 is a schematic diagram of a hardware structure of an image processing device provided by an embodiment of the present application.
• Embodiments of the present application can be applied to image processing tasks, for example, to congestion prediction (Congestion Prediction) in the physical design stage of chip electronic design automation (Electronic Design Automation, EDA): the congestion degree of a chip is predicted based on the feature maps (feature data) of the semiconductor chip, thereby providing an optimization basis for the global placement, so that the placer (Placer) can push apart the cells in severely congested areas, reducing the layout congestion of those areas and thereby reducing the overall congestion of the chip.
• the images in the embodiments of the present application may be static images (or called static pictures) or dynamic images (or called dynamic pictures); for example, the images in the present application may be videos or dynamic pictures, or may be static images or photographs.
  • the present application collectively refers to static images or dynamic images as images in the following embodiments.
• the congestion prediction described above is only one specific scenario to which the method of the embodiment of the present application is applied; the method of the embodiment of the present application is not limited to this scenario when applied.
  • the method in the embodiment of the present application can also be similarly applied to other fields, for example, speech recognition and natural language processing, etc., which is not limited in the embodiment of the present application.
• the training method of the congestion prediction model provided in the embodiment of the present application involves computer vision processing, and can be specifically applied to data processing methods such as data training, machine learning, and deep learning: symbolic and formalized intelligent information modeling, extraction, preprocessing, and training are performed on the training data (such as the first feature maps in the present application) to finally obtain a trained congestion prediction model. The image processing method provided in the embodiment of the present application can then use the above trained congestion prediction model: the input data (such as the feature maps in this application) are input into the trained congestion prediction model to obtain the output data (such as the predicted congestion map of the chip to be predicted in this application).
• the congestion prediction model training method and the image processing method provided in the embodiments of this application are inventions based on the same idea, and can also be understood as two parts of one system, or two stages of one overall process, such as a model training phase and a model application phase.
  • the embodiment of the present application involves a large number of related applications of neural networks.
• a neural network can be composed of neural units, and a neural unit can refer to an operation unit that takes x_s and an intercept 1 as input; the output of the operation unit can be: h_{W,b}(x) = f(Σ_s W_s · x_s + b)
  • W s is the weight of x s
  • b is the bias of the neuron unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal. The output signal of this activation function can be used as the input of the next convolutional layer.
  • the activation function may be a sigmoid function.
  • a neural network is a network formed by connecting many of the above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
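The single-neural-unit computation described above can be sketched as follows, using the sigmoid activation mentioned in the text; the weights and inputs are illustrative.

```python
import math

def neural_unit(xs, ws, b):
    """Output of a single neural unit: f(sum_s W_s * x_s + b), with sigmoid f
    converting the input signal into an output signal."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation function

# With zero weights and zero bias the unit outputs sigmoid(0) = 0.5:
assert neural_unit([1.0, 2.0], [0.0, 0.0], 0.0) == 0.5
```

The returned activation value can then serve as the input of a unit in the next layer, matching the description above.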
  • a deep neural network also known as a multi-layer neural network
• DNN can be understood as a neural network with many hidden layers; there is no special metric for how many layers count as "many" here.
  • the neural network inside DNN can be divided into three categories: input layer, hidden layer, and output layer.
  • the first layer is the input layer
  • the last layer is the output layer
  • the layers in the middle are all hidden layers.
  • the layers are fully connected, that is, any neuron in the i-th layer must be connected to any neuron in the i+1-th layer.
• the coefficient from the kth neuron of the (L-1)th layer to the jth neuron of the Lth layer is defined as W_jk^L. It should be noted that the input layer has no W parameter.
• more hidden layers make the network more capable of describing complex situations in the real world; theoretically speaking, a model with more parameters has higher complexity and greater "capacity", which means that it can complete more complex learning tasks.
  • Training the deep neural network is the process of learning the weight matrix, and its ultimate goal is to obtain the weight matrix of all layers of the trained deep neural network (the weight matrix formed by the vector W of many layers).
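A minimal sketch of a fully connected forward pass, using the indexing convention described above (a coefficient W_jk links neuron k of one layer to neuron j of the next); all numeric values are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, weight_matrices, biases):
    """Pass input x through fully connected layers; weight_matrices[l][j][k]
    is the coefficient from neuron k of layer l to neuron j of layer l+1."""
    a = x
    for W, b in zip(weight_matrices, biases):
        a = [sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(row, a)) + b_j)
             for row, b_j in zip(W, b)]
    return a

# One hidden layer of two neurons feeding one output neuron (toy values):
W = [[[1.0, -1.0], [0.5, 0.5]], [[1.0, 1.0]]]
b = [[0.0, 0.0], [0.0]]
out = forward([1.0, 2.0], W, b)
assert len(out) == 1 and 0.0 < out[0] < 1.0
```

Training, as stated above, amounts to learning every entry of these weight matrices.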
  • Convolutional neural network (CNN, convolutional neuron network) is a deep neural network with a convolutional structure.
  • a convolutional neural network consists of a feature extractor consisting of a convolutional layer and a subsampling layer.
  • the feature extractor can be seen as a filter, and the convolution process can be seen as using a trainable filter to convolve with an input image or convolutional feature map.
  • the convolutional layer refers to the neuron layer that performs convolution processing on the input signal in the convolutional neural network.
  • a neuron can only be connected to some adjacent neurons.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some rectangularly arranged neural units.
  • Neural units of the same feature plane share weights, and the shared weights here are convolution kernels.
• Shared weights can be understood as a way to extract image information that is independent of position. The underlying principle is that the statistics of one part of an image are the same as those of other parts, which means that image information learned in one part can also be used in another part; therefore, the same learned image information can be used for all positions on the image.
• multiple convolution kernels can be used to extract different image information. Generally, the more convolution kernels there are, the richer the image information reflected by the convolution operation.
  • the convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training process of the convolutional neural network.
  • the direct benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
• the convolutional neural network can use the error back propagation (back propagation, BP) algorithm to correct the parameter values in the initial super-resolution model during training, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller. Specifically, the input signal is passed forward until the output generates an error loss, and the parameters in the initial super-resolution model are updated by back-propagating the error loss information, so that the error loss converges.
  • the backpropagation algorithm is a backpropagation movement dominated by error loss, aiming to obtain the parameters of the optimal super-resolution model, such as the weight matrix.
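The error-driven update described above can be illustrated with a one-weight gradient-descent toy example; this is a sketch of the general principle only, not the specific backpropagation implementation of any model in this application, and all values are illustrative.

```python
# Minimal sketch of error-driven parameter updates in the spirit of
# backpropagation: gradient descent on a squared-error loss for one weight.

def train_single_weight(x, target, w=0.0, lr=0.1, steps=200):
    for _ in range(steps):
        pred = w * x                # forward pass
        error = pred - target       # error-loss signal at the output
        grad = error * x            # d(0.5 * error**2) / dw
        w -= lr * grad              # update toward smaller error loss
    return w

w = train_single_weight(x=2.0, target=6.0)
assert abs(w * 2.0 - 6.0) < 1e-6   # the error loss has converged
```

In a real network the same error signal is propagated backward through every layer to update all weight matrices at once.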
  • the pixel value of the image can be a red-green-blue (RGB) color value, and the pixel value can be a long integer representing the color.
• the pixel value is 256*Red+100*Green+76*Blue, where Blue represents the blue component, Green the green component, and Red the red component. In each color component, the smaller the value, the lower the brightness; the larger the value, the higher the brightness.
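The long-integer pixel value given above can be computed directly; the coefficients are taken verbatim from the text.

```python
def pixel_value(red, green, blue):
    """Long-integer pixel value as given in the text: 256*Red + 100*Green + 76*Blue."""
    return 256 * red + 100 * green + 76 * blue

assert pixel_value(1, 1, 1) == 432   # 256 + 100 + 76
assert pixel_value(0, 0, 0) == 0     # all components dark
```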
  • the pixel values may be grayscale values.
  • FIG. 1 is a schematic structural diagram of a system architecture 100 provided by an embodiment of the present application.
  • the data collection device 160 is used to collect training data.
  • the training data includes first feature maps and real congestion maps corresponding to all prediction layers.
• after collecting the training data, the data collection device 160 stores the training data in the database 130, and the training device 120 trains the target model 101 (i.e., the congestion prediction model in the embodiment of the present application) based on the training data maintained in the database 130.
• the target model 101 can be used to implement the image processing method provided by the embodiment of the present application, that is, the M first feature maps corresponding to each prediction layer of the chip to be predicted are input to the target model 101 after preprocessing, to obtain the predicted congestion map corresponding to each prediction layer.
  • the target model 101 in the embodiment of the present application may specifically be a congestion prediction model.
  • the congestion prediction model is obtained through at least one training.
  • the training data maintained in the database 130 may not all be collected by the data collection device 160, but may also be received from other devices.
• the training device 120 does not necessarily train the target model 101 entirely based on the training data maintained by the database 130; it is also possible to obtain training data from the cloud or elsewhere for model training, and the above description should not be taken as a limitation on the embodiments of this application.
• the target model 101 trained by the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. 1; the execution device may be a terminal, or may be a server or a cloud device.
• the execution device 110 is equipped with an input/output (input/output, I/O) interface 112 for data interaction with external devices, and a user can input data to the I/O interface 112 through the client device 140.
  • the input data may include a first feature map corresponding to each prediction layer of the chip to be predicted.
• when the execution device 110 preprocesses the input data, or when the calculation module 111 of the execution device 110 performs calculation and other related processing, the execution device 110 can call the data, codes, etc. in the data storage system 150 for corresponding processing, and the correspondingly processed data and instructions may also be stored in the data storage system 150.
  • the I/O interface 112 returns the processing result, such as the predicted congestion map of each prediction layer of the chip to be predicted (or the corresponding predicted congestion map of the chip to be predicted) obtained above, to the client device 140 to provide to the user.
  • the training device 120 can generate corresponding target models 101 based on different training data for different goals or different tasks, and the corresponding target models 101 can be used to achieve the above-mentioned goals or complete the above-mentioned tasks, thereby Provide the user with the desired result.
  • the user can manually specify the input data, and the manual specification can be operated through the interface provided by the I/O interface 112 .
• the client device 140 can automatically send the input data to the I/O interface 112. If the client device 140 is required to obtain the user's authorization before automatically sending the input data, the user can set the corresponding permission in the client device 140.
  • the user can view the results output by the execution device 110 on the client device 140, and the specific presentation form may be specific ways such as display, sound, and action.
  • the client device 140 can also be used as a data collection terminal, collecting the input data input to the I/O interface 112 as shown in the figure and the output results of the output I/O interface 112 as new sample data, and storing them in the database 130 .
• the client device 140 may not be used for collection; instead, the I/O interface 112 directly takes the input data input to the I/O interface 112 as shown in the figure and the output results of the I/O interface 112 as new sample data, and stores them in the database 130.
• FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation; for example, in FIG. 1, the data storage system 150 is an external memory relative to the execution device 110, while in other cases, the data storage system 150 may also be placed in the execution device 110.
  • the target model 101 is obtained by training according to the training device 120.
• the target model 101 can be obtained by training based on the training method of the congestion prediction model in the embodiment of the present application; specifically, the congestion prediction model provided in the embodiment of the present application can be a convolutional neural network, a generative adversarial network, a variational autoencoder, or a semantic segmentation neural network, which is not specifically limited in this solution.
  • the convolutional neural network is a deep neural network with a convolutional structure and a deep learning (DL) architecture.
• the deep learning architecture refers to performing multiple levels of learning at different levels of abstraction through machine learning algorithms.
  • CNN is a feed-forward artificial neural network in which individual neurons can respond to images input into it.
  • a convolutional neural network (CNN) 200 may include an input layer 210 , a convolutional/pooling layer 220 (where the pooling layer is optional), and a neural network layer 230 .
• the convolutional layer/pooling layer 220 may include layers 221-226. For example: in one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer; in another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
  • the convolution layer 221 may include many convolution operators, which are also called kernels, and their role in image processing is equivalent to a filter for extracting specific information from the input image matrix.
• the convolution operator is essentially a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is usually moved along the horizontal direction on the input image one pixel after another (or two pixels after two pixels, depending on the value of the stride), completing the work of extracting specific features from the image.
  • the size of the weight matrix should be related to the size of the image. It should be noted that the depth dimension of the weight matrix is the same as the depth dimension of the input image.
• during the convolution operation, the weight matrix extends over the entire depth of the input image. Therefore, convolution with a single weight matrix produces a convolutional output with a single depth dimension; in most cases, however, instead of a single weight matrix, multiple weight matrices of the same size (row × column), that is, multiple matrices of the same shape, are applied.
  • the output of each weight matrix is stacked to form the depth dimension of the convolution image, where the dimension can be understood as determined by the "multiple" mentioned above.
  • Different weight matrices can be used to extract different features in the image. For example, one weight matrix is used to extract image edge information, another weight matrix is used to extract specific colors of the image, and another weight matrix is used to filter unwanted noise in the image.
• the multiple weight matrices have the same size (row × column), so the feature maps extracted by them are also of the same size; the extracted feature maps of the same size are then combined to form the output of the convolution operation.
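A small sketch of how multiple same-size kernels each yield one output plane, with the stacked planes forming the depth dimension of the convolved image as described above; kernel values are illustrative (stride 1, no padding).

```python
# Hedged sketch: each same-size kernel produces one output plane, and
# stacking the planes forms the depth dimension of the convolved image.
# Kernel values are illustrative only.

def conv2d_single(image, kernel):
    """Valid 2-D convolution (correlation) of one kernel over one plane."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]

def conv2d_multi(image, kernels):
    """One output plane per kernel; the stack's length is the depth dimension."""
    return [conv2d_single(image, k) for k in kernels]

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
edge_kernel = [[1, -1], [1, -1]]     # crude horizontal-edge detector
identity_kernel = [[1, 0], [0, 0]]   # copies the top-left pixel of each window
planes = conv2d_multi(image, [edge_kernel, identity_kernel])
assert len(planes) == 2              # depth equals the number of kernels
assert planes[1] == [[1, 2], [4, 5]]
```

As stated above, different kernels extract different information (edges, colors, noise suppression) from the same positions of the input.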
• the weight values in these weight matrices need to be obtained through extensive training in practical applications, and each weight matrix formed by the trained weight values can be used to extract information from the input image, so that the convolutional neural network 200 can make correct predictions.
  • the initial convolutional layer (such as 221) often extracts more general features, which can also be referred to as low-level features;
  • the features extracted by the later convolutional layers (such as 226) become more and more complex, such as features such as high-level semantics, and features with higher semantics are more suitable for the problem to be solved.
• each of layers 221-226 shown in the convolutional layer/pooling layer 220 in Figure 2 can be a convolutional layer followed by one pooling layer, or multiple convolutional layers followed by one or more pooling layers.
  • the pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling an input image to obtain an image of a smaller size.
  • the average pooling operator can calculate the pixel values in the image within a specific range to generate an average value as the result of average pooling.
  • the maximum pooling operator can take the pixel with the largest value within a specific range as the result of maximum pooling.
  • the operators in the pooling layer should also be related to the size of the image.
  • the size of the image output after being processed by the pooling layer may be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average or maximum value of the corresponding sub-region of the image input to the pooling layer.
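The two pooling operators can be sketched on a 4×4 input with non-overlapping 2×2 windows; each output pixel is the average or the maximum of its sub-region, as described above.

```python
# Sketch of the two pooling operators on a 4x4 input with 2x2 windows and
# stride 2: each output pixel summarizes one sub-region of the input.

def pool2x2(image, op):
    """Apply op (e.g. max, or a mean function) to each 2x2 window, stride 2."""
    return [[op([image[r][c], image[r][c + 1],
                 image[r + 1][c], image[r + 1][c + 1]])
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]

img = [[1, 2, 5, 6],
       [3, 4, 7, 8],
       [9, 10, 13, 14],
       [11, 12, 15, 16]]

mx = pool2x2(img, max)                           # max pooling
avg = pool2x2(img, lambda v: sum(v) / len(v))    # average pooling
assert mx == [[4, 8], [12, 16]]                  # largest value per window
assert avg == [[2.5, 6.5], [10.5, 14.5]]         # mean value per window
```

The 2×2 output is smaller than the 4×4 input, matching the description of pooling reducing image size.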
• after processing by the convolutional layer/pooling layer 220, the convolutional neural network 200 is not yet able to output the required output information, because, as mentioned earlier, the convolutional layer/pooling layer 220 only extracts features and reduces the parameters brought by the input image. To generate the final output information (the required class information or other relevant information), the convolutional neural network 200 needs to use the neural network layer 230 to generate one output or a group of outputs whose number equals the required number of classes. Therefore, the neural network layer 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 2) and an output layer 240; the parameters contained in the multiple hidden layers may be pre-trained based on the related training data of a specific task type.
• the output layer 240 has a loss function similar to categorical cross entropy and is specifically used to calculate the prediction error.
  • the convolutional neural network 200 shown in FIG. 2 is only an example of a convolutional neural network, and in specific applications, the convolutional neural network may also exist in the form of other network models.
  • a chip hardware structure provided by the embodiment of the present application is introduced below.
  • FIG. 3 is a chip hardware structure provided by an embodiment of the present invention, and the chip includes a neural network processor 50 .
  • the chip can be set in the execution device 110 shown in FIG. 1 to complete the computing work of the computing module 111 .
  • the chip can also be set in the training device 120 shown in FIG. 1 to complete the training work of the training device 120 and output the target model 101 .
  • the algorithms of each layer in the convolutional neural network shown in Figure 2 can be implemented in the chip shown in Figure 3 .
  • the neural network processor NPU 50 is mounted on the main CPU (Host CPU) as a coprocessor, and the tasks are assigned by the Host CPU.
  • the core part of the NPU is the operation circuit 503, and the controller 504 controls the operation circuit 503 to extract data in the memory (weight memory or input memory) and perform operations.
  • the operation circuit 503 includes multiple processing units (process engine, PE).
  • arithmetic circuit 503 is a two-dimensional systolic array.
  • the arithmetic circuit 503 may also be a one-dimensional systolic array or other electronic circuits capable of performing mathematical operations such as multiplication and addition.
  • arithmetic circuit 503 is a general-purpose matrix processor.
  • the operation circuit fetches the data corresponding to the matrix B from the weight memory 502, and caches it in each PE in the operation circuit.
  • the operation circuit takes the data of matrix A from the input memory 501 and performs matrix operation with matrix B, and the obtained partial or final results of the matrix are stored in the accumulator 508 (accumulator).
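The matrix operation described above (matrix A from the input memory multiplied with matrix B from the weight memory, with partial results accumulated per output cell) amounts to an ordinary matrix multiplication; a plain-Python sketch of that accumulation, with illustrative values:

```python
# Sketch of the operation-circuit semantics: partial products of A and B are
# accumulated cell by cell, mirroring the role of the accumulator 508.

def matmul_accumulate(A, B):
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]            # accumulator for partial results
    for i in range(n):
        for j in range(m):
            for p in range(k):
                acc[i][j] += A[i][p] * B[p][j]   # accumulate one partial product
    return acc

A = [[1, 2], [3, 4]]   # data from the input memory
B = [[5, 6], [7, 8]]   # weights from the weight memory
assert matmul_accumulate(A, B) == [[19, 22], [43, 50]]
```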
  • the vector computing unit 507 can further process the output of the computing circuit, such as vector multiplication, vector addition, exponent operation, logarithmic operation, size comparison and so on.
  • the vector calculation unit 507 can be used for network calculations of non-convolution/non-FC layers in neural networks, such as pooling (Pooling), batch normalization (batch normalization), local response normalization (local response normalization), etc. .
  • vector computation unit 507 can store a vector of processed outputs to unified memory 506 .
  • the vector calculation unit 507 may apply a non-linear function to the output of the operation circuit 503, such as a vector of accumulated values, to generate activation values.
  • the vector computation unit 507 generates normalized values, merged values, or both.
  • the vector of processed outputs can be used as an activation input to arithmetic circuitry 503, for example for use in subsequent layers in a neural network.
  • the unified memory 506 is used to store input data and output data.
• the direct memory access controller (direct memory access controller, DMAC) 505 transfers the input data in the external memory to the input memory 501 and/or the unified memory 506, stores the weight data in the external memory into the weight memory 502, and stores the data in the unified memory 506 into the external memory.
  • a bus interface unit (bus interface unit, BIU) 510 is configured to implement interaction between the main CPU, DMAC and instruction fetch memory 509 through the bus.
  • An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504.
  • the controller 504 is configured to call the instruction cached in the instruction fetch memory 509 to control the operation process of the operation accelerator.
  • the unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch memory 509 are all on-chip memory
  • the external memory is a memory outside the NPU
• the external memory can be a double data rate synchronous dynamic random access memory (DDR SDRAM), a high bandwidth memory (high bandwidth memory, HBM), or other readable and writable memory.
  • each layer in the convolutional neural network shown in FIG. 2 can be performed by the operation circuit 503 or the vector calculation unit 507 .
• the training device 120 in FIG. 1 introduced above can execute the steps of the congestion prediction model training method in the embodiment of the present application, and the execution device 110 in FIG. 1 can execute the steps of the image processing method in the embodiment of the present application; the neural network model shown in FIG. 2 and the chip shown in FIG. 3 can also be used to execute the steps of the image processing method of the embodiment of the present application, and the chip shown in FIG. 3 can also be used to execute the steps of the congestion prediction model training method in the embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a system architecture 300 provided in an embodiment of the present application.
  • the system architecture includes a local device 301, a local device 302, an execution device 210, and a data storage system 250; wherein, the local device 301 and the local device 302 are connected to the execution device 210 through a communication network.
  • Execution device 210 may be implemented by one or more servers.
  • the execution device 210 may be used in cooperation with other computing devices, such as data storage, routers, load balancers and other devices.
  • Execution device 210 may be arranged on one physical site, or distributed on multiple physical sites.
  • the execution device 210 may use the data in the data storage system 250 or call the program code in the data storage system 250 to implement the congestion prediction model training method or the image processing method in the embodiment of the present application.
  • the execution device 210 may perform the following process:
  • a congestion prediction model can be trained through the execution device 210 above, and the congestion prediction model can be used for image processing, speech processing, and natural language processing, etc., for example, the congestion prediction model can be used to implement the congestion prediction method in the embodiment of the present application.
  • the execution device 210 can be built into an image processing device through the above process, and the image processing device can be used for image processing (for example, it can be used to realize the congestion prediction of the semiconductor chip in the embodiment of the present application).
  • Each local device can represent a variety of computing devices, such as personal computers, computer workstations, smartphones, tablets, and so on.
  • Each user's local device can interact with the execution device 210 through any communication mechanism/communication standard communication network, and the communication network can be a wide area network, a local area network, a point-to-point connection, etc., or any combination thereof.
• the local device 301 and the local device 302 obtain the relevant parameters of the congestion prediction model from the execution device 210, deploy the congestion prediction model on the local device 301 and the local device 302, and use the congestion prediction model to perform congestion prediction on the chip to be predicted, to obtain the predicted congestion map of the chip to be predicted.
• the trained congestion prediction model can also be directly deployed on the execution device 210; the execution device 210 obtains the characteristic data of the chip to be predicted from the local device 301 and the local device 302, and uses the trained congestion prediction model to perform congestion prediction on the chip to be predicted, to obtain the predicted congestion map of the chip to be predicted.
•   the local device 301 and the local device 302 obtain the relevant parameters of the image processing device from the execution device 210, deploy the image processing device on the local device 301 and the local device 302, and use the image processing device to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
  • the image processing device can be directly deployed on the execution device 210.
•   the execution device 210 obtains the characteristic data of the chip to be predicted from the local device 301 and the local device 302, and uses the image processing device to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
•   the above execution device 210 may also be a cloud device, in which case the execution device 210 may be deployed on the cloud; or the above execution device 210 may be a terminal device, in which case the execution device 210 may be deployed on the user terminal side; the embodiment of the present application does not limit this.
  • FIG. 5 is a schematic flowchart of a congestion prediction model training method 500 provided in an embodiment of the present application. The method includes but is not limited to the following steps:
  • Step S510 Divide the plurality of metal layers into at least two prediction layers; wherein, the plurality of metal layers are metal layers included in each of the K semiconductor chips, and K is a positive integer.
  • each semiconductor chip is divided into at least two prediction layers; wherein, each prediction layer includes at least one metal layer.
•   FIG. 6 is a schematic diagram of the hierarchical division of a semiconductor chip provided by an embodiment of the present application. As shown in FIG. 6, the semiconductor chip may comprise a plurality of metal layers (from top to bottom, metal layer 1-1 through metal layer N-B).
  • each prediction layer contains at least one metal layer, for example, prediction layer 1 may contain metal layer 1-1...metal layer 1-A, and prediction layer N may contain metal layer N-1...metal layer N-B; wherein, A and B are positive integers, and N is an integer greater than or equal to 2.
•   the above-mentioned division of the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process or the functional module distribution of the metal layers in each semiconductor chip.
•   the division into prediction layers is based on the manufacturing process of the metal layers or on whether the metal layers contain the same functional module (Module); that is, on the same semiconductor chip, metal layers with similar manufacturing processes can be divided into the same prediction layer, or metal layers containing the same functional modules can be divided into the same prediction layer.
•   the manufacturing process of each metal layer can be characterized by its routing track capacity, that is, the number of routing tracks (Routing track) on the metal layer; the more advanced a metal layer's manufacturing process, the greater its amount of routing resources.
  • the amount of routing resources on the metal layer of the 7nm manufacturing process is greater than the amount of routing resources on the metal layer of the 14nm manufacturing process.
  • Functional modules refer to some hardware structures in the metal layer, for example, macro cell (Macro Cell) layer or registers, etc.
•   metal layers whose amounts of routing resources differ by no more than a preset threshold are divided into the same prediction layer, and the preset threshold can be determined according to the specific application scenario. Assume that a semiconductor chip contains 6 metal layers, the amounts of routing resources of the 6 metal layers are 15, 15, 14, 10, 10 and 2 respectively (measured in routing tracks), and the preset threshold is 2; the 6 metal layers can then be divided into three prediction layers. Specifically, the three metal layers whose routing resource amounts are 15, 15 and 14 can be divided into one prediction layer; the two metal layers whose routing resource amounts are 10 can be divided into another prediction layer; and the metal layer whose routing resource amount is 2 forms its own prediction layer.
•   alternatively, the multiple metal layers in the same semiconductor chip can be divided into at least two prediction layers according to the functional module distribution.
•   metal layers containing macro cell layers in the same semiconductor chip can be divided into one prediction layer and metal layers not containing macro cell layers into another; or metal layers containing registers in the same semiconductor chip can be divided into one prediction layer and metal layers not containing registers into another.
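•   The threshold-based grouping in the example above (capacities 15, 15, 14, 10, 10 and 2 with threshold 2) might be sketched as follows; the helper name and list encoding are illustrative assumptions, not code from this application:

```python
def group_metal_layers(track_capacities, threshold):
    """Group metal layers (listed top to bottom by routing-track capacity)
    into prediction layers: a layer joins the current group when its capacity
    differs from the group's first layer by at most `threshold`."""
    groups = [[0]]  # each group holds metal-layer indices
    for i in range(1, len(track_capacities)):
        if abs(track_capacities[i] - track_capacities[groups[-1][0]]) <= threshold:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups

# The six-layer example from the text yields three prediction layers.
print(group_metal_layers([15, 15, 14, 10, 10, 2], 2))  # [[0, 1, 2], [3, 4], [5]]
```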
  • Step S520 Determine M first feature maps corresponding to each of the prediction layers; wherein, the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer.
  • M chip features used for congestion prediction may be determined according to specific application scenarios.
  • Obtain chip-related data including netlist, macromodule location, transistor location, transistor pin location, and routing resources, etc.
  • M first feature maps corresponding to the M chip features of each prediction layer are calculated based on the above chip-related data, and each chip feature of the prediction layer corresponds to a first feature map describing the feature of the chip.
  • the above M chip features may include one or more of pin density, network connection density, module mask, or amount of routing resources.
•   each metal layer in a given prediction layer has similar chip features; for example, within the same prediction layer, the pin densities of the metal layers are similar.
•   determining the M first feature maps corresponding to each of the prediction layers includes: obtaining M second feature maps corresponding to each metal layer in each of the prediction layers, wherein the M second feature maps are respectively used to describe the M chip features of each metal layer; and generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each of the prediction layers; wherein the first feature map describing any chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature on each metal layer.
•   each metal layer included in each semiconductor chip corresponds to the aforementioned M chip features; that is, the chip features of each metal layer may include one or more of pin density, network connection density, module mask, or amount of routing resources. Among them, pin density and module mask are not directional, while the amount of routing resources and the network connection density are directional: on a given metal layer, the amount of routing resources is either horizontal or vertical, as is the network connection density.
•   the amount of routing resources is specifically the number of routing tracks on each metal layer; since the routing tracks on each metal layer are directional, either horizontal or vertical, the amount of routing resources is directional accordingly.
•   the network connection density refers to the number of windings per unit area; since the windings run in the above-mentioned routing tracks, the network connection density is likewise horizontal or vertical. It should be understood that, for a directional chip feature, the first feature map describing that feature is also directional; for example, when the amount of routing resources is horizontal, the first feature map describing the amount of routing resources is also for the horizontal direction.
•   obtaining the M second feature maps corresponding to each metal layer in each of the prediction layers includes: performing feature extraction based on the wiring data of each metal layer in each prediction layer to obtain the M second feature maps corresponding to each metal layer.
•   generating the M first feature maps corresponding to each prediction layer based on the M second feature maps of each metal layer includes: for the same chip feature, obtaining, based on the second feature maps describing that chip feature on each metal layer, a first feature map describing that chip feature, where the first feature map is one of the M first feature maps corresponding to each prediction layer.
•   specifically, the corresponding pixels on the second feature maps describing the same chip feature on each metal layer in the same prediction layer can be combined by weighted averaging, taking the maximum value, or taking the minimum value, to obtain the pixel value of the corresponding pixel on the first feature map for that chip feature. Performing this operation for every pixel yields the pixel value of each pixel on the first feature map, that is, the first feature map describing that chip feature. The specific operation used to obtain the first feature map is not limited in this application.
•   for example, for a prediction layer containing four metal layers, the first feature map corresponding to the pin density of the prediction layer is determined as follows: first, based on the wiring data of the four metal layers, four second feature maps respectively describing the pin densities of the four metal layers are obtained; then the pixel values at the same position in the four second feature maps are combined by weighted averaging, taking the maximum value, or taking the minimum value, to obtain the pixel value at that position in the first feature map describing the pin density of the prediction layer. Every pixel in the four second feature maps is processed in this manner to obtain the first feature map corresponding to the pin density of the prediction layer.
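•   The pixel-wise combination of second feature maps into one first feature map could look like the following sketch, using plain-Python grids in place of real feature maps (the function name and the uniform weighting are assumptions):

```python
def aggregate_feature_maps(second_maps, mode="mean"):
    """Combine the second feature maps of the metal layers in one prediction
    layer into a first feature map, pixel by pixel. `second_maps` is a list
    of equally sized H x W grids (lists of lists of numbers)."""
    h, w = len(second_maps[0]), len(second_maps[0][0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [m[i][j] for m in second_maps]
            if mode == "mean":      # weighted average with uniform weights
                out[i][j] = sum(vals) / len(vals)
            elif mode == "max":
                out[i][j] = max(vals)
            else:                   # "min"
                out[i][j] = min(vals)
    return out

# Two 2x2 pin-density maps averaged into one first feature map.
print(aggregate_feature_maps([[[1, 2], [3, 4]], [[3, 4], [5, 6]]]))
# [[2.0, 3.0], [4.0, 5.0]]
```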
•   for any one metal layer, a preset operation can be used to obtain the second feature map describing a first chip feature of that metal layer. The preset operation may be: first determine the prediction layer where the metal layer is located, and then determine the second feature map describing the first chip feature on that metal layer based on the second feature maps describing the first chip feature on the other metal layers in that prediction layer; for example, weighted averaging, taking the maximum value or minimum value, or other processing can be applied to the corresponding pixels of those second feature maps to obtain the second feature map describing the first chip feature of the metal layer.
  • each metal layer corresponds to M second feature maps
  • the M second feature maps are used to respectively describe M chip features of each metal layer.
•   because the metal layers in the same prediction layer have similar chip features, the first feature map obtained by weighted averaging, taking the maximum or minimum value, or other such methods can accurately represent that chip feature across the metal layers; that is, the first feature map describing a chip feature has good correlation and consistency with the second feature maps describing that chip feature on each metal layer. This avoids the situation in which, because the metal layers grouped into the same prediction layer differ greatly, the resulting first feature map differs substantially from the second feature maps describing the same chip feature on each metal layer.
•   a first feature map that accurately reflects the chip features of the metal layers in each prediction layer can be obtained through the above-mentioned layering method and first-feature-map determination method; therefore, after the congestion prediction model is trained with the first feature maps corresponding to the different prediction layers, the obtained model can effectively identify feature data with different trends and make corresponding predictions based on the identified features. That is, the model trained by the embodiment of this application has refined recognition and prediction capabilities.
  • Step S530 Determine a data set based on the M first feature maps and real congestion maps corresponding to the prediction layer included in each of the K semiconductor chips, and use the data set to train a congestion prediction model; wherein, The real congestion map corresponding to each of the prediction layers is used to describe the real congestion degree of each of the prediction layers.
  • the degree of congestion refers to the difference between the routing resource demand and the routing resource amount.
  • the amount of routing resources refers to the number of routing tracks.
•   the routing resource requirement refers to the number of windings required to connect all the netlists. The required windings run in the routing tracks, so the difference between the routing resource requirement and the amount of routing resources is the degree of congestion. For example, if the amount of routing resources is 10 but 12 windings are needed to connect all the netlists, 2 windings must share tracks with other windings; that is, the congestion degree is 2.
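•   As a minimal sketch of this definition (hypothetical names; per-grid track demand and capacity are taken as given):

```python
def congestion_degree(demand, capacity):
    """Congestion degree of one grid: routing-track demand minus the given
    routing-track capacity; a positive value means windings must share tracks."""
    return demand - capacity

def congestion_grid(demand_map, capacity_map):
    """Element-wise congestion over the rasterized chip; negative entries
    indicate spare routing capacity."""
    return [[d - c for d, c in zip(dr, cr)]
            for dr, cr in zip(demand_map, capacity_map)]

# e.g. a grid needing 12 windings with capacity 10 has congestion degree 2
print(congestion_degree(12, 10))               # 2
print(congestion_grid([[12, 8]], [[10, 10]]))  # [[2, -2]]
```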
  • the above method further includes: performing global routing on the K semiconductor chips, and obtaining a real congestion map corresponding to each prediction layer according to the K semiconductor chips after global routing ; adding the real congestion map corresponding to each of the prediction layers in the K semiconductor chips to the data set.
•   chip design can be divided into two stages: chip layout and global routing (Global routing).
•   the chip layout stage mainly determines, for each metal layer on the chip, the netlist, the positions of the macro modules, the transistors and the transistor pins, and the amount of routing resources, etc.; in the global routing stage, the metal wires are mainly wound in the routing tracks corresponding to the amount of routing resources.
•   obtaining the real congestion map corresponding to each of the prediction layers based on the K semiconductor chips after global routing includes: after global routing is performed on a semiconductor chip, the number of windings on the chip, that is, the routing track demand, can be determined; the real congestion map of each prediction layer is then calculated based on the routing resource demand and the amount of routing resources.
  • the image processing method (also called the congestion prediction method) in the embodiment of the present application is mainly used in the chip layout stage.
•   the chip congestion degree is predicted based on this method so that the chip layout can be adjusted accordingly.
•   the real congestion map corresponding to each of the prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each of the prediction layers, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each prediction layer.
  • the real congestion map corresponding to each metal layer in each semiconductor chip is obtained, and the real congestion map corresponding to each metal layer includes a second horizontal real congestion map and a second vertical real congestion map; wherein, the second horizontal real congestion map is used to describe the degree of congestion of the metal layer in the horizontal direction, and the second vertical real congestion map is used to describe the degree of congestion of the metal layer in the vertical direction.
  • the real congestion map corresponding to each metal layer is calculated based on the routing resource requirement and routing resource amount of each metal layer after global routing is performed.
•   the specific process of obtaining the first horizontal real congestion map corresponding to a prediction layer based on the second horizontal real congestion maps corresponding to the metal layers in that prediction layer can follow the above-described process of determining the first feature map corresponding to the prediction layer, that is, the process of obtaining a first feature map describing a chip feature from the second feature maps describing that chip feature on each metal layer, and will not be repeated here.
  • the specific process of determining the first vertical real congestion map corresponding to the prediction layer is the same as that of the first horizontal real congestion map, and will not be repeated here.
  • the size of the M first feature maps corresponding to each prediction layer is the same as that of the real congestion map.
•   within each prediction layer, the real congestion maps corresponding to the metal layers have strong correlation and consistency; therefore, the real congestion map of each prediction layer obtained by the method of the embodiment of the present application is well consistent with the real congestion maps of the metal layers in that prediction layer. That is, a real congestion map that accurately reflects the congestion degree of each prediction layer can be obtained, which ensures the prediction accuracy of the congestion prediction model subsequently trained with the real congestion maps of the prediction layers.
•   adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to the data set, and using the data set to train the congestion prediction model, includes: using the data set to perform iterative training on the congestion prediction model; wherein each iteration of training includes: using the congestion prediction model to process the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
•   each of the above training iterations includes: determining a single-iteration training sample from the M first feature maps and the 2 real congestion maps corresponding to any prediction layer.
  • a single iteration training sample contains M third feature maps and 2 target real congestion maps.
  • Input the M third feature maps into the congestion prediction model to obtain the predicted congestion map output by the model; determine the prediction error based on the predicted congestion map and the two target real congestion maps.
  • the model parameters in the congestion prediction model are updated using the gradient descent method or other backpropagation algorithms.
•   training stops when a preset condition is met; the preset condition may be that the number of training iterations reaches a preset number, that the prediction error is less than or equal to a preset error, or another feasible condition, which is not limited in the present application.
  • the aforementioned congestion prediction model may be a model such as a generative adversarial neural network, a variational autoencoder, a semantic segmentation neural network, etc., which is not limited in this application.
•   select M third feature maps from the same arbitrary region on the M first feature maps corresponding to the prediction layer, and select two target real congestion maps from the same region on the two real congestion maps corresponding to the prediction layer; the size of the M third feature maps and of the two target real congestion maps is equal to the target size. The M third feature maps and the two target real congestion maps serve as the training sample for a single iteration.
  • the target size is the size of the input image allowed by the congestion prediction model, and the target size may be smaller than or equal to the size of the first feature map.
•   when the target size equals the size of the first feature maps, the M first feature maps are used directly as the above M third feature maps, and the two real congestion maps are used as the above two target real congestion maps; that is, the single-iteration training sample then consists of the M first feature maps and the two real congestion maps.
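•   The same-region sampling described above might be sketched as follows (a hypothetical helper: it crops one random window of the target size from all M first feature maps and both real congestion maps, so every grid in the sample covers the same chip region):

```python
import random

def sample_crop(feature_maps, congestion_maps, target_h, target_w):
    """Build a single-iteration training sample: crop the SAME random window
    from every first feature map (giving M third feature maps) and from both
    real congestion maps (giving 2 target real congestion maps)."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    top = random.randint(0, h - target_h)
    left = random.randint(0, w - target_w)
    crop = lambda m: [row[left:left + target_w] for row in m[top:top + target_h]]
    return [crop(m) for m in feature_maps], [crop(c) for c in congestion_maps]
```

When the target size equals the feature-map size, the crop is the identity, matching the case above where the first feature maps themselves serve as the sample.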
  • the prediction layers included in each semiconductor chip are respectively a macro-unit layer and a non-macro-unit layer
  • the congestion prediction model includes a first congestion prediction model and a second congestion prediction model
•   adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to the data set, and using the data set to train the congestion prediction model, includes: using the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps to train the first congestion prediction model; and using the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps to train the second congestion prediction model.
  • the multiple metal layers in each semiconductor chip are divided into two predictive layers, namely a macro-unit layer and a non-macro-unit layer.
  • the model structures of the first congestion prediction model and the second congestion prediction model are completely the same, and initial model parameters may be the same or different.
•   using the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps to train the first congestion prediction model includes: determining a single-iteration training sample from the M first feature maps and the real congestion maps corresponding to any macro-unit layer, and then training the first congestion prediction model with the single-iteration training sample.
•   the determination process of the single-iteration training samples used to train the first congestion prediction model can refer to the determination process of the single-iteration training samples of the aforementioned congestion prediction model, which will not be repeated here; the specific training process of the first congestion prediction model may be the same as the training process of the congestion prediction model in the foregoing embodiment, and details are not repeated here.
  • the training process of the second congestion prediction model is the same as the training process of the first congestion prediction model, and will not be repeated here.
  • FIG. 7 is a schematic flowchart of an image processing method 700 provided in the embodiment of the present application. The method includes but is not limited to the following steps:
  • Step S710 Determine M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted; wherein, the semiconductor chip to be predicted includes at least two prediction layers, and M is a positive integer;
  • Step S720 using the congestion prediction model to process the M first feature maps corresponding to each of the prediction layers, to obtain a predicted congestion map corresponding to each of the prediction layers;
•   the congestion prediction model is obtained after training on a data set; the data set includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data corresponding to each training prediction layer includes M first training feature maps and a real congestion map; the M first training feature maps are respectively used to describe the M chip features of each training prediction layer, and the real congestion map is used to describe the real congestion degree of each training prediction layer; each training semiconductor chip includes at least two training prediction layers, each training prediction layer includes at least one metal layer, and K is a positive integer.
  • a sliding window method may be used to obtain M fourth feature maps for a single prediction by the model from the M first feature maps.
  • the size of each fourth feature map is the target size, as shown in Figure 8, the length and width of the target size can be E and F respectively, E and F are positive integers, and the unit can be a pixel.
  • the first feature map shown in FIG. 8 is any one of the M first feature maps corresponding to the prediction layer.
•   the first feature map may contain D fourth feature maps, and the width of the overlapping portion between any two adjacent fourth feature maps is G, where G is an integer greater than or equal to zero, in pixels. That is, each of the M first feature maps contains D fourth feature maps.
  • the M fourth feature maps at the same region on the M first feature maps are used as input data for a single prediction of the model.
  • the M first feature maps corresponding to each prediction layer contain a total of D sets of input data for congestion prediction, and each set of input data corresponds to a specific area on the first feature map, and also corresponds to a specific area on the prediction layer. area, D is a positive integer.
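•   A sliding-window split like the one above (windows of the target size E x F, adjacent windows overlapping by G pixels) might be sketched as follows; the helper name and the border-padding rule are assumptions:

```python
def sliding_windows(feature_map, win_h, win_w, overlap):
    """Split one first feature map into fourth feature maps of size
    win_h x win_w (the target size E x F); adjacent windows overlap by
    `overlap` (G) pixels, and the bottom/right border is always covered.
    Each window keeps its (top, left) position so its prediction can later
    be placed back into the full map."""
    h, w = len(feature_map), len(feature_map[0])
    step_h, step_w = win_h - overlap, win_w - overlap
    tops = list(range(0, h - win_h + 1, step_h))
    lefts = list(range(0, w - win_w + 1, step_w))
    if tops[-1] != h - win_h:       # pad the scan so the border is included
        tops.append(h - win_h)
    if lefts[-1] != w - win_w:
        lefts.append(w - win_w)
    return [((t, l), [row[l:l + win_w] for row in feature_map[t:t + win_h]])
            for t in tops for l in lefts]

# A 4x4 map with 2x2 windows and overlap G=0 yields D=4 windows.
windows = sliding_windows([[j for j in range(4)] for _ in range(4)], 2, 2, 0)
print(len(windows))  # 4
```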
•   using the aforementioned congestion prediction model to process the M first feature maps corresponding to each prediction layer to obtain the predicted congestion map corresponding to each prediction layer includes: sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models to obtain the predicted congestion map corresponding to each group of input data, yielding D groups of predicted congestion maps in total.
•   each group of predicted congestion maps includes a horizontal predicted congestion map and a vertical predicted congestion map. The horizontal predicted congestion maps in the D groups are stitched together to obtain the horizontal predicted congestion map corresponding to each prediction layer; the vertical predicted congestion maps in the D groups are stitched together to obtain the vertical predicted congestion map corresponding to each prediction layer.
  • each pixel point in the overlapping part corresponds to two pixel values on the two predicted congestion maps
  • the pixel value of each pixel after splicing can be determined by means of weighted average or maximum value of two pixel values corresponding to each pixel.
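•   The splicing step could be sketched as follows (hypothetical names; each patch carries its top-left position, and overlapping pixels are averaged here, though the text also allows taking the maximum):

```python
def stitch_predictions(patches, out_h, out_w):
    """Stitch patch-level predicted congestion maps back into one full map.
    `patches` pairs each patch's (top, left) position with its grid; a pixel
    covered by several patches receives the average of the overlapping values."""
    total = [[0.0] * out_w for _ in range(out_h)]
    count = [[0] * out_w for _ in range(out_h)]
    for (top, left), patch in patches:
        for i, row in enumerate(patch):
            for j, v in enumerate(row):
                total[top + i][left + j] += v
                count[top + i][left + j] += 1
    return [[total[i][j] / count[i][j] for j in range(out_w)]
            for i in range(out_h)]

# Two 2x2 patches overlapping in the middle column of a 2x3 map.
print(stitch_predictions([((0, 0), [[1, 1], [1, 1]]),
                          ((0, 1), [[3, 3], [3, 3]])], 2, 3))
# [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
```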
•   sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: sequentially inputting the D groups of input data into one trained congestion prediction model to obtain the predicted congestion map corresponding to each group of input data; or inputting each of the D groups of input data into one of multiple congestion prediction models for parallel prediction to obtain the predicted congestion map corresponding to each group of input data; wherein the model structure and parameters of each of the multiple congestion prediction models are the same as those of the trained congestion prediction model.
  • the parallel prediction method can greatly save the time of congestion prediction and improve efficiency.
•   inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: when the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer, inputting the D groups of input data corresponding to the macro-unit layer into one or more first congestion prediction models to obtain the predicted congestion map corresponding to the macro-unit layer; similarly, the D groups of input data corresponding to the non-macro-unit layer can be input into one or more second congestion prediction models to obtain the predicted congestion map corresponding to the non-macro-unit layer.
  • the above method further includes: aggregating the predicted congestion graphs corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion graph corresponding to the semiconductor chip to be predicted.
  • the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer in the semiconductor chip to be predicted are aggregated to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
•   the predicted congestion map corresponding to each of the prediction layers includes a vertical predicted congestion map and a horizontal predicted congestion map; aggregating the predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted includes: using a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; using the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and using a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted. Alternatively, the directional aggregation operator is used first to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer, after which the reference predicted congestion maps are aggregated with the hierarchical aggregation operator.
  • the above-mentioned hierarchical aggregation operator may be an operation such as taking an average value or taking a maximum value.
  • any two corresponding pixel points on the predicted congestion map subjected to hierarchical aggregation are subjected to operations such as averaging or maximum value to obtain the pixel value of the corresponding pixel point on the aggregated predicted congestion map.
  • the above-mentioned directional aggregation operator may also be an operation such as taking an average value or a maximum value, which is not limited in this application.
  • the specific operation process of the directional aggregation operator please refer to the operation process corresponding to the hierarchical aggregation operator, which will not be repeated here.
  • the semiconductor chip to be predicted contains at least two prediction layers; the directional aggregation operator may be used to aggregate the horizontal predicted congestion map and the vertical predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map for each prediction layer, and the hierarchical aggregation operator may then be used to aggregate the reference predicted congestion maps of the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
  • the manner in which the hierarchical aggregation operator and the directional aggregation operator are used to aggregate the predicted congestion maps corresponding to the prediction layers into the predicted congestion map corresponding to the semiconductor chip to be predicted is not limited in this application.
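As a concrete illustration of the two aggregation orders described above, the sketch below uses NumPy mean as the hierarchical operator and element-wise maximum as the directional operator; the function names, operator choices, and toy maps are assumptions for illustration, since the application deliberately leaves the operators open:

```python
import numpy as np

def aggregate_layers(maps, op=np.mean):
    # Hierarchical aggregation: combine per-prediction-layer maps
    # pixel-wise (here: mean), yielding one reference map.
    return op(np.stack(maps), axis=0)

def aggregate_directions(h_map, v_map, op=np.maximum):
    # Directional aggregation: combine the horizontal and vertical
    # maps pixel-wise (here: maximum) into a single congestion map.
    return op(h_map, v_map)

# Two prediction layers, each with horizontal/vertical predicted maps.
h_maps = [np.array([[0.2, 0.8]]), np.array([[0.4, 0.6]])]
v_maps = [np.array([[0.1, 0.9]]), np.array([[0.5, 0.3]])]

# Order 1: aggregate over layers first, then over directions.
ref_h = aggregate_layers(h_maps)
ref_v = aggregate_layers(v_maps)
chip_map_1 = aggregate_directions(ref_h, ref_v)

# Order 2: aggregate over directions first, then over layers.
per_layer = [aggregate_directions(h, v) for h, v in zip(h_maps, v_maps)]
chip_map_2 = aggregate_layers(per_layer)
```

Note that the two orders are not generally interchangeable when the operators differ (mean then max versus max then mean), which is consistent with the application leaving the choice of order open.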
  • the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
  • the training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers in each training semiconductor chip according to their manufacturing process or functional module distribution.
  • the M first training feature maps corresponding to each of the training prediction layers are obtained from the M second feature maps corresponding to each metal layer in each of the training prediction layers; the first training feature map used to describe any chip feature among the M first training feature maps is obtained based on the second feature maps describing that chip feature in the respective metal layers.
  • the real congestion map corresponding to each of the training prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the training prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
  • the plurality of training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • the M chip features include one or more of pin density, network connection density, module mask, or amount of routing resources.
  • FIG. 9 is a schematic flowchart of a congestion prediction provided by an embodiment of the present application.
  • the congestion prediction process of a semiconductor chip is specifically as follows: the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer.
  • the M first feature maps corresponding to the macro-unit layer and the non-macro-unit layer are respectively determined according to the methods in the foregoing embodiments.
  • the first congestion prediction model is used to process the M first feature maps corresponding to the macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the macro-unit layer; the second congestion prediction model is used to process the M first feature maps corresponding to the non-macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the non-macro-unit layer.
  • the horizontal and vertical predicted congestion maps of the two prediction layers are then aggregated to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
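The prediction flow of FIG. 9 above can be sketched as follows. The model objects and all function names are placeholders, since the application does not fix a model architecture; simple callables stand in for the two trained congestion prediction models, and max/mean stand in for the unspecified directional and hierarchical aggregation operators:

```python
import numpy as np

def predict_chip_congestion(macro_feats, std_feats, model_macro, model_std):
    """Two-model flow of FIG. 9: the macro-unit layer and the
    non-macro-unit layer are predicted separately, then aggregated."""
    # Each model maps M feature maps -> (horizontal, vertical) congestion maps.
    h_macro, v_macro = model_macro(macro_feats)
    h_std, v_std = model_std(std_feats)
    # Aggregate over directions per layer (max), then over layers (mean).
    macro_map = np.maximum(h_macro, v_macro)
    std_map = np.maximum(h_std, v_std)
    return (macro_map + std_map) / 2.0

# Stand-in "models": average the feature maps and split into two directions.
def dummy_model(feats):
    m = np.mean(np.stack(feats), axis=0)
    return m, m * 0.5

feats = [np.ones((4, 4)), np.zeros((4, 4))]   # M = 2 toy feature maps
chip_map = predict_chip_congestion(feats, feats, dummy_model, dummy_model)
```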
  • FIG. 10 is a schematic structural diagram of a model training device 1000 provided by an embodiment of the present application.
  • the device 1000 may include a layering unit 1010, a determination unit 1020, and a training unit 1030, each of which is described in detail as follows.
  • a layering unit 1010, configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips, and K is a positive integer; a determination unit 1020, configured to determine M first feature maps corresponding to each of the prediction layers, where the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer;
  • a training unit 1030 configured to add M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and use the data set to train a congestion prediction model.
  • the above-mentioned training unit is further configured to: obtain a real congestion map corresponding to each of the prediction layers based on the K semiconductor chips after global routing; and add the real congestion maps corresponding to the prediction layers to the data set.
  • the above layering unit is specifically configured to: divide the plurality of metal layers into at least two prediction layers according to the manufacturing process or functional module distribution of the metal layers in each of the semiconductor chips.
  • the above determination unit is specifically configured to: obtain M second feature maps corresponding to each metal layer in each of the prediction layers, where the M second feature maps are respectively used to describe the M chip features of each metal layer; and generate the M first feature maps corresponding to each of the prediction layers based on the M second feature maps of each metal layer, where the first feature map used to describe any chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature in the respective metal layers.
  • the real congestion map corresponding to each of the prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each prediction layer.
  • the above training unit is specifically configured to: use the data set to iteratively train the congestion prediction model, where each iteration of training includes: using the congestion prediction model to process the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
  • the prediction layers contained in each of the above semiconductor chips are respectively a macro-unit layer and a non-macro-unit layer; in terms of using the data set to train the congestion prediction model, the above training unit is specifically configured to: use the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps to train a first congestion prediction model; and use the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps to train a second congestion prediction model.
  • the above M chip features include one or more of pin density, network connection density, module mask or amount of routing resources.
  • for details of each unit, reference may also be made to the corresponding descriptions of the method embodiments shown in FIG. 5 and FIG. 7 .
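One iteration of the training procedure described above (predict from the M first feature maps of a prediction layer, then update the model from the real congestion map) can be sketched as follows. A deliberately simple per-pixel linear model with gradient-descent updates stands in for the unspecified network; the weights, learning rate, and mean-squared-error loss are illustrative assumptions, not the application's method:

```python
import numpy as np

def train_step(weights, feature_maps, real_map, lr=0.1):
    """One iteration: predict a congestion map from the M feature maps,
    then update the model based on the real (ground-truth) congestion map."""
    x = np.stack(feature_maps)                 # (M, H, W)
    pred = np.tensordot(weights, x, axes=1)    # weighted sum of feature maps
    err = pred - real_map
    # Gradient of mean((pred - real)^2) with respect to each weight.
    grad = np.array([2.0 * np.mean(err * xi) for xi in x])
    return weights - lr * grad, float(np.mean(err ** 2))

rng = np.random.default_rng(0)
feats = [rng.random((8, 8)) for _ in range(3)]         # M = 3 feature maps
true_w = np.array([0.5, -0.2, 0.8])
real = np.tensordot(true_w, np.stack(feats), axes=1)   # synthetic ground truth

w = np.zeros(3)
losses = []
for _ in range(500):
    w, loss = train_step(w, feats, real)
    losses.append(loss)
```

On this synthetic data the loss decreases monotonically toward zero and the weights recover the generating coefficients, which is all the sketch is meant to show.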
  • FIG. 11 is a schematic structural diagram of an image processing apparatus 1100 provided in an embodiment of the present application.
  • the apparatus 1100 includes a determining unit 1110 and a processing unit 1120 .
  • a determination unit 1110, configured to determine M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer; a processing unit 1120, configured to use a congestion prediction model to process the M first feature maps corresponding to each of the prediction layers to obtain a predicted congestion map corresponding to each of the prediction layers, where the congestion prediction model is obtained by training on a data set.
  • the data set includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data corresponding to each of the training prediction layers includes M first training feature maps and a real congestion map, where the M first training feature maps are respectively used to describe the M chip features of each of the training prediction layers, and the real congestion map is used to describe the real congestion of each of the training prediction layers.
  • each of the training semiconductor chips includes at least two training prediction layers, and each of the training prediction layers includes at least one metal layer.
  • the above device further includes: an aggregation unit, configured to aggregate the predicted congestion maps corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
  • the predicted congestion map corresponding to each of the prediction layers includes a vertical predicted congestion map and a horizontal predicted congestion map.
  • the above aggregation unit is specifically configured to: use a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to each of the prediction layers to obtain a reference vertical predicted congestion map; use the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to each of the prediction layers to obtain a reference horizontal predicted congestion map; and use a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or, use the directional aggregation operator to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each of the prediction layers to obtain a reference predicted congestion map corresponding to each of the prediction layers, and use the hierarchical aggregation operator to aggregate the reference predicted congestion maps to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
  • the training prediction layers included in each training semiconductor chip are obtained by dividing the metal layers in each training semiconductor chip according to their manufacturing process or functional module distribution.
  • the M first training feature maps corresponding to each of the training prediction layers are obtained from the M second feature maps corresponding to each metal layer in each of the training prediction layers; the first training feature map used to describe any chip feature among the M first training feature maps is obtained based on the second feature maps describing that chip feature in the respective metal layers.
  • the real congestion map corresponding to each of the training prediction layers includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the training prediction layers includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
  • the plurality of training prediction layers are divided into a macro-unit layer and a non-macro-unit layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training on the first training feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training on the first training feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
  • the M chip features include one or more of pin density, network connection density, module mask, or amount of routing resources.
  • the image processing apparatus 1100 may be used to perform the corresponding steps of the image processing method 700 described in FIG. 7 , which are not repeated here.
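Among the chip features listed above (pin density, network connection density, module mask, amount of routing resources), pin density is the simplest to illustrate. The sketch below rasterizes pin coordinates onto the grid of the rasterized-chip setting; the function name, grid size, and normalization are assumptions for illustration, not the application's defined procedure:

```python
import numpy as np

def pin_density_map(pins, grid=(4, 4), die=(1.0, 1.0)):
    """Rasterize pin coordinates into a per-grid pin-density feature map.
    `pins` is a list of (x, y) positions on the die; the die is divided
    into grid[0] x grid[1] cells."""
    gx, gy = grid
    density = np.zeros((gy, gx))
    for x, y in pins:
        col = min(int(x / die[0] * gx), gx - 1)
        row = min(int(y / die[1] * gy), gy - 1)
        density[row, col] += 1
    return density / max(len(pins), 1)   # normalize to a fraction of all pins

pins = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9), (0.8, 0.85)]
fmap = pin_density_map(pins)
```

The other listed features would be built analogously, with per-grid counts or masks replacing the pin counts.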
  • FIG. 12 is a schematic diagram of a hardware structure of a model training device 1200 provided by an embodiment of the present application.
  • the model training apparatus 1200 shown in FIG. 12 includes a memory 1201 , a processor 1202 , a communication interface 1203 and a bus 1204 .
  • the memory 1201 , the processor 1202 , and the communication interface 1203 are connected to each other through a bus 1204 .
  • the memory 1201 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 1201 may store a program. When the program stored in the memory 1201 is executed by the processor 1202, the processor 1202 and the communication interface 1203 are used to execute each step of the method for training the congestion prediction model in the embodiment of the present application.
  • the processor 1202 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, configured to execute related programs to realize the functions required by the units in the congestion prediction model training device of the embodiment of the present application, or to execute the congestion prediction model training method of the method embodiment of the present application.
  • the processor 1202 may also be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the method for training the congestion prediction model of the present application may be completed by an integrated logic circuit of hardware in the processor 1202 or instructions in the form of software.
  • the above-mentioned processor 1202 may also be a general-purpose processor, a digital signal processor (digital signal processor, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 1201; the processor 1202 reads the information in the memory 1201 and, in combination with its hardware, completes the functions required by the units included in the congestion prediction model training device of the embodiment of the present application, or executes the congestion prediction model training method of the method embodiment of the present application.
  • the communication interface 1203 implements communication between the apparatus 1200 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver. For example, training data can be obtained through the communication interface 1203 .
  • the bus 1204 may include a pathway for transferring information between various components of the device 1200 (eg, memory 1201 , processor 1202 , communication interface 1203 ).
  • FIG. 13 is a schematic diagram of a hardware structure of an image processing apparatus 1300 provided by an embodiment of the present application.
  • the image processing apparatus 1300 may be a computer, a mobile phone, a tablet computer or other possible terminal devices, which is not limited in this application.
  • the image processing apparatus 1300 shown in FIG. 13 (the apparatus 1300 may specifically be a computer device) includes a memory 1301 , a processor 1302 , a communication interface 1303 and a bus 1304 .
  • the memory 1301 , the processor 1302 , and the communication interface 1303 are connected to each other through a bus 1304 .
  • the memory 1301 may be a read only memory (read only memory, ROM), a static storage device, a dynamic storage device or a random access memory (random access memory, RAM).
  • the memory 1301 may store programs, and when the programs stored in the memory 1301 are executed by the processor 1302, the processor 1302 and the communication interface 1303 are used to execute various steps of the image processing method of the embodiment of the present application.
  • the processor 1302 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application specific integrated circuit (application specific integrated circuit, ASIC), a graphics processing unit (graphics processing unit, GPU), or one or more integrated circuits, configured to execute related programs to realize the functions required by the units in the image processing device of the embodiment of the present application, or to execute the image processing method of the method embodiment of the present application.
  • the processor 1302 may also be an integrated circuit chip with signal processing capability. In the implementation process, each step of the image processing method of the present application may be completed by an integrated logic circuit of hardware in the processor 1302 or instructions in the form of software.
  • the above-mentioned processor 1302 may also be a general-purpose processor, a digital signal processor (digital signal processor, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • Various methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 1301; the processor 1302 reads the information in the memory 1301 and, in combination with its hardware, completes the functions required by the units included in the image processing device of the embodiment of the present application, or executes the image processing method of the method embodiment of the present application.
  • the communication interface 1303 implements communication between the apparatus 1300 and other devices or communication networks by using a transceiver device such as but not limited to a transceiver. For example, training data can be obtained through the communication interface 1303 .
  • the bus 1304 may include pathways for transferring information between various components of the device 1300 (eg, memory 1301 , processor 1302 , communication interface 1303 ).
  • although the device 1200 and the device 1300 shown in FIG. 12 and FIG. 13 only show a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the device 1200 and the device 1300 also include other devices necessary for proper operation. Meanwhile, according to specific needs, those skilled in the art should understand that the device 1200 and the device 1300 may also include hardware devices for implementing other additional functions. In addition, those skilled in the art should understand that the device 1200 and the device 1300 may include only the devices necessary to realize the embodiment of the present application, and not all the devices shown in FIG. 12 or FIG. 13 .
  • the above apparatus 1200 is equivalent to the training device 120 in FIG. 1
  • the apparatus 1300 is equivalent to the execution device 110 in FIG. 1 .
  • the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the embodiment of the present application also provides a chip system; the chip system includes at least one processor, a memory, and an interface circuit, where the memory, the interface circuit, and the at least one processor are interconnected by wires, and instructions are stored in the memory; when the instructions are executed by the processor, the methods described above in FIG. 5 and/or FIG. 7 are implemented.
  • an embodiment of the present application also provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a network device, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
  • the embodiment of the present application further provides a computer program product; when the computer program product runs on a terminal, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation, there may be other division manners.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (read-only memory, ROM), a random access memory (random access memory, RAM), a magnetic disk, an optical disc, or other media that can store program codes.


Abstract

A congestion prediction model training method, and an image processing method and apparatus. The training method comprises: dividing a plurality of metal layers into at least two prediction layers, wherein the plurality of metal layers are metal layers included in each of K semiconductor chips, and K is a positive integer; determining M first feature maps corresponding to each prediction layer, wherein the M first feature maps are respectively used for describing M chip features of each prediction layer, and M is a positive integer; and adding the M first feature maps corresponding to each prediction layer in the K semiconductor chips to a data set, and training a congestion prediction model by using the data set. By means of the method, the time consumption of congestion prediction can be reduced while the congestion prediction accuracy of the semiconductor chips is improved.

Description

Congestion prediction model training method, image processing method and apparatus

Technical Field
The present application relates to the technical field of electronic design automation (Electronic Design Automation, EDA), and in particular to a congestion prediction model training method, an image processing method, and an apparatus.
Background Art
Congestion prediction (Congestion Prediction), as an important link in chip EDA physical design, runs through the entire design flow. Whether a placement solution is congested directly determines indicators such as the chip's delay and thermal power consumption. The goal of congestion prediction is: during global placement (Global Placement, GP), to estimate the routing congestion level of the chip according to the current cell (Cell) placement positions, so as to provide an optimization basis for global placement, enabling the placer (Placer) to scatter the cells in heavily congested areas and reduce the placement congestion of those areas, thereby reducing the overall congestion of the chip. Its essence is to predict, on the rasterized chip, the difference between the routing resource demand (Routing Tracks Demand) of each grid (Grid) and the given routing resource amount (Routing Track Capacity).
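The per-grid definition at the end of the paragraph above (demand minus given capacity on a rasterized chip) can be written directly; the array names and toy values are illustrative:

```python
import numpy as np

def congestion_map(demand, capacity):
    """Per-grid congestion: routing-track demand minus routing-track
    capacity; positive values mark over-congested grids."""
    return demand - capacity

demand = np.array([[3, 7], [10, 2]])      # tracks demanded per grid
capacity = np.array([[5, 5], [5, 5]])     # tracks available per grid
cmap = congestion_map(demand, capacity)
overflow = np.clip(cmap, 0, None)         # keep only over-congested grids
```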
Limited by the huge scale and complex structure of chips, existing congestion prediction methods have the following limitations: low congestion prediction accuracy, long congestion computation time, and difficulty in balancing accuracy against prediction time.
Summary of the Invention
The embodiment of the present application discloses a congestion prediction model training method, an image processing method, and an apparatus; the congestion prediction method can reduce the time consumption of congestion prediction while improving the accuracy of semiconductor chip congestion prediction.
In a first aspect, an embodiment of the present application discloses a congestion prediction model training method. The method includes: dividing a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; determining M first feature maps corresponding to each of the prediction layers, where the M first feature maps are respectively used to describe M chip features of each of the prediction layers and M is a positive integer; and adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and using the data set to train a congestion prediction model.
It should be understood that the embodiment of the present application may divide the metal layers in each semiconductor chip into at least two prediction layers in different manners, so that part or all of the feature data of the different metal layers within the same prediction layer exhibit strong correlation and consistency; each prediction layer contains at least one metal layer.
It can be seen that, in the embodiment of the present application, by grouping the metal layers in a semiconductor chip, each semiconductor chip is divided into at least two prediction layers. Since the feature data (that is, the feature maps) corresponding to the metal layers contained in each prediction layer exhibit strong correlation and consistency, on the one hand this avoids the mutual interference, present in the prior art where no layering is performed, of training on widely differing feature data from different metal layers; on the other hand, when the feature data corresponding to different prediction layers are used separately to train the congestion prediction model, because the feature data in different prediction layers exhibit different trends, the resulting model can effectively identify feature data with different trends and make corresponding predictions based on the identified features; that is, the model has refined identification and prediction capabilities. In summary, the training method in the embodiment of the present application can effectively improve the prediction accuracy and generalization ability of the model.
In a feasible implementation manner, the above method further includes: performing global routing on the K semiconductor chips, and obtaining a real congestion map corresponding to each prediction layer according to the K semiconductor chips after global routing; and adding the real congestion map corresponding to each prediction layer of the K semiconductor chips to the data set.
It can be seen that, in the embodiments of the present application, the real congestion map of each prediction layer is computed from the K semiconductor chips after global routing, so that the real congestion map can subsequently be compared with the predicted congestion map of each prediction layer to adjust the parameters of the congestion prediction model and thereby obtain an optimal congestion prediction model.
In a feasible implementation manner, dividing the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules on them.
It can be seen that, in the embodiments of the present application, the plurality of metal layers in each semiconductor chip can be divided into at least two prediction layers according to the manufacturing process of each metal layer and/or the distribution of functional modules on each metal layer. For example, the metal layers in each semiconductor chip may be divided into two prediction layers according to whether macro cells are present: one prediction layer contains both macro cells and standard cells, while the other contains only standard cells. Alternatively, the plurality of metal layers may be divided into at least two prediction layers based on differences in the amount of routing resources of each metal layer, such that the metal layers within each prediction layer have comparable amounts of routing resources. Dividing the metal layers into different prediction layers in this way makes the feature data of the metal layers within the same prediction layer highly consistent, while the feature data corresponding to different prediction layers differ considerably. Therefore, after the congestion prediction model is trained with the feature data corresponding to each prediction layer, the obtained model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model has refined recognition and prediction capabilities.
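The two division criteria above can be sketched as follows. This is a hypothetical illustration only: the layer attributes (`has_macro`, `routing_capacity`) and the single-cut grouping heuristic are assumptions for the example, not a layout fixed by this application.

```python
# Hypothetical sketch: the layer attributes and grouping heuristic are
# illustrative assumptions; the application does not fix a data layout.
def split_into_prediction_layers(metal_layers, by="macro"):
    """Group metal layers into prediction layers.

    metal_layers: list of dicts with keys 'name', 'has_macro' (bool),
                  'routing_capacity' (routing tracks per grid).
    by: 'macro'    -> two groups: layers with macro cells vs. without;
        'capacity' -> group layers whose routing resources are comparable.
    """
    if by == "macro":
        macro = [m for m in metal_layers if m["has_macro"]]
        non_macro = [m for m in metal_layers if not m["has_macro"]]
        return [g for g in (macro, non_macro) if g]
    # 'capacity': sort by routing capacity and split at the largest gap
    # between consecutive layers, so each prediction layer contains
    # metal layers with comparable amounts of routing resources.
    layers = sorted(metal_layers, key=lambda m: m["routing_capacity"])
    gaps = [layers[i + 1]["routing_capacity"] - layers[i]["routing_capacity"]
            for i in range(len(layers) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [layers[:cut], layers[cut:]]
```

Either criterion yields at least two prediction layers whose member metal layers carry similar feature data, which is the property the training method relies on.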
In a feasible implementation manner, determining the M first feature maps corresponding to each prediction layer includes: obtaining M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps are respectively used to describe the M chip features of that metal layer; and generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each prediction layer, where the first feature map describing any given chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature in each metal layer.
It should be understood that obtaining the first feature map describing a given chip feature from the second feature maps describing that chip feature in each metal layer may include: taking a weighted average, the maximum value, or the minimum value of the corresponding pixels of the second feature maps that describe the same chip feature across the metal layers, so as to obtain the pixel values of the corresponding pixels of the first feature map describing that chip feature.
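The pixel-wise reduction just described can be written compactly with numpy. This is a minimal sketch under the stated options (mean, weighted average, maximum, minimum); the function name and array layout are assumptions for illustration.

```python
import numpy as np

# Minimal sketch of the pixel-wise aggregation described above; the
# reduction modes and the weighting scheme are illustrative assumptions.
def aggregate_feature_maps(second_maps, mode="mean", weights=None):
    """Combine per-metal-layer second feature maps into one first
    feature map for the prediction layer.

    second_maps: array of shape (num_metal_layers, H, W), one rasterized
                 map per metal layer for the same chip feature.
    mode: 'mean' (optionally weighted via `weights`), 'max', or 'min'.
    """
    maps = np.asarray(second_maps, dtype=float)
    if mode == "mean":
        # np.average reduces over the metal-layer axis; with `weights`
        # it computes the weighted average per pixel.
        return np.average(maps, axis=0, weights=weights)
    if mode == "max":
        return maps.max(axis=0)
    if mode == "min":
        return maps.min(axis=0)
    raise ValueError(f"unknown mode: {mode}")
```

The same helper also covers the reduction used later for the per-layer real congestion maps, since that step applies the same class of pixel-wise operations.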
It can be seen that, in the embodiments of the present application, each metal layer corresponds to M second feature maps, which respectively describe the M chip features of that metal layer. With the above layering scheme, within each prediction layer, the second feature maps that the metal layers contribute for the same chip feature have strong correlation and consistency, so the first feature map describing that chip feature, obtained by taking the mean, maximum, or minimum as described above, can accurately characterize that chip feature across the metal layers; that is, the first feature map describing a given chip feature has good correlation and consistency with the second feature map describing that chip feature in each metal layer. This avoids the situation in which the second feature maps describing the same chip feature differ greatly across the metal layers of one prediction layer, so that the resulting first feature map would deviate substantially from the second feature map of each individual metal layer. In summary, for any given chip feature, the above layering scheme and the above way of determining the first feature map yield a first feature map that accurately reflects that chip feature on every metal layer of each prediction layer. Therefore, after the congestion prediction model is trained with the first feature maps corresponding to each prediction layer, the obtained model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model trained by the embodiments of the present application has refined recognition and prediction capabilities.
In a feasible implementation manner, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in that prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in that prediction layer.
It should be understood that obtaining the first horizontal real congestion map of a prediction layer from the second horizontal real congestion maps of its metal layers may include: taking the mean, a weighted average, or the maximum value of the second horizontal real congestion maps corresponding to the metal layers to obtain the first horizontal real congestion map of that prediction layer. In addition, those skilled in the art may obtain the first horizontal real congestion map from the second horizontal real congestion maps in other ways, which is not limited in the present application. Likewise, the first vertical real congestion map of a prediction layer is obtained in the same way as the first horizontal real congestion map, and details are not repeated here.
It can be seen that, in the embodiments of the present application, with the above layering scheme, the real congestion maps of the metal layers within each prediction layer have strong correlation and consistency. Therefore, the real congestion map of each prediction layer determined in the above way is well consistent with the real congestion maps of the metal layers in that prediction layer; that is, a real congestion map that accurately reflects the congestion degree of each prediction layer can be obtained, which in turn ensures the prediction accuracy of the congestion prediction model subsequently trained with the real congestion map of each prediction layer.
In a feasible implementation manner, adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to a data set and training the congestion prediction model with the data set includes: performing iterative training of the congestion prediction model with the data set, where each training iteration includes: inputting the M first feature maps corresponding to any prediction layer in the data set into the congestion prediction model to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
It can be seen that, in the embodiments of the present application, the congestion prediction model may be obtained through multiple rounds of training. In each training iteration, the model input is the feature data of one prediction layer (its M first feature maps). Since the feature data of each prediction layer accurately reflect the corresponding features of the metal layers in that prediction layer, and the feature data corresponding to different prediction layers differ considerably, the prediction model can perform congestion prediction in a refined manner based on the feature data of different prediction layers, obtaining a predicted congestion map that accurately reflects the congestion degree of each prediction layer. The model parameters are then updated based on the predicted congestion map and the real congestion map of each prediction layer, so that the trained model has high prediction accuracy and strong generalization ability.
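One training iteration as described above can be sketched as follows. Note the assumptions: a toy linear per-pixel model with mean-squared-error loss stands in for the congestion prediction model, which in practice would be a neural network; only the predict-then-update structure of the iteration is taken from the text.

```python
import numpy as np

# Toy sketch of one training iteration. A linear per-pixel model stands
# in for the congestion prediction model; the loss and learning rate
# are illustrative assumptions.
def train_step(weights, first_feature_maps, real_congestion_map, lr=0.01):
    """One iteration: predict a congestion map from the M first feature
    maps of one prediction layer, then update the model from the error
    against that layer's real congestion map.

    weights: shape (M,), one coefficient per chip feature.
    first_feature_maps: shape (M, H, W).
    real_congestion_map: shape (H, W).
    Returns (updated_weights, predicted_congestion_map).
    """
    maps = np.asarray(first_feature_maps, dtype=float)
    pred = np.tensordot(weights, maps, axes=1)   # (H, W) predicted map
    err = pred - real_congestion_map             # per-grid error
    # Gradient of the mean-squared error with respect to each weight.
    grad = np.tensordot(maps, err, axes=([1, 2], [0, 1])) * (2.0 / err.size)
    return weights - lr * grad, pred
```

Iterating this step over the prediction layers of the data set drives the predicted congestion maps toward the real ones, which is the update loop the implementation manner describes.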
In a feasible implementation manner, the prediction layers contained in each semiconductor chip are a macro-cell layer and a non-macro-cell layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and training the congestion prediction model with the data set includes: training the first congestion prediction model with the first feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion maps; and training the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion maps.
It can be seen that, in the embodiments of the present application, when the plurality of metal layers in each semiconductor chip are divided into a macro-cell layer and a non-macro-cell layer (that is, into two prediction layers), the feature data corresponding to the macro-cell layer and the non-macro-cell layer differ considerably. The first congestion prediction model can therefore be trained with the feature data corresponding to the macro-cell layer, and the second congestion prediction model with the feature data corresponding to the non-macro-cell layer, yielding one prediction model that predicts accurately from macro-cell-layer feature data and another that predicts accurately from non-macro-cell feature data; that is, the prediction accuracy of the models is improved. In addition, using the two models to predict different prediction layers of the same semiconductor chip simultaneously can speed up congestion prediction for that chip.
In a feasible implementation manner, the above M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
It should be understood that the M chip features corresponding to each metal layer and each prediction layer may also include chip features other than the above four, which is not limited in the present application.
It can be seen that, in the embodiments of the present application, the above M chip features may include one or more of pin density, net connection density, module mask, or amount of routing resources, which reflect chip functionality and on-chip devices. By obtaining the first feature maps of the chip features corresponding to each prediction layer and training the prediction model with M first feature maps that accurately reflect chip functionality and on-chip device characteristics, the prediction accuracy of the trained prediction model is improved.
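As a concrete illustration of one listed feature, a pin-density map can be obtained by counting pins per grid cell on the rasterized chip. The grid size, the pin coordinate format, and the max-normalization are assumptions for this sketch, not details fixed by the application.

```python
import numpy as np

# Illustrative sketch of the pin-density chip feature: count pins per
# grid cell on the rasterized chip. Grid size, coordinate format, and
# the normalization are assumptions for the example.
def pin_density_map(pins, chip_w, chip_h, grid=(8, 8)):
    """pins: list of (x, y) pin coordinates in chip units.
    Returns a (rows, cols) map of pin counts per grid cell,
    normalized by the maximum so values lie in [0, 1]."""
    rows, cols = grid
    density = np.zeros((rows, cols))
    for x, y in pins:
        r = min(int(y / chip_h * rows), rows - 1)  # clamp edge pins
        c = min(int(x / chip_w * cols), cols - 1)
        density[r, c] += 1
    peak = density.max()
    return density / peak if peak > 0 else density
```

The other listed features (net connection density, module mask, routing resource amount) would be rasterized onto the same grid in an analogous way, so that each prediction layer contributes M aligned feature maps.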
In a second aspect, an embodiment of the present application discloses an image processing method, including: determining M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer; and processing the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain a predicted congestion map corresponding to each prediction layer. The congestion prediction model is obtained by training on a data set; the data set includes training data respectively corresponding to the training prediction layers contained in each of a plurality of training semiconductor chips; the training data corresponding to each training prediction layer includes M first training feature maps and a real congestion map; the M first training feature maps are respectively used to describe the M chip features of that training prediction layer; the real congestion map is used to describe the real congestion degree of that training prediction layer; each training semiconductor chip includes at least two training prediction layers; and each training prediction layer includes at least one metal layer.
It should be understood that the M first feature maps corresponding to each prediction layer are determined in the same way as the M first training feature maps of each training prediction layer, and details are not repeated here.
It can be seen that, in the embodiments of the present application, when congestion prediction is performed with the congestion prediction model trained by the model training method of the first aspect, the trained model has good prediction accuracy and generalization ability, so a more accurate predicted congestion map can be obtained for each prediction layer of the semiconductor chip to be predicted; that is, the accuracy of the predicted congestion map is improved, which in turn facilitates correspondingly optimizing the chip layout during chip production by using the accurate predicted congestion map obtained for each prediction layer.
In a feasible implementation manner, the above method further includes: aggregating the predicted congestion maps corresponding to all the prediction layers in the semiconductor chip to be predicted, to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
It can be seen that, in the embodiments of the present application, since the predicted congestion map corresponding to each prediction layer has high accuracy, the predicted congestion map describing the congestion degree of the chip to be predicted, obtained by aggregating the predicted congestion maps of the prediction layers, also has high accuracy, which in turn facilitates correspondingly optimizing the chip layout during chip production by using the predicted congestion map of the chip to be predicted.
In a feasible implementation manner, the predicted congestion map corresponding to each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map; aggregating the predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted includes: aggregating the vertical predicted congestion maps corresponding to the prediction layers with a hierarchical aggregation operator to obtain a reference vertical predicted congestion map, aggregating the horizontal predicted congestion maps corresponding to the prediction layers with the hierarchical aggregation operator to obtain a reference horizontal predicted congestion map, and aggregating the reference vertical predicted congestion map and the reference horizontal predicted congestion map with a directional aggregation operator to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or aggregating the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer with the directional aggregation operator to obtain a reference predicted congestion map corresponding to each prediction layer, and aggregating the reference predicted congestion maps corresponding to the prediction layers with the hierarchical aggregation operator to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
It can be seen that, in the embodiments of the present application, for the chip to be predicted, the hierarchical aggregation operator and the directional aggregation operator can be used to aggregate the predicted congestion maps corresponding to the prediction layers, obtaining the predicted congestion map of the chip to be predicted. The operations of the hierarchical aggregation operator and the directional aggregation operator include, but are not limited to, taking a weighted average, the maximum value, or the minimum value of the predicted congestion maps. Since the predicted congestion map corresponding to each prediction layer obtained by the prediction model has high accuracy, once the specific operations of the hierarchical aggregation operator and the directional aggregation operator are determined according to the specific scenario, the predicted congestion map of the chip to be predicted obtained via the aggregation operators also has high accuracy.
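The two aggregation orders can be sketched as follows. As an assumption for illustration, both operators are shown as element-wise maxima; per the text they could equally be weighted averages or minima, and the function names are invented for this sketch.

```python
import numpy as np

# Sketch of the two aggregation orders. Both operators are shown as
# element-wise maxima purely for illustration; weighted averages or
# minima are equally valid choices per the description.
def hierarchical_agg(per_layer_maps):
    """Aggregate one directional congestion map across prediction layers."""
    return np.max(np.asarray(per_layer_maps), axis=0)

def directional_agg(h_map, v_map):
    """Combine horizontal and vertical congestion into a single map."""
    return np.maximum(h_map, v_map)

def chip_congestion(h_maps, v_maps, order="layers_first"):
    """h_maps / v_maps: (num_prediction_layers, H, W) predicted maps."""
    if order == "layers_first":
        # First across prediction layers, then across directions.
        return directional_agg(hierarchical_agg(h_maps),
                               hierarchical_agg(v_maps))
    # First across directions within each layer, then across layers.
    refs = [directional_agg(h, v) for h, v in zip(h_maps, v_maps)]
    return hierarchical_agg(refs)
```

With the maximum used for both operators the two orders give identical results; with mixed operators (for example an average across layers and a maximum across directions) the two orders generally differ, which is why the implementation manner lists them as alternatives.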
In a feasible implementation manner, the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
In a feasible implementation manner, the training prediction layers contained in each training semiconductor chip are obtained by division based on the manufacturing process of the metal layers in that training semiconductor chip or the distribution of functional modules on them.
In a feasible implementation manner, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer, where the first training feature map describing any given chip feature among the M first training feature maps is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation manner, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in that training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in that training prediction layer.
In a feasible implementation manner, during each training iteration of the congestion prediction model, the congestion prediction model is updated by means of the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer; the predicted congestion map corresponding to that training prediction layer is obtained by inputting the M first training feature maps corresponding to it into the congestion prediction model.
In a feasible implementation manner, the plurality of training prediction layers are divided into a macro-cell layer and a non-macro-cell layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion maps.
In a feasible implementation manner, the M chip features include one or more of pin density, net connection density, module mask, or amount of routing resources.
It should be understood that, for the beneficial effects of the above embodiments describing the training process of the congestion prediction model, reference may be made to the beneficial effects of the corresponding training method in the first aspect, and details are not repeated here.
In a third aspect, the present application discloses a training apparatus for a congestion prediction model, including: a layering unit, configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer; a determining unit, configured to determine M first feature maps corresponding to each prediction layer, where the M first feature maps are respectively used to describe M chip features of each prediction layer and M is a positive integer; and a training unit, configured to use, as a data set, the M first feature maps and the real congestion map corresponding to each prediction layer contained in each of the K semiconductor chips, and to train a congestion prediction model with the data set, where the real congestion map corresponding to each prediction layer is used to describe the real congestion degree of that prediction layer.
In a feasible implementation manner, the training unit is further configured to: obtain a real congestion map corresponding to each prediction layer based on the K semiconductor chips after global routing; and add the real congestion map corresponding to each prediction layer of the K semiconductor chips to the data set.
In a feasible implementation manner, the layering unit is specifically configured to divide the plurality of metal layers into at least two prediction layers according to the manufacturing process of the metal layers in each semiconductor chip or the distribution of functional modules on them.
In a feasible implementation manner, the determining unit is specifically configured to: obtain M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps are respectively used to describe the M chip features of that metal layer; and generate, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each prediction layer, where the first feature map describing any given chip feature among the M first feature maps is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation manner, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in that prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in that prediction layer.
In a feasible implementation manner, in terms of training the congestion prediction model with the data set, the training unit is specifically configured to perform iterative training of the congestion prediction model with the data set, where each training iteration includes: processing the M first feature maps corresponding to any prediction layer in the data set with the congestion prediction model to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
In a feasible implementation manner, the prediction layers contained in each semiconductor chip are a macro-cell layer and a non-macro-cell layer; in terms of training the congestion prediction model with the data set, the training unit is specifically configured to: train the first congestion prediction model with the first feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion maps; and train the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion maps.
在一种可行的实施方式中,上述M个芯片特征包括引脚密度、网络连接密度、模块掩 码或绕线资源量中的一个或多个。In a feasible implementation manner, the above M chip features include one or more of pin density, network connection density, module mask or amount of routing resources.
第四方面,本申请公开了一种图像处理装置,该装置包括:确定单元,用于确定待预测半导体芯片中每个预测层对应的M个第一特征图;其中,所述待预测半导体芯片包括至少两个所述预测层,所述M为正整数;处理单元,用于利用拥塞预测模型对每个所述预测层对应的M个第一特征图进行处理,得到每个所述预测层对应的预测拥塞图;其中,所述拥塞预测模型是通过数据集进行训练后得到的,所述数据集包括K个训练半导体芯片中每个训练半导体芯片包含的训练预测层分别对应的训练数据,每个所述训练预测层对应的训练数据包括M个第一训练特征图和真实拥塞图,所述M个第一训练特征图分别用于描述每个所述训练预测层的M个芯片特征,所述真实拥塞图用于描述每个所述训练预测层的真实拥塞程度,每个所述训练半导体芯片包括至少两个所述训练预测层,每个所述训练预测层包括至少一个金属层。In a fourth aspect, the present application discloses an image processing device, which includes: a determination unit, configured to determine M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted; wherein, the semiconductor chip to be predicted Including at least two prediction layers, where M is a positive integer; a processing unit configured to use a congestion prediction model to process the M first feature maps corresponding to each prediction layer to obtain each prediction layer Corresponding predicted congestion map; wherein, the congestion prediction model is obtained after training through a data set, and the data set includes training data respectively corresponding to the training prediction layer contained in each of the K training semiconductor chips, The training data corresponding to each of the training prediction layers includes M first training feature maps and real congestion maps, and the M first training feature maps are respectively used to describe M chip features of each of the training prediction layers, The real congestion map is used to describe the real congestion level of each training prediction layer, each of the training semiconductor chips includes at least two training prediction layers, and each of the training prediction layers includes at least one metal layer.
在一种可行的实施方式中,上述装置还包括:聚合单元,用于对所述待预测半导体芯片中所有预测层对应的预测拥塞图进行聚合,得到所述待预测半导体芯片对应的预测拥塞图。In a feasible implementation manner, the above device further includes: an aggregation unit, configured to aggregate the predicted congestion graphs corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion graph corresponding to the semiconductor chip to be predicted .
在一种可行的实施方式中,每个所述预测层对应的预测拥塞图包含垂直预测拥塞图和水平预测拥塞图;上述聚合单元具体用于:利用层级聚合算子对每个所述预测层对应的垂直预测拥塞图进行聚合,得到参考垂直预测拥塞图;利用所述层级聚合算子对每个所述预测层对应的水平预测拥塞图进行聚合,得到参考水平预测拥塞图;利用方向性聚合算子对所述参考垂直预测拥塞图和所述参考水平预测拥塞图进行聚合,得到所述待预测半导体芯片对应的预测拥塞图;或,利用所述方向性聚合算子对每个所述预测层对应的垂直预测拥塞图和水平预测拥塞图进行聚合,得到每个所述预测层对应的参考预测拥塞图;利用所述层级聚合算子对每个所述预测层对应的参考预测拥塞图进行聚合,得到所述待预测半导体芯片对应的预测拥塞图。In a feasible implementation manner, the predicted congestion map corresponding to each of the prediction layers includes a vertical predicted congestion map and a horizontal predicted congestion map; the above aggregation unit is specifically configured to: use a hierarchical aggregation operator to Aggregating the corresponding vertical predicted congestion graphs to obtain a reference vertical predicted congestion graph; using the hierarchical aggregation operator to aggregate the horizontal predicted congestion graphs corresponding to each of the prediction layers to obtain a reference horizontal predicted congestion graph; using directional aggregation The operator aggregates the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or, using the directional aggregation operator to Aggregating the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each layer to obtain a reference predicted congestion map corresponding to each of the predicted layers; using the hierarchical aggregation operator to perform aggregated to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
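The two aggregation orders just described can be sketched as follows. In this illustrative numpy sketch both the hierarchical aggregation operator and the directional aggregation operator are assumed, for simplicity, to be an element-wise maximum; the actual operators of this application may differ.

```python
import numpy as np

def aggregate_chip(v_maps, h_maps):
    """Aggregate per-prediction-layer vertical/horizontal predicted
    congestion maps into one chip-level map, in both orders described
    above. v_maps, h_maps: arrays of shape (num_layers, H, W)."""
    # Order 1: hierarchical aggregation per direction, then directional aggregation
    ref_v = v_maps.max(axis=0)               # reference vertical predicted congestion map
    ref_h = h_maps.max(axis=0)               # reference horizontal predicted congestion map
    chip_a = np.maximum(ref_v, ref_h)

    # Order 2: directional aggregation per layer, then hierarchical aggregation
    per_layer = np.maximum(v_maps, h_maps)   # reference predicted map of each layer
    chip_b = per_layer.max(axis=0)
    return chip_a, chip_b

rng = np.random.default_rng(1)
v = rng.random((3, 4, 4))
h = rng.random((3, 4, 4))
a, b = aggregate_chip(v, h)
assert np.allclose(a, b)   # with element-wise max the two orders coincide
```

With an element-wise maximum the two orders give the same result; with other conceivable operators (for example a weighted sum over levels) they generally differ, which is presumably why the two variants are listed separately.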
In a feasible implementation, the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
In a feasible implementation, the training prediction layers contained in each training semiconductor chip are obtained by division based on the manufacturing process or the functional module distribution of the metal layers in each training semiconductor chip.
In a feasible implementation, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer; among the M first training feature maps, the first training feature map describing any chip feature is obtained based on the second feature maps describing that chip feature in the metal layers.
In a feasible implementation, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
In a feasible implementation, in each training iteration of the congestion prediction model, the congestion prediction model is updated with the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer; the predicted congestion map corresponding to that training prediction layer is obtained by inputting the M first training feature maps corresponding to that training prediction layer into the congestion prediction model.
In a feasible implementation, the multiple training prediction layers are divided into macro-cell layers and non-macro-cell layers, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. The first congestion prediction model is obtained through training with the first training feature maps corresponding to the macro-cell layers in the data set and the corresponding real congestion maps; the second congestion prediction model is obtained through training with the first training feature maps corresponding to the non-macro-cell layers in the data set and the corresponding real congestion maps.
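The split into two models implies a simple dispatch at inference time: feature maps of macro-cell layers go to the first model and those of non-macro-cell layers to the second. A minimal sketch follows; both model functions here are hypothetical stand-ins, not the trained networks of this application.

```python
import numpy as np

def predict(layers):
    """Dispatch each prediction layer to the congestion prediction model
    trained for its type. 'layers' is a list of (is_macro, feature_maps)
    pairs; the two models below are illustrative stand-ins."""
    model_macro = lambda f: f.mean(axis=0)       # stand-in for the first model
    model_non_macro = lambda f: f.max(axis=0)    # stand-in for the second model
    out = []
    for is_macro, feats in layers:
        model = model_macro if is_macro else model_non_macro
        out.append(model(feats))                 # one predicted congestion map per layer
    return out

feats = np.ones((2, 4, 4))
maps = predict([(True, feats), (False, 2 * feats)])
assert maps[0].shape == (4, 4) and maps[1].shape == (4, 4)
assert float(maps[1].max()) == 2.0
```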
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or routing resource amount.
In a fifth aspect, this application discloses a chip system, where the chip system includes at least one processor, a memory, and an interface circuit; the memory, the interface circuit, and the at least one processor are interconnected by lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the method according to any one of the first aspect and/or the second aspect is implemented.
In a sixth aspect, this application discloses a terminal device, where the terminal device includes the chip system according to the fifth aspect, and a discrete device coupled to the chip system.
In a seventh aspect, this application discloses a computer-readable storage medium, where the computer-readable storage medium stores program instructions, and when the program instructions are run on a processor, the method according to any one of the first aspect and/or the second aspect is implemented.
In an eighth aspect, this application discloses a computer program product, where when the computer program product is run on a terminal, the method according to any one of the first aspect and/or the second aspect is implemented.
Description of drawings
The accompanying drawings used in the embodiments of this application are introduced below.
FIG. 1 is a schematic structural diagram of a system architecture provided by an embodiment of this application;
FIG. 2 is a schematic structural diagram of a network model provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a chip hardware structure provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of another system architecture provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of a congestion prediction model training method provided by an embodiment of this application;
FIG. 6 is a schematic diagram of the hierarchical division of a semiconductor chip provided by an embodiment of this application;
FIG. 7 is a schematic flowchart of an image processing method provided by an embodiment of this application;
FIG. 8 is a schematic diagram of the spatial relationship between a first feature map and a fourth feature map provided by an embodiment of this application;
FIG. 9 is a schematic flowchart of congestion prediction provided by an embodiment of this application;
FIG. 10 is a schematic structural diagram of a model training apparatus provided by an embodiment of this application;
FIG. 11 is a schematic structural diagram of an image processing apparatus provided by an embodiment of this application;
FIG. 12 is a schematic diagram of the hardware structure of a model training apparatus in an embodiment of this application;
FIG. 13 is a schematic diagram of the hardware structure of an image processing apparatus provided by an embodiment of this application.
Detailed description
The embodiments of this application are described below with reference to the accompanying drawings in the embodiments of this application.
The embodiments of this application can be applied to image processing tasks, for example, to congestion prediction in the physical design stage of chip electronic design automation (Electronic Design Automation, EDA), that is, predicting the congestion degree of a chip based on the feature maps (feature data) of a semiconductor chip, thereby providing an optimization basis for global placement, so that the placer can scatter the cells in heavily congested regions, reduce the placement congestion of those regions, and thereby reduce the overall congestion of the chip.
It should be understood that the images in the embodiments of this application may be static images (or referred to as static pictures) or dynamic images (or referred to as dynamic pictures); for example, the images in this application may be videos or dynamic pictures, or the images in this application may also be static pictures or photos. For ease of description, static images and dynamic images are collectively referred to as images in the following embodiments of this application.
In addition, the congestion prediction introduced above is merely one specific scenario to which the method of the embodiments of this application is applied; the method of the embodiments of this application is not limited to the foregoing scenario, and can be applied to any scenario that requires image processing. Alternatively, the method in the embodiments of this application can also be similarly applied to other fields, such as speech recognition and natural language processing, which is not limited in the embodiments of this application.
The method provided by this application is described below from the model training side and the model application side.
The congestion prediction model training method provided by the embodiments of this application involves computer vision processing, and can be specifically applied to data processing methods such as data training, machine learning, and deep learning, to perform symbolic and formalized intelligent information modeling, extraction, preprocessing, training, and the like on training data (such as the first feature maps in this application), finally obtaining a trained congestion prediction model. Moreover, the image processing method provided by the embodiments of this application can use the above trained congestion prediction model: input data (such as the feature maps in this application) is input into the trained congestion prediction model to obtain output data (such as the predicted congestion map of the chip to be predicted in this application). It should be noted that the congestion prediction model training method and the image processing method provided by the embodiments of this application are inventions arising from the same conception, and can also be understood as two parts of one system, or two stages of one overall process: a model training stage and a model application stage.
The embodiments of this application involve many applications related to neural networks. To better understand the solutions of the embodiments of this application, related terms and concepts in the fields of neural networks and computer vision that may be involved in the embodiments of this application are first introduced below.
(1) Neural network
A neural network can be composed of neural units. A neural unit can be an operation unit that takes x_s and an intercept of 1 as inputs, and the output of the operation unit can be:
h_{W,b}(x) = f(W^T x) = f( sum_{s=1..n} W_s * x_s + b )
where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, used to introduce a nonlinear characteristic into the neural network to convert the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer. The activation function may be a sigmoid function. A neural network is a network formed by joining many such single neural units together, that is, the output of one neural unit can be the input of another neural unit. The input of each neural unit can be connected to the local receptive field of the previous layer to extract the features of the local receptive field; the local receptive field may be a region composed of several neural units.
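The single neural unit above can be written out directly; the sketch below uses the sigmoid activation mentioned in the text, and the input values are arbitrary illustrative numbers.

```python
import math

def neural_unit(xs, ws, b):
    """Output of a single neural unit: f(sum_s W_s * x_s + b),
    with a sigmoid activation f."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))   # sigmoid activation

out = neural_unit([1.0, 2.0], [0.5, -0.25], 0.0)
assert out == 0.5          # z = 0.5 - 0.5 + 0 = 0, and sigmoid(0) = 0.5
assert 0.0 < out < 1.0     # sigmoid output always lies in (0, 1)
```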
(2) Deep neural network
A deep neural network (deep neural network, DNN), also known as a multi-layer neural network, can be understood as a neural network with many hidden layers; there is no particular metric for "many" here. Divided by the positions of the different layers, the neural network inside a DNN can be divided into three categories: input layer, hidden layer, and output layer. Generally, the first layer is the input layer, the last layer is the output layer, and all the layers in between are hidden layers. The layers are fully connected, that is, any neuron in the i-th layer is necessarily connected to any neuron in the (i+1)-th layer. Although a DNN looks complicated, the work of each layer is actually not complicated; simply put, it is the following linear relational expression: y = α(W·x + b), where x is the input vector, y is the output vector, b is the offset vector, W is the weight matrix (also called coefficients), and α() is the activation function. Each layer simply performs this operation on the input vector x to obtain the output vector y. Because a DNN has many layers, there are also many coefficients W and offset vectors b. These parameters are defined in the DNN as follows, taking the coefficient W as an example: assume that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as W^3_24, where the superscript 3 represents the layer in which the coefficient W is located, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4. In summary, the coefficient from the k-th neuron of the (L-1)-th layer to the j-th neuron of the L-th layer is defined as W^L_jk. It should be noted that the input layer has no W parameters. In a deep neural network, more hidden layers enable the network to better characterize complex situations in the real world. Theoretically, a model with more parameters has higher complexity and larger "capacity", which means that it can complete more complex learning tasks. Training a deep neural network is the process of learning the weight matrices, and its ultimate goal is to obtain the weight matrices of all layers of the trained deep neural network (the weight matrices formed by the vectors W of many layers).
(3) Convolutional neural network
A convolutional neural network (CNN, convolutional neural network) is a deep neural network with a convolutional structure. A convolutional neural network contains a feature extractor composed of convolutional layers and sub-sampling layers. The feature extractor can be regarded as a filter, and the convolution process can be regarded as convolving a trainable filter with an input image or a convolutional feature map. A convolutional layer is a neuron layer in a convolutional neural network that performs convolution processing on an input signal. In a convolutional layer of a convolutional neural network, a neuron may be connected to only some of the neurons in adjacent layers. A convolutional layer usually contains several feature maps, and each feature map may be composed of some neural units arranged in a rectangle. Neural units of the same feature map share weights, and the shared weights are the convolution kernel. Sharing weights can be understood as meaning that the way image information is extracted is independent of position. The underlying principle is that the statistics of one part of an image are the same as those of other parts, which means that image information learned in one part can also be used in another part, so the same learned image information can be used for all positions on the image. In the same convolutional layer, multiple convolution kernels can be used to extract different image information; generally, the more convolution kernels there are, the richer the image information reflected by the convolution operation.
A convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights through learning during the training of the convolutional neural network. In addition, a direct benefit of sharing weights is reducing the connections between the layers of the convolutional neural network while also reducing the risk of overfitting.
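Weight sharing can be made concrete with a minimal "valid" 2-D convolution: the same kernel weights are applied at every position of the image. The averaging kernel below is an arbitrary illustrative choice.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation form) with a single
    shared kernel: the same weights are applied at every position."""
    kh, kw = kernel.shape
    H = image.shape[0] - kh + 1
    W = image.shape[1] - kw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((2, 2)) / 4.0                   # shared 2x2 averaging kernel
feat = conv2d(img, k)
assert feat.shape == (3, 3)
assert feat[0, 0] == (0 + 1 + 4 + 5) / 4.0  # average of the top-left 2x2 patch
```

Using several different kernels on the same input yields several feature maps, one per kernel, which is the "multiple convolution kernels" case described above.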
(4) Loss function
In the process of training a deep neural network, because it is hoped that the output of the deep neural network is as close as possible to the value that is really desired to be predicted, the predicted value of the current network can be compared with the really desired target value, and the weight vectors of each layer of the neural network are then updated according to the difference between the two (of course, there is usually an initialization process before the first update, that is, parameters are pre-configured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vectors are adjusted to make the prediction lower, and adjustment continues until the deep neural network can predict the really desired target value or a value very close to it. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the loss function or objective function, an important equation for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the deep neural network becomes a process of reducing this loss as much as possible.
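A common concrete choice of loss function, used here purely as an illustration, is the mean-squared error: a worse prediction produces a higher loss value.

```python
def mse_loss(pred, target):
    """Mean-squared-error loss: a larger output means a larger difference
    between the predicted values and the target values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

near = mse_loss([1.0, 2.0], [1.1, 1.9])   # predictions close to the targets
far = mse_loss([1.0, 2.0], [3.0, 0.0])    # predictions far from the targets
assert near < far                          # a worse prediction yields a higher loss
```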
(5) Back propagation algorithm
A convolutional neural network can use the error back propagation (back propagation, BP) algorithm to correct the values of the parameters in the initial super-resolution model during training, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller. Specifically, forward propagation of the input signal up to the output produces an error loss, and the parameters in the initial super-resolution model are updated by back-propagating the error loss information, so that the error loss converges. The back propagation algorithm is a back propagation movement dominated by the error loss, aiming to obtain the optimal parameters of the super-resolution model, such as the weight matrices.
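A minimal numerical sketch of the idea, propagating the error loss back into parameter updates until the loss converges, is shown below. It uses a hypothetical one-parameter linear model y = w*x + b rather than the convolutional model of the text; for this one-layer case the back-propagated gradients can be written in closed form.

```python
def backprop_step(w, b, xs, ys, lr=0.1):
    """One error-back-propagation update for the model y = w * x + b:
    the error loss is propagated back into gradients on w and b, which
    are then adjusted so the loss decreases."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

def loss(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]   # underlying rule: y = 2x + 1
w, b = 0.0, 0.0
for _ in range(500):
    w, b = backprop_step(w, b, xs, ys)
assert loss(w, b, xs, ys) < 1e-3            # the error loss has converged
```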
(6) Pixel value
The pixel value of an image may be a red-green-blue (RGB) color value, and the pixel value may be a long integer representing a color. For example, a pixel value is 256*Red+100*Green+76*Blue, where Blue represents the blue component, Green represents the green component, and Red represents the red component. For each color component, a smaller value indicates lower brightness, and a larger value indicates higher brightness. For a grayscale image, the pixel value may be a grayscale value.
The system architecture provided by the embodiments of this application is introduced below.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a system architecture 100 provided by an embodiment of this application. As shown in the system architecture 100, the data collection device 160 is configured to collect training data; in the embodiments of this application, the training data includes the first feature maps and the real congestion maps corresponding to all prediction layers.
After collecting the training data, the data collection device 160 stores the training data in the database 130, and the training device 120 performs training based on the training data maintained in the database 130 to obtain the target model 101 (that is, the congestion prediction model in the embodiments of this application).
Embodiment 1 below describes in more detail how the training device 120 obtains the target model 101 based on the training data. The target model 101 can be used to implement the image processing method provided by the embodiments of this application: the M first feature maps corresponding to each prediction layer of the chip to be predicted are input into the target model 101 after related preprocessing, and the predicted congestion map corresponding to each prediction layer is obtained. The target model 101 in the embodiments of this application may specifically be a congestion prediction model; in the embodiments provided by this application, the congestion prediction model is obtained through at least one pass of training. It should be noted that, in practical applications, the training data maintained in the database 130 is not necessarily all collected by the data collection device 160, and may also be received from other devices. It should also be noted that the training device 120 does not necessarily train the target model 101 entirely based on the training data maintained in the database 130, and may also obtain training data from the cloud or elsewhere for model training; the foregoing description should not be construed as a limitation on the embodiments of this application.
根据训练设备120训练得到的目标模型101可以应用于不同的系统或设备中,如应用 于图1所示的执行设备110,执行设备110可以是终端,如平板电脑,笔记本电脑,手机终端,车载终端等,还可以是服务器或者云端等。在附图1中,执行设备110配置有输入/输出(input/output,I/O)接口112,用于与外部设备进行数据交互,用户可以通过客户设备140向I/O接口112输入数据,输入数据在本申请实施例中可以包括待预测芯片各预测层对应的第一特征图。The target model 101 trained according to the training device 120 can be applied to different systems or devices, such as the execution device 110 shown in FIG. A terminal, etc., may also be a server or a cloud. In accompanying drawing 1, execution equipment 110 is equipped with input/output (input/output, I/O) interface 112, is used for carrying out data interaction with external equipment, the user can input data to I/O interface 112 through client equipment 140, In this embodiment of the present application, the input data may include a first feature map corresponding to each prediction layer of the chip to be predicted.
在执行设备110对输入数据进行预处理,或者在执行设备110的计算模块111执行计算等相关的处理过程中,执行设备110可以调用数据存储系统150中的数据、代码等以用于相应的处理,也可以将相应处理得到的数据、指令等存入数据存储系统150中。When the execution device 110 preprocesses the input data, or in the calculation module 111 of the execution device 110 performs calculation and other related processing, the execution device 110 can call the data, codes, etc. in the data storage system 150 for corresponding processing , the correspondingly processed data and instructions may also be stored in the data storage system 150 .
最后,I/O接口112将处理结果,如上述得到的待预测芯片各预测层的预测拥塞图(或待预测芯片对应的预测拥塞图)返回给客户设备140,从而提供给用户。Finally, the I/O interface 112 returns the processing result, such as the predicted congestion map of each prediction layer of the chip to be predicted (or the corresponding predicted congestion map of the chip to be predicted) obtained above, to the client device 140 to provide to the user.
值得说明的是,训练设备120可以针对不同的目标或称不同的任务,基于不同的训练数据生成相应的目标模型101,该相应的目标模型101即可以用于实现上述目标或完成上述任务,从而为用户提供所需的结果。It is worth noting that the training device 120 can generate corresponding target models 101 based on different training data for different goals or different tasks, and the corresponding target models 101 can be used to achieve the above-mentioned goals or complete the above-mentioned tasks, thereby Provide the user with the desired result.
In the case shown in FIG. 1, the user can manually specify the input data, and this manual specification can be operated through an interface provided by the I/O interface 112. In another case, the client device 140 can automatically send input data to the I/O interface 112; if the user's authorization is required for the client device 140 to automatically send the input data, the user can set the corresponding permission in the client device 140. The user can view the result output by the execution device 110 on the client device 140, and the specific form of presentation may be display, sound, action, or another specific means. The client device 140 can also serve as a data collection terminal, collecting the input data fed to the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, as new sample data and storing them in the database 130. Of course, collection may also bypass the client device 140: the I/O interface 112 may directly store the input data fed to the I/O interface 112 and the output results of the I/O interface 112, as shown in the figure, in the database 130 as new sample data.
It is worth noting that FIG. 1 is only a schematic diagram of a system architecture provided by an embodiment of the present invention, and the positional relationships between the devices, components, modules, and the like shown in the figure do not constitute any limitation. For example, in FIG. 1 the data storage system 150 is external memory relative to the execution device 110; in other cases, the data storage system 150 may also be placed inside the execution device 110.
As shown in FIG. 1, the target model 101 is obtained by training with the training device 120. In this embodiment of the present application, the target model 101 may be obtained by training based on the congestion prediction model training method of the embodiments of the present application. Specifically, the congestion prediction model provided in the embodiments of the present application may be a convolutional neural network, a generative adversarial network, a variational autoencoder, a semantic segmentation neural network, or another model; this solution imposes no specific limitation in this respect.
As described in the introduction to basic concepts above, a convolutional neural network is a deep neural network with a convolutional structure and is a deep learning (DL) architecture. A deep learning architecture refers to performing multiple levels of learning at different levels of abstraction through machine learning algorithms. As a deep learning architecture, a CNN is a feed-forward artificial neural network in which the individual neurons can respond to images input into it.
As shown in FIG. 2, a convolutional neural network (CNN) 200 may include an input layer 210, convolutional/pooling layers 220 (where the pooling layers are optional), and neural network layers 230.
Convolutional/pooling layers 220:
Convolutional layer:
As shown in FIG. 2, the convolutional/pooling layers 220 may include, for example, layers 221-226. In one implementation, layer 221 is a convolutional layer, layer 222 is a pooling layer, layer 223 is a convolutional layer, layer 224 is a pooling layer, layer 225 is a convolutional layer, and layer 226 is a pooling layer. In another implementation, layers 221 and 222 are convolutional layers, layer 223 is a pooling layer, layers 224 and 225 are convolutional layers, and layer 226 is a pooling layer. That is, the output of a convolutional layer can serve as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
The following takes the convolutional layer 221 as an example to introduce the inner working of a single convolutional layer.
The convolutional layer 221 may include many convolution operators. A convolution operator, also called a kernel, plays a role in image processing equivalent to a filter that extracts specific information from the input image matrix. A convolution operator can essentially be a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is typically moved across the input image along the horizontal direction one pixel at a time (or two pixels at a time, depending on the value of the stride), thereby extracting a specific feature from the image. The size of the weight matrix should be related to the size of the image. Note that the depth dimension of the weight matrix is the same as the depth dimension of the input image, and during the convolution operation the weight matrix extends through the entire depth of the input image. Therefore, convolving with a single weight matrix produces a convolutional output with a single depth dimension; in most cases, however, a single weight matrix is not used, and instead multiple weight matrices of the same size (rows x columns), i.e., multiple matrices of the same shape, are applied. The outputs of the weight matrices are stacked to form the depth dimension of the convolved image, where this dimension can be understood as being determined by the "multiple" mentioned above. Different weight matrices can be used to extract different features of the image; for example, one weight matrix is used to extract image edge information, another to extract a specific color of the image, and yet another to blur unwanted noise in the image. The multiple weight matrices have the same size (rows x columns), so the feature maps extracted by them also have the same size; the extracted feature maps of the same size are then combined to form the output of the convolution operation.
In practical applications, the weight values in these weight matrices need to be obtained through extensive training. The weight matrices formed by the trained weight values can be used to extract information from the input image, enabling the convolutional neural network 200 to make correct predictions.
When the convolutional neural network 200 has multiple convolutional layers, the initial convolutional layers (e.g., 221) often extract more general features, which may also be called low-level features. As the depth of the convolutional neural network 200 increases, the features extracted by later convolutional layers (e.g., 226) become increasingly complex, such as high-level semantic features; features with higher-level semantics are more applicable to the problem to be solved.
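As an illustration only (not part of the embodiment), the depth-stacking behavior of multiple same-size kernels described above can be sketched in NumPy; the image size, kernel count, and kernel values here are arbitrary:

```python
import numpy as np

def conv2d_multi_kernel(image, kernels, stride=1):
    """Valid cross-correlation of one multi-channel image with several
    same-size kernels; each kernel spans the full input depth, and the
    per-kernel outputs are stacked to form the output depth dimension."""
    h, w, depth = image.shape
    num_k, kh, kw, kd = kernels.shape
    assert kd == depth  # kernel depth must match input depth
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.zeros((out_h, out_w, num_k))
    for k in range(num_k):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw, :]
                out[i, j, k] = np.sum(patch * kernels[k])
    return out

img = np.random.rand(8, 8, 3)          # 8x8 image, depth 3
kernels = np.random.rand(4, 3, 3, 3)   # 4 kernels of size 3x3, depth 3
feat = conv2d_multi_kernel(img, kernels)
print(feat.shape)                      # (6, 6, 4): depth = number of kernels
```

The output depth equals the number of kernels, matching the statement above that the stacked outputs form the depth dimension of the convolved image.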
Pooling layer:
Since it is often necessary to reduce the number of training parameters, pooling layers often need to be introduced periodically after convolutional layers. Among the layers 221-226 illustrated in the convolutional/pooling layers 220 in FIG. 2, one convolutional layer may be followed by one pooling layer, or multiple convolutional layers may be followed by one or more pooling layers. In image processing, the sole purpose of a pooling layer is to reduce the spatial size of the image. A pooling layer may include an average pooling operator and/or a max pooling operator for sampling the input image to obtain an image of smaller size. The average pooling operator computes the average of the pixel values within a specific range of the image as the result of average pooling. The max pooling operator takes the pixel with the largest value within a specific range as the result of max pooling. In addition, just as the size of the weight matrix in a convolutional layer should be related to the image size, the operators in a pooling layer should also be related to the size of the image. The size of the image output by a pooling layer may be smaller than the size of the image input to it; each pixel in the output image represents the average or maximum value of the corresponding sub-region of the input image.
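A minimal sketch of the two pooling operators just described, assuming non-overlapping windows (the 4x4 input values are invented for illustration):

```python
import numpy as np

def pool2d(image, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows; each output pixel
    is the max (or mean) of the corresponding sub-region of the input."""
    h, w = image.shape
    out = image[:h - h % size, :w - w % size]
    out = out.reshape(h // size, size, w // size, size)
    return out.max(axis=(1, 3)) if mode == "max" else out.mean(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 8., 3., 2.],
              [7., 6., 1., 0.]])
print(pool2d(x, 2, "max"))    # [[4. 8.] [9. 3.]]
print(pool2d(x, 2, "mean"))   # [[2.5 6.5] [7.5 1.5]]
```

Both operators halve the spatial size here, consistent with the statement that the pooled output is smaller than the input.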
Neural network layers 230:
After processing by the convolutional/pooling layers 220, the convolutional neural network 200 is not yet able to output the required output information. As described above, the convolutional/pooling layers 220 only extract features and reduce the parameters introduced by the input image. To generate the final output information (the required class information or other relevant information), however, the convolutional neural network 200 needs to use the neural network layers 230 to generate one output, or a group of outputs whose number equals the number of required classes. Therefore, the neural network layers 230 may include multiple hidden layers (231, 232 to 23n as shown in FIG. 2) and an output layer 240; the parameters contained in the multiple hidden layers may be obtained by pre-training on training data relevant to the specific task type.
After the multiple hidden layers in the neural network layers 230, the last layer of the entire convolutional neural network 200 is the output layer 240. The output layer 240 has a loss function similar to categorical cross-entropy, specifically used to compute the prediction error. Once the forward propagation of the entire convolutional neural network 200 (propagation in the direction from 210 to 240 in FIG. 2) is completed, backpropagation (propagation in the direction from 240 to 210 in FIG. 2) begins to update the weight values and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 200 and the error between the result output by the convolutional neural network 200 through the output layer and the ideal result.
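The forward/backward cycle described above can be illustrated with a toy example that is not the embodiment's model: a single softmax output layer trained with cross-entropy loss, where forward propagation computes the loss and backpropagation updates the weights and biases to reduce it. All data and hyperparameters here are invented:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                 # 8 samples, 4 features
y = rng.integers(0, 3, size=8)              # labels for 3 classes
W = rng.normal(scale=0.1, size=(4, 3))
b = np.zeros(3)

losses = []
for step in range(500):
    p = softmax(X @ W + b)                  # forward propagation
    losses.append(-np.log(p[np.arange(8), y]).mean())
    grad = p.copy()                         # cross-entropy gradient at output
    grad[np.arange(8), y] -= 1
    grad /= 8
    W -= 0.1 * (X.T @ grad)                 # backpropagation: update weights
    b -= 0.1 * grad.sum(axis=0)             # and biases to reduce the loss

print(losses[-1] < losses[0])               # True: training reduces the loss
```

Repeating the forward/backward cycle drives the loss down, which is exactly the role of the output layer's loss function and backpropagation described above.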
It should be noted that the convolutional neural network 200 shown in FIG. 2 is only one example of a convolutional neural network; in specific applications, the convolutional neural network may also exist in the form of other network models.
The following introduces a chip hardware structure provided by an embodiment of the present application.
FIG. 3 shows a chip hardware structure provided by an embodiment of the present invention; the chip includes a neural network processor 50. The chip may be provided in the execution device 110 shown in FIG. 1 to complete the computation work of the computing module 111. The chip may also be provided in the training device 120 shown in FIG. 1 to complete the training work of the training device 120 and output the target model 101. The algorithms of all layers in the convolutional neural network shown in FIG. 2 can be implemented in the chip shown in FIG. 3.
The neural network processor (NPU) 50 is mounted on the host CPU as a coprocessor, and tasks are assigned by the host CPU. The core part of the NPU is the operation circuit 503; the controller 504 controls the operation circuit 503 to fetch data from memory (the weight memory or the input memory) and perform operations.
In some implementations, the operation circuit 503 internally includes multiple processing engines (PEs). In some implementations, the operation circuit 503 is a two-dimensional systolic array. The operation circuit 503 may also be a one-dimensional systolic array or another electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the operation circuit 503 is a general-purpose matrix processor.
For example, suppose there are an input matrix A, a weight matrix B, and an output matrix C. The operation circuit fetches the data corresponding to matrix B from the weight memory 502 and caches it on each PE in the operation circuit. The operation circuit fetches the matrix A data from the input memory 501 and performs a matrix operation with matrix B; partial or final results of the resulting matrix are stored in the accumulator 508.
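As an illustrative sketch (not a model of the actual circuit), the accumulate-as-you-go matrix multiplication C = A x B described above can be expressed as follows, with a plain list of lists standing in for the accumulator 508:

```python
def matmul_accumulate(A, B):
    """C = A x B computed by streaming rank-1 updates: at each step t a
    column of A is combined with a row of B, and the partial products
    build up in an accumulator until the final result is complete."""
    n, k = len(A), len(B)
    m = len(B[0])
    acc = [[0.0] * m for _ in range(n)]   # accumulator for partial results
    for t in range(k):                    # stream one A-column / B-row pair
        for i in range(n):
            for j in range(m):
                acc[i][j] += A[i][t] * B[t][j]
    return acc

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_accumulate(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```

After the last streamed pair, the accumulator holds the final matrix, mirroring how partial results live in the accumulator 508 until complete.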
The vector computation unit 507 can further process the output of the operation circuit, such as vector multiplication, vector addition, exponential operations, logarithmic operations, and size comparison. For example, the vector computation unit 507 can be used for network computations of non-convolutional/non-FC layers in a neural network, such as pooling, batch normalization, and local response normalization.
In some implementations, the vector computation unit 507 can store the processed output vector to the unified memory 506. For example, the vector computation unit 507 may apply a nonlinear function to the output of the operation circuit 503, such as a vector of accumulated values, to generate activation values. In some implementations, the vector computation unit 507 generates normalized values, merged values, or both. In some implementations, the processed output vector can be used as an activation input to the operation circuit 503, for example for use in a subsequent layer of the neural network.
The unified memory 506 is used to store input data and output data.
A direct memory access controller (DMAC) 505 directly transfers input data in the external memory to the input memory 501 and/or the unified memory 506, stores weight data in the external memory into the weight memory 502, and stores data in the unified memory 506 into the external memory.
A bus interface unit (BIU) 510 is used to implement interaction between the host CPU, the DMAC, and the instruction fetch buffer 509 through a bus.
An instruction fetch buffer 509 connected to the controller 504 is used to store instructions used by the controller 504.
The controller 504 is used to call the instructions cached in the instruction fetch buffer 509 to control the working process of the operation accelerator.
Generally, the unified memory 506, the input memory 501, the weight memory 502, and the instruction fetch buffer 509 are all on-chip memories, while the external memory is memory outside the NPU. The external memory may be double data rate synchronous dynamic random access memory (DDR SDRAM), high bandwidth memory (HBM), or other readable and writable memory.
The operations of each layer in the convolutional neural network shown in FIG. 2 may be performed by the operation circuit 503 or the vector computation unit 507.
The training device 120 in FIG. 1 introduced above can execute the steps of the congestion prediction model training method in the embodiments of the present application, and the execution device 110 in FIG. 1 can execute the steps of the image processing method in the embodiments of the present application. The neural network model shown in FIG. 2 and the chip shown in FIG. 3 can also be used to execute the steps of the image processing method of the embodiments of the present application, and the chip shown in FIG. 3 can also be used to execute the steps of the congestion prediction model training method in the embodiments of the present application.
As shown in FIG. 4, FIG. 4 is a schematic structural diagram of a system architecture 300 provided by an embodiment of the present application. The system architecture includes a local device 301, a local device 302, an execution device 210, and a data storage system 250, where the local device 301 and the local device 302 are connected to the execution device 210 through a communication network.
The execution device 210 may be implemented by one or more servers. Optionally, the execution device 210 may be used in cooperation with other computing devices, such as data storage, routers, and load balancers. The execution device 210 may be arranged at one physical site or distributed across multiple physical sites. The execution device 210 may use the data in the data storage system 250, or call the program code in the data storage system 250, to implement the congestion prediction model training method or the image processing method of the embodiments of the present application.
Specifically, the execution device 210 may perform the following process:
Dividing a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers included in each of K semiconductor chips, N is an integer greater than 1, and K is a positive integer; determining M first feature maps corresponding to each of the prediction layers, where the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer; and adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and training a congestion prediction model using the data set.
A congestion prediction model can be trained through the above execution device 210. The congestion prediction model can be used for image processing, speech processing, natural language processing, and the like; for example, the congestion prediction model can be used to implement the congestion prediction method in the embodiments of the present application.
Alternatively, through the above process, the execution device 210 can be built into an image processing apparatus, which can be used for image processing (for example, for implementing the congestion prediction of semiconductor chips in the embodiments of the present application).
Users can operate their respective user devices (for example, the local device 301 and the local device 302) to interact with the execution device 210. Each local device can represent any of a variety of computing devices, such as personal computers, computer workstations, smartphones, and tablet computers.
Each user's local device can interact with the execution device 210 through a communication network of any communication mechanism/communication standard; the communication network may be a wide area network, a local area network, a point-to-point connection, or any combination thereof.
In one implementation, the local device 301 and the local device 302 obtain the relevant parameters of the congestion prediction model from the execution device 210, deploy the congestion prediction model on the local device 301 and the local device 302, and use the congestion prediction model to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In another implementation, the trained congestion prediction model can be directly deployed on the execution device 210; the execution device 210 obtains the feature data of the chip to be predicted from the local device 301 and the local device 302, and uses the trained congestion prediction model to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In one implementation, the local device 301 and the local device 302 obtain the relevant parameters of the image processing apparatus from the execution device 210, deploy the image processing apparatus on the local device 301 and the local device 302, and use the image processing apparatus to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In another implementation, the image processing apparatus can be directly deployed on the execution device 210; the execution device 210 obtains the feature data of the chip to be predicted from the local device 301 and the local device 302, and uses the image processing apparatus to perform congestion prediction on the chip to be predicted, obtaining the predicted congestion map of the chip to be predicted.
In other words, the above execution device 210 may also be a cloud device, in which case the execution device 210 may be deployed in the cloud; alternatively, the above execution device 210 may also be a terminal device, in which case the execution device 210 may be deployed on the user terminal side. The embodiments of the present application impose no limitation in this respect.
The congestion prediction model training method and the image processing method (for example, the congestion prediction method in EDA) of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Please refer to FIG. 5. FIG. 5 is a schematic flowchart of a congestion prediction model training method 500 provided by an embodiment of the present application. The method includes but is not limited to the following steps:
Step S510: Divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers included in each of K semiconductor chips, and K is a positive integer.
Specifically, the metal layers included in each semiconductor chip are divided into at least two prediction layers, where each prediction layer includes at least one metal layer. Please refer to FIG. 6, which is a schematic diagram of the hierarchical division of a semiconductor chip provided by an embodiment of the present application. As shown in FIG. 6, the semiconductor chip may include multiple metal layers (from top to bottom: metal layer 1-1 ... metal layer N-B), and the semiconductor chip can be divided into at least two prediction layers (from top to bottom: prediction layer 1 ... prediction layer N). Each prediction layer includes at least one metal layer; for example, prediction layer 1 may include metal layer 1-1 ... metal layer 1-A, and prediction layer N may include metal layer N-1 ... metal layer N-B, where A and B are positive integers and N is an integer greater than or equal to 2.
In a feasible implementation, the above division of the plurality of metal layers into at least two prediction layers includes: dividing the plurality of metal layers into at least two prediction layers according to the manufacturing process or the functional module distribution of the metal layers in each semiconductor chip.
Specifically, the prediction layers are partitioned according to the manufacturing process of the metal layers or according to whether the metal layers contain the same functional module. That is, on the same semiconductor chip, metal layers with similar manufacturing processes can be assigned to the same prediction layer, or metal layers containing the same functional module can be assigned to the same prediction layer. The manufacturing process of each metal layer can be characterized by its routing track capacity, which is specifically the number of routing tracks on the metal layer. The more advanced the manufacturing process of a metal layer, the larger its routing track capacity; for example, the routing track capacity of a metal layer under a 7 nm manufacturing process is larger than that of a metal layer under a 14 nm manufacturing process. Functional modules refer to certain hardware structures in a metal layer, such as a macro cell layer or registers.
For example, when the multiple metal layers in the same semiconductor chip are divided into at least two prediction layers according to differences in the routing track capacity of each metal layer, metal layers whose routing track capacities differ by no more than a preset threshold can be assigned to the same prediction layer; the preset threshold can be determined according to the specific application scenario. Suppose a semiconductor chip contains 6 metal layers with routing track capacities of 15, 15, 14, 10, 10, and 2 tracks respectively, and the preset threshold is 2; in this case, the 6 metal layers can be divided into 3 prediction layers. Specifically, the three metal layers with routing track capacities of 15, 15, and 14 are assigned to one prediction layer; the two metal layers with a routing track capacity of 10 are assigned to another prediction layer; and the metal layer with a routing track capacity of 2 is assigned to its own prediction layer.
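The grouping in the example above can be sketched as follows. Note that the exact grouping rule is an assumption for illustration (here, a layer joins the current group if its capacity is within the threshold of that group's first layer, scanning capacities in descending order); the embodiment does not prescribe a specific clustering procedure:

```python
def group_by_capacity(capacities, threshold):
    """Assign metal layers to prediction layers: scanning routing track
    capacities in descending order, a layer joins the current group when
    its capacity differs from the group's first (largest) capacity by no
    more than the threshold; otherwise it starts a new group."""
    groups = []
    for cap in sorted(capacities, reverse=True):
        if groups and groups[-1][0] - cap <= threshold:
            groups[-1].append(cap)
        else:
            groups.append([cap])
    return groups

print(group_by_capacity([15, 15, 14, 10, 10, 2], threshold=2))
# [[15, 15, 14], [10, 10], [2]]
```

With the capacities and threshold from the example, this reproduces the three prediction layers described above.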
For example, when the multiple metal layers in the same semiconductor chip are divided into at least two prediction layers according to the functional modules contained in each metal layer, the division may be based on whether the layers contain a macro cell layer or contain registers. For example, the metal layers in the same semiconductor chip that contain a macro cell layer can be assigned to one prediction layer and the metal layers that contain non-macro-cell layers to another prediction layer; or the metal layers in the same semiconductor chip that contain registers can be assigned to one prediction layer and the metal layers that do not contain registers to another prediction layer.
It can be seen that dividing the multiple metal layers into different prediction layers in the above manner makes the feature data of the metal layers within the same prediction layer highly consistent, while the feature data corresponding to different prediction layers differ substantially. After the congestion prediction model is trained with the feature data corresponding to each prediction layer, the resulting model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model has refined recognition and prediction capabilities.
Step S520: Determine M first feature maps corresponding to each prediction layer, where the M first feature maps are respectively used to describe M chip features of the prediction layer, and M is a positive integer.
Specifically, the M chip features used for congestion prediction may be determined according to the specific application scenario. Chip-related data is obtained, including the netlist, macro-module positions, transistor positions, transistor pin positions, routing resource amounts, and so on. Based on this chip-related data, the M first feature maps corresponding to the M chip features of each prediction layer are computed; each chip feature of a prediction layer corresponds to one first feature map that describes that chip feature.
In a feasible implementation, the M chip features may include one or more of pin density, network connection density, module mask, and routing resource amount.
It should be understood that the above chip features are merely specific examples listed in the embodiments of this application; those skilled in the art may also use other chip features to describe the prediction layers. In addition, after layering according to the above criteria, the metal layers within each prediction layer have similar chip features; for example, within the same prediction layer, the pin density of one metal layer is similar to that of another metal layer.
In a feasible implementation, determining the M first feature maps corresponding to each prediction layer includes: obtaining M second feature maps corresponding to each metal layer in the prediction layer, where the M second feature maps are respectively used to describe the M chip features of that metal layer; and generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to the prediction layer, where the first feature map describing any given chip feature is obtained from the second feature maps that describe that chip feature in the respective metal layers.
Specifically, each metal layer of a semiconductor chip corresponds to the above M chip features; that is, the chip features of each metal layer may include one or more of pin density, network connection density, module mask, and routing resource amount. Among these, pin density and module mask are non-directional, while routing resource amount and network connection density are directional: within the same metal layer, the routing resource amount is either horizontal or vertical, and likewise the network connection density is either horizontal or vertical.
The routing resource amount is specifically the number of routing tracks on each metal layer. Since the routing tracks on each metal layer are directional, running either horizontally or vertically, the routing resource amount is correspondingly directional. The network connection density refers to the number of wires per unit area; since the wires run along the routing tracks, the network connection density is also directional, that is, correspondingly horizontal or vertical. It should be understood that for a directional chip feature, the first feature map describing that feature is correspondingly directional; for example, when the routing resource amount is horizontal, the first feature map describing the routing resource amount is also horizontal.
Further, obtaining the M second feature maps corresponding to each metal layer in each prediction layer includes: performing feature extraction based on the wiring data of each metal layer in the prediction layer to obtain the M second feature maps corresponding to that metal layer. Generating the M first feature maps corresponding to each prediction layer based on the M second feature maps of each metal layer includes: for a given chip feature, obtaining the first feature map describing that feature from the second feature maps that describe the same feature on the respective metal layers; this first feature map is one of the M first feature maps corresponding to the prediction layer. Specifically, a weighted average, maximum, or minimum is taken over the corresponding pixels of the second feature maps that describe the same chip feature on the metal layers of the prediction layer, yielding the value of the corresponding pixel in the first feature map for that feature. Performing this operation for every pixel of the second feature maps yields the pixel values of all pixels of the first feature map, that is, the first feature map describing that chip feature.
It should be understood that, in addition to the above weighted average, maximum, or minimum, those skilled in the art may process the second feature maps describing the same chip feature on the metal layers in other ways to obtain the first feature map describing that feature; this application is not limited in this respect.
For example, when a prediction layer contains four metal layers and the M chip features include pin density, the first feature map corresponding to the pin density of the prediction layer is determined as follows: first, four second feature maps respectively describing the pin densities of the four metal layers are obtained based on the wiring data of each of the four metal layers; then the pixel values at the same position in the four second feature maps are combined by weighted averaging, taking the maximum, taking the minimum, or the like, to obtain the pixel value at that position in the first feature map describing the pin density of the prediction layer. Processing every pixel of the four second feature maps in this manner yields the first feature map corresponding to the pin density of the prediction layer.
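The per-pixel aggregation described above can be sketched with NumPy. This is an illustrative sketch under assumptions: the function name, the use of plain arrays as feature maps, and the uniform default weights are not from the patent.

```python
import numpy as np

def aggregate_feature_maps(second_maps, mode="mean", weights=None):
    """Combine the second feature maps of a prediction layer's metal
    layers (all the same H x W size) into one first feature map by a
    per-pixel weighted average, maximum, or minimum."""
    stack = np.stack(second_maps, axis=0)  # shape (num_layers, H, W)
    if mode == "mean":                     # weighted average per pixel
        w = np.ones(len(second_maps)) if weights is None else np.asarray(weights, float)
        return np.tensordot(w / w.sum(), stack, axes=1)
    if mode == "max":
        return stack.max(axis=0)
    if mode == "min":
        return stack.min(axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Pin-density example: four metal layers in one prediction layer
maps = [np.full((2, 2), v) for v in (1.0, 2.0, 3.0, 6.0)]
print(aggregate_feature_maps(maps, "mean"))  # every pixel is 3.0
print(aggregate_feature_maps(maps, "max"))   # every pixel is 6.0
```

The same helper would apply unchanged to the horizontal and vertical real congestion maps discussed later, since they are aggregated layer-to-prediction-layer in the same way.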
It should be noted that when a metal layer lacks a first chip feature among the M chip features, that is, when no real second feature map describing that feature exists for the metal layer, a preset operation may be used to obtain a second feature map describing the first chip feature of that metal layer. The preset operation may be: first determining the prediction layer in which the metal layer is located, and then determining a second feature map describing the first chip feature of the metal layer based on the second feature maps that describe the first chip feature of the other metal layers in that prediction layer; for example, a weighted average, maximum, minimum, or other processing may be applied to the corresponding pixels of those second feature maps to obtain the second feature map describing the first chip feature of the metal layer.
It can be seen that, in this embodiment of the present application, each metal layer corresponds to M second feature maps that respectively describe its M chip features. With the above layering, within each prediction layer the second feature maps of the metal layers for a given chip feature are strongly correlated and consistent, so the first feature map obtained by averaging, taking the maximum, or taking the minimum accurately characterizes that chip feature across the metal layers; that is, the first feature map describing a chip feature agrees well with the second feature maps describing that feature in each metal layer. This avoids the situation in which large differences among the metal layers of a prediction layer would make the resulting first feature map deviate substantially from the second feature maps of the individual metal layers. In summary, for a given chip feature, the above layering and first-feature-map construction yield a first feature map that accurately reflects that feature on the metal layers of each prediction layer. Therefore, after the congestion prediction model is trained with the first feature maps corresponding to the different prediction layers, the resulting model can effectively recognize feature data with different trends and make corresponding predictions based on the recognized features; that is, the model trained in this embodiment has refined recognition and prediction capabilities.
Step S530: Determine a data set based on the M first feature maps and the real congestion map corresponding to each prediction layer of each of the K semiconductor chips, and train a congestion prediction model with the data set, where the real congestion map corresponding to each prediction layer describes the real congestion degree of that prediction layer.
Here, the congestion degree is the difference between the routing resource demand and the routing resource amount. The routing resource amount is the number of routing tracks. The routing resource demand is the number of wires required to connect all nets of the netlist; since the required wires occupy routing tracks, the difference between the routing resource demand and the routing resource amount is the congestion degree. For example, if the routing resource demand is 10, ten wires are needed to connect all the nets; if the routing resource amount is 8, that is, there are eight routing tracks, then two wires must share tracks with other wires, and the congestion degree is 2.
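Per the definition above, the congestion value of each grid cell is simply demand minus capacity. A minimal sketch follows; the function name is an assumption, and negative values (spare capacity) are left as-is since the patent only defines the difference.

```python
import numpy as np

def congestion_map(demand, capacity):
    """Real congestion map: per-grid routing-track demand minus the
    available routing-track amount (positive values mean overflow)."""
    return np.asarray(demand, float) - np.asarray(capacity, float)

# The example from the text: demand 10 tracks, capacity 8 tracks -> congestion 2
print(congestion_map([[10]], [[8]]))  # [[2.]]
```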
In a feasible implementation, the method further includes: performing global routing on the K semiconductor chips, obtaining the real congestion map corresponding to each prediction layer from the K globally routed semiconductor chips, and adding the real congestion maps corresponding to the prediction layers of the K semiconductor chips to the data set.
Chip design can be divided into two stages: chip layout and global routing. The chip layout stage mainly determines, for each metal layer of the chip, the netlist, macro-module positions, transistor positions, transistor pin positions, routing resource amounts, and so on. The global routing stage mainly winds the metal wires onto the routing tracks corresponding to the routing resource amounts.
Specifically, obtaining the real congestion map corresponding to each prediction layer from the K globally routed semiconductor chips includes: after global routing is performed on a semiconductor chip, determining the number of wires on the chip, that is, the routing track demand; and then computing the real congestion map of each prediction layer from the routing resource demand and the routing resource amount.
The image processing method (which may also be called a congestion prediction method) in the embodiments of this application is mainly used in the chip layout stage: before global routing is performed, the congestion degree of the chip is predicted with this method so that the chip layout can be adjusted accordingly.
In a feasible implementation, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map. The first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in the prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in the prediction layer.
Specifically, the real congestion map corresponding to each metal layer in each semiconductor chip is obtained. It includes a second horizontal real congestion map, which describes the congestion degree of the metal layer in the horizontal direction, and a second vertical real congestion map, which describes the congestion degree of the metal layer in the vertical direction. The real congestion map corresponding to each metal layer is computed, after global routing, from the routing resource demand and the routing resource amount of that metal layer.
Optionally, the specific process of obtaining the first horizontal real congestion map of a prediction layer from the second horizontal real congestion maps of its metal layers corresponds to the process described above for determining the first feature maps of a prediction layer, that is, obtaining the first feature map describing a chip feature from the second feature maps corresponding to that feature on the metal layers; details are not repeated here. Similarly, the process of determining the first vertical real congestion map of a prediction layer is the same as that of the first horizontal real congestion map and is not repeated here.
It should be understood that the M first feature maps and the real congestion map corresponding to each prediction layer have the same size.
It can be seen that, in this embodiment of the present application, with the above layering, the real congestion maps of the metal layers within each prediction layer are strongly correlated and consistent, so the real congestion map of each prediction layer obtained in this embodiment agrees well with the real congestion maps of its metal layers. In other words, a real congestion map accurately reflecting the congestion degree of each prediction layer can be obtained, which in turn ensures the prediction accuracy of the congestion prediction model subsequently trained with the real congestion maps of the prediction layers.
In a feasible implementation, adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and training the congestion prediction model with the data set includes: iteratively training the congestion prediction model with the data set, where each training iteration includes: processing, by the congestion prediction model, the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
Specifically, each training iteration includes: determining a single-iteration training sample from the M first feature maps and the two real congestion maps corresponding to any prediction layer, where the single-iteration training sample contains M third feature maps and two target real congestion maps; inputting the M third feature maps into the congestion prediction model to obtain the predicted congestion map output by the model; determining the prediction error based on the predicted congestion map and the two target real congestion maps; and updating the model parameters of the congestion prediction model according to the prediction error by gradient descent or another back-propagation algorithm. Finally, whether the training process satisfies a preset condition is judged: if the preset condition is satisfied, the training process of the congestion prediction model ends; otherwise, the next training iteration begins. The preset condition may be that the number of training iterations is greater than or equal to a preset number, that the prediction error is less than or equal to a preset error, or another feasible condition; this application is not limited in this respect. The congestion prediction model may be a generative adversarial network, a variational autoencoder, a semantic segmentation network, or another model; this application is not limited in this respect either.
The process of determining the single-iteration training sample is as follows:
M third feature maps are cropped from the same arbitrary region of the M first feature maps corresponding to the prediction layer, and two target real congestion maps are cropped from that same region of the two real congestion maps corresponding to the prediction layer; the sizes of the M third feature maps and the two target real congestion maps are equal to a target size. The M third feature maps and the two target real congestion maps form the single-iteration training sample. The target size is the input image size allowed by the congestion prediction model and may be smaller than or equal to the size of the first feature maps.
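The same-region cropping above can be sketched as follows. The function name, the random choice of crop origin, and the use of NumPy arrays as maps are assumptions for illustration.

```python
import numpy as np

def crop_training_sample(first_maps, real_maps, target_h, target_w, rng=None):
    """Crop the same region from the M first feature maps and the two
    real congestion maps of a prediction layer, producing the M third
    feature maps and two target real congestion maps of one iteration."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = first_maps[0].shape
    top = int(rng.integers(0, h - target_h + 1))
    left = int(rng.integers(0, w - target_w + 1))
    crop = lambda m: m[top:top + target_h, left:left + target_w]
    return [crop(m) for m in first_maps], [crop(m) for m in real_maps]

first_maps = [np.arange(64.0).reshape(8, 8) for _ in range(3)]  # M = 3
real_maps = [np.zeros((8, 8)), np.ones((8, 8))]                 # horizontal + vertical
thirds, targets = crop_training_sample(first_maps, real_maps, 4, 4)
print(thirds[0].shape, targets[0].shape)  # (4, 4) (4, 4)
```

When the target size equals the first-feature-map size, the crop is the whole map, matching the special case noted below.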
Further, when the target size equals the size of the M first feature maps corresponding to the prediction layer, the M first feature maps themselves serve as the M third feature maps, and the two real congestion maps corresponding to the prediction layer serve as the two target real congestion maps; that is, the single-iteration training sample then consists of the M first feature maps and the two real congestion maps.
In a feasible implementation, the prediction layers contained in each semiconductor chip are a macro-cell layer and a non-macro-cell layer, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model. Adding the M first feature maps corresponding to each prediction layer of the K semiconductor chips to the data set and training the congestion prediction model with the data set includes: training the first congestion prediction model with the first feature maps corresponding to the macro-cell layers in the data set and the corresponding real congestion maps; and training the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layers in the data set and the corresponding real congestion maps.
Optionally, the multiple metal layers of each semiconductor chip are divided into two prediction layers: a macro-cell layer and a non-macro-cell layer. The first congestion prediction model and the second congestion prediction model have identical model structures; their initial model parameters may be the same or different.
Specifically, training the first congestion prediction model with the first feature maps corresponding to the macro-cell layers in the data set and the corresponding real congestion maps includes: determining a single-iteration training sample from the M first feature maps and real congestion maps corresponding to any macro-cell layer, and then training the first congestion prediction model with that sample. The determination of the single-iteration training sample for the first congestion prediction model follows the determination process described above for the congestion prediction model, and the specific training process of the first congestion prediction model is the same as the training process of the congestion prediction model in the above embodiment; details are not repeated here.
Similarly, the training process of the second congestion prediction model corresponds to that of the first congestion prediction model and is not repeated here.
Referring to FIG. 7, FIG. 7 is a schematic flowchart of an image processing method 700 provided by an embodiment of this application. The method includes but is not limited to the following steps:
Step S710: Determine M first feature maps corresponding to each prediction layer of a semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer.
Specifically, for the manner of determining the M first feature maps corresponding to each prediction layer, refer to the detailed description of the embodiment shown in FIG. 5; details are not repeated here.
Step S720: Process the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain a predicted congestion map corresponding to each prediction layer.
The congestion prediction model is obtained by training with a data set that includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips. The training data corresponding to each training prediction layer includes M first training feature maps, which respectively describe the M chip features of the training prediction layer, and a real congestion map, which describes the real congestion degree of the training prediction layer. Each training semiconductor chip includes at least two training prediction layers, each training prediction layer includes at least one metal layer, and K is a positive integer.
Optionally, when the size of the first feature maps is larger than the target size of the congestion prediction model's input image, a sliding window may be used to extract, from the M first feature maps, the M fourth feature maps used by the model for a single prediction. Each fourth feature map has the target size; as shown in FIG. 8, the length and width of the target size may be E and F respectively, where E and F are positive integers measured in pixels.
Specifically, the composition of the model input data used by the congestion prediction model for a single prediction is described below with reference to FIG. 8:
The first feature map shown in FIG. 8 is any one of the M first feature maps corresponding to a prediction layer. As shown in FIG. 8, the first feature map may contain D fourth feature maps, and the width of the overlap between any two adjacent fourth feature maps is G, where G is an integer greater than or equal to zero, measured in pixels. That is, each of the M first feature maps contains D fourth feature maps. The M fourth feature maps taken from the same region of the M first feature maps form the input data for a single prediction of the model. In summary, the M first feature maps corresponding to each prediction layer contain D groups of input data for congestion prediction, where each group of input data corresponds to a specific region of the first feature maps, and hence to a specific region of the prediction layer, and D is a positive integer.
Optionally, processing the M first feature maps corresponding to each prediction layer with the congestion prediction model to obtain the predicted congestion map corresponding to the prediction layer includes: inputting the D groups of input data corresponding to the prediction layer into one or more congestion prediction models in turn, obtaining the predicted congestion maps corresponding to the groups of input data, D groups of predicted congestion maps in total, where each group of predicted congestion maps includes one horizontal predicted congestion map and one vertical predicted congestion map; stitching the horizontal predicted congestion maps of the D groups to obtain the horizontal predicted congestion map corresponding to the prediction layer; and stitching the vertical predicted congestion maps of the D groups to obtain the vertical predicted congestion map corresponding to the prediction layer. It should be noted that when two predicted congestion maps are stitched, each pixel in their overlapping region corresponds to two pixel values, one from each map; the pixel value after stitching may be determined by, for example, taking a weighted average or the maximum of the two values.
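The tile-and-stitch procedure can be sketched as follows. This is a simplified sketch assuming square windows, uniform averaging in overlaps, and NumPy arrays; the function names are illustrative, not from the patent.

```python
import numpy as np

def tile_positions(length, win, overlap):
    """Window start offsets covering [0, length) with the given overlap G."""
    step = win - overlap
    starts = list(range(0, max(length - win, 0) + 1, step))
    if starts[-1] + win < length:  # make sure the last window reaches the edge
        starts.append(length - win)
    return starts

def stitch(windows, positions, full_shape, win):
    """Average-blend predicted windows back into one predicted congestion map."""
    acc = np.zeros(full_shape)
    cnt = np.zeros(full_shape)
    for pred, (top, left) in zip(windows, positions):
        acc[top:top + win, left:left + win] += pred
        cnt[top:top + win, left:left + win] += 1.0
    return acc / cnt               # overlapping pixels become the mean of the tiles

H = W = 6; win = 4; overlap = 2    # E = F = 4, G = 2 in the text's notation
positions = [(t, l) for t in tile_positions(H, win, overlap)
                    for l in tile_positions(W, win, overlap)]
windows = [np.ones((win, win)) for _ in positions]  # stand-ins for model outputs
out = stitch(windows, positions, (H, W), win)
print(out.shape, float(out.min()), float(out.max()))  # (6, 6) 1.0 1.0
```

With constant stand-in tiles the stitched map is constant, confirming the overlap blending introduces no seams; in practice `windows` would hold the model's horizontal (or vertical) predicted congestion tiles.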
Further, optionally, sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: sequentially inputting the D groups of input data into a single trained congestion prediction model to obtain the predicted congestion map corresponding to each group of input data; or inputting each of the D groups of input data into one of a plurality of congestion prediction models for parallelized prediction, obtaining the predicted congestion map corresponding to each group, where each of the plurality of congestion prediction models has the same model structure and parameters as the trained congestion prediction model.
It can be seen that, in the embodiments of the present application, the parallelized prediction approach can greatly reduce the time required for congestion prediction and improve efficiency.
Optionally, sequentially inputting the D groups of input data corresponding to each prediction layer into one or more congestion prediction models includes: when the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer, inputting the D groups of input data corresponding to the macro-unit layer into one or more first congestion prediction models to obtain the predicted congestion map corresponding to the macro-unit layer, and inputting the D groups of input data corresponding to the non-macro-unit layer into one or more second congestion prediction models to obtain the predicted congestion map corresponding to the non-macro-unit layer.
Specifically, for the training process of the above congestion prediction model, reference may be made to the embodiment described in FIG. 5, which is not repeated here.
In a feasible implementation, the above method further includes: aggregating the predicted congestion maps corresponding to all prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
Specifically, the vertical predicted congestion maps and horizontal predicted congestion maps corresponding to the prediction layers of the semiconductor chip to be predicted are aggregated to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
In a feasible implementation, the predicted congestion map corresponding to each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map, and aggregating the predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted includes: using a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; using the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and using a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or, using the directional aggregation operator to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer, and then using the hierarchical aggregation operator to aggregate the reference predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
Optionally, the above hierarchical aggregation operator may be an operation such as taking the mean or the maximum. Specifically, for any two predicted congestion maps being hierarchically aggregated, the corresponding pixels are combined by taking the mean or the maximum, yielding the pixel value of the corresponding pixel in the aggregated predicted congestion map. Similarly, the directional aggregation operator may also be an operation such as taking the mean or the maximum, which is not limited in this application. For the specific operation of the directional aggregation operator, reference may be made to the operation of the hierarchical aggregation operator, which is not repeated here.
In summary, the semiconductor chip to be predicted contains at least two prediction layers. The directional aggregation operator may first be used to aggregate the horizontal and vertical predicted congestion maps corresponding to each prediction layer into a reference predicted congestion map for that layer, after which the hierarchical aggregation operator aggregates the reference predicted congestion maps of all layers into the predicted congestion map corresponding to the semiconductor chip to be predicted. Alternatively, the hierarchical aggregation operator may first aggregate the horizontal predicted congestion maps of all prediction layers into a reference horizontal predicted congestion map and the vertical predicted congestion maps into a reference vertical predicted congestion map, after which the directional aggregation operator aggregates the two reference maps into the predicted congestion map corresponding to the semiconductor chip to be predicted. Other aggregation orders using the hierarchical aggregation operator and the directional aggregation operator may also be adopted to aggregate the predicted congestion maps corresponding to the prediction layers, which is not limited in this application.
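The two aggregation orders above can be sketched with a single element-wise operator, since the text allows both the hierarchical (across layers) and directional (horizontal/vertical) operators to be mean or maximum. The per-layer maps below are illustrative values, not data from the embodiment.

```python
import numpy as np

def aggregate(maps, op="max"):
    """Element-wise aggregation of same-shape congestion maps; serves as
    both the hierarchical and the directional aggregation operator."""
    stack = np.stack(maps)
    return stack.mean(axis=0) if op == "mean" else stack.max(axis=0)

# Illustrative horizontal/vertical maps for two prediction layers.
h_maps = [np.array([[0.1, 0.5]]), np.array([[0.3, 0.2]])]
v_maps = [np.array([[0.4, 0.1]]), np.array([[0.2, 0.6]])]

# Order 1: hierarchical first (across layers), then directional (H vs V).
chip_a = aggregate([aggregate(h_maps), aggregate(v_maps)])
# Order 2: directional first (per layer), then hierarchical.
chip_b = aggregate([aggregate([h, v]) for h, v in zip(h_maps, v_maps)])
```

With the maximum operator the two orders coincide, which is consistent with the text leaving the aggregation order open.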
In a feasible implementation, the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
In a feasible implementation, the training prediction layers contained in each training semiconductor chip are obtained by partitioning based on the manufacturing process or functional module distribution of the metal layers in each training semiconductor chip.
In a feasible implementation, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer, where, among the M first training feature maps, the first training feature map describing any given chip feature is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
In a feasible implementation, during each training iteration of the congestion prediction model, the congestion prediction model is updated using the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer, where the predicted congestion map corresponding to that training prediction layer is obtained by inputting its M first training feature maps into the congestion prediction model.
In a feasible implementation, the plurality of training prediction layers are divided into macro-unit layers and non-macro-unit layers, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro-unit layers in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro-unit layers in the data set and the corresponding real congestion maps.
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or the amount of routing resources.
Specifically, for the process of training the congestion prediction model with the training semiconductor chips, reference may be made to the detailed description of the embodiment shown in FIG. 5, which is not repeated here.
Please refer to FIG. 9, which is a schematic flowchart of congestion prediction provided by an embodiment of the present application. As shown in FIG. 9, the congestion prediction process for a semiconductor chip is as follows: the semiconductor chip to be predicted is divided into two prediction layers, namely a macro-unit layer and a non-macro-unit layer; the M first feature maps corresponding to the macro-unit layer and the non-macro-unit layer are determined respectively according to the methods of the foregoing embodiments; the first congestion prediction model processes the M first feature maps corresponding to the macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the macro-unit layer, and the second congestion prediction model processes the M first feature maps corresponding to the non-macro-unit layer to obtain the horizontal predicted congestion map and the vertical predicted congestion map corresponding to the non-macro-unit layer; finally, the hierarchical aggregation operator and the directional aggregation operator aggregate the horizontal and vertical predicted congestion maps of the macro-unit layer together with the horizontal and vertical predicted congestion maps of the non-macro-unit layer to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
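The FIG. 9 flow can be summarized as a short sketch: two per-layer predictors, directional aggregation of each layer's horizontal/vertical maps, then hierarchical aggregation across the two layers. The stub models and the choice of maximum as both operators are assumptions for illustration only.

```python
import numpy as np

def predict_chip_congestion(macro_feats, nonmacro_feats,
                            model_macro, model_nonmacro):
    """End-to-end flow of FIG. 9: per-layer prediction, then directional
    (H/V) and hierarchical (across layers) aggregation, here via max."""
    h_macro, v_macro = model_macro(macro_feats)      # macro-unit layer
    h_non, v_non = model_nonmacro(nonmacro_feats)    # non-macro-unit layer
    per_layer = [np.maximum(h_macro, v_macro),       # directional aggregation
                 np.maximum(h_non, v_non)]
    return np.maximum.reduce(per_layer)              # hierarchical aggregation

# Stub predictors standing in for the trained first/second models.
stub_macro = lambda f: (np.full((2, 2), 0.3), np.full((2, 2), 0.5))
stub_non = lambda f: (np.full((2, 2), 0.7), np.full((2, 2), 0.2))
chip_map = predict_chip_congestion(None, None, stub_macro, stub_non)
```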
Specifically, for the congestion prediction process of the semiconductor chip to be predicted, reference may be made to the corresponding processes in the embodiments of FIG. 5 and FIG. 7, which are not repeated here.
The methods of the embodiments of the present application have been described in detail above; the apparatuses of the embodiments of the present application are provided below.
Please refer to FIG. 10, which is a schematic structural diagram of a model training apparatus 1000 provided by an embodiment of the present application. The apparatus 1000 may include a layering unit 1010, a determining unit 1020, and a training unit 1030, each of which is described in detail as follows.
The layering unit 1010 is configured to divide a plurality of metal layers into at least two prediction layers, where the plurality of metal layers are the metal layers contained in each of K semiconductor chips and K is a positive integer. The determining unit 1020 is configured to determine the M first feature maps corresponding to each prediction layer, where the M first feature maps respectively describe M chip features of each prediction layer and M is a positive integer. The training unit 1030 is configured to add the M first feature maps corresponding to each prediction layer of the K semiconductor chips to a data set, and to train a congestion prediction model with the data set.
In a feasible implementation, the training unit is further configured to: obtain the real congestion map corresponding to each prediction layer based on the K semiconductor chips after global routing; and add the real congestion map corresponding to each prediction layer of the K semiconductor chips to the data set.
In a feasible implementation, the layering unit is specifically configured to divide the plurality of metal layers into at least two prediction layers according to the manufacturing process or functional module distribution of the metal layers in each semiconductor chip.
In a feasible implementation, the determining unit is specifically configured to: obtain the M second feature maps corresponding to each metal layer in each prediction layer, where the M second feature maps respectively describe the M chip features of each metal layer; and generate the M first feature maps corresponding to each prediction layer based on the M second feature maps of each metal layer, where, among the M first feature maps, the first feature map describing any given chip feature is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation, the real congestion map corresponding to each prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each prediction layer.
In a feasible implementation, in terms of training the congestion prediction model with the data set, the training unit is specifically configured to iteratively train the congestion prediction model with the data set, where each training iteration includes: processing the M first feature maps corresponding to any prediction layer in the data set with the congestion prediction model to obtain the predicted congestion map corresponding to that prediction layer; and updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to that prediction layer.
In a feasible implementation, the prediction layers contained in each semiconductor chip are a macro-unit layer and a non-macro-unit layer; in terms of training the congestion prediction model with the data set, the training unit is specifically configured to: train the first congestion prediction model with the first feature maps corresponding to the macro-unit layer in the data set and the corresponding real congestion maps; and train the second congestion prediction model with the first feature maps corresponding to the non-macro-unit layer in the data set and the corresponding real congestion maps.
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or the amount of routing resources.
It should be noted that the implementation of each unit may also refer to the corresponding descriptions of the method embodiments shown in FIG. 5 and FIG. 7.
Please refer to FIG. 11, which is a schematic structural diagram of an image processing apparatus 1100 provided by an embodiment of the present application. The apparatus 1100 includes a determining unit 1110 and a processing unit 1120.
The determining unit 1110 is configured to determine the M first feature maps corresponding to each prediction layer in the semiconductor chip to be predicted, where the semiconductor chip to be predicted includes at least two prediction layers and M is a positive integer. The processing unit 1120 is configured to process the M first feature maps corresponding to each prediction layer with a congestion prediction model to obtain the predicted congestion map corresponding to each prediction layer, where the congestion prediction model is obtained through training with a data set; the data set includes training data corresponding to the training prediction layers contained in each of K training semiconductor chips; the training data corresponding to each training prediction layer includes M first training feature maps and a real congestion map; the M first training feature maps respectively describe M chip features of each training prediction layer, and the real congestion map describes the real congestion level of each training prediction layer; each training semiconductor chip includes at least two training prediction layers, and each training prediction layer includes at least one metal layer.
In a feasible implementation, the apparatus further includes an aggregation unit configured to aggregate the predicted congestion maps corresponding to all prediction layers in the semiconductor chip to be predicted to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
In a feasible implementation, the predicted congestion map corresponding to each prediction layer includes a vertical predicted congestion map and a horizontal predicted congestion map, and the aggregation unit is specifically configured to: use a hierarchical aggregation operator to aggregate the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; use the hierarchical aggregation operator to aggregate the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and use a directional aggregation operator to aggregate the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted; or use the directional aggregation operator to aggregate the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer, and use the hierarchical aggregation operator to aggregate the reference predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
In a feasible implementation, the training prediction layers contained in each training semiconductor chip are obtained by partitioning based on the manufacturing process or functional module distribution of the metal layers in each training semiconductor chip.
In a feasible implementation, the M first training feature maps corresponding to each training prediction layer are obtained from the M second feature maps corresponding to each metal layer in that training prediction layer, where, among the M first training feature maps, the first training feature map describing any given chip feature is obtained based on the second feature maps describing that chip feature in each metal layer.
In a feasible implementation, the real congestion map corresponding to each training prediction layer includes a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer includes a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion maps corresponding to the metal layers in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion maps corresponding to the metal layers in each training prediction layer.
In a feasible implementation, during each training iteration of the congestion prediction model, the congestion prediction model is updated using the predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to that training prediction layer, where the predicted congestion map corresponding to that training prediction layer is obtained by inputting its M first training feature maps into the congestion prediction model.
In a feasible implementation, the plurality of training prediction layers are divided into macro-unit layers and non-macro-unit layers, and the congestion prediction model includes a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro-unit layers in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro-unit layers in the data set and the corresponding real congestion maps.
In a feasible implementation, the M chip features include one or more of pin density, net connection density, module mask, or the amount of routing resources.
Specifically, the image processing apparatus 1100 may be used to perform the corresponding steps of the image processing method 700 described in FIG. 7, which are not repeated here.
Please refer to FIG. 12, which is a schematic diagram of the hardware structure of a model training apparatus 1200 provided by an embodiment of the present application. The model training apparatus 1200 shown in FIG. 12 (which may specifically be a computer device) includes a memory 1201, a processor 1202, a communication interface 1203, and a bus 1204, where the memory 1201, the processor 1202, and the communication interface 1203 are communicatively connected to each other through the bus 1204.
The memory 1201 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1201 may store a program; when the program stored in the memory 1201 is executed by the processor 1202, the processor 1202 and the communication interface 1203 are used to perform the steps of the congestion prediction model training method of the embodiments of the present application.
The processor 1202 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, used to execute related programs so as to implement the functions required by the units in the congestion prediction model training apparatus of the embodiments of the present application, or to perform the congestion prediction model training method of the method embodiments of the present application.
The processor 1202 may also be an integrated circuit chip with signal processing capability. In an implementation, the steps of the congestion prediction model training method of the present application may be completed by integrated logic circuits in hardware or by instructions in the form of software in the processor 1202. The processor 1202 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed in connection with the embodiments of the present application may be directly embodied as being performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1201; the processor 1202 reads the information in the memory 1201 and, in combination with its hardware, completes the functions required by the units included in the congestion prediction model training apparatus of the embodiments of the present application, or performs the congestion prediction model training method of the method embodiments of the present application.
The communication interface 1203 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the apparatus 1200 and other devices or communication networks. For example, training data may be obtained through the communication interface 1203.
The bus 1204 may include a path for transferring information between the components of the apparatus 1200 (for example, the memory 1201, the processor 1202, and the communication interface 1203).
Referring to FIG. 13, FIG. 13 is a schematic diagram of the hardware structure of an image processing apparatus 1300 provided by an embodiment of the present application. The image processing apparatus 1300 may be a computer, a mobile phone, a tablet computer, or another possible terminal device, which is not limited in the present application. The image processing apparatus 1300 shown in FIG. 13 (the apparatus 1300 may specifically be a computer device) includes a memory 1301, a processor 1302, a communication interface 1303, and a bus 1304, where the memory 1301, the processor 1302, and the communication interface 1303 are communicatively connected to one another through the bus 1304.
The memory 1301 may be a read-only memory (ROM), a static storage device, a dynamic storage device, or a random access memory (RAM). The memory 1301 may store a program; when the program stored in the memory 1301 is executed by the processor 1302, the processor 1302 and the communication interface 1303 are configured to perform the steps of the image processing method of the embodiments of the present application.
The processor 1302 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits, and is configured to execute related programs so as to implement the functions to be performed by the units in the image processing apparatus of the embodiments of the present application, or to execute the image processing method of the method embodiments of the present application.
The processor 1302 may also be an integrated circuit chip with signal processing capabilities. In an implementation process, each step of the image processing method of the present application may be completed by an integrated logic circuit of hardware in the processor 1302 or by instructions in the form of software. The processor 1302 may further be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the methods disclosed with reference to the embodiments of the present application may be directly performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 1301; the processor 1302 reads the information in the memory 1301 and, in combination with its hardware, completes the functions to be performed by the units included in the image processing apparatus of the embodiments of the present application, or executes the image processing method of the method embodiments of the present application.
The communication interface 1303 uses a transceiver apparatus, such as but not limited to a transceiver, to implement communication between the apparatus 1300 and other devices or communication networks. For example, training data may be obtained through the communication interface 1303.
The bus 1304 may include a path for transferring information between the components of the apparatus 1300 (for example, the memory 1301, the processor 1302, and the communication interface 1303).
It should be noted that although the apparatus 1200 and the apparatus 1300 shown in FIG. 12 and FIG. 13 show only a memory, a processor, and a communication interface, in a specific implementation process, those skilled in the art should understand that the apparatus 1200 and the apparatus 1300 further include other components necessary for normal operation. Meanwhile, according to specific needs, those skilled in the art should understand that the apparatus 1200 and the apparatus 1300 may further include hardware components implementing other additional functions. In addition, those skilled in the art should understand that the apparatus 1200 and the apparatus 1300 may alternatively include only the components necessary for implementing the embodiments of the present application, rather than all the components shown in FIG. 12 or FIG. 13.
It can be understood that the apparatus 1200 corresponds to the training device 120 in FIG. 1, and the apparatus 1300 corresponds to the execution device 110 in FIG. 1. Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as going beyond the scope of the present application.
An embodiment of the present application further provides a chip system. The chip system includes at least one processor, a memory, and an interface circuit, where the memory, the interface circuit, and the at least one processor are interconnected through lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the methods described in FIG. 5 and/or FIG. 7 are implemented.
An embodiment of the present application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a network device, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
An embodiment of the present application further provides a computer program product; when the computer program product runs on a terminal, the method flow shown in FIG. 5 and/or FIG. 7 is implemented.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, apparatuses, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the above functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application essentially, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above descriptions are merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (33)

  1. A method for training a congestion prediction model, characterized in that the method comprises:
    dividing a plurality of metal layers into at least two prediction layers, wherein the plurality of metal layers are the metal layers comprised in each of K semiconductor chips, and K is a positive integer;
    determining M first feature maps corresponding to each of the prediction layers, wherein the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer;
    adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and training a congestion prediction model by using the data set.
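Purely as a hedged illustration of the flow recited in claim 1 (grouping metal layers into prediction layers, computing per-layer feature maps, assembling a data set), the following Python sketch may help; the grouping rule, function names, grid size, and zero-valued feature maps are invented placeholders, not the claimed implementation:

```python
import numpy as np

def split_into_prediction_layers(metal_layers, groups):
    # groups is a hypothetical grouping rule, e.g. [[0, 1], [2, 3]] yields two
    # prediction layers; the claim leaves the division criterion open (claim 3
    # mentions manufacturing process or functional module distribution).
    return [[metal_layers[i] for i in idx] for idx in groups]

def feature_maps_for_layer(prediction_layer, m_features, grid=(4, 4)):
    # One grid-shaped map per chip feature (pin density, connection density, ...);
    # zero-valued placeholders stand in for really extracted features.
    return np.stack([np.zeros(grid) for _ in range(m_features)])

def build_dataset(chips, groups, m_features=3):
    dataset = []
    for metal_layers in chips:  # K chips, each a list of metal layers
        for pred_layer in split_into_prediction_layers(metal_layers, groups):
            dataset.append(feature_maps_for_layer(pred_layer, m_features))
    return dataset

chips = [["M1", "M2", "M3", "M4"]] * 2               # K = 2 toy chips
data = build_dataset(chips, groups=[[0, 1], [2, 3]])
print(len(data), data[0].shape)                      # 4 (3, 4, 4)
```

With K = 2 chips and two prediction layers per chip, the data set holds four entries of M = 3 feature maps each.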
  2. The method according to claim 1, characterized in that the method further comprises:
    performing global routing on the K semiconductor chips, and obtaining, according to the K semiconductor chips after global routing, a real congestion map corresponding to each of the prediction layers;
    adding the real congestion map corresponding to each of the prediction layers in the K semiconductor chips to the data set.
  3. The method according to claim 1 or 2, characterized in that the dividing the plurality of metal layers into at least two prediction layers comprises:
    dividing the plurality of metal layers into at least two prediction layers according to a manufacturing process or a functional module distribution of the metal layers in each of the semiconductor chips.
  4. The method according to any one of claims 1 to 3, characterized in that the determining M first feature maps corresponding to each of the prediction layers comprises:
    obtaining M second feature maps corresponding to each metal layer in each of the prediction layers, wherein the M second feature maps are respectively used to describe the M chip features of each metal layer;
    generating, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each of the prediction layers, wherein a first feature map that describes any chip feature among the M first feature maps is obtained based on the second feature maps that describe said chip feature in each metal layer.
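Claim 4 only requires that the first feature map of a given feature be derived from the per-metal-layer second feature maps of that same feature; as an illustrative sketch, element-wise summation across metal layers is one assumed (not claimed) choice of derivation:

```python
import numpy as np

def first_from_second(second_maps):
    # second_maps: shape (L, M, H, W) -- M second feature maps for each of the
    # L metal layers in one prediction layer.  Summation over the metal-layer
    # axis yields one first feature map per chip feature (an assumed operator).
    return second_maps.sum(axis=0)  # shape (M, H, W)

second = np.ones((3, 2, 4, 4))       # L=3 metal layers, M=2 features, 4x4 grid
first = first_from_second(second)
print(first.shape, first[0, 0, 0])   # (2, 4, 4) 3.0
```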
  5. The method according to any one of claims 2 to 4, characterized in that the real congestion map corresponding to each of the prediction layers comprises a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers comprises a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion map corresponding to each metal layer in each of the prediction layers, and the first vertical real congestion map is obtained based on the second vertical real congestion map corresponding to each metal layer in each of the prediction layers.
  6. The method according to any one of claims 2 to 5, characterized in that the adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set and training a congestion prediction model by using the data set comprises:
    performing iterative training on the congestion prediction model by using the data set, wherein each iteration of training comprises:
    processing, by using the congestion prediction model, the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to said prediction layer;
    updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to said prediction layer.
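The iterate-predict-update cycle of claim 6 can be sketched as follows; the linear per-feature weighting, learning rate, and mean-squared-error update are assumptions for illustration only, since the claim does not fix the model family or loss:

```python
import numpy as np

def train_step(weights, features, true_map, lr=0.1):
    # Predict a congestion map as a weighted sum of the M first feature maps,
    # then update the weights from the prediction error (mean-squared-error
    # gradient, up to a constant factor absorbed into the learning rate).
    pred = np.tensordot(weights, features, axes=1)            # (H, W)
    err = pred - true_map
    grad = np.tensordot(features, err, axes=([1, 2], [0, 1])) / err.size
    return weights - lr * grad, float((err ** 2).mean())

M, H, W = 3, 4, 4
rng = np.random.default_rng(0)
features = rng.random((M, H, W))      # M first feature maps of one prediction layer
true_map = features.sum(axis=0)       # fabricated "real congestion map" target
w, losses = np.zeros(M), []
for _ in range(200):                  # each pass is one iteration of claim 6
    w, loss = train_step(w, features, true_map)
    losses.append(loss)
print(losses[0] > losses[-1])         # True: the loss shrinks over iterations
```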
  7. The method according to any one of claims 2 to 5, characterized in that the prediction layers comprised in each semiconductor chip are respectively a macro cell layer and a non-macro cell layer, and the congestion prediction model comprises a first congestion prediction model and a second congestion prediction model; the adding the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set and training a congestion prediction model by using the data set comprises:
    training the first congestion prediction model by using the first feature maps corresponding to the macro cell layers in the data set and the corresponding real congestion maps;
    training the second congestion prediction model by using the first feature maps corresponding to the non-macro cell layers in the data set and the corresponding real congestion maps.
  8. The method according to any one of claims 1 to 7, characterized in that the M chip features comprise one or more of pin density, network connection density, module mask, or routing resource amount.
  9. An image processing method, characterized in that the method comprises:
    determining M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted, wherein the semiconductor chip to be predicted comprises at least two prediction layers, and M is a positive integer;
    processing, by using a congestion prediction model, the M first feature maps corresponding to each of the prediction layers to obtain a predicted congestion map corresponding to each of the prediction layers;
    wherein the congestion prediction model is obtained after training with a data set, the data set comprises training data respectively corresponding to each training prediction layer in K training semiconductor chips, the training data corresponding to each training prediction layer comprises M first training feature maps and a real congestion map, the M first training feature maps are respectively used to describe M chip features of each training prediction layer, the real congestion map is used to describe the real congestion level of each training prediction layer, each training semiconductor chip comprises at least two training prediction layers, and each training prediction layer comprises at least one metal layer.
  10. The method according to claim 9, characterized in that the method further comprises:
    aggregating the predicted congestion maps corresponding to all the prediction layers in the semiconductor chip to be predicted to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
  11. The method according to claim 10, characterized in that the predicted congestion map corresponding to each of the prediction layers comprises a vertical predicted congestion map and a horizontal predicted congestion map, and the aggregating the predicted congestion maps corresponding to each of the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted comprises:
    aggregating, by using a hierarchical aggregation operator, the vertical predicted congestion maps corresponding to each of the prediction layers to obtain a reference vertical predicted congestion map; aggregating, by using the hierarchical aggregation operator, the horizontal predicted congestion maps corresponding to each of the prediction layers to obtain a reference horizontal predicted congestion map; and aggregating, by using a directional aggregation operator, the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted;
    or
    aggregating, by using the directional aggregation operator, the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each of the prediction layers to obtain a reference predicted congestion map corresponding to each of the prediction layers; and aggregating, by using the hierarchical aggregation operator, the reference predicted congestion maps corresponding to each of the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
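Both aggregation orderings of claim 11 can be sketched side by side; the element-wise maximum used here for both the hierarchical and the directional operator is an assumed placeholder (the claim does not specify the operators), and with this particular choice the two orderings happen to coincide:

```python
import numpy as np

def hierarchical_agg(maps):
    # Aggregate across prediction layers (element-wise maximum, an assumed operator).
    return np.max(np.stack(maps), axis=0)

def directional_agg(vertical, horizontal):
    # Merge vertical and horizontal maps (element-wise maximum, an assumed operator).
    return np.maximum(vertical, horizontal)

v_maps = [np.array([[0.2, 0.8]]), np.array([[0.5, 0.1]])]  # per-layer vertical maps
h_maps = [np.array([[0.9, 0.3]]), np.array([[0.4, 0.6]])]  # per-layer horizontal maps

# Ordering 1: aggregate layers first, then directions.
chip_a = directional_agg(hierarchical_agg(v_maps), hierarchical_agg(h_maps))
# Ordering 2: aggregate directions first, then layers.
chip_b = hierarchical_agg([directional_agg(v, h) for v, h in zip(v_maps, h_maps)])
print(np.array_equal(chip_a, chip_b))  # True here, since max commutes with max
```

With other operator choices (e.g. summation for one and maximum for the other) the two orderings generally produce different chip-level maps, which is why the claim recites them as alternatives.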
  12. The method according to any one of claims 9 to 11, characterized in that the real congestion map corresponding to each training prediction layer is obtained based on the K training semiconductor chips after global routing.
  13. The method according to any one of claims 9 to 12, characterized in that the training prediction layers comprised in each training semiconductor chip are obtained by division based on a manufacturing process or a functional module distribution of the metal layers in each training semiconductor chip.
  14. The method according to any one of claims 9 to 13, characterized in that the M first training feature maps corresponding to each training prediction layer are obtained according to M second feature maps corresponding to each metal layer in each training prediction layer, wherein a first training feature map that describes any chip feature among the M first training feature maps is obtained based on the second feature maps that describe said chip feature in each metal layer.
  15. The method according to any one of claims 9 to 14, characterized in that the real congestion map corresponding to each training prediction layer comprises a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each training prediction layer comprises a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion map corresponding to each metal layer in each training prediction layer, and the first vertical real congestion map is obtained based on the second vertical real congestion map corresponding to each metal layer in each training prediction layer.
  16. The method according to any one of claims 9 to 15, characterized in that, in each iteration of training of the congestion prediction model, the congestion prediction model is updated by using a predicted congestion map corresponding to any training prediction layer in the data set and the real congestion map corresponding to said training prediction layer, wherein the predicted congestion map corresponding to said training prediction layer is obtained by inputting the M first training feature maps corresponding to said training prediction layer into the congestion prediction model.
  17. The method according to any one of claims 9 to 16, characterized in that the training prediction layers are divided into macro cell layers and non-macro cell layers, and the congestion prediction model comprises a first congestion prediction model and a second congestion prediction model; the first congestion prediction model is obtained by training with the first training feature maps corresponding to the macro cell layers in the data set and the corresponding real congestion maps, and the second congestion prediction model is obtained by training with the first training feature maps corresponding to the non-macro cell layers in the data set and the corresponding real congestion maps.
  18. The method according to any one of claims 9 to 17, characterized in that the M chip features comprise one or more of pin density, network connection density, module mask, or routing resource amount.
  19. An apparatus for training a congestion prediction model, characterized in that the apparatus comprises:
    a layering unit, configured to divide a plurality of metal layers into at least two prediction layers, wherein the plurality of metal layers are the metal layers comprised in each of K semiconductor chips, and K is a positive integer;
    a determining unit, configured to determine M first feature maps corresponding to each of the prediction layers, wherein the M first feature maps are respectively used to describe M chip features of each of the prediction layers, and M is a positive integer;
    a training unit, configured to add the M first feature maps corresponding to each of the prediction layers in the K semiconductor chips to a data set, and train a congestion prediction model by using the data set.
  20. The apparatus according to claim 19, characterized in that the training unit is further configured to:
    obtain, based on the K semiconductor chips after global routing, a real congestion map corresponding to each of the prediction layers;
    add the real congestion map corresponding to each of the prediction layers in the K semiconductor chips to the data set.
  21. The apparatus according to claim 19 or 20, characterized in that the layering unit is specifically configured to:
    divide the plurality of metal layers into at least two prediction layers according to a manufacturing process or a functional module distribution of the metal layers in each of the semiconductor chips.
  22. The apparatus according to any one of claims 19 to 21, characterized in that the determining unit is specifically configured to:
    obtain M second feature maps corresponding to each metal layer in each of the prediction layers, wherein the M second feature maps are respectively used to describe the M chip features of each metal layer;
    generate, based on the M second feature maps of each metal layer, the M first feature maps corresponding to each of the prediction layers, wherein a first feature map that describes any chip feature among the M first feature maps is obtained based on the second feature maps that describe said chip feature in each metal layer.
  23. The apparatus according to any one of claims 20 to 22, characterized in that the real congestion map corresponding to each of the prediction layers comprises a first horizontal real congestion map and a first vertical real congestion map, and the real congestion map corresponding to each metal layer in each of the prediction layers comprises a second horizontal real congestion map and a second vertical real congestion map; the first horizontal real congestion map is obtained based on the second horizontal real congestion map corresponding to each metal layer in each of the prediction layers, and the first vertical real congestion map is obtained based on the second vertical real congestion map corresponding to each metal layer in each of the prediction layers.
  24. The apparatus according to any one of claims 20 to 23, characterized in that, in the aspect of training the congestion prediction model by using the data set, the training unit is specifically configured to:
    perform iterative training on the congestion prediction model by using the data set, wherein each iteration of training comprises:
    processing, by using the congestion prediction model, the M first feature maps corresponding to any prediction layer in the data set to obtain a predicted congestion map corresponding to said prediction layer;
    updating the congestion prediction model based on the predicted congestion map and the real congestion map corresponding to said prediction layer.
  25. The apparatus according to any one of claims 20-23, wherein the prediction layers comprised in each semiconductor chip are a macro-cell layer and a non-macro-cell layer, respectively; in the aspect of training the congestion prediction model with the data set, the training unit is specifically configured to:
    train the first congestion prediction model with the first feature maps corresponding to the macro-cell layer in the data set and the corresponding real congestion map; and
    train the second congestion prediction model with the first feature maps corresponding to the non-macro-cell layer in the data set and the corresponding real congestion map.
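The split training of claim 25 amounts to routing each sample to the model matching its layer type. A sketch; the `is_macro` sample key and the callback shape are hypothetical names, not taken from the claims:

```python
def train_per_layer_type(dataset, macro_step, non_macro_step):
    """Dispatch each training sample to the model for its layer type:
    macro-cell layers feed the first congestion prediction model,
    non-macro-cell layers feed the second one."""
    for sample in dataset:
        step = macro_step if sample["is_macro"] else non_macro_step
        step(sample["feature_maps"], sample["real_congestion"])

# Demo: count how many samples reach each (stubbed-out) model.
calls = {"macro": 0, "non_macro": 0}

def counting_step(kind):
    def step(feature_maps, real_congestion):
        calls[kind] += 1   # a real step would run one training iteration
    return step

dataset = [
    {"is_macro": True,  "feature_maps": None, "real_congestion": None},
    {"is_macro": False, "feature_maps": None, "real_congestion": None},
    {"is_macro": False, "feature_maps": None, "real_congestion": None},
]
train_per_layer_type(dataset, counting_step("macro"), counting_step("non_macro"))
print(calls)   # {'macro': 1, 'non_macro': 2}
```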
  26. The apparatus according to any one of claims 19-25, wherein the M chip features comprise one or more of pin density, net connection density, module mask, or routing resource amount.
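Of the listed chip features, pin density is the most self-contained to illustrate: rasterize the pin coordinates onto the grid and count pins per cell. A sketch; the normalisation to [0, 1] is an assumption, not specified by the claims:

```python
import numpy as np

def pin_density_map(pin_xy, chip_w, chip_h, grid=(4, 4)):
    """Rasterize pin coordinates into a per-grid pin-density feature map
    (one of the M chip features). pin_xy: (N, 2) array of pin positions."""
    pin_xy = np.asarray(pin_xy, dtype=float)
    hist, _, _ = np.histogram2d(
        pin_xy[:, 0], pin_xy[:, 1],
        bins=list(grid), range=[[0, chip_w], [0, chip_h]])
    return hist / max(hist.max(), 1.0)   # normalise to [0, 1] (assumed)

# Three pins on a 4x4 chip rasterized into a 4x4 grid:
pins = [(0.5, 0.5), (0.6, 0.4), (3.5, 3.5)]
dm = pin_density_map(pins, 4.0, 4.0, grid=(4, 4))
print(dm[0, 0], dm[3, 3])   # 1.0 0.5
```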
  27. An image processing apparatus, wherein the apparatus comprises:
    a determining unit, configured to determine M first feature maps corresponding to each prediction layer in a semiconductor chip to be predicted; wherein the semiconductor chip to be predicted comprises at least two prediction layers, and M is a positive integer; and
    a processing unit, configured to process, with a congestion prediction model, the M first feature maps corresponding to each prediction layer to obtain a predicted congestion map corresponding to each prediction layer;
    wherein the congestion prediction model is obtained through training on a data set, the data set comprises training data respectively corresponding to the training prediction layers comprised in each of K training semiconductor chips, the training data corresponding to each training prediction layer comprises M first training feature maps and a real congestion map, the M first training feature maps respectively describe M chip features of each training prediction layer, the real congestion map describes the real congestion level of each training prediction layer, each training semiconductor chip comprises at least two training prediction layers, and each training prediction layer comprises at least one metal layer.
  28. The apparatus according to claim 27, wherein the apparatus further comprises:
    an aggregation unit, configured to aggregate the predicted congestion maps corresponding to all prediction layers in the semiconductor chip to be predicted to obtain a predicted congestion map corresponding to the semiconductor chip to be predicted.
  29. The apparatus according to claim 28, wherein the predicted congestion map corresponding to each prediction layer comprises a vertical predicted congestion map and a horizontal predicted congestion map, and the aggregation unit is specifically configured to:
    aggregate, with a hierarchical aggregation operator, the vertical predicted congestion maps corresponding to the prediction layers to obtain a reference vertical predicted congestion map; aggregate, with the hierarchical aggregation operator, the horizontal predicted congestion maps corresponding to the prediction layers to obtain a reference horizontal predicted congestion map; and aggregate, with a directional aggregation operator, the reference vertical predicted congestion map and the reference horizontal predicted congestion map to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted;
    or,
    aggregate, with the directional aggregation operator, the vertical predicted congestion map and the horizontal predicted congestion map corresponding to each prediction layer to obtain a reference predicted congestion map corresponding to each prediction layer; and aggregate, with the hierarchical aggregation operator, the reference predicted congestion maps corresponding to the prediction layers to obtain the predicted congestion map corresponding to the semiconductor chip to be predicted.
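Claim 29 leaves both operators abstract. A sketch assuming each is a per-grid maximum (an assumption, not the claimed definition); with that choice, the two claimed orderings (aggregate across layers first, or across directions first) coincide, which the demo checks:

```python
import numpy as np

def hierarchical_agg(layer_maps):
    """Hierarchical aggregation operator (assumed: per-grid maximum
    across the prediction layers, axis 0)."""
    return np.max(layer_maps, axis=0)

def directional_agg(h_map, v_map):
    """Directional aggregation operator (assumed: per-grid maximum of
    the horizontal and vertical congestion maps)."""
    return np.maximum(h_map, v_map)

# Two prediction layers on a 1x2 grid, per-direction predicted congestion.
h_layers = np.array([[[0.2, 0.8]], [[0.4, 0.1]]])
v_layers = np.array([[[0.9, 0.3]], [[0.5, 0.2]]])

# First alternative: layers per direction, then directions.
route_a = directional_agg(hierarchical_agg(h_layers),
                          hierarchical_agg(v_layers))

# Second alternative: directions per layer, then layers.
per_layer = np.array([directional_agg(h, v)
                      for h, v in zip(h_layers, v_layers)])
route_b = hierarchical_agg(per_layer)

print(np.allclose(route_a, route_b))   # True: max commutes, so both orders agree
```

With maximum for both operators the two orderings are interchangeable; other choices (e.g. sum for one operator) would make the order matter.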
  30. A chip system, wherein the chip system comprises at least one processor, a memory, and an interface circuit, the memory, the interface circuit, and the at least one processor are interconnected through lines, and instructions are stored in the at least one memory; when the instructions are executed by the processor, the method according to any one of claims 1-18 is implemented.
  31. A terminal device, wherein the terminal device comprises the chip system according to claim 30, and a discrete device coupled to the chip system.
  32. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions, and when the program instructions are run on a processor, the method according to any one of claims 1-18 is implemented.
  33. A computer program product, wherein, when the computer program product is run on a terminal, the method according to any one of claims 1-18 is implemented.
PCT/CN2021/101860 2021-06-23 2021-06-23 Congestion prediction model training method, image processing method and apparatus WO2022266888A1 (en)

Priority Applications (2)

PCT/CN2021/101860 (WO2022266888A1, filed 2021-06-23): Congestion prediction model training method, image processing method and apparatus
CN202180099695.1A (CN117561515A, filed 2021-06-23): Congestion prediction model training method, image processing method and device


Publications (1)

Publication Number: WO2022266888A1 (A1)
Publication Date: 2022-12-29

Family

ID=84545031

Country Status (2)

CN: CN117561515A
WO: WO2022266888A1

Cited By (1)

CN117787171A * (苏州异格技术有限公司; priority date 2023-12-25; publication date 2024-03-29): Multi-discriminant-based FPGA congestion prediction method and device for CGAN image conversion (* cited by examiner)

Citations (5)

(* cited by examiner)

CN104636546A * (武汉理工大学; priority 2015-01-23; published 2015-05-20): Analog circuit wiring assessment method based on non-uniform grid
US20200257770A1 * (International Business Machines Corporation; priority 2019-02-13; published 2020-08-13): Predicting routability of interconnects
CN112233115A * (西安国微半导体有限公司; priority 2020-12-11; published 2021-01-15): Deep learning-based wiring violation prediction method after placement and readable storage medium
US20210081509A1 * (Taiwan Semiconductor Manufacturing Company Limited; priority 2019-09-16; published 2021-03-18): Integrated circuit layout validation using machine learning
CN112711930A * (西安国微半导体有限公司; priority 2020-12-24; published 2021-04-27): Wire mesh distribution-based routability-driven global layout method and device



Also Published As

CN117561515A (published 2024-02-13)


Legal Events

121: Ep: the EPO has been informed by WIPO that EP was designated in this application (ref document number 21946388; country of ref document: EP; kind code of ref document: A1)
WWE: WIPO information: entry into national phase (ref document number 202180099695.1; country of ref document: CN)
NENP: Non-entry into the national phase (ref country code: DE)