CN112989678A

CN112989678A - Coarse grain parallel iteration method and device for integrated circuit interlayer coupling part accumulation

Info

Publication number: CN112989678A
Application number: CN202110425232.1A
Authority: CN
Inventors: 唐章宏; 邹军; 汲亚飞; 黄承清; 王芬
Original assignee: Beijing Wisechip Simulation Technology Co Ltd
Current assignee: Beijing Wisechip Simulation Technology Co Ltd
Priority date: 2021-04-20
Filing date: 2021-04-20
Publication date: 2021-06-18
Anticipated expiration: 2041-04-20
Also published as: CN112989678B

Abstract

The invention provides a coarse-grain parallel iteration method and device for the partial accumulation of interlayer coupling in an integrated circuit. The method comprises the following steps. First, the calculation of the interlayer coupling of the integrated circuit is divided into two types of parallel coarse grains: the first type calculates the electromagnetic field and current distribution of each layer of the integrated circuit using two-dimensional finite elements, and the second type calculates the influence of a source layer on the other layers using the dyadic Green's function. Second, a management process is established to control the whole iteration loop: the calculation of each iteration is divided into a plurality of calculation tasks based on the parallel coarse grains, a plurality of calculation processes are started in parallel, the calculation tasks are dynamically distributed to the calculation processes, and each calculation process completes its tasks independently. The electromagnetic field and current distribution of each source layer and the action range of each source layer are updated by repeated iteration until the change in all fields is smaller than a specified threshold, at which point the iteration ends. The method and device allow the time needed to calculate the electromagnetic response of a three-dimensional multilayer integrated circuit to decrease linearly with the number of parallel processes.

Description

Coarse grain parallel iteration method and device for integrated circuit interlayer coupling part accumulation

Technical Field

The invention relates to the technical field of integrated circuits, in particular to a coarse grain parallel iteration method and device for integrated circuit interlayer coupling part accumulation.

Background

When an integrated circuit operates, the transmission of high-speed signals forms a high-frequency alternating electromagnetic field on its multilayer layout. At the same time, to improve the performance of electronic equipment while reducing its volume and cost, transistors, other components and circuits are integrated on a small semiconductor substrate. To realize more functions, a very-large-scale integrated circuit has a structure of tens to hundreds of layers; each layer is extremely complex and integrates millions or even tens of millions of transistors, and the structure is multi-scale, currently ranging from the centimeter level down to the nanometer level. For the integrated circuit to work normally and realize its designed function, its power integrity and signal integrity must first be guaranteed. An electromagnetic field analysis method is therefore needed to analyze accurately the power integrity and signal integrity of an integrated circuit with a multi-scale structure of tens to hundreds of layers, which is a major challenge in the electromagnetic field analysis of very-large-scale integrated circuits.

When a traditional method is used to perform electromagnetic field analysis on a three-dimensional large-scale integrated circuit and calculate its electromagnetic response, the whole three-dimensional integrated circuit together with a finite region outside it is generally taken as the calculation region, the extent of that region being set by a given truncation error. The whole calculation region is then meshed, the electromagnetic field distribution over it is calculated, and from this the electromagnetic response of each layer of the integrated circuit is obtained, such as the electromagnetic field distribution and the current and voltage at designated ports. However, the feature sizes of the vias, traces and similar structures of the integrated circuit are at the nanometer scale, the size of the whole integrated circuit is at the centimeter scale, and the calculation region determined by the truncation error is at the decimeter to meter scale. Meshing this multi-scale space as a whole and then analyzing its electromagnetic radiation generates hundreds of millions of mesh cells and unknowns, so the hardware (memory) cost and CPU time cost are excessive. The electromagnetic response of the three-dimensional large-scale integrated circuit can therefore be calculated by combining the finite element method with the method of moments: the finite element method is used inside the integrated circuit region, the method of moments is used in the large region outside the integrated circuit, and the two are coupled at the interface between the integrated circuit and the external space. Because the method of moments integrates only over the interface, a large number of mesh cells and unknowns are eliminated. However, because the scale of the integrated circuit ranges from nanometers to centimeters, applying the finite element method directly to the integrated circuit still produces a huge sparse matrix; and because the finite element method and the method of moments are coupled, the coupling matrix formed at the interface is dense, which greatly increases both the number of non-zero elements of the overall sparse matrix and the complexity of solving it. The calculation time therefore remains long.

Disclosure of Invention

(I) objects of the invention

In view of the above problems, the present invention provides a coarse-grain parallel iteration method and device for the partial accumulation of interlayer coupling in an integrated circuit. Its aims, in the iterative calculation of the partial accumulation of interlayer coupling, are to minimize communication among processes, to avoid the hard-disk read/write bottleneck that arises when the peak memory of a multi-process parallel calculation exceeds the available physical memory, and to solve the process-waiting problem caused by mismatched complexity among calculation instances, thereby greatly improving parallel efficiency.

(II) technical scheme

As a first aspect of the invention, the invention discloses a coarse grain parallel iteration method for integrated circuit interlayer coupling part accumulation, which comprises the following steps: firstly, dividing the calculation of the interlayer coupling of the integrated circuit into two types of parallel coarse particles, wherein the first type of parallel coarse particles are used for calculating the electromagnetic field and current distribution of each layer of the integrated circuit based on a two-dimensional finite element, and the second type of parallel coarse particles are used for calculating the influence of a source layer on other layers based on a dyadic Green function;

secondly, establishing a management process to control the whole iteration loop, dividing the calculation of each iteration into a plurality of calculation tasks based on the divided parallel coarse grains, initiating a plurality of calculation processes in parallel, dynamically distributing the calculation tasks for each calculation process, independently finishing the calculation tasks by each calculation process, and updating the electromagnetic field and current distribution of each source layer of the integrated circuit and the action range of each source layer by repeated iteration until the variation of all the fields is smaller than a specified threshold value, and finishing the iteration.

Further, the calculation of the interlayer coupling of the integrated circuit is divided into two types of parallel coarse particles, the first type of parallel coarse particles is used for calculating the electromagnetic field and current distribution of each layer of the integrated circuit based on a two-dimensional finite element, and the second type of parallel coarse particles is used for calculating the influence of the source layer on other layers based on a dyadic green function, and the method specifically comprises the following steps:

each parallel coarse grain corresponds to one calculation task, and each calculation task is completed by one calculation process. For a parallel coarse grain of the first type, the calculation process uses two-dimensional finite elements to calculate, from the applied source term, the electromagnetic field distribution of layer l (0 ≤ l ≤ N), updates the electromagnetic field and current distribution of that layer, calculates the change dE_l in the electromagnetic field of the layer, and resets G_l = 0. For a parallel coarse grain of the second type, the calculation process uses the dyadic Green's function to calculate the influence of the m-th source layer on layer l, denoted G_ml, where

，

；

Further, a management process is established to control the whole iteration loop, each iteration is divided into a plurality of calculation tasks based on the divided parallel coarse grains, a plurality of calculation processes are initiated in parallel, the calculation tasks are dynamically distributed for each calculation process, each calculation process independently completes the calculation tasks, electromagnetic field and current distribution of each source layer of the integrated circuit and the action range of each source layer are repeatedly updated in an iteration mode until the change quantity of all fields is smaller than a specified threshold value, and the iteration is finished, and the method comprises the following steps:

step S100, let the large-scale integrated circuit have N+1 layers in total, numbered 0, 1, …, N, where layer 0 is the bottom layer. When the current source of layer m of the large-scale integrated circuit is considered, that layer is called the m-th source layer. The layers acted on by the m-th source layer are initially set to be the other N layers of the integrated circuit apart from the m-th source layer, and the farthest distance of the influence range of the m-th source layer is recorded as a number of layers. Establish a management process; the management process sets G_l = 0 for all layers of the large-scale integrated circuit, where G_l denotes the superposed influence of the other source layers on layer l and 0 ≤ l ≤ N; set the iteration count iter = 0;

Step S200, dividingN+1 first-type parallel coarse grains as a task sequence to be calculated;

step S300, the management process takes G_l as the superposed source term of layer l (0 ≤ l ≤ N) due to the influence of the other source layers, starts K calculation processes in parallel, and randomly and dynamically takes calculation tasks out of the task sequence to be calculated and distributes them to the K calculation processes;

step S400, the management process checks whether the task sequence to be calculated is empty, if so, the step S1000 is executed, otherwise, the step S500 is executed;

step S500, the management process checks whether there is an idle calculation process, and if so, randomly and dynamically takes a calculation task out of the task sequence to be calculated and distributes it to the idle calculation process;

step S600, the management process checks whether there are parallel coarse grains of the first type that are completed but unprocessed; if so, go to step S700, otherwise go to step S800;

step S700, for each completed but unprocessed parallel coarse grain of the first type, the management process divides out the corresponding number of parallel coarse grains of the second type, adds them to the task sequence to be calculated, marks the parallel coarse grain of the first type as processed, and goes to step S400;

step S800, the management process checks whether there are parallel coarse grains of the second type that are completed but unprocessed; if so, go to step S900, otherwise wait for a preset time and go to step S400;

step S900, for each completed but unprocessed parallel coarse grain of the second type, the management process accumulates its calculation result G_ml into the influence on layer l: G_l = G_l + G_ml; it marks the parallel coarse grain of the second type as processed and goes to step S400;

step S1000, the management process checks the results of all parallel coarse grains of the first type; if the change dE_l in the electromagnetic field of every layer is smaller than the preset iteration precision, the iteration ends and the electromagnetic field and current distribution of each layer are output; otherwise, step S1100 is executed;

step S1100, the management process checks the results of all parallel coarse grains of the second type, selects the maximum value G_max and the minimum value G_min of G_ml, and calculates from them the effective influence value G of the dyadic Green's function, using threshold, the preset discard threshold for the influence of the dyadic Green's function;

step S1200, the management process selects all G_ml that satisfy the condition |G_ml| < G, denoted G_threshold, and for each source layer m finds, among all G_threshold, the layer l_near nearest to layer m and records the number of layers between them. The action range of the m-th source layer is then updated to an average value based on this number of layers, and the process returns to step S200.
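The control flow of steps S100 to S1200 can be illustrated with a toy model. The sketch below is not the patented solver: it replaces the finite element and Green's function calculations with scalar stand-ins (an assumed coupling factor `coupling ** abs(m - l)`), assumes each type-1 result is an increment to the layer field, runs serially instead of on K processes, and omits the action-range update of steps S1100 and S1200. The function name and all parameters are hypothetical.

```python
import random


def run_partial_accumulation(sources, coupling=0.2, eps=1e-10, seed=0, max_iter=1000):
    """Toy serial model of the manager loop; returns (fields, iterations)."""
    rng = random.Random(seed)
    n = len(sources)
    E = [0.0] * n   # scalar stand-in for the field of each layer
    G = [0.0] * n   # S100: accumulated influence of the other source layers

    for it in range(max_iter):
        tasks = [("fem", l) for l in range(n)]       # S200: one type-1 task per layer
        dE = [0.0] * n
        while tasks:                                 # S400: loop until the queue is empty
            # S300/S500: take a task at random, as if handing it to an idle process
            task = tasks.pop(rng.randrange(len(tasks)))
            if task[0] == "fem":
                l = task[1]
                # Type 1 (assumed incremental form): apply the accumulated source
                # term, plus the external source on the first pass, then reset G_l.
                delta = G[l] + (sources[l] if it == 0 else 0.0)
                E[l] += delta
                dE[l] = abs(delta)
                G[l] = 0.0
                # S700: queue one type-2 task for every other layer
                tasks.extend(("green", l, t, delta) for t in range(n) if t != l)
            else:
                _, m, l, delta = task
                # Type-2 stand-in for the dyadic Green's function, then S900
                G[l] += coupling ** abs(m - l) * delta
        if max(dE) < eps:                            # S1000: convergence check
            return E, it + 1
    raise RuntimeError("no convergence within max_iter")
```

In this linear toy model a type-1 task consumes whatever part of G_l has arrived so far, so the random task order changes the path of the iteration but not the fields it converges to.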

Further, the second type of parallel coarse grain comprises the following specific steps:

first, calculate the electric field generated by a point current source at a field point. This electric field is given by a special analytical expression derived from the layered structure of the integrated circuit. The current sources of a multilayer integrated circuit are layered: the current density distributed over each complex-shaped metal layer of the integrated circuit layout depends only on x and y and is independent of z, i.e. the current density distribution is a function of x and y only;

second, take the expression for the electric field generated by the point current source at the field point as the integrand of a two-dimensional Gaussian integration and, based on the linear superposition principle of fields, calculate the field generated at the same position by a surface current source on a simple-shaped polygon. The field generated at any point in space by the current source within a two-dimensional surface S can be calculated by two-dimensional Gaussian integration:

$$E(x,y,z) = \iint_S E_0(x,y,z;u,v)\,\mathrm{d}u\,\mathrm{d}v \approx \sum_{p}\sum_{q} w_{p,q}\,E_0(x,y,z;u_p,v_q);$$

where E(x,y,z) is the field generated at an arbitrary point (x,y,z) in space by the current source within the two-dimensional surface S, E_0(x,y,z;u,v) is the expression for the field generated at (x,y,z) by a point source at an arbitrary position (u,v) within S, (u_p, v_q) are the Gaussian integration points of the two-dimensional Gaussian integration over S, p and q index the p-th and q-th Gaussian integration points in the u and v directions respectively, and w_{p,q} is the weight factor of the corresponding Gaussian integration point;

third, calculate the fields generated at different positions on other layers of the integrated circuit by the current on a simple-shaped polygon and, based on the linear superposition principle of fields, determine the field generated on the layouts of the other layers of the integrated circuit by the currents on the divided simple-shaped polygons;

fourth, based on the linear superposition principle of fields, determine the influence G_lm of the m-th source layer on layer l.
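The second step can be sketched with a tensor-product Gauss-Legendre rule. The code below is a minimal illustration under stated assumptions: the simple-shaped polygon is taken to be an axis-aligned rectangle at z' = 0, a fixed 3-point rule is used in each direction, and the point-source kernel `inverse_distance` is a placeholder (a static 1/R term), not the dyadic Green's function expression of the first step. All function names are hypothetical.

```python
import math

# 3-point Gauss-Legendre nodes and weights on [-1, 1]
_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss2d(integrand, u_range, v_range):
    """Approximate the double integral of integrand(u, v) over a rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    half_u, mid_u = 0.5 * (u1 - u0), 0.5 * (u1 + u0)
    half_v, mid_v = 0.5 * (v1 - v0), 0.5 * (v1 + v0)
    total = 0.0
    for tp, wp in zip(_NODES, _WEIGHTS):          # index p along u
        for tq, wq in zip(_NODES, _WEIGHTS):      # index q along v
            u_p = mid_u + half_u * tp
            v_q = mid_v + half_v * tq
            total += wp * wq * integrand(u_p, v_q)
    return half_u * half_v * total                # Jacobian of the mapping


def field_at(x, y, z, u_range, v_range, point_kernel):
    """Field at (x, y, z) from a unit-density surface source on the rectangle."""
    return gauss2d(lambda u, v: point_kernel(x, y, z, u, v), u_range, v_range)


def inverse_distance(x, y, z, u, v):
    # Placeholder kernel (static 1/R), NOT the patent's dyadic Green's function.
    return 1.0 / math.sqrt((x - u) ** 2 + (y - v) ** 2 + z ** 2)
```

Because the fields superpose linearly, the contribution of a whole layer would be the sum of `field_at` over all the polygons into which its layout is divided.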

Further, in the iterative process, for each layer of the large-scale integrated circuit, after the influence of the m-th source layer on layer l has been calculated, the management process only accumulates the influence of the other layers on layer l, rather than immediately updating the electromagnetic field and current distribution of layer l. Only when layer l is needed as a source layer is a calculation process allocated to handle the corresponding parallel coarse grain of the first type: it takes G_l as the superposed source term of layer l due to the other source layers, calculates the electromagnetic field and current distribution of layer l, and updates the electromagnetic field and current distribution of that source layer once, in a single step. Conversely, as soon as layer l is needed as a source layer to calculate its influence on the other layers, a calculation process is allocated immediately to update its electromagnetic field and current distribution, without waiting until the influences of all other effective layers on layer l have been fully accumulated into a final G_l. After the updated current distribution of layer l has been used to calculate the influence of this source layer on the other layers, G_l is reset to 0.
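The deferred update described above amounts to a small piece of bookkeeping. The following sketch is illustrative only: the class and method names are hypothetical, G_l is reduced to a scalar, and for simplicity the reset to zero is done when the source term is handed over rather than after the influence on the other layers has been calculated.

```python
class LayerInfluence:
    """Holds the running sum G_l for every layer (scalar stand-in for a field)."""

    def __init__(self, n_layers):
        self.G = [0.0] * n_layers

    def accumulate(self, l, g_ml):
        # A finished type-2 result is only added; layer l is not re-solved yet.
        self.G[l] += g_ml

    def take_source_term(self, l):
        # Called when layer l is next used as a source layer: hand over whatever
        # has accumulated so far (possibly a partial sum) and reset G_l to zero.
        g, self.G[l] = self.G[l], 0.0
        return g
```

Contributions that arrive after `take_source_term` are not lost; they stay in `G` and are applied the next time the layer is solved.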

Further, the specific method for dividing the calculation of the interlayer coupling of the integrated circuit into two types of parallel coarse grains is as follows: partitioning iterative computations of integrated circuit interlayer coupling into non-overlapping computation particles, wherein the computation particles are one or more computation units performing all independent operations of the same type; and acquiring the weighted CPU time and the total CPU time of each calculation particle based on one-time complete serial iterative calculation, and combining the calculation particles into different parallel coarse particles according to the proportion of the weighted CPU time.

As a second aspect, the invention discloses a coarse-grain parallel iteration device for the partial accumulation of interlayer coupling in an integrated circuit, comprising a parallel coarse grain dividing module, a parallel coarse grain operation module and a management process module;

the parallel coarse particle dividing module obtains weighted CPU time of each calculation particle and total CPU time of an iteration method of integral multilayer ultra-large scale integrated circuit interlayer coupling based on one-time complete serial iterative calculation, the calculation particles are combined into two types of parallel coarse particles according to the ratio of the weighted CPU time to the total CPU time, the similar parallel coarse particles are mutually independent, and corresponding calculation task sequences can be randomly disordered;

the parallel coarse grain operation module is used for randomly disordering the sequences of all calculation tasks executed by the same type of parallel coarse grains in the process of executing the parallel coarse grains to form a new calculation task sequence, and dynamically distributing the new calculation task sequence to different calculation processes to complete the parallel calculation of the calculation tasks;

the management process module is used for controlling the whole iteration loop.
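The behaviour of the parallel coarse grain operation module, shuffling independent tasks of one type and handing them to whichever worker is free, can be sketched as follows. This is an assumption-laden illustration: threads from Python's `concurrent.futures` stand in for the calculation processes, and `run_shuffled` and `worker` are hypothetical names.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_shuffled(tasks, worker, n_workers=4, seed=None):
    """Shuffle independent tasks, then let idle workers pull them one by one."""
    order = list(tasks)
    random.Random(seed).shuffle(order)            # randomly disorder the task sequence
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each submitted task is picked up by whichever worker becomes idle first.
        futures = {pool.submit(worker, task): task for task in order}
        for done in as_completed(futures):
            results[futures[done]] = done.result()
    return results
```

Shuffling spreads expensive and cheap tasks across the workers, and pulling tasks on demand avoids a fixed assignment in which one worker is left holding all the expensive ones.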

Further, the management process module carries out steps S100 to S1200 as set out above for the method, and parallel coarse grains of the second type are calculated by the four steps set out above for the method.

Further, in the parallel coarse grain dividing module, the weighted CPU time of a calculation grain is calculated as:

$$\tilde{T}_i = n_i\,T_i$$

where $\tilde{T}_i$ is the weighted CPU time of the i-th calculation grain, $T_i$ is the CPU time of a single calculation of the i-th calculation grain, and $n_i$ is the number of times the i-th calculation grain is executed. The total CPU time of the whole calculation process is calculated as:

$$T = \sum_{i=1}^{m} \tilde{T}_i$$

where T is the total CPU time of the whole calculation process, m is the number of calculation grains into which the whole calculation program is divided, and $\tilde{T}_i$ is the weighted CPU time of the i-th calculation grain. The weighted CPU times of the calculation grains are sorted in descending order and accumulated in turn until the accumulated sum exceeds 90% of the total CPU time; each calculation grain included in the accumulated sum is taken as a parallel coarse grain.
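The selection rule can be written out in a few lines. The sketch below assumes the per-call CPU times and execution counts have already been measured in one complete serial run; the function name and the sample numbers in the usage note are made up for illustration.

```python
def select_coarse_grains(single_times, counts, fraction=0.9):
    """Return (selected indices, weighted times, total) for the 90% rule."""
    # Weighted CPU time of each grain: single-call time times number of calls
    weighted = [t * n for t, n in zip(single_times, counts)]
    total = sum(weighted)
    # Visit grains from most to least expensive
    order = sorted(range(len(weighted)), key=lambda i: weighted[i], reverse=True)
    selected, running = [], 0.0
    for i in order:
        selected.append(i)
        running += weighted[i]
        if running > fraction * total:   # stop once the sum exceeds the threshold
            break
    return selected, weighted, total
```

For example, with single-call times `[2.0, 0.5, 0.25, 1.0]` and counts `[10, 100, 2, 3]`, the weighted times are 20, 50, 0.5 and 3, the total is 73.5, and grains 1 and 0 together (70) already exceed 90% of the total, so only those two become parallel coarse grains.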

(III) advantageous effects

The invention provides a coarse grain parallel iteration method and a device for integrated circuit interlayer coupling part accumulation, which greatly reduce the communication between processes and the waiting time generated by synchronization in the integrated circuit interlayer coupling part accumulation calculation process, and simultaneously ensure that calculation models with unequal complexity are randomly and uniformly distributed on each calculation node by adopting a calculation task random dynamic distribution method, thereby avoiding hard disk read-write bottleneck caused by virtual memory access due to overhigh peak value memory.

Drawings

The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining and illustrating the present invention and should not be construed as limiting the scope of the present invention.

FIG. 1 is a block diagram of the main steps of a first embodiment of the present invention;

FIG. 2 is a diagram of the steps (upper part) of the management process of the present invention controlling the entire iterative loop;

FIG. 3 is a step diagram of the management process of the present invention controlling the entire iterative loop (lower part);

FIG. 4 is a flow chart of a second type of parallel coarse grain computation in the present invention;

FIG. 5 is a schematic decomposition of the electric field generated at a field point by a point source in the present invention;

FIG. 6 is a block diagram of a second embodiment of the present invention.

Detailed Description

In order to make the implementation objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be described in more detail below with reference to the accompanying drawings in the embodiments of the present invention.

It should be noted that: in the drawings, the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described are some embodiments of the present invention, not all embodiments, and features in embodiments and embodiments in the present application may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

A first embodiment of the coarse-grain parallel iteration method and device for the partial accumulation of interlayer coupling in an integrated circuit according to the present invention is described in detail below with reference to FIGS. 1 to 5. As shown in FIG. 1, the coarse-grain parallel iteration method of this embodiment comprises the following steps: first, the calculation of the interlayer coupling of the integrated circuit is divided into two types of parallel coarse grains, where the first type calculates the electromagnetic field and current distribution of each layer of the integrated circuit using two-dimensional finite elements, and the second type calculates the influence of a source layer on the other layers using the dyadic Green's function.

Further, a management process is established to control the whole iteration loop: each iteration is divided into a plurality of calculation tasks based on the parallel coarse grains, a plurality of calculation processes are started in parallel, the calculation tasks are dynamically distributed to the calculation processes, each calculation process completes its tasks independently, and the electromagnetic field and current distribution of each source layer of the integrated circuit and the action range of each source layer are updated by repeated iteration until the change in all fields is smaller than a specified threshold, at which point the iteration ends. As shown in FIGS. 2 and 3, this comprises steps S100 to S1200 as set out above.

Further, the two-dimensional finite element calculation for the first type of parallel coarse grains is as follows. For the direct-current electric field model, the three-dimensional model of the multilayer integrated circuit is one in which the conductivity σ and the potential u are both functions of the three-dimensional spatial coordinates (x, y, z), i.e.:

σ = σ(x, y, z), u = u(x, y, z),

and these functions satisfy equation (1) together with boundary condition (2). Equation (1) and boundary condition (2) involve the first-type boundary, the normal n to the second-type boundary, the prescribed value of the potential u on the first-type boundary, and the bulk current density of the external circuit;

the board dimensions of an actual PCB or chip package in a multilayer very-large-scale integrated circuit are far larger than the thickness of the metal layers, so the three-dimensional direct-current field problem of the multilayer integrated circuit is simplified to a two-dimensional direct-current field problem;

the field-solving equations established by the finite element method for the two-dimensional model form equation set (3), in which I(u) is the functional, t is the thickness of the metal layer, and the remaining quantities are the conductivity, potential and area of grid cell e, the surface current density, and the edge of grid cell e;
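How a two-dimensional finite element discretization of a thin conducting sheet assembles its element contributions can be illustrated with the textbook linear triangle. The sketch below is a generic element matrix for the sheet equation -div(σ t grad u) = source; it is offered as an assumption about the kind of element involved and is not a reproduction of equation set (3). The function name is hypothetical.

```python
def triangle_stiffness(xy, sigma, t):
    """3x3 element matrix of a linear triangle for -div(sigma * t * grad u)."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    # Gradient coefficients of the three linear shape functions
    b = (y2 - y3, y3 - y1, y1 - y2)
    c = (x3 - x2, x1 - x3, x2 - x1)
    area = 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    scale = sigma * t / (4.0 * area)   # sheet conductance divided by 4 * area
    return [[scale * (b[i] * b[j] + c[i] * c[j]) for j in range(3)]
            for i in range(3)]
```

Each row of the element matrix sums to zero, which reflects the fact that a constant potential drives no current through the cell.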

for the alternating electromagnetic field model, the three-dimensional model of the multilayer integrated circuit is the three-dimensional model of the electromagnetic response characteristics in frequency-domain simulation of the multilayer very-large-scale integrated circuit, in which the dielectric constant ε, the magnetic permeability μ, the electric field intensity E and the magnetic field intensity H are all functions of the three-dimensional spatial coordinates (x, y, z), i.e.:

ε = ε(x, y, z), μ = μ(x, y, z), E = E(x, y, z), H = H(x, y, z),

and these functions satisfy the governing equations of the model, in which J is the applied current density distribution, ω is the angular frequency of the integrated circuit simulation, ∇×H is the curl of the magnetic field intensity H, ∇×E is the curl of the electric field intensity E, and j is the imaginary unit, j² = -1;

the board dimensions of an actual PCB or chip package in a multilayer very-large-scale integrated circuit are far larger than the metal layer spacing, so the three-dimensional model of the electromagnetic response characteristics in frequency-domain simulation is simplified to a two-dimensional model, in which the dielectric constant ε, the magnetic permeability μ, the electric field intensity E and the magnetic field intensity H are all functions of the two-dimensional planar coordinates (x, y), i.e.:

ε = ε(x, y), μ = μ(x, y), E = E(x, y), H = H(x, y),

their distributions are independent of z, and the potential u of the field and the surface current density J_s satisfy relations expressed in terms of the unit vectors in the x, y and z directions, the z component E_z of the electric field intensity, the x and y components H_x and H_y of the magnetic field intensity, and the metal layer spacing h;

through the simplification from the three-dimensional model to the two-dimensional model, the extremum formula of the two-dimensional finite element functional corresponding to the two-dimensional model is obtained. Its terms are: the functional and the condition that it takes an extremum; the surface admittance, area, current density, surface resistance and potential of grid cell i; the open boundary condition on the k-th boundary; the potential distribution u_k on that boundary; and positions immediately to the right and to the left of the boundary, infinitely close to it.

As shown in fig. 4, the second type of parallel coarse grain comprises the following specific steps:

first, as shown in FIG. 5, the first stepmSource layer point current source atlThe electric field expression generated by the layer is a special analytical expression formed according to a special structure of the integrated circuit layering and given by a dyadic green function, and the analytical expression is specifically as follows: aiming at the frequency domain electromagnetic field of the multilayer integrated circuit layout, the electric field intensity generated by the point source at any layer of field point is calculated by adopting a dyadic Green function, and the electric field intensity in nine directions of any point of any layer of the multilayer integrated circuit layout can be solved through the following formula to express that the electric field expression of the point source to the field point is solved:

the electric field generated by the point current source at the field point is expressed as:

where

i is the imaginary unit, i² = -1;

denotes the Bessel function of order 0;

denotes the Bessel function of order 1;

is a function of the Bessel integration coefficient,

; x, y, z are the coordinates of the field point,

,

,

are the coordinates of the source point; the angular frequency is

,

denotes the frequency;

indicates that the field point is in layer

,

is the z coordinate at the boundary of layer

;

,

are, respectively, the horizontal and vertical complex wavenumbers of layer

;

are, respectively, the horizontal and vertical dielectric constants of layer

;

,

are, respectively, the horizontal and vertical magnetic permeabilities of layer l;

is the anisotropy coefficient of layer l;

,

are, respectively, the integration coefficients of the horizontal and vertical complex wavenumbers of layer l;

are the undetermined coefficients of layer l; A_l and B_l are obtained by solving the following linear equation:

T₁ is a 2n×2n complex matrix,

is a complex vector of length 2n;

are obtained by solving the following linear equation:

T₂ is a 2n×2n complex matrix,

is a complex vector of length 2n;

are obtained by solving the following linear equation:

T₃ is a 2n×2n complex matrix,

is a complex vector of length 2n;

denotes the x component of the electric field generated at a field point in layer l by an x-oriented electric dipole;

denotes the y component of the electric field generated at a field point in layer l by an x-oriented electric dipole;

denotes the z component of the electric field generated at a field point in layer l by an x-oriented electric dipole;

denotes the x component of the electric field generated at a field point in layer l by a y-oriented electric dipole;

denotes the y component of the electric field generated at a field point in layer l by a y-oriented electric dipole;

denotes the z component of the electric field generated at a field point in layer l by a y-oriented electric dipole;

denotes the x component of the electric field generated at a field point in layer l by a z-oriented electric dipole;

denotes the y component of the electric field generated at a field point in layer l by a z-oriented electric dipole;

denotes the z component of the electric field generated at a field point in layer l by a z-oriented electric dipole.

The current sources of a multilayer integrated circuit are distributed in layers; that is, the current density on each metal layer of an integrated circuit layout of complex shape depends only on x and y and is independent of z, so the current density distribution is a function of x and y only.

；

where E(x,y,z) is the field generated at an arbitrary point (x,y,z) in space by the current source in the two-dimensional surface S,

is the expression of the dyadic Green's function for the field generated at an arbitrary point (x,y,z) in space by a source at an arbitrary position (u,v) within the two-dimensional surface S,

denotes the Gaussian integration points of the two-dimensional Gaussian integration over S, where p and q index the p-th and q-th Gaussian integration points in the u and v directions, respectively, and

is the weight factor corresponding to the Gaussian integration point;
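As an illustration of the two-dimensional Gaussian integration described above, the following minimal sketch approximates the surface integral of a kernel times a current density over a rectangular patch by a weighted double sum over Gaussian integration points. The function names, the rectangular patch, the fixed 3-point rule and the scalar `kernel` (standing in for one component of the dyadic Green's function) are assumptions made for this example and are not taken from the patent.

```python
import math

# 3-point Gauss-Legendre nodes and weights on [-1, 1]
# (exact for polynomials up to degree 5 in each direction).
_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_points(a, b):
    """Map the reference nodes and weights onto the interval [a, b]."""
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    return [(mid + half * t, half * w) for t, w in zip(_NODES, _WEIGHTS)]


def field_from_surface_source(kernel, density, field_point, u_range, v_range):
    """Approximate the integral over S of kernel(x, y, z, u, v) * density(u, v).

    kernel  : hypothetical stand-in for one Green's-function component
    density : current density on the layer, a function of (u, v) only
    """
    x, y, z = field_point
    total = 0.0
    for u_p, w_p in gauss_points(*u_range):
        for v_q, w_q in gauss_points(*v_range):
            total += w_p * w_q * kernel(x, y, z, u_p, v_q) * density(u_p, v_q)
    return total
```

Because the sum only multiplies and adds, the same routine also works when `kernel` returns complex values, as a frequency-domain Green's function would.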

Further, during the iteration, for each layer of the very-large-scale integrated circuit, after the influence of a source layer on the l-th layer has been calculated, the management process only accumulates the influence of the other layers on the l-th layer into G_l, rather than immediately updating that layer's electromagnetic field and current distribution. Only when the l-th layer is itself needed as a source layer is a computing process allocated to handle a parallel coarse grain of the first type: with G_l as the superimposed source term of the l-th layer due to the other source layers, the electromagnetic field and current distribution of the l-th layer are calculated and the source layer is updated once, in a single step. Moreover, as soon as the l-th layer is needed as a source layer to calculate its influence on other layers, a computing process is allocated immediately to update its electromagnetic field and current distribution, without waiting until the influences of all other effective layers on the l-th layer have been fully accumulated into a final G_l. After the updated current distribution of the l-th layer has been used to calculate the influence of this source layer on other layers, G_l is reset to 0.
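The partial-accumulation bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions: fields and source terms are reduced to scalars, `solve_layer` is a hypothetical stand-in for the two-dimensional finite element solve, and G_l is reset immediately after the layer update (as in claim 2) rather than after the layer's influence on other layers has been computed.

```python
class LayerAccumulator:
    """Keeps the accumulated source term G[l] and the current field of each layer."""

    def __init__(self, n_layers):
        self.G = [0.0] * n_layers      # influence accumulated from other source layers
        self.field = [0.0] * n_layers  # last computed field of each layer

    def add_influence(self, l, g_ml):
        """Accumulate the influence of some source layer m on layer l; no solve yet."""
        self.G[l] += g_ml

    def update_as_source(self, l, solve_layer):
        """Solve layer l with whatever has been accumulated so far, then reset G[l].

        solve_layer(l, source_term) is a hypothetical 2D solver returning the new field.
        Returns the change dE of the layer's field.
        """
        new_field = solve_layer(l, self.G[l])
        d_e = abs(new_field - self.field[l])
        self.field[l] = new_field
        self.G[l] = 0.0
        return d_e
```

The point of the design is that `update_as_source` never waits: it uses the partial sum available at the moment the layer is needed as a source.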

The weighted CPU time of a computation particle is calculated by the following formula:

in the formula:

where T is the total CPU time of the entire calculation process, m is the number of computation particles into which the whole calculation program is divided,

Specifically, suppose that, according to the definition of computation particles, the iterative computation of the interlayer coupling of a multilayer very-large-scale integrated circuit is divided into three computation particles c1, c2 and c3, which together carry out the whole calculation. If c1 executes 500 computing tasks, c2 executes 200 and c3 executes 5, then the whole calculation consists of 705 computing tasks yet requires only the three computation particles c1, c2 and c3, each of which comprises at least one independent operation (computing task).

The computation particles are sorted by their weighted CPU time. For example, if the weighted CPU time is 0.1 s for c1, 100 s for c2 and 0.2 s for c3, the sorted order is c2 > c3 > c1. The weighted CPU times are then added in descending order, i.e. T(c2) + T(c3) + …, until the sum exceeds 90% of the total CPU time. If T(c2) + T(c3) exceeds 90% of the total CPU time, then c2 and c3 each become a parallel coarse grain; if T(c2) alone exceeds 90% of the total CPU time, then c2 is the parallel coarse grain.
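The selection rule above can be sketched in a few lines. The function name and the dictionary input are hypothetical, and the sketch assumes that the total CPU time equals the sum of the weighted CPU times of all computation particles, which the text does not state explicitly.

```python
def select_coarse_grains(weighted_cpu_time, threshold=0.9):
    """Pick computation particles, largest first, until they exceed `threshold` of the total.

    weighted_cpu_time maps a particle name to its weighted CPU time in seconds.
    """
    total = sum(weighted_cpu_time.values())  # assumption: total = sum of weighted times
    selected, running = [], 0.0
    ranked = sorted(weighted_cpu_time.items(), key=lambda kv: kv[1], reverse=True)
    for name, t in ranked:
        selected.append(name)
        running += t
        if running > threshold * total:
            break
    return selected
```

With the example values from the text, `select_coarse_grains({"c1": 0.1, "c2": 100.0, "c3": 0.2})` selects only c2, since c2 alone accounts for more than 90% of the summed time.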

The parallel coarse grains are classified. Parallel coarse grains of the same class are mutually independent, so the order of their computing tasks may be shuffled arbitrarily. While the parallel coarse grains are being executed, the order of all computing tasks of the grains in one class is randomly shuffled to form a new computing task sequence, which is dynamically distributed to different computing processes so that the computing tasks are carried out in parallel.

Specifically, the computing task sequence is randomly shuffled as follows:

First, for the computing task sequence

a corresponding sequence of random numbers is generated,

, m = 1, 2, 3, …, M. Then the sequence

is sorted in ascending order, giving the sorted sequence

. Finally, a new, non-repeating computing task sequence is generated,

, where

is the position of

in

.
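The shuffle described above (one random number per task, then a sort of the random numbers) can be sketched as follows. The function name and the `seed` parameter are assumptions added for reproducibility in this example.

```python
import random


def shuffle_by_random_keys(tasks, seed=None):
    """Return the tasks in the order obtained by sorting one random key per task."""
    rng = random.Random(seed)
    keys = [rng.random() for _ in tasks]  # one random number per task
    # Sort task indices by their keys in ascending order.
    order = sorted(range(len(tasks)), key=lambda m: keys[m])
    # The result is a permutation: every task appears exactly once.
    return [tasks[m] for m in order]
```

Because the result is a permutation of the input, no task is repeated or dropped, which matches the "non-repeating" requirement on the new task sequence.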

The key point is that the sequence of all computing tasks in the parallel grains

is randomly shuffled to generate a new, non-repeating computing task sequence

and the computing tasks are then distributed in that order, which is equivalent to distributing the original tasks at random. The feature of this random distribution strategy is that it completely scrambles the distribution order of all computing tasks, so that the sum of the peak memory occupied by the tasks running simultaneously on all computing nodes is determined by the number of processes and the average peak memory of all models (computation particles), rather than by the maximum peak memory.

The main process then distributes all computing tasks to be executed by the parallel coarse grains to all processes, including the main process itself, according to the new computing task sequence, and the computing tasks of the parallel coarse grains are carried out in parallel.

In addition, when a computing task in a parallel coarse grain is allocated to a process, a marker file is generated to indicate that the task has been allocated. When another process applies for that computing task, it attempts to generate the task's marker file; if the marker file already exists, that process automatically applies for the next computing task.

In multi-process parallel operation, every process has an equal chance of being allocated a given computing task. If no measure is taken, several processes may be allocated the same task, which wastes computing resources; some measure is therefore needed to ensure that each computing task is allocated to exactly one process. The simplest and most intuitive measure is to mark a task at the moment it is allocated to a process, so that no other process is allocated it again. However, in parallel operation the variables of each process are generally independent of one another, the computing tasks are asymmetric and the processes are in different running states, so information that one process records in a variable to mark a task as allocated cannot be passed to the other processes immediately. An external, explicit marking method is therefore required, so that all processes can see the mark as soon as a task is marked. Hence, as soon as a computing task in a parallel coarse grain is allocated to a process, a marker file for that task is generated; when a process applies for a computing task, it attempts to generate the task's marker file, and if the marker file already exists, the task has already been allocated and the process automatically applies for the next task.

The specific steps for allocating computing tasks correctly by means of marker files are as follows:

Step A1: a process applies for allocation of the i-th computing task;

Step A2: check whether the marker file Fi of the i-th computing task exists; if it exists, jump to step A8; if it does not exist, go to step A3;

Step A3: check whether the marker file Fi is locked; if it is locked, jump to step A8; if it is not locked, go to step A4;

Step A4: lock the marker file Fi;

Step A5: generate the marker file Fi;

Step A6: unlock the marker file Fi;

Step A7: carry out the computation of the i-th computing task;

Step A8: check whether all computing tasks in the parallel coarse grain have been completed; if not, set i = i + 1 and return to step A1; if so, go to step A9;

Step A9: all computing tasks to be executed by the parallel coarse grain have been allocated to the processes, and the allocation for this parallel coarse grain is finished; the procedure returns to allocate the computing tasks to be executed by each of the other parallel coarse grains.
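The marker-file allocation can be sketched as below. Note the substitution: instead of the separate exists / lock / create / unlock steps A2 to A6, this sketch relies on exclusive file creation (`os.open` with `O_CREAT | O_EXCL`), which on a local file system either creates the file or fails if it already exists. That choice, the file naming scheme and the function names are assumptions of this example, not the patent's procedure.

```python
import os


def try_claim(marker_dir, task_id):
    """Try to create the marker file for a task; True means this caller owns the task."""
    path = os.path.join(marker_dir, f"task_{task_id}.flag")  # hypothetical naming scheme
    try:
        # O_EXCL makes creation fail if the marker already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another process already claimed this task
    os.close(fd)
    return True


def run_worker(marker_dir, n_tasks, do_task):
    """Walk through tasks 0..n_tasks-1 and execute only those this worker claims."""
    done = []
    for i in range(n_tasks):
        if try_claim(marker_dir, i):
            do_task(i)
            done.append(i)
        # otherwise: task i is taken, move on to task i + 1 (step A8)
    return done
```

Each worker process would call `run_worker` on the same marker directory; a task whose marker already exists is skipped, so every task is executed by exactly one worker.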

In particular, when the voltage drop and current distribution of a power supply layer of the integrated circuit are analyzed, the operating frequency is low and a direct-current field model is used. In this case there is no spatial coupling between the layers of the integrated circuit, only physical coupling; that is, only layers connected to each other through vias and external circuits are mutually coupled. The layers that influence one another are then fixed, and the influence range

does not need to be corrected iteratively.

As can be seen from the above iteration steps, during the iteration the range over which each source layer exerts influence on other layers is adjusted adaptively according to the magnitude of the dyadic Green's function influence value of each layer, instead of applying the influence of the source layer to all other layers every time, which accelerates the iterative computation. A further advantage of the above iterative method is that, after the influence of a source layer on the l-th layer has been calculated, the influence of the other layers on the l-th layer is only accumulated, rather than being used to update that layer's electromagnetic field and current distribution immediately. Only when the l-th layer is needed as a source layer is a computing process allocated to handle a parallel coarse grain of the first type: with G_l as the superimposed source term of the l-th layer due to the other source layers, the electromagnetic field and current distribution of the l-th layer are calculated and the source layer is updated once, in a single step. Moreover, as soon as the l-th layer is needed as a source layer to calculate its influence on other layers, a computing process is allocated immediately to update its electromagnetic field and current distribution, without waiting until the influences of all other effective layers on the l-th layer have been fully accumulated into a final G_l. This removes a large amount of waiting time and improves the efficiency of the parallel computation.

A second embodiment of the coarse grain parallel iterative method and apparatus for integrated circuit interlayer coupling partial accumulation according to the present invention is described in detail with reference to fig. 2, 3, 4, 5 and 6. As shown in fig. 6, the coarse grain parallel iteration apparatus for integrated circuit interlayer coupling partial accumulation according to this embodiment includes a parallel coarse grain division module, a parallel coarse grain operation module, and a management process module;


The management process module is used for controlling the whole iteration loop.

Further, as shown in fig. 2 and 3, the specific operation steps of the management process module are as follows:

the N layers of the integrated circuit other than the m-th source layer, denoted

i.e. the farthest extent of the influence range of the m-th source layer is N_eff layers; layer 0 is the bottom layer. A management process is established; the management process sets G_l = 0 for all layers of the very-large-scale integrated circuit, where G_l denotes the superimposed influence of the other source layers on the l-th layer, 0 ≤ l ≤ N; the iteration count is set to iter = 0;

Step S300: the management process takes G_l as the superimposed source term of the l-th layer (0 ≤ l ≤ N) due to the other source layers, launches K computing processes in parallel, and randomly and dynamically takes computing tasks from the pending task sequence and distributes them to the K computing processes;

is the preset iteration precision; otherwise, go to step S1100;

is denoted as

; update

to the average value of

, i.e.

; then go to step S200.
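The overall control loop of the management process (shuffle the tasks, hand them to K workers, repeat until every field change falls below the preset precision) can be sketched as follows. This is a simplified illustration: threads from `concurrent.futures` stand in for the computing processes, `run_task` is a hypothetical callback returning the field change of one task, and the adaptive update of the influence range is omitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_iterations(tasks, run_task, k_workers, tol, max_iter=100, seed=0):
    """Repeat the task sweep until every reported field change is below tol.

    run_task(task, iteration) must return the field change dE for that task.
    Returns the number of sweeps performed.
    """
    rng = random.Random(seed)
    for iteration in range(1, max_iter + 1):
        order = list(tasks)
        rng.shuffle(order)  # random task order before distribution
        with ThreadPoolExecutor(max_workers=k_workers) as pool:
            # Idle workers pick up the next queued task as soon as they finish one.
            changes = list(pool.map(lambda t: run_task(t, iteration), order))
        if max(changes) < tol:
            return iteration  # all field changes are below the preset precision
    return max_iter
```

For CPU-bound solvers, a process pool would be the closer analogue of the computing processes described in the text; threads are used here only to keep the sketch self-contained.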

，

in the equation (1),

and boundary condition (2):

where

denotes the value of the potential u on the first-kind boundary

, written as

,

is the bulk current density of the external circuit;

is the electrical conductivity of grid cell e,

is the potential of grid cell e,

is the area of grid cell e,

is the surface current density,

denotes the edges of grid cell e;

Magnetic permeability of

,

,

，

the function of the three-dimensional model satisfies the following equation:

where J is the applied current density distribution,

is the angular frequency of the integrated circuit simulation,

denotes the curl of the electric field intensity E, and j is the imaginary unit, j² = -1;



The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. A coarse grain parallel iteration method for coupling partial accumulation between layers of an integrated circuit is characterized by comprising the following steps: firstly, dividing the calculation of the interlayer coupling of the integrated circuit into two types of parallel coarse particles, wherein the first type of parallel coarse particles are used for calculating the electromagnetic field and current distribution of each layer of the integrated circuit based on a two-dimensional finite element, and the second type of parallel coarse particles are used for calculating the influence of a source layer on other layers based on a dyadic Green function;

2. The method of claim 1, wherein the computing of the inter-layer coupling of the integrated circuit is divided into two types of parallel coarse grains, the first type of parallel coarse grains are used for computing electromagnetic field and current distribution of each layer of the integrated circuit based on two-dimensional finite elements, and the second type of parallel coarse grains are used for computing influence of a source layer on other layers based on a dyadic Green function, and the method specifically comprises:

each parallel coarse grain corresponds to one computing task, which is completed by one computing process; for a parallel coarse grain of the first type, the computing process uses two-dimensional finite elements to calculate the electromagnetic field distribution of the l-th layer (0 ≤ l ≤ N), updates the electromagnetic field and current distribution of that layer, calculates the change dE_l of the layer's electromagnetic field, and resets G_l = 0; for a parallel coarse grain of the second type, the computing process uses the dyadic Green's function to calculate the influence of the m-th source layer on the l-th layer, denoted G_ml, where

，

。

3. The integrated circuit interlayer coupling part accumulated coarse grain parallel iteration method according to claim 1, wherein a management process is established to control the whole iteration loop, the calculation of each iteration is divided into a plurality of calculation tasks based on the divided parallel coarse grains, the plurality of calculation processes are initiated in parallel, the calculation tasks are dynamically distributed for each calculation process, each calculation process independently completes the calculation tasks, the electromagnetic field and current distribution of each source layer of the integrated circuit and the action range of each source layer are updated by repeated iteration until the change quantity of all fields is smaller than a specified threshold value, and the iteration is finished, comprising the following steps:

the N layers of the integrated circuit other than the m-th source layer, denoted

i.e. the farthest extent of the influence range of the m-th source layer is

layers; layer 0 is the bottom layer; S100, establishing a management process; the management process sets G_l = 0 for all layers of the very-large-scale integrated circuit, where G_l denotes the superimposed influence of the other source layers on the l-th layer,

; setting the iteration count

;

is the preset iteration precision; otherwise, go to step S1100;

is denoted as

; update

to the average value of

, i.e.

; then go to step S200.

4. The integrated circuit interlayer coupling partial accumulation coarse grain parallel iterative method of claim 3, wherein the second type of parallel coarse grain comprises the following steps:

first, calculating the electric field generated by a point current source at a field point, wherein the expression for this field is a special analytical expression derived from the layered structure of the integrated circuit; the current sources of a multilayer integrated circuit are distributed in layers, that is, the current density on each metal layer of an integrated circuit layout of complex shape depends only on x and y and is independent of z, so the current density distribution is a function of x and y only;

；

is the weight factor corresponding to the gaussian integral point;

5. The method of claim 3, wherein, during the iteration, for each layer of the very-large-scale integrated circuit, after the influence of a source layer on the l-th layer has been calculated, the management process only accumulates the influence of the other layers on the l-th layer into G_l, rather than immediately updating that layer's electromagnetic field and current distribution; only when the l-th layer is needed as a source layer is a computing process allocated to handle a parallel coarse grain of the first type, in which, with G_l as the superimposed source term of the l-th layer due to the other source layers, the electromagnetic field and current distribution of the l-th layer are calculated and the source layer is updated once, in a single step; as soon as the l-th layer is needed as a source layer to calculate its influence on other layers, a computing process is allocated immediately to update its electromagnetic field and current distribution, without waiting until the influences of all other effective layers on the l-th layer have been fully accumulated into a final G_l; and after the updated current distribution of the l-th layer has been used to calculate the influence of this source layer on other layers, G_l is reset to 0.

6. The method of claim 1, wherein the method for dividing the computation of the inter-layer coupling of the integrated circuit into two parallel coarse-grained classes comprises: partitioning iterative computations of integrated circuit interlayer coupling into non-overlapping computation particles, wherein the computation particles are one or more computation units performing all independent operations of the same type; and acquiring the weighted CPU time and the total CPU time of each calculation particle based on one-time complete serial iterative calculation, and combining the calculation particles into different parallel coarse particles according to the proportion of the weighted CPU time.

7. A coarse grain parallel iteration device for coupling part accumulation between layers of an integrated circuit is characterized by comprising a parallel coarse grain dividing module, a parallel coarse grain operation module and a management process module;

the management process module is used for controlling the whole iteration loop.

8. The apparatus of claim 7, wherein the management process module performs the following steps:

the N layers of the integrated circuit other than the m-th source layer, denoted

i.e. the farthest extent of the influence range of the m-th source layer is

layers; layer 0 is the bottom layer; establishing a management process; the management process sets G_l = 0 for all layers of the very-large-scale integrated circuit, where G_l denotes the superimposed influence of the other source layers on the l-th layer,

; setting the iteration count

;

is the preset iteration precision; otherwise, go to step S1100;

is denoted as

; update

to the average value of

, i.e.

; then go to step S200.

9. The integrated circuit interlayer coupled partial accumulation coarse grain parallel iteration device of claim 7, wherein the second type of parallel coarse grain comprises the following steps:

；

is the expression of the dyadic Green's function for the field generated at an arbitrary point (x,y,z) in space by a source at an arbitrary position (u,v) within the two-dimensional surface S,

is the weight factor corresponding to the gaussian integral point;

10. The apparatus of claim 7, wherein the parallel coarse grain partitioning module computes the weighted CPU time of the computation grain according to the formula:

in the formula: