CN104008216B - Method of using a memory compiler to generate optimized memory instances

Info

Publication number: CN104008216B
Application number: CN201310056648.6A
Authority: CN
Inventors: 连南钧; 林孝平; 石维强; 林育均; 叶有伟
Original assignee: M31 Technology Corp
Current assignee: M31 Technology Corp
Priority date: 2013-02-22
Filing date: 2013-02-22
Publication date: 2017-04-26
Anticipated expiration: 2033-02-22
Also published as: CN104008216A

Abstract

A method of using a memory compiler to generate optimized memory instances. Description data of the designed memory is provided, and a front-end model and a back-end model are generated to provide a database. Design criteria are received through a user interface. The memory design is optimized for speed, power and area simultaneously according to the database and the design criteria, thereby generating memory instances.

Description

Method of using a memory compiler to generate optimized memory instances

Technical Field

The present invention relates to a memory compiler, and in particular to a memory compiler that simultaneously considers and automatically optimizes speed, power and area.

Background Art

A memory compiler (for example, a random access memory compiler) can be used to generate memory instances automatically. A memory compiler can also be used to support system-on-chip (SoC) design. However, when generating a memory instance, a conventional memory compiler is specified in terms of only a single characteristic: speed, power or density. As a result, the generated memory instance is usually not optimized for all three aspects at once to meet the customer's requirements.

Furthermore, conventional memory compilers operate at the device level when generating memory instances. Because the devices are so numerous, overall performance must be tuned across a million or more of them, and optimizing a memory instance therefore takes considerable time.

Because conventional memory compilers cannot generate optimized memory instances effectively and quickly, a novel memory compiler is needed to overcome their shortcomings.

Summary of the Invention

In view of the above problems in the prior art, one object of the embodiments of the present invention is to provide a method of using a memory compiler to generate an optimized memory instance, which optimizes the memory design by considering the three factors of speed, power and area simultaneously. In one embodiment, the proposed memory compiler operates at the architecture level, the block level and the device level to accelerate the generation of memory instances.

According to an embodiment of the present invention, description data of the designed memory is provided, and a front-end model and a back-end model are generated to provide a database. Design criteria are received through a user interface. According to the database and the design criteria, the memory design is optimized by considering speed, power and area simultaneously, thereby generating a memory instance.

In a specific embodiment, the optimization step uses a top-down approach: the architecture of the designed memory is decomposed into a plurality of blocks, and the blocks are analyzed by their characteristics to select the best combination; for the decomposed blocks, at least one high-speed database, at least one low-power database and at least one small-area database are obtained from the database; coarse-grained block selection and adjustment are made according to the performance characteristics of the blocks; and once the best combination of blocks has been selected, the parameters of the devices within the blocks are adjusted in finer detail to achieve optimization. The optimization step also uses a bottom-up approach: the adjusted devices are linked to form the blocks, and the blocks are combined to form the memory, so that the overall optimized performance can be examined.

Brief Description of the Drawings

FIG. 1 is a flowchart of a method of using a memory compiler to generate an optimized memory instance according to an embodiment of the present invention.

FIG. 2 is a detailed flowchart of the optimization step of FIG. 1.

FIG. 3 illustrates block decomposition.

FIG. 4 illustrates a three-dimensional constraint surface.

Explanation of Reference Signs

11: Provide memory-related data

12: Front-end model and back-end model

13: Memory compiler user interface

14: Optimization

141: Define formulas

142: Select the relevant part of the database

143: Architecture decomposition

144: Obtain high-speed, low-power and small-area databases

145: Device adjustment

146: Block remapping

147: Architecture remapping

148: Are the constraints met?

149: Instance generation

15: Candidate list

16: Are the requirements met?

41: Three-dimensional constraint surface

2A: Top-down approach

2B: Bottom-up approach

XDEC: X decoder

IO: Input/output circuit

Detailed Description

FIG. 1 shows a flowchart of a method of using a memory compiler to generate an optimized memory instance according to an embodiment of the present invention. This embodiment can be used to generate optimized memory instances such as static random access memory (SRAM), read-only memory (ROM), content addressable memory (CAM) or flash memory.

First, in step 11, description data of the designed memory is provided, for example by a semiconductor foundry. The data provided in step 11 may be, but is not limited to, circuits described for an integrated circuit simulation program (SPICE), design rules (such as topological layout rules (TLR)) or device types (such as random access memory devices). From the provided data, a front-end (F/E) model and a back-end (B/E) model are generated in step 12 to supply a database (library) containing design behavior models to an optimizer, which optimizes the memory design by considering the three factors of speed, power and area (or density) simultaneously. By contrast, conventional memory compilers are developed for only one of speed, power or density rather than all three. In this specification, the front-end model relates to the current, voltage and/or power of the designed memory, while the back-end model relates to its layout pattern. In a preferred embodiment, the proposed method is suited to designing small-area (or high-density) memory, and is more effective than conventional methods at designing optimized small-area (or high-density) memory instances.

Meanwhile, in step 13, a user interface (such as a graphical user interface (GUI)) is installed on the computer that hosts the memory compiler, in order to receive design criteria, such as the instance configuration, from the customer. The user interface also receives the priority order of speed, power and area, as well as the storage capacity of the designed memory (for example, 2 MB or 1 GB). In the subsequent steps, the memory is designed and optimized according to the storage capacity and the priority order.
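The inputs gathered in step 13 can be pictured as a small record. The following is a minimal sketch only: the class name `DesignCriteria`, its fields and the 3:2:1 rank-to-weight conversion are assumptions made for illustration and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignCriteria:
    """Hypothetical container for what the user interface receives in step 13."""

    capacity_bits: int  # storage capacity of the designed memory
    priority: tuple     # ranking, most important first, e.g. ("area", "speed", "power")

    def __post_init__(self):
        # Each of the three factors must be ranked exactly once.
        if sorted(self.priority) != ["area", "power", "speed"]:
            raise ValueError("priority must rank speed, power and area exactly once")

    def weights(self):
        """Convert the ranking into normalized weights (assumed 3:2:1 scheme)."""
        n = len(self.priority)
        raw = {name: n - rank for rank, name in enumerate(self.priority)}
        total = sum(raw.values())
        return {name: value / total for name, value in raw.items()}


# A 2 MB memory where area matters most, then speed, then power.
criteria = DesignCriteria(capacity_bits=2 * 8 * 2**20,
                          priority=("area", "speed", "power"))
```

Here `criteria.weights()` gives area half of the total weight, speed a third and power a sixth. Any other monotone mapping from rank to weight would serve equally well for the purposes of this sketch.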

Next, in step 14, the design is optimized for speed, power and area according to the database provided in step 12 and the constraints received in step 13. The optimization is described in detail below with reference to FIG. 2.

After the optimization of step 14, a candidate list containing a plurality of generated memory instances is prepared in step 15 for final evaluation against the customer's requirements. In step 16, the generated memory instance that best meets the customer's requirements is selected from the candidate list.

FIG. 2 shows a detailed flowchart of the optimization (that is, step 14) of FIG. 1. In step 141, control rules (or formulas) for the speed, power and area of the designed memory are defined according to the constraints received in step 13. Meanwhile, in step 142, the relevant part of the provided database is selected according to those same constraints.
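The patent does not disclose the form of the control rules defined in step 141. One plausible reading, shown here purely as an illustration, is a weighted sum of normalized delay, power and area in which a lower score is better. The function name `make_control_rule`, the metric keys and all numbers below are assumptions.

```python
def make_control_rule(weights, reference):
    """Build a scoring formula from priorities; lower scores are better.

    weights   -- relative importance of each metric (assumed to sum to 1)
    reference -- target value used to normalize each metric
    """
    def score(metrics):
        return sum(weights[key] * metrics[key] / reference[key] for key in weights)
    return score


# Speed (expressed as delay) is given the highest priority in this example.
rule = make_control_rule(
    weights={"delay_ns": 0.5, "power_mw": 0.3, "area_um2": 0.2},
    reference={"delay_ns": 1.0, "power_mw": 10.0, "area_um2": 5000.0},
)

fast = {"delay_ns": 0.8, "power_mw": 12.0, "area_um2": 5500.0}
small = {"delay_ns": 1.2, "power_mw": 9.0, "area_um2": 4000.0}
preferred = min((fast, small), key=rule)  # the speed-oriented design wins here
```

Speed is modeled as delay so that all three terms are minimized in the same direction; swapping the weights to favor area would make the same rule prefer `small`.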

According to one feature of this embodiment, a top-down approach 2A is used to optimize the memory design. In step 143, as illustrated in FIG. 3, the entire architecture of the designed memory is decomposed into a plurality of blocks: memory cells, an X decoder (XDEC), a control circuit and an input/output circuit (IO). The memory architecture can thus be represented at the block level, so that block characteristics can be analyzed and the best combination selected. By contrast, conventional memory compilers operate at the device level, which makes their memory designs harder to manipulate. The blocks of this embodiment may be, but are not limited to, leaf-cell-based blocks.

Next, in step 144, at least one high-speed database, at least one low-power database and at least one small-area (or high-density) database associated with the blocks are obtained from the database provided in step 12. In this embodiment, the modifiers "high" and "low/small" mean that the value of a physical quantity (such as speed, power or area) is greater than or less than, respectively, a preset threshold. Coarse-grained block selection and adjustment are then made according to the performance characteristics of the blocks. Finally, in step 145, once the best combination of blocks has been selected, the parameters of the devices (for example, transistors) in the blocks are adjusted in finer detail, or fine-tuned, if necessary. In this embodiment, the adjusted parameters may include the threshold voltage (for example, low, standard or high threshold voltage), the width/length of a P-type metal-oxide-semiconductor (PMOS) or N-type metal-oxide-semiconductor (NMOS) device, parallel/series devices in the physical layout pattern, and dynamic/static, combinational/sequential gate circuit types.
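The block-level selection of steps 143 and 144 can be sketched as a search over one library flavour per block. The example below is an assumption-laden illustration, not the patented procedure: the block and flavour names, the invented figures in `LIBRARY`, the additive estimate and the exhaustive search are all hypothetical.

```python
from itertools import product

BLOCKS = ("cell_array", "xdec", "control", "io")
FLAVOURS = ("high_speed", "low_power", "small_area")

# Made-up (delay_ns, power_mw, area_um2) figures per block and library flavour.
LIBRARY = {
    "cell_array": {"high_speed": (0.40, 6.0, 3000.0),
                   "low_power": (0.60, 3.5, 3100.0),
                   "small_area": (0.55, 5.0, 2500.0)},
    "xdec": {"high_speed": (0.20, 2.0, 600.0),
             "low_power": (0.30, 1.0, 620.0),
             "small_area": (0.28, 1.6, 450.0)},
    "control": {"high_speed": (0.15, 1.5, 300.0),
                "low_power": (0.25, 0.8, 310.0),
                "small_area": (0.22, 1.2, 220.0)},
    "io": {"high_speed": (0.25, 3.0, 800.0),
           "low_power": (0.35, 1.8, 820.0),
           "small_area": (0.33, 2.5, 650.0)},
}


def estimate(choice):
    """Block-level estimate, assuming delay, power and area simply add up."""
    rows = [LIBRARY[block][flavour] for block, flavour in zip(BLOCKS, choice)]
    return tuple(sum(column) for column in zip(*rows))


def best_combination(weights):
    """Pick one flavour per block that minimizes the weighted metric sum."""
    def cost(choice):
        return sum(w * m for w, m in zip(weights, estimate(choice)))
    return min(product(FLAVOURS, repeat=len(BLOCKS)), key=cost)


speed_only = best_combination((1.0, 0.0, 0.0))  # every block taken from high_speed
```

With four blocks and three flavours there are only 81 combinations to compare, which illustrates why working at the block level is far cheaper than tuning individual devices. Mixed weights generally yield mixed selections, after which device parameters would be fine-tuned as in step 145.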

According to another feature of this embodiment, a bottom-up approach 2B is used to fine-tune the optimization. In step 146, the adjusted devices (of which, for example, some have been adjusted and others left unchanged) are linked (or remapped) to form the individual blocks; in step 147, the blocks are combined (or remapped) to form the memory, and an overall combined simulation of the memory is run to examine the overall optimized performance. If the simulation result meets the constraints (step 148), the corresponding memory instance is generated (step 149); otherwise, according to the priority order received in step 13, the flow returns to step 142 to select another part of the provided database, and the top-down approach 2A and bottom-up approach 2B are performed again. The top-down approach 2A and bottom-up approach 2B are thus performed one or more times to obtain a candidate list containing a plurality of generated memory instances.
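The iteration through steps 142 to 149 can be summarized as a loop that tries successive parts of the database and keeps every result that satisfies the constraints. This is a simplified sketch: the function `optimize`, the stubbed simulator `FAKE_RESULTS` and all figures are hypothetical stand-ins, and a real flow would run the top-down and bottom-up passes where the stub is called.

```python
def optimize(library_parts, constraints, build_and_simulate):
    """Collect a candidate list of instances that satisfy every constraint.

    library_parts      -- database parts to try, in priority order (step 142)
    constraints        -- upper limits per metric received from the user
    build_and_simulate -- stand-in for steps 143-147 (top-down 2A, bottom-up 2B)
    """
    candidates = []
    for part in library_parts:
        metrics = build_and_simulate(part)
        meets = all(metrics[key] <= limit for key, limit in constraints.items())
        if meets:                               # step 148
            candidates.append((part, metrics))  # step 149: instance generated
        # Otherwise fall through and try the next part of the database.
    return candidates


# Stubbed simulation results, invented for this example only.
FAKE_RESULTS = {
    "speed_first": {"delay_ns": 0.9, "power_mw": 13.0, "area_um2": 5200.0},
    "power_first": {"delay_ns": 1.3, "power_mw": 8.0, "area_um2": 5100.0},
    "area_first": {"delay_ns": 1.1, "power_mw": 9.5, "area_um2": 4300.0},
}
constraints = {"delay_ns": 1.2, "power_mw": 10.0, "area_um2": 5000.0}

candidate_list = optimize(["speed_first", "power_first", "area_first"],
                          constraints, FAKE_RESULTS.__getitem__)
```

In this toy run only the third part yields an instance within all three limits, so the candidate list contains a single entry.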

As described above, this embodiment optimizes the memory design by considering speed, power and area simultaneously. Accordingly, as illustrated in FIG. 4, a three-dimensional constraint surface 41 is constructed during optimization. One or more memory instances close to the three-dimensional constraint surface 41 are selected as the best candidates.
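The patent does not give an equation for the constraint surface 41 or define how closeness to it is measured. As a rough illustration only, the sketch below assumes the surface is a plane in normalized (delay, power, area) space and ranks instances by Euclidean distance to that plane. The function names, the plane coefficients and the instance coordinates are all assumptions.

```python
import math


def distance_to_plane(point, normal, offset):
    """Euclidean distance from `point` to the plane normal . x = offset."""
    dot = sum(n * p for n, p in zip(normal, point))
    return abs(dot - offset) / math.sqrt(sum(n * n for n in normal))


def closest_instances(instances, normal, offset, k=1):
    """Return the names of the k instances nearest the assumed surface."""
    ranked = sorted(instances,
                    key=lambda name: distance_to_plane(instances[name], normal, offset))
    return ranked[:k]


# Hypothetical instances in normalized (delay, power, area) coordinates.
instances = {
    "A": (1.0, 1.0, 1.0),
    "B": (0.5, 0.3, 0.2),
    "C": (0.4, 0.4, 0.4),
}
normal, offset = (1.0, 1.0, 1.0), 1.0

best_two = closest_instances(instances, normal, offset, k=2)
```

Instance "B" lies essentially on the assumed plane and "C" is the next nearest, so those two would be kept as candidates. A curved surface would need a different distance computation, but the selection step would be unchanged.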

The above are merely preferred embodiments of the present invention and are not intended to limit its scope; other equivalent changes or modifications made without departing from the spirit disclosed by the invention shall fall within the scope of the present application.

Claims

1. A method of using a memory compiler to generate an optimized memory instance, comprising:

providing description data of the designed memory;

generating a front-end model and a back-end model to provide a database;

receiving design criteria through a user interface; and

optimizing the design of the memory according to the database and the design criteria while considering speed, power and area simultaneously, thereby generating a memory instance,

wherein the design criteria include a priority order for speed, power and area,

wherein the optimizing step comprises:

defining control rules for the speed, power and area of the designed memory according to the priority order and specification requirements;

selecting a relevant part of the database according to the priority order and the specification requirements;

decomposing the architecture of the designed memory into a plurality of blocks;

obtaining, for the decomposed blocks, at least one high-speed database, at least one low-power database and at least one small-area database from the database;

making coarse-grained block selection and adjustment according to the performance characteristics of the blocks;

adjusting the parameters of the elements of the blocks;

linking the adjusted elements to form the blocks;

combining the blocks to form the memory; and

performing an overall combined simulation of the memory.

2. The method of using a memory compiler to generate an optimized memory instance according to claim 1, further comprising:

preparing a candidate list for evaluation, the candidate list including a plurality of the memory instances; and

selecting one of the memory instances from the candidate list.

3. The method of using a memory compiler to generate an optimized memory instance according to claim 1, wherein the description data includes described circuits, design rules or device types.

4. The method of using a memory compiler to generate an optimized memory instance of claim 1, wherein the front-end model is related to current, voltage and/or power of the designed memory.

5. The method of using a memory compiler to generate an optimized memory instance according to claim 1, wherein the back-end model is related to a layout pattern of the designed memory.

6. The method of using a memory compiler to generate an optimized memory instance according to claim 1, further comprising: receiving the storage capacity of the designed memory.

7. The method of using a memory compiler to generate an optimized memory instance according to claim 1, wherein the decomposed blocks include memory cells, an X decoder, a control circuit and an input/output circuit.

8. The method of using a memory compiler to generate an optimized memory instance according to claim 1, wherein the parameters include threshold voltage, width/length of PMOS or NMOS, parallel/series elements and dynamic/static gate circuit types.

9. The method of using a memory compiler to generate an optimized memory instance according to claim 1, wherein the optimizing step generates a three-dimensional constraint surface, and one or more memory instances close to the three-dimensional constraint surface are selected as the best candidates.

10. A method for optimizing a three-dimensional memory compiler, comprising:

defining control rules for the three-dimensional design of the memory according to a three-dimensional priority order of speed, power and area, thereby generating a three-dimensional constraint surface;

decomposing the designed memory into a plurality of blocks;

obtaining, for the decomposed blocks, at least one high-speed database, at least one low-power database and at least one small-area database from a provided database;

adjusting the parameters of the elements of the blocks;

linking the adjusted elements to form the blocks;

combining the blocks to produce a plurality of memory instances; and

selecting one or more memory instances close to the three-dimensional constraint surface as the best candidates.

11. The method for optimizing a three-dimensional memory compiler according to claim 10, wherein the decomposed blocks include memory cells, an X decoder, a control circuit and an input/output circuit.

12. The method for optimizing a three-dimensional memory compiler according to claim 10, wherein the parameters include threshold voltage, width/length of PMOS or NMOS, parallel/series elements and dynamic/static gate circuit types.

13. A method for memory compiler optimization, comprising:

defining control rules for the speed, power and area of the designed memory according to a priority order;

selecting a relevant part of a provided database according to the priority order;

in a top-down manner, decomposing the architecture of the designed memory into a plurality of blocks and then into a plurality of elements;

in a bottom-up manner, linking the elements to form the blocks and then combining the blocks to form the architecture of the memory; and

performing an overall combined simulation of the memory,

wherein the top-down step comprises:

decomposing the architecture of the designed memory into a plurality of blocks;

obtaining, for the decomposed blocks, at least one high-speed database, at least one low-power database and at least one small-area database from the database;

making coarse-grained block selection and adjustment according to the performance characteristics of the blocks; and

adjusting the parameters of the elements of the blocks.

14. The method for memory compiler optimization according to claim 13, wherein the bottom-up step comprises:

linking the adjusted elements to form the blocks; and

combining the blocks to form the memory.

15. The method for memory compiler optimization according to claim 13, wherein the decomposed blocks include memory cells, an X decoder, a control circuit and an input/output circuit.

16. The method for memory compiler optimization according to claim 13, wherein the parameters include threshold voltage, width/length of PMOS or NMOS, parallel/series elements and dynamic/static gate circuit types.