US20040130346A1 - Silicon object array with unidirectional segmented bus architecture - Google Patents
- Publication number
- US20040130346A1, US10/337,494
- Authority
- US
- United States
- Prior art keywords
- bus
- multiplexer
- coupled
- input
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17736—Structural details of routing resources
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17724—Structural details of logic blocks
- H03K19/17732—Macroblocks
Definitions
- the present invention relates to semiconductor integrated circuits. More particularly, the present invention relates to an architecture for communicating between a plurality of processing elements, called silicon objects, within an integrated circuit.
- integrated circuits have been designed with a plurality of individual, programmable processing elements, which are arranged to form an array.
- Each processing element can be implemented to use dedicated, “nearest-neighbor” connections to allow that processing element to communicate with the eight nearest neighbors in the array.
- the eight nearest neighbors are located to the north, south, east, west, northwest, northeast, southwest, and southeast of the processing element.
- One embodiment of the present invention is directed to a logic array, which includes a unidirectional segmented bus and a plurality of silicon objects.
- the bus includes a string of unidirectional bus segments.
- Each silicon object includes a bus input coupled to one of the bus segments in the bus, and a bus output coupled to a next subsequent one of the bus segments in the bus.
- a landing circuit is coupled to the bus input for receiving digital information from the bus input.
- a function-specific logic block is coupled to an output of the landing circuit and has a result output.
- Each silicon object further includes a multiplexer having first and second inputs coupled to the bus input and the result output, respectively, and having an output coupled to the bus output.
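- The pass-or-replace behavior of this embodiment can be sketched in a few lines of Python. This is only an illustrative model, not the patented circuit: the class name `SiliconObject`, the `select_result` flag and the helper `run_bus` are hypothetical, the logic block is an arbitrary function, and clocking is ignored so that each value reaches every segment in one pass.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SiliconObject:
    """One processing element between two segments of a single bus (sketch)."""
    logic: Callable[[int], int]   # stand-in for the function-specific logic block
    select_result: bool = False   # hypothetical mux select: False = pass, True = replace
    landed: int = 0               # value held by the landing circuit

    def step(self, bus_in: int) -> int:
        # The landing circuit receives the value from the bus input.
        self.landed = bus_in
        # The logic block computes a result from the landed value.
        result = self.logic(self.landed)
        # Multiplexer: first input is the bus input, second is the result output.
        return result if self.select_result else bus_in


def run_bus(objects: List[SiliconObject], value: int) -> List[int]:
    """Return the value present on each bus segment, in order along the bus."""
    segments = [value]
    for obj in objects:
        segments.append(obj.step(segments[-1]))
    return segments


chain = [
    SiliconObject(lambda x: x + 1),                      # passes the value through
    SiliconObject(lambda x: x * 2, select_result=True),  # replaces it with its result
    SiliconObject(lambda x: 0),                          # passes the value through
]
segments = run_bus(chain, 5)
```

- Because the bus is unidirectional, each object only ever reads the segment behind it and drives the segment ahead of it, which is why a simple left-to-right loop is enough for this sketch.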
- Another embodiment of the present invention is directed to a logic array, which includes first and second unidirectional segmented buses. Each bus includes a string of unidirectional bus segments.
- First and second sets of silicon objects, including at least one common silicon object, are coupled between segments in the first and second buses, respectively.
- the common silicon object includes first and second bus inputs coupled to respective bus segments in the first and second buses, respectively, and first and second bus outputs coupled to subsequent bus segments in the first and second buses, respectively.
- a logic circuit is coupled to receive a first digital value from the first bus input and generates a new digital value.
- a launch circuit selectively passes the first digital value from the first bus input to the first bus output, replaces the first digital value with the new digital value on the first bus output, or passes the first digital value to the second bus output.
- Yet another embodiment of the present invention is directed to a method of communicating digital values between silicon objects on an integrated circuit.
- The method includes: coupling a first set of silicon objects between respective unidirectional bus segments in a first unidirectional segmented bus; coupling a second set of silicon objects between respective bus segments in a second unidirectional segmented bus, wherein at least one of the silicon objects is common to the first and second sets; receiving a first digital value within the common silicon object from one of the bus segments in the first bus; generating a new digital value within the common silicon object; selectively passing the first digital value from the common silicon object to another of the bus segments in the first bus; selectively replacing the first digital value with the new digital value within the common silicon object and passing the new digital value from the common silicon object to the other bus segment in the first bus; and selectively passing the first digital value from the common silicon object to one of the bus segments in the second bus.
- FIG. 1 is a diagram illustrating a reconfigurable logic array of silicon objects having ten instances of unidirectional segmented buses, called “party lines”, according to one embodiment of the present invention.
- FIG. 2 is a block diagram illustrating in greater detail one of the silicon objects shown in FIG. 1, according to one embodiment of the present invention.
- FIG. 3 is a block diagram illustrating a party line landing circuit within the silicon object shown in FIG. 2, according to one embodiment of the present invention.
- FIG. 4 is a block diagram illustrating a party line launch circuit within the silicon object shown in FIG. 2, according to one embodiment of the present invention.
- FIG. 5 is a block diagram illustrating a launch selection control circuit used for configuring the launch circuit shown in FIG. 4, according to one embodiment of the present invention.
- FIG. 1 is a diagram illustrating a reconfigurable logic array 100 of silicon objects 102 connected by “party lines” according to one embodiment of the present invention.
- Silicon objects 102 are arranged to form a two-dimensional, four-by-four element array. Any number of silicon objects can be used in alternative embodiments of the present invention, and the array can have any number of dimensions. Also, the array can be non-orthogonal.
- a “silicon object” is a single processing element. Any type of silicon object can be used, and these silicon objects can communicate with one another to provide a composite, multi-object function.
- silicon objects 102 can include function-specific logic blocks such as an arithmetic logic unit (ALU), a content-addressable memory (CAM), a cyclic redundancy check (CRC) generator, an integer/real/complex multiplier, a Galois Field multiplier, a memory, or any other combinational or sequential logic function.
- each silicon object can be programmable in nature, if desired.
- each silicon object 102 has the same physical structure, which is multiply instantiated to form array 100 . Each instantiation is independently configurable. In alternative embodiments, each silicon object is not required to have the same structure as any of the other silicon objects.
- Signals are named by the coordinates of the silicon object that drives them. Signals that are driven from outside array 100 are suffixed with “_*” in FIG. 1. For example, silicon object x0y0 receives a signal, p1_e1_*, from outside array 100. Silicon object x0y0 drives a signal, p1_e1_x0y0, to silicon object x1y0 and receives a signal, p1_w1_x1y0, from silicon object x1y0. In subsequent figures, each interface pin of a silicon object 102 is owned by that object and therefore uses a relative/directional name, suffixed with “_in” or “_out” to designate an input or an output, respectively.
- silicon objects 102 are connected together by a plurality of “party lines” running in orthogonal north, south, east and west directions, as indicated by arrows 104 .
- Party lines are unidirectional segmented buses that communicate in vertical and horizontal (Manhattan) directions.
- a bus is “segmented” in that the bus passes through at least some combinational logic and/or a register from one bus segment to the next.
- three party lines go northward (and contain the notations N1, N2 and N3); and three go southward (S1, S2 and S3).
- two party lines go eastward (E1 and E2); and two go westward (W1 and W2).
- Each bus segment is not required to connect proximal silicon objects.
- a bus segment might connect to only every other silicon object through which it passes.
- the party lines can extend through logic array 100 in any parallel (potentially opposite) or orthogonal direction relative to one another.
- one or more party lines extend through logic array 100 such that the silicon objects pack as octagons (i.e., party line intersection angles are multiples of 45 degrees) or as hexagons (i.e., party line intersection angles are multiples of 60 degrees) within logic array 100 .
- each party line includes four control bits (C[3:0]), a valid bit (V) and a sixteen-bit data word (R[15:0]), for a total of 21 bits.
- Each party line is formed by a string of bus segments. Each bus segment extends from one silicon object 102 to the next silicon object 102 along that same party line (segmented bus). For example, one party line is formed by eastwardly extending bus segments p1_e1_*, p1_e1_x0y0, p1_e1_x1y0, p1_e1_x2y0 and p1_e1_x3y0. The “e1” designation in these bus segments indicates a first eastward-extending party line. A second eastward party line is indicated by the notation “e2”. The same notation is used for the remaining party lines.
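- The naming convention for the bus segments of one party line can be reproduced with a short helper. This is a sketch only: the function name `party_line_segments` and its parameters are hypothetical, it covers only horizontal lines in one row, and the handling of westward lines (external segment first, then columns in descending order) is an assumption based on the rule that a signal is named after the object that drives it.

```python
from typing import List


def party_line_segments(line: str, row: int, width: int,
                        westward: bool = False) -> List[str]:
    """List the segment names of one horizontal party line, in travel order."""
    # The first segment is driven from outside the array and carries the "_*" suffix.
    names = [f"p1_{line}_*"]
    # Each later segment is named after the silicon object that drives it.
    columns = range(width - 1, -1, -1) if westward else range(width)
    names += [f"p1_{line}_x{x}y{row}" for x in columns]
    return names


east = party_line_segments("e1", row=0, width=4)
west = party_line_segments("w1", row=0, width=4, westward=True)
```

- For the four-by-four array of FIG. 1, `east` holds the five segment names listed above for the first eastward party line.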
- Each silicon object 102 is connected to ten party lines formed by ten different unidirectional segmented buses.
- control and data often follow the same paths and are therefore labeled together as a 21-bit bus.
- Each signal type of a bus is labeled with the indicator “c”, “v”, or “r”, depending on whether the bus contains control, valid, or data bits.
- When a bus carries more than one signal type, the type indicators are concatenated (e.g., “cvr”).
- FIG. 2 is a block diagram illustrating in greater detail one of the silicon objects 102 .
- Silicon object 102 has ten party line inputs 110, ten party line outputs 112, a party line landing circuit 114, a party line launch circuit 116, and a function-specific logic block (“core”) 118.
- Party line inputs 110 and outputs 112 are each 21 bits wide and include control bits C[3:0], data bits R[15:0] and valid bit V. Therefore, each of the inputs 110 and outputs 112 has the “cvr” notation mentioned above.
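- The 21-bit width follows from 4 control bits, 1 valid bit and 16 data bits. The sketch below packs and unpacks such a word. The bit ordering (C in bits 20:17, V in bit 16, R in bits 15:0) simply mirrors the “cvr” label and is an assumption, since the text does not specify a physical ordering; the function names are hypothetical.

```python
from typing import Tuple

C_BITS, V_BITS, R_BITS = 4, 1, 16
WORD_BITS = C_BITS + V_BITS + R_BITS  # 21 bits in total


def pack_cvr(c: int, v: int, r: int) -> int:
    """Pack control, valid and data fields into one 21-bit word (assumed order c|v|r)."""
    if not (0 <= c < 1 << C_BITS and v in (0, 1) and 0 <= r < 1 << R_BITS):
        raise ValueError("field out of range")
    return (c << (V_BITS + R_BITS)) | (v << R_BITS) | r


def unpack_cvr(word: int) -> Tuple[int, int, int]:
    """Split a 21-bit word back into its (c, v, r) fields."""
    c = (word >> (V_BITS + R_BITS)) & ((1 << C_BITS) - 1)
    v = (word >> R_BITS) & 1
    r = word & ((1 << R_BITS) - 1)
    return c, v, r


word = pack_cvr(0b1010, 1, 0x1234)
```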
- Each of the party line inputs 110 is coupled to party line landing circuit 114 and to party line launch circuit 116 .
- party line landing circuit 114 includes one or more registers for receiving digital values (data, valid and/or control) from one or more of the party line inputs 110 .
- party line landing circuit 114 has five landing registers. Each landing register can be selectively used to store values from one of two corresponding party line inputs 110 . These landing registers have outputs 120 , which are coupled to logic block 118 and to inputs 122 of launch circuit 116 .
- Values on party line inputs 110 can therefore be captured by a landing register for use by logic block 118 or for synchronizing the value with a local clock signal and transmitting the value back onto the same or a different party line through launch circuit 116 .
- logic block 118 can have any logic function and configuration that is suitable for the particular application in which the array is used.
- Logic block 118 receives landing register outputs 120 on inputs 124 , processes the information according to its function, and generates one or more results on result outputs 126 and 128 .
- Result outputs 126 are coupled to launch circuit 116 .
- Result outputs 126 include one or more sets of new data and valid bits (new R[15:0] and V) and one or more sets of new control bits (new C[3:0]). Any number of result outputs 126 can be generated. In one embodiment, there are four new data values (R[15:0] and V) of 17 bits each and four new control values (C[3:0]) of one bit each.
- logic block 118 provides a party line select signal 132 , called PLS, which assists in selecting the configuration in which the inputs to launch circuit 116 are routed to party line outputs 112 .
- launch circuit 116 can be configured to selectively “pass” a value received from the previous silicon object on one party line to the next segment of the party line on output 112 , “turn” the value from the previous silicon object to a different party line on output 112 , or replace the value with a new value from logic block 118 or landing circuit 114 , which can then be transmitted to one of the party line outputs 112 .
- the reconfigurability of launch circuit 116 is described in more detail with reference to FIGS. 4 and 5.
- Result outputs 128 are fed back to inputs 130 of landing circuit 114 so that the landing registers can optionally be configured for use as working registers for logic block 118.
- result outputs 128 include five sets of 16-bit data (R[15:0]) and five sets of valid bits (V), each corresponding to a party line landing register.
- Each instance of each type of signal, control (C), valid (V), and data (R) has its own write enable (core_ . . . _we in FIG. 3). If the corresponding write enable signal is inactive, the corresponding landing register retains its value from the previous clock cycle.
- FIG. 3 is a block diagram illustrating party line landing circuit 114 in greater detail. Sampling of the data bits (R[15:0]), the valid bits (V), and the control bits (C[3:0]) is independently controlled through similar logic. For simplicity, FIG. 3 shows the landing registers and control circuitry for capturing one of the independently configurable fields. The block diagram shown in FIG. 3 is therefore instantiated once for each of these three fields. For each instantiation, the asterisk (*) in each signal or component name is replaced with “r” for the data (R[15:0]), “v” for the valid bits (V), or “c” for the control bits (C[3:0]).
- the party line signal “p1_n1_*_in” corresponds to a 16-bit signal “p1_n1_r_in” for the data bits (R[15:0]), a 1-bit signal “p1_n1_v_in” for the valid bit (V), and a 4-bit signal “p1_n1_c_in” for the control field (C bits).
- Landing circuit 114 receives the ten party line inputs 110 and the five data field (R+V) result outputs 128 (labeled “core_ . . . _*”) together with corresponding write enables (labeled “core_ . . . _*_we”). In one embodiment, no control bit results are passed, so the corresponding write enable control (“core_ . . . _c_we”) is always inactive.
- There are five landing registers 300 for capturing values from the ten party lines 110 and the five data result outputs 128.
- Each register 300 can capture values from one of two party lines 110 or one of the result outputs 128 .
- landing register p1_ns1_* can capture values from party line inputs p1_n1_*_in and p1_s1_*_in, or from data result output core_to_ns1_*_in.
- the choice between the two party lines is directed by configuration bit 304 and effected by 2-to-1 multiplexer 306 .
- When configuration bit 304 is high, multiplexer 306 selects p1_s1_*_in.
- When configuration bit 304 is low, multiplexer 306 selects p1_n1_*_in.
- the output of multiplexer 306 is coupled to landing register 300 through multiplexer 310 .
- Table 1 shows which party lines 110 can land in which landing register 300 .
- TABLE 1
  LANDING REGISTER | CONFIGURATION BITS     | PARTY LINE (CONFIG BIT LOW) | PARTY LINE (CONFIG BIT HIGH)
  pl_ns1_vr        | pl_ns1_vr_in_sel*      | N1                          | S1
  pl_ns1_c[3:0]    | pl_ns1_c_in_sel*[3:0]  |                             |
  pl_ew1_vr        | pl_ew1_vr_in_sel*      | E1                          | W1
  pl_ew1_c[3:0]    | pl_ew1_c_in_sel*[3:0]  |                             |
  pl_ns2_vr        | pl_ns2_vr_in_sel*      | N2                          | S2
  pl_ns2_c[3:0]    | pl_ns2_c_in_sel*[3:0]  |                             |
  pl_ew2_vr        | pl_ew2_vr_in_
- Configuration bit 308 is coupled to the select input of multiplexer 310 .
- One input of multiplexer 310 is coupled to the output of multiplexer 306 , and the other input is coupled to the output of multiplexer 312 .
- When configuration bit 308 is high, multiplexer 310 applies the selected party line input 110 to register 300.
- When configuration bit 308 is low, multiplexer 310 applies the output of multiplexer 312 to register 300.
- Multiplexer 312 allows the previous value of result output 128 to be held within register 300 during the present clock cycle when configuration bits 308 are configured to store the results from logic block 118 .
- Multiplexer 312 has a first input coupled to the output of register 300 and a second input coupled to the corresponding result output 128 .
- Write enables (core_to_ns1_*_we, core_to_ew1_*_we, core_to_ns2_*_we, core_to_ew2_*_we, and core_to_ns3_*_we) select whether the previous values are fed back to registers 300 through multiplexers 312 or new values are captured from result outputs 128.
- the write enables correspond to the write enable signals “WE” shown on result outputs 128 in FIG. 2. There is one write enable bit for each of the five data result outputs 128 , and there are separate sets of write enable bits for the data bits (R[15:0]) and valid bits (V). Each write enable bit is coupled to the select input of a corresponding one of the multiplexers 312 .
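- The three-multiplexer path into one landing register can be summarized as a next-state function. The sketch below models landing register p1_ns1 for a single field. The function and parameter names are hypothetical, and a write enable of 1 is assumed to mean "active"; the polarities of configuration bits 304 and 308 follow the description above.

```python
def landing_next(reg: int, north_in: int, south_in: int, core_result: int,
                 cfg_304: int, cfg_308: int, write_enable: int) -> int:
    """Value that landing register 300 holds after the next clock edge (sketch)."""
    # Multiplexer 306: configuration bit 304 high selects the S1 input, low selects N1.
    party_line = south_in if cfg_304 else north_in
    # Multiplexer 312: an active write enable captures the new core result,
    # otherwise the register output is fed back and the old value is retained.
    held_or_new = core_result if write_enable else reg
    # Multiplexer 310: configuration bit 308 high selects the party line input,
    # low selects the output of multiplexer 312.
    return party_line if cfg_308 else held_or_new


from_north = landing_next(7, 1, 2, 9, cfg_304=0, cfg_308=1, write_enable=0)
from_south = landing_next(7, 1, 2, 9, cfg_304=1, cfg_308=1, write_enable=0)
retained = landing_next(7, 1, 2, 9, cfg_304=0, cfg_308=0, write_enable=0)
from_core = landing_next(7, 1, 2, 9, cfg_304=0, cfg_308=0, write_enable=1)
```

- With configuration bit 308 low and the write enable inactive, the register behaves as a working register that simply keeps its value from the previous clock cycle.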
- In one embodiment, logic block 118 does not provide C bit result outputs 128 for landing circuit 114. In that case, configuration bit 308 for these C bits is understood to be high (to select the party line input), multiplexer 310 is understood to always select the value from multiplexer 306, and the output of multiplexer 312 is therefore irrelevant.
- Implementation resources are conserved by connecting multiplexer 306 directly to register 300 (optimizing-away elements 308 , 310 , and 312 ) since the behavior of the intermediate logic is known prior to fabrication.
- FIG. 4 is a schematic diagram illustrating launch circuit 116 (shown in FIG. 2) in greater detail. Again, only a portion of the data path through launch circuit 116 is shown for simplicity. The portion shown in FIG. 4 is instantiated five times, once for the data field (R[15:0] and V) and once for each of the four control bits (C[3:0]). In each instantiation, the asterisk (*) in each signal name is replaced with “vr” for the data field or “c” for each bit of the control field. Control bits are further specified by identifying the bit position, such as “c[0]”.
- Each of the ten party line outputs 112 (labeled N1, N2, N3, S1, S2, S3, E1, E2, W1, and W2) is driven by a respective multiplexer 400 .
- Multiplexers 400 allow the data paths through the silicon object to be configured for launching values from a variety of sources.
- These values include the result outputs 126 (labeled “core”) from logic block 118, the party line landing register outputs 120 (labeled p1_ns1_*, p1_ns2_*, p1_ns3_*, p1_ew1_*, and p1_ew2_*), the value received on the corresponding party line input 110 (passing straight through the silicon object to the subsequent segment of the corresponding unidirectional segmented bus, labeled as party line output 112), the value received on a party line extending in a direction -90 degrees relative to the corresponding input 110 (turning left from north-to-west, west-to-south, south-to-east, and east-to-north), and the value received on a party line extending in a direction +90 degrees relative to the corresponding input 110 (turning right from north-to-east, east-to-south, south-to-west, and west-to-north).
- party lines N3 and S3 (coupled via party line inputs 110 p1_n3_*_in and p1_s3_*_in, and via party line outputs 112 p1_n3_*_out and p1_s3_*_out) do not have any turning capability.
- the multiplexer 400 that is coupled to party line output 112 for party line N1 has a first input coupled to output 120 of party line landing register p1_ns1_*, a second input coupled to party line input W1 (turning right), a third input coupled to party line input N1 (passing straight through), a fourth input coupled to party line input E1 (turning left), and a fifth input coupled to result output 126 (labeled “core”) from logic block 118 (shown in FIG. 2).
- the result outputs 126 can be specific to a particular party line or common to one or more other party lines.
- Each multiplexer input includes the corresponding fields (R[15:0]+V, C[3], C[2], C[1], or C[0]) of the instance of the launch structure.
- the remaining launch multiplexers 400 are coupled in a similar fashion to provide similar routing selections.
- the launch multiplexers 400 for party line outputs N3 and S3 are 3-to-1 multiplexers instead of 5-to-1 multiplexers since these party lines do not have corresponding lines in the eastward and westward directions, that is, they cannot turn.
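- The routing options of the launch multiplexers can be enumerated programmatically. The sketch below lists the inputs of the multiplexer driving a given party line output. Only the N1 input order is taken from the description; applying the same order to the other outputs, pairing lines by index (for example, N2 with E2 and W2), and giving the 3-to-1 multiplexers the landing register, straight-through and core inputs are assumptions. All names in the code are hypothetical.

```python
from typing import List

# Direction of travel after a right or left turn, as listed in the description.
RIGHT_TURN = {"n": "e", "e": "s", "s": "w", "w": "n"}
LEFT_TURN = {"n": "w", "w": "s", "s": "e", "e": "n"}
LINES_PER_DIRECTION = {"n": 3, "s": 3, "e": 2, "w": 2}  # ten party lines in total


def launch_sources(direction: str, index: int) -> List[str]:
    """Inputs of the launch multiplexer 400 driving party line <direction><index>."""
    if direction not in LINES_PER_DIRECTION or not 1 <= index <= LINES_PER_DIRECTION[direction]:
        raise ValueError(f"no party line {direction}{index}")
    axis = "ns" if direction in ("n", "s") else "ew"
    landing = f"p1_{axis}{index}_*"            # landing register output 120
    straight = f"p1_{direction}{index}_*_in"   # same party line, passing straight through
    if index == 3:
        # N3 and S3 have no east/west counterparts and cannot turn (3-to-1 multiplexer).
        return [landing, straight, "core"]
    # Incoming line whose right (or left) turn ends up travelling in `direction`.
    from_right = next(d for d, out in RIGHT_TURN.items() if out == direction)
    from_left = next(d for d, out in LEFT_TURN.items() if out == direction)
    return [landing, f"p1_{from_right}{index}_*_in", straight,
            f"p1_{from_left}{index}_*_in", "core"]


n1_inputs = launch_sources("n", 1)
total_inputs = sum(len(launch_sources(d, i))
                   for d, count in LINES_PER_DIRECTION.items()
                   for i in range(1, count + 1))
```

- Under these assumptions, `n1_inputs` matches the five inputs described for the N1 multiplexer: the landing register, W1 turning right, N1 passing straight through, E1 turning left, and the core result.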
- Each launch multiplexer 400 has a select input coupled to a corresponding select signal, p1_n1_*_out_sel, p1_n2_*_out_sel, p1_n3_*_out_sel, p1_s1_*_out_sel, p1_s2_*_out_sel, p1_s3_*_out_sel, p1_e1_*_out_sel, p1_e2_*_out_sel, p1_w1_*_out_sel, or p1_w2_*_out_sel.
- There is one launch select signal for each launch multiplexer 400 in each instance of the launch structure. With five instances of ten multiplexers, there are a total of 50 select signals. These select signals are generated by a launch selection control circuit based on the data stored in configuration registers maintained within each silicon object 102 and by control bits received into the party line landing registers 300.
- routing configurations and options shown in FIG. 4 are provided as examples only. Various routing configurations and options can be added or removed in alternative embodiments of the present invention.
- FIG. 5 is a block diagram illustrating a launch selection circuit 500 for generating the launch selection signals according to one embodiment of the present invention.
- each launch multiplexer 400 is dynamically operated in one of two static configurations. That is, one of these two configurations is chosen for all party lines on a per-clock basis. Any number of selectable configurations can be used in alternative embodiments, and these configurations can be static or dynamic.
- Launch selection circuit 500 includes configuration control circuit 502 , data output select circuit 504 , and control output select circuit 506 .
- Configuration control circuit 502 generates a party line select control signal PL_SEL_SEL on output 508 , which selects one of the two configuration options for all fields of all party lines.
- The V and R[15:0] bits of each party line are routed as a unit by the select signals generated by data output select circuit 504.
- the four control bits C[3:0] of each party line are routed individually for each party line by the select signals generated by control output select circuit 506 .
- five select signals are generated for each party line: one for each of the four C[3:0] bits (e.g., p1_n1_c_out_sel[0]), and one for the V and R[15:0] bits (e.g., p1_n1_vr_out_sel).
- Each select signal can have one of two selectable patterns, based on the logic state of PL_SEL_SEL on output 508 .
- In data output select circuit 504, the select signal for the V and R[15:0] bits of each party line is generated by data launch configuration registers 510 and 512, multiplexer 514 and register 516.
- Configuration registers 510 and 512 store the binary patterns for the two selectable configurations for that party line.
- Multiplexer 514 selects which pattern is used for driving the select inputs of multiplexers 400 shown in FIG. 4 for the data fields. This selection is made as a function of configuration control output 508 . The selected pattern is stored in register 516 for the current clock cycle.
- control output select circuit 506 includes configuration registers 520 and 522 , multiplexer 524 and register 526 for each control bit C[3:0] of each party line.
- configuration registers 520 and 522 store the binary patterns for the two selectable configurations for the corresponding control bit.
- Multiplexer 524 selects which pattern is applied to register 526 for each clock cycle as a function of configuration control output 508 .
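- Both output select circuits follow the same pattern: two stored configurations, a 2-to-1 multiplexer, and a register that holds the chosen pattern for the clock cycle. A minimal sketch, assuming that a low PL_SEL_SEL picks the first configuration register and a high PL_SEL_SEL picks the second (the text does not state the polarity) and that the register starts out holding the first pattern; the class and attribute names are hypothetical.

```python
class OutputSelect:
    """Select-signal generator for one launch multiplexer field (sketch)."""

    def __init__(self, pattern_a: int, pattern_b: int) -> None:
        self.pattern_a = pattern_a  # stands in for configuration register 510 or 520
        self.pattern_b = pattern_b  # stands in for configuration register 512 or 522
        self.current = pattern_a    # register 516 or 526; initial value is an assumption

    def clock(self, pl_sel_sel: int) -> int:
        # Multiplexer 514 or 524 picks one stored pattern based on PL_SEL_SEL,
        # and the register holds it as the select value for this clock cycle.
        self.current = self.pattern_b if pl_sel_sel else self.pattern_a
        return self.current


n1_vr_sel = OutputSelect(pattern_a=2, pattern_b=4)
first = n1_vr_sel.clock(0)
second = n1_vr_sel.clock(1)
```

- Here the patterns 2 and 4 are arbitrary multiplexer select codes; in this sketch they would index the straight-through and core inputs of a 5-to-1 launch multiplexer if its inputs were numbered from zero.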
- Configuration control circuit 502 includes 5-to-1 multiplexer 530 , party line control bit select register 532 , party line select mask register 534 , party line select compare register 536 , logic AND gates 538 (array of five gates), exclusive-NOR (XNOR) gates 540 (array of five gates) and reductive logic AND gate 542 .
- the control fields (C[3:0]) of the landing registers 300 can be used to store patterns that determine which configuration mode will be selected.
- Landing register outputs 120 are coupled to respective inputs of multiplexer 530 .
- the select input of multiplexer 530 is coupled to control register 532 .
- Register 532 is loaded with a value that selects the appropriate landing register output 120 for matching the desired pattern.
- the party line select (PLS) bit 132 is supplied from logic block 118 (shown in FIG. 2).
- PLS bit 132 and the selected landing register output (four control bits) are applied to respective inputs of respective AND gates 538 .
- the other inputs of AND gates 538 are coupled to mask register 534 .
- Mask register 534 is loaded with a pattern that can be used to mask-out certain bits in the pattern formed by PLS bit 132 and the four control bits of the selected landing register.
- The five-bit masked pattern at the output of AND gates 538 is applied to one set of inputs of XNOR gates 540.
- The other set of inputs of XNOR gates 540 is coupled to compare register 536.
- Compare register 536 is loaded with a five-bit pattern for comparing against the masked output of AND gates 538 .
- the XNOR gates 540 perform a bit-wise comparison and generate a five-bit output, which indicates whether each bit location had a match.
- The five-bit output from XNOR gates 540 is applied to the five-bit reductive AND gate 542. If each bit of the masked output from AND gates 538 matches the corresponding bit in compare register 536, all inputs to AND gate 542 will be high, resulting in a high value on configuration control output 508. If there is a mismatch in one or more of the bit locations, output 508 will be low.
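- The mask-and-compare logic of configuration control circuit 502 reduces to a few bitwise operations. In the sketch below, the placement of the PLS bit above the four control bits in the five-bit pattern is an assumption, as are the function and parameter names; multiplexer 530 and register 532 are represented only by the already-selected `control_bits` argument.

```python
def config_control(pls: int, control_bits: int, mask: int, compare: int) -> int:
    """Return the level of configuration control output 508 (PL_SEL_SEL), 0 or 1."""
    # Five-bit pattern formed by the PLS bit and the four selected control bits
    # (assumed order: PLS in bit 4, C[3:0] in bits 3:0).
    pattern = ((pls & 1) << 4) | (control_bits & 0xF)
    # AND gates 538: mask register 534 forces masked-out bits low.
    masked = pattern & mask & 0x1F
    # XNOR gates 540: a result bit is high where masked and compare bits agree.
    match_bits = ~(masked ^ compare) & 0x1F
    # Reductive AND gate 542: high only if all five bit positions match.
    return int(match_bits == 0x1F)


full_match = config_control(1, 0b1010, mask=0b11111, compare=0b11010)
pls_ignored = config_control(1, 0b1010, mask=0b01111, compare=0b01010)
never_match = config_control(0, 0b1010, mask=0b01111, compare=0b11010)
```

- One consequence visible in the sketch: a masked-out bit is forced low by the AND gates, so the corresponding bit of the compare register also has to be low for a match to be possible.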
- logic block 118 can directly control the operating mode by setting or clearing PLS bit 132 .
- Configuration registers 510 , 512 , 520 and 522 and registers 532 , 534 and 536 can be hard-wired, programmed through scan logic on power-up, or written through party line inputs 110 and landing register circuit 114 , for example.
- Configuration control circuit 502 therefore provides a high level of programmability to the routing options through launch circuit 116 , and these options can be reconfigured on each clock cycle, if desired.
- Each party line can be configured independently of the other party lines, and the data can be routed independently of the control bits.
- Configuration control circuit 502 is one example of a control circuit that can be used for selecting different routing options through launch circuit 116 . Numerous other routing options and control circuits can be used in alternative embodiments.
- A reconfigurable logic array, such as that shown in FIGS. 1-5, can provide the time-to-market advantages of Field Programmable Gate Arrays (FPGAs) with the cost and performance advantages of custom Application Specific Integrated Circuits (ASICs).
- a reconfigurable logic array can also allow changes to be made to the logical function and data paths through software upgrades, which allows vendors to begin designing an integrated circuit before the specifications of the circuit are finalized.
- the data paths and control paths are loosely coupled, yet independently configurable.
- The data path is sixteen bits wide, while the control path is granular to the single bit.
- Each silicon object can have its own local structure, program, and/or memory. Further, each processing element can operate on its own without requiring global control.
- Communication between silicon objects can be performed through traditional nearest-neighbor connections and through party lines that provide longer distance communication. Silicon objects are allowed to change communication patterns on a per-clock basis, for example.
- the function-specific logic block of each silicon object has a program memory that includes both operation and communication directions.
- the instructions can be loaded during initial configuration or dynamically during operation. Intelligent compilation (scheduling and routing) tools can be used to deterministically allocate instructions to each object before run time.
- control paths can guide program execution while data is moved and operated upon through the data paths. From this view, instructions are the mechanisms that tie the independent control and data paths together within an array.
- Reconfigurable logic arrays can be used in a wide variety of applications.
- the arrays can be used to provide high-throughput data processing in applications that exhibit high levels of data flow determinism (i.e., regular dependencies) at a localized level. Irregular dependencies (e.g., interrupts and context switches) can be handled as ordinary signals.
- Reconfigurable arrays can be used for multi-gigabit communications processing, such as data link layer processing, TCP/IP processing, and security processing.
- these applications can require frame/packet parsing and generation, finite state machines, CRC generation and detection, comma detection, statistics counters, hashing and memory controllers, for example.
- These arrays can also be used for signal processing applications such as image and video compression, wireless local area networks, and Forward Error Correction. Numerous other applications also exist.
Landscapes
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Logic Circuits (AREA)
Abstract
Description
- The present invention relates to semiconductor integrated circuits. More particularly, the present invention relates to an architecture for communicating between a plurality of processing elements, called silicon objects, within an integrated circuit.
- As transistor density in integrated circuits continues to increase, the resulting increase in processing potential is often limited due to prohibitively high development complexity, time, and cost. While traditional microprocessors and Field Programmable Gate Array (FPGA) based designs avoid high non-recurring engineering expenses, an overall lack of performance and efficiency leads to a large area and hence high per-chip cost. In certain applications, such as multi-gigabit line rate communication processing, performance constraints render these solutions unworkable.
- In these types of applications, it is convenient for integrated circuit designs to have a high degree of configurability and programmability to allow the same integrated circuit design to perform a variety of different logical functions. For example, integrated circuits have been designed with a plurality of individual, programmable processing elements, which are arranged to form an array. Each processing element can be implemented to use dedicated, “nearest-neighbor” connections to allow that processing element to communicate with the eight nearest neighbors in the array. The eight nearest neighbors are located to the north, south, east, west, northwest, northeast, southwest, and southeast of the processing element.
- Although this arrangement provides a basic level of configurability, the programmability of each processing element has been limited and it is difficult for one processing element to communicate with other processing elements that are not nearest neighbors.
- Improved configurable architectures are therefore desired that provide increased flexibility in communicating from one processing element to other elements in the array and increased programmability of the logic function performed by each processing element.
- One embodiment of the present invention is directed to a logic array, which includes a unidirectional segmented bus and a plurality of silicon objects. The bus includes a string of unidirectional bus segments. Each silicon object includes a bus input coupled to one of the bus segments in the first bus, and a bus output coupled to a next subsequent one of the bus segments in the first bus. A landing circuit is coupled to the bus input for receiving digital information from the bus input. A function-specific logic block is coupled to an output of the landing circuit and has a result output. Each silicon object further includes a multiplexer having first and second inputs coupled to the bus input and the result output, respectively, and having an output coupled to the bus output.
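- The single-bus embodiment above can be sketched behaviorally. The following is a minimal Python model, not the patented circuit: the class name `SiliconObject`, the `select_result` flag, and the `propagate` helper are hypothetical names introduced for illustration, and registers and clock timing are ignored.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SiliconObject:
    """One processing element between two segments of a unidirectional bus."""
    logic: Callable[[int], int]  # stand-in for the function-specific logic block
    select_result: bool          # multiplexer select: True drives the result, False passes the bus input

    def step(self, bus_in: int) -> int:
        landed = bus_in              # landing circuit receives the value from the bus input
        result = self.logic(landed)  # logic block produces its result output
        # Multiplexer: first input is the bus input, second input is the result output.
        return result if self.select_result else bus_in


def propagate(objects: List[SiliconObject], value: int) -> List[int]:
    """Return the value seen on each successive bus segment, starting with the first."""
    segments = [value]
    for obj in objects:
        segments.append(obj.step(segments[-1]))
    return segments


# Three objects on one bus: the first and third pass, the second replaces.
chain = [
    SiliconObject(lambda x: x + 1, select_result=False),
    SiliconObject(lambda x: x * 2, select_result=True),
    SiliconObject(lambda x: 0, select_result=False),
]
segment_values = propagate(chain, 5)  # [5, 5, 10, 10]
```

Because each segment is driven only by the object upstream of it, a value either travels onward unchanged or is replaced at exactly one well-defined point.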
- Another embodiment of the present invention is directed to a logic array, which includes first and second unidirectional segmented buses. Each bus includes a string of unidirectional bus segments. First and second sets of silicon objects, including at least one common silicon object, are coupled between segments in the first and second buses, respectively. The common silicon object includes first and second bus inputs coupled to respective bus segments in the first and second buses, respectively, and first and second bus outputs coupled to subsequent bus segments in the first and second buses, respectively. A logic circuit is coupled to receive a first digital value from the first bus input and generates a new digital value. A launch circuit selectively passes the first digital value from the first bus input to the first bus output, replaces the first digital value with the new digital value on the first bus output, or passes the first digital value to the second bus output.
- Yet another embodiment of the present invention is directed to a method of communicating digital values between silicon objects on an integrated circuit. The method includes: coupling a first set of silicon objects between respective unidirectional bus segments in a first unidirectional segmented bus; coupling a second set of silicon objects between respective bus segments in a second unidirectional segmented bus, wherein at least one of the silicon objects is common to the first and second sets; receiving a first digital value within the common silicon object from one of the bus segments in the first bus; generating a new digital value within the common silicon object; selectively passing the first digital value from the common silicon object to another of the bus segments in the first bus; selectively replacing the first digital value with the new digital value within the common silicon object and passing the new digital value from the common silicon object to the other bus segment in the first bus; and selectively passing the first digital value from the common silicon object to one of the bus segments in the second bus.
- FIG. 1 is a diagram illustrating a reconfigurable logic array of silicon objects having ten instances of unidirectional segmented buses, called “party lines”, according to one embodiment of the present invention.
- FIG. 2 is a block diagram illustrating in greater detail one of the silicon objects shown in FIG. 1, according to one embodiment of the present invention.
- FIG. 3 is a block diagram illustrating a party line landing circuit within the silicon object shown in FIG. 2, according to one embodiment of the present invention.
- FIG. 4 is a block diagram illustrating a party line launch circuit within the silicon object shown in FIG. 2, according to one embodiment of the present invention.
- FIG. 5 is a block diagram illustrating a launch selection control circuit used for configuring the launch circuit shown in FIG. 4, according to one embodiment of the present invention.
- FIG. 1 is a diagram illustrating a reconfigurable logic array 100 of silicon objects 102 connected by “party lines” according to one embodiment of the present invention. Silicon objects 102 are arranged to form a two-dimensional, four-by-four element array. Any number of silicon objects can be used in alternative embodiments of the present invention, and the array can have any number of dimensions. Also, the array can be non-orthogonal.
- A “silicon object” is a single processing element. Any type of silicon object can be used, and these silicon objects can communicate with one another to provide a composite, multi-object function. For example, silicon objects 102 can include function-specific logic blocks such as an arithmetic logic unit (ALU), a content-addressable memory (CAM), a cyclic redundancy check (CRC) generator, an integer/real/complex multiplier, a Galois Field multiplier, a memory, or any other combinational or sequential logic function. Also, each silicon object can be programmable in nature, if desired.
- In one embodiment, each silicon object 102 has the same physical structure, which is multiply instantiated to form array 100. Each instantiation is independently configurable. In alternative embodiments, each silicon object is not required to have the same structure as any of the other silicon objects.
- The following naming conventions are used throughout the various figures. Referring to FIG. 1, coordinates are expressed as “x#y#” using a minimum number of digits, where x represents location in the east and west directions, y represents location in the north and south directions and # represents a particular value. The north, south, east, and west directions are indicated by arrows 104. For example, the silicon object 102 that is located in the lower left corner of array 100 is labeled “x0y0”, and the silicon object that is located in the upper right corner is labeled “x3y3”. The remaining silicon objects 102 are similarly numbered to indicate their relative positions within array 100.
- Signals are named by the coordinates of the silicon object that drives them. Signals that are driven from outside array 100 are suffixed with “_*” in FIG. 1. For example, silicon object x0y0 receives a signal, p1_e1_*, from outside array 100. Silicon object x0y0 drives a signal, p1_e1_x0y0, to silicon object x1y0 and receives a signal, p1_w1_x1y0, from silicon object x1y0. In subsequent figures, the interface pins of each silicon object 102 are owned by that object and therefore use relative/directional names and are suffixed with “_in” or “_out” to designate an input or an output, respectively.
- Referring to the data flow paths shown in FIG. 1, silicon objects 102 are connected together by a plurality of “party lines” running in orthogonal north, south, east and west directions, as indicated by arrows 104. Party lines are unidirectional segmented buses that communicate in vertical and horizontal (Manhattan) directions. A bus is “segmented” in that the bus passes through at least some combinational logic and/or a register from one bus segment to the next. In the vertical direction, three party lines go northward (and contain the notations N1, N2 and N3); and three go southward (S1, S2 and S3). Horizontally, two party lines go eastward (E1 and E2); and two go westward (W1 and W2). Each bus segment is not required to connect proximal silicon objects. For example, in one alternative embodiment, a bus segment might connect to only every other silicon object through which it passes. Also, the party lines can extend through logic array 100 in any parallel (potentially opposite) or orthogonal direction relative to one another. In another embodiment, one or more party lines extend through logic array 100 such that the silicon objects pack as octagons (i.e., party line intersection angles are multiples of 45 degrees) or as hexagons (i.e., party line intersection angles are multiples of 60 degrees) within logic array 100.
- In one embodiment, each party line includes four control bits (C[3:0]), a valid bit (V) and a sixteen-bit data word (R[15:0]), for a total of 21 bits. Each party line is formed by a string of bus segments. Each bus segment extends from one silicon object 102 to the next silicon object 102 along that same party line (segmented bus). For example, one party line is formed by eastwardly extending bus segments p1_e1_*, p1_e1_x0y0, p1_e1_x1y0, p1_e1_x2y0 and p1_e1_x3y0. The “e1” designation in these bus segments indicates a first eastward-extending party line. A second eastward party line is indicated by the notation “e2”. The same notation is used for the remaining party lines. Each silicon object 102 is connected to ten party lines formed by ten different unidirectional segmented buses.
- In the following figures, control and data often follow the same paths and are therefore labeled together as a 21-bit bus. Each signal type of a bus is labeled with the indicator “c”, “v”, or “r”, depending on whether the bus contains control, valid, or data bits. In the case of multiple types of content in a bus, the type indicators are concatenated (e.g., “cvr”).
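- As an illustration of the 21-bit “cvr” format, the following Python sketch packs and unpacks one party line word. The text gives only the field widths, so the bit ordering used here (control bits in the most significant positions, then the valid bit, then the data word) is an assumption, and the helper names `pack_cvr` and `unpack_cvr` are hypothetical.

```python
from typing import Tuple

CONTROL_BITS = 4   # C[3:0]
VALID_BITS = 1     # V
DATA_BITS = 16     # R[15:0]
WORD_BITS = CONTROL_BITS + VALID_BITS + DATA_BITS  # 21 bits per party line


def pack_cvr(c: int, v: int, r: int) -> int:
    """Pack control, valid and data fields into one word (assumed order: C | V | R)."""
    if not 0 <= c < (1 << CONTROL_BITS):
        raise ValueError("control field must fit in 4 bits")
    if v not in (0, 1):
        raise ValueError("valid field must be 0 or 1")
    if not 0 <= r < (1 << DATA_BITS):
        raise ValueError("data field must fit in 16 bits")
    return (c << (VALID_BITS + DATA_BITS)) | (v << DATA_BITS) | r


def unpack_cvr(word: int) -> Tuple[int, int, int]:
    """Split a 21-bit word back into (control, valid, data)."""
    r = word & ((1 << DATA_BITS) - 1)
    v = (word >> DATA_BITS) & 1
    c = (word >> (VALID_BITS + DATA_BITS)) & ((1 << CONTROL_BITS) - 1)
    return c, v, r


word = pack_cvr(0b1010, 1, 0xBEEF)  # 0x15BEEF under the assumed ordering
```

Keeping the three fields separable matters because later stages route the control bits independently of the valid and data bits.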
- FIG. 2 is a block diagram illustrating in greater detail one of the silicon objects 102. Silicon object 102 has ten party line inputs 110, ten party line outputs 112, a party line landing circuit 114, a party line launch circuit 116, and a function-specific logic block (“core”) 118. Party line inputs 110 and outputs 112 are each 21 bits wide and include control bits C[3:0], data bits R[15:0] and valid bit V. Therefore, each of the inputs 110 and outputs 112 has the “cvr” notation mentioned above.
- Each of the party line inputs 110 is coupled to party line landing circuit 114 and to party line launch circuit 116. In one embodiment, party line landing circuit 114 includes one or more registers for receiving digital values (data, valid and/or control) from one or more of the party line inputs 110. In one embodiment, party line landing circuit 114 has five landing registers. Each landing register can be selectively used to store values from one of two corresponding party line inputs 110. These landing registers have outputs 120, which are coupled to logic block 118 and to inputs 122 of launch circuit 116.
- Values on party line inputs 110 can therefore be captured by a landing register for use by logic block 118 or for synchronizing the value with a local clock signal and transmitting the value back onto the same or a different party line through launch circuit 116.
- As mentioned above, logic block 118 can have any logic function and configuration that is suitable for the particular application in which the array is used. Logic block 118 receives landing register outputs 120 on inputs 124, processes the information according to its function, and generates one or more results on result outputs 126 and 128.
- Result outputs 126 are coupled to launch circuit 116. Result outputs 126 include one or more sets of new data and valid bits (new R[15:0] and V) and one or more sets of new control bits (new C[3:0]). Any number of result outputs 126 can be generated. In one embodiment, there are four new data values (R[15:0] and V) of 17 bits each and four new control values (C[3:0]) of one bit each. In addition, logic block 118 provides a party line select signal 132, called PLS, which assists in selecting the configuration in which the inputs to launch circuit 116 are routed to party line outputs 112. For example, launch circuit 116 can be configured to selectively “pass” a value received from the previous silicon object on one party line to the next segment of the party line on output 112, “turn” the value from the previous silicon object to a different party line on output 112, or replace the value with a new value from logic block 118 or landing circuit 114, which can then be transmitted to one of the party line outputs 112. The reconfigurability of launch circuit 116 is described in more detail with reference to FIGS. 4 and 5.
- Result outputs 128 are fed back to inputs 130 of landing circuit 114 so that the landing registers can optionally be configured for use as working registers for logic block 118. In one embodiment, result outputs 128 include five sets of 16-bit data (R[15:0]) and five sets of valid bits (V), each corresponding to a party line landing register. Each instance of each type of signal, control (C), valid (V), and data (R), has its own write enable (core_ . . . _we in FIG. 3). If the corresponding write enable signal is inactive, the corresponding landing register retains its value from the previous clock cycle.
- FIG. 3 is a block diagram illustrating party line landing circuit 114 in greater detail. Sampling of the data bits (R[15:0]), the valid bits (V), and the control bits (C[3:0]) is independently controlled through similar logic. For simplicity, FIG. 3 shows the landing registers and control circuitry for capturing one of the independently configurable fields. The block diagram shown in FIG. 3 is therefore instantiated once for each of these three fields. For each instantiation, the asterisk (*) in each signal or component name is replaced with “r” for the data (R[15:0]), “v” for the valid bits (V), or “c” for the control bits (C[3:0]). For example, the party line signal “p1_n1_*_in” corresponds to a 16-bit signal “p1_n1_r_in” for the data bits (R[15:0]), a 1-bit signal “p1_n1_v_in” for the valid bit (V), and a 4-bit signal “p1_n1_c_in” for the control field (C bits).
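- The per-field naming convention for FIG. 3 can be illustrated with a short Python sketch. The field widths come from the text; the dictionary `FIELD_WIDTHS` and the helper `instantiate` are hypothetical names used only for this example.

```python
# Field suffix -> width in bits, as stated for R[15:0], V and C[3:0].
FIELD_WIDTHS = {"r": 16, "v": 1, "c": 4}


def instantiate(template: str) -> dict:
    """Expand a signal template containing one '*' into its per-field signal names."""
    if template.count("*") != 1:
        raise ValueError("template must contain exactly one '*' placeholder")
    return {template.replace("*", field): width for field, width in FIELD_WIDTHS.items()}


signals = instantiate("p1_n1_*_in")
# {'p1_n1_r_in': 16, 'p1_n1_v_in': 1, 'p1_n1_c_in': 4}
```

The three widths sum to the 21 bits carried by each party line.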
- Landing circuit 114 receives the ten party line inputs 110 and the five data field (R+V) result outputs 128 (labeled “core_ . . . _*”) together with corresponding write enables (labeled “core_ . . . _*_we”). In one embodiment, no control bit results are passed, so the corresponding write enable control (“core_ . . . _c_we”) is always inactive.
- There are five landing registers 300 for capturing values from the ten party lines 110 and the five data result outputs 128. There is one result output 128 for the first north and south party lines (N1 and S1), the first east and west party lines (E1 and W1), the second north and south party lines (N2 and S2), the second east and west party lines (E2 and W2), and the third north and south party lines (N3 and S3).
- Each register 300 can capture values from one of two party lines 110 or one of the result outputs 128. For example, landing register p1_ns1_* can capture values from party line inputs p1_n1_*_in and p1_s1_*_in, or from data result output core_to_ns1_*_in. The choice between the two party lines is directed by configuration bit 304 and effected by 2-to-1 multiplexer 306. When configuration bit 304 is high, multiplexer 306 selects p1_s1_*_in. When configuration bit 304 is low, multiplexer 306 selects p1_n1_*_in. The output of multiplexer 306 is coupled to landing register 300 through multiplexer 310. Table 1 shows which party lines 110 can land in which landing register 300.
TABLE 1
LANDING REGISTER | CONFIGURATION BITS | PARTY LINE (CONFIG BIT LOW) | PARTY LINE (CONFIG BIT HIGH) |
---|---|---|---|
pl_ns1_vr, pl_ns1_c[3:0] | pl_ns1_vr_in_sel*, pl_ns1_c_in_sel*[3:0] | N1 | S1 |
pl_ew1_vr, pl_ew1_c[3:0] | pl_ew1_vr_in_sel*, pl_ew1_c_in_sel*[3:0] | E1 | W1 |
pl_ns2_vr, pl_ns2_c[3:0] | pl_ns2_vr_in_sel*, pl_ns2_c_in_sel*[3:0] | N2 | S2 |
pl_ew2_vr, pl_ew2_c[3:0] | pl_ew2_vr_in_sel*, pl_ew2_c_in_sel*[3:0] | E2 | W2 |
pl_ns3_vr, pl_ns3_c[3:0] | pl_ns3_vr_in_sel*, pl_ns3_c_in_sel*[3:0] | N3 | S3 |
- The selection between the two party lines 110 and the result output 128 is made with configuration bit 308 and multiplexer 310. Configuration bit 308 is coupled to the select input of multiplexer 310. One input of multiplexer 310 is coupled to the output of multiplexer 306, and the other input is coupled to the output of multiplexer 312.
- When configuration bit 308 is high, multiplexer 310 applies the selected party line input 110 to register 300. When configuration bit 308 is low, multiplexer 310 applies the output of multiplexer 312 to register 300.
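- The landing register selection logic for the N1/S1 pair can be summarized as a next-value function. This Python sketch is a behavioral approximation under stated assumptions: the function and parameter names are hypothetical, the write enable is assumed to be active high, and multiplexer 312 is modeled as holding the register value when its write enable is inactive, as the surrounding description of the write enables indicates.

```python
def landing_register_next(current: int, north_in: int, south_in: int, core_result: int,
                          in_sel: int, src_sel: int, write_enable: int) -> int:
    """Value that landing register 300 would capture on the next clock edge.

    in_sel       models configuration bit 304 (high selects the south line, low the north line).
    src_sel      models configuration bit 308 (high selects the party line, low the core path).
    write_enable models the core write enable (assumed active high).
    """
    # Multiplexer 306: choose between the two party line inputs.
    party_line_value = south_in if in_sel else north_in
    # Multiplexer 312: capture a new core result, or feed back the current register value.
    core_or_hold = core_result if write_enable else current
    # Multiplexer 310: choose between the party line path and the core/hold path.
    return party_line_value if src_sel else core_or_hold
```

For example, with `src_sel` low and `write_enable` low the register simply keeps its value, which is what allows it to serve as a working register for the logic block.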
- Multiplexer 312 allows the previous value of result output 128 to be held within register 300 during the present clock cycle when configuration bits 308 are configured to store the results from logic block 118. Multiplexer 312 has a first input coupled to the output of register 300 and a second input coupled to the corresponding result output 128. Write enables (core_to_ns1_*_we, core_to_ew1_*_we, core_to_ns2_*_we, core_to_ew2_*_we, and core_to_ns3_*_we) select whether the previous values are fed back to registers 300 through multiplexers 312 or new values are captured from result outputs 128. The write enables correspond to the write enable signals “WE” shown on result outputs 128 in FIG. 2. There is one write enable bit for each of the five data result outputs 128, and there are separate sets of write enable bits for the data bits (R[15:0]) and valid bits (V). Each write enable bit is coupled to the select input of a corresponding one of the multiplexers 312.
- In one embodiment, logic block 118 does not provide C bit result outputs 128 for the party line registers 114. Thus the configuration of register 308 for these C bits is understood to be high (to select party line input), the behavior of multiplexer 310 is understood to always select the value from multiplexer 306, and the result of multiplexer 312 is thereby irrelevant. Implementation resources are conserved by connecting multiplexer 306 directly to register 300 (optimizing away elements [...]).
- FIG. 4 is a schematic diagram illustrating launch circuit 116 (shown in FIG. 2) in greater detail. Again, only a portion of the data path through launch circuit 116 is shown for simplicity. The portion shown in FIG. 4 is instantiated five times, once for the data field (R[15:0] and V) and once for each of the four control bits (C[3:0]). In each instantiation, the asterisk (*) in each signal name is replaced with “vr” for the data field or “c” for each bit of the control field. Control bits are further specified by identifying the bit position, such as “c[0]”.
- Each of the ten party line outputs 112 (labeled N1, N2, N3, S1, S2, S3, E1, E2, W1, and W2) is driven by a respective multiplexer 400. Multiplexers 400 allow the data paths through the silicon object to be configured for launching values from a variety of sources. These values include the result outputs 126 (labeled “core”) from logic block 118, the party line landing register outputs 120 (labeled p1_ns1_*, p1_ns2_*, p1_ns3_*, p1_ew1_*, and p1_ew2_*), the value received on the corresponding party line input 110 (passing straight through the silicon object to the subsequent segment of the corresponding unidirectional segmented bus, labeled as party line output 112), the value received on a party line extending in a direction −90 degrees relative to the corresponding input 110 (turning left from north-to-west, west-to-south, south-to-east, and east-to-north), and the value received on a party line extending in a direction +90 degrees relative to the corresponding input 110 (turning right from north-to-east, east-to-south, south-to-west, and west-to-north). In this embodiment, party lines N3 and S3 (coupled via party line inputs 110 p1_n3_*_in and p1_s3_*_in, and via party line outputs 112 p1_n3_*_out and p1_s3_*_out) do not have any turning capability.
- For example, the multiplexer 400 that is coupled to party line output 112 for party line N1 has a first input coupled to output 120 of party line landing register p1_ns1_*, a second input coupled to party line input W1 (turning right), a third input coupled to party line input N1 (passing straight through), a fourth input coupled to party line input E1 (turning left), and a fifth input coupled to result output 126 (labeled “core”) from logic block 118 (shown in FIG. 2). The result outputs 126 can be specific to a particular party line or common to one or more other party lines.
- Each multiplexer input includes the corresponding fields (R[15:0]+V, C[3], C[2], C[1], or C[0]) of the instance of the launch structure. The remaining launch multiplexers 400 are coupled in a similar fashion to provide similar routing selections. However, the launch multiplexers 400 for party line outputs N3 and S3 are 3-to-1 multiplexers instead of 5-to-1 multiplexers since these party lines do not have corresponding lines in the eastward and westward directions, that is, they cannot turn.
- Each launch multiplexer 400 has a select input coupled to a corresponding select signal, p1_n1_*_out_sel, p1_n2_*_out_sel, p1_n3_*_out_sel, p1_s1_*_out_sel, p1_s2_*_out_sel, p1_s3_*_out_sel, p1_e1_*_out_sel, p1_e2_*_out_sel, p1_w1_*_out_sel, or p1_w2_*_out_sel. There is one launch select signal for each launch multiplexer 400 in each instance of the launch structure. With five instances of the launch structure, each containing ten multiplexers, there are a total of 50 select signals. These select signals are generated by a launch selection control circuit based on the data stored in configuration registers maintained within each silicon object 102 and by control bits received into the party line landing registers 300.
- The routing configurations and options shown in FIG. 4 are provided as examples only. Various routing configurations and options can be added or removed in alternative embodiments of the present invention.
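- The candidate sources of each launch multiplexer follow from the turning rules above, and can be enumerated with a short Python sketch. The function `launch_sources`, the shortened source labels, and the assumption that a turn always joins party lines with the same index (as in the N1 example with W1 and E1) are illustrative choices, not details confirmed beyond that example.

```python
HEADINGS = ["n", "e", "s", "w"]                  # compass headings in clockwise order
LINES_PER_DIR = {"n": 3, "s": 3, "e": 2, "w": 2}  # N1-N3, S1-S3, E1-E2, W1-W2


def turn(heading: str, quarter_turns: int) -> str:
    """Rotate a heading clockwise by the given number of quarter turns."""
    return HEADINGS[(HEADINGS.index(heading) + quarter_turns) % 4]


def launch_sources(out_dir: str, index: int) -> list:
    """List the candidate sources for the multiplexer driving party line out_dir+index."""
    if out_dir not in LINES_PER_DIR or not 1 <= index <= LINES_PER_DIR[out_dir]:
        raise ValueError("no such party line")
    axis = "ns" if out_dir in ("n", "s") else "ew"
    landing = f"p1_{axis}{index}"     # landing register shared by the opposing pair
    straight = f"{out_dir}{index}"    # same party line, passing straight through
    right_src = turn(out_dir, -1)     # a line with this heading turns right onto out_dir
    left_src = turn(out_dir, +1)      # a line with this heading turns left onto out_dir
    if index > LINES_PER_DIR[right_src]:
        # N3 and S3 have no matching east/west lines, so they cannot turn (3-to-1 multiplexer).
        return [landing, straight, "core"]
    return [landing, f"{right_src}{index}", straight, f"{left_src}{index}", "core"]


PARTY_LINES = [f"{d}{i}" for d, n in LINES_PER_DIR.items() for i in range(1, n + 1)]
FIELD_INSTANCES = ["vr", "c[0]", "c[1]", "c[2]", "c[3]"]
SELECT_SIGNAL_COUNT = len(PARTY_LINES) * len(FIELD_INSTANCES)  # 10 multiplexers x 5 instances
```

Calling `launch_sources("n", 1)` reproduces the input order given for the N1 multiplexer: landing register, W1, N1, E1, core.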
- FIG. 5 is a block diagram illustrating a launch selection circuit for generating the launch selection signals according to one embodiment of the present invention. In this embodiment, each launch multiplexer 400 is dynamically operated in one of two static configurations. That is, one of these two configurations is chosen for all party lines on a per-clock basis. Any number of selectable configurations can be used in alternative embodiments, and these configurations can be static or dynamic.
- Launch selection circuit 500 includes configuration control circuit 502, data output select circuit 504, and control output select circuit 506. Configuration control circuit 502 generates a party line select control signal PL_SEL_SEL on output 508, which selects one of the two configuration options for all fields of all party lines.
- The V and R[15:0] bits of each party line are routed as a unit by the select signals generated by data output select circuit 504. The four control bits C[3:0] of each party line are routed individually for each party line by the select signals generated by control output select circuit 506. Thus, five select signals are generated for each party line: one for each of the four C[3:0] bits (e.g., p1_n1_c_out_sel[0]), and one for the V and R[15:0] bits (e.g., p1_n1_vr_out_sel). Each select signal can have one of two selectable patterns, based on the logic state of PL_SEL_SEL on output 508.
- In data output select circuit 504, the select signal for the V and R[15:0] bits of each party line is generated by data launch configuration registers 510 and 512, multiplexer 514 and register 516. Configuration registers 510 and 512 store the binary patterns for the two selectable configurations for that party line. Multiplexer 514 selects which pattern is used for driving the select inputs of multiplexers 400 shown in FIG. 4 for the data fields. This selection is made as a function of configuration control output 508. The selected pattern is stored in register 516 for the current clock cycle.
- Similarly, control output select circuit 506 includes configuration registers 520 and 522, multiplexer 524 and register 526 for each control bit C[3:0] of each party line. Again, configuration registers 520 and 522 store the binary patterns for the two selectable configurations for the corresponding control bit. Multiplexer 524 selects which pattern is applied to register 526 for each clock cycle as a function of configuration control output 508.
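- One select stage of the kind described above (two configuration registers, a 2-to-1 multiplexer and an output register) can be modeled as follows. This is a minimal Python sketch: the class name `SelectStage` is hypothetical, and both the mapping of a high PL_SEL_SEL to the second configuration register and the initial register contents are assumptions that the text does not specify.

```python
class SelectStage:
    """Behavioral model of one launch select stage (e.g., registers 510/512, mux 514, register 516)."""

    def __init__(self, pattern_a: int, pattern_b: int) -> None:
        self.pattern_a = pattern_a   # first configuration register (510 or 520)
        self.pattern_b = pattern_b   # second configuration register (512 or 522)
        self.select_out = pattern_a  # output register (516 or 526); initial value is an assumption

    def clock(self, pl_sel_sel: int) -> int:
        """Latch the pattern chosen by PL_SEL_SEL and return the new select value."""
        # Assumed polarity: high chooses the second pattern, low chooses the first.
        self.select_out = self.pattern_b if pl_sel_sel else self.pattern_a
        return self.select_out


stage = SelectStage(pattern_a=2, pattern_b=4)
first = stage.clock(0)   # 2
second = stage.clock(1)  # 4
```

Since one PL_SEL_SEL value feeds every stage, switching it changes the routing of all fields of all party lines together, while the two stored patterns remain individually programmable per stage.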
- Configuration control circuit 502 includes 5-to-1 multiplexer 530, party line control bit select register 532, party line select mask register 534, party line select compare register 536, logic AND gates 538 (array of five gates), exclusive-NOR (XNOR) gates 540 (array of five gates) and reductive logic AND gate 542.
- The control fields (C[3:0]) of the landing registers 300 (FIG. 3) can be used to store patterns that determine which configuration mode will be selected. Landing register outputs 120 are coupled to respective inputs of multiplexer 530. The select input of multiplexer 530 is coupled to control register 532. Register 532 is loaded with a value that selects the appropriate landing register output 120 for matching the desired pattern.
- In addition, the party line select (PLS) bit 132 is supplied from logic block 118 (shown in FIG. 2). PLS bit 132 and the selected landing register output (four control bits) are applied to respective inputs of respective AND gates 538. The other inputs of AND gates 538 are coupled to mask register 534.
- Mask register 534 is loaded with a pattern that can be used to mask out certain bits in the pattern formed by PLS bit 132 and the four control bits of the selected landing register. The five-bit masked pattern at the output of AND gates 538 is applied to one set of inputs of XNOR gates 540. The other set of inputs of XNOR gates 540 is coupled to compare register 536. Compare register 536 is loaded with a five-bit pattern for comparing against the masked output of AND gates 538. The XNOR gates 540 perform a bit-wise comparison and generate a five-bit output, which indicates whether each bit location had a match.
- The five-bit output from XNOR gates 540 is applied to the five-bit reductive AND gate 542. If each bit of the masked output from AND gates 538 matches the corresponding bit in compare register 536, all inputs to AND gate 542 will be high, resulting in a high value on configuration control output 508. If there is a mismatch in one or more of the bit locations, output 508 will be low.
- Thus, the results from an operation in logic block 118 or a value from one of the party lines can be loaded into a landing register and compared against the match pattern to determine which routing configuration through multiplexers 400 (shown in FIG. 4) will be used during the next clock cycle. Also, logic block 118 can directly control the operating mode by setting or clearing PLS bit 132.
- Configuration registers 510, 512, 520 and 522 and registers [...] party line inputs 110 and landing register circuit 114, for example. Configuration control circuit 502 therefore provides a high level of programmability to the routing options through launch circuit 116, and these options can be reconfigured on each clock cycle, if desired. Each party line can be configured independently of the other party lines, and the data can be routed independently of the control bits.
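- The mask-and-compare path through configuration control circuit 502 reduces to a few bitwise operations. In the Python sketch below, the function name and parameters are hypothetical, and placing the PLS bit above the four control bits in the five-bit pattern is an assumption, since the text does not give the bit ordering.

```python
from typing import Sequence


def configuration_control(landing_controls: Sequence[int], landing_sel: int,
                          pls: int, mask: int, compare: int) -> int:
    """Return the level of configuration control output 508 (1 = match, 0 = mismatch)."""
    # Multiplexer 530, steered by register 532: pick one landing register's C[3:0] field.
    selected = landing_controls[landing_sel] & 0xF
    # Five-bit pattern formed by the PLS bit and the four control bits (assumed ordering).
    pattern = ((pls & 1) << 4) | selected
    # AND gates 538: apply mask register 534.
    masked = pattern & mask & 0x1F
    # XNOR gates 540: bit-wise comparison against compare register 536.
    match_bits = ~(masked ^ compare) & 0x1F
    # Reductive AND gate 542: high only if every bit position matched.
    return int(match_bits == 0x1F)


controls = [0b0000, 0b1010, 0b0000, 0b0000, 0b0000]  # C fields of the five landing registers
exact = configuration_control(controls, 1, pls=1, mask=0b11111, compare=0b11010)      # 1
ignore_pls = configuration_control(controls, 1, pls=1, mask=0b01111, compare=0b01010)  # 1
mismatch = configuration_control(controls, 1, pls=1, mask=0b11111, compare=0b01010)    # 0
```

One consequence of masking with AND gates is that a masked-out bit is forced low, so the compare register must hold a zero in that position for a match to occur.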
- Configuration control circuit 502 is one example of a control circuit that can be used for selecting different routing options through launch circuit 116. Numerous other routing options and control circuits can be used in alternative embodiments.
- As integrated circuit geometries shrink and design and mask-set costs rise, off-the-shelf, high performance, reconfigurable devices become more desirable. A reconfigurable logic array, such as that shown in FIGS. 1-5, can provide the time to market advantages of Field Programmable Gate Arrays (FPGAs) with the cost and performance advantages of custom Application Specific Integrated Circuits (ASICs).
- A reconfigurable logic array can also allow changes to be made to the logical function and data paths through software upgrades, which allows vendors to begin designing an integrated circuit before the specifications of the circuit are finalized. The data paths and control paths are loosely coupled, yet independently configurable. In one embodiment, the data path is sixteen bits wide while the control path is bit-wide granular. Each silicon object can have its own local structure, program, and/or memory. Further, each processing element can operate on its own without requiring global control.
- Communication between silicon objects can be performed through traditional nearest-neighbor connections and through party lines that provide longer distance communication. Silicon objects are allowed to change communication patterns on a per-clock basis, for example. In one embodiment, the function-specific logic block of each silicon object has a program memory that includes both operation and communication directions. The instructions can be loaded during initial configuration or dynamically during operation. Intelligent compilation (scheduling and routing) tools can be used to deterministically allocate instructions to each object before run time.
- The control paths can guide program execution while data is moved and operated upon through the data paths. From this view, instructions are the mechanisms that tie the independent control and data paths together within an array.
- Reconfigurable logic arrays can be used in a wide variety of applications. For example, the arrays can be used to provide high-throughput data processing in applications that exhibit high levels of data flow determinism (i.e., regular dependencies) at a localized level. Irregular dependencies (e.g., interrupts and context switches) can be handled as ordinary signals. Reconfigurable arrays can be used for multi-gigabit communications processing, such as data link layer processing, TCP/IP processing, and security processing. At the functional level, these applications can require frame/packet parsing and generation, finite state machines, CRC generation and detection, comma detection, statistics counters, hashing and memory controllers, for example. These arrays can also be used for signal processing applications such as image and video compression, wireless local area networks, and Forward Error Correction. Numerous other applications also exist.
- Although the present invention has been described with reference to preferred embodiments, workers skilled in the art will recognize that changes may be made in form and detail without departing from the spirit and scope of the invention. For example, the terms “high” and “low” are arbitrary terms and are interchangeable with a logical inversion of the circuit. Likewise, the term “coupled” can include various types of connections or couplings and can include a direct connection or a connection through one or more intermediate components.
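The per-object program memory described above, in which each instruction carries both an operation and a communication direction and the pattern may change on every clock, can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `Instruction`, `SiliconObject`, `run_row`, and `demo_row` are hypothetical, only nearest-neighbor links in a single row are modeled (no party lines or control path), and the sixteen-bit mask reflects the one embodiment mentioned in the description. It is not the patented circuit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

MASK16 = 0xFFFF  # sixteen-bit data path (one embodiment in the description)


@dataclass
class Instruction:
    """Hypothetical instruction: an operation plus a communication direction."""
    op: Callable[[int, int], int]  # f(local register, incoming value)
    source: Optional[str]          # "west", "east", or None for no input


@dataclass
class SiliconObject:
    """Hypothetical object with its own program memory and local register."""
    program: List[Instruction]
    reg: int = 0
    out: int = 0

    def step(self, cycle: int, inputs: Dict[str, int]) -> int:
        # The instruction, and therefore the communication pattern,
        # is selected per clock cycle without any global controller.
        instr = self.program[cycle % len(self.program)]
        incoming = inputs.get(instr.source, 0) if instr.source else 0
        self.reg = instr.op(self.reg, incoming) & MASK16
        return self.reg


def run_row(objects: List[SiliconObject], cycles: int) -> List[int]:
    """Clock a row of objects connected to their nearest neighbors."""
    for cycle in range(cycles):
        # Snapshot all outputs first so every object sees the values
        # from the previous clock edge (synchronous update).
        outs = [o.out for o in objects]
        for i, obj in enumerate(objects):
            inputs = {
                "west": outs[i - 1] if i > 0 else 0,
                "east": outs[i + 1] if i < len(objects) - 1 else 0,
            }
            obj.out = obj.step(cycle, inputs)
    return [o.out for o in objects]


def demo_row() -> List[SiliconObject]:
    """Three objects: a counter, an accumulator, and a constant source."""
    counter = SiliconObject([Instruction(lambda r, x: r + 1, None)])
    # The accumulator reads west on even cycles and east on odd cycles.
    accumulator = SiliconObject([
        Instruction(lambda r, x: r + x, "west"),
        Instruction(lambda r, x: r + x, "east"),
    ])
    constant = SiliconObject([Instruction(lambda r, x: 10, None)])
    return [counter, accumulator, constant]
```

Calling `run_row(demo_row(), 4)` returns `[4, 22, 10]`: the middle object alternates its input direction each clock, which is the kind of schedule a compilation tool could fix before run time.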
Claims (31)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/337,494 US6816562B2 (en) | 2003-01-07 | 2003-01-07 | Silicon object array with unidirectional segmented bus architecture |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/337,494 US6816562B2 (en) | 2003-01-07 | 2003-01-07 | Silicon object array with unidirectional segmented bus architecture |
Publications (2)
Publication Number | Publication Date |
---|---|
US20040130346A1 (en) | 2004-07-08 |
US6816562B2 (en) | 2004-11-09 |
Family
ID=32681253
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/337,494 (US6816562B2 (en), Expired - Lifetime) | 2003-01-07 | 2003-01-07 | Silicon object array with unidirectional segmented bus architecture |
Country Status (1)
Country | Link |
---|---|
US (1) | US6816562B2 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060080632A1 (en) * | 2004-09-30 | 2006-04-13 | Mathstar, Inc. | Integrated circuit layout having rectilinear structure of objects |
US20070025382A1 (en) * | 2005-07-26 | 2007-02-01 | Ambric, Inc. | System of virtual data channels in an integrated circuit |
US20070038782A1 (en) * | 2005-07-26 | 2007-02-15 | Ambric, Inc. | System of virtual data channels across clock boundaries in an integrated circuit |
US20070247189A1 (en) * | 2005-01-25 | 2007-10-25 | Mathstar | Field programmable semiconductor object array integrated circuit |
EP1952583A2 (en) * | 2005-11-07 | 2008-08-06 | Ambric Inc. | System of virtual data channels across clock boundaries in an integrated circuit |
US20090206889A1 (en) * | 2008-02-15 | 2009-08-20 | Mathstar, Inc. | Method and Apparatus for Controlling Power Surge in an Integrated Circuit |
FR3030806A1 (en) * | 2014-12-17 | 2016-06-24 | Thales Sa | CONFIGURABLE ELECTRONIC DATA TRANSFER SYSTEM AND CONFIGURATION METHOD THEREOF |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE60331296D1 (en) * | 2002-03-18 | 2010-04-01 | Nxp Bv | IMPLEMENTATION OF WIDE MULTIPLEXERS IN A RECONFIGURABLE LOGICAL DEVICE |
US7330484B2 (en) * | 2003-04-23 | 2008-02-12 | Sun Microsystems, Inc. | Method and system for transmitting packet chains |
US20070186076A1 (en) * | 2003-06-18 | 2007-08-09 | Jones Anthony M | Data pipeline transport system |
CN101044485A (en) | 2003-06-18 | 2007-09-26 | 安布里克股份有限公司 | Integrated circuit development system |
US7969919B1 (en) | 2005-08-08 | 2011-06-28 | Rockwell Collins, Inc. | System and method for thermal load sharing between nodes in a communications network |
US8009605B1 (en) | 2005-08-08 | 2011-08-30 | Rockwell Collins, Inc. | Low power, programmable modem for software defined radio applications |
US7260100B1 (en) | 2005-08-08 | 2007-08-21 | Rockwell Collins, Inc. | System and method for net formation and merging in ad hoc networks |
US7430192B1 (en) | 2005-08-08 | 2008-09-30 | Rockwell Collins, Inc. | Net formation-merging system and method |
US8139624B1 (en) | 2005-08-25 | 2012-03-20 | Rockwell Collins, Inc. | System and method for providing multiple ad hoc communication networks on a hardware channel |
US8145880B1 (en) | 2008-07-07 | 2012-03-27 | Ovics | Matrix processor data switch routing systems and methods |
US8045339B2 (en) * | 2008-07-07 | 2011-10-25 | Dell Products L.P. | Multiple component mounting system |
US8327114B1 (en) | 2008-07-07 | 2012-12-04 | Ovics | Matrix processor proxy systems and methods |
US7870365B1 (en) * | 2008-07-07 | 2011-01-11 | Ovics | Matrix of processors with data stream instruction execution pipeline coupled to data switch linking to neighbor units by non-contentious command channel / data channel |
US8131975B1 (en) | 2008-07-07 | 2012-03-06 | Ovics | Matrix processor initialization systems and methods |
US7958341B1 (en) | 2008-07-07 | 2011-06-07 | Ovics | Processing stream instruction in IC of mesh connected matrix of processors containing pipeline coupled switch transferring messages over consecutive cycles from one link to another link or memory |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6121790A (en) * | 1997-10-16 | 2000-09-19 | Altera Corporation | Programmable logic device with enhanced multiplexing capabilities in interconnect resources |
US6127846A (en) * | 1995-05-17 | 2000-10-03 | Altera Corporation | Programmable logic array devices with interconnect lines of various lengths |
US6181160B1 (en) * | 1996-10-10 | 2001-01-30 | Altera Corporation | Programmable logic device with hierarchical interconnection resources |
US6181162B1 (en) * | 1994-04-10 | 2001-01-30 | Altera Corporation | Programmable logic device with highly routable interconnect |
US6184710B1 (en) * | 1997-03-20 | 2001-02-06 | Altera Corporation | Programmable logic array devices with enhanced interconnectivity between adjacent logic regions |
US6218856B1 (en) * | 1994-01-27 | 2001-04-17 | Xilinx, Inc. | High speed programmable logic architecture |
US6225822B1 (en) * | 1998-11-18 | 2001-05-01 | Altera Corporation | Fast signal conductor networks for programmable logic devices |
US6239615B1 (en) * | 1998-01-21 | 2001-05-29 | Altera Corporation | High-performance interconnect |
US6262595B1 (en) * | 1997-06-10 | 2001-07-17 | Altera Corporation | High-speed programmable interconnect |
US6369610B1 (en) * | 1997-12-29 | 2002-04-09 | Ic Innovations Ltd. | Reconfigurable multiplier array |
US6417694B1 (en) * | 1996-10-10 | 2002-07-09 | Altera Corporation | Programmable logic device with hierarchical interconnection resources |
- 2003-01-07: US application US10/337,494, patent US6816562B2 (en), status: Expired - Lifetime (not active)
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6218856B1 (en) * | 1994-01-27 | 2001-04-17 | Xilinx, Inc. | High speed programmable logic architecture |
US6181162B1 (en) * | 1994-04-10 | 2001-01-30 | Altera Corporation | Programmable logic device with highly routable interconnect |
US6127846A (en) * | 1995-05-17 | 2000-10-03 | Altera Corporation | Programmable logic array devices with interconnect lines of various lengths |
US6181160B1 (en) * | 1996-10-10 | 2001-01-30 | Altera Corporation | Programmable logic device with hierarchical interconnection resources |
US6417694B1 (en) * | 1996-10-10 | 2002-07-09 | Altera Corporation | Programmable logic device with hierarchical interconnection resources |
US6184710B1 (en) * | 1997-03-20 | 2001-02-06 | Altera Corporation | Programmable logic array devices with enhanced interconnectivity between adjacent logic regions |
US6262595B1 (en) * | 1997-06-10 | 2001-07-17 | Altera Corporation | High-speed programmable interconnect |
US6121790A (en) * | 1997-10-16 | 2000-09-19 | Altera Corporation | Programmable logic device with enhanced multiplexing capabilities in interconnect resources |
US6369610B1 (en) * | 1997-12-29 | 2002-04-09 | Ic Innovations Ltd. | Reconfigurable multiplier array |
US6239615B1 (en) * | 1998-01-21 | 2001-05-29 | Altera Corporation | High-performance interconnect |
US6281704B2 (en) * | 1998-01-21 | 2001-08-28 | Altera Corporation | High-performance interconnect |
US6225822B1 (en) * | 1998-11-18 | 2001-05-01 | Altera Corporation | Fast signal conductor networks for programmable logic devices |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060080632A1 (en) * | 2004-09-30 | 2006-04-13 | Mathstar, Inc. | Integrated circuit layout having rectilinear structure of objects |
US20070247189A1 (en) * | 2005-01-25 | 2007-10-25 | Mathstar | Field programmable semiconductor object array integrated circuit |
US20070025382A1 (en) * | 2005-07-26 | 2007-02-01 | Ambric, Inc. | System of virtual data channels in an integrated circuit |
US20070038782A1 (en) * | 2005-07-26 | 2007-02-15 | Ambric, Inc. | System of virtual data channels across clock boundaries in an integrated circuit |
US7801033B2 (en) | 2005-07-26 | 2010-09-21 | Nethra Imaging, Inc. | System of virtual data channels in an integrated circuit |
EP1952583A2 (en) * | 2005-11-07 | 2008-08-06 | Ambric Inc. | System of virtual data channels across clock boundaries in an integrated circuit |
EP1952583A4 (en) * | 2005-11-07 | 2009-02-04 | Ambric Inc | System of virtual data channels across clock boundaries in an integrated circuit |
US20090206889A1 (en) * | 2008-02-15 | 2009-08-20 | Mathstar, Inc. | Method and Apparatus for Controlling Power Surge in an Integrated Circuit |
FR3030806A1 (en) * | 2014-12-17 | 2016-06-24 | Thales Sa | CONFIGURABLE ELECTRONIC DATA TRANSFER SYSTEM AND CONFIGURATION METHOD THEREOF |
EP3040873A1 (en) * | 2014-12-17 | 2016-07-06 | Thales | Configurable electronic system of transfer of data and associated configuration method |
Also Published As
Publication number | Publication date |
---|---|
US6816562B2 (en) | 2004-11-09 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US6816562B2 (en) | Silicon object array with unidirectional segmented bus architecture | |
US6421817B1 (en) | System and method of computation in a programmable logic device using virtual instructions | |
US6591357B2 (en) | Method and apparatus for configuring arbitrary sized data paths comprising multiple context processing elements | |
US8575959B2 (en) | Reconfigurable logic fabrics for integrated circuits and systems and methods for configuring reconfigurable logic fabrics | |
US6006321A (en) | Programmable logic datapath that may be used in a field programmable device | |
US6047115A (en) | Method for configuring FPGA memory planes for virtual hardware computation | |
US6476636B1 (en) | Tileable field-programmable gate array architecture | |
US7746111B1 (en) | Gating logic circuits in a self-timed integrated circuit | |
US9564902B2 (en) | Dynamically configurable and re-configurable data path | |
US7733123B1 (en) | Implementing conditional statements in self-timed logic circuits | |
US7746112B1 (en) | Output structure with cascaded control signals for logic blocks in integrated circuits, and methods of using the same | |
US9411554B1 (en) | Signed multiplier circuit utilizing a uniform array of logic blocks | |
US7746102B1 (en) | Bus-based logic blocks for self-timed integrated circuits | |
US7746106B1 (en) | Circuits for enabling feedback paths in a self-timed integrated circuit | |
US8516025B2 (en) | Clock driven dynamic datapath chaining | |
US7746104B1 (en) | Dynamically controlled output multiplexer circuits in a programmable integrated circuit | |
US7746105B1 (en) | Merging data streams in a self-timed programmable integrated circuit | |
US9002915B1 (en) | Circuits for shifting bussed data | |
JP2005539292A (en) | Programmable pipeline fabric having a mechanism for terminating signal propagation | |
US7746101B1 (en) | Cascading input structure for logic blocks in integrated circuits | |
Greensted et al. | RISA: A hardware platform for evolutionary design | |
Palchaudhuri et al. | Testable architecture design for programmable cellular automata on FPGA using run-time dynamically reconfigurable look-up tables | |
Baklouti et al. | Reconfigurable Communication Networks in a Parametric SIMD Parallel System on Chip | |
She et al. | A novel self-routing reconfigurable fault-tolerant cell array | |
Jordan | A configurable decoder for pin-limited applications |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: MATHSTAR, INC., MINNESOTA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ATKINSON, KEVIN E.;DWYER, TIMOTHY H.;JOHNSON, RYAN C.;AND OTHERS;REEL/FRAME:013641/0558 Effective date: 20030107 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FEPP | Fee payment procedure | Free format text: PAT HOLDER NO LONGER CLAIMS SMALL ENTITY STATUS, ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: STOL); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
FEPP | Fee payment procedure | Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment | Owner name: OLK GRUN GMBH LLC, DELAWARE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SAJAN, INC.;REEL/FRAME:027460/0899 Effective date: 20111114 |
AS | Assignment | Owner name: MATHSTAR, INC., MINNESOTA Free format text: MERGER;ASSIGNOR:MATHSTAR, INC.;REEL/FRAME:027480/0762 Effective date: 20050610 |
AS | Assignment | Owner name: SAJAN, INC., WISCONSIN Free format text: MERGER;ASSIGNOR:MATHSTAR, INC.;REEL/FRAME:027480/0775 Effective date: 20100226 |
AS | Assignment | Owner name: NYTELL SOFTWARE LLC, DELAWARE Free format text: MERGER;ASSIGNOR:OLK GRUN GMBH, LLC;REEL/FRAME:037391/0494 Effective date: 20150826 |
FPAY | Fee payment | Year of fee payment: 12 |