CN105608490B - Cellular array computing system and communication method therein
Publication number: CN105608490B
Application number: CN201510456294.3A
Authority: CN (China)
Keywords: cell, cellular array, data, array, master CPU
Legal status: Active
Landscapes: Multi Processors; Mobile Radio Communication Systems
Abstract
A cellular array computing system and a communication method therein. The cellular array computing system comprises a master CPU, a cellular array, and a cellular array bus. The cellular array is a two-dimensional array composed of more than one cell, each having both computing and storage functions; each cell includes a microprocessor and a non-volatile random access memory. The non-volatile random access memory provides random access to the data involved in the microprocessor's computations and also stores the instruction code of software and data that must persist. Each cell stores its own position in the cellular array as an ID that can be read by software or hardware in the cell. The master CPU communicates with each cell in the cellular array through the cellular array bus. Adjacent cells in the cellular array have communication interfaces between them and can transmit data to each other. The invention can overcome the communication bottleneck that exists between the CPU and memory and storage in existing computer architectures, improving the overall performance of the computing system.
Description
Technical field
The present invention relates to the field of computers and computer applications, and in particular to a cellular array computing system and a communication method therein.
Background art
In general, a computer mainly comprises three core components: a central processing unit (CPU, Central Processing Unit), memory, and storage.
Through the unremitting efforts of some of the world's top companies, the CPU has evolved into an extremely complex semiconductor chip. The number of MOS transistors in a top-end CPU core can exceed 100 million. The current industry trend is that, constrained by power consumption, the operating frequency of CPUs has become difficult to raise further. The operating efficiency of the extremely complex modern CPU is likewise difficult to improve. New CPU products are therefore increasingly evolving toward multi-core designs.
In terms of memory, the dominant technology at present is dynamic random access memory (DRAM, Dynamic Random Access Memory). DRAM supports fast random reads and writes, but cannot retain its contents when power is lost. In fact, even when powered, it loses information through leakage of its internal storage capacitors and must be refreshed periodically.
In terms of storage, NAND flash memory is gradually replacing conventional hard disks. The floating gate technology on which flash memory relies can retain contents when power is lost, but writing (changing '1' to '0') is slow and erasing (changing '0' to '1') is slower still, so flash cannot directly support computation the way DRAM does. It is built as a block device: an entire block must be erased together, a block contains many pages, and each page can be written only after erasure. Another problem with NAND is its limited lifetime.
Although DRAM, NAND flash, and CPU logic circuits are all manufactured with CMOS semiconductor processes, the processes for the three are mutually incompatible. As a result, the three core components of a computer cannot coexist on a single chip, which has profoundly shaped the architecture of modern computers.
A prior-art computer architecture is shown in Fig. 1, which depicts multiple CPU cores: CPU1, CPU2, CPU3, ..., CPUn. Each CPU core generally has a corresponding level-1 cache (L1 Cache) and, as needed, may also be provided with a corresponding level-2 cache (L2 Cache) and level-3 cache (L3 Cache). The DRAM communicates with each CPU core through a double data rate (DDR, Double Data Rate) interface, while the hard disk (HD, Hard Disk) or solid-state drive (SSD, Solid State Drive) communicates with each CPU core through a peripheral device interface.
On the one hand, CPUs are moving toward multi-core designs; on the other, memory and storage both reside on separate chips. The amount of information a multi-core CPU must move grows proportionally, so communication with memory and storage increasingly becomes the bottleneck of system performance. To relieve this bottleneck, CPUs have had to adopt ever larger multi-level caches. A cache is a copy of part of the memory contents and is typically built from static random access memory (SRAM, Static Random Access Memory), which costs much more than DRAM but is faster. The cost-effectiveness of such an architecture is poor. The cost of a semiconductor chip is determined by its silicon area, and the performance gain delivered by a conventional computer architecture is far from proportional to the increase in its silicon area.
Summary of the invention
The problem addressed by the present invention is that, in prior-art computer architectures, the communication bottleneck between the CPU and memory and storage limits improvement of overall computer performance and results in poor cost-effectiveness.
To solve the above problem, the technical solution of the present invention provides a cellular array computing system comprising a master CPU, a cellular array, and a cellular array bus. The cellular array is a two-dimensional array composed of more than one cell, each having both computing and storage functions, wherein each cell includes a microprocessor (MPU, Micro Processing Unit) and a non-volatile (NV, Non-Volatile) random access memory. The non-volatile random access memory provides random access to the data involved in the microprocessor's computations and also stores the instruction code of software and data that must persist. Each cell stores its own position in the cellular array as an identification number (ID, identification) that can be read by software or hardware in the cell. The master CPU communicates with each cell in the cellular array through the cellular array bus. Adjacent cells in the cellular array have communication interfaces between them and can transmit data to each other.
Optionally, the communication between the master CPU and each cell in the cellular array through the cellular array bus includes at least one of the following:
Reading and writing, by address, the non-volatile random access memory of any cell in the cellular array;
Broadcasting data to the non-volatile random access memory of each cell in a target area of the cellular array, and writing it to the same relative address in the non-volatile random access memory of each cell in the target area;
Sending instructions to, transmitting data to, or reading the state of the microprocessor of any cell in the cellular array;
Broadcasting instructions to the microprocessors of all cells in the target area.
Optionally, each cell in the cellular array further includes a bus controller and a cell-internal bus. The bus controller is connected to the cellular array bus, the microprocessor, and the cell-internal bus. The bus controller identifies communication between the master CPU and its own cell, and either connects the microprocessor to pass on instructions or data sent by the master CPU or to allow its state to be read, or connects the non-volatile random access memory through the cell-internal bus for data read and write operations.
Optionally, at least one of a floating-point processor (FPU, Floating Point Unit) and an image processor is integrated in the microprocessor.
Optionally, the non-volatile random access memory is a magnetic random access memory (MRAM, Magnetic Random Access Memory).
Optionally, the master CPU, the cellular array, and the cellular array bus are integrated in a single chip.
Optionally, the master CPU is a separate chip that communicates, through a standard memory interface, with a chip made up of the cellular array and the cellular array bus.
To solve the above problem, the technical solution of the present invention also provides a communication method in the above cellular array computing system, including:
Any cell in the cellular array receives a target address broadcast by the master CPU on the cellular array bus and, if it determines that the target address lies within this cell, connects the cell's non-volatile random access memory so that the master CPU can perform data read and write operations;
A first special address segment is reserved in the system address space for communication between the master CPU and the microprocessors and stores the ID of the target cell; if any cell in the cellular array, on receiving the first special address segment, identifies the communication as being with its own microprocessor, it connects the cell's microprocessor to complete the subsequent instruction reception, data reception, and state read operations;
A second special address segment is reserved in the system address space for the master CPU to broadcast instructions; the second special address segment carries the IDs of cells that help determine the range of the target area in the cellular array; if any cell in the cellular array, after receiving the second special address segment, identifies that it lies within the target area, it either connects its microprocessor to pass on instructions or data sent by the master CPU or to allow its state to be read, or connects its non-volatile random access memory for data read and write operations;
Any cell in the cellular array transmits data to adjacent cells under the control of its microprocessor.
Optionally, each cell in the cellular array further includes a bus controller and a cell-internal bus, the bus controller being connected to the cellular array bus, the microprocessor, and the cell-internal bus. For any cell in the cellular array, determining whether the target address lies within this cell, identifying whether the communication is with this cell's microprocessor, identifying whether this cell lies within the target area, and connecting the non-volatile random access memory or the microprocessor are all performed by the bus controller; the bus controller connects the non-volatile random access memory through the cell-internal bus.
Optionally, read and write operations by the master CPU on the non-volatile random access memory of any cell in the cellular array have higher priority than read and write operations by that cell's own microprocessor on the same non-volatile random access memory.
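As an illustration only, the priority rule above can be sketched as a fixed-priority arbiter. The function name `arbitrate` and the request representation are assumptions made for this sketch and do not come from the patent text.

```python
from typing import Optional, Tuple

# Hypothetical request form: ("read" or "write", relative address in the cell's memory).
Request = Tuple[str, int]


def arbitrate(master_request: Optional[Request],
              mpu_request: Optional[Request]) -> Optional[Tuple[str, Request]]:
    """Pick which requester is granted the cell's non-volatile memory.

    The master CPU always wins when both requests are pending; the cell's
    own microprocessor is served only when the master CPU is idle.
    """
    if master_request is not None:
        return ("master_cpu", master_request)
    if mpu_request is not None:
        return ("mpu", mpu_request)
    return None
```

When both requests are pending, the microprocessor's request is simply not granted in that round; how it is retried is left open here.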
The technical solution of the present invention also provides a method of computing a Monte Carlo integral using the above cellular array computing system, including:
The master CPU selects all cells in the cellular array, or the cells in a target area, and broadcasts the program corresponding to the integrand to a relative address segment of each selected cell;
The master CPU broadcasts an instruction that causes the microprocessor of each selected cell to execute the program corresponding to the integrand starting from that relative address segment;
After each cell completes its integration, it stores its sum at an agreed address, to be read by the master CPU, which then computes the overall sum.
Optionally, when the program corresponding to the integrand begins executing, its built-in random number generator reads the cell's ID and uses it as the seed.
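The method above can be simulated in software. The following is a minimal sketch, assuming a one-dimensional integral over [0, 1) and a hypothetical packing of the cell ID (x, y) into an integer seed; the names `cell_program` and `master_integrate` are illustrative and not from the patent. The loop over cells stands in for what the cellular array would do in parallel.

```python
import random
from typing import Callable, Dict, Tuple

CellID = Tuple[int, int]


def cell_program(cell_id: CellID, samples: int,
                 integrand: Callable[[float], float]) -> float:
    """Runs inside one cell: sum the integrand at random points in [0, 1)."""
    x, y = cell_id
    # Hypothetical packing of the ID into a seed, so that each cell draws
    # a different random sequence.
    rng = random.Random(x * 65536 + y)
    return sum(integrand(rng.random()) for _ in range(samples))


def master_integrate(integrand: Callable[[float], float],
                     width: int, height: int, samples_per_cell: int) -> float:
    """Master CPU side: 'broadcast' the program, then collect and total the sums."""
    agreed_address: Dict[CellID, float] = {}  # stands in for the agreed address in each cell
    for x in range(width):
        for y in range(height):
            agreed_address[(x, y)] = cell_program((x, y), samples_per_cell, integrand)
    total = sum(agreed_address.values())
    return total / (width * height * samples_per_cell)
```

For example, `master_integrate(lambda u: u * u, 4, 4, 2000)` gives an estimate close to 1/3. Seeding from the ID matters because every cell runs the same broadcast program; with a shared seed all cells would draw identical samples.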
The technical solution of the present invention also provides another method of computing a Monte Carlo integral using the above cellular array computing system, including:
The master CPU selects all cells in the cellular array, or the cells in a target area;
The master CPU broadcasts a download program to the same relative address segment of each selected cell, and broadcasts an instruction that causes the microprocessor of each selected cell to execute the download program starting from that relative address;
The program corresponding to the integrand is split into two or more subprograms, and the master CPU broadcasts each subprogram to the microprocessors of the selected cells;
Each microprocessor running the download program selects one of the subprograms to store according to the ID of its own cell, so that the subprograms are deployed in order across a group of successively adjacent cells;
The master CPU broadcasts an instruction that causes the microprocessors of each group of cells to execute, in sequence, the subprograms into which the program corresponding to the integrand was split, with the intermediate result of each stage passed to the next stage as input;
After each group of cells completes its integration, it stores its sum at an agreed address, to be read by the master CPU, which then computes the overall sum.
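The pipelined variant can be sketched in the same spirit. In this illustration the integrand exp(-u*u), its split into two subprograms, the rule that a cell picks its stage from the x component of its ID, and the use of one row per group are all assumptions chosen for the example, not details given in the patent.

```python
import math
import random
from typing import Callable, List, Tuple

# Hypothetical split of the integrand f(u) = exp(-u*u) into two subprograms.
SUBPROGRAMS: List[Callable[[float], float]] = [
    lambda u: u * u,          # stage 0: square the sample
    lambda v: math.exp(-v),   # stage 1: exponential of the negated input
]


def select_subprogram(cell_id: Tuple[int, int]) -> Callable[[float], float]:
    """Download-program step: a cell keeps the subprogram matching its ID."""
    x, _ = cell_id
    return SUBPROGRAMS[x % len(SUBPROGRAMS)]


def run_group(row: int, samples: int) -> float:
    """One group = adjacent cells (0, row), (1, row), ... forming a pipeline."""
    stages = [select_subprogram((x, row)) for x in range(len(SUBPROGRAMS))]
    rng = random.Random(row)  # per-group seed, again derived from the ID
    total = 0.0
    for _ in range(samples):
        value = rng.random()
        for stage in stages:  # each stage's output feeds the next cell
            value = stage(value)
        total += value
    return total


def master_integrate(rows: int, samples_per_group: int) -> float:
    """Master CPU side: read each group's sum and compute the overall mean."""
    sums = [run_group(row, samples_per_group) for row in range(rows)]
    return sum(sums) / (rows * samples_per_group)
```

Here the stages run one after another in a single loop; in the cellular array the cells of a group would work concurrently on successive samples, passing intermediate values over the links between adjacent cells.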
Compared with the prior art, the technical solution of the present invention has at least the following advantages:
More than one unit having both independent computing and storage functions (referred to as a "cell") is arranged into a two-dimensional array (referred to as a "cellular array"), wherein each cell includes a microprocessor and a non-volatile random access memory. The non-volatile random access memory can support random access to the data involved in the microprocessor's computations and can also store the instruction code of software and data that must persist, so that the three functions of memory, storage, and computation are integrated into each cell and a dense communication network is formed among the cells. On the one hand, the master CPU can communicate with each cell in the cellular array through the cellular array bus; on the other hand, adjacent cells in the cellular array can transmit data to each other. Through data mass-sending and the internal network, the system can thus overcome the communication bottleneck between the CPU and memory and storage in existing computer architectures, improving the overall performance of the computing system and achieving better cost-effectiveness.
When a Monte Carlo integral is computed on a prior-art multi-core CPU architecture and the cache of each CPU is insufficient, every CPU must fetch the code of the integrand through the memory interface, which creates a bottleneck. When the above cellular array computing system is used to compute the Monte Carlo integral, the advantages of its broadcast/mass-sending function and of its internal network, which can transmit data in parallel on a large scale, are fully realized, while the simultaneous operation of a large number of cells fully releases the computing power of the cellular array computing system, so that the Monte Carlo integral is computed more efficiently.
Brief description of the drawings
Fig. 1 is a schematic diagram of a prior-art computer architecture;
Fig. 2 is a structural schematic diagram of a cellular array computing system provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of one mode of communication between adjacent cells according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of another mode of communication between adjacent cells according to an embodiment of the present invention;
Fig. 5 is a structural schematic diagram of a cell according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of computing a Monte Carlo integral in a pipelined manner using the cellular array computing system;
Fig. 7 is a structural schematic diagram of a cell that performs intercellular communication in the cellular array according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of path selection for intercellular communication in the cellular array according to an embodiment of the present invention;
Fig. 9 is a schematic flow diagram of the implementation of a dedicated output cell according to an embodiment of the present invention;
Fig. 10 is a schematic diagram of cell mass-sending in the cellular array when the starting cell is at a corner of the target area, according to an embodiment of the present invention;
Fig. 11 is a schematic diagram of cell mass-sending in the cellular array when the starting cell is on an edge of the target area, according to an embodiment of the present invention;
Fig. 12 is a schematic diagram of cell mass-sending in the cellular array when the starting cell is inside the target area, according to an embodiment of the present invention;
Fig. 13 is a schematic diagram of cell mass-sending in the cellular array when the starting cell is outside the target area, according to an embodiment of the present invention;
Fig. 14 is another structural schematic diagram of a cellular array computing system provided by an embodiment of the present invention;
Fig. 15 is a further structural schematic diagram of a cellular array computing system provided by an embodiment of the present invention;
Fig. 16 is yet another structural schematic diagram of a cellular array computing system provided by an embodiment of the present invention;
Fig. 17 is a structural schematic diagram of a camera system provided by an embodiment of the present invention;
Fig. 18 is another structural schematic diagram of a camera system provided by an embodiment of the present invention;
Fig. 19 is a schematic diagram of a neuron in a neural network;
Fig. 20 is a schematic diagram of neural network computation;
Fig. 21 is a structural schematic diagram of a cellular array computing system with a debugging interface provided by an embodiment of the present invention.
Detailed description
In prior-art computer architectures, the communication bottleneck between the CPU and memory and storage limits improvement of overall computer performance and results in poor cost-effectiveness.
After study, the inventors of the present application considered that if the three functions of memory, storage, and computation are integrated on one chip to form relatively simple units that each have both independent computing and storage functions, and a dense communication network is formed among a large number of such units, thereby providing a data mass-sending function and an internal network capable of transmitting data in parallel on a large scale, then a computing architecture with similarities to the human brain can be developed. This is equivalent to building a large number of microcomputers on a single chip.
To this end, the technical solution of the present invention provides a computing architecture similar in structure to the human brain (referred to in this technical solution as a "cellular array computing system"). The architecture consists of numerous structurally simple units (referred to in this technical solution as "cells") that each have both storage and computing functions and are connected by a dense network. This new computing architecture can be widely applied in fields such as large-scale computing, big data processing, and artificial intelligence.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the drawings.
As shown in Fig. 2, the cellular array computing system provided by an embodiment of the present invention includes a master CPU, a cellular array, and a cellular array bus. The cellular array is the main body of the cellular array computing system; it is a two-dimensional array composed of more than one cell, each having both computing and storage functions, wherein each cell includes a microprocessor (MPU) and a non-volatile random access memory (MRAM is used as the example in Fig. 2). The non-volatile random access memory provides random access to the data involved in the microprocessor's computations and also stores the instruction code of software and data that must persist. Each cell stores its own position in the cellular array as an ID that can be read by software or hardware in the cell. The master CPU communicates with each cell in the cellular array through the cellular array bus. Adjacent cells in the cellular array have communication interfaces between them and can transmit data to each other.
It should be noted that in this embodiment the non-volatile random access memory is described using MRAM as an example. In other embodiments, as non-volatile random access memory technology develops and matures, the non-volatile random access memory may also be implemented with several other candidate technologies, such as phase-change random access memory (PCRAM, Phase Change Random Access Memory), resistive random access memory (Resistive Random Access Memory), ferroelectric random access memory (FeRAM, Ferroelectric Random Access Memory), and ferroelectric dynamic random access memory (FEDRAM, Ferroelectric Dynamic Random Access Memory).
MRAM is a new memory and storage technology. It supports fast random reads and writes like SRAM/DRAM, and is faster than DRAM; it also retains data permanently after power loss like flash memory and, unlike NAND flash, can be rewritten an unlimited number of times, giving it a longer service life. MRAM is also quite economical: the silicon area it occupies per unit of capacity has a large advantage over SRAM (commonly used as CPU cache) and is expected to approach that of DRAM. Its performance is also good: read and write latency is close to that of the best SRAM, and its power consumption is the lowest among the various memory and storage technologies. Moreover, unlike DRAM and flash, MRAM is compatible with standard CMOS semiconductor processes and can be integrated with logic circuits on a single chip. Using MRAM technology, the three functions of memory, storage, and computation can be integrated on one chip, which makes the cellular array computing system feasible.
In this embodiment, the microprocessor has the functions of an ordinary CPU. Depending on the application scenario, units such as a floating-point processor (FPU, Floating Point Unit) or an image processor may be added; the microprocessor may therefore integrate at least one of a floating-point processor and an image processor.
In an actual implementation, the master CPU may be integrated with the cellular array and the cellular array bus in a single chip, or the master CPU may be a separate chip that communicates, through a standard memory interface, with a chip made up of the cellular array and the cellular array bus. When the master CPU and the cellular array communicate through a standard memory interface, the master CPU can be a general-purpose CPU chip, which makes the cellular array computing system easier to implement.
In this embodiment, each cell stores its own position in the cellular array as its ID. The position may be expressed as coordinates in the first quadrant of a plane rectangular coordinate system: if (x, y) denotes the position of a cell in the cellular array, then (x, y) can be stored in the cell as its ID, and both the software and the hardware in the cell can read this ID and use it in specific operations.
In this embodiment, the communication between the master CPU and each cell in the cellular array through the cellular array bus includes the following:
Reading and writing, by address, the non-volatile random access memory of any cell in the cellular array;
Broadcasting data to the non-volatile random access memory of each cell in a target area of the cellular array, and writing it to the same relative address in the non-volatile random access memory of each cell in the target area;
Sending instructions (including start and pause) to, transmitting data to, or reading the state of the microprocessor of any cell in the cellular array;
Broadcasting instructions to the microprocessors of all cells in the target area.
Of course, in other embodiments, the communication between the master CPU and each cell in the cellular array through the cellular array bus may also be any one of the above or a combination of several of them.
It should be noted that a "target area" in the embodiments of the present invention is a region, selected by the master CPU or by any cell in the cellular array, that consists of more than one mutually adjacent cell; the cells in the region are the recipients of data or instructions broadcast or mass-sent by the master CPU or by any cell in the cellular array. In this embodiment, the target area is described using a rectangular area as an example (a ≤ x ≤ b, c ≤ y ≤ d, where a and b are the boundary coordinates of the rectangular area along the x-axis of the plane rectangular coordinate system, and c and d are its boundary coordinates along the y-axis). In other embodiments, the target area may have other shapes, such as a diamond, a triangle, or a hexagon.
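The rectangular target area reduces to a simple bounds check on a cell's ID. The sketch below illustrates this; the class and function names (`CellID`, `RectArea`, `cells_in_area`) are hypothetical, and only the inequality a ≤ x ≤ b, c ≤ y ≤ d comes from the text.

```python
from typing import List, NamedTuple


class CellID(NamedTuple):
    x: int
    y: int


class RectArea(NamedTuple):
    a: int  # lower x boundary
    b: int  # upper x boundary
    c: int  # lower y boundary
    d: int  # upper y boundary

    def contains(self, cell: CellID) -> bool:
        """True when a <= x <= b and c <= y <= d."""
        return self.a <= cell.x <= self.b and self.c <= cell.y <= self.d


def cells_in_area(area: RectArea, width: int, height: int) -> List[CellID]:
    """List the cells of a width x height array that fall in the target area."""
    return [CellID(x, y)
            for y in range(height)
            for x in range(width)
            if area.contains(CellID(x, y))]
```

Because every cell already stores its own (x, y), each cell can evaluate this test locally without the sender having to enumerate the recipients.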
In addition, in the embodiments of the present invention the concept of "broadcast" differs from that of "mass-sending": in the former, data or an instruction is sent once so that all recipients can receive it, whereas in the latter it is sent to different recipients in multiple point-to-point transmissions.
Besides broadcasts from the master CPU to any cell in the cellular array (including the microprocessor or non-volatile random access memory in the cell), there is also a communication network within the cellular array that lets a cell, under the control of its MPU, transmit data to the cells adjacent to it. As shown in Fig. 3, in a plane any cell can communicate with its adjacent cells in the four directions up, down, left, and right. Of course, communication between adjacent cells is not limited to these four directions; where the configuration can support it, the eight directions up, down, left, right, upper left, upper right, lower left, and lower right may be used, as shown in Fig. 4, in which any cell can communicate with its adjacent cells in all eight directions.
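The two adjacency schemes can be expressed as coordinate offsets applied to a cell's ID. The following sketch assumes cells are indexed from (0, 0) and that cells on the array boundary simply have fewer neighbors; both are assumptions for illustration.

```python
from typing import List, Tuple

# Offsets for the four-direction scheme (up, down, left, right).
FOUR_DIRECTIONS = [(0, 1), (0, -1), (-1, 0), (1, 0)]
# The eight-direction scheme adds the four diagonals.
EIGHT_DIRECTIONS = FOUR_DIRECTIONS + [(-1, 1), (1, 1), (-1, -1), (1, -1)]


def neighbors(x: int, y: int, width: int, height: int,
              eight: bool = False) -> List[Tuple[int, int]]:
    """Return the IDs of the cells adjacent to (x, y) inside the array."""
    offsets = EIGHT_DIRECTIONS if eight else FOUR_DIRECTIONS
    return [(x + dx, y + dy)
            for dx, dy in offsets
            if 0 <= x + dx < width and 0 <= y + dy < height]
```

For instance, an interior cell has four neighbors under the first scheme and eight under the second, while a corner cell has two and three respectively.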
As shown in Fig. 5, in this embodiment each cell in the cellular array may further include a bus controller and a cell-internal bus. The bus controller is connected to the cellular array bus, the microprocessor, and the cell-internal bus. The bus controller identifies communication between the master CPU and its own cell, and either connects the microprocessor to pass on instructions or data sent by the master CPU or to allow its state to be read, or connects the non-volatile random access memory through the cell-internal bus for data read and write operations.
Those skilled in the art know that a fairly simple CPU with good performance, such as the ARM Cortex-M0, has only about 50,000 MOS transistors; even with FPU functions added in moderation, it remains far smaller than a top-end CPU with over a hundred million MOS transistors. The increase in area (cost) incurred by raising CPU performance is out of proportion to the gain. If one large CPU is replaced with many small CPUs at the same total cost, the total computing power is bound to increase many times over. However, conventional computer architectures are limited by the communication bottleneck, so the actual performance gain from using a large number of CPU cores is very limited.
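As a rough check of the scale involved, using only the two transistor counts quoted above and ignoring memory, interconnect, and any added FPU (a simplifying assumption), the number of small cores that fit in the transistor budget of one large CPU is approximately:

$$
\frac{N_{\text{large}}}{N_{\text{small}}} \approx \frac{1 \times 10^{8}}{5 \times 10^{4}} = 2000
$$

Here $N_{\text{large}}$ and $N_{\text{small}}$ are labels introduced for this estimate; they denote the transistor counts of the top-end CPU and of the simple CPU respectively.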
The cellular array computing system provided by the technical solution of the present invention addresses the communication bottleneck through data broadcasting and the internal network, thereby improving the overall performance of the computing system and achieving better cost-effectiveness; this will be seen more clearly in the application examples that follow.
Preliminary studies show that if an MPU similar to the Cortex-M0 is combined with 32 KB of memory to form one cell, then with a 40-nanometer process 3,000 such cells can be built on a single chip, which amounts to very strong computing power. Further studies show that, in this way, the computing power of a contemporary top-end CPU (generally measured in floating-point operations per second, FLOPS) can be surpassed within the same silicon area. Because the cellular array computing system of the present technical solution no longer faces the memory interface bottleneck, its performance will be even better on many practical problems.
Based on the above cellular array computing system, an embodiment of the present invention also provides a communication method in the above cellular array computing system, including: an operation in which the master CPU reads and writes the non-volatile random access memory, a communication operation between the master CPU and a microprocessor, a broadcast operation of the master CPU, and a communication operation between adjacent cells in the cellular array.
The operation in which the master CPU reads and writes the non-volatile random access memory specifically includes: any cell in the cellular array receives a target address broadcast by the master CPU on the cellular array bus and, if it determines that the target address lies within this cell, connects the cell's non-volatile random access memory so that the master CPU can perform data read and write operations.
The communication operation between the master CPU and a microprocessor specifically includes: a first special address segment is reserved in the system address space for communication between the master CPU and the microprocessors and stores the ID of the target cell; if any cell in the cellular array, on receiving the first special address segment, identifies the communication as being with its own microprocessor, it connects the cell's microprocessor to complete the subsequent instruction reception, data reception, and state read operations.
It should be noted that the system address space is not limited to the sum of the address spaces formed by the non-volatile random access memories in the cells of the cellular array, because the memories connected to the cellular array bus need not be only the non-volatile random access memories contained in the cells; other types of memory may well be connected to the cellular array bus for access by the master CPU. The master CPU therefore needs to use the cell's ID to identify the cell it intends to access (in this embodiment, that cell is called the "target cell").
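The two address-based operations above amount to an address decode performed in every cell. The sketch below assumes a concrete address map that the patent does not specify: each cell owns a 32 KB window laid out row by row, and the first special address segment starts at a hypothetical base with the target cell's ID encoded as an offset. All constants and names are illustrative.

```python
from enum import Enum
from typing import Optional, Tuple

CELL_MEM = 32 * 1024            # assumed size of each cell's non-volatile memory
WIDTH = 64                      # assumed number of cells per row
MPU_SEGMENT_BASE = 0xF000_0000  # assumed base of the first special address segment


class Route(Enum):
    IGNORE = 0  # the address is not for this cell
    NVRAM = 1   # connect the cell's non-volatile memory
    MPU = 2     # connect the cell's microprocessor


def decode(address: int, cell_id: Tuple[int, int]) -> Tuple[Route, Optional[int]]:
    """Decide what a cell does with an address seen on the bus.

    Returns the route and, for memory accesses, the cell-relative address.
    """
    x, y = cell_id
    if address >= MPU_SEGMENT_BASE:
        # Special segment: the offset carries the target cell's ID.
        target = address - MPU_SEGMENT_BASE
        target_id = (target % WIDTH, target // WIDTH)
        return (Route.MPU, None) if target_id == (x, y) else (Route.IGNORE, None)
    base = (y * WIDTH + x) * CELL_MEM
    if base <= address < base + CELL_MEM:
        return (Route.NVRAM, address - base)
    return (Route.IGNORE, None)
```

Every cell sees the same address on the bus, and at most one of them routes it to its memory or microprocessor; addresses that belong to other memories on the bus are ignored by all cells.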
The broadcast operation of the master cpu specifically includes:The second special address field is reserved in system address space to be used for
The master cpu broadcasting instructions, the second special address field have and can assist in target area in the cellular array
The ID of each cell of range, if any cell identifies that this is thin after receiving the second special address field in the cellular array
Born of the same parents then connect the microprocessor of the cell to transmit the instruction or data, shape that the master cpu is sent in the target area
State is read, or the read-write operation of data is carried out by connecting the non-volatile random access memory of the cell.
The broadcast operation of the master cpu is illustrated so that the target area is specifically rectangular area as an example below.
One section is reserved in system address space and is used as broadcasting instructions, and one in this address section rises for storing in target rectangle region
The ID of point cell.The starting point cell is the first cell accessed by master cpu in the target rectangle region, total in the cell
After lane controller receives this special address, the data of a subsequent word (word) are received, this data includes target rectangle
The ID of the cell diagonal with starting point cell in region.Bus control unit judges this cell in this region, second word of reception
Data.It is instruction or data to MPU that second word, which is indicated, or some relative address is opened from non-volatile random access memory
Begin to be written.If it is the former, MPU is connected, if it is the latter, connection non-volatile random access memory completes subsequent operation.
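By way of illustration only, the decision made by a cell's bus controller in the rectangular case can
be sketched as follows. The sketch assumes that a cell ID is an (x, y) position in the array; the
names CellId, in_target_rectangle, and route_broadcast, and the boolean encoding of the second word,
are hypothetical and do not appear in the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellId:
    """Assumed ID format: the cell's (x, y) position in the array."""
    x: int
    y: int


def in_target_rectangle(cell: CellId, start: CellId, diagonal: CellId) -> bool:
    """Return True if `cell` lies in the rectangle spanned by the starting
    cell and the diagonally opposite cell (both corners inclusive)."""
    lo_x, hi_x = sorted((start.x, diagonal.x))
    lo_y, hi_y = sorted((start.y, diagonal.y))
    return lo_x <= cell.x <= hi_x and lo_y <= cell.y <= hi_y


def route_broadcast(cell: CellId, start: CellId, diagonal: CellId,
                    second_word_targets_mpu: bool) -> str:
    """Decide what the bus controller connects for one broadcast.

    `second_word_targets_mpu` is a hypothetical decoding of the second word:
    True means instruction/data for the MPU, False means a write to the
    non-volatile RAM starting at a relative address.
    """
    if not in_target_rectangle(cell, start, diagonal):
        return "ignore"  # cell is outside the target region
    return "mpu" if second_word_targets_mpu else "nvram"
```

For example, with a starting cell at (1, 1) and a diagonal cell at (4, 4), the cell at (2, 3) would
take part in the broadcast, while the cell at (5, 5) would ignore it.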
It should be noted that, where the storage space of the second special address segment is limited, the
cell IDs it holds may not be sufficient on their own to determine the extent of the target region. In
that case, after the second special address segment is received, subsequent data must also be received
and combined with the cell IDs held in the second special address segment to determine the extent of
the target region.
Communication between adjacent cells in the cellular array comprises: any cell in the cellular array
transmits data to an adjacent cell under the control of its microprocessor.
In this embodiment, each cell is provided with a bus controller connected to the cellular array bus.
An internal bus is provided inside the cell; the non-volatile random access memory is a slave device
(Slave) on the internal bus, and the bus controller and the microprocessor are master devices
(Master).
The steps in the above communication method in which a cell "determines whether the destination
address lies within this cell", "recognizes whether the communication is addressed to the
microprocessor of this cell", "recognizes whether this cell lies within the target region", and
"connects the non-volatile random access memory or the microprocessor" are all performed by the bus
controller, which connects to the non-volatile random access memory through the internal bus of the
cell.
In a specific implementation, read/write operations by the master CPU on the non-volatile random
access memory of any cell have higher priority than read/write operations by that cell's own
microprocessor on the same memory. That is, if the microprocessor in a cell needs to read or write the
non-volatile random access memory of that cell, it must wait until the master CPU has finished its
read/write operation on that memory.
For further implementation details of the communication method, reference may be made to the
implementation of the cellular array computing system described above; they are not repeated here.
In addition, an embodiment of the present invention provides a method of computing Monte Carlo integrals
using the above cellular array computing system. A Monte Carlo integral is a summation over random
numbers. It is a large-scale computation widely used in science and engineering, its principle is
relatively simple, and the computation follows the formula below.
$$S = \sum_{\mathrm{Random}(x)} F(x_1, x_2, \ldots, x_N)$$
The solution of this problem is used below to further show the advantages of the cellular array
computing system.
The computation of Monte Carlo integrals is very well suited to the above cellular array computing
system. The specific steps are as follows:
The master cpu selects the cell in a whole or target area in the cellular array, integrand F ()
Relative address section of the corresponding program broadcast to each selected cell;
The master cpu broadcasting instructions make the microprocessor of selected cell execute quilt since the relative address section
The corresponding programs of Product function F ();
After each cell completes integral operation, summation is stored in the address of agreement, is carried out after being read for the master cpu
Total summation.
In this embodiment, when the program corresponding to the integrand F() starts executing, its built-in
random number generator reads the ID of the cell as its seed, which ensures that each cell generates
different random numbers.
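Purely as a sketch of the steps above, the per-cell computation and the master CPU's final summation
can be modelled sequentially in software. The integrand (the product x1 * x2 on the unit square), the
integer cell IDs, and the function names are assumptions chosen for illustration; in the embodiment
the cells run in parallel and store their sums at an agreed address.

```python
import random


def integrand(x1: float, x2: float) -> float:
    """Hypothetical integrand F; its exact integral over the unit square is 0.25."""
    return x1 * x2


def cell_partial_sum(cell_id: int, samples: int) -> float:
    """What one cell computes: the generator is seeded with the cell ID, so
    each cell draws a different random sequence."""
    rng = random.Random(cell_id)
    total = 0.0
    for _ in range(samples):
        total += integrand(rng.random(), rng.random())
    return total


def master_total_sum(cell_ids, samples: int) -> float:
    """What the master CPU does afterwards: read each cell's sum (modelled
    here as a dictionary keyed by cell ID) and add the sums up."""
    agreed_address = {cid: cell_partial_sum(cid, samples) for cid in cell_ids}
    return sum(agreed_address.values())


# Dividing the total by the number of samples gives the usual estimate of the
# integral; this normalisation is an addition for illustration.
estimate = master_total_sum(range(16), 2000) / (16 * 2000)
```

Because each seed is a distinct cell ID, no two cells repeat the same sample sequence, which is the
property the embodiment relies on.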
In practice, thousands of cells start computing at the same time. Their computing capability is fully
released and is no longer constrained by the communication bottleneck of the prior art, so Monte Carlo
integrals can be computed more efficiently.
In practice, if the integrand F() is so complex that the memory of a single cell cannot hold it, the
problem can also be solved by pipelining. This embodiment therefore also provides another method of
computing Monte Carlo integrals using the above cellular array computing system, comprising:
The master cpu selects the cell in a whole or target area in the cellular array;
The master cpu is broadcasted one and is downloaded in program to the same segment relative address of each selected cell, and
Broadcasting instructions make the microprocessor of selected cell execute the download program since the relative address;The download program
The input of next step will be waited for;
The program corresponding to the integrand is split into two or more subprograms, and the master CPU
broadcasts each subprogram to the microprocessors of the selected cells;
Each microprocessor running the download program selects one of the subprograms to store according to
the ID of its own cell, so that the subprograms are deployed in sequence across a group of
successively adjacent cells;
The master CPU broadcasts an instruction that causes the microprocessors of each group of cells to
execute, in sequence, the subprograms into which the program corresponding to the integrand was split,
with the intermediate result of each stage passed to the next stage as input;
After each group of cells completes its integration, it stores its sum at an agreed address, where it
is read by the master CPU, which then computes the overall sum.
For example, as shown in Fig. 6, the integrand F() can be split into three parts f1, f2, and f3 (three
subprograms), which are deployed in adjacent cells, with the intermediate result of each stage passed
to the next stage as input.
Specifically, when master cpu broadcasts f1, f2, f3 to each MPU (note:It is not to be dealt into memory, is intended for MPU),
The MPU of program is downloaded in operation, it is selected in the coordinate x (such as with remainder of x/3) of rectangular coordinate system x-axis direction according to oneself ID
In a sub- program storage.In this way, by the broadcast in two stages, three subprograms are deployed to according to desirable rule
In all cells for participating in calculating.
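The selection rule just described (the remainder of x/3) can be sketched as follows. This is a
software model under stated assumptions, not the embodiment itself: the functions f1, f2, and f3 are
placeholder stages, and select_stage, deploy, and run_pipeline are hypothetical names.

```python
def select_stage(cell_x: int, num_stages: int = 3) -> int:
    """Index of the subprogram that the cell with x coordinate `cell_x` keeps
    (the remainder of x / num_stages)."""
    return cell_x % num_stages


def deploy(row_width: int, subprograms: list) -> list:
    """Subprogram stored by each cell along one row, ordered by x coordinate."""
    return [subprograms[select_stage(x, len(subprograms))] for x in range(row_width)]


def run_pipeline(stages: list, value):
    """Pass the intermediate result of each stage to the next stage as input."""
    for stage in stages:
        value = stage(value)
    return value


# Placeholder stages standing in for f1, f2, f3 (assumed for illustration).
def f1(v):
    return v + 1


def f2(v):
    return v * 2


def f3(v):
    return v - 3


row = deploy(6, [f1, f2, f3])         # cells x = 0..5 hold f1, f2, f3, f1, f2, f3
result = run_pipeline(row[0:3], 5)    # one group of three adjacent cells
```

Each run of three adjacent cells in a row thus forms one complete pipeline, so a row of six cells
holds two independent pipelines.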
In addition, since execution of the program corresponding to the integrand F() actually begins with
the first subprogram, it is still the first subprogram after the split that, when it starts executing,
has its built-in random number generator read the ID of the cell as its seed, ensuring that each cell
generates different random numbers.
Compared with a conventional computer architecture, the advantages of the cellular array computing
system's broadcast capability and of its internal network, which can transmit data in a massively
parallel fashion, are evident here. If this problem were computed on a traditional multi-core CPU
architecture, then whenever the cache of each CPU was insufficient, all the CPUs would need to read
the code of the integrand F() through the interface to memory, creating a bottleneck.
In this embodiment, the internal network of the cellular array is not limited to sending data to an
adjacent cell; it is also extended so that data can be sent from one cell to any other cell, enabling
inter-cell communication within the cellular array.
Specifically, any two cells in the cellular array of the cellular array computing system can
communicate without relying on the master CPU. The cells taking part in an inter-cell communication
include a start cell, an end cell, and relay cells. The start cell is the cell that sends data to the
end cell; the end cell is the cell that finally receives the data sent by the start cell; the relay
cells are cells that are successively adjacent along the inter-cell communication path and relay the
data sent by the start cell through the communication interfaces; and the inter-cell communication
path is the data transmit/receive path formed by the start cell, the relay cells, and the end cell.
By relaying data multiple times across the communication interfaces between adjacent cells, any two
cells in the cellular array can communicate without relying on the master CPU. This improves the
efficiency of inter-cell communication and reduces the processing load on the master CPU, which
further improves the overall performance of the computing system.
It should be noted that start cell, end cell, and relay cell are relative concepts defined with
respect to a particular inter-cell communication. A start cell may well act as a relay cell or end
cell in other inter-cell communications, and an end cell may likewise act as a relay cell or start
cell in other inter-cell communications.
In a specific implementation, a cell in the cellular array may further include a network controller
connected to the microprocessor. The network controller controls the sending and receiving of data
sent out, data relayed, or data finally received in inter-cell communication, and also sends interrupt
signals to the microprocessor. In this embodiment, a network controller is provided in each cell so
that data can be relayed quickly without disturbing the MPU, which reduces the processing load on the
MPU in the cell. In other embodiments, the network controller may be omitted and the relaying of data
performed by the MPU instead.
In this embodiment, "data sent out" refers to data that the start cell itself sends; "relayed data"
refers to data sent by the start cell that a relay cell passes on, which the relay cell itself has no
need to send; and "finally received data" refers to data received by the end cell, which has reached
its destination after multiple relays and will not be relayed further. In terms of content, the "data
sent out", the "relayed data", and the "finally received data" may be identical; they are simply
different names for the data at different stages of the communication.
In a specific implementation, a cell in the cellular array may further include one or more groups of
first-in-first-out (FIFO) queues connected to the network controller. Each group corresponds to one
cell adjacent to this cell and includes an input FIFO queue and an output FIFO queue. The input FIFO
queue stores data entering this cell that is either to be relayed or finally received; the output FIFO
queue stores data leaving this cell that is either to be relayed or sent by this cell to other cells.
Taking the communication mode between adjacent cells shown in Fig. 3 as an example, the structure of a
cell that performs inter-cell communication in the cellular array of this embodiment is shown in Fig.
7. The network controller in Fig. 7 is connected to the MPU and to four groups of FIFO queues. Each
group corresponds one-to-one to the cell adjacent to this cell in one of the four directions of the
two-dimensional plane (up, down, left, right). In a specific implementation, the communication channel
between each pair of adjacent cells may share one corresponding group of FIFO queues. Each group of
FIFO queues includes an input FIFO and an output FIFO. From the viewpoint of a given cell, the input
FIFO stores data input from an adjacent cell and the output FIFO stores data that this cell outputs to
an adjacent cell; the output FIFO of an adjacent cell is an input FIFO for this cell, and the output
FIFO of this cell is an input FIFO for the adjacent cell.
It should be noted that the cell shown in Fig. 7 has four groups of FIFO queues. A cell located at one
of the four corners of a rectangular cellular array has only two adjacent cells and therefore two
groups of FIFO queues; a cell located on one of the four edges of a rectangular cellular array (but
not at a corner) has three adjacent cells and therefore three groups of FIFO queues.
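As a small worked check of the counts above, the number of FIFO queue groups a cell needs can be
derived from its position. The sketch assumes the four-direction neighbourhood of Fig. 3 and a
rectangular array with zero-based (x, y) coordinates; the function name is hypothetical.

```python
def fifo_group_count(x: int, y: int, width: int, height: int) -> int:
    """Number of adjacent cells (and hence FIFO queue groups) for the cell at
    (x, y) in a width-by-height array, counting up/down/left/right neighbours."""
    neighbours = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return sum(0 <= nx < width and 0 <= ny < height for nx, ny in neighbours)
```

In a 4-by-4 array this gives two groups for the cell at (0, 0), three for the cell at (1, 0), and four
for the cell at (1, 1).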
In this embodiment, the network controller is also connected to the MPU in the cell and sends it
interrupt signals, such as FIFO empty, FIFO full, new data arrived, and data sent. The MPU, in turn,
can send data out through the network controller; data to be sent is usually placed first in the
corresponding output FIFO queue.
It should be noted that the cell structure in Fig. 7 shows only the modules related to inter-cell
communication. Those skilled in the art will understand that the cell structure shown in Fig. 7 can be
fully combined with the cell structure shown in Fig. 5.
In addition, this embodiment uses FIFO queues to store the data entering and leaving a cell, which
makes the relaying of data during inter-cell communication more efficient and reduces the processing
load on the MPU. In other embodiments, the data entering and leaving a cell may instead be held in
registers.
An embodiment of the present invention also provides a method of communication between cells in the
above cellular array computing system, comprising: a start cell in the cellular array sends the data
intended for an end cell, in a selected sending direction, to a cell adjacent to the start cell; when
any cell in the cellular array receives data sent out or relayed by an adjacent cell, it checks the ID
of the end cell indicated in the received data. If it determines that it is the end cell, it stores
the received data in its non-volatile random access memory or notifies its microprocessor to process
the received data. Otherwise it acts as a relay cell and, after selecting a sending direction, relays
the received data to a cell adjacent to itself.
In a specific implementation, every piece of data involved in inter-cell communication contains the
IDs of the start cell and the end cell. From the ID of the end cell indicated in the received data,
any cell can determine whether the data is addressed to itself or needs to be relayed further to
another adjacent cell. A piece of data reaches the end cell through the connections between adjacent
cells after multiple relays. If the end cell needs to respond to the data sent by the start cell, it
can send feedback data to the start cell according to the start cell's ID: the end cell takes the
start cell ID indicated in the received data as the end cell ID and indicates it in the feedback data
obtained by processing the received data. The end cell thereby becomes the start cell of a new
inter-cell communication, and the original start cell becomes the end cell of that communication.
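The role swap described above can be sketched with a minimal data record. The field names and the
Packet type are assumptions for illustration; the embodiment states only that each piece of data
carries the IDs of the start cell and the end cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical layout of one piece of inter-cell data."""
    start_id: tuple   # ID of the cell that sent the data
    end_id: tuple     # ID of the cell the data is addressed to
    payload: bytes


def is_for_this_cell(packet: Packet, my_id: tuple) -> bool:
    """A cell keeps the data if the end cell ID matches its own ID;
    otherwise the data has to be relayed to an adjacent cell."""
    return packet.end_id == my_id


def make_feedback(received: Packet, payload: bytes) -> Packet:
    """Build feedback data: the former end cell becomes the start cell and
    the former start cell becomes the end cell."""
    return Packet(start_id=received.end_id, end_id=received.start_id, payload=payload)
```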
In a specific implementation, in addition to the ID of the end cell, the data sent by the start cell
to the end cell also indicates what is to be accessed in the end cell: an address or the MPU. Storing
the received data in the non-volatile random access memory of the cell is performed by the end cell
after it identifies the address to be accessed that is indicated in the received data; notifying the
MPU of the cell to process the received data is performed by the end cell after it identifies the MPU
as indicated in the received data.
In practice, if the end cell identifies an address to be accessed in the received data, the network
controller in the end cell can write the received data directly to the corresponding address in the
cell's non-volatile random access memory. In this case cells can "reproduce": one cell can download a
program to another cell. If the end cell identifies the MPU as indicated in the received data, the
received data is handed to the MPU in the end cell for processing.
In this embodiment, since each cell in the cellular array also includes a network controller connected
to the MPU, the following are all completed under the control of the network controller: the start
cell sending data to the end cell; any cell in the cellular array receiving data sent out or relayed
by an adjacent cell and determining whether it is the end cell or a relay cell; storing the received
data in the non-volatile random access memory of the cell; and notifying the MPU of the cell to
process the received data.
In a specific implementation, the data sent by the start cell to the end cell is first placed by the
network controller into the output FIFO queue and then output by the network controller from the
output FIFO queue to the cell adjacent to the start cell. When any cell in the cellular array receives
data sent out or relayed by an adjacent cell, the received data is placed into the input FIFO queue
and, if it is determined that the data needs to be relayed, the data is then placed into the output
FIFO queue.
In addition, if the network controller determines that the input FIFO queue or the output FIFO queue
is empty or full, or if it receives data sent out or relayed by an adjacent cell, or sends or relays
data to an adjacent cell, it sends an interrupt signal to the microprocessor.
In a specific implementation, the start cell or a relay cell can select the sending direction as
follows. If a straight-line communication path can be formed between the start cell or relay cell and
the end cell, the sending direction is the direction along that straight line from the start cell or
relay cell toward the end cell. Otherwise, the sending direction is the direction from the start cell
or relay cell toward a candidate adjacent cell, a candidate adjacent cell being a cell that is
adjacent to the start cell or relay cell and closer to the end cell. There may of course be two
candidate adjacent cells; in that case, the one with fewer pending communication tasks for output data
is selected as the relay cell.
In this embodiment, the selection of the sending direction by the start cell or relay cell in the
manner described above can also be regarded as the path selection process for inter-cell
communication in the cellular array. Referring to Fig. 8, each rectangle in Fig. 8 represents one cell
in the cellular array, and all the cells shown in Fig. 8 are part of the entire cellular array. It is
assumed that communication between adjacent cells is carried out as shown in Fig. 3.
Suppose point A represents a start cell that is about to send data to the end cell at point C. Since a
straight-line communication path can clearly be formed between points A and C, the cell at point A
sends the data to the adjacent cell at point B. Likewise, the cell at point B, acting as a relay cell,
continues to relay the data along the straight line between points A and C toward the cell at point C.
The successively adjacent cells on the inter-cell communication path formed between points A and C
forward the data sent by the cell at point A repeatedly until it reaches the cell at point C.
Suppose point D represents another start cell that is about to send data to the end cell at point G.
Since a straight-line communication path clearly cannot be formed between points D and G, then among
the cells adjacent to the cell at point D, the cell at point E and the cell at point F are evidently
closer to the end cell at point G, so these two cells are the candidate adjacent cells of the cell at
point D. The one with fewer pending communication tasks for output data can be selected as the relay
cell; if the two cells have the same communication load for output data, one of them is chosen at
random as the relay cell. As shown in Fig. 8, choosing the cell at point E or the cell at point F
results in different inter-cell communication paths.
It should be noted that this embodiment describes the path selection for inter-cell communication
using the communication mode between adjacent cells shown in Fig. 3 as an example. Those skilled in
the art will appreciate that, if the communication mode between adjacent cells shown in Fig. 4 is
used, more sending directions will be available to choose from.
In summary, in practice the network controller of every cell that sends or relays data must select one
adjacent cell as the next stop. When the start point and the end point lie on a straight line, there
is generally only one reasonable choice. In other cases there are two equally reasonable choices, and
the network controller selects the neighbour whose traffic is relatively light.
When data enters an input FIFO queue, the network controller first examines it:
If the end point is this cell: if the end point is a specific relative address, then, since the
network controller is capable of direct memory access (DMA, Direct Memory Access), it stores the
received data directly at the corresponding address in the non-volatile random access memory and
notifies the MPU with an interrupt; if the end point is the MPU, it notifies the MPU directly with an
interrupt signal so that the MPU processes the data.
If the end point is another cell, or if the MPU of this cell is sending data: if the end point and
this cell lie on a straight line, the correct direction is selected and the data is sent to the
adjacent cell in that direction; in other cases there are two possible directions, and the adjacent
cell whose output FIFO queue is less busy is selected. If the output FIFO queues of the two candidate
adjacent cells are equally busy, one of the two adjacent cells may be chosen at random.
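One possible reading of the next-stop selection described above is sketched here. It assumes the
four-direction neighbourhood of Fig. 3, (x, y) cell IDs, and a per-direction load figure standing in
for how busy the relevant output FIFO queue is; all names and the load representation are hypothetical.

```python
import random


def candidate_directions(current: tuple, end: tuple) -> list:
    """Directions (dx, dy) that bring the data closer to the end cell: one
    direction if both cells lie on a straight line, otherwise two."""
    (cx, cy), (ex, ey) = current, end
    directions = []
    if ex != cx:
        directions.append((1, 0) if ex > cx else (-1, 0))
    if ey != cy:
        directions.append((0, 1) if ey > cy else (0, -1))
    return directions


def choose_next_stop(current: tuple, end: tuple, load: dict, rng=random):
    """Pick the adjacent cell to send to, or None if `current` is the end cell.

    `load` maps each direction to an assumed measure of pending output data;
    the less busy direction wins and ties are broken at random.
    """
    directions = candidate_directions(current, end)
    if not directions:
        return None
    least = min(load[d] for d in directions)
    best = [d for d in directions if load[d] == least]
    dx, dy = best[0] if len(best) == 1 else rng.choice(best)
    return (current[0] + dx, current[1] + dy)
```

Every hop chosen this way reduces the distance to the end cell by one, so under these assumptions the
data arrives after a number of hops equal to the sum of the horizontal and vertical offsets.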
In practice, when the thousands of MPUs in the cellular array compute together, how to deliver each
cell's output data to the master CPU becomes a problem. In general, each MPU can store its output data
at an agreed address in the non-volatile random access memory of its cell and let the master CPU read
it by polling each MPU in turn. However, this does not suit every problem. In some problems only a few
cells in the cellular array need to output data to the master CPU, and polling every MPU one by one is
then too inefficient.
Therefore, in the cellular array computing system provided by an embodiment of the present invention,
at least one dedicated output cell is further provided in the cellular array. The dedicated output
cell, acting as the end cell, receives and stores the output data of other cells intended for the
master CPU, and notifies the master CPU with an interrupt signal to read the output data.
In a specific implementation, a FIFO queue may also be set up in the non-volatile random access memory
of the dedicated output cell, and all output data from other cells to the master CPU is stored in this
FIFO queue. The FIFO queue should have enough storage space to hold all the output data from other
cells to the master CPU.
In practice, one or several cells in the cellular array may be selected as dedicated output cells;
generally, cells whose position makes communication with the master CPU more convenient can be chosen.
An interrupt line is provided between the dedicated output cell and the master CPU, over which the
dedicated output cell can send interrupt signals to the master CPU, for example when new output data
from another cell has arrived, when the FIFO queue set up in the MRAM is full, or when the FIFO queue
set up in the MRAM is empty.
Based on the above cellular array computing system provided with a dedicated output cell, an
embodiment of the present invention also provides a communication method in the cellular array
computing system, comprising: after receiving and storing the output data of other cells intended for
the master CPU, the dedicated output cell sends the master CPU an interrupt signal notifying it to
read; after receiving that interrupt signal, the master CPU reads the output data from the dedicated
output cell.
In a specific implementation, the other cells can send the output data to the dedicated output cell as
follows: any one of the other cells, acting as the start cell, sends the output data in a selected
sending direction to an adjacent cell; when any cell in the cellular array receives output data sent
by an adjacent cell, it checks whether the ID of the end cell indicated in the output data matches its
own ID. Since the end cell ID indicated in the output data is the ID of the dedicated output cell, a
match shows that this cell is the dedicated output cell, and the output data is stored in the
non-volatile random access memory of this cell. Otherwise this cell acts as a relay cell and, after
selecting a sending direction, relays the output data to a cell adjacent to itself.
While the other cells are sending the output data to the dedicated output cell, the start cell or a
relay cell can select the sending direction as follows. If a straight-line communication path can be
formed between the start cell or relay cell and the dedicated output cell, the sending direction is
the direction along that straight line from the start cell or relay cell toward the dedicated output
cell. Otherwise, the sending direction is the direction from the start cell or relay cell toward a
candidate adjacent cell, a candidate adjacent cell being a cell that is adjacent to the start cell or
relay cell and closer to the dedicated output cell.
The operation of the dedicated output cell of this embodiment can also be understood with reference to
Fig. 9. Fig. 9 shows the master CPU, the cellular array, and the cellular array bus; each small square
in the cellular array represents one cell. The cell at point J (the cell drawn with a bold frame) is
the dedicated output cell. Fig. 9 further shows the structure of the dedicated output cell, as
indicated by the dashed arrow: the MRAM in the dedicated output cell contains a FIFO queue that stores
all the output data of other cells intended for the master CPU.
Suppose that the cell at point H and the cell at point I need to provide output data to the master
CPU. The output data can then be sent to the cell at point J by means of inter-cell communication; the
inter-cell communication paths from point H to point J and from point I to point J are shown in Fig.
9. Since inter-cell communication has been described in detail above, it is not repeated here.
After the cell at point J receives the output data sent by the cell at point H or the cell at point I,
it can send the master CPU an interrupt signal notifying it to read. After the master CPU receives
that interrupt signal, it can read the output data from the cell at point J over the cellular array
bus.
By providing a dedicated output cell in the cellular array, using it as the end cell to receive and
store the output data of other cells intended for the master CPU, and notifying the master CPU by an
interrupt signal to read that output data, the efficiency with which the master CPU reads output data
is improved when only a few cells need to output data to the master CPU.
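The behaviour of a dedicated output cell can be modelled in software roughly as follows. This is a
sketch under assumptions: the class and method names are hypothetical, the FIFO queue held in the
cell's memory is modelled with a deque, and the interrupt line to the master CPU is modelled as a
callback.

```python
from collections import deque


class DedicatedOutputCell:
    """Software model of a dedicated output cell (names are illustrative)."""

    def __init__(self, cell_id, notify_master):
        self.cell_id = cell_id
        self.fifo = deque()                  # stands in for the FIFO queue in memory
        self.notify_master = notify_master   # stands in for the interrupt line

    def receive(self, start_id, end_id, payload) -> bool:
        """Store output data addressed to this cell and raise an interrupt.

        Returns False when the end cell ID does not match, in which case the
        data would have to be relayed instead.
        """
        if end_id != self.cell_id:
            return False
        self.fifo.append((start_id, payload))
        self.notify_master()
        return True

    def read_all(self) -> list:
        """What the master CPU does after the interrupt: drain the queue in
        arrival order."""
        items = list(self.fifo)
        self.fifo.clear()
        return items
```

With this arrangement the master CPU reads only when notified, rather than polling every cell.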
Another example of an application of the above cellular array computing system is described below.
Speech recognition can be performed by comparing the input speech signal against a known speech
library; the comparison can be made in the time domain or in the frequency domain. As the number of
words to be compared grows, for example to tens of thousands once different accents are taken into
account, the computing capability of a few CPUs alone becomes insufficient for real-time speech
recognition.
The cellular array computing system provided by an embodiment of the present invention is very well
suited to solving problems of this kind.
To this end, an embodiment of the present invention also provides a method of data comparison using
the above cellular array computing system, comprising: after selecting all cells in the cellular
array, or the cells in a target region, the master CPU broadcasts a comparison program to the
non-volatile random access memory of each cell; the master CPU writes the sample that each selected
cell is responsible for comparing to the agreed address of that cell; the master CPU broadcasts an
instruction to the microprocessors of the selected cells so that each microprocessor, after completing
initialization, waits for the input of the data to be compared; the master CPU broadcasts the data to
be compared to the microprocessors of the selected cells; the microprocessor of each selected cell
runs the comparison program to compare the received data with the sample that its cell is responsible
for, and, if the comparison finds that the two match, uses the communication method in the cellular
array computing system described above to send the comparison result as output data to the dedicated
output cell, to be read by the master CPU.
In a specific implementation, the data to be compared may be voice data to be recognized, image data to be recognized, or any other data that needs to be compared.
In practice, each MPU continuously receives voice data and compares it. Typically, among hundreds to thousands of cells, only one or a few find that the data to be compared matches the sample they are responsible for. Those cells send their comparison results to the dedicated output cell, which notifies the master CPU by interrupt signal to receive them.
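The comparison flow above can be sketched in a few lines of Python. This is a minimal software model under stated assumptions, not the patented hardware: the names `Cell`, `OutputCell`, and `run_comparison` are hypothetical, the comparison program is reduced to an exact byte match, and the loop over cells stands in for work that the cells would do in parallel.

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    cell_id: tuple   # (row, col) position in the array, used as the cell ID
    sample: bytes    # reference sample written by the master CPU

    def compare(self, data: bytes) -> bool:
        # Stand-in for the comparison program: exact match only (assumption).
        return data == self.sample

@dataclass
class OutputCell:
    pending: list = field(default_factory=list)
    interrupt: bool = False

    def receive(self, result):
        self.pending.append(result)
        self.interrupt = True          # models the interrupt to the master CPU

    def read_all(self):
        results, self.pending = self.pending, []
        self.interrupt = False
        return results

def run_comparison(samples, data, cols):
    # The master CPU writes one sample per selected cell (row-major order).
    cells = [Cell((i // cols, i % cols), s) for i, s in enumerate(samples)]
    out = OutputCell()
    for cell in cells:                 # conceptually parallel across cells
        if cell.compare(data):         # only matching cells report a result
            out.receive(cell.cell_id)
    # The master CPU reads only after it has been interrupted.
    return out.read_all() if out.interrupt else []
```

In this model the master CPU touches the output cell only when at least one cell reports a match, which mirrors the point that few cells produce output.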
If the data to be compared is voice data, the comparison can be performed in the time domain or the frequency domain. In the latter case, the master CPU can first segment the data and apply a fast Fourier transform (FFT), then broadcast the frequency-domain voice data to the MPUs of the selected cells.
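The segment-then-transform step can be illustrated as follows. This is a sketch under assumptions: the frame size and hop are arbitrary, the function names are hypothetical, and a direct discrete Fourier transform is used in place of an FFT for brevity (it yields the same spectrum, only more slowly).

```python
import cmath

def frames(signal, size, hop):
    # Cut the signal into fixed-size segments; a trailing partial segment is dropped.
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def dft_magnitudes(frame):
    # Direct O(n^2) DFT of a real frame; returns magnitudes of bins 0..n/2.
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def to_frequency_domain(signal, size=8, hop=8):
    # What the master CPU would broadcast: one magnitude spectrum per segment.
    return [dft_magnitudes(f) for f in frames(signal, size, hop)]
```

Each cell would then compare the broadcast spectra with its own stored sample spectrum; how that comparison is scored is not specified in the text.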
Performing data comparison on the above cellular array computing system with a dedicated output cell lets a large number of cells in the array run the comparison program at the same time. The system therefore has very strong parallel processing capability, avoids the communication bottleneck between CPU and memory found in the prior art, and greatly improves real-time voice and image recognition.
As noted earlier, there is a simple way to broadcast information from one cell to a target area in the cellular array: the master CPU reads the information and then broadcasts it. This embodiment provides another implementation: the point-to-point communication function between cells is extended to region multicast, which supports a higher degree of parallelism and a much higher total bandwidth.
In the cellular array computing system of this embodiment, any cell in the cellular array can also act as the start cell and multicast to all cells in a target area. A cell that takes part in the multicast and lies inside the target area acts as the start cell, as an end cell, or as both a relay cell and an end cell; a cell that takes part in the multicast and lies outside the target area acts as the start cell or as a relay cell.
In a specific implementation, the network controller connected to the microprocessor in each cell not only handles inter-cell communication between any two cells but also, during multicast, controls the sending and receiving of data that the cell originates, relays, or finally receives. The network controller is also used to send interrupt signals to the microprocessor.
In practice, the original sender of an inter-cell multicast (the cell in the cellular array acting as the start cell) is responsible for indicating the target area, and the multicast is still completed through a series of relays. Those skilled in the art will appreciate that inter-cell multicast can also be regarded as the superposition of multiple point-to-point communications between cells, so its implementation can follow the implementation of communication between any two cells. For example, the cells in the cellular array may likewise include one or more groups of first-in-first-out (FIFO) queues connected to the network controller; the details are not repeated here.
Building on the support for inter-cell multicast in the above cellular array computing system, an embodiment of the present invention also provides an inter-cell multicast communication method for that system, including: when any cell in the cellular array acts as the start cell and initiates a multicast to all cells in a target area, if the start cell is inside the target area, it sends the inter-cell multicast data to all of its adjacent cells in the target area and updates the target area for each of those adjacent cells; otherwise it sends the inter-cell multicast data to an adjacent cell in the direction that approaches the target area; if a cell outside the target area receives inter-cell multicast data from an adjacent cell, then, after determining that the target area indicated in the data does not include itself, it acts as a relay cell and forwards the data to an adjacent cell in the direction that approaches the target area; if a cell inside the target area receives inter-cell multicast data from an adjacent cell, then, after determining that the target area indicated in the data includes itself, it acts as an end cell and either stores the received data in its non-volatile random access memory or notifies its microprocessor to process the data; if the target area still contains cells adjacent to it, it also acts as a relay cell, forwards the received data to all of its adjacent cells in the target area, and updates the target area for each of those adjacent cells. The updated target area consists of one or more target areas obtained by dividing the target area before the update; each cell in the pre-update target area that is adjacent to the cell that sent or relayed the data is contained in its own updated target area, and the cell that sent or relayed the data is excluded from every updated target area.
Note that the master CPU can also broadcast the data of one cell to a target area in the cellular array. To distinguish it from the "broadcast data of the master CPU", this embodiment calls the data involved in inter-cell multicast "inter-cell multicast data". The cell that initiates the multicast specifies the target area, and the IDs of all cells in the target area, or the range of those IDs, are indicated in the inter-cell multicast data. Any cell that receives the data can therefore use the indicated target area to decide whether the data is finally received by itself, needs to be relayed onward to other adjacent cells, or both.
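The receive-side decision can be sketched as below. The assumptions are loud ones: the target area is a rectangle encoded as a hypothetical tuple `(r0, c0, r1, c1)` of inclusive row and column bounds, cell IDs are `(row, col)` positions, and each cell has at most four neighbors (up, down, left, right). None of these encodings is fixed by the text.

```python
def in_target(cell, target):
    # True when the cell's (row, col) ID falls inside the inclusive rectangle.
    r, c = cell
    r0, c0, r1, c1 = target
    return r0 <= r <= r1 and c0 <= c <= c1

def on_receive(cell, target):
    """Return the set of actions a cell takes for received multicast data."""
    actions = set()
    if not in_target(cell, target):
        actions.add("relay")            # outside the target: pure relay cell
        return actions
    actions.add("deliver")              # inside the target: end cell
    r, c = cell
    neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    if any(in_target(n, target) for n in neighbors):
        actions.add("relay")            # target still has adjacent cells to serve
    return actions
```

Here the target passed in is the one indicated in the received data, which already excludes every cell that sent or relayed it earlier.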
In addition, updating the target area for each adjacent cell specifically means dividing the pre-update target area into one or more target areas, with the cell that sent or relayed the inter-cell multicast data excluded from all of them. Each updated target area contains one of the adjacent cells (that is, a cell in the pre-update target area adjacent to the cell that sent or relayed the data), and each adjacent cell continues the inter-cell multicast within its own updated target area. The target area indicated in the inter-cell multicast data is updated accordingly.
This embodiment is described using the adjacent-cell communication mode shown in Fig. 3 and a rectangular target area determined by the start cell that initiates the multicast. The inter-cell multicast scheme given here is a convenient and efficient choice for actual implementation; those skilled in the art will understand that, in other embodiments, the inter-cell multicast method of the above cellular array computing system applies equally to other adjacent-cell communication modes or to target areas of other shapes.
In a specific implementation, the way data is sent or relayed depends on the position of the cell acting as start cell or relay cell.
When a first cell acting as start cell or relay cell is located at a corner of a rectangular target area: if one of the two sides of the rectangular target area that meet at the first cell contains only 1 cell, the updated target area is the rectangular area formed by the other of those two sides after the first cell is excluded; otherwise the updated target area consists of two rectangular target areas, one of which is the rectangular area formed by either of those two sides after the first cell is excluded. In this embodiment, "first cell" is the collective name for cells located at a corner of the rectangular target area.
Referring to Fig. 10, suppose the cell at point K is the start cell that initiates the inter-cell multicast, or a relay cell responsible for relaying the inter-cell multicast data, and rectangular target area 101 is the target area determined before the cell at point K sends or relays the data. The cell at point K is inside rectangular target area 101 and at one of its corners. Because the horizontal side of rectangular target area 101 contains only 1 cell, the cell at point K has only one neighbor available as the next relay. The network controller of the cell therefore sends the inter-cell multicast data to the cell at point L and updates rectangular target area 101; the updated target area is rectangular target area 102, which is rectangular target area 101 with the cell at point K excluded. As the target area keeps being updated, relaying stops once only the last cell remains in the target area.
Suppose instead that the cell at point M is the start cell that initiates the inter-cell multicast, or a relay cell responsible for relaying the inter-cell multicast data, and rectangular target area 103 is the target area determined before the cell at point M sends or relays the data. The cell at point M is inside rectangular target area 103 and at one of its corners. Because both sides of rectangular target area 103 that meet at this corner contain more than 1 cell, the cell at point M has two neighbors available as next relays. The network controller of the cell therefore sends the inter-cell multicast data to the cell at point N and the cell at point O and updates rectangular target area 103. The updated target area consists of two rectangular target areas, 104 and 105, which together are rectangular target area 103 with the cell at point M excluded. Rectangular target areas 104 and 105 are then treated as independent target areas in which data relay continues in the same way. As the target area keeps being updated, relaying stops once only the last cell remains in a target area.
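The corner case can be written down as a small function. This is an illustrative sketch, assuming a hypothetical rectangle encoding `(r0, c0, r1, c1)` with inclusive bounds and `(row, col)` cell IDs; it always peels off the horizontal side first, which is one of the two choices the text allows.

```python
def split_from_corner(cell, target):
    """Return [(neighbor, sub_target), ...] for a cell at a corner of target."""
    r, c = cell
    r0, c0, r1, c1 = target
    if r not in (r0, r1) or c not in (c0, c1):
        raise ValueError("cell is not at a corner of the target")
    out = []
    # Horizontal side through the corner, with the corner cell itself excluded.
    if c0 < c1:
        if c == c0:
            out.append(((r, c + 1), (r, c + 1, r, c1)))
        else:
            out.append(((r, c - 1), (r, c0, r, c - 1)))
    # Remainder of the rectangle: every other row, at full width.
    if r0 < r1:
        if r == r0:
            out.append(((r + 1, c), (r + 1, c0, r1, c1)))
        else:
            out.append(((r - 1, c), (r0, c0, r - 1, c1)))
    return out
```

A one-column target yields a single neighbor and sub-target, as in the point K example; a wider target yields two, as in the point M example; a single-cell target yields an empty list, which corresponds to relaying stopping.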
When a second cell acting as start cell or relay cell is located on a side of a rectangular target area: if the sides of the rectangular target area adjacent to the side on which the second cell lies contain only 1 cell, the updated target area consists of the two rectangular target areas formed by the side on which the second cell lies after the second cell is excluded; otherwise the updated target area consists of three rectangular target areas, two of which are the two rectangular areas formed by that side after the second cell is excluded. In this embodiment, "second cell" is the collective name for cells located on a side of the rectangular target area.
Referring to Fig. 11, suppose the cell at point P is the start cell that initiates the inter-cell multicast, or a relay cell responsible for relaying the inter-cell multicast data, and rectangular target area 111 is the target area determined before the cell at point P sends or relays the data. The cell at point P is inside rectangular target area 111 and on one of its sides. Because the sides of rectangular target area 111 adjacent to the side on which the cell at point P lies contain more than 1 cell, the cell at point P has three neighbors available as next relays. The network controller of the cell sends the inter-cell multicast data to the cells at points Q, R, and S and updates rectangular target area 111. The updated target area consists of three rectangular target areas, 112, 113, and 114, which together are rectangular target area 111 with the cell at point P excluded. Rectangular target areas 112 and 113 are the two rectangular areas formed by the side on which the cell at point P lies after that cell is excluded. Rectangular target areas 112, 113, and 114 are then treated as independent target areas in which data relay continues in the same way. As the target area keeps being updated, relaying stops once only the last cell remains in a target area.
It will be understood that if (in a case not shown in Fig. 11) the sides of the target area adjacent to the side on which the cell at point P lies contain only 1 cell, the cell at point P has two neighbors available as next relays. The network controller of the cell sends the inter-cell multicast data to the cell at point Q and the cell at point R and updates the target area; the updated target area consists of two rectangular target areas, namely rectangular target areas 112 and 113.
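The side case admits a similar sketch, again assuming a hypothetical inclusive rectangle encoding `(r0, c0, r1, c1)` and `(row, col)` cell IDs. The function handles only cells that lie on a side without being at a corner.

```python
def split_from_edge(cell, target):
    """Return [(neighbor, sub_target), ...] for a non-corner cell on a side."""
    r, c = cell
    r0, c0, r1, c1 = target
    out = []
    if r in (r0, r1) and c0 < c < c1:          # on the top or bottom side
        out.append(((r, c - 1), (r, c0, r, c - 1)))    # side segment to the left
        out.append(((r, c + 1), (r, c + 1, r, c1)))    # side segment to the right
        if r0 < r1:                            # rows beyond the side, if any
            if r == r0:
                out.append(((r + 1, c), (r + 1, c0, r1, c1)))
            else:
                out.append(((r - 1, c), (r0, c0, r - 1, c1)))
    elif c in (c0, c1) and r0 < r < r1:        # on the left or right side
        out.append(((r - 1, c), (r0, c, r - 1, c)))    # side segment above
        out.append(((r + 1, c), (r + 1, c, r1, c)))    # side segment below
        if c0 < c1:                            # columns beyond the side, if any
            if c == c0:
                out.append(((r, c + 1), (r0, c + 1, r1, c1)))
            else:
                out.append(((r, c - 1), (r0, c0, r1, c - 1)))
    else:
        raise ValueError("cell is not strictly on a side of the target")
    return out
```

With a target more than one cell thick the function returns three sub-targets, as for rectangular target areas 112, 113, and 114; with a one-cell-thick target it returns only the two side segments.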
When a third cell acting as start cell is located in the interior of a rectangular target area, the updated target area consists of four rectangular target areas. Two of them are the two rectangular areas formed by the row or column containing the third cell after the third cell is excluded; the other two are the two rectangular areas formed when the pre-update rectangular target area is divided by that row or column. In this embodiment, "third cell" is the collective name for cells located in the interior of the rectangular target area, where the interior is the region other than the "corners" and "sides".
Referring to Fig. 12, suppose the cell at point T is the start cell that initiates the inter-cell multicast (in this embodiment the cell at point T cannot be a relay cell responsible for relaying inter-cell multicast data), and rectangular target area 121 is the target area determined before the cell at point T sends the data. The cell at point T is in the interior of rectangular target area 121 and has four neighbors available as next relays. The network controller of the cell sends the inter-cell multicast data to the cells at points U, V, W, and X and updates rectangular target area 121. The updated target area consists of four rectangular target areas, 122, 123, 124, and 125, which together are rectangular target area 121 with the cell at point T excluded. Rectangular target areas 122 and 123 are the two rectangular areas formed by the row containing the cell at point T after that cell is excluded; rectangular target areas 124 and 125 are the two rectangular areas formed when rectangular target area 121 is divided by that row. Rectangular target areas 122, 123, 124, and 125 are then treated as independent target areas in which data relay continues in the same way. As the target area keeps being updated, relaying stops once only the last cell remains in a target area.
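For the interior case, the four-way division along the start cell's row can be sketched as follows. The rectangle encoding `(r0, c0, r1, c1)` with inclusive bounds and the function name are assumptions made for illustration; the text equally permits dividing along the column.

```python
def split_from_interior(cell, target):
    """Return four (neighbor, sub_target) pairs for a cell strictly inside target."""
    r, c = cell
    r0, c0, r1, c1 = target
    if not (r0 < r < r1 and c0 < c < c1):
        raise ValueError("cell is not in the interior of the target")
    return [
        ((r, c - 1), (r, c0, r, c - 1)),        # own row, left of the cell
        ((r, c + 1), (r, c + 1, r, c1)),        # own row, right of the cell
        ((r - 1, c), (r0, c0, r - 1, c1)),      # everything above the row
        ((r + 1, c), (r + 1, c0, r1, c1)),      # everything below the row
    ]
```

Each neighbor ends up at an end of a one-row segment or on a side of the block it receives, so later hops fall under the corner or side cases, consistent with the remark that an interior cell can only be a start cell.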
In this embodiment, when a fourth cell acting as start cell or relay cell is located outside the target area: if a straight-line communication path can be formed between the fourth cell and some cell in the target area, the fourth cell sends or relays the inter-cell multicast data along that straight line, from the fourth cell toward the target area; otherwise the sending direction is from the fourth cell toward a candidate adjacent cell, a candidate adjacent cell being one of the cells adjacent to the fourth cell that is closer to the target area. In this embodiment, "fourth cell" is the collective name for cells located outside the rectangular target area.
Referring to Fig. 13, suppose the cell at point Y1 is the start cell that initiates the inter-cell multicast, and rectangular target area 131 is the target area determined before the cell at point Y1 sends the data. The cell at point Y1 is outside rectangular target area 131. Because it lies between the extension lines of two opposite sides of the rectangular target area, a straight-line communication path can be formed between it and the cell at point Y3 in the target area, and only one neighbor can serve as the next relay. The network controller of the cell at point Y1 sends the inter-cell multicast data to that neighbor, the cell at point Y2, which becomes a relay cell responsible for relaying the data. The cell at point Y2 relays the data in the direction of the dashed arrow in Fig. 13 until it reaches the cell at point Y3. The cell at point Y3 is on a side of rectangular target area 131 and can complete the relay process within rectangular target area 131 using the method described above.
Still referring to Fig. 13, suppose the cell at point Z1 is the start cell that initiates the inter-cell multicast, and rectangular target area 131 is the target area determined before the cell at point Z1 sends the data. The cell at point Z1 is outside rectangular target area 131. Because it does not lie between the extension lines of two opposite sides of the rectangular target area, no straight-line communication path can be formed between it and any cell in the target area. Two neighbors can serve as the next relay, the cell at point Z2 and the cell at point Z3. These two cells are the candidate adjacent cells of the cell at point Z1, because among the cells adjacent to it they are the ones closer to rectangular target area 131. In practice, either one may be chosen arbitrarily, or the less heavily loaded one may be chosen as the next relay according to the actual communication situation; the less heavily loaded cell is the one with fewer pending communication tasks for outgoing data. Starting from the cell at point Z1, the inter-cell multicast data can follow either of two feasible relay paths until it reaches the cell at point Z4. The cell at point Z4 is at a corner of rectangular target area 131 and can complete the relay process within rectangular target area 131 using the method described above.
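Routing from outside the target area can be sketched as below. The rectangle encoding `(r0, c0, r1, c1)`, the function names, and the `load` dictionary (a count of pending outgoing transfers per cell) are all assumptions made for illustration.

```python
def next_hops(cell, target):
    """Candidate next hops for a cell that lies outside the target rectangle."""
    r, c = cell
    r0, c0, r1, c1 = target
    dr = (r < r0) - (r > r1)    # +1: target is below, -1: above, 0: row in range
    dc = (c < c0) - (c > c1)    # +1: target is to the right, -1: left, 0: in range
    if dr == 0 and dc == 0:
        raise ValueError("cell is inside the target")
    hops = []
    if dr:
        hops.append((r + dr, c))
    if dc:
        hops.append((r, c + dc))
    return hops                 # one hop: straight-line path; two: pick either

def choose_next_hop(cell, target, load=None):
    # Prefer the candidate with fewer pending outgoing transfers (assumed metric).
    load = load or {}
    return min(next_hops(cell, target), key=lambda n: load.get(n, 0))
```

A single candidate corresponds to the Y1 situation, where the cell lies between the extension lines of two opposite sides; two candidates correspond to the Z1 situation.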
The inter-cell multicast communication method of the cellular array computing system provided in this embodiment extends the point-to-point communication function between cells to region multicast. It supports a higher degree of parallelism and obtains a much higher total bandwidth, further improving the overall performance of the computing system.
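To see the whole scheme work end to end, the following sketch simulates a multicast from a start cell inside a rectangular target area and checks that every cell is reached exactly once. It is a compact software model under assumptions: rectangles are hypothetical `(r0, c0, r1, c1)` tuples with inclusive bounds, `split` folds the corner, side, and interior rules into one function, and hops are counted in a queue rather than occurring in parallel hardware.

```python
from collections import deque

def split(cell, rect):
    """Sub-targets handed to each in-target neighbor; `cell` itself is excluded."""
    r, c = cell
    r0, c0, r1, c1 = rect
    # Divide along the column only for a non-corner cell on the left or right
    # side; in every other position divide along the row.
    if c in (c0, c1) and r0 < r < r1:
        parts = [((r - 1, c), (r0, c, r - 1, c)), ((r + 1, c), (r + 1, c, r1, c)),
                 ((r, c - 1), (r0, c0, r1, c - 1)), ((r, c + 1), (r0, c + 1, r1, c1))]
    else:
        parts = [((r, c - 1), (r, c0, r, c - 1)), ((r, c + 1), (r, c + 1, r, c1)),
                 ((r - 1, c), (r0, c0, r - 1, c1)), ((r + 1, c), (r + 1, c0, r1, c1))]
    # Drop rectangles that turned out empty.
    return [(n, s) for n, s in parts if s[0] <= s[2] and s[1] <= s[3]]

def multicast(source, target):
    """Return {cell: hop count} for a start cell that lies inside the target."""
    received = {source: 0}
    queue = deque([(source, target, 0)])
    while queue:
        cell, rect, hops = queue.popleft()
        for neighbor, sub in split(cell, rect):
            assert neighbor not in received     # each cell is reached only once
            received[neighbor] = hops + 1
            queue.append((neighbor, sub, hops + 1))
    return received
```

Calling `multicast((1, 2), (0, 0, 3, 5))` returns one entry per cell of the 4 by 6 target. Because sub-targets never overlap, no cell receives the data twice, which is what lets many links carry data at the same time.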
As described above, a cell in the cellular array of the embodiments of the present invention combines three functions: memory, storage, and computation. The non-volatile random access memory in the cell provides random access to the data involved in the microprocessor's computation and also stores the software's instruction code and the data that must persist. However, non-volatile random access memory is usually expensive, so the portion of it that a cell can use as memory is limited. When the microprocessor in a cell has a large amount of data to process, the limited memory space reduces the microprocessor's processing efficiency, and how to extend a cell's memory space becomes a problem that needs to be solved.
Based on this consideration, an embodiment of the present invention gives another structure of the cellular array computing system. As shown in Fig. 14, in addition to the master CPU, cellular array, and cellular array bus described above, the cellular array computing system may further include at least one memory unit array. A memory unit array is a two-dimensional array composed of more than one memory unit. The cellular array and all memory unit arrays are stacked into a three-dimensional structure, and the memory units in each memory unit array are connected one-to-one with the cells in the cellular array. Each memory unit works together with the non-volatile random access memory, and the two jointly provide random access to the data involved in the microprocessor's computation.
In practice, the non-volatile random access memory in a cell can be MRAM, and the memory unit array can be an MRAM, DRAM, or SRAM silicon chip. Typically one or more lower-cost DRAM silicon chips are chosen, each DRAM silicon chip being a memory unit array formed of memory units whose positions match those of the cells in the cellular array. All DRAM silicon chips are then 3D-combined with the cellular array silicon chip, and a communication connection can be established through a through-silicon via (TSV) between any memory unit and its corresponding cell in the cellular array, thereby extending the memory of each cell.
In this embodiment of the present invention, at least one memory unit array composed of more than one memory unit is stacked with the cellular array into a three-dimensional structure, the memory units in each memory unit array are connected one-to-one with the cells in the cellular array, and the memory units provide random access to the data involved in the microprocessor's computation. The memory space of each cell in the cellular array can thus be extended at lower cost, improving the processing efficiency of the microprocessor in each cell.
Note that Fig. 14 shows only the case in which one memory unit array is stacked with the cellular array into a three-dimensional structure; those skilled in the art will equally understand the case in which more than one memory unit array is stacked with the cellular array into a three-dimensional structure.
Those skilled in the art will also understand that the communication methods described above apply equally to a cellular array computing system that includes the memory unit array: communication between the master CPU and each cell in the cellular array over the cellular array bus, communication between any two cells without relying on the master CPU, multicast from any cell to all cells in a target area, and use of a dedicated output cell in the cellular array as the end cell that receives and stores the output data of other cells for the master CPU to read.
Note that because the memory space of each cell in the cellular array has been extended, the master CPU can access not only the non-volatile random access memory of a cell but also the memory unit corresponding to that cell (when more than one memory unit array is stacked with the cellular array into the three-dimensional structure, each cell has more than one corresponding memory unit). The communication that the master CPU carries out with each cell over the cellular array bus therefore includes at least one of the following: reading or writing, by address, the non-volatile random access memory or corresponding memory unit of any cell in the cellular array; broadcasting data to the non-volatile random access memory or corresponding memory unit of every cell in a target area, written at the same relative address in each; sending instructions or data to, or reading status from, the microprocessor of any cell in the cellular array; broadcasting instructions to the microprocessors of all cells in a target area.
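The address-based read/write and the area broadcast can be modelled in software as follows. This is a sketch under assumptions, not the bus protocol: the class names, the region labels `"nvram"` and `"unit"`, the region sizes, and the rectangle encoding `(r0, c0, r1, c1)` are all hypothetical, and microprocessor instructions are left out.

```python
class CellMemory:
    def __init__(self, nvram_size, unit_size):
        self.regions = {
            "nvram": bytearray(nvram_size),   # non-volatile RAM inside the cell
            "unit": bytearray(unit_size),     # stacked memory unit of the cell
        }

class ArrayBus:
    def __init__(self, rows, cols, nvram_size=16, unit_size=32):
        self.cells = {(r, c): CellMemory(nvram_size, unit_size)
                      for r in range(rows) for c in range(cols)}

    def write(self, cell, region, offset, data):
        # Write by address into one region of one cell.
        buf = self.cells[cell].regions[region]
        if offset < 0 or offset + len(data) > len(buf):
            raise IndexError("write outside the region")
        buf[offset:offset + len(data)] = data

    def read(self, cell, region, offset, length):
        # Read by address from one region of one cell.
        buf = self.cells[cell].regions[region]
        if offset < 0 or offset + length > len(buf):
            raise IndexError("read outside the region")
        return bytes(buf[offset:offset + length])

    def broadcast(self, target, region, offset, data):
        # Same relative address in every cell of the inclusive target rectangle.
        r0, c0, r1, c1 = target
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                self.write((r, c), region, offset, data)
```

For example, `ArrayBus(3, 3).broadcast((0, 0, 1, 1), "unit", 4, b"abc")` places the same three bytes at offset 4 of the memory unit of each of the four cells in the area and leaves the other cells untouched.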
When a cell in the cellular array further includes a bus controller and a cell-internal bus, the cell-internal bus connects the microprocessor and the non-volatile random access memory and is also connected to the memory unit corresponding to the cell. The bus controller is connected to the cellular array bus, the microprocessor, and the cell-internal bus. It identifies communication between the master CPU and the cell, and either connects to the microprocessor to pass on the instructions or data sent by the master CPU or to read status, or connects through the cell-internal bus to the non-volatile random access memory or the corresponding memory unit to read or write data.
As described above, a cell in the cellular array of the embodiments of the present invention combines three functions: memory, storage, and computation. The non-volatile random access memory in the cell provides random access to the data involved in the microprocessor's computation and also stores the software's instruction code and the data that must persist. However, non-volatile random access memory is usually expensive, so the portion of it that a cell can use as storage is likewise limited. When a large number of files or a large amount of data must be stored in the cells of the cellular array, the limited storage space cannot meet the requirement and may even reduce the microprocessor's processing efficiency, so how to extend a cell's storage space is also a problem that needs to be solved.
Based on this consideration, an embodiment of the present invention gives another structure of the cellular array computing system. As shown in Fig. 15, in addition to the master CPU, cellular array, and cellular array bus described above, the cellular array computing system may further include at least one storage unit array. A storage unit array is a two-dimensional array composed of more than one storage unit. The cellular array and all storage unit arrays are stacked into a three-dimensional structure, and the storage units in each storage unit array are connected one-to-one with the cells in the cellular array. Each storage unit works together with the non-volatile random access memory, and the two jointly store the software's instruction code and the data that must persist.
In practice, the non-volatile random access memory in a cell can be MRAM, and the storage unit array can specifically be a flash memory silicon chip. Typically one or more NAND flash silicon chips, which cost less than MRAM, are chosen, each NAND flash silicon chip being a storage unit array formed of storage units whose positions match those of the cells in the cellular array. All NAND flash silicon chips are then 3D-combined with a cellular array silicon chip, and any storage unit can be connected vertically through a TSV to its corresponding cell in the cellular array to establish a communication connection, thereby extending the storage space of each cell.
In a specific implementation, each cell in the cellular array further includes a storage controller connected to the microprocessor, which controls data storage access to the storage unit connected to the cell. After one or more NAND flash silicon chips are 3D-combined with a cellular array silicon chip, each cell in the cellular array can be configured with a NAND flash controller, through which the cell's MPU reads and writes the storage unit corresponding to the cell. When a large number of files or a large amount of data is stored in NAND flash, a search can be carried out by every cell through its own NAND channel and is greatly accelerated. Compiling a large software system likewise requires compiling thousands of source files; when those source files are stored in such a cellular array computing system, compilation is greatly accelerated as well.
The cellular array computing system provided by this embodiment of the present invention, which includes the storage unit array, can extend the storage space of each cell in the cellular array at lower cost and improve the data storage capacity of each cell.
It should be pointed out that illustrate only a memory cell array in Figure 15 is built up three-dimensional with the cellular array
The case where structure, those skilled in the art equally will also appreciate that more than one memory cell array is overlapped with the cellular array
The case where forming three-dimensional structure.
Those skilled in the art will appreciate that the communication methods described earlier apply equally to a cellular array computing system comprising the storage unit array: the master CPU communicating with each cell in the cellular array through the cellular array bus; any two cells communicating without relying on the master CPU; any cell group-sending to all cells in a target region; and dedicated output cells being provided in the cellular array as endpoint cells that receive and store the data other cells output to the master CPU, for the master CPU to read.
As described above, the space of the non-volatile random access memory in a cell is very limited, whether it is used as memory or as storage, so how to extend both the memory and the storage space of a cell is a problem to be solved. With this in mind, an embodiment of the invention provides yet another structure of the cellular array computing system. As shown in Figure 16, in addition to the master CPU, the cellular array, and the cellular array bus, the cellular array computing system may further include at least one memory unit array as described above and at least one storage unit array as described above. For the specific implementation of a cellular array computing system that includes both a memory unit array and a storage unit array, refer to the implementations above of the systems that include only a memory unit array or only a storage unit array; details are not repeated here.
The cellular array computing system provided by this embodiment, which includes both the storage unit array and the memory unit array, can extend both the storage space and the memory space of each cell in the cellular array at lower cost. This improves the data storage capacity of each cell and the processing efficiency of the microprocessor in each cell, and thereby further improves the overall performance of the computing system.
A high-end image sensor can capture image data in an extremely short time (on the order of microseconds), but the data volume of one frame of high-definition image is very large. In current camera systems, because of the communication bottleneck between the CPU and memory and storage in existing computer architectures, reading out the image data usually takes 1/60 to 1/30 of a second. In the vast majority of camera systems, therefore, video processing capability falls far behind the speed at which image data is acquired.
In other words, because the computer architecture used by prior-art camera systems has a communication bottleneck between the CPU and memory and storage, the overall performance of the computing system is severely affected, and current camera systems cannot process video nearly as fast as their image sensors capture image data.
To solve the above problem, an embodiment of the invention further provides a camera system that uses the cellular array computing system described above. As shown in Figure 17, the camera system includes a cellular array computing system and an image sensor. The cellular array computing system includes a master CPU, a cellular array, and a cellular array bus; for details, refer to the related embodiments above, which are not repeated here. The image sensor is a two-dimensional array composed of more than one image acquisition unit. The cellular array and the image sensor are stacked to form a three-dimensional structure, and the image acquisition units in the image sensor are connected one-to-one with the cells in the cellular array. Each image acquisition unit acquires image data for the corresponding cell in the cellular array to process.
In a practical implementation, the image sensor may be a CMOS image sensor, which is currently popular. In a low-end CMOS image sensor, the photosensitive elements and the other circuits (such as signal amplification circuits and analog-to-digital conversion circuits) may be fabricated on the same side of the silicon chip; in a high-end CMOS image sensor, the other circuits may be placed on the back side and connected to the photosensitive side through TSVs. In either case, the large number of image acquisition units in the image sensor can be divided into a two-dimensional array matching the positions of the cells in the cellular array, and the image sensor is then 3D-stacked with a cellular array silicon chip, so that each image acquisition unit can establish a communication link with its corresponding cell in the cellular array through a TSV.
It should be noted that, as those skilled in the art will know, the active surface of each image acquisition unit of the image sensor shown in Figure 17 must be on the underside of its silicon chip, because the lens can only be placed below; light arriving from above would be blocked by the other silicon chips. In other embodiments, the image sensor may instead be stacked on top of the cellular array silicon chip.
In addition, in practice each frame is divided among the image acquisition units of the image sensor, with each image acquisition unit responsible for capturing its corresponding part of the frame. The parts of a frame can then be processed in parallel, which improves image processing efficiency.
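As a rough software analogy for dividing a frame among image acquisition units, the sketch below splits a small frame into tiles keyed by cell position and processes each tile independently. The function names, the even-division assumption, and the mean-brightness placeholder for per-cell work are illustrative assumptions, not part of the described hardware.

```python
def split_frame(frame, rows, cols):
    """Split a 2D frame (list of pixel rows) into rows x cols tiles.

    Assumes the frame height and width divide evenly by rows and cols.
    Returns {(row, col): tile}, mirroring one image acquisition unit per cell.
    """
    height, width = len(frame), len(frame[0])
    tile_h, tile_w = height // rows, width // cols
    tiles = {}
    for r in range(rows):
        for c in range(cols):
            tiles[(r, c)] = [row[c * tile_w:(c + 1) * tile_w]
                             for row in frame[r * tile_h:(r + 1) * tile_h]]
    return tiles

def process_tile(tile):
    """Placeholder for per-cell image processing: mean brightness of the tile."""
    pixels = [p for row in tile for p in row]
    return sum(pixels) / len(pixels)

# A 4x4 test frame with pixel values 0..15, divided among a 2x2 cellular array.
frame = [[x + 4 * y for x in range(4)] for y in range(4)]
tiles = split_frame(frame, 2, 2)
results = {pos: process_tile(tile) for pos, tile in tiles.items()}
```

Each entry of `results` depends only on its own tile, so in the hardware described above the corresponding cells could compute them simultaneously.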
In this embodiment, each cell in the cellular array may also be configured with an image processor that processes the image data acquired by the image acquisition unit connected to that cell. In practice, the image processor may also be integrated into the microprocessor of the cell.
When the number of cells in the cellular array reaches the thousands, the number of image acquisition units in the image sensor is of the same scale. Each frame can then be read out and processed simultaneously over thousands of channels, improving the image processing capability of the camera system by a factor of hundreds and meeting the higher demands of high-speed photography.
In addition, the inventor further considered the problem that prior-art camera systems process video far more slowly than their image sensors capture image data, which limits the image processing capability of the camera system. Apart from high-speed digital photography, for which there is still no good solution, this problem also affects image recognition applications with demanding speed requirements. A future automatic vehicle control system, for example, may need to recognize a frame and react to it within an extremely short time, and there is likewise no effective answer to this requirement.
To this end, an embodiment of the invention further provides an image recognition system comprising a recognition unit and the camera system provided by the embodiments of the invention; the recognition unit recognizes the images obtained after processing by the camera system.
Those skilled in the art will appreciate that, if the camera system provided by the embodiments of the invention can read out and process each frame faster than the prior art, recognition of each frame can naturally be achieved faster as well.
Therefore, by using a camera system that includes the cellular array computing system in an image recognition system, the processing speed of each frame is greatly increased, each frame can be recognized in a shorter time, and the image recognition system gains a faster recognition capability that meets the requirements of high-speed image recognition.
For the specific implementation of the camera system and image recognition system provided by the embodiments of the invention, refer also to the description of the cellular array computing system above; details are not repeated here.
As described above, the non-volatile random access memory included in each cell of the cellular array of the cellular array computing system is very limited in space, whether it is used as memory or as storage, so how to extend both the memory and the storage space of a cell is a problem to be solved. The embodiments of the invention have therefore already provided a cellular array computing system that, in addition to the master CPU, the cellular array, and the cellular array bus, includes both the storage unit array and the memory unit array. Because it extends both the memory and the storage space of each cell, this system is particularly suited to storage and processing tasks involving large amounts of data. Since each frame of high-definition image captured by the image sensor in a camera system carries a large amount of data, extending the memory and storage space of each cell of the cellular array computing system in a camera system is particularly necessary.
With this in mind, an embodiment of the invention further provides another structure of the camera system. As shown in Figure 18, the camera system includes a cellular array computing system and an image sensor. The cellular array computing system includes a master CPU, a cellular array, and a cellular array bus; for details, refer to the related embodiments above, which are not repeated here. In this embodiment, the cellular array computing system further includes at least one storage unit array and at least one memory unit array. The storage unit array is a two-dimensional array composed of more than one storage unit; the cellular array and the one or more storage unit arrays are stacked to form a three-dimensional structure, and the storage units in each storage unit array are connected one-to-one with the cells in the cellular array. The storage units store the instruction code of software and the data that needs to be persisted. The memory unit array is a two-dimensional array composed of more than one memory unit; the cellular array and the one or more memory unit arrays are stacked to form a three-dimensional structure, and the memory units in each memory unit array are connected one-to-one with the cells in the cellular array. The memory units provide random access to the data involved when the microprocessor performs computation.
In practice, the storage unit array may be a flash memory silicon chip; the memory unit array may be an MRAM, DRAM, or SRAM silicon chip; the cellular array is on one silicon chip; and the image sensor may be a CMOS image sensor, which is currently popular.
For the cellular array computing system included in this other structure of the camera system, refer also to Figure 16.
By stacking at least one storage unit array composed of more than one storage unit, and at least one memory unit array composed of more than one memory unit, with the cellular array to form a three-dimensional structure, and by connecting the storage units in each storage unit array and the memory units in each memory unit array one-to-one with the cells in the cellular array, the storage space and memory space of each cell in the cellular array can be extended at lower cost. This both increases the data storage capacity of each cell and improves the processing efficiency of the microprocessor in each cell, further improving the image processing capability of the camera system.
It should be noted that this other structure of the camera system is described with the cellular array computing system including both at least one storage unit array and at least one memory unit array. In other embodiments, the cellular array computing system included in the camera system may include only one of the storage unit array and the memory unit array, as in the cellular array computing systems shown in Figure 14 or Figure 15.
In addition, in the camera system structure shown in Figure 18, the image sensor and the cellular array silicon chip are stacked directly together, with no storage unit array silicon chip or memory unit array silicon chip between them. This keeps the wiring between each image acquisition unit and its corresponding cell short, so that the images captured by the image sensor can be read and processed more quickly by the cells in the cellular array, improving image processing efficiency.
An embodiment of the invention likewise provides an image recognition system comprising a recognition unit and the camera system shown in Figure 18; the recognition unit recognizes the images obtained after processing by the camera system.
For the specific implementation of the camera system and image recognition system provided by the embodiments of the invention, refer also to the description of the related cellular array computing system above; details are not repeated here.
Neural networks are a common computational method in machine learning. They imitate the working principle of the human brain and are generally built on concepts such as the perceptron or the neuron. Training a neural network is a massively computation-intensive process: a neural network is in effect a function with a large number of parameters (possibly tens of thousands), and training requires a large number of scenarios, each with input data and a correct answer, which are used to adjust these parameters until the learning goal is reached. The amount of computation involved is very large, and the communication bottleneck between the CPU and memory and storage in prior-art computer architectures severely affects overall computer performance, which hinders the efficient implementation of neural network computation.
To this end, based on the cellular array computing system provided by the embodiments of the invention, an embodiment further provides a method of implementing neural network computation using that system, including: storing the code of one or more neuron functions in each cell of the cellular array that participates in the neural network computation; the master CPU selecting one or more cells to execute the neuron function code stored in each of them and to output the execution results to one or more target cells; and any cell participating in the neural network computation receiving the neuron function execution results output by other cells as its input data, executing the neuron function code it stores based on all of the input data from other cells, and either outputting the execution result to the cell holding the neuron function that needs that result, storing it at a preset address for the master CPU to read, or outputting it to the master CPU.
To make the method of implementing neural network computation with the cellular array computing system easier to understand, the principle of neural network computation is first briefly described.
A neural network is a common algorithm in machine learning that imitates the working principle of the human brain. The human brain consists of a large number of neurons. Each neuron receives its input through a large number (thousands) of synapses on its dendritic tree, which are in contact with other neurons; the coupling strength of each synapse has a memory function. The neuron outputs an excitatory or inhibitory signal, which can be carried by the axon to a large number of distant neurons.
In the neural network computation method modeled on the human brain, a neuron is simply a function. As shown in Figure 19, it has many inputs, of which x1, x2, and x3 are three; each input has a corresponding weight, and the usual computation multiplies each input by its weight and sums the products. The neuron outputs 0 or 1 (determined by a threshold), or a value between 0 and 1. It has many internal parameters (such as weight parameters and threshold parameters), and the process of adjusting these parameters is the machine learning process.
A typical neural network is a network in which the outputs and inputs of a large number of neurons are connected together, usually organized into multiple layers. As shown in Figure 20, the neural network in the figure is organized into three layers: the output of each first-layer neuron is sent to every second-layer neuron, and the output of each second-layer neuron is sent to every third-layer neuron. In Figure 20 the first layer may be called the input layer, the second layer the hidden layer, and the third layer the output layer; the outputs 0, 1, 2, ..., 8, 9 of the output layer are the final result of the neural network computation. Neural networks usually have this kind of layered structure. In practice, the network shown could be one used to recognize handwritten digits.
Training a neural network is a massively computation-intensive process. A network is in effect a function with a large number of parameters (possibly tens of thousands), and training requires a large number of scenarios, each with input data and a correct answer, which are used to adjust these parameters until the learning goal is reached (usually by the steepest descent method). For example, training a neural network to recognize handwritten digits requires tens of thousands of pictures; the parameters are adjusted until the network gives the correct result for every picture, after which handwriting recognition has a very high success rate.
The inventor considers neural network computation very well suited to implementation on the cellular array computing system provided by the embodiments of the invention. Because training requires evaluating the neural network a great many times, it can be accelerated by the parallel computation of a large number of MPUs.
In practice, if the number of neurons in the neural network is smaller than the number of cells in the cellular array computing system, one cell in the cellular array can perform the computation of a single neuron; otherwise, the computations of more than one neuron can be assigned to a single cell.
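One way to picture this assignment is the sketch below. The text does not specify how neurons are distributed when they outnumber the cells, so the round-robin policy and the name `assign_neurons` are assumptions chosen only for illustration.

```python
def assign_neurons(num_neurons, rows, cols):
    """Map neuron indices onto a rows x cols cellular array.

    Returns {cell_id: [neuron indices]}. With fewer neurons than cells each
    used cell holds exactly one neuron; otherwise some cells hold several.
    The round-robin order is an assumed policy, not taken from the source.
    """
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    mapping = {cell: [] for cell in cells}
    for n in range(num_neurons):
        mapping[cells[n % len(cells)]].append(n)
    return mapping

few = assign_neurons(3, 2, 2)    # fewer neurons than cells
many = assign_neurons(6, 2, 2)   # more neurons than cells
```

Any other placement policy would serve equally well here; in a real system the choice would presumably be driven by which cells need to exchange results most often.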
In a specific implementation, each cell stores the code of one or more neuron functions. The master CPU can distribute the neuron function code by data broadcast to every cell that acts as a neuron, and can start the neural network computation by broadcasting an instruction to a target region. For example, the master CPU may select all cells in the cellular array, or the cells in a target region, as the cells participating in the neural network computation, broadcast the same neuron function code to the same relative address range in each selected cell, and then broadcast an instruction that makes the microprocessors of one or more selected cells execute the neuron function code starting from that relative address. For details, refer also to the description of the specific implementation of the cellular array computing system above.
In the embodiments of the invention, each item of input data from another cell corresponds to one weight parameter. Executing the neuron function code stored in a cell based on all of the input data from other cells may include: multiplying each item of input data from other cells by its corresponding weight parameter, summing all of the products, and comparing the sum with a threshold parameter to determine the output value, which is the execution result. In practice, the weight parameters and the threshold parameter are stored in advance in the non-volatile random access memory of the cell.
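The weighted-sum-and-threshold computation just described can be written as a minimal sketch. The function name `neuron` and the choice of `>=` for the threshold comparison are assumptions; the text says only that the sum is compared with the threshold parameter.

```python
def neuron(inputs, weights, threshold):
    """Multiply each input by its weight, sum the products, compare to a threshold.

    Returns 1 if the weighted sum reaches the threshold, else 0.
    The use of >= (rather than >) is an assumption for illustration.
    """
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Example weights and threshold, as they might be stored in a cell's
# non-volatile memory (values chosen arbitrarily).
weights = [0.5, 0.9, 0.3]
threshold = 0.7
fired = neuron([1, 0, 1], weights, threshold)
silent = neuron([1, 0, 0], weights, threshold)
```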
In practice, if the execution result output after executing the neuron function code is a continuous value, the output can also be accelerated by table lookup.
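A lookup table for a continuous-valued output might look like the following sketch. The source does not name the activation function or the table size; the logistic sigmoid, the input range of -8 to 8, and the 1024 steps are all assumptions made for illustration.

```python
import math

# Assumed table parameters: input range and resolution are illustrative only.
LO, HI, STEPS = -8.0, 8.0, 1024

# Precompute the (assumed) sigmoid activation once; a cell could keep such a
# table in its non-volatile memory instead of evaluating exp() on every call.
TABLE = [1.0 / (1.0 + math.exp(-(LO + i * (HI - LO) / STEPS)))
         for i in range(STEPS + 1)]

def sigmoid_lut(x):
    """Approximate sigmoid(x) by the nearest precomputed table entry."""
    if x <= LO:
        return TABLE[0]
    if x >= HI:
        return TABLE[-1]
    index = round((x - LO) / (HI - LO) * STEPS)
    return TABLE[index]
```

The table trades a small approximation error, set by the step size, for replacing an exponential with a single memory read, which suits a small MPU with fast local memory.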
In a specific implementation, because the cellular array computing system provided by the embodiments of the invention supports cell-to-cell communication between any two cells, the execution results of the neuron function code can be transmitted by cell-to-cell communication during neural network computation. The huge bandwidth of the network inside the cellular array improves the efficiency of cell-to-cell communication and also reduces the processing load on the master CPU, further improving the overall performance of the computing system and making efficient neural network computation easier to achieve.
In a specific implementation, when the neural network is divided into layers, data transfer between layers can use the cell-to-cell group-send mechanism provided by the embodiments of the invention. In this case the cells participating in the computation of the same layer are located in the same target region. When any participating cell in a layer group-sends the execution result of its neuron function code to the target region, this is equivalent to sending that result to all of the cells in the next layer that participate in the neural network computation. Extending point-to-point communication between cells to region-wide group sending supports a higher degree of parallelism and yields a much higher total bandwidth, further improving the overall performance of the computing system and helping considerably to speed up neural network computation.
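The layer-to-layer group send can be modeled in software as each layer handing its complete output vector to every cell of the next layer. The sketch below is only such a model: the name `run_layers`, the list-of-tuples layout for per-cell parameters, and the small example network are assumptions.

```python
def run_layers(inputs, layers):
    """Propagate signals through layers of threshold neurons.

    layers is a list of layers; each layer is a list of (weights, threshold)
    pairs, one pair per participating cell. Passing the whole `signals` list
    to every cell of the next layer models the region-wide group send.
    """
    signals = inputs
    for layer in layers:
        signals = [
            1 if sum(x * w for x, w in zip(signals, weights)) >= threshold else 0
            for weights, threshold in layer
        ]
    return signals

# Illustrative two-layer network computing XOR of two binary inputs:
# hidden layer = (OR neuron, AND neuron); output neuron fires for OR and not AND.
xor_net = [
    [([1, 1], 1), ([1, 1], 2)],
    [([1, -1], 1)],
]
outputs = [run_layers([a, b], xor_net) for a in (0, 1) for b in (0, 1)]
```

Within one layer the list comprehension evaluates cells one after another; in the cellular array those evaluations would run on separate cells at the same time.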
In a specific implementation, if dedicated output cells are also provided in the cellular array, the results of the neural network computation can be sent to a dedicated output cell by cell-to-cell communication for the master CPU to read, so that the results are delivered to the master CPU more efficiently. In practice, if certain neuron functions in the last layer produce a positive output (in a common application, each neuron in the last layer is responsible for recognizing one specific data feature or image), the master CPU can be notified through this mechanism.
For implementing neural network computation by means of cell-to-cell communication, group sending between cells, and dedicated output cells, refer to the related description in the cellular array computing system embodiments; details are not repeated here.
In summary, the advantages of implementing neural network computation with the cellular array computing system provided by the embodiments of the invention are clear: the parallel computation of a large number of cells significantly speeds up computation, so that training speed is greatly increased; the huge bandwidth of the network inside the array and the group-send mechanism likewise help considerably to increase speed; and the non-volatility of MRAM means that a successfully trained chip can be replicated directly and sold as a product for solving a specific problem.
Those skilled in the art know that any CPU needs a software debugging interface, which is a necessary function for software development. Almost all CPUs on the market today have debugging interfaces, most of them designed to the JTAG (Joint Test Action Group) standard. Through this interface an external debugging device sends debugging instructions to the CPU, including pause, set breakpoint, and read/write memory; executing these instructions helps the developer inspect how the program is running and diagnose software faults.
For the cellular array computing system provided by the embodiments of the invention, in which a large number (for example, thousands) of CPUs are integrated on a single chip, how to implement debugging conveniently and efficiently is a problem to be solved.
With this in mind, an embodiment of the invention provides a cellular array computing system with a debugging interface. As shown in Figure 21, in addition to the master CPU, cellular array, and cellular array bus described above, the cellular array computing system may further include a debugging interface connected to the master CPU. Through the debugging interface, a debugging device controls the master CPU to debug the software running in each cell of the cellular array.
In the cellular array computing system provided by the embodiments of the invention, the master CPU may be implemented using a prior-art CPU and can therefore naturally support an existing debugging interface. The external debugging device connects to the master CPU through the debugging interface, and then uses the existing debugging interface's support for reading and writing memory to send debugging instructions to the cellular array.
In some cases, software requires a large amount of information to be exchanged between cells. If one cell stops at a breakpoint while the other cells keep running, the whole system can fall into disorder and debugging becomes impossible. For this reason, in the embodiments of the invention the cellular array computing system may further include a pause signal line (not shown in Figure 21) connecting every cell in the cellular array. When the software running in any cell hits a breakpoint and pauses, the pause signal line is used to send a pause signal to all other cells.
With the addition of a pause signal line connecting all cells, any cell in the cellular array computing system that hits a breakpoint sends a pause signal on this line. On receiving the pause signal, all cells immediately pause the software running in them, and network transmission between cells is paused at the same time. This prevents the whole system from falling into disorder and making debugging impossible, and ensures the stability and accuracy of debugging.
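The behavior of the pause signal line can be sketched as a small lockstep simulation. Everything here is an illustrative assumption: the class names `Cell` and `PauseLine`, the program-counter model, and the rule that all cells advance one step together before the asserted line is observed.

```python
class Cell:
    """A simulated cell with a program counter and an optional breakpoint."""
    def __init__(self, cell_id, breakpoint_at=None):
        self.cell_id = cell_id
        self.pc = 0
        self.breakpoint_at = breakpoint_at
        self.state = "running"

class PauseLine:
    """Models the shared pause signal line connecting every cell."""
    def __init__(self):
        self.asserted = False

def run(cells, line, max_steps):
    """Advance all cells in lockstep until the pause line is asserted."""
    for _ in range(max_steps):
        if line.asserted:
            # Every cell that is still running pauses on seeing the signal.
            for cell in cells:
                if cell.state == "running":
                    cell.state = "paused"
            break
        for cell in cells:
            cell.pc += 1
            if cell.pc == cell.breakpoint_at:
                cell.state = "breakpoint"
                line.asserted = True  # the cell drives the shared line
    return [(cell.cell_id, cell.state, cell.pc) for cell in cells]

line = PauseLine()
snapshot = run([Cell(0), Cell(1, breakpoint_at=3), Cell(2)], line, max_steps=10)
```

In the snapshot all three cells stop at the same step, which is the property that keeps the exchanged data consistent while the developer inspects the system.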
Further, the pause signal line may also be connected to the debugging device, so that the pause signal is also sent to the debugging device. In practice, the pause signal line can be routed outside the cellular array chip in order to connect to the debugging device.
Based on the cellular array computing system with a debugging interface described above, an embodiment of the invention further provides a debugging method for that system, including: the debugging device controlling the master CPU, through the debugging interface, to send debugging instructions to a target cell in the cellular array, to the cells in a target region, or to all cells, so as to debug the software running in each cell.
In the embodiments of the invention, software debugging is carried out by the debugging device controlling the master CPU through the debugging interface.
Specifically, the debugging instructions that the master CPU sends to a particular cell, or broadcasts to the cells in a target region or to all cells in the cellular array, include a pause instruction, read/write instructions, and a set-breakpoint instruction. The pause instruction can pause the operation of the MPU, pause data transmission between cells, and so on. The read/write instructions can read the internal registers of the MPU, including the register that records the MPU state (running, paused, hit a breakpoint, and so on), and can also read the data in the FIFOs of the communication interfaces between adjacent cells. The set-breakpoint instruction sets a breakpoint in the software running in a cell, specifically by writing it into the MPU of the cell.
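The three addressing modes and the instruction types listed above can be sketched as follows. The enum `DebugOp`, the functions `addressed_cells` and `issue`, the tuple message format, and the treatment of a target region as an inclusive rectangle of cell coordinates are all hypothetical; the source does not define an encoding for these instructions.

```python
from enum import Enum

class DebugOp(Enum):
    """Debug instruction types named in the text (encoding is hypothetical)."""
    PAUSE = "pause"
    READ = "read"
    WRITE = "write"
    SET_BREAKPOINT = "set_breakpoint"

def addressed_cells(all_cells, target=None, region=None):
    """Resolve the recipients: one target cell, a region, or all cells.

    A region is assumed to be an inclusive rectangle ((r0, c0), (r1, c1)).
    """
    if target is not None:
        return [target] if target in all_cells else []
    if region is not None:
        (r0, c0), (r1, c1) = region
        return [(r, c) for (r, c) in all_cells
                if r0 <= r <= r1 and c0 <= c <= c1]
    return list(all_cells)

def issue(all_cells, op, target=None, region=None, arg=None):
    """Build the list of (cell, op, arg) messages the master CPU would send."""
    return [(cell, op, arg) for cell in addressed_cells(all_cells, target, region)]

grid = [(r, c) for r in range(3) for c in range(3)]
single = issue(grid, DebugOp.PAUSE, target=(1, 1))
regional = issue(grid, DebugOp.SET_BREAKPOINT, region=((0, 0), (1, 1)), arg=0x40)
everyone = issue(grid, DebugOp.PAUSE)
```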
In a specific implementation, under the control of the debugging device, or on detecting another condition that requires a pause, the master CPU sends a pause instruction to a target cell, to the cells in a target region, or to all cells. A cell that receives the pause instruction pauses the software running in it. The software is then debugged by sending read/write instructions to read the contents of the internal registers of the microprocessor, the non-volatile random access memory (such as MRAM), and the communication interfaces between adjacent cells.
Further, by sending a set-breakpoint instruction, the master CPU can set a breakpoint in the software running in a target cell or in the cells of a target region, and can periodically read the state of the cells. The state of a cell indicates whether the software in the cell is running, paused, or has hit a breakpoint; the other conditions that require a pause include the software in a cell hitting a breakpoint.
In the embodiments of the invention, because the cellular array computing system further includes the pause signal line connecting every cell in the cellular array, the debugging method further includes: when the software running in any cell hits a breakpoint and pauses, that cell sends a pause signal to all other cells over the pause signal line; each cell that receives the pause signal pauses the software running in it and pauses data transmission between adjacent cells.
In the embodiments of the invention, the pause signal line is also connected to the debugging device, so that the pause signal can also be sent to the debugging device. The debugging method therefore further includes: when the debugging device receives a pause signal transmitted by any cell over the pause signal line, it pauses the operation of the master CPU. In practice, when a pause signal issued by any cell in the cellular array reaches the debugging device, the debugging device can pause the master CPU immediately in order to examine problems in the interaction between the master CPU and the cellular array, further ensuring the stability and accuracy of debugging.
For the specific implementation of the cellular array computing system with a debugging interface and its debugging method, refer also to the implementations of the cellular array computing systems with the other structures described above.
It should be noted that the embodiment of the present invention describes the cellular array computing system with a debugging interface using a two-dimensional cellular array as an example. In other embodiments, the cellular array may also be a three-dimensional cellular array formed by stacking more than one two-dimensional cellular array; in that case the concept of "adjacent cells" in the cellular array is no longer limited to a two-dimensional plane but extends to three-dimensional space. If the communication mode between adjacent cells shown in Fig. 3 is used within each two-dimensional cellular array, then in a spatial rectangular coordinate system any cell has adjacent cells in six directions: the positive and negative directions of the x-axis, of the y-axis, and of the z-axis. In actual implementation, when multiple 2D cellular array chips are stacked to form a 3D chip, vertical links between adjacent cells are established by through-silicon vias (TSVs); that is, adjacent cells located in two neighboring two-dimensional cellular arrays are connected for communication through TSVs. The 3D cellular array chip retains the low-power advantage while increasing the scale of the cellular array and expanding the bandwidth of internal communication.
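As an illustration of six-direction adjacency, the following sketch lists the adjacent cells of a cell in a stacked array. It assumes cells are identified by integer coordinates `(x, y, z)` and that cells on the boundary of the array simply have fewer neighbours (no wrap-around); the function name `neighbors_3d` and the `shape` parameter are hypothetical and do not appear in the embodiment.

```python
def neighbors_3d(pos, shape):
    """Return the coordinates of the cells adjacent to `pos`.

    pos   -- (x, y, z) coordinates of the cell
    shape -- (nx, ny, nz) size of the stacked array; nz is the number of
             stacked two-dimensional arrays
    """
    x, y, z = pos
    nx, ny, nz = shape
    # Positive and negative directions of the x-, y- and z-axes.
    steps = [(1, 0, 0), (-1, 0, 0),
             (0, 1, 0), (0, -1, 0),
             (0, 0, 1), (0, 0, -1)]   # z steps correspond to the TSV links
    result = []
    for dx, dy, dz in steps:
        cx, cy, cz = x + dx, y + dy, z + dz
        if 0 <= cx < nx and 0 <= cy < ny and 0 <= cz < nz:
            result.append((cx, cy, cz))
    return result
```

An interior cell such as `(1, 1, 1)` in a 3 x 3 x 3 array has all six neighbours, while under the no-wrap-around assumption a corner cell such as `(0, 0, 0)` has only three.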
Although the present invention is disclosed as above, the present invention is not limited thereto. Any person skilled in the art may make various changes or modifications without departing from the spirit and scope of the present invention; the protection scope of the present invention shall therefore be subject to the scope defined by the claims.
Claims (13)
1. A cellular array computing system, characterized by comprising: a master CPU, a cellular array, and a cellular array bus, the cellular array and the cellular array bus being integrated in one chip;
the cellular array is a two-dimensional array composed of more than one cell having both computing and storage functions, wherein each cell includes a microprocessor and a non-volatile random access memory; the non-volatile random access memory is used for random access to the data involved in computation by the microprocessor, and is also used to store the instruction code of software and data that needs to be persisted;
each cell stores its own position in the cellular array as an ID, to be read by the software or hardware in the cell;
the master CPU communicates with each cell in the cellular array through the cellular array bus;
a communication interface is provided between adjacent cells in the cellular array, and any cell in the cellular array transmits data to its adjacent cells under the control of its microprocessor.
2. The cellular array computing system according to claim 1, characterized in that the communication carried out by the master CPU with each cell in the cellular array through the cellular array bus includes at least one of the following:
reading and writing, by address, the non-volatile random access memory of any cell in the cellular array;
broadcasting data to the non-volatile random access memory of each cell in a target region of the cellular array, the data being written to the same relative address in the non-volatile random access memory of each cell in the target region;
sending instructions to, transmitting data to, or reading the state of the microprocessor of any cell in the cellular array;
broadcasting instructions to the microprocessors of all cells in the target region.
3. The cellular array computing system according to claim 1, characterized in that each cell in the cellular array further includes a bus controller and a cell internal bus; the bus controller is connected to the cellular array bus, the microprocessor, and the cell internal bus; the bus controller is used to identify communication between the master CPU and this cell, and either to connect the microprocessor so as to transfer instructions or data sent by the master CPU or to have its state read, or to connect the non-volatile random access memory through the cell internal bus for data read and write operations.
4. The cellular array computing system according to claim 1, characterized in that at least one of a floating-point processor and an image processor is integrated in the microprocessor.
5. The cellular array computing system according to claim 1, characterized in that the non-volatile random access memory is MRAM.
6. The cellular array computing system according to claim 1, characterized in that the master CPU, the cellular array, and the cellular array bus are integrated in one chip.
7. The cellular array computing system according to claim 1, characterized in that the master CPU, as an independent chip, communicates through a standard memory interface with the chip formed by the cellular array and the cellular array bus.
8. A communication method in the cellular array computing system according to claim 1, characterized by comprising:
any cell in the cellular array receives a target address broadcast by the master CPU on the cellular array bus and, if it determines that the target address is within this cell, connects the non-volatile random access memory of the cell so that the master CPU can perform data read and write operations;
a first special address segment is reserved in the system address space for communication between the master CPU and the microprocessors, and stores the ID of a target cell; if any cell in the cellular array, upon receiving the first special address segment, identifies the communication as being with the microprocessor of this cell, it connects the microprocessor of the cell to complete the subsequent instruction reception, data reception, and state reading operations;
a second special address segment is reserved in the system address space for the master CPU to broadcast instructions, the second special address segment containing cell IDs from which the range of a target region in the cellular array can be determined; if any cell in the cellular array, upon receiving the second special address segment, identifies that this cell is within the target region, it connects the microprocessor of the cell to transfer instructions or data sent by the master CPU or to have its state read, or connects the non-volatile random access memory of the cell for data read and write operations;
any cell in the cellular array transmits data to its adjacent cells under the control of its microprocessor.
9. The communication method in the cellular array computing system according to claim 8, characterized in that each cell in the cellular array further includes a bus controller and a cell internal bus, the bus controller being connected to the cellular array bus, the microprocessor, and the cell internal bus; for any cell in the cellular array, determining whether the target address is within this cell, identifying whether the communication is with the microprocessor of this cell, identifying whether this cell is within the target region, and connecting the non-volatile random access memory or the microprocessor are all performed by the bus controller, and the bus controller connects the non-volatile random access memory through the cell internal bus.
10. The communication method in the cellular array computing system according to claim 8, characterized in that read and write operations by the master CPU on the non-volatile random access memory of any cell in the cellular array have higher priority than read and write operations by the microprocessor in that cell on the same non-volatile random access memory.
11. A method for computing a Monte Carlo integral using the cellular array computing system according to any one of claims 1 to 7, characterized by comprising:
the master CPU selects all cells in the cellular array, or the cells in a target region, and broadcasts the program corresponding to the integrand to a relative address segment of each selected cell;
the master CPU broadcasts an instruction causing the microprocessor of each selected cell to execute the program corresponding to the integrand starting from the relative address segment;
after each cell completes its integral computation, it stores its sum at an agreed address, to be read by the master CPU, which then computes the total sum.
12. A method for computing a Monte Carlo integral using the cellular array computing system according to any one of claims 1 to 7, characterized by comprising:
the master CPU selects all cells in the cellular array, or the cells in a target region;
the master CPU broadcasts a download program to the same relative address segment of each selected cell, and broadcasts an instruction causing the microprocessor of each selected cell to execute the download program starting from that relative address;
the program corresponding to the integrand is split into two or more subprograms, and the master CPU broadcasts each subprogram to the microprocessors of the selected cells;
each microprocessor running the download program selects one of the subprograms to store according to the ID of the cell in which it resides, so that the subprograms are deployed in order across a group of successively adjacent cells;
the master CPU broadcasts an instruction causing the microprocessors of each group of cells to execute, in sequence, the subprograms into which the program corresponding to the integrand was split, the intermediate result of each stage being passed as input to the next stage;
after each group of cells completes its integral computation, it stores its sum at an agreed address, to be read by the master CPU, which then computes the total sum.
13. The method for computing a Monte Carlo integral using the cellular array computing system according to claim 11 or 12, characterized in that, when the program corresponding to the integrand starts executing, the random number generator it contains reads the ID of the cell as its seed.
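The flow of claims 11 and 13 can be sketched as a sequential software simulation. The sketch rests on several assumptions that are not part of the claims: the integration interval is [0, 1), each cell's memory is modelled as a Python dictionary, the agreed address `AGREED_ADDRESS` is an arbitrary illustrative value, and the seed is derived from the cell's (row, column) ID as `row * cols + col`. On the actual system the cells would run in parallel; here they are evaluated one after another.

```python
import random

AGREED_ADDRESS = 0x100  # hypothetical address where each cell stores its sum


def run_cell(cell_id, cols, integrand, samples):
    """Simulate one cell: seed the RNG from the cell ID, then sum the samples."""
    row, col = cell_id
    rng = random.Random(row * cols + col)   # cell ID used as the seed
    partial = sum(integrand(rng.random()) for _ in range(samples))
    # The cell stores its partial sum at the agreed address in its own memory.
    return {AGREED_ADDRESS: partial}


def monte_carlo_integral(integrand, rows, cols, samples_per_cell):
    """Simulate the master CPU: run every cell, read the sums, total them."""
    memories = {
        (r, c): run_cell((r, c), cols, integrand, samples_per_cell)
        for r in range(rows)
        for c in range(cols)
    }
    total = sum(mem[AGREED_ADDRESS] for mem in memories.values())
    # Estimate of the integral over [0, 1): mean of all sampled values.
    return total / (rows * cols * samples_per_cell)
```

For instance, `monte_carlo_integral(lambda x: x * x, 4, 4, 2000)` gives an estimate close to 1/3. Because every cell seeds its generator with a different ID, the cells draw different sample sequences and the run is reproducible.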
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510456294.3A CN105608490B (en) | 2015-07-29 | 2015-07-29 | Cellular array computing system and communication means therein |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105608490A CN105608490A (en) | 2016-05-25 |
CN105608490B true CN105608490B (en) | 2018-10-26 |
Family
ID=55988414
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510456294.3A Active CN105608490B (en) | 2015-07-29 | 2015-07-29 | Cellular array computing system and communication means therein |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105608490B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108256637A (en) * | 2016-12-28 | 2018-07-06 | 上海磁宇信息科技有限公司 | A kind of cellular array three-dimensional communication transmission method |
CN106951955B (en) * | 2017-03-09 | 2019-12-17 | 中国人民解放军军械工程学院 | Method for selecting electronic cell number in bus embryo electronic cell array |
KR102481256B1 (en) * | 2017-08-31 | 2022-12-23 | 캠브리콘 테크놀로지스 코퍼레이션 리미티드 | Chip device and related product |
US10635622B2 (en) * | 2018-04-03 | 2020-04-28 | Xilinx, Inc. | System-on-chip interface architecture |
US10866753B2 (en) | 2018-04-03 | 2020-12-15 | Xilinx, Inc. | Data processing engine arrangement in a device |
CN108897714B (en) * | 2018-07-03 | 2022-05-24 | 中国人民解放军国防科技大学 | Multi-core or many-core processor chip with autonomous region |
CN109886393B (en) * | 2019-02-26 | 2021-02-09 | 上海闪易半导体有限公司 | Storage and calculation integrated circuit and calculation method of neural network |
CN110362280A (en) * | 2019-09-04 | 2019-10-22 | 南京优存科技有限公司 | Mixing storage system based on the nearly data processing MRAM of low-power consumption neural network |
CN112732631A (en) * | 2020-12-25 | 2021-04-30 | 南京蓝洋智能科技有限公司 | Data transmission method between small chips |
CN112631989A (en) * | 2021-03-08 | 2021-04-09 | 南京蓝洋智能科技有限公司 | Data transmission method among small chips, among chips and among small chips |
CN112667557A (en) * | 2021-03-16 | 2021-04-16 | 南京蓝洋智能科技有限公司 | Data transmission method suitable for chiplet architecture |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1341242A (en) * | 1999-01-21 | 2002-03-20 | 索尼电脑娱乐公司 | High-speed processor system, method of using the same, and recording medium |
CN101354694A (en) * | 2007-07-26 | 2009-01-28 | 上海红神信息技术有限公司 | Ultra-high expanding super computing system based on MPU structure |
CN101681296A (en) * | 2008-02-29 | 2010-03-24 | 株式会社东芝 | Memory system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101330413B (en) * | 2007-06-22 | 2012-08-08 | 上海红神信息技术有限公司 | Method for expanding mixed multi-stage tensor based on around network and ultra-cube network structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||