CN209388304U - Software-definable computing-in-memory chip and electronic device - Google Patents

Software-definable computing-in-memory chip and electronic device

Publication number
CN209388304U
Authority
CN
China
Prior art keywords
module
flash memory
configuration information
programmable
memory process
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201920246699.8U
Other languages
Chinese (zh)
Inventor
王绍迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhicun Computing Technology Co ltd
Original Assignee
Beijing Zhi Cun Technology Co Ltd
Application filed by Beijing Zhi Cun Technology Co Ltd

Landscapes

  • Logic Circuits (AREA)

Abstract

The utility model provides a software-definable computing-in-memory chip. The flash memory processing array of the chip includes a plurality of flash memory processing subarrays, each for executing a different analog vector-matrix multiplication operation, and the programmable arithmetic operation module includes a plurality of programmable arithmetic operation units, each for implementing a different arithmetic operation. The control module combines and configures the modules of the chip according to the configuration information and finite-state-machine information of the actual application, realizing dynamic configuration of the circuit structure in the chip, so that the chip can flexibly adjust its circuit structure to the actual task. Peripheral circuits such as the ADC, DAC, registers, and programmable arithmetic operation units can be multiplexed, which reduces circuit area, meets the needs of integration and miniaturization, and effectively reduces chip cost.

Description

Software-definable computing-in-memory chip and electronic device
Technical field
The utility model relates to the field of semiconductor integrated circuits, and in particular to a software-definable computing-in-memory chip.
Background
Flash memory is a non-volatile memory that stores data by adjusting the threshold voltage of flash transistors. Depending on the flash transistor and array structure, flash memory is mainly divided into NOR flash and NAND flash. NAND flash is read and written in units of pages and blocks, offers large capacity at low cost, and is widely used in large-scale standalone memory. NOR flash supports random access to data; compared with NAND flash it has lower density, smaller capacity, and higher cost, and is mainly used in embedded memory.
In recent years, to overcome the bottleneck of the traditional von Neumann computing architecture, the computing-in-memory (CIM) chip architecture has attracted wide attention. Its basic idea is to perform logic computation directly in the memory, thereby reducing the volume and distance of data transfer between memory and processor, improving performance while reducing power consumption.
Existing computing-in-memory chip architectures are custom-designed for a single purpose: the circuit structure is fixed and cannot be flexibly adjusted to the actual task, and circuit modules cannot be shared. This results in a large circuit area that cannot meet the needs of integration and miniaturization.
Summary of the utility model
In view of this, the utility model provides a software-definable computing-in-memory chip and an electronic device. Through the cooperation of a plurality of flash memory processing subarrays, a plurality of programmable arithmetic operation units, and a control module, the circuit structure of the chip is dynamically configured according to the requirements of the actual application and can be flexibly adjusted to the actual task. Peripheral circuits such as the ADC, DAC, registers, and programmable arithmetic operation units can be multiplexed, which reduces circuit area and meets the needs of integration and miniaturization.
To achieve the above objects, the utility model adopts the following technical solutions:
In a first aspect, a software-definable computing-in-memory chip is provided, comprising: a flash memory processing array, a programmable arithmetic operation module, and a control module connected to the flash memory processing array and the programmable arithmetic operation module.
The flash memory processing array includes a plurality of flash memory processing subarrays, each for executing a different analog vector-matrix multiplication operation;
The programmable arithmetic operation module includes a plurality of programmable arithmetic operation units, each for implementing a different arithmetic operation;
The control module combines and configures the plurality of flash memory processing subarrays and the plurality of programmable arithmetic operation units according to configuration information, realizing dynamic configuration of the circuit structure in the chip.
Further, the software-definable computing-in-memory chip also includes:
An input interface module, for receiving external input data;
An input register file, connected to the input interface module, for storing the external input data or data to be processed;
A digital-to-analog conversion module, whose input is connected to the input register file and whose output is connected to the flash memory processing array, for converting the external input data or the data to be processed into an analog signal and sending it to the flash memory processing array, which performs an analog vector-matrix multiplication operation on the analog signal and outputs the operation result;
An analog-to-digital conversion module, whose input is connected to the flash memory processing array and whose output is connected to the programmable arithmetic operation module, for converting the analog vector-matrix multiplication result into a digital signal and sending it to the programmable arithmetic operation module, which performs an arithmetic operation on the digital signal and outputs the arithmetic operation result;
An output register file, connected to the programmable arithmetic operation module and the input register file, for temporarily storing the arithmetic operation result and either outputting it or sending it to the input register file as the data to be processed;
An output interface module, connected to the output register file, for receiving the output data of the output register file and outputting it externally;
Wherein, the control module is connected to the input interface module, the input register file, the digital-to-analog conversion module, the flash memory processing array, the analog-to-digital conversion module, the output register file, the programmable arithmetic operation module, and the output interface module, and dynamically configures these circuit modules according to the requirements of the actual application.
Further, the output of the input register file is also connected to the programmable arithmetic operation module.
Further, the plurality of programmable arithmetic operation units are connected in series, and each programmable arithmetic operation unit includes: a demultiplexer, an arithmetic operation subunit, and a multiplexer;
The input of the demultiplexer is connected to the preceding programmable arithmetic operation unit or to the analog-to-digital conversion module. One of its outputs is connected to the arithmetic operation subunit; its other output and the output of the arithmetic operation subunit are connected, through the multiplexer, to the next programmable arithmetic operation unit or to the output register file. Its control terminal is connected to the control module.
Further, the software-definable computing-in-memory chip also includes a programming circuit connected to the control module. The programming circuit is connected to the source, gate, and/or substrate of each flash cell in the flash memory processing array and is used to adjust the threshold voltage of the flash cells;
Wherein, the programming circuit includes: a voltage generation circuit for generating a programming voltage or an erase voltage, and a voltage control circuit for applying the programming voltage to selected flash cells.
Further, the software-definable computing-in-memory chip also includes:
A row/column decoder, connected to the flash memory processing array and the control module, for performing row and column decoding on the flash memory processing array under the control of the control module.
Further, the control module dynamically configures each circuit module connected to it according to the configuration information. The configuration information includes: configuration information of the flash memory processing subarrays, of the programmable arithmetic operation units, of the digital-to-analog conversion module, of the analog-to-digital conversion module, of the input interface module, of the output interface module, of the input register file, and of the output register file. Dynamically configuring each connected circuit module according to the configuration information includes:
Dividing the flash memory processing array into a plurality of flash memory processing subarrays according to the configuration information of the flash memory processing subarrays, and controlling the working order of the subarrays;
Controlling, according to the configuration information of the programmable arithmetic operation units, the working state of the demultiplexer and multiplexer of each unit, so that the units can implement operations in any combination;
Controlling, according to the configuration information of the digital-to-analog conversion module, the on/off state of the digital-to-analog conversion circuits that take part in the actual task;
Controlling, according to the configuration information of the analog-to-digital conversion module, the on/off state of the analog-to-digital conversion circuits that take part in the actual task;
Controlling, according to the configuration information of the input interface module, the on/off state of the input interface circuits that take part in the actual task;
Controlling, according to the configuration information of the output interface module, the on/off state of the output interface circuits that take part in the actual task;
Controlling, according to the configuration information of the input register file, whether the data to be stored in the input register file comes from the input data of the input interface module or from the data to be processed in the output register file;
Controlling, according to the configuration information of the output register file, whether the output register file outputs its data or sends it to the input register file as data to be processed.
In a second aspect, an electronic device is provided, including the above software-definable computing-in-memory chip.
In the software-definable computing-in-memory chip and electronic device provided by the utility model, the flash memory processing array of the chip includes a plurality of flash memory processing subarrays, each for executing a different analog vector-matrix multiplication operation, and the programmable arithmetic operation module includes a plurality of programmable arithmetic operation units, each for implementing a different arithmetic operation. The control module combines and configures the modules of the chip according to the configuration information and finite-state-machine information of the actual application, realizing dynamic configuration of the circuit structure in the chip, so that the chip can flexibly adjust its circuit structure to the actual task. Peripheral circuits such as the ADC, DAC, registers, and programmable arithmetic operation units can be multiplexed, which reduces circuit area, meets the needs of integration and miniaturization, and effectively reduces chip cost.
To make the above and other objects, features, and advantages of the utility model clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the utility model or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the utility model; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a first structural diagram of the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 2 is a second structural diagram of the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 3 is a structural diagram of the programmable arithmetic operation unit 30 in the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 4 is a structural diagram of the programmable arithmetic operation subunit in the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 5 is a schematic diagram of the programmable arithmetic operation module in the software-definable computing-in-memory chip implementing a compound operation according to an embodiment of the utility model;
Fig. 6 is a first structural diagram of a flash memory processing subarray in the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 7 is a second structural diagram of a flash memory processing subarray in the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 8 is a third structural diagram of a flash memory processing subarray in the software-definable computing-in-memory chip according to an embodiment of the utility model;
Fig. 9 is a third structural diagram of the software-definable computing-in-memory chip according to an embodiment of the utility model.
Detailed description
The technical solutions in the embodiments of the utility model are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the utility model. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the utility model.
Existing computing-in-memory chip architectures are custom-designed for a single purpose: the circuit structure is fixed and cannot be flexibly adjusted to the actual task, and circuit modules cannot be shared, which results in a large circuit area.
To solve the above problems in the prior art, an embodiment of the utility model provides a software-definable computing-in-memory chip and an electronic device. The flash memory processing array of the chip includes a plurality of flash memory processing subarrays, each for executing a different analog vector-matrix multiplication operation, and the programmable arithmetic operation module includes a plurality of programmable arithmetic operation units, each for implementing a different arithmetic operation. The control module combines and configures the modules of the chip according to the configuration information and finite-state-machine information of the actual application, realizing dynamic configuration of the circuit structure in the chip, so that the chip can flexibly adjust its circuit structure to the actual task. Peripheral circuits such as the ADC, DAC, registers, and programmable arithmetic operation units can be multiplexed, which reduces circuit area and meets the needs of integration and miniaturization.
Fig. 1 is a first structural diagram of the software-definable computing-in-memory chip according to an embodiment of the utility model. As shown in Fig. 1, the chip includes: a flash memory processing array 20, a programmable arithmetic operation module 30, and a control module 10 connected to the flash memory processing array 20 and the programmable arithmetic operation module 30.
The flash memory processing array 20 includes a plurality of flash memory processing subarrays (not shown in Fig. 1), each for executing a different analog vector-matrix multiplication operation.
The flash memory processing subarrays may be structurally identical, or the structure of each may be set differently according to the requirements of the actual application; for example, the number of rows and columns of each flash memory processing subarray may be configured as the application requires. The embodiment of the utility model places no restriction on this.
The programmable arithmetic operation module 30 includes a plurality of programmable arithmetic operation units (not shown in Fig. 1), each for implementing a different arithmetic operation.
The programmable arithmetic operation units are implemented in hardware and execute specific arithmetic operations.
The arithmetic operations include one or a combination of: multiplication, addition, subtraction, division, shift, activation function, maximum, minimum, average, pooling, and the like.
The control module 10 combines and configures the input interface module, input register file, digital-to-analog conversion module, flash memory processing array, analog-to-digital conversion module, output register file, programmable arithmetic operation module, and output interface module in the chip according to the configuration information and the finite-state-machine information, realizing dynamic configuration of the circuit structure in the chip.
The configuration information and the finite-state-machine information can be obtained from the requirements of the actual application by means of a compilation tool.
The configuration information is usually static: it specifies, for example, which modules take part in the task and the configured size of each unit. It is generally stored in memory and scheduled before the task runs. The finite-state-machine information is usually dynamic: while the task runs, it controls the timing and state of the actual task.
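The split between static configuration information and dynamic finite-state-machine information can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the format used by the chip: the field names in `ChipConfig`, the state names, their order, and the `run_task` helper are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


@dataclass(frozen=True)
class ChipConfig:
    """Static configuration, fixed before the task runs (hypothetical fields)."""
    active_subarrays: tuple   # indices of the flash memory processing subarrays in use
    active_units: tuple       # indices of the programmable arithmetic operation units in use
    subarray_shape: tuple     # (rows, columns) configured for each subarray


class State(Enum):
    """States stepped through while a task runs (assumed names and order)."""
    LOAD_INPUT = auto()
    ANALOG_VMM = auto()
    ARITHMETIC = auto()
    WRITE_OUTPUT = auto()
    DONE = auto()


# Dynamic part: the transition table that sequences one pass through the chip.
NEXT_STATE = {
    State.LOAD_INPUT: State.ANALOG_VMM,
    State.ANALOG_VMM: State.ARITHMETIC,
    State.ARITHMETIC: State.WRITE_OUTPUT,
    State.WRITE_OUTPUT: State.DONE,
}


def run_task(config: ChipConfig) -> list:
    """Return the names of the states visited for one pass."""
    trace = []
    state = State.LOAD_INPUT
    while state is not State.DONE:
        # Skip the arithmetic stage when the static configuration enables no unit.
        if state is State.ARITHMETIC and not config.active_units:
            state = NEXT_STATE[state]
            continue
        trace.append(state.name)
        state = NEXT_STATE[state]
    return trace


config = ChipConfig(active_subarrays=(0, 2), active_units=(1,), subarray_shape=(4, 4))
trace = run_task(config)
```

The configuration object never changes during the run, while the state variable changes at every step, which mirrors the static and dynamic roles described above.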
Specifically, the control module 10 combines and configures the flash memory processing subarrays and the programmable arithmetic operation units according to the configuration information: it selects the flash memory processing subarrays and programmable arithmetic operation units that are put to work, and controls how they are combined and matched so as to implement particular operations.
It can be understood that each programmable arithmetic operation unit can implement one or more arithmetic operations, and that the units can be arranged and combined into a variety of compound operations. Together with the flash memory processing subarrays, they allow many combined configurations and thus complex computing functions.
As can be seen from the above description, in the software-definable computing-in-memory chip provided by the embodiment of the utility model, the flash memory processing array includes a plurality of flash memory processing subarrays, each for executing a different analog vector-matrix multiplication operation, and the programmable arithmetic operation module includes a plurality of programmable arithmetic operation units, each for implementing a different arithmetic operation. The control module combines and configures the subarrays and units according to the configuration information, realizing dynamic configuration of the chip architecture. The chip architecture can thus be flexibly adjusted to the actual task and a variety of complex computing functions can be implemented. Peripheral circuits such as the ADC, DAC, registers, and programmable arithmetic operation units can be multiplexed, which reduces circuit area and meets the needs of integration and miniaturization.
In an optional embodiment, referring to Fig. 2, the software-definable computing-in-memory chip may further include: an input interface module 40, an input register file 50, a digital-to-analog conversion module 60, an analog-to-digital conversion module 70, an output register file 80, and an output interface module 90.
The input of the input interface module 40 is connected to an external device and receives input data (that is, the data to be operated on) from the external device.
The input of the input register file 50 is connected to the output of the input interface module 40 and temporarily stores the input data or data to be processed.
The input of the digital-to-analog conversion module 60 is connected to the output of the input register file 50, and its output is connected to the input of the flash memory processing array 20. It converts the external input data or data to be processed output by the input register file 50 into an analog signal and sends it to the flash memory processing array 20, which performs an analog vector-matrix multiplication operation on the analog signal and outputs the analog vector-matrix multiplication result.
The input of the analog-to-digital conversion module 70 is connected to the flash memory processing array 20, and its output is connected to the programmable arithmetic operation module 30. It converts the analog vector-matrix multiplication result into a digital signal and sends it to the programmable arithmetic operation module 30, which performs an arithmetic operation on the digital signal and outputs the arithmetic operation result.
The input of the output register file 80 is connected to the programmable arithmetic operation module 30, and its output is connected to the input register file 50. The output register file 80 temporarily stores the arithmetic operation result and either outputs it or sends it to the input register file 50 as the data to be processed.
The input of the output interface module 90 is connected to the output of the output register file 80. It receives the output data of the output register file 80 and outputs it to the external device.
The control module 10 is connected to the input interface module 40, the input register file 50, the digital-to-analog conversion module 60, the flash memory processing array 20, the analog-to-digital conversion module 70, the output register file 80, the programmable arithmetic operation module 30, and the output interface module 90, and dynamically configures these circuit modules according to the configuration information.
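The signal path from the digital-to-analog conversion module through the flash memory processing array to the analog-to-digital conversion module can be modeled numerically as below. This is only an idealized sketch: the converter bit widths, the reference voltage, the full-scale range, and the use of plain floating-point weights in place of flash cells are assumptions made for illustration, and the function names are hypothetical.

```python
def dac(code: int, bits: int = 4, v_ref: float = 1.0) -> float:
    """Ideal DAC: map an unsigned digital code to a voltage in [0, v_ref]."""
    return v_ref * code / (2 ** bits - 1)


def analog_vmm(voltages, weights):
    """Vector-matrix product: output j is the sum over i of voltages[i] * weights[i][j]."""
    cols = len(weights[0])
    return [sum(v * row[j] for v, row in zip(voltages, weights)) for j in range(cols)]


def adc(value: float, bits: int = 8, full_scale: float = 4.0) -> int:
    """Ideal ADC: clamp to [0, full_scale], then quantize to an unsigned code."""
    clamped = min(max(value, 0.0), full_scale)
    return round(clamped / full_scale * (2 ** bits - 1))


# Digital input vector held in the input register file (4-bit codes, assumed).
input_codes = [15, 0, 15]
# Weights stored in one subarray: 3 rows (inputs) by 2 columns (outputs).
weights = [[2.0, 0.0], [0.5, 0.5], [2.0, 0.0]]

voltages = [dac(code) for code in input_codes]
analog_result = analog_vmm(voltages, weights)
digital_result = [adc(value) for value in analog_result]
```

The digital codes in `digital_result` are what the programmable arithmetic operation module would receive for further processing.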
The control module 10 dynamically configures each circuit module connected to it according to the configuration information. The configuration information includes: configuration information of the flash memory processing subarrays 20_1 to 20_n, of the programmable arithmetic operation units 30_1 to 30_n, of the digital-to-analog conversion module 60, of the analog-to-digital conversion module 70, of the input interface module 40, of the output interface module 90, of the input register file 50, of the output register file 80, and so on. Dynamically configuring each connected circuit module according to the configuration information may include the following:
Dividing the flash memory processing array 20 into a plurality of flash memory processing subarrays 20_1 to 20_n according to the configuration information of the subarrays, and controlling the working order of the subarrays 20_1 to 20_n;
Controlling, according to the configuration information of the programmable arithmetic operation units 30_1 to 30_n, the working state of the selectors of each unit, so that the units take part in the work in any combination;
Controlling, according to the configuration information of the digital-to-analog conversion module 60, the on/off state of the digital-to-analog conversion circuits that take part in the actual task;
Controlling, according to the configuration information of the analog-to-digital conversion module 70, the on/off state of the analog-to-digital conversion circuits that take part in the actual task;
Controlling, according to the configuration information of the input interface module 40, the on/off state of the input interface circuits that take part in the actual task;
Controlling, according to the configuration information of the output interface module 90, the on/off state of the output interface circuits that take part in the actual task;
Controlling, according to the configuration information of the input register file 50, whether the data to be stored in the input register file comes from the input data of the input interface module or from the data to be processed in the output register file;
Controlling, according to the configuration information of the output register file 80, whether the output register file 80 outputs its data or sends it to the input register file 50 as data to be processed.
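The first item in the list above, dividing the array into subarrays and fixing their working order, can be sketched as follows. The region encoding (row and column ranges with exclusive end indices), the subarray names, and the `partition` helper are hypothetical; the source does not specify how the subarray configuration information is encoded.

```python
def partition(array, regions):
    """Split a 2-D weight array into named subarrays.

    `regions` maps a subarray name to (row_start, row_end, col_start, col_end),
    with end indices exclusive. This layout is an assumed encoding of the
    subarray configuration information.
    """
    return {
        name: [row[c0:c1] for row in array[r0:r1]]
        for name, (r0, r1, c0, c1) in regions.items()
    }


# A 4 x 4 array standing in for the flash memory processing array.
array = [[r * 4 + c for c in range(4)] for r in range(4)]

# Configuration: one 2 x 4 subarray and two 2 x 2 subarrays.
regions = {
    "sub1": (0, 2, 0, 4),
    "sub2": (2, 4, 0, 2),
    "sub3": (2, 4, 2, 4),
}
subarrays = partition(array, regions)

# Assumed working order for one task; sub2 stays idle.
schedule = ["sub1", "sub3"]
active = [subarrays[name] for name in schedule]
```

Changing `regions` and `schedule` reshapes and reorders the subarrays without touching the underlying array, which is the software analogue of reconfiguring the same hardware for a different task.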
Specifically, the input of the input register file 50 is connected through a multiplexer (MUX) 100 to the output of the input interface module 40 and the output of the output register file 80, so as to selectively receive the external input data from the input interface module 40 or the data to be processed from the output register file 80. The control module 10 is connected to the multiplexer (MUX) 100 and controls it according to the configuration information, thereby controlling whether the input register file 50 receives the external input data or the data to be processed.
The digital-to-analog conversion module 60 is selectively connected to the flash memory processing subarrays 20_1 to 20_n through a demultiplexer (DEMUX) 120. The control module 10 is connected to the demultiplexer 120 and controls it according to the configuration information, thereby selecting which flash memory processing subarrays take part in the work.
The outputs of the flash memory processing subarrays 20_1 to 20_n are connected to the analog-to-digital conversion module 70 through a multiplexer 130. The control module 10 is connected to the multiplexer 130 and controls it according to the configuration information, thereby selecting which subarray's output is connected to the input of the analog-to-digital conversion module 70; that is, the output of the subarray taking part in the work is connected to the input of the analog-to-digital conversion module 70.
The input of the programmable arithmetic operation module 30 is connected through a multiplexer 140 to the output of the demultiplexer 110 and the output of the analog-to-digital conversion module 70.
The programmable arithmetic operation units 30_1 to 30_n of the programmable arithmetic operation module 30 are connected in series. Each programmable arithmetic operation unit includes a demultiplexer 30a, an arithmetic operation subunit 30b, and a multiplexer 30c; see Fig. 3.
The input of the demultiplexer 30a is connected to the preceding programmable arithmetic operation unit or to the analog-to-digital conversion module 70, and one of its outputs is connected to the arithmetic operation subunit 30b. The output of the arithmetic operation subunit 30b and the other output of the demultiplexer 30a are connected through the multiplexer 30c to the next programmable arithmetic operation unit or to the output register file 80. In addition, the control terminals of the demultiplexer 30a and the multiplexer 30c are both connected to the control module 10.
Specifically, in the first programmable arithmetic operation unit 30_1, the input of the demultiplexer is connected to the output of the analog-to-digital conversion module 70, and one of its outputs is connected to the input of the arithmetic operation subunit of the unit 30_1. Its other output and the output of that arithmetic operation subunit are connected through a multiplexer to the input of the second programmable arithmetic operation unit 30_2. The control terminals of the demultiplexer and the multiplexer are connected to the control module 10.
In the second programmable arithmetic operation unit 30_2, the input of the demultiplexer is connected to the output of the first programmable arithmetic operation unit 30_1, and one of its outputs is connected to the input of the arithmetic operation subunit of the unit 30_2. Its other output and the output of that arithmetic operation subunit are connected through a multiplexer to the input of the third programmable arithmetic operation unit 30_3. The control terminals of the demultiplexer and the multiplexer are connected to the control module 10. The same pattern continues up to the n-th programmable arithmetic operation unit 30_n, in which the input of the demultiplexer is connected to the output of the (n-1)-th programmable arithmetic operation unit 30_(n-1), and one of its outputs is connected to the input of the arithmetic operation subunit of the unit 30_n. Its other output and the output of that arithmetic operation subunit are connected through a multiplexer to the input of the output register file 80, and the control terminals of the demultiplexer and the multiplexer are connected to the control module 10.
The control module 10 is connected to the demultiplexer and the multiplexer in each programmable arithmetic operation unit and controls them according to the configuration information, selecting whether the arithmetic operation subunit in that unit takes part in the operation. In this way the programmable arithmetic operation units can be arranged and combined to implement different compound operations, so that the arithmetic function can be configured flexibly.
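The serial chain with per-unit bypass described above can be sketched in software as follows. This is a behavioral model only: the class name `ProgrammableUnit`, the `run_chain` helper, and the three example operations are hypothetical, and a single boolean per unit stands in for the control signals that drive the demultiplexer and multiplexer.

```python
class ProgrammableUnit:
    """One unit: a demultiplexer, an arithmetic operation subunit, and a multiplexer."""

    def __init__(self, operation):
        self.operation = operation   # stands in for the arithmetic operation subunit
        self.enabled = False         # control signal set by the control module

    def process(self, value):
        # Demultiplexer: route the input to the subunit or to the bypass branch.
        # Multiplexer: forward whichever branch was used to the next stage.
        return self.operation(value) if self.enabled else value


def run_chain(units, enable_flags, value):
    """Apply the enable flags, then pass the value through the units in series."""
    for unit, flag in zip(units, enable_flags):
        unit.enabled = flag
        value = unit.process(value)
    return value


# Three example units; the operations are chosen only for illustration.
units = [
    ProgrammableUnit(lambda x: x + 3),      # adder
    ProgrammableUnit(lambda x: x * 2),      # multiplier
    ProgrammableUnit(lambda x: max(x, 0)),  # ReLU-style activation
]

add_then_activate = run_chain(units, [True, False, True], -5)
add_then_multiply = run_chain(units, [True, True, False], 1)
all_bypassed = run_chain(units, [False, False, False], 7)
```

Because every unit can be bypassed independently, n units in series give 2^n on/off combinations from the same hardware, which is how the chain yields different compound operations.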
In an optional embodiment, each arithmetic operation sub-unit may include multiple arithmetic operators arranged side by side, for example one or more of a multiplier, an adder, a subtractor, a divider, a shifter, an activation function unit, a maximum operator, a minimum operator, an averaging operator and a pooling unit. The operators are connected in parallel: their input terminals are each connected to the output terminal of the corresponding demultiplexer, and their output terminals are each connected to the input terminal of the corresponding multiplexer; see Fig. 4.
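As a rough illustration of several operators sitting in parallel behind one selector, the sketch below picks one operator by name. The operator names, the `OPERATORS` table and the choice to operate on a list of values are assumptions made for this example, not details given in the text.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Hypothetical set of parallel operators inside one arithmetic sub-unit.
OPERATORS: Dict[str, Callable[[Vector], Vector]] = {
    "relu": lambda v: [max(x, 0.0) for x in v],   # activation function unit
    "max": lambda v: [max(v)],                    # maximum operator
    "min": lambda v: [min(v)],                    # minimum operator
    "mean": lambda v: [sum(v) / len(v)],          # averaging operator
}


def apply_operator(select: str, values: Vector) -> Vector:
    """Route the values to the selected operator and return its output."""
    if select not in OPERATORS:
        raise KeyError(f"unknown operator: {select}")
    return OPERATORS[select](values)
```

The `select` argument plays the role of the control signal that decides which parallel branch the demultiplexer feeds and the multiplexer reads.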
The process by which the programmable arithmetic operation module performs a compound operation is shown in Fig. 5.
The output terminal of the output register file 80 is selectively connected, through a demultiplexer 150, to the input terminal of the output interface module 90 or the input terminal of the input register file 50. The control module 10 is connected to the demultiplexer 150 and controls its working state according to the configuration information, so as to select whether the output of the output register file 80 is sent to the output interface module 90 or to the input register file 50. When the output of the output register file 80 is sent to the input register file 50, a new round of processing will be performed on that output.
In an optional embodiment, the output terminal of the input register file 50 may also be selectively connected, through a demultiplexer 110, to the input terminal of the digital-to-analog conversion module 60 or the input terminal of the programmable arithmetic operation module 30. The control module 10 is connected to the demultiplexer 110 and controls its working state according to the configuration information, so as to select which of the two input terminals the output terminal of the input register file 50 is connected to. When the output terminal of the input register file 50 is connected to the input terminal of the digital-to-analog conversion module 60, the output of the input register file 50 undergoes both an analog vector-matrix multiplication operation and an arithmetic operation. When it is connected to the input terminal of the programmable arithmetic operation module 30, the output of the input register file 50 undergoes only an arithmetic operation. This further increases the flexibility of the chip architecture.
In an optional embodiment, each flash processing sub-array adopts a source-coupled, drain-summing topology (see Fig. 6) and includes multiple programmable semiconductor devices (also called flash cells) arranged in an array.
The sources of all programmable semiconductor devices in each column are connected to the same analog voltage input terminal, so that multiple columns of programmable semiconductor devices correspond to multiple analog voltage input terminals. The drains of all programmable semiconductor devices in each column are connected to the same analog current output terminal, so that multiple columns correspond to multiple analog current output terminals. The gates of all programmable semiconductor devices in each row are connected to the same bias voltage input terminal, so that multiple rows correspond to multiple bias voltage input terminals. The threshold voltage of each programmable semiconductor device is adjustable.
In another optional embodiment, each flash processing sub-array includes multiple programmable semiconductor devices arranged in an array. The gates of all programmable semiconductor devices in each row are connected to the same analog voltage input terminal, so that multiple rows correspond to multiple analog voltage input terminals. The drains of all programmable semiconductor devices in each column are connected to the same first terminal, so that multiple columns correspond to multiple first terminals. The sources of all programmable semiconductor devices in each column are connected to the same second terminal, so that multiple columns correspond to multiple second terminals. The threshold voltage of each programmable semiconductor device is adjustable. If the first terminal is a bias voltage input terminal and the second terminal is an analog current output terminal, a gate-coupled, source-summing topology is realized (see Fig. 7). Alternatively, if the first terminal is an analog current output terminal and the second terminal is a bias voltage input terminal, a gate-coupled, drain-summing topology is realized (see Fig. 8).
Specifically, by adjusting the threshold voltages of the programmable semiconductor devices, the flash processing sub-array can treat each programmable semiconductor device as a variable equivalent analog weight, so that the array as a whole is equivalent to analog matrix data. Applying analog voltages to the array of programmable semiconductor devices then realizes the matrix multiplication function.
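The equivalence between the array and a matrix can be sketched with an idealized linear model in which input i is applied along row i and the currents of each column are summed, as in the gate-coupled embodiment. The function name `analog_vmm` and the assumption that each cell contributes exactly `voltage * weight` are simplifications for illustration; a real flash cell is not this linear.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def analog_vmm(voltages: Vector, weights: Matrix) -> Vector:
    """Idealized column currents: I[j] = sum over i of V[i] * W[i][j]."""
    if not weights or len(weights) != len(voltages):
        raise ValueError("one input voltage is needed per row of cells")
    cols = len(weights[0])
    if any(len(row) != cols for row in weights):
        raise ValueError("all rows must have the same number of cells")
    # Each column output sums the contributions of every cell in that column.
    return [sum(v * row[j] for v, row in zip(voltages, weights))
            for j in range(cols)]
```

For example, two input voltages driving a 2 x 2 block of equivalent weights produce two summed output currents, one per column.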
In an optional embodiment, the software-definable storage-computing integrated chip may further include a programming circuit 22.
The programming circuit 22 is connected to the source, gate and/or substrate of each flash cell in the flash processing array and is used to regulate the threshold voltage of the flash cells.
The programming circuit includes a voltage generation circuit for generating a programming voltage or an erase voltage, and a voltage control circuit for applying the programming voltage to the selected flash cell.
Specifically, the programming circuit uses the hot-electron injection effect: according to the required threshold voltage data of the flash cell, it applies a high voltage to the source of the flash cell, accelerating channel electrons to high speed, so as to increase the threshold voltage of the flash cell.
The programming circuit also uses the tunneling effect: according to the required threshold voltage data of the flash cell, it applies a high voltage to the gate or substrate of the flash cell, so as to reduce the threshold voltage of the flash cell.
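The two mechanisms give the programming circuit one way to raise a threshold voltage and one way to lower it. The loop below is a hedged sketch of how a target value might be approached with repeated pulses; the fixed step size, the tolerance and the pulse-and-check loop itself are assumptions for illustration and are not described in the text.

```python
from typing import List, Tuple


def tune_threshold(vth: float, target: float, step: float = 0.1,
                   tol: float = 0.05, max_pulses: int = 1000
                   ) -> Tuple[float, List[str]]:
    """Move vth toward target with fixed-size pulses (hypothetical model)."""
    pulses: List[str] = []
    for _ in range(max_pulses):
        error = target - vth
        if abs(error) <= tol:
            break
        if error > 0:
            vth += step               # hot-electron injection raises Vth
            pulses.append("program")
        else:
            vth -= step               # tunneling lowers Vth
            pulses.append("erase")
    return vth, pulses
```

Calling `tune_threshold(1.0, 1.5)` issues only "program" pulses, while `tune_threshold(2.0, 1.8)` issues only "erase" pulses.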
In addition, the control module 10 is connected to the programming circuit and controls it according to the configuration information, so as to adjust the weights stored in the flash processing array 20.
In an optional embodiment, the software-definable storage-computing integrated chip may further include a row-column decoder.
The row-column decoder is connected to the flash processing array 20 and the control module 10 and performs row and column decoding on the flash processing array 20 under the control of the control module 10.
In an optional embodiment, the programmable semiconductor device may be implemented with a floating-gate transistor.
The flash processing array may be a NOR-type flash processing array or a NAND-type flash processing array; of course, the utility model is not limited thereto.
Based on the above, this application provides a scenario in which the software-definable storage-computing integrated chip of the embodiment of the utility model is used to implement a neural network operation, in order to illustrate the workflow of the chip.
The neural network performs an operation on data P and includes R layers of neurons. Each layer of neurons mainly implements a vector-matrix multiplication operation, and adjacent layers are connected through certain arithmetic operations. (Because the focus of this application is the software-definable storage-computing integrated chip, neural network operations are not described in depth here; only the operation architecture is described, to illustrate by example the workflow of the chip, and this does not limit the utility model.)
For this neural network operation, the workflow of the software-definable storage-computing integrated chip is as follows:
The control module 10 obtains configuration information and finite state machine information, which cover R cycles. The R cycles correspond to the operations of the R layers of neurons of the neural network (for example convolution, pooling, etc.), with each cycle corresponding to the operation of one layer of neurons. The configuration information of each cycle includes the configuration information of the flash processing sub-array, the configuration information of the programmable arithmetic operation units, the configuration information of the output register file, the configuration information of the input register file, and so on. According to the configuration information, the control module 10 divides the flash processing array 20 into R flash processing sub-arrays, each corresponding to one cycle; that is, each flash processing sub-array implements the operation of one layer of the neural network. The control module 10 then controls the working sequence of each circuit module according to the finite state machine information.
Input interface module 40 receives data P;
According to the configuration information and finite state machine information of the first cycle, the control module 10 controls the multiplexer (DEMUX) A at the front end of the input register file 50, so that the input interface module 40 communicates with the input register file 50; controls the demultiplexer (MUX) Q at the front end of the flash processing array 20, so that the digital-to-analog conversion module 60 communicates with the flash processing sub-array 1 corresponding to the first layer of the neural network; controls the multiplexer B at the back end of the flash processing array 20, so that the flash processing sub-array 1 communicates with the analog-to-digital conversion module 70; controls the selector and the two-to-one selector of each programmable arithmetic operation unit of the programmable arithmetic operation module 30, to realize the arithmetic operation 1 corresponding to the first layer of the neural network; and, after the data P has been transferred to the input register file 50, controls the demultiplexer W at the output terminal of the output register file 80 and the multiplexer (DEMUX) A at the front end of the input register file 50, so that the input terminal of the input register file 50 communicates with the output terminal of the output register file 80. This realizes the operation architecture configuration of the first cycle;
After being buffered in the input register file 50, the data P is sent to the digital-to-analog conversion module 60, converted into an analog signal and sent to the flash processing sub-array 1, which performs analog vector-matrix multiplication operation 1 (for example a matrix multiplication operation) on the analog signal. The analog vector-matrix multiplication result 1 is converted into a digital signal by the analog-to-digital conversion module 70, and arithmetic operation result 1 is obtained after the programmable arithmetic operation module 30. This result is sent through the output register file 80 to the input register file 50, which completes the operation of the first layer of the neural network;
At this point the control module 10 is triggered automatically. According to the configuration information and finite state machine information of the second cycle, the control module 10 controls the demultiplexer (MUX) Q at the front end of the flash processing array 20, so that the digital-to-analog conversion module 60 communicates with the flash processing sub-array 2 corresponding to the second layer of the neural network; controls the multiplexer B at the back end of the flash processing array 20, so that the flash processing sub-array 2 communicates with the analog-to-digital conversion module 70; and controls the selector of each programmable arithmetic operation unit of the programmable arithmetic operation module 30, to realize the arithmetic operation 2 corresponding to the second layer of the neural network. This realizes the operation architecture configuration of the second cycle.
After being buffered in the input register file 50, arithmetic operation result 1 of the first layer of the neural network is sent to the digital-to-analog conversion module 60, converted into an analog signal and sent to the flash processing sub-array 2, which performs analog vector-matrix multiplication operation 2 (for example a matrix multiplication operation) on the analog signal. The analog vector-matrix multiplication result is converted into a digital signal by the analog-to-digital conversion module 70, and arithmetic operation result 2 is obtained after the programmable arithmetic operation module 30. This result is sent through the output register file 80 to the input register file 50, which completes the operation of the second layer of the neural network. The process continues in the same way up to the last layer of the neural network. When the last layer is configured, the demultiplexer W at the output terminal of the output register file 80 is controlled so that the output terminal of the output register file 80 is connected to the input terminal of the output interface module 90, and the operation result of the entire neural network is output to an external device through the output interface module 90.
Those skilled in the art will understand that when a certain layer of the neural network requires only an arithmetic operation and no analog vector-matrix multiplication operation, the control module 10, when configuring the circuit, only needs to control the demultiplexer E at the output of the input register file 50 so that the output terminal of the input register file 50 is connected to the input terminal of the programmable arithmetic operation module 30. The rest of the configuration process is not repeated here.
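The cycle-by-cycle workflow above, including the arithmetic-only path, can be summarized in a small behavioural model. This is a sketch under stated assumptions: `CycleConfig`, `run_network` and `vmm` are hypothetical names, the digital-to-analog and analog-to-digital conversions are treated as ideal, and the finite state machine is reduced to a plain loop over cycles.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]
Matrix = List[List[float]]


@dataclass
class CycleConfig:
    # Weights of the sub-array used in this cycle; None selects the
    # arithmetic-only path (input register file -> arithmetic module).
    subarray: Optional[Matrix]
    arithmetic: Callable[[Vector], Vector]


def vmm(v: Vector, w: Matrix) -> Vector:
    """Ideal vector-matrix product standing in for DAC, sub-array and ADC."""
    return [sum(x * row[j] for x, row in zip(v, w)) for j in range(len(w[0]))]


def run_network(p: Vector, cycles: List[CycleConfig]) -> Vector:
    input_reg = list(p)                      # input interface -> input register file
    for cfg in cycles:                       # one cycle per layer
        if cfg.subarray is not None:
            digital = vmm(input_reg, cfg.subarray)
        else:
            digital = input_reg              # analog multiplication skipped
        output_reg = cfg.arithmetic(digital)  # programmable arithmetic module
        input_reg = output_reg               # fed back for the next cycle
    return input_reg                         # after the last cycle: output interface


relu = lambda v: [max(x, 0.0) for x in v]
double = lambda v: [2.0 * x for x in v]
cycles = [
    CycleConfig([[1.0, -1.0], [2.0, 0.5]], relu),
    CycleConfig(None, double),
    CycleConfig([[1.0], [1.0]], relu),
]
```

Here `run_network([1.0, 2.0], cycles)` runs three cycles, the second of which bypasses the flash processing sub-array entirely.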
From the above technical solution it can be seen that, in the software-definable storage-computing integrated chip provided by the embodiment of the utility model, the control module coordinates multiple flash processing sub-arrays and multiple programmable arithmetic operation units, so that the chip architecture can be flexibly combined according to practical application requirements and complex operation tasks can be carried out. The chip is suitable for various applications such as speech processing, image processing, machine processing and artificial intelligence (AI). Peripheral circuits such as the ADC, DAC, registers and programmable arithmetic operation units can be multiplexed, which reduces circuit area, meets the needs of integration and miniaturization, and effectively reduces chip cost.
Fig. 9 is a third structural diagram of the software-definable storage-computing integrated chip of the embodiment of the utility model. As shown in Fig. 9, on the basis of the chip shown in Fig. 2, the input terminal of the input register file 50 is connected through a multiplexer (DEMUX) 100 to the output terminal of the input interface module 40 and the output terminal of the output register file 80, so as to selectively receive external input data from the input interface module 40 or data to be processed from the output register file 80. The control module 10 is connected to the multiplexer (DEMUX) 100.
The digital-to-analog conversion module 60 is selectively connected to the multiple flash processing sub-arrays (20₁ to 20ₙ) through a demultiplexer (MUX) 120. The control module 10 is connected to the demultiplexer Q.
The output terminals of the multiple flash processing sub-arrays (20₁ to 20ₙ) are connected to the analog-to-digital conversion module 70 through a multiplexer 130. The control module 10 is connected to the multiplexer B.
The input terminal of the programmable arithmetic operation module 30 is connected through a multiplexer 140 to the output terminal of the demultiplexer 110 and the output terminal of the analog-to-digital conversion module 70.
The multiple programmable arithmetic operation units 30₁ to 30ₙ of the programmable arithmetic operation module 30 are connected in series, and each programmable arithmetic operation unit includes a selector 30a and an arithmetic operation sub-unit 30b.
The input terminal of the selector 30a is connected to the previous programmable arithmetic operation unit or to the analog-to-digital conversion module 70. One of its output terminals is connected to the arithmetic operation sub-unit 30b; the other output terminal, together with the output terminal of the arithmetic operation sub-unit 30b, is connected through a two-to-one selector to the next programmable arithmetic operation unit or to the output register file 80. Its control terminal is connected to the control module 10.
The output terminal of the output register file 80 is selectively connected, through the demultiplexer 150, to the input terminal of the output interface module 90 or the input terminal of the input register file 50. The control module 10 is connected to the demultiplexer W and controls its working state according to the configuration information, so as to select whether the output of the output register file 80 is sent to the output interface module 90 or to the input register file 50. When the output of the output register file 80 is sent to the input register file 50, a new round of processing will be performed on that output.
The output terminal of the input register file 50 is selectively connected, through the demultiplexer 110, to the input terminal of the digital-to-analog conversion module 60 or the input terminal of the programmable arithmetic operation module 30. The control module 10 is connected to the demultiplexer E and controls its working state according to the configuration information, so as to select which of the two input terminals the output terminal of the input register file 50 is connected to. When the output terminal of the input register file 50 is connected to the input terminal of the digital-to-analog conversion module 60, the output of the input register file 50 undergoes both an analog vector-matrix multiplication operation and an arithmetic operation. When it is connected to the input terminal of the programmable arithmetic operation module 30, the output of the input register file 50 undergoes only an arithmetic operation. This further increases the flexibility of the chip architecture.
Those skilled in the art will understand that when a certain layer of the neural network requires only an arithmetic operation and no analog vector-matrix multiplication operation, the control module 10, when configuring the circuit, only needs to control the demultiplexer E at the output of the input register file 50 so that the output terminal of the input register file 50 is connected to the input terminal of the programmable arithmetic operation module 30. The rest of the configuration process is not repeated here.
In addition, those skilled in the art will understand that the configuration information can be generated from the practical application requirements by means of a preset instruction-to-architecture mapping table.
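A preset mapping table of this kind could be as simple as a lookup from an instruction name to the settings of one cycle. The sketch below is purely illustrative: the instruction names, the fields `use_subarray` and `arithmetic_units`, and the function `build_config` are hypothetical and do not come from the text.

```python
from typing import Dict, List

# Hypothetical instruction-to-architecture mapping table.
INSTRUCTION_TO_CONFIG: Dict[str, dict] = {
    "conv": {"use_subarray": True, "arithmetic_units": ["activation"]},
    "pool": {"use_subarray": False, "arithmetic_units": ["max"]},
}


def build_config(instructions: List[str]) -> List[dict]:
    """Translate a list of per-layer instructions into per-cycle settings."""
    configs = []
    for cycle, name in enumerate(instructions, start=1):
        if name not in INSTRUCTION_TO_CONFIG:
            raise KeyError(f"no mapping for instruction: {name}")
        entry = dict(INSTRUCTION_TO_CONFIG[name])  # copy the table entry
        entry["cycle"] = cycle                     # one cycle per layer
        configs.append(entry)
    return configs
```

Each returned entry describes one cycle, so a two-layer request such as `["conv", "pool"]` yields two records numbered 1 and 2.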
It is worth noting that, when generating the configuration information according to the practical application requirements, the number of flash processing sub-arrays to be used and the scale of each flash processing sub-array must be known. A division instruction for the flash processing array can therefore be obtained from the practical application requirements, and the flash processing array is then divided into multiple flash processing sub-arrays according to the division instruction, corresponding to matrix multiplication operations of various scales.
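One way to picture the division step is as an allocation of rectangular regions inside the array. The sketch below assumes a very simple strategy in which sub-arrays are stacked along the rows starting from column 0; this layout, the function name `divide_array` and the region tuple format are assumptions for illustration, since the text does not specify how the regions are placed.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (row_start, row_end, col_start, col_end)


def divide_array(total_rows: int, total_cols: int,
                 shapes: List[Tuple[int, int]]) -> List[Region]:
    """Allocate one region per requested (rows, cols) sub-array shape."""
    regions: List[Region] = []
    next_row = 0
    for rows, cols in shapes:
        if cols > total_cols or next_row + rows > total_rows:
            raise ValueError("requested sub-arrays do not fit in the array")
        regions.append((next_row, next_row + rows, 0, cols))
        next_row += rows              # the next sub-array starts below this one
    return regions
```

A request for a 2 x 4 and a 3 x 2 sub-array in an 8 x 4 array returns two non-overlapping regions, and a request that exceeds the array raises an error instead of overlapping regions.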
Those skilled in the art will understand that, when the software-definable storage-computing integrated chip of the embodiment of the utility model performs a multi-cycle operation, the flash processing sub-array corresponding to each cycle can be programmed in that cycle, or all flash processing sub-arrays can be programmed together according to the programming instructions before the cycle operations are carried out.
The embodiment of the utility model also provides an electronic device capable of executing a neural network algorithm. The neural network includes multiple layers of neurons, and each layer of neurons performs its operation on the output of the preceding layer. The electronic device includes the software-definable storage-computing integrated chip described above.
The electronic device may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Specific embodiments have been used in the utility model to explain its principles and implementations. The description of the above embodiments is intended only to help in understanding the method of the utility model and its core idea. At the same time, for those of ordinary skill in the art, the specific implementation and the scope of application may change in accordance with the idea of the utility model. In summary, the content of this specification should not be construed as limiting the utility model.

Claims (8)

1. A software-definable storage-computing integrated chip, characterized by comprising: a flash processing array, a programmable arithmetic operation module, and a control module connected to the flash processing array and the programmable arithmetic operation module, wherein
the flash processing array includes multiple flash processing sub-arrays for respectively performing different analog vector-matrix multiplication operations;
the programmable arithmetic operation module includes multiple programmable arithmetic operation units for respectively implementing different arithmetic operations;
the control module performs a combined configuration of the multiple flash processing sub-arrays and the multiple programmable arithmetic operation units according to configuration information, realizing dynamic configuration of the circuit structure in the chip.
2. The software-definable storage-computing integrated chip according to claim 1, characterized by further comprising:
an input interface module for receiving external input data;
an input register file connected to the input interface module, for storing the external input data or data to be processed;
a digital-to-analog conversion module whose input terminal is connected to the input register file and whose output terminal is connected to the flash processing array, for converting the external input data or the data to be processed into an analog signal and sending it to the flash processing array, the flash processing array performing an analog vector-matrix multiplication operation on the analog signal and outputting the operation result;
an analog-to-digital conversion module whose input terminal is connected to the flash processing array and whose output terminal is connected to the programmable arithmetic operation module, for converting the analog vector-matrix multiplication result into a digital signal and sending it to the programmable arithmetic operation module, the programmable arithmetic operation module performing an arithmetic operation on the digital signal and outputting an arithmetic operation result;
an output register file connected to the programmable arithmetic operation module and the input register file, for buffering the arithmetic operation result, and for outputting the arithmetic operation result or sending it to the input register file as the data to be processed;
an output interface module connected to the output register file, for receiving the output data of the output register file and outputting the output data externally;
wherein the control module is connected to the input interface module, the input register file, the digital-to-analog conversion module, the flash processing array, the analog-to-digital conversion module, the output register file, the programmable arithmetic operation module and the output interface module, for dynamically configuring the above circuit modules according to practical application requirements.
3. The software-definable storage-computing integrated chip according to claim 2, characterized in that the output terminal of the input register file is also connected to the programmable arithmetic operation module.
4. The software-definable storage-computing integrated chip according to claim 2, characterized in that the multiple programmable arithmetic operation units are connected in series, and each programmable arithmetic operation unit includes a demultiplexer, an arithmetic operation sub-unit and a multiplexer;
the input terminal of the demultiplexer is connected to the previous programmable arithmetic operation unit or to the analog-to-digital conversion module, one of its output terminals is connected to the arithmetic operation sub-unit, the other output terminal, together with the output terminal of the arithmetic operation sub-unit, is connected through the multiplexer to the next programmable arithmetic operation unit or to the output register file, and its control terminal is connected to the control module.
5. The software-definable storage-computing integrated chip according to claim 4, characterized by further comprising: a programming circuit connected to the control module, the programming circuit being connected to the source, gate and/or substrate of each flash cell in the flash processing sub-arrays, for regulating the threshold voltage of the flash cells;
wherein the programming circuit includes a voltage generation circuit for generating a programming voltage or an erase voltage, and a voltage control circuit for applying the programming voltage to the selected flash cell.
6. The software-definable storage-computing integrated chip according to claim 1 or 2, characterized by further comprising:
a row-column decoder connected to the flash processing array and the control module, for performing row and column decoding on the flash processing array under the control of the control module.
7. The software-definable storage-computing integrated chip according to claim 4, characterized in that the control module dynamically configures each circuit module connected to it according to configuration information, the configuration information including: the configuration information of the flash processing sub-arrays, the configuration information of the programmable arithmetic operation units, the configuration information of the digital-to-analog conversion module, the configuration information of the analog-to-digital conversion module, the configuration information of the input interface module, the configuration information of the output interface module, the configuration information of the input register file and the configuration information of the output register file; and the dynamic configuration of each circuit module connected to the control module according to the configuration information includes:
dividing the flash processing array into multiple flash processing sub-arrays according to the configuration information of the flash processing sub-arrays, and controlling the working sequence of the multiple flash processing sub-arrays;
controlling, according to the configuration information of the programmable arithmetic operation units, the working state of the demultiplexer and the multiplexer of each programmable arithmetic operation unit, so that the multiple programmable arithmetic operation units realize any combined operation;
controlling, according to the configuration information of the digital-to-analog conversion module, the on/off state of the digital-to-analog conversion circuits participating in the actual task;
controlling, according to the configuration information of the analog-to-digital conversion module, the on/off state of the analog-to-digital conversion circuits participating in the actual task;
controlling, according to the configuration information of the input interface module, the on/off state of the input interface circuits participating in the actual task;
controlling, according to the configuration information of the output interface module, the on/off state of the output interface circuits participating in the actual task;
controlling, according to the configuration information of the input register file, whether the data to be stored in the input register file comes from the input data of the input interface module or from the data to be processed of the output register file;
controlling, according to the configuration information of the output register file, whether the output register file outputs its data or sends it to the input register file as data to be processed.
8. An electronic device, characterized by comprising the software-definable storage-computing integrated chip according to any one of claims 1 to 7.
CN201920246699.8U 2019-02-26 2019-02-26 Can software definition deposit the integrated chip of calculation and electronic equipment Active CN209388304U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201920246699.8U CN209388304U (en) 2019-02-26 2019-02-26 Can software definition deposit the integrated chip of calculation and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201920246699.8U CN209388304U (en) 2019-02-26 2019-02-26 Can software definition deposit the integrated chip of calculation and electronic equipment

Publications (1)

Publication Number Publication Date
CN209388304U true CN209388304U (en) 2019-09-13

Family

ID=67854445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201920246699.8U Active CN209388304U (en) 2019-02-26 2019-02-26 Can software definition deposit the integrated chip of calculation and electronic equipment

Country Status (1)

Country Link
CN (1) CN209388304U (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112599132A (en) * 2019-09-16 2021-04-02 北京知存科技有限公司 Voice processing device and method based on storage and calculation integrated chip and electronic equipment
CN111128279A (en) * 2020-02-25 2020-05-08 杭州知存智能科技有限公司 Memory computing chip based on NAND Flash and control method thereof
CN116414456A (en) * 2023-01-19 2023-07-11 杭州知存智能科技有限公司 Weighted fusion conversion component in memory chip, memory circuit and cooperative computing method
CN116414456B (en) * 2023-01-19 2024-01-19 杭州知存智能科技有限公司 Weighted fusion conversion component in memory chip, memory circuit and cooperative computing method

Similar Documents

Publication Publication Date Title
CN209388304U (en) Can software definition deposit the integrated chip of calculation and electronic equipment
CN106844294B (en) Convolution algorithm chip and communication equipment
Park et al. A 65k-neuron 73-Mevents/s 22-pJ/event asynchronous micro-pipelined integrate-and-fire array transceiver
CN108090560A (en) The design method of LSTM recurrent neural network hardware accelerators based on FPGA
US20200192971A1 (en) Nand block architecture for in-memory multiply-and-accumulate operations
US20160196488A1 (en) Neural network computing device, system and method
CN111611197B (en) Operation control method and device of software-definable storage and calculation integrated chip
CN111611195A (en) Software-definable storage and calculation integrated chip and software definition method thereof
KR20130090147A (en) Neural network computing apparatus and system, and method thereof
WO2023045114A1 (en) Storage and computation integrated chip and data processing method
CN108170640B (en) Neural network operation device and operation method using same
CN108763163A (en) Analog vector-matrix multiplication operation circuit
CN109086249A (en) Analog vector-matrix multiplication operation circuit
WO2020258360A1 (en) Compute-in-memory chip, and memory unit array structure
TWI634489B (en) Multi-layer artificial neural network
CN109146070A (en) RRAM-based peripheral circuit and system supporting neural network training
CN209182823U (en) Digital-analog hybrid storage and calculation integrated chip and arithmetic unit for neural network
CN209766043U (en) Storage and calculation integrated chip and storage unit array structure
CN110163359A (en) Computing device and method
CN111767994B (en) Neuron computing device
JP7150998B2 (en) Superconducting neuromorphic core
US11797830B2 (en) Flexible accelerator for sparse tensors in convolutional neural networks
CN109670581B (en) Computing device and board card
CN109086871A (en) Neural network training method, device, electronic equipment and computer-readable medium
CN209388707U (en) Storage and calculation integrated chip

Legal Events

Date Code Title Description
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 213-175, 2nd Floor, Building 1, No. 180 Kecheng Street, Qiaosi Street, Linping District, Hangzhou City, Zhejiang Province, 311100

Patentee after: Hangzhou Zhicun Computing Technology Co.,Ltd.

Country or region after: China

Address before: 1416, shining building, No. 35, Xueyuan Road, Haidian District, Beijing 100083

Patentee before: BEIJING WITINMEM TECHNOLOGY Co.,Ltd.

Country or region before: China
