US11461617B2 - Neural network device - Google Patents

Publication number
US11461617B2
Authority
US
United States
Prior art keywords
direction data
reverse direction
data
router
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US15/911,366
Other versions
US20190156180A1 (en)
Inventor
Kumiko Nomura
Takao Marukame
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Assigned to KABUSHIKI KAISHA TOSHIBA: assignment of assignors interest (see document for details). Assignors: MARUKAME, TAKAO; NOMURA, KUMIKO
Publication of US20190156180A1
Application granted
Publication of US11461617B2

Classifications

    • G06N3/0445
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0613 Improving I/O performance in relation to throughput
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0653 Monitoring storage devices or systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/38 Flow based routing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/74 Address processing for routing

Definitions

  • Embodiments described herein relate generally to a neural network.
  • A technique that realizes a brain processor by implementing a neural network as hardware has been proposed.
  • A learning machine provides error data to the neural network to optimize a weight coefficient or the like set to the neural network.
  • A conventional neural network performs learning processing while normal arithmetic processing is stopped, to optimize the weight coefficient. Therefore, in the conventional neural network, an external processor can perform the learning processing.
  • In the brain processor, however, the neural network has to perform the arithmetic processing and the learning processing in parallel. In this case, processing to propagate arithmetic-processing target data received from an external device in a forward direction and processing to propagate error data for learning in a reverse direction need to be performed in parallel in the neural network.
  • FIG. 1 is a diagram illustrating a configuration of a neural network device according to an embodiment
  • FIG. 2 is a diagram illustrating contents of forward direction processing
  • FIG. 3 is a diagram illustrating contents of reverse direction processing
  • FIG. 4 is a diagram illustrating a configuration of a data processing unit
  • FIG. 5 is a diagram illustrating a correspondence relation between constituent elements included in the neural network and cores
  • FIG. 6 is a diagram illustrating data to be transmitted and received between a plurality of cores and a plurality of routers
  • FIG. 7 is a diagram illustrating an example of a configuration of data
  • FIG. 8 is a diagram illustrating a configuration of the router
  • FIG. 9 is a diagram illustrating a configuration of an input circuit and an output circuit
  • FIG. 10 is a flowchart illustrating processing performed by a reception unit of the input circuit
  • FIG. 11 is a flowchart illustrating processing performed by an in-router transmission unit of the input circuit
  • FIG. 12 is a flowchart illustrating processing performed by an in-router reception unit of the output circuit
  • FIG. 13 is a flowchart illustrating processing performed by a transmission unit of the output circuit
  • FIG. 14 is a diagram illustrating a configuration of an output storage unit according to a first modification
  • FIG. 15 is a flowchart illustrating processing performed by an in-router reception unit according to a second modification.
  • FIG. 16 is a flowchart illustrating processing performed by an in-router reception unit according to a third modification.
  • a neural network device includes a plurality of cores, and a plurality of routers.
  • the plurality of cores perform processing of a part of constituent elements in a neural network.
  • The plurality of routers transfer data output from each of the plurality of cores to any one of the plurality of cores so that processing is performed according to a configuration of the neural network.
  • Each of the plurality of routers includes an input circuit and an output circuit.
  • Each of the plurality of cores transmits at least one of forward direction data propagating in the neural network in a forward direction and reverse direction data propagating in the neural network in a reverse direction.
  • the input circuit receives the forward direction data and the reverse direction data from any one of the plurality of cores and the plurality of routers.
  • the output circuit or the input circuit selectively deletes the reverse direction data stored based on a request signal for requesting reception of data.
  • a neural network device 10 according to an embodiment will be described below with reference to the drawings.
  • the neural network device 10 according to the embodiment can reduce traffic congestion therein, while performing normal data processing and learning processing in the neural network in parallel.
  • FIG. 1 is a diagram illustrating a configuration of the neural network device 10 according to the embodiment.
  • the neural network device 10 includes a data processing unit 20 , a communication unit 22 , a learning unit 24 , and a setting unit 26 .
  • the data processing unit 20 , the communication unit 22 , the learning unit 24 , and the setting unit 26 can be installed in one semiconductor device, can be installed in a plurality of semiconductor devices provided on one substrate, or can be installed in a plurality of semiconductor devices provided on a plurality of substrates.
  • the learning unit 24 and the setting unit 26 can be realized by the same processor.
  • the neural network device 10 receives input data from an external device.
  • the neural network device 10 performs arithmetic processing using a neural network with respect to the received input data.
  • the neural network device 10 transmits output data, which is a result of the arithmetic processing using the neural network, to the external device.
  • the data processing unit 20 performs normal arithmetic processing based on the neural network.
  • the data processing unit 20 performs, for example, various types of information processing such as pattern recognition processing, data analysis processing, and control processing as the normal arithmetic processing based on the neural network.
  • the data processing unit 20 performs the learning processing in parallel with the normal arithmetic processing.
  • Through the learning processing, the data processing unit 20 changes a plurality of coefficients (weights) included in the neural network so that the normal arithmetic processing is performed more appropriately.
  • the communication unit 22 transmits and receives data to and from the external device. Specifically, in the normal arithmetic processing, the communication unit 22 receives input data as the arithmetic-processing target data from the external device. The communication unit 22 also transmits output data as a result of the arithmetic processing to the external device.
  • the learning unit 24 acquires output data output from the data processing unit 20 in the normal arithmetic processing. In the learning processing, the learning unit 24 calculates error data representing an error in the output data and provides the calculated error data to the data processing unit 20 .
  • The learning unit 24 changes the plurality of coefficients (weights) included in the neural network based on information acquired when the data processing unit 20 propagates the error data through the plurality of layers in the reverse direction. For example, the learning unit 24 calculates a gradient of the error with respect to each of the coefficients included in the neural network. The learning unit 24 then changes the coefficients, for example, in a direction that brings the gradient of the error toward zero.
  • the setting unit 26 sets the changed coefficients to the data processing unit 20 , when the learning unit 24 changes the coefficients included in the neural network.
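  • As a minimal sketch of the coefficient update described above, the following assumes plain gradient descent: each coefficient is moved against its error gradient by a fixed step. The function name `update_coefficients`, the `learning_rate` parameter, and the numeric values are hypothetical and do not come from the embodiment, which only states that the coefficients are changed based on the gradient of the error.

```python
def update_coefficients(coefficients, gradients, learning_rate=0.1):
    """Move each coefficient against its error gradient (assumed: plain gradient descent)."""
    return [w - learning_rate * g for w, g in zip(coefficients, gradients)]


# Hypothetical values: three coefficients and the error gradient for each of them.
weights = [0.5, -0.2, 0.1]
grads = [0.1, -0.4, 0.0]
new_weights = update_coefficients(weights, grads)
```

  • In terms of the units above, the learning unit 24 would compute something like `new_weights`, and the setting unit 26 would then write those values back to the data processing unit 20. A coefficient whose gradient is already zero is left unchanged.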
  • FIG. 2 is a diagram illustrating contents of the normal arithmetic processing (forward direction processing) in the neural network.
  • the neural network includes a plurality of layers. Each of the layers performs a predetermined arithmetic operation and processing with respect to the received data. Each of the layers included in the neural network includes a plurality of nodes. The number of nodes included in one layer may be different for each layer.
  • An activation function is set to each node.
  • the activation function may be different for each layer. Further, in the same layer, the activation function may be different for each node.
  • A coefficient (weight) is set to each link connecting two nodes. When propagating data from a node to the next node, the neural network multiplies the data by the coefficient set to the link. These coefficients are changed as appropriate by the learning processing.
  • the data processing unit 20 performs the forward direction processing, in which an arithmetic operation is performed while propagating data in the forward direction to the layers in the neural network, in the normal arithmetic processing in the neural network. For example, in the forward direction processing, the data processing unit 20 provides input data to input layers. Subsequently, in the forward direction processing, the data processing unit 20 propagates data output from each layer to a layer immediately thereafter in the forward direction. Subsequently, in the forward direction processing, the data processing unit 20 transmits the data output from an output layer to the external device as output data.
  • Data propagating through the plurality of layers in the forward direction is referred to as forward direction data.
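  • The forward direction processing described above can be sketched as follows. This is an illustrative software model only, not the hardware of the embodiment: the layer sizes, the coefficient values, and the choice of a sigmoid activation are assumptions, and the names `forward_layer` and `forward` are hypothetical.

```python
import math


def sigmoid(x):
    """Example activation function (an assumption; the text allows any function per node)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward_layer(inputs, weights, activation):
    """Compute one layer. weights[j][i] is the coefficient on the link from input i to node j."""
    outputs = []
    for row in weights:
        total = sum(w * x for w, x in zip(row, inputs))  # multiply by coefficients and add
        outputs.append(activation(total))                # apply the node's activation function
    return outputs


def forward(input_data, layers):
    """Propagate data from the first layer to the last layer, one layer at a time."""
    data = input_data
    for weights, activation in layers:
        data = forward_layer(data, weights, activation)
    return data


# Hypothetical two-layer network: two sigmoid nodes followed by one linear output node.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], sigmoid),
    ([[1.0, 1.0]], lambda x: x),
]
output = forward([1.0, 1.0], layers)
```

  • Each intermediate value of `data` in `forward` corresponds to forward direction data passed from one layer to the layer immediately after it.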
  • FIG. 3 is a diagram illustrating contents of the learning processing (reverse direction processing) in the neural network.
  • An error function is set to each node.
  • The error function is the derivative of the activation function set to the node.
  • the learning unit 24 calculates error data representing an error with respect to the output data output in the forward direction processing. Subsequently, in the reverse direction processing, the data processing unit 20 provides the error data generated by the learning unit 24 to the output layer. In the reverse direction processing, the data processing unit 20 propagates the plurality of pieces of data output from the respective layers to the layer immediately before in the reverse direction.
  • Data propagating through the plurality of layers in the reverse direction is referred to as reverse direction data.
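  • One step of the reverse direction processing can be sketched as below, assuming the standard backpropagation rule: the error arriving at a layer is weighted by the link coefficients and multiplied by the error function (the derivative of the activation function) of each node in the layer immediately before. The text does not spell out this formula, so the rule itself, the sigmoid choice, and the names `backward_layer` and `pre_activations` are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Error function for a sigmoid node: the derivative of its activation function."""
    s = sigmoid(x)
    return s * (1.0 - s)


def backward_layer(errors, weights, pre_activations, error_function):
    """Propagate errors from one layer to the layer immediately before it.

    weights[j][i] is the coefficient on the link from earlier node i to later node j.
    pre_activations[i] is the summed input of earlier node i from the forward pass.
    """
    previous_errors = []
    for i, z in enumerate(pre_activations):
        weighted = sum(weights[j][i] * errors[j] for j in range(len(errors)))
        previous_errors.append(error_function(z) * weighted)
    return previous_errors


# Hypothetical values: one output-side error, two nodes in the earlier layer.
errors = [0.2]
weights = [[1.0, -2.0]]
pre_activations = [0.0, 0.0]
previous = backward_layer(errors, weights, pre_activations, sigmoid_derivative)
```

  • The list `previous` plays the role of reverse direction data handed to the next earlier layer.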
  • FIG. 4 is a diagram illustrating a configuration of the data processing unit 20 .
  • The data processing unit 20 includes a plurality of cores 30, a plurality of routers 40, and a communication channel 42 (42-1, 42-2).
  • Each of the cores 30 performs an arithmetic operation and processing of a part of constituent elements in the neural network.
  • Each of the cores 30 can be a processor, a dedicated hardware circuit, a digital circuit, or an analog circuit.
  • each of the cores 30 includes a storage unit, and can store the coefficients included in the neural network in the storage unit.
  • the routers 40 transfer data output from each of the cores 30 to any one of the cores 30 via the communication channel 42 , so that an arithmetic operation and processing are performed according to the configuration of the neural network.
  • each of the routers 40 is arranged at a branch point of the communication channel 42 .
  • Each of the routers 40 is directly connected with a plurality of other routers 40 via the communication channel 42 .
  • Each of the routers 40 transmits and receives data to and from the other routers 40 directly connected via the communication channel 42.
  • each of the routers 40 is connected with one or a plurality of cores 30 and can transmit and receive data to and from the connected cores 30 .
  • the cores 30 are provided in one-to-one association with the routers 40 and transmit and receive data to and from the routers 40 provided in association therewith.
  • Each of the routers 40 transfers data received from a source, which is another router 40 or the core 30 connected with the corresponding router 40, to a destination, which is another router 40 or the core 30 connected with the corresponding router 40.
  • the routers 40 are arranged in a matrix in a first array direction and in a second array direction.
  • the second array direction is a direction orthogonal to the first array direction.
  • The communication channel 42 is a cross-bar network including a plurality of first communication channels 42-1 arranged in the first array direction, and a plurality of second communication channels 42-2 arranged in the second array direction orthogonal to the first array direction.
  • The routers 40 are provided at the points of intersection of the first communication channels 42-1 and the second communication channels 42-2 in the cross-bar network. Accordingly, the routers 40 can transfer data output from any core 30 to any of the cores 30.
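  • To make the matrix arrangement concrete, the sketch below lists which routers a given router could be directly connected to. It assumes that a router is linked only to its immediate neighbors along the two array directions and that routers at the edge of the matrix simply have fewer neighbors (no wrap-around); neither detail is stated in the text above, and the function name `neighbor_routers` is hypothetical.

```python
def neighbor_routers(row, col, rows, cols):
    """Return the grid positions of routers adjacent to the router at (row, col)."""
    candidates = [
        (row, col - 1), (row, col + 1),  # neighbors along the first array direction
        (row - 1, col), (row + 1, col),  # neighbors along the second array direction
    ]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


# Hypothetical 3 x 3 matrix of routers.
corner = neighbor_routers(0, 0, 3, 3)
center = neighbor_routers(1, 1, 3, 3)
```

  • Under these assumptions a corner router has two neighbors and an interior router has four, so data between distant cores is relayed hop by hop through intermediate routers.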
  • FIG. 5 is a diagram illustrating a correspondence relation between the constituent elements included in the neural network and the cores 30 that perform processing in the constituent elements.
  • any of the constituent elements included in the neural network is allocated beforehand to each of the cores 30 .
  • Each of the cores 30 performs an arithmetic operation or processing of the constituent element allocated thereto beforehand, among the constituent elements included in the neural network.
  • the constituent elements included in the neural network are, for example, an arithmetic operation of the activation function and an arithmetic operation of the error function in the node, multiplication of a coefficient set to the link, addition of data multiplied by the coefficient, input of data from the external device, output of data to the external device, acquisition of the error data, output of gradient data, and the like.
  • The constituent elements are allocated to the cores 30 so that every constituent element included in the neural network is performed by one of the cores 30.
  • the processing to be performed in one core 30 can be, for example, processing to be performed in one node.
  • a certain core 30 performs multiplication of a coefficient set to the link, addition of a plurality of pieces of data received from a layer on a former stage, an arithmetic operation of the activation function, or an arithmetic operation of the error function, in one node in a certain layer.
  • the arithmetic operation and processing to be performed in one core 30 can be an arithmetic operation of a part of one node.
  • a certain core 30 can perform an arithmetic operation of the activation function in one node, and another core 30 can perform multiplication and addition of coefficients in the node.
  • the arithmetic operation and processing to be performed in one core 30 can be all the processing in a plurality of nodes included in one layer.
  • the data processing unit 20 can perform processing of the constituent elements included in the neural network in a distributed manner to the plurality of cores 30 .
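  • The allocation of constituent elements to cores can be illustrated with a small sketch. The round-robin policy, the tuple representation of a constituent element, and the name `allocate` are all assumptions made for illustration; the text only requires that every constituent element be allocated beforehand to some core.

```python
from collections import defaultdict


def allocate(elements, core_count):
    """Assign every constituent element to a core (assumed policy: round-robin)."""
    allocation = defaultdict(list)
    for index, element in enumerate(elements):
        allocation[index % core_count].append(element)
    return dict(allocation)


# Hypothetical constituent elements, written as (layer, node, operation).
elements = [
    ("layer1", "node0", "multiply_add"),
    ("layer1", "node0", "activation"),
    ("layer1", "node1", "multiply_add"),
    ("layer1", "node1", "activation"),
    ("layer2", "node0", "multiply_add"),
]
allocation = allocate(elements, core_count=3)
```

  • With this mapping, the multiply-add and the activation of one node can land on different cores, which matches the case where one core performs only a part of a node.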
  • FIG. 6 is a diagram illustrating data to be transmitted and received between the cores 30 and the routers 40 .
  • Each of the cores 30 transmits at least one of the forward direction data propagating in the neural network in the forward direction and the reverse direction data propagating in the neural network in the reverse direction to the router 40 connected to the corresponding core 30 . Further, each of the cores 30 receives at least one of the forward direction data and the reverse direction data from the router 40 connected to the corresponding core 30 .
  • each of the routers 40 receives the forward direction data and the reverse direction data from the core 30 connected to the corresponding router 40 or from another router 40 . Further, each of the routers 40 transmits the received forward direction data and reverse direction data to the core 30 connected to the corresponding router 40 or to another router 40 .
  • When transmitting the forward direction data or the reverse direction data, the core 30 first transmits a request signal, which requests reception of that data, to the router 40 connected to the corresponding core 30. Likewise, when transmitting the forward direction data or the reverse direction data, the router 40 first transmits a request signal to the destination, which is the core 30 connected to the corresponding router 40 or another router 40.
  • Upon reception of the request signal and when reception is possible, the core 30 transmits an enabling signal to the router 40 that has transmitted the request signal. Upon reception of the request signal and when reception is possible, the router 40 transmits an enabling signal to the core 30 or the other router 40 that has transmitted the request signal.
  • Having received the enabling signal, the core 30 transmits the forward direction data or the reverse direction data to the router 40 connected to the corresponding core 30. Likewise, having received the enabling signal, the router 40 transmits the forward direction data or the reverse direction data to the destination, which is another router 40 or the core 30 connected to the corresponding router 40.
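  • The request and enabling exchange described above can be modeled in a few lines. This is a simplified, single-threaded sketch: the class `Receiver`, its `capacity` parameter, and the use of a boolean return value to stand in for the enabling signal are assumptions for illustration, not the signalling of the embodiment.

```python
from collections import deque


class Receiver:
    """Stands in for a core or router on the receiving side of a link."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()

    def on_request(self):
        """Answer a request signal; True stands for the enabling signal."""
        return len(self.buffer) < self.capacity

    def on_data(self, data):
        self.buffer.append(data)


def send(receiver, data):
    """Transmit data only after the receiver has answered the request with an enabling signal."""
    if receiver.on_request():
        receiver.on_data(data)
        return True
    return False


# Hypothetical receiver that can hold a single piece of data.
router = Receiver(capacity=1)
first = send(router, "forward-data")
second = send(router, "reverse-data")
```

  • Here the first transfer succeeds, while the second is held back because the receiver cannot accept more data and therefore returns no enabling signal.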
  • FIG. 7 is a diagram illustrating an example of a configuration of data.
  • the forward direction data and the reverse direction data include, for example, entity data and a header as illustrated in FIG. 7 .
  • entity data is a target of an arithmetic operation and processing in the neural network.
  • the header includes information required for transferring a packet to an intended core 30 , information required for performing an arithmetic operation and processing with respect to the entity data, and the like.
  • the header includes an ID, a data type, a previous processing address, and a subsequent processing address.
  • the ID is information for identifying input data, which is a base of the corresponding entity data.
  • the data type is information for identifying whether the entity data is the forward direction data propagating in the forward direction (data propagating in the normal arithmetic processing) or the reverse direction data propagating in the reverse direction (data propagating in the learning processing).
  • the previous processing address is an address for identifying the core 30 that has output the corresponding data.
  • the previous processing address can be information for identifying a layer and a node in which the corresponding data is generated in the neural network.
  • The subsequent processing address is an address for identifying the core 30 that next performs an arithmetic operation or processing on the corresponding data in the neural network.
  • The subsequent processing address can be information for identifying a constituent element (a layer or a node) that performs an arithmetic operation or processing on the corresponding data.
  • the configuration of the header is not limited to the configuration described above, and the header can have another configuration so long as the router 40 can transfer the entity data to a proper core 30 so that an arithmetic operation and processing can be performed with respect to the entity data according to the configuration of the neural network.
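  • The data layout described above can be written down as a small data structure. The field names, the integer address type, and the floating-point entity value below are assumptions chosen for illustration; the text lists only the four header items and the entity data.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    FORWARD = "forward"  # data propagating in the normal arithmetic processing
    REVERSE = "reverse"  # data propagating in the learning processing


@dataclass(frozen=True)
class Header:
    data_id: int             # identifies the input data the entity data is based on
    data_type: DataType      # forward direction data or reverse direction data
    previous_address: int    # core that output this data
    subsequent_address: int  # core that processes this data next


@dataclass(frozen=True)
class Packet:
    header: Header
    entity: float  # target of the arithmetic operation


# Hypothetical reverse direction packet travelling from core 3 to core 2.
packet = Packet(Header(7, DataType.REVERSE, 3, 2), entity=-0.125)
is_reverse = packet.header.data_type is DataType.REVERSE
```

  • A router would need only `subsequent_address` to choose a destination and `data_type` to tell the two kinds of data apart.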
  • FIG. 8 is a diagram illustrating a configuration of the router 40 .
  • the router 40 includes one or more input circuits 50 and one or more output circuits 60 .
  • Each of the one or more input circuits 50 receives the forward direction data and the reverse direction data from any one of the cores 30 or the routers 40 .
  • Each of the one or more input circuits 50 is connected, via the communication channel 42, to one of the cores 30 or the routers 40 set in advance, and receives the forward direction data and the reverse direction data from the connected core 30 or router 40.
  • Each of the one or more output circuits 60 transmits the forward direction data and the reverse direction data to any one of the cores 30 or the routers 40 .
  • each of the one or more output circuits 60 is connected to any one of the cores 30 or the routers 40 set in advance via the communication channel 42 , to transmit the forward direction data and the reverse direction data to the connected one core 30 or one router 40 .
  • The input circuit 50 is connected to all the output circuits 60 provided in the corresponding router 40. However, the input circuit 50 need not be connected to the output circuit 60 that is connected to the same core 30 or the same router 40 as the corresponding input circuit 50.
  • the router 40 includes a first set of the input circuit 50 and the output circuit 60 , a second set of the input circuit 50 and the output circuit 60 , a third set of the input circuit 50 and the output circuit 60 , a fourth set of the input circuit 50 and the output circuit 60 , and a fifth set of the input circuit 50 and the output circuit 60 .
  • the first set and the second set are connected to other routers 40 adjacent thereto in the first array direction in a matrix.
  • the third set and the fourth set are connected to other routers 40 adjacent thereto in the second array direction in a matrix.
  • the fifth set is connected to the core 30 provided in association with the corresponding router 40 .
  • The input circuit 50 is connected to each of the plurality of output circuits 60 by a separate signal line.
  • the input circuit 50 can be connected to each of the plurality of output circuits 60 by a common bus. That is, the router 40 can have a configuration in which each of the one or more input circuits 50 and each of the one or more output circuits 60 are connected to the same bus.
  • the input circuit 50 transmits data added with an identifier of the output circuit 60 as a destination to the bus.
  • the output circuit 60 selects and receives the data added with the identifier of the output circuit 60 from the bus. Accordingly, the input circuit 50 can transmit the forward direction data and the reverse direction data to one specific output circuit 60 among the one or more output circuits 60 .
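  • The common-bus variant described above can be sketched as follows: data is placed on the bus together with the identifier of the destination output circuit, and each output circuit keeps only the data carrying its own identifier. The class names `Bus` and `OutputCircuit` and the method names are hypothetical.

```python
class OutputCircuit:
    def __init__(self, identifier):
        self.identifier = identifier
        self.received = []

    def observe(self, destination_id, data):
        """Accept data from the bus only when it carries this circuit's identifier."""
        if destination_id == self.identifier:
            self.received.append(data)


class Bus:
    def __init__(self, output_circuits):
        self.output_circuits = output_circuits

    def broadcast(self, destination_id, data):
        """Present the tagged data to every output circuit attached to the bus."""
        for circuit in self.output_circuits:
            circuit.observe(destination_id, data)


# Hypothetical router with five output circuits, matching the five sets described above.
outputs = [OutputCircuit(i) for i in range(5)]
bus = Bus(outputs)
bus.broadcast(3, "forward-data")
```

  • Only the output circuit with identifier 3 stores the data, so an input circuit can address one specific output circuit even though all circuits share the same bus.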
  • FIG. 9 is a diagram illustrating a configuration of the input circuit 50 and the output circuit 60 .
  • the input circuit 50 and the output circuit 60 are connected in one-to-one association.
  • the input circuit 50 is connected to one or the plurality of output circuits 60 in the corresponding router 40 .
  • the output circuit 60 is connected to one or the plurality of input circuits 50 in the corresponding router 40 .
  • the input circuit 50 includes a reception unit 52 , an input storage unit 54 , and an in-router transmission unit 56 .
  • the reception unit 52 receives a request signal, the forward direction data, and the reverse direction data from the core 30 or the router 40 connected to the corresponding input circuit 50 via the communication channel 42 . Details of the processing performed by the reception unit 52 are described later with reference to FIG. 10 .
  • the input storage unit 54 stores therein the forward direction data and the reverse direction data received by the reception unit 52 .
  • the input storage unit 54 is a first-in first-out buffer (FIFO buffer).
  • the input storage unit 54 can be a shift register that shifts data for each data size of the forward direction data and the reverse direction data.
  • the in-router transmission unit 56 transmits a first request signal, a second request signal, the forward direction data, and the reverse direction data to each of one or the plurality of output circuits 60 in the router 40 . Details of the processing performed by the in-router transmission unit 56 are described later with reference to FIG. 11 .
  • the output circuit 60 includes an in-router reception unit 62 , an output storage unit 64 , and a transmission unit 66 .
  • the in-router reception unit 62 receives the first request signal, the second request signal, the forward direction data, and the reverse direction data from each of one or the plurality of input circuits 50 in the router 40 . Details of the processing performed by the in-router reception unit 62 are described later with reference to FIG. 12 .
  • the output storage unit 64 includes a forward-direction data buffer 72 and a reverse-direction data buffer 74 .
  • the forward-direction data buffer 72 stores therein the forward direction data received by the in-router reception unit 62 .
  • the forward-direction data buffer 72 is a first-in first-out buffer (FIFO buffer). Further, the forward-direction data buffer 72 can be, for example, a shift register that shifts data for each data size of the forward direction data.
  • the reverse-direction data buffer 74 stores therein the reverse direction data received by the in-router reception unit 62 .
  • the reverse-direction data buffer 74 is a first-in first-out buffer (FIFO buffer). Further, the reverse-direction data buffer 74 can be, for example, a shift register that shifts data for each data size of the reverse direction data.
  • The transmission unit 66 transmits a request signal, the forward direction data, and the reverse direction data to the core 30 or the router 40 connected to the corresponding output circuit 60 via the communication channel 42. Details of the processing performed by the transmission unit 66 are described later with reference to FIG. 13.
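  • The output storage unit with its two separate first-in first-out buffers can be modeled as below. Using `collections.deque` for each buffer and string labels for the data type are assumptions for illustration; the class and method names are hypothetical.

```python
from collections import deque


class OutputStorage:
    """Keeps forward direction data and reverse direction data in separate FIFO buffers."""

    def __init__(self):
        self.forward_buffer = deque()
        self.reverse_buffer = deque()

    def _buffer_for(self, data_type):
        if data_type == "forward":
            return self.forward_buffer
        if data_type == "reverse":
            return self.reverse_buffer
        raise ValueError(f"unknown data type: {data_type!r}")

    def write(self, data_type, data):
        self._buffer_for(data_type).append(data)

    def read(self, data_type):
        """Remove and return the least recently written data of the given type."""
        return self._buffer_for(data_type).popleft()


storage = OutputStorage()
storage.write("forward", "f1")
storage.write("reverse", "r1")
storage.write("forward", "f2")
first_forward = storage.read("forward")
```

  • Because the two kinds of data never share a queue, reverse direction data waiting in its buffer does not block forward direction data behind it, and each buffer can be handled independently.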
  • FIG. 10 is a flowchart illustrating the processing performed by the reception unit 52 of the input circuit 50 .
  • the reception unit 52 performs processes at S 11 to S 16 described below.
  • the reception unit 52 receives the request signal from the core 30 or the router 40 connected to the corresponding input circuit 50 via the communication channel 42 . Subsequently, at S 12 , the reception unit 52 determines whether there is a free space in the input storage unit 54 .
  • the reception unit 52 holds the processing for a certain period of time. After having waited for the certain period of time, the reception unit 52 returns the processing to S 12 , and repeats processes at S 12 and S 13 until a free space becomes available in the input storage unit 54 . If a free space does not become available in the input storage unit 54 even if the reception unit 52 has waited for a certain number of times or for a predetermined time or longer, the reception unit 52 can transmit a disabling signal to the core 30 or the router 40 that has transmitted the request signal to finish the processing.
  • the reception unit 52 transmits an enabling signal to the core 30 or the router 40 that has transmitted the request signal.
  • the core 30 or the router 40 that has transmitted the request signal transmits the forward direction data or the reverse direction data to the corresponding input circuit 50 .
  • the reception unit 52 receives the forward direction data or the reverse direction data from the core 30 or the router 40 that has transmitted the request signal.
  • the reception unit 52 writes the received forward direction data or reverse direction data in the input storage unit 54 . After the process at S 16 , the reception unit 52 finishes the present flow.
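The request/enable handshake of FIG. 10 can be sketched as follows. This is a minimal illustration, not the circuit itself: the class name `InputCircuit`, the `max_waits` retry limit, and the `wait` callback are assumptions introduced here, and the input storage unit 54 is modelled as a simple bounded FIFO.

```python
from collections import deque

ENABLE, DISABLE = "enable", "disable"


class InputCircuit:
    """Hypothetical model of the reception unit 52 and the input storage unit 54."""

    def __init__(self, capacity, max_waits=3):
        self.capacity = capacity      # number of slots in the input storage unit
        self.max_waits = max_waits    # assumed retry limit before sending a disabling signal
        self.storage = deque()        # FIFO order: oldest entry on the left

    def has_free_space(self):
        return len(self.storage) < self.capacity

    def handle_request(self, wait=lambda: None):
        """S11 to S14: answer a request signal with ENABLE or DISABLE."""
        for _ in range(self.max_waits):
            if self.has_free_space():     # S12: is there a free space?
                return ENABLE             # S14: enabling signal
            wait()                        # S13: hold, then check again
        return ENABLE if self.has_free_space() else DISABLE

    def receive(self, data):
        """S15 to S16: receive the data and write it to the storage."""
        self.storage.append(data)
```

The sender is expected to transmit data only after it sees `ENABLE`; the `wait` callback stands in for the "certain period of time" during which another unit may free a slot.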
  • FIG. 11 is a flowchart illustrating the processing performed by the in-router transmission unit 56 of the input circuit 50 .
  • the in-router transmission unit 56 repeatedly performs processes at S 21 to S 28 described below during the operation of the neural network device 10 .
  • the in-router transmission unit 56 monitors the input storage unit 54 to determine whether the forward direction data or the reverse direction data is present in the input storage unit 54 . If there is no forward direction data or reverse direction data in the input storage unit 54 (NO at S 21 ), the in-router transmission unit 56 repeats the process at S 21 .
  • the in-router transmission unit 56 reads out one piece of the forward direction data or one piece of the reverse direction data that is the least recently written data and has not been transmitted yet from the input storage unit 54 .
  • the in-router transmission unit 56 refers to a header of the read-out forward direction data or reverse direction data to decide one destination from the cores 30 or the routers 40 connected to the corresponding router 40 .
  • the in-router transmission unit 56 analyzes the header to detect an address (for example, the next processing address) of the core 30 that performs the next arithmetic operation and processing with respect to the read-out forward direction data or reverse direction data.
  • the in-router transmission unit 56 finds one route, through which data can be transferred from the corresponding router 40 to the detected core 30 appropriately (for example, with the shortest time or the shortest distance).
  • the in-router transmission unit 56 decides the core 30 or the router 40 on the one route found out from the cores 30 or the routers 40 connected to the corresponding router 40 as a destination.
  • the in-router transmission unit 56 transmits the first request signal for requesting reception of the forward direction data to the output circuit 60 connected to the core 30 or the router 40 decided as the destination. Further, when having read out the reverse direction data, the in-router transmission unit 56 transmits the second request signal for requesting reception of the reverse direction data to the output circuit 60 connected to the core 30 or the router 40 decided as the destination.
  • Upon reception of the first request signal, if the output circuit 60 can receive the forward direction data, the output circuit 60 transmits an enabling signal to the source of the first request signal. Further, upon reception of the second request signal, if the output circuit 60 can receive the reverse direction data, the output circuit 60 transmits an enabling signal to the source of the second request signal.
  • the in-router transmission unit 56 determines whether the enabling signal has been received from the output circuit 60 connected to the core 30 or the router 40 decided as the destination. If the enabling signal has not been received (NO at S 25 ), at S 26 , the in-router transmission unit 56 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router transmission unit 56 returns the processing to S 25 , and repeats processes at S 25 and S 26 until the enabling signal can be received.
  • the in-router transmission unit 56 can return the processing to S 21 .
  • Upon reception of the enabling signal (YES at S 25), at S 27, the in-router transmission unit 56 transmits the read-out forward direction data or reverse direction data to the output circuit 60 connected to the core 30 or the router 40 decided as the destination. Subsequently, at S 28, the in-router transmission unit 56 deletes the transmitted forward direction data or reverse direction data from the input storage unit 54. After the process at S 28, the in-router transmission unit 56 returns the processing to S 21, to perform the present flow repeatedly.
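The destination decision at S 23 and the choice of request signal at S 24 can be illustrated with a short sketch. The text only says that one appropriate route is found (for example, the shortest one); the breadth-first search below, the `TOPOLOGY` dictionary, the packet fields `direction` and `next_address`, and the signal names are all assumptions made for illustration.

```python
from collections import deque


def next_hop(topology, here, dest):
    """Return the neighbour of `here` that lies on one shortest path to `dest`."""
    if here == dest:
        return here
    first_step = {n: n for n in topology[here]}   # node -> first hop used to reach it
    queue = deque(topology[here])
    while queue:
        node = queue.popleft()
        if node == dest:
            return first_step[node]
        for nxt in topology[node]:
            if nxt != here and nxt not in first_step:
                first_step[nxt] = first_step[node]
                queue.append(nxt)
    return None   # destination not reachable


def request_signal_for(packet):
    """S24: first request signal for forward data, second for reverse data."""
    return "REQ_FORWARD" if packet["direction"] == "forward" else "REQ_REVERSE"


# Hypothetical topology: three routers in a line with one core at each end.
TOPOLOGY = {
    "C0": ["R0"],
    "R0": ["C0", "R1"],
    "R1": ["R0", "R2"],
    "R2": ["R1", "C2"],
    "C2": ["R2"],
}

packet = {"direction": "reverse", "next_address": "C2"}
hop = next_hop(TOPOLOGY, "R0", packet["next_address"])
signal = request_signal_for(packet)
```

Only the first hop is returned because each router forwards to a directly connected core or router and leaves the rest of the route to the next router.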
  • FIG. 12 is a flowchart illustrating the processing performed by the in-router reception unit 62 of the output circuit 60 .
  • the in-router reception unit 62 performs processes at S 41 to S 52 described below.
  • the in-router reception unit 62 receives the first request signal or the second request signal from any one of the input circuits 50 . Subsequently, at S 42 , the in-router reception unit 62 determines whether the request signal is a reception request of the forward direction data (that is, the first request signal has been received), or is a reception request of the reverse direction data (that is, the second request signal has been received).
  • the in-router reception unit 62 advances the processing to S 43 .
  • the in-router reception unit 62 determines whether there is a free space in the forward-direction data buffer 72 in the output storage unit 64 .
  • the in-router reception unit 62 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router reception unit 62 returns the processing to S 43 , and repeats processes at S 43 and S 44 until a free space becomes available in the forward-direction data buffer 72 . If a free space does not become available even if the in-router reception unit 62 has waited for a certain number of times or for a predetermined time or longer, the in-router reception unit 62 can transmit a disabling signal to the input circuit 50 that has transmitted the first request signal, to finish the processing.
  • the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the first request signal. Upon reception of the enabling signal, the input circuit 50 that has transmitted the first request signal transmits the forward direction data to the corresponding output circuit 60 .
  • the in-router reception unit 62 receives the forward direction data from the input circuit 50 that has transmitted the first request signal.
  • the in-router reception unit 62 writes the received forward direction data in the forward-direction data buffer 72 . After the process at S 47 , the in-router reception unit 62 finishes the present flow.
  • By performing processes at S 43 to S 47, if there is no free space for storing the forward direction data, the output circuit 60 holds reception until a free space is ensured. Accordingly, the output circuit 60 can transfer the forward direction data reliably to the destination core 30.
  • the in-router reception unit 62 advances the processing to S 48 .
  • the in-router reception unit 62 determines whether there is a free space in the reverse-direction data buffer 74 in the output storage unit 64 .
  • the in-router reception unit 62 deletes the reverse direction data stored in the reverse-direction data buffer 74 in the output storage unit 64 .
  • When the reverse-direction data buffer 74 is a FIFO buffer, the in-router reception unit 62 deletes one piece of the reverse direction data stored at the head of the reverse-direction data buffer 74. That is, the in-router reception unit 62 deletes one piece of the reverse direction data least recently written from the reverse-direction data buffer 74. Accordingly, the in-router reception unit 62 can ensure a free space in the reverse-direction data buffer 74 in the output storage unit 64.
  • the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the second request signal. Upon reception of the enabling signal, the input circuit 50 that has transmitted the second request signal transmits the reverse direction data to the corresponding output circuit 60 .
  • the in-router reception unit 62 receives the reverse direction data from the input circuit 50 that has transmitted the second request signal.
  • the in-router reception unit 62 writes the received reverse direction data in the reverse-direction data buffer 74 . After the process at S 52 , the in-router reception unit 62 finishes the present flow.
  • By performing processes at S 48 to S 52, if there is no free space for storing the reverse direction data, the output circuit 60 selectively deletes the reverse direction data from the output storage unit 64 to ensure a free space, and immediately receives the reverse direction data. Accordingly, the output circuit 60 can eliminate stagnation of the reverse direction data to ensure smooth traffic.
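The asymmetric handling in FIG. 12 (forward data waits for space, reverse data evicts the oldest reverse entry) can be sketched as below. The class `OutputStorage` and its method names are hypothetical, both buffers are modelled as FIFO queues, and the reverse buffer is assumed to hold at least one slot.

```python
from collections import deque


class OutputStorage:
    """Hypothetical model of the forward buffer 72 and the reverse buffer 74."""

    def __init__(self, fwd_slots, rev_slots):
        self.fwd = deque()            # forward-direction data buffer 72
        self.rev = deque()            # reverse-direction data buffer 74
        self.fwd_slots = fwd_slots
        self.rev_slots = rev_slots    # assumed to be >= 1

    def request_forward(self):
        """S43 to S45: enable only when there is room; forward data is never deleted."""
        return len(self.fwd) < self.fwd_slots

    def request_reverse(self):
        """S48 to S50: when full, delete the oldest reverse entry, then enable."""
        dropped = None
        if len(self.rev) >= self.rev_slots:
            dropped = self.rev.popleft()   # S49: entry at the head of the FIFO
        return True, dropped

    def write_forward(self, data):
        self.fwd.append(data)              # S46 to S47

    def write_reverse(self, data):
        self.rev.append(data)              # S51 to S52
```

A `False` result from `request_forward` corresponds to the wait at S 44; the caller retries later. `request_reverse` always enables, which is what keeps reverse traffic from stalling.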
  • FIG. 13 is a flowchart illustrating the processing performed by the transmission unit 66 of the output circuit 60 .
  • the transmission unit 66 repeatedly performs processes at S 61 to S 67 described below, during the operation of the neural network device 10 .
  • the transmission unit 66 monitors the output storage unit 64 to determine whether the forward direction data or the reverse direction data is present in the output storage unit 64. If neither the forward direction data nor the reverse direction data is present in the output storage unit 64 (NO at S 61), the transmission unit 66 repeats the process at S 61.
  • the transmission unit 66 transmits a request signal to the core 30 or the router 40 connected to the corresponding output circuit 60 via the communication channel 42 .
  • the core 30 or the router 40 transmits an enabling signal to the output circuit 60 that has transmitted the request signal.
  • the transmission unit 66 determines whether the enabling signal has been received from the core 30 or the router 40 connected to the corresponding output circuit 60. If the enabling signal has not been received (NO at S 63), at S 64, the transmission unit 66 holds the processing for a certain period of time. After having waited for the certain period of time, the transmission unit 66 returns the processing to S 63, and repeats processes at S 63 and S 64 until the enabling signal can be received.
  • the transmission unit 66 can return the processing to S 61 .
  • When having received the enabling signal (YES at S 63), at S 65, the transmission unit 66 reads out one piece of the forward direction data or one piece of the reverse direction data that is least recently written and has not been transmitted from the forward-direction data buffer 72 or the reverse-direction data buffer 74 of the output storage unit 64.
  • the transmission unit 66 can read out the forward direction data stored in the forward-direction data buffer 72 and the reverse direction data stored in the reverse-direction data buffer 74 alternately. Further, the transmission unit 66 can read out the forward direction data in preference to the reverse direction data in such a manner that after the forward direction data stored in the forward-direction data buffer 72 has been read out three times, the reverse direction data stored in the reverse-direction data buffer 74 is read out once.
  • the transmission unit 66 transmits the read-out forward direction data or reverse direction data to the core 30 or the router 40 connected to the corresponding output circuit 60 via the communication channel 42 .
  • the transmission unit 66 deletes the transmitted forward direction data or reverse direction data from the output storage unit 64 .
  • the transmission unit 66 returns the processing to S 61 and repeatedly performs the present flow.
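The read-out policy at S 65 (alternate reads, or three forward reads per reverse read) can be sketched as a small generator. The function name `drain` and the parameter `fwd_burst` are assumptions; the handshake with the downstream core or router (S 62 to S 64) is omitted to keep the sketch focused on the ordering.

```python
from collections import deque


def drain(fwd, rev, fwd_burst=3):
    """Yield entries, reading up to `fwd_burst` forward items per reverse item.

    `fwd` and `rev` are deques holding the oldest entry on the left.
    With fwd_burst=1 the two buffers are read alternately.
    """
    streak = 0   # forward items read since the last reverse item
    while fwd or rev:
        take_fwd = fwd and (streak < fwd_burst or not rev)
        if take_fwd:
            streak += 1
            yield fwd.popleft()    # S65 to S67 for forward data
        else:
            streak = 0
            yield rev.popleft()    # S65 to S67 for reverse data


forward_buffer = deque(["f0", "f1", "f2", "f3", "f4"])
reverse_buffer = deque(["r0", "r1"])
order = list(drain(forward_buffer, reverse_buffer))
```

When one buffer is empty the other is drained without waiting, so the ratio only applies while both buffers hold data.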
  • the neural network device 10 deletes the stagnating reverse direction data to perform transfer of the reverse direction data smoothly. Accordingly, the neural network device 10 can reduce stagnation of the traffic therein.
  • the neural network device 10 does not delete the forward direction data. Accordingly, the neural network device 10 can reliably perform the arithmetic operation on the input data provided from an external device. Further, although the learning accuracy decreases when the reverse direction data cannot be transferred, the neural network device 10 can perform at least the arithmetic operation reliably, and thus can limit the influence of the data deletion.
  • Because the neural network device 10 can eliminate stagnation of the reverse direction data in the router 40, the neural network device 10 can increase the memory capacity for storing the forward direction data in the router 40 and decrease the memory capacity for storing the reverse direction data. Accordingly, the neural network device 10 can realize efficient data transfer with a small memory capacity and at a reduced cost.
  • FIG. 14 is a diagram illustrating a configuration of the output storage unit 64 in the output circuit 60 according to a first modification.
  • the output storage unit 64 can have a configuration, for example, as illustrated in FIG. 14 .
  • differences in the configurations described above are mainly described. The same applies to a second modification onward.
  • the output storage unit 64 includes a data storage unit 82 and a memory controller 84 .
  • the data storage unit 82 is a random-access memory and stores therein the forward direction data and the reverse direction data.
  • the memory controller 84 executes access control with respect to the data storage unit 82 .
  • the memory controller 84 sets a first memory capacity for storing the forward direction data and a second memory capacity for storing the reverse direction data with respect to the data storage unit 82 .
  • the memory controller 84 sets a forward-direction data region having at least the first memory capacity for storing the forward direction data, and a reverse-direction data region having at least the second memory capacity for storing the reverse direction data, with respect to the data storage unit 82 .
  • the in-router reception unit 62 determines whether a total capacity of the forward direction data stored in the data storage unit 82 has reached the first memory capacity. If the total capacity of the forward direction data has not reached the first memory capacity, the in-router reception unit 62 returns an enabling signal to the input circuit 50 that has transmitted the first request signal.
  • If the total capacity of the forward direction data has reached the first memory capacity, the in-router reception unit 62 does not return the enabling signal, and after the total capacity of the forward direction data has fallen below the first memory capacity, the in-router reception unit 62 returns the enabling signal.
  • the in-router reception unit 62 determines whether a total capacity of the reverse direction data stored in the data storage unit 82 has reached the second memory capacity. If the total capacity of the reverse direction data has not reached the second memory capacity, the in-router reception unit 62 returns an enabling signal to the input circuit 50 that has transmitted the second request signal.
  • the in-router reception unit 62 deletes any one piece of the reverse direction data stored in the data storage unit 82 . After deletion of the reverse direction data, the in-router reception unit 62 returns the enabling signal.
  • the memory controller 84 manages a write sequence of the forward direction data and the reverse direction data stored in the data storage unit 82 .
  • the transmission unit 66 reads out one piece of the forward direction data or one piece of the reverse direction data least recently written according to the write sequence managed by the memory controller 84 and transmits the read-out data to the core 30 or the router 40 connected to the output circuit 60 .
  • the memory controller 84 can change the first memory capacity for storing the forward direction data and the second memory capacity for storing the reverse direction data according to a time variation of the total capacity of the forward direction data and a time variation of the total capacity of the reverse direction data stored in the data storage unit 82 .
  • the memory controller 84 calculates a ratio of a reception amount of the forward direction data to a reception amount of the reverse direction data at a regular time interval and changes the ratio of the first memory capacity to the second memory capacity according to a change of the ratio.
  • the output circuit 60 according to the first modification can store the forward direction data and the reverse direction data by using a random-access memory.
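The capacity handling of the first modification can be sketched in two small functions. The text states only that the ratio of the first memory capacity to the second memory capacity follows the ratio of received forward and reverse data; the proportional split, the `min_slots` floor, and the names `rebalance` and `admit` are assumptions chosen for illustration.

```python
def rebalance(total_slots, fwd_received, rev_received, min_slots=1):
    """Split `total_slots` in proportion to the traffic seen in the last interval.

    Returns (first_capacity, second_capacity) for forward and reverse data.
    """
    seen = fwd_received + rev_received
    if seen == 0:
        first = total_slots // 2                       # no traffic: split evenly
    else:
        first = round(total_slots * fwd_received / seen)
    # Keep at least `min_slots` for each direction (assumed safeguard).
    first = max(min_slots, min(total_slots - min_slots, first))
    return first, total_slots - first


def admit(kind, fwd_used, rev_used, first, second):
    """Decide how the in-router reception unit answers a request signal."""
    if kind == "forward":
        return "enable" if fwd_used < first else "wait"
    return "enable" if rev_used < second else "delete_then_enable"


first, second = rebalance(total_slots=16, fwd_received=30, rev_received=10)
```

Calling `rebalance` at a regular interval with the counts from that interval gives the time-varying capacities described above, while `admit` keeps the forward path lossless and lets the reverse path evict.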
  • FIG. 15 is a flowchart illustrating the processing performed by the in-router reception unit 62 of the output circuit 60 according to the second modification.
  • When the output storage unit 64 has a configuration including the randomly accessible data storage unit 82 and the memory controller 84 as illustrated in FIG. 14, the output circuit 60 can perform the processing as illustrated in FIG. 15.
  • the in-router reception unit 62 receives the first request signal or the second request signal from any one of the input circuits 50 . Subsequently, at S 72 , the in-router reception unit 62 determines whether there is a free space in the data storage unit 82 .
  • the in-router reception unit 62 advances the processing to S 73 .
  • the in-router reception unit 62 determines whether the received request signal is a reception request of the forward direction data (that is, the first request signal has been received), or a reception request of the reverse direction data (that is, the second request signal has been received).
  • the in-router reception unit 62 advances the processing to S 74 .
  • the in-router reception unit 62 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router reception unit 62 returns the processing to S 72, and repeats processes at S 72, S 73, and S 74 until a free space becomes available in the data storage unit 82.
  • the in-router reception unit 62 can transmit a disabling signal to the input circuit 50 that has transmitted the first request signal to finish the processing.
  • the in-router reception unit 62 advances the processing to S 75 .
  • the in-router reception unit 62 deletes the reverse direction data stored in the data storage unit 82 .
  • the in-router reception unit 62 deletes one piece of the reverse direction data least recently written from the data storage unit 82 . Accordingly, the in-router reception unit 62 can ensure a free space in the data storage unit 82 . If there is no reverse direction data in the data storage unit 82 , the in-router reception unit 62 proceeds to the next process without performing any processing.
  • the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the first request signal or the second request signal. Upon reception of the enabling signal, the input circuit 50 that has transmitted the first request signal or the second request signal transmits the forward direction data or the reverse direction data to the corresponding output circuit 60 .
  • the in-router reception unit 62 receives the forward direction data or the reverse direction data from the input circuit 50 that has transmitted the first request signal or the second request signal.
  • the in-router reception unit 62 writes the received forward direction data or reverse direction data in the data storage unit 82. If there is no reverse direction data in the data storage unit 82 and a free space cannot be ensured at S 75, the in-router reception unit 62 discards the reverse direction data received at S 77 without writing the data in the data storage unit 82.
  • the output circuit 60 according to the second modification can cause the forward direction data or the reverse direction data to be stored in the data storage unit 82 without distinction. Further, if there is no free space in the data storage unit 82 , and when the reverse direction data is received, the output circuit 60 according to the second modification can delete the stagnating reverse direction data to perform transfer of the reverse direction data smoothly.
  • the input circuit 50 determines whether there is a free space in the input storage unit 54. If there is no free space in the input storage unit 54, the input circuit 50 holds the processing for a certain period of time until a free space becomes available in the input storage unit 54. Therefore, in the second modification, upon reception of a request signal from the core 30 or the router 40, when there is no free space in the input storage unit 54, the input circuit 50 can transmit a signal instructing deletion of the reverse direction data to the output circuit 60 in the router 40.
  • a free space becomes available in the data storage unit 82 of the output circuit 60 , and the input circuit 50 can transmit data to the output circuit 60 . After the data is transmitted to the output circuit 60 , the input circuit 50 can generate a free space in the input storage unit 54 .
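The FIG. 15 flow of the second modification, in which forward and reverse data share one storage, can be sketched as follows. The class `SharedStorage`, the string tags `"forward"` and `"reverse"`, and the return values are hypothetical; the write order kept by the memory controller 84 is modelled as a plain list with the oldest entry first.

```python
class SharedStorage:
    """Hypothetical model of the data storage unit 82 in the second modification."""

    def __init__(self, slots):
        self.slots = slots
        self.entries = []   # (kind, payload) in write order, oldest first

    def _drop_oldest_reverse(self):
        for i, (kind, _) in enumerate(self.entries):
            if kind == "reverse":
                del self.entries[i]
                return True
        return False

    def on_request(self, kind):
        """S72 to S76: return 'enable' or 'wait' for a request signal."""
        if len(self.entries) < self.slots:   # S72: free space available
            return "enable"
        if kind == "forward":                # S73: forward request
            return "wait"                    # S74: hold and retry later
        self._drop_oldest_reverse()          # S75: no effect if none is stored
        return "enable"                      # S76

    def on_data(self, kind, payload):
        """S77 to S78: write when there is room, otherwise discard the data."""
        if len(self.entries) < self.slots:
            self.entries.append((kind, payload))
            return True
        return False
```

The `False` result from `on_data` covers the case where the storage holds only forward data, so no space could be freed and the incoming reverse data is discarded.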
  • FIG. 16 is a flowchart illustrating the processing performed by the in-router reception unit 62 of the output circuit 60 according to a third modification.
  • When the output storage unit 64 has a configuration including, for example, the randomly accessible data storage unit 82 and the memory controller 84 as illustrated in FIG. 14, the output circuit 60 can perform the processing as illustrated in FIG. 16.
  • the in-router reception unit 62 receives the first request signal or the second request signal from any one of the input circuits 50 . Subsequently, at S 82 , the in-router reception unit 62 determines whether there is a free space in the data storage unit 82 .
  • the in-router reception unit 62 advances the processing to S 83 .
  • the in-router reception unit 62 determines whether the reverse direction data is present in the data storage unit 82 .
  • the in-router reception unit 62 advances the processing to S 84 .
  • the in-router reception unit 62 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router reception unit 62 returns the processing to S 82 , and repeats processes at S 82 , S 83 , and S 84 until a free space becomes available in the data storage unit 82 .
  • the in-router reception unit 62 can transmit a disabling signal to the input circuit 50 that has transmitted the first request signal or the second request signal to finish the processing.
  • the in-router reception unit 62 advances the processing to S 85 .
  • the in-router reception unit 62 deletes the reverse direction data stored in the data storage unit 82 .
  • the in-router reception unit 62 deletes one piece of the reverse direction data least recently written from the data storage unit 82 . Accordingly, the in-router reception unit 62 can cause the data storage unit 82 to have a free space.
  • the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the first request signal or the second request signal.
  • the in-router reception unit 62 receives the forward direction data or the reverse direction data from the input circuit 50 that has transmitted the first request signal or the second request signal.
  • the in-router reception unit 62 writes the received forward direction data or reverse direction data in the data storage unit 82. After the process at S 88, the in-router reception unit 62 finishes the present flow.
  • the output circuit 60 according to the third modification can cause the data storage unit 82 to store therein the forward direction data or the reverse direction data without distinction. Further, if there is no free space in the data storage unit 82 , the output circuit 60 according to the third modification can delete the stagnating reverse direction data to perform transfer of the forward direction data and the reverse direction data smoothly.
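The FIG. 16 decision of the third modification differs from FIG. 15 in that a full storage evicts reverse data for any request, forward or reverse, and waits only when no reverse data is stored. A minimal sketch, with the function name and the entry representation assumed for illustration:

```python
def on_request_third_modification(entries, slots):
    """Return 'enable' or 'wait' for a first or second request signal.

    `entries` is a list of (kind, payload) tuples in write order, oldest first.
    The list is modified in place when a reverse entry is deleted.
    """
    if len(entries) < slots:                   # S82: free space available
        return "enable"                        # S86
    for i, (kind, _) in enumerate(entries):    # S83: is reverse data present?
        if kind == "reverse":
            del entries[i]                     # S85: delete the oldest reverse entry
            return "enable"                    # S86
    return "wait"                              # S84: only forward data is stored


stored = [("forward", "f0"), ("reverse", "r0"), ("reverse", "r1")]
answer = on_request_third_modification(stored, slots=3)
```

Because the request type is not examined, a forward request can also free space by removing stale reverse data, which is the behaviour summarized above.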
  • the input storage unit 54 provided in the input circuit 50 is a FIFO buffer or a shift register.
  • the input storage unit 54 provided in the input circuit 50 can be a random-access memory.
  • the reception unit 52 of the input circuit 50 can perform the same processing as that of the in-router reception unit 62 of the output circuit 60 .
  • the input circuit 50 deletes the reverse direction data stored in the input storage unit 54 .
  • the input circuit 50 deletes one piece of the reverse direction data least recently written.
  • the input circuit 50 can perform the same processing as that of the in-router reception unit 62 described in the first modification, the second modification, and the third modification with respect to the input storage unit 54 .
  • When the input storage unit 54 provided in the input circuit 50 is a FIFO buffer or a shift register, and the input circuit 50 has received a request signal while there is no free space in the input storage unit 54 for storing therein the reverse direction data, the input circuit 50 deletes one piece of the reverse direction data stored at the head of the FIFO buffer. However, if the data stored at the head of the FIFO buffer is the forward direction data, the input circuit 50 performs the processing described in the embodiment.

Abstract

According to an embodiment, a neural network device includes a plurality of cores, and a plurality of routers. Each of the plurality of routers includes an input circuit and an output circuit. Each of the plurality of cores transmits at least one of forward direction data propagating in the neural network in a forward direction and reverse direction data propagating in the neural network in a reverse direction. The input circuit receives the forward direction data and the reverse direction data from any one of the plurality of cores and the plurality of routers. The output circuit or the input circuit selectively deletes the reverse direction data stored based on a request signal for requesting reception of data.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2017-222259, filed on Nov. 17, 2017; the entire contents of which are incorporated herein by reference.
FIELD
Embodiments described herein relate generally to a neural network.
BACKGROUND
In recent years, a technique that realizes a brain processor by using a neural network as hardware has been proposed. In the brain processor, a learning machine provides error data to the neural network to optimize a weight coefficient or the like set to the neural network.
A conventional neural network performs learning processing in a state where normal arithmetic processing is stopped, to optimize the weight coefficient. Therefore, in the conventional neural network, an external processor can perform the learning processing.
However, when the brain processor is to be realized, the neural network has to perform the arithmetic processing and the learning processing in parallel. Therefore, in this case, in the neural network, processing to propagate arithmetic-processing target data received from an external device in a forward direction and processing to propagate error data for learning in a reverse direction need to be performed in parallel.
However, when the processing to propagate data in the forward direction and the processing to propagate data in the reverse direction are performed in parallel with respect to the neural network, traffic in the neural network stagnates, thereby causing an increase of cost and an increase of a processing time.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram illustrating a configuration of a neural network device according to an embodiment;
FIG. 2 is a diagram illustrating contents of forward direction processing;
FIG. 3 is a diagram illustrating contents of reverse direction processing;
FIG. 4 is a diagram illustrating a configuration of a data processing unit;
FIG. 5 is a diagram illustrating a correspondence relation between constituent elements included in the neural network and cores;
FIG. 6 is a diagram illustrating data to be transmitted and received between a plurality of cores and a plurality of routers;
FIG. 7 is a diagram illustrating an example of a configuration of data;
FIG. 8 is a diagram illustrating a configuration of the router;
FIG. 9 is a diagram illustrating a configuration of an input circuit and an output circuit;
FIG. 10 is a flowchart illustrating processing performed by a reception unit of the input circuit;
FIG. 11 is a flowchart illustrating processing performed by an in-router transmission unit of the input circuit;
FIG. 12 is a flowchart illustrating processing performed by an in-router reception unit of the output circuit;
FIG. 13 is a flowchart illustrating processing performed by a transmission unit of the output circuit;
FIG. 14 is a diagram illustrating a configuration of an output storage unit according to a first modification;
FIG. 15 is a flowchart illustrating processing performed by an in-router reception unit according to a second modification; and
FIG. 16 is a flowchart illustrating processing performed by an in-router reception unit according to a third modification.
DETAILED DESCRIPTION
According to an embodiment, a neural network device includes a plurality of cores, and a plurality of routers. Each of the plurality of cores performs processing of a part of constituent elements in a neural network. The plurality of routers transfer data output from each of the plurality of cores to any one of the plurality of cores so that processing is performed according to a configuration of the neural network. Each of the plurality of routers includes an input circuit and an output circuit. Each of the plurality of cores transmits at least one of forward direction data propagating in the neural network in a forward direction and reverse direction data propagating in the neural network in a reverse direction. The input circuit receives the forward direction data and the reverse direction data from any one of the plurality of cores and the plurality of routers. The output circuit or the input circuit selectively deletes the reverse direction data stored based on a request signal for requesting reception of data.
A neural network device 10 according to an embodiment will be described below with reference to the drawings. The neural network device 10 according to the embodiment can reduce traffic congestion therein, while performing normal data processing and learning processing in the neural network in parallel.
FIG. 1 is a diagram illustrating a configuration of the neural network device 10 according to the embodiment. The neural network device 10 includes a data processing unit 20, a communication unit 22, a learning unit 24, and a setting unit 26.
The data processing unit 20, the communication unit 22, the learning unit 24, and the setting unit 26 can be installed in one semiconductor device, can be installed in a plurality of semiconductor devices provided on one substrate, or can be installed in a plurality of semiconductor devices provided on a plurality of substrates. The learning unit 24 and the setting unit 26 can be realized by the same processor.
The neural network device 10 receives input data from an external device. The neural network device 10 performs arithmetic processing using a neural network with respect to the received input data. The neural network device 10 transmits output data, which is a result of the arithmetic processing using the neural network, to the external device.
The data processing unit 20 performs normal arithmetic processing based on the neural network. The data processing unit 20 performs, for example, various types of information processing such as pattern recognition processing, data analysis processing, and control processing as the normal arithmetic processing based on the neural network.
Further, the data processing unit 20 performs the learning processing in parallel with the normal arithmetic processing. Through the learning processing, the data processing unit 20 changes a plurality of coefficients (weights) included in the neural network so that the normal arithmetic processing is performed more appropriately.
The communication unit 22 transmits and receives data to and from the external device. Specifically, in the normal arithmetic processing, the communication unit 22 receives input data as the arithmetic-processing target data from the external device. The communication unit 22 also transmits output data as a result of the arithmetic processing to the external device.
The learning unit 24 acquires output data output from the data processing unit 20 in the normal arithmetic processing. In the learning processing, the learning unit 24 calculates error data representing an error in the output data and provides the calculated error data to the data processing unit 20.
Further, the learning unit 24 changes the plurality of coefficients (weights) included in the neural network based on information acquired by the data processing unit 20 propagating the error data to the plurality of layers in the reverse direction. For example, the learning unit 24 calculates a gradient of the error with respect to each of the coefficients included in the neural network. The learning unit 24 then changes the coefficients, for example, in a direction of setting the gradient of the error to zero.
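The coefficient update can be illustrated with a plain gradient-descent step. This is only a sketch under stated assumptions: the embodiment says that the coefficients are changed in a direction that sets the gradient of the error to zero, but it does not name an update rule, so the learning rate and the helper `update_coefficients` are hypothetical.

```python
def update_coefficients(weights, gradients, learning_rate=0.1):
    """Return new weights moved against the error gradient (hypothetical helper)."""
    if len(weights) != len(gradients):
        raise ValueError("weights and gradients must have the same length")
    return [w - learning_rate * g for w, g in zip(weights, gradients)]


# Toy error E(w) = (w - 3)^2, whose gradient 2 * (w - 3) is zero at w = 3.
w = [0.0]
for _ in range(100):
    grad = [2.0 * (w[0] - 3.0)]
    w = update_coefficients(w, grad)
```

Repeating the step moves the single coefficient toward the point where the gradient vanishes.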
The setting unit 26 sets the changed coefficients to the data processing unit 20, when the learning unit 24 changes the coefficients included in the neural network.
FIG. 2 is a diagram illustrating contents of the normal arithmetic processing (forward direction processing) in the neural network.
The neural network includes a plurality of layers. Each of the layers performs a predetermined arithmetic operation and processing with respect to the received data. Each of the layers included in the neural network includes a plurality of nodes. The number of nodes included in one layer may be different for each layer.
An activation function is set to each node. The activation function may be different for each layer. Further, in the same layer, the activation function may be different for each node. A coefficient (weight) is set to a link connecting between the respective nodes. When propagating data from a node to the next node, the neural network multiplies the data by the coefficient set to the link. These coefficients are appropriately changed by the learning processing.
The data processing unit 20 performs the forward direction processing, in which an arithmetic operation is performed while propagating data in the forward direction to the layers in the neural network, in the normal arithmetic processing in the neural network. For example, in the forward direction processing, the data processing unit 20 provides input data to input layers. Subsequently, in the forward direction processing, the data processing unit 20 propagates data output from each layer to a layer immediately thereafter in the forward direction. Subsequently, in the forward direction processing, the data processing unit 20 transmits the data output from an output layer to the external device as output data.
Here, in the present embodiment, in the normal arithmetic processing in the neural network, data propagating in the plurality of layers in the forward direction is referred to as “forward direction data”.
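The forward direction processing can be sketched as follows. The representation is an assumption made for illustration: each layer is a list of nodes, each node is a pair of link coefficients and an activation function, and the names `forward` and `network` do not appear in the embodiment.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, inputs):
    """Propagate `inputs` through `layers` in the forward direction.

    Each layer is a list of nodes; each node is (weights, activation).
    """
    data = list(inputs)
    for layer in layers:
        # Multiply by the coefficient of each link, add, then apply the
        # activation function set to the node.
        data = [activation(sum(w * x for w, x in zip(weights, data)))
                for weights, activation in layer]
    return data


network = [
    [([0.5, -0.5], sigmoid), ([1.0, 1.0], sigmoid)],  # hidden layer, two nodes
    [([1.0, -1.0], lambda s: s)],                      # output layer, identity
]
output = forward(network, [1.0, 1.0])
```

The values passed from one layer to the next correspond to the forward direction data.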
FIG. 3 is a diagram illustrating contents of the learning processing (reverse direction processing) in the neural network. An error function is set to each node. The error function is a derivative function of the activation function being set to the node. That is, the error function is a differential of the activation function being set to the node.
When the forward direction processing has finished, the learning unit 24 calculates error data representing an error with respect to the output data output in the forward direction processing. Subsequently, in the reverse direction processing, the data processing unit 20 provides the error data generated by the learning unit 24 to the output layer. In the reverse direction processing, the data processing unit 20 propagates the plurality of pieces of data output from the respective layers to the layer immediately before in the reverse direction.
Here, in the present embodiment, in the learning processing in the neural network, data propagating in the plurality of layers in the reverse direction is referred to as “reverse direction data”.
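The reverse direction processing can be sketched with the standard backpropagation rule, in which the error arriving at a layer is weighted by the link coefficients and multiplied by the derivative of the activation function (the error function above). The embodiment does not give this formula explicitly, so the rule, the sigmoid activation, and the name `propagate_reverse` are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    # Error function of a node whose activation function is the sigmoid.
    s = sigmoid(x)
    return s * (1.0 - s)


def propagate_reverse(weights, pre_activations, deltas):
    """Compute reverse direction data for the layer immediately before.

    weights[i][j]      coefficient of the link from previous node j to node i
    pre_activations[j] summed input of previous node j in the forward pass
    deltas[i]          reverse direction data arriving at node i
    """
    return [
        sigmoid_derivative(pre_activations[j])
        * sum(weights[i][j] * deltas[i] for i in range(len(deltas)))
        for j in range(len(pre_activations))
    ]


previous_deltas = propagate_reverse([[1.0, -1.0]], [0.0, 0.0], [0.4])
```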
FIG. 4 is a diagram illustrating a configuration of the data processing unit 20. The data processing unit 20 includes a plurality of cores 30, a plurality of routers 40, and a communication channel 42 (42-1, 42-2).
Each of the cores 30 performs an arithmetic operation and processing of a part of constituent elements in the neural network. Each of the cores 30 can be a processor, a dedicated hardware circuit, a digital circuit, or an analog circuit. Further, each of the cores 30 includes a storage unit, and can store the coefficients included in the neural network in the storage unit.
The routers 40 transfer data output from each of the cores 30 to any one of the cores 30 via the communication channel 42, so that an arithmetic operation and processing are performed according to the configuration of the neural network.
For example, each of the routers 40 is arranged at a branch point of the communication channel 42. Each of the routers 40 is directly connected with a plurality of other routers 40 via the communication channel 42. Each of the routers 40 transmits and receives data to and from the other routers 40 directly connected via the communication channel 42.
Further, each of the routers 40 is connected with one or a plurality of cores 30 and can transmit and receive data to and from the connected cores 30. In the present embodiment, the cores 30 are provided in one-to-one association with the routers 40 and transmit and receive data to and from the routers 40 provided in association therewith.
Each of the routers 40 transfers data received from the router 40 or the core 30 being a source connected with the corresponding router 40 to another router 40 or the core 30 connected with the corresponding router 40 being a destination.
For example, the routers 40 are arranged in a matrix in a first array direction and in a second array direction. For example, the second array direction is a direction orthogonal to the first array direction. For example, the communication channel 42 is a cross-bar network including a plurality of first communication channels 42-1 arranged in the first array direction, and a plurality of second communication channels 42-2 arranged in the second array direction orthogonal to the first array direction. The routers 40 are provided at a point of intersection of the first communication channels 42-1 and the second communication channels 42-2 in the cross-bar network. Accordingly, the routers 40 can transfer data output from any core 30 to any of the cores 30.
FIG. 5 is a diagram illustrating a correspondence relation between the constituent elements included in the neural network and the cores 30 that perform processing in the constituent elements.
Any of the constituent elements included in the neural network is allocated beforehand to each of the cores 30. Each of the cores 30 performs an arithmetic operation or processing of the constituent element allocated thereto beforehand, among the constituent elements included in the neural network.
The constituent elements included in the neural network are, for example, an arithmetic operation of the activation function and an arithmetic operation of the error function in the node, multiplication of a coefficient set to the link, addition of data multiplied by the coefficient, input of data from the external device, output of data to the external device, acquisition of the error data, output of gradient data, and the like. The constituent element is respectively allocated to each of the cores 30 so that all the constituent elements included in the neural network are performed by any of the cores 30.
The processing to be performed in one core 30 can be, for example, processing to be performed in one node. For example, a certain core 30 performs multiplication of a coefficient set to the link, addition of a plurality of pieces of data received from a layer on a former stage, an arithmetic operation of the activation function, or an arithmetic operation of the error function, in one node in a certain layer.
Further, the arithmetic operation and processing to be performed in one core 30 can be an arithmetic operation of a part of one node. For example, a certain core 30 can perform an arithmetic operation of the activation function in one node, and another core 30 can perform multiplication and addition of coefficients in the node. Further, the arithmetic operation and processing to be performed in one core 30 can be all the processing in a plurality of nodes included in one layer.
Thus, the data processing unit 20 can perform processing of the constituent elements included in the neural network in a distributed manner to the plurality of cores 30.
FIG. 6 is a diagram illustrating data to be transmitted and received between the cores 30 and the routers 40.
Each of the cores 30 transmits at least one of the forward direction data propagating in the neural network in the forward direction and the reverse direction data propagating in the neural network in the reverse direction to the router 40 connected to the corresponding core 30. Further, each of the cores 30 receives at least one of the forward direction data and the reverse direction data from the router 40 connected to the corresponding core 30.
Further, each of the routers 40 receives the forward direction data and the reverse direction data from the core 30 connected to the corresponding router 40 or from another router 40. Further, each of the routers 40 transmits the received forward direction data and reverse direction data to the core 30 connected to the corresponding router 40 or to another router 40.
Here, when transmitting the forward direction data or the reverse direction data, the core 30 transmits a request signal for requesting reception of the forward direction data or the reverse direction data to the router 40 connected to the corresponding core 30 prior to the transmission. Further, when transmitting the forward direction data or the reverse direction data, the router 40 transmits a request signal to the core 30 connected to the corresponding router 40 or another router 40 being a destination, prior to the transmission.
Upon reception of the request signal and when reception is possible, the core 30 transmits an enabling signal to the router 40 that has transmitted the request signal. Upon reception of the request signal and when reception is possible, the router 40 transmits an enabling signal to the core 30 or another router 40 that has transmitted the request signal.
When having received the enabling signal, the core 30 transmits the forward direction data or the reverse direction data to the router 40 connected to the corresponding core 30. Further, when having received the enabling signal, the router 40 transmits the forward direction data or the reverse direction data to another router 40 or the core 30 connected to the corresponding router 40, which is a destination.
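The request and enabling signal exchange can be modeled as a two-step handshake. In this sketch the class `Receiver`, the function `send`, and the capacity value are hypothetical, and a Boolean return value stands in for the enabling signal.

```python
class Receiver:
    """Hypothetical stand-in for a core 30 or router 40 on the receiving side."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []

    def request(self):
        # Answer a request signal; True stands for the enabling signal.
        return len(self.buffer) < self.capacity

    def receive(self, data):
        self.buffer.append(data)


def send(receiver, data):
    """Transmit `data` only after the receiver has returned an enabling signal."""
    if receiver.request():
        receiver.receive(data)
        return True
    return False


core = Receiver(capacity=1)
first = send(core, "forward-1")   # reception is possible
second = send(core, "forward-2")  # no enabling signal, nothing is transmitted
```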
FIG. 7 is a diagram illustrating an example of a configuration of data. The forward direction data and the reverse direction data include, for example, entity data and a header as illustrated in FIG. 7. The entity data is a target of an arithmetic operation and processing in the neural network. The header includes information required for transferring a packet to an intended core 30, information required for performing an arithmetic operation and processing with respect to the entity data, and the like.
For example, the header includes an ID, a data type, a previous processing address, and a subsequent processing address. The ID is information for identifying input data, which is a base of the corresponding entity data.
The data type is information for identifying whether the entity data is the forward direction data propagating in the forward direction (data propagating in the normal arithmetic processing) or the reverse direction data propagating in the reverse direction (data propagating in the learning processing).
The previous processing address is an address for identifying the core 30 that has output the corresponding data. The previous processing address can be information for identifying a layer and a node in which the corresponding data is generated in the neural network.
The subsequent processing address is an address for identifying the core 30 that next performs an arithmetic operation or processing on the corresponding data in the neural network. The subsequent processing address can be information for identifying a constituent element (a layer or a node) that performs an arithmetic operation or processing on the corresponding data.
The configuration of the header is not limited to the configuration described above, and the header can have another configuration so long as the router 40 can transfer the entity data to a proper core 30 so that an arithmetic operation and processing can be performed with respect to the entity data according to the configuration of the neural network.
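The header and entity data of FIG. 7 can be represented as a small record type. The field names, the use of (layer, node) tuples as addresses, and the class names below are assumptions chosen for illustration; the embodiment only lists the four kinds of header information.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    FORWARD = "forward"  # data propagating in the normal arithmetic processing
    REVERSE = "reverse"  # data propagating in the learning processing


@dataclass(frozen=True)
class Header:
    data_id: int               # identifies the input data the entity data is based on
    data_type: DataType
    previous_address: tuple    # e.g. (layer, node) that output the data
    subsequent_address: tuple  # e.g. (layer, node) that processes the data next


@dataclass(frozen=True)
class Packet:
    header: Header
    entity: float  # target of the arithmetic operation


packet = Packet(Header(7, DataType.REVERSE, (2, 0), (1, 3)), entity=-0.125)
```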
FIG. 8 is a diagram illustrating a configuration of the router 40. The router 40 includes one or more input circuits 50 and one or more output circuits 60.
Each of the one or more input circuits 50 receives the forward direction data and the reverse direction data from any one of the cores 30 or the routers 40. For example, each of the one or more input circuits 50 is connected to any one of the cores 30 or the routers 40 set in advance via the communication channel 42, to receive the forward direction data and the reverse direction data from the connected one core 30 or one router 40.
Each of the one or more output circuits 60 transmits the forward direction data and the reverse direction data to any one of the cores 30 or the routers 40. For example, each of the one or more output circuits 60 is connected to any one of the cores 30 or the routers 40 set in advance via the communication channel 42, to transmit the forward direction data and the reverse direction data to the connected one core 30 or one router 40.
The input circuit 50 is connected to all the output circuits 60 provided in the corresponding router 40. However, it is allowable that the input circuit 50 is not connected to the output circuit 60 connected to the same core 30 or the same router 40 connected to the corresponding input circuit 50. That is, it is allowable that the input circuit 50 is not connected to the output circuit 60 connected to the same core 30 or the same router 40 as that of the corresponding input circuit 50.
For example, the router 40 includes a first set of the input circuit 50 and the output circuit 60, a second set of the input circuit 50 and the output circuit 60, a third set of the input circuit 50 and the output circuit 60, a fourth set of the input circuit 50 and the output circuit 60, and a fifth set of the input circuit 50 and the output circuit 60. The first set and the second set are connected to other routers 40 adjacent thereto in the first array direction in a matrix. The third set and the fourth set are connected to other routers 40 adjacent thereto in the second array direction in a matrix. The fifth set is connected to the core 30 provided in association with the corresponding router 40.
In a connection example in FIG. 8, the input circuit 50 is connected to each of the plurality of output circuits 60 by a signal line different from each other. However, the input circuit 50 can be connected to each of the plurality of output circuits 60 by a common bus. That is, the router 40 can have a configuration in which each of the one or more input circuits 50 and each of the one or more output circuits 60 are connected to the same bus. In this case, the input circuit 50 transmits data added with an identifier of the output circuit 60 as a destination to the bus. The output circuit 60 selects and receives the data added with the identifier of the output circuit 60 from the bus. Accordingly, the input circuit 50 can transmit the forward direction data and the reverse direction data to one specific output circuit 60 among the one or more output circuits 60.
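The common bus variant can be sketched as follows. The class `Bus`, its methods, and the identifier strings are hypothetical; the sketch only shows that data tagged with an identifier is taken by the one output circuit whose identifier matches.

```python
class Bus:
    """Hypothetical shared bus inside one router 40."""

    def __init__(self):
        self.outputs = {}  # identifier -> list acting as that output circuit's inbox

    def attach(self, identifier):
        self.outputs[identifier] = []
        return self.outputs[identifier]

    def transmit(self, identifier, data):
        # Every attached output circuit sees the tagged data, but only the one
        # whose identifier matches takes it.
        for output_id, inbox in self.outputs.items():
            if output_id == identifier:
                inbox.append(data)


bus = Bus()
east = bus.attach("east")
core = bus.attach("core")
bus.transmit("core", "forward-1")
```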
FIG. 9 is a diagram illustrating a configuration of the input circuit 50 and the output circuit 60. In the example in FIG. 9, the input circuit 50 and the output circuit 60 are connected in one-to-one association. However, the input circuit 50 is connected to one or the plurality of output circuits 60 in the corresponding router 40. Further, the output circuit 60 is connected to one or the plurality of input circuits 50 in the corresponding router 40.
The input circuit 50 includes a reception unit 52, an input storage unit 54, and an in-router transmission unit 56. The reception unit 52 receives a request signal, the forward direction data, and the reverse direction data from the core 30 or the router 40 connected to the corresponding input circuit 50 via the communication channel 42. Details of the processing performed by the reception unit 52 are described later with reference to FIG. 10.
The input storage unit 54 stores therein the forward direction data and the reverse direction data received by the reception unit 52. The input storage unit 54 is a first-in first-out buffer (FIFO buffer). The input storage unit 54 can be a shift register that shifts data for each data size of the forward direction data and the reverse direction data.
The in-router transmission unit 56 transmits a first request signal, a second request signal, the forward direction data, and the reverse direction data to each of one or the plurality of output circuits 60 in the router 40. Details of the processing performed by the in-router transmission unit 56 are described later with reference to FIG. 11.
The output circuit 60 includes an in-router reception unit 62, an output storage unit 64, and a transmission unit 66. The in-router reception unit 62 receives the first request signal, the second request signal, the forward direction data, and the reverse direction data from each of one or the plurality of input circuits 50 in the router 40. Details of the processing performed by the in-router reception unit 62 are described later with reference to FIG. 12.
The output storage unit 64 includes a forward-direction data buffer 72 and a reverse-direction data buffer 74. The forward-direction data buffer 72 stores therein the forward direction data received by the in-router reception unit 62. The forward-direction data buffer 72 is a first-in first-out buffer (FIFO buffer). Further, the forward-direction data buffer 72 can be, for example, a shift register that shifts data for each data size of the forward direction data.
The reverse-direction data buffer 74 stores therein the reverse direction data received by the in-router reception unit 62. The reverse-direction data buffer 74 is a first-in first-out buffer (FIFO buffer). Further, the reverse-direction data buffer 74 can be, for example, a shift register that shifts data for each data size of the reverse direction data.
The transmission unit 66 transmits a request signal, the forward direction data, and the reverse direction data to the core 30 or the router 40 connected to the corresponding output circuit 60 via the communication channel 42. Details of the processing performed by the transmission unit 66 are described later with reference to FIG. 13.
FIG. 10 is a flowchart illustrating the processing performed by the reception unit 52 of the input circuit 50. When a request signal is transmitted to the corresponding input circuit 50 from the core 30 or the router 40 connected thereto via the communication channel 42, the reception unit 52 performs processes at S11 to S16 described below.
First, at S11, the reception unit 52 receives the request signal from the core 30 or the router 40 connected to the corresponding input circuit 50 via the communication channel 42. Subsequently, at S12, the reception unit 52 determines whether there is a free space in the input storage unit 54.
If there is no free space in the input storage unit 54 (NO at S12), at S13, the reception unit 52 holds the processing for a certain period of time. After having waited for the certain period of time, the reception unit 52 returns the processing to S12, and repeats processes at S12 and S13 until a free space becomes available in the input storage unit 54. If a free space does not become available in the input storage unit 54 even if the reception unit 52 has waited for a certain number of times or for a predetermined time or longer, the reception unit 52 can transmit a disabling signal to the core 30 or the router 40 that has transmitted the request signal to finish the processing.
If there is a free space in the input storage unit 54 (YES at S12), at S14, the reception unit 52 transmits an enabling signal to the core 30 or the router 40 that has transmitted the request signal. Upon reception of the enabling signal, the core 30 or the router 40 that has transmitted the request signal transmits the forward direction data or the reverse direction data to the corresponding input circuit 50.
Subsequently, at S15, the reception unit 52 receives the forward direction data or the reverse direction data from the core 30 or the router 40 that has transmitted the request signal. At S16, the reception unit 52 writes the received forward direction data or reverse direction data in the input storage unit 54. After the process at S16, the reception unit 52 finishes the present flow.
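The flow of S11 to S16 can be sketched as a single function. The retry limit `max_waits`, the `wait` callback, and the string return values are assumptions; the embodiment only says that the reception unit 52 waits for a certain period and can transmit a disabling signal after waiting a certain number of times.

```python
from collections import deque


def handle_request(input_storage, capacity, fetch_data, max_waits=3, wait=lambda: None):
    """Handle one request signal (S11) for a FIFO `input_storage`.

    Returns "enabled" after the data is stored, or "disabled" on timeout.
    """
    waits = 0
    while len(input_storage) >= capacity:  # S12: no free space
        if waits >= max_waits:
            return "disabled"              # optional disabling signal
        wait()                             # S13: hold for a certain period
        waits += 1
    data = fetch_data()                    # S14 enabling signal, S15 reception
    input_storage.append(data)             # S16: write to the input storage unit
    return "enabled"


storage = deque(["old"])
# Nothing drains the buffer while waiting, so the request times out.
result_full = handle_request(storage, 1, lambda: "new")
# Here the wait callback simulates another unit reading out the oldest data.
result_drained = handle_request(storage, 1, lambda: "new", wait=storage.popleft)
```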
FIG. 11 is a flowchart illustrating the processing performed by the in-router transmission unit 56 of the input circuit 50. The in-router transmission unit 56 repeatedly performs processes at S21 to S28 described below during the operation of the neural network device 10.
At S21, the in-router transmission unit 56 monitors the input storage unit 54 to determine whether the forward direction data or the reverse direction data is present in the input storage unit 54. If there is no forward direction data or reverse direction data in the input storage unit 54 (NO at S21), the in-router transmission unit 56 repeats the process at S21.
If the forward direction data or the reverse direction data is present in the input storage unit 54 (YES at S21), at S22, the in-router transmission unit 56 reads out one piece of the forward direction data or one piece of the reverse direction data that is the least recently written data and has not been transmitted yet from the input storage unit 54.
Subsequently, at S23, the in-router transmission unit 56 refers to a header of the read-out forward direction data or reverse direction data to decide one destination from the cores 30 or the routers 40 connected to the corresponding router 40. For example, the in-router transmission unit 56 analyzes the header to detect an address (for example, the subsequent processing address) of the core 30 that performs the next arithmetic operation and processing with respect to the read-out forward direction data or reverse direction data. After having detected the subsequent processing address, the in-router transmission unit 56 finds one route through which data can be transferred appropriately (for example, in the shortest time or over the shortest distance) from the corresponding router 40 to the detected core 30. The in-router transmission unit 56 decides, as the destination, the core 30 or the router 40 that lies on the found route from among the cores 30 or the routers 40 connected to the corresponding router 40.
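The destination decision at S23 can be illustrated with dimension-order (XY) routing on a matrix of routers. The embodiment does not specify a routing algorithm, so XY routing is swapped in here as one common way to obtain a shortest-distance route; the (x, y) coordinates and the names `next_hop` and `route` are hypothetical.

```python
def next_hop(current, destination):
    """Return the adjacent router to forward to, or "core" at the destination."""
    (cx, cy), (dx, dy) = current, destination
    if cx != dx:  # move along the first array direction first
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:  # then along the second array direction
        return (cx, cy + (1 if dy > cy else -1))
    return "core"  # deliver to the core 30 associated with this router


def route(source, destination):
    """List the routers visited from `source` to `destination`."""
    path = [source]
    while (hop := next_hop(path[-1], destination)) != "core":
        path.append(hop)
    return path


path = route((0, 0), (2, 1))
```

Each router only needs `next_hop` to choose among its connected routers and its own core, which matches the per-router decision described above.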
Subsequently, at S24, when having read out the forward direction data, the in-router transmission unit 56 transmits the first request signal for requesting reception of the forward direction data to the output circuit 60 connected to the core 30 or the router 40 decided as the destination. Further, when having read out the reverse direction data, the in-router transmission unit 56 transmits the second request signal for requesting reception of the reverse direction data to the output circuit 60 connected to the core 30 or the router 40 decided as the destination.
Upon reception of the first request signal, if the output circuit 60 can receive the forward direction data, the output circuit 60 transmits an enabling signal to the source of the first request signal. Further, upon reception of the second request signal, if the output circuit 60 can receive the reverse direction data, the output circuit 60 transmits an enabling signal to the source of the second request signal.
Subsequently, at S25, the in-router transmission unit 56 determines whether the enabling signal has been received from the output circuit 60 connected to the core 30 or the router 40 decided as the destination. If the enabling signal has not been received (NO at S25), at S26, the in-router transmission unit 56 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router transmission unit 56 returns the processing to S25, and repeats processes at S25 and S26 until the enabling signal can be received. If the enabling signal cannot be received even if the in-router transmission unit 56 has waited for a certain number of times or for a predetermined time or longer, or if a disabling signal has been received from the output circuit 60, the in-router transmission unit 56 can return the processing to S21.
Upon reception of the enabling signal (YES at S25), at S27, the in-router transmission unit 56 transmits the read-out forward direction data or reverse direction data to the output circuit 60 connected to the core 30 or the router 40 decided as the destination. Subsequently, at S28, the in-router transmission unit 56 deletes the transmitted forward direction data or reverse direction data from the input storage unit 54. After the process at S28, the in-router transmission unit 56 returns the processing to S21, to perform the present flow repeatedly.
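One pass of the S21 to S28 loop can be sketched as follows. For brevity the sketch assumes a single output circuit, so the destination decision at S23 is omitted, and it returns immediately instead of waiting at S26. The class `OutputStub`, the (data type, value) tuple format, and `transmit_once` are hypothetical.

```python
from collections import deque


class OutputStub:
    """Hypothetical output circuit 60 that enables only some data types."""

    def __init__(self, ready_types):
        self.ready_types = set(ready_types)
        self.received = []

    def request(self, data_type):
        # First or second request signal; True stands for the enabling signal.
        return data_type in self.ready_types

    def accept(self, packet):
        self.received.append(packet)


def transmit_once(input_storage, output):
    """Run one pass of S21 to S28; return True if a packet was moved."""
    if not input_storage:              # S21: nothing stored
        return False
    packet = input_storage[0]          # S22: least recently written data
    if not output.request(packet[0]):  # S24, S25: request, check enabling signal
        return False
    output.accept(packet)              # S27: transmit to the output circuit
    input_storage.popleft()            # S28: delete the transmitted data
    return True


storage = deque([("forward", 1.0), ("reverse", 2.0)])
out = OutputStub({"forward"})
moved_first = transmit_once(storage, out)
moved_second = transmit_once(storage, out)
```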
FIG. 12 is a flowchart illustrating the processing performed by the in-router reception unit 62 of the output circuit 60. When the first request signal or the second request signal is transmitted to the corresponding output circuit 60 from any of the one or more input circuits 50 provided in the router 40, the in-router reception unit 62 performs processes at S41 to S52 described below.
First at S41, the in-router reception unit 62 receives the first request signal or the second request signal from any one of the input circuits 50. Subsequently, at S42, the in-router reception unit 62 determines whether the request signal is a reception request of the forward direction data (that is, the first request signal has been received), or is a reception request of the reverse direction data (that is, the second request signal has been received).
In the case of the reception request of the forward direction data (YES at S42), the in-router reception unit 62 advances the processing to S43. At S43, the in-router reception unit 62 determines whether there is a free space in the forward-direction data buffer 72 in the output storage unit 64.
If there is no free space in the forward-direction data buffer 72 (NO at S43), at S44, the in-router reception unit 62 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router reception unit 62 returns the processing to S43, and repeats processes at S43 and S44 until a free space becomes available in the forward-direction data buffer 72. If a free space does not become available even if the in-router reception unit 62 has waited for a certain number of times or for a predetermined time or longer, the in-router reception unit 62 can transmit a disabling signal to the input circuit 50 that has transmitted the first request signal, to finish the processing.
If there is a free space in the forward-direction data buffer 72 (YES at S43), at S45, the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the first request signal. Upon reception of the enabling signal, the input circuit 50 that has transmitted the first request signal transmits the forward direction data to the corresponding output circuit 60.
Subsequently, at S46, the in-router reception unit 62 receives the forward direction data from the input circuit 50 that has transmitted the first request signal. At S47, the in-router reception unit 62 writes the received forward direction data in the forward-direction data buffer 72. After the process at S47, the in-router reception unit 62 finishes the present flow.
By performing processes at S43 to S47, if there is no free space for storing the forward direction data, the output circuit 60 holds reception until a free space is ensured. Accordingly, the output circuit 60 can transfer the forward direction data reliably to the destination core 30.
In the case of the reception request of the reverse direction data (NO at S42), the in-router reception unit 62 advances the processing to S48. At S48, the in-router reception unit 62 determines whether there is a free space in the reverse-direction data buffer 74 in the output storage unit 64.
If there is no free space in the reverse-direction data buffer 74 (NO at S48), at S49, the in-router reception unit 62 deletes the reverse direction data stored in the reverse-direction data buffer 74 in the output storage unit 64. For example, when the reverse-direction data buffer 74 is a FIFO buffer, the in-router reception unit 62 deletes one piece of the reverse direction data stored at the head of the reverse-direction data buffer 74. That is, the in-router reception unit 62 deletes one piece of the reverse direction data least recently written from the reverse-direction data buffer 74. Accordingly, the in-router reception unit 62 can ensure a free space in the reverse-direction data buffer 74 in the output storage unit 64.
After deletion of the reverse direction data from the reverse-direction data buffer 74 at S49, or if there is already a free space in the reverse-direction data buffer 74 (YES at S48), at S50, the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the second request signal. Upon reception of the enabling signal, the input circuit 50 that has transmitted the second request signal transmits the reverse direction data to the corresponding output circuit 60.
Subsequently, at S51, the in-router reception unit 62 receives the reverse direction data from the input circuit 50 that has transmitted the second request signal. At S52, the in-router reception unit 62 writes the received reverse direction data in the reverse-direction data buffer 74. After the process at S52, the in-router reception unit 62 finishes the present flow.
By performing processes at S48 to S52, if there is no free space for storing the reverse direction data, the output circuit 60 selectively deletes the reverse direction data from the output storage unit 64 to ensure a free space, and immediately receives the reverse direction data. Accordingly, the output circuit 60 can eliminate stagnation of the reverse direction data to ensure smooth traffic.
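The different handling of the two request signals in FIG. 12 can be sketched as follows: the forward direction data is held until space is available, while the least recently written reverse direction data is deleted to make room. The class `OutputStorage`, its method names, and the `dropped` list (kept only so that the deletion is observable) are assumptions made for illustration.

```python
from collections import deque


class OutputStorage:
    """Hypothetical model of the output storage unit 64 and its two FIFO buffers."""

    def __init__(self, forward_capacity, reverse_capacity):
        self.forward = deque()
        self.reverse = deque()
        self.forward_capacity = forward_capacity
        self.reverse_capacity = reverse_capacity
        self.dropped = []

    def request_forward(self):
        # First request signal (S43): enable only if there is a free space.
        return len(self.forward) < self.forward_capacity

    def request_reverse(self):
        # Second request signal (S48 to S50): always enable, deleting the
        # least recently written reverse direction data if the buffer is full.
        if len(self.reverse) >= self.reverse_capacity:
            self.dropped.append(self.reverse.popleft())  # S49
        return True

    def write_forward(self, data):
        self.forward.append(data)  # S47

    def write_reverse(self, data):
        self.reverse.append(data)  # S52


storage = OutputStorage(forward_capacity=1, reverse_capacity=2)
for gradient in ["g1", "g2", "g3"]:
    if storage.request_reverse():
        storage.write_reverse(gradient)
storage.write_forward("f1")
forward_enabled = storage.request_forward()
```

In Python, a `collections.deque` created with `maxlen` discards items from the opposite end automatically on append, which gives the same behavior for the reverse-direction buffer; the explicit version is used here so that the step corresponding to S49 stays visible.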
FIG. 13 is a flowchart illustrating the processing performed by the transmission unit 66 of the output circuit 60. The transmission unit 66 repeatedly performs processes at S61 to S67 described below, during the operation of the neural network device 10.
At S61, the transmission unit 66 monitors the output storage unit 64 to determine whether the forward direction data or the reverse direction data is present in the output storage unit 64. If neither the forward direction data nor the reverse direction data is present in the output storage unit 64 (NO at S61), the transmission unit 66 repeats the process at S61.
If the forward direction data or the reverse direction data is present in the output storage unit 64 (YES at S61), at S62, the transmission unit 66 transmits a request signal to the core 30 or the router 40 connected to the corresponding output circuit 60 via the communication channel 42. Upon reception of the request signal, if the forward direction data and the reverse direction data can be received, the core 30 or the router 40 transmits an enabling signal to the router 40 that has transmitted the request signal.
Subsequently, at S63, the transmission unit 66 determines whether the enabling signal has been received from the core 30 or the router 40 connected to the corresponding output circuit 60. If the enabling signal has not been received (NO at S63), at S64, the transmission unit 66 holds the processing for a certain period of time. After having waited for the certain period of time, the transmission unit 66 returns the processing to S63, and repeats processes at S63 and S64 until the enabling signal is received. If the enabling signal cannot be received even if the transmission unit 66 has waited for a certain number of times or for a predetermined time or longer, or if a disabling signal has been received from the core 30 or the router 40 connected to the corresponding output circuit 60, the transmission unit 66 can return the processing to S61.
When having received the enabling signal (YES at S63), at S65, the transmission unit 66 reads out one piece of the forward direction data or one piece of the reverse direction data that is least recently written and has not been transmitted from the forward-direction data buffer 72 or the reverse-direction data buffer 74 of the output storage unit 64. The transmission unit 66 can read out the forward direction data stored in the forward-direction data buffer 72 and the reverse direction data stored in the reverse-direction data buffer 74 alternately. Further, the transmission unit 66 can read out the forward direction data in preference to the reverse direction data in such a manner that after the forward direction data stored in the forward-direction data buffer 72 has been read out three times, the reverse direction data stored in the reverse-direction data buffer 74 is read out once.
Subsequently, at S66, the transmission unit 66 transmits the read-out forward direction data or reverse direction data to the core 30 or the router 40 connected to the corresponding output circuit 60 via the communication channel 42. Subsequently, at S67, the transmission unit 66 deletes the transmitted forward direction data or reverse direction data from the output storage unit 64. After the process at S67, the transmission unit 66 returns the processing to S61 and repeatedly performs the present flow.
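The read-out order at S65 can be illustrated by the following sketch of the weighted case, in which the forward direction data is read out three times before the reverse direction data is read out once. The function name and the `forward_burst` parameter are hypothetical. The sketch also assumes that, when one buffer is empty, the other buffer is served; the handshake at S62 to S64 is omitted.

```python
from collections import deque


def read_out_order(forward, reverse, forward_burst=3):
    """Yield (kind, data) pairs, oldest first within each buffer.

    After `forward_burst` consecutive forward reads, one reverse read is
    taken if reverse data is waiting. An empty buffer is simply skipped.
    """
    forward, reverse = deque(forward), deque(reverse)
    streak = 0  # consecutive forward reads since the last reverse read
    while forward or reverse:
        take_reverse = bool(reverse) and (streak >= forward_burst or not forward)
        if take_reverse:
            yield ("reverse", reverse.popleft())
            streak = 0
        else:
            yield ("forward", forward.popleft())
            streak += 1


order = [data for _, data in read_out_order(["f0", "f1", "f2", "f3"], ["r0", "r1"])]
print(order)  # ['f0', 'f1', 'f2', 'r0', 'f3', 'r1']
```

Setting `forward_burst=1` gives the alternate read-out also mentioned above.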
As described above, if the reverse direction data to be propagated in the learning processing (the reverse direction processing) in the neural network stagnates in any of the routers 40, the neural network device 10 according to the present embodiment deletes the stagnating reverse direction data to perform transfer of the reverse direction data smoothly. Accordingly, the neural network device 10 can reduce stagnation of the traffic therein.
Further, even if the forward direction data to be propagated in the normal arithmetic processing (the forward direction processing) in the neural network stagnates in any of the routers 40, the neural network device 10 does not delete the forward direction data. Accordingly, the neural network device 10 can reliably perform the arithmetic operation on the input data provided from an external device. Further, although the learning accuracy decreases because the reverse direction data cannot be transferred, the neural network device 10 can perform at least the arithmetic operation reliably, and thus can reduce the influence due to the data deletion.
Further, because the neural network device 10 can eliminate stagnation of the reverse direction data in the router 40, the neural network device 10 can increase a memory capacity for storing the forward direction data in the router 40 and decrease the memory capacity for storing the reverse direction data. Accordingly, the neural network device 10 can realize efficient data transfer with a small memory capacity and cost reduction.
FIG. 14 is a diagram illustrating a configuration of the output storage unit 64 in the output circuit 60 according to a first modification. The output storage unit 64 can have a configuration, for example, as illustrated in FIG. 14. In the first modification, differences from the configurations described above are mainly described. The same applies to a second modification onward.
The output storage unit 64 according to the first modification includes a data storage unit 82 and a memory controller 84. The data storage unit 82 is a random-access memory and stores therein the forward direction data and the reverse direction data. The memory controller 84 executes access control with respect to the data storage unit 82.
The memory controller 84 sets a first memory capacity for storing the forward direction data and a second memory capacity for storing the reverse direction data with respect to the data storage unit 82. For example, the memory controller 84 sets a forward-direction data region having at least the first memory capacity for storing the forward direction data, and a reverse-direction data region having at least the second memory capacity for storing the reverse direction data, with respect to the data storage unit 82.
When having received a first request signal for requesting reception of the forward direction data from any one of the input circuits 50, the in-router reception unit 62 determines whether a total capacity of the forward direction data stored in the data storage unit 82 has reached the first memory capacity. If the total capacity of the forward direction data has not reached the first memory capacity, the in-router reception unit 62 returns an enabling signal to the input circuit 50 that has transmitted the first request signal.
If the total capacity of the forward direction data has reached the first memory capacity, the in-router reception unit 62 does not return the enabling signal, and after the total capacity of the forward direction data has fallen below the first memory capacity, the in-router reception unit 62 returns the enabling signal.
Further, when having received a second request signal for requesting reception of the reverse direction data from any one of the input circuits 50, the in-router reception unit 62 determines whether a total capacity of the reverse direction data stored in the data storage unit 82 has reached the second memory capacity. If the total capacity of the reverse direction data has not reached the second memory capacity, the in-router reception unit 62 returns an enabling signal to the input circuit 50 that has transmitted the second request signal.
If the total capacity of the reverse direction data has reached the second memory capacity, the in-router reception unit 62 deletes any one piece of the reverse direction data stored in the data storage unit 82. After deletion of the reverse direction data, the in-router reception unit 62 returns the enabling signal.
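The capacity checks of the first modification can be sketched as follows. The class and method names are hypothetical, capacities are counted in pieces of data, and both capacities are assumed to be at least one. The text only requires that "any one piece" of the reverse direction data be deleted; deleting the least recently written piece is an assumption made here.

```python
class QuotaStore:
    """Hypothetical model of the data storage unit 82 with per-direction capacities."""

    def __init__(self, forward_capacity, reverse_capacity):
        self.capacity = {"forward": forward_capacity, "reverse": reverse_capacity}
        self.entries = []  # (kind, data) pairs in write order

    def count(self, kind):
        return sum(1 for k, _ in self.entries if k == kind)

    def request(self, kind, data):
        """Return True if an enabling signal would be returned and the data stored."""
        if self.count(kind) >= self.capacity[kind]:
            if kind == "forward":
                # Forward data is never deleted: the enabling signal is withheld.
                return False
            # Reverse capacity reached: delete one stored reverse piece
            # (assumption: the oldest one), then accept the new piece.
            idx = next(i for i, (k, _) in enumerate(self.entries) if k == "reverse")
            del self.entries[idx]
        self.entries.append((kind, data))
        return True


store = QuotaStore(forward_capacity=1, reverse_capacity=2)
results = [
    store.request("forward", "f0"),
    store.request("forward", "f1"),  # withheld: forward capacity reached
    store.request("reverse", "r0"),
    store.request("reverse", "r1"),
    store.request("reverse", "r2"),  # accepted after r0 is deleted
]
print(results, store.entries)
```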
Further, the memory controller 84 manages a write sequence of the forward direction data and the reverse direction data stored in the data storage unit 82. The transmission unit 66 reads out one piece of the forward direction data or one piece of the reverse direction data least recently written according to the write sequence managed by the memory controller 84 and transmits the read-out data to the core 30 or the router 40 connected to the output circuit 60.
The memory controller 84 can change the first memory capacity for storing the forward direction data and the second memory capacity for storing the reverse direction data according to a time variation of the total capacity of the forward direction data and a time variation of the total capacity of the reverse direction data stored in the data storage unit 82. For example, the memory controller 84 calculates a ratio of a reception amount of the forward direction data to a reception amount of the reverse direction data at a regular time interval and changes the ratio of the first memory capacity to the second memory capacity according to a change of the ratio.
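One possible way to change the two capacities is sketched below. The text does not specify the update rule, so the proportional split, the function name, and the `min_share` floor that keeps both regions non-empty are all assumptions made for illustration.

```python
def rebalance(total_capacity, forward_received, reverse_received, min_share=1):
    """Split total_capacity between forward and reverse in proportion to traffic.

    forward_received and reverse_received are the reception amounts measured
    over the most recent interval. Returns (first_capacity, second_capacity).
    """
    received = forward_received + reverse_received
    if received == 0:
        forward = total_capacity // 2  # no traffic observed: split evenly
    else:
        forward = round(total_capacity * forward_received / received)
    # Keep at least min_share entries for each direction.
    forward = max(min_share, min(total_capacity - min_share, forward))
    return forward, total_capacity - forward


print(rebalance(16, 30, 10))   # (12, 4)
print(rebalance(16, 100, 0))   # (15, 1)
```

A memory controller following this sketch would call `rebalance` at a regular time interval and apply the returned capacities to subsequent capacity checks.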
The output circuit 60 according to the first modification can store the forward direction data and the reverse direction data by using a random-access memory.
FIG. 15 is a flowchart illustrating the processing performed by the in-router reception unit 62 of the output circuit 60 according to the second modification. For example, when the output storage unit 64 has a configuration including a randomly accessible data storage unit 82 and the memory controller 84 as illustrated in FIG. 14, the output circuit 60 can perform the processing as illustrated in FIG. 15.
First, at S71, the in-router reception unit 62 receives the first request signal or the second request signal from any one of the input circuits 50. Subsequently, at S72, the in-router reception unit 62 determines whether there is a free space in the data storage unit 82.
If there is no free space in the data storage unit 82 (NO at S72), the in-router reception unit 62 advances the processing to S73. At S73, the in-router reception unit 62 determines whether the received request signal is a reception request of the forward direction data (that is, the first request signal has been received), or a reception request of the reverse direction data (that is, the second request signal has been received).
In the case of a reception request of the forward direction data (YES at S73), the in-router reception unit 62 advances the processing to S74. At S74, the in-router reception unit 62 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router reception unit 62 returns the processing to S72, and repeats processes at S72, S73, and S74 until a free space becomes available in the data storage unit 82. If a free space does not become available in the data storage unit 82 even if the in-router reception unit 62 has waited for a certain number of times or for a predetermined time or longer, the in-router reception unit 62 can transmit a disabling signal to the input circuit 50 that has transmitted the first request signal to finish the processing.
In the case of a reception request of the reverse direction data (NO at S73), the in-router reception unit 62 advances the processing to S75. At S75, the in-router reception unit 62 deletes the reverse direction data stored in the data storage unit 82. For example, the in-router reception unit 62 deletes one piece of the reverse direction data least recently written from the data storage unit 82. Accordingly, the in-router reception unit 62 can ensure a free space in the data storage unit 82. If there is no reverse direction data in the data storage unit 82, the in-router reception unit 62 proceeds to the next process without performing any processing.
If it is determined that there is a free space in the data storage unit 82 (YES at S72) or one piece of the reverse direction data has been deleted from the data storage unit 82 (S75), at S76, the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the first request signal or the second request signal. Upon reception of the enabling signal, the input circuit 50 that has transmitted the first request signal or the second request signal transmits the forward direction data or the reverse direction data to the corresponding output circuit 60.
Subsequently, at S77, the in-router reception unit 62 receives the forward direction data or the reverse direction data from the input circuit 50 that has transmitted the first request signal or the second request signal. At S78, the in-router reception unit 62 writes the received forward direction data or reverse direction data in the data storage unit 82. If there is no reverse direction data in the data storage unit 82 and a free space cannot be ensured at S75, the in-router reception unit 62 discards the reverse direction data received at S77 without writing the data in the data storage unit 82.
The output circuit 60 according to the second modification can cause the forward direction data or the reverse direction data to be stored in the data storage unit 82 without distinction. Further, if there is no free space in the data storage unit 82, and when the reverse direction data is received, the output circuit 60 according to the second modification can delete the stagnating reverse direction data to perform transfer of the reverse direction data smoothly.
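The decision made at S72 to S78 can be summarized by the following sketch. The function name and the returned strings are hypothetical, the data storage unit 82 is modeled as a plain list kept in write order, and the waiting at S74 is represented only by the return value "wait".

```python
def second_modification_step(store, capacity, kind, data):
    """Handle one request against a shared store of (kind, data) pairs, oldest first.

    Returns "stored", "wait" (forward request while the store is full), or
    "discarded" (reverse request while the store holds no reverse data).
    """
    if len(store) >= capacity:               # S72: no free space
        if kind == "forward":                # S73 -> S74: forward data waits
            return "wait"
        for i, (k, _) in enumerate(store):   # S75: delete the oldest reverse piece
            if k == "reverse":
                del store[i]
                break
        else:
            return "discarded"               # nothing to delete: incoming data is dropped
    store.append((kind, data))               # S76 to S78: enable, receive, write
    return "stored"


store = [("forward", "f0"), ("reverse", "r0")]
print(second_modification_step(store, 2, "forward", "f1"))  # wait
print(second_modification_step(store, 2, "reverse", "r1"))  # stored
print(store)  # [('forward', 'f0'), ('reverse', 'r1')]
```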
Further, when having received a request signal from the core 30 or the router 40 connected via the communication channel 42, the input circuit 50 determines whether there is a free space in the input storage unit 54. If there is no free space in the input storage unit 54, the input circuit 50 holds the processing for a certain period of time until a free space becomes available in the input storage unit 54. Therefore, in the second modification, upon reception of a request signal from the core 30 or the router 40 and when there is no free space in the input storage unit 54, the input circuit 50 can transmit a signal of instructing to delete the reverse direction data to the output circuit 60 in the router 40. Accordingly, a free space becomes available in the data storage unit 82 of the output circuit 60, and the input circuit 50 can transmit data to the output circuit 60. After the data is transmitted to the output circuit 60, the input circuit 50 can generate a free space in the input storage unit 54.
FIG. 16 is a flowchart illustrating the processing performed by the in-router reception unit 62 of the output circuit 60 according to a third modification. When the output storage unit 64 has a configuration including, for example, the randomly accessible data storage unit 82 and the memory controller 84 as illustrated in FIG. 14, the output circuit 60 can perform the processing as illustrated in FIG. 16.
First, at S81, the in-router reception unit 62 receives the first request signal or the second request signal from any one of the input circuits 50. Subsequently, at S82, the in-router reception unit 62 determines whether there is a free space in the data storage unit 82.
If there is no free space in the data storage unit 82 (NO at S82), the in-router reception unit 62 advances the processing to S83. At S83, the in-router reception unit 62 determines whether the reverse direction data is present in the data storage unit 82.
If there is no reverse direction data in the data storage unit 82 (NO at S83), the in-router reception unit 62 advances the processing to S84. At S84, the in-router reception unit 62 holds the processing for a certain period of time. After having waited for the certain period of time, the in-router reception unit 62 returns the processing to S82, and repeats processes at S82, S83, and S84 until a free space becomes available in the data storage unit 82. If a free space does not become available even if the in-router reception unit 62 has waited for a certain number of times or for a predetermined time or longer, the in-router reception unit 62 can transmit a disabling signal to the input circuit 50 that has transmitted the first request signal or the second request signal to finish the processing.
If the reverse direction data is present in the data storage unit 82 (YES at S83), the in-router reception unit 62 advances the processing to S85. At S85, the in-router reception unit 62 deletes the reverse direction data stored in the data storage unit 82. For example, the in-router reception unit 62 deletes one piece of the reverse direction data least recently written from the data storage unit 82. Accordingly, the in-router reception unit 62 can cause the data storage unit 82 to have a free space.
If it is determined that there is a free space in the data storage unit 82 (YES at S82) or one piece of the reverse direction data has been deleted from the data storage unit 82 (S85), at S86, the in-router reception unit 62 transmits an enabling signal to the input circuit 50 that has transmitted the first request signal or the second request signal.
Subsequently, at S87, the in-router reception unit 62 receives the forward direction data or the reverse direction data from the input circuit 50 that has transmitted the first request signal or the second request signal. At S88, the in-router reception unit 62 writes the received forward direction data or reverse direction data in the data storage unit 82. After the process at S88, the in-router reception unit 62 finishes the present flow.
The output circuit 60 according to the third modification can cause the data storage unit 82 to store therein the forward direction data or the reverse direction data without distinction. Further, if there is no free space in the data storage unit 82, the output circuit 60 according to the third modification can delete the stagnating reverse direction data to perform transfer of the forward direction data and the reverse direction data smoothly.
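In the third modification, the stored reverse direction data is deleted regardless of whether the incoming request concerns the forward direction data or the reverse direction data. A minimal sketch of S82 to S88 under the same hypothetical list-based model (function name and return strings are assumptions) is:

```python
def third_modification_step(store, capacity, kind, data):
    """Handle one request against a shared store of (kind, data) pairs, oldest first.

    Returns "stored", or "wait" when the store is full of forward data only.
    """
    if len(store) >= capacity:               # S82: no free space
        for i, (k, _) in enumerate(store):   # S83: is any reverse data stored?
            if k == "reverse":
                del store[i]                 # S85: delete the oldest reverse piece
                break
        else:
            return "wait"                    # S84: only forward data is stored
    store.append((kind, data))               # enable, receive, write (up to S88)
    return "stored"


store = [("forward", "f0"), ("reverse", "r0")]
print(third_modification_step(store, 2, "forward", "f1"))  # stored
print(third_modification_step(store, 2, "reverse", "r1"))  # wait
```

Here an incoming forward request evicts stored reverse data, whereas a full store holding only forward data makes every request wait.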
The neural network device 10 according to the embodiment and some modifications have been described above. In the embodiment and the modifications, it is assumed that the input storage unit 54 provided in the input circuit 50 is a FIFO buffer or a shift register. However, the input storage unit 54 provided in the input circuit 50 can be a random-access memory. In this case, the reception unit 52 of the input circuit 50 can perform the same processing as that of the in-router reception unit 62 of the output circuit 60.
That is, when a request signal is received, if there is no free space in the input storage unit 54 for storing therein the reverse direction data, the input circuit 50 deletes the reverse direction data stored in the input storage unit 54. For example, in this case, the input circuit 50 deletes one piece of the reverse direction data least recently written. Further, the input circuit 50 can perform the same processing as that of the in-router reception unit 62 described in the first modification, the second modification, and the third modification with respect to the input storage unit 54.
Further, when the input storage unit 54 provided in the input circuit 50 is a FIFO buffer or a shift register, if the input circuit 50 has received a request signal and there is no free space in the input storage unit 54 for storing therein the reverse direction data, the input circuit 50 deletes one piece of the reverse direction data stored at the head of the FIFO buffer. However, if the data stored at the head of the FIFO buffer is the forward direction data, the input circuit 50 performs the processing described in the embodiment.
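The head-of-FIFO rule for the input storage unit 54 can be sketched as follows. The function name and the tuple layout of the entries are hypothetical; the fallback for forward direction data at the head is represented only by the return value.

```python
from collections import deque


def make_room_in_input_fifo(fifo):
    """Try to free one slot in a full input FIFO of (kind, data) pairs.

    Returns True if the head entry was reverse direction data and was deleted.
    Returns False if the head is forward direction data (or the FIFO is empty),
    in which case the caller falls back to waiting as in the embodiment.
    """
    if fifo and fifo[0][0] == "reverse":
        fifo.popleft()
        return True
    return False


fifo = deque([("reverse", "r0"), ("forward", "f0")])
print(make_room_in_input_fifo(fifo))  # True: r0 at the head was deleted
print(make_room_in_input_fifo(fifo))  # False: f0 at the head is kept
```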
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.

Claims (13)

What is claimed is:
1. A neural network device, comprising:
a plurality of cores configured to perform processing of a part of constituent elements in a neural network; and
a plurality of routers that transfer data output from each of the plurality of cores to any one of the plurality of cores so that processing is performed according to a configuration of the neural network, wherein
the plurality of cores
perform arithmetic processing in the neural network and perform learning processing of the neural network concurrently with the arithmetic processing,
in the arithmetic processing, receive input data as an arithmetic-processing target and output output data as a result of the arithmetic processing, and
when a plurality of coefficients included in the neural network is changed in the learning processing, update the plurality of coefficients,
each of the plurality of routers includes an input circuit and an output circuit,
each of the plurality of cores transmits at least one of forward direction data propagating in the neural network in a forward direction and reverse direction data propagating in the neural network in a reverse direction,
the input circuit receives the forward direction data and the reverse direction data from any one of the plurality of cores and the plurality of routers, and
the output circuit includes:
an output memory circuit;
an in-router receiver that receives the forward direction data and the reverse direction data from the input circuit and writes the received forward direction data and the reverse direction data in the output memory circuit; and
a transmitter that transmits the forward direction data and the reverse direction data stored in the output memory circuit to any one of the plurality of cores and the plurality of routers, wherein
the in-router receiver deletes the reverse direction data stored in the output memory circuit, when a request signal for requesting reception of incoming reverse direction data is received from the input circuit at a time when there is no free space for storing the reverse direction data in the output memory circuit.
2. The device according to claim 1, wherein
the input circuit is connected to any one core or router of the plurality of cores and the plurality of routers and receives the forward direction data or the reverse direction data from the core or router connected to the input circuit, and
the output circuit is connected to any one core or router of the plurality of cores and the plurality of routers and transmits the forward direction data or the reverse direction data to the core or router connected to the output circuit.
3. The device according to claim 2, wherein
the input circuit includes:
an input memory circuit;
a receiver that receives the forward direction data and the reverse direction data from a core or a router connected to the router, and writes the received forward direction data and the reverse direction data in the input memory circuit; and
an in-router transmitter that reads out the forward direction data or the reverse direction data stored in the input memory circuit, determines a core or a router connected to the router as a destination of the read-out forward direction data or reverse direction data, and transmits the read-out forward direction data or reverse direction data to an output circuit connected to the determined core or router.
4. The device according to claim 3, wherein
the output memory circuit includes a forward-direction data buffer that stores the forward direction data therein, and
the transmitter reads out the forward direction data least recently written, from the forward-direction data buffer and transmits the read-out forward direction data to the destination.
5. The device according to claim 4, wherein
the output memory circuit further includes a reverse-direction data buffer that stores the reverse direction data therein and outputs the reverse direction data in order of being written, and
the transmitter reads out the reverse direction data least recently written, from the reverse-direction data buffer and transmits the read-out reverse direction data to the destination.
6. The device according to claim 1, wherein the transmitter deletes the transmitted forward direction data or the reverse direction data from the output memory circuit.
7. The device according to claim 1, wherein when there is no free space for storing the reverse direction data in the output memory circuit, the in-router receiver deletes one piece of the reverse direction data least recently written, from the output memory circuit.
8. The device according to claim 1, wherein upon reception of the request signal, the in-router receiver transmits an enabling signal for enabling transmission of the reverse direction data to a particular input circuit that has transmitted the request signal.
9. The device according to claim 1, wherein
the output memory circuit includes:
a randomly accessible data storage circuit that stores therein the forward direction data and the reverse direction data; and
a memory controller that executes access control with respect to the randomly accessible data storage circuit,
the memory controller sets a memory capacity for storing the forward direction data and a memory capacity for storing the reverse direction data with respect to the randomly accessible data storage circuit, and
upon reception of a request signal for requesting reception of the reverse direction data from any one of the input circuits and when a total capacity of the reverse direction data stored in the randomly accessible data storage circuit has reached the memory capacity for storing the reverse direction data, the in-router receiver deletes any one piece of the reverse direction data stored in the randomly accessible data storage circuit.
10. The device according to claim 1, wherein
the output memory circuit includes a randomly accessible data storage circuit that stores therein the forward direction data and the reverse direction data, and
upon reception of a request signal for requesting reception of the reverse direction data from any one of the input circuits and when there is no free space in the randomly accessible data storage circuit, the in-router receiver deletes the reverse direction data stored in the randomly accessible data storage circuit.
11. The device according to claim 1, wherein
the output memory circuit includes a randomly accessible data storage circuit that stores therein the forward direction data and the reverse direction data, and
upon reception of a request signal for requesting reception of the forward direction data or the reverse direction data from any one of the input circuits and when there is no free space in the randomly accessible data storage circuit, the in-router receiver deletes any one piece of the reverse direction data stored in the randomly accessible data storage circuit.
12. The device according to claim 1, wherein the plurality of cores are provided in one-to-one association with the plurality of routers, and transmit and receive data to and from the routers provided in association therewith.
13. The device according to claim 12, wherein
the plurality of routers are arranged in a matrix,
each of the routers includes:
a first set of the input circuit and the output circuit;
a second set of the input circuit and the output circuit;
a third set of the input circuit and the output circuit;
a fourth set of the input circuit and the output circuit; and
a fifth set of the input circuit and the output circuit,
the first set and the second set are connected to other routers adjacent in a first array direction in the matrix,
the third set and the fourth set are connected to other routers adjacent in a second array direction in the matrix, and
the fifth set is connected to a core provided in association with the router.
US15/911,366 2017-11-17 2018-03-05 Neural network device Active 2041-07-08 US11461617B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2017-222259 2017-11-17
JP2017222259A JP6794336B2 (en) 2017-11-17 2017-11-17 Neural network device

Publications (2)

Publication Number Publication Date
US20190156180A1 US20190156180A1 (en) 2019-05-23
US11461617B2 US11461617B2 (en) 2022-10-04

Family

ID=66533079

Family Applications (1)

Application Number Title Priority Date Filing Date
US15/911,366 Active 2041-07-08 US11461617B2 (en) 2017-11-17 2018-03-05 Neural network device

Country Status (2)

Country Link
US (1) US11461617B2 (en)
JP (1) JP6794336B2 (en)

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02271463A (en) * 1989-04-13 1990-11-06 Seiko Epson Corp Neural network simulator
JPH04344970A (en) * 1991-05-23 1992-12-01 Nippon Telegr & Teleph Corp <Ntt> Neural network processor
JPH0784976A (en) * 1993-09-10 1995-03-31 Toshiba Corp Neural network
JPH086916A (en) * 1994-06-22 1996-01-12 Hitachi Ltd Method and device for learning recurrent neural network
JPH09259100A (en) * 1996-03-19 1997-10-03 Toshiba Corp Neural network device
JPH1153335A (en) * 1997-07-31 1999-02-26 Ricoh Co Ltd Parallel bp successive learning processing device
US20130320989A1 (en) * 2011-03-07 2013-12-05 Hitachi, Ltd. Battery state estimation method and battery control system
US8909576B2 (en) * 2011-09-16 2014-12-09 International Business Machines Corporation Neuromorphic event-driven neural computing architecture in a scalable neural network
US9159020B2 (en) * 2012-09-14 2015-10-13 International Business Machines Corporation Multiplexing physical neurons to optimize power and area
EP3089080A1 (en) * 2015-04-27 2016-11-02 Universität Zürich Networks and hierarchical routing fabrics with heterogeneous memory structures for scalable event-driven computing systems
JP6309936B2 (en) * 2015-11-17 2018-04-11 ファナック株式会社 Control device with coolant monitoring function
US10885425B2 (en) * 2016-12-20 2021-01-05 Intel Corporation Network traversal using neuromorphic instantiations of spike-time-dependent plasticity
CN113792847B (en) * 2017-02-23 2024-03-08 大脑系统公司 Accelerated deep learning apparatus, method and system
KR102008287B1 (en) * 2017-05-23 2019-08-07 고려대학교 산학협력단 Bidirectional fifo memoy and processing device for convoultion using the same

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05307624A (en) 1991-09-05 1993-11-19 Ricoh Co Ltd Signal processor
JPH05216859A (en) 1991-12-11 1993-08-27 Ricoh Co Ltd Signal processor
US5485548A (en) 1992-08-01 1996-01-16 Ricoh Company, Ltd. Signal processing apparatus using a hierarchical neural network
US20070201434A1 (en) * 2006-02-15 2007-08-30 Shuji Nakamura Storage system having a channel control function using a plurality of processors
US8977583B2 (en) * 2012-03-29 2015-03-10 International Business Machines Corporation Synaptic, dendritic, somatic, and axonal plasticity in a network of neural cores using a plastic multi-stage crossbar switching
US20150088797A1 (en) 2013-09-26 2015-03-26 Gwangju Institute Of Science And Technology Synapse circuits for connecting neuron circuits, unit cells composing neuromorphic circuit, and neuromorphic circuits
US20160284400A1 (en) 2015-03-27 2016-09-29 University Of Dayton Analog neuromorphic circuit implemented using resistive memories
US20160336064A1 (en) 2015-05-15 2016-11-17 Arizona Board Of Regents On Behalf Of Arizona State University Neuromorphic computational system(s) using resistive synaptic devices
JP2017049945A (en) 2015-09-04 2017-03-09 株式会社東芝 Signal generator and transmission device
US20180082168A1 (en) 2016-09-20 2018-03-22 Kabushiki Kaisha Toshiba Memcapacitor, neuro device, and neural network device
JP2018049887A (en) 2016-09-20 2018-03-29 株式会社東芝 Memcapacitor, neuro element, and neural network device
US20180091442A1 (en) * 2016-09-29 2018-03-29 International Business Machines Corporation Network switch architecture supporting multiple simultaneous collective operations
US20180189645A1 (en) * 2016-12-30 2018-07-05 Intel Corporation Neuromorphic computer with reconfigurable memory mapping for various neural network topologies
US20180211154A1 (en) 2017-01-25 2018-07-26 Kabushiki Kaisha Toshiba Multiplier accumurator, network unit, and network apparatus
JP2018120433A (en) 2017-01-25 2018-08-02 株式会社東芝 Product-sum operator, network unit and network device
JP2019053563A (en) 2017-09-15 2019-04-04 株式会社東芝 Arithmetic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Geoffrey W. Burr "Analog Resistive Neuromorphic Hardware", IBM Research—Almaden, Jun. 30, 2017, 135 pages.

Also Published As

Publication number Publication date
JP6794336B2 (en) 2020-12-02
JP2019095861A (en) 2019-06-20
US20190156180A1 (en) 2019-05-23

Similar Documents

Publication Publication Date Title
US11461617B2 (en) Neural network device
KR20160030550A (en) Architecture and method for hybrid circuit-switched and packet-switched router
JP6905195B2 (en) Data transfer device, arithmetic processing device and data transfer method
JPH027743A (en) Data packet switching device
JP2010539837A (en) Queue formation method
US10007625B2 (en) Resource allocation by virtual channel management and bus multiplexing
JP6853479B2 (en) Information processing system, information processing device, and control method of information processing system
CN112491715B (en) Routing device and routing equipment of network on chip
JP6847334B2 (en) Network equipment, network systems, network methods, and network programs
CN116915708A (en) Method for routing data packets, processor and readable storage medium
US10728178B2 (en) Apparatus and method for distribution of congestion information in a switch
JP4687925B2 (en) Priority arbitration system and priority arbitration method
US8429240B2 (en) Data transfer device and data transfer system
US7406546B1 (en) Long-distance synchronous bus
JP2014204160A (en) Gateway unit
JP2018207396A (en) Information processor, information processing method and program
US9497141B2 (en) Switch point having look-ahead bypass
JP2020005017A (en) Dynamic variable capacity memory device and storage capacity dynamic variable method
CN112437021B (en) Routing control method, device, routing equipment and storage medium
WO2010058693A1 (en) Packet transmission device, inter-processor communication system, parallel processor system, and packet transmission method
US10346089B2 (en) Data processing system having a write request network and a write data network
JP2013005145A (en) Packet transfer device and packet transfer method
JP6287493B2 (en) Information processing apparatus, transfer apparatus, and control method
EP2939381B1 (en) Acknowledgement forwarding
WO2019171595A1 (en) Control apparatus, communication method and communication program

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: KABUSHIKI KAISHA TOSHIBA, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:NOMURA, KUMIKO;MARUKAME, TAKAO;REEL/FRAME:045836/0799

Effective date: 20180326

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE