CN110929855A - Data interaction method and device

Data interaction method and device

Info

Publication number
CN110929855A
CN110929855A (application CN201811100243.7A)
Authority
CN
China
Prior art keywords
register
new channel
written
channel
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811100243.7A
Other languages
Chinese (zh)
Other versions
CN110929855B (en)
Inventor
翟云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Jun Zheng Science And Technology Ltd
Original Assignee
Hefei Jun Zheng Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Jun Zheng Science And Technology Ltd filed Critical Hefei Jun Zheng Science And Technology Ltd
Priority to CN201811100243.7A
Publication of CN110929855A
Application granted
Publication of CN110929855B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Advance Control (AREA)

Abstract

The invention provides a data interaction method and a data interaction device. The method comprises: monitoring whether a new channel identifier is written to a first register; sending an interrupt signal to an external processor when it is determined that a new channel identifier has been written to the first register; and the external processor reading the new channel identifier from the first register in response to the interrupt signal and performing a preset first processing operation on the channel indicated by the new channel identifier. In this scheme, a register is provided so that, once a channel has been processed, the channel's identifier is written to the register, and the change of the identifier in the register triggers the next processing step for that channel. A channel-level task-pipeline effect is thus achieved between the neural network processor and the external processor, improving data transmission efficiency and data processing efficiency.

Description

Data interaction method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a data interaction method and device.
Background
Neural networks have been a research hotspot in the field of artificial intelligence since the 1980s. They abstract the neuron networks of the human brain from an information-processing perspective to build simple models, which are then connected in different ways to form different networks. In engineering and academia they are often referred to simply as neural networks or neural-like networks.
A neural network is a computational model formed by a large number of interconnected nodes (neurons). Each node represents a particular output function, called an excitation (activation) function. Each connection between two nodes carries a weight for the signal passing through it, which serves as the memory of the artificial neural network. The output of the network depends on its connection pattern, its weights, and its excitation functions. The network itself is usually an approximation of some algorithm or function in nature, or an expression of a logical strategy.
Because neural networks involve a huge amount of computation, an NPU (Neural-network Processing Unit, also called a neural network processor or neural network acceleration engine) often relies on dedicated digital logic circuits for acceleration. General-purpose processors such as CPUs, GPUs and DSPs can run neural networks, but their performance and energy efficiency are poor for such workloads, so a dedicated neural network accelerator is generally chosen for acceleration on the inference side.
Although neural networks vary in shape, their computation is relatively regular and well suited to ASIC acceleration by means of coarse-grained instructions, for example: convolution, pooling, fully-connected operations, and the like.
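As a purely illustrative sketch (this application does not define any command format), such a coarse-grained instruction might be represented by a small descriptor handed to the accelerator; every type, field and opcode name below is a hypothetical example:

    /* Hypothetical coarse-grained NPU command descriptor; names and
     * fields are illustrative only and are not taken from this patent. */
    #include <stdint.h>

    typedef enum { OP_CONV, OP_POOL, OP_FC } npu_opcode_t;

    typedef struct {
        npu_opcode_t op;    /* which fixed-function unit to run         */
        uint32_t in_addr;   /* address of the input feature map         */
        uint32_t out_addr;  /* address of the output feature map        */
        uint16_t channels;  /* number of channels to process            */
        uint16_t kernel;    /* kernel size (convolution/pooling window) */
    } npu_cmd_t;

A single such descriptor covers a whole layer-level operation, which is what makes the instruction "coarse-grained" compared with the fine-grained instruction streams of a CPU or GPU.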
In practice, however, convolution, pooling and full connection alone are not enough: other computations are sometimes needed, and new operation types appear as algorithms evolve. An accelerator that relies only on a limited set of fixed functions can hardly cover such cases, so its processing capability must be extended appropriately (for example, by handing unsupported operations over to a CPU). Because this requires data interaction with other processing resources, the interaction cost, the efficiency of data processing, and so on must be considered.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiments of the invention provide a data interaction method and device, aiming to achieve the technical effect of improving processing efficiency.
In one aspect, a data interaction method is provided, including:
monitoring whether a new channel identifier is written to the first register;
sending an interrupt signal to an external processor when it is determined that a new channel identifier has been written to the first register;
wherein the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier.
In one embodiment, the method further comprises:
monitoring whether a new channel identifier is written to the second register;
and, when it is determined that a new channel identifier has been written to the second register, performing a preset second processing operation on the channel identified by the new channel identifier.
In one embodiment, the first register and the second register are both located in a neural network processor.
In one embodiment, before monitoring whether a new channel identifier is written to the first register, the method further comprises:
performing a preset third processing operation on the channel identified by the new channel identifier;
and, after the third processing operation is completed, writing the new channel identifier to the first register.
In one embodiment, the data interaction method is applied to a neural network system.
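A minimal sketch of the monitoring and notification steps above, assuming the first register is memory-mapped and a platform-specific helper raises the interrupt line; the address, the ID_NONE sentinel and raise_irq() are illustrative assumptions rather than details from this application:

    /* Sketch of the monitor: as soon as a new channel identifier is
     * written to the first register, interrupt the external processor.
     * Address, sentinel and raise_irq() are assumed for illustration. */
    #include <stdint.h>

    #define FIRST_REG ((volatile uint32_t *)0x40001000u) /* assumed MMIO */
    #define ID_NONE   0xFFFFFFFFu                        /* "no id yet"  */

    extern void raise_irq(void); /* platform-specific: interrupt the CPU */

    void monitor_first_register(void)
    {
        uint32_t last = ID_NONE;
        for (;;) {
            uint32_t id = *FIRST_REG;
            if (id != last && id != ID_NONE) { /* a new id was written  */
                raise_irq();                   /* send interrupt signal */
                last = id;
            }
        }
    }

In the actual device this monitoring would be done by hardware logic rather than a software polling loop; the loop only makes the register's intended behavior explicit.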
In another aspect, a data interaction apparatus is provided, including:
the first monitoring module is used for monitoring whether a new channel identifier is written into the first register;
a sending module, configured to send an interrupt signal to an external processor if it is determined that a new channel identifier is written in the first register;
wherein the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier.
In one embodiment, the above apparatus further comprises:
the second monitoring module is used for monitoring whether a new channel identifier is written into the second register;
and the first processing module is used for performing a preset second processing operation on the channel identified by the new channel identifier when it is determined that the new channel identifier has been written to the second register.
In one embodiment, the first register and the second register are both located in a neural network processor.
In one embodiment, the above apparatus further comprises:
the second processing module is used for performing a preset third processing operation on the channel identified by the new channel identifier before monitoring whether the new channel identifier is written to the first register;
and the write module is used for writing the new channel identifier into the first register after the third processing operation is completed.
In one embodiment, the device is applied to a neural network system.
In the above example, whether a new channel identifier is written to the first register is monitored; an interrupt signal is sent to an external processor when it is determined that a new channel identifier has been written to the first register; and the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier. In other words, by providing a register, a channel's identifier is written to the register once that channel has been processed, and the change of the identifier in the register triggers the next processing step for the channel, so that a task-pipeline effect is achieved between the neural network processor and the external processor and both data transmission efficiency and data processing efficiency are improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a timing diagram of a prior art neural network process;
FIG. 2 is a neural network processing timing diagram according to the present application;
FIG. 3 is an architectural diagram of a neural network system according to an embodiment of the present application;
FIG. 4 is a flowchart of a data interaction method between an NPU and a main CPU according to an embodiment of the present application;
FIG. 5 is a block diagram of a data interaction device for an NPU and a main CPU according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
Some calculations in existing neural network systems cannot be completed by the NPU and require other processors, for example a CPU or GPU. The NPU therefore has to interact with those processors, and such data interaction raises the problems of interaction cost and data processing efficiency.
Specifically, existing data interaction has the following problem. Assume a neural network in which one layer is a convolution (CONV1), the next layer negates, point by point, the feature maps generated by CONV1 (NEG1), and the layer after that performs pooling (POOL1). Assume CONV1, NEG1 and POOL1 each have n channels in this example, and assume the current NPU does not support the negation operation, so the negation must be handed to the CPU and performed there.
Because of the data dependencies, the NPU cannot compute CONV1 and POOL1 at the same time. Under the existing processing scheme, shown in fig. 1, the point-by-point negation of NEG1 is performed only after all n channels of CONV1 have been processed, and the pooling is performed only after all n channels of NEG1 have been processed.
However, with CONV1 having n channels, there is clearly no need to wait until all n channels have been computed before letting the CPU start the NEG1 computation. Likewise, the NPU need not wait for the CPU to finish the NEG computation on all n channels before starting the POOL computation.
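A rough illustration of the potential gain, under a simplified model that ignores interrupt latency and register-access overhead (which this application does not quantify): let T_c, T_n and T_p be the per-channel times of CONV1, NEG1 and POOL1. Then

    T_serial   = n · (T_c + T_n + T_p)
    T_pipeline ≈ (T_c + T_n + T_p) + (n - 1) · max(T_c, T_n, T_p)

so with balanced stages (T_c = T_n = T_p = T) the layer-by-layer scheme takes 3nT while the channel-level pipeline takes about (n + 2)T, approaching a threefold speedup for large n.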
Therefore, this example proposes an interaction mechanism between the NPU and the host CPU to improve the interaction performance between them and thereby the performance of the entire system.
Specifically, in this example, as shown in fig. 2, an sbox is designed in the NPU, and registers for a task-out ID and a task-in ID are designed inside the sbox.
When the NPU completes the convolution (CONV1) of one channel, it updates the task-out ID register with the ID number of that channel. When the sbox detects that a new task-out ID has been written, it sends an interrupt to the host CPU (for example, via irq). On receiving the interrupt, the host CPU reads the task-out ID from the sbox and performs the NEG computation of the corresponding channel; when that is done, it writes the ID number of the channel into the task-in ID register in the sbox (for example, through sbox_rw in fig. 2). When the sbox detects that a new task-in ID has been written, the POOL computation of the corresponding channel is performed.
This yields the processing flow shown in fig. 3. Clearly, this processing scheme produces a channel-level task-pipeline effect overall, so execution is accelerated.
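The handshake above can be mimicked on an ordinary host to see the pipeline form. The following C program is a runnable simulation sketch, not the patented hardware: two threads stand in for the NPU and the host CPU, two shared words stand in for the task-out ID and task-in ID registers, a condition-variable broadcast stands in for the irq interrupt, and CONV1/NEG1/POOL1 are reduced to prints; all names are illustrative.

    /* Simulation of the channel-level CONV1 -> NEG1 -> POOL1 pipeline. */
    #include <pthread.h>
    #include <stdio.h>

    #define N_CHANNELS 4
    #define ID_NONE    (-1)

    static int task_out_id = ID_NONE; /* "task-out ID register": NPU -> CPU */
    static int task_in_id  = ID_NONE; /* "task-in ID register":  CPU -> NPU */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  irq  = PTHREAD_COND_INITIALIZER;

    /* Write an id into a one-slot "register", waiting until the previous
     * id has been consumed; the broadcast stands in for raising irq. */
    static void mbox_put(int *reg, int id)
    {
        pthread_mutex_lock(&lock);
        while (*reg != ID_NONE)
            pthread_cond_wait(&irq, &lock);
        *reg = id;
        pthread_cond_broadcast(&irq);
        pthread_mutex_unlock(&lock);
    }

    /* Blocking read-and-clear: wait for an id, then consume it. */
    static int mbox_get(int *reg)
    {
        pthread_mutex_lock(&lock);
        while (*reg == ID_NONE)
            pthread_cond_wait(&irq, &lock);
        int id = *reg;
        *reg = ID_NONE;
        pthread_cond_broadcast(&irq);
        pthread_mutex_unlock(&lock);
        return id;
    }

    /* Non-blocking variant used by the NPU to poll the task-in ID. */
    static int mbox_try_get(int *reg, int *id)
    {
        int got = 0;
        pthread_mutex_lock(&lock);
        if (*reg != ID_NONE) {
            *id = *reg;
            *reg = ID_NONE;
            pthread_cond_broadcast(&irq);
            got = 1;
        }
        pthread_mutex_unlock(&lock);
        return got;
    }

    /* NPU: convolve each channel, publish its id, and pool a channel as
     * soon as the CPU reports that its negation is done. */
    static void *npu_thread(void *arg)
    {
        int conv_ch = 0, pooled = 0, ch;
        (void)arg;
        while (pooled < N_CHANNELS) {
            if (conv_ch < N_CHANNELS) {
                printf("NPU: CONV1 channel %d\n", conv_ch);
                mbox_put(&task_out_id, conv_ch++);
            }
            while (mbox_try_get(&task_in_id, &ch)) {
                printf("NPU: POOL1 channel %d\n", ch);
                pooled++;
            }
        }
        return NULL;
    }

    /* Host CPU: on each "interrupt", read the task-out ID, negate that
     * channel point by point, then report back via the task-in ID. */
    static void *cpu_thread(void *arg)
    {
        (void)arg;
        for (int i = 0; i < N_CHANNELS; i++) {
            int ch = mbox_get(&task_out_id);
            printf("CPU: NEG1  channel %d\n", ch);
            mbox_put(&task_in_id, ch);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t npu, cpu;
        pthread_create(&npu, NULL, npu_thread, NULL);
        pthread_create(&cpu, NULL, cpu_thread, NULL);
        pthread_join(npu, NULL);
        pthread_join(cpu, NULL);
        return 0;
    }

Compiled with -pthread, the output interleaves CONV1, NEG1 and POOL1 lines per channel instead of completing each layer for all channels first, which is the channel-level pipeline effect of fig. 3.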
Neural network technology can be applied to, but is not limited to, fields such as pattern recognition, intelligent robotics, automatic control, prediction and estimation, biology, medicine, and economics.
The above uses a specific example for explanation; in an actual implementation, the processor need not be a CPU, and the operations need not be pooling or convolution operations.
Based on this, this example provides a method for the cooperative work of a neural network processor, which may include the following steps:
Step 1: processing data of a first channel through a first network layer of the neural network to obtain a first processing result of the first channel, wherein the first network layer is provided with a plurality of channels;
Step 2: immediately providing the first processing result of the first channel of the first network layer to an external processor, so that the external processor processes it to obtain a second processing result of the first channel;
Step 3: acquiring the second processing result of the first channel and processing it through a second network layer of the neural network to obtain a third processing result of the first channel.
That is, when interaction with the external processor is needed, the data of a channel is provided to the external processor as soon as the first network layer finishes processing that channel, instead of only after the data of all channels has been processed; likewise, once the external processor finishes processing a channel, the data is immediately provided to the second network layer, instead of waiting until all channels are done. This avoids the low processing efficiency caused by having to wait for all channels to finish before the next layer is triggered, achieving the technical effect of effectively improving processing efficiency.
Specifically, in implementation, a first register and a second register may be provided in the NPU. When the first network layer finishes processing the data of the current channel, the ID of that channel is written into the first register, which lets the external processor know that processing of that channel's data can be triggered. After the external processor finishes, it writes the channel's ID into the second register, informing the NPU that the next network layer can process that channel's data. A channel-level pipeline effect is thus formed.
In one embodiment, a trigger controller (e.g., the sbox) may be provided, which contains the first register and the second register and monitors their data status in real time.
Specifically, the neural network processor instantly providing the first processing result of the first channel of the first network layer to the external processor may include:
S1: the first network layer writes the channel identifier of the first channel to a first register in the neural network processor;
S2: when the neural network processor detects that the channel identifier has been written to the first register, it triggers the external processor to acquire the first processing result of the first channel.
The neural network processor triggering the external processor to acquire the first processing result of the first channel may include: the neural network processor sends an interrupt signal to the external processor; the external processor reads the channel identifier of the first channel from the first register in response to the interrupt signal; and the external processor acquires the first processing result of the first channel according to the channel identifier of the first channel.
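On the external-processor side, the interrupt handler could look like the following bare-metal sketch. The register addresses, the feature-map layout, the channel length and the handler name are illustrative assumptions (interrupt-controller registration is platform-specific and omitted):

    /* Sketch of the external processor's handler: read the channel id
     * from the first register, run the first processing operation (here,
     * point-wise negation), then write the id to the second register. */
    #include <stdint.h>

    #define TASK_OUT_ID ((volatile uint32_t *)0x40001000u) /* assumed MMIO */
    #define TASK_IN_ID  ((volatile uint32_t *)0x40001004u) /* assumed MMIO */
    #define CH_LEN      1024                  /* assumed elements/channel  */

    extern int32_t feature_map[];   /* CONV1 output, one map per channel */

    void sbox_irq_handler(void)
    {
        uint32_t ch = *TASK_OUT_ID;           /* read the new channel id */
        int32_t *buf = &feature_map[ch * CH_LEN];
        for (uint32_t i = 0; i < CH_LEN; i++) /* NEG1: point-wise negate */
            buf[i] = -buf[i];
        *TASK_IN_ID = ch;                /* tell the NPU POOL1 may start */
    }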
Specifically, the obtaining, by the neural network processor, a second processing result of the first channel may include:
s1: the neural network processor acquires a channel identifier of a first channel from a second register, wherein the channel identifier of the first channel in the second register is written by the external processor after the processing is completed;
s2: the neural network processor acquires the channel identifier under the condition that the channel identifier is detected to be written in the second register;
s3: and the neural network processor acquires a second processing result of the first channel according to the acquired channel identifier.
As a specific example, a method for the cooperative work of a neural network processor may include the following steps:
Step 1: the neural network processor performs convolution on the data of the current channel to obtain the convolution result of that channel;
Step 2: the convolution result of the channel is sent to an external processor, which negates it point by point while the neural network processor performs the convolution of the next channel;
Step 3: the point-by-point negation result of the current channel is acquired from the external processor and pooled.
In view of the problem of data transmission between the neural network processor and the external processor, this example provides a data interaction method which, as shown in fig. 4, may include the following steps:
Step 401: monitoring whether a new channel identifier is written to the first register;
Step 402: sending an interrupt signal to an external processor when it is determined that a new channel identifier has been written to the first register;
wherein the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier.
As for the external processor, after it finishes its processing it needs to interact with the neural network processor and hand the data back for further processing. The neural network processor may therefore monitor whether a new channel identifier is written to the second register and, when it determines that a new channel identifier has been written to the second register, perform a preset second processing operation on the channel identified by the new channel identifier.
The first register and the second register are both located in the neural network processor.
Specifically, for the neural network processor, before it monitors whether a new channel identifier is written to the first register, a preset third processing operation may be performed on the channel identified by the new channel identifier, and after the third processing operation is completed the new channel identifier is written to the first register.
The data interaction method can be applied to a neural network system.
Taking the data interaction between the NPU and the main CPU as an example, the data interaction method may include:
Step 1: the NPU monitors whether a new channel identifier is written to the task-out ID register in the NPU;
Step 2: when it is determined that a new channel identifier has been written to the task-out ID register, an interrupt signal is sent to the main CPU;
wherein the main CPU reads the new channel identifier from the task-out ID register in response to the interrupt signal and performs the point-by-point negation operation on the channel indicated by the new channel identifier.
In the above example, by providing the registers, a channel's identifier is written to a register once the channel has been processed, and the change of the identifier in the register triggers the next processing step for that channel, so that the NPU and the main CPU achieve a task-pipeline effect and data transmission efficiency and data processing efficiency are improved.
In implementation, the interrupt signal may be sent to the main CPU via irq.
As for the main CPU, after it executes the preset operation it may notify the NPU to process the data by writing a channel identifier into a register. Specifically, whether a new channel identifier is written to the task-in ID register may be monitored, and when it is determined that a new channel identifier has been written to the task-in ID register, the pooling operation is performed on the channel identified by the new channel identifier.
In implementation, the new channel identifier may be written into the task-in ID register through sbox_rw after the main CPU completes the point-by-point negation operation.
In one embodiment, before the NPU monitors whether a new channel identifier is written to the task-out ID register in the NPU, a convolution operation may be performed on the channel identified by the new channel identifier, and after the convolution is completed the new channel identifier is written to the task-out ID register. That is, as soon as the convolution of one channel is completed, the external processor is triggered to perform the point-by-point negation immediately, instead of waiting for all channels to finish convolution.
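The NPU-side sequencing just described might look like the following bare-metal sketch, complementary to the host-CPU handler sketched earlier; again the register addresses, the ID_NONE sentinel and the conv_channel()/pool_channel() stubs are illustrative assumptions, not the patented implementation:

    /* Sketch of the NPU-side loop: convolve a channel, publish its id in
     * the task-out ID register, and pool each channel as soon as its id
     * appears in the task-in ID register. */
    #include <stdint.h>

    #define TASK_OUT_ID ((volatile uint32_t *)0x40001000u) /* assumed MMIO */
    #define TASK_IN_ID  ((volatile uint32_t *)0x40001004u) /* assumed MMIO */
    #define ID_NONE     0xFFFFFFFFu
    #define N_CHANNELS  16

    extern void conv_channel(uint32_t ch);  /* fixed-function CONV1 stub */
    extern void pool_channel(uint32_t ch);  /* fixed-function POOL1 stub */

    void npu_run_layer(void)
    {
        uint32_t convolved = 0, pooled = 0;
        while (pooled < N_CHANNELS) {
            if (convolved < N_CHANNELS) {
                conv_channel(convolved);    /* convolution of one channel   */
                *TASK_OUT_ID = convolved++; /* sbox then interrupts the CPU */
            }
            uint32_t ch = *TASK_IN_ID;      /* has the CPU finished NEG1?   */
            if (ch != ID_NONE) {
                *TASK_IN_ID = ID_NONE;      /* consume the id               */
                pool_channel(ch);           /* POOL1 for that channel       */
                pooled++;
            }
        }
    }

In real hardware the pool step would be triggered by the sbox detecting the task-in ID write, as described above, rather than by software polling; the loop only makes the ordering explicit.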
Based on the same inventive concept, an embodiment of the present invention further provides a data interaction apparatus, as described in the following embodiments. Because the principle by which the data interaction apparatus solves the problem is similar to that of the data interaction method, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or a combination of software and hardware, is also possible and contemplated. Fig. 5 is a block diagram of a data interaction device according to an embodiment of the present invention; as shown in fig. 5, the device is located in a neural network processor and may include a first monitoring module 501 and a sending module 502, whose structure is described below.
A first monitoring module 501, configured to monitor whether a new channel identifier is written in the first register;
a sending module 502, configured to send an interrupt signal to an external processor if it is determined that a new channel identifier is written in the first register;
wherein the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier.
The above apparatus may further include: a second monitoring module for monitoring whether a new channel identifier is written to the second register; and a first processing module for performing a preset second processing operation on the channel identified by the new channel identifier when it is determined that the new channel identifier has been written to the second register.
In one embodiment, the first register and the second register are both located in a neural network processor.
In one embodiment, the apparatus may further include: a second processing module for performing a preset third processing operation on the channel identified by the new channel identifier before monitoring whether the new channel identifier is written to the first register; and a write module for writing the new channel identifier into the first register after the third processing operation is completed.
In one embodiment, the device can also be applied to a neural network system.
In another embodiment, software is provided for executing the technical solutions described in the above embodiments and preferred embodiments.
In another embodiment, a storage medium storing the above software is provided, the storage medium including but not limited to: optical disks, floppy disks, hard disks, erasable memory, and the like.
From the above description, it can be seen that the embodiments of the present invention achieve the following technical effects: whether a new channel identifier is written to the first register is monitored; an interrupt signal is sent to an external processor when it is determined that a new channel identifier has been written to the first register; and the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier. In other words, by providing a register, a channel's identifier is written to the register once the channel has been processed, and the change of the identifier in the register triggers the next processing step for that channel, so that a task-pipeline effect is achieved between the neural network processor and the external processor and data transmission efficiency and data processing efficiency are improved.
In this specification, adjectives such as "first" and "second" are used only to distinguish one element or action from another, and do not necessarily require or imply any actual relationship or order between them. Where the context permits, a reference to an element, component or step should not be construed as limited to only one of them, but may mean one or more of the element, component or step.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the invention described above may be implemented with a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented with program code executable by a computing device, so that they may be stored in a storage device and executed by a computing device; in some cases the steps shown or described may be performed in a different order from that described here. They may also be fabricated as individual integrated circuit modules, or several of their modules or steps may be fabricated as a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes to the embodiments of the invention. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its protection scope.

Claims (10)

1. A method for data interaction, comprising:
monitoring whether a new channel identifier is written to the first register;
sending an interrupt signal to an external processor when it is determined that a new channel identifier has been written to the first register;
wherein the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier.
2. The method of claim 1, further comprising:
monitoring whether a new channel identifier is written to the second register;
and, when it is determined that a new channel identifier has been written to the second register, performing a preset second processing operation on the channel identified by the new channel identifier.
3. The method of claim 2, wherein the first register and the second register are both located in a neural network processor.
4. The method of claim 1, wherein, before monitoring whether a new channel identifier is written to the first register, the method further comprises:
performing a preset third processing operation on the channel identified by the new channel identifier;
and, after the third processing operation is completed, writing the new channel identifier to the first register.
5. The method according to any one of claims 1 to 4, applied in a neural network system.
6. A data interaction device, comprising:
the first monitoring module is used for monitoring whether a new channel identifier is written into the first register;
a sending module, configured to send an interrupt signal to an external processor if it is determined that a new channel identifier is written in the first register;
wherein the external processor reads the new channel identifier from the first register in response to the interrupt signal and performs a preset first processing operation on the channel indicated by the new channel identifier.
7. The apparatus of claim 6, further comprising:
the second monitoring module is used for monitoring whether a new channel identifier is written into the second register;
and the first processing module is used for performing a preset second processing operation on the channel identified by the new channel identifier when it is determined that the new channel identifier has been written to the second register.
8. The apparatus of claim 7, wherein the first register and the second register are both located in a neural network processor.
9. The apparatus of claim 6, further comprising:
the second processing module is used for performing preset third processing operation on the channel identified by the new channel identifier before monitoring whether the first register is written with the new channel identifier;
and a write module for writing the new channel identifier into the first register after the third processing operation is completed.
10. The device according to any one of claims 6 to 9, applied to a neural network system.
CN201811100243.7A 2018-09-20 2018-09-20 Data interaction method and device Active CN110929855B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811100243.7A CN110929855B (en) 2018-09-20 2018-09-20 Data interaction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811100243.7A CN110929855B (en) 2018-09-20 2018-09-20 Data interaction method and device

Publications (2)

Publication Number Publication Date
CN110929855A (en) 2020-03-27
CN110929855B CN110929855B (en) 2023-12-12

Family

ID=69856259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811100243.7A Active CN110929855B (en) 2018-09-20 2018-09-20 Data interaction method and device

Country Status (1)

Country Link
CN (1) CN110929855B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070041403A1 (en) * 2005-08-19 2007-02-22 Day Michael N System and method for communicating instructions and data between a processor and external devices
CN103019656A (en) * 2012-12-04 2013-04-03 中国科学院半导体研究所 Dynamically reconfigurable multi-stage parallel single instruction multiple data array processing system
US20150106596A1 (en) * 2003-03-21 2015-04-16 Pact Xpp Technologies Ag Data Processing System Having Integrated Pipelined Array Data Processor
US20170103316A1 (en) * 2015-05-21 2017-04-13 Google Inc. Computing convolutions using a neural network processor
CN107480782A (en) * 2017-08-14 2017-12-15 电子科技大学 Learn neural network processor on a kind of piece
CN107491811A (en) * 2017-09-01 2017-12-19 中国科学院计算技术研究所 Method and system and neural network processor for accelerans network processing unit
CN107836001A (en) * 2015-06-29 2018-03-23 微软技术许可有限责任公司 Convolutional neural networks on hardware accelerator
CN107861757A (en) * 2017-11-30 2018-03-30 上海寒武纪信息科技有限公司 Arithmetic unit and Related product
CN108205701A (en) * 2016-12-20 2018-06-26 联发科技股份有限公司 A kind of system and method for performing convolutional calculation
CN108416422A (en) * 2017-12-29 2018-08-17 国民技术股份有限公司 A kind of convolutional neural networks implementation method and device based on FPGA
CN108475346A (en) * 2015-11-12 2018-08-31 谷歌有限责任公司 Neural random access machine

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150106596A1 (en) * 2003-03-21 2015-04-16 Pact Xpp Technologies Ag Data Processing System Having Integrated Pipelined Array Data Processor
US20070041403A1 (en) * 2005-08-19 2007-02-22 Day Michael N System and method for communicating instructions and data between a processor and external devices
CN103019656A (en) * 2012-12-04 2013-04-03 中国科学院半导体研究所 Dynamically reconfigurable multi-stage parallel single instruction multiple data array processing system
US20170103316A1 (en) * 2015-05-21 2017-04-13 Google Inc. Computing convolutions using a neural network processor
CN107836001A (en) * 2015-06-29 2018-03-23 微软技术许可有限责任公司 Convolutional neural networks on hardware accelerator
CN108475346A (en) * 2015-11-12 2018-08-31 谷歌有限责任公司 Neural random access machine
CN108205701A (en) * 2016-12-20 2018-06-26 联发科技股份有限公司 A kind of system and method for performing convolutional calculation
CN107480782A (en) * 2017-08-14 2017-12-15 电子科技大学 Learn neural network processor on a kind of piece
CN107491811A (en) * 2017-09-01 2017-12-19 中国科学院计算技术研究所 Method and system and neural network processor for accelerans network processing unit
CN107861757A (en) * 2017-11-30 2018-03-30 上海寒武纪信息科技有限公司 Arithmetic unit and Related product
CN108416422A (en) * 2017-12-29 2018-08-17 国民技术股份有限公司 A kind of convolutional neural networks implementation method and device based on FPGA

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
VINAYAK GOKHALE ET AL: "A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks" *
周齐国 et al.: "Design of a communication scheme between ARM and a neural network processor" (ARM与神经网络处理器的通信方案设计) *
杨一晨 et al.: "A convolutional neural network coprocessor design based on programmable logic devices" (一种基于可编程逻辑器件的卷积神经网络协处理器设计) *

Also Published As

Publication number Publication date
CN110929855B (en) 2023-12-12

Similar Documents

Publication Publication Date Title
JP6829327B2 (en) Conversion methods, devices, computer devices and storage media
CN109376861B (en) Apparatus and method for performing full connectivity layer neural network training
CN109375951B (en) Device and method for executing forward operation of full-connection layer neural network
US10783437B2 (en) Hybrid aggregation for deep learning neural networks
JP7451614B2 (en) On-chip computational network
CN110929856B (en) NPU and main CPU data interaction method and device
CN111860812A (en) Apparatus and method for performing convolutional neural network training
CN113469355B (en) Multi-model training pipeline in distributed system
CN111190741B (en) Scheduling method, equipment and storage medium based on deep learning node calculation
JP2019204492A (en) Neuromorphic accelerator multitasking
CN108171328B (en) Neural network processor and convolution operation method executed by same
KR102407220B1 (en) Artificial intelligence chip and instruction execution method for artificial intelligence chip
WO2021202610A1 (en) Machine learning network implemented by statically scheduled instructions
US20140143524A1 (en) Information processing apparatus, information processing apparatus control method, and a computer-readable storage medium storing a control program for controlling an information processing apparatus
US20210073625A1 (en) Partitioning control dependency edge in computation graph
US11941528B2 (en) Neural network training in a distributed system
US20210326189A1 (en) Synchronization of processing elements that execute statically scheduled instructions in a machine learning accelerator
CN118277490B (en) Data processing system, data synchronization method, electronic device, and storage medium
US20220067495A1 (en) Intelligent processor, data processing method and storage medium
US12014202B2 (en) Method and apparatus with accelerator
WO2020169182A1 (en) Method and apparatus for allocating tasks
CN113240100A (en) Parallel computing method and system based on discrete Hopfield neural network
US11631001B2 (en) Heterogeneous computing on a system-on-chip, including machine learning inference
CN110929857B (en) Data processing method and device of neural network
CN110929855B (en) Data interaction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant