WO2021161496A1 - Information processing circuit - Google Patents

Information processing circuit

Info

Publication number
WO2021161496A1
Authority
WO
WIPO (PCT)
Prior art keywords
information processing
processing circuit
circuit
calculation result
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/005733
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
高橋 勝彦
竹中 崇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to PCT/JP2020/005733 (WO2021161496A1, ja)
Priority to US17/796,329 (US20230075457A1, en)
Priority to JP2022500169A (JP7364026B2, ja)
Publication of WO2021161496A1 (ja)
Current legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]

Definitions

  • The present invention relates to an information processing circuit that executes the inference phase of deep learning, a deep learning method, and a storage medium storing a program that executes deep learning.
  • Deep learning is an algorithm that uses a multi-layer neural network (hereinafter referred to as a network).
  • In deep learning, a learning phase, in which each network (layer) is optimized to create a model (learning model), and an inference phase, in which inference is performed based on the learning model, are executed.
  • The model is sometimes called an inference model.
  • Below, the model may also be referred to as an inference device.
  • Inference devices are realized by, for example, a GPU (Graphics Processing Unit) or a CPU (Central Processing Unit).
  • Accelerators dedicated to deep learning have also been put into practical use.
  • Patent Document 1 describes dedicated hardware designed for a deep neural network (DNN).
  • the device described in Patent Document 1 improves various limitations of a hardware solution for DNN, including high power consumption, long latency, high silicon area requirements, and the like.
  • Non-Patent Document 1 describes the Mixture of experts method.
  • In such dedicated hardware, the DNN is fixedly configured as a circuit. Therefore, even if the learning data is expanded later and a more advanced DNN could be constructed from that data, it is difficult to change the circuit configuration of the DNN.
  • An object of the present invention is to provide an information processing circuit, a deep learning method, and a storage medium storing a program that executes deep learning, each of which can change the input/output characteristics of a network without changing the circuit configuration of the hardware, even when the inference device is fixedly configured in hardware.
  • The information processing circuit according to the present invention includes a first information processing circuit that executes layer operations in deep learning, a second information processing circuit that executes layer operations in deep learning on input data by a programmable accelerator, and a fusion circuit that fuses the calculation result of the first information processing circuit with the calculation result of the second information processing circuit and outputs the fusion result.
  • The first information processing circuit includes a parameter value output circuit in which the deep learning parameters are implemented as a circuit, and a product-sum circuit that performs product-sum operations using input data and parameter values.
  • In the deep learning method according to the present invention, a first calculation result of a layer in deep learning, executed by a first information processing circuit that includes a parameter value output circuit in which the deep learning parameters are implemented as a circuit and a product-sum circuit that performs product-sum operations using input data and parameter values, is fused with a second calculation result of a layer in deep learning, executed on input data by a second information processing circuit that is a programmable accelerator, and the fusion result is output.
  • The program for executing deep learning according to the present invention causes a computer to execute a fusion process that fuses the first calculation result of a layer in deep learning, executed by the first information processing circuit described above, with the second calculation result of a layer in deep learning, executed on input data by the second information processing circuit that is a programmable accelerator, and outputs the fusion result.
  • According to the present invention, even when the inference device is fixedly configured in hardware, it is possible to obtain an information processing circuit that can change the input/output characteristics of the network without changing the circuit configuration of the hardware.
  • FIG. 1 is an explanatory diagram schematically showing the information processing circuit 50 of the first embodiment.
  • The information processing circuit 50 includes a first information processing circuit 10 that realizes a CNN, a second information processing circuit 20 that realizes a CNN, and a fusion circuit 30.
  • The first information processing circuit 10 is an inference device that has an arithmetic unit (circuit) for each layer and in which the parameters are fixed.
  • The second information processing circuit 20 is a programmable inference device.
  • The first information processing circuit 10 includes a plurality of product-sum circuits 101 and a parameter value output circuit 102.
  • The first information processing circuit 10 is a CNN inference device provided with an arithmetic unit corresponding to each layer of the CNN. It realizes a CNN inference device in which both the parameters and the network configuration (the type of deep learning algorithm, which types of layers are arranged in what order, the input and output data sizes of each layer, and so on) are fixed. That is, the first information processing circuit 10 includes product-sum circuits 101 whose circuit configuration is specialized for each layer of the CNN (for example, each convolution layer and each fully connected layer). "Specialized" means that the circuit is a dedicated circuit that exclusively executes the operations of the relevant layer.
  • "Fixed parameters" means that the learning phase has been completed by the time the first information processing circuit 10 is created, appropriate parameters have been determined, and those determined parameters are used.
  • The circuit in which the parameters are fixed is the parameter value output circuit 102.
  • The second information processing circuit 20 includes an arithmetic unit 201 and an external memory 202.
  • The second information processing circuit 20 is a programmable CNN inference device.
  • The external memory 202 of the second information processing circuit 20 holds the parameters.
  • During processing in the information processing circuit 50, these parameters may be changed to parameter values determined in the learning phase. The learning method will be described later.
  • FIG. 2 is an explanatory diagram showing an example of a first information processing circuit 10 that executes layer operations in deep learning.
  • FIG. 2 schematically shows a CNN inference device provided with an arithmetic unit corresponding to each layer.
  • FIG. 2 illustrates the five layers 1, 2, 3, 4, 5 in CNN.
  • Arithmetic units (circuits) 1011, 1012, 1013, 1014, and 1015, corresponding to layers 1, 2, 3, 4, and 5 respectively, are provided in the inference device.
  • Parameters 1021, 1022, 1023, 1024, and 1025, corresponding to layers 1, 2, 3, 4, and 5 respectively, are provided for the corresponding arithmetic units (circuits).
  • Because the circuits 1011 to 1015 each execute the operations of the corresponding layer 1 to 5, they can be fixedly configured as circuits as long as the parameters 1021 to 1025 do not change.
  • The fixed circuits 1011 to 1015 correspond to the product-sum circuit 101.
  • The parameters are also fixedly configured in the circuit.
  • The circuit that outputs the fixed parameters 1021 to 1025 corresponds to the parameter value output circuit 102.
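
As an informal software analogy for the fixed configuration described above, the following minimal sketch models one layer whose parameters are baked in as constants. The names `FIXED_WEIGHTS`, `FIXED_BIAS`, and `fixed_layer`, and the numeric values, are hypothetical and do not come from the patent; the real circuit is hardware, not Python.

```python
# Hypothetical constants standing in for the parameter value output circuit 102.
# They are immutable tuples: changing them would mean "re-manufacturing" the layer.
FIXED_WEIGHTS = ((0.5, -1.0), (2.0, 0.25))  # one row per output value
FIXED_BIAS = (0.0, 1.0)


def fixed_layer(x):
    """Product-sum (multiply-accumulate) operation with hard-wired parameters,
    standing in for one product-sum circuit 101."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(FIXED_WEIGHTS, FIXED_BIAS)
    ]


result = fixed_layer([1.0, 2.0])
print(result)
```

The point of the sketch is only that the caller supplies input data and nothing else; there is no argument through which the parameters could be replaced.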
  • FIG. 3 is an explanatory diagram showing an example of a second information processing circuit that executes layer operations in deep learning with respect to input data by a programmable accelerator.
  • FIG. 3 schematically shows a CNN inference device configured so that the operations of a plurality of CNN layers are performed by a common arithmetic unit.
  • The part of the inference device that executes the operations is composed of the arithmetic unit 201 and the memory 202 (for example, DRAM (Dynamic Random Access Memory)).
  • A large number of adders and a large number of multipliers are formed in the arithmetic unit 201 shown in FIG. 3.
  • "+" indicates an adder and "*" indicates a multiplier.
  • Although only three adders and six multipliers are illustrated in FIG. 3, enough adders and multipliers are formed to execute each operation of all layers in the CNN.
  • The inference device shown in FIG. 3 is a programmable accelerator.
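
By contrast with a fixed circuit, the programmable arrangement of FIG. 3 can be sketched as a single shared multiply-accumulate routine that reads each layer's parameters from rewritable memory. This is only an illustrative model under assumptions: `mac_unit`, `ProgrammableInferencer`, and the list-based "memory" are hypothetical names, and activation functions are omitted.

```python
def mac_unit(weights, bias, x):
    """Shared arithmetic unit: one product-sum routine reused by every layer."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


class ProgrammableInferencer:
    """Hypothetical model of the arithmetic unit 201 plus external memory 202."""

    def __init__(self, memory):
        # memory: list of (weights, bias) pairs, one per layer; it can be rewritten.
        self.memory = memory

    def run(self, x):
        # Every layer goes through the same arithmetic unit, with parameters
        # fetched from memory each time.
        for weights, bias in self.memory:
            x = mac_unit(weights, bias, x)
        return x


accelerator = ProgrammableInferencer([
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # layer 1 (identity, for illustration)
    ([[1.0, 1.0]], [0.5]),                   # layer 2
])
before = accelerator.run([2.0, 3.0])

# Re-learned parameters can be written to memory without touching the "hardware".
accelerator.memory[1] = ([[1.0, -1.0]], [0.0])
after = accelerator.run([2.0, 3.0])
print(before, after)
```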
  • The fusion circuit 30 fuses the calculation result of the first information processing circuit 10 with the calculation result of the second information processing circuit 20 and outputs the fusion result.
  • Examples of the fusion method include a simple average and a weighted sum.
  • The fusion circuit 30 fuses the calculation results by a simple average or a weighted sum.
  • The weights of the weighted sum in the present embodiment are set in advance to arbitrary values based on experiments, past fusion results, and the like.
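
The two fusion methods named above can be written in a few lines. The sketch below assumes each circuit outputs a vector of the same length; the function names and the example weights (0.75 and 0.25) are illustrative assumptions, not values from the patent.

```python
def fuse_average(out1, out2):
    """Simple average of the two calculation results, element by element."""
    return [(a + b) / 2 for a, b in zip(out1, out2)]


def fuse_weighted(out1, out2, w1, w2):
    """Weighted sum of the two calculation results with preset weights."""
    return [w1 * a + w2 * b for a, b in zip(out1, out2)]


first_result = [1.0, 3.0]   # stand-in for the output of circuit 10
second_result = [3.0, 5.0]  # stand-in for the output of circuit 20
print(fuse_average(first_result, second_result))
print(fuse_weighted(first_result, second_result, 0.75, 0.25))
```

The simple average is the special case of the weighted sum in which both weights are 0.5.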
  • The fusion circuit 30 has a parameter holding unit (not shown), such as an external memory. The fusion circuit 30 receives the output of the first information processing circuit and the output of the second information processing circuit as inputs to a deep learning layer, and outputs the calculation result based on those inputs as the fusion result.
  • During processing in the information processing circuit 50, these parameters may be changed to parameter values determined in the learning phase.
  • The fusion circuit 30 may be a programmable accelerator.
  • The deep learning parameters used by the second information processing circuit and the fusion circuit are determined in advance by learning.
  • Examples of the learning method for constructing the second information processing circuit and the fusion circuit include the following three methods.
  • The first method is to learn the parameters of the second information processing circuit independently, then construct the whole and adjust the parameters of the second information processing circuit again.
  • In this method, learning of the fusion circuit is not required, so learning is easy.
  • However, the recognition accuracy is the lowest of the three methods.
  • The second method is to learn the parameters of the second information processing circuit independently, then construct the whole and adjust the fusion circuit (and the parameters of the second information processing circuit) again.
  • In this method, the parameters of the second information processing circuit are first learned independently, so the effort of learning those parameters is incurred twice.
  • However, the time and effort needed for learning after the whole is constructed is small.
  • The third method is to learn the parameters of the second information processing circuit and the parameters of the fusion circuit at the same time. With this method, the effort of learning the parameters of the second information processing circuit is incurred only once. However, learning after the whole is constructed takes more time than in the second method.
  • The second information processing circuit 20 and the fusion circuit 30 shown in FIG. 1 can each be configured by a single piece of hardware or a single piece of software.
  • Each component can also be configured by a plurality of pieces of hardware or software. It is also possible to configure one part of each component with hardware and another part with software.
  • FIG. 4 is a block diagram showing an example of a computer having a CPU.
  • The computer has a processor such as a CPU (Central Processing Unit), a memory, and the like.
  • FIG. 4 shows a storage device 1001 and a memory 1002 connected to the CPU 1000.
  • The CPU 1000 realizes the functions of the second information processing circuit 20 and the fusion circuit 30 shown in FIG. 1 by executing processing (fusion processing) according to a program stored in the storage device 1001. That is, the computer realizes the functions of the second information processing circuit 20 and the fusion circuit 30 in the information processing circuit 50 shown in FIG. 1.
  • The storage device 1001 is, for example, a non-transitory computer-readable medium.
  • Non-transitory computer-readable media include various types of tangible storage media. Specific examples include magnetic recording media (for example, hard disks), magneto-optical recording media (for example, magneto-optical disks), CD-ROM (Compact Disc-Read Only Memory), CD-R (Compact Disc-Recordable), CD-R/W (Compact Disc-ReWritable), and semiconductor memory (for example, mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), and flash ROM).
  • The program may also be stored on various types of transitory computer-readable media.
  • The program is supplied through the transitory computer-readable medium, for example, via a wired or wireless communication channel, that is, as an electrical signal, an optical signal, or an electromagnetic wave.
  • The memory 1002 is, for example, a RAM (Random Access Memory), and is a storage means that temporarily stores data when the CPU 1000 executes processing. It is also possible to envision a form in which a program held in the storage device 1001 or on a transitory computer-readable medium is transferred to the memory 1002, and the CPU 1000 executes processing based on the program in the memory 1002.
  • FIG. 5 is a flowchart showing the operation of the information processing circuit 50 of the first embodiment.
  • The flowchart of FIG. 5 shows the inference phase in the CNN.
  • The first information processing circuit 10 executes layer operations in deep learning. Specifically, for input data such as an input image, the first information processing circuit 10 performs product-sum operations in order through the layers constituting the CNN, using the product-sum circuit 101 corresponding to each layer and the parameters output from the parameter value output circuit 102. After the calculation is completed, the first information processing circuit 10 outputs the calculation result to the fusion circuit 30 (step S601).
  • Types of deep learning algorithm, which are one concept of the network structure in the present embodiment, include, for example, AlexNet, GoogLeNet, ResNet (Residual Network), SENet (Squeeze-and-Excitation Networks), MobileNet, VGG-16, and VGG-19. The number of layers, which is another concept of the network structure, may be, for example, the number of layers appropriate to the type of deep learning algorithm. The filter size and the like can also be included in the concept of the network structure.
  • The second information processing circuit 20 executes layer operations in deep learning on input data by a programmable accelerator. Specifically, for the same input data as that input to the first information processing circuit 10, the second information processing circuit 20 performs product-sum operations on the shared arithmetic unit 201 using the parameters read from the external memory (DRAM) 202. After the calculation is completed, the second information processing circuit 20 outputs the calculation result to the fusion circuit 30 (step S602).
  • The fusion circuit 30 fuses the calculation result output by the first information processing circuit 10 with the calculation result output by the second information processing circuit 20 (step S603). In this embodiment, they are fused by a simple average or a weighted sum. The fusion circuit 30 then outputs the fusion result to the outside.
  • In FIG. 5, steps S601 and S602 are executed sequentially, but the processing of step S601 and the processing of step S602 can also be executed in parallel.
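
The flow of steps S601 to S603, including the option of running S601 and S602 in parallel, can be sketched as follows. The two circuit functions are trivial stand-ins (assumptions for illustration only), and a thread pool is used merely to show the control structure; it says nothing about how the hardware achieves parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def first_circuit(x):
    """Stand-in for the fixed first information processing circuit 10."""
    return [2.0 * v for v in x]


def second_circuit(x):
    """Stand-in for the programmable second information processing circuit 20."""
    return [v + 1.0 for v in x]


def fuse(out1, out2):
    """Stand-in for the fusion circuit 30 (simple average)."""
    return [(a + b) / 2 for a, b in zip(out1, out2)]


def infer(x):
    # Both circuits receive the same input data and run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future1 = pool.submit(first_circuit, x)   # step S601
        future2 = pool.submit(second_circuit, x)  # step S602
        # Step S603: fusion waits for both calculation results.
        return fuse(future1.result(), future2.result())


print(infer([1.0, 3.0]))
```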
  • As described above, the information processing circuit 50 of the present embodiment is composed of a first information processing circuit 10, which includes a parameter value output circuit 102 in which the deep learning parameters are implemented as a circuit and a product-sum circuit 101 that performs product-sum operations using input data and parameter values, and which executes layer operations in deep learning, and a second information processing circuit 20, which executes layer operations in deep learning on input data by a programmable accelerator.
  • As a result, even when the inference device (first information processing circuit 10) is fixedly configured in hardware, the input/output characteristics of the network can be changed without changing the circuit configuration of the hardware.
  • The information processing circuit 50 of the present embodiment has a higher processing speed than an information processing circuit composed only of the programmable accelerator shown in FIG. 3, which reads parameter values from memory. The information processing circuit 50 of the present embodiment also has a smaller circuit scale than an information processing circuit composed only of programmable accelerators. As a result, power consumption is reduced.
  • In the present embodiment, the information processing circuit has been described using a plurality of CNN inference devices as an example, but inference devices for other neural networks may be used.
  • In the present embodiment, image data is used as the input data, but the present embodiment can also be applied to networks whose input data is something other than image data.
  • FIG. 6 is an explanatory diagram schematically showing the information processing circuit 60 of the second embodiment.
  • the information processing circuit 60 of the present embodiment includes the information processing circuit 50 of the first embodiment.
  • the information processing circuit 60 includes a first information processing circuit 10 that realizes a CNN, a second information processing circuit 20 that realizes a CNN, a fusion circuit 30, and a learning circuit 40. Since the configuration of the circuit other than the learning circuit 40 is the same as that of the information processing circuit 50 of the first embodiment, the description thereof will be omitted.
  • the learning circuit 40 shown in FIG. 6 can be configured by one hardware or one software, similarly to the second information processing circuit 20 and the fusion circuit 30.
  • each component can be configured by a plurality of hardware or a plurality of software. It is also possible to configure a part of each component with hardware and another part with software.
  • The learning circuit 40 receives as inputs the fusion result output by the fusion circuit 30 for the input data and the correct label for that input data.
  • The learning circuit 40 calculates the loss based on the difference between the calculation result output by the fusion circuit 30 and the correct label, and corrects at least one of the parameters of the second information processing circuit 20 and the parameters of the fusion circuit 30.
  • The learning method for the second information processing circuit 20 and the fusion circuit 30 is arbitrary; for example, the Mixture of experts method can be used.
  • The loss is calculated by a loss function.
  • The value of the loss function is calculated from the difference (L2 norm, cross entropy, etc.) between the output of the fusion circuit 30 (a numerical vector) and the correct label (a numerical vector).
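
The two loss measures mentioned above can be illustrated as follows. This is a sketch under assumptions: the squared L2 distance is used as the "L2 norm" variant, the cross entropy assumes the output is a probability vector and the label is one-hot, and the small `eps` term is added only to avoid taking the logarithm of zero.

```python
import math


def l2_loss(output, label):
    """Squared L2 distance between the fusion output and the correct label."""
    return sum((o - t) ** 2 for o, t in zip(output, label))


def cross_entropy(output, label, eps=1e-12):
    """Cross entropy, assuming `output` holds probabilities and `label` is one-hot."""
    return -sum(t * math.log(o + eps) for o, t in zip(output, label))


fusion_output = [0.5, 0.5]  # hypothetical fusion result
correct_label = [1.0, 0.0]  # hypothetical one-hot label
print(l2_loss(fusion_output, correct_label))
print(cross_entropy(fusion_output, correct_label))
```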
  • FIG. 7 is a flowchart showing the operation of the information processing circuit 60 of the second embodiment. It can be said that the flowchart of FIG. 7 shows the learning phase in CNN.
  • Since the processing of steps S701 to S703 is the same as that of steps S601 to S603 in the flowchart of the information processing circuit 50 of the first embodiment shown in FIG. 5, its description is omitted.
  • The learning circuit 40 receives as inputs the fusion result output by the fusion circuit 30 for the input data and the correct label for that input data.
  • The learning circuit 40 calculates the loss based on the difference between the calculation result output by the fusion circuit 30 and the correct label (step S704).
  • The learning circuit 40 corrects at least one of the parameters of the second information processing circuit 20 and the parameters of the fusion circuit 30 so that the value of the loss function becomes smaller (steps S705 and S706).
  • When there is unprocessed data (Yes in step S707), the information processing circuit 50 repeats steps S701 to S706 until no unprocessed data remains. When there is no unprocessed data (No in step S707), the information processing circuit 50 ends the process.
  • In FIG. 7, steps S705 and S706 are executed sequentially, but the processing of step S705 and the processing of step S706 can also be executed in parallel.
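
To make the learning loop concrete, the sketch below corrects a single fusion parameter by gradient descent on a squared L2 loss. Everything specific here is an assumption for illustration: the fusion is modeled as `w * out1 + (1 - w) * out2` with one scalar weight `w`, only the fusion parameter is corrected (the parameters of the second circuit are left untouched), and the learning rate and epoch count are arbitrary.

```python
def train_fusion_weight(samples, w=0.5, lr=0.1, epochs=50):
    """Learn a scalar fusion weight w.

    samples: list of (out1, out2, label) triples, each a list of floats, where
    out1 and out2 are the precomputed calculation results of the two circuits.
    """
    for _ in range(epochs):
        for out1, out2, label in samples:  # repeat until no data remains
            # Fusion of the two calculation results.
            y = [w * a + (1.0 - w) * b for a, b in zip(out1, out2)]
            # Gradient of sum((y - label)^2) with respect to w.
            grad = sum(
                2.0 * (yi - ti) * (a - b)
                for yi, ti, a, b in zip(y, label, out1, out2)
            )
            # Correct the fusion parameter so that the loss becomes smaller.
            w -= lr * grad
    return w


# One hypothetical sample whose label lies 80% of the way toward circuit 1's output.
learned = train_fusion_weight([([1.0], [0.0], [0.8])])
print(learned)
```

In this toy case the loss is minimized at `w = 0.8`, and the learned value approaches it as the loop runs.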
  • the information processing circuit 60 of the present embodiment includes a learning circuit 40 that receives the calculation result of the fusion circuit 30 for the input data and the correct answer label for the input data as inputs, and the learning circuit 40 performs the calculation. Based on the difference between the result and the correct answer label, at least one of the parameters of the second information processing circuit 20 and the parameters of the fusion circuit 30 is corrected. As a result, the information processing circuit 60 of the present embodiment can improve the recognition accuracy.
  • FIG. 8 is an explanatory diagram schematically showing the information processing circuit 51 of the third embodiment.
  • the information processing circuit 51 includes a first information processing circuit 11 that realizes a CNN, a second information processing circuit 21 that realizes a CNN, and a fusion circuit 31. Since the first information processing circuit 11 and the second information processing circuit 21 are the same as the first information processing circuit 10 and the second information processing circuit 20 of the first embodiment, the description thereof will be omitted.
  • In the present embodiment, the input data is also input to the fusion circuit 31.
  • The other inputs and outputs are the same as in the information processing circuit 50 of the first embodiment.
  • The fusion circuit 31 receives the same input data as that received by the first information processing circuit 11 and the second information processing circuit 21. The fusion circuit 31 then weights the calculation result of the first information processing circuit 11 and the calculation result of the second information processing circuit 21 based on a weighting parameter determined according to the input data.
  • The weighting parameter is determined by learning performed in advance, based on, for example, the identification characteristics of the first information processing circuit 11 and the second information processing circuit 21 with respect to the input data. In other words, the weighting parameter is determined based on the strengths and weaknesses of the first information processing circuit 11 and the second information processing circuit 21. That is, the higher a circuit's identification accuracy for the input data, the larger its weighting parameter.
  • For example, for input data on which the first information processing circuit 11 has the higher identification accuracy, the fusion circuit 31 assigns a larger weight to the first information processing circuit 11 than to the second information processing circuit 21.
  • The fusion circuit 31 receives the calculation result of the first information processing circuit 11 and the calculation result of the second information processing circuit 21 as inputs, fuses them by calculating the weighted sum of the received inputs, and outputs the fusion result.
  • As described above, in the present embodiment, the fusion circuit 31 receives the input data and weights the calculation result of the first information processing circuit 11 and the calculation result of the second information processing circuit 21 based on the weighting parameter determined according to the input data.
  • As a result, the information processing circuit 51 of the present embodiment predicts the strengths and weaknesses of the first information processing circuit 11 and the second information processing circuit 21 with respect to the input data and weights their calculation results accordingly, so the recognition accuracy can be improved.
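
Input-dependent weighting of this kind resembles the gating used in mixture-of-experts models. The sketch below is one possible reading, not the patent's stated implementation: a hypothetical linear gate scores each circuit from the input data, a softmax turns the scores into weights that sum to 1, and the weights are applied in a weighted sum. `gate`, `fuse_gated`, and `gate_weights` are illustrative names.

```python
import math


def gate(x, gate_weights):
    """Hypothetical gate: one linear score per circuit, normalized by softmax."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_gated(x, out1, out2, gate_weights):
    """Weighted sum of the two results, with weights chosen from the input x."""
    g1, g2 = gate(x, gate_weights)
    return [g1 * a + g2 * b for a, b in zip(out1, out2)]


# With a gate that strongly favors circuit 1 for this input, the fused result
# stays close to circuit 1's output.
x = [1.0]
print(gate(x, [[10.0], [-10.0]]))
print(fuse_gated(x, [2.0], [4.0], [[10.0], [-10.0]]))
```

In a learned system the rows of `gate_weights` would themselves be parameters of the fusion circuit, determined in advance by learning.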
  • FIG. 9 is an explanatory diagram schematically showing the information processing circuit 61 of the fourth embodiment.
  • the information processing circuit 61 of the present embodiment includes the information processing circuit 51 of the third embodiment.
  • the information processing circuit 61 includes a first information processing circuit 11 that realizes a CNN, a second information processing circuit 21 that realizes a CNN, a fusion circuit 31, and a learning circuit 41. Since the configuration of the circuit other than the learning circuit 41 is the same as that of the information processing circuit 51 of the third embodiment, the description thereof will be omitted.
  • The learning circuit 41 has the same inputs and outputs as the learning circuit 40 of the information processing circuit 60 of the second embodiment. That is, the learning circuit 41 receives as inputs the fusion result output by the fusion circuit 31 for the input data and the correct label for that input data. The learning circuit 41 calculates the loss based on the difference between the calculation result output by the fusion circuit 31 and the correct label, and corrects at least one of the parameters of the second information processing circuit 21 and the parameters of the fusion circuit 31.
  • the information processing circuit 61 of the present embodiment includes a learning circuit 41 that receives the calculation result of the fusion circuit 31 for the input data and the correct answer label for the input data as inputs, and the learning circuit 41 performs the calculation. Based on the difference between the result and the correct answer label, at least one of the parameter of the second information processing circuit 21 and the parameter of the fusion circuit 31 is corrected. As a result, the information processing circuit 61 of the present embodiment can improve the recognition accuracy.
  • FIG. 10 is an explanatory diagram schematically showing the information processing circuit 52 of the fifth embodiment.
  • the information processing circuit 52 includes a first information processing circuit 12 that realizes a CNN, a second information processing circuit 22 that realizes a CNN, and a fusion circuit 32.
  • the first information processing circuit 12 of the present embodiment outputs the calculation result of the intermediate layer in deep learning. Specifically, the first information processing circuit 12 outputs the output from the intermediate layer that extracts the feature amount in the deep learning as the calculation result.
  • The intermediate layers for feature extraction form, for example, a block of networks called a backbone or a feature pyramid network.
  • The final result of such a block is output from the intermediate layer of the first information processing circuit 12.
  • As the backbone, CNNs such as ResNet-50, ResNet-101, and VGG-16 are used.
  • In RetinaNet, for example, a (ResNet +) feature pyramid network exists as the feature extraction block.
  • the output from the intermediate layer is input to the second information processing circuit 22 and the fusion circuit 32.
  • the output from the intermediate layer may be an output from a layer other than the layer for extracting the feature amount.
  • the second information processing circuit 22 uses the calculation result of the intermediate layer as input data to execute the layer calculation in deep learning. Specifically, the second information processing circuit 22 receives an input from an intermediate layer that extracts features of the first information processing circuit 12. The feature amount extraction performed by the second information processing circuit 22 uses the output from the layer that performs the feature amount extraction of the first information processing circuit 12. Therefore, the circuit scale of the second information processing circuit 22 of the present embodiment is smaller than the circuit scale of the second information processing circuit 21 of the fourth embodiment.
  • the fusion circuit 32 receives an input of a feature amount extracted from the intermediate layer of the first information processing circuit 12.
  • the fusion circuit 32 weights the calculation result of the first information processing circuit 12 and the calculation result of the second information processing circuit 22 based on the weighting parameter determined according to the feature amount.
  • As with the fusion circuit 31 of the third embodiment, the weighting parameter of the present embodiment may be determined by learning performed in advance, based on the discrimination characteristics of the first information processing circuit 12 and the second information processing circuit 22 with respect to the feature amount.
  • the fusion circuit 32 assigns a larger weight to the first information processing circuit 12 than to the second information processing circuit 22.
  • The fusion circuit 32 receives the calculation result of the first information processing circuit 12 and the calculation result of the second information processing circuit 22 as inputs, fuses them by calculating the weighted sum of the received inputs, and outputs the fusion result.
  • As described above, the first information processing circuit 12 outputs the calculation result of the intermediate layer in deep learning, and the second information processing circuit 22 executes the layer calculation in deep learning using the calculation result of the intermediate layer as input data.
  • the fusion circuit 32 fuses the calculation result of the intermediate layer, the calculation result of the first information processing circuit 12, and the calculation result of the second information processing circuit 22, and outputs the fusion result.
  • The information processing circuit 52 of the present embodiment can predict the strengths and weaknesses of the first information processing circuit 12 and the second information processing circuit 22 based on the feature amount extracted by the intermediate layer in the first information processing circuit 12, and can weight their calculation results accordingly.
  • The information processing circuit 52 of the present embodiment can therefore have higher recognition accuracy than the information processing circuit 50 of the first embodiment. Further, because the second information processing circuit 22 shares feature amount extraction with the first information processing circuit 12, the circuit scale of the information processing circuit 52 can be made smaller than that of the information processing circuit 51 of the third embodiment.
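  • The data flow of the fifth embodiment can be sketched in Python as follows. This is only an illustrative sketch, not the circuit described in the specification: the function names, the toy parameter values, and the use of a softmax to normalize the two weights are all assumptions introduced here. The sketch shows the shared feature extraction feeding both the second circuit and the fusion circuit, and the fusion circuit computing a weighted sum whose weights depend on the feature amount.

```python
import math


def dot(weights, values):
    """Product-sum of two equally long sequences."""
    return sum(w * v for w, v in zip(weights, values))


def softmax(scores):
    """Normalize scores into positive weights that sum to 1 (an assumed choice)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def backbone(x):
    """Intermediate layer of the first circuit: toy feature extraction with fixed values."""
    fixed = [[0.5, -0.2], [0.1, 0.4]]  # hypothetical fixed parameters
    return [max(0.0, dot(row, x)) for row in fixed]


def first_head(features):
    """Remaining layers of the first circuit (parameters fixed, hypothetical values)."""
    fixed = [[1.0, 0.0], [0.0, 1.0]]
    return [dot(row, features) for row in fixed]


def second_head(features, params):
    """Second circuit: reuses the shared features, parameters are changeable."""
    return [dot(row, features) for row in params]


def fusion(features, y1, y2, gate_params):
    """Fusion circuit: weights are derived from the feature amount, then a weighted sum."""
    w1, w2 = softmax([dot(row, features) for row in gate_params])
    fused = [w1 * a + w2 * b for a, b in zip(y1, y2)]
    return fused, (w1, w2)


def information_processing_circuit_52(x, second_params, gate_params):
    features = backbone(x)  # computed once, shared by both consumers
    y1 = first_head(features)
    y2 = second_head(features, second_params)
    return fusion(features, y1, y2, gate_params)
```

  • Because `backbone` runs only once, the second circuit in this sketch needs no feature extraction of its own, which mirrors the circuit-scale argument made above.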
  • FIG. 11 is an explanatory diagram schematically showing the information processing circuit 62 of the sixth embodiment.
  • the information processing circuit 62 of the present embodiment includes the information processing circuit 52 of the fifth embodiment.
  • the information processing circuit 62 includes a first information processing circuit 12 that realizes a CNN, a second information processing circuit 22 that realizes a CNN, a fusion circuit 32, and a learning circuit 42. Since the configuration of the circuit other than the learning circuit 42 is the same as that of the information processing circuit 52 of the fifth embodiment, the description thereof will be omitted.
  • The learning circuit 42 has the same inputs and outputs as the learning circuit 40 of the information processing circuit 60 of the second embodiment and the learning circuit 41 of the information processing circuit 61 of the fourth embodiment. That is, the learning circuit 42 accepts as input the calculation result output by the fusion circuit 32 for the input data and the correct answer label for the input data. The learning circuit 42 calculates the loss based on the difference between the calculation result output by the fusion circuit 32 and the correct answer label, and corrects at least one of the parameters of the second information processing circuit 22 and the parameters of the fusion circuit 32.
  • As described above, the information processing circuit 62 of the present embodiment includes a learning circuit 42 that receives the calculation result of the fusion circuit 32 for the input data and the correct answer label for the input data as input. Based on the difference between the calculation result and the correct answer label, the learning circuit 42 corrects at least one of the parameters of the second information processing circuit 22 and the parameters of the fusion circuit 32. As a result, the information processing circuit 62 of the present embodiment can improve the recognition accuracy.
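  • The correction performed by the learning circuit 42 can be illustrated with a minimal scalar sketch. The specification only states that a loss is calculated from the difference between the fused result and the correct answer label; the squared-error loss, the plain gradient-descent update, the single-weight fusion, and every name below are assumptions made for illustration. The output of the first circuit is treated as fixed, and only the parameter of the second circuit and the fusion weight are corrected.

```python
def fuse(y1, y2, w):
    """Weighted sum of the two calculation results with a single fusion weight w."""
    return w * y1 + (1.0 - w) * y2


def learning_step(x, label, y1, second_param, w, lr=0.1):
    """One correction step; returns the updated (second_param, w) and the loss.

    y1 is the output of the first circuit and is never modified here.
    The second circuit is modeled as y2 = second_param * x (hypothetical).
    """
    y2 = second_param * x
    error = fuse(y1, y2, w) - label
    loss = error ** 2  # squared error, an assumed loss

    # Gradients of the loss with respect to the two correctable parameters.
    grad_second = 2.0 * error * (1.0 - w) * x
    grad_w = 2.0 * error * (y1 - y2)

    return second_param - lr * grad_second, w - lr * grad_w, loss
```

  • Calling `learning_step` repeatedly on the same sample moves the fused result toward the label while leaving the first circuit untouched.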
  • FIG. 12 is a block diagram showing a main part of the information processing circuit.
  • The information processing circuit 80 includes a first information processing circuit 81 (in the embodiments, realized by the first information processing circuit 10) that executes layer operations in deep learning, a second information processing circuit 82 (in the embodiments, realized by the second information processing circuit 20) that executes layer operations in deep learning on input data using a programmable accelerator, and a fusion circuit 83 (in the embodiments, realized by the fusion circuit 30) that fuses the calculation result of the first information processing circuit 81 with the calculation result of the second information processing circuit 82 and outputs the fusion result.
  • The first information processing circuit 81 includes a parameter value output circuit 811 (in the embodiments, realized by the parameter value output circuit 102) in which the deep learning parameters are implemented as circuitry, and a product-sum circuit 812 (in the embodiments, realized by the product-sum circuit 101) that performs a product-sum operation using the input data and the parameter values.
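  • As a rough software analogy of the main part shown in FIG. 12, the following Python sketch contrasts a first circuit whose parameter values are fixed at design time with a second circuit whose parameters can be reloaded. It is an assumed illustration only: the numeric values, the class and function names, and the equal default fusion weights are hypothetical and do not come from the specification.

```python
from typing import Sequence

# Stand-in for the parameter value output circuit 811: the values are baked in
# and cannot be changed after "manufacture" (hypothetical numbers).
_FIXED_PARAMETERS = (0.25, -0.5, 1.0)


def parameter_value_output(index: int) -> float:
    """Return the fixed parameter value for the given input position."""
    return _FIXED_PARAMETERS[index]


def product_sum(inputs: Sequence[float]) -> float:
    """Stand-in for the product-sum circuit 812: multiply-accumulate with fixed values."""
    return sum(x * parameter_value_output(i) for i, x in enumerate(inputs))


class ProgrammableAccelerator:
    """Stand-in for the second circuit 82: same operation, reloadable parameters."""

    def __init__(self, parameters: Sequence[float]):
        self.parameters = list(parameters)

    def load(self, parameters: Sequence[float]) -> None:
        self.parameters = list(parameters)

    def run(self, inputs: Sequence[float]) -> float:
        return sum(x * p for x, p in zip(inputs, self.parameters))


def fuse(y1: float, y2: float, w1: float = 0.5, w2: float = 0.5) -> float:
    """Stand-in for the fusion circuit 83: weighted sum of both results."""
    return w1 * y1 + w2 * y2
```

  • In this analogy the fixed tuple plays the role of parameters realized as circuitry, while `load` reflects the flexibility that the programmable accelerator contributes.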
  • An information processing circuit including:
  • a first information processing circuit that executes layer operations in deep learning,
  • a second information processing circuit that executes layer operations in deep learning on input data using a programmable accelerator, and
  • a fusion circuit that fuses the calculation result of the first information processing circuit and the calculation result of the second information processing circuit and outputs the fusion result,
  • wherein the first information processing circuit includes a parameter value output circuit in which the parameters of deep learning are implemented as circuitry, and a product-sum circuit that performs a product-sum operation using the input data and the parameter values.
  • The information processing circuit of Appendix 1, wherein the fusion circuit accepts the calculation result of the first information processing circuit and the calculation result of the second information processing circuit as inputs, fuses them by calculating the weighted sum of the received inputs, and outputs the fusion result.
  • the fusion circuit accepts the calculation result of the first information processing circuit and the calculation result of the second information processing circuit as input to the layer in deep learning, and outputs the calculation result based on the received input as the fusion result.
  • The information processing circuit according to any one of Appendix 1 to Appendix 3, wherein the fusion circuit executes layer operations in deep learning using a programmable accelerator.
  • The information processing circuit according to any one of Appendix 1 to Appendix 4, wherein the fusion circuit receives the same input data as the input data received by the first information processing circuit and the second information processing circuit, and weights the calculation result of the first information processing circuit and the calculation result of the second information processing circuit based on a weighting parameter determined according to the input data.
  • the first information processing circuit outputs the calculation result of the intermediate layer in deep learning.
  • the second information processing circuit uses the calculation result of the intermediate layer as input data to execute the layer calculation in deep learning.
  • the fusion circuit fuses the calculation result of the intermediate layer, the calculation result of the first information processing circuit, and the calculation result of the second information processing circuit, and outputs the fusion result.
  • The information processing circuit of Appendix 6, wherein the first information processing circuit outputs, as the calculation result, the output from the intermediate layer that extracts the feature amount.
  • a learning circuit for learning the layer parameters in deep learning by inputting the calculation result of the fusion circuit for the input data and the correct answer label for the input data is provided.
  • the learning circuit corrects at least one of the parameters of the second information processing circuit and the parameters of the fusion circuit based on the difference between the calculation result and the correct answer label.
  • Information processing circuit
  • Appendix 10 The deep learning method of Appendix 9, in which the calculation result of the first information processing circuit and the calculation result of the second information processing circuit are accepted as inputs, the received inputs are fused by calculating their weighted sum, and the fusion result is output.
  • Appendix 11 A computer-readable recording medium in which a program for executing deep learning is stored.
  • The program that executes the deep learning causes a processor to execute a fusion process that fuses a first calculation result of a layer in deep learning, executed by a first information processing circuit including a parameter value output circuit in which the parameters of deep learning are implemented as circuitry and a product-sum circuit that performs a product-sum operation using input data and the parameter values, with a second calculation result of a layer in deep learning using input data, executed by a second information processing circuit that is a programmable accelerator, and outputs the fusion result.

PCT/JP2020/005733 2020-02-14 2020-02-14 情報処理回路 Ceased WO2021161496A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/JP2020/005733 WO2021161496A1 (ja) 2020-02-14 2020-02-14 情報処理回路
US17/796,329 US20230075457A1 (en) 2020-02-14 2020-02-14 Information processing circuit
JP2022500169A JP7364026B2 (ja) 2020-02-14 2020-02-14 情報処理回路

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/005733 WO2021161496A1 (ja) 2020-02-14 2020-02-14 情報処理回路

Publications (1)

Publication Number Publication Date
WO2021161496A1 (ja) 2021-08-19

Family

ID=77292818

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/005733 Ceased WO2021161496A1 (ja) 2020-02-14 2020-02-14 情報処理回路

Country Status (3)

Country Link
US (1) US20230075457A1
JP (1) JP7364026B2
WO (1) WO2021161496A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7456501B2 (ja) * 2020-05-26 2024-03-27 日本電気株式会社 情報処理回路および情報処理回路の設計方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6658033B2 (ja) * 2016-02-05 2020-03-04 富士通株式会社 演算処理回路、および情報処理装置
US11741693B2 (en) 2017-11-15 2023-08-29 Palo Alto Research Center Incorporated System and method for semi-supervised conditional generative modeling using adversarial networks
US11681923B2 (en) * 2019-04-19 2023-06-20 Samsung Electronics Co., Ltd. Multi-model structures for classification and intent determination
US11568238B2 (en) * 2019-06-28 2023-01-31 Amazon Technologies, Inc. Dynamic processing element array expansion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ALHAMALI ABDULRAHMAN; SALHA NIBAL; MORCEL RAGHID; EZZEDDINE MAZEN; HAMDAN OMAR; AKKARY HAITHAM; HAJJ HAZEM: "FPGA-Accelerated Hadoop Cluster for Deep Learning Computations", 2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), IEEE, 14 November 2015 (2015-11-14), pages 565 - 574, XP032859264, DOI: 10.1109/ICDMW.2015.148 *

Also Published As

Publication number Publication date
JP7364026B2 (ja) 2023-10-18
JPWO2021161496A1 2021-08-19
US20230075457A1 (en) 2023-03-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20919048

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022500169

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20919048

Country of ref document: EP

Kind code of ref document: A1