CN109359731B - Neural network processing method and device based on chip design defects

Info

Publication number
CN109359731B
Authority
CN
China
Prior art keywords
neural network
chip
design
type
input
Prior art date
Legal status
Active
Application number
CN201811127453.5A
Other languages
Chinese (zh)
Other versions
CN109359731A
Inventor
欧耿洲 (Ou Gengzhou)
Current Assignee
Zhongke Wuqi (Nanjing) Technology Co.,Ltd.
Original Assignee
Jeejio (Beijing) Technology Co., Ltd.
Priority date: 2018-09-27
Filing date: 2018-09-27
Publication date: 2022-01-28
Application filed by Jeejio (Beijing) Technology Co., Ltd.
Priority to CN201811127453.5A
Publication of CN109359731A
Application granted
Publication of CN109359731B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)

Abstract

An embodiment of the invention relates to a neural network processing method and device based on chip design defects. The method comprises: obtaining the type of the design defect of the chip; and adjusting the neural network according to the type so that the neural network can operate normally on the chip. The type of the design defect of the chip comprises at least one of the following: the input/output data cache is in an abnormal working state, the arithmetic unit is designed incorrectly, or the accelerator cannot accommodate the scale of the neural network. By modifying or adjusting the deep neural network model, the neural network re-adapts to the faulty hardware structure and the computation target is still met.

Description

Neural network processing method and device based on chip design defects
Technical Field
The embodiment of the invention relates to the technical field of neural networks, in particular to a neural network processing method and device based on chip design defects.
Background
With the rapid development of deep neural networks (DNNs) in the field of artificial intelligence, more and more applications demand larger amounts of computation and more application-specific computation patterns. The execution of neural networks has therefore gradually extended from general-purpose platforms (CPUs, GPUs) to special-purpose platforms (FPGAs, DSPs, dedicated processors, and accelerators), which has driven the development, design, and manufacture of a large number of dedicated circuits and dedicated processors for neural networks and has become an emerging field of DNN development. A neural network processor typically comprises multiple groups of arithmetic units, organized as a systolic array or as a multi-stage pipeline operating in parallel to form flexible data paths; such new dedicated architectures can improve computational efficiency by a factor of 50 to 1000. However, because of inexperienced design, short development cycles, long development chains, the rapid evolution of neural networks, and other reasons, integrated circuits supporting neural networks inevitably suffer from bugs, errors, and unforeseen requirements during development and design, so that a chip may fail to achieve the expected effect after tape-out, or its neural network core may fail outright, causing great losses of time and money; the result can be catastrophic.
Neural network/deep learning algorithms, as flexible computation models with abundant static or dynamic connections, are inherently capable of compensating for certain processor design defects. In conventional processor design, by contrast, once a core module is damaged the whole core can fail.
Existing schemes therefore lack a method for adapting a neural network in response to chip design defects.
Disclosure of Invention
The embodiment of the invention provides a neural network processing method and device based on chip design defects, which, by modifying or adjusting the deep neural network model, enable the neural network to re-adapt to a faulty hardware structure and still meet its computation target.
In a first aspect, an embodiment of the present invention provides a neural network processing method based on chip design defects, including:
obtaining the type of the design defect of the chip;
adjusting the neural network according to the type so that the neural network can operate normally on the chip;
wherein the type of the design defect of the chip comprises at least one of the following:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, or the arithmetic unit design is faulty.
In one possible embodiment, adjusting the neural network according to the type includes:
when the type of the design defect of the chip is that the input/output data cache is in an abnormal working state, adjusting the input/output data rate of the input/output data cache.
In one possible embodiment, adjusting the input/output data rate of the input/output data buffer includes:
modifying the number of layers of the neural network model, the scale of each layer, and the number of weights read per input datum, thereby adjusting the frequency at which the arithmetic unit reads data from the input data cache and writes data to the output cache.
In one possible embodiment, adjusting the data addresses of the input/output data buffer includes:
modifying the scale of the input and output layers of the neural network model and adjusting the address range used to read the input and output caches, thereby bypassing the failed cache sites.
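As an illustration of this address-bypass idea, the following minimal Python sketch assumes a hypothetical 1024-entry input buffer whose upper half has failed; the buffer size, fault range, and resulting layer width are invented for the example and are not taken from the embodiment itself.

```python
# Minimal sketch: shrink the model's input layer so that every
# input-buffer address it touches lies in the known-good region.
# Buffer size and fault range are illustrative assumptions.

BUFFER_SIZE = 1024
FAULTY = range(512, 1024)   # hypothetical failed cache sites

def max_usable_input_width() -> int:
    """Largest contiguous address range starting at 0 that avoids faults."""
    width = 0
    while width < BUFFER_SIZE and width not in FAULTY:
        width += 1
    return width

# Resize the model's input layer to fit the healthy region, so reads of
# the input cache bypass the failed addresses entirely.
input_layer_width = max_usable_input_width()    # -> 512
assert all(addr not in FAULTY for addr in range(input_layer_width))
```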
In one possible embodiment, adjusting the neural network according to the type includes:
when the design defect of the chip is of a type that an accelerator cannot adapt to the scale of a neural network, the neural network is split into a plurality of sub-networks so that the accelerator can adapt to each sub-network.
In one possible embodiment, adjusting the neural network according to the type includes:
and when the type of the design defect of the chip is a faulty arithmetic unit design, adding the operation mode of the faulty arithmetic unit into the training process of the neural network and keeping the operations during training identical to those of the faulty unit, so that the neural network adapts to the accelerator's incorrect operation mode.
In a second aspect, an embodiment of the present invention provides a neural network processing apparatus based on a chip design defect, including:
the acquisition module is used for acquiring the type of the design defect of the chip;
the adjusting module is used for adjusting the neural network according to the type so that the neural network can operate normally on the chip;
wherein the type of the design defect of the chip comprises at least one of the following:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, or the arithmetic unit design is faulty.
In a possible embodiment, the adjusting module is specifically configured to adjust a rate of input/output data of the input/output data buffer when the type of the design defect of the chip is that the input/output data buffer is in an abnormal operating state.
In a possible embodiment, the adjusting module is specifically configured to modify the number of layers of the neural network model, the scale of each layer, and the number of weights read per input datum, and to adjust the frequency at which the arithmetic unit reads data from the input data buffer or writes data to the output buffer.
In a possible embodiment, the adjusting module is specifically configured to modify the scale of the input/output layers of the neural network model and to adjust the address range the arithmetic unit uses to read data from the input data buffer or write data to the output buffer.
In a possible embodiment, the adjusting module is specifically configured to split the neural network into a plurality of sub-networks so that the accelerator can adapt to each sub-network when the type of the design defect of the chip is that the accelerator cannot adapt to the scale of the neural network.
In a possible embodiment, the adjusting module is specifically configured to, when the type of the design defect of the chip is a faulty arithmetic unit design, add the corresponding operation mode into the training process of the neural network so that the operations during training match those of the accelerator's faulty arithmetic unit, allowing the neural network to adapt to the accelerator's incorrect operation mode.
According to the neural network processing scheme based on chip design defects, the type of the chip's design defect is obtained, and the neural network is adjusted according to that type so that it operates normally on the chip; by modifying or adjusting the deep neural network model, the neural network re-adapts to the faulty hardware structure and the computation target is still met.
Drawings
Fig. 1 is a schematic flow chart of a neural network processing method based on chip design defects according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a neural network processing apparatus based on chip design defects according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the hardware structure of a neural network processing device based on chip design defects according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To facilitate understanding of the embodiments of the present invention, further explanation is provided below with reference to specific embodiments, which are not to be construed as limiting the embodiments of the present invention.
Fig. 1 is a schematic flow chart of a neural network processing method based on chip design defects according to an embodiment of the present invention. As shown in Fig. 1, the method specifically includes:
s101, obtaining the type of the design defect of the chip.
The type of the design defect of the chip comprises at least one of the following:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, or the arithmetic unit design is faulty.
S102, adjusting the neural network according to the type so that the neural network can operate normally on the chip.
Specifically, when the type of the design defect of the chip is that the input/output data cache is in an abnormal working state, the input/output data rate of the input/output data cache is adjusted. For example, suppose the control path of a neural network accelerator fails because the input data buffer (input-layer data buffer, IDB) controller cannot work properly: whenever the data cache becomes empty, the state machine of the cache controller enters an error state, the data path becomes disordered, and part of the input data that is read back is garbled. The number of weights read per input datum can then be modified to lower the frequency at which the arithmetic unit fetches data from the input data cache.
The accelerator's operation mode is as follows: an input datum x(i) is fetched, x(i) is multiplied one by one with all of its weights w in the layer to produce partial outputs y of the layer, and finally the partial outputs are accumulated.
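Written out as a short sketch (NumPy is used here only for illustration; the loop structure is the point), the dataflow looks like this:

```python
import numpy as np

def accelerator_layer(x: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Sketch of the dataflow described above: fetch one input x[i],
    multiply it one by one with all of its weights to form partial
    outputs, and accumulate the partial outputs into y."""
    n_in, n_out = W.shape
    y = np.zeros(n_out)
    for i in range(n_in):        # one fetch from the input data cache
        for j in range(n_out):   # n_out weight reads per input fetch
            y[j] += x[i] * W[i, j]
    return y
```

The number of weight reads between consecutive input-cache fetches therefore equals the width of the next layer, and that is exactly the knob the following example turns.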
Suppose a three-layer DNN with a 300-neuron input layer, a 30-neuron hidden layer, and a 10-neuron output layer does not work properly on the accelerator, because the arithmetic unit returns to the input data buffer after every 30 weight reads. The model is restructured into a 300x100x30 three-layer DNN and retrained. Because more neurons must now be computed in the hidden layer, 100 weights are read for each input datum before the next input datum is fetched, which effectively lowers the frequency of fetches from the input data cache and avoids the error state; the accelerator design error is thus worked around while modifying only the layers of the neural network model.
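A minimal sketch of this workaround's arithmetic, assuming for illustration that the buggy controller misbehaves whenever fewer than 100 weight reads separate consecutive input fetches (the threshold is an invented stand-in, not a figure from the embodiment):

```python
# Each input datum is multiplied by one weight per hidden neuron before
# the next input is fetched (see the loop sketch above), so the hidden
# width sets the fetch frequency.
SAFE_WEIGHT_READS_PER_FETCH = 100   # assumed failure threshold

def weight_reads_per_fetch(layer_sizes: tuple) -> int:
    """Weight reads between consecutive input-cache fetches."""
    return layer_sizes[1]

assert weight_reads_per_fetch((300, 30, 10)) < SAFE_WEIGHT_READS_PER_FETCH    # triggers the bug
assert weight_reads_per_fetch((300, 100, 30)) >= SAFE_WEIGHT_READS_PER_FETCH  # avoids it
```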
When the type of the design defect of the chip is that the accelerator cannot accommodate the scale of the neural network, the neural network is split into a plurality of sub-networks so that the accelerator can accommodate each sub-network.
Assume the neural network accelerator can only support a neural network with fewer than 10000 connections, at most 1024 input-layer neurons, and at most 64 output-layer neurons. Now consider a DNN with a 960-neuron input layer, a 480-neuron hidden layer, and a 10-neuron output layer: it cannot run on the accelerator because its 465600 connections far exceed the design limit.
The network can instead be split into a plurality of sub-networks, for example nine: eight sub-networks between layers 1 and 2, each of size 120x60 with 7200 connections, and one remaining sub-network between layers 2 and 3 of size 480x10 with 4800 connections, so that each sub-network meets the design limits. Data connections among the first eight sub-networks are partially removed to achieve a clean split. During training the whole 960x480x10 network is trained; during inference the split sub-networks are computed in succession, which extends the capability of the accelerator.
Besides this scheme, the neural network may be split in other ways; the number of sub-networks is not limited to nine and can be chosen according to actual requirements, which this embodiment does not specifically limit. The splitting arithmetic is sketched below.
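The splitting arithmetic from the example can be checked in a few lines; the limits and sub-network sizes below are taken directly from the text:

```python
# Accelerator limits from the example: <10000 connections per network,
# at most 1024 input-layer and 64 output-layer neurons.
MAX_CONNECTIONS, MAX_IN, MAX_OUT = 10000, 1024, 64

def fits(n_in: int, n_out: int) -> bool:
    """Does a single fully connected sub-network meet the accelerator limits?"""
    return n_in * n_out < MAX_CONNECTIONS and n_in <= MAX_IN and n_out <= MAX_OUT

# The whole 960x480x10 network has 465600 connections, far over the limit.
assert 960 * 480 + 480 * 10 == 465600
assert not fits(960, 480)

# Split plan from the text: eight 120x60 sub-networks between layers 1-2
# (7200 connections each) plus one 480x10 sub-network between layers 2-3
# (4800 connections). Every piece now fits the accelerator.
subnetworks = [(120, 60)] * 8 + [(480, 10)]
assert all(fits(n_in, n_out) for n_in, n_out in subnetworks)
```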
When the type of the design defect of the chip is a faulty arithmetic-unit operation mode, the corresponding operation mode is added into the training process of the neural network, keeping the operations during training identical to those of the accelerator's faulty arithmetic unit, so that the neural network adapts to the defect in the accelerator's operation mode. For example, suppose a neural network is trained with the FANN software, which computes on the neural network in floating point and is free of rounding errors, while the faulty hardware rounds every value it outputs to an integer. FANN's operation mode is modified so that each value it outputs is likewise rounded to an integer, making the software consistent with the hardware. The software is then reused to train the neural model; the resulting model takes the rounding error into account and stays consistent on the hardware.
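A minimal sketch of this hardware-aware training idea follows. The text names the FANN library; the NumPy forward pass, layer structure, and ReLU activation below are stand-ins chosen for illustration, not FANN's actual API:

```python
import numpy as np

def hw_round(x: np.ndarray) -> np.ndarray:
    """Model the defective arithmetic unit: every value the hardware
    produces is rounded to the nearest integer."""
    return np.rint(x)

def forward(x: np.ndarray, weights: list) -> np.ndarray:
    """Forward pass with the hardware's rounding inserted after every
    layer, so training sees the same arithmetic the faulty accelerator
    will perform at inference time."""
    for W in weights:
        x = hw_round(np.maximum(W @ x, 0.0))   # ReLU assumed for illustration
    return x

# Training against this forward pass (e.g. with a straight-through
# estimator for the non-differentiable rounding) yields weights that
# already account for the rounding error, as described above.
```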
According to the neural network processing method based on chip design defects, the type of the chip's design defect is obtained, and the neural network is adjusted according to that type so that it operates normally on the chip; by modifying or adjusting the deep neural network model, the neural network re-adapts to the faulty hardware structure and the computation target is still met.
Fig. 2 is a schematic structural diagram of a neural network processing apparatus based on chip design defects according to an embodiment of the present invention. As shown in Fig. 2, the apparatus specifically includes:
an obtaining module 201, configured to obtain a type of a design defect of a chip;
an adjusting module 202, configured to adjust the neural network according to the type, so that the neural network operates normally on the chip;
wherein the type of the design defect of the chip comprises at least one of the following:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, or the arithmetic unit design is faulty.
Optionally, the adjusting module 202 is specifically configured to adjust an input/output data rate of the input/output data cache when the type of the design defect of the chip is that the input/output data cache is in an abnormal working state.
Optionally, the adjusting module 202 is specifically configured to modify the number of layers of the neural network model, the scale of each layer, and the number of weights read per input datum, and to adjust the frequency at which the arithmetic unit reads data from the input data buffer or writes data to the output buffer.
Optionally, the adjusting module 202 is specifically configured to modify the scale of the input/output layers of the neural network model and to adjust the address range the arithmetic unit uses to read data from the input data cache or write data to the output cache.
Optionally, the adjusting module 202 is specifically configured to, when the type of the design defect of the chip is that the accelerator cannot accommodate the scale of the neural network, split the neural network into a plurality of sub-networks so that the accelerator can accommodate each sub-network.
Optionally, the adjusting module 202 is specifically configured to, when the type of the design defect of the chip is a faulty arithmetic-unit operation mode, add the corresponding operation mode into the training process of the neural network so that the operations during training match those of the accelerator's faulty arithmetic unit, allowing the neural network to adapt to the accelerator's incorrect operation mode.
According to the neural network processing device based on chip design defects, the type of the chip's design defect is obtained, and the neural network is adjusted according to that type so that it operates normally on the chip; by modifying or adjusting the deep neural network model, the neural network re-adapts to the faulty hardware structure and the computation target is still met.
Fig. 3 is a schematic diagram of the hardware structure of a neural network processing device based on chip design defects according to an embodiment of the present invention. As shown in Fig. 3, the device specifically includes:
a processor 310, a memory 320, a transceiver 330.
The processor 310 may be a Central Processing Unit (CPU) or a combination of a CPU and a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof. The PLD may be a Complex Programmable Logic Device (CPLD), a field-programmable gate array (FPGA), a General Array Logic (GAL), or any combination thereof.
The memory 320 is used to store various applications, operating systems, and data, and may pass the stored data to the processor 310. The memory 320 may include volatile memory; nonvolatile random access memory (NVRAM); phase-change random access memory (PRAM); magnetoresistive random access memory (MRAM); at least one magnetic disk storage device; electrically erasable programmable read-only memory (EEPROM); a flash memory device such as NOR flash or NAND flash; or a semiconductor device such as a solid-state disk (SSD). The memory 320 may also comprise a combination of the above types of memory.
The transceiver 330 is used to transmit and/or receive data; it may be, for example, an antenna.
The working process of each device is as follows:
and a processor 310 for acquiring the type of the design defect of the chip.
And the processor 310 is further configured to adjust the neural network according to the type, so that the neural network operates normally on the chip.
The type of the design defect of the chip comprises at least one of the following:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, or the arithmetic unit design is faulty.
Optionally, the processor 310 is further configured to adjust an input data rate of the input/output data buffer when the type of the design defect of the chip is that the input/output data buffer is in an abnormal operating state.
Optionally, the processor 310 is further configured to adjust an address range of the arithmetic unit for reading data from the input data buffer or writing data to the output buffer when the type of the design defect of the chip is that the input/output data buffer is in an abnormal operating state.
Optionally, the processor 310 is further configured to modify the number of weights read per input datum and to adjust the frequency at which the arithmetic unit accesses data in the input/output data buffer.
Optionally, the processor 310 is further configured to split the neural network into a plurality of sub-networks so that the accelerator can adapt to each sub-network when the type of the design defect of the chip is that the accelerator cannot adapt to the scale of the neural network.
Optionally, the processor 310 is further configured to, when the type of the design defect of the chip is a faulty arithmetic-unit operation mode, add the corresponding operation mode into the training process of the neural network so that the operations during training match those of the accelerator's faulty arithmetic unit, allowing the neural network to adapt to the accelerator's incorrect operation mode.
The neural network processing device based on chip design defects provided in this embodiment may be the device shown in Fig. 3 and may perform all the steps of the neural network processing method shown in Fig. 1, thereby achieving the technical effects of that method; for details, refer to the description of Fig. 1, which is not repeated here for brevity.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (4)

1. A neural network processing method based on chip design defects is characterized by comprising the following steps:
obtaining the type of the design defect of the chip;
adjusting the neural network according to the type so that the neural network can operate normally on the chip;
wherein the types of design defects of the chip include:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, and the arithmetic unit design is faulty;
wherein adjusting the neural network according to the type comprises:
when the type of the design defect of the chip is that the input/output data cache is in an abnormal working state, adjusting the scale of each intermediate layer of the neural network and the quantity of parameters such as weights, thereby adjusting the data rate of the input/output data cache;
when the type of the design defect of the chip is that the accelerator cannot accommodate the scale of the neural network, splitting the neural network into a plurality of sub-networks so that the accelerator can accommodate each sub-network;
and when the type of the design defect of the chip is a faulty arithmetic unit design, adding the operation mode of the faulty arithmetic unit into the training process of the neural network and keeping the operations during training identical to those of the faulty unit, so that the neural network adapts to the accelerator's incorrect operation mode.
2. The method of claim 1, wherein adjusting the data addresses of the input/output data buffer comprises:
adjusting the scale of the input layer of the neural network so that the neural network fully utilizes the non-failed portion of the data cache and bypasses the fault addresses.
3. A neural network processing apparatus based on chip design defects, comprising:
the acquisition module is used for acquiring the type of the design defect of the chip;
the adjusting module is used for adjusting the neural network according to the type so that the neural network can operate normally on the chip;
wherein the types of design defects of the chip include:
the input/output data cache is in an abnormal working state, the accelerator cannot accommodate the scale of the neural network, and the arithmetic unit design is faulty;
the adjusting module is specifically configured to, when the type of the design defect of the chip is that the input/output data cache is in an abnormal working state, adjust the scale of each layer of the neural network model and the quantity of parameters such as weights, so that the input/output data rate of the input/output cache is adjusted;
when the type of the design defect of the chip is that the accelerator cannot accommodate the scale of the neural network, split the neural network into a plurality of sub-networks so that the accelerator can accommodate each sub-network;
and when the type of the design defect of the chip is a faulty arithmetic unit design, incorporate the corresponding operation mode into the training process of the neural network, keeping the operations during training identical to those of the faulty arithmetic unit, so that the neural network adapts to the accelerator's incorrect operation mode.
4. The apparatus of claim 3, wherein the adjusting module is specifically configured to adjust the scale of the input layer of the neural network so that the neural network makes full use of the non-failed portion of the data cache and bypasses the fault addresses.
CN201811127453.5A (filed 2018-09-27, priority 2018-09-27): Neural network processing method and device based on chip design defects. Granted as CN109359731B; status: Active.

Priority Applications (1)

CN201811127453.5A (priority 2018-09-27, filed 2018-09-27): Neural network processing method and device based on chip design defects

Publications (2)

CN109359731A: published 2019-02-19
CN109359731B: granted 2022-01-28

Family

ID=65347787

Family Applications (1)

CN201811127453.5A (Active, filed 2018-09-27): Neural network processing method and device based on chip design defects

Country Status (1)

CN: CN109359731B

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10678244B2 (en) 2017-03-23 2020-06-09 Tesla, Inc. Data synthesis for autonomous control systems
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
CA3115784A1 (en) 2018-10-11 2020-04-16 Matthew John COOPER Systems and methods for training machine models with augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN115272287B (en) * 2022-08-19 2023-04-07 哈尔滨市科佳通用机电股份有限公司 Fault detection method, medium and system for rail wagon buffer and slave plate

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103076996A (en) * 2013-01-08 2013-05-01 深圳市硅格半导体有限公司 Phase change random access memory (PCRAM) control method and system
CN106485317A (en) * 2016-09-26 2017-03-08 上海新储集成电路有限公司 A kind of neutral net accelerator and the implementation method of neural network model
CN106650922A (en) * 2016-09-29 2017-05-10 清华大学 Hardware neural network conversion method, computing device, compiling method and neural network software and hardware collaboration system
CN107437110A (en) * 2017-07-11 2017-12-05 中国科学院自动化研究所 The piecemeal convolution optimization method and device of convolutional neural networks
CN107451659A (en) * 2017-07-27 2017-12-08 清华大学 Neutral net accelerator and its implementation for bit wide subregion
CN108090560A (en) * 2018-01-05 2018-05-29 中国科学技术大学苏州研究院 The design method of LSTM recurrent neural network hardware accelerators based on FPGA

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Parallelization Technology of Machine Learning Algorithms for On-chip Heterogeneous Multi-core Systems; Gao Fang; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2018-06-15; full text *

Also Published As

CN109359731A: published 2019-02-19

Similar Documents

Publication Publication Date Title
CN109359731B (en) Neural network processing method and device based on chip design defects
CN109034373B (en) Parallel processor and processing method of convolutional neural network
US20220374688A1 (en) Training method of neural network based on memristor and training device thereof
KR102311461B1 (en) Apparatuses and methods for flexible fuse transmission
CN102541774B (en) Multi-grain parallel storage system and storage
JP2018142049A (en) Information processing apparatus, image recognition apparatus and method of setting parameter for convolution neural network
US20200019847A1 (en) Processor array for processing sparse binary neural networks
JP7132043B2 (en) reconfigurable processor
CN112420115B (en) Fault detection method for dynamic random access memory
CN110535476B (en) Method, device, computer equipment and storage medium for optimizing soft information storage of LDPC soft decoder
US11501149B2 (en) Memory device including neural network processing circuit
CN110245756A (en) Method for handling the programming device of data group and handling data group
CN114819051A (en) Calibration method and device for analog circuit for performing neural network calculation
CN117795473A (en) Partial and managed and reconfigurable systolic flow architecture for in-memory computation
CN114758191A (en) Image identification method and device, electronic equipment and storage medium
CN109947608B (en) Method and device for detecting single event upset fault of FPGA addition tree
CN114881221A (en) Mapping scheme optimization method and device, electronic equipment and readable storage medium
CN106209115A (en) A kind of data processing method and electronic equipment
CN114527930A (en) Weight matrix data storage method, data acquisition method and device and electronic equipment
US11269720B2 (en) Memory storage apparatus and data access method
CN116662063B (en) Error correction configuration method, error correction method, system, equipment and medium for flash memory
US11556790B2 (en) Artificial neural network training in memory
CN111931921B (en) Ping-pong storage method and device for sparse neural network
CN117973464A (en) Neural network model compression method, device, computing system and storage medium
US20240104160A1 (en) Sequential group processing of optimization problems

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
CP03: Change of name, title or address

Address after: Building 613A, Building 5, Qilin Artificial Intelligence Industrial Park, No. 266 Chuangyan Road, Qilin Technology Innovation Park, Nanjing City, Jiangsu Province, 211135

Patentee after: Zhongke Wuqi (Nanjing) Technology Co.,Ltd.

Address before: Room 1248, 12 / F, research complex building, Institute of computing technology, Chinese Academy of Sciences, No. 6, South Road, Haidian District, Beijing 100086

Patentee before: JEEJIO (BEIJING) TECHNOLOGY Co.,Ltd.