WO2017187516A1 - Information processing system and operation method thereof - Google Patents
Information processing system and operation method thereof
- Publication number
- WO2017187516A1 (PCT/JP2016/063072, JP2016063072W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- machine learning
- layer
- data
- recognition
- learning
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Definitions
- Deep learning is known as a machine learning technique based on multi-layered neural networks (Deep Neural Network: DNN). Although the technique is rooted in classical neural networks, it has attracted renewed attention in recent years, triggered by the improvement in image recognition rates achieved with convolutional neural networks. Deep learning devices range from terminals, such as image recognizers for autonomous driving, to cloud systems for big data analysis.
- Patent Document 1 discloses a technique that uses a first network and a second network in order to obtain not only the output value of a network but also its derivative accurately and quickly: the first network performs its computation using a sigmoid function, while the second network improves computational efficiency by reducing the derivative of the sigmoid function to, in effect, the four basic arithmetic operations.
- Patent Document 2 relates to a learning method for neural networks with wide fields of application such as pattern recognition, character recognition, and various kinds of control. Its object is to provide a neural network learning system that, by using a plurality of neural networks with different numbers of units in the intermediate layer, can learn efficiently and at high speed while suppressing the increase in computation.
- However, the above patent documents do not offer an efficient solution for implementing so-called deep learning, in which the neural network is made deeper, in an IoT environment.
- The reason is that the systems described above are intended to use each output for its own purpose, and therefore have no concept of reconfiguring the network at each layer or of using computing resources efficiently.
- The hardware installed on the terminal side is limited in scale, power, and computing performance. In particular, there is demand for a system that can compute efficiently and change its configuration appropriately according to the situation.
- a framework that enables efficient processing in cooperation with a central computer is also important.
- The IoT era will bring huge systems with trillions of sensors; it will be difficult to control everything centrally, so it is also a requirement that each terminal be capable of autonomous control.
- One aspect of the present invention for solving the above problems is an information processing system in which a plurality of DNNs are organized hierarchically, and the hidden layer data of the DNN in a first-tier machine learning / recognition device is used as the input data of the DNN in a second-tier machine learning / recognition device.
- The hardware scale of the second-tier machine learning / recognition device is configured to be larger than that of the first-tier machine learning / recognition device.
- Another aspect of the present invention is a method of operating an information processing system comprising a plurality of DNNs, in which the DNNs are organized hierarchically into first-tier and second-tier machine learning / recognition devices, and the information processing capability of the second-tier device is higher than that of the first-tier device.
- The hidden layer data of the first-tier device's DNN is used as the input data of the second-tier device's DNN.
- The configuration of the first-tier device's neural network is controlled based on the processing results of the second-tier device.
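As a rough illustration of this aspect (a hypothetical sketch with invented weights, not the patent's implementation), a small terminal-side network can expose its hidden-layer activations, which then serve directly as the input vector of a larger server-side network:

```python
import math

def affine_tanh(x, W, b):
    """One fully connected layer with tanh activation."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

class TinyDNN:
    """Minimal feed-forward net that also exposes its hidden layer."""
    def __init__(self, W_hid, b_hid, W_out, b_out):
        self.W_hid, self.b_hid = W_hid, b_hid
        self.W_out, self.b_out = W_out, b_out

    def forward(self, x):
        hidden = affine_tanh(x, self.W_hid, self.b_hid)
        output = affine_tanh(hidden, self.W_out, self.b_out)
        return hidden, output

# First-tier (terminal) DNN: 3 inputs -> 2 hidden -> 2 outputs.
dnn1 = TinyDNN(W_hid=[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], b_hid=[0.0, 0.1],
               W_out=[[0.5, -0.5], [-0.5, 0.5]], b_out=[0.0, 0.0])
# Second-tier (server) DNN: consumes the 2 hidden values, not the 2 outputs.
dnn2 = TinyDNN(W_hid=[[0.7, -0.2], [0.3, 0.9]], b_hid=[0.0, 0.0],
               W_out=[[1.0, 0.0], [0.0, 1.0]], b_out=[0.0, 0.0])

hidden1, out1 = dnn1.forward([0.5, -1.0, 0.25])
_, out2 = dnn2.forward(hidden1)   # hidden-layer data is the tier-2 input
```

The point of the sketch is only the wiring: the tier-2 device never sees the tier-1 output layer, only the hidden-layer feature vector.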
- The system has means for calculating the data of the second layer from the data of the first layer, and conversely for calculating the data of the first layer from the data of the second layer.
- Weight data determines the relationship between each element of the first layer and each element of the second layer, and the entire weight coefficient matrix constituting the weight data is held in a single storage holding unit.
- A calculation unit has product-sum operation units in one-to-one correspondence with the elements of the weight coefficient matrix. The matrix is stored with its row vectors as the basic unit, and the weight-matrix calculation is performed for each basic unit stored in the storage holding unit.
- The first row vector is held in the storage holding unit with its elements in the same order as in the original matrix.
- The second row vector is held in the storage holding unit with its elements shifted by one position to the right or left relative to the original matrix.
- The third row vector is held shifted by one further position in the same direction as the second row.
- The Nth (last) row vector is held shifted by one further position in the same direction relative to the (N-1)th row.
- When calculating the first-layer data from the second-layer data using the weight coefficient matrix, the second-layer data is arranged as a column vector and each element is input to a product-sum operation unit. At the same time, the first row of the weight coefficient matrix is input to the product-sum operation units, the two are multiplied, the results are stored in the accumulators, and the second and subsequent rows are then processed.
- Each time a row of the weight matrix is processed, the second-layer data is shifted by one element to the left or right and realigned with the element data of the corresponding row of the weight coefficient matrix.
- The arithmetic unit multiplies the realigned second-layer data, adds the result to the data stored in the accumulator of the same operation unit, and repeats the same operation up to the Nth row of the weight coefficient matrix.
- Conversely, when calculating the second-layer data from the first-layer data, the first-layer data is arranged as a column vector and each element is input to a product-sum operation unit.
- The first row of the weight coefficient matrix is input to the product-sum operation units, the multiplication is performed, the results are stored in the accumulators, and the second and subsequent rows are then processed.
- Each time a row of the weight coefficient matrix is processed, the first-layer data is shifted by one element to the left or right and realigned with the element data of the corresponding row.
- After the multiplication with the realigned first-layer data, the accumulator contents of each operation unit are passed to the adder of the adjacent operation unit, added to that unit's multiplication result, and stored in its accumulator.
- The machine learning arithmetic unit repeats this operation through all N rows of the weight matrix.
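One plausible reading of this shifted-row storage (it matches the classic staggered, or diagonal, layout for SIMD matrix-vector products) is: row k is stored cyclically shifted by k positions, the layer data is rotated by one element per step, and each product-sum unit accumulates in place. A pure-Python sketch of that reading, emulating the N units in software rather than hardware:

```python
def staggered_matvec(W, x):
    """Emulate N product-sum units computing y = W @ x, with row k of W
    held in the storage unit cyclically shifted left by k elements."""
    n = len(x)
    # Storage holding unit: rows are the basic unit, row k shifted by k.
    stored = [W[k][k:] + W[k][:k] for k in range(n)]
    acc = [0.0] * n            # one accumulator per product-sum unit
    data = list(x)             # layer data fed to the units
    for step in range(n):
        for k in range(n):     # all units multiply-accumulate in parallel
            acc[k] += stored[k][step] * data[k]
        data = data[1:] + data[:1]   # shift the layer data by one element
    return acc
```

After N steps each accumulator holds one element of W·x, e.g. `staggered_matvec([[1, 2], [3, 4]], [5, 6])` gives `[17.0, 39.0]`. The attraction of the layout is that every unit reads a fixed local weight and its current data element; only a one-element shift of the data is needed between steps.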
- Intermediate data is generated by computing the connections between neurons using weight functions determined in advance by learning.
- This intermediate data is obtained by extracting feature points that classify the input data.
- the generated intermediate data is input to an upper level neural network device provided in the second level.
- the second-layer neural network device receives an output signal from an intermediate layer of one or more neural network devices in the first layer.
- the second-layer neural network device receives new inputs from one or more first-layer neural network devices and performs new learning.
- FIG. 1 is a configuration block diagram according to a first embodiment of the present invention.
- (A) A diagram showing the structure of the first tier.
- (B) An explanatory diagram of the structure between the calculation nodes.
- FIG. 3A is a block diagram showing another form of the embodiment shown in FIG. 2(A). FIG. 3B shows the communication protocol between the first and second tiers. FIG. 4 is a flowchart showing the sequence for updating the DNN information of the first tier.
- the same symbol or number may be distinguished by adding a suffix. However, if there is no need to distinguish between them, the suffix may be omitted.
- notations such as “first”, “second”, and “third” are attached to identify the constituent elements, and do not necessarily limit the number or order.
- a number for identifying a component is used for each context, and a number used in one context does not necessarily indicate the same configuration in another context. Further, it does not preclude that a component identified by a certain number also functions as a component identified by another number.
- FIG. 1A explains the basic concept of this embodiment.
- The simplest example, shown in FIG. 1A, is a system that performs learning on the server side and recognition on the terminal side.
- As the inventors of the present application proceeded with their study of DNNs, they found that learning on the upper server side becomes efficient when the intermediate data of the DNN computation in the recognition unit is utilized.
- The input data on the terminal side and the DNN intermediate-layer data produced when the terminal performs recognition are sent to the server side.
- Learning is performed on the server side, and the learning result is transmitted to the terminal side at an appropriate timing to advance the recognition operation on the terminal.
- The DNN on the server side uses the data output of the intermediate layer of the terminal's DNN as its input, and learning is performed with the DNN in each tier.
- the DNN of the terminal performs supervised learning
- the DNN of the server performs supervised learning.
- The terminal-side DNN device is composed of small-area, low-power hardware.
- The server-side DNN device is composed of a so-called server with high-speed computation and large-capacity memory.
- FIG. 1B is a diagram showing a main embodiment of the present invention.
- FIG. 1B (a) shows a system composed of a plurality of machine learning devices (DNN1-1 to 2-1).
- The paths indicated by nd011-nd014, nd021-nd024, and nd031-nd034 connect the layers of each neural network.
- The machine learning / recognition devices of the first tier (1st HRCY) and the second tier (2nd HRCY) are hierarchically connected.
- Each machine learning / recognition device DNN includes an input layer IL, an intermediate layer HL, and an output layer OL.
- Between the first-tier and second-tier machine learning / recognition devices, it is not the output layer OL data of the deep neural network constituting the first-tier device that is passed at recognition time.
- Rather, the intermediate layer HL data (nd014, nd024) generated during the recognition process, the so-called hidden layer data, is used as the input of the second-tier machine learning / recognition device.
- The output layer OL data presents the recognition result, for example as a histogram over predefined categories, and indicates how the input data has been classified as a result of recognition.
- Data from the intermediate layer (hidden layer) HL is data obtained by extracting feature values of the input data.
- The reason this intermediate layer data is utilized is that it is data from which the features of the input have been extracted, and it can therefore serve as high-quality input data for learning by the second-tier machine learning / recognition device.
- The signals (nd015, nd025) from the second-tier device to the first-tier device carry the network and weights of the first-tier device, or instructions to change them. A change signal is issued when the learning / recognition processing in the first and second tiers determines that the recognition network of the first-tier device needs to be changed. This makes it possible to improve the recognition rate of the first-tier device under actual operating conditions.
- CNN: convolutional neural network.
- For the portion corresponding to the hidden layer, a part of the original image is cut out (the region covered by a kernel), and so-called image convolution is performed by a pixel-wise product-sum operation with a weight filter of the same size.
- A pooling operation that coarse-grains the image is then performed, generating a plurality of smaller data maps.
- The hidden layer thus efficiently extracts the information that characterizes the original image.
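The two operations described above can be sketched generically (a minimal illustration with a made-up 5x5 image and 2x2 filter, not any filter from the patent): convolution as a pixel-wise product-sum of each kernel-sized cutout with a weight filter, followed by max pooling as the coarse-graining step.

```python
def convolve2d(img, kernel):
    """'Valid' 2-D convolution: pixel-wise product-sum of each
    kernel-sized cutout of the image with the weight filter."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(img[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(len(img[0]) - kw + 1)]
            for i in range(len(img) - kh + 1)]

def max_pool2d(img, size=2):
    """Coarse-grain the image: keep the max of each size x size block."""
    return [[max(img[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(img[0]) - size + 1, size)]
            for i in range(0, len(img) - size + 1, size)]

image = [[0, 0, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [1, 1, 1, 0, 0],
         [0, 1, 0, 0, 1],
         [0, 0, 0, 1, 1]]
edge = [[1, 0], [0, -1]]          # a tiny illustrative weight filter
fmap = convolve2d(image, edge)    # 4x4 feature map (hidden-layer data)
pooled = max_pool2d(fmap)         # 2x2 after coarse-graining
```

Each such stage shrinks the data while making the filtered feature stand out, which is exactly why the hidden-layer output is compact, feature-rich input for the next tier.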
- In considering data conversion in machine learning, the inventors found that learning efficiency can be improved by effectively using the feature data appearing in the hidden layers of a CNN.
- In image recognition learning, it is generally difficult for a machine to grasp the meaning of image data, even when a human can understand it at a glance.
- The hidden layer data is processed so that the features of the image stand out: information is compressed by convolution with the weight data and coarse-grained by statistical processing over neighboring pixels.
- By providing a plurality of such feature extraction stages, a CNN makes the feature quantities stand out, and by processing those feature quantities it can judge the image close to the correct answer with high probability.
- In other words, the data in the intermediate layers is valuable because it highlights features.
- Since a neural-network-type learning machine requires computation in proportion to the number of neurons, computing resources (calculation performance, hardware scale, and so on) are important.
- Low latency is achieved, satisfying requirement (3).
- The first-tier machine learning / recognition device is small, capable of high-speed feedback, and limited in function.
- Requirement (2) is thereby also satisfied.
- FIG. 1B (b) shows a combination configuration example of four types of hardware used in the first layer and the second layer.
- the hardware scale on the second layer side is made larger than that on the first layer side.
- the information processing capability is generally higher.
- the requirement (4) is also satisfied.
- Learning in the second tier brings a qualitative improvement with respect to requirement (1) compared with conventional learning that, like recognition, uses the raw input data in the first tier. This is because a larger amount of information is input to the second-tier machine learning / recognition device by taking values from the hidden layer rather than the output layer of the first-tier device.
- the first layer machine learning / recognition device and the second layer machine learning / recognition device can each have a learning function.
- supervised learning is performed by the second hierarchy machine learning / recognition apparatus.
- Learning is easier than if the whole system were a single DNN.
- Since the second-tier device can learn while also using data from other first-tier devices as input, the amount of data can be increased efficiently, improving learning efficiency and learning outcomes.
- In the second-tier machine learning / recognition device, supervised learning is performed using the hidden layer values calculated by the first-tier device as input, so that learning can be repeated within the second-tier device.
- FIG. 2 shows a specific configuration of the first layer machine learning / recognition apparatus (DNN1).
- A neural-network-type machine learning / recognition device includes the nodes (i_1 to i_L) of an input layer IL1, the nodes (o_1 to o_P) of an output layer OL1, and the nodes of the hidden layers HL11 to HL13 (n^2_1 to n^2_M, n^3_1 to n^3_N, n^4_1 to n^4_O); the connections between the nodes are shown in FIG. 2.
- An arithmetic unit (AU) computing the weight w^i_{j,k} with the input node n^i_j is inserted into the connection between n^i_j and n^{i+1}_k.
- The DNN network configuration controller is a control circuit that manages the configuration of the DNN network.
- The DNN configuration data is held as information for the neural network configuration information transmission line (NWCD) and the weight coefficient change line (WCD), and is reflected in the DNN device as necessary.
- This configuration data can correspond to a so-called configuration memory when using an FPGA (Field Programmable Gate Array) described later.
- the DNN network configuration control unit can communicate with the second layer machine learning / recognition device (DNN2).
- the contents of the DNN configuration data can be transmitted to the second layer machine learning / recognition apparatus, and the contents of the DNN configuration data can be received from the second layer machine learning / recognition apparatus. Data for communication will be described later with reference to FIG. 3B.
- the data storage memory (DNN_MIDD) has a function of holding data of each layer of the neural network and outputting it to the second layer machine learning / recognition device.
- the data of nd014 and nd024 have been described in the form of being transmitted to the second layer machine learning / recognition apparatus.
- This is the well-known technique generally called supervised learning: it evaluates how far the output computed by DNN1 deviates from so-called teacher data (TDS1), which is taken to be correct, and the essential point is that the weight coefficients of the neural network are changed based on the amount of deviation.
- The deviation detection unit (DD: Deviation Detection) calculates the deviation amount (DDATA) by matching the DNN1 calculation result against the teacher data (TDS1) and, as necessary, generates and stores recognition result rating information by comparing the result information with the correct-answer information.
- Weights are determined and stored by a weight coefficient adjustment circuit (WCU: Weight Change Unit); the weight coefficients are set via the weight coefficient change lines (WUD), and the weights w^i_{j,k} defined between the nodes n^i_j and n^{i+1}_k of the neural network are changed.
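The DD/WCU loop can be sketched with a simple delta rule (an illustrative stand-in for the patent's circuit, with hypothetical function names and a toy linear model): DD computes the per-node deviation DDATA against the teacher data, and WCU nudges each weight in proportion to that deviation and the input activity.

```python
def deviation_detection(output, teacher):
    """DD: per-node deviation (DDATA) between DNN output and teacher data."""
    return [t - o for o, t in zip(output, teacher)]

def weight_change_unit(W, inputs, ddata, rate=0.1):
    """WCU: adjust each weight w[k][j] in proportion to the deviation of
    output node k and the activity of input node j (delta rule)."""
    return [[w + rate * d * x for w, x in zip(row, inputs)]
            for row, d in zip(W, ddata)]

def forward(W, inputs):
    """Toy linear stand-in for the DNN1 recognition computation."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in W]

W = [[0.2, -0.1], [0.0, 0.3]]
inputs, teacher = [1.0, 0.5], [1.0, 0.0]
for _ in range(50):                       # repeated learning steps
    ddata = deviation_detection(forward(W, inputs), teacher)
    W = weight_change_unit(W, inputs, ddata)
```

After the loop the deviation has shrunk toward zero, which is the behavior the WCU/WUD path is meant to realize in hardware.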
- FIG. 3A shows another configuration example of the first layer machine learning / recognition apparatus (DNN1).
- The data of the final output layer OL1 produced by the recognition processing (Recognition) is used as input, and the inverse of the recognition operation (Learning) is performed back toward the input layer IL1.
- The first-tier machine learning / recognition device (DNN1) is provided with means for storing a recognition result score at the same time as performing recognition processing, and with update request transmitting means that sends an update request signal for the DNN neural network structure and weight coefficients of the first-tier device to the second-tier device when the recognition score falls below a predetermined threshold, or when the variance of the recognition result histogram exceeds a predetermined value.
- Upon receiving the update request signal from the first-tier device, the second-tier machine learning / recognition device (DNN2) updates the DNN neural network structure and weight coefficients of the first-tier device and transmits the update data to the first-tier device.
- The first-tier machine learning / recognition device (DNN1) then builds a new neural network based on the update data.
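The trigger condition can be sketched as follows (a hypothetical check with made-up threshold values; the patent only specifies "a predetermined threshold" and a variance test on the score histogram):

```python
def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def needs_update(scores, score_threshold=0.6, var_threshold=0.05):
    """Decide whether DNN1 should raise its update request signal:
    either the latest recognition score fell below a threshold, or the
    recent score histogram has become too spread out (high variance)."""
    return scores[-1] < score_threshold or variance(scores) > var_threshold

recent_scores = [0.91, 0.88, 0.90, 0.52]     # recognition score history
update_request = needs_update(recent_scores)  # True: ask tier 2 for update
```

Either condition signals that the deployed network no longer fits the field data, so DNN1 requests a retrained configuration from DNN2 rather than retraining locally.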
- FIGS. 2A and 3A show specific examples of the first-level machine learning / recognition apparatus (DNN1).
- the basic structure of the second-level machine learning / recognition device (DNN2) is the same.
- Supervised learning is performed using the data from the hidden layer HL of the first-tier device (DNN1) as the input of the second-tier device (DNN2). The second-tier device also has an interface for data communication with the DNN network configuration controller (DNNCC) and data storage memory (DNN_MIDD) of the first-tier device.
- FIG. 3B shows the communication protocol between the first and second tiers. The structure of the data held in the first tier is shown both for the case where the first-tier machine learning / recognition device performs learning and for the case where it does not.
- The information representing the characteristics of the first-tier machine learning / recognition device consists of the configuration information of the neural network (DNN#), the weight coefficient information (WPN#), the comparison result against the correct-answer information (RES_COMP), the recognition result information (recognition accuracy rate and the like, Det_rank), and the configuration update request signal of the first-tier device (UD(Req)).
- The configuration update request signal of the first-tier device is at most a few bits. The second-tier device periodically checks this signal to see whether the configuration of the first-tier device needs to be updated. If the signal indicates an update request, the second-tier device prepares to transfer the latest data it has additionally learned to the first-tier device; once the data update information is ready, it transmits a request-update-preparation-completion signal to the first-tier device, where it is stored as UD_Prprd.
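The record exchanged above can be pictured as a small structure (field names taken from FIG. 3B; the Python types and the polling helper are illustrative assumptions, not the patent's encoding):

```python
from dataclasses import dataclass

@dataclass
class Tier1Record:
    """Data a first-tier device exposes to the second tier (field names
    from FIG. 3B; types and widths are illustrative assumptions)."""
    dnn_no: int          # DNN#   : neural network configuration ID
    wpn_no: int          # WPN#   : weight coefficient pattern ID
    res_comp: bool       # RES_COMP: comparison with correct-answer info
    det_rank: float      # Det_rank: recognition accuracy rating
    ud_req: bool = False     # UD(Req) : configuration update request bit
    ud_prprd: bool = False   # UD_Prprd: update-preparation-complete bit

def poll(record: Tier1Record) -> Tier1Record:
    """Second tier periodically checks UD(Req); when set, it prepares the
    update data and acknowledges by setting UD_Prprd."""
    if record.ud_req:
        record.ud_prprd = True   # update data is ready for download
    return record

rec = poll(Tier1Record(dnn_no=3, wpn_no=7, res_comp=True,
                       det_rank=0.93, ud_req=True))
```

Keeping UD(Req) down to a few bits makes the periodic polling by the second tier cheap even with many first-tier devices.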
- When DNN learning is performed in the second-tier device and fails to reach the desired recognition rate, re-executing the learning in the first-tier device is conceivable. Even in such a case, since learning is hierarchized, the computation remains efficient as a whole.
- FIG. 4 shows a program sequence for changing the configuration of the first layer machine learning / recognition apparatus.
- Update request information from the first-tier machine learning / recognition device is transmitted to the second-tier device.
- In response, a data preparation completion signal, or update bit information, is sent back to the first-tier machine learning / recognition device.
- The boot sequence shown in FIG. 4 runs when the first-tier machine learning / recognition device is rebooted.
- By checking the data preparation completion signal or the update bit information, the first-tier device determines whether a data update access to the second-tier machine learning / recognition device is necessary. If it is, the first-tier device sends a data download request signal to the second-tier device (S401), detects the arrival of the update data and downloads it (S402), and inspects the data for integrity using parity and CRC (Cyclic Redundancy Check) (S403). Thereafter, the FPGA configuration information is reconfigured (S404), the FPGA is booted (S405), and normal operation starts (S406).
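The S401-S406 flow can be sketched like this (an illustrative control skeleton using Python's standard `zlib.crc32` for the integrity check; the patent also mentions parity, which is omitted here):

```python
import zlib

def update_sequence(ud_prprd, download, expected_crc):
    """Illustrative S401-S406 flow: if tier 2 signals that update data is
    ready, download it, verify integrity with a CRC, and only then
    reconfigure; otherwise boot with the current configuration."""
    if not ud_prprd:
        return "boot-current"
    blob = download()                      # S401-S402: request + download
    if zlib.crc32(blob) != expected_crc:   # S403: integrity inspection
        return "retry-download"
    # S404-S406: reconfigure the FPGA, boot it, enter normal operation.
    return "reconfigured-and-booted"

payload = b"new DNN configuration + weights"
status = update_sequence(True, lambda: payload, zlib.crc32(payload))
```

The integrity check matters because a corrupted configuration bitstream would leave the terminal-side FPGA unbootable, so reconfiguration only proceeds after the CRC matches.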
- FIG. 5 shows the configuration when the DNN is implemented on an FPGA (501).
- a dynamic rewriting technology of the configuration memory (CRAM) inside the FPGA is used.
- The FPGA includes look-up table units (LEU), switch units (SWU), and hardwired arithmetic units (DSP) and memories (RAM) that perform product-sum operations and the like.
- The logic circuits such as the DNN network in this embodiment are mapped onto the LEU, SWU, DSP, and RAM resources and perform normal operation.
- updating the contents of the DNN as described above can be realized by writing the update data transmitted from the second-level machine learning / recognition apparatus into the CRAM by the CRAM control circuit (CRAMC).
- When this first-tier machine learning / recognition device is configured with an FPGA, it is conceivable that the intermediate layer data stored in the memory, the network configuration information (the configuration information describing the FPGA's switch fabric), the weight information, and the rating information of the recognition results produced by the first-tier device are transmitted to the second-tier learning / recognition device.
- The first tier can thus send efficient, high-quality data to the second-tier learning / recognition device, which has the effect of improving learning efficiency in the second tier.
- The configuration of this embodiment does not restrict the types of neural network used in the first and second tiers.
- If the same kind of network is formed in the first and second tiers, a larger neural network can be constructed as a whole.
- If, for example, a neural network for image recognition processing is configured in the first tier and a neural network for natural language processing is assembled in the second tier, efficient learning that links the first and second tiers becomes possible.
- FIG. 6 is an embodiment characterized in that no means for sending data from the second-tier machine learning / recognition device DNN2 to the first-tier machine learning / recognition device DNN1 is provided.
- the simplest configuration is obtained.
- The advantage of this method is that, although the second-tier device DNN2 performs learning and recognition computation using the computation results of the first-tier device DNN1, there is no feedback path from DNN2 to DNN1, so the first-tier device DNN1 and the second-tier device DNN2 can operate independently of each other.
- The second-tier device DNN2 performs supervised learning using the values of the hidden layers HL13 and HL23 calculated by the first-tier device DNN1 as inputs. Therefore, when learning is repeated in DNN2, there is no need to re-run the computation in DNN1 or to re-execute the learning already performed there, and the total amount of computation can be reduced.
- Even for learning computation, the input data is generated and transferred by the first-tier device DNN1, so there is also the effect that less data needs to be passed to the second-tier device DNN2.
- FIG. 7 assumes a case where the first-level machine learning / recognition apparatus DNN1 advances the recognition process.
- Next, the signal line from the upper tier to the lower tier is described. The system can be extended easily when a signal connection from the upper tier is present.
- The first-tier machine learning / recognition device DNN1 receives input from an external sensor device or a database and executes recognition processing inside DNN1. At that time, the intermediate layer data, here the data of nd014, is held in the data storage STORAGE 1 (HDD, flash memory, DRAM, or the like) attached to DNN1.
- Because the hardware scale of the first-tier machine learning/recognition device DNN1 is often assumed to be limited, data storage in this tier is considered to be limited as well. It is therefore desirable to implement temporary-memory behavior such as a FIFO in this tier and to transmit the data intermittently to the second-tier device DNN2, which builds the database CLASS_DATA in the second tier.
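A minimal sketch of the FIFO-style temporary storage with intermittent transfer might look as follows; the class name, capacity, and flush-on-full policy are assumptions made for illustration, and `send` stands in for whatever transport ships a batch to the second tier.

```python
from collections import deque

class HiddenLayerFIFO:
    """Bounded FIFO for tier-1 hidden-layer data, flushed to tier 2 in batches."""

    def __init__(self, send, capacity=4):
        self.buf = deque(maxlen=capacity)   # bounded temporary memory on the terminal
        self.send = send                    # callable that transmits one batch
        self.capacity = capacity

    def push(self, item):
        self.buf.append(item)
        if len(self.buf) == self.capacity:  # intermittent transfer, not per-sample
            self.send(list(self.buf))
            self.buf.clear()

received = []                               # stands in for CLASS_DATA on tier 2
fifo = HiddenLayerFIFO(received.append, capacity=3)
for v in range(7):
    fifo.push(v)
```

After seven pushes with capacity 3, two full batches have been shipped and one item remains buffered on the terminal.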
- If the recognition score information obtained as recognition proceeds in DNN1, the neural-network configuration information of the DNN1 device, and the weighting-factor information are stored at the same time, they are added to the database in the second-tier machine learning/recognition device DNN2.
- The neural-network information and the weighting-factor information may be in any form that the first and second tiers can mutually interpret; for example, sharing the data in 64-bit units is conceivable.
- The first tier does not need to understand the details of the network configuration information and the weight coefficient information; it only needs to retain the network being executed and its weight coefficient information.
- The second-tier machine learning/recognition device DNN2, however, needs to know which network and which weighting-factor pattern the first-tier device DNN1 is using, so it is necessary to prepare a correspondence table for each associated first-tier device DNN1.
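As one hypothetical reading of the 64-bit shared unit and the correspondence table, the network identifier (DNN#) and weight-pattern number (WPN#) could be packed into a single 64-bit word that keys the second tier's table; the field widths and the table contents here are assumptions, not details from the source.

```python
import struct

def pack_ids(dnn_id: int, wpn_id: int) -> bytes:
    """Pack DNN# and WPN# into one 64-bit word (two 32-bit fields, little-endian)."""
    return struct.pack("<II", dnn_id, wpn_id)

def unpack_ids(word: bytes):
    """Recover (DNN#, WPN#) from a 64-bit word."""
    return struct.unpack("<II", word)

# Tier-2 correspondence table: (DNN#, WPN#) -> description of the tier-1 setup
table = {(3, 17): "CNN, kernel 3x3, weight set 17"}

word = pack_ids(3, 17)       # what tier 1 would transmit
entry = table[unpack_ids(word)]
```

The tier-1 device only forwards the opaque word; interpretation happens entirely against the tier-2 table, matching the statement that the first tier need not understand the details.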
- FIG. 8 shows a case where there are three or more first level machine learning / recognition devices DNN1.
- Since the first-tier machine learning/recognition devices DNN1 perform learning and recognition computation independently of one another, the system is easy to extend to learning in the second-tier device DNN2 even when their number is increased.
- The connection between the first and second tiers has been described here only as an information connection between the two tiers; however, as the number of first-tier devices grows, an efficient connection method becomes important.
- An embodiment has been described in which data is exchanged over the network NW.
- This network NW may be wireless or wired and may be connected as appropriate for the location and situation in which the system is installed.
- FIG. 9 is a diagram showing a modified example. Its feature is that the first-tier machine learning/recognition device DNN1 can be shared by different second-tier machine learning/recognition devices DNN2-1 and DNN2-2.
- The connection between the first-tier device DNN1 and the second-tier devices DNN2 can thus be implemented flexibly. This configuration takes advantage of the fact that the first and second tiers perform independent computations.
- FIG. 10 is a diagram showing an example of another modification.
- Its feature is that, as the data input from the first-tier machine learning/recognition device DNN1 to the second-tier machine learning/recognition device DNN2, the data of the optimum layer can be selected from among the plural intermediate hidden layers and transmitted.
- The figure shows extraction from the outputs of the HL12 and HL22 layers, but the outputs of HL11, HL21, and so on may also be used.
- The switching of this connection can be set independently for the first-tier machine learning/recognition device DNN1 and the second-tier machine learning/recognition device DNN2.
- The data transmitted to the second-tier machine learning/recognition device DNN2 desirably includes the network structure and the weight coefficient information together with the intermediate-layer data.
- The means described in the first embodiment may be used for this.
- The switching of the output data can also be set in cooperation with other first-tier devices DNN1 and second-tier devices DNN2.
- It is effective to provide transmission and reception of a signal indicating to the second-tier machine learning/recognition device DNN2 whether or not to switch the layer from which the transmitted data is extracted.
- The recognition rate obtained when learning based on the data is carried out may be evaluated, and output-switching control of the associated group of first-tier machine learning/recognition devices may be executed accordingly.
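The layer-selection control described above could be as simple as scoring each candidate hidden layer on recognition rate and switching the transmitted output to the best one; the layer names follow the figure, but the score values below are assumed placeholders.

```python
# Hypothetical recognition rates measured on tier 2 for each candidate
# hidden layer of the tier-1 device (assumed values, for illustration only).
scores = {"HL11": 0.71, "HL12": 0.83, "HL21": 0.68, "HL22": 0.79}

# Select the layer whose data the first tier should transmit.
best_layer = max(scores, key=scores.get)
```

In a running system the dictionary would be refreshed as learning proceeds, and the selection result would drive the switching signal sent back to the first-tier device.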
- FIG. 11 shows an embodiment in which operation layers are provided in three layers.
- The reason for providing multiple operation tiers is based on computing capability and efficiency.
- The first-tier machine learning/recognition device DNN1 is intended for installation in an embedded system; being very compact and subject to severe power constraints, it cannot be expected to perform a large amount of computation.
- The computations of the second- and third-tier devices DNN2 and DNN3 are subject to fewer hardware restrictions, making large-scale, high-speed computation possible by exploiting their larger size and relaxed power constraints.
- The second-tier machine learning/recognition device is provided with a duplicate DNN1C of the neural-network structure and weighting-factor information of the first-tier device DNN1, and the learning computation is performed by that duplicate.
- The neural-network structure and weight coefficient information obtained as the learning result are reflected in the first-tier machine learning/recognition device DNN1 as appropriate via the data nd015.
- In the present embodiment, fewer functions are required on the terminal side, so the amount of hardware to be mounted can be reduced.
- Learning with the high-performance machine learning/recognition device in the second tier also has the effect of shortening the time that learning by the first-tier device DNN1 would otherwise require.
- The hidden-layer values are calculated in the first-tier machine learning/recognition device DNN1, and the result nd014 is sent to the second-tier device DNN1C.
- Learning in the second tier is performed repeatedly using the intermediate-layer data of the first-tier machine learning/recognition device DNN1.
- Data such as the neural-network structure and the weighting coefficients obtained as the learning result in the second-tier device DNN1C are transmitted to the first-tier device DNN1 at an appropriate timing.
- The first-tier machine learning/recognition device DNN1 performs recognition processing after reflecting the updated configuration information.
- Since the first-tier device DNN1 does not need to perform the calculation again, the amount of computation during learning can be reduced.
- A feature of this configuration is that the learning function in the first-tier device DNN1 is not used during normal recognition computation; learning takes place only at initialization or update timings.
- The second-tier machine learning/recognition device holds a copy of the first-tier device; after learning there, the neural-network structure, weighting factors, and so on are reflected back into the first-tier device.
- Supervised learning is first performed in the first-tier machine learning/recognition device, and the learning result data is used as an initial value. Then, as shown in the first embodiment, supervised learning is performed over the entire system including the first and second tiers.
- FIG. 14 shows a specific embodiment applied to a convolutional neural network (CNN).
- In a CNN, the hidden layers consist of convolution layers (Convolution Layer: CL) and pooling layers (Pooling Layer: PL), and several such combinations are provided.
- the data of the hidden layer is data such as nd111 to nd115.
- An FPGA is used for the first-tier machine learning/recognition device DNN1, and the second-tier device is composed of a device built from a CPU and a GPU.
- A CNN structurally decomposes the input image into small pixel blocks (called kernels) and performs an inner-product operation with the weighting-coefficient matrix corresponding to the same number of pixels while scanning the original image in units of one kernel.
- Parallel processing in hardware is therefore effective, and implementation with an FPGA, which has a large number of arithmetic units and memories in the LSI, is very efficient, offering low power and high performance.
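The kernel scan can be expressed as a sliding inner product; the following NumPy sketch shows the "valid" cross-correlation that a CNN convolution layer computes, as a software model rather than the FPGA implementation described here.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Scan `kernel` over `image` and take the inner product at each position
    ('valid' mode, without kernel flipping, i.e. cross-correlation)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + kh, c:c + kw]     # the small pixel block
            out[r, c] = np.sum(patch * kernel)    # inner product per block
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.ones((2, 2))
result = conv2d_valid(image, kernel)
```

Each output element is independent of the others, which is exactly the structure that maps well onto the many parallel multiply-accumulate units of an FPGA.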
- In the second tier, it is effective to distribute data from the plural first tiers efficiently to plural arithmetic units as batch processing, so a low-cost distributed computing system using software processing is desirable. As in this example, the scheme can easily be applied to various DNNs.
- FIG. 15 shows an example of application to a machine learning system using different sensors (for example, a camera and a microphone).
- The image-processing neural network DNN1-11 and the voice-processing neural network DNN1-13 are combined.
- For recognition by a robot or the like, characterizing both images and sound is considered highly effective in many recognition tasks. When a person understands things, combining visual and auditory information yields a dramatically larger amount of information than either alone, which increases recognition efficiency.
- Here, the images are processed by a CNN, and the voice is handled by a fully connected neural network.
- This configuration aims to improve the recognition rate by using heterogeneous neural networks and merging their respective advantages.
- Since each network can be trained separately, learning remains easy even in a complicated system.
- FIG. 16 shows a system application and operation method of this embodiment including a database construction system for object recognition to which such a system is applied.
- A plurality of first-tier machine learning/recognition devices simultaneously perform recognition and learning on a single target, and the hidden-layer data calculated by each first-tier device is communicated to the second-tier machine learning/recognition device.
- this object is not limited to image data.
- Data from various modalities, such as audio information, temperature information, odor information, and texture information (hardness and composition), are handled as inputs.
- Efficient information is thus transmitted to the second-tier machine learning device, and more detailed multi-sensor cooperative learning and recognition is performed.
- The learning enhancement period is thus characterized by conducting detailed observations at the laboratory level. The results must then be put into actual operation; this period is defined as the actual operation period. During this period, reconfiguration data is transmitted from the second-tier machine learning/recognition device to the first-tier devices so that efficient recognition can be realized even by a first-tier device operating alone.
- In this situation, following the first embodiment of this application, recognition results for the constantly changing environment are transmitted to the second-tier machine learning/recognition device as appropriate, and further data collection for efficient recognition is carried out.
- The quality of the initial data (a high recognition rate, an efficient neural-network form, and so on) can therefore be expected to improve when it is used in the actual operation period.
- The first-tier machine learning/recognition devices DNN 1 to DNN N are assumed to be small learning/discrimination machines, while the second-tier machine learning/recognition device DNN is assumed to be a large learning machine.
- In the first step, learning is performed by the second-tier machine learning/recognition device DNN.
- This is the first learning phase (learning I), so learning by the second-tier device DNN, with its abundant computational resources, is efficient.
- The input data used for learning matches the operational conditions implemented in the second step. For example, for automatic driving, moving-image data taken by a camera mounted on an automobile can be considered. Learning at this stage uses data under limited conditions and a limited amount of data, but it constructs the basic DNN network for the first-tier machine learning/recognition devices; it is positioned as the learning that builds this basic configuration.
- In the second step, the discriminator is installed in the first-tier machine learning/recognition devices DNN 1 to DNN N to perform recognition and learning (supervised learning) through on-the-job training under actual operating conditions. Learning at this stage is analogous to the on-road training undertaken when acquiring a driver's license.
- The main purpose is to collect data for improving the recognition rate, that is, to grasp how far the DNN constructed in the first step deviates from the teacher data. For example, when applied to an automatic-driving system, the discriminator is installed in an actual car, the driver's (human) judgment is used as teacher data, and the deviation is scored to collect data.
- The hidden-layer data of DNN 1 to DNN N are sent to the second-tier machine learning/recognition device DNN as appropriate, and further learning is accumulated in the second-tier device DNN.
- The update data is then reflected in the first-tier machine learning/recognition devices DNN 1 to DNN N, and supervised learning is advanced by those devices.
- Cases in which the score is particularly good, particularly bad, or the judgment is otherwise notable are sorted out and sent to the second-tier machine learning/recognition device DNN.
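The sorting of notable cases for upload might be sketched as a simple threshold filter; the threshold values and the sample scores below are assumptions for illustration.

```python
# (sample_id, score) pairs produced by a tier-1 discriminator (assumed values).
samples = [("a", 0.98), ("b", 0.55), ("c", 0.12), ("d", 0.90)]

HIGH, LOW = 0.95, 0.20   # hypothetical thresholds for "particularly good/bad"

# Keep only the notable cases for transmission to the second tier.
upload = [s for s in samples if s[1] >= HIGH or s[1] <= LOW]
```

Ordinary mid-range cases stay on the terminal, so only the informative extremes consume bandwidth toward the second-tier device.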
- The second-tier device DNN thereby enables multifaceted learning that also makes use of such information.
- The third stage corresponds to the case where the discriminators of the first-tier machine learning/recognition devices DNN 1 to DNN N have learned sufficiently, and is the stage at which control authority is granted.
- At this stage, the first-tier machine learning/recognition devices mainly perform recognition processing rather than learning.
- Basic items are still compared with the teacher data; a simple checking mechanism that records the level of the comparison results is provided, and the results are transmitted to the second-tier machine learning/recognition device DNN as appropriate, where continuous learning is carried out.
- FIG. 18 shows an embodiment that implements a fully connected layer of a neural network with an FPGA. This connection form is used in neural networks such as the final output layer of a CNN and in the GRBM (Gaussian Restricted Boltzmann Machine) method, and a highly efficient implementation is required to realize it in an FPGA.
- The order in which the weighting factors are accessed differs between the computation of connections from the lower layer (visible layer) to the upper layer (hidden layer) and the computation from the upper layer (hidden layer) to the lower layer (visible layer).
- In FIG. 18(A), the lower layer consists of four nodes, V0 to V3, and the upper layer consists of three nodes, h0 to h2; every lower-layer node is connected to every upper-layer node.
- The operation multiplies each node value by a weight coefficient to calculate the value of the node on the output side.
- When these values are expressed in matrix form, they form a 4 × 3 matrix (four lower-layer nodes by three upper-layer nodes).
- Since the weighting factors form a matrix of very large dimension, preparing two copies of such a matrix is costly, especially in the first-tier machine learning/recognition device. A memory configuration that retains the weighting coefficients in a way that reduces area while maintaining high-speed computation is therefore important.
- As the means for realizing this, the weighting factors are first stored in the matrix representation shown in FIG. 18(B).
- The arithmetic circuit is the product-sum circuit shown in FIG. 18(C). Its feature is that the input paths to the multiplication unit, the addition unit, and the accumulator pass through input selectors, and the circuit also has a path feeding its result to the multiplication and addition units of the adjacent product-sum circuit.
- Each arithmetic unit has a multiplication unit (pd0 to pd3), an addition unit (ad0 to ad3), and an accumulator (ac0 to ac3).
- The inputs of the multiplication unit are selected by a selector: the first input is one of three signals (i000, i001, i002) and the second input is one of (i010, i011, i012). The addition unit takes the output of the multiplication unit as its first input, and its second input is one of four signals switchable by a selector (i020, i021, i022, i023).
- Here, i020 is “0”, i021 is an input from a register, i022 is the accumulator output, and i023 is an example in which the input is shared with part of the multiplication-unit input (i012).
- The data input to the V register is fed to each adder (i010, i020, i030, i040), the corresponding weight coefficients of the W array are fed to the multiplication units (i000, i100, i200, i300), and multiplication is performed.
- Information at address #3 is read from the W array and input to the multiplication units (i000, i100, i200, i300).
- The corresponding elements of the H register are input to the multiplication units (i010, i020, i030), multiplied, and, initially, “0” is added before the result is stored in the accumulator.
- Since the stored data of each accumulator is input to the adder circuit of the adjacent arithmetic unit, sw01, sw11, sw21, and sw31 are turned ON and sw02, sw12, sw22, and sw32 are turned OFF, and the computation is performed.
- In the embodiments above, the DNN system is hierarchized into a terminal-side processing unit and a server-side processing unit. The input data on the terminal side and the DNN intermediate-layer data produced while recognition is being performed on the terminal are sent to the server side, learning is performed on the server, and the learning result on the server is then used to advance the recognition operation on the terminal.
- The input of the server-side DNN uses the data output of the intermediate layer of the terminal DNN, and each tier learns with its own DNN.
- The terminal DNN performs supervised learning, after which the server DNN performs supervised learning.
- The terminal-side DNN device is composed of a small-area, low-power device, while the server-side DNN device is a so-called server with high-speed computation and large-capacity memory.
- Hierarchical learning has the effect of shortening the learning time and making the learning itself easier than treating the entire DNN as one unit.
- The control variables originally envisioned by the designer are not necessarily optimal; by distributing learning between multiple terminals and servers, where such optimization is difficult, the overall DNN configuration can be optimized as a whole.
- the present invention is not limited to the above-described embodiment, and includes various modifications.
- a part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.
- The invention can be applied in all technical fields to which machine learning is applicable, for example, social infrastructure systems.
Description
The requirements are:
(1) creation of an innovative information control device under the various constraints of embedded equipment (hardware scale, power, and computing performance);
(2) a technology that makes effective use of the physically distant computing resources that networks make available in the IoT era; and
(3) a system capable of autonomous control, given the expectation that the IoT era will bring enormous systems with trillions of sensors.
For learning, importance is placed on:
(1) having sufficient input data available for carrying out the learning; and
(2) for a neural-network-type learning machine, having abundant computational resources (computing performance, hardware scale, and so on), since computation proportional to the number of neurons is required.
(3) Flexible adaptation (low latency, fast feedback) is also required. Moreover, when a large number of terminals is considered as IoT,
(4) handling as a so-called complex system becomes necessary.
The data conveyed between the tiers may include:
(1) intermediate-layer data generated by the first-tier machine learning/recognition device;
(2) the neural-network structure when the machine learning device is implemented with an FPGA;
(3) the weight coefficients of inter-neuron operations;
(4) the discrimination rate and discrimination-score (histogram) information obtained when the first-tier machine learning/recognition device discriminates input data; and
(5) correction information from supervised learning when on-the-job training is carried out in the first-tier machine learning/recognition device.
Compared with the basic embodiment, the first-tier machine learning/recognition device can no longer be used on its own, but there is the effect that optimization of the overall system including the first and second tiers can be realized.
During this period, reconfiguration data is transmitted from the second-tier machine learning/recognition device to the first-tier machine learning/recognition device, and the system is set so that the first-tier device can realize efficient recognition even on its own.
H = W ・ V ・・・ (1)
an inner-product operation as in equation (1) is required; conversely, for the computation from the upper layer to the lower layer,
V = WT ・ H ・・・ (2)
that is, an inner product with the transposed matrix of W, is required. The operation is explained concretely using the network shown in FIG. 18(A) as an example.
(1) When computing the upper-layer values from the lower layer:
The data input to the V register is fed to each adder (i010, i020, i030, i040), the corresponding weight coefficients of the W array are fed to the multiplication units (i000, i100, i200, i300), and multiplication is performed; initially, “0” is input to i020, i120, i220, and i320 and added. Next, the V register is shifted (rotated) to the left, and the corresponding V-register values are fed to the multiplication units. This effectively supplies the multiplication units with the data of the incremented W-register address. After the multiplication, sw01, sw11, sw21, and sw31 are turned OFF and sw02, sw12, sw22, and sw32 are turned ON, so that the data stored in each accumulator is input to the addition unit and added. This is carried out over all elements. As a result,
V0*W00+V1*W10+V2*W20+V3*W30 ・・・ (3)
V0*W01+V1*W11 +V2*W21+V3*W31 ・・・ (4)
V0*W02+V1*W12 +V2*W22+V3*W32 ・・・ (5)
are obtained. Since this mode does not use the results of the adjacent arithmetic units, it is called the self-computation mode.
In this case, the data stored in each accumulator is passed to the addition unit of the adjacent product-sum circuit, which effectively executes a diagonally shifted operation over the W array.
Repeating the above operation yields the following:
H2*W32+H1*W31+H0*W30 ・・・ (6)
H0*W00+H2*W02+H1*W01 ・・・ (7)
H1*W11+H0*W10+H2*W12 ・・・ (8)
H2*W22+H1*W21+H0*W20 ・・・ (9)
Since this mode uses the results of the adjacent arithmetic units, it is called the mutual-computation mode.
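The self- and mutual-computation modes over the diagonally shifted weight storage can be modeled in software as follows. A square 4 × 4 case is used for clarity, and the storage layout `S` is this sketch's reading of the scheme (an assumption), verified against ordinary matrix products.

```python
import numpy as np

N = 4
rng = np.random.default_rng(2)
W = rng.integers(1, 9, size=(N, N)).astype(float)  # W[i, j]: lower node i <-> upper node j

# Skewed storage: row k holds the k-th shifted diagonal, S[k][u] = W[(u+k) % N][u],
# i.e. each successive storage row shifts the column elements by one more position.
S = np.array([[W[(u + k) % N][u] for u in range(N)] for k in range(N)])

def self_mode(v):
    """H = W^T v: rotate the V register each step, accumulate locally."""
    acc = np.zeros(N)
    for k in range(N):
        acc += S[k] * np.roll(v, -k)   # each unit multiplies its own stored weight
    return acc

def mutual_mode(h):
    """V = W h: partial sums hop to the adjacent unit's accumulator each step."""
    acc = np.zeros(N)
    for k in range(N):
        acc += np.roll(S[k] * h, k)    # pass the product to the neighbouring unit
    return acc

v = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 0.5, 2.0, 1.5])
```

A single stored copy of W thus serves both directions: the forward pass rotates the input register (self mode), and the reverse pass shifts the accumulators between neighbours (mutual mode), matching the two modes described in the text.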
2nd HRCY  second-tier machine learning/recognition device
3rd HRCY  third-tier machine learning/recognition device
IL  input layer
HL  hidden layer
OL  output layer
DNN  deep-neural-network machine learning/recognition unit
WUD  weight coefficient update line (WUD: Weight coefficient UpDate)
NWCD  neural-network configuration information data transmission line
WCD  weight coefficient change line
WCU  weight coefficient adjustment circuit (WCU: Weight Change Unit)
DNNCC  DNN network configuration control unit
DDATA  detection data
LM  learning module
DD  deviation detection (DD: Deviation Detection) unit
TDS  teacher data
DS  data storage unit
n i j  j-th node in layer i
nd i j,k  connection line between the j-th node in layer i and the k-th node in layer i+1
AU  arithmetic unit
w i j,k  weight coefficient for computing the value of the k-th node in layer i+1 with the j-th node in layer i as input
DNN#  identification number of the DNN network mounted in the first-tier machine learning/recognition device
WPN#  pattern number of the weight coefficients of the DNN network mounted in the first-tier machine learning/recognition device
RES_COMP
Det_rank  ranking information of detection results
UD Req  update-request issuance information for the neural network of the first-tier machine learning/recognition device
UD Prprd  update-completion information for the neural network of the first-tier machine learning/recognition device
CRAM  FPGA configuration information storage memory
LEU  lookup-table storage unit
SWU  switch unit
DSP  arithmetic hardware operation unit
RAM  in-FPGA memory
IO  data input/output circuit unit
IN_DATA  input data of the first-tier machine learning/recognition device
STORAGE  temporary storage unit for data transferred from the first-tier to the second-tier machine learning/recognition device
CLASS_DATA  database storing information transmitted from the plural first-tier machine learning/recognition devices
NW  network
CL11  convolution layer
PL11  pooling layer
FL11  fully connected layer
Claims (15)
- 1. An information processing system in which a plurality of DNNs are arranged hierarchically, and the hidden-layer data of the DNN of a first-tier machine learning/recognition device is used as the input data of the DNN of a second-tier machine learning/recognition device.
- 2. The information processing system according to claim 1, wherein, after supervised learning is performed on the DNN of the first-tier machine learning/recognition device so that its output layer produces a desired output, supervised learning of the DNN of the second-tier machine learning/recognition device is performed.
- 3. The information processing system according to claim 1, wherein the first-tier machine learning/recognition device is provided with means for storing the score of the recognition result at the same time as recognition processing is performed, and with update-request transmitting means for transmitting, to the second-tier machine learning/recognition device, an update request signal for the neural-network structure and weight coefficients of the DNN of the first-tier device when the recognition result exceeds a predetermined threshold 1, falls below a predetermined threshold 2, or the variance of a histogram of the recognition results exceeds a predetermined value; the second-tier machine learning/recognition device, upon receiving the update request signal, updates the neural-network structure and weight coefficients of the DNN of the first-tier device and transmits the update data to the first-tier device; and the first-tier device constructs a new neural network based on the update data.
- 4. The information processing system according to claim 1, wherein the first-tier machine learning/recognition device has a learning module that performs learning processing; storage means for storing the weight coefficient information of the learning results, recognition-result rating information, and intermediate-layer data information; and means for transmitting an update request signal to the second-tier machine learning/recognition device when the neural network of the first-tier device needs to be updated.
- 5. The information processing system according to claim 1, wherein the connection between the first-tier and second-tier machine learning/recognition devices carries only the input from the first-tier device to the second-tier device.
- 6. The information processing system according to claim 1, wherein the first-tier machine learning/recognition device has a storage device that temporarily holds the hidden-layer values of the DNN, and the second-tier machine learning/recognition device has a mechanism that holds the data of the storage device as an input-data database.
- 7. The information processing system according to claim 1, wherein there are a plurality of the first-tier machine learning/recognition devices, and the transmission of the input data from the plurality of first-tier devices to a single second-tier device is connected either directly or via a network using at least one of wired and wireless links.
- 8. The information processing system according to claim 1, comprising a plurality of the second-tier machine learning/recognition devices, wherein the hidden-layer data from one first-tier device is shared by the plurality of second-tier devices.
- 9. The information processing system according to claim 1, wherein the second-tier machine learning/recognition device is provided with a duplicate of the DNN of the first-tier device; in parallel with the learning or recognition processing in the first-tier device, the second-tier device also performs learning based on the input data from the first-tier device; and the neural-network configuration information and weight coefficient information obtained as the learning result in the second-tier device are transmitted to the first-tier device, whose neural network and weight coefficients are thereby updated.
- 10. The information processing system according to claim 1, wherein the hardware scale of the second-tier machine learning/recognition device is larger than that of the first-tier machine learning/recognition device.
- 11. A method for operating an information processing system composed of a plurality of DNNs, wherein the plurality of DNNs form a multilayer structure including a first-tier machine learning/recognition device and a second-tier machine learning/recognition device; the information processing capability of the second-tier device is higher than that of the first-tier device; and the hidden-layer data of the DNN of the first-tier device is used as the input data of the DNN of the second-tier device.
- 12. The method of operating an information processing system according to claim 11, wherein the configuration of the neural network of the DNN of the first-tier device is controlled based on the processing results of the second-tier device.
- 13. The method of operating an information processing system according to claim 11, wherein a plurality of the first-tier devices observe a single inspection target; the hidden-layer data of the first-tier devices obtained in the course of the observation are conveyed to the second-tier device; the second-tier device performs learning based on the hidden-layer data and builds a database for computing the neural-network structure and weight coefficients of the first-tier devices; the period of this learning and database construction in the second-tier device is defined as a learning enhancement period for the first-tier devices; and, after the learning is completed, the second-tier device sets the neural networks and weight coefficients of the first-tier devices, and an actual operation period is defined in which recognition learning is operated by the first-tier and second-tier devices together.
- 14. The method of operating an information processing system according to claim 11, wherein, toward the construction of the plurality of first-tier devices, a first learning period for initial neural-network construction is provided in the second-tier device; thereafter, the learning data acquired in the first learning period is implemented in the first-tier devices, and a second learning period is provided in which supervised learning is advanced while the first-tier devices are in actual operation; and, after the second learning period ends, a third learning period is provided in which machine-learning recognition control using the first-tier devices is carried out and cooperative learning with the second-tier device is advanced as necessary.
- 15. A machine learning arithmetic unit for a multilayer neural network, having means for computing second-layer data from first-layer data and, conversely, first-layer data from second-layer data; wherein, for both computations, weight data determine the relationship between each datum of the first layer and each datum of the second layer; the weight data are stored in a single memory unit as one complete weight coefficient matrix; an arithmetic unit composed of product-sum operators corresponds one-to-one to the computation of each matrix element of the weight coefficient matrix; the matrix elements are stored in the memory unit with the row vectors of the matrix as the basic unit, and the matrix computation is performed per stored basic unit; the first row is held with its components in the same order as the corresponding column vector of the original matrix; the second row is held with its components shifted by one element to the right or left; the third row is held shifted by one further element in the same direction as the second row; and the final, N-th row is held shifted by one further element in the same direction as the (N-1)-th row; when computing the first-layer data from the second-layer data using the weight coefficient matrix, the second-layer data are arranged like a column vector and each element is input to the product-sum operators while the first row of the weight coefficient matrix is simultaneously input, the products are computed, and the results are stored in the accumulators; when computing the second and subsequent rows, the second-layer data are shifted by one element to the left or right for each row operation, the element data of the corresponding row of the weight coefficient matrix are multiplied by the rearranged second-layer data, and the result is added to the data stored in the accumulator of the same arithmetic unit, repeating up to the N-th row; when computing the second-layer data from the first-layer data, the first-layer data are likewise arranged and multiplied row by row, shifting the first-layer data by one element per row operation, but the accumulator contents of each arithmetic unit are input to the addition unit of the adjacent arithmetic unit, added to the result of the multiplication, and stored in that accumulator, repeating up to the N-th row of the weight matrix.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2016/063072 WO2017187516A1 (ja) | 2016-04-26 | 2016-04-26 | 情報処理システムおよびその運用方法 |
JP2018513989A JP6714690B2 (ja) | 2016-04-26 | 2016-04-26 | 情報処理システム、情報処理システムの運用方法、および機械学習演算器 |
US15/761,217 US20180260687A1 (en) | 2016-04-26 | 2016-04-26 | Information Processing System and Method for Operating Same |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2016/063072 WO2017187516A1 (ja) | 2016-04-26 | 2016-04-26 | 情報処理システムおよびその運用方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017187516A1 true WO2017187516A1 (ja) | 2017-11-02 |
Family
ID=60161328
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/063072 WO2017187516A1 (ja) | 2016-04-26 | 2016-04-26 | 情報処理システムおよびその運用方法 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20180260687A1 (ja) |
JP (1) | JP6714690B2 (ja) |
WO (1) | WO2017187516A1 (ja) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109523752A (zh) * | 2018-11-28 | 2019-03-26 | 润电能源科学技术有限公司 | 一种设备故障预警方法、装置、电子设备和介质 |
JP2019061578A (ja) * | 2017-09-27 | 2019-04-18 | 富士フイルム株式会社 | 学習支援装置、学習支援装置の作動方法、学習支援プログラム、学習支援システム、および端末装置 |
JP2019095932A (ja) * | 2017-11-20 | 2019-06-20 | 日本製鉄株式会社 | 異常判定方法及び装置 |
WO2019131527A1 (ja) * | 2017-12-26 | 2019-07-04 | 株式会社エイシング | 汎用学習済モデルの生成方法 |
WO2019216513A1 (ko) * | 2018-05-10 | 2019-11-14 | 서울대학교산학협력단 | 행 단위 연산 뉴럴 프로세서 및 이를 이용한 데이터 처리 방법 |
JP2020038658A (ja) * | 2018-09-04 | 2020-03-12 | 株式会社ストラドビジョン | エッジイメージを利用して物体を検出する学習方法及び学習装置、並びにそれを利用したテスト方法及びテスト装置 |
WO2020102887A1 (en) * | 2018-11-19 | 2020-05-28 | Tandemlaunch Inc. | System and method for automated design space determination for deep neural networks |
WO2020188794A1 (ja) * | 2019-03-20 | 2020-09-24 | 株式会社日立国際電気 | 映像システム、撮像装置、および映像処理装置 |
JP2021503668A (ja) * | 2017-11-21 | 2021-02-12 | インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation | マルチ・タスク学習を用いた特徴抽出方法、コンピュータ・システム、およびコンピュータ・プログラム製品(マルチ・タスク学習を用いた特徴抽出) |
JP2021503653A (ja) * | 2017-11-17 | 2021-02-12 | タレス・ディス・フランス・エス・ア | 通信装置によって行われるユーザ認証を改善する方法 |
WO2021028967A1 (ja) * | 2019-08-09 | 2021-02-18 | 三菱電機株式会社 | 制御システム、サーバ、機器、制御方法およびプログラム |
JP2021081930A (ja) * | 2019-11-18 | 2021-05-27 | 日本放送協会 | 学習装置、情報分類装置、及びプログラム |
KR20210066545A (ko) * | 2019-11-28 | 2021-06-07 | 광주과학기술원 | 반도체 소자의 시뮬레이션을 위한 전자 장치, 방법, 및 컴퓨터 판독가능 매체 |
DE102018002781B4 (de) | 2017-04-13 | 2021-07-22 | Fanuc Corporation | Schaltungskonfigurations-Optimierungsvorrichtung und maschinelle Lernvorrichtung |
JP2021526253A (ja) * | 2018-05-23 | 2021-09-30 | モビディウス リミテッド | 深層学習システム |
JP2023511864A (ja) * | 2020-01-15 | 2023-03-23 | グーグル エルエルシー | 小さいフットプリントのマルチチャネルキーワードスポッティング |
US11741026B2 (en) | 2020-08-31 | 2023-08-29 | Samsung Electronics Co., Ltd. | Accelerator, method of operating an accelerator, and electronic device including an accelerator |
JP7475509B2 (ja) | 2018-10-19 | 2024-04-26 | ジェネンテック, インコーポレイテッド | 畳み込みニューラルネットワークによる凍結乾燥製剤における欠陥検出 |
JP7475150B2 (ja) | 2020-02-03 | 2024-04-26 | キヤノン株式会社 | 推論装置、推論方法、及びプログラム |
US11983879B2 (en) | 2018-11-30 | 2024-05-14 | Fujifilm Corporation | Image processing apparatus, image processing method, and program |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180129900A1 (en) * | 2016-11-04 | 2018-05-10 | Siemens Healthcare Gmbh | Anonymous and Secure Classification Using a Deep Learning Network |
KR20180060149A (ko) * | 2016-11-28 | 2018-06-07 | 삼성전자주식회사 | 컨볼루션 처리 장치 및 방법 |
US10769501B1 (en) * | 2017-02-15 | 2020-09-08 | Google Llc | Analysis of perturbed subjects using semantic embeddings |
CN108062246B (zh) * | 2018-01-25 | 2019-06-14 | 北京百度网讯科技有限公司 | 用于深度学习框架的资源调度方法和装置 |
US11126649B2 (en) | 2018-07-11 | 2021-09-21 | Google Llc | Similar image search for radiology |
CN109445688B (zh) * | 2018-09-29 | 2022-04-15 | 上海百功半导体有限公司 | 一种存储控制方法、存储控制器、存储设备及存储系统 |
KR102161758B1 (ko) * | 2018-10-24 | 2020-10-05 | 아주대학교 산학협력단 | 사용자 단말, 서버 및 이를 포함하는 클라이언트 서버 시스템 |
KR20200063289A (ko) * | 2018-11-16 | 2020-06-05 | 삼성전자주식회사 | 영상 처리 장치 및 그 동작방법 |
CN110059076A (zh) * | 2019-04-19 | 2019-07-26 | 国网山西省电力公司电力科学研究院 | 一种输变电线路设备的故障数据库半自动化建立方法 |
EP3767548A1 (en) * | 2019-07-03 | 2021-01-20 | Nokia Technologies Oy | Delivery of compressed neural networks |
EP3767549A1 (en) * | 2019-07-03 | 2021-01-20 | Nokia Technologies Oy | Delivery of compressed neural networks |
US11886963B2 (en) | 2020-12-01 | 2024-01-30 | OctoML, Inc. | Optimizing machine learning models |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254086A1 (en) * | 2011-03-31 | 2012-10-04 | Microsoft Corporation | Deep convex network with joint use of nonlinear random projection, restricted boltzmann machine and batch-based parallelizable optimization |
JP2015166962A (ja) * | 2014-03-04 | 2015-09-24 | NEC Corporation | Information processing apparatus, learning method, and program
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0926948A (ja) * | 1995-07-11 | 1997-01-28 | Fujitsu Ltd | Information processing apparatus using a neural network
JP4121061B2 (ja) * | 2002-01-10 | 2008-07-16 | Mitsubishi Electric Corporation | Class identification apparatus and class identification method
JP5816771B1 (ja) * | 2015-06-08 | 2015-11-18 | Preferred Networks, Inc. | Learning device unit
2016
- 2016-04-26 US US15/761,217 patent/US20180260687A1/en not_active Abandoned
- 2016-04-26 JP JP2018513989A patent/JP6714690B2/ja not_active Expired - Fee Related
- 2016-04-26 WO PCT/JP2016/063072 patent/WO2017187516A1/ja active Application Filing
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102018002781B4 (de) | 2017-04-13 | 2021-07-22 | Circuit configuration optimization apparatus and machine learning apparatus
JP2019061578A (ja) * | 2017-09-27 | 2019-04-18 | FUJIFILM Corporation | Learning support apparatus, operation method of learning support apparatus, learning support program, learning support system, and terminal device
JP2021503653A (ja) * | 2017-11-17 | 2021-02-12 | Thales DIS France SA | Method for improving user authentication performed by a communication device
JP2019095932A (ja) * | 2017-11-20 | 2019-06-20 | Nippon Steel Corporation | Abnormality determination method and apparatus
JP7046181B2 (ja) | 2017-11-21 | 2022-04-01 | Feature extraction method using multi-task learning, computer system, and computer program product (feature extraction using multi-task learning)
JP2021503668A (ja) * | 2017-11-21 | 2021-02-12 | International Business Machines Corporation | Feature extraction method using multi-task learning, computer system, and computer program product (feature extraction using multi-task learning)
WO2019131527A1 (ja) * | 2017-12-26 | 2019-07-04 | Aising Ltd. | Method for generating universal learned model
JPWO2019131527A1 (ja) * | 2017-12-26 | 2020-01-16 | Aising Ltd. | Method for generating universal learned model
US11568327B2 (en) | 2017-12-26 | 2023-01-31 | Aising Ltd. | Method for generating universal learned model |
WO2019216513A1 (ko) * | 2018-05-10 | 2019-11-14 | Seoul National University R&DB Foundation | Row-wise operation neural processor and data processing method using the same
JP2021526253A (ja) * | 2018-05-23 | 2021-09-30 | Movidius Ltd. | Deep learning system
JP7372010B2 (ja) | 2018-05-23 | 2023-10-31 | Movidius Ltd. | Deep learning system
JP2020038658A (ja) * | 2018-09-04 | 2020-03-12 | StradVision, Inc. | Learning method and learning device for detecting objects using edge images, and test method and test device using the same
JP7475509B2 (ja) | 2018-10-19 | 2024-04-26 | Genentech, Inc. | Defect detection in lyophilized formulations using convolutional neural networks
WO2020102887A1 (en) * | 2018-11-19 | 2020-05-28 | Tandemlaunch Inc. | System and method for automated design space determination for deep neural networks |
CN109523752A (zh) * | 2018-11-28 | 2019-03-26 | Rundian Energy Science and Technology Co., Ltd. | Equipment fault early-warning method and apparatus, electronic device, and medium
CN109523752B (zh) * | 2018-11-28 | 2021-01-29 | Rundian Energy Science and Technology Co., Ltd. | Equipment fault early-warning method and apparatus, electronic device, and medium
US11983879B2 (en) | 2018-11-30 | 2024-05-14 | Fujifilm Corporation | Image processing apparatus, image processing method, and program |
WO2020188794A1 (ja) * | 2019-03-20 | 2020-09-24 | Hitachi Kokusai Electric Inc. | Video system, imaging apparatus, and video processing apparatus
JP7108780B2 (ja) | 2019-03-20 | 2022-07-28 | Hitachi Kokusai Electric Inc. | Video system, imaging apparatus, and video processing apparatus
JPWO2020188794A1 (ja) * | 2019-03-20 | 2021-12-09 | Hitachi Kokusai Electric Inc. | Video system, imaging apparatus, and video processing apparatus
US11881013B2 (en) | 2019-03-20 | 2024-01-23 | Hitachi Kokusai Electric Inc. | Video system |
JP7483104B2 (ja) | 2019-08-09 | 2024-05-14 | Mitsubishi Electric Corporation | Control system
WO2021028967A1 (ja) * | 2019-08-09 | 2021-02-18 | Mitsubishi Electric Corporation | Control system, server, device, control method, and program
JPWO2021028967A1 (ja) * | 2019-08-09 | 2021-02-18 | ||
JP2021081930A (ja) * | 2019-11-18 | 2021-05-27 | Japan Broadcasting Corporation | Learning apparatus, information classification apparatus, and program
KR20210066545A (ko) * | 2019-11-28 | 2021-06-07 | Gwangju Institute of Science and Technology | Electronic device, method, and computer-readable medium for simulation of semiconductor devices
KR102293791B1 (ko) * | 2019-11-28 | 2021-08-25 | Gwangju Institute of Science and Technology | Electronic device, method, and computer-readable medium for simulation of semiconductor devices
JP7345667B2 (ja) | 2020-01-15 | 2023-09-15 | Google LLC | Small-footprint multichannel keyword spotting
JP2023511864A (ja) * | 2020-01-15 | 2023-03-23 | Google LLC | Small-footprint multichannel keyword spotting
JP7475150B2 (ja) | 2020-02-03 | 2024-04-26 | Canon Inc. | Inference apparatus, inference method, and program
US11741026B2 (en) | 2020-08-31 | 2023-08-29 | Samsung Electronics Co., Ltd. | Accelerator, method of operating an accelerator, and electronic device including an accelerator |
Also Published As
Publication number | Publication date |
---|---|
US20180260687A1 (en) | 2018-09-13 |
JPWO2017187516A1 (ja) | 2018-07-19 |
JP6714690B2 (ja) | 2020-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017187516A1 (ja) | Information processing system and operation method thereof | |
WO2020221200A1 (zh) | Neural network construction method, image processing method, and apparatus | |
CN109102065B (zh) | PSoC-based convolutional neural network accelerator | |
US11164074B2 (en) | Multi-core systolic processor system for neural network processing | |
WO2022083536A1 (zh) | Neural network construction method and apparatus | |
CN112183718B (zh) | Deep learning training method and apparatus for a computing device | |
WO2021057056A1 (zh) | Neural architecture search method, image processing method, apparatus, and storage medium | |
WO2019060290A1 (en) | Direct access hardware acceleration in a neural network | |
EP3710995B1 (en) | Deep neural network processor with interleaved backpropagation | |
WO2021244249A1 (zh) | Classifier training method, data processing method, system, and device | |
WO2022001805A1 (zh) | Neural network distillation method and apparatus | |
WO2021218517A1 (zh) | Method for obtaining neural network model, image processing method, and apparatus | |
WO2023221928A1 (zh) | Recommendation method, training method, and apparatus | |
JP2021510219A (ja) | Convolutional neural network hardware accelerator based on a multicast network-on-chip and operation method thereof | |
WO2022179492A1 (zh) | Pruning method for a convolutional neural network, data processing method, and device | |
KR102562344B1 (ko) | Neural network processor for a device having a network processor and a convolution processor | |
US20220207327A1 (en) | Method for dividing processing capabilities of artificial intelligence between devices and servers in network environment | |
CN112163601A (zh) | Image classification method, system, computer device, and storage medium | |
CN111931901A (zh) | Neural network construction method and apparatus | |
CN114037882A (zh) | Edge artificial intelligence apparatus, electronic apparatus, and method thereof | |
WO2021036397A1 (zh) | Method and apparatus for generating target neural network model | |
WO2024114659A1 (zh) | Summary generation method and related device | |
CN114662646A (zh) | Method and apparatus for implementing a neural network | |
WO2023197857A1 (zh) | Model partitioning method and related device | |
JP6957659B2 (ja) | Information processing system and operation method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2018513989 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15761217 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 16900389 Country of ref document: EP Kind code of ref document: A1 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 16900389 Country of ref document: EP Kind code of ref document: A1 |