WO2018101476A1

WO2018101476A1 - Information processing device, information processing method, and information processing program

Info

Publication number: WO2018101476A1
Application number: PCT/JP2017/043366
Authority: WO
Inventors: 完曽我部; 東馬曽我部
Original assignee: 株式会社グリッド; 国立大学法人電気通信大学
Priority date: 2016-12-01
Filing date: 2017-12-01
Publication date: 2018-06-07

Abstract

An information processing device according to one embodiment of the present invention comprises: an acquisition unit for acquiring learning data for training a neural network including a plurality of neurons; and a computation unit for optimizing the neural network on the basis of the learning data. The computation unit optimizes a neural network in which each of the plurality of neurons is provided with: a gradient generation layer for applying a gradient to the neuron; and an opening and closing mechanism for opening and closing connections to adjacent neurons.

Description

Information processing apparatus, information processing method, and information processing program

The present invention relates to an information processing apparatus, an information processing method, and an information processing program for optimizing a neural network.

In conventional neural networks, neurons are treated as mathematical functions. For example, in the neural network described in Patent Document 1, a connection weight is set for each neuron, and input / output is given from a neuron in an adjacent layer by forward propagation or back propagation.

Patent Document 1: JP 2017-37392 A

As described above, in the neural network described in Patent Document 1, each neuron is only given input / output from an adjacent neuron, so the neuron serves only as a function variable. There is therefore a problem that such a neural network cannot be applied to real-world social infrastructure problems that include physical functions.

Therefore, in view of the above problems, an object of the present invention is to provide an information processing apparatus, an information processing method, and an information processing program for realizing a neural network that can be applied to real-world social infrastructure problems including physical functions.

In order to solve the above-described problem, an information processing apparatus according to an aspect of the present invention includes an acquisition unit that acquires learning data for training a neural network including a plurality of neurons, and a calculation unit that optimizes the neural network based on the learning data. The calculation unit optimizes a neural network in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons.

In the information processing apparatus according to an aspect of the present invention, the opening / closing mechanism may perform forward propagation or back propagation between the adjacent neurons by a predetermined optimum number of times.

In the information processing apparatus according to an aspect of the present invention, the predetermined optimum number of times may be a ratio of the number of times the gradient generation layer is updated and the number of times the gradient of back propagation is updated.

In the information processing apparatus according to an aspect of the present invention, the gradient generation layer may give the neuron a gradient from the outside independently of other neurons.

In the information processing apparatus according to one aspect of the present invention, when the neural network is applied to a power system, the calculation unit may associate the neuron with a node of the power system, and the gradient that the gradient generation layer gives to the neuron may correspond to a change in the supply and demand balance of power at each node of the power system.

In the information processing apparatus according to one aspect of the present invention, the gradient given to the neuron by the gradient generation layer may correspond to a variation in power generated independently at each node of the power system.

An information processing method according to an aspect of the present invention includes an acquisition step of acquiring learning data for training a neural network including a plurality of neurons, and a calculation step of optimizing the neural network based on the learning data. In the calculation step, a neural network in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons is optimized.

An information processing program according to an aspect of the present invention causes a computer to execute an acquisition function of acquiring learning data for training a neural network including a plurality of neurons, and a calculation function of optimizing the neural network based on the learning data. In the calculation function, a neural network in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons is optimized.

According to the present invention, by temporarily disconnecting connections between layers of an existing neural network and newly introducing a gradient generation layer added from the outside, it is possible to provide an information processing apparatus, an information processing method, and an information processing program for realizing a neural network that can be applied to real-world social infrastructure problems including physical functions.

FIG. 1 is a schematic diagram showing the structure of the neural network in the embodiment of the present invention.
FIG. 2 is a diagram for explaining the outline of the neural network 1 in the embodiment of the present invention.
FIG. 3 is a block diagram showing a configuration example of the information processing apparatus in the embodiment of the present invention.
FIG. 4 is a diagram showing a configuration example of the neural network 1 in the embodiment of the present invention.
FIG. 5 is a block diagram showing a configuration example of the calculation unit in the embodiment of the present invention.
FIG. 6 is a schematic diagram showing an example of the power energy transmission network in the embodiment of the present invention.
FIG. 7 is a flowchart showing an operation example of the information processing apparatus in the embodiment of the present invention.

Hereinafter, an information processing apparatus according to an embodiment of the present invention will be described in detail with reference to the drawings.

<Embodiment>
FIG. 1 is a schematic diagram showing the structure of a neural network 1 according to an embodiment of the present invention. As shown in FIG. 1, in the embodiment of the present invention, a connection between layers of an existing neural network 1 is temporarily disconnected, and a gradient generation layer added from the outside is newly introduced.

The essence of artificial intelligence technology is to artificially reproduce, with a computer, the biological mechanisms that have been elucidated in the intelligence of the human brain. Looking back at the history of artificial intelligence, groundbreaking discoveries have occurred when the biological mechanisms of the brain were artificially reproduced with computers. For example, the perceptron is the first neuron device that models visual and brain neural activity, focusing on the function of neurons, the smallest unit of nerves. Similarly, mechanisms derived from the biology of the brain include the convolutional neural network, derived from the receptive field, and reinforcement learning, derived from the basal ganglia. Further, deep learning, which belongs to the third wave of artificial intelligence, has a corresponding relationship with the brain architecture, that is, the hierarchical structure of the spinal cord, brainstem, midbrain, cerebral neocortex, and hippocampus in order from the lowest layer.

Here, in existing deep learning, there is no clear physical meaning linking the object to be identified and the neurons in the neural network, and each neuron functions merely as a function variable of computational intelligence. This closed nature of the neuron representation is a factor that hinders the development of deep learning.

On the other hand, the AI (Artificial Intelligence) currently under development is general-purpose AI, that is, general-purpose artificial intelligence having the process of “observation → identification → action” that is required in the early stage of human growth. In other words, general-purpose AI can be said to be human-like AI.

Therefore, the neural network 1 according to the embodiment of the present invention extends the human-like general-purpose AI algorithm to give the artificial intelligence the “physical” function of the neuron. Giving the “physical” function means that each calculation object is a calculation unit and also has a physical function such as “transmission, metabolism, proliferation, growth, degeneration and regeneration”.

FIG. 2 is a diagram for explaining the outline of the neural network 1 in the embodiment of the present invention. As shown in FIG. 2, each neuron of the neural network 1 in the embodiment of the present invention is defined with a unique “physical device function”. For example, as shown in FIG. 2, a physical device function “manufacturing” of social infrastructure is defined as a physical function “proliferation / growth” for the neuron A of the neural network 1. Similarly, a physical device function “movement” is defined for the neuron B as a physical function “transmission”. For the neuron C, a physical device function “sensor” is defined as a physical function “degeneration and regeneration”. Further, for the neuron D, a physical device function “energy” is defined as a physical function “metabolism”. Thus, the physical device function as the social infrastructure function is associated with the neuron of the neural network 1 according to the embodiment of the present invention. As a result, the neural network 1 in the embodiment of the present invention can construct the neural network 1 having flexibility that can cope with various problems in the real world.

As shown in FIG. 1, the neural network 1 in the embodiment of the present invention can continue to update the gradient until convergence by alternately executing “forward propagation” and “back propagation”, as in machine learning with an existing neural network. In addition, in the neural network 1 according to the embodiment of the present invention, the neurons in each layer are released from the back propagation of the existing neural network, and the gradient can be updated from the gradient generation layer. The back propagation function is not canceled completely. Instead, the ratio of the number of gradient generation layer updates to the number of back propagation gradient updates is set as a predetermined optimum number of times, and a back propagation delay function that performs back propagation based on this predetermined optimum number of times is incorporated. By using the neural network 1 that realizes weak coupling in this way (that is, the neural network 1 that performs back propagation according to the predetermined optimum number of times), it becomes possible to give all neurons a physical function.
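
The back propagation delay described above can be pictured as a simple update schedule. The following is a minimal sketch, not the method of the embodiment: it assumes that the predetermined optimum number of times is an integer `ratio`, meaning `ratio` gradient generation layer updates for every one back propagation update. The function name `update_schedule` and the labels `"generated"` and `"backprop"` are hypothetical.

```python
def update_schedule(num_steps, ratio):
    """Return, for each step, which gradient source updates the neurons.

    Assumption (not from the source): `ratio` updates from the gradient
    generation layer are performed for every one back propagation update.
    """
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    schedule = []
    for step in range(num_steps):
        # Every (ratio + 1)-th step closes the connections and back-propagates;
        # all other steps leave them open and use the generated gradient.
        if (step + 1) % (ratio + 1) == 0:
            schedule.append("backprop")
        else:
            schedule.append("generated")
    return schedule


schedule = update_schedule(num_steps=8, ratio=3)
print(schedule)
```

With `ratio=3`, back propagation runs on every fourth step, so the connections between adjacent neurons are closed only occasionally, which is one way to read the "weak coupling" mentioned in the text.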

By using neurons having a physical function as in the neural network 1 in the embodiment of the present invention, the network can be applied not only to existing image processing but also to traffic, human flow, and physical distribution. In addition, the neural network 1 according to the embodiment of the present invention can be applied to an IoT (Internet of Things) platform by providing a neuron with a sensor function as a physical function. Moreover, the neural network 1 in the embodiment of the present invention can be applied to a management platform of a production factory by giving a neuron a manufacturing function as a physical function. Thus, the neural network 1 in the embodiment of the present invention can be applied to various industries.

<Configuration of information processing apparatus>
FIG. 3 is a block diagram illustrating a configuration example of the information processing apparatus 100. As illustrated in FIG. 3, the information processing apparatus 100 includes a communication unit 101, an input / output unit 102, a display unit 103, a storage unit 104, and a control unit 105.

The communication unit 101 is a communication interface capable of transmitting and receiving predetermined data and messages. The communication unit 101 is a communication interface capable of wireless communication, for example, and may include a function of communicating via a wireless LAN access point and a function of communicating via a wireless communication network such as LTE or CDMA. Further, it may include a function that can be connected to the network 3 via an access point. The access point provides communication using a wireless LAN communication system such as Wi-Fi that complies with the IEEE 802.11 standard, for example. For example, the communication unit 101 can receive learning data from another information processing apparatus (not shown).

The input / output unit 102 includes a function for inputting various operations on the information processing apparatus 100 and a function for outputting a processing result processed by the information processing apparatus 100. The input / output unit 102 is, for example, a touch panel, and can detect contact with a pointing tool such as a user's finger or stylus and a contact position thereof. In addition, the input / output unit 102 may be, for example, a keyboard, a pointing device such as a mouse, a device that accepts operation input by voice, or the like. The input / output unit 102 may also be a sound output device such as a speaker, a 3D (three-dimensional) output device, a hologram output device, or the like, and includes a function of outputting a processing result. The input / output unit 102 is not limited to these, and may be any device.

The display unit 103 is, for example, a monitor such as a liquid crystal display or OELD (organic electroluminescence display). The display unit 103 may be realized by a device capable of displaying an image, text information, or the like in a space, such as a head mounted display (HMD), projection mapping, or a hologram.

The storage unit 104 includes a function of storing various programs and various data necessary for the information processing apparatus 100 to operate. The storage unit 104 is realized by various storage media such as an HDD, an SSD, and a flash memory.

The storage unit 104 stores, for example, a driver program, an operating system program, an application program, data, and the like used for various processes in the control unit 105. For example, the storage unit 104 stores, as a driver program, a communication driver program that executes an IEEE 802.11 standard wireless communication method or a mobile communication (cellular communication) wireless communication method. The storage unit 104 stores an input device driver program, an output device driver program, and the like. The storage unit 104 may store various text data, video data, image data, and the like, or temporarily store temporary data related to a predetermined process.

Further, the storage unit 104 stores learning data. The learning data is data for learning the neural network 1. The learning data can be obtained by actual measurement or simulation, for example. The learning data may include a set of input data and teacher data as a result criterion.

Further, the storage unit 104 may store the neural network 1 optimized by the control unit 105.

The control unit 105 has a function for executing a predetermined function by a code or an instruction in the program, and is, for example, a central processing unit (CPU). The control unit 105 may be a microprocessor, a multiprocessor, an ASIC, an FPGA, or the like, for example. The control unit 105 is not limited to these examples.

As shown in FIG. 3, the control unit 105 includes an acquisition unit 106 and a calculation unit 107.

The acquisition unit 106 acquires learning data stored in the storage unit 104. The acquisition unit 106 acquires the initial structure of the neural network 1. The acquisition unit 106 acquires the neural network 1 having an initial structure by inputting from the outside or by setting in advance.

The calculation unit 107 performs a calculation for optimizing the neural network 1 having the initial structure acquired by the acquisition unit 106. For example, the calculation unit 107 optimizes the neural network 1 using the learning data acquired by the acquisition unit 106. The calculation unit 107 executes, for example, a neuron generation process, a neuron disappearance process, a gradient generation process, and an opening / closing process between adjacent neurons.

<Configuration of Neural Network 1>
FIG. 4 is a diagram illustrating a configuration example of the neural network 1 according to the embodiment of the present invention. As shown in FIG. 4, the neural network 1 optimized by the control unit 105 includes a plurality of neurons 200 including a mechanism that can be opened and closed, and a gradient generation layer 300 corresponding to each of the plurality of neurons 200. The openable / closable mechanism included in each of the plurality of neurons 200 has a function of opening / closing a connection with the adjacent neuron 200 by a predetermined optimum number of times. The optimized neural network 1 executes “forward propagation” and “back propagation” in the existing neural network with the adjacent neuron 200 when the openable / closable mechanism is closed. On the other hand, in a state where the openable / closable mechanism is open, “forward propagation” and “backpropagation” are not executed between adjacent neurons 200. That is, the optimized neural network 1 frees the neurons in each layer from back propagation. The predetermined optimum number of times is, for example, the ratio of the number of gradient generation layer updates to the number of back propagation gradient updates.

The gradient generation layer 300 provides a physical function to each of the plurality of neurons 200. The gradient generation layer 300 has a function of adding a gradient to the neuron 200 from the outside. For example, the gradient generation layer 300 has a function of giving (updating) input / output (a gradient) of dynamic data or static data to the neuron 200. Note that the gradient given by the gradient generation layer 300 does not come from adjacent neurons, as it does in the “forward propagation” and “back propagation” of an existing neural network. The gradient generation layer 300 gives a gradient to a given neuron 200 independently of the other neurons 200.
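
The interplay between the opening / closing mechanism and the gradient generation layer 300 can be illustrated with a toy neuron. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the class `GatedNeuron`, its fields, and the rule that a closed connection uses the propagated gradient while an open connection uses the externally generated gradient are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class GatedNeuron:
    """Toy neuron with an opening/closing mechanism (all names hypothetical)."""
    value: float = 0.0
    gate_closed: bool = True  # closed = connected to adjacent neurons

    def update(self, propagated_grad: float, generated_grad: float,
               lr: float = 0.1) -> float:
        # Assumption: a closed gate uses the gradient propagated from adjacent
        # neurons; an open gate ignores it and uses the external gradient
        # supplied by this neuron's own gradient generation layer.
        grad = propagated_grad if self.gate_closed else generated_grad
        self.value -= lr * grad
        return grad


connected = GatedNeuron(value=1.0, gate_closed=True)
isolated = GatedNeuron(value=1.0, gate_closed=False)
used_connected = connected.update(propagated_grad=5.0, generated_grad=2.0)
used_isolated = isolated.update(propagated_grad=5.0, generated_grad=2.0)
print(used_connected, used_isolated)
```

The second neuron is updated without any reference to its neighbors, which corresponds to the statement that the gradient is given to a neuron independently of the other neurons.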

For example, when the neural network 1 is applied to the power system, the gradient generation layer 300 can give the neuron 200 input / output of dynamic data such as sunlight, wind power, and power consumption. In addition, the gradient generation layer 300 can give the neuron 200 input / output of static data such as a storage battery or a sensor.

Also, the optimized neural network 1 includes a neuron 200A serving as an input layer, a neuron 200B serving as an output layer, and a plurality of neurons 200C serving as intermediate layers, similarly to the existing neural network. However, each of these neurons 200 includes a mechanism that can open and close a connection between adjacent neurons 200 and a gradient generation layer that updates the gradient, as described above.

As described above, the optimized neural network 1 can construct a neural network 1 having flexibility that can cope with various problems in the real world, because each neuron 200 has a physical function.

Hereinafter, a case where the neural network 1 according to the embodiment of the present invention is applied to a power system will be described as an example. However, it goes without saying that the neural network 1 in the embodiment of the present invention is applicable to various systems other than the power system.

Here, the calculation unit 107 that optimizes the neural network 1 will be described. FIG. 5 is a diagram illustrating a configuration example of the calculation unit 107 in the embodiment of the present invention. As illustrated in FIG. 5, the calculation unit 107 includes a neuron generation processing unit 171, a neuron disappearance processing unit 172, a gradient generation processing unit 173, and an opening / closing processing unit 174.

The neuron generation processing unit 171 has a function of assigning physical device functions in various social infrastructures as neurons 200 and generating the neurons 200 when the neural network 1 is associated with various social infrastructures in the real world. For example, when the neural network 1 is applied to the power system, the neuron generation processing unit 171 generates the neuron 200 by associating the neuron 200 with a house, building, factory, or home that consumes power.

The neuron disappearance processing unit 172 has a function of erasing the corresponding neuron 200 in the neural network 1 when the physical device function is lost in various social infrastructures in the real world. For example, when the neural network 1 is applied to the power system, the neuron disappearance processing unit 172 erases the neurons 200 associated with houses, buildings, factories, homes, etc. that no longer consume power.

The gradient generation processing unit 173 has a function of giving input / output (gradient) of dynamic data and static data to the neuron 200 using the gradient generation layer 300 provided for each neuron 200. For example, when the neural network 1 is applied to the power system, the gradient generation processing unit 173 gives the power obtained by solar power generation or wind power generation, or the power discharged from the storage battery, as input from the gradient generation layer 300 to the neuron 200. In addition, the gradient generation processing unit 173 releases the power charged into the storage battery or the power used by the sensor from the neuron 200 as an output to the gradient generation layer 300.

The opening / closing processing unit 174 has a function of opening / closing a connection between adjacent neurons 200. The opening / closing processing unit 174 opens / closes the connection between the adjacent neurons 200 based on a predetermined optimum number of times. For example, when the neural network 1 is applied to a power system, the power flow between adjacent power supply destinations (houses, buildings, rooms, etc.) corresponding to each neuron 200 is turned on or off (connected or disconnected) based on the predetermined optimum number of times.

As described above, the neural network 1 can be associated with various social infrastructures in the real world by the function of the calculation unit 107, and a neural network 1 having flexibility that can deal with various problems in the real world can be constructed.
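
The four processes of the calculation unit 107 (neuron generation, neuron disappearance, gradient generation, and opening / closing) can be sketched for the power system example as a small container class. This is an illustrative sketch only: the class `GridNetwork`, its method names, the node names, and the use of kilowatts with a positive sign for surplus are all assumptions that do not appear in the source.

```python
class GridNetwork:
    """Toy container mirroring the four processes of the calculation unit."""

    def __init__(self):
        self.neurons = {}  # node name -> net power in kW (positive = surplus)
        self.links = {}    # frozenset({a, b}) -> True if closed (connected)

    def generate_neuron(self, name):
        # Neuron generation: a new power-consuming node joins the system.
        self.neurons[name] = 0.0

    def remove_neuron(self, name):
        # Neuron disappearance: drop the node and every link touching it.
        del self.neurons[name]
        self.links = {k: v for k, v in self.links.items() if name not in k}

    def give_gradient(self, name, delta_kw):
        # Gradient generation: external input (+) or output (-) at one node,
        # applied independently of all other nodes.
        self.neurons[name] += delta_kw

    def set_link(self, a, b, closed):
        # Opening / closing: switch the power flow between two nodes on or off.
        if a not in self.neurons or b not in self.neurons:
            raise KeyError("both nodes must exist")
        self.links[frozenset((a, b))] = closed


net = GridNetwork()
net.generate_neuron("house")
net.generate_neuron("factory")
net.set_link("house", "factory", closed=True)
net.give_gradient("house", 3.5)     # e.g. solar generation at the house
net.give_gradient("factory", -2.0)  # e.g. consumption at the factory
net.remove_neuron("factory")        # the factory no longer consumes power
print(net.neurons, net.links)
```

Keeping the links in a dictionary keyed by node pairs makes it straightforward to remove all connections of a node when its neuron disappears.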

<Equivalence of electric energy transmission network and neural network 1>
The equivalence between the power energy transmission network and the neural network 1 will be described. Electric power energy is transmitted continuously without directivity. Then, when a change occurs in the power supply / demand balance in the nodes included in the power system, the change given to the flow of power energy can be described by Expression (1).

In Equation (1), H_{i,U} is a network transmission matrix. As a result, the flow of electricity in the transmission line in the steady state is as shown in Equation (2).

Also, the supply and demand balance objective function is expressed by equation (3).

In Equation (3), L is the total network loss. When equation (3) is transformed, an identity relation such as equation (4) is obtained.

In Equation (4), the ideal case is that the objective function L is 0. When the error function E is further introduced, Equation (5) is obtained.

In Equation (5), the error function shown in Equation (6) is the cross entropy used when updating the gradient of the existing neural network.

That is, the equivalence between the transmission network of power energy and the neural network 1 is proved. Further, it is proved that there is an equivalence between the minimum search of the error function in the neural network 1 and the transmission loss of the power energy transmission network being zero.

When searching for the minimum value of the error function in the neural network 1, the minimum value is searched for by repeating updates until the gradient of the error function with respect to the learning parameter becomes zero. Regarding the gradient in the supply-demand balance, Equation (7) holds.

FIG. 6 is a schematic diagram showing an example of a power energy transmission network. Based on Equation (7), the physical meaning of the gradient of the power energy transmission network is as follows. That is, the current flows while being driven by a voltage resulting from a gradient from a high potential to a low potential. As illustrated in FIG. 6, in the power (current × voltage) energy transmission network, current flows in the downward direction of the voltage gradient. In FIG. 6, the voltage is expressed by the gradient of each “line”, and the current is expressed by the thickness of the “line”. From the equivalence of the power energy transmission network and the neural network 1, it is ensured that the algorithm that searches for the minimum value of the error function in the neural network 1 can be used to search for the minimum value of the objective function L of the supply and demand balance in the power energy transmission network, that is, “L → 0”. In other words, as shown in FIG. 6, the neural network 1 of the embodiment of the present invention can be applied to a power energy transmission network by giving each node (neuron) a physical function.

In the neural network 1 applied to the power energy transmission network, the voltage gradient is updated by alternately executing forward propagation and back propagation, which makes it possible to calculate the minimum value of the objective function of the supply and demand balance.
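
The idea of updating until the gradient of the objective becomes zero can be illustrated with plain gradient descent. Since Equations (1) to (7) are not reproduced in this text, the following sketch does not use them. It assumes a hypothetical quadratic objective L = (sum of generator outputs − demand)², and the function name `minimize_imbalance` and all numerical values are illustrative only.

```python
def minimize_imbalance(outputs, demand, lr=0.1, tol=1e-9, max_iter=10_000):
    """Gradient descent on the toy objective L = (sum(outputs) - demand)**2.

    Assumption: this quadratic stands in for the supply and demand balance
    objective; it is not the objective function of the source.
    """
    x = list(outputs)
    for _ in range(max_iter):
        imbalance = sum(x) - demand
        grad = 2.0 * imbalance  # dL/dx_i, identical for every generator i
        if abs(grad) < tol:     # stop once the gradient is (numerically) zero
            break
        x = [xi - lr * grad for xi in x]
    loss = (sum(x) - demand) ** 2
    return x, loss


dispatch, loss = minimize_imbalance([1.0, 2.0], demand=5.0)
print(dispatch, loss)
```

In this toy setting the search stops when supply matches demand, which mirrors the “L → 0” condition stated in the text.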

<Gradient generation type neural network 1>
In a power energy transmission network, there are local power plants such as solar power generation and wind power generation, and power storage facilities having a function of temporarily storing energy such as a storage battery. Such a power energy transmission network including power plants and power storage facilities can be handled by using the gradient generation type neural network 1.

The gradient generation type neural network 1 matches the topology of the existing recurrent neural network. The existing recurrent neural network needs to perform back propagation through time, and at that time, the update of the parameter θ is executed by Equation (8).

Because of the vanishing gradient problem, the calculation beyond time T assumes that δ_T is 0, so that only the calculation of Equation (9) is performed.

In the gradient generation type neural network 1, δ_T can be generated from a synthetic gradient layer. Furthermore, by introducing a gradient prediction layer, it is possible to predict three steps ahead. As described above, the gradient generation type neural network 1 plays a role of complementing the existing recurrent neural network, leading to improvement of the identification system.
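
The difference between truncating the gradient at the window boundary and supplying a generated gradient there can be shown on a scalar recurrent model. The following is a sketch under stated assumptions and does not reproduce Equations (8) and (9), which are not available in this text. It assumes the toy recurrence h_t = w · h_{t-1} + x_t with a squared-error loss, and the function `window_gradient` and its parameter `delta_boundary` are hypothetical. Here `delta_boundary` plays the role of δ_T: 0.0 corresponds to plain truncation, and a non-zero value stands in for the output of a synthetic gradient layer. In this sketch the boundary value is computed exactly from the following window rather than predicted by a learned layer.

```python
def window_gradient(w, h_prev, xs, ys, delta_boundary=0.0):
    """Gradient of a windowed loss w.r.t. w for h_t = w * h_{t-1} + x_t.

    Loss per step: 0.5 * (h_t - y_t)**2 (an assumption of this sketch).
    delta_boundary is the gradient dL/dh arriving from the first time step
    beyond the window. Returns (grad_w, delta at the first step in the window).
    """
    # Forward pass through the window.
    hs = [h_prev]
    for x in xs:
        hs.append(w * hs[-1] + x)
    # Backward pass, seeded with the boundary gradient.
    delta = delta_boundary
    grad_w = 0.0
    for t in range(len(xs), 0, -1):
        delta = (hs[t] - ys[t - 1]) + w * delta  # dL/dh_t
        grad_w += delta * hs[t - 1]              # dh_t/dw = h_{t-1}
    return grad_w, delta


w = 0.5
full_grad, _ = window_gradient(w, 0.0, [1.0] * 4, [0.0] * 4)

# Split the same sequence into two windows of two steps each.
# After the first window, the hidden state is 1.5 (1.0, then 0.5 * 1.0 + 1.0).
late_grad, late_delta = window_gradient(w, 1.5, [1.0, 1.0], [0.0, 0.0])
truncated, _ = window_gradient(w, 0.0, [1.0, 1.0], [0.0, 0.0])
with_boundary, _ = window_gradient(w, 0.0, [1.0, 1.0], [0.0, 0.0],
                                   delta_boundary=late_delta)
print(full_grad, truncated + late_grad, with_boundary + late_grad)
```

With the boundary gradient supplied, the two windows together recover the gradient of the full sequence, whereas setting it to zero loses the contribution of later time steps to earlier ones.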

<Operation Example of Information Processing Device 100>
FIG. 7 is a flowchart showing an operation example of the information processing apparatus 100 according to the embodiment of the present invention.

As shown in FIG. 7, the acquisition unit 106 of the information processing apparatus 100 acquires learning data for learning the neural network 1 including a plurality of neurons (S101).

Based on the learning data acquired by the acquisition unit 106, the calculation unit 107 of the information processing apparatus 100 optimizes the neural network 1 in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons (S102).

As described above, in the embodiment of the present invention, the connection between the layers of the existing neural network is temporarily disconnected, and a gradient generation / synthesis layer added from the outside is newly introduced. Therefore, the neurons of the neural network 1 according to the embodiment of the present invention can be associated with physical device functions as social infrastructure functions. As a result, the embodiment of the present invention makes it possible to construct a neural network 1 having flexibility that can cope with various problems in the real world.

<Supplement>
It goes without saying that the information processing apparatus 100 according to the above embodiment is not limited to the above embodiment, and may be realized by other methods.

In the above embodiment, the display unit 103 may be a device outside the information processing device 100.

In the above-described embodiment, each functional unit constituting the information processing apparatus 100 is realized by a processor executing the information processing program or the like. Alternatively, each functional unit may be realized by a logic circuit (hardware) or a dedicated circuit formed in an integrated circuit (IC (Integrated Circuit) chip, LSI (Large Scale Integration)) or the like. In addition, these circuits may be realized by one or a plurality of integrated circuits, and the functions of the plurality of functional units described in the above embodiments may be realized by a single integrated circuit. An LSI may be called a VLSI, a super LSI, an ultra LSI, or the like depending on the degree of integration.

The information processing program may be recorded on a processor-readable recording medium. As the recording medium, a “non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The information processing program may be supplied to the processor via an arbitrary transmission medium (such as a communication network or a broadcast wave) that can transmit the information processing program. The present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the information processing program is embodied by electronic transmission.

The information processing program can be implemented using, for example, a script language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), or a markup language such as HTML5.

The configurations described in the above embodiment and each supplement may be combined as appropriate.

DESCRIPTION OF SYMBOLS
1 Neural network
100 Information processing apparatus
101 Communication unit
102 Input / output unit
103 Display unit
104 Storage unit
105 Control unit
106 Acquisition unit
107 Calculation unit
200 Neuron
300 Gradient generation layer

Claims

1. An information processing apparatus comprising: an acquisition unit that acquires learning data for training a neural network including a plurality of neurons; and a calculation unit that optimizes the neural network based on the learning data, wherein the calculation unit optimizes a neural network in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons.
2. The information processing apparatus according to claim 1, wherein the opening / closing mechanism performs forward propagation or back propagation between the adjacent neurons by a predetermined optimum number of times.
3. The information processing apparatus according to claim 2, wherein the predetermined optimum number of times is a ratio of the number of updates of the gradient generation layer to the number of gradient updates of the back propagation.
4. The information processing apparatus according to any one of claims 1 to 3, wherein the gradient generation layer gives a gradient to the neuron from the outside independently of other neurons.
5. The information processing apparatus according to claim 1, wherein, when the neural network is applied to a power system, the calculation unit associates the neuron with a node of the power system, and the gradient given to the neuron by the gradient generation layer corresponds to a change in a power supply / demand balance in each node of the power system.
6. The information processing apparatus according to claim 5, wherein the gradient given to the neuron by the gradient generation layer corresponds to power generated independently at each node of the power system.
7. An information processing method comprising: an acquisition step of acquiring learning data for training a neural network including a plurality of neurons; and a calculation step of optimizing the neural network based on the learning data, wherein, in the calculation step, a neural network in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons is optimized.
8. An information processing program causing a computer to execute: an acquisition function of acquiring learning data for training a neural network including a plurality of neurons; and a calculation function of optimizing the neural network based on the learning data, wherein, in the calculation function, a neural network in which each of the plurality of neurons is provided with a gradient generation layer that gives a gradient to the neuron and an opening / closing mechanism that opens and closes a connection between adjacent neurons is optimized.