WO2007096954A1 - Neural network device and its method - Google Patents

Neural network device and its method

Info

Publication number
WO2007096954A1
WO2007096954A1 (PCT/JP2006/303147)
Authority
WO
WIPO (PCT)
Prior art keywords
learning
layer
input
error
output
Prior art date
Application number
PCT/JP2006/303147
Other languages
French (fr)
Japanese (ja)
Inventor
Kohei Arai
Original Assignee
Saga University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Saga University filed Critical Saga University
Priority to PCT/JP2006/303147 priority Critical patent/WO2007096954A1/en
Priority to JP2008501514A priority patent/JP5002821B2/en
Publication of WO2007096954A1 publication Critical patent/WO2007096954A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • The present invention relates to a neural network device that realizes an information processing mechanism modeled on the workings of the brain, and in particular to a neural network device that can efficiently find the global solution, rather than a local solution, for the weighting coefficients between the input layer and the hidden layer.
  • a neural network is an information processing mechanism that mimics the mechanism of a human brain.
  • the human brain has an excellent ability to easily achieve processing that is difficult to achieve with a computer.
  • The computers we use today are of the von Neumann type: they compute extremely fast and are exceptionally powerful at solving well-formulated problems, but for problems such as the pattern recognition that humans perform routinely, formulation is difficult and implementation becomes very hard. Neural networks were therefore created to realize the basic human functions of recognition, memory, and judgment on a computer, taking the information processing mechanisms of the human brain as a hint.
  • Neural networks have a structure that simplifies the information processing mechanism in the brain. There are about 14 billion nerve cells in the human brain, and these nerve cells are connected to each other.
  • Each neuron receives an input signal from another cell and sends an output signal to another cell when the sum of the input signals exceeds a certain value.
  • This propagation of information between cells enables processes such as recognition, memory, and judgment that are usually performed by humans.
  • The most distinctive feature of information processing in the brain is that a large number of cells, each simple in function and weak on its own, together realize complex and advanced processing as a whole; neural networks are a mechanism that exploits this advantage.
  • A nerve cell in the brain is modeled as an element called a “neuron”, and a network is constructed by arranging and connecting a large number of neurons.
  • Various problems can be dealt with by changing (learning) the parameters of each neuron according to the problem to be applied.
  • Fig. 10(a) is a schematic diagram of a neuron.
  • The cell body (soma) contains the nucleus and can be regarded as the main body of the neuron.
  • The dendrites are the many branch-like processes extending from the cell body and serve as the neuron's input terminals; the axon is the fiber extending from the cell body that serves as its output terminal; and the synapse, a small knob at the tip of a branch, contacts another neuron, connecting neurons and transmitting information between them.
  • Figure 10(c) shows a three-layer hierarchical neural network (capable of realizing an arbitrary mapping).
  • Neural network learning is performed as shown in Fig. 10(d).
  • Figures 11(b) and 11(c) are diagrams illustrating the structure of error back-propagation.
  • The total input to the j-th unit in the output layer is given by Eq. (8) in the description below.
  • The characteristics of neural networks are as follows: (1) learning ability, the necessary function can be formed automatically from presented input/output samples; (2) nonlinearity, even complex mappings that are difficult to formulate can be constructed easily through learning; (3) parallel processing, input signals are sent through the connections to many neurons and processed in parallel.
  • The present invention has been made to solve the above-described problem, and its object is to provide a neural network device capable of reaching a global optimum solution through learning in a short time with a small training data set.
  • As a result of the inventor's intensive study of neural networks, it was found that, as learning progresses, the input data become highly correlated with the weighting coefficients connecting the input layer and the hidden layer. Although this depends on the number of hidden-layer nodes, suppose the number of weighting coefficients equals the number of input-layer nodes (with a single hidden-layer node, every input-layer node is connected to that one hidden node, so the number of weighting coefficients matches the number of input-layer nodes); then the correlation between the input and the weighting coefficients becomes high as learning progresses toward the desired output.
  • Similarly, when the number of hidden-layer nodes is large, the correlation coefficient of Eq. (26) (given in the description below) can be computed by thinning out the weights or by averaging them to match the number of input-layer nodes, and the progress of learning can thereby be checked.
  • The method of obtaining the correlation coefficient is not limited to that formula; any formula that expresses the degree of correlation numerically can be used. A sketch of such a progress check follows.
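A minimal sketch of this progress check in Python, assuming NumPy, with the input given as a vector and the input-to-hidden weights as a matrix; the function name and the row-averaging used to match the number of input-layer nodes are illustrative choices, not taken from the patent:

```python
import numpy as np

def learning_progress(x, W):
    """Pearson correlation between an input vector x (length n) and the
    input-to-hidden weight matrix W (shape n_hidden x n).

    With several hidden nodes, the rows of W are averaged so that the
    result has the same length as the input, as the text suggests."""
    w = W.mean(axis=0)                       # average over hidden nodes
    xc, wc = x - x.mean(), w - w.mean()
    denom = np.sqrt((xc ** 2).sum() * (wc ** 2).sum())
    return float(xc @ wc / denom) if denom else 0.0
```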
  • The present invention takes as its basis a hierarchical neural network with error back-propagation learning based on the steepest descent method; by examining the correlation coefficient between the input data and the weighting coefficients between the input layer and the hidden layer, it checks the progress of learning and determines whether learning has fallen into a local solution or has reached a near-optimal solution close to the global optimum.
  • Furthermore, even if error back-propagation learning converges, if the correlation coefficient is low, that is, if learning has fallen into a local solution, the initial values are changed and learning is restarted; this learning method and a neural network device using it are also included in the invention.
  • The neural network device according to the present invention comprises an input layer having nodes, a hidden layer, and an output layer; it inputs input data to the input layer, compares the output data emitted from the output layer with teacher data prepared in advance for those input data, and uses the resulting error to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer.
  • It is a hierarchical neural network device based on error back-propagation learning that learns through these updates, and it determines the convergence of the learning on the basis of the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer.
  • Thus, according to the present invention, the weighting coefficients are updated using the error between the output data generated by the neural network and the output data of the teacher data, but learning is not terminated merely because that error has become small: its convergence is also judged from the correlation coefficient between the input data and the input-to-hidden weighting coefficients. Because a stall in the weight updates alone is not taken as convergence, learning does not end at a local solution; it can be brought to completion with a global optimum solution derived.
  • The neural network device according to the present invention comprises an input layer having nodes, a hidden layer, and an output layer.
  • Input data are fed to the input layer and output data are emitted from the output layer.
  • The output data are compared with teacher data prepared in advance for the input data, and the resulting error is used to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer.
  • This hierarchical neural network device based on error back-propagation learning is newly provided with a first learning-convergence determination unit, which judges the convergence of learning from the error of the comparison, and
  • a second learning-convergence determination unit, which judges the convergence of learning from the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer;
  • learning is terminated when both the first and the second learning-convergence determination units judge that it has converged.
  • More specifically, the first learning-convergence determination unit monitors the fluctuation of the comparison errors obtained so far and judges that learning has converged when this fluctuation has almost disappeared.
  • The neural network device according to the present invention comprises an input layer having nodes, a hidden layer, and an output layer.
  • Input data are fed to the input layer and output data are emitted from the output layer.
  • The output data are compared with teacher data prepared in advance for the input data, and the resulting error is used to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer.
  • This hierarchical neural network device based on error back-propagation learning is newly provided with a first learning-convergence determination unit, which judges the convergence of learning from the error of the comparison, and
  • a second learning-convergence determination unit, which judges the convergence of learning from the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer; after the first unit judges that learning has converged,
  • the second learning-convergence determination unit judges the convergence of learning, and learning is terminated when the second unit judges that it has converged.
  • Thus, according to the present invention, the convergence of learning is judged first from the error between the output data generated by the neural network and the output data of the teacher data.
  • Since this error must be derived anyway in order to reflect the weights in error back-propagation learning, it is efficient to judge convergence from the already-available error before judging convergence from the correlation coefficient between the input data and the input-to-hidden weighting coefficients.
  • In the neural network device according to the present invention, the second learning-convergence determination unit may, as needed, judge that learning has converged when the correlation coefficient is equal to or greater than a predetermined threshold.
  • In the neural network device according to the present invention, the second learning-convergence determination unit may, as needed, judge that learning has converged when the correlation coefficient satisfies the learning-convergence condition that its increasing trend has reached saturation.
  • In the neural network device according to the present invention, as needed, the second learning-convergence determination unit initializes the weighting coefficients and restarts learning when the learning-convergence condition is not met.
  • According to the present invention, when the error between the output data generated by the neural network and the output data of the teacher data indicates that learning has converged, but the correlation coefficient between the input data and the input-to-hidden weighting coefficients indicates that it has not, learning is judged to have fallen into a local solution without reaching the global optimum; by initializing the weighting coefficients and performing error back-propagation learning again, the global optimum solution can be obtained. Because it is difficult to escape a local solution and reach the global optimum no matter how many times learning is repeated from that state, initializing the weighting coefficients is desirable.
  • In the neural network device according to the present invention, when the weighting coefficients are initialized, the correlation coefficient between the initialized weighting coefficients and the weighting coefficients before initialization is computed as needed, and when this correlation coefficient is equal to or greater than a predetermined threshold, the weighting coefficients are initialized again.
  • Thus, even after initialization, if the weighting coefficients remain much the same as before, there is a high probability of falling into the same local solution when error back-propagation learning is performed again; in that case, initializing once more avoids wasted learning and allows the global optimum solution to be obtained efficiently.
  • The error back-propagation learning method for a hierarchical neural network device according to the present invention inputs input data to the input layer, compares the output data emitted from the output layer with teacher data prepared in advance for those input data, and learns by using the resulting error to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer,
  • and the convergence of the learning is determined on the basis of the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer.
  • the present invention can also be grasped as a method.
  • FIG. 1 is a block configuration diagram of the neural network device according to the first embodiment of the present invention.
  • FIG. 2 is a hardware configuration diagram of the computer on which the neural network device according to the first embodiment of the present invention is constructed.
  • FIG. 3 is an operation flowchart of the neural network device according to the first embodiment of the present invention during learning.
  • FIG. 4 is a partial alternative flowchart concerning the initialization of the neural network device according to the first embodiment of the present invention.
  • FIG. 5 Observation images and measured values at 17:11 on May 20, 2002, according to the example.
  • FIG. 6 Observation images and measured values at 16:45 on November 30, 2004, according to the example.
  • FIG. 7 The transition of the solution for the input and desired output of FIG. 6.
  • FIG. 8 The transition of the solution for the input and desired output of FIG. 5.
  • FIG. 9 The transition of the solution for the input and desired output of FIG. 5.
  • FIG. 10 An explanatory diagram of a neural network of the background art.
  • FIG. 11 An explanatory diagram of a neural network of the background art.
  • FIG. 1 is a block diagram of the neural network device according to this embodiment.
  • The neural network device according to this embodiment comprises: an input unit 10 that takes in the input data to be processed; a neural network mechanism unit 20 that processes the captured input data and generates output data; an output unit 30 that sends out the generated output data; an error back-propagation mechanism 40 that, during learning, computes the error between the output data generated by the neural network mechanism unit 20 and the output data of the teacher data, judges from the computed error whether the error has converged, and, when it judges that the error has not converged, performs error back-propagation to update the weighting coefficients; and a correlation coefficient mechanism 50 that, during learning, computes the correlation coefficient between the input data and the weighting coefficients between the input layer and the hidden layer and, when the error back-propagation mechanism 40 judges that the error has converged, judges whether the most recent correlation coefficient is below a predetermined threshold and whether the correlation coefficient is on an increasing trend that reaches saturation, initializing the neural network mechanism unit 20 when the most recent correlation coefficient is below the threshold, when the correlation coefficient is not increasing, or when the correlation coefficient, though increasing, does not reach saturation.
  • The error back-propagation mechanism 40 comprises: an error calculation unit 41 that computes, for each input of the input data during learning, the error between the output data and the output data of the teacher data; a weight coefficient reflection unit 42 that updates the weighting coefficients of the neural network mechanism unit 20 using the error obtained by the error calculation unit 41; and an error convergence determination unit 43 that judges from the fluctuation of the errors obtained by the error calculation unit 41 whether the error has converged.
  • The weight coefficient reflection unit 42 may be configured to apply the weight update only when the error convergence determination unit 43 judges that the error has not converged, or it may be configured to apply the update whenever the error calculation unit 41 obtains an error, independently of the error convergence determination unit 43. Even when the latter configuration is adopted, once the error convergence determination unit 43 judges that the error has converged and a global optimum solution has been obtained, no further processing is performed in the error calculation unit 41 or the weight coefficient reflection unit 42.
  • The correlation coefficient mechanism 50 comprises: a correlation coefficient calculation unit 51 that computes, for each input of the input data during learning, the correlation coefficient between the input data and the input-to-hidden weighting coefficients; a correlation coefficient condition determination unit 52 that judges whether the most recent correlation coefficient is below a predetermined threshold and whether the correlation coefficient is on an increasing trend that reaches saturation; and an initialization unit 53 that initializes the neural network mechanism unit 20 when the correlation coefficient condition determination unit 52 judges that the most recent correlation coefficient is below the threshold, that the correlation coefficient is not increasing, or that the correlation coefficient, though increasing, does not reach saturation. One possible reading of this condition is sketched below.
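The patent does not give a formula for deciding that the increasing trend has reached saturation; one plausible reading of the check performed by the correlation coefficient condition determination unit 52, sketched below, is to require the most recent correlation value to be at or above the threshold while its recent increase has flattened out. The window size, threshold, and flatness tolerance are assumptions:

```python
def correlation_condition_met(history, threshold=0.9, window=10, tol=1e-4):
    """Rough stand-in for the correlation coefficient condition
    determination unit 52: True when the latest correlation coefficient
    is not below the threshold and its increase has saturated."""
    if len(history) < window:
        return False
    recent = history[-window:]
    mean_step = (recent[-1] - recent[0]) / (window - 1)  # average slope
    return recent[-1] >= threshold and 0 <= mean_step < tol
```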
  • FIG. 2 is a hardware configuration diagram of the computer on which the neural network device according to this embodiment is constructed.
  • The computer 100 on which the neural network is constructed comprises a CPU (Central Processing Unit) 111, RAM (Random Access Memory) 112, ROM (Read Only Memory) 113, flash memory 114, an HD (hard disk) 115 as external storage, a LAN (Local Area Network) card 116, a mouse 117, a keyboard 118, a video card 119, a display 119a electrically connected to the video card 119 as a display device, a sound card 120, a speaker 120a electrically connected to the sound card 120 as a sound output device, and a drive 121 that reads and writes storage media such as flexible disks, CD-ROMs, and DVD-ROMs.
  • A person skilled in the art may of course vary the hardware components slightly, and a single neural network may be constructed across a plurality of computers.
  • Some (rather than all) of the modules may be built on each computer to achieve load distribution; a neural network can, of course, also be constructed on a grid computing system.
  • FIG. 3 is an operation flowchart of the neural network device according to this embodiment during learning.
  • The input unit 10 takes in input data and teacher data consisting of output data.
  • The captured input data are processed by the neural network mechanism unit 20, and the output unit 30 sends out the generated output data.
  • The error calculation unit 41 calculates the error between the output data and the output data of the teacher data (step 201).
  • The weight coefficient reflection unit 42 updates the weighting coefficients of the neural network using the calculated error (step 202).
  • The correlation coefficient calculation unit 51 calculates the correlation coefficient from the input-to-hidden weighting coefficients and the input data (step 211).
  • The error convergence determination unit 43 determines, using the errors obtained by the error calculation unit 41, whether the error has converged (step 221). If it is determined in step 221 that the error has not converged, the process returns to step 100.
  • Using the correlation coefficients obtained by the correlation coefficient calculation unit 51, the correlation coefficient condition determination unit 52 determines whether the most recent correlation coefficient is below the predetermined threshold (step 231). If the most recent correlation coefficient is determined in step 231 to be low, it is then determined whether the correlation coefficient has been increasing and has reached saturation (step 232). If the correlation coefficient is determined in step 231 not to be low, or is determined in step 232 to have been increasing and to have reached saturation, it is concluded that a global optimum solution has been obtained, and learning ends. If it is determined in step 232 that the correlation coefficient, though increasing, has not reached saturation, it is concluded that only a local solution has been obtained; the initial values of the neural network mechanism unit 20 are reset (step 241), and the process returns to step 100.
  • In this way, the neural network device calculates the error between the output data it generates and the output data of the teacher data, back-propagates that error to update the weighting coefficients, accumulates the calculated errors, and judges convergence from the fluctuation of the error; once the error has converged, it judges whether the continuously computed correlation coefficient satisfies a predetermined condition.
  • The predetermined condition is that the correlation coefficient is not smaller than the predetermined threshold and that its increasing trend has reached saturation. If this condition is satisfied, a global optimum solution has been found and learning is terminated; if it is not satisfied, only a local solution has been obtained.
  • In this embodiment, the fluctuation of the error between the output data and the output data of the teacher data is judged first and the correlation coefficient afterwards; it is equally possible to judge the correlation coefficient first and the error fluctuation afterwards. However, since the error between the output data and the teacher data must be derived anyway for error back-propagation learning, it is desirable to judge the error fluctuation first, as in the loop sketched below.
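Putting the two checks together, the overall flow of Fig. 3 might be organized as in the sketch below. This is a sketch under stated assumptions, not the patent's implementation: the `net` object with its `forward`, `backprop_update`, `W_in`, and `reinitialize` members is hypothetical, and `learning_progress` and `correlation_condition_met` are the illustrative helpers sketched earlier:

```python
def error_converged(errors, window=10, tol=1e-6):
    """Step 221: the fluctuation of the accumulated errors has
    almost disappeared."""
    return len(errors) >= window and \
        max(errors[-window:]) - min(errors[-window:]) < tol

def train(net, inputs, targets, max_restarts=10):
    """Error back-propagation with the two-stage convergence check:
    error fluctuation first (step 221), correlation condition afterwards
    (steps 231-232), re-initializing on a suspected local solution."""
    for _ in range(max_restarts):
        errors, corrs = [], []
        while not error_converged(errors):
            output = net.forward(inputs)                         # step 100
            errors.append(net.backprop_update(output, targets))  # steps 201-202
            corrs.append(learning_progress(inputs, net.W_in))    # step 211
        if correlation_condition_met(corrs):                     # steps 231-232
            return net                # global optimum solution assumed found
        net.reinitialize()                                       # step 241
    return net
```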
  • Initialization is performed with random numbers.
  • Even when random numbers are used, the initialized weighting coefficients may turn out substantially the same as those before initialization; in that case, learning is likely to fall into the same local solution again. To prevent such wasted learning, when the correlation coefficient between the weighting coefficients after initialization and those before initialization is equal to or greater than a predetermined threshold, initialization is performed again.
  • Specifically, the current weighting coefficients are recorded (step 2411), the weighting coefficients are initialized (step 2412), the correlation coefficient between the recorded pre-initialization weighting coefficients and the post-initialization weighting coefficients is computed (step 2413), and it is determined whether this correlation coefficient is equal to or greater than the predetermined threshold (step 2414); if it is, the weighting coefficients are initialized again. A sketch of these steps follows.
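A sketch of steps 2411 to 2414 under the assumption of uniform random re-initialization; the threshold value of 0.5 is illustrative, not taken from the patent:

```python
import numpy as np

def reinitialize_weights(W_old, threshold=0.5, rng=None):
    """Steps 2411-2414: record the current weights (the caller passes
    them in as W_old), draw new random weights, and redraw for as long
    as the new weights remain too correlated with the previous ones."""
    if rng is None:
        rng = np.random.default_rng()
    while True:
        W_new = rng.uniform(-1.0, 1.0, size=W_old.shape)       # step 2412
        r = np.corrcoef(W_old.ravel(), W_new.ravel())[0, 1]    # step 2413
        if abs(r) < threshold:                                 # step 2414
            return W_new
```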
  • FIG. 5 shows thermal infrared images of NOAA/AVHRR bands 4 and 5 and the sea surface temperature estimated by the method called MCSST (multi-channel sea surface temperature).
  • Fig. 5(a) and Fig. 6(a) are the thermal infrared images of band 4, Fig. 5(b) and Fig. 6(b) are the thermal infrared images of band 5, and Fig. 5(c) and Fig. 6(c) are the values estimated by MCSST.
  • Figure 5 was observed at 17:11 on May 20, 2002, and Figure 6 at 16:45 on November 30, 2004.
  • Error back-propagation learning of the hierarchical neural network was performed using Figs. 5(a) and (b) or Figs. 6(a) and (b) as the input data and Fig. 5(c) or Fig. 6(c) as the desired output.
  • The average difference between the desired output and the actual output of the hierarchical neural network was evaluated as the average error.
  • The initial values of the weighting coefficients were given by uniform random numbers.
  • Figure 7 shows the transition of the solution for the input and desired output of Figure 6, and Figures 8 and 9 show the transitions for Figure 5.
  • Figures 7 and 8 relate to the image area off Miyazaki Prefecture
  • Figure 9 relates to the image area off Kokura.
  • Fig. 7(a) shows that the correlation coefficient converged at a low value without turning to an increasing trend, and Fig. 7(b) shows that the error decreased as learning proceeded and that the decreasing trend leveled off. It can therefore be seen from Fig. 7 that learning fell into a local solution.
  • Fig. 8(a) shows that the correlation coefficient passed through an increasing trend and then tended to converge, and Fig. 8(b) shows that the error decreased as learning proceeded and that the decreasing trend also converged. It can therefore be seen from Fig. 8 that a global optimum solution was obtained.
  • Fig. 9(a) shows that the correlation coefficient maintained an increasing trend, and Fig. 9(b) shows that the error fell sharply as learning proceeded. It can therefore be seen from Fig. 9 that a global optimum solution was obtained.
  • In these cases, the correlation coefficient between the input and the input-to-hidden weighting coefficients increases while the average error tends to decrease; that is, the correlation coefficient can be used as an index of learning convergence.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

[PROBLEM TO BE SOLVED] To provide a neural network device that can reach a global optimum solution through short learning with a small training data set. [MEANS FOR SOLVING THE PROBLEM] The weight coefficients are not only updated using the error between the output data generated by the neural network and those of the teacher data, with learning finished when the error becomes small; the convergence of learning is also judged on the basis of the correlation between the input data and the weight coefficients of the input layer and the hidden layer. Since convergence is not declared merely because the weight updates come to a dead end, but the correlation coefficients between the input data and the input-to-hidden weight coefficients are also used to judge convergence, a local solution does not bring learning to a finish; learning is completed only when a global optimum solution is reached.

Description

Specification
Neural network device and method thereof
Technical field
[0001] The present invention relates to a neural network device that realizes an information processing mechanism modeled on the workings of the brain, and in particular to a neural network device that can efficiently find the global solution, rather than a local solution, for the weighting coefficients between the input layer and the hidden layer.
Background art
[0002] A neural network is an information processing mechanism modeled on the workings of the human brain. The human brain has an outstanding ability to accomplish with ease processing that is difficult to realize on a computer. The computers we use today are of the von Neumann type: their computation is extremely fast and they are exceptionally powerful at solving well-formulated problems, but for problems such as the pattern recognition that humans perform routinely, formulation is difficult and implementation becomes very hard. Neural networks were therefore created to realize the basic human functions of recognition, memory, and judgment on a computer, taking the information processing mechanism of the human brain as a hint. A neural network has a structure that simplifies the information processing mechanism of the brain. The human brain contains roughly 14 billion nerve cells, which are interconnected. Each nerve cell receives input signals from other cells and sends an output signal to other cells when the sum of its inputs exceeds a certain value. This propagation of information between cells makes possible the recognition, memory, and judgment that humans perform routinely. The greatest feature of information processing in the brain is that a large number of cells, each simple in function and weak on its own, together realize complex and advanced processing as a whole; neural networks are a mechanism that exploits this advantage. Specifically, a nerve cell in the brain is modeled as an element called a "neuron", and a network is constructed by arranging and connecting many neurons. In actual applications, various problems can be handled by changing (learning) the parameters of each neuron to suit the problem at hand.
[0003] Fig. 10(a) is a schematic diagram of a neuron. The cell body (soma) contains the nucleus and can be regarded as the main body of the neuron; the dendrites are the many branch-like processes extending from the cell body and serve as the neuron's input terminals; the axon is the fiber extending from the cell body that serves as its output terminal; and the synapse, a small knob at the tip of a branch, contacts another neuron, connecting neurons and transmitting information between them.
[0004] When the neuron is modeled, it becomes as shown in Fig. 10(b); expressed as an equation, it is as follows.
[0005] [Equation 1]

$$u = \sum_{i} w_i x_i \qquad (1)$$

the weighted sum of the inputs (a linear discriminant function).
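As a concrete illustration of Eq. (1) and the unit of Fig. 10(b), a minimal sketch in Python; the logistic activation used here is a common smooth stand-in for the fire-above-threshold behaviour described above, not a detail taken from the patent:

```python
import numpy as np

def neuron(x, w, theta=0.0):
    """Weighted sum of the inputs (Eq. 1) passed through an activation;
    a sigmoid smooths the 'fire when the sum exceeds theta' rule."""
    u = np.dot(w, x) - theta          # weighted input sum minus threshold
    return 1.0 / (1.0 + np.exp(-u))
```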
[0006] Combining a large number of such units makes complex computations possible. Fig. 10(c) shows a three-layer hierarchical neural network (capable of realizing an arbitrary mapping).
Neural network learning is performed as shown in Fig. 10(d).
Fig. 11(a) shows the error.
[0007] [Equation 2] One need only find the weights (w, v) that minimize the error

$$E = \sum_{j}\left(y_j - \hat{y}_j\right)^2 \qquad (2)$$

The new weight is given by the following equation.

[0008] [Equation 3]

$$w(t+1) = w(t) - \varepsilon\,\frac{\partial E}{\partial w} \qquad (3)$$

where ε is a proportionality constant. Updating in this way leads to a minimum.

[0009] [Equation 4] Regarding the update as the change per unit time,

$$\Delta w(t) = -\varepsilon\,\frac{\partial E}{\partial w} \qquad (4)$$

[0010] [Equation 5]

$$\frac{dw_{ij}}{dt} = -\varepsilon\,\frac{\partial E}{\partial w_{ij}} \qquad (5)$$

[0011] [Equation 6] The error is a function of all the weights,

$$E = f\!\left(v_1,\ldots,v_n,\, w_{11},\ldots,w_{nm}\right), \qquad \varepsilon > 0 \qquad (6)$$

[0012] [Equation 7]

$$\frac{dE}{dt} = \sum_{i,j}\frac{\partial E}{\partial w_{ij}}\,\frac{dw_{ij}}{dt} = -\varepsilon\sum_{i,j}\left(\frac{\partial E}{\partial w_{ij}}\right)^2 \le 0 \qquad (7)$$
[0013] If the weights are changed so as to satisfy these equations, the error E is non-increasing with respect to time t.
There is a method of determining the weights of a neural network by presenting many items of learning data (pairs of an input and the desired output). The most widely used is the learning method based on the steepest descent method called error back-propagation.
Figs. 11(b) and (c) illustrate the structure of error back-propagation.
The total input to the j-th unit of the output layer is
[0014] [Equation 8]

$$x_j^{(n+1)} = \sum_{i} w_{ij}\, y_i^{(n)} \qquad (8)$$

[0015] [Equation 9]

$$y_j^{(n)} = \sigma\!\left(x_j^{(n)}\right) \qquad (9)$$

[0016] [Equation 10] The error is

$$E \equiv \sum_{j}\left(y_j^{(n+1)} - \hat{y}_j\right)^2 \qquad (10)$$

summed over the learning data, where ŷ_j is the desired output, and w is determined so that the error E becomes small.

[0017] [Equation 11] One keeps changing the weight w so as to follow

$$\Delta w = -\varepsilon\,\frac{\partial E}{\partial w} \qquad (11)$$

[0018] [Equation 12] We therefore compute

$$\frac{\partial E}{\partial w} \qquad (12)$$

See Fig. 11(d).

[0019] [Equation 13] By the chain rule,

$$\frac{\partial E}{\partial x_j^{(n)}} = \sum_{m}\frac{\partial E}{\partial x_m^{(n+1)}}\,\frac{\partial x_m^{(n+1)}}{\partial x_j^{(n)}} \qquad (13)$$

Here, from (8) and (9),

[0020] [Equation 14]

$$x_m^{(n+1)} = \sum_{j} w_{jm}\,\sigma\!\left(x_j^{(n)}\right) \qquad (14)$$

[0021] [Equation 15] Differentiating this with respect to $x_j^{(n)}$ (15),

[0022] [Equation 16] only the term

$$\frac{\partial x_m^{(n+1)}}{\partial x_j^{(n)}} = w_{jm}\,\sigma'\!\left(x_j^{(n)}\right) \qquad (16)$$

remains.

[0023] [Equation 17] The corresponding derivatives with respect to the other units are zero (17), and hence (13) becomes

[0024] [Equation 18]

$$\frac{\partial E}{\partial x_j^{(n)}} = \sigma'\!\left(x_j^{(n)}\right)\sum_{m} w_{jm}\,\frac{\partial E}{\partial x_m^{(n+1)}} \qquad (18)$$

[0025] [Equation 19] Next, the same derivative is required at the output layer,

$$\frac{\partial E}{\partial x_j^{(n+1)}} \qquad (19)$$

See Fig. 11(e).

[0026] [Equation 20] From (9) and (10),

$$\frac{\partial E}{\partial x_j^{(n+1)}} = 2\left(y_j^{(n+1)} - \hat{y}_j\right)\sigma'\!\left(x_j^{(n+1)}\right) \qquad (20)$$

[0027] [Equation 21] First,

$$\frac{\partial x_j^{(n)}}{\partial w_{ij}} = y_i^{(n-1)} \qquad (21)$$

[0028] [Equation 22] Next, combining this with the back-propagated derivative,

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial x_j^{(n)}}\,\frac{\partial x_j^{(n)}}{\partial w_{ij}} \qquad (22)$$

[0029] [Equation 23]

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial x_j^{(n)}}\, y_i^{(n-1)} \qquad (23)$$

[0030] The above computation proceeds backward from the output side to the input side, propagating

$$\frac{\partial E}{\partial x} \qquad (24)$$

[0031] while the gradients

$$\frac{\partial E}{\partial w} \qquad (25)$$

[0032] are computed. This is called error back-propagation learning.
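The derivation of Eqs. (8) to (25) condenses into a few lines for a three-layer network; the following is a generic sketch of one error back-propagation update under the notation above, with illustrative function names and learning rate, not code from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, t, W1, W2, eps=0.1):
    """One update of Eq. (11) for a three-layer network.
    x: input vector, t: desired output; W1, W2: input-to-hidden and
    hidden-to-output weight matrices (updated in place)."""
    h = sigmoid(W1 @ x)                           # Eqs. (8)-(9), hidden layer
    y = sigmoid(W2 @ h)                           # output layer
    delta_out = 2 * (y - t) * y * (1 - y)         # Eq. (20): dE/dx at output
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # Eq. (18): backward pass
    W2 -= eps * np.outer(delta_out, h)            # Eqs. (22)-(23): dE/dw
    W1 -= eps * np.outer(delta_hid, x)
    return float(np.sum((y - t) ** 2))            # Eq. (10): the error E
```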
The characteristics of neural networks are as follows.
(1) Learning ability: the necessary function can be formed automatically on the basis of presented input/output samples.
(2) Nonlinearity: through learning, even complex mapping relations that are difficult to formulate can be constructed easily.
(3) Parallel processing: input signals are sent through the connections to various neurons and processed in parallel.
Disclosure of the invention
Problems to be solved by the invention
[0033] Since a hierarchical neural network based on error back-propagation learning performs optimization learning by the steepest descent method, it does not guarantee a global optimum solution and risks falling into a local solution. Simulated annealing exists as a learning method for reaching the global optimum solution, but apart from the time it takes to learn, it requires a large number of input data sets and desired output data sets (called training data sets), which is not practical.
[0034] The present invention has been made to solve the above problem, and its object is to provide a neural network device capable of reaching a global optimum solution through learning in a short time with a small training data set.
Means for solving the problems
[0035] As a result of the inventor's intensive study of neural networks, it was found that, as learning progresses, the input data become highly correlated with the weighting coefficients connecting the input layer and the hidden layer. Although this depends on the number of hidden-layer nodes, suppose the number of weighting coefficients equals the number of input-layer nodes (with a single hidden-layer node, every input-layer node is connected to that one hidden node, so the number of weighting coefficients matches the number of input-layer nodes); then, between the input and the weighting coefficients obtained when learning progresses with a given input and desired output, there is the correlation
[0036] [Equation 26]

$$R_{xw} = \frac{\sum_{i,j}\left(x_{ij} - \bar{X}\right)\left(w_{ij} - \bar{W}\right)}{nm\,\sigma_X\,\sigma_W} \qquad (26)$$

(x_ij: input data; w_ij: the corresponding input-to-hidden weighting coefficients; X̄, W̄: their means; σ_X, σ_W: their standard deviations; nm: the number of elements)
[0037] which becomes high. Similarly, when the number of hidden-layer nodes is large, the correlation coefficient of this equation can be computed by thinning out the weights or by applying an averaging operation to match the number of input-layer nodes, so the progress of learning can be checked. The method of obtaining the correlation coefficient is not limited to the above formula; any formula that expresses the degree of correlation numerically can be used.
[0038] The present invention takes as its basis a hierarchical neural network with error back-propagation learning based on the steepest descent method; by examining the correlation coefficient between the input data and the weighting coefficients between the input layer and the hidden layer, it checks the progress of learning and determines whether learning has fallen into a local solution or has reached a near-optimal solution close to the global optimum. Furthermore, even if error back-propagation learning converges, if the correlation coefficient is low, that is, if learning has fallen into a local solution, the initial values are changed and learning is restarted; this learning method and a neural network device using it are also included in the invention.
[0039] (1) The neural network device according to the present invention comprises an input layer having nodes, a hidden layer, and an output layer; it is a hierarchical neural network device based on error back-propagation learning that inputs input data to the input layer, compares the output data emitted from the output layer with teacher data prepared in advance for those input data, and learns by using the resulting error to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer; and it determines the convergence of the learning on the basis of the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer.
[0040] Thus, according to the present invention, the weighting coefficients are updated using the error between the output data generated by the neural network and the output data of the teacher data, but learning is not terminated merely because that error has become small: its convergence is also judged from the correlation coefficient between the input data and the input-to-hidden weighting coefficients. Because a stall in the weight updates alone is not taken as convergence of the learning, learning does not end at a local solution; it can be brought to completion with a global optimum solution derived.
[0041] (2) The neural network device according to the present invention comprises an input layer having nodes, a hidden layer, and an output layer; input data are fed to the input layer, the output data emitted from the output layer are compared with teacher data prepared in advance for those input data, and the resulting error is used to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer. This hierarchical neural network device based on error back-propagation learning is newly provided with a first learning-convergence determination unit, which judges the convergence of learning from the error of the comparison, and a second learning-convergence determination unit, which judges the convergence of learning from the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer; learning is terminated when both units judge that it has converged.
More specifically, the first learning-convergence determination unit monitors the fluctuation of the comparison errors obtained so far and judges that learning has converged when this fluctuation has almost disappeared.
[0042] (3) The neural network device according to the present invention comprises an input layer having nodes, a hidden layer, and an output layer; input data are fed to the input layer, the output data emitted from the output layer are compared with teacher data prepared in advance for those input data, and the resulting error is used to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer. This hierarchical neural network device based on error back-propagation learning is newly provided with a first learning-convergence determination unit, which judges the convergence of learning from the error of the comparison, and a second learning-convergence determination unit, which judges the convergence of learning from the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer; after the first unit judges that learning has converged, the second unit judges the convergence of learning, and learning is terminated when the second unit judges that it has converged.
[0043] Thus, according to the present invention, the convergence of learning is judged first from the error between the output data generated by the neural network and the output data of the teacher data. Since this error must be derived anyway in order to reflect the weights in error back-propagation learning, it is efficient to judge convergence from the already-available error before judging convergence from the correlation coefficient between the input data and the input-to-hidden weighting coefficients.
[0044] (4) In the neural network device according to the present invention, the second learning-convergence determination unit may, as needed, judge that learning has converged when the correlation coefficient is equal to or greater than a predetermined threshold.
(5) In the neural network device according to the present invention, the second learning-convergence determination unit may, as needed, judge that learning has converged when the correlation coefficient satisfies the learning-convergence condition that its increasing trend has reached saturation.
(6) In the neural network device according to the present invention, as needed, the second learning-convergence determination unit initializes the weighting coefficients and restarts learning when the learning-convergence condition is not met.
[0045] Thus, according to the present invention, when it is judged from the error between the output data generated by the neural network and the output data of the teacher data that learning has converged, but it is judged from the correlation coefficient between the input data and the input-to-hidden weighting coefficients that it has not, learning has fallen into a local solution without the global optimum solution being found; by initializing the weighting coefficients and performing error back-propagation learning again, the global optimum solution can be obtained. Because it is difficult to escape a local solution and reach the global optimum no matter how many times learning is repeated from that state, initializing the weighting coefficients is desirable.
[0046] (7) In the neural network device according to the present invention, when the weighting coefficients are initialized, the correlation coefficient between the initialized weighting coefficients and the weighting coefficients before initialization is computed as needed, and when this correlation coefficient is equal to or greater than a predetermined threshold, the weighting coefficients are initialized again.
Thus, according to the present invention, even after initialization, if the weighting coefficients remain much the same as before, there is a high probability of falling into the same local solution when error back-propagation learning is performed again; in that case, initializing once more avoids wasted learning and allows the global optimum solution to be obtained efficiently.
[0047] (8) The error back-propagation learning method for a hierarchical neural network device according to the present invention inputs input data to the input layer, compares the output data emitted from the output layer with teacher data prepared in advance for those input data, and learns by using the resulting error to update the weighting coefficients between the nodes of the output layer and the hidden layer and between the nodes of the hidden layer and the input layer; the convergence of the learning is determined on the basis of the correlation coefficient between the input data and the weighting coefficients between the nodes of the input layer and the hidden layer.
Thus, the present invention can also be grasped as a method.
These summaries of the invention do not enumerate all the features essential to the present invention; sub-combinations of these features can also constitute inventions.
Brief description of the drawings
[0048]
[FIG. 1] A block configuration diagram of the neural network device according to the first embodiment of the present invention.
[FIG. 2] A hardware configuration diagram of the computer on which the neural network device according to the first embodiment of the present invention is constructed.
[FIG. 3] An operation flowchart of the neural network device according to the first embodiment of the present invention during learning.
[FIG. 4] A partial alternative flowchart concerning the initialization of the neural network device according to the first embodiment of the present invention.
[FIG. 5] Observation images and measured values at 17:11 on May 20, 2002, according to the example.
[FIG. 6] Observation images and measured values at 16:45 on November 30, 2004, according to the example.
[FIG. 7] The transition of the solution for the input and desired output of FIG. 6.
[FIG. 8] The transition of the solution for the input and desired output of FIG. 5.
[FIG. 9] The transition of the solution for the input and desired output of FIG. 5.
[FIG. 10] An explanatory diagram of a neural network of the background art.
[FIG. 11] An explanatory diagram of a neural network of the background art.
Explanation of reference numerals
[0049]
10 input unit
20 neural network mechanism unit
30 output unit
40 error back-propagation mechanism
41 error calculation unit
42 weight coefficient reflection unit
43 error convergence determination unit
50 correlation coefficient mechanism
51 correlation coefficient calculation unit
52 correlation coefficient condition determination unit
53 initialization unit
100 computer
111 CPU
112 RAM
113 ROM
114 flash memory
115 HD
116 LAN card
117 mouse
118 keyboard
119 video card
119a display
120 sound card
120a speaker
121 drive
BEST MODE FOR CARRYING OUT THE INVENTION
(First embodiment of the present invention)
(1) Block configuration
FIG. 1 is a block configuration diagram of the neural network device according to this embodiment. The neural network device according to this embodiment comprises: an input unit 10 that takes in the input data to be processed; a neural network mechanism unit 20 that processes the captured input data and generates output data; an output unit 30 that sends out the generated output data; an error back-propagation mechanism 40 that, during learning, computes the error between the output data generated by the neural network mechanism unit 20 and the output data of the teacher data, judges from the computed error whether the error has converged, and, when it judges that the error has not converged, performs error back-propagation to update the weighting coefficients; and a correlation coefficient mechanism 50 that, during learning, computes the correlation coefficient between the input data and the weighting coefficients between the input layer and the hidden layer and, when the error back-propagation mechanism 40 judges that the error has converged, judges whether the most recent correlation coefficient is below a predetermined threshold and whether the correlation coefficient is on an increasing trend that reaches saturation, initializing the neural network mechanism unit 20 when the most recent correlation coefficient is below the predetermined threshold, when the correlation coefficient is not increasing, or when the correlation coefficient, though increasing, does not reach saturation.
[0051] The error back-propagation mechanism 40 comprises: an error calculation unit 41 that, during learning, computes the error between the output data and the output data of the teacher data each time input data is presented; a weight coefficient reflection unit 42 that updates the weight coefficients of the neural network mechanism unit 20 using the error computed by the error calculation unit 41; and an error convergence determination unit 43 that judges from the variation of the errors computed by the error calculation unit 41 whether the error has converged. The mechanism may be configured so that the weight coefficient reflection unit 42 reflects the weight coefficients only when the error convergence determination unit 43 judges that the error has not converged, or so that the weight coefficient reflection unit 42 reflects the weight coefficients whenever the error calculation unit 41 computes an error, independently of the error convergence determination unit 43. Even under the latter configuration, once the error convergence determination unit 43 judges that the error has converged and a global optimal solution has been obtained, no further processing is performed by the error calculation unit 41 or the weight coefficient reflection unit 42.
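The following is a minimal sketch, in Python with NumPy, of how units 41 to 43 might be realized for a one-hidden-layer network. The sigmoid activation, mean-squared error, learning rate and convergence tolerance are illustrative assumptions not fixed by the patent, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W_in, W_out):
    # x: input vector (n_in,); W_in: (n_in, n_hid); W_out: (n_hid, n_out)
    h = sigmoid(x @ W_in)    # hidden-layer outputs
    y = sigmoid(h @ W_out)   # output-layer outputs
    return h, y

def backprop_step(x, t, W_in, W_out, lr=0.1):
    """Error calculation unit 41 and weight coefficient reflection unit 42:
    compute the error against the teacher data t, then take one
    steepest-descent step on both weight matrices (updated in place)."""
    h, y = forward(x, W_in, W_out)
    error = 0.5 * np.mean((y - t) ** 2)           # unit 41: error vs. teacher data
    d_out = (y - t) * y * (1.0 - y)               # output-layer delta
    d_hid = (d_out @ W_out.T) * h * (1.0 - h)     # delta propagated back to the hidden layer
    W_out -= lr * np.outer(h, d_out)              # unit 42: update hidden-to-output weights
    W_in -= lr * np.outer(x, d_hid)               # unit 42: update input-to-hidden weights
    return error

def error_converged(errors, eps=1e-6):
    """Error convergence determination unit 43: the error change has flattened."""
    return len(errors) >= 2 and abs(errors[-1] - errors[-2]) < eps
```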
[0052] The correlation coefficient mechanism 50 comprises: a correlation coefficient calculation unit 51 that, during learning, computes the correlation coefficient between the input data and the weight coefficients between the input layer and the hidden layer each time input data is presented; a correlation coefficient condition determination unit 52 that judges whether the most recent correlation coefficient is lower than a predetermined threshold and whether the correlation coefficient is in an increasing trend that reaches saturation; and an initialization unit 53 that initializes the neural network mechanism unit 20 when the correlation coefficient condition determination unit 52 judges that the most recent correlation coefficient is lower than the predetermined threshold, that the correlation coefficient is not in an increasing trend, or that the correlation coefficient is increasing but does not reach saturation.
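A corresponding sketch of units 51 and 52 follows, under the assumption that the correlation is the Pearson coefficient between the input vector and a per-input-node summary of the input-to-hidden weights; the patent leaves the exact formula open, and the threshold and saturation tolerance are illustrative values. The condition check follows steps 231-232 of the Fig. 3 flow described in the operation section below.

```python
import numpy as np

def input_weight_correlation(x, W_in):
    """Correlation coefficient calculation unit 51: Pearson correlation between
    the input vector and a per-input-node summary of the input-to-hidden
    weights (the summary by row mean is an assumption)."""
    w = W_in.mean(axis=1)                  # one value per input-layer node
    return float(np.corrcoef(x, w)[0, 1])

def correlation_says_global(corrs, threshold=0.8, sat_eps=1e-3):
    """Correlation coefficient condition determination unit 52, following
    steps 231-232 of Fig. 3; threshold and sat_eps are illustrative."""
    if len(corrs) < 2:
        return False
    if corrs[-1] >= threshold:             # step 231: not low -> global optimum
        return True
    rising = corrs[-1] >= corrs[-2]        # step 232: increasing trend...
    saturated = abs(corrs[-1] - corrs[-2]) < sat_eps   # ...that has saturated
    return rising and saturated
```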
[0053] (2) Hardware configuration
FIG. 2 is a hardware configuration diagram of the computer on which the neural network device according to this embodiment is constructed.
The computer 100 on which the neural network is constructed comprises a CPU (Central Processing Unit) 111, RAM (Random Access Memory) 112, ROM (Read Only Memory) 113, flash memory 114, an HD (hard disk) 115 serving as external storage, a LAN (Local Area Network) card 116, a mouse 117, a keyboard 118, a video card 119, a display 119a serving as a display device electrically connected to the video card 119, a sound card 120, a speaker 120a serving as a sound output device electrically connected to the sound card 120, and a drive 121 that reads and writes storage media such as flexible disks, CD-ROMs and DVD-ROMs. A person skilled in the art could vary these hardware components somewhat, and a single neural network can also be constructed across a plurality of computers, building some rather than all of the modules on each computer to distribute the load. The neural network can, of course, also be constructed on a grid computing system.
[0054] (3) Operation
FIG. 3 is an operation flowchart of the neural network device according to this embodiment during learning.
The input unit 10 takes in teacher data consisting of input data and output data; the captured input data is processed by the neural network mechanism unit 20, and the output unit 30 outputs the generated output data. The error calculation unit 41 computes the error between this output data and the output data of the teacher data (step 201). The weight coefficient reflection unit 42 updates the weight coefficients of the neural network using the computed error (step 202). The correlation coefficient calculation unit 51 computes the correlation coefficient from the input data and the weight coefficients between the input layer and the hidden layer (step 211). The error convergence determination unit 43 judges, using the errors obtained by the error calculation unit 41, whether the error has converged (step 221). If it judges in step 221 that the error has not converged, the process returns to step 100. If it judges in step 221 that the error has converged, the correlation coefficient condition determination unit 52 judges, using the correlation coefficients obtained by the correlation coefficient calculation unit 51, whether the most recent correlation coefficient is lower than a predetermined threshold (step 231). If it judges in step 231 that the most recent correlation coefficient is low, it judges whether the correlation coefficient is in an increasing trend that reaches saturation (step 232). If it judges in step 231 that the correlation coefficient is not low, or judges in step 232 that the correlation coefficient is increasing and reaches saturation, learning of the neural network is terminated on the ground that a global optimal solution has been obtained. If it judges in step 232 that the correlation coefficient is increasing but has not reached saturation, the initial values of the neural network mechanism unit 20 are reset on the ground that a local solution has been obtained (step 241), and the process returns to step 100.
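Combining the helper functions from the two sketches above, the Fig. 3 flow might be rendered as the following loop. The uniform random initialization mirrors the examples below; the epoch cap is a safeguard added here and is not part of the patent.

```python
import numpy as np

def train(x, t, n_hid, lr=0.1, max_epochs=100000, rng=None):
    """One possible rendering of the Fig. 3 flow, reusing backprop_step,
    error_converged, input_weight_correlation and correlation_says_global
    from the sketches above."""
    rng = rng or np.random.default_rng()
    W_in = rng.uniform(-1.0, 1.0, (x.size, n_hid))    # uniform random initial weights
    W_out = rng.uniform(-1.0, 1.0, (n_hid, t.size))
    errors, corrs = [], []
    for _ in range(max_epochs):
        errors.append(backprop_step(x, t, W_in, W_out, lr))  # steps 201-202
        corrs.append(input_weight_correlation(x, W_in))      # step 211
        if not error_converged(errors):                      # step 221: keep learning
            continue
        if correlation_says_global(corrs):                   # steps 231-232
            break                                            # global optimal solution
        W_in = rng.uniform(-1.0, 1.0, W_in.shape)            # step 241: local solution,
        W_out = rng.uniform(-1.0, 1.0, W_out.shape)          # reinitialize and relearn
        errors.clear()
        corrs.clear()
    return W_in, W_out
```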
[0055] (4) Effects of this embodiment
As described above, the neural network device according to this embodiment computes the error between the output data generated by the neural network and the output data of the teacher data, back-propagates it to update the weight coefficients, accumulates the computed errors, and judges convergence of the error from its variation. When the error has converged, the device judges whether the variation of the continuously computed correlation coefficient satisfies a predetermined condition, the predetermined condition being that the correlation coefficient is not smaller than a predetermined threshold and that the correlation coefficient has been in an increasing trend and reached saturation. When the condition is satisfied, learning is terminated on the ground that a global optimal solution has been obtained; when it is not, the neural network is initialized and learning is repeated on the ground that only a local solution has been obtained. The steepest-descent method can therefore be applied for rapid learning, and because the device checks whether a network whose learning the steepest-descent method has declared finished has actually reached a global optimal solution, learning ends in a state where a global optimal solution, rather than a local solution, has been obtained.
[0056] (Other Embodiments)
[Judgment by error and judgment by correlation coefficient]
In the neural network device according to the first embodiment, the variation of the error between the output data and the output data of the teacher data is judged first and the correlation coefficient afterwards, but it is also possible to judge the correlation coefficient first and the error variation afterwards. However, since the error between the output data and the output data of the teacher data is a value that must in any case be derived for error back-propagation learning, it is preferable to judge the error variation first. Furthermore, when the error variation indicates that learning has converged, at least a local solution has been obtained, whereas when only the correlation coefficient indicates convergence, possibly not even a local solution has been obtained; for this reason as well, it is preferable to judge convergence of learning first from the variation of the error between the output data and the output data of the teacher data.
[0057] [Initialization of the weight coefficients]
In the neural network device according to the first embodiment, initialization is performed, for example, with random numbers. Such random initialization may, however, yield weight coefficients nearly identical to those before initialization, in which case the network is likely to fall into the same local solution again. To forestall such wasted learning, the device may be configured so that, if the correlation coefficient between the weight coefficients after initialization and those before initialization is equal to or greater than a predetermined threshold, initialization is performed again.
[0058] For example, instead of step 241, as shown in Fig. 4, the current weight coefficients are recorded (step 2411), the weight coefficients are initialized (step 2412), the correlation coefficient between the recorded pre-initialization weight coefficients and the post-initialization weight coefficients is computed (step 2413), and it is judged whether this correlation coefficient is equal to or greater than a predetermined threshold (step 2414). If it is judged to be equal to or greater than the threshold, the process returns to step 2411; if it is judged to be smaller than the threshold, the process returns to step 100.
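A sketch of this Fig. 4 loop, assuming uniform random redraws and an illustrative threshold value:

```python
import numpy as np

def reinitialize_weights(W_current, threshold=0.5, rng=None):
    """Steps 2411-2414 of Fig. 4: redraw the weight coefficients until the new
    draw's correlation with the previously recorded coefficients falls below
    the threshold (the threshold value here is illustrative)."""
    rng = rng or np.random.default_rng()
    while True:
        W_prev = W_current.copy()                         # step 2411: record current weights
        W_current = rng.uniform(-1.0, 1.0, W_prev.shape)  # step 2412: initialize at random
        r = np.corrcoef(W_prev.ravel(), W_current.ravel())[0, 1]  # step 2413
        if r < threshold:                                 # step 2414: accept a clearly
            return W_current                              # different draw, else repeat
```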
EXAMPLES
[0059] A hierarchical neural network was applied to sea surface temperature (SST) estimation using data from the ocean and atmosphere observation satellite sensor NOAA/AVHRR (a high-spatial-resolution visible and thermal infrared radiometer). Figs. 5 and 6 show thermal infrared images of NOAA/AVHRR bands 4 and 5 together with the sea surface temperature estimated by the method called MCSST (multi-channel sea surface temperature). Figs. 5(a) and 6(a) are the band-4 thermal infrared images, Figs. 5(b) and 6(b) are the band-5 thermal infrared images, and Figs. 5(c) and 6(c) are the values estimated by MCSST. Fig. 5 was acquired at 17:11 on May 20, 2002, and Fig. 6 at 16:45 on November 30, 2004.
[0060] Error back-propagation learning of a hierarchical neural network was performed with Figs. 5(a) and (b), or Figs. 6(a) and (b), as the input data and with Fig. 5(c), or Fig. 6(c), regarded as the desired output. The average of the differences between the desired output and the actual output of the hierarchical neural network was evaluated as the mean error. The initial values of the weight coefficients were given by uniform random numbers.
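As a sketch of how this mean-error evaluation might look in code, assuming each pixel is treated as one sample whose inputs are the band-4 and band-5 values and whose desired output is the MCSST value (a per-pixel encoding the patent does not spell out):

```python
import numpy as np

def mean_error(net_forward, band4, band5, mcsst):
    """Mean absolute difference between the desired output (MCSST) and the
    network's actual output, averaged over all pixels; band4, band5 and
    mcsst are 2-D image arrays of equal shape."""
    inputs = np.stack([band4.ravel(), band5.ravel()], axis=1)  # one row per pixel
    desired = mcsst.ravel()
    actual = np.array([net_forward(px) for px in inputs])     # network output per pixel
    return float(np.mean(np.abs(actual - desired)))
```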
[0061] Examples of the course of learning are shown in Figs. 7 to 9. Fig. 7 shows the transition of the solution for the input and desired output of Fig. 6, while Figs. 8 and 9 show the corresponding transitions for Fig. 5. Figs. 7 and 8 concern the image region off Miyazaki Prefecture, and Fig. 9 the image region off Kokura.
[0062] Fig. 7(a) shows that the correlation coefficient converges at a low value without turning to an increasing trend, while Fig. 7(b) shows that the error becomes smaller with each learning iteration and that this decrease is heading toward convergence. Fig. 7 therefore shows that the network has fallen into a local solution.
[0063] Fig. 8(a) shows that the correlation coefficient is about to converge after an increasing trend, while Fig. 8(b) shows that the error becomes smaller with each learning iteration and that this decrease is heading toward convergence. Fig. 8 therefore shows that a global optimal solution has been obtained.
[0064] Fig. 9(a) shows that the correlation coefficient maintains an increasing trend, while Fig. 9(b) shows that the error becomes smaller with each learning iteration and that this decrease is heading toward convergence. Fig. 9 therefore shows that a global optimal solution has been obtained.
[0065] In all of these cases, a tendency is observed for the mean error to fall as the correlation coefficient between the input and the weight coefficients between the input layer and the hidden layer rises. This shows that the correlation coefficient can serve as an indicator of learning convergence.
[0066] Although the present invention has been described by way of the above embodiments, the technical scope of the present invention is not limited to the scope described in those embodiments, and various modifications and improvements can be made to them. Embodiments incorporating such modifications or improvements are also included in the technical scope of the present invention, as is clear from the claims and the means for solving the problems.

Claims

[1] A hierarchical neural network device based on error back-propagation learning, comprising an input layer, a hidden layer and an output layer each having nodes, the device learning by inputting input data to the input layer, comparing output data output from the output layer with teacher data prepared in advance in correspondence with the input data, and updating the weight coefficients between the nodes of the output layer and the hidden layer and the weight coefficients between the nodes of the hidden layer and the input layer using the error resulting from the comparison,
wherein convergence of the learning is judged on the basis of the correlation coefficient between the input data and the weight coefficients between the nodes of the input layer and the hidden layer.
[2] A hierarchical neural network device based on error back-propagation learning, comprising an input layer, a hidden layer and an output layer each having nodes, the device learning by inputting input data to the input layer, comparing output data output from the output layer with teacher data prepared in advance in correspondence with the input data, and updating the weight coefficients between the nodes of the output layer and the hidden layer and the weight coefficients between the nodes of the hidden layer and the input layer using the error resulting from the comparison, the device further comprising:
a first learning convergence determination unit that judges convergence of the learning on the basis of the error resulting from the comparison; and
a second learning convergence determination unit that judges convergence of the learning on the basis of the correlation coefficient between the input data and the weight coefficients between the nodes of the input layer and the hidden layer,
wherein the learning is terminated when both the first learning convergence determination unit and the second learning convergence determination unit judge that the learning has converged.
[3] A hierarchical neural network device based on error back-propagation learning, comprising an input layer, a hidden layer and an output layer each having nodes, the device learning by inputting input data to the input layer, comparing output data output from the output layer with teacher data prepared in advance in correspondence with the input data, and updating the weight coefficients between the nodes of the output layer and the hidden layer and the weight coefficients between the nodes of the hidden layer and the input layer using the error resulting from the comparison, the device further comprising:
a first learning convergence determination unit that judges convergence of the learning on the basis of the error resulting from the comparison; and
a second learning convergence determination unit that judges convergence of the learning on the basis of the correlation coefficient between the input data and the weight coefficients between the nodes of the input layer and the hidden layer,
wherein the second learning convergence determination unit judges convergence of the learning after the first learning convergence determination unit has judged that the learning has converged, and
the learning is terminated when the second learning convergence determination unit judges that the learning has converged.
[4] The hierarchical neural network device based on error back-propagation learning according to claim 2 or 3, wherein the second learning convergence determination unit judges that the learning has converged when the correlation coefficient is equal to or greater than a predetermined threshold.
[5] The hierarchical neural network device based on error back-propagation learning according to claim 2 or 3, wherein the second learning convergence determination unit judges that the learning has converged when the correlation coefficient meets the learning convergence condition that its increasing trend reaches saturation.
[6] The hierarchical neural network based on error back-propagation learning according to claim 4 or 5, wherein the second learning convergence determination unit initializes the weight coefficients and performs the learning anew when the learning convergence condition is not met.
[7] The hierarchical neural network based on error back-propagation learning according to claim 6, wherein, when the weight coefficients are initialized, the correlation coefficient between the initialized weight coefficients and the weight coefficients before initialization is obtained, and the weight coefficients are initialized again if this correlation coefficient is equal to or greater than a predetermined threshold.
[8] An error back-propagation learning method for a hierarchical neural network device that learns by inputting input data to an input layer, comparing output data output from an output layer with teacher data prepared in advance in correspondence with the input data, and updating the weight coefficients between the nodes of the output layer and a hidden layer and the weight coefficients between the nodes of the hidden layer and the input layer using the error resulting from the comparison,
wherein convergence of the learning is judged on the basis of the correlation coefficient between the input data and the weight coefficients between the nodes of the input layer and the hidden layer.
PCT/JP2006/303147 2006-02-22 2006-02-22 Neural network device and its method WO2007096954A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2006/303147 WO2007096954A1 (en) 2006-02-22 2006-02-22 Neural network device and its method
JP2008501514A JP5002821B2 (en) 2006-02-22 2006-02-22 Neural network device and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2006/303147 WO2007096954A1 (en) 2006-02-22 2006-02-22 Neural network device and its method

Publications (1)

Publication Number Publication Date
WO2007096954A1 true WO2007096954A1 (en) 2007-08-30

Family

ID=38437016

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2006/303147 WO2007096954A1 (en) 2006-02-22 2006-02-22 Neural network device and its method

Country Status (2)

Country Link
JP (1) JP5002821B2 (en)
WO (1) WO2007096954A1 (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04319760A (en) * 1991-04-19 1992-11-10 Meidensha Corp Hierarchical connection type neuro-computer

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05197821A (en) * 1991-11-11 1993-08-06 Omron Corp Method for optimizing hierarchical neural network and device therefor
JPH10283334A (en) * 1997-04-10 1998-10-23 Fuji Electric Co Ltd Input layer optimizing method for neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MATSUNAGA Y.: "Koyogo Sayo ni yori Jocho Chukanso Soshi o Shizen Tota suru Gosa Gyaku Denpan Gakushu Algorithm", THE TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS, vol. J79-D-II, no. 3, 25 March 1996 (1996-03-25), pages 403 - 412, XP003017215 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014130120A3 (en) * 2012-12-03 2014-11-13 Hrl Laboratories, Llc Generating messages from the firing of pre-synaptic neurons
US9430736B2 (en) 2012-12-03 2016-08-30 Hrl Laboratories, Llc Firing rate independent spike message passing in large scale neural network modeling
CN111782359A (en) * 2020-06-23 2020-10-16 平安科技(深圳)有限公司 Distributed computing system task allocation method and related equipment
CN111782359B (en) * 2020-06-23 2022-03-11 平安科技(深圳)有限公司 Distributed computing system task allocation method and related equipment

Also Published As

Publication number Publication date
JPWO2007096954A1 (en) 2009-07-09
JP5002821B2 (en) 2012-08-15


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2008501514

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06714287

Country of ref document: EP

Kind code of ref document: A1