CN107256425B - Random weight network generalization capability improvement method and device

Info

Publication number: CN107256425B (granted); other version: CN107256425A (application publication)
Application number: CN201710354539.0A
Authority: CN (China)
Prior art keywords: training, sample, output, random, RWN
Inventors: 何玉林 (He Yulin), 敖威 (Ao Wei)
Original and current assignee: Shenzhen University
Application filed by Shenzhen University; priority and filing date: 2017-05-18
Other languages: Chinese (zh)
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Abstract

The invention discloses a method and a device for improving the generalization ability of a random weight network. On the premise of not changing the framework structure of the random weight network, the target sample with the largest uncertainty value in the training sample is mined, a simulation sample approximately distributed with that target sample is generated, and the output layer weights of the random weight network are updated iteratively based on the simulation sample, so that the hidden information of the training sample is actively mined and the generalization ability of the random weight network is improved.

Description

Random weight network generalization capability improvement method and device
Technical Field
The invention relates to the technical field of data mining, in particular to a random weight network generalization capability improvement method and device.
Background
A random weight network (RWN) is a fully connected feedforward neural network that does not rely on an iterative weight update strategy. Unlike traditional training based on error back-propagation, an RWN randomly selects its input layer weights and computes an analytic solution for the output layer weights by solving the pseudo-inverse of the hidden layer output matrix. Because iterative weight adjustment is avoided, the random weight network trains extremely fast, while its convergence is guaranteed theoretically by the universal approximation theorem.
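To make the analytic training step concrete, a minimal sketch follows. It is illustrative only, not the patented implementation: numpy, the function names train_rwn and predict_rwn, the uniform [0, 1] sampling range and the Sigmoid activation (both of which appear in the detailed description below) are assumptions of this sketch.

```python
import numpy as np

_rng = np.random.default_rng(0)  # fixed seed for reproducibility (an assumption)

def train_rwn(X, y, K, rng=_rng):
    """Train a random weight network on inputs X (N x D) and targets y (N,).

    The input layer weights and hidden layer biases are drawn at random and
    never updated; only the output layer weights are obtained analytically,
    via the pseudo-inverse of the hidden layer output matrix.
    """
    N, D = X.shape
    W = rng.uniform(0.0, 1.0, size=(K, D))    # random input layer weights
    b = rng.uniform(0.0, 1.0, size=K)         # random hidden layer biases
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))  # hidden layer output matrix (Sigmoid)
    beta = np.linalg.pinv(H) @ y              # analytic output layer weights
    return W, b, beta

def predict_rwn(X, W, b, beta):
    """Forward pass of a trained random weight network."""
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta
```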
Existing research on improving the generalization capability of random weight networks focuses mainly on improving the network framework itself, including improving the input layer weights, selecting the optimal number of hidden layer nodes, and integrating multiple random weight networks. These measures improve the prediction performance of the random weight network to a certain extent, but they neglect the deep utilization of the information contained in the training data. In other words, existing improvement work uses the training data passively, treating it only as a test bed for checking the effect of the improvement, rather than actively mining the intrinsic information of the training data to guide how to improve the generalization ability of the random weight network.
Disclosure of Invention
The invention mainly aims to provide a method and a device for improving the generalization ability of a random weight network, and aims to solve the technical problem that the prior art does not actively mine the internal information of training data to guide how to improve the generalization ability of the random weight network.
To achieve the above object, a first aspect of the present invention provides a random weight network generalization capability improving method, including:

Step 1: train the random weight network RWN^(r) with the training sample T^(r), obtaining the trained random weight network RWN^(r+1) and the uncertainty value of each sample in T^(r), where r has an initial value of 0, T^(0) is the initial training sample, and RWN^(0) is the initial random weight network;

Step 2: select from the training sample T^(r) the target sample with the largest uncertainty value, and generate a simulation sample using the target sample and a preset neighborhood control factor;

Step 3: compute the union of the simulation sample and the training sample T^(r) as the new training sample T^(r+1);

Step 4: let r = r + 1 and return to step 1 until r = R, ending the training process after step 1 is executed, so as to obtain the improved random weight network RWN^(R), where R is the preset number of iterative training rounds.
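Read as a whole, steps 1 to 4 amount to the loop sketched below. This is a hedged illustration rather than the patented implementation: it reuses the train_rwn and predict_rwn sketches given earlier, generate_simulation_sample is sketched later in the description, and computing the uncertainty value as the absolute error between the real and actual outputs is an assumption consistent with the detailed description.

```python
import numpy as np

def improve_rwn(X0, y0, K, R, delta):
    """Iteratively augment the training sample and retrain the RWN (steps 1-4)."""
    X, y = X0.copy(), y0.copy()                    # T(0): initial training sample
    for r in range(R):                             # step 4 drives r = 0, 1, ..., R-1
        W, b, beta = train_rwn(X, y, K)                     # step 1: train RWN(r)
        U = np.abs(y - predict_rwn(X, W, b, beta))          # uncertainty value per sample
        i = int(np.argmax(U))                               # step 2: target sample
        x_sim, y_sim = generate_simulation_sample(X[i], y[i], delta)
        X = np.vstack([X, x_sim])                           # step 3: T(r+1) is the union of
        y = np.append(y, y_sim)                             # T(r) and the simulation sample
    return train_rwn(X, y, K)                      # final execution of step 1 -> RWN(R)
```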
To achieve the above object, a second aspect of the present invention further provides a random weight network generalization capability improving apparatus, including:

a training module, configured to train the random weight network RWN^(r) with the training sample T^(r), obtaining the trained random weight network RWN^(r+1) and the uncertainty value of each sample in T^(r), where r has an initial value of 0, T^(0) is the initial training sample, and RWN^(0) is the initial random weight network;

a selection generation module, configured to select from the training sample T^(r) the target sample with the largest uncertainty value, and to generate a simulation sample using the target sample and a preset neighborhood control factor;

a calculation module, configured to compute the union of the simulation sample and the training sample T^(r) as the new training sample T^(r+1);

a return ending module, configured to let r = r + 1 and return to the training module until r = R, the training process ending after the training module is executed, so as to obtain the improved random weight network RWN^(R), where R is the preset number of iterative training rounds.
The invention provides a random weight network generalization ability improvement method: on the premise of not changing the framework structure of the random weight network, the target sample with the largest uncertainty value in the training sample is mined to generate a simulation sample, and the random weight network is trained iteratively on the basis of the simulation sample, so that the training sample is actively mined to improve the generalization ability of the random weight network.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flow chart of a method for improving the generalization capability of a random weight network according to a first embodiment of the present invention;

FIG. 2 is a flow chart of the refinement of the step in step 101 of training the random weight network RWN^(r) with the training sample T^(r) to obtain the uncertainty value of each sample in T^(r), according to the first embodiment of the present invention;

FIG. 3 is a flow chart of the refinement of the step in step 102 of generating a simulation sample using the target sample and a preset neighborhood control factor, according to the first embodiment of the present invention;

FIG. 4 is a flow chart of additional steps of the first embodiment of the present invention;

FIG. 5 is a schematic diagram of the functional modules of an apparatus for improving the generalization capability of a random weight network according to a second embodiment of the present invention;

FIG. 6 is a schematic diagram of the refined functional modules of the training module 501 according to the second embodiment of the present invention;

FIG. 7 is a schematic diagram of additional functional modules according to the second embodiment of the present invention;

FIG. 8 is a schematic diagram of the refined functional modules of the selection generation module in the second embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the prior art, there exists the technical problem that the internal information of the training data is not actively mined to guide the improvement of the generalization capability of the random weight network.
In order to solve the above technical problem, the present invention provides a method and an apparatus for improving the generalization capability of a random weight network. On the premise of not changing the framework structure of the random weight network, a simulation sample is generated by mining the target sample with the largest uncertainty value in the training sample, and the random weight network is trained iteratively based on the simulation sample, so that the training sample is actively mined to improve the generalization capability of the random weight network. Furthermore, the method for improving the generalization capability of the random weight network in the embodiment of the invention not only obviously improves the generalization capability of the random weight network, but also has an extremely strong capability of controlling overfitting of the random weight network.
Referring to FIG. 1, a flow chart of a method for improving the generalization capability of a random weight network according to a first embodiment of the present invention, the method comprises:
Step 101: train the random weight network RWN^(r) with the training sample T^(r), obtaining the trained random weight network RWN^(r+1) and the uncertainty value of each sample in T^(r), where r has an initial value of 0, T^(0) is the initial training sample, and RWN^(0) is the initial random weight network.

In an embodiment of the invention, the initial training sample T^(0) is a data set containing N D-dimensional training samples:

$T^{(0)} = \left\{ \left( \mathbf{x}_n, y_n \right) \mid \mathbf{x}_n = \left( x_{n1}, \ldots, x_{nD} \right) \in \mathbb{R}^{D},\ y_n \in \mathbb{R},\ n = 1, 2, \ldots, N \right\}$
The initial training sample T^(0) is used to train the initial random weight network RWN^(0), which has D input layer nodes, K hidden layer nodes, and 1 output layer node. The input layer input matrix of RWN^(0) is:

$\mathbf{X} = \left[ x_{nd} \right]_{N \times D}$

The output layer output matrix of RWN^(0) is:

$\mathbf{Y} = \left[ y_n \right]_{N \times 1}$

The hidden layer input matrix of RWN^(0) is:

$\mathbf{X} \mathbf{W}^{\mathrm{T}} + \mathbf{B}$

The hidden layer output matrix of RWN^(0) is:

$\mathbf{H} = g\left( \mathbf{X} \mathbf{W}^{\mathrm{T}} + \mathbf{B} \right)$

where $\mathbf{W} = \left[ \omega_{kd} \right]_{K \times D}$ is the input layer weight matrix and $\mathbf{B}$ is the hidden layer bias, each of its N rows being $\left( b_1, \ldots, b_K \right)$.

Here ω_kd and b_k, k = 1, 2, ..., K, are random numbers from an arbitrary interval, for example uniformly distributed random numbers on the interval [0, 1], and

$g(z) = \frac{1}{1 + e^{-z}}$

is the Sigmoid activation function.
In an embodiment of the invention, the initial training sample T^(0) is used to train the initial random weight network RWN^(0), obtaining the trained random weight network RWN^(1) and the uncertainty value of each sample in T^(0).

The number of completed training rounds is denoted by r, with initial value 0; that is, after one round of training, the obtained random weight network is RWN^(1), and in general RWN^(r) denotes the random weight network obtained after the r-th round of training.
In the embodiment of the present invention, the uncertainty is obtained from the actual output and the real output of the output layer output matrix of the random weight network. Referring to FIG. 2, a flow chart of the refinement of the step in step 101 of training the random weight network RWN^(r) with the training sample T^(r) to obtain the uncertainty value of each sample in T^(r), the steps comprise:
Step 201: train the random weight network RWN^(r) with the training sample T^(r) to obtain the output layer output matrix;

Step 202: take the output layer output matrix as the real output of the training sample, calculate the error between the real output and the actual output of the training sample, and take the error as the uncertainty value of each sample in the training sample.
For the training sample T^(r) and the random weight network RWN^(r) obtained in the r-th round of training, the training sample T^(r) is input into the random weight network RWN^(r) and trained to obtain the output layer output matrix.

The output layer output matrix is taken as the real output of the training sample T^(r), the error between the real output and the actual output is calculated, and the error is taken as the uncertainty value of each sample in T^(r).
The actual output is obtained using a test sample: a preset test sample is input into the initial random weight network RWN^(0), and the resulting output layer output matrix is taken as the actual output, which is then used throughout the iterative training process.
Since the output layer output matrix is

$\hat{\mathbf{Y}} = \left[ \hat{y}_n \right]_{N \times 1}$

each sample has a corresponding output layer value. The output layer value of each sample in the output layer matrix is taken as the real output, and the error between the actual output and the real output of each sample is calculated to obtain the uncertainty value of each sample.

The uncertainty value is

$U\left( \mathbf{x}_n \right) = \left| o_n - \hat{y}_n \right|, \quad n = 1, 2, \ldots, N$

where $o_n$ denotes the actual output of the n-th sample, $\hat{y}_n$ denotes the output of the random weight network for the n-th training sample, i.e. the real output, U denotes the uncertainty value, and $\mathbf{x}_n$ denotes the input value of sample n in the input layer input matrix.
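Under the absolute-error reading of the uncertainty value used in the loop sketched earlier, the computation is a single vectorized line; the helper below isolates it, with the name uncertainty_values being an assumption of this sketch.

```python
import numpy as np

def uncertainty_values(actual, real):
    """U(x_n) = |actual output - real output| for every sample n."""
    return np.abs(np.asarray(actual) - np.asarray(real))

U = uncertainty_values([0.2, 0.9, 0.5], [0.25, 0.4, 0.5])  # -> array([0.05, 0.5, 0.  ])
print(int(np.argmax(U)))                                   # target sample index: 1
```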
Step 102: from the training sample
Figure BDA00012983241800000611
Selecting a target sample with the largest uncertainty value, and generating a simulation sample by using the target sample and a preset neighborhood control factor;
in the embodiment of the invention, after one training is finished, a training sample is obtained
Figure BDA00012983241800000612
And from the training samples
Figure BDA0001298324180000071
And selecting the sample with the largest uncertainty value as a target sample, and generating a simulation sample by using the target sample and a preset neighborhood control factor.
The training sample T^(r) consists of the samples $\left( \mathbf{x}_n, y_n \right)$ accumulated up to the r-th round.

In the r-th round of training, the training sample T^(r-1) obtained in the previous round and the random weight network RWN^(r-1) obtained in the previous round are used. After the uncertainty value of each sample in T^(r) is obtained, the sample with the largest uncertainty value, denoted $\left( \mathbf{x}^{*(r)}, y^{*(r)} \right)$, is selected, and this sample is used to generate the simulation sample.
Compared with samples with small uncertainty values, samples with large uncertainty values play a more important role in improving the generalization capability of the random weight network. As an extreme example, if the uncertainty value of a sample is 0, i.e. it has no uncertainty, there is no need to adapt the current learning algorithm to that sample.
Referring to FIG. 3, a flow chart of the refinement of the step in step 102 of generating a simulation sample using the target sample and a preset neighborhood control factor, the steps comprise:

Step 301: determine the value range of the input layer input matrix and the value range of the output layer output matrix of the simulation sample to be generated, using the target sample and the neighborhood control factor;

Step 302: randomly extract random numbers from the value range of the input layer input matrix and generate the input layer input of the simulation sample with the extracted random numbers; randomly extract random numbers from the value range of the output layer output matrix and generate the output layer output of the simulation sample with the extracted random numbers.
In the embodiment of the invention, simulation samples approximately identically distributed with the high-uncertainty target sample are obtained based on that target sample.
Specifically, the value range of the input layer input matrix and the value range of the output layer output matrix of the simulation sample to be generated are determined by using the target sample and the preset neighborhood control factor.
The neighborhood of the target sample $\left( \mathbf{x}^{*(r)}, y^{*(r)} \right)$ controlled by the corresponding neighborhood control factor is a (D+1)-dimensional hyper-rectangle, and the value range of the input layer input matrix and the value range of the output layer output matrix of the simulation sample to be generated are, respectively:

$\left[ x_{d}^{*(r)} - \delta \Delta^{(r)},\ x_{d}^{*(r)} + \delta \Delta^{(r)} \right], \quad d = 1, 2, \ldots, D$

$\left[ y^{*(r)} - \delta \Delta^{(r)},\ y^{*(r)} + \delta \Delta^{(r)} \right]$

where $\mathbf{x}^{*(r)} = \left( x_{1}^{*(r)}, \ldots, x_{D}^{*(r)} \right)$ denotes the input layer input of the target sample obtained in the r-th round of training, $y^{*(r)}$ denotes the output layer output of the target sample obtained in the r-th round of training, δ denotes the neighborhood control factor, and $\Delta^{(r)}$ denotes the difference between the maximum value and the minimum value in the input layer input of the target sample. The neighborhood control factor satisfies δ > 0.
After the value ranges are obtained, random numbers are randomly extracted from the value range of the input layer input matrix and used to generate the input layer input of the simulation sample, and random numbers are randomly extracted from the value range of the output layer output matrix and used to generate the output layer output of the simulation sample, so as to obtain the simulation sample.
It will be appreciated that, in embodiments of the invention, the neighborhood control factor is set so as to obtain simulation samples approximately identically distributed with the sample with the largest uncertainty value; that is, simulation samples are generated in the δ-neighborhood of the high-uncertainty sample so as to reduce the current random weight network error.
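A simulation sample can then be drawn from this δ-neighborhood as sketched below, matching the generate_simulation_sample name used in the earlier loop sketch. The symmetric ranges, centered on the target sample and scaled by the max-min spread of its input, are an assumption consistent with the hyper-rectangle described above.

```python
import numpy as np

def generate_simulation_sample(x_star, y_star, delta, rng=np.random.default_rng(1)):
    """Draw one simulation sample near the target sample (x_star, y_star).

    x_star: (D,) input of the target sample; y_star: its scalar output;
    delta: the preset neighborhood control factor (delta > 0).
    """
    spread = x_star.max() - x_star.min()  # max-min difference of the target input
    x_sim = rng.uniform(x_star - delta * spread, x_star + delta * spread)  # step 302, input
    y_sim = rng.uniform(y_star - delta * spread, y_star + delta * spread)  # step 302, output
    return x_sim, y_sim
```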
Step 103: compute the union of the simulation sample and the training sample T^(r) as the new training sample T^(r+1).
Step 104: let r = r + 1 and return to step 101 until r = R, ending the training process after step 101 is executed, so as to obtain the improved random weight network RWN^(R), where R is the preset number of iterative training rounds.
In the embodiment of the invention, the union of the simulation sample and the training sample T^(r) is computed as the new training sample T^(r+1); therefore,

$T^{(r+1)} = T^{(r)} \cup \left\{ \left( \mathbf{x}^{(r+1)}, y^{(r+1)} \right) \right\}$

where $\left( \mathbf{x}^{(r+1)}, y^{(r+1)} \right)$ is the simulation sample generated in round r. That is, each round of training produces, from the training sample used in that round together with the simulation sample generated in that round, the training sample to be used in the next round.
In the embodiment of the invention, after the new training sample T^(r+1) is obtained, r is set to r + 1 and the process returns to step 101, so that the random weight network is trained iteratively; when r = R, the training process ends after step 101 is executed, and the improved random weight network RWN^(R) is obtained, where R is the preset number of iterative training rounds.
The output layer weight matrix of the random weight network RWN^(R) is:

$\boldsymbol{\beta}^{(R)} = \mathbf{H}^{\dagger} \mathbf{Y}$

where

$\mathbf{H} = g\left( \mathbf{X} \mathbf{W}^{\mathrm{T}} + \mathbf{B} \right)$

$\mathbf{H}^{\dagger}$ denotes the Moore-Penrose pseudo-inverse of the hidden layer output matrix, W denotes the input layer weight matrix, and B is the hidden layer bias.
Further, referring to FIG. 4, a flow chart of additional steps of the first embodiment of the present invention, the steps comprise:

Step 401: take M random numbers as the input layer weights and P random numbers as the hidden layer biases from a preset arbitrary interval;

Step 402: calculate a first average value of the M random numbers and set the input layer weight of the initial random weight network RWN^(0) according to the first average value; calculate a second average value of the P random numbers and set the hidden layer bias of the initial random weight network RWN^(0) according to the second average value.
In the embodiment of the present invention, before the initial random weight network is trained, its input layer weights and hidden layer biases need to be set, which may be done as follows: take M random numbers as the input layer weights and P random numbers as the hidden layer biases from a preset arbitrary interval; calculate the first average value of the M random numbers and set the input layer weight of the initial random weight network RWN^(0) according to it; calculate the second average value of the P random numbers and set the hidden layer bias of the initial random weight network RWN^(0) according to it.
For example, 100 random numbers may be taken from an arbitrary interval and the average of the 100 random numbers may be used as the input layer weight, and 100 random numbers may be taken from the arbitrary interval and the average of the 100 random numbers may be used as the hidden layer bias.
It can be understood that setting the input layer weights and the hidden layer biases to the average of a plurality of random numbers effectively eliminates the influence of this randomization on the training of the random weight network.
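A minimal sketch of steps 401 and 402 follows, assuming the interval [0, 1], per-parameter averaging, and M = P = 100 as in the example above; the function name is an assumption of this sketch.

```python
import numpy as np

def averaged_init(shape_w, shape_b, M=100, P=100, rng=np.random.default_rng(2)):
    """Set each input layer weight to the mean of M random numbers and each
    hidden layer bias to the mean of P random numbers (steps 401-402)."""
    W0 = rng.uniform(0.0, 1.0, size=(M, *shape_w)).mean(axis=0)  # first average values
    b0 = rng.uniform(0.0, 1.0, size=(P, *shape_b)).mean(axis=0)  # second average values
    return W0, b0

# Example: initialize a K x D input weight matrix and K hidden biases for RWN(0).
W0, b0 = averaged_init(shape_w=(10, 3), shape_b=(10,))
```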
In the embodiment of the invention, on the premise of not changing the frame structure of the random weight network, the target sample with the largest uncertainty value in the training sample is mined to generate the simulation sample, and the random weight network is trained in an iterative manner based on the simulation sample, so that the aim of actively mining the training sample to improve the generalization capability of the random weight network can be fulfilled. Furthermore, the method for improving the generalization capability of the random weight network in the embodiment of the invention not only obviously improves the generalization capability of the random weight network, but also has extremely strong capability of controlling the overfitting of the random weight network.
It can be understood that, after the random weight network RWN^(R) is obtained through R rounds of training, the generalization ability of RWN^(R) can be further verified, as follows:
assume initial random-weight network RWN(0)The test error on the data set of the independent test sample is E(0)The core problem to be solved by the embodiment of the invention is how to generate a data set x containing R ≧ 1 simulation sample(1),x(2),…,x(R)Is based on
Figure BDA0001298324180000101
Trained random weight network RWN(R)Test error E on a data set of a test sample(R)<E(0). Thus, a data set of test samples can be input into the random-weight network RWN(R)Obtaining the output test error E(R)And using the test error E(R)And test error E(0)And comparing to verify the improvement of generalization capability brought by training the random weight network by the technical scheme in the embodiment of the invention.
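The verification can be sketched end to end as follows, reusing the earlier sketch functions. The synthetic regression data, the RMSE error measure and all parameter values are assumptions for illustration; the patent does not fix a particular test set or error metric.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(200, 3))                    # synthetic inputs
y = np.sin(X).sum(axis=1) + 0.05 * rng.standard_normal(200)  # noisy synthetic targets
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

W0, b0, beta0 = train_rwn(X_train, y_train, K=20)            # initial network RWN(0)
E0 = rmse(y_test, predict_rwn(X_test, W0, b0, beta0))        # test error E(0)

WR, bR, betaR = improve_rwn(X_train, y_train, K=20, R=10, delta=0.1)  # improved RWN(R)
ER = rmse(y_test, predict_rwn(X_test, WR, bR, betaR))        # test error E(R)

print(f"E(0) = {E0:.4f}, E(R) = {ER:.4f}, improved: {ER < E0}")  # goal: E(R) < E(0)
```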
It is understood that the improvement in the embodiment of the present invention is an improvement on the output layer weights of the random weight network, and the generalization capability of the random weight network is improved by iteratively optimizing the output layer weights of the random weight network by using simulation samples. Compared with the improvement of the random weight network framework structure in the prior art, the technical scheme provided by the embodiment of the invention starts with training samples on the basis of not changing the random weight network framework structure, and achieves the purpose of improving the prediction performance of the random weight network by introducing more simulation samples, namely the improvement of the output layer weight of the random weight network is realized.
It can be understood that the technical solution in the embodiment of the present invention has the following advantages over the prior art: the framework structure of the random weight network does not need to be modified, so the generalization capability of the random weight network is improved more conveniently and easily; the method has strong expansibility and is also applicable to random weight networks that have been improved on the basis of the framework structure; it expands the knowledge space represented by the training samples by generating a batch of simulation samples distributed like the high-uncertainty training samples; and overfitting is efficiently controlled, greatly reducing the probability of the overfitting phenomenon.
Referring to FIG. 5, a schematic diagram of the functional modules of an apparatus for improving the generalization capability of a random weight network according to a second embodiment of the present invention, the apparatus comprises:
a training module 501, configured to train the random weight network RWN^(r) with the training sample T^(r), obtaining the trained random weight network RWN^(r+1) and the uncertainty value of each sample in T^(r), where r has an initial value of 0, T^(0) is the initial training sample, and RWN^(0) is the initial random weight network;

a selection generation module 502, configured to select from the training sample T^(r) the target sample with the largest uncertainty value, and to generate a simulation sample using the target sample and a preset neighborhood control factor;

a calculation module 503, configured to compute the union of the simulation sample and the training sample T^(r) as the new training sample T^(r+1);

a return ending module 504, configured to let r = r + 1 and return to the training module 501 until r = R, the training process ending after the training module is executed, so as to obtain the improved random weight network RWN^(R), where R is the preset number of iterative training rounds.
Further, referring to FIG. 6, a schematic diagram of the refined functional modules of the training module 501 according to the second embodiment of the present invention, the training module 501 comprises:

a network training module 601, configured to train the random weight network RWN^(r) with the training sample T^(r), obtaining the output layer output matrix and the trained random weight network RWN^(r+1);

an uncertainty calculation module 602, configured to take the output layer output matrix as the real output of the training sample, calculate the error between the real output and the actual output of the training sample, and take the error as the uncertainty value of each sample in the training sample.
Further, referring to FIG. 7, a schematic diagram of additional functional modules according to the second embodiment of the present invention, the additional functional modules comprise:

an extracting module 701, configured to take M random numbers as the input layer weights and P random numbers as the hidden layer biases from a preset arbitrary interval;

a mean value calculating module 702, configured to calculate a first average value of the M random numbers and set the input layer weight of the initial random weight network RWN^(0) according to the first average value, and to calculate a second average value of the P random numbers and set the hidden layer bias of the initial random weight network RWN^(0) according to the second average value.
Further, referring to FIG. 8, a schematic diagram of the refined functional modules of the selection generation module 502 according to the second embodiment of the present invention, the selection generation module 502 comprises:

a selection module 801, configured to select from the training sample T^(r) the target sample with the largest uncertainty value;

a range determining module 802, configured to determine the value range of the input layer input matrix and the value range of the output layer output matrix of the simulation sample to be generated, using the target sample and the neighborhood control factor;

a sample generation module 803, configured to randomly extract random numbers from the value range of the input layer input matrix and generate the input layer input of the simulation sample with the extracted random numbers, and to randomly extract random numbers from the value range of the output layer output matrix and generate the output layer output of the simulation sample with the extracted random numbers;

where the value range of the input layer input matrix and the value range of the output layer output matrix are, respectively:

$\left[ x_{d}^{*(r)} - \delta \Delta^{(r)},\ x_{d}^{*(r)} + \delta \Delta^{(r)} \right], \quad d = 1, 2, \ldots, D$

$\left[ y^{*(r)} - \delta \Delta^{(r)},\ y^{*(r)} + \delta \Delta^{(r)} \right]$

where $\mathbf{x}^{*(r)} = \left( x_{1}^{*(r)}, \ldots, x_{D}^{*(r)} \right)$ denotes the input layer input of the target sample obtained in the r-th round of training, $y^{*(r)}$ denotes the output layer output of the target sample obtained in the r-th round of training, δ denotes the neighborhood control factor, and $\Delta^{(r)}$ denotes the difference between the maximum value and the minimum value in the input layer input of the target sample.
In the embodiment of the invention, on the premise of not changing the frame structure of the random weight network, the target sample with the largest uncertainty value in the training sample is mined to generate the simulation sample, and the random weight network is trained in an iterative manner based on the simulation sample, so that the aim of actively mining the training sample to improve the generalization capability of the random weight network can be fulfilled. Furthermore, the method for improving the generalization capability of the random weight network in the embodiment of the invention not only obviously improves the generalization capability of the random weight network, but also has extremely strong capability of controlling the overfitting of the random weight network.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no acts or modules are necessarily required of the invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The method and apparatus for improving the generalization capability of a random weight network provided by the present invention have been described in detail above. Those skilled in the art may make changes to the specific implementation and the application scope according to the concept of the embodiments of the present invention; in summary, the content of this specification should not be construed as limiting the present invention.

Claims (6)

1. A method for improving the generalization capability of a random weight network, the method comprising:

Step 1: training the random weight network RWN^(r) with the training sample T^(r), obtaining the trained random weight network RWN^(r+1) and the uncertainty value of each sample in T^(r), wherein r has an initial value of 0, T^(0) is the initial training sample, and RWN^(0) is the initial random weight network;

Step 2: selecting from the training sample T^(r) the target sample with the largest uncertainty value, and generating a simulation sample using the target sample and a preset neighborhood control factor;

Step 3: computing the union of the simulation sample and the training sample T^(r) as the new training sample T^(r+1);

Step 4: letting r = r + 1 and returning to step 1 until r = R, ending the training process after step 1 is executed, so as to obtain the improved random weight network RWN^(R), wherein R is the preset number of iterative training rounds;

wherein generating the simulation sample using the target sample and the preset neighborhood control factor comprises:

determining the value range of the input layer input matrix and the value range of the output layer output matrix of the simulation sample to be generated, using the target sample and the neighborhood control factor;

randomly extracting random numbers from the value range of the input layer input matrix and generating the input layer input of the simulation sample with the extracted random numbers; and randomly extracting random numbers from the value range of the output layer output matrix and generating the output layer output of the simulation sample with the extracted random numbers;

wherein the value range of the input layer input matrix and the value range of the output layer output matrix are, respectively:

$\left[ x_{d}^{*(r)} - \delta \Delta^{(r)},\ x_{d}^{*(r)} + \delta \Delta^{(r)} \right], \quad d = 1, 2, \ldots, D$

$\left[ y^{*(r)} - \delta \Delta^{(r)},\ y^{*(r)} + \delta \Delta^{(r)} \right]$

wherein $\mathbf{x}^{*(r)} = \left( x_{1}^{*(r)}, \ldots, x_{D}^{*(r)} \right)$ denotes the input layer input of the target sample obtained in the r-th round of training, $y^{*(r)}$ denotes the output layer output of the target sample obtained in the r-th round of training, δ denotes the neighborhood control factor, and $\Delta^{(r)}$ denotes the difference between the maximum value and the minimum value in the input layer input of the target sample.
2. The method of claim 1, wherein training the random weight network RWN^(r) with the training sample T^(r) to obtain the uncertainty value of each sample in T^(r) comprises:

training the random weight network RWN^(r) with the training sample T^(r) to obtain the output layer output matrix; and

taking the output layer output matrix as the real output of the training sample, calculating the error between the real output and the actual output of the training sample, and taking the error as the uncertainty value of each sample in the training sample.
3. The method of claim 1, further comprising:

taking M random numbers as the input layer weights and P random numbers as the hidden layer biases from a preset arbitrary interval; and

calculating a first average value of the M random numbers and setting the input layer weight of the initial random weight network RWN^(0) according to the first average value; calculating a second average value of the P random numbers and setting the hidden layer bias of the initial random weight network RWN^(0) according to the second average value.
4. An apparatus for improving the generalization capability of a random weight network, the apparatus comprising:

a training module, configured to train the random weight network RWN^(r) with the training sample T^(r), obtaining the trained random weight network RWN^(r+1) and the uncertainty value of each sample in T^(r), wherein r has an initial value of 0, T^(0) is the initial training sample, and RWN^(0) is the initial random weight network;

a selection generation module, configured to select from the training sample T^(r) the target sample with the largest uncertainty value and generate a simulation sample using the target sample and a preset neighborhood control factor;

a calculation module, configured to compute the union of the simulation sample and the training sample T^(r) as the new training sample T^(r+1);

a return ending module, configured to let r = r + 1 and return to the training module until r = R, the training process ending after the training module is executed, so as to obtain the improved random weight network RWN^(R), wherein R is the preset number of iterative training rounds;

wherein the selection generation module comprises:

a selection module, configured to select from the training sample T^(r) the target sample with the largest uncertainty value;

a range determining module, configured to determine the value range of the input layer input matrix and the value range of the output layer output matrix of the simulation sample to be generated, using the target sample and the neighborhood control factor;

a sample generation module, configured to randomly extract random numbers from the value range of the input layer input matrix and generate the input layer input of the simulation sample with the extracted random numbers, and to randomly extract random numbers from the value range of the output layer output matrix and generate the output layer output of the simulation sample with the extracted random numbers;

wherein the value range of the input layer input matrix and the value range of the output layer output matrix are, respectively:

$\left[ x_{d}^{*(r)} - \delta \Delta^{(r)},\ x_{d}^{*(r)} + \delta \Delta^{(r)} \right], \quad d = 1, 2, \ldots, D$

$\left[ y^{*(r)} - \delta \Delta^{(r)},\ y^{*(r)} + \delta \Delta^{(r)} \right]$

wherein $\mathbf{x}^{*(r)} = \left( x_{1}^{*(r)}, \ldots, x_{D}^{*(r)} \right)$ denotes the input layer input of the target sample obtained in the r-th round of training, $y^{*(r)}$ denotes the output layer output of the target sample obtained in the r-th round of training, δ denotes the neighborhood control factor, and $\Delta^{(r)}$ denotes the difference between the maximum value and the minimum value in the input layer input of the target sample.
5. The apparatus of claim 4, wherein the training module comprises:

a network training module, configured to train the random weight network RWN^(r) with the training sample T^(r), obtaining the output layer output matrix and the trained random weight network RWN^(r+1);

an uncertainty calculation module, configured to take the output layer output matrix as the real output of the training sample, calculate the error between the real output and the actual output of the training sample, and take the error as the uncertainty value of each sample in the training sample.
6. The apparatus of claim 4, further comprising:

an extraction module, configured to take M random numbers as the input layer weights and P random numbers as the hidden layer biases from a preset arbitrary interval;

a mean value calculating module, configured to calculate a first average value of the M random numbers and set the input layer weight of the initial random weight network RWN^(0) according to the first average value, and to calculate a second average value of the P random numbers and set the hidden layer bias of the initial random weight network RWN^(0) according to the second average value.
CN201710354539.0A, filed 2017-05-18 (priority date 2017-05-18): Random weight network generalization capability improvement method and device. Granted as CN107256425B (Active).

Priority Application (1)

CN201710354539.0A, priority date and filing date 2017-05-18: Random weight network generalization capability improvement method and device

Publications (2)

CN107256425A, published 2017-10-17 (application publication)
CN107256425B, published 2020-04-14 (granted patent)

Family

ID: 60027338
Family application: CN201710354539.0A (Active), priority date and filing date 2017-05-18 - Random weight network generalization capability improvement method and device
Country status: CN - CN107256425B

Families Citing this Family (1)

CN108564173A (priority date 2018-04-26, published 2018-09-21, 深圳大学): Random weight network generalization capability improvement method, device and computer-readable storage medium (cited by examiner)

Citations

Patent Citations (4)

CN102982373A (priority date 2012-12-31, published 2013-03-20, 山东大学): OIN (Optimal Input Normalization) neural network training method for a mixed SVM (Support Vector Machine) regression algorithm (cited by examiner)
CN104298999A (priority date 2014-09-30, published 2015-01-21, 西安电子科技大学): Hyperspectral feature learning method based on recursive automatic coding (cited by examiner)
WO2016145516A1 (priority date 2015-03-13, published 2016-09-22, Deep Genomics Incorporated): System and method for training neural networks (cited by examiner)
CN105550744A (priority date 2015-12-06, published 2016-05-04, 北京工业大学): Neural network clustering method based on iteration (cited by examiner)

Non-Patent Citations (1)

HUANG, Guang-Bin et al., "Extreme learning machine: Theory and applications", Neurocomputing, 2006-05-16 (cited by examiner)



Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant