CN115456202B - Method, device, equipment and medium for improving learning performance of working machine - Google Patents
- Publication number
- CN115456202B CN115456202B CN202211394593.5A CN202211394593A CN115456202B CN 115456202 B CN115456202 B CN 115456202B CN 202211394593 A CN202211394593 A CN 202211394593A CN 115456202 B CN115456202 B CN 115456202B
- Authority
- CN
- China
- Prior art keywords
- working machine
- local
- data set
- prediction
- variance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention relates to the field of machine learning, and provides a method, a device, equipment and a medium for improving the learning performance of a working machine. The method comprises the following steps: establishing a local training data set corresponding to each working machine, and training on the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; setting, through a server, an uncertainty correction term on the data of the local prediction model corresponding to each working machine, and aggregating based on an rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model; and sending the global prediction model to each working machine, setting a fusion algorithm for each working machine, and fusing over an uncertainty test data set to obtain the minimum-prediction-error model corresponding to each working machine. The method disclosed by the invention can obviously improve the final learning precision of the working-machine model after prediction fusion.
Description
Technical Field
The invention relates to the field of machine learning, in particular to a method, a device, equipment and a medium for improving the learning performance of a working machine.
Background
The Internet of Things generates a large amount of distributed data. A typical training mode is to store the data on a server and train the model there; however, the communication and computation costs of this mode are obvious: for example, the hundreds of gigabytes of data generated by a car over several hours are a great burden to transmit and process. In practice, distributed machine learning generally adopts a deep neural network as the machine learning model; such models have achieved unprecedented success in many applications, such as classification and pattern recognition, but are mainly limited to offline learning. In practical applications the working machine obtains a data stream, so online learning is an effective way to address this setting.
In the prior art, methods for improving the learning performance of a working machine perform the aggregation of global model predictions with gPoE (generalized Product of Experts). One disadvantage of gPoE is that, when the local predictions provided by the working machines are aggregated, the resulting global prediction model has large uncertainty: the prediction variance is large and conservative, and this conservative global prediction variance limits the final learning performance of the working machines in the distributed framework. The global prediction variance produced by existing aggregation algorithms is not clearly distinguished from the local prediction variance that each working machine obtains from its local data set with GPR (Gaussian process regression). As a result, when the two variances are compared, the advantage of the global prediction is not obvious; that is, the case where the global prediction variance obtained by the aggregation algorithm is smaller than the local prediction variance of the working machine rarely occurs, so the improvement the global model could bring to the local prediction performance cannot be reflected in the final local fusion process.
Disclosure of Invention
In view of this, the present invention provides a method, an apparatus, a device, and a medium for improving the learning performance of a working machine. The method uses Gaussian process regression (GPR) as the prediction model of each working machine and an rBCM (robust Bayesian committee machine) aggregation algorithm on the server: each working machine learns the target function from its local data set to predict the test output, then sends the locally predicted expectation and variance to the server. After receiving the prediction expectations and variances of all working machines, the server aggregates them into a global model with the rBCM algorithm and sends the resulting global prediction expectation and variance back to each working machine, which performs the final prediction fusion. Under an online learning framework, by introducing a precision correction term, the variance of the global prediction is made smaller and the final learning precision of the working machine is improved.
In view of the above object, an aspect of an embodiment of the present invention provides a method of improving the learning performance of a working machine, the method including the steps of: establishing a local training data set corresponding to each working machine, and training on the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; setting, through a server, an uncertainty correction term on the data of the local prediction model corresponding to each working machine, and aggregating based on an rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model; and sending the global prediction model to each working machine, setting a fusion algorithm for each working machine, and fusing over an uncertainty test data set to obtain the minimum-prediction-error model corresponding to each working machine.
In some embodiments, the establishing a local training data set corresponding to each working machine, and training the local training data set through a gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine includes: establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets; and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
In some embodiments, the establishing an objective function and a local training data set corresponding to each working machine, and building a test data set from the local training data sets, includes: calculating the projection of each test data point onto the local training data set to obtain a local projection set; and constructing the test data set based on the neighborhood corresponding to each projection point in the local projection set.
In some embodiments, said approximating said local training data set to said objective function on said test data set by a gaussian process regression algorithm to obtain a corresponding local prediction model for said each working machine comprises: calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine; and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, said calculating a gaussian posterior probability distribution over said test data set for said each working machine, and obtaining the expectation and variance of the corresponding local prediction for said each working machine comprises: and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
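The kernel selection and per-machine posterior computation described in these embodiments can be sketched as follows. This is a minimal illustration under assumptions (squared-exponential kernel, zero prior mean, toy data, and all names are illustrative), not the patent's implementation:

```python
import numpy as np

def sq_exp_kernel(A, B, length=1.0, sf2=1.0):
    """Squared-exponential kernel matrix: sf2 * exp(-|a-b|^2 / (2 l^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-d2 / (2.0 * length ** 2))

def gpr_posterior(X, y, X_star, noise=1e-2):
    """Posterior expectation and variance of a zero-mean GP at X_star."""
    K = sq_exp_kernel(X, X) + noise * np.eye(len(X))
    Ks = sq_exp_kernel(X, X_star)
    Kss = sq_exp_kernel(X_star, X_star)
    mean = Ks.T @ np.linalg.solve(K, y)
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

# one working machine's local training set and a single test input
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)
mu, var = gpr_posterior(X, y, np.array([[0.5]]))
```

The pair (mu, var) is what each working machine would send to the server as its local prediction.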
In some embodiments, the setting, by the server, an uncertainty correction term on the data of the local prediction model corresponding to each working machine and aggregating based on an rBCM aggregation algorithm to obtain a global prediction model includes: setting, through the server, an uncertainty correction term for the expectation and variance of the local prediction corresponding to each working machine; and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model.
In some embodiments, the aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model includes: aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertainty correction term and the rBCM aggregation algorithm to obtain the expectation and variance of the global prediction.
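The rBCM aggregation step of these embodiments can be illustrated with a short sketch. The weighting beta_i = 0.5 * (log prior_var - log var_i) used here is the standard rBCM uncertainty correction term from the literature; treating it as the patent's exact choice is an assumption, as are the function name and toy numbers:

```python
import numpy as np

def rbcm_aggregate(mus, vars_, prior_var=1.0):
    """Aggregate M local GP predictions at one test input with rBCM.

    beta_i = 0.5 * (log prior_var - log var_i) is the uncertainty
    correction term; it down-weights experts whose prediction is
    nearly as uncertain as the prior.
    """
    mus = np.asarray(mus, dtype=float)
    vars_ = np.asarray(vars_, dtype=float)
    beta = 0.5 * (np.log(prior_var) - np.log(vars_))
    # global precision: weighted local precisions plus prior correction
    prec_g = np.sum(beta / vars_) + (1.0 - np.sum(beta)) / prior_var
    var_g = 1.0 / prec_g
    mu_g = var_g * np.sum(beta * mus / vars_)
    return mu_g, var_g

# three working machines predicting the same test input
mu_g, var_g = rbcm_aggregate([1.0, 1.2, 0.9], [0.05, 0.10, 0.20], prior_var=1.0)
```

Note that the aggregated variance comes out smaller than every local variance, which is exactly the reduced conservatism the patent attributes to rBCM.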
In some embodiments, the sending the global prediction model to each working machine, and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain the minimum prediction error model corresponding to each working machine includes: transmitting the global predicted expectation and variance to each of the work machines; and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
In some embodiments, the fusing the set fusion algorithm for each working machine and an uncertainty test data set to obtain the minimum prediction error model for each working machine includes: setting a fusion algorithm and an uncertainty test data set for each working machine according to the global prediction variance and the local prediction variance of each working machine; and obtaining a prediction error minimum model corresponding to each working machine on the uncertainty test data set through the fusion algorithm so as to realize the error minimum of the expected value on the uncertainty test data set.
In some embodiments, said setting a fusion algorithm and an uncertainty test data set for said each work machine based on said global predicted variance and said local predicted variance for said each work machine comprises: and establishing an uncertainty test data set, and setting a fusion algorithm for each working machine according to the variance of global prediction and the variance of local prediction of data in the uncertainty test data set.
In some embodiments, the creating an uncertainty test data set and setting a fusion algorithm for each of the working machines according to the magnitude of the global predicted variance and the local predicted variance of the data in the uncertainty test data set comprises: in response to the variance of the global prediction for data in the uncertainty test dataset not being greater than the variance of the local prediction, using the global prediction model as a minimum model of prediction error for the work machine.
In some embodiments, the creating an uncertainty test data set and setting a fusion algorithm for each of the working machines according to the magnitude of the global predicted variance and the local predicted variance of the data in the uncertainty test data set further comprises: and in response to the variance of the global prediction of the data in the uncertainty test data set being greater than the variance of the local prediction, using the local prediction model corresponding to the working machine as the minimum prediction error model of the working machine.
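The variance-comparison fusion rule of the two embodiments above reduces to a one-line choice per test point; a minimal sketch (the function name is assumed):

```python
def fuse_prediction(mu_local, var_local, mu_global, var_global):
    """Per-test-point fusion: keep whichever prediction has the
    smaller variance, per the variance-comparison rule above."""
    if var_global <= var_local:
        return mu_global, var_global  # global model wins
    return mu_local, var_local        # local model wins
```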
In another aspect of the embodiments of the present invention, there is also provided an apparatus for improving the learning performance of a working machine, the apparatus including: a first module, configured to establish a local training data set corresponding to each working machine, and to train on the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; a second module, configured to set, through a server, an uncertainty correction term on the data of the local prediction model corresponding to each working machine, and to aggregate based on an rBCM (robust Bayesian committee machine) aggregation algorithm to obtain a global prediction model; and a third module, configured to send the global prediction model to each working machine, set a fusion algorithm for each working machine, and fuse over an uncertainty test data set to obtain the minimum-prediction-error model corresponding to each working machine.
In some embodiments, the first module is further configured to: establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets; and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
In some embodiments, the first module is further configured to: calculating the projection of each test data to the local training data set to obtain a local projection set; and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
In some embodiments, the first module is further configured to: calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine; and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, the first module is further configured to: and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
In some embodiments, the second module is further configured to: set, through the server, an uncertainty correction term for the expectation and variance of the local prediction corresponding to each working machine; and aggregate the expectation and variance of the local prediction corresponding to each working machine based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model.
In another aspect of the embodiments of the present invention, there is also provided a computer device, including at least one processor; and a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of any of the methods described above.
In another aspect of the embodiments of the present invention, a computer-readable storage medium is also provided, in which a computer program is stored; when executed by a processor, the computer program implements the steps of any one of the methods described above.
The invention has at least the following beneficial effects. The method for improving the learning performance of a working machine adopts Gaussian process regression (GPR) as the prediction model of each working machine, performs the aggregation of the global model through the rBCM algorithm, and sends the obtained global prediction expectation and variance to each working machine to realize the final prediction fusion. The global prediction precision can be improved, that is, the prediction variance (the uncertainty) of the global model is greatly reduced, so a better model fusion effect can be realized on each working machine. In particular, comparing the local and global variances, if the global prediction variance is very small, then for a working machine with a large local prediction variance a fusion algorithm that replaces the local model with the global model is more valuable. On the one hand, the rBCM global model aggregation algorithm reduces the uncertainty, i.e. the conservatism, of the global prediction; on the other hand, given the global prediction variance obtained by the server through rBCM, each working machine compares the global and local model variances and thereby remarkably improves the final learning precision after model prediction fusion.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of an embodiment of a method for improving learning performance of a work machine according to the present invention;
FIG. 2 is a schematic diagram of an embodiment of an apparatus for improving learning performance of a working machine according to the present invention;
FIG. 3 is a schematic diagram of one embodiment of a computer device provided by the present invention;
fig. 4 is a schematic diagram of an embodiment of a computer-readable storage medium provided in the present invention.
Detailed Description
Embodiments of the present invention are described below. However, it is to be understood that the disclosed embodiments are merely examples and that other embodiments may take various and alternative forms.
In addition, it should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used for distinguishing two entities with the same name but different names or different parameters, and it should be noted that "first" and "second" are only used for convenience of expression and should not be construed as a limitation to the embodiments of the present invention, and the descriptions thereof in the following embodiments are omitted. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
One or more embodiments of the present application will be described below with reference to the accompanying drawings.
In view of the above objects, a first aspect of an embodiment of the present invention proposes an embodiment of a method of improving learning performance of a working machine. Fig. 1 is a schematic diagram illustrating an embodiment of a method for improving learning performance of a working machine according to the present invention. As shown in fig. 1, a method for improving learning performance of a working machine according to an embodiment of the present invention includes the following steps:
s1, establishing a local training data set corresponding to each working machine, and training the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine;
s2, setting an uncertain correction term for the data of the local prediction model corresponding to each working machine through a server, and aggregating the correction terms based on an rBCM (random binary coded modulation) aggregation algorithm to obtain a global prediction model;
and S3, sending the global prediction model to each working machine, and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
In view of the above object, the first aspect of the embodiments of the present invention also provides another embodiment of a method for improving the learning performance of a working machine.
Distributed machine learning is used to address situations where the computational load is too large, the training data are too numerous, or the model is too big. When the amount of computation is too large, multi-thread or multi-machine parallel operation based on shared memory (or virtual memory) can be adopted. When there are too many training data, the data need to be partitioned and distributed to a plurality of working nodes for training, so that the local data at each working node stay within its capacity. Each working node trains a sub-model on its local data and communicates with other working nodes according to certain rules (the communication content is mainly sub-model parameters or parameter updates), so that the training results of all working nodes can finally be integrated effectively into a global machine learning model. When the model is too big, the model needs to be partitioned and distributed to different working nodes for training. Unlike data parallelism, the dependency relationships among sub-models in the model-parallel framework are very strong, because the output of one sub-model may be the input of another; without communication of intermediate results, the training of the whole model cannot be completed.
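The data-parallel case described above amounts to partitioning the training set across working nodes; a minimal sketch (even splitting and the function name are assumptions for illustration):

```python
import numpy as np

def partition(X, y, n_workers):
    """Split a training set evenly across worker nodes (data parallelism).

    Each returned (X_i, y_i) pair is one working node's local data set.
    """
    idx = np.array_split(np.arange(len(X)), n_workers)
    return [(X[i], y[i]) for i in idx]

shards = partition(np.arange(10).reshape(10, 1), np.arange(10), 3)
```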
General distributed machine learning adopts a deep neural network as the machine learning model and is mainly applied to pattern classification and pattern recognition, but it is limited to offline learning. In practical applications a working machine obtains a data stream in real time, so online learning is regarded as a means of solving this problem, and Gaussian process regression is one effective means. The Gaussian process model can be shown equivalent to existing machine learning models, including Bayesian linear models and multi-layer neural networks: according to the central limit theorem, assuming the weights in a neural network follow a Gaussian normal distribution, as the width of the neural network approaches infinity such a network is equivalent to Gaussian process regression. However, unlike conventional learning models such as linear regression, logistic regression, and neural networks, which require solving an optimization problem that minimizes a loss function to obtain the optimal model parameters, Gaussian process regression is a nonparametric statistical probability model and does not require solving such an optimization problem. Given training data and test inputs, the prediction of Gaussian process regression is divided into two steps: inference and prediction. The inference step assumes that the function to be learned obeys a Gaussian process, giving the Gaussian prior probability distribution of the model, and then uses the observed values and the Bayes rule to calculate the Gaussian posterior probability distribution of the model. When the local model prediction is completed, each working machine sends the obtained local prediction (expectation and variance) to the server, and the server completes the calculation of the global model, for example by using an average aggregation algorithm.
Finally, the server sends the computed global model (global expectation and variance) back to each working machine, and each working machine performs fusion calculation with the received global model and the local model obtained by its own training, in order to obtain an updated prediction of the target function that is closer to the true function value.
The method for improving the learning performance of a working machine uses Gaussian process regression (GPR) as the prediction model of each working machine, which learns the function from its local data set to predict the test output. Each working machine then sends the locally predicted expectation and variance to the server. After receiving the prediction expectations and variances of all working machines, the server performs global model aggregation with the rBCM algorithm and sends the obtained global prediction expectation and variance to each working machine to realize the final prediction fusion. The global aggregation of the rBCM algorithm improves the accuracy of the global prediction, that is, the prediction variance (uncertainty) of the global model is greatly reduced, so a better model fusion effect is achieved on each working machine. In particular, comparing the local and global variances, if the global prediction variance is very small, then for a working machine with a large local prediction variance a fusion algorithm in which the global model replaces the local model is more valuable.
Define the objective function as $f:\mathbb{R}^d \to \mathbb{R}$, where $d$ is the dimension of the input space. Without loss of generality, we assume that the output is one-dimensional. At time $t$, given input $x_t$, the corresponding output is

$$y_t = f(x_t) + \epsilon,$$

where $\epsilon$ is Gaussian noise with mean 0 and variance $\sigma_n^2$, i.e. $\epsilon \sim \mathcal{N}(0,\sigma_n^2)$. Define a training set of the form $\mathcal{D} = \{X, y\}$, where $X = \{x_1,\dots,x_n\}$ is the collection of input data and $y = [y_1,\dots,y_n]^\top$ is the column vector that aggregates the outputs. The Gaussian process regression objective is to use the training set $\mathcal{D}$ to approximate the function $f$ on a test data set $X_*$.

Define a symmetric positive semi-definite kernel function $k(x,x')$, i.e. $k:\mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$. Let $k(X,x)$ return a column vector whose $i$-th element equals $k(x_i,x)$. Assume the function $f$ is a sample from a Gaussian-process prior probability distribution with mean function $m(\cdot)$ and kernel function $k(\cdot,\cdot)$. Then the training output $y$ and test output $f_*$ obey the joint probability distribution

$$\begin{bmatrix} y \\ f_* \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} m(X) \\ m(X_*) \end{bmatrix},\; \begin{bmatrix} K(X,X)+\sigma_n^2 I & K(X,X_*) \\ K(X_*,X) & K(X_*,X_*) \end{bmatrix} \right),$$

where $m(X)$ and $m(X_*)$ return the vectors composed of the values $m(x_i)$, and $K(X,X')$ returns a matrix whose $(i,j)$ element is $k(x_i,x'_j)$.

Using the properties of the Gaussian process, Gaussian process regression uses the training set $\mathcal{D}$ to predict the output on the test data set $X_*$. This output obeys a normal distribution, i.e. $f_* \mid X, y, X_* \sim \mathcal{N}(\mu_*, \Sigma_*)$, where

$$\mu_* = m(X_*) + K(X_*,X)\,[K(X,X)+\sigma_n^2 I]^{-1}(y - m(X)),$$
$$\Sigma_* = K(X_*,X_*) - K(X_*,X)\,[K(X,X)+\sigma_n^2 I]^{-1}K(X,X_*).$$
In distributed machine learning, consider a network having $M$ working machines; define this set as $\mathcal{M} = \{1,\dots,M\}$. At each moment $t$, each working machine $i \in \mathcal{M}$ uses its local training data $\mathcal{D}_i$ to predict the function output at the test input $x_*$. The local prediction obtained by each machine's training is the pair $\mu_i(x_*), \sigma_i^2(x_*)$.

Under the federated learning framework, each working machine sends its trained local prediction $\mu_i(x_*), \sigma_i^2(x_*)$ to the server.
The specific steps of distributed training and fusion are as follows:
(1) Construct a training subset based on projection onto the training set. Define the distance between two training data points $x$ and $x'$ as $\lVert x-x'\rVert$, and the distance from a data point $x$ to a set $X$ as $d(x,X)=\min_{x'\in X}\lVert x-x'\rVert$. Define the projection of a data point $x$ onto a set $X$ as the set $P_X(x)=\{x'\in X:\lVert x-x'\rVert=d(x,X)\}$.

Consider each working machine $i$ and its local training data set $\mathcal{D}_i=\{X_i,\mathbf{y}_i\}$. For a test datum $x_*$, calculate the projection of $x_*$ onto the training inputs $X_i$, labeled

$$P_{X_i}(x_*)=\{x'\in X_i:\lVert x_*-x'\rVert=d(x_*,X_i)\}.$$

For each working machine $i$ and its projection set $P_{X_i}(x_*)$, take out each projection point, denoted $p_j$; here the subscript $j$ denotes the $j$-th projection point. Then, for each projection point $p_j$, find a neighborhood of it within $X_i$ that contains $p_j$ and its nearest training points. It should be noted here that the number of neighborhood points is adjustable and can be fixed in advance.
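Step (1) can be sketched as follows; the Euclidean metric and the fixed neighborhood size `k` are illustrative assumptions (the text only states that the neighborhood size is adjustable and may be fixed).

```python
import numpy as np

def projection_subset(X_train, x_star, k=5):
    """Indices of the training points nearest to the projection of x_star.

    For a finite training set, the projection of x_star onto X_train is its
    closest training point; the neighborhood of that projection point is taken
    here as its k nearest training points (k is adjustable and fixed here).
    """
    d = np.linalg.norm(X_train - x_star, axis=1)
    proj_idx = np.argmin(d)                              # projection point of x_star
    d_proj = np.linalg.norm(X_train - X_train[proj_idx], axis=1)
    neigh_idx = np.argsort(d_proj)[:k]                   # neighborhood of the projection
    return neigh_idx

X_train = np.arange(20, dtype=float).reshape(-1, 1)      # toy local training inputs
idx = projection_subset(X_train, np.array([7.2]), k=3)   # subset for test input 7.2
```

The returned indices select the rows of the local training set on which the Gaussian posterior of step (3) is then computed.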
(2) Select a kernel function. In practical applications the squared-exponential (Gaussian) kernel is generally selected:

$$k(x,x')=\sigma_f^2\exp\!\left(-\frac{\lVert x-x'\rVert^2}{2\ell^2}\right),$$

where $\sigma_f^2$ is the signal variance and $\ell$ is the length scale.
(3) For each working machine $i$, calculate the Gaussian posterior probability distribution on the new training subset $\mathcal{D}_i^{\mathrm{new}}=\{X_i^{\mathrm{new}},\mathbf{y}_i^{\mathrm{new}}\}$ (the union of the neighborhoods found above), i.e.:

$$\mu_i(x_*)=k(X_i^{\mathrm{new}},x_*)^{\top}\big(K_i^{\mathrm{new}}+\sigma_e^2 I\big)^{-1}\mathbf{y}_i^{\mathrm{new}},\qquad \sigma_i^2(x_*)=k(x_*,x_*)-k(X_i^{\mathrm{new}},x_*)^{\top}\big(K_i^{\mathrm{new}}+\sigma_e^2 I\big)^{-1}k(X_i^{\mathrm{new}},x_*).\qquad(7)$$

On the training subset $\mathcal{D}_i^{\mathrm{new}}$, the local prediction $\mu_i(x_*)$ and $\sigma_i^2(x_*)$ is obtained using equation (7) and then sent to the server. The local prediction error can be proved to be smaller than an upper bound defined by the local prediction standard deviation; that is, for the test input $x_*$, the following inequality holds with high probability:

$$\lvert\mu_i(x_*)-f(x_*)\rvert\le\beta\,\sigma_i(x_*).\qquad(8)$$
(4) The server utilizes the rBCM aggregation algorithm to aggregate the local predicted values, giving the global prediction expectation and variance:

$$\sigma_g^{-2}(x_*)=\sum_{i=1}^{M}\beta_i\,\sigma_i^{-2}(x_*)+\Big(1-\sum_{i=1}^{M}\beta_i\Big)k(x_*,x_*)^{-1},\qquad \mu_g(x_*)=\sigma_g^{2}(x_*)\sum_{i=1}^{M}\beta_i\,\sigma_i^{-2}(x_*)\,\mu_i(x_*),\qquad(9)$$

where $\beta_i=\tfrac{1}{2}\big(\ln k(x_*,x_*)-\ln\sigma_i^2(x_*)\big)$ is the uncertainty correction term, which makes the global prediction variance smaller. The global prediction expectation obtained by the rBCM algorithm is consistent; that is, when the training data are large enough, $\mu_g(x_*)$ can approximate the function $f(x_*)$. Therefore, the approximation error satisfies:

$$\lvert\mu_g(x_*)-f(x_*)\rvert\le\beta\,\sigma_g(x_*).\qquad(10)$$
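Step (4) can be sketched as below, using the standard rBCM weights $\beta_i=\tfrac{1}{2}(\ln\sigma_{\mathrm{prior}}^2-\ln\sigma_i^2)$ as the uncertainty correction term; the text does not reproduce its exact correction term at this point, so treat this as an assumed instantiation.

```python
import numpy as np

def rbcm_aggregate(mus, vars_, prior_var):
    """robust Bayesian Committee Machine fusion of M local GP predictions.

    mus, vars_ : local predictive means/variances at one test input, shape (M,)
    prior_var  : prior variance k(x_*, x_*) of the GP at that test input
    """
    mus, vars_ = np.asarray(mus, float), np.asarray(vars_, float)
    beta = 0.5 * (np.log(prior_var) - np.log(vars_))       # uncertainty correction
    prec = (beta / vars_).sum() + (1.0 - beta.sum()) / prior_var
    var_g = 1.0 / prec                                     # global variance
    mu_g = var_g * (beta * mus / vars_).sum()              # global expectation
    return mu_g, var_g

# two working machines reporting local predictions at the same test input
mu_g, var_g = rbcm_aggregate([1.0, 1.2], [0.2, 0.3], prior_var=1.0)
```

Machines whose local variance is close to the prior variance receive a weight near zero, so uninformative local models barely influence the aggregate.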
(5) The server sends the global prediction expectation $\mu_g(x_*)$ and variance $\sigma_g^2(x_*)$ to each working machine. According to the global prediction variance and the local prediction variance $\sigma_i^2(x_*)$, a fusion algorithm is designed for each working machine so that the fused prediction expectation better approximates the true value of the objective function $f$. Construct the set of test data with small global uncertainty as follows:

$$\Omega_i=\{x_*\in X_*:\sigma_g^2(x_*)\le\sigma_i^2(x_*)\}.$$

If this set is not empty, the global prediction $\mu_g(x_*)$ and $\sigma_g^2(x_*)$ from the server is used on it; if the set is empty, the local prediction $\mu_i(x_*)$ and $\sigma_i^2(x_*)$ of the working machine is used. Because the global prediction variance obtained by the rBCM algorithm is smaller, the global prediction tends to dominate when the global and local variances are compared in the working-machine fusion algorithm. If the local variance is large, the global prediction expectation and variance are substituted for it, so the local prediction of the working machine is significantly improved. On the other hand, comparing the upper bounds of equations (8) and (10), when the global prediction variance is small enough the confidence interval becomes narrower, reflecting a smaller approximation error.
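The fusion rule in step (5) reduces to a per-test-point variance comparison; a minimal sketch, assuming the comparison is applied element-wise over the test inputs:

```python
import numpy as np

def fuse(mu_local, var_local, mu_global, var_global):
    """Keep whichever prediction (local vs. global) has the smaller variance.

    Where the global rBCM variance is not greater than the local one, the
    global prediction replaces the local prediction of the working machine.
    """
    mu_l, v_l = np.asarray(mu_local, float), np.asarray(var_local, float)
    mu_g, v_g = np.asarray(mu_global, float), np.asarray(var_global, float)
    use_global = v_g <= v_l                    # the small-uncertainty test set
    mu = np.where(use_global, mu_g, mu_l)
    var = np.where(use_global, v_g, v_l)
    return mu, var

# two test inputs: global wins on the first, local wins on the second
mu, var = fuse([0.5, 0.9], [0.4, 0.1], [0.6, 1.0], [0.2, 0.3])
```

By construction the fused variance is the element-wise minimum of the two, so the fused prediction never has larger uncertainty than the local one.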
In a second aspect of embodiments of the present invention, an apparatus for improving the learning performance of a working machine is provided. Fig. 2 is a schematic diagram illustrating an embodiment of an apparatus for improving learning performance of a working machine according to the present invention. As shown in fig. 2, the apparatus includes: a first module 011 configured to establish a local training data set corresponding to each working machine and train it through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine; a second module 012 configured to set, through the server, an uncertainty correction term for the data of the local prediction model corresponding to each working machine and aggregate the data based on the rBCM aggregation algorithm to obtain a global prediction model; and a third module 013 configured to send the global prediction model to each working machine, set a fusion algorithm for each working machine, and fuse an uncertainty test data set to obtain a minimum-prediction-error model corresponding to each working machine.
The first module 011 is further configured for: establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets; and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
The first module 011 is further configured for: calculating the projection of each test data to the local training data set to obtain a local projection set; and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
The first module 011 is further configured for: calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine; and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
The first module 011 is further configured for: and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
The second module 012 is further configured to: set, through the server, an uncertainty correction term for the expectation and variance of the local prediction corresponding to each working machine; and aggregate the expectation and variance of the local prediction corresponding to each working machine based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model.
In view of the above object, a third aspect of the embodiments of the present invention provides a computer device, and fig. 3 is a schematic diagram illustrating an embodiment of a computer device provided by the present invention. As shown in FIG. 3, an embodiment of a computer device provided by the present invention includes the following modules: at least one processor 021; and a memory 022, the memory 022 storing computer instructions 023 executable on the processor 021, the computer instructions 023, when executed by the processor 021, implementing the steps of the method as described above.
The invention also provides a computer readable storage medium. FIG. 4 is a schematic diagram illustrating an embodiment of a computer-readable storage medium provided by the present invention. As shown in fig. 4, the computer readable storage medium 031 stores a computer program 032 which, when executed by a processor, performs the method as described above.
Finally, it should be noted that, as one of ordinary skill in the art can appreciate that all or part of the processes of the methods of the above embodiments can be implemented by a computer program to instruct related hardware, and the program of the method for setting system parameters can be stored in a computer readable storage medium, and when executed, the program can include the processes of the embodiments of the methods as described above. The storage medium of the program may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments corresponding thereto.
Furthermore, the methods disclosed according to embodiments of the present invention may also be implemented as a computer program executed by a processor, which may be stored in a computer-readable storage medium. Which when executed by a processor performs the above-described functions as defined in the method disclosed by an embodiment of the invention.
Further, the above method steps and system elements may also be implemented using a controller and a computer readable storage medium for storing a computer program for causing the controller to implement the functions of the above steps or elements.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing are exemplary embodiments of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only, and is not intended to imply that the scope of the disclosure, including the claims, of embodiments of the invention is limited to these examples; within the idea of an embodiment of the invention, technical features in the above embodiment or in different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention as described above, which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.
Claims (20)
1. A method of improving learning performance of a work machine, the method being for achieving final prediction fusion of the work machine and improving final learning accuracy of the work machine, and the method comprising:
establishing a local training data set corresponding to each working machine, and training the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine;
setting an uncertain correction term for the data of the local prediction model corresponding to each working machine through a server, and aggregating the data of the local prediction model corresponding to each working machine after the uncertain correction term is set based on an rBCM aggregation algorithm to obtain a global prediction model;
and sending the global prediction model to each working machine, establishing a corresponding uncertainty test data set and a fusion algorithm for each working machine according to the local prediction model corresponding to each working machine and the global prediction model, and fusing the uncertainty test data set through the fusion algorithm to obtain a prediction error minimum model corresponding to each working machine.
2. The method of claim 1, wherein the establishing a local training data set corresponding to each working machine and training the local training data set through a gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine comprises:
establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets;
and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
3. The method of claim 2, wherein the establishing a local training data set corresponding to the objective function and each working machine, and the constructing a test data set from the local training data set comprises:
calculating the projection of each test data to the local training data set to obtain a local projection set;
and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
4. The method of claim 2, wherein approximating the local training data set to the objective function on the test data set by a gaussian process regression algorithm to obtain the local prediction model for each of the work machines comprises:
calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine;
and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
5. The method of claim 4, wherein calculating a Gaussian posterior probability distribution over the test data set for each of the work machines, obtaining the expectation and variance of the corresponding local prediction for each of the work machines comprises:
and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
6. The method according to claim 4, wherein the setting, by the server, a correction term of uncertainty for the data of the local prediction model corresponding to each working machine, and aggregating the data of the local prediction model corresponding to each working machine after the correction term of uncertainty is set based on an rBCM aggregation algorithm to obtain a global prediction model comprises:
setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server;
and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
7. The method according to claim 6, wherein the aggregating the expectation and variance of the local prediction corresponding to each of the working machines based on the uncertainty correction term and the rBCM aggregation algorithm to obtain a global prediction model comprises:
and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain the expectation and variance of the global prediction.
8. The method of claim 7, wherein the sending the global prediction model to each of the plurality of working machines, establishing a corresponding uncertainty test data set and fusion algorithm for each of the plurality of working machines according to the local prediction model corresponding to each of the plurality of working machines and the global prediction model, and performing fusion on the uncertainty test data set through the fusion algorithm to obtain the minimum prediction error model corresponding to each of the plurality of working machines comprises:
transmitting the global predicted expectation and variance to each of the work machines;
and setting a fusion algorithm for each working machine and fusing an uncertainty test data set to obtain a prediction error minimum model corresponding to each working machine.
9. The method of claim 8, wherein fusing the set of uncertainty test data with the set of fusion algorithms for each of the plurality of work machines to obtain the minimum prediction error model for each of the plurality of work machines comprises:
setting a fusion algorithm and an uncertainty test data set for each working machine according to the variance of the global prediction and the variance of the local prediction of each working machine;
and obtaining a prediction error minimum model corresponding to each working machine on the uncertainty test data set through the fusion algorithm so as to realize the error minimum of the expected value on the uncertainty test data set.
10. The method of claim 9, wherein said setting a fusion algorithm and an uncertainty test data set for each of said plurality of work machines based on said global predicted variance and said local predicted variance for each of said plurality of work machines comprises:
and establishing an uncertainty test data set, and setting a fusion algorithm for each working machine according to the variance of global prediction and the variance of local prediction of data in the uncertainty test data set.
11. The method of claim 10, wherein said creating an uncertainty test data set and setting a fusion algorithm for each of said work machines based on the magnitude of the global predicted variance and the local predicted variance of data in said uncertainty test data set comprises:
in response to the variance of the global prediction for data in the uncertainty test dataset not being greater than the variance of the local prediction, using the global prediction model as a minimum model of prediction error for the work machine.
12. The method of claim 10, wherein said creating an uncertainty test data set and setting a fusion algorithm for each of said work machines based on the magnitude of the global predicted variance and the local predicted variance of data in said uncertainty test data set further comprises:
and in response to the variance of the global prediction of the data in the uncertainty test data set being greater than the variance of the local prediction, using the local prediction model corresponding to the working machine as the minimum prediction error model of the working machine.
13. An apparatus for improving learning performance of a working machine, the apparatus being configured to achieve final prediction fusion of the working machine and to improve final learning accuracy of the working machine, and the apparatus comprising:
the system comprises a first module, a second module and a third module, wherein the first module is configured to establish a local training data set corresponding to each working machine and train the local training data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine;
the second module is configured to set an uncertain correction term for the data of the local prediction model corresponding to each working machine through the server, and aggregate the data after the local prediction model corresponding to each working machine is set with the uncertain correction term based on an rBCM aggregation algorithm to obtain a global prediction model; and
and the third module is configured to send the global prediction model to each working machine, establish a corresponding uncertainty test data set and a fusion algorithm for each working machine according to the local prediction model corresponding to each working machine and the global prediction model, and perform fusion on the uncertainty test data set through the fusion algorithm to obtain a prediction error minimum model corresponding to each working machine.
14. The apparatus of claim 13, wherein the first module is further configured to:
establishing a target function and a local training data set corresponding to each working machine, and establishing a test data set through the local training data sets;
and approximating the local training data set to the target function on the test data set through a Gaussian process regression algorithm to obtain a local prediction model corresponding to each working machine.
15. The apparatus of claim 14, wherein the first module is further configured to:
calculating the projection of each test data to the local training data set to obtain a local projection set;
and constructing a test data set based on the neighborhood corresponding to each projection point in the local projection set.
16. The apparatus of claim 14, wherein the first module is further configured to:
calculating Gaussian posterior probability distribution of each working machine on the test data set to obtain expectation and variance of local prediction corresponding to each working machine;
and establishing a local prediction model through the expectation and the variance of the local prediction corresponding to each working machine.
17. The apparatus of claim 16, wherein the first module is further configured for:
and selecting a kernel function matched with the calculated Gaussian posterior probability, and calculating the Gaussian posterior probability distribution of each working machine on the test data set based on the kernel function to obtain the expectation and the variance of the local prediction corresponding to each working machine.
18. The apparatus of claim 16, wherein the second module is further configured for:
setting a correction term of uncertainty for the expectation and variance of the local prediction corresponding to each working machine through a server;
and aggregating the expectation and variance of the local prediction corresponding to each working machine based on the uncertain correction term and the rBCM aggregation algorithm to obtain a global prediction model.
19. A computer device, comprising:
at least one processor; and
a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of the method of any one of claims 1 to 12.
20. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211394593.5A CN115456202B (en) | 2022-11-08 | 2022-11-08 | Method, device, equipment and medium for improving learning performance of working machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115456202A CN115456202A (en) | 2022-12-09 |
CN115456202B true CN115456202B (en) | 2023-04-07 |
Family
ID=84309944
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211394593.5A Active CN115456202B (en) | 2022-11-08 | 2022-11-08 | Method, device, equipment and medium for improving learning performance of working machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115456202B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117370473B (en) * | 2023-12-07 | 2024-03-01 | 苏州元脑智能科技有限公司 | Data processing method, device, equipment and storage medium based on integrity attack |
CN117473331B (en) * | 2023-12-27 | 2024-03-08 | 苏州元脑智能科技有限公司 | Stream data processing method, device, equipment and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109688110A (en) * | 2018-11-22 | 2019-04-26 | 顺丰科技有限公司 | DGA domain name detection model construction method, device, server and storage medium |
CN112381145A (en) * | 2020-11-16 | 2021-02-19 | 江康(上海)科技有限公司 | Gaussian process regression multi-model fusion modeling method based on nearest correlation spectral clustering |
CN114912626A (en) * | 2022-04-15 | 2022-08-16 | 上海交通大学 | Method for processing distributed data of federal learning mobile equipment based on summer pril value |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6697159B2 (en) * | 2016-07-13 | 2020-05-20 | 富士通株式会社 | Machine learning management program, machine learning management device, and machine learning management method |
CN107451101B (en) * | 2017-07-21 | 2020-06-09 | 江南大学 | Method for predicting concentration of butane at bottom of debutanizer by hierarchical integrated Gaussian process regression soft measurement modeling |
US20220101178A1 (en) * | 2020-09-25 | 2022-03-31 | EMC IP Holding Company LLC | Adaptive distributed learning model optimization for performance prediction under data privacy constraints |
CN115174191B (en) * | 2022-06-30 | 2024-01-09 | 山东云海国创云计算装备产业创新中心有限公司 | Local predicted value safe transmission method, computer equipment and storage medium |
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||