CN112785000A - Machine learning model training method and system for large-scale machine learning system
- Publication number
- CN112785000A CN202110127839.1A
- Authority
- CN
- China
- Prior art keywords
- model
- machine learning
- module
- training
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention belongs to the technical field of model training and discloses a machine learning model training method and system for a large-scale machine learning system. The system comprises a data acquisition module, a data preprocessing module, a parameter range determining module, a central control module, a model training module, a model testing module, a model evaluation module, a model optimization module, a data storage module and an updating display module. The data preprocessing module processes the training sample set to obtain a feature subset of the training sample set, which reduces the volume of model training data; the machine learning model is trained in an incremental-learning-based manner, which can improve the accuracy of model training; and the model evaluation module and the model optimization module determine the optimal parameter values within the value range of each parameter and adjust the model parameters accordingly, which improves the training efficiency of the model.
Description
Technical Field
The invention belongs to the technical field of model training, and particularly relates to a machine learning model training method and system for a large-scale machine learning system.
Background
Currently, with the widespread popularity of machine learning, various machine learning models are receiving more and more attention. For a machine learning model, it is usually required to train the machine learning model based on training data (also called training samples), and then perform some kind of prediction, such as performing class prediction, using the trained machine learning model.
During the training of a machine learning model, samples need to be added to or modified for the model. To enlarge the set of training samples, different features need to be added, or different features need to be combined and input to the machine learning model one by one. Existing training methods for learning models, however, are tedious and time-consuming, with low training efficiency, flexibility and applicability. A new machine learning model training method for a large-scale machine learning system is therefore needed.
Through the above analysis, the problems and defects of the prior art are as follows: existing learning model training methods are tedious and time-consuming, with low training efficiency, flexibility and applicability.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a machine learning model training method and system for a large-scale machine learning system.
The invention is realized as follows: a machine learning model training method for a large-scale machine learning system comprises the following steps:
step one, acquiring, by a data acquisition module using data acquisition equipment, the latest feature set of the machine learning model and the incremental data in the current time period; processing and dividing, by a data preprocessing module using a data preprocessing program, the acquired feature set and the incremental data in the current time period to obtain a training sample set and a test sample set;
the processing and dividing of the acquired feature set and the incremental data in the current time period by the data preprocessing module through a data preprocessing program comprises the following steps:
(1.1) calculating the weight of each feature data and each incremental data by using a Jaccard index for the acquired feature set and the incremental data in the current time period to form a first weight set;
(1.2) comparing the weights of the characteristic data and the incremental data in the first weight set with a preset weight threshold, and screening the characteristic data and the incremental data meeting the requirements to obtain a first data subset;
(1.3) taking the first data subset, and calculating the weight of each feature data and each incremental data by using a Relief-F algorithm to form a second weight set;
(1.4) the weight of each feature data and each incremental data in the second weight set is taken, the weights are compared with a preset threshold, and the feature data and the incremental data meeting the requirements are selected to obtain a final data set;
(1.5) carrying out clustering analysis on the obtained data set to obtain a plurality of data subsets; extracting training set data with the same proportion from each data subset to obtain a plurality of training subsets, and taking the residual data in each data subset as a test subset;
(1.6) combining the training subsets to obtain a training set, and combining the test subsets to obtain a test set;
(1.7) calculating the mean value and the standard deviation of the physicochemical data corresponding to the current training set and to the current test set, respectively; based on these values, calculating the mean error and the standard deviation error between the physicochemical values of the training set and the test set;
(1.8) if the calculated mean error and standard deviation error between the physicochemical values of the training set and the test set are each less than or equal to a preset threshold, taking the current training set and test set as the final training set and final test set; otherwise, returning to step (1.5);
step two, processing the training sample set to obtain a feature subset of the training sample set; determining, by a parameter range determining module using a range determination program, the range of the model parameters to be selected according to the type of the machine learning model;
step three, under the coordinated control of the central processing unit of a central control module, sequentially selecting, by a model training module, model parameters within the range of the model parameters, and training the machine learning model on the feature subset of the training sample set using a model training program;
step four, testing, by a model testing module using a model testing program, the trained machine learning model on the test sample set; evaluating, by a model evaluation module using an evaluation program, the trained machine learning model to obtain a model evaluation value;
step five, adjusting, by a model optimization module using a model optimization program, the model parameters of the machine learning model according to the obtained model evaluation value and the model test result to obtain optimal parameters and an optimal model;
step six, storing, by a data storage module using a memory, the acquired feature set, the training sample set, the test sample set, the range of model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model;
step seven, updating and displaying, by an updating display module using a display, real-time data of the acquired feature set, the training sample set, the test sample set, the range of model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model.
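As a minimal sketch of the first screening stage in steps (1.1) and (1.2), the following Python fragment weights each feature by a Jaccard index and keeps the features whose weight reaches a threshold. The text does not say what each feature is compared against, so scoring binary feature columns against a binary label vector is an assumption, as are the names `jaccard_index` and `screen_features`, the sample data, and the `>=` comparison.

```python
from typing import Dict, List, Sequence


def jaccard_index(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard index of two binary vectors: |A and B| / |A or B|."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0


def screen_features(
    features: Dict[str, Sequence[int]],
    labels: Sequence[int],
    threshold: float,
) -> List[str]:
    """Build the first weight set, then keep features at or above the threshold."""
    weights = {name: jaccard_index(col, labels) for name, col in features.items()}
    return [name for name, w in weights.items() if w >= threshold]


# Hypothetical toy data: three binary features and one binary label vector.
features = {
    "f1": [1, 1, 0, 0],
    "f2": [0, 1, 1, 0],
    "f3": [0, 0, 0, 1],
}
labels = [1, 1, 0, 0]
selected = screen_features(features, labels, threshold=0.3)
print(selected)  # ['f1', 'f2']
```

The second stage in steps (1.3) and (1.4) would apply the same thresholding pattern to Relief-F weights computed on the surviving subset.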
Further, testing the trained machine learning model on the test sample set by the model testing module using the model testing program includes:
(1) receiving a test request for the machine learning model to be tested;
(2) calling a model test service according to the test request to test the model to be tested on the test sample set;
(3) outputting a test result of the machine learning model to be tested.
Further, the request carries test information of the model to be tested, and the test information comprises a model file, a test data set and parameters of the model to be tested.
Further, the model test service comprises a plurality of deep learning frameworks; different frameworks among the plurality of deep learning frameworks are used for building different test models; the different test models are used for testing different deep learning models.
Further, the evaluating the trained machine learning model by the model evaluation module using an evaluation program to obtain a model evaluation value includes:
(1) obtaining the value range of each parameter of the machine learning model through a model evaluation module;
(2) determining the initial value of the corresponding parameter by utilizing an evaluation program in the value range of each parameter;
(3) controlling, by the central processing unit, the evaluation program to adjust each parameter to its initial value, and acquiring a model evaluation value from the evaluation program.
Further, the model evaluation value is used to indicate the performance of the parameter-adjusted machine learning model.
Further, adjusting the model parameters of the machine learning model by the model optimization module using the model optimization program, according to the obtained model evaluation value and the model test result, to obtain the optimal parameters and the optimal model includes:
re-determining the value of each parameter within its value range according to the obtained model evaluation value and the model test result, and comparing the model evaluation values and model test results obtained for the different parameter values, wherein the parameter values corresponding to the best model evaluation value and model test result are the optimal parameters; and controlling the parameter adjusting program to adjust the parameters of the model based on the optimal parameters, so as to obtain the optimal model.
Another object of the present invention is to provide a machine learning model training system for a large-scale machine learning system, which implements the machine learning model training method for a large-scale machine learning system, the machine learning model training system for a large-scale machine learning system comprising:
the system comprises a data acquisition module, a data preprocessing module, a parameter range determining module, a central control module, a model training module, a model testing module, a model evaluation module, a model optimizing module, a data storage module and an updating display module;
the data acquisition module is connected with the central control module and used for acquiring the latest feature set of the machine learning model and incremental data in the current time period through data acquisition equipment;
the data preprocessing module is connected with the central control module and used for processing and dividing the acquired feature set and incremental data in the current time period through a data preprocessing program to obtain a training sample set and a test sample set; meanwhile, processing the training sample set to obtain a characteristic subset of the training sample set;
the parameter range determining module is connected with the central control module and used for determining the range of the model parameter to be selected according to the type of the machine learning model through a range determining program;
the central control module is connected with the data acquisition module, the data preprocessing module, the parameter range determining module, the model training module, the model testing module, the model evaluation module, the model optimization module, the data storage module and the updating display module and is used for coordinating and controlling the normal operation of each module of the machine learning model training system facing the large-scale machine learning system through the central processing unit;
the model training module is connected with the central control module and is used for sequentially selecting model parameters in the range of the model parameters and training the machine learning model by utilizing the characteristic subset of the training sample set through a model training program;
the model testing module is connected with the central control module and used for testing the machine learning model obtained by training by utilizing a testing sample set through a model testing program;
the model evaluation module is connected with the central control module and used for evaluating the trained machine learning model through an evaluation program to obtain a model evaluation value;
the model optimization module is connected with the central control module and used for adjusting model parameters of the machine learning model according to the obtained model evaluation value and the model test result through a model optimization program to obtain an optimal model;
the data storage module is connected with the central control module and used for storing the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model through a memory;
and the updating display module is connected with the central control module and is used for updating and displaying the acquired feature set, the training sample set, the testing sample set, the range of the model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the real-time data of the optimal model through a display.
Another object of the present invention is to provide a computer program product stored on a computer readable medium, which includes a computer readable program for providing a user input interface to implement the method for training a machine learning model of a large-scale machine learning system when the computer program product is executed on an electronic device.
Another object of the present invention is to provide a computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to execute the method for training a machine learning model for a large-scale machine learning system.
In combination with all of the above technical solutions, the invention has the following advantages and positive effects: in the machine learning model training method for a large-scale machine learning system, the data preprocessing module processes the obtained training sample set to obtain a feature subset of the training sample set, which can greatly reduce the volume of model training data; the machine learning model is trained in an incremental-learning-based manner, which can improve the accuracy of model training; and the model evaluation module and the model optimization module determine the optimal parameter values within the value range of each parameter and adjust the model parameters, which improves the training efficiency of the model. The method can thereby address the tedium, long time consumption, low training efficiency, low flexibility and low applicability of existing learning model training methods.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention.
Fig. 2 is a structural block diagram of a machine learning model training system for a large-scale machine learning system according to an embodiment of the present invention.
In the figure: 1. data acquisition module; 2. data preprocessing module; 3. parameter range determining module; 4. central control module; 5. model training module; 6. model testing module; 7. model evaluation module; 8. model optimization module; 9. data storage module; 10. updating display module.
Fig. 3 is a flowchart of a method for processing and dividing the acquired feature set and the incremental data in the current time period through a data preprocessing program by a data preprocessing module according to an embodiment of the present invention.
Fig. 4 is a flowchart of a method for testing the machine learning model obtained by training through a model testing module by using a model testing program according to an embodiment of the present invention.
Fig. 5 is a flowchart of a method for obtaining a model evaluation value by evaluating the trained machine learning model with an evaluation program by a model evaluation module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides a machine learning model training method and system for a large-scale machine learning system, and the invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, the machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention includes the following steps:
S101, acquiring, by the data acquisition module using data acquisition equipment, the latest feature set of the machine learning model and the incremental data in the current time period; processing and dividing, by the data preprocessing module using a data preprocessing program, the acquired feature set and the incremental data in the current time period to obtain a training sample set and a test sample set;
S102, processing the training sample set to obtain a feature subset of the training sample set; determining, by the parameter range determining module using a range determination program, the range of the model parameters to be selected according to the type of the machine learning model;
S103, under the coordinated control of the central processing unit of the central control module, sequentially selecting, by the model training module, model parameters within the range of the model parameters, and training the machine learning model on the feature subset of the training sample set using a model training program;
S104, testing, by the model testing module using a model testing program, the trained machine learning model on the test sample set; evaluating, by the model evaluation module using an evaluation program, the trained machine learning model to obtain a model evaluation value;
S105, adjusting, by the model optimization module using a model optimization program, the model parameters of the machine learning model according to the obtained model evaluation value and the model test result to obtain optimal parameters and an optimal model;
S106, storing, by the data storage module using a memory, the acquired feature set, the training sample set, the test sample set, the range of model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model;
S107, updating and displaying, by the updating display module using a display, real-time data of the acquired feature set, the training sample set, the test sample set, the range of model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model.
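The control flow of S103 to S106 can be illustrated with a short Python sketch, offered only as one possible reading. It assumes that "sequentially selecting model parameters" means trying every combination of candidate values, and that a higher evaluation value is better; the function `run_pipeline`, the callables `train` and `evaluate`, and the toy objective are all hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Tuple

Params = Dict[str, float]


def run_pipeline(
    param_ranges: Dict[str, Iterable[float]],
    train: Callable[[Params], object],
    evaluate: Callable[[object], float],
) -> Tuple[Params, object, List[Tuple[Params, float]]]:
    """Train with each parameter combination (S103), score it (S104),
    keep the best one (S105) and record every trial (S106)."""
    names = list(param_ranges)
    history: List[Tuple[Params, float]] = []  # stands in for the data storage module
    best_params, best_model, best_score = None, None, float("-inf")
    for values in product(*(param_ranges[n] for n in names)):
        params = dict(zip(names, values))
        model = train(params)
        score = evaluate(model)
        history.append((params, score))
        if score > best_score:
            best_params, best_model, best_score = params, model, score
    return best_params, best_model, history


def toy_train(params: Params) -> Params:
    # Stand-in for real training: the "model" is just a copy of its parameters.
    return dict(params)


def toy_evaluate(model: Params) -> float:
    # Hypothetical evaluation value that peaks at lr=0.1, depth=3.
    return -((model["lr"] - 0.1) ** 2) - (model["depth"] - 3) ** 2


best_params, best_model, history = run_pipeline(
    {"lr": [0.01, 0.1, 1.0], "depth": [2, 3, 4]}, toy_train, toy_evaluate
)
print(best_params)  # {'lr': 0.1, 'depth': 3}
```

Passing the training and evaluation steps in as callables mirrors the module separation in the text: the loop plays the part of the central control module and knows nothing about the model itself.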
As shown in fig. 2, a machine learning model training system for a large-scale machine learning system according to an embodiment of the present invention includes: the system comprises a data acquisition module 1, a data preprocessing module 2, a parameter range determining module 3, a central control module 4, a model training module 5, a model testing module 6, a model evaluation module 7, a model optimizing module 8, a data storage module 9 and an updating display module 10.
The data acquisition module 1 is connected with the central control module 4 and used for acquiring the latest feature set of the machine learning model and incremental data in the current time period through data acquisition equipment;
the data preprocessing module 2 is connected with the central control module 4 and used for processing and dividing the acquired feature set and incremental data in the current time period through a data preprocessing program to obtain a training sample set and a test sample set; meanwhile, processing the training sample set to obtain a characteristic subset of the training sample set;
the parameter range determining module 3 is connected with the central control module 4 and is used for determining the range of the model parameter to be selected according to the type of the machine learning model through a range determining program;
the central control module 4 is connected with the data acquisition module 1, the data preprocessing module 2, the parameter range determining module 3, the model training module 5, the model testing module 6, the model evaluation module 7, the model optimization module 8, the data storage module 9 and the updating display module 10, and is used for coordinating and controlling the normal operation of each module of the machine learning model training system facing the large-scale machine learning system through a central processing unit;
the model training module 5 is connected with the central control module 4 and is used for sequentially selecting model parameters in the range of the model parameters and training the machine learning model by utilizing the feature subset of the training sample set through a model training program;
the model testing module 6 is connected with the central control module 4 and used for testing the machine learning model obtained by training by utilizing a testing sample set through a model testing program;
the model evaluation module 7 is connected with the central control module 4 and used for evaluating the trained machine learning model through an evaluation program to obtain a model evaluation value;
the model optimization module 8 is connected with the central control module 4 and used for adjusting model parameters of the machine learning model according to the obtained model evaluation value and the model test result through a model optimization program to obtain an optimal model;
the data storage module 9 is connected with the central control module 4 and used for storing the acquired feature set, the training sample set, the test sample set, the range of model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model through a memory;
and the updating display module 10 is connected with the central control module 4 and is used for updating and displaying the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the real-time data of the optimal model through a display.
The technical solution of the present invention is further illustrated by the following specific examples.
Example 1
The machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, shown in Fig. 3, the processing and dividing of the acquired feature set and the incremental data in the current time period by the data preprocessing module using a data preprocessing program includes:
S201, calculating the weight of each feature data item and each incremental data item in the acquired feature set and the incremental data in the current time period by using a Jaccard index, to form a first weight set;
S202, comparing the weights of the feature data and the incremental data in the first weight set with a preset weight threshold, and screening out the feature data and the incremental data that meet the requirement to obtain a first data subset;
S203, calculating the weight of each feature data item and each incremental data item in the first data subset by using a Relief-F algorithm, to form a second weight set;
S204, comparing the weights of the feature data and the incremental data in the second weight set with a preset threshold, and selecting the feature data and the incremental data that meet the requirement to obtain a final data set;
S205, performing cluster analysis on the obtained data set to obtain a plurality of data subsets; extracting training data in the same proportion from each data subset to obtain a plurality of training subsets, and taking the remaining data in each data subset as a test subset;
S206, combining the training subsets to obtain a training set, and combining the test subsets to obtain a test set;
S207, calculating the mean value and the standard deviation of the physicochemical data corresponding to the current training set and to the current test set, respectively; based on these values, calculating the mean error and the standard deviation error between the physicochemical values of the training set and the test set;
S208, if the calculated mean error and standard deviation error between the physicochemical values of the training set and the test set are each less than or equal to a preset threshold, taking the current training set and test set as the final training set and final test set; otherwise, returning to step S205.
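Steps S205 to S208 can be sketched in Python as a stratified split that is repeated until the training and test sets agree in mean and standard deviation. This is an illustration under stated assumptions, not the patented procedure: cluster labels are taken as given rather than computed, the "physicochemical data" is read as one numeric value per sample, a failed check re-draws the extraction instead of re-running the clustering, and the function names, thresholds and retry limit are hypothetical.

```python
import random
import statistics
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    clusters: Sequence[int], train_ratio: float, rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Draw the same proportion of training indices from every cluster (S205, S206)."""
    by_cluster: Dict[int, List[int]] = defaultdict(list)
    for idx, cluster in enumerate(clusters):
        by_cluster[cluster].append(idx)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for members in by_cluster.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        k = round(len(shuffled) * train_ratio)
        train_idx.extend(shuffled[:k])
        test_idx.extend(shuffled[k:])  # the remainder becomes the test subset
    return sorted(train_idx), sorted(test_idx)


def balanced_split(
    values: Sequence[float],
    clusters: Sequence[int],
    train_ratio: float = 0.75,
    max_mean_err: float = 0.5,
    max_std_err: float = 0.5,
    max_tries: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Repeat the split until mean and standard deviation errors pass (S207, S208)."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        train_idx, test_idx = stratified_split(clusters, train_ratio, rng)
        train_vals = [values[i] for i in train_idx]
        test_vals = [values[i] for i in test_idx]
        mean_err = abs(statistics.mean(train_vals) - statistics.mean(test_vals))
        std_err = abs(statistics.pstdev(train_vals) - statistics.pstdev(test_vals))
        if mean_err <= max_mean_err and std_err <= max_std_err:
            return train_idx, test_idx
    raise RuntimeError("no split met the thresholds")


# Hypothetical data: two clusters of eight samples each.
values = [1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 1.1, 0.9,
          5.0, 5.1, 4.9, 5.2, 4.8, 5.0, 5.1, 4.9]
clusters = [0] * 8 + [1] * 8
train_idx, test_idx = balanced_split(values, clusters)
print(len(train_idx), len(test_idx))  # 12 4
```

Sampling the same proportion from every cluster keeps each cluster represented in both sets, and the mean and standard deviation check then guards against an unlucky draw within a cluster.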
Example 2
The machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, shown in Fig. 4, testing the trained machine learning model on the test sample set by the model testing module using a model testing program includes:
S301, receiving a test request for the machine learning model to be tested;
S302, calling a model test service according to the test request to test the model to be tested on the test sample set;
S303, outputting a test result of the machine learning model to be tested.
The request provided by the embodiment of the invention carries the test information of the model to be tested, and the test information comprises a model file, a test data set and parameters of the model to be tested.
The model test service provided by the embodiment of the invention comprises a plurality of deep learning frameworks; different frameworks among the plurality of deep learning frameworks are used for building different test models; the different test models are used for testing different deep learning models.
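One way to picture the request and the multi-framework test service of S301 to S303 is a small registry that routes each request to a framework-specific loader. The Python sketch below is illustrative only: the class names `ModelTestRequest` and `ModelTestService`, the `framework` key, the accuracy metric and the toy loader are assumptions, and no real deep learning framework is called.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Sample = Tuple[Any, Any]  # (input, expected output)
Predictor = Callable[[Any], Any]
Loader = Callable[[str, Dict[str, Any]], Predictor]


@dataclass
class ModelTestRequest:
    """Test information carried by the request (S301)."""
    framework: str  # hypothetical key selecting a backend
    model_file: str
    test_data: List[Sample]
    params: Dict[str, Any] = field(default_factory=dict)


class ModelTestService:
    """Routes each request to the loader registered for its framework (S302)."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Loader] = {}

    def register(self, framework: str, loader: Loader) -> None:
        self._loaders[framework] = loader

    def run(self, request: ModelTestRequest) -> Dict[str, float]:
        if request.framework not in self._loaders:
            raise ValueError(f"unsupported framework: {request.framework}")
        if not request.test_data:
            raise ValueError("test data set is empty")
        predict = self._loaders[request.framework](request.model_file, request.params)
        correct = sum(1 for x, expected in request.test_data if predict(x) == expected)
        return {"accuracy": correct / len(request.test_data)}  # test result (S303)


def toy_loader(model_file: str, params: Dict[str, Any]) -> Predictor:
    # Stand-in for loading a real model: a fixed threshold classifier.
    return lambda x: int(x >= params["threshold"])


service = ModelTestService()
service.register("toy", toy_loader)
result = service.run(
    ModelTestRequest(
        framework="toy",
        model_file="model.bin",  # placeholder path, never opened here
        test_data=[(0.2, 0), (0.7, 1), (0.9, 1), (0.4, 1)],
        params={"threshold": 0.5},
    )
)
print(result)  # {'accuracy': 0.75}
```

Supporting a further framework then only requires registering another loader; the request format and the test loop stay unchanged.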
Example 3
The machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, shown in Fig. 5, evaluating the trained machine learning model by the model evaluation module using an evaluation program to obtain a model evaluation value includes:
S401, obtaining, by the model evaluation module, the value range of each parameter of the machine learning model;
S402, determining, by the evaluation program, an initial value of each parameter within its value range;
S403, controlling, by the central processing unit, the evaluation program to adjust each parameter to its initial value, and obtaining a model evaluation value from the evaluation program.
The model evaluation value provided by the embodiment of the invention is used for indicating the performance of the machine learning model after parameter adjustment.
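A very small Python sketch of S401 to S403 follows. The text does not state how the initial value is chosen inside each range, so the midpoint used here is an assumption, as are the names `initial_values` and `evaluate_at_initial` and the toy evaluation program.

```python
from typing import Callable, Dict, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound)
Params = Dict[str, float]


def initial_values(ranges: Dict[str, Range]) -> Params:
    """Pick an initial value inside each range (S402); the midpoint is an assumption."""
    return {name: (lo + hi) / 2 for name, (lo, hi) in ranges.items()}


def evaluate_at_initial(
    ranges: Dict[str, Range], evaluation_program: Callable[[Params], float]
) -> Tuple[Params, float]:
    """Set every parameter to its initial value and read back the evaluation value (S403)."""
    params = initial_values(ranges)
    return params, evaluation_program(params)


def toy_program(params: Params) -> float:
    # Hypothetical evaluation value that depends only on the learning rate.
    return 1 - abs(params["learning_rate"] - 0.25)


ranges = {"learning_rate": (0.0, 1.0), "depth": (2.0, 6.0)}
params, score = evaluate_at_initial(ranges, toy_program)
print(params, score)  # {'learning_rate': 0.5, 'depth': 4.0} 0.75
```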
Example 4
The machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, adjusting the model parameters of the machine learning model by the model optimization module using a model optimization program, according to the obtained model evaluation value and the model test result, to obtain the optimal parameters and the optimal model includes:
re-determining the value of each parameter within its value range according to the obtained model evaluation value and the model test result, and comparing the model evaluation values and model test results obtained for the different parameter values, wherein the parameter values corresponding to the best model evaluation value and model test result are the optimal parameters; and controlling the parameter adjusting program to adjust the parameters of the model based on the optimal parameters, so as to obtain the optimal model.
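The re-determination of parameter values described above does not name a search strategy. As one hedged illustration, the Python sketch below uses a coordinate-wise search: each parameter in turn is tried at evenly spaced values inside its range while the others stay fixed, and a candidate is kept only if its evaluation value improves. The function `refine`, the `steps` and `sweeps` settings, and the toy objective are hypothetical, and a single evaluation value stands in for the combination of evaluation value and test result.

```python
from typing import Callable, Dict, Tuple

Range = Tuple[float, float]  # (lower bound, upper bound)
Params = Dict[str, float]


def refine(
    ranges: Dict[str, Range],
    start: Params,
    evaluate: Callable[[Params], float],
    steps: int = 5,
    sweeps: int = 3,
) -> Tuple[Params, float]:
    """Coordinate-wise search inside the ranges; higher evaluation values are better.

    `steps` must be at least 2 so that both ends of each range are tried.
    """
    best = dict(start)
    best_score = evaluate(best)
    for _ in range(sweeps):
        for name, (lo, hi) in ranges.items():
            for i in range(steps):
                candidate = dict(best)
                candidate[name] = lo + (hi - lo) * i / (steps - 1)
                score = evaluate(candidate)
                if score > best_score:  # keep only strict improvements
                    best, best_score = candidate, score
    return best, best_score


def toy_evaluate(p: Params) -> float:
    # Hypothetical evaluation value that peaks at x=3, y=1.
    return -((p["x"] - 3.0) ** 2) - (p["y"] - 1.0) ** 2


ranges = {"x": (0.0, 4.0), "y": (-2.0, 2.0)}
best, best_score = refine(ranges, {"x": 0.0, "y": -2.0}, toy_evaluate)
print(best)  # {'x': 3.0, 'y': 1.0}
```

A coordinate-wise search needs far fewer evaluations than an exhaustive grid, but it can stall when parameters interact strongly; the choice here is for brevity and is not taken from the text.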
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented wholly or partially in software, they may take the form of a computer program product that includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, or Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared or microwave). The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
The above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the invention; all modifications, equivalents, and improvements made within the spirit and principles of the invention shall fall within the scope of protection defined by the appended claims.
Claims (10)
1. A machine learning model training method for a large-scale machine learning system is characterized by comprising the following steps:
step one, acquiring, by a data acquisition module using data acquisition equipment, a latest feature set of the machine learning model and incremental data in a current time period; and processing and dividing, by a data preprocessing module using a data preprocessing program, the acquired feature set and the incremental data in the current time period to obtain a training sample set and a test sample set;
wherein the processing and dividing of the acquired feature set and the incremental data in the current time period by the data preprocessing module using the data preprocessing program comprises the following steps:
(1.1) calculating the weight of each feature data and each incremental data by using a Jaccard index for the acquired feature set and the incremental data in the current time period to form a first weight set;
(1.2) comparing the weights of the characteristic data and the incremental data in the first weight set with a preset weight threshold, and screening the characteristic data and the incremental data meeting the requirements to obtain a first data subset;
(1.3) taking the first data subset, and calculating the weight of each feature data and each incremental data by using a Relief-F algorithm to form a second weight set;
(1.4) comparing the weight of each feature data and each incremental data in the second weight set with a preset threshold, and selecting the feature data and the incremental data meeting the requirements to obtain a final data set;
(1.5) carrying out clustering analysis on the obtained data set to obtain a plurality of data subsets; extracting training set data with the same proportion from each data subset to obtain a plurality of training subsets, and taking the residual data in each data subset as a test subset;
(1.6) combining the training subsets to obtain a training set, and combining the test subsets to obtain a test set;
(1.7) respectively calculating the mean value and the standard deviation of the physicochemical data corresponding to the current training set and the test set; calculating the mean error and standard deviation error between the physicochemical values of the training set and the test set based on the mean value and standard deviation value of the physicochemical data corresponding to the current training set and the test set;
(1.8) if the calculated mean error and standard deviation error between the physicochemical values of the training set and the test set are each less than or equal to a preset threshold, taking the current training set and test set as the final training set and the final test set; otherwise, returning to step (1.5);
step two, processing the training sample set to obtain a feature subset of the training sample set; and determining, by a parameter range determination module using a range determination program, the range of the model parameters to be selected according to the type of the machine learning model;
step three, controlling, by a central control module using a central processing unit, a model training module to sequentially select model parameters within the range of the model parameters and to train the machine learning model on the feature subset of the training sample set using a model training program;
step four, testing, by a model testing module using a model testing program, the trained machine learning model on the test sample set; and evaluating, by a model evaluation module using an evaluation program, the trained machine learning model to obtain a model evaluation value;
step five, adjusting, by a model optimization module using a model optimization program, the model parameters of the machine learning model according to the obtained model evaluation value and the model test result, to obtain optimal parameters and an optimal model;
step six, storing, by a data storage module using a memory, the acquired feature set, the training sample set, the test sample set, the range of the model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model;
and step seven, updating and displaying, by an updating display module using a display, real-time data of the acquired feature set, the training sample set, the test sample set, the range of the model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model.
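Steps (1.5) to (1.8) of claim 1 can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the claim: cluster labels are taken as already computed by some clustering algorithm, a failed check triggers a fresh random draw rather than a new clustering run, a single tolerance is used for both the mean error and the standard deviation error, and the population standard deviation is used. The function name `split_with_check` and all parameter names are hypothetical. The weighting steps (1.1) to (1.4) are not covered.

```python
import random
from statistics import mean, pstdev
from typing import Dict, Hashable, List, Sequence, Tuple


def split_with_check(
    values: Sequence[float],
    cluster_ids: Sequence[Hashable],
    train_fraction: float = 0.75,
    tolerance: float = 0.5,
    max_attempts: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Draw the same proportion from every cluster, then compare set statistics."""
    if len(values) != len(cluster_ids):
        raise ValueError("values and cluster_ids must have the same length")
    rng = random.Random(seed)
    clusters: Dict[Hashable, List[int]] = {}
    for index, cluster in enumerate(cluster_ids):
        clusters.setdefault(cluster, []).append(index)

    for _ in range(max_attempts):
        train: List[int] = []
        test: List[int] = []
        for members in clusters.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            cut = round(len(shuffled) * train_fraction)
            train.extend(shuffled[:cut])   # training subset of this cluster
            test.extend(shuffled[cut:])    # remaining data becomes the test subset
        if not train or not test:
            raise ValueError("train_fraction leaves one of the sets empty")
        train_values = [values[i] for i in train]
        test_values = [values[i] for i in test]
        mean_error = abs(mean(train_values) - mean(test_values))
        std_error = abs(pstdev(train_values) - pstdev(test_values))
        if mean_error <= tolerance and std_error <= tolerance:
            return sorted(train), sorted(test)
    raise RuntimeError("no split met the tolerance within max_attempts")


values = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
train_idx, test_idx = split_with_check(values, cluster_ids, tolerance=1.0)
print(train_idx, test_idx)
```

The function returns index lists rather than copies of the data, so the same split can be applied to the feature data and to the values being compared.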
2. The method for training a machine learning model of a large-scale machine learning system according to claim 1, wherein in step four, the testing of the trained machine learning model on the test sample set by the model testing module using the model testing program includes:
(1) receiving a test request of the machine learning model to be tested;
(2) calling a model test service according to the test request to test the model to be tested by using a test sample set;
(3) outputting a test result of the machine learning model to be tested.
3. The method for training the machine learning model of the large-scale machine learning system according to claim 2, wherein the test request carries test information of the model to be tested, and the test information comprises a model file, a test data set and parameters of the model to be tested.
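The request flow of claims 2 and 3 can be illustrated with a minimal sketch. The class names `ModelTestRequest` and `ModelTestResult`, the function `handle_test_request`, the `load_model` callback, and the choice of accuracy as the reported result are all assumptions made for illustration; the claims only state that the request carries a model file, a test data set and parameters.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Sequence, Tuple


@dataclass
class ModelTestRequest:
    """Test information carried by the request: model file, test set, parameters."""
    model_file: str
    test_set: Sequence[Tuple[Any, Any]]  # (input, expected output) pairs
    parameters: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ModelTestResult:
    model_file: str
    total: int
    correct: int

    @property
    def accuracy(self) -> float:
        return self.correct / self.total if self.total else 0.0


def handle_test_request(
    request: ModelTestRequest,
    load_model: Callable[[str, Dict[str, Any]], Callable[[Any], Any]],
) -> ModelTestResult:
    """(1) receive the request, (2) call the test service, (3) output the result."""
    predict = load_model(request.model_file, request.parameters)
    correct = sum(1 for x, expected in request.test_set if predict(x) == expected)
    return ModelTestResult(request.model_file, len(request.test_set), correct)


def load_threshold_model(model_file: str, parameters: Dict[str, Any]):
    """Hypothetical loader: ignores the file and builds a simple threshold rule."""
    threshold = parameters.get("threshold", 0.5)
    return lambda x: int(x >= threshold)


request = ModelTestRequest(
    model_file="model.bin",
    test_set=[(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 0)],
    parameters={"threshold": 0.5},
)
result = handle_test_request(request, load_threshold_model)
print(result.correct, result.total, result.accuracy)
```

Passing the loader as a callback keeps the request handler independent of any particular model format or framework.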
4. The method for training the machine learning model of the large-scale machine learning system according to claim 1, wherein the model testing service comprises a plurality of deep learning frameworks; different frameworks among the plurality of deep learning frameworks are used for building different test models; and the different test models are used for testing different deep learning models.
5. The method for training machine learning models of large-scale machine learning system according to claim 1, wherein the evaluating the trained machine learning models by the model evaluation module using an evaluation program to obtain model evaluation values comprises:
(1) obtaining the value range of each parameter of the machine learning model through a model evaluation module;
(2) determining the initial value of the corresponding parameter by utilizing an evaluation program in the value range of each parameter;
(3) controlling, by the central processing unit, the evaluation program to adjust each parameter to its initial value, and acquiring a model evaluation value from the evaluation program.
6. The method for training the machine learning model of the large-scale machine learning system according to claim 5, wherein the model evaluation value is used for indicating the performance of the parameter-adjusted machine learning model.
7. The method as claimed in claim 5, wherein the step of adjusting model parameters of the machine learning model by the model optimization module according to the model evaluation values and the model test results via the model optimization program to obtain the optimal parameters and the optimal model comprises:
re-determining the value of each parameter within its value range according to the obtained model evaluation value and model test result; comparing the model evaluation values and model test results corresponding to the respective parameter values, wherein the parameter values corresponding to the optimal model evaluation value and model test result are taken as the optimal parameters; and controlling the parameter adjusting program to adjust the parameters of the model based on the optimal parameters, so as to obtain the optimal model.
8. A large-scale machine learning system-oriented machine learning model training system for implementing the large-scale machine learning system-oriented machine learning model training method according to any one of claims 1 to 7, wherein the large-scale machine learning system-oriented machine learning model training system comprises:
the system comprises a data acquisition module, a data preprocessing module, a parameter range determining module, a central control module, a model training module, a model testing module, a model evaluation module, a model optimizing module, a data storage module and an updating display module;
the data acquisition module is connected with the central control module and used for acquiring the latest feature set of the machine learning model and incremental data in the current time period through data acquisition equipment;
the data preprocessing module is connected with the central control module and used for processing and dividing the acquired feature set and incremental data in the current time period through a data preprocessing program to obtain a training sample set and a test sample set, and for processing the training sample set to obtain a feature subset of the training sample set;
the parameter range determining module is connected with the central control module and used for determining the range of the model parameter to be selected according to the type of the machine learning model through a range determining program;
the central control module is connected with the data acquisition module, the data preprocessing module, the parameter range determining module, the model training module, the model testing module, the model evaluation module, the model optimization module, the data storage module and the updating display module and is used for coordinating and controlling the normal operation of each module of the machine learning model training system facing the large-scale machine learning system through the central processing unit;
the model training module is connected with the central control module and is used for sequentially selecting model parameters in the range of the model parameters and training the machine learning model by utilizing the feature subset of the training sample set through a model training program;
the model testing module is connected with the central control module and used for testing the machine learning model obtained by training by utilizing a testing sample set through a model testing program;
the model evaluation module is connected with the central control module and used for evaluating the trained machine learning model through an evaluation program to obtain a model evaluation value;
the model optimization module is connected with the central control module and used for adjusting model parameters of the machine learning model according to the obtained model evaluation value and the model test result through a model optimization program to obtain an optimal model;
the data storage module is connected with the central control module and used for storing the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model through a memory;
and the updating display module is connected with the central control module and is used for updating and displaying the acquired feature set, the training sample set, the testing sample set, the range of the model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the real-time data of the optimal model through a display.
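The coordination role of the central control module in claim 8 can be illustrated with a minimal sketch. It assumes that each module is represented by a plain callable, that testing and evaluation are merged into one scoring callback where higher is better, and that the data storage module is a dictionary; the updating display module is omitted. The class `CentralControl` and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Tuple


@dataclass
class CentralControl:
    """Calls the other modules in order and records the best result."""
    acquire: Callable[[], List[Any]]                                # data acquisition
    preprocess: Callable[[List[Any]], Tuple[List[Any], List[Any]]]  # train/test split
    parameter_range: Callable[[], Iterable[Any]]                    # candidate parameters
    train: Callable[[Any, List[Any]], Any]                          # model training
    evaluate: Callable[[Any, List[Any]], float]                     # test + evaluation
    store: Dict[str, Any] = field(default_factory=dict)             # data storage

    def run(self) -> Dict[str, Any]:
        data = self.acquire()
        train_set, test_set = self.preprocess(data)
        best = None
        for param in self.parameter_range():
            model = self.train(param, train_set)
            score = self.evaluate(model, test_set)
            if best is None or score > best[0]:
                best = (score, param, model)
        if best is None:
            raise ValueError("parameter range is empty")
        self.store.update(best_score=best[0], best_param=best[1], best_model=best[2])
        return self.store


# Toy wiring: the data follow y = 2x and each candidate parameter is a slope.
control = CentralControl(
    acquire=lambda: [(x, 2 * x) for x in range(1, 9)],
    preprocess=lambda data: (data[:6], data[6:]),
    parameter_range=lambda: [1, 2, 3],
    train=lambda slope, train_set: (lambda x: slope * x),
    evaluate=lambda model, test_set: -sum(abs(model(x) - y) for x, y in test_set),
)
stored = control.run()
print(stored["best_param"], stored["best_score"])
```

Because every module is injected as a callable, any single module can be replaced without changing the control flow.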
9. A computer program product stored on a computer readable medium, comprising a computer readable program for providing a user input interface to implement the method of machine learning model training for a large-scale machine learning system of any one of claims 1 to 7 when executed on an electronic device.
10. A computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the method of training a machine learning model for a large-scale machine learning system according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110127839.1A CN112785000A (en) | 2021-01-29 | 2021-01-29 | Machine learning model training method and system for large-scale machine learning system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110127839.1A CN112785000A (en) | 2021-01-29 | 2021-01-29 | Machine learning model training method and system for large-scale machine learning system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112785000A (en) | 2021-05-11 |
Family
ID=75759859
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110127839.1A Pending CN112785000A (en) | 2021-01-29 | 2021-01-29 | Machine learning model training method and system for large-scale machine learning system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112785000A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114358966A (en) * | 2022-03-16 | 2022-04-15 | 希望知舟技术(深圳)有限公司 | Production scheduling method and device based on machine learning and storage medium |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111526119B (en) | Abnormal flow detection method and device, electronic equipment and computer readable medium | |
CN117078048B (en) | Digital twinning-based intelligent city resource management method and system | |
CN112907128B (en) | Data analysis method, device, equipment and medium based on AB test result | |
CN113837596B (en) | Fault determination method and device, electronic equipment and storage medium | |
CN111222553B (en) | Training data processing method and device of machine learning model and computer equipment | |
CN115249043A (en) | Data analysis method and device, electronic equipment and storage medium | |
CN112182067A (en) | Individual height prediction method and device, electronic equipment and storage medium | |
CN110909804A (en) | Method, device, server and storage medium for detecting abnormal data of base station | |
CN112785000A (en) | Machine learning model training method and system for large-scale machine learning system | |
CN111311393A (en) | Credit risk assessment method, device, server and storage medium | |
CN115994093A (en) | Test case recommendation method and device | |
CN115454787A (en) | Alarm classification method and device, electronic equipment and storage medium | |
CN114912582A (en) | Model production method, model production device, electronic device, and storage medium | |
CN111835541B (en) | Method, device, equipment and system for detecting aging of flow identification model | |
CN112582080A (en) | Internet of things equipment state monitoring method and system | |
CN115860055B (en) | Performance determination method, performance optimization method, device, electronic equipment and medium | |
CN116541252B (en) | Computer room fault log data processing method and device | |
CN112905419B (en) | Index data monitoring threshold range determining method and device and readable storage medium | |
CN114580543B (en) | Model training method, interaction log analysis method, device, equipment and medium | |
CN112799913B (en) | Method and device for detecting abnormal operation of container | |
CN114885231B (en) | Communication protocol self-adaptive signal acquisition method, system, terminal and medium | |
CN113419879B (en) | Message processing method, device, equipment and storage medium | |
CN110084511B (en) | Unmanned aerial vehicle configuration method, device, equipment and readable storage medium | |
CN115361308A (en) | Industrial control network data risk determination method, device, equipment and storage medium | |
CN117745438A (en) | Model representation attenuation risk prediction method, device, electronic equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||