CN112785000A - Machine learning model training method and system for large-scale machine learning system

Info

Publication number: CN112785000A
Application number: CN202110127839.1A
Authority: CN
Inventors: 王卓
Original assignee: Nanchang University
Current assignee: Nanchang University
Priority date: 2021-01-29
Filing date: 2021-01-29
Publication date: 2021-05-11

Abstract

The invention belongs to the technical field of model training and discloses a machine learning model training method and system for a large-scale machine learning system. The system includes a data acquisition module, a data preprocessing module, a parameter range determination module, a central control module, a model training module, a model testing module, a model evaluation module, a model optimization module, a data storage module and an update display module. The data preprocessing module processes the training sample set to obtain its feature subset, which reduces the amount of model training data; the machine learning model is trained by an incremental-learning-based method, which can improve the accuracy of model training; and the model evaluation module and the model optimization module determine the optimal parameter values within the value range of each parameter and adjust the model parameters accordingly, which improves the training efficiency of the model.

Description

Machine learning model training method and system for large-scale machine learning system

Technical Field

The invention belongs to the technical field of model training, and particularly relates to a machine learning model training method and system for a large-scale machine learning system.

Background

Currently, with the widespread popularity of machine learning, various machine learning models are receiving more and more attention. For a machine learning model, it is usually required to train the machine learning model based on training data (also called training samples), and then perform some kind of prediction, such as performing class prediction, using the trained machine learning model.

During the training of a machine learning model, samples need to be added to or modified for the model. To increase the training samples, different features must be added, or different features must be combined and input to the machine learning model one by one. The existing training methods for learning models, however, are tedious and time-consuming, and suffer from low training efficiency, low flexibility and low applicability. A new machine learning model training method for a large-scale machine learning system is therefore needed.

Through the above analysis, the problems and defects of the prior art are as follows: the existing learning model training method is tedious, long in time consumption, low in training efficiency, and low in flexibility and applicability.

Disclosure of Invention

Aiming at the problems in the prior art, the invention provides a machine learning model training method and system for a large-scale machine learning system.

The invention is realized as follows: a machine learning model training method for a large-scale machine learning system comprises the following steps:

step one, acquiring, by a data acquisition module using data acquisition equipment, the latest feature set of the machine learning model and the incremental data in the current time period; processing and dividing, by a data preprocessing module using a data preprocessing program, the acquired feature set and the incremental data in the current time period to obtain a training sample set and a testing sample set;

the processing and dividing of the acquired feature set and the incremental data in the current time period by the data preprocessing module through a data preprocessing program comprises the following steps:

(1.1) calculating the weight of each feature data and each incremental data by using a Jaccard index for the acquired feature set and the incremental data in the current time period to form a first weight set;

(1.2) comparing the weights of the characteristic data and the incremental data in the first weight set with a preset weight threshold, and screening the characteristic data and the incremental data meeting the requirements to obtain a first data subset;

(1.3) taking the first data subset, and calculating the weight of each feature data and each incremental data by using a Relief-F algorithm to form a second weight set;

(1.4) the weight of each feature data and each incremental data in the second weight set is taken, the weights are compared with a preset threshold, and the feature data and the incremental data meeting the requirements are selected to obtain a final data set;

(1.5) carrying out clustering analysis on the obtained data set to obtain a plurality of data subsets; extracting training set data with the same proportion from each data subset to obtain a plurality of training subsets, and taking the residual data in each data subset as a test subset;

(1.6) combining the training subsets to obtain a training set, and combining the test subsets to obtain a test set;

(1.7) respectively calculating the mean value and the standard deviation of the physicochemical data corresponding to the current training set and the test set; calculating the mean error and standard deviation error between the physicochemical values of the training set and the test set based on the mean value and standard deviation value of the physicochemical data corresponding to the current training set and the test set;

(1.8) if the calculated mean error and standard deviation error between the physicochemical values of the training set and the test set are each less than or equal to a preset threshold, taking the current training set and test set as the final training set and the final test set; otherwise, returning to the step (1.5);

step two, processing the training sample set to obtain a characteristic subset of the training sample set; determining the range of the model parameter to be selected according to the type of the machine learning model through a range determination program through a parameter range determination module;

step three, under the coordinated control of the central control module by means of a central processing unit, sequentially selecting, by the model training module, model parameters within the range of the model parameters, and training the machine learning model on the feature subset of the training sample set through a model training program;

step four, testing the machine learning model obtained by training by using a model testing program through a model testing module and a testing sample set; evaluating the trained machine learning model through an evaluation program by a model evaluation module to obtain a model evaluation value;

step five, adjusting model parameters of the machine learning model through a model optimization module according to the obtained model evaluation value and the model test result through a model optimization program to obtain optimal parameters and an optimal model;

step six, storing the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model through a memory by a data storage module;

step seven, updating and displaying, by the updating display module through a display, real-time data of the acquired feature set, the training sample set, the testing sample set, the range of the model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model.

Further, the testing, by the model testing module, the machine learning model obtained by training through the model testing program by using the test sample set includes:

(1) receiving a test request of the machine learning model to be tested;

(2) calling a model test service according to the test request to test the model to be tested by using a test sample set;

(3) and outputting a test result of the machine learning model to be tested.

Further, the request carries test information of the model to be tested, and the test information comprises a model file, a test data set and parameters of the model to be tested.

Further, the model test service comprises a plurality of deep learning frameworks; different frameworks among the plurality of deep learning frameworks are used for building different test models; the different test models are used for testing different deep learning models.

Further, the evaluating the trained machine learning model by the model evaluation module using an evaluation program to obtain a model evaluation value includes:

(1) obtaining the value range of each parameter of the machine learning model through a model evaluation module;

(2) determining the initial value of the corresponding parameter by utilizing an evaluation program in the value range of each parameter;

(3) and the central processing unit controls the evaluation program to adjust each parameter to the initial value, and acquires a model evaluation value from the evaluation program.

Further, the model evaluation value is used to indicate the performance of the parameter-adjusted machine learning model.

Further, the adjusting, by the model optimization module, the model parameter of the machine learning model according to the obtained model evaluation value and the model test result through the model optimization program to obtain the optimal parameter and the optimal model includes:

re-determining the value of each parameter within the value range of each parameter according to the obtained model evaluation value and the model test result, and comparing the model evaluation values and model test results corresponding to the parameter values, wherein the parameter values corresponding to the best model evaluation value and model test result are taken as the optimal parameters; and controlling the parameter adjusting program to adjust the parameters of the model based on the optimal parameters, so as to obtain the optimal model.
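The parameter selection described above can be sketched as an exhaustive search over candidate values. This is only an illustration under stated assumptions: the source does not specify the search strategy, so a plain grid search is swapped in here; `search_optimal_parameters`, `toy_evaluation` and the parameter names are hypothetical, and the model evaluation value and the model test result are assumed to have been combined into a single score where higher is better.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple


def search_optimal_parameters(
    param_ranges: Dict[str, Sequence[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Tuple[Dict[str, float], float]:
    """Try every combination of candidate values and keep the best-scoring one."""
    names = list(param_ranges)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_ranges[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # assumed: higher score means a better model
        if score > best_score:
            best_params, best_score = params, score
    if best_params is None:
        raise ValueError("every parameter needs at least one candidate value")
    return best_params, best_score


def toy_evaluation(params: Dict[str, float]) -> float:
    # Hypothetical stand-in for training, testing and evaluating a model;
    # it peaks at learning_rate=0.1 and depth=4.
    return -abs(params["learning_rate"] - 0.1) - abs(params["depth"] - 4)


ranges = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}
best_params, best_score = search_optimal_parameters(ranges, toy_evaluation)
print(best_params, best_score)
```

In a real system `toy_evaluation` would be replaced by a call that trains the model with the given parameters and returns its evaluation value; the exhaustive loop also grows quickly with the number of parameters, so a coarser or randomized search may be preferable.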

Another object of the present invention is to provide a machine learning model training system for a large-scale machine learning system, which implements the machine learning model training method for a large-scale machine learning system, the machine learning model training system for a large-scale machine learning system comprising:

the system comprises a data acquisition module, a data preprocessing module, a parameter range determining module, a central control module, a model training module, a model testing module, a model evaluation module, a model optimizing module, a data storage module and an updating display module;

the data acquisition module is connected with the central control module and used for acquiring the latest feature set of the machine learning model and incremental data in the current time period through data acquisition equipment;

the data preprocessing module is connected with the central control module and used for processing and dividing the acquired feature set and incremental data in the current time period through a data preprocessing program to obtain a training sample set and a test sample set; meanwhile, processing the training sample set to obtain a characteristic subset of the training sample set;

the parameter range determining module is connected with the central control module and used for determining the range of the model parameter to be selected according to the type of the machine learning model through a range determining program;

the central control module is connected with the data acquisition module, the data preprocessing module, the parameter range determining module, the model training module, the model testing module, the model evaluation module, the model optimization module, the data storage module and the updating display module and is used for coordinating and controlling the normal operation of each module of the machine learning model training system facing the large-scale machine learning system through the central processing unit;

the model training module is connected with the central control module and is used for sequentially selecting model parameters in the range of the model parameters and training the machine learning model by utilizing the characteristic subset of the training sample set through a model training program;

the model testing module is connected with the central control module and used for testing the machine learning model obtained by training by utilizing a testing sample set through a model testing program;

the model evaluation module is connected with the central control module and used for evaluating the trained machine learning model through an evaluation program to obtain a model evaluation value;

the model optimization module is connected with the central control module and used for adjusting model parameters of the machine learning model according to the obtained model evaluation value and the model test result through a model optimization program to obtain an optimal model;

the data storage module is connected with the central control module and used for storing the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model through a memory;

and the updating display module is connected with the central control module and is used for updating and displaying the acquired feature set, the training sample set, the testing sample set, the range of the model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the real-time data of the optimal model through a display.

Another object of the present invention is to provide a computer program product stored on a computer readable medium, which includes a computer readable program for providing a user input interface to implement the method for training a machine learning model of a large-scale machine learning system when the computer program product is executed on an electronic device.

Another object of the present invention is to provide a computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to execute the method for training a machine learning model for a large-scale machine learning system.

By combining all the technical schemes, the invention has the advantages and positive effects that: according to the machine learning model training method for the large-scale machine learning system, the obtained training sample set is processed through the data preprocessing module to obtain the characteristic parameter set of the training sample set, so that the model training data volume can be greatly reduced; the machine learning model is trained in an incremental learning-based mode, so that the accuracy of model training can be improved; the optimal parameter values are determined in the value range of each parameter through the model evaluation module and the model optimization module, and the model parameters are adjusted, so that the training efficiency of the model in machine learning is improved, and the problems of complexity, long time consumption, low training efficiency, low flexibility and low applicability of the existing learning model training method can be effectively solved.
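The data reduction mentioned above comes from the two-stage screening of steps (1.1) to (1.4). A minimal sketch follows, with several assumptions that are not in the source: weights are computed per feature column, features and labels are binary, the Jaccard weight of a feature is the overlap between the samples where it is active and the positively labelled samples, and a simplified Relief (one nearest hit and one nearest miss, two classes) stands in for Relief-F. All function names and thresholds are hypothetical.

```python
from typing import List, Sequence


def jaccard_weights(X: Sequence[Sequence[int]], y: Sequence[int]) -> List[float]:
    """Jaccard index between each binary feature column and the positive label set."""
    positives = {i for i, label in enumerate(y) if label == 1}
    weights = []
    for j in range(len(X[0])):
        active = {i for i, row in enumerate(X) if row[j] == 1}
        union = active | positives
        weights.append(len(active & positives) / len(union) if union else 0.0)
    return weights


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Simplified Relief: reward features that differ on the nearest miss
    and penalise features that differ on the nearest hit."""
    n, m = len(X), len(X[0])
    weights = [0.0] * m

    def distance(a, b):
        return sum(abs(u - v) for u, v in zip(a, b))

    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # a sample without a hit or a miss contributes nothing
        hit = min(hits, key=lambda k: distance(X[i], X[k]))
        miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(m):
            weights[j] += (abs(X[i][j] - X[miss][j]) - abs(X[i][j] - X[hit][j])) / n
    return weights


def two_stage_screen(X, y, jaccard_threshold: float, relief_threshold: float) -> List[int]:
    """Return the original column indices that survive both screening stages."""
    first = [j for j, w in enumerate(jaccard_weights(X, y)) if w >= jaccard_threshold]
    if not first:
        return []
    reduced = [[row[j] for j in first] for row in X]
    second = relief_weights(reduced, y)
    return [first[j] for j, w in enumerate(second) if w >= relief_threshold]


# Column 0 equals the label, column 1 is noise, column 2 is always zero.
X = [[1, 0, 0], [1, 1, 0], [0, 0, 0], [0, 1, 0]]
y = [1, 1, 0, 0]
selected = two_stage_screen(X, y, jaccard_threshold=0.3, relief_threshold=0.5)
print(selected)
```

The first stage is cheap and removes columns that barely overlap with the label; the second, distance-based stage then runs on fewer columns, which is one plausible reason for ordering the two filters this way.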

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a flowchart of a machine learning model training method for a large-scale machine learning system according to an embodiment of the present invention.

Fig. 2 is a structural block diagram of a machine learning model training system for a large-scale machine learning system according to an embodiment of the present invention.

In the figure: 1. data acquisition module; 2. data preprocessing module; 3. parameter range determining module; 4. central control module; 5. model training module; 6. model testing module; 7. model evaluation module; 8. model optimization module; 9. data storage module; 10. updating display module.

Fig. 3 is a flowchart of a method for processing and dividing the acquired feature set and the incremental data in the current time period through a data preprocessing program by a data preprocessing module according to an embodiment of the present invention.

Fig. 4 is a flowchart of a method for testing the machine learning model obtained by training through a model testing module by using a model testing program according to an embodiment of the present invention.

Fig. 5 is a flowchart of a method for obtaining a model evaluation value by evaluating the trained machine learning model with an evaluation program by a model evaluation module according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Aiming at the problems in the prior art, the invention provides a machine learning model training method and system for a large-scale machine learning system, and the invention is described in detail below with reference to the accompanying drawings.

As shown in fig. 1, the method for training a machine learning model for a large-scale machine learning system according to an embodiment of the present invention includes the following steps:

S101, acquiring, by the data acquisition module using data acquisition equipment, the latest feature set of the machine learning model and the incremental data in the current time period; processing and dividing, by the data preprocessing module using a data preprocessing program, the acquired feature set and the incremental data in the current time period to obtain a training sample set and a testing sample set;

S102, processing the training sample set to obtain a feature subset of the training sample set; determining, by the parameter range determining module using a range determination program, the range of the model parameters to be selected according to the type of the machine learning model;

S103, under the coordinated control of the central control module by means of a central processing unit, sequentially selecting, by the model training module, model parameters within the range of the model parameters, and training the machine learning model on the feature subset of the training sample set through a model training program;

S104, testing, by the model testing module using a model testing program, the machine learning model obtained by training on the testing sample set; evaluating, by the model evaluation module using an evaluation program, the trained machine learning model to obtain a model evaluation value;

S105, adjusting, by the model optimization module using a model optimization program, the model parameters of the machine learning model according to the obtained model evaluation value and the model test result to obtain optimal parameters and an optimal model;

S106, storing, by the data storage module in a memory, the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model;

S107, updating and displaying, by the updating display module through a display, real-time data of the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the optimal model.
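The steps above feed the incremental data of each time period into training. The source does not name a model or an update rule, so the following is only a sketch of what incremental learning could look like: a hypothetical `IncrementalLogisticModel` that keeps its weights between periods and is updated by stochastic gradient descent on each new batch instead of being retrained from scratch.

```python
import math
from typing import List, Sequence


class IncrementalLogisticModel:
    """Logistic regression whose weights persist across batches (assumed model)."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: Sequence[float]) -> float:
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well inside float range
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def partial_fit(self, X, y, epochs: int = 1) -> None:
        """Update the existing weights with one new batch of samples."""
        for _ in range(epochs):
            for x, label in zip(X, y):
                error = self.predict_proba(x) - label
                self.weights = [
                    w - self.learning_rate * error * v
                    for w, v in zip(self.weights, x)
                ]
                self.bias -= self.learning_rate * error


model = IncrementalLogisticModel(n_features=1)

# Hypothetical incremental data arriving in two consecutive time periods.
period_1 = ([[0.0], [1.0]], [0, 1])
period_2 = ([[0.2], [0.9]], [0, 1])

for X_batch, y_batch in (period_1, period_2):
    model.partial_fit(X_batch, y_batch, epochs=200)

print(model.predict([0.0]), model.predict([1.0]))
```

Only the new batch is visited in each period, which is what keeps the cost of an update independent of the amount of data already seen.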

As shown in fig. 2, a machine learning model training system for a large-scale machine learning system according to an embodiment of the present invention includes: the system comprises a data acquisition module 1, a data preprocessing module 2, a parameter range determining module 3, a central control module 4, a model training module 5, a model testing module 6, a model evaluation module 7, a model optimizing module 8, a data storage module 9 and an updating display module 10.

The data acquisition module 1 is connected with the central control module 4 and used for acquiring the latest feature set of the machine learning model and incremental data in the current time period through data acquisition equipment;

the data preprocessing module 2 is connected with the central control module 4 and used for processing and dividing the acquired feature set and incremental data in the current time period through a data preprocessing program to obtain a training sample set and a test sample set; meanwhile, processing the training sample set to obtain a characteristic subset of the training sample set;

the parameter range determining module 3 is connected with the central control module 4 and is used for determining the range of the model parameter to be selected according to the type of the machine learning model through a range determining program;

the central control module 4 is connected with the data acquisition module 1, the data preprocessing module 2, the parameter range determining module 3, the model training module 5, the model testing module 6, the model evaluation module 7, the model optimization module 8, the data storage module 9 and the updating display module 10, and is used for coordinating and controlling the normal operation of each module of the machine learning model training system facing the large-scale machine learning system through a central processing unit;

the model training module 5 is connected with the central control module 4 and is used for sequentially selecting model parameters in the range of the model parameters and training the machine learning model by utilizing the feature subset of the training sample set through a model training program;

the model testing module 6 is connected with the central control module 4 and used for testing the machine learning model obtained by training by utilizing a testing sample set through a model testing program;

the model evaluation module 7 is connected with the central control module 4 and used for evaluating the trained machine learning model through an evaluation program to obtain a model evaluation value;

the model optimization module 8 is connected with the central control module 4 and used for adjusting model parameters of the machine learning model according to the obtained model evaluation value and the model test result through a model optimization program to obtain an optimal model;

the data storage module 9 is connected with the central control module 4 and used for storing the acquired feature set, the training sample set, the test sample set, the range of model parameters, the model training result, the model test result, the model evaluation value, the optimal parameters and the optimal model through a memory;

and the updating display module 10 is connected with the central control module 4 and is used for updating and displaying the acquired feature set, the training sample set, the testing sample set, the range of model parameters, the model training result, the model testing result, the model evaluation value, the optimal parameters and the real-time data of the optimal model through a display.
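Because every module is connected only to the central control module, the system has a hub-and-spoke layout. A minimal sketch of that layout is given below; the class, the shared-state dictionary and the three stand-in modules are illustrative assumptions, not the implementation described in the source.

```python
from typing import Any, Callable, Dict, List

# Assumed module interface: read the shared state, return the entries to update.
Module = Callable[[Dict[str, Any]], Dict[str, Any]]


class CentralControlModule:
    """Hub that owns the shared state and invokes the connected modules in order."""

    def __init__(self) -> None:
        self._modules: Dict[str, Module] = {}
        self.log: List[str] = []

    def connect(self, name: str, module: Module) -> None:
        self._modules[name] = module

    def run(self, order: List[str], state: Dict[str, Any]) -> Dict[str, Any]:
        for name in order:
            updates = self._modules[name](state)
            state = {**state, **updates}  # modules never talk to each other directly
            self.log.append(name)
        return state


def acquire(state: Dict[str, Any]) -> Dict[str, Any]:
    return {"raw": [3.0, 1.0, 2.0]}  # stand-in for data acquisition equipment


def preprocess(state: Dict[str, Any]) -> Dict[str, Any]:
    ordered = sorted(state["raw"])
    return {"train": ordered[:2], "test": ordered[2:]}


def train(state: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in "model": the mean of the training values.
    return {"model": sum(state["train"]) / len(state["train"])}


hub = CentralControlModule()
hub.connect("acquisition", acquire)
hub.connect("preprocessing", preprocess)
hub.connect("training", train)
final_state = hub.run(["acquisition", "preprocessing", "training"], {})
print(final_state["model"], hub.log)
```

Routing everything through the hub means a module can be replaced without touching the others, at the cost of making the hub a single point of coordination.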

The technical solution of the present invention is further illustrated by the following specific examples.

Example 1

The machine learning model training method for a large-scale machine learning system according to the embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, as shown in Fig. 3, the processing and dividing of the acquired feature set and the incremental data in the current time period by the data preprocessing module through a data preprocessing program includes:

S201, calculating the weight of each feature data and each incremental data by using a Jaccard index for the acquired feature set and the incremental data in the current time period to form a first weight set;

S202, comparing the weights of the feature data and the incremental data in the first weight set with a preset weight threshold, and screening the feature data and the incremental data meeting the requirements to obtain a first data subset;

S203, taking the first data subset, and calculating the weight of each feature data and each incremental data by using a Relief-F algorithm to form a second weight set;

S204, comparing the weight of each feature data and each incremental data in the second weight set with a preset threshold, and selecting the feature data and the incremental data meeting the requirements to obtain a final data set;

S205, carrying out cluster analysis on the obtained data set to obtain a plurality of data subsets; extracting training set data in the same proportion from each data subset to obtain a plurality of training subsets, and taking the remaining data in each data subset as a test subset;

S206, combining the training subsets to obtain a training set, and combining the test subsets to obtain a test set;

S207, respectively calculating the mean value and the standard deviation of the physicochemical data corresponding to the current training set and the current test set; calculating, based on these mean values and standard deviations, the mean error and the standard deviation error between the physicochemical values of the training set and the test set;

S208, if the calculated mean error and standard deviation error between the physicochemical values of the training set and the test set are each less than or equal to a preset threshold, taking the current training set and test set as the final training set and the final test set; otherwise, returning to step S205.
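The split-and-check loop at the end of this procedure (cluster, sample the same proportion from each cluster, compare statistics, retry) can be sketched as follows. The assumptions are labeled in the code: cluster labels are taken as a precomputed input from any clustering algorithm, each retry redraws the random sample rather than re-clustering, and a `max_attempts` limit is added so the loop cannot run forever. The function names, tolerances and data are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def stratified_split(cluster_ids: Sequence[int], train_fraction: float,
                     rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw the same proportion of training indices from every cluster."""
    groups = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        groups[cluster].append(index)
    train, test = [], []
    for members in groups.values():
        members = members[:]  # copy so the grouping itself is left untouched
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


def balanced_split(targets: Sequence[float], cluster_ids: Sequence[int],
                   train_fraction: float = 0.75, mean_tol: float = 0.5,
                   std_tol: float = 0.5, max_attempts: int = 100,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Repeat the stratified split until train and test statistics agree."""
    rng = random.Random(seed)
    for _ in range(max_attempts):  # assumed safeguard, not part of the source
        train, test = stratified_split(cluster_ids, train_fraction, rng)
        if not train or not test:
            continue
        train_values = [targets[i] for i in train]
        test_values = [targets[i] for i in test]
        mean_error = abs(mean(train_values) - mean(test_values))
        std_error = abs(pstdev(train_values) - pstdev(test_values))
        if mean_error <= mean_tol and std_error <= std_tol:
            return train, test
    raise RuntimeError("no split met the thresholds within max_attempts")


# Hypothetical measured values and precomputed cluster labels.
targets = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
train_idx, test_idx = balanced_split(targets, cluster_ids)
print(sorted(train_idx), sorted(test_idx))
```

Sampling per cluster keeps both sets representative of every region of the data, and the mean and standard deviation check guards against an unlucky draw.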

Example 2

The machine learning model training method for a large-scale machine learning system according to the embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, as shown in Fig. 4, the testing of the machine learning model obtained by training, by the model testing module through a model testing program and using the test sample set, includes:

S301, receiving a test request for the machine learning model to be tested;

S302, calling a model test service according to the test request to test the model to be tested by using the test sample set;

S303, outputting a test result of the machine learning model to be tested.

The request provided by the embodiment of the invention carries the test information of the model to be tested, and the test information comprises a model file, a test data set and parameters of the model to be tested.

The model test service provided by the embodiment of the invention comprises a plurality of deep learning frameworks; different frameworks among the plurality of deep learning frameworks are used for building different test models; the different test models are used for testing different deep learning models.
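One way to picture the request-driven test service of this example is a registry that maps a framework name to a tester function. The sketch below is an assumption-laden illustration: the `framework` field, the class names and the `threshold_tester` stand-in are hypothetical, and no real deep learning framework is loaded.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ModelTestRequest:
    """Test information carried by the request: model file, test data, parameters."""
    framework: str  # assumed routing key, not named in the source
    model_file: str
    test_data: List[Tuple[List[float], int]]
    parameters: Dict[str, Any] = field(default_factory=dict)


Tester = Callable[[ModelTestRequest], Dict[str, Any]]


class ModelTestService:
    """Dispatches each request to the tester registered for its framework."""

    def __init__(self) -> None:
        self._testers: Dict[str, Tester] = {}

    def register(self, framework: str, tester: Tester) -> None:
        self._testers[framework] = tester

    def handle(self, request: ModelTestRequest) -> Dict[str, Any]:
        if request.framework not in self._testers:
            raise ValueError(f"no tester registered for {request.framework!r}")
        return self._testers[request.framework](request)


def threshold_tester(request: ModelTestRequest) -> Dict[str, Any]:
    # Stand-in tester: the "model" is a threshold rule taken from the request
    # parameters instead of being loaded from model_file.
    threshold = request.parameters["threshold"]
    correct = sum(
        1 for features, label in request.test_data
        if int(features[0] >= threshold) == label
    )
    return {"model_file": request.model_file,
            "accuracy": correct / len(request.test_data)}


service = ModelTestService()
service.register("framework_a", threshold_tester)

request = ModelTestRequest(
    framework="framework_a",
    model_file="model.bin",  # hypothetical file name
    test_data=[([0.2], 0), ([0.8], 1), ([0.6], 0)],
    parameters={"threshold": 0.5},
)
result = service.handle(request)
print(result)
```

Supporting another framework then only requires registering one more tester, without changing the request format or the dispatch code.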

Example 3

The machine learning model training method for a large-scale machine learning system according to the embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, as shown in Fig. 5, the evaluating of the trained machine learning model by the model evaluation module through an evaluation program to obtain a model evaluation value includes:

S401, obtaining the value range of each parameter of the machine learning model through the model evaluation module;

S402, determining, by the evaluation program, an initial value of each parameter within its value range;

S403, controlling, by the central processing unit, the evaluation program to adjust each parameter to its initial value, and obtaining a model evaluation value from the evaluation program.

The model evaluation value provided by the embodiment of the invention is used for indicating the performance of the machine learning model after parameter adjustment.
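The three evaluation steps of this example can be sketched as below. The source does not say how the initial value is chosen inside each range or which metric the evaluation value uses, so this sketch assumes the midpoint of each range and plain accuracy on labelled samples; `initial_values`, `evaluate_at_initial_values` and `build_threshold_model` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

ParamRanges = Dict[str, Tuple[float, float]]


def initial_values(param_ranges: ParamRanges) -> Dict[str, float]:
    """Pick the midpoint of each value range as the initial value (assumption)."""
    return {name: (low + high) / 2 for name, (low, high) in param_ranges.items()}


def evaluate_at_initial_values(
    param_ranges: ParamRanges,
    build_model: Callable[[Dict[str, float]], Callable[[float], int]],
    samples: List[Tuple[float, int]],
) -> Tuple[Dict[str, float], float]:
    """Set every parameter to its initial value and return the evaluation value."""
    params = initial_values(param_ranges)
    model = build_model(params)
    correct = sum(1 for x, label in samples if model(x) == label)
    return params, correct / len(samples)  # accuracy as the evaluation value


def build_threshold_model(params: Dict[str, float]) -> Callable[[float], int]:
    # Stand-in model: predicts 1 when the input reaches the threshold.
    return lambda x: int(x >= params["threshold"])


ranges = {"threshold": (0.0, 1.0)}
samples = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
params, evaluation_value = evaluate_at_initial_values(
    ranges, build_threshold_model, samples
)
print(params, evaluation_value)
```

The returned evaluation value is the quantity a later optimization step would compare across different parameter settings.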

Example 4

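The machine learning model training method for a large-scale machine learning system according to the embodiment of the present invention is shown in Fig. 1. As a preferred embodiment, the adjusting of the model parameters of the machine learning model by the model optimization module through a model optimization program, according to the obtained model evaluation value and the model test result, to obtain the optimal parameters and the optimal model includes:

re-determining the value of each parameter within the value range of each parameter according to the obtained model evaluation value and the model test result, and comparing the model evaluation values and model test results corresponding to the parameter values, wherein the parameter values corresponding to the best model evaluation value and model test result are taken as the optimal parameters; and controlling the parameter adjusting program to adjust the parameters of the model based on the optimal parameters, so as to obtain the optimal model.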

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented wholly or partially in software, they may take the form of a computer program product that includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

The above description is only intended to illustrate specific embodiments of the present invention and is not to be construed as limiting its scope of protection; all modifications, equivalents and improvements made within the spirit and principles of the invention fall within the scope of protection defined by the appended claims.

Claims

1. A machine learning model training method for a large-scale machine learning system, characterized in that the machine learning model training method for a large-scale machine learning system comprises the following steps:

Step 1, obtain the latest feature set of the machine learning model and the incremental data in the current time period through the data acquisition module using the data acquisition device; process and divide the acquired feature set and the incremental data in the current time period through the data preprocessing module using the data preprocessing program to obtain a training sample set and a test sample set;

wherein the processing and division of the acquired feature set and the incremental data in the current time period by the data preprocessing module using the data preprocessing program includes:

(1.1) for the acquired feature set and the incremental data in the current time period, use the Jaccard index to calculate the weight of each item of feature data and incremental data, forming a first weight set;

(1.2) compare the weights of the feature data and incremental data in the first weight set with a preset weight threshold, and screen out the feature data and incremental data that meet the requirement to obtain a first data subset;

(1.3) for the first data subset, use the Relief-F algorithm to calculate the weight of each item of feature data and incremental data, forming a second weight set;

(1.4) compare the weight of each item of feature data and incremental data in the second weight set with a preset threshold, and select the feature data and incremental data that meet the requirement to obtain the final data set;

(1.5) perform cluster analysis on the obtained data set to obtain multiple data subsets; extract the same proportion of data from each data subset as training data to obtain multiple training subsets, and use the remaining data in each data subset as a test subset;

(1.6) combine the multiple training subsets to obtain a training set, and combine the multiple test subsets to obtain a test set;

(1.7) calculate the mean and standard deviation of the physical and chemical data corresponding to the current training set and to the current test set respectively; based on these means and standard deviations, calculate the mean error and the standard deviation error between the physical and chemical values of the training set and the test set;

(1.8) if the calculated mean error and standard deviation error between the physical and chemical values of the training set and the test set are each less than or equal to a preset threshold, use the current training set and test set as the final training set and final test set; otherwise, return to step (1.5);

Step 2: process the training sample set to obtain the feature subset of the training sample set; determine the range of the model parameters to be selected according to the type of the machine learning model through the parameter range determination module using a range determination program;

Step 3: through the central control module, use the central processing unit to coordinate and control the model training module to select model parameters in turn within the range of the model parameters, and to train the machine learning model with the feature subset of the training sample set using a model training program;

Step 4: test the trained machine learning model with the test sample set through the model testing module using a model testing program; evaluate the trained machine learning model through the model evaluation module using an evaluation program to obtain a model evaluation value;

Step 5: adjust the model parameters of the machine learning model through the model optimization module using a model optimization program according to the obtained model evaluation value and model test result, so as to obtain the optimal parameters and the optimal model;

Step 6: store, in a memory through the data storage module, the acquired feature set, training sample set, test sample set, model parameter range, model training result, model test result, model evaluation value, optimal parameters, and optimal model;

Step 7: update and display, through the update display module, the real-time data of the acquired feature set, training sample set, test sample set, model parameter range, model training result, model test result, model evaluation value, optimal parameters, and optimal model.
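Sub-steps (1.5) to (1.8) of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the default ratio and tolerances, and the retry limit are hypothetical; cluster labels are assumed to be supplied by an earlier clustering step; each "error" is assumed to be the absolute difference between the training-set and test-set statistic; and on failure only the random draw is repeated rather than the clustering itself.

```python
import random
import statistics


def stratified_split(values, clusters, train_ratio, rng):
    """Steps (1.5)-(1.6): draw the same proportion from every cluster, then merge."""
    by_cluster = {}
    for value, cluster in zip(values, clusters):
        by_cluster.setdefault(cluster, []).append(value)
    train, test = [], []
    for members in by_cluster.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_ratio)
        train.extend(shuffled[:cut])   # training subset of this cluster
        test.extend(shuffled[cut:])    # remaining data becomes the test subset
    return train, test


def split_until_balanced(values, clusters, train_ratio=0.75,
                         mean_tol=0.5, std_tol=0.5, max_tries=100, seed=0):
    """Steps (1.7)-(1.8): accept a split only if both errors are within tolerance."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        train, test = stratified_split(values, clusters, train_ratio, rng)
        # Assumed error definition: absolute difference of the two statistics.
        mean_err = abs(statistics.mean(train) - statistics.mean(test))
        std_err = abs(statistics.pstdev(train) - statistics.pstdev(test))
        if mean_err <= mean_tol and std_err <= std_tol:
            return train, test
    raise RuntimeError("no balanced split found within max_tries")


# Hypothetical measured values with two precomputed clusters.
values = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
train_set, test_set = split_until_balanced(values, clusters)
```

Sampling per cluster keeps both sets representative of every data subset, and the mean and standard deviation check rejects draws in which the two sets still differ too much.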

2. The machine learning model training method for a large-scale machine learning system according to claim 1, characterized in that, in step 4, testing the trained machine learning model with the test sample set through the model testing module using the model testing program includes:

(1) Receive the test request of the machine learning model to be tested;

(2) Invoke the model testing service according to the test request and use the test sample set to test the model to be tested;

(3) Output the test result of the machine learning model to be tested.

3. The machine learning model training method for a large-scale machine learning system according to claim 2, characterized in that the test request carries test information of the model to be tested, and the test information comprises a model file, a test data set, and parameters of the model to be tested.
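The request-driven test flow of claims 2 and 3 can be sketched as follows. This is only an illustration under stated assumptions: `ModelTestRequest`, `run_model_test`, and `load_threshold_model` are hypothetical names, the "model file" is just a path string handed to a caller-supplied loader, and accuracy is assumed as the reported test result because the claims do not name a metric.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Sample = Tuple[List[float], int]          # (features, expected label)
Predictor = Callable[[List[float]], int]


@dataclass
class ModelTestRequest:
    """Test information carried by a request: model file, test data set, parameters."""
    model_file: str
    test_data: List[Sample]
    params: Dict[str, float] = field(default_factory=dict)


def run_model_test(request: ModelTestRequest,
                   load_model: Callable[[str, Dict[str, float]], Predictor]) -> dict:
    """Receive the request, invoke the test on the test data set, return the result."""
    predict = load_model(request.model_file, request.params)
    correct = sum(1 for features, label in request.test_data
                  if predict(features) == label)
    return {
        "model_file": request.model_file,
        "n_samples": len(request.test_data),
        "accuracy": correct / len(request.test_data),  # assumed metric
    }


def load_threshold_model(model_file: str, params: Dict[str, float]) -> Predictor:
    """Hypothetical loader: ignores the file and classifies by a single threshold."""
    threshold = params["threshold"]
    return lambda features: int(features[0] >= threshold)


request = ModelTestRequest(
    model_file="model.bin",  # placeholder path, never opened in this sketch
    test_data=[([0.2], 0), ([0.8], 1), ([0.6], 0), ([0.9], 1)],
    params={"threshold": 0.5},
)
result = run_model_test(request, load_threshold_model)
```

Passing the loader as an argument leaves room for a service that supports several frameworks, since each framework would supply its own loader.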

4. The machine learning model training method for a large-scale machine learning system according to claim 1, wherein the model testing service includes multiple deep learning frameworks for building different test models, and the different test models are used to test different deep learning models.

5. The machine learning model training method for a large-scale machine learning system according to claim 1, wherein evaluating the trained machine learning model through the model evaluation module using the evaluation program to obtain the model evaluation value includes:

(1) Obtain the value range of each parameter of the machine learning model through the model evaluation module;

(2) Within the value range of each parameter, use the evaluation program to determine the initial value of the corresponding parameter;

(3) The central processing unit controls the evaluation program to adjust each parameter to the initial value, and obtains the model evaluation value from the evaluation program.

6. The machine learning model training method for a large-scale machine learning system according to claim 5, wherein the model evaluation value is used to indicate the performance of the machine learning model after parameter adjustment.

7. The machine learning model training method for a large-scale machine learning system according to claim 5, characterized in that adjusting the model parameters of the machine learning model through the model optimization module using the model optimization program according to the obtained model evaluation value and model test result, so as to obtain the optimal parameters and the optimal model, includes:

According to the obtained model evaluation value and model test result, re-determine the value of each parameter within the value range of that parameter; compare the model evaluation values and model test results corresponding to the different parameter values; the parameter values corresponding to the optimal model evaluation value and model test result are the optimal parameters; and control the parameter adjustment program to adjust the parameters of the model based on the optimal parameters to obtain the optimal model.
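The parameter selection described in claims 5 to 7 can be sketched as a search over the value range of each parameter. The sketch below makes several assumptions that the claims do not state: the value ranges are finite lists of candidates, every combination is tried exhaustively, the evaluation value and test result are already combined into a single score by a caller-supplied `evaluate` function, and a higher score is better. All names are hypothetical.

```python
import itertools
from typing import Callable, Dict, List, Tuple


def select_optimal_parameters(
    param_ranges: Dict[str, List[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Tuple[Dict[str, float], float]:
    """Try each parameter combination in range and keep the best-scoring one."""
    names = sorted(param_ranges)
    best_params: Dict[str, float] = {}
    best_score = float("-inf")
    for combo in itertools.product(*(param_ranges[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)           # assumed: higher means better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: Dict[str, float]) -> float:
    """Hypothetical evaluation: peaks at lr = 0.1 and depth = 3."""
    return -((params["lr"] - 0.1) ** 2) - (params["depth"] - 3) ** 2


ranges = {"lr": [0.01, 0.1, 1.0], "depth": [1, 3, 5]}
best_params, best_score = select_optimal_parameters(ranges, toy_score)
```

In practice `evaluate` would train and test the model with the given parameters. Exhaustive search is only one way to re-determine values within the ranges, and becomes costly as the number of parameters grows.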

8. A machine learning model training system for a large-scale machine learning system that implements the machine learning model training method for a large-scale machine learning system according to any one of claims 1-7, wherein the machine learning model training system for a large-scale machine learning system includes:

Data acquisition module, data preprocessing module, parameter range determination module, central control module, model training module, model testing module, model evaluation module, model optimization module, data storage module, update display module;

a data acquisition module, connected to the central control module, for acquiring the latest feature set of the machine learning model and incremental data in the current time period through the data acquisition device;

a data preprocessing module, connected with the central control module, for processing and dividing the acquired feature set and the incremental data in the current time period through the data preprocessing program to obtain a training sample set and a test sample set, and for processing the training sample set to obtain the feature subset of the training sample set;

a parameter range determination module, connected with the central control module, for determining the range of the model parameters to be selected according to the type of the machine learning model through the range determination program;

a central control module, connected with the data acquisition module, the data preprocessing module, the parameter range determination module, the model training module, the model testing module, the model evaluation module, the model optimization module, the data storage module, and the update display module, for coordinating and controlling, through the central processing unit, the normal operation of each module of the machine learning model training system for a large-scale machine learning system;

A model training module, connected with the central control module, for selecting model parameters in sequence within the range of the model parameters, and using the feature subset of the training sample set to train the machine learning model through a model training program;

a model testing module, connected to the central control module, for testing the machine learning model obtained by training with a test sample set through a model testing program;

a model evaluation module, connected with the central control module, for evaluating the machine learning model obtained by training through an evaluation program to obtain a model evaluation value;

a model optimization module, connected with the central control module, for adjusting the model parameters of the machine learning model according to the obtained model evaluation value and the model test result through the model optimization program to obtain the optimal model;

a data storage module, connected with the central control module, for storing in a memory the acquired feature set, training sample set, test sample set, range of model parameters, model training results, model testing results, model evaluation values, optimal parameters, and optimal model;

an update display module, connected with the central control module, for updating and displaying the real-time data of the acquired feature set, training sample set, test sample set, range of model parameters, model training results, model testing results, model evaluation values, optimal parameters, and optimal model.
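The hub-and-spoke arrangement in claim 8, in which every module is connected to the central control module, can be pictured with a small sketch. It is an assumption-laden illustration only: modules are modeled as plain functions that receive and return a shared context dictionary, and the `CentralControl` class, its methods, and the two toy modules are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Module = Callable[[Context], Context]


class CentralControl:
    """Hypothetical coordinator: runs registered modules in order on a shared context."""

    def __init__(self) -> None:
        self._modules: List[Tuple[str, Module]] = []

    def register(self, name: str, module: Module) -> None:
        """Connect a module to the central control."""
        self._modules.append((name, module))

    def run(self, context: Context) -> Context:
        """Invoke each module in registration order and record which ones ran."""
        for name, module in self._modules:
            context = module(context)
            context.setdefault("log", []).append(name)
        return context


def acquire(context: Context) -> Context:
    """Toy data acquisition module: supplies raw data."""
    return {**context, "data": [3, 1, 2]}


def preprocess(context: Context) -> Context:
    """Toy data preprocessing module: sorts the acquired data."""
    return {**context, "data": sorted(context["data"])}


control = CentralControl()
control.register("data_acquisition", acquire)
control.register("data_preprocessing", preprocess)
final_context = control.run({})
```

Training, testing, evaluation, optimization, storage, and display modules would be registered in the same way, so that modules exchange data only through the coordinator.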

9. A computer program product stored on a computer-readable medium, comprising a computer-readable program that, when executed on an electronic device, provides a user input interface to implement the machine learning model training method for a large-scale machine learning system according to any one of claims 1 to 7.

10. A computer-readable storage medium storing instructions that, when executed on a computer, cause the computer to execute the machine learning model training method for a large-scale machine learning system according to any one of claims 1 to 7.