CN107844837B - Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm - Google Patents


Info

Publication number
CN107844837B
Authority
CN
China
Prior art keywords
machine learning
algorithm parameter
algorithm
parameter values
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711048805.3A
Other languages
Chinese (zh)
Other versions
CN107844837A (en)
Inventor
戴文渊
陈雨强
杨强
张舒羽
栾淑君
刘守湘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN201711048805.3A (CN107844837B)
Priority to CN202010496368.7A (CN111652380B)
Publication of CN107844837A
Application granted
Publication of CN107844837B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/4451 User profiles; Roaming

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Stored Programmes (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method and a system for tuning algorithm parameters of a machine learning algorithm are provided. The method comprises the following steps: (A) determining a machine learning algorithm for training a machine learning model; (B) providing a user with a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how a plurality of sets of candidate algorithm parameter values are generated; (C) receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation; (D) generating a plurality of sets of candidate algorithm parameter values based on the acquired parameter tuning configuration items; (E) training, according to the machine learning algorithm, a machine learning model under each set of candidate algorithm parameter values, so that each set has a corresponding machine learning model; and (F) evaluating the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values.

Description

Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm
Technical Field
The present invention relates generally to the field of artificial intelligence and, more particularly, to a method and system for tuning algorithm parameters of a machine learning algorithm.
Background
At present, the basic process of training the machine learning model mainly includes:
1. importing a data set (e.g., a data table) containing historical data records;
2. completing feature engineering, in which the attribute information of the data records in the data set is processed in various ways to obtain various features (which may include, for example, combined features), and a feature vector formed from these features serves as a machine learning sample;
3. training a model, in which the model is learned from the machine learning samples obtained through feature engineering according to a chosen machine learning algorithm (such as a logistic regression algorithm, a decision tree algorithm, or a neural network algorithm). Here, the algorithm parameters of the machine learning algorithm have a significant influence on the quality of the learned model.
On existing machine learning platforms, the model training process can be completed interactively through a graphical interface, so that the user does not need to write program code personally. However, in the model training stage, the algorithm parameter values entered into the platform are still set manually. That is, the user must tune the algorithm parameters in advance and cannot rely on the platform to tune them automatically.
Tuning algorithm parameters is often complex and usually done by hand, which makes it time-consuming and labor-intensive. In addition, obtaining algorithm parameter values that yield a good model requires understanding the internal principles of machine learning, including the meaning and range of influence of each algorithm parameter and the interactions among the algorithm parameters. The technical threshold is therefore high, and the user has to make repeated attempts, which greatly reduces the efficiency and the experience of model training.
Disclosure of Invention
An exemplary embodiment of the present invention provides a method and a system for adjusting and optimizing algorithm parameters for a machine learning algorithm, so as to solve the problem that in the prior art, automatic adjustment and optimization of algorithm parameters for a machine learning algorithm used for training a machine learning model cannot be conveniently performed in a machine learning system.
According to an exemplary embodiment of the present invention, there is provided a method for tuning algorithm parameters of a machine learning algorithm, including: (A) determining a machine learning algorithm for training a machine learning model; (B) providing a user with a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how a plurality of sets of candidate algorithm parameter values are generated, and each set of candidate algorithm parameter values includes one candidate algorithm parameter value for each algorithm parameter to be tuned of the machine learning algorithm; (C) receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation; (D) generating a plurality of sets of candidate algorithm parameter values based on the acquired parameter tuning configuration items; (E) training, according to the machine learning algorithm, a machine learning model under each set of candidate algorithm parameter values, so that each set has a corresponding machine learning model; and (F) evaluating the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values.
Optionally, the method further comprises: (G) displaying to the user the generated sets of candidate algorithm parameter values and the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values.
Optionally, the method further comprises: (H) directly setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect, and applying the set algorithm parameter values to the subsequent steps of training the machine learning model.
Optionally, the parameter tuning configuration items include at least one of: an initial value configuration item for specifying an initial value of an algorithm parameter to be tuned, so that at least one candidate algorithm parameter value of the algorithm parameter to be tuned is generated in step (D) based on the specified initial value; a value range configuration item for specifying a value range of the algorithm parameter to be tuned, so that at least one candidate algorithm parameter value of the algorithm parameter to be tuned is generated in step (D) based on the specified value range; and a parameter tuning method configuration item for specifying a method of generating the plurality of sets of candidate algorithm parameter values, so that in step (D) the plurality of sets of candidate algorithm parameter values are generated according to the specified method based on the at least one candidate algorithm parameter value of each algorithm parameter to be tuned.
Optionally, in step (E), the machine learning models corresponding to each set of candidate algorithm parameter values are trained in parallel, wherein in training the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, parameters of the machine learning models corresponding to each set of candidate algorithm parameter values are maintained by a parameter server, wherein the parameters have the form of key-value pairs, the parameter server storing multiple key-value pairs having a same key as a single key corresponding to multiple values.
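As a rough illustration of the storage form described above, the following Python sketch merges key-value pairs that share a key into a single key mapped to several values, with one slot per set of candidate algorithm parameter values. The function name `merge_by_key`, the use of `None` for missing entries, and the sample keys are assumptions made for illustration and are not taken from the patent.

```python
from collections import defaultdict


def merge_by_key(per_model_params):
    """Store the parameters of several models as one key -> list of values.

    per_model_params: list of dicts, one per candidate parameter set,
    each mapping a parameter key (for example a feature id) to a weight.
    A key missing from a model is stored as None so that list positions
    stay aligned with the model index (an illustrative choice).
    """
    merged = defaultdict(lambda: [None] * len(per_model_params))
    for model_index, params in enumerate(per_model_params):
        for key, value in params.items():
            merged[key][model_index] = value
    return dict(merged)


# Two models trained under two different candidate parameter sets.
models = [{"k1": 0.10, "k2": 0.50}, {"k1": 0.20, "k3": -0.30}]
store = merge_by_key(models)
print(store)
```

In this form the key `k1` is held once even though both models use it, which is the saving the single-key, multiple-value layout is aiming at.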
Optionally, in step (E), training, by a plurality of computing devices, machine learning models corresponding to each set of candidate algorithm parameter values in parallel, wherein, when training, in parallel, machine learning models corresponding to each set of candidate algorithm parameter values, parameters of the machine learning models corresponding to each set of candidate algorithm parameter values are maintained by a parameter server, wherein the parameter server includes at least one server side and a plurality of client sides, wherein the client sides correspond to the computing devices one-to-one, and the corresponding client sides and the computing devices are integrated into a whole, wherein the at least one server side is used for storing parameters of the machine learning models corresponding to each set of candidate algorithm parameter values; each client is used for transmitting parameter operation instructions of parameters related to the machine learning algorithm under at least one set of candidate algorithm parameter values with one or more server sides, wherein the computing device corresponding to each client is configured to train a machine learning model according to the machine learning algorithm under the at least one set of candidate algorithm parameter values respectively, and identical keys are compressed and/or combined in the parameter operation instructions.
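One possible reading of "identical keys are compressed and/or combined" is that a client which trains several models sends each key only once in a parameter operation instruction. The sketch below shows that idea for a pull-style request. The function `merge_pull_requests`, the request layout, and the notion of a "pull" instruction are hypothetical and serve only to make the idea concrete.

```python
def merge_pull_requests(requests):
    """Combine the keys requested by several models into one instruction.

    requests: dict mapping a model id to the list of keys that model needs.
    Returns (unique_keys, wanted): unique_keys lists every key exactly once,
    and wanted maps each key to the model ids that asked for it.
    """
    wanted = {}
    for model_id, keys in requests.items():
        for key in keys:
            wanted.setdefault(key, []).append(model_id)
    unique_keys = sorted(wanted)
    return unique_keys, wanted


# Three models on one client, with heavily overlapping keys.
requests = {0: ["k1", "k2"], 1: ["k1", "k3"], 2: ["k1", "k2"]}
unique_keys, wanted = merge_pull_requests(requests)
print(unique_keys)
print(wanted)
```

Here six requested keys collapse to three transmitted keys, while `wanted` retains enough information to route the returned values back to each model.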
Optionally, in step (E), at each set of candidate algorithm parameter values, performing the same data-streaming calculations with respect to machine learning model training according to the machine learning algorithm, wherein the data-streaming calculations are performed by merging the same processing steps between the respective data-streaming calculations.
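The merging of identical processing steps can be pictured as computing a shared step once and reusing its output for every set of candidate algorithm parameter values. The following is a minimal sketch under that assumption. The toy feature-extraction step, the one-weight linear model, and the name `run_pipelines` are all illustrative stand-ins, not the data-streaming framework the patent has in mind.

```python
def run_pipelines(raw_rows, learning_rates):
    """Run one training computation per learning rate, sharing the common step."""
    calls = {"extract": 0}

    def extract_features(rows):
        # Shared step: identical for every candidate parameter set.
        calls["extract"] += 1
        return [(float(x), float(y)) for x, y in rows]

    shared = extract_features(raw_rows)  # executed once, reused below
    results = {}
    for lr in learning_rates:
        # Per-candidate step: one pass of SGD on squared error for y ~ w * x.
        w = 0.0
        for x, y in shared:
            w -= lr * 2 * (w * x - y) * x
        results[lr] = w
    return results, calls["extract"]


results, extract_calls = run_pipelines([(1, 2), (2, 4)], [0.01, 0.1])
print(extract_calls, results)
```

Only the step that actually depends on the candidate values is repeated; the feature extraction runs a single time regardless of how many candidate sets are tried.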
Optionally, the method further comprises: (I) setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values, and applying the set algorithm parameter values to the subsequent steps of training the machine learning model.
Optionally, the method further comprises: (J) saving, in the form of a configuration file, the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect.
Optionally, the method further comprises: (K) saving, in the form of a configuration file, a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values.
According to another exemplary embodiment of the present invention, there is provided a system for tuning algorithm parameters of a machine learning algorithm, including: an algorithm determining device for determining a machine learning algorithm for training a machine learning model; a display device for providing a user with a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how a plurality of sets of candidate algorithm parameter values are generated, and each set of candidate algorithm parameter values includes one candidate algorithm parameter value for each algorithm parameter to be tuned of the machine learning algorithm; a configuration item acquisition device for receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation; an algorithm parameter value generating device for generating a plurality of sets of candidate algorithm parameter values based on the acquired parameter tuning configuration items; at least one computing device configured to train, according to the machine learning algorithm, a machine learning model under each set of candidate algorithm parameter values, so that each set has a corresponding machine learning model; and an evaluation device for evaluating the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values.
Optionally, the display device further displays the generated sets of candidate algorithm parameter values and the trained effect of the machine learning model corresponding to each set of candidate algorithm parameter values to the user.
Optionally, the system further comprises: an application device for directly setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect, and applying the set algorithm parameter values to the subsequent steps of training the machine learning model.
Optionally, the parameter tuning configuration items include at least one of: an initial value configuration item for specifying an initial value of an algorithm parameter to be tuned, so that the algorithm parameter value generating device generates at least one candidate algorithm parameter value of the algorithm parameter to be tuned based on the specified initial value; a value range configuration item for specifying a value range of the algorithm parameter to be tuned, so that the algorithm parameter value generating device generates at least one candidate algorithm parameter value of the algorithm parameter to be tuned based on the specified value range; and a parameter tuning method configuration item for specifying a method of generating the plurality of sets of candidate algorithm parameter values, so that the algorithm parameter value generating device generates the plurality of sets of candidate algorithm parameter values according to the specified method based on the at least one candidate algorithm parameter value of each algorithm parameter to be tuned.
Optionally, the at least one computing device trains the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, wherein the parameters of the machine learning models corresponding to each set of candidate algorithm parameter values are maintained by a parameter server as the machine learning models corresponding to each set of candidate algorithm parameter values are trained in parallel by the at least one computing device, wherein the parameters are in the form of key-value pairs, the parameter server storing a plurality of key-value pairs having a same key in a form in which a single key corresponds to a plurality of values.
Optionally, the system includes a plurality of computing devices, wherein the plurality of computing devices train the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, wherein parameters of the machine learning models corresponding to each set of candidate algorithm parameter values are maintained by a parameter server when the plurality of computing devices train the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, wherein the parameter server includes at least one server side and a plurality of client sides, wherein the client sides correspond to the computing devices one-to-one, and the corresponding client sides and the computing devices are integrated, wherein the at least one server side is configured to store the parameters of the machine learning models corresponding to each set of candidate algorithm parameter values; each client is used for transmitting parameter operation instructions of parameters related to the machine learning algorithm under at least one set of candidate algorithm parameter values with one or more server sides, wherein the computing device corresponding to each client is configured to train a machine learning model according to the machine learning algorithm under the at least one set of candidate algorithm parameter values respectively, and identical keys are compressed and/or combined in the parameter operation instructions.
Optionally, the at least one computing device performs the same data-streaming calculations on machine learning model training according to the machine learning algorithm at each set of candidate algorithm parameter values, wherein the data-streaming calculations are performed by merging the same processing steps between the respective data-streaming calculations.
Optionally, the system further comprises: an application device for setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values, and applying the set algorithm parameter values to the subsequent steps of training the machine learning model.
Optionally, the system further comprises: a storage device for saving, in the form of a configuration file, the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect.
Optionally, the system further comprises: a storage device for saving, in the form of a configuration file, a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values.
According to another exemplary embodiment of the present invention, a computer-readable medium for algorithm parameter tuning for machine learning algorithms is provided, wherein a computer program for executing the method for algorithm parameter tuning for machine learning algorithms as described above is recorded on the computer-readable medium.
According to another exemplary embodiment of the present invention, a computer for algorithm parameter tuning for machine learning algorithms is provided, comprising a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions which, when executed by the processor, perform the method for algorithm parameter tuning for machine learning algorithms as described above.
According to the method and system for tuning algorithm parameters of a machine learning algorithm of exemplary embodiments of the present invention, a convenient, efficient, and user-friendly interactive tuning process is provided: the user only needs to set, through an interactive interface, the configuration items that define how a plurality of sets of candidate algorithm parameter values are generated, and the algorithm parameters are then tuned automatically, which improves both the user experience and the effect of the machine learning model.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The above and other objects and features of exemplary embodiments of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate exemplary embodiments, wherein:
FIG. 1 shows a flow diagram of a method of algorithm parameter tuning for a machine learning algorithm, according to an exemplary embodiment of the invention;
FIG. 2 illustrates an example of saving parameters of a machine learning model according to an exemplary embodiment of the invention;
FIG. 3 shows a flow diagram of a method of algorithm parameter tuning for a machine learning algorithm according to another exemplary embodiment of the present invention;
FIGS. 4 and 5 illustrate examples of graphical interfaces for setting parameter configuration items of a machine learning algorithm according to exemplary embodiments of the invention;
FIG. 6 illustrates an example of an algorithm parameter tuning analysis report according to an exemplary embodiment of the present invention;
FIG. 7 illustrates an example of a DAG graph for algorithm parameter tuning for a machine learning algorithm, according to an exemplary embodiment of the present invention;
FIG. 8 shows a block diagram of a system for algorithm parameter tuning for machine learning algorithms, according to an exemplary embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
Here, machine learning is an inevitable outcome of artificial intelligence research having developed to a certain stage; it aims to improve the performance of a system itself by computational means, using experience. In a computer system, "experience" usually exists in the form of "data", from which a "model" can be generated by a machine learning algorithm. That is, when empirical data is provided to a machine learning algorithm, a model can be generated from that data, and the model provides a corresponding judgment, i.e., a prediction, when faced with a new situation. Whether a machine learning model is being trained or a trained machine learning model is being used for prediction, the data needs to be converted into machine learning samples that include various features. Machine learning may be implemented in the form of "supervised learning," "unsupervised learning," or "semi-supervised learning"; it should be noted that exemplary embodiments of the present invention do not impose particular limitations on specific machine learning algorithms. It should also be noted that other means, such as statistical algorithms, may be incorporated during the training and application of the model.
FIG. 1 shows a flowchart of a method for tuning algorithm parameters of a machine learning algorithm according to an exemplary embodiment of the present invention. Here, as an example, the method may be executed by a computer program, or by a dedicated system or computer for tuning the algorithm parameters of a machine learning algorithm.
In step S10, a machine learning algorithm for training the machine learning model is determined.
For example, the machine learning algorithm may be a logistic regression algorithm such as the FTRL (Follow The Regularized Leader) optimization algorithm, or may be another machine learning algorithm; the present invention is not limited in this respect.
As an example, the machine learning algorithm used to train the machine learning model may be determined from input operations performed by a user on a graphical interface used to set the machine learning algorithm used to train the machine learning model.
In step S20, a graphical interface for setting parameter configuration items of the machine learning algorithm is provided to a user, wherein the parameter configuration items are used for defining how to generate a plurality of sets of candidate algorithm parameter values, wherein each set of candidate algorithm parameter values includes one candidate algorithm parameter value of each algorithm parameter to be adjusted of the machine learning algorithm. According to an exemplary embodiment of the present invention, a plurality of sets of candidate algorithm parameter values for algorithm parameter tuning may be generated based on parameter tuning configuration items set by a user.
It should be understood that the algorithm parameters to be tuned may be different or the same for different machine learning algorithms. As an example, when the machine learning algorithm is the FTRL optimization algorithm, the algorithm parameters to be tuned may include: maximum number of training rounds, learning rate, L1 regularization coefficient, and L2 regularization coefficient.
In step S30, an input operation performed on the graphical interface by the user to set the parameter configuration item is received, and the parameter configuration item set by the user is acquired according to the input operation.
As an example, the graphical interface provided to the user may include an input control corresponding to each parameter configuration item to select and/or edit content, so that the parameter configuration item set by the user may be acquired by receiving a selection operation and/or an editing operation of the user.
In step S40, a plurality of sets of candidate algorithm parameter values are generated based on the obtained parameter tuning configuration items.
As an example, the parameter configuration items may include at least one of: initial value configuration items, value range configuration items and parameter adjusting method configuration items. It should be understood that the parameter configuration items may also include other configuration items that define how sets of candidate algorithm parameter values are generated.
Specifically, the initial value configuration item is used to specify the initial values of the algorithm parameters to be tuned, so that at least one candidate algorithm parameter value of the algorithm parameters to be tuned is generated based on the specified initial values of the algorithm parameters to be tuned in step S40.
The value range configuration item is used to specify a value range of the to-be-adjusted algorithm parameter, so that at least one candidate algorithm parameter value of the to-be-adjusted algorithm parameter is generated based on the specified value range of the to-be-adjusted algorithm parameter in step S40.
As an example, the value range configuration item may further include a sampling range configuration item and a sampling number configuration item. Specifically, the sampling range configuration item is used to specify a numerical range for sampling, and the sampling number configuration item is used to specify the number of samples to take, so that in step S40, sampling is performed the specified number of times within the specified numerical range, and the sampled values are used as candidate algorithm parameter values of the algorithm parameter to be tuned.
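The text does not state how the samples are drawn within the range. The sketch below assumes uniform random sampling; evenly spaced values would be an equally plausible reading. The function name `sample_candidates` and the `seed` argument are hypothetical.

```python
import random


def sample_candidates(low, high, count, seed=None):
    """Draw `count` candidate values uniformly at random from [low, high].

    Uniform random sampling is an assumption made for this sketch.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if low > high:
        raise ValueError("low must not exceed high")
    rng = random.Random(seed)  # local generator so results are reproducible
    return [rng.uniform(low, high) for _ in range(count)]


# Sampling range 0.01 to 1.0, sampling number 4, for a learning rate.
learning_rates = sample_candidates(0.01, 1.0, count=4, seed=0)
print(learning_rates)
```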
As another example, the value range configuration item may further include a specific value configuration item. Specifically, the specific value configuration item is used to directly specify a specific value of the algorithm parameter to be adjusted, so that the specified specific value of the algorithm parameter to be adjusted is used as a candidate algorithm parameter value of the algorithm parameter to be adjusted in step S40.
As an example, the value range configuration item may include the sampling range configuration item and the sampling number configuration item together with the specific value configuration item, and either the sampling range and sampling number configuration items set by the user, or the specific value configuration item set by the user, may be obtained according to an input operation performed by the user on the graphical interface.
Furthermore, it should be understood that if the parameter tuning configuration items include both an initial value configuration item and a value range configuration item, then in step S40 at least one candidate algorithm parameter value of the algorithm parameter to be tuned may be generated based on the initial value specified by the initial value configuration item and the value range specified by the value range configuration item.
The parameter tuning method configuration item is used to specify a method of generating a plurality of sets of candidate algorithm parameter values, so that in step S40 the plurality of sets of candidate algorithm parameter values are generated according to the specified method based on at least one candidate algorithm parameter value of each algorithm parameter to be tuned.
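The configuration items described so far could be gathered into a small data structure once they have been read from the graphical interface. The following sketch is one hypothetical layout. The class names `ParamConfig` and `TuningConfig`, the field names, and the method strings are all assumptions and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ParamConfig:
    """Configuration items for one algorithm parameter to be tuned."""
    name: str
    initial_value: Optional[float] = None                  # initial value item
    sampling_range: Optional[Tuple[float, float]] = None   # sampling range item
    sampling_count: int = 0                                # sampling number item
    specific_values: List[float] = field(default_factory=list)  # specific value item


@dataclass
class TuningConfig:
    """All parameter tuning configuration items set by the user."""
    method: str = "grid_search"  # parameter tuning method item (hypothetical label)
    params: List[ParamConfig] = field(default_factory=list)


config = TuningConfig(
    method="random_search",
    params=[
        ParamConfig("learning_rate", initial_value=0.1,
                    sampling_range=(0.01, 1.0), sampling_count=3),
        ParamConfig("l1_coefficient", specific_values=[0.0, 0.1, 1.0]),
    ],
)
print(config)
```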
As an example, the method for generating the plurality of sets of candidate algorithm parameter values may be a method for randomly generating N sets of candidate algorithm parameter values based on all candidate algorithm parameter values of all algorithm parameters to be adjusted, where each set of candidate algorithm parameter values includes one candidate algorithm parameter value of each algorithm parameter to be adjusted, where N is an integer greater than 0, and a value of N may be specified by a parameter adjustment method configuration item or may be preset. Here, for example, the method of generating the plurality of sets of candidate algorithm parameter values may be a Random Search method.
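A minimal sketch of the random generation described above is given below, assuming each set is built by independently picking one candidate value per parameter. The function name `random_search` and the sample candidate values are hypothetical. The text does not say whether duplicate sets are excluded, and this sketch does not exclude them.

```python
import random


def random_search(candidates, n, seed=None):
    """Randomly generate n sets, each holding one value per parameter.

    candidates: dict mapping a parameter name to its candidate values.
    Duplicate sets are possible; nothing here filters them out.
    """
    if n < 1:
        raise ValueError("n must be an integer greater than 0")
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in candidates.items()}
        for _ in range(n)
    ]


random_candidates = {
    "learning_rate": [0.01, 0.1, 0.5],
    "l1_coefficient": [0.0, 1.0],
    "max_rounds": [10, 50],
}
random_sets = random_search(random_candidates, n=4, seed=42)
print(random_sets)
```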
As another example, the method of generating the plurality of sets of candidate algorithm parameter values may be a method of generating, based on all candidate algorithm parameter values of all algorithm parameters to be tuned, all different combinations that each include one candidate algorithm parameter value of each algorithm parameter to be tuned. Here, for example, the method of generating the plurality of sets of candidate algorithm parameter values may be a Grid Search method.
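Generating all different combinations amounts to taking a Cartesian product over the candidate values of each parameter. The following sketch shows one way to do this with the standard library; the function name `grid_search` and the candidate values are hypothetical.

```python
import itertools

def grid_search(candidates):
    """Return every combination holding one candidate value per parameter."""
    names = list(candidates)
    value_lists = [candidates[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]

# Hypothetical candidate values for two algorithm parameters to be tuned.
candidates = {
    "learning_rate": [0.1, 0.5, 1.0],
    "max_rounds": [2, 4, 8],
}
grid = grid_search(candidates)
```

With three candidate values for each of two parameters, the grid contains 3 × 3 = 9 sets.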
However, it should be noted that the above examples are only used for illustrating and explaining the exemplary embodiments of the present invention, and the exemplary embodiments of the present invention do not necessarily require a user to configure the above items, for example, at least one candidate algorithm parameter value of the algorithm parameter to be adjusted may be generated based on an initial value of the algorithm parameter to be adjusted which is set in advance; or, at least one candidate algorithm parameter value of the algorithm parameter to be adjusted can be generated based on the preset value range of the algorithm parameter to be adjusted; alternatively, multiple sets of candidate algorithm parameter values may be generated based on at least one candidate algorithm parameter value of each algorithm parameter to be tuned according to a preset method for generating multiple sets of candidate algorithm parameter values.
In step S50, a machine learning model corresponding to each set of candidate algorithm parameter values is trained according to the machine learning algorithm for each set of candidate algorithm parameter values. Specifically, the machine learning algorithm is executed under the condition that the algorithm parameter values of the algorithm parameters to be adjusted are each group of candidate algorithm parameter values, so as to respectively obtain the machine learning models corresponding to each group of candidate algorithm parameter values.
As an example, in step S50, the machine learning models corresponding to each set of candidate algorithm parameter values may be trained in parallel to improve the efficiency of algorithm parameter tuning and fully utilize computing resources.
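A minimal sketch of such parallel training is given below, assuming a thread pool as the parallel mechanism (the source does not specify one). The function `train` is a hypothetical stand-in for executing the machine learning algorithm under one set of candidate algorithm parameter values.

```python
from concurrent.futures import ThreadPoolExecutor

def train(params):
    """Stand-in for running the machine learning algorithm under one set of values."""
    # A toy "model": record the parameters together with a dummy weight.
    return {"params": params, "weight": params["learning_rate"] * params["max_rounds"]}

def train_all(candidate_sets, max_workers=4):
    """Train one model per candidate set concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so models[i] corresponds to candidate_sets[i].
        return list(pool.map(train, candidate_sets))

candidate_sets = [
    {"learning_rate": 0.5, "max_rounds": 4},
    {"learning_rate": 0.1, "max_rounds": 8},
]
models = train_all(candidate_sets)
```

For CPU-bound training in Python, a process pool or a cluster of computing devices would typically replace the thread pool.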
As one example, in training the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, parameters of the machine learning models corresponding to each set of candidate algorithm parameter values may be maintained by a parameter server, wherein the parameters have the form of key-value pairs (key-values), the parameter server storing multiple key-value pairs having a same key as a single key corresponding to multiple values, thereby avoiding a linear increase in storage overhead when storing parameters of the machine learning models corresponding to each set of candidate algorithm parameter values at the same time.
Specifically, the machine learning model corresponding to each set of candidate algorithm parameter values may correspond to a set of key-value pairs, in which each key may be associated with a model feature and each key corresponds to a respective value. Moreover, the sets of key-value pairs corresponding to the machine learning models for different sets of candidate algorithm parameter values have identical keys. As shown in fig. 2, the machine learning model corresponding to the 1st set of candidate algorithm parameter values corresponds to a set of key-value pairs comprising keys k1, k2, k3, …, km, which correspond to values v11, v12, v13, …, v1m, respectively; the machine learning model corresponding to the 2nd set of candidate algorithm parameter values corresponds to a set of key-value pairs comprising keys k1, k2, k3, …, km, which correspond to values v21, v22, v23, …, v2m, respectively; and the machine learning model corresponding to the nth set of candidate algorithm parameter values corresponds to a set of key-value pairs comprising keys k1, k2, k3, …, km, which correspond to values vn1, vn2, vn3, …, vnm, respectively, where m is an integer greater than 1 and n is an integer greater than 1. It can be seen that the n sets of key-value pairs have identical keys, and thus, according to an exemplary embodiment of the present invention, the parameter server may store key-value pairs having identical keys in a form in which a single key corresponds to multiple values, i.e., merge key-value pairs that have identical keys and correspond to different machine learning models into a form in which a single key corresponds to multiple values, e.g., a form in which key k1 corresponds to values v11, v21, …, vn1.
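The merged storage form can be illustrated with ordinary dictionaries. In this sketch, the function name `merge_models` and the numeric values are hypothetical; the point is only that each key is stored once while its values are kept per model.

```python
def merge_models(models):
    """Merge n per-model {key: value} maps that share the same keys into
    one {key: [v_1, ..., v_n]} map, so that each key is stored only once."""
    if any(m.keys() != models[0].keys() for m in models):
        raise ValueError("all models must have identical keys")
    return {key: [m[key] for m in models] for key in models[0]}

# Hypothetical parameters of the models for the 1st and 2nd candidate sets.
model_1 = {"k1": 0.11, "k2": 0.12, "k3": 0.13}
model_2 = {"k1": 0.21, "k2": 0.22, "k3": 0.23}
merged = merge_models([model_1, model_2])
# merged["k1"] holds the k1 values of model 1 and model 2, in that order.
```

Storing n models separately would repeat each key n times; the merged form repeats only the values.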
As another example, the machine learning models corresponding to each set of candidate algorithm parameter values may be trained in parallel by a plurality of computing devices, wherein, when the machine learning models corresponding to each set of candidate algorithm parameter values are trained in parallel, parameters of the machine learning models corresponding to each set of candidate algorithm parameter values may be maintained by a parameter server, which may have a distributed structure, wherein the parameter server may include at least one server side for holding parameters of the machine learning models corresponding to each set of candidate algorithm parameter values and a plurality of client sides, wherein the client sides correspond to the computing devices one-to-one, and the corresponding client sides and the computing devices are integrated into a whole; each client is used for transmitting parameter operation instructions of parameters related to the machine learning algorithm under at least one set of candidate algorithm parameter values (namely, parameters of the machine learning model corresponding to the at least one set of candidate algorithm parameter values) with one or more server sides, wherein the computing device corresponding to each client is configured to train the machine learning model according to the machine learning algorithm under the at least one set of candidate algorithm parameter values respectively, and in the parameter operation instructions, identical keys are compressed and/or combined. According to the embodiment of the invention, the network overhead for transmitting the parameter operation instruction between the client and the server can be effectively reduced.
As an example, each client may receive a parameter operation request of a parameter related to the machine learning algorithm under the at least one set of candidate algorithm parameter values from a corresponding computing device, respectively generate a parameter operation instruction corresponding to the parameter operation request for one or more servers storing the parameter, and respectively transmit the generated parameter operation instruction to the one or more servers. Further, as an example, each of the clients may receive a parameter operation instruction corresponding to a parameter operation result of the parameter from the one or more servers, generate a parameter operation result corresponding to each of the parameter operation requests based on the received parameter operation instruction, and transmit the generated parameter operation result to a corresponding computing device. For example, the parameter operation request may include a pull operation request and/or a push operation request.
According to the exemplary embodiment of the invention, each computing device requests to acquire and/or update parameters of at least one machine learning model from a corresponding client in the process of training the at least one machine learning model, wherein the parameters are distributively stored on one or more server sides. Therefore, after receiving any parameter operation request, the client splits the parameter operation request into parameter operation request parts corresponding to the server sides, and stores the split parts in corresponding queues. For example, a respective queue may be set for each server side. As an example, the parameter operation request based on which the client generates the parameter operation instruction each time may be each part cached in the queue, that is, at least one parameter operation request part for the corresponding server side received from the corresponding computing device after the client generates the parameter operation instruction last time and before the parameter operation instruction is generated this time. Since the parameter operation instructions corresponding to the respective server sides are respectively generated based on the respective queues, considering that each queue stores parameter operation requests related to at least one machine learning model, the correspondingly generated parameter operation instructions may be based on the same or different types of parameter operation requests, which may be directed to the same or different machine learning models.
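The client-side splitting, queueing, and key merging described above might be sketched as follows. This is an illustrative assumption, not the patented design: the class `Client`, the CRC-based placement rule in `server_of`, and the instruction layout (one key mapped to the list of operations and models that requested it) are all hypothetical.

```python
import zlib
from collections import defaultdict, deque

class Client:
    def __init__(self, num_servers):
        self.num_servers = num_servers
        self.queues = [deque() for _ in range(num_servers)]  # one queue per server side

    def server_of(self, key):
        # Hypothetical placement rule: a stable hash of the key picks the server side.
        return zlib.crc32(key.encode()) % self.num_servers

    def request(self, op, model_id, keys):
        """Split one pull/push request into per-server parts and queue them."""
        parts = defaultdict(list)
        for key in keys:
            parts[self.server_of(key)].append(key)
        for server, part_keys in parts.items():
            self.queues[server].append((op, model_id, part_keys))

    def build_instructions(self):
        """Drain each queue into one instruction per server, listing each key once."""
        instructions = {}
        for server, queue in enumerate(self.queues):
            merged = defaultdict(list)  # key -> [(op, model_id), ...]
            while queue:
                op, model_id, part_keys = queue.popleft()
                for key in part_keys:
                    merged[key].append((op, model_id))
            if merged:
                instructions[server] = dict(merged)
        return instructions

client = Client(num_servers=2)
client.request("pull", model_id=1, keys=["k1", "k2", "k3"])
client.request("pull", model_id=2, keys=["k1", "k2", "k3"])
instructions = client.build_instructions()
```

Because both models share the same keys, each key appears once per instruction instead of once per model, which is the source of the reduced network overhead.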
As another example, the same data-streaming calculations for machine learning model training may be performed according to the machine learning algorithm under each set of candidate algorithm parameter values, wherein the data-streaming calculations are performed by combining the same processing steps between the respective data-streaming calculations, thereby reducing the actual amount of computation and the amount of reading and writing, resulting in performance improvement.
As an example, the same processing steps between the individual data-streaming calculations may be merged starting from upstream, i.e. a common upstream processing step between the individual data-streaming calculations is merged.
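One simple way to merge common upstream processing steps is to cache intermediate results by the sequence of steps that produced them. The sketch below assumes that two steps with the same name, reached through the same upstream steps, are identical; the function `run_pipelines` and the step names are hypothetical.

```python
def run_pipelines(pipelines, data):
    """Run several step sequences, executing each shared upstream prefix once."""
    cache = {}   # tuple of step names so far -> intermediate result
    calls = []   # names of the steps that were actually executed
    results = []
    for steps in pipelines:
        value, prefix = data, ()
        for name, fn in steps:
            prefix += (name,)
            if prefix not in cache:
                cache[prefix] = fn(value)
                calls.append(name)
            value = cache[prefix]
        results.append(value)
    return results, calls

def load(rows):
    return [float(v) for v in rows]

def extract(rows):
    return [v + 1 for v in rows]

# Two data-flow computations that differ only in the final, parameter-dependent step.
pipelines = [
    [("load", load), ("extract", extract), ("train_scale=0.5", lambda x: sum(x) * 0.5)],
    [("load", load), ("extract", extract), ("train_scale=2.0", lambda x: sum(x) * 2.0)],
]
results, calls = run_pipelines(pipelines, [1, 2, 3])
```

The loading and feature extraction steps run once even though two computations use them; only the parameter-dependent steps run separately.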
Returning to fig. 1, in step S60, the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values is evaluated. It should be appreciated that the goodness of the effect of the machine learning model corresponding to each set of candidate algorithm parameter values can reflect the goodness of each set of candidate algorithm parameter values.
As an example, the effect of the machine learning model corresponding to each set of candidate algorithm parameter values may be evaluated according to the evaluation value of the machine learning model corresponding to each set of candidate algorithm parameter values with respect to the evaluation index. Here, the evaluation index may be an evaluation index specified by an evaluation index configuration item set by a user through a graphical interface, or may be an evaluation index set in advance.
As an example, the evaluation index may be any of various model evaluation indexes for measuring the effect of the machine learning model. For example, the evaluation index may be AUC (Area Under the ROC (Receiver Operating Characteristic) Curve), MAE (Mean Absolute Error), a logarithmic loss function (logloss), or the like.
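For reference, the three evaluation indexes named above can be computed as in the following sketch, which uses their standard definitions in plain Python. The function names and the sample data are illustrative assumptions; the AUC here is the pairwise form, in which a tie between a positive and a negative score counts as half a win.

```python
import math

def auc(labels, scores):
    """Probability that a random positive sample scores above a random negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mae(targets, preds):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, preds)) / len(targets)

def logloss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc_value = auc(labels, scores)
```

A higher AUC indicates a better effect, whereas lower MAE and logloss indicate a better effect, so the direction of comparison depends on the chosen evaluation index.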
As an example, after step S60, the method for performing algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention may further include: directly setting the algorithm parameter values of the algorithm parameters to be adjusted of the machine learning algorithm to the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect, and applying the set algorithm parameter values to the subsequent step of training the machine learning model. Here, the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect is the optimal set of algorithm parameter values obtained by automatically tuning the algorithm parameters.
As an example, after step S60, the method for performing algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention may further include: saving the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect in the form of a configuration file, so that the saved values can be directly called according to user requirements when the subsequent step of training the machine learning model is executed, or when other machine learning processes are carried out.
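Selecting the best set and saving it as a configuration file could look like the sketch below. The source does not name a file format, so JSON is an assumption here, as are the function names, the file name `best_params.json`, and the scores.

```python
import json
import os
import tempfile

def best_candidate(results, higher_is_better=True):
    """Return the candidate set whose model has the best evaluation value."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda r: r["score"])["params"]

def save_config(params, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f, indent=2)

def load_config(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Hypothetical evaluation values (e.g. AUC) for three candidate sets.
results = [
    {"params": {"learning_rate": 0.1, "max_rounds": 4}, "score": 0.71},
    {"params": {"learning_rate": 0.5, "max_rounds": 8}, "score": 0.78},
    {"params": {"learning_rate": 1.0, "max_rounds": 2}, "score": 0.64},
]
best = best_candidate(results)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "best_params.json")
    save_config(best, path)
    restored = load_config(path)  # what a later training step would read back
```

For indexes such as MAE or logloss, `higher_is_better=False` selects the set with the smallest evaluation value.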
Fig. 3 shows a flowchart of a method for algorithm parameter tuning for a machine learning algorithm according to another exemplary embodiment of the present invention. As shown in fig. 3, the method for performing algorithm parameter tuning for a machine learning algorithm according to another exemplary embodiment of the present invention may further include step S70 in addition to step S10, step S20, step S30, step S40, step S50, and step S60 shown in fig. 1. Steps S10 to S60 can be implemented with reference to the specific embodiment described with reference to fig. 1, and will not be described herein again.
In step S70, the generated sets of candidate algorithm parameter values and the trained effects of the machine learning model corresponding to each set of candidate algorithm parameter values are displayed to the user. Here, the generated sets of candidate algorithm parameter values and the trained effects of the machine learning model corresponding to each set of candidate algorithm parameter values may be displayed in any effective form.
As an example, the method for performing algorithm parameter tuning for a machine learning algorithm according to another exemplary embodiment of the present invention may further include: setting the algorithm parameter values of the algorithm parameters to be adjusted of the machine learning algorithm to a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values, and applying the set algorithm parameter values to the subsequent step of training the machine learning model.
As another example, the method for performing algorithm parameter tuning for a machine learning algorithm according to another exemplary embodiment of the present invention may further include: saving a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values in the form of a configuration file, so that the saved values can be directly called according to user requirements when the subsequent step of training the machine learning model is executed, or when other machine learning processes are carried out.
As yet another example, the method for performing algorithm parameter tuning for a machine learning algorithm according to another exemplary embodiment of the present invention may further include: setting the algorithm parameter values of the algorithm parameters to be adjusted of the machine learning algorithm to a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values, applying the set algorithm parameter values to the subsequent step of training the machine learning model, and saving the selected set of candidate algorithm parameter values in the form of a configuration file.
An example in which a user sets parameter tuning configuration items through a graphical interface according to an exemplary embodiment of the present invention is described below with reference to fig. 4 and 5. Fig. 4 and 5 illustrate examples of graphical interfaces for setting the parameter tuning configuration items of a machine learning algorithm according to an exemplary embodiment of the present invention. It should be understood that the specific interaction details of the exemplary embodiments of the present invention in setting the various parameter tuning configuration items are not limited to the examples shown in fig. 4 and 5.
As shown in fig. 4 and 5, the graphical interface for setting the parameter tuning configuration items may display content options and/or content input boxes corresponding to the initial value configuration item, the value range configuration item, and the parameter tuning method configuration item, respectively. Specifically, the parameter tuning method configuration item may be set according to a selection operation of the user in a pull-down menu, so that the content selected by the user is specified as the parameter tuning method. For example, as shown in fig. 4, the user selects the parameter tuning method option "Random Search", so that "Random Search" is designated as the parameter tuning method. A content input box for setting the "number of parameter tuning times" may then be displayed to the user, and the number of sets of candidate algorithm parameter values generated with "Random Search" is designated according to an editing operation of the user on that content input box (for example, inputting the value "6" as shown in fig. 4 designates the generation of 6 sets of candidate algorithm parameter values).
The graphical interface may also display initial value configuration items and/or value range configuration items corresponding to the algorithm parameters to be adjusted of the machine learning algorithm. As shown in fig. 4, the machine learning algorithm used for training the machine learning model is here an FTRL optimization algorithm, and the algorithm parameters to be adjusted of the FTRL optimization algorithm may include: the maximum number of training rounds, the learning rate, the L1 regular term coefficient, and the L2 regular term coefficient. Accordingly, the graphical interface may display content input boxes for setting the initial values of these algorithm parameters to be adjusted, so that the initial value configuration items can be set according to the editing operations of the user on the content input boxes (for example, inputting the values "4", "0.5", "0", and "0" in the corresponding content input boxes, respectively, as shown in fig. 4).
In response to an input operation by which the user selects the parameter range setting option, a graphical interface for setting the value range configuration items may pop up. As shown in fig. 5, the popped-up graphical interface may display the value range configuration items corresponding to the respective algorithm parameters to be adjusted. For each algorithm parameter to be adjusted, the user can select a "specified range" option or a "numerical enumeration" option. If the user selects the "specified range" option, content input boxes for setting the sampling range configuration item and the sampling number configuration item may be displayed to the user, and the value range of the corresponding algorithm parameter to be adjusted is set according to the input operation of the user on those content input boxes. If the user selects the "numerical enumeration" option, a content input box for setting the specific value configuration item may be displayed to the user, and the value range of the corresponding algorithm parameter to be adjusted is set according to the input operation of the user on that content input box. For example, as shown in fig. 5, for the parameter to be adjusted that is the maximum number of training rounds, the value range can be specified as the value obtained by sampling 1 time within the range 1 to 10, according to the editing operation of the user on the "sampling range" content input box (inputting the values "1" and "10") and on the "number of sampling times" content input box (inputting the value "1"). For the parameter to be adjusted that is the learning rate, the value range can be specified as the values "2", "4", and "8", according to the editing operation of the user on the "numerical enumeration" content input box (inputting the values "2", "4", and "8").
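The two kinds of value range configuration shown in fig. 5 could be turned into candidate values as in the following sketch. The source does not state how sampling within a range is performed, so uniform random sampling is an assumption here; the dictionary layout, the `integer` flag, and the function name `candidate_values` are likewise hypothetical.

```python
import random

def candidate_values(item, seed=None):
    """Generate the candidate values of one parameter from its value range item."""
    rng = random.Random(seed)
    if item["mode"] == "enumeration":
        # Specific value configuration item: use the listed values as-is.
        return list(item["values"])
    if item["mode"] == "range":
        # Sampling range and sampling number configuration items.
        low, high = item["sampling_range"]
        count = item["sampling_number"]
        if item.get("integer", False):
            return [rng.randint(low, high) for _ in range(count)]
        return [rng.uniform(low, high) for _ in range(count)]
    raise ValueError(f"unknown mode: {item['mode']!r}")

# The settings described for fig. 5.
max_rounds = candidate_values(
    {"mode": "range", "sampling_range": (1, 10), "sampling_number": 1, "integer": True},
    seed=0,
)
learning_rate = candidate_values({"mode": "enumeration", "values": [2, 4, 8]})
```

The resulting per-parameter candidate values are the input from which the sets of candidate algorithm parameter values are then generated by the chosen parameter tuning method.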
An example of displaying the generated sets of candidate algorithm parameter values and the trained effects of the machine learning model corresponding to each set of candidate algorithm parameter values to a user according to an exemplary embodiment of the present invention is described below with reference to fig. 6. In the example of fig. 6, the generated sets of candidate algorithm parameter values and the trained effects of the machine learning model corresponding to each set of candidate algorithm parameter values are displayed in the form of an algorithm parameter tuning analysis report.
As shown in fig. 6, the analysis report shows the generated 6 sets of candidate algorithm parameter values and the trained effects of the machine learning models corresponding to each set of candidate algorithm parameter values (i.e., evaluation values regarding the evaluation index "AUC"), and the 6 sets of candidate algorithm parameter values may be arranged according to the merits of the effects of the corresponding machine learning models. In addition, the analysis report may also show that the parameter tuning method used for automatically tuning the algorithm parameters is "Random Search", the parameter tuning times are "6", and the initial values of the maximum training round number, the learning rate, the L1 regular term coefficient, and the L2 regular term coefficient of the algorithm parameters to be tuned are "4", "0.5", "0", and "0", respectively.
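An analysis report ordered by model effect can be produced with a simple sort. The sketch below is illustrative only: the function name `format_report`, the column layout, and the evaluation values are hypothetical and are not taken from fig. 6.

```python
def format_report(results, metric="AUC"):
    """Return report lines with candidate sets ordered from best to worst."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    lines = [f"rank  {metric:<8}  parameters"]
    for rank, r in enumerate(ranked, start=1):
        params = ", ".join(f"{k}={v}" for k, v in r["params"].items())
        lines.append(f"{rank:<4}  {r['score']:<8.4f}  {params}")
    return lines

# Hypothetical evaluation values for three candidate sets.
results = [
    {"params": {"max_rounds": 4, "learning_rate": 2}, "score": 0.74},
    {"params": {"max_rounds": 7, "learning_rate": 8}, "score": 0.81},
    {"params": {"max_rounds": 2, "learning_rate": 4}, "score": 0.69},
]
report = format_report(results)
print("\n".join(report))
```

Sorting in descending order suits indexes such as AUC; for error-type indexes the order would be reversed.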
Further, as an example, a user may select a set of candidate algorithm parameter values from the algorithm parameter tuning analysis report shown in FIG. 6 to apply to subsequent machine learning steps and/or to save in the form of a configuration file.
According to an exemplary embodiment of the invention, a machine learning process may be performed in the form of a directed acyclic graph (DAG graph), which may encompass all or part of the steps for performing machine learning model training, testing, or prediction. For example, a DAG graph including a historical data import step, a data split step, a feature extraction step, and an automatic parameter tuning step may be built for algorithm parameter auto-tuning. That is, the various steps described above may be performed as nodes in a DAG graph.
Fig. 7 illustrates an example of a DAG graph with algorithm parameter tuning for a machine learning algorithm, according to an exemplary embodiment of the invention.
Referring to fig. 7, the first step is to establish a data import node. For example, as shown in fig. 7, the data import node may be set in response to a user operation to import a banking data table named "bank_jin" into the machine learning platform, where the data table may contain a plurality of historical data records.
The second step is to establish a data splitting node and connect the data import node to the data splitting node, so as to split the imported data table into a training set and a validation set, wherein data records in the training set are converted into machine learning samples for learning the model, and data records in the validation set are converted into test samples for verifying the effect of the learned model. The data splitting node may be set in response to a user operation to split the imported data table into the training set and the validation set in a set manner.
The third step is to establish two feature extraction nodes and connect the data splitting node to each of them, so as to perform feature extraction on the training set and the validation set output by the data splitting node, respectively; for example, by default, the left side of the data splitting node outputs the training set and the right side outputs the validation set. Feature extraction may be performed on the training set and the validation set based on a feature configuration set by the user in the feature extraction nodes or based on code written by the user. It should be understood that the feature extraction modes for the machine learning samples and the test samples are consistent with each other. The user can directly apply the feature extraction mode configured for the left-side feature extraction node to the right-side feature extraction node, or the platform can keep the settings of the left-side and right-side feature extraction nodes automatically synchronized.
The fourth step is to establish an automatic parameter tuning node and connect the two feature extraction nodes to the automatic parameter tuning node. The automatic parameter tuning node may be set in response to a user operation; for example, when an input operation in which the user clicks the "automatic parameter tuning" node is received, a graphical interface for setting the parameter tuning configuration items as shown in fig. 4 and 5 may be provided to the user, so that the user can set the parameter tuning configuration items through the graphical interface.
After the DAG graph including the above steps is built, the entire DAG graph can be run according to the user's instructions. In the operation process, the machine learning platform can automatically generate a plurality of groups of candidate algorithm parameter values according to configuration items set by a user; respectively training a machine learning model corresponding to each group of candidate algorithm parameter values according to the machine learning algorithm under each group of candidate algorithm parameter values; and evaluating the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values.
In addition, as an example, after the automatic parameter adjustment node, the model training node may also be established, and the automatic parameter adjustment node is connected to the model training node, so as to directly set the algorithm parameter values of the algorithm parameters to be adjusted of the machine learning algorithm used by the model training node as a set of candidate algorithm parameter values corresponding to the best-effect machine learning model. Accordingly, the model training node may be set in response to a user operation to train the model in the set manner.
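The node structure of fig. 7, extended with the optional model training node, can be expressed as a dependency mapping and ordered for execution. This sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the node names are hypothetical labels for the nodes described above.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of upstream nodes that must finish before it runs.
dag = {
    "data_import": set(),
    "data_split": {"data_import"},
    "feature_extraction_train": {"data_split"},
    "feature_extraction_valid": {"data_split"},
    "auto_tuning": {"feature_extraction_train", "feature_extraction_valid"},
    "model_training": {"auto_tuning"},
}

# static_order() yields the nodes so that every node follows all of its upstream nodes.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

The relative order of the two feature extraction nodes is not fixed by the graph, which is what allows them to run in parallel.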
FIG. 8 shows a block diagram of a system for algorithm parameter tuning for machine learning algorithms, according to an exemplary embodiment of the present invention. As shown in fig. 8, the system for performing algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention includes: algorithm determination means 10, display means 20, configuration item acquisition means 30, algorithm parameter value generation means 40, at least one calculation means 50, evaluation means 60.
Specifically, the algorithm determination device 10 is used to determine a machine learning algorithm for training a machine learning model.
The display device 20 is configured to provide the user with a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items are used for defining how to generate a plurality of sets of candidate algorithm parameter values, and each set of candidate algorithm parameter values includes one candidate algorithm parameter value of each algorithm parameter to be adjusted of the machine learning algorithm.
The configuration item obtaining device 30 is used for receiving an input operation performed by the user on the graphical interface for setting the parameter tuning configuration items, and for obtaining the parameter tuning configuration items set by the user according to the input operation.
The algorithm parameter value generating device 40 is used for generating a plurality of sets of candidate algorithm parameter values based on the obtained parameter tuning configuration items.
As an example, the parameter tuning configuration items may include at least one of: an initial value configuration item for specifying an initial value of the algorithm parameter to be adjusted, so that the algorithm parameter value generating device 40 generates at least one candidate algorithm parameter value of the algorithm parameter to be adjusted based on the specified initial value; a value range configuration item for specifying a value range of the algorithm parameter to be adjusted, so that the algorithm parameter value generating device 40 generates at least one candidate algorithm parameter value of the algorithm parameter to be adjusted based on the specified value range; and a parameter tuning method configuration item for specifying a method of generating the plurality of sets of candidate algorithm parameter values, so that the algorithm parameter value generating device 40 generates the plurality of sets of candidate algorithm parameter values according to the specified method based on the at least one candidate algorithm parameter value of each algorithm parameter to be adjusted.
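One possible in-memory representation of these configuration items is sketched below. The class and field names are hypothetical; the example values are those described for fig. 4 and 5 (Random Search with 6 sets, initial values 4 and 0.5, a sampling range of 1 to 10 sampled once, and the enumerated values 2, 4, and 8).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class ParamConfig:
    """Configuration items for one algorithm parameter to be adjusted."""
    name: str
    initial_value: Optional[float] = None                  # initial value configuration item
    sampling_range: Optional[Tuple[float, float]] = None   # sampling range configuration item
    sampling_number: Optional[int] = None                  # sampling number configuration item
    values: Optional[List[float]] = None                   # specific value configuration item

@dataclass
class TuningConfig:
    """Parameter tuning configuration for one machine learning algorithm."""
    method: str = "Random Search"     # parameter tuning method configuration item
    num_sets: Optional[int] = None    # number of sets to generate, if the method needs it
    params: List[ParamConfig] = field(default_factory=list)

config = TuningConfig(
    method="Random Search",
    num_sets=6,
    params=[
        ParamConfig("max_rounds", initial_value=4, sampling_range=(1, 10), sampling_number=1),
        ParamConfig("learning_rate", initial_value=0.5, values=[2, 4, 8]),
    ],
)
```

All fields are optional because, as noted in the description, any item the user does not set may fall back to a preset value.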
The at least one computing device 50 is configured to train, according to the machine learning algorithm, a machine learning model corresponding to each set of candidate algorithm parameter values under each set of candidate algorithm parameter values, respectively.
As an example, the at least one computing device 50 may train the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, wherein, when the at least one computing device 50 trains the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, parameters of the machine learning models corresponding to each set of candidate algorithm parameter values may be maintained by a parameter server, wherein the parameters are in the form of key-value pairs, the parameter server storing multiple key-value pairs having a same key in the form of a single key corresponding to multiple values.
As another example, a system for algorithm parameter tuning for machine learning algorithms according to an exemplary embodiment of the present invention may include a plurality of computing devices 50, wherein the plurality of computing devices 50 may train machine learning models corresponding to each set of candidate algorithm parameter values in parallel, wherein, when the plurality of computing devices 50 train the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, parameters of the machine learning model corresponding to each set of candidate algorithm parameter values may be maintained by a parameter server, wherein the parameter server comprises at least one server side and a plurality of client sides, wherein the client sides correspond to the computing devices 50 one by one, and the corresponding client sides and the computing devices 50 are integrated into a whole, the at least one server is used for storing parameters of the machine learning model corresponding to each group of candidate algorithm parameter values; each client is used for transmitting parameter operation instructions of parameters related to the machine learning algorithm under at least one set of candidate algorithm parameter values with one or more server sides, wherein the computing device 50 corresponding to each client is configured to train a machine learning model according to the machine learning algorithm under the at least one set of candidate algorithm parameter values respectively, and in the parameter operation instructions, the same keys are compressed and/or combined.
As another example, the at least one computing device 50 may perform the same data-streaming calculations for machine learning model training in accordance with the machine learning algorithm at each set of candidate algorithm parameter values, wherein the data-streaming calculations are performed by merging the same processing steps between the respective data-streaming calculations.
The evaluation device 60 is used for evaluating the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values.
As an example, the display device 20 may also display to the user the generated sets of candidate algorithm parameter values and the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values.
As an example, the system for performing algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention may further include an application device (not shown). The application device is used to directly set the values of the algorithm parameters to be tuned of the machine learning algorithm to the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect, and to apply the set values to a subsequent step of training the machine learning model; or it is used to set the values of the algorithm parameters to be tuned to a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values, and to apply the set values to a subsequent step of training the machine learning model.
As an example, the system for performing algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention may further include a storage device (not shown). The storage device is used to save, in the form of a configuration file, the set of candidate algorithm parameter values corresponding to the machine learning model with the best effect; or it is used to save, in the form of a configuration file, a set of candidate algorithm parameter values selected by the user from the displayed sets of candidate algorithm parameter values.
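The saving of the best set of candidate algorithm parameter values as a configuration file could look roughly like the following Python sketch. The JSON format, the `params` and `score` field names, and the assumption that a higher score means a better effect are all illustrative choices; the embodiment does not specify a file format or an evaluation metric here.

```python
import json


def best_config_text(results):
    """Return the best candidate parameter set as JSON text.

    `results` is a list of dicts with hypothetical fields "params"
    (one set of candidate algorithm parameter values) and "score"
    (the evaluated effect; higher is assumed to be better).
    """
    best = max(results, key=lambda r: r["score"])
    return json.dumps(best["params"], indent=2, sort_keys=True)


def save_best_config(results, path):
    """Write the best candidate parameter set to a configuration file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(best_config_text(results))


results = [
    {"params": {"learning_rate": 0.1, "max_depth": 3}, "score": 0.81},
    {"params": {"learning_rate": 0.01, "max_depth": 5}, "score": 0.86},
]
print(best_config_text(results))
```

Saving a set selected by the user instead of the best one would only change which entry of `results` is serialized.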
It should be understood that, according to the exemplary embodiment of the present invention, a specific implementation of the system for performing algorithm parameter tuning on a machine learning algorithm may be implemented with reference to the related specific implementation described in conjunction with fig. 1 to 7, and will not be described herein again.
The system for algorithm parameter tuning for machine learning algorithms according to exemplary embodiments of the present invention may comprise devices implemented in software, hardware, firmware, or any combination thereof, each configured to perform a specific function. These devices may correspond, for example, to an application-specific integrated circuit, to pure software code, or to a module combining software and hardware. Further, one or more functions implemented by these devices may also be performed collectively by components of a physical device (e.g., a processor, a client, a server, or the like).
It should be understood that the method for algorithm parameter tuning for machine learning algorithms according to an exemplary embodiment of the present invention may be implemented by a program recorded on a computer-readable medium. For example, according to an exemplary embodiment of the present invention, there may be provided a computer-readable medium for algorithm parameter tuning for machine learning algorithms, wherein the computer-readable medium has recorded thereon a computer program for executing the following method steps: (A) determining a machine learning algorithm for training a machine learning model; (B) providing to a user a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how to generate a plurality of groups of candidate algorithm parameter values, and each group of candidate algorithm parameter values comprises one candidate algorithm parameter value for each algorithm parameter to be tuned of the machine learning algorithm; (C) receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation; (D) generating a plurality of groups of candidate algorithm parameter values based on the acquired parameter tuning configuration items; (E) training, for each group of candidate algorithm parameter values, a corresponding machine learning model according to the machine learning algorithm under that group of candidate algorithm parameter values; and (F) evaluating the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values.
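Steps (D) to (F) can be sketched in a few lines of Python. The sketch assumes that the acquired configuration items have already been reduced to a list of candidate values per parameter and that the groups are generated by exhaustive combination (a grid search), which is only one possible generation method. The functions `generate_candidates` and `tune`, the parameter names, and the toy `train` and `evaluate` callables are hypothetical.

```python
from itertools import product


def generate_candidates(value_lists):
    """Step (D): build every combination of the candidate values.

    `value_lists` maps each parameter to be tuned to its candidate
    values; each returned dict holds one value per parameter.
    """
    names = sorted(value_lists)
    combos = product(*(value_lists[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def tune(value_lists, train, evaluate):
    """Steps (E) and (F): train and evaluate one model per group."""
    results = []
    for params in generate_candidates(value_lists):
        model = train(params)  # caller-supplied training routine
        results.append((params, evaluate(model)))
    return results


# Toy stand-ins: the "model" is just its parameters, and the score
# is an arbitrary function of them (higher is assumed to be better).
value_lists = {"learning_rate": [0.1, 0.01], "depth": [2, 3]}
results = tune(
    value_lists,
    train=lambda params: params,
    evaluate=lambda model: -model["learning_rate"] * model["depth"],
)
best_params, best_score = max(results, key=lambda r: r[1])
print(len(results), best_params)
```

In a real system `train` and `evaluate` would wrap the chosen machine learning algorithm and an evaluation metric, and the loop could be distributed across several computing devices.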
The computer program in the computer-readable medium may be executed in an environment deployed on a computer device such as a client, a host, a proxy device, or a server. It should be noted that the computer program may also be used to perform additional steps beyond those above, or to perform more specific processing when the above steps are performed; these additional steps and further processing have been described with reference to figs. 1 to 7 and are not repeated here.
It should be noted that the system for algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention may rely entirely on the running of a computer program to realize the corresponding functions; that is, each device corresponds to a step in the functional architecture of the computer program, so that the whole system is invoked through a dedicated software package (e.g., a lib library) to realize the corresponding functions.
On the other hand, each means included in the system for performing algorithm parameter tuning for a machine learning algorithm according to an exemplary embodiment of the present invention may also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that a processor may perform the corresponding operations by reading and executing the corresponding program code or code segments.
For example, exemplary embodiments of the present invention may also be implemented as a computer including a storage component and a processor, the storage component having stored therein a set of computer-executable instructions that, when executed by the processor, perform a method of algorithm parameter tuning for a machine learning algorithm.
In particular, the computer may be deployed in a server or client, or on a node device in a distributed network environment. Further, the computer may be a personal computer (PC), tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the set of instructions.
The computer need not be a single device, but may be any collection of devices or circuits capable of executing the instructions (or sets of instructions) described herein, either individually or jointly. The computer may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In the computer, the processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
Some of the operations described in the method for performing algorithm parameter tuning for a machine learning algorithm according to the exemplary embodiment of the present invention may be implemented by software, some of the operations may be implemented by hardware, and further, the operations may be implemented by a combination of hardware and software.
The processor may execute instructions or code stored in the storage component, which may also store data. Instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The storage component may be integral to the processor, e.g., with RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the storage component may comprise a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled, or may communicate with each other, for example through an I/O port or a network connection, so that the processor can read files stored in the storage component.
In addition, the computer may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computer may be connected to each other via a bus and/or a network.
The operations involved in a method of algorithm parameter tuning for machine learning algorithms according to an exemplary embodiment of the present invention may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may equally be integrated into a single logic device or operate according to imprecise boundaries.
For example, as described above, a computer for algorithm parameter tuning for machine learning algorithms according to exemplary embodiments of the present invention may include a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions that, when executed by the processor, perform the steps of: (A) determining a machine learning algorithm for training a machine learning model; (B) providing to a user a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how to generate a plurality of groups of candidate algorithm parameter values, and each group of candidate algorithm parameter values comprises one candidate algorithm parameter value for each algorithm parameter to be tuned of the machine learning algorithm; (C) receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation; (D) generating a plurality of groups of candidate algorithm parameter values based on the acquired parameter tuning configuration items; (E) training, for each group of candidate algorithm parameter values, a corresponding machine learning model according to the machine learning algorithm under that group of candidate algorithm parameter values; and (F) evaluating the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values.
While exemplary embodiments of the invention have been described above, it should be understood that the above description is illustrative only and not exhaustive, and that the invention is not limited to the exemplary embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. Therefore, the protection scope of the present invention should be subject to the scope of the claims.

Claims (20)

1. A method of algorithm parameter tuning for a machine learning algorithm, comprising:
(A) determining a machine learning algorithm for training a machine learning model;
(B) providing to a user a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how to generate a plurality of groups of candidate algorithm parameter values, and each group of candidate algorithm parameter values comprises one candidate algorithm parameter value for each algorithm parameter to be tuned of the machine learning algorithm;
(C) receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation;
(D) generating a plurality of groups of candidate algorithm parameter values based on the acquired parameter tuning configuration items;
(E) training, for each group of candidate algorithm parameter values, a corresponding machine learning model according to the machine learning algorithm under that group of candidate algorithm parameter values;
(F) evaluating the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values,
wherein in step (E), the same data-streaming computation for machine learning model training is performed according to the machine learning algorithm under each group of candidate algorithm parameter values,
wherein the data-streaming computations are executed with the processing steps that are identical across the respective data-streaming computations merged, the identical processing steps being merged starting from upstream.
2. The method of claim 1, further comprising:
(G) displaying to the user the generated groups of candidate algorithm parameter values and the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values.
3. The method of claim 1, further comprising:
(H) directly setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to the group of candidate algorithm parameter values corresponding to the machine learning model with the best effect, and applying the set algorithm parameter values to a subsequent step of training the machine learning model.
4. The method of claim 1, wherein the parameter tuning configuration items comprise at least one of: an initial value configuration item for specifying an initial value of an algorithm parameter to be tuned, such that in step (D) at least one candidate algorithm parameter value of the algorithm parameter to be tuned is generated based on the specified initial value; a value range configuration item for specifying a value range of the algorithm parameter to be tuned, such that in step (D) at least one candidate algorithm parameter value of the algorithm parameter to be tuned is generated based on the specified value range; and a parameter tuning method configuration item for specifying a method of generating the plurality of groups of candidate algorithm parameter values, such that in step (D) the plurality of groups of candidate algorithm parameter values are generated, according to the specified method, based on at least one candidate algorithm parameter value of each algorithm parameter to be tuned.
5. The method of claim 1, wherein in step (E), the machine learning models corresponding to each set of candidate algorithm parameter values are trained in parallel,
wherein, while the machine learning models corresponding to each set of candidate algorithm parameter values are trained in parallel, the parameters of the machine learning model corresponding to each set of candidate algorithm parameter values are maintained by a parameter server, wherein the parameters are in the form of key-value pairs, and the parameter server stores multiple key-value pairs having the same key in a form in which a single key corresponds to multiple values.
6. The method of claim 1, wherein in step (E), the machine learning models corresponding to each set of candidate algorithm parameter values are trained in parallel by a plurality of computing devices,
wherein, when the machine learning models corresponding to each group of candidate algorithm parameter values are trained in parallel, the parameters of the machine learning model corresponding to each group of candidate algorithm parameter values are maintained by a parameter server, wherein the parameter server comprises at least one server side and a plurality of clients, the clients correspond one-to-one to the computing devices, and each client is integrated with its corresponding computing device, wherein the at least one server side is used for storing the parameters of the machine learning model corresponding to each group of candidate algorithm parameter values; each client is used for exchanging, with one or more server sides, parameter operation instructions concerning the parameters involved in the machine learning algorithm under at least one group of candidate algorithm parameter values, wherein the computing device corresponding to each client is configured to train a machine learning model according to the machine learning algorithm under each of the at least one group of candidate algorithm parameter values, and identical keys are compressed and/or merged in the parameter operation instructions.
7. The method of claim 2, further comprising:
(I) setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to a group of candidate algorithm parameter values selected by the user from the displayed groups of candidate algorithm parameter values, and applying the set algorithm parameter values to a subsequent step of training the machine learning model.
8. The method of claim 1, further comprising:
(J) saving, in the form of a configuration file, the group of candidate algorithm parameter values corresponding to the machine learning model with the best effect.
9. The method of claim 2 or 7, further comprising:
(K) saving, in the form of a configuration file, a group of candidate algorithm parameter values selected by the user from the displayed groups of candidate algorithm parameter values.
10. A system for algorithm parameter tuning for machine learning algorithms, comprising:
algorithm determining means for determining a machine learning algorithm for training a machine learning model;
a display device for providing to a user a graphical interface for setting parameter tuning configuration items of the machine learning algorithm, wherein the parameter tuning configuration items define how to generate a plurality of groups of candidate algorithm parameter values, and each group of candidate algorithm parameter values comprises one candidate algorithm parameter value for each algorithm parameter to be tuned of the machine learning algorithm;
configuration item acquisition means for receiving an input operation performed by the user on the graphical interface to set the parameter tuning configuration items, and acquiring the parameter tuning configuration items set by the user according to the input operation;
an algorithm parameter value generation device for generating a plurality of groups of candidate algorithm parameter values based on the acquired parameter tuning configuration items;
at least one computing device configured to train, for each group of candidate algorithm parameter values, a corresponding machine learning model according to the machine learning algorithm under that group of candidate algorithm parameter values;
an evaluation device for evaluating the effect of the trained machine learning model corresponding to each group of candidate algorithm parameter values,
wherein the at least one computing device performs the same data-streaming computation for machine learning model training according to the machine learning algorithm under each set of candidate algorithm parameter values,
wherein the data-streaming computations are executed with the processing steps that are identical across the respective data-streaming computations merged, the identical processing steps being merged starting from upstream.
11. The system of claim 10, wherein the display device further displays to the user the generated sets of candidate algorithm parameter values and the effect of the trained machine learning model corresponding to each set of candidate algorithm parameter values.
12. The system of claim 10, further comprising:
an application device for directly setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to the group of candidate algorithm parameter values corresponding to the machine learning model with the best effect, and applying the set algorithm parameter values to a subsequent step of training the machine learning model.
13. The system of claim 10, wherein the parameter tuning configuration items comprise at least one of: an initial value configuration item for specifying an initial value of an algorithm parameter to be tuned, such that the algorithm parameter value generation device generates at least one candidate algorithm parameter value of the algorithm parameter to be tuned based on the specified initial value; a value range configuration item for specifying a value range of the algorithm parameter to be tuned, such that the algorithm parameter value generation device generates at least one candidate algorithm parameter value of the algorithm parameter to be tuned based on the specified value range; and a parameter tuning method configuration item for specifying a method of generating the plurality of groups of candidate algorithm parameter values, such that the algorithm parameter value generation device generates the plurality of groups of candidate algorithm parameter values, according to the specified method, based on at least one candidate algorithm parameter value of each algorithm parameter to be tuned.
14. The system of claim 10, wherein the at least one computing device trains machine learning models corresponding to each set of candidate algorithm parameter values in parallel,
wherein, while the at least one computing device is training the machine learning models corresponding to each set of candidate algorithm parameter values in parallel, parameters of the machine learning models corresponding to each set of candidate algorithm parameter values are maintained by a parameter server, wherein the parameters are in the form of key-value pairs, the parameter server storing multiple key-value pairs having a same key in a form in which a single key corresponds to multiple values.
15. The system of claim 10, wherein the system comprises a plurality of computing devices, wherein the plurality of computing devices train a machine learning model corresponding to each set of candidate algorithm parameter values in parallel,
wherein, when the plurality of computing devices train the machine learning models corresponding to each group of candidate algorithm parameter values in parallel, the parameters of the machine learning model corresponding to each group of candidate algorithm parameter values are maintained by a parameter server, wherein the parameter server comprises at least one server side and a plurality of clients, the clients correspond one-to-one to the computing devices, and each client is integrated with its corresponding computing device, wherein the at least one server side is used for storing the parameters of the machine learning model corresponding to each group of candidate algorithm parameter values; each client is used for exchanging, with one or more server sides, parameter operation instructions concerning the parameters involved in the machine learning algorithm under at least one group of candidate algorithm parameter values, wherein the computing device corresponding to each client is configured to train a machine learning model according to the machine learning algorithm under each of the at least one group of candidate algorithm parameter values, and identical keys are compressed and/or merged in the parameter operation instructions.
16. The system of claim 11, further comprising:
an application device for setting the algorithm parameter values of the algorithm parameters to be tuned of the machine learning algorithm to a group of candidate algorithm parameter values selected by the user from the displayed groups of candidate algorithm parameter values, and applying the set algorithm parameter values to a subsequent step of training the machine learning model.
17. The system of claim 10, further comprising:
a storage device for saving, in the form of a configuration file, the group of candidate algorithm parameter values corresponding to the machine learning model with the best effect.
18. The system of claim 11 or 16, further comprising:
a storage device for saving, in the form of a configuration file, a group of candidate algorithm parameter values selected by the user from the displayed groups of candidate algorithm parameter values.
19. A computer-readable medium for algorithm parameter tuning for machine learning algorithms, wherein a computer program for performing the method for algorithm parameter tuning for machine learning algorithms according to any of claims 1 to 9 is recorded on the computer-readable medium.
20. A computer for algorithm parameter tuning for machine learning algorithms, comprising a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions which, when executed by the processor, perform the method for algorithm parameter tuning for machine learning algorithms of any one of claims 1 to 9.
CN201711048805.3A 2017-10-31 2017-10-31 Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm Active CN107844837B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201711048805.3A CN107844837B (en) 2017-10-31 2017-10-31 Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm
CN202010496368.7A CN111652380B (en) 2017-10-31 2017-10-31 Method and system for optimizing algorithm parameters aiming at machine learning algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711048805.3A CN107844837B (en) 2017-10-31 2017-10-31 Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm

Related Child Applications (1)

Application Number Priority Date Filing Date Title
CN202010496368.7A Division CN111652380B (en) 2017-10-31 2017-10-31 Method and system for optimizing algorithm parameters aiming at machine learning algorithm

Publications (2)

Publication Number Publication Date
CN107844837A CN107844837A (en) 2018-03-27
CN107844837B true CN107844837B (en) 2020-04-28

Family

ID=61681212

Family Applications (2)

Application Number Priority Date Filing Date Title
CN202010496368.7A Active CN111652380B (en) 2017-10-31 2017-10-31 Method and system for optimizing algorithm parameters aiming at machine learning algorithm
CN201711048805.3A Active CN107844837B (en) 2017-10-31 2017-10-31 Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm

Family Applications Before (1)

Application Number Priority Date Filing Date Title
CN202010496368.7A Active CN111652380B (en) 2017-10-31 2017-10-31 Method and system for optimizing algorithm parameters aiming at machine learning algorithm

Country Status (1)

Country Link
CN (2) CN111652380B (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509727B (en) * 2018-03-30 2022-04-08 深圳市智物联网络有限公司 Model selection processing method and device in data modeling
CN108710949A (en) * 2018-04-26 2018-10-26 第四范式(北京)技术有限公司 The method and system of template are modeled for creating machine learning
CN108681487B (en) * 2018-05-21 2021-08-24 千寻位置网络有限公司 Distributed system and method for adjusting and optimizing sensor algorithm parameters
WO2020011068A1 (en) * 2018-07-10 2020-01-16 第四范式(北京)技术有限公司 Method and system for executing machine learning process
CN110188886B (en) * 2018-08-17 2021-08-20 第四范式(北京)技术有限公司 Method and system for visualizing data processing steps of a machine learning process
CN111949349B (en) * 2018-08-21 2024-09-20 第四范式(北京)技术有限公司 Method and system for uniformly executing feature extraction
CN109284828A (en) * 2018-09-06 2019-01-29 沈文策 A kind of hyper parameter tuning method, device and equipment
CN109242040A (en) * 2018-09-28 2019-01-18 第四范式(北京)技术有限公司 Automatically generate the method and system of assemblage characteristic
CN109447277B (en) * 2018-10-19 2023-11-10 厦门渊亭信息科技有限公司 Universal machine learning super-ginseng black box optimization method and system
CN111178535B (en) * 2018-11-12 2024-05-07 第四范式(北京)技术有限公司 Method and apparatus for implementing automatic machine learning
CN109828836B (en) * 2019-01-20 2021-04-30 北京工业大学 Parameter dynamic configuration method for batch streaming computing system
CN111797990B (en) * 2019-04-08 2024-08-09 北京百度网讯科技有限公司 Training method, training device and training system of machine learning model
CN110019151B (en) * 2019-04-11 2024-03-15 深圳市腾讯计算机系统有限公司 Database performance adjustment method, device, equipment, system and storage medium
CN112101562B (en) * 2019-06-18 2024-01-30 第四范式(北京)技术有限公司 Implementation method and system of machine learning modeling process
CN112149836B (en) * 2019-06-28 2024-05-24 杭州海康威视数字技术股份有限公司 Machine learning program updating method, device and equipment
CN110414689A (en) * 2019-08-06 2019-11-05 中国工商银行股份有限公司 Update method and device on a kind of machine learning model line
CN110647998B (en) * 2019-08-12 2022-11-25 北京百度网讯科技有限公司 Method, system, device and storage medium for implementing automatic machine learning
CN110728371A (en) * 2019-09-17 2020-01-24 第四范式(北京)技术有限公司 System, method and electronic device for executing automatic machine learning scheme
CN110838069A (en) * 2019-10-15 2020-02-25 支付宝(杭州)信息技术有限公司 Data processing method, device and system
CN111047048B (en) * 2019-11-22 2023-04-07 支付宝(杭州)信息技术有限公司 Energized model training and merchant energizing method and device, and electronic equipment
CN111191795B (en) * 2019-12-31 2023-10-20 第四范式(北京)技术有限公司 Method, device and system for training machine learning model
CN111242320A (en) * 2020-01-16 2020-06-05 京东数字科技控股有限公司 Machine learning method and device, electronic equipment and storage medium
CN111311104B (en) * 2020-02-27 2023-08-25 第四范式(北京)技术有限公司 Recommendation method, device and system for configuration file
CN111831322B (en) * 2020-04-15 2023-08-01 中国人民解放军军事科学院战争研究院 Multi-level user-oriented machine learning parameter configuration method
CN111611240B (en) * 2020-04-17 2024-09-06 第四范式(北京)技术有限公司 Method, device and equipment for executing automatic machine learning process
CN111723939A (en) * 2020-05-15 2020-09-29 第四范式(北京)技术有限公司 Parameter adjusting method, device, equipment and system of machine learning model
CN111694844B (en) * 2020-05-28 2024-05-07 平安科技(深圳)有限公司 Enterprise operation data analysis method and device based on configuration algorithm and electronic equipment
CN114385256B (en) * 2020-10-22 2024-06-11 华为云计算技术有限公司 Configuration method and configuration device of system parameters
CN113886026B (en) * 2021-12-07 2022-03-15 中国电子科技集团公司第二十八研究所 Intelligent modeling method and system based on dynamic parameter configuration and process supervision
CN115470910A (en) * 2022-10-20 2022-12-13 晞德软件(北京)有限公司 Automatic parameter adjusting method based on Bayesian optimization and K-center sampling
CN118033207A (en) * 2024-02-27 2024-05-14 青岛汉泰电子有限公司 System and method for triggering all analog channels of display oscilloscope

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104200087A (en) * 2014-06-05 2014-12-10 清华大学 Parameter optimization and feature tuning method and system for machine learning
CN105912500A (en) * 2016-03-30 2016-08-31 百度在线网络技术(北京)有限公司 Machine learning model generation method and machine learning model generation device
CN106156810A (en) * 2015-04-26 2016-11-23 阿里巴巴集团控股有限公司 General-purpose machine learning algorithm model training method, system and computing node

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016061283A1 (en) * 2014-10-14 2016-04-21 Skytree, Inc. Configurable machine learning method selection and parameter optimization system and method
US10846596B2 (en) * 2015-11-23 2020-11-24 Daniel Chonghwan LEE Filtering, smoothing, memetic algorithms, and feasible direction methods for estimating system state and unknown parameters of electromechanical motion devices
CN106202431B (en) * 2016-07-13 2019-06-28 华中科技大学 Automated Hadoop parameter tuning method and system based on machine learning
CN107203809A (en) * 2017-04-20 2017-09-26 华中科技大学 Automated deep learning parameter tuning method and system based on Keras

Also Published As

Publication number Publication date
CN111652380A (en) 2020-09-11
CN111652380B (en) 2023-12-22
CN107844837A (en) 2018-03-27

Similar Documents

Publication Publication Date Title
CN107844837B (en) Method and system for adjusting and optimizing algorithm parameters aiming at machine learning algorithm
WO2019129060A1 (en) Method and system for automatically generating machine learning sample
CN112101562B (en) Implementation method and system of machine learning modeling process
US11595415B2 (en) Root cause analysis in multivariate unsupervised anomaly detection
CN111797998B (en) Method and system for generating combined features of machine learning samples
US20210241177A1 (en) Method and system for performing machine learning process
US11544604B2 (en) Adaptive model insights visualization engine for complex machine learning models
US20200090073A1 (en) Method and apparatus for generating machine learning model
CN107273979B (en) Method and system for performing machine learning prediction based on service level
CN113435602A (en) Method and system for determining feature importance of machine learning sample
US10592616B2 (en) Generating simulation data using a linear curve simplification and reverse simplification method
US12045734B2 (en) Optimizing gradient boosting feature selection
CN108898229B (en) Method and system for constructing machine learning modeling process
EP4239491A1 (en) Method and system for processing data tables and automatically training machine learning model
CN113626612B (en) Prediction method and system based on knowledge graph reasoning
JP2020060922A (en) Hyper parameter tuning method, device and program
US20220366315A1 (en) Feature selection for model training
CN113419941A (en) Evaluation method and apparatus, electronic device, and computer-readable storage medium
CN110895718A (en) Method and system for training machine learning model
Wang et al. Flint: A platform for federated learning integration
CN108073582B (en) Computing framework selection method and device
US11580196B2 (en) Storage system and storage control method
US20210365470A1 (en) Apparatus for recommending feature and method for recommending feature using the same
JP2020198135A (en) Hyper parameter tuning method, device and program
US20240143414A1 (en) Load testing and performance benchmarking for large language models using a cloud computing platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant