WO2021190379A1

WO2021190379A1 - Method and device for realizing automatic machine learning

Info

Publication number: WO2021190379A1
Application number: PCT/CN2021/081329
Authority: WO
Inventors: 罗远飞; 焦英翔; 涂威威
Original assignee: 第四范式（北京）技术有限公司
Priority date: 2020-03-25
Filing date: 2021-03-17
Publication date: 2021-09-30
Also published as: CN111340240A

Abstract

A method and device for realizing automatic machine learning, comprising: obtaining process configuration information set by a user and used for representing at least part of a machine learning modeling process, the process configuration information comprising one or more operation steps (S110); obtaining parameter configuration information set by the user for the at least part of the machine learning modeling process, the parameter configuration information comprising a value space of at least part of operation parameters involved in the operation steps (S120); on the basis of the process configuration information and different value combinations of the at least part of the operation parameters determined on the basis of the parameter configuration information, executing the at least part of the machine learning modeling process to obtain a plurality of execution results (S130); and outputting preferred values of the at least part of the operation parameters according to the plurality of execution results (S140). Therefore, the automatic machine learning can be realized by obtaining the process configuration information and the parameter configuration information set by the user, thereby reducing the development cost of the automatic machine learning.

Description

Method and device for realizing automatic machine learning

This disclosure claims priority to Chinese patent application No. 202010219796.5, filed on March 25, 2020 and entitled "Method and Device for Realizing Automatic Machine Learning".

Technical field

The present disclosure generally relates to the field of artificial intelligence, and more specifically, to a method and device for realizing automatic machine learning.

Background

Machine learning is an inevitable product of artificial intelligence research reaching a certain stage of development. It aims to improve the performance of a system through computational means, using experience. In a computer system, "experience" usually exists in the form of "data". A machine learning algorithm can generate a "model" from data. That is, when empirical data is provided to a machine learning algorithm, a model can be generated from that data, and when faced with a new situation, the model provides a corresponding judgment, namely a prediction result.

A machine learning model usually solves a problem in a specific scenario, and developing a machine learning model requires substantial labor and specialized expertise. Automatic machine learning (AutoML) emerged to address these shortcomings of traditional machine learning modeling solutions. The purpose of automatic machine learning is to use automated, data-driven methods to determine machine learning solutions.

However, the current development cost of automatic machine learning is relatively high. How to enable users to realize automatic machine learning through simple operations so as to reduce the development cost of automatic machine learning is a problem that needs to be solved urgently.

Summary of the invention

The exemplary embodiments of the present disclosure aim to provide a solution for realizing automatic machine learning in which a machine learning model is online, so as to reduce the cost for users to develop automatic machine learning.

According to a first aspect of the present disclosure, a method for realizing automatic machine learning is proposed, which includes: obtaining process configuration information set by a user for characterizing at least part of a machine learning modeling process, the process configuration information including one or more operation steps; obtaining parameter configuration information set by the user for the at least part of the machine learning modeling process, the parameter configuration information including a value space of at least part of the operating parameters involved in the operation steps; executing the at least part of the machine learning modeling process based on the process configuration information and on different value combinations of the at least part of the operating parameters determined based on the parameter configuration information, to obtain multiple execution results; and outputting preferred values of the at least part of the operating parameters according to the multiple execution results.

According to a second aspect of the present disclosure, an apparatus for realizing automatic machine learning is proposed, which includes: a first acquisition module configured to obtain process configuration information set by a user for characterizing at least part of a machine learning modeling process, the process configuration information including one or more operation steps; a second acquisition module configured to obtain parameter configuration information set by the user for the at least part of the machine learning modeling process, the parameter configuration information including a value space of at least part of the operating parameters involved in the operation steps; an execution module configured to execute the at least part of the machine learning modeling process based on the process configuration information and on different value combinations of the at least part of the operating parameters determined based on the parameter configuration information, to obtain multiple execution results; and an output module configured to output preferred values of the at least part of the operating parameters according to the multiple execution results.

According to a third aspect of the present disclosure, a system including at least one computing device and at least one storage device storing instructions is proposed, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to execute the method of the first aspect.

According to a fourth aspect of the present disclosure, a computer-readable storage medium storing instructions is proposed, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to execute the method of the first aspect of the present disclosure.

According to a fifth aspect of the present disclosure, a computing device is provided, including a processor and a memory, the memory storing a set of computer-executable instructions. When the set of computer-executable instructions is executed by the processor, the processor is caused to perform the following steps: obtaining process configuration information set by a user for characterizing at least part of a machine learning modeling process, the process configuration information including one or more operation steps; obtaining parameter configuration information set by the user for the at least part of the machine learning modeling process, the parameter configuration information including a value space of at least part of the operating parameters involved in the operation steps; executing the at least part of the machine learning modeling process based on the process configuration information and on different value combinations of the at least part of the operating parameters determined based on the parameter configuration information, to obtain multiple execution results; and outputting preferred values of the at least part of the operating parameters according to the multiple execution results.

According to the method and device for realizing automatic machine learning according to exemplary embodiments of the present disclosure, by acquiring process configuration information and parameter configuration information set by a user, automatic machine learning can be realized, thereby reducing the development cost of automatic machine learning.

Description of the drawings

From the following detailed description of the embodiments of the present disclosure in conjunction with the accompanying drawings, at least one of these aspects and advantages, and other aspects and advantages of the present disclosure will become clearer and easier to understand, in which:

Fig. 1 shows a flowchart of a method for realizing automatic machine learning according to an exemplary embodiment of the present disclosure;

Fig. 2 shows a calculation graph used to characterize a machine learning feature selection process;

Fig. 3 shows a structural block diagram of an apparatus for realizing automatic machine learning according to an exemplary embodiment of the present disclosure.

Detailed description

In order to enable those skilled in the art to better understand the present disclosure, exemplary embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings and specific implementations.

Fig. 1 shows a flowchart of a method for realizing automatic machine learning according to an exemplary embodiment of the present disclosure. The method shown in FIG. 1 can be completely implemented in software through a computer program, and the method shown in FIG. 1 can also be executed by a specially configured computing device.

Referring to FIG. 1, in step S110, process configuration information set by the user for representing at least a part of the machine learning modeling process is obtained, and the process configuration information includes one or more operation steps. Among them, different operation steps may have a predetermined execution sequence.

The at least part of the machine learning modeling process can be a complete machine learning modeling process or a part of one. As an example, the at least part of the machine learning modeling process may include, but is not limited to, one or more of a data splicing process, a data splitting process, a feature generation process, a model building process, a model training process, a model testing process, a model evaluation process, and a model application process.

The process configuration information set by the user can be obtained in a variety of ways. For example, the present disclosure can provide users with multiple ways of setting process configuration information, so that the process configuration information can be set by the user in any of these ways. Process configuration information set in different ways may take different forms; that is, the process configuration information mentioned in the present disclosure may refer to information in any of the forms that result from these setting methods.

For example, the process configuration information can be written by the user in a language that follows specific rules, such as program code. As an example, users can be provided with a data upload interface, a file uploaded by the user through the data upload interface can be obtained, and the process configuration information can be determined by parsing the file. The file can be written by the user in a language that follows specific rules. For example, the file can be program code written by the user for the at least part of the machine learning modeling process, where the program code defines one or more operation steps and the execution sequence between different operation steps.
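As a hedged illustration of process configuration information supplied as an uploaded file, the sketch below parses a small JSON document and derives an execution order for the operation steps. The JSON layout, the `steps` and `after` keys, and the step names are assumptions made for this example; the disclosure does not prescribe a file format.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical uploaded file: each step lists the steps it must run after.
uploaded_text = """
{
  "steps": [
    {"name": "split_data", "after": []},
    {"name": "generate_features", "after": ["split_data"]},
    {"name": "train_model", "after": ["generate_features"]},
    {"name": "evaluate_model", "after": ["train_model"]}
  ]
}
"""

def parse_process_config(text):
    """Return the operation steps in an order that respects their dependencies."""
    config = json.loads(text)
    predecessors = {step["name"]: set(step["after"]) for step in config["steps"]}
    return list(TopologicalSorter(predecessors).static_order())

execution_order = parse_process_config(uploaded_text)
```

A dependency list is used here rather than a fixed sequence because it also covers steps that have no mutual ordering constraint.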

For another example, the process configuration information may also be set by the user through visual operations (such as drag and drop). As an example, an interactive interface for setting the at least part of the machine learning modeling process can be shown to the user, and the process configuration information set by the user through the interactive interface can be obtained. For example, the user can set the process configuration information by constructing, in the interactive interface, a flowchart representing the at least part of the machine learning modeling process; that is, the process configuration information can be a flowchart constructed by the user in the interactive interface. In one embodiment, multiple nodes (operation nodes and data nodes) that can be selected by the user can be displayed in the interactive interface. The user can set the process configuration information by selecting nodes, configuring their attribute information, and drawing connections between nodes to configure the execution logic between them. The content that can be displayed in the interactive interface and the specific process by which the user sets the process configuration information through it are not the focus of the present disclosure and are therefore not described in detail.

In step S120, the parameter configuration information set by the user for at least part of the machine learning modeling process is obtained, and the parameter configuration information includes the value space of at least part of the operation parameters involved in the operation step.

The present disclosure does not limit the execution sequence between step S110 and step S120. That is, step S110 may be executed first, and then step S120; or step S120 may be executed first, and then step S110 may be executed; or step S110 and step S120 may also be executed at the same time in no particular order.

The parameter configuration information may be set by the user when setting the process configuration information, or may be set by the user after setting the process configuration information. In one embodiment, the user may configure the value space of at least part of the operation parameters involved in the operation steps in the process configuration information in the process of setting the process configuration information to set the parameter configuration information.

In the present disclosure, the parameter configuration information set by the user can be obtained in a variety of ways. For example, the present disclosure can provide users with multiple ways of setting parameter configuration information, so that the parameter configuration information can be set by the user in any of these ways. Parameter configuration information set in different ways may take different forms; that is, the parameter configuration information mentioned in the present disclosure may refer to information in any of the forms that result from these setting methods.

For example, the parameter configuration information can be written by the user in a language that follows specific rules, such as program code. As an example, users can be provided with a data upload interface, a file uploaded by the user through the data upload interface can be obtained, and the parameter configuration information can be determined by parsing the file. The file can be written by the user in a language that follows specific rules. For example, the file can be program code written by the user for the process configuration information, where the program code defines the value space of at least part of the operating parameters involved in one or more operation steps.

In one embodiment, the file mentioned here and the file mentioned in step S110 above may be the same file; that is, the user can upload, through the data upload interface, a single file that includes both process configuration information and parameter configuration information. In the present disclosure, the process configuration information and the parameter configuration information can be obtained by parsing the file uploaded by the user. For example, while writing the process configuration information used to characterize the at least part of the machine learning modeling process in a language that follows specific rules, the user can also define the value space of at least part of the operating parameters involved in the operation steps.

For another example, the parameter configuration information may also be set by the user through a visual operation. As an example, the user may be shown an interactive interface for setting the value space of the operation parameter involved in the operation step in the process configuration information, and the parameter configuration information set by the user through the interactive interface may be obtained. In one embodiment, the user can define the value space of at least part of the operating parameters involved in the operation steps in the process configuration information while setting the process configuration information through the interactive interface to set the parameter configuration information.

As an example, the operation steps in the process configuration information may include operation nodes and data nodes, where the inputs and outputs of the operation nodes are all data nodes, and the parameter configuration information includes the value space of at least part of the operating parameters related to the operation nodes. Thus, the process configuration information can be a calculation graph composed of data nodes and operation nodes. The calculation graph can be regarded as a directed acyclic graph composed of data nodes and operation nodes, and it clearly defines the execution logic of the at least part of the machine learning modeling process. Users can package their original code into data nodes and operation nodes through the base classes or methods provided by the framework and connect them into a calculation graph. Users can also define operation nodes and data nodes by drag and drop on a graphical interface to construct a calculation graph. When defining an operation node in the calculation graph, the user can also define a value space for the operating parameters involved in that node, so as to change some or all of the operating parameters in the operation node from determined items with fixed values to undetermined items whose values must be selected from the value space. The value spaces of all operating parameters in the calculation graph together form the total value space corresponding to the current calculation graph. Generally speaking, the product of all the value spaces constitutes the total search space.
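The statement that the product of all value spaces constitutes the total search space can be made concrete with a Cartesian product. The following is a minimal sketch; the parameter names and candidate values are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod

# Hypothetical value spaces for three operating parameters.
value_spaces = {
    "make_dataset.k": [5, 10],
    "train.learning_rate": [0.1, 0.15, 0.2],
    "train.max_depth": [7, 13, 19, 23],
}

def total_search_space(spaces):
    """Enumerate every value combination as a dict of parameter -> value."""
    names = list(spaces)
    return [dict(zip(names, values))
            for values in product(*(spaces[name] for name in names))]

combinations = total_search_space(value_spaces)
# The size of the search space is the product of the individual space sizes.
space_size = prod(len(values) for values in value_spaces.values())
```

Because the size grows multiplicatively with each added parameter, full enumeration is only practical for small spaces, which is why a search strategy is normally applied instead.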

In step S130, the at least part of the machine learning modeling process is executed based on the process configuration information and on different value combinations of the at least part of the operating parameters determined based on the parameter configuration information, and multiple execution results are obtained.

The at least part of the machine learning modeling process represented by the process configuration information has parameters whose values are to be determined. Based on the parameter configuration information, different value combinations of these undetermined parameters can be determined. The process configuration information together with each value combination constitutes an instance of the at least part of the machine learning modeling process whose execution logic is fully determined. Therefore, for each value combination, the machine learning modeling process represented by the process configuration information can be executed, yielding multiple execution results corresponding to the different value combinations.

As an example, the process configuration information also defines at least one target node, and the target node is a data node. The target node serves as the evaluation index for a value combination, that is, the optimization target. For example, the target node may refer to the execution result of at least part of the operation logic of the machine learning modeling process represented by the process configuration information. Generally speaking, the execution result corresponding to the target node can be converted into a scalar that can be evaluated and that is to be maximized or minimized. The goal of the present disclosure is to make the result of the target node optimal. Therefore, in performing step S130, one or more value combinations of the at least part of the operating parameters can be determined according to the parameter configuration information; then, according to the process configuration information and the determined value combination, the at least part of the machine learning modeling process can be executed to obtain the execution result of the target node. The quality of the value combination currently adopted can be evaluated according to the execution result of the target node.

The present disclosure can calculate the next one or more value combinations to be executed according to the value combinations determined historically. In this process, certain pruning optimizations can be applied according to existing algorithms to reduce the size of the search space and speed up finding the preferred values of the parameters. The search space refers to the collection of all possible value combinations, and it can be determined according to the value space represented by the parameter configuration information and the historical value combinations.

As an example, in each iteration the present disclosure may add the value combination and execution result determined in the current round to the historical information, and then use a predetermined search strategy to determine one or more value combinations of the at least part of the operating parameters for the current round according to the parameter configuration information and the historical information. The search strategy may include, but is not limited to, at least one of the following: a random search strategy, an enumerated search strategy, a Bayesian optimization search strategy, and a grid search strategy. For the implementation principle of each search strategy, reference can be made to existing theory, which is not repeated in this disclosure. In one embodiment, when determining the value combination of the current round, the execution results obtained by executing the at least part of the machine learning modeling process with historical value combinations may also be taken into account. That is, the next one or more value combinations can be determined according to the historical value combinations and their execution results, and in this process, certain pruning optimizations can be applied according to existing algorithms to reduce the size of the search space and speed up finding the optimal solution.
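The iteration described above can be sketched as a loop that keeps a history of (value combination, execution result) pairs and consults it when drawing the next combination. The sketch below uses a random search strategy and only uses the history to avoid repeating combinations; the function names, the toy objective, and the parameter names are assumptions for illustration, and a Bayesian strategy would use the recorded results as well.

```python
import random

def random_search(value_spaces, evaluate, rounds, seed=0):
    """Run up to `rounds` iterations, recording every (combination, result) pair."""
    rng = random.Random(seed)
    size = 1
    for values in value_spaces.values():
        size *= len(values)
    history = []  # historical information: (combination, result) per round
    for _ in range(min(rounds, size)):  # never ask for more than the space holds
        tried = {tuple(sorted(combo.items())) for combo, _ in history}
        while True:  # redraw until an untried combination is found
            combo = {name: rng.choice(values) for name, values in value_spaces.items()}
            if tuple(sorted(combo.items())) not in tried:
                break
        history.append((combo, evaluate(combo)))
    return history

# Hypothetical value spaces and a toy stand-in for the target node's result.
spaces = {"learning_rate": [0.1, 0.15, 0.2], "max_depth": [7, 13, 19, 23]}

def toy_objective(combo):
    return -(combo["learning_rate"] - 0.15) ** 2 - abs(combo["max_depth"] - 13) / 100

history = random_search(spaces, toy_objective, rounds=5)
exhaustive = random_search(spaces, toy_objective, rounds=100)
```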

The present disclosure can also obtain filter condition configuration information set by the user for filtering the values of the at least part of the operating parameters, and the filter condition configuration information can include one or more filter conditions for those values. After a value combination of the at least part of the operating parameters is determined based on the parameter configuration information, the value combination may be filtered according to the filter condition configuration information to eliminate value combinations that do not meet the preset standard. An eliminated value combination can be regarded as an infeasible solution and discarded; that is, it no longer participates in the execution of the at least part of the machine learning modeling process. For example, the filter condition configuration information may include filter conditions for excluding parameter values that cause high resource consumption, and the filter conditions may limit at least one of the running time and the memory usage of the operation steps: limit the running time to below a first threshold, limit the memory usage to below a second threshold, or both. If, under a certain value combination, the running time of one or more operation steps is too long (for example, greater than the first threshold) or the memory footprint is too large (for example, greater than the second threshold), so that the limit of the filter condition is exceeded, that value combination can be discarded as an infeasible solution and no longer executed; that is, no final scoring result needs to be produced for it.

Therefore, when the total value space of the at least part of the operating parameters is large and the search uncertainty is high, the filter condition configuration information set by the user can make the search process more controllable to a certain extent, so that the preferred values finally obtained better match actual requirements.
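A minimal sketch of such filtering is shown below, assuming each candidate value combination comes with a measured or estimated running time and memory footprint. The `StepCost` record, the threshold parameter names, and the cost estimator are hypothetical; the disclosure does not specify how costs are obtained.

```python
from dataclasses import dataclass

@dataclass
class StepCost:
    """Measured or estimated cost of one operation step (hypothetical record)."""
    seconds: float
    memory_mb: float

def is_feasible(cost, max_seconds=None, max_memory_mb=None):
    """Return False when a configured limit is exceeded; None means no limit."""
    if max_seconds is not None and cost.seconds > max_seconds:
        return False
    if max_memory_mb is not None and cost.memory_mb > max_memory_mb:
        return False
    return True

def filter_combinations(candidates, estimate_cost, **limits):
    """Keep only the value combinations whose cost passes every filter condition."""
    return [c for c in candidates if is_feasible(estimate_cost(c), **limits)]

# Toy cost model: cost grows linearly with a hypothetical num_trees parameter.
candidates = [{"num_trees": n} for n in (7, 13, 19, 23)]

def estimate(combo):
    return StepCost(seconds=combo["num_trees"] * 2.0, memory_mb=combo["num_trees"] * 10.0)

time_only = filter_combinations(candidates, estimate, max_seconds=40)
both_limits = filter_combinations(candidates, estimate, max_seconds=40, max_memory_mb=150)
```

Passing only one of the two limits corresponds to limiting running time or memory alone, and passing both corresponds to the combined condition.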

In step S140, the preferred values of at least part of the operating parameters are output according to the multiple execution results.

The execution result can be used as an evaluation index to reflect the pros and cons of the adopted value combination. According to the performance of the execution result, a combination of values corresponding to the best or better execution result may be selected from a plurality of execution results as the preferred values of the at least part of the operation parameters.
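Selecting the best or several better value combinations can be sketched as a ranking over (value combination, execution result) pairs. The scores below are made-up placeholders rather than measured results, and the function name and `maximize` flag are assumptions for this example.

```python
def preferred_values(results, top_k=1, maximize=True):
    """Return the top_k value combinations ranked by their execution result."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=maximize)
    return [combo for combo, _ in ranked[:top_k]]

# Placeholder execution results for three value combinations.
results = [
    ({"learning_rate": 0.10, "max_depth": 7}, 0.81),
    ({"learning_rate": 0.15, "max_depth": 13}, 0.86),
    ({"learning_rate": 0.20, "max_depth": 19}, 0.84),
]

best = preferred_values(results)
top_two = preferred_values(results, top_k=2)
lowest = preferred_values(results, maximize=False)
```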

The machine learning modeling process represented by the process configuration information can be regarded as composed of multiple operation steps. Across different value combinations, some operation steps may have unchanged operating parameter values, and the execution results of such steps are generally fixed. Therefore, to save computing resources, the present disclosure can also save the execution results of operation steps whose operating parameter values are unchanged. When the at least part of the machine learning modeling process is executed again, the execution of these operation steps can be skipped and their previously saved execution results can be retrieved instead.
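One way to reuse saved results is to key each step's output by the step name, its parameter values, and the key of its upstream input, so that a step is recomputed only when something it depends on has changed. The sketch below is an assumption about how this could be done, not the disclosed implementation; it requires parameter values to be hashable, and the step functions and file name are placeholders.

```python
class StepCache:
    """Memoize step outputs keyed by step name, parameter values, and input key."""

    def __init__(self):
        self._store = {}
        self.executions = 0  # counts real (non-cached) step executions

    def run(self, name, func, params, upstream_key=()):
        key = (name, tuple(sorted(params.items())), upstream_key)
        if key not in self._store:
            self.executions += 1
            self._store[key] = func(**params)
        return self._store[key], key

def load_data(path):
    return [1.0, 2.0, 3.0, 4.0]  # stand-in for an expensive read; path is unused

cache = StepCache()
outputs = []
for factor in (0.1, 0.2, 0.3):
    # load_data has the same parameters in every round, so it runs only once.
    data, data_key = cache.run("load_data", load_data, {"path": "train.csv"})
    scaled, _ = cache.run(
        "scale",
        lambda factor: [x * factor for x in data],
        {"factor": factor},
        upstream_key=data_key,
    )
    outputs.append(scaled)
```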

As an example, the present disclosure may also estimate the resource usage of the computing resources allocated for the operation steps according to the execution result of the at least part of the machine learning modeling process; adjust the computing resources allocated for the operation steps according to the estimated result. For example, if it is found that the utilization rate of the computing resources allocated for the operation step is not high after multiple executions of a certain operation step, the computing resources allocated for the operation step can be lowered.
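The adjustment described above can be sketched as a rule that lowers an allocation when observed usage stays well below it. The utilization threshold, the headroom percentage, and the use of peak memory as the usage measure are all assumptions for this example.

```python
def adjust_allocation(allocated_mb, observed_peaks_mb,
                      low_utilization=0.5, headroom_percent=120):
    """Shrink a step's memory allocation when observed usage stays well below it."""
    if not observed_peaks_mb:
        return allocated_mb  # no executions observed yet, keep the allocation
    peak = max(observed_peaks_mb)
    if peak / allocated_mb < low_utilization:
        # Keep some headroom above the highest observed peak.
        return min(allocated_mb, peak * headroom_percent // 100)
    return allocated_mb

reduced = adjust_allocation(8192, [1000, 1200, 1100])
unchanged = adjust_allocation(2048, [1500, 1800])
```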

So far, the implementation process of the method for realizing automatic machine learning of the present disclosure has been briefly described with reference to FIG. 1. The method can be implemented as a general parameter search framework, which can be applied to scenarios involving a large number of parameters, such as AutoML, to solve multi-parameter search and optimization problems.

Taking the case where the process configuration information mentioned above is a calculation graph as an example, the parameter search framework of the present disclosure can be divided into four functional modules: a calculation graph module, a parameter space module, a search strategy module, and an execution engine.

The calculation graph module is used to obtain user-defined calculation graphs. The calculation graph is a directed acyclic graph composed of data nodes and operation nodes. The inputs and outputs of the operation nodes are data nodes, and the data nodes serve as the inputs and outputs of other operation nodes. The calculation graph can clearly characterize the execution logic of the at least part of the machine learning modeling process. The calculation graph corresponds to the process configuration information mentioned above; that is, the process configuration information can be a calculation graph composed of data nodes and operation nodes. Users can use the base classes or methods provided by the framework to package their original code into data nodes and operation nodes and connect them into a calculation graph. Once connected, a relationship graph of the calculation graph can be generated automatically. For example, when defining a calculation graph in code, the user can wrap the logic of each operation node using the functions (or classes) provided by the framework; the resulting operation nodes are connected by edges according to their upstream and downstream relationships and the correspondence between their inputs and outputs, yielding a calculation graph whose content is stored in the calculation graph module. In addition, calculation graphs can be generated through a graphical interface; that is, users can define the nodes (operation nodes and data nodes) of the calculation graph by drag and drop on the graphical interface. These nodes can contain existing logic provided by the framework, or they can allow users to write custom code inside the node.
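A calculation graph of this kind can be sketched with operation nodes that name their input and output data nodes, with edges inferred from which operation produces which data node. The `OperationNode` class and `run_graph` function below are hypothetical stand-ins for the framework's base classes, which the disclosure does not spell out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

class OperationNode:
    """An operation whose inputs and outputs are named data nodes."""

    def __init__(self, name, func, inputs, outputs):
        self.name, self.func = name, func
        self.inputs, self.outputs = list(inputs), list(outputs)

def run_graph(operations, initial_data):
    """Execute operation nodes in dependency order; return all data node values."""
    producer = {out: op.name for op in operations for out in op.outputs}
    by_name = {op.name: op for op in operations}
    # An operation depends on whichever operations produce its input data nodes.
    deps = {op.name: {producer[i] for i in op.inputs if i in producer}
            for op in operations}
    data = dict(initial_data)
    for name in TopologicalSorter(deps).static_order():
        op = by_name[name]
        results = op.func(*(data[i] for i in op.inputs))
        if len(op.outputs) == 1:
            results = (results,)
        data.update(zip(op.outputs, results))
    return data

# Nodes are listed out of order on purpose; the graph determines execution order.
operations = [
    OperationNode("score", lambda model, test: -sum((x - model) ** 2 for x in test),
                  ["model", "test"], ["target"]),
    OperationNode("fit_mean", lambda train: sum(train) / len(train),
                  ["train"], ["model"]),
    OperationNode("split", lambda rows: (rows[:3], rows[3:]),
                  ["raw"], ["train", "test"]),
]
data = run_graph(operations, {"raw": [1.0, 2.0, 3.0, 4.0]})
```

Here the data node `target` plays the role of a scalar optimization target.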

The parameter space module is used to obtain the parameter space configured by the user for the operation nodes in the calculation graph (corresponding to the value space of the operating parameters mentioned above). When defining an operation node in the calculation graph, the user can add a sub-parameter space to it, changing the value of an operating parameter in the current operation node from a determined item to an undetermined item whose value is selected from this sub-parameter space at run time. The sub-parameter spaces on all operation nodes in the calculation graph together form the parameter space corresponding to the current calculation graph. Generally speaking, the product of all sub-parameter space options constitutes the total parameter space. Taking a calculation graph defined in code as an example, the user can use a special identifier to represent a sub-parameter space (for example, the Choice class provided by the framework) and place the special identifier at the position of the operating parameter involved in the operation node; when the calculation graph is generated, the special identifier can be recognized automatically and added to the parameter space module for management. The sub-parameter space can also be defined in the graphical interface, for example by using a special option to mark a parameter as a sub-parameter space. In addition, for operating parameters that usually have a fixed value space, the present disclosure can provide a default sub-parameter space to lower the barrier to use; that is, for some operating parameters, the value space can be specified by the system without user definition.
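The automatic recognition of special identifiers can be sketched with an `isinstance` check over each node's parameters. The `Choice` class below is a simplified stand-in for the framework class named in the text; its constructor signature, the collection function, and the parameter names are assumptions.

```python
class Choice:
    """Marks an operating parameter as undetermined, with its candidate values."""

    def __init__(self, *options):
        self.options = list(options)

def collect_parameter_space(node_params):
    """Split each node's parameters into fixed values and sub-parameter spaces."""
    fixed, space = {}, {}
    for node, params in node_params.items():
        for key, value in params.items():
            if isinstance(value, Choice):
                space[f"{node}.{key}"] = value.options
            else:
                fixed[f"{node}.{key}"] = value
    return fixed, space

# Hypothetical node definitions: Choice marks the parameters to be searched.
node_params = {
    "make_dataset_1": {"test_ratio": 0.2, "k": Choice(5, 10)},
    "lgb_train_1": {"objective": "binary", "tree_param": Choice(7, 13, 19, 23)},
}
fixed, space = collect_parameter_space(node_params)
```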

Figure 2 shows a computational graph used to characterize the machine learning feature selection process. The circle represents the operation node, and the square represents the data node.

The operation node read_1 in Figure 2 reads data from the hard disk into memory. The data node variable_1 represents the original data in memory. The operation node make_dataset_1 selects k columns from the original data and splits the rows into a training set and a test set; it contains one sub-parameter space (corresponding to the value space mentioned above), namely k columns randomly selected from the n columns of data in variable_1. The data node variable_2 represents the training data set containing only the k columns, and the data node variable_3 represents the test data set containing only the k columns. The operation node lgb_train_1 uses LightGBM to train a model on the training data set and computes the AUC (used as the accuracy measure) of the current model on the test set. lgb_train_1 contains two sub-parameter spaces: the learning rate of LightGBM is sampled uniformly from [0.1, 0.2], and the number of trees trained by LightGBM is selected from [7, 13, 19, 23]. LightGBM is short for Light Gradient Boosting Machine, a framework that implements the GBDT algorithm.

The search strategy module is used to determine multiple value combinations based on the calculation graph and the parameter space. When defining a calculation graph, the user can define a target node, that is, an optimization target. The optimization target is preferably a data node with scalar content. This node holds the result of the calculation logic represented by the current calculation graph, and the goal of the parameter search is to make this result optimal (maximal or minimal; usually any evaluation method can be converted into a scalar). The search strategy module can compute the next group or groups of parameter value combinations to search based on the historical parameter value combinations and the corresponding execution results of the target node. In this process, pruning optimizations based on existing algorithms can be applied to reduce the size of the search space and reach the optimal solution faster.

The search strategy module provides the next group or groups of parameter value combinations to search. Each parameter value combination, together with the original calculation graph, forms a fully determined calculation graph logic. The execution framework can then automatically schedule these calculation graph logics according to the available machine resources, compute the execution result of the target node for each parameter value combination, and add it to the historical information so that the search strategy module can determine the parameter value combinations to search in the next round.

After multiple iterations between the execution engine and the search strategy module, the search ends when the search strategy module no longer produces a next group of parameter value combinations. The parameter value combination corresponding to the best execution result in the historical data is then the result of this search task.
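The iteration between the search strategy and the execution engine can be sketched as a simple loop. This is a minimal illustration, assuming an enumeration strategy that proposes untried combinations in batches; the class EnumerateStrategy, the function search, and the stand-in scoring function are hypothetical and not taken from the disclosure.

```python
from typing import Any, Callable, Dict, List, Tuple

Combo = Dict[str, Any]
History = List[Tuple[Combo, float]]


class EnumerateStrategy:
    """Hypothetical strategy: proposes untried combinations in batches."""
    def __init__(self, candidates: List[Combo], batch_size: int = 2):
        self.candidates = list(candidates)
        self.batch_size = batch_size

    def next_batch(self, history: History) -> List[Combo]:
        tried = [combo for combo, _ in history]
        untried = [c for c in self.candidates if c not in tried]
        return untried[: self.batch_size]   # an empty list ends the search


def search(strategy, execute: Callable[[Combo], float], maximize: bool = True):
    """Alternate between strategy and execution until no batch is produced."""
    history: History = []
    while True:
        batch = strategy.next_batch(history)
        if not batch:
            break
        for combo in batch:
            # execute() stands in for running the graph and reading the target node.
            history.append((combo, execute(combo)))
    pick = max if maximize else min
    return pick(history, key=lambda item: item[1]), history


candidates = [{"num_trees": n} for n in [7, 13, 19, 23]]
best, history = search(EnumerateStrategy(candidates), lambda c: -abs(c["num_trees"] - 19))
```

The history list plays the role of the historical information: the strategy reads it to decide the next round, and the final result is simply the best entry it contains.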

Take Figure 2 as an example. Since the learning rate takes values in a continuous space, the overall parameter space is infinite. Here, the search strategy can be random search: each run randomly selects a value from each sub-parameter space, and after a total of M runs (M being the random search parameter) the best result is selected. The value of M can be configured by the user.
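A random search over the three sub-parameter spaces of the Figure 2 example might look like the sketch below. The function names, the dictionary keys, and the scoring function toy_score are assumptions; in a real run the score would be the AUC read from the target node rather than this placeholder.

```python
import random
from typing import Any, Callable, Dict, Tuple


def sample_combo(rng: random.Random, n_columns: int, k: int) -> Dict[str, Any]:
    """Draw one value from each sub-parameter space of the Figure 2 example."""
    return {
        "columns": sorted(rng.sample(range(n_columns), k)),   # k of the n columns
        "learning_rate": rng.uniform(0.1, 0.2),               # continuous space
        "num_trees": rng.choice([7, 13, 19, 23]),             # discrete space
    }


def random_search(
    execute: Callable[[Dict[str, Any]], float],
    m: int,
    n_columns: int,
    k: int,
    seed: int = 0,
) -> Tuple[Dict[str, Any], float]:
    """Run m random trials and keep the combination with the highest score."""
    rng = random.Random(seed)
    best = None
    for _ in range(m):
        combo = sample_combo(rng, n_columns, k)
        score = execute(combo)
        if best is None or score > best[1]:
            best = (combo, score)
    return best


def toy_score(combo: Dict[str, Any]) -> float:
    """Placeholder score; a real run would return the AUC of the target node."""
    return -abs(combo["learning_rate"] - 0.15)


best_combo, best_score = random_search(toy_score, m=20, n_columns=10, k=3)
```

The argument m corresponds to the user-configurable value M: a larger M explores more of the infinite space at a proportionally higher cost.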

The code logic in the above operation nodes can directly reuse the code logic of the user's original machine learning application; it only needs simple packaging, with some originally fixed parameters changed into sub-parameter spaces (that is, value ranges).

It can be seen that users only need to make small changes and learn very little to implement an AutoML application, which greatly reduces the development cost. Moreover, the present disclosure does not depend on the specific logic in the operation nodes: the user can use third-party packages to implement arbitrary logic, and even non-machine-learning tasks can be run.

In addition, for execution efficiency, the search strategy module supports generating multiple groups of parameter value combinations at once, so that multiple calculation graphs are generated at the same time to improve the utilization of computing resources.

The execution engine is responsible for generating one or more fully determined calculation graphs according to the value combinations determined by the search strategy module, and running them to obtain the execution results (such as the value of the target node). During execution, the execution engine can determine which operation step or steps each processing node in a distributed system should execute, taking into account the efficiency of resource utilization in the distributed environment. It can also use historical execution information to make a simple prediction of the resource utilization of an operation step. For example, if multiple executions of a certain operation step show that a single CPU meets its needs, that step can be run at the same time as other compute-intensive operation steps. Or, if the resource utilization of a calculation graph is consistently low, the number of calculation graphs returned by the search strategy module can be increased to raise the degree of parallelism.
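One simple way to turn historical execution information into a scheduling decision is sketched below. The heuristic (take the largest CPU usage observed for each step, then divide the total CPUs by the heaviest step) and all names are assumptions for illustration; the disclosure does not prescribe a particular estimation method.

```python
import math
from typing import Dict, List


def estimate_cpus(history: List[float]) -> int:
    """Estimate a step's CPU need as the largest usage seen so far, rounded up."""
    return max(1, math.ceil(max(history)))


def plan_parallelism(step_history: Dict[str, List[float]], total_cpus: int) -> int:
    """Number of graphs to run side by side if each needs its heaviest step's CPUs."""
    per_graph = max(estimate_cpus(h) for h in step_history.values())
    return max(1, total_cpus // per_graph)


# Hypothetical CPU usage observed over three earlier executions of each step.
usage = {"read_1": [0.6, 0.8, 0.7], "lgb_train_1": [3.2, 3.9, 3.5]}
parallel_graphs = plan_parallelism(usage, total_cpus=16)
```

With these invented figures, read_1 is estimated to need one CPU and lgb_train_1 four, so a 16-CPU machine could run four calculation graphs in parallel; the resulting number could be passed to the search strategy module as its batch size.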

There may be overlap between the different value combinations returned by the search strategy module. For example, in the calculation graph shown in Figure 2, when only the parameters in the operation node lgb_train_1 are changed, the output of the preceding operation nodes can be reused without recomputation: the execution engine can identify the nodes that can be skipped according to the parameter set and the current execution state, and directly use their outputs. The execution engine can also, according to the total resources configured by the user (memory, CPU, etc.), release in advance resources that will no longer be used by downstream nodes in the calculation graph.
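The reuse of upstream outputs can be illustrated with a small cache keyed by a step's own parameters and those of all steps before it. The class CachingExecutor and the toy step functions are hypothetical; the sketch assumes a linear chain of steps and hashable parameter values for brevity.

```python
from typing import Any, Callable, Dict, List, Tuple


class CachingExecutor:
    """Hypothetical executor for a chain of operation steps with output reuse."""

    def __init__(self, steps: List[Tuple[str, Callable[[Any, Dict[str, Any]], Any]]]):
        self.steps = steps
        self.cache: Dict[Tuple, Any] = {}
        self.executed: List[str] = []      # log of steps that actually ran

    def run(self, params: Dict[str, Dict[str, Any]]) -> Any:
        value, key = None, ()
        for name, func in self.steps:
            own = tuple(sorted(params.get(name, {}).items()))
            key = key + ((name, own),)     # key covers this step and all upstream steps
            if key not in self.cache:
                self.cache[key] = func(value, params.get(name, {}))
                self.executed.append(name)
            value = self.cache[key]
        return value


# Toy stand-ins for the Figure 2 steps; they do not train a real model.
steps = [
    ("make_dataset_1", lambda _, p: list(range(p["k"]))),
    ("lgb_train_1", lambda data, p: sum(data) * p["learning_rate"]),
]
executor = CachingExecutor(steps)
executor.run({"make_dataset_1": {"k": 4}, "lgb_train_1": {"learning_rate": 0.1}})
second = executor.run({"make_dataset_1": {"k": 4}, "lgb_train_1": {"learning_rate": 0.2}})
```

In the second call only the learning rate differs, so make_dataset_1 is skipped and its cached output feeds lgb_train_1 directly.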

In an embodiment of the present disclosure, a machine learning model (such as a neural network model) corresponding to the at least part of the machine learning modeling process can be used to predict image categories, text categories, voice emotions, fraudulent transactions, advertising click-through rates, and the like. The machine learning model aims to make predictions about objects or events in the relevant scenario. For example, it can be used to predict image categories, text in images, text categories, voice emotion categories, fraudulent transactions, advertising click-through rates, commodity prices, and so on, so that the prediction results can be used directly as a basis for decision-making or be combined with other rules to form such a basis. For example, the machine learning model can be used in any of the following scenarios: online content (such as news, advertisements, music, etc.) recommendation; credit card fraud detection; abnormal behavior detection; intelligent marketing; intelligent investment advisors; network traffic analysis.

In one embodiment, the applicable scenarios of the machine learning model in the embodiments of the present disclosure include but are not limited to the following scenarios:

Image processing scenarios, including: optical character recognition (OCR), face recognition, object recognition, and image classification; more specifically, for example, OCR can be applied to bill (such as invoice) recognition, handwriting recognition, etc., face recognition can be applied to security and other fields, object recognition can be applied to traffic sign recognition in autonomous driving scenes, and image classification can be applied to “shop by photo” and “find the same item” functions on e-commerce platforms;

Voice recognition scenarios, including products that can perform human-computer interaction through voice, such as mobile phone voice assistants (such as Siri on Apple phones), smart speakers, etc.;

Natural language processing scenarios, including: text review (such as contracts, legal documents, customer service records, etc.), spam identification (such as spam SMS identification), and text classification (emotions, intentions, topics, etc.);

Automatic control scenarios, including: mine group adjustment operation prediction, wind turbine adjustment operation prediction, and air conditioning system adjustment operation prediction; specifically, for a mine group, a set of adjustment operations with a high mining rate can be predicted; for a wind turbine, a set of adjustment operations with high power generation efficiency can be predicted; and for an air conditioning system, a set of adjustment operations that meets demand while saving energy can be predicted;

Intelligent Q&A scenarios, including: chat robots and intelligent customer service;

Business decision-making scenarios include: scenarios in the financial technology, medical, and municipal fields, including:

The field of financial technology includes: marketing (such as coupon usage prediction, advertising click behavior prediction, user portrait mining, etc.) and customer acquisition, anti-fraud, anti-money laundering, underwriting and credit scoring, and commodity price prediction;

The medical field includes: disease screening and prevention, personalized health management and auxiliary diagnosis;

Municipal areas include: social governance and supervision and law enforcement, resource environment and facility management, industrial development and economic analysis, public services and people's livelihood protection, smart cities (the deployment and management of various urban resources such as public transportation, online car-hailing, shared bicycles, etc.);

Recommended business scenarios, including: news, advertising, music, consulting, video and financial products (such as financial management, insurance, etc.) recommendation;

Search scenarios, including: web search, image search, text search, video search, etc.;

Abnormal behavior detection scenarios, including: abnormal behavior detection of electricity consumption by State Grid customers, network malicious traffic detection, abnormal behavior detection in operation logs, etc.

The method for realizing automatic machine learning of the present disclosure can also be realized as a device for realizing automatic machine learning. Fig. 3 shows a structural block diagram of an apparatus for realizing automatic machine learning according to an exemplary embodiment of the present disclosure. The functional units of the device can be realized by hardware, software, or a combination of hardware and software that realizes the principles of the present disclosure. Those skilled in the art can understand that the functional units described in FIG. 3 can be combined or divided into sub-units to realize the principles of the present disclosure. Therefore, the description herein may support any possible combination, division, or more specific limitation of the functional units described herein.

The following is a brief description of the functional units that the device for realizing automatic machine learning can have and the operations that can be performed by each functional unit. For the details involved, please refer to the relevant description above, which will not be repeated here.

Referring to FIG. 3, the device 300 for implementing automatic machine learning includes a first obtaining module 310, a second obtaining module 320, an execution module 330, and an output module 340.

The first obtaining module 310 is configured to obtain process configuration information set by a user for representing at least a part of the machine learning modeling process, the process configuration information including one or more operation steps. The second obtaining module 320 is configured to obtain parameter configuration information set by the user for the at least part of the machine learning modeling process, the parameter configuration information including the value space of at least part of the operation parameters involved in the operation step. The operation step may include an operation node and a data node, the input and output of the operation node are both data nodes, and the parameter configuration information may include the value space of at least part of the operation parameters related to the operation node. The process configuration information can be a calculation graph composed of data nodes and operation nodes. For the specific implementation process of the first obtaining module 310 obtaining the process configuration information and the second obtaining module 320 obtaining the parameter configuration information, please refer to the above description of step S110 and step S120 in conjunction with FIG. 1.

As an example, the apparatus 300 for implementing automatic machine learning may further include a providing module and a receiving module. The providing module is configured to provide a data upload interface to the user, and the receiving module is configured to receive a file uploaded by the user through the data upload interface, where the file may be written by the user, in a language based on specific rules, for the at least part of the machine learning modeling process. The first obtaining module 310 may parse the file to determine the process configuration information. In an embodiment, the file may also define the value space of at least part of the operation parameters involved in the operation step, and the second obtaining module 320 may parse the file to determine the parameter configuration information.

As an example, the apparatus 300 for implementing automatic machine learning may further include a first display module. The first display module is configured to display to the user an interactive interface for setting the at least part of the machine learning modeling process, wherein the first obtaining module 310 can obtain the process configuration information set by the user through the interactive interface. The apparatus 300 for implementing automatic machine learning may further include a second display module. The second display module is configured to display to the user an interactive interface for setting the value space of the operation parameters involved in the operation step, wherein the second obtaining module 320 can obtain the parameter configuration information set by the user through the interactive interface.

The execution module 330 is configured to execute the at least part of the machine learning modeling process based on the process configuration information and different value combinations of the at least part of the operating parameters determined based on the parameter configuration information to obtain multiple execution results.

As an example, the process configuration information further defines at least one target node, and the target node is a data node. The apparatus 300 for implementing automatic machine learning may further include a determining module. In each iteration: the determining module may determine one or more value combinations of the at least part of the operating parameters according to the parameter configuration information; the execution module 330 may execute the at least part of the machine learning modeling process according to the process configuration information and the value combinations determined by the determining module, to obtain the execution result of the target node. In an embodiment, the apparatus 300 for implementing automatic machine learning may further include an adding module. In each iteration: the adding module adds the value combinations determined in the current round and the corresponding execution results to the historical information; the determining module uses a predetermined search strategy to determine, according to the parameter configuration information and the historical information, one or more value combinations of the at least part of the operating parameters for the current round. The predetermined search strategy may include but is not limited to at least one of the following: random search strategy, enumerated search strategy, Bayesian optimization search strategy, and grid search strategy.

The device 300 for realizing automatic machine learning may further include a saving module configured to save the execution results of operation steps in which the values of the operating parameters are unchanged. When executing the at least part of the machine learning modeling process, the execution module 330 may skip the execution of the operation steps in which the values of the operating parameters have not changed, and call the execution results of these operation steps saved in advance by the saving module.

The device 300 for implementing automatic machine learning may also include an estimation module and an adjustment module. The estimation module is configured to estimate the resource usage of the computing resources allocated to the operation steps according to the execution result of the at least part of the machine learning modeling process; the adjustment module is configured to adjust the computing resources allocated to the operation steps according to the estimation result.

The output module 340 is configured to output the preferred values of the at least part of the operating parameters according to the multiple execution results.

The device 300 for implementing automatic machine learning may further include a third obtaining module and a screening module. The third obtaining module is configured to obtain filter condition configuration information set by the user for filtering the values of at least part of the operating parameters. The screening module is configured to filter, according to the filter condition configuration information, the different value combinations of at least part of the operating parameters determined based on the parameter configuration information. The execution module can execute at least part of the machine learning modeling process based on the process configuration information and the filtered value combinations. The filter condition configuration information can include one or more filter conditions. After the value combinations of the at least part of the operating parameters are determined based on the parameter configuration information, the screening module may filter them according to the filter condition configuration information to eliminate the value combinations that do not meet the preset standard. An eliminated value combination can be regarded as an infeasible solution and discarded; that is, it no longer participates in the execution of at least part of the machine learning modeling process.
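Filtering can be sketched as applying a list of predicates to each candidate value combination. The function filter_combos, the representation of filter conditions as Python callables, and the example condition are assumptions made for illustration; the disclosure does not specify how filter conditions are encoded.

```python
from typing import Any, Callable, Dict, Iterable, List

Combo = Dict[str, Any]
Condition = Callable[[Combo], bool]


def filter_combos(combos: Iterable[Combo], conditions: List[Condition]) -> List[Combo]:
    """Keep only combinations that satisfy every filter condition."""
    return [c for c in combos if all(cond(c) for cond in conditions)]


combos = [
    {"learning_rate": lr, "num_trees": n}
    for lr in (0.1, 0.2)
    for n in (7, 13, 19, 23)
]
# Hypothetical condition: reject a high learning rate paired with many trees.
conditions = [lambda c: not (c["learning_rate"] > 0.15 and c["num_trees"] > 13)]
feasible = filter_combos(combos, conditions)
```

The two rejected combinations are treated as infeasible and never reach the execution module, so no computing resources are spent on them.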

It should be understood that the specific implementation of the apparatus 300 for implementing automatic machine learning according to an exemplary embodiment of the present disclosure can be implemented with reference to the relevant description of the method for implementing automatic machine learning in conjunction with FIG. 1, and will not be repeated here.

The method and apparatus for implementing automatic machine learning according to exemplary embodiments of the present disclosure are described above with reference to FIGS. 1 to 3. It should be understood that the above method can be implemented by a program recorded on a computer-readable medium. For example, according to an exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions can be provided, wherein a computer program for executing the method for realizing automatic machine learning of the present disclosure (for example, as shown in FIG. 1) is recorded on the computer-readable storage medium.

The computer program in the above-mentioned computer-readable storage medium can be run in an environment deployed in computer equipment such as a client, a host, a proxy device, or a server. It should be noted that, in addition to the steps shown in FIG. 1, the computer program can also be used to perform additional steps or to perform more specific processing when performing the above steps. The content of these additional steps and more specific processing has been described with reference to FIG. 1 and will not be repeated here.

It should be noted that the device for realizing automatic machine learning according to the exemplary embodiment of the present disclosure can rely entirely on the running of the computer program to realize the corresponding functions; that is, each module corresponds to a step in the functional architecture of the computer program, so that the entire device is called through a special software package (for example, a lib library) to realize the corresponding functions.

On the other hand, each module shown in FIG. 3 can also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments used to perform the corresponding operations can be stored in a computer-readable medium such as a storage medium, so that a processor can read and run the corresponding program code or code segments to perform the corresponding operations.

For example, the exemplary embodiments of the present disclosure may also be implemented as a computing device including a storage component and a processor. The storage component stores a set of computer-executable instructions which, when executed by the processor, cause the processor to execute the method for realizing automatic machine learning mentioned above.

The storage component is the memory. Causing the processor to execute the method for realizing automatic machine learning mentioned above means causing the processor to execute the following steps: obtaining process configuration information set by the user to represent at least part of the machine learning modeling process, the process configuration information including one or more operation steps; obtaining parameter configuration information set by the user for at least part of the machine learning modeling process, the parameter configuration information including the value space of at least part of the operation parameters involved in the operation steps; executing at least part of the machine learning modeling process based on the process configuration information and different value combinations of at least part of the operating parameters determined based on the parameter configuration information, to obtain multiple execution results; and outputting preferred values of at least part of the operating parameters according to the multiple execution results.

The step of obtaining process configuration information set by the user to characterize at least part of the machine learning modeling process may include: providing a data upload interface to the user; receiving a file uploaded by the user through the data upload interface, where the file is written by the user, in a language based on specific rules, for at least part of the machine learning modeling process; and parsing the file to determine the process configuration information.

The file may also define the value space of at least part of the operation parameters involved in the operation step, and the step of obtaining parameter configuration information set by the user for at least part of the machine learning modeling process may include: parsing the file to determine the parameter configuration information.
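Parsing a single uploaded file into both kinds of configuration could look like the following sketch. The JSON layout, the keys steps, params, target, choice, and uniform, and the function parse_config are all invented for illustration; the disclosure only states that the file is written in a language based on specific rules.

```python
import json
from typing import Any, Dict, Tuple

# Hypothetical file content; the actual rule-based language is not specified.
UPLOADED = """
{
  "steps": [
    {"name": "make_dataset_1", "params": {"k": {"choice": [2, 3]}}},
    {"name": "lgb_train_1",
     "params": {"learning_rate": {"uniform": [0.1, 0.2]},
                "num_trees": {"choice": [7, 13, 19, 23]},
                "seed": 42}}
  ],
  "target": "auc"
}
"""


def parse_config(text: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a file into process configuration and parameter configuration."""
    doc = json.loads(text)
    process = {"steps": [s["name"] for s in doc["steps"]], "target": doc.get("target")}
    space = {}
    for step in doc["steps"]:
        for name, value in step.get("params", {}).items():
            # A dict value marks a value space; anything else is a fixed value.
            if isinstance(value, dict):
                space[f'{step["name"]}.{name}'] = value
    return process, space


process_config, parameter_config = parse_config(UPLOADED)
```

Keeping both configurations in one file means the user uploads once, and the fixed parameter seed is left out of the parameter configuration because it has no value space to search.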

The step of obtaining process configuration information set by the user for characterizing at least part of the machine learning modeling process may include: presenting to the user an interactive interface for setting at least part of the machine learning modeling process; and obtaining the process configuration information set by the user through the interactive interface.

The step of obtaining the parameter configuration information set by the user for at least part of the machine learning modeling process may include: presenting to the user an interactive interface for setting the value space of the operation parameters involved in the operation step; and obtaining the parameter configuration information set by the user through the interactive interface.

The process configuration information may also define at least one target node, the target node being a data node, where the step of executing at least part of the machine learning modeling process based on the process configuration information and different value combinations of at least part of the operating parameters determined based on the parameter configuration information may include iteratively performing the following operations: determining one or more value combinations of at least part of the operating parameters according to the parameter configuration information; and executing at least part of the machine learning modeling process according to the process configuration information and the determined value combinations, to obtain the execution result of the target node.

Each iteration may further include: adding the value combinations determined in the current round and the corresponding execution results to the historical information. The step of determining one or more value combinations of at least part of the operating parameters according to the parameter configuration information then includes: using a predetermined search strategy to determine, according to the parameter configuration information and the historical information, one or more value combinations of at least part of the operating parameters for the current round.

The search strategy may include at least one of the following: random search strategy; enumerated search strategy; Bayesian optimization search strategy; grid search strategy.

The steps executed by the processor may further include: saving the execution results of operation steps in which the values of the operating parameters are unchanged; and, when executing at least part of the machine learning modeling process, skipping the execution of the operation steps in which the values of the operating parameters have not changed and calling the execution results of these operation steps saved in advance.

The steps executed by the processor may further include: estimating the resource usage of the computing resources allocated for the operation steps according to the execution result of at least part of the machine learning modeling process; and adjusting the computing resources allocated for the operation steps according to the estimation result.

The steps executed by the processor may further include: obtaining filter condition configuration information set by the user for filtering the values of at least part of the operating parameters. The step of executing at least part of the machine learning modeling process may then include: filtering, according to the filter condition configuration information, the different value combinations of at least part of the operating parameters determined based on the parameter configuration information; and executing at least part of the machine learning modeling process based on the process configuration information and the filtered value combinations.

Specifically, the computing device can be deployed in a server or a client, and can also be deployed on a node device in a distributed network environment. In addition, the computing device may be a personal computer (PC), a tablet device, a personal digital assistant, a smart phone, a web application, or another device capable of executing the above set of instructions.

Here, the computing device does not have to be a single computing device, and may also be any device or a collection of circuits that can execute the above-mentioned instructions (or instruction sets) individually or jointly. The computing device may also be a part of an integrated control system or a system manager, or may be configured as a portable electronic device interconnected with a local or remote (e.g., via wireless transmission) interface.

In the computing device, the processor may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor. By way of example and not limitation, the processor may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor, and the like.

Some of the operations described in the method for realizing automatic machine learning according to exemplary embodiments of the present disclosure can be realized by software, some can be realized by hardware, and these operations can also be realized by a combination of software and hardware.

The processor can run instructions or codes stored in one of the storage components, where the storage component can also store data. Instructions and data can also be sent and received via a network via a network interface device, wherein the network interface device can use any known transmission protocol.

The storage component can be integrated with the processor, for example, RAM or flash memory is arranged in an integrated circuit microprocessor or the like. In addition, the storage component may include an independent device, such as an external disk drive, a storage array, or any other storage device that can be used by a database system. The storage component and the processor may be operatively coupled, or may communicate with each other, for example, through an I/O port, a network connection, or the like, so that the processor can read files stored in the storage component.

In addition, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, a mouse, a touch input device, etc.). All components of the computing device can be connected to each other via at least one of a bus and a network.

The operations involved in the method for implementing automatic machine learning according to exemplary embodiments of the present disclosure may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may be equally integrated into a single logic device or operated according to imprecise boundaries.

For example, as described above, an apparatus for implementing automatic machine learning according to an exemplary embodiment of the present disclosure may include a storage component and a processor, wherein the storage component stores a set of computer-executable instructions which, when executed by the processor, cause the processor to execute the above-mentioned method for realizing automatic machine learning.

Exemplary embodiments of the present disclosure may also be implemented as a system including at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to execute the method for realizing automatic machine learning described above in this disclosure.

The exemplary embodiments of the present disclosure are described above. It should be understood that the foregoing description is only exemplary and not exhaustive, and the present disclosure is not limited to the disclosed exemplary embodiments. Without departing from the scope and spirit of the present disclosure, many modifications and alterations are obvious to those of ordinary skill in the art. Therefore, the protection scope of the present disclosure should be subject to the scope of the claims.

Claims

A method for implementing automatic machine learning, comprising:

Acquiring process configuration information set by a user for representing at least part of a machine learning modeling process, where the process configuration information includes one or more operation steps;

Acquiring parameter configuration information set by the user for the at least part of the machine learning modeling process, where the parameter configuration information includes a value space of at least part of the operation parameters involved in the operation steps;

Executing the at least part of the machine learning modeling process based on the process configuration information and different value combinations of the at least part of the operation parameters determined based on the parameter configuration information, to obtain multiple execution results; and

Outputting preferred values of the at least part of the operation parameters according to the multiple execution results.
The method according to claim 1, wherein the step of obtaining process configuration information set by the user for characterizing at least part of the machine learning modeling process comprises:

Providing the user with a data upload interface;

Receiving a file uploaded by the user through the data upload interface, where the file is written by the user, in a language based on specific rules, for the at least part of the machine learning modeling process; and

Parsing the file to determine the process configuration information.
The method according to claim 2, wherein the file further defines the value space of at least part of the operation parameters involved in the operation steps, and the step of acquiring parameter configuration information set by the user for the at least part of the machine learning modeling process comprises:

Parsing the file to determine the parameter configuration information.
The method according to claim 1, wherein the step of obtaining process configuration information set by the user for characterizing at least part of the machine learning modeling process comprises:

Displaying to the user an interactive interface for setting the at least part of the machine learning modeling process; and

Acquiring the process configuration information set by the user through the interactive interface.
The method according to claim 1, wherein the step of obtaining the parameter configuration information set by the user for the at least part of the machine learning modeling process comprises:

Displaying to the user an interactive interface for setting the value space of the operation parameters involved in the operation steps; and

Acquiring the parameter configuration information set by the user through the interactive interface.
The method according to claim 1, wherein the process configuration information further defines at least one target node, the target node being a data node, and wherein the step of executing the at least part of the machine learning modeling process, based on the process configuration information and the different value combinations of the at least part of the operation parameters determined based on the parameter configuration information, comprises iteratively performing the following operations:

Determining one or more value combinations of the at least part of the operation parameters according to the parameter configuration information; and

Executing the at least part of the machine learning modeling process according to the process configuration information and the determined value combinations, to obtain an execution result of the target node.
The method according to claim 6, wherein each iteration further comprises: adding the value combinations determined in the current round and the corresponding execution results to historical information,

The step of determining one or more value combinations of the at least part of the operation parameters according to the parameter configuration information includes: using a predetermined search strategy to determine, according to the parameter configuration information and the historical information, one or more value combinations of the at least part of the operation parameters for the current round.
The method according to claim 7, wherein the search strategy includes at least one of the following:

Random search strategy;

Enumeration search strategy;

Bayesian optimization search strategy;

Grid search strategy.
The method according to claim 1, further comprising:

Saving the execution result of an operation step whose operation parameter values are unchanged; and

During execution of the at least part of the machine learning modeling process, skipping execution of the operation step whose operation parameter values are unchanged, and calling the previously saved execution result of that operation step.
The method according to claim 1, further comprising:

Estimating, according to the execution results of the at least part of the machine learning modeling process, the resource usage of the computing resources allocated for the operation steps; and

Adjusting the computing resources allocated for the operation steps according to the estimation result.
The method according to claim 1, further comprising: acquiring filter condition configuration information set by the user for filtering the values of the at least part of the operation parameters, wherein:

The step of executing the at least part of the machine learning modeling process includes: filtering, according to the filter condition configuration information, the different value combinations of the at least part of the operation parameters determined based on the parameter configuration information; and executing the at least part of the machine learning modeling process based on the process configuration information and the filtered value combinations.
The method of claim 1, wherein:

The operation step includes an operation node and a data node, the input and output of the operation node are both data nodes, and the parameter configuration information includes the value space of at least part of the operation parameters related to the operation node.
The method according to claim 12, wherein the process configuration information is a calculation graph composed of data nodes and operation nodes.
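For illustration only, the iterative search recited in the method claims above (a search strategy that consults historical information, combined with user-set filter conditions on value combinations) might be sketched in Python as follows. The names `candidates`, `random_strategy`, `iterative_search`, `keep`, and `evaluate` are hypothetical; `evaluate` is a stand-in for executing the modeling process and reading a metric from the target node, and a random search that avoids repeating past combinations is only one assumed choice of strategy.

```python
import itertools
import random
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Params = Dict[str, Any]
History = List[Tuple[Params, float]]


def candidates(value_space: Dict[str, List[Any]]) -> Iterator[Params]:
    """Enumerate every value combination in the value space."""
    names = sorted(value_space)
    for values in itertools.product(*(value_space[n] for n in names)):
        yield dict(zip(names, values))


def random_strategy(
    value_space: Dict[str, List[Any]],
    history: History,
    keep: Callable[[Params], bool],
    rng: random.Random,
) -> Optional[Params]:
    """Pick a random combination that passes the filter and is not in the history."""
    tried = [params for params, _ in history]
    untried = [c for c in candidates(value_space) if keep(c) and c not in tried]
    return rng.choice(untried) if untried else None


def iterative_search(
    value_space: Dict[str, List[Any]],
    evaluate: Callable[[Params], float],
    keep: Callable[[Params], bool],
    rounds: int,
    seed: int = 0,
) -> Tuple[Params, History]:
    rng = random.Random(seed)
    history: History = []
    for _ in range(rounds):
        params = random_strategy(value_space, history, keep, rng)
        if params is None:  # every admissible combination has been tried
            break
        # Add this round's combination and its result to the history.
        history.append((params, evaluate(params)))
    best_params, _ = max(history, key=lambda item: item[1])
    return best_params, history


value_space = {"depth": [2, 4, 8], "learning_rate": [0.01, 0.1]}
# Hypothetical filter condition: exclude the deepest model at the highest rate.
keep = lambda p: not (p["depth"] == 8 and p["learning_rate"] == 0.1)
# Stand-in score (higher is better); a real run would train and validate a model.
evaluate = lambda p: -abs(p["depth"] - 4) - abs(p["learning_rate"] - 0.1)
best, history = iterative_search(value_space, evaluate, keep, rounds=10)
```

A Bayesian optimization strategy would replace `random_strategy` with a function of the same shape that fits a surrogate model to `history` before proposing the next combination.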
A device for realizing automatic machine learning, comprising:

A first acquiring module configured to acquire process configuration information set by a user for representing at least part of a machine learning modeling process, the process configuration information including one or more operation steps;

A second acquiring module configured to acquire parameter configuration information set by the user for the at least part of the machine learning modeling process, where the parameter configuration information includes a value space of at least part of the operation parameters involved in the operation steps;

An execution module configured to execute the at least part of the machine learning modeling process based on the process configuration information and different value combinations of the at least part of the operation parameters determined based on the parameter configuration information, to obtain multiple execution results; and

An output module configured to output preferred values of the at least part of the operation parameters according to the multiple execution results.
The device according to claim 14, further comprising:

A providing module configured to provide the user with a data upload interface; and

A receiving module configured to receive a file uploaded by the user through the data upload interface, where the file is written by the user, in a language based on specific rules, for the at least part of the machine learning modeling process, wherein the first acquiring module parses the file to determine the process configuration information.
The device according to claim 15, wherein the file further defines the value space of at least part of the operation parameters involved in the operation steps, and the second acquiring module parses the file to determine the parameter configuration information.
The device according to claim 14, further comprising:

A first display module configured to display, to the user, an interactive interface for setting the at least part of the machine learning modeling process, wherein the first acquiring module acquires the process configuration information set by the user through the interactive interface.
The device according to claim 14, further comprising:

A second display module configured to display, to the user, an interactive interface for setting the value space of the operation parameters involved in the operation steps, wherein the second acquiring module acquires the parameter configuration information set by the user through the interactive interface.
The device according to claim 14, wherein the process configuration information further defines at least one target node, the target node is a data node, and the device further comprises a determining module, wherein:

In each iteration: the determining module determines one or more value combinations of the at least part of the operation parameters according to the parameter configuration information; and the execution module executes the at least part of the machine learning modeling process according to the process configuration information and the value combinations determined by the determining module, to obtain an execution result of the target node.
The device according to claim 19, further comprising an adding module, wherein:

In each iteration: the adding module adds the value combinations determined in the current round and the corresponding execution results to historical information; and the determining module uses a predetermined search strategy to determine, according to the parameter configuration information and the historical information, one or more value combinations of the at least part of the operation parameters for the current round.
The device according to claim 20, wherein the search strategy comprises at least one of the following:

Random search strategy;

Enumeration search strategy;

Bayesian optimization search strategy;

Grid search strategy.
The device according to claim 14, further comprising:

A saving module configured to save the execution result of an operation step whose operation parameter values are unchanged,

Wherein, during execution of the at least part of the machine learning modeling process, the execution module skips execution of the operation step whose operation parameter values are unchanged, and calls the previously saved execution result of that operation step.
The device according to claim 14, further comprising:

The estimation module is configured to estimate the resource usage of the computing resources allocated for the operation step according to the execution result of the at least part of the machine learning modeling process;

The adjustment module is configured to adjust the computing resources allocated for the operation step according to the estimation result.
The device according to claim 14, further comprising:

A third acquiring module configured to acquire filter condition configuration information set by the user for filtering the values of the at least part of the operation parameters; and

A screening module configured to screen, according to the filter condition configuration information, the different value combinations of the at least part of the operation parameters determined based on the parameter configuration information,

Wherein the execution module executes the at least part of the machine learning modeling process based on the process configuration information and the screened value combinations.
The device according to claim 14, wherein:

The operation step includes an operation node and a data node, the input and output of the operation node are both data nodes, and the parameter configuration information includes the value space of at least part of the operation parameters related to the operation node.
The device according to claim 25, wherein the process configuration information is a calculation graph composed of data nodes and operation nodes.
A system comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to execute the method according to any one of claims 1 to 13.
A computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to execute the method according to any one of claims 1 to 13.
A computing device comprising a processor and a memory, the memory storing a set of computer-executable instructions which, when executed by the processor, cause the processor to perform the following steps:

Acquiring process configuration information set by a user for representing at least part of a machine learning modeling process, where the process configuration information includes one or more operation steps;

Acquiring parameter configuration information set by the user for the at least part of the machine learning modeling process, where the parameter configuration information includes a value space of at least part of the operation parameters involved in the operation steps;

Executing the at least part of the machine learning modeling process based on the process configuration information and different value combinations of the at least part of the operation parameters determined based on the parameter configuration information, to obtain multiple execution results; and

Outputting preferred values of the at least part of the operation parameters according to the multiple execution results.
The computing device according to claim 29, wherein the step of obtaining process configuration information set by the user for characterizing at least part of the machine learning modeling process comprises:

Providing the user with a data upload interface;

Receiving a file uploaded by the user through the data upload interface, where the file is written by the user, in a language based on specific rules, for the at least part of the machine learning modeling process; and

Parsing the file to determine the process configuration information.
The computing device according to claim 30, wherein the file further defines the value space of at least part of the operation parameters involved in the operation steps, and the step of acquiring parameter configuration information set by the user for the at least part of the machine learning modeling process comprises:

Parsing the file to determine the parameter configuration information.
The computing device according to claim 29, wherein the step of obtaining process configuration information set by the user for characterizing at least part of the machine learning modeling process comprises:

Displaying to the user an interactive interface for setting the at least part of the machine learning modeling process; and

Acquiring the process configuration information set by the user through the interactive interface.
The computing device according to claim 29, wherein the step of obtaining the parameter configuration information set by the user for the at least part of the machine learning modeling process comprises:

Displaying to the user an interactive interface for setting the value space of the operation parameters involved in the operation steps; and

Acquiring the parameter configuration information set by the user through the interactive interface.
The computing device according to claim 29, wherein the process configuration information further defines at least one target node, the target node being a data node, and wherein the step of executing the at least part of the machine learning modeling process, based on the process configuration information and the different value combinations of the at least part of the operation parameters determined based on the parameter configuration information, comprises iteratively performing the following operations:

Determining one or more value combinations of the at least part of the operation parameters according to the parameter configuration information; and

Executing the at least part of the machine learning modeling process according to the process configuration information and the determined value combinations, to obtain an execution result of the target node.
The computing device according to claim 34, wherein each iteration further comprises the following step: adding the value combinations determined in the current round and the corresponding execution results to historical information,

The step of determining one or more value combinations of the at least part of the operation parameters according to the parameter configuration information includes: using a predetermined search strategy to determine, according to the parameter configuration information and the historical information, one or more value combinations of the at least part of the operation parameters for the current round.
The computing device according to claim 35, wherein the search strategy comprises at least one of the following:

Random search strategy;

Enumeration search strategy;

Bayesian optimization search strategy;

Grid search strategy.
The computing device according to claim 29, further comprising the following steps:

Saving the execution result of an operation step whose operation parameter values are unchanged; and

During execution of the at least part of the machine learning modeling process, skipping execution of the operation step whose operation parameter values are unchanged, and calling the previously saved execution result of that operation step.
The computing device according to claim 29, further comprising the following steps:

Estimating, according to the execution results of the at least part of the machine learning modeling process, the resource usage of the computing resources allocated for the operation steps; and

Adjusting the computing resources allocated for the operation steps according to the estimation result.
The computing device according to claim 29, further comprising the following step: acquiring filter condition configuration information set by the user for filtering the values of the at least part of the operation parameters,

The step of executing the at least part of the machine learning modeling process includes: filtering, according to the filter condition configuration information, the different value combinations of the at least part of the operation parameters determined based on the parameter configuration information; and executing the at least part of the machine learning modeling process based on the process configuration information and the filtered value combinations.
The computing device of claim 29, wherein:

The operation step includes an operation node and a data node, the input and output of the operation node are both data nodes, and the parameter configuration information includes the value space of at least part of the operation parameters related to the operation node.
The computing device according to claim 40, wherein the process configuration information is a calculation graph composed of data nodes and operation nodes.
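By way of a hedged illustration, a calculation graph of data nodes and operation nodes, together with reuse of saved results for operation steps whose parameter values are unchanged, could be sketched in Python as below. `OperationNode`, `Graph`, `clip`, and `weighted_sum` are hypothetical names. The sketch additionally assumes that a saved result is reused only when the step's upstream inputs are also unchanged, which is a design choice of this example rather than something stated in the claims.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable, List, Tuple


@dataclass(frozen=True)
class OperationNode:
    name: str
    func: Callable[..., Any]
    inputs: Tuple[str, ...]       # names of input data nodes
    output: str                   # name of the output data node
    param_names: Tuple[str, ...]  # operation parameters this node uses


class Graph:
    def __init__(self, operations: List[OperationNode], source_data: Dict[str, Any]):
        self.operations = operations    # assumed to be in topological order
        self.source_data = source_data  # source data nodes: name -> value
        self.cache: Dict[Hashable, Any] = {}
        self.executions = 0             # counts operation nodes actually run

    def run(self, params: Dict[str, Any], target: str) -> Any:
        values = dict(self.source_data)
        keys: Dict[str, Hashable] = {n: ("source", n) for n in self.source_data}
        for op in self.operations:
            own = tuple((p, params[p]) for p in op.param_names)
            # The key covers this node's parameter values and its inputs' keys,
            # so a saved result is reused only if nothing upstream changed.
            key = (op.name, own, tuple(keys[i] for i in op.inputs))
            if key not in self.cache:
                args = [values[i] for i in op.inputs]
                self.cache[key] = op.func(*args, **dict(own))
                self.executions += 1
            values[op.output] = self.cache[key]
            keys[op.output] = key
        return values[target]


def clip(data: List[float], limit: float) -> List[float]:
    return [min(x, limit) for x in data]


def weighted_sum(data: List[float], weight: float) -> float:
    return weight * sum(data)


graph = Graph(
    operations=[
        OperationNode("clip", clip, ("raw",), "clipped", ("limit",)),
        OperationNode("total", weighted_sum, ("clipped",), "result", ("weight",)),
    ],
    source_data={"raw": [1, 5, 9]},
)
first = graph.run({"limit": 3, "weight": 1}, target="result")
# Only "weight" changes, so the saved output of "clip" is reused.
second = graph.run({"limit": 3, "weight": 2}, target="result")
```

After the two runs, `graph.executions` is 3 rather than 4, because the `clip` node ran only once.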