CN109146081B - Method and device for creating model project in machine learning platform - Google Patents

Info

Publication number: CN109146081B
Authority: CN (China)
Prior art keywords: model, node, machine learning, item, project
Legal status: Active
Application number: CN201710502054.1A
Other languages: Chinese (zh)
Other versions: CN109146081A (en)
Inventor: 汪翠
Current Assignee: Alibaba Group Holding Ltd
Original Assignee: Alibaba Group Holding Ltd
Application filed by: Alibaba Group Holding Ltd
Priority to: CN201710502054.1A

Abstract

The application discloses a method and a device for creating a model project in a machine learning platform. The method comprises: creating a machine learning running environment; adding the components required by a model project to the machine learning running environment; establishing input and output links between the components required by the model project, in the order set by the model project, to create the model project; configuring the parameters of the components required by the created model project; and running the created model project. The method can greatly shorten the time a user needs to create a model project in a machine learning platform.

Description

Method and device for creating model project in machine learning platform
Technical Field
The application relates to the field of machine learning, in particular to a method and a device for creating a model project in a machine learning platform.
Background
With the rapid development of artificial intelligence in recent years, machine learning, as a main way of implementing artificial intelligence, has also developed rapidly, and the major internet companies have each developed their own machine learning platforms. With a machine learning platform, suitable model projects can be obtained from existing data; these model projects can be used to solve scientific research problems and can also be applied in many areas of daily life to guide activities such as living and production. The machine learning platform improves and verifies a model project by creating and running a model project training experiment, repeating this process as many times as needed until a satisfactory model project is obtained, thereby finally creating the model project.
At present, when a model project is created on a mainstream machine learning platform, an experiment must first be created; components are then dragged from a component column as the experiment requires, and the experiment is run after the components have been connected and their parameters configured. Each time an experiment is created this way, the components it requires must be dragged one by one from the corresponding component columns, which is time-consuming and inconvenient and makes the machine learning platform less usable.
Disclosure of Invention
The application provides a method for quickly creating a model project in a machine learning platform, to solve the problem that creating a model project on existing machine learning platforms is time-consuming and inconvenient. The application also provides a device for quickly creating a model project in a machine learning platform.
The application provides a method for quickly creating a model project in a machine learning platform, which comprises the following steps:
creating a machine learning operating environment;
adding components required by a model project to the machine learning running environment;
establishing input and output links between the required components, in the order set by the model project, to create the model project;
configuring the parameters of the components required by the model project;
and running the created model project.
Optionally, the creating a machine learning runtime environment includes:
creating a model project panel at the machine learning platform;
and allocating an identification code for the created model project panel.
Optionally, adding the components required by a model project to the machine learning running environment includes:
adding components to the model project panel;
inserting the attribute information of the added components into a model project panel information table;
wherein the attribute information comprises the identification code of the model project panel where the added component is located, the name of the added component, the identification code of the added component, and the position of the added component in the model project panel.
Optionally, the components include an algorithm component and a data source component.
Optionally, establishing input and output links between the required components in the order set by the model project includes:
allocating a model project name and a model project code to the model project to be run;
selecting the components required by the model project;
and establishing data or command input and output links between the components required by the model project, in the order set by the model project.
Optionally, establishing input and output links between the required components in the order set by the model project includes:
selecting the components required by the model project;
judging whether the selected component has a parent node or a parallel node, and automatically allocating a model project name and an identification code when it has neither;
establishing data or command input and output links between the components required by the model project, in the order set by the model project;
wherein each component in the model project forms a node of the model project, the parent node is the node immediately preceding the selected node, and parallel nodes are nodes that share a common parent node with the selected node.
Optionally, selecting the components required by a model project includes selecting components to be added to the machine learning model project running environment, and/or components that already exist in that running environment or have been used by other model projects in that running environment.
Optionally, the method further includes: inserting the model item information and the model item panel identification code into a model item information table;
the model item information comprises a model item name, a model item code, and components and attribute information thereof required by the model item.
Optionally, the method further includes: inserting the information of each node that makes up the model project into a node information table;
the node information comprises the node position and information about the parent nodes and child nodes related to the node.
Optionally, the parameter configuration of the components required by the model item includes:
configuring parameter information of algorithm components required by the model project;
and inserting the parameter information, the node identification code and the model item code into a node parameter table.
Optionally, running the created model project includes:
selecting a model project to be run and starting it;
reading all node identification codes in the model project according to the model project code;
sorting the nodes according to the parent-child relationships of the node identification codes in the node information table;
and executing the nodes in the sorted order.
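The sorting step above amounts to a topological sort over the parent-child relationships. The following is a minimal sketch, assuming the node table has already been read into a dictionary that maps each node identification code to the set of its parent node codes. The node names and the `execution_order` helper are hypothetical and are not taken from the application.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical contents of the node table for one model project:
# node identification code -> identification codes of its parent nodes.
node_parents = {
    "read_table": set(),
    "split": {"read_table"},
    "random_forest": {"split"},
    "predict": {"random_forest", "split"},
}


def execution_order(parents_by_node):
    """Return the node codes so that every parent precedes its children."""
    return list(TopologicalSorter(parents_by_node).static_order())


order = execution_order(node_parents)
for node_id in order:
    # A real platform would execute the node here; the sketch only prints it.
    print("executing", node_id)
```

A dedicated sorter is used here rather than a plain list because a node may have more than one parent, in which case insertion order alone does not guarantee that all inputs are ready.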
Optionally, the method further includes: after the node executes, inserting an execution record into the node execution log table; and inserting model item execution state data into the model item execution state table after the model item operation is finished.
Optionally, executing the nodes in the sorted order includes:
for each node, acquiring the parameter information of the node from the node parameter table according to the model project code and the node identification code;
judging whether the node is a data source node or an algorithm node;
when the node is a data source node, acquiring data from the corresponding database of the machine learning platform;
and when the node is an algorithm node, acquiring the parameters of the node from the node parameter table and executing the algorithm with those parameters.
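The per-node dispatch described above can be sketched as follows. This is an illustration under stated assumptions only: the dictionaries standing in for the platform database and the node parameter table, the node kinds, and the `sum_column` algorithm are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for the platform database and the node parameter table.
DATABASE: Dict[str, List[dict]] = {"orders": [{"amount": 3}, {"amount": 5}]}
NODE_PARAMS: Dict[Tuple[str, str], dict] = {
    ("proj-1", "n1"): {"table_name": "orders"},  # data source node
    ("proj-1", "n2"): {"column": "amount"},      # algorithm node
}
NODE_KINDS: Dict[str, str] = {"n1": "data_source", "n2": "algorithm"}


def sum_column(rows: List[dict], params: dict) -> int:
    """Toy algorithm: sum one column of the upstream rows."""
    return sum(row[params["column"]] for row in rows)


ALGORITHMS: Dict[str, Callable[[List[dict], dict], int]] = {"n2": sum_column}


def execute_node(project_code: str, node_id: str,
                 upstream: Optional[List[dict]] = None):
    # Look up the node's parameters by model project code and node code.
    params = NODE_PARAMS[(project_code, node_id)]
    if NODE_KINDS[node_id] == "data_source":
        # Data source node: read the configured table from the database.
        return DATABASE[params["table_name"]]
    # Algorithm node: run the algorithm with its configured parameters.
    return ALGORITHMS[node_id](upstream or [], params)


rows = execute_node("proj-1", "n1")
total = execute_node("proj-1", "n2", rows)
```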
Optionally, adding the components required by a model project to the machine learning running environment specifically includes:
adding the components required by a single model project to the machine learning running environment; or
adding the components required by different model projects to the machine learning running environment.
In addition, the present application also provides an apparatus for quickly creating a model project in a machine learning platform, including:
a creating unit for creating a machine learning running environment;
a component adding unit for adding the components required by a model project to the machine learning running environment;
a connecting unit for establishing input and output links between the required components, in the order set by the model project, to create the model project;
a parameter configuration unit for configuring the parameters of the components required by the model project;
and a running unit for running the created model project.
Optionally, the creating unit includes:
a panel creating unit for creating a model project panel on the machine learning platform;
and the identification code allocation unit is used for allocating identification codes to the created model project panels.
Optionally, the component adding unit includes:
a component unit for adding components to the model project panel;
an attribute insertion unit for inserting the added attribute information of the component into a model item panel information table;
the attribute information comprises the identification code of the model project panel where the added component is located, the name of the added component and the position information of the added component in the model project panel.
Optionally, the connection unit includes:
a model project allocation unit for allocating a model project name and a model project code to the model project to be run;
a selection unit for selecting the components required by the model project;
and a linking unit for establishing data or command input and output links between the components required by the model project, in the order set by the model project.
In addition, the present application also provides an electronic device, which includes:
a processor and a storage medium;
the storage medium stores a program implementing the method for quickly creating a model project in a machine learning platform; after the device is powered on and the processor runs that program, the following operations are executed:
creating a machine learning running environment;
adding the components required by a model project to the machine learning running environment;
establishing input and output links between the required components, in the order set by the model project, to create the model project;
configuring the parameters of the components required by the model project;
and running the created model project.
In addition, the application also provides a method for quickly creating the model project in the machine learning platform, which comprises the following steps:
creating a machine learning operating environment;
adding components required by a model project to the machine learning running environment;
creating, in the machine learning running environment, a model project based on the added components;
and running the created model project.
Compared with the prior art, one aspect of the application has the following advantages. In the method and device provided by the application, a machine learning running environment is created first and the components required by model projects are added to it. When a model project needs to be created, the components it requires are connected directly in the order set by the model project, and their parameters are then configured so that the model project can be run. Because the components are added to the machine learning running environment in advance, each new model project can be created simply by connecting the required components in the set order, without dragging them in one by one. This greatly shortens the time needed to create a model project and improves the usability of the machine learning platform.
Drawings
FIG. 1 is a flowchart of an embodiment of the method for quickly creating a model project in a machine learning platform of the present application;
FIG. 2 is a flowchart of the method for creating a machine learning running environment provided by this embodiment;
FIG. 3 is a flowchart of the method for adding the components required by a model project to the machine learning running environment provided by this embodiment;
FIG. 4 is a flowchart of the method for establishing input and output links between the components required by a model project, in the order set by the model project, provided by this embodiment;
FIG. 5 is a flowchart of the method for configuring the parameters of the components required by a model project provided by this embodiment;
FIG. 6 is a flowchart of the method for running the created model project provided by this embodiment;
FIG. 7 is a flowchart of the method for executing nodes in node order provided by this embodiment;
FIG. 8 is a schematic diagram of a panel created on a machine learning platform;
FIG. 9 shows the node connection flow and the contents of the related parameter information tables;
FIG. 10 is a flowchart of model project operation in an embodiment of the present application;
FIG. 11 is a flowchart of algorithm node execution in an embodiment of the present application;
FIG. 12 is a schematic diagram of an embodiment of the apparatus for quickly creating a model project in a machine learning platform of the present application;
FIG. 13 is a diagram of the component connection relationships of a model project established on a platform using the method for quickly creating a model project of an embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.
Mining valuable data from massive data is of great significance to scientific and social progress. With the rapid development of the mobile internet, human society has inevitably entered the era of big data. Compared with traditional data, big data is massive, heterogeneous, repetitive and even conflicting, so traditional data mining methods can no longer meet the new requirements in terms of data form, computational model or computational efficiency. For this reason, cloud computing and cloud services are rapidly becoming research hotspots, and machine learning cloud platforms have emerged along with them.
The machine learning platform provides functions such as data mining, modeling and prediction through a distributed cloud computing platform. It offers users one-stop algorithm services such as algorithm development, sharing, model training, deployment and monitoring. In some machine learning platforms, users can carry out the whole model-building experiment through a visual interface; a command mode is also supported, so that users can carry out experiment modeling from the command line. The machine learning platform greatly lowers the threshold for users or developers to develop and deploy distributed machine learning systems and related applications.
The application provides a method for quickly creating a model project in a machine learning platform. With this method, a user or developer can quickly create the computational or experimental model project they want and quickly verify its feasibility, so that developers can use the machine learning platform efficiently to develop model projects. The method comprises: creating a machine learning model project running environment; adding components to that running environment; establishing input and output links between the components required by the model project, in the order set by the model project, to create the model project; configuring the parameters of the components required by the model project; and running the created model project. The method is described in detail below.
Please refer to FIG. 1, which is a flowchart of an embodiment of the method of the present application. The method for quickly creating a model project in a machine learning platform comprises the following steps:
Step 101: creating a machine learning model project running environment.
A model project is a data workflow or data application that a machine learning platform user builds on the platform from several components or a series of components. Model projects are also referred to as computational model projects, and include experimental model projects created temporarily or at various times to verify a particular function. The embodiments of the present application are described using experimental model projects as examples. A model project makes it possible to discover the regularities in existing data and, finally, to create a model project that reflects those regularities.
The machine learning running environment is a virtual environment in the machine learning platform in which model projects can be created and run.
The method for creating the machine learning model project running environment comprises the following steps:
Step 101-1: creating a model project panel on the machine learning platform.
The model project panel is a machine learning model project running environment in the form of a panel, on which model projects can be created and run. Unlike traditional model project interfaces in machine learning platforms, the model project panel can also hold components in a directly visible form. A model project panel is created by clicking the corresponding button on the machine learning platform. The user can name and rename the created model project panel, among other operations.
The model project panel has a scroll bar so that the whole panel can be viewed and operated conveniently.
Several model project panels may be created at the same time.
Multiple model projects can be created and run simultaneously on the same model project panel. If page performance degrades because several model projects are running at once, the back-end servers can be scaled out according to the number of users and how frequently the components are used.
The user may also select whether to save the model item and the generated model item that have been successfully run on the model item panel, and when selecting to save the model item and the generated model item, may set saving parameters such as a saving name, a saving location, and a saving life cycle of the model item or the generated model item.
Step 101-2: assigning a model item panel identification code to the created model item panel.
The model item panel identification code refers to a code used to identify the model item panel. The model project panel identification code corresponds to the model project panel one by one and is a unique code.
When the creation of the model item panel on the machine learning platform is successful, a daemon will assign the model item panel identification code to the created model item panel.
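Steps 101-1 and 101-2 can be illustrated with a short sketch. The text only requires that each panel receive a unique identification code; using a random UUID for that code, and the `ModelProjectPanel` and `create_panel` names, are assumptions made here for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ModelProjectPanel:
    """Hypothetical in-memory stand-in for a model project panel."""
    name: str
    # Unique panel identification code, assigned once at creation (step 101-2).
    panel_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def create_panel(name: str) -> ModelProjectPanel:
    """Create a panel (step 101-1); its identification code is assigned automatically."""
    return ModelProjectPanel(name=name)


# Two panels may share a display name but never an identification code.
panel_a = create_panel("experiment run panel")
panel_b = create_panel("experiment run panel")
```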
FIG. 8 shows a schematic diagram of a panel created on a machine learning platform. The created experimental model project panel appears as the "experiment run panel" in the figure. In the panel area below it, an experimental model project can be created by dragging in components and running a simulation. An experimental model project (abbreviated as "experiment" in FIGS. 8 to 11) is a data workflow or data application set up by a machine learning platform user; the user first creates an experiment instance and then builds a data flow on the experiment panel.
Please continue to refer to FIG. 1. Step 102: adding the components required by a model project to the machine learning running environment.
Specifically, in this step, components are added to the model project running environment according to the requirements of the model project.
A component is an operation unit that can be called and executed on the machine learning platform and that represents an algorithm or a data source, for example data import and export, data processing, data analysis, model training, or prediction.
A component representing an algorithm is an algorithm component. It has a parameter configuration column for configuring the algorithm's runtime parameters, and input and output buttons for connecting algorithms and for outputting and viewing the algorithm's results.
A component representing a data source is a data source component, which is mainly used to set a table name. Once the table name is set, the system reads the data of that table from the background database and provides it to the subsequent algorithm components.
The components include the algorithm component and the data source component.
The method for adding the components required by a model project to the machine learning running environment comprises the following steps:
Step 102-1: adding components to the model project panel. The components may belong to the same model project or to different model projects. This embodiment is described using the example of adding the components of a single model project. The application is not limited to this, however: components of different model projects may be added simultaneously or one after another, and in the subsequent step input and output links are established, in order, between the components that belong to the same model project.
Components are added to the model project panel by dragging the corresponding components from the component column into the model project panel one by one.
Step 102-2: inserting the attribute information of the added components into a model project panel information table.
The attribute information comprises the identification code of the model project panel where the added component is located, the name of the added component, and the position of the added component in the model project panel.
The model item panel information table is a database table for recording the attribute information of all the components in the model item panel.
When the component is successfully added into the model item panel, the background program inserts the attribute information such as the model item panel identification code of the added component, the component name and the position information in the model item panel into a model item panel information table to record the added component.
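As an illustration of step 102-2, the sketch below records added components in an in-memory SQLite table. The table name `panel_info`, its column names, and the use of x/y pixel coordinates for the position are assumptions; the text only specifies that the panel identification code, the component name and the position are stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for the model project panel information table.
conn.execute(
    """
    CREATE TABLE panel_info (
        panel_id       TEXT    NOT NULL,
        component_name TEXT    NOT NULL,
        pos_x          INTEGER NOT NULL,
        pos_y          INTEGER NOT NULL
    )
    """
)


def record_component(conn, panel_id, component_name, pos_x, pos_y):
    """Insert one added component's attribute information (step 102-2)."""
    conn.execute(
        "INSERT INTO panel_info (panel_id, component_name, pos_x, pos_y) "
        "VALUES (?, ?, ?, ?)",
        (panel_id, component_name, pos_x, pos_y),
    )
    conn.commit()


record_component(conn, "panel-1", "data source", 40, 20)
record_component(conn, "panel-1", "random forest", 40, 120)

# All components currently on one panel, in the order they were added.
rows = conn.execute(
    "SELECT component_name FROM panel_info WHERE panel_id = ? ORDER BY rowid",
    ("panel-1",),
).fetchall()
```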
Step 103: establishing input and output links between the components required by the model project, in the order set by the model project, to create the model project.
The components required by the model project are all of the algorithm components and data source components needed by the model project to be created.
The order set by the model project is the order of the model project's execution steps.
An input and output link is a connecting line drawn between the algorithm components and data source components required by the model project, following the order of the model project's steps; that is, an input-output channel for data or commands between components.
The method for establishing input and output links between the components required by the model project, in the order set by the model project, to create the model project comprises the following steps:
Step 103-1: allocating a model project name and a model project code to the model project to be run.
A model project to be run is a created model project in which all of the required components have been connected in the order set by the model project, or one in which only some of the required components have been connected.
The model project code is a code that identifies the model project. Model project codes correspond one-to-one with model projects and are unique.
For each model project to be run, the background program automatically and randomly allocates a model project name and a corresponding model project code when the model project is created.
Step 103-2: selecting the components required by the model project.
Selecting the components required by the model project means selecting two of the required components to be connected according to the order set by the model project.
In that order, the two connected components are, respectively, the parent component and the child component.
The components may be selected from the components newly added to the machine learning model project running environment, from the components already present in that running environment, or from the components already used by other model projects. Within a single model project, components may also be selected from any two or all of these sources at once to meet the project's requirements.
Step 103-3: judging whether the selected component has a parent node or a parallel node, and automatically allocating a model project name and a model project code when it has neither.
Each component in a model project forms a model project node, referred to as a node for short. The parent node is the node immediately preceding the selected node; parallel nodes are nodes that share a common parent node with the selected node.
Judging whether the selected component has a parent node or a parallel node means judging whether the parent component selected in step 103-2 has a parent node or a parallel node.
If the parent component has neither a parent node nor a parallel node, connecting the child component to the parent component creates a new model project, and the background program automatically and randomly allocates a model project name and a model project code to the new model project.
If the parent component has a parent node or a parallel node, connecting the child component to the parent component adds the child component to the model project to be run that contains the parent component, and no new model project is created.
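The decision in step 103-3 can be sketched as below. The sketch applies the rule literally as stated: a new model project is allocated only when the parent component has neither a parent node nor a parallel node. The `Node` class, the `connect` function, the UUID-based project code and the generated project name format are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical model project node wrapping one component."""
    name: str
    project_code: Optional[str] = None
    parents: List["Node"] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def connect(parent: Node, child: Node, projects: Dict[str, str]) -> Optional[str]:
    """Link parent -> child and return the model project code the child joins."""
    # Parallel nodes: other children of any of the parent component's parents.
    parallel = [s for p in parent.parents for s in p.children if s is not parent]
    if not parent.parents and not parallel:
        # Neither a parent node nor a parallel node: create a new model project
        # with an automatically allocated code and name.
        code = uuid.uuid4().hex
        projects[code] = f"project-{code[:8]}"
        parent.project_code = code
    # In both cases the child joins the project that contains the parent.
    parent.children.append(child)
    child.parents.append(parent)
    child.project_code = parent.project_code
    return child.project_code


projects: Dict[str, str] = {}
source, split, forest = Node("data source"), Node("split"), Node("random forest")
first = connect(source, split, projects)    # source has no parent: new project
second = connect(split, forest, projects)   # split has a parent: same project
```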
Step 103-4: inserting the model project information and the model project panel identification code into a model project information table.
The model project information includes the model project name, the model project code, the components required by the model project, and the attribute information of those components.
The model project information table is a data table that records the model project information of all model projects to be run and the identification codes of the model project panels they are on.
When a new model project is created in step 103-3, it also becomes a model project to be run, so its model project information and the identification code of the model project panel it is on are inserted into the model project information table.
Step 103-5: inserting the information of each node that makes up the model project into a node information table.
The node information includes the position of the node and information about the parent nodes and child nodes related to the node.
A child node of a node is a model project node at the next level below that node.
The parent nodes related to a node are all parent nodes of that model project node.
The child nodes related to a node are all child nodes of that model project node.
The node information table is a data table that records the node information of all nodes in all model projects to be run.
When a new model project is created in step 103-3, the node information of each node that makes up the model project is inserted into the node information table at the same time as step 103-4 is performed.
Step 103-6: establishing data or command input and output links between the components required by the model project, in the order set by the model project.
This step repeats steps 103-2, 103-3, 103-4 and 103-5, connecting all of the algorithm components and data source components required by the model project one after another in the order set by the model project, thereby creating the model project.
Step 104: carrying out parameter configuration on components required by the model project;
the method for configuring the parameters of the components required by the model project comprises the following steps:
step 104-1: parameter information of algorithm components required for configuring the model project.
And opening the parameter configuration column of the algorithm component required by the model item to be subjected to parameter configuration, and selecting the parameters required by the operation of the algorithm for configuration. For example, if a certain algorithm node is a random forest algorithm node, the parameters of the number of trees, the tree depth, the training column, the target column and the like are configured.
Step 104-2: inserting the parameter information, the node identification code, the node name and the model item identification code into a node parameter table.
The parameter information of an algorithm component required by the model project includes the parameter names and the configured values of all of its parameters.
The node parameter table is a data table that records, for each node contained in a model item, the node identification code, the node name, the identification code of the model item to which the node belongs, and the parameter information of the component corresponding to the node.
After the parameter configuration of the algorithm components required by the model project is completed in step 104-1, the background program automatically executes step 104-2.
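As a rough illustration of step 104-2, the sketch below stores one row per parameter in a node parameter table and reads the parameters back. The table layout, the function names and the random forest parameter names (`tree_num`, `max_depth`, `target_column`) are assumptions made for this example and are not taken from the platform.

```python
import sqlite3


def open_param_table():
    """Create an in-memory node parameter table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_param ("
        " item_code TEXT, node_id TEXT, node_name TEXT,"
        " param_name TEXT, param_value TEXT,"
        " PRIMARY KEY (item_code, node_id, param_name))"
    )
    return conn


def save_node_params(conn, item_code, node_id, node_name, params):
    """Replace the stored parameters of one node with the configured values."""
    with conn:
        conn.execute(
            "DELETE FROM node_param WHERE item_code = ? AND node_id = ?",
            (item_code, node_id),
        )
        conn.executemany(
            "INSERT INTO node_param VALUES (?, ?, ?, ?, ?)",
            [(item_code, node_id, node_name, name, str(value))
             for name, value in params.items()],
        )


def load_node_params(conn, item_code, node_id):
    """Return {parameter name: configured value} for one node."""
    rows = conn.execute(
        "SELECT param_name, param_value FROM node_param"
        " WHERE item_code = ? AND node_id = ?",
        (item_code, node_id),
    ).fetchall()
    return dict(rows)


conn = open_param_table()
save_node_params(conn, "item-1", "n2", "random forest-1",
                 {"tree_num": 100, "max_depth": 10, "target_column": "is_haze"})
```

Values are stored as text in this sketch, so a caller that needs numbers has to convert them after loading.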
Fig. 9 shows the node connection flow and the contents of the related parameter information tables; reference may also be made to fig. 9 for the contents of steps 103 and 104 above.
Through the above steps, the creation of the model item is completed. The model item can then be trial-run to check whether the built model item achieves the expected purpose. Please continue to refer to fig. 1.
Step 105: running the created model item.
The method for running the created model item comprises the following steps:
Step 105-1: selecting a model item to be run, and starting to run the model item.
After the user selects the model item to be run by box selection, the user starts it by pressing the run button preset on the machine learning platform.
Because the machine learning platform supports running a plurality of model items simultaneously, the user can box-select several model items at once so that they run at the same time.
If the user has not box-selected any model item, all model items in the machine learning model item running environment are selected by default. Of course, the platform may adopt a different default; for example, it may run no model item when nothing has been box-selected.
Step 105-2: reading all the node identification codes in the model item according to the model item code.
When the model item selected in step 105-1 starts to run, the background program first reads the model item codes of all the model items being run, and then, for each model item code, reads the node identification codes of all nodes in the corresponding model item.
Step 105-3: sorting the nodes according to the parent-child relationships of the node identification codes in the node information table.
Following step 105-2, the background program looks up, in the node information table, all parent nodes and child nodes of each node identification code that was read, thereby determining the parent-child relationships among all nodes contained in the model item being run, and sorts those nodes according to these relationships.
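The text does not name a sorting algorithm. One common way to order nodes so that every parent precedes its children is a topological sort; the sketch below uses Kahn's algorithm and assumes the parent relationships have already been read into a dictionary in which every node appears as a key. The function name and the node identifiers are illustrative only.

```python
from collections import deque


def sort_nodes(parents):
    """Order node ids so that each node comes after all of its parents.

    `parents` maps every node id to the list of its parent node ids.
    Raises ValueError if the links contain a cycle.
    """
    remaining = {node: len(ps) for node, ps in parents.items()}
    children = {node: [] for node in parents}
    for node, ps in parents.items():
        for p in ps:
            children[p].append(node)

    # Nodes without parents (typically data sources) can run first.
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:  # all parents have been scheduled
                ready.append(child)

    if len(order) != len(parents):
        raise ValueError("node links contain a cycle")
    return order


# A small graph shaped like a train/predict experiment (ids are made up).
links = {
    "source": [],
    "preprocess": ["source"],
    "split": ["preprocess"],
    "random_forest": ["split"],
    "logistic_regression": ["split"],
    "predict": ["random_forest", "split"],
}
execution_order = sort_nodes(links)
```

Any order produced this way guarantees that a node is executed only after the nodes feeding it, which is what sequential execution in the next step relies on.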
Step 105-4: executing the nodes in sequence according to the node order.
This step is performed after all nodes contained in the model item being run have been sorted in step 105-3.
The method for sequentially executing the nodes according to the node order comprises the following steps:
Step 105-4-1: for each node, acquiring the parameter information of the node from the node parameter table according to the model item code and the node identification code.
When executing a node, the background program reads the parameter information of that node from the node parameter table according to the node identification code and the corresponding model item code.
Step 105-4-2: judging whether the node is a data source node or an algorithm node.
A data source node is a node whose corresponding component is a data source component. An algorithm node is a node whose corresponding component is an algorithm component. Whether the node is a data source node or an algorithm node is judged from the node parameter information read from the node parameter table.
Step 105-4-3: when the node is judged to be a data source node, acquiring data from a database of the machine learning platform.
If the judgment result in step 105-4-2 is a data source node, data are acquired from the corresponding database of the machine learning platform. The machine learning platform is mainly divided into three layers: the first layer is a Web UI, the second layer is an IDST algorithm layer, and the last layer is a database platform layer. In the machine learning platform, this step specifically means reading table data from the corresponding database.
Step 105-4-4: when the node is judged to be an algorithm node, acquiring the parameters of the node from the node parameter table and executing the algorithm according to the algorithm parameters.
If the judgment result in step 105-4-2 is an algorithm node, the algorithm corresponding to the algorithm node is executed according to the parameter information of the algorithm node read in step 105-4-1.
Specifically, the parameter information is spliced into a machine learning command line according to an agreed format. The background program sends the machine learning command to the database system through the scheduling system, and the database system parses the parameters in the machine learning command to obtain the storage space where the algorithm is located, the name of the algorithm and the parameter configuration of the algorithm. The database system calls the target algorithm in the designated algorithm storage space and executes it according to the algorithm parameters. After the algorithm runs successfully, the name of the result table of the algorithm or the name of the generated algorithm model is returned; the result table is stored in the database project space (which can be understood simply as a database) corresponding to the user's current system, and the PMML file of the algorithm model is stored in the ODPS volume designated by the background system.
For the algorithm node execution flow, refer to fig. 11.
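The agreed command format is not given in the text. Purely to illustrate the splice-then-parse round trip of step 105-4-4, the sketch below invents a format of the form `ml -space <storage space> -name <algorithm> -D<key>=<value> ...`; the `ml` prefix, the flag names and both function names are assumptions, not the platform's actual syntax.

```python
import shlex


def build_command(space, algorithm, params):
    """Splice parameter information into a single command line string."""
    parts = ["ml", "-space", space, "-name", algorithm]
    parts += [f"-D{key}={value}" for key, value in sorted(params.items())]
    # Quote each token so that values containing spaces survive parsing.
    return " ".join(shlex.quote(part) for part in parts)


def parse_command(command):
    """Recover (storage space, algorithm name, parameters) from a command."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "ml":
        raise ValueError("not a machine learning command")
    space = algorithm = None
    params = {}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in ("-space", "-name"):
            if i + 1 >= len(tokens):
                raise ValueError(f"missing value after {token}")
            if token == "-space":
                space = tokens[i + 1]
            else:
                algorithm = tokens[i + 1]
            i += 2
        elif token.startswith("-D") and "=" in token:
            key, value = token[2:].split("=", 1)
            params[key] = value
            i += 1
        else:
            raise ValueError(f"unexpected token: {token}")
    return space, algorithm, params


command = build_command("algo_space", "random_forest",
                        {"treeNum": 100, "targetCol": "is haze"})
```

After parsing, all parameter values are strings; the receiving side would convert them to the types each algorithm expects.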
Step 105-5: after the execution of the model item is finished, inserting model item execution state data into the model item execution state table.
The node execution log table is a database table used for recording the running status of the nodes.
The model item execution state table is a database table used for recording the running results of the model items.
The running result of a model item comprises the running time and the running outcome of the model item.
In step 105-4, each time a node is executed, the background program inserts a record into the node execution log table to record the execution status of the node.
If the node is executed successfully, an execution log of the node is recorded.
If the execution of the node fails, an execution failure log of the node is recorded; meanwhile, the model item to which the failed node belongs stops running, and the running result data of the model item are inserted into the model item execution state table.
If all nodes of the selected model item are executed successfully, the selected model item has run successfully, and its running result data are likewise inserted into the model item execution state table.
The running flow of the experimental model project can also be seen in fig. 10.
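The logging behaviour described above can be sketched as a small run loop. Plain Python lists stand in for the node execution log table and the model item execution state table, and the `execute` callback, the record fields and the status strings are assumptions introduced for this example.

```python
import time


def run_model_item(item_code, ordered_nodes, execute, node_log, state_table):
    """Execute nodes in order, logging each one; stop at the first failure.

    `execute(node_id)` is a caller-supplied function that raises on failure.
    One record per executed node goes to `node_log`; one record per run goes
    to `state_table`.
    """
    started = time.time()
    result = "success"
    for node_id in ordered_nodes:
        try:
            execute(node_id)
        except Exception as exc:  # any failure stops the whole model item
            node_log.append({"item_code": item_code, "node_id": node_id,
                             "status": "failed", "message": str(exc)})
            result = "failed"
            break
        node_log.append({"item_code": item_code, "node_id": node_id,
                         "status": "success", "message": ""})
    state_table.append({"item_code": item_code, "result": result,
                        "run_seconds": time.time() - started})
    return result


def demo_execute(node_id):
    # Simulated node execution: one node is made to fail on purpose.
    if node_id == "random_forest":
        raise RuntimeError("target column not found")


node_log, state_table = [], []
outcome = run_model_item("item-1", ["source", "random_forest", "predict"],
                         demo_execute, node_log, state_table)
```

In this run the third node is never executed, so the log holds two records and the state table holds a single failed run.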
In this embodiment, the model items, the model item information, the component information and the component parameters generated in the panel are, once configured, stored in corresponding MySQL tables; they may also be stored in tables of other types, such as RDS or SQL Server, or in other storage systems such as lds or a cache. When the model project generated in this embodiment runs, the algorithm package and the data are stored in the project space of the database. The running environment may also be an ECS, and the storage system may also be an OSS, and so on; these alternatives are not described in detail here.
The method for quickly creating a model project according to the present application is described below by a specific example. Fig. 13 shows the component connection diagram of a model project established on the platform using the method of the above embodiment. To predict haze weather, after the weather prediction data are acquired, the data are preprocessed by a data preprocessing component, models are trained on the processed data, the weather conditions are predicted using the trained models and the test data, and the prediction results are evaluated. The specific steps are as follows: the input data source (the weather prediction data component) undergoes type conversion (data preprocessing) and is then split into training data and test data; the training data are input into two machine learning components, namely random forest (random forest-1 in fig. 13) and logistic regression binary classification (logistic regression binary classification-1 in fig. 13), to obtain a trained model from each; the trained models and the split test data are then taken as inputs of the prediction components, which predict whether the weather corresponding to each row of the test data is haze weather; and the evaluation components evaluate the prediction effect.
For the haze weather prediction experiment, the experiment flow shown in fig. 13 can be established by the method of this embodiment: the components are dragged onto the panel as the experiment requires and connected according to the experiment flow, which establishes a temporary experiment. If only the model output by the random forest component or the logistic regression binary classification component is needed for other work, such as deploying the model online to provide an online weather prediction service, a temporary experiment can be established in the test environment (namely the experiment panel), and the experiment can be run and debugged so that the resulting model is saved. Which model to save can be chosen according to the prediction evaluation results of the random forest and logistic regression models. The experiment need not be established in the test environment for the experiment parameters to be debugged and the experiment results compared. Moreover, although this experiment performs haze weather prediction, for any other binary classification prediction only the data source component needs to be modified, and the remaining components of the experiment are shared. The user can also modify the experiment according to which of the random forest and logistic regression binary classification results is better (keeping only the random forest prediction and evaluation branch, or only the logistic regression binary classification prediction and evaluation branch) and then save the experiment.
In the embodiment of the application, a running environment for quick creation is created in the machine learning platform and the experiment components are placed in it, so that an experiment can be trial-run in this environment and its result obtained promptly. The platform system allocates an identified area (namely a panel with an identification code) to the running environment, the experiment components are assembled on the panel, and the background records the experiment information, parameter information, node information and their mutual associations in tables of different dimensions. When an experiment runs, these tables together with the related data and commands allow the experiment to be run quickly and flexibly. Components can be reused on the panel, and different experiments can use the same component; the experiment identification code determines which experiment is calling a component and its related nodes, so that a plurality of experiments can run concurrently without interfering with one another.
The embodiment of the application can save the user a large amount of time and effort. To run a model project, the user only needs to create a model project trial-run panel and drag the algorithm components onto it; the algorithm components can then be used again and again to form various temporary model projects, which can be trial-run, debugged and checked for effect within the panel. After a model project runs successfully, the user chooses whether to save the model project or to save the model generated by the temporary model project. If the user only needs the generated model for subsequent functions such as data prediction, the user only needs to save the model, and does not need to create a separate model project and run it in order to generate the model. Whenever the user wants a model, a trial run in the panel is all that is needed.
In addition, after the panels are created in the embodiment of the application, all algorithm components that may be used are dragged onto the panels. When there are many algorithm components, the page may become dense and cluttered, so the user can create a plurality of panels; the identification code of each panel is unique, and the components contained in each panel can be dragged freely from the component bar. The panel also has a scroll bar, so that it can present a large area. When a plurality of model projects run simultaneously in the same panel, page performance may degrade, so the server back end can be scaled out according to the number of users and how frequently the components are used.
In addition, a temporary model project in the application can be saved after a successful trial run; a successfully run model project is obtained simply by setting its name, storage location and life cycle. When only the model is needed, the user only needs to trial-run the corresponding temporary model project once, without complicated processes such as building a model project.
In addition, the algorithm components in the panel are not changed by the success or failure of a model project run, and the user can reuse them many times without dragging them again from the algorithm component bar of the machine learning page.
In addition, temporary model items can be established in the panel to perform smoke tests; temporary model items can also be established solely to produce model data, preprocess model data, obtain models, and so on.
In addition, when the algorithm components in the component bar change, for example when old algorithm components are updated or new ones are added, the user can choose whether the algorithm components in the panel change accordingly, which provides great flexibility.
In the embodiment, the application also provides a device for rapidly creating the model item in the machine learning platform. The apparatus corresponds to an embodiment of the method described above.
The embodiments described above provide a method for quickly creating model items in a machine learning platform. The method can further be implemented by the following steps: creating a machine learning running environment; adding components required by a model project to the machine learning running environment; creating, in the machine learning running environment, a model item based on the added components; and running the created model item. These steps are not described again here.
Please refer to fig. 12, which is a schematic diagram of an embodiment of an apparatus for rapidly creating model items in a machine learning platform according to an embodiment of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative. An apparatus for rapidly creating a model item in a machine learning platform according to the embodiment includes:
a creating unit 201 configured to create a machine learning execution environment;
a component adding unit 202, configured to add components required by a model item to the machine learning execution environment;
a connection unit 203, configured to establish input and output links for components required by the model item according to an order set by the model item to create the model item;
a parameter configuration unit 204, configured to perform parameter configuration on the components required by the model project;
a running unit 205, configured to run the created model item.
Optionally, the creating unit includes:
a panel creating unit for creating a model project panel on the machine learning platform;
and the identification code allocation unit is used for allocating identification codes to the created model project panels.
Optionally, the component adding unit includes:
a component unit for adding components to the model project panel;
an attribute insertion unit for inserting the added attribute information of the component into a model item panel information table;
the attribute information comprises the identification code of the model project panel where the added component is located, the name of the added component and the position information of the added component in the model project panel.
Optionally, the connection unit includes:
the model item distribution unit is used for distributing a model item name and a model item code for the model item to be operated;
a selection unit for selecting components required by the model item;
and the linking unit is used for establishing data or command input and output links among the components required by the model items according to the sequence set by the model items.
Optionally, the connection unit includes:
a selection unit for selecting components required by the model item;
the judging unit is used for judging whether the component to be selected contains a father node or a parallel node or not and automatically distributing the model project name and the identification code when the component to be selected does not contain the father node or the parallel node;
the linking unit is used for establishing data or command input and output links among the components required by the model items according to the sequence set by the model items; wherein
each component in the model project forms a node of the model project, the father node is the previous node of the selected node, and the parallel nodes are nodes of which the selected nodes have the common father node.
Optionally, the apparatus further includes: a model item information recording unit for inserting the model item information and the model item panel identification code into the model item information table;
the model item information comprises a model item name, a model item code, and the components required by the model item and their attribute information.
Optionally, the apparatus further includes a node information recording unit, configured to insert the information of each node constituting the model item into a node information table;
the node information comprises node positions and information on the parent nodes and child nodes related to each node.
Optionally, the parameter configuration unit includes:
the algorithm parameter configuration unit is used for configuring parameter information of the algorithm components required by the model project;
and the node parameter recording unit is used for inserting the parameter information, the node identification code, the node name and the model item code into a node parameter table.
Optionally, the running unit includes:
a starting unit for selecting a model item to be run and starting to run the model item;
an identification code reading unit for reading all the node identification codes in the model item from a node parameter table according to the model item code;
a node sorting unit for sorting the nodes according to the parent-child relationships of the node identification codes in the node table;
and an execution unit for executing the nodes in sequence according to the node order.
Optionally, the apparatus further includes: a log unit for inserting an execution record into the node execution log table after a node is executed; and a state table generating unit for inserting model item execution state data into the model item execution state table after the model item finishes running.
Optionally, the execution unit includes:
the node parameter calling unit is used for acquiring the parameter information of each node from the node parameter table according to the model item code and the node identification code;
a node attribute judging unit that judges whether the node belongs to the data source node or the algorithm node;
the data execution unit is used for acquiring data from a corresponding database of the machine learning platform when the judgment result is the data source node;
and the command execution unit is used for executing the algorithm according to the algorithm parameters when the judgment result is the algorithm node.
In addition, the present application also provides an electronic device, which includes: a processor and a storage medium;
the storage medium stores a program implementing the method for quickly creating a model project in a machine learning platform; after the device is powered on and the processor runs the program, the following operations are executed:
creating a machine learning operating environment; adding a component to the machine learning runtime environment; establishing input and output links for components required by the model project according to the sequence set by the model project to create the model project; carrying out parameter configuration on components required by the model project; and running the created model item.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto, and variations and modifications may be made by those skilled in the art without departing from the spirit and scope of the present invention.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (20)

1. A method for creating a model project in a machine learning platform, comprising:
creating a machine learning operating environment;
creating a model project panel at the machine learning platform;
adding components required by a model project to the machine learning running environment;
adding components to the model project panel;
establishing input and output links for the required components according to the sequence set by the model items to create the model items;
carrying out parameter configuration on components required by the model project;
and running the created model item.
2. The method for creating a model item in a machine learning platform of claim 1, wherein said creating a machine learning runtime environment comprises:
and allocating an identification code for the created model project panel.
3. The method for creating a model item in a machine learning platform of claim 2, wherein: the components required for adding the model item to the machine learning running environment comprise:
inserting the added attribute information of the component into a model project panel information table;
the attribute information comprises the added model project panel identification code where the component is located, the added component name, the added component identification code and the position information of the added component in the model project panel.
4. The method for creating a model item in a machine learning platform of claim 3, wherein:
the components include an algorithm component and a data source component.
5. The method for creating a model item in a machine learning platform of claim 3, wherein: the step of establishing the input and output links of the required components according to the sequence set by the model items comprises the following steps:
distributing model project names and model project codes for model projects to be operated;
selecting components required by the model project;
and establishing data or command input and output links among the components required by the model items according to the sequence set by the model items.
6. The method for creating a model item in a machine learning platform of claim 3, wherein: the step of establishing the input and output links of the required components according to the sequence set by the model items comprises the following steps:
selecting components required by the model project;
judging whether the component to be selected contains a father node or a parallel node or not, and automatically distributing a model project name and an identification code when the component to be selected does not contain the father node or the parallel node;
establishing data or command input and output links among the components required by the model items according to the sequence set by the model items; wherein
each component in the model project forms a node of the model project, the father node is the previous node of the selected node, and the parallel nodes are nodes of which the selected nodes have the common father node.
7. The method for creating a model item in a machine learning platform of any one of claims 5 or 6, wherein selecting a component required for a model item comprises selecting a component to be added to the machine learning model item runtime environment, and/or
Components that already exist in the model item runtime environment or have been used by other model items in the runtime environment.
8. The method for creating model items in a machine learning platform according to any of claims 5 or 6, further comprising: inserting the model item information and the model item panel identification code into a model item information table;
the model item information comprises a model item name, a model item code, and components and attribute information thereof required by the model item.
9. The method for creating a model item in a machine learning platform of claim 8, further comprising inserting each node information that constitutes the model item into a node information table;
the node information comprises node positions, node-related father nodes and node-related son node information.
10. The method of claim 9, wherein configuring parameters of components required for the model project comprises:
configuring parameter information of algorithm components required by the model project;
and inserting the parameter information, the node identification code and the model item code into a node parameter table.
11. The method for creating a model item in a machine learning platform of claim 10, wherein: said running said created model item includes
Selecting a model project to be operated, and starting the operation model project;
reading all node identification codes in the model item according to the model item code;
sorting the nodes according to the parent-child relationship of the node identification codes in the node table;
and executing the nodes in sequence according to the node sequence.
12. The method for creating a model item in a machine learning platform of claim 11, further comprising: after the node executes, inserting an execution record into the node execution log table; and inserting model item execution state data into the model item execution state table after the model item operation is finished.
13. The method for creating a model item in a machine learning platform of claim 11, wherein: the executing the nodes in sequence according to the node sequence comprises the following steps:
for each node, acquiring the parameter information of the node from the node parameter table according to the model project code and the node identification code;
determining whether the node is a data source node or an algorithm node;
when the node is determined to be a data source node, acquiring data from a corresponding database of the machine learning platform;
and when the node is determined to be an algorithm node, acquiring the parameters of the node from the node parameter table and executing the algorithm according to those parameters.
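The per-node dispatch of claim 13 could be sketched as follows. The dictionaries NODE_PARAMS, DATABASE, and ALGORITHMS, the node fields id, kind, and algorithm, and the toy scale algorithm are all hypothetical; passing the upstream result into the algorithm is likewise an assumption.

```python
# Hypothetical stand-ins for the node parameter table, the platform
# database, and the registry of algorithm components.
NODE_PARAMS = {
    ("P1", "source"): {"table": "measurements"},
    ("P1", "scale"): {"factor": 2},
}
DATABASE = {"measurements": [1, 2, 3]}
ALGORITHMS = {"scale": lambda data, factor: [x * factor for x in data]}


def execute_node(project_code, node, upstream=None):
    """Look up the node's parameters, then dispatch on the node kind."""
    params = NODE_PARAMS.get((project_code, node["id"]), {})
    if node["kind"] == "data_source":
        # Data source node: fetch data from the corresponding database.
        return DATABASE[params["table"]]
    if node["kind"] == "algorithm":
        # Algorithm node: run the algorithm with its stored parameters.
        return ALGORITHMS[node["algorithm"]](upstream, **params)
    raise ValueError(f"unknown node kind: {node['kind']}")


data = execute_node("P1", {"id": "source", "kind": "data_source"})
scaled = execute_node(
    "P1", {"id": "scale", "kind": "algorithm", "algorithm": "scale"}, data
)
```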
14. The method for creating a model item in a machine learning platform according to claim 1, wherein adding the components required by a model item to the machine learning running environment specifically comprises:
adding components required by the same model project to the machine learning running environment; or
adding components required by different model items to the machine learning running environment.
15. An apparatus for creating a model item in a machine learning platform, comprising:
a creating unit configured to create a machine learning running environment and to create a model project panel on the machine learning platform;
a component adding unit configured to add components required by a model item to the machine learning running environment and to add components to the model project panel;
a connecting unit configured to establish input and output links for the required components according to the sequence set by the model item, so as to create the model item;
a parameter configuration unit configured to configure parameters of the components required by the model project;
and a running unit configured to run the created model item.
16. The apparatus for creating a model item for a machine learning platform of claim 15, wherein the creating unit comprises:
a panel creating unit for creating a model project panel on the machine learning platform;
and an identification code allocation unit for allocating identification codes to the created model project panels.
17. The apparatus for creating model items in a machine learning platform of claim 15, wherein the component adding unit comprises:
a component unit for adding components to the model project panel;
an attribute insertion unit for inserting attribute information of the added component into a model item panel information table;
wherein the attribute information comprises the identification code of the model project panel where the added component is located, the name of the added component, and the position information of the added component in the model project panel.
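The attribute record handled by the attribute insertion unit might be modeled as in this sketch. The class ComponentAttributes, its field names, and the list panel_info_table are hypothetical; only the three kinds of attribute (panel identification code, component name, position) come from the claim.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ComponentAttributes:
    """Attributes recorded when a component is added to a panel."""
    panel_id: str        # identification code of the model project panel
    component_name: str  # name of the added component
    x: int               # position of the component in the panel
    y: int


panel_info_table = []  # stand-in for the panel information table


def add_component(panel_id, component_name, x, y):
    """Add a component and insert its attributes into the table."""
    record = ComponentAttributes(panel_id, component_name, x, y)
    panel_info_table.append(asdict(record))
    return record


add_component("panel-1", "read_table", 40, 20)
```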
18. The apparatus for creating model items in a machine learning platform of claim 15, wherein the connecting unit comprises:
a model item allocation unit for allocating a model item name and a model item code to the model item to be run;
a selection unit for selecting components required by the model item;
and a linking unit for establishing data or command input and output links among the components required by the model item according to the sequence set by the model item.
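As a minimal sketch of the linking unit, assuming the sequence set by the model item is a simple linear list of component names, each component's output can be linked to the input of the next one. The function link_in_sequence and the component names are hypothetical; branching pipelines would need an explicit list of links instead.

```python
def link_in_sequence(components):
    """Pair each component with the next one as (output_of, input_of)."""
    return list(zip(components, components[1:]))


# Hypothetical component names in the order set by the model item.
links = link_in_sequence(["read_table", "split", "train"])
```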
19. An electronic device, characterized by comprising:
a processor and a storage medium;
the storage medium stores a program of a method for creating a model project in a machine learning platform; after the device is powered on and the processor runs the program of the method for creating the model item in the machine learning platform, the following operations are performed:
creating a machine learning operating environment;
creating a model project panel at the machine learning platform;
adding components required by a model project to the machine learning running environment;
adding components to the model project panel;
establishing input and output links for the required components according to the sequence set by the model items to create the model items;
carrying out parameter configuration on components required by the model project;
and running the created model item.
20. A method for creating a model project in a machine learning platform, comprising:
creating a machine learning operating environment;
creating a model project panel at the machine learning platform;
adding components required by a model project to the machine learning running environment;
adding components to the model project panel;
creating, in the machine learning running environment, a model item based on the added components;
and running the created model item.
CN201710502054.1A 2017-06-27 2017-06-27 Method and device for creating model project in machine learning platform Active CN109146081B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710502054.1A CN109146081B (en) 2017-06-27 2017-06-27 Method and device for creating model project in machine learning platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710502054.1A CN109146081B (en) 2017-06-27 2017-06-27 Method and device for creating model project in machine learning platform

Publications (2)

Publication Number Publication Date
CN109146081A (en) 2019-01-04
CN109146081B (en) 2022-04-29

Family

ID=64805125

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710502054.1A Active CN109146081B (en) 2017-06-27 2017-06-27 Method and device for creating model project in machine learning platform

Country Status (1)

Country Link
CN (1) CN109146081B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451663B (en) * 2017-07-06 2021-04-20 创新先进技术有限公司 Algorithm componentization, modeling method and device based on algorithm components and electronic equipment
CN109948804B (en) * 2019-03-15 2021-11-02 北京清瞳时代科技有限公司 Cross-platform dragging type deep learning modeling and training method and device
CN110276456B (en) * 2019-06-20 2021-08-20 山东大学 Auxiliary construction method, system, equipment and medium for machine learning model
CN110414187B (en) * 2019-07-03 2021-09-17 北京百度网讯科技有限公司 System and method for model safety delivery automation
CN111367891A (en) * 2020-03-30 2020-07-03 中国建设银行股份有限公司 Method, device and equipment for calling modeling intermediate data and readable storage medium
CN111861020A (en) * 2020-07-27 2020-10-30 深圳壹账通智能科技有限公司 Model deployment method, device, equipment and storage medium
CN112331348B (en) * 2020-10-21 2021-06-25 北京医准智能科技有限公司 Analysis method and system for set marking, data, project management and non-programming modeling
CN113609098A (en) * 2021-07-31 2021-11-05 云南电网有限责任公司信息中心 Visual modeling platform based on data mining process
CN114168446B (en) * 2022-02-10 2022-07-22 浙江大学 Simulation evaluation method and device for mobile terminal operation algorithm model

Citations (4)

Publication number Priority date Publication date Assignee Title
CN102708404A (en) * 2012-02-23 2012-10-03 北京市计算中心 Machine learning based method for predicating parameters during MPI (message passing interface) optimal operation in multi-core environments
CN105868019A (en) * 2016-02-01 2016-08-17 中国科学院大学 Automatic optimization method for performance of Spark platform
CN106021211A (en) * 2016-05-18 2016-10-12 山东达创网络科技股份有限公司 Intelligent form system and generation method thereof
CN106779088A (en) * 2016-12-06 2017-05-31 北京物思创想科技有限公司 Perform the method and system of machine learning flow

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US9576017B2 (en) * 2014-02-03 2017-02-21 Software Ag Systems and methods for managing graphical model consistency

Non-Patent Citations (1)

Title
Alibaba Cloud Machine Learning Platform: the PAI Platform; Developer Community; https://developer.aliyun.com/article/57677; 2016-07-13; pp. 4-5, Figs. 4 and 5 *

Also Published As

Publication number Publication date
CN109146081A (en) 2019-01-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant