CN107169575A - Modeling system and method for visualized machine learning model training - Google Patents

Modeling system and method for visualized machine learning model training

Info

Publication number
CN107169575A
Authority
CN
China
Prior art keywords
machine learning
spark
training pattern
component
learning training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710501660.1A
Other languages
Chinese (zh)
Inventor
殷晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Secret Number Measurement Data Technology Co Ltd
Original Assignee
Beijing Secret Number Measurement Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Secret Number Measurement Data Technology Co Ltd filed Critical Beijing Secret Number Measurement Data Technology Co Ltd
Priority to CN201710501660.1A priority Critical patent/CN107169575A/en
Publication of CN107169575A publication Critical patent/CN107169575A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The present invention relates to a modeling system and method for visualized machine learning model training. The system includes: a flow designer which, in response to the user dragging selected graphical algorithm components, establishes the data flow between the algorithms in those components and generates a flow description language; a flow parser which parses the flow description language generated by the flow designer, creates the corresponding learning components, and generates the corresponding Spark learning pipeline; and a flow scheduler which submits the Spark learning pipeline to a Spark cluster for model training. By selecting graphical algorithm components, dragging them to establish the data flow between algorithms, generating the flow description language, parsing it, creating the corresponding learning components according to node class names and attributes, generating the corresponding Spark learning pipeline, and submitting it to a Spark cluster for model training, high-quality machine learning modeling can be achieved.

Description

Modeling system and method for visualized machine learning model training
Technical field
The invention belongs to the technical field of big data machine learning, and in particular relates to a visual machine learning trainer, mainly used to help users train models quickly.
Background
Building a machine learning model with existing approaches is cumbersome. The process generally includes feature analysis, model training, model verification, model tuning, model export, and model loading.
Each stage must be coded independently. Creation and analysis in particular are tedious and time-consuming, and require data analysts and engineers to invest a great deal of time.
In addition, because the data exchange formats of the stages are not unified, model training is very time-consuming and systematic result verification cannot be achieved.
Summary of the invention
To solve the above problems of the prior art, the present invention provides a modeling method for visualized machine learning model training. It enables high-quality machine learning modeling, including visual flow design, visual model verification, and visual inspection of intermediate results, so that data analysts can train machine learning models without writing code and model training is accelerated.
The present invention also provides a modeling system for visualized machine learning model training, which likewise enables high-quality machine learning modeling, including visual flow design, visual model verification, and visual inspection of intermediate results, so that data analysts can train machine learning models without writing code and model training is accelerated.
To achieve the above objects, the main technical solutions adopted by the present invention include:
A modeling method for visualized machine learning model training, comprising the following steps:
S1: select predetermined graphical algorithm components and drag them to the design area to establish the data flow between the algorithms in the graphical algorithm components, thereby generating a flow description language;
S2: parse the flow description language, create the corresponding learning components according to node class names and attributes, and generate the corresponding Spark learning pipeline;
S3: submit the learning pipeline to a Spark cluster for model training.
With the above solution, the modeling method for visualized machine learning model training of the present invention enables high-quality machine learning modeling, including visual flow design, visual model verification, and visual inspection of intermediate results. Data analysts can train machine learning models without writing code, and the training efficiency of models is significantly improved.
In step S1, a graphical algorithm component is formed by encapsulating a predetermined algorithm. For example, based on Canvas technology, SmartML can be used to encapsulate the linear regression algorithm and the Logistic algorithm as graphical algorithm components. SmartML is a data modeling language written in JSON format, with six child nodes under the root: dataSource, query, mapping, outputTable, sql, and partition. The dataSource node indicates where the data to be extracted comes from; preferably, it defines two child nodes, name and type, where name gives the name of the data source and type gives its type. The query node defines how data is produced and queried on each platform. The mapping node defines the output structure of the extraction result for the current data source; preferably, it can be used to redefine the structure of the data extracted from the data source. The outputTable node defines the output table name for the query result of a data source; preferably, once the table name is defined, the table can serve as the input of one or more subsequent data analysis processes. The sql node is used to recalculate, join, analyze, and output the data extracted from the different data sources; preferably, its syntax can follow the standard syntax of Spark SQL. The partition node defines partitioning: the data set is distributed over one or more nodes of the Spark cluster according to the data characteristics and actual needs.
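The SmartML structure described above can be illustrated with a minimal sketch. Only the six node names and the name and type children of dataSource come from the description; every value, the shape of each node's content, and the helper validate_smartml are assumptions made for illustration and are not the actual SmartML schema.

```python
import json

# Hypothetical SmartML-like document. The six top-level node names and the
# name/type children of dataSource follow the description; all values and
# the inner structure of each node are illustrative assumptions.
smartml_doc = {
    "dataSource": {"name": "orders_db", "type": "mysql"},
    "query": "SELECT id, amount, label FROM orders",
    "mapping": {"id": "long", "amount": "double", "label": "int"},
    "outputTable": "orders_clean",
    "sql": "SELECT amount, label FROM orders_clean WHERE amount > 0",
    "partition": {"numPartitions": 4},
}

REQUIRED_NODES = ("dataSource", "query", "mapping", "outputTable", "sql", "partition")


def validate_smartml(doc):
    """Return the names of required nodes that are missing from the document."""
    missing = [node for node in REQUIRED_NODES if node not in doc]
    data_source = doc.get("dataSource", {})
    missing += [f"dataSource.{key}" for key in ("name", "type") if key not in data_source]
    return missing


# Round-trip through JSON text, since SmartML is described as JSON-based.
text = json.dumps(smartml_doc, indent=2)
print(validate_smartml(json.loads(text)))
```

A check of this kind would let a designer reject an incomplete description before it reaches the parser; whether the described system performs such validation is not stated.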
Preferably, predetermined operation logic is hidden inside each graphical algorithm component. In this way, complex algorithm logic is simplified through graphical encapsulation.
In step S1, the corresponding attributes of the graphical algorithm components are also set. For example, attributes of the random forest algorithm such as depth, maximum features, classification trees, and sampling strategy are configured.
In step S1, the graphical algorithm components include any one or more of the following:
Data source component, selected by the user to set up a data reading component for reading in data in the machine learning training model;
Data preprocessing component, selected by the user to set up a data preprocessing component for preprocessing data in the machine learning training model;
Text analysis component, selected by the user to set up a text analysis component for text analysis in the machine learning training model;
Machine learning component, selected by the user to set up a machine learning component for machine learning in the machine learning training model;
Result verification component, selected by the user to set up a result verification component for verifying results in the machine learning training model.
In step S2, the learning components are created according to node class names and attributes.
In step S2, the Spark learning pipeline is generated according to the connection attributes of the nodes.
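Step S2 can be sketched as follows, assuming a flow description that lists nodes (each with a class name and attributes) and connections between them. The description format, the stand-in component classes, and the function build_pipeline are all hypothetical; a real implementation would instantiate Spark ML stages rather than these placeholders. Components are created by looking up the node class name, and the stage order is derived from the connections with a topological sort.

```python
from collections import deque


# Stand-in component classes (hypothetical). A real system would map node
# class names to Spark ML stages instead.
class Component:
    def __init__(self, **attrs):
        self.attrs = attrs


class ReadTable(Component):
    pass


class Tokenizer(Component):
    pass


class LogisticRegression(Component):
    pass


REGISTRY = {cls.__name__: cls for cls in (ReadTable, Tokenizer, LogisticRegression)}


def build_pipeline(flow):
    """Create components from node class names and attributes, ordered by the connections."""
    nodes = {n["id"]: REGISTRY[n["class"]](**n.get("attrs", {})) for n in flow["nodes"]}
    indegree = {node_id: 0 for node_id in nodes}
    successors = {node_id: [] for node_id in nodes}
    for src, dst in flow["edges"]:
        successors[src].append(dst)
        indegree[dst] += 1
    # Kahn's algorithm: repeatedly emit nodes whose inputs are all satisfied.
    ready = deque(node_id for node_id, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node_id = ready.popleft()
        order.append(node_id)
        for nxt in successors[node_id]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("flow description contains a cycle")
    return [(node_id, nodes[node_id]) for node_id in order]


flow = {
    "nodes": [
        {"id": "lr", "class": "LogisticRegression", "attrs": {"maxIter": 10}},
        {"id": "src", "class": "ReadTable", "attrs": {"table": "docs"}},
        {"id": "tok", "class": "Tokenizer"},
    ],
    "edges": [["src", "tok"], ["tok", "lr"]],
}
pipeline = build_pipeline(flow)
print([node_id for node_id, _ in pipeline])
```

The nodes are deliberately listed out of order in the example to show that the stage order comes from the connections, not from the order in which the user placed the components.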
In step S3, the learning pipeline is submitted to the Spark cluster according to the resource utilization of the Spark cluster, thereby improving training efficiency.
Preferably, the Spark cluster is a dynamic distributed Spark cluster.
For example, by wrapping the AWS interfaces and managing the Spark cluster performance metrics, the usage of Spark cluster resources can be controlled dynamically, and Spark cluster resources can be added and removed dynamically, achieving dynamic scaling in the true sense.
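The scaling decision described above can be sketched as a simple threshold rule. The patent does not specify the rule, the metrics, or any AWS calls, so the function name, the use of CPU utilization as the metric, the thresholds, and the worker limits below are all assumptions; the sketch only computes a target size and leaves the actual resource change to whatever cloud interface wraps it.

```python
def target_worker_count(current, cpu_utilization, low=0.30, high=0.80,
                        min_workers=1, max_workers=20):
    """Suggest a worker count from cluster utilization (hypothetical threshold rule)."""
    if not 0.0 <= cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be within [0, 1]")
    if cpu_utilization > high:
        wanted = current + 1   # cluster is busy: add a worker
    elif cpu_utilization < low:
        wanted = current - 1   # cluster is idle: remove a worker
    else:
        wanted = current       # utilization is in the comfortable band
    # Clamp to the allowed cluster size.
    return max(min_workers, min(max_workers, wanted))


print(target_worker_count(4, 0.9), target_worker_count(4, 0.1), target_worker_count(4, 0.5))
```

Keeping the decision separate from the cloud API calls makes the rule easy to test and to replace with a more elaborate policy.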
The method may further include step S4: verifying the training result.
The method may further include step S5: saving and exporting the trained model.
A modeling system for visualized machine learning model training, comprising:
a flow designer which, as the user drags selected graphical algorithm components to the design area, establishes the data flow between the algorithms in the graphical algorithm components and generates a flow description language;
a flow parser which parses the flow description language generated by the flow designer, creates the corresponding learning components, and generates the corresponding Spark learning pipeline;
a flow scheduler which submits the Spark learning pipeline to a Spark cluster for model training.
With the above solution, the modeling system for visualized machine learning model training of the present invention enables high-quality machine learning modeling, including visual flow design, visual model verification, and visual inspection of intermediate results. Data analysts can train machine learning models without writing code, and the training efficiency of models is significantly improved.
The graphical algorithm components include any one or more of the following:
Data source component, selected by the user to set up a data reading component for reading in data in the machine learning training model;
Data preprocessing component, selected by the user to set up a data preprocessing component for preprocessing data in the machine learning training model;
Text analysis component, selected by the user to set up a text analysis component for text analysis in the machine learning training model;
Machine learning component, selected by the user to set up a machine learning component for machine learning in the machine learning training model;
Result verification component, selected by the user to set up a result verification component for verifying results in the machine learning training model.
The data preprocessing component includes any one or more of the following:
Sequence number adding component, selected by the user to set up a component that adds sequence numbers to the data in the machine learning training model;
Type conversion component, selected by the user to set up a component that performs type conversion on the data in the machine learning training model.
The machine learning component includes any one or more of the following:
Binary classification component, selected by the user to set up a training component that trains with a binary classification algorithm in the machine learning training model;
Multi-class classification component, selected by the user to set up a training component that trains with a multi-class classification algorithm in the machine learning training model;
Clustering component, selected by the user to set up a training component that trains with a clustering algorithm in the machine learning training model.
The binary classification component includes any one or more of the following:
GBDT binary classification component, selected by the user to set up a training component that trains with the GBDT binary classification algorithm in the machine learning training model;
Linear support vector machine component, selected by the user to set up a training component that trains with the linear support vector machine algorithm in the machine learning training model;
Logistic regression binary classification component, selected by the user to set up a training component that trains with the logistic regression binary classification algorithm in the machine learning training model.
The flow designer is provided with any one or more of the following modules:
Algorithm component list module, for listing the graphical algorithm components;
Visual flow canvas module, for displaying the flow design, model verification, and/or intermediate results;
Algorithm component setting area module, for setting the attributes of each graphical algorithm component (for example, configuring attributes of the random forest algorithm such as depth, maximum features, classification trees, and sampling strategy).
The algorithm component list module may list the graphical algorithm components as a tree.
The flow design shown in the visual flow canvas module includes each selected graphical algorithm component and the data flow relationships between them.
The visual flow canvas module may also show the execution state of each graphical algorithm component.
In the visual flow canvas module, the user can operate on each graphical algorithm component (for example by clicking or double-clicking) to perform the corresponding action (such as modeling or training).
The modeling system of any of the above embodiments preferably further includes a model saving module for saving the trained model.
The modeling system of any of the above embodiments preferably further includes a model import module for importing a model.
In the modeling system of any of the above embodiments, a graphical algorithm component is formed by encapsulating a predetermined algorithm. For example, based on Canvas technology, SmartML can be used to encapsulate the linear regression algorithm and the Logistic algorithm as graphical algorithm components.
Preferably, predetermined operation logic is hidden inside each graphical algorithm component. In this way, complex algorithm logic is simplified through graphical encapsulation.
In the modeling system of any of the above embodiments, the learning components are created according to node class names and attributes.
In the modeling system of any of the above embodiments, the Spark learning pipeline is generated according to the connection attributes of the nodes.
In the modeling system of any of the above embodiments, the learning pipeline is submitted to the Spark cluster according to the resource utilization of the Spark cluster, thereby improving training efficiency.
Preferably, the Spark cluster is a dynamic distributed Spark cluster.
For example, by wrapping the AWS interfaces and managing the Spark cluster performance metrics, the usage of Spark cluster resources can be controlled dynamically, and Spark cluster resources can be added and removed dynamically, achieving dynamic scaling in the true sense.
Brief description of the drawings
Fig. 1 is a schematic diagram of the interface of the modeling system for visualized machine learning model training according to one embodiment of the invention;
Fig. 2 is a schematic flow chart of modeling for visualized machine learning model training according to one embodiment of the invention.
Detailed description
To better explain the present invention and make it easier to understand, the invention is described in detail below through specific embodiments with reference to the accompanying drawings.
Referring to Fig. 1, the modeling system for visualized machine learning model training according to one embodiment of the invention includes:
a flow designer which, as the user drags selected graphical algorithm components to the design area, establishes the data flow between the algorithms in the graphical algorithm components and generates a flow description language;
a flow parser which parses the flow description language generated by the flow designer, creates the corresponding learning components, and generates the corresponding Spark learning pipeline;
a flow scheduler which submits the Spark learning pipeline to a Spark cluster for model training.
The flow designer is provided with an algorithm component list module (left side of the figure), for listing the graphical algorithm components;
a visual flow canvas module (middle of the figure), for displaying the flow design, model verification, and/or intermediate results;
an algorithm component setting area module (right side of the figure), for setting the attributes of each graphical algorithm component (for example, configuring attributes of the random forest algorithm such as depth, maximum features, classification trees, and sampling strategy).
The graphical algorithm component list on the left side of the figure is organized as a tree with three directory levels. The first-level directory includes:
Data source component, selected by the user to set up a data reading component for reading in data in the machine learning training model;
Data preprocessing component, selected by the user to set up a data preprocessing component for preprocessing data in the machine learning training model;
Text analysis component, selected by the user to set up a text analysis component for text analysis in the machine learning training model;
Machine learning component, selected by the user to set up a machine learning component for machine learning in the machine learning training model;
Result verification component, selected by the user to set up a result verification component for verifying results in the machine learning training model.
Under the data source component there is a read data table option, which lets the user set up a module for reading in data in the machine learning training model. Because the algorithm for reading in data is encapsulated in it, the user only needs to select it (for example by dragging) and does not need to program, which simplifies model creation.
Under the data preprocessing component there are add sequence number and type conversion options, which let the user set up, in the machine learning training model, a sequence number adding module that adds sequence numbers to the data read in and a type conversion module that converts the types of the data read in. Because the algorithms for adding sequence numbers and converting types are encapsulated in them, the user only needs to select them (for example by dragging) and does not need to program, which simplifies model creation.
Under the machine learning component there are three second-level directories: binary classification, multi-class classification, and clustering. Under the binary classification directory there are three options, GBDT binary classification, linear support vector machine, and logistic regression binary classification, which let the user set up the corresponding machine learning module in the machine learning training model as needed. Because the corresponding algorithms are encapsulated in them, the user only needs to select them (for example by dragging) and does not need to program, which simplifies model creation.
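The three-level component list described above can be represented as a nested mapping. The sketch below uses only the directory and option names given in this description; the dictionary layout, the use of None to mark a selectable option, and the helper leaf_paths are assumptions, and directories whose options are not enumerated at this point are left empty.

```python
# Component catalogue as a nested dict: a dict value is a directory,
# None marks a selectable option (leaf). The layout is an assumption.
CATALOG = {
    "Data source": {"Read data table": None},
    "Data preprocessing": {"Add sequence number": None, "Type conversion": None},
    "Text analysis": {},  # options not enumerated here
    "Machine learning": {
        "Binary classification": {
            "GBDT binary classification": None,
            "Linear support vector machine": None,
            "Logistic regression binary classification": None,
        },
        "Multi-class classification": {},  # options not enumerated here
        "Clustering": {},                  # options not enumerated here
    },
    "Result verification": {},  # options not enumerated here
}


def leaf_paths(tree, prefix=()):
    """Yield the full directory path of every selectable option in the catalogue."""
    for name, child in tree.items():
        path = prefix + (name,)
        if child is None:
            yield path
        else:
            yield from leaf_paths(child, path)


paths = list(leaf_paths(CATALOG))
for path in paths:
    print(" > ".join(path))
```

A tree widget in the list module could be rendered directly from such a structure, with each leaf path identifying the component that is dragged onto the canvas.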
The visual flow canvas in the middle of the figure shows each module that the user has set up in the machine learning model: for example, Data Source-1, set up with the read data table option; the sequence number adding module and the type conversion module, set up with the add sequence number and type conversion options; and the word segmentation, word frequency statistics, and stop word modules, set up with the SegmentParser, TF/IDF, and StopWord options (not shown).
The visual flow canvas in the middle of the figure also shows the state and progress of each stage of the machine learning training process. In the interface shown, training is in the add sequence number stage, the next stage is word segmentation, and the add sequence number stage is more than 50% complete. (In the figure, the thicker part of a component's outline represents the portion already run and the thinner part represents the portion still to run. Progress is also shown on the data flow connector between two stages: the circled arrow symbol in the figure indicates the direction of execution and, by its position on the connector, the progress.) Other forms of display may of course be used, for example distinguishing states by color (gray for not started and not clickable, bright yellow for already run, blue for not yet run but clickable); the present invention is not limited in this respect.
The algorithm component setting area on the right side of the figure shows that the user can carry out machine learning training by setting the corresponding parameters. For example, "4 component selections" is set to "Word2Vec 10K", "Label by" is set to "Word", "Color by" is set to "No color map", the "T-SNE" algorithm with a 3D model is selected, its "Perplexity" is shown as 73, and its "Learning rate" is shown as 59.
The modeling system for visualized machine learning model training of the above embodiment can operate according to the following method, whose specific steps include:
S1: select predetermined graphical algorithm components and drag them to the design area to establish the data flow between the algorithms in the graphical algorithm components, thereby generating a flow description language;
S2: parse the flow description language, create the corresponding learning components according to node class names and attributes, and generate the corresponding Spark learning pipeline;
S3: submit the learning pipeline to a Spark cluster for model training.
The graphical algorithm components include the data source component, data preprocessing component, text analysis component, machine learning component, result verification component, and so on, which let the user set up components for reading in data, data preprocessing, text analysis, machine learning, and result verification in the machine learning training model, and predetermined operation logic is hidden inside each graphical algorithm component. The modules of a training model can therefore be formed simply by dragging graphical algorithm components, with the corresponding data flow established between the modules. No stage-by-stage programming is needed, and a training model can be built through simple drag operations. In addition, the flow design, model verification, and intermediate results can be displayed on the visual flow canvas, which provides visual flow design, model verification, and inspection of intermediate results, and helps data analysts mine the value of data faster and more intuitively.
With reference to the interface of Fig. 1, the present invention obtains a trained model through the following steps:
1. Select a data source from the left side;
2. Based on the selected data source, select the data preprocessing operations by dragging;
3. For the preprocessed data, drag an algorithm from the right side to analyze it, run the model training flow, and obtain the training result;
4. Verify the training result;
5. Save the trained model.
Two application examples are given below to further illustrate the invention.
Example 1 (text mining, 1,000,000 text records)
The analysis flow includes:
1. Text preparation;
2. Stop word filtering;
3. Word frequency statistics;
4. Feature extraction;
5. Model training with the logistic regression algorithm;
6. Evaluation of the logistic regression model;
7. Export of the trained model.
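Steps 2 to 4 of the flow above (stop word filtering, word frequency statistics, feature extraction) can be sketched in plain Python. This is not the patent's implementation, which runs the flow as a Spark pipeline; the stop word list, the whitespace tokenizer, the sample documents, and the tf * log(N / df) weighting are assumptions chosen to keep the example small.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "is", "of"}  # illustrative list, not from the patent


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words (step 2)."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    features = []
    for tokens in tokenized:
        counts = Counter(tokens)   # word frequency statistics (step 3)
        total = len(tokens) or 1   # avoid division by zero for empty documents
        features.append({
            term: (count / total) * math.log(n_docs / df[term])  # feature extraction (step 4)
            for term, count in counts.items()
        })
    return features


docs = ["the wind is strong", "the wind is weak", "a model of the wind"]
weights = tf_idf(docs)
print(weights[0])
```

A term that occurs in every document, such as "wind" here, receives weight zero, which is why such features help a downstream classifier focus on discriminative words.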
With the prior art, each step of this example must be completed independently, every stage requires programming, and the overall training time is 3 hours.
With the system and method of the present invention, based on visual flow design, model verification, and distributed computing, the overall training time is only half an hour, a significant improvement in efficiency over the prior art.
Example 2 (meteorological analysis)
Build a model for assessing wind energy resources: by analyzing historical wind resource data (for the most recent 22 years: daily and annual variations and long-term averages of wind speed and wind power density at each height, wind probability distributions at different heights, directional distributions of wind direction frequency and wind energy density, wind speed and wind energy frequency distributions, annual effective wind speed hours, turbulence, wind shear exponent, air density, ...), predict the utilization of wind resources over the next five years.
With the prior art, one person needs two days to create and export the model; with the present invention, creation and export of the model can be completed in only 1 hour.
In summary, the present invention provides visual flow design, visual model verification, and visual inspection of intermediate results. Data analysts can train machine learning models without writing code, the training efficiency of models is significantly improved, and data analysts can mine the value of data faster and more directly.

Claims (10)

1. A modeling system for visualized machine learning model training, comprising:
a flow designer which, as the user drags selected graphical algorithm components to the design area, establishes the data flow between the algorithms in the graphical algorithm components and generates a flow description language;
a flow parser which parses the flow description language generated by the flow designer, creates the corresponding learning components, and generates the corresponding Spark learning pipeline;
a flow scheduler which submits the Spark learning pipeline to a Spark cluster for model training.
2. The modeling system for visualized machine learning model training according to claim 1, characterized in that the graphical algorithm components include any one or more of the following:
Data source component, selected by the user to set up a data reading component for reading in data in the machine learning training model;
Data preprocessing component, selected by the user to set up a data preprocessing component for preprocessing data in the machine learning training model;
Text analysis component, selected by the user to set up a text analysis component for text analysis in the machine learning training model;
Machine learning component, selected by the user to set up a machine learning component for machine learning in the machine learning training model;
Result verification component, selected by the user to set up a result verification component for verifying results in the machine learning training model.
3. The modeling system for visualized machine learning model training according to claim 1, characterized in that the flow designer is provided with any one or more of the following modules:
Algorithm component list module, for listing the graphical algorithm components;
Visual flow canvas module, for displaying the flow design, model verification, and/or intermediate results;
Algorithm component setting area module, for setting the attributes of each graphical algorithm component.
4. The modeling system for visualized machine learning model training according to claim 3, characterized in that:
the flow design shown in the visual flow canvas module includes each selected graphical algorithm component and the data flow relationships between them; the visual flow canvas module can also show the execution state of each graphical algorithm component; and/or the user can operate on each graphical algorithm component in the visual flow canvas module to perform the corresponding action.
5. The modeling system for visualized machine learning model training according to claim 1, characterized in that it further includes any one or more of the following structures:
Structure 1: a model saving module for saving the trained model is further included;
Structure 2: a model import module for importing a model is further included;
Structure 3: a graphical algorithm component is formed by encapsulating a predetermined algorithm;
Structure 4: on the basis of structure 3, predetermined operation logic is hidden inside each graphical algorithm component;
Structure 5: the learning components are created according to node class names and attributes;
Structure 6: the Spark learning pipeline is generated according to the connection attributes of the nodes;
Structure 7: the learning pipeline is submitted to the Spark cluster according to the resource utilization of the Spark cluster;
Structure 8: on the basis of structure 7, the Spark cluster is a dynamic distributed Spark cluster;
Structure 9: on the basis of structure 8, by wrapping the AWS interfaces and managing the Spark cluster performance metrics, the usage of Spark cluster resources is controlled dynamically, and Spark cluster resources are added and removed dynamically, achieving dynamic scaling in the true sense.
6. A modeling method for visualizing machine learning training models, characterized in that it comprises the following steps:
S1: selecting predetermined graphical algorithm components and dragging them to the design area to establish the data flow between the algorithms in the graphical algorithm components, thereby generating a process description language;
S2: parsing the process description language, creating the corresponding learning objects according to node class names and attributes, and generating the corresponding Spark learning pipeline;
S3: submitting the learning pipeline to the Spark cluster for model training.
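The hand-off from S1 to S2 can be sketched as follows. This is a hedged illustration only: the claims do not specify the format of the process description language, so the JSON layout, the field names (`nodes`, `edges`, `className`, `attributes`), and the `REGISTRY` lookup are assumptions. The `Tokenizer` and `LogisticRegression` classes below are local stand-ins defined in the example, not Spark classes; a real system would map class names to actual Spark ML stages.

```python
import json


# Local stand-ins for learning stages (hypothetical, defined only for this sketch).
class Tokenizer:
    def __init__(self, inputCol, outputCol):
        self.inputCol, self.outputCol = inputCol, outputCol


class LogisticRegression:
    def __init__(self, maxIter=100, regParam=0.0):
        self.maxIter, self.regParam = maxIter, regParam


# Maps a node's class name to the class that should be instantiated for it.
REGISTRY = {cls.__name__: cls for cls in (Tokenizer, LogisticRegression)}

# Assumed output of S1: dragging and connecting components yields a description.
FLOW_DESCRIPTION = json.dumps({
    "nodes": [
        {"id": "n1", "className": "Tokenizer",
         "attributes": {"inputCol": "text", "outputCol": "words"}},
        {"id": "n2", "className": "LogisticRegression",
         "attributes": {"maxIter": 10, "regParam": 0.01}},
    ],
    "edges": [{"from": "n1", "to": "n2"}],
})


def create_learning_objects(flow_json):
    """First half of S2: parse the description and build one object per node."""
    flow = json.loads(flow_json)
    objects = {}
    for node in flow["nodes"]:
        cls = REGISTRY.get(node["className"])
        if cls is None:
            raise ValueError(f"unknown component class: {node['className']}")
        # The node's attributes become the constructor arguments.
        objects[node["id"]] = cls(**node["attributes"])
    return objects, flow["edges"]


objects, edges = create_learning_objects(FLOW_DESCRIPTION)
```

Using an explicit registry rather than evaluating class names directly keeps the set of instantiable components limited to those the designer actually offers.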
7. The modeling method for visualizing machine learning training models as claimed in claim 6, characterized in that: in step S1, the graphical algorithm components are formed by encapsulating predetermined algorithms. Preferably, predetermined operation logic is hidden inside the graphical algorithm components. Corresponding attributes are also set for the graphical algorithm components. The graphical algorithm components include any one or more of the following components:
A data source component, which the user selects to establish the data reading component that reads in data in the machine learning training model;
A data preprocessing component, which the user selects to establish the component that preprocesses data in the machine learning training model;
A text analysis component, which the user selects to establish the component used for text analysis in the machine learning training model;
A machine learning component, which the user selects to establish the component used for machine learning in the machine learning training model;
A result verification component, which the user selects to establish the component used for result verification in the machine learning training model.
8. The modeling method for visualizing machine learning training models as claimed in claim 6, characterized in that: in step S2, the learning objects are created according to node class names and attributes, and the Spark learning pipeline is generated according to the connection attributes of the nodes.
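One plausible way to turn connection attributes into a pipeline order is a topological sort, since a Spark ML `Pipeline` takes an ordered list of stages. The sketch below is an assumption about how this step could work, not the method the claim prescribes; the function name `order_stages` and the example node names are hypothetical.

```python
from collections import deque


def order_stages(node_ids, edges):
    """Return node ids in an order that respects every (source, target) edge."""
    indegree = {nid: 0 for nid in node_ids}
    downstream = {nid: [] for nid in node_ids}
    for source, target in edges:
        downstream[source].append(target)
        indegree[target] += 1

    # Start from nodes with no incoming data flow (typically data sources).
    ready = deque(nid for nid in node_ids if indegree[nid] == 0)
    ordered = []
    while ready:
        current = ready.popleft()
        ordered.append(current)
        for nxt in downstream[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(ordered) != len(node_ids):
        raise ValueError("the connections contain a cycle; no pipeline order exists")
    return ordered


# Nodes listed in arbitrary canvas order; edges carry the connection attributes.
nodes = ["train", "source", "tokenize", "clean"]
edges = [("source", "clean"), ("clean", "tokenize"), ("tokenize", "train")]
stage_order = order_stages(nodes, edges)
```

The cycle check matters in a drag-and-drop designer, because nothing stops a user from connecting a downstream component back to an upstream one.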
9. The modeling method for visualizing machine learning training models as claimed in claim 6, characterized in that: in step S3, the learning pipeline is submitted to the Spark cluster according to the resource utilization of the Spark cluster. Preferably, the Spark cluster is a dynamic distributed Spark cluster. By encapsulating the AWS interfaces and managing Spark cluster performance metrics, the usage of Spark cluster resources can be dynamically controlled and Spark cluster resources dynamically added and removed, achieving dynamic scaling in the true sense.
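The decision logic behind utilization-based submission and scaling might look like the sketch below. The claim gives no thresholds or policy, so every number, both function names, and the one-worker-at-a-time step are assumptions made for illustration. No AWS or Spark calls are made here; in a real system the returned worker count would be passed to whatever wrapper the platform provides around the cloud provider's resize interface.

```python
def target_worker_count(current_workers, utilization,
                        scale_up_at=0.80, scale_down_at=0.30,
                        min_workers=1, max_workers=20):
    """Suggest a worker count from cluster utilization in the range 0.0 to 1.0."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization >= scale_up_at:
        desired = current_workers + 1      # cluster is busy: add a worker
    elif utilization <= scale_down_at:
        desired = current_workers - 1      # cluster is idle: release a worker
    else:
        desired = current_workers          # within the comfortable band
    # Never scale outside the configured bounds.
    return max(min_workers, min(max_workers, desired))


def can_submit(utilization, submit_below=0.90):
    """Gate pipeline submission on current utilization (assumed policy)."""
    return utilization < submit_below
```

For example, under these assumed thresholds a four-worker cluster at 90% utilization would be grown to five workers, and a new pipeline would be held back until utilization drops below the submission limit.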
10. The modeling method for visualizing machine learning training models as claimed in claim 6, characterized in that it further includes:
Step S4: verifying the training result. And/or step S5: saving and exporting the trained model.
CN201710501660.1A 2017-06-27 2017-06-27 A kind of modeling and method for visualizing machine learning training pattern Pending CN107169575A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710501660.1A CN107169575A (en) 2017-06-27 2017-06-27 A kind of modeling and method for visualizing machine learning training pattern

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710501660.1A CN107169575A (en) 2017-06-27 2017-06-27 A kind of modeling and method for visualizing machine learning training pattern

Publications (1)

Publication Number Publication Date
CN107169575A true CN107169575A (en) 2017-09-15

Family

ID=59827612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710501660.1A Pending CN107169575A (en) 2017-06-27 2017-06-27 A kind of modeling and method for visualizing machine learning training pattern

Country Status (1)

Country Link
CN (1) CN107169575A (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250987A (en) * 2016-07-22 2016-12-21 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王锐君 等: "一种大数据交互式挖掘框架与实现", 科研信息化技术与应用 *
赵玲玲 等: "基于Spark的流程化机器学习分析方法", 计算机系统应用 *

Cited By (70)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107807814A (en) * 2017-09-27 2018-03-16 百度在线网络技术(北京)有限公司 Construction method, device, equipment and the computer-readable recording medium of application component
CN107967359A (en) * 2017-12-21 2018-04-27 百度在线网络技术(北京)有限公司 Data visualization analysis method, system, terminal and computer-readable recording medium
US11216353B2 (en) 2017-12-21 2022-01-04 Baidu Online Network Technology (Beijing) Co., Ltd. Data visual analysis method, system and terminal and computer readable storage medium
CN107991722A (en) * 2017-12-25 2018-05-04 北京墨迹风云科技股份有限公司 Method for building up, Forecasting Methodology and the prediction meanss of weather prediction model
CN110083334A (en) * 2018-01-25 2019-08-02 北京顺智信科技有限公司 The online method and device of model
CN108710949A (en) * 2018-04-26 2018-10-26 第四范式(北京)技术有限公司 The method and system of template are modeled for creating machine learning
CN108733358A (en) * 2018-05-21 2018-11-02 浪潮软件集团有限公司 Spark-based machine learning workflow construction method and device
CN110598868B (en) * 2018-05-25 2023-04-18 腾讯科技(深圳)有限公司 Machine learning model building method and device and related equipment
CN110598868A (en) * 2018-05-25 2019-12-20 腾讯科技(深圳)有限公司 Machine learning model building method and device and related equipment
CN108897587A (en) * 2018-06-22 2018-11-27 北京优特捷信息技术有限公司 Plug type machine learning algorithm operation method, device and readable storage medium storing program for executing
CN108897587B (en) * 2018-06-22 2021-11-12 北京优特捷信息技术有限公司 Pluggable machine learning algorithm operation method and device and readable storage medium
CN108960433A (en) * 2018-06-26 2018-12-07 第四范式(北京)技术有限公司 For running the method and system of machine learning modeling process
CN108960433B (en) * 2018-06-26 2022-04-05 第四范式(北京)技术有限公司 Method and system for running machine learning modeling process
CN108898229A (en) * 2018-06-26 2018-11-27 第四范式(北京)技术有限公司 For constructing the method and system of machine learning modeling process
CN108898229B (en) * 2018-06-26 2021-12-14 第四范式(北京)技术有限公司 Method and system for constructing machine learning modeling process
CN109213482A (en) * 2018-06-28 2019-01-15 清华大学天津高端装备研究院 The graphical application platform of artificial intelligence and application method based on convolutional neural networks
CN108984257A (en) * 2018-07-06 2018-12-11 无锡雪浪数制科技有限公司 A kind of machine learning platform for supporting custom algorithm component
CN110188886A (en) * 2018-08-17 2019-08-30 第四范式(北京)技术有限公司 Visualization method and system are carried out to the data processing step of machine-learning process
CN110209902B (en) * 2018-08-17 2023-11-14 第四范式(北京)技术有限公司 Method and system for visualizing feature generation process in machine learning process
WO2020035076A1 (en) * 2018-08-17 2020-02-20 第四范式(北京)技术有限公司 Method and system for visualizing data processing step of machine learning process
CN110188886B (en) * 2018-08-17 2021-08-20 第四范式(北京)技术有限公司 Method and system for visualizing data processing steps of a machine learning process
CN110209902A (en) * 2018-08-17 2019-09-06 第四范式(北京)技术有限公司 To the feature generating process visualization method and system in machine-learning process
CN109299785A (en) * 2018-09-17 2019-02-01 浪潮软件集团有限公司 Method and device for realizing machine learning model
CN109299785B (en) * 2018-09-17 2022-04-26 浪潮软件股份有限公司 Method and device for realizing machine learning model
CN109343833A (en) * 2018-09-20 2019-02-15 北京神州泰岳软件股份有限公司 Data processing platform (DPP) and data processing method
CN109343833B (en) * 2018-09-20 2022-12-16 鼎富智能科技有限公司 Data processing platform and data processing method
CN109408175A (en) * 2018-09-28 2019-03-01 北京赛博贝斯数据科技有限责任公司 Real-time interaction method and system in general high-performance deep learning computing engines
CN109558395A (en) * 2018-10-17 2019-04-02 中国光大银行股份有限公司 Data processing system and data digging method
CN109783859A (en) * 2018-12-13 2019-05-21 重庆金融资产交易所有限责任公司 Model building method, device and computer readable storage medium
CN110119271A (en) * 2018-12-19 2019-08-13 厦门渊亭信息科技有限公司 A kind of model across machine learning platform defines agreement and adaption system
CN109726818A (en) * 2018-12-29 2019-05-07 北京航天数据股份有限公司 A kind of model editing method, apparatus, equipment and medium
CN109726818B (en) * 2018-12-29 2021-08-17 北京航天数据股份有限公司 Model editing method, device, equipment and medium
CN109726232A (en) * 2018-12-29 2019-05-07 北京航天数据股份有限公司 A kind of model visualization calculation method and system
CN109828751A (en) * 2019-02-15 2019-05-31 福州大学 Integrated machine learning algorithm library and unified programming framework
CN109948804A (en) * 2019-03-15 2019-06-28 北京清瞳时代科技有限公司 Cross-platform towed deep learning modeling and training method and device
CN110175225A (en) * 2019-04-26 2019-08-27 美林数据技术股份有限公司 Non-structural text data processing method and device
CN110222710A (en) * 2019-04-30 2019-09-10 北京深演智能科技股份有限公司 Data processing method, device and storage medium
CN110222710B (en) * 2019-04-30 2022-03-08 北京深演智能科技股份有限公司 Data processing method, device and storage medium
WO2020239033A1 (en) * 2019-05-28 2020-12-03 第四范式(北京)技术有限公司 Method and system for displaying machine learning automatic modeling procedure
CN112130827A (en) * 2019-06-25 2020-12-25 北京启瞳智能科技有限公司 Model development method and platform based on cloud modularization technology and intelligent terminal
CN110334809A (en) * 2019-07-03 2019-10-15 成都淞幸科技有限责任公司 A kind of Component encapsulating method and system of intelligent algorithm
CN110531975A (en) * 2019-08-30 2019-12-03 陕西思科锐迪网络安全技术有限责任公司 A kind of deep learning model training method of graphic programming
CN110909039A (en) * 2019-10-25 2020-03-24 北京华如科技股份有限公司 Big data mining tool and method based on drag type process
CN111125052A (en) * 2019-10-25 2020-05-08 北京华如科技股份有限公司 Big data intelligent modeling system and method based on dynamic metadata
CN111047046A (en) * 2019-11-01 2020-04-21 东方微银科技(北京)有限公司 Visual generation method and equipment of machine learning model
CN110806859A (en) * 2019-11-11 2020-02-18 成都理工大学 Modular drilling data monitoring and design system based on machine learning
CN111104731B (en) * 2019-11-19 2023-09-15 北京集奥聚合科技有限公司 Graphical model full life cycle modeling method for federal learning
CN111104731A (en) * 2019-11-19 2020-05-05 北京集奥聚合科技有限公司 Graphical model full-life-cycle modeling method for federal learning
CN110908573B (en) * 2019-12-03 2021-07-06 北京明略软件系统有限公司 Algorithm model training method, device, equipment and storage medium
CN110908573A (en) * 2019-12-03 2020-03-24 北京明略软件系统有限公司 Algorithm model training method, device, equipment and storage medium
CN111078094A (en) * 2019-12-04 2020-04-28 北京邮电大学 Distributed machine learning visualization device
CN111078094B (en) * 2019-12-04 2021-12-07 北京邮电大学 Distributed machine learning visualization device
CN111080170A (en) * 2019-12-30 2020-04-28 北京云享智胜科技有限公司 Workflow modeling method and device, electronic equipment and storage medium
CN111080170B (en) * 2019-12-30 2023-09-05 北京云享智胜科技有限公司 Workflow modeling method and device, electronic equipment and storage medium
CN111259064A (en) * 2020-01-10 2020-06-09 同方知网(北京)技术有限公司 Visual natural language analysis mining system and modeling method thereof
WO2021143145A1 (en) * 2020-01-15 2021-07-22 平安科技(深圳)有限公司 Data analysis and modeling method, platform, server and readable storage medium
CN111240662B (en) * 2020-01-16 2024-01-09 同方知网(北京)技术有限公司 Spark machine learning system and method based on task visual drag
CN111240662A (en) * 2020-01-16 2020-06-05 同方知网(北京)技术有限公司 Spark machine learning system and learning method based on task visual dragging
CN115485039B (en) * 2020-02-28 2023-06-09 Cy游戏公司 System and method for supporting creation of game scripts
CN115485039A (en) * 2020-02-28 2022-12-16 Cy游戏公司 System and method for supporting creation of game scripts
CN111461349A (en) * 2020-04-07 2020-07-28 中国建设银行股份有限公司 Modeling method and system
CN112685026A (en) * 2020-12-25 2021-04-20 厦门渊亭信息科技有限公司 Multi-language-based visual modeling platform and method
CN112698827A (en) * 2020-12-25 2021-04-23 厦门渊亭信息科技有限公司 Distributed visual modeling platform and method
CN114764296A (en) * 2021-01-12 2022-07-19 京东科技信息技术有限公司 Machine learning model training method and device, electronic equipment and storage medium
CN113095432A (en) * 2021-04-27 2021-07-09 电子科技大学 Visualization system and method based on interpretable random forest
CN113642148A (en) * 2021-06-30 2021-11-12 北京航空航天大学 Modeling language characterization method based on equation description
CN113642148B (en) * 2021-06-30 2024-04-12 北京航空航天大学 Modeling language characterization method based on equation description
CN113609098A (en) * 2021-07-31 2021-11-05 云南电网有限责任公司信息中心 Visual modeling platform based on data mining process
CN113609779A (en) * 2021-08-16 2021-11-05 深圳力维智联技术有限公司 Modeling method, device and equipment for distributed machine learning
CN113609779B (en) * 2021-08-16 2024-04-09 深圳力维智联技术有限公司 Modeling method, device and equipment for distributed machine learning

Similar Documents

Publication Publication Date Title
CN107169575A (en) A kind of modeling and method for visualizing machine learning training pattern
CN106096727B (en) A kind of network model building method and device based on machine learning
CN107943463B (en) Interactive mode automation big data analysis application development system
CN111259064B (en) Visual natural language analysis mining system and modeling method thereof
US20230162051A1 (en) Method, device and apparatus for execution of automated machine learning process
CN110363449A (en) A kind of Risk Identification Method, apparatus and system
CN104268428A (en) Visual configuration method for index calculation
CN108170162B (en) Performance evaluation method for multi-scale wind disturbance analysis unmanned aerial vehicle cluster coordination control system
CN108804630A (en) A kind of big data intellectual analysis service system of Industry-oriented application
CN113656021B (en) Oil gas big data analysis system and method oriented to business scene
CN111796815A (en) Application of full-automatic visual software building platform
CN105975466A (en) Method and device for machine manuscript writing aiming at short newsflashes
CN106529028A (en) Technological procedure automatic generating method
CN109582837A (en) A kind of visualized data processing method based on cloud and system
CN103149840B (en) Semanteme service combination method based on dynamic planning
CN106096159B (en) A kind of implementation method of distributed system behavior simulation analysis system under cloud platform
Yu-ming et al. Research on intelligent manufacturing flexible production line system based on digital twin
CN108519876A (en) A kind of modeling of graphics data stream and processing system and method
CN110968620A (en) Agile data analysis method
Li et al. Artificial Intelligence-Based Sustainable Development of Smart Heritage Tourism
CN103870624B (en) Simulation analysis template for hung crossbeam
CN108052507A (en) A kind of city management information the analysis of public opinion system and method
CN107515979B (en) A kind of processing method and processing system to high-volume part model data
CN110321596A (en) A kind of rolling stock structure simulation method based on finite element analysis
CN107248118A (en) Data digging method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination