CN109389143A - Data analysis and processing system and automatic modeling method - Google Patents

Info

Publication number
CN109389143A
CN109389143A CN201810632499.6A CN201810632499A CN109389143A CN 109389143 A CN109389143 A CN 109389143A CN 201810632499 A CN201810632499 A CN 201810632499A CN 109389143 A CN109389143 A CN 109389143A
Authority
CN
China
Prior art keywords
model
scene
algorithm
data
business model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810632499.6A
Other languages
Chinese (zh)
Inventor
姜琦
路宏琦
耿迪
路明奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nine Chapter Yunji Technology Co Ltd Beijing
Original Assignee
Nine Chapter Yunji Technology Co Ltd Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nine Chapter Yunji Technology Co Ltd Beijing
Priority to CN201810632499.6A (CN109389143A)
Priority to CN202111299347.7A (CN113935434A)
Publication of CN109389143A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/285 Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques

Abstract

The present invention provides a data analysis and processing system and an automatic modeling method. The method comprises: displaying a user interface through which a user sets the scene and data used to create a business model; obtaining the scene and/or data that the user sets in the user interface; and, according to the obtained scene and/or data, selecting one model strategy from multiple model strategies and creating a business model according to the selected model strategy, where the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm. In the present invention, the model strategy is selected automatically according to the scene and/or data set by the user, so the user does not need to select a model strategy. This raises the degree of automation of the data analysis and processing system and improves the user experience.

Description

Data analysis and processing system and automatic modeling method
Technical field
The present invention relates to the technical field of data processing, and in particular to a data analysis and processing system and an automatic modeling method.
Background
Current data analysis and processing systems mainly train business models as follows: the data used to train the business model is exported from a database to a local machine; a modeler then uses a third-party modeling tool to select a model strategy according to the business requirements and train the business model, debugging manually and repeatedly during training to obtain optimized model parameters and, in turn, the trained business model.
This way of training business models has major drawbacks: the training process is complicated, the degree of automation is low, and it is not suitable for non-professional users.
Summary of the invention
In view of this, the present invention provides a data analysis and processing system and an automatic modeling method, to solve the problem that model training in existing data analysis and processing systems is complicated and has a low degree of automation.
To solve the above technical problem, the present invention provides an automatic modeling method for a data analysis and processing system, comprising:
Displaying a user interface, where the user interface is used by a user to set the scene and data for creating a business model;
Obtaining the scene and/or data that the user sets in the user interface, selecting one model strategy from multiple model strategies according to the obtained scene and/or data, and creating a business model according to the selected model strategy, where the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm.
Preferably, the model strategy further includes at least one of the following: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
Preferably, the user interface is further used by the user to set the target feature for creating the business model.
Preferably, the step of displaying the user interface includes:
Displaying a scene list in the user interface for the user to select from;
When an operation of the user selecting a scene in the scene list is detected, displaying the selected scene in the user interface;
Or
Displaying a scene input area in the user interface;
When an operation of the user entering a scene in the input area is detected, obtaining the scene entered by the user;
Displaying, in the user interface, the scenes in the scene list that match the scene entered by the user.
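The matching of an entered scene against the scene list can be sketched as follows. This is a minimal illustration only: the source does not say how matching is performed, so the case-insensitive substring rule, the function name match_scenes, and the sample scene names are all assumptions.

```python
from typing import List

# Illustrative scene list; the entries and their English wording are assumptions.
SCENE_LIST = [
    "Cardholder segmentation",
    "Customer churn prediction",
    "Financial product recommendation prediction",
    "Insurance claim amount prediction",
]


def match_scenes(user_input: str, scene_list: List[str]) -> List[str]:
    """Return the scenes whose names contain the entered text, ignoring case."""
    query = user_input.strip().lower()
    if not query:
        return []  # nothing entered yet, so nothing to suggest
    return [scene for scene in scene_list if query in scene.lower()]


print(match_scenes("churn", SCENE_LIST))
```

A substring rule keeps the sketch deterministic; a real system might instead use fuzzy or semantic matching, which the source leaves open.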
Preferably, the scene includes at least one of: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
Preferably, when the scene corresponds to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, and restricted Boltzmann machine. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the silhouette coefficient method;
When the scene corresponds to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the area under the curve (AUC) score method;
When the scene corresponds to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: logistic regression, random forest, support vector regression, and neural network. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the R2 value method;
When the scene corresponds to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbors, and isolation forest. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the F1 score method;
When the scene corresponds to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, and conditional random field. The parameter tuning method includes: providing default parameters according to the result of word frequency analysis, and using the default parameters.
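The two search-based tuning methods named above, grid parameter search and random parameter search, can be sketched as below. This is a hedged illustration rather than the patented implementation: the helper names, the parameter names depth and rate, and the toy scoring function (a stand-in for training a model and computing a score such as the silhouette coefficient, AUC, R2 or F1) are all assumptions.

```python
import itertools
import random
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

ParamGrid = Dict[str, List[Any]]
Params = Dict[str, Any]


def grid_candidates(grid: ParamGrid) -> Iterator[Params]:
    """Grid parameter search: yield every combination of the listed values."""
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def random_candidates(grid: ParamGrid, n_iter: int, seed: int = 0) -> Iterator[Params]:
    """Random parameter search: yield n_iter randomly sampled combinations."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        yield {name: rng.choice(values) for name, values in grid.items()}


def tune(candidates: Iterable[Params],
         score_fn: Callable[[Params], float]) -> Tuple[Params, float]:
    """Score every candidate and return the best one with its score."""
    best_params, best_score = None, float("-inf")
    for params in candidates:
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    if best_params is None:
        raise ValueError("no candidates to evaluate")
    return best_params, best_score


def toy_score(params: Params) -> float:
    """Hypothetical score; a real system would train and evaluate a model here."""
    return -(params["depth"] - 5) ** 2 - abs(params["rate"] - 0.1)


GRID = {"depth": [3, 5, 7], "rate": [0.01, 0.1, 1.0]}
best, score = tune(grid_candidates(GRID), toy_score)
print(best, score)
```

Grid search evaluates all nine combinations here, while random search trades completeness for a fixed evaluation budget, which matters when the grid is large.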
Preferably, after the step of creating the business model according to the selected model strategy, the method further includes:
Displaying the modeling design information of the created business model, where the modeling design information includes at least: information on the selected model strategy.
Preferably, after the step of displaying the modeling design information of the created business model, the method further includes:
When an operation of the user adjusting the modeling design information is detected, updating the modeling design information;
When an operation of the user running the created business model is detected, running the created business model according to the updated modeling design information.
Preferably, after the step of creating the business model according to the selected model strategy, the method further includes:
When an operation of the user running the created business model is detected, running the created business model using the selected model strategy.
Preferably, after the step of running the created business model, the method further includes:
Displaying the modeling result of the business model that has finished running, where the modeling result includes at least one of: the name of the business model that has finished running, the score of that business model, and the output result of that business model.
Preferably, the modeling result further includes: information on the model strategy of the business model that has finished running, the creation time of that business model, its training information, its corresponding workflow, its state, and the feature importance ranking of the data.
Preferably, the modeling result includes: information on the top M highest-scoring business models among the N finished business models corresponding to the selected model strategy, or information on all N finished business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
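Selecting the top M of N finished business models by score can be sketched as below. The ModelResult class, the function name top_models and the sample scores are illustrative assumptions; only the constraint 1 <= M <= N comes from the text.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass
class ModelResult:
    name: str
    score: float


def top_models(results: List[ModelResult], m: int) -> List[ModelResult]:
    """Return the M highest-scoring results out of the N given, best first."""
    n = len(results)
    if not 1 <= m <= n:
        raise ValueError("M must satisfy 1 <= M <= N")
    return heapq.nlargest(m, results, key=lambda result: result.score)


# Hypothetical scores for N = 4 finished business models.
RESULTS = [
    ModelResult("model-a", 0.81),
    ModelResult("model-b", 0.93),
    ModelResult("model-c", 0.77),
    ModelResult("model-d", 0.88),
]
print([result.name for result in top_models(RESULTS, 2)])
```

Passing M equal to N returns all finished models, which corresponds to the second alternative in the text.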
Preferably, after the step of running the created business model, the method further includes:
Displaying the modeling design information of the business model that has finished running, where the modeling design information includes at least: information on the selected model strategy;
When an operation of the user adjusting the modeling design information is detected, updating the modeling design information;
When an operation of the user rerunning the business model that has finished running is detected, rerunning that business model according to the updated modeling design information.
Preferably, the modeling design information further includes: the scene and/or the target feature.
Preferably, after the step of creating the business model according to the selected model strategy, the method further includes:
Creating a first workflow corresponding to the created business model, where the first workflow includes multiple workflow modules.
Preferably, after the step of creating the first workflow corresponding to the created business model, the method further includes:
When an operation of running the created business model is detected, or when an operation of the user adjusting the modeling design information is detected, updating the first workflow.
Preferably, after the step of creating the first workflow corresponding to the created business model, the method further includes:
When an operation of the user creating a second workflow with the same content as the first workflow is detected, generating the second workflow, where the second workflow is editable.
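Generating an editable second workflow with the same content as the first can be sketched with a deep copy. The Workflow and WorkflowModule classes, the module names, and the choice to mark the first workflow as not editable are assumptions made for illustration; the source only states that the second workflow is editable.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class WorkflowModule:
    name: str
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Workflow:
    name: str
    modules: List[WorkflowModule] = field(default_factory=list)
    editable: bool = False  # assumption: the system-managed first workflow is read-only


def clone_workflow(first: Workflow, new_name: str) -> Workflow:
    """Create a second workflow with identical content that the user may edit."""
    second = copy.deepcopy(first)  # deep copy so edits never leak into the first
    second.name = new_name
    second.editable = True
    return second


first = Workflow("auto-model", [WorkflowModule("split data", {"ratio": 0.8}),
                                WorkflowModule("train", {"algorithm": "random forest"})])
second = clone_workflow(first, "auto-model (copy)")
second.modules[0].params["ratio"] = 0.7  # editing the copy leaves the original intact
print(first.modules[0].params, second.modules[0].params)
```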
Preferably, after the step of displaying the user interface, the method further includes:
When an operation of the user viewing the set data is detected, displaying visual information corresponding to the data.
Preferably, after the step of running the created business model, the method further includes:
When an operation of the user publishing the business model that has finished running is detected, publishing that business model.
Preferably, after the step of running the created business model, the method further includes:
When an operation of the user re-evaluating the business model that has finished running, or the published business model, is detected, re-evaluating that business model.
The present invention also provides a data analysis and processing system, comprising:
A display module, configured to display a user interface, where the user interface is used by a user to set the scene and data for creating a business model;
A processing module, configured to obtain the scene and/or data that the user sets in the user interface, select one model strategy from multiple model strategies according to the obtained scene and/or data, and create a business model according to the selected model strategy, where the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm.
Preferably, the model strategy further includes at least one of the following: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
Preferably, the user interface is further used by the user to set the target feature for creating the business model.
Preferably, the display module is configured to display a scene list in the user interface for the user to select from and, when an operation of the user selecting a scene in the scene list is detected, to display the selected scene in the user interface;
Or
The display module is configured to display a scene input area in the user interface; when an operation of the user entering a scene in the input area is detected, to obtain the scene entered by the user; and to display, in the user interface, the scenes in the scene list that match the scene entered by the user.
Preferably, the scene includes at least one of: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
Preferably, when the scene corresponds to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, and restricted Boltzmann machine. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the silhouette coefficient method;
When the scene corresponds to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the area under the curve (AUC) score method;
When the scene corresponds to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: logistic regression, random forest, support vector regression, and neural network. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the R2 value method;
When the scene corresponds to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbors, and isolation forest. The parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: the random parameter search method, the grid parameter search method, and the F1 score method;
When the scene corresponds to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm. The algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, and conditional random field. The parameter tuning method includes: providing default parameters according to the result of word frequency analysis, and using the default parameters.
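Two of the scores named above as tuning criteria, the R2 value for regression and the F1 score for anomaly detection, have standard definitions that can be written out directly. The sketch below uses plain Python with hypothetical function names; it shows how the scores are computed and is not a statement of how the system computes them internally.

```python
from typing import Sequence


def r2_value(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def f1_value(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


print(r2_value([1, 2, 3], [1, 2, 3]))
print(f1_value([1, 1, 0, 0], [1, 0, 1, 0]))
```

Either function can serve as the objective that a hyperparameter search maximizes for the corresponding scene type.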
Preferably, the display module is further configured to display the modeling design information of the created business model, where the modeling design information includes at least: information on the selected model strategy.
Preferably, the data analysis and processing system further includes:
A first adjustment module, configured to update the modeling design information when an operation of the user adjusting the modeling design information is detected;
A first running module, configured to run the created business model according to the updated modeling design information when an operation of the user running the created business model is detected.
Preferably, the data analysis and processing system further includes:
A second running module, configured to run the created business model using the selected model strategy when an operation of the user running the created business model is detected.
Preferably, the display module is further configured to display the modeling result of the business model that has finished running, where the modeling result includes at least one of: the name of the business model that has finished running, the score of that business model, and the output result of that business model.
Preferably, the modeling result further includes: information on the model strategy of the business model that has finished running, the creation time of that business model, its training information, its corresponding workflow, its state, and the feature importance ranking of the data.
Preferably, the modeling result includes: information on the top M highest-scoring business models among the N finished business models corresponding to the selected model strategy, or information on all N finished business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
Preferably, the display module is further configured to display the modeling design information of the business model that has finished running, where the modeling design information includes at least: information on the selected model strategy;
A second adjustment module, configured to update the modeling design information when an operation of the user adjusting the modeling design information is detected;
A third running module, configured to rerun the business model that has finished running according to the updated modeling design information when an operation of the user rerunning that business model is detected.
Preferably, the modeling design information further includes: the scene and/or the target feature.
Preferably, the data analysis and processing system further includes:
A creation module, configured to create a first workflow corresponding to the created business model, where the first workflow includes multiple workflow modules.
Preferably, the data analysis and processing system further includes:
An update module, configured to update the first workflow when an operation of running the created business model is detected, or when an operation of the user adjusting the modeling design information is detected.
Preferably, the data analysis and processing system further includes:
A replication module, configured to generate a second workflow when an operation of the user creating a second workflow with the same content as the first workflow is detected, where the second workflow is editable.
Preferably, the data analysis and processing system further includes:
A visualization module, configured to display visual information corresponding to the data when an operation of the user viewing the set data is detected.
Preferably, the data analysis and processing system further includes:
A publishing module, configured to publish the business model that has finished running when an operation of the user publishing that business model is detected.
Preferably, the data analysis and processing system further includes:
A re-evaluation module, configured to re-evaluate the business model that has finished running, or the published business model, when an operation of the user re-evaluating that business model is detected.
The present invention also provides a data analysis and processing system, including a processor, a memory, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the above automatic modeling method.
The present invention also provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of the above automatic modeling method.
The beneficial effects of the above technical solutions of the present invention are as follows:
In the embodiments of the present invention, the data analysis and processing system can automatically select a model strategy according to the scene and/or data set by the user, so the user does not need to select a model strategy. This raises the degree of automation of the data analysis and processing system and improves the user experience.
Brief description of the drawings
Fig. 1 is a flow diagram of the automatic modeling method of the data analysis and processing system of Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the user interface for automatic modeling of an embodiment of the present invention;
Fig. 3 is a schematic diagram of the user interface for viewing model strategy information of an embodiment of the present invention;
Fig. 4 is a schematic diagram of the user interface of the modeling result list of an embodiment of the present invention;
Fig. 5 is a schematic diagram of the user interface of the modeling result chart of an embodiment of the present invention;
Fig. 6 and Fig. 7 are schematic diagrams of the user interface for viewing data of an embodiment of the present invention;
Fig. 8 is a schematic diagram of the user interface of the model repository of an embodiment of the present invention;
Fig. 9 and Fig. 10 are schematic diagrams of the user interface for online model performance and resource usage of an embodiment of the present invention;
Fig. 11 is a structural diagram of the data analysis and processing system of one embodiment of the present invention;
Fig. 12 is a structural diagram of the data analysis and processing system of another embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the described embodiments fall within the protection scope of the present invention.
Referring to Fig. 1, which is a flow diagram of the automatic modeling method of the data analysis and processing system of Embodiment 1 of the present invention, the automatic modeling method includes:
Step 11: displaying a user interface, where the user interface is used by a user to set the scene and data for creating a business model;
Referring to Fig. 2, which is a schematic diagram of the user interface for automatic modeling of the data analysis and processing system of an embodiment of the present invention, the user interface for automatic modeling includes a "Select scene" input box and a "Select data module" (that is, data) input box. The user can set the scene for creating the business model in the "Select scene" input box, and can set the data for creating the business model in the "Select data module" input box (a data module is a module for storing data). In the embodiments of the present invention, preferably, the data modules displayed in the user interface are those that the user has permission to select, and a description of each data module may be displayed in the user interface along with the data module.
Step 12: obtaining the scene and/or data that the user sets in the user interface, selecting one model strategy from multiple model strategies according to the obtained scene and/or data, and creating a business model according to the selected model strategy, where the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm.
A model strategy includes at least the algorithm of the business model and the parameter tuning method for that algorithm; the algorithm of the business model can be trained based on the information in the model strategy. In some preferred embodiments of the present invention, the model strategy may further include at least one of the following: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
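One way to picture a model strategy is as a record with two required fields and five optional ones. The sketch below is illustrative only: the class name, field names and sample values are assumptions, and the source does not specify any storage format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelStrategy:
    # Required: every model strategy carries at least these two items.
    algorithm: str
    tuning_method: str
    # Optional: items that preferred embodiments may add.
    evaluation_method: Optional[str] = None
    parameter_setting_method: Optional[str] = None
    data_split_method: Optional[str] = None
    data_processing_method: Optional[str] = None
    feature_selection_method: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.algorithm or not self.tuning_method:
            raise ValueError("algorithm and tuning_method are both required")


# Hypothetical example of a stored strategy for a classification scene.
strategy = ModelStrategy(
    algorithm="random forest",
    tuning_method="grid parameter search",
    evaluation_method="AUC score",
)
print(strategy)
```

Making the record immutable (frozen) reflects the idea that stored strategies are selected rather than edited in place; that choice is also an assumption.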
It can be understood that, in the embodiments of the present invention, multiple model strategies need to be stored in the data analysis and processing system in advance.
In the embodiments of the present invention, the data analysis and processing system can automatically select a model strategy according to the scene and/or data set by the user, so the user does not need to select a model strategy. This raises the degree of automation of the data analysis and processing system and improves the user experience.
In the embodiments of the present invention, the type of the algorithm of the business model may include at least one of: a clustering algorithm, a classification algorithm, a regression algorithm, anomaly detection, and a language processing algorithm. Correspondingly, the scene may include at least one of: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
For example, scenes corresponding to a clustering algorithm may include cardholder segmentation (analyzing which categories credit card customers fall into) and network domain grouping (analyzing the relationship between network alarm logs and devices, and clustering network alarm logs by device). Scenes corresponding to a classification algorithm may include customer churn prediction and financial product recommendation prediction. Scenes corresponding to a regression algorithm may include insurance claim amount prediction and cash provision. Scenes corresponding to anomaly detection may include fraud and abnormal transactions. Scenes corresponding to language processing may include latent semantic analysis and word frequency analysis.
In the embodiments of the present invention, the scene set by the user may be a broad type, that is, the user selects a clustering algorithm, a classification algorithm, and so on in the user interface. The scene set by the user may also be a narrow type, for example a scene that contains a business objective, such as cardholder segmentation or customer churn prediction selected in the user interface. Of course, in some other embodiments of the present invention, the user interface may offer only broad types or only narrow types; the present invention places no limitation on this.
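The relationship between narrow scenes (business objectives) and broad types (algorithm types) can be sketched as a lookup table built from the examples above. The table layout, the English scene names and the function name resolve_algorithm_type are illustrative assumptions.

```python
# Narrow scene -> broad algorithm type, using the examples given in the text.
SCENE_TO_ALGORITHM_TYPE = {
    "cardholder segmentation": "clustering",
    "network domain grouping": "clustering",
    "customer churn prediction": "classification",
    "financial product recommendation prediction": "classification",
    "insurance claim amount prediction": "regression",
    "cash provision": "regression",
    "fraud": "anomaly detection",
    "abnormal transaction": "anomaly detection",
    "latent semantic analysis": "language processing",
    "word frequency analysis": "language processing",
}
ALGORITHM_TYPES = set(SCENE_TO_ALGORITHM_TYPE.values())


def resolve_algorithm_type(scene: str) -> str:
    """Accept either a broad type or a narrow scene and return the broad type."""
    key = scene.strip().lower()
    if key in ALGORITHM_TYPES:
        return key  # the user picked a broad type directly
    if key in SCENE_TO_ALGORITHM_TYPE:
        return SCENE_TO_ALGORITHM_TYPE[key]
    raise ValueError(f"unknown scene: {scene!r}")


print(resolve_algorithm_type("Customer churn prediction"))
print(resolve_algorithm_type("clustering"))
```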
In the embodiment of the present invention, it is preferable that the scene refers to the business scenario for creating business model, scene and industry The type of the algorithm of business model is related.
In embodiments of the present invention, when a model strategy is selected, the scene set by the user may be analyzed to obtain the corresponding model strategy. Alternatively, the type of the data set by the user may be analyzed, or the scene and the data may be analyzed together, to obtain the corresponding model strategy.
In embodiments of the present invention, the Data Analysis Services system may preferably store a correspondence between scenes and/or data and model strategies, and select the model strategy according to that correspondence. Of course, in some other embodiments of the present invention, the system may instead store a correspondence between scenes and/or data and information about model strategies, and determine the model strategy according to that correspondence.
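The stored correspondence described above can be pictured as a lookup table. The following is only a minimal sketch: the `ModelStrategy` fields, the `STRATEGY_BY_SCENE` table, and the `select_strategy` helper are hypothetical names chosen for illustration, and keying on the scene alone is a simplifying assumption (the text also allows the data and the target feature to take part in the selection).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelStrategy:
    """Illustrative container; the field names are assumptions, not the patent's schema."""
    name: str
    data_processing: tuple
    algorithms: tuple
    tuning_metric: str

# Hypothetical stored correspondence: scene -> model strategy.
STRATEGY_BY_SCENE = {
    "cardholder grouping": ModelStrategy(
        name="model strategy 1",
        data_processing=("data cleaning", "data standardization"),
        algorithms=("hierarchical clustering", "Bayesian Gaussian mixture"),
        tuning_metric="silhouette coefficient",
    ),
    "customer churn prediction": ModelStrategy(
        name="model strategy 2",
        data_processing=("data cleaning", "data standardization"),
        algorithms=("logistic regression", "random forest"),
        tuning_metric="AUC score",
    ),
}

def select_strategy(scene, table=STRATEGY_BY_SCENE):
    """Return the stored strategy for a scene, or None if no correspondence exists."""
    return table.get(scene.strip().lower())
```

A call such as `select_strategy("Customer Churn Prediction")` would return the second entry, while an unknown scene returns `None` so that the caller can fall back to another selection path.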
In embodiments of the present invention, the scene and the data may interact. Preferably, the scenes available for selection differ depending on the data, and the data available for selection differ depending on the scene. Differences in data include differences in the data type, the data granularity, the selectable target columns, and the like.
In some embodiments of the present invention, the step of displaying the user interface may include:
Step 111: displaying a scene list in the user interface for the user to select from.
Step 112: upon detecting that the user has selected a scene from the scene list, displaying the selected scene in the user interface.
In some other embodiments of the present invention, the step of displaying the user interface may also include:
Step 111': displaying a scene input area in the user interface; the scene input area may be a text input box or a voice input button;
Step 112': upon detecting that the user has entered a scene in the input area, obtaining the scene entered by the user;
Step 113': displaying, in the user interface, the scene in the scene list that matches the scene entered by the user.
Specifically, the Data Analysis Services system may perform semantic understanding on the scene entered in the input area, identify the scene automatically, and determine from the scene list the scene that matches the identified scene.
It will be appreciated that the scene list needs to be stored in the Data Analysis Services system, and that the scene list contains at least one scene (typically more than one, for example 80).
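Matching an entered scene against the stored scene list can be illustrated as follows. The text calls for semantic understanding; this sketch swaps in plain string similarity from Python's standard `difflib` module as a stand-in, which is an assumption made only to keep the example self-contained. The `SCENE_LIST` entries and the `match_scene` helper are hypothetical.

```python
import difflib

# Hypothetical stored scene list (a real system might hold around 80 entries).
SCENE_LIST = [
    "cardholder grouping",
    "customer churn prediction",
    "insurance claim amount prediction",
    "abnormal transaction detection",
    "latent semantic analysis",
]

def match_scene(user_input, scenes=SCENE_LIST, cutoff=0.5):
    """Return the closest scene in the list, or None when nothing is similar enough."""
    normalized = user_input.strip().lower()
    matches = difflib.get_close_matches(normalized, scenes, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

The `cutoff` threshold decides when no scene is shown at all; a production system would more plausibly replace the similarity measure with a language model or a synonym dictionary.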
Referring to FIG. 2, in embodiments of the present invention, in addition to setting the scene and the data, the user may also choose to set a target feature (the target column in FIG. 2), and the model strategy is determined according to the target feature. For example, in customer churn prediction the target column is the churn label column. In embodiments of the present invention, one column may be selected as the target column; of course, multiple columns may also be selected.
That is, the user interface is also used for the user to set the target feature for creating the business model.
Preferably, the step of obtaining the scene and/or data set by the user in the user interface and selecting one model strategy from multiple model strategies according to the obtained scene and/or data includes: obtaining the scene, data, and/or target feature set by the user in the user interface, and selecting one model strategy from multiple model strategies according to the obtained scene, data, and/or target feature.
That is, the target feature can be used to select the model strategy. In addition, the target feature can be used while the business model is trained, for example during algorithm evaluation.
In addition, referring to FIG. 2, the user may also set the name of the business model in the automatic modeling user interface (the automatic modeling name in FIG. 2), as well as a description of the business model, labels for the business model, and the like.
The correspondence among scene, data, target feature, and model strategy is illustrated below by example.
1. Scenes corresponding to a clustering algorithm: cardholder grouping (which categories the customers of a credit card fall into), network group domain (the relationship between network alarm logs and devices; the network alarm logs are clustered by device), etc.
Scene: cardholder grouping. Data: credit card customer information (for example, the credit card customer information of a certain bank over a fixed period such as one year).
Model strategy 1: Data processing: data cleaning and/or data standardization. Feature engineering: feature selection by principal component analysis. Algorithm (selected based on the spatial characteristics of the clustering space): at least one of hierarchical clustering, Bayesian Gaussian mixture, KDTree (K-D tree), and restricted Boltzmann machine. Parameter tuning: tuning is carried out by hyperparameter optimization, whose methods include at least one of random parameter search, grid parameter search, and the silhouette coefficient method (for example, to decide how many clusters to form). Specifically, hyperparameters are selected by random parameter search and/or grid parameter search, for example by choosing the best group of hyperparameters from a parameter list, with the silhouette coefficient as the evaluation metric for the hyperparameters. Evaluation: the algorithms are evaluated using the silhouette coefficient, homogeneity, completeness, and/or V-measure. Every algorithm receives the same evaluation, the result of every algorithm is retained, and the results are analyzed further in the context of the credit card business.
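To make the role of the silhouette coefficient concrete, the sketch below computes it in plain Python and uses it to choose between candidate cluster counts. It is an illustration under stated assumptions, not the system's implementation: the points, the candidate labelings, and the function name are hypothetical, Euclidean distance is assumed, and a singleton cluster is scored as 0 by convention.

```python
import math

def silhouette_coefficient(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # common convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical candidates: the labels produced for four customers by two cluster counts.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
best_k = max(candidates, key=lambda k: silhouette_coefficient(points, candidates[k]))
```

Here the two-cluster labeling scores higher than the three-cluster one, so `best_k` is 2. In a search over a parameter list, each candidate hyperparameter group would be scored the same way and the highest-scoring group kept.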
2. Scenes corresponding to a classification algorithm: customer churn prediction, financial product recommendation prediction, etc.
Scene: customer churn prediction. Data: customer information (for example, the customer information of a certain bank over a fixed period such as one year). Target column: churned/not churned.
Model strategy 2: Data processing: data cleaning and/or data standardization. Feature engineering: feature selection by the chi-square test, the Pearson correlation coefficient method, extremely randomized trees feature selection, and/or recursive feature elimination, etc. Algorithm (chosen according to the characteristics of different algorithms, with some algorithms selected for each characteristic): at least one of logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model. Parameter tuning: tuning is carried out by hyperparameter optimization, whose methods include at least one of random parameter search, grid parameter search, and the area under the curve (AUC) score method. Specifically, hyperparameters are selected by random parameter search and/or grid parameter search, for example by choosing the best group of hyperparameters from a parameter list, with the AUC score as the evaluation metric for the hyperparameters. Evaluation: the algorithms are evaluated using the AUC score, accuracy, precision, recall, F1 score, and/or log loss. Every algorithm receives the same evaluation, the best algorithm is selected, and a churn prediction probability is output for each customer.
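The tuning step of this strategy, choosing the best hyperparameter group from a parameter list with the AUC score as the metric, can be sketched in plain Python. Everything named here (`auc_score`, `grid_candidates`, `random_candidates`, `best_params`, the toy data, and the one-parameter "model") is hypothetical and serves only to show the mechanics of grid and random search.

```python
import itertools
import random

def auc_score(y_true, y_score):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def grid_candidates(param_list):
    """Every combination in the parameter list (grid parameter search)."""
    keys = sorted(param_list)
    for values in itertools.product(*(param_list[k] for k in keys)):
        yield dict(zip(keys, values))

def random_candidates(param_list, n_iter, seed=0):
    """n_iter random draws from the parameter list (random parameter search)."""
    rng = random.Random(seed)
    keys = sorted(param_list)
    for _ in range(n_iter):
        yield {k: rng.choice(param_list[k]) for k in keys}

def best_params(candidates, evaluate):
    """Return the (params, score) pair with the highest evaluation score."""
    return max(((p, evaluate(p)) for p in candidates), key=lambda pair: pair[1])

# Toy validation data: two features per customer and a churn label (made up).
X = [(0.9, 0.1), (0.8, 0.3), (0.2, 0.7), (0.1, 0.9)]
y = [1, 1, 0, 0]

def evaluate(params):
    # Hypothetical "model": a weighted sum of the two features.
    scores = [params["w"] * a + (1 - params["w"]) * b for a, b in X]
    return auc_score(y, scores)

params, score = best_params(grid_candidates({"w": [0.0, 0.5, 1.0]}), evaluate)
```

Grid search enumerates every combination, while `random_candidates` trades completeness for a fixed budget of `n_iter` evaluations; both feed the same `best_params` selection.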
3. Scenes corresponding to a regression algorithm: insurance claim amount prediction, cash provision, etc.
Scene: insurance claim amount prediction. Data: the customer information of a certain insurance company (for example, the customer information of a certain bank over a fixed period such as one year). Target column: claim amount.
Model strategy 3: Data processing: data cleaning and/or data standardization. Feature engineering: feature selection by the chi-square test, the Pearson correlation coefficient method, extremely randomized trees feature selection, and/or recursive feature elimination, etc. Algorithm (chosen according to the characteristics of different algorithms, with some algorithms selected for each characteristic): at least one of logistic regression, random forest, support vector regression (SVR), and neural network. Parameter tuning: tuning is carried out by hyperparameter optimization, whose methods include at least one of random parameter search, grid parameter search, and the R2 value method. Specifically, hyperparameters are selected by random parameter search and/or grid parameter search, for example by choosing the best group of hyperparameters from a parameter list, with the R2 value as the evaluation metric for the hyperparameters. Evaluation: the algorithms are evaluated using the explained variance score, mean absolute error, mean squared error, R2 value, median absolute error, and/or squared logarithmic error. Every algorithm receives the same evaluation, the best algorithm is selected, and the predicted insurance claim amount is output.
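The last step of this strategy, giving every algorithm the same evaluation and keeping the best one, can be sketched with the R2 value as the deciding metric. The claim amounts, the predictions, and the helper names below are made up for illustration; only the R2 and mean absolute error formulas are standard.

```python
def r2_value(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def select_best_algorithm(y_true, predictions):
    """predictions maps an algorithm name to its predicted values; highest R2 wins."""
    return max(predictions, key=lambda name: r2_value(y_true, predictions[name]))

# Hypothetical claim amounts and the predictions of two candidate algorithms.
claims = [100.0, 200.0, 300.0]
predictions = {
    "random forest": [110.0, 190.0, 310.0],
    "SVR": [200.0, 200.0, 200.0],
}
best = select_best_algorithm(claims, predictions)
```

A predictor that always outputs the mean scores an R2 of 0, which is why the constant "SVR" predictions lose to the closer "random forest" predictions in this toy case.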
4. Scenes corresponding to anomaly detection, more specifically fraud, abnormal transactions, etc.
Scene: anomaly detection. Data: the transaction information of a certain industry (for example, its transaction information over a fixed period). A target column may be given: abnormal/not abnormal.
Model strategy 4: Data processing: data cleaning and/or data standardization. Feature engineering: feature imbalance handling (generally all features are used, and imbalance handling is applied to them). Algorithm (some algorithms selected for anomaly detection): at least one of neural network, support vector machine, robust regression, nearest neighbors, and Isolation Forest. Parameter tuning: tuning is carried out by hyperparameter optimization, whose methods include at least one of random parameter search, grid parameter search, and the F1 score method. Specifically, hyperparameters are selected by random parameter search and/or grid parameter search, for example by choosing the best group of hyperparameters from a parameter list, with the F1 score as the evaluation metric for the hyperparameters. Evaluation: the algorithms are evaluated using the AUC score, accuracy, precision, recall, F1 score, and/or log loss. Every algorithm receives the same evaluation, the best algorithm is selected, and an anomaly prediction probability is output for each transaction.
5. Scenes corresponding to language processing, more specifically latent semantic analysis, word frequency analysis, etc.
Scene: latent semantic analysis. Data: the corresponding text information (for example, summary information, log information, search terms).
Model strategy 5: Data processing: word segmentation and/or word frequency analysis. Algorithm (some algorithms selected for language processing): at least one of latent semantic indexing, latent Dirichlet allocation, and conditional random field. Parameter tuning: default parameters are derived from the result of the word frequency analysis, and those default parameters are used. Further clustering: dimensionality reduction is performed using at least one of locally linear embedding, spectral embedding, multidimensional scaling, and local tangent space alignment (selected based on the spatial characteristics of the manifold space), followed by clustering with K-MEANS. The results are analyzed further in the context of the specific business.
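The word frequency analysis named in the data processing step can be illustrated with the standard library alone. This sketch assumes text whose words are separated by spaces or punctuation; Chinese text would first need a dedicated word segmenter, which is not shown. The `word_frequencies` helper and the sample log lines are hypothetical.

```python
import re
from collections import Counter

def word_frequencies(texts):
    """Count lowercase alphanumeric tokens across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts

# Hypothetical log lines used as the text data.
logs = ["Disk error on node1", "disk full on node2", "Network error"]
frequencies = word_frequencies(logs)
most_common = frequencies.most_common(3)
```

Counts of this kind are what a system could inspect when deriving default parameters, for instance a vocabulary size or a minimum term frequency; how the patent's system actually derives them is not stated in the text.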
In embodiments of the present invention, preferably, after the step of creating the business model according to the selected model strategy, the method may further include: displaying the modeling design information of the created business model, the modeling design information including at least information about the selected model strategy, so that the user can view that information. The modeling design information may also include the target feature and/or the scene.
In embodiments of the present invention, preferably, after the step of displaying the modeling design information of the created business model, the method may further include:
upon detecting that the user adjusts the modeling design information, updating the modeling design information;
upon detecting that the user performs an operation to run the created business model, running the created business model according to the updated modeling design information.
That is, the user can customize the content of the modeling design information, which improves the user experience.
Running the created business model includes at least training the algorithm of the created business model. It may also include splitting the data, processing the data, and/or selecting features of the data.
In embodiments of the present invention, preferably, after the step of creating the business model according to the selected model strategy, the method may further include: upon detecting that the user performs an operation to run the created business model, running the created business model using the selected model strategy.
Referring to FIG. 2, after the user has finished setting the scene, the data, and so on for creating the business model in the user interface shown in FIG. 2, the user may click the "New" button to display the modeling design information of the created business model, for example to view the data processing method, the algorithm, the algorithm parameters, and/or the evaluation method in the model strategy. Alternatively, when the user clicks the "Train" button, the business model is created using the selected model strategy and then run. That is, once the user clicks the "Train" button, the Data Analysis Services system creates the business model according to the automatically selected model strategy and runs the created business model. The user does not need to select a model strategy, which simplifies the training process, raises the degree of automation of the Data Analysis Services system, and improves the user experience.
In embodiments of the present invention, after the user clicks "New", the user interface displayed may be as shown in FIG. 3. The modeling design information of the created business model shown in this user interface includes: basic information, features, modeling, and evaluation. The basic information includes the target and the training/test sets; the target includes the scene and the target column; the training/test sets are formed by splitting and/or sampling the data; and modeling includes the algorithm and its parameters.
In embodiments of the present invention, after the step of running the created business model, the method further includes: displaying the modeling design information of the business model that has finished running. That is, after the business model has been run, its modeling design information can still be viewed.
In embodiments of the present invention, after the step of displaying the modeling design information of the business model that has finished running, the method may further include: upon detecting that the user adjusts the modeling design information, updating the modeling design information; and upon detecting that the user performs an operation to rerun the business model that has finished running, rerunning that business model according to the updated modeling design information.
The modeling design information includes information about the model strategy of the business model that has finished running, and may also include the scene and/or the target feature. That is, in embodiments of the present invention, after training of the business model is complete, the modeling design information can be viewed or adjusted, for example the model strategy, the target feature, and/or the scene of the business model. Everything other than the data can be adjusted, and the adjusted business model can then be run again.
In embodiments of the present invention, after the step of running the created business model, the method further includes: displaying the modeling achievements of the business model that has finished running. The modeling achievements may include at least one of: the name of the business model that has finished running, its score, and its output result. The output result may be, for example, the prediction of whether a customer will churn. The name of the business model may be, for example, algorithm name + timestamp.
In some preferred embodiments of the present invention, the modeling achievements may further include: information about the model strategy of the business model that has finished running, its creation time, its training information (such as the training duration), its corresponding workflow (also called a task; workflows are described later), its status (such as success or failure), and feature importance ranking information for the data.
In embodiments of the present invention, one model strategy may include multiple algorithms, so running the created business model can yield information on multiple business models. The modeling achievements may therefore include: information on the top M highest-scoring business models among the N business models that have finished running for the selected model strategy, or the modeling achievements of all N business models that have finished running for the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M. That is, one model strategy may produce multiple business models, and after the run is complete the information of some or all of them may be displayed.
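Selecting the top M of N finished business models by score amounts to a sort and a slice. The sketch below assumes each result is a `(name, score)` pair and that a higher score is better; the `top_models` helper and the scores are made up for illustration.

```python
def top_models(results, m):
    """Return the M highest-scoring (name, score) pairs, best first."""
    if m < 1 or m > len(results):
        raise ValueError("M must satisfy 1 <= M <= N")
    return sorted(results, key=lambda item: item[1], reverse=True)[:m]

# Hypothetical scores of N = 3 business models produced by one model strategy.
results = [("logistic regression", 0.81), ("random forest", 0.88), ("AdaBoost", 0.85)]
leaders = top_models(results, 2)
```

Passing `m=len(results)` returns all N models in score order, which covers the alternative of displaying every finished business model.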
In embodiments of the present invention, the modeling achievements of business models that have finished running may be displayed as a modeling achievement chart or a modeling achievement list. The modeling achievement chart compares the results of the different business models from each training of one automatic modeling, making it easy to see the better business models within each training. The modeling achievement list compares the results of all business models of one automatic modeling: the business models from the different trainings are all sorted together, making it easy to see the better business models across all trainings.
Referring to FIG. 4, FIG. 4 is a schematic diagram of the user interface of the modeling achievement list for business models that have finished running in one embodiment of the present invention. The user interface of the modeling achievement list displays the full model list, by default in reverse chronological order.
The displayed fields are as follows:
Check box: can be checked only when the status is success; once checked, the [Model Evaluation] button at the bottom becomes available;
Model name: displays the specific name (for automatic modeling, named as model name + timestamp; for a workflow, named after the output name of the source analysis module) and the source (named after the source analysis module); displays the warehoused state (smiley face); supports sorting; when the status is success, clicking the name opens the [Model Results details] page for that model;
Belongs to: displays the name of the task the model belongs to. Clicking it opens the [task details] page in a new window; supports sorting;
Creator: displays the creator's information; supports sorting;
Creation time: date + time; supports sorting;
Status: success, failure, loading, or -- (meaning that no evaluation module was found);
Training time: displays the training duration in the form *h*m*s; larger units are omitted when not needed, for example 59s;
Score metrics: configurable through the table; by default at most 6 are displayed at once;
Operations:
View result (eye icon): clicking it opens the [Model Results details] page; the [View result] button is shown only when the status is success;
View log (bookmark icon): clicking it opens the [log details] pop-up; the [View log] button is shown in every status.
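The training time field above follows a simple formatting rule, which can be sketched as below. The exact rule is an interpretation: this version drops the hour and minute units when they are zero at the leading end, so that 59 seconds prints as `59s`. The `format_duration` name is hypothetical.

```python
def format_duration(seconds):
    """Format a training duration as *h*m*s, omitting leading units that are zero."""
    if seconds < 0:
        raise ValueError("duration cannot be negative")
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h{minutes}m{secs}s"
    if minutes:
        return f"{minutes}m{secs}s"
    return f"{secs}s"
```

Under this rule `format_duration(125)` yields `2m5s` and `format_duration(3661)` yields `1h1m1s`.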
Referring to FIG. 5, FIG. 5 is a schematic diagram of the user interface of the modeling achievement chart for business models that have finished running in an embodiment of the present invention.
The right-hand content of the user interface, the task model display area, includes:
The full model list of the task selected on the left side of the user interface.
Task visualization area: displays visualization information for all models under the task, including model algorithm parameters, feature importance, training information, and so on. All model training content may be shown as a line chart (curve, polyline, etc.), and hovering the mouse over a node shows more information.
Model display area: displays the model's color identifier, model name, status identifier, warehousing status, champion mark, specific score, start time, operation items, and visualization chart; when the mouse hovers over a region, the region switches to the selected state, in one-to-one correspondence with the selected state in the task model list on the left;
Color identifier: the color identifier in front of the model name is consistent with the line chart in [task model visualization scoring] on the right; at most 13 different colors are assigned (the upper limit of supported algorithms).
Model name: displays the specific name, with the full name shown on hover; when the status is success, clicking the model name opens the [model details] page for that model in the current page; when the model has failed, the model name turns red and cannot be clicked; while the model is loading, it cannot be clicked after being selected; when the model has no evaluation module, the model name turns red and cannot be clicked after being selected.
Status identifier: loading, success (no icon shown), failure (no icon shown; the name turns red), evaluation module not found (no icon shown; the name turns red; applies only to evaluation comparison in a workflow).
Warehousing mark: if the model has been updated into the warehouse, the updated-to-warehouse mark is shown (the smiley face in the figure).
Champion mark: within the task, the champion mark (the trophy in the figure) is shown in front of the model with the best score. (This is affected by the score filter bar: the score changes according to what is selected in the score filter.)
Specific score: displays the score to at most three decimal places (affected by the score filter bar: the score changes according to what is selected in the score filter).
Start time: displays the task start time, as date + time.
Operation items:
View result (eye icon): clicking it opens the [Model Results details] page; the [View result] button is shown only when the status is success.
View log (bookmark icon): clicking it opens the [log details] pop-up; the [View log] button is shown in every status.
The left-hand content of the user interface, the task model list, includes:
1. Displays the list of tasks produced by all automatic modeling (workflow) training, loaded on pull-down;
2. By default, the task list is sorted top to bottom in reverse chronological order;
3. Clicking a specific [task name] opens the [task details] page for that task in a new window;
4. A task can be deleted. Deletion pops up a second confirmation prompt; after the deletion succeeds, all model content generated in the task is cleared and the associated task in the task list is deleted with it (likewise, deleting a task also deletes the associated automatic modeling content). Content in the model warehouse is not affected;
5. A task may contain multiple models, each shown with its color identifier, model name, status identifier, warehousing status, champion mark, and specific score;
Color identifier: the color identifier in front of the model name is consistent with the line chart in [task model visualization scoring] on the right; at most 13 different colors are assigned (the upper limit of supported algorithms);
Model name: for automatic modeling, named as model name + timestamp; for a workflow, named after the output name of the analysis module. Displays the specific name, with the full name shown on hover. Clicking the row of a model name selects that row, switches the right-hand content to the display area of the current task, and scrolls to the display position of that model. When the current row is selected and the status is success, the model name can be clicked, and clicking it opens the [model details] page for that model in the current page. When the model has failed, the model name turns red and cannot be clicked after being selected; while the model is loading, it cannot be clicked after being selected; when the model has no evaluation module, the model name turns red and cannot be clicked after being selected;
Status identifier: loading, success (no icon shown), failure (no icon shown; the name turns red), no result (no icon shown; the name turns red);
Warehousing mark: if the model has been updated into the warehouse, the updated-to-warehouse mark is shown (such as the smiley face in the figure);
Champion mark: within the task, the champion mark (such as the trophy in the figure) is shown in front of the model with the best score (affected by the score filter bar: the score changes according to what is selected in the score filter);
Specific score: displays the score to at most three decimal places (affected by the score filter bar: the score changes according to what is selected in the score filter).
The [Model Evaluation] button is placed at the bottom of the list and becomes available once one or more items are checked; clicking the button opens the [model evaluation] prompt window in the current page (only the check boxes of models with success status can be checked).
The modeling achievement chart and the modeling achievement list described above have the same substantive content (the comparison of model results). The chart better shows the relative quality of the models produced by one training (for example task 001 or task 002), while the list better shows the relative quality of the models produced by different trainings (all trainings).
Training usually means running the algorithm model iteratively, that is, running it more than once (the hyperparameters may include the number of iterations).
In embodiments of the present invention, a user interface for historical modeling achievements may also be provided, so that the user can conveniently view them.
In embodiments of the present invention, the automatic modeling method may further include: creating a first workflow corresponding to the created business model. The first workflow includes multiple workflow modules, which may be connected to one another; of two connected workflow modules, the output of one serves as the input of the other. For example, the workflow modules may include a data module, corresponding to the data set by the user, and an analysis module, corresponding to the algorithm in the model strategy. The first workflow cannot be edited or modified; it can only be viewed. That is, during automatic modeling the Data Analysis Services system simultaneously creates a task (the first workflow) at the bottom layer and automatically generates a name for the task, for example model name + timestamp. The user function permissions of the first workflow are the same as those of automatic modeling.
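The structure of the first workflow, connected modules in which one module's output is the next module's input, can be sketched as a small read-only pipeline. The `WorkflowModule` and `Workflow` classes, the linear chaining, and the sample name are assumptions for illustration; a frozen dataclass stands in for the view-only restriction.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

@dataclass(frozen=True)
class WorkflowModule:
    """One node of the workflow; `run` maps the previous output to this module's output."""
    name: str
    run: Callable[[Any], Any]

@dataclass(frozen=True)
class Workflow:
    """A linear chain of connected modules; frozen, so it can be viewed but not modified."""
    name: str
    modules: Tuple[WorkflowModule, ...]

    def execute(self, initial=None):
        value = initial
        for module in self.modules:
            value = module.run(value)  # output of one module feeds the next
        return value

# Hypothetical data module and analysis module.
data_module = WorkflowModule("data", lambda _: [3.0, 5.0, 7.0])
analysis_module = WorkflowModule("analysis", lambda rows: sum(rows) / len(rows))
first_workflow = Workflow("churn-model_20180619", (data_module, analysis_module))
result = first_workflow.execute()
```

An editable copy could be produced with `dataclasses.replace` into a non-frozen class, while the original stays read-only. Real workflows may also branch rather than form a single chain, which this sketch does not model.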
In embodiments of the present invention, after the step of creating the first workflow corresponding to the created business model, the method further includes: updating the first workflow upon detecting an operation to run the created business model, or upon detecting that the user adjusts the modeling design information.
That is, in the above embodiment, the first workflow is created automatically when the automatic modeling is created. Every run of the model is carried out in the first workflow, and the version of the first workflow is upgraded accordingly on each run; running the automatic modeling is running the workflow. Upon detecting that the user adjusts the modeling design information, the modeling design information is updated, and the first workflow is updated according to the updated modeling design information.
In embodiments of the present invention, after the step of creating the workflow corresponding to the created business model, the method further includes: upon detecting that the user performs an operation to create a second workflow with the same content as the first workflow, generating the second workflow. The newly created second workflow can be viewed, modified, edited, and so on, so that the automatically built model can be modified further and designs for complex scenes can be carried out.
In embodiments of the present invention, a workflow, namely a second workflow with the same content as the first workflow, may be created through the user interface for generating a data application, specifically through the following steps:
1. Generating a new data application (that is, a workflow): the modeling achievement interfaces (the modeling achievement chart and the modeling achievement list) contain a [Generate data application] button, through which the current workflow can be copied from the modeling achievement display.
2. Setting the data application name;
3. Description: by default, shows the content from the original automatic modeling;
4. Labels: by default, show the content from the original automatic modeling.
In embodiments of the present invention, referring to FIG. 6 and FIG. 7, the step of displaying the user interface includes: upon detecting that the user performs an operation to view the set data, displaying visualization information corresponding to the data. That is, after the user has set the data, the set data can be previewed. The visualization information may be, for example, a table or a chart, which makes it easier for the user to screen the data. In embodiments of the present invention, data content is divided into numeric values (integer, floating-point) and other non-numeric values, and numeric and non-numeric data may be displayed separately.
As mentioned in the above embodiments, information about the model strategy may be displayed before model training or after training for the user to view or modify. The user interfaces that display the model strategy information are described below.
(1) user interface of training set and test set
In the embodiment of the present invention, model training requires a training set and a test set. The default method of obtaining them is to split the current data. Other methods of obtaining the training set and test set include: splitting another data set, extracting training and test data from one data set, extracting training and test data from two data sets, and extracting training and test data from other data.
In the embodiment of the present invention, the method of obtaining the training set and test set for training the business model may include sampling and splitting. The sampling method may include: 1) no sampling (use all data); 2) original records; 3) randomly select X% of the rows; 4) randomly select N rows; 5) class-balanced N rows; 6) class-balanced X% of the rows; and so on. The splitting method may include: 1) random split; 2) enabling K-fold cross validation; 3) number of folds (when K-fold is enabled) or training data ratio (when it is not enabled); 4) random seed.
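The sampling and splitting options listed above can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the function names, the row representation and the rule used for class balancing (an equal share per class) are assumptions made only for illustration.

```python
import random
from collections import defaultdict

def sample_rows(rows, n=None, pct=None, seed=0):
    """Randomly select N rows or X% of the rows; with neither, use all data."""
    rows = list(rows)
    if n is None and pct is None:
        return rows
    k = n if n is not None else round(len(rows) * pct / 100)
    return random.Random(seed).sample(rows, min(k, len(rows)))

def class_balanced_sample(rows, label_of, n, seed=0):
    """Draw about n rows in total, the same number from each class (assumed rule)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    per_class = max(1, n // len(by_class))
    picked = []
    for members in by_class.values():
        picked.extend(rng.sample(members, min(per_class, len(members))))
    return picked

def random_split(rows, train_ratio=0.8, seed=0):
    """Random split into (training set, test set) controlled by a random seed."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed makes the sample and the split reproducible, which is presumably why a random seed appears among the splitting options.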
(2) User interface for setting and selecting feature engineering, including data processing and feature selection
(1) Data processing
1: Category-based data processing, including category handling and missing values. Selectable category handling methods include dummy encoding (vectorization); selectable missing-value handling methods include treating missing values as a regular value, filling, deleting rows, and so on.
2: Numeric data processing, including numeric handling and missing values. Selectable numeric handling methods include: keeping the column as a regular numerical feature (Keep as a regular numerical feature), binarization based on a given value, binning, and so on; selectable missing-value handling methods include filling, deleting rows, and so on.
3: Text-based data processing. Selectable processing methods include word segmentation and word frequency analysis.
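The category and numeric handling methods named above can be sketched as follows. This is only an illustration under stated assumptions: missing values are represented by None, binning is equal-width, and all function names are hypothetical.

```python
def dummy_encode(values):
    """Dummy encoding: one 0/1 indicator per category; a missing value gets all zeros."""
    categories = sorted(set(v for v in values if v is not None))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return vectors, categories

def fill_missing(values, fill_value):
    """Missing-value handling by filling."""
    return [fill_value if v is None else v for v in values]

def binarize(values, threshold):
    """Binarization based on a given value: 1 above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

def equal_width_bins(values, n_bins):
    """Binning: map each value to the index of its equal-width interval."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # avoid a zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]
```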
(2) Feature selection includes:
Optional feature selection methods include: mutual information, chi-square test, F-test, Pearson correlation coefficient, recursive feature elimination, model-based feature elimination, and so on; further, they may also include feature orthogonalization, principal component analysis of features, matrix decomposition, and so on. Based on the method selected by the user, the system performs feature selection automatically.
After automatic modeling, feature selection can subsequently be performed again based on the feature importance calculated by the model.
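As one illustration of the methods above, the Pearson correlation coefficient can be used to rank features against the target column and keep the top k. This is a sketch only; the function names and the choice of keeping a fixed number k of features are assumptions, not details given in the embodiment.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_top_k(features, target, k):
    """Keep the k feature names with the largest absolute correlation to the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```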
Specifically, the user can also select features directly in a customized way:
1: Features (i.e. variables) of different types are displayed differently according to the data type of each column (for example, divided into categorical and numeric);
2: Sorting by tab is supported: by data (the order in which field names are displayed in the data table), by name (field names in a-z, 0-9 order), by type (categorical first, then numeric) and by role (target column first, then enabled columns, then disabled columns). An enabled column is a feature that has been selected; a disabled column is a feature that has not been selected.
3: Data columns support multi-selection, one-click select all, and one-click clearing of the selection;
4: Search is supported;
5: When the user leaves the page and enters it again, the previous operations are retained;
6: The target column is displayed so that it is clearly distinguished from ordinary columns.
(3) User interface for setting and selecting algorithms and parameters
(1) Algorithm
1: An introduction can be displayed for every algorithm;
2: When an algorithm is run for the first time, parameters that have default values display those defaults; on later runs, the settings of the previous run are retained, and toggling the switch button does not affect this;
3: When the switch of an algorithm is off, none of the parameters of the algorithm can be adjusted, and this state is clearly distinguished in the display;
Algorithms include: (1) clustering: K-Means, affinity propagation, mean shift, spectral clustering, hierarchical clustering, DBSCAN (density-based clustering with noise), BIRCH (balanced iterative hierarchical clustering), etc.; (2) classification: random forest, gradient boosting tree, XGBoost, decision tree, k-nearest neighbors (KNN), extra random trees, neural network, logistic regression, support vector machine, stochastic gradient descent, etc.; (3) regression: random forest, gradient boosting tree, ridge regression, lasso regression, XGBoost, decision tree, k-nearest neighbors (KNN), extra random trees, neural network, lasso path, logistic regression, support vector machine, stochastic gradient descent, etc.
(2) User interface of parameters
Hyperparameter settings:
1: Hyperparameter search
1) Randomized grid search
● Whether to shuffle the original order
2) Maximum number of iterations
3) Maximum search time: only positive integers and floating-point numbers are accepted
4) Number of concurrent jobs: only positive integers and -1 are accepted
Here, a hyperparameter is a parameter whose value is set before the learning process starts, rather than a parameter obtained through training. Typically, hyperparameters need to be optimized and an optimal set selected in order to improve the performance and effect of learning.
Further, the system provides automatic tuning of hyperparameters. The selectable tuning methods include: (1) clustering: silhouette coefficient, homogeneity, completeness, V-measure, etc.; (2) classification: AUC score, accuracy, precision, recall, F1 score, log loss, etc.; (3) regression: R2 value, explained variance score, mean error, mean squared error, root mean squared error, root mean squared logarithmic error, mean absolute error, etc. Generally only one can be selected.
Note: the default hyperparameter settings are: "randomized": true; "nJobs": 1; "mode": "K-FOLD"; "nFolds": 5.
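The search settings above (randomized order, maximum number of iterations, maximum search time) can be sketched as a simple loop over a parameter grid. This is an illustrative sketch, not the system's tuner: the function name, the grid format and the scoring callback are assumptions, and the sketch evaluates candidates sequentially, as with "nJobs": 1.

```python
import itertools
import random
import time

def randomized_grid_search(grid, score_fn, max_iterations, max_seconds, seed=0):
    """Try grid combinations in shuffled order until a budget is exhausted."""
    combos = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
    random.Random(seed).shuffle(combos)            # corresponds to "randomized": true
    deadline = time.monotonic() + max_seconds      # maximum search time
    best_params, best_score = None, float("-inf")
    for params in combos[:max_iterations]:         # maximum number of iterations
        if time.monotonic() > deadline:
            break
        score = score_fn(params)                   # e.g. a cross-validated metric
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In practice `score_fn` would train the algorithm with the given hyperparameters and return the single tuning metric selected for the task, such as the AUC score for classification.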
The hyperparameter user interface includes the cross-validation settings:
1: Cross validation
1) Traditional mode: splits the data into a training set and a validation set; by default an input split ratio is supported, which may only be a positive integer or a floating-point number, and the default is 0.8
2) K-Fold: by default an input number of folds is supported, which may only be a positive integer, and the default value is 0
Specifically, the data may first be split into a training set and a test set (see the settings in the user interface of the training set and test set); in the cross-validation part, the training set is then further split into a training set and a validation set. The validation set is used for cross validation, and the test set is used for subsequent evaluation.
Note: usually not all of the data is used for training; instead, a part is set aside (the validation set, which does not take part in training) to test the parameters produced from the training set, so as to judge relatively objectively how well these parameters fit data outside the training set. This idea is called cross validation (Cross Validation).
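The K-Fold mode described above can be sketched as follows: the training data is cut into K consecutive folds, and each fold serves once as the validation set while the remaining folds are used for training. The function names and the callback signature are assumptions made for illustration.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx); every sample is in the validation set exactly once."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def cross_validate(n_samples, n_folds, fit_and_score):
    """Average the validation score returned by fit_and_score over all folds."""
    scores = [fit_and_score(train, val)
              for train, val in k_fold_indices(n_samples, n_folds)]
    return sum(scores) / len(scores)
```

The test set produced by the earlier split is never passed to `cross_validate`; it is kept aside for the subsequent evaluation.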
(4) User interface for setting and selecting the evaluation method
User interface of the evaluation method:
1: Different categories of algorithm have different model evaluation methods. The selection is single-choice, that is, one method serves as the core scoring criterion, while other related evaluation metrics can also be displayed.
Evaluation methods include: explained variance score, mean absolute error, mean squared error, R2 score, median absolute error, squared logarithmic error, F1 value, accuracy, precision, recall, AUC score, log loss, cost matrix, cumulative lift, FBeta score, silhouette coefficient, homogeneity, completeness, V-measure, etc. Among them, the evaluation methods for clustering algorithms include: silhouette coefficient, homogeneity, completeness, V-measure; the evaluation methods for multi-class classification algorithms include: F1 value, accuracy, precision, recall, AUC score, log loss, FBeta score; the evaluation methods for binary classification algorithms include: F1 value, accuracy, precision, recall, AUC score, log loss, cost matrix, cumulative lift, FBeta score; the evaluation methods for regression algorithms include: explained variance score, mean absolute error, mean squared error, R2 value, median absolute error, squared logarithmic error.
Note: the default evaluation methods are: binary classification: AUC score; multi-class classification: accuracy; regression: R2 value.
After any of the above items has been adjusted, the settings can be saved; the user then clicks "Train", views the results and saves them. The user can save a customized model strategy for later use or for use by other users.
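The default evaluation methods noted above (AUC score for binary classification, accuracy for multi-class classification, R2 value for regression) can be illustrated with a small sketch. The dictionary keys and function names are hypothetical, and the three metrics are written out directly rather than taken from any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Default core metric per task type, as stated in the note above.
DEFAULT_METRIC = {"binary": auc_score, "multiclass": accuracy, "regression": r2_score}

def evaluate(task, y_true, y_out):
    """Score model output with the default metric for the given task type."""
    return DEFAULT_METRIC[task](y_true, y_out)
```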
In the embodiment of the present invention, after a model has been trained it can be published. Only a model that reaches a certain standard can be published to the repository and brought online; that is, only content that meets a certain score threshold (evaluation criterion) can be published to the repository and run online. The model here refers to an automatically built model or a model newly created by generating a data application. Only models published to the model repository can be brought online, compared and iterated.
The user interface for publishing to the repository may include:
1. Clicking the [Publish to repository] button opens the [Publish to model repository] pop-up window on the current page;
2. The pop-up window includes the following: title, description, label;
Condition to be met: a metric drop-down box offering all evaluation methods supported by the model; a condition drop-down box offering greater than or equal to and less than or equal to; and an input box that accepts a numeric value greater than or equal to 0. For example, with the AUC score, the user chooses greater than or equal to or less than or equal to in the condition drop-down box and sets a value.
Automatic update: on or off. When it is on, models that meet the condition and have not yet been placed in the repository are updated into the model repository; the automatic update interval defaults to 24 hours;
Submit: clicking the button opens an update progress prompt box, in which the update progress of all eligible models can be viewed;
After submission, the [Publish to repository] button changes to [Published to repository] and can be configured. For the model repository, see Fig. 8; clicking "Online model monitoring" shows online model performance and resource usage. For viewing online model performance and resource usage, see Fig. 9 and Fig. 10.
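The publishing condition described above (an evaluation method, a comparison and a non-negative value) can be sketched as a simple filter. The record layout and function names below are hypothetical and only illustrate the check; they are not the interface of the described system.

```python
import operator

# The two comparisons offered by the condition drop-down box.
CONDITIONS = {">=": operator.ge, "<=": operator.le}

def meets_publish_condition(scores, metric, condition, threshold):
    """Return True when the model's score for the metric satisfies the condition."""
    if threshold < 0:
        raise ValueError("the threshold must be greater than or equal to 0")
    return CONDITIONS[condition](scores[metric], threshold)

def models_to_publish(models, metric, condition, threshold, already_published):
    """Names of models that meet the condition and are not yet in the repository."""
    return [m["name"] for m in models
            if m["name"] not in already_published
            and meets_publish_condition(m["scores"], metric, condition, threshold)]
```

With automatic update switched on, a scheduler could call `models_to_publish` once per update interval (24 hours by default) and publish whatever it returns.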
Referring to Fig. 8, the list of all online models can be displayed, sorted by default in reverse order of online time. The list displays the following fields: online model name, current container, real-time CPU, MEM and GPU usage, and, within a certain time range (a specific duration can be configured, in hours or days), the average/minimum/maximum response time, the number of calls and the success rate.
Clicking the online model name or the "Model details" button in the operations column opens the model achievement details page, where the specific information of the model can be browsed. Clicking the number of calls opens the call details page; see Fig. 10.
In Fig. 10, the details of the number of calls within a certain range can be displayed. The call situation of each province in the country is displayed by means of a visualized domestic map (different colors represent different levels of call counts); when the mouse hovers over a specific province, details are displayed, including the province name, ranking, number of calls and share. The details of every call can also be viewed in a details list, including call time, response time, call type, call method, access status, province and source.
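The monitoring fields described above (average/minimum/maximum response time, number of calls, success rate, and the per-province ranking and share) can all be derived from the per-call detail records. The following sketch assumes a hypothetical record layout with `response_ms`, `success` and `province` keys; it is an illustration, not the monitoring implementation of the embodiment.

```python
from collections import Counter

def summarize_calls(calls):
    """Aggregate call records of one online model into the list-view fields."""
    if not calls:
        return {"calls": 0, "success_rate": None,
                "avg_ms": None, "min_ms": None, "max_ms": None}
    times = [c["response_ms"] for c in calls]
    succeeded = sum(1 for c in calls if c["success"])
    return {"calls": len(calls), "success_rate": succeeded / len(calls),
            "avg_ms": sum(times) / len(times),
            "min_ms": min(times), "max_ms": max(times)}

def province_ranking(calls):
    """Rank provinces by number of calls and compute each province's share."""
    counts = Counter(c["province"] for c in calls)
    total = sum(counts.values())
    return [{"rank": rank, "province": name, "calls": n, "share": n / total}
            for rank, (name, n) in enumerate(counts.most_common(), start=1)]
```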
The list and details of published model achievements may also include the following:
1. A model that has been published to the model repository must pass a review mechanism before it can be brought online;
2. Importing models from local storage (optionally in batches) is supported, and all modeling achievement lists are displayed;
3. Batch re-evaluation of models is supported, and the evaluation results can be viewed;
4. Iterative model deployment is supported: a successfully deployed model can replace the online model and become the online model (each model can have only one online version at a time; by default at most three successfully deployed models can be waiting to go online, so once this upper limit is reached, a newly deployed model has to replace one of the existing waiting slots);
5. When a model is brought online, the system checks whether feature value assignment, resource configuration and debug mode configuration have been carried out; if not, the resource configuration page is entered to carry out the relevant configuration; if so, this step is skipped;
6. Clicking an entry in the model achievement list shows the basic information of the model (name, algorithm type, training time, training duration, columns and rows, configuration information, and the data analysis module to which it belongs);
7. The API interface information and API key of the model are displayed; the interface can be debugged in three debug modes, REST, message queue and file system (NFS), but only an online model can be called through the interface;
8. Feature value assignment, resource value configuration and debug mode configuration information can be viewed;
9. The importance of the feature variables and the parameter information of the model evaluation metrics are displayed;
10. More intuitive graphical representations of the model evaluation results, such as the ROC curve and the confusion matrix, are provided;
11. The algorithm parameter information of the model, the training data information and the training details are displayed.
The list of published model achievements differs from the list of modeling achievements: the list of published model achievements includes performance (the situation after the model goes online, whether calls succeed, resource usage, etc.).
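The deployment rule in item 4 above (one online model, at most three successfully deployed models waiting by default) can be sketched as a small state holder. The class and method names are hypothetical, and the sketch assumes that, once the waiting limit is reached, the caller must name the waiting version to be replaced; the text does not specify how the replaced slot is chosen.

```python
class ModelDeployment:
    """Tracks the single online version and the versions waiting to go online."""
    MAX_WAITING = 3   # default upper limit stated in item 4

    def __init__(self):
        self.online = None
        self.waiting = []

    def deploy(self, version, replace=None):
        """Add a successfully deployed version to the waiting list."""
        if len(self.waiting) < self.MAX_WAITING:
            self.waiting.append(version)
        elif replace in self.waiting:
            # Limit reached: the new version takes over an existing waiting slot.
            self.waiting[self.waiting.index(replace)] = version
        else:
            raise ValueError("waiting list is full; name a waiting version to replace")

    def go_online(self, version):
        """Promote a waiting version; it replaces the current online model."""
        self.waiting.remove(version)
        self.online = version
```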
In the embodiment of the present invention, a user interface for model re-evaluation can also be provided. In this user interface, the following can be performed:
1. Select the evaluation criterion: the options are all evaluation criteria;
2. Select data: the titles and descriptions of all data modules for which the user has (read) permission are displayed, sorted in A-Z, 0-9 order; clicking [Preview] opens the [Data preview] page on the current page; filtering by keywords in the title and description is supported;
3. Click [Submit] to re-evaluate with the selected evaluation method and data; the evaluation achievement list pops up.
Both models generated by automatic modeling and published models can be re-evaluated. Model re-evaluation evaluates the model with new data; if the evaluation result does not meet the current business requirements, modeling design, model training and so on are carried out again.
Referring to Fig. 11, an embodiment of the present invention also provides a Data Analysis Services system, comprising:
a display module 1101, configured to display a user interface, the user interface being used by the user to set the scene and data for creating a business model;
a processing module 1102, configured to obtain the scene and/or data set by the user in the user interface; select a model strategy from a plurality of model strategies according to the obtained scene and/or data; and create a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method of the algorithm.
Preferably, the model strategy further includes at least one of the following: the evaluation method of the algorithm, the parameter setting method of the algorithm, the splitting method of the data, the processing method of the data and the feature selection method of the data.
Preferably, the user interface is also used by the user to set a target feature for creating the business model.
Preferably, the display module 1101 is configured to display a scene list in the user interface for the user to select from, and, when an operation in which the user selects a scene in the scene list is detected, display the selected scene in the user interface;
or
the display module 1101 is configured to display a scene input area in the user interface; when an operation in which the user inputs a scene in the input area is detected, obtain the scene input by the user; and display, in the user interface, the scenes in the scene list that match the scene input by the user.
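The second alternative above, matching the scene typed by the user against the scene list, can be sketched as follows. The matching rule is not specified in the text; this sketch assumes a case-insensitive substring match, and the scene names used in the usage note are invented examples.

```python
def match_scenes(user_input, scene_list):
    """Return the scenes whose name contains the user's input (assumed matching rule)."""
    query = user_input.strip().lower()
    if not query:
        return []
    return [scene for scene in scene_list if query in scene.lower()]
```

For a hypothetical scene list such as `["Customer segmentation", "Customer churn prediction", "Sales forecasting"]`, the input `customer` would bring up the first two entries for the user to choose from.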
Preferably, the scene includes at least one of the following: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection and a scene corresponding to language processing.
Preferably, when the scene is a scene corresponding to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, restricted Boltzmann machine. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, silhouette coefficient method;
when the scene is a scene corresponding to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, stacking model. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, area under the curve (AUC) score method;
when the scene is a scene corresponding to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, support vector regression, neural network. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, R2 value method;
when the scene is a scene corresponding to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbor, isolation forest. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, F1 score method;
when the scene is a scene corresponding to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, conditional random field. The parameter tuning method of the algorithm includes: providing default parameters according to the result of word frequency analysis, and using the default parameters.
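The correspondence between scene types and model strategies set out above can be pictured as a lookup table. The sketch below is illustrative only: the dictionary keys and field names are hypothetical, and separating the search methods from the scoring metric is an interpretation of the listed tuning methods rather than a structure stated in the text.

```python
SEARCH = ["random parameter search", "grid parameter search"]

# Scene type -> candidate algorithms and tuning information, as listed above.
MODEL_STRATEGIES = {
    "clustering": {
        "algorithms": ["hierarchical clustering", "Bayesian Gaussian mixture",
                       "KD tree", "restricted Boltzmann machine"],
        "search": SEARCH, "metric": "silhouette coefficient"},
    "classification": {
        "algorithms": ["logistic regression", "random forest", "Bagging",
                       "AdaBoost", "neural network", "stacking model"],
        "search": SEARCH, "metric": "AUC score"},
    "regression": {
        "algorithms": ["logistic regression", "random forest",
                       "support vector regression", "neural network"],
        "search": SEARCH, "metric": "R2 value"},
    "anomaly detection": {
        "algorithms": ["neural network", "support vector machine", "robust regression",
                       "nearest neighbor", "isolation forest"],
        "search": SEARCH, "metric": "F1 score"},
    "language processing": {
        "algorithms": ["latent semantic indexing", "latent Dirichlet allocation",
                       "conditional random field"],
        "search": [], "metric": None,   # default parameters from word frequency analysis
        "tuning": "default parameters from word frequency analysis"},
}

def select_model_strategy(scene_type):
    """Select the model strategy for the scene type set by the user."""
    try:
        return MODEL_STRATEGIES[scene_type]
    except KeyError:
        raise ValueError(f"no model strategy for scene type: {scene_type!r}") from None
```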
Preferably, the display module is also configured to display the modeling design information of the created business model, the modeling design information including at least the information of the selected model strategy.
Preferably, the Data Analysis Services system further comprises:
a first adjustment module, configured to update the modeling design information when an operation in which the user adjusts the modeling design information is detected;
a first running module, configured to run the created business model according to the updated modeling design information when an operation in which the user runs the created business model is detected.
Preferably, the Data Analysis Services system further comprises:
a second running module, configured to run the created business model using the selected model strategy when an operation in which the user runs the created business model is detected.
Preferably,
the display module is also configured to display the modeling achievement of the business model that has finished running, the modeling achievement including at least one of: the title of the business model that has finished running, the score of the business model that has finished running and the output result of the business model that has finished running.
Preferably, the modeling achievement further includes: the information of the model strategy of the business model that has finished running, the creation time of the business model that has finished running, the training information of the business model that has finished running, the workflow corresponding to the business model that has finished running, the state of the business model that has finished running, and the importance ranking information of the features of the data.
Preferably, the modeling achievement includes: the information of the top M business models with the highest scores among the N business models that have finished running under the selected model strategy, or the information of all N business models that have finished running under the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
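Selecting the top M of the N finished business models by score, or all N of them, can be sketched in a few lines. The record layout with `title` and `score` keys is a hypothetical choice made for the example.

```python
def top_m_models(results, m=None):
    """Return the M highest-scoring finished models, or all N when m is None."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    if m is None:
        return ranked
    if not 1 <= m <= len(ranked):
        raise ValueError("M must satisfy 1 <= M <= N")
    return ranked[:m]
```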
Preferably, the display module is also configured to display the modeling design information of the business model that has finished running, the modeling design information including at least the information of the selected model strategy.
Preferably, the Data Analysis Services system further comprises:
a second adjustment module, configured to update the modeling design information when an operation in which the user adjusts the modeling design information is detected;
a third running module, configured to rerun the business model that has finished running according to the updated modeling design information when an operation in which the user reruns the business model that has finished running is detected.
Preferably, the modeling design information further includes: the scene and/or the target feature.
Preferably, the Data Analysis Services system further comprises:
a creation module, configured to create a first workflow corresponding to the created business model, the first workflow including a plurality of workflow modules.
Preferably, the Data Analysis Services system further comprises:
an update module, configured to update the first workflow when an operation of running the created business model is detected or when an operation in which the user adjusts the modeling design information is detected.
Preferably, the Data Analysis Services system further comprises:
a replication module, configured to generate a second workflow when an operation in which the user newly creates a second workflow with the same content as the first workflow is detected, the second workflow being editable.
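The behavior of the creation, update and replication modules above can be sketched with a small data class: each run upgrades the version of the first workflow, design changes are written into it, and a second workflow is an independent, editable copy. The class, its fields and its method names are hypothetical and serve only as an illustration.

```python
import copy
from dataclasses import dataclass, field

@dataclass
class Workflow:
    title: str
    design: dict                                   # modeling design information
    modules: list = field(default_factory=list)    # workflow modules
    version: int = 1

    def on_run(self):
        """Each model run upgrades the workflow version."""
        self.version += 1

    def on_design_change(self, **changes):
        """Update the workflow from the adjusted modeling design information."""
        self.design.update(changes)

    def duplicate(self, title):
        """Create a second workflow with the same content that can be edited freely."""
        return Workflow(title, copy.deepcopy(self.design), copy.deepcopy(self.modules))
```

Using deep copies keeps later edits to the second workflow from leaking back into the first workflow that automatic modeling continues to run in.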
Preferably, the Data Analysis Services system further comprises:
a visualization module, configured to display visual information corresponding to the data when an operation in which the user views the set data is detected.
Preferably, the Data Analysis Services system further comprises:
a publishing module, configured to publish the business model that has finished running when an operation in which the user publishes the business model that has finished running is detected.
Preferably, the Data Analysis Services system further comprises:
a re-evaluation module, configured to re-evaluate the business model that has finished running or the published business model when an operation in which the user re-evaluates the business model that has finished running or the published business model is detected.
Referring to Fig. 12, Fig. 12 is a structural schematic diagram of the Data Analysis Services system of a further embodiment of the present invention. The Data Analysis Services system 120 includes a processor 1201 and a memory 1202. In the embodiment of the present invention, the Data Analysis Services system 120 further includes a computer program that is stored in the memory 1202 and can run on the processor 1201; when the computer program is executed by the processor 1201, the following steps are implemented:
displaying a user interface, the user interface being used by the user to set the scene and data for creating a business model;
obtaining the scene and/or data set by the user in the user interface, selecting a model strategy from a plurality of model strategies according to the obtained scene and/or data, and creating a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method of the algorithm.
The processor 1201 is responsible for managing the bus architecture and general processing, and the memory 1202 can store the data used by the processor 1201 when performing operations.
Preferably, the model strategy further includes at least one of the following: the evaluation method of the algorithm, the parameter setting method of the algorithm, the splitting method of the data, the processing method of the data and the feature selection method of the data.
Preferably, the user interface is also used by the user to set a target feature for creating the business model.
Preferably, when the computer program is executed by the processor 1201, the following steps can also be implemented: displaying a scene list in the user interface for the user to select from;
when an operation in which the user selects a scene in the scene list is detected, displaying the selected scene in the user interface;
or
displaying a scene input area in the user interface;
when an operation in which the user inputs a scene in the input area is detected, obtaining the scene input by the user;
displaying, in the user interface, the scenes in the scene list that match the scene input by the user.
Preferably, the scene includes at least one of the following: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection and a scene corresponding to language processing.
Preferably, when the scene is a scene corresponding to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, restricted Boltzmann machine. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, silhouette coefficient method;
when the scene is a scene corresponding to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, stacking model. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, area under the curve (AUC) score method;
when the scene is a scene corresponding to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, support vector regression, neural network. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, R2 value method;
when the scene is a scene corresponding to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbor, isolation forest. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, F1 score method;
when the scene is a scene corresponding to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, conditional random field. The parameter tuning method of the algorithm includes: providing default parameters according to the result of word frequency analysis, and using the default parameters.
Preferably, following steps be can also be achieved when computer program is executed by processor 1201: described from multiple model plans After the step of selecting a model strategy in slightly, further includes:
The Modeling and Design information for the business model that display creation is completed, the Modeling and Design information include at least: selection The information of model strategy.
Preferably, can also be achieved following steps when computer program is executed by processor 1201: the display creation is completed Business model Modeling and Design information the step of after, further includes:
When detecting that user adjusts the operation of the Modeling and Design information, the Modeling and Design information is updated;
When detecting that user executes the operation for running the business model that the creation is completed, according to update Modeling and Design information runs the creation business model that the creation is completed.
Preferably, following steps be can also be achieved when computer program is executed by processor 1201: described from multiple model plans After the step of selecting a model strategy in slightly, further includes:
When detecting that user executes the operation for running the business model that creation is completed, using the model plan of selection Slightly, the business model that the creation is completed is run.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
Displaying the modeling results of the business model that has been run, the modeling results including at least one of: the name of the business model that has been run, the score of the business model that has been run, and the output result of the business model that has been run.
Preferably, the modeling results further include: information of the model strategy of the business model that has been run, the creation time of the business model that has been run, the training information of the business model that has been run, the workflow corresponding to the business model that has been run, the state of the business model that has been run, and importance-ranking information for the features of the data.
Preferably, the modeling results include: information of the top M highest-scoring business models among the N run business models corresponding to the selected model strategy, or information of all N run business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
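The top-M selection described above amounts to ranking the N run models by score. A minimal sketch follows, assuming each result is a dictionary with hypothetical `name` and `score` keys (the patent does not specify a data layout):

```python
import heapq


def top_models(results, m=None):
    """Return the m highest-scoring results, or all N when m is None.

    `results` is a list of dicts with illustrative keys "name" and "score".
    """
    n = len(results)
    if m is None:
        # Show all N run models, best first.
        return sorted(results, key=lambda r: r["score"], reverse=True)
    if not 1 <= m <= n:
        # The text requires 1 <= M <= N.
        raise ValueError("m must satisfy 1 <= m <= len(results)")
    return heapq.nlargest(m, results, key=lambda r: r["score"])
```

Calling `top_models(results, 2)` on three results returns the two with the highest scores in descending order; omitting `m` returns all of them.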
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
Displaying the modeling design information of the business model that has been run, the modeling design information including at least: information of the selected model strategy;
When an operation in which the user adjusts the modeling design information is detected, updating the modeling design information;
When an operation in which the user reruns the business model that has been run is detected, rerunning that business model according to the updated modeling design information.
Preferably, the modeling design information further includes: the scene and/or the target feature.
Preferably, when the computer program is executed by the processor 1201, the following step may also be implemented: creating a first workflow corresponding to the created business model, the first workflow including multiple workflow modules.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of creating the first workflow corresponding to the created business model, the method further includes:
When an operation of running the created business model is detected, or when an operation in which the user adjusts information of the business model that has been run is detected, updating the first workflow.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of creating the first workflow corresponding to the created business model, the method further includes:
When an operation in which the user creates a second workflow with the same content as the first workflow is detected, generating the second workflow, the second workflow being editable.
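One plausible way to generate an editable second workflow with the same content as the first is a deep copy, so that edits to the copy do not affect the original. The sketch below is an assumption-laden illustration: the `Workflow` class, its `modules` and `editable` fields, and the default of a non-editable first workflow are hypothetical and are not stated in the patent text.

```python
import copy
from dataclasses import dataclass


@dataclass
class Workflow:
    # Ordered workflow modules; each module is an illustrative dict.
    modules: list
    # Assumption: the first workflow is not directly editable by default.
    editable: bool = False


def clone_workflow(first):
    """Create a second workflow with identical content that can be edited."""
    second = copy.deepcopy(first)  # independent copy of every module
    second.editable = True
    return second
```

Because the copy is deep, renaming or reconfiguring a module in the second workflow leaves the modules of the first workflow unchanged.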
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of displaying the user interface, the method further includes:
When an operation in which the user views the set data is detected, displaying visualization information corresponding to the data.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
When an operation in which the user publishes the business model that has been run is detected, publishing that business model.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
When an operation in which the user re-evaluates the business model that has been run or the published business model is detected, re-evaluating the business model that has been run or the published business model.
An embodiment of the present invention also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program implements each process of the above automatic modeling method embodiments and achieves the same technical effects; to avoid repetition, details are not repeated here. The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. An automatic modeling method for a data analysis processing system, characterized by comprising:
Displaying a user interface, the user interface being used by a user to set a scene and data for creating a business model;
Obtaining the scene and/or data set by the user in the user interface, selecting a model strategy from multiple model strategies according to the obtained scene and/or data, and creating a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method for the algorithm.
2. The automatic modeling method according to claim 1, characterized in that the model strategy further includes at least one of the following information: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
3. The automatic modeling method according to claim 1, characterized in that the scene includes at least one of: a scene corresponding to clustering algorithms, a scene corresponding to classification algorithms, a scene corresponding to regression algorithms, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
4. The automatic modeling method according to claim 1, characterized in that, after the step of creating a business model according to the selected model strategy, the method further comprises:
Displaying the modeling design information of the created business model, the modeling design information including at least: information of the selected model strategy.
5. The automatic modeling method according to claim 1, characterized in that, after the step of creating a business model according to the selected model strategy, the method further comprises:
When an operation in which the user runs the created business model is detected, running the created business model using the selected model strategy.
6. A data analysis processing system, characterized by comprising:
A display module, configured to display a user interface, the user interface being used by a user to set a scene and data for creating a business model;
A processing module, configured to obtain the scene and/or data set by the user in the user interface, select a model strategy from multiple model strategies according to the obtained scene and/or data, and create a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method for the algorithm.
7. The data analysis processing system according to claim 6, characterized in that the model strategy further includes at least one of the following information: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
8. The data analysis processing system according to claim 6, characterized in that the scene includes at least one of: a scene corresponding to clustering algorithms, a scene corresponding to classification algorithms, a scene corresponding to regression algorithms, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
9. The data analysis processing system according to claim 6, characterized in that
The display module is further configured to display the modeling design information of the created business model, the modeling design information including at least: information of the selected model strategy.
10. The data analysis processing system according to claim 6, characterized by further comprising:
A second running module, configured to run the created business model using the selected model strategy when an operation in which the user runs the created business model is detected.
CN201810632499.6A 2018-06-19 2018-06-19 A kind of Data Analysis Services system and method for automatic modeling Pending CN109389143A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810632499.6A CN109389143A (en) 2018-06-19 2018-06-19 A kind of Data Analysis Services system and method for automatic modeling
CN202111299347.7A CN113935434A (en) 2018-06-19 2018-06-19 Data analysis processing system and automatic modeling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810632499.6A CN109389143A (en) 2018-06-19 2018-06-19 A kind of Data Analysis Services system and method for automatic modeling

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202111299347.7A Division CN113935434A (en) 2018-06-19 2018-06-19 Data analysis processing system and automatic modeling method

Publications (1)

Publication Number Publication Date
CN109389143A true CN109389143A (en) 2019-02-26

Family

ID=65416532

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810632499.6A Pending CN109389143A (en) 2018-06-19 2018-06-19 A kind of Data Analysis Services system and method for automatic modeling
CN202111299347.7A Pending CN113935434A (en) 2018-06-19 2018-06-19 Data analysis processing system and automatic modeling method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202111299347.7A Pending CN113935434A (en) 2018-06-19 2018-06-19 Data analysis processing system and automatic modeling method

Country Status (1)

Country Link
CN (2) CN109389143A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083637A (en) * 2019-04-23 2019-08-02 华东理工大学 A kind of denoising method towards bridge defect ratings data
CN110135064A (en) * 2019-05-15 2019-08-16 上海交通大学 A kind of generator rear bearing temperature fault prediction technique, system and controller
CN110222710A (en) * 2019-04-30 2019-09-10 北京深演智能科技股份有限公司 Data processing method, device and storage medium
CN110334955A (en) * 2019-07-08 2019-10-15 北京字节跳动网络技术有限公司 Processing method, device, equipment and the storage medium of index evaluation
CN110443126A (en) * 2019-06-27 2019-11-12 平安科技(深圳)有限公司 Model hyper parameter adjusts control method, device, computer equipment and storage medium
CN110705312A (en) * 2019-09-30 2020-01-17 贵州航天云网科技有限公司 Development system for rapidly developing industrial mechanism model based on semantic analysis
CN110717535A (en) * 2019-09-30 2020-01-21 北京九章云极科技有限公司 Automatic modeling method and system based on data analysis processing system
CN110766167A (en) * 2019-10-29 2020-02-07 深圳前海微众银行股份有限公司 Interactive feature selection method, device and readable storage medium
CN110807044A (en) * 2019-10-30 2020-02-18 东莞市盟大塑化科技有限公司 Model dimension management method based on artificial intelligence technology
CN110956272A (en) * 2019-11-01 2020-04-03 第四范式(北京)技术有限公司 Method and system for realizing data processing
CN111242358A (en) * 2020-01-07 2020-06-05 杭州策知通科技有限公司 Enterprise information loss prediction method with double-layer structure
CN111724185A (en) * 2019-03-21 2020-09-29 北京沃东天骏信息技术有限公司 User maintenance method and device
CN111784040A (en) * 2020-06-28 2020-10-16 平安医疗健康管理股份有限公司 Optimization method and device for policy simulation analysis and computer equipment
CN112380216A (en) * 2020-11-17 2021-02-19 北京融七牛信息技术有限公司 Automatic feature generation method based on intersection
CN112577955A (en) * 2020-11-23 2021-03-30 淮阴师范学院 Water bloom water body detection method and system
CN112633754A (en) * 2020-12-30 2021-04-09 国网新疆电力有限公司信息通信公司 Modeling method and system of data analysis model
CN112884092A (en) * 2021-04-28 2021-06-01 深圳索信达数据技术有限公司 AI model generation method, electronic device, and storage medium
CN113010946A (en) * 2021-02-26 2021-06-22 万翼科技有限公司 Data analysis method, electronic equipment and related product
CN113010226A (en) * 2021-03-16 2021-06-22 北京云从科技有限公司 Model loading method, system, electronic device and medium
CN113239025A (en) * 2021-04-23 2021-08-10 四川大学 Ship track classification method based on feature selection and hyper-parameter optimization
CN113282461A (en) * 2021-05-28 2021-08-20 中国联合网络通信集团有限公司 Alarm identification method and device for transmission network
CN113449471A (en) * 2021-06-25 2021-09-28 东北电力大学 Wind power output simulation generation method for continuously improving MC (multi-channel) by utilizing AP (access point) clustering-skipping
CN113822327A (en) * 2021-07-31 2021-12-21 云南电网有限责任公司信息中心 Algorithm recommendation method based on data characteristics and analytic hierarchy process
CN114117050A (en) * 2021-11-30 2022-03-01 济南农村商业银行股份有限公司 Full-automatic accounting flow popup window processing method, device and system

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610204B (en) * 2022-03-14 2024-03-26 中国农业银行股份有限公司 Auxiliary device and method for data processing, storage medium and electronic equipment
CN115455135B (en) * 2022-06-30 2023-10-31 北京九章云极科技有限公司 Visual automatic modeling method and device, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101169798A (en) * 2007-12-06 2008-04-30 中国电信股份有限公司 Data excavation system and method
CN104850405A (en) * 2015-05-25 2015-08-19 武汉众联信息技术股份有限公司 Intelligent configurable workflow engine and implementation method therefor
CN105095436A (en) * 2015-07-23 2015-11-25 苏州国云数据科技有限公司 Automatic modeling method for data of data sources
CN106164945A (en) * 2014-04-11 2016-11-23 微软技术许可有限责任公司 Sight modeling and visualization
CN106250987A (en) * 2016-07-22 2016-12-21 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform
CN106997386A (en) * 2017-03-28 2017-08-01 上海跬智信息技术有限公司 A kind of OLAP precomputations model, method for automatic modeling and automatic modeling system
CN107038167A (en) * 2016-02-03 2017-08-11 普华诚信信息技术有限公司 Big data excavating analysis system and its analysis method based on model evaluation
CN107103050A (en) * 2017-03-31 2017-08-29 海通安恒(大连)大数据科技有限公司 A kind of big data Modeling Platform and method
CN107958268A (en) * 2017-11-22 2018-04-24 用友金融信息技术股份有限公司 The training method and device of a kind of data model

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724185A (en) * 2019-03-21 2020-09-29 北京沃东天骏信息技术有限公司 User maintenance method and device
CN110083637A (en) * 2019-04-23 2019-08-02 华东理工大学 A kind of denoising method towards bridge defect ratings data
CN110083637B (en) * 2019-04-23 2023-04-18 华东理工大学 Bridge disease rating data-oriented denoising method
CN110222710A (en) * 2019-04-30 2019-09-10 北京深演智能科技股份有限公司 Data processing method, device and storage medium
CN110222710B (en) * 2019-04-30 2022-03-08 北京深演智能科技股份有限公司 Data processing method, device and storage medium
CN110135064A (en) * 2019-05-15 2019-08-16 上海交通大学 A kind of generator rear bearing temperature fault prediction technique, system and controller
CN110135064B (en) * 2019-05-15 2023-07-18 上海交通大学 Method, system and controller for predicting temperature faults of rear bearing of generator
CN110443126A (en) * 2019-06-27 2019-11-12 平安科技(深圳)有限公司 Model hyper parameter adjusts control method, device, computer equipment and storage medium
WO2020258508A1 (en) * 2019-06-27 2020-12-30 平安科技(深圳)有限公司 Model hyper-parameter adjustment and control method and apparatus, computer device, and storage medium
CN110334955B (en) * 2019-07-08 2021-09-14 北京字节跳动网络技术有限公司 Index evaluation processing method, device, equipment and storage medium
CN110334955A (en) * 2019-07-08 2019-10-15 北京字节跳动网络技术有限公司 Processing method, device, equipment and the storage medium of index evaluation
CN110717535B (en) * 2019-09-30 2020-09-11 北京九章云极科技有限公司 Automatic modeling method and system based on data analysis processing system
CN110717535A (en) * 2019-09-30 2020-01-21 北京九章云极科技有限公司 Automatic modeling method and system based on data analysis processing system
CN110705312A (en) * 2019-09-30 2020-01-17 贵州航天云网科技有限公司 Development system for rapidly developing industrial mechanism model based on semantic analysis
CN110766167A (en) * 2019-10-29 2020-02-07 深圳前海微众银行股份有限公司 Interactive feature selection method, device and readable storage medium
CN110807044A (en) * 2019-10-30 2020-02-18 东莞市盟大塑化科技有限公司 Model dimension management method based on artificial intelligence technology
CN110956272A (en) * 2019-11-01 2020-04-03 第四范式(北京)技术有限公司 Method and system for realizing data processing
CN110956272B (en) * 2019-11-01 2023-08-08 第四范式(北京)技术有限公司 Method and system for realizing data processing
CN111242358A (en) * 2020-01-07 2020-06-05 杭州策知通科技有限公司 Enterprise information loss prediction method with double-layer structure
CN111784040B (en) * 2020-06-28 2023-04-25 平安医疗健康管理股份有限公司 Optimization method and device for policy simulation analysis and computer equipment
CN111784040A (en) * 2020-06-28 2020-10-16 平安医疗健康管理股份有限公司 Optimization method and device for policy simulation analysis and computer equipment
CN112380216A (en) * 2020-11-17 2021-02-19 北京融七牛信息技术有限公司 Automatic feature generation method based on intersection
CN112577955A (en) * 2020-11-23 2021-03-30 淮阴师范学院 Water bloom water body detection method and system
CN112633754A (en) * 2020-12-30 2021-04-09 国网新疆电力有限公司信息通信公司 Modeling method and system of data analysis model
CN113010946A (en) * 2021-02-26 2021-06-22 万翼科技有限公司 Data analysis method, electronic equipment and related product
CN113010946B (en) * 2021-02-26 2024-01-23 深圳市万翼数字技术有限公司 Data analysis method, electronic equipment and related products
CN113010226A (en) * 2021-03-16 2021-06-22 北京云从科技有限公司 Model loading method, system, electronic device and medium
CN113239025A (en) * 2021-04-23 2021-08-10 四川大学 Ship track classification method based on feature selection and hyper-parameter optimization
CN112884092B (en) * 2021-04-28 2021-11-02 深圳索信达数据技术有限公司 AI model generation method, electronic device, and storage medium
CN112884092A (en) * 2021-04-28 2021-06-01 深圳索信达数据技术有限公司 AI model generation method, electronic device, and storage medium
CN113282461A (en) * 2021-05-28 2021-08-20 中国联合网络通信集团有限公司 Alarm identification method and device for transmission network
CN113282461B (en) * 2021-05-28 2023-06-23 中国联合网络通信集团有限公司 Alarm identification method and device for transmission network
CN113449471A (en) * 2021-06-25 2021-09-28 东北电力大学 Wind power output simulation generation method for continuously improving MC (multi-channel) by utilizing AP (access point) clustering-skipping
CN113822327A (en) * 2021-07-31 2021-12-21 云南电网有限责任公司信息中心 Algorithm recommendation method based on data characteristics and analytic hierarchy process
CN114117050A (en) * 2021-11-30 2022-03-01 济南农村商业银行股份有限公司 Full-automatic accounting flow popup window processing method, device and system

Also Published As

Publication number Publication date
CN113935434A (en) 2022-01-14

Similar Documents

Publication Publication Date Title
CN109389143A (en) A kind of Data Analysis Services system and method for automatic modeling
Oussalah et al. Forecasting weekly crude oil using Twitter sentiment of US foreign policy and oil companies data
CN113537807B (en) Intelligent wind control method and equipment for enterprises
CN112417176A (en) Graph feature-based method, device and medium for mining implicit association relation between enterprises
JP2000339351A (en) System for identifying selectively related database record
Imran et al. Mining the productivity data of the garment industry
Tounsi et al. CSMAS: Improving multi-agent credit scoring system by integrating big data and the new generation of gradient boosting algorithms
Mott Case-based reasoning: Market, applications, and fit with other technologies
Said et al. New model for making resilient decisions in an uncertain context: the rational resilience-based decision-making model (R2DM)
Quah Estimating software readiness using predictive models
Lv et al. Detecting fraudulent bank account based on convolutional neural network with heterogeneous data
Jeyaraman et al. Practical Machine Learning with R: Define, build, and evaluate machine learning models for real-world applications
US20210356920A1 (en) Information processing apparatus, information processing method, and program
Khramov Robotic and machine learning: how to help support to process customer tickets more effectively
EP2453395A1 (en) Method and system to analyze processes
US20240152818A1 (en) Methods for mitigation of algorithmic bias discrimination, proxy discrimination and disparate impact
US20220374401A1 (en) Determining domain and matching algorithms for data systems
Schreck et al. The AI project manager
CN114519073A (en) Product configuration recommendation method and system based on atlas relation mining
Gnoss et al. XAI in the audit domain-explaining an autoencoder model for anomaly detection
Deshpande et al. How much data analytics is enough? the roi of machine learning classification and its application to requirements dependency classification
Chang Software risk modeling by clustering project metrics
US20200342302A1 (en) Cognitive forecasting
CN110458383A (en) Demand handles implementation method, device and the computer equipment of serviceization, storage medium
Schäfer et al. Clustering-Based Subgroup Detection for Automated Fairness Analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190226