CN109389143A - Data analysis and processing system and automatic modeling method - Google Patents
- Publication number
- CN109389143A CN201810632499.6A
- Authority
- CN
- China
- Prior art keywords
- model
- scene
- algorithm
- data
- business model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/285—Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
Abstract
The present invention provides a data analysis and processing system and an automatic modeling method. The method comprises: displaying a user interface through which a user sets a scenario and data for creating a business model; obtaining the scenario and/or data set by the user in the user interface; and, according to the obtained scenario and/or data, selecting one model strategy from a plurality of model strategies and creating the business model according to the selected model strategy, wherein the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm. In the present invention, the model strategy can be selected automatically according to the scenario and/or data set by the user, so the user does not need to select a model strategy, which raises the degree of automation of the data analysis and processing system and improves the user experience.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a data analysis and processing system and an automatic modeling method.
Background art
At present, a data analysis and processing system mainly trains a business model as follows: the data for training the business model is exported from a database to a local machine; a modeler, using a third-party modeling tool, selects a model strategy according to the business requirements and trains the business model, debugging it manually again and again during training until optimized model parameters are obtained, which yields the trained business model.
This training approach has major drawbacks: the training process is complex, the degree of automation is low, and it is not suitable for non-professional users.
Summary of the invention
In view of this, the present invention provides a data analysis and processing system and an automatic modeling method, to solve the problem that the model training process of existing data analysis and processing systems is complex and poorly automated.
To solve the above technical problem, the present invention provides an automatic modeling method for a data analysis and processing system, comprising:
Displaying a user interface, the user interface being used by a user to set a scenario and data for creating a business model;
Obtaining the scenario and/or data set by the user in the user interface, selecting one model strategy from a plurality of model strategies according to the obtained scenario and/or data, and creating the business model according to the selected model strategy, wherein the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm.
Preferably, the model strategy further includes at least one of the following: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
Preferably, the user interface is further used by the user to set a target feature for creating the business model.
Preferably, the step of displaying the user interface comprises:
Displaying a scenario list in the user interface for the user to choose from;
When an operation of the user selecting a scenario in the scenario list is detected, displaying the selected scenario in the user interface;
Or
Displaying a scenario input area in the user interface;
When an operation of the user entering a scenario in the input area is detected, obtaining the scenario entered by the user;
Displaying, in the user interface, the scenario in the scenario list that matches the scenario entered by the user.
Preferably, the scenario includes at least one of: a scenario corresponding to a clustering algorithm, a scenario corresponding to a classification algorithm, a scenario corresponding to a regression algorithm, a scenario corresponding to anomaly detection, and a scenario corresponding to language processing.
Preferably, when the scenario corresponds to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, and restricted Boltzmann machine; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the silhouette coefficient method;
When the scenario corresponds to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the area under the curve (AUC) score method;
When the scenario corresponds to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: logistic regression, random forest, support vector regression, and neural network; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the R² value method;
When the scenario corresponds to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbors, and isolation forest; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the F1 score method;
When the scenario corresponds to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, and conditional random field; the parameter tuning method comprises: providing default parameters according to the result of word frequency analysis, and using the default parameters.
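As a hedged illustration of the random parameter search and grid parameter search named above, the following pure-Python sketch tunes two hyperparameters against a scoring function. The names `grid_search`, `random_search`, `toy_score`, `depth` and `rate` are hypothetical and do not come from the patent; `toy_score` merely stands in for a validation metric such as the AUC, R² or F1 score.

```python
import itertools
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def grid_search(space: Dict[str, List[float]],
                score: Callable[[Params], float]) -> Tuple[Params, float]:
    """Evaluate every combination in `space` and keep the best-scoring one."""
    names = list(space)
    best_params: Params = {}
    best_score = float("-inf")
    for values in itertools.product(*(space[n] for n in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def random_search(space: Dict[str, List[float]],
                  score: Callable[[Params], float],
                  n_iter: int, seed: int = 0) -> Tuple[Params, float]:
    """Evaluate `n_iter` randomly drawn combinations and keep the best one."""
    rng = random.Random(seed)
    best_params: Params = {}
    best_score = float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(params: Params) -> float:
    """Hypothetical stand-in for a validation metric; peaks at depth=4, rate=0.1."""
    return -((params["depth"] - 4) ** 2) - (params["rate"] - 0.1) ** 2


space = {"depth": [2, 4, 8], "rate": [0.01, 0.1, 1.0]}
best, best_value = grid_search(space, toy_score)
```

Grid search visits all nine combinations here, while random search trades completeness for a fixed evaluation budget, which matters once the search space grows.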
Preferably, after the step of creating the business model according to the selected model strategy, the method further comprises:
Displaying modeling design information of the created business model, the modeling design information including at least: information of the selected model strategy.
Preferably, after the step of displaying the modeling design information of the created business model, the method further comprises:
When an operation of the user adjusting the modeling design information is detected, updating the modeling design information;
When an operation of the user running the created business model is detected, running the created business model according to the updated modeling design information.
Preferably, after the step of creating the business model according to the selected model strategy, the method further comprises:
When an operation of the user running the created business model is detected, running the created business model using the selected model strategy.
Preferably, after the step of running the created business model, the method further comprises:
Displaying a modeling result of the business model that has finished running, the modeling result including at least one of: the name of the business model that has finished running, the score of that business model, and the output of that business model.
Preferably, the modeling result further includes: information of the model strategy of the business model that has finished running, the creation time of that business model, its training information, its corresponding workflow, its status, and feature importance ranking information for the data.
Preferably, the modeling result includes: information of the top M highest-scoring business models among the N finished business models corresponding to the selected model strategy, or information of all N finished business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
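The top-M selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: `RunResult` and `top_m` are hypothetical names, and the scores are made-up values, not results from the patent.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass
class RunResult:
    """Hypothetical record for one business model that has finished running."""
    name: str
    score: float


def top_m(results: List[RunResult], m: int) -> List[RunResult]:
    """Return the M highest-scoring of the N finished models, best first."""
    if m < 1:
        raise ValueError("M must be a positive integer >= 1")
    if m > len(results):
        raise ValueError("N must be >= M")
    return heapq.nlargest(m, results, key=lambda r: r.score)


# Illustrative values only.
finished = [RunResult("model-a", 0.70),
            RunResult("model-b", 0.90),
            RunResult("model-c", 0.80)]
leaders = top_m(finished, 2)
```

Passing `m = len(finished)` covers the alternative in which information of all N finished models is shown.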
Preferably, after the step of running the created business model, the method further comprises:
Displaying modeling design information of the business model that has finished running, the modeling design information including at least: information of the selected model strategy;
When an operation of the user adjusting the modeling design information is detected, updating the modeling design information;
When an operation of the user re-running the business model that has finished running is detected, re-running that business model according to the updated modeling design information.
Preferably, the modeling design information further includes: the scenario and/or the target feature.
Preferably, after the step of creating the business model according to the selected model strategy, the method further comprises:
Creating a first workflow corresponding to the created business model, the first workflow including a plurality of workflow modules.
Preferably, after the step of creating the first workflow corresponding to the created business model, the method further comprises:
When an operation of running the created business model is detected, or an operation of the user adjusting the modeling design information is detected, updating the first workflow.
Preferably, after the step of creating the first workflow corresponding to the created business model, the method further comprises:
When an operation of the user creating a second workflow with the same content as the first workflow is detected, generating the second workflow, the second workflow being editable.
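One way to picture the second workflow is as a deep copy of the first that the user may then edit. The sketch below is an assumption-laden illustration: `Workflow`, `WorkflowModule`, `duplicate_workflow` and the `editable` flag are hypothetical, and treating the first workflow as non-editable by default is an assumption rather than something the text states.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorkflowModule:
    """Hypothetical workflow module with free-form settings."""
    name: str
    settings: Dict[str, str] = field(default_factory=dict)


@dataclass
class Workflow:
    name: str
    modules: List[WorkflowModule]
    editable: bool = False  # assumption: the auto-generated workflow is system-managed


def duplicate_workflow(first: Workflow, new_name: str) -> Workflow:
    """Create a second workflow with the same content, marked editable."""
    second = copy.deepcopy(first)  # deep copy so later edits never touch `first`
    second.name = new_name
    second.editable = True
    return second


first = Workflow("auto-model", [WorkflowModule("split", {"ratio": "0.8"}),
                                WorkflowModule("train")])
second = duplicate_workflow(first, "auto-model-copy")
second.modules[0].settings["ratio"] = "0.7"  # edit only the copy
```

A deep copy is used so that editing a module in the second workflow leaves the first workflow, which the system keeps updating, unchanged.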
Preferably, after the step of displaying the user interface, the method further comprises:
When an operation of the user viewing the set data is detected, displaying visual information corresponding to the data.
Preferably, after the step of running the created business model, the method further comprises:
When an operation of the user publishing the business model that has finished running is detected, publishing that business model.
Preferably, after the step of running the created business model, the method further comprises:
When an operation of the user re-evaluating the business model that has finished running, or a published business model, is detected, re-evaluating the business model that has finished running or the published business model.
The present invention also provides a data analysis and processing system, comprising:
A display module, configured to display a user interface, the user interface being used by a user to set a scenario and data for creating a business model;
A processing module, configured to obtain the scenario and/or data set by the user in the user interface, select one model strategy from a plurality of model strategies according to the obtained scenario and/or data, and create the business model according to the selected model strategy, wherein the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm.
Preferably, the model strategy further includes at least one of the following: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
Preferably, the user interface is further used by the user to set a target feature for creating the business model.
Preferably, the display module is configured to display a scenario list in the user interface for the user to choose from, and, when an operation of the user selecting a scenario in the scenario list is detected, to display the selected scenario in the user interface;
Or
The display module is configured to display a scenario input area in the user interface; when an operation of the user entering a scenario in the input area is detected, to obtain the scenario entered by the user; and to display, in the user interface, the scenario in the scenario list that matches the scenario entered by the user.
Preferably, the scenario includes at least one of: a scenario corresponding to a clustering algorithm, a scenario corresponding to a classification algorithm, a scenario corresponding to a regression algorithm, a scenario corresponding to anomaly detection, and a scenario corresponding to language processing.
Preferably, when the scenario corresponds to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, and restricted Boltzmann machine; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the silhouette coefficient method;
When the scenario corresponds to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the area under the curve (AUC) score method;
When the scenario corresponds to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: logistic regression, random forest, support vector regression, and neural network; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the R² value method;
When the scenario corresponds to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbors, and isolation forest; the parameter tuning method is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the F1 score method;
When the scenario corresponds to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method for the algorithm; the algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, and conditional random field; the parameter tuning method comprises: providing default parameters according to the result of word frequency analysis, and using the default parameters.
Preferably, the display module is further configured to display modeling design information of the created business model, the modeling design information including at least: information of the selected model strategy.
Preferably, the data analysis and processing system further comprises:
A first adjustment module, configured to update the modeling design information when an operation of the user adjusting the modeling design information is detected;
A first running module, configured to run the created business model according to the updated modeling design information when an operation of the user running the created business model is detected.
Preferably, the data analysis and processing system further comprises:
A second running module, configured to run the created business model using the selected model strategy when an operation of the user running the created business model is detected.
Preferably, the display module is further configured to display a modeling result of the business model that has finished running, the modeling result including at least one of: the name of the business model that has finished running, the score of that business model, and the output of that business model.
Preferably, the modeling result further includes: information of the model strategy of the business model that has finished running, the creation time of that business model, its training information, its corresponding workflow, its status, and feature importance ranking information for the data.
Preferably, the modeling result includes: information of the top M highest-scoring business models among the N finished business models corresponding to the selected model strategy, or information of all N finished business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
Preferably, the display module is further configured to display modeling design information of the business model that has finished running, the modeling design information including at least: information of the selected model strategy;
A second adjustment module, configured to update the modeling design information when an operation of the user adjusting the modeling design information is detected;
A third running module, configured to re-run the business model that has finished running according to the updated modeling design information when an operation of the user re-running that business model is detected.
Preferably, the modeling design information further includes: the scenario and/or the target feature.
Preferably, the data analysis and processing system further comprises:
A creation module, configured to create a first workflow corresponding to the created business model, the first workflow including a plurality of workflow modules.
Preferably, the data analysis and processing system further comprises:
An update module, configured to update the first workflow when an operation of running the created business model is detected, or an operation of the user adjusting the modeling design information is detected.
Preferably, the data analysis and processing system further comprises:
A copy module, configured to generate a second workflow when an operation of the user creating a second workflow with the same content as the first workflow is detected, the second workflow being editable.
Preferably, the data analysis and processing system further comprises:
A visualization module, configured to display visual information corresponding to the data when an operation of the user viewing the set data is detected.
Preferably, the data analysis and processing system further comprises:
A publishing module, configured to publish the business model that has finished running when an operation of the user publishing that business model is detected.
Preferably, the data analysis and processing system further comprises:
A re-evaluation module, configured to re-evaluate the business model that has finished running, or a published business model, when an operation of the user re-evaluating that business model is detected.
The present invention also provides a data analysis and processing system, comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the above automatic modeling method.
The present invention also provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the above automatic modeling method.
The beneficial effects of the above technical solutions of the present invention are as follows:
In the embodiments of the present invention, the data analysis and processing system can automatically select a model strategy according to the scenario and/or data set by the user, so the user does not need to select a model strategy, which raises the degree of automation of the data analysis and processing system and improves the user experience.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the automatic modeling method of the data analysis and processing system according to Embodiment 1 of the present invention;
Fig. 2 is a schematic diagram of the user interface for automatic modeling according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the user interface for viewing the information of a model strategy according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the user interface of the modeling result list according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the user interface of the modeling result chart according to an embodiment of the present invention;
Fig. 6 and Fig. 7 are schematic diagrams of the user interface for viewing data according to an embodiment of the present invention;
Fig. 8 is a schematic diagram of the user interface of the model repository according to an embodiment of the present invention;
Fig. 9 and Fig. 10 are schematic diagrams of the user interface for online model performance and resource usage according to an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of the data analysis and processing system according to one embodiment of the present invention;
Fig. 12 is a schematic structural diagram of the data analysis and processing system according to another embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the described embodiments fall within the protection scope of the present invention.
Referring to Fig. 1, which is a schematic flowchart of the automatic modeling method of the data analysis and processing system according to Embodiment 1 of the present invention, the automatic modeling method comprises:
Step 11: displaying a user interface, the user interface being used by a user to set a scenario and data for creating a business model;
Referring to Fig. 2, which is a schematic diagram of the user interface for automatic modeling of the data analysis and processing system according to an embodiment of the present invention, the user interface for automatic modeling includes a "select scenario" input box and a "select data module" (i.e. data) input box. The user can set the scenario for creating the business model in the "select scenario" input box, and can set the data for creating the business model in the "select data module" input box (a data module is a module for storing data). In this embodiment, preferably, the data modules displayed in the user interface are those that the user has permission to select; along with each data module, its description can also be displayed in the user interface.
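The permission check described above can be sketched as a simple filter. This is only an illustration: `DataModule`, `allowed_users` and `selectable_modules` are hypothetical names, and the text does not say how permissions are actually stored.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class DataModule:
    """Hypothetical data module: a named store of data with a description."""
    name: str
    description: str
    allowed_users: Set[str]  # assumption: permissions kept as a set of user names


def selectable_modules(modules: List[DataModule], user: str) -> List[Tuple[str, str]]:
    """Return (name, description) pairs for modules the user may select."""
    return [(m.name, m.description) for m in modules if user in m.allowed_users]


modules = [
    DataModule("card_transactions", "Credit card transaction records", {"alice", "bob"}),
    DataModule("network_alarms", "Network alarm logs", {"bob"}),
]
visible_to_alice = selectable_modules(modules, "alice")
```

Returning the description alongside the name mirrors the interface, which shows each data module together with its description.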
Step 12: obtaining the scenario and/or data set by the user in the user interface, selecting one model strategy from a plurality of model strategies according to the obtained scenario and/or data, and creating the business model according to the selected model strategy, wherein the model strategy includes at least the following information: an algorithm and a parameter tuning method for the algorithm.
A model strategy includes at least the algorithm of the business model and the parameter tuning method for that algorithm, and the algorithm of the business model can be trained based on the information of the model strategy. In some preferred embodiments of the present invention, the model strategy may further include at least one of the following: an evaluation method for the algorithm, a parameter setting method for the algorithm, a splitting method for the data, a processing method for the data, and a feature selection method for the data.
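The content of a model strategy as described above can be pictured as a small record with two required fields and five optional ones. This is a minimal sketch; the class name `ModelStrategy`, its field names and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelStrategy:
    """Hypothetical record of the information carried by one model strategy."""
    # Required: every model strategy names an algorithm and how to tune it.
    algorithm: str
    tuning_method: str
    # Optional: present only in the preferred embodiments.
    evaluation_method: Optional[str] = None
    parameter_setting_method: Optional[str] = None
    data_split_method: Optional[str] = None
    data_processing_method: Optional[str] = None
    feature_selection_method: Optional[str] = None


# Illustrative values only.
strategy = ModelStrategy(
    algorithm="random forest",
    tuning_method="grid parameter search",
    evaluation_method="AUC score",
)
```

Making the two core fields mandatory means an incomplete strategy cannot be constructed, which matches the "at least" wording of the text.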
It can be understood that, in this embodiment, multiple model strategies need to be stored in the data analysis and processing system in advance.
In this embodiment, the data analysis and processing system can automatically select a model strategy according to the scenario and/or data set by the user, so the user does not need to select a model strategy, which raises the degree of automation of the data analysis and processing system and improves the user experience.
In this embodiment, the type of the algorithm of the business model may include at least one of: a clustering algorithm, a classification algorithm, a regression algorithm, an anomaly detection algorithm, and a language processing algorithm. Correspondingly, the scenario may include at least one of: a scenario corresponding to a clustering algorithm, a scenario corresponding to a classification algorithm, a scenario corresponding to a regression algorithm, a scenario corresponding to anomaly detection, and a scenario corresponding to language processing.
For example, scenarios corresponding to a clustering algorithm may include cardholder grouping (i.e. analyzing which categories credit card customers fall into) and network domain grouping (i.e. analyzing the relationship between network alarm logs and devices, and clustering the network alarm logs by device). Scenarios corresponding to a classification algorithm may include customer churn prediction and financial product recommendation prediction. Scenarios corresponding to a regression algorithm may include insurance claim amount prediction and cash provision. Scenarios corresponding to anomaly detection may include fraud and abnormal transactions. Scenarios corresponding to language processing may include latent semantic analysis and word frequency analysis.
In this embodiment, the scenario set by the user may be a broad type, i.e. what is selected in the user interface is a clustering algorithm, a classification algorithm, and so on; it may also be a narrow type, for example a scenario that carries a business objective, such as cardholder grouping or customer churn prediction selected in the user interface. Of course, in some other embodiments of the present invention, the user interface may offer only broad types or only narrow types; the present invention does not limit this.
In this embodiment, preferably, the scenario refers to the business scenario for creating the business model, and the scenario is related to the type of the algorithm of the business model.
In this embodiment, when selecting a model strategy, the scenario set by the user may be analyzed to obtain the corresponding model strategy; of course, the type of the data set by the user may be analyzed instead to obtain the corresponding model strategy, or the scenario and the data may be analyzed together to obtain the corresponding model strategy.
In this embodiment, preferably, the data analysis and processing system may store a correspondence between scenarios and/or data and model strategies, so that the model strategy is selected according to that correspondence. Of course, in some other embodiments of the present invention, the stored correspondence may instead be between scenarios and/or data and the information of model strategies, and the data analysis and processing system can determine the model strategy according to the correspondence between scenarios and/or data and the information of model strategies.
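A stored correspondence of this kind can be sketched as two lookup tables: one from a narrow business scenario to its broad algorithm type, and one from the broad type to a model strategy. The table contents below are illustrative picks from the scenarios and algorithms named in the text; the names `SCENARIO_TO_TYPE`, `STRATEGY_BY_TYPE` and `select_strategy` are hypothetical, and the actual system may store many more entries and also key on the data.

```python
from typing import Dict

# Narrow scenario -> broad scenario type (illustrative subset).
SCENARIO_TO_TYPE: Dict[str, str] = {
    "cardholder grouping": "clustering",
    "customer churn prediction": "classification",
    "insurance claim amount prediction": "regression",
    "fraud": "anomaly detection",
    "word frequency analysis": "language processing",
}

# Broad scenario type -> model strategy information (one illustrative pick each).
STRATEGY_BY_TYPE: Dict[str, Dict[str, str]] = {
    "clustering": {"algorithm": "hierarchical clustering",
                   "tuning": "silhouette coefficient method"},
    "classification": {"algorithm": "random forest",
                       "tuning": "grid parameter search"},
    "regression": {"algorithm": "support vector regression",
                   "tuning": "random parameter search"},
    "anomaly detection": {"algorithm": "isolation forest",
                          "tuning": "F1 score method"},
    "language processing": {"algorithm": "latent Dirichlet allocation",
                            "tuning": "default parameters from word frequency analysis"},
}


def select_strategy(scenario: str) -> Dict[str, str]:
    """Look up a model strategy for a narrow scenario or a broad type."""
    scenario_type = SCENARIO_TO_TYPE.get(scenario, scenario)
    try:
        return STRATEGY_BY_TYPE[scenario_type]
    except KeyError:
        raise ValueError(f"no model strategy stored for {scenario!r}") from None


churn_strategy = select_strategy("customer churn prediction")
```

Because an unknown narrow scenario falls through to the broad-type table, the same function accepts either kind of selection from the user interface.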
In this embodiment, the scenario and the data may affect each other. Preferably, the scenarios available for selection may differ according to the data, and the data available for selection may differ according to the scenario; differences in data include differences in data type, data granularity, selectable target columns, and so on.
In some embodiments of the present invention, the step of displaying the user interface may include:
Step 111: displaying a scene list in the user interface for the user to select from.
Step 112: when an operation of the user selecting a scene in the scene list is detected, displaying the selected scene in the user interface;
In some other embodiments of the present invention, the step of displaying the user interface may also include:
Step 111': displaying a scene input area in the user interface; the scene input area may be a text input box or a voice input button;
Step 112': when an operation of the user inputting a scene in the input area is detected, obtaining the scene input by the user;
Step 113': displaying, in the user interface, the scene in the scene list that matches the scene input by the user.
Specifically, the Data Analysis Services system may perform semantic understanding on the scene entered in the input area, automatically identify the scene, and determine from the scene list the scene that matches the identified scene.
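Step 113' can be sketched as a lookup of the user's text in the stored scene list. The patent speaks of semantic understanding; the sketch below swaps in plain string similarity from Python's standard `difflib` module as a stand-in, and the scene names, the `cutoff` value and the function name `match_scene` are assumptions made for illustration.

```python
import difflib
from typing import List, Optional

# Hypothetical scene list stored in the system.
SCENE_LIST = [
    "card holder grouping",
    "customer churn prediction",
    "insurance claim amount prediction",
    "fraud detection",
    "latent semantic analysis",
]


def match_scene(user_input: str,
                scenes: List[str] = SCENE_LIST,
                cutoff: float = 0.6) -> Optional[str]:
    """Return the stored scene most similar to the input, or None."""
    normalized = user_input.strip().lower()
    # String similarity only; a real system would use semantic matching.
    matches = difflib.get_close_matches(normalized, scenes, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Returning `None` for unmatched input lets the interface fall back to showing the full scene list.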
It will be appreciated that the scene list needs to be stored in the Data Analysis Services system, and that the scene list contains at least one scene (typically more than one, for example, 80).
Referring to FIG. 2, in the embodiment of the present invention, in addition to setting the scene and the data, the user may also choose to set a target feature (i.e., the target column in FIG. 2), and the model strategy is determined according to the target feature. For example, the target column in customer churn prediction is the churn label column. In the embodiment of the present invention, one column may be selected as the target column; of course, multiple columns may also be selected.
That is, the user interface is also used for the user to set the target feature for creating the business model.
Preferably, the steps of obtaining the scene and/or data set by the user in the user interface and selecting one model strategy from multiple model strategies according to the obtained scene and/or data include: obtaining the scene, data and/or target feature set by the user in the user interface, and selecting one model strategy from multiple model strategies according to the obtained scene, data and/or target feature.
That is, the target feature can be used to select the model strategy; in addition, it can also be used while training the business model, for example, in algorithm evaluation.
In addition, referring to FIG. 2, the user may also set the name of the business model (i.e., the automatic modeling name in FIG. 2) in the automatic modeling user interface; the user may also set a description of the business model, labels for the business model, and the like.
The correspondence among scene, data, target feature and model strategy is illustrated below by example.
1. Scenes corresponding to clustering algorithms: card holder grouping (which categories credit card customers fall into), network grouping (the relationship between network alarm logs and devices; network alarm logs are clustered by device), and the like.
Scene: card holder grouping. Data: credit card customer information (for example, the credit card customer information of a certain bank within a fixed period, such as 1 year).
Model strategy 1: Data processing: data cleansing and/or data normalization. Feature engineering: feature selection by principal component analysis. Algorithm (selected based on the spatial characteristics of the clustering space): the algorithm includes at least one of hierarchical clustering, Bayesian Gaussian mixture, KDTree (K-D tree) and restricted Boltzmann machine. Parameter optimization method of the algorithm: carried out based on hyperparameter optimization, the method of which includes at least one of a random parameter search method, a grid parameter search method and a silhouette coefficient method (for example, to decide how many clusters to form). Specifically, hyperparameters are selected by the random parameter search method and/or the grid parameter search method, for example, the optimal set of hyperparameters is selected from a parameter list, with the silhouette coefficient used as the evaluation index for the hyperparameters. Evaluation: algorithm evaluation is carried out based on the silhouette coefficient, homogeneity, completeness and/or V-measure. Every algorithm receives the same evaluation, the result of every algorithm is retained, and further analysis is carried out in combination with the credit card business.
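To make the evaluation index of model strategy 1 concrete, the silhouette coefficient of a point is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster. The sketch below is a plain-Python illustration, not the patent's implementation; the toy points and the two candidate labelings are invented for the example.

```python
import math
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs >= 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is excluded by len(own) - 1).
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(mean(math.dist(point, other) for other in members)
                for other_label, members in clusters.items()
                if other_label != label)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


# Toy data: two obvious groups on a line, and two candidate clusterings.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
candidate_labelings = {
    "split by value": [0, 0, 1, 1],
    "alternating": [0, 1, 0, 1],
}
best = max(candidate_labelings,
           key=lambda name: silhouette_score(points, candidate_labelings[name]))
```

Here `best` is the labeling with the higher silhouette coefficient, which is how a candidate set of hyperparameters (for example, the number of clusters) could be ranked.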
2. Scenes corresponding to classification algorithms: customer churn prediction, financial product recommendation prediction, and the like.
Scene: customer churn prediction. Data: customer information (for example, the customer information of a certain bank within a fixed period, such as 1 year). Target column: churned/not churned.
Model strategy 2: Data processing: data cleansing and/or data normalization. Feature engineering: feature selection by chi-square test, Pearson correlation coefficient method, extremely randomized trees feature selection and/or recursive feature elimination, and the like. Algorithm (some algorithms are selected under each characteristic, based on the characteristics of the different algorithms): the algorithm includes at least one of logistic regression, random forest, Bagging, AdaBoost, neural network and stacking model. Parameter optimization method of the algorithm: carried out based on hyperparameter optimization, the method of which includes at least one of a random parameter search method, a grid parameter search method and an area under the curve (Area Under the Curve, AUC) score method. Specifically, hyperparameters are selected by the random parameter search method and/or the grid parameter search method, for example, the optimal set of hyperparameters is selected from a parameter list, with the AUC score used as the evaluation index for the hyperparameters. Evaluation: algorithm evaluation is carried out based on the AUC score, accuracy, precision, recall, F1 score and/or log loss. Every algorithm receives the same evaluation, the optimal algorithm is selected, and the churn prediction probability value of each customer is output.
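Selecting "the optimal set of hyperparameters from a parameter list" with AUC as the index can be sketched as follows. This is an illustration under assumptions: the parameter `w`, the toy rows and the toy scoring rule stand in for a real model, and the helper names are hypothetical. AUC is computed as the fraction of (positive, negative) pairs that are ranked correctly, with ties counted as one half.

```python
import itertools
import random


def auc_score(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def candidates(param_grid):
    """Expand {'name': [values]} into a list of parameter dictionaries."""
    names = sorted(param_grid)
    return [dict(zip(names, values))
            for values in itertools.product(*(param_grid[n] for n in names))]


def grid_search(param_grid, evaluate):
    """Try every combination and keep the best-scoring one."""
    return max(candidates(param_grid), key=evaluate)


def random_search(param_grid, evaluate, n_iter, seed=0):
    """Try n_iter randomly sampled combinations and keep the best one."""
    pool = candidates(param_grid)
    sampled = random.Random(seed).sample(pool, min(n_iter, len(pool)))
    return max(sampled, key=evaluate)


# Toy rows: (informative feature, noisy feature, churn label).
ROWS = [(0.9, 0.1, 1), (0.6, 0.9, 1), (0.4, 0.9, 0), (0.1, 0.2, 0)]


def evaluate(params):
    """Toy 'model': a weighted sum of the two features, scored by AUC."""
    w = params["w"]
    scores = [w * x1 + (1 - w) * x2 for x1, x2, _ in ROWS]
    return auc_score([y for _, _, y in ROWS], scores)


best_grid = grid_search({"w": [0.0, 0.5, 1.0]}, evaluate)
best_random = random_search({"w": [0.0, 0.5, 1.0]}, evaluate, n_iter=3)
```

Grid search is exhaustive and suits short parameter lists; random search trades completeness for a fixed budget of `n_iter` trials.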
3. Scenes corresponding to regression algorithms: insurance claim amount prediction, cash provision, and the like.
Scene: insurance claim amount prediction. Data: customer information of a certain insurance company (for example, the customer information of a certain bank within a fixed period, such as 1 year). Target column: claim amount.
Model strategy 3: Data processing: data cleansing and/or data normalization. Feature engineering: feature selection by chi-square test, Pearson correlation coefficient method, extremely randomized trees feature selection and/or recursive feature elimination, and the like. Algorithm (some algorithms are selected under each characteristic, based on the characteristics of the different algorithms): the algorithm includes at least one of logistic regression, random forest, support vector regression (SVR) and neural network. Parameter optimization method of the algorithm: carried out based on hyperparameter optimization, the method of which includes at least one of a random parameter search method, a grid parameter search method and an R2 value method. Specifically, hyperparameters are selected by the random parameter search method and/or the grid parameter search method, for example, the optimal set of hyperparameters is selected from a parameter list, with the R2 value used as the evaluation index for the hyperparameters. Evaluation: algorithm evaluation is carried out based on the explained variance score, mean absolute error, mean squared error, R2 value, median absolute error and/or mean squared logarithmic error. Every algorithm receives the same evaluation, the optimal algorithm is selected, and the predicted insurance claim amount is output.
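For reference, the R2 value used as the evaluation index in model strategy 3 is 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. The sketch below is a minimal plain-Python illustration with invented claim amounts; the function names are chosen for the example only.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination; undefined when y_true is constant."""
    y_mean = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_squared_error(y_true, y_pred):
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Toy claim amounts: a perfect prediction and a "predict the mean" baseline.
actual = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]
baseline = [2.0, 2.0, 2.0]
```

A perfect prediction scores 1, and always predicting the mean scores 0, which gives a natural baseline when ranking hyperparameter candidates.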
4. Scenes corresponding to abnormality detection, more specifically, for example, fraud, abnormal transactions, and the like.
Scene: abnormality detection. Data: transaction information of a certain industry (for example, the transaction information of a certain industry within a fixed period). A target column may be provided: abnormal/not abnormal.
Model strategy 4: Data processing: data cleansing and/or data normalization. Feature engineering: feature imbalance processing (generally all features are used, and feature imbalance processing is carried out). Algorithm (some algorithms are selected for abnormality detection): the algorithm includes at least one of neural network, support vector machine, robust regression, nearest neighbors and Isolation Forest. Parameter optimization method of the algorithm: carried out based on hyperparameter optimization, the method of which includes at least one of a random parameter search method, a grid parameter search method and an F1 score method. Specifically, hyperparameters are selected by the random parameter search method and/or the grid parameter search method, for example, the optimal set of hyperparameters is selected from a parameter list, with the F1 score used as the evaluation index for the hyperparameters. Evaluation: algorithm evaluation is carried out based on the AUC score, accuracy, precision, recall, F1 score and/or log loss. Every algorithm receives the same evaluation, the optimal algorithm is selected, and the abnormality prediction probability value of each transaction is output.
5. Scenes corresponding to language processing, more specifically, for example, latent semantic analysis, word frequency analysis, and the like.
Scene: latent semantic analysis. Data: the corresponding text information (for example, summary information, log information, search terms).
Model strategy 5: Data processing: word segmentation and/or word frequency analysis. Algorithm (some algorithms are selected for language processing): the algorithm includes at least one of latent semantic indexing, latent Dirichlet allocation and conditional random field. Parameter optimization method of the algorithm: default parameters are provided according to the result of the word frequency analysis, and the default parameters are used. Further clustering: dimensionality reduction is performed using at least one of locally linear embedding, spectral embedding, multidimensional scaling and local tangent space alignment (selected based on the spatial characteristics of the manifold space), followed by clustering with K-MEANS. Further analysis is carried out in combination with the specific business.
In the embodiment of the present invention, preferably, after the step of creating the business model according to the selected model strategy, the method may further include: displaying the Modeling and Design information of the created business model, the Modeling and Design information including at least the information of the selected model strategy, so that the user can check the information of the selected model strategy. The Modeling and Design information may also include the target feature and/or the scene.
In the embodiment of the present invention, preferably, after the step of displaying the Modeling and Design information of the created business model, the method may further include:
when an operation of the user adjusting the Modeling and Design information is detected, updating the Modeling and Design information;
when an operation of the user running the created business model is detected, running the created business model according to the updated Modeling and Design information.
That is, the user can customize the content of the Modeling and Design information, which improves user experience.
Running the created business model includes at least training the algorithm of the created business model; of course, it may also include splitting the data, processing the data, and/or selecting features of the data.
In the embodiment of the present invention, preferably, after the step of creating the business model according to the selected model strategy, the method may further include: when an operation of the user running the created business model is detected, running the created business model using the selected model strategy.
Referring to FIG. 2, in the user interface shown in FIG. 2, after the user has finished setting the scene, the data and so on for creating the business model, the user may click the "New" button to display the Modeling and Design information of the created business model. For example, the data processing method, algorithm, algorithm parameters and/or evaluation method in the model strategy can be checked. Alternatively, when the "Train" button is clicked, the business model is created using the selected model strategy and then run. That is, as long as the user clicks the "Train" button, the Data Analysis Services system can create the business model according to the automatically selected model strategy and run the created business model. The user does not need to select a model strategy, which simplifies the training process, raises the degree of automation of the Data Analysis Services system, and improves user experience.
In the embodiment of the present invention, after the user clicks "New", the displayed user interface may be as shown in FIG. 3. The Modeling and Design information of the created business model shown in the user interface includes: basic information, features, modeling and evaluation. The basic information includes the target and the training/test set; the target includes the scene and the target column; the training/test set is formed from the data by splitting and/or sampling; modeling includes the algorithm and the parameters.
In the embodiment of the present invention, after the step of running the created business model, the method further includes: displaying the Modeling and Design information of the run business model. That is, after the business model has been run, the Modeling and Design information of the run business model can also be checked.
In the embodiment of the present invention, after the step of displaying the Modeling and Design information of the run business model, the method may further include: when an operation of the user adjusting the Modeling and Design information is detected, updating the Modeling and Design information; when an operation of the user rerunning the run business model is detected, rerunning the run business model according to the updated Modeling and Design information.
The Modeling and Design information includes the information of the model strategy of the run business model, and may also include the scene and/or the target feature. That is, in the embodiment of the present invention, after training of the business model is completed, the Modeling and Design information can be checked or adjusted, for example, the model strategy, target feature and/or scene of the business model can be adjusted. Apart from the data, all other information can be adjusted, and the adjusted business model can be rerun.
In the embodiment of the present invention, after the step of running the created business model, the method further includes: displaying the modeling achievement of the run business model. The modeling achievement may include at least one of: the name of the run business model, the score of the run business model, the output result of the run business model, and other information. The output result may be, for example, the prediction of whether a customer will churn. The name of the business model may be, for example, algorithm name + timestamp.
In some preferred embodiments of the present invention, the modeling achievement may also include: the information of the model strategy of the run business model, the creation time of the run business model, the training information of the run business model (such as the training duration), the workflow corresponding to the run business model (also called a task; the workflow is described below), the state of the run business model (such as success or failure), and the feature importance ranking information of the data.
In the embodiment of the present invention, one model strategy may include multiple algorithms, so that after the created business model is run, the information of multiple business models can be obtained. Therefore, the modeling achievement may include: the information of the top M business models with the highest scores among the N run business models corresponding to the selected model strategy, or the modeling achievements of all N run business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M. That is, one model strategy may yield multiple business models, and after the run is completed, the information of some or all of the business models can be displayed.
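Keeping the top M of N run business models by score is a small selection step. The sketch below assumes, purely for illustration, that each result is a dictionary with `name` and `score` keys and that a higher score is better; the scores shown are made up.

```python
import heapq


def top_models(results, m):
    """Return the m highest-scoring results, best first."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return heapq.nlargest(m, results, key=lambda result: result["score"])


# Hypothetical results of one model strategy that ran N = 3 algorithms.
results = [
    {"name": "logistic regression", "score": 0.81},
    {"name": "random forest", "score": 0.88},
    {"name": "AdaBoost", "score": 0.85},
]
best_two = top_models(results, 2)
```

Passing `m = len(results)` corresponds to the alternative in which all N modeling achievements are shown.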
In the embodiment of the present invention, the modeling achievement of the run business models may be displayed through a modeling achievement chart or a modeling achievement list. The modeling achievement chart shows a comparison of the business model results of the different trainings of one automatic modeling, making it easy to see the better business models within each training. The modeling achievement list shows a comparison of all business model results of one automatic modeling, in which the business models of the different trainings are all sorted together, making it easy to see the better business models across all trainings.
Referring to FIG. 4, FIG. 4 is a schematic diagram of the user interface of the modeling achievement list of the run business models in an embodiment of the present invention. The user interface of the modeling achievement list shows the whole model list, displayed in reverse chronological order by default. The displayed field names are as follows:
Check box: can be ticked only when the status is published successfully; once ticked, the [model evaluation] button at the bottom becomes available;
Model name: displays the specific name (for automatic modeling, named as model name + timestamp; for a workflow, named after the output name of the source analysis module) and the source (named after the source analysis module); displays the state of having been put into the warehouse (smiley face); supports sorting; when the status is successful, clicking the name enters the [Model Results details] page of that model;
Ownership: displays the name of the task to which the model belongs. Clicking opens the [task details] page in a new window; supports sorting;
Founder: displays the creator's information; supports sorting;
Creation time: date + time; supports sorting;
Status: success, failure, loading, or "--" (meaning that no evaluation method was found);
Training time: displays the training duration as *h*m*s; larger units are omitted when they are not needed, for example, 59s;
Score index: configurable through the table; by default at most 6 are displayed at the same time;
Operation:
View result (eye icon): click to enter the [Model Results details] page; the [View result] button is shown only in the success status;
View log (bookmark icon): click to open the [log details] pop-up; the [View log] button is shown in all statuses.
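The training time field in the list above shows a duration as *h*m*s and drops the larger units when they are not needed (for example, 59s). A minimal sketch of that formatting rule follows; the function name and the choice of whole seconds as input are assumptions.

```python
def format_duration(total_seconds: int) -> str:
    """Format a duration as h/m/s, omitting leading units that are zero."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    if hours:
        return f"{hours}h{minutes}m{seconds}s"
    if minutes:
        return f"{minutes}m{seconds}s"
    return f"{seconds}s"
```

Only leading units are dropped, so a duration of exactly one hour still prints its zero minutes and seconds.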
Referring to FIG. 5, FIG. 5 is a schematic diagram of the user interface of the modeling achievement chart of the run business models in an embodiment of the present invention.
The right-side content of the user interface, the task model display area, includes:
All models of the task selected on the left side of the user interface are displayed.
Task visualization area: displays the visualization information of all models under the task, including model algorithm parameters, feature importance, training information and the like. The training content of all models can be shown as a line (curve, polyline, etc.) chart; hovering the mouse over a node shows more information.
Model display area: displays the model's color mark, model name, status mark, warehousing status, champion mark, specific score, start time, operation items and visualization chart; when the mouse hovers over the region, it switches to the selected state, in one-to-one correspondence with the selected state of the task model list on the left;
Color mark: the color mark before the model name is consistent with the line chart in [task model visualization scoring] on the right; at most 13 different colors are assigned (the upper limit of supported algorithms).
Model name: displays the specific name; hovering shows the full name. After successful publication, clicking the model name enters the [model details] page in the current page. When the model has failed, the model name turns red and cannot be clicked. When the model is loading, it cannot be clicked after being selected. When the model has no evaluation module, the model name turns red and cannot be clicked after being selected.
Status mark: loading; published successfully (no icon shown); failed (no icon shown, name turns red); no evaluation method found (no icon shown, name turns red; limited to evaluation comparison in a workflow).
Warehousing mark: when the model has been updated to the warehouse, a mark indicating the update to the warehouse is shown (the smiley face in the figure).
Champion mark: within the task, the top-scoring models show a champion mark (the trophy in the figure). (This is affected by the score filter bar; depending on what is selected in the score filter, the score value also changes.)
Specific score: the score value is displayed with at most three decimal places (this is affected by the score filter bar; depending on what is selected in the score filter, the score value also changes).
Start time: displays the start time of the job, as date + time.
Operation items:
View result (eye icon): click to enter the [Model Results details] page; the [View result] button is shown only in the success status.
View log (bookmark icon): click to open the [log details] pop-up; the [View log] button is shown in all statuses.
The left-side content of the user interface, the task model list, includes:
1. The list of all tasks produced by automatic modeling (workflow) training is displayed, loaded on scroll-down;
2. By default, the task list is ordered from top to bottom in reverse chronological order;
3. Clicking a specific [task name] opens the [task details] page of that task in a new window;
4. A task can be deleted. Deletion pops up a secondary confirmation prompt; after successful deletion, all model content generated in the task is cleared, and the associated task in the task list is deleted as well (likewise, deleting a task also deletes the associated automatic modeling content); content in the model warehouse is not affected;
5. A task may include multiple models, each showing its color mark, model name, status mark, warehousing status, champion mark and specific score;
Color mark: the color mark before the model name is consistent with the line chart in [task model visualization scoring] on the right; at most 13 different colors are assigned (the upper limit of supported algorithms);
Model name: automatic modeling names the model as model name + timestamp, and a workflow names it after the output name of the analysis module; the specific name is displayed, and hovering shows the full name. Clicking the row of a model name puts the row in the selected state, switches the right-side content to the display area of the current task, and scrolls to the display position of that model. When the current row is selected and publication has succeeded, the model name can be clicked, and clicking enters the [model details] page in the current page. When the model has failed, the model name turns red and cannot be clicked after being selected. When the model is loading, it cannot be clicked after being selected. When the model has no evaluation module, the model name turns red and cannot be clicked after being selected;
Status mark: loading; published successfully (no icon shown); failed (no icon shown, name turns red); no result (no icon shown, name turns red);
Warehousing mark: when the model has been updated to the warehouse, a mark indicating the update to the warehouse is shown (such as the smiley face in the figure);
Champion mark: within the task, the top-scoring models show a champion mark (such as the trophy in the figure) (this is affected by the score filter bar; depending on what is selected in the score filter, the score value also changes);
Specific score: the score value is displayed with at most three decimal places (this is affected by the score filter bar; depending on what is selected in the score filter, the score value also changes).
The model evaluation button is placed at the bottom of the list. It becomes available when one or more items are ticked, and clicking the button pops up the [model evaluation] prompt window in the current page (only the check boxes of successfully published models can be ticked).
The substantive content of the above modeling achievement chart and modeling achievement list (the comparison of model results) is the same. The chart better shows the relative quality of the models generated by the same training (such as task 001 or task 002), while the list better shows the relative quality of the models generated by different trainings (all trainings).
The training usually runs the algorithm model iteratively, that is, the algorithm model is run more than once (the hyperparameters may include the number of iterations).
In the embodiment of the present invention, a user interface for historical modeling achievements may also be provided, making it convenient for the user to check historical modeling achievements.
In the embodiment of the present invention, the automatic modeling method may further include: creating a first workflow corresponding to the created business model. The first workflow includes multiple workflow modules; workflow modules may be connected to one another, and for two connected workflow modules, the output of one serves as the input of the other. For example, the workflow modules may include a data module, which corresponds to the data set by the user, and may also include an analysis module, which corresponds to the algorithm in the model strategy. The first workflow cannot be edited or modified and can only be viewed. That is, during automatic modeling, the underlying layer of the Data Analysis Services system simultaneously creates a task (i.e., the first workflow) and can automatically generate a name for the task, such as model name + timestamp; the user function permissions of the first workflow are consistent with those of the automatic modeling.
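The connection relationship between workflow modules, where the output of one module is the input of another, can be sketched as a small dependency graph. The class `WorkflowModule`, the function `run_workflow` and the two toy modules below are hypothetical names used only to illustrate the idea; the sketch does not detect cycles.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class WorkflowModule:
    """One workflow module: a callable plus the names of its upstream modules."""
    name: str
    run: Callable[..., object]
    inputs: List[str] = field(default_factory=list)


def run_workflow(modules: Dict[str, WorkflowModule], target: str,
                 cache: Optional[Dict[str, object]] = None) -> object:
    """Run `target` after its upstream modules, feeding outputs forward."""
    cache = {} if cache is None else cache
    if target not in cache:
        module = modules[target]
        upstream = [run_workflow(modules, name, cache) for name in module.inputs]
        cache[target] = module.run(*upstream)  # outputs become inputs
    return cache[target]


# Toy first workflow: a data module connected to an analysis module.
modules = {
    "data": WorkflowModule("data", lambda: [3, 1, 2]),
    "analysis": WorkflowModule("analysis",
                               lambda rows: sum(rows) / len(rows),
                               inputs=["data"]),
}
result = run_workflow(modules, "analysis")
```

The cache ensures that a module shared by several downstream modules runs only once per workflow run.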
In the embodiment of the present invention, after the step of creating the first workflow corresponding to the created business model, the method further includes: when an operation of running the created business model is detected, or when an operation of the user adjusting the Modeling and Design information is detected, updating the first workflow.
That is, in the above embodiment, the first workflow can be created automatically while the automatic modeling is created; every model run is carried out in the first workflow, and the version of the first workflow is upgraded accordingly on each model run. Running the automatic modeling is running the workflow. When an operation of the user adjusting the Modeling and Design information is detected, the Modeling and Design information is updated, and the first workflow is updated according to the updated Modeling and Design information.
In the embodiment of the present invention, after the step of creating the workflow corresponding to the created business model, the method further includes: when an operation of the user creating a second workflow with the same content as the first workflow is detected, generating the second workflow. Operations such as viewing, modifying and editing can be performed on the newly created second workflow, so that the automatically built model can be further modified and designs for complex scenes can be carried out.
In the embodiment of the present invention, a workflow, namely the second workflow with the same content as the first workflow, may be created through a user interface for generating a data application, specifically including the following steps:
1. Generate a new data application (i.e., a workflow). The modeling achievement interface (modeling achievement chart and modeling achievement list) contains a [generate data application] button, through which the current workflow can be copied from the modeling achievement display.
2. Set the data application name;
3. Description: by default, the content of the original automatic modeling is shown;
4. Label: by default, the content of the original automatic modeling is shown.
In the embodiment of the present invention, referring to FIG. 6 and FIG. 7, the step of displaying the user interface includes: when an operation of the user viewing the set data is detected, displaying visual information corresponding to the data. That is, after the user has set the data, the set data can also be previewed; the visual information may be, for example, a table or a chart, which makes it convenient for the user to screen the data. In the embodiment of the present invention, data content is divided into numeric (integer, floating-point) values and other non-numeric values, and the numeric and non-numeric types may be displayed separately.
As mentioned in the above embodiments, the information of the model strategy can be displayed before or after model training for the user to check or modify. The user interface that displays the information of the model strategy is described below.
(1) user interface of training set and test set
In the embodiment of the present invention, when carrying out model training, training set and test set are needed, defaults being trained for use
Collection and test set method are as follows: split current data, the method for obtaining training set and test set may also is that another number of fractionation
According to, from extracting trained and test data in data, trained and test data is extracted from two data, extracted from other data
Trained and test data.
In the embodiment of the present invention, the methods of obtaining the training set and test set for training the business model may include sampling and splitting. The sampling methods may include: 1) no sampling (use all data); 2) original records; 3) randomly select X% of rows; 4) randomly select N rows; 5) class-balanced N rows; 6) class-balanced X% of rows, etc. The splitting settings may include: 1) random split; 2) enable K-fold cross-validation; 3) number of folds (when K-fold is enabled) or training data ratio (when K-fold is not enabled); 4) random seed.
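The sampling and splitting options listed above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the row layout, and the per-class quota rule are assumptions made for the example and are not taken from the described system.

```python
import random
from collections import defaultdict


def sample_random_rows(rows, n, seed=0):
    """'Randomly select N rows': draw n rows without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def sample_class_balanced(rows, label_of, n, seed=0):
    """'Class-balanced N rows': draw about n / n_classes rows per class (assumed rule)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    per_class = max(1, n // len(by_class))
    picked = []
    for label in sorted(by_class):
        group = by_class[label]
        picked.extend(rng.sample(group, min(per_class, len(group))))
    return picked


def random_split(rows, train_ratio=0.8, seed=0):
    """'Random split': shuffle with a fixed seed, then cut by the training ratio."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: 100 rows, 75 of class "a" and 25 of class "b".
rows = [(i, "a" if i % 4 else "b") for i in range(100)]
balanced = sample_class_balanced(rows, lambda r: r[1], n=20)
train, test = random_split(rows, train_ratio=0.8, seed=42)
```

Fixing the random seed makes the sample and the split reproducible, which is the role the "random seed" setting plays in the splitting options.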
(2) User interface for setting and selecting feature engineering, including data processing and feature selection
(1) Data processing
1: Category-based data processing, including category handling and missing-value handling. The selectable category handling methods include dummy encoding into vectors; the selectable missing-value handling methods include treating the missing value as a value, filling, deleting rows, etc.
2: Numeric data processing, including numeric handling and missing-value handling. The selectable numeric handling methods include: keeping the column as a regular numerical feature, binarization against a given value, binning, etc.; the selectable missing-value handling methods include filling, deleting rows, etc.
3: Text-based data processing. The selectable processing methods include: word segmentation and word frequency analysis.
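The category and numeric handling options above can be illustrated with a short sketch. It uses only the Python standard library; the helper names, the choice of the mean as the fill value, and equal-width binning are assumptions made for the example rather than details given in the text.

```python
def dummy_encode(values):
    """Dummy (one-hot) encoding; a missing value (None) becomes an all-zero vector."""
    categories = sorted(set(v for v in values if v is not None))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, vectors


def fill_missing(values, fill):
    """Missing-value handling by filling with a given value."""
    return [fill if v is None else v for v in values]


def binarize(values, threshold):
    """Binarization against a given value: 1 if above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


def equal_width_bins(values, n_bins):
    """Binning into n_bins equal-width intervals, returning the bin index per value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


colors = ["red", None, "blue", "red"]
categories, vectors = dummy_encode(colors)

ages = [18, None, 40, 65]
known = [a for a in ages if a is not None]
filled = fill_missing(ages, sum(known) / len(known))  # fill with the mean (assumed choice)
flags = binarize(filled, threshold=40)
bins = equal_width_bins(filled, n_bins=3)
```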
(2) Feature selection
The optional feature selection methods include: mutual information, chi-square test, F-test, Pearson correlation coefficient, recursive feature elimination, model-based feature elimination, etc. They may further include feature orthogonalization, principal component analysis of features, matrix decomposition, etc. The system performs feature selection automatically based on the method selected by the user.
After automatic modeling, feature selection can subsequently be performed again based on the feature importance calculated by the model.
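As one concrete instance of the methods listed above, the following sketch ranks features by the absolute Pearson correlation coefficient with the target and keeps the top k. The function names, the feature names, and the top-k rule are assumptions made for illustration; the text does not specify how the system applies the coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_top_k(features, target, k):
    """Keep the k features with the largest absolute correlation to the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Hypothetical columns: one linear in the target, one unrelated, one constant.
features = {
    "x_linear": [1, 2, 3, 4, 5],
    "x_noise": [2, 9, 1, 7, 3],
    "x_const": [4, 4, 4, 4, 4],
}
target = [2, 4, 6, 8, 10]
selected, scores = select_top_k(features, target, k=1)
```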
The user can also select features directly:
1: Features are displayed differently according to the data type of each column, i.e., according to the type of variable (feature), for example divided into categorical and numeric;
2: Sorting is supported by data (the order in which the field names appear in the data sheet), by name (field names ordered a-z, then 0-9), by type (categorical first, then numeric), and by role (target column first, then enabled columns, then disabled columns). An enabled column is a selected feature; a disabled column is a feature that has not been selected;
3: Data columns support multi-selection, one-click select all, and one-click clearing of the selection;
4: Search is supported;
5: When the user leaves the page and enters it again, the previous operations are retained;
6: The target column is clearly distinguished from ordinary columns.
(3) User interface for setting and selecting the algorithm and parameters
(1) Algorithm
1: A brief introduction can be displayed for every algorithm;
2: When an algorithm is operated for the first time, parameters that have default values display those defaults; on later operations the previous settings are retained, and the on/off switch does not affect this behavior;
3: When the switch for an algorithm is off, none of the parameters of the algorithm can be adjusted, and this state is clearly distinguished in the display;
The algorithms include: (1) clustering: K-means, affinity propagation, mean shift, spectral clustering, hierarchical clustering, density-based clustering with noise (DBSCAN), balanced iterative hierarchical clustering (BIRCH), etc.; (2) classification: random forest, gradient boosting tree, XGBoost, decision tree, k-nearest neighbors (KNN), extra trees, neural network, logistic regression, support vector machine, stochastic gradient descent, etc.; (3) regression: random forest, gradient boosting tree, ridge regression, Lasso regression, XGBoost, decision tree, k-nearest neighbors (KNN), extra trees, neural network, Lasso path, logistic regression, support vector machine, stochastic gradient descent, etc.
(2) User interface for parameters
Hyperparameter settings:
1: Hyperparameter search
1) Randomized grid search (search speed)
● Whether to shuffle the original order
2) Maximum number of iterations
3) Maximum search time: only positive integers and floating-point numbers are accepted
4) Number of concurrent jobs: only positive integers and -1 are accepted
Here, a hyperparameter is a parameter whose value is set before the learning process starts, as opposed to a parameter obtained through training. In general, hyperparameters need to be optimized, i.e., an optimal set of hyperparameters is selected, in order to improve the performance and effectiveness of learning.
Further, the system provides automatic tuning of hyperparameters. The selectable tuning criteria include: (1) clustering: silhouette coefficient, homogeneity, completeness, V-measure, etc.; (2) classification: AUC score, accuracy, precision, recall, F1 score, log loss, etc.; (3) regression: R2 value, explained variance score, mean error, mean squared error, root mean squared error, root mean squared log error, mean absolute error, etc. In general, only one can be selected.
Note: the default hyperparameter settings are: "randomized": true; "nJobs": 1; "mode": "K-FOLD"; "nFolds": 5.
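The search settings above (randomized search, maximum number of iterations, maximum search time) can be sketched as a simple budgeted random search. The code uses only the Python standard library; `randomized_search`, `toy_score`, and the parameter space are hypothetical names introduced for illustration, and the scoring function stands in for whichever tuning criterion the user selected.

```python
import random
import time

# Default settings quoted in the note above.
DEFAULT_SEARCH = {"randomized": True, "nJobs": 1, "mode": "K-FOLD", "nFolds": 5}


def randomized_search(param_space, score_fn, max_iterations=20, max_seconds=5.0, seed=0):
    """Sample random parameter combinations until either budget is exhausted."""
    rng = random.Random(seed)
    deadline = time.monotonic() + max_seconds
    best_params, best_score = None, float("-inf")
    for _ in range(max_iterations):
        if time.monotonic() > deadline:
            break  # maximum search time reached
        params = {name: rng.choice(choices) for name, choices in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space and a toy score that peaks at max_depth=6, n_estimators=100.
space = {"max_depth": [2, 4, 6, 8], "n_estimators": [50, 100, 200]}


def toy_score(p):
    return -abs(p["max_depth"] - 6) - abs(p["n_estimators"] - 100) / 100


best, score = randomized_search(space, toy_score, max_iterations=50)
```

In a real run the score function would train the model with the sampled hyperparameters and return the cross-validated value of the chosen criterion.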
The hyperparameter user interface includes the cross-validation settings:
1: Cross-validation
1) Traditional method: split into a training set and a validation set. By default a split ratio can be entered; only positive integers and floating-point numbers are accepted, and the default is 0.8
2) K-Fold: by default the number of folds can be entered; only positive integers are accepted, and the default value is 0
Specifically, the data can first be split into a training set and a test set (see the settings in the user interface for the training set and test set); for the cross-validation part, the training set is then split again into a training set and a validation set. The validation set is used for cross-validation, and the test set is used for subsequent evaluation.
Note: usually not all of the data is used for training. Instead, a part is set aside (the validation set, which does not take part in training) to test the parameters produced from the training set, so that the degree to which those parameters fit data outside the training set can be judged relatively objectively. This idea is called cross-validation.
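The K-fold variant of this idea can be sketched as follows: the training data is divided into K folds, and each fold serves once as the validation set while the remaining folds are used for training. The function names are assumptions, and `fit_and_score` is a hypothetical callback that trains on the training indices and returns a score on the validation indices.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for each of n_folds folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first folds
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def cross_validate(n_samples, n_folds, fit_and_score):
    """Average the validation score over all folds."""
    scores = [fit_and_score(train, val) for train, val in k_fold_indices(n_samples, n_folds)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 5))
# Toy callback: the "score" is just the fraction of samples held out for validation.
mean_score = cross_validate(10, 5, lambda train, val: len(val) / (len(train) + len(val)))
```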
(4) User interface for setting and selecting the evaluation method
User interface of the evaluation method
1: Different categories of algorithm have different model evaluation methods. A single method is selected, or one method serves as the core scoring criterion while other related evaluation metrics are also displayed
The evaluation methods include: explained variance score, mean absolute error, mean squared error, R2 score, median absolute error, squared log error, F1 score, accuracy, precision, recall, AUC score, log loss, cost matrix, cumulative lift, F-beta score, silhouette coefficient, homogeneity, completeness, V-measure, etc. Among these, the evaluation methods for clustering algorithms include: silhouette coefficient, homogeneity, completeness, and V-measure; the evaluation methods for multi-class classification algorithms include: F1 score, accuracy, precision, recall, AUC score, log loss, and F-beta score; the evaluation methods for binary classification algorithms include: F1 score, accuracy, precision, recall, AUC score, log loss, cost matrix, cumulative lift, and F-beta score; the evaluation methods for regression algorithms include: explained variance score, mean absolute error, mean squared error, R2 value, median absolute error, and squared log error.
Note: the default evaluation methods are: binary classification: AUC score; multi-class classification: accuracy; regression: R2 value.
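The default evaluation methods in the note can be sketched as a lookup from task type to metric, with plain implementations of the three metrics. The task keys, the function names, and the pairwise formulation of AUC are assumptions made for the example.

```python
# Default evaluation method per task type, as stated in the note above.
DEFAULT_METRIC = {"binary": "auc", "multiclass": "accuracy", "regression": "r2"}


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


METRICS = {"auc": auc_score, "accuracy": accuracy, "r2": r2_score}


def evaluate(task, y_true, y_pred):
    """Score predictions with the default metric for the given task type."""
    name = DEFAULT_METRIC[task]
    return name, METRICS[name](y_true, y_pred)


binary_result = evaluate("binary", [0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
multi_result = evaluate("multiclass", [0, 1, 2, 2], [0, 1, 1, 2])
regression_result = evaluate("regression", [1, 2, 3], [1, 2, 3])
```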
After any of the above steps has been adjusted, the settings can be saved; the user then clicks "Train", views the results, and saves them. The user can save a customized model strategy for use next time or to make it available to other users.
In the embodiment of the present invention, after model training is completed, the model can be published. Only a model that reaches a certain standard can be published to the repository and brought online; that is, only content that meets a certain score threshold (evaluation criterion) can be published to the repository and run online. The model here refers to a model produced by automatic modeling or a newly built model produced by generating a data application. Only models published to the model repository can be brought online, compared, and iterated.
The user interface for publishing to the repository may include:
1. Clicking the [Publish to repository] button pops up a [Publish to model repository] dialog in the current page;
2. The dialog includes the following: name, description, tags;
Condition to be met: a metric drop-down box, in which any evaluation method supported by the model can be selected; a condition drop-down box: greater than or equal to, or less than or equal to; a numeric input box: the value must be greater than or equal to 0. For example, with the AUC score, the condition drop-down box is set to greater than or equal to, or less than or equal to, and a value is set.
Automatic update: on or off. When it is on, models that meet the condition and have not yet been successfully placed in the repository are updated into the model repository; the automatic update interval defaults to 24 hours;
Submit: clicking the button pops up an update progress prompt, in which the update progress of all qualifying models can be viewed;
After submission, the [Publish to repository] button changes to [Published to repository] and can be configured. For the model repository, see Fig. 8; clicking "Online model monitoring" shows the performance and resource usage of the online models. For viewing online model performance and resource usage, see Fig. 9 and Fig. 10.
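The publishing condition described above (an evaluation metric, a comparison, and a non-negative threshold) can be sketched as a small check. The dictionary layout of a model record and the function names are hypothetical and serve only to illustrate the rule.

```python
import operator

# The two conditions offered by the condition drop-down box.
COMPARATORS = {">=": operator.ge, "<=": operator.le}


def meets_condition(scores, metric, comparator, threshold):
    """Check one model's score against the publishing condition."""
    if threshold < 0:
        raise ValueError("the threshold must be greater than or equal to 0")
    if metric not in scores:
        return False
    return COMPARATORS[comparator](scores[metric], threshold)


def models_to_publish(models, metric, comparator, threshold, repository):
    """Names of qualifying models that are not yet in the repository (automatic update)."""
    return [
        m["name"]
        for m in models
        if m["name"] not in repository
        and meets_condition(m["scores"], metric, comparator, threshold)
    ]


# Hypothetical trained models and an already published model.
models = [
    {"name": "rf_v1", "scores": {"auc": 0.91}},
    {"name": "lr_v1", "scores": {"auc": 0.78}},
    {"name": "xgb_v1", "scores": {"auc": 0.88}},
]
pending = models_to_publish(models, "auc", ">=", 0.85, repository={"rf_v1"})
```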
Referring to Fig. 8, the list of all online models can be displayed, sorted by default in descending order of go-live time. The list displays the following fields: online model name, current container, real-time CPU, MEM, and GPU usage, the average/minimum/maximum response time within a certain time range (a specific duration can be configured, in hours or days), the number of calls, and the success rate. Clicking the online model name or the "Model details" button in the operations column opens the model results details page, where the specific information about the model can be browsed. Clicking the number of calls opens the call details page; see Fig. 10.
In Fig. 10, the details of the number of calls within a certain range can be displayed. A visualization on a map of the country shows the calls from each province (different colors represent different levels of call volume); hovering the mouse over a province displays its details, including the province name, ranking, number of calls, and share of calls. The details of each call can also be viewed in a details list, including call time, response time, call type, call method, access status, province, and source.
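The per-province details shown on hover (name, ranking, number of calls, share) can be derived from the call details list with a simple aggregation. This is only a sketch: the `province` field name and the record layout are assumptions, since the text does not describe how call records are stored.

```python
from collections import Counter


def province_call_stats(call_log):
    """Aggregate calls per province and attach rank and share of total calls."""
    counts = Counter(call["province"] for call in call_log)
    total = sum(counts.values())
    # Rank by call count (descending), breaking ties by province name.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        {"province": name, "rank": rank, "calls": calls, "share": calls / total}
        for rank, (name, calls) in enumerate(ranked, start=1)
    ]


# Hypothetical call records.
call_log = [
    {"province": "Guangdong", "status": "ok"},
    {"province": "Beijing", "status": "ok"},
    {"province": "Guangdong", "status": "error"},
    {"province": "Zhejiang", "status": "ok"},
]
stats = province_call_stats(call_log)
```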
The published model results list and details may further include the following:
1. Models published to the model repository must pass a review mechanism before they can be run online;
2. Importing models from local storage (in batches if desired) is supported, and all modeling results lists are displayed;
3. Batch re-evaluation of models is supported, and the evaluation results can be viewed;
4. Iterative go-live of models is supported: a successfully deployed model can replace the model currently online and become the online model (each model can have only one online model, and by default at most three successfully deployed models wait to go online, so a model deployed beyond this limit must replace one of the models already in the waiting quota);
5. When a model goes online, the system checks whether feature value assignment, resource configuration, and debugging mode configuration have been carried out; if not, the resource page is entered to perform the relevant configuration; if so, this step is skipped;
6. Clicking an entry in the model results list shows the basic information of the model (name, algorithm type, training time, training duration, columns and rows, and configuration information, including the data analysis module to which it belongs);
7. The API interface information and API key of the model are displayed; the interface can be debugged in three debugging modes, REST, message queue, and NFS file system, but only online models can be called through the interface;
8. The feature value assignment, resource configuration, and debugging mode configuration information can be viewed;
9. The importance of the feature variables and the parameter information of the model evaluation metrics are displayed;
10. More intuitive graphical representations of the model evaluation results concerning performance, such as the ROC curve and the confusion matrix, are shown;
11. The algorithm parameter information of the model, the training data information, and the training details are displayed.
The published model results list differs from the modeling results list in that the published model results list includes performance (how the model behaves after going online, whether calls succeed, resource usage, etc.).
In the embodiment of the present invention, a user interface for model re-evaluation can also be provided. In this user interface, the following can be performed:
1. Select the evaluation criterion: the options are all evaluation criteria;
2. Select data: the names and descriptions of all data modules for which the user has (read) permission are displayed, sorted by letter A-Z and then 0-9; clicking [Preview] pops up a [Data preview] page in the current page; filtering by keywords in the name and description is supported;
3. Clicking [Submit] re-evaluates the model with the selected evaluation method and data, and pops up the evaluation results list.
Both models generated by automatic modeling and published models can be re-evaluated. Re-evaluation assesses the model on new data; if the evaluation result does not meet current business needs, modeling design, model training, etc., are carried out again.
Referring to Fig. 11, an embodiment of the present invention further provides a data analysis and processing system, comprising:
a display module 1101, configured to display a user interface, the user interface being used by the user to set the scenario and data for creating a business model;
a processing module 1102, configured to obtain the scenario and/or data set by the user in the user interface, select one model strategy from multiple model strategies according to the obtained scenario and/or data, and create a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method of the algorithm.
Preferably, the model strategy further includes at least one of the following: an evaluation method of the algorithm, a parameter setting method of the algorithm, a splitting method of the data, a processing method of the data, and a feature selection method of the data.
Preferably, the user interface is also used by the user to set the target feature for creating the business model.
Preferably, the display module 1101 is configured to display a scenario list in the user interface for the user to select from, and, when an operation of the user selecting a scenario in the scenario list is detected, display the selected scenario in the user interface;
or
the display module 1101 is configured to display a scenario input area in the user interface; when an operation of the user entering a scenario in the input area is detected, obtain the scenario entered by the user; and display in the user interface the scenario in the scenario list that matches the scenario entered by the user.
Preferably, the scenario includes at least one of: a scenario corresponding to a clustering algorithm, a scenario corresponding to a classification algorithm, a scenario corresponding to a regression algorithm, a scenario corresponding to anomaly detection, and a scenario corresponding to language processing.
Preferably, when the scenario is a scenario corresponding to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, and restricted Boltzmann machine. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the silhouette coefficient method;
when the scenario is a scenario corresponding to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the area under the curve (AUC) score method;
when the scenario is a scenario corresponding to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, support vector regression, and neural network. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the R2 value method;
when the scenario is a scenario corresponding to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbors, and isolation forest. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the F1 score method;
when the scenario is a scenario corresponding to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, and conditional random field. The parameter tuning method of the algorithm includes: providing default parameters according to the result of word frequency analysis, and using the default parameters.
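The mapping from scenario to model strategy described above can be sketched as a lookup table. The dictionary keys, field names, and the function `select_model_strategy` are hypothetical; the algorithm and tuning lists are those given in the text, and the sketch does not cover selection based on the data, which the text also allows.

```python
# Scenario -> candidate algorithms and parameter tuning methods, as listed above.
MODEL_STRATEGIES = {
    "clustering": {
        "algorithms": ["hierarchical clustering", "Bayesian Gaussian mixture",
                       "KD tree", "restricted Boltzmann machine"],
        "tuning": ["random parameter search", "grid parameter search",
                   "silhouette coefficient"],
    },
    "classification": {
        "algorithms": ["logistic regression", "random forest", "Bagging",
                       "AdaBoost", "neural network", "stacking model"],
        "tuning": ["random parameter search", "grid parameter search", "AUC score"],
    },
    "regression": {
        "algorithms": ["logistic regression", "random forest",
                       "support vector regression", "neural network"],
        "tuning": ["random parameter search", "grid parameter search", "R2 value"],
    },
    "anomaly detection": {
        "algorithms": ["neural network", "support vector machine", "robust regression",
                       "nearest neighbors", "isolation forest"],
        "tuning": ["random parameter search", "grid parameter search", "F1 score"],
    },
    "language processing": {
        "algorithms": ["latent semantic indexing", "latent Dirichlet allocation",
                       "conditional random field"],
        "tuning": ["default parameters from word frequency analysis"],
    },
}


def select_model_strategy(scenario):
    """Return the model strategy registered for the scenario set by the user."""
    key = scenario.strip().lower()
    if key not in MODEL_STRATEGIES:
        raise ValueError(f"no model strategy registered for scenario {scenario!r}")
    return MODEL_STRATEGIES[key]


strategy = select_model_strategy("Classification")
```

Keeping the strategies in a table means the user only names the scenario; the algorithm candidates and the tuning method follow from the lookup without further user input.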
Preferably, the display module is further configured to display modeling design information of the created business model, the modeling design information including at least: information about the selected model strategy.
Preferably, the data analysis and processing system further includes:
a first adjustment module, configured to update the modeling design information when an operation of the user adjusting the modeling design information is detected;
a first running module, configured to run the created business model according to the updated modeling design information when an operation of the user running the created business model is detected.
Preferably, the data analysis and processing system further includes:
a second running module, configured to run the created business model using the selected model strategy when an operation of the user running the created business model is detected.
Preferably, the display module is further configured to display the modeling results of the business model that has been run, the modeling results including at least one of: the name of the business model that has been run, the score of the business model that has been run, and the output result of the business model that has been run.
Preferably, the modeling results further include: information about the model strategy of the business model that has been run, the creation time of that business model, the training information of that business model, the workflow corresponding to that business model, the status of that business model, and importance ranking information for the features of the data.
Preferably, the modeling results include: information about the M highest-scoring business models among the N business models that have been run for the selected model strategy, or information about all N business models that have been run for the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
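Choosing the M highest-scoring business models out of N, or all N, can be sketched as a sort followed by a slice. The record layout and the function name are assumptions made for illustration.

```python
def top_m_models(results, m=None):
    """Return the m highest-scoring results, or all results when m is None."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    if m is None:
        return ranked
    if not 1 <= m <= len(ranked):
        raise ValueError("M must satisfy 1 <= M <= N")
    return ranked[:m]


# Hypothetical results of N = 4 business models run under one model strategy.
results = [
    {"name": "model_a", "score": 0.82},
    {"name": "model_b", "score": 0.91},
    {"name": "model_c", "score": 0.77},
    {"name": "model_d", "score": 0.88},
]
best_two = top_m_models(results, m=2)
everything = top_m_models(results)
```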
Preferably, the display module is further configured to display modeling design information of the business model that has been run, the modeling design information including at least: information about the selected model strategy.
Preferably, the data analysis and processing system further includes:
a second adjustment module, configured to update the modeling design information when an operation of the user adjusting the modeling design information is detected;
a third running module, configured to rerun the business model that has been run, according to the updated modeling design information, when an operation of the user rerunning that business model is detected.
Preferably, the modeling design information further includes: the scenario and/or the target feature.
Preferably, the data analysis and processing system further includes:
a creation module, configured to create a first workflow corresponding to the created business model, the first workflow including multiple workflow modules.
Preferably, the data analysis and processing system further includes:
an update module, configured to update the first workflow when an operation of running the created business model is detected, or when an operation of the user adjusting the modeling design information is detected.
Preferably, the data analysis and processing system further includes:
a copy module, configured to generate a second workflow when an operation of the user creating a second workflow with the same content as the first workflow is detected, the second workflow being editable.
Preferably, the data analysis and processing system further includes:
a visualization module, configured to display visual information corresponding to the data when an operation of the user viewing the configured data is detected.
Preferably, the data analysis and processing system further includes:
a publishing module, configured to publish the business model that has been run when an operation of the user publishing that business model is detected.
Preferably, the data analysis and processing system further includes:
a re-evaluation module, configured to re-evaluate the business model that has been run, or the published business model, when an operation of the user re-evaluating that business model or the published business model is detected.
Referring to Fig. 12, which is a schematic structural diagram of a data analysis and processing system according to another embodiment of the present invention, the data analysis and processing system 120 includes a processor 1201 and a memory 1202. In the embodiment of the present invention, the data analysis and processing system 120 further includes a computer program stored in the memory 1202 and executable on the processor 1201. When executed by the processor 1201, the computer program implements the following steps:
displaying a user interface, the user interface being used by the user to set the scenario and data for creating a business model;
obtaining the scenario and/or data set by the user in the user interface, selecting one model strategy from multiple model strategies according to the obtained scenario and/or data, and creating a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method of the algorithm.
The processor 1201 is responsible for managing the bus architecture and general processing, and the memory 1202 can store the data used by the processor 1201 when performing operations.
Preferably, the model strategy further includes at least one of the following: an evaluation method of the algorithm, a parameter setting method of the algorithm, a splitting method of the data, a processing method of the data, and a feature selection method of the data.
Preferably, the user interface is also used by the user to set the target feature for creating the business model.
Preferably, when executed by the processor 1201, the computer program can also implement the following steps: displaying a scenario list in the user interface for the user to select from;
when an operation of the user selecting a scenario in the scenario list is detected, displaying the selected scenario in the user interface;
or
displaying a scenario input area in the user interface;
when an operation of the user entering a scenario in the input area is detected, obtaining the scenario entered by the user;
displaying in the user interface the scenario in the scenario list that matches the scenario entered by the user.
Preferably, the scenario includes at least one of: a scenario corresponding to a clustering algorithm, a scenario corresponding to a classification algorithm, a scenario corresponding to a regression algorithm, a scenario corresponding to anomaly detection, and a scenario corresponding to language processing.
Preferably, when the scenario is a scenario corresponding to a clustering algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: hierarchical clustering, Bayesian Gaussian mixture, KD tree, and restricted Boltzmann machine. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the silhouette coefficient method;
when the scenario is a scenario corresponding to a classification algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, Bagging, AdaBoost, neural network, and stacking model. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the area under the curve (AUC) score method;
when the scenario is a scenario corresponding to a regression algorithm, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: logistic regression, random forest, support vector regression, and neural network. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the R2 value method;
when the scenario is a scenario corresponding to anomaly detection, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: neural network, support vector machine, robust regression, nearest neighbors, and isolation forest. The parameter tuning method of the algorithm is based on hyperparameter optimization, and the hyperparameter optimization method includes at least one of: random parameter search, grid parameter search, and the F1 score method;
when the scenario is a scenario corresponding to language processing, the information of the selected model strategy includes an algorithm and a parameter tuning method of the algorithm. The algorithm includes at least one of: latent semantic indexing, latent Dirichlet allocation, and conditional random field. The parameter tuning method of the algorithm includes: providing default parameters according to the result of word frequency analysis, and using the default parameters.
Preferably, when executed by the processor 1201, the computer program can also implement the following steps: after the step of selecting one model strategy from multiple model strategies, the method further includes:
displaying modeling design information of the created business model, the modeling design information including at least: information about the selected model strategy.
Preferably, when executed by the processor 1201, the computer program can also implement the following steps: after the step of displaying the modeling design information of the created business model, the method further includes:
when an operation of the user adjusting the modeling design information is detected, updating the modeling design information;
when an operation of the user running the created business model is detected, running the created business model according to the updated modeling design information.
Preferably, when executed by the processor 1201, the computer program can also implement the following steps: after the step of selecting one model strategy from multiple model strategies, the method further includes:
when an operation of the user running the created business model is detected, running the created business model using the selected model strategy.
Preferably, following steps be can also be achieved when computer program is executed by processor 1201: the operation creation
After the step of business model of completion, further includes:
The modeling achievement for the business model that display operation is completed, the modeling achievement includes at least one of: the fortune
The business model that the score for the business model that the title for the business model that row is completed, the operation are completed and the operation are completed
Export result.
Preferably, the modeling results further include: information on the model strategy of the run business model, the creation time of the run business model, the training information of the run business model, the workflow corresponding to the run business model, the state of the run business model, and importance ranking information for the features of the data.
Preferably, the modeling results include: information on the M highest-scoring business models among the N run business models corresponding to the selected model strategy, or information on all N run business models corresponding to the selected model strategy, where M is a positive integer greater than or equal to 1 and N is a positive integer greater than or equal to M.
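The choice between the top M of N run business models and all N can be sketched in Python as follows. The record layout (`name`, `score`) and the function name `modeling_results` are assumptions for illustration; the only constraint taken from the text is 1 ≤ M ≤ N.

```python
import heapq
from typing import Dict, List, Optional, Union

Record = Dict[str, Union[str, float]]


def modeling_results(run_models: List[Record],
                     m: Optional[int] = None) -> List[Record]:
    """Return the top-M run models by score, or all N when m is None."""
    n = len(run_models)
    if m is None:
        # Second alternative in the text: information on all N models.
        return list(run_models)
    if not 1 <= m <= n:
        raise ValueError("M must satisfy 1 <= M <= N")
    # First alternative: the M highest-scoring models, best first.
    return heapq.nlargest(m, run_models, key=lambda r: r["score"])


runs: List[Record] = [
    {"name": "model_a", "score": 0.81},
    {"name": "model_b", "score": 0.93},
    {"name": "model_c", "score": 0.77},
    {"name": "model_d", "score": 0.88},
]
top_two = modeling_results(runs, m=2)
```

With the sample scores above, `top_two` contains `model_b` followed by `model_d`.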
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
displaying the modeling design information of the run business model, the modeling design information including at least: information on the selected model strategy;
when an operation of the user adjusting the modeling design information is detected, updating the modeling design information;
when an operation of the user re-running the run business model is detected, re-running that business model according to the updated modeling design information.
Preferably, the modeling design information further includes: the scene and/or the target feature.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: creating a first workflow corresponding to the created business model, the first workflow including multiple workflow modules.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of creating the first workflow corresponding to the created business model, the method further includes:
when an operation of running the created business model is detected, or an operation of the user adjusting the information of the run business model is detected, updating the first workflow.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of creating the first workflow corresponding to the created business model, the method further includes:
when an operation of the user creating a second workflow with the same content as the first workflow is detected, generating the second workflow, the second workflow being editable.
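One way to picture the second workflow is as an independent, editable copy of the first. The Python sketch below makes that assumption explicit; the `Workflow` and `WorkflowModule` classes, the module names, and the `editable` flag on the first workflow are hypothetical, since the text only states that the second workflow has the same content and is editable.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class WorkflowModule:
    name: str
    config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Workflow:
    name: str
    modules: List[WorkflowModule]
    editable: bool = False  # assumption: the generated first workflow is locked


def clone_workflow(first: Workflow, new_name: str) -> Workflow:
    """Create a second workflow with the same content that can be edited."""
    second = copy.deepcopy(first)  # deep copy so edits never touch the first
    second.name = new_name
    second.editable = True
    return second


first = Workflow(
    name="first_workflow",
    modules=[
        WorkflowModule("data_split", {"test_ratio": 0.2}),
        WorkflowModule("feature_selection"),
        WorkflowModule("training", {"algorithm": "gradient_boosting"}),
    ],
)
second = clone_workflow(first, "second_workflow")
second.modules[0].config["test_ratio"] = 0.3  # edit only the copy
```

A deep copy is used here so that changing a module in the second workflow leaves the first workflow, which stays tied to the created business model, unchanged.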
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of displaying the user interface, the method further includes:
when an operation of the user viewing the set data is detected, displaying visualization information corresponding to the data.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
when an operation of the user publishing the run business model is detected, publishing the run business model.
Preferably, when the computer program is executed by the processor 1201, the following steps may also be implemented: after the step of running the created business model, the method further includes:
when an operation of the user re-evaluating the run business model or the published business model is detected, re-evaluating the run business model or the published business model.
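The text does not say how a re-evaluation is computed. As a purely illustrative Python sketch, the example below assumes that re-evaluation means scoring an existing model's predictions against newly supplied labeled data using accuracy; the `reevaluate` function, the threshold model, and the sample data are all hypothetical.

```python
from typing import Callable, Sequence, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def reevaluate(predict: Callable[[X], Y],
               features: Sequence[X],
               labels: Sequence[Y]) -> float:
    """Score an already run or published model on new labeled data.

    Accuracy is used only as an example metric; a strategy could name
    a different evaluation method.
    """
    if len(features) != len(labels) or len(labels) == 0:
        raise ValueError("features and labels must be non-empty and aligned")
    correct = sum(1 for x, y in zip(features, labels) if predict(x) == y)
    return correct / len(labels)


def threshold_model(x: float) -> bool:
    # Hypothetical published model: a fixed threshold classifier.
    return x >= 0.5


new_features = [0.1, 0.7, 0.4, 0.9]
new_labels = [False, True, True, True]
new_score = reevaluate(threshold_model, new_features, new_labels)
```

Here three of the four predictions match the new labels, so `new_score` is 0.75.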
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above automatic modeling method embodiments and achieves the same technical effects; to avoid repetition, the details are not repeated here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. An automatic modeling method for a data analysis processing system, comprising:
displaying a user interface, the user interface being used by a user to set a scene and data for creating a business model;
obtaining the scene and/or data set by the user in the user interface, selecting a model strategy from multiple model strategies according to the obtained scene and/or data, and creating a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method of the algorithm.
2. The automatic modeling method according to claim 1, wherein the model strategy further includes at least one of the following: an evaluation method of the algorithm, a parameter setting method of the algorithm, a splitting method of the data, a processing method of the data, and a feature selection method of the data.
3. The automatic modeling method according to claim 1, wherein the scene includes at least one of: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
4. The automatic modeling method according to claim 1, wherein after the step of creating a business model according to the selected model strategy, the method further comprises:
displaying modeling design information of the created business model, the modeling design information including at least: information on the selected model strategy.
5. The automatic modeling method according to claim 1, wherein after the step of creating a business model according to the selected model strategy, the method further comprises:
when an operation of the user running the created business model is detected, running the created business model using the selected model strategy.
6. A data analysis processing system, comprising:
a display module, configured to display a user interface, the user interface being used by a user to set a scene and data for creating a business model;
a processing module, configured to obtain the scene and/or data set by the user in the user interface, select a model strategy from multiple model strategies according to the obtained scene and/or data, and create a business model according to the selected model strategy, the model strategy including at least the following information: an algorithm and a parameter tuning method of the algorithm.
7. The data analysis processing system according to claim 6, wherein the model strategy further includes at least one of the following: an evaluation method of the algorithm, a parameter setting method of the algorithm, a splitting method of the data, a processing method of the data, and a feature selection method of the data.
8. The data analysis processing system according to claim 6, wherein the scene includes at least one of: a scene corresponding to a clustering algorithm, a scene corresponding to a classification algorithm, a scene corresponding to a regression algorithm, a scene corresponding to anomaly detection, and a scene corresponding to language processing.
9. The data analysis processing system according to claim 6, wherein
the display module is further configured to display modeling design information of the created business model, the modeling design information including at least: information on the selected model strategy.
10. The data analysis processing system according to claim 6, further comprising:
a second running module, configured to run the created business model using the selected model strategy when an operation of the user running the created business model is detected.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810632499.6A CN109389143A (en) | 2018-06-19 | 2018-06-19 | A kind of Data Analysis Services system and method for automatic modeling |
CN202111299347.7A CN113935434A (en) | 2018-06-19 | 2018-06-19 | Data analysis processing system and automatic modeling method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810632499.6A CN109389143A (en) | 2018-06-19 | 2018-06-19 | A kind of Data Analysis Services system and method for automatic modeling |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111299347.7A Division CN113935434A (en) | 2018-06-19 | 2018-06-19 | Data analysis processing system and automatic modeling method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109389143A true CN109389143A (en) | 2019-02-26 |
Family
ID=65416532
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810632499.6A Pending CN109389143A (en) | 2018-06-19 | 2018-06-19 | A kind of Data Analysis Services system and method for automatic modeling |
CN202111299347.7A Pending CN113935434A (en) | 2018-06-19 | 2018-06-19 | Data analysis processing system and automatic modeling method |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111299347.7A Pending CN113935434A (en) | 2018-06-19 | 2018-06-19 | Data analysis processing system and automatic modeling method |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN109389143A (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114610204B (en) * | 2022-03-14 | 2024-03-26 | 中国农业银行股份有限公司 | Auxiliary device and method for data processing, storage medium and electronic equipment |
CN115455135B (en) * | 2022-06-30 | 2023-10-31 | 北京九章云极科技有限公司 | Visual automatic modeling method and device, electronic equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101169798A (en) * | 2007-12-06 | 2008-04-30 | 中国电信股份有限公司 | Data excavation system and method |
CN104850405A (en) * | 2015-05-25 | 2015-08-19 | 武汉众联信息技术股份有限公司 | Intelligent configurable workflow engine and implementation method therefor |
CN105095436A (en) * | 2015-07-23 | 2015-11-25 | 苏州国云数据科技有限公司 | Automatic modeling method for data of data sources |
CN106164945A (en) * | 2014-04-11 | 2016-11-23 | 微软技术许可有限责任公司 | Sight modeling and visualization |
CN106250987A (en) * | 2016-07-22 | 2016-12-21 | 无锡华云数据技术服务有限公司 | A kind of machine learning method, device and big data platform |
CN106997386A (en) * | 2017-03-28 | 2017-08-01 | 上海跬智信息技术有限公司 | A kind of OLAP precomputations model, method for automatic modeling and automatic modeling system |
CN107038167A (en) * | 2016-02-03 | 2017-08-11 | 普华诚信信息技术有限公司 | Big data excavating analysis system and its analysis method based on model evaluation |
CN107103050A (en) * | 2017-03-31 | 2017-08-29 | 海通安恒(大连)大数据科技有限公司 | A kind of big data Modeling Platform and method |
CN107958268A (en) * | 2017-11-22 | 2018-04-24 | 用友金融信息技术股份有限公司 | The training method and device of a kind of data model |
-
2018
- 2018-06-19 CN CN201810632499.6A patent/CN109389143A/en active Pending
- 2018-06-19 CN CN202111299347.7A patent/CN113935434A/en active Pending
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111724185A (en) * | 2019-03-21 | 2020-09-29 | 北京沃东天骏信息技术有限公司 | User maintenance method and device |
CN110083637A (en) * | 2019-04-23 | 2019-08-02 | 华东理工大学 | A kind of denoising method towards bridge defect ratings data |
CN110083637B (en) * | 2019-04-23 | 2023-04-18 | 华东理工大学 | Bridge disease rating data-oriented denoising method |
CN110222710A (en) * | 2019-04-30 | 2019-09-10 | 北京深演智能科技股份有限公司 | Data processing method, device and storage medium |
CN110222710B (en) * | 2019-04-30 | 2022-03-08 | 北京深演智能科技股份有限公司 | Data processing method, device and storage medium |
CN110135064A (en) * | 2019-05-15 | 2019-08-16 | 上海交通大学 | A kind of generator rear bearing temperature fault prediction technique, system and controller |
CN110135064B (en) * | 2019-05-15 | 2023-07-18 | 上海交通大学 | Method, system and controller for predicting temperature faults of rear bearing of generator |
CN110443126A (en) * | 2019-06-27 | 2019-11-12 | 平安科技(深圳)有限公司 | Model hyper parameter adjusts control method, device, computer equipment and storage medium |
WO2020258508A1 (en) * | 2019-06-27 | 2020-12-30 | 平安科技(深圳)有限公司 | Model hyper-parameter adjustment and control method and apparatus, computer device, and storage medium |
CN110334955B (en) * | 2019-07-08 | 2021-09-14 | 北京字节跳动网络技术有限公司 | Index evaluation processing method, device, equipment and storage medium |
CN110334955A (en) * | 2019-07-08 | 2019-10-15 | 北京字节跳动网络技术有限公司 | Processing method, device, equipment and the storage medium of index evaluation |
CN110717535B (en) * | 2019-09-30 | 2020-09-11 | 北京九章云极科技有限公司 | Automatic modeling method and system based on data analysis processing system |
CN110717535A (en) * | 2019-09-30 | 2020-01-21 | 北京九章云极科技有限公司 | Automatic modeling method and system based on data analysis processing system |
CN110705312A (en) * | 2019-09-30 | 2020-01-17 | 贵州航天云网科技有限公司 | Development system for rapidly developing industrial mechanism model based on semantic analysis |
CN110766167A (en) * | 2019-10-29 | 2020-02-07 | 深圳前海微众银行股份有限公司 | Interactive feature selection method, device and readable storage medium |
CN110807044A (en) * | 2019-10-30 | 2020-02-18 | 东莞市盟大塑化科技有限公司 | Model dimension management method based on artificial intelligence technology |
CN110956272A (en) * | 2019-11-01 | 2020-04-03 | 第四范式(北京)技术有限公司 | Method and system for realizing data processing |
CN110956272B (en) * | 2019-11-01 | 2023-08-08 | 第四范式(北京)技术有限公司 | Method and system for realizing data processing |
CN111242358A (en) * | 2020-01-07 | 2020-06-05 | 杭州策知通科技有限公司 | Enterprise information loss prediction method with double-layer structure |
CN111784040B (en) * | 2020-06-28 | 2023-04-25 | 平安医疗健康管理股份有限公司 | Optimization method and device for policy simulation analysis and computer equipment |
CN111784040A (en) * | 2020-06-28 | 2020-10-16 | 平安医疗健康管理股份有限公司 | Optimization method and device for policy simulation analysis and computer equipment |
CN112380216A (en) * | 2020-11-17 | 2021-02-19 | 北京融七牛信息技术有限公司 | Automatic feature generation method based on intersection |
CN112577955A (en) * | 2020-11-23 | 2021-03-30 | 淮阴师范学院 | Water bloom water body detection method and system |
CN112633754A (en) * | 2020-12-30 | 2021-04-09 | 国网新疆电力有限公司信息通信公司 | Modeling method and system of data analysis model |
CN113010946A (en) * | 2021-02-26 | 2021-06-22 | 万翼科技有限公司 | Data analysis method, electronic equipment and related product |
CN113010946B (en) * | 2021-02-26 | 2024-01-23 | 深圳市万翼数字技术有限公司 | Data analysis method, electronic equipment and related products |
CN113010226A (en) * | 2021-03-16 | 2021-06-22 | 北京云从科技有限公司 | Model loading method, system, electronic device and medium |
CN113239025A (en) * | 2021-04-23 | 2021-08-10 | 四川大学 | Ship track classification method based on feature selection and hyper-parameter optimization |
CN112884092B (en) * | 2021-04-28 | 2021-11-02 | 深圳索信达数据技术有限公司 | AI model generation method, electronic device, and storage medium |
CN112884092A (en) * | 2021-04-28 | 2021-06-01 | 深圳索信达数据技术有限公司 | AI model generation method, electronic device, and storage medium |
CN113282461A (en) * | 2021-05-28 | 2021-08-20 | 中国联合网络通信集团有限公司 | Alarm identification method and device for transmission network |
CN113282461B (en) * | 2021-05-28 | 2023-06-23 | 中国联合网络通信集团有限公司 | Alarm identification method and device for transmission network |
CN113449471A (en) * | 2021-06-25 | 2021-09-28 | 东北电力大学 | Wind power output simulation generation method for continuously improving MC (multi-channel) by utilizing AP (access point) clustering-skipping |
CN113822327A (en) * | 2021-07-31 | 2021-12-21 | 云南电网有限责任公司信息中心 | Algorithm recommendation method based on data characteristics and analytic hierarchy process |
CN114117050A (en) * | 2021-11-30 | 2022-03-01 | 济南农村商业银行股份有限公司 | Full-automatic accounting flow popup window processing method, device and system |
Also Published As
Publication number | Publication date |
---|---|
CN113935434A (en) | 2022-01-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190226 |