CN101324901A - Method, platform and system for excavating data - Google Patents

Publication number
CN101324901A
Authority
CN
China
Prior art keywords
data mining
data
task items
model
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008101348990A
Other languages
Chinese (zh)
Inventor
曾宪伟
漆晨曦
柯晓燕
辜敏
张亮
刘斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CNA2008101348990A
Publication of CN101324901A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data mining method, platform and system. The data mining method comprises the following steps: a data mining platform provides a parameter configuration interface for a preset data mining model; parameter settings from a branch office user are received through the parameter configuration interface and a task item is generated; and the data mining platform executes the task item to perform data mining. With the method, platform and system, a branch office user who has connected to the data mining platform sets the corresponding parameters through the preset parameter configuration interface, so that the mining model can be applied quickly to obtain a localized result, which facilitates the popularization and application of data mining technology.

Description

Data mining method, platform and system
Technical Field
The present invention relates to data mining technology, and in particular to a data mining method, platform and system.
Background
As information services grow richer in content, data mining technology is applied ever more widely and has become a hot research topic. In practice, however, data mining is highly specialized, places high demands on technical staff, has long research cycles and requires a large investment of personnel. These problems are the main bottleneck restricting the wider application of data mining technology. A fast and convenient way of applying data mining technology is therefore highly desirable.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a data mining method that is convenient to use.
The data mining method provided by the invention comprises the steps of: providing, at a data mining platform, a parameter configuration interface for a preset data mining model; receiving parameter settings from a branch office user through the parameter configuration interface and generating a task item; and executing the task item at the data mining platform to perform data mining.
Further, the data mining platform provides explanatory documents for the data mining model to help the branch office user set the parameters.
According to an embodiment of the data mining method of the present invention, the task item is composed of a plurality of stream files arranged in logical order, and each stream file is used to complete a flow operation of data preparation, model building, model evaluation or release.
According to an embodiment of the data mining method of the present invention, the data mining platform also provides semi-open stream files, and the data mining method further comprises the step of: receiving the branch office user's modifications to designated parameters in a stream file and generating a corresponding task item.
Further, the data mining method of the present invention also comprises the step of: the data mining platform automatically running the task item according to a task item run script provided by the branch office user.
Further, the data mining method of the present invention also comprises the steps of: managing the data mining models preset in the data mining platform by category; and the data mining platform managing branch office users by level, setting the access and operation permissions of branch office users of different levels.
In the data mining method provided by the invention, data mining models are preset on the data mining platform and a parameter configuration interface is provided, so that a branch office user can generate a localized data mining model directly by setting parameters and then carry out data mining. This makes the application of data mining technology faster and more convenient.
Another technical problem to be solved by the present invention is to provide a convenient and fast data mining platform.
The invention provides a data mining platform comprising: an information storage device for storing the attribute information of data mining models and the task items; an information management device for providing the parameter configuration interface of a data mining model, receiving parameter settings from a branch office user, generating task items and scheduling their execution; a data mining mart for storing the mass data used in the data mining process; and a data mining device for performing data mining on the mass data in the data mining mart according to a task item.
Further, the information storage device is also used to store the explanatory documents of the data mining models, and the information management device provides the explanatory documents to a branch office user in response to the user's request.
Further, the task item is composed of a plurality of stream files arranged in logical order, and each stream file is used to complete a flow operation of data preparation, model building, model evaluation or release. The information management device is also used to provide the stream files to a branch office user in response to the user's request, to receive stream files in which designated parameters have been modified, and to generate the corresponding task items.
According to an embodiment of the data mining platform of the present invention, the information management device is also used to receive a task item run script from a branch office user and to schedule the execution of the task item automatically according to that run script.
According to an embodiment of the data mining platform of the present invention, the information storage device stores the attribute information of the data mining models by category. The information storage device is also used to store user information, to manage users by level, and to set the corresponding access and operation permission attributes for users of different levels. The information management device is also used to control each user's access and operations according to the user's access and operation permission attributes.
In the data mining platform provided by the invention, data mining models are preset in the information storage module and the information management module provides a parameter configuration interface, so that a branch office user can generate a localized data mining model directly by setting parameters and have the data mining module carry out the data mining. This makes the application of data mining technology faster and more convenient.
The present invention also provides a data mining system comprising: a data mining platform for providing a parameter configuration interface for a preset data mining model, receiving parameter settings through the parameter configuration interface, generating a task item, and performing data mining on the stored mass data according to the task item; and a branch office terminal for connecting to the data mining platform, receiving the user's parameter settings, and sending them to the data mining platform through the parameter configuration interface that the platform provides.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the data mining system according to the present invention;
Fig. 2 is a flowchart of an embodiment of the data mining method according to the present invention;
Fig. 3 is a flowchart of another embodiment of the data mining method according to the present invention;
Fig. 4 is a flowchart of an application example of the data mining method according to the present invention;
Fig. 5 is a structural diagram of an embodiment of the data mining platform according to the present invention.
Detailed Description
The present invention is described more fully below with reference to the accompanying drawings, in which exemplary embodiments of the invention are illustrated.
The basic idea of the present invention is to encapsulate the code produced at each stage of a data mining process as a black box, to set parameter interfaces in the program code, and to deploy the result on a data mining platform. After connecting to the data mining platform, a branch office sets the relevant parameters through a preset parameter configuration interface (for example a visual interface), and can thus apply the mining model quickly and obtain a localized result. In addition, by setting the run time of a data mining task, a branch office can have the task run automatically at the specified time, so that the data mining process completes without manual intervention. To keep the data mining platform running effectively, the present invention also provides category management of mining models and level management of users: mining models are sorted into categories by specific application, and the access and operation permissions of users of different levels to the different mining models and data mining tasks are set, which makes the management of data mining results more centralized and controllable.
Fig. 1 is a schematic diagram of the data mining system according to the present invention. As shown in Fig. 1, the data mining system comprises a terminal 11 and a data mining platform 12. The terminal 11 is connected to the data mining platform 12 through a network; it receives the user's parameter settings and sends them to the data mining platform 12 through the parameter configuration interface that the platform provides. The data mining platform 12 comprises an information management module 121, a data mining module 122, an information storage module 123 and a data mining mart 124. The information storage module 123 stores the attribute information of login users, data mining projects, task items and templates. It may be a database; for example, Sybase ASE 12.5 may be used to obtain good performance. The data mining mart 124 stores mass data for use by the data mining process. The information management module 121 and the data mining module 122 serve as the external interfaces of the platform and may be separate servers. The information management module 121 provides the parameter configuration interface of the data mining models, receives parameter settings from branch office users, generates task items and schedules their execution. The data mining module 122 performs data mining on the mass data in the data mining mart according to the task items. Professional data mining tool software (for example Clementine Server 8.5) is installed on the data mining module 122 to run the data mining tasks. Template developers deploy data mining model templates on the information management module 121 and manage them centrally. Branch office personnel use the terminal 11 (for example a PC) to connect to the information management module 121, download a data mining model template, build a localized model, and then redeploy the localized model to the information management module 121. Finally, when a data mining task is executed, the information management module 121 calls the data mining tool software (for example Clementine Server 8.5) on the data mining module 122 to obtain data mining results that meet the user's own needs.
It should be noted that the information management module and the data mining module may also reside on the same server.
One way of implementing the data mining models preset on the data mining platform of the present invention is described below.
The developers of a data mining model use professional data mining software to build the data mining model and its related applications, upload the model to the data mining platform, set the interface through which other personnel call the model, and upload the explanatory documents for the model to the data mining platform. Specifically, this process comprises:
(1) Develop the model. After gaining a thorough business understanding and data understanding of a specific problem such as customer segmentation, customer churn early warning or market forecasting, the developers determine the input data variables required by the mining model, develop the algorithm of the mining model, and establish a method for evaluating the model. The developers carry out these steps mainly with professional data mining software, and the results take the form of a number of stream files.
(2) Set parameters. To make the model easy to promote and easy for branch offices to use, the stream files are given interfaces for external calls. Parameter interfaces are set in the stream files. These interfaces are flexible and may be, for example, data source connection attributes, data selection conditions, data output paths, localization parameters, model evaluation result output paths, and so on.
(3) Upload the model. The developers upload the stream files developed in the stages of the data mining flow, such as data preparation, model building and model evaluation, over the network to the information management server of the data mining platform.
(4) Form task items and set the model call parameter interfaces. Several stream files, each completing a single operation, are combined in a certain logical order into one operation task item. A task item can complete a fairly complete flow operation for a stage such as data preparation, model building, model evaluation or release. The stream files in a task item contain parameters, and these parameters must be configured on the data mining platform to form an external call interface. For each task item that is created, parameter configuration interfaces are set on the information management server in one-to-one correspondence with, and with the same names as, the parameters contained in the stream files of the task item, and initial values are set for these parameters.
(5) Upload the documents produced during the data mining application, including requirements documents, data description documents, stream file description documents, parameter description documents and data mining result documents, to the data mining platform.
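The relationship in step (4) between stream files, task items and the parameter configuration interface can be sketched as follows. This is a minimal illustration only, not the platform's implementation: the class names, the stage labels, the sample parameter values, and the choice to qualify each parameter with its stream name are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StreamFile:
    """One stream file; `params` maps parameter names to initial values."""
    name: str
    stage: str  # hypothetical labels, e.g. "data_preparation", "modeling"
    params: dict = field(default_factory=dict)


@dataclass
class TaskItem:
    """An ordered combination of stream files that completes one operation."""
    name: str
    streams: list  # list order is the logical execution order

    def parameter_interface(self) -> dict:
        """Expose every stream parameter together with its initial value."""
        interface = {}
        for stream in self.streams:
            for key, initial in stream.params.items():
                # Assumption: prefix with the stream name so that parameters
                # with the same name in different streams do not collide.
                interface[f"{stream.name}.{key}"] = initial
        return interface


prepare = TaskItem(
    name="prepare_wide_table",
    streams=[
        StreamFile("extract", "data_preparation",
                   {"data_source": "mart_dsn", "month": "200801"}),
        StreamFile("join", "data_preparation",
                   {"output_path": "/tmp/wide_table"}),
    ],
)
print(prepare.parameter_interface())
```

Keeping the streams in a list makes the logical order explicit, and deriving the interface from the stream parameters keeps the two in one-to-one correspondence.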
Fig. 2 is a flowchart of an embodiment of the data mining method according to the present invention.
As shown in Fig. 2, in step 202, a parameter configuration interface for a preset data mining model is provided at the data mining platform.
In step 204, the data mining platform receives parameter settings from a branch office user through the parameter configuration interface and generates a task item. The branch office user logs in to the data mining platform over the network and opens the task item attribute interface, which shows the parameters contained in the task item together with their meanings and initial values. By consulting the model's explanatory documents, the user can quickly understand what each parameter means and then change the parameter values to suit the user's own needs. The task item thereby becomes a data mining process fitted to those needs, which achieves the aim of promoting data mining model results efficiently.
In step 206, the generated task item is executed at the data mining platform to perform data mining.
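Step 204 amounts to overlaying the branch office user's values on the initial values of the parameter configuration interface. The sketch below shows one way this could be done. The function name `localize` and the sample parameter names and values are hypothetical, and rejecting parameters that the interface does not expose is an assumption rather than something the description specifies.

```python
def localize(defaults: dict, overrides: dict) -> dict:
    """Return the task-item parameters with branch office values applied."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Assumption: only parameters exposed by the interface may be set.
        raise KeyError(f"parameters not exposed by the interface: {sorted(unknown)}")
    merged = dict(defaults)   # copy, so the preset initial values stay intact
    merged.update(overrides)
    return merged


defaults = {"data_source": "hq_mart", "month": "200801", "output_path": "/out/hq"}
local = localize(defaults, {"data_source": "branch_mart", "output_path": "/out/branch"})
print(local)
```

Parameters that the user leaves untouched keep the initial values set by the model developers.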
Because the data mining models are deployed centrally on the data mining platform and shared, a branch office can connect to the platform and use a model to produce localized results, without duplicating the investment of personnel and funds in developing models.
According to an embodiment of the data mining method of the present invention, the data mining platform also provides semi-open stream files. Such stream files expose advanced parameter interfaces, such as the expert parameters of a model or the number of clusters, for use by users with a higher level of mining expertise. The method further comprises the step of: receiving the branch office user's modifications to designated parameters in a stream file and generating a corresponding task item. The designated parameters are the opened advanced parameter interfaces, such as the expert parameters of the model or the number of clusters.
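One possible way to model a semi-open stream file is to flag each parameter as basic or advanced and to show the advanced ones only to expert users. This is a sketch under that assumption. The `Param` class, the `advanced` flag, the `expert_user` switch and the sample parameters are illustrative and do not come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Param:
    name: str
    value: object
    advanced: bool = False  # True for expert settings such as the cluster count


def visible_params(params, expert_user: bool):
    """Basic users see basic parameters; expert users also see advanced ones."""
    return [p for p in params if expert_user or not p.advanced]


stream_params = [
    Param("data_source", "mart_dsn"),
    Param("cluster_count", 5, advanced=True),
    Param("expert_mode", False, advanced=True),
]
print([p.name for p in visible_params(stream_params, expert_user=False)])
print([p.name for p in visible_params(stream_params, expert_user=True)])
```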
The data volume involved in a data mining process is often very large, and the run time is correspondingly long. According to a preferred embodiment of the present invention, the data mining method further comprises the step of: the data mining platform automatically running the task item according to a task item run script provided by the branch office user. The user only needs to set the planned run time of the task item for it to be executed at the specified time. When the user submits a task item, the client first creates a run script for each data stream that makes up the task item. Each run script contains the following information: the account and password for connecting to the data mining tool software, the path on the server of the stream file to be executed, the parameters contained in the stream file and their values, and the path where the log of the stream file's execution result is stored. The client then sends the run scripts of the stream files to the server one after another. After receiving all the run scripts, the server submits them in turn to the data mining tool software as command lines. The data mining tool software connects to the data mining mart, runs the stream files in the task item in turn, and saves the run log files in the project folder so that the user can check the result status of the task item run. If one of the stream files fails, the system aborts the task item, that is, the stream files after the failed one in the task item are not run, which saves system resources. Raising the degree of automation in the mining process reduces the steps that need manual participation and greatly saves the user's time.
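The run-script mechanism described above can be sketched as a record holding the four pieces of information, plus a loop that submits the scripts in order and aborts at the first failure. This is an illustration only. The field names, the file paths and the `.str` extension are hypothetical, and the call to the data mining tool is represented by an injected `run_stream` function rather than any real command line.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunScript:
    account: str       # account for connecting to the data mining tool software
    password: str
    stream_path: str   # path of the stream file on the server
    params: dict       # parameters in the stream file and their values
    log_path: str      # where the execution log of this stream file is stored


def run_task_item(scripts, run_stream: Callable[[RunScript], bool]):
    """Run the scripts in order and stop at the first failure."""
    results = []
    for index, script in enumerate(scripts):
        ok = run_stream(script)
        results.append((script.stream_path, "ok" if ok else "failed"))
        if not ok:
            # The streams after the failed one are never submitted.
            results.extend((s.stream_path, "skipped") for s in scripts[index + 1:])
            break
    return results


scripts = [
    RunScript("user", "secret", f"/projects/churn/{name}.str",
              {"month": "200801"}, f"/projects/churn/logs/{name}.log")
    for name in ("extract", "join", "score")
]
# Stand-in runner that pretends the second stream fails.
results = run_task_item(scripts, lambda s: not s.stream_path.endswith("join.str"))
print(results)
```

Injecting the runner keeps the abort logic testable without a data mining server.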
An embodiment of the data mining method of the present invention further comprises the step of: managing the data mining models preset in the data mining platform by category. The data mining models are classified by specific application, with the project as the unit, and the projects are in turn managed by project category. The model developers create a project for each specific application, such as customer segmentation, business customer churn early warning or business customer contract management, and place the mining models, task items and explanatory documents related to the application in the corresponding project. When a project is created, the information management server creates a folder with the same name as the project, and from then on the stream files, documents and task item log files related to the project are all placed in this folder. With this hierarchical category management of the data mining models, model management does not become chaotic as the number of models grows.
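The folder convention described above, one folder per project named after the project, can be sketched with the standard library. Placing the project folder under a folder named after its category is an assumption made for the example, as are the function name and the sample category and project names.

```python
import tempfile
from pathlib import Path


def create_project(root, category: str, project: str) -> Path:
    """Create the folder that will hold a project's streams, documents and logs."""
    folder = Path(root) / category / project
    # exist_ok=False makes a duplicate project name fail loudly.
    folder.mkdir(parents=True, exist_ok=False)
    return folder


with tempfile.TemporaryDirectory() as root:
    folder = create_project(root, "customer_management", "customer_segmentation")
    created = folder.name
    relative = folder.relative_to(root).as_posix()

print(created)
print(relative)
```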
An embodiment of the data mining method of the present invention further comprises the step of: the data mining platform managing branch office users by level and setting the access and operation permissions of branch office users of different levels. User roles for the data mining platform are defined according to users' job functions, and different permissions are granted to different types of user, which keeps the scope of use of the whole system under control. Users are divided into four roles: administrator, developer, analyst and operator. The operator has the lowest permissions and is responsible for monitoring the running of data mining task items; an operator may only execute task items. The analyst has some data mining skills. After studying a model's explanatory documents the analyst understands the model's operating flow, and can therefore use an existing task item as a reference to build a new data mining task item by adding or deleting operation streams or by changing the logical order of the stream files, which allows models to be reused at a higher level. Developers are generally technical staff in departments with strong data mining research and development capability. They are responsible for developing data mining models and deploying them on the data mining platform, and for project management, task management and model management. The administrator has the highest permissions: in addition to the permissions of the other three roles, the administrator is responsible for managing users.
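The four roles can be sketched as a mapping from role to permitted operations. The description states only that the operator may execute task items and that the administrator holds the permissions of the other three roles. Treating the whole hierarchy as cumulative, and the permission names themselves, are assumptions made for this example.

```python
from enum import Enum


class Role(Enum):
    OPERATOR = "operator"
    ANALYST = "analyst"
    DEVELOPER = "developer"
    ADMIN = "administrator"


# Assumption: each level adds to the permissions of the level below it.
_OPERATOR = {"execute_task"}
_ANALYST = _OPERATOR | {"create_task_from_existing"}
_DEVELOPER = _ANALYST | {"deploy_model", "manage_project", "manage_task", "manage_model"}
_ADMIN = _DEVELOPER | {"manage_users"}

PERMISSIONS = {
    Role.OPERATOR: _OPERATOR,
    Role.ANALYST: _ANALYST,
    Role.DEVELOPER: _DEVELOPER,
    Role.ADMIN: _ADMIN,
}


def can(role: Role, operation: str) -> bool:
    """Return True if the role is allowed to perform the operation."""
    return operation in PERMISSIONS[role]


print(can(Role.OPERATOR, "execute_task"))
print(can(Role.OPERATOR, "deploy_model"))
```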
Fig. 3 is a flowchart of another embodiment of the data mining method according to the present invention.
As shown in Fig. 3, in step 302, the model template is encapsulated at the data mining platform and the parameter interfaces are set.
In step 304, the branch office user logs in to the data mining platform through a local client and downloads the model's explanatory documents from the data mining platform to the branch office client.
In step 306, the branch office user downloads the data mining model from the data mining platform to the branch office client.
In step 308, the branch office user modifies the data mining model at the client.
In step 310, the branch office user uploads the modified data mining model to the data mining platform.
In step 312, the corresponding task item is built at the data mining platform from the modified data mining model uploaded by the client. Parameter values may be preset.
In step 314, the branch office client views the task item on the data mining platform.
In step 316, the run parameters of the data mining model are set at the branch office client through the parameter setting interface, and an executable task item is generated.
In step 318, the executable task item is submitted to the data mining platform.
In step 320, the data mining platform executes the task item to perform data mining and generates an execution log.
In step 322, the branch office user views the execution log on the data mining platform and obtains the data mining results.
It should be pointed out that steps 306 to 310 are optional.
Fig. 4 is a flowchart of an application example of the data mining method according to the present invention.
As shown in Fig. 4, step 402: deploy the data mining models. The template developers log in to the information management server and create a data mining project. Following the data mining process, each data mining project is divided into four stages: business understanding, data understanding and preparation, model building and evaluation, and model release. The template developers deploy data mining model templates that were developed with the data mining tool software in conformity with the data specification (such as a customer segmentation model, a business customer churn early warning model and a business customer contract management model) into the four stages of the project, and then create the data mining task items for each stage.
Step 404: prepare the data. After logging in to the information management server, the user executes the task items of the data understanding and preparation stage deployed by the template developers to generate the wide data table required for modeling.
Step 406: customize the model. The user downloads a model template to the local machine, changes the parameters in the model template as required, runs the template on the local machine to explore the modeling result, and then adjusts the parameters in the model template again according to the result. This is repeated until the modeling result meets the user's requirements. The user then saves the modified model template as a customized model, deploys the customized model into the original project, and creates a data mining task item that contains it. In this way the user conveniently builds a localized model from the template.
Step 408: apply the model. The user runs a task item on the information management server. According to the models in the task item, the information management server automatically calls the data mining program on the data mining server in turn, connects to the back-end data mining mart to carry out the mining process, and finally produces the data mining results.
The four steps above are described in more detail below.
The specific steps for deploying model templates are:
(11) For each data mining application, the template developers create a corresponding project area, which stores the models, documents and task items related to that application. A folder with the same name as the project is created on the server, and from then on the models, documents and task item log files related to the data mining application are all placed in this folder.
(12) For each data mining application, the template developers upload the data stream files of the data understanding and preparation stage and of the model building and evaluation stage to the corresponding project area, and set their attribute information. The connection attributes for the data mining mart and the default parameter values in the data stream files are also set by the template developers. The system saves the attribute information of a data stream file in the database and transfers the data stream file itself to the project folder on the server via FTP. The attribute information of a data stream file comprises: name, stage in the data mining process, version number, creator, creation time, purpose description and node parameter attribute information. The node parameter attribute information comprises: parameter name, parameter type, parameter description and the parameter's default optional values.
(13) The template developers combine the data stream files of a data mining application project in the order in which the project is executed, creating a task item that produces one particular result. Such a task item combines several data streams, which are executed in order within the task item, to produce that result, and this greatly improves the automation of the data mining process. For example, the data streams used for data preparation can be combined in order to obtain a task item dedicated to preparing the data required for mining and forming the wide data table. When a task item is saved, the system saves the node parameter attribute information of the data streams it contains, and their execution order, in the database.
(14) The template developers upload the documents produced during the data mining application, including requirements documents, data description documents, data stream file description documents, parameter description documents and data mining result documents, so that users can browse and download them.
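The attribute information listed in step (12) can be sketched as a pair of records that serialize to a single database-ready document. The field names follow the attributes named in the text, but the class names, the sample values and the choice of JSON as the storage format are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class NodeParam:
    name: str
    type: str
    description: str
    default_options: list  # the parameter's default optional values


@dataclass
class StreamFileInfo:
    name: str
    stage: str        # stage in the data mining process
    version: str
    creator: str
    created_at: str
    purpose: str
    node_params: list = field(default_factory=list)


info = StreamFileInfo(
    name="segmentation_model",
    stage="model building and evaluation",
    version="1.0",
    creator="template_developer",
    created_at="2008-01-15",
    purpose="Cluster customers into segments",
    node_params=[NodeParam("cluster_count", "integer", "Number of clusters", [3, 5, 8])],
)
# asdict() converts nested dataclasses, so the node parameters are included.
record = json.dumps(asdict(info), ensure_ascii=False)
print(record)
```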
The specific steps for preparing data are:
(21) The user opens the input window of the task item for the data understanding and preparation stage, sets the values of the parameters in the data streams, and submits the task to the server. When the task item is submitted, the client first creates a run script for each data stream that makes up the task item. Each run script contains the following information: the account and password for connecting to the data mining tool software, the path on the server of the data stream to be executed, the parameters contained in the data stream and their values, and the path where the log of the data stream's execution result is stored. The client then sends the run scripts of the data streams to the server one after another. After receiving all the run scripts, the server submits them in turn to the data mining tool software as command lines. The data mining tool software connects to the data mining mart, runs the data stream files in the task item in turn, and saves the run log files in the project folder so that the user can check the result status of the task item run.
(22) The user opens the task item run result log window and queries the status of the task item run. The log lists, for each data stream, its name, its parameter information, the time at which the run ended, and a flag showing whether the run succeeded. If one of the data streams fails, the system aborts the task item, that is, the data streams after the failed one in the task item are not run, which saves system resources.
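The run log described in step (22) can be sketched as one entry per data stream, from which an overall task status is derived. The entry fields mirror the four items the log lists. The class and function names, the status strings and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    stream: str        # data stream name
    params: dict       # parameter information
    finished_at: str   # time at which the run ended
    success: bool      # whether the run succeeded


def task_status(entries, expected_streams) -> str:
    """Summarize a task item run from its per-stream log entries."""
    for entry in entries:
        if not entry.success:
            return f"failed at {entry.stream}"
    if [e.stream for e in entries] == list(expected_streams):
        return "succeeded"
    return "incomplete"


log = [
    LogEntry("extract", {"month": "200801"}, "2008-02-01 02:10", True),
    LogEntry("join", {"month": "200801"}, "2008-02-01 02:25", False),
]
print(task_status(log, ["extract", "join", "score"]))
```

Because the task item is aborted at the first failure, the log of a failed run simply ends with the failed stream.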
The concrete steps for customizing a model are:
(31) The user downloads to the local machine the data stream templates for the model building and evaluation stage and for the model release stage (for example, the business customer segmentation model, the business customer churn early-warning model, and the business customer contract management model). The system transfers the data stream templates specified by the user to the user's local machine via FTP.
(32) On the local machine, the user opens the data stream template for the model building and evaluation stage, modifies the parameter values of the modeling node in the data stream, and runs the modeling data stream, which then produces a new model node. The user can inspect the modeling result, modify the parameters in the data stream template accordingly, rerun the data stream, and inspect the result again, repeating this until the modeling result meets the user's requirements.
(33) The user takes the model node produced in step (32) by the data stream template for the model building and evaluation stage and uses it to replace the model node in the data stream template for the model release stage downloaded from the server. The user saves the modified template as a data stream file for the model release stage and then uploads this customized data stream file back to the server as described in step (12).
(34) The user copies the task item of the model release stage on the server and names the copy. The data stream uploaded in step (33) then replaces the data stream with the same function in the copied task item, and the task item is saved. The system saves the information about the task item copied by the user to the database.
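Step (34) amounts to copying a task item and swapping in one customized data stream. A minimal sketch under stated assumptions follows: the `Stream` and `TaskItem` classes and the `function` label used to match streams "with the same function" are hypothetical, since the patent does not say how that match is determined.

```python
import copy
from dataclasses import dataclass
from typing import List


@dataclass
class Stream:
    function: str  # role of the stream in the task item (hypothetical label)
    path: str      # location of the stream file on the server


@dataclass
class TaskItem:
    name: str
    stage: str
    streams: List[Stream]


def customize_task_item(original: TaskItem, new_name: str,
                        custom_stream: Stream) -> TaskItem:
    """Copy a task item and replace the stream that has the same function."""
    clone = copy.deepcopy(original)  # the server-side original stays untouched
    clone.name = new_name
    for index, stream in enumerate(clone.streams):
        if stream.function == custom_stream.function:
            clone.streams[index] = custom_stream
            return clone
    raise ValueError(f"no stream with function {custom_stream.function!r}")
```

Working on a deep copy mirrors the text: the shared template task item remains available to other branch offices, while the renamed copy carries the localized model.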
The concrete steps for applying a model are:
(41) The user opens the input window of the task item for the model release stage, sets the values of the parameters in the data streams, and submits the task to the server.
(42) The user opens the task item run-result log window and queries the status of the task item's run.
Fig. 5 is a structural diagram of an embodiment of the data mining platform according to the present invention. As shown in Fig. 5, the data mining platform comprises an information storage device 51, an information management device 52, a data mining device 53, and a data mining mart 54. The information storage device 51 stores the attribute information of the data mining models and the task items. The information management device 52 provides the parameter configuration interface of a data mining model, receives parameter settings from branch office users, generates task items, and schedules the execution of the task items. The data mining mart 54 stores the mass data used in the data mining process. The data mining device 53 performs data mining on the mass data in the data mining mart 54 according to the task items.
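The division of responsibilities among the four components of Fig. 5 can be sketched as follows. This is an illustrative outline only: the class and method names are hypothetical, model attributes are reduced to a dictionary of default parameters, and the mining step is a placeholder that merely counts rows.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationStore:
    """Information storage device 51: model attributes and task items."""
    model_attributes: Dict[str, Dict[str, str]] = field(default_factory=dict)
    task_items: Dict[str, Dict[str, str]] = field(default_factory=dict)


@dataclass
class DataMart:
    """Data mining mart 54: the mass data used for mining."""
    tables: Dict[str, List[dict]] = field(default_factory=dict)


class MiningEngine:
    """Data mining device 53: runs a task item against the mart."""

    def __init__(self, mart: DataMart) -> None:
        self.mart = mart

    def run(self, task_item: Dict[str, str]) -> int:
        # Placeholder for real mining: count rows of the configured table.
        return len(self.mart.tables.get(task_item["table"], []))


class InformationManager:
    """Information management device 52: interface, task items, scheduling."""

    def __init__(self, store: InformationStore, engine: MiningEngine) -> None:
        self.store = store
        self.engine = engine

    def parameter_interface(self, model_name: str) -> List[str]:
        """List the parameters a branch office user may set for a model."""
        return list(self.store.model_attributes[model_name])

    def create_task_item(self, task_name: str, model_name: str,
                         settings: Dict[str, str]) -> Dict[str, str]:
        defaults = self.store.model_attributes[model_name]
        unknown = set(settings) - set(defaults)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        task_item = {**defaults, **settings}  # user settings override defaults
        self.store.task_items[task_name] = task_item
        return task_item

    def schedule(self, task_name: str) -> int:
        return self.engine.run(self.store.task_items[task_name])
```

Keeping the manager as the only component that talks to the branch office user matches the text, in which the storage device, the mart, and the mining device are reached only through task items.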
According to an embodiment of the data mining platform of the present invention, the information storage device is also used to store explanatory documents for the data mining models; the information management device provides an explanatory document to a branch office user in response to that user's request.
According to an embodiment of the data mining platform of the present invention, a task item is composed of a plurality of stream files arranged in logical order, and each stream file completes a flow operation of data preparation, model building, model evaluation, or release. The information management device is also used to provide the stream files to a branch office user in response to that user's request, to receive stream files in which designated parameters have been modified, and to generate the corresponding task item.
According to an embodiment of the data mining platform of the present invention, the information management device is also used to receive a task item run script from a branch office user and to schedule the execution of the task item automatically according to that run script.
According to an embodiment of the data mining platform of the present invention, the information storage device stores the attribute information of the data mining models by category.
According to an embodiment of the data mining platform of the present invention, the information storage device is also used to store user information, to manage users by level, and to set the corresponding access and operation permission attributes for users of different levels; the information management device is also used to control each user's access and operations according to that user's access and operation permission attributes.
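The level-based permission control in the last embodiment can be sketched as a simple lookup. The level names, the operations, and the assumption that a higher level includes every permission of a lower one are all hypothetical; the patent only states that users of different levels receive corresponding access and operation permission attributes.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical user levels, ordered from least to most privileged."""
    BRANCH_USER = 1
    MODEL_DEVELOPER = 2
    ADMINISTRATOR = 3


# Hypothetical mapping from an operation to the lowest level allowed to do it.
REQUIRED_LEVEL = {
    "run_task_item": Level.BRANCH_USER,
    "upload_stream_file": Level.MODEL_DEVELOPER,
    "manage_users": Level.ADMINISTRATOR,
}


def is_allowed(user_level: Level, operation: str) -> bool:
    """Return True if a user of the given level may perform the operation."""
    required = REQUIRED_LEVEL.get(operation)
    # Operations that are not registered are denied for every level.
    return required is not None and user_level >= required
```

Denying unregistered operations by default is a design choice of this sketch rather than something the text specifies.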
The data mining method, platform and system provided by the present invention are based on data mining mart and Internet technology. Data mining researchers and developers summarize their research results on data mining models into templates, and the templates are centrally managed and shared on the platform. Branch offices can connect to the platform over the network and apply the templates to build localized models, realizing a localized data mining process. The method, platform and system achieve centralized management of data mining model results and their efficient rollout to branch offices. They effectively address the drawbacks of data mining technology, which is highly specialized, demands highly skilled staff, has long research cycles, and requires heavy staffing. Even a branch office that is unfamiliar with the data mining tool software can call a data mining model template to build a localized model and thereby localize data mining.
By combining technologies and tools such as the Internet, the data mining mart, and data mining software, the present invention provides a method for centrally managing and promoting data mining models within a group that has multiple branch offices, and effectively resolves the current difficulty of rolling out data mining technology.
First, the data mining process is automated, which saves time in data preparation and in the development and rollout of mining models and improves the efficiency of data mining analysis. The four steps of the CRISP-DM data mining process (data preparation, model building, model evaluation, and model release) are encapsulated to form a data mining process black box, namely the task item, and the parameter interfaces of the steps are gathered in the task item, which simplifies the operations a user must perform to apply a data mining model. When applying a data mining model, the user no longer faces numerous, cumbersome operation steps, but a task item that wraps the operating flow behind a simple, friendly interface. The user only needs to set the parameter values in the task item to obtain a localized model and use its results. If necessary, the user can also adjust the atomic operation flows within the task item so that the mining process fits the user's own needs. This design streamlines the data mining process and makes mining models quick and convenient to apply, while keeping the operations in the process flexible and adjustable.
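The idea of a task item as a black box with one central parameter interface can be sketched in Python. Everything here is an illustrative assumption: the `TaskItem` class, the four toy stage functions, and the parameter names `values` and `min_value` do not come from the patent, and the "model" is just a mean.

```python
from typing import Callable, Dict, List, Tuple

Stage = Callable[[dict, dict], dict]  # (state, params) -> new state


class TaskItem:
    """Wraps an ordered list of stages behind a single parameter interface."""

    def __init__(self, stages: List[Tuple[str, Stage]],
                 defaults: Dict[str, object]) -> None:
        self.stages = stages
        self.defaults = defaults

    def run(self, **overrides: object) -> dict:
        unknown = set(overrides) - set(self.defaults)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        params = {**self.defaults, **overrides}
        state: dict = {}
        for _name, stage in self.stages:  # stages run in their logical order
            state = stage(state, params)
        return state


def prepare(state: dict, params: dict) -> dict:
    # Data preparation: keep values at or above the user-set minimum.
    kept = [v for v in params["values"] if v >= params["min_value"]]
    return {**state, "data": kept}


def build(state: dict, params: dict) -> dict:
    # Model building: a toy "model" that is just the mean of the data.
    return {**state, "model": sum(state["data"]) / len(state["data"])}


def evaluate(state: dict, params: dict) -> dict:
    # Model evaluation: count the records above the model value.
    return {**state, "above": sum(v > state["model"] for v in state["data"])}


def publish(state: dict, params: dict) -> dict:
    # Model release: mark the result as published.
    return {**state, "published": True}
```

A branch office user would then only call something like `task.run(min_value=3)`; the four stages stay hidden unless the user chooses to swap one of them, which corresponds to adjusting an atomic operation flow in the text.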
Secondly, the present invention combines unified deployment at headquarters with the branch offices' need for localized analysis. Data mining models are deployed centrally, and model templates are developed in a unified way by a specialist research team at headquarters. This both solves the problem that most branch offices lack sufficient mining and modeling capability and avoids the waste of resources that would result from each branch office developing the same mining models separately. The model templates and the localized models that branch offices generate from them are all stored on the server and managed centrally, which makes them easy to promote and reuse. At the same time, by centrally building a virtual data mining mart and a data mining runtime environment, the invention unifies the centralization of data with its application by branch offices and saves investment in software and hardware.
The description of the present invention is given for the purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to better explain the principles and practical application of the invention, and to enable those of ordinary skill in the art to understand the invention and to design various embodiments with various modifications suited to particular uses.

Claims (20)

1. A data mining method, characterized by comprising the steps of:
providing, on a data mining platform, a parameter configuration interface of a preset data mining model;
receiving a parameter setting from a branch office user through the parameter configuration interface and generating a task item;
executing the task item on the data mining platform to perform data mining.
2. The data mining method according to claim 1, characterized in that an explanatory document of the data mining model is provided on the data mining platform to assist the branch office user in setting parameters.
3. The data mining method according to claim 1, characterized in that the task item is composed of a plurality of stream files arranged in logical order, and the stream files are used to complete the flow operations of data preparation, model building, model evaluation, or release.
4. The data mining method according to claim 3, characterized in that the data mining platform also provides semi-open stream files, and the method further comprises the step of:
receiving a branch office user's modification of designated parameters of a stream file and generating the corresponding task item.
5. The data mining method according to claim 1, characterized by further comprising the step of:
the data mining platform automatically running the task item according to a task item run script provided by the branch office user.
6. The data mining method according to claim 1, characterized by further comprising the step of:
managing by category the data mining models preset on the data mining platform.
7. The data mining method according to claim 1, characterized by further comprising the step of:
the data mining platform managing branch office users by level and setting the access and operation permissions of branch office users of different levels.
8. A data mining platform, characterized by comprising:
an information storage device, used to store the attribute information of data mining models and task items;
an information management device, used to provide a parameter configuration interface of a data mining model, receive parameter settings from a branch office user, generate a task item, and schedule the execution of the task item;
a data mining mart, used to store the mass data applied in the data mining process;
a data mining device, used to perform data mining on the mass data in the data mining mart according to the task item.
9. The data mining platform according to claim 8, characterized in that the information storage device is also used to store an explanatory document of the data mining model, and the information management device provides the explanatory document to the branch office user in response to a request from the branch office user.
10. The data mining platform according to claim 8, characterized in that the task item is composed of a plurality of stream files arranged in logical order, and the stream files are used to complete the flow operations of data preparation, model building, model evaluation, or release.
11. The data mining platform according to claim 10, characterized in that the information management device is also used to provide the stream files to the branch office user in response to a request from the branch office user, to receive stream files in which designated parameters have been modified, and to generate the corresponding task item.
12. The data mining platform according to claim 8, characterized in that the information management device is also used to receive a task item run script from the branch office user and to schedule the execution of the task item automatically according to the task item run script.
13. The data mining platform according to claim 8, characterized in that the information storage device stores the attribute information of the data mining models by category.
14. The data mining platform according to claim 8, characterized in that the information storage device is also used to store user information, to manage users by level, and to set the corresponding access and operation permission attributes for users of different levels; and the information management device is also used to control a user's access and operations according to that user's access and operation permission attributes.
15. A data mining system, characterized by comprising:
a data mining platform, used to provide a parameter configuration interface of a preset data mining model, to receive parameter settings through the parameter configuration interface, to generate a task item, and to perform data mining on stored mass data according to the task item;
a branch office terminal, used to connect to the data mining platform, to receive a user's parameter settings, and to send the user's parameter settings to the data mining platform through the parameter configuration interface provided by the data mining platform.
16. The data mining system according to claim 15, characterized in that the data mining platform comprises:
an information storage device, used to store the attribute information of data mining models and task items;
an information management device, used to provide the parameter configuration interface of a data mining model, receive parameter settings from a branch office user, generate a task item, and schedule the execution of the task item;
a data mining mart, used to store the mass data applied in the data mining process;
a data mining device, used to perform data mining on the mass data in the data mining mart according to the task item.
17. The data mining system according to claim 16, characterized in that the information storage device is also used to store an explanatory document of the data mining model, and the information management device provides the explanatory document to the branch office user in response to a request from the branch office user.
18. The data mining system according to claim 16, characterized in that the task item is composed of a plurality of stream files arranged in logical order, and the stream files are used to complete the flow operations of data preparation, model building, model evaluation, or release.
19. The data mining system according to claim 18, characterized in that the information management device is also used to provide the stream files to the branch office user in response to a request from the branch office user, to receive stream files in which designated parameters have been modified, and to generate the corresponding task item.
20. The data mining platform according to claim 16, characterized in that the information management device is also used to receive a task item run script from the branch office user and to schedule the execution of the task item automatically according to the task item run script.
CNA2008101348990A 2008-08-06 2008-08-06 Method, platform and system for excavating data Pending CN101324901A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008101348990A CN101324901A (en) 2008-08-06 2008-08-06 Method, platform and system for excavating data

Publications (1)

Publication Number Publication Date
CN101324901A true CN101324901A (en) 2008-12-17

Family

ID=40188440

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008101348990A Pending CN101324901A (en) 2008-08-06 2008-08-06 Method, platform and system for excavating data

Country Status (1)

Country Link
CN (1) CN101324901A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521040A (en) * 2011-12-08 2012-06-27 北京亿赞普网络技术有限公司 Data mining method and system
CN104144142A (en) * 2013-05-07 2014-11-12 阿里巴巴集团控股有限公司 Web vulnerability discovery method and system
CN104144142B (en) * 2013-05-07 2018-05-08 阿里巴巴集团控股有限公司 A kind of Web bug excavation methods and system
CN104346376A (en) * 2013-07-31 2015-02-11 克拉玛依红有软件有限责任公司 Method and system for dynamically inserting data mining algorithm into data mining platform
CN104346376B (en) * 2013-07-31 2017-11-03 红有软件股份有限公司 Method and system of the data mining algorithm dynamic insertion to data mining platform
CN103853821B (en) * 2014-02-21 2017-02-22 河海大学 Method for constructing multiuser collaboration oriented data mining platform
CN105809311A (en) * 2014-12-30 2016-07-27 航天信息股份有限公司 Device and method for invoice information processing
CN104699777A (en) * 2015-03-10 2015-06-10 中国联合网络通信集团有限公司 Association method and system of management plane and service plane of big data analysis and mining
CN104699777B (en) * 2015-03-10 2019-06-11 中国联合网络通信集团有限公司 The correlating method and system of big data analysis excavation chain of command and service surface
CN105956049A (en) * 2016-04-26 2016-09-21 乐视控股(北京)有限公司 Data output control method and device
CN108228628A (en) * 2016-12-15 2018-06-29 亿度慧达教育科技(北京)有限公司 Wide table generating method and its device in a kind of structured query language database
CN108228628B (en) * 2016-12-15 2020-11-17 亿度慧达教育科技(北京)有限公司 Wide table generation method and device in structured query language database
WO2018113521A1 (en) * 2016-12-23 2018-06-28 Huawei Technologies Co., Ltd. Generating knowledge base to assist with the modeling of large datasets
CN108399323A (en) * 2018-03-01 2018-08-14 中国银行股份有限公司 A kind of parameter management system and parameter management method
CN111341454A (en) * 2018-12-19 2020-06-26 中国电信股份有限公司 Data mining method and device
CN109886719A (en) * 2018-12-20 2019-06-14 平安科技(深圳)有限公司 Data mining processing method, device and computer equipment based on grid
CN110633308A (en) * 2019-08-28 2019-12-31 北京浪潮数据技术有限公司 Data mining method, system and related device

Similar Documents

Publication Publication Date Title
CN101324901A (en) Method, platform and system for excavating data
CN106022007B (en) The cloud platform system and method learning big data and calculating is organized towards biology
CN105593835B (en) Multiple second level clouds are managed by main cloud service manager
US20210034336A1 (en) Executing a process-based software application in a first computing environment and a second computing environment
US7120896B2 (en) Integrated business process modeling environment and models created thereby
CN102810090B (en) Gateway data distribution engine
US20220391221A1 (en) Providing a different configuration of added functionality for each of the stages of predeployment, deployment, and post deployment using a layer of abstraction
Lipton et al. Tosca solves big problems in the cloud and beyond!
US20140109041A1 (en) Yunten's Web Application Methodology & Web Programming Language (YWAM & WPL)
CA2860470A1 (en) System and method for creating, deploying, integrating, and distributing nodes in a grid of distributed graph databases
US20110258345A1 (en) Method and apparatus for mobile data collection and management
CN109889381A (en) Automatic configuration management method and device based on fort machine
US8694601B2 (en) Method and apparatus for communicating during automated data processing
CN104268156B (en) Web site management system and its method
CN113760464A (en) Artificial intelligence model development platform based on deep learning
Volgyesi et al. Component-based development of networked embedded applications
CN114254606A (en) Microservice framework model
CN111104181A (en) Webpage data filling system for visually editing task flow
CN106371931A (en) Web framework-based high-performance geocomputation service system
D'Agostino et al. Lessons learned implementing a science gateway for hydro‐meteorological research
CN101976196B (en) Quality of service oriented code automatic code generating method
Anagnostopoulos et al. REFiLL: A lightweight programmable middleware platform for cost effective RFID application development
Wang et al. The modeling tool of SaaS software
CN113805850A (en) Artificial intelligence management system based on multiple deep learning and machine learning frameworks
CN114253546A (en) Code generation method and device, storage medium and electronic device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20081217