CN107808245A - Pipe network scheduling system based on an improved decision tree method - Google Patents

Pipe network scheduling system based on an improved decision tree method

Publication number
CN107808245A
Authority
CN
China
Prior art keywords
relational database
module
decision tree
attribute
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711015342.0A
Other languages
Chinese (zh)
Inventor
马湧
孙彦广
张云贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Automation Research and Design Institute of Metallurgical Industry
Original Assignee
Automation Research and Design Institute of Metallurgical Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Automation Research and Design Institute of Metallurgical Industry
Priority to CN201711015342.0A
Publication of CN107808245A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Educational Administration (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A pipe network scheduling system based on an improved decision tree method, belonging to the technical field of pipe network scheduling. The hardware comprises a relational database server, a real-time database server, an application server, and an engineer station. The relational database server is connected to the engineer station and the application server; the application server is connected not only to the relational database server but also to the real-time database and the engineer station, so that data is exchanged among the three. The application modules comprise a relational database, a data acquisition module, a scheduling rule result display module, and a decision tree rule base generation module. The scheduling rule result display module is deployed on the engineer station, the decision tree rule base module on the application server, the relational database on the relational database server, and the data acquisition module on the real-time database. The advantage is that a steam pipe network scheduling rule system is established in a scientific and reasonable way, which reduces pipe network fluctuation and ensures efficient and safe operation of the pipe network.

Description

Pipe network scheduling system based on an improved decision tree method
Technical field
The invention belongs to the technical field of pipe network scheduling, and in particular provides a pipe network scheduling system based on an improved decision tree method that makes steam pipe network scheduling rules programmatic, fast, and scientific.
Background
In large integrated iron and steel enterprises, the steam system is a complex object characterized by large dead time, large inertia, time-varying parameters, nonlinearity, and multivariable coupling. Faced with such complex operating conditions, managers still direct system operation largely on the basis of experience accumulated over years of production. It is then difficult to avoid steam venting and the degraded use of high-grade steam, which cause great waste, and the result is blind scheduling and inefficient pipe network operation. At the same time, steam consumption changes with output, season, and other factors, so optimized scheduling and management of the steam system is an effective measure for the enterprise to save energy, improve efficiency, and reduce environmental pollution. Establishing a scheduling rule base based on decision trees and combining it with the enterprise's existing scheduling rule base to optimize steam pipe network scheduling can greatly reduce the blindness and lag of manual scheduling, thereby raising the level of steam system scheduling.
A decision tree algorithm is a classification algorithm built on the decision tree data structure. The commonly used algorithm for constructing a decision tree is ID3. A decision tree is a flowchart-like tree structure in which each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class or a class distribution. To classify an unknown sample, the attribute values of the sample are tested one by one starting from the root, following the branch that matches each test result downward until a leaf node is reached; the class represented by that leaf node is the class of the sample. On this basis, a decision tree system of steam pipe network scheduling rules for iron and steel enterprises is established. Using the system entropy and decision attributes of decision tree theory, and combining the existing enterprise scheduling rule base, expert knowledge base, and fact base, an improved version of the greedy decision tree construction algorithm ID3 is used to build the scheduling rule decision tree system, so that the pipe network scheduling strategy is more reasonable and efficient and the fluctuation range of the pipe network is effectively reduced.
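The classification walk described above can be sketched as follows. This is only an illustration: the dictionary layout of the tree and the attribute and class names (`header_pressure`, `demand_trend`, `reduce_boiler_load`, and so on) are hypothetical and do not come from the patent.

```python
# Hypothetical tree layout (an assumption for illustration):
#   internal node: {"attribute": <name>, "branches": {<value>: <subtree>}}
#   leaf node:     {"label": <class name>}

def classify(tree, sample):
    """Walk from the root to a leaf, following the branch that matches
    the sample's value for each tested attribute."""
    node = tree
    while "label" not in node:
        value = sample[node["attribute"]]  # test on one attribute
        node = node["branches"][value]     # follow the matching branch
    return node["label"]                   # class represented by the leaf


# Invented example tree with made-up scheduling attributes and actions.
example_tree = {
    "attribute": "header_pressure",
    "branches": {
        "high": {"label": "reduce_boiler_load"},
        "normal": {"label": "hold"},
        "low": {
            "attribute": "demand_trend",
            "branches": {
                "rising": {"label": "increase_boiler_load"},
                "steady": {"label": "hold"},
            },
        },
    },
}

result = classify(example_tree, {"header_pressure": "low", "demand_trend": "rising"})
```

A sample whose attribute value has no matching branch raises `KeyError` in this sketch; a production system would need a policy for unseen values.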
Summary of the invention
The object of the invention is to provide a pipe network scheduling system based on an improved decision tree method. By means of an improved version of the greedy decision tree construction algorithm ID3, steam pipe network scheduling rules are made programmatic and scientific, so that the pipe network operates more scientifically and reasonably. The tree is built in a top-down, recursive, divide-and-conquer manner, starting from the training sample set and its associated attributes. As the tree grows deeper, the training sample set is recursively partitioned into several smaller subsets. The path from the root to each node corresponds to one association rule, so the whole decision tree corresponds to a complete set of association rules.
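The correspondence between paths and rules can be made concrete with a short sketch that enumerates every root-to-leaf path of a tree as an "if conditions then class" rule. The tree layout and the names `pressure`, `demand`, `vent`, `add_supply`, and `hold` are assumptions made for illustration only.

```python
def extract_rules(tree, conditions=()):
    """Return one (conditions, label) pair per root-to-leaf path.

    Assumed layout: internal nodes are
    {"attribute": name, "branches": {value: subtree}} and leaves are
    {"label": class_name}.
    """
    if "label" in tree:
        return [(list(conditions), tree["label"])]
    rules = []
    for value, subtree in tree["branches"].items():
        # Extend the path with the test taken on this branch.
        step = (tree["attribute"], value)
        rules.extend(extract_rules(subtree, conditions + (step,)))
    return rules


# Invented two-level tree used only to exercise the function.
tree = {
    "attribute": "pressure",
    "branches": {
        "high": {"label": "vent"},
        "low": {
            "attribute": "demand",
            "branches": {
                "rising": {"label": "add_supply"},
                "steady": {"label": "hold"},
            },
        },
    },
}

rules = extract_rules(tree)
```

Each returned pair reads as a conjunction of attribute tests followed by the resulting class, which is one plausible form for entries in a rule base.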
The hardware of the invention comprises a relational database server, a real-time database server, an application server, and an engineer station. The relational database server is connected to the engineer station and the application server; the application server is connected not only to the relational database server but also to the real-time database and the engineer station, so that data is exchanged among the three. The application modules comprise a relational database, a data acquisition module, a scheduling rule result display module, and a decision tree rule base generation module. The scheduling rule result display module is deployed on the engineer station, the decision tree rule base module on the application server, the relational database on the relational database server, and the data acquisition module on the real-time database.
The relational database is the data communication medium between the display module and the decision tree rule base module. The decision tree rule base module writes the generated decision rules into the relational database, and the display module then reads them from the relational database and displays them.
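As a rough illustration of this write-then-read exchange, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The patent does not name a database product or schema, so the table name `decision_rules`, its columns, and the rule strings are all hypothetical.

```python
import sqlite3


def write_rules(conn, rules):
    """Rule base module side: persist generated rules (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS decision_rules ("
        "id INTEGER PRIMARY KEY, conditions TEXT NOT NULL, action TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO decision_rules (conditions, action) VALUES (?, ?)", rules
    )
    conn.commit()


def read_rules(conn):
    """Display module side: read the stored rules back in insertion order."""
    cursor = conn.execute(
        "SELECT conditions, action FROM decision_rules ORDER BY id"
    )
    return cursor.fetchall()


# An in-memory database stands in for the relational database server.
conn = sqlite3.connect(":memory:")
write_rules(conn, [
    ("pressure=high", "vent"),
    ("pressure=low AND demand=rising", "add_supply"),
])
stored = read_rules(conn)
```

Because the two modules only share the table, neither needs to know where the other is deployed, which matches the role the text assigns to the relational database.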
Relational database: stores the scheduling records, the decision tree rule base, and the data to be displayed.
Data acquisition module: consists of the real-time database, the field acquisition instruments, and the transmission network; the field acquisition instruments transmit information to the real-time database in real time.
Scheduling rule result display module: the data interface part, which provides the data input function for the decision tree algorithm, including reading data files.
Decision tree rule base module: its functions include:
1. The tree starts as a single node representing the training sample set.
2. If the training samples at a node all belong to the same class, the node becomes a leaf and is labeled with that class.
3. Otherwise, the information gain measure is used as the splitting criterion, and the attribute that best classifies the samples is selected as the split attribute of the node.
4. A branch is created for each known value of the split attribute, and the sample set is partitioned accordingly.
5. The same process is applied recursively to form a decision tree on each partition. Once an attribute has appeared at a node, it need not be considered again in any of that node's descendants.
6. The recursive partitioning stops when any one of the following conditions is met:
a) all samples at the given node belong to the same class;
b) no attributes remain with which to partition the samples further;
c) a branch contains no samples.
Entropy before the sample set is partitioned:
Let $S$ be a set of $s$ data samples whose class attribute $C$ takes $m$ distinct discrete values $c_1, c_2, \ldots, c_m$ (that is, $S$ will ultimately be divided into $m$ classes), and let $s_1, s_2, \ldots, s_m$ be the numbers of samples whose class values are $c_1, c_2, \ldots, c_m$ respectively. Before partitioning, the total entropy (expected information) of $S$ is:
$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i$$
where $p_i$ is the probability that an arbitrary sample of $S$ belongs to class $c_i$, estimated by $s_i/s$. The logarithm is taken to base 2 because information is encoded in binary. It is easy to see that the total entropy of $S$ before partitioning is the weighted average of the information content of the samples belonging to the different classes.
Entropy after the sample set is partitioned:
If attribute $A$ has $n$ distinct discrete values $\{a_1, a_2, \ldots, a_n\}$, $A$ can be used to partition $S$ into $n$ subsets $\{S_1, S_2, \ldots, S_n\}$, where every sample in subset $S_j$ has the value $a_j$ for attribute $A$.
Let $s_j$ be the total number of samples in subset $S_j$, and let $s_{1j}, s_{2j}, \ldots, s_{mj}$ be the numbers of samples in $S_j$ whose class values are $c_1, c_2, \ldots, c_m$. The entropy of subset $S_j$ is then:
$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}$$
where $p_{ij} = s_{ij}/s_j$ is the probability that a sample in $S_j$ belongs to class $c_i$.
After $S$ is partitioned into $n$ subsets by attribute $A$, the total entropy of $S$ is the weighted average of the entropies of the $n$ subsets:
$$E(A) = \sum_{j=1}^{n} \frac{s_j}{s}\, I(s_{1j}, s_{2j}, \ldots, s_{mj})$$
where $s_j/s$ is the weight of subset $S_j$, that is, the proportion of the samples of $S$ that fall into $S_j$.
Information gain:
Information gain is the amount of information the system obtains through classification, measured by the reduction in system entropy. The information gain of partitioning data set $S$ by attribute $A$ is defined as the difference between the entropy of $S$ before and after the partition:
$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A)$$
The algorithm computes the information gain of every attribute and selects the attribute with the highest information gain as the decision attribute of the given data set $S$. It creates a node labeled with that attribute, creates a branch for each value of the attribute, and partitions the samples accordingly.
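A minimal sketch of this selection step is shown below. The data layout (a list of dictionaries plus a list of labels) and the attribute names `pressure` and `shift` are invented for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    total = len(labels)
    return -sum(
        (c / total) * log2(c / total) for c in Counter(labels).values()
    )


def information_gain(rows, labels, attribute):
    """Entropy before the split minus weighted entropy after the split."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attribute]].append(label)
    after = sum(
        len(g) / len(labels) * entropy(g) for g in groups.values()
    )
    return entropy(labels) - after


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


rows = [
    {"pressure": "high", "shift": "day"},
    {"pressure": "high", "shift": "night"},
    {"pressure": "low", "shift": "day"},
    {"pressure": "low", "shift": "night"},
]
labels = ["vent", "vent", "hold", "hold"]
chosen = best_attribute(rows, labels, ["shift", "pressure"])
```

Here `pressure` determines the class completely while `shift` carries no information, so `pressure` is selected. When several attributes tie, `max` returns the first one in the list.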
Improved ID3 algorithm
ID3 is the typical decision tree algorithm based on information gain; it learns a decision tree by constructing it in a top-down, recursive, divide-and-conquer manner. Specifically, all candidate attributes are tested, the attribute with the largest information gain is selected as the best split attribute and used as the root node of the decision tree, and branches are constructed from the different values of that attribute. The same method is then applied repeatedly to the subset of each branch, constructing the remaining branches of the tree in turn, until every subset contains only training samples of a single class. The resulting ID3 decision tree model can then classify and predict new sets of data samples. All attributes are discrete, or are numeric attributes that have been discretized beforehand by preprocessing.
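To speed up ID3, the traditional algorithm is improved. At each node, traditional ID3 computes the information gain of every attribute and then selects the attribute with the largest information gain for that node. Because the information gain involves logarithms, a library function must be called during the computation, which increases the computation time. The present method adopts a new attribute selection criterion that greatly reduces the amount of computation and speeds up the algorithm. In ID3, assume that the positive example set PE and the negative example set NE in the vector space have sizes $p$ and $q$ respectively. The information entropy $I$ can then be written as:
$$I(p, q) = -\frac{p}{p+q}\log_2\frac{p}{p+q} - \frac{q}{p+q}\log_2\frac{q}{p+q}$$
If an attribute $A$ partitions the samples into $n$ subsets, the $i$-th of which contains $p_i$ positive and $q_i$ negative examples, the weighted information entropy is:
$$E(A) = \sum_{i=1}^{n}\frac{p_i+q_i}{p+q}\, I(p_i, q_i)$$
Substituting the expression for $I$:
$$E(A) = \frac{1}{(p+q)\ln 2}\sum_{i=1}^{n}\left[-p_i\ln\frac{p_i}{p_i+q_i} - q_i\ln\frac{q_i}{p_i+q_i}\right]$$
By the principle of equivalent infinitesimals, if $x$ is very small then $\ln(1+x) \approx x$, so that $\ln\frac{p_i}{p_i+q_i} = \ln\left(1 - \frac{q_i}{p_i+q_i}\right) \approx -\frac{q_i}{p_i+q_i}$ and $\ln\frac{q_i}{p_i+q_i} \approx -\frac{p_i}{p_i+q_i}$.
After substituting:
$$E(A) \approx \frac{1}{(p+q)\ln 2}\sum_{i=1}^{n}\frac{2\,p_i q_i}{p_i+q_i}$$
Computing the weighted average entropy with this logarithm-free expression greatly improves data-processing capability.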
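One way to read this simplification: applying $\ln(1+x) \approx x$ to the two-class entropy leaves, for each subset with $p_i$ positive and $q_i$ negative samples, a term proportional to $p_i q_i/(p_i+q_i)$, with a factor that is the same for every attribute. The sketch below compares such a logarithm-free score with the exact weighted entropy. It is an interpretation under that assumption, not the patent's verified formula, and the function names are hypothetical.

```python
from math import log2


def exact_weighted_entropy(counts):
    """Exact weighted entropy for subsets given as (p_i, q_i) count pairs."""
    total = sum(p + q for p, q in counts)

    def h(p, q):
        n = p + q
        # Skip empty classes so log2(0) is never evaluated.
        return -sum(k / n * log2(k / n) for k in (p, q) if k)

    return sum((p + q) / total * h(p, q) for p, q in counts if p + q)


def logfree_score(counts):
    """Assumed approximation: sum of p_i * q_i / (p_i + q_i), no logarithms.

    Lower is better, as with weighted entropy. Constant factors shared by
    all attributes are dropped because they do not affect the ranking.
    """
    return sum(p * q / (p + q) for p, q in counts if p + q)


pure_split = [(4, 0), (0, 4)]   # attribute separates the classes perfectly
mixed_split = [(2, 2), (2, 2)]  # attribute carries no information

scores = (logfree_score(pure_split), logfree_score(mixed_split))
exact = (exact_weighted_entropy(pure_split), exact_weighted_entropy(mixed_split))
```

On these two splits both measures prefer the pure split. Because the score is an approximation, it may rank closely matched attributes differently from the exact entropy.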
Let T denote the current sample set and T_attributelist the current candidate attribute set, in which all attributes are discrete or are numeric attributes discretized beforehand by preprocessing. The improved ID3 algorithm GID3formtree(T, T_attributelist) then proceeds as follows:
Step 1: Create the root node N.
Step 2: If all samples in T belong to the same class C, return N as a leaf node labeled with class C.
Step 3: If T_attributelist is empty, return N as a leaf node labeled with the most frequent class in T.
Step 4: For each attribute in T_attributelist, compute its information gain, gain.
Step 5: Set N's test attribute test_attribute to the attribute in T_attributelist with the highest gain value.
Step 6: For each value of test_attribute, grow a new leaf node from node N. If the sample set corresponding to the new leaf node is empty, do not split it further and label it with the most frequent class in T; otherwise, execute GID3formtree(T, T_attributelist) on that leaf node to continue splitting it.
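The six steps can be sketched as a recursive function. This is a minimal illustration, not the patented implementation: it uses the standard information gain for Step 4, it assumes the recursive call receives the branch's sample subset and the attributes not yet used, and the `domains` argument (the known values of each attribute) plus all data names are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain(rows, labels, attr):
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    after = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - after


def gid3_form_tree(rows, labels, attributes, domains):
    """Build a tree from a non-empty sample set (Step 1 creates the node)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:                 # Step 2: single class
        return {"label": labels[0]}
    if not attributes:                        # Step 3: no attributes left
        return {"label": majority}
    # Steps 4-5: compute gains and pick the attribute with the highest one.
    test_attribute = max(attributes, key=lambda a: gain(rows, labels, a))
    remaining = [a for a in attributes if a != test_attribute]
    node = {"attribute": test_attribute, "branches": {}}
    for value in domains[test_attribute]:     # Step 6: one branch per value
        subset = [(r, l) for r, l in zip(rows, labels)
                  if r[test_attribute] == value]
        if not subset:
            # Empty branch: leaf labeled with the most frequent class in T.
            node["branches"][value] = {"label": majority}
        else:
            sub_rows, sub_labels = zip(*subset)
            node["branches"][value] = gid3_form_tree(
                list(sub_rows), list(sub_labels), remaining, domains)
    return node


rows = [
    {"pressure": "high", "shift": "day"},
    {"pressure": "high", "shift": "night"},
    {"pressure": "low", "shift": "day"},
    {"pressure": "low", "shift": "night"},
    {"pressure": "low", "shift": "day"},
]
labels = ["vent", "vent", "hold", "hold", "hold"]
domains = {"pressure": ["high", "normal", "low"], "shift": ["day", "night"]}
tree = gid3_form_tree(rows, labels, ["pressure", "shift"], domains)
```

The `normal` pressure value has no training samples, so its branch becomes a leaf labeled with the majority class, which exercises the empty-branch case of Step 6.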
The advantages of the invention are as follows: a steam pipe network scheduling rule system is established on the basis of the improved decision tree method, making pipe network scheduling programmatic, fast, and scientific, ensuring safe and efficient pipe network operation, improving operating efficiency, and contributing to industrial energy saving and emission reduction.
Brief description of the drawings
Fig. 1 is a diagram of the relationships between the modules of the system of the invention.
Fig. 2 is a flow chart of the decision tree rule solving process.
Embodiment
Fig. 1 is a diagram of the relationships between the modules of the system of the invention. The system comprises a relational database, a data acquisition module, a scheduling rule result display module, and a decision tree rule base generation module. The scheduling rule result display module is deployed on the engineer station, the decision tree rule base generation module on the application server, the relational database on the relational database server, and the data acquisition module on the real-time database. The relational database is the data communication medium between the scheduling rule result display module and the decision tree rule base generation module. The decision tree rule base generation module writes the generated rule base into the relational database, and the display module then reads it from the relational database and displays it.
Fig. 2 is a flow chart of the decision tree rule solving process. First, the root node N is created. If all samples in T belong to the same class C, N is returned as a leaf node labeled with class C. Next, if T_attributelist is empty, N is returned as a leaf node labeled with the most frequent class in T. Then, for each attribute in T_attributelist, the information gain, gain, is computed, and N's test attribute test_attribute is set to the attribute in T_attributelist with the highest gain value. Finally, for each value of test_attribute, a new leaf node is grown from node N. If the sample set corresponding to the new leaf node is empty, the leaf node is not split and is labeled with the most frequent class in T; otherwise GID3formtree(T, T_attributelist) is executed on that leaf node to continue splitting it.

Claims (3)

  1. A pipe network scheduling system based on an improved decision tree method, characterized in that it comprises a relational database server, a real-time database server, an application server, and an engineer station; the relational database server is connected to the engineer station and the application server; the application server, in addition to being connected to the relational database server, is also connected to the real-time database and the engineer station, so that data is exchanged among the three; the application modules comprise a relational database, a data acquisition module, a scheduling rule result display module, and a decision tree rule base generation module; the scheduling rule result display module is deployed on the engineer station, the decision tree rule base module on the application server, the relational database on the relational database server, and the data acquisition module on the real-time database;
    the relational database is the data communication medium between the display module and the decision tree rule base module; the decision tree rule base module writes the generated decision rules into the relational database, and the display module then reads them from the relational database and displays them.
  2. The system according to claim 1, characterized in that:
    the relational database stores the scheduling records, the decision tree rule base, and the data to be displayed;
    the data acquisition module consists of the real-time database, the field acquisition instruments, and the transmission network; the field acquisition instruments transmit information to the real-time database in real time;
    the scheduling rule result display module is the data interface part, which provides the data input function for the decision tree algorithm, including reading data files.
  3. The system according to claim 1, characterized in that the functions of the decision tree rule base are:
    the tree starts as a single node representing the training sample set;
    if the training samples at a node all belong to the same class, the node becomes a leaf and is labeled with that class;
    otherwise, the information gain measure is used as the splitting criterion, and the attribute that best classifies the samples is selected as the split attribute of the node;
    a branch is created for each known value of the split attribute, and the sample set is partitioned accordingly;
    the same process is applied recursively to form a decision tree on each partition; once an attribute has appeared at a node, it need not be considered again in any of that node's descendants;
    the recursive partitioning stops when any one of the following conditions is met:
    a) all samples at the given node belong to the same class;
    b) no attributes remain with which to partition the samples further;
    c) a branch contains no samples.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711015342.0A CN107808245A (en) 2017-10-25 2017-10-25 Pipe network scheduling system based on an improved decision tree method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711015342.0A CN107808245A (en) 2017-10-25 2017-10-25 Pipe network scheduling system based on an improved decision tree method

Publications (1)

Publication Number Publication Date
CN107808245A (en) 2018-03-16

Family

ID=61582258

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711015342.0A Pending CN107808245A (en) 2017-10-25 2017-10-25 Pipe network scheduling system based on an improved decision tree method

Country Status (1)

Country Link
CN (1) CN107808245A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103425743A (en) * 2013-07-17 2013-12-04 上海金自天正信息技术有限公司 Steam pipe network prediction system based on Bayesian neural network algorithm
CN105550065A (en) * 2015-12-11 2016-05-04 广州华多网络科技有限公司 Database server communication management method and device
CN106611283A (en) * 2016-06-16 2017-05-03 四川用联信息技术有限公司 Manufacturing material purchasing analysis method based on decision tree algorithm
CN106651199A (en) * 2016-12-29 2017-05-10 冶金自动化研究设计院 Steam pipe network scheduling rule system based on decision-making tree method

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710979A (en) * 2018-03-31 2018-10-26 西安电子科技大学 A kind of Internet of Things yard craft dispatching method based on decision tree
CN108710979B (en) * 2018-03-31 2022-02-18 西安电子科技大学 Internet of things port ship scheduling method based on decision tree
CN110737731A (en) * 2019-10-25 2020-01-31 徐州工程学院 accumulation fund user data refinement analysis system and method based on decision tree
CN110737731B (en) * 2019-10-25 2023-12-29 徐州工程学院 Decision tree-based public accumulation user data refinement analysis system and method
WO2023246146A1 (en) * 2022-06-23 2023-12-28 上海淇玥信息技术有限公司 Target security recognition method and apparatus based on optimization rule decision tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180316