CN106997488A - Action knowledge extraction method combining a Markov decision process - Google Patents
- Publication number
- CN106997488A CN106997488A CN201710173631.7A CN201710173631A CN106997488A CN 106997488 A CN106997488 A CN 106997488A CN 201710173631 A CN201710173631 A CN 201710173631A CN 106997488 A CN106997488 A CN 106997488A
- Authority
- CN
- China
- Prior art keywords
- state
- action
- attribute
- strategy
- value function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses an action knowledge extraction method combining a Markov decision process, comprising: training a random forest model H; defining the action knowledge extraction problem (AKE): for the random forest model H, the attributes are partitioned, attribute changes and actions are defined, and on this basis the AKE problem is defined; solving the AKE optimization problem with a Markov decision process: for any input data, a Markov decision process (MDP) and a policy are defined, the policy is updated by policy iteration, and an optimal policy is finally obtained. Because an action as defined in the action knowledge extraction can change multiple attribute values of a state, the method yields accurate and feasible suggestions in practical applications.
Description
Technical field
The invention belongs to the field of machine learning, and in particular relates to an action knowledge extraction method combining a Markov decision process.
Background technology
In machine learning, many models such as support vector machines, random forests and deep neural networks have been proposed and have achieved great success, but in many practical applications the actionability of these models is poor.
Reinforcement learning is a special class of machine learning that learns a decision policy through autonomous interaction with the environment, so as to maximize the long-term cumulative reward received under the policy. Reinforcement learning differs from other machine learning methods in that training data need not be supplied in advance, but is generated through interaction with the environment. In the field of management science, the knowledge extraction problem is to analyze user behavior with statistical methods and discover specific rules; in the field of machine learning, the knowledge extraction problem mainly uses model post-analysis techniques.
The major defect of these two classes of methods is that they build a model on the whole data set to extract knowledge, rather than extracting useful knowledge for an individual record. Consequently, in many applications the actionability of these models is poor, because they only modify a single attribute value of a state; this leads to errors in practice and prevents accurate, feasible suggestions from being given.
Summary of the invention
The technical problem solved by the invention is to provide an action knowledge extraction method combining a Markov decision process, in order to address the prior-art problems that a model is built from the whole data set to extract knowledge and that only one attribute value of a state is changed, which leads to large errors in the results. The invention realizes data-driven action knowledge extraction through the Markov decision process of reinforcement learning, converting the predictions of a machine learning model into action knowledge.
The technical solution for realizing the object of the invention is:
An action knowledge extraction method combining a Markov decision process comprises the following steps:
Step 1: train a random forest model H.
Step 2: define the action knowledge extraction problem (AKE): for the random forest model H, partition the attributes, define attribute changes and actions, and on this basis define the AKE problem.
Step 3: solve the AKE optimization problem with a Markov decision process: for any input data, define a Markov decision process (MDP) and a policy, update the policy by policy iteration, and finally obtain an optimal policy.
Compared with the prior art, the invention has the following remarkable advantages:
(1) The invention proposes a method combining a classical reinforcement learning technique, the Markov decision process, providing a new approach for the field of action knowledge extraction.
(2) The proposed action knowledge extraction technique efficiently finds an optimal policy in finite time and improves accuracy. The invention is based on the random forest model, one of the best existing classification models, which is widely used in practical problems; the preprocessing performed by the random forest model orders and categorizes the data, reducing the time needed for the iteration in the subsequent Markov decision process to find an optimal policy.
(3) An action as defined by the action knowledge extraction of the invention can change multiple attribute values of a state, so accurate and feasible suggestions can be given in practical applications.
(4) Because every state in the Markov decision process is fully observed at each step, the accuracy of the iterative search for an optimal policy is guaranteed. By combining the Markov decision process, the invention does not need to build a model from the whole data set; it can extract available action knowledge for an individual record, and can obtain a better policy by autonomously understanding the environment through interaction with it.
The present invention is described in further detail below in conjunction with the accompanying drawings.
Brief description of the drawings
Fig. 1 is an overall flow chart of the method of the invention.
Embodiment
The action knowledge extraction method combining a Markov decision process of the present invention combines machine learning and reinforcement learning, and extracts action knowledge using a Markov decision process. The specific steps are as follows:
Step 1: train a random forest model H.
Given a training data set, build a random forest model H. Define the training data set as {X, Y}, where X is the set of input data vectors and Y is the set of output class labels. The random forest model H is built by random sampling and node splitting; its prediction function is

\( \hat{y} = H(\vec{x}) = \arg\max_{c} \, p(y = c \mid \vec{x}), \qquad p(y = c \mid \vec{x}) = \frac{1}{\sum_{d=1}^{D} w_d} \sum_{d=1}^{D} w_d \, I\big(h_d(\vec{x}) = c\big) \)

where \(\vec{x}\) is an input vector, y ∈ Y is the class predicted by the random forest model H for input \(\vec{x}\), c is the expected target class, d indexes the d-th decision tree, D is the total number of decision trees in the random forest, w_d is the weight of the d-th decision tree, \(h_d(\vec{x})\) is the output of the d-th decision tree for input \(\vec{x}\), I(·) is the indicator function, and \(p(y = c \mid \vec{x})\) is the probability that the predicted class for input data vector \(\vec{x}\) is c.
Step 2: define the action knowledge extraction problem (AKE): for the random forest model H, partition the attributes, define attribute changes and actions, and on this basis define the AKE problem.
2.1 Partition the attributes: given a random forest model H, each attribute x_i (i = 1, ..., M) is divided into a number M of intervals.
1) If attribute x_i is categorical with n categories, it is naturally divided into n intervals, and M = n.
2) If attribute x_i is numeric and a branch node on some decision tree in the random forest model H tests x_i > b, then b is a cut-point of attribute x_i. If attribute x_i has n cut-points over all decision trees, it is divided into n + 1 intervals, and M = n + 1.
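A hedged sketch of case 2): collecting the cut-points b from the `x_i > b` branch tests and turning n distinct cut-points into n + 1 intervals. The helper names are illustrative, not from the patent:

```python
import bisect

def intervals_from_cutpoints(cutpoints):
    """n distinct cut-points of a numeric attribute define n + 1 intervals."""
    cuts = sorted(set(cutpoints))
    bounds = [float("-inf")] + cuts + [float("inf")]
    return list(zip(bounds[:-1], bounds[1:]))

def interval_index(cutpoints, value):
    """Index of the interval containing `value`; a split x_i > b sends
    values <= b to the interval left of b, hence bisect_left."""
    return bisect.bisect_left(sorted(set(cutpoints)), value)

cuts = [3.0, 7.0, 3.0]                      # duplicate thresholds collapse
print(len(intervals_from_cutpoints(cuts)))  # 3 intervals from 2 cut-points
print(interval_index(cuts, 5.0))            # 1: value lies in (3.0, 7.0]
```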
2.2 Define attribute changes: given a random forest model H, an attribute change τ is defined as a triple τ = (x_i, p, q), where p and q are two partition intervals of attribute x_i.
An attribute change τ is executable on a given input vector \(\vec{x}\) if and only if the i-th attribute x_i of \(\vec{x}\) lies in interval p; the attribute change τ moves attribute x_i of \(\vec{x}\) from interval p to interval q.
2.3 Define actions:
An action a is defined as a set of attribute changes, i.e. a = {τ_1, ..., τ_|a|}; each action a has an immediate reward R(a).
Here |a| denotes the number of attribute changes in action a, with |a| ≥ 1, i.e., an action a contains at least one attribute change τ.
An action a is executable on an input vector \(\vec{x}\) if and only if all of its attribute changes τ are executable on \(\vec{x}\).
2.4 Define the action knowledge extraction problem (AKE) as:

\( \max_{A_s \subseteq A} F(A_s) = \sum_{a_i \in A_s} R(a_i) \quad \text{subject to} \quad p(y = c \mid x^*) > z \)

where A is the set of executable actions, A_s is the optimal action sequence to be found, a_i is any action in A_s, R(a_i) is the immediate reward of action a_i, F(A_s) is the total reward obtained by executing the action sequence A_s, y is the class predicted by the random forest model H for an input vector, z is a constant threshold, and x^* is the vector obtained from the initial input vector \(\vec{x}\) after performing all actions in the optimal sequence A_s.
The AKE problem seeks an action sequence that turns the input vector into a target vector with the expected predicted class while maximizing the reward sum of the sequence; it is therefore an optimization problem, referred to as the AKE optimization problem. In the action definition of the AKE problem, an action contains at least one attribute change, so it can change multiple attribute values of a state; in practical applications this yields accurate and feasible suggestions.
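The executability test and the objective F(A_s) can be sketched as follows. `interval_of` and `representative` are hypothetical helpers introduced only for this illustration:

```python
def executable(action, x, interval_of):
    """Action a = {(i, p, q), ...} is executable on x iff every attribute
    change finds its attribute currently in the source interval p."""
    return all(interval_of(x, i) == p for (i, p, q) in action)

def apply_action(action, x, representative):
    """Move each changed attribute to a representative value of its
    target interval q (hypothetical helper, for illustration only)."""
    x = list(x)
    for (i, p, q) in action:
        x[i] = representative(i, q)
    return x

def total_reward(action_sequence, reward):
    """F(A_s) = sum of immediate rewards R(a_i) over the sequence."""
    return sum(reward(a) for a in action_sequence)

# Toy setting: one numeric attribute, interval 0 = (-inf, 3], interval 1 = (3, inf).
interval_of = lambda x, i: 0 if x[i] <= 3 else 1
representative = lambda i, q: 2.0 if q == 0 else 5.0
a = frozenset({(0, 0, 1)})                       # move attribute 0 from interval 0 to 1
print(executable(a, [1.0], interval_of))         # True
print(apply_action(a, [1.0], representative))    # [5.0]
print(total_reward([a], lambda act: -1.0))       # -1.0 (cost of the single action)
```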
Step 3: solve the AKE optimization problem with a Markov decision process: for any input data, define a Markov decision process (MDP) and a policy, update the policy by policy iteration, and finally obtain an optimal policy.
3.1 Define the Markov decision process Π_MDP = {S, A, T, R}.
This definition procedure is prior art. S denotes the state space, with states denoted s; A denotes the action space, with actions denoted a; T: S × A × S → [0, 1] is the state transition function, giving the probability of reaching another state after performing an action in a given state; R: S × A → ℝ is the reward function, giving the immediate reward provided by the environment when a state transition occurs. Starting from state s, an action a ∈ A(s) is taken, the reward R(s, a) fed back by the environment is received, and the process moves to the next state s′ ∈ S with probability T(s, a, s′), where A(s) denotes the set of actions that can be taken in state s.
The Markov decision process is a loop that iterates until the termination condition is met; on termination, the optimal policy sequence B is output.
3.2 Define the policy:
A policy π is a mapping from states to actions, π: S × A → [0, 1]. The goal is to find an optimal policy π* with the largest cumulative reward R^π:

\( \pi^* = \arg\max_{\pi} R^{\pi}, \qquad R^{\pi} = E_\pi\!\left[\sum_{t=0}^{\infty} \gamma^t r_t\right] \)

where R^π is the cumulative reward of the actions executed at each time t under policy π, γ^t is the t-th power of the discount factor γ, E_π[·] is the expectation under policy π, and r_t is the immediate reward of the action executed at time t.
3.3 Define the value function:
The reward function is an instantaneous evaluation of a state (action), whereas the value function considers the quality of a state in the long run; the state value function V(s) is used here.
Given a policy π, the state value function is defined as:

\( V^{\pi}(s) = E_\pi\!\left[\sum_{t=0}^{\infty} \gamma^t r_t \,\middle|\, s_0 = s\right] \)

Based on the optimal policy π*, the optimal state value function V*(s) can be defined as:

\( V^{*}(s) = \max_{\pi} V^{\pi}(s) \)

where s_0 denotes the initial state, s_0 = s means taking state s as the initial state, V^π(s) is the state value function with initial state s under policy π, and V*(s) is the optimal state value function with initial state s.
According to the Bellman optimality equation:

\( V^{*}(s_t) = \max_{a} E\big[r_{t+1} + \gamma V^{*}(s_{t+1})\big] = \max_{a} \Big[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^{*}(s') \Big] \)

where r_{t+1} is the immediate reward of the action executed at time t + 1, V*(s_{t+1}) is the optimal state value function of state s_{t+1} at time t + 1, s′ is the state at the next time step, T(s, a, s′) is the state transition probability, γ is the discount factor, R(s, a) is the reward for action a in state s, and V*(s′) is the optimal state value function of the next state s′.
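One sweep of the Bellman optimality backup can be sketched on a finite MDP. The two-state chain below is an assumption for illustration, not the patent's state space:

```python
def bellman_backup(V, states, actions, T, R, gamma):
    """V*(s) <- max_a [ R(s, a) + gamma * sum_s' T(s, a, s') * V(s') ]."""
    return {s: max(R(s, a) + gamma * sum(T(s, a, s2) * V[s2] for s2 in states)
                   for a in actions)
            for s in states}

# Toy two-state chain: "go" moves 0 -> 1 with reward 1; state 1 is absorbing.
states, actions = [0, 1], ["stay", "go"]
T = lambda s, a, s2: 1.0 if s2 == (1 if a == "go" else s) else 0.0
R = lambda s, a: 1.0 if (s == 0 and a == "go") else 0.0
V = bellman_backup({0: 0.0, 1: 0.0}, states, actions, T, R, 0.9)
print(V)  # {0: 1.0, 1: 0.0}
```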
3.4 Solve for an optimal policy by policy iteration:
First a policy π_t is randomly initialized and the state value function v_t under this policy is computed; a new policy π_{t+1} is then derived from these state value functions, and the value function v_{t+1} of each state under the new policy is computed, repeating until convergence.
Computing the value of each state under a policy is called policy evaluation; deriving a new policy from the state values is called policy improvement.
3.4.1 Carry out policy evaluation:
According to the Bellman equation, the value function of a state is related to the value functions of its successor states; therefore the value function v(s) of the current state is updated using the successor state value function v(s′).
Policy evaluation traverses all states and updates the state value function according to:

\( V^{\pi_t}(s) \leftarrow \sum_{a} \pi(s, a) \Big[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^{\pi_t}(s') \Big] \)

After updating the state value function, policy π_t is added to the optimal policy sequence B.
Here V^{π_t}(s) is the value of state s under policy π_t, V^{π_t}(s′) is the value of state s′ under policy π_t, and π(s, a) denotes the probability that the policy takes action a in state s.
3.4.2 Carry out policy improvement:
A new policy that is better than the old one is obtained from the state value function: for each state s, the policy selects the action a that maximizes the current state value R(s, a) + γ Σ_{s′} T(s, a, s′) V^π(s′), i.e.

\( \pi_{t+1}(s) = \arg\max_{a} \Big[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^{\pi}(s') \Big] \)

where π_{t+1} denotes the policy at time t + 1.
3.4.3 Output the optimal policy sequence B according to the result of policy improvement: determine whether the state in the policy is the target state; if it is, exit the policy iteration and output the optimal policy sequence B; if not, carry out policy evaluation again until a state s that is the target state is met, and then output the optimal policy B.
The condition for judging whether the target is reached is:

\( p(y = c \mid x^*) > z \)
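Steps 3.4.1–3.4.3 can be sketched end to end as standard policy iteration on a finite MDP. The toy two-state chain is an assumption for illustration; the patent additionally tracks the policy sequence B and the target-state test p(y = c | x*) > z, which are omitted here:

```python
def policy_evaluation(pi, states, T, R, gamma, tol=1e-9):
    """Iterate the Bellman expectation backup for a deterministic policy pi."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            v = R(s, pi[s]) + gamma * sum(T(s, pi[s], s2) * V[s2] for s2 in states)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V

def policy_improvement(V, states, actions, T, R, gamma):
    """Greedy policy: pick the action maximizing R(s,a) + gamma * E[V(s')]."""
    return {s: max(actions, key=lambda a: R(s, a) +
                   gamma * sum(T(s, a, s2) * V[s2] for s2 in states))
            for s in states}

def policy_iteration(states, actions, T, R, gamma):
    """Alternate evaluation and improvement until the policy is stable."""
    pi = {s: actions[0] for s in states}
    while True:
        V = policy_evaluation(pi, states, T, R, gamma)
        new_pi = policy_improvement(V, states, actions, T, R, gamma)
        if new_pi == pi:
            return pi, V
        pi = new_pi

# Toy two-state chain: "go" moves 0 -> 1 with reward 1; state 1 is absorbing.
states, actions = [0, 1], ["stay", "go"]
T = lambda s, a, s2: 1.0 if s2 == (1 if a == "go" else s) else 0.0
R = lambda s, a: 1.0 if (s == 0 and a == "go") else 0.0
pi, V = policy_iteration(states, actions, T, R, 0.9)
print(pi[0], V[0])  # go 1.0
```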
The invention proposes a method combining the classical reinforcement learning technique of the Markov decision process, providing a new approach for the field of action knowledge extraction. The invention is based on the random forest model, one of the best existing classification models, widely used in practical problems. The preprocessing performed by the random forest model orders and categorizes the data, reducing the time needed for the iteration in the subsequent Markov decision process to find an optimal policy; the proposed action knowledge extraction method therefore efficiently finds an optimal policy in finite time with improved accuracy. An action as defined by the action knowledge extraction of the invention can change multiple attribute values of a state, yielding accurate and feasible suggestions in practical applications. Because every state in the Markov decision process is fully observed at each step, the accuracy of the iterative search for an optimal policy is guaranteed. By combining the Markov decision process, the invention does not need to build a model from the whole data set; it can extract available action knowledge for an individual record, and can obtain a better policy by autonomously understanding the environment through interaction with it.
Claims (7)
1. An action knowledge extraction method combining a Markov decision process, characterized by comprising the following steps:
Step 1: train a random forest model H;
Step 2: define the action knowledge extraction problem (AKE): for the random forest model H, partition the attributes, define attribute changes and actions, and on this basis define the AKE problem;
Step 3: solve the AKE optimization problem with a Markov decision process: for any input data, define a Markov decision process (MDP) and a policy, update the policy by policy iteration, and finally obtain an optimal policy.
2. The action knowledge extraction method combining a Markov decision process according to claim 1, characterized in that training the random forest model H in step 1 is specifically:
given a training data set, building a random forest model H; the training data set is defined as {X, Y}, where X is the set of input data vectors and Y is the set of output class labels; the random forest model H is built by random sampling and node splitting, and its prediction function is

\( \hat{y} = H(\vec{x}) = \arg\max_{c} \, p(y = c \mid \vec{x}), \qquad p(y = c \mid \vec{x}) = \frac{1}{\sum_{d=1}^{D} w_d} \sum_{d=1}^{D} w_d \, I\big(h_d(\vec{x}) = c\big) \)

where \(\vec{x}\) is an input vector, y ∈ Y is the class predicted by the random forest model H for input \(\vec{x}\), c is the expected target class, d indexes the d-th decision tree, D is the total number of decision trees in the random forest, w_d is the weight of the d-th decision tree, \(h_d(\vec{x})\) is the output of the d-th decision tree for input \(\vec{x}\), I(·) is the indicator function, and \(p(y = c \mid \vec{x})\) is the probability that the predicted class for input data vector \(\vec{x}\) is c.
3. The action knowledge extraction method combining a Markov decision process according to claim 1, characterized in that defining the action knowledge extraction problem in step 2 specifically comprises the following steps:
2.1 partition the attributes: given a random forest model H, each attribute x_i (i = 1, ..., M) is divided into a number M of intervals;
2.2 define attribute changes: given a random forest model H, an attribute change τ is defined as a triple τ = (x_i, p, q), where p and q are two partition intervals of attribute x_i;
2.3 define actions:
an action a is defined as a set of attribute changes, i.e. a = {τ_1, ..., τ_|a|}; each action a has an immediate reward R(a);
here |a| denotes the number of attribute changes in action a, with |a| ≥ 1, i.e., an action a contains at least one attribute change τ;
2.4 define the action knowledge extraction problem (AKE) as:

\( \max_{A_s \subseteq A} F(A_s) = \sum_{a_i \in A_s} R(a_i) \quad \text{subject to} \quad p(y = c \mid x^*) > z \)

where A is the set of executable actions, A_s is the optimal action sequence to be found, a_i is any action in A_s, R(a_i) is the immediate reward of action a_i, F(A_s) is the total reward obtained by executing the action sequence A_s, y is the class predicted by the random forest model H for an input vector, z is a constant threshold, and x^* is the vector obtained from the initial input vector \(\vec{x}\) after performing all actions in the optimal sequence A_s.
4. The action knowledge extraction method combining a Markov decision process according to claim 3, characterized in that in step 2.1 attribute x_i is divided into M intervals as follows:
1) if attribute x_i is categorical with n categories, it is naturally divided into n intervals, and M = n;
2) if attribute x_i is numeric and a branch node on some decision tree in the random forest model H tests x_i > b, then b is a cut-point of attribute x_i; if attribute x_i has n cut-points over all decision trees, it is divided into n + 1 intervals, and M = n + 1.
5. The action knowledge extraction method combining a Markov decision process according to claim 1, characterized in that solving the AKE optimization problem with a Markov decision process in step 3 specifically comprises the following steps:
3.1 define the Markov decision process Π_MDP = {S, A, T, R}:
S denotes the state space, with states denoted s; A denotes the action space, with actions denoted a; T: S × A × S → [0, 1] is the state transition function, giving the probability of reaching another state after performing an action in a given state; R: S × A → ℝ is the reward function, giving the immediate reward provided by the environment when a state transition occurs; starting from state s, an action a ∈ A(s) is taken, the reward R(s, a) fed back by the environment is received, and the process moves to the next state s′ ∈ S with probability T(s, a, s′), where A(s) denotes the set of actions that can be taken in state s;
3.2 define the policy:
a policy π is a mapping from states to actions, π: S × A → [0, 1]; the goal is to find an optimal policy π* with the largest cumulative reward R^π:

\( \pi^* = \arg\max_{\pi} R^{\pi}, \qquad R^{\pi} = E_\pi\!\left[\sum_{t=0}^{\infty} \gamma^t r_t\right] \)

where R^π is the cumulative reward of the actions executed at each time t under policy π, γ^t is the t-th power of the discount factor γ, E_π[·] is the expectation under policy π, and r_t is the immediate reward of the action executed at time t;
3.3 define the value function:
given a policy π, the state value function is defined as:

\( V^{\pi}(s) = E_\pi\!\left[\sum_{t=0}^{\infty} \gamma^t r_t \,\middle|\, s_0 = s\right] \)

based on the optimal policy π*, the optimal state value function V*(s) can be defined as:

\( V^{*}(s) = \max_{\pi} V^{\pi}(s) \)

where s_0 denotes the initial state, s_0 = s means taking state s as the initial state, V^π(s) is the state value function with initial state s under policy π, and V*(s) is the optimal state value function with initial state s;
according to the Bellman optimality equation:

\( V^{*}(s_t) = \max_{a} E\big[r_{t+1} + \gamma V^{*}(s_{t+1})\big] = \max_{a} \Big[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^{*}(s') \Big] \)

where r_{t+1} is the immediate reward of the action executed at time t + 1, V*(s_{t+1}) is the optimal state value function of state s_{t+1} at time t + 1, s′ is the state at the next time step, T(s, a, s′) is the state transition probability, γ is the discount factor, R(s, a) is the reward for action a in state s, and V*(s′) is the optimal state value function of the next state s′;
3.4 solve for an optimal policy by policy iteration:
first a policy π_t is randomly initialized and the state value function v_t under this policy is computed; a new policy π_{t+1} is then derived from these state value functions, and the value function v_{t+1} of each state under the new policy is computed, repeating until convergence.
6. The action knowledge extraction method combining a Markov decision process according to claim 5, characterized in that solving for an optimal policy by policy iteration in step 3.4 specifically comprises the following steps:
3.4.1 carry out policy evaluation:
according to the Bellman equation, the value function of a state is related to the value functions of its successor states; therefore the value function v(s) of the current state is updated with the successor state value function v(s′);
policy evaluation traverses all states and updates the state value function according to:

\( V^{\pi_t}(s) \leftarrow \sum_{a} \pi(s, a) \Big[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^{\pi_t}(s') \Big] \)

after updating the state value function, policy π_t is added to the optimal policy sequence B;
here V^{π_t}(s) is the value of state s under policy π_t, V^{π_t}(s′) is the value of state s′ under policy π_t, and π(s, a) denotes the probability that the policy takes action a in state s;
3.4.2 carry out policy improvement:
a new policy that is better than the old one is obtained from the state value function; for each state s, the policy selects the action a that maximizes the current state value R(s, a) + γ Σ_{s′} T(s, a, s′) V^π(s′), i.e.

\( \pi_{t+1}(s) = \arg\max_{a} \Big[ R(s, a) + \gamma \sum_{s'} T(s, a, s') V^{\pi}(s') \Big] \)

where π_{t+1} denotes the policy at time t + 1;
3.4.3 output the optimal policy sequence B according to the result of policy improvement: determine whether the state in the policy is the target state; if it is, exit the policy iteration and output the optimal policy sequence B; if not, carry out policy evaluation again until a state s that is the target state is met, and output the optimal policy B.
7. The action knowledge extraction method combining a Markov decision process according to claim 6, characterized in that the condition in step 3.4.3 for judging whether the target is reached is:

\( p(y = c \mid x^*) > z \)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710173631.7A CN106997488A (en) | 2017-03-22 | 2017-03-22 | A kind of action knowledge extraction method of combination markov decision process |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710173631.7A CN106997488A (en) | 2017-03-22 | 2017-03-22 | A kind of action knowledge extraction method of combination markov decision process |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106997488A true CN106997488A (en) | 2017-08-01 |
Family
ID=59431600
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710173631.7A Pending CN106997488A (en) | 2017-03-22 | 2017-03-22 | A kind of action knowledge extraction method of combination markov decision process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106997488A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108376287A (en) * | 2018-03-02 | 2018-08-07 | 复旦大学 | Multi-valued attribute segmenting device based on CN-DBpedia and method |
CN108510110A (en) * | 2018-03-13 | 2018-09-07 | 浙江禹控科技有限公司 | A kind of water table trend analysis method of knowledge based collection of illustrative plates |
CN109741626A (en) * | 2019-02-24 | 2019-05-10 | 苏州科技大学 | Parking situation prediction technique, dispatching method and system |
CN110363015A (en) * | 2019-07-10 | 2019-10-22 | 华东师范大学 | A kind of construction method of the markov Prefetching Model based on user property classification |
CN110378717A (en) * | 2018-04-13 | 2019-10-25 | 北京京东尚科信息技术有限公司 | Method and apparatus for output information |
CN111294284A (en) * | 2018-12-10 | 2020-06-16 | 华为技术有限公司 | Traffic scheduling method and device |
CN113112051A (en) * | 2021-03-11 | 2021-07-13 | 同济大学 | Production maintenance joint optimization method for serial production system based on reinforcement learning |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5806056A (en) * | 1994-04-29 | 1998-09-08 | International Business Machines Corporation | Expert system and method employing hierarchical knowledge base, and interactive multimedia/hypermedia applications |
CN101000624A (en) * | 2007-01-10 | 2007-07-18 | 华为技术有限公司 | Method, system and device for implementing data mining model conversion and application |
CN102054002A (en) * | 2009-10-28 | 2011-05-11 | 中国移动通信集团公司 | Method and device for generating decision tree in data mining system |
CN103034691A (en) * | 2012-11-30 | 2013-04-10 | 南京航空航天大学 | Method for getting expert system knowledge based on support vector machine |
CN103246991A (en) * | 2013-05-28 | 2013-08-14 | 运筹信息科技(上海)有限公司 | Data mining-based customer relationship management method and data mining-based customer relationship management system |
CN103258255A (en) * | 2013-03-28 | 2013-08-21 | 国家电网公司 | Knowledge discovery method applicable to power grid management system |
CN105182988A (en) * | 2015-09-11 | 2015-12-23 | 西北工业大学 | Pilot operation behavior guiding method based on Markov decision-making process |
CN105955921A (en) * | 2016-04-18 | 2016-09-21 | 苏州大学 | Robot hierarchical reinforcement learning initialization method based on automatic discovery of abstract action |
CN106021377A (en) * | 2016-05-11 | 2016-10-12 | 上海点荣金融信息服务有限责任公司 | Information processing method and device implemented by computer |
CN106156488A (en) * | 2016-06-22 | 2016-11-23 | 南京邮电大学 | Knowledge graph based on Bayes's personalized ordering link Forecasting Methodology |
CN106447463A (en) * | 2016-10-21 | 2017-02-22 | 南京大学 | Commodity recommendation method based on Markov decision-making process model |
-
2017
- 2017-03-22 CN CN201710173631.7A patent/CN106997488A/en active Pending
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5806056A (en) * | 1994-04-29 | 1998-09-08 | International Business Machines Corporation | Expert system and method employing hierarchical knowledge base, and interactive multimedia/hypermedia applications |
CN101000624A (en) * | 2007-01-10 | 2007-07-18 | 华为技术有限公司 | Method, system and device for implementing data mining model conversion and application |
CN102054002A (en) * | 2009-10-28 | 2011-05-11 | 中国移动通信集团公司 | Method and device for generating decision tree in data mining system |
CN103034691A (en) * | 2012-11-30 | 2013-04-10 | 南京航空航天大学 | Method for getting expert system knowledge based on support vector machine |
CN103258255A (en) * | 2013-03-28 | 2013-08-21 | 国家电网公司 | Knowledge discovery method applicable to power grid management system |
CN103246991A (en) * | 2013-05-28 | 2013-08-14 | 运筹信息科技(上海)有限公司 | Data mining-based customer relationship management method and data mining-based customer relationship management system |
CN105182988A (en) * | 2015-09-11 | 2015-12-23 | 西北工业大学 | Pilot operation behavior guiding method based on Markov decision-making process |
CN105955921A (en) * | 2016-04-18 | 2016-09-21 | 苏州大学 | Robot hierarchical reinforcement learning initialization method based on automatic discovery of abstract action |
CN106021377A (en) * | 2016-05-11 | 2016-10-12 | 上海点荣金融信息服务有限责任公司 | Information processing method and device implemented by computer |
CN106156488A (en) * | 2016-06-22 | 2016-11-23 | 南京邮电大学 | Knowledge graph based on Bayes's personalized ordering link Forecasting Methodology |
CN106447463A (en) * | 2016-10-21 | 2017-02-22 | 南京大学 | Commodity recommendation method based on Markov decision-making process model |
Non-Patent Citations (4)
Title |
---|
LONGBING CAO: ""Actionable knowledge discovery and delivery"", 《METASYNTHETIC COMPUTING AND ENGINEERING OF COMPLEX SYSTEMS》 * |
QIANG YANG等: ""Extracting Actionable Knowledge from Decision Trees"", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *
ZHICHENG CUI等: ""Optimal Action Extraction for Random Forests and Boosted Trees"", 《PROCEEDINGS OF THE 21TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING》 * |
CHEN Xingguo et al.: ""Reinforcement learning and its application to computer Go"", 《Acta Automatica Sinica》 *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108376287A (en) * | 2018-03-02 | 2018-08-07 | 复旦大学 | Multi-valued attribute segmenting device based on CN-DBpedia and method |
CN108510110A (en) * | 2018-03-13 | 2018-09-07 | 浙江禹控科技有限公司 | A kind of water table trend analysis method of knowledge based collection of illustrative plates |
CN110378717A (en) * | 2018-04-13 | 2019-10-25 | 北京京东尚科信息技术有限公司 | Method and apparatus for outputting information
CN110378717B (en) * | 2018-04-13 | 2024-03-05 | 北京京东尚科信息技术有限公司 | Method and device for outputting information |
CN111294284A (en) * | 2018-12-10 | 2020-06-16 | 华为技术有限公司 | Traffic scheduling method and device |
CN111294284B (en) * | 2018-12-10 | 2022-04-26 | 华为技术有限公司 | Traffic scheduling method and device |
CN109741626A (en) * | 2019-02-24 | 2019-05-10 | 苏州科技大学 | Parking situation prediction method, scheduling method and system
CN109741626B (en) * | 2019-02-24 | 2023-09-29 | 苏州科技大学 | Parking situation prediction method, scheduling method and system for parking lot |
CN110363015A (en) * | 2019-07-10 | 2019-10-22 | 华东师范大学 | A construction method of a Markov prefetching model based on user attribute classification
CN113112051A (en) * | 2021-03-11 | 2021-07-13 | 同济大学 | Production maintenance joint optimization method for serial production system based on reinforcement learning |
CN113112051B (en) * | 2021-03-11 | 2022-10-25 | 同济大学 | Production maintenance joint optimization method for serial production system based on reinforcement learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106997488A (en) | An action knowledge extraction method combining the Markov decision process | |
Al-Shabandar et al. | A deep gated recurrent neural network for petroleum production forecasting | |
Alzubaidi et al. | A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications | |
WO2023065545A1 (en) | Risk prediction method and apparatus, and device and storage medium | |
CN110889556B (en) | Enterprise operation risk characteristic data information extraction method and extraction system | |
CN109299396B (en) | Convolutional neural network collaborative filtering recommendation method and system fusing attention model | |
Jin et al. | Bayesian symbolic regression | |
CN104008203B (en) | A user interest mining method incorporating ontology context | |
WO2019015631A1 (en) | Method for generating combined features for machine learning samples and system | |
Pirani et al. | A comparative analysis of ARIMA, GRU, LSTM and BiLSTM on financial time series forecasting | |
US11151480B1 (en) | Hyperparameter tuning system results viewer | |
CN105893609A (en) | Mobile APP recommendation method based on weighted mixing | |
CN104798043A (en) | Data processing method and computer system | |
WO2018133596A1 (en) | Continuous feature construction method based on nominal attribute | |
CN103324954A (en) | Image classification method based on tree structure and system using same | |
CN107451230A (en) | A question answering method and question answering system | |
CN113326852A (en) | Model training method, device, equipment, storage medium and program product | |
Patidar et al. | Handling missing value in decision tree algorithm | |
CN116861924A (en) | Project risk early warning method and system based on artificial intelligence | |
CN107368895A (en) | An action knowledge extraction method combining machine learning and automated planning | |
Li | A study on the influence of non-intelligence factors on college students’ English learning achievement based on C4.5 algorithm of decision tree | |
Prudêncio et al. | A modal symbolic classifier for selecting time series models | |
Kim et al. | Knowledge extraction and representation using quantum mechanics and intelligent models | |
CN110310012A (en) | Data analysing method, device, equipment and computer readable storage medium | |
CN113326884A (en) | Efficient learning method and device for large-scale abnormal graph node representation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2017-08-01