CN108809713A - Monte Carlo tree searching method based on optimal resource allocation algorithm - Google Patents
- Publication number
- CN108809713A (application CN201810593129.6A)
- Authority
- CN
- China
- Prior art keywords
- monte carlo
- decision scheme
- decision
- carlo tree
- resource allocation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0803—Configuration setting
- H04L41/0823—Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a Monte Carlo tree search method based on an optimal resource allocation algorithm. Only the selection strategy for the child nodes of the root node of the Monte Carlo tree is adjusted: the optimal resource allocation algorithm distributes the simulation computing resource among the Monte Carlo subtrees corresponding to the child nodes, while the search method used inside each subtree, such as the tree policy, remains unchanged. The method can therefore be combined easily with existing Monte Carlo tree search methods, and it can improve the decision performance of Monte Carlo tree search when computing resources are limited. The method is applicable to Monte Carlo tree search methods of all concrete forms and has a wide range of applications.
Description
Technical field
The present invention relates to the technical field of games, and in particular to a Monte Carlo tree search method based on an optimal resource allocation algorithm.
Background art
A Markov decision process (MDP) models a sequential decision problem in a known environment with the four-tuple {state set, action set, transition model, reward function}. A complete decision process can be described as a sequence of {state, action} pairs, in which each next state s′ is drawn from a probability distribution that depends on the current state s and the selected action a. A policy in an MDP is a mapping from the state space to the action space, that is, a rule for choosing a specific action in each state. The goal of an MDP is to find the policy with the highest expected return. When the number of states in the environment is very large or hard to determine, policies cannot be evaluated effectively. One effective remedy is to use Monte Carlo tree search (MCTS) to estimate the value function of each {state, action} pair in place of policy evaluation.
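As a minimal sketch of estimating the value of a {state, action} pair by simulation, the following code averages the returns of repeated rollouts in a toy MDP. The `MDP` container, the `estimate_q` helper, the fixed rollout policy and the finite `horizon` are illustrative assumptions, and this is plain Monte Carlo sampling rather than a full tree search.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class MDP:
    """Hypothetical container for the four-tuple {states, actions, transition, reward}."""
    states: Sequence[str]
    actions: Sequence[str]
    transition: Callable[[str, str, random.Random], str]  # samples s' given (s, a)
    reward: Callable[[str, str, str], float]              # r(s, a, s')


def estimate_q(mdp, state, action, policy, horizon, n_rollouts, rng):
    """Average return of n_rollouts simulations that start with (state, action)."""
    total = 0.0
    for _ in range(n_rollouts):
        s, a, ret = state, action, 0.0
        for _ in range(horizon):
            s_next = mdp.transition(s, a, rng)
            ret += mdp.reward(s, a, s_next)
            s, a = s_next, policy(s_next, rng)  # later actions follow the rollout policy
        total += ret
    return total / n_rollouts


# Toy example: "go" moves to the goal state, "stay" keeps the current state.
toy = MDP(
    states=["start", "goal"],
    actions=["go", "stay"],
    transition=lambda s, a, rng: "goal" if a == "go" else s,
    reward=lambda s, a, s_next: 1.0 if s_next == "goal" else 0.0,
)
always_stay = lambda s, rng: "stay"
q_go = estimate_q(toy, "start", "go", always_stay, 3, 10, random.Random(0))
q_stay = estimate_q(toy, "start", "stay", always_stay, 3, 10, random.Random(0))
```

In this toy problem the estimate for "go" is higher than the estimate for "stay", so a greedy choice over the estimated values picks "go".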
Monte Carlo tree search is a method that finds the best decision in a given domain by randomly sampling the decision space and building a search tree from the results. It has had a far-reaching influence on artificial intelligence (AI); in theory, MCTS can be applied to any domain that can be described by {state, action} pairs and whose outcomes can be predicted by simulation. Owing to the great success of MCTS in the game of Go and its potential application to many other problems, research interest in MCTS has risen sharply.
The origins of MCTS can be traced back to 1928, when John von Neumann's minimax theorem paved the way for adversarial tree search methods. Monte Carlo methods were then formalized in the 1940s as a way of using random sampling to handle well-defined problems that are poorly suited to tree search. Finally, Rémi Coulom combined the two approaches in 2006 and proposed MCTS to support move planning in Go.
Since then, MCTS has been studied extensively and many variants have appeared, such as Upper Confidence bounds applied to Trees (UCT), single-player and multi-player MCTS, and real-time MCTS. The tree policy and other aspects of MCTS have also been improved and enhanced. Nevertheless, all Monte Carlo based methods have one thing in common: they need a large number of simulation experiments to gather statistics on the problem at hand. When computing resources are scarce, some critical state nodes or action edges may never be visited during the search, even for problems of moderate complexity, which is why MCTS performs poorly under a small computing budget.
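The UCT variant mentioned above is commonly built on the UCB1 selection rule, which trades off the average return of a child against how rarely it has been visited. The sketch below shows that rule only; the function name `ucb1_select`, the list-based interface and the exploration constant `c = sqrt(2)` are assumptions chosen for illustration, not details taken from this document.

```python
import math
from typing import Sequence


def ucb1_select(visits: Sequence[int], total_returns: Sequence[float],
                c: float = math.sqrt(2)) -> int:
    """Return the index of the child that maximizes the UCB1 score."""
    # Unvisited children are tried first so that every score below is defined.
    for i, n in enumerate(visits):
        if n == 0:
            return i
    parent_visits = sum(visits)
    scores = [
        total_returns[i] / visits[i]                            # exploitation: average return
        + c * math.sqrt(math.log(parent_visits) / visits[i])    # exploration bonus
        for i in range(len(visits))
    ]
    return max(range(len(visits)), key=scores.__getitem__)


first_unvisited = ucb1_select([10, 0, 5], [6.0, 0.0, 2.0])
better_mean = ucb1_select([100, 100], [60.0, 40.0])
rarely_visited = ucb1_select([1000, 1], [600.0, 0.5])
```

With equal visit counts the child with the higher mean is chosen, while a child visited only once can still win on its exploration bonus.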
Summary of the invention
The object of the present invention is to provide a Monte Carlo tree search method based on an optimal resource allocation algorithm that substantially improves the performance of Monte Carlo tree search when computing resources are limited.
The object of the present invention is achieved through the following technical solution:
A Monte Carlo tree search method based on an optimal resource allocation algorithm, comprising:
Taking the initial state of the problem to be decided as the root node $R_0$ of the Monte Carlo tree; assuming the corresponding action space contains n actions, the root node $R_0$ has n child nodes, each of which serves both as the root node of a sub Monte Carlo tree and as a decision scheme of the optimal resource allocation algorithm;
Allocating an initial computing resource to each decision scheme, running on the sub Monte Carlo tree of each decision scheme the number of Monte Carlo tree search iterations that corresponds to its allocation, and recording the return of each iteration;
Judging whether the total computing resource used by all decision schemes after round l, $\sum_{i=1}^{n} N_i^l$, is not less than the maximum available computing resource T, where $N_i^l$ denotes the total computing resource of decision scheme $\theta_i$ after computing resources have been allocated in round l;
If not, adding computing resource Δ, using the optimal resource allocation algorithm to determine from the historical returns of each decision scheme the computing resource actually available to each decision scheme in round l+1, and performing the same iterative computation as in the preceding step;
If so, terminating the Monte Carlo tree search and selecting the action corresponding to the decision scheme with the best average performance.
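The control flow of these steps can be sketched as a simple loop. This is an illustration under stated assumptions rather than the patented implementation: `search_once(i)` is a hypothetical callable that runs one search iteration on sub-tree i and returns its return, `allocate` is a hypothetical pluggable allocator that maps means, variances and a total budget to per-scheme target totals, and the guard against a round with no extra work is an addition not found in the text.

```python
from statistics import fmean, pvariance
from typing import Callable, List, Sequence, Tuple


def budgeted_root_search(
    n: int,
    search_once: Callable[[int], float],
    allocate: Callable[[Sequence[float], Sequence[float], int], Sequence[int]],
    n0: int,
    delta: int,
    max_budget: int,
) -> Tuple[int, List[int]]:
    """Return (index of best decision scheme, iterations used per scheme)."""
    # Initial round: every decision scheme receives n0 iterations.
    returns = [[search_once(i) for _ in range(n0)] for i in range(n)]
    # Stop once the total used budget is not less than max_budget (T).
    while sum(len(r) for r in returns) < max_budget:
        used = [len(r) for r in returns]
        means = [fmean(r) for r in returns]
        variances = [pvariance(r) for r in returns]
        # Add delta to the budget and ask the allocator for new per-scheme totals.
        targets = allocate(means, variances, sum(used) + delta)
        extra = [max(0, t - u) for t, u in zip(targets, used)]
        if sum(extra) == 0:
            # Assumption: guarantee progress if the allocator asks for nothing new.
            extra[means.index(max(means))] = 1
        for i, k in enumerate(extra):
            returns[i].extend(search_once(i) for _ in range(k))
    means = [fmean(r) for r in returns]
    best = max(range(n), key=means.__getitem__)
    return best, [len(r) for r in returns]


# Usage with a deterministic stand-in for the sub-tree search and an even split.
even_split = lambda means, variances, budget: [budget // len(means)] * len(means)
best, counts = budgeted_root_search(
    3, lambda i: [0.2, 0.9, 0.5][i], even_split, n0=2, delta=3, max_budget=12
)
```

Passing the allocator as a parameter keeps the loop independent of the allocation rule, which mirrors the way the method leaves the search inside each sub-tree unchanged.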
As the above technical solution shows, the invention adjusts only the selection strategy for the child nodes of the root node of the Monte Carlo tree: the optimal resource allocation algorithm distributes the simulation computing resource among the Monte Carlo subtrees corresponding to the child nodes, while the search method used inside each subtree, such as the tree policy, remains unchanged. The method can therefore be combined easily with existing Monte Carlo tree search methods, and it can improve the decision performance of Monte Carlo tree search when computing resources are limited. The method is applicable to Monte Carlo tree search methods of all concrete forms and has a wide range of applications.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of a Monte Carlo tree search method based on an optimal resource allocation algorithm according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the Monte Carlo tree search based on the optimal resource allocation algorithm according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the Monte Carlo tree search process performed on a child node according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a Monte Carlo tree search method based on the Optimal Computing Budget Allocation (OCBA) algorithm. To address the poor decision performance of Monte Carlo tree search when computing resources are limited, the method treats each child node of the root of the Monte Carlo tree as a decision scheme and uses the optimal resource allocation algorithm to distribute computing resources among the decision schemes more appropriately for their Monte Carlo tree searches, so that the performance of the search improves substantially under a limited computing budget.
The main flow of the invention is shown in Fig. 1 and comprises the following parts:
1. Take the initial state of the problem to be decided as the root node $R_0$ of the Monte Carlo tree. Assuming the corresponding action space contains n actions, the root node $R_0$ has n child nodes; each child node serves both as the root node of a sub Monte Carlo tree and as a decision scheme of the optimal resource allocation algorithm.
In this embodiment, assume the corresponding action space contains n actions. Executing each of the n actions leads to n new states, which form the n child nodes of the root node $R_0$. Taking each child node as the root of a sub Monte Carlo tree gives n mutually independent sub Monte Carlo trees $SMCT_i$, and each child node serves as a decision scheme $\theta_i$ of the optimal resource allocation algorithm.
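Step 1 can be pictured as expanding the root once and keeping one record per child. In the sketch below, `DecisionScheme`, `expand_root`, `legal_actions` and `step` are hypothetical names introduced for illustration; the states are plain integers only so that the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List


@dataclass
class DecisionScheme:
    """One child of the root: a decision scheme and the root of its own sub-tree."""
    index: int                 # i in theta_i, numbered from 1 as in the text
    action: Any                # action that leads from the root state to this child
    subtree_root: Any          # state at the root of the sub Monte Carlo tree
    returns: List[float] = field(default_factory=list)  # return of each iteration


def expand_root(root_state: Any,
                legal_actions: Callable[[Any], Iterable[Any]],
                step: Callable[[Any, Any], Any]) -> List[DecisionScheme]:
    """Create one independent decision scheme per action available at the root."""
    return [
        DecisionScheme(i, a, step(root_state, a))
        for i, a in enumerate(legal_actions(root_state), start=1)
    ]


# Toy usage: the state is an integer and an action adds to it.
schemes = expand_root(0, lambda s: [1, 2, 3], lambda s, a: s + a)
```

Each record keeps its own list of returns, which reflects that the n sub-trees are searched independently of one another.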
2. Allocate an initial computing resource to each decision scheme, run on the sub Monte Carlo tree of each decision scheme the number of Monte Carlo tree search iterations that corresponds to its allocation, and record the return of each iteration.
In this embodiment, initially (that is, when l = 0) an initial computing resource is allocated to each decision scheme, and the sub Monte Carlo tree of each decision scheme runs the corresponding number of Monte Carlo tree search iterations.
For ease of understanding, this embodiment treats the computing resource as the number of Monte Carlo tree search iterations. Let l = 0 and $N_i^0 = N_0$; $N_0$ Monte Carlo tree search iterations are performed on the sub Monte Carlo tree $SMCT_i$ of each decision scheme, and the return of each iteration is recorded.
In practice, depending on the setting, the computing resource can also be understood as computation time, storage space, and so on.
3. Judge whether the total computing resource used by all decision schemes after round l, $\sum_{i=1}^{n} N_i^l$, is not less than the maximum available computing resource T.
In this embodiment, $N_i^l$ denotes the total computing resource of decision scheme $\theta_i$ after computing resources have been allocated in round l, that is, the sum of the computing resources that the decision scheme used in round l and in every round before it.
4. Add computing resource Δ, use the optimal resource allocation algorithm to determine from the historical returns of each decision scheme the total computing resource $N_i^{l+1}$ of each decision scheme over rounds 1 to l+1, derive from it the computing resource actually available to each decision scheme in round l+1, and perform the same Monte Carlo tree search iterations as in step 2.
In this embodiment, the optimal resource allocation algorithm uses the mean and variance of the historical returns of each decision scheme to distribute the total available computing resource, of quantity $\sum_{i=1}^{n} N_i^l + \Delta$, among the decision schemes; the amount assigned to decision scheme $\theta_i$ for round l+1 is $N_i^{l+1}$. The difference between $N_i^{l+1}$ and $N_i^l$ determines the computing resource actually available to each scheme for simulation in round l+1.
Specifically, for all decision schemes $\theta_i$, $i \in I = \{1, 2, \ldots, n\}$, denote an arbitrary non-optimal decision scheme by $\theta_j$, the optimal decision scheme by $\theta_b$, and the remaining decision schemes by $\theta_x$, where $j, b \in I$, $j \neq b$, and $x \in X = I - \{j, b\}$. Likewise, the symbols j, b and x label the quantities associated with the non-optimal decision scheme, the optimal decision scheme and the other decision schemes, respectively. For example, if j = 1 and b = 2, then $x \in X = \{3, 4, 5, \ldots, n\}$.
Then, for all $i \in I = \{1, 2, \ldots, n\}$, $j, b \in I$, $j \neq b$, $x \in X = I - \{j, b\}$, the following formulas hold:
where:
In the above formulas, $N_j^{l+1}$, $N_b^{l+1}$ and $N_x^{l+1}$ denote the computing resource of the non-optimal decision scheme $\theta_j$, the optimal decision scheme $\theta_b$ and the other decision schemes $\theta_x$ in round l+1; N denotes the total computing resource of the decision scheme given by its subscript after resources have been allocated in the round given by its superscript; $\mu_k(\theta_i)$ denotes the return of decision scheme $\theta_i$ in its k-th computation; b is the label of the decision scheme with the highest average historical return after l rounds of search iterations; μ denotes the mean of the historical returns and δ their variance, with the superscript l giving the round number and the subscript the label of the decision scheme concerned, so that, for example, $\mu_i^l$ and $\delta_i^l$ are the mean and variance of the historical returns of decision scheme $\theta_i$ over rounds 1 to l. Two intermediate parameters serve as the proportionality coefficients of the total computing resource that the other schemes $\theta_x$ and the optimal scheme $\theta_b$ receive in round l relative to the selected non-optimal decision scheme $\theta_j$, under the assumption that $\theta_j$ receives one unit of computing resource. The formula for the coefficient of $\theta_x$ shows that the larger the mean of the historical returns of $\theta_x$ (the better it performs) and the larger their variance (the more uncertain its performance, so that more computation is needed to determine its true performance), the larger the coefficient, and hence the more computation is allocated to $\theta_x$.
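For orientation, the allocation rule commonly given for OCBA in the simulation optimization literature can be written in the notation above as follows. The coefficient names $\beta_j$, $\beta_x$ and $\beta_b$ are introduced here purely for illustration, δ is read as a variance as defined above, and it is an assumption that the formulas of this method coincide with this textbook form.

```latex
\begin{aligned}
\beta_j &= 1, \qquad
\beta_x = \frac{\delta_x^{l} / (\mu_b^{l} - \mu_x^{l})^{2}}
               {\delta_j^{l} / (\mu_b^{l} - \mu_j^{l})^{2}}, \quad x \in X, \\
\beta_b &= \sqrt{\delta_b^{l} \sum_{i \in I,\ i \neq b} \frac{\beta_i^{2}}{\delta_i^{l}}}, \\
N_i^{l+1} &= \frac{\beta_i}{\sum_{k \in I} \beta_k}
             \Bigl( \sum_{k \in I} N_k^{l} + \Delta \Bigr), \quad i \in I.
\end{aligned}
```

Under this form, the coefficient of $\theta_x$ grows as its mean approaches that of $\theta_b$ and as its variance grows, which matches the qualitative behaviour described above.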
The difference between $N_i^{l+1}$ and $N_i^l$ then determines the computing resource actually available to each decision scheme $\theta_i$, $i \in I$, in round l+1; that is, the sub Monte Carlo tree $SMCT_i$ of each decision scheme runs that number of Monte Carlo tree search iterations, after which the total computing resource of each decision scheme following the round l+1 allocation is updated.
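A possible realization of this allocation step is sketched below using the textbook OCBA ratios, which are assumed rather than confirmed to match the formulas of this method. The names `ocba_targets` and `extra_budget`, the `eps` floor that avoids division by zero, the rounding scheme, and the convention that a scheme whose target is below its current total simply receives no new iterations are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def ocba_targets(means: Sequence[float], variances: Sequence[float],
                 total_budget: int, eps: float = 1e-12) -> List[int]:
    """Split total_budget into per-scheme target totals with textbook OCBA ratios."""
    n = len(means)
    if n == 1:
        return [total_budget]
    b = max(range(n), key=means.__getitem__)      # scheme with the best mean so far
    others = [i for i in range(n) if i != b]
    j = others[0]                                 # reference non-optimal scheme
    var = [max(v, eps) for v in variances]        # floor keeps the ratios finite
    gap2 = [max((means[b] - means[i]) ** 2, eps) for i in range(n)]
    beta = [0.0] * n
    for i in others:
        # Ratio relative to scheme j, which is assigned one unit (beta[j] == 1).
        beta[i] = (var[i] / gap2[i]) / (var[j] / gap2[j])
    beta[b] = math.sqrt(var[b] * sum(beta[i] ** 2 / var[i] for i in others))
    scale = total_budget / sum(beta)
    targets = [math.floor(x * scale) for x in beta]
    targets[b] += total_budget - sum(targets)     # rounding leftover goes to the best
    return targets


def extra_budget(targets: Sequence[int], used: Sequence[int]) -> List[int]:
    """New iterations per scheme: the positive part of target minus budget already used."""
    return [max(0, t - u) for t, u in zip(targets, used)]


targets = ocba_targets([0.5, 0.7, 0.4], [0.25, 0.25, 0.25], total_budget=100)
extra = extra_budget(targets, used=[30, 50, 10])
```

With equal variances, the scheme whose mean is closest to the best one receives a larger share than the scheme that is clearly worse, because it is the harder one to tell apart from the best.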
After the above is executed, control returns to the judgment in step 3. If the result is no, this step is executed again; once the result is yes, control passes to step 5.
5. Terminate the Monte Carlo tree search and select the action corresponding to the decision scheme with the best average performance.
After the Monte Carlo tree search ends, the action corresponding to a decision scheme is selected according to average performance.
The above scheme of the embodiment acts only on the selection strategy for the child nodes of the root node of the Monte Carlo tree: the optimal resource allocation algorithm distributes the simulation computing resource among the Monte Carlo subtrees corresponding to the child nodes, while the search method used inside each subtree, such as the tree policy, remains unchanged. The method can therefore be combined easily with existing Monte Carlo tree search methods, and it can improve the decision performance of Monte Carlo tree search when computing resources are limited. The method is applicable to Monte Carlo tree search methods of all concrete forms and has a wide range of applications.
For ease of understanding, an example is described below.
The above scheme is applicable to Monte Carlo tree search methods of all concrete forms. This example studies the move placement problem in Othello (Reversi), and the concrete form of Monte Carlo tree search is Upper Confidence bounds applied to Trees (UCT). The root node $R_0$ in UCT is the board state awaiting a move, the action space is the set of all positions where a disc can be placed in the current board state, each action corresponds to one placement position, and there are n placement actions in total.
Each child node of the root node is the new board state reached after executing action $a_i$. A UCT search is run with each child node as a new root, which produces a new Monte Carlo tree $SMCT_i$, namely a subtree of the Monte Carlo tree rooted at $R_0$.
In an Othello game, if the simulation after a move ends in a win for the side that made the move, the return of that move is recorded as 1; if it ends in a loss, the return is 0; otherwise, the return is 0.5.
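The return rule of the example can be written as a small function. The name `othello_return` and the use of final disc counts to decide the winner are assumptions made for illustration; only the values 1, 0 and 0.5 come from the text.

```python
def othello_return(mover_discs: int, opponent_discs: int) -> float:
    """Return of a simulated game from the point of view of the side that moved."""
    if mover_discs > opponent_discs:
        return 1.0   # win for the side that made the move
    if mover_discs < opponent_discs:
        return 0.0   # loss
    return 0.5       # any other outcome, i.e. a draw


win = othello_return(40, 24)
loss = othello_return(20, 44)
draw = othello_return(32, 32)
```

Because the returns lie in [0, 1], their mean can be read as an estimated score rate for the move.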
The mean value and variance of all simulation results of each child node by as the input of optimal resource allocation algorithm with
Carry out the calculating of each decision scheme computing resource of next round.
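Since returns arrive one simulation at a time, the mean and variance of each child node can be maintained incrementally instead of being recomputed from the full history. The sketch below uses Welford's online update; the class name `RunningStats` and the choice of the population variance (dividing by the number of samples) are assumptions, not details from the text.

```python
class RunningStats:
    """Incrementally tracked mean and population variance of simulation returns."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x: float) -> None:
        """Fold one new return into the statistics (Welford's update)."""
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self._m2 += d * (x - self.mean)

    @property
    def variance(self) -> float:
        """Population variance; defined as 0.0 before any sample is seen."""
        return self._m2 / self.n if self.n else 0.0


stats = RunningStats()
for r in [1.0, 0.0, 0.5, 1.0]:   # returns of four simulated games
    stats.add(r)
```

Keeping only three numbers per child node means the statistics cost constant memory regardless of how many simulations a sub-tree has run.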
In this example, the computing resource is the number of UCT search iterations, or simulations, performed on the sub Monte Carlo tree $SMCT_i$. Fig. 2 shows the MCTS search process based on the optimal resource allocation algorithm, and Fig. 3 shows the iterative Monte Carlo tree search process performed on a child node.
When the whole method has finished, it returns the best move for the current board state of the game.
From the above description of the embodiments, a person skilled in the art will clearly understand that the embodiments can be implemented in software, or in software together with a necessary general-purpose hardware platform. On this understanding, the technical solutions of the embodiments can be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive or removable hard disk) and which includes instructions that cause a computer device (such as a personal computer, server or network device) to execute the methods described in the embodiments of the present invention.
The above is only a preferred embodiment of the present invention, but the protection scope of the invention is not limited to it. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the invention shall fall within the protection scope of the invention. The protection scope of the invention shall therefore be determined by the protection scope of the claims.
Claims (4)
1. A Monte Carlo tree search method based on an optimal resource allocation algorithm, characterized by comprising:
taking the initial state of the problem to be decided as the root node $R_0$ of the Monte Carlo tree; assuming the corresponding action space contains n actions, the root node $R_0$ has n child nodes, each of which serves both as the root node of a sub Monte Carlo tree and as a decision scheme of the optimal resource allocation algorithm;
allocating an initial computing resource to each decision scheme, running on the sub Monte Carlo tree of each decision scheme the number of Monte Carlo tree search iterations that corresponds to its allocation, and recording the return of each iteration;
judging whether the total computing resource used by all decision schemes after round l, $\sum_{i=1}^{n} N_i^l$, is not less than the maximum available computing resource T, where $N_i^l$ denotes the total computing resource of decision scheme $\theta_i$ after computing resources have been allocated in round l;
if not, adding computing resource Δ, using the optimal resource allocation algorithm to determine from the historical returns of each decision scheme the computing resource actually available to each decision scheme in round l+1, and performing the same iterative computation as in the preceding step;
if so, terminating the Monte Carlo tree search and selecting the action corresponding to the decision scheme with the best average performance.
2. The Monte Carlo tree search method based on an optimal resource allocation algorithm according to claim 1, characterized in that executing each of the n actions leads to n new states, which form the n child nodes of the root node $R_0$;
taking each child node as the root node of a sub Monte Carlo tree gives n mutually independent sub Monte Carlo trees $SMCT_i$, and each child node serves as a decision scheme $\theta_i$ of the optimal resource allocation algorithm.
3. The Monte Carlo tree search method based on an optimal resource allocation algorithm according to claim 1, characterized in that:
initially, an initial computing resource is allocated to each decision scheme, that is, $N_i^0 = N_0$;
Monte Carlo tree search iterations amounting to computing resource $N_0$ are performed on the sub Monte Carlo tree $SMCT_i$ of each decision scheme, and the return of each iteration is recorded.
4. The Monte Carlo tree search method based on an optimal resource allocation algorithm according to claim 1, characterized in that using the optimal resource allocation algorithm to determine, from the historical returns of each decision scheme, the total computing resource $N_i^{l+1}$ of each decision scheme over rounds 1 to l+1, and thereby the computing resource actually available to each decision scheme in round l+1, comprises:
using the optimal resource allocation algorithm to distribute, according to the mean and variance of the historical returns of each decision scheme, the total available computing resource of quantity $\sum_{i=1}^{n} N_i^l + \Delta$ among the decision schemes, the computing resource obtained by decision scheme $\theta_i$ being $N_i^{l+1}$;
denoting an arbitrary non-optimal decision scheme by $\theta_j$, the optimal decision scheme by $\theta_b$ and the other decision schemes by $\theta_x$, then for all $i \in I = \{1, 2, \ldots, n\}$, $j, b \in I$, $j \neq b$, $x \in X = I - \{j, b\}$, the following formulas hold:
where:
in the above formulas, $N_j^{l+1}$, $N_b^{l+1}$ and $N_x^{l+1}$ denote the computing resource of the non-optimal decision scheme $\theta_j$, the optimal decision scheme $\theta_b$ and the other decision schemes $\theta_x$ in round l+1; N denotes the total computing resource of the decision scheme given by its subscript after resources have been allocated in the round given by its superscript; $\mu_k(\theta_i)$ denotes the return of decision scheme $\theta_i$ in its k-th computation; b is the label of the decision scheme with the highest average historical return after l rounds of search iterations; μ denotes the mean of the historical returns and δ their variance, with the superscript l giving the round number and the subscript the label of the decision scheme concerned; the two remaining symbols are intermediate parameters;
the difference between $N_i^{l+1}$ and $N_i^l$ determines the computing resource actually available to each decision scheme $\theta_i$, $i \in I$, in round l+1; that is, the sub Monte Carlo tree $SMCT_i$ of each decision scheme runs that number of Monte Carlo tree search iterations, after which the total computing resource of each decision scheme following the round l+1 allocation is updated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810593129.6A CN108809713B (en) | 2018-06-08 | 2018-06-08 | Monte Carlo tree searching method based on optimal resource allocation algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810593129.6A CN108809713B (en) | 2018-06-08 | 2018-06-08 | Monte Carlo tree searching method based on optimal resource allocation algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108809713A true CN108809713A (en) | 2018-11-13 |
CN108809713B CN108809713B (en) | 2020-12-25 |
Family
ID=64088186
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810593129.6A Active CN108809713B (en) | 2018-06-08 | 2018-06-08 | Monte Carlo tree searching method based on optimal resource allocation algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108809713B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109859532A (en) * | 2019-02-28 | 2019-06-07 | 深圳市北斗智能科技有限公司 | A kind of the break indices method and relevant apparatus of multi-constraint condition |
CN110209770A (en) * | 2019-06-03 | 2019-09-06 | 北京邮电大学 | A kind of name entity recognition method based on policy value network and tree search enhancing |
CN110427261A (en) * | 2019-08-12 | 2019-11-08 | 电子科技大学 | A kind of edge calculations method for allocating tasks based on the search of depth Monte Carlo tree |
CN112202514A (en) * | 2020-10-09 | 2021-01-08 | 中国人民解放军国防科技大学 | Broadband spectrum sensing method based on reinforcement learning |
CN112700005A (en) * | 2020-12-28 | 2021-04-23 | 北京环境特性研究所 | Abnormal event processing method and device based on Monte Carlo tree search |
CN112734312A (en) * | 2021-03-31 | 2021-04-30 | 平安科技(深圳)有限公司 | Method for outputting reference data and computer equipment |
CN113935618A (en) * | 2021-10-12 | 2022-01-14 | 网易有道信息技术(江苏)有限公司 | Evaluation method and device for chess playing capability, electronic equipment and storage medium |
CN114492910A (en) * | 2021-11-03 | 2022-05-13 | 北京科技大学 | Resource load prediction method for multi-model small-batch production line |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130204412A1 (en) * | 2012-02-02 | 2013-08-08 | International Business Machines Corporation | Optimal policy determination using repeated stackelberg games with unknown player preferences |
CN104135769A (en) * | 2014-07-01 | 2014-11-05 | 宁波大学 | Method of OFDMA (Orthogonal Frequency Division Multiple Access) ergodic capacity maximized resource allocation under incomplete channel state information |
CN105727550A (en) * | 2016-01-27 | 2016-07-06 | 安徽大学 | Dot and box chess game system based on UCT algorithm |
WO2016123213A1 (en) * | 2015-01-30 | 2016-08-04 | Alcatel-Lucent Usa Inc. | Frequency resource and/or modulation and coding scheme indicator for machine type communication device |
-
2018
- 2018-06-08 CN CN201810593129.6A patent/CN108809713B/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130204412A1 (en) * | 2012-02-02 | 2013-08-08 | International Business Machines Corporation | Optimal policy determination using repeated stackelberg games with unknown player preferences |
CN104135769A (en) * | 2014-07-01 | 2014-11-05 | 宁波大学 | Method of OFDMA (Orthogonal Frequency Division Multiple Access) ergodic capacity maximized resource allocation under incomplete channel state information |
WO2016123213A1 (en) * | 2015-01-30 | 2016-08-04 | Alcatel-Lucent Usa Inc. | Frequency resource and/or modulation and coding scheme indicator for machine type communication device |
CN105727550A (en) * | 2016-01-27 | 2016-07-06 | 安徽大学 | Dot and box chess game system based on UCT algorithm |
Non-Patent Citations (5)
Title |
---|
GAO LIN等: "Research on Resource Allocation Evaluation of Collaborative Product Developmet for Cloud Manufacuturing", 《THE 2015 INTERNATIONAL COFERENCE ON ADVANCES IN CONSTRUCTION MACHINERY AND VEHICLE ENGINEERING》 * |
QUN MENG等: "Enhancing pattern search for global optimization with an additive global and local Gaussian Process Model", 《2017 WINTER SIMULATION COFERENCE》 * |
YUNCHUAN LI等: "Monte Carlo tree search with optimal computing budget allocation", 《2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL》 * |
刘洋: "点格棋博弈中UCT算法的研究与实现", 《中国优秀硕士学位论文全文数据库》 * |
朱怡桦: "运用OCBA法改善求解随机性专案网路最佳化资源分配问题之研究", 《HTTPS://ETD.LIB.NCTU.EDU.TW/CGI-BIN/GS32/TUGSWEB.CGI?O=DNCTUCDR&S=ID=%22GT079832534%22.&SEARCHMODE=BASIC》 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109859532A (en) * | 2019-02-28 | 2019-06-07 | 深圳市北斗智能科技有限公司 | A kind of the break indices method and relevant apparatus of multi-constraint condition |
CN110209770B (en) * | 2019-06-03 | 2022-04-15 | 北京邮电大学 | Named entity identification method based on strategy value network and tree search enhancement |
CN110209770A (en) * | 2019-06-03 | 2019-09-06 | 北京邮电大学 | A kind of name entity recognition method based on policy value network and tree search enhancing |
CN110427261A (en) * | 2019-08-12 | 2019-11-08 | 电子科技大学 | A kind of edge calculations method for allocating tasks based on the search of depth Monte Carlo tree |
CN112202514A (en) * | 2020-10-09 | 2021-01-08 | 中国人民解放军国防科技大学 | Broadband spectrum sensing method based on reinforcement learning |
CN112202514B (en) * | 2020-10-09 | 2022-11-08 | 中国人民解放军国防科技大学 | Broadband spectrum sensing method based on reinforcement learning |
CN112700005A (en) * | 2020-12-28 | 2021-04-23 | 北京环境特性研究所 | Abnormal event processing method and device based on Monte Carlo tree search |
CN112700005B (en) * | 2020-12-28 | 2024-02-23 | 北京环境特性研究所 | Abnormal event processing method and device based on Monte Carlo tree search |
CN112734312B (en) * | 2021-03-31 | 2021-07-09 | 平安科技(深圳)有限公司 | Method for outputting reference data and computer equipment |
CN112734312A (en) * | 2021-03-31 | 2021-04-30 | 平安科技(深圳)有限公司 | Method for outputting reference data and computer equipment |
CN113935618A (en) * | 2021-10-12 | 2022-01-14 | 网易有道信息技术(江苏)有限公司 | Evaluation method and device for chess playing capability, electronic equipment and storage medium |
CN114492910A (en) * | 2021-11-03 | 2022-05-13 | 北京科技大学 | Resource load prediction method for multi-model small-batch production line |
CN114492910B (en) * | 2021-11-03 | 2023-11-14 | 北京科技大学 | Resource load prediction method for multi-model small-batch production line |
Also Published As
Publication number | Publication date |
---|---|
CN108809713B (en) | 2020-12-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108809713A (en) | Monte Carlo tree searching method based on optimal resource allocation algorithm | |
Gharehchopogh et al. | A comprehensive survey on symbiotic organisms search algorithms | |
Abraham et al. | Finding numerical solutions of diophantine equations using ant colony optimization | |
Mashwani et al. | A decomposition-based hybrid multiobjective evolutionary algorithm with dynamic resource allocation | |
US7418434B2 (en) | Forward-chaining inferencing | |
US6550053B1 (en) | Time estimator for object oriented software development | |
CN110138612A (en) | Cloud software service resource allocation method based on QoS model self-correction |
CN113671987B (en) | Multi-machine distributed time sequence task allocation method based on non-deadlock contract net algorithm | |
Kalra et al. | Multi‐criteria workflow scheduling on clouds under deadline and budget constraints | |
Garg et al. | Multi-objective workflow grid scheduling based on discrete particle swarm optimization | |
CN105446742A (en) | Optimization method for artificial intelligence performing task | |
Kumar et al. | Lagrangian relaxation techniques for scalable spatial conservation planning | |
CN111127233A (en) | User check value calculation method in undirected authorized graph of social network | |
CN109831343B (en) | Peer-to-peer network cooperation promotion method and system based on past strategy | |
Wang et al. | Regional multi-armed bandits with partial informativeness | |
CN111078380A (en) | Multi-target task scheduling method and system | |
De Rigo et al. | Continental-scale living forest biomass and carbon stock: a robust fuzzy ensemble of IPCC Tier 1 maps for Europe | |
CN104392317A (en) | Project scheduling method based on genetic culture gene algorithm | |
Banati et al. | Modeling evolutionary group search optimization approach for community detection in social networks | |
AlBaity et al. | On extending quantum behaved particle swarm optimization to multiobjective context | |
Zheng et al. | A priority-based level heuristic approach for scheduling dag applications with uncertainties | |
CN108415774A (en) | Hardware/software partitioning method based on an improved fireworks algorithm |
Mirshahvalad et al. | Dynamics of interacting information waves in networks | |
Tomášek et al. | Using one-sided partially observable stochastic games for solving zero-sum security games with sequential attacks | |
CN110162400B (en) | Method and system for realizing cooperation of intelligent agents in MAS system in complex network environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||