CN108809713A - Monte Carlo tree searching method based on optimal resource allocation algorithm - Google Patents

Publication number
CN108809713A
Authority
CN
China
Prior art keywords
monte carlo
decision scheme
decision
carlo tree
resource allocation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810593129.6A
Other languages
Chinese (zh)
Other versions
CN108809713B (en)
Inventor
陈子豪
李斌
李厚强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN201810593129.6A
Publication of CN108809713A
Application granted
Publication of CN108809713B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0823 Configuration setting characterised by the purposes of a change of settings, e.g. optimising configuration for enhancing reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a Monte Carlo tree search method based on an optimal resource allocation algorithm. Only the selection strategy for the child nodes of the root node of the Monte Carlo tree is adjusted: the optimal resource allocation algorithm distributes the simulation computing resources among the Monte Carlo subtrees corresponding to the child nodes, while the search method within each subtree, such as the tree policy, remains unchanged. The method can therefore be combined conveniently with existing Monte Carlo tree search methods, and it also improves the decision performance of Monte Carlo tree search when computing resources are limited. The method is applicable to all concrete forms of Monte Carlo tree search and has a wide range of applications.

Description

Monte Carlo tree searching method based on optimal resource allocation algorithm
Technical Field
The present invention relates to the technical field of game playing, and in particular to a Monte Carlo tree search method based on an optimal resource allocation algorithm.
Background
A Markov decision process (MDP) models a sequential decision problem in a known environment with the four-tuple {state set, action set, transition model, reward function}. A complete decision process can be described by a sequence of {state, action} pairs, in which each next state s′ is drawn from a probability distribution that depends on the current state s and the selected action a. A policy in an MDP is a mapping from the state space to the action space, that is, a rule for choosing a specific action in each state. The goal of an MDP is to find the policy with the highest expected return. When the number of states in the environment is too large or hard to determine, policies cannot be evaluated effectively. One effective remedy is to use Monte Carlo tree search (MCTS) to estimate the value function of each {state, action} pair as a substitute for policy evaluation.
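The sampling idea in this paragraph can be illustrated with a very small sketch. The two-action toy MDP below, the names `TRANSITIONS`, `step` and `estimate_action_value`, and the reward values are all hypothetical and chosen only for illustration; they do not come from the patent.

```python
import random

# Hypothetical toy MDP, for illustration only.
# TRANSITIONS[(state, action)] -> list of (probability, next_state, reward)
TRANSITIONS = {
    ("s0", "left"): [(1.0, "end", 0.0)],
    ("s0", "right"): [(0.5, "end", 1.0), (0.5, "end", 0.0)],
}


def step(state, action, rng):
    """Sample (next_state, reward) from the distribution attached to (state, action)."""
    outcomes = TRANSITIONS[(state, action)]
    draw = rng.random()
    cumulative = 0.0
    for prob, next_state, reward in outcomes:
        cumulative += prob
        if draw < cumulative:
            return next_state, reward
    # Guard against floating-point round-off: fall back to the last outcome.
    return outcomes[-1][1], outcomes[-1][2]


def estimate_action_value(state, action, num_samples, rng):
    """Monte Carlo estimate of the expected one-step reward of (state, action)."""
    total = 0.0
    for _ in range(num_samples):
        _, reward = step(state, action, rng)
        total += reward
    return total / num_samples
```

With enough samples the estimate for `("s0", "right")` settles near 0.5, which is the kind of sampled value estimate that MCTS builds on in place of exhaustive policy evaluation.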
Monte Carlo tree search is a method that builds a search tree by random sampling in the decision space and uses the results to find the best decision in a given domain. It has had a far-reaching influence on artificial intelligence (AI). In theory, MCTS can be applied to any domain that can be described by {state, action} pairs and in which outcomes can be predicted by simulation. Owing to the great success of MCTS in the game of Go and its potential application to many other problems, research interest in MCTS has risen sharply.
The origins of MCTS can be traced back to 1928, when John von Neumann proposed the minimax theorem, which paved the way for adversarial tree search methods. Monte Carlo methods were then formalized in the 1940s as a way of using random sampling to handle problems that are clearly defined but poorly suited to tree search. Finally, in 2006 Rémi Coulom combined the two approaches and proposed MCTS to make move-planning decisions in Go.
Since then, MCTS has been studied extensively and many variants have appeared, such as Upper Confidence bounds applied to Trees (UCT), single-player and multi-player MCTS, and real-time MCTS. The tree policy and other aspects of MCTS have also been improved and enhanced. However, all Monte Carlo based methods share one trait: they need a large number of simulations to gather statistics about the problem at hand. When computing resources are scarce, even for a problem of moderate complexity, some critical state nodes or action edges may never be visited during the Monte Carlo tree search, so MCTS performs poorly with a small computing budget.
Summary of the Invention
The object of the present invention is to provide a Monte Carlo tree search method based on an optimal resource allocation algorithm that substantially improves the performance of Monte Carlo tree search when computing resources are limited.
The object of the present invention is achieved through the following technical solution:
A Monte Carlo tree search method based on an optimal resource allocation algorithm, comprising:
taking the initial state of the problem to be decided as the root node $R_0$ of a Monte Carlo tree, where, assuming the corresponding action space contains n actions, the root node $R_0$ has n child nodes, and each child node serves as the root node of a sub-Monte Carlo tree and as a decision scheme of the optimal resource allocation algorithm;
allocating initial computing resources to each decision scheme, performing on the sub-Monte Carlo tree corresponding to each decision scheme the corresponding number of Monte Carlo tree search iterations, and recording the reward of each iteration;
determining whether, after round l, the total computing resource used by all decision schemes, $\sum_{i=1}^{n} N_i^{l}$, is not less than the maximum available computing resource T, where $N_i^{l}$ denotes the total computing resource of decision scheme $\theta_i$ after computing resources have been allocated in round l;
if not, increasing the computing resource by Δ, using the optimal resource allocation algorithm to determine, from the historical rewards of each decision scheme, the amount of computing resources actually available to each decision scheme in the round l+1 computation, and performing the same iterative computation as in the preceding step;
if so, ending the Monte Carlo tree search process and selecting the action corresponding to the decision scheme with the best average performance.
As the above technical solution shows, only the selection strategy for the child nodes of the root node of the Monte Carlo tree is adjusted: the optimal resource allocation algorithm distributes the simulation computing resources among the Monte Carlo subtrees corresponding to the child nodes, while the search method within each subtree, such as the tree policy, remains unchanged. The method of the present invention can therefore be combined conveniently with existing Monte Carlo tree search methods, and it also improves the decision performance of Monte Carlo tree search when computing resources are limited. The method is applicable to all concrete forms of Monte Carlo tree search and has a wide range of applications.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a Monte Carlo tree search method based on an optimal resource allocation algorithm according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the Monte Carlo tree search based on the optimal resource allocation algorithm according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the Monte Carlo tree search process performed on a child node according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a Monte Carlo tree search method based on the optimal resource allocation (Optimal Computing Budget Allocation, OCBA) algorithm. It addresses the poor decision performance of Monte Carlo tree search when computing resources are limited. The method treats each child node of the root of the Monte Carlo tree as a decision scheme and uses the optimal resource allocation algorithm to distribute computing resources more appropriately among the decision schemes for Monte Carlo tree search, so that the performance of Monte Carlo tree search improves substantially when computing resources are limited.
The main flow of the method is shown in Fig. 1 and consists of the following parts:
1. The initial state of the problem to be decided is taken as the root node $R_0$ of the Monte Carlo tree. Assuming the corresponding action space contains n actions, the root node $R_0$ has n child nodes; each child node serves as the root node of a sub-Monte Carlo tree and as a decision scheme of the optimal resource allocation algorithm.
In this embodiment, the corresponding action space is assumed to contain n actions. Executing each of the n actions leads to one of n new states, which form the n child nodes of the root node $R_0$. Taking each child node as the root node of a sub-Monte Carlo tree gives n mutually independent sub-Monte Carlo trees $SMCT_i$, and each child node serves as a decision scheme $\theta_i$ of the optimal resource allocation algorithm.
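Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the `DecisionScheme` container and the `legal_actions` and `apply_action` callables are hypothetical names standing in for whatever game or problem interface is available; the patent does not define them.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class DecisionScheme:
    """One child of the root: an action, the state it leads to, and its reward history."""
    action: Any
    root_state: Any  # root of the independent sub-Monte Carlo tree SMCT_i
    rewards: List[float] = field(default_factory=list)


def build_decision_schemes(
    initial_state: Any,
    legal_actions: Callable[[Any], List[Any]],
    apply_action: Callable[[Any, Any], Any],
) -> List[DecisionScheme]:
    """Create one decision scheme per action available in the initial state."""
    schemes = []
    for action in legal_actions(initial_state):
        # Executing the action yields the new state that roots the sub-tree.
        schemes.append(DecisionScheme(action, apply_action(initial_state, action)))
    return schemes
```

Each scheme keeps its own reward list so that the mean and variance needed later can be computed per child node.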
2. Initial computing resources are allocated to each decision scheme; the sub-Monte Carlo tree corresponding to each decision scheme is then searched for the corresponding number of Monte Carlo tree search iterations, and the reward of each iteration is recorded.
In this embodiment, initially, that is, when l = 0, initial computing resources are allocated to each decision scheme, and the sub-Monte Carlo tree of each decision scheme is searched with the corresponding amount of computing resources.
For ease of understanding, this embodiment treats the computing resource as the number of Monte Carlo tree search iterations. Let l = 0 and $N_i^{0} = N_0$; $N_0$ Monte Carlo tree search iterations are performed on the sub-Monte Carlo tree $SMCT_i$ of each decision scheme, and the reward of each iteration is recorded.
In other settings, the computing resource can also be understood as computation time, storage space, and so on.
3. Determine whether, after round l, the total computing resource used by all decision schemes, $\sum_{i=1}^{n} N_i^{l}$, is not less than the maximum available computing resource T.
In this embodiment, $N_i^{l}$ denotes the total computing resource of decision scheme $\theta_i$ after computing resources have been allocated in round l, that is, the sum of the computing resources the scheme used in round l and in every earlier round.
4. The computing resource is increased by Δ. Based on the historical rewards of each decision scheme, the optimal resource allocation algorithm determines the total amount of computing resources of each decision scheme over rounds 1 to l+1, from which the amount actually available to each decision scheme in round l+1 is determined; the same Monte Carlo tree search iterations as in step 2 are then performed.
In this embodiment, the optimal resource allocation algorithm uses the mean and variance of each decision scheme's historical rewards to distribute the total available computing resource, $\sum_{i=1}^{n} N_i^{l} + \Delta$, among the decision schemes. The amount assigned to decision scheme $\theta_i$ for round l+1 is $N_i^{l+1}$, and the difference between $N_i^{l+1}$ and $N_i^{l}$ determines the computing resources actually available to each scheme in the round l+1 simulation.
Specifically, for all decision schemes $\theta_i$, i ∈ I = {1, 2, ..., n}, denote any one non-optimal decision scheme by $\theta_j$, the optimal decision scheme by $\theta_b$, and the other decision schemes by $\theta_x$, x ∈ X, where j, b ∈ I, j ≠ b, and X = I − {j, b}. The symbols j, b and x likewise label the properties of the non-optimal decision scheme, the optimal decision scheme and the other decision schemes, respectively. For example, if j = 1 and b = 2, then x ∈ X = {3, 4, 5, ..., n}.
Then, for all i ∈ I = {1, 2, ..., n}, j, b ∈ I, j ≠ b, and x ∈ X = I − {j, b}, the following formulas hold:
where:
In the above formulas, $N_j^{l+1}$, $N_b^{l+1}$ and $N_x^{l+1}$ denote the amounts of computing resources of the non-optimal decision scheme $\theta_j$, the optimal decision scheme $\theta_b$ and the other decision schemes $\theta_x$ in round l+1; N denotes the total computing resource of the decision scheme indicated by its subscript after resources have been allocated in the round indicated by its superscript; $\mu_k(\theta_i)$ denotes the reward of decision scheme $\theta_i$ in its k-th computation; $b = \arg\max_{i} \mu_i^{l}$ is the label of the decision scheme with the highest average historical reward after l rounds of search iterations; μ denotes the mean of the historical rewards and δ their variance, with the superscript l giving the round number and the subscript labelling the decision scheme, so that, for example, $\mu_i^{l}$ and $\delta_i^{l}$ are the mean and variance of the historical rewards of decision scheme $\theta_i$ over rounds 1 to l. The formulas also use two intermediate parameters, which can be regarded as the proportionality coefficients of the total computing resource obtained in round l by the other schemes $\theta_x$ and by the optimal scheme $\theta_b$, respectively, relative to the selected non-optimal decision scheme $\theta_j$, whose computing resource is taken as one unit. From the formula for the coefficient of $\theta_x$ it can be seen that the larger the historical reward mean of another decision scheme $\theta_x$ (the better its performance) and the larger its variance (the more uncertain its performance, so that more computations are needed to establish its true performance), the larger the coefficient, and the more computation is allocated to $\theta_x$.
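The formulas this passage refers to are not reproduced in the text. For orientation only, the allocation rule usually given for OCBA in the simulation-optimization literature is sketched below in the notation of this section. It is an assumption that the patent's own formulas have this shape, and they may differ in detail. Here $\sigma_i^{l}$ stands for the standard deviation of the historical rewards of $\theta_i$; since the text describes δ as a variance, $\sigma_i^{l} = \sqrt{\delta_i^{l}}$ under that reading.

```latex
\[
\begin{aligned}
\frac{N_x^{l+1}}{N_j^{l+1}}
  &= \left(
       \frac{\sigma_x^{l} / (\mu_b^{l} - \mu_x^{l})}
            {\sigma_j^{l} / (\mu_b^{l} - \mu_j^{l})}
     \right)^{2},
  \qquad x \in X, \\[4pt]
N_b^{l+1}
  &= \sigma_b^{l}
     \sqrt{\sum_{i \in I,\; i \neq b}
           \frac{\bigl(N_i^{l+1}\bigr)^{2}}{\bigl(\sigma_i^{l}\bigr)^{2}}}, \\[4pt]
\sum_{i \in I} N_i^{l+1}
  &= \sum_{i \in I} N_i^{l} + \Delta .
\end{aligned}
\]
```

In this form a non-optimal scheme receives more budget when its mean is closer to that of the best scheme or when its rewards are noisier, which agrees with the qualitative remark above about larger means and larger variances.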
The amount of computing resources actually available to each decision scheme $\theta_i$, i ∈ I, in the round l+1 computation is determined from the difference between $N_i^{l+1}$ and $N_i^{l}$. That is, the sub-Monte Carlo tree $SMCT_i$ of each decision scheme is searched for that number of Monte Carlo tree search iterations, after which the total computing resource of each decision scheme following the round l+1 allocation is updated accordingly.
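Step 4 can be sketched in code under clearly stated assumptions. The function `ocba_targets` below implements the textbook OCBA ratio rule (non-best schemes weighted by the squared ratio of standard deviation to gap from the best mean), not necessarily the patent's exact formulas, and `additional_iterations` assumes the common convention of running `max(0, target - current)` extra iterations. Both function names and the `eps` guard against zero gaps or zero deviations are hypothetical additions.

```python
import math
from typing import List, Sequence


def ocba_targets(means: Sequence[float], stds: Sequence[float],
                 total_budget: float, eps: float = 1e-9) -> List[float]:
    """Split total_budget among schemes with the textbook OCBA ratio rule."""
    n = len(means)
    if n == 1:
        return [float(total_budget)]
    best = max(range(n), key=lambda i: means[i])
    weights = [0.0] * n
    for i in range(n):
        if i == best:
            continue
        gap = max(means[best] - means[i], eps)   # assumed guard for ties
        sigma = max(stds[i], eps)                # assumed guard for zero spread
        weights[i] = (sigma / gap) ** 2
    sigma_best = max(stds[best], eps)
    # Best scheme: sigma_b * sqrt(sum over others of (N_i / sigma_i)^2).
    weights[best] = sigma_best * math.sqrt(
        sum((weights[i] / max(stds[i], eps)) ** 2 for i in range(n) if i != best)
    )
    scale = total_budget / sum(weights)
    return [w * scale for w in weights]


def additional_iterations(targets: Sequence[float],
                          current: Sequence[int]) -> List[int]:
    """Extra iterations per scheme for the next round; never negative."""
    return [max(0, round(t - c)) for t, c in zip(targets, current)]
```

Because the increments are rounded and clipped at zero, their sum need not equal Δ exactly; a real implementation would have to decide how to handle that remainder.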
After the above step is executed, the method returns to step 3 for the check. If the result is no, this step is executed again; once the result is yes, the method proceeds to step 5.
5. The Monte Carlo tree search process ends, and the action corresponding to the decision scheme with the best average performance is selected.
After the Monte Carlo tree search ends, the action corresponding to a decision scheme is selected according to average performance.
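The control flow of steps 2 to 5 can be sketched as a single loop. This is an illustrative outline, not the patented implementation: `simulate(i)` is a hypothetical callable that runs one search iteration on sub-tree i and returns its reward, `allocate` is a hypothetical callable that maps reward histories and a total budget to per-scheme totals, and `equal_split` is only a placeholder allocation rule so that the sketch runs on its own. The safeguard that forces at least one extra iteration per round is an added assumption to guarantee termination.

```python
from statistics import mean


def equal_split(rewards, total_budget):
    """Placeholder allocation rule: split the budget equally among all schemes."""
    return [total_budget / len(rewards)] * len(rewards)


def root_budget_search(num_schemes, simulate, allocate, n0, delta, max_budget):
    """Return (index of best scheme, reward histories). Requires n0 >= 1."""
    # Step 2: spend the initial budget N_0 on every decision scheme.
    rewards = [[simulate(i) for _ in range(n0)] for i in range(num_schemes)]
    # Step 3: stop once the total budget used reaches the maximum T.
    while sum(len(r) for r in rewards) < max_budget:
        used = sum(len(r) for r in rewards)
        # Step 4: raise the total budget by delta and get new per-scheme totals.
        targets = allocate(rewards, used + delta)
        extra = [max(0, round(t - len(r))) for t, r in zip(targets, rewards)]
        if sum(extra) == 0:
            # Assumed safeguard (not in the source): always make progress.
            neediest = max(range(num_schemes),
                           key=lambda i: targets[i] - len(rewards[i]))
            extra[neediest] = 1
        for i, count in enumerate(extra):
            rewards[i].extend(simulate(i) for _ in range(count))
    # Step 5: pick the scheme with the best average reward.
    best = max(range(num_schemes), key=lambda i: mean(rewards[i]))
    return best, rewards
```

Passing an OCBA-style rule as `allocate` instead of `equal_split` gives the behaviour the method describes; the loop itself does not depend on which rule is used.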
The above scheme of the embodiment of the present invention acts only on the selection strategy for the child nodes of the root node of the Monte Carlo tree: the optimal resource allocation algorithm distributes the simulation computing resources among the Monte Carlo subtrees corresponding to the child nodes, while the search method within each subtree, such as the tree policy, remains unchanged. The method of the present invention can therefore be combined conveniently with existing Monte Carlo tree search methods, and it also improves the decision performance of Monte Carlo tree search when computing resources are limited. The method is applicable to all concrete forms of Monte Carlo tree search and has a wide range of applications.
For ease of understanding, an example is described below.
The above scheme of the embodiment of the present invention is applicable to all concrete forms of Monte Carlo tree search. This example studies the move-selection problem in Othello (Reversi), and the concrete form of Monte Carlo tree search is Upper Confidence bounds applied to Trees (UCT). The root node $R_0$ in UCT is the board state in which a move is to be made, the action space is the set of all positions where a piece can be placed in the current board state, each action corresponds to one such position, and there are n move actions in total.
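The example names UCT as the search used inside each sub-tree but does not spell out its selection rule. A commonly used form scores each child by its mean reward plus an exploration bonus, as sketched below. The function name `uct_select`, the default exploration constant 1.4, and the use of the sum of child visits as the parent visit count are assumptions made for illustration.

```python
import math
from typing import Sequence


def uct_select(child_visits: Sequence[int],
               child_total_rewards: Sequence[float],
               exploration: float = 1.4) -> int:
    """Return the index of the child with the highest UCB-style score."""
    # Assumption: the parent visit count equals the sum of child visits.
    parent_visits = sum(child_visits)
    best_index, best_score = -1, -math.inf
    for i, (visits, total) in enumerate(zip(child_visits, child_total_rewards)):
        if visits == 0:
            # Unvisited children are tried first.
            return i
        exploit = total / visits
        explore = exploration * math.sqrt(math.log(parent_visits) / visits)
        if exploit + explore > best_score:
            best_index, best_score = i, exploit + explore
    return best_index
```

Setting `exploration` to 0 reduces the rule to greedy selection by mean reward, which shows how the constant trades exploration against exploitation. In the method described here, this rule would operate only below the root's children, since the root-level choice is handled by the resource allocation step.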
Each child node of the root node is the new board state reached after executing move action $a_i$. Performing UCT search with each child node as a new root node generates a new Monte Carlo tree $SMCT_i$, that is, a subtree of the Monte Carlo tree rooted at node $R_0$ described above.
In an Othello game, if the simulation played out after a move ends in a win for the side that made the move, the reward of that move is recorded as 1; if it ends in a loss, the reward is recorded as 0; otherwise, the reward is recorded as 0.5.
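The reward rule above can be written as a small function. The sketch assumes the simulation result is available as final disc counts for the side that made the move and for its opponent, with the side holding more discs winning; the function name `simulation_reward` is hypothetical.

```python
def simulation_reward(mover_discs: int, opponent_discs: int) -> float:
    """Map the final disc counts of a simulated game to the reward of the move."""
    if mover_discs > opponent_discs:
        return 1.0   # win for the side that made the move
    if mover_discs < opponent_discs:
        return 0.0   # loss
    return 0.5       # draw
```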
The mean value and variance of all simulation results of each child node by as the input of optimal resource allocation algorithm with Carry out the calculating of each decision scheme computing resource of next round.
In this example, the computing resource is the number of UCT search iterations, or simulations, performed on the sub-Monte Carlo tree $SMCT_i$. Fig. 2 shows the MCTS search process based on the optimal resource allocation algorithm, and Fig. 3 shows the iterative process of the Monte Carlo tree search performed on a child node.
After the whole method has finished, the optimal move for the current board state in the game is returned.
From the above description of the embodiments, a person skilled in the art will clearly understand that the embodiments can be implemented in software, or in software together with a necessary general-purpose hardware platform. On this understanding, the technical solutions of the above embodiments can be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) and includes instructions that cause a computer device (such as a personal computer, a server or a network device) to execute the methods described in the embodiments of the present invention.
The above is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of the claims.

Claims (4)

1. A Monte Carlo tree search method based on an optimal resource allocation algorithm, characterized by comprising:
taking the initial state of the problem to be decided as the root node $R_0$ of a Monte Carlo tree, wherein, assuming the corresponding action space contains n actions, the root node $R_0$ has n child nodes, and each child node serves as the root node of a sub-Monte Carlo tree and as a decision scheme of the optimal resource allocation algorithm;
allocating initial computing resources to each decision scheme, performing on the sub-Monte Carlo tree corresponding to each decision scheme the corresponding number of Monte Carlo tree search iterations, and recording the reward of each iteration;
determining whether, after round l, the total computing resource used by all decision schemes, $\sum_{i=1}^{n} N_i^{l}$, is not less than the maximum available computing resource T, wherein $N_i^{l}$ denotes the total computing resource of decision scheme $\theta_i$ after computing resources have been allocated in round l;
if not, increasing the computing resource by Δ, using the optimal resource allocation algorithm to determine, from the historical rewards of each decision scheme, the amount of computing resources actually available to each decision scheme in the round l+1 computation, and performing the same iterative computation as in the preceding step;
if so, ending the Monte Carlo tree search process and selecting the action corresponding to the decision scheme with the best average performance.
2. The Monte Carlo tree search method based on an optimal resource allocation algorithm according to claim 1, characterized in that executing each of the n actions leads to one of n new states, which form the n child nodes of the root node $R_0$;
taking each child node as the root node of a sub-Monte Carlo tree gives n mutually independent sub-Monte Carlo trees $SMCT_i$, and each child node serves as a decision scheme $\theta_i$ of the optimal resource allocation algorithm.
3. The Monte Carlo tree search method based on an optimal resource allocation algorithm according to claim 1, characterized in that
initially, an initial computing resource $N_0$ is allocated to each decision scheme, that is, $N_i^{0} = N_0$;
Monte Carlo tree search iterations amounting to $N_0$ computing resources are performed on the sub-Monte Carlo tree $SMCT_i$ corresponding to each decision scheme, and the reward of each iteration is recorded.
4. The Monte Carlo tree search method based on an optimal resource allocation algorithm according to claim 1, characterized in that using the optimal resource allocation algorithm to determine, from the historical rewards of each decision scheme, the total amount of computing resources $N_i^{l+1}$ of each decision scheme over rounds 1 to l+1 for the round l+1 computation, and thereby the amount of computing resources actually available to each decision scheme in round l+1, comprises:
using the optimal resource allocation algorithm, based on the mean and variance of each decision scheme's historical rewards, to distribute the total available computing resource $\sum_{i=1}^{n} N_i^{l} + \Delta$ among the decision schemes, the amount of computing resources obtained by each decision scheme being $N_i^{l+1}$;
denoting any one non-optimal decision scheme by $\theta_j$, the optimal scheme by $\theta_b$ and the other decision schemes by $\theta_x$, then for all i ∈ I = {1, 2, ..., n}, j, b ∈ I, j ≠ b, and x ∈ X = I − {j, b}, the following formulas hold:
where:
in the above formulas, $N_j^{l+1}$, $N_b^{l+1}$ and $N_x^{l+1}$ denote the amounts of computing resources of the non-optimal decision scheme $\theta_j$, the optimal decision scheme $\theta_b$ and the other decision schemes $\theta_x$ in round l+1; N denotes the total computing resource of the decision scheme indicated by its subscript after resources have been allocated in the round indicated by its superscript; $\mu_k(\theta_i)$ denotes the reward of decision scheme $\theta_i$ in its k-th computation; $b = \arg\max_{i} \mu_i^{l}$ is the label of the decision scheme with the highest average historical reward after l rounds of search iterations; μ denotes the mean of the historical rewards and δ their variance, with the superscript l giving the round number and the subscript labelling the decision scheme; the formulas also use two intermediate parameters;
the amount of computing resources actually available to each decision scheme $\theta_i$, i ∈ I, in the round l+1 computation is determined from the difference between $N_i^{l+1}$ and $N_i^{l}$; that is, the sub-Monte Carlo tree $SMCT_i$ corresponding to each decision scheme is searched for that number of Monte Carlo tree search iterations, after which the total computing resource of each decision scheme following the round l+1 allocation is updated accordingly.
CN201810593129.6A 2018-06-08 2018-06-08 Monte Carlo tree searching method based on optimal resource allocation algorithm Active CN108809713B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810593129.6A CN108809713B (en) 2018-06-08 2018-06-08 Monte Carlo tree searching method based on optimal resource allocation algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810593129.6A CN108809713B (en) 2018-06-08 2018-06-08 Monte Carlo tree searching method based on optimal resource allocation algorithm

Publications (2)

Publication Number Publication Date
CN108809713A true CN108809713A (en) 2018-11-13
CN108809713B CN108809713B (en) 2020-12-25

Family

ID=64088186

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810593129.6A Active CN108809713B (en) 2018-06-08 2018-06-08 Monte Carlo tree searching method based on optimal resource allocation algorithm

Country Status (1)

Country Link
CN (1) CN108809713B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109859532A (en) * 2019-02-28 2019-06-07 深圳市北斗智能科技有限公司 A kind of the break indices method and relevant apparatus of multi-constraint condition
CN110209770A (en) * 2019-06-03 2019-09-06 北京邮电大学 A kind of name entity recognition method based on policy value network and tree search enhancing
CN110427261A (en) * 2019-08-12 2019-11-08 电子科技大学 A kind of edge calculations method for allocating tasks based on the search of depth Monte Carlo tree
CN112202514A (en) * 2020-10-09 2021-01-08 中国人民解放军国防科技大学 Broadband spectrum sensing method based on reinforcement learning
CN112700005A (en) * 2020-12-28 2021-04-23 北京环境特性研究所 Abnormal event processing method and device based on Monte Carlo tree search
CN112734312A (en) * 2021-03-31 2021-04-30 平安科技(深圳)有限公司 Method for outputting reference data and computer equipment
CN113935618A (en) * 2021-10-12 2022-01-14 网易有道信息技术(江苏)有限公司 Evaluation method and device for chess playing capability, electronic equipment and storage medium
CN114492910A (en) * 2021-11-03 2022-05-13 北京科技大学 Resource load prediction method for multi-model small-batch production line

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130204412A1 (en) * 2012-02-02 2013-08-08 International Business Machines Corporation Optimal policy determination using repeated stackelberg games with unknown player preferences
CN104135769A (en) * 2014-07-01 2014-11-05 宁波大学 Method of OFDMA (Orthogonal Frequency Division Multiple Access) ergodic capacity maximized resource allocation under incomplete channel state information
CN105727550A (en) * 2016-01-27 2016-07-06 安徽大学 Dot and box chess game system based on UCT algorithm
WO2016123213A1 (en) * 2015-01-30 2016-08-04 Alcatel-Lucent Usa Inc. Frequency resource and/or modulation and coding scheme indicator for machine type communication device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130204412A1 (en) * 2012-02-02 2013-08-08 International Business Machines Corporation Optimal policy determination using repeated stackelberg games with unknown player preferences
CN104135769A (en) * 2014-07-01 2014-11-05 宁波大学 Method of OFDMA (Orthogonal Frequency Division Multiple Access) ergodic capacity maximized resource allocation under incomplete channel state information
WO2016123213A1 (en) * 2015-01-30 2016-08-04 Alcatel-Lucent Usa Inc. Frequency resource and/or modulation and coding scheme indicator for machine type communication device
CN105727550A (en) * 2016-01-27 2016-07-06 安徽大学 Dot and box chess game system based on UCT algorithm

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
GAO LIN等: "Research on Resource Allocation Evaluation of Collaborative Product Developmet for Cloud Manufacuturing", 《THE 2015 INTERNATIONAL COFERENCE ON ADVANCES IN CONSTRUCTION MACHINERY AND VEHICLE ENGINEERING》 *
QUN MENG等: "Enhancing pattern search for global optimization with an additive global and local Gaussian Process Model", 《2017 WINTER SIMULATION COFERENCE》 *
YUNCHUAN LI等: "Monte Carlo tree search with optimal computing budget allocation", 《2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL》 *
刘洋: "点格棋博弈中UCT算法的研究与实现", 《中国优秀硕士学位论文全文数据库》 *
朱怡桦: "运用OCBA法改善求解随机性专案网路最佳化资源分配问题之研究", 《HTTPS://ETD.LIB.NCTU.EDU.TW/CGI-BIN/GS32/TUGSWEB.CGI?O=DNCTUCDR&S=ID=%22GT079832534%22.&SEARCHMODE=BASIC》 *

Cited By (13)

Publication number Priority date Publication date Assignee Title
CN109859532A (en) * 2019-02-28 2019-06-07 深圳市北斗智能科技有限公司 Break indices method and related apparatus with multi-constraint conditions
CN110209770B (en) * 2019-06-03 2022-04-15 北京邮电大学 Named entity identification method based on strategy value network and tree search enhancement
CN110209770A (en) * 2019-06-03 2019-09-06 北京邮电大学 Named entity recognition method based on policy value network and tree search enhancement
CN110427261A (en) * 2019-08-12 2019-11-08 电子科技大学 Edge computing task allocation method based on deep Monte Carlo tree search
CN112202514A (en) * 2020-10-09 2021-01-08 中国人民解放军国防科技大学 Broadband spectrum sensing method based on reinforcement learning
CN112202514B (en) * 2020-10-09 2022-11-08 中国人民解放军国防科技大学 Broadband spectrum sensing method based on reinforcement learning
CN112700005A (en) * 2020-12-28 2021-04-23 北京环境特性研究所 Abnormal event processing method and device based on Monte Carlo tree search
CN112700005B (en) * 2020-12-28 2024-02-23 北京环境特性研究所 Abnormal event processing method and device based on Monte Carlo tree search
CN112734312B (en) * 2021-03-31 2021-07-09 平安科技(深圳)有限公司 Method for outputting reference data and computer equipment
CN112734312A (en) * 2021-03-31 2021-04-30 平安科技(深圳)有限公司 Method for outputting reference data and computer equipment
CN113935618A (en) * 2021-10-12 2022-01-14 网易有道信息技术(江苏)有限公司 Evaluation method and device for chess playing capability, electronic equipment and storage medium
CN114492910A (en) * 2021-11-03 2022-05-13 北京科技大学 Resource load prediction method for multi-model small-batch production line
CN114492910B (en) * 2021-11-03 2023-11-14 北京科技大学 Resource load prediction method for multi-model small-batch production line

Also Published As

Publication number Publication date
CN108809713B (en) 2020-12-25

Similar Documents

Publication Publication Date Title
CN108809713A (en) Monte Carlo tree searching method based on optimal resource allocation algorithm
Gharehchopogh et al. A comprehensive survey on symbiotic organisms search algorithms
Abraham et al. Finding numerical solutions of diophantine equations using ant colony optimization
Mashwani et al. A decomposition-based hybrid multiobjective evolutionary algorithm with dynamic resource allocation
US7418434B2 (en) Forward-chaining inferencing
US6550053B1 (en) Time estimator for object oriented software development
CN110138612A (en) Cloud software service resource allocation method based on QoS model self-correction
CN113671987B (en) Multi-machine distributed time sequence task allocation method based on non-deadlock contract net algorithm
Kalra et al. Multi‐criteria workflow scheduling on clouds under deadline and budget constraints
Garg et al. Multi-objective workflow grid scheduling based on discrete particle swarm optimization
CN105446742A (en) Optimization method for artificial intelligence performing task
Kumar et al. Lagrangian relaxation techniques for scalable spatial conservation planning
CN111127233A (en) User check value calculation method in undirected authorized graph of social network
CN109831343B (en) Peer-to-peer network cooperation promotion method and system based on past strategy
Wang et al. Regional multi-armed bandits with partial informativeness
CN111078380A (en) Multi-target task scheduling method and system
De Rigo et al. Continental-scale living forest biomass and carbon stock: a robust fuzzy ensemble of IPCC Tier 1 maps for Europe
CN104392317A (en) Project scheduling method based on genetic culture gene algorithm
Banati et al. Modeling evolutionary group search optimization approach for community detection in social networks
AlBaity et al. On extending quantum behaved particle swarm optimization to multiobjective context
Zheng et al. A priority-based level heuristic approach for scheduling dag applications with uncertainties
CN108415774A (en) Hardware/software partitioning method based on an improved fireworks algorithm
Mirshahvalad et al. Dynamics of interacting information waves in networks
Tomášek et al. Using one-sided partially observable stochastic games for solving zero-sum security games with sequential attacks
CN110162400B (en) Method and system for realizing cooperation of intelligent agents in MAS system in complex network environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant