CN116757272A - Continuous motion control reinforcement learning framework and learning method - Google Patents
Continuous motion control reinforcement learning framework and learning method
- Publication number
- CN116757272A (application CN202310805443.7A)
- Authority
- CN
- China
- Prior art keywords
- learning
- motion control
- clustering
- reinforcement learning
- continuous motion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/092—Reinforcement learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
Abstract
The application discloses a continuous motion control reinforcement learning framework and a learning method, relating to the technical field of artificial intelligence. The learning framework includes: a multi-step state transition learning module, used for learning multi-step state transitions with a convolutional neural network and updating the policy; an expectation estimation module, used for estimating the expectation of the multi-step cumulative return with a multi-step temporal-difference algorithm; and a sample clustering module, used for clustering state transition samples of different types so that every sample is sampled uniformly. By combining a convolutional neural network, multi-step temporal-difference estimation, and state transition clustering, the application effectively improves learning efficiency and accuracy and makes fuller use of the samples.
Description
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a continuous motion control reinforcement learning framework and a learning method.
Background
Currently, several effective deep reinforcement learning algorithms have been proposed for optimizing continuous control. The most representative is DDPG, which is based on the actor-critic approach. Let ρ_t denote the state at time t and α_t the action at time t. The deterministic policy is defined as follows:
α_t = π_θ(ρ_t)
The existing actor-critic framework trains the agent by cyclically updating the estimated cumulative-return function and the policy that maximizes this function. In DDPG, the estimate of the cumulative return is obtained by minimizing the mean-squared temporal-difference error over B, where B is the set of sampled state transitions, rewards, and actions. The objective maximized when updating the policy is the estimated cumulative return of the actions the policy selects on the sampled states.
based on the actor-commentator framework, DDPG learns the state transitions of a single step mainly through a fully connected neural network and then estimates the expectations of the cumulative rewards function through the cumulative rewards of the single step. TD3 and SAC are two improved algorithms based on DDPG, and TD3 improves over-estimation, policy update and exploration in DDPG through a double criticizer network, time sequence differential estimation and gaussian noise. SAC has advanced the exploration in DDPG mainly by improving objective functions, it also uses double commentators networks and time-series differential estimation.
However, the prior art has the following disadvantages:
1. Considering only single-step state transitions leads to low learning efficiency.
2. Estimating the expectation of the cumulative return from single-step returns alone can make the estimate inaccurate.
3. Updating the neural network using randomly sampled state transitions tends to underutilize the samples.
Disclosure of Invention
In order to overcome or at least partially solve the above problems, the present application provides a continuous motion control reinforcement learning framework and a learning method that combine a convolutional neural network, multi-step temporal-difference estimation, and state transition clustering, effectively improving learning efficiency and accuracy and making fuller use of the samples.
In order to solve the technical problems, the application adopts the following technical scheme:
in a first aspect, the present application provides a continuous motion control reinforcement learning framework, comprising a multi-step state transition learning module, an expectation estimation module, and a sample clustering module, wherein:
the multi-step state transition learning module is used for learning multi-step state transitions with a convolutional neural network and updating the policy;
the expectation estimation module is used for estimating the expectation of the multi-step cumulative return with a multi-step temporal-difference algorithm;
and the sample clustering module is used for clustering state transition samples of different types so that every sample is sampled uniformly.
The framework combines, for the first time, a convolutional neural network, multi-step temporal-difference estimation, and state transition clustering, and has the following characteristics: the policy is updated with a convolutional neural network that accounts for multi-step state transitions; the expectation of the multi-step cumulative return is estimated with a multi-step temporal-difference algorithm; and every sample is fully exploited by clustering the existing state transition samples. In this application, learning multi-step state transitions through the convolutional neural network in reinforcement learning for continuous control improves learning efficiency; estimating the expectation of the cumulative return from multi-step returns on that basis makes the estimate more accurate; and clustering lets state transition samples of different types be sampled uniformly, so the samples are used more fully.
Based on the first aspect, further, the policy is α_t = π_{θ_c}(ρ_{t−n_p+1}, …, ρ_t), where α is the action, ρ is the state, π is the policy, θ_c are the parameters of the convolutional neural network, t is the current time, and n_p is the number of state-transition steps.
Based on the first aspect, further, the expectation of the multi-step cumulative return is estimated by minimizing an objective function in which n_p is the number of state-transition steps, n_q is the number of return steps, B_n is the set of sampled multi-step state transitions, multi-step returns, and actions, E denotes the expectation, Q is the function estimating the expectation of the cumulative return, and the remaining symbols are the parameters used to estimate Q.
Based on the first aspect, further, the policy is updated by maximizing a function of the estimated cumulative-return expectation.
Based on the first aspect, further, when clustering is performed, the total number of training steps is divided evenly into different time periods, and the samples within each time period are clustered; the clustering method is the k-means algorithm.
Based on the first aspect, further, when state transitions are sampled to update the functions, the samples in each cluster are sampled uniformly.
The application has at least the following advantages or beneficial effects:
the application provides a continuous action control reinforcement learning framework and a learning method, which combine a convolutional neural network, multi-step time sequence difference estimation and state transfer clustering, learn multi-step state transfer through the convolutional neural network in reinforcement learning aiming at continuous control, and improve learning efficiency; estimating the expected accumulated return through multi-step return on the basis of the previous step, so that the estimation is more accurate; the application also enables the state transition samples of different types to be uniformly sampled through clustering, thereby enabling the samples to be more fully utilized.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a continuous motion control reinforcement learning framework according to an embodiment of the present application;
FIG. 2 is a schematic diagram of experimental training of an intelligent agent in different virtual environments according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a sampling pool obtained after sample clustering in an embodiment of the present application;
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the description of the embodiments of the present application, "plurality" means at least 2.
Examples:
As shown in FIG. 1, in a first aspect, an embodiment of the present application provides a continuous motion control reinforcement learning framework, including a multi-step state transition learning module 100, an expectation estimation module 200, and a sample clustering module 300, wherein:
the multi-step state transition learning module 100 is used for learning multi-step state transitions with a convolutional neural network and updating the policy;
the expectation estimation module 200 is configured to estimate the expectation of the multi-step cumulative return using a multi-step temporal-difference algorithm;
the sample clustering module 300 is configured to cluster state transition samples of different types so that every sample is sampled uniformly.
Through the cooperation of the multi-step state transition learning module 100, the expectation estimation module 200, and the sample clustering module 300, the framework combines a convolutional neural network, multi-step temporal-difference estimation, and state transition clustering, and has the following characteristics: the policy is updated with a convolutional neural network that accounts for multi-step state transitions; the expectation of the multi-step cumulative return is estimated with a multi-step temporal-difference algorithm; and every sample is fully exploited by clustering the existing state transition samples. In this application, learning multi-step state transitions through the convolutional neural network in reinforcement learning for continuous control improves learning efficiency; estimating the expectation of the cumulative return from multi-step returns on that basis makes the estimate more accurate; and clustering lets state transition samples of different types be sampled uniformly, so the samples are used more fully.
Based on the first aspect, further, the policy is α_t = π_{θ_c}(ρ_{t−n_p+1}, …, ρ_t), where α is the action, ρ is the state, π is the policy, θ_c are the parameters of the convolutional neural network, t is the current time, and n_p is the number of state-transition steps.
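For illustration, a minimal sketch of such a policy is given below, assuming a 1-D convolution over a window of the last n_p states; the layer sizes and window length are illustrative assumptions rather than details fixed here.

```python
# Illustrative sketch: a convolutional policy over the last n_p states, so that
# multi-step state transitions inform the chosen action (layer sizes assumed).
import torch
import torch.nn as nn

class ConvPolicy(nn.Module):
    def __init__(self, state_dim, action_dim, n_p):
        super().__init__()
        # treat the window as a length-n_p sequence with state_dim channels
        self.conv = nn.Sequential(
            nn.Conv1d(state_dim, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, padding=1), nn.ReLU())
        self.head = nn.Sequential(nn.Flatten(),
                                  nn.Linear(64 * n_p, action_dim), nn.Tanh())

    def forward(self, states):            # states: [batch, n_p, state_dim]
        x = states.transpose(1, 2)        # -> [batch, state_dim, n_p] for Conv1d
        return self.head(self.conv(x))    # action in [-1, 1]^action_dim

policy = ConvPolicy(state_dim=17, action_dim=6, n_p=4)     # assumed sizes
action = policy(torch.randn(1, 4, 17))                     # alpha_t from rho_{t-3..t}
```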
Based on the first aspect, further, the expectation of the multi-step cumulative return is estimated by minimizing an objective function in which n_p is the number of state-transition steps, n_q is the number of return steps, B_n is the set of sampled multi-step state transitions, multi-step returns, and actions, E denotes the expectation, Q is the function estimating the expectation of the cumulative return, and the remaining symbols are the parameters used to estimate Q.
Based on this policy, in the newly defined framework, the estimate of the cumulative return is obtained by minimizing the objective function described above; the symbols appearing in that objective correspond to the definitions given here.
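For illustration, a sketch of forming such a multi-step (n_q-step) temporal-difference target is given below; the discount factor, the target_critic and target_policy interfaces, and the tensor shapes are assumptions of the sketch.

```python
# Illustrative sketch: an n_q-step temporal-difference target for the multi-step
# cumulative return (discount factor and network interfaces are assumptions).
import torch

def n_step_target(rewards, next_window, target_critic, target_policy,
                  gamma=0.99, n_q=3):
    """rewards: [batch, n_q] holding r_t .. r_{t+n_q-1};
    next_window: the state window ending at t + n_q, shaped as the policy expects."""
    discounts = gamma ** torch.arange(n_q, dtype=rewards.dtype)     # 1, gamma, gamma^2, ...
    multi_step_return = (rewards * discounts).sum(dim=1, keepdim=True)
    with torch.no_grad():                                           # bootstrap beyond n_q steps
        bootstrap = target_critic(next_window, target_policy(next_window))
    return multi_step_return + (gamma ** n_q) * bootstrap           # regression target for Q
```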
Based on the first aspect, further, the policy is updated by maximizing a function of the estimated cumulative-return expectation. The objective function that needs to be maximized when updating the policy is as described above, and this is likewise accomplished by updating the defined convolutional neural network.
Based on the first aspect, further, when clustering is performed, the total number of training steps is divided evenly into different time periods, and the samples within each time period are clustered. The clustering method is the k-means algorithm.
Based on the first aspect, further, when state transitions are sampled to update the functions, the samples in each cluster are sampled uniformly.
In some embodiments of the present application, the total number of training steps is divided evenly into different time periods, and the samples within each time period are then clustered. The clustering method is k-means. The resulting sample pool is shown in FIG. 3. Suppose that during sampling the current period is p and the number of clusters per period is k. At each update of the neural network, the samples in each existing cluster are drawn with equal probability, and a sample from the current period is drawn with probability 0.2.
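For illustration, a minimal sketch of such a clustered sampling pool is given below; the feature vectors used for clustering, the number of clusters per period, and the scikit-learn k-means call are assumptions of the sketch rather than details fixed by the embodiment.

```python
# Illustrative sketch: cluster each finished time period's transitions with
# k-means, then sample the current period with probability 0.2 and otherwise
# draw from a uniformly chosen existing cluster (assumed details).
import random
import numpy as np
from sklearn.cluster import KMeans

class ClusteredReplay:
    def __init__(self, k=4):
        self.k = k
        self.clusters = []      # lists of transitions, one list per past cluster
        self.current = []       # (transition, feature) pairs of the current period

    def add(self, transition, features):
        self.current.append((transition, np.asarray(features, dtype=float)))

    def end_period(self):
        """Cluster the finished period's transitions by their feature vectors."""
        if len(self.current) >= self.k:
            feats = np.stack([f for _, f in self.current])
            labels = KMeans(n_clusters=self.k, n_init=10).fit_predict(feats)
            for c in range(self.k):
                self.clusters.append([t for (t, _), l in zip(self.current, labels) if l == c])
        self.current = []

    def sample(self):
        """Current period with probability 0.2, else a uniformly chosen existing cluster."""
        if self.current and (not self.clusters or random.random() < 0.2):
            return random.choice(self.current)[0]
        return random.choice(random.choice(self.clusters))
```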
In some embodiments of the application, the flow of the algorithm for learning based on the framework is as follows, where np is the number of defined time periods and pt is the number of steps per time period (an illustrative Python sketch is given after the listing):
initializing neural network parameters
Initializing sampling space
Initializing exploring noise
Fore=1:np
Fort=1:pt
Selecting actions by policy
Adding exploratory noise for motion
Executing an action rewards rt and status
Storing actions, states, and rewards into a sampling space
Selecting samples from existing clusters in the sample space and state transitions generated by the current period
Updating neural networks by selected samples
Endfor
Clustering samples in a previous time period
Endfor
Outputting a policy model based on the neural network.
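Expressed as Python, the same procedure might be organized roughly as follows; env, policy, update, and buffer stand for assumed interfaces (for example, the ClusteredReplay sketched above), and reward and termination handling are simplified.

```python
# Skeleton of the learning loop above (env, policy, update and buffer are
# assumed interfaces; not tied to a specific environment).
import numpy as np

def train(env, policy, update, buffer, n_periods, steps_per_period,
          batch_size=64, noise_std=0.1):
    state = env.reset()
    for period in range(n_periods):
        for _ in range(steps_per_period):
            action = policy(state) + np.random.normal(0.0, noise_std)   # policy + exploration noise
            next_state, reward, done = env.step(action)                 # obtain r_t and the next state
            buffer.add((state, action, reward, next_state),
                       features=np.asarray(state, dtype=float))         # store into the sampling space
            update([buffer.sample() for _ in range(batch_size)])        # learn from clustered + current samples
            state = env.reset() if done else next_state
        buffer.end_period()                                             # cluster the period just finished
    return policy                                                       # the learned policy model
```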
In some embodiments of the present application, the TD3 algorithm is modified with the framework proposed by the application, resulting in a new algorithm, TD3+. Experiments were performed in the virtual robot control environment MuJoCo, with experimental tasks including HalfCheetah, Walker2d, and Hopper. FIG. 2 shows the agents in the HalfCheetah, Walker2d, and Hopper environments. In the first two environments, the agent must learn through reinforcement learning to travel as far as possible within a fixed number of steps, while in the last the single-legged agent must be trained to travel as far as possible.
The comparison algorithms include DDPG, SAC, and TD3. For all methods, each task runs 2×10^6 time steps. The cumulative returns obtained by the different algorithms on the different tasks are shown in Table 1; it can be seen that the TD3+ algorithm implemented with the proposed framework performs better than the existing algorithms.
Table 1:

Running environment | TD3+ | TD3 | SAC | DDPG
---|---|---|---|---
HalfCheetah | 13589.17 | 10032.66 | 9643.93 | 9453.22
Walker2d | 6167.26 | 4471.43 | 4971.42 | 3804.91
Hopper | 3812.30 | 3472.65 | 3531.77 | 3736.21
In some embodiments of the present application, ablation experiments were performed; the results are shown in Table 2, which compares the new method without clustering (TD3+woC), without the convolutional neural network (TD3+woS), and without the multi-step temporal-difference algorithm (TD3+woQ). It can be seen that each component of the application (the convolutional neural network, multi-step temporal-difference estimation, and clustering) effectively improves the reinforcement learning results.
Table 2:

Running environment | TD3+ | TD3+woC | TD3+woS | TD3+woQ
---|---|---|---|---
HalfCheetah | 13589.17 | 12824.48 | 12654.81 | 12051.56
Walker2d | 6267.26 | 6056.13 | 5401.72 | 5737.12
Hopper | 3812.30 | 3758.30 | 3713.23 | 3762.34
In a second aspect, an embodiment of the present application provides a continuous motion control reinforcement learning method based on the continuous motion control reinforcement learning framework of the first aspect, including the steps of:
learning multi-step state transitions with a convolutional neural network and updating the policy;
estimating the expectation of the multi-step cumulative return with a multi-step temporal-difference algorithm;
and clustering state transition samples of different types so that every sample is sampled uniformly.
In the application, learning multi-step state transitions through the convolutional neural network in reinforcement learning for continuous control improves learning efficiency; estimating the expectation of the cumulative return from multi-step returns on that basis makes the estimate more accurate; and clustering lets state transition samples of different types be sampled uniformly, so the samples are used more fully.
The above is only a preferred embodiment of the present application, and is not intended to limit the present application, but various modifications and variations can be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.
It will be evident to those skilled in the art that the application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (7)
1. A continuous motion control reinforcement learning framework, characterized by comprising a multi-step state transition learning module, an expectation estimation module, and a sample clustering module, wherein:
the multi-step state transition learning module is used for learning multi-step state transitions with a convolutional neural network and updating a policy;
the expectation estimation module is used for estimating the expectation of the multi-step cumulative return with a multi-step temporal-difference algorithm;
and the sample clustering module is used for clustering state transition samples of different types so that every sample is sampled uniformly.
2. The continuous motion control reinforcement learning framework of claim 1, wherein the policy is α_t = π_{θ_c}(ρ_{t−n_p+1}, …, ρ_t), where α is the action, ρ is the state, π is the policy, θ_c are the parameters of the convolutional neural network, t is the current time, and n_p is the number of state-transition steps.
3. The continuous motion control reinforcement learning framework of claim 1, wherein the expectation of the multi-step cumulative return is estimated by minimizing an objective function in which n_p is the number of state-transition steps, n_q is the number of return steps, B_n is the set of sampled multi-step state transitions, multi-step returns, and actions, E denotes the expectation, Q is the function estimating the expectation of the cumulative return, and the remaining symbols are the parameters used to estimate Q.
4. The continuous motion control reinforcement learning framework of claim 3, wherein the policy is updated by maximizing a function of the estimated cumulative-return expectation.
5. The continuous motion control reinforcement learning framework of claim 1, wherein, when clustering is performed, the total number of training steps is divided evenly into different time periods and the samples within each time period are clustered; the clustering method is the k-means algorithm.
6. The continuous motion control reinforcement learning framework of claim 1, wherein, when state transitions are sampled to update the functions, the samples in each cluster are sampled uniformly.
7. A continuous motion control reinforcement learning method based on the continuous motion control reinforcement learning framework according to any one of claims 1 to 6, characterized by comprising the following operations:
learning multi-step state transitions with a convolutional neural network and updating the policy;
estimating the expectation of the multi-step cumulative return with a multi-step temporal-difference algorithm;
and clustering state transition samples of different types so that every sample is sampled uniformly.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310805443.7A CN116757272A (en) | 2023-07-03 | 2023-07-03 | Continuous motion control reinforcement learning framework and learning method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116757272A true CN116757272A (en) | 2023-09-15 |
Family
ID=87956899
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310805443.7A Pending CN116757272A (en) | 2023-07-03 | 2023-07-03 | Continuous motion control reinforcement learning framework and learning method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116757272A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169567A (en) * | 2017-03-30 | 2017-09-15 | 深圳先进技术研究院 | The generation method and device of a kind of decision networks model for Vehicular automatic driving |
US20210397959A1 (en) * | 2020-06-22 | 2021-12-23 | Google Llc | Training reinforcement learning agents to learn expert exploration behaviors from demonstrators |
CN115293217A (en) * | 2022-08-23 | 2022-11-04 | 南京邮电大学 | Unsupervised pseudo tag optimization pedestrian re-identification method based on radio frequency signals |
CN115439887A (en) * | 2022-08-26 | 2022-12-06 | 三维通信股份有限公司 | Pedestrian re-identification method and system based on pseudo label optimization and storage medium |
CN116224794A (en) * | 2023-03-03 | 2023-06-06 | 北京理工大学 | Reinforced learning continuous action control method based on discrete-continuous heterogeneous Q network |
Non-Patent Citations (2)
Title |
---|
MIN LI et al.: "Clustering experience replay for the effective exploitation in reinforcement learning", ELSEVIER, pages 1 - 9 *
HUANG Tianyi: "Research on deep reinforcement learning algorithms and their application in unsupervised denoising", CNKI Doctoral Electronic Journals, pages 3 - 4 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |