CN108021028A - A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning - Google Patents

A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Info

Publication number
CN108021028A
Authority
CN
China
Prior art keywords
control
state
pomdp
multi-dimensional
strategy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711407168.4A
Other languages
Chinese (zh)
Other versions
CN108021028B (en)
Inventor
李鹏华
王欢
李嫄源
朱智勤
张家昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201711407168.4A (patent CN108021028B)
Publication of CN108021028A
Application granted
Publication of CN108021028B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention relates to a multi-dimensional cooperative control method for smart recreational vehicles (RVs) based on relevant redundant transformation and reinforcement learning, and belongs to the field of the Internet of Things. The method addresses the needs of smart commercial touring RVs for a unified device connection protocol, shared device interfaces, and a higher level of system integration through autonomous cooperative control. It uses an autonomous control guidance strategy based on a POMDP model and deep reinforcement learning: the control states obtained by multi-dimensional intelligent fusion serve as the input to the computer control system; a POMDP model is built to perceive, adapt to, and track changes in device control states; and a policy optimization method based on deep reinforcement learning selects the optimal behavior policy, achieving autonomous cooperative control of the commercial touring RV. The invention improves the validity and timeliness of the final decision, as well as the accuracy of interactive feedback and the degree of policy learning optimization, thereby improving the user experience.

Description

A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning
Technical field
The invention belongs to the field of the Internet of Things and relates to a multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning.
Background
As a product of the deep integration of intelligent connected vehicles and smart homes, the smart RV uses multi-sensor data acquisition and in-vehicle gateway communication to control on-board equipment intelligently, meeting people's demand for spatial experience and intelligent living in an RV. Intelligent control is one of the core technologies of the smart RV, and the timeliness and accuracy with which its control strategy is executed directly determine the quality of the vehicle. Smart RVs currently on the market, however, suffer from a single control mode, control strategy generation that lacks intelligence, and excessive execution cost. For this reason, this patent unifies and fuses multi-source heterogeneous information features to integrate multi-source sensor data, laying the foundation for control in multi-modal, complex environments. It combines a control-state guidance strategy based on a POMDP model with guidance-strategy optimization based on deep reinforcement learning, making control decisions more accurate and intelligent. It also uses bus-based low-level control, which greatly reduces the cost of connecting sensors, improves the reliability of the whole sensing platform, and saves substantial computing resources.
Summary of the invention
In view of this, the object of the invention is to provide a multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning.
To achieve this object, the invention provides the following technical solution:
A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning, comprising the following steps:
S1: unify and fuse multi-source heterogeneous information features;
S2: guide the control-state strategy using a POMDP model;
S3: optimize the control-state guidance strategy using deep reinforcement learning;
S4: apply bus-based distributed low-level control.
Further, step S1 is specifically: in a multi-sensor network environment, the heterogeneous information from multiple sensors is analyzed with the canonical correlation analysis (CCA) algorithm and the Isomorphic Relevant Redundant Transformation (IRRT) algorithm, which map the heterogeneous information into a unified, computable dimensional space; once the feature information has been given a unified representation, the information is fused.
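The CCA part of this step can be illustrated with a minimal sketch. It assumes two sensor "views" stored as NumPy matrices with one row per sample; the function names `cca_fit` and `fuse`, the regularization term `reg`, and the choice to fuse by concatenating the projected views are illustrative assumptions, not details given in the patent. The IRRT algorithm is not specified in this text and is not shown.

```python
import numpy as np

def cca_fit(X, Y, k, reg=1e-6):
    """Return projections (Wx, Wy) and the top-k canonical correlations.

    X: (n, p) samples from one sensor view; Y: (n, q) samples from another.
    reg is a small ridge term (an assumption) that keeps the covariances invertible.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with its Cholesky factor, then take the SVD of the
    # whitened cross-covariance; singular values are canonical correlations.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return Wx, Wy, s[:k]

def fuse(X, Y, Wx, Wy):
    """Project both views into the shared space and concatenate them."""
    Zx = (X - X.mean(axis=0)) @ Wx
    Zy = (Y - Y.mean(axis=0)) @ Wy
    return np.hstack([Zx, Zy])
```

After projection, both views live in the same k-dimensional coordinates, so features from sensors with different native dimensions can be compared or combined directly.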
Further, step S2 is specifically: using the control states of the various devices of the commercial touring RV obtained by multi-source heterogeneous fusion, a POMDP model is built to perceive, adapt to, and track changes in device control states. The internal actuator of the POMDP model applies actions to the device control state, causing it to change and yielding a certain reward. The likelihood of the sequence of executed policies is evaluated from the accumulated reward, which converts the device control problem of the commercial touring RV into a policy selection problem. Specifically, the POMDP model is described by the tuple <S, A, T, O, Q, β>. The belief state over the overall environment state is the probability distribution B = {b_t}, whose value at time t is b_t = {b_t(s_1), ..., b_t(s_m)}, where b_t(s_i) is the probability that the environment state at time t is s_i. From the observation of the environment and the action selected at the current time, the POMDP model infers the belief over the control state at the next time. Suppose the belief state at the initial time is b_0; executing action a and receiving observation σ yields the next belief state b_1. When the control state is s_1, the model receives observation o_1 and its internal state is i_1. The model computes and selects the corresponding action a_1 according to the control-state guidance strategy, causing the environment state to transition from s_1 to s_2; the model receives reward r_1 and observation o_2, its internal state transitions from i_1(b_1) to i_2(b_2), and the model continues to run in this manner;
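The step from b_0 to b_1 can be sketched with the standard Bayesian belief update for a discrete POMDP. The patent does not spell out the update rule, so the form below is the textbook one; the table layouts `T[a][s][s2]` and `O[a][s2][o]` and the function name `belief_update` are assumptions made for illustration.

```python
def belief_update(b, a, o, T, O):
    """Return the next belief after taking action a and observing o.

    b: list of probabilities over states (the current belief b_t).
    T[a][s][s2]: assumed transition probability P(s2 | s, a).
    O[a][s2][o]: assumed observation probability P(o | s2, a).
    """
    n = len(b)
    # Predict the next state, then weight by how well it explains o.
    unnorm = [
        O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
        for s2 in range(n)
    ]
    z = sum(unnorm)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / z for p in unnorm]
```

For example, with two device states (off, on), a toggle action that succeeds with probability 0.9, and a sensor that reports the true state with probability 0.8, starting from b_0 = [1.0, 0.0] and observing "on" gives a belief of 0.72/0.74 that the device is on.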
Specifically, a guidance-strategy estimation function is constructed for the problem to realize dialogue state tracking; the function is defined in terms of the value of the action vector of node n at state s [formula not preserved in this text]. Evolving the control-state strategy yields the control-state guidance strategy function for the next time step, expressed through the optimal policy and the policy function of the previous time step [formula not preserved in this text].
Further, step S3 is specifically: the guidance strategy for the commercial touring RV's device control state is obtained from the POMDP model, and the optimal behavior policy is selected by a policy optimization method based on the deep reinforcement learning algorithm DQN. Specifically, the Q-network Q(s, a; θ) defines the behavior policy, the target Q-network Q(s, a; θ⁻) generates the target Q-values for the DQN loss term, and a replay memory supplies the randomly sampled state values of the POMDP model used to train the Q-network. Reinforcement learning defines the expected total return R_t of the POMDP model, in which each reward r_t is discounted by a factor γ ∈ [0, 1] per time step and T is the termination step. The action-value function Q^π(s, a) gives the expected return from the observed state s_t, and a neural network Q(s, a; θ) approximates the action-value function. For the guidance strategy π under action a, the action-value function is Q^π(s, a) = E[R_t | s_t = s, a_t = a, π], and the optimal action-value function is attained by the optimal policy. A Bellman equation containing the action value a is constructed, and the iterative Bellman target is solved by adjusting the Q-network;
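Several formulas in this paragraph did not survive in the text. For orientation only, the standard DQN definitions that the wording appears to follow are given below; this is the textbook form and is an assumption about what the patent's own formulas contain, not a reproduction of them.

```latex
\begin{aligned}
R_t &= \sum_{t'=t}^{T} \gamma^{\,t'-t}\, r_{t'}, \qquad \gamma \in [0,1],\\
Q^{\pi}(s,a) &= \mathbb{E}\left[R_t \mid s_t = s,\ a_t = a,\ \pi\right],\\
Q^{*}(s,a) &= \max_{\pi} Q^{\pi}(s,a),\\
Q^{*}(s,a) &= \mathbb{E}_{s'}\left[\, r + \gamma \max_{a'} Q^{*}(s',a') \;\middle|\; s, a \right].
\end{aligned}
```

The last line is the Bellman optimality equation; its right-hand side, evaluated with a fixed copy of the network parameters, is what serves as the regression target when the Q-network is adjusted.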
First, DQN uses experience replay: at each time step t of the POMDP model, the memory tuple e_t = (s_t, a_t, r_t, s_{t+1}) is stored in the replay memory D_t = {e_1, ..., e_t};
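A replay memory of this kind can be sketched in a few lines. The class name `ReplayMemory`, the bounded `capacity` (the text describes an ever-growing D_t), and the optional `seed` are assumptions added for illustration.

```python
import random
from collections import deque, namedtuple

# One memory tuple e_t = (s_t, a_t, r_t, s_{t+1}).
Transition = namedtuple("Transition", ["s", "a", "r", "s_next"])

class ReplayMemory:
    """Fixed-capacity store of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self.buffer.append(Transition(s, a, r, s_next))

    def sample(self, batch_size):
        # Uniform sampling without replacement, i.e. (s, a, r, s') ~ U(D).
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly from stored transitions, rather than training on consecutive steps, breaks the correlation between successive samples.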
Second, DQN maintains two separate Q-networks, Q(s, a; θ) and Q(s, a; θ⁻). The current parameters θ are updated many times per time step and are copied into the old parameters θ⁻ after every n iterations. At each update iteration, the current parameters θ are updated by optimizing a loss function that minimizes the mean-squared Bellman error with respect to the old parameters θ⁻. For each update i, a memory tuple (s, a, r, s') ~ U(D) is sampled uniformly from the replay memory D. For each sample, the current parameters θ are updated by stochastic gradient descent; the descent gradient g_i is obtained as the sample gradient of the loss with respect to θ, where the loss is defined relative to θ⁻;
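The two-network update can be sketched as follows. To stay self-contained, a linear function with one weight vector per action stands in for the neural Q-network; that substitution, the feature vectors `phi`, the function names, and the default `gamma` and `lr` values are all assumptions. The loss is taken to be the usual squared TD error, which the patent's missing formula is assumed to match.

```python
def q_values(theta, phi):
    """Linear stand-in for Q(s, .; theta): one weight vector per action."""
    return [sum(w * x for w, x in zip(row, phi)) for row in theta]

def dqn_step(theta, theta_target, batch, gamma=0.9, lr=0.1):
    """Run SGD over a sampled batch; update theta in place.

    batch: iterable of (phi, a, r, phi_next) tuples.
    Returns the mean squared TD error measured before each update.
    """
    total = 0.0
    for phi, a, r, phi_next in batch:
        # The target uses the old parameters theta^-, which stay fixed here.
        target = r + gamma * max(q_values(theta_target, phi_next))
        td = target - q_values(theta, phi)[a]
        total += td * td
        # Gradient of 0.5 * td^2 w.r.t. theta[a] is -td * phi.
        theta[a] = [w + lr * td * x for w, x in zip(theta[a], phi)]
    return total / len(batch)

def sync_target(theta):
    """Copy the current parameters into a fresh target parameter set."""
    return [row[:] for row in theta]
```

In use, `dqn_step` would be called repeatedly on sampled batches and `sync_target` every n iterations, so the regression target moves only when the old parameters are refreshed.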
Finally, at each time step t, the preferred action with respect to the current Q-network Q(s, a; θ) is selected. A central parameter server maintains a distributed representation of the Q-network Q(s, a; θ⁻). The parameter server also receives the gradient information produced by reinforcement learning and, driven by an asynchronous stochastic gradient descent algorithm, uses these gradients to modify the parameter vector θ⁻.
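The parameter-server arrangement can be sketched with threads standing in for distributed learners. The class `ParameterServer`, its `pull` and `push_gradient` methods, the `worker` function, and the use of a lock within a single process are illustrative assumptions; the patent does not describe the server's interface.

```python
import threading

class ParameterServer:
    """Holds the shared parameter vector and applies incoming gradients."""

    def __init__(self, theta, lr=0.01):
        self._theta = list(theta)
        self._lr = lr
        self._lock = threading.Lock()
        self.updates = 0

    def pull(self):
        # Learners fetch a snapshot of the current parameters.
        with self._lock:
            return list(self._theta)

    def push_gradient(self, grad):
        # Gradients arrive in whatever order learners finish (asynchronous SGD).
        with self._lock:
            self._theta = [w - self._lr * g for w, g in zip(self._theta, grad)]
            self.updates += 1

def worker(server, grads):
    """Stand-in learner: pulls parameters, then pushes precomputed gradients."""
    for g in grads:
        server.pull()  # a real learner would compute g from these parameters
        server.push_gradient(g)
```

Because each learner pushes gradients without waiting for the others, a gradient may have been computed from slightly stale parameters; asynchronous SGD accepts that in exchange for throughput.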
Further, step S4 is specifically: an addressing scheme for data channels based on memory mapping is designed; trigger mode, timing, and load capacity are considered together, and multiplexer switches and sample-and-hold circuits are coordinated to share the data interface channels. An autonomous control system with a redundant structure is designed to intelligently parse the control instructions produced by the fusion decision, taking into account power-supply output fluctuation, electromagnetic radiation, and interference from distributed capacitance and inductance, to complete the autonomous control of on-board equipment.
The beneficial effects of the invention are as follows. The invention is a multi-dimensional cooperative control method for smart RVs based on relevant redundant transformation and reinforcement learning. It covers equipment condition monitoring, human-like perception of the driving environment, human feature recognition, user intent inference, resource information interaction, and autonomous control of vehicle devices. Compared with other methods, this patent addresses the needs of smart commercial touring RVs for a unified device connection protocol, shared device interfaces, and a higher level of system integration through autonomous cooperative control. It uses an autonomous control guidance strategy based on a POMDP model and deep reinforcement learning: the control states obtained by multi-dimensional intelligent fusion serve as the input to the computer control system, a POMDP model is built to perceive, adapt to, and track changes in device control states, and a policy optimization method based on deep reinforcement learning (DQN) selects the optimal behavior policy, achieving autonomous cooperative control of the commercial touring RV. Combining the POMDP model with deep reinforcement learning yields optimized control decisions for the smart RV in multi-modal, complex environments. This improves the validity and timeliness of the final decision, as well as the accuracy of interactive feedback and the degree of policy learning optimization, thereby improving the user experience.
Brief description of the drawings
To make the object, technical solution, and beneficial effects of the invention clearer, the following drawings are provided:
Fig. 1 shows the unified representation and fusion of multi-source heterogeneous features;
Fig. 2 shows the control-state guidance strategy of the POMDP model;
Fig. 3 shows the control-state strategy optimization model based on deep reinforcement learning.
Detailed description
Preferred embodiments of the invention are described in detail below with reference to the drawings.
As shown in Figs. 1, 2, and 3, the implementation details of each part of the invention are as follows:
1. Unification and fusion of multi-source heterogeneous information features. The process comprises the following five steps:
(1) multi-modal data acquisition;
(2) multi-modal feature extraction;
(3) feature association;
(4) unified representation of multi-modal features;
(5) fusion of multi-source heterogeneous information features.
2. Design of the control-state guidance strategy based on the POMDP model. The process comprises the following four steps:
(1) build the POMDP model to perceive, adapt to, and track changes in device control states;
(2) the actuator applies actions to the device control state and obtains a certain reward;
(3) evaluate the likelihood of the sequence of executed policies from the accumulated reward;
(4) select among the resulting policies.
3. Optimization of the control-state guidance strategy based on deep reinforcement learning. The process comprises the following three steps:
(1) define the behavior policy with the Q-network;
(2) generate the target Q-values for the DQN loss term;
(3) use the replay memory to supply the randomly sampled state values of the POMDP model for training the Q-network.
4. Bus-based distributed low-level control. A centralized smart-RV gateway control scheme based on the CAN bus is adopted. A smart-RV interface unit module is introduced to isolate the diversity of controlled objects and reduce system complexity. Keyboard control and remote control are separated from the smart-RV gateway, which improves the gateway's reliability. The RV's internal network uses the CAN bus, which lowers system cost and satisfies the system's scalability; the multi-master nature of the CAN bus enables plug-and-play of controlled objects.
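The multi-master and plug-and-play ideas can be illustrated with a small software model of a CAN bus. It relies on two well-known properties of classical CAN: standard identifiers are 11 bits with at most 8 data bytes per frame, and when several nodes transmit at once the frame with the lowest identifier wins arbitration. The names `CanFrame`, `arbitrate`, and `Gateway`, and the idea of registering one handler per identifier, are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CanFrame:
    can_id: int   # 11-bit standard identifier; lower value = higher priority
    data: bytes   # payload, 0 to 8 bytes in classical CAN

    def __post_init__(self):
        if not 0 <= self.can_id <= 0x7FF:
            raise ValueError("standard CAN identifier must fit in 11 bits")
        if len(self.data) > 8:
            raise ValueError("classical CAN payload is at most 8 bytes")

def arbitrate(pending):
    """Pick the frame that wins bus arbitration among simultaneous senders."""
    return min(pending, key=lambda f: f.can_id)

class Gateway:
    """Routes received frames to handlers that devices register at runtime."""

    def __init__(self):
        self.handlers = {}

    def register(self, can_id, handler):
        # A newly attached device only needs to announce its identifier.
        self.handlers[can_id] = handler

    def dispatch(self, frame):
        handler = self.handlers.get(frame.can_id)
        return handler(frame.data) if handler else None
```

Because any node may start a transmission and priority is resolved by identifier alone, a device can be added by assigning it an unused identifier and registering a handler, without reconfiguring the other nodes.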
Finally, the preferred embodiments above are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail through these preferred embodiments, those skilled in the art should understand that various changes in form and detail may be made without departing from the scope defined by the claims of the invention.

Claims (5)

  1. A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning, characterized in that the method comprises the following steps:
    S1: unify and fuse multi-source heterogeneous information features;
    S2: guide the control-state strategy using a POMDP model;
    S3: optimize the control-state guidance strategy using deep reinforcement learning;
    S4: apply bus-based distributed low-level control.
  2. The multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning according to claim 1, characterized in that step S1 is specifically: in a multi-sensor network environment, the heterogeneous information from multiple sensors is analyzed with the canonical correlation analysis (CCA) algorithm and the Isomorphic Relevant Redundant Transformation (IRRT) algorithm, which map the heterogeneous information into a unified, computable dimensional space; once the feature information has been given a unified representation, the information is fused.
  3. The multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning according to claim 1, characterized in that step S2 is specifically: using the control states of the various devices of the commercial touring RV obtained by multi-source heterogeneous fusion, a POMDP model is built to perceive, adapt to, and track changes in device control states; the internal actuator of the POMDP model applies actions to the device control state, causing it to change and yielding a certain reward; the likelihood of the sequence of executed policies is evaluated from the accumulated reward, which converts the device control problem of the commercial touring RV into a policy selection problem; specifically, the POMDP model is described by the tuple <S, A, T, O, Q, β>; the belief state over the overall environment state is the probability distribution B = {b_t}, whose value at time t is b_t = {b_t(s_1), ..., b_t(s_m)}, where b_t(s_i) is the probability that the environment state at time t is s_i; from the observation of the environment and the action selected at the current time, the POMDP model infers the belief over the control state at the next time; suppose the belief state at the initial time is b_0; executing action a and receiving observation σ yields the next belief state b_1; when the control state is s_1, the model receives observation o_1 and its internal state is i_1; the model computes and selects the corresponding action a_1 according to the control-state guidance strategy, causing the environment state to transition from s_1 to s_2; the model receives reward r_1 and observation o_2, its internal state transitions from i_1(b_1) to i_2(b_2), and the model continues to run in this manner;
    specifically, a guidance-strategy estimation function is constructed for the problem to realize dialogue state tracking; the function is defined in terms of the value of the action vector of node n at state s [formula not preserved in this text]; evolving the control-state strategy yields the control-state guidance strategy function for the next time step, expressed through the optimal policy and the policy function of the previous time step [formula not preserved in this text].
  4. The multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning according to claim 1, characterized in that step S3 is specifically: the guidance strategy for the commercial touring RV's device control state is obtained from the POMDP model, and the optimal behavior policy is selected by a policy optimization method based on the deep reinforcement learning algorithm DQN; specifically, the Q-network Q(s, a; θ) defines the behavior policy, the target Q-network Q(s, a; θ⁻) generates the target Q-values for the DQN loss term, and a replay memory supplies the randomly sampled state values of the POMDP model used to train the Q-network; reinforcement learning defines the expected total return R_t of the POMDP model, in which each reward r_t is discounted by a factor γ ∈ [0, 1] per time step and T is the termination step; the action-value function Q^π(s, a) gives the expected return from the observed state s_t, and a neural network Q(s, a; θ) approximates the action-value function; for the guidance strategy π under action a, the optimal action-value function is attained by the optimal policy; a Bellman equation containing the action value a is constructed, and the iterative Bellman target is solved by adjusting the Q-network;
    first, DQN uses experience replay: at each time step t of the POMDP model, the memory tuple e_t = (s_t, a_t, r_t, s_{t+1}) is stored in the replay memory D_t = {e_1, ..., e_t};
    second, DQN maintains two separate Q-networks, Q(s, a; θ) and Q(s, a; θ⁻); the current parameters θ are updated many times per time step and are copied into the old parameters θ⁻ after every n iterations; at each update iteration, the current parameters θ are updated by optimizing a loss function that minimizes the mean-squared Bellman error with respect to the old parameters θ⁻; for each update i, a memory tuple (s, a, r, s') ~ U(D) is sampled uniformly from the replay memory D; for each sample, the current parameters θ are updated by stochastic gradient descent; the descent gradient g_i is obtained as the sample gradient of the loss with respect to θ, where the loss is defined relative to θ⁻;
    finally, at each time step t, the preferred action with respect to the current Q-network Q(s, a; θ) is selected; a central parameter server maintains a distributed representation of the Q-network Q(s, a; θ⁻); the parameter server also receives the gradient information produced by reinforcement learning and, driven by an asynchronous stochastic gradient descent algorithm, uses these gradients to modify the parameter vector θ⁻.
  5. The multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning according to claim 1, characterized in that step S4 is specifically: an addressing scheme for data channels based on memory mapping is designed; trigger mode, timing, and load capacity are considered together, and multiplexer switches and sample-and-hold circuits are coordinated to share the data interface channels; an autonomous control system with a redundant structure is designed to intelligently parse the control instructions produced by the fusion decision, taking into account power-supply output fluctuation, electromagnetic radiation, and interference from distributed capacitance and inductance, to complete the autonomous control of on-board equipment.
CN201711407168.4A 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning Active CN108021028B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711407168.4A CN108021028B (en) 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711407168.4A CN108021028B (en) 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Publications (2)

Publication Number Publication Date
CN108021028A (en) 2018-05-11
CN108021028B (en) 2019-04-09

Family

ID=62074474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711407168.4A Active CN108021028B (en) 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Country Status (1)

Country Link
CN (1) CN108021028B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108777872A (en) * 2018-05-22 2018-11-09 中国人民解放军陆军工程大学 Deep Q neural network anti-interference model and intelligent anti-interference algorithm
CN110533054A (en) * 2018-05-25 2019-12-03 中国电力科学研究院有限公司 Multi-mode self-adaptive machine learning method and device
CN110989602A (en) * 2019-12-12 2020-04-10 齐鲁工业大学 Method and system for planning paths of autonomous guided vehicle in medical pathological examination laboratory
CN113126498A (en) * 2021-04-17 2021-07-16 西北工业大学 Optimization control system and control method based on distributed reinforcement learning
CN113727420A (en) * 2021-09-03 2021-11-30 重庆邮电大学 Multimode access network selection device and method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101083656A (en) * 2007-07-05 2007-12-05 上海交通大学 Data stream technique based multi-source heterogeneous data integrated system
CN101094173A (en) * 2007-06-28 2007-12-26 上海交通大学 Integrated system of data interchange under distributed isomerical environment
CN101213111A (en) * 2005-06-03 2008-07-02 埃伦贝格尔及珀恩斯根有限公司 Multiplexing-system for controlling loads in boots or recreational vehicles
CN101222524A (en) * 2008-01-09 2008-07-16 华南理工大学 Distributed multi-sensor cooperated measuring method and system
CN101466111A (en) * 2009-01-13 2009-06-24 中国人民解放军理工大学通信工程学院 Dynamic spectrum access method based on policy-planning-constrained Q-learning
CN102207928A (en) * 2011-06-02 2011-10-05 河海大学常州校区 Reinforcement learning-based multi-Agent sewage treatment decision support system
CN106228314A (en) * 2016-08-11 2016-12-14 电子科技大学 Workflow scheduling method based on deep reinforcement learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LEI ZHANG, ETC: "Mining Semantically Consistent Patterns for Cross-View Data", 《IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING》 *
MAXIM EGOROV: "Deep Reinforcement Learning with POMDPs", 《IEEE》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108777872A (en) * 2018-05-22 2018-11-09 中国人民解放军陆军工程大学 Deep Q neural network anti-interference model and intelligent anti-interference algorithm
CN110533054A (en) * 2018-05-25 2019-12-03 中国电力科学研究院有限公司 Multi-mode self-adaptive machine learning method and device
CN110533054B (en) * 2018-05-25 2024-02-06 中国电力科学研究院有限公司 Multi-mode self-adaptive machine learning method and device
CN110989602A (en) * 2019-12-12 2020-04-10 齐鲁工业大学 Method and system for planning paths of autonomous guided vehicle in medical pathological examination laboratory
CN110989602B (en) * 2019-12-12 2023-12-26 齐鲁工业大学 Autonomous guided vehicle path planning method and system in medical pathology inspection laboratory
CN113126498A (en) * 2021-04-17 2021-07-16 西北工业大学 Optimization control system and control method based on distributed reinforcement learning
CN113727420A (en) * 2021-09-03 2021-11-30 重庆邮电大学 Multimode access network selection device and method
CN113727420B (en) * 2021-09-03 2023-05-23 重庆邮电大学 Multimode access network selection device and method

Also Published As

Publication number Publication date
CN108021028B (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN108021028A (en) A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning
Xu et al. Scalable learning paradigms for data-driven wireless communication
Ferreira et al. An approach to reservoir computing design and training
CN107241213A (en) A kind of web service composition method learnt based on deeply
CN114237041B (en) Space-ground cooperative fixed time fault tolerance control method based on preset performance
CN105760344B (en) A kind of distributed principal components analysis-artificial neural networks modeling method of exothermic chemical reaction
CN106529818A (en) Water quality evaluation prediction method based on fuzzy wavelet neural network
Gan et al. Research on role modeling and behavior control of virtual reality animation interactive system in Internet of Things
CN107451230A (en) A kind of answering method and question answering system
CN105068421A (en) Two-degree-of-freedom cooperative control method for multiple mobile robots
CN106897744A (en) A kind of self adaptation sets the method and system of depth confidence network parameter
CN112990485A (en) Knowledge strategy selection method and device based on reinforcement learning
CN107220758A (en) A kind of Electric Power Network Planning accessory system
Chen et al. Deep-broad learning system for traffic flow prediction toward 5G cellular wireless network
CN112906888A (en) Task execution method and device, electronic equipment and storage medium
Shahriari et al. Generic online learning for partial visible dynamic environment with delayed feedback: Online learning for 5G C-RAN load-balancer
Zhu et al. Tri-HGNN: Learning triple policies fused hierarchical graph neural networks for pedestrian trajectory prediction
CN117475518B (en) Synchronous human motion recognition and prediction method and system
CN108182476A (en) A kind of policy learning method controlled in intensified learning by wish
CN112446462A (en) Generation method and device of target neural network model
CN117012304B (en) Deep learning molecule generation system and method fused with GGNN-GAN
CN114281955A (en) Dialogue processing method, device, equipment and storage medium
CN117473616A (en) Railway BIM data edge caching method based on multi-agent reinforcement learning
CN117319232A (en) Multi-agent cluster consistency cooperative control method based on behavior prediction
CN113276114A (en) Reconfigurable mechanical arm cooperative force/motion control system and method based on terminal task assignment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant