CN108021028B - A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning - Google Patents


Info

Publication number
CN108021028B
Authority
CN
China
Prior art keywords
control
state
model
network
pomdp model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711407168.4A
Other languages
Chinese (zh)
Other versions
CN108021028A (en)
Inventor
李鹏华
王欢
李嫄源
朱智勤
张家昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201711407168.4A
Publication of CN108021028A
Application granted
Publication of CN108021028B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
    • G05B13/04 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators
    • G05B13/042 Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric involving the use of models or simulators in which a parameter or coefficient is automatically adjusted to optimise the performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Feedback Control In General (AREA)

Abstract

The present invention relates to a multi-dimensional cooperative control method for smart recreational vehicles (RVs) based on relevant redundant transformation and reinforcement learning, and belongs to the field of the Internet of Things. The method addresses the needs of smart commercial touring RVs for a unified device connection protocol, shared device interfaces, and autonomous cooperative control that raises the level of system integration. It uses an autonomous control guidance policy based on a POMDP (partially observable Markov decision process) model and deep reinforcement learning: the control states obtained through multi-dimensional intelligent fusion serve as the input of the computer control system, a POMDP model is established to perceive, adapt to, and track changes in device control states, and a policy optimization method based on deep reinforcement learning selects the optimal action policy, realizing autonomous cooperative control of the commercial touring RV. The invention improves the validity and timeliness of the final decision, the accuracy of interactive feedback, and the degree of policy learning optimization, thereby improving the user experience.

Description

A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning
Technical field
The invention belongs to the field of the Internet of Things and relates to a multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning.
Background
The smart recreational vehicle (RV) is a product of the deep integration of the intelligent connected vehicle and the smart home. It uses multi-sensor data acquisition and in-vehicle gateway communication to control on-board equipment intelligently, meeting people's demand for spatial experience and smart living in an RV. Intelligent control is one of the core technologies of the smart RV, and the timeliness and accuracy with which control strategies are executed directly determine the quality of the vehicle. Smart RVs currently on the market, however, suffer from problems such as a single control mode, insufficiently intelligent generation of control strategies, and excessive execution cost. For this reason, this patent unifies and fuses multi-source heterogeneous information features to integrate data from multiple sensors, laying the foundation for control in multi-modal, complex environments. It combines a control-state guidance policy under a POMDP model with optimization of that guidance policy by deep reinforcement learning, making control decisions more accurate and intelligent. It also uses bus-based low-level control, which greatly reduces the cost of connecting sensors, improves the reliability of the whole sensing platform, and saves a large amount of computing resources.
Summary of the invention
In view of this, the purpose of the present invention is to provide a multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning.
To achieve the above purpose, the invention provides the following technical solution:
A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning, comprising the following steps:
S1: unify and fuse multi-source heterogeneous information features;
S2: guide the control-state policy using a POMDP model;
S3: optimize the control-state guidance policy using deep reinforcement learning;
S4: apply bus-based distributed low-level control.
Further, step S1 is specifically: in a multi-sensor network environment, the heterogeneous information from multiple sensors is analyzed by the canonical correlation analysis (CCA) algorithm and the isomorphic relevant redundant transformation (IRRT) algorithm, which map the multiple kinds of heterogeneous information into a unified space of computable dimension; the feature information is given a unified representation and the information is then fused.
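The CCA part of step S1 can be illustrated with a minimal sketch. It assumes two sensor views with paired samples, a small regularization term `reg`, and the function name `cca_project`; all of these are illustrative choices rather than details given in the patent, and the IRRT algorithm is not reproduced here.

```python
import numpy as np


def cca_project(X, Y, k, reg=1e-8):
    """Map two views X (n, p) and Y (n, q) into a shared k-dimensional space.

    Returns the projected views and the top-k canonical correlations.
    `reg` is an assumed ridge term that keeps the covariances invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view with its Cholesky factor: M = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # Singular values of M are the canonical correlations.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])
    return Xc @ Wx, Yc @ Wy, s[:k]
```

Once both views live in the same k-dimensional space, a simple fusion such as concatenating or averaging the two projections becomes possible; which fusion rule the patent intends is not stated.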
Further, step S2 is specifically: using the control states of the various devices of the commercial touring RV obtained by multi-source heterogeneous fusion, a POMDP model is established to perceive, adapt to, and track changes in device control states. The internal actuator of the POMDP model applies an action to a device control state, which causes the device control state to change and yields a certain reward. The likelihood of a sequence of executed policies is measured by the accumulated reward, so that the device control problem of the commercial touring RV is converted into a policy selection problem. Specifically, the POMDP model is described as $\langle S, A, T, O, Q, \beta \rangle$. The belief state, that is, the probability distribution of the POMDP model over the overall environment state, is written $B = \{b_t\}$, and the probability distribution at time $t$ is $b_t = \{b_t(s_1), \ldots, b_t(s_m)\}$, where $b_t(s_i)$ is the probability that the environment state at time $t$ is $s_i$. From the observation of the controlled environment at the current time and the selected action, the POMDP model infers the belief over the control state at the next time step. Assume the belief state at the initial time is $b_0$; executing action $a$ and receiving observation $o$ gives the belief state $b_1$ at the next time step. When the control state is $S_1$, the model receives observation $O_1$ and its internal state is $i_1$. By computation, the corresponding action $a_1$ is selected according to the control-state guidance policy, causing the environment state to move from $S_1$ to $S_2$; the model receives reward $r_1$ and observation $O_2$, its internal state moves from $i_1(b_1)$ to $i_2(b_2)$, and the model continues to run in this way;
Specifically, a guidance-policy estimation function is constructed for the problem to realize dialogue state tracking; the function is [formula missing from the source text], where [symbol missing from the source text] is the value, at state $s$, of the action vector corresponding to node $n$. Evolving the control-state policy gives the control-state guidance policy function at the next time step, [formula missing from the source text], where [symbol missing from the source text] denotes the optimal policy and $V_t^*$ denotes the policy function at the previous time step.
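The step from $b_0$ to $b_1$ described above can be sketched with the standard Bayesian belief update for a discrete POMDP. The patent does not define the transition or observation tables, so the names `trans` and `obs_prob` and their index layout are assumptions made for this illustration.

```python
def belief_update(belief, action, observation, trans, obs_prob):
    """Return the next belief b' after taking `action` and seeing `observation`.

    Assumed layout (not from the patent):
      trans[a][s][s2]   = P(s2 | s, a)
      obs_prob[a][s2][o] = P(o | s2, a)
    Update rule: b'(s2) is proportional to
      obs_prob[a][s2][o] * sum_s trans[a][s][s2] * b(s).
    """
    n = len(belief)
    unnormalized = []
    for s_next in range(n):
        predicted = sum(trans[action][s][s_next] * belief[s] for s in range(n))
        unnormalized.append(obs_prob[action][s_next][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in unnormalized]


# Hypothetical two-state device (0 = off, 1 = on) with a single action.
trans = [[[0.9, 0.1], [0.1, 0.9]]]
obs_prob = [[[0.8, 0.2], [0.3, 0.7]]]
b1 = belief_update([0.5, 0.5], action=0, observation=1,
                   trans=trans, obs_prob=obs_prob)
```

In this toy case the observation favors the "on" state, so the belief shifts toward state 1 while still summing to one.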
Further, step S3 is specifically: the guidance policy for the device control states of the commercial touring RV is obtained from the POMDP model, and a policy optimization method based on deep reinforcement learning, the deep Q-network (DQN), selects the optimal action policy. Specifically, a Q-network $Q(s, a; \theta)$ defines the action policy, a target Q-network $Q(s, a; \theta^-)$ generates the target Q-values for the DQN loss term, and a replay memory of the POMDP model supplies randomly sampled state values for training the Q-network. Reinforcement learning defines the expected total return of the POMDP model as $R_t = \sum_{t'=t}^{T} \gamma^{t'-t} r_{t'}$, where the reward $r_t$ is discounted by a factor $\gamma \in [0, 1]$ per time step and $T$ is the terminal step. The action-value function $Q^\pi(s, a) = \mathbb{E}[R_t \mid s_t = s, a_t = a, \pi]$ is the expected return after observing state $s_t$ and taking action $a$ under guidance policy $\pi$, and a neural network $Q(s, a) = Q(s, a; \theta)$ approximates it. The optimal action-value function is attained by the best policy, $Q^*(s, a) = \max_\pi Q^\pi(s, a)$, and satisfies the Bellman equation $Q^*(s, a) = \mathbb{E}_{s'}[r + \gamma \max_{a'} Q^*(s', a') \mid s, a]$, which is solved by adjusting the Q-network iteratively toward the Bellman target;
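As a small worked illustration of the discounted return and the Bellman target named in step S3, the following sketch uses plain Python lists. The function names and the `terminal` flag are assumptions introduced for the example, not part of the patent.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards discounted by gamma per time step, from step t to T."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def bellman_target(reward, next_q_values, gamma, terminal=False):
    """One-step target r + gamma * max_a' Q(s', a').

    `terminal` is an assumed flag: at the terminal step only the reward counts.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)
```

For example, rewards of 1 at three consecutive steps with a discount of 0.5 give 1 + 0.5 + 0.25 = 1.75.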
First, DQN uses experience replay: at each time step $t$ of the POMDP model, the experience tuple $e_t = (s_t, a_t, r_t, s_{t+1})$ is stored in the replay memory $D_t = \{e_1, \ldots, e_t\}$;
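A replay memory of this kind might look like the sketch below. The fixed `capacity` (oldest tuples are dropped first) and the class name `ReplayMemory` are assumptions; the patent only states that tuples are stored and later sampled.

```python
import random
from collections import deque


class ReplayMemory:
    """Stores experience tuples (s, a, r, s_next) and samples them uniformly."""

    def __init__(self, capacity, seed=None):
        # Bounded buffer: an assumed design choice, not stated in the patent.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, s, a, r, s_next):
        self._buffer.append((s, a, r, s_next))

    def sample(self, batch_size):
        # Uniform sampling without replacement, i.e. (s, a, r, s') ~ U(D).
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self):
        return len(self._buffer)
```

Sampling uniformly from stored tuples breaks the correlation between consecutive time steps, which is the usual motivation for experience replay.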
Second, DQN maintains two separate Q-networks, $Q(s, a; \theta)$ and $Q(s, a; \theta^-)$. The current parameters $\theta$ may be updated many times per time step and are copied into the old parameters $\theta^-$ after $n$ iterations. At each update iteration, the current parameters $\theta$ are updated to minimize the mean-squared Bellman error with respect to the old parameters $\theta^-$ by optimizing the loss function $L_i(\theta_i) = \mathbb{E}[(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i))^2]$. For each update $i$, an experience tuple $(s, a, r, s') \sim U(D)$ is sampled uniformly from the replay memory $D$. For each sample, the current parameters $\theta$ are updated by stochastic gradient descent; the update direction $g_i$ is the sample gradient of the loss with respect to $\theta$, with $\theta^-$ held fixed: $g_i = (r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)) \nabla_{\theta_i} Q(s, a; \theta_i)$;
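The interplay of current and old parameters can be sketched with a table standing in for the neural network, so that each parameter is one table entry `theta[s][a]`. This tabular substitution, the function names, and the omission of terminal-state handling are simplifying assumptions for illustration only.

```python
import copy


def dqn_step(theta, theta_old, batch, gamma, lr):
    """Apply one stochastic-gradient update per sampled tuple.

    theta and theta_old are tables indexed as table[state][action];
    theta_old plays the role of the target network and is never modified here.
    """
    for s, a, r, s_next in batch:
        target = r + gamma * max(theta_old[s_next])  # uses old parameters only
        td_error = target - theta[s][a]
        # Gradient of (target - Q)^2 w.r.t. theta[s][a] is -2 * td_error,
        # so descending it adds lr * 2 * td_error.
        theta[s][a] += lr * 2.0 * td_error
    return theta


def sync_target(theta):
    """Copy current parameters into a fresh set of old parameters."""
    return copy.deepcopy(theta)
```

Keeping `theta_old` frozen between synchronizations holds the regression target still while `theta` moves, which is the stated purpose of maintaining two networks.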
Finally, at each time step $t$, the preferred action with respect to the current Q-network $Q(s, a; \theta)$ is selected. A central parameter server maintains a distributed representation of the Q-network $Q(s, a; \theta^-)$. The parameter server receives the gradient information produced by reinforcement learning and, driven by an asynchronous stochastic gradient descent algorithm, uses these gradients to modify the parameter vector $\theta^-$.
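The parameter-server arrangement can be sketched with threads standing in for distributed learners. The toy quadratic objective, the class and function names, and the lock-based server are all assumptions chosen to keep the example self-contained; a real system would push Q-network gradients instead.

```python
import threading


class ParameterServer:
    """Holds a shared parameter vector and applies gradients as they arrive."""

    def __init__(self, params, lr):
        self._params = list(params)
        self._lr = lr
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return list(self._params)

    def push(self, grads):
        # Gradients may be stale: they were computed from an earlier pull.
        with self._lock:
            for i, g in enumerate(grads):
                self._params[i] -= self._lr * g


def worker(server, goal, steps):
    """Toy learner: gradient of sum((p - goal)^2) from possibly stale params."""
    for _ in range(steps):
        params = server.pull()
        server.push([2.0 * (p - g) for p, g in zip(params, goal)])


def run_async_sgd(initial, goal, n_workers=4, steps=50, lr=0.05):
    server = ParameterServer(initial, lr)
    threads = [threading.Thread(target=worker, args=(server, goal, steps))
               for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return server.pull()
```

Workers never wait for one another; the server simply applies whichever gradient arrives next, which is what makes the scheme asynchronous.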
Further, step S4 is specifically: design a memory-mapped addressing mode for the data channels that takes the trigger mode, timing, and load capacity into account and works together with multiplexers and sample-and-hold circuits to share the data interface channels; design an autonomous control system with a redundant structure that intelligently parses the control instructions obtained from fused decision-making, allows for interference factors such as power-supply output fluctuation, electromagnetic radiation, and distributed capacitance and inductance, and completes the autonomous control of the on-board equipment.
The beneficial effects of the present invention are as follows. The invention is a multi-dimensional cooperative control method for smart RVs based on relevant redundant transformation and reinforcement learning. It covers device operating-state monitoring, human-like perception of the driving environment, recognition of human body characteristics, inference of user intent, resource information interaction, and autonomous control of vehicle equipment. Compared with other methods, this patent addresses the needs of smart commercial touring RVs for a unified device connection protocol, shared device interfaces, and autonomous cooperative control that raises the level of system integration. It uses an autonomous control guidance policy based on a POMDP model and deep reinforcement learning: the control states obtained through multi-dimensional intelligent fusion serve as the input of the computer control system, a POMDP model is established to perceive, adapt to, and track changes in device control states, and a policy optimization method based on deep reinforcement learning (DQN) selects the optimal action policy, realizing autonomous cooperative control of the commercial touring RV. Combining the POMDP model with deep reinforcement learning produces the optimal control decision for the smart RV under multi-modal conditions and complex environments, which improves the validity and timeliness of the final decision, the accuracy of interactive feedback, and the degree of policy learning optimization, thereby improving the user experience.
Description of the drawings
To make the purpose, technical solution, and beneficial effects of the present invention clearer, the following drawings are provided for illustration:
Fig. 1 shows the scheme for unified representation and fusion of multi-source heterogeneous features;
Fig. 2 shows the control-state guidance policy of the POMDP model;
Fig. 3 shows the control-state policy optimization model based on deep reinforcement learning.
Specific embodiment
A preferred embodiment of the present invention is described in detail below with reference to the drawings.
As shown in Figs. 1, 2, and 3, the implementation details of each part of the invention are as follows:
1. Unification and fusion of multi-source heterogeneous information features. The process comprises the following 5 steps:
(1) multi-modal data acquisition;
(2) multi-modal feature extraction;
(3) feature association;
(4) unified representation of multi-modal features;
(5) fusion of multi-source heterogeneous information features.
2. Design of the control-state guidance policy based on the POMDP model. The process comprises the following 4 steps:
(1) establish a POMDP model to perceive, adapt to, and track changes in device control states;
(2) the actuator applies an action to the device control state and obtains a certain reward;
(3) measure the likelihood of a sequence of executed policies according to the accumulated reward;
(4) select among the resulting policies.
3. Optimization of the control-state guidance policy based on deep reinforcement learning. The process comprises the following 3 steps:
(1) define the action policy with a Q-network;
(2) generate the target Q-values for the DQN loss term;
(3) use a replay memory of the POMDP model to supply randomly sampled state values for training the Q-network.
4. Bus-based distributed low-level control. A centralized control scheme based on a CAN-bus smart RV gateway is adopted. A smart RV interface unit module is introduced to isolate the diversity of controlled objects effectively and reduce system complexity. Keyboard control and remote control are separated from the smart RV gateway, which improves the gateway's reliability. The internal network of the smart RV uses the CAN bus, which reduces system cost and supports system scalability, and the multi-master nature of the CAN bus is used to give controlled objects plug-and-play capability.
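The multi-master behaviour mentioned in part 4 rests on CAN arbitration, in which the pending frame with the numerically lowest identifier wins the bus. The sketch below models only that rule and a payload encoding; the device identifiers, the command byte `0x01`, and the payload layout are hypothetical and are not specified in the patent.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class CanFrame:
    can_id: int   # 11-bit standard identifier
    data: bytes   # classic CAN carries at most 8 data bytes

    def __post_init__(self):
        if not 0 <= self.can_id <= 0x7FF:
            raise ValueError("standard CAN identifiers are 11 bits")
        if len(self.data) > 8:
            raise ValueError("classic CAN payload is at most 8 bytes")


def encode_setpoint(device_id, value):
    """Pack a hypothetical 'set value' command: 1 command byte + int16 value."""
    return CanFrame(device_id, struct.pack("<Bh", 0x01, value))


def arbitrate(pending):
    """Return the frame that wins arbitration: the lowest identifier."""
    return min(pending, key=lambda frame: frame.can_id)
```

Because any node may start transmitting and priority is settled by identifier alone, a newly attached device only needs an unused identifier to join the bus, which is one way to read the plug-and-play claim.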
Finally, it should be noted that the above preferred embodiment is only intended to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail through the above preferred embodiment, those skilled in the art should understand that various changes in form and detail may be made to it without departing from the scope defined by the claims of the present invention.

Claims (3)

1. A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning, characterized in that the method comprises the following steps:
S1: unify and fuse multi-source heterogeneous information features;
S2: guide the control-state policy using a POMDP model;
S3: optimize the control-state guidance policy using deep reinforcement learning;
S4: apply bus-based distributed low-level control;
Step S2 is specifically: using the control states of the various devices of the commercial touring RV obtained by multi-source heterogeneous fusion, a POMDP model is established to perceive, adapt to, and track changes in device control states; the internal actuator of the POMDP model applies an action to a device control state, which causes the device control state to change and yields a certain reward; the likelihood of a sequence of executed policies is measured by the accumulated reward, so that the device control problem of the commercial touring RV is converted into a policy selection problem; specifically, the POMDP model is described as $\{S, A, T, O, Q, \beta\}$; the belief state, that is, the probability distribution of the POMDP model over the overall environment state, is written $B = \{b_t\}$, and the probability distribution at time $t$ is $b_t = \{b_t(s_1), \ldots, b_t(s_m)\}$, where $b_t(s_i)$ is the probability that the environment state at time $t$ is $s_i$; from the observation of the controlled environment at the current time and the selected action, the POMDP model infers the belief over the control state at the next time step; assume the belief state at the initial time is $b_0$; executing action $a$ and receiving observation $o$ gives the belief state $b_1$ at the next time step; when the control state is $S_1$, the model receives observation $O_1$ and its internal state is $i_1$; by computation, the corresponding action $a_1$ is selected according to the control-state guidance policy, causing the environment state to move from $S_1$ to $S_2$; the model receives reward $r_1$ and observation $O_2$, its internal state moves from $i_1(b_1)$ to $i_2(b_2)$, and the model continues to run in this way;
Specifically, a guidance-policy estimation function is constructed for the problem to realize dialogue state tracking; the function is [formula missing from the source text], where [symbol missing from the source text] is the value, at state $s$, of the action vector corresponding to node $n$; evolving the control-state policy gives the control-state guidance policy function at the next time step, [formula missing from the source text], where [symbol missing from the source text] denotes the optimal policy and $V_t^*$ denotes the policy function at the previous time step;
Step S3 is specifically: the guidance policy for the device control states of the commercial touring RV is obtained from the POMDP model, and a policy optimization method based on deep reinforcement learning, the deep Q-network (DQN), selects the optimal action policy; specifically, a Q-network $Q(s, a; \theta)$ defines the action policy, a target Q-network $Q(s, a; \theta^-)$ generates the target Q-values for the DQN loss term, and a replay memory of the POMDP model supplies randomly sampled state values for training the Q-network; reinforcement learning defines the expected total return of the POMDP model as $R_t = \sum_{t'=t}^{T} \gamma^{t'-t} r_{t'}$, where the reward $r_t$ is discounted by a factor $\gamma \in [0, 1]$ per time step and $T$ is the terminal step; the action-value function $Q^\pi(s, a) = \mathbb{E}[R_t \mid s_t = s, a_t = a, \pi]$ is the expected return after observing state $s_t$ and taking action $a$ under guidance policy $\pi$, and a neural network $Q(s, a) = Q(s, a; \theta)$ approximates it; the optimal action-value function is attained by the best policy, $Q^*(s, a) = \max_\pi Q^\pi(s, a)$, and satisfies the Bellman equation $Q^*(s, a) = \mathbb{E}_{s'}[r + \gamma \max_{a'} Q^*(s', a') \mid s, a]$, which is solved by adjusting the Q-network iteratively toward the Bellman target;
First, DQN uses experience replay: at each time step $t$ of the POMDP model, the experience tuple $e_t = (s_t, a_t, r_t, s_{t+1})$ is stored in the replay memory $D_t = \{e_1, \ldots, e_t\}$;
Second, DQN maintains two separate Q-networks, $Q(s, a; \theta)$ and $Q(s, a; \theta^-)$; the current parameters $\theta$ may be updated many times per time step and are copied into the old parameters $\theta^-$ after $n$ iterations; at each update iteration, the current parameters $\theta$ are updated to minimize the mean-squared Bellman error with respect to the old parameters $\theta^-$ by optimizing the loss function $L_i(\theta_i) = \mathbb{E}[(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i))^2]$; for each update $i$, an experience tuple $(s, a, r, s') \sim U(D)$ is sampled uniformly from the replay memory $D$; for each sample, the current parameters $\theta$ are updated by stochastic gradient descent; the update direction $g_i$ is the sample gradient of the loss with respect to $\theta$, with $\theta^-$ held fixed: $g_i = (r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)) \nabla_{\theta_i} Q(s, a; \theta_i)$;
Finally, at each time step $t$, the preferred action with respect to the current Q-network $Q(s, a; \theta)$ is selected; a central parameter server maintains a distributed representation of the Q-network $Q(s, a; \theta^-)$; the parameter server receives the gradient information produced by reinforcement learning and, driven by an asynchronous stochastic gradient descent algorithm, uses these gradients to modify the parameter vector $\theta^-$.
2. The multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning according to claim 1, characterized in that step S1 is specifically: in a multi-sensor network environment, the heterogeneous information from multiple sensors is analyzed by the canonical correlation analysis (CCA) algorithm and the isomorphic relevant redundant transformation (Isomorphic Relevant Redundant Transformation, IRRT) algorithm, which map the multiple kinds of heterogeneous information into a unified space of computable dimension; the feature information is given a unified representation and the information is then fused.
3. The multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning according to claim 1, characterized in that step S4 is specifically: design a memory-mapped addressing mode for the data channels that takes the trigger mode, timing, and load capacity into account and works together with multiplexers and sample-and-hold circuits to share the data interface channels; design an autonomous control system with a redundant structure that intelligently parses the control instructions obtained from fused decision-making, allows for interference factors such as power-supply output fluctuation, electromagnetic radiation, and distributed capacitance and inductance, and completes the autonomous control of the on-board equipment.
CN201711407168.4A 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning Active CN108021028B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711407168.4A CN108021028B (en) 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711407168.4A CN108021028B (en) 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Publications (2)

Publication Number Publication Date
CN108021028A (en) 2018-05-11
CN108021028B (en) 2019-04-09

Family

ID=62074474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711407168.4A Active CN108021028B (en) 2017-12-22 2017-12-22 A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning

Country Status (1)

Country Link
CN (1) CN108021028B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108777872B (en) * 2018-05-22 2020-01-24 中国人民解放军陆军工程大学 Intelligent anti-interference method and intelligent anti-interference system based on deep Q neural network anti-interference model
CN110533054B (en) * 2018-05-25 2024-02-06 中国电力科学研究院有限公司 Multi-mode self-adaptive machine learning method and device
CN110989602B (en) * 2019-12-12 2023-12-26 齐鲁工业大学 Autonomous guided vehicle path planning method and system in medical pathology inspection laboratory
CN113126498A (en) * 2021-04-17 2021-07-16 西北工业大学 Optimization control system and control method based on distributed reinforcement learning
CN113727420B (en) * 2021-09-03 2023-05-23 重庆邮电大学 Multimode access network selection device and method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101083656A (en) * 2007-07-05 2007-12-05 上海交通大学 Data stream technique based multi-source heterogeneous data integrated system
CN101094173A (en) * 2007-06-28 2007-12-26 上海交通大学 Integrated system of data interchange under distributed isomerical environment
CN101213111A (en) * 2005-06-03 2008-07-02 埃伦贝格尔及珀恩斯根有限公司 Multiplexing-system for controlling loads in boots or recreational vehicles
CN101222524A (en) * 2008-01-09 2008-07-16 华南理工大学 Distributed multi-sensor cooperated measuring method and system
CN101466111A (en) * 2009-01-13 2009-06-24 中国人民解放军理工大学通信工程学院 Dynamic spectrum access method based on policy planning constrain Q study
CN102207928A (en) * 2011-06-02 2011-10-05 河海大学常州校区 Reinforcement learning-based multi-Agent sewage treatment decision support system
CN106228314A (en) * 2016-08-11 2016-12-14 电子科技大学 The workflow schedule method of study is strengthened based on the degree of depth


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Deep Reinforcement Learning with POMDPs; Maxim Egorov; IEEE; 2015-12-11; main text, pages 1-5
Mining Semantically Consistent Patterns for Cross-View Data; Lei Zhang, et al.; IEEE Transactions on Knowledge and Data Engineering; 2014-12-31; main text, pages 1-4

Also Published As

Publication number Publication date
CN108021028A (en) 2018-05-11

Similar Documents

Publication Publication Date Title
CN108021028B (en) A multi-dimensional cooperative control method based on relevant redundant transformation and reinforcement learning
Ye et al. Machine learning for vehicular networks: Recent advances and application examples
Sunhare et al. Internet of things and data mining: An application oriented survey
Xu et al. Scalable learning paradigms for data-driven wireless communication
CN110481536B (en) Control method and device applied to hybrid electric vehicle
CN101222524B (en) Distributed multi-sensor cooperated measuring system
CN102200787B (en) Robot behaviour multi-level integrated learning method and robot behaviour multi-level integrated learning system
CN114237041B (en) Space-ground cooperative fixed time fault tolerance control method based on preset performance
CN108153259B (en) Multi-controller optimal state estimation control strategy design method based on Kalman filtering
WO2018120963A1 (en) Feedback-based self-adaptive subjective and objective weight context awareness system and working method thereof
CN110501917A (en) The system and method for realizing internet of things intelligent household information management using cloud computing
CN112949472A (en) Cooperative sensing method based on multi-sensor information fusion
CN113538162A (en) Planting strategy generation method and device, electronic equipment and storage medium
Shahriari et al. Generic online learning for partial visible dynamic environment with delayed feedback: Online learning for 5G C-RAN load-balancer
CN117475518B (en) Synchronous human motion recognition and prediction method and system
CN108182476A (en) A policy learning method controlled by intention in reinforcement learning
CN112965381B (en) Method for establishing cooperative intelligent self-adaptive decision model
Sun et al. Optimal tracking control of switched systems applied in grid-connected hybrid generation using reinforcement learning
CN113276114A (en) Reconfigurable mechanical arm cooperative force/motion control system and method based on terminal task assignment
Shang et al. [Retracted] Human‐Computer Interaction of Networked Vehicles Based on Big Data and Hybrid Intelligent Algorithm
Mou et al. Biologically inspired machine learning-based trajectory analysis in intelligent dispatching energy storage system
CN117077511A (en) Multi-element load prediction method, device and storage medium based on improved firefly algorithm and SVR
Ge et al. An AFD-based ILC dynamics adaptive matching method in frequency domain for distributed consensus control of unknown multiagent systems
Chen et al. FUNOff: Offloading applications at function granularity for mobile edge computing
CN117220287B (en) Generating capacity prediction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant