CN111353644B - Prediction model generation method of intelligent network cloud platform based on reinforcement learning - Google Patents
- Publication number
- CN111353644B CN111353644B CN202010122791.0A CN202010122791A CN111353644B CN 111353644 B CN111353644 B CN 111353644B CN 202010122791 A CN202010122791 A CN 202010122791A CN 111353644 B CN111353644 B CN 111353644B
- Authority
- CN
- China
- Prior art keywords
- model
- prediction
- network
- automobile
- cloud platform
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Development Economics (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Traffic Control Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a prediction model generation method for an intelligent networked cloud platform based on reinforcement learning, relating to the technical field of intelligent networked automobile cloud platform prediction. An automobile information prediction network model is generated from a plurality of acquired automobile information data and operator character sequences, combined with an RNN Controller network and a model parser. Because each operator character sequence is composed of multiple operators, the generated automobile information prediction network model contains diverse operators, has structural diversity, and can achieve a better prediction effect. A reinforcement learning approach is adopted in the automobile information prediction network model for the first time, transforming the identification of network structures into a prediction problem and yielding a general automatic prediction-model generation method that selects a good network structure from a large search space; the structure does not need to be designed manually, saving time and labor cost and improving efficiency. The idea of weight sharing is adopted, so search efficiency is improved by roughly 1000 times compared with a non-shared model.
Description
Technical Field
The invention relates to a prediction model generation method of an intelligent networking cloud platform based on reinforcement learning, and belongs to the technical field of intelligent networking automobile cloud platform prediction.
Background
The intelligent networked automobile cloud platform is connected with an entire automobile enterprise, an automobile owner (user) and terminal equipment (an intelligent networked automobile and the like), and records the operation record of the automobile owner (user) and the operation state of the terminal equipment (the intelligent networked automobile) in real time.
A review of current intelligent networked automobile cloud platform prediction models shows that present network-structure prediction models rely on manual, fine-grained feature selection and structural model design. Designing a novel model or algorithm requires an algorithm engineer with a solid theoretical foundation as well as strong engineering and innovation capability, and it takes the engineer a long time to produce an effective model. For example, common recommendation-model structures such as Logistic Regression (LR), Factorization Machine (FM), FFM, DNN, DeepFM, DCN, XDeepFM, and FiBiNET all consume a great deal of an algorithm engineer's time and labor, so efficiency is relatively low.
Disclosure of Invention
The invention provides a prediction model generation method for an intelligent networked cloud platform based on reinforcement learning; the model generation efficiency is high, and the generated model can greatly improve prediction precision.
In order to alleviate the above problems, the technical scheme adopted by the invention is as follows:
the invention provides a prediction model generation method of an intelligent networking cloud platform based on reinforcement learning, which comprises the following steps:
S1, acquiring a plurality of automobile information data from an intelligent networked automobile cloud platform;
S2, preprocessing the automobile information data to form an automobile information data set, and dividing the automobile information data set into a training data set and a test data set;
S3, selecting a plurality of types of network structure models, abstracting and summarizing a plurality of operators from the network structure models, and forming an operator data set;
S4, constructing a model generation architecture, wherein the model generation architecture comprises an RNN Controller network and a model parser;
S5, initializing the iteration number K = 0, and setting an iteration number threshold K_m;
S6, the RNN Controller network randomly generates S different operator character sequences according to the operator data set, each operator character sequence being composed of a plurality of operators randomly sampled from the operator data set;
S7, the model parser converts the S operator character sequences into S sub-models respectively; if K = 0, the parameters of each current sub-model are randomly initialized, and if K ≠ 0, the parameters of the S sub-models obtained in the previous training round are used to initialize the parameters of the current S sub-models;
S8, training each current sub-model on the training data set, saving the parameters of each sub-model, and evaluating each currently trained sub-model on the test data set to obtain S Rewards respectively, with K = K + 1;
S9, if K = K_m, selecting the sub-model with the best Reward among the S currently trained sub-models as the automobile information prediction network model to be output, completing the generation of the automobile information prediction network model; otherwise, updating the parameters of the RNN Controller network according to the current Rewards by a Policy Gradient reinforcement learning algorithm, and then jumping to step S6.
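The loop of steps S5 to S9 can be sketched as a single search routine. In this sketch all four callables (sequence sampler, parser, sub-model trainer, and controller update) are hypothetical placeholders standing in for the RNN Controller, the model parser, and the TensorFlow training code; only the control flow follows the patent:

```python
def search_loop(sample_sequences, parse_to_model, train_and_eval,
                update_controller, S=2, K_m=2000):
    """Sketch of steps S5-S9: alternately sample operator sequences,
    train the resulting sub-models, and update the controller from
    the rewards. All callables are placeholder stubs."""
    shared_params = [None] * S              # weight sharing across rounds (S7)
    for K in range(K_m):
        sequences = sample_sequences(S)                     # S6
        models = [parse_to_model(seq, shared_params[i])     # S7
                  for i, seq in enumerate(sequences)]
        rewards = []
        for i, m in enumerate(models):                      # S8
            reward, params = train_and_eval(m)
            shared_params[i] = params       # save parameters for next round
            rewards.append(reward)
        if K == K_m - 1:                                    # S9: K reached K_m
            best = max(range(S), key=lambda i: rewards[i])
            return models[best]             # best-reward sub-model is output
        update_controller(rewards)          # S9: policy-gradient step, loop to S6
```

A call would pass in the real sampler, parser, and trainer; the stub structure only illustrates where weight sharing and the termination test sit in the loop.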
The technical effect of the technical scheme is as follows: operators of the generated automobile information prediction network model are various, and the automobile information prediction network model has structural diversity and can achieve a better prediction effect; the method is characterized in that a reinforcement learning thought is initially adopted in the automobile information prediction network model, the recognition network structure is transformed into the prediction problem, a set of general automatic prediction model generation method is adapted, a good network structure thought is selected from a large amount of search spaces, the structure is not required to be designed manually, the time and the working cost are saved, and the efficiency is higher; and a weight sharing idea is adopted, the trained parameters in the previous round are used as the parameters of the current sub-model and then the sub-model is studied again, so that the searching efficiency is improved and is about 1000 times faster than that of a non-shared model.
Specifically, in step S1, the automobile information data is either vehicle information data or automobile-user personal information data; the vehicle information data comprises vehicle parameter data, vehicle real-time driving data, and current road environment data of the vehicle.
The technical effect of the technical scheme is as follows: when vehicle information data is input, the finally obtained automobile information prediction network model can be used to predict the driving behavior of an intelligent networked automobile; when automobile-user personal information data is input, the finally obtained automobile information prediction network model can be used for automobile price prediction.
More specifically, step S2 specifically comprises: arranging the automobile information data into a plurality of training examples in the form <X, Y>, the training examples together forming the training data set, where X is the information data feature and Y is the prediction target.
The technical effect of the technical scheme is as follows: the automobile information data is arranged into a form of < X, Y >, so that the finally obtained automobile information prediction network model has strong field universality and can be adapted to different prediction tasks, and the automobile information prediction network model is actually tested and applied in different application scenes (such as prediction of intelligent internet automobile driving operation behaviors, prediction of user click conditions, recommendation of automobiles with proper price according to user conditions and the like) and comprises a recommendation system in the current internet field.
Optionally, in step S3, there are ten types of network structure models: FM, PNN, PIN, HNN, DeepFM, FiBiNET, DCN, XDeepFM, AFM, and AutoInt.
The technical effect of the technical scheme is as follows: the ten types are the existing mainstream models, and operators extracted from the ten types can enable the automobile information prediction network model to have diversity and excellent prediction effect.
Optionally, the RNN Controller network is a two-layer LSTM network with a hidden layer size of 256; the model parser is a decoding algorithm that stacks the operator character sequences into a TensorFlow model structure.
Optionally, in step S5, K_m = 2000.
Optionally, S = 2.
The technical effect of the technical scheme is as follows: the finally generated automobile information prediction network model can be optimal, and the prediction effect is best.
Optionally, the automobile information prediction network model is saved in a checkpoint file format.
Optionally, the sub-model is a TensorFlow model.
The technical effect of the technical scheme is as follows: the test process of the model is simple, and the model can be conveniently deployed to an intelligent networking automobile cloud platform.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a flowchart of a prediction model generation method of an intelligent networked cloud platform based on reinforcement learning in an embodiment;
FIG. 2 is a diagram of relationships between RNN Controller networks, model resolvers, and sub-models in an embodiment;
FIG. 3 is a schematic diagram of the RNN Controller network generating the operator character sequence according to the embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to fig. 1, fig. 2 and fig. 3, the embodiment provides a prediction model generation method for an intelligent networking cloud platform based on reinforcement learning, including the following steps:
s1, obtaining a plurality of automobile information data from an intelligent networking automobile cloud platform.
In this embodiment, the automobile information data is vehicle information data, which includes vehicle parameter data, vehicle real-time driving data, and current road environment data of the vehicle.
The vehicle parameter data is the basic parameter content of the vehicle recorded in the specification after the vehicle leaves a factory, wherein the basic parameter content comprises engine parameters, power battery parameters and sensor parameter data; the sensor parameter data relates to laser radar parameter data, millimeter wave radar parameter data, ultrasonic radar parameter data, infrared night vision device parameter data, high definition camera parameter data, corner sensor parameter data and rotation speed sensor parameter data.
The vehicle real-time driving data comprises vehicle real-time steering data, vehicle real-time braking data, vehicle real-time driving data and vehicle real-time sensor detection data.
The current road environment data of the vehicle comprises traffic light indication information data and pedestrian position and flow information data.
S2, preprocessing the automobile information data, forming an automobile information data set, and dividing the automobile information data set into a training data set and a testing data set.
In this embodiment, the automobile information data is organized into a number of training instances in the form <X, Y>, and these instances together constitute the automobile information data set. X is the information data feature, namely <vehicle parameter data, vehicle real-time driving data, current road environment data>; Y is the prediction target, which in this embodiment is the next driving behavior of the intelligent networked automobile, including steering behavior, braking behavior, and driving behavior.
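The arrangement of raw records into <X, Y> training examples and the train/test split of step S2 can be sketched as follows; the field names and the 80/20 split ratio are illustrative assumptions, not values from the patent:

```python
def build_examples(records):
    """Arrange raw cloud-platform records into <X, Y> training examples
    (step S2). Field names are illustrative placeholders."""
    examples = []
    for r in records:
        # X = <vehicle parameter data, real-time driving data, road environment data>
        X = (r["vehicle_params"], r["realtime_driving"], r["road_env"])
        Y = r["next_behavior"]      # prediction target: steering / braking / driving
        examples.append((X, Y))
    return examples

def split(examples, train_ratio=0.8):
    """Divide the automobile information data set into training and test subsets."""
    cut = int(len(examples) * train_ratio)
    return examples[:cut], examples[cut:]
```

In practice the preprocessing would also normalize and encode each field before the tuple is formed; the sketch keeps only the <X, Y> arrangement itself.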
And S3, selecting a plurality of types of network structure models, abstracting and summarizing a plurality of operators from the network structure models, and forming an operator data set.
In this embodiment, there are ten types of network structure models: FM (factorization machine), PNN (product-based neural network), PIN (inner-product-based neural network), HNN (holographic product network), DeepFM (deep factorization machine), FiBiNET (feature-importance bilinear network), DCN (deep & cross network), XDeepFM (extreme deep factorization machine), AFM (attentional factorization machine), and AutoInt (automatic feature interaction via self-attention).
The operators obtained from the network structure model include six categories, and the specific correspondence is as follows:
the second-order combination operator-FM is FM/DeepFM;
a second-order combination operator HFM is PNN/PIN/HNN;
a second-order combined operator-Biliner-All is FiBiNET;
a second-order combined operator-Biliner-arch is FiBiNET;
a second-order combined operator-Biliner-Interaction is FiBiNET;
a second-order combined operator-CIN is XDeepFM/DCN;
DNN input part-first order: deep FM;
DNN input partial-second order PIN;
the DNN input part-first order + second order: PIN;
the DNN input part is a first-order and a second-order n-dimension, wherein the first order is from PIN, and the second-order n-dimension is obtained by performing sum polymerization on PIN feature combination dimension;
the DNN input part comprises first-order and second-order k dimensions, wherein the first order is from PIN, and the second-order k dimension is obtained by performing sum polymerization on PIN characteristic Embedding dimension;
AFM, attention-MLP on Field level;
attention-SENTET at Field level FiBiNET;
Attention-Multi-Head at Field level: autoInt;
attachment on Field level-DeepFM;
the number of hidden layers of DNN is DeepFM/XDeepFM;
DeepFM/XDeepFM;
Skip-Connection:XDeepFM/DCN。
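As an illustration of the first category, the second-order FM combination operator abstracted from FM/DeepFM computes the sum of pairwise element-wise products of field embeddings. A minimal NumPy sketch using the standard square-of-sum minus sum-of-squares identity (the function name and array layout are assumptions for illustration):

```python
import numpy as np

def fm_second_order(embeddings):
    """Second-order FM combination operator (abstracted from FM/DeepFM):
    sum over all field pairs (i < j) of v_i * v_j, element-wise,
    computed as 0.5 * ((sum_i v_i)^2 - sum_i v_i^2).
    embeddings: array of shape (num_fields, k)."""
    sum_sq = np.sum(embeddings, axis=0) ** 2      # (sum_i v_i)^2, shape (k,)
    sq_sum = np.sum(embeddings ** 2, axis=0)      # sum_i v_i^2,  shape (k,)
    return 0.5 * (sum_sq - sq_sum)                # shape (k,): second-order k-dimension
```

The returned vector is the "second-order k-dimension" form; summing it once more over k would give the scalar FM interaction term.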
S4, constructing a model generation architecture, wherein the model generation architecture comprises an RNN Controller network and a model parser.
In this embodiment, the RNN Controller network is a two-layer LSTM network with a hidden layer size of 256; the model parser is a decoding algorithm that stacks the operator character sequences into a TensorFlow model structure.
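A toy sketch of the controller's sampling behaviour, with the two-layer LSTM (hidden size 256 in the patent) replaced by a stub recurrent update; the class name, the state-update rule, and the output projection are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

class ControllerSketch:
    """Stand-in for the RNN Controller: at each step it produces logits
    over the operator vocabulary, samples one operator, and feeds the
    choice back into its recurrent state (a stub, not a real LSTM)."""
    def __init__(self, vocab, hidden=256, seed=0):
        self.vocab = vocab
        self.hidden = hidden
        self.rng = np.random.default_rng(seed)
        self.W = self.rng.normal(0, 0.1, (hidden, len(vocab)))  # output projection

    def sample_sequence(self, length):
        h = np.zeros(self.hidden)
        seq, log_probs = [], []
        for _ in range(length):
            logits = h @ self.W
            p = np.exp(logits - logits.max())
            p /= p.sum()                          # softmax over operator vocabulary
            idx = self.rng.choice(len(self.vocab), p=p)
            seq.append(self.vocab[idx])
            log_probs.append(np.log(p[idx]))      # kept for the policy-gradient step
            h = np.tanh(h + self.W[:, idx])       # stub state update
        return seq, log_probs
```

The stored log-probabilities are what the Policy Gradient update in step S9 needs; a real implementation would use an LSTM cell and trainable embeddings for the fed-back operator.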
S5, initializing the iteration number K = 0, and setting an iteration number threshold K_m.
In this embodiment, the value of K_m is 2000.
S6, the RNN Controller network randomly generates S = 2 different operator character sequences according to the operator data set, each operator character sequence being composed of a plurality of operators randomly sampled from the operator data set.
In this embodiment, each generation step of the RNN Controller network corresponds to the generation of one operator structure, and a specific recommendation model structure is obtained by combining and assembling the generated operator sequence, for example:
String sequence str1 = "feature combination operator: FM, input feature operator: original structure, attention operator: none, hidden layer size: 300, activation function: ReLU, number of layers: 3"; this is the structure of a standard DeepFM;
String sequence str2 = "feature combination operator: Bilinear-All, input feature operator: first-order + second-order concatenation, attention operator: SENET, hidden layer size: 300, activation function: ReLU, number of layers: 3"; this is the structure of FiBiNET-All.
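The model parser's first stage, decoding a string sequence like str1 or str2 into a structure description, can be sketched as below; stacking that description into an actual TensorFlow graph is omitted, and the dict-based representation is an assumption for illustration:

```python
def parse_operator_string(s):
    """Decode an operator character sequence (as in the str1/str2
    examples) into a structure description. The patent's model parser
    would then stack this description into a TensorFlow model."""
    structure = {}
    for part in s.split(","):
        key, _, value = part.partition(":")   # split on the first colon only
        structure[key.strip()] = value.strip()
    return structure
```

With str1 as input, the result maps "feature combination operator" to "FM", "hidden layer size" to "300", and so on, which a builder function could consume field by field.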
S7, the model parser converts the 2 operator character sequences into 2 sub-models (Child Models) respectively; if K = 0, the parameters of each current sub-model are randomly initialized, and if K ≠ 0, the parameters of the 2 sub-models obtained in the previous training round are used to initialize the parameters of the current 2 sub-models.
S8, training each current submodel through the training data set, storing parameters of each submodel, evaluating each submodel after current training according to the test data set, and respectively obtaining S rewards, wherein K = K +1.
In this embodiment, Reward refers to an evaluation index of recommendation performance (e.g., the accuracy (ACC) of driving-behavior prediction, the AUC of click prediction, etc.). If the generated structure performs well, the Controller is given a certain reward; if the generated structure performs poorly, the Controller is penalized.
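For the ACC case named above, the reward is simply test-set accuracy; a minimal sketch (the function name is an assumption, and the AUC variant would substitute a ranking metric):

```python
def reward_from_accuracy(y_true, y_pred):
    """Reward as prediction accuracy (ACC) on the test data set,
    as used for driving-behavior prediction in the patent."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

A higher reward then pushes the Controller toward re-sampling similar structures, while a low reward acts as the penalty.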
In this embodiment, the Adam algorithm is used to train and update the parameters, and the resulting model is saved in the checkpoint file format.
S9, if K = 2000, the sub-model with the best Reward among the S currently trained sub-models is selected as the automobile information prediction network model to be output, completing the generation of the automobile information prediction network model; otherwise, a Policy Gradient reinforcement learning algorithm is adopted to update the parameters of the RNN Controller network according to the current Rewards, and the process jumps back to step S6.
In this embodiment, through repeated feedback from the child models, the RNN Controller network becomes better and better, the operator character sequences it generates improve, and the resulting sub-models improve accordingly.
In this embodiment, the parameters of the RNN Controller network are updated with the Policy Gradient reinforcement learning algorithm; the optimization objective is:

J(\theta_C) = E_{P(a_{1:T};\,\theta_C)}[R]

where J(\theta_C) is the specific optimization objective, a_{1:T} is the operator sequence generated step by step by the Controller, T is the length of the operator sequence, R is the Reward obtained by a sub-structure, and P is the probability distribution over the operator-structure sampling space; the goal is to maximize J(\theta_C).
In the parameter-solving process of the Controller, besides the Reward, the gradient of the objective must be computed; expanded over time steps it is:

\nabla_{\theta_C} J(\theta_C) = \sum_{t=1}^{T} E_{P(a_{1:T};\,\theta_C)}\left[\nabla_{\theta_C} \log P(a_t \mid a_{1:(t-1)};\,\theta_C) \cdot R\right]

In this embodiment, S = 2 sub-models are sampled to approximate the expectation, so the gradient estimate becomes:

\nabla_{\theta_C} J(\theta_C) \approx \frac{1}{S} \sum_{s=1}^{S} \sum_{t=1}^{T} \nabla_{\theta_C} \log P(a_t^{(s)} \mid a_{1:(t-1)}^{(s)};\,\theta_C) \cdot R_s
through multiple experiments, when S is observed to take different values, the fact that S is not larger is found to be better, S =2 enables a finally generated automobile information prediction network model to be optimal, and the prediction effect is best.
The general parameter update rule is:

\theta_C \leftarrow \theta_C + \alpha_C \nabla_{\theta_C} J(\theta_C)

where \alpha_C is the learning rate for updating the Controller and \theta_C are the Controller's parameters.
In this embodiment, in the calculation of P(a_t \mid a_{1:(t-1)}) at each time step, the cross entropy is computed from the operator generated by sampling and the sampling probability at each step.
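The Policy Gradient update above, averaged over the S sampled sub-models, can be sketched numerically as follows; the gradients of the log-probabilities are assumed to be precomputed (e.g., via the cross-entropy just described), and shapes are illustrative:

```python
import numpy as np

def policy_gradient_step(log_prob_grads, rewards, theta, alpha=0.01):
    """One Policy Gradient (REINFORCE-style) update of the Controller
    parameters theta: theta <- theta + alpha * (1/S) * sum_s R_s * g_s,
    where g_s = sum_t grad log P(a_t | a_{1:(t-1)}) for sub-model s
    (passed in precomputed)."""
    S = len(rewards)
    grad = sum(R * g for g, R in zip(log_prob_grads, rewards)) / S
    return theta + alpha * grad
```

A full implementation would also subtract a reward baseline to reduce variance, which the sketch omits.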
In step S7 of this embodiment, after the first round of iterative training, the parameters learned by the child models in the previous round are used to initialize the parameters of the current round. The advantages of such parameter sharing are as follows:
1. The parameters of each child model are initialized in a better region of the parameter space, so fewer learning steps are required and the effect is better;
2. Compared with the non-shared mode, the shared mode saves about 1000 times in learning cost, because:
non-sharing: controller steps C1=100000 steps, S =2,child requires steps C2=100 steps, and takes a total of time: C1S C2=10^5 ^ 2 ^ 100=2 ^ 10^7 steps;
the sharing mode comprises the following steps: controller steps C1=2000 steps, S =2,child requires steps C2=5 steps, and takes time in total:
C1S C2= 2S 10^ 3S 25 =2S 10^4 steps.
The efficiency is saved: 2 x 10^7/2 x 10^4 ^ 10^3.
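The cost accounting above can be checked directly with the stated figures:

```python
def total_steps(controller_steps, S, child_steps):
    """Total training cost = controller iterations x sub-models per
    iteration x training steps per sub-model (the cost model used above)."""
    return controller_steps * S * child_steps

non_shared = total_steps(100_000, 2, 100)   # 2 x 10^7 steps
shared = total_steps(2_000, 2, 5)           # 2 x 10^4 steps
speedup = non_shared // shared              # factor of 10^3
```

This is where the "about 1000 times faster" figure in the abstract comes from: weight sharing shrinks both the number of controller iterations and the steps each child needs.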
After the automobile information prediction network model generated in the embodiment is deployed to the intelligent internet automobile cloud platform, the automobile information prediction network model can be used for predicting automobile driving behaviors.
In this embodiment, the generated parameter structure of the automobile information prediction network model is a checkpoint file, and the checkpoint file is deployed to the intelligent networked automobile cloud platform as a RESTful service interface.
Example 2
Compared with Embodiment 1, the automobile information data in step S1 is the personal information data of the automobile user, including the user's age, sex, city, hobbies, and personality, and Y in step S2 is the predicted automobile price. After the automobile information prediction network model generated in this embodiment is deployed on the intelligent networked automobile cloud platform, it can be used for automobile price prediction; the deployment method is the same as in Embodiment 1.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (7)
1. A prediction model generation method of an intelligent networking cloud platform based on reinforcement learning is characterized by comprising the following steps:
S1, acquiring a plurality of automobile information data from an intelligent networked automobile cloud platform;
S2, preprocessing the automobile information data to form an automobile information data set, and dividing the automobile information data set into a training data set and a test data set;
S3, selecting a plurality of types of network structure models, abstracting and summarizing a plurality of operators from the network structure models, and forming an operator data set;
S4, constructing a model generation architecture, wherein the model generation architecture comprises an RNN Controller network and a model parser;
S5, initializing the iteration number K = 0, and setting an iteration number threshold K_m;
S6, the RNN Controller network randomly generating S different operator character sequences according to the operator data set, each operator character sequence being composed of a plurality of operators randomly sampled from the operator data set;
S7, the model parser converting the S operator character sequences into S sub-models respectively; if K = 0, randomly initializing the parameters of each current sub-model, and if K ≠ 0, initializing the parameters of the current S sub-models with the parameters of the S sub-models obtained in the previous training round;
S8, training each current sub-model on the training data set, saving the parameters of each sub-model, and evaluating each currently trained sub-model on the test data set to obtain S Rewards respectively, with K = K + 1, wherein Reward is an evaluation index of recommendation performance;
S9, if K = K_m, selecting the sub-model with the best Reward among the S currently trained sub-models as the automobile information prediction network model to be output, completing the generation of the automobile information prediction network model; otherwise, updating the RNN Controller network parameters according to the current Rewards by a Policy Gradient reinforcement learning algorithm, and then jumping to step S6;
in step S1, the automobile information data is vehicle information data; the vehicle information data comprises vehicle parameter data, vehicle real-time driving data, and current road environment data of the vehicle;
step S2 specifically comprises: arranging the automobile information data into a plurality of training examples in the form <X, Y>, the training examples forming the training data set, wherein X is the information data feature, namely <vehicle parameter data, vehicle real-time driving data, current road environment data>, and Y is the prediction target, the prediction target being the next driving behavior of the intelligent networked automobile, including steering behavior, braking behavior, and driving behavior.
2. The method according to claim 1, wherein in step S3, the network structure models are of ten types: FM, PNN, PIN, HNN, DeepFM, FiBiNET, DCN, XDeepFM, AFM, and AutoInt.
3. The reinforcement learning-based prediction model generation method for the intelligent internet cloud platform according to claim 1, wherein the RNN Controller network is a two-layer LSTM network, and the hidden layer size is 256; the model parser is used for stacking the operator character sequences into a model structure of TensorFlow.
4. The reinforcement learning-based intelligent networked cloud platform prediction model generation method according to claim 1, wherein in step S5, K_m = 2000.
5. the reinforcement learning-based intelligent networked cloud platform prediction model generation method according to claim 1, wherein S =2.
6. The intelligent internet cloud platform prediction model generation method based on reinforcement learning of claim 1, wherein the automobile information prediction network model is stored in a checkpoint file format.
7. The reinforcement learning-based intelligent networking cloud platform prediction model generation method according to claim 1, wherein the sub-model is a TensorFlow model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010122791.0A CN111353644B (en) | 2020-02-27 | 2020-02-27 | Prediction model generation method of intelligent network cloud platform based on reinforcement learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010122791.0A CN111353644B (en) | 2020-02-27 | 2020-02-27 | Prediction model generation method of intelligent network cloud platform based on reinforcement learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111353644A CN111353644A (en) | 2020-06-30 |
CN111353644B true CN111353644B (en) | 2023-04-07 |
Family
ID=71195959
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010122791.0A Active CN111353644B (en) | 2020-02-27 | 2020-02-27 | Prediction model generation method of intelligent network cloud platform based on reinforcement learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111353644B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112926735A (en) * | 2021-01-29 | 2021-06-08 | 北京字节跳动网络技术有限公司 | Method, device, framework, medium and equipment for updating deep reinforcement learning model |
CN114742236A (en) * | 2022-04-24 | 2022-07-12 | 重庆长安汽车股份有限公司 | Environmental vehicle behavior prediction model training method and system |
CN117010447B (en) * | 2023-10-07 | 2024-01-23 | 成都理工大学 | End-to-end based microarchitecturable search method |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105956698A (en) * | 2016-04-27 | 2016-09-21 | 国网天津市电力公司 | Power load combination prediction method based on IOWGA operator and fresh prediction precision |
CN106529461A (en) * | 2016-11-07 | 2017-03-22 | 湖南源信光电科技有限公司 | Vehicle model identifying algorithm based on integral characteristic channel and SVM training device |
CN107862346A (en) * | 2017-12-01 | 2018-03-30 | 驭势科技(北京)有限公司 | A kind of method and apparatus for carrying out driving strategy model training |
CN109733415A (en) * | 2019-01-08 | 2019-05-10 | 同济大学 | A kind of automatic Pilot following-speed model that personalizes based on deeply study |
CN109871778A (en) * | 2019-01-23 | 2019-06-11 | 长安大学 | Lane based on transfer learning keeps control method |
CN110009108A (en) * | 2019-04-09 | 2019-07-12 | 沈阳航空航天大学 | A kind of completely new quantum transfinites learning machine |
CN110304075A (en) * | 2019-07-04 | 2019-10-08 | 清华大学 | Track of vehicle prediction technique based on Mix-state DBN and Gaussian process |
Non-Patent Citations (5)
Title |
---|
Zhou Mo; Jin Min. Short-term power load forecasting method combining multiple algorithms and models with online secondary learning. Journal of Computer Applications. 2017, (Issue 11), full text. *
Sun Yan; Lyu Shipin; Wang Xiukun; Tang Yiyuan. Construction of a Bayesian network structure model. Journal of Chinese Computer Systems. 2008, (Issue 05), full text. *
Sun Yushan, Bai Hong. A pseudospectral method for parabolic problems periodic in t. Journal of Natural Science of Heilongjiang University. 1990, (Issue 03), full text. *
Cui Jianshuang; Liu Xiaochan; Yang Meihua; Li Wenyan. An automatic optimization-algorithm selection framework based on meta-learning recommendation, with empirical analysis. Journal of Computer Applications. 2017, (Issue 04), full text. *
Hao Zhangang; Zhang Weixiong; Chen Zheng. Social network link prediction based on a supervised joint denoising model. Scientia Sinica Informationis. 2017, (Issue 11), full text. *
Also Published As
Publication number | Publication date |
---|---|
CN111353644A (en) | 2020-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111353644B (en) | Prediction model generation method of intelligent network cloud platform based on reinforcement learning | |
CN111061277B (en) | Unmanned vehicle global path planning method and device | |
US11842261B2 (en) | Deep reinforcement learning with fast updating recurrent neural networks and slow updating recurrent neural networks | |
WO2021103625A1 (en) | Short-term vehicle speed condition real-time prediction method based on interaction between vehicle ahead and current vehicle | |
CN112162555B (en) | Vehicle control method based on reinforcement learning control strategy in hybrid vehicle fleet | |
CN110520868B (en) | Method, program product and storage medium for distributed reinforcement learning | |
CN108594858B (en) | Unmanned aerial vehicle searching method and device for Markov moving target | |
CN110850861A (en) | Attention-based hierarchical lane change depth reinforcement learning | |
US20240095495A1 (en) | Attention neural networks with short-term memory units | |
CN114261400A (en) | Automatic driving decision-making method, device, equipment and storage medium | |
CN113264064B (en) | Automatic driving method for intersection scene and related equipment | |
CN111507499B (en) | Method, device and system for constructing model for prediction and testing method | |
CN116795720A (en) | Unmanned driving system credibility evaluation method and device based on scene | |
Han et al. | Ensemblefollower: A hybrid car-following framework based on reinforcement learning and hierarchical planning | |
Arbabi et al. | Planning for autonomous driving via interaction-aware probabilistic action policies | |
Mazumder et al. | Action permissibility in deep reinforcement learning and application to autonomous driving | |
Naing et al. | Dynamic car-following model calibration with deep reinforcement learning | |
CN117396389A (en) | Automatic driving instruction generation model optimization method, device, equipment and storage medium | |
Hjaltason | Predicting vehicle trajectories with inverse reinforcement learning | |
Radovic et al. | Agent forecasting at flexible horizons using ODE flows | |
Pak et al. | Carnet: A dynamic autoencoder for learning latent dynamics in autonomous driving tasks | |
CN114104005B (en) | Decision-making method, device and equipment of automatic driving equipment and readable storage medium | |
Schütt et al. | Exploring the Range of Possible Outcomes by means of Logical Scenario Analysis and Reduction for Testing Automated Driving Systems | |
Madni et al. | Digital Twin: Key Enabler and Complement to Model-Based Systems Engineering | |
CN112800670B (en) | Multi-target structure optimization method and device for driving cognitive model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||