CN111353644B - Prediction model generation method of intelligent network cloud platform based on reinforcement learning - Google Patents

Prediction model generation method of intelligent network cloud platform based on reinforcement learning Download PDF

Info

Publication number
CN111353644B
CN111353644B (application CN202010122791.0A)
Authority
CN
China
Prior art keywords
model
prediction
network
automobile
cloud platform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010122791.0A
Other languages
Chinese (zh)
Other versions
CN111353644A (en)
Inventor
黄通文
韩胜明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Meiyun Zhixiang Intelligent Technology Co ltd
Original Assignee
Chengdu Meiyun Zhixiang Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Meiyun Zhixiang Intelligent Technology Co ltd filed Critical Chengdu Meiyun Zhixiang Intelligent Technology Co ltd
Priority to CN202010122791.0A priority Critical patent/CN111353644B/en
Publication of CN111353644A publication Critical patent/CN111353644A/en
Application granted granted Critical
Publication of CN111353644B publication Critical patent/CN111353644B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a prediction model generation method for an intelligent networked cloud platform based on reinforcement learning, and relates to the technical field of intelligent networked automobile cloud platform prediction. An automobile information prediction network model is generated from a plurality of acquired automobile information data and operator character sequences by combining an RNN Controller network with a model parser. Because the operator character sequences contain many kinds of operators, the generated automobile information prediction network model is built from diverse operators, has structural diversity and can achieve a better prediction effect. The method applies a reinforcement learning idea to the automobile information prediction network model, transforms the problem of identifying a network structure into a prediction problem, and adapts a set of general automatic prediction model generation methods, so that a good network structure is selected from a large search space without manually designing the structure, which saves time and labor cost and is more efficient. A weight sharing idea is also adopted, which improves search efficiency to roughly 1000 times faster than a model without sharing.

Description

Prediction model generation method of intelligent network cloud platform based on reinforcement learning
Technical Field
The invention relates to a prediction model generation method of an intelligent networking cloud platform based on reinforcement learning, and belongs to the technical field of intelligent networking automobile cloud platform prediction.
Background
The intelligent networked automobile cloud platform connects vehicle manufacturers, automobile owners (users) and terminal devices (such as intelligent networked automobiles), and records in real time the operation records of the owners (users) and the operating states of the terminal devices (intelligent networked automobiles).
A review of current intelligent networked automobile cloud platform prediction models shows that existing network structure prediction models rely on manual, fine-grained feature selection and structural model design, and usually on designing a new model or algorithm. This requires an algorithm engineer with a solid theoretical foundation as well as strong engineering and innovation capability, and it takes the engineer a long period to produce an effective model. For example, common recommendation model structures such as Logistic Regression (LR), Factorization Machine (FM), FFM, DNN, DeepFM, DCN, XDeepFM and FiBiNET all consume a great deal of an algorithm engineer's time and labor cost, and the efficiency is relatively low.
Disclosure of Invention
The invention provides a prediction model generation method for an intelligent networked cloud platform based on reinforcement learning; the model generation efficiency is high, and the generated model can greatly improve prediction accuracy.
In order to alleviate the above problems, the technical scheme adopted by the invention is as follows:
the invention provides a prediction model generation method of an intelligent networking cloud platform based on reinforcement learning, which comprises the following steps:
s1, acquiring a plurality of automobile information data from an intelligent networked automobile cloud platform;
s2, preprocessing the automobile information data, forming an automobile information data set, and dividing the automobile information data set into a training data set and a testing data set;
s3, selecting a plurality of types of network structure models, abstracting and inducing a plurality of operators from the network structure models, and forming an operator data set;
s4, constructing a model generation architecture, wherein the model generation architecture comprises an RNN Controller network and a model resolver;
S5, initializing the iteration number K = 0, and setting an iteration number threshold K_m;
S6, the RNN Controller network randomly generates S different operator character sequences according to the operator data set, and each operator character sequence is composed of a plurality of operators which are randomly sampled from the operator data set;
s7, respectively converting the S operator character sequences into S sub-models by the model analyzer, randomly initializing the parameters of each current sub-model if K =0, and respectively initializing the parameters of the S sub-models obtained in the previous training round to the parameters of the current S sub-models if K is not equal to 0;
s8, training each current submodel through a training data set, storing parameters of each submodel, evaluating each currently trained submodel according to a test data set, and respectively obtaining S rewards, wherein K = K +1;
S9, if K = K_m, selecting the sub-model with the best Reward from the S currently trained sub-models as the automobile information prediction network model to be output, thereby completing generation of the automobile information prediction network model; otherwise, adopting a Policy Gradient reinforcement learning algorithm to update the parameters of the RNN Controller network according to the current Rewards, and then jumping to step S6.
The technical effect of this technical scheme is as follows: the operators of the generated automobile information prediction network model are diverse, so the model has structural diversity and can achieve a better prediction effect; a reinforcement learning idea is applied to the automobile information prediction network model, the problem of identifying a network structure is transformed into a prediction problem, and a set of general automatic prediction model generation methods is adapted, so that a good network structure is selected from a large search space without manually designing the structure, saving time and labor cost with higher efficiency; and a weight sharing idea is adopted, in which the parameters trained in the previous round are used as the parameters of the current sub-models before training continues, so that search efficiency is improved to roughly 1000 times faster than a model without sharing.
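Purely as an illustration of the iteration structure of steps S5 to S9, a minimal Python sketch is given below. All function and variable names (sample_operator_sequences, build_submodel, train_and_evaluate, update_controller) are hypothetical stubs, not the patented implementation; they only mirror the control flow described above.

    import random

    K_M = 2000   # iteration threshold K_m from step S5
    S = 2        # number of operator character sequences sampled per round (step S6)

    def sample_operator_sequences(controller_state, s):
        # Stub for the RNN Controller: sample s operator character sequences.
        operators = ["FM", "HFM", "Bilinear-All", "CIN", "Attention-SENET", "Skip-Connection"]
        return [[random.choice(operators) for _ in range(6)] for _ in range(s)]

    def build_submodel(sequence, previous_params):
        # Stub for the model parser: turn a sequence into a sub-model with initial parameters.
        params = previous_params if previous_params is not None else {"init": "random"}
        return {"sequence": sequence, "params": params}

    def train_and_evaluate(submodel):
        # Stub: train on the training set, save parameters, score on the test set (Reward).
        return random.random()

    def update_controller(controller_state, rewards):
        # Stub for the Policy Gradient update of the RNN Controller (step S9).
        return controller_state

    controller_state, shared_params = None, [None] * S
    for k in range(K_M):
        sequences = sample_operator_sequences(controller_state, S)        # step S6
        submodels = [build_submodel(seq, shared_params[i])                # step S7
                     for i, seq in enumerate(sequences)]
        rewards = [train_and_evaluate(m) for m in submodels]              # step S8
        shared_params = [m["params"] for m in submodels]                  # weight sharing
        controller_state = update_controller(controller_state, rewards)   # step S9
    best_model = max(zip(rewards, submodels), key=lambda pair: pair[0])[1]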
Specifically, in step S1, the automobile information data is either vehicle information data or automobile-user personal information data; the vehicle information data comprises vehicle parameter data, vehicle real-time driving data and current road environment data of the vehicle.
The technical effect of this technical scheme is as follows: when vehicle information data are input, the finally obtained automobile information prediction network model can be used for predicting the driving behavior of the intelligent networked automobile; when automobile-user personal information data are input, the finally obtained automobile information prediction network model can be used for automobile price prediction.
More specifically, step S2 specifically includes: arranging the automobile information data into a plurality of training instances in the form of < X, Y >, where the training instances together form the training data set, X is the information data feature, and Y is the prediction target.
The technical effect of this technical scheme is as follows: arranging the automobile information data into the < X, Y > form gives the finally obtained automobile information prediction network model strong domain generality, so that it can be adapted to different prediction tasks; it has been tested and applied in different application scenarios (such as predicting the driving operation behavior of intelligent networked automobiles, predicting user click behavior, and recommending automobiles at a suitable price according to user conditions), including recommendation systems in the current internet field.
Optionally, in step S3, the network structure models are of ten types, namely FM, PNN, PIN, HNN, DeepFM, FiBiNET, DCN, XDeepFM, AFM and AutoInt.
The technical effect of this technical scheme is as follows: these ten types are the existing mainstream models, and the operators extracted from them give the automobile information prediction network model diversity and an excellent prediction effect.
Optionally, the RNN Controller network is a two-layer LSTM network with a hidden layer size of 256; the model parser is a decoding algorithm that stacks the operator character sequences into a TensorFlow model structure.
Optionally, in step S5, K_m = 2000.
Optionally, S =2.
The technical effect of the technical scheme is as follows: the finally generated automobile information prediction network model can be optimal, and the prediction effect is best.
Optionally, the automobile information prediction network model is saved in a checkpoint file format.
Optionally, the sub-model is a TensorFlow model.
The technical effect of the technical scheme is as follows: the test process of the model is simple, and the model can be conveniently deployed to an intelligent networking automobile cloud platform.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a flowchart of a prediction model generation method of an intelligent networked cloud platform based on reinforcement learning in an embodiment;
FIG. 2 is a diagram of relationships between RNN Controller networks, model resolvers, and sub-models in an embodiment;
FIG. 3 is a schematic diagram of the RNN Controller network generating the operator character sequence according to the embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to fig. 1, fig. 2 and fig. 3, the embodiment provides a prediction model generation method for an intelligent networking cloud platform based on reinforcement learning, including the following steps:
s1, obtaining a plurality of automobile information data from an intelligent networking automobile cloud platform.
In this embodiment, the automobile information data is vehicle information data, and the vehicle information data includes vehicle parameter data, vehicle real-time driving data, and current road environment data of the vehicle.
The vehicle parameter data are the basic parameters of the vehicle recorded in its specification when it leaves the factory, including engine parameters, power battery parameters and sensor parameter data; the sensor parameter data relate to laser radar, millimeter-wave radar, ultrasonic radar, infrared night-vision device, high-definition camera, steering-angle sensor and rotation-speed sensor parameter data.
The vehicle real-time driving data comprises vehicle real-time steering data, vehicle real-time braking data, vehicle real-time driving data and vehicle real-time sensor detection data.
The current road environment data of the vehicle comprises traffic light indication information data and pedestrian position and flow information data.
S2, preprocessing the automobile information data, forming an automobile information data set, and dividing the automobile information data set into a training data set and a testing data set.
In the present embodiment, the automobile information data are organized in the form of < X, Y > into a number of instances, which together constitute the automobile information data set. X is the information data feature, namely < vehicle parameter data, vehicle real-time driving data, current road environment data >; Y is the prediction target, which in this embodiment is the next driving behavior of the intelligent networked automobile, including steering behavior, braking behavior and driving behavior.
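Purely for illustration, one such < X, Y > training instance could be represented as a Python dictionary; the field names below are hypothetical and only reflect the three feature groups and the prediction target described in this embodiment.

    # A hypothetical < X, Y > training instance for driving-behavior prediction.
    instance = {
        "X": {
            "vehicle_parameters": {"engine_power_kw": 110, "battery_capacity_kwh": 53},
            "real_time_driving": {"speed_kmh": 42.5, "steering_angle_deg": -3.0, "brake_pressure": 0.0},
            "road_environment": {"traffic_light": "green", "pedestrian_count": 2},
        },
        "Y": "braking",  # next driving behavior: steering / braking / driving
    }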
And S3, selecting a plurality of types of network structure models, abstracting and summarizing a plurality of operators from the network structure models, and forming an operator data set.
In this embodiment, there are ten types of network structure models, namely FM (factorization machine), PNN (product-based neural network), PIN (inner-product-based neural network), HNN (holographic product network), DeepFM (deep factorization machine), FiBiNET (bilinear network based on feature importance), DCN (deep & cross network), XDeepFM (extreme deep factorization machine), AFM (attentional factorization machine) and AutoInt (automatic feature interaction learning via self-attention).
The operators abstracted from the network structure models fall into six categories; the specific correspondence is as follows (one possible in-memory representation of the resulting operator data set is sketched after this list):
Second-order combination operator - FM: FM / DeepFM;
Second-order combination operator - HFM: PNN / PIN / HNN;
Second-order combination operator - Bilinear-All: FiBiNET;
Second-order combination operator - Bilinear-Each: FiBiNET;
Second-order combination operator - Bilinear-Interaction: FiBiNET;
Second-order combination operator - CIN: XDeepFM / DCN;
DNN input part - first order: DeepFM;
DNN input part - second order: PIN;
DNN input part - first order + second order: PIN;
DNN input part - first order + second-order n-dimension, where the first order comes from PIN and the second-order n-dimension is obtained by sum aggregation over the PIN feature-combination dimension;
DNN input part - first order + second-order k-dimension, where the first order comes from PIN and the second-order k-dimension is obtained by sum aggregation over the PIN feature Embedding dimension;
Attention at Field level - MLP: AFM;
Attention at Field level - SENET: FiBiNET;
Attention at Field level - Multi-Head: AutoInt;
Attention at Field level - none: DeepFM;
Number of DNN hidden layers: DeepFM / XDeepFM;
DeepFM / XDeepFM;
Skip-Connection: XDeepFM / DCN.
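As a sketch only, the operator data set abstracted above could be held as a simple Python mapping from operator name to the source network structures; the exact data representation is not specified in this embodiment, so the structure below is an assumption.

    # Hypothetical representation of the operator data set (operator -> source models).
    OPERATOR_DATASET = {
        "FM": ["FM", "DeepFM"],
        "HFM": ["PNN", "PIN", "HNN"],
        "Bilinear-All": ["FiBiNET"],
        "Bilinear-Each": ["FiBiNET"],
        "Bilinear-Interaction": ["FiBiNET"],
        "CIN": ["XDeepFM", "DCN"],
        "Attention-MLP": ["AFM"],
        "Attention-SENET": ["FiBiNET"],
        "Attention-Multi-Head": ["AutoInt"],
        "Skip-Connection": ["XDeepFM", "DCN"],
    }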
and S4, constructing a model generation architecture, wherein the model generation architecture comprises an RNN Controller network and a model resolver.
In this embodiment, the RNN Controller network is a two-layer LSTM network with a hidden layer size of 256; the model parser is a decoding algorithm that stacks the operator character sequences into a TensorFlow model structure.
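A minimal sketch of such a two-layer LSTM controller with hidden size 256, written against the Keras API of TensorFlow, is given below; the operator vocabulary size and the sequence length are assumed values, and this is not the exact controller of this embodiment.

    import tensorflow as tf

    NUM_OPERATORS = 20   # assumed size of the operator vocabulary
    SEQ_LEN = 6          # assumed number of decisions per operator character sequence

    # Two stacked LSTM layers with hidden size 256; at every step the controller
    # emits a probability distribution over operators from which one is sampled.
    controller = tf.keras.Sequential([
        tf.keras.layers.Embedding(NUM_OPERATORS, 64),
        tf.keras.layers.LSTM(256, return_sequences=True),
        tf.keras.layers.LSTM(256, return_sequences=True),
        tf.keras.layers.Dense(NUM_OPERATORS, activation="softmax"),
    ])

    probs = controller(tf.zeros((1, SEQ_LEN), dtype=tf.int32))   # (1, SEQ_LEN, NUM_OPERATORS)
    sampled = tf.random.categorical(tf.math.log(probs[0]), 1)    # one sampled operator per step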
S5, initializing the iteration number K = 0 and setting an iteration number threshold K_m.
In this embodiment, K_m is set to 2000.
And S6, randomly generating S =2 different operator character sequences by the RNN Controller network according to the operator data set, wherein each operator character sequence is composed of a plurality of operators randomly sampled from the operator data set.
In this embodiment, each generation step of the RNN in the RNN Controller network corresponds to generating one operator structure, and a concrete recommended model structure can be obtained by combining and assembling the generated operator sequences, for example the following structures:
String sequence str1 = "feature combination operator: FM, input feature operator: original structure, attention operator: none, hidden layer size: 300, activation function: RELU, number of layers: 3"; this is the structure of a standard DeepFM;
String sequence str2 = "feature combination operator: Bilinear-All, input feature operator: first- and second-order splicing, attention operator: SENET, hidden layer size: 300, activation function: RELU, number of layers: 3"; this is the structure of FiBiNET-All.
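The decoding performed by the model parser can be illustrated, purely as an assumption about the sequence format shown above, by splitting such a character sequence into key-value pairs and mapping it to a model configuration; the real parser additionally stacks the decoded operators into a TensorFlow model structure.

    def parse_operator_sequence(sequence_str):
        # Hypothetical decoder: turn an operator character sequence into a model config.
        config = {}
        for part in sequence_str.split(","):
            key, _, value = part.partition(":")
            config[key.strip()] = value.strip()
        return config

    str1 = ("feature combination operator: FM, input feature operator: original structure, "
            "attention operator: none, hidden layer size: 300, activation function: RELU, "
            "number of layers: 3")
    print(parse_operator_sequence(str1))   # config describing a standard DeepFM structure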
And S7, the model parser converts the 2 operator character sequences into 2 sub-models (Child models) respectively; if K = 0, the parameters of each current sub-model are randomly initialized, and if K is not equal to 0, the parameters of the 2 sub-models obtained in the previous training round are used to initialize the parameters of the current 2 sub-models.
S8, training each current submodel through the training data set, storing parameters of each submodel, evaluating each submodel after current training according to the test data set, and respectively obtaining S rewards, wherein K = K +1.
In the present embodiment, the Reward refers to an evaluation index of recommendation performance (e.g. the accuracy ACC of driving-behavior prediction, the AUC of clicks, etc.). If the generated structure performs well, a certain reward is given to the Controller; if it performs poorly, the Controller is penalized.
In this embodiment, the Adam optimization algorithm is used to train and update the parameters, and the obtained model is saved in the checkpoint file format.
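A minimal sketch of step S8 for one sub-model, assuming a Keras sub-model and AUC as the Reward, is shown below; the data-set variables, the learning rate and the checkpoint file name are illustrative assumptions.

    import tensorflow as tf

    def train_and_score(submodel, train_ds, test_ds, epochs=1):
        # Train a sub-model with the Adam optimizer and return its Reward (test AUC).
        submodel.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                         loss="binary_crossentropy",
                         metrics=[tf.keras.metrics.AUC(name="auc")])
        submodel.fit(train_ds, epochs=epochs, verbose=0)
        submodel.save_weights("child_model.ckpt")   # checkpoint format, reused next round
        _, reward = submodel.evaluate(test_ds, verbose=0)
        return reward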
And S9, if K = 2000, the sub-model with the best Reward among the S currently trained sub-models is selected as the automobile information prediction network model to be output, completing generation of the automobile information prediction network model; otherwise, a Policy Gradient reinforcement learning algorithm is adopted to update the parameters of the RNN Controller network according to the current Rewards, and the process jumps to step S6.
In this embodiment, through repeated feedback from the child models, the RNN Controller network becomes better and better, the operator character sequences it generates become better, and in turn the obtained sub-models become better.
In this embodiment, the parameters of the RNN Controller network are updated according to the Policy Gradient reinforcement learning algorithm, and the optimization objective of the optimized structure is as follows:
J(θ_C) = E_{P(a_{1:T}; θ_C)}[R]
In the above formula, J(θ_C) denotes the specific optimization objective, a_{1:T} denotes the operator sequence generated step by step by the Controller, T is the length of the operator sequence, R is the Reward obtained for the corresponding sub-structure, and P is the probability distribution over the sampling space of operator structures; the goal is to maximize J(θ_C).
In solving for the Controller parameters, besides the Reward, a gradient function also needs to be calculated; expanded over time steps, it is computed as:
∇_{θ_C} J(θ_C) = Σ_{t=1}^{T} E_{P(a_{1:T}; θ_C)}[ ∇_{θ_C} log P(a_t | a_{1:(t-1)}; θ_C) · R ]
in this embodiment, 2 submodels are selected to perform the sampling calculation expectation process, and the gradient function is updated as:
Figure BDA0002393493780000073
through multiple experiments, when S is observed to take different values, the fact that S is not larger is found to be better, S =2 enables a finally generated automobile information prediction network model to be optimal, and the prediction effect is best.
The general update formula is
θ_C ← θ_C + α_C ∇_{θ_C} J(θ_C)
where α_C is the learning rate used to update the Controller and θ_C denotes the Controller parameters.
At each time step t, when computing the probability P(a_t | a_{1:(t-1)}), the cross entropy is calculated from the operator generated by sampling and the per-step sampling probability.
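A minimal sketch of this Policy Gradient (REINFORCE) update in TensorFlow is given below, assuming the controller returns per-step softmax probabilities over operators; the learning rate and the tensor shapes are assumptions for illustration, not the exact implementation of this embodiment.

    import tensorflow as tf

    ALPHA_C = 1e-3                                 # assumed controller learning rate alpha_C
    optimizer = tf.keras.optimizers.SGD(ALPHA_C)   # plain gradient ascent on J(theta_C)

    def update_controller(controller, sampled_sequences, rewards):
        # One REINFORCE step: sampled_sequences is an int tensor (S, T) of operator
        # indices a_{1:T} for each sub-model; rewards is a float tensor (S,) from step S8.
        with tf.GradientTape() as tape:
            probs = controller(sampled_sequences)                    # (S, T, num_operators)
            # log P(a_t | a_{1:(t-1)}) of the operators that were actually sampled
            log_p = tf.math.log(tf.gather(probs, sampled_sequences, batch_dims=2) + 1e-8)
            # minimizing the negative objective maximizes (1/S) * sum_k sum_t log_p * R_k
            loss = -tf.reduce_mean(tf.reduce_sum(log_p, axis=1) * rewards)
        grads = tape.gradient(loss, controller.trainable_variables)
        optimizer.apply_gradients(zip(grads, controller.trainable_variables))
        return loss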
In step S7 of this embodiment, after the model has been iteratively trained once, in the following process, the parameters learned by the last child model are used to initialize the parameters of the current round, and the advantages of such parameter sharing are as follows:
1. the parameters of the child model are ensured to be initialized in a better parameter space, so that the child model has fewer required learning steps and better effect;
2. Compared with the non-sharing mode, the sharing mode improves learning efficiency by about 1000 times, because:
Non-sharing: Controller steps C1 = 100000, S = 2, and each child needs C2 = 100 training steps, taking in total C1 × S × C2 = 10^5 × 2 × 100 = 2 × 10^7 steps;
Sharing: Controller steps C1 = 2000, S = 2, and each child needs C2 = 5 training steps, taking in total C1 × S × C2 = 2 × 10^3 × 2 × 5 = 2 × 10^4 steps.
The efficiency saving is therefore 2 × 10^7 / (2 × 10^4) = 10^3, i.e. about 1000 times.
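One way to realize the parameter sharing described in step S7 with TensorFlow checkpoints is sketched below; the checkpoint file name and the assumption that sub-models in consecutive rounds share the same variable structure are illustrative only.

    import tensorflow as tf

    def init_submodel_weights(submodel, k, ckpt_prefix="child_model.ckpt"):
        # Step S7: random initialization when K == 0, otherwise warm-start from the
        # parameters saved by the corresponding sub-model in training round K - 1.
        if k > 0:
            submodel.load_weights(ckpt_prefix)   # reuse weights learned in the previous round
        return submodel                          # k == 0: keep the default random initialization

Because each child then starts in an already good parameter region, only a few training steps per round are needed, which is what yields the roughly 1000-fold saving estimated above.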
After the automobile information prediction network model generated in the embodiment is deployed to the intelligent internet automobile cloud platform, the automobile information prediction network model can be used for predicting automobile driving behaviors.
In this embodiment, the parameter structure of the generated automobile information prediction network model is a checkpoint file, and the checkpoint file is deployed to the intelligent networked automobile cloud platform through a RESTful service interface.
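The embodiment does not give the service code; purely as an illustration of wrapping the generated checkpoint in a RESTful interface, a minimal Flask sketch with hypothetical route and field names is shown below (the single Dense layer only stands in for the generated prediction network).

    import numpy as np
    import tensorflow as tf
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Hypothetical stand-in for the generated prediction network; in practice the
    # structure produced in step S9 is rebuilt here and its checkpoint is restored.
    model = tf.keras.Sequential([tf.keras.layers.Dense(3, activation="softmax")])
    # model.load_weights("child_model.ckpt")   # restore the generated checkpoint

    @app.route("/predict/driving-behavior", methods=["POST"])
    def predict():
        features = np.asarray(request.get_json()["X"], dtype="float32").reshape(1, -1)
        probs = model.predict(features)          # probabilities of steering / braking / driving
        return jsonify({"Y": probs.tolist()})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)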
Example 2
Compared with Embodiment 1, the automobile information data in step S1 are the personal information data of automobile users, including the user's age, sex, city, hobbies and personality, and Y in step S2 is the predicted automobile price. After the automobile information prediction network model generated in this embodiment is deployed on the intelligent networked automobile cloud platform, it can be used to predict automobile prices; the deployment method is the same as in Embodiment 1.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A prediction model generation method of an intelligent networking cloud platform based on reinforcement learning is characterized by comprising the following steps:
s1, acquiring a plurality of automobile information data from an intelligent network automobile cloud platform;
s2, preprocessing the automobile information data, forming an automobile information data set, and dividing the automobile information data set into a training data set and a testing data set;
s3, selecting a plurality of types of network structure models, abstracting and inducing a plurality of operators from the network structure models, and forming an operator data set;
s4, constructing a model generation architecture, wherein the model generation architecture comprises an RNN Controller network and a model resolver;
S5, initializing the iteration number K = 0, and setting an iteration number threshold K_m;
S6, the RNN Controller network randomly generates S different operator character sequences according to the operator data set, each operator character sequence being composed of a plurality of operators randomly sampled from the operator data set;
s7, respectively converting the S operator character sequences into S sub-models by a model analyzer, randomly initializing the parameters of each current sub-model if K =0, and respectively initializing the parameters of the S sub-models obtained in the previous training round to the parameters of the current S sub-models if K is not equal to 0;
s8, training each current submodel through a training data set, storing parameters of each submodel, evaluating each currently trained submodel according to a test data set, and respectively obtaining S rewards, wherein K = K +1; reward is an evaluation index of recommended performance;
S9, if K = K_m, selecting the sub-model with the best Reward from the S currently trained sub-models as the automobile information prediction network model to be output, thereby completing generation of the automobile information prediction network model; otherwise, adopting a Policy Gradient reinforcement learning algorithm, updating the RNN Controller network parameters according to the current Rewards, and then jumping to step S6;
in the step S1, the automobile information data is vehicle information data; the vehicle information data comprises vehicle parameter data, vehicle real-time driving data and current road environment data of the vehicle;
the step S2 specifically includes: arranging automobile information data into a plurality of training examples in a form of < X, Y >, wherein each training example forms the training data set, and X is an information data characteristic which is < vehicle parameter data, vehicle real-time driving data and current road environment data >; y is a prediction target, and the prediction target is the next driving behavior of the intelligent networked automobile, including steering behavior, braking behavior and driving behavior.
2. The method according to claim 1, wherein in step S3, the network structure models are of ten types, namely FM, PNN, PIN, HNN, DeepFM, FiBiNET, DCN, XDeepFM, AFM and AutoInt.
3. The reinforcement learning-based prediction model generation method for the intelligent internet cloud platform according to claim 1, wherein the RNN Controller network is a two-layer LSTM network, and the hidden layer size is 256; the model parser is used for stacking the operator character sequences into a model structure of TensorFlow.
4. The reinforcement learning-based intelligent networked cloud platform prediction model generation method according to claim 1, wherein in step S5, K_m = 2000.
5. the reinforcement learning-based intelligent networked cloud platform prediction model generation method according to claim 1, wherein S =2.
6. The intelligent internet cloud platform prediction model generation method based on reinforcement learning of claim 1, wherein the automobile information prediction network model is stored in a checkpoint file format.
7. The reinforcement learning-based intelligent networking cloud platform prediction model generation method according to claim 1, wherein the sub-model is a TensorFlow model.
CN202010122791.0A 2020-02-27 2020-02-27 Prediction model generation method of intelligent network cloud platform based on reinforcement learning Active CN111353644B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010122791.0A CN111353644B (en) 2020-02-27 2020-02-27 Prediction model generation method of intelligent network cloud platform based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010122791.0A CN111353644B (en) 2020-02-27 2020-02-27 Prediction model generation method of intelligent network cloud platform based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN111353644A CN111353644A (en) 2020-06-30
CN111353644B true CN111353644B (en) 2023-04-07

Family

ID=71195959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010122791.0A Active CN111353644B (en) 2020-02-27 2020-02-27 Prediction model generation method of intelligent network cloud platform based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN111353644B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926735A (en) * 2021-01-29 2021-06-08 北京字节跳动网络技术有限公司 Method, device, framework, medium and equipment for updating deep reinforcement learning model
CN114742236A (en) * 2022-04-24 2022-07-12 重庆长安汽车股份有限公司 Environmental vehicle behavior prediction model training method and system
CN117010447B (en) * 2023-10-07 2024-01-23 成都理工大学 End-to-end based microarchitecturable search method

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105956698A (en) * 2016-04-27 2016-09-21 国网天津市电力公司 Power load combination prediction method based on IOWGA operator and fresh prediction precision
CN106529461A (en) * 2016-11-07 2017-03-22 湖南源信光电科技有限公司 Vehicle model identifying algorithm based on integral characteristic channel and SVM training device
CN107862346A (en) * 2017-12-01 2018-03-30 驭势科技(北京)有限公司 A kind of method and apparatus for carrying out driving strategy model training
CN109733415A (en) * 2019-01-08 2019-05-10 同济大学 A kind of automatic Pilot following-speed model that personalizes based on deeply study
CN109871778A (en) * 2019-01-23 2019-06-11 长安大学 Lane based on transfer learning keeps control method
CN110009108A (en) * 2019-04-09 2019-07-12 沈阳航空航天大学 A kind of completely new quantum transfinites learning machine
CN110304075A (en) * 2019-07-04 2019-10-08 清华大学 Track of vehicle prediction technique based on Mix-state DBN and Gaussian process

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
周末; 金敏. Short-term power load forecasting method combining multiple algorithms and models with online second learning. Journal of Computer Applications, 2017, No. 11, full text. *
孙岩; 吕世聘; 王秀坤; 唐一源. Construction of Bayesian network structure models. Journal of Chinese Computer Systems, 2008, No. 05, full text. *
孙玉山, 白红. Pseudospectral method for parabolic problems periodic in t. Journal of Natural Science of Heilongjiang University, 1990, No. 03, full text. *
崔建双; 刘晓婵; 杨美华; 李雯燕. Automatic optimization-algorithm selection framework based on meta-learning recommendation and empirical analysis. Journal of Computer Applications, 2017, No. 04, full text. *
郝占刚; 章伟雄; 陈政. Link prediction in social networks based on a supervised joint denoising model. SCIENTIA SINICA Informationis, 2017, No. 11, full text. *

Also Published As

Publication number Publication date
CN111353644A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
CN111353644B (en) Prediction model generation method of intelligent network cloud platform based on reinforcement learning
CN111061277B (en) Unmanned vehicle global path planning method and device
US11842261B2 (en) Deep reinforcement learning with fast updating recurrent neural networks and slow updating recurrent neural networks
WO2021103625A1 (en) Short-term vehicle speed condition real-time prediction method based on interaction between vehicle ahead and current vehicle
CN112162555B (en) Vehicle control method based on reinforcement learning control strategy in hybrid vehicle fleet
CN110520868B (en) Method, program product and storage medium for distributed reinforcement learning
CN108594858B (en) Unmanned aerial vehicle searching method and device for Markov moving target
CN110850861A (en) Attention-based hierarchical lane change depth reinforcement learning
US20240095495A1 (en) Attention neural networks with short-term memory units
CN114261400A (en) Automatic driving decision-making method, device, equipment and storage medium
CN113264064B (en) Automatic driving method for intersection scene and related equipment
CN111507499B (en) Method, device and system for constructing model for prediction and testing method
CN116795720A (en) Unmanned driving system credibility evaluation method and device based on scene
Han et al. Ensemblefollower: A hybrid car-following framework based on reinforcement learning and hierarchical planning
Arbabi et al. Planning for autonomous driving via interaction-aware probabilistic action policies
Mazumder et al. Action permissibility in deep reinforcement learning and application to autonomous driving
Naing et al. Dynamic car-following model calibration with deep reinforcement learning
CN117396389A (en) Automatic driving instruction generation model optimization method, device, equipment and storage medium
Hjaltason Predicting vehicle trajectories with inverse reinforcement learning
Radovic et al. Agent forecasting at flexible horizons using ODE flows
Pak et al. Carnet: A dynamic autoencoder for learning latent dynamics in autonomous driving tasks
CN114104005B (en) Decision-making method, device and equipment of automatic driving equipment and readable storage medium
Schütt et al. Exploring the Range of Possible Outcomes by means of Logical Scenario Analysis and Reduction for Testing Automated Driving Systems
Madni et al. Digital Twin: Key Enabler and Complement to Model-Based Systems Engineering
CN112800670B (en) Multi-target structure optimization method and device for driving cognitive model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant