CN114154566A - Edge computing active service method and system based on deep reinforcement learning - Google Patents

Edge computing active service method and system based on deep reinforcement learning

Info

Publication number
CN114154566A
CN114154566A
Authority
CN
China
Prior art keywords
user
intention
model
function
reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111370645.0A
Other languages
Chinese (zh)
Inventor
缪巍巍
张明轩
曾锃
黄进
张瑞
张震
李世豪
滕昌志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Telecommunication Branch of State Grid Jiangsu Electric Power Co Ltd
Priority to CN202111370645.0A
Publication of CN114154566A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses an edge computing active service method and system based on deep reinforcement learning, wherein the method comprises the following steps: 1) extracting user characteristic information and, at the same time, extracting the user intention classification; 2) pre-training an intention pre-judgment model through a deep neural network, whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the optimized model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the intention pre-judgment model as the representation vector; 3) optimizing the DDPG model through online exploration; 4) setting a reinforcement-learning reward function, wherein if the user uses one of the services, the reward value is 1, otherwise it is 0, and pre-judging the user resource request according to the reward value. The method improves the service efficiency of edge nodes and user satisfaction.

Description

Edge computing active service method and system based on deep reinforcement learning
Technical Field
The invention relates to an edge computing active service system and method based on deep reinforcement learning, and belongs to the technical field of edge computing.
Background
While a user of edge computing (such as an AR user or an intrusion-detection terminal device) interacts with an edge node, the edge node may provide active edge services, such as computation offloading and edge caching, according to the user's load condition, so as to improve the user experience. If the user's performance bottleneck can be pre-judged in advance, the user can be served proactively based on his or her usage information, further improving the experience. Pre-judging the user's load condition and providing active service can effectively improve user satisfaction; existing methods mainly comprise the following:
1) Configuration based on manual rules: according to user preferences, historical loads and the like, relevant rules are configured manually to pre-judge user resource demands; for example, video resources can be deployed in advance for users who like watching movies, and more computing resources can be pre-allocated for users who enjoy playing games.
Problems with manual rule configuration:
a) expert domain knowledge and a large amount of manual involvement are required;
b) user resource demands may be changeable and complex, so the configuration has to be adjusted step by step;
c) user portraits, application information and the like are very complex, with thousands of features, so it is difficult to configure reasonable rules manually.
2) Methods based on supervised learning: supervised models (neural networks, tree models and the like) are trained on user features, historical loads and so on; user resource demands are predicted through multi-class classification and deployed in advance.
Problems with supervised learning:
a) user resource demands are sequential: the quality of service of the previous resource request influences the user's next demand, which supervised learning has difficulty taking into account;
b) as services evolve, the resource-request characteristics of a user's different applications also change; every time an application is updated, supervised learning must retrain the model, which is computationally expensive and time-consuming.
Disclosure of Invention
The technical problems to be solved by the invention are as follows: in an edge computing scenario, user resource requests are sequential and may change dynamically, and existing methods that retrain a model for each change are computationally expensive and time-consuming.
The working principle of the technical scheme adopted by the invention is as follows: the invention pre-judges user resource demand in the following two scenarios:
edge caching: when a user browses video resources and the like, if the user's video requests can be predicted, content can be cached at the edge in advance, providing the user with faster bandwidth resources;
compute-intensive applications: for application requests such as gaming and data computation, if the user's request can be predicted, more efficient computing services can be provided proactively for the user's computing tasks, improving the user's computational efficiency.
For these situations, the method and system pre-judge user resource requests in a reinforcement-learning manner: each time, different resource services are explored and provided to the user, rewards are obtained through user clicks or other feedback, and the final objective is to maximize the user's long-term cumulative reward, i.e. user satisfaction.
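Stated formally, this objective is the standard discounted-return objective of reinforcement learning; the formula below is an assumed formalization (the patent states the goal only in words), using the 0/1 usage reward defined in step 4) below and the discount factor γ = 0.95 from the detailed description:

```latex
% Long-term cumulative (discounted) reward maximized by the active-service
% policy \pi; r_t is the 0/1 usage reward of step 4).
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_{t}\right],
\qquad
r_{t} =
\begin{cases}
1, & \text{if the user uses the pushed service at step } t,\\
0, & \text{otherwise,}
\end{cases}
\qquad \gamma = 0.95 .
```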
The technical scheme of the invention is as follows:
an edge computing active service method based on deep reinforcement learning comprises the following steps:
1) extracting user characteristic information, comprising a user portrait, the user's application load within a set period, the user's position and the like, and extracting the user intention classification;
2) pre-training an intention pre-judgment model through a deep neural network, wherein the model is a multi-class neural network whose inputs are the user portrait, the user's application load within a set period and the user's position, and whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the trained model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the trained model as the representation vector;
3) optimizing the DDPG model through online exploration;
4) setting a reinforcement-learning reward function: if the user uses the service corresponding to an intention, the reward value is 1, otherwise it is 0; while interacting with the user, the active service system pre-judges the user resource request according to the reward value and selects the action that maximizes the critic value function, i.e. provides the corresponding service.
An edge computing active service system based on deep reinforcement learning comprises the following program modules;
a feature extraction module: extracts user characteristic information, comprising a user portrait, the user's application load within a set period, the user's position and the like, and extracts the user intention classification;
a neural network training module: pre-trains an intention pre-judgment model through a deep neural network, wherein the model is a multi-class neural network whose inputs are the user portrait, the user's application load within a set period and the user's position, and whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the trained model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the trained model as the representation vector;
a model optimization module: optimizes the DDPG model through online exploration;
a pre-judgment module: sets a reinforcement-learning reward function, wherein if the user uses the service corresponding to an intention, the reward value is 1, otherwise it is 0; while interacting with the user, the active service system pre-judges the user resource request according to the reward value and selects the action that maximizes the critic value function, i.e. provides the corresponding service.
The invention achieves the following beneficial effects: through deep reinforcement learning, the method and system actively push services to the user in a dynamic environment and optimize the quality of the pushed services through continuous trial and error, which improves the service efficiency of the edge node and user satisfaction. Meanwhile, user intentions can be added or removed dynamically; through reinforcement learning the model updates itself automatically and provides optimal intention pre-judgment results for the user's sequential behavior.
Drawings
FIG. 1 is a flowchart of the edge computing active service method based on deep reinforcement learning according to the present invention;
FIG. 2 is a structural diagram of the DDPG model of the present invention.
Detailed Description
The technical scheme of the invention is further explained below with reference to the drawings and specific embodiments.
Example 1
As shown in FIG. 1, an edge computing active service method based on deep reinforcement learning according to the present invention comprises the following steps:
1) extracting user characteristic information, comprising a user portrait, the user's application load within a set period, the user's position and the like, and extracting the user intention classification;
2) pre-training an intention pre-judgment model through a deep neural network, wherein the model is a multi-class neural network whose inputs are the user portrait, the user's application load within a set period and the user's position, and whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the trained model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the trained model as the representation vector; because the last layer of the intention pre-judgment model is the normalized exponential function softmax and is irrelevant to the downstream task, the output of an intermediate network layer is used as the representation vector, following the transfer-learning approach (a minimal sketch of such a model is given after this step list);
3) optimizing the DDPG model through online exploration, which specifically comprises the following steps:
31) reinforcement learning is implemented with the DDPG (deep deterministic policy gradient) algorithm: the actor network takes the representation vector obtained in step 2) as input, and the DDPG algorithm outputs the storage or computing service to be provided to the user;
32) the critic network predicts the long-term return after a service is provided, taking the representation vector and the pushed service as input, and is optimized via the temporal-difference error:
L(w) = E[(r + γQ(s', a', w) - Q(s, a, w))^2]
wherein Q denotes the critic network, s is the current environment state, a is the selected service action, and w are the parameters of the critic network; s' and a' are respectively the state and action at the next moment, r is the reward function, and γ is the discount factor, typically 0.95; L(w) is the loss to be minimized, E[·] denotes the expectation, and a' is the action that maximizes the critic network Q(s', a', w);
33) the DDPG algorithm explores dynamically through Ornstein-Uhlenbeck (OU) noise;
4) setting a reinforcement-learning reward function: if the user uses the service corresponding to an intention, the reward value is 1, otherwise it is 0; while interacting with the user, the active service system pre-judges the user resource request according to the reward value and selects the action that maximizes the critic value function, i.e. provides the corresponding service. Through interaction with the user, the reinforcement-learning DDPG algorithm is used to optimize the policy function, i.e. the active service model, improving the pre-judgment accuracy;
5) when a new user demand is added, the deep neural network of step 2) is kept unchanged while the actor network output and the critic network input of step 3) are modified, so that the new intention is explored dynamically and the user's click-through rate is improved.
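As a concrete illustration of steps 2) and 5), the following is a minimal PyTorch sketch of the intention pre-judgment model; the layer sizes, feature dimension, and names such as IntentModel are illustrative assumptions, since the patent does not specify a concrete architecture:

```python
# Hedged sketch of the intention pre-judgment model of step 2).
# All dimensions and names are assumptions, not taken from the patent.
import torch
import torch.nn as nn

class IntentModel(nn.Module):
    def __init__(self, feature_dim: int, num_intents: int, hidden_dim: int = 128):
        super().__init__()
        # Backbone; its last hidden layer doubles as the representation vector.
        self.backbone = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.head = nn.Linear(hidden_dim, num_intents)  # logits for softmax

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))  # raw logits

    def representation(self, x: torch.Tensor) -> torch.Tensor:
        # Second-to-last layer output, reused as the DDPG state representation;
        # the softmax head is task-specific and dropped (transfer learning).
        return self.backbone(x)

# Pre-training with cross-entropy (the softmax is folded into the loss).
model = IntentModel(feature_dim=64, num_intents=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

features = torch.randn(32, 64)         # user portrait, application load, position
labels = torch.randint(0, 10, (32,))   # intention categories
loss = criterion(model(features), labels)
optimizer.zero_grad(); loss.backward(); optimizer.step()

intent = model(features).softmax(dim=-1).argmax(dim=-1)  # current intention category
state = model.representation(features)                   # input to the DDPG model
```

Because only the backbone is reused as the representation, intention categories can be added or removed (step 5) without changing the input fed to the DDPG model.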
The structure of the DDPG model is shown in FIG. 2; the model works through the following specific steps (a sketch of this loop follows the list):
1) a computing or storage service is pushed to the user according to the policy function; during training, OU noise is added to the policy output and the action that maximizes the critic value function is selected; during testing, the action that maximizes the critic value function is selected directly; the policy function refers to the output value of the policy network, which outputs the corresponding action, namely the pushed service, for each state;
2) at the user side, the user chooses whether to use the pushed service;
3) the reward is obtained according to the user's choice, and the value function and the policy function are updated simultaneously;
4) return to step 1) and repeat the cycle.
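The following is a compact sketch of this working loop, assuming a continuous action vector whose components score the candidate services; the network sizes, OU-noise parameters, and interaction interface are assumptions, and the target networks and replay buffer of full DDPG are omitted for brevity:

```python
# Hedged sketch of the DDPG working loop (steps 1)-4) above).
# Sizes, noise parameters, and the interaction interface are assumptions.
import numpy as np
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 128, 8  # representation vector -> service action scores

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())   # policy network
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))                      # Q(s, a, w)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma = 0.95  # discount factor from the description

class OUNoise:
    """Ornstein-Uhlenbeck noise for dynamic exploration during training."""
    def __init__(self, dim: int, theta: float = 0.15, sigma: float = 0.2):
        self.theta, self.sigma, self.x = theta, sigma, np.zeros(dim)

    def sample(self) -> torch.Tensor:
        self.x += -self.theta * self.x + self.sigma * np.random.randn(*self.x.shape)
        return torch.as_tensor(self.x, dtype=torch.float32)

def q(s: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    return critic(torch.cat([s, a], dim=-1))

def train_step(s, a, r, s_next):
    # Temporal-difference update: L(w) = E[(r + gamma*Q(s',a',w) - Q(s,a,w))^2].
    with torch.no_grad():
        target = r + gamma * q(s_next, actor(s_next))  # a' from the policy
    critic_loss = ((q(s, a) - target) ** 2).mean()
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Policy update: push services that maximize the critic's value estimate
    # (gradients this loss leaves on the critic are cleared by the next zero_grad).
    actor_loss = -q(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

noise = OUNoise(ACTION_DIM)
s = torch.randn(1, STATE_DIM)               # representation vector from the intent model
a = actor(s).detach() + noise.sample()      # push a service, exploring with OU noise
used = True                                 # whether the user used the pushed service
r = torch.tensor([[1.0 if used else 0.0]])  # reward: 1 if the service was used, else 0
s_next = torch.randn(1, STATE_DIM)
train_step(s, a, r, s_next)                 # update value and policy, then repeat
```

At test time one would skip the noise term and push the service whose action maximizes the critic's value estimate, matching step 1) above.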
An edge computing active service system based on deep reinforcement learning comprises the following program modules;
a feature extraction module: extracts user characteristic information, comprising a user portrait, the user's application load within a set period, the user's position and the like, and extracts the user intention classification;
a neural network training module: pre-trains an intention pre-judgment model through a deep neural network, wherein the model is a multi-class neural network whose inputs are the user portrait, the user's application load within a set period and the user's position, and whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the trained model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the trained model as the representation vector;
a model optimization module: optimizes the DDPG model through online exploration;
a pre-judgment module: sets a reinforcement-learning reward function, wherein if the user uses the service corresponding to an intention, the reward value is 1, otherwise it is 0; while interacting with the user, the active service system pre-judges the user resource request according to the reward value and selects the action that maximizes the critic value function, i.e. provides the corresponding service.
an enhancement module: when a new user demand is added, the deep neural network in the neural network training module is kept unchanged while the actor network output and the critic network input in the model optimization module are modified, so that the new intention is explored dynamically and the user's click-through rate is improved.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (7)

1. An edge computing active service method based on deep reinforcement learning is characterized by comprising the following steps:
1) extracting user characteristic information, comprising a user portrait, the user's application load within a set period and the user's position, and extracting the user intention classification;
2) pre-training an intention pre-judgment model through a deep neural network, wherein the model is a multi-class neural network whose inputs are the user portrait, the user's application load within a set period and the user's position, and whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the trained model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the trained model as the representation vector;
3) optimizing the DDPG model through online exploration;
4) setting a reinforcement-learning reward function: if the user uses the service corresponding to an intention, the reward value is 1, otherwise it is 0; while interacting with the user, the active service system pre-judges the user resource request according to the reward value and selects the action that maximizes the critic value function, i.e. provides the corresponding service.
2. The edge computing active service method based on deep reinforcement learning of claim 1, further comprising:
5) when a new user demand is added, the deep neural network of step 2) is kept unchanged while the actor network output and the critic network input of step 3) are modified, so that the new intention is explored dynamically and the user's click-through rate is improved.
3. The edge computing active service method based on deep reinforcement learning according to claim 1 or 2, wherein step 3) specifically comprises the following steps:
31) reinforcement learning is implemented with the reinforcement-learning DDPG algorithm: the actor network takes the representation vector obtained in step 2) as input, and the DDPG algorithm outputs the storage or computing service to be provided to the user;
32) the critic network predicts the long-term return after a service is provided, taking the representation vector and the pushed service as input, and is optimized via the temporal-difference error:
L(w) = E[(r + γQ(s', a', w) - Q(s, a, w))^2]
wherein Q denotes the critic network, s is the current environment state, a is the selected service action, and w are the parameters of the critic network; s' and a' are respectively the state and action at the next moment, r is the reward function, and γ is the discount factor; L(w) is the loss to be minimized, E[·] denotes the expectation, and a' is the action that maximizes the critic network Q(s', a', w);
33) the DDPG algorithm explores dynamically through Ornstein-Uhlenbeck (OU) noise.
4. The edge computing active service method based on deep reinforcement learning according to claim 1 or 2, characterized in that the DDPG model comprises the following specific working steps:
1) a computing or storage service is pushed to the user according to the policy function; during training, OU noise is added to the policy output and the action that maximizes the critic value function is selected; during testing, the action that maximizes the critic value function is selected directly; the policy function refers to the output value of the policy network, which outputs the corresponding action, namely the pushed service, for each state;
2) at the user side, the user chooses whether to use the pushed service;
3) the reward is obtained according to the user's choice, and the value function and the policy function are updated simultaneously;
4) return to step 1) and repeat the cycle.
5. An edge computing active service system based on deep reinforcement learning is characterized by comprising the following program modules;
a feature extraction module: extracts user characteristic information, comprising a user portrait, the user's application load within a set period and the user's position, and extracts the user intention classification;
a neural network training module: pre-trains an intention pre-judgment model through a deep neural network, wherein the model is a multi-class neural network whose output is a multi-class user intention probability produced by the normalized exponential function softmax; the model is then optimized with a cross-entropy loss function, the optimized model outputs the category of the current intention, and a DDPG model is established using the second-to-last layer of the model as the representation vector;
a model optimization module: optimizes the DDPG model through online exploration;
a pre-judgment module: sets a reinforcement-learning reward function, wherein if the user uses one of the services, the reward value is 1, otherwise it is 0; while interacting with the user, pre-judges the user resource request according to the reward value and selects the action that maximizes the critic value function.
6. The deep reinforcement learning-based edge computing active service system according to claim 5, further comprising:
an enhancement module: when a new user demand is added, the deep neural network in the neural network training module is kept unchanged while the actor network output and the critic network input in the model optimization module are modified, so that the new intention is explored dynamically and the user's click-through rate is improved.
7. The deep reinforcement learning-based edge computing active service system according to claim 5, wherein the DDPG model comprises the following specific working steps:
1) a computing or storage service is pushed to the user according to the policy function; during training, OU noise is added to the policy output and the action that maximizes the critic value function is selected; during testing, the action that maximizes the critic value function is selected directly; the policy function refers to the output value of the policy network, which outputs the corresponding action, namely the pushed service, for each state;
2) at the user side, the user chooses whether to use the pushed service;
3) the reward is obtained according to the user's choice, and the value function and the policy function are updated simultaneously;
4) return to step 1) and repeat the cycle.
CN202111370645.0A 2021-11-18 2021-11-18 Edge computing active service method and system based on deep reinforcement learning Pending CN114154566A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111370645.0A CN114154566A (en) 2021-11-18 2021-11-18 Edge computing active service method and system based on deep reinforcement learning

Publications (1)

Publication Number Publication Date
CN114154566A (en) 2022-03-08

Family

ID=80457057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111370645.0A (en) Pending CN114154566A (en) Edge computing active service method and system based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN114154566A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination