CN110519816A - Wireless roaming control method, device, storage medium and terminal device
- Publication number
- CN110519816A (CN201910793482.3A)
- Authority
- CN
- China
- Prior art keywords
- wireless access
- access point
- roaming
- neural network
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W36/00—Hand-off or reselection arrangements
- H04W36/08—Reselecting an access point
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W36/00—Hand-off or reselection arrangements
- H04W36/24—Reselection being triggered by specific parameters
- H04W36/30—Reselection being triggered by specific parameters by measured or perceived connection quality data
Abstract
The invention discloses a wireless roaming control method, comprising: sampling at a preset time period to obtain a state vector of a client, the state vector including the RSSI value, channel utilization and noise of the client with respect to each wireless access point; obtaining a result vector from the state vector and a preset trained neural network model, the result vector including the roaming valuations corresponding to the several wireless access points; and selecting a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, the target wireless access point serving as the roaming target of the client. Correspondingly, the invention also discloses a wireless roaming control device, a computer-readable storage medium and a terminal device. With the technical solution of the invention, the roaming target can be adjusted dynamically according to environmental changes and the internal operating state of the network, improving the client's user experience.
Description
Technical field
The present invention relates to the field of wireless communication technology, and in particular to a wireless roaming control method, a device, a computer-readable storage medium and a terminal device.
Background
Roaming refers to the process in which a client switches from one wireless access point (routers, wireless extenders and similar products are collectively referred to here as wireless access points) to another. The essential problems to be solved are when to trigger roaming and how to determine the target wireless access point of the roaming switch.
Most seamless roaming strategies supported by existing wireless access points are based on the 802.11k/v/r protocols and use signal strength (RSSI) as the threshold and decision metric. The main method can be summarized as follows: the wireless access point periodically monitors the RSSI received from the client and compares it with a preset signal strength threshold. If the RSSI is below the threshold, the wireless access point sends an 802.11k request to the client; the client then queries the RSSI of the candidate wireless access points and returns the results to the current wireless access point. From this list of wireless access points, RSSI-based decision logic determines the target wireless access point, so that the client switches seamlessly from one wireless access point to another.
The existing wireless network roaming strategy is thus a fixed-threshold strategy that decides on RSSI alone. However, a wireless network is easily affected by environmental factors such as multipath effects, location, altitude, temperature and humidity, and obstructions between the wireless access point and the client, and the wireless channel usually varies over time. Deciding whether to roam from RSSI strength alone also ignores the internal operating state of the network formed by the wireless access points. A fixed-threshold, RSSI-based roaming decision therefore cannot take both the complex environment and the internal operating state of the network into account, which degrades the client's user experience.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a wireless roaming control method, a device, a computer-readable storage medium and a terminal device that can dynamically adjust the roaming target according to environmental changes and the internal operating state of the network, improving the client's user experience.
To solve the above technical problem, an embodiment of the present invention provides a wireless roaming control method applicable to a network composed of several wireless access points. The method includes:
sampling at a preset time period to obtain a state vector of a client, wherein the state vector includes the RSSI value, channel utilization and noise of the client with respect to each wireless access point;
obtaining a result vector from the state vector and a preset trained neural network model, wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
selecting a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and taking the target wireless access point as the roaming target of the client.
Further, selecting a target wireless access point from the several wireless access points according to the preset random number, the current exploration coefficient of the neural network model and the result vector specifically includes:
judging whether the current exploration coefficient of the neural network model is greater than the preset random number;
if the current exploration coefficient is greater than the preset random number, selecting any one of the several wireless access points as the target wireless access point;
if the current exploration coefficient is not greater than the preset random number, selecting the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
Further, after the target wireless access point is determined, the method also includes:
updating the current exploration coefficient according to the formula [formula missing in source], wherein $\varepsilon_t$ denotes the exploration coefficient at the t-th sampling, $\varepsilon_{t+1}$ denotes the exploration coefficient at the (t+1)-th sampling, $\varepsilon_{start}$ denotes the initial exploration coefficient, $\varepsilon_{end}$ denotes the final exploration coefficient, and $\varepsilon_{decay}$ denotes the number of iterations over which the exploration coefficient decays.
Further, the method also includes:
obtaining the current actual rate and link delay of the client;
calculating, from the actual rate and the link delay, a reward parameter $R_{t+1}$ and a weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$, wherein t denotes the sampling index, $S_t$ denotes the state vector obtained at the t-th sampling, $S_{t+1}$ denotes the state vector obtained at the (t+1)-th sampling, and $A_t$ denotes the target wireless access point selected at the t-th sampling;
storing the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ and the corresponding weight parameter $W_t$ in a preset experience replay pool;
choosing several sample data from the experience replay pool as a sample set according to the weight probability distribution;
optimizing the neural network model according to the sample set and the back-propagation algorithm.
Further, calculating the reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ from the actual rate and the link delay specifically includes:
calculating the reward parameter according to the formula $R_{t+1} = (\log_{10} S - \delta_{delay} \times D) \times (1 - \delta_{handoff} \times 1\{A_t \neq A_{t-1}\})$, wherein S denotes the actual rate, D denotes the link delay, $\delta_{delay}$ denotes the link delay weight, and $\delta_{handoff}$ denotes the roaming switch penalty; the indicator function $1\{A_t \neq A_{t-1}\}$ equals 1 when the target wireless access point $A_t$ selected at the t-th sampling differs from the target wireless access point $A_{t-1}$ selected at the (t-1)-th sampling, and equals 0 when the two are the same;
calculating the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ according to the formula [formula missing in source], wherein S denotes the actual rate and $S_{theor}$ denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
Further, choosing several sample data as a sample set according to the weight probability distribution in the experience replay pool specifically includes:
calculating the weight probability of each sample data in the experience replay pool according to the formula [formula missing in source], wherein E denotes the number of sample data in the experience replay pool, E ≥ 1, $P_j$ denotes the weight probability of the j-th sample data, j = 1, 2, ..., E, and $W_i$ denotes the weight parameter corresponding to the i-th sample data, i = 1, 2, ..., E;
selecting several sample data whose weight probability meets a preset condition as the sample set.
Further, the neural network model includes an input layer, base layers, valuation layers, decision layers and an aggregation layer;
then, optimizing the neural network model according to the sample set and the back-propagation algorithm specifically includes:
optimizing the parameters of the base layers, valuation layers and decision layers of the neural network model according to the sample set and the back-propagation algorithm.
To solve the above technical problem, an embodiment of the present invention also provides a wireless roaming control device applicable to a network composed of several wireless access points. The device includes:
a state vector obtaining module, configured to sample at a preset time period to obtain a state vector of a client, wherein the state vector includes the RSSI value, channel utilization and noise of the client with respect to each wireless access point;
a result vector obtaining module, configured to obtain a result vector from the state vector and a preset trained neural network model, wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
a roaming target selecting module, configured to select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and to take the target wireless access point as the roaming target of the client.
An embodiment of the present invention also provides a computer-readable storage medium that includes a stored computer program, wherein the computer program, when run, controls the device on which the computer-readable storage medium resides to execute the wireless roaming control method described in any of the above.
An embodiment of the present invention also provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; the processor implements the wireless roaming control method described in any of the above when executing the computer program.
Compared with the prior art, the embodiments of the present invention provide a wireless roaming control method, a device, a computer-readable storage medium and a terminal device. A state vector of the client is obtained by sampling at a preset time period; the state vector includes the RSSI value, channel utilization and noise of the client with respect to each wireless access point. A result vector, which includes the roaming valuations corresponding to the several wireless access points, is obtained from the state vector and a preset trained neural network model. A target wireless access point is then selected from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and is taken as the roaming target of the client. The roaming target can thus be adjusted dynamically according to environmental changes and the internal operating state of the network, improving the client's user experience.
Brief description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the wireless roaming control method provided by the present invention;
Fig. 2 is a structural block diagram of a preferred embodiment of the neural network model provided by the present invention;
Fig. 3 is a structural block diagram of a preferred embodiment of the wireless roaming control device provided by the present invention;
Fig. 4 is a structural block diagram of a preferred embodiment of the terminal device provided by the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a wireless roaming control method. Fig. 1 is a flow chart of a preferred embodiment of the method, which is applicable to a network composed of several wireless access points. The method includes steps S11 to S13:
Step S11: sample at a preset time period to obtain a state vector of a client, wherein the state vector includes the RSSI value, channel utilization and noise of the client with respect to each wireless access point;
Step S12: obtain a result vector from the state vector and a preset trained neural network model, wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
Step S13: select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and take the target wireless access point as the roaming target of the client.
Specifically, the network composed of several (assumed to be M) wireless access points is sampled at a preset time period (the period can be configured according to actual needs) to obtain, for the client to be roamed, the uplink RSSI value, channel utilization U and noise N with respect to each wireless access point. This yields the client's state vector $S_t = [RSSI_1, U_1, N_1, \ldots, RSSI_M, U_M, N_M]$, where t denotes the sampling index, $S_t$ denotes the state vector obtained at the t-th (i.e., current) sampling, M denotes the number of wireless access points, M > 1, and the length of $S_t$ is 3M. The state vector $S_t$ is input to the preset trained neural network model (the model is trained in advance with deep reinforcement learning), which outputs a result vector Q of length M. The result vector Q contains one roaming valuation for each of the M wireless access points. A roaming valuation represents the likelihood that the client roams to the corresponding wireless access point: the larger the valuation, the greater the likelihood. A random number (0 ≤ random number ≤ 1) is then generated, and a target wireless access point is selected from the M wireless access points according to this random number, the current exploration coefficient of the neural network model and the result vector. The selected target wireless access point is taken as the roaming target of the client.
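The layout of the state vector can be illustrated with a minimal Python sketch. The function name `build_state_vector` and the sample readings are hypothetical and are not taken from the patent; only the ordering (RSSI, channel utilization, noise per access point) and the length 3M come from the text above.

```python
from typing import List, Tuple


def build_state_vector(measurements: List[Tuple[float, float, float]]) -> List[float]:
    """Flatten per-AP (RSSI, channel utilization, noise) readings into S_t.

    The result has length 3M, ordered [RSSI_1, U_1, N_1, ..., RSSI_M, U_M, N_M].
    """
    state: List[float] = []
    for rssi, utilization, noise in measurements:
        state.extend([rssi, utilization, noise])
    return state


# Hypothetical readings for M = 2 access points (values are illustrative only).
s_t = build_state_vector([(-55.0, 0.30, -92.0), (-70.0, 0.10, -95.0)])
print(len(s_t))  # 3 * M entries
```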
It should be noted that after the client connects to a wireless access point, environmental changes or changes in the internal operating state of the network may leave the currently connected wireless access point unable to meet the client's needs. The client's state vector therefore has to be sampled at a regular time period to judge whether the client needs to switch wireless access points. After the target wireless access point is selected, unnecessary roaming switches can be avoided by further checking whether the target wireless access point is the same as the wireless access point to which the client is currently connected. If they differ, a roaming message is sent to the client so that the client performs the corresponding roaming switch to the target wireless access point; if they are the same, no roaming message needs to be sent to the client.
In the wireless roaming control method provided by this embodiment, the client's RSSI value, channel utilization and noise are sampled at a preset time period to obtain a state vector, the state vector is input to the trained neural network model to obtain a result vector, and a target wireless access point is selected from all wireless access points according to the result vector as the roaming target of the client. The method accounts for the effect that environmental changes and changes in the internal operating state of the network have on the client, and processes them with the trained neural network model. The roaming target can therefore be adjusted dynamically according to environmental changes and the internal operating state of the network, which improves the accuracy of the roaming target and the client's user experience.
In another preferred embodiment, selecting a target wireless access point from the several wireless access points according to the preset random number, the current exploration coefficient of the neural network model and the result vector specifically includes:
judging whether the current exploration coefficient of the neural network model is greater than the preset random number;
if the current exploration coefficient is greater than the preset random number, selecting any one of the several wireless access points as the target wireless access point;
if the current exploration coefficient is not greater than the preset random number, selecting the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
Specifically, in combination with the above embodiment, when a target wireless access point is selected from the M wireless access points according to the generated random number, the current exploration coefficient of the neural network model and the result vector, the current exploration coefficient is first compared with the generated random number. If the current exploration coefficient is greater than the generated random number, any one of the M wireless access points is selected as the target wireless access point. If the current exploration coefficient is not greater than the generated random number, the wireless access point corresponding to the maximum roaming valuation in the result vector is selected as the target wireless access point.
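The selection rule above can be sketched in Python as follows. The function name `select_target_ap` and the optional `rng` argument are illustrative assumptions; the comparison of the exploration coefficient with a random number and the two branches follow the text.

```python
import random
from typing import Optional, Sequence


def select_target_ap(q_values: Sequence[float], epsilon: float,
                     rng: Optional[random.Random] = None) -> int:
    """Return the index of the target access point chosen from result vector Q."""
    rng = rng or random.Random()
    r = rng.random()  # random number in [0, 1); the text allows 0 <= r <= 1
    if epsilon > r:
        # Exploration: pick any access point uniformly at random.
        return rng.randrange(len(q_values))
    # Exploitation: pick the access point with the largest roaming valuation.
    return max(range(len(q_values)), key=lambda i: q_values[i])


# With epsilon = 0 the greedy branch is always taken.
print(select_target_ap([0.1, 0.9, 0.3], epsilon=0.0))
```

Passing a seeded `random.Random` instance makes the exploration branch reproducible, which can help when testing.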
In another preferred embodiment, after the target wireless access point is determined, the method also includes:
updating the current exploration coefficient according to the formula [formula missing in source], wherein $\varepsilon_t$ denotes the exploration coefficient at the t-th sampling, $\varepsilon_{t+1}$ denotes the exploration coefficient at the (t+1)-th sampling, $\varepsilon_{start}$ denotes the initial exploration coefficient, $\varepsilon_{end}$ denotes the final exploration coefficient, and $\varepsilon_{decay}$ denotes the number of iterations over which the exploration coefficient decays.
Specifically, in combination with the above embodiment, after each sampling and determination of the target wireless access point, the current exploration coefficient is updated by linear decay according to the formula [formula missing in source]. The exploration coefficient is largest at the first sampling and gradually decreases as the number of samplings increases.
It should be noted that the exploration coefficient has a preset initial value $\varepsilon_{start}$ and a preset final value $\varepsilon_{end}$. Once the exploration coefficient has decayed to the final value $\varepsilon_{end}$, it stays at $\varepsilon_{end}$ and is no longer updated.
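The update formula itself does not appear in this text. One linear decay schedule consistent with the definitions given, offered here purely as an assumption, is $\varepsilon_{t+1} = \max(\varepsilon_{end},\ \varepsilon_t - (\varepsilon_{start} - \varepsilon_{end}) / \varepsilon_{decay})$. A minimal Python sketch of that assumed schedule, with the hypothetical function name `decay_epsilon` and default values taken from Table 1 for M = 3:

```python
def decay_epsilon(eps_t: float, eps_start: float = 1.0, eps_end: float = 0.001,
                  eps_decay: int = 1500) -> float:
    """Assumed linear decay: subtract a fixed step, never going below eps_end."""
    step = (eps_start - eps_end) / eps_decay
    return max(eps_end, eps_t - step)


# Starting from eps_start, the coefficient shrinks and then stays at eps_end.
eps = 1.0
for _ in range(2000):
    eps = decay_epsilon(eps)
print(eps)
```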
It should be understood that when the embodiment is first put into operation, i.e., at the first sampling, as many network states as possible need to be explored, so the exploration coefficient of the neural network model is at its maximum. As the number of samplings increases, knowledge of the network state grows and the target wireless access points determined by the trained neural network model become increasingly accurate. The network state then no longer needs to be explored as widely, the exploration coefficient gradually decreases, and the neural network model gradually shifts from random exploration to executing the optimal choice.
In another preferred embodiment, the method also includes:
obtaining the current actual rate and link delay of the client;
calculating, from the actual rate and the link delay, a reward parameter $R_{t+1}$ and a weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$, wherein t denotes the sampling index, $S_t$ denotes the state vector obtained at the t-th sampling, $S_{t+1}$ denotes the state vector obtained at the (t+1)-th sampling, and $A_t$ denotes the target wireless access point selected at the t-th sampling;
storing the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ and the corresponding weight parameter $W_t$ in a preset experience replay pool;
choosing several sample data from the experience replay pool as a sample set according to the weight probability distribution;
optimizing the neural network model according to the sample set and the back-propagation algorithm.
This embodiment is a method for optimizing and updating the above neural network model. Specifically, in combination with the above embodiment, the client's current actual rate (in Mbps) and link delay (in ms) are first obtained by measuring the throughput between the client and the currently connected wireless access point. The reward parameter $R_{t+1}$ is calculated from the obtained actual rate and link delay, and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ is calculated. The sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ and its weight parameter $W_t$ are then stored in the preset experience replay pool. Several sample data are chosen as a sample set according to the weight probability distribution of all sample data in the experience replay pool, and the neural network model is optimized according to the chosen sample set and the back-propagation algorithm.
It should be noted that the experience replay pool is initially empty and has a capacity threshold. Each time sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ and its weight parameter $W_t$ are stored in the experience replay pool, it must further be judged whether the number of sample data stored in the pool has reached the threshold. If it has, the earliest stored sample data and its weight parameter are deleted from the experience replay pool on a first-in, first-out basis.
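A bounded first-in, first-out pool of this kind can be sketched with Python's `collections.deque`. The class name `ReplayPool` and its methods are hypothetical. Note one simplification: `deque(maxlen=...)` evicts the oldest entry only when a new entry would exceed the capacity, which approximates rather than exactly reproduces the threshold check described above.

```python
from collections import deque


class ReplayPool:
    """Experience replay pool holding ((S_t, A_t, R_{t+1}, S_{t+1}), W_t) entries."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen drops the oldest entry when it overflows (FIFO).
        self.samples = deque(maxlen=capacity)

    def store(self, s_t, a_t, r_next, s_next, w_t) -> None:
        self.samples.append(((s_t, a_t, r_next, s_next), w_t))

    def __len__(self) -> int:
        return len(self.samples)


pool = ReplayPool(capacity=2)
pool.store([0.0], 0, 1.0, [0.1], 0.5)
pool.store([0.1], 1, 0.8, [0.2], 0.7)
pool.store([0.2], 1, 0.9, [0.3], 0.6)  # evicts the first entry
print(len(pool))
```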
As an improvement of the above scheme, calculating the reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ from the actual rate and the link delay specifically includes:
calculating the reward parameter according to the formula $R_{t+1} = (\log_{10} S - \delta_{delay} \times D) \times (1 - \delta_{handoff} \times 1\{A_t \neq A_{t-1}\})$, wherein S denotes the actual rate, D denotes the link delay, $\delta_{delay}$ denotes the link delay weight, and $\delta_{handoff}$ denotes the roaming switch penalty; the indicator function $1\{A_t \neq A_{t-1}\}$ equals 1 when the target wireless access point $A_t$ selected at the t-th sampling differs from the target wireless access point $A_{t-1}$ selected at the (t-1)-th sampling, and equals 0 when the two are the same;
calculating the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ according to the formula [formula missing in source], wherein S denotes the actual rate and $S_{theor}$ denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
It should be noted that the indicator function $1\{A_t \neq A_{t-1}\}$ takes the value 1 when the client performs a roaming switch and 0 when it does not. The factor $(1 - \delta_{handoff} \times 1\{A_t \neq A_{t-1}\})$ thus penalizes the reward parameter $R_{t+1}$ whenever a roaming switch occurs, which avoids frequent roaming.
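The reward formula can be written directly in Python. The function name `reward` is hypothetical, and the values of $\delta_{delay}$ and $\delta_{handoff}$ used in the example call are illustrative assumptions, since this text does not list them.

```python
import math


def reward(speed_mbps: float, delay_ms: float, a_t: int, a_prev: int,
           delta_delay: float, delta_handoff: float) -> float:
    """R_{t+1} = (log10(S) - delta_delay * D) * (1 - delta_handoff * 1{A_t != A_{t-1}})."""
    switched = 1 if a_t != a_prev else 0  # indicator of a roaming switch
    return (math.log10(speed_mbps) - delta_delay * delay_ms) * (1 - delta_handoff * switched)


# Same rate and delay, with and without a roaming switch (assumed deltas).
stay = reward(100.0, 10.0, a_t=1, a_prev=1, delta_delay=0.01, delta_handoff=0.2)
switch = reward(100.0, 10.0, a_t=2, a_prev=1, delta_delay=0.01, delta_handoff=0.2)
print(stay, switch)
```

With a positive base reward, the switching case is scaled down by the factor $(1 - \delta_{handoff})$, which is how the penalty discourages frequent roaming.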
As an improvement of the above scheme, choosing several sample data as a sample set according to the weight probability distribution in the experience replay pool specifically includes:
calculating the weight probability of each sample data in the experience replay pool according to the formula [formula missing in source], wherein E denotes the number of sample data in the experience replay pool, E ≥ 1, $P_j$ denotes the weight probability of the j-th sample data, j = 1, 2, ..., E, and $W_i$ denotes the weight parameter corresponding to the i-th sample data, i = 1, 2, ..., E;
selecting several sample data whose weight probability meets a preset condition as the sample set.
Specifically, in combination with the above embodiment, E sample data are stored in the experience replay pool, each with one corresponding weight parameter. The weight probability of each sample data in the pool is calculated according to the formula [formula missing in source]. Several sample data whose weight probability meets a preset condition (for example, sample data whose weight probability is greater than a certain weight probability threshold) are then selected from the E sample data as the sample set for optimizing the neural network model.
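The weight probability formula is not reproduced in this text. A natural reading of the definitions, stated here as an assumption only, is the normalization $P_j = W_j / \sum_{i=1}^{E} W_i$. The sketch below uses that assumed form and shows both a threshold-based selection (the example condition mentioned above) and a random draw by weight probability; the function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def weight_probabilities(weights: Sequence[float]) -> List[float]:
    """Assumed form: P_j = W_j / sum_i W_i over all E samples in the pool."""
    total = sum(weights)
    return [w / total for w in weights]


def select_by_threshold(samples: Sequence, weights: Sequence[float],
                        threshold: float) -> List:
    """Keep samples whose weight probability exceeds a preset threshold."""
    probs = weight_probabilities(weights)
    return [s for s, p in zip(samples, probs) if p > threshold]


def sample_by_weight(samples: Sequence, weights: Sequence[float], n: int,
                     rng: Optional[random.Random] = None) -> List:
    """Draw n samples (with replacement) in proportion to weight probability."""
    rng = rng or random.Random()
    return rng.choices(samples, weights=weight_probabilities(weights), k=n)


probs = weight_probabilities([1.0, 1.0, 2.0])
print(probs)
print(select_by_threshold(["a", "b", "c"], [1.0, 1.0, 2.0], threshold=0.3))
```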
As an improvement of the above scheme, the neural network model includes an input layer, base layers, valuation layers, decision layers and an aggregation layer;
then, optimizing the neural network model according to the sample set and the back-propagation algorithm specifically includes:
optimizing the parameters of the base layers, valuation layers and decision layers of the neural network model according to the sample set and the back-propagation algorithm.
Fig. 2 is a structural block diagram of a preferred embodiment of the neural network model provided by the present invention. The neural network model in Fig. 2 includes input layer 1, base layers 2 and 3 (parameters denoted θ), valuation layers 4 and 5 (parameters denoted α), decision layers 6 and 7 (parameters denoted β), and aggregation layer 8. Each layer has one or more neurons, and each neuron consists of a linear combination followed by a nonlinear activation function, so that stacking multiple layers can fit a wide variety of complex functional relations. Layers 2 to 5 each use a fully connected layer with a ReLU activation function, and layers 6 and 7 use fully connected layers. Layers 6 and 7 output the state value function $V(S; \theta, \beta)$ and the advantage function $G(S, A; \theta, \alpha)$ respectively, and layer 8 computes its output according to the formula [formula missing in source], where S denotes the state vector and A denotes the chosen target wireless access point.
Here, the state value function represents the value assessment of the current state $S_t$, and the advantage function represents the value assessment of the action of selecting target wireless access point $A_t$ in the current state $S_t$. The main purpose of separating the two is to reduce the mutual influence between the two valuations. The particularity of the roaming scenario is that the network state at the next sampling time is not directly linked to the current action. If the network is in a state far from the roaming switch boundary, the state value function plays the leading role. When the network is in a state close to the roaming switch boundary, the advantage function plays the leading role, because choosing a suitable roaming switch target at that point directly affects the subsequent reward parameter.
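The aggregation formula used by layer 8 is not reproduced in this text. As an illustration only, the sketch below uses the aggregation commonly seen in dueling network architectures, $Q(S, A) = V(S) + \big(G(S, A) - \tfrac{1}{M}\sum_{A'} G(S, A')\big)$; whether the patent uses exactly this form is an assumption. The function name `aggregate` is hypothetical.

```python
from typing import List, Sequence


def aggregate(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(S) and G(S, A) into Q(S, A), assuming mean-subtracted advantages."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (g - mean_adv) for g in advantages]


# One state value and M = 3 advantages give a result vector Q of length M.
q = aggregate(1.0, [0.0, 1.0, 2.0])
print(q)
```

Subtracting the mean advantage keeps the two outputs identifiable: adding a constant to every advantage leaves Q unchanged, so the state value carries the overall level and the advantages carry only the differences between access points.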
Specifically, in combination with the above embodiment, using the chosen sample set and the back-propagation algorithm, the parameter θ of the base layers, the parameter α of the valuation layers and the parameter β of the decision layers are optimized and updated according to the formula [formula missing in source], where $\alpha_{learn}$ denotes the learning rate of the neural network and γ denotes the discount coefficient. The reference parameters $\theta_t^-, \alpha_t^-, \beta_t^-$ are updated with $\theta_t, \alpha_t, \beta_t$ once every τ training rounds; that is, every τ training rounds the neural network parameters are copied into the neural network model used in the roaming decision described above. These parameter updates make the neural network model's prediction of the target wireless access point more accurate.
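Because the update formula is missing from this text, the following Python sketch only illustrates the two mechanisms the paragraph names, under the assumption that they follow the usual deep Q-learning pattern: a target value $R_{t+1} + \gamma \max_A Q(S_{t+1}, A)$ computed with the reference parameters, and a copy of the trained parameters into the reference parameters every τ rounds. The function names and the dictionary representation of parameters are hypothetical.

```python
import copy
from typing import Dict, Sequence


def td_target(r_next: float, q_next_reference: Sequence[float], gamma: float = 0.9) -> float:
    """Assumed target: reward plus discounted best valuation under reference parameters."""
    return r_next + gamma * max(q_next_reference)


def maybe_sync(step: int, tau: int, online: Dict, reference: Dict) -> Dict:
    """Return the reference parameters, refreshed from the online ones every tau rounds."""
    if step % tau == 0:
        return copy.deepcopy(online)
    return reference


online = {"theta": 1.0, "alpha": 2.0, "beta": 3.0}
reference = {"theta": 0.0, "alpha": 0.0, "beta": 0.0}
reference = maybe_sync(step=20, tau=20, online=online, reference=reference)
print(reference, td_target(1.0, [0.5, 2.0]))
```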
It should be noted that the number of neurons in each layer of the neural network can be set according to actual needs. For example, if the number of wireless access points in the network is M, then in the neural network model shown in Fig. 2 the number of neurons in input layer 1 is 3M, the number of neurons in basal layer 2 and in basal layer 3 is 2M, the number of neurons in valuation layer 4 and in valuation layer 5 is M, the number of neurons in decision-making layer 6 is 1, the number of neurons in decision-making layer 7 is M, and the dimension of the result vector Q output by aggregation layer 8 is M.
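As a worked example of these layer sizes, the following sketch computes the neuron counts for a given M. The function name `layer_sizes` and the dictionary keys are hypothetical labels chosen for illustration; only the counts themselves come from the text.

```python
def layer_sizes(m):
    """Return the neuron count of each layer for a network of `m` access points."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return {
        "input_layer_1": 3 * m,           # RSSI, channel utilization, noise per AP
        "basal_layer_2": 2 * m,
        "basal_layer_3": 2 * m,
        "valuation_layer_4": m,
        "valuation_layer_5": m,
        "decision_layer_6": 1,
        "decision_layer_7": m,
        "aggregation_layer_8_output": m,  # dimension of the result vector Q
    }


sizes = layer_sizes(4)
```

For M = 4 this gives 12 input neurons, 8 neurons in each basal layer, and a result vector Q with 4 entries, one roaming valuation per access point.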
It should be added that, before the neural network model is trained with deep reinforcement learning, the relevant deep reinforcement learning parameters need to be initialized. See Table 1, which gives the initialization settings of the deep reinforcement learning parameters, where: M denotes the number of network topology nodes and influences the size of the other deep reinforcement learning parameters and of the neural network model; τ denotes the update cycle of the neural network reference parameters, i.e. after every 20 rounds of training the neural network parameters are updated into the neural network model used for roaming decisions, which increases the training stability of the deep reinforcement learning system and accelerates convergence; the discount coefficient γ is part of the reward function design and indicates that roaming decisions lay more emphasis on the current network state while the future network state also has a certain influence; the learning rate α_learn denotes the parameter update speed of each round of neural network training; the training sample set size N denotes the number of sample data used in each round of training, and each round of training samples randomly according to the weight probability distribution, which improves training stability; the larger the link delay weighting δ_delay, the larger the share of the delay in the reward; the larger the roaming switch penalty δ_handoff, the larger the discount applied to the reward obtained at the moment a roaming switch occurs.
Table 1 Deep reinforcement learning parameter settings
Parameter | Value |
Number of wireless access points M | Given by the network topology |
Initial exploration coefficient ε_start | 1 |
End exploration coefficient ε_end | 0.001 |
Exploration coefficient decay iterations ε_decay | 500×M |
Number of training rounds K | max[(ε_decay+1200), (ε_decay+1500)] |
Neural network reference parameter update cycle τ | 20 |
Discount coefficient γ | 0.9 |
Learning rate α_learn | 0.005 |
Experience replay pool size E | 1800 |
Training sample set size N | 64 |
Link delay weighting δ_delay | 0.1 Kb/ms2 |
Roaming switch penalty δ_handoff | 0.1 |
It should be understood that, besides the preferred initialization settings given in Table 1, many other combinations of parameter values are possible, and the embodiments of the present invention impose no specific limitation on them.
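The settings in Table 1 could be collected in a single configuration object, for example as sketched below. The class name `DrlConfig` and its field names are hypothetical; the default values are the ones listed in Table 1, and the unit of the link delay weighting is left as stated there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrlConfig:
    """Hypothetical container for the Table 1 initialization settings."""

    m: int                        # number of wireless access points (from topology)
    eps_start: float = 1.0        # initial exploration coefficient
    eps_end: float = 0.001        # end exploration coefficient
    tau: int = 20                 # reference parameter update cycle
    gamma: float = 0.9            # discount coefficient
    learning_rate: float = 0.005  # alpha_learn
    replay_pool_size: int = 1800  # E
    sample_set_size: int = 64     # N
    delta_delay: float = 0.1      # link delay weighting (unit as given in Table 1)
    delta_handoff: float = 0.1    # roaming switch penalty

    @property
    def eps_decay(self):
        """Exploration coefficient decay iterations, 500 x M."""
        return 500 * self.m

    @property
    def training_rounds(self):
        """Number of training rounds K, written exactly as in Table 1."""
        return max(self.eps_decay + 1200, self.eps_decay + 1500)


cfg = DrlConfig(m=3)
```

Deriving `eps_decay` and `training_rounds` from M keeps the dependent values consistent when the topology changes.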
An embodiment of the present invention further provides a radio roaming control device that can carry out all processes of the radio roaming control method described in any of the above embodiments. The functions of the modules and units in the device, and the technical effects they achieve, correspond to and are the same as the functions and technical effects of the radio roaming control method described in the above embodiments, and are not repeated here.
Referring to Fig. 3, which is a structural block diagram of a preferred embodiment of a radio roaming control device provided by the present invention, the device is applicable to a network composed of several wireless access points. The device includes:
a state vector obtaining module 11, configured to sample the state vector of the client once every preset time period; wherein the state vector includes, for each wireless access point, the corresponding RSSI value, channel utilization and noise of the client;
a result vector obtaining module 12, configured to obtain a result vector according to the state vector and a preset trained neural network model; wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
a roaming candidate selecting module 13, configured to select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and to use the target wireless access point as the roaming candidate of the client.
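The state vector sampled by module 11 can be illustrated with a short sketch. The function name `build_state_vector`, the input layout (one tuple per access point) and the interleaved ordering of the three measurements are assumptions made for illustration; the text only states that the vector contains an RSSI value, a channel utilization and a noise value for each access point.

```python
def build_state_vector(measurements):
    """Flatten per-AP (rssi, channel_utilization, noise) tuples into one vector.

    For M access points the result has 3M entries, matching the size of the
    input layer described earlier.
    """
    state = []
    for rssi, utilization, noise in measurements:
        state.extend([float(rssi), float(utilization), float(noise)])
    return state


# Two access points, with made-up readings used purely as sample input.
state = build_state_vector([(-50, 0.3, -95), (-70, 0.6, -90)])
```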
Preferably, the roaming candidate selecting module 13 specifically includes:
a parameter judging unit, configured to judge whether the current exploration coefficient of the neural network model is greater than the preset random number;
a first roaming candidate selecting unit, configured to select, if the current exploration coefficient is greater than the preset random number, any one of the several wireless access points as the target wireless access point;
a second roaming candidate selecting unit, configured to select, if the current exploration coefficient is not greater than the preset random number, the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
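The selection rule implemented by these three units can be sketched as follows. The function name `select_target_ap` and the use of Python's `random` module to supply the preset random number are assumptions; the comparison itself (explore when the exploration coefficient exceeds the random number, otherwise take the maximum valuation) follows the text.

```python
import random


def select_target_ap(q_values, epsilon, rng=random):
    """Return the index of the target access point for result vector `q_values`."""
    if not q_values:
        raise ValueError("q_values must not be empty")
    threshold = rng.random()  # stands in for the preset random number, in [0, 1)
    if epsilon > threshold:
        # Exploration: pick any access point.
        return rng.randrange(len(q_values))
    # Exploitation: pick the access point with the largest roaming valuation.
    return max(range(len(q_values)), key=lambda i: q_values[i])


greedy_choice = select_target_ap([0.1, 0.9, 0.3], epsilon=0.0)
```

With an exploration coefficient of 0 the comparison never succeeds, so the call above always returns the index of the largest valuation.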
Preferably, the device further includes:
an exploration coefficient updating module, configured to update the current exploration coefficient according to the formula [formula not reproduced in this text]; wherein ε_t denotes the exploration coefficient at the t-th sampling, ε_{t+1} denotes the exploration coefficient at the (t+1)-th sampling, ε_start denotes the initial exploration coefficient, ε_end denotes the end exploration coefficient, and ε_decay denotes the number of exploration coefficient decay iterations.
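Because the update formula is not reproduced in this text, the following is only one plausible reading: a linear schedule that moves the exploration coefficient from ε_start to ε_end over ε_decay iterations and then holds it there. The function name `next_epsilon` and the linear form are assumptions, not the patent's formula; an exponential schedule would fit the same symbol definitions equally well.

```python
def next_epsilon(eps_t, eps_start=1.0, eps_end=0.001, eps_decay=500):
    """Assumed linear decay: equal steps from eps_start down to eps_end."""
    if eps_decay < 1:
        raise ValueError("eps_decay must be at least 1")
    step = (eps_start - eps_end) / eps_decay
    # Never drop below the end exploration coefficient.
    return max(eps_end, eps_t - step)


eps = 1.0
for _ in range(500):
    eps = next_epsilon(eps)
```

Under this assumed schedule, early samplings explore almost always and later samplings rely mostly on the learned roaming valuations.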
Preferably, the device further includes:
a network data obtaining module, configured to obtain the current actual rate and link delay of the client;
a reward and weight calculating module, configured to calculate, according to the actual rate and the link delay, a reward parameter R_{t+1} and a weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}); wherein t denotes the sampling count, S_t denotes the state vector obtained at the t-th sampling, S_{t+1} denotes the state vector obtained at the (t+1)-th sampling, and A_t denotes the target wireless access point selected at the t-th sampling;
a sample data storing module, configured to store the sample data (S_t, A_t, R_{t+1}, S_{t+1}) and the corresponding weight parameter W_t in a preset experience replay pool;
a sample set choosing module, configured to choose several sample data as a sample set according to the weight probability distribution in the experience replay pool;
a model optimizing module, configured to optimize the neural network model according to the sample set and the back-propagation algorithm.
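The storage step performed by the sample data storing module can be sketched with a bounded buffer. The class name `ReplayPool`, its method names and the oldest-first eviction policy are assumptions made for illustration; the text only specifies that each sample (S_t, A_t, R_{t+1}, S_{t+1}) is stored together with its weight parameter W_t in a pool of preset size.

```python
from collections import deque


class ReplayPool:
    """Hypothetical fixed-size experience replay pool."""

    def __init__(self, capacity=1800):
        # deque with maxlen silently drops the oldest entry when full.
        self._items = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, weight):
        """Store one sample (S_t, A_t, R_{t+1}, S_{t+1}) with its weight W_t."""
        self._items.append(((state, action, reward, next_state), weight))

    def samples(self):
        return [sample for sample, _ in self._items]

    def weights(self):
        return [weight for _, weight in self._items]

    def __len__(self):
        return len(self._items)


pool = ReplayPool(capacity=2)
pool.store([0.0], 0, 1.0, [0.1], 0.5)
pool.store([0.1], 1, 0.8, [0.2], 0.7)
pool.store([0.2], 1, 0.9, [0.3], 0.9)  # evicts the first sample
```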
Preferably, the reward and weight calculating module specifically includes:
a reward parameter calculating unit, configured to calculate the reward parameter R_{t+1} according to the formula R_{t+1} = (log10(S) - δ_delay × D) × (1 - δ_handoff × 1{A_t ≠ A_{t-1}}); wherein S denotes the actual rate, D denotes the link delay, δ_delay denotes the link delay weighting, δ_handoff denotes the roaming switch penalty, and the function 1{A_t ≠ A_{t-1}} equals 1 when the target wireless access point A_t selected at the t-th sampling differs from the target wireless access point A_{t-1} selected at the (t-1)-th sampling, and equals 0 when the two are the same;
a weight parameter calculating unit, configured to calculate the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) according to the formula [formula not reproduced in this text]; wherein S denotes the actual rate, and S_theor denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
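The reward formula can be written out directly, as in the sketch below. The function name `reward` and its argument names are hypothetical, the default weightings are the Table 1 values, and the units of the rate and delay arguments are not fixed by this text, so the sample inputs are illustrative only. The weight parameter W_t is left out because its formula is not reproduced here.

```python
import math


def reward(rate, delay, action, prev_action, delta_delay=0.1, delta_handoff=0.1):
    """Compute R_{t+1} = (log10(S) - d_delay * D) * (1 - d_handoff * 1{A_t != A_{t-1}})."""
    if rate <= 0:
        raise ValueError("rate must be positive to take log10")
    switched = 1 if action != prev_action else 0  # indicator of a roaming switch
    return (math.log10(rate) - delta_delay * delay) * (1 - delta_handoff * switched)


stay = reward(rate=100.0, delay=5.0, action=1, prev_action=1)
switch = reward(rate=100.0, delay=5.0, action=2, prev_action=1)
```

For the same rate and delay, staying on the current access point yields 1.5 and switching yields about 1.35, which shows how the penalty discounts the reward at the moment of a roaming switch.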
Preferably, the sample set choosing module specifically includes:
a weight probability calculating unit, configured to calculate the weight probability of each sample data in the experience replay pool according to the formula [formula not reproduced in this text]; wherein E denotes the number of sample data in the experience replay pool, E ≥ 1, P_j denotes the weight probability of the j-th sample data, j = 1, 2, ..., E, and W_i denotes the weight parameter corresponding to the i-th sample data, i = 1, 2, ..., E;
a sample set selecting unit, configured to select several sample data whose weight probabilities meet a preset condition as the sample set.
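A possible reading of the sample set selection is sketched below. Two assumptions are made and should be treated as such: the weight probability is taken to be the normalized weight, P_j = W_j / (W_1 + ... + W_E), which is consistent with the symbol definitions but is not confirmed by this text, and the "preset condition" is replaced by random draws (with replacement) according to those probabilities. The function names are hypothetical.

```python
import random


def weight_probabilities(weights):
    """Assumed normalization: P_j = W_j / sum of all W_i in the pool."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / total for w in weights]


def choose_sample_set(samples, weights, n, rng=random):
    """Draw `n` samples, with replacement, according to the weight probabilities."""
    probs = weight_probabilities(weights)
    return rng.choices(samples, weights=probs, k=n)


probs = weight_probabilities([1.0, 3.0])
batch = choose_sample_set(["sample_a", "sample_b"], [1.0, 3.0], n=5)
```

Samples with larger weight parameters are drawn more often, which matches the stated goal of favoring sample data whose actual rate is closer to the true rate.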
Preferably, the neural network model includes an input layer, basal layers, valuation layers, decision-making layers and an aggregation layer; in that case, the model optimizing module specifically includes:
a model optimizing unit, configured to optimize the parameters of the basal layers, valuation layers and decision-making layers of the neural network model according to the sample set and the back-propagation algorithm.
An embodiment of the present invention further provides a computer readable storage medium that includes a stored computer program; wherein, when running, the computer program controls the device on which the computer readable storage medium is located to execute the radio roaming control method described in any of the above embodiments.
An embodiment of the present invention further provides a terminal device. Referring to Fig. 4, which is a structural block diagram of a preferred embodiment of a terminal device provided by the present invention, the terminal device includes a processor 10, a memory 20, and a computer program stored in the memory 20 and configured to be executed by the processor 10; when executing the computer program, the processor 10 implements the radio roaming control method described in any of the above embodiments.
Preferably, the computer program may be divided into one or more modules/units (for example, computer program 1, computer program 2), which are stored in the memory 20 and executed by the processor 10 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe the execution process of the computer program in the terminal device.
The processor 10 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor 10 may be any conventional processor. The processor 10 is the control center of the terminal device and connects the various parts of the terminal device through various interfaces and lines.
The memory 20 mainly includes a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs required by at least one function, and the like, and the data storage area may store related data and the like. In addition, the memory 20 may be a high-speed random access memory or a non-volatile memory, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card), or the memory 20 may be another volatile solid-state storage device.
It should be noted that the above terminal device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the structural block diagram of Fig. 4 is only an example of the above terminal device and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, combine certain components, or use different components.
In summary, the radio roaming control method, device, computer readable storage medium and terminal device provided by the embodiments of the present invention have the following beneficial effects:
(1) the internal operating state of the network may change over time, and the roaming switch target of the client can be adjusted in real time according to that state, which improves the usage experience of the client;
(2) environmental change factors can be captured and the roaming policy adjusted dynamically, so that different environments can be adapted to;
(3) compared with the traditional roaming method based on an RSSI threshold, a better network rate and a lower link delay can be obtained in the above complex environments;
(4) compared with the traditional roaming method based on an RSSI threshold, the influence of noise and channel utilization on the channel environment is taken into account, and features are further extracted by the neural network, which helps to find the optimal roaming switch target accurately;
(5) by combining the neural network with experience replay, environmental history can be remembered, which helps to perceive the behavioral characteristics of the client in advance, so that the roaming timing can be optimized;
(6) a weight parameter W_t is assigned to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) according to the current actual rate of the client, so that sample data closer to the true rate are sampled with higher probability, which improves the accuracy of the neural network model's valuations.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and variations without departing from the technical principles of the present invention, and these improvements and variations shall also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A radio roaming control method, characterized in that the method is applicable to a network composed of several wireless access points; the method comprises:
sampling the state vector of a client once every preset time period; wherein the state vector includes, for each wireless access point, the corresponding RSSI value, channel utilization and noise of the client;
obtaining a result vector according to the state vector and a preset trained neural network model; wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
selecting a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and using the target wireless access point as the roaming candidate of the client.
2. The radio roaming control method according to claim 1, characterized in that the selecting of a target wireless access point from the several wireless access points according to the preset random number, the current exploration coefficient of the neural network model and the result vector specifically comprises:
judging whether the current exploration coefficient of the neural network model is greater than the preset random number;
if the current exploration coefficient is greater than the preset random number, selecting any one of the several wireless access points as the target wireless access point;
if the current exploration coefficient is not greater than the preset random number, selecting the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
3. The radio roaming control method according to claim 1, characterized in that, after the target wireless access point is determined, the method further comprises:
updating the current exploration coefficient according to the formula [formula not reproduced in this text]; wherein ε_t denotes the exploration coefficient at the t-th sampling, ε_{t+1} denotes the exploration coefficient at the (t+1)-th sampling, ε_start denotes the initial exploration coefficient, ε_end denotes the end exploration coefficient, and ε_decay denotes the number of exploration coefficient decay iterations.
4. The radio roaming control method according to claim 1, characterized in that the method further comprises:
obtaining the current actual rate and link delay of the client;
calculating, according to the actual rate and the link delay, a reward parameter R_{t+1} and a weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}); wherein t denotes the sampling count, S_t denotes the state vector obtained at the t-th sampling, S_{t+1} denotes the state vector obtained at the (t+1)-th sampling, and A_t denotes the target wireless access point selected at the t-th sampling;
storing the sample data (S_t, A_t, R_{t+1}, S_{t+1}) and the corresponding weight parameter W_t in a preset experience replay pool;
choosing several sample data as a sample set according to the weight probability distribution in the experience replay pool;
optimizing the neural network model according to the sample set and the back-propagation algorithm.
5. The radio roaming control method according to claim 4, characterized in that the calculating, according to the actual rate and the link delay, of the reward parameter R_{t+1} and the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) specifically comprises:
calculating the reward parameter R_{t+1} according to the formula R_{t+1} = (log10(S) - δ_delay × D) × (1 - δ_handoff × 1{A_t ≠ A_{t-1}}); wherein S denotes the actual rate, D denotes the link delay, δ_delay denotes the link delay weighting, δ_handoff denotes the roaming switch penalty, and the function 1{A_t ≠ A_{t-1}} equals 1 when the target wireless access point A_t selected at the t-th sampling differs from the target wireless access point A_{t-1} selected at the (t-1)-th sampling, and equals 0 when the two are the same;
calculating the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) according to the formula [formula not reproduced in this text]; wherein S denotes the actual rate, and S_theor denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
6. The radio roaming control method according to claim 4, characterized in that the choosing of several sample data as a sample set according to the weight probability distribution in the experience replay pool specifically comprises:
calculating the weight probability of each sample data in the experience replay pool according to the formula [formula not reproduced in this text]; wherein E denotes the number of sample data in the experience replay pool, E ≥ 1, P_j denotes the weight probability of the j-th sample data, j = 1, 2, ..., E, and W_i denotes the weight parameter corresponding to the i-th sample data, i = 1, 2, ..., E;
selecting several sample data whose weight probabilities meet a preset condition as the sample set.
7. The radio roaming control method according to claim 4, characterized in that the neural network model includes an input layer, basal layers, valuation layers, decision-making layers and an aggregation layer;
in that case, the optimizing of the neural network model according to the sample set and the back-propagation algorithm specifically comprises:
optimizing the parameters of the basal layers, valuation layers and decision-making layers of the neural network model according to the sample set and the back-propagation algorithm.
8. A radio roaming control device, characterized in that the device is applicable to a network composed of several wireless access points; the device comprises:
a state vector obtaining module, configured to sample the state vector of a client once every preset time period; wherein the state vector includes, for each wireless access point, the corresponding RSSI value, channel utilization and noise of the client;
a result vector obtaining module, configured to obtain a result vector according to the state vector and a preset trained neural network model; wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
a roaming candidate selecting module, configured to select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and to use the target wireless access point as the roaming candidate of the client.
9. A computer readable storage medium, characterized in that the computer readable storage medium includes a stored computer program; wherein, when running, the computer program controls the device on which the computer readable storage medium is located to execute the radio roaming control method according to any one of claims 1 to 7.
10. A terminal device, characterized by comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the radio roaming control method according to any one of claims 1 to 7 when executing the computer program.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910793482.3A CN110519816B (en) | 2019-08-22 | 2019-08-22 | Wireless roaming control method, device, storage medium and terminal equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110519816A (en) | 2019-11-29 |
CN110519816B (en) | 2021-09-10 |
Family
ID=68627062
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910793482.3A Active CN110519816B (en) | 2019-08-22 | 2019-08-22 | Wireless roaming control method, device, storage medium and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110519816B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101884240A (en) * | 2007-10-01 | 2010-11-10 | 高通股份有限公司 | Mobile access in a diverse access point network |
CN105657758A (en) * | 2016-01-12 | 2016-06-08 | 杭州全维通信服务股份有限公司 | Multi-AP adaptive switching method based on Markov model |
CN105808721A (en) * | 2016-03-07 | 2016-07-27 | 中国科学院声学研究所 | Data mining based customer service content analysis method and system |
WO2016118979A2 (en) * | 2015-01-23 | 2016-07-28 | C3, Inc. | Systems, methods, and devices for an enterprise internet-of-things application development platform |
CN106550348A (en) * | 2016-12-09 | 2017-03-29 | 锐捷网络股份有限公司 | A kind of method for realizing in the wireless local area network roaming, WAP and server |
WO2017054883A1 (en) * | 2015-10-02 | 2017-04-06 | Telefonaktiebolaget Lm Ericsson (Publ) | Analytics driven wireless device session context handover in operator cloud |
CN106708037A (en) * | 2016-12-05 | 2017-05-24 | 北京贝虎机器人技术有限公司 | Autonomous mobile equipment positioning method and device, and autonomous mobile equipment |
CN106792992A (en) * | 2016-12-12 | 2017-05-31 | 上海掌门科技有限公司 | A kind of method and apparatus for providing WAP information |
WO2018085416A1 (en) * | 2016-11-04 | 2018-05-11 | Intel Corporation | Mobility support for 5g nr |
CN108900333A (en) * | 2018-06-27 | 2018-11-27 | 新华三大数据技术有限公司 | A kind of appraisal procedure and assessment device of quality of wireless network |
CN109379752A (en) * | 2018-09-10 | 2019-02-22 | 中国移动通信集团江苏有限公司 | Optimization method, device, equipment and the medium of Massive MIMO |
CN109447275A (en) * | 2018-11-09 | 2019-03-08 | 西安邮电大学 | Based on the handoff algorithms of machine learning in UDN |
CN109766932A (en) * | 2018-12-25 | 2019-05-17 | 新华三大数据技术有限公司 | A kind of Feature Selection method and Feature Selection device |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111065131A (en) * | 2019-12-16 | 2020-04-24 | 深圳大学 | Switching method and device and electronic equipment |
CN111065131B (en) * | 2019-12-16 | 2023-04-18 | 深圳大学 | Switching method and device and electronic equipment |
CN111278073A (en) * | 2020-01-20 | 2020-06-12 | 普联技术有限公司 | WIFI roaming setting method and device, wireless connection equipment and readable storage medium |
CN111278073B (en) * | 2020-01-20 | 2022-03-08 | 普联技术有限公司 | WIFI roaming setting method and device, wireless connection equipment and readable storage medium |
WO2021208809A1 (en) * | 2020-04-13 | 2021-10-21 | 杭州萤石软件有限公司 | Wireless roaming method and system |
WO2022105876A1 (en) * | 2020-11-23 | 2022-05-27 | 华为技术有限公司 | Method and apparatus for selecting decision |
CN114900859A (en) * | 2022-07-11 | 2022-08-12 | 深圳市华曦达科技股份有限公司 | Easy mesh network management method and device |
CN114900859B (en) * | 2022-07-11 | 2022-09-20 | 深圳市华曦达科技股份有限公司 | Easy mesh network management method and device |
Also Published As
Publication number | Publication date |
---|---|
CN110519816B (en) | 2021-09-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||