CN110519816A - Wireless roaming control method, device, storage medium and terminal device - Google Patents

Wireless roaming control method, device, storage medium and terminal device

Info

Publication number
CN110519816A
Authority
CN
China
Prior art keywords
wireless access
access point
roaming
neural network
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910793482.3A
Other languages
Chinese (zh)
Other versions
CN110519816B (en)
Inventor
黄泽淳
程文强
陈建平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TP Link Technologies Co Ltd
Original Assignee
TP Link Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TP Link Technologies Co Ltd
Priority to CN201910793482.3A
Publication of CN110519816A
Application granted
Publication of CN110519816B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W36/00 Hand-off or reselection arrangements
    • H04W36/08 Reselecting an access point
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W36/00 Hand-off or reselection arrangements
    • H04W36/24 Reselection being triggered by specific parameters
    • H04W36/30 Reselection being triggered by specific parameters by measured or perceived connection quality data

Abstract

The invention discloses a wireless roaming control method, comprising: sampling at a preset time period to obtain a state vector of a client, where the state vector includes the client's RSSI value, channel utilization and noise with respect to each wireless access point; obtaining a result vector from the state vector and a preset trained neural network model, where the result vector includes the roaming valuations corresponding to several wireless access points; and selecting a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and using the target wireless access point as the roaming target of the client. Correspondingly, the invention also discloses a wireless roaming control device, a computer-readable storage medium and a terminal device. With the technical solution of the invention, the roaming target can be adjusted dynamically according to environmental changes and the internal operating state of the network, improving the client's user experience.

Description

Wireless roaming control method, device, storage medium and terminal device
Technical field
The present invention relates to the field of wireless communication technology, and in particular to a wireless roaming control method, a device, a computer-readable storage medium and a terminal device.
Background
Roaming refers to the process in which a client switches from one wireless access point to another (products such as routers and wireless extenders are collectively referred to here as wireless access points). The essential problems to solve are when to trigger roaming and how to determine the target wireless access point of the roaming switch.
Most seamless roaming strategies supported by existing wireless access points are based on the 802.11k/v/r protocols and use signal strength (RSSI) as both the threshold and the decision metric. The main method can be summarized as follows: the wireless access point periodically monitors the RSSI of the client and compares it with a preset signal strength threshold. If the RSSI is below the threshold, the wireless access point sends an 802.11k request to the client; the client then queries the RSSI of the candidate wireless access points and returns the results to the current wireless access point. Using RSSI-based decision logic, the target wireless access point is chosen from the list of wireless access points, which completes the client's seamless roaming from one wireless access point to another.
Existing wireless network roaming strategies are fixed-threshold strategies that decide based on RSSI. However, wireless networks are easily affected by environmental factors such as multipath effects, location, altitude, temperature and humidity, and obstructions between the wireless access point and the client, so the wireless channel usually changes over time. Judging whether to roam by RSSI strength alone also ignores the internal operating state of the network formed by the wireless access points. A fixed-threshold RSSI strategy therefore cannot account for the variety of complex environments and network operating states, which degrades the client's user experience.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a wireless roaming control method, device, computer-readable storage medium and terminal device that can dynamically adjust the roaming target according to environmental changes and the internal operating state of the network, improving the client's user experience.
To solve the above technical problem, an embodiment of the present invention provides a wireless roaming control method applicable to a network composed of several wireless access points. The method includes:
Sampling at a preset time period to obtain a state vector of the client, where the state vector includes the client's RSSI value, channel utilization and noise with respect to each wireless access point;
Obtaining a result vector from the state vector and a preset trained neural network model, where the result vector includes the roaming valuations corresponding to the several wireless access points;
Selecting a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and using the target wireless access point as the roaming target of the client.
Further, selecting a target wireless access point from the several wireless access points according to the preset random number, the current exploration coefficient of the neural network model and the result vector specifically includes:
Determining whether the current exploration coefficient of the neural network model is greater than the preset random number;
If the current exploration coefficient is greater than the preset random number, selecting any one of the several wireless access points as the target wireless access point;
If the current exploration coefficient is not greater than the preset random number, selecting the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
Further, after the target wireless access point is determined, the method also includes:
Updating the current exploration coefficient according to the formula [formula not reproduced in the source text], where $\varepsilon_t$ denotes the exploration coefficient at the t-th sampling, $\varepsilon_{t+1}$ denotes the exploration coefficient at the (t+1)-th sampling, $\varepsilon_{start}$ denotes the initial exploration coefficient, $\varepsilon_{end}$ denotes the final exploration coefficient, and $\varepsilon_{decay}$ denotes the number of iterations over which the exploration coefficient decays.
Further, the method also includes:
Obtaining the client's current actual rate and link delay;
Calculating, from the actual rate and the link delay, the reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$, where t denotes the sampling index, $S_t$ denotes the state vector obtained at the t-th sampling, $S_{t+1}$ denotes the state vector obtained at the (t+1)-th sampling, and $A_t$ denotes the target wireless access point selected at the t-th sampling;
Storing the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ and the corresponding weight parameter $W_t$ in a preset experience replay pool;
Choosing several sample data items as a sample set according to the weight probability distribution in the experience replay pool;
Optimizing the neural network model according to the sample set and the back-propagation algorithm.
Further, calculating the reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ from the actual rate and the link delay specifically includes:
Calculating the reward parameter according to the formula $R_{t+1} = (\log_{10} S - \delta_{delay} \times D) \times (1 - \delta_{handoff} \times 1\{A_t \neq A_{t-1}\})$, where S denotes the actual rate, D denotes the link delay, $\delta_{delay}$ denotes the link delay weight and $\delta_{handoff}$ denotes the roaming switch penalty. The indicator function $1\{A_t \neq A_{t-1}\}$ equals 1 when the target wireless access point $A_t$ selected at the t-th sampling differs from the target wireless access point $A_{t-1}$ selected at the (t-1)-th sampling, and equals 0 when the two are the same;
Calculating the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ according to the formula [formula not reproduced in the source text], where S denotes the actual rate and $S_{theor}$ denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
Further, choosing several sample data items as a sample set according to the weight probability distribution in the experience replay pool specifically includes:
Calculating the weight probability of each sample data item in the experience replay pool according to the formula [formula not reproduced in the source text], where E denotes the number of sample data items in the experience replay pool, E ≥ 1, $P_j$ denotes the weight probability of the j-th sample data item, j = 1, 2, ..., E, and $W_i$ denotes the weight parameter corresponding to the i-th sample data item, i = 1, 2, ..., E;
Selecting several sample data items whose weight probability meets a preset condition as the sample set.
Further, the neural network model includes an input layer, base layers, valuation layers, decision layers and an aggregation layer.
In that case, optimizing the neural network model according to the sample set and the back-propagation algorithm specifically includes:
Optimizing the parameters of the base layers, valuation layers and decision layers of the neural network model according to the sample set and the back-propagation algorithm.
To solve the above technical problem, an embodiment of the present invention also provides a wireless roaming control device applicable to a network composed of several wireless access points. The device includes:
A state vector obtaining module, configured to sample at a preset time period to obtain a state vector of the client, where the state vector includes the client's RSSI value, channel utilization and noise with respect to each wireless access point;
A result vector obtaining module, configured to obtain a result vector from the state vector and a preset trained neural network model, where the result vector includes the roaming valuations corresponding to the several wireless access points;
A roaming target selecting module, configured to select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and to use the target wireless access point as the roaming target of the client.
An embodiment of the present invention also provides a computer-readable storage medium that includes a stored computer program. When run, the computer program controls the device on which the computer-readable storage medium resides to execute the wireless roaming control method described in any of the above.
An embodiment of the present invention also provides a terminal device including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor. When executing the computer program, the processor implements the wireless roaming control method described in any of the above.
Compared with the prior art, the embodiments of the present invention provide a wireless roaming control method, device, computer-readable storage medium and terminal device that sample at a preset time period to obtain a state vector of the client (containing the client's RSSI value, channel utilization and noise with respect to each wireless access point), obtain a result vector from the state vector and a preset trained neural network model (containing the roaming valuations corresponding to several wireless access points), and select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, using it as the roaming target of the client. The roaming target can thus be adjusted dynamically according to environmental changes and the internal operating state of the network, improving the client's user experience.
Brief description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the wireless roaming control method provided by the invention;
Fig. 2 is a structural block diagram of a preferred embodiment of the neural network model provided by the invention;
Fig. 3 is a structural block diagram of a preferred embodiment of the wireless roaming control device provided by the invention;
Fig. 4 is a structural block diagram of a preferred embodiment of the terminal device provided by the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a wireless roaming control method. Fig. 1 is a flow chart of a preferred embodiment of the method, which is applicable to a network composed of several wireless access points. The method includes steps S11 to S13:
Step S11: sample at a preset time period to obtain a state vector of the client, where the state vector includes the client's RSSI value, channel utilization and noise with respect to each wireless access point;
Step S12: obtain a result vector from the state vector and a preset trained neural network model, where the result vector includes the roaming valuations corresponding to the several wireless access points;
Step S13: select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and use the target wireless access point as the roaming target of the client.
Specifically, at each preset time period (which can be configured as needed), the network composed of several (say M) wireless access points is sampled to obtain, for the client to be roamed, the uplink RSSI value, channel utilization U and noise N with respect to each wireless access point. This gives the client's state vector $S_t = [RSSI_1, U_1, N_1, \ldots, RSSI_M, U_M, N_M]$, where t denotes the sampling index, $S_t$ denotes the state vector obtained at the t-th (current) sampling, and M denotes the number of wireless access points, M > 1; the length of $S_t$ is 3M. The state vector $S_t$ is input into the preset trained neural network model (trained in advance with deep reinforcement learning), which outputs a result vector Q of length M. Q contains the roaming valuations of the M wireless access points, one per access point. A roaming valuation represents how likely the client is to roam to the corresponding wireless access point; the larger the valuation, the greater the likelihood. A random number is then generated (0 ≤ random number ≤ 1), and the target wireless access point is selected from the M wireless access points according to this random number, the current exploration coefficient of the neural network model and the result vector. The selected target wireless access point is used as the roaming target of the client.
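The layout of the state vector can be illustrated with a minimal Python sketch. The `ApMeasurement` container, the function name and the units (dBm for RSSI and noise, a 0..1 fraction for channel utilization) are assumptions made for illustration; the source only fixes the ordering of the three quantities per access point and the total length 3M.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ApMeasurement:
    """One sample for a single access point (hypothetical container)."""
    rssi: float          # client's uplink RSSI at this AP (dBm assumed)
    channel_util: float  # channel utilization U (fraction 0..1 assumed)
    noise: float         # noise N (dBm assumed)


def build_state_vector(measurements: List[ApMeasurement]) -> List[float]:
    """Flatten M measurements into [RSSI_1, U_1, N_1, ..., RSSI_M, U_M, N_M]."""
    state: List[float] = []
    for m in measurements:
        state.extend([m.rssi, m.channel_util, m.noise])
    return state


# Example with M = 2 access points; the numbers are made up.
sample = [ApMeasurement(-48.0, 0.35, -92.0), ApMeasurement(-71.0, 0.10, -95.0)]
state_vector = build_state_vector(sample)
```

With M = 2 the resulting vector has 6 entries, matching the stated length 3M.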
It should be noted that, after the client connects to a wireless access point, environmental changes or changes in the internal operating state of the network may leave the currently connected access point unable to meet the client's needs. The client's state vector is therefore sampled at a fixed time period to judge whether the client needs to roam to another wireless access point. After the target wireless access point is selected, unnecessary roaming switches can be avoided by further checking whether the target is the same as the access point the client is currently connected to. If they differ, a roaming message is sent to the client so that the client performs the roaming switch to the target wireless access point; if they are the same, no roaming message needs to be sent.
In the wireless roaming control method provided by this embodiment, the client's RSSI value, channel utilization and noise are sampled at a preset time period to form the state vector, the state vector is input into the trained neural network model to obtain the result vector, and a target wireless access point is selected from all wireless access points according to the result vector as the client's roaming target. Because the method takes into account the effects of environmental changes and of changes in the network's internal operating state on the client, and processes them with the trained neural network model, the roaming target can be adjusted dynamically according to those changes. This improves the accuracy of the roaming target and the client's user experience.
In another preferred embodiment, selecting a target wireless access point from the several wireless access points according to the preset random number, the current exploration coefficient of the neural network model and the result vector specifically includes:
Determining whether the current exploration coefficient of the neural network model is greater than the preset random number;
If the current exploration coefficient is greater than the preset random number, selecting any one of the several wireless access points as the target wireless access point;
If the current exploration coefficient is not greater than the preset random number, selecting the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
Specifically, in combination with the above embodiment, when the target wireless access point is selected from the M wireless access points, the current exploration coefficient of the neural network model is first compared with the generated random number. If the exploration coefficient is greater than the random number, any one of the M wireless access points is selected as the target wireless access point; if the exploration coefficient is not greater than the random number, the wireless access point corresponding to the maximum roaming valuation in the result vector is selected as the target wireless access point.
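The comparison described above corresponds to what is commonly called epsilon-greedy selection. A minimal sketch follows; the function name and the optional `rng` argument are illustrative, and Python's `random.random()` draws from [0, 1) rather than the closed interval [0, 1] stated in the text, which makes no practical difference here.

```python
import random
from typing import Optional, Sequence


def select_target_ap(q_values: Sequence[float], epsilon: float,
                     rng: Optional[random.Random] = None) -> int:
    """Return the index of the target access point.

    If the exploration coefficient exceeds the random draw, pick any
    access point at random (explore); otherwise pick the access point
    with the largest roaming valuation (exploit).
    """
    rng = rng or random.Random()
    draw = rng.random()  # uniform in [0, 1)
    if epsilon > draw:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


# With epsilon = 0 the choice is always the highest-valued access point.
best = select_target_ap([0.1, 0.9, 0.3], epsilon=0.0)
```

Ties in the valuations resolve to the lowest index in this sketch; the source does not say how ties are handled.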
In another preferred embodiment, after the target wireless access point is determined, the method also includes:
Updating the current exploration coefficient according to the formula [formula not reproduced in the source text], where $\varepsilon_t$ denotes the exploration coefficient at the t-th sampling, $\varepsilon_{t+1}$ denotes the exploration coefficient at the (t+1)-th sampling, $\varepsilon_{start}$ denotes the initial exploration coefficient, $\varepsilon_{end}$ denotes the final exploration coefficient, and $\varepsilon_{decay}$ denotes the number of iterations over which the exploration coefficient decays.
Specifically, in combination with the above embodiment, after each sampling and determination of the target wireless access point, the current exploration coefficient is updated by linear decay according to this formula. The exploration coefficient is largest at the first sampling and decreases gradually as the number of samplings increases.
It should be noted that the exploration coefficient has a preset initial value $\varepsilon_{start}$ and a preset final value $\varepsilon_{end}$. Once the exploration coefficient has decayed to $\varepsilon_{end}$, it stays at $\varepsilon_{end}$ and is no longer updated.
It should be understood that, when this embodiment is first put into operation, that is, at the first sampling, the various states of the network need to be explored as much as possible, so the exploration coefficient of the neural network model is at its maximum. As the number of samplings increases, knowledge of the network state grows and the target wireless access points determined by the trained neural network model become more and more accurate, so less exploration is needed. The exploration coefficient decreases gradually, and the neural network model shifts from random exploration to executing the best-known choice.
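Since the update formula itself is not reproduced in the text, the following sketch shows one common form of linear decay that is consistent with the description: a constant step of $(\varepsilon_{start} - \varepsilon_{end}) / \varepsilon_{decay}$ per sampling, held at $\varepsilon_{end}$ once reached. The step size is an assumption, not the patent's confirmed formula.

```python
def decay_epsilon(eps_t: float, eps_start: float, eps_end: float,
                  eps_decay_steps: int) -> float:
    """One linear decay step (assumed form), clamped at the final value."""
    step = (eps_start - eps_end) / eps_decay_steps  # assumed constant step
    return max(eps_end, eps_t - step)


# Decay from 1.0 to 0.0 over 4 samplings, then hold at the final value.
eps = 1.0
history = []
for _ in range(6):
    eps = decay_epsilon(eps, eps_start=1.0, eps_end=0.0, eps_decay_steps=4)
    history.append(eps)
```

The clamp with `max` implements the stated behavior that the coefficient stops changing once it reaches its final value.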
In another preferred embodiment, the method also includes:
Obtaining the client's current actual rate and link delay;
Calculating, from the actual rate and the link delay, the reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$, where t denotes the sampling index, $S_t$ denotes the state vector obtained at the t-th sampling, $S_{t+1}$ denotes the state vector obtained at the (t+1)-th sampling, and $A_t$ denotes the target wireless access point selected at the t-th sampling;
Storing the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ and the corresponding weight parameter $W_t$ in a preset experience replay pool;
Choosing several sample data items as a sample set according to the weight probability distribution in the experience replay pool;
Optimizing the neural network model according to the sample set and the back-propagation algorithm.
This embodiment is a method for optimizing and updating the above neural network model. Specifically, in combination with the above embodiments, the client's current actual rate (in Mbps) and link delay (in ms) are first obtained by measuring throughput between the client and the currently connected wireless access point. The reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ are calculated from the measured rate and delay. The sample data and its weight parameter are then stored in the preset experience replay pool, several sample data items are chosen as a sample set according to the weight probability distribution of all sample data in the pool, and the neural network model is optimized using the chosen sample set and the back-propagation algorithm.
It should be noted that the experience replay pool is initially empty and has a capacity threshold. Each time a sample data item $(S_t, A_t, R_{t+1}, S_{t+1})$ and its weight parameter $W_t$ are stored, the number of sample data items in the pool is checked against the threshold. If the threshold is reached, the earliest stored sample data item and its weight parameter are deleted from the pool on a first-in, first-out basis.
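The first-in, first-out pool can be sketched with `collections.deque`, whose `maxlen` argument drops the oldest entry automatically when a new one is appended to a full deque. The class and field names are illustrative, and this sketch assumes the pool holds at most `capacity` items; the source does not specify whether eviction happens when the count reaches or exceeds the threshold.

```python
from collections import deque
from typing import Deque, List, NamedTuple


class Transition(NamedTuple):
    """Sample data (S_t, A_t, R_{t+1}, S_{t+1}) with its weight W_t."""
    state: List[float]
    action: int
    reward: float
    next_state: List[float]
    weight: float


class ReplayPool:
    """Bounded FIFO experience replay pool (illustrative sketch)."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) evicts the oldest item on overflow.
        self._items: Deque[Transition] = deque(maxlen=capacity)

    def store(self, transition: Transition) -> None:
        self._items.append(transition)

    def items(self) -> List[Transition]:
        return list(self._items)

    def __len__(self) -> int:
        return len(self._items)


pool = ReplayPool(capacity=2)
for a in range(3):
    pool.store(Transition([0.0], a, 1.0, [0.0], 0.5))
```

After three stores into a pool of capacity 2, only the two most recent transitions remain.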
As an improvement of the above scheme, calculating the reward parameter $R_{t+1}$ and the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ from the actual rate and the link delay specifically includes:
Calculating the reward parameter according to the formula $R_{t+1} = (\log_{10} S - \delta_{delay} \times D) \times (1 - \delta_{handoff} \times 1\{A_t \neq A_{t-1}\})$, where S denotes the actual rate, D denotes the link delay, $\delta_{delay}$ denotes the link delay weight and $\delta_{handoff}$ denotes the roaming switch penalty. The indicator function $1\{A_t \neq A_{t-1}\}$ equals 1 when the target wireless access point $A_t$ selected at the t-th sampling differs from the target wireless access point $A_{t-1}$ selected at the (t-1)-th sampling, and equals 0 when the two are the same;
Calculating the weight parameter $W_t$ corresponding to the sample data $(S_t, A_t, R_{t+1}, S_{t+1})$ according to the formula [formula not reproduced in the source text], where S denotes the actual rate and $S_{theor}$ denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
It should be noted that the indicator function $1\{A_t \neq A_{t-1}\}$ takes the value 1 when the client performs a roaming switch and 0 when it does not. The factor $(1 - \delta_{handoff} \times 1\{A_t \neq A_{t-1}\})$ thus penalizes the reward parameter $R_{t+1}$ whenever a roaming switch occurs, which discourages frequent roaming.
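The reward formula can be evaluated directly. The sketch below follows the formula as given; the function name and the numeric values of $\delta_{delay}$ and $\delta_{handoff}$ in the example are hypothetical, since the source does not list them in this section.

```python
import math


def reward(rate_mbps: float, delay_ms: float, delta_delay: float,
           delta_handoff: float, action: int, prev_action: int) -> float:
    """R_{t+1} = (log10(S) - delta_delay * D) * (1 - delta_handoff * 1{A_t != A_{t-1}})."""
    switched = 1 if action != prev_action else 0  # indicator function
    quality = math.log10(rate_mbps) - delta_delay * delay_ms
    return quality * (1 - delta_handoff * switched)


# Hypothetical weights: delta_delay = 0.01 per ms, delta_handoff = 0.2.
r_stay = reward(100.0, 10.0, 0.01, 0.2, action=1, prev_action=1)
r_switch = reward(100.0, 10.0, 0.01, 0.2, action=2, prev_action=1)
```

For the same measured rate and delay, the reward after a roaming switch is scaled down by the factor $1 - \delta_{handoff}$, which is how the formula discourages frequent roaming.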
As an improvement of the above scheme, choosing several sample data items as a sample set according to the weight probability distribution in the experience replay pool specifically includes:
Calculating the weight probability of each sample data item in the experience replay pool according to the formula [formula not reproduced in the source text], where E denotes the number of sample data items in the experience replay pool, E ≥ 1, $P_j$ denotes the weight probability of the j-th sample data item, j = 1, 2, ..., E, and $W_i$ denotes the weight parameter corresponding to the i-th sample data item, i = 1, 2, ..., E;
Selecting several sample data items whose weight probability meets a preset condition as the sample set.
Specifically, in combination with the above embodiment, the experience replay pool stores E sample data items, each with a corresponding weight parameter. The weight probability of each sample data item in the pool is calculated with the above formula, and several sample data items whose weight probability meets a preset condition (for example, a weight probability greater than a certain threshold) are then selected from the E items as the sample set for optimizing the neural network model.
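Because the weight probability formula is not reproduced in the text, the sketch below assumes the simplest normalization consistent with the symbol definitions, $P_j = W_j / \sum_{i=1}^{E} W_i$, and draws a sample set at random in proportion to those probabilities. Both the normalization and the use of weighted random sampling (with replacement) are assumptions; the threshold-based selection mentioned above would be an alternative.

```python
import random
from typing import List, Optional, Sequence


def weight_probabilities(weights: Sequence[float]) -> List[float]:
    """Assumed normalization: P_j = W_j / sum_i W_i."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def sample_indices(weights: Sequence[float], n: int,
                   rng: Optional[random.Random] = None) -> List[int]:
    """Draw n pool indices with probability proportional to their weights."""
    rng = rng or random.Random()
    probs = weight_probabilities(weights)
    return rng.choices(range(len(weights)), weights=probs, k=n)


probs = weight_probabilities([1.0, 3.0])
batch = sample_indices([1.0, 3.0], n=5, rng=random.Random(0))
```

`random.choices` samples with replacement, so the same stored item can appear more than once in a batch.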
As an improvement of the above scheme, the neural network model includes an input layer, base layers, valuation layers, decision layers and an aggregation layer.
In that case, optimizing the neural network model according to the sample set and the back-propagation algorithm specifically includes:
Optimizing the parameters of the base layers, valuation layers and decision layers of the neural network model according to the sample set and the back-propagation algorithm.
Fig. 2 is a structural block diagram of a preferred embodiment of the neural network model provided by the invention. The neural network model in Fig. 2 includes input layer 1, base layers 2 and 3 (parameters denoted θ), valuation layers 4 and 5 (parameters denoted α), decision layers 6 and 7 (parameters denoted β) and aggregation layer 8. Each layer has one or more neurons, and each neuron consists of a linear combination followed by a nonlinear activation function, so that stacking multiple layers can fit a wide variety of complex functional relationships. Layers 2 to 5 each use a fully connected layer with a ReLU activation function, and layers 6 and 7 use fully connected layers. Layers 6 and 7 output the state value function $V(S; \theta, \beta)$ and the advantage function $G(S, A; \theta, \alpha)$ respectively, and layer 8 computes its output according to the formula [formula not reproduced in the source text], where S denotes the state vector and A denotes the chosen target wireless access point.
Here, the state value function represents the value assessment of the current state $S_t$, and the advantage function represents the value assessment of selecting the target wireless access point $A_t$ in the current state $S_t$. Separating the two mainly reduces the mutual influence between the two estimates. What is particular about the roaming scenario is that the network state at the next sampling time is not directly tied to the current action. If the network is in a state far from the roaming switch boundary, the state value function dominates; when the network is in a state close to the roaming switch boundary, the advantage function dominates, because choosing a suitable roaming target at that point directly affects the subsequent reward parameter.
Specifically, in combination with the above embodiments, the selected sample set and the back-propagation algorithm are used to optimize and update the parameter θ of the base layers, the parameter α of the valuation layers and the parameter β of the decision layers according to the formula [formula not reproduced in the source text], where $\alpha_{learn}$ denotes the learning rate of the neural network and γ denotes the discount factor. The parameters $\theta_t^-$, $\alpha_t^-$, $\beta_t^-$ are updated from $\theta_t$, $\alpha_t$, $\beta_t$ once every τ training rounds; that is, every τ rounds of training, the trained parameters are copied into the neural network model used for the roaming decisions above. These parameter updates make the neural network model's prediction of the target wireless access point more accurate.
It should be noted that in neural network, the number of each specific neuron of layer can according to actual needs into Row setting, for example, the number of the wireless access point of networking is M, then in neural network model shown in Fig. 2, input layer 1 The number of neuron is 3M, and the number of the neuron of basal layer 2 and basal layer 3 is 2M, the neuron of valuation layer 4 and valuation layer 5 Number be M, the number of the neuron of decision-making level 6 is 1, and the number of the neuron of decision-making level 7 is M, the result that polymer layer 8 exports The dimension of vector Q is M.
It should be added that, before the neural network model is trained with deep reinforcement learning, the relevant deep reinforcement learning parameters need to be initialized. Table 1 shows the initialization settings of the deep reinforcement learning parameters. M denotes the number of nodes in the network topology and affects the size of the other deep reinforcement learning parameters and of the neural network model. τ denotes the update period of the neural network reference parameters: after every 20 rounds of training, the neural network parameters are copied into the neural network model used for roaming decisions, which improves the training stability of the deep reinforcement learning system and speeds up convergence. The discount coefficient γ is part of the reward function design; it expresses that roaming decisions focus mainly on the current network state while the future network state also has some influence. The learning rate α_learn denotes how fast the parameters are updated in each round of training. The training sample set size N denotes the number of samples used in each round of training; in each round, samples are drawn at random according to the weight probability distribution, which improves training stability. The larger the link delay weight δ_delay, the larger the share of the reward taken by delay. The larger the roaming handoff penalty δ_handoff, the larger the reduction applied to the reward obtained at the moment a roaming handoff occurs.
Table 1. Deep reinforcement learning parameter settings
Parameter                                           Value
Number of wireless access points M                  Given by the network topology
Initial exploration coefficient ε_start             1
Final exploration coefficient ε_end                 0.001
Exploration decay iterations ε_decay                500 × M
Number of trainings K                               max[(ε_decay + 1200), (ε_decay + 1500)]
Reference parameter update period τ                 20
Discount coefficient γ                              0.9
Learning rate α_learn                               0.005
Experience replay pool size E                       1800
Training sample set size N                          64
Link delay weight δ_delay                           0.1 Kb/ms²
Roaming handoff penalty δ_handoff                   0.1
It should be understood that, besides the preferred initialization shown in Table 1, other combinations of parameter values are also possible; the embodiments of the present invention impose no specific limitation.
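For reference, the Table 1 defaults can be gathered into a single configuration object. The sketch below is one possible arrangement, not part of the described method; the class name `DrlConfig` and all field names are hypothetical, and the unit attached to the link delay weight is dropped.

```python
from dataclasses import dataclass


@dataclass
class DrlConfig:
    num_aps: int                     # M, given by the network topology
    eps_start: float = 1.0           # initial exploration coefficient
    eps_end: float = 0.001           # final exploration coefficient
    reference_update_period: int = 20  # tau
    discount: float = 0.9            # gamma
    learning_rate: float = 0.005     # alpha_learn
    replay_pool_size: int = 1800     # E
    sample_set_size: int = 64        # N
    delay_weight: float = 0.1        # delta_delay
    handoff_penalty: float = 0.1     # delta_handoff

    @property
    def eps_decay_iterations(self) -> int:
        # Table 1: eps_decay = 500 x M
        return 500 * self.num_aps

    @property
    def training_count(self) -> int:
        # Table 1: K = max[(eps_decay + 1200), (eps_decay + 1500)]
        d = self.eps_decay_iterations
        return max(d + 1200, d + 1500)


cfg = DrlConfig(num_aps=4)
```

Deriving ε_decay and K from M keeps the values consistent when the topology changes.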
The embodiment of the present invention also provides a wireless roaming control device capable of implementing all the steps of the wireless roaming control method described in any of the above embodiments. The functions and technical effects of the modules and units in the device correspond to, and are the same as, those of the wireless roaming control method described in the above embodiments, and are not repeated here.
Fig. 3 is a structural block diagram of a preferred embodiment of the wireless roaming control device provided by the present invention. The device is applicable to a network composed of several wireless access points and comprises:
a state vector acquisition module 11, configured to sample the state vector of a client once every preset time period, wherein the state vector includes the RSSI value, channel utilization and noise of the client corresponding to each wireless access point;
a result vector acquisition module 12, configured to obtain a result vector from the state vector and a preset trained neural network model, wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
a roaming target selection module 13, configured to select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and to use the target wireless access point as the roaming target of the client.
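As an illustration of what module 11 produces, the sketch below flattens per-access-point measurements into a single state vector of length 3M. The ordering of the three values and their grouping per access point are assumptions, as are the function name `build_state_vector` and the access point identifiers.

```python
from typing import Dict, List, Tuple

# (RSSI, channel utilization, noise) observed for one wireless access point
ApMeasurement = Tuple[float, float, float]


def build_state_vector(measurements: Dict[str, ApMeasurement],
                       ap_order: List[str]) -> List[float]:
    """Concatenates the three measurements of every access point in a fixed order."""
    state: List[float] = []
    for ap in ap_order:
        rssi, utilization, noise = measurements[ap]
        state.extend([rssi, utilization, noise])
    return state


measurements = {
    "ap1": (-52.0, 0.35, -95.0),
    "ap2": (-71.0, 0.10, -92.0),
}
state = build_state_vector(measurements, ["ap1", "ap2"])
```

Using a fixed `ap_order` keeps each position of the vector tied to the same access point across samples.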
Preferably, the roaming target selection module 13 specifically comprises:
a parameter judging unit, configured to judge whether the current exploration coefficient of the neural network model is greater than the preset random number;
a first roaming target selection unit, configured to select any wireless access point among the several wireless access points as the target wireless access point if the current exploration coefficient is greater than the preset random number;
a second roaming target selection unit, configured to select the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point if the current exploration coefficient is not greater than the preset random number.
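The three units above amount to an exploration-versus-exploitation choice, which can be sketched as follows. The sketch assumes the preset random number is drawn uniformly from [0, 1) and that access points are identified by their index in the result vector; the function name `select_target_ap` is illustrative.

```python
import random


def select_target_ap(q_values, epsilon, rng=random):
    """Returns the index of the target wireless access point."""
    r = rng.random()  # the "preset random number", assumed uniform in [0, 1)
    if epsilon > r:
        # Exploration: pick any access point
        return rng.randrange(len(q_values))
    # Exploitation: pick the access point with the largest roaming valuation
    return max(range(len(q_values)), key=lambda i: q_values[i])


target = select_target_ap([0.1, 0.9, 0.3], epsilon=0.0)
```

With `epsilon=0.0` the comparison never succeeds, so the call always returns the index of the largest valuation.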
Preferably, the device further comprises:
an exploration coefficient update module, configured to update the current exploration coefficient according to the formula [formula not reproduced], wherein ε_t denotes the exploration coefficient at the t-th sampling, ε_{t+1} denotes the exploration coefficient at the (t+1)-th sampling, ε_start denotes the initial exploration coefficient, ε_end denotes the final exploration coefficient, and ε_decay denotes the number of exploration decay iterations.
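Because the update formula itself is not reproduced in this text, the sketch below substitutes a linear decay schedule purely as an assumption. It uses the same symbols (ε_t, ε_start, ε_end, ε_decay) but may differ from the formula in the original filing; the function name and default values are illustrative.

```python
def update_exploration(eps_t, eps_start=1.0, eps_end=0.001, eps_decay=1500):
    """Assumed linear decay: move from eps_start to eps_end in eps_decay steps."""
    step = (eps_start - eps_end) / eps_decay
    return max(eps_end, eps_t - step)


eps = 1.0
for _ in range(1500):
    eps = update_exploration(eps)
```

Under this assumed schedule the coefficient reaches ε_end after ε_decay updates and stays there.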
Preferably, the device further comprises:
a network data acquisition module, configured to obtain the current actual rate and link delay of the client;
a reward and weight calculation module, configured to calculate, from the actual rate and the link delay, the reward parameter R_{t+1} and the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}), wherein t denotes the sampling index, S_t denotes the state vector obtained at the t-th sampling, S_{t+1} denotes the state vector obtained at the (t+1)-th sampling, and A_t denotes the target wireless access point selected at the t-th sampling;
a sample data storage module, configured to store the sample data (S_t, A_t, R_{t+1}, S_{t+1}) and the corresponding weight parameter W_t in a preset experience replay pool;
a sample set selection module, configured to select several sample data as a sample set according to the weight probability distribution in the experience replay pool;
a model optimization module, configured to optimize the neural network model according to the sample set and the back-propagation algorithm.
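The storage step can be pictured with a bounded pool that keeps each sample together with its weight. In the sketch below, the capacity default follows Table 1, while discarding the oldest sample when the pool is full is an assumption the text does not state; the class and method names are hypothetical.

```python
from collections import deque


class ExperienceReplayPool:
    """Stores (S_t, A_t, R_{t+1}, S_{t+1}) together with its weight W_t."""

    def __init__(self, capacity=1800):
        # deque(maxlen=...) silently drops the oldest entry once full (assumed policy)
        self._items = deque(maxlen=capacity)

    def store(self, s_t, a_t, r_next, s_next, weight):
        self._items.append(((s_t, a_t, r_next, s_next), weight))

    def samples(self):
        return [sample for sample, _ in self._items]

    def weights(self):
        return [weight for _, weight in self._items]

    def __len__(self):
        return len(self._items)


pool = ExperienceReplayPool(capacity=2)
pool.store([0.1], 0, 1.5, [0.2], 0.8)
pool.store([0.2], 1, 1.2, [0.3], 0.5)
pool.store([0.3], 1, 1.9, [0.4], 0.9)
```

After the third call the pool still holds two entries, the first having been discarded.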
Preferably, the reward and weight calculation module specifically comprises:
a reward parameter calculation unit, configured to calculate the reward parameter R_{t+1} according to the formula R_{t+1} = (log10 S - δ_delay × D) × (1 - δ_handoff × 1{A_t ≠ A_{t-1}}), wherein S denotes the actual rate, D denotes the link delay, δ_delay denotes the link delay weight, and δ_handoff denotes the roaming handoff penalty; the indicator function 1{A_t ≠ A_{t-1}} equals 1 when the target wireless access point A_t selected at the t-th sampling differs from the target wireless access point A_{t-1} selected at the (t-1)-th sampling, and equals 0 when they are the same;
a weight parameter calculation unit, configured to calculate the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) according to the formula [formula not reproduced], wherein S denotes the actual rate and S_theor denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
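The reward formula above can be evaluated directly. The following is a minimal sketch, assuming the actual rate S is positive and that S and D are given in whatever units the deployment uses; the function and argument names are illustrative, and the default weights are the Table 1 values. The weight parameter W_t is not included because its formula is not reproduced in this text.

```python
import math


def reward(rate, delay, a_t, a_prev, delay_weight=0.1, handoff_penalty=0.1):
    """R_{t+1} = (log10 S - delta_delay * D) * (1 - delta_handoff * 1{A_t != A_{t-1}})."""
    handoff = 1 if a_t != a_prev else 0
    return (math.log10(rate) - delay_weight * delay) * (1 - handoff_penalty * handoff)


stay = reward(rate=1000.0, delay=10.0, a_t=0, a_prev=0)
switch = reward(rate=1000.0, delay=10.0, a_t=1, a_prev=0)
```

With identical rate and delay, the second call yields a smaller reward than the first because the handoff indicator is 1.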
Preferably, the sample set selection module specifically comprises:
a weight probability calculation unit, configured to calculate the weight probability of each sample data in the experience replay pool according to the formula [formula not reproduced], wherein E denotes the number of sample data in the experience replay pool, E ≥ 1, P_j denotes the weight probability of the j-th sample data, j = 1, 2, ..., E, and W_i denotes the weight parameter corresponding to the i-th sample data, i = 1, 2, ..., E;
a sample set selection unit, configured to select several sample data whose weight probability satisfies a preset condition as the sample set.
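One plausible reading of these two units is sketched below. It assumes the weight probability is the normalized weight, P_j = W_j / (W_1 + ... + W_E), which matches the listed symbols but is not confirmed by the text, and it assumes the "preset condition" is random sampling with replacement according to that distribution. The function names are hypothetical.

```python
import random


def weight_probabilities(weights):
    """Assumed form: P_j = W_j / sum of all W_i in the pool."""
    total = sum(weights)
    return [w / total for w in weights]


def choose_sample_set(samples, weights, n, rng=random):
    """Draws n samples (with replacement) according to the weight probabilities."""
    probs = weight_probabilities(weights)
    return rng.choices(samples, weights=probs, k=n)


probs = weight_probabilities([1.0, 1.0, 2.0])
batch = choose_sample_set(["s1", "s2", "s3"], [0.0, 0.0, 1.0], n=4)
```

In the second call only the third sample has non-zero weight, so every drawn element is `"s3"`.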
Preferably, the neural network model comprises an input layer, base layers, valuation layers, decision layers and an aggregation layer.
In that case, the model optimization module specifically comprises:
a model optimization unit, configured to optimize the parameters of the base layers, valuation layers and decision layers of the neural network model according to the sample set and the back-propagation algorithm.
The embodiment of the present invention also provides a computer-readable storage medium that includes a stored computer program, wherein the computer program, when run, controls the device in which the computer-readable storage medium is located to execute the wireless roaming control method described in any of the above embodiments.
The embodiment of the present invention also provides a terminal device. Fig. 4 is a structural block diagram of a preferred embodiment of the terminal device provided by the present invention. The terminal device comprises a processor 10, a memory 20, and a computer program stored in the memory 20 and configured to be executed by the processor 10; when executing the computer program, the processor 10 implements the wireless roaming control method described in any of the above embodiments.
Preferably, the computer program may be divided into one or more modules/units (for example, computer program 1, computer program 2, ...), which are stored in the memory 20 and executed by the processor 10 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments describe the execution process of the computer program in the terminal device.
The processor 10 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor 10 may be any conventional processor. The processor 10 is the control center of the terminal device and connects the various parts of the terminal device through various interfaces and lines.
The memory 20 mainly comprises a program storage area and a data storage area, wherein the program storage area can store the operating system, the application programs required by at least one function, and the like, and the data storage area can store related data and the like. In addition, the memory 20 may be a high-speed random access memory or a non-volatile memory, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card), or the memory 20 may be another volatile solid-state storage device.
It should be noted that the above terminal device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the structural block diagram of Fig. 4 is only an example of the terminal device and does not limit it; the terminal device may include more or fewer components than shown, combine certain components, or use different components.
In summary, the wireless roaming control method, device, computer-readable storage medium and terminal device provided by the embodiments of the present invention have the following beneficial effects:
(1) the internal operating state of the network may change over time, and the roaming handoff target of the client can be adjusted in real time according to that state, which improves the client's user experience;
(2) changes in the environment can be captured and the roaming policy adjusted dynamically, so that different environments are accommodated;
(3) compared with the traditional roaming method based on an RSSI threshold, a better network rate and a lower link delay can be obtained in the complex environments described above;
(4) compared with the traditional roaming method based on an RSSI threshold, the influence of noise and channel utilization on the channel environment is taken into account, and features are further extracted by the neural network, which helps to find the optimal roaming handoff target accurately;
(5) by combining the neural network with experience replay, the history of the environment can be remembered, which helps to perceive the behavioral characteristics of the client in advance, so that the roaming timing can be optimized;
(6) the weight parameter W_t is assigned to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) according to the current actual rate of the client, so that sample data closer to the true rate are sampled with higher probability, which improves the accuracy of the neural network model's valuation.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and variations without departing from the technical principles of the present invention, and these improvements and variations shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A wireless roaming control method, characterized in that the method is applicable to a network composed of several wireless access points, and the method comprises:
sampling the state vector of a client once every preset time period, wherein the state vector includes the RSSI value, channel utilization and noise of the client corresponding to each wireless access point;
obtaining a result vector from the state vector and a preset trained neural network model, wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
selecting a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and using the target wireless access point as the roaming target of the client.
2. The wireless roaming control method according to claim 1, characterized in that selecting a target wireless access point from the several wireless access points according to the preset random number, the current exploration coefficient of the neural network model and the result vector specifically comprises:
judging whether the current exploration coefficient of the neural network model is greater than the preset random number;
if the current exploration coefficient is greater than the preset random number, selecting any wireless access point among the several wireless access points as the target wireless access point;
if the current exploration coefficient is not greater than the preset random number, selecting the wireless access point corresponding to the maximum roaming valuation in the result vector as the target wireless access point.
3. The wireless roaming control method according to claim 1, characterized in that, after the target wireless access point is determined, the method further comprises:
updating the current exploration coefficient according to the formula [formula not reproduced], wherein ε_t denotes the exploration coefficient at the t-th sampling, ε_{t+1} denotes the exploration coefficient at the (t+1)-th sampling, ε_start denotes the initial exploration coefficient, ε_end denotes the final exploration coefficient, and ε_decay denotes the number of exploration decay iterations.
4. The wireless roaming control method according to claim 1, characterized in that the method further comprises:
obtaining the current actual rate and link delay of the client;
calculating, from the actual rate and the link delay, the reward parameter R_{t+1} and the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}), wherein t denotes the sampling index, S_t denotes the state vector obtained at the t-th sampling, S_{t+1} denotes the state vector obtained at the (t+1)-th sampling, and A_t denotes the target wireless access point selected at the t-th sampling;
storing the sample data (S_t, A_t, R_{t+1}, S_{t+1}) and the corresponding weight parameter W_t in a preset experience replay pool;
selecting several sample data as a sample set according to the weight probability distribution in the experience replay pool;
optimizing the neural network model according to the sample set and the back-propagation algorithm.
5. The wireless roaming control method according to claim 4, characterized in that calculating, from the actual rate and the link delay, the reward parameter R_{t+1} and the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) specifically comprises:
calculating the reward parameter R_{t+1} according to the formula R_{t+1} = (log10 S - δ_delay × D) × (1 - δ_handoff × 1{A_t ≠ A_{t-1}}), wherein S denotes the actual rate, D denotes the link delay, δ_delay denotes the link delay weight, and δ_handoff denotes the roaming handoff penalty; the indicator function 1{A_t ≠ A_{t-1}} equals 1 when the target wireless access point A_t selected at the t-th sampling differs from the target wireless access point A_{t-1} selected at the (t-1)-th sampling, and equals 0 when they are the same;
calculating the weight parameter W_t corresponding to the sample data (S_t, A_t, R_{t+1}, S_{t+1}) according to the formula [formula not reproduced], wherein S denotes the actual rate and S_theor denotes the theoretical rate obtained from the specification parameters of the wireless access point to which the client is connected.
6. The wireless roaming control method according to claim 4, characterized in that selecting several sample data as a sample set according to the weight probability distribution in the experience replay pool specifically comprises:
calculating the weight probability of each sample data in the experience replay pool according to the formula [formula not reproduced], wherein E denotes the number of sample data in the experience replay pool, E ≥ 1, P_j denotes the weight probability of the j-th sample data, j = 1, 2, ..., E, and W_i denotes the weight parameter corresponding to the i-th sample data, i = 1, 2, ..., E;
selecting several sample data whose weight probability satisfies a preset condition as the sample set.
7. The wireless roaming control method according to claim 4, characterized in that the neural network model comprises an input layer, base layers, valuation layers, decision layers and an aggregation layer,
and optimizing the neural network model according to the sample set and the back-propagation algorithm specifically comprises:
optimizing the parameters of the base layers, valuation layers and decision layers of the neural network model according to the sample set and the back-propagation algorithm.
8. A wireless roaming control device, characterized in that the device is applicable to a network composed of several wireless access points, and the device comprises:
a state vector acquisition module, configured to sample the state vector of a client once every preset time period, wherein the state vector includes the RSSI value, channel utilization and noise of the client corresponding to each wireless access point;
a result vector acquisition module, configured to obtain a result vector from the state vector and a preset trained neural network model, wherein the result vector includes the roaming valuations corresponding to the several wireless access points;
a roaming target selection module, configured to select a target wireless access point from the several wireless access points according to a preset random number, the current exploration coefficient of the neural network model and the result vector, and to use the target wireless access point as the roaming target of the client.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored computer program, wherein the computer program, when run, controls the device in which the computer-readable storage medium is located to execute the wireless roaming control method according to any one of claims 1 to 7.
10. A terminal device, characterized by comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the wireless roaming control method according to any one of claims 1 to 7.
CN201910793482.3A 2019-08-22 2019-08-22 Wireless roaming control method, device, storage medium and terminal equipment Active CN110519816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910793482.3A CN110519816B (en) 2019-08-22 2019-08-22 Wireless roaming control method, device, storage medium and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910793482.3A CN110519816B (en) 2019-08-22 2019-08-22 Wireless roaming control method, device, storage medium and terminal equipment

Publications (2)

Publication Number Publication Date
CN110519816A true CN110519816A (en) 2019-11-29
CN110519816B CN110519816B (en) 2021-09-10

Family

ID=68627062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910793482.3A Active CN110519816B (en) 2019-08-22 2019-08-22 Wireless roaming control method, device, storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN110519816B (en)

Also Published As

Publication number Publication date
CN110519816B (en) 2021-09-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant