WO2022249225A1 - Optical path design device, optical path design method, and program - Google Patents

Optical path design device, optical path design method, and program

Info

Publication number
WO2022249225A1
Authority
WO
WIPO (PCT)
Prior art keywords
slots
frequency
communication demand
mask
network
Prior art date
Application number
PCT/JP2021/019527
Other languages
French (fr)
Japanese (ja)
Inventor
将之 下田
貴章 田中
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2021/019527 (WO2022249225A1)
Priority to JP2023523710A (JPWO2022249225A1)
Publication of WO2022249225A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00: Data switching networks
    • H04L12/28: Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]

Definitions

  • the present invention relates to an optical path design device, an optical path design method, and a program.
  • In a flexible grid network, the frequency band is divided more finely than in a fixed grid network, and the bandwidth required by each optical path (hereinafter called communication demand slots) is allocated to the divided regions (hereinafter called frequency slots). Communication demand slots must be allocated to frequency slots efficiently, and the question of how to do so is called the frequency allocation problem.
  • FIG. 9 shows an example of candidates for allocating communication demand slots to frequency slots.
  • FIG. 9 shows candidates when there are 8 frequency slots and 2 communication demand slots. Each frequency slot is numbered from 1 to 8, and frequency slots 6 and 7 are occupied slots to which other communication demand slots have already been assigned. Frequency slots 1-5 and 8 are available slots to which communication demand slots can be assigned.
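  • In FIG. 9, there are seven candidate patterns for the frequency slots to which the communication demand slots are assigned. For example, in candidate 1, the communication demand slots are assigned to frequency slots 1 and 2.
  • In candidate 2, the communication demand slots are assigned to frequency slots 2 and 3. Shifting the assigned frequency slots in this way yields the seven candidates. In the following description, among the frequency slots to which communication demand slots are assigned, the one with the smallest number is called the "leading slot". Because frequency slots 6 and 7 are occupied, candidates 5 to 7 cannot be adopted.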
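  • The above describes the case of eight frequency slots.
  • However, when communication demand slots are assigned to a band with a large number of frequency slots, the number of candidates may become enormous.
  • For example, when communication demand slots comprising 15 slots are assigned to 384 frequency slots, the number of candidates in the flexible grid network is 370.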
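  • The candidate counts quoted above (7 for FIG. 9 and 370 for the larger band) follow from sliding a contiguous window across the band. The sketch below is a minimal illustration only; the function name `count_candidates` is hypothetical, and it ignores occupancy, exactly as the counts in the text do.

```python
def count_candidates(num_frequency_slots: int, num_demand_slots: int) -> int:
    """Count the leading-slot positions for a contiguous demand, ignoring occupancy."""
    if num_demand_slots < 1 or num_demand_slots > num_frequency_slots:
        return 0
    # The leading slot can be any of slots 1 .. (S - m + 1).
    return num_frequency_slots - num_demand_slots + 1


print(count_candidates(8, 2))     # 7, the FIG. 9 example
print(count_candidates(384, 15))  # 370, the larger example
```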
  • As a method for solving the frequency allocation problem, a frequency selection method using reinforcement learning has been proposed (for example, Non-Patent Document 1).
  • However, the method described in Non-Patent Document 1 has the following problems. First, the candidate frequency slots must be determined in advance. Second, the method examines frequency slots in ascending order of slot number and allocates the communication demand slots to the first frequency slot where allocation is possible, so it cannot consider other frequency slots to which the communication demand slots could also be allocated.
  • An object of the present invention is to provide an optical path designing device, an optical path designing method, and a program capable of allocating communication demand slots to frequency slots with high accuracy.
  • One aspect of the present invention is an optical path design device comprising the following units.
  • a behavior distribution estimation unit that inputs information about a network to an estimation model, which takes information about the network as input and outputs a behavior distribution indicating the probability of each frequency slot to which communication demand slots are assigned, and thereby causes the model to output the behavior distribution;
  • a mask generation unit that generates a mask, which is data indicating whether or not the communication demand slots can be allocated to frequency slots, based on the information about the network;
  • a candidate value calculation unit that calculates candidate values for allocatable frequency slots based on the behavior distribution and the mask; and
  • a frequency allocation unit that determines the frequency slots to which the communication demand slots are assigned based on the candidate values calculated by the candidate value calculation unit.
  • One aspect of the present invention is an optical path design method comprising the following steps.
  • a behavior distribution estimation step of inputting information about a network to an estimation model, which takes information about the network as input and outputs a behavior distribution indicating the probability of each frequency slot to which communication demand slots are assigned, and thereby causing the model to output the behavior distribution;
  • a mask generation step of generating a mask, which is data indicating whether or not the communication demand slots can be allocated to frequency slots, based on the information about the network;
  • a candidate value calculation step of calculating candidate values of allocatable frequency slots based on the behavior distribution and the mask; and
  • a frequency allocation step of determining the frequency slots to which the communication demand slots are assigned based on the candidate values calculated in the candidate value calculation step.
  • One aspect of the present invention is a program for causing a computer to function as the above optical path design device.
  • communication demand slots can be assigned to frequency slots with high accuracy.
  • FIG. 1 is a diagram showing the configuration of an optical path design device 1 according to a first embodiment.
  • FIG. 2 is a flowchart showing how a mask generation unit 16 according to the first embodiment generates a mask.
  • FIG. 3 is a flowchart showing the operation of the optical path design device 1 when generating an estimation model according to the first embodiment.
  • FIG. 4 is a flowchart showing the operation of the optical path design device 1 when allocating communication demand slots to frequency slots according to the first embodiment.
  • FIG. 5 is a diagram showing the environment in an experiment according to the first embodiment.
  • FIG. 6 is a diagram showing the convolutional neural network used in the experiment according to the first embodiment.
  • FIG. 7 is a diagram showing the PPO and Adam parameters in the experiment according to the first embodiment.
  • FIG. 8 is a diagram showing the blocking probability in the experiment according to the first embodiment.
  • FIG. 9 is an example of candidates for allocating communication demand slots to frequency slots.
  • FIG. 1 is a diagram showing the configuration of an optical path designing device 1 according to the first embodiment.
  • the optical path design device 1 determines frequency slots to which communication demand slots are assigned by observing the state of the network.
  • The optical path design device 1 includes a network information acquisition unit 10, a route determination unit 12, a behavior distribution estimation unit 14, an estimation model storage unit 15, a mask generation unit 16, a candidate value calculation unit 18, a frequency allocation unit 20, a reward calculation unit 22, and an estimation model updating unit 24.
  • the network information acquisition unit 10 acquires information about networks.
  • Information about the network includes information about the configuration of the network such as, for example, the network topology, the number of fibers connecting each node of the network, or the number of frequency slots for each link of the network.
  • When multiple fibers connect a pair of nodes, the number of fibers can affect which route is determined. For example, if three fibers (fiber a, fiber b, and fiber c) connect node A and node B, the path from node A to node B includes one of fiber a, fiber b, or fiber c. Therefore, route information also includes information about fibers.
  • Information about the fiber may also be considered when the route is determined. For example, the number of used frequency slots in each fiber is considered and the path through the fiber with the highest number of used frequency slots is determined.
  • Information about the network includes information about the frequency allocation of the network, such as the usage status of frequency slots for each link or the duration of the allocation.
  • Information about the network includes information about the required lightpaths.
  • the information about the requested lightpath includes, for example, the information of the nodes that are the start and end points of the lightpath, the number of communication demand slots or the duration of the allocation.
  • the network information acquisition unit 10 acquires information about the network, for example, by observing the network.
  • The frequency slot usage status determination unit 11 determines the frequency slot usage status s_t in the network based on the number of frequency slots of each link in the network and the frequency slot usage status of each link acquired by the network information acquisition unit 10.
  • the usage status of frequency slots of each link in the network is, for example, information as to whether each frequency slot is a usable slot or an occupied slot.
  • For example, suppose a network has two links, and the link usage status, which is the usage status of the eight frequency slots of each link, is represented by one-dimensional vectors l_1 and l_2.
  • l_1 and l_2 are represented by formulas (1) and (2), for example: l_1 = [0, 0, 1, 1, 1, 1, 0, 0] (1) and l_2 = [1, 1, 1, 1, 0, 0, 0, 0] (2).
  • each element takes a value of 0 or 1.
  • 0 in the nth element means that the nth frequency slot is an occupied slot.
  • a 1 in the nth element means that the nth frequency slot is a usable slot. That is, l1 means that the 3rd, 4th, 5th and 6th frequency slots are usable slots, and l2 means that the 1st to 4th frequency slots are usable slots.
  • The usage status s_t of the frequency slots in the network is defined, for example, as shown in Equation (3): s_t = [l_1; l_2] (3), that is, the matrix whose rows are l_1 and l_2.
  • s_t is a matrix containing the usage status of every link in the network.
  • s_t may also contain information about the network's frequency allocation, such as the duration of the allocation to each frequency slot.
  • the route determination unit 12 determines a route connecting the start point and the end point of the optical path in the network based on the network information acquired by the network information acquisition unit 10 .
  • the route determination unit 12 determines a route by evaluating fragmentation of used slots by entropy, for example.
  • a method for determining a route by evaluating the fragmentation of used slots by entropy is described in Non-Patent Document 2, for example.
  • the route determination unit 12 may determine the route by, for example, a method of selecting the shortest route that can be assigned from among the k-th shortest paths (K-shortest path algorithm).
  • The route determined by the route determination unit 12 can also be included in the information about the network.
  • The behavior distribution estimation unit 14 inputs the frequency slot usage status s_t and the route determined by the route determination unit 12 to the estimation model stored in the estimation model storage unit 15, and thereby estimates the behavior distribution, which is the probability distribution of the action a_t.
  • The distribution of the action a_t is called a "policy". It gives, for each frequency slot, the probability that the communication demand slots are assigned to that slot in the usage status s_t.
  • the estimation model is created using, for example, a polynomial function and a neural network, and outputs a behavioral distribution by inputting usage conditions st and routes.
  • a method of generating the estimation model stored in the estimation model storage unit 15 will be described later.
  • The behavior distribution estimated by the behavior distribution estimation unit 14 is, for example, a one-dimensional vector A_act with as many elements as there are frequency slots, and each element indicates the probability that the corresponding frequency slot is the leading slot. For example, with two communication demand slots and eight frequency slots, A_act has eight elements, and its third element indicates the probability that the third frequency slot is the leading slot.
  • The mask generation unit 16 generates data (hereinafter referred to as a mask) indicating whether communication demand slots can be allocated to frequency slots, based on the usage status s_t of the frequency slots and the route determined by the route determination unit 12.
  • FIG. 2 is a flow chart showing how the mask generator 16 generates a mask.
  • The mask generation unit 16 detects the availability of frequency slots on the route determined by the route determination unit 12 (step S100). For example, the mask generation unit 16 detects the frequency slots that can be used on all links through which the determined route passes. For example, when the route determined by the route determination unit 12 passes through two links whose frequency slot usage statuses are given by equations (1) and (2), the availability status A_available is given by Equation (4): A_available = l_1 AND l_2 (4).
  • AND means applying the logical operator AND to each element. That is, when the n-th elements of l 1 and l 2 are both 1, the n-th element of A available becomes 1; otherwise, the n-th element of A available becomes 0. In other words, availability status A available indicates frequency slots that are available on all links along a given path.
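  • The element-wise AND of step S100 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `availability` is hypothetical, and the two link vectors are example values chosen to match the description that slots 3 to 6 are usable on the first link and slots 1 to 4 on the second.

```python
def availability(link_usages):
    """Element-wise AND over link usage vectors (1 = usable, 0 = occupied)."""
    if not link_usages:
        raise ValueError("at least one link is required")
    num_slots = len(link_usages[0])
    if any(len(link) != num_slots for link in link_usages):
        raise ValueError("all links must have the same number of frequency slots")
    # A slot is available on the route only if it is usable on every link.
    return [int(all(link[i] == 1 for link in link_usages)) for i in range(num_slots)]


l1 = [0, 0, 1, 1, 1, 1, 0, 0]  # assumed example: slots 3-6 usable
l2 = [1, 1, 1, 1, 0, 0, 0, 0]  # assumed example: slots 1-4 usable
print(availability([l1, l2]))  # [0, 0, 1, 1, 0, 0, 0, 0]
```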
  • the mask generation unit 16 generates a mask A mask indicating consecutive available slots to which communication demand slots can be assigned, based on the availability status A available (step S102).
  • The mask generation unit 16 generates, for example, a mask indicating all candidates for consecutive available slots to which communication demand slots can be assigned. More specifically, the mask is a one-dimensional vector A_mask with as many elements as there are frequency slots. An n-th element of 0 indicates that the communication demand slots cannot be assigned when the n-th frequency slot is the leading slot, and an n-th element of 1 indicates that they can be assigned when the n-th frequency slot is the leading slot.
  • the mask generator 16 calculates the elements of A mask by the following calculation.
  • When the number of communication demand slots is m, the mask generation unit 16 calculates the n-th element of A_mask as 1 if the n-th through (n+m-1)-th elements of the availability status A_available are all 1, and as 0 otherwise.
  • The n-th element of A_mask is also calculated as 0 when the (n+m-1)-th element of the availability status A_available does not exist.
  • For example, with two communication demand slots and eight frequency slots, since the fourth and fifth elements of the availability status A_available are both 1, the fourth element of A_mask is 1. Since the second of the first two elements of A_available is 0, the first element of A_mask is 0. Since the ninth element of A_available does not exist, the eighth element of A_mask is 0. Performing this calculation for every element gives the mask A_mask = [0, 0, 0, 1, 1, 1, 1, 0].
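  • The mask calculation of step S102 can be sketched as a sliding-window check. The function name `generate_mask` is hypothetical, and the availability vector in the example is an assumption chosen to be consistent with the element values discussed in the text (first element 0, fourth element 1, eighth element 0 for two demand slots).

```python
def generate_mask(a_available, m):
    """Return A_mask: element n is 1 if slots n .. n+m-1 are all available, else 0."""
    if m < 1:
        raise ValueError("the number of communication demand slots must be positive")
    mask = []
    for n in range(len(a_available)):
        window = a_available[n:n + m]
        # A window shorter than m means slot n+m-1 does not exist, so the element is 0.
        fits = len(window) == m and all(v == 1 for v in window)
        mask.append(1 if fits else 0)
    return mask


a_available = [1, 0, 0, 1, 1, 1, 1, 1]  # assumed example availability status
print(generate_mask(a_available, 2))    # [0, 0, 0, 1, 1, 1, 1, 0]
```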
  • The mask generation unit 16 may instead generate a mask indicating only some of the candidates for consecutive available slots to which communication demand slots can be assigned. For example, after generating the mask A_mask indicating all the consecutive available slot candidates described above, the mask generation unit 16 changes some of the elements of the mask A_mask from 1 to 0. More specifically, if assigning the communication demand slots to frequency slots would increase the number of groups of consecutive available slots (hereinafter referred to as blocks), the corresponding element of the mask A_mask is changed from 1 to 0.
  • For example, when the availability status A_available is [1, 0, 0, 1, 1, 1, 1, 1], the number of blocks is two: a block consisting of the first available slot and a block consisting of the 4th through 8th available slots.
  • If two communication demand slots are assigned with the fifth frequency slot as the leading slot, the availability status A_available changes to [1, 0, 0, 1, 0, 0, 1, 1].
  • The number of blocks then increases to three: the first available slot, the fourth available slot, and the seventh to eighth available slots. Therefore, the mask generation unit 16 sets the fifth element of the mask A_mask to 0. Similarly, the mask generation unit 16 sets the sixth element of the mask A_mask to 0, and finally the mask A_mask becomes [0, 0, 0, 1, 0, 0, 1, 0].
  • By generating a mask that indicates only some of the candidates for consecutive available slots to which communication demand slots can be assigned, the mask generation unit 16 can prevent fragmentation of the available slots.
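  • The fragmentation-aware pruning described above can be sketched by counting blocks before and after each tentative assignment. The names `count_blocks` and `prune_fragmenting` are hypothetical, and the starting availability vector and full mask are assumed example values consistent with the block example in the text.

```python
def count_blocks(a_available):
    """Count groups of consecutive available slots (runs of 1s)."""
    blocks, previous = 0, 0
    for value in a_available:
        if value == 1 and previous == 0:
            blocks += 1
        previous = value
    return blocks


def prune_fragmenting(a_available, a_mask, m):
    """Zero the mask elements whose assignment would increase the number of blocks."""
    baseline = count_blocks(a_available)
    pruned = list(a_mask)
    for n, allowed in enumerate(a_mask):
        if not allowed:
            continue
        after = list(a_available)
        for k in range(n, n + m):
            after[k] = 0  # tentatively occupy slots n .. n+m-1
        if count_blocks(after) > baseline:
            pruned[n] = 0
    return pruned


a_available = [1, 0, 0, 1, 1, 1, 1, 1]  # assumed example availability status
full_mask = [0, 0, 0, 1, 1, 1, 1, 0]    # all candidates for two demand slots
print(prune_fragmenting(a_available, full_mask, 2))  # [0, 0, 0, 1, 0, 0, 1, 0]
```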
  • the candidate value calculator 18 calculates candidate values of allocatable slots based on the estimation result A act from the behavior distribution estimator 14 and the mask A mask generated by the mask generator 16 .
  • The candidate value calculation unit 18 calculates the allocatable candidate value A_out by Equation (5): A_out = A_act ⊙ A_mask (5).
  • Equation (5) means the Hadamard product of A act and A mask . That is, each element of A out is the product of the corresponding elements of A act and A mask . Therefore, the allocatable candidate value A out is obtained by masking non-allocatable slots indicated by A mask among the elements of A act .
  • The frequency allocation unit 20 determines the frequency slots to which communication demand slots are allocated. For example, the frequency allocation unit 20 sets the frequency slot corresponding to the element with the largest value in A_out as the leading slot. Alternatively, the frequency allocation unit 20 converts the values of the elements of A_out into probability values and sets, as the leading slot, the frequency slot corresponding to an element chosen stochastically according to those probability values.
  • a method for converting the values of the elements of A out into probability values is, for example, a method of converting the values of the elements using a softmax function, but is not limited to this.
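  • The masking and the two selection strategies above can be sketched as follows. All function names are hypothetical and the A_act values are made-up example numbers. One design choice here is an assumption rather than something stated in the text: the softmax is taken only over elements whose mask value is 1, because a softmax over the whole of A_out would give masked slots a nonzero probability.

```python
import math
import random


def candidate_values(a_act, a_mask):
    """Element-wise (Hadamard) product of the behavior distribution and the mask."""
    if len(a_act) != len(a_mask):
        raise ValueError("a_act and a_mask must have the same length")
    return [p * m for p, m in zip(a_act, a_mask)]


def greedy_leading_slot(a_out):
    """1-based slot with the largest candidate value, or None if every value is zero."""
    best = max(range(len(a_out)), key=lambda i: a_out[i])
    return best + 1 if a_out[best] > 0 else None


def sample_leading_slot(a_out, a_mask, rng=random):
    """Sample a 1-based leading slot using softmax weights over allocatable slots only."""
    allowed = [i for i, m in enumerate(a_mask) if m == 1]
    if not allowed:
        return None  # no allocatable slot: the request is blocked
    peak = max(a_out[i] for i in allowed)
    weights = [math.exp(a_out[i] - peak) for i in allowed]
    return rng.choices(allowed, weights=weights, k=1)[0] + 1


a_act = [0.05, 0.1, 0.05, 0.3, 0.2, 0.1, 0.15, 0.05]  # made-up example policy output
a_mask = [0, 0, 0, 1, 0, 0, 1, 0]
a_out = candidate_values(a_act, a_mask)
print(greedy_leading_slot(a_out))          # 4
print(sample_leading_slot(a_out, a_mask))  # either 4 or 7
```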
  • the reward calculation unit 22 calculates a reward r t+1 based on the information on the network acquired by the network information acquisition unit 10 and the action at acquired from the frequency allocation unit 20 .
  • the information about the network may be the usage status s t of the frequency slots determined by the frequency slot usage status determining section 11 .
  • the action at obtained from the frequency allocating unit 20 is the action at time t by the frequency allocating unit 20, which is the action of allocating the communication demand slot to the frequency slot.
  • The reward calculation unit 22 calculates the reward r_{t+1} as 1 when the communication demand slots can be assigned to frequency slots by performing the action a_t in the usage status s_t, and as -1 when they cannot.
  • The reward calculation unit 22 may calculate the reward r_{t+1} based on the usage status s_t and the usage status s_{t+1}. In this case, the reward calculation unit 22 calculates the reward r_{t+1} after the network information acquisition unit 10 observes the usage status s_{t+1}. The reward calculation unit 22 may also calculate the reward r_{t+1} based on the usage status s_t, the usage status s_{t+1}, and the action a_t.
  • the estimation model updating unit 24 updates the estimation model stored in the estimation model storage unit 15 based on the reward r t+1 calculated by the reward calculation unit 22 .
  • the estimation model updating unit 24 updates the estimation model so that the reward that can be obtained in the future is maximized.
  • the estimation model update method is performed using, for example, DQN (deep Q-network) described in Non-Patent Document 3 or A3C (asynchronous advantage actor-critic) described in Non-Patent Document 4, but is not limited thereto.
  • FIG. 3 is a flow chart showing the operation of the optical path design device 1 when generating an estimated model.
  • the network information acquisition unit 10 acquires information about the network (step S200).
  • the frequency slot usage status determining unit 11 determines the frequency slot usage status s t based on the information on the network (step S201).
  • the route determination unit 12 determines a route based on information about the network (step S202).
  • the behavior distribution estimation unit 14 estimates the behavior distribution based on the usage status st and the route (step S204).
  • the mask generation unit 16 generates a mask based on the usage status st and the route (step S206).
  • the candidate value calculation unit 18 calculates allocatable candidate values based on the behavior distribution and the mask (step S208).
  • the frequency allocation unit 20 allocates communication demand slots to frequency slots based on the allocatable candidate values (step S210).
  • the reward calculation unit 22 calculates a reward based on the network-related information acquired by the network information acquisition unit 10 and the behavior of the frequency allocation unit 20 (step S212).
  • the estimation model update unit 24 updates the estimation model based on the reward (step S214).
  • the optical path design device 1 repeats the operations from step S200 to step S214 to update the estimated model, thereby generating the estimated model.
  • FIG. 4 is a flowchart showing the operation of the optical path designing device 1 when allocating communication demand slots to frequency slots.
  • The flowchart shown in FIG. 4 corresponds to the operations from step S200 to step S210 in the flowchart shown in FIG. 3.
  • communication demand slots can be allocated without determining candidate frequency slots in advance. Also, by calculating A out indicating all assignable frequency slots, all assignable frequency slots can be considered.
  • FIG. 5 is a diagram showing the environment in this experiment.
  • the information about the requested lightpath that is, the information of the nodes that are the start and end points of the lightpath, the number of communication demand slots, and the duration of allocation, were randomly generated according to a uniform distribution.
  • the average arrival rate of lightpath requests was 10, and the average service time of allocation processing by the lightpath design device 1 was 12.
  • the usage status s t is represented by a two-dimensional vector.
  • A value of 1 in the element at row l, column m of the usage status s_t indicates that the m-th frequency slot of the l-th link is a usable slot, and a value of 0 indicates that it is an occupied slot.
  • the reward calculation unit 22 calculates the reward r t+1 as 1 when the frequency assignment by the frequency assignment unit 20 succeeds and -1 when the assignment fails.
  • the estimation model updating unit 24 used a convolutional neural network as the estimation model.
  • FIG. 6 is a diagram showing the convolutional neural network used in this experiment.
  • a convolutional neural network consists of a convolutional layer 200 and a fully connected layer 210 .
  • When the convolution layer 200-1 receives the frequency slot usage status s_t, it performs convolution and outputs a two-dimensional vector of size 32 × S.
  • S is the number of frequency slots and takes a value of 80.
  • convolution is performed by inputting the two-dimensional vectors output from the convolution layers 200-1 and 200-2 to the convolution layers 200-2 and 200-3, respectively.
  • the action distribution A act is output.
  • the value of the usage state s t is output.
  • the estimation model is updated based on the value of the usage status s t output from the fully connected layer 210-2.
  • PPO (Proximal Policy Optimization) was used as the learning algorithm.
  • Adam was used as an optimization algorithm.
  • FIG. 7 shows the PPO and Adam parameters in this experiment.
  • FIG. 8 is a diagram showing the blocking probability in this experiment. Without the mask, the blocking probability is approximately 80%, while with the mask, the blocking probability is approximately 2.2%. Also, when there is a mask, the blocking probability can be close to 2.2% from the stage where the evaluation step is 1, so the learning time can be greatly shortened.
  • the optical path design device 1 does not have to generate an estimated model.
  • In that case, the optical path design device 1 includes the network information acquisition unit 10, the route determination unit 12, the behavior distribution estimation unit 14, the estimation model storage unit 15, the mask generation unit 16, the candidate value calculation unit 18, and the frequency allocation unit 20, and need not include the reward calculation unit 22 or the estimation model updating unit 24.
  • the estimated model storage unit 15 stores the generated estimated model.
  • the behavior distribution estimator 14 estimates the behavior distribution, which is the probability distribution of the behavior at , based on the usage status st and the route, but is not limited to this.
  • the action distribution estimator 14 may estimate the action distribution, which is the probability distribution of the action at , based on the information about the network acquired by the network information acquisition unit 10, for example.
  • 1 optical path design device, 10 network information acquisition unit, 11 frequency slot usage status determination unit, 12 route determination unit, 14 behavior distribution estimation unit, 15 estimation model storage unit, 16 mask generation unit, 18 candidate value calculation unit, 20 frequency allocation unit, 22 reward calculation unit, 24 estimation model updating unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

This optical path design device comprises: a behavior distribution estimation unit that outputs a behavior distribution indicating the probability of frequency slots to which communication demand slots are assigned, by inputting information relating to a network to an estimation model that receives information relating to the network as input and outputs a behavior distribution indicating the probability of frequency slots to which communication demand slots are assigned; a mask generation unit that generates a mask, which is data indicating whether or not the communication demand slots can be allocated to frequency slots, on the basis of the information relating to the network; a candidate value calculation unit that calculates candidate values for allocatable frequency slots on the basis of the behavior distribution and the mask; and a frequency allocation unit that determines frequency slots to which the communication demand slots are allocated on the basis of the candidate values calculated by the candidate value calculation unit.

Description

Optical path design device, optical path design method, and program
 The present invention relates to an optical path design device, an optical path design method, and a program.
 In a flexible grid network, the frequency band is divided more finely than in a fixed grid network, and the bandwidth required by each optical path (hereinafter called communication demand slots) is allocated to the divided regions (hereinafter called frequency slots). Communication demand slots must be allocated to frequency slots efficiently, and the question of how to do so is called the frequency allocation problem.
 However, in a flexible grid network the number of frequency allocation candidates is enormous, and it is difficult to evaluate all candidates properly. FIG. 9 shows an example of candidates for allocating communication demand slots to frequency slots, for the case of 8 frequency slots and 2 communication demand slots. The frequency slots are numbered from 1 to 8. Frequency slots 6 and 7 are occupied slots, to which other communication demand slots have already been assigned. Frequency slots 1 to 5 and 8 are available slots, to which communication demand slots can be assigned.
 In FIG. 9, there are seven candidate patterns for the frequency slots to which the communication demand slots are assigned. For example, candidate 1 assigns the communication demand slots to frequency slots 1 and 2, and candidate 2 assigns them to frequency slots 2 and 3. Shifting the assigned frequency slots in this way yields the seven candidates. In the following description, among the frequency slots to which communication demand slots are assigned, the one with the smallest number is called the "leading slot".
 In the example shown in FIG. 9, frequency slots 6 and 7 are occupied slots, so any candidate that assigns a communication demand slot to either of them cannot be adopted. That is, candidates 5 to 7 cannot be adopted.
 The above describes the case of eight frequency slots. However, when communication demand slots are assigned to a band with a large number of frequency slots, the number of candidates may become enormous. For example, when communication demand slots comprising 15 slots are assigned to 384 frequency slots, the number of candidates in the flexible grid network is 370.
 As a method for solving the frequency allocation problem, a frequency selection method using reinforcement learning has been proposed (for example, Non-Patent Document 1).
 However, the method described in Non-Patent Document 1 has the following problems. First, the candidate frequency slots must be determined in advance. Second, the method examines frequency slots in ascending order of slot number and allocates the communication demand slots to the first frequency slot where allocation is possible, so it cannot consider other frequency slots to which the communication demand slots could also be allocated.
 An object of the present invention is to provide an optical path design device, an optical path design method, and a program capable of allocating communication demand slots to frequency slots with high accuracy.
 One aspect of the present invention is an optical path design device comprising: a behavior distribution estimation unit that inputs information about a network to an estimation model, which takes information about the network as input and outputs a behavior distribution indicating the probability of each frequency slot to which communication demand slots are assigned, and thereby causes the model to output the behavior distribution; a mask generation unit that generates a mask, which is data indicating whether or not the communication demand slots can be allocated to frequency slots, based on the information about the network; a candidate value calculation unit that calculates candidate values for allocatable frequency slots based on the behavior distribution and the mask; and a frequency allocation unit that determines the frequency slots to which the communication demand slots are assigned based on the candidate values calculated by the candidate value calculation unit.
One aspect of the present invention is an optical path design method comprising: a behavior distribution estimation step of inputting information about a network into an estimation model, which takes information about a network as input and outputs a behavior distribution indicating the probability of each frequency slot being assigned communication demand slots, and thereby causing the model to output the behavior distribution; a mask generation step of generating, based on the information about the network, a mask, which is data indicating whether the communication demand slots can be assigned to each frequency slot; a candidate value calculation step of calculating candidate values for allocatable frequency slots based on the behavior distribution and the mask; and a frequency allocation step of determining the frequency slots to which the communication demand slots are assigned based on the candidate values calculated in the candidate value calculation step.
One aspect of the present invention is a program for causing a computer to function as the above optical path design device.
According to the present invention, communication demand slots can be assigned to frequency slots with high accuracy.
FIG. 1 is a diagram showing the configuration of the optical path design device 1 according to the first embodiment.
FIG. 2 is a flowchart showing how the mask generation unit 16 according to the first embodiment generates a mask.
FIG. 3 is a flowchart showing the operation of the optical path design device 1 according to the first embodiment when generating an estimation model.
FIG. 4 is a flowchart showing the operation of the optical path design device 1 according to the first embodiment when assigning communication demand slots to frequency slots.
FIG. 5 is a diagram showing the environment of the experiment according to the first embodiment.
FIG. 6 is a diagram showing the convolutional neural network used in the experiment according to the first embodiment.
FIG. 7 is a diagram showing the PPO and Adam parameters in the experiment according to the first embodiment.
FIG. 8 is a diagram showing the blocking probability in the experiment according to the first embodiment.
FIG. 9 shows an example of candidates for assigning communication demand slots to frequency slots.
<First Embodiment>
FIG. 1 is a diagram showing the configuration of the optical path design device 1 according to the first embodiment. The optical path design device 1 determines the frequency slots to which communication demand slots are assigned by observing the state of the network.
The optical path design device 1 includes a network information acquisition unit 10, a route determination unit 12, a behavior distribution estimation unit 14, an estimation model storage unit 15, a mask generation unit 16, a candidate value calculation unit 18, a frequency allocation unit 20, a reward calculation unit 22, and an estimation model update unit 24.
The network information acquisition unit 10 acquires information about the network. This information includes information about the network configuration, such as the network topology, the number of fibers connecting the nodes of the network, and the number of frequency slots on each link. When two nodes are connected by more than one fiber, the number of fibers can affect which route is chosen when a route through those nodes is determined. For example, if node A and node B are connected by three fibers (fiber a, fiber b, and fiber c), a route from node A to node B includes one of fiber a, fiber b, or fiber c. The route information therefore also includes information about fibers. Information about the fibers may also be taken into account when the route is determined. For example, the number of frequency slots in use on each fiber is considered, and a route through the fiber with the largest number of frequency slots in use is selected.
The information about the network also includes information about frequency allocation in the network, such as the usage status of the frequency slots on each link and the duration of each allocation. It further includes information about the requested optical path, for example the start and end nodes of the optical path, the number of communication demand slots, and the duration of the allocation.
The network information acquisition unit 10 acquires the information about the network by, for example, observing the network.
The frequency slot usage status determination unit 11 determines the usage status s_t of the frequency slots in the network based on the number of frequency slots on each link and the usage status of the frequency slots on each link, both acquired by the network information acquisition unit 10. The usage status of the frequency slots on a link is, for example, information indicating whether each frequency slot is an available slot or an occupied slot. For example, suppose the network has two links with eight frequency slots each, and the usage status of each link is represented by the one-dimensional vectors l_1 and l_2, given for example by equations (1) and (2).
$$ l_1 = [0, 0, 1, 1, 1, 1, 0, 0] \tag{1} $$
$$ l_2 = [1, 1, 1, 1, 0, 0, 0, 0] \tag{2} $$
In equations (1) and (2), each element takes the value 0 or 1. An n-th element of 0 means that the n-th frequency slot is an occupied slot; an n-th element of 1 means that the n-th frequency slot is an available slot. Thus l_1 indicates that the third, fourth, fifth, and sixth frequency slots are available, and l_2 indicates that the first through fourth frequency slots are available.
The usage status s_t of the frequency slots in the network is then defined, for example, as shown in equation (3).
$$ s_t = \begin{bmatrix} l_1 \\ l_2 \end{bmatrix} \tag{3} $$
That is, s_t is a matrix containing the usage status of every link in the network. s_t may also include information about frequency allocation in the network, such as the duration of the allocation to each frequency slot.
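As a concrete illustration of the link usage vectors and the state matrix, the following minimal sketch builds l_1 and l_2 from the description above and stacks them into s_t, one row per link. Plain Python lists are used, and the helper name `is_available` is an illustrative assumption, not something defined in the source.

```python
# Link usage status: 1 = available slot, 0 = occupied slot (8 slots per link).
l_1 = [0, 0, 1, 1, 1, 1, 0, 0]  # slots 3 to 6 are available
l_2 = [1, 1, 1, 1, 0, 0, 0, 0]  # slots 1 to 4 are available

# s_t holds one row per link and one column per frequency slot.
s_t = [l_1, l_2]


def is_available(state, link, slot):
    """Return True if the 1-based `slot` on the 1-based `link` is available."""
    return state[link - 1][slot - 1] == 1
```

The 1-based indexing in `is_available` mirrors the way the text counts links and slots; the lists themselves are 0-based.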
Based on the information about the network acquired by the network information acquisition unit 10, the route determination unit 12 determines a route connecting the start point and end point of the optical path in the network. For example, it determines the route by evaluating the fragmentation of used slots by means of entropy; such a method is described in, for example, Non-Patent Document 2. The route determination unit 12 may instead determine the route by selecting the shortest assignable route from among the k shortest paths (K-shortest path algorithm). The route determined by the route determination unit 12 may also be included in the information about the network.
The behavior distribution estimation unit 14 inputs the usage status s_t of the frequency slots and the route determined by the route determination unit 12 into the estimation model stored in the estimation model storage unit 15, and thereby estimates the behavior distribution, which is the probability distribution of the action a_t. The distribution of the action a_t is called a "policy"; it gives the probability of each frequency slot being chosen for the communication demand slots in state s_t.
The estimation model is created using, for example, a polynomial function or a neural network, and outputs the behavior distribution when given the usage status s_t and the route. The method of generating the estimation model stored in the estimation model storage unit 15 is described later. The behavior distribution estimated by the behavior distribution estimation unit 14 is, for example, a one-dimensional vector A_act with as many elements as there are frequency slots, each element indicating the probability that the corresponding frequency slot becomes the leading slot. For example, when there are two communication demand slots and eight frequency slots, A_act has eight elements, and its third element indicates the probability that the third frequency slot becomes the leading slot.
Based on the usage status s_t of the frequency slots and the route determined by the route determination unit 12, the mask generation unit 16 generates data indicating whether the communication demand slots can be assigned to each frequency slot (hereinafter referred to as a mask).
FIG. 2 is a flowchart showing how the mask generation unit 16 generates a mask. First, the mask generation unit 16 detects the availability of frequency slots on the route determined by the route determination unit 12 (step S100). For example, it detects the frequency slots that are available on every link the determined route traverses. For example, when the route traverses two links whose frequency slot usage statuses are given by equations (1) and (2), the availability status A_available is given by equation (4).
$$ A_{\mathrm{available}} = \mathrm{AND}(l_1, l_2) \tag{4} $$
In equation (4), AND denotes applying the logical AND operator element by element. That is, when the n-th elements of l_1 and l_2 are both 1, the n-th element of A_available is 1; otherwise it is 0. The availability status A_available thus indicates the frequency slots that are available on every link the given route traverses.
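The element-wise AND of step S100 can be sketched as follows, assuming each link's usage status is a list of 0/1 values of equal length. The function name `route_availability` is an illustrative assumption; the source only specifies the AND operation itself.

```python
def route_availability(link_usages):
    """Element-wise AND over the usage vectors of every link on the route.

    A slot is available on the route only if it is available (1) on all links.
    """
    return [int(all(slots)) for slots in zip(*link_usages)]


l_1 = [0, 0, 1, 1, 1, 1, 0, 0]
l_2 = [1, 1, 1, 1, 0, 0, 0, 0]
a_available = route_availability([l_1, l_2])
```

With the two example links, only the third and fourth slots are available on both, so `a_available` is `[0, 0, 1, 1, 0, 0, 0, 0]`. The same function handles routes with more than two links.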
Based on the availability status A_available, the mask generation unit 16 generates a mask A_mask indicating the runs of consecutive available slots to which the communication demand slots can be assigned (step S102). For example, it generates a mask indicating all candidate runs of consecutive available slots to which the communication demand slots can be assigned. More specifically, the mask is a one-dimensional vector A_mask with as many elements as there are frequency slots: an n-th element of 0 indicates that the communication demand slots cannot be assigned with the n-th frequency slot as the leading slot, and an n-th element of 1 indicates that they can.
The mask generation unit 16 calculates the elements of A_mask as follows. When the number of communication demand slots is m, the n-th element of A_mask is set to 1 if the n-th through (n+m-1)-th elements of A_available are all 1, and to 0 otherwise. The n-th element of A_mask is also set to 0 when the (n+m-1)-th element of A_available does not exist.
For example, suppose there are two communication demand slots and A_available is [1,0,0,1,1,1,1,1]. The fourth and fifth elements of A_available are both 1, so the fourth element of A_mask is 1. Of the first and second elements of A_available, the second is 0, so the first element of A_mask is 0. A_available has no ninth element, so the eighth element of A_mask is 0. Performing this calculation for every element gives A_mask = [0,0,0,1,1,1,1,0].
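The mask calculation above can be sketched as a sliding window over A_available. This is a minimal illustration assuming m is at least 1; the function name `generate_mask` is hypothetical.

```python
def generate_mask(a_available, m):
    """Return A_mask: element n is 1 if m demand slots fit starting at slot n."""
    mask = []
    for n in range(len(a_available)):
        window = a_available[n:n + m]
        # A window shorter than m means the (n+m-1)-th slot does not exist.
        fits = len(window) == m and all(window)
        mask.append(int(fits))
    return mask


a_mask = generate_mask([1, 0, 0, 1, 1, 1, 1, 1], 2)
```

For the worked example in the text this yields `[0, 0, 0, 1, 1, 1, 1, 0]`.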
The mask generation unit 16 may instead generate a mask indicating only some of the candidate runs of consecutive available slots to which the communication demand slots can be assigned. For example, after generating the mask A_mask indicating all candidates as described above, it changes some elements of A_mask from 1 to 0. More specifically, when assigning the communication demand slots to frequency slots would increase the number of groups of consecutive available slots (hereinafter referred to as blocks), the corresponding element of A_mask is changed from 1 to 0.
For example, suppose there are two communication demand slots and A_available is [1,0,0,1,1,1,1,1]. There are then two blocks: one consisting of the first available slot and one consisting of the fourth through eighth available slots. If the communication demand slots are assigned to the fifth and sixth frequency slots, A_available changes to [1,0,0,1,0,0,1,1], and the number of blocks increases to three: the block consisting of the first available slot, the block consisting of the fourth available slot, and the block consisting of the seventh and eighth available slots. The mask generation unit 16 therefore sets the fifth element of A_mask to 0. In the same way, it sets the sixth element of A_mask to 0, so that A_mask finally becomes [0,0,0,1,0,0,1,0].
By generating a mask that indicates only some of the candidate runs of consecutive available slots in this way, the mask generation unit 16 can prevent the available slots from becoming fragmented.
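The block-count rule can be sketched as follows: each candidate leading slot is tried in turn, and it is kept only if the trial assignment does not increase the number of blocks. The function names are illustrative assumptions, and the rule "keep when the block count does not increase" is read directly from the example above.

```python
def count_blocks(a_available):
    """Count groups of consecutive available (1) slots."""
    blocks, previous = 0, 0
    for slot in a_available:
        if slot == 1 and previous == 0:
            blocks += 1
        previous = slot
    return blocks


def generate_fragmentation_aware_mask(a_available, m):
    """Mask that drops candidates whose assignment would add a block."""
    baseline = count_blocks(a_available)
    mask = []
    for n in range(len(a_available)):
        window = a_available[n:n + m]
        if len(window) != m or not all(window):
            mask.append(0)  # the demand does not fit at slot n at all
            continue
        after = list(a_available)
        after[n:n + m] = [0] * m  # trial assignment occupies slots n..n+m-1
        mask.append(int(count_blocks(after) <= baseline))
    return mask


a_mask = generate_fragmentation_aware_mask([1, 0, 0, 1, 1, 1, 1, 1], 2)
```

For the example in the text, the candidates starting at the fifth and sixth slots each split the large block in two, so the result is `[0, 0, 0, 1, 0, 0, 1, 0]`.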
The candidate value calculation unit 18 calculates candidate values for the allocatable slots based on the estimate A_act from the behavior distribution estimation unit 14 and the mask A_mask generated by the mask generation unit 16. For example, it calculates the allocatable candidate values A_out by equation (5).
$$ A_{\mathrm{out}} = A_{\mathrm{act}} \odot A_{\mathrm{mask}} \tag{5} $$
The right-hand side of equation (5) is the Hadamard product of A_act and A_mask; each element of A_out is the product of the corresponding elements of A_act and A_mask. A_out is therefore A_act with the entries for the non-allocatable slots indicated by A_mask masked out.
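A minimal sketch of the Hadamard product in equation (5) follows. The values in `a_act` are hypothetical placeholders for a model output, chosen only for illustration; the function name is likewise an assumption.

```python
def candidate_values(a_act, a_mask):
    """Element-wise (Hadamard) product of the behavior distribution and the mask."""
    if len(a_act) != len(a_mask):
        raise ValueError("a_act and a_mask must have the same length")
    return [p * k for p, k in zip(a_act, a_mask)]


# Hypothetical model output for 8 frequency slots (not taken from the source).
a_act = [0.05, 0.10, 0.20, 0.25, 0.10, 0.10, 0.15, 0.05]
a_mask = [0, 0, 0, 1, 1, 1, 1, 0]
a_out = candidate_values(a_act, a_mask)
```

Note that the third slot has a fairly high probability in `a_act` but is zeroed in `a_out`, because the mask marks it as non-allocatable.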
Based on the allocatable candidate values A_out calculated by the candidate value calculation unit 18, the frequency allocation unit 20 determines the frequency slots to which the communication demand slots are assigned. For example, it takes the frequency slot corresponding to the largest element of A_out as the leading slot. Alternatively, it converts the element values of A_out into probability values and takes as the leading slot the frequency slot corresponding to an element chosen stochastically according to those probabilities. The values can be converted into probabilities with, for example, a softmax function, although the conversion is not limited to this.
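Both selection strategies can be sketched as below. The stochastic variant here converts A_out to probabilities by simple normalization rather than by softmax; this is an assumption made for the sketch so that masked slots keep probability zero, and the source leaves the conversion method open. Function names are illustrative, and returned indices are 0-based.

```python
import random


def choose_greedy(a_out):
    """Return the 0-based index of the largest candidate value, or None if blocked."""
    if not a_out:
        return None
    best = max(range(len(a_out)), key=lambda n: a_out[n])
    return best if a_out[best] > 0 else None


def choose_stochastic(a_out, rng):
    """Sample a leading slot with probability proportional to its candidate value."""
    if sum(a_out) <= 0:
        return None  # no allocatable slot: the request is blocked
    indices = list(range(len(a_out)))
    # random.choices normalizes the weights internally.
    return rng.choices(indices, weights=a_out, k=1)[0]


a_out = [0.0, 0.0, 0.0, 0.25, 0.10, 0.10, 0.15, 0.0]
greedy_slot = choose_greedy(a_out)
sampled_slot = choose_stochastic(a_out, random.Random(0))
```

The greedy choice is deterministic and suits evaluation, while sampling keeps some exploration during training; which one the device uses at each stage is not specified in the source.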
The reward calculation unit 22 calculates the reward r_{t+1} based on the information about the network acquired by the network information acquisition unit 10 and the action a_t obtained from the frequency allocation unit 20. The information about the network may be the usage status s_t of the frequency slots determined by the frequency slot usage status determination unit 11. The action a_t obtained from the frequency allocation unit 20 is the action the frequency allocation unit 20 takes at time t, namely assigning the communication demand slots to frequency slots.
For example, the reward calculation unit 22 sets the reward r_{t+1} to 1 if the communication demand slots could be assigned to frequency slots when the action a_t was taken in usage status s_t, and to -1 if they could not.
The reward calculation unit 22 may instead calculate the reward r_{t+1} based on the usage status s_t and the usage status s_{t+1}. In that case, it calculates the reward r_{t+1} after the network information acquisition unit 10 has observed the usage status s_{t+1}.
The reward calculation unit 22 may also calculate the reward r_{t+1} based on the usage status s_t, the usage status s_{t+1}, and the action a_t.
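The simplest reward rule described above (1 on a successful assignment, -1 on failure) can be sketched as follows. The function signature is an assumption for illustration: it takes the availability vector along the route, a 0-based leading slot (or None when no slot was chosen), and the number of demand slots m.

```python
def reward(a_available, lead, m):
    """Return 1 if m demand slots fit starting at 0-based `lead`, otherwise -1."""
    if lead is None or lead < 0:
        return -1  # no slot was selected, so the request is blocked
    window = a_available[lead:lead + m]
    fits = len(window) == m and all(window)
    return 1 if fits else -1
```

For A_available = [1,0,0,1,1,1,1,1] and m = 2, a leading slot of index 3 (the fourth slot) gives a reward of 1, while index 0 or index 7 gives -1. The variants that also use s_{t+1} are not covered by this sketch.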
The estimation model update unit 24 updates the estimation model stored in the estimation model storage unit 15 based on the reward r_{t+1} calculated by the reward calculation unit 22. It updates the estimation model so as to maximize the reward obtained over the future. The update is performed using, for example, DQN (deep Q-network) described in Non-Patent Document 3 or A3C (asynchronous advantage actor-critic) described in Non-Patent Document 4, but is not limited to these.
FIG. 3 is a flowchart showing the operation of the optical path design device 1 when generating an estimation model.
First, the network information acquisition unit 10 acquires information about the network (step S200). The frequency slot usage status determination unit 11 then determines the usage status s_t of the frequency slots based on the information about the network (step S201). The route determination unit 12 determines a route based on the information about the network (step S202). The behavior distribution estimation unit 14 estimates the behavior distribution based on the usage status s_t and the route (step S204). The mask generation unit 16 generates a mask based on the usage status s_t and the route (step S206). The candidate value calculation unit 18 calculates the allocatable candidate values based on the behavior distribution and the mask (step S208). The frequency allocation unit 20 assigns the communication demand slots to frequency slots based on the allocatable candidate values (step S210).
The reward calculation unit 22 calculates the reward based on the information about the network acquired by the network information acquisition unit 10 and the action taken by the frequency allocation unit 20 (step S212). The estimation model update unit 24 updates the estimation model based on the reward (step S214). The optical path design device 1 generates the estimation model by repeating steps S200 to S214 and updating the estimation model each time.
FIG. 4 is a flowchart showing the operation of the optical path design device 1 when assigning communication demand slots to frequency slots. The flowchart in FIG. 4 corresponds to steps S200 to S210 of the flowchart in FIG. 3.
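Steps S204 to S210 for a single request can be strung together as in the sketch below. Everything here is an assumption made for illustration: `uniform_policy` is a stand-in for the trained estimation model, the route is given as a list of 0-based link indices, the full (non-fragmentation-aware) mask is used, and the leading slot is chosen greedily.

```python
def uniform_policy(s_t, route):
    """Stand-in for the estimation model: equal probability for every slot."""
    num_slots = len(s_t[0])
    return [1.0 / num_slots] * num_slots


def allocate(s_t, route, m, policy):
    """Run steps S204 to S210 once; return the 0-based leading slot or None."""
    num_slots = len(s_t[0])
    # S206, part 1: slots that are available on every link of the route.
    available = [int(all(s_t[link][n] for link in route)) for n in range(num_slots)]
    # S206, part 2: mask of leading slots where m consecutive slots fit.
    mask = [int(n + m <= num_slots and all(available[n:n + m]))
            for n in range(num_slots)]
    a_act = policy(s_t, route)                      # S204: behavior distribution
    a_out = [p * k for p, k in zip(a_act, mask)]    # S208: candidate values
    lead = max(range(num_slots), key=lambda n: a_out[n])  # S210: greedy choice
    if a_out[lead] <= 0:
        return None  # nothing allocatable: the request is blocked
    for link in route:
        for n in range(lead, lead + m):
            s_t[link][n] = 0  # mark the assigned slots as occupied
    return lead


s_t = [[0, 0, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 0, 0, 0, 0]]
first = allocate(s_t, [0, 1], 2, uniform_policy)
second = allocate(s_t, [0, 1], 2, uniform_policy)
```

With the two example links, the first request takes the third and fourth slots (index 2) on both links, and the second request is blocked because no slot remains available on both links.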
According to the first embodiment, communication demand slots can be assigned without determining candidate frequency slots in advance. In addition, by calculating A_out, which indicates all allocatable frequency slots, all allocatable frequency slots can be taken into account.
<Experimental result>
The performance of the optical path design device 1 was evaluated using the National Science Foundation Network (NSF Network). The NSF Network consists of 14 nodes and 13 links. FIG. 5 is a diagram showing the environment of this experiment. The information about the requested optical path, namely the start and end nodes of the optical path, the number of communication demand slots, and the duration of the allocation, was generated randomly according to a uniform distribution. The average arrival rate of optical path requests was 10, and the average service time of the allocation processing by the optical path design device 1 was 12.
There are 80 frequency slots, as shown in FIG. 5. Here the usage status s_t is represented as a two-dimensional array. An element of 1 in row l, column m of s_t indicates that the m-th frequency slot of the l-th link is an available slot; an element of 0 indicates that the slot is an occupied slot.
The reward calculation unit 22 sets the reward r_{t+1} to 1 when the frequency allocation unit 20 succeeds in assigning frequency slots and to -1 when the assignment fails.
The estimation model update unit 24 used a convolutional neural network as the estimation model. FIG. 6 is a diagram showing the convolutional neural network used in this experiment. The network consists of convolutional layers 200 and fully connected layers 210. When the usage status s_t of the frequency slots is input to the convolutional layer 200-1, the layer performs a convolution and outputs a two-dimensional array of size 32×S, where S is the number of frequency slots and equals 80 here. The outputs of the convolutional layers 200-1 and 200-2 are then input to the convolutional layers 200-2 and 200-3, respectively, for further convolution. The two-dimensional array output from the convolutional layer 200-3 is input to the fully connected layer 210-1, which outputs the behavior distribution A_act. The same array is also input to the fully connected layer 210-2, which outputs the value of the usage status s_t. The estimation model is updated based on the value of the usage status s_t output from the fully connected layer 210-2. PPO (Proximal Policy Optimization) was used as the reinforcement learning algorithm, and Adam was used as the optimization algorithm. FIG. 7 shows the PPO and Adam parameters used in this experiment.
In this experiment, each time the optical path design device 1 had generated the estimation model using 500,000 patterns of information about requested optical paths, the estimation model was evaluated using 10,000 patterns of information about requested optical paths. Specifically, the model was evaluated by calculating the blocking probability, which is the fraction of requests for which the assignment of communication demand slots failed. One round in which the optical path design device 1 generates the estimation model and evaluates it is called an "evaluation step". The evaluation steps were carried out with and without the mask generated by the mask generation unit 16, and the results were compared.
FIG. 8 is a diagram showing the blocking probability in this experiment. Without the mask, the blocking probability is about 80%, whereas with the mask it is about 2.2%. Moreover, with the mask, the blocking probability is already close to 2.2% at evaluation step 1, so the training time can be shortened substantially.
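The blocking probability used as the evaluation metric is simply the share of failed assignments among all requests. The sketch below computes it from a list of per-request outcomes; the function name and the sample outcomes are hypothetical and do not reproduce the experimental data.

```python
def blocking_probability(results):
    """Fraction of requests whose assignment failed.

    `results` is an iterable of booleans, True meaning the assignment succeeded.
    """
    results = list(results)
    if not results:
        raise ValueError("at least one request is required")
    blocked = sum(1 for succeeded in results if not succeeded)
    return blocked / len(results)


# Hypothetical outcomes for ten requests: seven succeeded, three were blocked.
outcomes = [True] * 7 + [False] * 3
p_block = blocking_probability(outcomes)
```

In the experiment this ratio would be taken over the 10,000 evaluation requests of each evaluation step.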
<Other embodiments>
Although one embodiment of the present invention has been described in detail above with reference to the drawings, the specific configuration is not limited to the one described, and various design changes and the like can be made without departing from the gist of the present invention.
The optical path design device 1 need not generate the estimation model. In that case, the optical path design device 1 includes the network information acquisition unit 10, the route determination unit 12, the behavior distribution estimation unit 14, the estimation model storage unit 15, the mask generation unit 16, the candidate value calculation unit 18, and the frequency allocation unit 20, and need not include the reward calculation unit 22 and the estimation model update unit 24. The estimation model storage unit 15 then stores an estimation model that has already been generated.
The behavior distribution estimation unit 14 estimates the behavior distribution, which is the probability distribution of the action a_t, based on the usage status s_t and the route, but is not limited to this. For example, it may estimate the behavior distribution based on the information about the network acquired by the network information acquisition unit 10.
1 optical path design device, 10 network information acquisition unit, 11 frequency slot usage determination unit, 12 route determination unit, 14 behavior distribution estimation unit, 15 estimation model storage unit, 16 mask generation unit, 18 candidate value calculation unit, 20 frequency allocation unit, 22 reward calculation unit, 24 estimation model update unit

Claims (5)

  1.  An optical path design device comprising:
    a behavior distribution estimation unit that inputs information about a network into an estimation model, the estimation model taking information about a network as input and outputting a behavior distribution indicating the probability of each frequency slot to which a communication demand slot is allocated, thereby causing the estimation model to output the behavior distribution;
    a mask generation unit that generates a mask, which is data indicating whether or not the communication demand slot can be allocated to each frequency slot, based on the information about the network;
    a candidate value calculation unit that calculates candidate values for assignable frequency slots based on the behavior distribution and the mask; and
    a frequency allocation unit that determines the frequency slot to which the communication demand slot is allocated, based on the candidate values calculated by the candidate value calculation unit.
  2.  The optical path design device according to claim 1, further comprising:
    a reward calculation unit that calculates a reward based on the information about the network and on the frequency slot to which the communication demand slot is allocated, which is the action determined by the frequency allocation unit; and
    an estimation model update unit that updates the estimation model based on the reward.
  3.  The optical path design device according to claim 1 or 2, wherein
    the mask indicates that the communication demand slot cannot be allocated to a frequency slot when allocating the communication demand slot to that frequency slot would increase the number of groups of consecutive available slots.
  4.  An optical path design method comprising:
    a behavior distribution estimation step of inputting information about a network into an estimation model, the estimation model taking information about a network as input and outputting a behavior distribution indicating the probability of each frequency slot to which a communication demand slot is allocated, thereby causing the estimation model to output the behavior distribution;
    a mask generation step of generating a mask, which is data indicating whether or not the communication demand slot can be allocated to each frequency slot, based on the information about the network;
    a candidate value calculation step of calculating candidate values for assignable frequency slots based on the behavior distribution and the mask; and
    a frequency allocation step of determining the frequency slot to which the communication demand slot is allocated, based on the candidate values calculated in the candidate value calculation step.
  5.  A program for causing a computer to operate as the optical path design device according to any one of claims 1 to 3.
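As a reading aid for the mask condition recited in claim 3, the following sketch generates a mask for a single link. It is one interpretation under stated assumptions, not the claimed implementation: the slot array uses 0 for free and 1 for in use, a demand occupies `width` consecutive slots starting at the action index, and "groups of consecutive available slots" is read as maximal runs of free slots. The function names `count_free_groups` and `make_mask` are hypothetical, and slot continuity across the multiple links of a route is not modeled.

```python
from typing import List


def count_free_groups(slots: List[int]) -> int:
    """Count maximal runs of consecutive free slots (0 = free, 1 = in use)."""
    groups = 0
    prev_free = False
    for s in slots:
        free = (s == 0)
        if free and not prev_free:
            groups += 1  # a new run of free slots starts here
        prev_free = free
    return groups


def make_mask(slots: List[int], width: int) -> List[int]:
    """mask[i] = 1 if a demand of `width` slots may start at slot i, else 0.

    A start position is rejected if the demand does not fit into free slots,
    or if placing it there would increase the number of free-slot groups.
    """
    before = count_free_groups(slots)
    mask = []
    for start in range(len(slots)):
        end = start + width
        fits = end <= len(slots) and all(s == 0 for s in slots[start:end])
        if not fits:
            mask.append(0)
            continue
        after = slots[:start] + [1] * width + slots[end:]
        mask.append(1 if count_free_groups(after) <= before else 0)
    return mask


# Hypothetical link with two free groups: slots 1-4 and slots 6-7.
slots = [1, 0, 0, 0, 0, 1, 0, 0]
mask = make_mask(slots, 2)
```

In this example, starting the two-slot demand at slot 2 would split the free run 1-4 into two single-slot fragments, so that position is masked out, while positions at the edge of a free run remain assignable.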
PCT/JP2021/019527 2021-05-24 2021-05-24 Optical path design device, optical path design method, and program WO2022249225A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2021/019527 WO2022249225A1 (en) 2021-05-24 2021-05-24 Optical path design device, optical path design method, and program
JP2023523710A JPWO2022249225A1 (en) 2021-05-24 2021-05-24

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/019527 WO2022249225A1 (en) 2021-05-24 2021-05-24 Optical path design device, optical path design method, and program

Publications (1)

Publication Number Publication Date
WO2022249225A1 true WO2022249225A1 (en) 2022-12-01

Family

ID=84229684

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/019527 WO2022249225A1 (en) 2021-05-24 2021-05-24 Optical path design device, optical path design method, and program

Country Status (2)

Country Link
JP (1) JPWO2022249225A1 (en)
WO (1) WO2022249225A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011176518A (en) * 2010-02-24 2011-09-08 Fujitsu Ltd Route allocation apparatus and method thereof
JP2017220870A (en) * 2016-06-09 2017-12-14 富士通株式会社 Device for supporting wave length reassignment, method, and program

Also Published As

Publication number Publication date
JPWO2022249225A1 (en) 2022-12-01

Similar Documents

Publication Publication Date Title
US8737836B2 (en) Apparatus and method for setting an optical path in an optical network
CN109905784B (en) Service reconstruction method and equipment for optical network wavelength allocation
Couckuyt et al. Automatic surrogate model type selection during the optimization of expensive black-box problems
WO2022249225A1 (en) Optical path design device, optical path design method, and program
Chaves et al. A methodology to design the link cost functions for impairment aware routing algorithms in optical networks
JP7436747B2 (en) OTN network resource optimization method and apparatus, computing device and storage medium
Zhang et al. Routing and spectrum assignment algorithm with traffic prediction and periodic rerouting in elastic optical networks
Jiao et al. Reliability-Oriented RSA Combined with Reinforcement Learning in Elastic Optical Networks
Lezama et al. Routing and spectrum allocation in flexgrid optical networks using differential evolution optimization
WO2016107757A1 (en) A method and system for assigning performance indicators to objects of a network
CN113438173B (en) Routing and spectrum allocation method, device, storage medium and electronic equipment
Rolland et al. Queueing delay guarantees in bandwidth packing
JP6264747B2 (en) Network design apparatus and network design method
Ruiz et al. A column generation approach for large-scale RSA-based network planning
JP2017028548A (en) Physical resource allocation device, physical resource allocation method and program
US10341041B2 (en) Method and device for assisting wavelength reallocation in wavelength division multiplexing optical network
Todd et al. Fast efficient design of shared backup path protected networks using a multi-flow optimization model
JP4607903B2 (en) Method for selecting a strategy for rerouting a line in a communication network and a network using this method
JP5952755B2 (en) Network design apparatus and network design program
Yamazaki et al. Virtualized-elastic-regenerator placement by firefly algorithm for translucent elastic optical networks
Kennington et al. Wavelength translation in WDM networks: Optimization models and solution procedures
Hu et al. Dynamic network simulation using DeepRMSA in Elastic Optical Networks
WO2018181031A1 (en) Optical path design device and optical path design method
Din Delay-Variation Constrained Spectrum Extraction and Contraction Problem for Multipath Routing on Elastic Optical Networks.
JP3895700B2 (en) Communication network failure frequency calculation method, communication network failure frequency calculation device, communication network failure frequency calculation program, and recording medium recording the program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21942883; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2023523710; Country of ref document: JP)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21942883; Country of ref document: EP; Kind code of ref document: A1)