JP2021158458A

JP2021158458A - Terminal device, program to be executed by computer, and computer readable recording medium with program recorded thereon

Info

Publication number: JP2021158458A
Application number: JP2020055284A
Authority: JP
Inventors: 真衣太田; Mai Ota; 眞太郎丸; Makoto Taroumaru; 一人矢野; Kazuto Yano
Original assignee: ATR Advanced Telecommunications Research Institute International; Fukuoka University
Current assignee: ATR Advanced Telecommunications Research Institute International; Fukuoka University
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2021-10-07
Anticipated expiration: 2040-03-26
Also published as: JP7341430B2

Abstract

To provide a terminal device that uses frequency effectively by suppressing packet loss. SOLUTION: Prediction means 3 predicts a channel idle time and a channel use time for each of a plurality of channels. A learning apparatus 5 executes Q-learning in which the length of a packet's transmission time is defined as the state, the selection of a channel is defined as the action, and the throughput obtained when communication succeeds is defined as the reward, and outputs output information including, for each state, the action that yields the maximum Q value. Control means 4 detects from the output information the action that yields the maximum Q value in the Q-learning state corresponding to the length of the transmission time of a packet for transmission, and selects the channel chosen by the detected action as the channel for transmission. Transmission means 7 transmits the packet for transmission to a base station on the channel for transmission, without executing back-off, when the control means determines that the packet for transmission can be transmitted. SELECTED DRAWING: Figure 2

Description

The present invention relates to a terminal device, a program to be executed by a computer, and a computer-readable recording medium on which the program is recorded.

In the CSMA/CA (Carrier Sense Multiple Access / Collision Avoidance) scheme, typified by wireless LAN (Local Area Network), when packets collide or are lost because of simultaneous transmission or the like, the packet collision probability is reduced by lengthening the back-off time, that is, the intentional waiting time during which a station, after detecting that transmissions from other stations have stopped, refrains from transmitting immediately until it sends its own packet (Patent Document 1).

Japanese Unexamined Patent Publication No. 2006-013894

However, in recent years the number of terminals has increased, so lengthening the back-off time indiscriminately does not resolve packet collisions and instead causes communication delay at the terminal itself.
Therefore, according to an embodiment of the present invention, a terminal device is provided that suppresses packet loss and uses frequency effectively.

Further, according to an embodiment of the present invention, a program is provided that causes a computer to suppress packet loss and use frequency effectively.

Further, according to an embodiment of the present invention, a computer-readable recording medium is provided on which is recorded a program that causes a computer to suppress packet loss and use frequency effectively.

(Configuration 1)
According to an embodiment of the present invention, a terminal device includes prediction means, a learner, selection means, and transmission means. The prediction means predicts, for every one of a plurality of channels, a channel free time, which is the time during which the channel is free, and a channel usage time, which is the time during which the channel is in use. The learner executes a learning process based on input information. The input information consists either of (i) a communication result of communication in which a packet is transmitted to a base station, competing terminal information, which is information on competing terminal devices that compete with the terminal device, and a channel waiting time, which is the time the channel waits between the transmission of one packet and the transmission of the next packet, or of (ii) non-transmittable information indicating that a packet cannot be transmitted to the base station. In the learning process, the learner executes Q-learning in which the length of a packet's transmission time is the state, selecting a channel is the action, and the throughput obtained when communication succeeds, or the reciprocal of the channel waiting time, is the reward; repeats the update of the Q values until a termination condition of the Q-learning is satisfied; and outputs output information in which each state at the time the termination condition is satisfied, the maximum Q value in that state, and the action that yields the maximum Q value in that state are associated with one another. The selection means detects, from the output information, the action that yields the maximum Q value for the Q-learning state corresponding to the length of the transmission time of the packet to be transmitted, and selects the channel chosen by the detected action as the transmission channel. When a first condition is satisfied, namely that the channel free time predicted by the prediction means for the transmission channel is longer than the length of the transmission time of the packet to be transmitted, the transmission means transmits the packet to the base station on the transmission channel without executing back-off.
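
The learning process of Configuration 1 can be sketched as tabular Q-learning. The following is a minimal illustration only, not the implementation of the embodiment: it uses the textbook Q-learning update rule, and the learning rate alpha, the discount factor gamma, the indexing of states by discretized transmission time, and all function names are assumptions that do not appear in this passage.

```python
from typing import List, Tuple

def update_q(q: List[List[float]], state: int, action: int, reward: float,
             next_state: int, alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Textbook tabular Q-learning update (alpha and gamma are assumed values).

    state / next_state: index of a discretized packet transmission time.
    action: index of the selected channel.
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])

def build_output_info(q: List[List[float]]) -> List[Tuple[int, float, int]]:
    """For each state, associate the maximum Q value with the action that yields it."""
    info = []
    for state, row in enumerate(q):
        best_action = max(range(len(row)), key=lambda a: row[a])
        info.append((state, row[best_action], best_action))
    return info

# Hypothetical example: 2 states (short and long packets), 3 channels.
q_table = [[0.0] * 3 for _ in range(2)]
update_q(q_table, state=0, action=1, reward=10.0, next_state=0)
output_info = build_output_info(q_table)
```

In this sketch the output information is a list of (state, maximum Q value, action) triples, which mirrors the association described in Configuration 1; the actual data layout is not specified here.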

(Configuration 2)
In Configuration 1, when the competing terminal information consists of the number of competing terminal devices and the communication result indicates that communication succeeded, the learner executes the learning process by performing a first arithmetic process, which calculates the reward so that it increases as the number of competing terminal devices decreases and decreases as that number increases, and a second arithmetic process, which calculates the reward so that it increases as the channel waiting time becomes shorter and decreases as the channel waiting time becomes longer.

(Configuration 3)
In Configuration 2, the learner performs the first arithmetic process by multiplying the throughput by a first weighting factor having a first value when the number of competing terminal devices is a first number, and by a first weighting factor having a second value smaller than the first value when the number of competing terminal devices is a second number larger than the first number.

(Configuration 4)
In Configuration 3, the learner performs the first arithmetic process by multiplying the throughput by the reciprocal of the number of competing terminal devices as the first weighting factor.

(Configuration 5)
In any one of Configurations 2 to 4, the learner performs the second arithmetic process by multiplying the throughput by a second weighting factor having a third value when the channel waiting time has a first time length, and by a second weighting factor having a fourth value smaller than the third value when the channel waiting time has a second time length longer than the first time length.

(Configuration 6)
In any one of Configurations 2 to 5, the learner performs the second arithmetic process by multiplying the throughput by the reciprocal of the channel waiting time as the second weighting factor.
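
The reward weighting of Configurations 2 to 6 can be illustrated with a short sketch. It is an illustration under stated assumptions rather than the method of the embodiment: applying both weighting factors together as a product, returning zero for an unsuccessful attempt, and the function name weighted_reward are assumptions; only the reciprocal weights themselves come from Configurations 4 and 6.

```python
def weighted_reward(throughput: float, num_competitors: int,
                    channel_wait_time: float, success: bool) -> float:
    """Reward for one transmission attempt (illustrative only)."""
    if not success:
        # Assumption: an unsuccessful attempt earns no reward.
        return 0.0
    if num_competitors < 1 or channel_wait_time <= 0:
        raise ValueError("need at least one competitor and a positive waiting time")
    w1 = 1.0 / num_competitors    # first weighting factor (Configuration 4)
    w2 = 1.0 / channel_wait_time  # second weighting factor (Configuration 6)
    # Assumption: the two weights are combined as a product.
    return throughput * w1 * w2

few = weighted_reward(12.0, num_competitors=1, channel_wait_time=2.0, success=True)
many = weighted_reward(12.0, num_competitors=3, channel_wait_time=2.0, success=True)
```

With these weights the reward falls as the number of competing terminal devices or the channel waiting time grows, which is the monotonic behavior required by Configuration 2.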

(Configuration 7)
In Configuration 1, when the input information consists of the non-transmittable information in the learning process, the learner updates the Q value with a reward of zero.

(Configuration 8)
In any one of Configurations 1 to 7, the terminal device further includes receiving means. The receiving means receives the competing terminal information from the base station on a control channel.

(Configuration 9)
In any one of Configurations 1 to 8, the communication result consists of a first index when, after the transmission means has transmitted the packet to be transmitted to the base station, an ACK packet indicating that the packet was received is received from the base station, and consists of a second index when, after the transmission means has transmitted the packet to the base station, no ACK packet is received from the base station.

(Configuration 10)
In any one of Configurations 1 to 9, the transmission means transmits the packet to be transmitted to the base station on the transmission channel without executing back-off when, in addition to the first condition, a second condition is satisfied, namely that carrier sensing on the transmission channel shows that no other terminal device is communicating on the transmission channel.
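
The two transmission conditions can be expressed as a simple predicate. This is a minimal sketch; the function and parameter names are hypothetical, and the carrier sensing result is reduced to a single boolean for illustration.

```python
def may_transmit(predicted_free_time: float, tx_time: float,
                 carrier_busy: bool) -> bool:
    """Return True when the packet may be sent without back-off."""
    # First condition: the predicted channel free time is longer than
    # the transmission time of the packet.
    first_condition = predicted_free_time > tx_time
    # Second condition: carrier sensing found no other terminal device
    # communicating on the transmission channel.
    second_condition = not carrier_busy
    return first_condition and second_condition

ok = may_transmit(predicted_free_time=3.0, tx_time=1.2, carrier_busy=False)
```

The strict inequality follows the wording "longer than"; how the case of equal durations is handled is not stated in this passage.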

(Configuration 11)
Further, according to an embodiment of the present invention, a program causes a computer to execute:
a first step in which prediction means predicts, for every one of a plurality of channels, a channel free time, which is the time during which the channel is free, and a channel usage time, which is the time during which the channel is in use;
a second step in which a learner executes a learning process based on input information consisting either of a communication result of communication in which a packet is transmitted to a base station, competing terminal information, which is information on competing terminal devices that compete with the terminal device, and a channel waiting time, which is the time the channel waits between the transmission of one packet and the transmission of the next packet, or of non-transmittable information indicating that a packet cannot be transmitted to the base station, the learning process including executing Q-learning in which the length of a packet's transmission time is the state, selecting a channel is the action, and the throughput obtained when communication succeeds, or the reciprocal of the channel waiting time, is the reward, repeating the update of the Q values until a termination condition of the Q-learning is satisfied, and outputting output information in which each state at the time the termination condition is satisfied, the maximum Q value in that state, and the action that yields the maximum Q value in that state are associated with one another;
a third step in which selection means detects, from the output information, the action that yields the maximum Q value for the Q-learning state corresponding to the length of the transmission time of the packet to be transmitted, and selects the channel chosen by the detected action as the transmission channel; and
a fourth step in which transmission means transmits the packet to be transmitted to the base station on the transmission channel, without executing back-off, when a first condition is satisfied, namely that the channel free time predicted by the prediction means for the transmission channel is longer than the length of the transmission time of the packet to be transmitted.

(Configuration 12)
In Configuration 11, in the second step, when the competing terminal information consists of the number of competing terminal devices and the communication result indicates that communication succeeded, the learner executes the learning process by performing a first arithmetic process, which calculates the reward so that it increases as the number of competing terminal devices decreases and decreases as that number increases, and a second arithmetic process, which calculates the reward so that it increases as the channel waiting time becomes shorter and decreases as the channel waiting time becomes longer.

(Configuration 13)
In Configuration 12, in the second step, the learner performs the first arithmetic process by multiplying the throughput by a first weighting factor having a first value when the number of competing terminal devices is a first number, and by a first weighting factor having a second value smaller than the first value when the number of competing terminal devices is a second number larger than the first number.

(Configuration 14)
In Configuration 13, in the second step, the learner performs the first arithmetic process by multiplying the throughput by the reciprocal of the number of competing terminal devices as the first weighting factor.

(Configuration 15)
In any one of Configurations 12 to 14, in the second step, the learner performs the second arithmetic process by multiplying the throughput by a second weighting factor having a third value when the channel waiting time has a first time length, and by a second weighting factor having a fourth value smaller than the third value when the channel waiting time has a second time length longer than the first time length.

(Configuration 16)
In any one of Configurations 12 to 15, in the second step, the learner performs the second arithmetic process by multiplying the throughput by the reciprocal of the channel waiting time as the second weighting factor.

(Configuration 17)
In Configuration 11, when the input information consists of the non-transmittable information in the learning process of the second step, the learner updates the Q value with a reward of zero.

(Configuration 18)
In any one of Configurations 11 to 17, the program further causes the computer to execute a fifth step in which receiving means receives the competing terminal information from the base station on a control channel.

(Configuration 19)
In any one of Configurations 11 to 18, the communication result consists of a first index when, after the transmission means has transmitted the packet to be transmitted to the base station, an ACK packet indicating that the packet was received is received from the base station, and consists of a second index when, after the transmission means has transmitted the packet to the base station, no ACK packet is received from the base station.

(Configuration 20)
In any one of Configurations 11 to 19, the transmission means transmits the packet to be transmitted to the base station on the transmission channel without executing back-off when, in addition to the first condition, a second condition is satisfied, namely that carrier sensing on the transmission channel shows that no other terminal device is communicating on the transmission channel.

(Configuration 21)
Further, according to an embodiment of the present invention, a recording medium is a computer-readable recording medium on which the program according to any one of Configurations 11 to 20 is recorded.

Packet loss can be suppressed and frequency can be used effectively.

FIG. 1 is a schematic diagram of a communication system according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the terminal device shown in FIG. 1.
FIG. 3 is a schematic diagram of the prediction means shown in FIG. 2.
FIG. 4 is a conceptual diagram of a received power spectrum.
FIG. 5 is a conceptual diagram showing an example in which busy-duration subsets and idle-duration subsets are arranged in time series.
FIG. 6 is a schematic diagram of the predictor shown in FIG. 3.
FIG. 7 is a diagram showing a correspondence table between transition patterns and the coefficients b_1, b_2, ..., b_p.
FIG. 8 is a conceptual diagram of the channel waiting time.
FIG. 9 is a conceptual diagram of a Q table.
FIG. 10 is a first schematic diagram for explaining a method of updating the Q table.
FIG. 11 is a second schematic diagram for explaining the method of updating the Q table.
FIG. 12 is a third schematic diagram for explaining the method of updating the Q table.
FIG. 13 is a diagram for explaining a method of determining whether a transmission packet can be transmitted.
FIG. 14 is a flowchart for explaining the operation of the terminal device shown in FIG. 2.
FIG. 15 is a flowchart for explaining the detailed operation of step S5 in FIG. 14.
FIG. 16 is a flowchart for explaining the detailed operation of step S12 in FIG. 14.
FIG. 17 is a conceptual diagram of the output information.

Embodiments of the present invention will be described in detail with reference to the drawings. Identical or corresponding parts in the drawings are given the same reference numerals, and their description is not repeated.

FIG. 1 is a schematic diagram of a communication system according to an embodiment of the present invention. Referring to FIG. 1, the communication system 100 includes a base station BS and a plurality of terminal devices TM. The base station BS and the plurality of terminal devices TM are arranged in a wireless communication space.

The base station BS has a communication range REG1. The plurality of terminal devices TM are arranged within the communication range REG1.

The base station BS transmits packets to any of the plurality of terminal devices TM and receives packets from any of the plurality of terminal devices TM.

When the base station BS receives a packet, it transmits an ACK (Acknowledgement) packet, which indicates that the packet has been received, to the terminal device that sent the packet (one of the plurality of terminal devices TM).

The base station BS also transmits competing terminal information, which is information on the terminal devices other than the source terminal device, to the source terminal device on a control channel.

When transmitting a packet to the base station BS, each of the plurality of terminal devices TM selects a transmission channel CH_T by a method described later and performs carrier sensing on the selected transmission channel CH_T. If it then determines that the packet can be transmitted, it transmits the packet to the base station BS on the transmission channel without executing back-off.

Each of the plurality of terminal devices TM also receives packets from the base station BS. On receiving a packet, each of the plurality of terminal devices TM transmits an ACK packet to the base station BS.

Further, each of the plurality of terminal devices TM receives the competing terminal information from the base station BS on the control channel.

In the following, each of the plurality of terminal devices TM is referred to as the "terminal device 10".

FIG. 2 is a schematic diagram of the terminal device shown in FIG. 1. Referring to FIG. 2, the terminal device 10 includes an antenna 1, receiving means 2, prediction means 3, control means 4, a learner 5, an application 6, and transmission means 7.

Under the control of the control means 4, the receiving means 2 performs carrier sensing via the antenna 1 on a plurality of channels, which consist of a plurality of mutually different frequencies within the frequency band used for wireless communication in the communication system 100, and acquires a received power spectrum indicating the time dependence of the received power. The receiving means 2 then outputs the received power spectrum to the prediction means 3.

The receiving means 2 also receives the competing terminal information from the base station BS on the control channel via the antenna 1 and outputs the received competing terminal information to the control means 4.

Further, under the control of the control means 4, the receiving means 2 performs carrier sensing on the transmission channel CH_T via the antenna 1 and outputs the carrier sensing result to the control means 4.

Further, the receiving means 2 receives an ACK packet from the base station BS on the transmission channel CH_T via the antenna 1 and outputs the received ACK packet to the control means 4.

The prediction means 3 receives the received power spectrum from the receiving means 2. Under the control of the control means 4, the prediction means 3 then detects, by a method described later and on the basis of the received power spectrum, the channel free time, which is the length of time during which a channel is free, and the channel usage time, which is the length of time during which the channel is in use, and, based on the detected channel free time and channel usage time, predicts the channel free time and channel usage time after a packet is transmitted. The prediction means 3 then outputs the predicted channel free time and the predicted channel usage time to the control means 4.

The control means 4 controls the receiving means 2 to perform carrier sensing on the plurality of channels.

The control means 4 also controls the prediction means 3 to predict the channel free time and the channel usage time.

Further, the control means 4 receives the competing terminal information and the ACK packet from the receiving means 2, receives the predicted channel free time and the predicted channel usage time from the prediction means 3, and receives a transmission packet, which is a packet to be transmitted, from the application 6.

When the length of the transmission time needed to transmit the transmission packet is longer than the predicted channel free time, the control means 4 determines that the transmission packet cannot be transmitted and generates non-transmittable information.

After outputting the transmission packet to the transmission means 7, the control means 4 generates a first index, indicating that the communication transmitting the packet to the base station BS succeeded, when it receives an ACK packet from the receiving means 2, and generates a second index, indicating that the communication transmitting the packet to the base station BS failed, when it does not receive an ACK packet from the receiving means 2. The control means 4 then generates a communication result consisting of the first index and the second index. The first index is, for example, "1", and the second index is, for example, "0".

Further, based on the predicted channel free time and the predicted channel usage time, the control means 4 calculates the channel waiting time, which is the length of time the channel waits between the transmission of one transmission packet and the transmission of the next transmission packet.

When the length of the transmission time needed to transmit the transmission packet is shorter than the predicted channel free time, the control means 4 generates input information IF_INPUT1 consisting of the competing terminal information, the communication result, and the channel waiting time. The control means 4 then outputs the generated input information IF_INPUT1 to the learner 5 and controls the learner 5 to execute Q-learning in which the length of the packet transmission time is the state, selecting a channel is the action, and the communication throughput is the reward.

When the control means 4 has generated non-transmittable information, it generates input information IF_INPUT2 consisting of the non-transmittable information, outputs the generated input information IF_INPUT2 to the learner 5, and controls the learner 5 to update the Q value with a reward of zero.
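
How the control means 4 chooses between the two kinds of input information can be sketched as follows. This is an illustrative sketch only: the class names, field names, and helper functions are hypothetical, treating equal durations as non-transmittable is an assumption (the passage only covers the "longer" and "shorter" cases), and giving a failed communication a reward of zero is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class InputInfo1:
    """Hypothetical container for IF_INPUT1."""
    competing_terminals: int   # competing terminal information (a count)
    communication_result: int  # 1 = ACK received, 0 = no ACK
    channel_wait_time: float

@dataclass
class InputInfo2:
    """Hypothetical container for IF_INPUT2 (non-transmittable information)."""
    not_transmittable: bool = True

def build_input_info(tx_time: float, predicted_free_time: float,
                     competing_terminals: int, ack_received: bool,
                     channel_wait_time: float) -> Union[InputInfo1, InputInfo2]:
    # Assumption: equality is treated the same as "longer than".
    if tx_time >= predicted_free_time:
        return InputInfo2()
    result = 1 if ack_received else 0
    return InputInfo1(competing_terminals, result, channel_wait_time)

def reward_for(info: Union[InputInfo1, InputInfo2], throughput: float) -> float:
    if isinstance(info, InputInfo2):
        return 0.0  # reward of zero for non-transmittable information
    # Assumption: a failed communication contributes no throughput reward.
    return throughput if info.communication_result == 1 else 0.0

blocked = build_input_info(2.0, 1.0, 3, True, 0.5)
sent = build_input_info(1.0, 2.0, 3, True, 0.5)
```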

After outputting the input information IF_INPUT1 or the input information IF_INPUT2 to the learner 5, the control means 4 receives output information IF_OUT from the learner 5. The control means 4 then calculates the transmission time of the transmission packet by dividing the data amount of the transmission packet by the transmission rate.

The control means 4 then refers to the output information IF_OUT and detects, among the plurality of Q values in the state corresponding to the length of the transmission time of the transmission packet, the maximum Q value and the action corresponding to that maximum Q value. The control means 4 selects the channel chosen in the detected action as the transmission channel CH_T.
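
The channel selection described above can be sketched as a division followed by a table lookup. The sketch is illustrative only: the bin edges that map a transmission time to a state, the dictionary layout of the output information, and all names and numeric values are hypothetical and are not taken from this passage.

```python
import bisect
from typing import Dict, Tuple

# Hypothetical upper bounds (seconds) separating the transmission-time states.
STATE_UPPER_BOUNDS_S = [0.001, 0.005]

def transmission_time(data_bits: float, rate_bps: float) -> float:
    """Transmission time = data amount / transmission rate."""
    return data_bits / rate_bps

def state_for(tx_time_s: float) -> int:
    """Map a transmission time to a state index (assumed discretization)."""
    return bisect.bisect_left(STATE_UPPER_BOUNDS_S, tx_time_s)

def select_channel(output_info: Dict[int, Tuple[float, int]], state: int) -> int:
    """Return the action (channel index) stored with the maximum Q value."""
    _max_q, action = output_info[state]
    return action

# Hypothetical output information: state -> (maximum Q value, channel index).
if_out = {0: (3.2, 2), 1: (5.1, 0), 2: (1.4, 1)}
tx_time = transmission_time(data_bits=12_000, rate_bps=10_000_000)
ch_t = select_channel(if_out, state_for(tx_time))
```

Here a 12,000-bit packet at 10 Mbit/s takes 1.2 ms, which falls in state 1 under the assumed bin edges, so channel 0 is selected.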

After that, the control means 4 controls the receiving means 2 to perform carrier sensing on the transmission channel CH_T. On receiving the carrier sensing result from the receiving means 2, the control means 4 determines, based on that result, whether the transmission packet can be transmitted.

When the control means 4 determines that the transmission packet can be transmitted, it outputs the transmission packet and the transmission channel CH_T to the transmission means 7 and controls the transmission means 7 to transmit the transmission packet on the transmission channel CH_T.

On the other hand, when the control means 4 determines that the transmission packet cannot be transmitted, it generates non-transmittable information. The control means 4 then receives a new transmission packet from the application 6.

The application 6 generates a transmission packet and outputs the generated transmission packet to the control means 4.

On receiving the transmission packet and the transmission channel CH_T from the control means 4, the transmission means 7 transmits the transmission packet to the base station BS on the transmission channel CH_T via the antenna 1.

FIG. 3 is a schematic diagram of the prediction means 3 shown in FIG. 2. Referring to FIG. 3, the prediction means 3 includes a determination unit 31, a measurement unit 32, a classification unit 33, a prediction unit 34, and a predictor 35.

The determination unit 31 receives the received power spectrum from the receiving means 2 and determines, based on it, whether the state is the busy state (Busy) or the idle state (Idle). The determination unit 31 detects the start time and end time of each busy state (Busy) and the start time and end time of each idle state (Idle), and outputs these start and end times to the measurement unit 32.

The measurement unit 32 receives the start and end times of the busy state (Busy) and the start and end times of the idle state (Idle) from the determination unit 31. The measurement unit 32 measures the busy duration from the start and end times of the busy state (Busy) and the idle duration from the start and end times of the idle state (Idle), and outputs the busy duration and the idle duration to the classification unit 33.

The classification unit 33 receives the busy duration and the idle duration from the measurement unit 32. The classification unit 33 classifies the busy duration and the idle duration into subsets by a method described later, and outputs the resulting subsets of the busy duration and the idle duration to the prediction unit 34.

The prediction unit 34 receives the subsets of the busy duration and the idle duration from the classification unit 33 and outputs them to the predictor 35. After that, the prediction unit 34 receives from the predictor 35 the predicted busy duration and the predicted idle duration, which are the results of predicting the future busy duration and idle duration, and outputs them to the control means 4 as the predicted channel usage time and the predicted channel free time, respectively.

FIG. 4 is a conceptual diagram of the received power spectrum. In FIG. 4, the vertical axis represents received power and the horizontal axis represents time.

Referring to FIG. 4, in the received power spectrum SP_RSSI the received power changes over time.

Upon receiving the received power spectrum SP_RSSI from the receiving means 2, the determination unit 31 sets a threshold RSSI_th. The threshold RSSI_th is, for example, -80 [dBm].

The determination unit 31 then detects the times t1 to t8 at which the received power of the received power spectrum SP_RSSI equals the threshold RSSI_th. Based on the times t1 to t8, the determination unit 31 determines the intervals t1 to t2, t3 to t4, t5 to t6, and t7 to t8 to be in the busy state, and the intervals t2 to t3, t4 to t5, and t6 to t7 to be in the idle state.

The determination unit 31 then generates [busy state: t1 to t2, t3 to t4, t5 to t6, t7 to t8], which associates the busy state with the intervals t1 to t2, t3 to t4, t5 to t6, and t7 to t8, and [idle state: t2 to t3, t4 to t5, t6 to t7], which associates the idle state with the intervals t2 to t3, t4 to t5, and t6 to t7. The determination unit 31 outputs [busy state: t1 to t2, t3 to t4, t5 to t6, t7 to t8] and [idle state: t2 to t3, t4 to t5, t6 to t7] to the measurement unit 32.
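The busy/idle determination described above can be sketched on a sampled RSSI trace. The sketch assumes discrete samples, approximates each threshold crossing by the first sample in the new state, and treats power at or above the threshold as busy; these choices and the function name `segment_busy_idle` are illustrative assumptions, not details given in the text.

```python
def segment_busy_idle(times, rssi_dbm, threshold_dbm=-80.0):
    """Split a sampled RSSI trace into busy and idle intervals.

    Returns (busy, idle), each a list of (start, end) tuples.
    """
    busy, idle = [], []
    start = times[0]
    state = rssi_dbm[0] >= threshold_dbm  # True = busy (assumed convention)
    for t, power in zip(times[1:], rssi_dbm[1:]):
        now_busy = power >= threshold_dbm
        if now_busy != state:
            # State changed: close the current interval at this sample time.
            (busy if state else idle).append((start, t))
            start, state = t, now_busy
    if times[-1] > start:
        # Close the interval that is still open at the end of the trace.
        (busy if state else idle).append((start, times[-1]))
    return busy, idle


# Hypothetical trace: sample times (ms) and received power (dBm)
busy, idle = segment_busy_idle(
    [0, 1, 2, 3, 4, 5], [-90, -70, -70, -90, -90, -60]
)
```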

The measurement unit 32 receives [busy state: t1 to t2, t3 to t4, t5 to t6, t7 to t8] and [idle state: t2 to t3, t4 to t5, t6 to t7] from the determination unit 31. The measurement unit 32 measures the time length T_leg1 from time t1 to time t2, the time length T_leg2 from time t3 to time t4, the time length T_leg3 from time t5 to time t6, and the time length T_leg4 from time t7 to time t8, and generates the busy durations T_Busy1 to T_Busy4 from them, respectively.

The measurement unit 32 also measures the time length T_leg5 from time t2 to time t3, the time length T_leg6 from time t4 to time t5, and the time length T_leg7 from time t6 to time t7, and generates the idle durations T_Idl1 to T_Idl3 from them, respectively.

The measurement unit 32 then generates [t1 to t2: T_Busy1 / t3 to t4: T_Busy2 / t5 to t6: T_Busy3 / t7 to t8: T_Busy4], which associates the busy durations T_Busy1 to T_Busy4 with the intervals t1 to t2, t3 to t4, t5 to t6, and t7 to t8.

The measurement unit 32 also generates [t2 to t3: T_Idl1 / t4 to t5: T_Idl2 / t6 to t7: T_Idl3], which associates the idle durations T_Idl1 to T_Idl3 with the intervals t2 to t3, t4 to t5, and t6 to t7.

The measurement unit 32 then outputs [t1 to t2: T_Busy1 / t3 to t4: T_Busy2 / t5 to t6: T_Busy3 / t7 to t8: T_Busy4] and [t2 to t3: T_Idl1 / t4 to t5: T_Idl2 / t6 to t7: T_Idl3] to the classification unit 33.
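As a rough illustration of the measurement step, each duration is simply the end time minus the start time of its interval, kept together with the interval it came from. The function name `with_durations` and the microsecond timestamps are hypothetical.

```python
def with_durations(intervals):
    """Attach a duration to each (start, end) interval."""
    return [(start, end, end - start) for start, end in intervals]


# Hypothetical timestamps in microseconds for t1..t8
busy_intervals = [(0, 1500), (1700, 2100), (2900, 3500), (3600, 3900)]
idle_intervals = [(1500, 1700), (2100, 2900), (3500, 3600)]

busy_durations = with_durations(busy_intervals)  # T_Busy1..T_Busy4
idle_durations = with_durations(idle_intervals)  # T_Idl1..T_Idl3
```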

The classification unit 33 receives [t1 to t2: T_Busy1 / t3 to t4: T_Busy2 / t5 to t6: T_Busy3 / t7 to t8: T_Busy4] and [t2 to t3: T_Idl1 / t4 to t5: T_Idl2 / t6 to t7: T_Idl3] from the measurement unit 32.

The classification unit 33 also holds in advance the time ranges used to classify the busy durations and the idle durations into subsets.

The time ranges for classifying the busy durations and the idle durations into subsets are, for example, as follows.
(1) Busy duration subsets: less than 1 ms → step1, 1 ms or more → step2
(2) Idle duration subsets: less than 0.5 ms → step1, 0.5 ms or more → step2
Based on the time range (1) for the busy duration subsets, the classification unit 33 classifies each of the busy durations T_Busy1 to T_Busy4 into either subset step1 or subset step2. Based on the time range (2) for the idle duration subsets, it classifies each of the idle durations T_Idl1 to T_Idl3 into either subset step1 or subset step2.
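The classification rule in (1) and (2) amounts to a single threshold comparison per duration. A minimal sketch, using the example boundaries from the text and made-up duration values:

```python
BUSY_BOUNDARY_MS = 1.0  # from range (1): < 1 ms is step1, otherwise step2
IDLE_BOUNDARY_MS = 0.5  # from range (2): < 0.5 ms is step1, otherwise step2


def classify(duration_ms, boundary_ms):
    """Return the subset label for one duration."""
    return "step1" if duration_ms < boundary_ms else "step2"


# Hypothetical measured durations in milliseconds
busy_ms = [1.5, 0.4, 0.6, 0.3]  # T_Busy1..T_Busy4
idle_ms = [0.2, 0.8, 0.1]       # T_Idl1..T_Idl3

busy_subsets = [classify(d, BUSY_BOUNDARY_MS) for d in busy_ms]
idle_subsets = [classify(d, IDLE_BOUNDARY_MS) for d in idle_ms]
```

A duration exactly on the boundary (for example a 1 ms busy duration) falls into step2, matching the "or more" wording of the ranges.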

FIG. 5 is a conceptual diagram showing an example in which busy duration subsets and idle duration subsets are arranged in time series.

Referring to FIG. 5, the classification unit 33 arranges the busy durations T_Busy1 to T_Busy4 and the idle durations T_Idl1 to T_Idl3 in time series based on the intervals t1 to t2, t3 to t4, t5 to t6, and t7 to t8, the intervals t2 to t3, t4 to t5, and t6 to t7, the busy durations T_Busy1 to T_Busy4 classified into subsets, and the idle durations T_Idl1 to T_Idl3 classified into subsets. The result is, for example, busy duration T_Busy1 (step2) / idle duration T_Idl1 (step1) / busy duration T_Busy2 (step1) / idle duration T_Idl2 (step2) / busy duration T_Busy3 (step1) / idle duration T_Idl3 (step1) / busy duration T_Busy4 (step1).

A plurality of such time-series arrangements of busy durations and idle durations, as shown in FIG. 5, are generated.

The classification unit 33 then outputs the busy durations T_Busy1 to T_Busy4 and the idle durations T_Idl1 to T_Idl3 arranged in time series to the prediction unit 34.

Upon receiving the busy durations T_Busy1 to T_Busy4 and the idle durations T_Idl1 to T_Idl3 arranged in time series from the classification unit 33, the prediction unit 34 outputs them to the predictor 35.

Upon receiving the busy durations T_Busy1 to T_Busy4 and the idle durations T_Idl1 to T_Idl3 arranged in time series from the prediction unit 34, the predictor 35 predicts the future busy duration and idle duration based on them, and outputs the resulting predicted busy duration and predicted idle duration to the prediction unit 34.

The method by which the predictor 35 predicts the future busy duration and idle duration will now be described.

FIG. 6 is a schematic diagram of the predictor 35 shown in FIG. 3. Referring to FIG. 6, the predictor 35 has layer 1, layer 2, ..., layer K, where K is an integer of 2 or more.

Each of layer 1, layer 2, ..., layer K has a number of analysis units (the circles in FIG. 6) equal to the number of subsets S_i (i is an integer of 2 or more). FIG. 6 shows a schematic diagram of the predictor 35 for the case where the number of subsets S_i is two.

When the number of subsets is S_i, the total number of data streams is S_1 × S_2 × ... × S_K. In the case shown in FIG. 6, each of the numbers of subsets S_1 to S_K is 2, so the total number of data streams is 2 × 2 × ... × 2 = 2^K.

Each of layer 1, layer 2, ..., layer K analyzes the busy/idle correlation. Layer 1 receives the busy/idle data from the prediction unit 34, analyzes the busy/idle correlation based on the received busy/idle data, and outputs the analysis result of the busy/idle correlation to layer 2.

The busy/idle data consist of the busy durations T_Busy1 to T_Busy4 and the idle durations T_Idl1 to T_Idl3 arranged in time series as shown in FIG. 5. Based on the busy/idle data, layer 1 detects the transition pattern in which the busy durations T_Busy1 to T_Busy4 and the idle durations T_Idl1 to T_Idl3 change as busy duration T_Busy1 (step2) → idle duration T_Idl1 (step1) → busy duration T_Busy2 (step1) → idle duration T_Idl2 (step2) → busy duration T_Busy3 (step1) → idle duration T_Idl3 (step1) → busy duration T_Busy4 (step1), and analyzes the correlation between the busy duration and the idle duration based on the detected transition pattern.

More specifically, layer 1 examines the transitions T_Busy1 (step2) → T_Idl1 (step1), T_Idl1 (step1) → T_Busy2 (step1), T_Busy2 (step1) → T_Idl2 (step2), T_Idl2 (step2) → T_Busy3 (step1), T_Busy3 (step1) → T_Idl3 (step1), and T_Idl3 (step1) → T_Busy4 (step1). From these it detects that the idle duration T_Idl1 of subset step1 arrives after the busy duration T_Busy1 of subset step2, that the busy duration T_Busy2 of subset step1 arrives after the idle duration T_Idl1 of subset step1, that the idle duration T_Idl2 of subset step2 arrives after the busy duration T_Busy2 of subset step1, that the busy duration T_Busy3 of subset step1 arrives after the idle duration T_Idl2 of subset step2, that the idle duration T_Idl3 of subset step1 arrives after the busy duration T_Busy3 of subset step1, and that the busy duration T_Busy4 of subset step1 arrives after the idle duration T_Idl3 of subset step1. In this way, layer 1 analyzes the correlation of each adjacent busy duration / idle duration pair and of each adjacent idle duration / busy duration pair, that is, the degree of connection between the subset of the busy duration and the subset of the idle duration in an adjacent busy/idle pair, and the degree of connection between the subset of the idle duration and the subset of the busy duration in an adjacent idle/busy pair.

That is, layer 1 obtains the following degrees Dg1 to Dg8 as its analysis result.
Dg1: the degree to which the idle duration T_Idle that follows a busy duration T_Busy of subset step1 belongs to subset step1
Dg2: the degree to which the idle duration T_Idle that follows a busy duration T_Busy of subset step1 belongs to subset step2
Dg3: the degree to which the idle duration T_Idle that follows a busy duration T_Busy of subset step2 belongs to subset step1
Dg4: the degree to which the idle duration T_Idle that follows a busy duration T_Busy of subset step2 belongs to subset step2
Dg5: the degree to which the busy duration T_Busy that follows an idle duration T_Idle of subset step1 belongs to subset step1
Dg6: the degree to which the busy duration T_Busy that follows an idle duration T_Idle of subset step1 belongs to subset step2
Dg7: the degree to which the busy duration T_Busy that follows an idle duration T_Idle of subset step2 belongs to subset step1
Dg8: the degree to which the busy duration T_Busy that follows an idle duration T_Idle of subset step2 belongs to subset step2
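One way to picture the degrees Dg1 to Dg8 is as relative frequencies of subset-to-subset transitions in the time-ordered busy/idle sequence. The text does not define how a "degree" is computed, so treating it as a relative frequency is purely an assumption, as is the function name `transition_degrees`. The input below is the FIG. 5 example sequence.

```python
from collections import Counter


def transition_degrees(sequence):
    """Relative frequency of each (previous, next) pair among the
    transitions that leave `previous`.

    `sequence` is a time-ordered list of (kind, subset) tuples,
    where kind is "busy" or "idle".
    """
    pairs = list(zip(sequence, sequence[1:]))
    pair_counts = Counter(pairs)
    from_counts = Counter(prev for prev, _ in pairs)
    return {pair: n / from_counts[pair[0]] for pair, n in pair_counts.items()}


fig5 = [
    ("busy", "step2"), ("idle", "step1"), ("busy", "step1"),
    ("idle", "step2"), ("busy", "step1"), ("idle", "step1"),
    ("busy", "step1"),
]
degrees = transition_degrees(fig5)
dg1 = degrees[(("busy", "step1"), ("idle", "step1"))]  # busy step1 -> idle step1
dg2 = degrees[(("busy", "step1"), ("idle", "step2"))]  # busy step1 -> idle step2
```

Transitions that never occur in the data (for example busy step2 → idle step2 here) are simply absent from the returned dictionary; a fuller treatment would need many more sequences, as the text notes that a plurality of them are generated.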

Then, when the data of layer 1 are busy data, layer 1 predicts, based on the analysis result (degrees Dg1 to Dg8), the subset S_PDC_1_1 (either step1 or step2) of the idle duration T_Idle that arrives after the subset (either step1 or step2) of the busy duration T_Busy, and outputs the predicted subset S_PDC_1_1 (either step1 or step2) of the predicted idle duration T_Idle to layer 2.

When the data of layer 1 are idle data, layer 1 predicts, based on the analysis result (degrees Dg1 to Dg8), the subset S_PDC_1_2 (either step1 or step2) of the busy duration T_Busy that arrives after the subset (either step1 or step2) of the idle duration T_Idle, and outputs the predicted subset S_PDC_1_2 (either step1 or step2) of the predicted busy duration T_Busy to layer 2.

Layer 2 receives from layer 1 either the predicted subset S_PDC_1_1 (either step1 or step2) and the busy/idle data, or the predicted subset S_PDC_1_2 (either step1 or step2) and the busy/idle data.

Layer 2 then analyzes the degrees Dg1 to Dg8 described above based on the busy/idle data, in the same manner as layer 1.

After that, when the data of layer 2 are idle data, layer 2 generates, based on the analysis result (degrees Dg1 to Dg8), the predicted subset S_PDC_2_1 of the busy duration T_Busy that arrives after the predicted subset S_PDC_1_2 of the idle duration T_Idle.

When the data of layer 2 are busy data, layer 2 generates, based on the analysis result (degrees Dg1 to Dg8), the predicted subset S_PDC_2_2 of the idle duration T_Idle that arrives after the predicted subset S_PDC_1_1.

Layer 2 then outputs the predicted subset S_PDC_2_1 or the predicted subset S_PDC_2_2 to layer 3.

In the same manner, layer K receives from layer K-1 either the predicted subset S_PDC_K-1_1 and the busy/idle data, or the predicted subset S_PDC_K-1_2 and the busy/idle data.

Layer K then analyzes the degrees Dg1 to Dg8 described above based on the busy/idle data, in the same manner as layer 1.

After that, when the data of layer K are idle data, layer K generates, based on the analysis result (degrees Dg1 to Dg8), the predicted subset S_PDC_K_1 of the busy duration T_Busy that arrives after the predicted subset S_PDC_K-1_1. Layer K then outputs the predicted subset S_PDC_K_1 as the predicted value of the busy data (busy duration T_Busy).

When the data of layer K are busy data, layer K generates, based on the analysis result (degrees Dg1 to Dg8), the predicted subset S_PDC_K_2 of the idle duration T_Idle that arrives after the predicted subset S_PDC_K-1_2. Layer K then outputs the predicted subset S_PDC_K_2 as the predicted value of the idle data (idle duration T_Idle).

When the predictor 35 has five layers (K = 5), the prediction unit 34 alternately outputs the following two sets of idle/busy data to the predictor 35. Idle/busy data Data1 consist of a subset (either step1 or step2) of the busy duration T_Busy (layer 1) / a subset of the idle duration T_Idle (layer 2) / a subset of the busy duration T_Busy (layer 3) / a subset of the idle duration T_Idle (layer 4) / a subset of the busy duration T_Busy (layer 5). Idle/busy data Data2 consist of a subset (either step1 or step2) of the idle duration T_Idle (layer 1) / a subset of the busy duration T_Busy (layer 2) / a subset of the idle duration T_Idle (layer 3) / a subset of the busy duration T_Busy (layer 4) / a subset of the idle duration T_Idle (layer 5).

When the predictor 35 receives the idle/busy data Data1 from the prediction unit 34, layers 1, 3, and 5 hold busy data and layers 2 and 4 hold idle data. As a result, the predictor 35 outputs the predicted idle duration.

When the predictor 35 receives the idle/busy data Data2 from the prediction unit 34, layers 1, 3, and 5 hold idle data and layers 2 and 4 hold busy data. As a result, the predictor 35 outputs the predicted busy duration.

In this way, the predictor 35 alternately outputs the predicted idle duration and the predicted busy duration to the prediction unit 34.

The prediction unit 34 then alternately outputs the predicted idle duration and the predicted busy duration received from the predictor 35 to the control means 4.

Because the predicted idle duration and the predicted busy duration each carry one of the subsets step1 and step2, the control means 4 can determine the length of the predicted idle duration and the length of the predicted busy duration from their subsets.

In general, the predictor 35 predicts the predicted idle duration or the predicted busy duration by the following equation (1).

In equation (1), X_i, X_{i-1}, ..., X_{i-p+1} are the p most recent measured busy/idle times (p is an integer of 2 or more) and are known values. X_{i+1} is the predicted value. b_1, b_2, ..., b_p are coefficients.
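Equation (1) itself is not reproduced in this text. A linear-prediction form consistent with the symbols defined here, assuming the p most recent values run from X_i back to X_{i-p+1} and that the coefficients enter with a positive sign, would be:

```latex
X_{i+1} = \sum_{k=1}^{p} b_k \, X_{i-k+1}
        = b_1 X_i + b_2 X_{i-1} + \cdots + b_p X_{i-p+1}
```

This is a reconstruction under those assumptions, not the published equation; the sign convention in particular may differ in the original.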

To predict X_{i+1} by equation (1), the coefficients b_1, b_2, ..., b_p must be determined.

In this case, the predictor 35 learns the coefficients b_1, b_2, ..., b_p in a training interval.

The learning of the coefficients b_1, b_2, ..., b_p in the training interval will now be described. The training data for calculating the coefficients b_1, b_2, ..., b_p are, for example, 200 samples, and the order of the prediction equation (1) is, for example, second order (p = 2), so that the coefficients b_1 and b_2 are learned.

Based on the 200 samples of training data, the predictor 35 calculates the predicted values of the busy duration and the idle duration by the method described above and calculates the coefficients b_1 and b_2. In this case, the predictor 35 calculates the coefficients b_1 and b_2 by solving the normal equations arising from the least-squares formulation using the Levinson-Durbin recursion.
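For the second-order case (p = 2), the least-squares fit can be sketched by building the 2×2 normal equations and solving them directly. This sketch deliberately does not use the Levinson-Durbin recursion named in the text; it solves the same kind of least-squares problem by Cramer's rule and assumes the predictor has the form X_{n+1} ≈ b_1·X_n + b_2·X_{n-1}. The function name `fit_order2` is hypothetical.

```python
def fit_order2(samples):
    """Least-squares fit of x[n+1] ~ b1*x[n] + b2*x[n-1]; returns (b1, b2)."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for n in range(1, len(samples) - 1):
        x0, x1, y = samples[n], samples[n - 1], samples[n + 1]
        # Accumulate the entries of the normal equations
        #   [s11 s12] [b1]   [r1]
        #   [s12 s22] [b2] = [r2]
        s11 += x0 * x0
        s12 += x0 * x1
        s22 += x1 * x1
        r1 += x0 * y
        r2 += x1 * y
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("training data do not determine b1 and b2 uniquely")
    b1 = (r1 * s22 - r2 * s12) / det
    b2 = (r2 * s11 - r1 * s12) / det
    return b1, b2


# Synthetic training data that follow x[n+1] = 0.5*x[n] + 0.25*x[n-1] exactly
training = [1.0, 2.0]
for _ in range(18):
    training.append(0.5 * training[-1] + 0.25 * training[-2])

b1, b2 = fit_order2(training)
```

On noise-free synthetic data like this the fit recovers the generating coefficients; with measured durations it would return the coefficients minimizing the squared one-step prediction error over the training interval.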

In the training interval, the predictor 35 calculates the coefficients b_1 and b_2 for every transition pattern of the busy duration T_Busy / idle duration T_Idle.

For example, when the number of subsets S_i is 2 and K = 5, there are 32 transition patterns of the busy duration T_Busy / idle duration T_Idle.
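The count of 32 follows from 2 subsets per layer over 5 layers (2^5). A short sketch that enumerates the patterns, with the helper name `transition_patterns` chosen only for illustration:

```python
from itertools import product


def transition_patterns(subset_counts):
    """Enumerate every combination of one subset label per layer."""
    labels = [[f"step{s}" for s in range(1, n + 1)] for n in subset_counts]
    return list(product(*labels))


patterns = transition_patterns([2] * 5)  # S_i = 2 for each of K = 5 layers
pattern_count = len(patterns)            # 2 ** 5
```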

Accordingly, the predictor 35 calculates the coefficients b_1 and b_2 for all 32 transition patterns and outputs the calculated coefficients b_1 and b_2 to the prediction unit 34 in association with each of the 32 transition patterns.

The prediction unit 34 receives the correspondence between the 32 transition patterns and the 32 sets of coefficients b_1 and b_2 from the predictor 35 and holds that correspondence.

When p is other than 2, the predictor 35 similarly calculates the coefficients b_1, b_2, ..., b_p for all 32 transition patterns and outputs the calculated coefficients b_1, b_2, ..., b_p to the prediction unit 34 in association with each of the 32 transition patterns.

The prediction unit 34 receives the correspondence between the 32 transition patterns and the 32 sets of coefficients b_1, b_2, ..., b_p from the predictor 35 and holds that correspondence.

FIG. 7 is a diagram showing a correspondence table between the transition patterns and the coefficients b_1, b_2, ..., b_p. Referring to FIG. 7, the correspondence table TBL1 contains transition patterns and sets of coefficients. Each transition pattern is associated with one set of coefficients.

Transition pattern 1, transition pattern 2, transition pattern 3, ..., transition pattern m (m is the total number of transition patterns, determined by the number of subsets and the number of layers K) are associated with the coefficient sets b_11, b_12, b_13, ..., b_1p; b_21, b_22, b_23, ..., b_2p; b_31, b_32, b_33, ..., b_3p; ...; b_m1, b_m2, b_m3, ..., b_mp, respectively.

The prediction unit 34 holds the correspondence table TBL1. Upon receiving the busy durations / idle durations arranged in time series (see FIG. 5) from the classification unit 33, the prediction unit 34 refers to the correspondence table TBL1 and determines which of transition patterns 1 to m stored in the correspondence table TBL1 matches the transition pattern of the received busy durations / idle durations. It then detects the coefficient set (one of b_11, b_12, b_13, ..., b_1p; b_21, b_22, b_23, ..., b_2p; ...; b_m1, b_m2, b_m3, ..., b_mp) associated with the matching transition pattern. The prediction unit 34 outputs the detected coefficient set and the busy durations / idle durations arranged in time series (see FIG. 5) to the predictor 35.
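The table lookup followed by prediction can be sketched as a dictionary keyed by transition pattern. The table contents, the numeric values, and the positive-sign linear form of the predictor are assumptions for illustration; only the idea of selecting a coefficient set by matching transition pattern comes from the text.

```python
def predict_next(pattern, recent, table):
    """Look up the coefficient set for `pattern` and apply linear prediction.

    `recent` lists the most recent measured durations, newest first
    (X_i, X_{i-1}, ...).
    """
    coeffs = table[pattern]  # raises KeyError if the pattern is not in the table
    return sum(b * x for b, x in zip(coeffs, recent))


# Hypothetical excerpt of TBL1 for p = 2: transition pattern -> (b_1, b_2)
tbl1 = {
    ("step2", "step1", "step1", "step2", "step1"): (0.5, 0.25),
    ("step1", "step1", "step1", "step1", "step1"): (0.9, 0.0),
}

observed = ("step2", "step1", "step1", "step2", "step1")
predicted_ms = predict_next(observed, [0.4, 0.8], tbl1)  # 0.5*0.4 + 0.25*0.8
```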

Thereafter, the prediction unit 34 receives the predicted busy duration / predicted idle duration from the predictor 35.
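The table lookup performed by the prediction unit 34 can be sketched as follows. This is a minimal illustration only: the representation of a transition pattern as a tuple of integers, the function name `lookup_coefficients`, and the coefficient values are all assumptions, since the description does not state how TBL1 is stored.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical encoding: a transition pattern is a tuple of integers.
# The description does not specify the actual representation.
Pattern = Tuple[int, ...]

def lookup_coefficients(tbl1: Dict[Pattern, Sequence[float]],
                        pattern: Pattern) -> Sequence[float]:
    """Return the coefficient set (b_1, ..., b_p) registered for `pattern`."""
    coeffs = tbl1.get(pattern)
    if coeffs is None:
        raise LookupError(f"no coefficient set for transition pattern {pattern}")
    return coeffs

# Illustrative table with p = 3 coefficients per pattern (values are made up).
TBL1 = {
    (0, 1, 0): (0.5, 0.3, 0.2),
    (1, 1, 0): (0.6, 0.3, 0.1),
}
```

The detected coefficient set would then be passed, together with the observed durations, to whatever stands in for the predictor 35.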

Q-learning in the learner 5 will now be described. FIG. 8 is a conceptual diagram of the channel standby time.

Referring to FIG. 8, time t_CPT is the time at which a communication transmitting a packet on any of channels #1 to #4 was completed, and is taken as the current time. The length of the wait from the current time t_CPT until the next channel usage time (busy duration T_Busy) is defined as the channel standby time WT.

In this case, on channel #1 the length of the wait from the current time t_CPT until the next channel usage time (busy duration T_Busy1) is the channel standby time WT1; on channel #2 the length of the wait until the next channel usage time (busy duration T_Busy2) is the channel standby time WT2; on channel #3 the length of the wait until the next channel usage time (busy duration T_Busy3) is the channel standby time WT3; and on channel #4 the length of the wait until the next channel usage time (busy duration T_Busy4) is the channel standby time WT4.

The control means 4 detects the channel standby times WT1 to WT4 on channels #1 to #4 on the basis of the predicted channel free times and predicted channel usage times received from the prediction means 3.
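As a rough sketch of how WT1 to WT4 could be derived, assume the prediction yields, for each channel, the start time of the next predicted busy period. The function name, the dictionary layout, and the clamping to zero for a channel that is already busy are assumptions for illustration, not details taken from the description.

```python
from typing import Dict

def channel_wait_times(t_cpt: float,
                       next_busy_start: Dict[int, float]) -> Dict[int, float]:
    """Map channel number -> WT, the wait from t_cpt to the next predicted busy period.

    A channel whose busy period has already started gets WT = 0 (an assumption).
    """
    return {ch: max(0.0, start - t_cpt) for ch, start in next_busy_start.items()}
```

For example, with `t_cpt = 10.0` and predicted busy starts `{1: 10.5, 2: 12.0, 3: 11.0, 4: 13.5}` (same time unit throughout), the function returns the per-channel standby times keyed by channel number.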

FIG. 9 is a conceptual diagram of the Q table. The length of the packet transmission time can take, for example, three states s_t: "long", "medium", and "short". "Short" is represented by "1", "medium" by "2", and "long" by "3".

"Long" corresponds, for example, to a packet transmission time of 2 ms, "medium" to a packet transmission time of 1 ms, and "short" to a packet transmission time of 0.5 ms.

The action a_t of selecting the channel used for communication can take, for example, four values: selecting channel #1, selecting channel #2, selecting channel #3, and selecting channel #4. Selecting channel #1 is represented by "1", selecting channel #2 by "2", selecting channel #3 by "3", and selecting channel #4 by "4".

Accordingly, the Q table is represented by a matrix of 3 rows × 4 columns and contains 12 Q values (Q_{1,1} to Q_{3,4}).
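The state and action encodings and the 3 × 4 table can be written down directly. The sketch below uses plain Python lists; the names `STATES`, `ACTIONS`, `new_q_table`, and `q_value` are illustrative and do not come from the description. The table is filled with zeros, which is the initial value the description gives for all 12 Q values.

```python
STATES = {"short": 1, "medium": 2, "long": 3}   # state s_t encoding from the text
ACTIONS = (1, 2, 3, 4)                            # action a_t = selected channel number

def new_q_table():
    """Return a 3 x 4 table of zeros; q[s - 1][a - 1] holds Q_{s,a}."""
    return [[0.0 for _ in ACTIONS] for _ in STATES]

def q_value(q, state, action):
    """Read Q_{state,action} using the 1-based indices of the description."""
    return q[state - 1][action - 1]
```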

The initial values of the 12 Q values (Q_{1,1} to Q_{3,4}) are all "0". When the packet transmission time of the terminal device 10 is in a state s_t (one of 1 to 3) and an action a_t selecting one of channels #1 to #4 is taken, then, if the packet can be transmitted, the reward r_{t+1} for the next communication is calculated on the basis of the communication result (success/failure of the communication), the competing terminal information, and the channel standby time WT.

Because a shorter channel standby time WT shown in FIG. 8 allows higher channel utilization efficiency, weighting based on the channel standby time WT is applied so that the communication throughput becomes larger when WT is shorter and smaller when WT is longer.

Likewise, a smaller number of competing terminal devices reduces contention and increases the communication throughput, whereas a larger number of competing terminal devices increases contention and reduces the throughput. Weighting based on the number of competing terminal devices is therefore applied so that the communication throughput becomes larger when the number is smaller and smaller when the number is larger.

As a result, the Q values of the Q table are updated by the following equation.

In equation (2), r_{t+1} is the reward, α is the learning rate, and γ is the discount rate.

In the embodiment of the present invention, α is, for example, 0.5, and γ is, for example, 0.
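The body of equation (2) is not reproduced in this text. The standard Q-learning update uses exactly the three quantities named here (reward, learning rate, discount rate) and agrees with the worked values given later in the description for FIG. 10, so it is shown below as an assumed reconstruction, not as the equation as filed.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]
```

Under this form, setting γ = 0 reduces the update to Q(s_t, a_t) ← (1 − α) Q(s_t, a_t) + α r_{t+1}, so that a Q value starting at 0 becomes α r_{t+1} after one update.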

The reward r_{t+1} in equation (2) is calculated by the following equation.

F in equation (3A) is the communication throughput. w_1 in equation (3A) is a weighting coefficient calculated by equation (3B), and N_CFL in equation (3B) is the number of competing terminal devices. Consequently, the weighting coefficient w_1 becomes smaller as the number N_CFL of competing terminal devices increases and larger as N_CFL decreases.

w_2 in equation (3A) is a weighting coefficient calculated by equation (3C), and WT in equation (3C) is the channel standby time on each channel. Consequently, the weighting coefficient w_2 becomes larger as the channel standby time WT becomes shorter and smaller as WT becomes longer.

Furthermore, v in equation (3A) is "1" when the communication succeeds and "0" when the communication fails (see equation (3D)). Therefore, when the communication fails, the reward r_{t+1} is "0".

As shown in equation (3), the learner 5 calculates the reward r_{t+1} so that it becomes larger as the number N_CFL of competing terminal devices decreases and smaller as N_CFL increases. Specifically, when N_CFL is a first number, the learner 5 multiplies the throughput F by a weighting coefficient w_1 having a first value, and when N_CFL is a second number larger than the first number, it multiplies the throughput F by a weighting coefficient w_1 having a second value smaller than the first value.

The learner 5 also calculates the reward r_{t+1} so that it becomes larger as the channel standby time WT becomes shorter and smaller as WT becomes longer. Specifically, when WT has a first time length, the learner 5 multiplies the throughput F by a weighting coefficient w_2 having a third value, and when WT has a second time length longer than the first time length, it multiplies the throughput by a weighting coefficient w_2 having a fourth value smaller than the third value.

When packet transmission is "not possible", the reward r_{t+1} is "0", because the throughput F cannot be calculated in that case.
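The reward described above can be sketched in code. Equations (3A) to (3D) are not reproduced in this text, so the product form r = v · w_1 · w_2 · F and the particular weight functions 1/(1 + N_CFL) and 1/(1 + WT) are assumptions chosen only to satisfy the stated properties: w_1 falls as N_CFL grows, w_2 falls as WT grows, and the reward is zero on failure or when transmission is not possible.

```python
def weight_competitors(n_cfl: int) -> float:
    """w_1: decreases as the number of competing terminals grows (assumed form)."""
    return 1.0 / (1.0 + n_cfl)

def weight_wait(wt: float) -> float:
    """w_2: decreases as the channel standby time grows (assumed form)."""
    return 1.0 / (1.0 + wt)

def reward(throughput: float, n_cfl: int, wt: float,
           success: bool, can_transmit: bool = True) -> float:
    """Reward r_{t+1}; zero when transmission is not possible or has failed."""
    if not can_transmit:
        return 0.0                      # throughput F cannot be computed
    v = 1.0 if success else 0.0         # v = 1 on success, 0 on failure
    return v * weight_competitors(n_cfl) * weight_wait(wt) * throughput
```

Any other monotonically decreasing weight functions would fit the description equally well; only the direction of the dependence is given in the text.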

The learner 5 substitutes the Q value (Q_{s,a}) corresponding to the state s_t and the action a_t, the reward r_{t+1}, the learning rate α, and the discount rate γ into equation (2) to update the Q value (Q_{s,a}).

The ε-greedy method is used to determine the action a_t of the terminal device 10. In the ε-greedy method, a fixed small number ε (for example, 0.3) is chosen in advance. When a generated random number is ε or less, the action a_t is determined at random; when the generated random number is not ε or less, the action a_t is set to the action with the largest Q value.
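A minimal sketch of this selection rule is given below. The function name and the `rng` parameter are illustrative. Ties for the largest Q value are broken at random, which matches the behaviour the description gives for the all-zero initial table; how ties are handled in other cases is an assumption.

```python
import random

def epsilon_greedy(q_row, epsilon=0.3, rng=random):
    """Return a 1-based action for one state's row of Q values."""
    if rng.random() <= epsilon:
        return rng.randrange(len(q_row)) + 1        # explore: random channel
    best = max(q_row)
    candidates = [i + 1 for i, q in enumerate(q_row) if q == best]
    return rng.choice(candidates)                   # exploit; ties broken randomly
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing learning runs.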

When the control means 4 controls the learner 5 to execute Q-learning by outputting the communication result, the competing terminal information (= the number N_CFL of competing terminal devices), and the channel standby time WT to the learner 5, it outputs the channel standby times WT1 to WT4 shown in FIG. 8 in association with channels #1 to #4, respectively. As a result, when the learner 5 calculates the reward r_{t+1} after determining the state s_t and the action a_t in Q-learning, it can calculate the weighting coefficient w_2 by equation (3C) using the channel standby time WT (one of WT1 to WT4) that corresponds to the channel (one of channels #1 to #4) selected by the determined action a_t.

FIGS. 10 to 12 are first to third schematic diagrams, respectively, for explaining the method of updating the Q table.

Referring to FIG. 10, in the initial state of Q-learning, all Q values (Q_{1,1} to Q_{3,4}) of the Q table are "0" (see FIG. 10(a)).

In the embodiment of the present invention, the state s_t is "1" when the packet transmission time is "short", "2" when it is "medium", and "3" when it is "long".

Because the packet transmission time can be "long", "medium", or "short", the learner 5 can select any of the states s_t "1", "2", and "3". The learner 5 therefore determines one state s_t at random from these states. Suppose, for example, that the learner 5 determines the state s_t to be "2".

The learner 5 also determines the action a_t according to the ε-greedy method. In this case, because all Q values (Q_{1,1} to Q_{3,4}) of the Q table are "0" (that is, no single maximum Q value can be determined), the learner 5 determines the action a_t of the terminal device 10 at random regardless of whether the random number is ε or less. Suppose, for example, that the learner 5 determines the action a_t to be "2".

Then, if the terminal device 10 can transmit the packet, the learner 5 calculates the reward r_{t+1} for the (t+1)-th transmission of the packet by the method described above, on the basis of the communication result (success/failure of the communication), the competing terminal information, and the channel standby time.

The learner 5 then substitutes the calculated reward r_{t+1}, the learning rate α, the discount rate γ, and the Q value for the t-th transmission of the packet (the initial Q value corresponding to the state s_t (= "2") and the action a_t (= "2")) into equation (2), and updates that Q value to the Q value q_{2,2} (see FIG. 10(b)).

Here, the state s_t, which is determined by the length of the packet transmission time, and the action a_t, which is the selection of a channel, do not depend on past communication failures or successes. Moreover, the reward r_{t+1} is calculated from the failure/success of the current communication, not from that of past communications. The discount rate γ, which is a parameter concerning the cumulative reward, should therefore be set to "0".

Consequently, since the Q value for the t-th transmission of the packet is "0" (see FIG. 10(a)) and γ = 0, the updated Q value (q_{2,2}) is effectively equal to αr_{t+1}.
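The arithmetic of this step can be checked with a short sketch. It takes the standard Q-learning update as the assumed form of equation (2), whose body is not reproduced in this text; the function name `q_update` and the reward value 4.0 are purely illustrative.

```python
def q_update(q_old, reward, alpha=0.5, gamma=0.0, max_next_q=0.0):
    """One Q-learning step (assumed form of equation (2)).

    With gamma = 0 the term max_next_q has no effect on the result.
    """
    return q_old + alpha * (reward + gamma * max_next_q - q_old)

# Starting from the initial value 0 with alpha = 0.5 and gamma = 0,
# the updated value equals alpha * reward.
q_22 = q_update(0.0, 4.0)
```

A second update of the same entry with the same reward moves the value halfway from its current value toward the reward, and a failed communication (reward 0) halves it.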

Next, suppose that the learner 5, on the basis of "long", "medium", and "short" as lengths of the packet transmission time, determines the state s_t at random to be, for example, "1" and determines the action a_t to be "4" by the ε-greedy method.

Then, if the packet can be transmitted, the learner 5 calculates the reward r_{t+1} for the (t+1)-th transmission by the method described above, on the basis of the communication result (success/failure of the communication), the competing terminal information, and the channel standby time.

The learner 5 then substitutes the calculated reward r_{t+1}, the learning rate α, the discount rate γ (= 0), and the Q value for the t-th transmission of the packet (the initial Q value corresponding to the state s_t (= "1") and the action a_t (= "4")) into equation (2), and updates that Q value to the Q value q_{1,4} (see FIG. 11(a)).

On the other hand, when the generated random number is not ε or less, the learner 5 sets the action a_t of the terminal device 10 to the action with the largest Q value. At this point the Q table is in the state shown in FIG. 10(b), so the largest Q value is q_{2,2}. The learner 5 therefore determines the action a_t of the terminal device 10 to be "2" (the action of selecting channel #2).

The learner 5 then calculates, by the method described above, the reward r_{t+1} obtained when the terminal device 10 executes action "2" (the action of selecting channel #2), and uses the calculated reward r_{t+1} in equation (2) to update the Q value q_{2,2} to the Q value q'_{2,2} (see FIG. 11(b)).

When the communication in the t-th transmission of the packet fails, v in equation (3A) is set to "0" (see equation (3D)), so the reward r_{t+1} is "0". Accordingly, when the communication in the t-th transmission of the packet fails, the learner 5 calculates the reward r_{t+1} for the (t+1)-th transmission of the packet as "0" and updates the Q value.

Thereafter, the learner 5 repeats the operation described above, updating the Q values of the Q table, until a termination condition is satisfied. The termination condition is, for example, that the Q value update described above has been executed a predetermined number of times. The predetermined number of times is, for example, 10.

When the termination condition is satisfied, the learner 5 refers to the Q table as it stands at that time (see FIG. 12). For the state s_t of "1", it detects the maximum Q value (q_{1_max}) among the corresponding q values (q_{1,1}, q_{1,3}, q_{1,4}) and the action a_{t_max1} corresponding to that maximum. For the state s_t of "2", it detects the maximum Q value (q_{2_max}) among the corresponding q values (q_{2,2}, q_{2,3}) and the action a_{t_max2} corresponding to that maximum. For the state s_t of "3", it detects the maximum Q value (q_{3_max}) among the corresponding q values (q_{3,2}, q_{3,4}) and the action a_{t_max3} corresponding to that maximum. The learner 5 then generates [state s_t "1" / q_{1_max} / a_{t_max1}], which associates the maximum Q value q_{1_max} and the action a_{t_max1} with the state s_t of "1"; [state s_t "2" / q_{2_max} / a_{t_max2}], which associates q_{2_max} and a_{t_max2} with the state s_t of "2"; and [state s_t "3" / q_{3_max} / a_{t_max3}], which associates q_{3_max} and a_{t_max3} with the state s_t of "3".

The learner 5 then outputs to the control means 4 the output information IF_OUT consisting of [state s_t "1" / q_{1_max} / a_{t_max1}], [state s_t "2" / q_{2_max} / a_{t_max2}], and [state s_t "3" / q_{3_max} / a_{t_max3}].
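Building the output information from a finished Q table amounts to a per-row maximum. The sketch below represents IF_OUT as a list of (state, q_max, action) tuples, which is an assumed layout, and takes the first maximum on ties. It searches the whole row, whereas the description takes the maximum over the entries that were updated; the two agree whenever at least one updated value in the row is positive.

```python
def build_output_info(q_table):
    """Return [(state, q_max, best_action), ...] with 1-based state and action."""
    info = []
    for s, row in enumerate(q_table, start=1):
        q_max = max(row)
        a_max = row.index(q_max) + 1   # first maximum wins on ties (an assumption)
        info.append((s, q_max, a_max))
    return info

# Illustrative table: non-zero entries mirror the positions named for FIG. 12,
# but the values themselves are made up.
Q_EXAMPLE = [
    [0.2, 0.0, 0.5, 0.1],
    [0.0, 0.4, 0.3, 0.0],
    [0.0, 0.6, 0.0, 0.9],
]
```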

The control means 4 receives the output information IF_OUT (= [state s_t "1" / q_{1_max} / a_{t_max1}], [state s_t "2" / q_{2_max} / a_{t_max2}], and [state s_t "3" / q_{3_max} / a_{t_max3}]) from the learner 5.

When the state s_t corresponding to the length of the transmission time of the transmission packet is "1", the control means 4 detects [state s_t "1" / q_{1_max} / a_{t_max1}] from the output information IF_OUT and detects the action a_{t_max1} from it. The control means 4 then selects the channel chosen by the action a_{t_max1} (one of channel #1, channel #3, and channel #4) as the transmission channel CH_T.

When the state s_t corresponding to the length of the transmission time of the transmission packet is "2", the control means 4 detects [state s_t "2" / q_{2_max} / a_{t_max2}] from the output information IF_OUT and detects the action a_{t_max2} from it. The control means 4 then selects the channel chosen by the action a_{t_max2} (one of channel #2 and channel #3) as the transmission channel CH_T.

Furthermore, when the state s_t corresponding to the length of the transmission time of the transmission packet is "3", the control means 4 detects [state s_t "3" / q_{3_max} / a_{t_max3}] from the output information IF_OUT and detects the action a_{t_max3} from it. The control means 4 then selects the channel chosen by the action a_{t_max3} (one of channel #2 and channel #4) as the transmission channel CH_T.
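The three cases above share one lookup: map the packet's length class to a state, find the matching entry in IF_OUT, and read off the action. A minimal sketch follows, assuming IF_OUT is held as a list of (state, q_max, action) tuples; the names and the sample values are illustrative.

```python
# State encoding from the description: "short" -> 1, "medium" -> 2, "long" -> 3.
STATE_OF_LENGTH = {"short": 1, "medium": 2, "long": 3}

def select_tx_channel(if_out, length_label):
    """Return the channel number CH_T for a packet of the given length class."""
    state = STATE_OF_LENGTH[length_label]
    for s, _q_max, action in if_out:
        if s == state:
            return action          # action a = k means "select channel #k"
    raise LookupError(f"no entry for state {state} in IF_OUT")

IF_OUT = [(1, 0.5, 3), (2, 0.4, 2), (3, 0.9, 4)]   # illustrative values only
```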

FIG. 13 is a diagram for explaining the method of determining whether a transmission packet can be transmitted.

Referring to FIG. 13, at times later than the current time T_CPT, a channel free time T_Idle1 exists on channel #1, a channel free time T_Idle2 exists on channel #2, a channel free time T_Idle3 exists on channel #3, and a channel free time T_Idle4 exists on channel #4.

The channel free time T_Idle1 is shorter than the transmission time of the transmission packet. The channel free times T_Idle2, T_Idle3, and T_Idle4 are longer than the transmission time of the transmission packet. Of the channel free times T_Idle2, T_Idle3, and T_Idle4, T_Idle2 is the shortest.

When the control means 4 selects, for example, a transmission channel CH_T (one of channels #1 to #4), it controls the receiving means 2 to perform carrier sensing on the selected transmission channel CH_T. On receiving the carrier sense result from the receiving means 2, the control means 4 determines from that result whether another terminal device is communicating on the transmission channel CH_T.

When the control means 4 determines that no other terminal device is communicating on the transmission channel CH_T, it determines whether the predicted channel free time on the transmission channel CH_T (one of channels #1 to #4) is longer than the transmission time of the transmission packet.

When the control means 4 determines that the predicted channel free time on the transmission channel CH_T is longer than the transmission time of the transmission packet, it outputs the transmission channel CH_T and the transmission packet to the transmission means 7 and controls the transmission means 7 to transmit the transmission packet to the base station BS on the transmission channel CH_T.

On the other hand, when the control means 4 determines that the predicted channel free time on the transmission channel CH_T is equal to or shorter than the transmission time of the transmission packet, or determines from the carrier sense result that another terminal device is communicating on the transmission channel CH_T, it determines that the transmission packet cannot be transmitted and generates communication-disabled information.

In the case shown in FIG. 13, the channel free time T_Idle2 (= predicted channel free time) on channel #2 is longer than the transmission time of the transmission packet.

The control means 4 therefore determines that the channel free time T_Idle2 (= predicted channel free time) is longer than the transmission time of the transmission packet.

The control means 4 then outputs the transmission channel CH_T (= channel #2) and the transmission packet to the transmission means 7 and controls the transmission means 7 to transmit the transmission packet to the base station BS on the transmission channel CH_T (= channel #2).

As described above, whether the transmission packet can be transmitted is decided by the following two determinations.
(Determination 1) Based on the result of carrier sensing on the transmission channel CH_T, it is determined whether another terminal device is communicating on the transmission channel CH_T.
(Determination 2) It is determined whether the predicted channel free time on the transmission channel CH_T is longer than the transmission time of the transmission packet.

When both determinations 1 and 2 yield a positive result, it is determined that the transmission packet can be transmitted; when at least one of determinations 1 and 2 yields a negative result, it is determined that the transmission packet cannot be transmitted.
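The two determinations combine into a single predicate, sketched below. The function and parameter names are illustrative. "Positive" for determination 1 is taken to mean that carrier sensing found no other terminal on CH_T, and determination 2 uses a strict comparison because the description treats a free time equal to the transmission time as not transmittable.

```python
def can_transmit(channel_busy: bool, predicted_idle: float, tx_time: float) -> bool:
    """Return True only if both determinations allow transmission on CH_T."""
    determination_1 = not channel_busy            # no other terminal on CH_T
    determination_2 = predicted_idle > tx_time    # free time strictly longer
    return determination_1 and determination_2
```

For a 1 ms packet, `can_transmit(False, 1.5, 1.0)` allows transmission, while a busy channel or a predicted free time of 1 ms or less does not.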

In the embodiment of the present invention, the transmission packet may also be transmitted to the base station BS on the transmission channel CH_T without performing backoff when determination 2 alone shows that the predicted channel free time on the transmission channel CH_T is longer than the transmission time of the transmission packet.

FIG. 14 is a flowchart for explaining the operation of the terminal device 10 shown in FIG. 2. Referring to FIG. 14, when the operation of the terminal device 10 starts, the control means 4 determines whether there is a transmission packet by determining whether it has received a transmission packet from the application 6 (step S1). In this case, the control means 4 determines that there is a transmission packet when it has received one from the application 6, and determines that there is no transmission packet when it has not.

When it is determined in step S1 that there is a transmission packet, the control means 4 determines the action a_t (the action of selecting a channel) on the basis of the output information IF_OUT output from the learner 5 (step S2).

The control means 4 then controls the receiving means 2 to perform carrier sensing on the channel selected by the determined action a_t, and the receiving means 2 performs carrier sensing under the control of the control means 4 (step S3).

Thereafter, the prediction means 3 predicts the predicted channel free time and the predicted channel usage time on each channel by the method described above (step S4) and outputs them to the control means 4.

The control means 4 receives the predicted channel free time and the predicted channel usage time from the prediction means 3.

Next, the control means 4 determines whether the transmission packet can be transmitted (step S5). In this case, the control means 4 makes this decision using determinations 1 and 2 described above.

When it is determined in step S5 that the transmission packet cannot be transmitted, the control means 4 generates transmission-disabled information (step S6).

一方、ステップＳ５において、送信用パケットが送信可能であると判定されたとき、制御手段４は、ステップＳ２で決定した行動ａ_ｔにおいて選択されたチャネルを送信用チャネルＣＨ＿Ｔとし、送信用チャネルＣＨ＿Ｔと送信用パケットとを送信手段７へ出力し、送信用チャネルＣＨ＿Ｔで送信用パケットを基地局ＢＳへ送信するように送信手段７を制御する。 On the other hand, in step S5, when the transmission packet is determined to be transmittable, the control unit 4, the selected channel and a transmitting channel CH_T in action a _t determined in step S2, the transmission channel CH_T The transmission means 7 is controlled so as to output the transmission packet to the transmission means 7 and transmit the transmission packet to the base station BS on the transmission channel CH_T.

送信手段７は、送信用チャネルＣＨ＿Ｔおよび送信用パケットを制御手段４から受け、制御手段４からの制御に従って、アンテナ１を介して、送信用チャネルＣＨ＿Ｔで送信用パケットを基地局ＢＳへ送信する（ステップＳ７）。 The transmission means 7 receives the transmission channel CH_T and the transmission packet from the control means 4, and transmits the transmission packet to the base station BS on the transmission channel CH_T via the antenna 1 according to the control from the control means 4 (the transmission means 7 receives the transmission channel CH_T and the transmission packet from the control means 4. Step S7).

After that, the control means 4 determines whether an ACK packet has been received from the base station BS (step S8). Specifically, the control means 4 determines that the ACK packet has been received from the base station BS when it receives the ACK packet from the receiving means 2, and that it has not been received when it does not receive the ACK packet from the receiving means 2.

When it is determined in step S8 that the ACK packet has been received from the base station BS, the control means 4 sets the communication result to "1" (step S9).

On the other hand, when it is determined in step S8 that the ACK packet has not been received from the base station BS, the control means 4 sets the communication result to "0" (step S10).

Then, after step S9 or step S10, the control means 4 acquires the competing terminal information from the base station BS using the control channel (step S11).

Subsequently, after step S6 or step S11, the control means 4 outputs the competing terminal information, the transmission-impossible information, the communication result, and the channel free time to the learner 5 as input information, and controls the learner 5 so that it performs Q-learning.

The learner 5 receives the competing terminal information, the transmission-impossible information, the communication result, and the channel free time as input information, and executes Q-learning using the received competing terminal information, transmission-impossible information, communication result, and channel standby time (step S12).

The learner 5 then outputs the output information IF_OUT, which is the result of the Q-learning, to the control means 4. The control means 4 receives the output information IF_OUT from the learner 5.

After that, the control means 4 determines whether to end the communication (step S13). When it is determined in step S13 that the communication is not to be ended, the operation returns to step S1. Steps S1 to S13 are then repeated until it is determined in step S13 that the communication is to be ended.

When it is determined in step S13 that the communication is to be ended, the operation of the terminal device 10 ends.

FIG. 15 is a flowchart for explaining the detailed operation of step S5 of FIG. 14.

Referring to FIG. 15, after step S4 of FIG. 14, the control means 4 determines, based on the carrier sensing result, whether another terminal device is communicating on the transmission channel CH_T (step S51).

When it is determined in step S51 that no other terminal device is communicating on the transmission channel CH_T, the control means 4 determines, based on the predicted channel free time and predicted channel usage time of each channel received from the prediction means 3, whether the predicted channel free time of the transmission channel CH_T is longer than the transmission time of the transmission packet (step S52).

When it is determined in step S51 that another terminal device is communicating on the transmission channel CH_T, or when it is determined in step S52 that the predicted channel free time of the transmission channel CH_T is not longer than the transmission time of the transmission packet, the control means 4 determines that the transmission packet cannot be transmitted (step S53).

On the other hand, when it is determined in step S52 that the predicted channel free time of the transmission channel CH_T is longer than the transmission time of the transmission packet, the control means 4 determines that the transmission packet can be transmitted (step S54).

After step S53, the operation proceeds to step S6 of FIG. 14, and after step S54, the operation proceeds to step S7 of FIG. 14.
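The two-stage check of steps S51 to S54 can be sketched as a small predicate. This is only an illustrative sketch: the function name `can_transmit` and its parameter names are hypothetical and do not appear in the specification, and times are assumed to be expressed in the same unit.

```python
def can_transmit(channel_busy: bool, predicted_idle_time: float, tx_time: float) -> bool:
    """Return True when the transmission packet may be sent on CH_T (steps S51 to S54)."""
    # S51: carrier sensing found another terminal communicating on CH_T -> S53 (not transmittable)
    if channel_busy:
        return False
    # S52: the predicted channel free time must be strictly longer than the
    # transmission time of the packet; otherwise S53 (not transmittable)
    return predicted_idle_time > tx_time
```

A result of `False` corresponds to proceeding to step S6 (generating the transmission-impossible information), and `True` to proceeding to step S7 (transmission).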

FIG. 16 is a flowchart for explaining the detailed operation of step S12 of FIG. 14.

Referring to FIG. 16, after step S6 or step S11 of FIG. 14, the learner 5 receives the input information from the control means 4 (step S121).

The learner 5 then randomly determines the state s_t for the t-th packet transmission (step S122).

After that, the learner 5 determines the action a_t for the t-th packet transmission based on the ε-greedy method (step S123).

Subsequently, the learner 5 calculates the weighting coefficient w_1 by equation (3B) based on the competing terminal information (for example, the number of competing terminals) (step S124).

The learner 5 then calculates the weighting coefficient w_2 by equation (3C) based on the channel standby time WT (step S125).

After that, the learner 5 determines whether the transmission-impossible information is included in the input information (step S126).

When it is determined in step S126 that the transmission-impossible information is not included in the input information, the learner 5 calculates, by equation (3A) using the weighting coefficients w_1 and w_2, the reward r_{t+1} for the (t+1)-th packet transmission obtained when the action a_t is executed in the state s_t (step S127).

On the other hand, when it is determined in step S126 that the transmission-impossible information is included in the input information, the learner 5 sets to zero the reward r_{t+1} for the (t+1)-th packet transmission obtained when the action a_t is executed in the state s_t (step S128).

After step S127 or step S128, the learner 5 updates the Q value in the Q table corresponding to the state s_t and the action a_t, using the learning rate α, the discount rate γ (= 0), and the reward r_{t+1} (step S129).

The learner 5 then determines whether the Q-learning end condition is satisfied (step S130). The end condition is, for example, that the number of Q-value updates has reached 10.

When it is determined in step S130 that the Q-learning end condition is not satisfied, the operation returns to step S122. Steps S122 to S130 are then repeated until it is determined in step S130 that the Q-learning end condition is satisfied.

When it is determined in step S130 that the Q-learning end condition is satisfied, the learner 5 outputs to the control means 4 the output information IF_OUT, which associates each state s_t, the maximum Q value in that state, and the action a_t that yields the maximum Q value (step S131). The operation then proceeds to step S13 of FIG. 14.
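The loop of steps S122 to S130 can be sketched roughly as follows. Several points are assumptions rather than the specification's own definitions: equation (3A) is not reproduced in this section, so the reward is assumed here to be F × (w_1 + w_2); w_1 and w_2 are taken as reciprocals of the competing-terminal count and of the channel standby time WT; the Q table is a nested dictionary; and `observe` is a hypothetical callback that supplies the input information for a given state and action. All function names are illustrative.

```python
import random

def reward(throughput, n_competitors, wait_time, tx_impossible):
    """Reward r_{t+1} (steps S124 to S128)."""
    if tx_impossible:                 # S126 "YES" -> S128: reward is zero
        return 0.0
    w1 = 1.0 / n_competitors          # S124: reciprocal of the competing-terminal count
    w2 = 1.0 / wait_time              # S125: reciprocal of the channel standby time WT
    return throughput * (w1 + w2)     # S127: ASSUMED form of equation (3A)

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice among the actions (channels) of one state (S123)."""
    if rng.random() < epsilon:
        return rng.choice(list(q_row))        # explore: random channel
    return max(q_row, key=q_row.get)          # exploit: channel with the largest Q value

def q_update(q, state, action, r, alpha):
    """Q-learning update with discount rate gamma = 0 (S129)."""
    # With gamma = 0 the bootstrap term vanishes: Q <- Q + alpha * (r - Q)
    q[state][action] += alpha * (r - q[state][action])

def run_q_learning(q, observe, alpha=0.5, epsilon=0.1, max_updates=10, seed=0):
    """Repeat S122 to S129 until the update count reaches max_updates (S130)."""
    rng = random.Random(seed)
    for _ in range(max_updates):
        state = rng.choice(list(q))                       # S122: random state
        action = choose_action(q[state], epsilon, rng)    # S123
        # observe() is a hypothetical hook returning
        # (throughput, n_competitors, wait_time, tx_impossible)
        r = reward(*observe(state, action))
        q_update(q, state, action, r, alpha)
    return q
```

Because γ = 0, each Q value simply tracks an exponentially weighted average of the rewards observed for that state and action, which matches the later remark that the Q value is substantially determined by the reward.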

By repeatedly executing steps S122 to S125, "NO" in S126, S127, and S129 shown in FIG. 16, the reward r_{t+1} becomes larger when the number of competing terminals is small and/or the channel standby time WT is short. Since the Q value is substantially determined by the reward r_{t+1}, a larger reward r_{t+1} updates the Q value corresponding to the state s_t and the action a_t to a larger value.

As a result, a small number of competing terminals and a short channel standby time WT increase the Q value by increasing the reward r_{t+1}.

Therefore, a small number of competing terminals lowers the probability that the terminal device 10 becomes a competing terminal for other terminal devices. As a result, the packet collision probability can be reduced.

In addition, a short channel standby time WT increases the probability that a channel with a short channel standby time WT is selected. As a result, the channel utilization efficiency can be improved.

When step S2 of FIG. 14 is executed for the first time, the learner 5 determines the action a_t randomly (that is, selects a channel at random).

The learner 5 then executes Q-learning in step S12 (the flowchart shown in FIG. 16) and outputs to the control means 4 the output information IF_OUT, which associates each state s_t, the maximum Q value in that state, and the action a_t that yields the maximum Q value.

FIG. 17 is a conceptual diagram of the output information IF_OUT. Referring to FIG. 17, the output information IF_OUT includes the state s_t, the maximum of the Q values corresponding to the state s_t, and the action a_t that yields the maximum Q value.

The state s_t, the maximum of the Q values corresponding to the state s_t, and the action a_t that yields the maximum Q value are associated with one another.

The Q table at the end of the Q-learning in step S12 (the flowchart shown in FIG. 16) consists of the Q table TBL_Q shown in FIG. 17.

When the learner 5 determines that the Q-learning end condition has been satisfied, it detects the maximum Q value (for example, q_{1,3}) among the Q values (= q_{1,1}, q_{1,3}, q_{1,4}) corresponding to the state s_t of "1", and the action a_t (= "3") that yields this maximum Q value.

The learner 5 then stores the state s_t of "1", the maximum Q value (for example, q_{1,3}), and the action a_t (= "3") in the output information IF_OUT in association with one another.

Likewise, the learner 5 detects the maximum Q value (for example, q_{2,2}) among the Q values (= q_{2,2}, q_{2,3}) corresponding to the state s_t of "2", and the action a_t (= "2") that yields this maximum Q value.

The learner 5 then stores the state s_t of "2", the maximum Q value (for example, q_{2,2}), and the action a_t (= "2") in the output information IF_OUT in association with one another.

Furthermore, the learner 5 detects the maximum Q value (for example, q_{3,4}) among the Q values (= q_{3,2}, q_{3,4}) corresponding to the state s_t of "3", and the action a_t (= "4") that yields this maximum Q value.

The learner 5 then stores the state s_t of "3", the maximum Q value (for example, q_{3,4}), and the action a_t (= "4") in the output information IF_OUT in association with one another.
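Building the output information IF_OUT from the Q table amounts to taking, for each state, the largest Q value and the action that attains it. A minimal sketch follows; the nested-dictionary layout, the function name `build_output_info`, and the numeric Q values are hypothetical and chosen only so that the result reproduces the actions "3", "2", and "4" of the example above.

```python
def build_output_info(q_table):
    """Map each state to (maximum Q value, action yielding it), as in IF_OUT."""
    output_info = {}
    for state, row in q_table.items():
        best_action = max(row, key=row.get)      # action with the largest Q value in this state
        output_info[state] = (row[best_action], best_action)
    return output_info

# Hypothetical Q table TBL_Q: state -> {action (channel number): Q value}
tbl_q = {
    1: {1: 0.2, 3: 0.9, 4: 0.5},   # q_{1,1}, q_{1,3}, q_{1,4}
    2: {2: 0.7, 3: 0.1},           # q_{2,2}, q_{2,3}
    3: {2: 0.3, 4: 0.8},           # q_{3,2}, q_{3,4}
}
if_out = build_output_info(tbl_q)
```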

The control means 4 receives the output information IF_OUT from the learner 5. Then, in step S2 executed after step S12 (the flowchart shown in FIG. 16), the control means 4 determines the action a_t (that is, selects the transmission channel CH_T) as follows.

The control means 4 detects, from among the state s_t of "1", the state s_t of "2", and the state s_t of "3" included in the output information IF_OUT, the state s_t that matches the state s'_t determined by the length of the transmission time of the transmission packet.

For example, the control means 4 detects the state s_t of "1" as the state matching the state s'_t. The control means 4 then detects the maximum Q value (= q_{1_max} (= q_{1,3})) and the action a_t (= a_{t_max1} (= "3")) associated with the state s_t of "1".

The control means 4 then selects channel #3, which is selected by the action a_t of "3", as the transmission channel CH_T.

When the state s_t matching the state s'_t is the state of "2" or the state of "3", the control means 4 similarly selects channel #2 or channel #4 as the transmission channel CH_T.
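The lookup performed by the control means 4 in step S2 can be sketched as below. How the transmission-time length is mapped to a state is not given in this section, so the binning by `upper_edges` is purely an assumption for illustration, as are the function names and the numeric values.

```python
from bisect import bisect_left

def state_from_tx_time(tx_time, upper_edges):
    """Map a transmission time to a state number 1..len(upper_edges)+1 (ASSUMED binning)."""
    # A time less than or equal to upper_edges[0] is state 1, and so on.
    return bisect_left(upper_edges, tx_time) + 1

def select_channel(output_info, tx_time, upper_edges):
    """Return the channel number stored in IF_OUT for the state s'_t of this packet."""
    state = state_from_tx_time(tx_time, upper_edges)   # s'_t
    _max_q, action = output_info[state]                # entry whose s_t matches s'_t
    return action                                      # the action is the channel number

# IF_OUT as in the example: state -> (maximum Q value, action); Q values are hypothetical
if_out_example = {1: (0.9, 3), 2: (0.7, 2), 3: (0.8, 4)}
edges = (1.0, 2.0)   # hypothetical state boundaries, in the same time unit as tx_time
```

With these assumed boundaries, a short packet falls into state "1" and is sent on channel #3, while longer packets fall into states "2" or "3" and are sent on channel #2 or #4.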

When packets are transmitted according to the flowchart shown in FIG. 14 (including the flowcharts shown in FIGS. 15 and 16), packet collisions occur in the early stage of learning by the learner 5 in step S12, either when a transmitted signal is not detected by carrier sensing or when, because of random access, transmission timings happen to coincide. Such collisions are a necessary cost of eventually selecting the optimum channel, and they are not a problem because they decrease as learning progresses.

In addition, in the flowchart shown in FIG. 14 (including the flowcharts shown in FIGS. 15 and 16), the reward r_{t+1} becomes smaller as the number of competing terminal devices increases (see equations (3A) and (3B)), which lowers the probability that the terminal device 10 becomes a competing terminal for other terminal devices.

Therefore, compared with the case where channel selection is performed only by predicting free channels, the probability that a plurality of terminal devices attempt to connect to the same free channel can be reduced. As a result, packet loss can be suppressed and the frequency can be used effectively.

The operation of the terminal device 10 may be realized by software. In this case, the terminal device 10 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory). The ROM stores a program Prog_A consisting of the steps of the flowchart shown in FIG. 14 (including the flowcharts shown in FIGS. 15 and 16).

The CPU reads the program Prog_A from the ROM, executes it, selects as the transmission channel CH_T a channel whose predicted channel free time matches the length of the transmission time of the transmission packet, and transmits the packet to the base station BS. The RAM temporarily stores the calculated weighting coefficients w_1 and w_2, the calculated reward r_{t+1}, the updated Q values, and the like.

The program Prog_A may also be recorded on a recording medium such as a CD or DVD and distributed. When the recording medium on which the program Prog_A is recorded is loaded into a computer, the computer reads the program Prog_A from the recording medium, executes it, selects as the transmission channel CH_T a channel whose predicted channel free time matches the length of the transmission time of the transmission packet, and transmits the packet to the base station BS.

Therefore, the recording medium on which the program Prog_A is recorded is a computer-readable recording medium.

In the above description, the competing terminal information consists of the number of competing terminals. In the embodiment of the present invention, however, the competing terminal information is not limited to this; it may be the expected value of the packet length occupying a channel, or the number of terminal devices whose packet lengths show a tendency similar to the packet length of the packet that the terminal device 10 itself wants to transmit.

The expected value of the packet length occupying a channel is the average packet length of packets that are transmitted from terminal devices other than the terminal device 10 and occupy the channel. This expected value is acquired at the base station BS and transmitted from the base station BS to the terminal device 10 over the control channel as the competing terminal information.

Similarly, the number of terminal devices whose packet lengths show a tendency similar to the packet length of the packet that the terminal device 10 itself wants to transmit is acquired by the base station BS and transmitted from the base station BS to the terminal device 10 over the control channel as the competing terminal information.

When the competing terminal information consists of the expected value of the packet length occupying a channel, the weighting coefficient w_1 is calculated as the reciprocal of that expected value. When the competing terminal information consists of the number of terminal devices whose packet lengths show a tendency similar to the packet length of the packet that the terminal device 10 itself wants to transmit, the weighting coefficient w_1 is calculated as the reciprocal of that number.

When the weighting coefficient w_1 is calculated as the reciprocal of the expected value of the packet length occupying a channel, or as the reciprocal of the number of terminal devices whose packet lengths show a tendency similar to the packet length of the packet that the terminal device 10 itself wants to transmit, competition with terminal devices that are likely to select the same channel as the terminal device 10 can be suppressed.

In the above description, the weighting coefficient w_2 is calculated using the channel standby time WT. In the embodiment of the present invention, however, this is not a limitation, and the reciprocal of the channel standby time WT may be used as the reward r_{t+1}. In this case, the reward r_{t+1} is calculated by the following equation, obtained from equation (3A) by setting w_2 = 0 and replacing the throughput F with the reciprocal of the channel standby time WT.

When the reciprocal of the channel standby time WT is used as the reward r_{t+1}, the reward r_{t+1} becomes larger as the channel standby time WT becomes shorter and smaller as the channel standby time WT becomes longer. As a result, a channel that improves the channel utilization efficiency can be used as the transmission channel CH_T.

Furthermore, in the embodiment of the present invention, when carrier sensing is performed on each of channels #1 to #4, the identification information ID (for example, the MAC address in a wireless LAN) of the source terminal device may be detected, and the transmission channel CH_T may be selected according to the number of source terminal devices detected on each of channels #1 to #4 and the transmission packet length of the terminal device 10.

In this case, the control means 4 selects the transmission channel CH_T in consideration of the number of source terminal devices detected on each of channels #1 to #4, in addition to the output information IF_OUT from the learner 5 described above.

Alternatively, the transmission channel CH_T may be selected according to the number of source terminal devices detected on each of channels #1 to #4 and the required transmission delay.

More specifically, for a transmission packet that could be transmitted immediately, the transmission channel CH_T may be selected from channels on which the number of detected source terminal devices is small and whose predicted channel free time is equal to or longer than the transmission packet length of the terminal device 10.
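One way to read this selection rule is as a filter followed by a minimum: keep only the channels whose predicted free time is at least the packet's transmission time, then prefer the channel with the fewest detected source terminals. The sketch below follows that reading; the function name, the dictionary inputs, the tie-break by lower channel number, and the sample values are all hypothetical, and the transmission packet length is assumed to be expressed as a duration.

```python
def pick_channel(detected_senders, predicted_idle, tx_time):
    """Pick the least-crowded channel whose predicted free time covers the packet.

    detected_senders: channel number -> number of distinct source IDs seen by carrier sensing
    predicted_idle:   channel number -> predicted channel free time
    tx_time:          transmission time of the packet (same unit as predicted_idle)
    """
    candidates = [ch for ch, idle in predicted_idle.items() if idle >= tx_time]
    if not candidates:
        return None   # no channel qualifies under this rule
    # Fewest detected source terminals first; ties broken by the lower channel number (ASSUMED)
    return min(candidates, key=lambda ch: (detected_senders[ch], ch))

senders = {1: 3, 2: 1, 3: 0, 4: 1}            # hypothetical counts for channels #1 to #4
idle = {1: 5.0, 2: 2.0, 3: 0.5, 4: 2.0}       # hypothetical predicted free times
```

In this sample, channel #3 has no detected senders but its predicted free time is too short, so the rule falls back to channel #2.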

In this case, the control means 4 selects the transmission channel CH_T in consideration of the fact that the transmission packet could be transmitted immediately, in addition to the output information IF_OUT from the learner 5 described above.

This solves the problem that, as the number of terminals communicating on each channel increases, the variance of the packet length, the packet transmission interval, and the like increases and the autocorrelation decreases, so that the accuracy of the values predicted by the prediction means 3 deteriorates and the effect of the present invention becomes difficult to obtain.

Here, a decrease in autocorrelation means that, when packets are transmitted periodically, the packet transmission interval deviates more and more over time from the interval at which the terminal transmitted its first packets.

Furthermore, in the above description, the weighting coefficient w_1 is calculated as the reciprocal of the number of competing terminal devices. In the embodiment of the present invention, however, this is not a limitation, and the weighting coefficient w_1 may be calculated by any formula as long as it becomes larger as the number of competing terminal devices decreases and smaller as the number of competing terminal devices increases.

Likewise, in the above description, the weighting coefficient w_2 is calculated as the reciprocal of the channel standby time. In the embodiment of the present invention, however, this is not a limitation, and the weighting coefficient w_2 may be calculated by any formula as long as it becomes smaller as the channel standby time becomes longer and larger as the channel standby time becomes shorter.
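Any monotonically decreasing function satisfies the condition stated for w_1 and w_2. As a hedged illustration, the sketch below places the reciprocal form described above next to an exponential alternative; the exponential form and its `scale` parameter are not taken from the specification and are shown only as one example of a formula that meets the stated condition.

```python
import math

def weight_reciprocal(x: float) -> float:
    """Reciprocal weight: the form described for w_1 (x = competing-terminal count)
    and w_2 (x = channel standby time WT). Requires x > 0."""
    return 1.0 / x

def weight_exponential(x: float, scale: float = 1.0) -> float:
    """Hypothetical alternative: decays with x and is also defined at x = 0."""
    return math.exp(-x / scale)
```

Both functions decrease as their argument grows, so either one makes the reward smaller on crowded channels or after long standby times.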

In the embodiment of the present invention, the control means 4, which selects the transmission channel CH_T based on the length of the transmission time of the transmission packet and the output information IF_OUT from the learner 5, constitutes the "selection means".

Also, in the embodiment of the present invention, the process of calculating the reward r_{t+1} by multiplying the throughput F by the weighting coefficient w_1 constitutes the "first arithmetic process", and the process of calculating the reward r_{t+1} by multiplying the throughput F by the weighting coefficient w_2 constitutes the "second arithmetic process".

Furthermore, in the embodiment of the present invention, the weighting coefficient w_1 constitutes the "first weighting coefficient", and the weighting coefficient w_2 constitutes the "second weighting coefficient".

Furthermore, in the embodiment of the present invention, the communication result "1" constitutes the "first index", and the communication result "0" constitutes the "second index".

According to the embodiment described above, the following effects can be obtained.

Because the channel is selected so that high throughput is obtained for the packet length to be transmitted, the frequency utilization efficiency of the channels can be improved. In addition, even when the next packet transmission coincides with that of a terminal transmitting packets of a different transmission packet length, the two do not become competitors, and as a result the packet collision probability can be suppressed.

Furthermore, by selecting a channel according to the number of terminal devices observed on each channel, retransmissions caused by packet collisions can be suppressed, and delay can therefore be suppressed.

The embodiment disclosed here should be considered illustrative in all respects and not restrictive. The scope of the present invention is indicated by the claims rather than by the above description of the embodiment, and is intended to include all modifications within the meaning and scope equivalent to the claims.

The present invention is applied to a terminal device, a program to be executed by a computer, and a computer-readable recording medium on which the program is recorded.

１アンテナ、２受信手段、３予測手段、４制御手段、５学習器、６アプリケーション、７送信手段、１０端末装置、３１判定部、３２計測部、３３分類部、３４予測部、３５予測器、１００通信システム。 1 antenna, 2 receiving means, 3 predicting means, 4 controlling means, 5 learner, 6 application, 7 transmitting means, 10 terminal device, 31 judgment unit, 32 measuring unit, 33 classification unit, 34 predicting unit, 35 predictor, 100 communication system.

Claims

A terminal device comprising:
prediction means for predicting, for all of a plurality of channels, a channel free time, which is the time during which a channel is free, and a channel use time, which is the time during which a channel is in use;
a learner that executes a learning process in which: based on input information consisting of a communication result of communication for transmitting a packet to a base station, competing terminal information, which is information on competing terminal devices that compete with the terminal device, and a channel standby time, which is the standby time of a channel from the transmission of one packet to the transmission of the next packet, or based on input information consisting of non-transmissible information indicating that a packet cannot be transmitted to the base station, Q learning is executed with the length of the packet transmission time as the state, the selection of a channel as the action, and the throughput when communication succeeds or the reciprocal of the channel standby time as the reward; the Q value is updated repeatedly until an end condition of the Q learning is satisfied; and output information is output in which each state at the time the end condition of the Q learning is satisfied, the maximum Q value in each state, and the action for which the maximum Q value is obtained in each state are associated with one another;
selection means for detecting, from the output information, the action for which the maximum Q value is obtained in the state of the Q learning corresponding to the length of the transmission time of a packet to be transmitted, and selecting the channel selected by the detected action as a transmission channel; and
transmission means for transmitting the packet to be transmitted to the base station on the transmission channel without executing backoff when a first condition is satisfied, the first condition being that the channel free time predicted by the prediction means for the transmission channel is longer than the length of the transmission time of the packet to be transmitted.

The terminal device according to claim 1, wherein, when the competing terminal information consists of the number of competing terminal devices and the communication result indicates success of communication, the learner executes the learning process by executing a first arithmetic process that calculates the reward so that the reward becomes larger as the number of competing terminal devices decreases and smaller as the number of competing terminal devices increases, and a second arithmetic process that calculates the reward so that the reward becomes larger as the channel standby time becomes shorter and smaller as the channel standby time becomes longer.

The terminal device according to claim 2, wherein the learner executes the first arithmetic process by multiplying the throughput by a first weighting coefficient having a first value when the number of competing terminal devices is a first number, and by multiplying the throughput by the first weighting coefficient having a second value smaller than the first value when the number of competing terminal devices is a second number larger than the first number.

The terminal device according to claim 3, wherein the learner executes the first arithmetic process by multiplying the throughput by the reciprocal of the number of competing terminal devices as the first weighting coefficient.

The terminal device according to any one of claims 2 to 4, wherein the learner executes the second arithmetic process by multiplying the throughput by a second weighting coefficient having a third value when the channel standby time is a first time length, and by multiplying the throughput by the second weighting coefficient having a fourth value smaller than the third value when the channel standby time is a second time length longer than the first time length.

The terminal device according to any one of claims 2 to 5, wherein the learner executes the second arithmetic process by multiplying the throughput by the reciprocal of the channel standby time as the second weighting coefficient.

The terminal device according to claim 1, wherein, in the learning process, the learner updates the Q value with the reward set to zero when the input information consists of the non-transmissible information.

The terminal device according to any one of claims 1 to 7, further comprising a receiving means for receiving the competing terminal information from the base station on a control channel.

The terminal device according to any one of claims 1 to 8, wherein the communication result consists of a first index when, after the transmission means transmits the packet to be transmitted to the base station, an ACK packet indicating that the packet has been received is received from the base station, and consists of a second index when the ACK packet is not received from the base station after the transmission means transmits the packet to be transmitted to the base station.

The terminal device according to any one of claims 1 to 9, wherein the transmission means transmits the packet to be transmitted to the base station on the transmission channel without executing backoff when, in addition to the first condition, a second condition is satisfied, the second condition being that carrier sensing on the transmission channel shows that no other terminal device is communicating on the transmission channel.

A program for causing a computer to execute:
a first step in which prediction means predicts, for all of a plurality of channels, a channel free time, which is the time during which a channel is free, and a channel use time, which is the time during which a channel is in use;
a second step in which a learner executes a learning process in which: based on input information consisting of a communication result of communication for transmitting a packet to a base station, competing terminal information, which is information on competing terminal devices that compete with a terminal device, and a channel standby time, which is the standby time of a channel from the transmission of one packet to the transmission of the next packet, or based on input information consisting of non-transmissible information indicating that a packet cannot be transmitted to the base station, Q learning is executed with the length of the packet transmission time as the state, the selection of a channel as the action, and the throughput when communication succeeds or the reciprocal of the channel standby time as the reward; the Q value is updated repeatedly until an end condition of the Q learning is satisfied; and output information is output in which each state at the time the end condition of the Q learning is satisfied, the maximum Q value in each state, and the action for which the maximum Q value is obtained in each state are associated with one another;
a third step in which selection means detects, from the output information, the action for which the maximum Q value is obtained in the state of the Q learning corresponding to the length of the transmission time of a packet to be transmitted, and selects the channel selected by the detected action as a transmission channel; and
a fourth step in which transmission means transmits the packet to be transmitted to the base station on the transmission channel without executing backoff when a first condition is satisfied, the first condition being that the channel free time predicted by the prediction means for the transmission channel is longer than the length of the transmission time of the packet to be transmitted.

The program according to claim 11, wherein, in the second step, when the competing terminal information consists of the number of competing terminal devices and the communication result indicates success of communication, the learner executes the learning process by executing a first arithmetic process that calculates the reward so that the reward becomes larger as the number of competing terminal devices decreases and smaller as the number of competing terminal devices increases, and a second arithmetic process that calculates the reward so that the reward becomes larger as the channel standby time becomes shorter and smaller as the channel standby time becomes longer.

The program according to claim 12, wherein, in the second step, the learner executes the first arithmetic process by multiplying the throughput by a first weighting coefficient having a first value when the number of competing terminal devices is a first number, and by multiplying the throughput by the first weighting coefficient having a second value smaller than the first value when the number of competing terminal devices is a second number larger than the first number.

The program according to claim 13, wherein, in the second step, the learner executes the first arithmetic process by multiplying the throughput by the reciprocal of the number of competing terminal devices as the first weighting coefficient.

The program according to any one of claims 12 to 14, wherein, in the second step, the learner executes the second arithmetic process by multiplying the throughput by a second weighting coefficient having a third value when the channel standby time is a first time length, and by multiplying the throughput by the second weighting coefficient having a fourth value smaller than the third value when the channel standby time is a second time length longer than the first time length.

The program according to any one of claims 12 to 15, wherein, in the second step, the learner executes the second arithmetic process by multiplying the throughput by the reciprocal of the channel standby time as the second weighting coefficient.

The program according to claim 11, wherein, in the learning process of the second step, the learner updates the Q value with the reward set to zero when the input information consists of the non-transmissible information.

The program according to any one of claims 11 to 17, further causing the computer to execute a fifth step in which receiving means receives the competing terminal information from the base station on a control channel.

The program according to any one of claims 11 to 18, wherein the communication result consists of a first index when, after the transmission means transmits the packet to be transmitted to the base station, an ACK packet indicating that the packet has been received is received from the base station, and consists of a second index when the ACK packet is not received from the base station after the transmission means transmits the packet to be transmitted to the base station.

The program according to any one of claims 11 to 19, wherein, in the fourth step, the transmission means transmits the packet to be transmitted to the base station on the transmission channel without executing backoff when, in addition to the first condition, a second condition is satisfied, the second condition being that carrier sensing on the transmission channel shows that no other terminal device is communicating on the transmission channel.

A computer-readable recording medium on which the program according to any one of claims 11 to 20 is recorded.
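
The transmission conditions recited in the claims above can be sketched as a single check. The sketch assumes that the predicted channel free time and the packet transmission time are given in the same unit and that the result of carrier sensing, when performed, is available as a boolean. The function name and parameter names are hypothetical and do not form part of the claims.

```python
from typing import Optional


def can_transmit_without_backoff(predicted_free_time: float,
                                 tx_time: float,
                                 carrier_sensed_busy: Optional[bool] = None) -> bool:
    """Check the first condition and, if carrier sense was run, the second one."""
    # First condition: the predicted channel free time exceeds the packet transmission time.
    first_condition = predicted_free_time > tx_time
    if carrier_sensed_busy is None:
        # No carrier sense result supplied: decide on the first condition alone.
        return first_condition
    # Second condition: carrier sense found no other terminal on the transmission channel.
    second_condition = not carrier_sensed_busy
    return first_condition and second_condition
```

When the function returns True, the packet would be sent immediately on the selected transmission channel; otherwise the terminal would refrain from transmitting on that channel at that time.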