WO2024055739A1

WO2024055739A1 - Method for determining uplink channel, and terminal and network device

Info

Publication number: WO2024055739A1
Application number: PCT/CN2023/107293
Authority: WO
Inventors: 王和俊; 王滨后; 徐芳; 孙可欣; 谢刚
Original assignee: 青岛海尔智能技术研发有限公司; 海尔智家股份有限公司
Priority date: 2022-09-16
Filing date: 2023-07-13
Publication date: 2024-03-21
Also published as: CN117768075A

Abstract

The present application relates to the technical field of wireless communications. Disclosed is a method for determining an uplink channel. The method is applied to a station (STA) device, and comprises: receiving a trigger instruction, wherein the trigger instruction is used for triggering an STA to sense a channel state; if there is an idle channel, determining an uplink channel according to performance information of the idle channel, and accessing the uplink channel for data transmission; and if there is no idle channel, sensing the channel state again after a set backoff duration. In this way, in the case of communication disorder, etc. in which CSI is unknown or CSI is incomplete, selection of an uplink channel can be completed by means of an STA, such that a relatively large increase in the throughput can be achieved, and the channel utilization rate is maximized, thereby reducing the probability of inter-channel collision, and improving the spectrum efficiency of a system. Further disclosed in the present application are a terminal and a network device.

Description

Method for determining uplink channel, terminal, and network device

This application is filed based on a Chinese patent application with application number 202211128492.3 and a filing date of September 16, 2022, and claims the priority of the Chinese patent application. The entire content of the Chinese patent application is hereby incorporated into this application as a reference.

Technical field

The present application relates to the field of wireless communication technology, for example, to a method for determining an uplink channel, a terminal, and network equipment.

Background

Currently, cooperative scheduling methods among multiple APs mainly include C-OFDMA (Coordinated Orthogonal Frequency-Division Multiple Access) and CBF (Coordinated Beamforming).

In C-OFDMA, the APs (wireless access points) coordinate to share OFDMA (Orthogonal Frequency-Division Multiple Access) resources among all STAs (stations, i.e. terminals), and have different STAs use orthogonal time and frequency resources to avoid RU (Resource Unit, a block of spectrum resources) conflicts. On the one hand, allocating appropriate RUs to an STA requires sufficient CSI (Channel State Information) or sufficient channel estimation. On the other hand, C-OFDMA relies on orthogonal channels to suppress interference and transmit correctly, but when coordination is lost the AP cannot guarantee that the channels are orthogonal, so it cannot suppress interference through orthogonal channels. In addition, ensuring channel orthogonality reduces channel utilization. In CBF, meanwhile, the channel information fed back by the STA to the AP is incomplete, making it difficult for the AP to estimate the channel effectively.

In addition, in WiFi6, channel estimation is completed in the HE-SIG-B field. The length of this field is limited, so complete channel estimation cannot be achieved. To obtain complete CSI, multiple transmissions are required, which will greatly reduce the efficiency of the channel.

Therefore, the two existing solutions, CBF and C-OFDMA, require complete CSI to achieve channel sharing. However, when communication is out of coordination, it is difficult for an AP to know accurately which channels other APs are using, and therefore difficult to produce a channel estimate. In other words, neither solution is feasible when communication is out of coordination.

Summary

In order to provide a basic understanding of some aspects of the disclosed embodiments, a simplified summary is provided below. The summary is not a general review, nor is it intended to identify key/important constituent elements or delineate the protection scope of these embodiments; it is intended to serve as a prelude to the detailed description that follows.

Embodiments of the present disclosure provide a method, a terminal, and a network device for determining an uplink channel, so that channel allocation can be completed when the CSI is unknown or incomplete, thereby improving system throughput and channel utilization.

In some embodiments, the method for determining the uplink channel, applied to terminal equipment STA, includes:

Receive a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel status;

If an idle channel exists, determine the uplink channel based on the performance information of the idle channel, and access the uplink channel for data transmission;

If there is no idle channel, the channel status will be sensed again after the set backoff time.

Optionally, accessing the uplink channel for data transmission includes:

After the data transmission is successful, the performance information of the corresponding channel is updated according to the transmission result, and the new data packet arrival indication is received;

After the data transmission fails, the performance information of the corresponding channel is updated according to the transmission result, and the data transmission instruction is re-executed.

Optionally, determining the uplink channel based on the performance information of the idle channel includes:

Obtain performance information of idle channels;

The idle channel with the best ability to successfully transmit data packets is determined as the uplink channel.

Optionally, the determination of the uplink channel includes:

Construct an uplink channel selection model based on reinforcement learning;

Input channel status information and performance information into the uplink channel selection model based on reinforcement learning for training, and obtain the average network throughput;

When the average network throughput reaches the maximum value, the uplink channel is determined according to the output of the reinforcement learning-based uplink channel selection model.

Optionally, the training of the uplink channel selection model based on reinforcement learning includes:

Use channel state information and performance information as the state set S for reinforcement learning. The state set S comprises two sets of weights:

the set of channel sensing weights of the STAs on the channels, each element of which represents the sensing weight of the k-th STA on the m-th channel at time t;

the set of data packet transmission weights of the STAs on the channels, each element of which represents the data packet transmission weight of the k-th STA on the m-th channel at time t;

The state set S is input into the uplink channel selection model based on reinforcement learning for training, and the action set $A = \{f_1, f_2, \ldots, f_M\}$ is obtained, which denotes the set of actions the STA can take, each corresponding to selecting one of the M idle channels as the uplink channel;

Determine the reward parameter $R_t$ based on the state set, where each reward term represents the immediate reward of the k-th STA transmitting on the m-th channel;

The uplink channel selection model based on reinforcement learning is trained according to the reward parameter $R_t$, and the channel corresponding to the action that maximizes $R_t$ is obtained as the uplink channel.

Optionally, the establishment of the uplink channel selection model based on reinforcement learning includes:

Construct the model from an expression for the average network throughput, in which $C_t$ represents the average network throughput at time t, N represents the total number of STAs, and the remaining term is the signal-to-interference-plus-noise ratio (SINR) of the k-th STA at time t.

Optionally, training the reinforcement learning-based uplink channel selection model according to the reward parameter $R_t$ includes:

Use the following as the update rule for reinforcement learning:

$$Q_{t+1}(S, A) = (1 - \alpha)\, Q_t(S, A) + \alpha \left[ R_t + \beta \max_{A'} Q_t(S', A') \right]$$

where $Q_t$ represents the Q value of the current state and $Q_{t+1}$ represents the Q value at the next moment; $\alpha$ represents the learning rate of reinforcement learning, with a value in (0,1); $\beta$ represents the emphasis on historical rewards, with a value in (0,1); $R_t$ represents the immediate reward; and $\max_{A'} Q_t(S', A')$ represents the maximum Q value over all possible actions at the next moment.

In some embodiments, the method for determining an uplink channel, applied to an access point AP, includes:

Send a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel status;

Receive data transmitted by the STA through the uplink channel; the uplink channel is determined by the STA based on the performance information of the idle channel.

In some embodiments, a terminal device is provided, including a processor and a memory. The memory is used to store a computer program, and the processor is used to call and run the program stored in the memory to perform the above-mentioned method for determining an uplink channel.

In some embodiments, a network device is provided, including a processor and a communication interface. The communication interface is used to communicate with other network devices; the processor is used to run a set of programs so that the network device implements the above-mentioned method for determining an uplink channel.

The method, terminal and network device for determining the uplink channel provided by the embodiments of the present disclosure can achieve the following technical effects:

The terminal STA senses the channel status and, among the idle channels, determines the uplink channel for data transmission based on channel performance. In this way, when the CSI is unknown or incomplete, for example when communication is out of coordination, uplink channel selection can be completed by the STA, which can achieve a greater throughput improvement, maximize channel utilization, reduce the probability of collision between channels, and improve the spectral efficiency of the system.

The above general description and the following description are exemplary and explanatory only and are not intended to limit the application.

Description of drawings

One or more embodiments are exemplified by corresponding drawings. These exemplary descriptions and drawings do not constitute limitations to the embodiments. Elements with the same reference numerals in the drawings are shown as similar elements. The drawings are not limited to scale and in which:

Figure 1 is a schematic diagram of an environmental system according to an embodiment of the present disclosure;

Figure 2 is a schematic flowchart of a method for determining an uplink channel provided by an embodiment of the present disclosure;

Figure 3 is a schematic flowchart of another method for determining an uplink channel provided by an embodiment of the present disclosure;

Figure 4 is a schematic diagram of the training process of the uplink channel selection model based on reinforcement learning in an embodiment of the present disclosure;

Figure 5 is a schematic flowchart of another method for determining an uplink channel provided by an embodiment of the present disclosure;

Figure 6 is a schematic flowchart of another method for determining an uplink channel provided by an embodiment of the present disclosure;

Figure 7 is an application schematic diagram of an embodiment of the present disclosure;

Figure 8 is a schematic diagram of a terminal device provided by an embodiment of the present disclosure;

Figure 9 is a schematic diagram of a network device provided by an embodiment of the present disclosure.

Detailed description

In order to be able to understand the features and technical contents of the embodiments of the present disclosure in more detail, the implementation of the embodiments of the present disclosure is described in detail below in conjunction with the accompanying drawings. The attached drawings are for reference only and are not used to limit the embodiments of the present disclosure. In the following technical description, for the convenience of explanation, a full understanding of the disclosed embodiments is provided through multiple details. However, one or more embodiments can still be implemented without these details. In other cases, to simplify the drawings, well-known structures and devices can be simplified for display.

The terms "first", "second", etc. in the description and claims of the embodiments of the present disclosure and the above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances for the purposes of the embodiments of the disclosure described herein. Furthermore, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion.

Unless otherwise stated, the term "plurality" means two or more.

In the embodiment of the present disclosure, the character "/" indicates that the preceding and following objects are in an "or" relationship. For example, A/B indicates: A or B.

The term "and/or" is an association relationship describing objects, indicating that three relationships can exist. For example, A and/or B means: A or B, or A and B.

The term "correspondence" can refer to an association relationship or a binding relationship. The correspondence between A and B refers to an association relationship or a binding relationship between A and B.

In the embodiment of the present disclosure, AP represents a wireless access point, which may be a router, a gateway or a combined router-gateway.

STA represents a user terminal, which may be a mobile terminal or station that connects to the AP through a communication function to obtain access to AP system resources (e.g., the network). It can be a cellular phone, a cordless phone, a Session Initiation Protocol (SIP) phone, a Wireless Local Loop (WLL) station, a Personal Digital Assistant (PDA) device, a handheld device with wireless communication capabilities, a computing device or other processing device connected to a wireless modem, a vehicle-mounted device, a wearable device, a terminal device in a next-generation communication system such as an NR network, or a terminal device in a future evolved Public Land Mobile Network (PLMN). It can also be a mobile phone, a tablet computer (Pad), a computer with a wireless transceiver function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, or a wireless terminal device in industrial control, self-driving, remote medical care, a smart grid, transportation safety, a smart city, or a smart home.

Figure 1 shows a schematic diagram of an environmental system provided by an embodiment of the present disclosure.

As shown in Figure 1, the environmental system includes multiple APs and multiple STAs.

In the above system, each AP can be connected to one or more STAs, and each STA can be connected to one or more APs.

For example, one AP can be connected to one STA with at least two channels established between them, as with STA1 and AP1 in Figure 1. One STA can also be connected to two APs with at least two channels established, as with STA2, AP2, and AP3 in Figure 1.

Each STA can obtain channel sensing information and data transmission information of one or more channels between it and the accessed AP. Each STA can also sense and obtain the interference power of other APs in the group to the STA.

In the existing technology, although one STA can connect to multiple APs and establish multiple duplex channels with them, each channel must carry both uplink and downlink data. Before sending data on a channel, the STA and the AP perform carrier sense multiple access (CSMA)/enhanced distributed channel access (EDCA) backoff, and an air interface collision may still occur after the data is sent. If an air interface collision occurs, the data transmission fails and the data must be resent. When data is sent in both directions on the same channel, downlink data must wait until an ongoing uplink transmission completes, and uplink data must wait until an ongoing downlink transmission completes. Therefore, uplink and downlink data may collide and the waiting time for data transmission may be extended, which affects the channel utilization and data throughput of the system.

The uplink channel in this embodiment is responsible for transmitting information data from the STA to the AP.

An idle channel refers to an unoccupied channel among the multiple channels accessed. Generally, the idle channel can be determined among the accessed channels by searching for a channel that emits an idle signal, or by searching for a carrier-free channel.
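As a rough illustration of the carrier-free search described above, the sketch below treats a channel as idle when its measured power falls below a threshold. The function name `idle_channels`, the dBm readings, and the -82 dBm default are hypothetical values chosen for the example and are not taken from this application.

```python
def idle_channels(measured_dbm, threshold_dbm=-82.0):
    """Return the channels whose measured power is below the threshold.

    measured_dbm maps a channel number to a power reading in dBm.
    The threshold is an illustrative assumption, not a value from the text.
    """
    return [ch for ch, power in sorted(measured_dbm.items()) if power < threshold_dbm]


# Channels 1 and 11 are quiet, channel 6 carries a strong signal.
readings = {1: -90.0, 6: -60.0, 11: -85.0}
idle = idle_channels(readings)
print(idle)
```

An empty result corresponds to the case in which the STA finds no idle channel and has to back off.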

By building the above network architecture, and through information interaction and data processing between the AP and the STA, channel allocation can be completed with this method for determining the uplink channel even when the channel status is difficult to estimate, so as to improve system throughput and channel utilization.

In some cases, the above-mentioned environment system may also include other network entities such as network controllers and mobility management entities, which are not limited in the embodiments of the present application.

Based on the above environmental system, embodiments of the present disclosure provide a method for determining an uplink channel, so that the STA can determine the uplink channel within the allocated resources and access it to upload data.

As shown in Figure 2, this method is applied to terminal equipment STA, including:

Step S201: The STA receives a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel state.

Here, the trigger command is used to inform the STA that the data packet has arrived, and it can sense the channel status to transmit the data packet.

Step S202: If there is an idle channel, the STA determines the uplink channel based on the performance information of the idle channel, and accesses the uplink channel for data transmission.

Step S203: If there is no idle channel, the STA senses the channel status again after the set backoff time.

In this way, when a data packet arrives, the STA begins to sense the channel status. If no idle channel is sensed, the STA backs off and then continues sensing; if idle channels are sensed, the uplink channel is determined among them based on their performance information and is used for data transmission. When the CSI is unknown or incomplete, for example when communication is out of coordination, the STA can complete the uplink channel selection, which can achieve a greater throughput improvement, maximize channel utilization, reduce the probability of collision between channels, and improve the spectral efficiency of the system.
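Steps S201 to S203 can be sketched as a small sense-and-retry loop. This is a minimal sketch under stated assumptions: the callables `sense_idle_channels` and `wait`, the scalar `performance` score per channel, and the `max_attempts` cap are hypothetical and are introduced only to make the example runnable.

```python
from typing import Callable, Dict, List, Optional


def determine_uplink_channel(
    sense_idle_channels: Callable[[], List[int]],
    performance: Dict[int, float],
    wait: Callable[[float], None],
    backoff_s: float = 0.001,
    max_attempts: int = 5,
) -> Optional[int]:
    """Sense, pick the best idle channel, or back off and sense again."""
    for _ in range(max_attempts):
        idle = sense_idle_channels()      # S201: sense the channel state
        if idle:                          # S202: at least one idle channel
            return max(idle, key=lambda ch: performance.get(ch, 0.0))
        wait(backoff_s)                   # S203: back off, then sense again
    return None                           # retry cap reached (assumption)


# The first sensing round finds nothing, the second finds channels 1 and 3.
rounds = iter([[], [1, 3]])
waits: List[float] = []
chosen = determine_uplink_channel(lambda: next(rounds), {1: 0.6, 3: 0.9}, waits.append)
print(chosen, waits)
```

Passing `wait` as a parameter keeps the sketch testable without real delays; a deployment would presumably use a timer instead.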

Optionally, determine the uplink channel based on the performance information of the idle channel, including:

Obtain performance information of idle channels;

Determine the idle channel with the best ability to successfully transmit data packets as the uplink channel.

Here, the ability to successfully transmit a data packet may be determined by the historical transmission success rate of the data packet and/or the channel sensing weight.

For example, the channel with the highest historical data packet transmission success rate is determined as the uplink channel.

In this way, the historical transmission success rate can be determined from the historical transmission data in the performance information of one or more idle channels. The higher the historical transmission success rate, the less likely it is that channel collisions will degrade transmission quality.

For another example, the channel with the highest channel sensing weight is determined as the uplink channel.

The channel sensing weight can generally be calculated and obtained by the STA using the spectrum sensing algorithm on the corresponding channel, and is used to represent the channel quality.
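The two selection criteria above can be illustrated with a small scoring function. The `ChannelStats` record and the linear blend controlled by `mix` are assumptions made for this sketch; the text only states that the success rate and/or the sensing weight may be used, not how they are combined.

```python
from dataclasses import dataclass


@dataclass
class ChannelStats:
    successes: int         # packets delivered on this channel so far
    attempts: int          # packets attempted on this channel so far
    sensing_weight: float  # channel quality estimate from spectrum sensing

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0


def best_idle_channel(idle, stats, mix=0.5):
    """Pick the idle channel with the highest blended score.

    score = mix * success_rate + (1 - mix) * sensing_weight
    (the blend itself is an illustrative assumption).
    """
    def score(ch):
        s = stats[ch]
        return mix * s.success_rate + (1.0 - mix) * s.sensing_weight
    return max(idle, key=score)


stats = {
    0: ChannelStats(successes=8, attempts=10, sensing_weight=0.4),
    1: ChannelStats(successes=5, attempts=10, sensing_weight=0.9),
    2: ChannelStats(successes=9, attempts=10, sensing_weight=0.1),
}
by_rate = best_idle_channel([0, 1, 2], stats, mix=1.0)    # success rate only
by_weight = best_idle_channel([0, 1, 2], stats, mix=0.0)  # sensing weight only
print(by_rate, by_weight)
```

Setting `mix` to 1.0 or 0.0 reproduces the two single-criterion examples given in the text.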

In recent years, research based on reinforcement learning has become increasingly extensive. Reinforcement learning is an online learning algorithm: the agent interacts with the external environment through a reward mechanism and adjusts its behavior according to the reward values obtained, which allows the agent to learn and adapt to the external environment and prompts it to choose the behavior that obtains the maximum reward. This ability to learn and adapt can be applied to channel selection between the STA and the AP, so that the STA, acting as an agent, learns the changing channel status and finally selects, among the idle channels, the one with the best ability to successfully transmit data packets as the uplink channel, which reduces channel status scanning overhead and improves channel detection probability. This achieves a greater throughput improvement, maximizes channel utilization, reduces the probability of collisions between channels, and improves the spectrum efficiency of the system.

Below, the above solutions are described with reference to specific embodiments.

As shown in Figure 3, an embodiment of the present disclosure provides a method for determining an uplink channel, which is applied to the STA in Figure 1 to determine the uplink channel between the STA and the AP through the data processing method of reinforcement learning. The method includes:

Step S301: The STA receives a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel state.

Step S302: When there is an idle channel, construct an uplink channel selection model based on reinforcement learning based on the network average throughput optimization problem.

If there is no idle channel, the STA senses the channel status again after the set backoff time.

Here, the uplink channel selection model based on reinforcement learning includes a state set, an action set, and a reward function.

In the average network throughput expression, $C_t$ represents the average network throughput at time t, N represents the total number of STAs, and the remaining term is the signal-to-interference-plus-noise ratio (SINR) of the k-th STA at time t. The SINR is the ratio of the signal power to the sum of the interference and noise power in the system.

In this way, the channel information between the STA and the AP is used to establish the uplink channel selection model, so that when selecting the uplink channel, the status of each channel can be combined to ensure that the system throughput meets the requirements.
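The throughput formula itself is not reproduced in this text, so the sketch below assumes one common form that is consistent with the quantities named above: the mean over the N STAs of the Shannon rate log2(1 + SINR). The function names and the bandwidth-normalized rate are assumptions, not the expression used in this application.

```python
import math


def sinr(signal_w, interference_w, noise_w):
    """SINR = signal / (interference + noise), all in linear power units."""
    return signal_w / (interference_w + noise_w)


def average_network_throughput(sinr_per_sta):
    """Mean of log2(1 + SINR_k) over all STAs (assumed Shannon form)."""
    n = len(sinr_per_sta)
    if n == 0:
        raise ValueError("need at least one STA")
    return sum(math.log2(1.0 + s) for s in sinr_per_sta) / n


# Three STAs with linear SINR 1, 3 and 7 give per-STA rates of 1, 2 and 3.
c_t = average_network_throughput([1.0, 3.0, 7.0])
print(c_t)
```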

Step S303: With the goal of maximizing the average network throughput, the channel state information and performance information are input into the uplink channel selection model based on reinforcement learning for training, and the average network throughput is obtained.

Optionally, the training of the reinforcement learning-based uplink channel selection model includes:

The state set S is input into the uplink channel selection model based on reinforcement learning for training, and the action set $A = \{f_1, f_2, \ldots, f_M\}$ is obtained, which denotes the set of actions the STA can take, each corresponding to selecting one of the M idle channels as the uplink channel;

Determine the reward parameter $R_t$ based on the state set, where each reward term represents the immediate reward of the k-th STA transmitting on the m-th channel at time t;

Here, the reward parameter $R_t$ represents the average of the sensing weight and the data packet transmission weight of the selected uplink channel at time t.
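The reward described here can be sketched directly from the two weight tables that make up the state. The nested-list layout, indexed first by STA k and then by channel m, and the function name `immediate_reward` are assumptions for illustration.

```python
def immediate_reward(sensing, transmission, k, m):
    """Reward of STA k on channel m: mean of its two weights at this time step."""
    return 0.5 * (sensing[k][m] + transmission[k][m])


# Two STAs, two channels; rows are STAs and columns are channels.
sensing = [[0.25, 0.75], [0.5, 0.5]]
transmission = [[0.5, 0.25], [0.0, 1.0]]
r = immediate_reward(sensing, transmission, k=0, m=1)
print(r)
```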

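Further, training the uplink channel selection model based on reinforcement learning according to the reward parameter $R_t$ includes using the following update rule for reinforcement learning:

$$Q_{t+1}(S, A) = (1 - \alpha)\, Q_t(S, A) + \alpha \left[ R_t + \beta \max_{A'} Q_t(S', A') \right]$$

where $Q_t$ and $Q_{t+1}$ are the Q values at the current and next moments, $\alpha$ is the learning rate in (0,1), $\beta$ is the emphasis on historical rewards in (0,1), $R_t$ is the immediate reward, and $\max_{A'} Q_t(S', A')$ is the maximum Q value over all possible actions at the next moment.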

Figure 4 shows a schematic diagram of reinforcement learning training in an embodiment of the present disclosure to illustrate the above steps.

The reinforcement learning in this embodiment uses the Q-Learning algorithm. The agent performs actions in the environment to obtain certain rewards to perceive the environment, thereby learning a mapping strategy from state to action to maximize the reward value.
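A single step of tabular Q-learning can be sketched as follows, using the standard update with learning rate alpha and weighting factor beta. The dictionary-based Q table keyed by (state, action) pairs and the string state labels are illustrative assumptions rather than details of this embodiment.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions, alpha=0.1, beta=0.9):
    """Apply one tabular Q-learning step and return the new Q value.

    Q(s, a) <- (1 - alpha) * Q(s, a) + alpha * (reward + beta * max_a' Q(s', a'))
    alpha and beta are both expected to lie in (0, 1).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * (reward + beta * best_next)
    return q[(state, action)]


q = defaultdict(float)   # unseen (state, action) pairs start at 0
q[("s1", 0)] = 1.0       # pretend channel 0 already looks good in state s1
new_q = q_update(q, "s0", 1, reward=0.5, next_state="s1",
                 next_actions=[0, 1], alpha=0.5, beta=0.5)
print(new_q)
```

Here an action is the index of the channel selected as the uplink channel, matching the action set described earlier.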

In Figure 4, STA is used as an agent for reinforcement learning and performs data processing as an intelligent agent. Based on the mutual interference information between APs and channel idle conditions received by the STA, the reinforcement learning algorithm is used to achieve reasonable and effective uplink channel selection. Through the process of continuous interaction between the agent STA and the environment, feedback is obtained from the environment, and then the action of the agent STA is changed to realize the adjustment of the uplink channel selection action.

Specifically, the STA first obtains the mutual interference information between APs and the channel idle conditions as the channel performance information and state information $S_0$. The agent STA takes action $A_0$ in state $S_0$ as a channel selection decision and feeds it back to the AP in the environment. Here, the action taken by the STA can be selected according to the greedy strategy.

After the agent STA makes a channel selection decision, it performs access and data transmission on the selected uplink channel. The environment determines the reward parameter $R_1$ based on the system throughput and feeds it back to the STA, and sends the STA the next state $S_1$, which includes the mutual interference information between APs and the channel idle conditions. After receiving the reward parameter $R_1$ and the environment state $S_1$, the STA updates the Q value table according to the update rule of reinforcement learning and takes action $A_1$ as an uplink channel selection decision. After receiving action $A_1$, the environment changes from state $S_1$ to $S_2$ and feeds back the reward parameter $R_2$. That is, the STA obtains $R_2$ and $S_2$, updates the Q value table, and takes action $A_2$; it then obtains $R_3$ and $S_3$, updates the Q value table, and takes action $A_3$. This cycle continues until the system throughput reaches its maximum, that is, until the reward parameter $R_t$ reaches its maximum. Ultimately, interference is reduced and throughput is improved.

Through updates to the Q value table, each channel in the table has a Q value that represents the level of its transmission quality. When a data packet arrives, the STA begins to sense for idle channels. If no idle channel is sensed, the STA backs off and continues sensing; if an idle channel is sensed, the Q-Learning mechanism is used to learn the uplink channel selection strategy. The Q-learning process includes: determining the action $A_{t+1}$ at the current moment based on the previous state $S_t$, then updating the state to $S_{t+1}$, and feeding back a reward $R_t$. Through learning, the STA selects the channel with the best transmission quality among the idle channels for transmission, where transmission quality is measured by the historical success rate of data packet transmission, and updates the Q value in the Q value table according to the reward parameter. By sorting the channels according to their Q values, a ranked list of channel transmission quality is obtained. When a data packet arrival indication is received, the STA can act according to the greedy decision-making strategy of Q-Learning, that is, select among the idle channels with probability ε, and finally determine the uplink channel.
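The ranking and the greedy decision with probability ε can be sketched as below. This assumes the usual ε-greedy reading, in which the STA explores a random idle channel with probability ε and otherwise picks the idle channel with the highest Q value; the per-channel Q table and the function name are hypothetical.

```python
import random


def epsilon_greedy(q_values, idle_channels, epsilon=0.1, rng=random):
    """Explore a random idle channel with probability epsilon, else exploit."""
    if not idle_channels:
        raise ValueError("no idle channel to select")
    if rng.random() < epsilon:
        return rng.choice(idle_channels)                        # exploration
    return max(idle_channels, key=lambda ch: q_values.get(ch, 0.0))  # exploitation


q_values = {0: 0.2, 1: 0.7, 2: 0.9}

# Ranked list of channel transmission quality, best first.
ranked = sorted(q_values, key=q_values.get, reverse=True)

# Channel 2 is busy, so only channels 0 and 1 are candidates.
greedy_pick = epsilon_greedy(q_values, [0, 1], epsilon=0.0)
print(ranked, greedy_pick)
```

With `epsilon=0.0` the choice is purely greedy, which makes the example deterministic; a nonzero value trades some immediate reward for continued exploration of other channels.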

Step S304: When the average network throughput reaches the maximum value, determine the uplink channel according to the output of the uplink channel selection model based on reinforcement learning.

When the reward parameter $R_t$ reaches its maximum value, the corresponding action $A_t$ is taken as the optimal strategy, which determines the uplink channel selection action.

Step S305: accessing an uplink channel for data transmission.

In this way, through the uplink channel selection model based on reinforcement learning, the terminal STA makes decisions by sensing the channel status and the number of idle channels, selects the channel with the highest channel quality for data transmission, and the reward is fed back while the next state is updated. The uplink channel is determined among the idle channels based on channel performance and used for data transmission. In this way, when the CSI is unknown or incomplete, for example when communication is out of coordination, uplink channel selection can be completed by the STA, which can achieve a greater throughput improvement, maximize channel utilization, reduce the probability of collision between channels, and improve the spectral efficiency of the system.

Figure 5 shows a method for determining the uplink channel, illustrating how the STA senses channel conditions when a data packet arrives and uses reinforcement learning to select the channel to access in order to complete the uplink transmission to the AP.

As shown in Figure 5, an embodiment of the present disclosure provides a method for determining an uplink channel, which is applied to the STA in Figure 1 to determine the uplink channel between the STA and the AP through the data processing method of reinforcement learning. The method includes:

Step S501: The STA receives a trigger instruction; the trigger instruction includes a data packet arrival indication.

Step S502: The STA senses whether there is an idle channel.

Step S503: If there is no idle channel, data packet backoff is performed, and the channel status is sensed again after the set backoff time. The set backoff duration is determined by a random distribution with mean λ.

Step S504: If there is an idle channel, use the Q-learning algorithm to output the channel selection decision as the uplink channel through the uplink channel selection model based on reinforcement learning.

Step S505: Access the uplink channel for data transmission, and update the action set and reward parameters in step S504 according to the selected channel action and the resulting change in system throughput.

Step S506: After the data transmission is successful, the information required by the Q-learning algorithm in step S504 is updated according to the transmission result. Update the status set in step S504 according to the transmission result, and return to step S501 to receive a new data packet arrival indication.

Step S507: After the data transmission fails, update the information required by the Q-learning algorithm in step S504 according to the transmission result, and return to step S502 to re-execute the data transmission instruction. The status set in step S504 is updated according to the transmission result.

In this way, through the uplink channel selection model based on reinforcement learning, the terminal STA makes decisions by sensing the channel status and the number of idle channels, selects the channel with the highest channel quality for data transmission, and the reward is fed back while the next state is updated. The environment status continues to be updated after each decision, and both possible outcomes of a data packet transmission are fed into the Q-learning process. After a successful transmission, the environment status is updated and the current data transmission ends; the STA waits for the arrival of a new data packet and enters the next round of data transmission. After a failed transmission, the environment status is updated and the retransmission mechanism is entered, in which the STA re-senses the channels and transmits the data packet again. In this way, when the CSI is unknown or incomplete, for example when communication is out of coordination, uplink channel selection can be completed by the STA, which can achieve a greater throughput improvement, maximize channel utilization, reduce the probability of collision between channels, and improve the spectral efficiency of the system.
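Steps S502 to S507 can be sketched as a retry loop for a single data packet. Several simplifications are assumptions made for brevity: the backoff is drawn from an exponential distribution with the given mean (the text only specifies a random distribution with mean λ), the Q value is updated with a stateless rule that uses a reward of 1 for success and 0 for failure, and the callables `idle_fn` and `send_fn` stand in for sensing and transmission.

```python
import random


def send_with_retries(q, idle_fn, send_fn, mean_backoff_s=0.002,
                      alpha=0.5, max_tries=5, rng=random):
    """Try to deliver one packet; return (attempts, backoffs)."""
    attempts, backoffs = [], []
    for _ in range(max_tries):
        idle = idle_fn()                                  # S502: sense channels
        if not idle:
            # S503: draw a backoff with mean mean_backoff_s (exponential is assumed)
            backoffs.append(rng.expovariate(1.0 / mean_backoff_s))
            continue
        ch = max(idle, key=lambda c: q.get(c, 0.0))       # S504: greedy choice
        ok = send_fn(ch)                                  # S505: transmit
        reward = 1.0 if ok else 0.0
        q[ch] = (1 - alpha) * q.get(ch, 0.0) + alpha * reward  # S506/S507: update
        attempts.append((ch, ok))
        if ok:
            break                                         # S506: await next packet
        # S507: on failure, loop back to S502 and retransmit
    return attempts, backoffs


# Round 1 finds no idle channel; channel 0 then fails and channel 1 succeeds.
rounds = iter([[], [0, 1], [0, 1]])
q = {0: 0.5, 1: 0.4}
attempts, backoffs = send_with_retries(q, lambda: next(rounds), lambda ch: ch == 1,
                                       rng=random.Random(1))
print(attempts, len(backoffs), q)
```

After the failed attempt the Q value of channel 0 drops below that of channel 1, so the retransmission moves to channel 1 without any explicit rule saying so.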

Figure 6 shows a method for determining the uplink channel, applied to the AP in the environment system shown in Figure 1, including:

Step S601, the AP sends a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel status.

Here, the AP sends a trigger instruction to the STA to obtain the data cache information fed back by the STA, and triggers the STA to sense the channel status for data transmission. The AP can send a buffer status report poll (Buffer Status Report Poll, BSRP) frame to cause the STA to send a buffer status report (Buffer Status Report, BSR) frame.

Step S602: The AP receives the data transmitted by the STA through the uplink channel; the uplink channel is determined by the STA based on the performance information of the idle channel.

After receiving the data transmitted by the STA through the uplink channel, the AP also sends an acknowledgment character (ACK) to the STA to indicate receipt of the uploaded data.

In this way, when the CSI is unknown or incomplete, for example because of communication loss, the uplink channel selection can be completed by the STA, which can achieve a relatively large throughput improvement, maximize channel utilization, reduce the probability of collision between channels, and improve the spectral efficiency of the system.
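The AP-side behaviour of steps S601 and S602 can be sketched as a small message handler. The Frame and AccessPoint types, their field names, and the string frame kinds are hypothetical and chosen only for illustration; they do not reflect any real 802.11 frame layout.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    kind: str            # "BSRP", "BSR", "DATA" or "ACK" (illustrative labels)
    sender: str
    payload: bytes = b""


@dataclass
class AccessPoint:
    name: str = "AP"
    received: list = field(default_factory=list)

    def send_trigger(self):
        """Step S601: poll the STA so that it reports its buffer and senses channels."""
        return Frame("BSRP", self.name)

    def on_frame(self, frame):
        """Step S602: store uplink data and acknowledge it; ignore other frames."""
        if frame.kind == "DATA":
            self.received.append(frame.payload)
            return Frame("ACK", self.name)
        return None
```

The AP in this sketch takes no part in choosing the channel; it only triggers the STA and acknowledges whatever arrives, which matches the division of work described above.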

Figure 7 shows an application diagram of a method for determining an uplink channel.

In this practical application, the method for determining the uplink channel includes the following steps:

Step S701, the AP sends a BSRP to the STA, requesting to obtain the STA's data cache information;

Step S702, the STA sends a BSR to the AP to feed back the data cache information;

Step S703: The STA senses the current status of all channels. If it senses that there are multiple idle channels, it enters the Q-learning process and selects the idle channel with the best ability to successfully transmit data packets as the uplink channel. If no channel is idle, the data packet is transmitted after backing off for a period of time; the backoff time follows a random distribution with mean λ.

Step S704: The STA accesses the uplink channel and transmits data.

Step S705: The AP receives the data transmitted by the STA and sends an ACK to the STA to indicate receipt.

In this way, the terminal STA senses the channel status, determines the uplink channel based on the performance of the idle channels, and performs data transmission. When the CSI is unknown or incomplete, for example because of communication loss, the uplink channel selection can be completed by the STA, which can achieve a relatively large throughput improvement, maximize channel utilization, reduce the probability of collision between channels, reduce the impact of interference between multiple APs on data transmission, and improve the spectrum efficiency of the system.
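The sense-and-backoff behaviour of step S703 can be sketched as follows. The text only states that the backoff time follows a random distribution with mean λ; the exponential distribution used here, the function names, and the max_attempts limit are assumptions for illustration. Waiting is accumulated as simulated time rather than actually sleeping.

```python
import random


def backoff_delay(mean, rng):
    """Draw a backoff duration; exponential with the given mean (assumption)."""
    return rng.expovariate(1.0 / mean)


def sense_until_idle(sense, mean_backoff, rng, max_attempts=10):
    """Sense the channels; back off and sense again while none is idle.

    `sense` is a callable returning one idle flag per channel.
    Returns (indices of idle channels, total simulated backoff time).
    """
    waited = 0.0
    for _ in range(max_attempts):
        idle = [m for m, free in enumerate(sense()) if free]
        if idle:
            return idle, waited
        waited += backoff_delay(mean_backoff, rng)
    return [], waited
```

The returned list of idle channels would then be handed to whatever selection rule is in use, for example a learned channel-quality estimate.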

As shown in FIG. 8, an embodiment of the present disclosure provides a terminal device, including a processor 800 and a memory 801. The memory 801 is used to store computer programs, and the processor 800 is used to call and run the programs stored in the memory, and perform the above-mentioned method for determining the uplink channel.

Optionally, the device also includes a communication interface 802 and a bus 803. The communication interface 802 is used to communicate with other network devices; the processor 800, the communication interface 802, and the memory 801 can communicate with each other through the bus 803.

As shown in FIG. 9, an embodiment of the present disclosure provides a network device, including a processor 900 and a memory 901. The memory 901 is used to store computer programs, and the processor 900 is used to call and run the programs stored in the memory, and perform the above-mentioned method for determining the uplink channel.

Optionally, the device also includes a communication interface 902 and a bus 903. The communication interface 902 is used to communicate with other network devices; the processor 900, the communication interface 902, and the memory 901 can communicate with each other through the bus 903.

In addition, the above-mentioned logical instructions in the memory 901 can be implemented in the form of software functional units and can be stored in a computer-readable storage medium when sold or used as an independent product.

As a computer-readable storage medium, the memory 901 can be used to store software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 900 executes program instructions/modules stored in the memory 901 to execute functional applications and data processing, that is, to implement the method for determining the uplink channel in the above embodiment.

The memory 901 may include a stored program area and a stored data area, where the stored program area may store an operating system and at least one application program required for a function; the stored data area may store data created according to the use of the terminal device, etc. In addition, the memory 901 may include high-speed random access memory and may also include non-volatile memory.

Embodiments of the present disclosure provide a computer-readable storage medium that stores computer-executable instructions, and the computer-executable instructions are configured to execute the above method for determining an uplink channel.

Embodiments of the present disclosure provide a computer program product. The computer program product includes a computer program stored on a computer-readable storage medium. The computer program includes program instructions. When the program instructions are executed by a computer, the computer executes the above method for determining the uplink channel.

An embodiment of the present disclosure provides a computer program that, when executed by a computer, causes the computer to implement the above method for determining an uplink channel.

The above-mentioned computer-readable storage medium may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.

The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions to enable a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disk, or any other medium that can store program code, or it may be a transitory storage medium.

The foregoing description and drawings illustrate embodiments of the disclosure sufficiently to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. The examples represent only possible variations. Unless expressly required, individual components and features are optional and the order of operations may vary. Portions and features of some embodiments may be included in or substituted for those of other embodiments. Furthermore, the words used in this application are used only to describe the embodiments and not to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. Similarly, the term "and/or" as used in this application refers to any and all possible combinations of one or more of the associated listed items. In addition, when used in this application, the term "comprise" and its variations "comprises" and/or "comprising" specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groupings of these. Without further limitation, an element defined by the statement "comprises a..." does not exclude the presence of additional identical elements in a process, method or apparatus including the stated element. In this document, each embodiment may focus on its differences from other embodiments, and the same and similar parts among the various embodiments may be referred to each other. For the methods, products, etc. disclosed in the embodiments, if they correspond to the method part disclosed in the embodiments, the relevant parts can be found in the description of the method part.

Those skilled in the art will appreciate that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented with electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software may depend on the specific application and design constraints of the technical solution. The skilled person may use different methods to implement the described functionality for each specific application, but such implementations should not be considered to be beyond the scope of the disclosed embodiments. The skilled person can clearly understand that for the convenience and simplicity of description, the specific working processes of the systems, devices and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be described again here.

In the embodiments disclosed herein, the disclosed methods and products (including but not limited to devices, equipment, etc.) can be implemented in other ways. For example, the device embodiments described above are only illustrative. The division of the units may only be a logical function division, and in actual implementation there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to implement this embodiment. In addition, each functional unit in the embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may actually be executed substantially in parallel, or they may sometimes be executed in reverse order, depending on the functionality involved. In the descriptions corresponding to the flowcharts and block diagrams in the accompanying drawings, operations or steps corresponding to different blocks may also occur in a sequence different from that disclosed in the description, and sometimes there is no specific order between different operations or steps. For example, two consecutive operations or steps may actually be performed substantially in parallel, or they may sometimes be performed in reverse order, depending on the functionality involved. Each block in the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, may be implemented by special purpose hardware-based systems that perform the specified functions or actions, or may be implemented by a combination of special purpose hardware and computer instructions.

Claims

A method for determining an uplink channel, applied to terminal equipment STA, characterized in that the method includes:

Receive a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel status;

If an idle channel exists, determine the uplink channel based on the performance information of the idle channel, and access the uplink channel for data transmission;

If there is no idle channel, the channel status will be sensed again after the set backoff time.
The method according to claim 1, characterized in that said accessing the uplink channel for data transmission includes:

After the data transmission is successful, the performance information of the corresponding channel is updated according to the transmission result, and the new data packet arrival indication is received;

After the data transmission fails, the performance information of the corresponding channel is updated according to the transmission result, and the data transmission instruction is re-executed.
The method according to claim 1 or 2, characterized in that determining the uplink channel according to the performance information of the idle channel includes:

Obtain performance information of idle channels;

The idle channel with the best ability to successfully transmit data packets is determined as the uplink channel.
The method according to claim 3, characterized in that the determination of the uplink channel includes:

Construct an uplink channel selection model based on reinforcement learning;

Input channel status information and performance information into the uplink channel selection model based on reinforcement learning for training, and obtain the average network throughput;

When the average network throughput reaches the maximum value, the uplink channel is determined according to the output of the reinforcement learning-based uplink channel selection model.
The method according to claim 4, characterized in that the training of the uplink channel selection model based on reinforcement learning includes:

Use channel state information and performance information as state sets for reinforcement learning

in, Represents the set of channel sensing weights of STA on the channel; Represents the sensing weight of the k-th STA on the m-th channel at time t;

Represents the set of data packet transmission weights of STA on the channel; Indicates the perceived weight of the data packet transmission weight of the k-th STA on the m-th channel at time t;

The state set S is input into the uplink channel selection model based on reinforcement learning for training, and the action set A = {f_1, f_2, ..., f_M} is obtained, which is the set of actions the STA can take, each corresponding to selecting the uplink channel among the M idle channels;

Determine reward parameters based on the state set Represents the immediate reward of the k-th STA transmitting on the m-th channel;

The uplink channel selection model based on reinforcement learning is trained according to the reward parameter R t to obtain the channel corresponding to the system action that maximizes the reward parameter R t as the uplink channel.
The method according to claim 5, characterized in that the establishment of the uplink channel selection model based on reinforcement learning includes:

Among them, C t represents the average network throughput at time t; N represents the total number of STAs; Indicates the signal-to-interference-noise ratio of the kth STA at time t.
The method of claim 5, wherein training the reinforcement learning-based uplink channel selection model according to the reward parameter R t includes:

Use the following method as the update rule for reinforcement learning:

Among them, Q t represents the Q value of the current state, and Q t+1 represents the Q value at the next moment; α represents the learning rate of reinforcement learning, with a value in (0,1); β represents the emphasis on historical rewards, with a value in (0,1); represents the immediate reward; maxQ t (S′,A′) represents the maximum Q value of all possible action strategies at the next moment.
A method for determining an uplink channel, applied to an access point AP, is characterized by including:

Send a trigger instruction; the trigger instruction is used to trigger the STA to sense the channel status;

Receive data transmitted by the STA through the uplink channel; the uplink channel is determined by the STA based on the performance information of the idle channel.
A terminal device, characterized in that it includes a processor and a memory, the memory is used to store a computer program, and the processor is used to call and run the program stored in the memory and execute the method for determining an uplink channel as described in claims 1 to 7.
A network device, characterized in that it includes a processor and a communication interface, the communication interface is used to communicate with other network devices; the processor is used to run a set of programs, so that the network device implements the method for determining an uplink channel as claimed in claim 8.
A computer program, when the computer program is executed by a computer, causes the computer to implement the method for determining an uplink channel as claimed in claims 1 to 7.
A computer program product, including computer instructions stored on a computer-readable storage medium, wherein when the computer instructions are executed by a computer, the computer implements the method for determining an uplink channel as described in claims 1 to 7.