WO2024046286A1 - Channel aggregation method and device - Google Patents

Channel aggregation method and device

Info

Publication number
WO2024046286A1
WO2024046286A1 PCT/CN2023/115350 CN2023115350W WO2024046286A1 WO 2024046286 A1 WO2024046286 A1 WO 2024046286A1 CN 2023115350 W CN2023115350 W CN 2023115350W WO 2024046286 A1 WO2024046286 A1 WO 2024046286A1
Authority
WO
WIPO (PCT)
Prior art keywords
channel
time period
aggregation
channel aggregation
value
Prior art date
Application number
PCT/CN2023/115350
Other languages
English (en)
French (fr)
Inventor
舒同欣
刘鹏
郭子阳
罗嘉俊
杨迅
颜敏
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2024046286A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W72/00 Local resource management
    • H04W72/04 Wireless resource allocation
    • H04W72/044 Wireless resource allocation based on the type of the allocated resource
    • H04W72/0453 Resources in frequency domain, e.g. a carrier in FDMA
    • H04W72/50 Allocation or scheduling criteria for wireless resources
    • H04W72/52 Allocation or scheduling criteria for wireless resources based on load
    • H04W72/54 Allocation or scheduling criteria for wireless resources based on quality criteria
    • H04W72/542 Allocation or scheduling criteria for wireless resources based on quality criteria using measured or perceived quality

Definitions

  • the present application relates to the field of communication technology, and in particular, to a channel aggregation method and device.
  • In order to cope with the shortage of spectrum resources and the growth of service traffic, channel aggregation technology has been introduced in the communication standards formulated by the Institute of Electrical and Electronics Engineers (IEEE). Specifically, channel aggregation technology takes the primary channel as its basis and aggregates the primary channel with the secondary channels adjacent to it, so as to support a larger channel bandwidth and thereby increase the data transmission rate.
  • channel aggregation methods are mainly divided into two categories: static channel aggregation and dynamic channel aggregation.
  • static channel aggregation is: under the premise that the primary channel is idle, it is necessary to wait for all secondary channels to be idle before channel aggregation can be performed.
  • dynamic channel aggregation is: when the primary channel is idle, if there happens to be an idle secondary channel, the primary channel and the idle secondary channel can be aggregated.
  • Embodiments of the present application provide a channel aggregation method and device, in order to make optimal channel aggregation decisions and solve the problems of low channel aggregation throughput and large delay.
  • In a first aspect, embodiments of the present application provide a channel aggregation method.
  • The method can be executed by a first terminal device, or by a component of the first terminal device (such as a processor, a chip, or a chip system). It can also be implemented by a logic module or software that can realize all or part of the functions of the first terminal device.
  • The following description takes execution of the method by the first terminal device as an example.
  • the method includes: the first terminal device receives a load report from the network device.
  • the load report includes the load of each of the M channels of the network device in the t-th time period.
  • The M channels include 1 primary channel and M-1 secondary channels corresponding to the first terminal device, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2.
  • The first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing, and obtains the t-th channel aggregation indication value.
  • The channel environment information of the t-th time period includes the load information of the primary channel and the M-1 secondary channels in the t-th time period, and the channel state monitoring information obtained by the first terminal device performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period. The t-th channel aggregation indication value is used to indicate that N secondary channels among the M-1 secondary channels are aggregated with the primary channel, where N is an integer greater than or equal to 0 and less than or equal to M-1.
  • The first terminal device performs channel aggregation on the primary channel and the N secondary channels. That is, the first terminal device can send data packets in the (t+1)-th time period through the channel formed by aggregating the primary channel and the N secondary channels, where the (t+1)-th time period is the time period after the t-th time period.
  • the load report may also include the deadline for the t-th period.
  • In this way, the first terminal device can obtain accurate load information for each channel from the network device and combine it with the channel state monitoring information obtained from its own channel state monitoring (such as information about the first terminal device sending data packets on each channel). Based on the real-time load and channel status, it uses artificial intelligence (AI), that is, the prediction ability of the channel aggregation model, to make optimal channel aggregation decisions. This helps reduce the probability that data packets sent by the first terminal device on the aggregated channel collide with data packets sent by other terminal devices, improves the transmission performance of the aggregated channel, and addresses the problems of low channel aggregation throughput and large delay.
  • The channel state monitoring information obtained by the first terminal device performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period may include, but is not limited to, one or more of the following items: the busy/idle status, in each time unit, of the primary channel and each of the M-1 secondary channels as monitored by the first terminal device in the t-th time period; the data packet sending status, in each time unit, of the first terminal device on each of the primary channel and the M-1 secondary channels as monitored in the t-th time period; and the number of consecutive time units, monitored in the t-th time period, during which the data packet sending status of the first terminal device and the channel busy/idle status on each of the primary channel and the M-1 secondary channels both remain unchanged.
  • In this way, the first terminal device can monitor the state of each channel in terms of both its busy/idle status and the status of sending data packets on it, which helps the channel aggregation model make optimal channel aggregation decisions based on the real-time load and channel status, thereby improving the transmission performance of the aggregated channel.
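  • As a rough illustration of the third monitoring item, the following sketch counts, for one channel, how many consecutive time units at the end of a time period have kept both the busy/idle status and the packet sending status unchanged. The function name, the boolean encoding and the choice to count back from the most recent time unit are assumptions made for illustration; the application does not specify them.

```python
def unchanged_run_length(busy, sending):
    """Count trailing time units whose (busy, sending) pair equals the latest pair.

    busy[i]    -- True if the channel was busy in time unit i (assumed encoding)
    sending[i] -- True if this device was sending a packet in time unit i
    """
    if len(busy) != len(sending):
        raise ValueError("busy and sending must cover the same time units")
    if not busy:
        return 0
    latest = (busy[-1], sending[-1])
    run = 0
    # Walk backwards from the most recent time unit until the pair changes.
    for pair in zip(reversed(busy), reversed(sending)):
        if pair != latest:
            break
        run += 1
    return run


# One channel observed over four time units (hypothetical values).
busy = [True, False, False, False]
sending = [False, False, False, False]
run = unchanged_run_length(busy, sending)
```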
  • The method further includes the following steps.
  • The first terminal device determines, based on the load information of the primary channel and each of N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value is used to indicate that N' secondary channels among the M-1 secondary channels are aggregated with the primary channel.
  • The first terminal device determines a first state-action value based on the channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value and a set state-action value function. The first state-action value is the state-action value, under the channel environment information of the (t-1)-th time period, of the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value.
  • The first terminal device determines a second state-action value based on the channel environment information of the (t-1)-th time period, the 2^(M-1)-1 candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function. The 2^(M-1)-1 candidate channel aggregation indication values correspond to the 2^(M-1)-1 candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values, under the channel environment information of the (t-1)-th time period, of the candidate channel aggregation modes corresponding to those candidate channel aggregation indication values.
  • The first terminal device determines the loss of the channel aggregation model based on the first state-action value, the second state-action value and the reward value of the (t-1)-th channel aggregation indication value, and trains and updates the channel aggregation model based on that loss.
  • Here N' is the same as or different from N, and the (t-1)-th time period is the time period before the t-th time period.
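  • The application states only that the loss is determined from the first state-action value, the second state-action value and the reward value. The sketch below assumes the squared temporal-difference error commonly used in Q-learning; the discount factor gamma, the function names and the numeric values are hypothetical, and the candidate count is taken as 2^(M-1)-1.

```python
def candidate_count(m):
    """Number of candidate channel aggregation indication values for M channels."""
    return 2 ** (m - 1) - 1


def td_loss(q_first, q_candidates, reward, gamma=0.9):
    """Squared temporal-difference error (an assumed form of the loss).

    q_first      -- state-action value of the aggregation mode that was chosen
    q_candidates -- state-action values of all candidate aggregation modes
    reward       -- reward value given to the chosen aggregation mode
    gamma        -- discount factor (assumption, not stated in the application)
    """
    q_second = max(q_candidates)        # second state-action value
    target = reward + gamma * q_second  # assumed learning target
    return (target - q_first) ** 2


m = 3
q_candidates = [0.2, 0.5, 0.1]          # one value per candidate mode
assert len(q_candidates) == candidate_count(m)
loss = td_loss(q_first=0.4, q_candidates=q_candidates, reward=1.0, gamma=0.5)
```

  • In a neural-network implementation the same quantity would be minimized by gradient descent on the parameters of the channel aggregation model.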
  • In this way, the first terminal device can test whether data packets sent on the aggregated channel collide with data packets sent by other terminal devices. Based on the channel aggregation decision and the outcome of sending data packets on the aggregated channel, combined with the load condition of each channel, different rewards are given to the channel aggregation decisions made by the channel aggregation model. This guides the channel aggregation model to learn from the load conditions on each channel, so that it outputs the optimal channel aggregation decision.
  • The method further includes: the first terminal device determines, based on the load information of the primary channel and each of N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value is used to indicate that N' secondary channels among the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is the time period before the t-th time period.
  • The first terminal device inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value includes: the first terminal device inputs the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing, and obtains the t-th channel aggregation indication value.
  • The first terminal device determining, based on the load information of the primary channel and each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model may include the following situations. The situations may be used in combination or independently, and this application does not limit how they are combined:
  • the first terminal device determines the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model
  • the first terminal device determines the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model
  • When a data packet sent by the first terminal device on the channel formed by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero, the first terminal device determines the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model;
  • When a data packet sent by the first terminal device on the channel formed by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero, the first terminal device determines the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model;
  • R_t represents the reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model
  • K represents the K-th secondary channel among the N' secondary channels
  • In this way, the first terminal device can test whether data packets sent on the aggregated channel collide with data packets sent by other terminal devices. Based on the channel aggregation decision and the outcome of sending data packets on the aggregated channel, combined with the load condition of each channel, different rewards are given to the channel aggregation decisions made by the channel aggregation model, which guides the model to learn from the load conditions on each channel so that it outputs the optimal channel aggregation decision.
  • embodiments of the present application provide a communication device, which has the function of implementing the method in the first aspect.
  • the function can be implemented by hardware, or can be implemented by hardware executing corresponding software.
  • the hardware or software includes one or more modules corresponding to the above functions, such as an interface unit and a processing unit.
  • the device may be a chip or integrated circuit.
  • the device includes a memory and a processor.
  • the memory is used to store instructions executed by the processor.
  • the device can perform the method of the first aspect.
  • the device may be a first terminal device.
  • embodiments of the present application provide a communication device.
  • the communication device includes an interface circuit and a processor, and the processor and the interface circuit are coupled to each other.
  • the processor is used to implement the method of the first aspect above through logic circuits or executing instructions.
  • the interface circuit is used to receive signals from communication devices other than the communication device and transmit them to the processor, or to send signals from the processor to communication devices other than the communication device. It can be understood that the interface circuit may be a transceiver or an input-output interface.
  • the communication device may also include a memory for storing instructions executed by the processor or input data required for the processor to run the instructions or data generated after the processor executes the instructions.
  • the memory can be a physically separate unit, or it can be coupled to the processor, or the processor can include the memory.
  • embodiments of the present application provide a computer-readable storage medium, in which computer programs or instructions are stored. When the computer programs or instructions are executed, the method of the first aspect can be implemented.
  • embodiments of the present application further provide a computer program product, which includes a computer program or instructions.
  • embodiments of the present application further provide a chip, which is coupled to a memory and used to read and execute programs or instructions stored in the memory to implement the method of the first aspect.
  • Figure 1 is a schematic diagram of a communication system architecture provided by an embodiment of the present application.
  • Figure 2 is a schematic diagram of a fully connected neural network provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a neuron calculating output according to input provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of adjacent multi-channel aggregation provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of a preamble puncturing transmission provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of a channel aggregation method provided by an embodiment of the present application.
  • Figure 7 is one of the schematic diagrams of indication information of channel load information provided by an embodiment of the present application.
  • Figure 8 is a second schematic diagram of indication information of channel load information provided by an embodiment of the present application.
  • Figure 9A is a schematic diagram of the structure of a channel aggregation model provided by an embodiment of the present application.
  • Figure 9B is a schematic diagram of a reinforcement learning process provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of a communication device provided by an embodiment of the present application.
  • Figure 11 is a second schematic diagram of a communication device provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of a device provided by an embodiment of the present application.
  • The technical solutions of the embodiments of this application can be applied to various communication systems, such as 5G systems, LTE systems and long term evolution-advanced (LTE-A) systems. They can also be extended to systems such as wireless fidelity (WiFi), worldwide interoperability for microwave access (WiMAX) and 3GPP-related systems, and to future communication systems such as 6G systems.
  • the communication system architecture applied in the embodiment of the present application may be as shown in Figure 1, including a network device and multiple terminal devices. In Figure 1, three terminal devices are taken as an example. Terminal device 1 - terminal device 3 can send data (or data packets) to the network device separately or simultaneously. It should be noted that the embodiment of the present application does not limit the number of terminal devices and network devices in the communication system shown in Figure 1 .
  • the above-mentioned terminal equipment may also be called a terminal, a user equipment (UE), a mobile station (MS), a mobile terminal, etc.
  • Terminal devices can be widely used in various scenarios, such as device-to-device (D2D) communication, vehicle to everything (V2X) communication, machine-type communication (MTC), Internet of things (IoT), virtual reality, augmented reality, industrial control, autonomous driving, telemedicine, smart grid, smart home, smart office, smart wear, smart transportation, smart city, etc.
  • Terminal devices can be mobile phones, tablets, computers with wireless transceiver functions, wearable devices, vehicles, drones, helicopters, airplanes, ships, robots, robotic arms, smart home devices, vehicle-mounted terminals, IoT terminals, stations (STA) in a WiFi system, etc.
  • the embodiments of this application do not limit the specific technology and specific equipment form used by the terminal equipment.
  • Network equipment may also be called access network (AN) equipment or radio access network (RAN) equipment. It can be a base station, an evolved base station (evolved NodeB, eNodeB), a transmission and reception point (TRP), an integrated access and backhaul (IAB) node, a next generation base station (next generation NodeB, gNB) in a fifth generation (5G) mobile communication system, a base station in a sixth generation (6G) mobile communication system, a base station in another future mobile communication system, a home base station (for example, home evolved NodeB or home NodeB, HNB), or an access point (AP), wireless relay node or wireless backhaul node in a WiFi system, etc.
  • Neural network is a machine learning technology that simulates the human brain neural network in order to achieve artificial intelligence.
  • A neural network consists of at least 3 layers: an input layer, an intermediate layer (also called a hidden layer) and an output layer. Deeper neural networks may contain more hidden layers between the input and output layers. The internal structure and implementation are described below using the simplest neural network as an example. See the schematic diagram of a fully connected neural network containing three layers shown in Figure 2. As shown in Figure 2, the neural network includes 3 layers, namely the input layer, the hidden layer and the output layer. Each circle in Figure 2 represents a neuron. The input layer has 3 neurons, the hidden layer has 4 neurons, the output layer has 2 neurons, and the neurons in each layer are fully connected to the neurons in the next layer.
  • Each connection between neurons corresponds to a weight, and these weights can be updated through training.
  • Each neuron in the hidden layer and output layer can also correspond to a bias, and these biases can also be updated through training. Updating a neural network means updating these weights and biases.
  • Once the structure of the neural network (that is, the number of neurons in each layer and the connection relationship between the neurons) and the parameters of the neural network (that is, the weight corresponding to each connection between neurons and the bias corresponding to each neuron) are known, all the information of the neural network is known.
  • each neuron may have multiple input connections, and each neuron calculates an output based on the input. See Figure 3, which is a schematic diagram of a neuron calculating output based on input. As shown in Figure 3, a neuron contains 3 inputs, 1 output, and 2 calculation functions.
  • “*” represents the mathematical operation of multiplication. The activation function can be an S-shaped function (sigmoid function), a hyperbolic tangent function (tanh), a rectified linear unit (ReLU), etc.
  • Each neuron may have multiple output connections, and the output of one neuron serves as the input of the next neuron.
  • the input layer only has output connections, each neuron of the input layer is the value input to the neural network, and the output value of each neuron is directly used as the input of all output connections.
  • the output layer only has input connections, and the output is calculated using the calculation method of the above formula (1-1).
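  • Formula (1-1) itself does not appear in this text. The sketch below assumes the usual form, in which a neuron applies an activation function to the weighted sum of its inputs plus a bias, and reads the "2 calculation functions" as that weighted sum and the activation. The function names and the numeric values are hypothetical.

```python
import math


def sigmoid(z):
    """S-shaped activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Activation of (input1*weight1 + input2*weight2 + ... + bias)."""
    if len(inputs) != len(weights):
        raise ValueError("each input connection needs exactly one weight")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# A neuron with 3 inputs and 1 output, as in the Figure 3 description.
out = neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.0], bias=0.0)
```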
  • x represents the input of the neural network
  • y represents the output of the neural network
  • w_i represents the weight of the i-th layer of the neural network
  • b_i represents the bias of the i-th layer of the neural network
  • f_i represents the activation function of the i-th layer of the neural network
  • i = 1, 2, ..., k.
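  • With these symbols, the output of a k-layer network can be read as applying, layer by layer, the activation f_i to w_i times the previous output plus b_i. The following sketch of such a forward pass uses the 3-4-2 layout described for Figure 2; the weights, biases and the choice of ReLU are made-up values for illustration only.

```python
def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def layer(x, w, b, f):
    """One fully connected layer: f(w x + b), with w given as a list of rows."""
    return f([sum(wi * xi for wi, xi in zip(row, x)) + bi
              for row, bi in zip(w, b)])


def forward(x, layers):
    """Feed x through the layers in order; each layer is a (w_i, b_i, f_i) tuple."""
    for w, b, f in layers:
        x = layer(x, w, b, f)
    return x


# 3 input neurons -> 4 hidden neurons -> 2 output neurons (hypothetical parameters).
w1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
b1 = [0, 0, 0, -1]
w2 = [[1, 1, 1, 1], [0, 0, 0, 1]]
b2 = [0, 0.5]
y = forward([1.0, -2.0, 3.0], [(w1, b1, relu), (w2, b2, relu)])
```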
  • Figure 4 is a schematic diagram of adjacent multi-channel aggregation.
  • A 20MHz primary channel and a 20MHz secondary channel can be aggregated into a channel with a bandwidth of 40MHz; a 40MHz primary channel and a 40MHz secondary channel can be aggregated into a channel with a bandwidth of 80MHz; and an 80MHz primary channel and an 80MHz secondary channel can be aggregated into a channel with a bandwidth of 160MHz.
  • Figure 5 is a schematic diagram of preamble puncturing transmission.
  • TX denotes transmission
  • CH denotes channel
  • The bandwidth of each channel (CH1, CH2, CH3, CH4) is 20MHz, and the transmission bandwidth of frame 1, frame 2 and frame 3 is 80MHz.
  • Because the secondary 20MHz channel (denoted S20) is busy when frame 1 is transmitted, S20 is punctured, so the actual bandwidth of frame 1 is 60MHz. Similarly, the actual bandwidth of frame 2 is 60MHz, and the actual bandwidth of frame 3 is 40MHz.
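  • The bandwidth arithmetic of preamble puncturing can be sketched as follows: every busy 20MHz channel is punctured and the remaining channels carry the frame. Which of CH1 to CH4 is busy for each frame is an assumption here, since the figure is not reproduced, and the function name is hypothetical.

```python
def punctured_bandwidth_mhz(channel_busy, channel_width_mhz=20):
    """Bandwidth left after puncturing every busy channel.

    channel_busy -- one flag per channel, True if that channel is busy
    """
    idle_channels = sum(1 for busy in channel_busy if not busy)
    return idle_channels * channel_width_mhz


# Four 20MHz channels, 80MHz nominal transmission bandwidth.
frame1 = punctured_bandwidth_mhz([False, True, False, False])  # one channel punctured
frame3 = punctured_bandwidth_mhz([False, True, True, False])   # two channels punctured
```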
  • Channel aggregation methods are mainly divided into two categories: static channel aggregation and dynamic channel aggregation.
  • In static channel aggregation, provided that the primary channel is idle, it is necessary to wait until all secondary channels are idle before channel aggregation can be performed.
  • In dynamic channel aggregation, when the primary channel is idle, if there happens to be an idle secondary channel, the primary channel and the idle secondary channel can be aggregated.
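  • The difference between the two categories can be sketched as two small decision functions. The names, the use of None to mean "cannot aggregate yet", and the omission of any adjacency requirement between channels are simplifying assumptions for illustration.

```python
def static_aggregation(primary_idle, secondary_idle):
    """Aggregate only when the primary and all secondary channels are idle.

    Returns the indices of the secondary channels to aggregate,
    or None if the device has to keep waiting.
    """
    if primary_idle and all(secondary_idle):
        return list(range(len(secondary_idle)))
    return None


def dynamic_aggregation(primary_idle, secondary_idle):
    """Aggregate the primary channel with whichever secondary channels are idle now."""
    if not primary_idle:
        return None
    return [i for i, idle in enumerate(secondary_idle) if idle]


# Primary idle, first secondary idle, second secondary busy.
static_choice = static_aggregation(True, [True, False])
dynamic_choice = dynamic_aggregation(True, [True, False])
```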
  • The main idea of current channel aggregation methods is to aggregate the primary channel with idle secondary channels when the primary channel is idle.
  • However, the aggregated channels used by multiple terminal devices may partially or completely overlap, so the data packets sent by the terminal devices may collide at a high rate, and a terminal device may enter the backoff window multiple times while waiting to send its data packets, which leads to low channel aggregation throughput and large delay.
  • In view of this, this application provides a channel aggregation method that aims to make optimal channel aggregation decisions based on the real-time status of the channel and the transmission requirements of the service, using the prediction ability of artificial intelligence (AI), so as to improve the transmission performance of the aggregated channel and address the problems of low channel aggregation throughput and large delay.
  • Unless otherwise specified, a noun whose number is not stated means "singular or plural", that is, "one or more".
  • At least one means one or more
  • plural means two or more.
  • “And/or” describes the relationship between associated objects, indicating that there can be three relationships, for example, A and/or B, which can mean: A exists alone, A and B exist simultaneously, and B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the related objects are in an "or” relationship.
  • A/B means: A or B.
  • "At least one of the following" or similar expressions refers to any combination of the listed items, including any combination of single items or plural items.
  • For example, at least one of a, b, or c means: a, b, c, a and b, a and c, b and c, or a and b and c, where each of a, b and c can be single or multiple.
  • Figure 6 shows a channel aggregation method provided by an embodiment of the present application.
  • the method includes:
  • the first terminal device receives a load report from the network device.
  • the load report includes the load information of each of the M channels of the network device in the t-th time period.
  • M is an integer greater than or equal to 2
  • t is greater than or equal to 2.
  • According to a set acquisition cycle, the network device can obtain, through carrier sensing, the load information of each of its M channels in the time period corresponding to each acquisition cycle.
  • The load information of a channel in a given time period can be represented by a load value, which is the ratio of the time during which the channel is busy in that time period (that is, the time during which data packets are transmitted) to the total time of the time period.
  • Specifically, for any time period, the network device can determine through carrier sensing whether each of its M channels has data packet transmission in each time unit of the time period, and determine the load information (such as the load value) of each channel in the time period accordingly.
  • the time unit may be resources of different time granularities such as subframes, slots, mini-slots or symbols, and one time period may include one or more time units.
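  • As a minimal sketch of the load value, assuming all time units in a time period have equal length, the ratio of busy time to total time reduces to the fraction of time units in which a data packet transmission was sensed. The function name and the sample data are hypothetical.

```python
def load_value(busy_per_unit):
    """Fraction of time units in the period during which the channel was busy.

    busy_per_unit -- one flag per time unit, True if a packet transmission
                     was sensed on the channel in that time unit
    """
    if not busy_per_unit:
        raise ValueError("a time period contains at least one time unit")
    return sum(busy_per_unit) / len(busy_per_unit)


# One channel sensed over a time period of four time units.
load = load_value([True, False, True, True])
```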
  • For the t-th time period, after the network device obtains the load information of each of the M channels in the t-th time period, it can send a load report containing that load information to terminal devices by broadcast, multicast, etc., for example by broadcasting it to one or more terminal devices located within the service range of the network device.
  • The indication information used to indicate the load information of each channel in the load report can be as shown in Figure 7, in which the channel number field is used to indicate the number (or index) of the channel and occupies one octet (8 bits);
  • the channel load field is used to indicate the load value corresponding to the channel and occupies one octet.
  • For the load information of each of the M channels in the t-th time period, the load report therefore incurs a total overhead of M*16 bits.
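  • The two-octet layout per channel can be sketched with Python's struct module. How a load ratio is mapped onto the one-octet channel load field is not stated in this passage; scaling the ratio to the range 0 to 255 is an assumption, as are the function name and the sample channel numbers.

```python
import struct


def pack_load_report(loads):
    """Pack {channel_number: load ratio in [0, 1]} as (number, load) octet pairs."""
    body = b""
    for number, ratio in sorted(loads.items()):
        if not 0 <= number <= 255 or not 0.0 <= ratio <= 1.0:
            raise ValueError("value does not fit the one-octet fields")
        # Assumed encoding: load ratio scaled to an unsigned octet.
        body += struct.pack("BB", number, round(ratio * 255))
    return body


report = pack_load_report({149: 0.2, 153: 1.0, 157: 0.0})
overhead_bits = len(report) * 8   # M * 16 bits for M = 3 channels
```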
  • the indication information used to indicate the load information of each channel can also be as shown in Figure 8.
  • the indication information can also include a regulatory class field and an actual measurement stop time field.
  • the regulatory class field can indicate a type set and occupies one octet.
  • the type set can include information such as the operating frequency band, channel bandwidth, channel set, transmission power upper limit, emissions limits set, and behavior limits set.
  • for example, the value 55 of the regulatory class field corresponds to a type set indicating that the channel is in the 5 GHz frequency band, the channel bandwidth is 20 MHz, the channel numbers (or indexes) of the channels included in the channel set are 149, 153, 157, 161, and 165, the transmission power upper limit is 1000 mW, the emissions limits set is 4, and the behavior limits set is 10;
  • the value 12 of the regulatory class field corresponds to a type set indicating that the channel is in the 2.407 GHz frequency band, the channel bandwidth is 25 MHz, the channel numbers (or indexes) of the channels included in the channel set are 1 to 11, the transmission power upper limit is 1000 mW, the emissions limits set is 4, and the behavior limits set is 10.
  • the actual measurement stop time field, which occupies 8 octets, is used to indicate the time at which the load measurement is completed, and can be used to ensure the time consistency of the load reports issued to each terminal device. For example, when the network device uses carrier sensing to perform load measurement on the channel in the t-th time period, the time at which the load measurement is completed is the deadline of the t-th time period.
  • the regulatory class field and the actual measurement stop time field are optional, and whether they are present can be indicated by the first 2 bits of the indication information. For example: 00 means neither field is present, 01 means only the actual measurement stop time field is present, 10 means only the regulatory class field is present, and 11 means both fields are present.
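  • the 2-bit presence indication described above can be sketched as follows. The function names are hypothetical, and the per-channel length calculation assumes that only the four fields named in the text are present and does not count the 2 indicator bits themselves, since their exact placement is not specified here.

```python
def decode_presence_bits(flags):
    """Decode the 2-bit presence indicator (binary 00, 01, 10 or 11).

    Returns (has_regulatory_class, has_measurement_stop_time).
    """
    if not 0 <= flags <= 3:
        raise ValueError("presence indicator must fit in 2 bits")
    has_regulatory_class = bool(flags & 0b10)       # 10 or 11
    has_measurement_stop_time = bool(flags & 0b01)  # 01 or 11
    return has_regulatory_class, has_measurement_stop_time

def entry_length_octets(flags):
    """Octets per channel: channel number (1) + channel load (1) + optional fields."""
    has_regulatory_class, has_stop_time = decode_presence_bits(flags)
    length = 2
    if has_regulatory_class:
        length += 1  # regulatory class field: one octet
    if has_stop_time:
        length += 8  # actual measurement stop time field: 8 octets
    return length
```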
  • the M channels include 1 primary channel corresponding to the first terminal device and M-1 secondary channels.
  • the primary channel corresponding to the first terminal device can be indicated to the first terminal device by the network device through a radio resource control (RRC) message or the like, or can be determined by the first terminal device based on the load information of the M channels (for example, by selecting the channel with the smallest load value as the primary channel); this application does not limit this.
  • the first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing, and obtains the t-th channel aggregation indication value.
  • the channel environment information of the t-th time period includes the load information of the primary channel and each of the M-1 secondary channels in the t-th time period, and the channel state monitoring information obtained by the first terminal device performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period.
  • the channel aggregation indication value is used to indicate that N secondary channels among the M-1 secondary channels are aggregated with the primary channel, where N is an integer greater than or equal to 0 and less than or equal to M-1.
  • the first terminal device can also perform channel status monitoring on the primary channel and M-1 secondary channels within the time period corresponding to each acquisition cycle to obtain channel status monitoring information.
  • the channel state monitoring information obtained by the first terminal device performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period may include one or more of the following: the busy/idle status, in each time unit, of the primary channel and each of the M-1 secondary channels monitored by the first terminal device in the t-th time period; the data packet sending status of the first terminal device, in each time unit, on the primary channel and each of the M-1 secondary channels in the t-th time period; and the number of consecutive time units during which the data packet sending status and the busy/idle status on the primary channel and each of the M-1 secondary channels remain unchanged at the same time.
  • for channel i, the i-th channel among the primary channel and the M-1 secondary channels (M channels in total), the busy/idle status can be represented by a vector whose number of elements is equal to the number of time units included in the t-th time period.
  • an element value of 1 means that channel i is busy in the time unit corresponding to the element (that is, a data packet is being transmitted, either by the first terminal device or by another terminal device).
  • an element value of 0 means that channel i is idle in the time unit corresponding to the element (that is, no data packet is being transmitted), and an element value of -1 means that the first terminal device did not monitor the busy/idle status of channel i in the time unit corresponding to the element (for example, because the first terminal device was sending a data packet on a channel other than channel i in that time unit and was therefore unable to monitor channel i). For example, the vector [0, 0, 0, 0, 0, 0, 0, 0, 0, 1] means that channel i is idle in the first 9 time units of the t-th time period and busy in the 10th time unit.
  • for channel i, the i-th channel among the primary channel and the M-1 secondary channels (M channels in total), the data packet sending status of the first terminal device can be represented by a vector whose number of elements is equal to the number of time units included in the t-th time period.
  • an element value of 1 means that the first terminal device sent a data packet on channel i in the time unit corresponding to the element.
  • an element value of 0 means that the first terminal device did not send a data packet on channel i in the time unit corresponding to the element. For example, the vector [1, 1, 1, 0, 0, 0, 0, 0, 0, 1] means that the first terminal device sent data packets on channel i in the first 3 time units and the 10th time unit of the t-th time period, but did not send data packets on channel i in the 4th to 9th time units.
  • the number of consecutive time units during which the data packet sending status and the busy/idle status on the primary channel and each of the M-1 secondary channels remain unchanged at the same time can be represented by a counter.
  • for example, the first terminal device can initially set the counter to 0; in the second time unit of the t-th time period, the values of the elements corresponding to the second time unit in the busy/idle status vectors and the data packet sending status vectors are all the same as the values of the elements corresponding to the first time unit, so the counter is incremented by 1 (to 1); in the third time unit, the values of the elements corresponding to the third time unit are all the same as the values of the elements corresponding to the second time unit, so the counter is incremented by 1 (to 2); in the fourth time unit, the value of at least one element corresponding to the fourth time unit differs from the value of the element corresponding to the third time unit, so the counter is reset to 0; in the fifth time unit, the value of at least one element corresponding to the fifth time unit differs from the value of the element corresponding to the fourth time unit, so the counter is reset to 0.
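  • the counter update described above can be sketched in Python as follows. The function name and the list-of-lists representation of the per-channel status vectors are assumptions made for illustration; the rule itself (increment by 1 when every element is unchanged from the previous time unit, otherwise reset to 0) follows the text.

```python
def unchanged_run_lengths(busy, sent):
    """Track how long all channel statuses have stayed unchanged.

    busy[i][k]: busy/idle status of channel i in time unit k (1, 0 or -1).
    sent[i][k]: 1 if the terminal sent a packet on channel i in time unit k.
    Returns the counter value after each time unit of the time period.
    """
    num_units = len(busy[0])
    counter = 0
    history = [counter]  # initial value 0 for the first time unit
    for k in range(1, num_units):
        unchanged = (all(b[k] == b[k - 1] for b in busy)
                     and all(s[k] == s[k - 1] for s in sent))
        counter = counter + 1 if unchanged else 0
        history.append(counter)
    return history

# Single-channel example that mirrors the walk-through in the text.
history = unchanged_run_lengths(busy=[[0, 0, 0, 1, 0]], sent=[[0, 0, 0, 0, 0]])
```

  • in this example the counter takes the values 0, 1, 2, 0, 0 over the five time units, because the busy/idle status changes in the fourth and fifth time units.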
  • the input of the channel aggregation model may be the channel environment information S of a certain time period (such as the t-th time period), and the output of the channel aggregation model is the channel aggregation indication value Y.
  • the channel environment information $S_t$ of the t-th time period includes the load information of the primary channel in the t-th time period.
  • Y can be an integer between 0 and $2^{M-1}-1$, and each value is mapped to a specific channel aggregation mode that includes the primary channel.
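  • the text does not specify how each value of Y is mapped to a channel aggregation mode. One possible mapping, shown here purely as an assumption, treats Y as a bitmask over the M-1 secondary channels, so that Y = 0 means the primary channel alone. The function name is hypothetical, and a real system may restrict the mapping further (for example, to contiguous channels).

```python
def decode_aggregation_value(y, m):
    """Return the 1-based indices of the secondary channels aggregated with
    the primary channel, assuming bit k-1 of y selects secondary channel k."""
    num_secondary = m - 1
    if not 0 <= y < 2 ** num_secondary:
        raise ValueError("Y must lie between 0 and 2**(M-1) - 1")
    return [k for k in range(1, num_secondary + 1) if (y >> (k - 1)) & 1]

# With M = 4 there are 3 secondary channels and Y ranges over 0..7.
selected = decode_aggregation_value(5, 4)  # binary 101
```

  • under this assumed mapping, Y = 5 with M = 4 selects secondary channels 1 and 3, so N = 2.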
  • the parameters of each layer of neurons in the channel aggregation model can be configured through random initialization, or can be obtained by a training device through training on multiple channel environment information samples in a sample library, each labeled with the target channel aggregation indication value of its corresponding channel aggregation mode.
  • the channel environment information samples in the sample library can be obtained as follows: the first terminal device collects the channel environment information corresponding to multiple time periods; then, for the channel environment information corresponding to each time period, a preferred channel aggregation mode is determined manually according to the channel environment information corresponding to the next time period, and the channel environment information corresponding to the time period is labeled with the target channel aggregation indication value corresponding to the preferred channel aggregation mode.
  • the training device (such as the first terminal device or the network device) can input a channel environment information sample from the sample library into the channel aggregation model and obtain the channel aggregation indication value output by the channel aggregation model.
  • ideally, the channel aggregation indication value output by the channel aggregation model is the target channel aggregation indication value corresponding to the channel environment information sample.
  • through a loss function, the training device can calculate the loss of the channel aggregation model. The higher the loss, the greater the difference between the channel aggregation indication value output by the channel aggregation model and the target channel aggregation indication value. The training device adjusts the parameters of the neurons in the channel aggregation model according to the loss, for example by using the stochastic gradient descent method, so that the training process of the channel aggregation model becomes a process of reducing this loss as much as possible.
  • the channel aggregation model is trained continuously on the channel environment information samples in the sample library, and when the loss is reduced to within a preset range, the trained channel aggregation model is obtained.
  • each block in Figure 9A represents a fully connected layer.
  • the channel aggregation model can be composed of 7 fully connected layers, which are, from left to right, 1 input layer, 5 hidden layers, and 1 output layer.
  • the activation function of each layer can be a rectified linear unit (ReLU).
  • the input of the input layer is the channel environment information S of a certain time period (such as the first time period).
  • the output h1 of the input layer is the input of hidden layer 1
  • the output h2 of hidden layer 1 is the input of hidden layer 2.
  • the output h3 of hidden layer 2 is the input of hidden layer 3.
  • the XOR operation result of the output h4 of hidden layer 3 and the output h2 of hidden layer 1 is the input of hidden layer 4.
  • the output h5 of hidden layer 4 is the input of hidden layer 5.
  • the XOR operation result of the output h6 of hidden layer 5 and the output h4 of hidden layer 3 is the input of the output layer, and the output of the output layer is the channel aggregation indication value Y.
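  • the layer wiring described above can be sketched in plain Python as follows. Several points are assumptions rather than details from this application: the combining operation that the text calls "XOR" is implemented here as element-wise addition (a residual-style connection), because the layer outputs are real-valued; all intermediate layers share one width so that the combined vectors have equal length; the output layer produces one score per candidate value and Y is taken as the index of the largest score; and the weights are randomly initialized. All function names are hypothetical.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def dense(weights, bias, v):
    """Fully connected layer followed by ReLU; weights is a list of rows."""
    return relu([sum(w * x for w, x in zip(row, v)) + b
                 for row, b in zip(weights, bias)])

def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_model(input_dim, hidden_dim, num_values, seed=0):
    """7 layers: 1 input layer, 5 hidden layers, 1 output layer."""
    rng = random.Random(seed)
    dims = [input_dim] + [hidden_dim] * 6 + [num_values]
    return [init_layer(dims[i], dims[i + 1], rng) for i in range(7)]

def combine(u, v):
    # Assumption: element-wise addition stands in for the "XOR" in the text.
    return [a + b for a, b in zip(u, v)]

def forward(model, s):
    h1 = dense(*model[0], s)                # input layer
    h2 = dense(*model[1], h1)               # hidden layer 1
    h3 = dense(*model[2], h2)               # hidden layer 2
    h4 = dense(*model[3], h3)               # hidden layer 3
    h5 = dense(*model[4], combine(h4, h2))  # hidden layer 4 sees h4 and h2
    h6 = dense(*model[5], h5)               # hidden layer 5
    scores = dense(*model[6], combine(h6, h4))  # output layer sees h6 and h4
    return max(range(len(scores)), key=lambda i: scores[i])

# M = 3 gives 2**(M-1) = 4 candidate values; the input size 6 is illustrative.
model = build_model(input_dim=6, hidden_dim=8, num_values=4)
y = forward(model, [0.1] * 6)
```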
  • the process of training the channel aggregation model is the process of continuously adjusting the parameters of the neurons in each layer of the channel aggregation model.
  • the above-mentioned training device can be a first terminal device, a network device, or other devices such as a server or a computer.
  • the training device can determine the parameters of the neurons in each layer of the channel aggregation model and then send the parameters to the first terminal device.
  • after the channel aggregation model outputs a channel aggregation indication value based on the channel environment information ($S_{t-1}$) of a certain time period (such as the (t-1)-th time period), the first terminal device can also check whether data packets sent on the aggregated channel collide with data packets sent by other terminal devices. Based on the channel aggregation mode indicated by the channel aggregation indication value and the result of sending data packets on the aggregated channel, combined with the load of each channel in that time period, the first terminal device assigns a reward value (such as $R_t$) to the channel aggregation indication value output by the channel aggregation model, and the reward value is also used as an input of the channel aggregation model in the next time period (such as the t-th time period).
  • through the reward value (such as $R_t$), the channel aggregation model is guided to learn from the load conditions on each channel, with the aim of outputting the optimal channel aggregation decision through the channel aggregation model.
  • the channel aggregation indication value output by the channel aggregation model based on the channel environment information of a certain time period (such as the (t-1)-th time period) can also be used as an input of the channel aggregation model in the next time period (such as the t-th time period).
  • the first terminal device can determine the reward value of the channel aggregation indication value obtained based on the channel aggregation model in the following manner, that is, determine the reward value of the first terminal device performing the decision action obtained based on the channel aggregation model (i.e., the channel aggregation mode corresponding to the channel aggregation indication value). The following takes the time period being the t-th time period and the reward value of the channel aggregation indication value obtained based on the channel aggregation model being $R_{t+1}$ as an example:
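  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels does not collide with a data packet sent by another terminal device, and N is not zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text);
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels does not collide with a data packet sent by another terminal device, and N is zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text);
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels collides with a data packet sent by another terminal device, and N is not zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text);
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels collides with a data packet sent by another terminal device, and N is zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text).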
  • where $R_{t+1}$ represents the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model, and K represents the K-th secondary channel among the N secondary channels, K = 1, 2, ..., N.
  • the first terminal device can also determine the reward value $R_{t+1}$ of the t-th channel aggregation indication value obtained based on the channel aggregation model from the mean of the load information (such as the load values) of the primary channel and the N secondary channels in the t-th time period.
  • for example, when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels collides with a data packet sent by another terminal device, the product of -1 and the mean of the load information (such as the load values) of the primary channel and the N secondary channels in the (t+1)-th time period is used as the reward value $R_{t+1}$; when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels does not collide with a data packet sent by another terminal device, the mean of the load information (such as the load values) of the primary channel and the N secondary channels in the (t+1)-th time period is used as the reward value $R_{t+1}$.
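  • the mean-load reward rule described above can be sketched as follows. The function name and the representation of the load information as a plain list of load values are assumptions made for illustration.

```python
def reward_from_mean_load(loads, collided):
    """Reward for one channel aggregation decision.

    loads: load values of the primary channel and the N aggregated
           secondary channels (N + 1 values in total).
    collided: True if the packet sent on the aggregated channel collided
              with a packet sent by another terminal device.
    """
    mean_load = sum(loads) / len(loads)
    # Collision: -1 times the mean load; no collision: the mean load itself.
    return -mean_load if collided else mean_load

# Hypothetical load values for the primary channel and N = 2 secondary channels.
r_ok = reward_from_mean_load([20, 40, 60], collided=False)
r_collision = reward_from_mean_load([20, 40, 60], collided=True)
```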
  • the above takes the time period being the t-th time period and the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model being $R_{t+1}$ as an example.
  • the first terminal device can also determine, based on the load information of the primary channel and each of the N' secondary channels in the (t-1)-th time period, the reward value $R_t$ of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, where the (t-1)-th channel aggregation indication value is used to indicate that N' secondary channels among the M-1 secondary channels are aggregated with the primary channel, and N' is the same as or different from N.
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text).
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text).
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text).
  • when the data packet sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to the formula for this case (not preserved in this text).
  • where $R_t$ represents the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and K represents the K-th secondary channel among the N' secondary channels.
  • the first terminal device sends a data packet through the aggregated channel of the primary channel and N secondary channels in the t+1th time period.
  • the t+1th time period is the time period after the tth time period.
  • after the first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing and obtains the t-th channel aggregation indication value, it can aggregate the N secondary channels, among the M-1 secondary channels, indicated by the t-th channel aggregation indication value with the primary channel, and send data packets to the network device through the aggregated channel in the (t+1)-th time period after the t-th time period.
  • in order to make the channel aggregation decision (i.e., the channel aggregation indication value) made by the channel aggregation model meet the user's expectations, the user can also pre-configure a state action value function Q for evaluating the different decision actions a (that is, the channel aggregation modes corresponding to different channel aggregation indication values Y) taken under various channel environment information S. For the decision action a output by the channel aggregation model based on the channel environment information S of a certain time period (that is, the output channel aggregation indication value Y), state action value evaluation is performed on the corresponding channel aggregation mode to obtain a first state action value;
  • and the channel aggregation modes corresponding to all possible decision actions for the channel environment information S of this time period (that is, all possible channel aggregation indication values Y) are evaluated separately to obtain multiple state action values, and the maximum value is selected as the second state action value.
  • the loss of the channel aggregation model can be determined based on the second state action value, the first state action value, and the reward value, and the channel aggregation model can then be trained and updated.
  • for example, the stochastic gradient descent method can be used to update the parameters of the neurons in the channel aggregation model based on the loss.
  • the following expected squared reward value function (which can also be called a loss function) can be used to determine the loss of the channel aggregation model:
  • $L(\theta) = E\left[\left(R_{t+1} + \gamma \max_{a'} Q(s_t', a'; \theta^*) - Q(s_t, a_t; \theta)\right)^2\right]$
  • where $L(\cdot)$ represents the expected squared reward value function, and $L(\theta)$ represents the loss of the channel aggregation model;
  • $Q(\cdot)$ represents the set state action value function;
  • $\gamma$ represents the discount factor (the value can be 0.9, etc.);
  • $\theta$ represents the current parameters of the channel aggregation model;
  • $R_{t+1}$ represents the reward value of the decision action $a_t$ obtained based on the channel aggregation model (that is, the channel aggregation mode corresponding to the obtained t-th channel aggregation indication value);
  • $Q(s_t, a_t; \theta)$ represents the first state action value of the decision action $a_t$ (that is, the channel aggregation mode corresponding to the output t-th channel aggregation indication value) under the channel environment information $s_t$ of the t-th time period;
  • $\max_{a'} Q(s_t', a'; \theta^*)$ represents the second state action value, that is, the maximum of the state action values of the candidate channel aggregation modes corresponding to the candidate channel aggregation indication values;
  • $a'$ represents the decision action corresponding to the second state action value;
  • $\theta^*$ represents the parameters of the target channel aggregation model, that is, the parameters of the channel aggregation model when it outputs the decision action $a'$ (the channel aggregation indication value corresponding to $a'$) of the second state action value.
  • the above takes the time period being the t-th time period as an example to explain how the loss of the channel aggregation model is determined. It can be understood that, for other time periods (such as the (t-1)-th time period), the reward value, first state action value, and second state action value corresponding to the t-th time period are replaced with those corresponding to the (t-1)-th time period, so that the loss of the channel aggregation model corresponding to the (t-1)-th time period can be determined and the channel aggregation model can be trained and updated accordingly.
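  • the loss computation described above (reward value plus the discounted second state action value, minus the first state action value, squared and averaged) can be sketched as follows. The function names and the tuple layout of a transition are hypothetical; the state action values are passed in as plain numbers, so the sketch does not depend on how the channel aggregation model itself is implemented.

```python
def squared_td_error(reward, gamma, q_candidates_target, q_taken):
    """Squared error for one time period.

    reward: reward value of the decision action that was taken.
    gamma: discount factor (the text gives 0.9 as an example value).
    q_candidates_target: state action values of all candidate decision
        actions, evaluated with the target model parameters.
    q_taken: first state action value, i.e. the value of the action
        actually taken, evaluated with the current model parameters.
    """
    second_state_action_value = max(q_candidates_target)
    td_error = reward + gamma * second_state_action_value - q_taken
    return td_error ** 2

def expected_loss(transitions, gamma=0.9):
    """Average the squared errors over a batch to approximate the expectation."""
    losses = [squared_td_error(r, gamma, q_cands, q_taken)
              for r, q_cands, q_taken in transitions]
    return sum(losses) / len(losses)

# Illustrative numbers only: (reward, candidate values, value of taken action).
batch = [(1.0, [0.0, 2.0], 1.0), (0.0, [0.0], 0.0)]
loss = expected_loss(batch, gamma=0.5)
```

  • for the first transition the error is 1.0 + 0.5 * 2.0 - 1.0 = 1.0, and for the second it is 0, so the averaged loss is 0.5.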
  • the above takes as an example the case where the first terminal device processes the input channel environment information of the t-th time period based on the channel aggregation model to obtain the channel aggregation indication value, and performs channel aggregation in the (t+1)-th time period based on the channel aggregation mode indicated by the channel aggregation indication value.
  • the channel aggregation model can also be deployed on the network device.
  • in that case, the network device obtains the channel environment information of the first terminal device corresponding to the t-th time period, inputs it into the channel aggregation model for processing to obtain the channel aggregation indication value, and sends the channel aggregation indication value, or the channel aggregation mode it indicates, to the first terminal device; the first terminal device then performs channel aggregation in the channel aggregation mode indicated by the channel aggregation indication value received from the network device.
  • the first terminal device includes corresponding hardware structures and/or software modules that perform each function.
  • the units and method steps of each example described in conjunction with the embodiments disclosed in this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software driving the hardware depends on the specific application scenarios and design constraints of the technical solution.
  • Figures 10 and 11 are schematic structural diagrams of possible communication devices provided by embodiments of the present application. These communication devices can be used to implement the functions of the first terminal device in the above method embodiments, and therefore can also achieve the beneficial effects of the above method embodiments.
  • the communication device may be a first terminal device, or may be a module (such as a chip) applied to the first terminal device.
  • the communication device 1000 includes a processing unit 1010 and an interface unit 1020, where the interface unit 1020 may also be a transceiver unit or an input/output interface.
  • the communication device 1000 may be used to implement the functions of the first terminal device in the above method embodiment shown in FIG. 6 .
  • the interface unit 1020 is configured to receive a load report from the network device.
  • the load report includes the load information of each of the M channels of the network device in the t-th time period, where the M channels include 1 primary channel corresponding to the first terminal device and M-1 secondary channels, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2.
  • the processing unit 1010 is configured to input the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, and to perform channel aggregation on the primary channel and N secondary channels.
  • the channel environment information of the t-th time period includes the load information of the primary channel and each of the M-1 secondary channels in the t-th time period, and the channel state monitoring information obtained by performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period.
  • the t-th channel aggregation indication value is used to indicate that N secondary channels among the M-1 secondary channels are aggregated with the primary channel, where N is an integer greater than or equal to 0 and less than or equal to M-1.
  • the load report also includes the deadline for the t-th period.
  • the channel state monitoring information obtained by the processing unit 1010 performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period includes one or more of the following: the busy/idle status, in each time unit, of the primary channel and each of the M-1 secondary channels monitored by the processing unit 1010 in the t-th time period; the data packet sending status of the communication device, in each time unit, on the primary channel and each of the M-1 secondary channels monitored by the processing unit 1010 in the t-th time period; and the number of consecutive time units, monitored by the processing unit 1010 in the t-th time period, during which the data packet sending status of the communication device and the busy/idle status on the primary channel and each of the M-1 secondary channels remain unchanged at the same time.
  • the processing unit 1010 is also configured to: determine, based on the load information of the primary channel and each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, where the (t-1)-th channel aggregation indication value is used to indicate that N' secondary channels among the M-1 secondary channels are aggregated with the primary channel; determine, based on the channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and the set state action value function, the first state action value of the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value under the channel environment information of the (t-1)-th time period; and determine the second state action value based on the channel environment information of the (t-1)-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state action value function.
  • the processing unit 1010 is also configured to determine, based on the load information of the primary channel and each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, where the (t-1)-th channel aggregation indication value is used to indicate that N' secondary channels among the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is the time period before the t-th time period;
  • when inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, the processing unit 1010 is specifically configured to input the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing to obtain the t-th channel aggregation indication value.
  • when determining, based on the load information of the primary channel and each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit 1010 is specifically configured to determine the reward value according to the formula for the applicable one of the following cases (the formulas are not preserved in this text):
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is not zero;
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is zero;
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero;
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero;
  • where $R_t$ represents the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and the formulas use the load information of the primary channel in the (t-1)-th time period.
  • the processing unit 1010 is also configured to determine, based on the load information of the primary channel and each of the N secondary channels in the t-th time period, the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model.
  • when determining, based on the load information of the primary channel and each of the N secondary channels in the t-th time period, the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit 1010 is specifically configured to determine the reward value according to the formula for the applicable one of the following cases (the formulas are not preserved in this text):
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N secondary channels does not collide with a data packet sent by another terminal device, and N is not zero;
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N secondary channels does not collide with a data packet sent by another terminal device, and N is zero;
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N secondary channels collides with a data packet sent by another terminal device, and N is not zero;
  • the data packet sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N secondary channels collides with a data packet sent by another terminal device, and N is zero;
  • where $R_{t+1}$ represents the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model
  • K represents the K-th sub-channel among N sub-channels
  • K 1, 2,...,N
  • the processing unit 1010 is also configured to: determine, based on the channel environment information of the t-th time period, the t-th channel aggregation indication value and the set state-action value function, a first state-action value of performing, under the channel environment information of the t-th time period, the channel aggregation mode corresponding to the t-th channel aggregation indication value; determine a second state-action value based on the channel environment information of the t-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, where the $2^{M-1}-1$ candidate channel aggregation indication values correspond to $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum of the state-action values of performing each candidate channel aggregation mode under the channel environment information of the t-th time period; determine the loss of the channel aggregation model based on the first state-action value, the second state-action value and the reward value of the t-th channel aggregation indication value; and train and update the channel aggregation model based on that loss.
  • this application also provides a communication device 1100, including a processor 1110 and an interface circuit 1120.
  • the processor 1110 and the interface circuit 1120 are coupled to each other.
  • the interface circuit 1120 can be a transceiver, an input-output interface, an input interface, an output interface, a communication interface, etc.
  • the communication device 1100 may also include a memory 1130 for storing instructions executed by the processor 1110 or input data required for the processor 1110 to run the instructions or data generated after the processor 1110 executes the instructions.
  • the memory 1130 can also be integrated with the processor 1110 .
  • the processor 1110 can be used to implement the functions of the above-mentioned processing unit 1010, and the interface circuit 1120 can be used to implement the functions of the above-mentioned interface unit 1020.
  • the device may be a network device or a first terminal device.
  • the device may include a processor, a transceiver and an antenna.
  • the processor may include one or more processing units. Different processing units can be independent devices or can be integrated into one or more processors. The processor can be the nerve center and command center of the device.
  • the processor can generate operation control signals based on the instruction opcode and timing signals to complete the operations of fetching and executing instructions.
  • the processor can execute the channel aggregation method flow according to the instructions corresponding to the channel aggregation method; the transceiver and the antenna can receive signals from other devices and pass them to the processor, or send signals from the processor to other devices.
  • the device can also include a neural-network processing unit (NPU).
  • the NPU trains and updates the channel aggregation model (i.e., the neural network model) and, based on the information input to the channel aggregation model, computes and outputs the channel aggregation mode (or the channel aggregation indication value corresponding to that mode).
  • the NPU can include an inference module and a training module, where the training module can be used to train and update the channel aggregation model (i.e., the neural network model).
  • the inference module can compute and output the channel aggregation mode based on the information input to the channel aggregation model.
  • the NPU may be coupled to the central processing unit, which is not limited in this application.
  • the processor in the embodiments of the present application can be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a logic circuit, a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • a general-purpose processor can be a microprocessor or any conventional processor.
  • the method steps in the embodiments of the present application can be implemented by hardware or by a processor executing software instructions.
  • Software instructions can be composed of corresponding software modules, and the software modules can be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, registers, a hard disk, a removable hard disk, a CD-ROM or any other form of storage medium well known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and storage media may be located in an ASIC. Additionally, the ASIC can be located in network equipment or terminal equipment. Of course, the processor and the storage medium can also exist as discrete components in network equipment or terminal equipment.
  • the computer program product includes one or more computer programs or instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user equipment, or other programmable device.
  • the computer program or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer program or instructions may be transmitted from one network device, terminal, computer, server or data center to another network device, terminal, computer, server or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center that integrates one or more available media.
  • the available media may be magnetic media, such as floppy disks, hard disks, and tapes; optical media, such as digital video optical disks; or semiconductor media, such as solid-state hard drives.
  • the computer-readable storage medium may be volatile or nonvolatile storage media, or may include both volatile and nonvolatile types of storage media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

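This application relates to the field of communication technologies and discloses a channel aggregation method and apparatus, aiming to make an optimal channel aggregation decision and to solve the problems of low throughput and high latency in channel aggregation. The method includes: a first terminal device receives a load report, where the load report includes load information, in a t-th time period, of each of M channels of a network device, and the M channels include one primary channel and M-1 secondary channels corresponding to the first terminal device; the first terminal device inputs channel environment information of the t-th time period into a channel aggregation model for processing to obtain a t-th channel aggregation indication value, where the channel environment information of the t-th time period includes the load information, in the t-th time period, of the primary channel and of each of the M-1 secondary channels, and the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel; and in a (t+1)-th time period, the first terminal device sends data packets on the channel obtained by aggregating the primary channel and the N secondary channels.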

Description

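Channel aggregation method and apparatus
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202211066591.3, filed with the China Patent Office on August 31, 2022 and entitled "一种信道聚合方法及装置" (Channel aggregation method and apparatus), which is incorporated herein by reference in its entirety.
Technical field
This application relates to the field of communication technologies, and in particular to a channel aggregation method and apparatus.
Background
To cope with the shortage of spectrum resources and growing traffic, channel aggregation was introduced into the communication standards developed by the Institute of Electrical and Electronics Engineers (IEEE). Starting from a primary channel, channel aggregation combines the primary channel with secondary channels adjacent to it to support a larger channel bandwidth and thereby raise the data rate.
Current channel aggregation methods fall into two classes: static channel aggregation and dynamic channel aggregation. In static channel aggregation, given that the primary channel is idle, the device must wait until all secondary channels are also idle before it can aggregate. In dynamic channel aggregation, when the primary channel is idle, the primary channel is aggregated with whichever secondary channels happen to be idle at that moment.
However, when multiple terminal devices contend for channel resources under these methods, the data packets sent by the terminal devices collide at a high rate and the terminal devices repeatedly enter the backoff window to wait before sending, which results in low throughput and high latency for channel aggregation.
Summary
Embodiments of this application provide a channel aggregation method and apparatus, aiming to make an optimal channel aggregation decision and to solve the problems of low throughput and high latency in channel aggregation.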
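In a first aspect, an embodiment of this application provides a channel aggregation method. The method may be performed by a first terminal device, by a component of the first terminal device (for example a processor, a chip or a chip system), or by a logic module or software that implements all or part of the functions of the first terminal device. The following description takes execution by the first terminal device as an example. The method includes: the first terminal device receives a load report from a network device, where the load report includes load information, in a t-th time period, of each of M channels of the network device, the M channels include one primary channel and M-1 secondary channels corresponding to the first terminal device, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2; the first terminal device inputs channel environment information of the t-th time period into a channel aggregation model for processing to obtain a t-th channel aggregation indication value, where the channel environment information of the t-th time period includes the load information, in the t-th time period, of the primary channel and of each of the M-1 secondary channels, together with channel state monitoring information obtained by the first terminal device by monitoring the channel state of the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and the first terminal device aggregates the primary channel and the N secondary channels, that is, the first terminal device may send data packets in a (t+1)-th time period on the channel obtained by aggregating the primary channel and the N secondary channels, where the (t+1)-th time period is the time period after the t-th time period.
Optionally, the load report may further include the end time of the t-th time period.
With this method, the first terminal device can obtain accurate load information for each channel from the network device and combine it with the channel state monitoring information obtained through its own monitoring (for example, information about the data packets the first terminal device sent on each channel). Based on the real-time load and state of the channels, it uses the predictive capability of artificial intelligence (AI), that is, of the channel aggregation model, to make a preferred channel aggregation decision. This helps reduce the probability that data sent by the first terminal device on the aggregated channel collides with data packets sent by other terminal devices, improves the transmission performance of the aggregated channel, and addresses the low throughput and high latency of channel aggregation.
In a possible design, the channel state monitoring information obtained by the first terminal device by monitoring the primary channel and the M-1 secondary channels in the t-th time period may include, but is not limited to, one or more of the following: the busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels, as monitored by the first terminal device in the t-th time period; the data packet sending state of the first terminal device, in each time unit, on the primary channel and on each of the M-1 secondary channels, as monitored in the t-th time period; and the number of consecutive time units, as monitored in the t-th time period, during which the data packet sending state of the first terminal device and the busy/idle state of the channel both remain unchanged on the primary channel and on each of the M-1 secondary channels.
In this design, the first terminal device monitors each channel from the perspective of its busy/idle state and of the terminal device's own packet transmissions on it. This helps the channel aggregation model make an optimal channel aggregation decision based on the real-time load and state of the channels, and thereby improves the transmission performance of the aggregated channel.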
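In a possible design, the method further includes: the first terminal device determines, based on the load information, in the (t-1)-th time period, of the primary channel and of each of N' secondary channels, a reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel; the first terminal device determines, based on the channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value and a set state-action value function, a first state-action value of performing, under the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value; the first terminal device determines a second state-action value based on the channel environment information of the (t-1)-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, where the $2^{M-1}-1$ candidate channel aggregation indication values correspond to $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum of the state-action values of performing each candidate channel aggregation mode under the channel environment information of the (t-1)-th time period; the first terminal device determines a loss of the channel aggregation model based on the first state-action value, the second state-action value and the reward value of the (t-1)-th channel aggregation indication value; and the first terminal device trains and updates the channel aggregation model based on that loss; where N' is equal to or different from N, and the (t-1)-th time period is the time period before the t-th time period.
In this design, after the channel aggregation model makes a channel aggregation decision (that is, outputs a channel aggregation indication value), the first terminal device can check whether the data packets it sends on the aggregated channel collide with data packets sent by other terminal devices. Based on the decision, the outcome of sending on the aggregated channel and the load of each channel, it assigns different rewards to the decisions made by the model, guiding the model to learn from the load on each channel so that it outputs an optimal channel aggregation decision.
In a possible design, the method further includes: the first terminal device determines, based on the load information, in the (t-1)-th time period, of the primary channel and of each of N' secondary channels, a reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel, N' is equal to or different from N, and the (t-1)-th time period is the time period before the t-th time period;
that the first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value includes: the first terminal device inputs the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing to obtain the t-th channel aggregation indication value.
With this method, a reward policy can be set that assigns different rewards to the channel aggregation decisions (the output channel aggregation indication values) made by the model, and the assigned reward value is also used as an input factor for the model's next decision, so that the model makes the channel aggregation decisions the user needs.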
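Optionally, determining, by the first terminal device, the reward value of the (t-1)-th channel aggregation indication value based on the load information, in the (t-1)-th time period, of the primary channel and of each of the N' secondary channels may include the following cases. The cases may be used in combination or independently, and this application does not limit how they are combined:
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels does not collide with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced];
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels does not collide with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced];
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels collides with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced];
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels collides with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced];
In the above cases, $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, ..., N', [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
In this design, after the channel aggregation model makes a channel aggregation decision (that is, outputs a channel aggregation indication value), the first terminal device can check whether the data packets it sends on the aggregated channel collide with data packets sent by other terminal devices. Based on the decision, the outcome of sending on the aggregated channel and the load of each channel, it assigns different rewards to the decisions made by the model, guiding the model to learn from the load on each channel so that it outputs an optimal channel aggregation decision.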
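In a second aspect, an embodiment of this application provides a communication apparatus that has the function of implementing the method of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, for example an interface unit and a processing unit.
In a possible design, the apparatus may be a chip or an integrated circuit.
In a possible design, the apparatus includes a memory and a processor, the memory stores instructions to be executed by the processor, and when the instructions are executed by the processor, the apparatus can perform the method of the first aspect.
In a possible design, the apparatus may be the first terminal device.
In a third aspect, an embodiment of this application provides a communication apparatus including an interface circuit and a processor that are coupled to each other. The processor implements the method of the first aspect through logic circuits or by executing instructions. The interface circuit receives signals from communication apparatuses other than this one and passes them to the processor, or sends signals from the processor to communication apparatuses other than this one. It can be understood that the interface circuit may be a transceiver or an input/output interface.
Optionally, the communication apparatus may further include a memory for storing the instructions executed by the processor, the input data the processor needs to run the instructions, or the data generated after the processor runs the instructions. The memory may be a physically separate unit, may be coupled to the processor, or may be included in the processor.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program or instructions which, when executed, implement the method of the first aspect.
In a fifth aspect, an embodiment of this application further provides a computer program product including a computer program or instructions which, when executed, implement the method of the first aspect.
In a sixth aspect, an embodiment of this application further provides a chip coupled to a memory and configured to read and execute the program or instructions stored in the memory to implement the method of the first aspect.
For the technical effects achievable by the second to sixth aspects, refer to those of the first aspect; they are not repeated here.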
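Brief description of the drawings
FIG. 1 is a schematic diagram of a communication system architecture provided by an embodiment of this application;
FIG. 2 is a schematic diagram of a fully connected neural network provided by an embodiment of this application;
FIG. 3 is a schematic diagram of a neuron computing an output from its inputs, provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the aggregation of adjacent channels provided by an embodiment of this application;
FIG. 5 is a schematic diagram of preamble puncturing transmission provided by an embodiment of this application;
FIG. 6 is a schematic diagram of a channel aggregation method provided by an embodiment of this application;
FIG. 7 is a first schematic diagram of the indication information for channel load information provided by an embodiment of this application;
FIG. 8 is a second schematic diagram of the indication information for channel load information provided by an embodiment of this application;
FIG. 9A is a schematic diagram of the structure of a channel aggregation model provided by an embodiment of this application;
FIG. 9B is a schematic diagram of a reinforcement learning flow provided by an embodiment of this application;
FIG. 10 is a first schematic diagram of a communication apparatus provided by an embodiment of this application;
FIG. 11 is a second schematic diagram of a communication apparatus provided by an embodiment of this application;
FIG. 12 is a schematic diagram of a device structure provided by an embodiment of this application.
Detailed description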
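The technical solutions of the embodiments of this application can be applied to various communication systems, for example 5G systems, LTE systems and long term evolution-advanced (LTE-A) systems. They can also be extended to wireless fidelity (WiFi), worldwide interoperability for microwave access (WiMAX) and 3GPP-related cellular systems, and to future communication systems such as 6G. Specifically, the communication system architecture to which the embodiments apply may be as shown in FIG. 1, including a network device and multiple terminal devices; FIG. 1 takes three terminal devices as an example. Terminal devices 1 to 3 may send data (or data packets) to the network device separately or simultaneously. Note that the embodiments do not limit the number of terminal devices or network devices in the communication system shown in FIG. 1.
The terminal device may also be called a terminal, user equipment (UE), a mobile station (MS), a mobile terminal, and so on. Terminal devices are widely used in many scenarios, for example device-to-device (D2D) communication, vehicle to everything (V2X) communication, machine-type communication (MTC), internet of things (IoT), virtual reality, augmented reality, industrial control, autonomous driving, telemedicine, smart grid, smart furniture, smart office, smart wearables, smart transportation and smart cities. A terminal device may be a mobile phone, a tablet, a computer with wireless transceiver functions, a wearable device, a vehicle, a drone, a helicopter, an airplane, a ship, a robot, a robotic arm, a smart home device, a vehicle-mounted terminal, an IoT terminal, a station (STA) in a WiFi system, and so on. The embodiments of this application do not limit the specific technology or device form used by the terminal device.
The network device may also be called an access network (AN) device or a radio access network (RAN) device. It may be a base station, an evolved NodeB (eNodeB), a transmitter and receiver point (TRP), an integrated access and backhauling (IAB) node, a next generation NodeB (gNB) in a fifth generation (5G) mobile communication system, a base station in a sixth generation (6G) mobile communication system, a base station in another future mobile communication system, a home base station (for example a home evolved NodeB or home Node B, HNB), an access point (AP) in a WiFi system, a wireless relay node, a wireless backhaul node, and so on.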
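Before the embodiments are described, some terms used in this application are explained to aid understanding by those skilled in the art.
1) A neural network (NN) is a machine learning technique that imitates the neural network of the human brain in order to achieve human-like intelligence. A neural network has at least three layers: an input layer, an intermediate layer (also called a hidden layer) and an output layer. Deeper networks may contain more hidden layers between the input and output layers. Taking the simplest neural network as an example, its internal structure and implementation are described with reference to the three-layer fully connected neural network in FIG. 2. As shown in FIG. 2, the network has an input layer, a hidden layer and an output layer, and each circle represents a neuron: the input layer has 3 neurons, the hidden layer has 4 and the output layer has 2, and every neuron in one layer is connected to every neuron in the next. Each connection between neurons has a weight, and these weights can be updated through training. Each neuron in the hidden and output layers may also have a bias, which can likewise be updated through training. Updating a neural network means updating these weights and biases. Once the structure of the network is known (the number of neurons in each layer and how they are connected) along with its parameters (the weight of each connection and the bias of each neuron), all information about the network is known.
As FIG. 2 shows, each neuron may have several input connections, and each neuron computes its output from its inputs. FIG. 3 is a schematic diagram of a neuron computing an output from its inputs. As shown in FIG. 3, a neuron has 3 inputs, 1 output and 2 computing functions, and the output can be expressed by formula (1-1):
output = activation function(input 1 * weight 1 + input 2 * weight 2 + input 3 * weight 3 + bias)    (1-1);
where "*" denotes multiplication, and the activation function may be a sigmoid function, a hyperbolic function, a rectified linear unit (ReLU), or the like.
Each neuron may have several output connections, and the output of one neuron serves as the input of the next. It should be understood that the input layer has only output connections: each input-layer neuron holds a value fed into the network, and its output value is passed directly along all of its output connections. The output layer has only input connections and computes its output as in formula (1-1). Optionally, the output layer may omit the activation function, in which case formula (1-1) becomes: output = input 1 * weight 1 + input 2 * weight 2 + input 3 * weight 3 + bias.
For example, a k-layer neural network can be expressed as:
$$y = f_k(f_{k-1}(\cdots f_1(w_1 * x + b_1)))\qquad(1\text{-}2);$$
where x denotes the input of the neural network, y denotes its output, $w_i$ denotes the weights of the i-th layer, $b_i$ denotes the biases of the i-th layer, $f_i$ denotes the activation function of the i-th layer, and i = 1, 2, ..., k.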
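Formulas (1-1) and (1-2) can be made concrete with a small sketch. The layer sizes follow FIG. 2 (3 inputs, 4 hidden neurons, 2 outputs); the weight and bias values, and the helper names `neuron_output`, `layer_output` and `forward`, are illustrative assumptions and do not come from the application.

```python
import math

def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)

def sigmoid(z):
    """Sigmoid activation, one of the options named for formula (1-1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=relu):
    """Formula (1-1): activation(sum(input_i * weight_i) + bias).

    Passing activation=None gives the variant without an activation function.
    """
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z) if activation is not None else z

def layer_output(inputs, weight_rows, biases, activation=relu):
    """One fully connected layer: one neuron per row of weights."""
    return [neuron_output(inputs, row, b, activation)
            for row, b in zip(weight_rows, biases)]

def forward(x, layers):
    """Formula (1-2): apply the layers in order, innermost layer first."""
    for weight_rows, biases, activation in layers:
        x = layer_output(x, weight_rows, biases, activation)
    return x

# Hypothetical parameters: 3 inputs -> 4 hidden neurons (ReLU) -> 2 outputs.
hidden = ([[0.5, -0.2, 0.1]] * 4, [0.0, 0.1, -0.1, 0.2], relu)
output = ([[1.0, 0.0, -1.0, 0.5], [0.2, 0.2, 0.2, 0.2]], [0.0, 0.5], None)
y = forward([1.0, 2.0, 3.0], [hidden, output])
```

Training would adjust the weight rows and biases; the forward pass itself stays the same.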
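2) Channel aggregation. Channel aggregation was first introduced in the IEEE 802.11ac standard. Starting from a 20 megahertz (MHz) primary channel, it allows multiple adjacent 20 MHz secondary channels to be aggregated into a channel of 40 MHz, 80 MHz or 160 MHz bandwidth for transmission, which improves transmission efficiency. FIG. 4 is a schematic diagram of the aggregation of adjacent channels. As FIG. 4 shows, a 20 MHz primary channel and a 20 MHz secondary channel can be aggregated into a 40 MHz channel; a 40 MHz primary channel and a 40 MHz secondary channel can be aggregated into an 80 MHz channel; and an 80 MHz primary channel and an 80 MHz secondary channel can be aggregated into a 160 MHz channel.
In 802.11ax, the successor to 802.11ac, techniques such as preamble puncturing allow channel aggregation across non-adjacent 20 MHz channels, which makes aggregation more flexible and opens more room for raising throughput. FIG. 5 is a schematic diagram of preamble puncturing transmission, where TX denotes transmit, CH denotes channel, each channel (CH1, CH2, CH3, CH4) has a bandwidth of 20 MHz, and frame 1, frame 2 and frame 3 each have a transmission bandwidth of 80 MHz. Because the secondary 20 MHz channel (denoted S20) is busy when frame 1 is transmitted, S20 is punctured, so the actual bandwidth of frame 1 is 60 MHz. Likewise, the actual bandwidth of frame 2 is 60 MHz and that of frame 3 is 40 MHz.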
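The bandwidth arithmetic behind preamble puncturing can be sketched as follows. The function name and the choice of which list position stands for S20 are assumptions made for illustration; only the 20 MHz subchannel width and the 80 MHz nominal bandwidth come from the text.

```python
def punctured_bandwidth_mhz(busy_flags, subchannel_mhz=20):
    """Bandwidth actually used when every busy subchannel is punctured.

    busy_flags holds one boolean per 20 MHz subchannel of the nominal
    transmission bandwidth (four flags for an 80 MHz frame).
    """
    return sum(subchannel_mhz for busy in busy_flags if not busy)

# Frame 1 in the text: one subchannel (S20, assumed here to be index 1) is busy.
frame1 = punctured_bandwidth_mhz([False, True, False, False])
# A frame with two busy subchannels, matching the 40 MHz case in the text.
frame3 = punctured_bandwidth_mhz([False, True, True, False])
```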
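3) Channel aggregation methods. Current channel aggregation methods fall into two classes: static channel aggregation and dynamic channel aggregation. In static channel aggregation, given that the primary channel is idle, the device must wait until all secondary channels are also idle before it can aggregate. In dynamic channel aggregation, when the primary channel is idle, the primary channel is aggregated with whichever secondary channels happen to be idle at that moment.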
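The two conventional rules can be contrasted in a few lines. This is a minimal sketch of the rules as stated above; the function names and the convention of returning `None` for "keep waiting" are assumptions for illustration.

```python
def static_aggregation(primary_idle, secondary_idle):
    """Static rule: aggregate only when the primary and ALL secondaries are idle.

    Returns the indices of the secondary channels to aggregate,
    or None when the device has to keep waiting.
    """
    if primary_idle and all(secondary_idle):
        return list(range(len(secondary_idle)))
    return None

def dynamic_aggregation(primary_idle, secondary_idle):
    """Dynamic rule: when the primary is idle, aggregate whichever secondaries are idle."""
    if not primary_idle:
        return None
    return [i for i, idle in enumerate(secondary_idle) if idle]
```

With the primary idle and secondaries `[True, False, True]`, the static rule keeps waiting while the dynamic rule aggregates secondaries 0 and 2. Neither rule looks at channel load, which is the gap the load-aware model is meant to fill.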
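As the above shows, the main idea of current channel aggregation methods is to aggregate the primary channel with idle secondary channels when the primary channel is idle. However, when multiple terminal devices contend for channel resources, the aggregated channels used by those devices may overlap partly or completely. The data packets sent by the terminal devices then collide at a high rate, and the terminal devices repeatedly enter the backoff window to wait before sending, which results in low throughput and high latency for channel aggregation.
For this reason, this application provides a channel aggregation method that uses the predictive capability of artificial intelligence (AI), based on the real-time state of the channels and the transmission requirements of the traffic, to make a preferred channel aggregation decision, improve the transmission performance of the aggregated channel, and address the low throughput and high latency of channel aggregation. The embodiments are described in detail below with reference to the drawings, in which dashed lines indicate optional steps or components.
In addition, it should be understood that ordinal terms such as "first" and "second" in the embodiments are used to distinguish between objects and do not limit their size, content, order, timing, priority or importance. For example, the t-th time period and the (t+1)-th time period do not differ in priority or importance.
In the embodiments, unless otherwise stated, a noun covers both the singular and the plural, that is, "one or more". "At least one" means one or more, and "multiple" means two or more. "And/or" describes an association between objects and covers three relationships; for example, "A and/or B" can mean A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it; for example, A/B means A or B. "At least one of the following" or a similar expression means any combination of the listed items, including a single item or any combination of several items. For example, at least one of a, b or c means: a, b, c, a and b, a and c, b and c, or a and b and c, where a, b and c may each be single or multiple.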
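FIG. 6 shows the channel aggregation method provided by an embodiment of this application. The method includes:
S601: The first terminal device receives a load report from the network device. The load report includes load information, in the t-th time period, of each of the M channels of the network device, where M is an integer greater than or equal to 2 and t is an integer greater than or equal to 2.
In the embodiments, the network device may, at a set acquisition period and by means such as carrier sensing, obtain the load information of each of its M channels in the time period corresponding to the acquisition period (for example the t-th time period). The load information of a channel in a time period (for example the t-th time period) may be represented by a load value, which is the ratio of the time the channel is busy in that period (that is, the time during which data packets are transmitted) to the total time.
As an example, for a given time period (for example the t-th time period), the network device may use carrier sensing to learn, for each of its M channels, whether a data packet is transmitted in each time unit of the period, and from that determine the load information (for example the load value) of each channel in the period. A time unit may be a resource of any time granularity, such as a subframe, a slot, a mini-slot or a symbol, and a time period may include one or more time units.
For example, if a time period (for example the t-th time period) includes 50 time units and the network device learns through carrier sensing that data packets are transmitted on channel A in 30 of them, the load information (load value) of channel A in that period is 30/50*100% = 60%. Optionally, the load value may be scaled to the range 0-255; for example, with a load value of 60% for channel A in the period, 60%*255 = 153, so 153 can represent that load value of 60%.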
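The load computation in the example above can be reproduced directly. The function names are illustrative assumptions, and rounding to the nearest integer is also an assumption, since the text only gives a case where the scaled value is already a whole number.

```python
def load_value(busy_units, total_units):
    """Fraction of time units in the period during which the channel carried packets."""
    if total_units <= 0:
        raise ValueError("a time period must contain at least one time unit")
    return busy_units / total_units

def quantize_load(load, levels=255):
    """Scale a load fraction in [0, 1] to an integer in 0..levels (rounding assumed)."""
    return round(load * levels)

load_a = load_value(30, 50)       # channel A: busy in 30 of 50 time units
scaled_a = quantize_load(load_a)  # the value that fits in a one-octet load field
```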
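For the t-th time period, after obtaining the load information of each of the M channels in that period, the network device may send a load report containing that information to terminal devices by broadcast, multicast or the like, for example by broadcasting it to one or more terminal devices within its service range.
The indication information in the load report that indicates the load information of each channel may be as shown in FIG. 7. The channel number field indicates the number (or index) of the channel and occupies one octet (8 bits); the channel load field indicates the load value of the channel and occupies one octet. For the load information of each of the M channels in the t-th time period, the load report therefore incurs an overhead of M*16 bits in total.
In a possible implementation, the indication information may instead be as shown in FIG. 8 and additionally include a regulatory class field and an actual measurement stop time field. The regulatory class field, which occupies one octet, indicates a set of attributes that may include one or more of: operating band, channel bandwidth, channel set, transmit power limit, emissions limits set and behavior limits set. For example, a regulatory class value of 55 corresponds to a set indicating that the channel is in the 5 gigahertz (GHz) band, the channel bandwidth is 20 MHz, the channel set contains channel numbers (or indices) 149, 153, 157, 161 and 165, the transmit power is 1000 mW, the emissions limits set is 4 and the behavior limits set is 10; a regulatory class value of 12 corresponds to a set indicating that the channel is in the 2.407 GHz band, the channel bandwidth is 25 MHz, the channel set contains channel numbers (or indices) 1 to 11, the transmit power is 1000 mW, the emissions limits set is 4 and the behavior limits set is 10. The actual measurement stop time field occupies 8 octets and indicates when the load measurement was completed; it can be used to keep the load reports delivered to the terminal devices consistent in time. For example, if the network device measures channel load through carrier sensing in the t-th time period, the measurement completion time is the end time of the t-th time period.
It should be understood that the regulatory class field and the actual measurement stop time field are optional, and their presence can be indicated by the first 2 bits of the indication information. For example, 00 indicates that neither field is present, 01 indicates that the actual measurement stop time field is present, 10 indicates that the regulatory class field is present, and 11 indicates that both fields are present.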
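One way to serialize a per-channel entry with the optional fields is sketched below. The field sizes (1 octet each for channel number, channel load and regulatory class, 8 octets for the stop time) and the meaning of the 2 presence bits come from the text. Everything else is an assumption made for illustration: that the presence bits sit in the two most significant bits of a leading octet, the order of the fields, and the little-endian encoding of the stop time. The function names are hypothetical.

```python
import struct

REGULATORY_PRESENT = 0b10  # "10" in the text: regulatory class field present
STOP_TIME_PRESENT = 0b01   # "01" in the text: actual measurement stop time present

def encode_entry(channel_number, channel_load, regulatory_class=None, stop_time=None):
    """Pack one channel's indication information (assumed layout, see lead-in)."""
    flags = 0
    if regulatory_class is not None:
        flags |= REGULATORY_PRESENT
    if stop_time is not None:
        flags |= STOP_TIME_PRESENT
    out = bytes([flags << 6])              # presence bits in the top 2 bits (assumed)
    if regulatory_class is not None:
        out += bytes([regulatory_class])   # 1 octet
    out += bytes([channel_number, channel_load])  # 1 octet each
    if stop_time is not None:
        out += struct.pack("<Q", stop_time)       # 8 octets, byte order assumed
    return out

def decode_entry(data):
    """Inverse of encode_entry; returns the decoded fields and the octets consumed."""
    flags = data[0] >> 6
    pos = 1
    entry = {}
    if flags & REGULATORY_PRESENT:
        entry["regulatory_class"] = data[pos]
        pos += 1
    entry["channel_number"] = data[pos]
    entry["channel_load"] = data[pos + 1]
    pos += 2
    if flags & STOP_TIME_PRESENT:
        (entry["stop_time"],) = struct.unpack_from("<Q", data, pos)
        pos += 8
    return entry, pos

full = encode_entry(149, 153, regulatory_class=55, stop_time=1_000_000)
decoded, used = decode_entry(full)
```

Returning the number of consumed octets lets a receiver walk through M consecutive entries in one report.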
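In addition, it should be understood that the M channels include one primary channel and M-1 secondary channels corresponding to the first terminal device. Among the M channels, the primary channel of the first terminal device may be indicated to it by the network device, for example in a radio resource control (RRC) message, or may be determined by the first terminal device from the load information of the M channels (for example by choosing the channel with the smallest load value as the primary channel). This application does not limit this.
S602: The first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value.
The channel environment information of the t-th time period includes the load information, in the t-th time period, of the primary channel and of each of the M-1 secondary channels, together with the channel state monitoring information obtained by the first terminal device by monitoring the primary channel and the M-1 secondary channels in the t-th time period. The channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, where N is an integer greater than or equal to 0 and less than or equal to M-1.
In the embodiments, the first terminal device may also monitor the channel state of the primary channel and the M-1 secondary channels in the time period corresponding to each acquisition period to obtain channel state monitoring information. Taking the t-th time period as an example, the channel state monitoring information may include one or more of: the busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels, as monitored by the first terminal device in the t-th time period; the data packet sending state of the first terminal device, in each time unit, on the primary channel and on each of the M-1 secondary channels, as monitored in the t-th time period; and the number of consecutive time units, as monitored in the t-th time period, during which the data packet sending state of the first terminal device and the busy/idle state of the channel both remain unchanged on the primary channel and on each of the M-1 secondary channels.
The busy/idle state in each time unit, as monitored in the t-th time period, may be represented for each channel by a vector [symbol not reproduced], where i = 1, 2, 3, ..., M denotes the i-th of the primary channel and the M-1 secondary channels (M channels in total; hereinafter channel i). The number of elements in the vector equals the number of time units in the t-th time period. An element value of 1 means that channel i is busy in the corresponding time unit (a data packet is being transmitted, either by the first terminal device or by another terminal device); a value of 0 means that channel i is idle in that time unit (no data packet is being transmitted); and a value of -1 means that the first terminal device did not monitor the busy/idle state of channel i in that time unit (for example because it was sending a data packet on a channel other than channel i in that time unit and could not monitor channel i). For example, a vector whose first nine elements are 0 and whose tenth element is 1 indicates that channel i is idle in the first 9 time units of the t-th time period and busy in the 10th.
The data packet sending state of the first terminal device in each time unit, as monitored in the t-th time period, may be represented for each channel by a vector [symbol not reproduced], where i = 1, 2, 3, ..., M denotes channel i as above. The number of elements in the vector equals the number of time units in the t-th time period. An element value of 1 means that the first terminal device sends a data packet on channel i in the corresponding time unit, and a value of 0 means that it does not. For example, a vector whose first three elements and tenth element are 1 and whose fourth to ninth elements are 0 indicates that the first terminal device sends data packets on channel i in the first 3 time units and the 10th time unit of the t-th time period, and sends none on channel i in the 4th to 9th time units.
The number of consecutive time units during which the data packet sending state and the busy/idle state of the channel both remain unchanged may be represented by a counter [symbol not reproduced]. As an example: in the first time unit of the t-th time period, the first terminal device sets the counter to its initial value 0; in the second time unit, the elements of both vectors for the second time unit equal those for the first time unit, so the counter is incremented (to 1); in the third time unit, the elements for the third time unit equal those for the second, so the counter is incremented (to 2); in the fourth time unit, an element for the fourth time unit differs from that for the third, so the counter is reset to 0; in the fifth time unit, an element for the fifth time unit differs from that for the fourth, so the counter is reset to 0; in the sixth time unit, the elements for the sixth time unit equal those for the fifth, so the counter is incremented (to 1); and so on, until in the tenth time unit the elements for the tenth time unit equal those for the ninth and the counter is incremented (to 5). The final counter value is 5.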
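The counter walk described above can be sketched for a single channel as follows. The function name is an assumption, and the two input vectors are hypothetical: they were chosen only so that the state changes at the fourth and fifth time units, which reproduces the walk 0, 1, 2, 0, 0, 1, ..., 5 from the text. They are not the vectors of the original example.

```python
def unchanged_run_lengths(busy, sent):
    """Per-time-unit value of the "state unchanged" counter for one channel.

    busy[k] is the busy/idle state and sent[k] the packet sending state in
    time unit k. The counter starts at 0, grows by 1 whenever both states
    equal those of the previous time unit, and resets to 0 otherwise.
    """
    counter = 0
    walk = []
    for k in range(len(busy)):
        if k > 0 and busy[k] == busy[k - 1] and sent[k] == sent[k - 1]:
            counter += 1
        else:
            counter = 0
        walk.append(counter)
    return walk

# Hypothetical 10-unit period: the channel is busy only in the 4th time unit.
busy = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
sent = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
walk = unchanged_run_lengths(busy, sent)
final_counter = walk[-1]
```

The final value is what would enter the channel environment information; a large value signals that the channel state has been stable for a long stretch.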
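In the embodiments, the input of the channel aggregation model may be the channel environment information S of a time period (for example the t-th time period), and its output is a channel aggregation indication value Y. Taking the t-th time period as an example, the channel environment information $S_t$ includes the load information of the primary channel in the t-th time period and the load information of each of the M-1 secondary channels in the t-th time period, where j = 1, 2, 3, ..., M-1 denotes the j-th of the M-1 secondary channels. It may also include the channel state monitoring information obtained by the first terminal device in the t-th time period, that is, one or more of the busy/idle state vectors, sending state vectors and unchanged-state counters described above. Y may be a number from 0 to $2^{M-1}-1$, each of which maps to a specific channel aggregation mode that includes the primary channel: Y = 0 means no channel aggregation, Y = 1 means the primary channel is aggregated with the first of the M-1 secondary channels, Y = 2 means it is aggregated with the second secondary channel, ..., Y = M-1 means it is aggregated with the (M-1)-th secondary channel, Y = M means it is aggregated with the first and second secondary channels, and so on.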
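A mapping from Y to an aggregation mode that agrees with the values listed above (0: none, 1 to M-1: a single secondary channel, M: the first and second secondary channels) can be built by enumerating subsets of secondary channels by size. The ordering beyond Y = M is not spelled out in the text, so the continuation used here (by subset size, then in lexicographic order) is an assumption, as is the function name.

```python
from itertools import combinations

def aggregation_modes(num_secondary):
    """List the secondary-channel subsets in indication-value order.

    Entry Y of the returned list is the tuple of secondary channel numbers
    (1-based) that are aggregated with the primary channel when the model
    outputs Y.
    """
    modes = []
    for size in range(num_secondary + 1):
        modes.extend(combinations(range(1, num_secondary + 1), size))
    return modes

modes = aggregation_modes(3)  # M = 4: one primary and three secondary channels
```

For M = 4 this gives `modes[0] == ()`, `modes[3] == (3,)` and `modes[4] == (1, 2)`, with the largest index $2^{M-1}-1 = 7$ selecting all three secondary channels.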
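The parameters of the neurons in each layer of the channel aggregation model (that is, of the neural network corresponding to the model) may be configured by random initialization. They may also be obtained by a training device from a sample library of channel environment information samples, each labeled with the target channel aggregation indication value of a channel aggregation mode. In a possible implementation, the samples are built as follows: the first terminal device collects the channel environment information of multiple time periods; for each time period, a person determines a preferred channel aggregation mode for that period's channel environment information by looking at the channel environment information of the following time period, and labels the sample with the target channel aggregation indication value of that preferred mode. During training, the training device (for example the first terminal device or the network device) inputs a channel environment information sample from the library into the channel aggregation model and obtains the channel aggregation indication value output by the model. From that output and the sample's target channel aggregation indication value, the training device computes the loss of the model with a loss function; a higher loss means a larger difference between the output and the target. The parameters of the neurons in the model are then adjusted according to the loss, for example with stochastic gradient descent, so that training becomes a process of making the loss as small as possible. The model is trained repeatedly on the samples in the library, and once the loss falls within a preset range, the trained channel aggregation model is obtained.
As an example, the structure of the channel aggregation model in the embodiments may be as shown in FIG. 9A, in which each block represents a fully connected layer. The model may consist of 7 fully connected layers, which from left to right are 1 input layer, 5 hidden layers and 1 output layer, and each layer may use the rectified linear unit (ReLU) as its activation function. The input of the input layer is the channel environment information S of a time period (for example a first time period); the output h1 of the input layer is the input of hidden layer 1; the output h2 of hidden layer 1 is the input of hidden layer 2; the output h3 of hidden layer 2 is the input of hidden layer 3; the exclusive-OR (XOR) of the output h4 of hidden layer 3 and the output h2 of hidden layer 1 is the input of hidden layer 4; the output h5 of hidden layer 4 is the input of hidden layer 5; the XOR of the output h6 of hidden layer 5 and the output h4 of hidden layer 3 is the input of the output layer; and the output of the output layer is the channel aggregation indication value Y. Training the channel aggregation model is the process of repeatedly adjusting the parameters of the neurons in each layer.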
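The layer wiring of FIG. 9A can be sketched in plain Python. Several choices here are assumptions rather than statements from the application: the two places where the text combines layer outputs by XOR are implemented as an element-wise sum (a conventional skip connection, since XOR is not defined for real-valued activations); the output layer is taken to produce one score per candidate indication value, with Y chosen as the index of the highest score; and the layer widths, the random initialization range and all function names are hypothetical.

```python
import random

def dense_relu(x, weights, biases):
    """Fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def init_layer(rng, n_in, n_out):
    """Random initialization, one of the options named in the text."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_model(state_size, hidden_size, num_actions, seed=0):
    """Seven layers: input layer, five hidden layers, output layer."""
    rng = random.Random(seed)
    sizes = ([(state_size, hidden_size)]
             + [(hidden_size, hidden_size)] * 5
             + [(hidden_size, num_actions)])
    return [init_layer(rng, n_in, n_out) for n_in, n_out in sizes]

def combine(a, b):
    """ASSUMPTION: element-wise sum stands in for the XOR named in the text."""
    return [x + y for x, y in zip(a, b)]

def forward(model, state):
    h1 = dense_relu(state, *model[0])             # input layer
    h2 = dense_relu(h1, *model[1])                # hidden layer 1
    h3 = dense_relu(h2, *model[2])                # hidden layer 2
    h4 = dense_relu(h3, *model[3])                # hidden layer 3
    h5 = dense_relu(combine(h4, h2), *model[4])   # hidden layer 4
    h6 = dense_relu(h5, *model[5])                # hidden layer 5
    return dense_relu(combine(h6, h4), *model[6]) # output layer

def choose_indication_value(model, state):
    """ASSUMPTION: Y is the index of the largest output score."""
    scores = forward(model, state)
    return max(range(len(scores)), key=scores.__getitem__)

model = build_model(state_size=6, hidden_size=8, num_actions=4)
state = [0.6, 0.2, 0.4, 0.0, 1.0, 5.0]  # hypothetical channel environment information
y = choose_indication_value(model, state)
```

All hidden layers share one width so that the skip connections can be combined element-wise; a real model would likely be built with a tensor library rather than nested lists.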
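It should be understood that the training device may be the first terminal device, the network device, or another device such as a server or a computer. When the training device is not the first terminal device, the training device determines the parameters of the neurons in each layer of the channel aggregation model and then sends them to the first terminal device.
In some implementations, as shown in FIG. 9B, after the channel aggregation model outputs a channel aggregation indication value (for example the (t-1)-th channel aggregation indication value) based on the channel environment information ($S_{t-1}$) of a time period (for example the (t-1)-th time period), the first terminal device may also check whether the data packets it sends on the aggregated channel collide with data packets sent by other terminal devices. Based on the channel aggregation mode indicated by that value, the outcome of sending on the aggregated channel and the load of each channel in that time period, it assigns a reward value (for example $R_t$) to the indication value output by the model, and also uses the reward value as an input of the model in the next time period (for example the t-th time period). This guides the model to learn from the load on each channel so that it outputs an optimal channel aggregation decision. Optionally, the channel aggregation indication value output by the model for a time period (for example the (t-1)-th channel aggregation indication value) may also be used as an input of the model in the next time period (for example the t-th time period).
In a possible implementation, the first terminal device may determine the reward value of a channel aggregation indication value obtained from the channel aggregation model, that is, the reward value for the first terminal device performing the decision action obtained from the model (the channel aggregation mode corresponding to the indication value), as follows. The description below takes the t-th time period as the time period and $R_{t+1}$ as the reward value: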
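when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N secondary channels does not collide with data packets sent by other terminal devices, and N is not zero, the first terminal device determines the reward value of the t-th channel aggregation indication value according to [formula not reproduced];
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N secondary channels does not collide with data packets sent by other terminal devices, and N is zero, the first terminal device determines the reward value of the t-th channel aggregation indication value according to [formula not reproduced];
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N secondary channels collides with data packets sent by other terminal devices, and N is not zero, the first terminal device determines the reward value of the t-th channel aggregation indication value according to [formula not reproduced];
when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N secondary channels collides with data packets sent by other terminal devices, and N is zero, the first terminal device determines the reward value of the t-th channel aggregation indication value according to [formula not reproduced].
In the above cases, $R_{t+1}$ denotes the reward value of the t-th channel aggregation indication value obtained from the channel aggregation model, K denotes the K-th of the N secondary channels, K = 1, 2, ..., N, [symbol not reproduced] denotes the load information of the K-th secondary channel in the t-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the t-th time period.
In other implementations, the first terminal device may also determine the reward value $R_{t+1}$ of the t-th channel aggregation indication value from the mean of the load information (for example the load values) of the primary channel and the N secondary channels in the t-th time period. For example, when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N secondary channels collides with data packets sent by other terminal devices, the product of -1 and the mean load value of the primary channel and the N secondary channels in the t-th time period is used as the reward value $R_{t+1}$; when the data packet does not collide with data packets sent by other terminal devices, the mean load value of the primary channel and the N secondary channels in the t-th time period is used as the reward value $R_{t+1}$.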
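The mean-load variant of the reward is simple enough to sketch directly: the reward is the mean load of the aggregated channels, negated when the transmission collided. The function name and the example load values are illustrative assumptions. The four case-specific formulas listed earlier are not covered by this sketch.

```python
def mean_load_reward(primary_load, secondary_loads, collided):
    """Reward from the mean load of the primary and the N aggregated secondaries.

    secondary_loads holds the load values of the N secondary channels that
    were aggregated (empty when N is zero).
    """
    loads = [primary_load] + list(secondary_loads)
    mean = sum(loads) / len(loads)
    return -mean if collided else mean

reward_ok = mean_load_reward(0.6, [0.2, 0.4], collided=False)
reward_collision = mean_load_reward(0.6, [0.2, 0.4], collided=True)
```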
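The above takes the t-th time period as the time period and $R_{t+1}$ as the reward value of the t-th channel aggregation indication value. It can be understood that for other time periods, such as the (t-1)-th time period (the time period before the t-th time period), the first terminal device may likewise determine the reward value $R_t$ of the (t-1)-th channel aggregation indication value based on the load information, in the (t-1)-th time period, of the primary channel and of each of N' secondary channels, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel, and N' is equal to or different from N.
For example: when the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels does not collide with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced].
When the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels does not collide with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced].
When the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels collides with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced].
When the data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels collides with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced].
In the above cases, $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, ..., N', [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.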
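S603: In the (t+1)-th time period, the first terminal device sends data packets on the channel obtained by aggregating the primary channel and the N secondary channels. The (t+1)-th time period is the time period after the t-th time period.
After the first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model and obtains the t-th channel aggregation indication value, it can aggregate the primary channel with the N of the M-1 secondary channels indicated by that value and, in the (t+1)-th time period following the t-th time period, send data packets to the network device on the aggregated channel.
In some implementations, to make the channel aggregation decisions (channel aggregation indication values) made by the model match the user's expectations, the user may preconfigure a state-action value function Q that evaluates taking different decision actions a (the channel aggregation modes corresponding to different channel aggregation indication values Y) under various channel environment information S. The function evaluates the decision action a output by the model for the channel environment information S of a time period (that is, the channel aggregation mode corresponding to the output indication value Y), which yields a first state-action value. The function Q is also used to evaluate each of the possible decision actions for the channel environment information S of that time period (the channel aggregation modes corresponding to all possible indication values Y), which yields multiple state-action values, the largest of which is taken as the second state-action value. The loss of the channel aggregation model is then determined from the second state-action value, the first state-action value and the reward value, and the model is trained and updated, for example by updating the parameters of its neurons with stochastic gradient descent according to the loss.
Taking the t-th time period as an example, the loss of the channel aggregation model may be determined with the following expected squared reward value function (also called the loss function).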
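$$L(\theta)=E\left[\left(R_{t+1}+\gamma\max_{a'}Q(s_t',a';\theta^{*})-Q(s_t,a_t;\theta)\right)^{2}\right]$$
where $L(\cdot)$ denotes the expected squared reward value function, $L(\theta)$ denotes the loss of the channel aggregation model, $Q(\cdot)$ denotes the set state-action value function, $\gamma$ denotes the discount factor (for example 0.9), $\theta$ denotes the current parameters of the channel aggregation model, and $R_{t+1}$ denotes the reward value of the decision action $a_t$ obtained from the channel aggregation model (that is, of the channel aggregation mode corresponding to the t-th channel aggregation indication value). $Q(s_t,a_t;\theta)$ denotes the first state-action value of performing decision action $a_t$ (the channel aggregation mode corresponding to the output t-th channel aggregation indication value) under the channel environment information $s_t$ of the t-th time period. $\max_{a'}Q(s_t',a';\theta^{*})$ denotes the largest of the state-action values (the second state-action value) of performing each available decision action a (the candidate channel aggregation modes corresponding to the $2^{M-1}-1$ candidate channel aggregation indication values) under the channel environment information $s_t$ of the t-th time period, $a'$ denotes the decision action that yields the second state-action value, and $\theta^{*}$ denotes the parameters of the target channel aggregation model, that is, the parameters of the channel aggregation model when it outputs the decision action $a'$ (the channel aggregation indication value corresponding to $a'$) that yields the second state-action value.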
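The loss above can be evaluated for concrete numbers with a short sketch. It assumes the state-action values have already been computed by the model (the Q-values are passed in as plain numbers), and it approximates the expectation by the mean over a batch of transitions; the function names and the sample values are hypothetical.

```python
def td_loss(reward, q_taken, q_candidates, gamma=0.9):
    """Squared error between the target and the first state-action value.

    reward       -- R_{t+1}, the reward of the decision action a_t
    q_taken      -- Q(s_t, a_t; theta), the first state-action value
    q_candidates -- state-action values of all candidate actions under the
                    target parameters theta*; their maximum is the second
                    state-action value
    """
    target = reward + gamma * max(q_candidates)
    return (target - q_taken) ** 2

def batch_loss(transitions, gamma=0.9):
    """Mean of td_loss over (reward, q_taken, q_candidates) tuples."""
    losses = [td_loss(r, q, qs, gamma) for r, q, qs in transitions]
    return sum(losses) / len(losses)

single = td_loss(reward=1.0, q_taken=1.5, q_candidates=[0.5, 2.0])
```

Here the target is 1.0 + 0.9 * 2.0 = 2.8, so the loss is (2.8 - 1.5)^2 = 1.69. A gradient step on the model parameters would then reduce this value.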
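The above takes the t-th time period as an example for determining the loss of the channel aggregation model. It can be understood that for another time period (for example the (t-1)-th time period), replacing the reward value, first state-action value and second state-action value of the t-th time period with those of the (t-1)-th time period gives the loss of the channel aggregation model for the (t-1)-th time period, which is then used to train and update the model.
In addition, it should be understood that the above description takes as an example the case in which the channel aggregation model resides on the first terminal device: the first terminal device processes the input channel environment information of the t-th time period with the model to obtain the channel aggregation indication value, and performs channel aggregation in the (t+1)-th time period according to the channel aggregation mode indicated by that value. In some implementations, the channel aggregation model may instead be deployed on the network device. The network device obtains the channel environment information of the first terminal device for the t-th time period, inputs it into the model and processes it to obtain the channel aggregation indication value, and then sends the indication value, or the channel aggregation mode it indicates, to the first terminal device, which performs channel aggregation accordingly.
It can be understood that, to implement the functions in the above embodiments, the first terminal device includes hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, in combination with the units and method steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
FIG. 10 and FIG. 11 are schematic structural diagrams of possible communication apparatuses provided by embodiments of this application. These communication apparatuses can implement the functions of the first terminal device in the above method embodiments and can therefore also achieve the beneficial effects of those embodiments. In a possible implementation, the communication apparatus may be the first terminal device or a module (such as a chip) applied to the first terminal device.
As shown in FIG. 10, the communication apparatus 1000 includes a processing unit 1010 and an interface unit 1020, where the interface unit 1020 may also be a transceiver unit or an input/output interface. The communication apparatus 1000 can implement the functions of the first terminal device in the method embodiment shown in FIG. 6.
When the communication apparatus 1000 implements the functions of the first terminal device in the method embodiment shown in FIG. 6: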
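The interface unit 1020 is configured to receive a load report from a network device, where the load report includes load information, in the t-th time period, of each of M channels of the network device, the M channels include one primary channel and M-1 secondary channels corresponding to the first terminal device, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2. The processing unit 1010 is configured to input the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, where the channel environment information of the t-th time period includes the load information, in the t-th time period, of the primary channel and of each of the M-1 secondary channels, together with channel state monitoring information obtained by monitoring the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and to aggregate the primary channel and the N secondary channels. Optionally, the load report further includes the end time of the t-th time period.
In a possible design, the channel state monitoring information obtained by the processing unit 1010 by monitoring the primary channel and the M-1 secondary channels in the t-th time period includes one or more of the following: the busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels, as monitored by the processing unit 1010 in the t-th time period; the data packet sending state of the communication apparatus, in each time unit, on the primary channel and on each of the M-1 secondary channels, as monitored by the processing unit 1010 in the t-th time period; and the number of consecutive time units, as monitored by the processing unit 1010 in the t-th time period, during which the data packet sending state of the communication apparatus and the busy/idle state of the channel both remain unchanged on the primary channel and on each of the M-1 secondary channels.
In a possible design, the processing unit 1010 is further configured to: determine, based on the load information, in the (t-1)-th time period, of the primary channel and of each of N' secondary channels, a reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel; determine, based on the channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value and a set state-action value function, a first state-action value of performing, under the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value; determine a second state-action value based on the channel environment information of the (t-1)-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, where the $2^{M-1}-1$ candidate channel aggregation indication values correspond to $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum of the state-action values of performing each candidate channel aggregation mode under the channel environment information of the (t-1)-th time period; determine a loss of the channel aggregation model based on the first state-action value, the second state-action value and the reward value of the (t-1)-th channel aggregation indication value; and train and update the channel aggregation model based on that loss; where N' is equal to or different from N, and the (t-1)-th time period is the time period before the t-th time period.
In a possible design, the processing unit 1010 is further configured to determine, based on the load information, in the (t-1)-th time period, of the primary channel and of each of N' secondary channels, a reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel, N' is equal to or different from N, and the (t-1)-th time period is the time period before the t-th time period; when inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, the processing unit 1010 is specifically configured to input the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing to obtain the t-th channel aggregation indication value.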
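In one possible implementation, when determining the reward value of the (t-1)-th channel aggregation indication value based on the load information, in the (t-1)-th time period, of the primary channel and of each of the N' secondary channels, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N' secondary channels does not collide with data packets sent by other terminal devices, and N' is not zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, ..., N', and [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period.
In another possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N' secondary channels does not collide with data packets sent by other terminal devices, and N' is zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
In yet another possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N' secondary channels collides with data packets sent by other terminal devices, and N' is not zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, ..., N', [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
In still another possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N' secondary channels collides with data packets sent by other terminal devices, and N' is zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula not reproduced]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.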
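In a possible design, the processing unit 1010 is further configured to determine, based on the load information, in the t-th time period, of the primary channel and of each of the N secondary channels, the reward value of the t-th channel aggregation indication value obtained from the channel aggregation model.
In one possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N secondary channels does not collide with data packets sent by other terminal devices, and N is not zero, determine the reward value of the t-th channel aggregation indication value according to [formula not reproduced].
In another possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N secondary channels does not collide with data packets sent by other terminal devices, and N is zero, determine the reward value of the t-th channel aggregation indication value according to [formula not reproduced].
In yet another possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N secondary channels collides with data packets sent by other terminal devices, and N is not zero, determine the reward value of the t-th channel aggregation indication value according to [formula not reproduced].
In still another possible implementation, when determining that reward value, the processing unit 1010 is specifically configured to: when the data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N secondary channels collides with data packets sent by other terminal devices, and N is zero, determine the reward value of the t-th channel aggregation indication value according to [formula not reproduced];
In the above designs, $R_{t+1}$ denotes the reward value of the t-th channel aggregation indication value obtained from the channel aggregation model, K denotes the K-th of the N secondary channels, K = 1, 2, ..., N, [symbol not reproduced] denotes the load information of the K-th secondary channel in the t-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the t-th time period.
In a possible design, the processing unit 1010 is further configured to: determine, based on the channel environment information of the t-th time period, the t-th channel aggregation indication value and the set state-action value function, a first state-action value of performing, under the channel environment information of the t-th time period, the channel aggregation mode corresponding to the t-th channel aggregation indication value; determine a second state-action value based on the channel environment information of the t-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, where the $2^{M-1}-1$ candidate channel aggregation indication values correspond to $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum of the state-action values of performing each candidate channel aggregation mode under the channel environment information of the t-th time period; determine the loss of the channel aggregation model based on the first state-action value, the second state-action value and the reward value of the t-th channel aggregation indication value obtained from the channel aggregation model; and train and update the channel aggregation model based on that loss.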
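As shown in FIG. 11, this application also provides a communication apparatus 1100 including a processor 1110 and an interface circuit 1120, which are coupled to each other. It can be understood that the interface circuit 1120 may be a transceiver, an input/output interface, an input interface, an output interface, a communication interface, or the like. Optionally, the communication apparatus 1100 may further include a memory 1130 for storing the instructions executed by the processor 1110, the input data the processor 1110 needs to run the instructions, or the data generated after the processor 1110 runs the instructions. Optionally, the memory 1130 may be integrated with the processor 1110.
When the communication apparatus 1100 implements the method shown in FIG. 6, the processor 1110 can implement the functions of the processing unit 1010, and the interface circuit 1120 can implement the functions of the interface unit 1020.
FIG. 12 is a schematic diagram of a device structure provided by an embodiment of this application. The device may be a network device or the first terminal device and may include a processor, a transceiver and an antenna. The processor may include one or more processing units; different processing units may be independent devices or may be integrated into one or more processors. The processor can be the nerve center and command center of the device. It can generate operation control signals from instruction opcodes and timing signals to fetch and execute instructions. In the embodiments, the processor can execute the channel aggregation method flow according to the instructions corresponding to the channel aggregation method; the transceiver and the antenna can receive signals from other devices and pass them to the processor, or send signals from the processor to other devices.
The device may also include a neural-network processing unit (NPU), which trains and updates the channel aggregation model (that is, the neural network model) and, based on the information input to the model, computes and outputs the channel aggregation mode (or the channel aggregation indication value corresponding to that mode). It can be understood that the NPU may contain an inference module and a training module, where the training module trains and updates the channel aggregation model (that is, the neural network model) and the inference module computes and outputs the channel aggregation mode based on the information input to the model. The NPU may also be coupled to the central processing unit; this application does not limit this.
It can be understood that the processor in the embodiments of this application may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a logic circuit, a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. A general-purpose processor may be a microprocessor or any conventional processor.
The method steps in the embodiments of this application may be implemented in hardware or by a processor executing software instructions. The software instructions may consist of corresponding software modules, which may be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, registers, a hard disk, a removable hard disk, a CD-ROM or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from and write information to the storage medium. The storage medium may also be part of the processor. The processor and the storage medium may be located in an ASIC, and the ASIC may be located in a network device or terminal device. The processor and the storage medium may also exist as discrete components in a network device or terminal device.
The above embodiments may be implemented in whole or in part by software, hardware, firmware or any combination thereof. When implemented in software, they may be implemented in whole or in part as a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment or another programmable apparatus. The computer programs or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one network device, terminal, computer, server or data center to another by wired or wireless means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium, such as a floppy disk, hard disk or magnetic tape; an optical medium, such as a digital video disc; or a semiconductor medium, such as a solid-state drive. The computer-readable storage medium may be volatile or non-volatile, or may include both volatile and non-volatile types of storage media.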
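In the embodiments of this application, unless otherwise specified or logically conflicting, the terms and/or descriptions in different embodiments are consistent and may be cross-referenced, and the technical features in different embodiments may be combined according to their inherent logical relationships to form new embodiments.
In addition, it should be understood that, in the embodiments of this application, the word "example" is used to mean serving as an example, illustration, or explanation. Any embodiment or design described as an "example" in this application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the word "example" is intended to present a concept in a concrete manner.
It can be understood that the various numerals used in the embodiments of this application are merely for ease of description and are not intended to limit the scope of the embodiments of this application. The sequence numbers of the foregoing processes do not imply an order of execution; the order in which the processes are executed should be determined by their functions and internal logic.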

Claims (21)

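  1. A channel aggregation method, characterized by comprising:
    receiving, by a first terminal device, a load report from a network device, wherein the load report comprises load information of each of M channels of the network device in a t-th time period, the M channels comprise one primary channel and M-1 secondary channels corresponding to the first terminal device, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2;
    inputting, by the first terminal device, channel environment information of the t-th time period into the channel aggregation model for processing to obtain a t-th channel aggregation indication value, wherein the channel environment information of the t-th time period comprises the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, and channel state monitoring information obtained by the first terminal device by performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and
    sending, by the first terminal device, a data packet in a (t+1)-th time period over the channel formed by aggregating the primary channel and the N secondary channels, wherein the (t+1)-th time period is a time period after the t-th time period.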
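  2. The method according to claim 1, characterized in that the channel state monitoring information obtained by the first terminal device by performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period comprises one or more of the following:
    a busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels, as monitored by the first terminal device in the t-th time period;
    a data packet sending state of the first terminal device, in each time unit, on the primary channel and on each of the M-1 secondary channels, as monitored by the first terminal device in the t-th time period;
    a number of consecutive time units during which both the data packet sending state of the first terminal device and the busy/idle state of the channel remain unchanged, on the primary channel and on each of the M-1 secondary channels, as monitored by the first terminal device in the t-th time period.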
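  3. The method according to claim 1 or 2, characterized in that the method further comprises:
    determining, by the first terminal device based on load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value indicates that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel;
    determining, by the first terminal device based on channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and a set state-action value function, a first state-action value of performing, based on the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value;
    determining, by the first terminal device, a second state-action value based on the channel environment information of the (t-1)-th time period, 2^(M-1)-1 candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, wherein the 2^(M-1)-1 candidate channel aggregation indication values correspond to 2^(M-1)-1 candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values of respectively performing, based on the channel environment information of the (t-1)-th time period, the candidate channel aggregation modes corresponding to the 2^(M-1)-1 candidate channel aggregation indication values;
    determining, by the first terminal device, a loss of the channel aggregation model based on the first state-action value, the second state-action value, and the reward value for the (t-1)-th channel aggregation indication value; and
    training and updating, by the first terminal device, the channel aggregation model based on the loss of the channel aggregation model;
    wherein N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period.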
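  4. The method according to claim 1 or 2, characterized in that the method further comprises:
    determining, by the first terminal device based on load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value indicates that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period;
    wherein the inputting, by the first terminal device, the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value comprises:
    inputting, by the first terminal device, the channel environment information of the t-th time period and the reward value for the (t-1)-th channel aggregation indication value into the channel aggregation model for processing to obtain the t-th channel aggregation indication value.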
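  5. The method according to claim 3 or 4, characterized in that the determining, by the first terminal device based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model comprises:
    when a data packet sent by the first terminal device on the channel formed by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is not zero, determining, by the first terminal device according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, …, N', and [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period.
  6. The method according to claim 3 or 4, characterized in that the determining, by the first terminal device based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model comprises:
    when a data packet sent by the first terminal device on the channel formed by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is zero, determining, by the first terminal device according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
  7. The method according to claim 3 or 4, characterized in that the determining, by the first terminal device based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model comprises:
    when a data packet sent by the first terminal device on the channel formed by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero, determining, by the first terminal device according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, …, N', [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
  8. The method according to claim 3 or 4, characterized in that the determining, by the first terminal device based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model comprises:
    when a data packet sent by the first terminal device on the channel formed by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero, determining, by the first terminal device according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.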
  9. The method according to any one of claims 1 to 8, characterized in that the load report further comprises an end time of the t-th time period.
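  10. A communication apparatus, characterized by comprising an interface unit and a processing unit, wherein:
    the interface unit is configured to receive a load report from a network device, wherein the load report comprises load information of each of M channels of the network device in a t-th time period, the M channels comprise one primary channel and M-1 secondary channels corresponding to the first terminal device, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2; and
    the processing unit is configured to: input channel environment information of the t-th time period into the channel aggregation model for processing to obtain a t-th channel aggregation indication value, wherein the channel environment information of the t-th time period comprises the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, and channel state monitoring information obtained by performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and send a data packet in a (t+1)-th time period over the channel formed by aggregating the primary channel and the N secondary channels, wherein the (t+1)-th time period is a time period after the t-th time period.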
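  11. The apparatus according to claim 10, characterized in that the channel state monitoring information obtained by the processing unit by performing channel state monitoring on the primary channel and the M-1 secondary channels in the t-th time period comprises one or more of the following:
    a busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels, as monitored by the processing unit in the t-th time period;
    a data packet sending state of the communication apparatus, in each time unit, on the primary channel and on each of the M-1 secondary channels, as monitored by the processing unit in the t-th time period;
    a number of consecutive time units during which both the data packet sending state of the communication apparatus and the busy/idle state of the channel remain unchanged, on the primary channel and on each of the M-1 secondary channels, as monitored by the processing unit in the t-th time period.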
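  12. The apparatus according to claim 10 or 11, characterized in that the processing unit is further configured to:
    determine, based on load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value indicates that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel;
    determine, based on channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and a set state-action value function, a first state-action value of performing, based on the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value;
    determine a second state-action value based on the channel environment information of the (t-1)-th time period, 2^(M-1)-1 candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, wherein the 2^(M-1)-1 candidate channel aggregation indication values correspond to 2^(M-1)-1 candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values of respectively performing, based on the channel environment information of the (t-1)-th time period, the candidate channel aggregation modes corresponding to the 2^(M-1)-1 candidate channel aggregation indication values;
    determine a loss of the channel aggregation model based on the first state-action value, the second state-action value, and the reward value for the (t-1)-th channel aggregation indication value; and train and update the channel aggregation model based on the loss of the channel aggregation model;
    wherein N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period.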
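  13. The apparatus according to claim 10 or 11, characterized in that the processing unit is further configured to:
    determine, based on load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value indicates that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period;
    wherein, when inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, the processing unit is specifically configured to input the channel environment information of the t-th time period and the reward value for the (t-1)-th channel aggregation indication value into the channel aggregation model for processing to obtain the t-th channel aggregation indication value.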
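  14. The apparatus according to claim 12 or 13, characterized in that, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit is specifically configured to:
    when a data packet sent by the interface unit on the channel formed by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is not zero, determine, according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, …, N', and [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period.
  15. The apparatus according to claim 12 or 13, characterized in that, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit is specifically configured to:
    when a data packet sent by the interface unit on the channel formed by aggregating the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is zero, determine, according to [formula not reproduced], the reward value for the t-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
  16. The apparatus according to claim 12 or 13, characterized in that, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit is specifically configured to:
    when a data packet sent by the interface unit on the channel formed by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero, determine, according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, K denotes the K-th of the N' secondary channels, K = 1, 2, …, N', [symbol not reproduced] denotes the load information of the K-th secondary channel in the (t-1)-th time period, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.
  17. The apparatus according to claim 12 or 13, characterized in that, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit is specifically configured to:
    when a data packet sent by the interface unit on the channel formed by aggregating the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero, determine, according to [formula not reproduced], the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model;
    wherein R_t denotes the reward value for the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and [symbol not reproduced] denotes the load information of the primary channel in the (t-1)-th time period.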
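  18. The apparatus according to any one of claims 10 to 17, characterized in that the load report further comprises an end time of the t-th time period.
  19. A computer program product, characterized by comprising instructions that, when executed, cause the method according to any one of claims 1 to 9 to be implemented.
  20. A chip, characterized in that the chip is configured to implement the method according to any one of claims 1 to 9.
  21. A computer-readable storage medium, characterized in that the storage medium stores a computer program or instructions that, when executed, cause the method according to any one of claims 1 to 9 to be implemented.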
PCT/CN2023/115350 2022-08-31 2023-08-28 一种信道聚合方法及装置 WO2024046286A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211066591.3 2022-08-31
CN202211066591.3A CN117693035A (zh) 2022-08-31 2022-08-31 一种信道聚合方法及装置

Publications (1)

Publication Number Publication Date
WO2024046286A1 true WO2024046286A1 (zh) 2024-03-07

Family

ID=90100358

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/115350 WO2024046286A1 (zh) 2022-08-31 2023-08-28 一种信道聚合方法及装置

Country Status (2)

Country Link
CN (1) CN117693035A (zh)
WO (1) WO2024046286A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180049186A1 (en) * 2015-03-09 2018-02-15 Kt Corporation Method for transmitting channel state information and device therefor
CN114616762A (zh) * 2019-10-25 2022-06-10 Oppo广东移动通信有限公司 用于传输信道状态信息的方法和设备
CN114698138A (zh) * 2020-12-29 2022-07-01 华为技术有限公司 一种信道接入方法和装置
CN114885426A (zh) * 2022-05-05 2022-08-09 南京航空航天大学 一种基于联邦学习和深度q网络的5g车联网资源分配方法
WO2022171536A1 (en) * 2021-02-15 2022-08-18 Nokia Technologies Oy Machine learning model distribution

Also Published As

Publication number Publication date
CN117693035A (zh) 2024-03-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23859316

Country of ref document: EP

Kind code of ref document: A1