CN117693035A

CN117693035A - A channel aggregation method and device

Info

Publication number: CN117693035A
Application number: CN202211066591.3A
Authority: CN
Inventors: 舒同欣; 刘鹏; 郭子阳; 罗嘉俊; 杨讯; 颜敏
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2022-08-31
Filing date: 2022-08-31
Publication date: 2024-03-12
Also published as: WO2024046286A1

Abstract

This application relates to the field of communication technology and discloses a channel aggregation method and device that aim to make optimal channel aggregation decisions and to address the low throughput and large delay of channel aggregation. The method includes: a first terminal device receives a load report, where the load report includes load information of each of M channels of a network device in the t-th time period, and the M channels include one primary channel corresponding to the first terminal device and M-1 secondary channels; the first terminal device inputs channel environment information of the t-th time period into a channel aggregation model for processing and obtains a t-th channel aggregation indication value, where the channel environment information of the t-th time period includes the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, and the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel; and in the (t+1)-th time period, the first terminal device sends data packets over the channel obtained by aggregating the primary channel and the N secondary channels.

Description

A channel aggregation method and device

Technical field

The present application relates to the field of communication technology, and in particular to a channel aggregation method and device.

Background

To cope with the shortage of spectrum resources and the growth in service traffic, the communication standards formulated by the Institute of Electrical and Electronics Engineers (IEEE) introduced channel aggregation technology. Specifically, channel aggregation can start from a primary channel and aggregate it with the secondary channels adjacent to it, supporting a larger channel bandwidth and thereby increasing the data transmission rate.

Currently, channel aggregation methods fall into two categories: static channel aggregation and dynamic channel aggregation. In static channel aggregation, once the primary channel is idle, the device must wait until all secondary channels are also idle before it can aggregate. In dynamic channel aggregation, when the primary channel is idle, the device aggregates the primary channel with whichever secondary channels happen to be idle at that moment.
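The difference between the two categories can be illustrated with a minimal Python sketch. The function names and the boolean idle flags are assumptions made for illustration and do not appear in the application; returning an empty list in the dynamic case (primary channel only) is likewise an assumption about how "no idle secondary channel" would be handled.

```python
from typing import List, Optional


def static_aggregation(primary_idle: bool,
                       secondary_idle: List[bool]) -> Optional[List[int]]:
    """Static rule: aggregate only when the primary and ALL secondary channels are idle.

    Returns the indices of the secondary channels to aggregate, or None if
    the device has to keep waiting.
    """
    if primary_idle and all(secondary_idle):
        return list(range(len(secondary_idle)))
    return None


def dynamic_aggregation(primary_idle: bool,
                        secondary_idle: List[bool]) -> Optional[List[int]]:
    """Dynamic rule: when the primary is idle, take whichever secondaries are idle now.

    Returns None while the primary channel is busy; an empty list means the
    device would use the primary channel alone (illustrative assumption).
    """
    if not primary_idle:
        return None
    return [i for i, idle in enumerate(secondary_idle) if idle]
```

With one busy secondary channel, the static rule keeps waiting while the dynamic rule proceeds with the idle one, which is the behavioral gap the two descriptions above point to.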

However, with the above methods, when multiple terminal devices contend for channel resources, the data packets sent by the terminal devices collide at a high rate and the terminal devices repeatedly enter the backoff window to wait before sending, which results in low channel aggregation throughput and large delay.

Summary of the invention

Embodiments of the present application provide a channel aggregation method and device that aim to make optimal channel aggregation decisions and to address the low throughput and large delay of channel aggregation.

In a first aspect, embodiments of the present application provide a channel aggregation method. The method may be executed by a first terminal device, by a component of the first terminal device (such as a processor, a chip, or a chip system), or by a logic module or software that implements all or part of the functions of the first terminal device. The following description takes execution by the first terminal device as an example. The method includes: the first terminal device receives a load report from a network device, where the load report includes load information of each of M channels of the network device in the t-th time period, the M channels include one primary channel corresponding to the first terminal device and M-1 secondary channels, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2; the first terminal device inputs channel environment information of the t-th time period into a channel aggregation model for processing and obtains a t-th channel aggregation indication value, where the channel environment information of the t-th time period includes the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, together with channel state monitoring information obtained by the first terminal device by monitoring the channel state of the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and the first terminal device aggregates the primary channel and the N secondary channels, that is, the first terminal device can send data packets in the (t+1)-th time period over the channel obtained by aggregating the primary channel and the N secondary channels, where the (t+1)-th time period is a time period after the t-th time period.

Optionally, the load report may also include the deadline (end time) of the t-th time period.

With the above method, the first terminal device can obtain accurate load information of each channel from the network device and combine it with the channel state monitoring information obtained from its own monitoring (for example, information about the data packets the first terminal device sent on each channel). Based on the real-time load and state of the channels, it uses artificial intelligence (AI), that is, the predictive ability of the channel aggregation model, to make a preferred channel aggregation decision. This helps reduce the probability that data sent by the first terminal device on the aggregated channel collides with data packets sent by other terminal devices, improves the transmission performance of the aggregated channel, and addresses the low throughput and large delay of channel aggregation.

In a possible design, the channel state monitoring information obtained by the first terminal device by monitoring the primary channel and the M-1 secondary channels in the t-th time period may include, but is not limited to, one or more of the following: the busy/idle state, in each time unit of the t-th time period, of the primary channel and of each of the M-1 secondary channels; the data packet sending state of the first terminal device, in each time unit of the t-th time period, on the primary channel and on each of the M-1 secondary channels; and, for the primary channel and each of the M-1 secondary channels, the number of consecutive time units in the t-th time period during which both the data packet sending state of the first terminal device and the busy/idle state of the channel remain unchanged.
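As an illustration of how the three kinds of monitoring information could be combined with the per-channel load information into one model input, the following Python sketch flattens them into a single list. The encoding (0/1 flags per time unit, one load value per channel), the function names, and the choice to count the most recent unchanged run at the end of the period are all assumptions; the application does not prescribe a layout in this passage.

```python
from typing import List, Sequence


def unchanged_run_length(busy: Sequence[int], sending: Sequence[int]) -> int:
    """Number of most recent consecutive time units in which both the busy/idle
    flag and the sending flag equal their latest values (assumed interpretation)."""
    if len(busy) != len(sending):
        raise ValueError("busy and sending must cover the same time units")
    if not busy:
        return 0
    latest = (busy[-1], sending[-1])
    run = 0
    for pair in zip(reversed(busy), reversed(sending)):
        if pair != latest:
            break
        run += 1
    return run


def build_state(loads: Sequence[float],
                busy: Sequence[Sequence[int]],
                sending: Sequence[Sequence[int]]) -> List[float]:
    """Flatten load information and monitoring information for M channels.

    loads[c]   : reported load of channel c in the period (hypothetical scalar)
    busy[c]    : per-time-unit busy (1) / idle (0) flags of channel c
    sending[c] : per-time-unit flags, 1 if the device was sending on channel c
    """
    state: List[float] = list(loads)
    for ch_busy, ch_send in zip(busy, sending):
        state.extend(ch_busy)
        state.extend(ch_send)
        state.append(unchanged_run_length(ch_busy, ch_send))
    return state
```

For M channels observed over T time units, this layout yields M + M * (2T + 1) values; any other fixed ordering would serve equally well as long as training and inference use the same one.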

In the above design, the first terminal device monitors each channel in terms of its busy/idle state and the device's own data packet transmissions on it. This allows the channel aggregation model to make optimal channel aggregation decisions based on the real-time load and state of the channels, thereby improving the transmission performance of the aggregated channel.

In a possible design, the method further includes: the first terminal device determines, according to the load information of the primary channel and of each of N' secondary channels in the (t-1)-th time period, a reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel; the first terminal device determines, according to the channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and a set state-action value function, a first state-action value, which is the value of applying, under the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value; the first terminal device determines a second state-action value according to the channel environment information of the (t-1)-th time period, the 2^{M-1}-1 candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, where the 2^{M-1}-1 candidate channel aggregation indication values correspond to the 2^{M-1}-1 candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the largest of the state-action values obtained by applying, under the channel environment information of the (t-1)-th time period, each of these candidate channel aggregation modes; the first terminal device determines the loss of the channel aggregation model according to the first state-action value, the second state-action value, and the reward value of the (t-1)-th channel aggregation indication value; and the first terminal device trains and updates the channel aggregation model according to that loss. Here N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period.
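This passage names the three quantities that enter the loss (the first state-action value, the second state-action value, and the reward value) but does not give the formula itself. The sketch below therefore assumes the squared temporal-difference error commonly used in deep Q-learning, with a hypothetical discount factor `gamma`; it should be read as one plausible instance, not as the loss defined by the application.

```python
from typing import Sequence


def second_state_action_value(candidate_q: Sequence[float]) -> float:
    """Largest state-action value over all candidate aggregation modes."""
    if not candidate_q:
        raise ValueError("at least one candidate value is required")
    return max(candidate_q)


def td_loss(q_first: float,
            candidate_q: Sequence[float],
            reward: float,
            gamma: float = 0.9) -> float:
    """Squared TD error (assumed form): (reward + gamma * max_a Q - Q_first)^2.

    q_first     : value of the aggregation mode that was actually chosen
    candidate_q : values of every candidate aggregation mode
    reward      : reward assigned to the chosen aggregation indication value
    gamma       : discount factor, a hypothetical hyperparameter
    """
    target = reward + gamma * second_state_action_value(candidate_q)
    return (target - q_first) ** 2
```

For example, with `q_first = 1.0`, candidate values `[0.5, 2.0, 1.5]`, `reward = 1.0`, and `gamma = 0.5`, the target is 2.0 and the loss is 1.0. In a neural-network implementation this scalar would be minimized by gradient descent over the model's weights and biases.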

In the above design, after the channel aggregation model makes a channel aggregation decision (that is, outputs a channel aggregation indication value), the first terminal device can check whether the data packets it sends on the aggregated channel collide with data packets sent by other terminal devices. Based on the decision, the outcome of sending on the aggregated channel, and the load of each channel, it assigns different rewards to the decisions made by the model. This guides the model to learn from the load on each channel so that it outputs optimal channel aggregation decisions.

In a possible design, the method further includes: the first terminal device determines, according to the load information of the primary channel and of each of N' secondary channels in the (t-1)-th time period, a reward value for the (t-1)-th channel aggregation indication value obtained from the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period;

the first terminal device inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value includes: the first terminal device inputs the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing, and obtains the t-th channel aggregation indication value.

With the above method, a reward policy can also be set that assigns different rewards to the channel aggregation decisions made by the channel aggregation model (that is, the channel aggregation indication values it outputs). The assigned reward value also serves as a factor in the model's next channel aggregation decision, so that the model makes channel aggregation decisions that meet user needs.

Optionally, the first terminal device determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model may include the following cases. The cases may be used in combination or independently, and this application does not limit how they are combined:

When the data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels do not collide with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula missing from the source text];

When the data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels do not collide with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula missing from the source text];

When the data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels collide with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula missing from the source text];

When the data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels collide with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value according to [formula missing from the source text];

In each of the above cases, R_t denotes the reward value of the (t-1)-th channel aggregation indication value obtained from the channel aggregation model; K indexes the K-th of the N' secondary channels, K = 1, 2, …, N'; and the two remaining symbols, which did not survive in the source text, denote the load information of the K-th secondary channel in the (t-1)-th time period and the load information of the primary channel in the (t-1)-th time period, respectively.

In the above design, after the channel aggregation model makes a channel aggregation decision (that is, outputs a channel aggregation indication value), the first terminal device can check whether the data packets it sends on the aggregated channel collide with data packets sent by other terminal devices. Based on the decision, the outcome of sending on the aggregated channel, and the load of each channel, it assigns different rewards to the decisions made by the model. This guides the model to learn from the load on each channel so that it outputs optimal channel aggregation decisions.

In a second aspect, embodiments of the present application provide a communication device that has the function of implementing the method of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function, for example an interface unit and a processing unit.

In a possible design, the device may be a chip or an integrated circuit.

In a possible design, the device includes a memory and a processor. The memory stores instructions to be executed by the processor, and when the processor executes the instructions, the device can perform the method of the first aspect.

In a possible design, the device may be the first terminal device.

In a third aspect, embodiments of the present application provide a communication device that includes an interface circuit and a processor coupled to each other. The processor implements the method of the first aspect through logic circuits or by executing instructions. The interface circuit receives signals from communication devices other than this communication device and passes them to the processor, or sends signals from the processor to communication devices other than this communication device. It can be understood that the interface circuit may be a transceiver or an input/output interface.

Optionally, the communication device may further include a memory that stores the instructions executed by the processor, the input data the processor needs to run the instructions, or the data generated after the processor runs the instructions. The memory may be a physically separate unit, may be coupled to the processor, or may be part of the processor.

In a fourth aspect, embodiments of the present application provide a computer-readable storage medium that stores a computer program or instructions. When the computer program or instructions are executed, the method of the first aspect can be implemented.

In a fifth aspect, embodiments of the present application further provide a computer program product that includes a computer program or instructions. When the computer program or instructions are executed, the method of the first aspect can be implemented.

In a sixth aspect, embodiments of the present application further provide a chip that is coupled to a memory and is used to read and execute programs or instructions stored in the memory to implement the method of the first aspect.

For the technical effects achievable by the second to sixth aspects, refer to the technical effects achievable by the first aspect; they are not repeated here.

Brief description of the drawings

Figure 1 is a schematic diagram of a communication system architecture provided by an embodiment of the present application;

Figure 2 is a schematic diagram of a fully connected neural network provided by an embodiment of the present application;

Figure 3 is a schematic diagram of a neuron calculating its output from its inputs, provided by an embodiment of the present application;

Figure 4 is a schematic diagram of the aggregation of multiple adjacent channels provided by an embodiment of the present application;

Figure 5 is a schematic diagram of preamble puncturing transmission provided by an embodiment of the present application;

Figure 6 is a schematic diagram of a channel aggregation method provided by an embodiment of the present application;

Figure 7 is a first schematic diagram of the indication information for channel load information provided by an embodiment of the present application;

Figure 8 is a second schematic diagram of the indication information for channel load information provided by an embodiment of the present application;

Figure 9A is a schematic diagram of the structure of a channel aggregation model provided by an embodiment of the present application;

Figure 9B is a schematic diagram of a reinforcement learning process provided by an embodiment of the present application;

Figure 10 is a first schematic diagram of a communication device provided by an embodiment of the present application;

Figure 11 is a second schematic diagram of a communication device provided by an embodiment of the present application;

Figure 12 is a schematic structural diagram of a device provided by an embodiment of the present application.

Detailed description

The technical solutions of the embodiments of the present application can be applied to various communication systems, for example 5G systems, LTE systems, and long term evolution-advanced (LTE-A) systems. They can also be extended to wireless fidelity (WiFi), worldwide interoperability for microwave access (WiMAX), and 3GPP-related cellular systems, as well as to future communication systems such as 6G systems. Specifically, the communication system architecture to which the embodiments apply may be as shown in Figure 1 and includes a network device and multiple terminal devices; Figure 1 takes three terminal devices as an example. Terminal device 1 to terminal device 3 can send data (or data packets) to the network device separately or simultaneously. It should be noted that the embodiments of the present application do not limit the number of terminal devices or network devices in the communication system shown in Figure 1.

The above terminal device may also be called a terminal, user equipment (UE), a mobile station (MS), a mobile terminal, and so on. Terminal devices can be used in a wide range of scenarios, for example device-to-device (D2D) communication, vehicle to everything (V2X) communication, machine-type communication (MTC), internet of things (IoT), virtual reality, augmented reality, industrial control, autonomous driving, telemedicine, smart grid, smart furniture, smart office, smart wearables, smart transportation, and smart city. A terminal device may be a mobile phone, a tablet, a computer with wireless transceiver functions, a wearable device, a vehicle, a drone, a helicopter, an airplane, a ship, a robot, a robotic arm, a smart home device, a vehicle-mounted terminal, an IoT terminal, a station (STA) in a WiFi system, and so on. The embodiments of this application do not limit the specific technology or the specific device form used by the terminal device.

The network device may also be called an access network (AN) device or a radio access network (RAN) device. It may be a base station, an evolved NodeB (eNodeB), a transmitter and receiver point (TRP), an integrated access and backhauling (IAB) node, a next generation NodeB (gNB) in a fifth generation (5G) mobile communication system, a base station in a sixth generation (6G) mobile communication system, a base station in another future mobile communication system, a home base station (for example, a home evolved NodeB or home NodeB, HNB), an access point (AP) in a WiFi system, a wireless relay node, a wireless backhaul node, and so on.

Before the embodiments of the present application are introduced, some terms used in the present application are explained to facilitate understanding by those skilled in the art.

1) A neural network (NN) is a machine learning technique that simulates the neural network of the human brain with the aim of achieving human-like artificial intelligence. A neural network includes at least three layers: an input layer, an intermediate layer (also called a hidden layer), and an output layer. Deeper neural networks may contain more hidden layers between the input layer and the output layer. Taking the simplest neural network as an example, its internal structure and implementation are described with reference to the three-layer fully connected neural network shown in Figure 2. As shown in Figure 2, the neural network includes three layers: an input layer, a hidden layer, and an output layer. Each circle in Figure 2 represents a neuron; the input layer has 3 neurons, the hidden layer has 4 neurons, the output layer has 2 neurons, and the neurons of each layer are fully connected to the neurons of the next layer. Each connection between neurons corresponds to a weight, and these weights can be updated through training. Each neuron in the hidden layer and the output layer can also have a bias, and these biases can likewise be updated through training. Updating a neural network means updating these weights and biases. Once the structure of a neural network is known (the number of neurons in each layer and the connections between them) along with its parameters (the weight of each connection and the bias of each neuron), all information about that neural network is known.

由图2可知，每个神经元可能有多条输入连线，每个神经元根据输入计算输出。参见图3，图3是一个神经元根据输入计算输出的示意图。如图3所示，一个神经元包含3个输入，1个输出，以及2个计算功能，输出的计算公式(1-1)可以表示为：As can be seen from Figure 2, each neuron may have multiple input connections, and each neuron calculates an output based on the input. See Figure 3, which is a schematic diagram of a neuron calculating output based on input. As shown in Figure 3, a neuron contains 3 inputs, 1 output, and 2 calculation functions. The output calculation formula (1-1) can be expressed as:

输出＝激活函数(输入1*权重1+输入2*权重2+输入3*权重3+偏置)(1-1)；Output = activation function (input 1 * weight 1 + input 2 * weight 2 + input 3 * weight 3 + bias) (1-1);

其中，“*”表示数学运算“乘”或“乘以”，其中激活函数可以采用S型函数(sigmoid函数)、双曲函数、整流函数(rectification function，ReLu)等。Among them, "*" represents the mathematical operation "multiply" or "multiply by", and the activation function can be a S-shaped function (sigmoid function), hyperbolic function, rectification function (rectification function, ReLu), etc.

Each neuron may have multiple output connections, and the output of one neuron serves as an input to a neuron in the next layer. It should be understood that the input layer has only output connections: each neuron of the input layer holds a value input to the neural network, and that value is passed directly to all of its output connections. The output layer has only input connections, and its output is computed using formula (1-1). Optionally, the output layer may omit the activation function, in which case formula (1-1) becomes: output = input 1 * weight 1 + input 2 * weight 2 + input 3 * weight 3 + bias.
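Formula (1-1) can be illustrated with a minimal sketch. The input, weight, and bias values below are hypothetical and chosen only so that the arithmetic is easy to follow; passing `activation=None` corresponds to the optional output-layer case without an activation function.

```python
import math


def sigmoid(z: float) -> float:
    """S-shaped activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z: float) -> float:
    """Rectification activation: passes positive values, clamps negatives to 0."""
    return max(0.0, z)


def neuron_output(inputs, weights, bias, activation=None):
    """Formula (1-1): activation(sum of input_i * weight_i, plus bias).

    With activation=None the weighted sum is returned unchanged, which is
    the optional output-layer variant described in the text.
    """
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return z if activation is None else activation(z)


# Hypothetical values for a neuron with 3 inputs, as in Figure 3.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 0.5]
bias = 0.5

print(neuron_output(inputs, weights, bias, relu))
print(neuron_output(inputs, weights, bias, sigmoid))
```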

For example, a k-layer neural network can be expressed as:

y = f_k(f_{k-1}(…(f_1(w_1*x + b_1))…))    (1-2);

where x denotes the input of the neural network, y denotes its output, w_i denotes the weight of the i-th layer, b_i denotes the bias of the i-th layer, f_i denotes the activation function of the i-th layer, and i = 1, 2, …, k.
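The layered computation can be sketched as a simple forward pass. This is an illustrative sketch only: it applies a weight matrix and bias vector at every layer, following formula (1-1) per neuron, and uses the 3-4-2 layer sizes of Figure 2. All weight and bias values are hypothetical.

```python
def relu_vec(values):
    """Apply the rectification function to every element of a layer output."""
    return [max(0.0, v) for v in values]


def dense(x, weights, biases):
    """One fully connected layer: every output neuron sees every input."""
    return [
        sum(xi * wi for xi, wi in zip(x, row)) + b
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Apply the layers in order; each layer is (weights, biases, activation).

    activation may be None, e.g. for an output layer without activation.
    """
    for weights, biases, activation in layers:
        x = dense(x, weights, biases)
        if activation is not None:
            x = activation(x)
    return x


# Hypothetical parameters: 3 inputs -> 4 hidden neurons -> 2 outputs.
hidden_layer = ([[0.1, 0.2, 0.3]] * 4, [0.0] * 4, relu_vec)
output_layer = (
    [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]],
    [0.0, 0.5],
    None,
)

y = forward([1.0, 1.0, 1.0], [hidden_layer, output_layer])
print(y)
```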

2) Channel aggregation. Channel aggregation was first introduced in the IEEE 802.11ac standard. Based on a 20 megahertz (MHz) primary channel, it allows multiple adjacent 20 MHz secondary channels to be aggregated into a channel with a bandwidth of 40 MHz, 80 MHz, or 160 MHz for transmission, thereby improving transmission efficiency. Figure 4 is a schematic diagram of adjacent multi-channel aggregation. As shown in Figure 4, a 20 MHz primary channel and a 20 MHz secondary channel can be aggregated into a 40 MHz channel; a 40 MHz primary channel and a 40 MHz secondary channel can be aggregated into an 80 MHz channel; and an 80 MHz primary channel and an 80 MHz secondary channel can be aggregated into a 160 MHz channel.

In 802.11ax, the successor to 802.11ac, techniques such as preamble puncturing allow channel aggregation across non-adjacent 20 MHz channels, which gives channel aggregation more flexibility and opens up further possibilities for improving transmission throughput. Figure 5 is a schematic diagram of preamble-punctured transmission, where TX denotes transmission, CH denotes channel, each channel (CH1, CH2, CH3, CH4) has a bandwidth of 20 MHz, and frame 1, frame 2, and frame 3 each have a transmission bandwidth of 80 MHz. Because the secondary 20 MHz channel (denoted S20) is busy when frame 1 is transmitted, S20 is punctured, so the actual bandwidth of frame 1 is 60 MHz. Similarly, the actual bandwidth of frame 2 is 60 MHz and that of frame 3 is 40 MHz.
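The bandwidth arithmetic behind preamble puncturing can be sketched as follows. The function name and the busy patterns are hypothetical; the text states only the resulting bandwidths, not which channels are busy for frame 2 and frame 3.

```python
CHANNEL_WIDTH_MHZ = 20  # each of CH1..CH4 is a 20 MHz channel


def effective_bandwidth_mhz(busy_flags):
    """Return the bandwidth left after puncturing every busy 20 MHz channel.

    busy_flags holds one boolean per 20 MHz channel of the frame's
    nominal bandwidth (True means busy, so that channel is punctured).
    """
    return CHANNEL_WIDTH_MHZ * sum(1 for busy in busy_flags if not busy)


# Nominal 80 MHz frame with one busy channel: 60 MHz remains (frame 1).
print(effective_bandwidth_mhz([False, True, False, False]))
# Nominal 80 MHz frame with two busy channels: 40 MHz remains (frame 3).
print(effective_bandwidth_mhz([False, True, True, False]))
```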

3) Channel aggregation methods. Current channel aggregation methods fall into two categories: static channel aggregation and dynamic channel aggregation. In static channel aggregation, when the primary channel is idle, the device must wait until all secondary channels are also idle before aggregating. In dynamic channel aggregation, when the primary channel is idle, any secondary channels that happen to be idle at that moment are aggregated with the primary channel.
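The difference between the two categories can be sketched as two decision functions. This is a simplified illustration with hypothetical names: it ignores adjacency constraints and assumes that dynamic aggregation falls back to the primary channel alone when no secondary channel is idle.

```python
def static_aggregation(primary_idle, secondary_idle):
    """Aggregate only when the primary and ALL secondary channels are idle.

    Returns the indices of the secondary channels to aggregate, or None
    when the device has to keep waiting.
    """
    if primary_idle and all(secondary_idle):
        return list(range(len(secondary_idle)))
    return None


def dynamic_aggregation(primary_idle, secondary_idle):
    """Aggregate the primary channel with whichever secondaries are idle now.

    Returns None when the primary channel is busy; an empty list means the
    primary channel is used alone (assumption, see lead-in).
    """
    if not primary_idle:
        return None
    return [i for i, idle in enumerate(secondary_idle) if idle]


state = [True, False, True]  # hypothetical idle flags of 3 secondary channels
print(static_aggregation(True, state))
print(dynamic_aggregation(True, state))
```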

As the above shows, the main idea of current channel aggregation methods is to aggregate the primary channel with idle secondary channels when the primary channel is idle. However, when multiple terminal devices compete for channel resources, the aggregated channels used by different terminal devices may overlap partially or completely. As a result, data packets sent by the terminal devices collide frequently and terminal devices repeatedly enter the backoff window to wait before sending, which leads to low channel aggregation throughput and high delay.

In view of this, this application provides a channel aggregation method that uses the prediction capability of artificial intelligence (AI) to make a preferred channel aggregation decision based on the real-time channel state and the transmission requirements of the service, thereby improving the transmission performance of the aggregated channel and solving the problems of low throughput and high delay in channel aggregation. The embodiments of this application are described in detail below with reference to the accompanying drawings, in which dashed lines indicate optional steps or components.

In addition, it should be understood that ordinal terms such as "first" and "second" in the embodiments of this application are used to distinguish between multiple objects and do not limit their size, content, order, timing, priority, or importance. For example, the t-th time period and the (t+1)-th time period do not indicate different priorities or degrees of importance for the two time periods.

In the embodiments of this application, unless otherwise specified, a noun covers both its singular and plural forms, that is, "one or more". "At least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it; for example, A/B means A or B. "At least one of the following items" or a similar expression refers to any combination of the items, including a single item or any combination of multiple items. For example, at least one of a, b, or c means: a; b; c; a and b; a and c; b and c; or a, b, and c, where each of a, b, and c may be single or multiple.

Figure 6 shows a channel aggregation method provided by an embodiment of this application. The method includes:

S601: The first terminal device receives a load report from the network device. The load report includes load information for each of the M channels of the network device in the t-th time period, where M is an integer greater than or equal to 2 and t is an integer greater than or equal to 2.

In the embodiments of this application, the network device may, according to a configured acquisition period and by means such as carrier sensing, obtain load information for each of its M channels in the time period corresponding to the acquisition period (for example, the t-th time period). The load information of a channel in a given time period (for example, the t-th time period) can be represented by a load value, which is the ratio of the time the channel is busy (that is, the time during which data packets are transmitted) to the total time in that time period.

As an example, for a given time period (for example, the t-th time period), the network device may use carrier sensing to determine, for each of its M channels, whether data packets are transmitted in each time unit of the time period, and then determine each channel's load information (for example, its load value) for the time period from those per-time-unit observations. A time unit may be a resource of any of several time granularities, such as a subframe, a slot, a mini-slot, or a symbol, and a time period may include one or more time units.

For example, suppose a time period (for example, the t-th time period) includes 50 time units and the network device determines through carrier sensing that data packets are transmitted on channel A in 30 of them. The load value of channel A in that time period is then 30/50 * 100% = 60%. Optionally, the load value may be quantized (scaled) to the range 0-255. For example, a load value of 60% for channel A gives 60% * 255 = 153, so 153 can be used to represent the 60% load value of channel A in that time period.
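The load calculation and the optional 0-255 scaling can be sketched as follows. The function names are hypothetical, and rounding to the nearest integer is an assumption; the text gives only an example in which the scaled value is exact.

```python
def load_percent(busy_units: int, total_units: int) -> float:
    """Share of time units in the period during which the channel was busy."""
    return 100.0 * busy_units / total_units


def quantize_load(busy_units: int, total_units: int) -> int:
    """Scale the busy fraction to an integer in 0..255 so it fits one octet.

    Integer arithmetic with round-half-up avoids floating-point surprises.
    """
    return (busy_units * 255 + total_units // 2) // total_units


# Channel A: packets observed in 30 of the 50 time units of the period.
print(load_percent(30, 50))
print(quantize_load(30, 50))
```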

For the t-th time period, after obtaining the load information of each of the M channels, the network device may send a load report containing that load information to terminal devices by broadcast, multicast, or the like, for example by broadcasting it to one or more terminal devices within the service range of the network device.

The indication information in the load report that indicates the load information of each channel may be as shown in Figure 7. The channel number field indicates the number (or index) of the channel and occupies one octet (8 bits); the channel load field indicates the load value of the channel and occupies one octet. For the load information of the M channels in the t-th time period, the load report therefore incurs an overhead of M*16 bits in total.
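The two-octet entry layout can be sketched with Python's `struct` module. The function names, the concatenation of entries into a single payload, and the sample load values are assumptions made for illustration; the text specifies only the two one-octet fields per channel.

```python
import struct


def encode_load_report(entries):
    """Pack (channel_number, channel_load) pairs, one octet per field.

    Both values must fit in 0..255; struct raises an error otherwise.
    """
    return b"".join(struct.pack("BB", number, load) for number, load in entries)


def decode_load_report(payload):
    """Inverse of encode_load_report: read back the two-octet entries."""
    return [
        struct.unpack_from("BB", payload, offset)
        for offset in range(0, len(payload), 2)
    ]


# Hypothetical report for M = 3 channels.
report = encode_load_report([(149, 153), (153, 40), (157, 0)])
print(len(report) * 8)  # overhead in bits, M * 16
print(decode_load_report(report))
```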

In a possible implementation, the indication information for each channel may instead be as shown in Figure 8, in which it further includes a regulatory class field and an actual measurement stop time field. The regulatory class field occupies one octet and indicates a type set, which may include one or more of the following: operating frequency band, channel bandwidth, channel set, transmit power limit, emissions limits set, and behavior limits set. For example, a regulatory class value of 55 corresponds to a type set in which the channel is in the 5 gigahertz (GHz) band, the channel bandwidth is 20 MHz, the channel set contains the channels numbered (or indexed) 149, 153, 157, 161, and 165, the transmit power is 1000 mW, the emissions limits set is 4, and the behavior limits set is 10. A regulatory class value of 12 corresponds to a type set in which the channel is in the 2.407 GHz band, the channel bandwidth is 25 MHz, the channel set contains the channels numbered (or indexed) 1-11, the transmit power is 1000 mW, the emissions limits set is 4, and the behavior limits set is 10. The actual measurement stop time field occupies 8 octets and indicates the time at which the load measurement was completed. It can be used to keep the load reports delivered to the terminal devices consistent in time. For example, if the network device measures the channel load through carrier sensing in the t-th time period, the time at which the load measurement is completed is the end of the t-th time period.

It should be understood that the regulatory class field and the actual measurement stop time field are optional, and their presence can be indicated by the first 2 bits of the indication information. For example, 00 indicates that neither field is present, 01 indicates that the actual measurement stop time field is present, 10 indicates that the regulatory class field is present, and 11 indicates that both fields are present.
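The 2-bit presence indicator can be decoded as sketched below. The function names are hypothetical, and the length calculation counts only the fields named in the text (channel number, channel load, regulatory class, actual measurement stop time), not the indicator bits themselves, whose exact placement is not specified here.

```python
def parse_presence_bits(bits: int):
    """Decode the 2-bit indicator into two presence flags.

    Returns (has_regulatory_class, has_measurement_stop_time):
    0b00 neither, 0b01 stop time only, 0b10 regulatory class only, 0b11 both.
    """
    if not 0 <= bits <= 0b11:
        raise ValueError("presence indicator must be a 2-bit value")
    return bool(bits & 0b10), bool(bits & 0b01)


def entry_field_octets(bits: int) -> int:
    """Octets used by the fields of one entry, excluding the indicator."""
    has_class, has_stop_time = parse_presence_bits(bits)
    octets = 2  # channel number + channel load
    if has_class:
        octets += 1  # regulatory class
    if has_stop_time:
        octets += 8  # actual measurement stop time
    return octets


for value in (0b00, 0b01, 0b10, 0b11):
    print(format(value, "02b"), parse_presence_bits(value), entry_field_octets(value))
```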

In addition, it should be understood that the M channels include 1 primary channel corresponding to the first terminal device and M-1 secondary channels. The primary channel may be indicated to the first terminal device by the network device, for example through a radio resource control (RRC) message, or may be determined by the first terminal device from the load information of the M channels (for example, by selecting the channel with the smallest load value as the primary channel). This application does not limit this.

S602: The first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing and obtains the t-th channel aggregation indication value.

The channel environment information of the t-th time period includes the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, and the channel state monitoring information obtained by the first terminal device by monitoring the channel state of the primary channel and the M-1 secondary channels in the t-th time period. The channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, where N is an integer greater than or equal to 0 and less than or equal to M-1.

In the embodiments of this application, the first terminal device may also monitor the channel state of the primary channel and the M-1 secondary channels within the time period corresponding to each acquisition period to obtain channel state monitoring information. Taking the t-th time period as an example, the channel state monitoring information may include one or more of the following: the busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels as monitored by the first terminal device in the t-th time period; the data packet sending state of the first terminal device, in each time unit, on the primary channel and on each of the M-1 secondary channels as monitored in the t-th time period; and the number of consecutive time units, as monitored in the t-th time period, for which both the data packet sending state and the busy/idle state on the primary channel and on each of the M-1 secondary channels remain unchanged.

The busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels as monitored by the first terminal device in the t-th time period can be represented by a vector for each channel i, where i = 1, 2, 3, …, M indexes the i-th of the M channels (the primary channel and the M-1 secondary channels), hereinafter channel i. The vector has as many elements as there are time units in the t-th time period. An element value of 1 means channel i is busy in the corresponding time unit (data packets are being transmitted, either by the first terminal device or by other terminal devices). An element value of 0 means channel i is idle in the corresponding time unit (no data packets are being transmitted). An element value of -1 means the first terminal device did not monitor the busy/idle state of channel i in the corresponding time unit (for example, because it was sending data packets on a channel other than channel i in that time unit and therefore could not monitor channel i). For example, a vector whose first 9 elements are 0 and whose 10th element is 1 indicates that channel i is idle in the first 9 time units of the t-th time period and busy in the 10th.

The data packet sending state of the first terminal device, in each time unit, on the primary channel and on each of the M-1 secondary channels as monitored in the t-th time period can likewise be represented by a vector for each channel i, where i = 1, 2, 3, …, M indexes the i-th of the M channels (the primary channel and the M-1 secondary channels), hereinafter channel i. The vector has as many elements as there are time units in the t-th time period. An element value of 1 means the first terminal device sends data packets on channel i in the corresponding time unit, and an element value of 0 means it does not. For example, a vector whose first 3 elements and 10th element are 1 and whose 4th to 9th elements are 0 indicates that the first terminal device sends data packets on channel i in the first 3 time units and the 10th time unit of the t-th time period, and sends none on channel i in the 4th to 9th time units.

The number of consecutive time units, as monitored by the first terminal device in the t-th time period, for which both the data packet sending state and the busy/idle state on the primary channel and on each of the M-1 secondary channels remain unchanged can be represented by a counter for each channel. Take one busy/idle vector and one sending-state vector of a channel as an example. In the first time unit of the t-th time period, the first terminal device may set the counter to the initial value 0. In the second time unit, the elements of both vectors for the second time unit equal those for the first time unit, so the counter is incremented by 1 (to 1). In the third time unit, the elements of both vectors for the third time unit equal those for the second time unit, so the counter is incremented by 1 (to 2). In the fourth time unit, at least one vector has an element for the fourth time unit that differs from its element for the third time unit, so the counter is reset to 0. In the fifth time unit, at least one vector has an element for the fifth time unit that differs from its element for the fourth time unit, so the counter is reset to 0. In the sixth time unit, the elements of both vectors for the sixth time unit equal those for the fifth time unit, so the counter is incremented by 1 (to 1); …; in the tenth time unit, the elements of both vectors for the tenth time unit equal those for the ninth time unit, so the counter is incremented by 1 (to 5). The final counter value is 5.
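The counting rule walked through above can be sketched as follows. The two example vectors are hypothetical: the vectors used in the original walk-through are not reproduced in the text, so these were chosen only to yield the same counter sequence (a change at the fourth and fifth time units, no change afterwards).

```python
def unchanged_run_lengths(busy, sent):
    """Return the counter value after each time unit.

    The counter starts at 0, is incremented when BOTH the busy/idle element
    and the sending-state element equal those of the previous time unit,
    and is reset to 0 as soon as either of them changes.
    """
    counters = []
    count = 0
    for k in range(len(busy)):
        if k > 0 and busy[k] == busy[k - 1] and sent[k] == sent[k - 1]:
            count += 1
        else:
            count = 0
        counters.append(count)
    return counters


# Hypothetical vectors for one channel over 10 time units.
busy = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
sent = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

history = unchanged_run_lengths(busy, sent)
print(history)      # counter after each time unit
print(history[-1])  # final value for the period
```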

In the embodiments of this application, the input of the channel aggregation model may be the channel environment information S of a time period (for example, the t-th time period), and the output of the channel aggregation model is the channel aggregation indication value Y. Taking the t-th time period as an example, the channel environment information S_t includes the load information of the primary channel in the t-th time period and the load information of each of the M-1 secondary channels in the t-th time period, where j = 1, 2, 3, …, M-1 indexes the j-th of the M-1 secondary channels. S_t may also include the channel state monitoring information obtained by the first terminal device by monitoring the channel state of the primary channel and the M-1 secondary channels in the t-th time period, that is, one or more of the items described above for the t-th time period. Y may be a number from 0 to 2^(M-1) - 1, each of which maps to a specific channel aggregation mode that includes the primary channel. For example, Y = 0 means no channel aggregation, Y = 1 means the primary channel is aggregated with the first of the M-1 secondary channels, Y = 2 means the primary channel is aggregated with the second of the M-1 secondary channels, …, Y = M-1 means the primary channel is aggregated with the (M-1)-th of the M-1 secondary channels, Y = M means the primary channel is aggregated with the first and second of the M-1 secondary channels, and so on.
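One mapping from Y to a set of secondary channels that is consistent with the listed examples orders the subsets by size and then lexicographically. The sketch below assumes that ordering for the values the text leaves as "and so on"; the function name is hypothetical.

```python
from itertools import combinations


def decode_aggregation_value(y: int, m: int):
    """Map Y in 0..2**(m-1)-1 to the 1-based secondary channels to aggregate.

    Subsets are enumerated by increasing size, then in lexicographic order
    (an assumed ordering that matches Y = 0, 1, ..., M-1 and Y = M in the text).
    """
    if not 0 <= y < 2 ** (m - 1):
        raise ValueError("Y out of range for M channels")
    secondary_channels = range(1, m)  # the M-1 secondary channels
    index = 0
    for size in range(m):  # subset sizes 0 .. M-1
        for subset in combinations(secondary_channels, size):
            if index == y:
                return subset
            index += 1
    raise AssertionError("unreachable: every valid Y maps to a subset")


M = 4  # 1 primary channel + 3 secondary channels
for y in range(2 ** (M - 1)):
    print(y, decode_aggregation_value(y, M))
```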

The parameters of the neurons in each layer of the channel aggregation model (that is, of the neural network corresponding to the channel aggregation model) may be configured by random initialization. Alternatively, they may be obtained by a training device through training on multiple channel environment information samples in a sample library, each labeled with the target channel aggregation indication value of a channel aggregation mode. In a possible implementation, the samples in the sample library are built as follows: the first terminal device collects the channel environment information of multiple time periods; for each time period, a person determines a preferred channel aggregation mode for that time period's channel environment information based on the channel environment information of the next time period, and labels the time period's channel environment information with the target channel aggregation indication value of that preferred mode. When training the channel aggregation model, the training device (for example, the first terminal device or the network device) inputs a channel environment information sample from the sample library into the channel aggregation model and obtains the channel aggregation indication value output by the model. Using a loss function, the training device computes the loss of the channel aggregation model from the output channel aggregation indication value and the sample's target channel aggregation indication value. A higher loss indicates a larger difference between the output and the target. The parameters of the neurons in the channel aggregation model are then adjusted according to the loss, for example by stochastic gradient descent, so that training becomes a process of reducing the loss as far as possible. The model is trained repeatedly on the samples in the sample library, and once the loss falls within a preset range, the trained channel aggregation model is obtained.
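The training loop described above (compute the loss against the labeled target, update parameters by stochastic gradient descent, stop once the loss is within a preset range) can be sketched on a deliberately small stand-in model. This is not the channel aggregation model of the application: it is a single-layer softmax classifier, and the samples, learning rate, and loss threshold are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, features):
    """Probability of each candidate indication value for one sample."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def mean_loss(weights, biases, samples):
    """Cross-entropy loss: grows as outputs move away from the labeled targets."""
    losses = [-math.log(predict(weights, biases, x)[y]) for x, y in samples]
    return sum(losses) / len(losses)


def train(samples, num_classes, learning_rate=0.5, loss_threshold=0.1, max_epochs=5000):
    """Per-sample (stochastic) gradient descent until the loss is small enough."""
    num_features = len(samples[0][0])
    weights = [[0.0] * num_features for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(max_epochs):
        if mean_loss(weights, biases, samples) <= loss_threshold:
            break  # loss has reached the preset range
        for features, target in samples:
            probs = predict(weights, biases, features)
            for c in range(num_classes):
                grad = probs[c] - (1.0 if c == target else 0.0)
                for j in range(num_features):
                    weights[c][j] -= learning_rate * grad * features[j]
                biases[c] -= learning_rate * grad
    return weights, biases


# Hypothetical labeled samples for M = 2:
# ([primary load, secondary load], target Y), Y = 1 meaning "aggregate".
samples = [([0.1, 0.1], 1), ([0.2, 0.9], 0), ([0.3, 0.2], 1), ([0.1, 0.8], 0)]
initial_loss = mean_loss([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], samples)
weights, biases = train(samples, num_classes=2)
final_loss = mean_loss(weights, biases, samples)
print(initial_loss, final_loss)
```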

As an example, the structure of the channel aggregation model in the embodiments of this application may be as shown in Figure 9A, in which each block represents a fully connected layer. The channel aggregation model may consist of 7 fully connected layers, which are, from left to right, 1 input layer, 5 hidden layers, and 1 output layer. The activation function of each layer may be a rectification function (rectified linear unit, ReLU). The input of the input layer (inputs) is the channel environment information S of a time period (for example, the first time period). The output h1 of the input layer is the input of hidden layer 1, the output h2 of hidden layer 1 is the input of hidden layer 2, and the output h3 of hidden layer 2 is the input of hidden layer 3. The result of an XOR operation on the output h4 of hidden layer 3 and the output h2 of hidden layer 1 is the input of hidden layer 4. The output h5 of hidden layer 4 is the input of hidden layer 5. The result of an XOR operation on the output h6 of hidden layer 5 and the output h4 of hidden layer 3 is the input of the output layer, and the output of the output layer is the channel aggregation indication value Y. Training the channel aggregation model is the process of continually adjusting the parameters of the neurons in each layer.

It should be understood that the training device may be the first terminal device, the network device, or another device such as a server or a computer. When the training device is not the first terminal device, the training device may determine the parameters of the neurons in each layer of the channel aggregation model and then send them to the first terminal device.

In some implementations, as shown in Figure 9B, after the channel aggregation model outputs a channel aggregation indication value (for example, the (t-1)-th channel aggregation indication value) based on the channel environment information of a time period (for example, S_t-1 for the (t-1)-th time period), the first terminal device may also check whether sending data packets on the aggregated channel collides with data packets sent by other terminal devices. Based on the channel aggregation mode indicated by the indication value, the outcome of sending data packets on the aggregated channel, and the load of each channel in that time period, the first terminal device assigns a reward value (for example, R_t) to the indication value output by the model, and uses the reward value as a further input to the channel aggregation model in the next time period (for example, the t-th time period). This guides the channel aggregation model to learn from the load on each channel, with the aim of having the model output the optimal channel aggregation decision. Optionally, the channel aggregation indication value output by the model for a time period (for example, the (t-1)-th channel aggregation indication value) may also be used as an input to the channel aggregation model in the next time period (for example, the t-th time period).
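Feeding the reward and the previous decision back into the model can be sketched as a simple input-assembly step. The function name, the flat feature layout, and the sample values are assumptions; the text states only that these values may serve as additional model inputs.

```python
def build_model_input(channel_env, reward=None, previous_indication=None):
    """Assemble the model input for time period t.

    channel_env:         features of S_t (for example, per-channel load values)
    reward:              R_t, the reward assigned to the previous decision (optional)
    previous_indication: the (t-1)-th channel aggregation indication value (optional)
    """
    features = list(channel_env)
    if reward is not None:
        features.append(float(reward))
    if previous_indication is not None:
        features.append(float(previous_indication))
    return features


s_t = [0.6, 0.1, 0.9]  # hypothetical loads of the primary and two secondary channels
x_t = build_model_input(s_t, reward=1.0, previous_indication=2)
print(x_t)
```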

In a possible implementation, the first terminal device may determine, in the following manner, the reward value of a channel aggregation indication value obtained based on the channel aggregation model, that is, the reward value for the first terminal device performing the decision action obtained based on the channel aggregation model (namely, the channel aggregation mode corresponding to the channel aggregation indication value). The description below takes the t-th time period as an example, with $R_{t+1}$ denoting the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model:

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels do not collide with data packets sent by other terminal devices, and N is not zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source];

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels do not collide with data packets sent by other terminal devices, and N is zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source];

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels collide with data packets sent by other terminal devices, and N is not zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source];

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels collide with data packets sent by other terminal devices, and N is zero, the first terminal device determines the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source].

In each of the above cases, $R_{t+1}$ denotes the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model; K denotes the K-th secondary channel among the N secondary channels, K = 1, 2, ..., N; [symbol omitted in source] denotes the load information of the K-th secondary channel in the t-th time period; and [symbol omitted in source] denotes the load information of the primary channel in the t-th time period.

In other implementations, the first terminal device may also determine the reward value $R_{t+1}$ of the t-th channel aggregation indication value obtained based on the channel aggregation model from the mean of the load information (for example, load values) of the primary channel and the N secondary channels in the t-th time period. For example, when data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N secondary channels collide with data packets sent by other terminal devices, the mean of the load information (for example, load values) of the primary channel and the N secondary channels in the (t+1)-th time period, multiplied by -1, is used as the reward value $R_{t+1}$; when those data packets do not collide with data packets sent by other terminal devices, the mean of the load information (for example, load values) of the primary channel and the N secondary channels in the (t+1)-th time period is used as the reward value $R_{t+1}$.
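The mean-load reward rule in the preceding paragraph can be sketched as below. This is a minimal illustration under stated assumptions: the function name `mean_load_reward` is hypothetical, load information is assumed to be a single numeric load value per channel, and the caller is assumed to pass the loads of whichever time period the implementation uses.

```python
from statistics import fmean


def mean_load_reward(primary_load, secondary_loads, collided):
    """Reward from the mean load of the primary and the N aggregated channels.

    primary_load    -- load value of the primary channel
    secondary_loads -- load values of the N aggregated secondary channels
                       (empty when N is zero)
    collided        -- True if the transmission on the aggregated channel
                       collided with another terminal device's packets
    """
    mean_load = fmean([primary_load, *secondary_loads])
    # Collision: mean load multiplied by -1; no collision: mean load itself.
    return -mean_load if collided else mean_load
```

With a primary load of 0.2 and secondary loads of 0.4 and 0.6, the mean load is 0.4, so the reward is 0.4 without a collision and -0.4 with one.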

The above description takes the t-th time period as an example, with $R_{t+1}$ denoting the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model. It can be understood that, for other time periods such as the (t-1)-th time period (a time period before the t-th time period), the first terminal device may likewise determine the reward value $R_t$ of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model from the load information of the primary channel and of each of N' secondary channels in the (t-1)-th time period. The (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel, and N' is the same as or different from N.

For example, when data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels do not collide with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source].

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels do not collide with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source].

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels collide with data packets sent by other terminal devices, and N' is not zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source].

When data packets sent by the first terminal device on the channel obtained by aggregating the primary channel and the N' secondary channels collide with data packets sent by other terminal devices, and N' is zero, the first terminal device determines the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model according to [formula omitted in source].

S603: The first terminal device sends data packets in the (t+1)-th time period through the channel obtained by aggregating the primary channel and the N secondary channels. The (t+1)-th time period is a time period after the t-th time period.

After the first terminal device inputs the channel environment information of the t-th time period into the channel aggregation model for processing and obtains the t-th channel aggregation indication value, it aggregates the N secondary channels, among the M-1 secondary channels, that the t-th channel aggregation indication value indicates are to be aggregated with the primary channel, and sends data packets to the network device through the aggregated channel in the (t+1)-th time period, which follows the t-th time period.

In some implementations, so that the channel aggregation decision (that is, the channel aggregation indication value) made by the channel aggregation model meets the user's expectations, the user may preconfigure a state-action value function Q for evaluating the different decision actions a (that is, the channel aggregation modes corresponding to different channel aggregation indication values Y) taken under various channel environment information S. The decision action a output by the channel aggregation model based on the channel environment information S of a time period (that is, the channel aggregation mode corresponding to the output channel aggregation indication value Y) is evaluated with Q to obtain a first state-action value. Q may also be used to evaluate each of the possible decision actions for the channel environment information S of that time period (that is, the channel aggregation modes corresponding to all possible channel aggregation indication values Y), yielding multiple state-action values, the maximum of which is taken as a second state-action value. The loss of the channel aggregation model may then be determined from the second state-action value, the first state-action value, and the reward value, and the channel aggregation model may be trained and updated accordingly, for example by applying stochastic gradient descent to the loss to update the parameters of the neurons in the channel aggregation model.

Taking the t-th time period as an example, the following expected squared reward value function (also called the loss function) may be used to determine the loss of the channel aggregation model.

$$L(\theta) = E\left[R_{t+1} + \gamma \max_{a'} Q(s_t', a', \theta^*) - Q(s_t, a_t; \theta)\right]^2$$

Here, L() denotes the expected squared reward value function; $L(\theta)$ denotes the loss of the channel aggregation model; Q() denotes the configured state-action value function; $\gamma$ denotes the discount factor (for example, 0.9); $\theta$ denotes the current parameters of the channel aggregation model; $R_{t+1}$ denotes the reward value of the decision action $a_t$ obtained based on the channel aggregation model (that is, the channel aggregation mode corresponding to the obtained t-th channel aggregation indication value); $Q(s_t, a_t; \theta)$ denotes the first state-action value of taking the decision action $a_t$ (that is, the channel aggregation mode corresponding to the output t-th channel aggregation indication value) based on the channel environment information $s_t$ of the t-th time period; $\max_{a'} Q(s_t', a', \theta^*)$ denotes the maximum state-action value (that is, the second state-action value) among the state-action values of taking each of the candidate decision actions a (the candidate channel aggregation modes corresponding to the $2^{M-1}-1$ candidate channel aggregation indication values) based on the channel environment information $s_t$ of the t-th time period; $a'$ denotes the decision action corresponding to the second state-action value; and $\theta^*$ denotes the parameters of the target channel aggregation model, that is, the parameters of the channel aggregation model when it outputs the decision action $a'$ (the channel aggregation indication value corresponding to $a'$) that yields the second state-action value.
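The loss above can be illustrated numerically with the sketch below. It is not the application's implementation: the function name `squared_td_loss` and the sample layout are hypothetical, the expectation is approximated by an average over a batch of samples, and the square is assumed to be applied to each sample's difference before averaging, as is usual in deep Q-learning. The Q values are taken as precomputed numbers rather than produced by a neural network.

```python
def squared_td_loss(samples, gamma=0.9):
    """Average squared difference between target and first state-action value.

    Each sample is a tuple (reward, q_taken, q_target_values):
      reward          -- R_{t+1}, reward for the decision action a_t
      q_taken         -- Q(s_t, a_t; theta), the first state-action value
      q_target_values -- target-model values Q(s_t', a', theta*) for every
                         candidate action a'; their maximum is the second
                         state-action value
    gamma is the discount factor (0.9 is the example value in the text).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    squared_errors = []
    for reward, q_taken, q_target_values in samples:
        target = reward + gamma * max(q_target_values)
        squared_errors.append((target - q_taken) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

In a trainable model, this scalar would be the quantity minimized by stochastic gradient descent with respect to the current parameters, with the target-model parameters held fixed while computing the maximum.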

The above description takes determining the loss of the channel aggregation model for the t-th time period as an example. It can be understood that, for other time periods (for example, the (t-1)-th time period), replacing the reward value, first state-action value, and second state-action value of the t-th time period with those of the (t-1)-th time period gives the loss of the channel aggregation model for the (t-1)-th time period, which is then used to train and update the channel aggregation model for the (t-1)-th time period.

In addition, it should be understood that the above description takes as an example the case in which the channel aggregation model resides on the first terminal device: the first terminal device processes the input channel environment information of the t-th time period based on the channel aggregation model to obtain the channel aggregation indication value, and performs channel aggregation in the (t+1)-th time period based on the channel aggregation mode indicated by that value. In some implementations, the channel aggregation model may instead be deployed on the network device. In that case, the network device obtains the channel environment information of the t-th time period corresponding to the first terminal device, inputs it into the channel aggregation model for processing to obtain the channel aggregation indication value, and sends the channel aggregation indication value, or the channel aggregation mode it indicates, to the first terminal device. The first terminal device then performs channel aggregation according to the channel aggregation indication value, or the indicated channel aggregation mode, received from the network device.

It can be understood that, to implement the functions in the foregoing embodiments, the first terminal device includes hardware structures and/or software modules corresponding to each function. A person skilled in the art will readily appreciate that, in combination with the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.

FIG. 10 and FIG. 11 are schematic structural diagrams of possible communication apparatuses provided by embodiments of this application. These communication apparatuses can be used to implement the functions of the first terminal device in the foregoing method embodiments, and can therefore also achieve the beneficial effects of those method embodiments. In a possible implementation, the communication apparatus may be the first terminal device, or may be a module (such as a chip) applied to the first terminal device.

As shown in FIG. 10, the communication apparatus 1000 includes a processing unit 1010 and an interface unit 1020, where the interface unit 1020 may also be a transceiver unit or an input/output interface. The communication apparatus 1000 may be used to implement the functions of the first terminal device in the method embodiment shown in FIG. 6.

When the communication apparatus 1000 is used to implement the functions of the first terminal device in the method embodiment shown in FIG. 6:

The interface unit 1020 is configured to receive a load report from a network device, where the load report includes load information of each of M channels of the network device in the t-th time period, the M channels include one primary channel corresponding to the first terminal device and M-1 secondary channels, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2. The processing unit 1010 is configured to: input the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, where the channel environment information of the t-th time period includes the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, together with channel state monitoring information obtained by monitoring the channel state of the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value indicates that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and perform channel aggregation on the primary channel and the N secondary channels. Optionally, the load report further includes the end time of the t-th time period.

In a possible design, the channel state monitoring information obtained by the processing unit 1010 by monitoring the channel state of the primary channel and the M-1 secondary channels in the t-th time period includes one or more of the following: the busy/idle state, in each time unit, of the primary channel and of each of the M-1 secondary channels, as monitored by the processing unit 1010 in the t-th time period; the data packet sending state of the communication apparatus, in each time unit, on the primary channel and on each of the M-1 secondary channels, as monitored by the processing unit 1010 in the t-th time period; and the number of consecutive time units, as monitored by the processing unit 1010 in the t-th time period, during which the data packet sending state of the communication apparatus and the busy/idle state of the channel both remain unchanged on the primary channel and on each of the M-1 secondary channels.
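The third monitoring item, the number of consecutive time units during which both states stay unchanged, can be sketched for a single channel as below. This rests on an interpretation that the text does not spell out: the run is assumed to be counted backward from the most recent time unit. The function name `unchanged_run_length` and the 0/1 encoding of the states are likewise hypothetical.

```python
def unchanged_run_length(busy_states, sending_states):
    """Count trailing time units in which both states equal their latest values.

    busy_states    -- per-time-unit busy/idle state of one channel (e.g. 1/0)
    sending_states -- per-time-unit packet sending state on that channel
    """
    if len(busy_states) != len(sending_states):
        raise ValueError("both sequences must cover the same time units")
    pairs = list(zip(busy_states, sending_states))
    if not pairs:
        return 0
    latest = pairs[-1]
    run = 0
    # Walk backward until either state differs from its latest value.
    for pair in reversed(pairs):
        if pair != latest:
            break
        run += 1
    return run
```

An implementation would evaluate this once per channel (the primary channel and each of the M-1 secondary channels) to obtain one count per channel.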

In a possible design, the processing unit 1010 is further configured to: determine, based on the load information of the primary channel and of each of N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel; determine, based on the channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and the configured state-action value function, a first state-action value of performing, based on the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value; determine a second state-action value based on the channel environment information of the (t-1)-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the configured state-action value function, where the $2^{M-1}-1$ candidate channel aggregation indication values correspond to the $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values of performing, based on the channel environment information of the (t-1)-th time period, each of the candidate channel aggregation modes corresponding to the $2^{M-1}-1$ candidate channel aggregation indication values; determine the loss of the channel aggregation model based on the first state-action value, the second state-action value, and the reward value of the (t-1)-th channel aggregation indication value; and train and update the channel aggregation model based on the loss of the channel aggregation model. N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period.

In a possible design, the processing unit 1010 is further configured to determine, based on the load information of the primary channel and of each of N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, where the (t-1)-th channel aggregation indication value indicates that N' of the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period. When inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, the processing unit 1010 is specifically configured to input the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing, to obtain the t-th channel aggregation indication value.

In one possible implementation, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit 1010 is specifically configured to: when data packets sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels do not collide with data packets sent by other terminal devices, and N' is not zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula omitted in source]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, K denotes the K-th secondary channel among the N' secondary channels, K = 1, 2, ..., N', and [symbol omitted in source] denotes the load information of the K-th secondary channel in the (t-1)-th time period.

In another possible implementation, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit 1010 is specifically configured to: when data packets sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels do not collide with data packets sent by other terminal devices, and N' is zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula omitted in source]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and [symbol omitted in source] denotes the load information of the primary channel in the (t-1)-th time period.

In yet another possible implementation, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit 1010 is specifically configured to: when data packets sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels collide with data packets sent by other terminal devices, and N' is not zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula omitted in source]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, K denotes the K-th secondary channel among the N' secondary channels, K = 1, 2, ..., N', [symbol omitted in source] denotes the load information of the K-th secondary channel in the (t-1)-th time period, and [symbol omitted in source] denotes the load information of the primary channel in the (t-1)-th time period.

In still another possible implementation, when determining, based on the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, the processing unit 1010 is specifically configured to: when data packets sent by the interface unit 1020 on the channel obtained by aggregating the primary channel and the N' secondary channels collide with data packets sent by other terminal devices, and N' is zero, determine the reward value of the (t-1)-th channel aggregation indication value according to [formula omitted in source]; where $R_t$ denotes the reward value of the (t-1)-th channel aggregation indication value obtained based on the channel aggregation model, and [symbol omitted in source] denotes the load information of the primary channel in the (t-1)-th time period.

In a possible design, the processing unit 1010 is further configured to determine, based on the load information of the primary channel and of each of the N secondary channels in the t-th time period, the reward value of the t-th channel aggregation indication value obtained based on the channel aggregation model.

一种可能的实现中，处理单元1010根据主信道和N个次信道中每个次信道在第t个时间段的负载信息，确定基于信道聚合模型得到第t信道聚合指示值的奖励值时，具体用于：当接口单元1020在主信道和N个次信道聚合后的信道发送数据包未与其他终端设备发送数据包发生碰撞、且N不为零时，处理单元1010根据确定基于信道聚合模型得到第t信道聚合指示值的奖励值。In one possible implementation, the processing unit 1010 determines the reward value of the t-th channel aggregation indication value based on the channel aggregation model based on the load information of each of the primary channel and N secondary channels in the t-th time period, Specifically used: when the data packet sent by the interface unit 1020 on the channel after aggregation of the main channel and N secondary channels does not collide with the data packet sent by other terminal devices, and N is not zero, the processing unit 1010 Determine the reward value based on the channel aggregation model to obtain the t-th channel aggregation indication value.

另一种可能的实现中，处理单元1010根据主信道和N个次信道中每个次信道在第t个时间段的负载信息，确定基于信道聚合模型得到第t信道聚合指示值的奖励值时，具体用于：当接口单元1020在主信道和N个次信道聚合后的信道发送数据包未与其他终端设备发送数据包发生碰撞、且N为零时，处理单元1010根据确定基于信道聚合模型得到第t信道聚合指示值的奖励值。In another possible implementation, the processing unit 1010 determines when the reward value of the t-th channel aggregation indication value is obtained based on the channel aggregation model based on the load information of each of the primary channel and N secondary channels in the t-th time period. , specifically used for: when the data packet sent by the interface unit 1020 on the channel after aggregation of the primary channel and N secondary channels does not collide with the data packet sent by other terminal devices, and N is zero, the processing unit 1010 Determine the reward value based on the channel aggregation model to obtain the t-th channel aggregation indication value.

再一种可能的实现中，处理单元1010根据主信道和N个次信道中每个次信道在第t个时间段的负载信息，确定基于信道聚合模型得到第t信道聚合指示值的奖励值时，具体用于：当接口单元1020在主信道和N个次信道聚合后的信道发送数据包与其他终端设备发送数据包发生碰撞、且N不为零时，处理单元1010根据确定基于信道聚合模型得到第t信道聚合指示值的奖励值。In another possible implementation, the processing unit 1010 determines when the reward value of the t-th channel aggregation indication value is obtained based on the channel aggregation model based on the load information of each of the primary channel and N secondary channels in the t-th time period. , specifically used for: when the data packet sent by the interface unit 1020 on the channel aggregated between the primary channel and N secondary channels collides with the data packet sent by other terminal devices, and N is not zero, the processing unit 1010 Determine the reward value based on the channel aggregation model to obtain the t-th channel aggregation indication value.

In still another possible implementation, when determining, according to the load information of the primary channel and of each of the N secondary channels in the t-th time period, the reward value for obtaining the t-th channel aggregation indication value based on the channel aggregation model, the processing unit 1010 is specifically configured to: when a data packet sent by the interface unit 1020 on the channel aggregated from the primary channel and the N secondary channels collides with a data packet sent by another terminal device, and N is zero, determine, according to [formula], the reward value for obtaining the t-th channel aggregation indication value based on the channel aggregation model.

In the above designs, $R_{t+1}$ denotes the reward value for obtaining the t-th channel aggregation indication value based on the channel aggregation model; K denotes the K-th secondary channel among the N secondary channels, K = 1, 2, …, N; one load term denotes the load information of the K-th secondary channel in the t-th time period, and another load term denotes the load information of the primary channel in the t-th time period.
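The four implementations above differ only in two conditions: whether the transmission collided, and whether N is zero. The reward formulas themselves are not reproduced in this text, so the following minimal Python sketch only shows how the applicable case might be selected. The names `RewardCase` and `select_reward_case` are illustrative assumptions and do not appear in the application.

```python
from enum import Enum


class RewardCase(Enum):
    """The four reward cases, keyed by collision outcome and whether N is zero."""
    NO_COLLISION_AGGREGATED = "no collision, N != 0"
    NO_COLLISION_PRIMARY_ONLY = "no collision, N == 0"
    COLLISION_AGGREGATED = "collision, N != 0"
    COLLISION_PRIMARY_ONLY = "collision, N == 0"


def select_reward_case(collided: bool, n_secondary: int) -> RewardCase:
    """Pick which of the four reward designs applies (hypothetical helper).

    collided:    True if the packet sent on the aggregated channel collided
                 with a packet sent by another terminal device.
    n_secondary: N, the number of secondary channels aggregated with the
                 primary channel.
    """
    if n_secondary < 0:
        raise ValueError("N must be a non-negative integer")
    if collided:
        return (RewardCase.COLLISION_AGGREGATED if n_secondary > 0
                else RewardCase.COLLISION_PRIMARY_ONLY)
    return (RewardCase.NO_COLLISION_AGGREGATED if n_secondary > 0
            else RewardCase.NO_COLLISION_PRIMARY_ONLY)
```

Each case would then be evaluated with its own formula over the load information of the primary channel and, where N is not zero, of the N aggregated secondary channels.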

In one possible design, the processing unit 1010 is further configured to: determine, according to the channel environment information of the t-th time period, the t-th channel aggregation indication value, and a set state-action value function, a first state-action value of performing, based on the channel environment information of the t-th time period, the channel aggregation mode corresponding to the t-th channel aggregation indication value; determine a second state-action value according to the channel environment information of the t-th time period, the $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, where the $2^{M-1}-1$ candidate channel aggregation indication values correspond to the $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values of performing, based on the channel environment information of the t-th time period, the candidate channel aggregation modes respectively corresponding to the $2^{M-1}-1$ candidate channel aggregation indication values; determine the loss of the channel aggregation model according to the first state-action value, the second state-action value, and the reward value for obtaining the t-th channel aggregation indication value based on the channel aggregation model; and train and update the channel aggregation model according to the loss of the channel aggregation model.
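The loss computation described above can be sketched in Python as follows. This is a hedged illustration rather than the application's implementation: the squared temporal-difference form of the loss, the discount factor `gamma`, and the reading of the $2^{M-1}-1$ candidates as the non-empty subsets of the M-1 secondary channels are all assumptions, and `candidate_actions` and `td_loss` are hypothetical names. Following the paragraph above, both state-action values are evaluated on the same channel environment information.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence

# An action is the set of secondary-channel indices (1..M-1) aggregated
# with the primary channel. This encoding is an assumption.
Action = FrozenSet[int]
QFunction = Callable[[Sequence[float], Action], float]


def candidate_actions(m: int) -> List[Action]:
    """Enumerate the non-empty subsets of the M-1 secondary channels.

    There are 2**(m-1) - 1 of them, matching the candidate count in the text
    (assumed interpretation).
    """
    secondary = range(1, m)
    return [frozenset(c) for r in range(1, m) for c in combinations(secondary, r)]


def td_loss(q: QFunction, state: Sequence[float], action: Action,
            reward: float, m: int, gamma: float = 0.9) -> float:
    """Squared TD error built from the first and second state-action values."""
    first = q(state, action)                                   # first state-action value
    second = max(q(state, a) for a in candidate_actions(m))    # second state-action value
    return (reward + gamma * second - first) ** 2              # assumed loss form


# Toy state-action value function for illustration only: value = subset size.
toy_q: QFunction = lambda state, action: float(len(action))
loss = td_loss(toy_q, state=(0.5, 0.2, 0.1), action=frozenset({1}),
               reward=1.0, m=3, gamma=0.5)
```

In a trained model, `q` would be the neural network's state-action value output, and the loss would be minimized by gradient descent to update the channel aggregation model.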

As shown in Figure 11, this application further provides a communication device 1100, including a processor 1110 and an interface circuit 1120. The processor 1110 and the interface circuit 1120 are coupled to each other. It can be understood that the interface circuit 1120 may be a transceiver, an input/output interface, an input interface, an output interface, a communication interface, or the like. Optionally, the communication device 1100 may further include a memory 1130, configured to store instructions executed by the processor 1110, input data required by the processor 1110 to run the instructions, or data generated after the processor 1110 runs the instructions. Optionally, the memory 1130 may be integrated with the processor 1110.

When the communication device 1100 is used to implement the method shown in Figure 6, the processor 1110 may be used to implement the functions of the processing unit 1010 described above, and the interface circuit 1120 may be used to implement the functions of the interface unit 1020 described above.

Figure 12 is a schematic structural diagram of a device provided in an embodiment of this application. The device may be a network device or the first terminal device, and may include a processor, a transceiver, and an antenna. The processor may include one or more processing units; different processing units may be independent components or may be integrated into one or more processors. The processor may be the nerve center and command center of the device. The processor may generate operation control signals according to instruction opcodes and timing signals, to control instruction fetching and instruction execution. In the embodiments of this application, the processor may execute the corresponding channel aggregation method procedure according to the instructions corresponding to the channel aggregation method; the transceiver and the antenna may receive signals from other devices and pass them to the processor, or send signals from the processor to other devices.

In addition, the device may further include a neural-network processing unit (NPU). The NPU trains and updates the channel aggregation model (that is, the neural network model) and, according to the information input into the channel aggregation model, computes and outputs a channel aggregation mode (or the channel aggregation indication value corresponding to the channel aggregation mode). It can be understood that the NPU may contain an inference module and a training module. The training module may be used to train and update the channel aggregation model (that is, the neural network model). The inference module may compute and output the channel aggregation mode according to the information input into the channel aggregation model. In addition, the NPU may be coupled in the central processing unit, which is not limited in this application.
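The relationship between a channel aggregation mode and its channel aggregation indication value can be illustrated with a short Python sketch. The bitmask encoding used here (bit K-1 set when the K-th secondary channel is aggregated with the primary channel) is purely an assumption for illustration; the application does not specify the encoding in this passage, and the function names are hypothetical.

```python
from typing import Iterable, List


def mode_to_indication(secondary: Iterable[int], m: int) -> int:
    """Encode a set of secondary-channel indices (1..M-1) as an integer bitmask."""
    value = 0
    for k in secondary:
        if not 1 <= k <= m - 1:
            raise ValueError(f"secondary channel index {k} out of range 1..{m - 1}")
        value |= 1 << (k - 1)  # bit K-1 marks the K-th secondary channel
    return value


def indication_to_mode(value: int, m: int) -> List[int]:
    """Decode a bitmask back into the sorted list of aggregated secondary channels."""
    if not 0 <= value < (1 << (m - 1)):
        raise ValueError("indication value out of range for M channels")
    return [k for k in range(1, m) if value & (1 << (k - 1))]
```

Under this assumed encoding, a value of 0 corresponds to N = 0, that is, sending on the primary channel alone.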

It can be understood that the processor in the embodiments of this application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a logic circuit, a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor.

The method steps in the embodiments of this application may be implemented by hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, and the software modules may be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to the storage medium. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a network device or a terminal device. The processor and the storage medium may also exist as discrete components in a network device or a terminal device.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented using software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are executed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer programs or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer programs or instructions may be transmitted from one network device, terminal, computer, server, or data center to another network device, terminal, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium, for example, a floppy disk, a hard disk, or a magnetic tape; an optical medium, for example, a digital video disc; or a semiconductor medium, for example, a solid-state drive. The computer-readable storage medium may be a volatile or non-volatile storage medium, or may include both volatile and non-volatile types of storage media.

In the embodiments of this application, unless otherwise specified or logically inconsistent, the terms and/or descriptions in different embodiments are consistent and may be cross-referenced, and the technical features in different embodiments may be combined, according to their inherent logical relationships, to form new embodiments.

In addition, it should be understood that in the embodiments of this application, the word "exemplary" is used to mean serving as an example, illustration, or explanation. Any embodiment or design described in this application as an "example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the word "example" is used to present a concept in a concrete manner.

It can be understood that the various numerals used in the embodiments of this application are merely for ease of description and are not intended to limit the scope of the embodiments of this application. The sequence numbers of the above processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and inherent logic.

Claims

1. A method of channel aggregation, comprising:

a first terminal device receives a load report from a network device, wherein the load report comprises load information of each of M channels of the network device in a t-th time period, the M channels comprise one primary channel corresponding to the first terminal device and M-1 secondary channels, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2;

the first terminal device inputs channel environment information of the t-th time period into a channel aggregation model for processing, to obtain a t-th channel aggregation indication value, wherein the channel environment information of the t-th time period comprises the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, and channel state monitoring information obtained by the first terminal device by monitoring channel states of the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value is used to indicate that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and

the first terminal device sends a data packet in a (t+1)-th time period through the channel aggregated from the primary channel and the N secondary channels, wherein the (t+1)-th time period is a time period after the t-th time period.

2. The method of claim 1, wherein the channel state monitoring information obtained by the first terminal device by monitoring channel states of the primary channel and the M-1 secondary channels in the t-th time period comprises one or more of:

a busy/idle state, monitored by the first terminal device in the t-th time period, of each of the primary channel and the M-1 secondary channels in each time unit;

a data packet sending state, monitored by the first terminal device in the t-th time period, of the first terminal device in each time unit on each of the primary channel and the M-1 secondary channels; and

the number of consecutive time units, monitored by the first terminal device in the t-th time period, for which the data packet sending state of the first terminal device and the busy/idle state of the channel remain unchanged on each of the primary channel and the M-1 secondary channels.

3. The method of claim 1 or 2, wherein the method further comprises:

the first terminal device determines, according to load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for obtaining a (t-1)-th channel aggregation indication value based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value is used to indicate that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel;

the first terminal device determines, according to channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and a set state-action value function, a first state-action value of performing, based on the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value;

the first terminal device determines a second state-action value according to the channel environment information of the (t-1)-th time period, $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, wherein the $2^{M-1}-1$ candidate channel aggregation indication values correspond to $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values of performing, based on the channel environment information of the (t-1)-th time period, the candidate channel aggregation modes respectively corresponding to the $2^{M-1}-1$ candidate channel aggregation indication values;

the first terminal device determines a loss of the channel aggregation model according to the first state-action value, the second state-action value, and the reward value of the (t-1)-th channel aggregation indication value; and

the first terminal device trains and updates the channel aggregation model according to the loss of the channel aggregation model;

wherein N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period.

4. The method of claim 1 or 2, wherein the method further comprises:

the first terminal device determines, according to load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for obtaining a (t-1)-th channel aggregation indication value based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value is used to indicate that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period;

wherein the first terminal device inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value comprises:

the first terminal device inputs the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing, to obtain the t-th channel aggregation indication value.

5. The method according to claim 3 or 4, wherein the first terminal device determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model comprises:

when a data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is not zero, the first terminal device determines, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model;

wherein $R_t$ denotes the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, K denotes the K-th secondary channel of the N' secondary channels, K = 1, 2, …, N', and the load term in the formula denotes the load information of the K-th secondary channel in the (t-1)-th time period.

6. The method according to claim 3 or 4, wherein the first terminal device determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model comprises:

when a data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is zero, the first terminal device determines, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model;

wherein $R_t$ denotes the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, and the load term in the formula denotes the load information of the primary channel in the (t-1)-th time period.

7. The method according to claim 3 or 4, wherein the first terminal device determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model comprises:

when a data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero, the first terminal device determines, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model;

wherein $R_t$ denotes the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, K denotes the K-th secondary channel of the N' secondary channels, K = 1, 2, …, N', one load term in the formula denotes the load information of the K-th secondary channel in the (t-1)-th time period, and another load term denotes the load information of the primary channel in the (t-1)-th time period.

8. The method according to claim 3 or 4, wherein the first terminal device determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model comprises:

when a data packet sent by the first terminal device on the channel aggregated from the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero, the first terminal device determines, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model;

wherein $R_t$ denotes the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, and the load term in the formula denotes the load information of the primary channel in the (t-1)-th time period.

9. The method of any one of claims 1-8, wherein the load report further comprises an expiration time of the t-th time period.

10. A communication device, comprising an interface unit and a processing unit;

the interface unit is configured to receive a load report from a network device, wherein the load report comprises load information of each of M channels of the network device in a t-th time period, the M channels comprise one primary channel corresponding to the first terminal device and M-1 secondary channels, M is an integer greater than or equal to 2, and t is an integer greater than or equal to 2;

the processing unit is configured to input channel environment information of the t-th time period into a channel aggregation model for processing, to obtain a t-th channel aggregation indication value, wherein the channel environment information of the t-th time period comprises the load information of the primary channel and of each of the M-1 secondary channels in the t-th time period, and channel state monitoring information obtained by monitoring channel states of the primary channel and the M-1 secondary channels in the t-th time period, the t-th channel aggregation indication value is used to indicate that N of the M-1 secondary channels are aggregated with the primary channel, and N is an integer greater than or equal to 0 and less than or equal to M-1; and a data packet is sent in a (t+1)-th time period through the channel aggregated from the primary channel and the N secondary channels, wherein the (t+1)-th time period is a time period after the t-th time period.

11. The apparatus of claim 10, wherein the channel state monitoring information obtained by the processing unit by monitoring channel states of the primary channel and the M-1 secondary channels in the t-th time period comprises one or more of:

a busy/idle state, monitored by the processing unit in the t-th time period, of each of the primary channel and the M-1 secondary channels in each time unit;

a data packet sending state, monitored by the processing unit in the t-th time period, of the communication device in each time unit on each of the primary channel and the M-1 secondary channels; and

the number of consecutive time units, monitored by the processing unit in the t-th time period, for which the data packet sending state of the communication device and the busy/idle state of the channel remain unchanged on each of the primary channel and the M-1 secondary channels.

12. The apparatus of claim 10 or 11, wherein the processing unit is further configured to:

determine, according to load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for obtaining a (t-1)-th channel aggregation indication value based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value is used to indicate that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel;

determine, according to channel environment information of the (t-1)-th time period, the (t-1)-th channel aggregation indication value, and a set state-action value function, a first state-action value of performing, based on the channel environment information of the (t-1)-th time period, the channel aggregation mode corresponding to the (t-1)-th channel aggregation indication value;

determine a second state-action value according to the channel environment information of the (t-1)-th time period, $2^{M-1}-1$ candidate channel aggregation indication values corresponding to the primary channel and the M-1 secondary channels, and the set state-action value function, wherein the $2^{M-1}-1$ candidate channel aggregation indication values correspond to $2^{M-1}-1$ candidate channel aggregation modes of the primary channel and the M-1 secondary channels, and the second state-action value is the maximum among the state-action values of performing, based on the channel environment information of the (t-1)-th time period, the candidate channel aggregation modes respectively corresponding to the $2^{M-1}-1$ candidate channel aggregation indication values; and

determine a loss of the channel aggregation model according to the first state-action value, the second state-action value, and the reward value of the (t-1)-th channel aggregation indication value, and train and update the channel aggregation model according to the loss of the channel aggregation model.

13. The apparatus of claim 10 or 11, wherein the processing unit is further configured to:

determine, according to load information of the primary channel and of each of N' secondary channels in a (t-1)-th time period, a reward value for obtaining a (t-1)-th channel aggregation indication value based on the channel aggregation model, wherein the (t-1)-th channel aggregation indication value is used to indicate that the N' secondary channels of the M-1 secondary channels are aggregated with the primary channel, N' is the same as or different from N, and the (t-1)-th time period is a time period before the t-th time period; and

when inputting the channel environment information of the t-th time period into the channel aggregation model for processing to obtain the t-th channel aggregation indication value, input the channel environment information of the t-th time period and the reward value of the (t-1)-th channel aggregation indication value into the channel aggregation model for processing, to obtain the t-th channel aggregation indication value.

14. The apparatus according to claim 12 or 13, wherein, when determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, the processing unit is specifically configured to:

when a data packet sent by the interface unit on the channel aggregated from the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is not zero, determine, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model.

15. The apparatus according to claim 12 or 13, wherein, when determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, the processing unit is specifically configured to:

when a data packet sent by the interface unit on the channel aggregated from the primary channel and the N' secondary channels does not collide with a data packet sent by another terminal device, and N' is zero, determine, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model.

16. The apparatus according to claim 12 or 13, wherein, when determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, the processing unit is specifically configured to:

when a data packet sent by the interface unit on the channel aggregated from the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is not zero, determine, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model;

wherein $R_t$ denotes the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, K denotes the K-th secondary channel of the N' secondary channels, K = 1, 2, …, N', one load term in the formula denotes the load information of the K-th secondary channel in the (t-1)-th time period, and another load term denotes the load information of the primary channel in the (t-1)-th time period.

17. The apparatus according to claim 12 or 13, wherein, when determining, according to the load information of the primary channel and of each of the N' secondary channels in the (t-1)-th time period, the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model, the processing unit is specifically configured to:

when a data packet sent by the interface unit on the channel aggregated from the primary channel and the N' secondary channels collides with a data packet sent by another terminal device, and N' is zero, determine, according to [formula], the reward value for obtaining the (t-1)-th channel aggregation indication value based on the channel aggregation model.

18. The apparatus of any one of claims 10-17, wherein the load report further comprises an expiration time of the t-th time period.

19. A computer program product comprising instructions which, when executed, cause the method of any one of claims 1-9 to be implemented.

20. A chip for implementing the method according to any one of claims 1-9.

21. A computer readable storage medium, characterized in that the storage medium has stored therein a computer program or instructions, which when executed, cause the method according to any of claims 1-9 to be implemented.