WO2023036323A1 - Communication method and apparatus - Google Patents

Communication method and apparatus

Info

Publication number
WO2023036323A1
WO2023036323A1 · PCT/CN2022/118269 · CN2022118269W
Authority
WO
WIPO (PCT)
Prior art keywords
model
sub
output
data
input
Prior art date
Application number
PCT/CN2022/118269
Other languages
English (en)
French (fr)
Inventor
柴晓萌 (Chai Xiaomeng)
吴艺群 (Wu Yiqun)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to EP22866784.6A, published as EP4391474A1
Publication of WO2023036323A1
Priority to US18/598,574, published as US20240211769A1

Links

Images

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/16: Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/098: Distributed learning, e.g. federated learning
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00: Reducing energy consumption in communication networks
    • Y02D 30/70: Reducing energy consumption in communication networks in wireless communication networks

Definitions

  • the present application relates to the technical field of communication, and in particular to a communication method and device.
  • In a wireless communication network, such as a mobile communication network, the services supported by the network are becoming more and more diverse, and therefore the requirements to be met are becoming more and more diverse.
  • For example, the network needs to be able to support ultra-high speed, ultra-low latency, and/or ultra-large-scale connections. These features make network planning, network configuration, and/or resource scheduling increasingly complex.
  • In addition, the functions of the network are becoming more and more powerful, for example supporting higher and higher spectrum, supporting high-order multiple-input multiple-output (MIMO) technology, supporting beamforming, and/or supporting new technologies such as beam management, which makes network energy saving a hot research topic.
  • These new requirements, new scenarios, and new features have brought unprecedented challenges to network planning, O&M, and efficient operations.
  • artificial intelligence technology can be introduced into the wireless communication network to realize network intelligence. Based on this, how to effectively implement artificial intelligence in the network is a problem worth studying.
  • the present disclosure provides a communication method and device for reducing transmission overhead and improving communication security.
  • the present disclosure provides a communication method, including: determining a first sub-model and a second sub-model, where the first sub-model and the second sub-model can be matched and used; and sending first information, where the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model, or the first information is used to indicate the input data of the first sub-model and the label data of the first sub-model.
  • the label data of the first sub-model is the expected value or target value of the output data of the first sub-model, which can be understood as the expected output data of the first sub-model; the label data of the first sub-model can also be described as the output label of the first sub-model.
  • the input data and/or output data of one of multiple sub-models that can be matched can be used to independently train a sub-model that has the same function as that sub-model or that can be matched with it; likewise, the input data and label data of one of the matched sub-models can be used to independently train a sub-model with the same function as that sub-model. There is no need to transmit the sub-model over the air interface, which can reduce transmission overhead and improve communication security.
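The independent-training idea above can be sketched in a few lines: one side exposes only (input, output) pairs of its first sub-model as the "first information", and the other side fits a surrogate third sub-model to those pairs. The linear surrogate, the toy data, and all function names here are illustrative assumptions, not the patent's actual method.

```python
def first_sub_model(x):
    """Stands in for the peer's first sub-model; only its I/O pairs are shared."""
    return 2.0 * x + 1.0

# "First information": input data and the corresponding output (or label) data.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
outputs = [first_sub_model(x) for x in inputs]

# Independently fit a surrogate y = w*x + b to the shared pairs by gradient descent.
w, b = 0.0, 0.0
lr = 0.05
for _ in range(2000):
    grad_w = grad_b = 0.0
    for x, y in zip(inputs, outputs):
        err = (w * x + b) - y
        grad_w += 2.0 * err * x / len(inputs)
        grad_b += 2.0 * err / len(inputs)
    w -= lr * grad_w
    b -= lr * grad_b

def third_sub_model(x):
    """Surrogate with the same I/O behaviour; the model itself was never sent."""
    return w * x + b
```

Only the data pairs cross the air interface in this sketch; the surrogate's parameters are recovered locally, which is the source of the overhead and security benefit claimed above.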
  • the output of the first sub-model is used to determine the input of the second sub-model; or, the output of the second sub-model is used to determine the input of the first sub-model.
  • Such a design can realize matching among multiple sub-models.
  • the first sub-model is used to send information at the sending end, and the second sub-model is used to receive the information at the receiving end; or, the second sub-model is used to send information at the sending end, and the first sub-model is used to receive the information at the receiving end.
  • Such a design can be used in scenarios such as using a model to compress/modulate information, which can reduce the overhead of information transmission.
  • the first sub-model and the second sub-model belong to a bilateral model.
  • This design is suitable for scenarios where bilateral models need to be deployed, which can reduce the overhead of sending sub-models in bilateral models, avoid leakage of bilateral model algorithms, and improve communication security.
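As a concrete (and deliberately simplified) picture of a bilateral model, the pair below uses a uniform 4-bit quantizer as the first sub-model at the sending end and its matched inverse as the second sub-model at the receiving end. In the patent these would be the two halves of a trained neural model; the quantizer is purely an illustrative assumption.

```python
N_BITS = 4                      # length of the feature-bit string
STEP = 1.0 / (2 ** N_BITS)

def first_sub_model(channel_value):
    """Sending end: compress channel information (a value in [0, 1)) into feature bits."""
    level = min(int(channel_value / STEP), 2 ** N_BITS - 1)
    return format(level, f"0{N_BITS}b")

def second_sub_model(feature_bits):
    """Receiving end: recover an estimate of the channel information from the feature bits."""
    return (int(feature_bits, 2) + 0.5) * STEP
```

Only the feature bits are transmitted; the matched pair guarantees the receiver can invert them, which is what "matched and used" means above.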
  • the first information is used for training the third sub-model.
  • the function of the third sub-model is the same as that of the first sub-model; and/or, the input type of the third sub-model is the same as the input type of the first sub-model and the output type of the third sub-model is the same as the output type of the first sub-model; and/or, the dimensions of the input data of the third sub-model are the same as the dimensions of the input data of the first sub-model and the dimensions of the output data of the third sub-model are the same as the dimensions of the output data of the first sub-model; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output of the first sub-model is smaller than the first threshold; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output label of the first sub-model is smaller than a threshold.
  • the third sub-model and the second sub-model can also form a new bilateral model.
  • the function of the third sub-model is the same as the function of the second sub-model; and/or, the input type of the third sub-model is the same as the input type of the second sub-model and the output type of the third sub-model is the same as the output type of the second sub-model; and/or, the dimensions of the input data of the third sub-model are the same as the dimensions of the input data of the second sub-model and the dimensions of the output data of the third sub-model are the same as the dimensions of the output data of the second sub-model; and/or, when the input of the third sub-model is the same as the input of the second sub-model, the difference between the output of the third sub-model and the output of the second sub-model is less than the first threshold; and/or, when the input of the third sub-model is the same as the input of the second sub-model, the difference between the output of the third sub-model and the output label of the second sub-model is less than a threshold.
  • the third sub-model that replaces the second sub-model can be trained independently, so that the third sub-model can be matched with the first sub-model, and the transmission overhead of sending the second sub-model can be reduced.
  • the third sub-model and the first sub-model can also form a new bilateral model.
  • the determining of the first sub-model and the second sub-model includes: determining the first sub-model and the second sub-model according to training data, where the training data includes N pieces of channel information, N is a positive integer, and the channel information includes downlink channel characteristics or downlink channels.
  • the input data of the first sub-model includes M pieces of channel information, where M is a positive integer.
  • the output data of the first submodel includes feature bits corresponding to M pieces of channel information, where M is a positive integer.
  • when the input data of the first sub-model includes M pieces of channel information, the label data of the first sub-model includes M feature bits, where M is a positive integer.
  • the method further includes: acquiring information indicating a first feature bit, where the output of the third sub-model includes the first feature bit; and obtaining first channel information according to the second sub-model and the first feature bit, where the input of the second sub-model includes the first feature bit and the output of the second sub-model includes the first channel information.
  • the sending end uses the third sub-model, trained independently according to the input and output of the first sub-model, to send feature bits, and the receiving end can use the second sub-model that matches the first sub-model to recover the channel information. There is no need to transmit sub-models over the air interface, which can reduce transmission overhead and improve communication security.
  • the input data of the first sub-model includes M feature bits, where M is a positive integer.
  • the output data of the first sub-model includes channel information corresponding to M feature bits, where M is a positive integer.
  • when the input data of the first sub-model includes M feature bits, the label data of the first sub-model includes channel information corresponding to the M feature bits, where M is a positive integer.
  • the method further includes: determining a second feature bit according to second channel information and the second sub-model, where the input of the second sub-model includes the second channel information and the output of the second sub-model includes the second feature bit; and sending information indicating the second feature bit.
  • the sending end uses the second sub-model matching the first sub-model to send feature bits, and the receiving end can use the third sub-model, independently trained according to the input and output of the first sub-model, to recover the channel information. There is no need to transmit sub-models over the air interface, which can reduce transmission overhead and improve communication security.
  • the present disclosure provides a communication method, including: acquiring first information, where the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model, or the first information is used to indicate the input data of the first sub-model and the label data of the first sub-model; and training a third sub-model according to the first information.
  • the method further includes: determining a first feature bit according to third channel information and the third sub-model, where the input of the third sub-model includes the third channel information and the output of the third sub-model includes the first feature bit; and sending information indicating the first feature bit.
  • the method further includes: acquiring information indicating a second feature bit; obtaining fourth channel information according to the third submodel and the second feature bit; wherein, the The input of the third sub-model includes the second feature bit, and the output of the third sub-model includes the fourth channel information.
  • the present disclosure provides a communication device.
  • the communication device may be a first network element, or a device in the first network element, or a device that can be matched and used with the first network element.
  • the first network element may be an access network device or a terminal device.
  • the communication device may include modules that correspond one-to-one to the methods/operations/steps/actions described in the first aspect.
  • the module may be implemented by a hardware circuit, or software, or a combination of a hardware circuit and software.
  • the communication device may include a processing module and a communication module.
  • the processing module is used to determine the first sub-model and the second sub-model, and the first sub-model and the second sub-model can be matched and used.
  • a communication module configured to send first information, where the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model; or, the first information is used to indicate Input data for the first sub-model and label data for the first sub-model.
  • the communication module is further configured to obtain information indicating a first feature bit, where the output of the third sub-model includes the first feature bit; the processing module is further configured to obtain first channel information according to the second sub-model and the first feature bit, where the input of the second sub-model includes the first feature bit and the output of the second sub-model includes the first channel information.
  • the processing module is further configured to determine a second feature bit according to the second channel information and the second sub-model; wherein, the input of the second sub-model includes the second Channel information, the output of the second sub-model includes the second feature bit; the communication module is further configured to send information indicating the second feature bit.
  • the present disclosure provides a communication device.
  • the communication device may be a second network element, or a device in the second network element, or a device that can be matched and used with the second network element.
  • the second network element may be an access network device or a terminal device.
  • the communication device may include modules that correspond one-to-one to the methods/operations/steps/actions described in the second aspect.
  • the module may be a hardware circuit, or software, or a combination of hardware circuit and software.
  • the communication device may include a processing module and a communication module.
  • a communication module configured to obtain first information, the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model; or, the first information is used to indicate the Input data for a first submodel and label data for the first submodel.
  • a processing module configured to train a third sub-model according to the first information.
  • the processing module is further configured to determine the first feature bit according to the third channel information and the third sub-model; wherein, the input of the third sub-model includes the third Channel information, the output of the third sub-model includes the first feature bit.
  • the communication module is further configured to send information indicating the first feature bit.
  • the communication module is further configured to obtain information indicating a second feature bit; the processing module is further configured to obtain fourth channel information according to the third sub-model and the second feature bit, where the input of the third sub-model includes the second feature bit and the output of the third sub-model includes the fourth channel information.
  • the present disclosure provides a communication device, where the communication device includes a processor, configured to implement the method described in the first aspect above.
  • the communication device may also include memory for storing instructions and data.
  • the memory is coupled to the processor, and when the processor executes the instructions stored in the memory, the method described in the first aspect above can be implemented.
  • the communication device may also include a communication interface, which is used for the device to communicate with other devices.
  • the communication interface may be a transceiver, circuit, bus, module, or another type of communication interface, and the other device may be an access network device.
  • the communication device includes:
  • a processor is configured to determine a first sub-model and a second sub-model, and the first sub-model and the second sub-model can be matched and used.
  • the processor is further configured to use a communication interface to send first information, where the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model, or the first information is used to indicate the input data of the first sub-model and the label data of the first sub-model.
  • the present disclosure provides a communication device, where the communication device includes a processor, configured to implement the method described in the second aspect above.
  • the communication device may also include memory for storing instructions and data.
  • the memory is coupled to the processor, and when the processor executes the instructions stored in the memory, the method described in the second aspect above can be implemented.
  • the device may also include a communication interface, which is used for the device to communicate with other devices.
  • the communication interface may be a transceiver, circuit, bus, module, or another type of communication interface, and the other device may be a terminal device.
  • the device includes:
  • a processor configured to use a communication interface to acquire first information, where the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model, or the first information is used to indicate the input data of the first sub-model and the label data of the first sub-model.
  • the processor is further configured to train a third sub-model according to the first information.
  • the present disclosure provides a communication system, including the communication device described in the third aspect or the fifth aspect, and the communication device described in the fourth aspect or the sixth aspect.
  • the present disclosure further provides a computer program, which, when the computer program is run on a computer, causes the computer to execute the method provided in any one of the first aspect or the second aspect above.
  • the present disclosure further provides a computer program product, including instructions, which, when run on a computer, cause the computer to execute the method provided in any one of the first aspect or the second aspect above.
  • the present disclosure also provides a computer-readable storage medium, where a computer program or instruction is stored in the computer-readable storage medium, and when the computer program or instruction is run on a computer, the computer executes The method provided in the first aspect or the second aspect above.
  • the present disclosure further provides a chip, which is used to read a computer program stored in a memory, and execute the method provided in the first aspect or the second aspect above.
  • the present disclosure further provides a chip system, which includes a processor, configured to support a computer device to implement the method provided in the first aspect or the second aspect above.
  • the chip system further includes a memory, and the memory is used to store necessary programs and data of the computer device.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • FIG. 1A is a schematic structural diagram of a communication system
  • FIG. 1B is a schematic structural diagram of another communication system
  • Figure 2A is a schematic diagram of a neuron structure
  • Fig. 2B is a kind of schematic diagram of the layer relation of neural network
  • FIG. 3 is a schematic diagram of a deployment process of a bilateral model in a related technology
  • FIG. 4A is a schematic diagram of an application framework of AI
  • 4B to 4E are schematic diagrams of several network architectures
  • FIG. 5A is one of the schematic flowcharts of the communication method provided by the present disclosure.
  • FIG. 5B is one of the schematic diagrams of the deployment process of the bilateral model provided by the present disclosure.
  • FIG. 6A is one of the schematic flowcharts of the communication method provided by the present disclosure.
  • FIG. 6B is one of the schematic diagrams of the deployment process of the bilateral model provided by the present disclosure.
  • FIG. 7A is one of the schematic flowcharts of the communication method provided by the present disclosure.
  • FIG. 7B is one of the schematic diagrams of the deployment process of the bilateral model provided by the present disclosure.
  • FIG. 8 is one of the schematic flowcharts of the communication method provided by the present disclosure.
  • FIG. 9 is one of the structural schematic diagrams of the communication device provided by the present disclosure.
  • Fig. 10 is one of the structural schematic diagrams of the communication device provided by the present disclosure.
  • in the present disclosure, "at least one of the following" indicates one or more, and "multiple" means two or more.
  • "and/or" describes the association relationship of associated objects, indicating that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist simultaneously, or B exists alone.
  • the character "/” generally indicates that the contextual objects are an "or” relationship.
  • although the terms first, second, etc. may be used in the present disclosure to describe various objects, these objects should not be limited by these terms; the terms are only used to distinguish one object from another.
  • the communication system may be a third generation (3G) communication system (such as a universal mobile telecommunications system (UMTS)), a fourth generation (4G) communication system (such as a long term evolution (LTE) system), a fifth generation (5G) communication system, a worldwide interoperability for microwave access (WiMAX) system, a wireless local area network (WLAN) system, a fusion system of multiple systems, or a future communication system, such as a 6G communication system.
  • the 5G communication system may also be called a new radio (new radio, NR) system.
  • a network element in a communication system may send signals to another network element or receive signals from another network element.
  • the signal may include information, configuration information, or data, etc.; a network element may also be called an entity, a network entity, a device, a communication device, a node, a communication node, etc.
  • a network element is taken as an example for description.
  • a communication system may include at least one terminal device and at least one access network device.
  • the network element sending the configuration information may be an access network device
  • the network element receiving the configuration information may be a terminal device.
  • the multiple terminal devices can also send signals to each other, that is, both the sending network element of the configuration information and the receiving network element of the configuration information can be terminal devices.
  • the communication system includes an access network device 110 and two terminal devices, that is, a terminal device 120 and a terminal device 130. At least one of the terminal device 120 and the terminal device 130 may send uplink data to the access network device 110, and the access network device 110 may receive the uplink data. The access network device 110 may send downlink data to at least one of the terminal device 120 and the terminal device 130.
  • the terminal equipment and access network equipment involved in FIG. 1A will be described in detail below.
  • Terminal equipment is also known as a terminal, user equipment (UE), mobile station (MS), mobile terminal (MT), etc.
  • the terminal device includes a handheld device with a wireless connection function, or a vehicle-mounted device, and the like.
  • Exemplarily, some examples of terminals are: wireless network cameras, mobile phones, tablet computers, notebook computers, palmtop computers, mobile Internet devices (MID), wearable devices such as smart watches, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, terminals in Internet-of-vehicles systems, wireless terminals in self-driving, wireless terminals in smart grid, wireless terminals in transportation safety, wireless terminals in smart city (such as smart fuel dispensers), terminal equipment on high-speed rail, and wireless terminals in smart home (such as smart speakers, smart coffee machines, and smart printers).
  • the communication device used to realize the function of the terminal device may be a terminal device, or a device with some terminal functions, or a device capable of supporting the terminal device to realize this function, such as a chip system, which may be installed in the terminal device.
  • a system-on-a-chip may be composed of chips, and may also include chips and other discrete devices.
  • in the following, the communication device for realizing the function of the terminal device is described by taking a terminal device or a UE as an example.
  • the access network device may be a base station (BS), and may also be called a network device, an access node (AN), or a radio access node (RAN). Access network devices can provide wireless access services for terminal devices.
  • the access network equipment includes, but is not limited to, at least one of the following: a base station, a next-generation NodeB (gNB) in 5G, an access network device in an open radio access network (O-RAN), an evolved NodeB (eNB), a radio network controller (RNC), a NodeB (NB), a base station controller (BSC), a base transceiver station (BTS), a home base station (for example, a home evolved NodeB or home NodeB, HNB), a baseband unit (BBU), a transmitting and receiving point (TRP), a transmitting point (TP), and/or a mobile switching center, etc.
  • the access network device may also be a centralized unit (centralized unit, CU), a distributed unit (distributed unit, DU), a centralized unit control plane (CU control plane, CU-CP) node, or a centralized unit user plane (CU user plane, CU-UP) node.
  • the access network device may be a relay station, an access point, a vehicle-mounted device, a wearable device, or an access network device in a future evolved public land mobile network (public land mobile network, PLMN).
  • the communication device used to realize the function of the access network equipment may be the access network equipment, or a network device with some functions of the access network equipment, or a device capable of supporting the access network equipment to realize the function, such as a chip system, a hardware circuit, a software module, or a hardware circuit plus a software module; the device can be installed in the access network equipment.
  • in the following, the communication device for realizing the function of the access network device is described by taking an access network device as an example.
  • the protocol layer structure may include a control plane protocol layer structure and a user plane protocol layer structure.
  • the control plane protocol layer structure may include the functions of protocol layers such as the radio resource control (RRC) layer, the packet data convergence protocol (PDCP) layer, the radio link control (RLC) layer, the media access control (MAC) layer, and the physical layer.
  • the user plane protocol layer structure may include the functions of the PDCP layer, the RLC layer, the MAC layer, and the physical layer.
  • above the PDCP layer, a service data adaptation protocol (SDAP) layer may also be included.
  • the data transmission needs to pass through the user plane protocol layer, such as the SDAP layer, PDCP layer, RLC layer, MAC layer, and physical layer.
  • the SDAP layer, the PDCP layer, the RLC layer, the MAC layer and the physical layer may also be collectively referred to as an access layer.
  • according to the transmission direction of the data (sending or receiving), each layer above is divided into a sending part and a receiving part.
  • after the PDCP layer obtains data from the upper layer, it transmits the data to the RLC layer and the MAC layer, the MAC layer generates a transport block, and the transport block is then transmitted wirelessly through the physical layer.
  • Data is encapsulated correspondingly in each layer.
  • the data received by a certain layer from its upper layer is regarded as a service data unit (service data unit, SDU) of this layer; after being encapsulated by this layer, it becomes a protocol data unit (protocol data unit, PDU) and is then passed to the next layer.
  • SDU service data unit
  • PDU protocol data unit
  • the terminal device may also have an application layer and a non-access layer.
  • the application layer can be used to provide services to the application program installed in the terminal device.
  • the downlink data received by the terminal device can be transmitted layer by layer from the physical layer up to the application layer, and then provided to the application program by the application layer;
  • the application layer can obtain the data generated by an application program, transmit the data layer by layer down to the physical layer, and send it to other communication devices.
  • the non-access layer can be used to forward user data, such as forwarding uplink data received from the application layer to the SDAP layer or forwarding downlink data received from the SDAP layer to the application layer.
  • the access network device may include a centralized unit (central unit, CU) and a distributed unit (distributed unit, DU). Multiple DUs can be centrally controlled by one CU.
  • the interface between the CU and the DU may be referred to as an F1 interface.
  • the control plane (control plane, CP) interface may be F1-C
  • the user plane (user plane, UP) interface may be F1-U.
  • the CU and the DU can be divided according to the protocol layers of the wireless network: for example, the functions of the PDCP layer and the protocol layers above it are set in the CU, and the functions of the protocol layers below the PDCP layer (such as the RLC layer and the MAC layer) are set in the DU; for another example, the functions of the protocol layers above the PDCP layer are set in the CU, and the functions of the PDCP layer and the protocol layers below it are set in the DU.
  • the above division of the processing functions of the CU and the DU according to protocol layers is only an example, and they can also be divided in other ways; for example, the CU or the DU can be divided to have the functions of more protocol layers, or the CU or the DU can be divided to have part of the processing functions of a protocol layer.
  • part of the functions of the RLC layer and the functions of the protocol layers above the RLC layer are set in the CU, and the rest of the functions of the RLC layer and the functions of the protocol layers below the RLC layer are set in the DU.
  • the functions of the CU or the DU can also be divided according to service types or other system requirements, for example, according to delay: functions whose processing time needs to meet the delay requirement are set in the DU, and functions that do not need to meet the delay requirement are set in the CU.
  • the CU may also have one or more functions of the core network.
  • the CU can be set on the network side to facilitate centralized management.
  • the radio unit (radio unit, RU) of the DU may be set remotely, where the RU has the radio frequency function.
  • DUs and RUs can be divided in a physical layer (physical layer, PHY).
  • the DU can implement high-level functions in the PHY layer
  • the RU can implement low-level functions in the PHY layer.
  • the functions of the PHY layer may include adding a cyclic redundancy check (cyclic redundancy check, CRC) code, channel coding, rate matching, scrambling, modulation, layer mapping, precoding, resource mapping, physical antenna mapping, and/or radio frequency sending functions.
  • CRC cyclic redundancy check
  • the functions of the PHY layer may include CRC checking, channel decoding, de-rate matching, descrambling, demodulation, de-layer mapping, channel detection, resource de-mapping, physical antenna de-mapping, and/or radio frequency receiving functions.
  • the high-level functions in the PHY layer may include a part of the functions of the PHY layer that is closer to the MAC layer, and the low-level functions in the PHY layer may include another part of the functions of the PHY layer that is closer to the radio frequency function.
  • high-level functions in the PHY layer may include adding CRC codes, channel coding, rate matching, scrambling, modulation, and layer mapping
  • low-level functions in the PHY layer may include precoding, resource mapping, physical antenna mapping, and radio transmission functions
  • high-level functions in the PHY layer may include adding CRC codes, channel coding, rate matching, scrambling, modulation, layer mapping, and precoding
  • low-level functions in the PHY layer may include resource mapping, physical antenna mapping, and radio frequency sending functions.
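Of the PHY-layer functions listed above, adding a CRC code is self-contained enough to sketch. The following is a minimal bitwise CRC-8 with an illustrative generator polynomial (0x07); it is not the CRC polynomial specified for any particular 3GPP channel:

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    # Bitwise CRC-8: the sender appends this checksum so the receiver
    # can detect corruption of the transport block.
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

block = b"transport block"
checksum = crc8(block)
# The receiver recomputes the CRC over the received data;
# a mismatch with the appended checksum signals a transmission error.
assert crc8(block) == checksum
```

The receiving-side "CRC checking" function in the list above is the mirror of this step: recompute and compare.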
  • the function of the CU may be implemented by one entity, or may also be implemented by different entities.
  • the functions of the CU can be further divided, that is, the control plane and the user plane are separated and realized by different entities, namely the control plane CU entity (i.e., the CU-CP entity) and the user plane CU entity (i.e., the CU-UP entity).
  • the CU-CP entity and CU-UP entity can be coupled with the DU to jointly complete the functions of the access network equipment.
  • the signaling generated by the CU can be sent to the terminal device through the DU, or the signaling generated by the terminal device can be sent to the CU through the DU.
  • signaling at the RRC or PDCP layer will eventually be processed as signaling at the physical layer and sent to the terminal device, or converted from received signaling at the physical layer.
  • the signaling at the RRC or PDCP layer can be considered to be sent through the DU, or sent through the DU and the RU.
  • any one of the foregoing DU, CU, CU-CP, CU-UP, and RU may be a software module, a hardware structure, or a software module+hardware structure, without limitation.
  • the existence forms of different entities may be different, which is not limited.
  • DU, CU, CU-CP, and CU-UP are software modules
  • RU is a hardware structure.
  • the number and types of devices in the communication system shown in FIG. 1A are only for illustration, and the present disclosure is not limited thereto. In practical applications, the communication system may also include more terminal devices and more access network devices, and may also include other devices, for example, core network devices and/or network elements for implementing artificial intelligence functions.
  • the method provided in this disclosure can be used for communication between an access network device and a terminal device, and can also be used for communication between other communication devices, such as communication between a macro base station and a micro base station over a wireless backhaul link, or communication between two terminal devices over a sidelink (sidelink, SL), without restriction.
  • the present disclosure uses communication between an access network device and a terminal device as an example for description.
  • an AI function (such as an AI module or an AI entity) may be configured in an existing network element in a communication system to implement AI-related operations.
  • the existing network element may be an access network device (such as gNB), a terminal device, a core network device, or a network management (operation, administration and maintenance, OAM), etc.
  • an AI function may be configured in at least one network element of the access network device 110 , the terminal device 120 , and the terminal device 130 .
  • an independent network element may also be introduced into the communication system to perform AI-related operations.
  • the independent network element may be called an AI network element or an AI node, etc., and this disclosure does not limit the name.
  • the network element performing AI-related operations is a network element with a built-in AI function (such as an AI module or an AI entity).
  • AI-related operations may also be referred to as AI functions.
  • the AI network element can be directly connected to the access network equipment in the communication system, or can be indirectly connected to the access network equipment through a third-party network element.
  • the third-party network element may be a core network element such as an access and mobility management function (access and mobility management function, AMF) network element or a user plane function (user plane function, UPF) network element.
  • AMF access and mobility management function
  • UPF user plane function
  • in FIG. 1B, an AI network element 140 is introduced into the communication system shown in FIG. 1A above.
  • the following takes the case where the AI function is built into an existing network element as an example for illustration.
  • the AI model is the specific realization of the AI function, and the AI model represents the mapping relationship between the input and output of the model.
  • AI models can be neural networks or other machine learning models.
  • the AI function may include at least one of the following: data collection (collecting training data and/or inference data), model learning, model information publishing (configuring model information), inference, or inference result publishing.
  • the AI model may be referred to simply as a model.
  • model learning can also be understood as model training.
  • a neural network is a concrete implementation of machine learning techniques and AI models. According to the universal approximation theorem, a neural network can in theory approximate any continuous function, which gives the neural network the ability to learn any mapping.
  • traditional communication systems rely on rich expert knowledge to design communication modules, whereas a deep learning communication system based on neural networks can automatically discover hidden structural patterns from large data sets, establish mapping relationships between data, and achieve performance better than that of traditional modeling methods.
  • each neuron performs a weighted sum operation on its input values, and outputs the operation result through an activation function.
  • FIG. 2A it is a schematic diagram of a neuron structure.
  • the inputs of the neuron are x_i, the weight corresponding to x_i is w_i, and w_i is used to weight x_i.
  • the bias for performing weighted summation of the input values according to the weights is, for example, b.
  • the output of the neuron is: y = f(Σ_i w_i·x_i + b), where f is the activation function.
  • w_i, x_i and b may take various possible values such as decimals, integers (such as 0, positive integers or negative integers), or complex numbers.
  • the activation functions of different neurons in a neural network can be the same or different.
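The weighted sum and activation described above can be sketched in Python; the sigmoid activation and the sample weights below are illustrative assumptions, not values from the disclosure:

```python
import math

def sigmoid(z):
    # One common activation function f; others (ReLU, tanh, ...) are equally valid,
    # and different neurons may use different activation functions.
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    # y = f(sum_i(w_i * x_i) + b): weighted sum of the inputs plus the bias,
    # passed through the activation function.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

y = neuron_output(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
```

With these sample values the weighted sum is 0.5·1 + (−0.25)·2 + 0 = 0, so the sigmoid output is 0.5.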
  • a neural network generally includes multiple layers, each layer may include one or more neurons. By increasing the depth and/or width of the neural network, the expressive ability of the neural network can be improved, providing more powerful information extraction and abstract modeling capabilities for complex systems.
  • the depth of the neural network may refer to the number of layers included in the neural network, and the number of neurons included in each layer may be referred to as the width of the layer.
  • a neural network includes an input layer and an output layer. The input layer of the neural network processes the received input information through neurons, and passes the processing result to the output layer, and the output layer obtains the output result of the neural network.
  • the neural network includes an input layer, a hidden layer and an output layer, refer to FIG. 2B .
  • the input layer of the neural network processes the received input information through neurons and passes the processing result to the intermediate hidden layer; the hidden layer performs calculation on the received result and passes the calculation result to the output layer or to the adjacent hidden layer; finally, the output layer produces the output of the neural network.
  • a neural network may include one hidden layer, or include multiple hidden layers connected in sequence, without limitation.
  • DNN deep neural network
  • FNN feedforward neural network
  • CNN convolutional neural network
  • RNN recurrent neural network
  • a loss function can be defined.
  • the loss function describes the gap or difference between the output value of the neural network and the ideal target value, and the disclosure does not limit the specific form of the loss function.
  • the training process of a neural network is the process of adjusting the parameters of the neural network so that the value of the loss function is less than a threshold, or so that the value of the loss function meets the target requirement. Adjusting the parameters of the neural network means adjusting at least one of the following: the number of layers of the neural network, its width, the weights of the neurons, or the parameters in the activation functions of the neurons.
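As a sketch of the training process described above (adjust parameters until the loss value meets the target requirement), the following fits a single linear neuron by gradient descent on a squared-error loss; the samples, learning rate, and stopping threshold are illustrative assumptions:

```python
# Train w and b of a single linear neuron y = w*x + b by gradient descent,
# shrinking the squared-error loss between the output and the label.
samples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # (input, label); labels follow 2x + 1
w, b, lr = 0.0, 0.0, 0.05

def loss():
    # Loss function: the gap between the network output and the ideal target value.
    return sum((w * x + b - t) ** 2 for x, t in samples) / len(samples)

for _ in range(2000):
    if loss() < 1e-6:   # stop once the loss value meets the target requirement
        break
    # Gradients of the mean squared error with respect to w and b.
    gw = sum(2 * (w * x + b - t) * x for x, t in samples) / len(samples)
    gb = sum(2 * (w * x + b - t) for x, t in samples) / len(samples)
    w, b = w - lr * gw, b - lr * gb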
  • the training data may include the input of the AI model, or include the input and target output (label) of the AI model, and is used for training the AI model.
  • the training data includes multiple training samples, and each training sample is an input to the neural network.
  • Training data can also be understood as a collection of training samples, or called a training data set.
  • the training data set is one of the most important parts of machine learning.
  • the training process of a model is essentially to learn some characteristics from the training data, so that the output of the AI model is as close as possible to the target output, for example so that the difference between the output of the AI model and the target output is minimized.
  • the target output may also be called a label.
  • the method provided in the present disclosure specifically relates to the training and application of bilateral models.
  • the bilateral model may also be called a double-ended model or a cooperation model.
  • the bilateral model can be composed of two or more sub-models, and the two or more sub-models can be matched and used, and the two or more sub-models can be distributed in different network elements.
  • an example of a bilateral model is an auto-encoder (auto-encoder, AE).
  • the auto-encoder includes an encoder and a decoder, where the encoder and the decoder are used in a matching manner.
  • the output of the encoder can be used to determine input to the decoder.
  • the encoder and the decoder are deployed on different network elements, for example, the encoder is deployed on the terminal device, and the decoder is deployed on the access network device.
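The encoder/decoder pairing can be sketched as two sub-models whose interface matches, i.e. the output dimension of the encoder equals the input dimension of the decoder. The random linear "models" below are placeholders for trained networks, and all dimensions are illustrative assumptions:

```python
import random

random.seed(0)
INPUT_DIM, CODE_DIM = 8, 2   # the encoder compresses 8 values down to 2

# Placeholder linear "models"; in practice these are trained neural networks.
enc_w = [[random.uniform(-1, 1) for _ in range(INPUT_DIM)] for _ in range(CODE_DIM)]
dec_w = [[random.uniform(-1, 1) for _ in range(CODE_DIM)] for _ in range(INPUT_DIM)]

def encoder(x):
    # Deployed on one network element (e.g. the terminal device).
    return [sum(w * v for w, v in zip(row, x)) for row in enc_w]

def decoder(code):
    # Deployed on the peer network element (e.g. the access network device);
    # its input is determined by the encoder's output.
    return [sum(w * v for w, v in zip(row, code)) for row in dec_w]

x = [float(i) for i in range(INPUT_DIM)]
code = encoder(x)       # low-dimensional representation sent over the air interface
x_hat = decoder(code)   # reconstruction at the receiving end
```

The matching condition is visible in the shapes: `len(code)` equals the decoder's input dimension, so the two sub-models can be used together.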
  • one network element trains the bilateral model, and then deploys the two trained sub-models on the two network elements respectively.
  • the network element that completes the bilateral model training can be one of the two network elements on which the sub-models are deployed, or it may be a third-party network element.
  • the access network device can complete the training of the bilateral model, and then send the sub-model that needs to be deployed on the terminal device to the terminal device. Specifically, as shown in FIG. 3, the process is divided into three stages. In the model training stage, the access network device independently trains a bilateral model.
  • the bilateral model includes sub-model 1 and sub-model 2, where the input type of sub-model 1 is a, the input type of sub-model 2 is the same as the output type of sub-model 1, and the output type of sub-model 2 is b.
  • the access network device sends sub-model 1 to the terminal device.
  • the terminal device can obtain data of type c based on sub-model 1 and data of type a; that is, the input type of sub-model 1 is a, and the output type of sub-model 1 is c;
  • the terminal device sends the data of type c to the access network device;
  • the access network device can obtain data of type b according to sub-model 2 and the data of type c; that is, the input type of sub-model 2 is c, and the output type of sub-model 2 is b.
  • the scale of an AI model (such as a sub-model) is generally large, so the overhead of transmitting the sub-model over the air interface is relatively large.
  • there are many formats of AI models. For example, classified by neural network type, there are FNNs, CNNs, and RNNs; in terms of the internal structure of a neural network, the format involves the number of neurons in each layer, the connection relationships between neurons, the connection relationships between layers, the activation function types, and so on. Defining the AI model or the interpretation format of the AI model requires a lot of standardization work. Moreover, because the computing power of terminal devices differs greatly, the scale of AI models that can be supported also differs.
  • the access network device may need to train corresponding AI models for UEs with various computing capabilities; in this way, the calculation and storage overhead required by the access network device is also relatively large.
  • the AI model involves related algorithm design.
  • the algorithm and the AI model are relatively private content, and exchanging such relatively private content between different network elements is likely to lead to algorithm leakage, which is not conducive to communication security.
  • each network element can train sub-models independently, and the sub-models trained independently by each network element match each other to form a bilateral model.
  • the communication method provided by the present disclosure will be described in detail below with reference to the accompanying drawings.
  • the communication method provided by the present disclosure can be applied to the communication system shown in FIG. 1A or FIG. 1B. One network element in the communication system can train the bilateral model, and then send the input data and/or output data of a sub-model in the bilateral model to other network elements; the other network elements can use the input data and/or output data of that sub-model to train a sub-model with the same function. That is, each network element independently trains its sub-model, and the sub-models independently trained by the network elements match each other to form a new bilateral model.
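The idea above, a network element reproducing a peer's sub-model from that sub-model's input/output pairs rather than from the model itself, can be sketched with a linear reference sub-model; the model form, data, and training hyperparameters are illustrative assumptions:

```python
# Reference sub-model held by the first network element (never transmitted).
def reference_submodel(x):
    return 2.0 * x - 1.0

# The first network element shares only (input, output) pairs of the sub-model.
shared_pairs = [(x, reference_submodel(x)) for x in [0.0, 1.0, 2.0, 3.0]]

# The second network element independently trains its own sub-model on those
# pairs, obtaining a model with the same function without ever receiving the
# original model or its parameters over the air interface.
w, b, lr = 0.0, 0.0, 0.05
for _ in range(5000):
    gw = sum(2 * (w * x + b - y) * x for x, y in shared_pairs) / len(shared_pairs)
    gb = sum(2 * (w * x + b - y) for x, y in shared_pairs) / len(shared_pairs)
    w, b = w - lr * gw, b - lr * gb

def local_submodel(x):
    return w * x + b
```

Only data crosses the air interface, which avoids both the model-transmission overhead and the algorithm-leakage concern described above.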
  • the data source (data source) is used to store training data and inference data.
  • the model training host (model training host) obtains the AI model by analyzing or training the training data provided by the data source, and deploys the AI model in the model inference host (model inference host).
  • the AI model represents the mapping relationship between the input and output of the model. Learning the AI model through the model training node is equivalent to using the training data to learn the mapping relationship between the input and output of the model.
  • the model inference node uses the AI model to perform inference based on the inference data provided by the data source, and obtains the inference result.
  • the model inference node inputs the inference data into the AI model, and obtains an output through the AI model, and the output is the inference result.
  • the inference result may indicate: configuration parameters used (executed) by the execution object, and/or operations performed by the execution object.
  • the reasoning result can be uniformly planned by the execution (actor) entity, and sent to one or more execution objects (for example, network elements) for execution.
  • the application framework shown in FIG. 4A can be deployed on the network elements shown in FIG. 1A or FIG. 1B; for example, the application framework in FIG. 4A can be deployed on the access network device in FIG. 1A, where the model training node can analyze or train the training data provided by the data source to obtain a bilateral model.
  • the model inference node can perform inference using a sub-model included in the bilateral model and the inference data provided by the data source, and obtain the output of the sub-model. That is, the input of the sub-model includes the inference data, and the output of the sub-model is the inference result corresponding to the sub-model.
  • the access network device can send the inference data and/or inference results corresponding to a sub-model to the terminal device, and the terminal device can independently train a corresponding sub-model based on the inference data and/or inference results.
  • the network architecture to which the communication solution provided by the present disclosure can be applied is introduced below with reference to FIG. 4B to FIG. 4E .
  • the access network device includes a near real-time access network intelligent controller (RAN intelligent controller, RIC) module for model learning and/or reasoning.
  • RAN intelligent controller RIC
  • the near real-time RIC can obtain network-side and/or terminal-side information from at least one of the CU, DU, and RU, and the information can be used as training data or inference data.
  • the near real-time RIC may submit the reasoning result to at least one of the CU, the DU and the RU.
  • the inference results can be exchanged between the CU and the DU.
  • the inference results can be exchanged between the DU and the RU, for example, the near real-time RIC submits the inference results to the DU, and the DU submits the inference results to the RU.
  • near-real-time RIC can be used to train a bilateral model, and a sub-model of the bilateral model can be used for inference; or, near-real-time RIC can be used to train a sub-model that can be matched with sub-models distributed on other network elements.
  • the access network device may include a non-real-time RIC in addition to the near-real-time RIC (optionally, the non-real-time RIC may be located in the OAM or in the core network device), which is used for model learning and inference.
  • the non-real-time RIC can obtain network-side and/or terminal-side information from at least one of the CU, the DU, and the RU; this information can be used as training data or inference data, and the inference results can be submitted to at least one of the CU, the DU, and the RU.
  • the inference results can be exchanged between the CU and the DU.
  • the reasoning result can be exchanged between the DU and the RU, for example, the non-real-time RIC submits the reasoning result to the DU, and then the DU submits it to the RU.
  • the non-real-time RIC is used to train a bilateral model, and a sub-model of the bilateral model is used for inference; or, the non-real-time RIC can be used to train a sub-model that can be matched with a sub-model distributed on other network elements.
  • the access network equipment includes a near-real-time RIC, and a non-real-time RIC is included outside the access network equipment (optionally, the non-real-time RIC can be located in the OAM or in the core network equipment).
  • the non-real-time RIC can be used for model learning and/or inference; and/or, like the first possible implementation above, the near-real-time RIC can be used for model learning and/or inference; and/or, the near-real-time RIC can obtain AI model information from the non-real-time RIC, obtain network-side and/or terminal-side information from at least one of the CU, the DU, and the RU, and use that information together with the AI model information to obtain inference results. Optionally, the near-real-time RIC can submit the inference results to at least one of the CU, the DU, and the RU; optionally, the inference results can be exchanged between the CU and the DU; optionally, the inference results can be exchanged between the DU and the RU, for example, the near-real-time RIC submits the inference results to the DU, and the DU submits them to the RU.
  • near-real-time RIC is used to train bilateral model A, and a sub-model of bilateral model A is used for inference.
  • non-real-time RIC is used to train bilateral model B, and a sub-model of bilateral model B is used for inference.
  • the non-real-time RIC is used to train the bilateral model C, and the bilateral model C is submitted to the near-real-time RIC, and the near-real-time RIC uses the sub-model of the bilateral model C for inference.
  • FIG. 4C is an example diagram of a network architecture to which the method provided by the present disclosure can be applied. Compared with FIG. 4B, in FIG. 4C the CU is separated into CU-CP and CU-UP.
  • FIG. 4D is an example diagram of a network architecture to which the method provided by the present disclosure can be applied.
  • the access network device includes one or more AI entities, and the functions of the AI entities are similar to the near real-time RIC described above.
  • the OAM includes one or more AI entities, and the functions of the AI entities are similar to the non-real-time RIC described above.
  • the core network device includes one or more AI entities, and the functions of the AI entities are similar to the above-mentioned non-real-time RIC.
  • both the OAM and the core network equipment include AI entities, the models trained by their respective AI entities are different, and/or the models used for reasoning are different.
  • the different models include at least one of the following differences: structural parameters of the model (such as the number of layers and/or weights of the model), input parameters of the model, or output parameters of the model.
  • FIG. 4E is an example diagram of a network architecture to which the method provided in the present disclosure can be applied.
  • the access network devices in Fig. 4E are separated into CU and DU.
  • the CU may include an AI entity, and the function of the AI entity is similar to the above-mentioned near real-time RIC.
  • the DU may include an AI entity, and the function of the AI entity is similar to the above-mentioned near real-time RIC.
  • both the CU and the DU include AI entities, the models trained by their respective AI entities are different, and/or the models used for reasoning are different.
  • the CU in FIG. 4E may be further split into CU-CP and CU-UP.
  • one or more AI models may be deployed in the CU-CP. And/or, one or more AI models may be deployed in the CU-UP.
  • the OAM of the access network device and the OAM of the core network device may be deployed separately and independently.
  • a model can infer one parameter, or infer multiple parameters.
  • the learning process of different models can be deployed in different devices or nodes, and can also be deployed in the same device or node.
  • the inference process of different models can be deployed in different devices or nodes, or can be deployed in the same device or node.
  • a communication method is illustrated, and the method includes the following process.
  • the first network element determines a first submodel and a second submodel.
  • the first sub-model and the second sub-model can be matched and used together.
  • the matching between the first sub-model and the second sub-model can be understood with reference to either of the following two optional manners.
  • the output of the first sub-model is used to determine the input of the second sub-model.
  • the output data of the first sub-model can be used to generate the input data of the second sub-model.
  • the input data of the second sub-model includes the output data of the first sub-model, or for example, the output data of the first sub-model can be pre-processed to obtain the input data of the second sub-model.
  • the output of the first sub-model and the input of the second sub-model can satisfy at least one of the following characteristics: the output type of the first sub-model is the same as the input type of the second sub-model; or the dimension of the output data of the first sub-model is the same as the dimension of the input data of the second sub-model; or the format of the output data of the first sub-model is the same as the format of the input data of the second sub-model.
  • the output of the second sub-model is used to determine the input of the first sub-model, or in other words, the output data of the second sub-model can be used to generate the input data of the first sub-model.
  • the input of the first sub-model includes the output of the second sub-model, or, for example, the output of the second sub-model can be pre-processed to obtain the input of the first sub-model.
  • the output of the second sub-model and the input of the first sub-model can satisfy at least one of the following characteristics: the output type of the second sub-model is the same as the input type of the first sub-model; or the dimension of the output data of the second sub-model is the same as the dimension of the input data of the first sub-model; or the format of the output data of the second sub-model is the same as the format of the input data of the first sub-model.
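The matching conditions listed above (same type, dimension, and format at the sub-model interface) can be expressed as a simple compatibility check; the descriptor fields below are illustrative assumptions, not a standardized model description format:

```python
def submodels_match(producer, consumer):
    # producer describes the sub-model whose output feeds the other sub-model;
    # consumer describes the sub-model receiving that output as its input.
    # All three interface characteristics are checked here, although the text
    # above only requires at least one of them.
    return (producer["output_type"] == consumer["input_type"]
            and producer["output_dim"] == consumer["input_dim"]
            and producer["output_format"] == consumer["input_format"])

sub_model_1 = {"output_type": "csi_code", "output_dim": 64, "output_format": "float32"}
sub_model_2 = {"input_type": "csi_code", "input_dim": 64, "input_format": "float32"}
```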
  • the first sub-model and the second sub-model may belong to a bilateral model.
  • the first network element can acquire relevant training data according to the scene where the bilateral model needs to be deployed; the first network element trains the first sub-model and the second sub-model according to the training data.
  • the scenarios where the bilateral model needs to be deployed may include at least one of the following: channel information feedback, for example, the second network element uses one of the sub-models to compress channel state information (channel state information, CSI) and sends the compressed CSI to the first network element, and the first network element uses the other sub-model to recover the CSI;
  • bilateral AI modulation and demodulation, for example, the second network element uses one of the sub-models to modulate a signal, and the first network element uses the other sub-model to demodulate the signal;
  • bilateral AI beam prediction, for example, the second network element uses one of the sub-models to generate one or more beams and uses the one or more beams to send reference signals to the first network element, and the first network element uses the other sub-model and the received reference signal(s) to predict the optimal beam.
  • the first submodel and the second submodel may be applied to different network elements.
  • the first submodel is applied to the second network element
  • the second submodel is applied to the first network element.
  • the first network element may be an access network device
  • the second network element may be, for example, a terminal device.
  • the first network element may be a terminal device
  • the second network element may be an access network device.
  • the first network element and the second network element in this disclosure may both be terminal devices; or for scenarios involving a bilateral model that needs to be deployed between access network devices
  • the first network element and the second network element in this disclosure may both be access network devices.
  • the present disclosure does not limit the types of network elements.
  • the first sub-model may be used to send information at the sending end (or a network element called the sending end), and the second sub-model may be used to receive the information at the receiving end (or a network element called the receiving end); or, the second sub-model may be used to send information at the sending end, and the first sub-model may be used to receive the information at the receiving end.
  • This scenario includes but is not limited to the above CSI compression and recovery, or the above modulation and demodulation.
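As a rough illustration of this sender/receiver split, the sketch below models the CSI compression and recovery case with two stand-in sub-models in numpy: a random sign-quantizing projection plays the role of the compressing sub-model, and its pseudo-inverse plays the role of the restoring sub-model. All dimensions, names, and the projection itself are hypothetical; an actual bilateral model would use trained neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 32 channel coefficients compressed into 8 feature bits.
CH_DIM, BIT_DIM = 32, 8

# Stand-ins for the two trained sub-models: a random projection plays the
# compressing sub-model, and its pseudo-inverse plays the restoring sub-model.
W_enc = rng.standard_normal((BIT_DIM, CH_DIM))
W_dec = np.linalg.pinv(W_enc)

def sub_model_compress(h):
    """Second-network-element side: channel information -> feature bits (0/1)."""
    return (W_enc @ h > 0).astype(np.int8)

def sub_model_restore(bits):
    """First-network-element side: feature bits -> recovered channel information."""
    return W_dec @ (2.0 * bits - 1.0)   # map bits back to +/-1 before decoding

h = rng.standard_normal(CH_DIM)         # downlink channel information
bits = sub_model_compress(h)            # sent over the air instead of h
h_rec = sub_model_restore(bits)         # recovered at the receiving end

print(bits.shape, h_rec.shape)
```

Only the low-dimensional feature bits cross the air interface; the compressing and restoring halves sit on different network elements.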
  • the first network element determines input data and output data of the first sub-model.
  • the first network element may determine input data of the first sub-model according to the training data described in S501.
  • the training data described in S501 is used as the input data of the first sub-model, or the first network element generates the input data of the first sub-model based on the training data described in S501.
  • the first network element may also determine the input data of the first sub-model according to the relevant data sent by the second network element.
  • the first network element may also obtain the input data of the first sub-model through related measurement.
  • the first network element may also determine the input data of the first sub-model according to the output data of the second sub-model.
  • the first network element inputs the aforementioned determined input data into the first sub-model, and obtains corresponding output data through the first sub-model.
  • the first sub-model in S502 is a trained model
  • the input data of the first sub-model can also be called the inference data of the first sub-model
  • the output data of the first sub-model can also be called the inference result of the first sub-model.
  • the first network element sends the first information to the second network element.
  • the first information is used to indicate input data of the first sub-model and/or output data of the first sub-model. For example, if the first network element determines the input data of the first sub-model according to the relevant data obtained from the second network element, the first information sent by the first network element to the second network element can be used to indicate the output data of the first sub-model without indicating the input data of the first sub-model. The relevant data can be used by the second network element to obtain the input data of the first sub-model.
  • the first information sent by the first network element to the second network element can be used to indicate the input data of the first sub-model and the first sub-model The output data of the model.
  • alternatively, the first information sent by the first network element to the second network element can be used to indicate the input data of the first sub-model without indicating the output data of the first sub-model.
  • the first information may be used to indicate input data of the first sub-model and label data of the first sub-model.
  • the label data of the first sub-model is the expected value or target value of the output data of the first sub-model, which can be understood as the expected output data of the first sub-model; the label data of the first sub-model can also be described as the label samples of the first sub-model or the output labels of the first sub-model.
  • if the second network element side can determine the input data and label data of the first sub-model by itself, the first network element may not indicate the input data and label data of the first sub-model to the second network element through the first information.
  • the first network element may also send multiple sets of input data and/or output data corresponding to the sub-models to the second network element, or the first network element may also send multiple sets of input data and label data corresponding to the sub-models to the second network element.
  • the second network element trains the third sub-model according to the first information.
  • the first information is used for training the third sub-model, or in other words, the first information is used for determining the third sub-model.
  • the third submodel and the first submodel may meet at least one of the following features: the function of the third submodel is the same as that of the first submodel; the input type of the third submodel is the same as the input type of the first submodel, and the output type of the third submodel is the same as the output type of the first submodel; the dimensions of the input data of the third submodel are the same as the dimensions of the input data of the first submodel, and the dimensions of the output data of the third submodel are the same as the dimensions of the output data of the first submodel; the format of the input data of the third submodel is the same as the format of the input data of the first submodel, and the format of the output data of the third submodel is the same as the format of the output data of the first submodel; when the input of the third submodel is the same as the input of the first submodel, the difference between the output of the third submodel and the output of the first submodel satisfies a preset condition (for example, is less than a threshold)
  • the third sub-model trained by the second network element can replace the first sub-model previously trained by the first network element, that is, the third sub-model can be matched with the second sub-model to form a new bilateral model.
  • the network structures of the third sub-model and the first sub-model may be the same or different; the types of neural networks used by the third sub-model and the first sub-model may be the same or different. This disclosure is not limited in this regard.
  • the second network element may determine the training data of the third sub-model and the label samples of the third sub-model (hereinafter referred to as labels) according to the first information.
  • the second network element may determine the training data of the third sub-model (that is, the input data of the third sub-model) according to the input data of the first sub-model; the second network element may use the output data of the first sub-model as the label, or the second network element may use the label data related to the first sub-model as the label of the third sub-model.
  • the training data of the third sub-model determined by the second network element according to the first information may include one or more items, and each item of training data corresponds to a label.
  • the manner in which the second network element trains the third sub-model is described below by taking as an example the case where the third sub-model has multiple items of training data.
  • the second network element inputs each item of input data of the first sub-model into the AI model to be trained, uses the output data or label data corresponding to that input data as the label, and obtains the third sub-model through training.
  • the loss function can represent the difference between the output of the third sub-model and the label corresponding to that output
  • for example, the loss function can specifically be the NMSE, MSE, or cosine similarity between the output of the third sub-model and the corresponding label.
  • the AI model to be trained may be the aforementioned DNN, such as FNN, CNN, or RNN, or other AI models, which is not limited in the present disclosure.
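A minimal numpy sketch of this training step, under the simplifying assumption that both the first sub-model and the third sub-model are linear: the second network element sees only the indicated (input, output) pairs, never the first sub-model's weights, and fits the third sub-model to those pairs by gradient descent on an MSE loss. All dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "first sub-model": the second network element never sees its
# weights, only the (input, output) pairs indicated by the first information.
W_true = rng.standard_normal((4, 8))
X = rng.standard_normal((200, 8))        # input data of the first sub-model
Y = X @ W_true.T                          # output data, used as labels

# Third sub-model to be trained: same input/output types and dimensions,
# here a linear model fitted by gradient descent on an MSE loss.
W3 = np.zeros((4, 8))
for step in range(500):
    pred = X @ W3.T
    grad = 2.0 * (pred - Y).T @ X / len(X)   # gradient of the MSE loss
    W3 -= 0.05 * grad

mse = float(np.mean((X @ W3.T - Y) ** 2))
print(mse)
```

After convergence the third sub-model reproduces the indicated input-output behavior, so it can stand in for the first sub-model without the first sub-model itself ever being transmitted.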
  • the second network element inputs each item of input data of the first sub-model into the basic model, uses the output data or label data corresponding to that input data as the label, and updates the basic model according to the training data to obtain the third sub-model.
  • the basic model may be a model obtained through historical training of the second network element, or the basic model may also be pre-configured in the second network element. For example, for a scenario where a bilateral model needs to be deployed, a scenario-related basic model may be pre-configured on the second network element side.
  • the third submodel and the second submodel may meet at least one of the following characteristics: the function of the third submodel is the same as that of the second submodel; the input type of the third submodel is the same as the input type of the second submodel, and the output type of the third submodel is the same as the output type of the second submodel; the dimensions of the input data of the third submodel are the same as the dimensions of the input data of the second submodel, and the dimensions of the output data of the third submodel are the same as the dimensions of the output data of the second submodel; the format of the input data of the third submodel is the same as the format of the input data of the second submodel, and the format of the output data of the third submodel is the same as the format of the output data of the second submodel; when the input of the third submodel is the same as the input of the second submodel, the difference between the output of the third submodel and the output of the second submodel satisfies a preset condition (for example, is less than a threshold)
  • the third sub-model trained by the second network element can replace the second sub-model previously trained by the first network element, that is, the third sub-model can be matched with the first sub-model to form a new bilateral model.
  • the network structures of the third sub-model and the second sub-model may be the same or different; the types of neural networks used by the third sub-model and the second sub-model may be the same or different. This disclosure is not limited in this regard.
  • the second network element may determine the training data of the third sub-model and the label samples of the third sub-model (hereinafter referred to as labels) according to the first information.
  • the first information indicates the input data and output data of the first sub-model; or the first information indicates the input data or output data of the first sub-model, and the second network element may determine the output data or input data of the first sub-model that is not indicated by the first information.
  • the second network element can determine the training data of the third sub-model (that is, the input data of the third sub-model) according to the output data of the first sub-model; the second network element can use the input data of the first sub-model as the label of the third sub-model.
  • the training data of the third sub-model determined by the second network element according to the first information may include one or more items, and each item of training data corresponds to a label.
  • the manner in which the second network element trains the third sub-model is described below by taking as an example the case where the third sub-model has multiple items of training data.
  • the second network element inputs each item of output data of the first sub-model into the AI model to be trained, uses the input data corresponding to that output data as the label, and obtains the third sub-model through training; the loss function can represent the difference between the output of the third sub-model and the label corresponding to that output, for example, the loss function can specifically be the NMSE, MSE, or cosine similarity between the output of the third sub-model and the corresponding label
  • the AI model to be trained may be the aforementioned DNN, such as FNN, CNN, or RNN, or other AI models, which is not limited in the present disclosure.
  • the second network element inputs each item of output data of the first sub-model into the basic model, uses the input data corresponding to that output data as the label,
  • and the second network element can update the basic model according to the training data to obtain the third sub-model.
  • the basic model may be a model obtained through historical training of the second network element, or the basic model may also be pre-configured in the second network element. For example, for a scenario where a bilateral model needs to be deployed, a scenario-related basic model may be pre-configured on the second network element side.
  • the second network element may determine relevant parameters in the stage of training the third sub-model.
  • the parameters may be defined on the side of the second network element in a predetermined manner, or the first network element may indicate the relevant parameters to the second network element.
  • the parameter includes a training end condition, and the second network element may perform training of the third sub-model according to the training end condition.
  • the training end condition may include at least one of the following: training duration, number of training iterations, or a performance threshold that the third sub-model needs to meet.
  • the performance threshold may be the convergence threshold of the loss function used for training, testing or verification; or the performance threshold may also be another threshold, such as a threshold set for the difference between the output data of the third sub-model and the label. Specifically, the difference between the output data of the third sub-model and the label can be represented by the mean square error (MSE), the normalized mean square error (NMSE), the cross entropy, etc.
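The difference measures named above can be written out directly; the functions below are the textbook definitions (the cross-entropy variant shown assumes the compared quantities are probability distributions):

```python
import numpy as np

def mse(y, label):
    """Mean square error between an output and its label."""
    return float(np.mean((y - label) ** 2))

def nmse(y, label):
    """MSE normalized by the energy of the label."""
    return float(np.sum((y - label) ** 2) / np.sum(label ** 2))

def cosine_similarity(y, label):
    """Cosine of the angle between output and label vectors."""
    return float(np.dot(y, label) / (np.linalg.norm(y) * np.linalg.norm(label)))

def cross_entropy(p, q, eps=1e-12):
    """Cross entropy between a label distribution p and a predicted distribution q."""
    return float(-np.sum(p * np.log(q + eps)))

label = np.array([1.0, 0.0, 0.0])
out = np.array([0.9, 0.05, 0.05])
print(mse(out, label), nmse(out, label), cosine_similarity(out, label))
```

Any of these can serve as the convergence quantity compared against the performance threshold.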
  • the parameters may also include the structure of the third sub-model, the parameters of the third sub-model, the loss function for training the third sub-model, and the like.
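A sketch of how such training end conditions might be honored, using a hypothetical helper that stops on whichever comes first: an iteration budget or a loss falling below a performance threshold. The toy one-dimensional objective is illustrative only.

```python
def train_until(step_fn, loss_fn, max_iters=1000, loss_threshold=1e-4):
    """Run training until an end condition is met: the iteration budget is
    exhausted, or the loss drops below the performance threshold."""
    for it in range(max_iters):
        step_fn()
        if loss_fn() < loss_threshold:
            return it + 1, loss_fn()
    return max_iters, loss_fn()

# Toy example: fit w to minimize (w - 3)^2 by gradient descent.
state = {"w": 0.0}

def step():
    state["w"] -= 0.1 * 2.0 * (state["w"] - 3.0)   # gradient step

def loss():
    return (state["w"] - 3.0) ** 2

iters, final = train_until(step, loss)
print(iters, final)
```

A training duration condition would replace the iteration budget with a wall-clock check; the structure is otherwise the same.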
  • the second network element can train corresponding sub-models according to the input data and/or output data of each set of sub-models, thereby obtaining multiple sub-models.
  • the training of each group of sub-models by the second network element can be implemented by referring to the manner of training the third sub-model mentioned above, which will not be repeated in this disclosure.
  • the first network element may also instruct the second network element to use one or more sub-models therein.
  • a bilateral model is trained by the first network element, and the input/output related to a sub-model in the bilateral model is indicated to the second network element, so as to realize independent training of a sub-model with the same function on the side of the second network element.
  • the matched use of this sub-model with the other sub-model on the first network element can meet the application requirements of the bilateral model, and there is no need to transmit sub-models over the air interface, which can reduce transmission overhead and improve communication security.
  • FIG. 5A is an example only showing the interaction between two network elements; it does not mean that the disclosure is limited to two network elements, nor does it mean that the disclosure limits the bilateral model to include only two matching sub-models.
  • the method described in FIG. 5A may be referred to in order to realize the distribution of sub-models across more than two network elements and the matched use of the distributed sub-models in more than two network elements.
  • the present disclosure also provides a schematic diagram of a deployment process of a bilateral model. Four stages are schematically shown in FIG. 5B.
  • model training stage 1: the first network element trains a bilateral model, which includes sub-model 1 and sub-model 2, wherein the input type of sub-model 1 is a and
  • the output type of sub-model 1 is c; the input type of sub-model 2 is the same as the output type of sub-model 1 (that is, c), and the output type of sub-model 2 is b.
  • the first network element sends the input data (denoted as a1) and/or output data (denoted as c1) of sub-model 1 to the terminal device.
  • the second network element can train the sub-model 3 according to the received a1 and/or c1.
  • if a1 is used as the input of sub-model 3 and c1 is used as the label of sub-model 3, then after training, sub-model 3 has the same function as sub-model 1.
  • if c1 is used as the input of sub-model 3 and a1 is used as the label of sub-model 3, then after training, sub-model 3 has the same function as sub-model 2.
  • in the model inference stage, the second network element can obtain data of type c based on sub-model 3 and data of type a, that is, the input type of sub-model 3 is a and the output type of sub-model 3 is c; the second network element sends the data of type c to the first network element; the first network element can obtain data of type b based on sub-model 2 and the data of type c, that is, the input type of sub-model 2 is c and the output type of sub-model 2 is b.
  • the present disclosure provides a communication method. Referring to FIG. 6A , the method includes the following process.
  • the access network device acquires training data.
  • the training data includes N pieces of channel information.
  • N is a positive integer, that is, N is an integer greater than or equal to 1.
  • the training data is used to determine a bilateral model, which includes a first sub-model and a second sub-model.
  • channel information can be understood with reference to the following manner B1 or manner B2.
  • the channel information includes downlink channel characteristics.
  • the access network equipment can use the uplink and downlink reciprocity of the channel to obtain the downlink channel characteristics according to the uplink channel.
  • the access network device can obtain the downlink channel characteristics according to the uplink channel through some signal processing methods; or, the terminal device can also report CSI to the access network device, where the CSI includes a precoding matrix index (PMI) used to represent the downlink channel characteristics; the access network device can then also obtain the downlink channel characteristics by collecting the PMIs reported by the terminal device.
  • the present disclosure does not limit the manner in which the access network device obtains the downlink channel characteristics.
  • the downlink channel feature may refer to the eigenvector or eigenmatrix of the downlink channel; the eigenvector or eigenmatrix may be obtained by the terminal device by performing singular value decomposition (SVD) on the downlink channel, or may be obtained by the terminal device through eigenvalue decomposition (EVD) of the covariance matrix of the downlink channel.
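The two decompositions named above yield the same feature vector; the numpy check below illustrates this on a random stand-in channel matrix (the antenna counts are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical downlink channel matrix H (rx antennas x tx antennas).
H = rng.standard_normal((4, 8))

# Way 1: dominant right singular vector of H via SVD.
_, _, Vh = np.linalg.svd(H, full_matrices=False)
v_svd = Vh[0]                        # feature vector from SVD

# Way 2: dominant eigenvector of the channel covariance H^T H via EVD.
R = H.T @ H
w, V = np.linalg.eigh(R)             # eigenvalues in ascending order
v_evd = V[:, np.argmax(w)]           # feature vector from EVD

# Both methods give the same feature vector, up to sign.
align = abs(np.dot(v_svd, v_evd))
print(align)
```

For a complex channel the transpose would be a conjugate transpose, but the equivalence between the SVD and EVD routes is the same.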
  • the downlink channel feature can also refer to the precoding matrix index (PMI), which can be obtained by the terminal device by processing the downlink channel, the feature vector of the downlink channel, or the feature matrix of the downlink channel according to a predefined codebook.
  • the channel information includes the downlink channel, that is, the complete channel information.
  • the access network equipment can use the uplink and downlink reciprocity of the channel to obtain the downlink channel according to the uplink channel.
  • the access network device can obtain the downlink channel according to the uplink channel through some signal processing methods; or, the terminal device can also report the related information of the downlink channel to the access network device, and the access network device may also acquire the downlink channel according to the related information of the downlink channel.
  • the present disclosure does not limit the manner in which the access network equipment obtains the downlink channel.
  • the access network device determines a first submodel and a second submodel according to the N pieces of channel information.
  • the first sub-model and the second sub-model constitute a bilateral model for channel information feedback, which is denoted as the first bilateral model.
  • the access network device may divide the acquired N pieces of channel information into one or more training sets.
  • the access network device may use part or all of the one or more training sets to train the same model.
  • an access network device can use a training set to train the same bilateral model.
  • the access network device can use multiple training sets to train the same bilateral model.
  • the input to the first bilateral model includes channel information
  • the output of the first bilateral model includes recovered channel information.
  • the channel information may be a downlink channel characteristic or a downlink channel.
  • the definition of channel information can be understood with reference to S601, which will not be repeated here.
  • training the first bilateral model can be understood as minimizing the difference between the input channel information and the output channel information, and the loss function corresponding to the first bilateral model can be expressed as the MSE between the input channel information and the output channel information, the cross-entropy between the input channel information and the output channel information, or the cosine similarity between the input channel information and the output channel information, etc.
  • the input type of the first sub-model may be consistent with the input type of the first bilateral model, that is, the input type of the first sub-model is channel information, or the input of the first bilateral model is Input to the first submodel.
  • the output type of the first sub-model is a feature bit, and the feature bit includes one or more binary bits. It can be understood that the feature bits are a low-dimensional expression of channel information, and the first submodel is used to compress and/or quantize the channel information to obtain feature bits.
  • the input of the second sub-model is determined by the output of the first sub-model: for example, the input type of the second sub-model is consistent with the output type of the first sub-model (both are feature bits); or, the dimensions of the input data of the second sub-model are the same as the dimensions of the output data of the first sub-model; or, the input data of the second sub-model includes the output data of the first sub-model; or, the output data of the first sub-model can be preprocessed and then input to the second sub-model
  • that is, the input data of the second sub-model includes the preprocessed output data of the first sub-model.
  • the output of the second submodel is recovered channel information.
  • the first bilateral model may be an autoencoder, wherein the first sub-model is an encoder, and the second sub-model is a decoder.
  • the access network device may preset the dimension of the feature bits according to actual requirements, and the dimension of the feature bits may also be referred to as the number of bits included in the feature bits. For example, in consideration of feedback overhead, the access network device may reduce the dimension of feature bits to reduce feedback overhead. Specifically, the access network device may set the dimension of the feature bit to be smaller than the first dimension threshold. For example, in consideration of feedback accuracy, the access network device may increase the dimension of feature bits to improve feedback accuracy. Specifically, the access network device may set the dimension of the feature bit to be greater than the first dimension threshold.
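The trade-off described above amounts to choosing the output dimension of the compressing sub-model. A hedged sketch with a stand-in sign-quantizing encoder (the projection is random, not trained, and all dimensions are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
CH_DIM = 64   # hypothetical channel-information dimension

def make_encoder(bit_dim, ch_dim=CH_DIM):
    """Build a stand-in encoder whose output is `bit_dim` feature bits."""
    W = rng.standard_normal((bit_dim, ch_dim))
    return lambda h: (W @ h > 0).astype(np.uint8)

h = rng.standard_normal(CH_DIM)
low_overhead = make_encoder(bit_dim=16)(h)    # fewer bits: less feedback overhead
high_accuracy = make_encoder(bit_dim=128)(h)  # more bits: finer channel description
print(len(low_overhead), len(high_accuracy))
```

The first dimension threshold mentioned above would simply be an upper or lower bound checked against `bit_dim` before the model is trained.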
  • the access network device sends the input data and/or output data of the first sub-model to the terminal device.
  • the manner of determining the input data and output data of the first sub-model can be implemented with reference to S502, which will not be repeated in this disclosure.
  • the input data of the first sub-model includes M pieces of channel information, where M is a positive integer.
  • the output data of the first sub-model includes feature bits corresponding to M pieces of channel information, where M is a positive integer.
  • the access network device may only send the output data of the first sub-model to the terminal device.
  • the input data of the first submodel is determined according to the channel information reported by the terminal device, and the access network device may only send the output data of the first submodel to the terminal device.
  • the terminal device reports the PMI to the access network device, and the access network device obtains the corresponding downlink channel characteristics according to the PMI reported by the terminal device. The access network device uses the downlink channel characteristics as the input of the first sub-model and obtains the corresponding output, which is recorded as feature bit B.
  • the access network device may only send the characteristic bit B to the terminal device.
  • the access network device may generate a feature bit corresponding to the PMI each time it receives the PMI reported by the terminal device, and send the feature bit corresponding to the PMI to the terminal device.
  • the feature bit that the terminal device will receive within T1 time units after reporting a PMI may be set as the feature bit corresponding to the PMI.
  • the time unit may be a time slot or a symbol.
  • the value of T1 may be set according to actual requirements, for example, one time slot, which is not limited in the present disclosure.
  • after receiving multiple PMIs reported by the terminal device, the access network device obtains the downlink channel characteristics corresponding to the multiple PMIs, inputs them one at a time in turn into the first sub-model, and outputs multiple corresponding feature bits. The access network device may send the multiple feature bits to the terminal device.
  • there is a mapping relationship between the multiple PMIs and the multiple feature bits, such as a one-to-one correspondence.
  • the mapping relationship between multiple feature bits and multiple PMIs may be predefined; for example, after receiving every M PMIs, the access network device arranges the corresponding N feature bits in a message according to a specific order, and the terminal device can associate the N feature bits with the M PMIs according to that specific order.
  • the specific order may be the order in which the access network device receives the PMIs.
  • the foregoing message including the N feature bits may adopt a specific message format, and the specific message format may be set according to actual requirements, which is not limited in the present disclosure.
  • the mapping relationship between multiple feature bits and multiple PMIs may be configured by the access network device to the terminal device.
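The order-based association described above can be sketched as follows; the PMI values and bit strings are invented for illustration, and the "specific order" is taken to be the reception order:

```python
from collections import OrderedDict

# Hypothetical reported PMIs, in the order the access network device received them.
received_pmis = [13, 7, 42]

# Stand-in feature bits generated for each PMI, arranged in the same order
# inside one message.
message = ["0110", "1001", "1110"]

# The terminal device associates feature bits with PMIs by that shared order.
pmi_to_bits = OrderedDict(zip(received_pmis, message))
print(pmi_to_bits)
```

A configured (rather than predefined) mapping would replace the shared order with an explicit index list signaled by the access network device.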
  • the access network device sends the input data and output data of the first sub-model to the terminal device.
  • the access network device may generate one or more channel information that meets requirements by itself, input the one or more channel information into the first sub-model, and output one or more corresponding feature bits.
  • the access network device sends the one or more channel information and the one or more feature bits to the terminal device. In this manner, the access network device does not need to obtain or wait for the PMI reported by the terminal device.
  • the access network device may also involve training of other bilateral models.
  • the access network device can acquire the sub-models that need to be applied on the terminal device in each bilateral model, and send the determined input data and/or output data of these multiple sub-models to the terminal device.
  • the terminal device trains the third sub-model according to the input data and/or output data of the first sub-model.
  • the terminal device may determine the training data of the third sub-model and the label samples of the third sub-model (hereinafter referred to as labels) according to the acquired input data and/or output data of the first sub-model.
  • the channel information obtained by the access network device includes downlink channel characteristics.
  • the downlink channel feature may be a PMI, a feature vector or a feature matrix, and the like.
  • the terminal device reports the PMI, and what the access network device sends to the terminal device may be output data of the first sub-model, that is, feature bits.
  • the terminal device trains the third sub-model according to the PMI reported by itself and the feature bits sent by the access network device. For example, the terminal device may use the feature bit sent by the access network device as a label, and the training data used to train the third sub-model may include PMI.
  • the input of the third sub-model includes PMI, and the output includes feature bits.
  • the terminal device reports the PMI, and what the access network device sends to the terminal device may be output data of the first sub-model, that is, feature bits.
  • the terminal device determines the feature vector or feature matrix W used to generate the PMI according to the PMI reported by itself.
  • the terminal device trains the third sub-model according to the feature vector or feature matrix W and feature bits sent by the access network device. For example, the terminal device may use the feature bit sent by the access network device as a label, and the training data used to train the third sub-model may include a feature vector or a feature matrix W.
  • the input of the third sub-model includes the eigenvector or eigenmatrix W, and the output includes feature bits.
  • the terminal device reports the PMI, and what the access network device sends to the terminal device may be output data of the first sub-model, that is, feature bits.
  • the terminal device uses the same method as the access network device to restore the downlink channel characteristics based on the PMI, and then trains the third sub-model according to the restored downlink channel characteristics and the feature bits sent by the access network device.
  • the terminal device may use the feature bits sent by the access network device as a label, and the training data used to train the third sub-model may include downlink channel features (such as feature vectors or feature matrices).
  • the input of the third sub-model includes downlink channel characteristics
  • the output includes feature bits.
  • what the access network device sends to the terminal device is input data and output data of the first sub-model, that is, downlink channel characteristics (such as PMI, characteristic vector or characteristic matrix) and characteristic bits.
  • the terminal device trains the third sub-model according to the downlink channel characteristics and characteristic bits sent by the access network device. For example, the terminal device may use the feature bits sent by the access network device as a label, and the training data used to train the third sub-model may include downlink channel features.
  • the input of the third sub-model includes downlink channel features, and the output includes feature bits.
  • the training data of the third sub-model may be a downlink channel.
  • the terminal device reports the relevant information of the downlink channel, and what the access network device sends to the terminal device may be output data of the first sub-model, that is, feature bits.
  • the terminal device determines the downlink channel used to generate the related information of the downlink channel according to the related information of the downlink channel reported by itself.
  • the terminal device trains the third sub-model according to the downlink channel and the feature bits sent by the access network device. For example, the terminal device may use the feature bit sent by the access network device as a label, and the training data used for training the third sub-model may include a downlink channel.
  • the input of the third sub-model includes the downlink channel, and the output includes the characteristic bits.
  • the terminal device reports the relevant information of the downlink channel, and what the access network device sends to the terminal device may be output data of the first sub-model, that is, feature bits.
  • the terminal device recovers the downlink channel based on the relevant information of the downlink channel by using the same method as that of the access network device.
  • the terminal device trains the third sub-model according to the recovered downlink channel and the characteristic bits sent by the access network device. For example, the terminal device may use the feature bit sent by the access network device as a label, and the training data used to train the third sub-model may include the restored downlink channel.
  • the input of the third sub-model includes the restored downlink channel, and the output includes the characteristic bits.
  • what the access network device sends to the terminal device is the input data and output data of the first sub-model, that is, the downlink channel and feature bits.
  • the terminal device trains the third sub-model according to the downlink channel and feature bits sent by the access network device.
  • the terminal device may use the feature bit sent by the access network device as a label, and the training data used for training the third sub-model may include a downlink channel.
  • the input of the third sub-model includes the downlink channel, and the output includes the characteristic bits.
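The training step described in the variants above (the terminal device fitting a third sub-model to input/label pairs received from the access network device) can be sketched as follows. The logistic model, shapes, and hyperparameters are illustrative stand-ins, not the architecture the patent specifies:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 16-dim downlink channel feature -> 8 feature bits.
FEAT_DIM, BIT_DIM, N_SAMPLES = 16, 8, 512

# Pairs received from the access network device: channel features (inputs)
# and the first sub-model's feature bits (used as labels).
teacher = rng.standard_normal((FEAT_DIM, BIT_DIM))   # stand-in first sub-model
W = rng.standard_normal((N_SAMPLES, FEAT_DIM))
B = (W @ teacher > 0).astype(float)                  # label feature bits

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

# Third sub-model: one linear layer trained with binary cross-entropy so its
# thresholded outputs imitate the first sub-model's feature bits.
weights = np.zeros((FEAT_DIM, BIT_DIM))
for _ in range(300):
    p = sigmoid(W @ weights)                         # predicted bit probabilities
    weights -= 0.5 * (W.T @ (p - B)) / N_SAMPLES     # BCE gradient step

bits = (sigmoid(W @ weights) > 0.5).astype(float)
bit_accuracy = (bits == B).mean()                    # agreement with the labels
```

The key point is that only the (input, label) pairs cross the air interface; the terminal never receives the first sub-model's weights.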
  • the terminal device sends a training completion notification to the access network device, where the training completion notification is used to indicate that the training of the third sub-model is completed.
  • the terminal device may execute S605 after completing the training of the third sub-model in S604.
  • after the access network device obtains the training completion notification, it can know that the third sub-model on the terminal device side can be used together with the second sub-model on the access network device side for joint inference, or it can be understood that the third sub-model and the second sub-model can form a new bilateral model for channel information feedback, which is denoted as the second bilateral model.
  • the terminal device may also notify the access network device of the performance of the trained third sub-model, for example, the terminal device includes performance information in the aforementioned training completion notification.
  • the performance information may include performance parameters satisfied by the trained third sub-model, such as loss function thresholds for training/testing/validation, or other performance metrics such as MSE, NMSE, and cross-entropy.
  • the performance information is only used to indicate whether the third sub-model meets the performance requirement, and does not specifically indicate the performance parameter.
  • S606-S608 below illustrate an example in which the third sub-model is matched with the second sub-model.
  • S606-S608 may not be executed, or S606-S608 may also be replaced with other examples in which the third sub-model is matched with the second sub-model, which is not limited by the present disclosure.
  • the terminal device determines the first feature bit according to the third channel information and the third sub-model; wherein, the input of the third sub-model includes the third channel information, and the output of the third sub-model includes the first characteristic bit;
  • the terminal device sends information used to indicate the first feature bit to the access network device.
  • the access network device obtains the first channel information according to the second sub-model and the first feature bit; wherein, the input of the second sub-model includes the first feature bit, and the output of the second sub-model includes the first channel information.
  • the access network device trains the bilateral model used for channel information feedback, and indicates the input/output related to the sub-models in the bilateral model to the terminal device.
  • terminal devices can independently train sub-models with the same function and use them in conjunction with other sub-models on access network devices. This meets the application requirements of bilateral models and eliminates the need to transmit models over the air interface, which can reduce transmission overhead and improve communication security.
  • the present disclosure also provides a schematic diagram of a deployment process of a bilateral model.
  • Four stages are schematically shown in FIG. 6B.
  • in model training stage 1, the access network device trains a bilateral model that includes sub-model 1 and sub-model 2. The input type of sub-model 1 is a downlink channel feature, such as a feature vector or feature matrix W, and the output type of sub-model 1 is the feature bit; the input type of sub-model 2 is the same as the output type of sub-model 1, and the output type of sub-model 2 is the restored downlink channel feature.
  • the access network device sends the input data (denoted as W1) and/or output data (denoted as characteristic bit B) of sub-model 1 to the terminal device.
  • the terminal device can train the sub-model 3 according to the received W1 and/or feature bits B.
  • This submodel 3 has the same function as submodel 1.
  • in the joint inference stage, the terminal device can obtain data of the feature bit type according to sub-model 3 and data of type W; that is, the input type of sub-model 3 is W, and the output type of sub-model 3 is the feature bit. The terminal device sends the data of the feature bit type to the access network device. The access network device can obtain data of the restored downlink channel feature type according to sub-model 2 and the received feature bits; that is, the input type of sub-model 2 is the feature bit, and the output type of sub-model 2 is the restored downlink channel feature.
  • submodel 1 in FIG. 6B can be the first submodel in FIG. 6A
  • submodel 2 in FIG. 6B can be the second submodel in FIG. 6A
  • sub-model 3 in FIG. 6B can be the third sub-model in FIG. 6A.
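The four-stage flow of FIG. 6B can be sketched with trivial stand-in models (a random-projection encoder and a toy codebook decoder, both hypothetical); the point of the sketch is that only data, never a model, crosses the air interface:

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.standard_normal((16, 8))             # hypothetical encoder weights

def sub_model_1(w):                          # gNB-side encoder: W -> feature bits
    return (w @ P > 0).astype(int)

def sub_model_2(bits, codebook):             # gNB-side decoder: bits -> W
    return codebook[tuple(bits)]             # toy codebook lookup

# Stage 1: the access network device trains the bilateral model; here the
# "training" just builds a codebook that inverts sub-model 1 on seen data.
W1 = rng.standard_normal((32, 16))
B1 = np.array([sub_model_1(w) for w in W1])
codebook = {tuple(b): w for w, b in zip(W1, B1)}

# Stage 2: the gNB sends the data pairs (W1, B1) to the terminal, not a model.
# Stage 3: the terminal trains sub-model 3 from (W1, B1); assume training
# reproduced sub-model 1's behavior (the actual fitting is omitted here).
sub_model_3 = sub_model_1

# Stage 4: joint inference: the terminal reports bits, the gNB recovers W.
w = W1[0]
reported_bits = sub_model_3(w)
w_hat = sub_model_2(reported_bits, codebook)
```

By construction, the recovered `w_hat` encodes to the same feature bits the terminal reported, which is the consistency the matched sub-model pair must provide.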
  • the present disclosure provides another communication method. Referring to FIG. 7A , the method includes the following process.
  • the terminal device acquires training data.
  • the training data includes N pieces of channel information.
  • N is a positive integer, that is, N is an integer greater than or equal to 1.
  • the training data is used to determine a bilateral model, which includes a first sub-model and a second sub-model.
  • the channel information may specifically include a downlink channel or downlink channel features.
  • the terminal device can determine channel information according to the measurement of the downlink reference signal.
  • the terminal device determines a first submodel and a second submodel according to the N pieces of channel information.
  • the first sub-model and the second sub-model constitute a bilateral model for channel information feedback, which is denoted as the third bilateral model.
  • the terminal device may divide the acquired N pieces of channel information into one or more training sets.
  • the terminal device can use part or all of the one or more training sets to train the same model.
  • the terminal device can use one training set to train the same bilateral model.
  • the terminal device can also train the same bilateral model using multiple training sets.
  • the terminal device can train a bilateral model according to the acquired downlink channel features; or, the terminal device can first convert the downlink channel feature matrix or vector W into a PMI, restore the downlink channel feature matrix or vector according to the PMI, and then use the restored downlink channel feature matrix or vector to train the bilateral model.
  • the input to the third bilateral model includes channel information
  • the output of the third bilateral model includes recovered channel information.
  • the channel information may be a downlink channel characteristic or a downlink channel.
  • the definition of channel information can be understood with reference to S601, which will not be repeated here.
  • training the third bilateral model can be understood as minimizing the difference between the input channel information and the output channel information. The loss function corresponding to the third bilateral model can be expressed as the MSE between the input channel information and the output channel information, the cross-entropy between the input channel information and the output channel information, or the cosine similarity between the input channel information and the output channel information, etc.
  • training the third bilateral model can also minimize the difference between the output channel information and the label channel information. The loss function corresponding to the third bilateral model can be expressed as the MSE between the output channel information and the label channel information, the cross-entropy between the output channel information and the label channel information, or the cosine similarity between the output channel information and the label channel information, etc.
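The reconstruction metrics named above can be sketched as follows; the example vectors are illustrative, and NMSE is included alongside MSE since the performance reporting steps mention it:

```python
import numpy as np

def mse(x, y):
    return float(np.mean(np.abs(x - y) ** 2))

def nmse(x, y):
    # Normalized MSE: reconstruction error relative to the reference power.
    return float(np.sum(np.abs(x - y) ** 2) / np.sum(np.abs(x) ** 2))

def cosine_similarity(x, y):
    return float(np.abs(np.vdot(x, y)) / (np.linalg.norm(x) * np.linalg.norm(y)))

# Illustrative: an input channel feature vector and two recovered versions.
w = np.array([1.0, 2.0, 2.0])
perfect = w.copy()
noisy = w + np.array([0.0, 0.0, 0.3])

loss_perfect = nmse(w, perfect)              # 0: perfect reconstruction
loss_noisy = nmse(w, noisy)                  # > 0: imperfect reconstruction
sim_noisy = cosine_similarity(w, noisy)      # close to, but below, 1
```

Note the sign convention: MSE/NMSE are minimized, while cosine similarity is maximized (a loss built from it would typically be its negative or `1 - similarity`).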
  • the input type of the second sub-model may be consistent with the input type of the third bilateral model, that is, the input type of the second sub-model is channel information; or, the input of the third bilateral model is the input of the second sub-model.
  • the output type of the second sub-model is a feature bit, and the feature bit includes one or more binary bits. It can be understood that the feature bits are a low-dimensional expression of channel information, and the second sub-model is used to compress and/or quantize the channel information to obtain feature bits.
  • the input of the first sub-model is determined by the output of the second sub-model. For example, the input type of the first sub-model is consistent with the output type of the second sub-model, both being feature bits; or, the dimension of the input data of the first sub-model is the same as the dimension of the output data of the second sub-model; or, the input data of the first sub-model includes the output data of the second sub-model; or, the output data of the second sub-model can be preprocessed and then input to the first sub-model, that is, the input data of the first sub-model includes the preprocessed output data of the second sub-model.
  • the output of the first submodel is recovered channel information.
  • the third bilateral model may be an autoencoder, wherein the second sub-model is an encoder, and the first sub-model is a decoder.
  • the terminal device may preset the dimension of the feature bits according to actual requirements, and the dimension of the feature bits may also be referred to as the number of bits included in the feature bits. For example, in consideration of feedback overhead, the terminal device may reduce the dimension of feature bits to reduce feedback overhead. Specifically, the terminal device may set the dimension of the feature bit to be smaller than the first dimension threshold. For example, in consideration of feedback accuracy, the terminal device may increase the dimension of feature bits to improve feedback accuracy. Specifically, the terminal device may set the dimension of the feature bit to be greater than the first dimension threshold.
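The tradeoff described above (feature-bit dimension versus feedback overhead and accuracy) can be illustrated with a simple uniform quantizer whose bit budget is configurable; the quantizer, range assumption, and values are all hypothetical, not the compression scheme of the patent:

```python
import numpy as np

def quantize(w, bits_per_coeff):
    """Uniformly quantize coefficients assumed to lie in [-1, 1)."""
    levels = 2 ** bits_per_coeff
    idx = np.clip(((w + 1.0) / 2.0 * levels).astype(int), 0, levels - 1)
    overhead_bits = w.size * bits_per_coeff  # feedback dimension in bits
    return idx, overhead_bits

def dequantize(idx, bits_per_coeff):
    levels = 2 ** bits_per_coeff
    return (idx + 0.5) / levels * 2.0 - 1.0

w = np.array([-0.8, -0.1, 0.3, 0.9])
idx_lo, overhead_lo = quantize(w, 1)         # small feature-bit dimension
idx_hi, overhead_hi = quantize(w, 6)         # large feature-bit dimension
err_lo = float(np.mean((w - dequantize(idx_lo, 1)) ** 2))
err_hi = float(np.mean((w - dequantize(idx_hi, 6)) ** 2))
# Larger feature-bit dimension: more feedback overhead, smaller error.
```

This mirrors the two design directions in the text: shrinking the dimension below a threshold to cut feedback overhead, or growing it above the threshold to improve feedback accuracy.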
  • the terminal device sends the input data and output data of the first sub-model to the access network device.
  • the terminal device can use the first sub-model to generate the input data and output data of the first sub-model; or, the terminal device can use the second sub-model to generate the input data of the first sub-model, and use the first sub-model to generate the output data of the first sub-model.
  • the input data of the first sub-model includes M feature bits
  • the output data of the first sub-model includes channel information corresponding to the M feature bits, where M is a positive integer.
  • the terminal device may send M downlink channel features and M feature bits to the access network device.
  • the terminal device may convert the M downlink channel features into M PMIs, and then send the M feature bits and the M PMIs to the access network device.
  • S703 may also be replaced by: the terminal device sends the input data and label data of the first sub-model to the access network device.
  • the terminal device may use the second sub-model to generate input data of the first sub-model, and the terminal device may send the input data of the first sub-model and corresponding label data to the access network device.
  • the label data of the first sub-model is the label data of the third bilateral model.
  • the label data corresponding to the input data of the first sub-model is the label data corresponding to the input data of the third bilateral model corresponding to the input data of the first sub-model.
  • the access network device trains the third sub-model according to the input data and output data of the first sub-model.
  • the access network device may directly use the M feature bits sent by the terminal device and the channel information corresponding to the M feature bits to train the third sub-model. Or, if the channel information sent by the terminal device is specifically PMI, the access network device may first restore the M PMIs to M downlink channel feature vectors or matrices, and then use the M feature bits and the M downlink channel feature vectors or matrices to train the third sub-model.
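Training a decoder-style third sub-model from the reported (feature bit, channel information) pairs can be sketched as a least-squares regression; the linear decoder is an MSE-minimizing stand-in for whatever network an implementation would actually train, and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
M, BIT_DIM, FEAT_DIM = 256, 12, 6            # hypothetical sizes

# Pairs reported by the terminal device: M feature bits (inputs) and the
# corresponding downlink channel features (regression targets).
B = rng.integers(0, 2, size=(M, BIT_DIM)).astype(float)
true_map = rng.standard_normal((BIT_DIM, FEAT_DIM))  # unknown to the gNB
W = B @ true_map                             # channel features to recover

# Third sub-model: a linear decoder fitted by least squares, i.e. the
# MSE-optimal linear map from feature bits to channel features.
decoder, *_ = np.linalg.lstsq(B, W, rcond=None)

W_hat = B @ decoder
reconstruction_nmse = float(np.sum((W - W_hat) ** 2) / np.sum(W ** 2))
```

Because the synthetic targets here are exactly linear in the bits, the fitted decoder reconstructs them with near-zero NMSE; real channel data would leave a residual that the chosen loss (MSE, NMSE, etc.) measures.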
  • This step can be implemented with reference to the description of S504. This disclosure will not describe it in detail.
  • this step can also be replaced by: the access network device trains the third sub-model according to the input data and label data of the first sub-model.
  • the access network device sends a training completion notification to the terminal device, where the training completion notification is used to indicate that the training of the third sub-model is completed.
  • the access network device may execute S705 after completing the training of the third sub-model in S704. After the terminal device obtains the training completion notification, it can know that the third sub-model on the access network device side can be used together with the second sub-model on the terminal device side for joint inference, or it can be understood that the third sub-model and the second sub-model can form a new bilateral model for channel information feedback, which is denoted as the fourth bilateral model.
  • the access network device may also notify the terminal device of the performance of the trained third sub-model, for example, the access network device includes performance information in the aforementioned training completion notification.
  • the performance information may include performance parameters satisfied by the trained third sub-model, such as loss function thresholds for training/testing/validation, or other performance metrics such as MSE, NMSE, and cross-entropy.
  • the performance information is only used to indicate whether the third sub-model meets the performance requirement, and does not specifically indicate the performance parameter.
  • S706-S708 below illustrate an example in which the third sub-model is matched with the second sub-model.
  • the steps S706-S708 may not be executed, or the steps S706-S708 may also be replaced by other examples in which the third sub-model is matched with the second sub-model, which is not limited in the present disclosure.
  • the terminal device determines a second characteristic bit according to the second channel information and the second submodel; wherein, the input of the second submodel includes the second channel information, and the output of the second submodel includes the second feature bit;
  • the terminal device sends information used to indicate the second feature bit to the access network device.
  • the access network device obtains fourth channel information according to the third sub-model and the second feature bits; wherein, the input of the third sub-model includes the second feature bits, and the output of the third sub-model includes the fourth channel information.
  • the terminal device trains a bilateral model used for channel information feedback, and instructs the access network device on the input and output related to the sub-models in the bilateral model.
  • access network devices can independently train sub-models with the same function and use them in conjunction with other sub-models on terminal devices. This meets the application requirements of bilateral models and eliminates the need to transmit models over the air interface, which can reduce transmission overhead and improve communication security.
  • the present disclosure also provides a schematic diagram of a deployment process of a bilateral model.
  • Four stages are schematically shown in FIG. 7B.
  • in model training stage 1, the terminal device trains a bilateral model that includes sub-model 1 and sub-model 2. The input type of sub-model 1 is a downlink channel feature, such as a feature vector or feature matrix W, and the output type of sub-model 1 is the feature bit; the input type of sub-model 2 is the same as the output type of sub-model 1, and the output type of sub-model 2 is the restored downlink channel feature.
  • the terminal device sends the input data of sub-model 2 (denoted as feature bit B) and the output data of sub-model 2 (the restored downlink channel feature) to the access network device.
  • the access network device can use the received feature bits B and the corresponding restored downlink channel features to train sub-model 4.
  • This submodel 4 has the same function as submodel 2.
  • in the joint inference stage, the terminal device can obtain data of the feature bit type according to sub-model 1 and data of type W; that is, the input type of sub-model 1 is W, and the output type of sub-model 1 is the feature bit. The terminal device sends the data of the feature bit type to the access network device. The access network device can obtain data of the restored downlink channel feature type according to sub-model 4 and the received feature bits; that is, the input type of sub-model 4 is the feature bit, and the output type of sub-model 4 is the restored downlink channel feature.
  • sub-model 1 in FIG. 7B can be the second sub-model in FIG. 7A
  • sub-model 2 in FIG. 7B can be the first sub-model in FIG. 7A
  • sub-model 4 in FIG. 7B can be the third sub-model in FIG. 7A.
  • a communication method is illustrated, and the method includes the following process.
  • a third-party network element determines a first sub-model and a second sub-model.
  • the first network element may be an access network device
  • the second network element may be, for example, a terminal device.
  • the first network element may be a terminal device
  • the second network element may be an access network device.
  • the third-party network element may be an independent AI network element.
  • the third-party network element determines input data and output data of the first sub-model, and then executes S803a.
  • this step S802a may also be replaced by: the third-party network element determines the input data and label data of the first sub-model.
  • the third-party network element determines the input data and output data of the second sub-model, and then executes S803b.
  • this step S802b may also be replaced by: the third-party network element determines the input data and label data of the second sub-model.
  • the third-party network element sends the input data and/or output data of the first sub-model to the second network element, and then executes S804a.
  • this step S803a may also be replaced by: the third-party network element sends the input data and label data of the first sub-model to the second network element.
  • the third-party network element sends the first information to the second network element; wherein, when the third-party network element determines the input data and/or output data of the first sub-model, the first information includes the input data and/or output data of the first sub-model; when the third-party network element determines the input data and label data of the first sub-model, the first information includes the input data and/or label data of the first sub-model.
  • the third-party network element sends the input data and/or output data of the second sub-model to the first network element, and then executes S804b.
  • this step S803b may also be replaced by: the third-party network element sends the input data and label data of the second sub-model to the first network element.
  • the third-party network element sends the second information to the first network element; wherein, when the third-party network element determines the input data and/or output data of the second sub-model, the second information includes the input data and/or output data of the second sub-model; when the third-party network element determines the input data and label data of the second sub-model, the second information includes the input data and/or label data of the second sub-model.
  • the second network element trains the third sub-model according to the input data and/or output data of the first sub-model, and then executes S805.
  • this step S804a may also be replaced by: the second network element trains the third sub-model according to the input data and/or label data of the first sub-model.
  • the first network element trains the fourth sub-model according to the input data and/or output data of the second sub-model, and then executes S805.
  • this step S804b may also be replaced by: the first network element trains the fourth sub-model according to the input data and/or label data of the second sub-model.
  • the relationship between the second sub-model and the fourth sub-model can be understood according to the relationship between the first sub-model and the third sub-model.
  • the third sub-model on the second network element is matched with the fourth sub-model on the first network element for use. That is, the third sub-model and the fourth sub-model constitute a new bilateral model.
  • a bilateral model is trained by a third-party network element, and the input/output related to the multiple sub-models in the bilateral model is indicated to multiple network elements, so that each network element independently trains a sub-model with the same function, which is matched and used with the sub-models on other network elements. This meets the application requirements of the bilateral model without transmitting the model over the air interface, which can reduce transmission overhead and improve communication security.
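A minimal sketch of the distribution step (S803a/S803b), with hypothetical message shapes, to emphasize that datasets rather than models are what the third-party network element transmits:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class TrainingDataMessage:
    """Hypothetical container: (input, output) or (input, label) pairs of one
    sub-model, sent in place of the sub-model itself."""
    sub_model_id: str
    pairs: List[Tuple[list, list]] = field(default_factory=list)

def build_messages(first_sub_model_pairs, second_sub_model_pairs):
    # S803a: first sub-model data goes to the second network element;
    # S803b: second sub-model data goes to the first network element.
    to_second_ne = TrainingDataMessage("first_sub_model", first_sub_model_pairs)
    to_first_ne = TrainingDataMessage("second_sub_model", second_sub_model_pairs)
    return to_second_ne, to_first_ne

# Example: one tiny (input, output) pair per sub-model.
msg_a, msg_b = build_messages([([0.1, 0.2], [1, 0])], [([1, 0], [0.1, 0.2])])
```

Each receiving network element then runs its own local training loop (S804a/S804b) on the pairs it received, so no model weights ever cross the air interface.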
  • the first network element, the second network element, and the third-party network element may include a hardware structure and/or a software module, and implement the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Whether one of the above functions is executed in the form of a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and the design constraints of the technical solution.
  • the present disclosure provides a communication device 900 , where the communication device 900 includes a processing module 901 and a communication module 902 .
  • the communication device 900 may be the first network element, or a communication device that is applied to the first network element or used in conjunction with the first network element, and can implement the communication method performed on the first network element side; or, the communication device 900 may be the second network element, or a communication device that is applied to the second network element or used in conjunction with the second network element, and can implement the communication method performed on the second network element side; or, the communication device 900 may be a third-party network element, or a communication device that is applied to a third-party network element or used in conjunction with a third-party network element, and can implement the communication method performed on the third-party network element side.
  • the communication module may also be referred to as a transceiver module, a transceiver, a transceiver unit, a transceiver device, and the like.
  • a processing module may also be called a processor, a processing board, a processing unit, a processing device, and the like.
  • it should be understood that the communication module is used to perform the sending and receiving operations on the access network device side or the terminal device side in the above method embodiments. The device used to implement the receiving function in the communication module can be regarded as a receiving unit, and the device used to implement the sending function in the communication module can be regarded as a sending unit; that is, the communication module includes a receiving unit and a sending unit.
  • the receiving unit included in its communication module 902 is used to perform a receiving operation on the first network element side, for example, receive a signal from a second network element.
  • the sending unit included in the communication module 902 is configured to perform a sending operation on the first network element side, for example, send a signal to the second network element.
  • the receiving unit included in the communication module 902 is used to perform the receiving operation on the side of the second network element, for example, receiving a signal from the first network element; the sending unit included in the communication module 902 It is used to perform a sending operation on the second network element side, for example, sending a signal to the first network element.
  • the receiving unit included in the communication module 902 is used to perform a receiving operation on the third-party network element side, for example, receiving a signal from the first network element or the second network element; the sending unit included in the communication module 902 is configured to perform a sending operation on the third-party network element side, for example, sending a signal to the first network element or the second network element.
  • the communication module may be an input and output circuit and/or a communication interface, which performs input operations (corresponding to the aforementioned receiving operations) and output operations (corresponding to the aforementioned sending operations);
  • the processing module is an integrated processor or microprocessor or integrated circuit.
  • the first network element may be an access network device or a terminal device.
  • the communication device 900 includes:
  • the processing module 901 is configured to determine a first sub-model and a second sub-model, and the first sub-model and the second sub-model can be matched and used.
  • a communication module 902 configured to send first information, where the first information is used to indicate input data of the first sub-model and/or output data of the first sub-model, or the first information is used to Indicates input data for the first submodel and label data for the first submodel.
  • in this way, the input data and/or output data of a sub-model can be used to independently train a sub-model with the same function as that sub-model, so that the trained sub-model can be matched and used with the peer sub-model, without the need to transmit the sub-model over the air interface, which can reduce transmission overhead and improve communication security.
  • the output of the first sub-model is used to determine the input of the second sub-model; or, the output of the second sub-model is used to determine the input of the first sub-model.
  • the first submodel is used to send information at the sending end, and the second submodel is used to receive the information at the receiving end; or, the second submodel is used to The end sends information, and the first submodel is used to receive the information at the receiving end.
  • the first sub-model and the second sub-model belong to a bilateral model.
  • the first information is used for training the third sub-model.
  • the function of the third sub-model is the same as the function of the first sub-model; and/or, the input type of the third sub-model is the same as the input type of the first sub-model, and the output type of the third sub-model is the same as the output type of the first sub-model; and/or, the dimension of the input data of the third sub-model is the same as the dimension of the input data of the first sub-model, and the dimension of the output data of the third sub-model is the same as the dimension of the output data of the first sub-model; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output of the first sub-model is smaller than the first threshold; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output label of the first sub-model is smaller than a threshold.
  • the function of the third sub-model is the same as the function of the second sub-model; and/or, the input type of the third sub-model is the same as the input type of the second sub-model, and the output type of the third sub-model is the same as the output type of the second sub-model; and/or, the dimension of the input data of the third sub-model is the same as the dimension of the input data of the second sub-model, and the dimension of the output data of the third sub-model is the same as the dimension of the output data of the second sub-model; and/or, when the input of the third sub-model is the same as the input of the second sub-model, the difference between the output of the third sub-model and the output of the second sub-model is smaller than the first threshold; and/or, when the input of the third sub-model is the same as the input of the second sub-model, the difference between the output of the third sub-model and the output label of the second sub-model is smaller than a threshold.
  • the third sub-model that replaces the second sub-model can be trained independently, so that the third sub-model can be matched with the first sub-model, and the transmission overhead of sending the second sub-model can be reduced.
  • the third sub-model and the first sub-model can also form a new bilateral model.
  • the processing module 901 is specifically configured to: determine the first sub-model and the second sub-model according to training data; wherein, the training data includes N pieces of channel information, and N is a positive integer, and the channel information includes downlink channel characteristics or downlink channels.
  • the input data of the first sub-model includes M pieces of channel information, where M is a positive integer.
  • the output data of the first submodel includes feature bits corresponding to M pieces of channel information, where M is a positive integer.
  • the communication module 902 is further configured to obtain information indicating a first feature bit, and the output of the third sub-model includes the first feature bit; the processing module 901 is further configured to obtain first channel information according to the second sub-model and the first feature bit; wherein, the input of the second sub-model includes the first feature bit, and the output of the second sub-model includes the first channel information.
  • the input data of the first sub-model includes M feature bits, where M is a positive integer.
  • the output data of the first sub-model includes channel information corresponding to M feature bits, where M is a positive integer.
  • the input data of the first sub-model includes M feature bits, the label data of the first sub-model includes channel information corresponding to the M feature bits, and M is a positive integer.
  • the processing module 901 is further configured to determine a second feature bit according to second channel information and the second sub-model; wherein the input of the second sub-model includes the second channel information, and the output of the second sub-model includes the second feature bit; and the communication module 902 is further configured to send information indicating the second feature bit.
  • the processing module 901 is further configured to determine a second feature bit according to second channel information and the second sub-model; wherein the input of the second sub-model includes the second channel information, and the output of the second sub-model includes the second feature bit; and the communication module 902 is further configured to send information indicating the second feature bit and the second channel information, or send information indicating the second feature bit and label channel information corresponding to the second feature bit.
  • the label channel information corresponding to the second feature bit may be understood as an output label of the third sub-model; for example, it may be the second channel information.
  • the communication module 902 is configured to obtain first information, the first information is used to indicate the input data of the first sub-model and/or the output data of the first sub-model, or the first information is used to indicate the Input data for a first submodel and label data for the first submodel.
  • a processing module 901 configured to train a third sub-model according to the first information.
  • the acquired input data and/or output data of a sub-model can be used to independently train a sub-model with the same function as the sub-model. It can be applied to scenarios where bilateral models are deployed, without the need to transmit sub-models over the air interface, which can reduce transmission overhead and improve communication security.
  • the function of the third sub-model is the same as the function of the first sub-model; and/or, the input type of the third sub-model is the same as the input type of the first sub-model, and the output type of the third sub-model is the same as the output type of the first sub-model; and/or, the dimension of the input data of the third sub-model is the same as the dimension of the input data of the first sub-model, and the dimension of the output data of the third sub-model is the same as the dimension of the output data of the first sub-model; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output of the first sub-model is less than a first threshold.
  • the first sub-model and the second sub-model can be matched and used.
  • the output of the first sub-model is used to determine the input of the second sub-model; or, the output of the second sub-model is used to determine the input of the first sub-model.
  • the first sub-model is used to send information at a sending end, and the second sub-model is used to receive the information at a receiving end; or, the second sub-model is used to send information at a sending end, and the first sub-model is used to receive the information at a receiving end.
  • the first sub-model and the second sub-model belong to a bilateral model; the third sub-model and the second sub-model form a new bilateral model.
  • the input data of the first sub-model includes M pieces of channel information, where M is a positive integer.
  • the output data of the first submodel includes feature bits corresponding to the M pieces of channel information, where M is a positive integer.
  • the processing module 901 is further configured to determine the first feature bit according to third channel information and the third sub-model; wherein the input of the third sub-model includes the third channel information, and the output of the third sub-model includes the first feature bit.
  • the communication module 902 is further configured to send information indicating the first feature bit.
  • the input parameters of the first sub-model include M feature bits, where M is a positive integer.
  • the output parameters of the first submodel include channel information corresponding to the M feature bits, where M is a positive integer.
  • the communication module 902 is further configured to obtain information indicating the second feature bit; and the processing module 901 is further configured to obtain fourth channel information according to the third sub-model and the second feature bit; wherein the input of the third sub-model includes the second feature bit, and the output of the third sub-model includes the fourth channel information.
  • the third-party network element may be an AI network element.
  • the processing module 901 is configured to determine the first sub-model and the second sub-model.
  • a communication module 902, configured to send first information to a second network element, where the first information includes input data and/or output data of the first sub-model, or the first information includes input data and label data of the first sub-model, and the first information is used for training of the third sub-model; and send second information to a first network element, where the second information includes input data and/or output data of the second sub-model, or the second information includes input data and label data of the second sub-model, and the second information is used for training of the fourth sub-model.
  • each functional module in the embodiments of the present disclosure may be integrated into one processor, may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
  • the present disclosure also provides a communication device 1000 .
  • the communication device 1000 may be a chip or a chip system.
  • the chip system may consist of a chip, or may include a chip and other discrete devices.
  • the communication device 1000 may be used to realize the functions of the terminal device or the access network device in the communication system shown in FIG. 1A or to realize the functions of the terminal device, the access network device or the AI network element in the communication system shown in FIG. 1B .
  • the communication device 1000 may include at least one processor 1010, and the processor 1010 is coupled to a memory.
  • the memory may be located within the device, the memory may be integrated with the processor, or the memory may be located outside the device.
  • the communication device 1000 may further include at least one memory 1020 .
  • the memory 1020 stores the computer programs, configuration information, instructions and/or data necessary for implementing any of the above embodiments; the processor 1010 may execute the computer program stored in the memory 1020 to complete the method in any of the above embodiments.
  • the communication apparatus 1000 may further include a communication interface 1030, and the communication apparatus 1000 may perform information exchange with other devices through the communication interface 1030.
  • the communication interface 1030 may be a transceiver, a circuit, a bus, a module, a pin or other types of communication interfaces.
  • the communication interface 1030 in the apparatus 1000 may alternatively be an input/output circuit, which can input information (or in other words, receive information) and output information (or in other words, send information).
  • the processor is an integrated processor or a microprocessor or an integrated circuit or a logic circuit, and the processor can determine output information according to input information.
  • the coupling in the present disclosure is an indirect coupling or communication connection between devices, units or modules, which may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • the processor 1010 may cooperate with the memory 1020 and the communication interface 1030 .
  • the specific connection medium among the processor 1010, the memory 1020, and the communication interface 1030 is not limited in the present disclosure.
  • the processor 1010, the memory 1020 and the communication interface 1030 are connected to each other through a bus 1040.
  • the bus 1040 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus or an extended industry standard architecture (extended industry standard architecture, EISA) bus or the like.
  • PCI peripheral component interconnect
  • EISA extended industry standard architecture
  • the bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 10, but this does not mean that there is only one bus or only one type of bus.
  • the processor may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps and logical block diagrams disclosed in the present disclosure.
  • a general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed in conjunction with the present disclosure may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • the memory may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or may be a volatile memory, such as a random-access memory (RAM).
  • the memory may be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • the memory in the present disclosure may also be a circuit or any other device capable of implementing a storage function for storing program instructions and/or data.
  • the communication device 1000 can be applied to a terminal device.
  • the communication apparatus 1000 may be a terminal device, or an apparatus capable of supporting a terminal device in implementing the functions of the terminal device in any of the above embodiments.
  • the memory 1020 stores the computer programs or instructions and/or data necessary for implementing the functions of the terminal device in any of the foregoing embodiments.
  • the processor 1010 may execute the computer program stored in the memory 1020 to complete the method performed by the terminal device in any of the foregoing embodiments.
  • the communication interface in the communication apparatus 1000 can be used to interact with access network equipment, send data to the access network equipment or receive data from the access network equipment.
  • the communication apparatus 1000 can be applied to an access network device; specifically, the communication apparatus 1000 may be an access network device, or an apparatus capable of supporting an access network device in implementing the functions of the access network device in any of the above embodiments.
  • the memory 1020 stores the computer programs or instructions and/or data necessary for implementing the functions of the access network device in any of the foregoing embodiments.
  • the processor 1010 may execute the computer program stored in the memory 1020 to complete the method performed by the access network device in any of the foregoing embodiments.
  • the communication interface in the communication apparatus 1000 can be used to interact with the terminal equipment, send data to the terminal equipment or receive data from the terminal equipment.
  • the communication apparatus 1000 can be applied to an AI network element; specifically, the communication apparatus 1000 may be an AI network element, or an apparatus capable of supporting an AI network element in implementing the functions of the AI network element in any of the above embodiments.
  • the memory 1020 stores the computer programs or instructions and/or data necessary for implementing the functions of the AI network element in any of the foregoing embodiments.
  • the processor 1010 may execute the computer program stored in the memory 1020 to complete the method performed by the AI network element in any of the foregoing embodiments.
  • the communication interface in the communication device 1000 can be used to interact with access network equipment, send data to the access network equipment or receive data from the access network equipment.
  • the communication apparatus 1000 provided in this embodiment can be applied to a terminal device to perform the method performed by the terminal device above, applied to an access network device to perform the method performed by the access network device, or applied to an AI network element to perform the method performed by the AI network element. For the technical effects that can be obtained, refer to the above method embodiments; details are not repeated here.
  • the present disclosure also provides a computer program.
  • when the computer program runs on a computer, the computer is caused to perform, from the perspective of the terminal device side or the access network device side, the communication methods provided in the embodiments shown in FIG. 5A to FIG. 8.
  • the present disclosure also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a computer, the computer is caused to perform, from the perspective of the terminal device side or the access network device side, the communication methods provided in the embodiments shown in FIG. 5A to FIG. 8.
  • the storage medium may be any available medium that can be accessed by a computer.
  • computer-readable media may include RAM, read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • the present disclosure provides a communication system, including a terminal device and an access network device, wherein the terminal device and the access network device can realize the communication method.
  • the present disclosure also provides a chip, where the chip is configured to read a computer program stored in a memory and implement, from the perspective of the terminal device side or the access network device side, the communication methods provided in the embodiments shown in FIG. 5A to FIG. 8.
  • the present disclosure provides a chip system, which includes a processor, configured to support a computer device to implement the functions involved in the terminal device or the access network device in the embodiments shown in FIG. 5A to FIG. 8 .
  • the chip system further includes a memory, and the memory is used to store necessary programs and data of the computer device.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • the technical solution provided by the present disclosure may be fully or partially realized by software, hardware, firmware or any combination thereof.
  • when implemented using software, the solution may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, the processes or functions according to the present disclosure are produced in whole or in part.
  • the computer may be a general-purpose computer, a dedicated computer, a computer network, an access network device, a terminal device, an AI network element, or another programmable apparatus.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (digital video disc, DVD)), or a semiconductor medium.
  • the various embodiments may refer to each other; for example, methods and/or terms may be referred to between the method embodiments, functions and/or terms may be referred to between the apparatus embodiments, and functions and/or terms may be referred to between the apparatus embodiments and the method embodiments.


Abstract

The present disclosure provides a communication method and apparatus, for solving the problem of high model transmission overhead. The method includes: a first network element (such as an access network device/terminal device) determines a first sub-model and a second sub-model, where the first sub-model and the second sub-model can be used in a matched manner; and the first network element sends first information to a second network element (such as a terminal device/access network device), where the first information indicates input data of the first sub-model and/or output data of the first sub-model. The second network element may train a third sub-model according to the first information. The function of the third sub-model is the same as the function of the first sub-model.

Description

Communication method and apparatus
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to Chinese Patent Application No. 202111064144.X, filed with the China National Intellectual Property Administration of the People's Republic of China on September 10, 2021 and entitled "Communication method and apparatus", which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
This application relates to the field of communication technologies, and in particular, to a communication method and apparatus.
BACKGROUND
In wireless communication networks, for example in mobile communication networks, the services supported by the network are increasingly diverse, and therefore the requirements to be met are increasingly diverse. For example, the network needs to be able to support ultra-high rates, ultra-low latency, and/or ultra-large numbers of connections. These characteristics make network planning, network configuration, and/or resource scheduling increasingly complex. In addition, because network capabilities keep growing, for example support for ever higher spectrum, high-order multiple input multiple output (MIMO) technologies, beamforming, and/or new technologies such as beam management, network energy saving has become a popular research topic. These new requirements, scenarios and characteristics bring unprecedented challenges to network planning, operation and maintenance, and efficient operation. To meet these challenges, artificial intelligence technologies may be introduced into the wireless communication network to implement network intelligence. On this basis, how to effectively implement artificial intelligence in a network is a problem worth studying.
SUMMARY
The present disclosure provides a communication method and apparatus, to reduce transmission overhead and improve communication security.
According to a first aspect, the present disclosure provides a communication method, including: determining a first sub-model and a second sub-model, where the first sub-model and the second sub-model can be used in a matched manner; and sending first information, where the first information indicates input data of the first sub-model and/or output data of the first sub-model, or the first information indicates input data of the first sub-model and label data of the first sub-model. The label data of the first sub-model is an expected value or a target value of the output data of the first sub-model, and may be understood as the data that the first sub-model is expected to output; the label data of the first sub-model may alternatively be described as an output label of the first sub-model.
In the foregoing design, providing the input data and/or output data of one of a plurality of sub-models that can be used in a matched manner can be used to independently train a sub-model that has the same function as that sub-model or that is used in a matched manner with it; or providing the input data and label data of one of the matched sub-models can be used to independently train a sub-model with the same function as that sub-model. There is no need to transmit the sub-model over the air interface, which can reduce transmission overhead and improve communication security.
In a possible design, the output of the first sub-model is used to determine the input of the second sub-model; or, the output of the second sub-model is used to determine the input of the first sub-model. Such a design enables matched use of a plurality of sub-models.
In a possible design, the first sub-model is used to send information at a sending end, and the second sub-model is used to receive the information at a receiving end; or, the second sub-model is used to send information at a sending end, and the first sub-model is used to receive the information at a receiving end. Such a design can be used in scenarios such as compressing/modulating information with a model, and can reduce information transmission overhead.
In a possible design, the first sub-model and the second sub-model belong to one bilateral model. Such a design is applicable to scenarios where a bilateral model needs to be deployed, can reduce the overhead of sending a sub-model of the bilateral model, avoid leakage of the bilateral model algorithm, and improve communication security.
In a possible design, the first information is used for training of a third sub-model.
In an optional implementation, the function of the third sub-model is the same as the function of the first sub-model; and/or, the input type of the third sub-model is the same as the input type of the first sub-model, and the output type of the third sub-model is the same as the output type of the first sub-model; and/or, the dimension of the input data of the third sub-model is the same as the dimension of the input data of the first sub-model, and the dimension of the output data of the third sub-model is the same as the dimension of the output data of the first sub-model; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output of the first sub-model is less than a first threshold; and/or, when the input of the third sub-model is the same as the input of the first sub-model, the difference between the output of the third sub-model and the output label of the first sub-model is less than a second threshold. With such a design, a third sub-model that replaces the first sub-model can be trained independently, so that the third sub-model can be used in a matched manner with the second sub-model, and the transmission overhead of sending the first sub-model can be reduced. Moreover, the third sub-model and the second sub-model can also form a new bilateral model.
In another optional implementation, when the first information indicates the input data of the first sub-model and/or the output data of the first sub-model, the function of the third sub-model is the same as the function of the second sub-model; and/or, the input type of the third sub-model is the same as the input type of the second sub-model, and the output type of the third sub-model is the same as the output type of the second sub-model; and/or, the dimension of the input data of the third sub-model is the same as the dimension of the input data of the second sub-model, and the dimension of the output data of the third sub-model is the same as the dimension of the output data of the second sub-model; and/or, when the input of the third sub-model is the same as the input of the second sub-model, the difference between the output of the third sub-model and the output of the second sub-model is less than a first threshold; and/or, when the input of the third sub-model is the same as the input of the second sub-model, the difference between the output of the third sub-model and the output label of the second sub-model is less than a second threshold. With such a design, a third sub-model that replaces the second sub-model can be trained independently, so that the third sub-model can be used in a matched manner with the first sub-model, and the transmission overhead of sending the second sub-model can be reduced. Moreover, the third sub-model and the first sub-model can also form a new bilateral model.
In a possible design, determining the first sub-model and the second sub-model includes: determining the first sub-model and the second sub-model according to training data, where the training data includes N pieces of channel information, N is a positive integer, and the channel information includes a downlink channel feature or a downlink channel. With such a design, channel information feedback can be implemented using matched sub-models, reducing feedback overhead.
In a possible design, the input data of the first sub-model includes M pieces of channel information, where M is a positive integer.
In a possible design, the output data of the first sub-model includes feature bits corresponding to the M pieces of channel information, where M is a positive integer.
In a possible design, when the input data of the first sub-model includes M pieces of channel information, the label data of the first sub-model includes M feature bits, where M is a positive integer.
In a possible design, the method further includes: obtaining information indicating a first feature bit, where the output of the third sub-model includes the first feature bit; and obtaining first channel information according to the second sub-model and the first feature bit, where the input of the second sub-model includes the first feature bit, and the output of the second sub-model includes the first channel information.
In the foregoing design, the sending end sends feature bits using a third sub-model that is independently trained according to the input and output of the first sub-model, and the receiving end side can recover the channel information using the second sub-model matched with the first sub-model. There is no need to transmit a sub-model over the air interface, which can reduce transmission overhead and improve communication security.
In a possible design, the input data of the first sub-model includes M feature bits, where M is a positive integer.
In a possible design, the output data of the first sub-model includes channel information corresponding to the M feature bits, where M is a positive integer.
In a possible design, the input data of the first sub-model includes M feature bits, and the label data of the first sub-model includes channel information corresponding to the M feature bits, where M is a positive integer.
In a possible design, the method further includes: determining a second feature bit according to second channel information and the second sub-model, where the input of the second sub-model includes the second channel information, and the output of the second sub-model includes the second feature bit; and sending information indicating the second feature bit.
In the foregoing design, the sending end side sends feature bits using the second sub-model matched with the first sub-model, and the receiving end can recover the channel information using a third sub-model that is independently trained according to the input and output of the first sub-model. There is no need to transmit a sub-model over the air interface, which can reduce transmission overhead and improve communication security.
According to a second aspect, the present disclosure provides a communication method, including: obtaining first information, where the first information indicates input data of a first sub-model and/or output data of the first sub-model, or the first information indicates input data of the first sub-model and label data of the first sub-model; and training a third sub-model according to the first information.
For descriptions of the first sub-model, the third sub-model, and the like, refer to the first aspect; details are not repeated here. In a possible design, the method further includes: determining a first feature bit according to third channel information and the third sub-model, where the input of the third sub-model includes the third channel information, and the output of the third sub-model includes the first feature bit; and sending information indicating the first feature bit.
In a possible design, the method further includes: obtaining information indicating a second feature bit; and obtaining fourth channel information according to the third sub-model and the second feature bit, where the input of the third sub-model includes the second feature bit, and the output of the third sub-model includes the fourth channel information.
According to a third aspect, the present disclosure provides a communication apparatus. The communication apparatus may be a first network element, an apparatus in a first network element, or an apparatus that can be used in a matched manner with a first network element, where the first network element may be an access network device or a terminal device. In one design, the communication apparatus may include modules in one-to-one correspondence with the methods/operations/steps/actions described in the first aspect; the modules may be implemented by hardware circuits, by software, or by hardware circuits in combination with software.
In one design, the communication apparatus may include a processing module and a communication module.
The processing module is configured to determine a first sub-model and a second sub-model, where the first sub-model and the second sub-model can be used in a matched manner.
The communication module is configured to send first information, where the first information indicates input data of the first sub-model and/or output data of the first sub-model; or, the first information indicates input data of the first sub-model and label data of the first sub-model.
For descriptions of the first sub-model, the second sub-model, and the like, refer to the first aspect; details are not repeated here.
In a possible design, the communication module is further configured to obtain information indicating a first feature bit, where the output of the third sub-model includes the first feature bit; and the processing module is further configured to obtain first channel information according to the second sub-model and the first feature bit, where the input of the second sub-model includes the first feature bit, and the output of the second sub-model includes the first channel information.
In a possible design, the processing module is further configured to determine a second feature bit according to second channel information and the second sub-model, where the input of the second sub-model includes the second channel information, and the output of the second sub-model includes the second feature bit; and the communication module is further configured to send information indicating the second feature bit.
According to a fourth aspect, the present disclosure provides a communication apparatus. The communication apparatus may be a second network element, an apparatus in a second network element, or an apparatus that can be used in a matched manner with a second network element, where the second network element may be an access network device or a terminal device. In one design, the communication apparatus may include modules in one-to-one correspondence with the methods/operations/steps/actions described in the second aspect; the modules may be implemented by hardware circuits, by software, or by hardware circuits in combination with software. In one design, the communication apparatus may include a processing module and a communication module.
The communication module is configured to obtain first information, where the first information indicates input data of a first sub-model and/or output data of the first sub-model; or, the first information indicates input data of the first sub-model and label data of the first sub-model.
The processing module is configured to train a third sub-model according to the first information.
For descriptions of the first sub-model, the third sub-model, and the like, refer to the second aspect; details are not repeated here. In a possible design, the processing module is further configured to determine a first feature bit according to third channel information and the third sub-model, where the input of the third sub-model includes the third channel information, and the output of the third sub-model includes the first feature bit; and the communication module is further configured to send information indicating the first feature bit.
In a possible design, the communication module is further configured to obtain information indicating a second feature bit; and the processing module is further configured to obtain fourth channel information according to the third sub-model and the second feature bit, where the input of the third sub-model includes the second feature bit, and the output of the third sub-model includes the fourth channel information.
According to a fifth aspect, the present disclosure provides a communication apparatus including a processor, configured to implement the method described in the first aspect. The communication apparatus may further include a memory for storing instructions and data. The memory is coupled to the processor, and when the processor executes the instructions stored in the memory, the method described in the first aspect can be implemented. The communication apparatus may further include a communication interface used by the apparatus to communicate with other devices; exemplarily, the communication interface may be a transceiver, a circuit, a bus, a module, a pin, or another type of communication interface, and the other device may be an access network device. In a possible device, the communication apparatus includes:
a memory, configured to store program instructions; and
a processor, configured to determine a first sub-model and a second sub-model, where the first sub-model and the second sub-model can be used in a matched manner;
where the processor is further configured to send first information through the communication interface, and the first information indicates input data of the first sub-model and/or output data of the first sub-model; or, the first information indicates input data of the first sub-model and label data of the first sub-model.
According to a sixth aspect, the present disclosure provides a communication apparatus including a processor, configured to implement the method described in the second aspect. The communication apparatus may further include a memory for storing instructions and data. The memory is coupled to the processor, and when the processor executes the instructions stored in the memory, the method described in the second aspect can be implemented. The apparatus may further include a communication interface used by the apparatus to communicate with other devices; exemplarily, the communication interface may be a transceiver, a circuit, a bus, a module, a pin, or another type of communication interface, and the other device may be a terminal device. In a possible device, the apparatus includes:
a memory, configured to store program instructions; and
a processor, configured to obtain first information through the communication interface, where the first information indicates input data of a first sub-model and/or output data of the first sub-model; or, the first information indicates input data of the first sub-model and label data of the first sub-model;
where the processor is further configured to train a third sub-model according to the first information.
According to a seventh aspect, the present disclosure provides a communication system, including the communication apparatus described in the third aspect or the fifth aspect, and the communication apparatus described in the fourth aspect or the sixth aspect.
According to an eighth aspect, the present disclosure further provides a computer program; when the computer program runs on a computer, the computer is caused to perform the method provided in any one of the first aspect or the second aspect.
According to a ninth aspect, the present disclosure further provides a computer program product including instructions; when the instructions run on a computer, the computer is caused to perform the method provided in any one of the first aspect or the second aspect.
According to a tenth aspect, the present disclosure further provides a computer-readable storage medium storing a computer program or instructions; when the computer program or instructions run on a computer, the computer is caused to perform the method provided in the first aspect or the second aspect.
According to an eleventh aspect, the present disclosure further provides a chip, where the chip is configured to read a computer program stored in a memory and perform the method provided in the first aspect or the second aspect.
According to a twelfth aspect, the present disclosure further provides a chip system, including a processor configured to support a computer apparatus in implementing the method provided in the first aspect or the second aspect. In a possible design, the chip system further includes a memory, configured to store programs and data necessary for the computer apparatus. The chip system may consist of a chip, or may include a chip and other discrete devices.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1A is a schematic structural diagram of a communication system;
FIG. 1B is a schematic structural diagram of another communication system;
FIG. 2A is a schematic diagram of a neuron structure;
FIG. 2B is a schematic diagram of layer relationships of a neural network;
FIG. 3 is a schematic diagram of a deployment procedure of a bilateral model in a related technology;
FIG. 4A is a schematic diagram of an application framework of AI;
FIG. 4B to FIG. 4E are schematic diagrams of several network architectures;
FIG. 5A is a schematic flowchart of a communication method provided in the present disclosure;
FIG. 5B is a schematic diagram of a deployment procedure of a bilateral model provided in the present disclosure;
FIG. 6A is a schematic flowchart of a communication method provided in the present disclosure;
FIG. 6B is a schematic diagram of a deployment procedure of a bilateral model provided in the present disclosure;
FIG. 7A is a schematic flowchart of a communication method provided in the present disclosure;
FIG. 7B is a schematic diagram of a deployment procedure of a bilateral model provided in the present disclosure;
FIG. 8 is a schematic flowchart of a communication method provided in the present disclosure;
FIG. 9 is a schematic structural diagram of a communication apparatus provided in the present disclosure;
FIG. 10 is a schematic structural diagram of a communication apparatus provided in the present disclosure.
DETAILED DESCRIPTION
To make the objectives, technical solutions and advantages of the present disclosure clearer, the present disclosure is further described in detail below with reference to the accompanying drawings.
In the present disclosure, "at least one" indicates one or more, and "a plurality of" indicates two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate the following three cases: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the associated objects. In addition, it should be understood that although terms such as first and second may be used in the present disclosure to describe objects, these objects should not be limited by these terms; the terms are only used to distinguish the objects from one another.
The terms "include" and "have" and any variants thereof mentioned in the following description of the present disclosure are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that includes a series of steps or units is not limited to the listed steps or units, but optionally further includes other unlisted steps or units, or optionally further includes other steps or units inherent to the process, method, product or device. It should be noted that in the present disclosure, words such as "exemplary" or "for example" are used to indicate an example, an illustration or a description. Any method or design solution described as "exemplary" or "for example" in the present disclosure should not be construed as being more preferred or advantageous than other methods or design solutions. Rather, the use of words such as "exemplary" or "for example" is intended to present a related concept in a specific manner.
The technology provided in the present disclosure can be applied to various communication systems. For example, the communication system may be a third generation (3G) communication system (such as the universal mobile telecommunication system (UMTS)), a fourth generation (4G) communication system (such as the long term evolution (LTE) system), a fifth generation (5G) communication system, a worldwide interoperability for microwave access (WiMAX) or wireless local area network (WLAN) system, a converged system of a plurality of systems, or a future communication system such as a 6G communication system. The 5G communication system may also be referred to as a new radio (NR) system.
One network element in a communication system may send a signal to, or receive a signal from, another network element, where the signal may include information, configuration information, data, or the like. A network element may also be referred to as an entity, a network entity, a device, a communication device, a node, a communication node, or the like; in the present disclosure, a network element is used as an example for description. For example, the communication system may include at least one terminal device and at least one access network device. The network element sending configuration information may be an access network device, and the network element receiving configuration information may be a terminal device. In addition, it can be understood that, if the communication system includes a plurality of terminal devices, the plurality of terminal devices may also send signals to each other; that is, both the network element sending configuration information and the network element receiving configuration information may be terminal devices.
Refer to FIG. 1A, which illustrates a communication system. As an example, the communication system includes an access network device 110 and two terminal devices, namely a terminal device 120 and a terminal device 130. At least one of the terminal device 120 and the terminal device 130 may send uplink data to the access network device 110, and the access network device 110 may receive the uplink data. The access network device 110 may send downlink data to at least one of the terminal device 120 and the terminal device 130.
The terminal devices and the access network device in FIG. 1A are described in detail below.
(1) Terminal device
A terminal device, also referred to as a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), or the like, is a device that provides voice and/or data connectivity to users. For example, terminal devices include handheld devices with wireless connection functions, vehicle-mounted devices, and the like. Examples of some terminals are: wireless network cameras, mobile phones, tablet computers, notebook computers, palmtop computers, mobile internet devices (MID), wearable devices such as smart watches, virtual reality (VR) devices, augmented reality (AR) devices, wireless terminals in industrial control, terminals in internet of vehicles systems, wireless terminals in self driving, wireless terminals in smart grid, wireless terminals in transportation safety, wireless terminals in smart city such as smart fuel dispensers, terminal devices on high-speed railways, and wireless terminals in smart home, such as smart speakers, smart coffee machines and smart printers.
In the present disclosure, the communication apparatus for implementing the functions of a terminal device may be a terminal device, a terminal device having some terminal functions, or an apparatus capable of supporting a terminal device in implementing those functions, for example a chip system, and the apparatus may be installed in the terminal device. In the present disclosure, a chip system may consist of a chip, or may include a chip and other discrete devices. In the technical solutions provided in the present disclosure, the communication apparatus for implementing the functions of a terminal device being a terminal device or UE is used as an example for description.
(2) Access network device
The access network device may be a base station (BS); the access network device may also be referred to as a network device, an access node (AN), or a radio access node (RAN). The access network device can provide wireless access services for terminal devices. The access network device includes, for example, but is not limited to, at least one of the following: a base station, a next generation nodeB (gNB) in 5G, an access network device in an open radio access network (O-RAN), an evolved node B (eNB), a radio network controller (RNC), a node B (NB), a base station controller (BSC), a base transceiver station (BTS), a home base station (for example, home evolved nodeB, or home node B, HNB), a base band unit (BBU), a transmitting and receiving point (TRP), a transmitting point (TP), and/or a mobile switching center. Alternatively, the access network device may be a centralized unit (CU), a distributed unit (DU), a centralized unit control plane (CU-CP) node, or a centralized unit user plane (CU-UP) node. Alternatively, the access network device may be a relay station, an access point, a vehicle-mounted device, a wearable device, or an access network device in a future evolved public land mobile network (PLMN).
In the present disclosure, the communication apparatus for implementing the functions of an access network device may be an access network device, a network device having some functions of an access network device, or an apparatus capable of supporting an access network device in implementing those functions, for example a chip system, a hardware circuit, a software module, or a hardware circuit plus a software module, and the apparatus may be installed in the access network device. In the methods of the present disclosure, the communication apparatus for implementing the functions of an access network device being an access network device is used as an example for description.
(3) Protocol layer structure between an access network device and a terminal device
Communication between an access network device and a terminal device follows a certain protocol layer structure, which may include a control plane protocol layer structure and a user plane protocol layer structure. For example, the control plane protocol layer structure may include the functions of protocol layers such as the radio resource control (RRC) layer, the packet data convergence protocol (PDCP) layer, the radio link control (RLC) layer, the media access control (MAC) layer, and the physical layer. For example, the user plane protocol layer structure may include the functions of protocol layers such as the PDCP layer, the RLC layer, the MAC layer, and the physical layer; in a possible implementation, a service data adaptation protocol (SDAP) layer may further be included above the PDCP layer.
Taking data transmission between an access network device and a terminal device as an example, data transmission passes through the user plane protocol layers, for example the SDAP layer, the PDCP layer, the RLC layer, the MAC layer, and the physical layer. The SDAP layer, the PDCP layer, the RLC layer, the MAC layer, and the physical layer may also be collectively referred to as the access stratum. According to the transmission direction of the data, each of the above layers is divided into a sending part and a receiving part. Taking downlink data transmission as an example, after obtaining data from an upper layer, the PDCP layer transfers the data to the RLC layer and the MAC layer; the MAC layer generates transport blocks, which are then transmitted wirelessly through the physical layer. Data is correspondingly encapsulated in each layer. For example, data received by a layer from its upper layer is regarded as a service data unit (SDU) of that layer; after being encapsulated by that layer it becomes a protocol data unit (PDU), which is then transferred to the next layer.
Exemplarily, the terminal device may further have an application layer and a non-access stratum. The application layer may be used to provide services for applications installed in the terminal device; for example, downlink data received by the terminal device may be transmitted by the physical layer up to the application layer in sequence, and then provided by the application layer to the application; for another example, the application layer may obtain data generated by an application and transmit the data down to the physical layer in sequence, for sending to another communication apparatus. The non-access stratum may be used to forward user data, for example forwarding uplink data received from the application layer to the SDAP layer, or forwarding downlink data received from the SDAP layer to the application layer.
(4) Structure of an access network device
An access network device may include a central unit (CU) and a distributed unit (DU). A plurality of DUs may be centrally controlled by one CU. As an example, the interface between a CU and a DU may be referred to as the F1 interface, where the control plane (CP) interface may be F1-C and the user plane (UP) interface may be F1-U. The CU and the DU may be divided according to the protocol layers of the wireless network: for example, the functions of the PDCP layer and the protocol layers above it are set in the CU, and the functions of the protocol layers below the PDCP layer (such as the RLC layer and the MAC layer) are set in the DU; for another example, the functions of the protocol layers above the PDCP layer are set in the CU, and the functions of the PDCP layer and the protocol layers below it are set in the DU.
It can be understood that the above division of the processing functions of the CU and the DU according to protocol layers is only an example, and the division may also be performed in other manners. For example, the CU or the DU may be divided to have the functions of more protocol layers; for another example, the CU or the DU may be divided to have part of the processing functions of a protocol layer. In one design, part of the functions of the RLC layer and the functions of the protocol layers above the RLC layer are set in the CU, and the remaining functions of the RLC layer and the functions of the protocol layers below the RLC layer are set in the DU. In another design, the functions of the CU or the DU may also be divided according to service types or other system requirements; for example, divided by latency, functions whose processing time needs to meet a latency requirement are set in the DU, and functions that do not need to meet that latency requirement are set in the CU. In another design, the CU may also have one or more functions of the core network. Exemplarily, the CU may be set on the network side to facilitate centralized management. In another design, the radio unit (RU) of the DU is deployed remotely, where the RU has radio frequency functions.
Optionally, the DU and the RU may be divided at the physical layer (PHY). For example, the DU may implement the higher-layer functions of the PHY layer, and the RU may implement the lower-layer functions of the PHY layer. For sending, the functions of the PHY layer may include adding cyclic redundancy check (CRC) codes, channel coding, rate matching, scrambling, modulation, layer mapping, precoding, resource mapping, physical antenna mapping, and/or radio frequency sending functions. For receiving, the functions of the PHY layer may include CRC, channel decoding, de-rate matching, descrambling, demodulation, de-layer mapping, channel detection, resource demapping, physical antenna demapping, and/or radio frequency receiving functions. The higher-layer functions of the PHY layer may include part of the functions of the PHY layer, for example the part closer to the MAC layer, and the lower-layer functions of the PHY layer may include another part of the functions, for example the part closer to the radio frequency functions. For example, the higher-layer functions of the PHY layer may include adding CRC codes, channel coding, rate matching, scrambling, modulation, and layer mapping, and the lower-layer functions may include precoding, resource mapping, physical antenna mapping, and radio frequency sending; or, the higher-layer functions of the PHY layer may include adding CRC codes, channel coding, rate matching, scrambling, modulation, layer mapping, and precoding, and the lower-layer functions may include resource mapping, physical antenna mapping, and radio frequency sending.
Exemplarily, the functions of the CU may be implemented by one entity, or by different entities. For example, the functions of the CU may be further divided, that is, the control plane and the user plane are separated and implemented by different entities, namely a control plane CU entity (CU-CP entity) and a user plane CU entity (CU-UP entity). The CU-CP entity and the CU-UP entity may be coupled with the DU to jointly complete the functions of the access network device.
In the above architecture, signaling generated by the CU may be sent to the terminal device through the DU, or signaling generated by the terminal device may be sent to the CU through the DU. For example, signaling of the RRC or PDCP layer is finally processed into physical layer signaling and sent to the terminal device, or is converted from received physical layer signaling. Under this architecture, signaling of the RRC or PDCP layer can be regarded as being sent through the DU, or through the DU and the RU.
Optionally, any one of the above DU, CU, CU-CP, CU-UP and RU may be a software module, a hardware structure, or a software module plus a hardware structure, which is not limited. The forms of existence of different entities may be different, which is not limited. For example, the DU, CU, CU-CP and CU-UP are software modules, and the RU is a hardware structure. These modules and the methods performed by them are also within the protection scope of the present disclosure.
It should be understood that the numbers and types of the devices in the communication system shown in FIG. 1A are only illustrative, and the present disclosure is not limited thereto. In practical applications, the communication system may further include more terminal devices and more access network devices, and may further include other devices, for example core network devices and/or network elements for implementing artificial intelligence functions.
The method provided in the present disclosure may be used for communication between an access network device and a terminal device, and may also be used for communication between other communication devices, for example communication between a macro base station and a micro base station over a wireless backhaul link, or communication between two terminal devices over a sidelink (SL), which is not limited. The present disclosure takes communication between an access network device and a terminal device as an example for description.
The method provided in the present disclosure relates to artificial intelligence (AI). AI may be implemented through various possible technologies, for example through machine learning technologies. In the present disclosure, an AI function (such as an AI module or an AI entity) may be configured in an existing network element of the communication system to implement AI-related operations. For example, in the 5G new radio (NR) system, the existing network element may be an access network device (such as a gNB), a terminal device, a core network device, or the operation, administration and maintenance (OAM) system. For example, in the communication system shown in FIG. 1A, an AI function may be configured in at least one of the access network device 110, the terminal device 120, and the terminal device 130. Alternatively, in the present disclosure, an independent network element may be introduced into the communication system to perform AI-related operations; the independent network element may be referred to as an AI network element, an AI node, or the like, and the present disclosure does not limit this name. In this case, the network element performing AI-related operations is a network element with a built-in AI function (such as an AI module or an AI entity). AI-related operations may also be referred to as AI functions; see below for a detailed description of AI functions. The AI network element may be directly connected to the access network device in the communication system, or may be indirectly connected to the access network device through a third-party network element, where the third-party network element may be a core network element such as an authentication management function (AMF) network element or a user plane function (UPF) network element. Exemplarily, referring to FIG. 1B, an AI network element 140 is introduced into the communication system shown in FIG. 1A. The present disclosure takes an AI function built into an existing network element as an example for description.
For ease of understanding, some AI terms used in the present disclosure are introduced below with reference to A1 to A3. It can be understood that this introduction is not intended as a limitation on the present disclosure.
A1. AI model
An AI model is a concrete implementation of an AI function; the AI model characterizes the mapping relationship between the input and the output of the model. The AI model may be a neural network or another machine learning model. In the present disclosure, the AI function may include at least one of the following: data collection (collecting training data and/or inference data), model learning, model information release (configuring model information), inference, or inference result release. In the present disclosure, an AI model may be referred to as a model for short. In addition, model learning may also be understood as model training.
A2,神经网络
神经网络是机器学习技术和AI模型的一种具体实现形式。根据通用近似定理,神经网络在理论上可以逼近任意连续函数,从而使得神经网络具备学习任意映射的能力。传统的通信系统需要借助丰富的专家知识来设计通信模块,而基于神经网络的深度学习通信系统可以从大量的数据集中自动发现隐含的模式结构,建立数据之间的映射关系,获得优于传统建模方法的性能。
神经网络的思想来源于大脑组织的神经元结构。例如,每个神经元都对其输入值进行加权求和运算,通过一个激活函数输出运算结果。如图2A所示,为神经元结构的一种示意图。假设神经元的输入为x=[x_0,x_1,…,x_n],与各个输入对应的权值分别为w=[w_0,w_1,…,w_n],其中,w_i作为x_i的权值,用于对x_i进行加权。根据权值对输入值进行加权求和的偏置例如为b。激活函数的形式可以有多种,假设一个神经元的激活函数为:y=f(z)=max(0,z),则该神经元的输出为:y=f(∑_{i=0}^{n}w_i·x_i+b)=max(0,∑_{i=0}^{n}w_i·x_i+b)。再例如,一个神经元的激活函数为:y=f(z)=z,则该神经元的输出为:y=f(∑_{i=0}^{n}w_i·x_i+b)=∑_{i=0}^{n}w_i·x_i+b。
其中,b可以是小数、整数(例如0、正整数或负整数)、或复数等各种可能的取值。神经网络中不同神经元的激活函数可以相同或不同。
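上述神经元的加权求和与激活运算可以用如下代码示意(仅为帮助理解的极简示例,采用Python表示,其中的函数名与数值均为假设,并非本公开方案的实现):

```python
# 单个神经元:对输入加权求和并加偏置b,再经过激活函数得到输出
def relu(z):
    # 激活函数 y = f(z) = max(0, z)
    return max(0.0, z)

def neuron_output(x, w, b, activation=relu):
    # y = f(sum_i(w_i * x_i) + b)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(z)

x = [1.0, 2.0, -1.0]     # 输入 x = [x_0, x_1, x_2]
w = [0.5, -0.25, 1.0]    # 权值 w = [w_0, w_1, w_2]
b = 0.5                  # 偏置
y = neuron_output(x, w, b)                                  # ReLU激活
y_linear = neuron_output(x, w, b, activation=lambda z: z)   # 线性激活 y = z
```

可以看到,不同的激活函数下,同一组输入、权值和偏置会得到不同的神经元输出。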
神经网络一般包括多个层,每层可包括一个或多个神经元。通过增加神经网络的深度和/或宽度,能够提高该神经网络的表达能力,为复杂系统提供更强大的信息提取和抽象建模能力。其中,神经网络的深度可以是指神经网络包括的层数,其中每层包括的神经元个数可以称为该层的宽度。在一种实现方式中,神经网络包括输入层和输出层。神经网络的输入层将接收到的输入信息经过神经元处理,将处理结果传递给输出层,由输出层得到神经网络的输出结果。在另一种实现方式中,神经网络包括输入层、隐藏层和输出层,可参考图2B。神经网络的输入层将接收到的输入信息经过神经元处理,将处理结果传递给中间的隐藏层,隐藏层对接收的处理结果进行计算,得到计算结果,隐藏层将计算结果传递给输出层或者相邻的隐藏层,最终由输出层得到神经网络的输出结果。其中,一个神经网络可以包括一个隐藏层,或者包括多个依次连接的隐藏层,不予限制。
本公开涉及的神经网络例如深度神经网络(deep neural network,DNN),根据网络的构建方式,DNN可以包括前馈神经网络(feedforward neural network,FNN)、卷积神经网络(convolutional neural networks,CNN)和递归神经网络(recurrent neural network,RNN)。
另外,在神经网络的训练过程中,可以定义损失函数。损失函数描述了神经网络的输出值与理想目标值之间的差距或差异,本公开并不限制损失函数的具体形式。神经网络的训练过程就是通过调整神经网络的参数,使得损失函数的取值小于门限,或者使得损失函数的取值满足目标需求的过程。调整神经网络的参数,例如调整如下参数中的至少一种:神经网络的层数、宽度、神经元的权值、或、神经元的激活函数中的参数。
A3,训练数据
训练数据可以包括AI模型的输入,或者包括AI模型的输入和目标输出(标签),用于AI模型的训练。例如,训练数据包括多个训练样本,每个训练样本为神经网络的一次输入。训练数据也可以理解为训练样本的集合,或称为训练数据集。
训练数据集是机器学习重要的部分之一,模型的训练过程本质上就是从训练数据中学习它的某些特征,使得AI模型的输出尽可能接近目标输出,如AI模型的输出与目标输出之间的差异最小。其中,目标输出也可以被称为标签。
本公开提供的方法具体涉及到双边模型的训练及应用。其中,双边模型,或称为双端模型、协作模型。双边模型可以由两个或者多个子模型构成,两个或者多个子模型之间匹配使用,两个或者多个子模型可以分布在不同的网元。例如自编码器(auto-encoder,AE)是一种典型的双边模型,自编码器包括编码器和解码器,其中,编码器和解码器之间匹配使用,如编码器的输出可以用于确定解码器的输入。实际使用时,编码器和解码器分别部署在不同网元,如编码器部署在终端设备,解码器部署在接入网设备。
一种可能的实现中,由一个网元训练双边模型,然后再将训练好的两个子模型分别部署在两个网元上,完成双边模型训练的网元可以是两个部署子模型的网元之一,也可以是第三方网元。例如,在无线通信网络中,可以由接入网设备完成双边模型的训练,再将其中需要部署在终端设备上的子模型发送给终端设备。具体地,可参见图3,分为三个阶段,在模型训练阶段中,接入网设备独立的训练一个双边模型,该双边模型包括子模型1和子模型2,其中子模型1的输入类型为a,子模型2的输入类型与子模型1的输出类型相同,子模型2的输出类型为b。在子模型发送阶段中,接入网设备将子模型1发送给终端设备。在模型应用阶段(或称联合推断阶段)中,终端设备可以根据子模型1和类型a的数据,得到类型c的数据,即子模型1的输入类型为a,子模型1的输出类型为c;终端设备将类型c的数据发送给接入网设备;接入网设备可以根据子模型2和类型c的数据,得到类型b的数据,即子模型2的输入类型为c,子模型2的输出类型为b。
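上述模型应用阶段中子模型之间的类型匹配关系(输入类型a→中间类型c→输出类型b)可以用如下极简代码示意(其中子模型以简单函数代替真实的神经网络,函数与数值均为假设,仅用于说明两个子模型的输出与输入相互衔接):

```python
# 双边模型示意:子模型1的输出类型c,与子模型2的输入类型一致
def sub_model_1(data_a):
    # 输入类型a -> 输出类型c(此处用"压缩为均值"示意)
    return sum(data_a) / len(data_a)

def sub_model_2(data_c):
    # 输入类型c -> 输出类型b(此处用"还原为定长向量"示意)
    return [data_c] * 4

data_a = [1.0, 2.0, 3.0, 4.0]   # 类型a的数据
data_c = sub_model_1(data_a)    # 终端设备侧推理得到类型c的数据,经空口发送
data_b = sub_model_2(data_c)    # 接入网设备侧推理得到类型b的数据
```

实际使用时,子模型1和子模型2分别为部署在不同网元上的AI模型,二者匹配使用构成一个双边模型。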
在相关技术中,需要在空口传输AI模型,例如子模型。一方面,当AI模型比较大时,在空口传输子模型的开销较大。另一方面,由于AI模型的类型和格式种类很多,例如,从神经网络分类上,有FNN、CNN和RNN;从神经网络内部结构上,涉及每层神经元的个数,每层神经元之间的连接关系,层与层之间的连接关系,激活函数类型等。定义AI模型或者AI模型的解读格式需要比较大量的标准化工作。而且,由于终端设备的计算能力差异较大,可支持的AI模型规模也不尽相同。如果所有终端设备都从接入网设备下载AI模型,可能需要接入网设备针对各种计算能力的UE分别训练其对应的AI模型,这样对于接入网设备而言,所需的计算量和存储开销也比较大。此外,AI模型涉及相关算法设计,通常情况下,算法和AI模型属于比较隐私的内容,不同网元之间交互这类比较隐私的内容,较易导致算法泄露,不利于通信安全。
基于此,本公开提供一种通信方法,能够减少传输开销,并提升通信安全。本公开中,各网元可以独立训练子模型,各网元独立训练的子模型之间相互匹配,可以构成一个双边模型。以下结合附图对于本公开提供的通信方法进行详细说明。
本公开提供的通信方法可以应用于图1A或者图1B所示的通信系统,可以由通信系统中的一个网元训练双边模型,再将双边模型中子模型的输入数据和/或输出数据发送给通信系统中的其他网元,其他网元可以利用某一子模型的输入数据和/或输出数据,训练得到一个与该子模型功能相同的子模型。即实现各网元独立训练子模型,且各网元独立训练的子模型之间相互匹配,可以构成一个新的双边模型。
下面结合图4A所示的一种AI应用框架,对本公开涉及的双边模型训练以及子模型训练技术进行介绍。
在图4A中,数据源(data source)用于存储训练数据和推理数据。模型训练节点(model training host)通过对数据源提供的训练数据(training data)进行分析或训练,得到AI模型,且将AI模型部署在模型推理节点(model inference host)中。其中,AI模型表征了模型的输入和输出之间的映射关系。通过模型训练节点学习得到AI模型,相当于由模型训练节点利用训练数据学习得到模型的输入和输出之间的映射关系。模型推理节点使用AI模型,基于数据源提供的推理数据进行推理,得到推理结果。该方法还可以描述为:模型推理节点将推理数据输入到AI模型,通过AI模型得到输出,该输出即为推理结果。该推理结果可以指示:由执行对象使用(执行)的配置参数、和/或由执行对象执行的操作。推理结果可以由执行(actor)实体统一规划,并发送给一个或多个执行对象(例如,网元)去执行。
在本公开中,图4A所示的应用框架可以部署在图1A或者图1B中所示的网元,例如,图4A的应用框架可以部署在图1A的接入网设备,在接入网设备中,模型训练节点可对数据源提供的训练数据(training data)进行分析或训练,得到一个双边模型。模型推理节点可以使用该双边模型包括的子模型,根据子模型和数据源提供的推理数据进行推理,得到子模型的输出。即该子模型的输入包括推理数据,子模型的输出即为子模型所对应的推理结果。将终端设备视为图4A中的执行对象,接入网设备可以将子模型对应的推理数据和/或推理结果发送给终端设备,终端设备可以根据推理数据和/或推理结果独自训练一个对应的子模型。
下面结合图4B~图4E对本公开提供的通信方案能够应用的网络架构进行介绍。
如图4B所示,第一种可能的实现中,接入网设备中包括近实时接入网智能控制(RAN intelligent controller,RIC)模块,用于进行模型学习和/或推理。例如,近实时RIC可以从CU、DU和RU中的至少一个获得网络侧和/或终端侧的信息,该信息可以作为训练数据或者推理数据。可选的,近实时RIC可以将推理结果递交至CU、DU和RU中的至少一个。可选的,CU和DU之间可以交互推理结果。可选的,DU和RU之间可以交互推理结果,例如近实时RIC将推理结果递交至DU,由DU递交给RU。例如,近实时RIC可以用于训练双边模型,利用双边模型的子模型进行推理;或者,近实时RIC可以用于训练能够与其他网元上分布的子模型匹配使用的子模型。
如图4B所示,第二种可能的实现中,接入网设备除包括近实时RIC之外可以包括非实时RIC(可选的,非实时RIC可以位于OAM中或者核心网设备中),用于进行模型学习和推理。例如,非实时RIC可以从CU、DU和RU中的至少一个获得网络侧和/或终端侧的信息,该信息可以作为训练数据或者推理数据,该推理结果可以被递交至CU、DU和RU中的至少一个。可选的,CU和DU之间可以交互推理结果。可选的,DU和RU之间可以交互推理结果,例如非实时RIC将推理结果递交至DU,由DU递交给RU。例如,非实时RIC用于训练双边模型,利用双边模型的子模型进行推理;或者,非实时RIC可以用于训练能够与其他网元上分布的子模型匹配使用的子模型。
如图4B所示,第三种可能的实现中,接入网设备中包括近实时RIC,接入网设备之外包括非实时RIC(可选的,非实时RIC可以位于OAM中或者核心网设备中)。同上述第二种可能的实现,非实时RIC可以用于进行模型学习和/或推理;和/或,同上述第一种可能的实现,近实时RIC可以用于进行模型学习和/或推理;和/或,近实时RIC可以从非实时RIC获得AI模型信息,并从CU、DU和RU中的至少一个获得 网络侧和/或终端侧的信息,利用该信息和该AI模型信息得到推理结果,可选的,近实时RIC可以将推理结果递交至CU、DU和RU中的至少一个,可选的,CU和DU之间可以交互推理结果,可选的,DU和RU之间可以交互推理结果,例如近实时RIC将推理结果递交至DU,由DU递交给RU。例如,近实时RIC用于训练双边模型A,利用双边模型A的子模型进行推理。例如,非实时RIC用于训练双边模型B,利用双边模型B的子模型进行推理。例如,非实时RIC用于训练双边模型C,将双边模型C递交给近实时RIC,近实时RIC利用双边模型C的子模型进行推理。
图4C所示为本公开提供的方法能够应用的一种网络架构的示例图。相对图4B,图4B中将CU分离为了CU-CP和CU-UP。
图4D所示为本公开提供的方法能够应用的一种网络架构的示例图。如图4D所示,可选的,接入网设备中包括一个或多个AI实体,该AI实体的功能类似上述近实时RIC。可选的,OAM中包括一个或多个AI实体,该AI实体的功能类似上述非实时RIC。可选的,核心网设备中包括一个或多个AI实体,该AI实体的功能类似上述非实时RIC。当OAM和核心网设备中都包括AI实体时,他们各自的AI实体所训练得到的模型不同,和/或用于进行推理的模型不同。
本公开中,模型不同包括以下至少一项不同:模型的结构参数(例如模型的层数、和/或权值等)、模型的输入参数、或模型的输出参数。
图4E所示为本公开提供的方法能够应用的一种网络架构的示例图。相对图4D,图4E中的接入网设备分离为CU和DU。可选的,CU中可以包括AI实体,该AI实体的功能类似上述近实时RIC。可选的,DU中可以包括AI实体,该AI实体的功能类似上述近实时RIC。当CU和DU中都包括AI实体时,他们各自的AI实体所训练得到的模型不同,和/或用于进行推理的模型不同。可选的,还可以进一步将图4E中的CU拆分为CU-CP和CU-UP。可选的,CU-CP中可以部署有一个或多个AI模型。和/或,CU-UP中可以部署有一个或多个AI模型。可选的,图4D或图4E中,接入网设备的OAM和核心网设备的OAM可以分开独立部署。
本公开中,一个模型可以推理得到一个参数,或者推理得到多个参数。不同模型的学习过程可以部署在不同的设备或节点中,也可以部署在相同的设备或节点中。不同模型的推理过程可以部署在不同的设备或节点中,也可以部署在相同的设备或节点中。
下面结合方案一和方案二对本公开提供的通信方法进行详细说明。在这些方法中,所包括的步骤或操作仅是示例,本公开还可以执行其它操作或者各种操作的变形。此外,各个步骤可以按照本公开呈现的不同的顺序来执行,并且有可能并非要执行全部操作。
方案一
参见图5A,示意一种通信方法,该方法包括如下流程。
S501,第一网元确定第一子模型和第二子模型。
其中,第一子模型和第二子模型能够匹配使用。关于第一子模型和第二子模型匹配使用可参照如下两种可选的方式中的任意一个可选的实施方式理解。
一种可选的实施方式中,所述第一子模型的输出用于确定所述第二子模型的输入。或者说,第一子模型的输出数据可以用于生成第二子模型的输入数据。例如第二子模型的输入数据包括第一子模型输出数据,或者例如第一子模型的输出数据经由相应预处理后,可以得到第二子模型的输入数据。其中第一子模型的输出和第二子模型的输入可以符合如下至少一个特征:第一子模型的输出类型与第二子模型的输入类型相同;或者第一子模型的输出数据的维度与第二子模型的输入数据的维度相同;或者第一子模型的输出数据的格式与第二子模型的输入数据的格式相同。
另一种可选的实施方式中,所述第二子模型的输出用于确定所述第一子模型的输入,或者说,第二子模型的输出数据可以用于生成第一子模型的输入数据。例如第一子模型的输入包括第二子模型的输出,或者例如第二子模型的输出经由相应预处理后,可以得到第一子模型的输入。其中第二子模型的输出和第一子模型的输入可以符合如下至少一个特征:第二子模型的输出类型与第一子模型的输入类型相同;或者第二子模型的输出数据的维度与第一子模型的输入数据的维度相同;或者第二子模型的输出数据的格式与第一子模型的输入数据的格式相同。
具体地,第一子模型和第二子模型可以属于一个双边模型。第一网元可以根据需要部署双边模型的场景,获取相关的训练数据;第一网元根据训练数据训练第一子模型和第二子模型。其中,需要部署双边模型的场景可以包括如下至少一种:信道信息的反馈,如第二网元侧利用其中一个子模型压缩信道状态信息(channel state information,CSI),并将压缩后的CSI发送给第一网元,第一网元利用其中另一个子模型恢复CSI;双边的AI调制解调,例如第二网元侧利用其中一个子模型调制信号,第一网元利用其中另一个子模型解调信号;双边的AI波束预测,例如第二网元利用其中一个子模型生成一个或多个波束,并使用该一个或多个波束向第一网元发送参考信号,第一网元利用其中另一个子模型和接收到的一个或多个参考信号预测最优波束。
可选的,第一子模型和第二子模型可以应用于不同的网元。例如,第一子模型应用于第二网元,第二子模型应用于第一网元。其中,第一网元可以是接入网设备,第二网元可以是终端设备。或者,第一网元可以是终端设备,第二网元可以是接入网设备。另外可以理解的是,对于涉及终端设备之间需要部署双边模型的场景,本公开中的第一网元和第二网元可以均为终端设备;或者对于涉及接入网设备之间需要部署双边模型的场景,本公开中的第一网元和第二网元可以均为接入网设备。本公开对网元的类型并不进行限制。
进一步地,所述第一子模型可以用于在发送端(或称作为发送端的网元)发送信息,所述第二子模型在接收端(或称作为接收端的网元)用于接收所述信息;或者,所述第二子模型用于在发送端((或称作为发送端的网元)发送信息,所述第一子模型在接收端(或称作为接收端的网元)用于接收所述信息。该场景包括但不限于上述CSI压缩和恢复、或上述调制和解调。
S502,第一网元确定第一子模型的输入数据和输出数据。
具体地,第一网元可以根据S501中描述的训练数据,确定第一子模型的输入数据。例如,将S501中描述的训练数据作为第一子模型的输入数据,或者在S501中描述的训练数据基础上自行生成第一子模型的输入数据。另外,第一网元也可以根据第二网元发送的相关数据,确定该第一子模型的输入数据。或者,第一网元也可以通过相关测量获取该第一子模型的输入数据。
当所述第二子模型的输出用于确定所述第一子模型的输入时,第一网元也可以根据第二子模型的输出数据确定第一子模型的输入数据。
第一网元将前述确定的输入数据输入到第一子模型,通过第一子模型得到对应的输出数据。
另外可以理解,该S502中的第一子模型为训练好的模型,第一子模型的输入数据也可以称为第一子模型的推理数据,第一子模型的输出数据也可以称为第一子模型的推理结果。
S503,第一网元向第二网元发送第一信息。
一种可选的实施方式中,所述第一信息用于指示所述第一子模型的输入数据和/或所述第一子模型的输出数据。例如,若第一网元根据从第二网元中获取的相关数据,确定第一子模型的输入数据,第一网元向第二网元发送的第一信息可用于指示第一子模型的输出数据而不指示第一子模型的输入数据。该相关数据可以用于第二网元得到第一子模型的输入数据。例如,若第二网元侧无法自行确定第一子模型的输入数据和输出数据,第一网元向第二网元发送的第一信息可用于指示第一子模型的输入数据和第一子模型的输出数据。例如,若第二网元侧无法自行确定第一子模型的输入数据但可以确定输出数据,第一网元向第二网元发送的第一信息可用于指示第一子模型的输入数据而不指示第一子模型的输出数据。
另一种可选的实施方式中,第一信息可以用于指示第一子模型的输入数据和第一子模型的标签数据。其中,第一子模型的标签数据是第一子模型的输出数据的期望值或者目标值,可以理解为期望第一子模型输出的数据;第一子模型的标签数据也可以替换描述为第一子模型的标签样本或第一子模型的输出标签。当然可以理解,当第二网元侧可以自行确定第一子模型的输入数据和标签数据时,第一网元也可以不通过第一信息向第二网元指示第一子模型的输入数据和第一子模型的标签数据。
可选的,第一网元也可以向第二网元发送多组子模型对应的输入数据和/或输出数据,或者,第一网元也可以向第二网元发送多组子模型对应的输入数据和标签数据。
S504,第二网元根据第一信息训练第三子模型。
可以理解,第一信息用于第三子模型的训练,或者说第一信息用于第三子模型的确定。
在一种实现方式中,第三子模型和第一子模型可以符合如下至少一个特征:所述第三子模型的功能与所述第一子模型的功能相同;所述第三子模型的输入类型与所述第一子模型的输入类型相同,所述第三子模型的输出类型与所述第一子模型的输出类型相同;所述第三子模型的输入数据的维度与所述第一子模型的输入数据的维度相同,所述第三子模型的输出数据的维度与所述第一子模型的输出数据的维度相同;所述第三子模型的输入数据的格式与所述第一子模型的输入数据的格式相同,所述第三子模型的输出数据的格式与所述第一子模型的输出数据的格式相同;所述第三子模型的输入与所述第一子模型的输入相同时,所述第三子模型的输出与所述第一子模型的输出之间的差别小于第一阈值;和/或,所述第三子模型的输入与所述第一子模型的输入相同时,所述第三子模型的输出与所述第一子模型的输出标签之间的差别小于第二阈值。其中,差别可以通过NMSE、MSE或余弦相似度等参数进行体现。
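上述差别可以通过NMSE、MSE或余弦相似度等参数体现,一种可能的计算方式如下(仅为示意性实现,实际采用的度量及其实现不限于此):

```python
import math

def mse(a, b):
    # 均方误差:逐元素差的平方的均值
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def nmse(a, b):
    # 归一化均方误差:误差能量除以参考向量a的能量
    err = sum((x - y) ** 2 for x, y in zip(a, b))
    ref = sum(x ** 2 for x in a)
    return err / ref

def cosine_similarity(a, b):
    # 余弦相似度:内积除以两个向量模长之积
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

out_model_3 = [1.0, 2.0, 2.0]   # 假设的第三子模型输出
out_model_1 = [1.0, 2.0, 2.0]   # 假设的第一子模型输出(或输出标签)
```

当两个输出完全一致时,MSE与NMSE为0,余弦相似度为1,对应"差别小于阈值"的极端情况。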
第二网元训练的第三子模型可以替代第一网元此前训练的第一子模型,即第三子模型可以与第二子模型匹配使用,构成一个新的双边模型。可以理解的是,第三子模型和第一子模型的网络结构可以相同或者不相同;第三子模型和第一子模型所用的神经网络类型可以相同或者不相同。本公开对此不予限制。
具体地,第二网元可以根据第一信息确定第三子模型的训练数据,以及第三子模型的标签样本(以下简称标签)。第二网元可以根据第一子模型的输入数据确定第三子模型的训练数据,也即第三子模型的输入数据;第二网元可以将第一子模型的输出数据作为标签,或者,第二网元可以将第一子模型的标签数据作为第三子模型的标签。可以理解的是,前述第一子模型的输入数据与第一子模型的输出数据之间具有映射关系,如第一子模型的输入数据的数量为一个或多个,其中每个第一子模型的输入数据输入至第一子模型,均可得到对应的一个输出数据。第二网元根据第一信息确定的第三子模型的训练数据可以包括一个或多个,每个训练数据对应一个标签。如下以第三子模型的训练数据包括多个为例,对第二网元训练第三子模型的方式进行说明。
一种可选的实施方式中,第二网元将每个第一子模型的输入数据输入待训练的AI模型,第二网元可以将该第一子模型的输入数据所对应的输出数据或标签数据作为标签,训练得到第三子模型,损失函数可以表示第三子模型的输出和该输出对应的标签之间的差值,例如损失函数具体可以是第三子模型的输出和该输出对应的标签之间的NMSE或MSE或余弦相似度。其中,待训练的AI模型可以是如前述的DNN,例如FNN、CNN或者RNN等,或者也可以是其他AI模型,本公开对此不进行限制。另一种可选的实施方式中,第二网元将每个第一子模型的输入数据输入基础模型,第二网元可以将该第一子模型的输入数据所对应的输出数据或标签数据作为标签,第二网元可以根据该训练数据对基础模型进行更新,得到第三子模型。其中,基础模型可以是第二网元历史训练得到的模型,或者基础模型也可以是预先配置在第二网元中的。例如,针对需要部署双边模型的场景,可以预先在第二网元侧配置场景相关的基础模型。
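以第一子模型的输入数据作为训练数据、其对应的输出数据作为标签,通过最小化损失函数训练第三子模型的过程,可以用如下梯度下降代码示意(此处以一个一维线性模型代替真实的AI模型,"教师"映射、学习率与迭代次数均为假设):

```python
# 假设已通过第一信息获得第一子模型(教师)的输入-输出样本
teacher = lambda x: 2.0 * x + 1.0           # 假设的第一子模型映射关系
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]          # 第一子模型的输入数据
labels = [teacher(x) for x in inputs]       # 第一子模型的输出数据,作为标签

w, b, lr = 0.0, 0.0, 0.05                   # 第三子模型的待训练参数与学习率
for _ in range(500):                        # 以训练迭代次数作为结束条件之一
    # 损失函数为第三子模型输出与标签之间的MSE,按其梯度更新参数
    grad_w = sum(2 * (w * x + b - t) * x for x, t in zip(inputs, labels)) / len(inputs)
    grad_b = sum(2 * (w * x + b - t) for x, t in zip(inputs, labels)) / len(inputs)
    w -= lr * grad_w
    b -= lr * grad_b
```

训练完成后,第三子模型在相同输入下的输出与标签之间的差别(如MSE)应小于设定的门限,从而可以替代第一子模型与第二子模型匹配使用。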
另一种实现方式中,第三子模型和第二子模型可以符合如下至少一个特征:所述第三子模型的功能与所述第二子模型的功能相同;所述第三子模型的输入类型与所述第二子模型的输入类型相同,所述第三子模型的输出类型与所述第二子模型的输出类型相同;所述第三子模型的输入数据的维度与所述第二子模型的输入数据的维度相同,所述第三子模型的输出数据的维度与所述第二子模型的输出数据的维度相同;所述第三子模型的输入数据的格式与所述第二子模型的输入数据的格式相同,所述第三子模型的输出数据的格式与所述第二子模型的输出数据的格式相同;所述第三子模型的输入与所述第二子模型的输入相同时,所述第三子模型的输出与所述第二子模型的输出之间的差别小于第一阈值。其中,差别可以通过NMSE、MSE或余弦相似度等参数进行体现。
第二网元训练的第三子模型可以替代第一网元此前训练的第二子模型,即第三子模型可以与第一子模型匹配使用,构成一个新的双边模型。可以理解的是,第三子模型和第二子模型的网络结构可以相同或者不相同;第三子模型和第二子模型所用的神经网络类型可以相同或者不相同。本公开对此不予限制。
具体地,第二网元可以根据第一信息确定第三子模型的训练数据,以及第三子模型的标签样本(以下简称标签)。示例性的,此情况下,第一信息指示的是第一子模型的输入数据和输出数据;或第一信息指示第一子模型的输入数据或输出数据,但第二网元可以确定第一信息未指示的第一子模型的输出数据或输入数据。第二网元可以根据第一子模型的输出数据确定第三子模型的训练数据,也即第三子模型的输入数据;第二网元可以将第一子模型的输入数据作为第三子模型的标签。可以理解的是,前述第一子模型的输入数据与第一子模型的输出数据之间具有映射关系,如第一子模型的输入数据的数量为一个或多个,其中每个第一子模型的输入数据输入至第一子模型,均可得到对应的一个输出数据。第二网元根据第一信息确定的第三子模型的训练数据可以包括一个或多个,每个训练数据对应一个标签。如下以第三子模型的训练数据包括多个为例,对第二网元训练第三子模型的方式进行说明。
一种可选的实施方式中,第二网元将每个第一子模型的输出数据输入待训练的AI模型,第二网元可以将该第一子模型的输出数据所对应的输入数据作为标签,训练得到第三子模型,损失函数可以表示第三子模型的输出和该输出对应的标签之间的差值,例如损失函数具体可以是第三子模型的输出和该输出对应的标签之间的NMSE或MSE或余弦相似度。其中,待训练的AI模型可以是如前述的DNN,例如FNN、CNN或者RNN等,或者也可以是其他AI模型,本公开对此不进行限制。另一种可选的实施方式中,第二网元将每个第一子模型的输出数据输入基础模型,第二网元可以将该第一子模型的输出数据所对应的输入数据作为标签,第二网元可以根据该训练数据对基础模型进行更新,得到第三子模型。其中,基础模型可以是第二网元历史训练得到的模型,或者基础模型也可以是预先配置在第二网元中的。例如,针对需要部署双边模型的场景,可以预先在第二网元侧配置场景相关的基础模型。
可选的,第二网元可以确定训练第三子模型阶段中的相关参数。例如,可以采用预定的方式在第二网元侧定义参数,或者是由第一网元将相关参数指示给第二网元。其中,参数包括训练结束条件,第二网元可以根据训练结束条件进行第三子模型的训练。训练结束条件可以包括以下至少一个:训练时长、训练迭代次数、或者第三子模型所需满足的性能门限。其中,性能门限可以是训练、测试或者验证的损失函数的收敛门限,或者性能门限也可以是其他的阈值,例如针对第三子模型的输出数据与标签之间的差异所设定的阈值,具体地,第三子模型的输出数据与标签之间的差异可以通过均方误差(mean square error,MSE)、归一化均方误差(normalization mean square error,NMSE)、交叉熵等表示。参数还可以包括第三子模型的结构、第三子模型的参数、训练第三子模型的损失函数等。
可选的,若第一网元发送了多组子模型的输入数据和/或输出数据,第二网元可以根据每组子模型的输入数据和/或输出数据训练对应的子模型,从而获取多个子模型。第二网元训练其中每组子模型可以参照前述训练第三子模型的方式实施,本公开对此不再进行赘述。在构建新的双边模型并利用该新的双边模型进行推断(或称推理)时,第一网元还可以向第二网元指示第二网元使用其中的一个或多个子模型。
本公开中,由第一网元训练一个双边模型,将该双边模型中子模型相关的输入/输出指示给第二网元,实现第二网元侧独立训练功能相同的子模型,与第一网元上的其他子模型匹配使用,能够满足双边模型的应用需求,无需在空口传输子模型,能够减少传输开销,提升通信安全。
需要说明的是,虽然图5A作为示例,仅示意出了两个网元之间的交互,但并不表示本公开限制于两个网元,也不表示本公开限制双边模型仅包括两个匹配使用的子模型。在本公开中可以参照图5A描述的方法,实现在两个以上的网元中分布子模型以及两个以上的网元中分布子模型匹配使用。结合图5A所述的方法,本公开还提供一种双边模型的部署流程示意图。在图5B中示意出四个阶段,在模型训练阶段1中,第一网元训练一个双边模型,该双边模型包括子模型1和子模型2,其中子模型1的输入类型为a,子模型1的输出类型为c;子模型2的输入类型与子模型1的输出类型相同,子模型2的输出类型为b。在子模型数据发送阶段中,第一网元将子模型1的输入数据(记为a1)和/或输出数据(记为c1)发送给第二网元。在模型训练阶段2中,第二网元可以根据收到的a1和/或c1,训练子模型3。例如,以a1作为子模型3的输入,以c1作为子模型3的输出,训练得到该子模型3与子模型1的功能相同。又如,以a1作为子模型3的输出,以c1作为子模型3的输入,训练得到该子模型3与子模型2的功能相同。
以训练得到的子模型3与子模型1的功能相同为例,在模型应用阶段(或称联合推断阶段)中,第二网元可以根据子模型3和类型a的数据,得到类型c的数据,即子模型3的输入类型为a,子模型3的输出类型为c;第二网元将类型c的数据发送给第一网元;第一网元可以根据子模型2和类型c的数据,得到类型b的数据,即子模型2的输入类型为c,子模型2的输出类型为b。
结合图5A所述的方法,以需要部署双边模型的应用场景是信道信息反馈为例,本公开提供一种通信方法,参见图6A,该方法包括如下流程。
S601,接入网设备获取训练数据。
其中,该训练数据包括N个信道信息。其中,N为正整数,即N为大于或者等于1的整数。该训练数据用于确定双边模型,该双边模型包括第一子模型和第二子模型。
关于信道信息的定义可参照如下方式B1或者方式B2理解。
在方式B1中,信道信息包括下行信道特征。在TDD系统中,接入网设备可以利用信道的上下行互易性,根据上行信道获取下行信道特征。或者,在FDD系统中,接入网设备可以通过一些信号处理的方式,根据上行信道获取下行信道特征;或者,终端设备也可以向接入网设备上报CSI,该CSI包括预编码矩阵索引(precoding matrix index,PMI),PMI用于表示下行信道特征;则接入网设备也可以通过收集终端设备上报的PMI,获取下行信道特征。本公开对于接入网设备获取下行信道特征的方式不予限制。
可选的,下行信道特征可以指的是下行信道的特征向量或特征矩阵,该特征向量或特征矩阵可以由终端设备对下行信道进行奇异值分解(singular value decomposition,SVD)得到,或者该特征向量或特征矩阵也可以由终端设备根据下行信道的协方差矩阵进行特征值分解(eigen value decomposition,EVD)得到。此外,下行信道特征还可以指的是预编码矩阵索引(precoding matrix index,PMI),该PMI可以由终端设备根据预定义的码本,对下行信道、下行信道的特征向量或下行信道的特征矩阵进行处理得到。
在方式B2中,信道信息包括下行信道,即全信道信息。在TDD系统中,接入网设备可以利用信道的上下行互易性,根据上行信道获取下行信道。或者,在FDD系统中,接入网设备可以通过一些信号处理的方式,根据上行信道获取下行信道;或者,终端设备也可以向接入网设备上报下行信道的相关信息,则接入网设备也可以根据下行信道的相关信息,获取下行信道。本公开对于接入网设备获取下行信道的方式不予限制。
S602,接入网设备根据N个信道信息确定第一子模型和第二子模型。
其中,第一子模型和第二子模型构成一个用于信道信息反馈的双边模型,记为第一双边模型。具体地,接入网设备可以将获取到的N个信道信息,划分为一个或多个训练集。接入网设备可以利用该一个或多个训练集中的部分或全部,训练同一个模型。例如接入网设备可以使用一个训练集训练同一个双边模型。或者,例如接入网设备可以使用多个训练集训练同一个双边模型。
第一双边模型的输入包括信道信息,第一双边模型的输出包括恢复的信道信息。其中,信道信息可以是下行信道特征或者下行信道。有关信道信息的定义可参照S601理解,对此不再进行赘述。训练第一双边模型可以理解为尽可能最小化输入的信道信息和输出的信道信息之间的差异,第一双边模型对应的损失函数可以表现为输入的信道信息和输出的信道信息之间的MSE、输入的信道信息和输出的信道信息之间的交叉熵(cross-entropy),或者输入的信道信息和输出的信道信息之间的余弦相似度(cosine similarity)等。
可选的,在第一双边模型中,第一子模型的输入类型可以与第一双边模型的输入类型一致,即第一子模型的输入类型为信道信息,或者第一双边模型的输入即为第一子模型的输入。第一子模型的输出类型为特征比特,特征比特包括一个或多个二进制比特。可以理解,特征比特是信道信息的低维表达,第一子模型用于对信道信息进行压缩和/或量化得到特征比特。第二子模型的输入由第一子模型的输出确定,例如,第二子模型的输入类型与第一子模型的输出类型一致,均为特征比特;或者,第二子模型的输入数据的维度和第一子模型的输出数据的维度相同;或者,第二子模型的输入数据包括第一子模型的输出数据;或者,可以将第一子模型的输出数据进行预处理后输入至第二子模型,即第二子模型的输入数据包括预处理后的第一子模型的输出数据。第二子模型的输出为恢复的信道信息。示例性的,第一双边模型可以是自编码器,其中第一子模型为编码器,第二子模型为解码器。
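第一子模型对信道信息进行压缩和/或量化得到特征比特的过程,可以用如下示意性代码理解(此处用简单的分段均值降维与符号量化代替真实的编码器,维度与数值均为假设,并非本公开方案的实现):

```python
def compress_to_feature_bits(channel_info, num_bits):
    # 压缩:将高维信道信息分段求均值,降到 num_bits 维(低维表达)
    step = len(channel_info) // num_bits
    reduced = [sum(channel_info[i * step:(i + 1) * step]) / step for i in range(num_bits)]
    # 量化:每个分量量化为1个二进制比特(非负取1,负取0)
    return [1 if v >= 0 else 0 for v in reduced]

channel_info = [0.8, 0.6, -0.3, -0.5, 0.2, 0.4, -0.9, -0.7]   # 假设的8维信道信息
feature_bits = compress_to_feature_bits(channel_info, num_bits=4)  # 特征比特维度为4
```

可以看到,特征比特的维度(比特数)决定了反馈开销与反馈精度之间的折中。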
接入网设备可以根据实际需求预先设置特征比特的维度,特征比特的维度也可以称为特征比特包括的比特数。例如,出于反馈开销的考虑,接入网设备可以将特征比特的维度减小,以减少反馈开销。具体地,接入网设备可以设定特征比特的维度小于第一维度阈值。例如,出于反馈精度的考虑,接入网设备可以将特征比特的维度增大,以提高反馈精度。具体地,接入网设备可以设定特征比特的维度大于第一维度阈值。
S603,接入网设备向终端设备发送第一子模型的输入数据和/或输出数据。
关于第一子模型的输入数据和输出数据的确定方式可参照S502实施,本公开对此不再进行赘述。具体地,所述第一子模型的输入数据包括M个信道信息,M为正整数。所述第一子模型的输出数据包括与M个信道信息对应的特征比特,M为正整数。
可选的,在终端设备可以自行确定第一子模型的输入数据的情况下,接入网设备可以只向终端设备发送第一子模型的输出数据。例如,第一子模型的输入数据是根据终端设备上报的信道信息确定的,接入网设备可以只向终端设备发送第一子模型的输出数据。
下面以终端设备向接入网设备上报PMI,实现信道信息反馈为例,结合实施方式C1或实施方式C2,对接入网设备可以仅发送第一子模型的输出数据的情况进行说明。
在实施方式C1中,终端设备向接入网设备上报PMI,接入网设备根据终端设备上报的PMI,获取对应的下行信道特征Ŵ。接入网设备将Ŵ作为第一子模型的输入,获取对应的输出,记为特征比特B。接入网设备可只将特征比特B发送给终端设备。具体地,接入网设备可以在每次接收到终端设备上报的PMI,生成该PMI对应的特征比特,并向该终端设备发送该PMI对应的特征比特。可选的,可以设定终端设备将在上报一个PMI之后的T1个时间单元内收到的特征比特,为该PMI对应的特征比特。其中,时间单元可以是时隙或者符号等。T1的值可以根据实际需求设定,例如1个时隙,本公开对此不进行限定。
在实施方式C2中,接入网设备在收到终端设备上报的多个PMI后,获取该多个PMI对应的下行信道特征Ŵ。接入网设备按照每次输入一个Ŵ的方式,依次将多个Ŵ输入至第一子模型,输出得到对应的多个特征比特。接入网设备可以将该多个特征比特发送给终端设备。
其中,多个PMI与多个特征比特之间具备映射关系,例如一一对应。一种可选的实施方式中,多个特征比特和多个PMI之间的映射关系可以是预定义的,例如接入网设备每收到M个PMI后,将对应的M个特征比特按特定顺序排布在一个消息中,终端设备可以根据该特定顺序将M个特征比特与M个PMI进行关联。其中,特定顺序可以为接入网设备接收PMI的先后顺序。可选的,前述包括M个特征比特的消息可以采用特定消息格式,特定的消息格式可以根据实际需求设定,本公开不予限制。另一种可选的实施方式中,多个特征比特和多个PMI之间的映射关系可以是由接入网设备配置给终端设备。
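按特定顺序(如接收PMI的先后顺序)将一条消息中的多个特征比特与多个PMI关联的做法,可示意如下(PMI标识、消息结构与数值均为假设):

```python
# 接入网设备按接收PMI的先后顺序排布特征比特;终端设备按同一顺序进行关联
pmis_in_report_order = ["PMI_0", "PMI_1", "PMI_2"]    # 终端设备先后上报的PMI(假设的标识)
feature_bits_in_message = [[1, 0], [0, 0], [1, 1]]    # 消息中按同一顺序排布的特征比特
mapping = dict(zip(pmis_in_report_order, feature_bits_in_message))  # 一一对应
```

按此约定,终端设备无需额外信令即可确定每个PMI对应的特征比特。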
下面对于接入网设备向终端设备发送第一子模型的输入数据和输出数据的情况进行说明。接入网设备可以自行生成符合要求的一个或多个信道信息,将该一个或多个信道信息输入第一子模型,输出得到一个或多个对应的特征比特。接入网设备将该一个或者多个信道信息以及该一个或者多个特征比特发送给终端设备。在这个方式中,接入网设备可以不获取终端设备上报的PMI或者无需等待终端设备上报的PMI。
此外,接入网设备除训练S602描述的第一双边模型,还可能涉及其他双边模型的训练。则可选的,当接入网设备训练多个双边模型时,接入网设备可以在每个双边模型中获取需要应用于在终端设备的子模型,将确定出的多个子模型的输入数据和/或输出数据发送给终端设备。
S604,终端设备根据第一子模型的输入数据和/或输出数据,训练第三子模型。
该步骤可参照S504的描述实施。终端设备可以根据获取的第一子模型的输入数据和/或输出数据,确定第三子模型的训练数据,以及第三子模型的标签样本(以下简称标签)。以下分情况进行举例说明。
情况1,对应S601描述的方式B1,接入网设备获取的信道信息包括下行信道特征。该下行信道特征具体可以是PMI、特征向量或特征矩阵等。
示例性的,终端设备上报PMI,接入网设备向终端设备发送的可以是第一子模型的输出数据,即特征比特。终端设备根据自己上报的PMI和接入网设备发送的特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括PMI。第三子模型的输入包括PMI,输出包括特征比特。
示例性的,终端设备上报PMI,接入网设备向终端设备发送的可以是第一子模型的输出数据,即特征比特。终端设备根据自己上报的PMI,确定用于生成该PMI的特征向量或特征矩阵W。终端设备根据特征向量或特征矩阵W和接入网设备发送的特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括特征向量或特征矩阵W。第三子模型的输入包括特征向量或特征矩阵W,输出包括特征比特。
示例性的,终端设备上报PMI,接入网设备向终端设备发送的可以是第一子模型的输出数据,即特征比特。终端设备根据自己上报的PMI,采用与接入网设备相同的方法,基于PMI恢复出下行信道特征Ŵ。终端设备根据下行信道特征Ŵ和接入网设备发送的特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括下行信道特征(如特征向量或特征矩阵)。第三子模型的输入包括下行信道特征Ŵ,输出包括特征比特。
示例性的,接入网设备向终端设备发送的是第一子模型的输入数据和输出数据,即下行信道特征(如PMI、特征向量或特征矩阵)和特征比特。终端设备根据接入网设备发送的下行信道特征和特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括下行信道特征。第三子模型的输入包括下行信道特征,输出包括特征比特。
情况2,对应S601描述的方式B2,第三子模型的训练数据可以是下行信道。
示例性的,终端设备上报下行信道的相关信息,接入网设备向终端设备发送的可以是第一子模型的输出数据,即特征比特。终端设备根据自己上报的下行信道的相关信息,确定用于生成该下行信道的相关信息的下行信道。终端设备根据下行信道和接入网设备发送的特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括下行信道。第三子模型的输入包括下行信道,输出包括特征比特。
示例性的,终端设备上报下行信道的相关信息,接入网设备向终端设备发送的可以是第一子模型的输出数据,即特征比特。终端设备根据自己上报的下行信道的相关信息,采用与接入网设备相同的方法,基于下行信道的相关信息,恢复出下行信道。终端设备根据恢复出的下行信道和接入网设备发送的特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括恢复出的下行信道。第三子模型的输入包括恢复出的下行信道,输出包括特征比特。
示例性的,接入网设备向终端设备发送的是第一子模型的输入数据和输出数据,即下行信道和特征比特。终端设备根据接入网设备发送的下行信道和特征比特训练第三子模型。例如,终端设备可以将接入网设备发送的特征比特作为标签,用于训练第三子模型的训练数据可以包括下行信道。第三子模型的输入包括下行信道,输出包括特征比特。
S605,终端设备向接入网设备发送训练完成通知,该训练完成通知用于指示完成第三子模型的训练。
具体地,终端设备可以在经过S604完成第三子模型训练后,执行S605。接入网设备获取到该训练完成通知,即可获知终端设备侧可以使用第三子模型与接入网设备侧的第二子模型进行联合推断,或者可以理解,第三子模型和第二子模型能够构成一个新的用于信道信息的反馈的双边模型,记为第二双边模型。
此外,终端设备还可以将训练好的第三子模型的性能通知给接入网设备,例如终端设备在前述训练完成通知中包括性能信息。例如,该性能信息可以包括训练好的第三子模型所满足的性能参数:训练/测试/验证的损失函数门限,或者MSE、NMSE、交叉熵等其他性能。又如,该性能信息仅用于指示第三子模型是否达到性能要求,而不具体指示性能参数。
进一步可选的,以下通过S606~S608示意出第三子模型与第二子模型匹配使用的一种示例。该S606~S608可以不执行,或者该S606~S608也可以替换为第三子模型与第二子模型匹配使用的其它示例,本公开不予限制。
S606,终端设备根据第三信道信息和第三子模型,确定第一特征比特;其中,所述第三子模型的输入包括所述第三信道信息,所述第三子模型的输出包括所述第一特征比特;
S608,接入网设备根据第二子模型和第一特征比特,得到第一信道信息;其中,所述第二子模型的输入包括所述第一特征比特,所述第二子模型的输出包括第一信道信息。
可选的,终端设备训练的第三子模型与第一子模型的性能差异越小,第一信道信息和第三信道信息之间的差异越小,能够使得该第一信道信息尽可能接近于S606中第三信道信息。
本公开中,由接入网设备训练用于信道信息反馈的双边模型,将该双边模型中子模型相关的输入/输出指示给终端设备。终端设备可以独立训练功能相同的子模型,与接入网设备上的其他子模型匹配使用,能够满足双边模型的应用需求,无需在空口传输模型,能够减少传输开销,提升通信安全。
结合图6A所述的方法,本公开还提供一种双边模型的部署流程示意图。在图6B中示意出四个阶段,在模型训练阶段1中,接入网设备训练一个双边模型,该双边模型包括子模型1和子模型2,其中子模型1的输入类型为下行信道特征如特征向量或特征矩阵W,子模型1的输出类型为特征比特;子模型2的输入类型与子模型1的输出类型相同,子模型2的输出类型为恢复的下行信道特征Ŵ。在子模型数据发送阶段中,接入网设备将子模型1的输入数据(记为W1)和/或输出数据(记为特征比特B)发送给终端设备。在模型训练阶段2中,终端设备可以根据收到的W1和/或特征比特B,训练子模型3。该子模型3与子模型1的功能相同。在模型应用阶段(或称联合推断阶段)中,终端设备可以根据子模型3和类型W的数据,得到类型为特征比特的数据,即子模型3的输入类型为W,子模型3的输出类型为特征比特;终端设备将类型为特征比特的数据发送给接入网设备;接入网设备可以根据子模型2和类型为特征比特的数据,得到类型为Ŵ的数据,即子模型2的输入类型为特征比特,子模型2的输出类型为Ŵ。
可以理解,图6B中的子模型1可以是图6A中的第一子模型,图6B中的子模型2可以是图6A中的第二子模型,图6B中的子模型3可以是图6A中的第三子模型。
结合图5A所述的方法,以需要部署双边模型的应用场景是信道信息反馈为例,本公开提供另一种通信方法,参见图7A,该方法包括如下流程。
S701,终端设备获取训练数据。
其中,该训练数据包括N个信道信息。其中,N为正整数,即N为大于或者等于1的整数。该训练数据用于确定双边模型,该双边模型包括第一子模型和第二子模型。
对应S601中的相关描述,信道信息具体可以包括下行信道和下行信道特征。对于终端设备来说,终端设备可以根据下行参考信号的测量,确定信道信息。
S702,终端设备根据N个信道信息确定第一子模型和第二子模型。
其中,第一子模型和第二子模型构成一个用于信道信息反馈的双边模型,记为第三双边模型。具体地,终端设备可以将获取到的N个信道信息,划分为一个或多个训练集。终端设备可以利用该一个或多个训练集中的部分或全部,训练同一个模型。例如终端设备可以使用一个训练集训练同一个双边模型。或者,例如终端设备可以使用多个训练集训练同一个双边模型。
以信道信息为下行信道特征为例。如果终端设备获取的下行信道特征是下行信道特征矩阵或者向量W,终端设备可以根据获取到的下行信道特征训练双边模型;或者,终端设备也可以先将下行信道特征矩阵或者向量W转化为PMI,根据PMI恢复下行信道特征矩阵或者向量Ŵ,然后再使用恢复出的下行信道特征矩阵或者向量Ŵ训练双边模型。
第三双边模型的输入包括信道信息,第三双边模型的输出包括恢复的信道信息。其中,信道信息可以是下行信道特征或者下行信道。有关信道信息的定义可参照S601理解,对此不再进行赘述。训练第三双边模型可以理解为尽可能最小化输入的信道信息和输出的信道信息之间的差异,第三双边模型对应的损失函数可以表现为输入的信道信息和输出的信道信息之间的MSE、输入的信道信息和输出的信道信息之间的交叉熵(cross-entropy),或者输入的信道信息和输出的信道信息之间的余弦相似度(cosine similarity)等。训练第三双边模型还可以是尽可能最小化输出的信道信息与标签信道信息之间的差异,第三双边模型对应的损失函数可以表现为输出的信道信息与标签信道信息之间的MSE、输出的信道信息与标签信道信息之间的交叉熵(cross-entropy),或者输出的信道信息与标签信道信息之间的余弦相似度(cosine similarity)等。
可选的,在第三双边模型中,第二子模型的输入类型可以与第三双边模型的输入类型一致,即第二子模型的输入类型为信道信息,或者第三双边模型的输入即为第二子模型的输入。第二子模型的输出类型为特征比特,特征比特包括一个或多个二进制比特。可以理解,特征比特是信道信息的低维表达,第二子模型用于对信道信息进行压缩和/或量化得到特征比特。第一子模型的输入由第二子模型的输出确定,例如,第一子模型的输入类型与第二子模型的输出类型一致,均为特征比特;或者,第一子模型的输入数据的维度和第二子模型的输出数据的维度相同;或者,第一子模型的输入数据包括第二子模型的输出数据;或者,可以将第二子模型的输出数据进行预处理后输入至第一子模型,即第一子模型的输入数据包括预处理后的第二子模型的输出数据。第一子模型的输出为恢复的信道信息。示例性的,第三双边模型可以是自编码器,其中第二子模型为编码器,第一子模型为解码器。
终端设备可以根据实际需求预先设置特征比特的维度,特征比特的维度也可以称为特征比特包括的比特数。例如,出于反馈开销的考虑,终端设备可以将特征比特的维度减小,以减少反馈开销。具体地,终端设备可以设定特征比特的维度小于第一维度阈值。例如,出于反馈精度的考虑,终端设备可以将特征比特的维度增大,以提高反馈精度。具体地,终端设备可以设定特征比特的维度大于第一维度阈值。
S703,终端设备向接入网设备发送第一子模型的输入数据和输出数据。
具体地,终端设备可以利用第一子模型,生成第一子模型的输入数据和输出数据;或者,终端设备可以利用第二子模型,生成第一子模型的输入数据,利用第一子模型生成第一子模型的输出数据。例如,所述第一子模型的输入数据包括M个特征比特,所述第一子模型的输出数据包括与M个特征比特对应的信道信息,M为正整数。
以信道信息为下行信道特征为例,终端设备可以将M个下行信道特征和M个特征比特发送给接入网设备。或者,终端设备也可以将M个下行信道特征转化为M个PMI,然后将M个特征比特和M个PMI发送给接入网设备。
此外可选的,S703也可以替换为:终端设备向接入网设备发送第一子模型的输入数据和标签数据。其中,终端设备可以利用第二子模型,生成第一子模型的输入数据,终端设备可以将第一子模型的输入数据和对应的标签数据发送给接入网设备。可以理解,图7A描述的方案中,第一子模型的标签数据为第三双边模型的标签数据。具体地,第一子模型的输入数据对应的标签数据为第一子模型的输入数据对应的第三双边模型的输入数据对应的标签数据。
S704,接入网设备根据第一子模型的输入数据和输出数据,训练第三子模型。
接入网设备可以直接根据终端设备发送的M个特征比特以及M个特征比特对应的信道信息,训练第三子模型。或者,若终端设备发送的信道信息具体是PMI,则接入网设备也可以先将M个PMI恢复成M个下行信道特征向量或矩阵,然后使用M个特征比特和M个下行信道特征向量或矩阵训练第三子模型。
该步骤可参照S504的描述实施。本公开对此不再进行赘述。
此外可选的,对应S703,也可以理解该步骤也可以替换为:接入网设备根据第一子模型的输入数据和标签数据训练第三子模型。
S705,接入网设备向终端设备发送训练完成通知,该训练完成通知用于指示完成第三子模型的训练。
具体地,接入网设备可以在经过S704完成第三子模型训练后,执行S705。终端设备获取到该训练完成通知,即可获知接入网设备侧可以使用第三子模型与终端设备侧的第二子模型进行联合推断,或者可以理解,第三子模型和第二子模型能够构成一个新的用于信道信息反馈的双边模型,记为第四双边模型。
此外,接入网设备还可以将训练好的第三子模型的性能通知给终端设备,例如接入网设备在前述训练完成通知中包括性能信息。例如,该性能信息可以包括训练好的第三子模型所满足的性能参数:训练/测试/验证的损失函数门限,或者MSE、NMSE、交叉熵等其他性能。又如,该性能信息仅用于指示第三子模型是否达到性能要求,而不具体指示性能参数。
进一步可选的,以下通过S706~S708示意出第三子模型与第二子模型匹配使用的一种示例。该S706~S708可以不执行,或者该S706~S708也可以替换为第三子模型与第二子模型匹配使用的其它示例,本公开不予限制。
S706,终端设备根据第二信道信息和第二子模型,确定第二特征比特;其中,所述第二子模型的输入包括所述第二信道信息,所述第二子模型的输出包括所述第二特征比特;
S707,终端设备向接入网设备发送用于指示所述第二特征比特的信息。
S708,接入网设备根据第三子模型和第二特征比特,得到第四信道信息;其中,所述第三子模型的输入包括所述第二特征比特,所述第三子模型的输出包括第四信道信息。
可选的,接入网设备训练的第三子模型与第一子模型的性能差异越小,第四信道信息和第二信道信息之间的差异越小,能够使得该第四信道信息尽可能接近于S706中第二信道信息。
本公开中,由终端设备训练用于信道信息反馈的双边模型,将该双边模型中子模型相关的输入和输出指示给接入网设备。接入网设备可以独立训练功能相同的子模型,与终端设备上的其他子模型匹配使用,能够满足双边模型的应用需求,无需在空口传输模型,能够减少传输开销,提升通信安全。
结合图7A所述的方法,本公开还提供一种双边模型的部署流程示意图。在图7B中示意出四个阶段,在模型训练阶段1中,终端设备训练一个双边模型,该双边模型包括子模型1和子模型2,其中子模型1的输入类型为下行信道特征如特征向量或特征矩阵W,子模型1的输出类型为特征比特;子模型2的输入类型与子模型1的输出类型相同,子模型2的输出类型为恢复的下行信道特征Ŵ。在子模型数据发送阶段中,终端设备将子模型2的输入数据(记为特征比特B)和输出数据(记为Ŵ)发送给接入网设备。在模型训练阶段2中,接入网设备可以根据收到的特征比特B和Ŵ训练子模型4。该子模型4与子模型2的功能相同。在模型应用阶段(或称联合推断阶段)中,终端设备可以根据子模型1和类型W的数据,得到类型为特征比特的数据,即子模型1的输入类型为W,子模型1的输出类型为特征比特;终端设备将类型为特征比特的数据发送给接入网设备;接入网设备可以根据子模型4和类型为特征比特的数据,得到类型为Ŵ的数据,即子模型4的输入类型为特征比特,子模型4的输出类型为Ŵ。
可以理解,图7B中的子模型1可以是图7A中的第二子模型,图7B中的子模型2可以是图7A中的第一子模型,图7B中的子模型4可以是图7A中的第三子模型。
方案二
参见图8,示意一种通信方法,该方法包括如下流程。
S801,第三方网元确定第一子模型和第二子模型。
具体地,可参照S501实施,本公开不再进行赘述。
本方案如下以第一子模型应用于第二网元,第二子模型应用于第一网元为例说明。其中,第一网元可以是接入网设备,第二网元可以是终端设备。或者,第一网元可以是终端设备,第二网元可以是接入网设备。另外可选的,第三方网元可以是独立AI网元。
S802a,第三方网元确定第一子模型的输入数据和输出数据,然后执行S803a。
或者可选的,该步骤S802a也可以替换为:第三方网元确定第一子模型的输入数据和标签数据。
具体地,可参照S502实施,本公开不再进行赘述。
S802b,第三方网元确定第二子模型的输入数据和输出数据,然后执行S803b。
或者可选的,该步骤S802b也可以替换为:第三方网元确定第二子模型的输入数据和标签数据。
具体地,可参照S502实施,本公开不再进行赘述。
S803a,第三方网元向第二网元发送第一子模型的输入数据和/或输出数据,然后执行S804a。
或者可选的,对应于S802a,该步骤S803a也可以替换为:第三方网元向第二网元发送第一子模型的输入数据和标签数据。
具体地,可参照S503实施,例如第三方网元向第二网元发送第一信息;其中,当第三方网元确定第一子模型的输入数据和/或输出数据时,第一信息包括第一子模型的输入数据和/或输出数据;当第三方网元确定第一子模型的输入数据和标签数据时,第一信息包括第一子模型的输入数据和/或标签数据。
S803b,第三方网元向第一网元发送第二子模型的输入数据和/或输出数据,然后执行S804b。
或者可选的,对应于S802b,该步骤S803b也可以替换为:第三方网元向第一网元发送第二子模型的输入数据和标签数据。
具体地,可参照S503实施,例如第三方网元向第一网元发送第二信息;其中,当第三方网元确定第二子模型的输入数据和/或输出数据时,第二信息包括第二子模型的输入数据和/或输出数据;当第三方网元确定第二子模型的输入数据和标签数据时,第二信息包括第二子模型的输入数据和/或标签数据。
S804a,第二网元根据第一子模型的输入数据和/或输出数据,训练第三子模型,然后执行S805。
或者可选的,对应S803a,该步骤S804a也可以替换为:第二网元根据第一子模型的输入数据和/或标签数据,训练第三子模型。
具体地,可参照S504实施,本公开不再进行赘述。
S804b,第一网元根据第二子模型的输入数据和/或输出数据,训练第四子模型,然后执行S805。
或者可选的,对应S803b,该步骤S804b也可以替换为:第一网元根据第二子模型的输入数据和/或标签数据,训练第四子模型。
具体地,可参照S504实施,本公开不再进行赘述。例如,第二子模型和第四子模型之间的关系,可以按照第一子模型和第三子模型之间的关系理解。
S805,第二网元上的第三子模型与第一网元上的第四子模型匹配使用。即第三子模型和第四子模型构成一个新的双边模型。
本公开中,由第三方网元训练一个双边模型,将该双边模型中多个子模型相关的输入/输出指示给多个网元,实现各个网元侧独立训练功能相同的子模型,与其它网元上的子模型匹配使用,能够满足双边模型的应用需求,无需在空口传输模型,能够减少传输开销,提升通信安全。
上述分别从第一网元、第二网元、第三方网元以及它们交互的角度对本公开提供的方法进行了介绍。为了实现上述方法中的各功能,第一网元、第二网元、第三方网元可以包括硬件结构和/或软件模块,以硬件结构、软件模块、或硬件结构加软件模块的形式来实现上述各功能。上述各功能中的某个功能以硬件结构、软件模块、还是硬件结构加软件模块的方式来执行,取决于技术方案的特定应用和设计约束条件。
基于同一构思,参见图9,本公开提供了一种通信装置900,该通信装置900包括处理模块901和通信模块902。该通信装置900可以是第一网元,也可以是应用于第一网元或者和第一网元匹配使用,能够实现第一网元侧执行的通信方法的通信装置;或者,该通信装置900可以是第二网元,也可以是应用于第二网元或者和第二网元匹配使用,能够实现第二网元侧执行的通信方法的通信装置;或者,该通信装置900可以是第三方网元,也可以是应用于第三方网元或者和第三方网元匹配使用,能够实现第三方网元侧执行的通信方法的通信装置。
其中,通信模块也可以称为收发模块、收发器、收发机、收发装置等。处理模块也可以称为处理器,处理单板,处理单元、处理装置等。可选的,可以将通信模块中用于实现接收功能的器件视为接收单元,应理解,通信模块用于执行上述方法实施例中接入网设备侧或终端设备侧的发送操作和接收操作,将通信模块中用于实现发送功能的器件视为发送单元,即通信模块包括接收单元和发送单元。
该通信装置900应用于第一网元时,其通信模块902包括的接收单元用于执行第一网 元侧的接收操作,例如接收来自第二网元的信号。其通信模块902包括的发送单元用于执行第一网元侧的发送操作,例如向第二网元发送信号。该通信装置900应用于第二网元时,其通信模块902包括的接收单元用于执行第二网元侧的接收操作,例如接收来自第一网元的信号;其通信模块902包括的发送单元用于执行第二网元侧的发送操作,例如向第一网元发送信号。该通信装置900应用于第三方网元时,其通信模块902包括的接收单元用于执行第三方网元侧的接收操作,例如接收来自第一网元或者第二网元的信号;其通信模块902包括的发送单元用于执行第三方网元侧的发送操作,例如向第一网元或者第二网元发送信号。
此外需要说明的是,若该装置采用芯片/芯片电路实现,所述通信模块可以是输入输出电路和/或通信接口,执行输入操作(对应前述接收操作)、输出操作(对应前述发送操作);处理模块为集成的处理器或者微处理器或者集成电路。
以下对该通信装置900应用于第一网元的实施方式进行详细说明。对应图5A所描述的方法,该第一网元可以是接入网设备或者终端设备。
该通信装置900包括:
处理模块901,用于确定第一子模型和第二子模型,所述第一子模型和所述第二子模型能够匹配使用。
通信模块902,用于发送第一信息,所述第一信息用于指示所述第一子模型的输入数据和/或所述第一子模型的输出数据,或者,所述第一信息用于指示所述第一子模型的输入数据和所述第一子模型的标签数据。
本公开中,提供能够匹配使用的多个子模型中一个子模型的输入数据和/或输出数据,可以用于独立训练与该子模型功能相同的子模型,无需在空口传输子模型,能够减少传输开销,提升通信安全。
在一种可能的设计中,所述第一子模型的输出用于确定所述第二子模型的输入;或者,所述第二子模型的输出用于确定所述第一子模型的输入。
在一种可能的设计中,所述第一子模型用于在发送端发送信息,所述第二子模型在接收端用于接收所述信息;或者,所述第二子模型用于在发送端发送信息,所述第一子模型用于在接收端接收所述信息。
在一种可能的设计中,所述第一子模型和所述第二子模型属于一个双边模型。
在一种可能的设计中,所述第一信息用于第三子模型的训练。
其中,在一种可选的实现中,所述第三子模型的功能与所述第一子模型的功能相同;和/或,所述第三子模型的输入类型与所述第一子模型的输入类型相同,所述第三子模型的输出类型与所述第一子模型的输出类型相同;和/或,所述第三子模型的输入数据的维度与所述第一子模型的输入数据的维度相同,所述第三子模型的输出数据的维度与所述第一子模型的输出数据的维度相同;和/或,所述第三子模型的输入与所述第一子模型的输入相同时,所述第三子模型的输出与所述第一子模型的输出之间的差别小于第一阈值;和/或,所述第三子模型的输入与所述第一子模型的输入相同时,所述第三子模型的输出与所述第一子模型的输出标签之间的差别小于第二阈值。所述第三子模型与所述第二子模型组成一个新的双边模型。
在另一种可选的实现中,所述第一信息用于指示所述第一子模型的输入数据和/或所述第一子模型的输出数据时,所述第三子模型的功能与所述第二子模型的功能相同;和/或, 所述第三子模型的输入类型与所述第二子模型的输入类型相同,所述第三子模型的输出类型与所述第二子模型的输出类型相同;和/或,所述第三子模型的输入数据的维度与所述第二子模型的输入数据的维度相同,所述第三子模型的输出数据的维度与所述第二子模型的输出数据的维度相同;和/或,所述第三子模型的输入与所述第二子模型的输入相同时,所述第三子模型的输出与所述第二子模型的输出之间的差别小于第一阈值;和/或,所述第三子模型的输入与所述第二子模型的输入相同时,所述第三子模型的输出与所述第二子模型的输出标签之间的差别小于第二阈值。通过这样的设计,实现独立训练替代第二子模型的第三子模型,使得第三子模型能够与第一子模型匹配使用,能够减少发送第二子模型的传输开销。且,第三子模型与所述第一子模型也能够组成一个新的双边模型。
在一种可能的设计中,所述处理模块901,具体用于:根据训练数据,确定所述第一子模型和所述第二子模型;其中,所述训练数据包括N个信道信息,N为正整数,所述信道信息包括下行信道特征或者下行信道。
在一种可能的设计中,所述第一子模型的输入数据包括M个信道信息,M为正整数。
在一种可能的设计中,所述第一子模型的输出数据包括与M个信道信息对应的特征比特,M为正整数。
在一种可能的设计中,所述通信模块902,还用于获取用于指示第一特征比特的信息,所述第三子模型的输出包括所述第一特征比特;所述处理模块901,还用于根据所述第二子模型和所述第一特征比特,得到第一信道信息;其中,所述第二子模型的输入包括所述第一特征比特,所述第二子模型的输出包括所述第一信道信息。
在一种可能的设计中,所述第一子模型的输入数据包括M个特征比特,M为正整数。
在一种可能的设计中,所述第一子模型的输出数据包括与M个特征比特对应的信道信息,M为正整数。
在一种可能的设计中,所述第一子模型的输入数据包括M个特征比特,所述第一子模型的标签数据包括与M个特征比特对应的信道信息,M为正整数。
在一种可能的设计中,所述处理模块901,还用于根据第二信道信息和所述第二子模型,确定第二特征比特;其中,所述第二子模型的输入包括所述第二信道信息,所述第二子模型的输出包括所述第二特征比特;所述通信模块902,还用于发送用于指示所述第二特征比特的信息。
在一种可能的设计中,所述处理模块901,还用于根据第二信道信息和所述第二子模型,确定第二特征比特;其中,所述第二子模型的输入包括所述第二信道信息,所述第二子模型的输出包括所述第二特征比特;所述通信模块902,还用于发送用于指示所述第二特征比特和第二信道信息的信息,或者,发送用于指示所述第二特征比特和第二特征比特对应的标签信道信息的信息。其中,第二特征比特对应的标签信道信息可以理解为第三子模型的输出标签,例如可以是第二信道信息。以下对该通信装置900应用于第二网元的实施方式进行详细说明。对应图5A所描述的方法,该第二网元可以是终端设备或者接入网设备。
通信模块902,用于获取第一信息,所述第一信息用于指示第一子模型的输入数据和/或所述第一子模型的输出数据,或所述第一信息用于指示所述第一子模型的输入数据和所述第一子模型的标签数据。
处理模块901,用于根据所述第一信息训练第三子模型。
上述设计中,根据获取的一个子模型的输入数据和/或输出数据,可以用于独立训练与该子模型功能相同的子模型。能够应用于部署双边模型的场景,无需在空口传输子模型,能够减少传输开销,提升通信安全。
在一种可能的设计中,所述第三子模型的功能与所述第一子模型的功能相同;和/或,所述第三子模型的输入类型与所述第一子模型的输入类型相同,所述第三子模型的输出类型与所述第一子模型的输出类型相同;和/或,所述第三子模型的输入数据的维度与所述第一子模型的输入数据的维度相同,所述第三子模型的输出数据的维度与所述第一子模型的输出数据的维度相同;和/或,所述第三子模型的输入与所述第一子模型的输入相同时,所述第三子模型的输出与所述第一子模型的输出之间的差别小于第一阈值。
在一种可能的设计中,所述第一子模型和第二子模型能够匹配使用。
在一种可能的设计中,所述第一子模型的输出用于确定所述第二子模型的输入;或者,所述第二子模型的输出用于确定所述第一子模型的输入。
在一种可能的设计中,所述第一子模型用于在发送端发送信息,所述第二子模型在接收端用于接收所述信息;或者,所述第二子模型用于在发送端发送信息,所述第一子模型用于在接收端接收所述信息。
在一种可能的设计中,所述第一子模型和所述第二子模型属于一个双边模型;所述第三子模型与所述第二子模型组成一个新的双边模型。
在一种可能的设计中,所述第一子模型的输入数据包括M个信道信息,M为正整数。
在一种可能的设计中,所述第一子模型的输出数据包括与所述M个信道信息对应的特征比特,M为正整数。
在一种可能的设计中,所述处理模块901,还用于根据第三信道信息和所述第三子模型,确定第一特征比特;其中,所述第三子模型的输入包括所述第三信道信息,所述第三子模型的输出包括所述第一特征比特。所述通信模块902,还用于发送用于指示所述第一特征比特的信息。
在一种可能的设计中,所述第一子模型的输入参数包括M个特征比特,M为正整数。
在一种可能的设计中,所述第一子模型的输出参数包括与所述M个特征比特对应的信道信息,M为正整数。
在一种可能的设计中,所述通信模块902,还用于获取用于指示第二特征比特的信息;所述处理模块901,还用于根据所述第三子模型和所述第二特征比特,得到第四信道信息;其中,所述第三子模型的输入包括所述第二特征比特,所述第三子模型的输出包括所述第四信道信息。
以下对该通信装置900应用于第三方网元的实施方式进行详细说明。对应图8所描述的方法,该第三方网元可以是AI网元。
处理模块901,用于确定第一子模型和第二子模型。
通信模块902,用于向第二网元发送第一信息,所述第一信息包括第一子模型的输入数据和/或输出数据,或第一信息包括第一子模型的输入数据和标签数据,所述第一信息用于第三子模型的训练;以及向第一网元发送第二信息,所述第二信息包括第二子模型的输入数据和/或输出数据,或第二信息包括第二子模型的输入数据和标签数据,所述第二信息用于第四子模型的训练。
关于第一子模型、第二子模型、第三子模型以及第四子模型之间的关系,可以参照上 述方法实施例中的介绍理解,本公开对此不再进行赘述。
本公开中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,另外,在本公开各个实施例中的各功能模块可以集成在一个处理器中,也可以是单独物理存在,也可以两个或两个以上模块集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
基于相同的技术构思,本公开还提供了一种通信装置1000。该通信装置1000可以是芯片或者芯片系统。可选的,在本公开中芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。
通信装置1000可用于实现图1A所示的通信系统中终端设备或接入网设备的功能或者用于实现图1B所示的通信系统中终端设备、接入网设备或AI网元的功能。通信装置1000可以包括至少一个处理器1010,该处理器1010与存储器耦合,可选的,存储器可以位于该装置之内,存储器可以和处理器集成在一起,存储器也可以位于该装置之外。例如,通信装置1000还可以包括至少一个存储器1020。存储器1020保存实施上述任一实施例中的方法所必要的计算机程序、配置信息、指令和/或数据;处理器1010可以执行存储器1020中存储的计算机程序,完成上述任一实施例中的方法。
通信装置1000中还可以包括通信接口1030,通信装置1000可以通过通信接口1030和其它设备进行信息交互。示例性的,所述通信接口1030可以是收发器、电路、总线、模块、管脚或其它类型的通信接口。当该通信装置1000为芯片类的装置或者电路时,该装置1000中的通信接口1030也可以是输入输出电路,可以输入信息(或称,接收信息)和输出信息(或称,发送信息),处理器为集成的处理器或者微处理器或者集成电路或者逻辑电路,处理器可以根据输入信息确定输出信息。
本公开中的耦合是装置、单元或模块之间的间接耦合或通信连接,可以是电性,机械或其它的形式,用于装置、单元或模块之间的信息交互。处理器1010可能和存储器1020、通信接口1030协同操作。本公开中不限定上述处理器1010、存储器1020以及通信接口1030之间的具体连接介质。
可选的,参见图10,所述处理器1010、所述存储器1020以及所述通信接口1030之间通过总线1040相互连接。所述总线1040可以是外设部件互连标准(peripheral component interconnect,PCI)总线或扩展工业标准结构(extended industry standard architecture,EISA)总线等。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图10中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在本公开中,处理器可以是通用处理器、数字信号处理器、专用集成电路、现场可编程门阵列或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本公开中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本公开所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。
在本公开中,存储器可以是非易失性存储器,比如硬盘(hard disk drive,HDD)或固态硬盘(solid-state drive,SSD)等,还可以是易失性存储器(volatile memory),例如随机存取存储器(random-access memory,RAM)。存储器是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。本 公开中的存储器还可以是电路或者其它任意能够实现存储功能的装置,用于存储程序指令和/或数据。
在一种可能的实施方式中,该通信装置1000可以应用于终端设备,具体通信装置1000可以是终端设备,也可以是能够支持终端设备,实现上述涉及的任一实施例中终端设备的功能的装置。存储器1020保存实现上述任一实施例中的终端设备的功能的必要计算机程序、计算机程序或指令和/或数据。处理器1010可执行存储器1020存储的计算机程序,完成上述任一实施例中终端设备执行的方法。应用于终端设备,该通信装置1000中的通信接口可用于与接入网设备进行交互,向接入网设备发送数据或者接收来自接入网设备的数据。
在另一种可能的实施方式中,该通信装置1000可以应用于接入网设备,具体通信装置1000可以是接入网设备,也可以是能够支持接入网设备,实现上述涉及的任一实施例中接入网设备的功能的装置。存储器1020保存实现上述任一实施例中的接入网设备的功能的必要计算机程序、计算机程序或指令和/或数据。处理器1010可执行存储器1020存储的计算机程序,完成上述任一实施例中接入网设备执行的方法。应用于接入网设备,该通信装置1000中的通信接口可用于与终端设备进行交互,向终端设备发送数据或者接收来自终端设备的数据。
在另一种可能的实施方式中,该通信装置1000可以应用于AI网元,具体通信装置1000可以是AI网元,也可以是能够支持AI网元,实现上述涉及的任一实施例中AI网元的功能的装置。存储器1020保存实现上述任一实施例中的AI网元的功能的必要计算机程序、计算机程序或指令和/或数据。处理器1010可执行存储器1020存储的计算机程序,完成上述任一实施例中AI网元执行的方法。应用于AI网元,该通信装置1000中的通信接口可用于与接入网设备进行交互,向接入网设备发送数据或者接收来自接入网设备的数据。
由于本实施例提供的通信装置1000可应用于终端设备,完成上述终端设备执行的方法,或者应用于接入网设备,完成接入网设备执行的方法,或者应用于AI网元,完成AI网元执行的方法。因此其所能获得的技术效果可参考上述方法实施例,在此不再赘述。
基于以上实施例,本公开还提供了一种计算机程序,当所述计算机程序在计算机上运行时,使得所述计算机从终端设备侧或者接入网设备侧角度执行图5A至图8所示的实施例中所提供的通信方法。
基于以上实施例,本公开还提供了一种计算机可读存储介质,该计算机可读存储介质中存储有计算机程序,所述计算机程序被计算机执行时,使得计算机从终端设备侧或者接入网设备侧角度执行图5A至图8所示的实施例中所提供的通信方法。其中,存储介质可以是计算机能够存取的任何可用介质。以此为例但不限于:计算机可读介质可以包括RAM、只读存储器(read-only memory,ROM)、电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、CD-ROM或其他光盘存储、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质。
基于以上实施例,本公开提供了一种通信系统,包括终端设备和接入网设备,其中,所述终端设备和接入网设备可以实现图5A至图8所示的实施例中所提供的通信方法。
基于以上实施例,本公开还提供了一种芯片,所述芯片用于读取存储器中存储的计算机程序,从终端设备侧或者接入网设备侧角度实现图5A至图8所示的实施例中所提供的通信方法。
本公开提供的技术方案可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本公开所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、接入网设备、终端设备、AI网元或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机可以存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,数字视频光盘(digital video disc,DVD))、或者半导体介质等。
在本公开中,在无逻辑矛盾的前提下,各实施例之间可以相互引用,例如方法实施例之间的方法和/或术语可以相互引用,例如装置实施例之间的功能和/或术语可以相互引用,例如装置实施例和方法实施例之间的功能和/或术语可以相互引用。
显然,本领域的技术人员可以对本公开进行各种改动和变型而不脱离本公开的范围。这样,倘若本公开的这些修改和变型属于本公开权利要求及其等同技术的范围之内,则本公开也意图包含这些改动和变型在内。

Claims (31)

  1. A communication method, comprising:
    determining a first sub-model and a second sub-model, wherein the first sub-model and the second sub-model are capable of being used in a matched manner; and
    sending first information, wherein the first information indicates input data of the first sub-model and/or output data of the first sub-model.
  2. The method according to claim 1, wherein
    an output of the first sub-model is used to determine an input of the second sub-model; or
    an output of the second sub-model is used to determine an input of the first sub-model.
  3. The method according to claim 1 or 2, wherein
    the first sub-model is used at a transmit end to send information, and the second sub-model is used at a receive end to receive the information; or
    the second sub-model is used at a transmit end to send information, and the first sub-model is used at a receive end to receive the information.
  4. The method according to any one of claims 1 to 3, wherein the first sub-model and the second sub-model belong to one two-sided model.
  5. The method according to any one of claims 1 to 4, wherein the first information is used for training of a third sub-model, wherein:
    a function of the third sub-model is the same as a function of the first sub-model; and/or
    an input type of the third sub-model is the same as an input type of the first sub-model, and an output type of the third sub-model is the same as an output type of the first sub-model; and/or
    a dimension of input data of the third sub-model is the same as a dimension of input data of the first sub-model, and a dimension of output data of the third sub-model is the same as a dimension of output data of the first sub-model; and/or
    when an input of the third sub-model is the same as an input of the first sub-model, a difference between an output of the third sub-model and an output of the first sub-model is less than a first threshold.
  6. The method according to any one of claims 1 to 5, wherein the determining a first sub-model and a second sub-model comprises:
    determining the first sub-model and the second sub-model based on training data, wherein the training data comprises N pieces of channel information, N is a positive integer, and the channel information comprises a downlink channel feature or a downlink channel.
  7. The method according to any one of claims 1 to 6, wherein the input data of the first sub-model comprises M pieces of channel information, and M is a positive integer.
  8. The method according to any one of claims 1 to 7, wherein the output data of the first sub-model comprises feature bits corresponding to M pieces of channel information, and M is a positive integer.
  9. The method according to any one of claims 5 to 8, further comprising:
    obtaining information indicating a first feature bit, wherein an output of the third sub-model comprises the first feature bit; and
    obtaining first channel information based on the second sub-model and the first feature bit, wherein an input of the second sub-model comprises the first feature bit, and an output of the second sub-model comprises the first channel information.
  10. The method according to any one of claims 1 to 6, wherein the input data of the first sub-model comprises M feature bits, and M is a positive integer.
  11. The method according to any one of claims 1 to 6 and claim 10, wherein the output data of the first sub-model comprises channel information corresponding to M feature bits, and M is a positive integer.
  12. The method according to claim 10 or 11, further comprising:
    determining a second feature bit based on second channel information and the second sub-model, wherein an input of the second sub-model comprises the second channel information, and an output of the second sub-model comprises the second feature bit; and
    sending information indicating the second feature bit.
  13. A communication method, comprising:
    obtaining first information, wherein the first information indicates input data of a first sub-model and/or output data of the first sub-model; and
    training a third sub-model based on the first information.
  14. The method according to claim 13, wherein a function of the third sub-model is the same as a function of the first sub-model; and/or
    an input type of the third sub-model is the same as an input type of the first sub-model, and an output type of the third sub-model is the same as an output type of the first sub-model; and/or
    a dimension of input data of the third sub-model is the same as a dimension of input data of the first sub-model, and a dimension of output data of the third sub-model is the same as a dimension of output data of the first sub-model; and/or
    when an input of the third sub-model is the same as an input of the first sub-model, a difference between an output of the third sub-model and an output of the first sub-model is less than a first threshold.
  15. The method according to claim 13 or 14, wherein the first sub-model and a second sub-model are capable of being used in a matched manner.
  16. The method according to claim 15, wherein an output of the first sub-model is used to determine an input of the second sub-model; or an output of the second sub-model is used to determine an input of the first sub-model.
  17. The method according to claim 15 or 16, wherein the first sub-model is used at a transmit end to send information, and the second sub-model is used at a receive end to receive the information; or the second sub-model is used at a transmit end to send information, and the first sub-model is used at a receive end to receive the information.
  18. The method according to any one of claims 13 to 17, wherein the first sub-model and the second sub-model belong to one two-sided model.
  19. The method according to any one of claims 13 to 18, wherein the input data of the first sub-model comprises M pieces of channel information, and M is a positive integer.
  20. The method according to any one of claims 13 to 19, wherein the output data of the first sub-model comprises feature bits corresponding to the M pieces of channel information, and M is a positive integer.
  21. The method according to any one of claims 13 to 20, further comprising:
    determining a first feature bit based on third channel information and the third sub-model, wherein an input of the third sub-model comprises the third channel information, and an output of the third sub-model comprises the first feature bit; and
    sending information indicating the first feature bit.
  22. The method according to any one of claims 13 to 18, wherein an input parameter of the first sub-model comprises M feature bits, and M is a positive integer.
  23. The method according to any one of claims 13 to 18 and claim 22, wherein an output parameter of the first sub-model comprises channel information corresponding to the M feature bits, and M is a positive integer.
  24. The method according to claim 22 or 23, further comprising:
    obtaining information indicating a second feature bit; and
    obtaining fourth channel information based on the third sub-model and the second feature bit, wherein an input of the third sub-model comprises the second feature bit, and an output of the third sub-model comprises the fourth channel information.
  25. A communication apparatus, configured to implement the method according to any one of claims 1 to 12.
  26. A communication apparatus, configured to implement the method according to any one of claims 13 to 24.
  27. A communication apparatus, comprising:
    a processor, wherein the processor is coupled to a memory, and the processor is configured to perform the method according to any one of claims 1 to 12.
  28. A communication apparatus, comprising:
    a processor, wherein the processor is coupled to a memory, and the processor is configured to perform the method according to any one of claims 13 to 24.
  29. A communication system, comprising the communication apparatus according to claim 25 or 27 and the communication apparatus according to claim 26 or 28.
  30. A computer-readable storage medium storing instructions, wherein when the instructions are run on a computer, the computer is caused to perform the method according to any one of claims 1 to 12 or the method according to any one of claims 13 to 24.
  31. A computer program product comprising instructions, wherein when the instructions are run on a computer, the computer is caused to perform the method according to any one of claims 1 to 12 or the method according to any one of claims 13 to 24.
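Claims 13 to 24 above describe training a third sub-model from the first information, that is, from input data and/or output data observed for the first sub-model, such that with the same input, the difference between the two sub-models' outputs stays below a threshold. The following is a minimal sketch of that idea only; the linear-plus-sign stand-in for the first sub-model, the least-squares fit standing in for gradient-based training, and all dimensions are assumptions made for this example, not the disclosed method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative dimensions (assumptions): D-dimensional inputs, K feature bits,
# N observed input/output pairs of the first sub-model.
D, K, N = 32, 16, 2000

# First sub-model: a fixed encoder mapping channel vectors to feature bits
# (stand-in for a trained network whose parameters are not shared).
W_first = rng.standard_normal((D, K))

def first_sub_model(x: np.ndarray) -> np.ndarray:
    return (x @ W_first > 0).astype(np.int8)

# "First information": input/output data observed for the first sub-model.
inputs = rng.standard_normal((N, D))
outputs = first_sub_model(inputs)

# Third sub-model: same input/output types and dimensions as the first
# sub-model, trained only from the observed pairs. A least-squares fit to the
# +/-1 targets stands in for gradient-based training of a network.
targets = 2.0 * outputs - 1.0
W_third, *_ = np.linalg.lstsq(inputs, targets, rcond=None)

def third_sub_model(x: np.ndarray) -> np.ndarray:
    return (x @ W_third > 0).astype(np.int8)

# On fresh inputs, the third sub-model's outputs should closely track the
# first sub-model's, i.e. their difference stays below a threshold.
test_inputs = rng.standard_normal((200, D))
agreement = np.mean(third_sub_model(test_inputs) == first_sub_model(test_inputs))
assert agreement > 0.8  # loose bound; exact agreement depends on N, D, K
```

The point of the sketch is the data flow: the trainer never sees the first sub-model's parameters, only its input/output behavior, yet obtains a third sub-model that can be paired with the second sub-model in the first sub-model's place.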
PCT/CN2022/118269 2021-09-10 2022-09-09 Communication method and apparatus WO2023036323A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22866784.6A EP4391474A1 (en) 2021-09-10 2022-09-09 Communication method and apparatus
US18/598,574 US20240211769A1 (en) 2021-09-10 2024-03-07 Communication method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111064144.XA CN115842835A (zh) 2021-09-10 2021-09-10 Communication method and apparatus
CN202111064144.X 2021-09-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/598,574 Continuation US20240211769A1 (en) 2021-09-10 2024-03-07 Communication method and apparatus

Publications (1)

Publication Number Publication Date
WO2023036323A1 true WO2023036323A1 (zh) 2023-03-16

Family

ID=85506147

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/118269 WO2023036323A1 (zh) 2021-09-10 2022-09-09 一种通信方法及装置

Country Status (4)

Country Link
US (1) US20240211769A1 (zh)
EP (1) EP4391474A1 (zh)
CN (1) CN115842835A (zh)
WO (1) WO2023036323A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685202A (zh) * 2018-12-17 2019-04-26 腾讯科技(深圳)有限公司 数据处理方法及装置、存储介质和电子装置
US10944436B1 (en) * 2019-11-21 2021-03-09 Harris Global Communications, Inc. RF communication device using artificial intelligence (AI) model and associated methods
CN112783807A (zh) * 2020-12-31 2021-05-11 深圳大普微电子科技有限公司 一种模型计算方法及系统
CN113365287A (zh) * 2020-03-06 2021-09-07 华为技术有限公司 通信方法及装置

Non-Patent Citations (2)

CHINA TELECOM: "Solution to AI based ES in RAN Split Architecture", 3GPP draft R3-213956, 3GPP TSG RAN WG3, online meeting, 16-26 August 2021, published 6 August 2021, XP052035622 *
QUALCOMM INCORPORATED: "Model Training Procedure", 3GPP draft R3-211755, 3GPP TSG RAN WG3, meeting 17-28 May 2021, published 7 May 2021, XP052002038 *

Also Published As

Publication number Publication date
US20240211769A1 (en) 2024-06-27
CN115842835A (zh) 2023-03-24
EP4391474A1 (en) 2024-06-26


Legal Events

Code Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22866784; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2022866784; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2022866784; Country of ref document: EP; Effective date: 20240321)
NENP Non-entry into the national phase (Ref country code: DE)