WO2023273956A1 - Communication method, apparatus and system based on a multi-task network model - Google Patents

Communication method, apparatus and system based on a multi-task network model

Info

Publication number
WO2023273956A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
features
communication device
network model
fusion
Application number
PCT/CN2022/100097
Other languages
English (en)
French (fr)
Inventor
王梦杨
李佳徽
马梦瑶
谢俊文
张志聪
颜敏
Original Assignee
Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023273956A1
Priority to US18/398,520 (published as US20240127074A1)

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W16/00 - Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
    • H04W16/22 - Traffic simulation tools or models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/096 - Transfer learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 - Network analysis or design
    • H04L41/145 - Network analysis or design involving simulating, designing, planning or modelling of a network
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/048 - Activation functions
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/16 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence

Definitions

  • the embodiments of the present application relate to the field of communication technologies, and in particular to a communication method, device and system based on a multi-task network model.
  • CNN convolutional neural network
  • STL single task learning
  • MTL multi-task learning
  • the MTL mode uses a multi-task network model.
  • multiple functional network models share the intermediate features generated by the backbone network model.
  • Different functional network models complete different tasks.
  • the MTL mode can be more efficient and reduces the storage cost of the model.
  • a model operation mode of device-cloud collaboration is proposed, that is, collaborative intelligence (CI).
  • the model in the CI scenario is divided into two parts, one part is located on the mobile device side, and the other part is located on the cloud.
  • the mobile device runs part of the network model, and the cloud runs another part of the network model.
  • Intermediate features need to be transmitted between the mobile device and the cloud to achieve the purpose of collaboration.
  • the model operation mode of device-cloud collaboration can reduce the computing cost of mobile devices.
  • the structure of the multi-task network model in MTL mode is relatively complex. How to realize device-cloud collaboration in MTL mode is a problem that needs to be solved.
  • Embodiments of the present application provide a communication method, device, and system based on a multi-task network model, with a view to realizing device-cloud collaboration in an MTL mode.
  • a communication method based on a multi-task network model may be executed by a first communication device, or may be executed by a component (such as a processor, a chip, or a chip system, etc.) of the first communication device.
  • the first communication device may be a terminal device or a cloud.
  • the multi-task network model includes a first backbone network model.
  • the method can be implemented through the following steps: the first communication device uses the first backbone network model to process the input signal to obtain a fusion feature, where the fusion feature is obtained by fusing multiple first features, and the multiple first features are obtained by performing feature extraction on the input signal; the first communication device compresses and channel-codes the fusion feature to obtain first information; and the first communication device sends the first information to the second communication device.
  • the obtained fused features can contain more information, which can make the second communication device more accurate when processing another part of the network model based on the fused features.
  • Generating the fusion feature in the feature extraction stage makes the structure of the multi-task network model clearer and makes it easier to divide the model into the part executed by the first communication device and the part executed by the second communication device, which facilitates device-cloud collaboration in the MTL mode. Compression reduces the number of parameters transmitted between the first communication device and the second communication device, lowering the transmission overhead, and channel coding improves the noise resistance of the data transmitted between them.
  • when the first communication device uses the first backbone network model to process the input signal to obtain the fusion feature, the first communication device specifically implements the following steps: the first communication device performs feature extraction on the input signal to obtain multiple second features, where the multiple second features have different feature dimensions; the first communication device processes the feature dimensions of the multiple second features to obtain multiple first features with the same feature dimension; and the first communication device fuses the multiple first features to obtain the fusion feature.
  • the purpose is to make the fusion feature rich in information; fusing information from different sources also makes the information complementary.
  • the first communication device may perform a first convolution operation and an upsampling operation on the multiple second features to obtain multiple first features with the same feature dimension.
  • the upsampling operation can change the height and width of the first feature to any value, not limited to integer multiples; for example, the height can be expanded to 8 times the original or more, and the expansion factor can be any multiple.
  • in contrast, using a deconvolution operation to change the height and width of features can only achieve an expansion factor of 2, and the expansion factor must be an integer.
  • the upsampling operation is therefore more flexible in its scaling factor.
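The difference can be illustrated with a short sketch. Nearest-neighbor interpolation is an assumption here (the embodiments only require that arbitrary scale factors be reachable, without fixing the interpolation method), and `upsample_nearest` is a hypothetical helper name:

```python
import numpy as np

def upsample_nearest(feature, out_h, out_w):
    """Nearest-neighbor upsampling of a (C, H, W) feature map to an
    arbitrary (out_h, out_w); the target size need not be an integer
    multiple of the input size, unlike a stride-2 deconvolution."""
    c, h, w = feature.shape
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return feature[:, rows][:, :, cols]

x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
y = upsample_nearest(x, 9, 10)   # non-integer scale factors: 2.25x and 2.5x
```

A deconvolution could only reach 8×8 (factor 2) from a 4×4 input in one step; the upsampling above reaches 9×10 directly.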
  • the first communication device performs feature fusion on multiple first features to obtain a fused feature.
  • the first communication device may add multiple first features to obtain a third feature.
  • adding multiple first features may refer to adding elements at the same position in multiple first features, and a fused third feature can be obtained after adding multiple first features.
  • the method of obtaining the third feature by adding is simple and effective, and can help reduce the complexity of the model.
  • the first communication device performs a second convolution operation on the third feature to obtain the fusion feature.
  • the second convolution operation may be a 3×3 convolution whose numbers of input and output channels are equal, so that the third feature and the fusion feature have the same dimensions. Performing the second convolution operation on the third feature makes the resulting fusion feature smoother, so that it is better suited for the subsequent network model processing performed by the second communication device, making the processing results more accurate.
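The fusion step can be sketched as follows: element-wise addition of the aligned first features, then a smoothing 3×3 convolution with equal input and output channels. This is a minimal NumPy illustration with random weights standing in for trained parameters; `conv3x3_same` and `fuse` are hypothetical helper names:

```python
import numpy as np

def conv3x3_same(x, weight):
    """3x3 convolution with padding 1 and equal input/output channel counts.
    x: (C, H, W); weight: (C, C, 3, 3). Output has the same shape as x."""
    c, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for oc in range(c):
        for i in range(3):
            for j in range(3):
                # accumulate the contribution of tap (i, j) over all input channels
                out[oc] += np.einsum('chw,c->hw',
                                     xp[:, i:i + h, j:j + w], weight[oc, :, i, j])
    return out

def fuse(first_features, weight):
    """Element-wise addition of the aligned first features (-> third feature),
    followed by the smoothing second convolution (-> fusion feature)."""
    third = np.sum(first_features, axis=0)   # add elements at the same position
    return conv3x3_same(third, weight)       # dimensions are preserved

rng = np.random.default_rng(0)
feats = rng.random((3, 8, 16, 16))           # three first features, C=8, 16x16
w = rng.random((8, 8, 3, 3)) * 0.01
fused = fuse(feats, w)
```

Because the channel counts are equal and padding is 1, the fusion feature keeps the (8, 16, 16) shape of each first feature, matching the equal-dimension requirement above.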
  • the first communication device performs compression and channel protection processing on the fusion feature; this may mean that the first communication device uses a source-channel joint coding model to perform downsampling and a third convolution operation on the fusion feature to obtain a fourth feature. The source-channel joint coding model is trained based on channel noise, so that data processed by it is more robust to noise.
  • the second communication device uses the corresponding source-channel joint decoding model to process the received data and can decode reconstructed features that are closer to the fusion feature, making the performance of device-cloud collaboration more stable and accurate.
  • the height and width of fusion features can be reduced by downsampling.
  • when convolution operations are used to reduce the height and width of the fusion feature, the reduction factor is constrained by the size of the convolution kernel, and the result can only be an integer fraction of the original dimension.
  • downsampling, by contrast, can reduce the fusion feature to any dimension, so using downsampling to reduce the dimension of the fusion feature is more flexible.
  • the third convolution operation reduces the number of feature channels of the fused feature, so that the fused feature is compressed into a form more suitable for transmission.
  • the first communication device performs compression and channel protection processing on the fusion feature, and further includes:
  • the first communication device uses the source-channel joint coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization (GDN), a parameterized rectified linear unit (PReLU), or power normalization.
  • both GDN and the PReLU can be used to improve the compression ability of the source-channel joint coding model.
  • power normalization normalizes the average power of the compressed result to 1.
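A minimal sketch of this compression path, under stated assumptions: nearest-neighbor downsampling and a 1×1 channel-reducing convolution stand in for the downsampling and the third convolution operation (the embodiments fix neither choice), GDN/PReLU are omitted, and random weights replace the trained source-channel joint coding model:

```python
import numpy as np

def power_normalize(z):
    """Scale the symbol tensor so that its average power is exactly 1."""
    return z * np.sqrt(z.size / np.sum(z ** 2))

def jscc_encode(fusion, out_h, out_w, proj):
    """Downsample the fusion feature to an arbitrary spatial size, reduce
    the channel count with a 1x1 convolution (a channel projection), then
    power-normalize the result before transmission."""
    c, h, w = fusion.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    small = fusion[:, rows][:, :, cols]               # downsampling (any size)
    reduced = np.einsum('oc,chw->ohw', proj, small)   # third convolution (1x1)
    return power_normalize(reduced)                   # unit average power

rng = np.random.default_rng(0)
fusion = rng.random((8, 16, 16))
proj = rng.random((4, 8))                             # 8 -> 4 channels
tx = jscc_encode(fusion, 6, 6, proj)                  # 8*16*16 -> 4*6*6 symbols
```

The compressed tensor carries 4×6×6 = 144 values instead of 8×16×16 = 2048, illustrating how downsampling plus channel reduction cut the transmission overhead.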
  • a communication method based on a multi-task network model is provided, and the method may be executed by the second communication device, or may be executed by components (such as a processor, a chip, or a chip system, etc.) of the second communication device.
  • the second communication device may be a cloud or a terminal device.
  • the multi-task network model includes a second backbone network model and a functional network model.
  • the method can be realized through the following steps: the second communication device receives second information from the first communication device; the second communication device decompresses and channel-decodes the second information to obtain a reconstructed feature of the fusion feature, where the fusion feature is obtained by fusing multiple first features extracted from the input signal; the second communication device uses the second backbone network model to perform feature analysis on the reconstructed feature to obtain a feature analysis result; and the second communication device uses the functional network model to process the feature analysis result.
  • the obtained fusion features can contain more information, which can make the processing by the second communication device based on the reconstructed features of the fusion features more accurate.
  • Generating fusion features in the feature extraction stage can make the structure of the multi-task network model clearer, which is more conducive to the division of the multi-task network model, and is more conducive to the realization of device-cloud collaboration in the MTL mode.
  • the decoded second information can be closer to the first information sent by the first communication device, and the anti-noise performance of the data transmitted between the first communication device and the second communication device can be improved.
  • the second communication device can complete multiple tasks by receiving a single set of features (that is, the fusion feature included in the first information), and does not need multiple sets of input features to perform multiple tasks.
  • the operation of the second communication device is thus more concise, which helps divide the multi-task network model into two parts and makes it better suited to the device-cloud collaboration scenario.
  • the result of feature analysis includes X features; the first of the X features is the reconstructed feature, and the (i+1)th of the X features is obtained by computation from the ith feature; the first Y of the X features are obtained through a first operation, and the remaining (X-Y) features are obtained through a second operation, where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X; the convolution operation in the first operation has multiple receptive fields, while the convolution operation in the second operation has a single receptive field.
  • the first operation fuses the convolution results of different receptive fields together, which is a means of feature fusion: it extracts and fuses different information from different angles, so that the result of the first operation contains more information than the result of the second operation, which further benefits the performance of the functional network model.
  • the first operation includes the following steps: perform a 1×1 convolution operation on the feature to be processed among the first Y features; perform multiple 3×3 convolution operations, with different receptive field sizes, on the result of the 1×1 convolution operation; concatenate the results of the multiple 3×3 convolution operations along the channel dimension; perform a 1×1 convolution operation on the concatenated result to obtain a first convolution result; and add the first convolution result element-wise to the feature to be processed.
  • the height of the (i+1)th feature is 1/2 of the height of the ith feature; the width of the (i+1)th feature is 1/2 of the width of the ith feature; and the number of channels of the (i+1)th feature is the same as the number of channels of the ith feature.
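One way to realize 3×3 convolutions with different receptive fields is dilation; that choice is an assumption here, since the embodiments only state that the receptive field sizes differ. A sketch of the first operation in NumPy, with random weights in place of trained parameters and hypothetical helper names:

```python
import numpy as np

def conv3x3_dilated(x, weight, d):
    """3x3 convolution with dilation d and 'same' padding.
    x: (C_in, H, W); weight: (C_out, C_in, 3, 3). Dilation d gives an
    effective receptive field of (2d + 1) x (2d + 1)."""
    c_out = weight.shape[0]
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (d, d), (d, d)))
    out = np.zeros((c_out, h, w))
    for i in range(3):
        for j in range(3):
            patch = xp[:, i * d:i * d + h, j * d:j * d + w]
            out += np.einsum('oc,chw->ohw', weight[:, :, i, j], patch)
    return out

def first_operation(x, w_in, w_branches, w_out):
    """1x1 conv, parallel 3x3 convs whose dilations give different receptive
    fields, channel concatenation, 1x1 conv, then a residual addition."""
    t = np.einsum('oc,chw->ohw', w_in, x)                      # 1x1 convolution
    branches = [conv3x3_dilated(t, wb, d)                      # different
                for d, wb in enumerate(w_branches, start=1)]   # receptive fields
    cat = np.concatenate(branches, axis=0)                     # channel concat
    y = np.einsum('oc,chw->ohw', w_out, cat)                   # 1x1 convolution
    return y + x                                               # element-wise add

rng = np.random.default_rng(0)
c = 8
x = rng.random((c, 16, 16))
w_in = rng.random((c, c)) * 0.1
w_branches = [rng.random((c, c, 3, 3)) * 0.01 for _ in range(3)]
w_out = rng.random((c, 3 * c)) * 0.01
y = first_operation(x, w_in, w_branches, w_out)
```

The residual addition requires the output shape to match the input, which is why the final 1×1 convolution maps the 3c concatenated channels back to c channels.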
  • the second communication device performs decompression and channel decoding on the second information, including: the second communication device performs the following operations on the second information by using a source-channel joint decoding model: a fourth convolution operation, an upsampling operation, and a fifth convolution operation.
  • the upsampling operation restores the spatial dimensions of the feature, and the fourth and fifth convolution operations gradually restore the compressed fusion feature, improving the quality of feature recovery.
  • the restored reconstructed feature thus carries more parameters, and using more parameters can improve the accuracy of the network model analysis.
  • the second communication device performs decompression and channel decoding on the second information, further including one or more of the following operations: inverse generalized divisive normalization (IGDN), a parameterized rectified linear unit (PReLU), batch normalization (BN), or a rectified linear unit (ReLU).
  • IGDN inverse generalized divisive normalization
  • PReLU parameterized rectified linear unit
  • BN batch normalization
  • ReLU rectified linear unit
  • the IGDN can be used to improve the decompression capability or decoding capability of the source-channel joint decoding model.
  • PReLU can also be used to improve the decompression or decoding ability of the source-channel joint decoding model.
  • BN and/or ReLU can limit the value range of the decoding result, and can further increase the accuracy of the decoding result.
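A minimal sketch of this decoding path, under stated assumptions: 1×1 convolutions stand in for the fourth and fifth convolution operations, nearest-neighbor interpolation for the upsampling, and a ReLU acts as the range-limiting activation; IGDN and batch normalization are omitted for brevity, and random weights replace the trained source-channel joint decoding model:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: limits the value range to [0, inf)."""
    return np.maximum(x, 0.0)

def upsample_nearest(x, out_h, out_w):
    """Nearest-neighbor resize of a (C, H, W) tensor to (C, out_h, out_w)."""
    c, h, w = x.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[:, rows][:, :, cols]

def jscc_decode(rx, w4, w5, out_h, out_w):
    """Fourth convolution, upsampling that restores the spatial dimensions,
    fifth convolution, then ReLU limiting the reconstruction's value range."""
    t = np.einsum('oc,chw->ohw', w4, rx)        # fourth convolution (1x1)
    t = upsample_nearest(t, out_h, out_w)       # restore spatial size
    t = np.einsum('oc,chw->ohw', w5, t)         # fifth convolution (1x1)
    return relu(t)                              # limit the value range

rng = np.random.default_rng(0)
rx = rng.random((4, 6, 6))                      # received, compressed feature
w4 = rng.random((8, 4)) * 0.1                   # 4 -> 8 channels
w5 = rng.random((8, 8)) * 0.1
recon = jscc_decode(rx, w4, w5, 16, 16)         # reconstructed fusion feature
```

The decoder mirrors the encoder sketch: channels are expanded back (4 → 8) and the spatial size is restored (6×6 → 16×16) before the second backbone network model analyzes the reconstruction.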
  • a communication device which may be a first communication device, or a device (for example, a chip, or a chip system, or a circuit) located in the first communication device, or a device capable of communicating with the first communication device.
  • the device is used in cooperation with the first communication device.
  • the first communication device may be a terminal device or a network device.
  • the device has the function of implementing the method described in the first aspect and any possible design of the first aspect.
  • the functions described above may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • Hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • the device may include a transceiver module and a processing module. Exemplarily:
  • the processing module is used to process the input signal by using the first backbone network model to obtain the fusion feature, where the fusion feature is obtained by fusing multiple first features, and the multiple first features are obtained by feature extraction on the input signal; the processing module is also used to compress and channel-code the fused feature to obtain the first information; the transceiver module is used to send the first information to the second communication device.
  • the processing module, when using the first backbone network model to process the input signal to obtain the fusion feature, is used to: perform feature extraction on the input signal to obtain multiple second features, where the multiple second features have different feature dimensions; process the feature dimensions of the multiple second features to obtain multiple first features with the same feature dimension; and perform feature fusion on the multiple first features to obtain the fusion feature.
  • the processing module, when processing the feature dimensions of the multiple second features to obtain multiple first features with the same feature dimension, is configured to: perform a first convolution operation and an upsampling operation to obtain the multiple first features with the same feature dimension.
  • the processing module, when performing feature fusion on the multiple first features to obtain the fused feature, is configured to: add the multiple first features to obtain the third feature.
  • the processing module is further configured to: perform a second convolution operation on the third feature to obtain the fusion feature.
  • the processing module, when performing compression and channel protection processing on the fusion feature, is used to: use the source-channel joint coding model to perform downsampling and a third convolution operation on the fusion feature to obtain the fourth feature;
  • the source-channel joint coding model is trained based on channel noise.
  • the processing module, when performing compression and channel protection processing on the fused feature, is also used to: use the source-channel joint coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization, a parameterized rectified linear unit, or power normalization.
  • a communication device may be a second communication device, or a device (for example, a chip, or a chip system, or a circuit) located in the second communication device, or a device capable of communicating with the second communication device.
  • the device is used in cooperation with the second communication device.
  • the second communication device may be a terminal device or a network device.
  • the device has the function of realizing the above-mentioned second aspect and the method in any possible design of the second aspect.
  • the functions may be implemented by hardware, or may be implemented by executing corresponding software through hardware.
  • Hardware or software includes one or more modules corresponding to the above-mentioned functions.
  • the device may include a transceiver module and a processing module.
  • a transceiver module, used to receive the second information from the first communication device.
  • a processing module, used to: decompress and channel-decode the second information to obtain the reconstructed feature of the fusion feature, where the fusion feature is obtained by fusing multiple first features obtained by feature extraction; use the second backbone network model in the multi-task network model to perform feature analysis on the reconstructed feature to obtain the feature analysis result; and use the functional network model in the multi-task network model to process the feature analysis result.
  • the result of feature analysis includes X features; the first of the X features is the reconstructed feature, and the (i+1)th of the X features is obtained by computation from the ith feature; the first Y of the X features are obtained through the first operation, and the remaining (X-Y) features are obtained through the second operation, where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X; the convolution operation in the first operation has multiple receptive fields, and the convolution operation in the second operation has one receptive field.
  • the height of the (i+1)th feature is 1/2 of the height of the ith feature; the width of the (i+1)th feature is 1/2 of the width of the ith feature; and the number of channels of the (i+1)th feature is the same as the number of channels of the ith feature.
  • the processing module, when decompressing and channel-decoding the second information, is configured to: use the source-channel joint decoding model to perform the following operations on the second information: a fourth convolution operation, an upsampling operation, and a fifth convolution operation.
  • the processing module, when decompressing and channel-decoding the second information, is further configured to perform one or more of the following operations: inverse generalized divisive normalization, a parameterized rectified linear unit, batch normalization, or a rectified linear unit.
  • the embodiment of the present application provides a communication device, the device includes a communication interface and a processor, and the communication interface is used for the device to communicate with other devices, such as sending and receiving data or signals.
  • the communication interface may be a transceiver, a circuit, a bus, a module or other types of communication interfaces, and other devices may be other communication devices.
  • the processor is used to call a set of programs, instructions, or data, and execute the method described in the above-mentioned first aspect or each possible design of the first aspect; or execute the method described in the above-mentioned second aspect or each possible design of the second aspect.
  • the device may also include a memory for storing programs, instructions or data invoked by the processor.
  • the memory is coupled to the processor, and when the processor executes the instructions or data stored in the memory, it can implement the method described in the above-mentioned first aspect or each possible design of the first aspect, or the method described in the above-mentioned second aspect or each possible design of the second aspect.
  • the embodiment of the present application provides a communication device, the device includes a communication interface and a processor, and the communication interface is used for the device to communicate with other devices, such as sending and receiving data or signals.
  • the communication interface may be a transceiver, a circuit, a bus, a module or other types of communication interfaces, and other devices may be other communication devices.
  • the processor is used for invoking a set of programs, instructions or data to execute the method described in the above second aspect or each possible design of the second aspect.
  • the device may also include a memory for storing programs, instructions or data invoked by the processor. The memory is coupled to the processor, and when the processor executes the instructions or data stored in the memory, it can implement the method described in the above second aspect or each possible design of the second aspect.
  • the embodiments of the present application also provide a computer-readable storage medium, where computer-readable instructions are stored in the computer-readable storage medium, and when the computer-readable instructions run on a computer, the methods described in the above aspects or in each possible design of the aspects are carried out.
  • the embodiment of the present application provides a system-on-a-chip, which includes a processor and may further include a memory, configured to implement the method described in the above-mentioned first aspect or each possible design of the first aspect.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • the embodiment of the present application provides a system-on-a-chip, which includes a processor and may further include a memory, configured to implement the method described in the above-mentioned second aspect or each possible design of the second aspect.
  • the system-on-a-chip may consist of chips, or may include chips and other discrete devices.
  • a computer program product containing instructions, which, when run on a computer, cause the method described in the above aspects or each possible design of each aspect to be executed.
  • Fig. 1 is a schematic diagram of the system architecture in the embodiment of the present application.
  • FIG. 2 is a schematic diagram of a communication system architecture in an embodiment of the present application.
  • Fig. 3 is the schematic diagram of neural network model structure in the embodiment of the present application.
  • Figure 4a is one of the schematic diagrams of the multi-task network model in the embodiment of the present application.
  • Fig. 4b is the second schematic diagram of the multi-task network model in the embodiment of the present application.
  • Fig. 4c is the third schematic diagram of the multi-task network model in the embodiment of the present application.
  • Fig. 4d is a schematic diagram of a computer vision task network model in an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a communication method based on a multi-task network model in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of the process of a communication method based on a multi-task network model in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the process of feature fusion in the embodiment of the present application.
  • FIG. 8 is a schematic diagram of the process of source-channel joint encoding and decoding in the embodiment of the present application.
  • Fig. 9a is a schematic diagram of the first operation in the embodiment of the present application.
  • Fig. 9b is a schematic diagram of the second operation in the embodiment of the present application.
  • Fig. 10 is a schematic diagram of the process of feature analysis in the embodiment of the present application.
  • Figure 11a is one of the performance comparison diagrams of the network model in the embodiment of the present application.
  • Figure 11b is the second performance comparison diagram of the network model in the embodiment of the present application.
  • FIG. 12 is one of the structural schematic diagrams of the communication device in the embodiment of the present application.
  • Fig. 13 is the second structural diagram of the communication device in the embodiment of the present application.
  • FIG. 14 is the third schematic diagram of the structure of the communication device in the embodiment of the present application.
  • the present application provides a communication method and device based on a multi-task network model, in order to better realize device-cloud collaboration in the MTL mode.
  • the method and the device are based on the same technical conception; since the principles by which they solve the problem are similar, the implementations of the device and the method can refer to each other, and repeated descriptions are not provided again.
  • the communication method based on the multi-task network model provided by the embodiment of the present application can be applied to a 5G communication system, such as a 5G new radio (NR) system, and to various application scenarios of a 5G communication system, such as enhanced mobile broadband (eMBB), ultra-reliable low-latency communication (URLLC), and enhanced machine-type communication (eMTC).
  • the communication method based on the multi-task network model provided in the embodiment of the present application can also be applied to various communication systems that evolve in the future, such as a sixth generation (6G) communication system or an air-space-sea-ground integrated communication system.
  • the communication method based on the multi-task network model provided by the embodiment of the present application can also be applied to communication between base stations, communication between terminal devices, the Internet of Vehicles, the Internet of Things, the industrial Internet, satellite communication, and so on; for example, it can be applied to device-to-device (D2D), vehicle-to-everything (V2X), and machine-to-machine (M2M) communication systems.
  • FIG. 1 shows a system architecture applicable to this embodiment of the present application, including a first communication device 101 and a second communication device 102 .
  • the first communication device 101 and the second communication device 102 are the two ends that cooperatively run the multi-task network model, and each end may take the form of a network device, a terminal device, a cloud computing node, an edge server, mobile edge computing (MEC), or other computing power.
  • the first communication device 101 and the second communication device 102 may be connected in a wired or wireless manner.
  • the first communication device 101 and the second communication device 102 may be any two ends capable of running a multi-task network model.
  • when the first communication device 101 is a terminal device, the second communication device 102 may be a cloud computing node, a network device, an edge server, an MEC node, or other computing power.
  • when the first communication device 101 is a cloud computing node, a network device, an edge server, an MEC node, or other computing power, the second communication device 102 may be a terminal device.
  • the first communication device 101 may be the above-mentioned device, or may be a component of the above-mentioned device (for example, a processor, a chip, or a chip system, etc.), or may be a device that matches the above-mentioned device.
  • the second communication device 102 may be the above-mentioned device, or a component of the above-mentioned device (for example, a processor, a chip, or a chip system, etc.), or a device that matches the above-mentioned device.
  • the embodiment of the present application is applicable to the scenario of cooperatively running a multi-task network model (hereinafter referred to as the cooperative-running scenario), where the two ends that cooperatively run the model can be any two ends. For example, when applied to terminal-cloud collaboration, the two ends are referred to as the terminal and the cloud respectively.
  • the terminal can be a terminal device or a component in a terminal device (such as a processor, a chip, or a chip system).
  • the cloud can be a network device, a cloud computing node, an edge server, an MEC node, or other computing power, and can also take the form of software with computing capability.
  • the communication method based on the multi-task network model is suitable for a communication system architecture when the two ends of the cooperatively running multi-task network model are terminal equipment and network equipment.
  • the communication system architecture includes a network device 201 and a terminal device 202 .
  • the network device 201 provides wireless access for one or more terminal devices 202 within the coverage of the network device 201 . There may be areas of overlapping coverage between network devices. Network devices can also communicate with each other.
  • the network device 201 provides services for the terminal device 202 within the coverage.
  • the network device 201 is a node in a radio access network (radio access network, RAN), and may also be called a base station, and may also be called a RAN node (or device).
  • examples of the network device 201 include: a next generation NodeB (gNB), a next generation evolved NodeB (ng-eNB), a transmission reception point (TRP), an evolved NodeB (eNB), a radio network controller (RNC), a NodeB (NB), a base station controller (BSC), a base transceiver station (BTS), a home base station (for example, a home evolved NodeB or home NodeB, HNB), a baseband unit (BBU), or a wireless fidelity (Wi-Fi) access point (AP).
  • the network device 201 may also be a satellite,
  • the network device 201 can also be another device having network device functions; for example, the network device 201 can be a device that functions as a network device in device-to-device (D2D) communication, the Internet of Vehicles, or machine-to-machine (M2M) communication.
  • the network device 201 may also be any possible network device in the future communication system.
  • the network device 201 may include a centralized unit (CU) and a distributed unit (DU).
  • the network device may also include an active antenna unit (AAU).
  • the CU and the DU each implement some of the functions of the network device.
  • the CU is responsible for processing non-real-time protocols and services, implementing the functions of the radio resource control (RRC) layer and the packet data convergence protocol (PDCP) layer.
  • the DU is responsible for processing physical layer protocols and real-time services, implementing the functions of the radio link control (RLC) layer, the media access control (MAC) layer, and the physical (PHY) layer.
  • the AAU implements some physical layer processing functions, radio frequency processing and related functions of active antennas.
  • the network device may be a device including one or more of a CU node, a DU node, and an AAU node.
  • the CU may be classified as a network device in the radio access network (RAN), or may be classified as a network device in the core network (CN), which is not limited in this application.
  • Terminal equipment 202, also referred to as user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on, is a device that provides voice and/or data connectivity to users.
  • the terminal device 202 includes a handheld device with a wireless connection function, a vehicle-mounted device, and the like. If the terminal device 202 is located on a vehicle (for example, placed or installed in the vehicle), it can be considered a vehicle-mounted device; a vehicle-mounted device is also called an on-board unit (OBU).
  • the terminal device 202 can be: a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a mobile internet device (MID), a wearable device (such as a smart watch, smart bracelet, or pedometer), a vehicle-mounted device (such as one on an automobile, bicycle, electric vehicle, airplane, ship, train, or high-speed rail), a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a smart home device (such as a refrigerator, TV, air conditioner, or electricity meter), an intelligent robot, workshop equipment, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, or flight equipment (such as an intelligent robot, hot air balloon, drone, or airplane).
  • the terminal device 202 may also be another device having terminal device functions; for example, the terminal device 202 may be a device that functions as a terminal device in device-to-device (D2D) communication, the Internet of Vehicles, or machine-to-machine (M2M) communication.
  • the network device functioning as a terminal device can also be regarded as a terminal device.
  • the terminal device 202 may also be a wearable device.
  • Wearable devices can also be called wearable smart devices or smart wearable devices, a general term for everyday wearables, such as glasses, gloves, watches, clothing, and shoes, designed and developed by applying wearable technology.
  • a wearable device is a portable device worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not only a piece of hardware; it also achieves powerful functions through software support, data interaction, and cloud interaction.
  • broadly, wearable smart devices include full-featured, large-sized devices that can realize complete or partial functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus only on a certain type of application function and need to be used together with other devices such as smartphones, for example various smart bracelets, smart helmets, and smart jewelry for vital-sign monitoring.
  • the apparatus for realizing the functions of the terminal device 202 may be, for example, a chip, a wireless transceiver, or a chip system, and may be installed, set, or deployed in the terminal device 202.
  • Network models can also be called neural network models, artificial neural network (ANN) models, neural network (NN) models, or connectionist models.
  • the application of neural network models can realize artificial intelligence (AI) technology.
  • the neural network model is a typical representative of AI models.
  • the neural network model is a mathematical calculation model that imitates the behavioral characteristics of the human brain neural network and performs distributed parallel information processing. Its main task is to learn from the principles of the human brain neural network, build a practical artificial neural network according to application requirements, realize the design of learning algorithms suitable for application requirements, simulate the intelligent activities of the human brain, and then solve practical problems technically.
  • the neural network relies on the complexity of the network structure, and realizes the design of the corresponding learning algorithm by adjusting the interconnection relationship between a large number of internal nodes.
  • a neural network model can include multiple neural network layers with different functions, and each layer includes parameters and calculation formulas. According to their different calculation formulas or different functions, the layers in the neural network model have different names; for example, the layer that performs convolution calculation is called a convolutional layer, and a convolutional layer is often used to perform feature extraction on an input signal (such as an image).
  • a neural network model can also be composed of multiple existing neural network sub-models. Neural network models with different structures can be used in different scenarios (such as classification, recognition) or provide different effects when used in the same scenario.
  • the different structures of neural network models are mainly reflected in one or more of the following: the number of network layers, the order of the network layers, and the weights, parameters, or calculation formulas in each network layer.
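As noted above, a convolutional layer is often used to extract features from an input signal such as an image. A minimal NumPy sketch of that operation follows; the image contents and the edge-detecting kernel are invented for illustration and are not part of the embodiment:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution: slide the kernel over the image and
    take the element-wise product sum at each position."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy "image" containing a vertical edge, and a kernel that responds to it.
img = np.array([[0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.]])
edge_kernel = np.array([[1., -1.],
                        [1., -1.]])
feature_map = conv2d(img, edge_kernel)  # strong response only at the edge
```

The feature map is largest in magnitude where the edge lies, which is the sense in which a convolutional layer "extracts features" from its input.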
  • the neural network can be composed of neural units, where a neural unit can refer to an operation unit that takes the inputs x_s and an intercept of 1, and the output of the operation unit can be shown in formula (1): h = f(Σ_s W_s·x_s + b)  (1)
  • W_s is the weight of x_s.
  • b is the bias of the neuron unit.
  • f is the activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network, converting the input signal of the neural unit into an output signal.
  • the output signal of the activation function can be used as the input of the next convolutional layer, and the activation function can be a sigmoid function, a ReLU function, a tanh function, and the like.
  • a neural network is a network formed by connecting multiple above-mentioned single neural units, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field.
  • the local receptive field can be an area composed of several neural units.
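The neural-unit computation of formula (1), f(Σ_s W_s·x_s + b), can be sketched as follows; the input values, weights, and bias are arbitrary example numbers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def neural_unit(x, w, b, f):
    """Output of a single neural unit: f(sum_s W_s * x_s + b)."""
    return f(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # inputs x_s
w = np.array([0.1, 0.2, 0.3])    # weights W_s
b = 0.05                         # bias

y_sigmoid = neural_unit(x, w, b, sigmoid)
y_relu = neural_unit(x, w, b, relu)
```

Swapping `f` between sigmoid, ReLU, or tanh changes only the nonlinearity applied to the same weighted sum, matching the description above.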
  • A multilayer perceptron (MLP) is one kind of feed-forward neural network model.
  • MLP includes a variety of network layers with different functions, which are: an input layer, an output layer, and one or more hidden layers.
  • One or more hidden layers are located between the input layer and the output layer, and the number of hidden layers in the MLP can be determined according to application requirements.
  • information is transmitted in one direction, that is, information moves forward from the input layer, then passes through one or more hidden layers layer by layer, and then passes from the last hidden layer to the output layer.
  • Fig. 3 illustrates an example of a neural network model structure.
  • the input layer includes a plurality of neurons; the neurons in the input layer are also called input nodes, which receive the input vector from the outside and pass it to the hidden neurons in the next layer. Input nodes do not perform calculation operations.
  • the hidden layer includes multiple neurons.
  • the neurons in the hidden layer are also called hidden nodes.
  • the hidden nodes extract the features of the input vector fed to the hidden layer, and pass these features to the neurons in the next layer.
  • the way a hidden node extracts features is: according to the output vectors of the neurons in the previous layer, the weights of the connections between the hidden node and those neurons, and the input-output relationship of the hidden node, determine the output vector of that hidden node.
  • the previous layer refers to the network layer that provides the input information to the hidden layer where the hidden node is located.
  • the next layer refers to the network layer that receives the output information from the hidden layer where the hidden node is located.
  • the output layer includes one or more neurons, and the neurons in the output layer are also called output nodes.
  • the output node determines its output vector according to the input-output relationship of the output node, the output vectors of the hidden nodes connected to it, and the weights between those hidden nodes and the output node, and transmits the output vector to the outside.
  • the adjacent layers of the multilayer perceptron are fully connected, that is, for any two adjacent layers, every neuron in the previous layer is connected to all neurons in the next layer, and each connection between neurons in adjacent layers is configured with a weight.
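The one-directional forward propagation described above (input layer → hidden layers → output layer, with fully connected weighted layers) can be sketched as follows; all layer sizes and the random weights are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(0.0, z)

def dense(x, W, b, activation=None):
    """One fully connected layer: every neuron in the previous layer
    feeds every neuron in the next layer through a weight."""
    z = x @ W + b
    return activation(z) if activation else z

# A 4-dim input, two hidden layers of 8 neurons, and a 3-dim output.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(8, 3)), np.zeros(3)

x = rng.normal(size=(1, 4))   # the input layer only passes the vector in
h1 = dense(x, W1, b1, relu)   # hidden layer 1
h2 = dense(h1, W2, b2, relu)  # hidden layer 2
y = dense(h2, W3, b3)         # output layer
```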
  • the multi-task network model is a network model used in the MTL mode. Compared with the single-task network model used in the STL mode, the multi-task network model can perform multiple tasks.
  • the multi-task network model may include multiple sub-network models.
  • the multi-task network model includes N sub-network models, and each sub-network model can be regarded as a neural network model introduced in point 1).
  • the N sub-network models included in the multi-task network model can be divided into a backbone network model and a functional network model.
  • each functional network model can be responsible for one task, so multiple functional network models can be responsible for multiple tasks; the multiple tasks can be related, and the multiple functional network models can share the backbone network model.
  • the backbone network model can be used to extract features; for example, network models such as the residual network (ResNet), the visual geometry group network (VGG), MobileNet, the Google inception network (GoogLeNet), or AlexNet have the ability to extract features, so they can be used as backbone network models.
  • the functional network models can be responsible for the remaining functions.
  • there can also be only one model type in the multi-task network model, namely the backbone network model; it can then be considered that the multi-task network model includes one sub-network model, or that the multi-task network model cannot be split into multiple sub-network models.
  • in this case, the single backbone network model in the multi-task network model performs several related tasks.
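The structure described above, in which multiple functional network models share one backbone network model, can be sketched as follows; the feature sizes, the two task heads, and the random weights are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(0.0, z)

# Shared backbone: one feature extractor reused by every functional model.
Wb = rng.normal(size=(16, 32))

def backbone(x):
    return relu(x @ Wb)

# Two related task heads (e.g. detection and segmentation) consume the
# same backbone features instead of each extracting their own.
W_det = rng.normal(size=(32, 5))
W_seg = rng.normal(size=(32, 10))

x = rng.normal(size=(1, 16))
features = backbone(x)        # computed once
det_out = features @ W_det    # functional model 1 uses the shared features
seg_out = features @ W_seg    # functional model 2 uses the same features
```

The point of the sharing is visible in the code: `backbone(x)` runs once, and both heads read the same `features`.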
  • CNN is widely used in the field of computer vision. For example, tasks such as detection, tracking, recognition, classification or prediction can be solved by using CNN to establish corresponding network models.
  • the following illustrates, with several tasks, a multi-task network model to which CNN is applied.
  • the multi-task network model is used for the application of image classification and segmentation, specifically the mask region-based convolutional neural network (Mask R-CNN). The multi-task network model includes 5 sub-network models: 1 backbone network model and 4 functional network models.
  • the five sub-network models are ResNet, FPN, RPN, Classifier-NET, and Mask-NET.
  • ResNet is the backbone network model as a feature extractor.
  • FPN, RPN, Classifier-NET, and Mask-NET are functional network models.
  • FPN is used to extend the backbone network to better characterize objects at multiple scales.
  • the RPN determines the regions of interest, Classifier-NET classifies objects, and Mask-NET segments objects.
  • in another example, the multi-task network model is Mask R-CNN applied to image classification and segmentation.
  • the multi-task network model includes 6 sub-network models, including 2 backbone network models and 4 functional network models.
  • the two backbone network models are the first ResNet and the second ResNet, and the four functional network models are FPN, RPN, Classifier-NET, and Mask-NET.
  • the first ResNet is used for feature extraction
  • the second ResNet is used for further feature extraction of the result after the feature extraction of the first ResNet.
  • FPN is used to extend the backbone network to better characterize objects at multiple scales.
  • the RPN determines the region of interest.
  • Classifier-NET classifies objects. Mask-NET segments objects.
  • the multi-task network model is applied to computer vision tasks, in which target detection and semantic segmentation are two closely related tasks, both of which identify and classify objects in images.
  • the two tasks of target detection and semantic segmentation can be performed respectively through two functional network models.
  • the detection (detect) functional network model is used to perform the task of target detection
  • the segmentation (segment) functional network model is used to perform the task of semantic segmentation.
  • the backbone network model can include a ResNet50 model with multiple CNN layers and residual connections.
  • each rectangular bar represents an intermediate feature, and a horizontal line without an arrow represents a residual connection; multiple intermediate features used for feature extraction are transferred through residual connections to the rhombus at the back end for calculation.
  • the rhombus represents an operation that fuses intermediate features of different scales, located at different positions in the backbone network, through a series of sampling operations and convolution operations to obtain new features.
  • the line with an arrow represents outputting the processing result of the operation represented by the rhombus to the detection (detect) functional network model and the segmentation (segment) functional network model.
  • the backbone network model in the CNN multi-task network model has a complex structure and a large number of parameters. If the multi-task network model is applied to a cooperative-running scenario (such as device-cloud collaboration), part of the multi-task network model needs to run at one of the two cooperating ends, and the other part at the other end; it can be understood that the multi-task network model needs to be cut into two parts. However, as can be seen from Figure 4d, the structure of the backbone network model of the CNN multi-task network model is complex, and there is no clear cutting point for dividing the multi-task network model into two parts.
  • the CNN multi-task network model has a large number of parameters, and the cooperative-running scenario also needs to transmit intermediate features from one end to the other; the dimensions of the intermediate features to be transmitted are not only large but also redundant. To sum up, how to realize the cooperative running of a multi-task network model is a problem that needs to be solved.
  • the multi-task network model may include a first backbone network model, a second backbone network model and a functional network model.
  • the first backbone network model runs on the first communication device, and the second backbone network model and the functional network model run on the second communication device.
  • the first backbone network model and the second backbone network model may be of the same model type, for example, the first backbone network model and the second backbone network model are ResNet.
  • the first backbone network model and the second backbone network model may be two parts of one backbone network model, or may be considered as two independent backbone network models.
  • the number of the first backbone network model and the second backbone network model may be single or multiple.
  • the first communication device uses the first backbone network model to process the input signal to obtain fusion features.
  • the fused feature is obtained by fusing multiple first features, and the multiple first features are obtained by feature extraction of the input signal.
  • the first communication device performs compression and channel coding on the fusion feature to obtain first information.
  • the first communication device sends the first information to the second communication device.
  • the second communication device receives the second information of the first communication device.
  • if the first information is affected by channel noise during channel transmission, the second information is the information after the first information is affected by the noise.
  • if the first information is not affected by noise, the second information is the same as the first information.
  • the second communication device performs decompression and channel decoding on the second information to obtain a reconstructed feature of the fused feature.
  • the fused feature is obtained by fusing a plurality of first features obtained by performing feature extraction on the input signal.
  • the second communication device performs feature analysis on the reconstructed feature by using the second backbone network model, and obtains a feature analysis result.
  • the second communication device uses the functional network model to process the result of feature analysis.
  • the multi-task network model includes a first backbone network model, a second backbone network model, and a functional network model.
  • the first communication device and the second communication device cooperate to run the multi-task network model.
  • after the input signal is input to the multi-task network model, the final output is the processing result obtained after the functional network model processes the result of the feature analysis.
  • the obtained fusion features can contain more information, which can make the second communication device more accurate when running its part of the network model based on the fusion features.
  • Generating fusion features in the feature extraction stage makes the structure of the multi-task network model clearer, which is more conducive to dividing the multi-task network model into the part executed by the first communication device and the part executed by the second communication device, and thus to realizing device-cloud collaboration in the MTL mode.
  • compression reduces the parameters transmitted between the first communication device and the second communication device, reducing the transmission overhead, and channel coding makes the data transmitted between the first communication device and the second communication device more resistant to noise.
  • the process of the communication method based on the multi-task network model is illustrated below through FIG. 6 .
  • the first communication device and the second communication device cooperate to run a multi-task network model.
  • the first communication device uses the first backbone network model to process the input signal, which may be an image, to obtain a fusion feature F, where the fusion feature F is a three-dimensional tensor.
  • the first communication device compresses and channel-codes the fusion feature F to obtain first information D.
  • the dimension of the first information D is smaller than the dimension of the fusion feature F, that is, the number of elements of D is smaller than the number of elements of F.
  • the first communication device sends the first information D, and the first information D is transmitted through a wireless channel, and the second communication device receives the second information D' after D has undergone channel interference.
  • the second communication device performs decompression and channel decoding on the second information D' to obtain the reconstructed feature F' of F.
  • the second communication device uses the second backbone network model to perform feature analysis on the reconstructed feature F' to obtain a feature analysis result.
  • the second communication device inputs the result of feature analysis to multiple functional network models for processing, and the multiple functional network models share the result of feature analysis.
  • the N functional network models shown in Figure 6 correspond to N tasks respectively.
  • the second communication device uses the N functional network models to process the results of the feature analysis respectively, and the N functional network models respectively process the results of the feature analysis to obtain the processing results of the N tasks.
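The end-to-end flow of Fig. 6 (fusion feature F → compression and channel coding → first information D → channel noise → second information D' → decompression and decoding → reconstructed feature F') can be sketched as follows. The linear projection is only a stand-in for the actual learned source-channel codec, and all tensor sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# --- first communication device (e.g. the terminal) ---
F = rng.normal(size=(8, 8, 4))   # fusion feature F: a 3-D tensor (H, W, C)

# Compression sketched as a fixed linear projection to fewer elements.
P = rng.normal(size=(F.size, 64)) / np.sqrt(F.size)

def compress_and_encode(F):
    """First information D: fewer elements than the fusion feature F."""
    return F.reshape(-1) @ P

def decode_and_decompress(D_recv):
    """Lossy pseudo-inverse reconstruction standing in for the decoder."""
    return (D_recv @ np.linalg.pinv(P)).reshape(F.shape)

D = compress_and_encode(F)
assert D.size < F.size           # dimension of D is smaller than that of F

# --- wireless channel: additive noise turns D into D' ---
D_prime = D + 0.01 * rng.normal(size=D.shape)

# --- second communication device (e.g. the cloud) ---
F_prime = decode_and_decompress(D_prime)   # reconstructed feature F'
```

A real system would train the encoder and decoder jointly with the task heads; this sketch only mirrors the shape of the data flow.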
  • the task performed by the multi-task network model can be model training or model inference.
  • the first communication device uses the first backbone network model to process the input signal to obtain fusion features.
  • the first communication device performs feature extraction on the input signal to obtain multiple second features, where the multiple second features have different feature dimensions.
  • the first communication device processes the feature dimensions of the multiple second features to obtain multiple first features with the same feature dimension; the multiple first features correspond to the multiple second features respectively, that is, processing the feature dimension of one second feature yields the first feature corresponding to that second feature.
  • the first communication device performs feature fusion on the plurality of first features to obtain a fusion feature.
  • Feature dimensions can include height, width, and number of channels.
  • named features such as the first feature, the second feature, and the third feature refer to intermediate features, that is, processed data output by a certain layer during the intermediate processing of the multi-task network model.
  • the processing may be to perform a convolution operation and an upsampling operation on the plurality of second features, where the convolution operation is denoted as the first convolution operation.
  • the sequence of the first convolution operation and the upsampling operation is not limited.
  • the first convolution operation can change the number of channels of the feature, and the upsampling operation can change the height and width of the feature.
  • the upsampling operation can change the height and width of the first feature to any value, not limited to integer multiples; for example, it can expand the height to 8 times the original or more, and the expansion factor can be any multiple.
  • in contrast, using a deconvolution operation to change the height and width of features can only achieve an expansion factor of 2, and the expansion factor must be an integer.
  • the upsampling operation is therefore more flexible in the factor by which it expands the dimensions.
  • an upsampling operation may also be referred to as an interpolation operation, whose purpose is to enlarge the height and/or width of a feature.
  • the process of the upsampling or interpolation operation can be: rescale the input feature to the target size, calculate the feature value at each sampling point, and interpolate the remaining points using interpolation methods such as bilinear interpolation.
  • interpolation uses mathematical formulas to calculate missing values on the basis of the surrounding adjacent feature values, and inserts the calculated values.
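The interpolation-based upsampling described above, which rescales a feature to an arbitrary (non-integer-multiple) target size, can be sketched with bilinear interpolation; the feature values are example numbers:

```python
import numpy as np

def bilinear_upsample(feat, out_h, out_w):
    """Rescale a 2-D feature map to any target size: map each output
    point back into the input grid and bilinearly interpolate it from
    its four surrounding neighbours."""
    in_h, in_w = feat.shape
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # position of this output sample in input coordinates
            y = i * (in_h - 1) / max(out_h - 1, 1)
            x = j * (in_w - 1) / max(out_w - 1, 1)
            y0, x0 = int(np.floor(y)), int(np.floor(x))
            y1, x1 = min(y0 + 1, in_h - 1), min(x0 + 1, in_w - 1)
            dy, dx = y - y0, x - x0
            out[i, j] = (feat[y0, x0] * (1 - dy) * (1 - dx)
                         + feat[y0, x1] * (1 - dy) * dx
                         + feat[y1, x0] * dy * (1 - dx)
                         + feat[y1, x1] * dy * dx)
    return out

small = np.array([[0.0, 1.0],
                  [2.0, 3.0]])
big = bilinear_upsample(small, 3, 5)  # 1.5x height, 2.5x width: any factor works
```

Unlike a stride-2 deconvolution, the target size here is arbitrary, which is the flexibility the text attributes to upsampling.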
  • a fusion feature is obtained by performing feature fusion on a plurality of first features, and an addition operation may be performed on a plurality of first features to obtain a fusion feature. Or add a plurality of first features to obtain a third feature, and then perform a second convolution operation on the third feature to obtain a fusion feature.
  • adding a plurality of first features may add elements at the same position of the plurality of first features respectively.
  • the intermediate features may be three-dimensional data with height, width and channel number, and the same position of multiple first features may refer to the same height, same width and same channel position of multiple first features.
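A minimal sketch of this same-position addition, assuming features stored as height x width x channel nested lists (the function name is illustrative, not from the patent):

```python
def add_features(features):
    """Element-wise sum of several first features with identical
    (height, width, channel) dimensions: values at the same height,
    same width and same channel position are added together."""
    h = len(features[0])
    w = len(features[0][0])
    c = len(features[0][0][0])
    return [[[sum(f[i][j][k] for f in features) for k in range(c)]
             for j in range(w)]
            for i in range(h)]
```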
  • the following describes a process in which the first communication device fuses multiple first features to obtain a fusion feature in conjunction with specific application scenarios.
  • the first backbone network model is Residual Network50 (ResNet50), and ResNet50 is a residual network with 50 convolutional layers.
  • the first communication device uses ResNet50 to perform feature extraction on the input signal to obtain a plurality of second features with different feature dimensions, and the different feature dimensions may refer to different heights, widths, and numbers of channels.
  • a plurality of second features are recorded as Res1, Res2, Res3, Res4 and Res5, and the feature dimensions of Res1, Res2, Res3, Res4 and Res5 are respectively recorded as (H1,W1,C1), (H2,W2,C2), (H3,W3,C3), (H4,W4,C4), (H5,W5,C5), where H means height, W means width, and C means number of channels.
  • the size relationship of the feature dimensions is: H1>H2>H3>H4>H5, W1>W2>W3>W4>W5, C1<C2<C3<C4<C5, and H1*W1*C1>H2*W2*C2>H3*W3*C3>H4*W4*C4>H5*W5*C5.
  • the first communication device respectively performs a 1*1 convolution operation on Res2, Res3, Res4, and Res5 to change their numbers of feature channels, and uses an upsampling operation to change their heights and widths, so that Res2, Res3, Res4 and Res5 are unified into the same feature dimension, recorded as (H, W, C); that is, Res2, Res3, Res4 and Res5 become four sets of first features of dimension (H, W, C).
  • if the dimensions of one or more features among Res2, Res3, Res4, and Res5 are already the same as the unified target dimension, those features do not need the convolution and upsampling operations; for example, assuming the dimension of Res2 is already the target dimension, only Res3, Res4 and Res5 need to undergo the convolution operation and upsampling.
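The 1*1 convolution that changes only the channel count can be sketched as a per-position linear combination of the input channels (a simplified illustration with a given weight matrix and no bias; the names are assumptions of this sketch):

```python
def conv1x1(feature, weights):
    """1*1 convolution: at every spatial position, each output channel
    is a linear combination of the input channels, so the channel count
    can be changed while the height and width stay the same.

    feature : H x W x C_in nested lists
    weights : C_out x C_in matrix
    """
    return [[[sum(w_row[c] * pixel[c] for c in range(len(pixel)))
              for w_row in weights]          # one value per output channel
             for pixel in row]               # every spatial position
            for row in feature]
```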
  • the first communication device adds the four groups of first features element by element to obtain the third feature F0.
  • the first communication device performs a 3*3 convolution operation on the third feature F0 to obtain a fusion feature F1.
  • the convolution kernel used in the convolution operation on the third feature F0 can be changed to other sizes, generally an odd number, and the kernel size should be less than or equal to the width and/or height of F0.
  • the convolution operation is represented by conv in the figure.
  • the first communication device performs compression and channel coding on the fused features to obtain the first information.
  • Compression and channel coding can be understood as a joint source-channel coding (JSCC), which can be realized based on the joint source-channel coding model.
  • the joint source-channel coding model is trained based on channel noise.
  • the data processed by the source-channel joint coding model has anti-noise performance.
  • the first communication device inputs the fusion feature into the JSCC model, and outputs first information.
  • the first communication device performs down-sampling and a third convolution operation on the fusion feature by using the source-channel joint coding model to obtain the fourth feature.
  • downsampling may also be referred to as subsampling (subsampled).
  • Downsampling can make the downsampled features conform to the size of the display area.
  • Downsampling may also be able to generate thumbnails of the downsampled features.
  • the process of downsampling can be described as follows: for a feature whose height and width are M*N, downsampling by a factor of s yields a resolution of (M/s)*(N/s), where s is a common divisor of M and N.
  • the feature values in each s*s window of the original feature can be turned into one value, which is the mean of all feature values in the window.
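The windowed-mean downsampling just described can be sketched as follows (assuming s divides both dimensions, as the text requires; the function name is illustrative):

```python
def downsample(feature, s):
    """Downsample a 2-D M x N feature by a factor s (s divides M and N):
    every s*s window collapses to one value, the mean of the feature
    values inside the window."""
    m, n = len(feature), len(feature[0])
    assert m % s == 0 and n % s == 0
    return [[sum(feature[i * s + di][j * s + dj]
                 for di in range(s) for dj in range(s)) / (s * s)
             for j in range(n // s)]
            for i in range(m // s)]
```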
  • downsampling can reduce the fusion feature to any dimension, that is, the downsampling operation can reduce the fusion feature to any height and width, where the height and width after downsampling can be any positive integers.
  • downsampling can reduce the height and width of the fusion feature.
  • if a convolution operation were used to reduce the height and width of the fusion feature, the reduction multiple would be constrained by the size of the convolution kernel and could only be an integral fraction of the original dimension; using downsampling to reduce the dimensionality of the fusion feature is therefore more flexible.
  • the third convolution operation can reduce the number of feature channels for fusion features.
  • the third convolution operation may be a 3*3 convolution operation.
  • the convolution operation can adjust the ratio of the number of output and input channels according to specific needs, so as to achieve different compression factors.
  • the number of channels of the input feature of the third convolution operation can be any positive integer, and the third convolution operation can control the number of output channels so as to reduce the number of feature channels.
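Putting the spatial downsampling and the channel reduction of the third convolution together, the achievable compression factor follows from simple symbol counting; the concrete numbers below are assumptions for illustration (only the 64-to-1 channel reduction appears later in the text), not values fixed by the patent:

```python
# Illustrative fusion-feature dimension (H, W, C): downsampled by s in
# height and width, and reduced from c to c_out channels by the third
# convolution.  The compression factor is the ratio of symbol counts.
h, w, c = 64, 64, 64
s = 4          # downsampling factor (assumed)
c_out = 1      # output channels of the third convolution
compression = (h * w * c) / ((h // s) * (w // s) * c_out)
```

With these assumed numbers the symbol count shrinks from 262144 to 256, i.e. a 1024-fold compression, matching the order of magnitude quoted later in the text.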
  • the first communication device may also use the source-channel joint coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization (GDN), parametric rectified linear unit (PReLU), or power normalization.
  • the GDN operation can be used to improve the compression capability of the source-channel joint coding model.
  • PReLU can also be used to improve the compression capability of the source-channel joint coding model.
  • a further compression effect on the fourth feature can be achieved through the GDN operation and the PReLU operation.
  • Power normalization can make the power of the compressed result be 1.
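A sketch of power normalization under the common convention that "power" means the mean squared value (an assumption of this sketch; the patent only states that the power of the compressed result is 1):

```python
import math

def power_normalize(x):
    """Scale a feature vector so that its average power
    (mean of squared values) equals 1.  Assumes x is not all zeros."""
    power = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(power)
    return [v * scale for v in x]
```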
  • the above operation of the source-channel joint coding model is only an example, and may be replaced by some other operations that can achieve the same effect in practical applications.
  • GDN operations can be replaced by batch normalization (BN).
  • PReLU can be replaced by a rectified linear unit (ReLU) or a fixed-parameter linear rectification unit (Leaky ReLU).
  • the second communication device performs decompression and channel decoding on the second information to obtain a reconstructed feature of the fused feature.
  • Decompression and channel decoding can be understood as a source-channel joint decoding, which can be realized based on the source-channel joint decoding model. That is, the second information is input into the source-channel joint decoding model, and the reconstructed feature of the fused feature is output.
  • the second communication device performs the following operations on the second information by using the source-channel joint decoding model: a first convolution operation, an upsampling operation, and a second convolution operation.
  • the spatial dimensions of the feature can be restored through the up-sampling operation. Since the first communication device uses the down-sampling operation to reduce the height and width of the fusion feature during compression encoding, and the reduction factor can be any value, the second communication device adopts the corresponding up-sampling operation to restore the height and width of the feature; the expansion multiple of upsampling is equally flexible, and the spatial dimensions of the feature can be restored using the multiple corresponding to the downsampling.
  • for example, if the original number of feature channels of the fusion feature is 64 and it is reduced to 1 through the third convolution operation, then the number of feature channels of the second information obtained by the second communication device is 1.
  • the second communication device performs decompression and channel decoding on the second information, which may also include one or more of the following operations: inverse generalized divisive normalization (IGDN), parametric rectified linear unit (PReLU), batch normalization (BN), or rectified linear unit (ReLU).
  • the IGDN can be used to improve the decompression capability or decoding capability of the source-channel joint decoding model.
  • PReLU can also be used to improve the decompression or decoding ability of the source-channel joint decoding model.
  • BN and/or ReLU can limit the value range of the decoding result, and can further increase the accuracy of the decoding result.
  • after decompression and channel decoding, the second communication device obtains the reconstructed feature of the fusion feature.
  • the reconstructed features and the fused features have the same dimension size.
  • the reconstructed feature is a reconstruction of the fusion feature.
  • IGDN can be replaced by BN.
  • the source-channel joint decoding process performed by the second communication device corresponds to the source-channel joint encoding process performed by the first communication device. That is, if GDN is used for encoding, then IGDN is used for decoding. If a BN is used on the encoding side, use the corresponding BN on the decoding side.
  • PReLU can be replaced by ReLU or Leaky ReLU.
  • the following uses an example to schematically illustrate the steps of source channel joint encoding and decoding based on FIG. 8 .
  • the first communication device performs the following operations on the fusion feature during encoding: downsampling, 3*3 convolution operation, GDN, PReLU, and power normalization, to obtain the first information; after transmission through the first channel, the second communication device obtains the second information.
  • the second communication device performs the following operations on the second information during decoding: 3*3 convolution operation, IGDN, PReLU, upsampling, 3*3 convolution operation, BN, and ReLU, to obtain the reconstructed feature of the fusion feature.
  • the second communication device needs to perform feature analysis on the reconstructed feature to obtain a feature analysis result.
  • the result of feature analysis includes X features: the first of the X features is the reconstructed feature, and the (i+1)-th of the X features is obtained by performing an operation on the i-th feature; the first Y of the X features are obtained through the first operation, and the last (X-Y) features are obtained through the second operation, where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X.
  • the height of the (i+1)-th feature is 1/2 of the height of the i-th feature, the width of the (i+1)-th feature is 1/2 of the width of the i-th feature, and the number of channels of the (i+1)-th feature is the same as that of the i-th feature.
  • the convolution operation in the first operation has multiple receptive fields, and the convolution operation in the second operation has one receptive field.
  • fusing the convolution results of different receptive fields is a means of feature fusion that extracts and fuses information from different angles, so that the result of the first operation contains more information than the result of the second operation, which is more conducive to improving the performance of the functional network models.
  • the convolution operation with the first convolution kernel in the second operation has a first receptive field
  • the convolution operation with the same first convolution kernel in the first operation has two kinds of receptive fields.
  • the receptive field includes a first receptive field and a second receptive field, and the second receptive field is larger than the first receptive field.
  • the second operation may be a bottleneck module (Bottleneck), and the first operation may be an expanded bottleneck module (Dilated Bottleneck).
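One standard way a single 3x3 kernel obtains the different receptive fields mentioned above is dilation, consistent with the Dilated Bottleneck naming; the patent does not fix the dilation values, so the relation below is a general sketch:

```python
def receptive_field_3x3(dilation):
    """Receptive field (per side) of one 3x3 convolution with the given
    dilation: the dilated kernel spans 2*dilation + 1 input positions,
    so the same 3x3 kernel can realize different receptive fields."""
    return 2 * dilation + 1
```

For example, dilation 1 gives the first (ordinary) receptive field and dilation 2 gives a second, larger one, matching the description that the second receptive field is larger than the first.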
  • the first operation may include the following operations: performing a 1×1 convolution operation (conv) on the feature to be processed among the first Y features; performing multiple 3×3 convolution operations separately, where the receptive field sizes of the multiple 3×3 convolution operations differ; splicing the results of the multiple 3×3 convolution operations in the channel-number dimension (concat); performing a 1×1 convolution operation on the splicing result to obtain the first convolution result; and adding the first convolution result to the feature to be processed element by element.
  • the feature to be processed among the first Y features is any one of the first Y features, and this operation can be performed on each of the first Y features.
  • BN and/or ReLU may be performed after the first 1 ⁇ 1 convolution operation, and BN and/or ReLU may be performed after performing multiple 3 ⁇ 3 convolution operations. BN can also be performed after the second 1 ⁇ 1 convolution operation. After element-wise addition of the first convolution result and the features to be processed, ReLU can also be performed.
  • the second operation may include the following operations: performing a 1×1 convolution operation on the feature to be processed among the last (X-Y) features of the X features; performing a 3×3 convolution operation; performing a 1×1 convolution operation on the result of the 3×3 convolution operation to obtain a second convolution result; and adding the second convolution result to the feature to be processed element by element.
  • the feature to be processed among the last (X-Y) features of the X features is any one of the last (X-Y) features, and this operation can be performed on each of the last (X-Y) features.
  • BN and/or ReLU may be performed after the first 1 ⁇ 1 convolution operation, and BN and/or ReLU may be performed after the 3 ⁇ 3 convolution operation. BN can also be performed after the second 1 ⁇ 1 convolution operation. After element-wise addition of the second convolution result and the feature to be processed, ReLU can also be performed.
  • the reconstructed feature is denoted by D1
  • the seven features include D1 as well as D2, D3, D4, D5, D6, and D7.
  • the feature dimension of D1 is the same as that of the fusion feature, for example, it is recorded as (H, W, C). H means height, W means width, and C means number of channels.
  • the feature dimensions of D1 to D7 are (H, W, C), (H/2, W/2, C), (H/4, W/4, C), (H/8, W/8, C), (H/16, W/16, C), (H/32, W/32, C), (H/64, W/64, C), and each feature is generated based on the previous one.
  • since the height and width of the features get smaller and smaller, the heights and widths of earlier features among the X features are larger than those of later features.
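The halving progression of the example features D1-D7 can be checked with a small sketch; the starting dimension (64, 64, 256) is an assumption for illustration, not a value from the patent:

```python
def analysis_dims(h, w, c, x):
    """Dimension progression of the X analysis features: each feature
    halves the height and width of the previous one and keeps the
    channel count (assumes h and w are divisible by 2**(x-1))."""
    dims = [(h, w, c)]
    for _ in range(x - 1):
        h, w = h // 2, w // 2
        dims.append((h, w, c))
    return dims
```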
  • the first Y features of the X features are obtained through the first operation, and the last (X-Y) features of the X features are obtained through the second operation.
  • the first operation is represented by a rhombus
  • the second operation is represented by a circle.
  • among the seven features, the first 3 are obtained through the first operation, and the last 4 are obtained through the second operation.
  • the seven features D1-D7 are output to the functional network models to be shared by the multiple functional network models of the multiple tasks; two tasks (task 1 and task 2) are used as examples in FIG. 10.
  • the multi-task network model needs model training before application, and the source-channel joint coding model and the source-channel joint decoding model also need model training before application.
  • the multi-task network model may include a source-channel joint coding model and a source-channel joint decoding model.
  • the multi-task network model, the source-channel joint coding model and the source-channel joint decoding model are mutually independent models.
  • the source-channel joint coding model and the source-channel joint decoding model are combined for training during training. The following describes the possible implementation process of model training.
  • Step 1: Generate a basic multi-task network model, which can be applied to cooperative operation scenarios, for example, device-cloud collaboration scenarios. The two devices operating in cooperation can still be denoted the first communication device and the second communication device.
  • Step 2 Initialize the network parameters of the multi-task network model.
  • the input training data is image pixel values standardized to the [0,1] interval; the features obtained after feature extraction and feature analysis are input into the functional network model corresponding to each task branch, which completes the corresponding task and outputs the result.
  • the loss is calculated from the output of each task branch and the label information in the training data, so as to realize end-to-end training of the multi-task network model.
  • the multi-task loss function is the sum of the loss functions of each task branch.
  • Repeat step 2 until the multi-task network model converges.
  • Step 3: Based on the converged multi-task network model, select a cut point to divide the network model into two parts, and add the source-channel joint coding model and the source-channel joint decoding model to simulate the process of compressing, transmitting, decompressing and reconstructing the intermediate features.
  • the source-channel joint coding model compresses the intermediate features, and the compression result passes through a channel model, such as an AWGN channel or a Rayleigh channel, to simulate the transmission process.
  • the loss function is: L_MTL + L1(F, F'), where L_MTL is the multi-task loss function and L1(F, F') is the L1-norm between the original intermediate feature F and the reconstructed intermediate feature F'.
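A sketch of this codec loss, with the features flattened to 1-D vectors and L_MTL passed in as a precomputed number (both simplifications of this sketch, not structure dictated by the patent):

```python
def codec_loss(l_mtl, f, f_rec):
    """Loss for training the codec models: the multi-task loss L_MTL
    plus the L1-norm between the original intermediate feature F and
    its reconstruction F' (here flattened to 1-D sequences)."""
    l1 = sum(abs(a - b) for a, b in zip(f, f_rec))
    return l_mtl + l1
```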
  • Repeat step 3 until the source-channel joint coding model and the source-channel joint decoding model converge.
  • Step 4: Based on the training results of step 3, the parameters of the multi-task network model are no longer fixed, and all parameters of the multi-task network model to which the source-channel joint coding model and the source-channel joint decoding model have been added are jointly trained end to end with the loss function L_MTL; repeat step 4 until the overall model converges.
  • the overall model is a multi-task network model, a source-channel joint coding model and a source-channel joint decoding model.
  • the source-channel joint coding model and the source-channel joint decoding model can be briefly described as codec models.
  • the overall model is trained step by step.
  • the multi-task network model is trained as the basic model of the overall framework, and then the source-channel joint coding model and the source-channel joint decoding model are separately trained to obtain a certain compression ability.
  • end-to-end training is performed on the overall model, so that the multi-task network model, the source-channel joint coding model and the source-channel joint decoding model are more tightly coupled, and the overall performance is further improved.
  • using the sum of the multi-task loss function and the L1-norm between the intermediate feature before compression and after reconstruction as the loss function for separately training the codec models allows the codec models to improve system performance while ensuring compression capability.
  • the table below illustrates the performance improvement of the multi-task network model provided by the embodiment of the present application compared with other network models.
  • the multi-task network model provided by the embodiment of the present application is represented by a feature fusion based multi-task network (FFMNet), and another multi-task network model is BlitzNet.
  • mAP is the mean average precision, the accuracy measure of the target detection branch; mIoU is the mean intersection-over-union, the accuracy measure of the semantic segmentation branch; Param is the model parameter quantity, on the order of millions (M). It can be seen that FFMNet achieves higher performance with fewer parameters than BlitzNet.
  • FFMNet 1 is a network model with target detection and semantic segmentation functions.
  • FFMNet 2 has only one functional network model, that is, it is a single-task network model, and the functional network model is the one for target detection.
  • FFMNet 3 has only one functional network model, that is, it is a single-task network model, and the functional network model is the one for semantic segmentation.
  • a mark in the table indicates that the FFMNet variant has the corresponding functional network model, and "-" indicates that the network model cannot test the corresponding indicator.
  • the multi-task network model is combined with the codec model to achieve high compression of intermediate features.
  • when trained without noise interference, the codec model can achieve up to 1024-fold compression of intermediate features, with the performance loss of the two task branches (such as the object detection and semantic segmentation tasks) controlled within 2%.
  • the two tasks of target detection and semantic segmentation correspond to two sub-network models or two functional network models respectively.
  • the first row shows the original feature dimension (H, W, C), that is, without the compression and decompression process, so the compression ratio is 1; the performance of the corresponding two sub-network models is then 40.8/44.6.
  • for the fourth and fifth rows, reference may be made to the explanation of the second row above; details are not repeated here.
  • the compression factor can also be set to 512 times to achieve a balance between compression effect and functional network performance.
  • with a fixed compression factor of 512 and AWGN noise introduced in the training process, a source-channel joint coding and decoding model for multi-task networks with a high compression factor and certain anti-noise capability can finally be obtained.
  • compared with the traditional method of compressing intermediate features by combining the joint photographic experts group (JPEG) standard with quadrature amplitude modulation (QAM), the source-channel joint coding model provided by the embodiment of the present application has a higher compression ratio and overcomes the cliff effect of traditional separation methods.
  • JPEG is a widely used lossy compression standard method for photographic images.
  • the fusion feature can be regarded as an image with multiple channels, so the compression algorithm in JPEG can be used to perform source coding on the feature.
  • QAM is a modulation method that performs amplitude modulation on two orthogonal carrier waves. These two carriers are usually sine waves out of phase by 90 degrees ( ⁇ /2), and are therefore called quadrature carriers. Here it is used for channel protection and modulation.
  • FIG. 11a and FIG. 11b are performance comparison charts of the traditional separation method and the source-channel joint coding JSCC model provided by the embodiment of the present application.
  • the quality score is a parameter controlling the compression capability of the JPEG algorithm, and the smaller the value of the quality score, the greater the compression capability.
  • the code rate represents the ratio of the number of source coded bits (bits) to the number of bits after channel coding.
  • QAM means that every xx bits of the bit stream obtained after source coding and channel coding are modulated into one symbol, yielding the signal finally input to the channel.
  • the compression rate of the separation method is the ratio of the number of symbols of the fusion feature before compression (that is, the feature dimension) to the number of symbols after compression, channel coding, and modulation.
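The separation-method compression rate can be sketched as symbol accounting; all concrete numbers below are assumptions for illustration, not measurements from the embodiment:

```python
def separation_compression_rate(h, w, c, coded_bits, code_rate, bits_per_symbol):
    """Compression rate of the JPEG + channel coding + QAM separation
    method: symbols of the fusion feature before compression (the
    feature dimension) divided by symbols after source coding, channel
    coding (code_rate = source bits / channel bits) and QAM modulation
    (bits_per_symbol bits mapped to one symbol)."""
    symbols_before = h * w * c
    channel_bits = coded_bits / code_rate
    symbols_after = channel_bits / bits_per_symbol
    return symbols_before / symbols_after
```

For example, a (64, 64, 64) fusion feature JPEG-coded to 2048 bits, protected with a rate-1/2 channel code and modulated with 16-QAM (4 bits per symbol), would give a 256-fold compression under these assumed numbers.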
  • the source-channel joint coding model provided by the embodiment of the present application can achieve 512 times or 1024 times compression while ensuring the recognition rate, and has a certain anti-noise capability.
  • the storage, computing, and transmission costs on the device side can be reduced, while resisting channel noise and ensuring transmission robustness.
  • the communication device may include a hardware structure and/or a software module, and realize the above functions in the form of a hardware structure, a software module, or a hardware structure plus a software module. Whether one of the above-mentioned functions is executed in the form of a hardware structure, a software module, or a hardware structure plus a software module depends on the specific application and design constraints of the technical solution.
  • the embodiment of the present application also provides a communication device 1200, which may be a communication device, a component in a communication device, or an apparatus that can be used in conjunction with a communication device.
  • the communication device 1200 may be a terminal device or a network device.
  • the communication device 1200 may include modules corresponding to the methods/operations/steps/actions performed by the first communication device or the second communication device in the above method embodiments.
  • the modules may be hardware circuits, software, or a combination of hardware circuits and software.
  • the device may include a processing module 1201 and a transceiver module 1202.
  • the processing module 1201 is used to call the transceiver module 1202 to perform the function of receiving and/or sending.
  • the processing module 1201 is configured to use the first backbone network model in the multi-task network model to process the input signal to obtain fusion features, the fusion features are obtained by fusion of multiple first features, and the multiple first features are obtained by performing obtained by feature extraction; and used for compressing and channel coding the fused features to obtain the first information;
  • the transceiver module 1202 is configured to send the first information to the second communication device.
  • the transceiver module 1202 is also used to perform the operations related to receiving or sending signals performed by the first communication device in the above method embodiments, and the processing module 1201 is also used to perform the operations other than sending and receiving signals performed by the first communication device in the above method embodiments; details are not repeated here.
  • the first communication device may be a terminal device or a network device.
  • a transceiver module 1202 configured to receive second information from the first communication device
  • the processing module 1201 is configured to decompress and channel-decode the second information to obtain the reconstructed feature of the fusion feature, where the fusion feature is obtained by fusing a plurality of first features obtained by feature extraction of the input signal; to use the second backbone network model in the multi-task network model to perform feature analysis on the reconstructed feature to obtain a feature analysis result; and to use the functional network model in the multi-task network model to process the feature analysis result.
  • the transceiver module 1202 is also used to perform the operations related to receiving or sending signals performed by the second communication device in the above method embodiments, and the processing module 1201 is also used to perform the operations other than sending and receiving signals performed by the second communication device in the above method embodiments; details are not repeated here.
  • the second communication device may be a terminal device or a network device.
  • each functional module in each embodiment of the present application can be integrated into one processor, or may exist physically separately, or two or more modules can be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
  • a communication device 1300 provided in the embodiment of the present application is used to realize the functions of the communication device in the above method.
  • the communication device may be a first communication device or a second communication device.
  • when realizing the function of the first communication device, the device may be the first communication device, a component in the first communication device, or a device that can be matched with the first communication device; when realizing the function of the second communication device, the device may be the second communication device, a component in the second communication device, or a device that can be matched with the second communication device.
  • the communication device 1300 may be a system on a chip.
  • the system-on-a-chip may be composed of chips, or may include chips and other discrete devices.
  • the communication device 1300 includes at least one processor 1320, configured to implement the functions of the first communication device or the second communication device in the method provided in the embodiment of the present application.
  • the communication device 1300 may also include a communication interface 1310 .
  • the communication interface may be a transceiver, a circuit, a bus, a module or other types of communication interfaces for communicating with other devices through a transmission medium.
  • the communication interface 1310 is used for the devices in the communication device 1300 to communicate with other devices.
  • for example, when the communication device 1300 is the first communication device, the other device may be the second communication device; when the communication device 1300 is the second communication device, the other device may be the first communication device; for another example, when the communication device 1300 is a chip, the other device may be another chip or device in the communication device.
  • the processor 1320 uses the communication interface 1310 to send and receive data, and is used to implement the methods described in the foregoing method embodiments.
  • the processor 1320 is configured to use the first backbone network model to process the input signal to obtain a fusion feature, where the fusion feature is obtained by fusing multiple first features, and the multiple first features are obtained by feature extraction of the input signal;
  • the processor 1320 is further configured to perform compression and channel coding on the fused features to obtain the first information;
  • the communication interface 1310 is configured to send the first information to the second communication device.
  • the processor 1320 is configured to: perform feature extraction on the input signal to obtain multiple second features, where the multiple second features have different feature dimensions; process the feature dimensions of the plurality of second features to obtain the plurality of first features with the same feature dimension; perform feature fusion on the plurality of first features to obtain fusion features.
  • the processor 1320, when processing the feature dimensions of the multiple first features to obtain multiple first features with the same feature dimension, is configured to: perform a first convolution operation and an up-sampling operation on the multiple first features to obtain multiple first features with the same feature dimension.
  • the processor 1320 when performing feature fusion on the multiple first features to obtain the fused features, is configured to: add the multiple first features to obtain the third feature.
  • the processor 1320 is further configured to: perform a second convolution operation on the third feature to obtain a fusion feature.
  • the processor 1320, when performing compression and channel protection processing on the fusion feature, is configured to: use the source-channel joint coding model to perform downsampling and the third convolution operation on the fusion feature to obtain the fourth feature; the source-channel joint coding model is trained based on channel noise.
  • the processor 1320, when performing compression and channel protection processing on the fused feature, is also configured to: use the source-channel joint coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization, a parametric rectified linear unit, or power normalization.
  • the communication interface 1310 is used to receive the second information from the first communication device; the processor 1320 is used to decompress and channel-decode the second information to obtain the reconstructed feature of the fusion feature, where the fusion feature is obtained by fusing multiple first features obtained by feature extraction on the input signal; the processor 1320 is further used to perform feature parsing on the reconstructed feature by using the second backbone network model in the multi-task network model to obtain the result of feature parsing, and to process the result of feature parsing by using the functional network model in the multi-task network model.
  • the result of feature parsing includes X features; the 1st of the X features is the reconstructed feature, and the (i+1)-th of the X features is obtained by computation from the i-th feature; the first Y of the X features are obtained through the first operation, and the last (X−Y) features are obtained through the second operation; where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X; the convolution operation in the first operation has multiple receptive fields, and the convolution operation in the second operation has one receptive field.
  • the height of the (i+1)-th feature is 1/2 of the height of the i-th feature; the width of the (i+1)-th feature is 1/2 of the width of the i-th feature; the number of channels of the (i+1)-th feature is the same as the number of channels of the i-th feature.
  • the processor 1320, when decompressing and channel-decoding the second information, is configured to: use the source-channel joint decoding model to perform the following operations on the second information: a fourth convolution operation, an up-sampling operation, and a fifth convolution operation.
  • the processor 1320, when decompressing and channel-decoding the second information, is also configured to perform one or more of the following operations: inverse generalized divisive normalization, a parametric rectified linear unit, batch normalization, or a rectified linear unit.
  • the processor 1320 and the communication interface 1310 may also be configured to perform other corresponding steps or operations performed by the first communication device or the second communication device in the foregoing method embodiments, which will not be repeated here.
  • the communication device 1300 may also include at least one memory 1330 for storing program instructions and/or data.
  • the memory 1330 is coupled to the processor 1320 .
  • the coupling in the embodiments of the present application is an indirect coupling or a communication connection between devices, units or modules, which may be in electrical, mechanical or other forms, and is used for information exchange between devices, units or modules.
  • Processor 1320 may cooperate with memory 1330 .
  • Processor 1320 may execute program instructions stored in memory 1330 . At least one of the at least one memory may be integrated with the processor.
  • the specific connection medium among the communication interface 1310, the processor 1320, and the memory 1330 is not limited in this embodiment of the present application.
  • the memory 1330, the processor 1320, and the communication interface 1310 are connected through the bus 1340.
  • the bus is represented by a thick line in FIG. 13; the connection modes between other components are merely illustrative and not limiting.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 13, but this does not mean that there is only one bus or one type of bus.
  • the processor 1320 may be a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • a general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the methods disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • the memory 1330 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or may be a volatile memory, for example, a random-access memory (RAM).
  • the memory may also be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • the memory in the embodiment of the present application may also be a circuit or any other device capable of implementing a storage function, and is used for storing program instructions and/or data.
  • the embodiment of the present application also provides a communication device 1400, which is used to implement the operations performed by the first communication device or the second communication device in the communication method based on the multi-task network model.
  • the communication device 1400 may be a system on a chip.
  • the system-on-a-chip may be composed of chips, or may include chips and other discrete devices.
  • Part or all of the communication methods based on the multitasking network model in the above embodiments may be implemented by hardware or software.
  • the communication device 1400 includes: an input and output interface 1401 and a logic circuit 1402 .
  • the input and output interface 1401 may be a transceiver, a circuit, a bus, a module or other types of communication interfaces, and is used for communicating with other devices through a transmission medium.
  • the input and output interface 1401 is used for the devices in the communication device 1400 to communicate with other devices.
  • when the communication device 1400 is the first communication device, the other device may be the second communication device; when the communication device 1400 is the second communication device, the other device may be the first communication device; for another example, when the communication device 1400 is a chip, the other device may be another chip or device in the communication device.
  • the logic circuit 1402 is configured to use the first backbone network model to process the input signal to obtain fusion features, where the fusion features are obtained by fusion of multiple first features, and the multiple first features are obtained by feature extraction of the input signal;
  • the logic circuit 1402 is further configured to perform compression and channel coding on the fused features to obtain the first information;
  • the input and output interface 1401 is configured to send the first information to the second communication device.
  • the logic circuit 1402 is configured to: perform feature extraction on the input signal to obtain multiple second features, where the multiple second features have different feature dimensions; process the feature dimensions of the plurality of second features to obtain the plurality of first features with the same feature dimension; perform feature fusion on the plurality of first features to obtain fusion features.
  • the logic circuit 1402, when processing the feature dimensions of the multiple first features to obtain multiple first features with the same feature dimension, is configured to: perform a first convolution operation and an up-sampling operation on the multiple first features to obtain multiple first features with the same feature dimension.
  • the logic circuit 1402 when performing feature fusion on the multiple first features to obtain the fused features, is configured to: add the multiple first features to obtain the third feature.
  • the logic circuit 1402 is further configured to: perform a second convolution operation on the third feature to obtain a fusion feature.
  • the logic circuit 1402, when performing compression and channel protection processing on the fusion feature, is used to: use the source-channel joint coding model to perform downsampling and the third convolution operation on the fusion feature to obtain the fourth feature; the source-channel joint coding model is trained based on channel noise.
  • the logic circuit 1402, when performing compression and channel protection processing on the fused feature, is also used to: use the source-channel joint coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization, a parametric rectified linear unit, or power normalization.
  • the input and output interface 1401 is used to receive the second information from the first communication device; the logic circuit 1402 is used to decompress and channel-decode the second information to obtain the reconstructed feature of the fusion feature, where the fusion feature is obtained by fusing multiple first features obtained by feature extraction on the input signal; the logic circuit 1402 is further used to perform feature parsing on the reconstructed feature by using the second backbone network model in the multi-task network model to obtain the result of feature parsing, and to process the result of feature parsing by using the functional network model in the multi-task network model.
  • the result of feature parsing includes X features; the 1st of the X features is the reconstructed feature, and the (i+1)-th of the X features is obtained by computation from the i-th feature; the first Y of the X features are obtained through the first operation, and the last (X−Y) features are obtained through the second operation; where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X; the convolution operation in the first operation has multiple receptive fields, and the convolution operation in the second operation has one receptive field.
  • the height of the (i+1)-th feature is 1/2 of the height of the i-th feature; the width of the (i+1)-th feature is 1/2 of the width of the i-th feature; the number of channels of the (i+1)-th feature is the same as the number of channels of the i-th feature.
  • the logic circuit 1402, when decompressing and channel-decoding the second information, is configured to: use the source-channel joint decoding model to perform the following operations on the second information: a fourth convolution operation, an up-sampling operation, and a fifth convolution operation.
  • the logic circuit 1402, when decompressing and channel-decoding the second information, is also used to perform one or more of the following operations: inverse generalized divisive normalization, a parametric rectified linear unit, batch normalization, or a rectified linear unit.
  • the logic circuit 1402 and the input/output interface 1401 may also be used to perform other corresponding steps or operations performed by the first communication device or the second communication device in the above method embodiment, which will not be repeated here.
  • the signals output or received by the transceiver module 1202, the communication interface 1310, and the input/output interface 1401 may be baseband signals.
  • the signals output or received by the transceiver module 1202, the communication interface 1310, and the input/output interface 1401 may be radio frequency signals.
  • Part or all of the operations and functions performed by the first communication device/second communication device described in the above method embodiments of the present application may be implemented by a chip or an integrated circuit.
  • An embodiment of the present application provides a computer-readable storage medium storing a computer program, where the computer program includes instructions for executing the foregoing method embodiments.
  • the embodiment of the present application provides a computer program product containing instructions which, when run on a computer, causes the methods in the above-mentioned method embodiments to be performed.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • computer-usable storage media including but not limited to disk storage, CD-ROM, optical storage, etc.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, where the instruction means implements the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

This application discloses a communication method, apparatus, and system based on a multi-task network model, with a view to better realizing device-cloud collaboration in MTL mode. The method may be implemented through the following steps: a first communication apparatus processes an input signal using a first backbone network model to obtain a fused feature, where the fused feature is obtained by fusing multiple first features, and the multiple first features are obtained by feature extraction on the input signal; the first communication apparatus compresses and channel-encodes the fused feature to obtain first information; and the first communication apparatus sends the first information to a second communication apparatus. The second communication apparatus receives second information from the first communication apparatus, decompresses and channel-decodes the second information to obtain a reconstructed feature of the fused feature, performs feature parsing on the reconstructed feature using a second backbone network model to obtain a feature-parsing result, and processes the feature-parsing result using a functional network model.

Description

Communication method, apparatus, and system based on a multi-task network model
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to Chinese Patent Application No. 202110748182.0, entitled "Communication method, apparatus, and system based on a multi-task network model", filed with the China National Intellectual Property Administration on June 29, 2021, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
Embodiments of this application relate to the field of communication technologies, and in particular, to a communication method, apparatus, and system based on a multi-task network model.
BACKGROUND
With the development of deep learning, convolutional neural networks (CNNs) play an increasingly important role in computer vision; tasks such as detection, tracking, recognition, classification, and prediction can all be solved by building corresponding CNN-based network models. Typically, each network model solves only one task; this one-to-one correspondence between network models and tasks is called the single-task learning (STL) mode. Under STL, solving multiple tasks requires multiple network models, which is inefficient and consumes storage space. On this basis, a multi-task learning (MTL) mode has been proposed. The MTL mode adopts a multi-task network model in which multiple functional network models share the intermediate features produced by a backbone network model, and different functional network models complete different tasks. The MTL mode is more efficient and reduces the storage cost of the models.
In recent years, although the performance of CNN models has become increasingly high, their structures have grown more complex and the computing resources they require have grown larger, and ordinary mobile devices cannot provide sufficient computing resources for them. Therefore, a device-cloud collaborative way of running models, collaborative intelligence (CI), has been proposed. In a CI scenario, the model is divided into two parts: one part is located on the mobile device and the other in the cloud. The mobile device runs one part of the network model and the cloud runs the other, and intermediate features need to be transmitted between the mobile device and the cloud to achieve collaboration. This device-cloud collaborative model-running approach can reduce the computation cost of the mobile device.
The structure of a multi-task network model in MTL mode is relatively complex; how to realize device-cloud collaboration in MTL mode is a problem to be solved.
SUMMARY
Embodiments of this application provide a communication method, apparatus, and system based on a multi-task network model, with a view to realizing device-cloud collaboration in MTL mode.
According to a first aspect, a communication method based on a multi-task network model is provided. The method may be performed by a first communication apparatus, or by a component of the first communication apparatus (for example, a processor, a chip, or a chip system). The first communication apparatus may be a terminal device or a cloud; the multi-task network model includes a first backbone network model. The method may be implemented through the following steps: the first communication apparatus processes an input signal using the first backbone network model to obtain a fused feature, where the fused feature is obtained by fusing multiple first features, and the multiple first features are obtained by feature extraction on the input signal; the first communication apparatus compresses and channel-encodes the fused feature to obtain first information; and the first communication apparatus sends the first information to a second communication apparatus. By fusing the multiple first features extracted from the input signal, the resulting fused feature can contain more information, which makes the second communication apparatus's processing of the other part of the network model, based on the fused feature, more accurate. Generating the fused feature in the feature-extraction stage gives the multi-task network model a clearer structure, which makes it easier to divide the model into a part executed by the first communication apparatus and a part executed by the second communication apparatus, and thus easier to realize device-cloud collaboration in MTL mode. In addition, compression reduces the number of parameters transmitted between the first and second communication apparatuses, lowering transmission overhead, and channel coding gives the data transmitted between them better noise resistance.
In a possible design, when the first communication apparatus processes the input signal using the first backbone network model to obtain the fused feature, the first communication apparatus specifically implements this through the following steps: the first communication apparatus performs feature extraction on the input signal to obtain multiple second features with different feature dimensions; the first communication apparatus processes the feature dimensions of the multiple second features to obtain multiple first features with the same feature dimension; and the first communication apparatus fuses the multiple first features to obtain the fused feature. Fusing multiple first features of different dimension sizes that contain different information into one set of features aims to give the fused feature rich information; the mutual fusion of information from different sources also provides a degree of information complementarity.
In a possible design, when processing the feature dimensions of the multiple first features to obtain multiple first features with the same feature dimension, the first communication apparatus may specifically perform a first convolution operation and an upsampling operation on the multiple first features to obtain multiple first features with the same feature dimension. The upsampling operation can change the height and width of a first feature to arbitrary values, not limited to integer multiples; for example, the height can be expanded to 8 times the original or more, by any factor. In conventional practice, a deconvolution operation is used to change the height and width of a feature, which can only expand by a factor of 2, and the expansion factor must be an integer. Compared with conventional practice, upsampling expands dimensions by a more flexible factor.
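The upsampling step described above can be sketched in a few lines of Python. The nearest-neighbor scheme and the toy feature sizes below are illustrative assumptions, not the patent's exact operators; the point is that the target height and width can be arbitrary values, unlike a stride-2 deconvolution.

```python
def upsample_nearest(feature, out_h, out_w):
    """Nearest-neighbor upsampling of a 2D feature map (list of lists).
    The target size can be any value, not just an integer multiple."""
    in_h, in_w = len(feature), len(feature[0])
    return [
        [feature[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# Two "first features" extracted at different scales.
f_small = [[1, 2], [3, 4]]              # 2x2
f_large = [[5] * 6 for _ in range(6)]   # 6x6

# Bring both to a common 6x6 spatial dimension so they can later be fused.
aligned = [upsample_nearest(f_small, 6, 6), f_large]
```

Note that a 2x2 map can also be upsampled to, say, 5x7 with the same function, which is exactly the flexibility contrasted with deconvolution above.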
In a possible design, the first communication apparatus fuses the multiple first features to obtain the fused feature specifically as follows: the first communication apparatus adds the multiple first features to obtain a third feature. Adding the multiple first features may mean adding the elements at the same positions across the multiple first features; after the addition, one fused third feature is obtained. Obtaining the third feature by addition is simple and effective, and helps reduce model complexity.
In a possible design, the first communication apparatus performs a second convolution operation on the third feature to obtain the fused feature. The second convolution operation may use a 3*3 convolution, and the numbers of input and output channels of the second convolution operation may be kept equal, i.e., the third feature and the fused feature have the same dimensions. Performing the second convolution operation on the third feature makes the obtained fused feature smoother, so the fused feature is better suited to the second communication apparatus's subsequent network-model processing and the processing results are more accurate.
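A minimal sketch of the addition step: elements at the same position across same-shaped first features are summed into one third feature. The subsequent 3*3 smoothing convolution is omitted, and the shapes and values are illustrative only.

```python
def fuse_by_addition(features):
    """Element-wise sum of same-shaped 2D feature maps: elements at the
    same position across all first features are added together."""
    h, w = len(features[0]), len(features[0][0])
    return [[sum(f[r][c] for f in features) for c in range(w)] for r in range(h)]

first_features = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
]
third_feature = fuse_by_addition(first_features)  # [[11, 22], [33, 44]]
```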
In a possible design, the first communication apparatus's compression and channel-protection processing of the fused feature may be as follows: the first communication apparatus uses a joint source-channel coding model to perform downsampling and a third convolution operation on the fused feature to obtain a fourth feature. The joint source-channel coding model is trained based on channel noise, so the data processed by it has better noise resistance; the second communication apparatus uses the joint source-channel decoding model corresponding to the joint source-channel coding model to process the received data and can decode a reconstructed feature closer to the fused feature, making the performance of device-cloud collaboration more stable and accurate. Downsampling reduces the height and width of the fused feature. Conventionally, a convolution operation is used to reduce the height and width of the fused feature; the reduction factor is affected by the kernel size and the dimensions can only be reduced to an integer fraction of the original. By comparison, downsampling can reduce the fused feature to arbitrary dimensions, so using downsampling to reduce the dimensions of the fused feature is more flexible. The third convolution operation reduces the number of feature channels of the fused feature, so the compressed fused feature is easier to transmit.
In a possible design, the first communication apparatus's compression and channel-protection processing of the fused feature further includes:
the first communication apparatus uses the joint source-channel coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization, a parametric rectified linear unit, or power normalization. Generalized divisive normalization can be used to improve the compression capability of the joint source-channel coding model, and the parametric rectified linear unit can also be used to improve its compression capability. Power normalization can make the power of the compressed result equal to 1.
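Of the listed operations, power normalization is simple enough to sketch directly. Normalizing so that the average power per element equals 1 is an assumption about the exact convention; the input values are illustrative.

```python
import math

def power_normalize(x):
    """Scale a flat vector so its average power (mean of squared values) is 1."""
    power = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(power) for v in x]

d = power_normalize([3.0, -4.0, 3.0, -4.0])
avg_power = sum(v * v for v in d) / len(d)  # -> 1.0 (up to float rounding)
```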
According to a second aspect, a communication method based on a multi-task network model is provided. The method may be performed by a second communication apparatus, or by a component of the second communication apparatus (for example, a processor, a chip, or a chip system). The second communication apparatus may be a cloud or a terminal device; the multi-task network model includes a second backbone network model and a functional network model. The method may be implemented through the following steps: the second communication apparatus receives second information from the first communication apparatus; the second communication apparatus decompresses and channel-decodes the second information to obtain a reconstructed feature of the fused feature, where the fused feature is obtained by fusing multiple first features obtained by feature extraction on an input signal; the second communication apparatus performs feature parsing on the reconstructed feature using the second backbone network model to obtain a feature-parsing result; and the second communication apparatus processes the feature-parsing result using the functional network model. By fusing the multiple first features extracted from the input signal, the resulting fused feature can contain more information, making the second communication apparatus's processing based on the reconstructed feature of the fused feature more accurate. Generating the fused feature in the feature-extraction stage gives the multi-task network model a clearer structure, making the model easier to divide and device-cloud collaboration in MTL mode easier to realize. Channel decoding brings the decoded second information closer to the first information sent by the first communication apparatus, improving the noise resistance of the data transmitted between the two apparatuses. In addition, by receiving a single set of features (the fused feature included in the first information), the second communication apparatus can complete multiple tasks without needing multiple sets of input features; its operation is simpler, the multi-task network model is easier to split into two parts, and the approach is better suited to device-cloud collaboration scenarios.
In a possible design, the feature-parsing result includes X features; the 1st of the X features is the reconstructed feature, and the (i+1)-th of the X features is obtained by computation from the i-th feature; the first Y of the X features are obtained through a first operation, and the last (X−Y) features are obtained through a second operation, where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X; the convolution operation in the first operation has multiple receptive fields, and the convolution operation in the second operation has one receptive field. The first operation fuses the convolution results of different receptive fields together; it is a means of feature fusion that extracts different information from different perspectives and fuses it, so the result of the first operation contains more information than that of the second operation and is more helpful to the performance of the functional network models.
In a possible design, the first operation includes the following: performing a 1×1 convolution operation on a to-be-processed feature among the first Y features; performing multiple 3×3 convolution operations, with receptive fields of different sizes, on the result of the 1×1 convolution operation; concatenating the results of the multiple 3×3 convolution operations along the channel dimension; performing a 1×1 convolution operation on the concatenation result to obtain a first convolution result; and adding the first convolution result to the to-be-processed feature element-wise.
In a possible design, the height of the (i+1)-th feature is 1/2 of the height of the i-th feature; the width of the (i+1)-th feature is 1/2 of the width of the i-th feature; and the number of channels of the (i+1)-th feature is the same as that of the i-th feature. In this way, features at different scales can be extracted based on the reconstructed feature, providing richer information for the subsequent functional network models, so that both the data and the features fed into the functional network models are enhanced and the processing accuracy of the functional network models is improved.
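The halving of height and width from one parsed feature to the next, with the channel count unchanged, can be illustrated with a plain stride-2 subsampling. A real model would use strided convolutions, so this is only a shape-level sketch with made-up values.

```python
def halve_spatial(feature):
    """Keep every other row and column of each channel: height and width
    become 1/2, the number of channels (outer list) is unchanged."""
    return [[row[::2] for row in channel[::2]] for channel in feature]

# One feature with 2 channels, each 4x4.
x_i = [[[r * 4 + c for c in range(4)] for r in range(4)] for _ in range(2)]
x_next = halve_spatial(x_i)  # 2 channels, each 2x2
```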
In a possible design, the second communication apparatus decompresses and channel-decodes the second information as follows: the second communication apparatus uses a joint source-channel decoding model to perform the following operations on the second information: a fourth convolution operation, an upsampling operation, and a fifth convolution operation. The upsampling operation can restore the spatial dimensions of the feature; performing two convolutions (the fourth and fifth convolution operations) rather than one allows the compressed fused feature to be restored gradually, improving the feature-recovery capability, so the restored reconstructed feature has more parameters, and using more parameters can improve the parsing accuracy of the network model.
In a possible design, the second communication apparatus's decompression and channel decoding of the second information further includes one or more of the following operations: inverse generalized divisive normalization (IGDN), a parametric rectified linear unit (PReLU), batch normalization (BN), or a rectified linear unit (ReLU). IGDN can be used to improve the decompression or decoding capability of the joint source-channel decoding model, and PReLU can also be used to improve its decompression or decoding capability. BN and/or ReLU can bound the value range of the decoding result and can further increase the accuracy of the decoding result.
According to a third aspect, a communication apparatus is provided. The apparatus may be the first communication apparatus, an apparatus located in the first communication apparatus (for example, a chip, a chip system, or a circuit), or an apparatus that can be used in combination with the first communication apparatus. The first communication apparatus may be a terminal device or a network device. The apparatus has the function of implementing the method described in the first aspect or any possible design of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the function. In one design, the apparatus may include a transceiver module and a processing module. For example:
the processing module is configured to process an input signal using the first backbone network model to obtain a fused feature, where the fused feature is obtained by fusing multiple first features, and the multiple first features are obtained by feature extraction on the input signal; the processing module is further configured to compress and channel-encode the fused feature to obtain first information; and the transceiver module is configured to send the first information to the second communication apparatus.
In a possible design, when processing the input signal using the first backbone network model to obtain the fused feature, the processing module is configured to: perform feature extraction on the input signal to obtain multiple second features with different feature dimensions; process the feature dimensions of the multiple second features to obtain the multiple first features with the same feature dimension; and fuse the multiple first features to obtain the fused feature.
In a possible design, when processing the feature dimensions of the multiple first features to obtain multiple first features with the same feature dimension, the processing module is configured to: perform a first convolution operation and an upsampling operation on the multiple first features to obtain multiple first features with the same feature dimension.
In a possible design, when fusing the multiple first features to obtain the fused feature, the processing module is configured to: add the multiple first features to obtain a third feature.
In a possible design, the processing module is further configured to: perform a second convolution operation on the third feature to obtain the fused feature.
In a possible design, when compressing and channel-protecting the fused feature, the processing module is configured to: use a joint source-channel coding model to perform downsampling and a third convolution operation on the fused feature to obtain a fourth feature, where the joint source-channel coding model is trained based on channel noise.
In a possible design, when compressing and channel-protecting the fused feature, the processing module is further configured to: use the joint source-channel coding model to perform one or more of the following operations on the fourth feature: generalized divisive normalization, a parametric rectified linear unit, or power normalization.
For the beneficial effects of the third aspect and its possible designs, refer to the description of the corresponding parts of the first aspect; details are not repeated here.
According to a fourth aspect, a communication apparatus is provided. The apparatus may be the second communication apparatus, an apparatus located in the second communication apparatus (for example, a chip, a chip system, or a circuit), or an apparatus that can be used in combination with the second communication apparatus. The second communication apparatus may be a terminal device or a network device. The apparatus has the function of implementing the method described in the second aspect or any possible design of the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software; the hardware or software includes one or more modules corresponding to the function. In one design, the apparatus may include a transceiver module and a processing module. For example: the transceiver module is configured to receive second information from the first communication apparatus; the processing module is configured to decompress and channel-decode the second information to obtain a reconstructed feature of the fused feature, where the fused feature is obtained by fusing multiple first features obtained by feature extraction on an input signal; to perform feature parsing on the reconstructed feature using the second backbone network model in the multi-task network model to obtain a feature-parsing result; and to process the feature-parsing result using the functional network model in the multi-task network model.
In a possible design, the feature-parsing result includes X features; the 1st of the X features is the reconstructed feature, and the (i+1)-th of the X features is obtained by computation from the i-th feature; the first Y of the X features are obtained through a first operation, and the last (X−Y) features are obtained through a second operation, where X, Y, and i are positive integers, i is less than or equal to X, and Y is less than or equal to X; the convolution operation in the first operation has multiple receptive fields, and the convolution operation in the second operation has one receptive field.
In a possible design, the height of the (i+1)-th feature is 1/2 of the height of the i-th feature; the width of the (i+1)-th feature is 1/2 of the width of the i-th feature; and the number of channels of the (i+1)-th feature is the same as that of the i-th feature.
In a possible design, when decompressing and channel-decoding the second information, the processing module is configured to: use a joint source-channel decoding model to perform the following operations on the second information: a fourth convolution operation, an upsampling operation, and a fifth convolution operation.
In a possible design, when decompressing and channel-decoding the second information, the processing module is further configured to perform one or more of the following operations: inverse generalized divisive normalization, a parametric rectified linear unit, batch normalization, or a rectified linear unit.
For the beneficial effects of the fourth aspect and its possible designs, refer to the description of the corresponding parts of the second aspect; details are not repeated here.
According to a fifth aspect, an embodiment of this application provides a communication apparatus including a communication interface and a processor. The communication interface is used by the apparatus to communicate with other devices, for example, to send and receive data or signals. For example, the communication interface may be a transceiver, a circuit, a bus, a module, or another type of communication interface, and the other devices may be other communication apparatuses. The processor is configured to invoke a group of programs, instructions, or data to perform the method described in the first aspect or any possible design of the first aspect, or to perform the method described in the second aspect or any possible design of the second aspect. The apparatus may further include a memory for storing the programs, instructions, or data invoked by the processor. The memory is coupled to the processor; when the processor executes the instructions or data stored in the memory, the method described in the first aspect or any possible design of the first aspect, or the method described in the second aspect or any possible design of the second aspect, can be implemented.
For the beneficial effects of the fifth aspect, refer to the description of the corresponding parts of the first aspect; details are not repeated here.
According to a sixth aspect, an embodiment of this application provides a communication apparatus including a communication interface and a processor. The communication interface is used by the apparatus to communicate with other devices, for example, to send and receive data or signals. For example, the communication interface may be a transceiver, a circuit, a bus, a module, or another type of communication interface, and the other devices may be other communication apparatuses. The processor is configured to invoke a group of programs, instructions, or data to perform the method described in the second aspect or any possible design of the second aspect. The apparatus may further include a memory for storing the programs, instructions, or data invoked by the processor. The memory is coupled to the processor; when the processor executes the instructions or data stored in the memory, the method described in the second aspect or any possible design of the second aspect can be implemented.
For the beneficial effects of the sixth aspect, refer to the description of the corresponding parts of the second aspect; details are not repeated here.
According to a seventh aspect, an embodiment of this application further provides a computer-readable storage medium storing computer-readable instructions that, when run on a computer, cause the methods described in the above aspects or any possible design of the aspects to be performed.
According to an eighth aspect, an embodiment of this application provides a chip system. The chip system includes a processor and may further include a memory, and is configured to implement the method described in the first aspect or any possible design of the first aspect. The chip system may be composed of chips, or may include chips and other discrete devices.
According to a ninth aspect, an embodiment of this application provides a chip system. The chip system includes a processor and may further include a memory, and is configured to implement the method described in the second aspect or any possible design of the second aspect. The chip system may be composed of chips, or may include chips and other discrete devices.
According to a tenth aspect, a computer program product containing instructions is provided; when it runs on a computer, the methods described in the above aspects or any possible design of the aspects are performed.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of a system architecture in an embodiment of this application;
FIG. 2 is a schematic diagram of a communication system architecture in an embodiment of this application;
FIG. 3 is a schematic diagram of a neural network model structure in an embodiment of this application;
FIG. 4a is a first schematic diagram of a multi-task network model in an embodiment of this application;
FIG. 4b is a second schematic diagram of a multi-task network model in an embodiment of this application;
FIG. 4c is a third schematic diagram of a multi-task network model in an embodiment of this application;
FIG. 4d is a schematic diagram of a computer-vision task network model in an embodiment of this application;
FIG. 5 is a schematic flowchart of a communication method based on a multi-task network model in an embodiment of this application;
FIG. 6 is a schematic process diagram of a communication method based on a multi-task network model in an embodiment of this application;
FIG. 7 is a schematic diagram of a feature fusion process in an embodiment of this application;
FIG. 8 is a schematic diagram of a joint source-channel encoding and decoding process in an embodiment of this application;
FIG. 9a is a schematic diagram of a first operation in an embodiment of this application;
FIG. 9b is a schematic diagram of a second operation in an embodiment of this application;
FIG. 10 is a schematic diagram of a feature parsing process in an embodiment of this application;
FIG. 11a is a first performance comparison chart of network models in an embodiment of this application;
FIG. 11b is a second performance comparison chart of network models in an embodiment of this application;
FIG. 12 is a first schematic structural diagram of a communication apparatus in an embodiment of this application;
FIG. 13 is a second schematic structural diagram of a communication apparatus in an embodiment of this application;
FIG. 14 is a third schematic structural diagram of a communication apparatus in an embodiment of this application.
DESCRIPTION OF EMBODIMENTS
This application provides a communication method and apparatus based on a multi-task network model, with a view to better realizing device-cloud collaboration in MTL mode. The method and the apparatus are based on the same technical concept; since the principles by which they solve the problem are similar, the implementations of the apparatus and the method may refer to each other, and repeated parts are not described again.
In the descriptions of the embodiments of this application, "and/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, A and/or B may represent the following three cases: only A exists, both A and B exist, and only B exists. The character "/" generally indicates an "or" relationship between the associated objects. In the descriptions of this application, terms such as "first" and "second" are merely used for the purpose of distinguishing the description and shall not be understood as indicating or implying relative importance or as indicating or implying an order.
The following describes the embodiments of this application in detail with reference to the accompanying drawings.
The communication method based on a multi-task network model provided in the embodiments of this application may be applied to 5G communication systems, for example, the 5G new radio (NR) system, and to various application scenarios of 5G communication systems such as enhanced mobile broadband (eMBB), ultra-reliable low-latency communication (URLLC), and enhanced machine-type communication (eMTC). The method may also be applied to various future evolved communication systems, for example, the sixth generation (6G) communication system, or a space-air-sea-ground integrated communication system. The method may further be applied to communication between base stations, communication between terminal devices, and communication in the Internet of Vehicles, the Internet of Things, the industrial Internet, satellite communication, and the like; for example, it may be applied to device-to-device (D2D), vehicle-to-everything (V2X), and machine-to-machine (M2M) communication systems.
FIG. 1 shows a system architecture to which an embodiment of this application is applicable, including a first communication apparatus 101 and a second communication apparatus 102. The first communication apparatus 101 and the second communication apparatus 102 are the two ends that collaboratively run the multi-task network model; the entity running the multi-task network model may take the form of a network device, a terminal device, a cloud computing node, an edge server, mobile edge computing (MEC), computing power, or the like. The first communication apparatus 101 and the second communication apparatus 102 may be connected in a wired or wireless manner and may be any two ends capable of running the multi-task network model. For example, the first communication apparatus 101 is a terminal device, and the second communication apparatus 102 may be a cloud computing node, a network device, an edge server, MEC, or computing power; or the first communication apparatus 101 is a cloud computing node, a network device, an edge server, MEC, or computing power, and the second communication apparatus 102 may be a terminal device. The first communication apparatus 101 may be any of the foregoing devices, a component in such a device (for example, a processor, a chip, or a chip system), or an apparatus used in combination with such a device; similarly, the second communication apparatus 102 may be any of the foregoing devices, a component in such a device (for example, a processor, a chip, or a chip system), or an apparatus used in combination with such a device.
The embodiments of this application are applicable to scenarios in which a multi-task network model is run collaboratively (hereinafter, collaborative-running scenarios), where the two ends collaboratively running the multi-task network model may be any two ends. For example, when applied to a device-cloud collaboration scenario, the two ends may be called the terminal and the cloud, respectively: the terminal may be a terminal device or an apparatus in a terminal device (for example, a processor, a chip, or a chip system), and the cloud may be a device such as a network device, a cloud computing node, an edge server, MEC, or computing power; the cloud may also be a software form with computing capability.
The following describes, by way of example, possible implementation forms and functions of the terminal devices and network devices involved in the embodiments of this application.
When the two ends collaboratively running the multi-task network model are a terminal device and a network device, the communication method based on the multi-task network model is applicable to a communication system architecture. As shown in FIG. 2, the communication system architecture includes a network device 201 and a terminal device 202. It may be understood that FIG. 2 uses one network device 201 and one terminal device 202 as an example; there may be multiple network devices 201 and multiple terminal devices 202. The network device 201 provides wireless access for one or more terminal devices 202 within its coverage. Coverage areas of network devices may overlap, and network devices may also communicate with each other. The network device 201 serves the terminal devices 202 within its coverage.
The network device 201 is a node in a radio access network (RAN); it may also be called a base station or a RAN node (or device). Current examples of the network device 201 include: a next generation NodeB (gNB), a next generation evolved NodeB (Ng-eNB), a transmission reception point (TRP), an evolved NodeB (eNB), a radio network controller (RNC), a NodeB (NB), a base station controller (BSC), a base transceiver station (BTS), a home base station (for example, a home evolved NodeB or home NodeB, HNB), a baseband unit (BBU), or a wireless fidelity (WiFi) access point (AP). The network device 201 may also be a satellite, which may also be called a high-altitude platform, a high-altitude aircraft, or a satellite base station. The network device 201 may also be any other device having network device functions; for example, it may be a device serving the function of a network device in device-to-device (D2D), Internet-of-Vehicles, or machine-to-machine (M2M) communication, or any possible network device in a future communication system. In some deployments, the network device 201 may include a centralized unit (CU) and a distributed unit (DU), and may further include an active antenna unit (AAU). The CU implements some functions of the network device and the DU implements others: for example, the CU is responsible for processing non-real-time protocols and services and implements the functions of the radio resource control (RRC) layer and the packet data convergence protocol (PDCP) layer, while the DU is responsible for processing physical-layer protocols and real-time services and implements the functions of the radio link control (RLC) layer, the media access control (MAC) layer, and the physical (PHY) layer. The AAU implements some physical-layer processing functions, radio-frequency processing, and functions related to active antennas. Because RRC-layer information eventually becomes PHY-layer information, or is converted from PHY-layer information, in this architecture higher-layer signaling such as RRC-layer signaling may also be considered to be sent by the DU, or by the DU plus the AAU. It may be understood that the network device may be a device including one or more of a CU node, a DU node, and an AAU node. In addition, the CU may be classified as a network device in the radio access network (RAN) or as a network device in the core network (CN); this is not limited in this application.
The terminal device 202, also called user equipment (UE), a mobile station (MS), a mobile terminal (MT), and so on, is a device that provides voice and/or data connectivity to users. For example, the terminal device 202 includes handheld devices and vehicle-mounted devices with wireless connection functions; a terminal device 202 located on a vehicle (for example, placed in or installed in a vehicle) may be regarded as a vehicle-mounted device, also called an on-board unit (OBU). Currently, the terminal device 202 may be: a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a mobile Internet device (MID), a wearable device (for example, a smart watch, a smart band, or a pedometer), a vehicle-mounted device (for example, on a car, bicycle, electric vehicle, aircraft, ship, train, or high-speed rail), a virtual reality (VR) device, an augmented reality (AR) device, a wireless terminal in industrial control, a smart home device (for example, a refrigerator, television, air conditioner, or electricity meter), an intelligent robot, workshop equipment, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a flying device (for example, an intelligent robot, a hot-air balloon, an unmanned aerial vehicle, or an aircraft), and so on. The terminal device 202 may also be any other device having terminal device functions; for example, it may be a device serving the function of a terminal device in D2D, Internet-of-Vehicles, or M2M communication. In particular, when network devices communicate with each other, a network device serving the function of a terminal device may also be regarded as a terminal device.
As an example rather than a limitation, in the embodiments of this application the terminal device 202 may also be a wearable device. Wearable devices, also called wearable smart devices, are a general term for wearable devices developed by applying wearable technology to the intelligent design of everyday wear, such as glasses, gloves, watches, clothing, and shoes. A wearable device is a portable device worn directly on the body or integrated into the user's clothes or accessories. Wearable devices are not merely hardware; they realize powerful functions through software support, data interaction, and cloud interaction. Broadly, wearable smart devices include full-featured, large-sized devices that can realize all or some functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on a certain type of application function and need to be used with other devices such as smartphones, for example, various smart bands, smart helmets, and smart jewelry for monitoring physical signs.
In this embodiment of this application, the apparatus for implementing the functions of the terminal device 202 may be, for example, a chip, a wireless transceiver, or a chip system, and may be installed, set, or deployed in the terminal device 202.
To help those skilled in the art better understand the solutions provided in the embodiments of this application, several concepts or terms involved in this application are first explained.
1) Network model
A network model may also be called a neural network model, an artificial neural network (ANN) model, a neural network (NN) model, or a connection model. Artificial intelligence (AI) technology can be realized by applying neural network models. The AI models used in AI technology are diverse, and different application scenarios may adopt different AI models; the neural network model is a typical representative of AI models. A neural network model is a mathematical computation model that imitates the behavioral characteristics of the neural networks of the human brain and performs distributed parallel information processing. Its main task is to draw on the principles of the brain's neural networks, build practical artificial neural networks according to application requirements, design learning algorithms suited to those requirements, simulate the intelligent activities of the human brain, and then solve practical problems technically. A neural network relies on the complexity of its structure and adjusts the interconnections among a large number of internal nodes to realize the design of the corresponding learning algorithm.
A neural network model may include multiple neural network layers with different functions, and each layer includes parameters and computation formulas. Depending on their computation formulas or functions, different layers in a neural network model have different names; for example, a layer performing convolution computation is called a convolutional layer, which is often used to extract features from an input signal (for example, an image). A neural network model may also be composed of multiple existing neural network sub-models. Neural network models with different structures can be used for different scenarios (for example, classification or recognition) or provide different effects when used for the same scenario; structural differences between neural network models mainly manifest in one or more of the following: a different number of network layers, a different order of the network layers, and different weights, parameters, or computation formulas in each network layer. A neural network may be composed of neural units; a neural unit may be an operation unit taking x_s and an intercept of 1 as inputs, and the output of the operation unit may be as shown in formula (1):
h_{W,b}(x) = f(W^T x) = f(Σ_{s=1}^{n} W_s·x_s + b)    (1)
where s = 1, 2, …, n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, used to introduce nonlinear characteristics into the neural network and convert the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer; the activation function may be a sigmoid function, a ReLU function, a tanh function, etc. A neural network is a network formed by connecting many such single neural units, i.e., the output of one neural unit may be the input of another neural unit. The input of each neural unit may be connected to the local receptive field of the previous layer to extract the features of the local receptive field; a local receptive field may be a region composed of several neural units.
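Formula (1) maps directly to a few lines of code; choosing sigmoid for f is just one of the activation functions mentioned above, and the inputs and weights are illustrative.

```python
import math

def neuron_output(xs, ws, b):
    """Output of one neural unit: f(sum_s W_s * x_s + b), with f = sigmoid."""
    z = sum(w * x for w, x in zip(ws, xs)) + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid activation

y = neuron_output(xs=[1.0, 2.0], ws=[0.5, -0.25], b=0.0)  # z = 0, so y = 0.5
```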
A multilayer perceptron (MLP) is a type of feedforward neural network model. An MLP includes multiple network layers with different functions: one input layer, one output layer, and one or more hidden layers located between the input layer and the output layer; the number of hidden layers in an MLP can be determined according to application requirements. In an MLP, information is passed in one direction: it moves forward from the input layer, is then passed layer by layer through the one or more hidden layers, and is finally passed from the last hidden layer to the output layer.
FIG. 3 illustrates an example of a neural network model structure.
As shown in FIG. 3, the input layer includes multiple neurons, also called input nodes. An input node receives an input vector from the outside and passes the input vector to the neurons in the hidden layer connected to it; the input node performs no computation.
As shown in FIG. 3, a hidden layer includes multiple neurons, also called hidden nodes. A hidden node extracts features from the input vector fed to its hidden layer and passes those features to the neurons in the next layer. A hidden node extracts features as follows: based on the output vectors of the neurons in the previous layer and the weight values of the connections between the hidden node and those neurons, the hidden node's output vector is determined according to the hidden node's input-output relationship. Here, the previous layer is the network layer that feeds information into the hidden layer where the hidden node is located, and the next layer is the network layer that receives the output of that hidden layer.
As shown in FIG. 3, the output layer includes one or more neurons, also called output nodes. According to its input-output relationship, an output node determines its output vector from the output vectors of the hidden nodes connected to it and the weight values between those hidden nodes and the output node, and passes the output vector to the outside.
Adjacent layers of a multilayer perceptron are fully connected: for any two adjacent layers, every neuron in the upper layer is connected to all neurons in the lower layer, and each connection between neurons of adjacent layers is configured with a weight.
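The fully connected, layer-by-layer forward pass described above can be sketched as follows; the layer sizes and weights are made-up illustrations.

```python
import math

def layer_forward(inputs, weights, biases):
    """One fully connected layer: every input neuron feeds every output
    neuron; each output applies a sigmoid activation."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(ws, inputs)) + b)))
        for ws, b in zip(weights, biases)
    ]

def mlp_forward(x, layers):
    """Pass x through the hidden layer(s) and then the output layer."""
    for weights, biases in layers:
        x = layer_forward(x, weights, biases)
    return x

# 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0]),  # hidden layer weights/biases
    ([[1.0, -1.0]], [0.0]),                   # output layer weights/biases
]
out = mlp_forward([1.0, 1.0], layers)
```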
2) Multi-task network model
A multi-task network model is the network model adopted in MTL mode. Compared with the single-task network model used in STL mode, a multi-task network model can execute multiple tasks. A multi-task network model may include multiple sub-network models; for example, if it includes N sub-network models, each sub-network model can be regarded as a neural network model as introduced in point 1) above. The N sub-network models included in the multi-task network model can be divided into backbone network models and functional network models. A functional network model can be responsible for a task; multiple functional network models can be responsible for multiple tasks, which may be correlated, and the multiple functional network models can share the backbone network model.
A backbone network model can be used to extract features. For example, network models such as the residual network (ResNet), the visual geometry group (VGG) network, MobileNet, the Google inception network (GoogLeNet), or AlexNet all have feature-extraction capability and can therefore serve as backbone network models; a functional network model can be responsible for other functions. As shown in FIG. 4a, the N sub-network models include Y backbone network models and (N−Y) functional network models, where Y is a positive integer and 1 <= Y <= N. For example, when Y is 1, the N sub-network models include one backbone network model and (N−1) functional network models. A multi-task network model may also contain only one model type, the backbone network model; in that case the multi-task network model can be considered to include one sub-network model, or to be unsplittable into multiple sub-network models, and this single backbone network model executes multiple related tasks.
CNNs are widely applied in computer vision; for example, tasks such as detection, tracking, recognition, classification, and prediction can all be solved by building corresponding CNN-based network models. The following uses several tasks of CNN multi-task model applications to illustrate the multi-task model by example.
For example, as shown in FIG. 4b, a multi-task network model is used for an image classification and segmentation application, specifically the mask region-based convolutional network (Mask-RCNN). This multi-task network model includes 5 sub-network models: 1 backbone network model and 4 functional network models, namely ResNet, FPN, RPN, Classifier-NET, and Mask-NET. ResNet is the backbone network model and serves as the feature extractor; FPN, RPN, Classifier-NET, and Mask-NET are the functional network models. FPN extends the backbone network and can better represent targets at multiple scales; RPN determines regions of interest; Classifier-NET classifies targets; Mask-NET segments targets.
As another example, as shown in FIG. 4c, the multi-task network model is the image classification and segmentation application Mask-RCNN and includes 6 sub-network models: 2 backbone network models and 4 functional network models. The 2 backbone network models are a first ResNet and a second ResNet, and the 4 functional network models are FPN, RPN, Classifier-NET, and Mask-NET. The first ResNet performs feature extraction, and the second ResNet further extracts features from the result of the first ResNet's feature extraction. FPN extends the backbone network and can better represent targets at multiple scales; RPN determines regions of interest; Classifier-NET classifies targets; Mask-NET segments targets.
As yet another example, as shown in FIG. 4d, a multi-task network model is applied to computer-vision tasks, where object detection and semantic segmentation are two closely related tasks, both aiming to recognize and classify objects in an image. The two tasks can be executed by two functional network models respectively: the detect functional network model performs the object detection task, and the segment functional network model performs the semantic segmentation task. The backbone network model may include a ResNet50 model, multiple CNN layers, and residual connections. In FIG. 4d, the rectangular bars can represent intermediate features; the horizontal lines without arrows at the top are residual connections, and the multiple intermediate features from feature extraction all need to be delivered by residual connections to the diamond at the back end for computation. The diamond represents an operation that can fuse intermediate features of different scales, located at different positions of the backbone network, through a series of sampling operations and convolution operations to obtain new features. The lines with arrows at the top represent outputting the processing result of the operation represented by the diamond to the detect functional network model and the segment functional network model.
可以看出,CNN多任务网络模型中主干网络模型结构复杂,参数量大。如果将多任务网络模型应用到协作运行的场景中(例如端云协作),需要多任务网络模型的一部分在协作运行的两端中的一端运行,另一部分在协作运行的两端中的另一端运行。可以理解为需要将多任务网络模型进行切割,以形成两个部分。但是基于图4d可以看出CNN多任务网络模型的主干网络模型结构复杂,没有比较清晰的切割点能够将多任务网络模型分成两个部分。另一方面从图4d可以看出CNN多任务网络模型的参数量大,协作运行的场景还需要由一端向另一端传输中间特征,一般情况下,需要传输的中间特征的维度不仅较大而且存在冗余。综上,如何实现协作运行多任务网络模型是需要解决的问题。
基于此,本申请实施例提供一种基于多任务网络模型的通信方法,以期实现协作运行多任务网络模型。如图5所示,本申请实施例提供的基于多任务网络模型的通信方法的具体流程如下所述。其中,多任务网络模型可以包括第一主干网络模型、第二主干网络模型和功能网络模型。第一主干网络模型运行于第一通信装置,第二主干网络模型和功能网络模型运行于第二通信装置。可以理解的是,第一主干网络模型和第二主干网络模型可以是同一种模型类型,例如第一主干网络模型和第二主干网络模型为ResNet。第一主干网络模型和第二主干网络模型可以是一个主干网络模型的两个部分,也可以认为第一主干网络模型和第二主干网络模型为两个独立的主干网络模型。第一主干网络模型和第二主干网络模型的数量可以为单个也可以为多个。
S501.第一通信装置利用第一主干网络模型对输入信号进行处理,获得融合特征。
其中,融合特征为多个第一特征融合得到的,该多个第一特征为对输入信号进行特征提取获得的。
S502.第一通信装置对融合特征进行压缩和信道编码,得到第一信息。
S503.第一通信装置向第二通信装置发送第一信息。
S504.第二通信装置接收第一通信装置的第二信息。
可以理解的是,第一信息在信道传输中会受到信道噪声的影响,第二信息为第一信息受到噪声影响之后的信息。在没有信道噪声的影响的理想情况下,第二信息与第一信息相同。
S505.第二通信装置对第二信息进行解压和信道译码,得到融合特征的重构特征。
融合特征为对输入信号进行特征提取获得的多个第一特征融合得到的。
S506.第二通信装置利用第二主干网络模型对重构特征进行特征解析,得到特征解析的结果。
S507.第二通信装置利用功能网络模型处理特征解析的结果。
多任务网络模型包括第一主干网络模型、第二主干网络模型和功能网络模型,第一通信装置和第二通信装置协作运行多任务网络模型,输入信号输入到多任务网络模型之后,最终输出功能网络模型处理特征解析的结果之后获得的处理结果。
通过对输入信号提取的多个第一特征进行融合,得到的融合特征能够包含更多的信息,能够使得第二通信装置在基于融合特征进行另一部分的网络模型处理时更加准确。在特征提取阶段生成融合特征,能够使得多任务网络模型的结构更加清晰,更加有利于将多任务网络模型被划分为第一通信装置执行的部分和第二通信装置执行的部分,更利于实现MTL模式下的端云协作。并且通过压缩使得在第一通信装置和第二通信装置之间传输的参数较少,降低传输开销,通过信道编码能够使得第一通信装置和第二通信装置之间传输的数据更具有抗噪性能。
以下对图5实施例的一些可选实现方式进行说明。
基于图5实施例,下面通过图6来示意基于多任务网络模型的通信方法的过程。如图6所示,第一通信装置和第二通信装置协作运行多任务网络模型。第一通信装置利用第一主干网络模型对输入信号进行处理,输入信号可以是图像,得到融合特征F,融合特征F是一个三维张量。第一通信装置对融合特征F进行压缩和信道编码,得到第一信息D,第一信息D的维度比融合特征F的维度小,即D的元素个数小于F的元素个数。第一通信装置发送第一信息D,第一信息D通过无线信道传输,第二通信装置接收D经过信道干扰后的第二信息D’。第二通信装置对第二信息D’进行解压和信道译码,得到F的重构特征F’。第二通信装置利用第二主干网络模型对重构特征F’进行特征解析,得到特征解析的结果。第二通信装置将特征解析的结果输入到多个功能网络模型进行处理,多个功能网络模型共享该特征解析的结果。例如,图6中所示的N个功能网络模型分别对应N个任务。第二通信装置利用N个功能网络模型分别处理特征解析的结果,得到N个任务的处理结果。
本申请实施例中,多任务网络模型可执行的任务可以是模型训练也可以是模型推理。
以下对S501的可选实现方式进行说明。第一通信装置利用第一主干网络模型对输入信号进行处理获得融合特征。第一通信装置对输入信号进行特征提取,得到多个第二特征,该多个第二特征具有不同的特征维度。第一通信装置处理该多个第二特征的特征维度,得到具有相同特征维度的多个第一特征,多个第一特征分别与多个第二特征对应,即处理一个第二特征的特征维度可以得到与该第二特征对应的第一特征。第一通信装置将该多个第一特征进行特征融合,得到融合特征。
特征维度可以包括高度、宽度和通道数。本申请实施例中第一特征、第二特征、第三特征等命名的特征可以是指中间特征,中间特征为多任务网络模型的中间处理过程中某个层获得的处理数据。
将多个第二特征进行处理得到多个具有相同维度的第一特征,其中,处理过程可以是,对多个第二特征分别进行卷积操作和上采样操作,这里的卷积操作记为第一卷积操作。第一卷积操作和上采样操作的顺序不作限定。第一卷积操作可以改变特征的通道数,上采样的操作可以改变特征的高和宽。上采样的操作可以将特征的高度和宽度改为任意值,不限为整数倍,例如,可以扩展为原来的高度的8倍及以上,扩展倍数为任意倍数。常规操作中,使用反卷积操作来改变特征的高度和宽度,仅能够实现扩展倍数为2倍,并且扩展倍数必须为整数。相比常规操作,上采样的操作扩展维度的倍数更加灵活。
其中,上采样(upsampled)操作也可以称为插值操作,目的是放大特征的高度和/或宽度。上采样操作或插值操作的过程可以为:将输入特征进行改比例(rescale)到目标尺寸,计算每个采样点的特征值,使用如双线性插值(bilinear-interpolation)等插值方法对其余点进行插值。插值就是在周围相邻特征值的基础上用数学公式计算缺失的值,并插入该计算所得的缺失的值。
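作为理解上述插值过程的示意,以下给出一个双线性插值上采样的Python片段。该片段仅为示例性草图:函数名、输入特征以及采样点映射回输入坐标的方式(类似align_corners的对齐方式)均为本文说明所作的假设,并非本申请实施例的实际实现。

```python
# 双线性插值上采样的示意:采样点映射回输入坐标后,在周围相邻特征值的基础上计算缺失的值。
def bilinear_upsample(feat, out_h, out_w):
    """将二维特征图放大到目标高宽,目标值可为任意正整数,不限整数倍。"""
    in_h, in_w = len(feat), len(feat[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, in_h - 1), min(x0 + 1, in_w - 1)
            dy, dx = y - y0, x - x0
            # 由四个相邻特征值按距离加权求和,得到采样点的插值结果
            out[i][j] = (feat[y0][x0] * (1 - dy) * (1 - dx)
                         + feat[y0][x1] * (1 - dy) * dx
                         + feat[y1][x0] * dy * (1 - dx)
                         + feat[y1][x1] * dy * dx)
    return out

small = [[0.0, 2.0], [4.0, 6.0]]
big = bilinear_upsample(small, 3, 3)  # 2×2 → 3×3,扩展倍数为1.5,非整数倍
```

可以看出,上采样的目标高宽可以是任意正整数,这与正文所述"扩展倍数为任意倍数"的灵活性是一致的。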
将多个第一特征进行特征融合得到融合特征,可以对多个第一特征进行相加的运算,得到融合特征。或者对多个第一特征进行相加,得到第三特征,再对第三特征进行第二卷积操作得到融合特征。其中,对多个第一特征进行相加,可以对多个第一特征相同位置上的元素分别相加。中间特征可以是具有高度、宽度和通道数的三维数据,多个第一特征相同的位置,可以是指多个第一特征的同一高度、同一宽度和同一通道的位置。
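上述"对多个第一特征相同位置上的元素分别相加"的融合方式,可以用如下Python片段示意。其中张量以嵌套列表[通道][高][宽]表示,特征的维度与取值均为示例假设。

```python
# 将多个具有相同特征维度的第一特征逐元素相加,得到第三特征。
def elementwise_add(features):
    first = features[0]
    c_n, h_n, w_n = len(first), len(first[0]), len(first[0][0])
    fused = [[[0.0] * w_n for _ in range(h_n)] for _ in range(c_n)]
    for feat in features:
        for c in range(c_n):
            for h in range(h_n):
                for w in range(w_n):
                    # 同一通道、同一高度、同一宽度位置上的元素分别相加
                    fused[c][h][w] += feat[c][h][w]
    return fused

f1 = [[[1.0, 2.0], [3.0, 4.0]]]  # 维度为(1, 2, 2)的第一特征
f2 = [[[0.5, 0.5], [0.5, 0.5]]]
f3 = elementwise_add([f1, f2])   # 第三特征,可再经第二卷积操作得到融合特征
```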
以下结合具体的应用场景对第一通信装置对多个第一特征进行融合得到融合特征的过程进行描述。
如图7所示,第一主干网络模型为Residual Network50(ResNet50),ResNet50为具有50个卷积层的残差网络。第一通信装置利用ResNet50对输入信号进行特征提取,得到具有不同特征维度的多个第二特征,不同特征维度可以是指不同的高度、宽度和通道数。多个第二特征记为Res1、Res2、Res3、Res4和Res5,Res1、Res2、Res3、Res4和Res5的特征维度分别记为(H1,W1,C1),(H2,W2,C2),(H3,W3,C3),(H4,W4,C4),(H5,W5,C5)。H表示高度,W表示宽度,C表示通道数。特征维度的大小关系为:H1>H2>H3>H4>H5,W1>W2>W3>W4>W5,C1<C2<C3<C4<C5,H1*W1*C1>H2*W2*C2>H3*W3*C3>H4*W4*C4>H5*W5*C5。第一通信装置分别对Res2、Res3、Res4和Res5进行1*1的卷积操作,改变Res2、Res3、Res4和Res5的特征通道数,并使用上采样的方法改变Res2、Res3、Res4和Res5的高度和宽度,最终将Res2、Res3、Res4和Res5统一为相同的特征维度,相同的特征维度记为(H,W,C),即将Res2、Res3、Res4和Res5统一为维度都是(H,W,C)的四组第一特征。需要说明的是,如果Res2、Res3、Res4和Res5中的某一个或多个特征的维度与最终统一的维度相同,则该一个或多个特征就不需要进行卷积操作和上采样,例如,假设Res2特征的维度与最终统一的维度相同,则Res2就不需要进行卷积操作和上采样,只需要对Res3、Res4和Res5进行卷积操作和上采样即可。第一通信装置对四组第一特征进行逐元素相加得到第三特征F0。第一通信装置对第三特征F0进行3*3的卷积操作,得到融合特征F1。对第三特征F0进行卷积操作的卷积核大小也可以为其它值,一般为奇数即可,卷积核的大小小于或等于F0的宽度和/或高度即可。卷积操作在图中用conv表示。
以下对S502的可选实现方式进行说明。第一通信装置对融合特征进行压缩和信道编码,得到第一信息。压缩和信道编码可以理解为一种信源信道联合编码(joint source-channel coding,JSCC),可以基于信源信道联合编码模型实现,信源信道联合编码模型是基于信道噪声进行训练的,利用信源信道联合编码模型处理的数据具有抗噪性能。第一通信装置将融合特征输入JSCC模型,输出第一信息。第一通信装置利用信源信道联合编码模型对融合特征进行降采样和第三卷积操作,得到第四特征。降采样(downsampled)的操作也可以称为下采样(subsampled)。降采样能够使得降采样后的特征符合显示区域的大小。降采样也能够生成被降采样的特征的缩略图。降采样的过程可以如下所述。对于一组高宽为M*N的特征,对其进行s倍下采样,即得到(M/s)*(N/s)尺寸的分辨率,s为M和N的公约数。如果被降采样的特征为矩阵形式的特征,可以把被降采样的原始特征s*s窗口内的特征值变成一个值,这个值就是窗口内所有特征值的均值。降采样可以将融合特征降低为任意的维度,即降采样操作可以将融合特征降低为任意的高度和任意的宽度,降采样之后的高度或宽度的值为正整数即可。例如,降采样可以降低融合特征的高度和宽度,融合特征的高度和宽度都为64,可以使用降采样将融合特征的高度和宽度都降为16,即实现了对特征的4*4=16倍的压缩。常规情况下,使用卷积操作来降低融合特征的高度和宽度,卷积操作降低维度的倍数受卷积核大小的影响,并且只能降低为原来维度的整数分之一,相比来说,使用降采样的方式降低融合特征的维度能够更加灵活。
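上述s倍降采样(每个s*s窗口内的特征值取均值)的过程可以用如下Python片段示意,并验证当64×64的特征降为16×16时,空间维度上实现了4*4=16倍的压缩。片段中的特征取值仅为示例假设。

```python
# s倍降采样:每个s×s窗口内的特征值取均值,(M, N) → (M/s, N/s)。
def downsample_avg(feat, s):
    m, n = len(feat), len(feat[0])
    assert m % s == 0 and n % s == 0, "s应为M和N的公约数"
    out = []
    for i in range(0, m, s):
        row = []
        for j in range(0, n, s):
            window = [feat[i + a][j + b] for a in range(s) for b in range(s)]
            row.append(sum(window) / (s * s))  # 窗口内所有特征值的均值
        out.append(row)
    return out

feat = [[float(i * 64 + j) for j in range(64)] for i in range(64)]
small = downsample_avg(feat, 4)                   # 64×64 → 16×16
ratio = (64 * 64) / (len(small) * len(small[0]))  # 空间维度上 4*4=16 倍压缩
```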
第三卷积操作能够降低融合特征的特征通道数。第三卷积操作可以是3*3的卷积操作。卷积操作能够根据具体的需要调整输出和输入通道数的比例,从而实现不同的压缩倍数。第三卷积操作的输入特征的通道数可以为任意正整数,第三卷积操作可以控制输出通道数从而达到降低特征通道数的目的。
第一通信装置还可以利用信源信道联合编码模型对第四特征执行以下一种或多种操作:广义除法归一化(generalized divisive normalization,GDN)、参数化的线性整流单元(parametric rectified linear unit,PReLU)、或功率归一化。其中,GDN操作可以用于提高信源信道联合编码模型的压缩能力。PReLU也可以用于提高信源信道联合编码模型的压缩能力。通过GDN操作和PReLU操作能够实现对第四特征的进一步压缩。功率归一化可以使得压缩后的结果的功率为1。
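上述功率归一化使压缩结果的平均功率为1,其计算可以用如下Python片段示意;其中按平均功率的平方根倒数进行整体缩放的具体形式为示例假设。

```python
# 功率归一化:对符号序列整体缩放,使平均功率 E[x^2] 为 1。
def power_normalize(x):
    power = sum(v * v for v in x) / len(x)  # 归一化前的平均功率
    scale = power ** -0.5
    return [v * scale for v in x]

y = power_normalize([3.0, -1.0, 2.0, -2.0])
avg_power = sum(v * v for v in y) / len(y)  # 归一化后平均功率约为1
```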
可选的,上述信源信道联合编码模型的操作仅为举例,实际应用中,可以用一些能够达到相同效果的其它操作来替换。例如,GDN操作可以用批归一化(batch normalization,BN)来替换。又例如,PReLU可以用线性整流单元(rectified linear unit,ReLU)或者固定参数的线性整流单元(Leaky ReLU)来替换。
以下对S505的可选实现方式进行说明。第二通信装置对第二信息进行解压和信道译码,得到融合特征的重构特征。
解压和信道译码可以理解为一种信源信道联合译码,可以基于信源信道联合译码模型实现。即将第二信息输入信源信道联合译码模型,输出融合特征的重构特征。第二通信装置利用信源信道联合译码模型对第二信息进行以下操作:第四卷积操作、上采样操作和第五卷积操作。通过上采样操作能够恢复特征的空间维度,由于第一通信装置在压缩编码时采用降采样操作降低融合特征的高度和宽度,降低倍数可以为任意值,此处第二通信装置采用对应的上采样操作,能够使用与降采样对应的倍数恢复特征的高度和宽度,上采样扩展维度的倍数也比较灵活。
举例来说,在第一通信装置侧,融合特征原始的特征通道数为64,通过第三卷积操作降低融合特征的特征通道数为1,第二通信装置获得第二信息的特征通道数为1,通过第四卷积操作将特征通道数恢复为8,通过第五卷积操作将特征通道数恢复为64。相比一次卷积操作,通过第四卷积操作和第五卷积操作两次卷积操作,能够使被压缩的融合特征被逐步恢复,提高特征恢复的能力,使得恢复后的重构特征的参数更多,使用更多的参数能够提高网络模型的解析准确度。
第二通信装置对第二信息进行解压和信道译码,还包括以下一项或多项操作:广义除法归一化的反归一化(IGDN)、参数化的线性整流单元(PReLU)、批归一化(BN)、或线性整流单元(ReLU)。其中,IGDN可以用于提高信源信道联合译码模型的解压能力或解码能力。PReLU也可以用于提高信源信道联合译码模型的解压能力或解码能力。BN和/或ReLU可以限制解码结果的值域,也可以进一步增加解码结果的准确度。
第二通信装置在解压和信道译码之后,得到融合特征的重构特征,重构特征与融合特征具有相同的维度大小,在编译码性能理想的情况下,重构特征即为融合特征。
可选的,上述信源信道联合译码模型的操作仅为举例,实际应用中,可以用一些能够达到相同效果的其它操作来替换。例如,IGDN可以用BN来代替。可以理解的是,第二通信装置执行的信源信道联合译码过程,与第一通信装置执行的信源信道联合编码过程,是相应的。即,如果在编码时使用GDN,那么在译码时就使用IGDN。如果在编码侧使用BN,在译码侧使用对应的BN。
又例如,PReLU可以用ReLU或者Leaky ReLU来替换。
以下基于图8进行举例示意说明信源信道联合编码和译码的步骤。第一通信装置在编码时对融合特征进行以下操作:降采样、3*3的卷积操作、GDN、PReLU、和功率归一化,得到第一信息,第一信息经过信道传输后,第二通信装置收到第一信息在信道传输后对应的第二信息。第二通信装置在译码时对第二信息进行以下操作:3*3的卷积操作、IGDN、PReLU、上采样、3*3的卷积操作、BN、和ReLU,得到融合特征的重构特征。
第二通信装置需要对重构特征进行特征解析,得到特征解析的结果。特征解析的结果包括X个特征,X个特征的第1个特征为重构特征,X个特征中的第X_{i+1}个特征是由第X_i个特征经过运算得到的;X个特征中的前Y个特征是经过第一运算得到的,X个特征中的后(X-Y)个特征是经过第二运算得到的;其中,X、Y、i为正整数,i小于或等于X,Y小于或等于X。
X个特征中不同特征的高度和宽度不同,通道数相同。例如,第X_{i+1}个特征的高度为所述第X_i个特征的高度的1/2,第X_{i+1}个特征的宽度为第X_i个特征的宽度的1/2,第X_{i+1}个特征的通道数与第X_i个特征的通道数相同。
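上述维度递减规则可以用如下Python片段示意:给定第1个特征(即重构特征)的维度(H,W,C),依次得到X个特征的维度序列。其中H、W、C的具体取值为示例假设。

```python
# 按上述规则生成X个特征的维度序列:高、宽逐个减半,通道数保持不变。
def parsed_dims(h, w, c, x):
    dims = [(h, w, c)]  # 第1个特征为重构特征的维度
    for _ in range(x - 1):
        ph, pw, pc = dims[-1]
        dims.append((ph // 2, pw // 2, pc))
    return dims

dims = parsed_dims(64, 64, 256, 7)
# dims[0]为(64, 64, 256),dims[6]为(1, 1, 256)
```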
第一运算中卷积操作具有多种感受野,第二运算中的卷积操作具有一种感受野。将不同感受野的卷积结果融合在一起,是一种特征融合的手段,是从不同的角度提取不同的信息并融合,使得第一运算的结果所包含的信息比第二运算的结果包含的信息更多,更有助于功能网络模型的性能提升。
可选的,在第二运算中具有第一卷积核的卷积操作具有第一感受野,在第一运算中具有相同的第一卷积核的卷积操作具有两种感受野,这两种感受野包括第一感受野和第二感受野,第二感受野大于第一感受野。
第二运算可以是瓶颈模块(Bottleneck),第一运算可以是扩张的瓶颈模块(Dilated Bottleneck)。
可选的,如图9a所示,第一运算可以包括以下操作:对前Y个特征中的待处理特征进行1×1的卷积操作(conv);对1×1的卷积操作的结果分别进行多个3×3的卷积操作,多个3×3的卷积操作的感受野大小不同;对多个3×3的卷积操作的结果进行通道数维度的拼接(concat);对通道数维度的拼接结果进行1×1的卷积操作,获得第一卷积结果;将第一卷积结果与待处理特征进行逐元素相加。其中,前Y个特征中的待处理特征是前Y个特征中的任意一个特征,对前Y个特征中的每一个特征都可以执行这样的操作。可选的,在第一次1×1的卷积操作后还可以进行BN和/或ReLU,在进行多个3×3的卷积操作之后,还可以进行BN和/或ReLU。在第二次1×1的卷积操作后还可以进行BN。将第一卷积结果与待处理特征进行逐元素相加之后,还可以进行ReLU。
如图9b所示,第二运算可以包括以下操作:对X个特征中的后(X-Y)个特征中的待处理特征进行1×1的卷积操作;对1×1的卷积操作的结果分别进行一个3×3的卷积操作;对3×3的卷积操作的结果进行1×1的卷积操作,获得第二卷积结果;将第二卷积结果与该后(X-Y)个特征中的待处理特征进行逐元素相加。其中,X个特征中的后(X-Y)个特征中的待处理特征是该后(X-Y)个特征中的任意一个特征,对该后(X-Y)个特征中的每一个特征都可以执行这样的操作。可选的,在第一次1×1的卷积操作后还可以进行BN和/或ReLU,在进行3×3的卷积操作之后还可以进行BN和/或ReLU。在第二次1×1的卷积操作后还可以进行BN。将第二卷积结果与待处理特征进行逐元素相加之后,还可以进行ReLU。
以下结合具体的应用场景对第二通信装置对重构特征进行特征解析的过程进行描述。
如图10所示,重构特征用D1表示,第二通信装置利用第二主干网络模型对重构特征D1进行特征解析,得到具有不同特征维度的X个特征,X=7。7个特征包括D1,还包括D2、D3、D4、D5、D6、和D7。D1的特征维度与融合特征的特征维度相同,例如记为(H,W,C)。H表示高度,W表示宽度,C表示通道数。假设第X_{i+1}个特征的高度为第X_i个特征的高度的1/2,第X_{i+1}个特征的宽度为第X_i个特征的宽度的1/2,第X_{i+1}个特征的通道数与第X_i个特征的通道数相同。D1~D7的特征维度分别为(H,W,C)、(H/2,W/2,C)、(H/4,W/4,C)、(H/8,W/8,C)、(H/16,W/16,C)、(H/32,W/32,C)、(H/64,W/64,C)。每个特征是基于前一个特征生成的。由于特征的高度和宽度越来越小,X个特征中前面的特征的高度和宽度要比后面特征的大,可以是X个特征中的前Y个特征是通过第一运算得到的,X个特征中的后(X-Y)个特征是通过第二运算得到的。例如图10中,第一运算用菱形表示,第二运算用圆形表示。7个特征中前3个是通过第一运算得到的,后4个特征是通过第二运算得到的。7个特征D1~D7输出到功能网络模型中,供多任务的多个功能网络模型共享,图10中以两个任务(任务1和任务2)进行举例。
本申请实施例中,多任务网络模型在应用之前需要进行模型训练,信源信道联合编码模型和信源信道联合译码模型在应用之前也需要进行模型训练。在一种可选的实现方式中,可以认为多任务网络模型可以包括信源信道联合编码模型和信源信道联合译码模型。当然,也可以认为多任务网络模型与信源信道联合编码模型和信源信道联合译码模型是相互独立的模型。一般情况下,信源信道联合编码模型和信源信道联合译码模型在训练时是结合在一起进行训练的。以下对模型训练的可能实现过程进行说明。
步骤1:生成一个基本的多任务网络模型,该多任务网络模型可以适用于协作运行场景中,例如可以用于端云协作的场景中。协作运行的两个装置仍然可以用第一通信装置和第二通信装置表述。
步骤2:初始化多任务网络模型的网络参数,输入的训练数据为标准化到[0,1]区间的图像像素值,经过特征提取和特征解析之后的特征,输入各个任务分支对应的功能网络模型中完成相应的任务,并输出结果。
各个任务分支的输出与训练数据中的标注信息计算损失,从而实现对多任务网络模型的端到端训练。多任务损失函数L_MTL定义为:L_MTL = L_Task1 + L_Task2 + … + L_TaskN,其中Task1、Task2、……、TaskN用于表征N个任务。多任务损失函数为各个任务分支的损失函数之和。
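多任务损失函数为各任务分支损失之和,可以用如下Python片段示意;各分支损失的数值仅为示例假设。

```python
# 多任务损失:L_MTL = L_Task1 + L_Task2 + ... + L_TaskN。
def multi_task_loss(task_losses):
    return sum(task_losses)

l_mtl = multi_task_loss([0.7, 1.2, 0.1])  # 三个任务分支的损失之和
```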
重复步骤2直到多任务网络模型收敛。
步骤3:基于已经收敛的多任务网络模型,选择切割点将网络模型分成两部分,并加入信源信道联合编码模型和信源信道联合译码模型模拟中间特征的压缩、传输、解压和重构的过程。
初始化信源信道联合编码模型和信源信道联合译码模型的参数,信源信道联合编码模型对中间特征进行压缩后的压缩结果通过信道模型,信道模型如AWGN信道、Rayleigh信道等,模拟信道中的传输过程。
固定已训练好的多任务网络模型参数,只对新加入的信源信道联合编码模型和信源信道联合译码模型进行训练,损失函数为:L_MTL + L1(F,F’),其中L_MTL为多任务损失函数,L1(F,F’)为原始中间特征F和重构后的中间特征F’的L1-范数。
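单独训练编解码模型时的损失函数L_MTL + L1(F,F')可以用如下Python片段示意,其中L1-范数按逐元素绝对差之和计算,特征以展平后的一维列表表示,各数值均为示例假设。

```python
# 单独训练编解码模型阶段的损失:L_MTL + L1(F, F')。
def l1_norm(f, f_rec):
    """原始中间特征F与重构特征F'的L1-范数(逐元素绝对差之和)。"""
    return sum(abs(a - b) for a, b in zip(f, f_rec))

def codec_stage_loss(l_mtl, f, f_rec):
    return l_mtl + l1_norm(f, f_rec)

loss = codec_stage_loss(2.0, [1.0, -2.0, 3.0], [1.5, -2.0, 2.0])  # 2.0 + 1.5 = 3.5
```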
重复步骤3直至信源信道联合编码模型和信源信道联合译码模型收敛。
步骤4:基于步骤3的训练结果,不再固定多任务网络模型参数,对已添加信源信道联合编码模型和信源信道联合译码模型的多任务网络模型的全部参数进行端到端的联合训练,使用的损失函数为L_MTL,重复步骤4直到整体模型收敛。整体模型为多任务网络模型、信源信道联合编码模型和信源信道联合译码模型。信源信道联合编码模型和信源信道联合译码模型可以简述为编解码模型。
上述训练模型的方法,对整体模型分步训练,首先训练多任务网络模型作为整体框架的基础模型,然后再单独训练信源信道联合编码模型和信源信道联合译码模型,得到具有一定压缩能力的编解码模型,最后对整体模型进行端到端训练,使多任务网络模型、信源信道联合编码模型和信源信道联合译码模型,耦合更加紧密,并进一步提升整体性能。使用多任务损失函数和中间特征压缩重构前后的L1-范数之和作为单独训练编解码模型的损失函数,使得编解码模型在保证压缩能力的情况下提升系统性能。
以下通过表格来示意本申请实施例提供的多任务网络模型与其他网络模型相比的性能改进。
如表1所示,本申请实施例提供的多任务网络模型用基于特征融合的多任务网络(feature fusion based multi-task network,FFMNet)表示,其他一种多任务网络模型为BlitzNet。mAP为平均准确度均值,是目标检测分支的准确度衡量指标;mIoU为平均交并比,是语义分割分支的准确度衡量指标,Param为模型参数量指标,数量级为百万(million,M)。可以看出,FFMNet相比BlitzNet性能更高,参数更少。
表1
网络模型 mAP(%) mIoU(%) Param
FFMNet 40.8 44.6 63.2M
BlitzNet 40.1 44.1 87.8M
如表2所示,与单任务网络模型相比,本申请实施例提供的多任务网络模型FFMNet的性能更高。表2示出了多个版本的FFMNet。FFMNet 1为具有目标检测和语义分割功能的网络模型。FFMNet 2只有一个功能网络模型,即为单任务网络模型,且该功能网络模型为目标检测的功能网络模型。FFMNet 3同样只有一个功能网络模型,即为单任务网络模型,且该功能网络模型为语义分割的功能网络模型。
表2
(表2的具体数值在原文中以图像形式给出,此处未能以文本形式再现。)
表2中,√表示该FFMNet具有该功能网络模型,-表示该网络模型无法测试对应的指标。
在一个实施例中,多任务网络模型与编解码模型联合,实现了对中间特征的高倍压缩,如表3所示,可以在无噪声干扰的情况下训练,编解码模型能够达到对中间特征的1024倍压缩,并且将两个任务分支(例如目标检测和语义分割两个任务)的性能损失都控制在2%以内。目标检测和语义分割两个任务分别对应两个子网络模型或两个功能网络模型。
表3
(表3的具体数值在原文中以图像形式给出,其各行内容在下文的说明中有所描述。)
表3中,第一行为原始特征维度(H,W,C),即没有经过压缩解压过程,因此压缩比例为1,然后对应的两个子网络模型的性能为40.8/44.6。第二行为原始特征经过压缩解压所得结果,(H/2,W/2,C/32)表示编码结果的大小,即高度和宽度为原特征的1/2,通道数为原特征的1/32,因此压缩比例为2*2*32=128,在接收端解码之后进行特征解析和执行功能子网络之后的结果为40.2/43.6。第三行为原始特征经过压缩解压所得结果,(H/2,W/2,C/64)表示编码结果的大小,即高度和宽度为原特征的1/2,通道数为原特征的1/64,因此压缩比例为2*2*64=256,在接收端解码之后进行特征解析和执行功能子网络之后的结果为39.7/43.1。第四行和第五行可以参照上述第二行的解释,在此不再赘述。
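表3中压缩比例的计算方式(高、宽、通道三个维度压缩倍数的乘积)可以用如下Python片段示意。需要说明的是,其中后两行对应的通道压缩倍数(1/128与1/256)是按正文提到的512倍与1024倍总压缩比例反推的示例假设。

```python
# 压缩比例 = 高度压缩倍数 × 宽度压缩倍数 × 通道压缩倍数。
def compression_ratio(h_div, w_div, c_div):
    return h_div * w_div * c_div

# 表3各行对应的(高, 宽, 通道)压缩倍数;后两行为按总压缩比例512与1024反推的假设
rows = [(1, 1, 1), (2, 2, 32), (2, 2, 64), (2, 2, 128), (2, 2, 256)]
ratios = [compression_ratio(*r) for r in rows]  # [1, 128, 256, 512, 1024]
```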
可以看出,1024倍压缩解压导致与未压缩的性能差距较大,实际应用中,还可以采用压缩倍数为512倍,达到压缩效果与功能网络性能的平衡。
通过分步训练模型,在固定压缩倍数为512倍,训练过程中引入AWGN噪声,最终可以得到一个高压缩倍数且有一定抗噪能力的面向多任务网络的信源信道联合编译码模型。
相比传统的联合图像专家小组(joint photographic experts group,JPEG)结合正交幅度调制(Quadrature Amplitude Modulation,QAM)对中间特征进行压缩的方法,本申请实施例提供的信源信道联合编码模型具有更高的压缩倍数并克服了传统分离方法的悬崖效应。其中,JPEG是一种针对照片影像而广泛使用的有损压缩标准方法。这里可以将融合特征看作是一种具有多个通道的图像,因此可以使用JPEG中的压缩算法对特征进行信源编码。QAM是一种在两个正交载波上进行幅度调制的调制方式。这两个载波通常是相位差为90度(π/2)的正弦波,因此被称作正交载波。这里是用于信道保护和调制。
如图11a和图11b所示,为传统分离方法和本申请实施例提供的信源信道联合编码JSCC模型的性能对比图。图11a和图11b中,质量分数为控制JPEG算法压缩能力的参数,质量分数值越小代表压缩能力越大。码率代表信源编码比特(bit)数与经过信道编码之后的bit数之比。QAM代表经过信源编码和信道编码之后的bit流每xx个bit被调制为一个符号,得到最终输入到信道的信号。分离方法的压缩率就是压缩之前的融合特征的符号数即特征维度与经过压缩、信道编码、调制之后的符号数之比。
本申请实施例提供的信源信道联合编码模型可以在保证识别率情况下,做到512倍或1024倍压缩,同时有一定的抗噪能力。未来通过部署到端云两侧,可以降低端侧的存储、计算和传输开销,同时抵抗信道噪声,保证传输鲁棒性。
需要说明的是,本申请中的各个应用场景中的举例仅仅表现了一些可能的实现方式,是为了对本申请的方法更好的理解和说明。本领域技术人员可以根据本申请提供的基于多任务网络模型的通信方法,得到一些演变形式的举例。
上述对本申请实施例提供的方法进行了介绍。为了实现上述本申请实施例提供的方法中的各功能,通信装置可以包括硬件结构和/或软件模块,以硬件结构、软件模块、或硬件结构加软件模块的形式来实现上述各功能。上述各功能中的某个功能以硬件结构、软件模块、还是硬件结构加软件模块的方式来执行,取决于技术方案的特定应用和设计约束条件。
如图12所示,基于同一技术构思,本申请实施例还提供了一种通信装置1200,该通信装置1200可以是通信装置,也可以是通信装置中的装置,或者是能够和通信装置匹配使用的装置。通信装置1200可以是终端设备或网络设备。一种设计中,该通信装置1200可以包括执行上述方法实施例中第一通信装置或第二通信装置执行的方法/操作/步骤/动作所一一对应的模块,该模块可以是硬件电路,也可是软件,也可以是硬件电路结合软件实现。一种设计中,该装置可以包括处理模块1201和收发模块1202。处理模块1201用于调用收发模块1202执行接收和/或发送的功能。
当该通信装置1200用于执行第一通信装置的方法时,其中:
处理模块1201,用于利用多任务网络模型中的第一主干网络模型对输入信号进行处理,获得融合特征,融合特征为多个第一特征融合得到的,多个第一特征为对输入信号进行特征提取获得的;以及用于对融合特征进行压缩和信道编码,得到第一信息;
收发模块1202,用于向第二通信装置发送第一信息。
收发模块1202还用于执行上述方法实施例中第一通信装置执行的接收或发送信号相关的操作,处理模块1201还用于执行上述方法实施例中第一通信装置执行的除收发信号之外的其它操作,在此不再一一赘述。第一通信装置可以是终端设备,也可以是网络设备。
当该通信装置1200用于执行第二通信装置的方法时,其中:
收发模块1202,用于接收第一通信装置的第二信息;
处理模块1201,用于对第二信息进行解压和信道译码,得到融合特征的重构特征;融合特征为对输入信号进行特征提取获得的多个第一特征融合得到的;以及利用多任务网络模型中的第二主干网络模型对重构特征进行特征解析,得到特征解析的结果;以及利用多任务网络模型中的功能网络模型处理特征解析的结果。
收发模块1202还用于执行上述方法实施例中第二通信装置执行的接收或发送信号相关的操作,处理模块1201还用于执行上述方法实施例中第二通信装置执行的除收发信号之外的其它操作,在此不再一一赘述。第二通信装置可以是终端设备,也可以是网络设备。
本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,另外,在本申请各个实施例中的各功能模块可以集成在一个处理器中,也可以是单独物理存在,也可以两个或两个以上模块集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
如图13所示为本申请实施例提供的通信装置1300,用于实现上述方法中通信装置的功能。通信装置可以是第一通信装置,也可以是第二通信装置。当实现第一通信装置的功能时,该装置可以是第一通信装置,也可以是第一通信装置中的装置,或者是能够和第一通信装置匹配使用的装置。当实现第二通信装置的功能时,该装置可以是第二通信装置,也可以是第二通信装置中的装置,或者是能够和第二通信装置匹配使用的装置。其中,该通信装置1300可以为芯片系统。本申请实施例中,芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。通信装置1300包括至少一个处理器1320,用于实现本申请实施例提供的方法中第一通信装置或第二通信装置的功能。通信装置1300还可以包括通信接口1310。在本申请实施例中,通信接口可以是收发器、电路、总线、模块或其它类型的通信接口,用于通过传输介质和其它装置进行通信。例如,通信接口1310用于通信装置1300中的装置可以和其它装置进行通信,例如,通信装置1300是第一通信装置时,其它装置可以是第二通信装置;又例如,通信装置1300是第二通信装置时,其它装置可以是第一通信装置;又例如,通信装置1300是芯片时,其它装置可以是通信设备中其他芯片或器件。处理器1320利用通信接口1310收发数据,并用于实现上述方法实施例所述的方法。
示例性地,当该通信装置1300用于执行第一通信装置的方法时,其中:
处理器1320,用于利用第一主干网络模型对输入信号进行处理,获得融合特征,融合特征为多个第一特征融合得到的,该多个第一特征为对输入信号进行特征提取获得的;处理器1320还用于对融合特征进行压缩和信道编码,得到第一信息;通信接口1310用于向第二通信装置发送第一信息。
可选的,在利用第一主干网络模型对输入信号进行处理,获得融合特征时,处理器1320用于:对输入信号进行特征提取,得到多个第二特征,该多个第二特征具有不同的特征维度;处理该多个第二特征的特征维度,得到具有相同特征维度的该多个第一特征;将该多个第一特征进行特征融合,得到融合特征。
可选的,在处理该多个第二特征的特征维度,得到具有相同特征维度的多个第一特征时,处理器1320用于:对该多个第二特征进行第一卷积操作和上采样操作,得到具有相同特征维度的多个第一特征。
可选的,在将该多个第一特征进行特征融合,得到融合特征时,处理器1320用于:对该多个第一特征相加,得到第三特征。
可选的,处理器1320还用于:对第三特征进行第二卷积操作,得到融合特征。
可选的,在对融合特征进行压缩和信道保护处理时,处理器1320用于:利用信源信道联合编码模型对融合特征进行降采样和第三卷积操作,得到第四特征;信源信道联合编码模型是基于信道噪声进行训练的。
可选的,在对融合特征进行压缩和信道保护处理时,处理器1320还用于:利用信源信道联合编码模型对第四特征执行以下一种或多种操作:广义除法归一化、参数化的线性整流单元、或功率归一化。
当该通信装置1300用于执行第二通信装置的方法时,其中:
通信接口1310,用于接收第一通信装置的第二信息;处理器1320,用于对第二信息进行解压和信道译码,得到融合特征的重构特征;融合特征为对输入信号进行特征提取获得的多个第一特征融合得到的;以及利用多任务网络模型中的第二主干网络模型对重构特征进行特征解析,得到特征解析的结果;以及利用多任务网络模型中的功能网络模型处理特征解析的结果。
在一个可能的设计中,特征解析的结果包括X个特征,该X个特征的第1个特征为重构特征,该X个特征中的第X_{i+1}个特征是由第X_i个特征经过运算得到的;该X个特征中的前Y个特征是经过第一运算得到的,该X个特征中的后(X-Y)个特征是经过第二运算得到的;其中,X、Y、i为正整数,i小于或等于X,Y小于或等于X;第一运算中卷积操作具有多种感受野,第二运算中的卷积操作具有一种感受野。
在一个可能的设计中,第X_{i+1}个特征的高度为第X_i个特征的高度的1/2;第X_{i+1}个特征的宽度为第X_i个特征的宽度的1/2;第X_{i+1}个特征的通道数与第X_i个特征的通道数相同。
在一个可能的设计中,在对第二信息进行解压和信道译码时,处理器1320用于:利用信源信道联合译码模型对第二信息进行以下操作:第四卷积操作、上采样操作和第五卷积操作。
在一个可能的设计中,在对第二信息进行解压和信道译码时,处理器1320还用于执行以下一项或多项操作:广义除法归一化的反归一化、参数化的线性整流单元、批归一化、或线性整流单元。
处理器1320和通信接口1310还可以用于执行上述方法实施例第一通信装置或第二通信装置执行的其它对应的步骤或操作,在此不再一一赘述。
通信装置1300还可以包括至少一个存储器1330,用于存储程序指令和/或数据。存储器1330和处理器1320耦合。本申请实施例中的耦合是装置、单元或模块之间的间接耦合或通信连接,可以是电性、机械或其它的形式,用于装置、单元或模块之间的信息交互。处理器1320可能和存储器1330协同操作。处理器1320可能执行存储器1330中存储的程序指令。所述至少一个存储器可以与处理器集成在一起。
本申请实施例中不限定上述通信接口1310、处理器1320以及存储器1330之间的具体连接介质。本申请实施例在图13中以存储器1330、处理器1320以及通信接口1310之间通过总线1340连接,总线在图13中以粗线表示,其它部件之间的连接方式,仅是进行示意性说明,并不以此为限。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图13中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在本申请实施例中,处理器1320可以是通用处理器、数字信号处理器、专用集成电路、现场可编程门阵列或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。
在本申请实施例中,存储器1330可以是非易失性存储器,比如硬盘(hard disk drive,HDD)或固态硬盘(solid-state drive,SSD)等,还可以是易失性存储器(volatile memory),例如随机存取存储器(random-access memory,RAM)。存储器是能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。本申请实施例中的存储器还可以是电路或者其它任意能够实现存储功能的装置,用于存储程序指令和/或数据。
基于与方法实施例同一技术构思,如图14所示,本申请实施例还提供一种通信装置1400,该通信装置1400用于执行上述基于多任务网络模型的通信方法中第一通信装置或第二通信装置执行的操作。其中,该通信装置1400可以为芯片系统。本申请实施例中,芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。上述实施例的基于多任务网络模型的通信方法中的部分或全部可以通过硬件来实现也可以通过软件来实现,当通过硬件实现时,通信装置1400包括:输入输出接口1401和逻辑电路1402。在本申请实施例中,输入输出接口1401可以是收发器、电路、总线、模块或其它类型的通信接口,用于通过传输介质和其它装置进行通信。例如,输入输出接口1401用于通信装置1400中的装置可以和其它装置进行通信,例如,通信装置1400是第一通信装置时,其它装置可以是第二通信装置;又例如,通信装置1400是第二通信装置时,其它装置可以是第一通信装置;又例如,通信装置1400是芯片时,其它装置可以是通信设备中其他芯片或器件。
示例性地,当该通信装置1400用于执行第一通信装置的方法时,其中:
逻辑电路1402,用于利用第一主干网络模型对输入信号进行处理,获得融合特征,融合特征为多个第一特征融合得到的,该多个第一特征为对输入信号进行特征提取获得的;逻辑电路1402还用于对融合特征进行压缩和信道编码,得到第一信息;输入输出接口1401用于向第二通信装置发送第一信息。
可选的,在利用第一主干网络模型对输入信号进行处理,获得融合特征时,逻辑电路1402用于:对输入信号进行特征提取,得到多个第二特征,该多个第二特征具有不同的特征维度;处理该多个第二特征的特征维度,得到具有相同特征维度的该多个第一特征;将该多个第一特征进行特征融合,得到融合特征。
可选的,在处理该多个第二特征的特征维度,得到具有相同特征维度的多个第一特征时,逻辑电路1402用于:对该多个第二特征进行第一卷积操作和上采样操作,得到具有相同特征维度的多个第一特征。
可选的,在将该多个第一特征进行特征融合,得到融合特征时,逻辑电路1402用于:对该多个第一特征相加,得到第三特征。
可选的,逻辑电路1402还用于:对第三特征进行第二卷积操作,得到融合特征。
可选的,在对融合特征进行压缩和信道保护处理时,逻辑电路1402用于:利用信源信道联合编码模型对融合特征进行降采样和第三卷积操作,得到第四特征;信源信道联合编码模型是基于信道噪声进行训练的。
可选的,在对融合特征进行压缩和信道保护处理时,逻辑电路1402还用于:利用信源信道联合编码模型对第四特征执行以下一种或多种操作:广义除法归一化、参数化的线性整流单元、或功率归一化。
当该通信装置1400用于执行第二通信装置的方法时,其中:
输入输出接口1401,用于接收第一通信装置的第二信息;逻辑电路1402,用于对第二信息进行解压和信道译码,得到融合特征的重构特征;融合特征为对输入信号进行特征提取获得的多个第一特征融合得到的;以及利用多任务网络模型中的第二主干网络模型对重构特征进行特征解析,得到特征解析的结果;以及利用多任务网络模型中的功能网络模型处理特征解析的结果。
在一个可能的设计中,特征解析的结果包括X个特征,该X个特征的第1个特征为重构特征,该X个特征中的第X_{i+1}个特征是由第X_i个特征经过运算得到的;该X个特征中的前Y个特征是经过第一运算得到的,该X个特征中的后(X-Y)个特征是经过第二运算得到的;其中,X、Y、i为正整数,i小于或等于X,Y小于或等于X;第一运算中卷积操作具有多种感受野,第二运算中的卷积操作具有一种感受野。
在一个可能的设计中,第X_{i+1}个特征的高度为第X_i个特征的高度的1/2;第X_{i+1}个特征的宽度为第X_i个特征的宽度的1/2;第X_{i+1}个特征的通道数与第X_i个特征的通道数相同。
在一个可能的设计中,在对第二信息进行解压和信道译码时,逻辑电路1402用于:利用信源信道联合译码模型对第二信息进行以下操作:第四卷积操作、上采样操作和第五卷积操作。
在一个可能的设计中,在对第二信息进行解压和信道译码时,逻辑电路1402还用于执行以下一项或多项操作:广义除法归一化的反归一化、参数化的线性整流单元、批归一化、或线性整流单元。
逻辑电路1402和输入输出接口1401还可以用于执行上述方法实施例第一通信装置或第二通信装置执行的其它对应的步骤或操作,在此不再一一赘述。
通信装置1200、通信装置1300和通信装置1400具体是芯片或者芯片系统时,收发模块1202、通信接口1310和输入输出接口1401所输出或接收的可以是基带信号。通信装置1200、通信装置1300和通信装置1400具体是设备时,收发模块1202、通信接口1310和输入输出接口1401所输出或接收的可以是射频信号。
本申请上述方法实施例描述的第一通信装置/第二通信装置所执行的操作和功能中的部分或全部,可以用芯片或集成电路来完成。
本申请实施例提供了一种计算机可读存储介质,存储有计算机程序,该计算机程序包括用于执行上述方法实施例的指令。
本申请实施例提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得上述方法实施例中的方法被执行。
本领域内的技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本申请是参照根据本申请实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
尽管已描述了本申请的优选实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例作出另外的变更和修改。所以,所附权利要求意欲解释为包括优选实施例以及落入本申请范围的所有变更和修改。
显然,本领域的技术人员可以对本申请实施例进行各种改动和变型而不脱离本申请实施例的精神和范围。这样,倘若本申请实施例的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包含这些改动和变型在内。

Claims (33)

  1. 一种基于多任务网络模型的通信方法,其特征在于,所述多任务网络模型包括第一主干网络模型;所述方法包括:
    第一通信装置利用所述第一主干网络模型对输入信号进行处理,获得融合特征,所述融合特征为多个第一特征融合得到的,所述多个第一特征为对所述输入信号进行特征提取获得的;
    所述第一通信装置对所述融合特征进行压缩和信道编码,得到第一信息;
    所述第一通信装置向第二通信装置发送所述第一信息。
  2. 如权利要求1所述的方法,其特征在于,所述第一通信装置利用所述第一主干网络模型对输入信号进行处理,获得融合特征,包括:
    所述第一通信装置对输入信号进行特征提取,得到多个第二特征,所述多个第二特征具有不同的特征维度;
    所述第一通信装置处理所述多个第二特征的特征维度,得到具有相同特征维度的所述多个第一特征;
    所述第一通信装置将所述多个第一特征进行特征融合,得到所述融合特征。
  3. 如权利要求2所述的方法,其特征在于,所述第一通信装置处理所述多个第二特征的特征维度,得到具有相同特征维度的多个第一特征,包括:
    所述第一通信装置对所述多个第二特征进行第一卷积操作和上采样操作,得到具有相同特征维度的多个第一特征。
  4. 如权利要求2或3所述的方法,其特征在于,所述第一通信装置将所述多个第一特征进行特征融合,得到所述融合特征,包括:
    所述第一通信装置对所述多个第一特征相加,得到第三特征。
  5. 如权利要求4所述的方法,其特征在于,所述方法还包括:
    所述第一通信装置对所述第三特征进行第二卷积操作,得到所述融合特征。
  6. 如权利要求1~5任一项所述的方法,其特征在于,所述第一通信装置对所述融合特征进行压缩和信道保护处理,包括:
    所述第一通信装置利用信源信道联合编码模型对所述融合特征进行降采样和第三卷积操作,得到第四特征;所述信源信道联合编码模型是基于信道噪声进行训练的。
  7. 如权利要求6所述的方法,其特征在于,所述第一通信装置对所述融合特征进行压缩和信道保护处理,还包括:
    所述第一通信装置利用信源信道联合编码模型对所述第四特征执行以下一种或多种操作:广义除法归一化、参数化的线性整流单元PReLU、或功率归一化。
  8. 一种基于多任务网络模型的通信方法,其特征在于,所述多任务网络模型包括第二主干网络模型和功能网络模型;所述方法包括:
    第二通信装置接收第一通信装置的第二信息;
    所述第二通信装置对所述第二信息进行解压和信道译码,得到融合特征的重构特征;所述融合特征为对输入信号进行特征提取获得的多个第一特征融合得到的;
    所述第二通信装置利用所述第二主干网络模型对所述重构特征进行特征解析,得到特征解析的结果;
    所述第二通信装置利用所述功能网络模型处理所述特征解析的结果。
  9. 如权利要求8所述的方法,其特征在于,所述特征解析的结果包括X个特征,所述X个特征的第1个特征为所述重构特征,所述X个特征中的第X_{i+1}个特征是由第X_i个特征经过运算得到的;所述X个特征中的前Y个特征是经过第一运算得到的,所述X个特征中的后(X-Y)个特征是经过第二运算得到的;其中,X、Y、i为正整数,i小于或等于X,Y小于或等于X;
    所述第一运算中卷积操作具有多种感受野,所述第二运算中的卷积操作具有一种感受野。
  10. 如权利要求9所述的方法,其特征在于,所述第X_{i+1}个特征的高度为所述第X_i个特征的高度的1/2;所述第X_{i+1}个特征的宽度为所述第X_i个特征的宽度的1/2;所述第X_{i+1}个特征的通道数与所述第X_i个特征的通道数相同。
  11. 如权利要求8~10任一项所述的方法,其特征在于,所述第二通信装置对所述第二信息进行解压和信道译码,包括:
    所述第二通信装置利用信源信道联合译码模型对所述第二信息进行以下操作:第四卷积操作、上采样操作和第五卷积操作。
  12. 如权利要求11所述的方法,其特征在于,所述第二通信装置对所述第二信息进行解压和信道译码,还包括以下一项或多项操作:广义除法归一化的反归一化、参数化的线性整流单元PReLU、批归一化、或线性整流单元ReLU。
  13. 一种通信装置,其特征在于,包括:
    处理模块,用于利用所述多任务网络模型中的第一主干网络模型对输入信号进行处理,获得融合特征,所述融合特征为多个第一特征融合得到的,所述多个第一特征为对所述输入信号进行特征提取获得的;以及用于对所述融合特征进行压缩和信道编码,得到第一信息;
    收发模块,用于向第二通信装置发送所述第一信息。
  14. 如权利要求13所述的装置,其特征在于,在利用所述第一主干网络模型对输入信号进行处理,获得融合特征时,所述处理模块用于:
    对输入信号进行特征提取,得到多个第二特征,所述多个第二特征具有不同的特征维度;
    处理所述多个第二特征的特征维度,得到具有相同特征维度的所述多个第一特征;
    将所述多个第一特征进行特征融合,得到所述融合特征。
  15. 如权利要求14所述的装置,其特征在于,在处理所述多个第二特征的特征维度,得到具有相同特征维度的多个第一特征时,所述处理模块用于:
    对所述多个第二特征进行第一卷积操作和上采样操作,得到具有相同特征维度的多个第一特征。
  16. 如权利要求14或15所述的装置,其特征在于,在将所述多个第一特征进行特征融合,得到所述融合特征时,所述处理模块用于:
    对所述多个第一特征相加,得到第三特征。
  17. 如权利要求16所述的装置,其特征在于,所述处理模块还用于:
    对所述第三特征进行第二卷积操作,得到所述融合特征。
  18. 如权利要求13~17任一项所述的装置,其特征在于,在对所述融合特征进行压缩和信道保护处理时,所述处理模块用于:
    利用信源信道联合编码模型对所述融合特征进行降采样和第三卷积操作,得到第四特征;所述信源信道联合编码模型是基于信道噪声进行训练的。
  19. 如权利要求18所述的装置,其特征在于,在对所述融合特征进行压缩和信道保护处理时,所述处理模块还用于:
    利用信源信道联合编码模型对所述第四特征执行以下一种或多种操作:广义除法归一化、参数化的线性整流单元PReLU、或功率归一化。
  20. 一种通信装置,其特征在于,包括:
    收发模块,用于接收第一通信装置的第二信息;
    处理模块,用于对所述第二信息进行解压和信道译码,得到融合特征的重构特征;所述融合特征为对输入信号进行特征提取获得的多个第一特征融合得到的;以及利用多任务网络模型中的第二主干网络模型对所述重构特征进行特征解析,得到特征解析的结果;以及利用所述多任务网络模型中的功能网络模型处理所述特征解析的结果。
  21. 如权利要求20所述的装置,其特征在于,所述特征解析的结果包括X个特征,所述X个特征的第1个特征为所述重构特征,所述X个特征中的第X_{i+1}个特征是由第X_i个特征经过运算得到的;所述X个特征中的前Y个特征是经过第一运算得到的,所述X个特征中的后(X-Y)个特征是经过第二运算得到的;其中,X、Y、i为正整数,i小于或等于X,Y小于或等于X;
    所述第一运算中卷积操作具有多种感受野,所述第二运算中的卷积操作具有一种感受野。
  22. 如权利要求21所述的装置,其特征在于,所述第X_{i+1}个特征的高度为所述第X_i个特征的高度的1/2;所述第X_{i+1}个特征的宽度为所述第X_i个特征的宽度的1/2;所述第X_{i+1}个特征的通道数与所述第X_i个特征的通道数相同。
  23. 如权利要求20~22任一项所述的装置,其特征在于,在对所述第二信息进行解压和信道译码时,所述处理模块用于:
    利用信源信道联合译码模型对所述第二信息进行以下操作:第四卷积操作、上采样操作和第五卷积操作。
  24. 如权利要求23所述的装置,其特征在于,在对所述第二信息进行解压和信道译码时,所述处理模块还用于执行以下一项或多项操作:广义除法归一化的反归一化、参数化的线性整流单元PReLU、批归一化、或线性整流单元ReLU。
  25. 一种通信装置,其特征在于,包括处理器和通信接口,所述通信接口用于发送第一信息,所述处理器用于执行如权利要求1~7任一项所述的方法。
  26. 如权利要求25所述的装置,其特征在于,所述装置还可以包括存储器,所述存储器用于存储所述处理器调用的程序、指令或数据;所述存储器与所述处理器耦合,或者所述处理器中包含所述存储器。
  27. 一种通信装置,其特征在于,包括处理器和通信接口,所述通信接口用于接收第二信息,所述处理器用于执行权利要求8~12任一项所述的方法。
  28. 如权利要求27所述的装置,其特征在于,所述装置还可以包括存储器,所述存储器用于存储所述处理器调用的程序、指令或数据;所述存储器与所述处理器耦合,或者所述处理器中包含所述存储器。
  29. 一种芯片,其特征在于,包括逻辑电路和输入输出接口,所述输入输出接口用于输出第一信息,所述逻辑电路用于执行如权利要求1~7任一项所述的方法。
  30. 一种芯片,其特征在于,包括逻辑电路和输入输出接口,所述输入输出接口用于输入第二信息,所述逻辑电路用于执行如权利要求8~12任一项所述的方法。
  31. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有计算机可读指令,当所述计算机可读指令在通信装置上运行时,使得如权利要求1~7任一项所述的方法被执行,或者使得如权利要求8~12任一项所述的方法被执行。
  32. 一种计算机程序产品,其特征在于,所述计算机程序产品中存储有计算机可读指令,当所述计算机可读指令在通信装置上运行时,使得如权利要求1~7任一项所述的方法被执行,或者使得如权利要求8~12任一项所述的方法被执行。
  33. 一种通信系统,其特征在于,包括如权利要求13~19任一项所述的装置、以及如权利要求20~24任一项所述的装置;或者,包括如权利要求25或26所述的装置、以及如权利要求27或28所述的装置。
PCT/CN2022/100097 2021-06-29 2022-06-21 一种基于多任务网络模型的通信方法、装置及系统 WO2023273956A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/398,520 US20240127074A1 (en) 2021-06-29 2023-12-28 Multi-task network model–based communication method, apparatus, and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110748182.0A CN115550943A (zh) 2021-06-29 2021-06-29 一种基于多任务网络模型的通信方法、装置及系统
CN202110748182.0 2021-06-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/398,520 Continuation US20240127074A1 (en) 2021-06-29 2023-12-28 Multi-task network model–based communication method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2023273956A1 true WO2023273956A1 (zh) 2023-01-05

Family

ID=84691270

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100097 WO2023273956A1 (zh) 2021-06-29 2022-06-21 一种基于多任务网络模型的通信方法、装置及系统

Country Status (3)

Country Link
US (1) US20240127074A1 (zh)
CN (1) CN115550943A (zh)
WO (1) WO2023273956A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102655588A (zh) * 2012-04-24 2012-09-05 浙江工商大学 用于视频图像传输的信源信道联合解码方法
CN105631879A (zh) * 2015-12-30 2016-06-01 哈尔滨工业大学 一种基于线型阵列的超声层析成像系统及方法
CN110225341A (zh) * 2019-06-03 2019-09-10 中国科学技术大学 一种任务驱动的码流结构化图像编码方法
WO2020215985A1 (zh) * 2019-04-22 2020-10-29 腾讯科技(深圳)有限公司 医学影像分割方法、装置、电子设备和存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116070119A (zh) * 2023-03-31 2023-05-05 北京数慧时空信息技术有限公司 基于小样本的多任务组合模型的训练方法
CN116070119B (zh) * 2023-03-31 2023-10-27 北京数慧时空信息技术有限公司 基于小样本的多任务组合模型的训练方法

Also Published As

Publication number Publication date
US20240127074A1 (en) 2024-04-18
CN115550943A (zh) 2022-12-30

Similar Documents

Publication Publication Date Title
Zhang et al. Toward wisdom-evolutionary and primitive-concise 6G: A new paradigm of semantic communication networks
US20230232213A1 (en) Information transmission methods and apparatuses, and communication devices and storage medium
CN113747462A (zh) 一种信息处理方法及相关设备
US20230106468A1 (en) Image segmentation method and apparatus, computer device, and storage medium
CN112784897A (zh) 图像处理方法、装置、设备和存储介质
US20240127074A1 (en) Multi-task network model–based communication method, apparatus, and system
Xu et al. Semantic communication for the internet of vehicles: A multiuser cooperative approach
WO2022257662A1 (zh) 一种应用人工智能的通信方法及通信装置
CN115989527A (zh) 用于对增强现实媒体对象执行基于锚点的渲染的方法和装置
WO2022267633A1 (zh) 信息传输的方法和装置
Bing et al. Collaborative image compression and classification with multi-task learning for visual Internet of Things
CN111192265B (zh) 一种基于点云的语义实例确定方法、装置、电子设备及存储介质
WO2023231635A1 (zh) 一种模型传输的方法及装置
Nakahara et al. Edge computing-assisted DNN image recognition system with progressive image retransmission
Guo et al. Distributed Task-Oriented Communication Networks with Multimodal Semantic Relay and Edge Intelligence
WO2023146862A1 (en) Equivariant generative prior for inverse problems with unknown rotation
Ji et al. Guest editorial: Emerging visual IoT technologies for future communications and networks
CN115759107A (zh) 语义通信系统的生成方法、装置、电子设备及介质
Dawood et al. Simulation of multimedia data transmission over WSN based on MATLAB/SIMULINK
Tonchev et al. Semantic Communication System for 3D Video
CN114723933A (zh) 区域信息生成方法、装置、电子设备和计算机可读介质
CN116309151B (zh) 图片去压缩失真网络的参数生成方法、装置和存储介质
WO2024183180A1 (zh) 基于非正交多址的信息业务服务提供方法、系统、设备及介质
WO2024098259A1 (zh) 生成样本集的方法和设备
WO2023070675A1 (zh) 数据处理的方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22831774

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22831774

Country of ref document: EP

Kind code of ref document: A1