WO2024012303A1 - AI network model interaction method, apparatus, and communication device - Google Patents
- Publication number: WO2024012303A1 (application PCT/CN2023/105408)
- Authority: WIPO (PCT)
- Prior art keywords: network model, information, target, model, network
Classifications
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
- H04L47/38—Flow control; Congestion control by adapting coding or compression rate
Description
- This application belongs to the field of communication technology and specifically relates to an AI network model interaction method, apparatus, and communication device.
- An AI network model can be constructed, trained, and verified with existing AI tools. The trained AI network model is then deployed, through interaction within the wireless communication system, on the target device that needs to use it, which raises the problem of transmitting the AI network model.
- Embodiments of the present application provide an AI network model interaction method, apparatus, and communication device, which can reduce transmission overhead and/or reduce the computing resources and inference delay incurred during inference.
- In a first aspect, an artificial intelligence (AI) network model interaction method is provided, which includes:
- the first device sends first information to the second device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device;
- the first device obtains relevant information of a target AI network model, and the target AI network model corresponds to the first information.
- In a second aspect, an artificial intelligence (AI) network model interaction device is provided, applied to the first device, and the device includes:
- a first sending module configured to send first information to the second device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device;
- a first acquisition module configured to acquire relevant information of a target AI network model, where the target AI network model corresponds to the first information.
- In a third aspect, an artificial intelligence (AI) network model interaction method is provided, which includes:
- the second device receives first information from the first device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device;
- the second device sends relevant information of a target AI network model to the first device, where the target AI network model corresponds to the first information; or the second device sends, according to the first information, relevant information of a first AI network model, where the first AI network model is used for compression processing to obtain a second AI network model, and the second AI network model corresponds to the first information.
- In a fourth aspect, an artificial intelligence (AI) network model interaction device is provided, applied to the second device, and the device includes:
- a first receiving module configured to receive first information from the first device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device;
- a second sending module configured to send relevant information of a target AI network model to the first device, where the target AI network model corresponds to the first information, or to send, according to the first information, relevant information of a first AI network model, where the first AI network model is used to perform compression processing to obtain a second AI network model, and the second AI network model corresponds to the first information.
- In a fifth aspect, a communication device is provided, including a processor and a memory. The memory stores a program or instructions that can be run on the processor, and when the program or instructions are executed by the processor, the steps of the method described in the first aspect or the third aspect are implemented.
- In a sixth aspect, a communication device is provided, including a processor and a communication interface. The communication interface is used to send first information to a second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device; the communication interface or the processor is used to obtain relevant information of a target AI network model, where the target AI network model corresponds to the first information. Alternatively, the communication interface is used to receive first information from the first device and, according to the first information, to send relevant information of a target AI network model or relevant information of a first AI network model to the first device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device, the target AI network model corresponds to the first information, and the first AI network model is used to perform compression processing to obtain a second AI network model, the second AI network model corresponding to the first information.
- In a seventh aspect, a communication system is provided, including a first device and a second device. The first device can be used to perform the steps of the AI network model interaction method described in the first aspect, and the second device can be used to perform the steps of the AI network model interaction method described in the third aspect.
- In an eighth aspect, a readable storage medium is provided, on which programs or instructions are stored. When the programs or instructions are executed by a processor, the steps of the method described in the first aspect or the steps of the method described in the third aspect are implemented.
- In a ninth aspect, a chip is provided, including a processor and a communication interface. The communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the method described in the first aspect or the method described in the third aspect.
- In a tenth aspect, a computer program/program product is provided, stored in a storage medium. The computer program/program product is executed by at least one processor to implement the steps of the AI network model interaction method described in the first aspect or the steps of the AI network model interaction method described in the third aspect.
- In the embodiments of this application, the first device sends first information to the second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device; the first device then obtains relevant information of a target AI network model, where the target AI network model corresponds to the first information.
- In this way, when the second device has pre-stored or pre-trained AI network models, the first device can, in the process of obtaining an AI network model from the second device, send its requirements to the second device. The second device can then determine at least one of the following according to the needs of the first device: the type, size, function, and complexity of the AI network model required by the first device, as well as the parameters, compression method, and compression node used when compressing the determined AI network model.
- On this basis, the second device can compress the AI network model according to the needs of the first device and transmit the compressed AI network model, which reduces the transmission overhead of the AI network model. In addition, the second device can select an AI network model that matches the model inference process of the first device, which reduces the computing resources and inference delay incurred when the first device performs inference on the target AI network model.
- Figure 1 is a schematic structural diagram of a wireless communication system to which embodiments of the present application can be applied;
- Figure 2 is a flow chart of an AI network model interaction method provided by an embodiment of the present application;
- Figure 3 is a schematic diagram of an embodiment of the present application applied to CSI feedback;
- Figure 4 is the first schematic diagram of the interaction process between the first device and the second device in an embodiment of the present application;
- Figure 5 is the second schematic diagram of the interaction process between the first device and the second device in an embodiment of the present application;
- Figure 6 is a schematic diagram of the interaction process between the first device, the second device, and the third device in an embodiment of the present application;
- Figure 7 is a flow chart of another AI network model interaction method provided by an embodiment of the present application;
- Figure 8 is a schematic structural diagram of an AI network model interaction device provided by an embodiment of the present application;
- Figure 9 is a schematic structural diagram of another AI network model interaction device provided by an embodiment of the present application;
- Figure 10 is a schematic structural diagram of a communication device provided by an embodiment of the present application.
- The terms “first”, “second”, etc. in the description and claims of this application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. In addition, objects distinguished by “first” and “second” are usually of one category, and the number of objects is not limited; for example, there may be one first object or multiple first objects.
- “And/or” in the description and claims indicates at least one of the connected objects, and the character “/” generally indicates an “or” relationship between the related objects.
- LTE: Long Term Evolution
- LTE-A: LTE-Advanced (Long Term Evolution-Advanced)
- CDMA: Code Division Multiple Access
- TDMA: Time Division Multiple Access
- FDMA: Frequency Division Multiple Access
- OFDMA: Orthogonal Frequency Division Multiple Access
- SC-FDMA: Single-Carrier Frequency Division Multiple Access
- NR: New Radio
- FIG. 1 shows a block diagram of a wireless communication system to which embodiments of the present application are applicable.
- the wireless communication system includes a terminal 11 and a network side device 12.
- The terminal 11 may be a mobile phone, a tablet computer (Tablet Personal Computer), a laptop computer, a personal digital assistant (PDA), a handheld computer, a netbook, an ultra-mobile personal computer (UMPC), a mobile Internet device (MID), an augmented reality (AR) or virtual reality (VR) device, a robot, a wearable device, vehicle user equipment (VUE), a pedestrian terminal (PUE), a smart home device (home equipment with wireless communication functions, such as a refrigerator, TV, washing machine, or furniture), a game console, a personal computer (PC), a teller machine, a self-service machine, or another terminal-side device.
- Wearable devices include smart watches, smart bracelets, smart headphones, smart glasses, smart jewelry (smart bangles, smart rings, smart necklaces, smart anklets, etc.), smart wristbands, smart clothing, and so on.
- The network side equipment 12 may include access network equipment or core network equipment, where the access network equipment may also be called radio access network equipment, a radio access network (RAN), a radio access network function, or a radio access network unit. The access network equipment may include a base station, a Wireless Local Area Network (WLAN) access point, a WiFi node, etc.
- The base station may be called a Node B, an evolved Node B (eNB), an access point, a base transceiver station (BTS), a radio base station, a radio transceiver, a basic service set (BSS), an extended service set (ESS), a home Node B, a home evolved Node B, a transmitting receiving point (TRP), or some other suitable term in the field; as long as the same technical effect is achieved, the base station is not limited to specific technical terms. It should be noted that in the embodiments of this application, only the base station in an NR system is introduced as an example, and the specific type of base station is not limited.
- AI network models include neural networks, decision trees, support vector machines, Bayesian classifiers, etc. This application takes a neural network as an example for explanation but does not limit the specific type of AI network model.
- Depending on the application scenario, the AI algorithm selected and the network model used also differ.
- The main way to improve 5G network performance with the help of AI network models is to enhance or replace existing algorithms or processing modules with neural-network-based algorithms and models.
- In specific scenarios, neural-network-based algorithms and models can achieve better performance than deterministic algorithms.
- The more commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks.
- The construction, training, and verification of neural networks can be implemented with existing AI tools.
- The size and complexity of an AI network model is a key issue in its deployment and application.
- Transmission of the AI network model is also involved, and it too is affected by the model's size and complexity.
- Large AI network models have high transmission overhead, occupy substantial computing resources during inference, and incur high inference delays.
- In the embodiments of this application, before acquiring the AI network model, the first device sends demand information to the second device to notify it of the size, compression scheme, model complexity, etc. of the AI network model the first device requires. The first device can thereby obtain an AI network model that better matches its needs. When the first device receives a compressed AI network model, the resource overhead of transmitting the model is reduced; when the first device obtains an AI network model whose complexity matches its capabilities, the computing resources and delay incurred when the first device performs model inference are reduced.
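The exchange described above, in which the first device declares its size, compression, and complexity needs before receiving a model, can be sketched as a minimal data structure. All field names below are illustrative assumptions, not terminology from the application:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FirstInformation:
    """Illustrative sketch of the 'first information' a first device
    might send; every field name here is hypothetical."""
    required_function: str                  # e.g. "CSI feedback"
    max_model_size_mb: float                # model size requirement
    max_complexity_flops: float             # inference-complexity budget
    supported_compression: List[str] = field(default_factory=list)
    compress_at: Optional[str] = None       # which device compresses

req = FirstInformation(
    required_function="CSI feedback",
    max_model_size_mb=5.0,
    max_complexity_flops=1e9,
    supported_compression=["pruning", "low-rank"],
    compress_at="second",
)
print(req.max_model_size_mb)  # 5.0
```

The second device would select or compress a model so that each of these declared limits is respected.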
- An embodiment of the present application provides an AI network model interaction method, and the execution subject is a first device. The AI network model interaction method executed by the first device may include the following steps:
- Step 201: The first device sends first information to the second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device.
- The first device can be the demander of the AI network model, and the second device can be the sender of the AI network model. For example, the second device trains to obtain the AI network model and sends the trained AI network model to the first device.
- The relevant information of the AI network model may be parameters or a model file of the AI network model, which the first device can use to perform AI network model inference (i.e., to apply the AI network model). It can be understood that “transmitting the AI network model” in the following embodiments can be interpreted as “transmitting the parameters or model file of the AI network model”.
- The above-mentioned first device may be a terminal, for example, any of the types of terminal 11 listed for Figure 1; or the first device may be a network side device, for example, the network side device 12 in the embodiment shown in Figure 1, which may be a core network device. The second device may likewise be a terminal or a network side device, such as an access network device or a core network device. The following embodiments take the case where the first device is a terminal and the second device is a base station as an example; this is not a specific limitation.
- The information interaction between the first device, the second device, and the third device may use new signaling or information, or may reuse signaling or information from related technologies. The first device, the second device, and the third device may each be a terminal or a network side device. Depending on whether the signal sending end and the signal receiving end involved in a given information interaction are terminals or network side devices, the reuse of signaling or information from related technologies is divided into the following four situations:
- Physical Uplink Control Channel (PUCCH)
- Physical Uplink Shared Channel (PUSCH)
- Media Access Control (MAC) Control Element (CE)
- Radio Resource Control (RRC) message
- Non-Access Stratum (NAS) message
- Downlink Control Information (DCI)
- System Information Block (SIB)
- Physical Downlink Shared Channel (PDSCH)
- The information in the interaction process (such as at least one of: the above-mentioned first information, the above-mentioned matching result, the above-mentioned first request information, the relevant information of the AI network model received by the first device, and the above-mentioned third information) may be carried in at least one of the following signaling or information:
- Physical Sidelink Control Channel (PSCCH)
- Physical Sidelink Shared Channel (PSSCH)
- Physical Sidelink Broadcast Channel (PSBCH)
- Physical Sidelink Discovery Channel (PSDCH)
- Physical Sidelink Feedback Channel (PSFCH)
- The information in the interaction process (such as at least one of: the above-mentioned first information, the above-mentioned matching result, the above-mentioned first request information, the relevant information of the AI network model received by the first device, and the above-mentioned third information) may be carried in at least one of the following signaling or information:
- Xn interface signaling (such as: X2 interface signaling).
- Step 202: The first device obtains relevant information of a target AI network model, where the target AI network model corresponds to the first information.
- The acquisition of the target AI network model in step 202 may consist of receiving the target AI network model, for example, receiving it from the second device or the third device. Alternatively, the target AI network model may be obtained by further processing the feedback that the second device returns for the first information: for example, the second device sends a first AI network model to the first device according to the first information, and the first device compresses that first AI network model to obtain the target AI network model.
- After obtaining the target AI network model, the first device can perform model inference on it, that is, apply the target AI network model, for example, by using the target AI network model to replace function modules of the communication system in related technologies. Replacing modules of the system in related technologies with AI network models can effectively improve system performance.
- For example, in CSI feedback, an AI encoder and an AI decoder can replace the conventional CSI calculation. Such an AI solution can improve the spectral efficiency of the communication system by about 30% compared with the solution specified by NR in related technologies.
- The above gain was evaluated by system-level simulation (SLS) under assumptions including:
- Scenario: Urban Micro (UMi) per 38.901, 7 cells with 3 sectors per cell
- UE speed: 3 km/h
- gNB antennas: 32
- PMI overhead: 58 bits
- Mg in [Mg Ng M N P] represents the number of antenna panels contained in one column of the antenna panel array; Ng represents the number of antenna panels contained in one row of the antenna panel array; M represents the number of antennas in one column on one panel; N represents the number of antennas in one row on one panel; and P represents the number of polarization directions of the antennas.
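As a worked example of this notation, the total number of antenna elements in an [Mg Ng M N P] layout is simply the product of the five factors. The specific configuration below is hypothetical, chosen only to be consistent with the 32 gNB antennas mentioned above:

```python
def antenna_count(Mg, Ng, M, N, P):
    """Total antenna elements for an [Mg Ng M N P] panel-array layout:
    panels per column x panels per row x elements per column on a panel
    x elements per row on a panel x polarization directions."""
    return Mg * Ng * M * N * P

# One hypothetical layout giving 32 antennas: a single panel,
# 2 rows x 8 columns of cross-polarized (P = 2) elements.
print(antenna_count(1, 1, 2, 8, 2))  # 32
```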
- The first information includes information related to compression and/or model inference of the AI network model required by the first device, and may be at least one of the following:
- For example, the first information includes demand information of the first device: the first device informs the second device of the function, type, size, and complexity of the AI network model it needs, whether the AI network model should be compressed, and the compression method to be used, so that the second device provides a target AI network model that meets the needs of the first device;
- Alternatively, the first information includes capability information of the first device: the first device informs the second device of its own capability information (such as the available computing power supported by the first device, the computing power available for model compression, its model compression capability, and the model compression methods it supports), so that the second device provides the first device with a target AI network model that the first device can support.
- the first device obtains relevant information of the target AI network model, including:
- the first device receives relevant information of a target AI network model, and the target AI network model is a compressed AI network model or an uncompressed AI network model; or,
- the first device receives relevant information of a first AI network model from the second device and compresses the first AI network model to obtain a second AI network model, where the target AI network model includes the second AI network model.
- Optionally, the first device obtains relevant information of the target AI network model as follows: the first device receives, from the second device, relevant information of the AI network model that matches the first information, where the target AI network model includes the AI network model that matches the first information.
- The AI network model matching the first information may be an AI network model that fits the capabilities of the first device, for example: the resources occupied by model inference (such as power, computing resources, and storage resources) are less than or equal to the resources available at the first device, or the model complexity of performing inference on the AI network model is less than or equal to the maximum complexity that the first device can support.
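The matching rule just described can be sketched as a simple check; the dictionary keys below are illustrative assumptions rather than fields defined by the application:

```python
def model_matches(model, first_info):
    """Sketch of the matching rule: a candidate model matches when its
    inference-time resource usage and its complexity do not exceed what
    the first device reports it can support. Keys are hypothetical."""
    return (model["inference_memory_mb"] <= first_info["available_memory_mb"]
            and model["complexity_flops"] <= first_info["max_complexity_flops"])

candidate = {"inference_memory_mb": 120, "complexity_flops": 8e8}
info = {"available_memory_mb": 256, "max_complexity_flops": 1e9}
print(model_matches(candidate, info))  # True
```

A model failing either bound would be rejected or routed through compression first.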
- During implementation, the second device can determine, based on the first information, the first model size of the AI network model required by the first device, search the first model library for an AI network model that meets the first model size requirement, and send a target AI network model that meets that requirement to the first device. In this case, the first device receives from the second device an AI network model that matches the first information; because the obtained model matches the first information, the computing power and inference delay when the first device applies the AI network model are reduced.
- When the second device does not have an AI network model that matches the first information, and the size of an AI network model the second device does have is larger than that indicated in the first information, the second device's AI network model can be compressed to obtain an AI network model that matches the first information.
- the device that compresses the AI network model may be the first device, the second device, or the third device.
- In one option, the second device compresses an AI network model it has based on the first information and sends the compressed AI network model to the first device. Specifically, the second device can determine, based on the first information, the first model size of the AI network model required by the first device; if no AI network model in the first model library meets the first model size requirement, the second device can compress a first AI network model in the first model library to obtain a second AI network model that meets the requirement, and then send it to the first device.
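The selection logic above, where the second device first searches its model library and falls back to compression only when nothing fits, can be sketched as follows. The structure of the model library and the `compress` callback are assumptions for illustration:

```python
def select_or_compress(model_library, required_size_mb, compress):
    """Sketch of the second device's decision: return a library model
    meeting the size requirement if one exists; otherwise compress the
    smallest available model down to the requirement.
    `compress` is a caller-supplied compression routine (hypothetical)."""
    for model in model_library:
        if model["size_mb"] <= required_size_mb:
            return model  # matching model found: send as-is
    # No match: compress the closest (smallest) larger model to fit.
    closest = min(model_library, key=lambda m: m["size_mb"])
    return compress(closest, required_size_mb)

library = [{"name": "A", "size_mb": 40}, {"name": "B", "size_mb": 12}]
shrink = lambda m, target: {**m, "size_mb": target, "compressed": True}
print(select_or_compress(library, 10, shrink))
# {'name': 'B', 'size_mb': 10, 'compressed': True}
```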
- Alternatively, the second device selects a first AI network model it has based on the first information, determines model compression information according to the difference between the first AI network model and the first information, and sends the first AI network model and the model compression information to a third device, so that the third device compresses the first AI network model according to the model compression information to obtain a compressed AI network model corresponding to the first information; the compressed AI network model is then sent to the first device. In this way, the resources occupied by the first device when receiving the AI network model are reduced, and because the AI network model obtained by the first device matches the first information, the computing power and inference delay when the first device applies it are also reduced.
- For example, the second device can determine, based on the first information, the first model size of the AI network model required by the first device; if no AI network model in the first model library meets the first model size requirement, the second device can send the relevant information of a first AI network model in the first model library, together with third information, to the third device, so that the third device compresses the received first AI network model according to the third information to obtain a second AI network model, and the third device sends the second AI network model to the first device.
- Alternatively, the second device selects a first AI network model it has according to the first information, determines model compression information according to the difference between the first AI network model and the first information, and sends the first AI network model and the model compression information to the first device, so that the first device compresses the first AI network model according to the model compression information to obtain a compressed AI network model corresponding to the first information. In this way, although the resources occupied by the first device when receiving the AI network model are not reduced, the AI network model it finally obtains matches the first information, which reduces the computing power and inference delay when the first device applies the AI network model.
- For example, the second device can determine, based on the first information, the first model size of the AI network model required by the first device; if no AI network model in the first model library meets the first model size requirement, the second device can send the relevant information of a first AI network model in the first model library, together with third information, to the first device, so that the first device compresses the received first AI network model according to the third information to obtain the second AI network model.
- the first information includes at least one of the following:
- the first capability information indicates the compression capability of the first device for the AI network model and/or the AI network model compression method supported by the first device;
- the first requirement information indicates the size information of the AI network model required by the first device;
- the first application information indicates the functional information of the AI network model required by the first device;
- the second information including information related to resource usage of the first device
- the first indication information indicates a device for compressing the AI network model.
- the first capability information may reflect the compression capability of the first device for the AI network model and/or the AI network model compression method supported by the first device, where the model compression capability or the model compression method may include at least one of the following: knowledge distillation method, pruning method, low-rank decomposition method, tensor decomposition method, etc., which are not exhaustive here.
- the first capability may be a field, for example: if the first capability is 0000, it means that the first device supports the knowledge distillation method; if the first capability is 0001, it means that the first device supports the pruning method; if the first capability is 0010, it means that the first device supports the low-rank decomposition method; if the first capability is 0011, it means that the first device supports the tensor decomposition method. After learning the compression capability of the first device and/or the AI network model compression method supported by the first device, the second device may decide whether to have the first device compress the AI network model.
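The field-based signalling above can be sketched as a simple lookup. Note this is a hypothetical decoding: the bit patterns (0000-0011) follow the example values in the text, but the mapping and function names are illustrative assumptions, not a normative encoding.

```python
# Hypothetical decoding of the "first capability" field described above.
# The bit patterns and method names are illustrative assumptions based on
# the example values in the text, not a normative mapping.
CAPABILITY_CODES = {
    "0000": "knowledge_distillation",
    "0001": "pruning",
    "0010": "low_rank_decomposition",
    "0011": "tensor_decomposition",
}

def decode_first_capability(field: str) -> str:
    """Map a first-capability field value to the supported compression method."""
    try:
        return CAPABILITY_CODES[field]
    except KeyError:
        raise ValueError(f"unknown first-capability field: {field}")

print(decode_first_capability("0001"))  # pruning
```

With such a mapping, the second device can decode the field and check whether the indicated method is one it can prepare compression parameters for.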
- the first requirement information may reflect the size information of the AI network model required by the first device (i.e., the model size requirement of the first device for the target AI network model).
- the size information may include at least one of the following: the upper limit of the model size of the target AI network model, the lower limit of the model size of the target AI network model, the model size level of the target AI network model, the upper limit of the parameter amount of the target AI network model, the lower limit of the parameter amount of the target AI network model, the parameter level of the target AI network model, the upper limit of the complexity of the target AI network model, the lower limit of the complexity of the target AI network model, the complexity level of the target AI network model, the upper limit of the performance of the target AI network model, the lower limit of the performance of the target AI network model, and the performance level of the target AI network model.
- when the second device learns the first requirement information of the first device, it can select a target AI network model that matches the first requirement information from the AI network models it has, for example: the model size of the target AI network model selected from its own AI network models is less than or equal to the upper limit of the model size required by the first device.
- the second device can also perform compression processing on an AI network model it has based on the first information to obtain a target AI network model that matches the first requirement information, for example: assuming that the size of the AI network model of the second device is greater than the upper limit of the model size of the AI network model required by the first device, the second device can determine, according to the difference between the two model sizes, how to compress its AI network model so that the compressed AI network model is less than or equal to the upper limit of the model size of the AI network model required by the first device.
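The size-based decision above can be sketched as follows. This is a minimal sketch under stated assumptions: the function and field names, and the megabyte unit, are illustrative and not defined by the specification.

```python
# Minimal sketch of the size-based decision described above: if the stored
# model already fits under the first device's upper size limit it can be sent
# directly; otherwise a compression ratio is derived from the size difference.
# Function and field names, and the MB unit, are illustrative assumptions.
def plan_model_delivery(model_size_mb: float, size_upper_limit_mb: float) -> dict:
    if model_size_mb <= size_upper_limit_mb:
        return {"action": "send_directly", "compression_ratio": 1.0}
    # compress so the result fits under the first device's upper limit
    return {
        "action": "compress_then_send",
        "compression_ratio": size_upper_limit_mb / model_size_mb,
    }

print(plan_model_delivery(200.0, 50.0))
# {'action': 'compress_then_send', 'compression_ratio': 0.25}
```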
- the above-mentioned first application information can reflect the functional information of the AI network model required by the first device, for example: an AI network model used to implement at least one of CSI prediction, CSI compression, beam prediction, traffic prediction and other functions.
- when the second device learns the functions of the AI network model required by the first device, it can select a target AI network model that matches the first application information from the AI network models it has, for example: the function of the target AI network model selected from its own AI network models corresponds to the functional information in the first application information. Then, the second device may directly or indirectly provide the first device with an AI network model capable of realizing the function.
- the above-mentioned second information can reflect the resource usage of the first device.
- the resource usage may include: power usage, storage resource usage, computing resource usage, transmission resource usage, etc.
- the second information may include at least one of the following: the available computing power of the first device, the proportion of available computing power of the first device, the available computing power level of the first device, the available power of the first device, the proportion of available power of the first device, the available power level of the first device, the available storage of the first device, the proportion of available storage of the first device, and the available storage level of the first device.
- when the second device learns the resource usage of the first device, it can select a target AI network model that matches the resource usage of the first device from its own AI network models, for example: the resource occupancy of the selected target AI network model is less than or equal to the available resources of the first device, or the resources that the first device can use for AI network model compression or inference. Then, the second device can directly or indirectly provide the selected AI network model to the first device. In this way, the risk that the resource usage of the AI network model is greater than the actually available resources of the first device can be reduced, the resource utilization of the first device during model inference can be improved, and the delay of the model inference process can be reduced.
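The resource-matching selection above can be sketched as a filter over candidate models. The field names and units here are assumptions made for the sketch, not part of the specification.

```python
# Illustrative matching of candidate models against the resource usage
# reported in the "second information": the selected model must fit within
# every reported budget. Field names and units are assumptions.
def select_by_resources(models, available):
    """Pick the largest candidate that fits all reported resource budgets."""
    feasible = [
        m for m in models
        if m["compute"] <= available["compute"]
        and m["storage"] <= available["storage"]
        and m["power"] <= available["power"]
    ]
    # prefer the most capable (largest) model that still fits
    return max(feasible, key=lambda m: m["size"], default=None)

models = [
    {"name": "large", "size": 300, "compute": 8, "storage": 300, "power": 5},
    {"name": "small", "size": 60, "compute": 2, "storage": 60, "power": 1},
]
available = {"compute": 4, "storage": 100, "power": 2}
print(select_by_resources(models, available)["name"])  # small
```

Returning `None` when nothing fits corresponds to the case where compression processing (or a renegotiation of the first information) is needed before a model can be delivered.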
- the above-mentioned first indication information indicates the device that compresses the AI network model.
- the second device may not have an AI network model with the model size required by the first device, so one or more devices may be designated to perform model compression processing on an AI network model that the second device has, where the designated device may include at least one of the first device, the second device, and a third device. The third device may be any device other than the first device and the second device, for example: an entity within the communication network (such as a terminal, a base station, a core network device or another network layer entity), or a third-party device outside the communication network, whose functions include at least a model compression function.
- when the second device learns which device compresses the AI network model, it can send the AI network model to be compressed and the model compression information to that device, so that the AI network model is compressed by that device to obtain an AI network model whose model size meets the needs of the first device.
- the above-mentioned first information may include one or at least two of the above options one to five.
- assuming the first information includes the first capability information, the first requirement information, and the first application information, the second device can first select at least one AI network model that matches the first application information from the AI network models it has. If it determines that none of the at least one AI network model matches the model size corresponding to the first requirement information, and the first capability information indicates that the first device supports the knowledge distillation method, the second device can select, from the at least one AI network model, the one whose model size is closest to that corresponding to the first requirement information and send it to the first device, and can also send third information to the first device, which may include the parameter information required to compress the AI network model sent to the first device into a model size that matches the first requirement information, such as the compression method, compression level, etc.
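The combined selection in this example can be sketched as below: filter by function, pick the closest size, and attach third information if the model is still too large. All field names and the choice of knowledge distillation as the method are illustrative assumptions drawn from the example above.

```python
# Sketch of the combined selection described above: filter the second
# device's models by the required function (first application information),
# pick the one whose size is closest to the required size (first requirement
# information), and, if it is still too large, attach "third information"
# describing the remaining compression. Field names are assumptions.
def select_and_advise(models, required_function, size_upper_limit):
    candidates = [m for m in models if m["function"] == required_function]
    if not candidates:
        return None, None
    best = min(candidates, key=lambda m: abs(m["size"] - size_upper_limit))
    third_information = None
    if best["size"] > size_upper_limit:
        third_information = {
            "compression_method": "knowledge_distillation",
            "target_size_upper_limit": size_upper_limit,
        }
    return best, third_information

models = [
    {"function": "csi_compression", "size": 120},
    {"function": "beam_prediction", "size": 80},
    {"function": "csi_compression", "size": 60},
]
print(select_and_advise(models, "csi_compression", 50))
```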
- the second device does not have an AI network model that matches the first information, and the first indication information instructs the first device to compress the AI network model.
- the first device obtains relevant information of the target AI network model, including:
- the first device receives relevant information of the first AI network model and third information from the second device, wherein the third information is used to compress the first AI network model into a second AI network model;
- the first device performs compression processing on the first AI network model according to the third information to obtain the second AI network model, wherein the target AI network model includes the second AI network model.
- that the above-mentioned second device does not have an AI network model matching the first information may mean that none of the AI network models the second device has that satisfy the first part of the first information satisfies the first requirement information, where the first part of the first information may include the information in the first information other than the first requirement information, such as: at least one of the first capability information, the first application information, and the second information.
- the above-mentioned third information may include information required to compress the first AI network model into a second AI network model corresponding to the first information.
- the third information includes at least one of the following: an AI network model compression method used when compressing the first AI network model and restriction information related to AI network model compression.
- the compression method may include at least one of the following: knowledge distillation method, pruning method, low-rank decomposition method, tensor decomposition method, etc.
- the AI network model compression-related restriction information may include at least one of the following: a maximum compressibility limit (such as a compression ratio or compression level), an upper limit of the compressed parameter amount, a lower limit of the compressed parameter amount, an upper limit of the compressed model size, a lower limit of the compressed model size, etc.
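The third information enumerated above can be modelled as a small record: the compression method plus the restriction fields. This is an illustrative container only; the field names and units are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative container for the "third information" items enumerated above:
# the compression method plus the compression-related restriction fields.
# Field names and units are assumptions made for this sketch.
@dataclass
class ThirdInformation:
    compression_method: str                      # e.g. "pruning"
    max_compression_ratio: Optional[float] = None
    param_count_upper: Optional[int] = None
    param_count_lower: Optional[int] = None
    size_upper_limit: Optional[float] = None     # compressed model size bound
    size_lower_limit: Optional[float] = None

    def permits(self, compressed_size: float) -> bool:
        """Check a compressed model size against the size restrictions."""
        if self.size_upper_limit is not None and compressed_size > self.size_upper_limit:
            return False
        if self.size_lower_limit is not None and compressed_size < self.size_lower_limit:
            return False
        return True

info = ThirdInformation("pruning", size_upper_limit=50.0, size_lower_limit=10.0)
print(info.permits(30.0))  # True
```

The compressing device (first, second, or third device) would check its output against such restrictions before sending the second AI network model.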
- the third information may be determined based on the difference between the AI network model required by the first information and the first AI network model.
- the third information may include at least part of the first information, such as the first requirement information, etc.
- the device that receives the third information can determine how to compress the first AI network model based on the difference between the AI network model corresponding to the third information and the first AI network model.
- the second device selects the first AI network model from the AI network models it has trained or stored based on the first information (for example, selects the one closest to the first information as the first AI network model, or selects one whose model size does not match the model size required by the first information as the first AI network model), determines the third information based on the difference between the first AI network model and the model size required in the first information as well as the model compression capability supported by the first device, and sends the third information and the first AI network model to the first device. In this way, the target AI network model can be obtained by the first device compressing the first AI network model based on the third information.
- the second device does not have an AI network model that matches the first information, and the first indication information instructs the second device to compress the AI network model.
- the first device obtains relevant information of the target AI network model, including:
- the first device receives relevant information of the second AI network model from the second device, wherein the target AI network model includes the second AI network model, and the second AI network model is an AI network model obtained by the second device compressing its first AI network model based on the first information.
- the second device selects the first AI network model from the AI network models it has trained or stored based on the first information, compresses the first AI network model based on the difference between the first AI network model and the model size required in the first information as well as the model compression capability and other information of the second device to obtain the second AI network model, and then forwards the second AI network model to the first device.
- the second device does not have an AI network model that matches the first information, and the first indication information instructs the third device to compress the AI network model.
- the first device obtains relevant information of the target AI network model, including:
- the first device receives relevant information of the second AI network model from the third device, wherein the target AI network model includes the second AI network model, and the second AI network model is an AI network model obtained by the third device compressing the first AI network model of the second device.
- the meaning of the second device not having an AI network model that matches the first information, and the meaning and function of the third information, are the same as in the first optional implementation, and will not be described again here.
- the second device selects the first AI network model from the AI network models it has trained or stored based on the first information, determines the third information based on the difference between the first AI network model and the model size required in the first information as well as the model compression capability and other information of the third device, and sends the third information and the first AI network model to the third device. In this way, the target AI network model can be obtained by the third device compressing the first AI network model based on the third information, and then forwarded to the first device.
- when the third device performs compression processing on the first AI network model according to the third information to obtain the second AI network model, the first device may also send the first information, or the part of the first information related to model compression, to the third device. In this way, the third device can also decide what kind of compression processing to perform on the first AI network model based on the first information or on the part of the first information related to model compression.
- the method further includes:
- the first device sends relevant information of the second AI network model to the second device.
- the first device compresses the first AI network model according to the third information to obtain the second AI network model, and then sends the second AI network model to the second device. In this way, the AI network models of the second device will include the second AI network model, so that the second AI network model can subsequently be transmitted directly without compressing the first AI network model again. Optionally, the second device can also obtain the second AI network model from the third device.
- the first device can further determine whether the second AI network model satisfies the requirement of the first information, for example: whether the second AI network model meets the model size requirement in the first information. Only after the first device determines that the second AI network model indeed meets the requirement of the first information does the second device obtain the relevant information of the second AI network model from at least one of the first device and the third device.
- the method further includes:
- the first device obtains the matching result between the second AI network model and the first information;
- the first device sends the matching result to the second device.
- when the target AI network model is obtained by compression processing by the first device or the third device, the first device also obtains the matching result between the target AI network model and the first information, for example: whether the target AI network model is consistent with the model size required by the first device, and feeds the matching result back to the second device. In this way, if the matching result indicates that the target AI network model does not match the first information, either of the following processes can be performed:
- the first device changes the first information and re-requests the AI network model from the second device. This process is similar to the process of the AI network model interaction method provided in the embodiment of the present application, and will not be described again here.
- the first device sends first request information. At this time, the first information remains unchanged and need not be sent to the second device again. The second device may, according to the first request information, perform compression processing using different third information, or compress a different first AI network model.
- the first request information may carry suggestion information not to compress and send the previously compressed first AI network model again and/or suggestion information that the second device change the third information.
- the first request information may not carry the above suggestions, in which case the second device decides which first AI network model to recompress and whether to modify the third information.
- the AI network model interaction method when the matching result indicates that the second AI network model does not match the first information, the AI network model interaction method further includes:
- the first device sends first request information to the second device, where the first request information is used to request the second device to update at least one of the third information and the first AI network model.
- when the first device obtains an AI network model that does not meet the required model size, it may send the first request information to the second device, so that the second device updates at least one of the following according to the first request information: the compressed first AI network model and the third information used in the compression process, until the first device obtains an AI network model that meets the required model size.
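The "repeat until it fits" behaviour above can be sketched as a bounded retry loop. The callables, the attempt limit, and the size check are illustrative assumptions standing in for the actual signalling, not part of the specification.

```python
# Sketch of the retry loop described above: the first device checks each
# received model against its size requirement and, on a mismatch, issues the
# "first request information" again (modelled here as another call to
# request_model so the second device can update the third information and/or
# the first AI network model). Names and the attempt limit are assumptions.
def acquire_model(request_model, size_upper_limit, max_attempts=3):
    for attempt in range(max_attempts):
        model = request_model(attempt)
        if model["size"] <= size_upper_limit:  # matching result: success
            return model
        # mismatch: the next loop iteration stands in for sending
        # first request information to the second device
    return None

# Each retry yields a smaller recompressed model in this toy example.
print(acquire_model(lambda attempt: {"size": 120 - 40 * attempt}, 50))
# {'size': 40}
```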
- the AI network model interaction method when the matching result indicates that the second AI network model does not match the first information, the AI network model interaction method further includes:
- the first device updates the first information and sends the updated first information to the second device;
- the first device obtains the target AI network model corresponding to the updated first information.
- when the first device obtains an AI network model that does not meet the required model size, it can update at least one item of the first information, for example: at least one of the first indication information, the first requirement information, the first capability information, and the second information, so that the second device updates at least one of the first AI network model and the third information based on the updated first information. In this way, compressing the first AI network model based on the updated third information can yield a target AI network model that matches the updated first information.
- when the third device compresses the first AI network model according to the third information to obtain the second AI network model, the first device can also send the above-mentioned matching result to the third device, so that the third device sends the compressed second AI network model to the second device when it determines that the second AI network model matches the first information, and does not send it when it determines that the second AI network model does not match the first information. In this way, the waste of resources caused by transmitting the second AI network model when it does not match the first information can be avoided. Of course, the first device may also decide based on the matching result whether to send the second AI network model to the second device. In this case, the first device does not need to send the matching result to the third device, and the third device does not need to decide based on the matching result whether to send the second AI network model to the second device.
- the first device sends first information to the second device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device; the first device obtains relevant information of a target AI network model, where the target AI network model corresponds to the first information.
- in the case where the second device has pre-stored or trained AI network models, during the process of the first device obtaining an AI network model from the second device, the first device can send the first information to the second device, and the second device can determine at least one of the following according to the needs of the first device: the type, size, function, and complexity of the AI network model required by the first device, as well as the parameters, compression method, compression node, etc. used when compressing the determined AI network model. In this way, the second device can compress the AI network model according to the needs of the first device and transmit the compressed AI network model, which can reduce the transmission overhead of the AI network model; in addition, the second device also selects an AI network model that matches the model inference process of the first device according to the needs of the first device, which can reduce the computing resources occupied and the inference delay of the first device when applying the target AI network model.
- Another AI network model interaction method provided by an embodiment of the present application is executed by a second device.
- the AI network model interaction method executed by the second device may include the following steps:
- Step 701 The second device receives first information from the first device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device.
- Step 702 The second device sends relevant information of the target AI network model to the first device.
- the target AI network model corresponds to the first information, or the second device sends relevant information of the first AI network model to the first device based on the first information. In the embodiment of the present application, the above-mentioned first device, second device, first information, relevant information of the first AI network model, and second AI network model have the same meanings and functions as the first device, second device, first information, relevant information of the first AI network model, and second AI network model in the method embodiment shown in Figure 2, and will not be described again here.
- the target AI network model is a compressed AI network model or an uncompressed AI network model.
- the first information includes at least one of the following:
- the first capability information indicates the AI network model compression capability of the first device and/or the AI network model compression method supported by the first device;
- the first requirement information indicates the size information of the AI network model required by the first device;
- the first application information indicates the functional information of the AI network model required by the first device;
- the second information including information related to resource usage of the first device
- the first indication information indicates a device for compressing the AI network model.
- the first indication information instructs the first device, the second device or the third device to compress the AI network model.
- the second device sends relevant information of the target AI network model, including:
- if the second device has an AI network model that matches the first information, the second device sends the AI network model matching the first information to the first device, and the target AI network model includes the AI network model matching the first information.
- the method also includes:
- the second device compresses the first AI network model according to the first information to obtain a second AI network model
- the second device sends relevant information about the target AI network model, including:
- the second device sends relevant information of the second AI network model to the first device, and the target AI network model includes the second AI network model.
- the second device sends relevant parameters of the first AI network model according to the first information, including:
- when the second device does not have an AI network model that matches the first information and the first indication information instructs the first device to compress the AI network model, the second device sends relevant information of the first AI network model and third information to the first device, wherein the third information is used to compress the first AI network model into a second AI network model, and the second AI network model corresponds to the first information; and/or,
- when the second device does not have an AI network model that matches the first information and the first indication information instructs the third device to compress the AI network model, the second device sends relevant information of the first AI network model and the third information to the third device.
- the third information includes at least one of the following: an AI network model compression method used when compressing the first AI network model and restriction information related to AI network model compression.
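The branching above (send a matching model directly, or route the model and third information to whichever device the first indication information designates) can be sketched as a dispatch. The message shapes and names are assumptions made for this sketch only.

```python
# Illustrative dispatch on the "first indication information": depending on
# which device is designated to compress, and on whether the second device
# already holds a matching model, different messages are sent. The message
# shapes and names are assumptions, not normative signalling.
def route_compression(indication, has_matching_model):
    if has_matching_model:
        # a matching model is sent to the first device as-is
        return {"to": "first_device", "payload": ["matching_model"]}
    if indication == "first_device":
        # first device compresses: it needs the model plus third information
        return {"to": "first_device", "payload": ["first_model", "third_information"]}
    if indication == "second_device":
        # second device compresses locally and sends only the result
        return {"to": "first_device", "payload": ["second_model"]}
    if indication == "third_device":
        # third device compresses: it receives the model plus third information
        return {"to": "third_device", "payload": ["first_model", "third_information"]}
    raise ValueError(f"unknown indication: {indication}")

print(route_compression("third_device", False))
# {'to': 'third_device', 'payload': ['first_model', 'third_information']}
```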
- the method further includes:
- the second device receives relevant information of the second AI network model.
- the method further includes:
- the second device receives a judgment result from the first device, where the judgment result is used to represent a matching result between the second AI network model and the first information.
- the method further includes:
- the second device receives the first request information from the first device and updates at least one of the third information and the first AI network model;
- the second device sends the updated third information and/or the relevant information of the updated first AI network model.
- the method further includes:
- the second device receives updated first information from the first device
- the second device sends relevant information of a target AI network model that matches the updated first information to the first device, or the second device sends relevant information of a third AI network model based on the updated first information, where the third AI network model is used for compression processing to obtain a fourth AI network model corresponding to the updated first information.
- the above-mentioned third AI network model is similar to the first AI network model in the method embodiment shown in Figure 2, and both can be AI network models in the model library of the second device.
- the difference is that the first AI network model corresponds to the first information before the update, while the third AI network model corresponds to the first information after the update.
- the above-mentioned fourth AI network model is similar to the second AI network model in the method embodiment shown in Figure 2; both can be AI network models obtained by compressing an AI network model in the model library of the second device. The difference is that the second AI network model is obtained by compressing the first AI network model and corresponds to the first information before the update, while the fourth AI network model is obtained by compressing the third AI network model and corresponds to the updated first information.
- the AI network model interaction method executed by the second device corresponds to the AI network model interaction method executed by the first device; the first device and the second device each execute their respective AI network model interaction methods, which can reduce the transmission overhead of the AI network model, and reduce the computing resources and inference delay occupied by the first device when inferring the target AI network model.
- the execution subject may be an AI network model interaction device.
- taking the AI network model interaction device performing the AI network model interaction method as an example, the AI network model interaction device provided by the embodiment of the present application is described below.
- An AI network model interaction device provided by an embodiment of the present application may be a device in the first device. As shown in Figure 8, the AI network model interaction device 800 may include the following modules:
- the first sending module 801 is configured to send first information to the second device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device;
- the first acquisition module 802 is used to acquire relevant information of the target AI network model, where the target AI network model corresponds to the first information.
- the first acquisition module 802 includes:
- the first receiving unit is configured to receive relevant information about the target AI network model, where the target AI network model is a compressed AI network model or an uncompressed AI network model;
- a second receiving unit configured to receive relevant information from the first AI network model of the second device
- a first processing unit configured to perform compression processing on the first AI network model to obtain a second AI network model, where the target AI network model includes the second AI network model.
- the first information includes at least one of the following:
- the first capability information indicates the AI network model compression capability of the first device and/or the AI network model compression method supported by the first device;
- the first requirement information represents the size information of the AI network model required by the first device
- the first application information represents the functional information of the AI network model required by the first device
- the second information including information related to resource usage of the first device
- the first indication information indicates a device for compressing the AI network model.
- the first indication information instructs the first device, the second device or the third device to compress the AI network model.
- the first acquisition module 802 includes:
- a third receiving unit, configured to receive relevant information of the first AI network model and third information from the second device, where the third information is used to compress the first AI network model into a second AI network model;
- a second processing unit configured to perform compression processing on the first AI network model according to the third information to obtain the second AI network model, wherein the target AI network model includes the second AI network model.
- the first acquisition module 802 is specifically used for:
- receiving relevant information of the second AI network model from the second device, where the target AI network model includes the second AI network model, and the second AI network model is an AI network model obtained by compressing the first AI network model of the second device according to the first information;
- or, the first acquisition module 802 is specifically used for:
- receiving relevant information of the second AI network model from the third device, where the target AI network model includes the second AI network model, and the second AI network model is an AI network model obtained by compressing the first AI network model from the second device.
- the third information includes at least one of the following: an AI network model compression method used when compressing the first AI network model and restriction information related to AI network model compression.
- the AI network model interaction device 800 also includes:
- the third sending module is configured to send the relevant information of the second AI network model to the second device.
- the first acquisition module 802 is specifically used for:
- receiving, from the second device, relevant information of an AI network model matching the first information, where the target AI network model includes the AI network model matching the first information.
- the AI network model interaction device 800 further includes:
- a second acquisition module, configured to acquire the matching result between the second AI network model and the first information;
- a fourth sending module, configured to send the matching result to the second device.
- the AI network model interaction device 800 further includes:
- a fifth sending module, configured to send first request information to the second device, where the first request information is used to request the second device to update at least one of the third information and the first AI network model.
- the AI network model interaction device 800 further includes:
- An update module configured to update the first information and send the updated first information to the second device
- the third acquisition module is used to acquire the target AI network model corresponding to the updated first information.
- the AI network model interaction device 800 provided by the embodiment of the present application can implement various processes implemented by the first device in the method embodiment as shown in Figure 2, and can achieve the same beneficial effects. To avoid duplication, they will not be described again here.
- Another AI network model interaction device provided by an embodiment of the present application can be a device in the second device.
- the AI network model interaction device 900 can include the following modules:
- the first receiving module 901 is configured to receive first information from the first device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device;
- the second sending module 902 is configured to send relevant information of the target AI network model to the first device, where the target AI network model corresponds to the first information, or to send relevant information of the first AI network model based on the first information, where the first AI network model is used for compression processing to obtain a second AI network model, and the second AI network model corresponds to the first information.
- the target AI network model is a compressed AI network model or an uncompressed AI network model.
- the first information includes at least one of the following:
- the first capability information indicates the AI network model compression capability of the first device and/or the AI network model compression method supported by the first device;
- the first requirement information indicates the size information of the AI network model required by the first device
- the first application information indicates the functional information of the AI network model required by the first device
- the second information including information related to resource usage of the first device
- the first indication information indicates a device for compressing the AI network model.
- the first indication information instructs the first device, the second device or the third device to compress the AI network model.
- the second sending module 902 is specifically used for:
- sending relevant information of the AI network model matching the first information to the first device, where the target AI network model includes the AI network model matching the first information.
- the AI network model interaction device 900 also includes:
- a first processing module configured to compress the first AI network model according to the first information to obtain a second AI network model
- the second sending module 902 is specifically used for:
- in a case where the second device does not have an AI network model matching the first information and the first indication information instructs the first device to compress the AI network model, sending to the first device relevant information of the first AI network model and third information, where the third information is used to compress the first AI network model into a second AI network model, and the second AI network model corresponds to the first information;
- or, the second sending module 902 is specifically used for:
- in a case where the second device does not have an AI network model matching the first information and the first indication information instructs the third device to compress the AI network model, sending to the third device relevant information of the first AI network model and the third information.
- the third information includes at least one of the following: an AI network model compression method used when compressing the first AI network model and restriction information related to AI network model compression.
- the AI network model interaction device 900 also includes:
- the second receiving module is used to receive relevant information of the second AI network model.
- the AI network model interaction device 900 also includes:
- the third receiving module is configured to receive a judgment result from the first device, where the judgment result is used to represent the matching result between the second AI network model and the first information.
- the AI network model interaction device 900 further includes:
- a fourth receiving module configured to receive first request information from the first device, and update at least one of the third information and the first AI network model according to the first request information
- a sixth sending module, configured to send the updated third information and/or relevant information of the updated first AI network model.
- the AI network model interaction device 900 further includes:
- a fifth receiving module configured to receive updated first information from the first device
- a seventh sending module, configured to send to the first device relevant information of the target AI network model matching the updated first information, or to send, based on the updated first information, relevant information of a third AI network model, where the third AI network model is used for compression processing to obtain a fourth AI network model corresponding to the updated first information.
- the AI network model interaction device 900 provided by the embodiment of the present application can implement various processes implemented by the second device in the method embodiment as shown in Figure 7, and can achieve the same beneficial effects. To avoid duplication, they will not be described again here.
- the AI network model interaction device in the embodiment of the present application may be an electronic device, such as an electronic device with an operating system, or may be a component in the electronic device, such as an integrated circuit or chip.
- the electronic device may be a terminal or other devices other than the terminal.
- terminals may include but are not limited to the types of terminals 11 listed above, and other devices may be servers, network attached storage (Network Attached Storage, NAS), etc., which are not specifically limited in the embodiment of this application.
- the AI network model interaction device provided by the embodiment of the present application can implement each process implemented by the method embodiment shown in Figure 2 or Figure 7, and achieve the same technical effect. To avoid duplication, it will not be described again here.
- this embodiment of the present application also provides a communication device 1000, which includes a processor 1001 and a memory 1002.
- The memory 1002 stores programs or instructions that can be run on the processor 1001. For example, when the communication device 1000 serves as the first device, the program or instruction, when executed by the processor 1001, implements each step of the method embodiment shown in Figure 2 and can achieve the same technical effect.
- the communication device 1000 is used as the second device, when the program or instruction is executed by the processor 1001, each step of the method embodiment shown in Figure 7 is implemented, and the same technical effect can be achieved. To avoid repetition, the details will not be described here.
- An embodiment of the present application also provides a communication device, including a processor and a communication interface.
- the communication interface is used to send first information to a second device, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device; the communication interface or the processor is used to obtain relevant information of the target AI network model, and the target AI network model corresponds to the first information.
- Alternatively, the communication interface is used to receive the first information from the first device, and to send relevant information of the target AI network model to the first device or to send relevant information of the first AI network model according to the first information, where the first information includes information related to the compression and/or model inference of the AI network model required by the first device, the target AI network model corresponds to the first information, and the first AI network model is used to perform compression processing to obtain a second AI network model, the second AI network model corresponding to the first information.
- This communication device embodiment corresponds to the method embodiment shown in Figure 2 or Figure 7.
- Each implementation process and implementation manner of the method embodiment shown in Figure 2 or Figure 7 can be applied to this communication device embodiment, and can achieve the same technical effects.
- Embodiments of the present application also provide a readable storage medium with programs or instructions stored on it. When the programs or instructions are executed by a processor, each process of the method embodiment shown in Figure 2 or Figure 7 is implemented, and the same technical effect can be achieved; to avoid repetition, details are not described again here.
- the processor is the processor in the terminal described in the above embodiment.
- the readable storage medium includes Computer-readable storage media, such as computer read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disks or optical disks, etc.
- An embodiment of the present application further provides a chip. The chip includes a processor and a communication interface, the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement each process of the method embodiment shown in Figure 2 or Figure 7, achieving the same technical effect; to avoid repetition, details are not described again here.
- The chips mentioned in the embodiments of this application may also be called system-level chips, system chips, chip systems, or system-on-chip chips.
- Embodiments of the present application further provide a computer program/program product. The computer program/program product is stored in a storage medium and is executed by at least one processor to implement each process of the method embodiment shown in Figure 2 or Figure 7, achieving the same technical effect; to avoid repetition, details are not described again here.
- An embodiment of the present application also provides a communication system, including: a first device and a second device.
- the first device can be used to execute the steps of the AI network model interaction method shown in Figure 2, and the second device can be used to execute the steps of the AI network model interaction method shown in Figure 7.
- The methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform; they can of course also be implemented by hardware, but in many cases the former is the better implementation.
- The technical solution of the present application, in essence or in the part contributing to the related art, can be embodied in the form of a computer software product.
- the computer software product is stored in a storage medium (such as ROM/RAM, disk, CD), including several instructions to cause a terminal (which can be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in various embodiments of this application.
Abstract
The present application discloses an AI network model interaction method, apparatus and communication device, belonging to the field of communication technology. The AI network model interaction method of the embodiments of the present application includes: a first device sends first information to a second device, where the first information includes information related to compression and/or model inference of an AI network model by the first device; the first device acquires relevant information of a target AI network model, where the target AI network model corresponds to the first information.
Description
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202210822781.7 filed in China on July 12, 2022, the entire contents of which are incorporated herein by reference.
This application belongs to the field of communication technology, and specifically relates to an AI network model interaction method, apparatus and communication device.
In the related art, methods of using artificial intelligence (AI) network models to improve the network performance of 5th Generation (5G) communication systems have been studied.
Such AI network models can be built, trained and verified with existing AI tools. The trained AI network model is then exchanged within the wireless communication system so that it can be deployed on the target device that needs to use it, which involves the transmission of the AI network model.
In the related art, because AI network models are large in size or high in complexity, problems arise such as high transmission overhead, heavy occupation of computing resources during inference, and high inference delay.
Summary
Embodiments of the present application provide an AI network model interaction method, apparatus and communication device, which can reduce transmission overhead and/or reduce the computing resources and inference delay occupied during inference.
In a first aspect, an artificial intelligence (AI) network model interaction method is provided, including:
a first device sends first information to a second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device;
the first device acquires relevant information of a target AI network model, where the target AI network model corresponds to the first information.
In a second aspect, an artificial intelligence (AI) network model interaction apparatus applied to a first device is provided, including:
a first sending module, configured to send first information to a second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device;
a first acquisition module, configured to acquire relevant information of a target AI network model, where the target AI network model corresponds to the first information.
In a third aspect, an artificial intelligence (AI) network model interaction method is provided, including:
a second device receives first information from a first device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device;
the second device sends relevant information of a target AI network model to the first device, where the target AI network model corresponds to the first information; or the second device sends relevant information of a first AI network model according to the first information, where the first AI network model is used for compression processing to obtain a second AI network model, and the second AI network model corresponds to the first information.
In a fourth aspect, an artificial intelligence (AI) network model interaction apparatus applied to a second device is provided, including:
a first receiving module, configured to receive first information from a first device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device;
a second sending module, configured to send relevant information of a target AI network model to the first device, where the target AI network model corresponds to the first information, or to send relevant information of a first AI network model according to the first information, where the first AI network model is used for compression processing to obtain a second AI network model, and the second AI network model corresponds to the first information.
In a fifth aspect, a communication device is provided, including a processor and a memory, where the memory stores programs or instructions that can be run on the processor, and the programs or instructions, when executed by the processor, implement the steps of the method described in the first aspect or the third aspect.
In a sixth aspect, a communication device is provided, including a processor and a communication interface, where the communication interface is used to send first information to a second device, the first information including information related to compression and/or model inference of the AI network model required by the first device, and the communication interface or the processor is used to acquire relevant information of a target AI network model, the target AI network model corresponding to the first information; or,
the communication interface is used to receive first information from a first device, and to send relevant information of a target AI network model to the first device or to send relevant information of a first AI network model according to the first information, where the first information includes information related to compression and/or model inference of the AI network model required by the first device, the target AI network model corresponds to the first information, and the first AI network model is used for compression processing to obtain a second AI network model corresponding to the first information.
In a seventh aspect, a communication system is provided, including a first device and a second device, where the first device can be used to execute the steps of the AI network model interaction method described in the first aspect, and the second device can be used to execute the steps of the AI network model interaction method described in the third aspect.
In an eighth aspect, a readable storage medium is provided, with programs or instructions stored on it, where the programs or instructions, when executed by a processor, implement the steps of the method described in the first aspect or the steps of the method described in the third aspect.
In a ninth aspect, a chip is provided, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is used to run programs or instructions to implement the method described in the first aspect or the method described in the third aspect.
In a tenth aspect, a computer program/program product is provided, where the computer program/program product is stored in a storage medium and is executed by at least one processor to implement the steps of the AI network model interaction method described in the first aspect, or is executed by at least one processor to implement the steps of the AI network model interaction method described in the third aspect.
In the embodiments of the present application, the first device sends first information to the second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device; the first device acquires relevant information of a target AI network model, where the target AI network model corresponds to the first information. In this way, when the second device has pre-stored or trained AI network models, the first device can, in the process of obtaining an AI network model from the second device, send the second device information related to compression and/or model inference of the AI network model it requires, so that the second device can determine, according to the first device's requirements, at least one of: the type, size, function and complexity of the AI network model required by the first device, and the parameters, compression method, compression node and so on for compressing the determined AI network model. The second device can thus compress the AI network model according to the first device's requirements and transmit the compressed model, reducing the transmission overhead of the AI network model. In addition, the second device may select, according to the first device's requirements, an AI network model matching the first device's model inference process, reducing the computing resources and inference delay occupied when the first device performs inference with the target AI network model.
Figure 1 is a schematic structural diagram of a wireless communication system to which embodiments of the present application can be applied;
Figure 2 is a flowchart of an AI network model interaction method provided by an embodiment of the present application;
Figure 3 is a schematic diagram of an embodiment of the present application applied to CSI feedback;
Figure 4 is a first schematic diagram of the interaction process between the first device and the second device in an embodiment of the present application;
Figure 5 is a second schematic diagram of the interaction process between the first device and the second device in an embodiment of the present application;
Figure 6 is a schematic diagram of the interaction process among the first device, the second device and the third device in an embodiment of the present application;
Figure 7 is a flowchart of another AI network model interaction method provided by an embodiment of the present application;
Figure 8 is a schematic structural diagram of an AI network model interaction apparatus provided by an embodiment of the present application;
Figure 9 is a schematic structural diagram of another AI network model interaction apparatus provided by an embodiment of the present application;
Figure 10 is a schematic structural diagram of a communication device provided by an embodiment of the present application.
The technical solutions in the embodiments of the present application will be clearly described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application fall within the protection scope of the present application.
The terms "first", "second" and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances so that the embodiments of the present application can be implemented in orders other than those illustrated or described here, and that objects distinguished by "first" and "second" are usually of one type, without limiting their number; for example, there may be one first object, or more than one. In addition, "and/or" in the specification and claims indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
It is worth pointing out that the techniques described in the embodiments of the present application are not limited to Long Term Evolution (LTE)/LTE-Advanced (LTE-A) systems, and can also be used in other wireless communication systems, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal Frequency Division Multiple Access (OFDMA), Single-carrier Frequency Division Multiple Access (SC-FDMA) and other systems. The terms "system" and "network" in the embodiments of the present application are often used interchangeably, and the described techniques can be used both for the systems and radio technologies mentioned above and for other systems and radio technologies. The following description describes a New Radio (NR) system for example purposes and uses NR terminology in most of the description below, but these techniques can also be applied to applications beyond NR systems, such as 6th Generation (6G) communication systems.
Figure 1 shows a block diagram of a wireless communication system to which embodiments of the present application are applicable. The wireless communication system includes a terminal 11 and a network-side device 12. The terminal 11 may be a terminal-side device such as a mobile phone, tablet personal computer, laptop computer (also called a notebook computer), personal digital assistant (PDA), palmtop computer, netbook, ultra-mobile personal computer (UMPC), mobile Internet device (MID), augmented reality (AR)/virtual reality (VR) device, robot, wearable device, vehicle user equipment (VUE), pedestrian user equipment (PUE), smart home device (a home device with wireless communication functions, such as a refrigerator, television, washing machine or furniture), game console, personal computer (PC), teller machine or self-service machine; wearable devices include smart watches, smart bands, smart earphones, smart glasses, smart jewelry (smart bracelets, smart bangles, smart rings, smart necklaces, smart anklets, smart ankle chains, etc.), smart wristbands, smart clothing and so on. It should be noted that the embodiments of the present application do not limit the specific type of the terminal 11. The network-side device 12 may include an access network device or a core network device, where the access network device may also be called a radio access network device, a Radio Access Network (RAN), a radio access network function or a radio access network unit. The access network device may include a base station, a Wireless Local Area Network (WLAN) access point or a WiFi node, etc. The base station may be called a Node B, an evolved Node B (eNB), an access point, a Base Transceiver Station (BTS), a radio base station, a radio transceiver, a Basic Service Set (BSS), an Extended Service Set (ESS), a home Node B, a home evolved Node B, a Transmitting Receiving Point (TRP), or some other suitable term in the field; as long as the same technical effect is achieved, the base station is not limited to a specific technical vocabulary. It should be noted that in the embodiments of the present application, only the base station in an NR system is taken as an example for introduction, without limiting the specific type of the base station.
Artificial intelligence is currently widely used in various fields. There are many implementations of AI network models, such as neural networks, decision trees, support vector machines and Bayesian classifiers. This application takes neural networks as an example for illustration, but does not limit the specific type of the AI network model.
Generally, the AI algorithm selected and the network model adopted differ according to the type of problem to be solved. The main way of improving 5G network performance with AI network models is to enhance or replace existing algorithms or processing modules with neural-network-based algorithms and models. In specific scenarios, neural-network-based algorithms and models can achieve better performance than deterministic algorithms. Commonly used neural networks include deep neural networks, convolutional neural networks and recurrent neural networks. With existing AI tools, the construction, training and verification of neural networks can be realized.
In applications, the size and complexity of an AI network model is a key issue for its deployment. Applying AI solutions in a wireless communication system also involves the transmission of the AI network model, which is likewise affected by the model's size and complexity. A large AI network model has high transmission overhead, occupies many computing resources during inference, and has high inference delay.
In the embodiments of the present application, before acquiring the AI network model, the first device sends requirement information to the second device to inform it of the size, compression scheme, model complexity and so on of the AI network model the first device needs, so that the first device can obtain an AI network model that better matches its requirement information. When the first device receives a compressed AI network model, the resource overhead of transmitting the model can be reduced; when the first device obtains an AI network model matching the model complexity it needs, the computing resources and delay of model inference at the first device can be reduced.
The AI network model interaction method, AI network model interaction apparatus, communication device and so on provided by the embodiments of the present application are described in detail below through some embodiments and their application scenarios with reference to the accompanying drawings.
Referring to Figure 2, an embodiment of the present application provides an AI network model interaction method whose execution subject is a first device. As shown in Figure 2, the AI network model interaction method executed by the first device may include the following steps:
Step 201: The first device sends first information to the second device, where the first information includes information related to compression and/or model inference of the AI network model required by the first device.
The first device may be the demander of the AI network model, and the second device may be the sender of the AI network model. For example, the second device obtains an AI network model through training and sends relevant information of the trained AI network model to the first device; the relevant information of the AI network model may be the parameters or model file of the AI network model, i.e. data that enables the first device to perform AI network model inference (that is, to apply the AI network model). It can be understood that "transmitting an AI network model" in the following embodiments can be interpreted as "transmitting the parameters or model file of the AI network model". In implementation, the first device may be a terminal, such as the various types of terminal 11 listed in Figure 1, or a network-side device, such as the network-side device 12 listed in the embodiment shown in Figure 1 or a core network device; the second device may likewise be a terminal or a network-side device, such as an access network device or a core network device. For ease of description, the following embodiments usually take the first device being a terminal and the second device being a base station as an example, which does not constitute a specific limitation.
It should be noted that in the embodiments of the present application, the information interaction among the first device, the second device and the third device may use new signaling or information, or reuse signaling or information in the related art.
Specifically, the first device, the second device and the third device may each be a terminal or a network-side device. Depending on whether the signal sending end and signal receiving end involved in the information interaction among the first device, the second device and the third device are terminals or network-side devices, the following four cases of reusing signaling or information in the related art can be distinguished:
Case 1) Assuming that in the information interaction between the first device and the second device or the third device, the information sending end is a terminal and the information receiving end is a network-side device, the information in the interaction (such as at least one of the above first information, the above matching result, the above first request information, the relevant information of the AI network model received by the first device, and the above third information) may be carried in at least one of the following signaling or information:
layer 1 signaling of the Physical Uplink Control Channel (PUCCH);
MSG 1 of the Physical Random Access Channel (PRACH);
MSG 3 of the PRACH;
MSG A of the PRACH;
information of the Physical Uplink Shared Channel (PUSCH).
Case 2) Assuming that in the information interaction between the first device and the second device or the third device, the information sending end is a network-side device and the information receiving end is a terminal, the information in the interaction (such as at least one of the above first information, the above matching result, the above first request information, the relevant information of the AI network model received by the first device, and the above third information) may be carried in at least one of the following signaling or information:
a Medium Access Control Control Element (MAC CE);
a Radio Resource Control (RRC) message;
a Non-Access Stratum (NAS) message;
a management and orchestration message;
user-plane data;
Downlink Control Information (DCI);
a System Information Block (SIB);
layer 1 signaling of the Physical Downlink Control Channel (PDCCH);
information of the Physical Downlink Shared Channel (PDSCH);
MSG 2 of the PRACH;
MSG 4 of the PRACH;
MSG B of the PRACH.
Case 3) Assuming that in the information interaction between the first device and the second device or the third device, the information sending end and the information receiving end are different terminals, the information in the interaction (such as at least one of the above first information, the above matching result, the above first request information, the relevant information of the AI network model received by the first device, and the above third information) may be carried in at least one of the following signaling or information:
Xn interface signaling;
PC5 interface signaling;
information of the Physical Sidelink Control Channel (PSCCH);
information of the Physical Sidelink Shared Channel (PSSCH);
information of the Physical Sidelink Broadcast Channel (PSBCH);
information of the Physical Sidelink Discovery Channel (PSDCH);
information of the Physical Sidelink Feedback Channel (PSFCH).
Case 4) Assuming that in the information interaction between the first device and the second device or the third device, the information sending end and the information receiving end are different network-side devices, the information in the interaction (such as at least one of the above first information, the above matching result, the above first request information, the relevant information of the AI network model received by the first device, and the above third information) may be carried in at least one of the following signaling or information:
S1 interface signaling;
Xn interface signaling (such as X2 interface signaling).
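The four carrier-selection cases above amount to a lookup keyed by the sender and receiver types. The sketch below is purely illustrative: the dictionary layout and function name are assumptions, since the patent only lists candidate carriers without defining any encoding.

```python
# Illustrative lookup of candidate carriers for the model-interaction
# information (first information, matching result, first request
# information, third information), keyed by (sender type, receiver type).
CARRIERS = {
    ("terminal", "network"): ["PUCCH layer-1", "PRACH MSG 1", "PRACH MSG 3",
                              "PRACH MSG A", "PUSCH"],
    ("network", "terminal"): ["MAC CE", "RRC", "NAS",
                              "management/orchestration", "user-plane data",
                              "DCI", "SIB", "PDCCH layer-1", "PDSCH",
                              "PRACH MSG 2", "PRACH MSG 4", "PRACH MSG B"],
    ("terminal", "terminal"): ["Xn", "PC5", "PSCCH", "PSSCH", "PSBCH",
                               "PSDCH", "PSFCH"],
    ("network", "network"): ["S1", "Xn/X2"],
}

def candidate_carriers(sender: str, receiver: str) -> list:
    """Return the candidate signaling carriers for the given endpoint types."""
    return CARRIERS[(sender, receiver)]
```

A sending device could intersect this list with the carriers it actually supports before transmitting.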
Step 202: The first device acquires relevant information of a target AI network model, where the target AI network model corresponds to the first information.
Acquiring the target AI network model in step 202 may be receiving the target AI network model, for example receiving the target AI network model from the second device or the third device; alternatively, it may be obtaining the target AI network model through further processing based on the second device's feedback to the first information, for example: the second device sends a first AI network model to the first device according to the first information, and the first device compresses the first AI network model to obtain the target AI network model.
In application, after acquiring the relevant information of the target AI network model, the first device can perform model inference, i.e. apply the target AI network model, for example using the target AI network model to replace a functional module of a communication system in the related art. In implementation, replacing a module of a system in the related art with an AI network model can effectively improve system performance.
For example, as shown in Figure 3, in the Channel State Information (CSI) feedback process, an AI encoder and an AI decoder can be used to replace conventional CSI computation, which can greatly improve the corresponding system performance at the same overhead. As shown in Table 1 below, the AI solution that replaces conventional CSI computation with an AI encoder and AI decoder can improve the spectral efficiency of the communication system by about 30% compared with the NR specified solution of the related art.
Table 1
The simulation conditions for Table 1 are: system level simulation (SLS); Urban Micro (UMi) scenario per 38.901; 7 cells with 3 sectors for each cell; UE speed 3 km/h; carrier frequency 3.5 GHz; 32 gNB antennas ([Mg Ng M N P] = [1 1 2 8 2]); 4 UE antennas ([Mg Ng M N P] = [1 1 2 2 2]); 52 resource blocks (RBs); the overhead of the Precoding Matrix Indicator (PMI) is 58 bits. In [Mg Ng M N P], Mg is the number of antenna panels contained in a column of the antenna panel array; Ng is the number of antenna panels contained in a row of the antenna panel array; M is the number of antennas in a column on one panel; N is the number of antennas in a row on one panel; and P is the number of polarization directions of the antennas.
The first information including information related to compression and/or model inference of the AI network model required by the first device may mean at least one of the following:
the first information includes requirement information of the first device; in this way, the first device informs the second device of the function, type, size and complexity of the AI network model it needs, whether to compress the AI network model, the compression method to be used when compressing, and so on, so that the second device provides a target AI network model meeting the first device's requirements;
the first information includes capability information of the first device; in this way, the first device informs the second device of its own capability information (such as the available computing power supported by the first device, the computing power available for model compression, model compression capability, model compression methods, etc.), so that the second device provides the first device with a target AI network model it can support.
As an optional implementation, the first device acquiring relevant information of the target AI network model includes:
the first device receives relevant information of the target AI network model, where the target AI network model is a compressed AI network model or an uncompressed AI network model; or,
the first device receives relevant information of a first AI network model from the second device and compresses the first AI network model to obtain a second AI network model, where the target AI network model includes the second AI network model.
In one implementation, when the second device has an AI network model matching the first information, the first device acquiring relevant information of the target AI network model includes:
the first device receives, from the second device, relevant information of the AI network model matching the first information, where the target AI network model includes the AI network model matching the first information.
The AI network model matching the first information may be an AI network model that satisfies the capability of the first device, for example: the resources occupied by inference on the model (such as power, computing resources and storage resources) are less than or equal to the resources available to the first device, or the model complexity during inference is less than or equal to the maximum complexity the first device can support.
For example, as shown in Figures 4, 5 and 6, assuming the first information includes parameters related to model compression, the second device can determine from the first information a first model size of the AI network model required by the first device; if an AI network model satisfying the first model size requirement is found in a first model library, the target AI network model satisfying that requirement can be sent to the first device.
In this implementation, when the second device has an AI network model matching the first information, the first device can receive that model from the second device. In this way, the first device obtains an AI network model matching the first information, which can reduce the computing power and inference delay when the first device applies the model.
In another implementation, when the second device does not have an AI network model matching the first information, for example when the AI network model the second device has is larger than the size indicated in the first information, an AI network model matching the first information can be obtained by compressing the model the second device has. In this case, the device compressing the AI network model may be the first device, the second device or the third device.
1) When the device compressing the AI network model is the second device, the second device compresses the model it has according to the first information and sends the compressed AI network model to the first device.
For example, as shown in Figure 4, assuming the first information includes parameters related to model compression, the second device can determine from the first information a first model size of the AI network model required by the first device; if no AI network model satisfying the first model size requirement is found in the first model library, the first AI network model in the first model library can be compressed to obtain a second AI network model satisfying the first model size requirement, which is then sent to the first device.
In this way, the resources occupied in transmitting the AI network model between the second device and the first device can be reduced, and since the AI network model the first device obtains matches the first information, the computing power and inference delay of the first device in applying the model can also be reduced.
2) When the device compressing the AI network model is the third device, the second device selects its own first AI network model according to the first information, determines model compression information according to the difference between the first AI network model and the first information, and sends the first AI network model and the model compression information to the third device, so that the third device compresses the first AI network model according to the model compression information to obtain a compressed AI network model corresponding to the first information and sends it to the first device. In this way, the resources occupied when the first device receives the AI network model can be reduced, and since the model the first device obtains matches the first information, the computing power and inference delay of the first device in applying the model can be reduced.
For example, as shown in Figure 5, assuming the first information includes parameters related to model compression, the second device can determine from the first information a first model size of the AI network model required by the first device; if no AI network model satisfying the first model size requirement is found in the first model library, the second device can send relevant information of the first AI network model in the first model library and third information to the third device, so that the third device compresses the received first AI network model according to the third information to obtain a second AI network model, and the third device sends the second AI network model to the first device.
3) When the device compressing the AI network model is the first device, the second device selects its own first AI network model according to the first information, determines model compression information according to the difference between the first AI network model and the first information, and sends the first AI network model and the model compression information to the first device, so that the first device compresses the first AI network model according to the model compression information to obtain a compressed AI network model corresponding to the first information. In this way, although the resources occupied when the first device receives the AI network model are not reduced, the model the first device obtains matches the first information, which can reduce the computing power and inference delay of the first device in applying the model.
For example, as shown in Figure 6, assuming the first information includes parameters related to model compression, the second device can determine from the first information a first model size of the AI network model required by the first device; if no AI network model satisfying the first model size requirement is found in the first model library, the second device can send relevant information of the first AI network model in the first model library and third information to the first device, so that the first device compresses the received first AI network model according to the third information to obtain a second AI network model.
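The three compression-node cases above (Figures 4 to 6) can be sketched as a dispatch on the first indication information. The message tuples, field names and the placeholder `compress` routine are illustrative assumptions; the patent does not define a concrete encoding.

```python
def compress(model: dict, third_info: dict) -> dict:
    # Placeholder compression: scale the parameter count by the target ratio.
    return {"params": int(model["params"] * third_info["ratio"])}

def dispatch(first_indication: str, first_model: dict, third_info: dict):
    """Return (recipient, payload) for the second device to send."""
    if first_indication == "second_device":
        # Case 1): compress locally, send the compressed model (Figure 4).
        return ("first_device",
                ("second_model", compress(first_model, third_info)))
    if first_indication == "third_device":
        # Case 2): delegate compression to the third device (Figure 5).
        return ("third_device",
                ("first_model+third_info", first_model, third_info))
    # Case 3): let the first device compress (Figure 6).
    return ("first_device",
            ("first_model+third_info", first_model, third_info))

recipient, payload = dispatch("second_device",
                              {"params": 1_000_000}, {"ratio": 0.25})
```

In cases 2) and 3) the uncompressed first AI network model travels with the third information, so only case 1) reduces the size of what the recipient must download.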
As an optional implementation, the first information includes at least one of the following:
first capability information, indicating the compression capability of the first device for AI network models and/or the AI network model compression methods supported by the first device;
first requirement information, indicating the size information of the AI network model required by the first device;
first application information, indicating the function information of the AI network model required by the first device;
second information, including information related to the resource usage of the first device;
first indication information, indicating the device that compresses the AI network model.
Option one: the first capability information can reflect the first device's compression capability for AI network models and/or the AI network model compression methods it supports, where the model compression capability or model compression method may include at least one of: knowledge distillation, pruning, low-rank decomposition, tensor decomposition and so on, without being exhaustive. In implementation, the first capability may be a field, for example: 0000 indicates that the first device supports knowledge distillation, 0001 indicates pruning, 0010 indicates low-rank decomposition, and 0011 indicates tensor decomposition. After learning the first device's compression capability and/or supported compression methods, the second device can decide whether the first device should compress the AI network model.
Option two: the first requirement information can reflect the size information of the AI network model required by the first device (i.e. the first device's requirement on the model size of the target AI network model). The size information may include at least one of: the upper limit of the model size of the target AI network model, the lower limit of its model size, its model size level, the upper limit of its parameter count, the lower limit of its parameter count, its parameter count level, the upper limit of its complexity, the lower limit of its complexity, its complexity level, the upper limit of its performance, the lower limit of its performance, and its performance level.
In implementation, when the second device learns the first requirement information of the first device, it can select from the AI network models it has a target AI network model matching that information, for example one that is less than or equal to the upper limit of the model size required by the first device.
Alternatively, when the second device does not have a target AI network model matching the first requirement information, it can compress an AI network model it has based on the first information to obtain a matching target AI network model. For example, if the size of the AI network model the second device has exceeds the upper limit of the model size required by the first device, the second device can decide what compression to apply according to the difference between that model's size and the required size, so that the compressed AI network model is less than or equal to the upper limit required by the first device.
Option three: the first application information can reflect the function information of the AI network model required by the first device, for example an AI network model for implementing at least one of CSI prediction, CSI compression, beam prediction, traffic prediction and other functions.
In implementation, when the second device learns the function of the AI network model the first device needs, it can select from the models it has a target AI network model matching the first application information, for example one whose function corresponds to the function information in the first application information; the second device can then directly or indirectly provide the first device with an AI network model implementing that function.
Option four: the second information can reflect the resource usage of the first device, which may include power usage, storage resource usage, computing resource usage, transmission resource usage and so on. For example, the second information may include at least one of: the available computing power of the first device, its available computing power ratio, its available computing power level, its available power, its available power ratio, its available power level, its available storage, its available storage ratio, and its available storage level.
In implementation, when the second device learns the resource usage of the first device, it can select from the models it has a target AI network model matching that usage, for example one whose resource occupation is less than or equal to the resources available to the first device or available for AI network model compression or inference; the second device can then directly or indirectly provide the selected model to the first device. This reduces the risk that the model's resource occupation exceeds the first device's actually available resources, improves the first device's resource utilization during model inference, and reduces the delay of the inference process.
Option five: the first indication information indicates the device that compresses the AI network model. When the second device does not have an AI network model of the size required by the first device, so that a model the second device has must be compressed, one or more devices can be designated to perform the compression, where the designated device may include at least one of the first device, the second device and the third device. The third device may be any device other than the first and second devices, for example an entity within the communication network (such as a terminal, base station, core network device or other network-layer entity) or a third-party device outside the communication network, whose functions include at least model compression.
In implementation, knowing which device is to compress the AI network model, the second device can send the AI network model to be compressed and the model compression information to that device, so that the device compresses the model into an AI network model of the size required by the first device.
It should be noted that the first information may include one or at least two of options one to five above. For example, assuming the first information includes the first capability information, the first requirement information and the first application information, the second device may first select from its models at least one AI network model matching the first application information; if it judges that none of them matches the model size corresponding to the first requirement information, and the first capability information indicates that the first device supports knowledge distillation, the second device may select from them the one closest to the model size corresponding to the first requirement information and send it to the first device, and may also send third information to the first device, where the third information may include the parameter information needed to compress the model being sent into one matching the model size corresponding to the first requirement information, such as the compression method and compression level.
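Options one to five above describe the fields of the first information and how the second device can match a model against them. The sketch below is an illustrative assumption: the field names and the selection rule (function match plus size bound) are not a normative encoding.

```python
from dataclasses import dataclass

# Sketch of the first information (options one to five) and of the second
# device's model selection. All names are hypothetical.
@dataclass
class FirstInformation:
    capability: list          # supported compression methods (option one)
    max_model_size: int       # size upper bound in parameters (option two)
    application: str          # required function, e.g. "CSI compression" (option three)
    available_compute: int    # resource-usage-related info (option four)
    compression_node: str     # which device compresses (option five)

def select_model(model_library: list, info: FirstInformation):
    """Return the first model matching the required function and size
    bound, or None if the second device has no matching model."""
    for model in model_library:
        if (model["function"] == info.application
                and model["size"] <= info.max_model_size):
            return model
    return None

info = FirstInformation(["knowledge distillation"], 500_000,
                        "CSI compression", 100, "first_device")
library = [{"function": "CSI compression", "size": 2_000_000},
           {"function": "CSI compression", "size": 400_000}]
chosen = select_model(library, info)
```

A `None` result is the "second device does not have a matching AI network model" branch, which triggers the compression flow described in the following implementations.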
In a first optional implementation, when the second device does not have an AI network model matching the first information and the first indication information instructs the first device to compress the AI network model, the first device acquiring relevant information of the target AI network model includes:
the first device receives relevant information of a first AI network model and third information from the second device, where the third information is used to compress the first AI network model into a second AI network model;
the first device compresses the first AI network model according to the third information to obtain the second AI network model, where the target AI network model includes the second AI network model.
The second device not having an AI network model matching the first information may mean that none of the AI network models the second device has that satisfy a first part of the first information satisfies the first requirement information, where the first part of the first information may include the information in the first information other than the first requirement information, such as at least one of the first capability information, the first application information and the second information.
The third information may include the information needed to compress the first AI network model into the second AI network model corresponding to the first information.
Optionally, the third information includes at least one of the following: the AI network model compression method used when compressing the first AI network model, and restriction information related to AI network model compression.
The compression method may include at least one of: knowledge distillation, pruning, low-rank decomposition, tensor decomposition and so on.
The restriction information related to AI network model compression may include at least one of: the maximum compression limit (such as compression ratio or compression level), the upper limit of the parameter count after compression, the lower limit of the parameter count after compression, the upper limit of the model size after compression, the lower limit of the model size after compression, and so on.
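One concrete instance of the listed compression methods is magnitude pruning, constrained by a restriction from the third information (here, a maximum fraction of weights to remove). This is an illustrative sketch of one named technique, not the patent's mandated procedure.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, max_prune_ratio: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights, up to max_prune_ratio of
    the total; the ratio plays the role of a compression restriction."""
    k = int(weights.size * max_prune_ratio)       # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.sort(np.abs(weights).ravel())[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0     # drop the weakest weights
    return pruned

w = np.array([0.05, -0.9, 0.01, 0.7, -0.02, 0.4])
pruned = magnitude_prune(w, max_prune_ratio=0.5)  # remove the smallest half
```

The zeroed weights need not be transmitted (e.g. with a sparse encoding), which is how pruning lowers the model's transmission overhead.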
In implementation, the third information can be determined according to the difference between the AI network model required by the first information and the first AI network model.
Of course, the third information may include at least part of the first information, such as the first requirement information; in this way, the device receiving the third information can determine what compression to apply to the first AI network model according to the difference between the AI network model corresponding to the third information and the first AI network model.
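Deriving the third information from that size gap can be sketched as below. The field names (`method`, `target_ratio`, `max_size_after_compression`) are illustrative assumptions standing in for the compression method and restriction information the patent lists.

```python
def derive_third_info(first_model_size: int, required_max_size: int,
                      supported_methods: list) -> dict:
    """Build compression instructions from the gap between the first AI
    network model's size and the size bound in the first information."""
    if first_model_size <= required_max_size:
        return {"method": None, "target_ratio": 1.0}  # no compression needed
    return {
        # a method the compressing device is known to support
        "method": supported_methods[0],
        # restriction info: how far the model must shrink
        "target_ratio": required_max_size / first_model_size,
        "max_size_after_compression": required_max_size,
    }

third_info = derive_third_info(2_000_000, 500_000, ["pruning", "low-rank"])
```

The second device would send this alongside the first AI network model to whichever device the first indication information designates as the compression node.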
本实施方式中,第二设备根据第一信息选择已经训练或存储的第一AI网络模型(例如:选择与第一信息最接近的一个作为第一AI网络模型,或者选择进模型尺寸与第一信息所要求的模型尺寸不匹配的一个作为第一AI网络模型),并根据该第一AI网络模型与第一信息中要求的模型尺寸的差异,以及第一设备支持的模型压缩能力等信息来确定第三信息,并将第三信息和第一AI网络模型发送给第一设备,这样,目标AI网络模型可以由第一设备根据第三信息对第一AI网络模型进行压缩处理得到。
在第二种可选的实施方中,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第二设备对AI网络模型进行压缩的情况下,所述第一设备获取目标AI网络模型的相关信息,包括:
所述第一设备接收来自所述第二设备的第二AI网络模型的相关信息,其中,所述目标AI网络模型包括所述第二AI网络模型,所述第二AI网络模型为根据所述第一信息对所述第二设备具有的第一AI网络模型进行压缩处理后得到的AI网络模型。
其中,上述第二设备不具有与所述第一信息匹配的AI网络模型的含义与第一种可选的实施方相同,在此不再赘述。
本实施方式中,第二设备根据第一信息选择已经训练或存储的第一AI网络模型,并根据该第一AI网络模型与第一信息中要求的模型尺寸的差异,以及第二设备支持的模型
压缩能力等信息来对第一AI网络模型进行压缩处理得到第二AI网络模型,然后再将该第二AI网络模型转发给第一设备。
在第三种可选的实施方中,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第三设备对AI网络模型进行压缩的情况下,所述第一设备获取目标AI网络模型的相关信息,包括:
所述第一设备接收来自所述第三设备的第二AI网络模型的相关信息,其中,所述目标AI网络模型包括所述第二AI网络模型,所述第二AI网络模型为对来自所述第二设备的第一AI网络模型进行压缩处理后得到的AI网络模型。
其中,上述第二设备不具有与所述第一信息匹配的AI网络模型的含义,以及第三信息的含义和作用与第一种可选的实施方相同,在此不再赘述。
本实施方式中,第二设备根据第一信息选择已经训练或存储的第一AI网络模型,并根据该第一AI网络模型与第一信息中要求的模型尺寸的差异,以及第三设备支持的模型压缩能力等信息来确定第三信息,并将第三信息和第一AI网络模型发送给第三设备,这样,目标AI网络模型可以由第三设备根据第三信息对第一AI网络模型进行压缩处理得到,并转发给第一设备。
需要说明的是,在由第三设备根据第三信息对第一AI网络模型进行压缩处理得到第二AI网络模型的情况,第一设备还可以向第三设备发送第一信息,或者该第一信息中与模型压缩相关的部分,这样,第三设备还可以根据该第一信息或者该第一信息中与模型压缩相关的部分来决定对第一AI网络模型进行怎样的压缩处理。
作为一种可选的实施方式,在所述第一设备根据所述第三信息对所述第一AI网络模型进行压缩处理,得到所述第二AI网络模型之后,所述方法还包括:
所述第一设备向所述第二设备发送所述第二AI网络模型的相关信息。
本实施方式中,第一设备在根据第三信息对所述第一AI网络模型进行压缩处理,得到第二AI网络模型之后,将该第二AI网络模型发送给第二设备,这样,在后续的模型传输过程中,第二AI网络模型具有的AI网络模型将包括该第二AI网络模型,从而可以直接传输该第二AI网络模型,而无需再次对第一AI网络模型进行压缩处理。
与之相似的,在由第三设备根据所述第三信息对所述第一AI网络模型进行压缩处理,得到所述第二AI网络模型的情况下,第二设备也可以从第三设备获取该第二AI网络模型。
在实施中,如图5和图6所示,对于根据第三信息对第一AI网络模型进行压缩得到的第二AI网络模型,第一设备还可以进一步判断该第二AI网络模型是否满足第一信息的要求,如:第二AI网络模型满足第一信息中的模型大小要求,这样,第一设备可以在判断该第二AI网络模型确实满足第一信息的要求的情况下,第二设备才会从第一设备和第三设备中的至少一个获取第二AI网络模型的相关信息。
作为一种可选的实施方式,在所述第一设备获取到由所述第一设备或第三设备压缩得到的第二AI网络模型的情况下,所述方法还包括:
所述第一设备获取所述第二AI网络模型与所述第一信息的匹配结果;
所述第一设备向所述第二设备发送所述匹配结果。
本实施方式中,在目标AI网络模型由第一设备或第三设备进行压缩处理得到的情况下,第一设备还获取目标AI网络模型与第一信息的匹配结果,例如:目标AI网络模型与第一设备要求的模型大小是否一致,且向第二设备反馈该匹配结果。这样,若该匹配结果表示目标AI网络模型与第一信息不匹配,则可以进行以下处理中的任一项:
1)第一设备改变第一信息,重新向第二设备请求AI网络模型。该过程与本申请实施例提供的AI网络模型交互方法的过程相似,在此不再赘述。
2)第一设备发送第一请求信息,此时第一信息不变,且可以不向第二设备发送该第一信息,第二设备可以根据该第一请求信息采用不同的第三信息进行压缩处理,或者对不同的第一AI网络模型进行压缩处理。例如:第一请求信息可以携带不要压缩发前面已压缩过的第一AI网络模型的建议信息和/或携带第二设备改变第三信息的建议信息。或者,该第一请求信息也可以是不携带上述建议,而是由第二设备决定重新压缩哪一个第一AI网络模型,以及是否修改第三信息。
3)放弃第一设备的AI网络模型请求。
在一种可选的实施方式中,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述AI网络模型交互方法还包括:
所述第一设备向所述第二设备发送第一请求信息,所述第一请求信息用于请求所述第二设备更新所述第三信息和所述第一AI网络模型中的至少一项。
本实施方式中,第一设备在获取到不符合需要的模型大小的AI网络模型时,可以向第二设备发送第一请求信息,以使第二设备根据该第一请求信息更新以下至少一项:压缩的第一AI网络模型和压缩过程中使用的第三信息,直至第一设备获取到符合需要的模型大小的AI网络模型。
在另一种可选的实施方式中,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述AI网络模型交互方法还包括:
所述第一设备更新所述第一信息,并向所述第二设备发送更新后的第一信息;
所述第一设备获取与所述更新后的第一信息对应的目标AI网络模型。
本实施方式中,第一设备在获取到不符合需要的模型大小的AI网络模型时,可以更新第一信息中的至少一项,例如:更新第一指示信息、第一要求信息、第一能力信息和第二信息中的至少一项,以使第二设备根据更新后的第一信息更新第一AI网络模型和第三信息中的至少一项,以使根据更新后的第三信息对第一AI网络模型进行压缩能够得到与更新后的第一信息相匹配的目标AI网络模型。
值得提出的是,如图5所示,在由第三设备根据第三信息对第一AI网络模型进行压缩处理得到第二AI网络模型的情况下,第一设备还可以向第三设备发送上述匹配结果,以使第三设备在确定第二AI网络模型与第一信息匹配的情况下,向第二设备发送压缩后的第二AI网络模型;在确定第二AI网络模型与第一信息不匹配的情况下,不向第二设备发送压缩后的第二AI网络模型。这样,可以减少第二AI网络模型与第一信息不匹配时传输第二AI网络模型所造成的资源浪费。
需要说明的是,如图5所示,在由第三设备根据第三信息对第一AI网络模型进行压缩处理得到第二AI网络模型的情况下,也可以由第一设备根据匹配结果来决定是否向第二设备发送第二AI网络模型,此时,第一设备可以不向第三设备发送匹配结果,且第三设备也不需要根据匹配结果来决定是否向第二设备发送第二AI网络模型。
在本申请实施例中,第一设备向第二设备发送第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;所述第一设备获取目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应。这样,在第二设备预先存储或者训练得到AI网络模型的情况下,第一设备在从第二设备获取AI网络模型的过程中,第一设备可以向第二设备发送该第一设备所需要的AI网络模型的压缩和/或模型推理相关的信息,以使第二设备能够按照第一设备的需求来确定以下至少一项:第一设备需要的AI网络模型的类型、尺寸、功能、复杂程度,以及对确定的AI网络模型进行压缩处理时的参数、压缩方法、压缩节点等,这样,能够使第二设备按照第一设备的需求对AI网络模型进行压缩,并对压缩后的AI网络模型进行传输,能够降低AI网络模型的传输开销;此外,第二设备还按照第一设备的需求选择与第一设备的模型推理过程相匹配的AI网络模型,能够降低第一设备对目标AI网络模型进行推理时占用的计算资源和推理时延。
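为便于理解,上述整体交互流程可以用如下 Python 草图概括(其中的数据结构与"压缩"方式均为示例性假设,实际的压缩方法与信令格式由具体实现决定):第一设备发出含尺寸要求的请求,第二设备逐轮压缩,直至模型与第一信息匹配:

```python
def negotiate_model(library, first_info, max_rounds=4):
    """示意:按第一信息请求模型,不匹配则继续请求进一步压缩"""
    model = next((m for m in library
                  if m["function"] == first_info["function"]), None)
    if model is None:
        return None  # 第二设备不具有功能匹配的AI网络模型
    weights = list(model["weights"])
    for _ in range(max_rounds):
        # 第一设备侧的匹配判断:模型尺寸是否满足第一要求信息
        if len(weights) <= first_info["max_params"]:
            return {"function": model["function"], "weights": weights}
        # 不匹配:相当于发送第一请求信息,请求进一步压缩
        # (此处以"保留幅值最大的一半权重"示意一次压缩)
        weights = sorted(weights, key=abs, reverse=True)[: max(1, len(weights) // 2)]
    return None

lib = [{"function": "positioning",
        "weights": [0.9, -0.1, 0.5, 0.05, -0.7, 0.2, 0.3, -0.6]}]
got = negotiate_model(lib, {"function": "positioning", "max_params": 2})
# 两轮压缩后保留幅值最大的 2 个权重:[0.9, -0.7]
```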
请参阅图7,本申请实施例提供的另一种AI网络模型交互方法,其执行主体是第二设备,如图7所示,该第二设备执行的AI网络模型交互方法可以包括以下步骤:
步骤701、第二设备接收来自第一设备的第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息。
步骤702、所述第二设备向所述第一设备发送目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应,或者,所述第二设备根据所述第一信息发送第一AI网络模型的相关信息,其中,所述第一AI网络模型用于进行压缩处理得到第二AI网络模型,所述第二AI网络模型与所述第一信息对应。
本申请实施例中,上述第一设备、第二设备、第一信息、第一AI网络模型的相关信息、第二AI网络模型的含义和作用同如图2所示方法实施例中的第一设备、第二设备、第一信息、第一AI网络模型的相关信息、第二AI网络模型的含义和作用,在此不再赘述。
可选地,所述目标AI网络模型为压缩后的AI网络模型或未压缩的AI网络模型。
可选地,所述第一信息包括以下至少一项:
第一能力信息,所述第一能力信息指示所述第一设备对AI网络模型的压缩能力和/或所述第一设备支持的AI网络模型压缩方法;
第一要求信息,所述第一要求信息指示所述第一设备需要的AI网络模型的尺寸信息;
第一应用信息,所述第一应用信息指示所述第一设备需要的AI网络模型的功能信息;
第二信息,所述第二信息包括所述第一设备的资源使用率相关的信息;
第一指示信息,所述第一指示信息指示对AI网络模型进行压缩的设备。
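上述第一信息的各个可选字段可以用如下 Python 数据结构示意(字段名为本文假设,本申请并未限定具体编码方式):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FirstInformation:
    # 第一能力信息:第一设备的压缩能力/支持的压缩方法
    supported_compression: List[str] = field(default_factory=list)
    # 第一要求信息:第一设备需要的AI网络模型的尺寸信息
    max_model_size_bytes: Optional[int] = None
    # 第一应用信息:第一设备需要的AI网络模型的功能信息
    required_function: Optional[str] = None
    # 第二信息:第一设备的资源使用率相关的信息
    resource_usage: Optional[float] = None
    # 第一指示信息:由哪个设备对AI网络模型进行压缩
    # (取值示意:"first" / "second" / "third")
    compressing_device: Optional[str] = None

req = FirstInformation(supported_compression=["pruning", "quantization"],
                       max_model_size_bytes=2_000_000,
                       required_function="csi_feedback",
                       compressing_device="first")
```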
可选地,所述第一指示信息指示所述第一设备、所述第二设备或第三设备对AI网络模型进行压缩。
可选地,所述第二设备发送目标AI网络模型的相关信息,包括:
所述第二设备在具有与所述第一信息匹配的AI网络模型的情况下,向所述第一设备发送所述与所述第一信息匹配的AI网络模型,所述目标AI网络模型包括所述与所述第一信息匹配的AI网络模型。
可选地,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第二设备对AI网络模型进行压缩的情况下,所述方法还包括:
所述第二设备根据所述第一信息对第一AI网络模型进行压缩处理,得到第二AI网络模型;
所述第二设备发送目标AI网络模型的相关信息,包括:
所述第二设备向所述第一设备发送所述第二AI网络模型的相关信息,所述目标AI网络模型包括所述第二AI网络模型。
可选地,所述第二设备根据所述第一信息发送第一AI网络模型的相关信息,包括:
所述第二设备在不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第一设备对AI网络模型进行压缩的情况下,向所述第一设备发送第一AI网络模型的相关信息和第三信息,其中,所述第三信息用于将所述第一AI网络模型压缩处理成第二AI网络模型,所述第二AI网络模型与所述第一信息对应;和/或,
所述第二设备在不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第三设备对AI网络模型进行压缩的情况下,向所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息。
可选地,所述第三信息包括以下至少一项:对所述第一AI网络模型进行压缩时使用的AI网络模型压缩方法和AI网络模型压缩相关的限制信息。
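上述根据第一指示信息将第一AI网络模型与第三信息发往不同压缩节点的过程,可以用如下 Python 草图示意(标识符与载荷结构均为示例性假设):

```python
def route_compression(indicated_device, first_model, third_info):
    """返回第二设备本次传输的目的端与载荷(示意)"""
    if indicated_device == "second":
        # 第一指示信息指示第二设备压缩:本地压缩后直接发给第一设备
        return ("first_device", {"model": first_model, "compressed": True})
    # 否则将第一AI网络模型和第三信息(压缩方法与压缩相关的限制信息)
    # 发给负责压缩的设备
    payload = {"model": first_model, "third_info": third_info, "compressed": False}
    if indicated_device == "first":
        return ("first_device", payload)
    if indicated_device == "third":
        return ("third_device", payload)
    raise ValueError(f"unknown compressing device: {indicated_device}")

third_info = {"method": "quantization", "limit": {"min_bitwidth": 8}}
dest, payload = route_compression("third", {"name": "model_a"}, third_info)
# dest == "third_device";载荷中携带第一AI网络模型和第三信息
```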
可选地,在所述第二设备向所述第一设备或所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息之后,所述方法还包括:
所述第二设备接收所述第二AI网络模型的相关信息。
可选地,在所述第二设备向所述第一设备或所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息之后,所述方法还包括:
所述第二设备接收来自所述第一设备的判断结果,所述判断结果用于表示所述第二AI网络模型与所述第一信息的匹配结果。
可选地,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述方法还包括:
所述第二设备接收来自所述第一设备的第一请求信息,并根据所述第一请求信息更新所述第三信息和所述第一AI网络模型中的至少一项;
所述第二设备发送更新后的所述第三信息和/或更新后的所述第一AI网络模型的相关信息。
可选地,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述方法还包括:
所述第二设备接收来自所述第一设备的更新后的第一信息;
所述第二设备向所述第一设备发送与所述更新后的第一信息匹配的目标AI网络模型的相关信息,或者,所述第二设备根据所述更新后的第一信息发送第三AI网络模型的相关信息,所述第三AI网络模型用于进行压缩处理得到与所述更新后的第一信息对应的第四AI网络模型。
其中,上述第三AI网络模型与如图2所示方法实施例中的第一AI网络模型相似,都可以是第二设备的模型库中的AI网络模型,区别在于:第一AI网络模型与更新前的第一信息对应,第三AI网络模型与更新后的第一信息对应。
上述第四AI网络模型与如图2所示方法实施例中的第二AI网络模型相似,都可以是对第二设备的模型库中的AI网络模型进行压缩处理得到的AI网络模型,区别在于:第二AI网络模型是对第一AI网络模型进行压缩处理得到的,且与更新前的第一信息对应的AI网络模型,第四AI网络模型是对第三AI网络模型进行压缩处理得到的,且与更新后的第一信息对应的AI网络模型。
本申请实施例提供的第二设备执行的AI网络模型交互方法,与第一设备执行的AI网络模型交互方法相对应,且第一设备和第二设备分别执行各自的AI网络模型交互方法中的步骤,能够降低AI网络模型的传输开销,以及降低第一设备对目标AI网络模型进行推理时占用的计算资源和推理时延。
本申请实施例提供的AI网络模型交互方法,执行主体可以为AI网络模型交互装置。本申请实施例中以AI网络模型交互装置执行AI网络模型交互方法为例,说明本申请实施例提供的AI网络模型交互装置。
请参阅图8,本申请实施例提供的一种AI网络模型交互装置,可以是第一设备内的装置,如图8所示,该AI网络模型交互装置800可以包括以下模块:
第一发送模块801,用于向第二设备发送第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;
第一获取模块802,用于获取目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应。
可选的,第一获取模块802,包括:
第一接收单元,用于接收目标AI网络模型的相关信息,所述目标AI网络模型为压缩后的AI网络模型或未压缩的AI网络模型;
或者,
第二接收单元,用于接收来自所述第二设备的第一AI网络模型的相关信息;
第一处理单元,用于对所述第一AI网络模型进行压缩处理,得到第二AI网络模型,所述目标AI网络模型包括所述第二AI网络模型。
可选的,所述第一信息包括以下至少一项:
第一能力信息,所述第一能力信息指示所述第一设备对AI网络模型的压缩能力和/或所述第一设备支持的AI网络模型压缩方法;
第一要求信息,所述第一要求信息表示所述第一设备需要的AI网络模型的尺寸信息;
第一应用信息,所述第一应用信息表示所述第一设备需要的AI网络模型的功能信息;
第二信息,所述第二信息包括所述第一设备的资源使用率相关的信息;
第一指示信息,所述第一指示信息指示对AI网络模型进行压缩的设备。
可选的,所述第一指示信息指示所述第一设备、所述第二设备或第三设备对AI网络模型进行压缩。
可选的,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第一设备对AI网络模型进行压缩的情况下,第一获取模块802,包括:
第三接收单元,用于接收来自所述第二设备的第一AI网络模型的相关信息和第三信息,其中,所述第三信息用于将所述第一AI网络模型压缩处理成第二AI网络模型;
第二处理单元,用于根据所述第三信息对所述第一AI网络模型进行压缩处理,得到所述第二AI网络模型,其中,所述目标AI网络模型包括所述第二AI网络模型;
和/或,
在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第二设备对AI网络模型进行压缩的情况下,第一获取模块802,具体用于:
接收来自所述第二设备的第二AI网络模型的相关信息,其中,所述目标AI网络模型包括所述第二AI网络模型,所述第二AI网络模型为根据所述第一信息对所述第二设备具有的第一AI网络模型进行压缩处理后得到的AI网络模型;
和/或,
在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第三设备对AI网络模型进行压缩的情况下,第一获取模块802,具体用于:
接收来自所述第三设备的第二AI网络模型的相关信息,其中,所述目标AI网络模型包括所述第二AI网络模型,所述第二AI网络模型为对来自所述第二设备的第一AI网络模型进行压缩处理后得到的AI网络模型。
可选的,所述第三信息包括以下至少一项:对所述第一AI网络模型进行压缩处理时采用的AI网络模型压缩方法和AI网络模型压缩相关的限制信息。
可选的,AI网络模型交互装置800还包括:
第三发送模块,用于向所述第二设备发送所述第二AI网络模型的相关信息。
可选的,在所述第二设备具有与所述第一信息匹配的AI网络模型的情况下,第一获取模块802,具体用于:
接收来自所述第二设备的与所述第一信息匹配的AI网络模型的相关信息,其中,所述目标AI网络模型包括所述与所述第一信息匹配的AI网络模型。
可选的,在所述第一设备获取到由所述第一设备或第三设备压缩得到的第二AI网络模型的情况下,AI网络模型交互装置800还包括:
第二获取模块,用于获取所述第二AI网络模型与所述第一信息的匹配结果;
第四发送模块,用于向所述第二设备发送所述匹配结果。
可选的,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,AI网络模型交互装置800还包括:
第五发送模块,用于向所述第二设备发送第一请求信息,所述第一请求信息用于请求所述第二设备更新所述第三信息和所述第一AI网络模型中的至少一项。
可选的,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,AI网络模型交互装置800还包括:
更新模块,用于更新所述第一信息,并向所述第二设备发送更新后的第一信息;
第三获取模块,用于获取与所述更新后的第一信息对应的目标AI网络模型。
本申请实施例提供的AI网络模型交互装置800,能够实现如图2所示方法实施例中第一设备实现的各个过程,且能够取得相同的有益效果,为避免重复,在此不再赘述。
请参阅图9,本申请实施例提供的另一种AI网络模型交互装置,可以是第二设备内的装置,如图9所示,该AI网络模型交互装置900可以包括以下模块:
第一接收模块901,用于接收来自第一设备的第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;
第二发送模块902,用于向所述第一设备发送目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应,或者,根据所述第一信息发送第一AI网络模型的相关信息,其中,所述第一AI网络模型用于进行压缩处理得到第二AI网络模型,所述第二AI网络模型与所述第一信息对应。
可选的,所述目标AI网络模型为压缩后的AI网络模型或未压缩的AI网络模型。
可选的,所述第一信息包括以下至少一项:
第一能力信息,所述第一能力信息指示所述第一设备对AI网络模型的压缩能力和/或所述第一设备支持的AI网络模型压缩方法;
第一要求信息,所述第一要求信息指示所述第一设备需要的AI网络模型的尺寸信息;
第一应用信息,所述第一应用信息指示所述第一设备需要的AI网络模型的功能信息;
第二信息,所述第二信息包括所述第一设备的资源使用率相关的信息;
第一指示信息,所述第一指示信息指示对AI网络模型进行压缩的设备。
可选的,所述第一指示信息指示所述第一设备、所述第二设备或第三设备对AI网络模型进行压缩。
可选的,第二发送模块902,具体用于:
在所述第二设备具有与所述第一信息匹配的AI网络模型的情况下,向所述第一设备发送所述与所述第一信息匹配的AI网络模型,所述目标AI网络模型包括所述与所述第一信息匹配的AI网络模型。
可选的,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第二设备对AI网络模型进行压缩的情况下,AI网络模型交互装置900还包括:
第一处理模块,用于根据所述第一信息对第一AI网络模型进行压缩处理,得到第二AI网络模型;
第二发送模块902,具体用于:
向所述第一设备发送所述第二AI网络模型的相关信息,所述目标AI网络模型包括所述第二AI网络模型。
可选的,第二发送模块902,具体用于:
在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第一设备对AI网络模型进行压缩的情况下,向所述第一设备发送第一AI网络模型的相关信息和第三信息,其中,所述第三信息用于将所述第一AI网络模型压缩处理成第二AI网络模型,所述第二AI网络模型与所述第一信息对应;和/或,
在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第三设备对AI网络模型进行压缩的情况下,向所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息。
可选的,所述第三信息包括以下至少一项:对所述第一AI网络模型进行压缩时使用的AI网络模型压缩方法和AI网络模型压缩相关的限制信息。
可选的,AI网络模型交互装置900还包括:
第二接收模块,用于接收所述第二AI网络模型的相关信息。
可选的,AI网络模型交互装置900还包括:
第三接收模块,用于接收来自所述第一设备的判断结果,所述判断结果用于表示所述第二AI网络模型与所述第一信息的匹配结果。
可选的,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,AI网络模型交互装置900还包括:
第四接收模块,用于接收来自所述第一设备的第一请求信息,并根据所述第一请求信息更新所述第三信息和所述第一AI网络模型中的至少一项;
第六发送模块,用于发送更新后的所述第三信息和/或更新后的所述第一AI网络模型的相关信息。
可选的,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,AI网络模型交互装置900还包括:
第五接收模块,用于接收来自所述第一设备的更新后的第一信息;
第七发送模块,用于向所述第一设备发送与所述更新后的第一信息匹配的目标AI网络模型的相关信息,或者,根据所述更新后的第一信息发送第三AI网络模型的相关信息,所述第三AI网络模型用于进行压缩处理得到与所述更新后的第一信息对应的第四AI网络模型。
本申请实施例提供的AI网络模型交互装置900,能够实现如图7所示方法实施例中第二设备实现的各个过程,且能够取得相同的有益效果,为避免重复,在此不再赘述。
本申请实施例中的AI网络模型交互装置可以是电子设备,例如具有操作系统的电子设备,也可以是电子设备中的部件,例如集成电路或芯片。该电子设备可以是终端,也可以为除终端之外的其他设备。示例性的,终端可以包括但不限于上述所列举的终端11的类型,其他设备可以为服务器、网络附属存储器(Network Attached Storage,NAS)等,本申请实施例不作具体限定。
本申请实施例提供的AI网络模型交互装置能够实现图2或图7所示方法实施例实现的各个过程,并达到相同的技术效果,为避免重复,这里不再赘述。
可选的,如图10所示,本申请实施例还提供一种通信设备1000,包括处理器1001和存储器1002,存储器1002上存储有可在所述处理器1001上运行的程序或指令,例如,该通信设备1000作为第一设备时,该程序或指令被处理器1001执行时实现如图2所示方法实施例的各个步骤,且能达到相同的技术效果。该通信设备1000作为第二设备时,该程序或指令被处理器1001执行时实现如图7所示方法实施例的各个步骤,且能达到相同的技术效果,为避免重复,这里不再赘述。
本申请实施例还提供一种通信设备,包括处理器和通信接口,在该通信设备作为第一设备时,通信接口用于向第二设备发送第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;所述通信接口或者所述处理器用于获取目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应。在该通信设备作为第二设备时,通信接口用于接收来自第一设备的第一信息,以及向所述第一设备发送目标AI网络模型的相关信息或者根据所述第一信息发送第一AI网络模型的相关信息,其中,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息,所述目标AI网络模型与所述第一信息对应,所述第一AI网络模型用于进行压缩处理得到第二AI网络模型,所述第二AI网络模型与所述第一信息对应。
该终端实施例与如图2或图7所示方法实施例对应,图2或图7所示方法实施例的各个实施过程和实现方式均可适用于该通信设备实施例中,且能达到相同的技术效果。
本申请实施例还提供一种可读存储介质,所述可读存储介质上存储有程序或指令,该程序或指令被处理器执行时实现如图2或图7所示方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
其中,所述处理器为上述实施例中所述的终端中的处理器。所述可读存储介质,包括计算机可读存储介质,如计算机只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等。
本申请实施例另提供了一种芯片,所述芯片包括处理器和通信接口,所述通信接口和所述处理器耦合,所述处理器用于运行程序或指令,实现如图2或图7所示方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
应理解,本申请实施例提到的芯片还可以称为系统级芯片,系统芯片,芯片系统或片上系统芯片等。
本申请实施例另提供了一种计算机程序/程序产品,所述计算机程序/程序产品被存储在存储介质中,所述计算机程序/程序产品被至少一个处理器执行以实现如图2或图7所示方法实施例的各个过程,且能达到相同的技术效果,为避免重复,这里不再赘述。
本申请实施例还提供了一种通信系统,包括:第一设备及第二设备,所述第一设备可用于执行如图2所示的AI网络模型交互方法的步骤,所述第二设备可用于执行如图7所示的AI网络模型交互方法的步骤。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者装置所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者装置中还存在另外的相同要素。此外,需要指出的是,本申请实施方式中的方法和装置的范围不限于按示出或讨论的顺序来执行功能,还可包括根据所涉及的功能按基本同时的方式或按相反的顺序来执行功能,例如,可以按不同于所描述的次序来执行所描述的方法,并且还可以添加、省去、或组合各种步骤。另外,参照某些示例所描述的特征可在其他示例中被组合。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对相关技术做出贡献的部分可以以计算机软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本申请各个实施例所述的方法。
上面结合附图对本申请的实施例进行了描述,但是本申请并不局限于上述的具体实施方式,上述的具体实施方式仅仅是示意性的,而不是限制性的,本领域的普通技术人员在本申请的启示下,在不脱离本申请宗旨和权利要求所保护的范围的情况下,还可做出很多形式,这些均属于本申请的保护之内。
Claims (27)
- 一种人工智能AI网络模型交互方法,包括:第一设备向第二设备发送第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;所述第一设备获取目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应。
- 根据权利要求1所述的方法,其中,所述第一设备获取目标AI网络模型的相关信息,包括:所述第一设备接收目标AI网络模型的相关信息,所述目标AI网络模型为压缩后的AI网络模型或未压缩的AI网络模型;或者,所述第一设备接收来自所述第二设备的第一AI网络模型的相关信息,并对所述第一AI网络模型进行压缩处理,得到第二AI网络模型,所述目标AI网络模型包括所述第二AI网络模型。
- 根据权利要求1所述的方法,其中,所述第一信息包括以下至少一项:第一能力信息,所述第一能力信息指示所述第一设备对AI网络模型的压缩能力和/或所述第一设备支持的AI网络模型压缩方法;第一要求信息,所述第一要求信息指示所述第一设备需要的AI网络模型的尺寸信息;第一应用信息,所述第一应用信息指示所述第一设备需要的AI网络模型的功能信息;第二信息,所述第二信息包括所述第一设备的资源使用率相关的信息;第一指示信息,所述第一指示信息指示对AI网络模型进行压缩的设备。
- 根据权利要求3所述的方法,其中,所述第一指示信息指示所述第一设备、所述第二设备或第三设备对AI网络模型进行压缩。
- 根据权利要求4所述的方法,其中,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第一设备对AI网络模型进行压缩的情况下,所述第一设备获取目标AI网络模型的相关信息,包括:所述第一设备接收来自所述第二设备的第一AI网络模型的相关信息和第三信息,其中,所述第三信息用于将所述第一AI网络模型压缩处理成第二AI网络模型;所述第一设备根据所述第三信息对所述第一AI网络模型进行压缩处理,得到所述第二AI网络模型,其中,所述目标AI网络模型包括所述第二AI网络模型;和/或,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第二设备对AI网络模型进行压缩的情况下,所述第一设备获取目标AI网络模型的相关信息,包括:所述第一设备接收来自所述第二设备的第二AI网络模型的相关信息,其中,所述目标AI网络模型包括所述第二AI网络模型,所述第二AI网络模型为根据所述第一信息对所述第二设备具有的第一AI网络模型进行压缩处理后得到的AI网络模型;和/或,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第三设备对AI网络模型进行压缩的情况下,所述第一设备获取目标AI网络模型的相关信息,包括:所述第一设备接收来自所述第三设备的第二AI网络模型的相关信息,其中,所述目标AI网络模型包括所述第二AI网络模型,所述第二AI网络模型为对来自所述第二设备的第一AI网络模型进行压缩处理后得到的AI网络模型。
- 根据权利要求5所述的方法,其中,所述第三信息包括以下至少一项:对所述第一AI网络模型进行压缩处理时采用的AI网络模型压缩方法和AI网络模型压缩相关的限制信息。
- 根据权利要求5所述的方法,其中,在所述第一设备根据所述第三信息对所述第一AI网络模型进行压缩处理,得到所述第二AI网络模型之后,所述方法还包括:所述第一设备向所述第二设备发送所述第二AI网络模型的相关信息。
- 根据权利要求3所述的方法,其中,在所述第二设备具有与所述第一信息匹配的AI网络模型的情况下,所述第一设备获取目标AI网络模型的相关信息,包括:所述第一设备接收来自所述第二设备的与所述第一信息匹配的AI网络模型的相关信息,其中,所述目标AI网络模型包括所述与所述第一信息匹配的AI网络模型。
- 根据权利要求5所述的方法,其中,在所述第一设备获取到由所述第一设备或第三设备压缩得到的第二AI网络模型的情况下,所述方法还包括:所述第一设备获取所述第二AI网络模型与所述第一信息的匹配结果;所述第一设备向所述第二设备发送所述匹配结果。
- 根据权利要求9所述的方法,其中,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述方法还包括:所述第一设备向所述第二设备发送第一请求信息,所述第一请求信息用于请求所述第二设备更新所述第三信息和所述第一AI网络模型中的至少一项。
- 根据权利要求9所述的方法,其中,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述方法还包括:所述第一设备更新所述第一信息,并向所述第二设备发送更新后的第一信息;所述第一设备获取与所述更新后的第一信息对应的目标AI网络模型。
- 一种人工智能AI网络模型交互方法,包括:第二设备接收来自第一设备的第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;所述第二设备向所述第一设备发送目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应,或者,所述第二设备根据所述第一信息发送第一AI网络模型的相关信息,其中,所述第一AI网络模型用于进行压缩处理得到第二AI网络模型,所述第二AI网络模型与所述第一信息对应。
- 根据权利要求12所述的方法,其中,所述目标AI网络模型为压缩后的AI网络模型或未压缩的AI网络模型。
- 根据权利要求12所述的方法,其中,所述第一信息包括以下至少一项:第一能力信息,所述第一能力信息指示所述第一设备对AI网络模型的压缩能力和/或所述第一设备支持的AI网络模型压缩方法;第一要求信息,所述第一要求信息指示所述第一设备需要的AI网络模型的尺寸信息;第一应用信息,所述第一应用信息指示所述第一设备需要的AI网络模型的功能信息;第二信息,所述第二信息包括所述第一设备的资源使用率相关的信息;第一指示信息,所述第一指示信息指示对AI网络模型进行压缩的设备。
- 根据权利要求14所述的方法,其中,所述第一指示信息指示所述第一设备、所述第二设备或第三设备对AI网络模型进行压缩。
- 根据权利要求14所述的方法,其中,所述第二设备发送目标AI网络模型的相关信息,包括:所述第二设备在具有与所述第一信息匹配的AI网络模型的情况下,向所述第一设备发送所述与所述第一信息匹配的AI网络模型的相关信息,所述目标AI网络模型包括所述与所述第一信息匹配的AI网络模型。
- 根据权利要求15所述的方法,其中,在所述第二设备不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第二设备对AI网络模型进行压缩的情况下,所述方法还包括:所述第二设备根据所述第一信息对第一AI网络模型进行压缩处理,得到第二AI网络模型;所述第二设备发送目标AI网络模型的相关信息,包括:所述第二设备向所述第一设备发送所述第二AI网络模型的相关信息,所述目标AI网络模型包括所述第二AI网络模型。
- 根据权利要求15所述的方法,其中,所述第二设备根据所述第一信息发送第一AI网络模型的相关信息,包括:所述第二设备在不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第一设备对AI网络模型进行压缩的情况下,向所述第一设备发送第一AI网络模型的相关信息和第三信息,其中,所述第三信息用于将所述第一AI网络模型压缩处理成第二AI网络模型,所述第二AI网络模型与所述第一信息对应;和/或,所述第二设备在不具有与所述第一信息匹配的AI网络模型,且所述第一指示信息指示所述第三设备对AI网络模型进行压缩的情况下,向所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息。
- 根据权利要求18所述的方法,其中,所述第三信息包括以下至少一项:对所述第一AI网络模型进行压缩时使用的AI网络模型压缩方法和AI网络模型压缩相关的限制信息。
- 根据权利要求18所述的方法,其中,在所述第二设备向所述第一设备或所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息之后,所述方法还包括:所述第二设备接收所述第二AI网络模型的相关信息。
- 根据权利要求18所述的方法,其中,在所述第二设备向所述第一设备或所述第三设备发送所述第一AI网络模型的相关信息和所述第三信息之后,所述方法还包括:所述第二设备接收来自所述第一设备的判断结果,所述判断结果用于表示所述第二AI网络模型与所述第一信息的匹配结果。
- 根据权利要求21所述的方法,其中,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述方法还包括:所述第二设备接收来自所述第一设备的第一请求信息,并根据所述第一请求信息更新所述第三信息和所述第一AI网络模型中的至少一项;所述第二设备发送更新后的所述第三信息和/或更新后的所述第一AI网络模型的相关信息。
- 根据权利要求21所述的方法,其中,在所述匹配结果表示所述第二AI网络模型与所述第一信息不匹配的情况下,所述方法还包括:所述第二设备接收来自所述第一设备的更新后的第一信息;所述第二设备向所述第一设备发送与所述更新后的第一信息匹配的目标AI网络模型的相关信息,或者,所述第二设备根据所述更新后的第一信息发送第三AI网络模型的相关信息,所述第三AI网络模型用于进行压缩处理得到与所述更新后的第一信息对应的第四AI网络模型。
- 一种人工智能AI网络模型交互装置,应用于第一设备,所述装置包括:第一发送模块,用于向第二设备发送第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;第一获取模块,用于获取目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应。
- 一种人工智能AI网络模型交互装置,应用于第二设备,所述装置包括:第一接收模块,用于接收来自第一设备的第一信息,所述第一信息包括所述第一设备需要的AI网络模型的压缩和/或模型推理相关的信息;第二发送模块,用于向所述第一设备发送目标AI网络模型的相关信息,所述目标AI网络模型与所述第一信息对应,或者,根据所述第一信息发送第一AI网络模型的相关信息,其中,所述第一AI网络模型用于进行压缩处理得到第二AI网络模型,所述第二AI网络模型与所述第一信息对应。
- 一种通信设备,包括处理器和存储器,所述存储器存储可在所述处理器上运行的程序或指令,所述程序或指令被所述处理器执行时实现如权利要求1至23中任一项所述的人工智能AI网络模型交互方法的步骤。
- 一种可读存储介质,所述可读存储介质上存储程序或指令,所述程序或指令被处理器执行时实现如权利要求1至23中任一项所述的人工智能AI网络模型交互方法的步骤。
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210822781.7 | 2022-07-12 | ||
CN202210822781.7A CN117439958A (zh) | 2022-07-12 | 2022-07-12 | 一种ai网络模型交互方法、装置和通信设备 |
Publications (1)
Publication Number | Publication Date
---|---
WO2024012303A1 (zh) | 2024-01-18
Family
ID=89535522
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/105408 WO2024012303A1 (zh) | 2022-07-12 | 2023-06-30 | 一种ai网络模型交互方法、装置和通信设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117439958A (zh) |
WO (1) | WO2024012303A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200364574A1 (en) * | 2019-05-16 | 2020-11-19 | Samsung Electronics Co., Ltd. | Neural network model apparatus and compressing method of neural network model |
CN113472830A (zh) * | 2020-03-31 | 2021-10-01 | 华为技术有限公司 | 一种通信方法及装置 |
CN114418086A (zh) * | 2021-12-02 | 2022-04-29 | 北京百度网讯科技有限公司 | 压缩神经网络模型的方法、装置 |
CN114706518A (zh) * | 2022-03-30 | 2022-07-05 | 深存科技(无锡)有限公司 | Ai模型推理方法 |
- 2022-07-12: CN application CN202210822781.7A, published as CN117439958A (status: active, pending)
- 2023-06-30: PCT application PCT/CN2023/105408, published as WO2024012303A1
Also Published As
Publication number | Publication date |
---|---|
CN117439958A (zh) | 2024-01-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23838795; Country of ref document: EP; Kind code of ref document: A1