WO2024041117A1 - A method for dividing computing tasks and related devices - Google Patents

A method for dividing computing tasks and related devices

Info

Publication number
WO2024041117A1
WO2024041117A1 PCT/CN2023/100037 CN2023100037W WO2024041117A1 WO 2024041117 A1 WO2024041117 A1 WO 2024041117A1 CN 2023100037 W CN2023100037 W CN 2023100037W WO 2024041117 A1 WO2024041117 A1 WO 2024041117A1
Authority
WO
WIPO (PCT)
Prior art keywords
amount
terminal device
calculation
computing
point
Prior art date
Application number
PCT/CN2023/100037
Other languages
English (en)
French (fr)
Inventor
曹佑龙
秦熠
陈二凯
徐瑞
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2024041117A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network

Definitions

  • the present application relates to the field of communications, and more specifically, to a method for dividing computing tasks and related devices.
  • AI artificial intelligence
  • the XR terminal can transmit images or videos upstream to the server.
  • the server can use neural network models such as deep neural network (DNN) models to detect and identify objects in the images or videos.
  • DNN deep neural network
  • the XR terminal can preprocess the images or videos to be uploaded and upload the preprocessed data to the server.
  • the neural network model includes multiple neural network layers. If it is preprocessed by the XR terminal, the computing tasks corresponding to the neural network model need to be divided.
  • This application provides a computing task segmentation method and related devices, in order to reasonably segment the computing tasks of the neural network model.
  • this application provides a method for dividing computing tasks, which can be applied to wireless access network equipment, such as base stations, access points, etc.
  • The method can be executed by the wireless access network device, by a component (such as a chip, chip system, or processor) configured in the wireless access network device, or by a logic module or software that implements all or part of the functions of the wireless access network device. This application does not limit this.
  • The method includes: obtaining the amount of transmission data and the amount of calculation corresponding to a first computing task of the terminal device, where the first computing task is obtained by dividing the computing task of the neural network model based on a dividing point; determining, based on the amount of transmission data, the amount of calculation, and the channel status between the terminal device and the wireless access network device, that the dividing point is the target dividing point; and sending indication information to the terminal device, the indication information indicating the target dividing point.
  • computing tasks can be defined based on business. Different businesses have different computing tasks. Computational tasks can be performed through neural network models.
  • Split points are used to split the computing tasks of the neural network model.
  • One split point can split the computing tasks of the neural network model into two computing tasks.
  • A split point splitting the computing task of the neural network model can be understood as follows: since the neural network model includes multiple neural network layers, the split point can be located between any two adjacent neural network layers. The split point acts as a boundary that divides the multiple neural network layers into two parts, each part including one or more neural network layers.
  • the computing tasks corresponding to one or more neural network layers included in each part are two computing tasks obtained by dividing the computing tasks of the neural network model.
  • the computing task of the neural network model can also be divided into more computing tasks, for example, it can be divided using two or more dividing points.
  • the specific process is as described above and will not be described again.
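  • The splitting described above can be sketched as follows. This is a minimal illustration, assuming the model is represented as an ordered list of layer names and the split point is an index between two adjacent layers; the function name `split_layers` and the layer names are hypothetical and do not come from the application.

```python
from typing import List, Tuple


def split_layers(layers: List[str], split_point: int) -> Tuple[List[str], List[str]]:
    """Split an ordered list of layers into two parts at `split_point`.

    A split point k means the cut lies between layer k-1 and layer k,
    so each part keeps at least one layer when 1 <= k <= len(layers) - 1.
    """
    if not 1 <= split_point <= len(layers) - 1:
        raise ValueError("split point must lie between two adjacent layers")
    return layers[:split_point], layers[split_point:]


layers = ["conv1", "conv2", "pool", "fc1", "fc2"]  # hypothetical layer names
first_task, second_task = split_layers(layers, 2)
print(first_task)   # ['conv1', 'conv2']
print(second_task)  # ['pool', 'fc1', 'fc2']
```

  • Moving the split point changes which layers fall into each part, which is why the two resulting computing tasks change with the split point position.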
  • the split point can be used to split the computing task of the neural network model into two computing tasks, so that the two computing tasks are assigned to different devices for execution. As the position of the split point changes, the two computing tasks obtained by the split also change. In order to find reasonable split points, the neural network model can be split from different positions to meet different needs.
  • the dividing point determined to divide the computing tasks of a certain business is recorded as the target dividing point, and it is assumed that the dividing point corresponding to the first computing task is the target dividing point.
  • the two computing tasks obtained after dividing the computing tasks of the neural network model based on the target segmentation point are recorded as the first computing task and the second computing task respectively.
  • the first computing task is assigned to the terminal device.
  • the target dividing point may be determined based on the amount of transmission data corresponding to the first computing task, the amount of calculation, and the channel status between the terminal device and the radio access network device.
  • The amount of transmission data may refer to the size of the data that the terminal device obtains by performing a certain computing task (such as the first computing task) and needs to transmit to another device (such as the first device described below). It can be described in units such as bits or bytes.
  • the amount of calculation may refer to the number of floating-point operations required by the terminal device to perform a certain computing task (such as the first computing task), and may be described by parameters such as the number of floating-point operations.
  • The channel status between the terminal device and the wireless access network device can be characterized by parameters such as the signal to interference plus noise ratio (SINR), reference signal received power (RSRP), and channel quality indicator (CQI). Based on this channel status, the transmission rate of data transmitted over the channel can be determined.
  • SINR signal to interference plus noise ratio
  • RSRP reference signal received power
  • CQI channel quality indicator
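  • The application does not specify how a transmission rate is derived from the channel status. As one hedged illustration only, the Shannon capacity formula gives an upper bound on the achievable rate from the bandwidth and the SINR; the function name `estimate_rate_bps` and the use of this formula are assumptions, not the method of the application.

```python
import math


def estimate_rate_bps(bandwidth_hz: float, sinr_db: float) -> float:
    """Upper-bound rate estimate from the Shannon capacity formula (an assumption).

    The SINR is given in dB and converted to a linear ratio first.
    """
    sinr_linear = 10 ** (sinr_db / 10)
    return bandwidth_hz * math.log2(1 + sinr_linear)


# Hypothetical example: 1 MHz of bandwidth at 0 dB SINR.
print(estimate_rate_bps(1e6, 0.0))  # 1000000.0
```

  • A practical system would more likely map a reported CQI to a modulation and coding scheme, so the value above should be read as an optimistic bound.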
  • Determining the target split point based on the amount of transmission data and the amount of calculation corresponding to the first computing task, and on the channel status between the terminal device and the radio access network device, does not mean that the target split point is determined based only on the amount of transmission data and the amount of calculation corresponding to the first computing task and on the channel status.
  • The wireless access network device can obtain the amount of transmission data and the amount of calculation corresponding to the computing tasks of the terminal device obtained by splitting at split points at different positions, and use them together with the channel status to determine the target split point.
  • It can be understood that if the split point that yields the first computing task is the target split point, then the determination of the target split point depends on factors such as the amount of transmission data, the amount of calculation, and the channel status corresponding to the first computing task.
  • Since the wireless access network device determines the target split point based on the amount of transmission data and the amount of calculation corresponding to the first computing task of the terminal device, and on the channel status between the terminal device and the wireless access network device, it can analyze the problem from multiple angles, such as the power consumption of transmission and calculation and the delay of transmission and calculation, and thus reasonably determine the target split point according to different needs. Because the wireless access network device can obtain the channel status between the terminal device and itself in real time, and can perceive changes in the channel status at the millisecond level, it can adjust the target split point in a more timely and effective manner as the channel status changes.
  • Using this solution can reduce transmission delay and improve transmission reliability. For example, in XR services it can reduce stuttering, and in autonomous driving and telemedicine services it can improve safety by reducing delay and improving transmission reliability.
  • using this method can save the power consumption of the terminal equipment.
  • Wireless access network equipment can comprehensively obtain the conditions of the terminal devices within its coverage, such as the interference between multiple terminal devices in the same cell and the bandwidth required by multiple terminal devices in the same cell to transmit services at the same time. Wireless access network equipment can control the power consumption of terminal devices for data transmission, so it can also reduce power consumption and interference through reasonable selection of target split points, meeting the needs of services with higher power consumption requirements. It can also consider the total bandwidth in the cell and adjust the transmission power consumption of the terminal devices according to the rate requirement of the model split point, to achieve system-level optimization of service transmission.
  • the target segmentation point can be reasonably selected to meet different needs and improve user experience.
  • the following exemplarily provides several possible implementation methods for obtaining the transmission data amount and calculation amount corresponding to the terminal device and the first computing task.
  • In one possible implementation, obtaining the transmission data amount and calculation amount corresponding to the first computing task of the terminal device includes: receiving first information from the terminal device, the first information indicating the transmission data amount and the calculation amount.
  • the radio access network device can directly obtain the transmission data amount and calculation amount corresponding to the first computing task from the terminal device.
  • In another possible implementation, obtaining the transmission data amount and calculation amount corresponding to the first computing task of the terminal device includes: receiving second information from the first device, the second information indicating the transmission data amount and the calculation amount, where the first device is another terminal device or a server.
  • the first device may be a device used to perform the above-mentioned second computing task.
  • the first device may be different devices.
  • the first device may be a server.
  • the first device may be another terminal device.
  • The radio access network device may also obtain the transmission data amount and calculation amount corresponding to the first computing task from the first device. Since the first device is used to perform the second computing task, it can pre-configure or build the neural network model, so it can also learn which neural network layers correspond to the first computing task, and therefore the transmission data amount and calculation amount corresponding to the first computing task.
  • In yet another possible implementation, obtaining the transmission data amount and calculation amount corresponding to the first computing task of the terminal device includes: receiving third information from the first device, the third information indicating the transmission data amount and calculation amount corresponding to the second computing task of the first device, where the first device is another terminal device or a server; and determining, based on the third information, the transmission data amount and calculation amount corresponding to the first computing task, where the first computing task and the second computing task are obtained by dividing the computing task of the neural network model based on the dividing point.
  • the radio access network device may also obtain the transmission data amount and calculation amount corresponding to the second computing task from the first device. Since the first device is used to perform the second computing task, it can pre-configure or build the neural network model, so it can also know the amount of transmitted data and calculation amount corresponding to the second computing task. The radio access network device may infer the amount of transmission data and the amount of calculation corresponding to the first calculation task based on the neural network model and the amount of transmission data and the amount of calculation corresponding to the second calculation task received from the first device.
  • The wireless access network device can determine the target split point by obtaining the amount of transmission data and the amount of calculation corresponding to the computing tasks of the terminal device obtained by splitting at split points at different positions, together with the channel status.
  • Based on the three possible implementations provided above, the wireless access network device can obtain the amount of transmission data and the amount of calculation corresponding to the computing tasks of the terminal device obtained by splitting at split points at different positions.
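  • As a hedged sketch of how a device that knows the neural network model could tabulate these amounts for every candidate split point, the code below assumes each layer has a known operation count and output size. The `Layer` type, its fields, and the rule that the transmitted data is the output of the last layer before the split are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Layer:
    name: str
    flop: float         # floating-point operations needed to run this layer
    output_bits: float  # size of this layer's output in bits


def first_task_profile(layers: List[Layer]) -> Dict[int, Tuple[float, float]]:
    """Map each split point k to (calculation amount, transmission data amount).

    Split point k places layers[0:k] in the first computing task. The
    calculation amount is the sum of their operation counts; the transmission
    data amount is assumed to be the output size of layer k-1.
    """
    profile: Dict[int, Tuple[float, float]] = {}
    calc = 0.0
    for k in range(1, len(layers)):
        calc += layers[k - 1].flop
        profile[k] = (calc, layers[k - 1].output_bits)
    return profile


# Hypothetical three-layer model.
model = [Layer("a", 10, 100), Layer("b", 20, 50), Layer("c", 30, 10)]
print(first_task_profile(model))  # {1: (10.0, 100), 2: (30.0, 50)}
```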
  • In one possible implementation, the target split point is determined based on at least one of delay or power consumption, where the delay is the time required for the terminal device to perform the first computing task, and the power consumption is the power consumption required by the terminal device to perform the first computing task.
  • the delay includes calculation delay and transmission delay.
  • the calculation delay is the time required for the terminal device to complete the calculation amount of the first computing task
  • the transmission delay is the time required for the terminal device to transmit the data obtained by the first computing task.
  • the calculation delay may be determined based on the calculation amount of the first calculation task and the calculation capability of the terminal device.
  • The transmission delay may be determined based on the amount of data transmitted by the first computing task and the transmission rate determined by the channel status between the terminal device and the radio access network device.
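  • The delay decomposition above can be sketched as follows. The simple ratios used here (operations divided by operations per second, bits divided by bits per second) are an assumption about how the quantities combine; the function names and numbers are hypothetical.

```python
def computing_delay_s(calc_flop: float, capability_flops: float) -> float:
    """Time to finish `calc_flop` operations at `capability_flops` operations per second."""
    if capability_flops <= 0:
        raise ValueError("computing capability must be positive")
    return calc_flop / capability_flops


def transmission_delay_s(data_bits: float, rate_bps: float) -> float:
    """Time to send `data_bits` bits at `rate_bps` bits per second."""
    if rate_bps <= 0:
        raise ValueError("transmission rate must be positive")
    return data_bits / rate_bps


def total_delay_s(calc_flop: float, capability_flops: float,
                  data_bits: float, rate_bps: float) -> float:
    """Delay of the first computing task: computing delay plus transmission delay."""
    return (computing_delay_s(calc_flop, capability_flops)
            + transmission_delay_s(data_bits, rate_bps))


# Hypothetical numbers: 2 GFLOP of work on a 1 GFLOPS device, 8 Mbit sent at 4 Mbit/s.
print(total_delay_s(2e9, 1e9, 8e6, 4e6))  # 4.0
```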
  • the power consumption required by the terminal device to perform the first computing task includes: computing power consumption and transmission power consumption required by the terminal device to perform the first computing task.
  • the computing power consumption may be determined according to the calculation amount of the first computing task, for example, proportional to the calculation amount.
  • the transmission power consumption may be determined based on the amount of transmission data of the first computing task and the channel status between the terminal device and the radio access network device.
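  • One way to write the power consumption relationship above, as a sketch under stated assumptions: let $C$ be the calculation amount, $D$ the transmission data amount, and $R$ the transmission rate determined by the channel status. The constant $\kappa$ (energy per operation) and the transmit power $P_{\text{tx}}$ are symbols introduced here for illustration. The text only states that computing power consumption may be proportional to the calculation amount; the product form for the transmission term is an assumption.

```latex
E_{\text{total}} = E_{\text{comp}} + E_{\text{tx}}, \qquad
E_{\text{comp}} = \kappa \, C, \qquad
E_{\text{tx}} = P_{\text{tx}} \cdot \frac{D}{R}
```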
  • The radio access network device can determine the target split point based on the delay, so that the delay incurred when the two computing tasks obtained by splitting at the target split point are assigned to the terminal device and the first device respectively for execution satisfies a preset condition.
  • The radio access network device may also determine the target split point based on power consumption, so that the power consumption incurred when the two computing tasks obtained by splitting at the target split point are assigned to the terminal device and the first device respectively for execution satisfies another preset condition.
  • The radio access network device can also determine the target split point by considering both delay and power consumption, so that the delay and power consumption incurred when the two computing tasks obtained by splitting at the target split point are assigned to the terminal device and the first device respectively for execution satisfy yet another preset condition. Whether the radio access network device determines the target split point based on delay, on power consumption, or on both can be decided according to actual needs.
  • In one possible implementation, the target split point is determined based on the delay; and the determining, based on the transmission data amount, the calculation amount, and the channel status between the terminal device and the radio access network device, that the split point is the target split point includes: determining the delay based on the computing power information of the terminal device, the transmission data amount, the calculation amount, and the channel status; and determining, based on the delay, that the split point is the target split point.
  • the delay includes calculation delay and transmission delay.
  • the calculation delay can be determined based on the computing power information and calculation amount of the terminal device, and the transmission delay can be determined based on the amount of transmitted data and channel status. Therefore, the radio access network device can determine the delay based on the computing power information of the terminal device, the amount of transmitted data, the amount of calculation, and the channel status, and then determine the target split point based on the delay.
  • One possible way to determine the target split point based on delay is as follows: if the delay of the computing task (such as the first computing task) determined based on a certain split point is lower than a preset threshold (recorded as the first preset threshold for ease of distinction), the split point corresponding to that computing task is determined as the target split point.
  • Another possible way to determine the target split point based on delay is to split the computing task of the neural network model at different split points, obtain the computing tasks corresponding to the different split points, and determine the split point corresponding to the computing task with the lowest delay (such as the first computing task) as the target split point.
  • Determining the target split point based on delay can reduce transmission delay and improve transmission reliability. For example, in XR services it can reduce stuttering, and in autonomous driving and telemedicine services it can improve safety by reducing delay and improving transmission reliability.
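  • The two delay-based selection strategies described above can be sketched as follows, assuming the delay for each candidate split point has already been computed. The function names are hypothetical, and examining candidates in ascending order of split point in the threshold strategy is an assumption, since the text does not fix an order.

```python
from typing import Dict, Optional


def pick_first_below_threshold(delays: Dict[int, float], threshold: float) -> Optional[int]:
    """Return the first split point (in ascending order) whose delay is below `threshold`.

    Returns None when no candidate satisfies the threshold.
    """
    for split_point in sorted(delays):
        if delays[split_point] < threshold:
            return split_point
    return None


def pick_lowest_delay(delays: Dict[int, float]) -> int:
    """Return the split point whose computing task has the lowest delay."""
    return min(delays, key=lambda k: delays[k])


# Hypothetical delays in seconds for three candidate split points.
delays = {1: 0.030, 2: 0.012, 3: 0.018}
print(pick_first_below_threshold(delays, 0.020))  # 2
print(pick_lowest_delay(delays))                  # 2
```

  • The threshold strategy can stop at the first acceptable candidate, while the minimum strategy has to evaluate every candidate split point.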
  • In another possible implementation, the target split point is determined based on the power consumption; and the determining, based on the transmission data amount, the calculation amount, and the channel status between the terminal device and the radio access network device, that the split point is the target split point includes: determining the power consumption based on the transmission data amount, the calculation amount, and the channel status; and determining, based on the power consumption, that the split point is the target split point.
  • power consumption includes computing power consumption and transmission power consumption.
  • the computing power consumption can be determined based on the amount of calculation, and the transmission power consumption can be determined based on the amount of transmitted data and channel status. Therefore, the radio access network device can determine the power consumption based on the transmission data amount, the calculation amount, and the channel status, and then determine the target split point based on the power consumption.
  • One possible way to determine the target split point based on power consumption is as follows: if the power consumption of the computing task (such as the first computing task) determined based on a certain split point is lower than a preset threshold (recorded as the second preset threshold for ease of distinction), the split point corresponding to that computing task is determined as the target split point.
  • Another possible way to determine the target split point based on power consumption is to split the computing task of the neural network model at different split points, obtain the computing tasks corresponding to the different split points, and determine the split point corresponding to the computing task with the lowest power consumption (such as the first computing task) as the target split point.
  • determining the target split point based on power consumption can save the power consumption of the terminal equipment.
  • wireless access network equipment can reduce mutual interference between multiple terminal equipment by controlling the transmission power consumption of terminal equipment, and can further achieve system-level service transmission optimization based on the total bandwidth in the cell.
  • In yet another possible implementation, the target split point is determined based on the delay and the power consumption; and the determining, based on the amount of transmission data, the amount of calculation, and the channel status between the terminal device and the radio access network device, that the split point is the target split point includes: determining the delay and the power consumption based on the computing power information of the terminal device, the amount of transmission data, the amount of calculation, and the channel status; and determining, based on the delay and the power consumption, that the split point is the target split point.
  • the radio access network device can determine the target split point based on the weighted sum of delay and power consumption.
  • One possible way to determine the target split point based on the weighted sum of delay and power consumption is as follows: if the weighted sum of delay and power consumption of the computing task (such as the first computing task) determined based on a certain split point is lower than a preset threshold (recorded as the third preset threshold for ease of distinction), the split point corresponding to that computing task is determined as the target split point.
  • Another possible way to determine the target split point based on the weighted sum of delay and power consumption is to split the computing task of the neural network model at different split points, obtain the computing tasks corresponding to the different split points, and determine the split point corresponding to the computing task with the lowest weighted sum of delay and power consumption (such as the first computing task) as the target split point.
  • Wireless access network equipment can take into account both delay and power consumption, and apply different weights to delay and power consumption to determine target split points to meet different needs and improve user experience.
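  • A minimal sketch of the weighted-sum selection follows, assuming the delay and power consumption for each candidate split point are already known. The weights, the function names, and the numbers are hypothetical; how delay and power consumption are scaled against each other is left to the weights and is not specified by the text.

```python
from typing import Dict, Tuple


def weighted_cost(delay_s: float, energy_j: float,
                  w_delay: float, w_energy: float) -> float:
    """Weighted sum of delay and power consumption for one candidate split point."""
    return w_delay * delay_s + w_energy * energy_j


def pick_by_weighted_sum(candidates: Dict[int, Tuple[float, float]],
                         w_delay: float, w_energy: float) -> int:
    """Return the split point with the lowest weighted sum.

    `candidates` maps a split point to a (delay, power consumption) pair.
    """
    return min(
        candidates,
        key=lambda k: weighted_cost(candidates[k][0], candidates[k][1], w_delay, w_energy),
    )


# Hypothetical (delay in seconds, energy in joules) per split point.
candidates = {1: (0.030, 0.2), 2: (0.012, 0.9), 3: (0.018, 0.4)}
print(pick_by_weighted_sum(candidates, w_delay=1.0, w_energy=0.0))  # 2
print(pick_by_weighted_sum(candidates, w_delay=0.0, w_energy=1.0))  # 1
```

  • Setting one weight to zero reduces the rule to the delay-only or power-only selection, which shows how different weights serve different needs.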
  • The radio access network device can further obtain the computing power information of the terminal device. This computing power information can be used to characterize the computing capability of the terminal device.
  • the computing power information includes at least one of the time required for the terminal device to complete the predefined test task or the computing power of the terminal device.
  • the predefined test tasks include: tasks performed based on at least one of a predefined test neural network model, a predefined calculation type, or predefined input data. That is to say, different terminal devices can be tested based on the same test task to obtain the time required for different terminal devices to complete the same test task, and then the computing capabilities of different terminal devices can be derived based on the time.
  • Computing power can also be characterized by the number of floating-point operations per second (FLOPS) of the terminal device.
  • FLOPS floating-point operations per second
  • the number of floating point operations per second is the peak number of floating point operations that can be performed per second.
  • the terminal device can report the number of floating point operations per second to the wireless access network device.
  • each terminal device can be tested based on the same test task, and then the computing capabilities of different terminal devices can be distinguished by the time it takes to complete the test task, so as to facilitate understanding of the computing capabilities of different terminal devices.
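  • As a hedged sketch of deriving computing capability from the completion time of a predefined test task, the code below assumes the operation count of the test task is known in advance. The function name `effective_flops` and the numbers are hypothetical.

```python
def effective_flops(test_task_flop: float, completion_time_s: float) -> float:
    """Estimate operations per second from the time taken to finish a known test task."""
    if completion_time_s <= 0:
        raise ValueError("completion time must be positive")
    return test_task_flop / completion_time_s


# Hypothetical: two terminal devices run the same 4 GFLOP test task.
print(effective_flops(4e9, 2.0))  # 2000000000.0
print(effective_flops(4e9, 0.5))  # 8000000000.0
```

  • Because every device runs the same test task, a shorter completion time maps directly to a higher estimated capability.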
  • One possible way for the terminal device to report the number of floating point operations per second is to report the number directly; another possible way is to report information that identifies the number of floating point operations per second.
  • the information identifying the number of floating point operations per second may be a capability level. Different floating-point operations per second can correspond to different capability levels.
  • The correspondence between the number of floating point operations per second and the capability level can be predefined, for example, by a protocol.
  • the terminal device can report the capability level corresponding to the number of floating point operations per second to the wireless access network device according to the corresponding relationship.
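  • The mapping from floating point operations per second to a capability level can be sketched as a lookup over predefined boundaries. The boundary values and level numbering below are purely hypothetical; the text only says the correspondence can be predefined.

```python
import bisect

# Hypothetical lower bounds (in FLOPS) of capability levels 0, 1, 2, and 3.
LEVEL_LOWER_BOUNDS_FLOPS = [0.0, 1e11, 1e12, 1e13]


def capability_level(flops: float) -> int:
    """Return the capability level whose range contains `flops`."""
    if flops < 0:
        raise ValueError("FLOPS must be non-negative")
    return bisect.bisect_right(LEVEL_LOWER_BOUNDS_FLOPS, flops) - 1


print(capability_level(5e11))  # 1
print(capability_level(1e12))  # 2
```

  • Reporting a small level index instead of the raw number would keep the report compact, at the cost of precision in the delay estimate.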
  • Defining different kinds of computing power information makes it easier for the wireless access network device to understand the computing capability of the terminal device more comprehensively, which helps it reasonably determine the target split point.
  • the method further includes: receiving computing power information from the terminal device.
  • the wireless access network device can accurately estimate the computing delay based on the computing power of the terminal device, which is beneficial to reasonably determining the target split point.
  • this application provides a communication device that can implement the method in the above-mentioned first aspect or any possible implementation manner of the first aspect.
  • the device includes a module or unit for implementing the method in the first aspect or any possible implementation of the first aspect.
  • the units or modules included in the device can be implemented by software and/or hardware.
  • The device may be, for example, a wireless access network device, or may be a chip, chip system, or processor that supports the wireless access network device in implementing the above method, or may be a logic module or software capable of realizing all or part of the functions of the wireless access network device.
  • this application provides a communication device, including a processor, which can be used to implement the method in the first aspect or any possible implementation of the first aspect through logic circuits or execution of code instructions.
  • the communication device further includes a communication interface, and the processor is coupled to the communication interface.
  • The communication interface is used to receive signals from other communication devices outside the device and transmit them to the processor, or to send signals from the processor to other communication devices outside the device.
  • the communication interface may be a transceiver, a circuit, a bus, a module, or other types of communication interfaces.
  • the communication device further includes a memory, and the processor is coupled to the memory.
  • the memory is used to store program instructions and data.
  • the communication device is a wireless access network device, or a chip, chip system, or processor configured in the wireless access network device.
  • the present application provides a computer-readable storage medium in which a computer program or instructions are stored.
  • When the computer program or instructions are run on a computer, the method in the first aspect or any possible implementation of the first aspect is executed.
  • the present application provides a computer program product.
  • The computer program product includes a computer program (which can also be called code or instructions). When the computer program is run, the method in the first aspect or any possible implementation of the first aspect is executed.
  • Figure 1 is a schematic diagram of a communication system suitable for the method provided by the embodiment of the present application.
  • Figure 2 is another schematic diagram of a communication system suitable for the method provided by the embodiment of the present application.
  • Figure 3 is another schematic diagram of a communication system suitable for the method provided by the embodiment of the present application.
  • Figure 4 is a schematic diagram of a neural network model provided by an embodiment of the present application.
  • Figure 5 is a schematic flow chart of a method for dividing computing tasks provided by an embodiment of the present application.
  • Figures 6 and 7 are schematic diagrams of dividing the computing tasks of the neural network model based on dividing points at different positions provided by the embodiment of the present application.
  • Figure 8 is a schematic flow chart of a method for dividing computing tasks provided by another embodiment of the present application.
  • Figures 9 and 10 are schematic block diagrams of a communication device provided by embodiments of the present application.
  • Figure 11 is a schematic structural diagram of a base station provided by an embodiment of the present application.
  • the method provided by this application can be applied to various communication systems, such as a long term evolution (LTE) system, an LTE frequency division duplex (FDD) system, an LTE time division duplex (TDD) system, a fifth generation (5G) mobile communication system, or new radio access technology (NR).
  • the 5G mobile communication system may include non-standalone (NSA) networking and/or standalone (SA) networking.
  • the technical solution provided by this application can also be applied to machine type communication (MTC), long term evolution-machine (LTE-M), and device-to-device (D2D).
  • M2M machine to machine
  • IoT Internet of things
  • the IoT network may include, for example, the Internet of Vehicles.
  • the communication methods in the Internet of Vehicles system are collectively called vehicle to X (V2X), where X can represent anything.
  • V2X can include vehicle to vehicle (V2V) communication, vehicle to infrastructure (V2I) communication, vehicle to pedestrian (V2P) communication, or vehicle to network (V2N) communication, etc.
  • the radio access network (RAN) device may be any device with wireless transceiver functions.
  • Radio access network equipment may be equipment that uses 3rd generation partnership project (3GPP) technology to access the network, including but not limited to: a base station, a Node B (NodeB or NB), an evolved Node B (eNB) in LTE, or a gNB or transmission reception point (TRP) in a 5G (such as NR) system.
  • 6G 6th generation
  • Wireless access network equipment can also be macro base stations, micro base stations, pico base stations, small stations, balloon stations, indoor stations, relay stations, wireless relay nodes, wireless backhaul nodes, etc.
  • the wireless access network device may also be a device that uses non-3GPP technology to access the network, including but not limited to an access point (AP) in a wireless fidelity (Wi-Fi) system.
  • All or part of the functions of the radio access network equipment in this application can also be implemented through software functions running on hardware, or through virtualization functions instantiated on a platform (such as a cloud platform). This application does not limit the specific form of the wireless access network equipment.
  • Core network equipment can be used to complete three major functions: registration, connection, and session management. It mainly includes the network exposure function (NEF) network element, policy control function (PCF) network element, application function (AF) network element, access and mobility management function (AMF) network element, session management function (SMF) network element, and user plane function (UPF) network element.
  • UPF is the interface of the data network, which can complete user plane data forwarding, session/flow level-based billing statistics, bandwidth limitation and other functions. User data can be accessed into the network through this network element.
  • NEF network elements can be used to open services and capabilities provided by 3GPP network functions to AF network elements, and also allow AF to provide information to 3GPP network functions.
  • the AF network element mainly transmits the requirements of the application side to the network side and can be regarded as the agent of the application server.
  • the SMF network element mainly performs session management, IP address allocation and management of user equipment, UPF selection, etc.
  • the PCF network element mainly performs policy control of billing strategies and quality of service (QoS) strategies.
  • the AMF network element mainly performs mobility management, access authentication/authorization and other functions.
  • the AMF network element can also be responsible for transmitting user policies between terminal equipment and PCF.
  • Network elements communicate with each other through interfaces.
  • the interface between the NEF network element and the AF network element is the N33 interface.
  • the signaling plane interface between the terminal equipment and the AMF network element is the N1 interface. Since the terminal equipment cannot directly interact with the core network, it needs to transparently transmit non-access stratum (NAS) information through the access stratum (AS).
  • the signaling plane interface where the AMF requests the access network (AN) to allocate resources for the protocol data unit (PDU) session is the N2 interface.
  • Terminal equipment can also be called user equipment (UE), access terminal, user unit, user station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication equipment, user agent, or user device.
  • Terminal devices may include but are not limited to: mobile phones, tablets, computers with wireless transceiver functions, virtual reality (VR) terminal devices, augmented reality (AR) terminal devices, mixed reality (MR) terminal devices, XR terminal devices, wireless terminals in industrial control, tactile terminal devices, vehicle-mounted terminal devices, wireless terminals in driverless driving, wireless terminals in remote medical care, wireless terminals in smart grids, wireless terminals in transportation safety, wireless terminals in smart cities, wireless terminals in smart homes, wearable terminal devices, video players, full-range projectors, etc. This application does not limit the specific form of the terminal equipment.
  • a data network can provide operator services, Internet access or third-party services.
  • the data network includes a server that can encode and render video sources.
  • FIG. 1 is a schematic diagram of a communication system suitable for the method provided by the embodiment of the present application.
  • the communication system 100 shown in Figure 1 may include: terminal equipment, radio access network equipment (RAN as shown in the figure), core network equipment (UPF, SMF, AMF, PCF, NEF, etc. as shown in the figure), data networks, and servers.
  • the communication system 100 can be regarded as a server-network-terminal device network architecture.
  • the terminal equipment can be a VR terminal equipment, an AR terminal equipment, an XR terminal equipment, a video player, a full range of projectors, etc.
  • the terminal device can have the function of an application layer device: for example, it can collect user operations, such as handle operations and voice control, generate action instructions based on those operations, and collect images or videos. It can also have the function of a communication device: for example, it can communicate wirelessly with wireless access network equipment, transmit action instructions, images, videos, etc. from the application layer device to the wireless access network equipment through the air interface, and pass data received from the wireless access network equipment to the application layer device.
  • the terminal device can collect images or videos and upload the collected images or videos to the server.
  • the process of data transmission between the terminal device and the server may be as shown in Figure 1.
  • the uplink data sent by the terminal device reaches the server via the wireless access network device, the core network device (specifically, it can be UPF in the core network device), and the data network.
  • the downlink data sent by the server reaches the terminal device via the data network, core network equipment (such as UPF), and wireless access network equipment.
  • terminal equipment can be separated, such as divided into application layer equipment and communication equipment based on different functions.
  • application server and data network can be deployed in one unit. This application does not limit this.
  • the interfaces between various network elements are exemplarily shown in Figure 1 .
  • for example, the terminal device and the wireless access network device communicate through the Uu interface, and the wireless access network device and the AMF communicate through the N2 interface. The remaining interfaces are not described one by one here, and this application does not limit them.
  • FIG. 2 is another schematic diagram of a communication system suitable for the method provided by the embodiment of the present application.
  • the communication system 200 shown in Figure 2 may include terminal equipment 1 and terminal equipment 2, radio access network equipment 1 and radio access network equipment 2 (RAN1 and RAN2 as shown in the figure), and core network equipment, such as UPF network element.
  • the communication system 200 can be regarded as a network architecture of terminal device-network-terminal device.
  • the system shown in Figure 2 can be applied to the tactile Internet.
  • in the tactile Internet, the main domain is, for example, terminal device 1, which can be an XR terminal, a personal computer, etc.; the controlled domain is, for example, terminal device 2, which can be a remotely controlled robot, a remote operator, etc.; and the network domain includes the core network equipment and wireless access network equipment 1 and 2.
  • the main domain consists of the tactile user and the human system interface (HSI).
  • the HSI can be responsible for converting the tactile user's input into tactile data using appropriate tactile encoding technology.
  • Haptic data is transmitted through the network domain to the controlled domain.
  • the main domain can directly control the controlled domain through various command signals, and the controlled domain can also feed feedback signals to the main domain.
  • the master domain can also receive audio/video feedback signals from the controlled domain. It is easy to understand that the relationship between the main domain and the controlled domain is similar to the relationship between the server and the terminal device described above in conjunction with Figure 1.
  • Figure 3 is another schematic diagram of a communication system suitable for the method provided by the embodiment of the present application.
  • the communication system 300 may include terminal equipment, wireless access network equipment (AP as shown in the figure), fixed network and server.
  • the communication system 300 can be regarded as a server-network-terminal device network architecture.
  • the difference from Figure 1 is that the network in this network architecture includes a fixed network.
  • the terminal equipment can be an XR terminal, a video player, etc.
  • the wireless access network equipment can be a Wi-Fi router, Wi-Fi AP, set-top box, etc.
  • the process of transmitting data between the terminal device and the server can be shown in Figure 3.
  • the uplink data sent by the terminal device reaches the server via the wireless access network device and the fixed network.
  • the downlink data sent by the server reaches the terminal device via fixed network and wireless access network equipment.
  • the data transmitted between the terminal device and the server may include, for example, XR media data, general video data, etc.
  • AI is increasingly used in many businesses.
  • AI can be applied to the communication system shown in Figures 1 to 3, and one or more devices in the communication system can perform computing tasks through a neural network model.
  • the server can perform target detection and recognition on received images or videos through a neural network model.
  • the computing tasks of the neural network model can be divided so that part of the computing tasks are transferred to the terminal device.
  • the computing tasks of preprocessing images or videos are transferred to the terminal device. Preprocessing can specifically include extracting feature information, target positioning, image downsampling, etc.
  • the terminal device can upload the calculated data to the server. This reduces the amount of data transferred compared to uploading the original image or video.
  • Neural network model: a complex network system composed of a large number of simple processing units (that is, neurons) connected to each other.
  • a neural network model can include multiple neural network layers. Based on different categories, neural network models can be divided into: DNN models, convolutional neural network (CNN) models, recurrent neural network (RNN) models, etc. This application includes but is not limited to this.
  • Computing task: a task performed by a neural network model. If the multiple neural network layers in the neural network model are divided into multiple parts, the computing task corresponding to the neural network model is also divided into multiple computing tasks, recorded for example as computing task 1 to computing task N (N is an integer greater than 1). The computing task corresponding to the neural network model can then be realized by executing computing task 1 to computing task N.
  • Computing tasks can be defined based on business. Different businesses have different computing tasks. Computing tasks include, for example, but are not limited to, target detection, target recognition, target classification, behavior prediction, action decision-making in control systems, image rendering enhancement, and so on. This application includes but is not limited to this.
  • Figure 4 is a schematic diagram of performing computing tasks through a neural network model provided by an embodiment of the present application.
  • the neural network model shown in Figure 4 is a DNN model.
  • the DNN model consists of multiple neural network layers. The figure shows 7 neural network layers, which may include one or more convolutional layers, one or more pooling layers, one or more fully connected layers, and one or more activation layers. The computational characteristics of different neural network layers are different.
  • the original data to be processed is input to the DNN model, and after calculation, the DNN model outputs the results.
  • the raw data to be processed may be, for example, an image or a video
  • the output result may be, for example, a result of target detection on the image or video. Therefore, the original data input to the DNN model can be images or videos, and the data output from the DNN model can be detection results.
  • the computing tasks performed by the DNN model can be convolution, pooling, classification, etc. on the input images or videos to obtain target detection results.
  • Split point: a position used to split the multiple neural network layers of the neural network model into multiple parts.
  • the split point can be used to divide multiple neural network layers into two parts.
  • the split points are shown with dashed lines in Figure 4. It can be understood that when the neural network model contains more than two neural network layers, the split point can be selected in a variety of ways: the position between any two adjacent neural network layers can be determined as a split point.
  • the split point is only defined for convenience of description and can be regarded as a position in the neural network model, but does not mean that such a point exists in the neural network model. Segmentation is only defined for ease of understanding and does not mean that the neural network model is segmented.
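  • As an illustration of the split point definition above, the following minimal sketch enumerates the candidate split points of a layer list and divides the list into two parts. The layer names and the helper names `candidate_split_points` and `split_layers` are hypothetical and are not taken from this application.

```python
from typing import List, Sequence, Tuple


def candidate_split_points(num_layers: int) -> List[int]:
    """Return every position between two adjacent layers.

    Split point k means layers[0:k] form the first part and
    layers[k:] form the second part, so k ranges over 1..num_layers-1.
    """
    return list(range(1, num_layers))


def split_layers(layers: Sequence[str], k: int) -> Tuple[List[str], List[str]]:
    """Divide the layer list into two parts at split point k."""
    if not 1 <= k < len(layers):
        raise ValueError("split point must lie between two adjacent layers")
    return list(layers[:k]), list(layers[k:])


# Hypothetical 7-layer model, mirroring the layer count shown in Figure 4.
layers = ["conv1", "pool1", "conv2", "pool2", "fc1", "act1", "fc2"]
points = candidate_split_points(len(layers))
first_part, second_part = split_layers(layers, 2)
```

  • In this sketch a model with 7 layers has 6 candidate split points, one between each pair of adjacent layers.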
  • two devices used to perform computing tasks (such as the terminal device and the server in the system shown in Figure 1 or Figure 3 above, or terminal device 1 and terminal device 2 in the system shown in Figure 2) are pre-configured with the neural network model, or can establish the neural network model in advance. Each device can determine which layers of computing tasks it needs to perform based on the split point.
  • neural network model segmentation and computing task segmentation are used interchangeably, and the meanings expressed by both are the same.
  • Original data, intermediate data, and results: all three are data. The names are defined only to distinguish different data and should not constitute any limitation on this application.
  • the original data may be the data input to the neural network model, specifically the data input to the input layer; the result is the data output after the original data is processed by the neural network model, specifically the data output from the output layer.
  • Intermediate data may refer to data output from a certain neural network layer in the neural network model. Specifically, it may be data output from a layer other than the input layer and the output layer, for example data output from a convolution layer or a pooling layer. It can be understood that the intermediate data is the data output at the split point after the neural network model is split.
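  • The relationship between original data, intermediate data, and results can be sketched with toy layers. The functions below are simple stand-ins, not real convolution or pooling layers, and all names are hypothetical. The sketch only shows that running the first part, passing on the intermediate data, and running the second part gives the same result as running the whole model.

```python
from functools import reduce
from typing import Callable, List

Layer = Callable[[List[float]], List[float]]


def run(layers: List[Layer], data: List[float]) -> List[float]:
    """Apply the layers in order to the input data."""
    return reduce(lambda x, layer: layer(x), layers, data)


# Toy stand-ins for neural network layers (not real convolution or pooling).
def scale(x: List[float]) -> List[float]:
    return [2.0 * v for v in x]


def pool_pairs(x: List[float]) -> List[float]:
    # Keep the larger value of each adjacent pair, halving the length.
    return [max(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]


def total(x: List[float]) -> List[float]:
    return [sum(x)]


model = [scale, pool_pairs, total]
original = [1.0, 3.0, 2.0, 5.0]

k = 2                                    # split point after the second layer
intermediate = run(model[:k], original)  # part executed before the split point
result = run(model[k:], intermediate)    # part executed after the split point
```

  • Here the intermediate data has fewer elements than the original data, which is the situation in which uploading intermediate data rather than raw data reduces the amount transmitted.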
  • Transmission data amount: the size of the data that needs to be transmitted, which can be described in bits, bytes, or other units.
  • in the embodiments of this application, the transmission data amount may specifically refer to the size of the intermediate data that the terminal device obtains after performing a certain computing task (such as the first computing task) and needs to transmit to another device (such as the first device described below).
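  • As a rough illustration of how a transmission data amount might be expressed in bits, the sketch below multiplies the element count of a dense tensor by the width of each element. The shapes and bit widths are hypothetical examples and do not come from this application; compression or encoding, if any, is ignored.

```python
from math import prod
from typing import Sequence


def tensor_size_bits(shape: Sequence[int], bits_per_element: int = 32) -> int:
    """Size of a dense tensor in bits: element count times element width."""
    return prod(shape) * bits_per_element


# Hypothetical shapes (channels, height, width); not taken from the application.
raw_image_bits = tensor_size_bits((3, 224, 224), bits_per_element=8)
intermediate_bits = tensor_size_bits((64, 7, 7), bits_per_element=32)
```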
  • Calculation amount: the number of operations that need to be performed, such as the number of floating point operations, or the number of additions and multiplications. In the embodiments of this application, the calculation amount may specifically refer to the number of operations required for the terminal device to perform a certain computing task (such as the first computing task).
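  • One common way to count additions and multiplications per layer is sketched below. The counting convention (one multiplication and one addition per weight, with a bias term) and the layer sizes are assumptions made for illustration; this application does not prescribe a particular counting rule.

```python
def dense_ops(in_features: int, out_features: int) -> int:
    """Multiplications plus additions for y = Wx + b.

    Each output needs in_features multiplications and in_features
    additions (in_features - 1 to sum the products, 1 for the bias).
    """
    return 2 * in_features * out_features


def conv2d_ops(in_ch: int, out_ch: int, kernel: int, out_h: int, out_w: int) -> int:
    """Multiplications plus additions for a square-kernel convolution with bias."""
    per_output = 2 * in_ch * kernel * kernel
    return per_output * out_ch * out_h * out_w


# Hypothetical layer sizes, chosen only to make the arithmetic easy to check.
ops_dense = dense_ops(in_features=512, out_features=10)
ops_conv = conv2d_ops(in_ch=3, out_ch=8, kernel=3, out_h=16, out_w=16)
```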
  • Channel state: the channel attributes of a communication link.
  • in the embodiments of this application, the channel state may specifically refer to the channel attributes of the wireless communication link.
  • wireless signals may be affected by factors such as signal scattering, environmental fading, and distance attenuation, so the transmission rate may change accordingly. The channel state can therefore be characterized by parameters such as the signal to interference plus noise ratio (SINR), reference signal received power (RSRP), and channel quality indicator (CQI), which can be used to determine the rate at which data is transmitted over the channel.
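  • To make the link between channel state and transmission rate concrete, the sketch below uses the Shannon capacity formula, rate = B · log2(1 + SINR), as an upper bound. This is an assumption chosen for illustration: this application does not state how the rate is derived, and practical systems typically map reported channel quality to a modulation and coding scheme instead. The bandwidth and SINR values are hypothetical.

```python
import math


def db_to_linear(value_db: float) -> float:
    """Convert a ratio expressed in decibels to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def shannon_rate_bps(bandwidth_hz: float, sinr_db: float) -> float:
    """Upper bound on the rate: B * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1.0 + db_to_linear(sinr_db))


def transmission_delay_s(data_bits: float, rate_bps: float) -> float:
    """Time needed to send data_bits at a constant rate."""
    return data_bits / rate_bps


# Hypothetical numbers: 20 MHz bandwidth, good and poor channel conditions.
rate_good = shannon_rate_bps(20e6, sinr_db=20.0)
rate_poor = shannon_rate_bps(20e6, sinr_db=0.0)
```

  • With these numbers the same amount of intermediate data takes several times longer to send over the poor channel, which is why the channel state matters when choosing where to split.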
  • the neural network layers close to the input side (such as convolution layers and pooling layers) require a small amount of calculation, while the neural network layers close to the output side (such as fully connected layers and activation layers) require a large amount of calculation. Therefore, the computing tasks of the neural network layers with a small amount of calculation can be assigned to the terminal device for execution, and the computing tasks of the neural network layers with a large amount of calculation can be assigned to the server for execution. In this way, after the terminal device processes the original data, the data dimension is reduced and the amount of intermediate data output is smaller than the original data, that is, the amount of data transmitted can be reduced.
  • the present application provides a method for dividing computing tasks in which a network device (such as a wireless access network device or a core network device) determines the target split point of the computing tasks. The network device can obtain the channel state between the wireless access network device and the terminal device in a timely manner; the wireless access network device in particular can obtain it in real time. The network device can therefore use the most recently obtained channel state to adjust the target split point in a timely and effective manner, making the division of computing tasks more reasonable, better meeting requirements, and improving user experience.
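  • The idea of adjusting the target split point according to the latest channel state can be sketched as follows. The selection rule used here (minimize terminal-side computing time plus uplink transmission time) is an assumption made for illustration and is not stated in this section; the class `SplitEstimate`, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SplitEstimate:
    split_point: int  # index of the candidate split point
    tx_bits: float    # transmission data amount of the terminal-side task
    ops: float        # calculation amount of the terminal-side task


def estimated_delay_s(est: SplitEstimate, rate_bps: float,
                      terminal_ops_per_s: float) -> float:
    """Terminal compute time plus uplink transmission time (assumed cost)."""
    return est.ops / terminal_ops_per_s + est.tx_bits / rate_bps


def choose_split(estimates: Iterable[SplitEstimate], rate_bps: float,
                 terminal_ops_per_s: float) -> SplitEstimate:
    """Pick the candidate with the smallest estimated delay."""
    return min(estimates,
               key=lambda e: estimated_delay_s(e, rate_bps, terminal_ops_per_s))


# Hypothetical candidates: a later split costs more compute but sends less data.
candidates = [
    SplitEstimate(split_point=1, tx_bits=8e6, ops=1e8),
    SplitEstimate(split_point=2, tx_bits=1e6, ops=6e8),
]
good_channel = choose_split(candidates, rate_bps=100e6, terminal_ops_per_s=1e9)
poor_channel = choose_split(candidates, rate_bps=5e6, terminal_ops_per_s=1e9)
```

  • Under this assumed cost, a good channel favors the earlier split (less computing on the terminal device) and a poor channel favors the later split (less data to transmit).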
  • words such as “first” and “second” are used to distinguish the same or similar items with basically the same functions and effects.
  • for example, the first computing task and the second computing task are only used to distinguish different computing tasks, and their order is not limited.
  • words such as “first” and “second” do not limit the number or the execution order.
  • “at least one” refers to one or more, and “plurality” refers to two or more.
  • “And/or” describes the association of associated objects, indicating that there can be three relationships, for example, A and/or B, which can mean: A exists alone, A and B exist simultaneously, and B exists alone, where A, B can be singular or plural.
  • the character "/” generally indicates that the related objects are in an "or” relationship.
  • “At least one of the following” or similar expressions thereof refers to any combination of these items, including any combination of a single item (items) or a plurality of items (items).
  • At least one of a, b and c can mean: a, or b, or c, or a and b, or a and c, or b and c, or a, b and c, where a, b, c can be single or multiple.
  • the tables in the embodiments of this application are only examples and do not limit the scope of protection of this application.
  • the values of the information in the table are only examples and can be configured as other values, which are not limited by this application.
  • appropriate deformation adjustments can be made based on the tables in the above text, such as splitting, merging, etc.
  • the parameter names shown in the titles of each table can also be other names that can be understood by the communication device, and the values or expressions of the parameters may also take other values or expressions that can be understood by the communication device.
  • other data structures can also be used, such as arrays, queues, containers, stacks, linear lists, pointers, linked lists, trees, graphs, structures, classes, heaps, or hash tables.
  • "predefinition" or "preconfiguration" can be achieved by pre-saving corresponding code, tables, or other means that can be used to indicate relevant information in the device (for example, including the terminal device and the first device). This application does not limit the specific implementation.
  • "saving” may refer to saving in one or more memories.
  • the one or more memories may be a separate device, or may be integrated in an encoder or decoder, a processor, or a communication device.
  • the one or more memories may also be partially provided separately and partially integrated in the decoder, processor, or communication device.
  • the type of memory can be any form of storage medium, and this application is not limited thereto.
  • Figure 5 is a schematic flow chart of a computing task dividing method 500 provided by an embodiment of the present application. The method is illustrated in Figure 5 with the radio access network device, the terminal device, and the first device as the executing entities of the interaction, but this application does not limit the executing entities.
  • the radio access network device in Figure 5 can also be a chip, chip system, or processor that supports the radio access network device in implementing the method, or a logic module or software that can realize all or part of the functions of the radio access network device.
  • the terminal device in Figure 5 can also be a chip, chip system, or processor that supports the terminal device in implementing the method, or a logic module or software that can realize all or part of the functions of the terminal device.
  • the first device in Figure 5 may also be a chip, chip system, or processor that supports the first device in implementing the method, or a logic module or software that can realize all or part of the functions of the first device.
  • the method 500 shown in Figure 5 includes steps 510 to 540. Each step in Figure 5 will be described in detail below.
  • the radio access network device obtains the transmission data amount and calculation amount of the terminal device corresponding to the first calculation task.
  • the first calculation task is obtained by dividing the calculation tasks of the neural network model based on the dividing points.
  • the first computing task may be obtained by dividing the computing task of a complete neural network model based on a certain dividing point. As mentioned before, dividing the computing tasks of the neural network model based on a certain dividing point yields two computing tasks. In this embodiment, for convenience of distinction and explanation, the computing task corresponding to the neural network model is recorded as computing task A, and the two computing tasks obtained by dividing computing task A are recorded as the first computing task and the second computing task respectively.
  • the first computing task corresponds to the terminal device, and the second computing task corresponds to another device.
  • the dividing point used to divide the computing task A into the first computing task and the second computing task is recorded as the dividing point A.
  • the other device is a device that communicates with the terminal device.
  • it may be the server in FIG. 1 or FIG. 3 , or it may be another terminal device different from the terminal device shown in FIG. 2 .
  • the other device is recorded as the first device.
  • the first computing task corresponds to the terminal device and the second computing task corresponds to the first device. This can be understood as an assumption that the first computing task is assigned to the terminal device for execution and the second computing task is assigned to the first device for execution.
  • the amount of transmitted data and the amount of calculation corresponding to the first computing task are the amount of transmitted data and the amount of calculation that may be generated assuming that the terminal device executes the first computing task.
  • one possible implementation by which the radio access network device obtains the transmission data amount and calculation amount of the terminal device corresponding to the first computing task is that the terminal device sends first information to the radio access network device, where the first information indicates the transmission data amount and calculation amount corresponding to the first computing task.
  • the radio access network device receives the first information.
  • the first information may be carried in radio resource control (RRC) signaling.
  • the first information may specifically be UE assistance information (UAI) carried in RRC signaling, or an information element in UAI.
  • the first information can also be carried in a medium access control (MAC) control element (CE).
  • a new MAC-CE is used to carry the first information. This application does not limit the signaling used to carry the first information and the specific name of the first information.
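  • Since this application does not limit the signaling or the format used for the first information, the following is only a hypothetical sketch of how the indicated quantities could be packed into a compact payload. The field layout (a 1-byte split point index, a 4-byte transmission data amount in bytes, and an 8-byte operation count), the inclusion of a split point index, and the class name `FirstInformation` are all assumptions; they do not correspond to any RRC or MAC-CE format.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 1-byte split point index, 4-byte transmission data
# amount in bytes, 8-byte operation count, all in network byte order.
_FORMAT = "!BIQ"


@dataclass(frozen=True)
class FirstInformation:
    split_point: int
    tx_bytes: int
    ops: int

    def encode(self) -> bytes:
        """Pack the three fields into a fixed-size byte string."""
        return struct.pack(_FORMAT, self.split_point, self.tx_bytes, self.ops)

    @classmethod
    def decode(cls, payload: bytes) -> "FirstInformation":
        """Rebuild the report from a payload produced by encode()."""
        return cls(*struct.unpack(_FORMAT, payload))


report = FirstInformation(split_point=3, tx_bytes=12_544, ops=110_592)
payload = report.encode()
```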
  • the amount of transmission data and the amount of calculation corresponding to the first computing task can be estimated by the terminal device.
  • the terminal device can estimate how many paths need to be calculated based on the number of layers of the neural network corresponding to the first computing task, the number of neural network elements, and the corresponding calculation types and counts, and thereby estimate the number of operations required, that is, the calculation amount corresponding to the first computing task.
  • the terminal device can also estimate the size of the data that may be output, that is, the transmission data amount corresponding to the first computing task, based on the number of paths in the neural network corresponding to the first computing task and the type and number of neural network elements at the split point.
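  • A simplified version of this estimation can be sketched with a per-layer profile. The sketch assumes that each layer's operation count and output size are already known, which is a simplification of the path-based estimation described above; the class `LayerProfile`, the function `estimate_first_task`, and the numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LayerProfile:
    name: str
    ops: int          # operations needed to execute this layer once
    output_bits: int  # size of the data this layer outputs


def estimate_first_task(layers: Sequence[LayerProfile], k: int) -> Tuple[int, int]:
    """Return (calculation amount, transmission data amount) for split point k.

    The terminal device runs layers[0:k]; the data it must send is the
    output of the last layer it runs.
    """
    if not 1 <= k < len(layers):
        raise ValueError("split point must lie between two adjacent layers")
    ops = sum(layer.ops for layer in layers[:k])
    tx_bits = layers[k - 1].output_bits
    return ops, tx_bits


# Hypothetical per-layer profile; the numbers are illustrative only.
profile = [
    LayerProfile("conv1", ops=100, output_bits=4000),
    LayerProfile("pool1", ops=10, output_bits=1000),
    LayerProfile("fc1", ops=500, output_bits=80),
]
ops_k2, tx_k2 = estimate_first_task(profile, 2)
```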
  • another possible implementation by which the radio access network device obtains the transmission data amount and calculation amount of the terminal device corresponding to the first computing task is that the first device sends second information to the radio access network device, where the second information indicates the transmission data amount and calculation amount corresponding to the first computing task.
  • the radio access network device receives the second information from the first device.
  • the amount of transmission data and the amount of calculation corresponding to the first computing task can be estimated by the first device.
  • the specific way in which the first device estimates the amount of data to be transmitted and the amount of calculation is similar to that described above, and will not be described again.
  • yet another possible implementation by which the radio access network device obtains the transmission data amount and calculation amount of the terminal device corresponding to the first computing task is that the first device sends third information to the radio access network device, where the third information indicates the transmission data amount and calculation amount corresponding to the second computing task.
  • the radio access network device receives the third information from the first device.
  • the amount of transmitted data and the amount of calculation corresponding to the second computing task can be estimated by the first device.
  • the radio access network device can estimate the transmission data amount and calculation amount of computing task A corresponding to the neural network model, and then, together with the amounts indicated for the second computing task, determine the transmission data amount and calculation amount corresponding to the first computing task.
  • the specific manner in which the first device and the radio access network device estimate the amount of transmission data and calculation amount is similar to that described above and will not be described again.
  • the above-mentioned second information and third information can be sent by the server (ie, an example of the first device) to the core network device through the N33 interface, and then transmitted by the core network device to the radio access network device through general packet radio service (GPRS) tunneling protocol-control (GTP-C) signaling.
  • the above-mentioned second information and third information may also be sent by another terminal device (that is, another example of the first device) and carried in signaling such as RRC signaling and MAC-CE. This application does not limit the signaling used to carry the second information and the third information.
  • the terminal device or the first device can, when the service is initiated, estimate the transmission data amount and calculation amount corresponding to either of the two computing tasks obtained by dividing the computing task corresponding to the same neural network model at split points at different locations, and send them to the radio access network device.
  • the computing tasks corresponding to the same neural network model are divided through dividing points at different positions, and the resulting computing tasks are different.
  • for example, if the computing task of the same neural network model is divided at split point 1 (as shown in a) in Figure 6), one computing task corresponding to the terminal device and one computing task corresponding to the first device are obtained, as shown in b) in Figure 6.
  • if the computing task of the same neural network model is divided at split point 2 (as shown in a) in Figure 6), one computing task corresponding to the terminal device and one computing task corresponding to the first device are obtained, as shown in c) in Figure 6. Comparing b) and c) in Figure 6, it can be seen that the two computing tasks corresponding to the terminal device are different, and the two computing tasks corresponding to the first device are also different.
  • the terminal device or the first device can, after determining the neural network model based on the service, traverse the split points at different positions in the model to divide the computing task of the neural network model and obtain the computing tasks corresponding to the different split points. For example, with the split points 1 to 6 shown in Figure 7 and the neural network layers corresponding to split points 1 to 6 respectively, six different computing tasks corresponding to the terminal device can be obtained, and the transmission data amount and calculation amount corresponding to each of the six computing tasks can then be estimated.
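  • The traversal of split points described above can be sketched as follows. This is a minimal illustration, not the method defined by this application: the `Layer` class, the `enumerate_split_points` function, and all per-layer numbers are hypothetical, and it assumes a purely sequential model in which split point k sits directly after layer k.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    ops: float         # assumed number of operations needed to evaluate the layer
    output_bytes: int  # assumed size of the layer output (intermediate data)


def enumerate_split_points(layers):
    """Return one (split_id, calc_amount, tx_bytes) row per split point.

    Split point k (1-based) sits after layer k, so the terminal device runs
    layers 1..k and transmits the output of layer k.
    """
    table = []
    calc = 0.0
    # No split point is placed after the last layer.
    for k, layer in enumerate(layers[:-1], start=1):
        calc += layer.ops
        table.append((k, calc, layer.output_bytes))
    return table


# Seven hypothetical layers give six split points, numbered 1 to 6.
layers = [
    Layer("input", 0.0, 6000),
    Layer("conv1", 2.0e6, 8000),
    Layer("conv2", 4.0e6, 4000),
    Layer("conv3", 4.0e6, 2000),
    Layer("fc1", 1.0e6, 500),
    Layer("fc2", 0.5e6, 100),
    Layer("output", 0.1e6, 10),
]
table = enumerate_split_points(layers)
for split_id, calc_amount, tx_bytes in table:
    print(split_id, calc_amount, tx_bytes)
```

  • Each row of `table` pairs a split point identifier with the calculation amount and transmission data amount of the corresponding terminal-side computing task, which is the kind of per-split-point information that can be reported to the radio access network device.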
  • the first device can also obtain multiple different computing tasks corresponding to the first device based on the dividing points at different locations, and send the amount of transmission data and calculation amount corresponding to each computing task in a form similar to the above table. to wireless access network equipment.
  • each segmentation point and its identification are consistent in the terminal device and the first device.
  • the terminal device and the first device may pre-configure the neural network model, or build the neural network model based on the same configuration information, and assign identifiers to the dividing points at each location based on the same rules.
  • for example, a split point can be set between every two adjacent layers from the input layer to the output layer, and the split points numbered from 1 to 6 in sequence; for another example, a split point can be set every few neural network layers from the input layer to the output layer, and the split points labeled in sequence.
  • a specific implementation in which the terminal device and the first device construct the neural network model based on the same configuration information may be that the terminal device and the first device negotiate the configuration information through a neural network data exchange standard, for example, open neural network exchange (ONNX) or another predefined neural network data exchange standard.
  • the configuration information includes: the structure and/or parameters used to build the test neural network model.
  • the structure may specifically refer to the type of neural network, such as CNN, RNN, etc.
  • the parameters may specifically include the number of layers of the neural network, the type and weight of each neuron in each neural network layer, etc. This application includes but is not limited to this.
  • when the radio access network device performs step 510, it can obtain the transmission data amount and calculation amount corresponding to the first computing task by obtaining the transmission data amounts and calculation amounts respectively corresponding to multiple different computing tasks of the terminal device.
  • the radio access network device determines the split point (ie, the split point A) as the target split point based on the above-mentioned transmission data amount, the above-mentioned calculation amount, and the channel status between the terminal device and the radio access network device.
  • the wireless access network device can frequently send downlink reference signals to the terminal device to perform downlink channel measurement; the terminal device can also frequently send uplink reference signals to the wireless access network device to perform uplink channel measurement.
  • the radio access network device can determine the channel status with the terminal device based on the uplink channel measurement results in real time.
  • the amount of transmitted data and calculation amount corresponding to the first computing task may also affect the completion time and required power consumption of computing task A.
  • the radio access network device can determine whether to determine the split point A as the target split point in combination with the above items.
  • the target segmentation point can be determined from different dimensions based on different needs.
  • the target split point is determined based on time delay.
  • the delay is the time required for the terminal device to perform the first computing task.
  • it may include: the computing delay and the transmission delay for the terminal device to perform the first computing task.
  • the computing delay for the terminal device to perform the first computing task is the time required for the terminal device to complete the calculation amount of the first computing task, and the transmission delay for the terminal device to perform the first computing task is the time required for the terminal device to transmit the intermediate data obtained by performing the first computing task.
  • the computing delay of the first computing task may be determined based on the computing amount of the first computing task and the computing capability of the terminal device.
  • the calculation time is related to the computing power of the terminal device. Therefore, when determining the target split point based on delay, the computing power of the terminal device can be further obtained.
  • the method further includes: the radio access network device receiving computing power information from the terminal device, where the computing power information indicates the computing capability of the terminal device.
  • the terminal device sends the computing power information to the wireless access network device.
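  • step 520 may specifically include: the radio access network device determines the split point as the target split point based on the transmission data amount, the calculation amount, the channel status, and the computing power information.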
  • Receiving the computing power information of the terminal device can be performed after step 510, before step 510, or simultaneously with step 510. This application does not limit the execution order of the two.
  • the computing power information may be carried in RRC signaling.
  • the computing power information may be a UAI carried in RRC signaling, or an information element in the UAI.
  • the computing power information can also be carried in MAC-CE.
  • a new MAC-CE is used to carry the computing power information. This application does not limit the signaling used to carry the computing power information.
  • the computing power information may be at least one of the following: the time required for the terminal device to complete the predefined test task or the computing power of the terminal device.
  • the predefined test tasks include: tasks performed based on at least one of a predefined test neural network model, a predefined calculation type, or predefined input data. That is to say, different terminal devices can be tested based on the same test task to obtain the time required for different terminal devices to complete the same test task, and then the computing capabilities of different terminal devices can be derived based on the time.
  • the test neural network model can be pre-configured in the terminal device, constructed based on predefined configuration information, or constructed based on configuration information obtained from other devices (such as the radio access network device or the first device).
  • for the configuration information, refer to the relevant description in step 510 above, which will not be repeated here.
  • the calculation type may include operation types, such as matrix operations, specifically matrix multiplication operations, matrix inversion operations, etc. This application includes but is not limited to this.
  • Input data can refer to the data input to the test neural network model, that is, the data to be processed.
  • Test neural network models, calculation types and input data can for example be predefined by the protocol.
  • when the terminal device performs a test task based on at least one of them, the time required to complete the test task can be obtained.
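  • As an illustration of how the time to complete a predefined test task could be measured, the following sketch times a fixed-size matrix multiplication with fixed input data. The task size, the input values, and the function names `matmul` and `run_test_task` are assumptions made for this example and are not defined by this application.

```python
import time


def matmul(a, b):
    """Plain matrix multiplication on lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def run_test_task(n=32):
    """Run a fixed n x n matrix multiplication and time it.

    Returns (elapsed_seconds, operations_per_second). The operation count
    2 * n**3 is an approximation: n**3 multiplications plus about n**3 additions.
    """
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    ops = 2 * n ** 3
    # Guard against a zero reading on very coarse timers.
    return elapsed, ops / max(elapsed, 1e-9)


elapsed, rate = run_test_task()
print(f"test task took {elapsed:.6f} s, about {rate:.3e} operations per second")
```

  • Because every terminal device would run the same task on the same input, the measured times are comparable across devices, which is what allows a computing capability to be derived from them.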
  • Computing power can also be characterized by the number of floating point operations per second of the terminal device.
  • the number of floating point operations per second is the peak number of floating point operations that can be performed per second.
  • the terminal device can report the number of floating point operations per second to the wireless access network device.
  • One possible implementation method for the terminal device to report the number of floating point operations per second is to directly report the number of floating point operations per second.
  • Another possible implementation method for the terminal device to report the number of floating point operations per second is to report information identifying the number of floating point operations per second.
  • the information identifying the number of floating point operations per second may be a capability level. Different floating-point operations per second can correspond to different capability levels.
  • the correspondence between the number of floating point operations per second and the capability level can be predefined, for example, predefined by a protocol.
  • the terminal device can report the capability level corresponding to the number of floating point operations per second to the wireless access network device according to the corresponding relationship.
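  • A predefined correspondence between floating point operations per second and capability levels can be pictured as a simple threshold table. The boundaries in `LEVEL_BOUNDS` and the function name `capability_level` below are hypothetical; the application only states that such a correspondence can be predefined.

```python
import bisect

# Hypothetical lower bounds, in floating point operations per second, for
# capability levels 1 to 4. Values below the first bound map to level 0.
LEVEL_BOUNDS = [1e9, 1e10, 1e11, 1e12]


def capability_level(flops_per_second):
    """Map a peak floating-point rate to the capability level to report."""
    return bisect.bisect_right(LEVEL_BOUNDS, flops_per_second)


for rate in (5e8, 1e9, 5e10, 2e12):
    print(rate, capability_level(rate))
```

  • Reporting a level instead of the raw number keeps the signaling short, at the cost of resolution: all devices within one band look identical to the radio access network device.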
  • the number of floating-point operations per second or its corresponding capability level is only one of the parameters used to characterize computing capabilities. This application is not limited to the parameters used to characterize computing capabilities. Other parameters can also be correlated with capability levels.
  • the computing speed of the terminal device can be determined. Further combined with the amount of calculation corresponding to the first computing task, the computing delay for the terminal device to complete the first computing task can be determined.
  • the computing delay can be expressed as $T_c = Q_c / R_c$, where $Q_c$ represents the calculation amount and $R_c$ represents the calculation speed.
  • the user may adjust the mode of the terminal device in different power states, such as switching to the power saving mode in a low power state, and the computing power of the terminal device may change with the mode. Therefore, the terminal device can periodically send computing power information to the radio access network device. In this way, the radio access network device can determine the computing delay based on the most recently received computing power information, thereby making the estimate of the computing delay more accurate.
  • the transmission delay of the first computing task may be determined based on the transmission data amount of the first computing task and the transmission rate between the terminal device and the radio access network device.
  • the transmission rate may be determined, for example, by the radio access network device based on the channel status of the physical layer obtained in real time and the scheduling signaling of Layer 2. Determining the transmission rate according to the channel status is an internal implementation of the wireless access network device and can be achieved through existing technologies, which will not be described in detail.
  • the transmission delay can be expressed as $T_t = Q_t / R_t$, where $Q_t$ represents the transmission data amount and $R_t$ represents the transmission rate.
  • the transmission delay of the terminal device executing the first computing task can be obtained.
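  • The delay estimate described above can be sketched as follows. The sketch assumes that the computing delay is the calculation amount Q c divided by the calculation speed R c, that the transmission delay is the transmission data amount Q t divided by the transmission rate R t, and that the delay of the first computing task is the sum of the two. The function names, units, and example values are illustrative assumptions.

```python
def computing_delay(q_c, r_c):
    """Time to finish q_c operations at r_c operations per second."""
    return q_c / r_c


def transmission_delay(q_t, r_t):
    """Time to send q_t bits at r_t bits per second."""
    return q_t / r_t


def total_delay(q_c, r_c, q_t, r_t):
    """Delay of the first computing task: computing plus transmission."""
    return computing_delay(q_c, r_c) + transmission_delay(q_t, r_t)


# Hypothetical example: 2e9 operations on a 1e10 operations/s device,
# followed by 4e6 bits of intermediate data over a 2e7 bit/s uplink.
t = total_delay(q_c=2e9, r_c=1e10, q_t=4e6, r_t=2e7)
print(f"estimated delay: {t:.3f} s")
```

  • Keeping the two terms separate makes it easy to refresh only the transmission term when the channel status changes, or only the computing term when new computing power information arrives.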
  • the target split point can be determined based on latency. Therefore, the radio access network device can determine the delay T corresponding to the first computing task according to the above-described delay calculation method, and then determine whether to determine the dividing point A corresponding to the first computing task as the target dividing point.
  • One possible implementation is to preset a delay threshold for the computing task corresponding to the terminal device. If the delay of the first computing task obtained by dividing at the split point A does not exceed the delay threshold, the computing task divided at the split point A can be considered to meet the delay requirement, and the split point A can be determined as the target split point; if the delay of the first computing task obtained by dividing at the split point A exceeds the delay threshold, the computing task divided at the split point A can be considered not to meet the delay requirement, and the split point A cannot be determined as the target split point.
  • Another possible implementation is to preset a delay threshold for the computing task A. In this case, not only the delay of the first computing task but also the delay of the second computing task needs to be estimated, and whether to determine the split point A as the target split point is then decided based on the relationship between the total delay of the two and the delay threshold.
  • if the sum of the delays of the first computing task and the second computing task obtained by dividing at the split point A does not exceed the delay threshold, the computing task divided at the split point A can be considered to meet the delay requirement, and the split point A is determined as the target split point; if the sum of the delays of the first computing task and the second computing task obtained by dividing at the split point A exceeds the delay threshold, the computing task divided at the split point A can be considered not to meet the delay requirement, and the split point A cannot be determined as the target split point.
  • Another possible implementation is to divide the computing task A corresponding to the neural network model at split points at different positions, obtain multiple different computing tasks corresponding to the terminal device, calculate the delay of each of the multiple different computing tasks, and determine the split point corresponding to the computing task with the smallest delay as the target split point. For example, if the first computing task is the computing task with the smallest delay among the multiple different computing tasks, the split point A can be determined as the target split point.
  • the computing delay and transmission delay of the terminal device performing different computing tasks can also be calculated by referring to the method provided above, but the amount of data transmitted and the amount of calculation are different.
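  • The delay-based selection can be sketched as a search over candidate split points. The candidate tuples, rates, and threshold below are hypothetical, and the function names `task_delay` and `select_min_delay` are introduced only for this example; the delay model (calculation amount over calculation speed plus transmission data amount over transmission rate) follows the description above.

```python
def task_delay(q_c, q_t, r_c, r_t):
    """Computing delay plus transmission delay, in seconds."""
    return q_c / r_c + q_t / r_t


def select_min_delay(candidates, r_c, r_t, threshold=None):
    """Return the id of the split point with the smallest delay.

    candidates: list of (split_id, q_c, q_t) tuples.
    If a threshold is given and even the best delay exceeds it, return None.
    """
    best_id, best_delay = None, float("inf")
    for split_id, q_c, q_t in candidates:
        d = task_delay(q_c, q_t, r_c, r_t)
        if d < best_delay:
            best_id, best_delay = split_id, d
    if threshold is not None and best_delay > threshold:
        return None
    return best_id


# Later split points need more computation but produce less data to send.
candidates = [(1, 1e9, 8e6), (2, 3e9, 2e6), (3, 6e9, 1e6)]
good_channel = select_min_delay(candidates, r_c=1e10, r_t=1e7)
poor_channel = select_min_delay(candidates, r_c=1e10, r_t=1e6)
too_strict = select_min_delay(candidates, r_c=1e10, r_t=1e7, threshold=0.4)
print(good_channel, poor_channel, too_strict)
```

  • With these made-up numbers, a lower transmission rate pushes the choice toward a later split point, because sending less intermediate data outweighs the extra local computation. This is the sense in which the target split point follows the channel status.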
  • segmentation point A is the target segmentation point determined based on the above-mentioned implementation.
  • the target split point is determined based on power consumption.
  • the power consumption is the power consumption required by the terminal device to perform the first computing task.
  • it may include: computing power consumption and transmission power consumption of the terminal device performing the first computing task.
  • the computing power consumption of the terminal device when performing the first computing task is the power consumption required by the terminal device to perform the calculation amount of the first computing task
  • the transmission power consumption of the terminal device when performing the first computing task is the power consumption required by the terminal device to transmit the intermediate data obtained by performing the first computing task.
  • the computing power consumption of the first computing task is related to the computing amount of the first computing task. Computing power consumption increases as the amount of calculation increases.
  • the amount of computation is related to the number of layers of the neural network.
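  • the computing power consumption can be expressed as $P_c = \sum_i P_{c,i}$, where $i$ denotes the i-th neural network layer in the first computing task and $P_{c,i}$ represents the computing power consumption of the i-th neural network layer.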
  • the computing power consumption of the terminal device executing the first computing task can be obtained.
  • the computing power consumption of each neural network layer can be calculated based on the number of operations required for each neuron calculation and the power consumption required for each operation in the terminal chip. For example, it can be calculated based on the number of adder operations and the number of multiplier operations required by each neuron, and the power consumption required for each adder operation and each multiplier operation in the terminal chip.
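  • The per-layer estimate described above can be sketched as follows. It assumes that every neuron in a layer needs the same number of adder and multiplier operations and that the computing power consumption of the first computing task is the sum over its layers. The function names, layer shapes, and per-operation costs are hypothetical values in arbitrary units.

```python
def layer_power(neurons, adds_per_neuron, mults_per_neuron, p_add, p_mult):
    """Computing power consumption of one neural network layer."""
    per_neuron = adds_per_neuron * p_add + mults_per_neuron * p_mult
    return neurons * per_neuron


def computing_power(layers, p_add, p_mult):
    """Sum the per-layer consumption over the layers of the computing task.

    layers: list of (neurons, adds_per_neuron, mults_per_neuron) tuples.
    """
    return sum(layer_power(n, a, m, p_add, p_mult) for n, a, m in layers)


# Two hypothetical layers; one multiplication is assumed to cost three times
# as much as one addition on the terminal chip.
first_task_layers = [(4, 3, 3), (2, 4, 4)]
p_c = computing_power(first_task_layers, p_add=1.0, p_mult=3.0)
print(p_c)
```

  • The per-operation costs are chip-specific, which is why the description ties them to the terminal chip rather than to the neural network model.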
  • the network device and the terminal device can also agree in advance on the computing power consumption of typical neural network elements, and calculate the computing power consumption of each neural network layer from these agreed values according to the neural network structure.
  • the transmission power consumption of the first computing task is related to the transmission data amount and channel status of the first computing task. The greater the amount of data transmitted, the greater the power consumption required. And because the intermediate data obtained by the terminal device performing the first calculation task needs to be transmitted to the radio access network device through the wireless channel, its transmission power consumption is also related to the channel state.
  • the transmission power consumption P t is related to the amount of transmission data required after the calculation task is divided and the current channel status.
  • the wireless access network equipment can perform power allocation based on the current channel status and the amount of transmitted data, and then determine the transmission power.
  • the transmission power consumption of the first computing task may also be determined by the radio access network device based on the amount of transmission data corresponding to the first computing task and the channel status obtained in real time.
  • the specific method for power allocation by the wireless access network equipment according to the current channel status and the amount of transmitted data can be found in the existing technology, and will not be described in detail here.
  • the radio access network device can determine the target split point based on power consumption. For example, when the remaining battery level of the terminal device is low, or when the battery capacity of the terminal device used by the user is small, the radio access network device may determine the power consumption P corresponding to the first computing task based on the above power consumption calculation method, and then determine whether to determine the split point A corresponding to the first computing task as the target split point.
  • One possible implementation is to preset a power consumption threshold for the computing task corresponding to the terminal device. If the power consumption of the first computing task obtained by dividing at the split point A does not exceed the power consumption threshold, the computing task divided at the split point A can be considered to meet the power consumption requirement, and the split point A can be determined as the target split point; if the power consumption of the first computing task obtained by dividing at the split point A exceeds the power consumption threshold, the computing task divided at the split point A can be considered not to meet the power consumption requirement, and the split point A cannot be determined as the target split point.
  • Another possible implementation is to divide the computing task A corresponding to the neural network model at split points at different positions, obtain multiple different computing tasks corresponding to the terminal device, calculate the power consumption of each of the multiple different computing tasks, and determine the split point corresponding to the computing task with the smallest power consumption as the target split point. For example, if the first computing task is the computing task with the smallest power consumption among the multiple different computing tasks, the split point A can be determined as the target split point.
  • the computing power consumption and transmission power consumption of the terminal device performing different computing tasks can also be calculated by referring to the method provided above, but the amount of data transmitted and the amount of calculation are different.
  • the radio access network device can also consider the mutual interference between multiple terminal devices in the cell and adjust the transmission power consumption of the terminal devices to reduce interference and obtain better transmission quality.
  • the wireless access network equipment can also consider the total bandwidth in the cell and adjust the transmission power of the terminal equipment according to the rate requirements of the model split point to achieve optimal transmission of system-level services.
  • the target split point is determined based on latency and power consumption.
  • the radio access network device can combine delay and power consumption to determine the target split point, so that the determined target split point brings neither a large delay nor a large power consumption to the terminal device.
  • When the radio access network device determines the target split point based on delay and power consumption, it can apply different weights to the delay and the power consumption according to requirements to obtain a weighted sum of the two. For example, for services with higher latency requirements, a higher weight can be applied to the delay; for services with lower latency requirements but terminal devices that are more sensitive to power consumption, a higher weight can be applied to the power consumption.
  • the weighted sum of delay and power consumption can be expressed by the following formula: $\alpha T + \beta P$, where $\alpha$ is the weight of the delay, $0 \le \alpha \le 1$; $\beta$ is the weight of the power consumption, $0 \le \beta \le 1$.
  • the values of $\alpha$ and $\beta$ corresponding to different services and different terminal devices may be different. If the radio access network device determines the target split point based on delay and power consumption, it can, in response to each service initiation, determine the values of $\alpha$ and $\beta$ for the corresponding computing task, calculate the weighted sum of the delay and the power consumption corresponding to the first computing task, and then determine whether to determine the split point A corresponding to the first computing task as the target split point.
  • One possible implementation is to preset a threshold for the computing task corresponding to the terminal device. If the weighted sum of the delay and power consumption of the first computing task obtained by dividing at the split point A does not exceed the threshold, the computing task divided at the split point A can be considered to meet the requirement, and the split point A can be determined as the target split point; if the weighted sum of the delay and power consumption of the first computing task obtained by dividing at the split point A exceeds the threshold, the computing task divided at the split point A can be considered not to meet the requirement, and the split point A cannot be determined as the target split point.
  • Another possible implementation is to divide the computing task A corresponding to the neural network model at split points at different positions, obtain multiple different computing tasks corresponding to the terminal device, calculate the weighted sum of the delay and power consumption of each of the multiple different computing tasks, and determine the split point corresponding to the computing task with the smallest weighted sum as the target split point. For example, if the first computing task is the computing task with the smallest weighted sum of delay and power consumption among the multiple different computing tasks, the split point A can be determined as the target split point. It can be understood that the delay and power consumption of the terminal device performing different computing tasks can be calculated by referring to the method provided above, except that the transmission data amount and the calculation amount are different.
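  • The weighted selection can be sketched as follows. The weight names `alpha` and `beta`, the candidate values, and the function names are assumptions for illustration. The sketch also assumes that delay and power consumption have already been scaled to comparable ranges, which the description does not specify.

```python
def weighted_cost(t, p, alpha, beta):
    """Weighted sum of delay t and power consumption p."""
    return alpha * t + beta * p


def select_by_weighted_sum(candidates, alpha, beta, threshold=None):
    """Return the split point id with the smallest weighted sum.

    candidates: list of (split_id, delay, power) tuples.
    If a threshold is given and the best weighted sum exceeds it, return None.
    """
    best = min(candidates, key=lambda c: weighted_cost(c[1], c[2], alpha, beta))
    if threshold is not None and weighted_cost(best[1], best[2], alpha, beta) > threshold:
        return None
    return best[0]


# Hypothetical per-split-point delay and power consumption (arbitrary units).
candidates = [(1, 0.9, 1.0), (2, 0.5, 2.0), (3, 0.7, 4.0)]
latency_first = select_by_weighted_sum(candidates, alpha=0.8, beta=0.2)
power_first = select_by_weighted_sum(candidates, alpha=0.2, beta=0.8)
print(latency_first, power_first)
```

  • With these made-up values, a latency-sensitive service selects split point 2, while a power-sensitive terminal device selects split point 1, which shows how the same candidates can yield different target split points under different weights.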
  • the radio access network device sends indication information to the terminal device, where the indication information indicates the target split point.
  • the terminal device receives the indication information.
  • the wireless access network device After the wireless access network device determines the target split point, it can notify the terminal device of the target split point.
  • the radio access network device may send indication information to the terminal device.
  • the indication information may specifically include an identifier of the target split point, such as an index of the target split point and other information that can be used to uniquely identify a split point.
  • the indication information may be carried in MAC-CE or downlink control information (DCI). This application does not limit the signaling used to carry the indication information.
  • the radio access network device can notify the terminal device of split point A through indication information.
  • After the terminal device determines the first computing task based on the split point A, it can execute the first computing task and transmit the intermediate data obtained thereby to the first device. After receiving the intermediate data, the first device needs to use the intermediate data as the input of the local second computing task to continue the calculation. Therefore, the first device also needs to determine the second computing task based on the target split point.
  • each segmentation point and its identification are consistent in the terminal device and the first device. Therefore, if the first device can learn the target segmentation point, it can determine the second computing task.
  • in step 540, the terminal device sends the intermediate data obtained by performing the first computing task and the above indication information to the first device.
  • the first device receives the intermediate data and indication information.
  • the first device After receiving the indication information, the first device can determine the target dividing point and then determine the second computing task.
  • the first device may use the intermediate data received from the terminal device as an input of the second computing task to perform the second computing task.
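  • The handover of intermediate data at the target split point can be pictured with a toy sequential model. The four layers below are stand-ins written as plain Python functions, and `split_model` and `run_layers` are hypothetical helper names; the sketch only assumes that both sides hold the same model and the same numbering of split points.

```python
def run_layers(layers, data):
    """Apply the given layers in order and return the final output."""
    for layer in layers:
        data = layer(data)
    return data


def split_model(layers, split_id):
    """Split point k sits after layer k: (first task, second task)."""
    return layers[:split_id], layers[split_id:]


# A toy four-layer model shared by the terminal device and the first device.
model = [
    lambda x: [v * 2 for v in x],
    lambda x: [v + 1 for v in x],
    lambda x: [max(v, 0) for v in x],
    lambda x: sum(x),
]

target_split = 2
first_task, second_task = split_model(model, target_split)

intermediate = run_layers(first_task, [1, -3])   # executed on the terminal device
result = run_layers(second_task, intermediate)   # executed on the first device
print(intermediate, result)
```

  • Running the two parts in sequence gives the same result as running the whole model on one device, which is why both sides need a consistent view of the split point identifiers.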
  • the indication information of the target split point may also be directly sent by the radio access network device to the first device.
  • the method further includes: the radio access network device sending the indication information to the first device.
  • the first device receives the indication information.
  • the terminal device does not need to send the indication information in step 540, but only sends intermediate data.
  • the radio access network device can determine the target split point based on the transmission data amount and calculation amount corresponding to the terminal device and the first computing task, and the channel status between the terminal device and the radio access network device. The target split point can thus be analyzed from multiple perspectives, such as the power consumption of transmission and calculation and the delay of transmission and calculation, and can be reasonably determined according to different needs. Since the radio access network device can obtain the channel status between the terminal device and itself in real time, and its perception of changes in the channel status can reach the millisecond level, it can adjust the target split point in a more timely and effective manner as the channel status changes.
  • the radio access network device can also consider the mutual interference between multiple terminal devices in the cell and adjust the transmission power consumption of the terminal devices to reduce interference and obtain better transmission quality; it can also combine the total bandwidth in the cell and the rate requirements to adjust the transmission power consumption of the terminal devices and achieve optimal transmission of system-level services.
  • the method provided by this application is described by taking the radio access network to determine the target split point as an example. It can be understood that in addition to the wireless access network, the core network equipment can also obtain the transmission rate of the terminal equipment in real time, and therefore can also be used to determine the target split point. The method provided by this application will be described below by taking core network equipment to determine the target split point as an example.
  • FIG. 8 is a schematic flowchart of a computing task dividing method 800 provided by another embodiment of the present application. It can be understood that in Figure 8, the core network device, the terminal device, and the first device are mainly used as the execution subjects of the interactive representation as an example to illustrate the method, but this application does not limit the execution subjects of the interactive representation.
  • the core network device in Figure 8 can also be a chip, chip system, or processor that supports the core network device in implementing the method, or a logic module or software that can realize all or part of the functions of the core network device; the terminal device in Figure 8 can also be a chip, chip system, or processor that supports the terminal device in implementing the method, or a logic module or software that can realize all or part of the functions of the terminal device; the first device in Figure 8 can also be a chip, chip system, or processor that supports the first device in implementing the method, or a logic module or software that can realize all or part of the functions of the first device.
  • the method 800 shown in FIG. 8 includes steps 810 to 840. Each step in Figure 8 is explained in detail below.
  • the core network device obtains the transmission data amount and calculation amount of the terminal device corresponding to the first calculation task.
  • the first calculation task is obtained by dividing the calculation tasks of the neural network model based on the dividing points.
  • the following exemplarily provides an implementation method for the core network device to obtain the transmission data amount and calculation amount corresponding to the terminal device and the first computing task:
  • the core network device receives first information from the terminal device.
  • the first information indicates the above-mentioned transmission data amount and calculation amount, which may correspond to 810a in the figure.
  • the first information used to carry the transmission data amount and calculation amount may be, for example, NAS signaling.
  • the core network device receives second information from the first device.
  • the second information indicates the above-mentioned transmission data amount and calculation amount, which may correspond to 810b in the figure.
  • The core network device receives third information from the first device, where the third information indicates the transmission data amount and calculation amount corresponding to the second computing task. Based on the transmission data amount and calculation amount corresponding to the second computing task, the core network device determines the transmission data amount and calculation amount corresponding to the first computing task. This may correspond to 810c in the figure.
  • The above-mentioned second information and third information may be sent by a server (that is, one example of the first device) to the core network device through the N33 interface, or may be sent by another terminal device (that is, another example of the first device) to the core network device through signaling such as RRC signaling or a MAC-CE.
  • step 810 is similar to step 510 in the previous method 500. Please refer to the relevant description above. No further details will be given here.
  • the core network device determines the segmentation point as the target segmentation point based on the above-mentioned transmission data amount, the above-mentioned calculation amount, and the transmission rate between the terminal device and the wireless access network device.
  • the core network device is used to determine the target split point.
  • the core network equipment can detect the average transmission rate through the QoS flow configured for the service flow. Therefore, the core network equipment can also determine the target split point based on the detected average transmission rate and the above-mentioned transmission data volume and calculation amount.
  • The core network device can also obtain the channel state between the terminal device and the radio access network device from the radio access network device in real time. Therefore, the core network device can also determine the transmission rate between the terminal device and the radio access network device based on the obtained channel state and, combined with the above-mentioned transmission data amount and calculation amount, determine the target split point.
  • the core network equipment can determine the target split point based on delay and/or power consumption.
  • delay includes calculation delay and transmission delay.
  • the calculation delay is related to the computing power of the terminal device.
  • the method further includes: the core network device receiving computing power information from the terminal device, where the computing power information indicates the computing capability of the terminal device.
  • the terminal device sends the computing power information to the core network device.
  • Step 820 may specifically include: the core network device determines that the split point is the target split point.
  • the computing power information can be carried in NAS signaling.
  • Step 820 is similar to step 520 in the method 500 described above. Refer to the relevant description above; details are not repeated here.
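  • As a minimal sketch of a delay-based variant of step 820 (an illustration under stated assumptions, not the method as claimed), the following Python assumes that the delay of each candidate split point equals the calculation amount divided by the terminal's computing capability plus the transmission data amount divided by the average transmission rate. The names `Candidate`, `choose_split_point`, `device_flops`, and `avg_rate_bps` are hypothetical, and all numbers are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    split_point: int   # index of the candidate split point
    tx_bits: float     # transmission data amount of the terminal-side task, in bits
    flops: float       # calculation amount of the terminal-side task, in floating-point operations


def choose_split_point(cands: List[Candidate], device_flops: float, avg_rate_bps: float) -> int:
    """Return the candidate split point with the lowest estimated delay."""
    def delay(c: Candidate) -> float:
        # Assumed model: calculation delay + transmission delay.
        return c.flops / device_flops + c.tx_bits / avg_rate_bps
    return min(cands, key=delay).split_point


# Hypothetical candidates: later split points compute more but transmit less.
cands = [Candidate(1, 8e6, 1e9), Candidate(2, 2e6, 4e9), Candidate(3, 1e6, 9e9)]
slow_link = choose_split_point(cands, device_flops=1e11, avg_rate_bps=1e8)
fast_link = choose_split_point(cands, device_flops=1e11, avg_rate_bps=1e9)
```

  • With these made-up values the selected split point differs between the slower and the faster link, which is the kind of adjustment to a changing transmission rate that the description refers to.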
  • In step 830, the core network device sends indication information to the terminal device, where the indication information indicates the target split point.
  • the terminal device receives the indication information.
  • the indication information may be carried in NAS signaling.
  • In step 840, the terminal device sends the intermediate data obtained by the first computing task and the above-mentioned indication information to the first device, and accordingly, the first device receives the intermediate data and the indication information.
  • Steps 830 and 840 are similar to steps 530 and 540 in the method 500 described above. Refer to the relevant description above; details are not repeated here.
  • Because the core network device can determine the target split point based on the transmission data amount and calculation amount of the terminal device corresponding to the first computing task and on the transmission rate between the terminal device and the radio access network device, it can analyze the choice from multiple perspectives, such as the power consumption and the delay of transmission and calculation, and thus reasonably determine the target split point according to different requirements. Because the core network device can detect the average transmission rate through the QoS flow configured for the service flow, and can also obtain the channel state between the terminal device and the radio access network device from the radio access network device in real time, it can adjust the target split point promptly and effectively as the channel state changes.
  • The numbering of the steps in Figure 5 and Figure 8 does not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.
  • each step in the processes shown in Figures 5 and 8 is only an example, and does not necessarily mean that every step must be performed.
  • Those skilled in the art can make simple changes based on the same concept and the process shown in Figure 5 or Figure 8, such as adjusting the execution order of some steps, or adding or removing steps, to implement the method provided by this application. Such changes should all fall within the protection scope of this application.
  • Figure 9 is a schematic block diagram of a communication device provided by an embodiment of the present application.
  • the communication device 900 may include: an acquisition module 910 , a processing module 920 and an interface module 930 .
  • the communication device 900 may be used to perform the steps performed by the radio access network device in the above-mentioned method 500 for dividing computing tasks, or may also be used to perform the steps performed by the core network device in the above-mentioned method 800 for dividing computing tasks.
  • The acquisition module 910 is used to perform step 510 to obtain the transmission data amount and calculation amount of the terminal device corresponding to the first computing task, where the first computing task is obtained by splitting the computing task of the neural network model at the split point; the processing module 920 is used to perform step 520 to determine, based on the transmission data amount, the calculation amount, and the channel state between the terminal device and the radio access network device, that the split point is the target split point; and the interface module 930 is used to perform step 530 to send indication information to the terminal device, where the indication information indicates the target split point.
  • The acquisition module 910 is used to perform step 810 to obtain the transmission data amount and calculation amount of the terminal device corresponding to the first computing task, where the first computing task is obtained by splitting the computing task of the neural network model at the split point; the processing module 920 is used to perform step 820 to determine, based on the transmission data amount, the calculation amount, and the transmission rate between the terminal device and the radio access network device, that the split point is the target split point; and the interface module 930 is used to perform step 830 to send indication information to the terminal device, where the indication information indicates the target split point.
  • each functional module in various embodiments of the present application can be integrated into a processor, or can exist physically alone, or two or more modules can be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or software function modules.
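  • The division into an acquisition module, a processing module, and an interface module can be pictured as follows. This is a minimal sketch assuming a software realization; the class `CommunicationDevice`, its field names, and the in-memory `sent` list that stands in for the interface module are all hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Maps a candidate split point to (transmission data amount in bits, calculation amount in FLOPs).
Amounts = Dict[int, Tuple[float, float]]


@dataclass
class CommunicationDevice:
    acquire: Callable[[], Amounts]      # plays the role of the acquisition module
    decide: Callable[[Amounts], int]    # plays the role of the processing module
    sent: List[dict] = field(default_factory=list)  # stands in for the interface module

    def run_once(self) -> int:
        amounts = self.acquire()        # obtain the amounts
        target = self.decide(amounts)   # determine the target split point
        self.sent.append({"target_split_point": target})  # "send" the indication information
        return target


# Hypothetical wiring: pick the split point with the smallest transmission data amount.
device = CommunicationDevice(
    acquire=lambda: {1: (8e6, 1e9), 2: (2e6, 4e9)},
    decide=lambda a: min(a, key=lambda k: a[k][0]),
)
target = device.run_once()
```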
  • Figure 10 is another schematic block diagram of a communication device provided by an embodiment of the present application.
  • the communication device 1000 may include at least one processor 1010, which is used to implement the functions of the radio access network equipment or the functions of the core network equipment in the method provided by the embodiment of the present application.
  • the communication device 1000 may also include at least one memory 1020 for storing program instructions and/or data.
  • Memory 1020 and processor 1010 are coupled.
  • the coupling in the embodiment of this application is an indirect coupling or communication connection between devices, units or modules, which may be in electrical, mechanical or other forms, and is used for information interaction between devices, units or modules.
  • the processor 1010 may cooperate with the memory 1020.
  • Processor 1010 may execute program instructions stored in memory 1020 . At least one of the at least one memory may be included in the processor.
  • The communication device 1000 may also include a communication interface 1030 for communicating with other devices through a transmission medium, so that the communication device 1000 can communicate with other devices.
  • the other devices may include a terminal device and a first device.
  • the communication interface 1030 may be, for example, a transceiver, an interface, a bus, a circuit, or a device capable of implementing transceiver functions.
  • The processor 1010 may use the communication interface 1030 to send and receive data and/or information, in order to implement the method performed by the radio access network device in the embodiment corresponding to Figure 4, or the method performed by the core network device in the embodiment corresponding to Figure 8.
  • The processor 1010 can be used to control the communication interface 1030 to obtain the transmission data amount and calculation amount of the terminal device corresponding to the first computing task; to determine, based on the transmission data amount, the calculation amount, and the channel state between the terminal device and the radio access network device, that the split point is the target split point; and to control the communication interface 1030 to send indication information to the terminal device, where the indication information indicates the target split point.
  • The processor 1010 can be used to control the communication interface 1030 to obtain the transmission data amount and calculation amount of the terminal device corresponding to the first computing task; to determine, based on the transmission data amount, the calculation amount, and the transmission rate between the terminal device and the radio access network device, that the split point is the target split point; and to control the communication interface 1030 to send indication information to the terminal device, where the indication information indicates the target split point.
  • connection medium between the processor 1010, the memory 1020 and the communication interface 1030 is not limited in the embodiment of the present application.
  • the processor 1010, the memory 1020 and the communication interface 1030 are connected through a bus 1040.
  • the bus 1040 is represented by a thick line in FIG. 10 , and the connection methods between other components are only schematically illustrated and not limited thereto.
  • the bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in Figure 10, but it does not mean that there is only one bus or one type of bus.
  • FIG 11 is a schematic structural diagram of a base station provided by an embodiment of the present application.
  • the base station 1100 shown in Figure 11 has the functions of the radio access network equipment shown in Figure 4, and the base station 1100 can be applied in the communication systems shown in Figures 1 to 3.
  • The base station 1100 may include one or more radio frequency units, such as a remote radio unit (RRU) 1110, and one or more baseband units (BBU) (also called distributed units (DU)) 1120.
  • the RRU 1110 can be called a transceiver unit and can correspond to the acquisition module 910 and the interface module 930 in Figure 9, or to the communication interface 1030 in Figure 10.
  • The RRU 1110 may also be called a transceiver or a transceiver circuit, and may include at least one antenna 1111 and a radio frequency unit 1112.
  • The RRU 1110 may include a receiving unit and a transmitting unit; the receiving unit may correspond to a receiver (or receiving circuit), and the transmitting unit may correspond to a transmitter (or transmitting circuit).
  • The RRU 1110 is mainly used for transmitting and receiving radio frequency signals and for converting between radio frequency signals and baseband signals, for example to perform the operations of the radio access network device in the above method embodiments, such as sending indication information to the terminal device.
  • the BBU 1120 part is mainly used for baseband processing, base station control, etc.
  • the RRU 1110 and the BBU 1120 may be physically installed together or physically separated, that is, a distributed base station.
  • The BBU 1120 is the control center of the base station and can also be called a processing unit. It can correspond to the processing module 920 in Figure 9 or the processor 1010 in Figure 10, and is mainly used to complete baseband processing functions such as channel coding, multiplexing, modulation, and spread spectrum.
  • The BBU (processing unit) can be used to control the base station to execute the operation process of the radio access network device in the above method embodiment, for example, determining the target split point and generating indication information.
  • The BBU 1120 may be composed of one or more boards. Multiple boards may jointly support a radio access network of a single access standard (such as an LTE network), or may separately support radio access networks of different access standards (such as an LTE network, a 5G network, or other networks).
  • the BBU 1120 also includes a memory 1121 and a processor 1122.
  • the memory 1121 is used to store necessary instructions and data.
  • the processor 1122 is used to control the base station to perform necessary actions, for example, to control the base station to perform the operation process of the radio access network equipment in the above method embodiment.
  • the memory 1121 and processor 1122 may serve one or more single boards. In other words, the memory and processor can be set independently on each board. It is also possible for multiple boards to share the same memory and processor. In addition, necessary circuits can also be installed on each board.
  • the base station 1100 shown in Figure 11 can implement various processes involving radio access network equipment in the method embodiment shown in Figure 4.
  • the operations and/or functions of each module in the base station 1100 are respectively intended to implement the corresponding processes in the above method embodiments.
  • The BBU 1120 can be used to perform actions implemented internally by the radio access network device, and the RRU 1110 can be used to perform the sending, receiving, and forwarding actions of the radio access network device. For details, refer to the description in the previous method embodiments; they are not repeated here.
  • the base station 1100 shown in Figure 11 is only one possible form of radio access network equipment, and should not constitute any limitation on this application.
  • the method provided by this application can be applied to other forms of wireless access network equipment.
  • For example, the radio access network device may include an active antenna unit (AAU); it may also include a centralized unit (CU) and/or a DU; or it may include a BBU and an adaptive radio unit (ARU), or a BBU.
  • This application does not limit the specific form of the wireless access network equipment.
  • the present application also provides a chip system.
  • the chip system includes at least one processor for implementing the functions involved in the method performed by the wireless access network device in the embodiment shown in Figure 4, or causing the computer to execute the method shown in Figure 4.
  • the chip system further includes a memory, the memory is used to store program instructions and data, and the memory is located within the processor or outside the processor.
  • the chip system can be composed of chips or include chips and other discrete devices.
  • This application also provides a communication system, including the aforementioned wireless access network device, terminal device and first device.
  • This application also provides a communication system, including the aforementioned core network equipment, terminal equipment and first equipment.
  • This application also provides a computer-readable storage medium.
  • The computer-readable storage medium stores a computer program (which can also be called code or instructions). When the computer program is run, the method executed by the radio access network device in the embodiment shown in Figure 4 is executed, or the method executed by the core network device in the embodiment shown in Figure 8 is executed.
  • The computer program product includes a computer program (which can also be called code or instructions). When the computer program is run, it causes a computer to execute the method performed by the radio access network device in the embodiment shown in Figure 4.
  • each step of the above method embodiment can be completed through an integrated logic circuit of hardware in the processor or instructions in the form of software.
  • The above-mentioned processor can be a general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or any combination thereof.
  • a general-purpose processor can be a microprocessor or any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application can be directly implemented by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers, or the like.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
  • Non-volatile memory can be read-only memory (ROM), programmable ROM (PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
  • Volatile memory can be random access memory (RAM), which is used as an external cache. Many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous link dynamic random access memory (SLDRAM), and direct rambus RAM.
  • the methods provided by the above embodiments can be implemented in whole or in part through software, hardware, firmware, or any combination thereof.
  • When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product may include one or more computer instructions.
  • the computer program instructions When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic disk), optical media (eg, DVD), or semiconductor media (eg, solid state disk (SSD)), etc.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • Multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • The technical solution of the present application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of this application.
  • The aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory, random access memory, a magnetic disk, an optical disc, and other media that can store program code.


Abstract

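A method for splitting a computing task and a related apparatus, applicable to extended reality (XR) services or other low-latency services. The method includes: obtaining a transmission data amount and a calculation amount of a terminal device that correspond to a first computing task, where the first computing task is obtained by splitting the computing task of a neural network model at a split point; determining, based on the transmission data amount, the calculation amount, and the channel state between the terminal device and a radio access network device, that the split point is a target split point; and sending indication information to the terminal device, where the indication information indicates the target split point. The method may be performed by a network device such as a radio access network device. Because the network device can obtain the channel state between the terminal device and the radio access network device in real time, it can adjust the target split point promptly and effectively as the channel state changes, so that the computing task is split reasonably, different requirements are met, and user experience is improved.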

Description

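Method for splitting a computing task and related apparatus
This application claims priority to Chinese Patent Application No. 202211020380.6, filed with the Chinese Patent Office on August 24, 2022 and entitled "Method for splitting a computing task and related apparatus", which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the communications field, and more specifically, to a method for splitting a computing task and a related apparatus.
Background
As technology develops, artificial intelligence (AI) is increasingly widely applied in many services, such as extended reality (XR), driverless vehicles, and telemedicine.
Take the XR service as an example. An XR terminal may transmit images or video in the uplink to a server, and the server may use a neural network model, such as a deep neural network (DNN) model, to perform object detection and recognition on the content of the images or video. To reduce the amount of data uploaded by the XR terminal, the XR terminal may preprocess the images or video to be uploaded and upload the preprocessed data to the server. Because the neural network model includes multiple neural network layers, having the XR terminal perform the preprocessing requires splitting the computing task corresponding to the neural network model.
How to reasonably split the computing task of a neural network model has become a technical problem that urgently needs to be solved.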
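Summary
This application provides a method for splitting a computing task and a related apparatus, with a view to reasonably splitting the computing task of a neural network model.
According to a first aspect, this application provides a method for splitting a computing task. The method is applicable to a radio access network device, such as a base station or an access point. The method may be performed by the radio access network device, by a component configured in the radio access network device (such as a chip, a chip system, or a processor), or by a logic module or software that can implement all or some of the functions of the network device. This is not limited in this application.
The method includes: obtaining a transmission data amount and a calculation amount of a terminal device that correspond to a first computing task, where the first computing task is obtained by splitting the computing task of a neural network model at a split point; determining, based on the transmission data amount, the calculation amount, and the channel state between the terminal device and the radio access network device, that the split point is a target split point; and sending indication information to the terminal device, where the indication information indicates the target split point.
A computing task may be defined per service: different services have different computing tasks. A computing task may be executed by a neural network model.
A split point is used to split the computing task of a neural network model; one split point splits it into two computing tasks. This can be understood as follows: because the neural network model includes multiple neural network layers, a split point may lie between any two adjacent neural network layers. With the split point as the boundary, the layers are divided into two parts, each containing one or more neural network layers. The computing tasks corresponding to the layers in each part are the two computing tasks obtained by splitting the computing task of the neural network model.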
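The idea of a split point lying between two adjacent layers can be illustrated with a short sketch. It assumes the model is represented as an ordered list of layer names; the function `split_layers` and the layer names are hypothetical and serve only to show that each part keeps at least one layer.

```python
from typing import List, Tuple


def split_layers(layers: List[str], split_point: int) -> Tuple[List[str], List[str]]:
    """Split an ordered list of layers into two parts.

    split_point = k places the cut between layers[k - 1] and layers[k],
    so both parts hold at least one layer when 1 <= k <= len(layers) - 1.
    """
    if not 1 <= split_point <= len(layers) - 1:
        raise ValueError("split point must lie between two adjacent layers")
    return layers[:split_point], layers[split_point:]


# Hypothetical five-layer model, cut after the second layer.
layers = ["conv1", "conv2", "pool", "fc1", "fc2"]
first_task, second_task = split_layers(layers, 2)
```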
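It can be understood that the computing task of the neural network model may also be split into more computing tasks, for example by using two or more split points. The process is as described above and is not repeated here.
In this application, a split point may be used to split the computing task of a neural network model into two computing tasks so that the two are assigned to different devices for execution. As the position of the split point changes, the two resulting computing tasks change accordingly. To find a reasonable split point, the neural network model may be split at different positions to meet different requirements. For ease of distinction and description, the split point that is determined for splitting the computing task of a given service is referred to herein as the target split point, and the split point corresponding to the first computing task is assumed to be the target split point.
For ease of distinction and description, the two computing tasks obtained by splitting the computing task of the neural network model at the target split point are referred to as the first computing task and the second computing task; the first computing task is the one assigned to the terminal device. The target split point may be determined based on the transmission data amount and calculation amount corresponding to the first computing task and on the channel state between the terminal device and the radio access network device.
The transmission data amount may refer to the size of the data that the terminal device obtains by executing a computing task (such as the first computing task) and needs to transmit to another device (such as the first device described below); it may be expressed in units such as bits or bytes. The calculation amount may refer to the number of floating-point operations the terminal device needs to perform to execute a computing task (such as the first computing task); it may be described by parameters such as the number of floating-point operations. The channel state between the terminal device and the radio access network device may be described by parameters such as the signal to interference and noise ratio (SINR), the reference signal receiving power (RSRP), and the channel quality indicator (CQI). Based on the channel state between the terminal device and the radio access network device, the transmission rate for data transmitted over the channel can be determined.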
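The text states only that a transmission rate can be determined from the channel state, not how. One textbook approximation, used here purely as an assumption, is the Shannon capacity bound computed from the bandwidth and the SINR; real networks typically map a reported CQI to a modulation and coding scheme instead. The function names and parameters below are hypothetical.

```python
import math


def sinr_db_to_linear(sinr_db: float) -> float:
    """Convert an SINR value from decibels to a linear ratio."""
    return 10 ** (sinr_db / 10)


def estimate_rate_bps(bandwidth_hz: float, sinr_db: float) -> float:
    """Upper-bound the achievable rate with the Shannon formula B * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1 + sinr_db_to_linear(sinr_db))


# Hypothetical channel: 1 MHz of bandwidth at 0 dB SINR.
rate = estimate_rate_bps(1e6, 0.0)
```

At 0 dB the linear SINR is 1, so the bound evaluates to one bit per second per hertz of bandwidth.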
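It should be understood that determining the target split point based on the transmission data amount and calculation amount corresponding to the first computing task and the channel state between the terminal device and the radio access network device does not mean that the target split point is determined based only on these. As described above, the two resulting computing tasks change as the position of the split point changes. Therefore, in one possible implementation, the radio access network device may, for split points at different positions, obtain the transmission data amounts and calculation amounts corresponding to the terminal device's computing task resulting from each split point, and determine the target split point from these together with the channel state. It can be understood that if the split point used to obtain the first computing task is the target split point, then determining that target split point necessarily involves the transmission data amount and calculation amount corresponding to the first computing task and the channel state.
Because the radio access network device can determine the target split point based on the transmission data amount and calculation amount of the terminal device corresponding to the first computing task and the channel state between the terminal device and the radio access network device, it can analyze the choice from multiple perspectives, such as the power consumption and the delay of transmission and calculation, and thus reasonably determine the target split point according to different requirements. Because the radio access network device can obtain the channel state between the terminal device and itself in real time and can perceive channel state changes at millisecond granularity, it can adjust the target split point more promptly and effectively as the channel state changes.
For example, in services with high latency and reliability requirements, this solution can reduce transmission delay and improve transmission reliability. In XR services it can reduce stuttering; in autonomous driving and telemedicine services it can improve safety by reducing delay and improving transmission reliability.
As another example, in power-sensitive services, or services that use power-sensitive terminal devices, the method can reduce the power consumption of the terminal device.
In addition, the radio access network device has a comprehensive view of the terminal devices within its coverage, for example the interference among multiple terminal devices in the same cell and the bandwidth required when multiple terminal devices in the same cell transmit simultaneously. The radio access network device can control the power that a terminal device consumes for data transmission, so it can also reduce power consumption and interference through a suitable choice of target split point, meeting the needs of services with strict power requirements. The radio access network device can further take into account the total bandwidth in the cell and, according to the rate requirement of the model split point, adjust the transmission power consumption of terminal devices to achieve system-level optimal service transmission.
In summary, the method provided in this application makes it possible to select the target split point reasonably, meet different requirements, and improve user experience.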
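The following provides, by way of example, several possible implementations of obtaining the transmission data amount and calculation amount of the terminal device corresponding to the first computing task.
In a first possible implementation, obtaining the transmission data amount and calculation amount of the terminal device corresponding to the first computing task includes: receiving first information from the terminal device, where the first information indicates the transmission data amount and the calculation amount.
That is, the radio access network device may obtain the transmission data amount and calculation amount corresponding to the first computing task directly from the terminal device.
In a second possible implementation, obtaining the transmission data amount and calculation amount of the terminal device corresponding to the first computing task includes: receiving second information from a first device, where the second information indicates the transmission data amount and the calculation amount, and the first device is another terminal device or a server.
The first device may be the device that executes the second computing task. In different network architectures, the first device may be a different device. For example, in an architecture consisting of a terminal device, a network, and a server, the first device may be the server; in an architecture consisting of a terminal device, a network, and a terminal device, the first device may be another terminal device.
The radio access network device may also obtain the transmission data amount and calculation amount corresponding to the first computing task from the first device. Because the first device executes the second computing task, it may preconfigure or build the neural network model, so it knows which neural network layers correspond to the first computing task and therefore knows the transmission data amount and calculation amount corresponding to the first computing task.
In a third possible implementation, obtaining the transmission data amount and calculation amount of the terminal device corresponding to the first computing task includes: receiving third information from the first device, where the third information indicates the transmission data amount and calculation amount of the first device corresponding to the second computing task, and the first device is another terminal device or a server; and determining, based on the third information, the transmission data amount and calculation amount corresponding to the first computing task, where the first computing task and the second computing task are obtained by splitting the computing task of the neural network model at the split point.
The radio access network device may also obtain from the first device the transmission data amount and calculation amount corresponding to the second computing task. Because the first device executes the second computing task, it may preconfigure or build the neural network model and therefore knows the transmission data amount and calculation amount corresponding to the second computing task. Based on the neural network model and the transmission data amount and calculation amount received from the first device for the second computing task, the radio access network device can infer the transmission data amount and calculation amount corresponding to the first computing task.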
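The text does not spell out how the amounts for the first computing task are inferred from those of the second. One way to picture it, under assumptions that are not in the source, is that the radio access network device knows the per-layer calculation amounts and per-layer output sizes of the model, locates the split point whose trailing layers match the reported calculation amount, and takes the remainder as the first task. The function `infer_first_task` and both per-layer lists are hypothetical.

```python
import math
from typing import List, Tuple


def infer_first_task(layer_flops: List[float], layer_out_bits: List[int],
                     second_task_flops: float) -> Tuple[int, float, int]:
    """Return (split point, first-task FLOPs, first-task output size in bits).

    Assumes the first task's transmission data amount equals the output size
    of the last layer that runs on the terminal device.
    """
    total = sum(layer_flops)
    for k in range(1, len(layer_flops)):
        # The second task covers layers k..end; compare with the reported amount.
        if math.isclose(sum(layer_flops[k:]), second_task_flops, rel_tol=1e-9):
            return k, total - second_task_flops, layer_out_bits[k - 1]
    raise ValueError("reported calculation amount matches no split point")


# Hypothetical four-layer model; the first device reports 7e9 FLOPs for its part.
k, first_flops, first_bits = infer_first_task(
    [1e9, 2e9, 3e9, 4e9], [800, 400, 200, 10], 7e9)
```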
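As described above, the radio access network device may, for split points at different positions, obtain the transmission data amounts and calculation amounts corresponding to the terminal device's computing task resulting from each split point, and determine the target split point from these together with the channel state. In this implementation, the radio access network device may use any of the three possible implementations above to obtain the transmission data amount and calculation amount corresponding to the terminal device's computing task for each of the split points.
With reference to the first aspect, in some possible implementations of the first aspect, the target split point is determined based on at least one of delay or power consumption, where the delay is the time required by the terminal device to execute the first computing task, and the power consumption is the power consumption required by the terminal device to execute the first computing task.
The delay includes a calculation delay and a transmission delay. The calculation delay is the time the terminal device needs to complete the calculation amount of the first computing task; the transmission delay is the time the terminal device needs to transmit the data produced by the first computing task. The calculation delay may be determined from the calculation amount of the first computing task and the computing capability of the terminal device. The transmission delay may be determined from the transmission data amount of the first computing task and the channel state between the terminal device and the radio access network device, for example from the transmission data amount of the first computing task and the transmission rate between the terminal device and the radio access network device.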
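The two delay components can be written out as a short calculation. This is a sketch under the assumption that the calculation delay is the calculation amount divided by the terminal's computing capability in floating-point operations per second, and that the transmission delay is the transmission data amount divided by the transmission rate; the function and parameter names are hypothetical.

```python
def total_delay_s(flops_required: float, device_flops: float,
                  tx_bits: float, rate_bps: float) -> float:
    """Delay of the terminal-side task: calculation delay plus transmission delay."""
    calculation_delay = flops_required / device_flops  # seconds spent computing
    transmission_delay = tx_bits / rate_bps            # seconds spent transmitting
    return calculation_delay + transmission_delay


# Hypothetical values: 2 GFLOPs on a 1 TFLOPS device, 8 Mbit over a 100 Mbit/s link.
delay = total_delay_s(2e9, 1e12, 8e6, 1e8)
```

With these made-up numbers the transmission term dominates, which hints at why a later split point with a smaller output can pay off on a slow link.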
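The power consumption required by the terminal device to execute the first computing task includes the calculation power consumption and the transmission power consumption required to execute the first computing task. The calculation power consumption may be determined from the calculation amount of the first computing task; for example, it is proportional to the calculation amount. The transmission power consumption may be determined from the transmission data amount of the first computing task and the channel state between the terminal device and the radio access network device.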
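A comparable sketch for power consumption follows. The proportionality between calculation power consumption and calculation amount comes from the text; expressing consumption as energy in joules, and modelling the transmission part as transmit power multiplied by the time on air, are assumptions made here for illustration. The parameter names `joules_per_flop` and `tx_power_w` are hypothetical.

```python
def total_energy_j(flops_required: float, joules_per_flop: float,
                   tx_bits: float, rate_bps: float, tx_power_w: float) -> float:
    """Energy of the terminal-side task: calculation part plus transmission part."""
    # Calculation part: proportional to the calculation amount.
    calculation_energy = joules_per_flop * flops_required
    # Transmission part (assumed model): transmit power times transmission time.
    transmission_energy = tx_power_w * (tx_bits / rate_bps)
    return calculation_energy + transmission_energy


# Hypothetical values: 2 GFLOPs at 0.1 nJ per operation, 8 Mbit at 100 Mbit/s and 0.2 W.
energy = total_energy_j(2e9, 1e-10, 8e6, 1e8, 0.2)
```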
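The radio access network device may determine the target split point based on delay, so that the delay incurred when the two computing tasks obtained at the target split point are assigned to the terminal device and the first device, respectively, satisfies a preset condition. The radio access network device may also determine the target split point based on power consumption, so that the power consumption incurred when the two computing tasks are assigned to the terminal device and the first device, respectively, satisfies another preset condition. The radio access network device may also consider delay and power consumption together, so that the delay and power consumption incurred when the two computing tasks are assigned to the terminal device and the first device, respectively, satisfy yet another preset condition. Whether the radio access network device determines the target split point based on delay, on power consumption, or on both depends on the requirements.
Optionally, the target split point is determined based on the delay. Determining, based on the transmission data amount, the calculation amount, and the channel state between the terminal device and the radio access network device, that the split point is the target split point includes: determining the delay based on computing power information of the terminal device, the transmission data amount, the calculation amount, and the channel state; and determining, based on the delay, that the split point is the target split point.
As described above, the delay includes the calculation delay and the transmission delay; the calculation delay may be determined from the computing power information of the terminal device and the calculation amount, and the transmission delay may be determined from the transmission data amount and the channel state. The radio access network device can therefore determine the delay based on the computing power information of the terminal device, the transmission data amount, the calculation amount, and the channel state, and then determine the target split point based on the delay.
In one possible implementation of determining the target split point based on delay, if the delay of the computing task (such as the first computing task) determined by a given split point is below a preset threshold (referred to, for ease of distinction and description, as the first preset threshold), the split point corresponding to that computing task is determined as the target split point.
In another possible implementation of determining the target split point based on delay, the computing task of the neural network model is split at different split points to obtain the computing tasks corresponding to the different split points, and the split point corresponding to the computing task with the lowest delay (such as the first computing task) is determined as the target split point.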
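For services with strict latency and reliability requirements, determining the target split point based on latency can reduce transmission latency and improve transmission reliability. In XR services, for example, this can reduce stuttering; in autonomous driving and telemedicine services, lower latency and higher transmission reliability improve safety.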
可选地,所述目标分割点基于所述功耗确定;所述基于所述传输数据量、所述计算量,以及所述终端设备与无线接入网设备之间的信道状态,确定所述分割点为目标分割点,包括:基于所述传输数据量、所述计算量,以及所述信道状态,确定所述功耗;基于所述功耗,确定所述分割点为所述目标分割点。
如前所述,功耗包括计算功耗和传输功耗,计算功耗可以根据计算量确定,传输功耗可以根据传输数据量和信道状态确定。因此,无线接入网设备可以基于传输数据量、所述计算量,以及信道状态,确定功耗,进而基于功耗确定目标分割点。
基于功耗确定目标分割点的一种可能的实现方式是,若基于某一分割点确定的计算任务(如第一计算任务)的功耗低于某一预设门限(为便于区分和说明,记为第二预设门限),则将该计算任务对应的分割点确定为目标分割点。
基于功耗确定目标分割点的另一种可能的实现方式是,基于不同的分割点对神经网络模型的计算任务进行分割,得到对应于不同分割点的计算任务,将其中功耗最低的计算任务(如第一计算任务)对应的分割点确定为目标分割点。
在对功耗较敏感的业务或使用了对功耗较敏感的终端设备的业务中,基于功耗来确定目标分割点可以节省终端设备的功耗。并且,无线接入网设备可以通过控制终端设备的传输功耗,减少多个终端设备之间的相互干扰,还可进一步基于小区内的总带宽,实现系统级的业务传输最优。
可选地,所述目标分割点基于所述时延和所述功耗确定;所述基于所述传输数据量、所述计算量,以及所述终端设备与无线接入网设备之间的信道状态,确定所述分割点为目标分割点,包括:基于所述终端设备的算力信息、所述传输数据量、所述计算量,以及所述信道状态,确定所述时延和所述功耗;基于所述时延和所述功耗,确定所述分割点为所述目标分割点。
一种可能的设计是,将时延和功耗分别施加不同的权重,以获得时延和功耗的加权和。无线接入网设备可以基于时延和功耗的加权和,来确定目标分割点。
基于时延和功耗的加权和,确定目标分割点的一种可能的实现方式是,若基于某一分割点确定的计算任务(如第一计算任务)的时延和功耗的加权和低于某一预设门限(为便于区分和说明,记为第三预设门限),则将该计算任务对应的分割点确定为目标分割点。
基于时延和功耗的加权和,确定目标分割点的另一种可能的实现方式是,基于不同的分割点对神经网络模型的计算任务进行分割,得到对应于不同分割点的计算任务,将其中时延和功耗的加权和最低的计算任务(如第一计算任务)对应的分割点确定为目标分割点。
无线接入网设备可以同时兼顾时延和功耗,对时延和功耗施加不同的权重来确定目标分割点,以满足不同的需求,提升用户体验。
可以看到,计算时延与终端设备的计算能力相关,因此,在基于时延确定目标分割点,或基于时延和功耗确定目标分割点的情况下,该无线接入网设备还可以进一步获取终端设备的算力信息。该算力信息可用于表征终端设备的计算能力。
在一种可能的设计中,算力信息包括终端设备完成预定义的测试任务所需的时间或终端设备的计算能力中的至少一项。
其中,预定义的测试任务包括:基于预定义的测试神经网络模型、预定义的计算类型或预定义的输入数据中的至少一项而执行的任务。也就是说,不同的终端设备可以基于相同的测试任务进行测试,以获得不同的终端设备完成同一测试任务所需的时间,进而可以根据时间来推出不同的终端设备的计算能力。
计算能力还可以通过终端设备每秒浮点运算次数(floating-point operations per second,FLOPS)来表征。每秒浮点运算次数也就是每秒所能够执行的浮点运算次数的峰值。终端设备可以将每秒浮点运算次数上报无线接入网设备。
通过预定义测试任务,可以使得各终端设备基于同一测试任务进行测试,进而将不同终端设备的计算能力通过完成测试任务的时间予以区分,从而方便了解不同终端设备的计算能力。
One possible way for the terminal device to report its FLOPS is to report the FLOPS value directly; another possible way is to report information that identifies the FLOPS value.
示例性地,用于标识每秒浮点运算次数的信息可以是能力等级。不同的每秒浮点运算次数可以与不同的能力等级对应。每秒浮点运算次数与能力等级的对应关系可以是预定义的,比如协议预定义。终端设备可以根据该对应关系,将与每秒浮点运算次数对应的能力等级上报无线接入网设备。
通过定义不同的算力信息,便于无线接入网设备更全面地了解终端设备的计算能力,从而有利于合理地确定目标分割点。
可选地,该方法还包括:从终端设备接收算力信息。
通过从终端设备接收算力信息,无线接入网设备可以根据终端设备的计算能力准确地估计计算时延,进而有利于合理地确定目标分割点。
第二方面,本申请提供了一种通信装置,可以实现上述第一方面或第一方面任一种可能实现方式中的方法。该装置包括用于实现第一方面或第一方面任一种可能实现方式中的方法的模块或单元。该装置包括的单元或模块可以通过软件和/或硬件方式实现。该装置例如可以为无线接入网设备,也可以为支持无线接入网设备实现上述方法的芯片、芯片系统、或处理器等,还可以为能实现无线接入网设备的全部或部分功能的逻辑模块或软件。
第三方面,本申请提供了一种通信装置,包括处理器,可用于通过逻辑电路或执行代码指令,以实现第一方面或第一方面任一种可能实现方式中的方法。
可选地,该通信装置还包括通信接口,处理器与通信接口耦合。所述通信接口用于接收来自所述装置之外的其它通信装置的信号并传输至所述处理器,或将来自所述处理器的信号发送给所述装置之外的其它通信装置,示例性地,通信接口可以是收发器、电路、总线、模块或其它类型的通信接口。
可选地,该通信装置还包括存储器,处理器与存储器耦合。所述存储器用于保存程序指令和数据。
可选地,该通信装置为无线接入网设备,或配置在无线接入网设备中的芯片、芯片系统、或处理器。
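In a fourth aspect, this application provides a computer-readable storage medium storing a computer program or instructions which, when run on a computer, cause the method in the first aspect or any possible implementation of the first aspect to be performed.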
第五方面,本申请提供了一种计算机程序产品,所述计算机程序产品包括:计算机程序(也可以称为代码,或指令),当所述计算机程序被运行时,使得上述第一方面或第一方面任一种可能实现方式中的方法被执行。
应当理解的是,本申请的第二方面至第五方面与本申请的第一方面的技术方案相对应,各方面及对应的可行实施方式所取得的有益效果相似,不再赘述。
附图说明
图1是适用于本申请实施例提供的方法的通信系统的示意图;
图2是适用于本申请实施例提供的方法的通信系统的另一示意图;
图3是适用于本申请实施例提供的方法的通信系统的又一示意图;
图4是本申请实施例提供的神经网络模型的示意图;
图5是本申请实施例提供的计算任务的分割方法的示意性流程图;
图6和图7是本申请实施例提供的基于不同位置的分割点对神经网络模型的计算任务进行分割的示意图;
图8是本申请另一实施例提供的计算任务的分割方法的示意性流程图;
图9和图10是本申请实施例提供的通信装置的示意性框图;
图11是本申请实施例提供的基站的结构示意图。
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。
本申请提供的方法可以应用于各种通信系统,例如:长期演进(long term evolution,LTE)系统、LTE频分双工(frequency division duplex,FDD)系统、LTE时分双工(time division duplex,TDD)系统、第五代(5th generation,5G)移动通信系统或新无线接入技术(new radio access technology,NR)。其中,5G移动通信系统可以包括非独立组网(non-standalone,NSA)和/或独立组网(standalone,SA)。
本申请提供的技术方案还可以应用于机器类通信(machine type communication,MTC)、机器间通信长期演进技术(long term evolution-machine,LTE-M)、设备到设备(device-to device,D2D)网络、机器到机器(machine to machine,M2M)网络、物联网(internet of things,IoT)网络或者其他网络。其中,IoT网络例如可以包括车联网。其中,车联网系统中的通信方式统称为车到其他设备(vehicle to X,V2X,X可以代表任何事物)系统,例如,该V2X可以包括:车辆到车辆(vehicle to vehicle,V2V)通信,车辆与基础设施(vehicle to infrastructure,V2I)通信、车辆与行人之间的通信(vehicle to pedestrian,V2P)或车辆与网络(vehicle to network,V2N)通信等。
本申请提供的技术方案还可以应用于未来的通信系统,如第六代(6th Generation,6G)移动通信系统等。本申请对此不作限定。
本申请的实施例中,无线接入网(radio access network,RAN)设备可以是任意一种具有无线收发功能的设备。无线接入网设备可以是采用第三代合作伙伴计划(3rd generation partnership project,3GPP)技术接入网络的设备,例如包括但不限于:基站 (base station)、节点B(NodeB或NB)、LTE中的演进型节点B(evolved Node B,eNB)、5G(如NR)系统中的gNB或收发点(transmission reception point,TRP)、第六代(6th generation,6G)移动通信系统中的下一代基站、未来移动通信系统中的基站等;也可以是完成基站部分功能的模块或单元,例如,可以是集中式单元(central unit,CU),也可以是分布式单元(distributed unit,DU)。无线接入网设备还可以是宏基站、微基站、微微基站、小站、气球站、室内站、中继站、无线中继节点、无线回传节点等等。该无线接入网设备也可以是采用非3GPP技术接入网络的设备,例如包括但不限于无线保真(wireless fidelity,Wi-Fi)系统中的接入点(access point,AP)等。可以理解,本申请中的无线接入网设备的全部或部分功能也可以通过在硬件上运行的软件功能来实现,或者通过平台(例如云平台)上实例化的虚拟化功能来实现。本申请对于无线接入网设备的具体形式不作限定。
核心网设备可用于完成注册、连接、会话管理三大功能,主要包括网络开放功能(network exposure function,NEF)网元、策略控制功能(policy control function,PCF)网元、应用功能(application function,AF)网元、接入与移动性管理功能(access and mobility management function,AMF)网元、会话管理功能模块(session management function,SMF)网元以及用户平面功能(user plane function,UPF)网元等。
其中,UPF是数据网络的接口,可以完成用户面数据转发、基于会话/流级的计费统计、带宽限制等功能,用户数据可通过该网元接入到网络中。
NEF网元可用于向AF网元开放由3GPP网络功能提供的业务和能力,同时也可以让AF向3GPP网络功能提供信息。
AF网元主要传递应用侧对网络侧的需求,可视为应用服务器的代理。
SMF网元主要进行会话管理、用户设备的IP地址分配和管理、UPF选择等。
PCF网元主要进行计费策略与服务质量(quality of service,QoS)策略的策略控制等。
AMF网元主要进行移动性管理、接入鉴权/授权等功能。此外,AMF网元还可负责在终端设备与PCF间传递用户策略。
各网元之间通过接口通信。例如,NEF网元和AF网元之间的接口为N33接口。终端设备和AMF网元间的信令面接口为N1接口,由于终端设备不能直接与核心网交互,需经过接入层(access stratum,AS)透传非接入层(non-access stratum,NAS)信息。AMF向接入网(access network,AN)请求为协议数据单元(protocol data unit,PDU)会话分配资源等的信令面接口为N2接口。
上文关于核心网设备中的各个网元以及各个网元之间的接口仅为示例性说明,不应对本申请构成任何限定。
终端设备,也可以称为用户设备(user equipment,UE)、接入终端、用户单元、用户站、移动站、移动台、远方站、远程终端、移动设备、用户终端、终端、无线通信设备、用户代理或用户装置。
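Terminal devices may include, but are not limited to: a mobile phone, a tablet computer (pad), a computer with wireless transceiver functions, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a mixed reality (MR) terminal device, an XR terminal device, a wireless terminal in industrial control, a haptic terminal device, a vehicle-mounted terminal device, a wireless terminal in self-driving, a wireless terminal in remote medical, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, a wearable terminal device, a video player, a holographic projector, and the like. This application does not limit the specific form of the terminal device.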
数据网络(data network,DN)可以提供运营商服务、互联网接入或第三方服务,在本申请的实施例中,数据网络包括服务器,可以对视频源进行编码、渲染等。
为了便于理解,以下结合图1、图2和图3对适用于本申请实施例提供的方法的通信系统进行简单说明。
图1是适用于本申请实施例提供的方法的通信系统的示意图。图1所示的通信系统100可以包括:终端设备、无线接入网设备(如图中所示的RAN)、核心网设备(如图中所示的UPF、SMF、AMF、PCF、NEF等)、数据网络和服务器。该通信系统100可视为服务器-网络-终端设备的网络架构。
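The terminal device may be a VR terminal device, an AR terminal device, an XR terminal device, a video player, a holographic projector, or the like. The terminal device may have the functions of an application-layer device: for example, it can capture user operations such as handheld-controller operations and voice control and generate action instructions based on them, or capture images or video. It may also have the functions of a communication device: for example, it communicates wirelessly with the radio access network device, transmits action instructions, images, video, and the like from the application-layer device to the radio access network device over the air interface, and passes data received from the radio access network device to the application-layer device. The terminal device can capture images or video and upload them to the server.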
示例性地,终端设备与服务器间传输数据的过程可以如图1中所示。终端设备发送的上行数据经由无线接入网设备、核心网设备(具体可以为核心网设备中的UPF)、数据网络到达服务器。服务器发送的下行数据经由数据网络、核心网设备(如UPF)、无线接入网设备到达终端设备。
应理解,图1中所示出的各个设备仅为示例。例如在一些设计中,终端设备可以是分离的,如基于不同的功能,划分为应用层设备和通信设备。又例如,应用服务器和数据网络可以是合一部署的。本申请对此不作限定。
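In addition, Figure 1 shows, by way of example, the interfaces between the network elements. For example, the terminal device and the radio access network device communicate over the Uu interface, the radio access network device and the AMF may communicate over the N2 interface, and so on. These are not enumerated here, and this application imposes no limitation in this respect.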
图2是适用于本申请实施例提供的方法的通信系统的另一示意图。图2所示的通信系统200中可以包括终端设备1和终端设备2,无线接入网设备1和无线接入网设备2(如图中所示的RAN1和RAN2),以及核心网设备,如UPF网元。该通信系统200可视为终端设备-网络-终端设备的网络架构。
图2所示的系统可以应用于触觉互联网中,该触觉互联网中的主域例如为终端设备1,例如可以是XR终端、个人电脑等;受控域例如为终端设备2,例如可以是远程控制机器人、远程操作员等;网络域包括核心网设备、无线接入网设备1和2。其中,主域由触觉用户与人工系统接口(human system interface,HSI)组成,HSI可负责利用合适的触觉编码技术将触觉用户的输入转换为触觉数据。触觉数据通过网络域传输 至受控域。主域可通过各种命令信号直接控制受控域,受控域也可以将反馈信号反馈给主域。除了触觉反馈信号之外,主域还可以从受控域接收音频/视频反馈信号。不难理解,主域与受控域的关系与前文结合图1所描述的服务器与终端设备的关系相似。
图3是适用于本申请实施例提供的方法的通信系统的另一示意图。如图3所示,该通信系统300中可以包括终端设备、无线接入网设备(如图中所示的AP)、固网和服务器。该通信系统300可视为服务器-网络-终端设备的网络架构,与图1所不同的是,该网络架构中的网络包括固网。
其中,终端设备可以是XR终端、视频播放器等,无线接入网设备可以是Wi-Fi路由器、Wi-Fi AP、机顶盒等。
示例性地,终端设备与服务器间传输数据的过程可以如图3中所示。终端设备发送的上行数据经由无线接入网设备、固网到达服务器。服务器发送的下行数据经由固网、无线接入网设备到达终端设备。终端设备与服务器之间传输的数据例如可以包括XR媒体数据、普通视频数据等。
应理解,图1至图3所示的通信系统仅为示例,本申请并不限定所适用的系统的具体架构,也不限定各通信系统内包含的各种设备的数量和形态。
目前,AI在诸多业务中得到越来越广泛的应用。例如,AI可应用于如图1至图3所示的通信系统中,通信系统中的某一个或多个设备可以通过神经网络模型来执行计算任务。
以图1所示的系统为例,服务器可以通过神经网络模型,对接收到的图像或视频进行目标检测和识别。为了减少终端设备上传的数据量,可以对该神经网络模型的计算任务进行分割,使得一部分计算任务转移至终端设备,比如,将对图像或视频进行预处理的计算任务转移到终端设备,所述预处理具体可包括提取特征信息,目标定位,图像下采样等,终端设备可以将计算所得的数据上传服务器。这相比于将原始的图像或视频上传而言,可以减少传输的数据量。
为了更好地理解本申请实施例,下面结合图4对本申请实施例涉及到的几个术语进行说明。
1、神经网络模型:由大量的、简单的处理单元(即,神经元)互相连接而成的复杂网络系统。神经网络模型可以包括多个神经网络层。基于不同的类别,神经网络模型可以分为:DNN模型、卷积神经网络(convolutional neural network,CNN)模型、循环神经网络(recurrent neural network,RNN)模型等。本申请包含但不限于此。
2、计算任务:通过神经网络模型执行的任务。若将神经网络模型中的多个神经网络层进行分割成多个部分,该神经网络模型所对应的计算任务也就随之被分割成多个计算任务,例如记为计算任务1至计算任务N。那么,与神经网络模型对应的计算任务可通过执行计算任务1至计算任务N(N为大于1的整数)来实现。
计算任务可以基于业务而定义,业务不同,计算任务也不同。计算任务例如包括但不限于,目标检测、目标识别、目标分类、行为预测、控制系统中的动作决策、图像渲染增强等等。本申请包含但不限于此。
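Figure 4 is a schematic diagram of executing a computing task through a neural network model according to an embodiment of this application. As an example, the neural network model shown in Figure 4 is a DNN model. The DNN model includes multiple neural network layers, seven of which are shown in the figure; these layers may include one or more convolutional layers, one or more pooling layers, one or more fully connected layers, and one or more activation layers. Different neural network layers have different computational characteristics.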
如图所示,待处理的原始数据被输入至该DNN模型,经过计算后,DNN模型输出结果。示例性地,该待处理的原始数据例如可以是图像或视频,该输出结果例如可以是对图像或视频进行目标检测得到的结果。因此,被输入至DNN模型的原始数据可以是图像或视频,从DNN模型输出的数据可以是检测结果。该DNN模型所执行的计算任务可以是对输入的图像或视频进行卷积、池化、分类等处理,进而得到目标检测的结果。
3、分割点:用于对神经网络模型的多个神经网络层进行分割,以将多个神经网络层分为多个部分。在本申请实施例中,分割点可用于将多个神经网络层分成两个部分。图4中用虚线示出了分割点。可以理解,在神经网络模型包含两个以上神经网络层的情况下,分割点可以有多种选择,多个神经网络层中任意两个相邻的神经网络层之间的位置都可以被确定为分割点。
分割点只是为了方便描述而定义,可以视为神经网络模型中的位置,而并不代表在神经网络模型中存在这样一个点。分割也只是为便于理解而定义,并不代表对神经网络模型进行了分割。在一种可能的设计中,用于执行计算任务的两个设备(如上述图1或图3所示系统中的终端设备和服务器,或图2所示系统中的终端设备1和终端设备2)中都预先配置有该神经网络模型,或可预先建立该神经网络模型。各设备可以基于分割点,确定自身需要执行其中哪几个层的计算任务。
由于对神经网络模型进行分割,该神经网络模型对应的计算任务也被分割,因此下文中,神经网络模型分割和计算任务分割交替使用,二者所表达的含义是相同的。
4、原始数据、中间数据和结果:三者均为数据,只是为了区分不同的数据而定义,不应对本申请构成任何限定。其中,原始数据可以是被输入至神经网络模型的数据,具体可以是被输入至输入层的数据;结果是原始数据经由神经网络模型的处理后输出的数据,具体可以是从输出层输出的数据;中间数据可以是指从神经网络模型中的某个神经网络层输出的数据,具体可以是从除了输入层和输出层之外的其他层输出的数据,例如可以是从卷积层或池化层输出的数据。可以理解,中间数据在对神经网络模型进行了分割的情况下被输出。
5、传输数据量:指需要传输的数据的大小,可通过比特、字节等量纲来描述。在本申请实施例中,传输数据量具体可以指由终端设备执行某一计算任务(如第一计算任务)后得到的、需传输给另一设备(如后文所述的第一设备)的中间数据的大小。
6、计算量:指需要进行运算的次数,例如需要进行浮点运算的次数,或需要进行加法与乘法的次数等,可通过浮点运算次数、加法与乘法次数等参数来描述。在本申请实施例中,计算量具体可以指终端设备执行某一计算任务(如第一计算任务)需要进行的运算的次数。
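7. Channel state: the channel properties of a communication link. In the embodiments of this application, the channel state specifically refers to the channel properties of a wireless communication link. During transmission, a wireless signal may be affected by factors such as scattering, environmental fading, and distance attenuation, so the transmission rate may vary accordingly. The channel state can be characterized by parameters such as SINR, RSRP, and CQI, and can be used to determine the transmission rate of data transmitted over the channel.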
由于该DNN在用于执行目标检测与识别的计算任务时,靠近输入侧的神经网络层(比如卷积层、池化层)计算量较小;靠近输出侧的神经网络层(比如全连接层、激活层)计算量较大。因此,可以将计算量较小的神经网络层的计算任务分配给终端设备来执行,而将计算量较大的神经网络层的计算任务分配给服务器来执行。如此一来,终端设备在对待处理的原始数据进行处理后,数据维度得以降低,输出的中间数据较原始数据来说,数据量减少,也即传输的数据量得以减少。
由于神经网络模型所应用的业务不同,需求也不同。有些业务对时延要求较高,有些业务对可靠性要求较高,有些业务对时延和可靠性都具有较高的要求。因此,如何对神经网络模型进行分割,来满足不同业务的需求,成为一项亟待解决的技术问题。
有鉴于此,本申请提供一种计算任务的分割方法,通过网络设备(如无线接入网设备或核心网设备)来确定计算任务的目标分割点。由于网络设备可以及时地获取到无线接入网设备与终端设备之间的信道状态,尤其是无线接入网设备,可以实时地获取到与终端设备之间的信道状态,因此,网络设备可以基于最新获取到的信道状态,及时有效地调整目标分割点,使得对计算任务的分割更为合理,更大程度地满足需求,提升用户体验。
下面将结合附图详细说明本申请提供的方法。
为方便理解,首先做出如下几点说明:
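First, to describe the technical solutions of the embodiments clearly, terms such as "first" and "second" are used to distinguish identical or similar items whose functions and effects are substantially the same. For example, the first computing task and the second computing task are named only to distinguish different computing tasks, and the names do not limit their order. A person skilled in the art will understand that "first", "second", and similar terms limit neither quantity nor execution order, and do not require that the items be different.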
第二,在本申请实施例中,各术语及英文缩略语,如信道状态、神经网络模型、计算任务、DNN、SINR、RSRP、CQI等均为方便描述而给出的示例性举例,不应对本申请构成任何限定。本申请并不排除在已有或未来的协议中定义其它能够实现相同或相似功能的术语的可能。
第三,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a、b和c中的至少一项(个),可以表示:a,或b,或c,或a和b,或a和c,或b和c,或a、b和c,其中a,b,c可以是单个,也可以是多个。
第四,本申请实施例中的表格仅为示例,并不对本申请的保护范围构成限定。例如,表格中的信息的取值仅仅是举例,可以配置为其他值,本申请并不限定。又例如,可以基于上述文中各表做适当的变形调整,例如,拆分,合并等等。再例如,各表中标题示出的参数名称也可以采用通信装置可理解的其他名称,其参数的取值或表示方 式也可以通信装置可理解的其他取值或表示方式。再例如,上述各表在实现时,也可以采用其他的数据结构,例如可以采用数组、队列、容器、栈、线性表、指针、链表、树、图、结构体、类、堆、散列表或哈希表等。
第五,“预定义”或“预配置”可以通过在设备(例如,包括终端设备和第一设备)中预先保存相应的代码、表格或其他可用于指示相关信息的方式来实现,本申请对于其具体的实现方式不做限定。其中,“保存”可以是指,保存在一个或者多个存储器中。所述一个或者多个存储器可以是单独的设置,也可以是集成在编码器或者译码器,处理器、或通信装置中。所述一个或者多个存储器也可以是一部分单独设置,一部分集成在译码器、处理器、或通信装置中。存储器的类型可以是任意形式的存储介质,本申请并不对此限定。
第六,在本申请实施例中,“当……时”、“在……的情况下”、“若”以及“如果”等描述均指在某种客观情况下设备(如,下文所述的终端设备或者接入网设备)会做出相应的处理,并非是限定时间,且也不要求设备(如,下文所述的终端设备或者接入网设备)在实现时一定要有判断的动作,也不意味着存在其它限定。
下文结合附图所示出的实施例从设备交互的角度示出了本申请提供的计算任务的分割方法。其中的各设备仅为示例,不应对本申请提供的方法的实施构成任何限定。
参看图5,图5是本申请实施例提供的计算任务的分割方法500的示意性流程图。可以理解,图5中主要以无线接入网设备、终端设备、和第一设备作为该交互示意的执行主体为例来示意该方法,但本申请并不限制交互示意的执行主体。例如,图5中的无线接入网设备也可以是支持该无线接入网设备实现该方法的芯片、芯片系统、或处理器,还可以是能实现全部或部分无线接入网设备功能的逻辑模块或软件;图5中的终端设备也可以是支持该终端设备实现该方法的芯片、芯片系统、或处理器,还可以是能实现全部或部分终端设备功能的逻辑模块或软件;图5中的第一设备也可以是支持该第一设备实现该方法的芯片、芯片系统、或处理器,还可以是能实现全部或部分第一设备功能的逻辑模块或软件。
图5所示的方法500包括步骤510至步骤540,下面对图5中的各个步骤进行详细说明。
在步骤510中,无线接入网设备获取终端设备与第一计算任务对应的传输数据量和计算量,该第一计算任务是基于分割点对神经网络模型的计算任务进行分割得到的。
该第一计算任务可以是基于某一分割点对一个完整的神经网络模型的计算任务进行分割得到的。如前所述,基于某一分割点对神经网络模型的计算任务进行分割,可以得到两个计算任务。本实施例中,为方便区分和说明,将与神经网络模型对应的计算任务记为计算任务A,对计算任务A进行分割得到的两个计算任务分别记为第一计算任务和第二计算任务。其中,第一计算任务与终端设备对应,第二计算任务与另一设备对应。将用于将计算任务A分割为第一计算任务和第二计算任务的分割点记为分割点A。
其中,所述另一设备是与终端设备通信的设备,例如可以为图1或图3中的服务器,也可以为图2中所示的不同于该终端设备的另一终端设备。为方便区分和说明,将该另一设备记为第一设备。
这里,第一计算任务与终端设备对应,第二计算任务与第一设备对应,可以理解为,假设将该第一计算任务分配给终端设备执行,将第二计算任务分配给第一设备执行。与此对应,与第一计算任务对应的传输数据量和计算量,也就是假设由该终端设备执行第一计算任务可能产生的传输数据量和计算量。
如图中的510a所示,获取终端设备与第一计算任务对应的传输数据量和计算量的一种可能的实现方式是,终端设备向无线接入网设备发送第一信息,该第一信息指示与第一计算任务对应的传输数据量和计算量。相应地,无线接入网设备接收该第一信息。
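For example, the first information may be carried in radio resource control (RRC) signaling; specifically, it may be the UE assistance information (UAI) carried in RRC signaling, or an information element in the UAI. The first information may also be carried in a medium access control (MAC) control element (CE), for example a newly added MAC-CE that carries the first information. This application does not limit the signaling that carries the first information or the specific name of the first information.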
在这种实现方式中,与该第一计算任务对应的传输数据量和计算量可以由终端设备预估得到。示例性地,终端设备可以根据第一计算任务对应的神经网络的层数、神经网络元的个数及对应的计算类型和次数,还可以估计有多少路径(path)需要计算,由此可以预估所需进行的运算次数,也即该第一计算任务对应的计算量。终端设备还可以根据第一计算任务对应的神经网络中的路径数量以及分割点处的神经网络元的类型与数量,预估可能输出的数据的大小,也即该第一计算任务对应的传输数据量。
如图中的510b所示,获取终端设备与第一计算任务对应的传输数据量和计算量的另一种可能的实现方式是,第一设备向无线接入网设备发送第二信息,该第二信息指示与第一计算任务对应的传输数据量和计算量。相应地,无线接入网设备接收来自第一设备的第二信息。
在这种实现方式中,与该第一计算任务对应的传输数据量和计算量可以由第一设备预估得到。第一设备预估传输数据量和计算量的具体方式与上文所述相似,不再赘述。
如图中的510c所示,获取终端设备与第一计算任务对应的传输数据量和计算量的又一种可能的实现方式是,第一设备向无线接入网设备发送第三信息,该第三信息指示与第二计算任务对应的传输数据量和计算量。相应地,无线接入网设备接收来自第一设备的第三信息。
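In this implementation, the transmission data amount and the computation amount corresponding to the second computing task may be estimated by the first device. After receiving the third information, the radio access network device may estimate the transmission data amount and the computation amount of computing task A, which corresponds to the neural network model, and from these determine the transmission data amount and the computation amount corresponding to the first computing task. The first device and the radio access network device estimate these quantities in a manner similar to that described above, so the details are not repeated.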
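For example, the second information and the third information may be sent by the server (one example of the first device) to the core network device over the N33 interface, and then forwarded by the core network device to the radio access network device using general packet radio service (GPRS) tunneling protocol-control signaling. Alternatively, the second information and the third information may be sent by another terminal device (another example of the first device) and carried in signaling such as RRC signaling or a MAC-CE. This application does not limit the signaling that carries the second information and the third information.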
由于对不同的业务,需要使用不同的神经网络模型执行不同的计算任务,故在每一个业务发起后,需要针对其所对应的神经网络模型进行计算任务的分割。故,终端设备或第一设备可以在业务发起时,可以预估通过不同位置的分割点对同一神经网络模型对应的计算任务分割得到的两个计算任务中任意一个所对应的传输数据量和计算量,并发送给无线接入网设备。
其中,对于同一设备来说,比如终端设备,通过不同位置的分割点对同一神经网络模型对应的计算任务进行分割,所得到的计算任务是不同的。参看图6,若基于分割点1对同一神经网络模型的计算任务进行分割(如图6中的a)所示),可以得到与终端设备对应的一个计算任务和与第一设备对应的一个计算任务,如图6中的b)所示。若基于分割点2对同一神经网络模型的计算任务进行分割(如图6中的a)所示),可以得到与终端设备对应的一个计算任务和与第一设备对应的一个计算任务,如图6中的c)所示。对比图6中的b)和c)可以看到,与终端设备对应的两个计算任务不同,与第一设备对应的两个计算任务也不同。
在一种可能的实现方式中,终端设备或第一设备可以在基于业务确定了神经网络模型之后,遍历模型中不同位置的分割点对该神经网络模型的计算任务进行分割,以得到与不同分割点对应的计算任务,进而可以得到与终端设备对应的6个计算任务,如图7中所示出的分割点1至6,以及分别与该分割点1至6对应的神经网络层,进而可以预估出对应于该6个不同计算任务的传输数据量和计算量。
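The traversal described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the method of the application: each layer is summarized by a hypothetical operation count and output size, split point k sits between layer k and layer k+1, and the terminal device would run layers 1 to k. The function name `per_split_estimates` and all numbers are illustrative.

```python
from itertools import accumulate


def per_split_estimates(layer_ops, layer_out_bits):
    """Return (split_id, computation amount, transmission data amount) per split point.

    Assumption: split point k lies between layer k and layer k+1, so the
    terminal device runs layers 1..k and sends the output of layer k.
    """
    if len(layer_ops) != len(layer_out_bits):
        raise ValueError("need one operation count and one output size per layer")
    cumulative_ops = list(accumulate(layer_ops))
    estimates = []
    # Only boundaries between adjacent layers are split points: N layers give N-1 points.
    for k in range(1, len(layer_ops)):
        estimates.append((k, cumulative_ops[k - 1], layer_out_bits[k - 1]))
    return estimates


# Hypothetical 7-layer model, as in Figure 7 (values are made up for illustration).
layer_ops = [10, 20, 30, 40, 50, 60, 70]
layer_out_bits = [800, 400, 200, 100, 50, 25, 10]
estimates = per_split_estimates(layer_ops, layer_out_bits)
```

With seven layers the sketch yields six candidate split points, matching split points 1 to 6 in Figure 7; a list like `estimates` is the kind of per-split report that could be sent to the radio access network device.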
终端设备或第一设备发送的对应于不同计算任务的传输数据量和计算量如表1所示:
表1
应理解,上文结合图7和表格所示的对应于不同计算任务的传输数据量和计算量仅为示例,本申请对于分割点的数量、上报的形式等均不作限定。例如,第一设备也可以基于不同位置的分割点,得到与第一设备对应的多个不同的计算任务,将各计算任务对应的传输数据量和计算量也以类似于上文表格的形式发送给无线接入网设备。
可以理解的是,对于同一个神经网络模型来说,各分割点及其的标识在终端设备和第一设备中是一致的。终端设备和第一设备可以预先配置该神经网络模型,或基于相同的配置信息构建该神经网络模型,并基于相同的规则对各个位置的分割点分配标识。例如图7中所示的神经网络,可以按照从输入层至输出层,每两个相邻层之间设置一个分割点,并按顺序依次标号为1至6;又例如,可以按照从输入层至输出层的顺序,每隔若干个神经网络层设置一个分割点,并依次标号。
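The terminal device and the first device may build the neural network model from the same configuration information as follows: they may negotiate the configuration information through a neural network data exchange standard, for example Open Neural Network Exchange (ONNX) or another predefined neural network data exchange standard. The configuration information includes the structure and/or parameters used to build the neural network model. The structure may specifically refer to the type of the neural network, such as CNN or RNN, and the parameters may specifically include the number of layers of the neural network and the type and weight of each neuron in each layer. This application includes but is not limited to these.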
因此,无线接入网设备在执行步骤510时,可以通过获取终端设备与多个不同的计算任务分别对应的传输数据量和计算量的方式来获取与第一计算任务对应的传输数据量和计算量,其中,所述多个不同的计算任务是基于不同位置的分割点(包括上述分割点A)对神经网络模型A的计算任务进行分割得到的,该多个不同的计算任务包括第一计算任务。
在步骤520中,无线接入网设备基于上述传输数据量、上述计算量,以及终端设备与无线接入网设备之间的信道状态,确定该分割点(即,上述分割点A)为目标分割点。
由于无线接入网设备与终端设备之间通过无线信道进行数据传输。无线信道的质量影响着数据传输的速率和质量。故,无线接入网设备可以频繁地向终端设备发送下行参考信号,以进行下行信道测量;终端设备也可以频繁地向无线接入网设备发送上行参考信号,以进行上行信道测量。
在本申请实施例中,由于终端设备需要将中间数据传输至第一设备,因此,无线接入网设备可以实时地根据上行信道测量结果,确定与终端设备之间的信道状态。
除了信道状态,与第一计算任务对应的传输数据量和计算量,也都可能影响计算任务A的完成时间和所需功耗。
在传输速率一定的情况下,传输数据量越大,所需的传输时间越多,所需的功耗也越大。在终端设备的计算能力一定的情况下,计算量越大,所需的计算时间越多,所需的功耗也越大。因此,无线接入网设备可以结合上述各项来确定是否将分割点A确定为目标分割点。
由于不同的业务存在不同的需求,有些业务对时延敏感,有些业务对传输可靠性要求较高,有些业务对功耗要求较高,有些业务的使用设备为对功耗较敏感的终端设备,因此可以基于不同的需求,从不同的维度来确定目标分割点。
在一种可能的设计中,该目标分割点基于时延确定。
若假设终端设备执行第一计算任务,该时延为终端设备执行第一计算任务所需的时间,具体可包括:终端设备执行第一计算任务的计算时延和传输时延。其中,终端设备执行第一计算任务的计算时延为终端设备完成第一计算任务的计算量所需的时间,终端设备执行第一计算任务的传输时延为终端设备传输由第一计算任务所得到的中间数据所需的时间。
第一计算任务的计算时延可以根据第一计算任务的计算量以及终端设备的计算能力确定。
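The computation time depends on the computing capability of the terminal device. Therefore, when the target split point is determined based on latency, the computing capability of the terminal device may additionally be obtained.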
可选地,在步骤520之前,该方法还包括:无线接入网设备接收来自终端设备的算力信息,该算力信息指示终端设备的计算能力。相应地,终端设备向无线接入网设备发送该算力信息。
与此对应,步骤520具体可以包括:无线接入网设备基于上述传输数据量、上述计算量、终端设备的计算能力,以及终端设备与无线接入网设备间的信道状态,确定该分割点为目标分割点。
接收终端设备的算力信息可以在步骤510之后执行,也可以在步骤510之前执行,或与步骤510同步执行,本申请并不限定二者的执行先后顺序。
示例性地,该算力信息可以承载于RRC信令中,例如,该算力信息具体可以是承载于RRC信令中的UAI,或者为UAI中的信元。该算力信息也可以承载于MAC-CE中,例如,新增MAC-CE用于承载该算力信息。本申请对用于承载该算力信息的信令不作限定。
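The computing power information may include at least one of the following: the time the terminal device needs to complete a predefined test task, or the computing capability of the terminal device.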
进一步地,预定义的测试任务包括:基于预定义的测试神经网络模型、预定义的计算类型或预定义的输入数据中的至少一项而执行的任务。也就是说,不同的终端设备可以基于相同的测试任务进行测试,以获得不同的终端设备完成同一测试任务所需的时间,进而可以根据时间来推出不同的终端设备的计算能力。
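How a completion time for a shared test task can be turned into a capability figure can be sketched as follows. This is only an illustration under an assumption the application does not state: the number of operations in the predefined test task is known to the party doing the estimate. The function name `estimate_compute_rate` and the numbers are hypothetical.

```python
def estimate_compute_rate(test_task_ops, elapsed_seconds):
    """Estimate operations per second from the time taken for a known test task."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return test_task_ops / elapsed_seconds


# Two terminal devices run the same hypothetical test task of 1e9 operations.
rate_a = estimate_compute_rate(1e9, 0.5)  # finishes in 0.5 s
rate_b = estimate_compute_rate(1e9, 2.0)  # finishes in 2.0 s
```

Because both devices run the same task, a shorter completion time maps directly to a higher estimated rate, which is what makes the reported times comparable.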
这里,测试神经网络模型可以预先配置在终端设备中,也可以根据预定义的配置信息而构建,或者也可以根据从其他设备(比如无线接入网设备或第一设备等)获取到的配置信息而构建。关于配置信息的相关说明可参看前文步骤510中的相关描述,此处不再赘述。
计算类型可以包括运算类型,例如矩阵运算,具体还可以包括矩阵乘法运算、矩阵求逆运算等。本申请包含但不限于此。
输入数据可以指输入至测试神经网络模型的数据,也就是待处理的数据。
测试神经网络模型、计算类型和输入数据例如可以由协议预定义。终端设备基于该其中的至少一项来执行测试任务,便可得到完成该测试任务所需的时间。
计算能力还可以通过终端设备每秒浮点运算次数来表征。每秒浮点运算次数也就是每秒所能够执行的浮点运算次数的峰值。终端设备可以将每秒浮点运算次数上报无线接入网设备。
终端设备上报每秒浮点运算次数的一种可能的实现方式是,直接将每秒浮点运算次数上报。
终端设备上报每秒浮点运算次数的另一种可能的实现方式是,将用于标识每秒浮点运算次数的信息上报。
示例性地,用于标识每秒浮点运算次数的信息可以是能力等级。不同的每秒浮点运算次数可以与不同的能力等级对应。每秒浮点运算次数与能力等级的对应关系可以是预定义的,比如协议预定义。终端设备可以根据该对应关系,将与每秒浮点运算次数对应的能力等级上报无线接入网设备。
每秒浮点运算次数或与其对应的能力等级仅为用于表征计算能力的参数之一,本申请并不限定用于表征计算能力的参数,也可将其他参数与能力等级建立对应关系。
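The correspondence between FLOPS and capability levels can be sketched as a table lookup. The level boundaries below are purely hypothetical; the application only says that the mapping may be predefined, for example by a protocol, and gives no values.

```python
import bisect

# Hypothetical lower bounds (in FLOPS) of capability levels 1 to 4.
LEVEL_BOUNDS = [1e9, 1e10, 1e11, 1e12]


def capability_level(flops):
    """Map a FLOPS value to a capability level; level 0 means below the first bound."""
    return bisect.bisect_right(LEVEL_BOUNDS, flops)


level = capability_level(5e10)
```

Reporting a small level index instead of the raw FLOPS value keeps the report compact, at the cost of coarser information for the latency estimate.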
基于终端设备的计算能力,便可确定终端设备的计算速度,进一步结合与第一计算任务对应的计算量,便可以确定终端设备完成该第一计算任务的计算时延。
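For example, the computation latency $T_c$ satisfies $T_c = Q_c / R_c$, where $Q_c$ denotes the computation amount and $R_c$ denotes the computation speed. Substituting the computation amount of the first computing task and the computation speed of the terminal device into this formula gives the computation latency of the terminal device executing the first computing task.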
由于终端设备的电量可能会随着使用时间的延长而降低,用户可能会在不同的电量状态下调整终端设备的模式,例如在低电量状态调整至省电模式,其计算能力也可能随着模式而变化。因此,终端设备可以周期性地向无线接入网设备发送算力信息。如此一来,无线接入网设备便可以根据最新接收到的算力信息来确定计算时延,从而使得对计算时延的估计更为精准。
第一计算任务的传输时延可以根据第一计算任务的传输数据量以及终端设备与无线接入网设备之间的信道状态确定,如,可以根据第一计算任务的传输数据量以及终端设备与无线接入网设备之间的传输速率来确定。
其中,传输速率例如可以由无线接入网设备根据实时获取到的物理层的信道状态以及层2的调度信令来确定。根据信道状态确定传输速率为无线接入网设备的内部实现,且可通过已有技术来实现,对此不作详述。
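For example, the transmission latency $T_t$ satisfies $T_t = Q_t / R_t$, where $Q_t$ denotes the transmission data amount and $R_t$ denotes the transmission rate.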
将第一计算任务对应的传输数据量和终端设备的传输速率代入上式,便可得到终端设备执行第一计算任务的传输时延。
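From the computation latency and the transmission latency, the latency $T$ of the terminal device executing the first computing task is obtained as $T = T_c + T_t$.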
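The latency relations above can be checked with a short Python sketch. The function name `total_latency` and the input values are illustrative assumptions; only the three formulas (computation latency, transmission latency, and their sum) come from the text.

```python
def total_latency(compute_ops, compute_rate, tx_bits, tx_rate):
    """Return (T_c, T_t, T) with T_c = Q_c / R_c, T_t = Q_t / R_t, T = T_c + T_t."""
    if compute_rate <= 0 or tx_rate <= 0:
        raise ValueError("computation speed and transmission rate must be positive")
    t_c = compute_ops / compute_rate  # computation latency in seconds
    t_t = tx_bits / tx_rate           # transmission latency in seconds
    return t_c, t_t, t_c + t_t


# Hypothetical values: 2e9 operations at 1e11 ops/s, 4e6 bits at 2e7 bit/s.
t_c, t_t, t = total_latency(2e9, 1e11, 4e6, 2e7)
```

In this made-up case the transmission term dominates, which illustrates why the channel state matters as much as the device's computing capability when comparing split points.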
前已述及,对于一些低时延业务来说,可以基于时延来确定目标分割点。因此,无线接入网设备可以根据上述时延的计算方式来确定与第一计算任务对应的时延T,进而确定是否将该第一计算任务对应的分割点A确定为目标分割点。
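One possible implementation is to preset a latency threshold for the computing task corresponding to the terminal device. If the latency of the first computing task obtained by splitting at split point A does not exceed the latency threshold, the computing tasks split at split point A are considered to meet the latency requirement, and split point A may be determined as the target split point. If the latency of the first computing task obtained by splitting at split point A exceeds the latency threshold, the computing tasks split at split point A are considered not to meet the latency requirement, and split point A may not be determined as the target split point.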
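A threshold-based choice of this kind can be sketched as follows. The sketch assumes the per-split latencies of the terminal device's task have already been estimated, and it checks candidates in ascending split-point order, which is an arbitrary choice made here for illustration. The name `first_split_within_budget` and the values are hypothetical.

```python
def first_split_within_budget(latency_by_split, budget):
    """Return the first split point whose latency does not exceed the threshold, else None."""
    for split_id in sorted(latency_by_split):
        if latency_by_split[split_id] <= budget:
            return split_id
    return None


# Hypothetical per-split latencies in seconds.
latencies = {1: 0.81, 2: 0.43, 3: 0.16, 4: 0.28}
chosen = first_split_within_budget(latencies, budget=0.5)
```

Returning `None` models the case where no candidate meets the latency requirement, in which case none of them would be determined as the target split point.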
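Another possible implementation is to preset a latency threshold for computing task A. In this case, the latency of the second computing task must be estimated in addition to that of the first computing task, and whether split point A is determined as the target split point depends on how the sum of the two latencies compares with the latency threshold. For example, if the sum of the latencies of the first and second computing tasks obtained by splitting at split point A does not exceed the latency threshold, the computing tasks split at split point A are considered to meet the latency requirement, and split point A may be determined as the target split point; if the sum exceeds the latency threshold, the computing tasks split at split point A are considered not to meet the latency requirement, and split point A may not be determined as the target split point.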
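Yet another possible implementation is to split computing task A, which corresponds to the neural network model, at split points in different positions to obtain multiple different computing tasks corresponding to the terminal device, calculate the latency of each of them, and determine the split point of the computing task with the smallest latency as the target split point. For example, if the first computing task is the one with the smallest latency among these computing tasks, split point A is determined as the target split point. The computation latency and transmission latency of the terminal device for the different computing tasks can be calculated as described above; only the transmission data amount and the computation amount differ.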
以图7为例,分别计算基于分割点1至6分割得到的计算任务的时延,可以得到对应于分割点1至6的时延1至时延6,若其中的分割点3对应的时延3最小,则可将分割点3确定为目标分割点。
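The minimum-latency selection over split points 1 to 6 can be sketched as below. All per-split quantities, the computation speed, and the transmission rate are made-up values chosen so that split point 3 comes out lowest, mirroring the example in the text. `SplitCandidate` and `pick_min_latency` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitCandidate:
    split_id: int
    compute_ops: float  # computation amount Q_c of the terminal device's task
    tx_bits: float      # transmission data amount Q_t of the intermediate data


def latency(candidate, compute_rate, tx_rate):
    """T = Q_c / R_c + Q_t / R_t for one candidate split point."""
    return candidate.compute_ops / compute_rate + candidate.tx_bits / tx_rate


def pick_min_latency(candidates, compute_rate, tx_rate):
    """Return the candidate whose estimated latency is smallest."""
    return min(candidates, key=lambda c: latency(c, compute_rate, tx_rate))


candidates = [
    SplitCandidate(1, 1e8, 8e6),
    SplitCandidate(2, 3e8, 4e6),
    SplitCandidate(3, 6e8, 1e6),
    SplitCandidate(4, 2e9, 8e5),
    SplitCandidate(5, 5e9, 4e5),
    SplitCandidate(6, 9e9, 1e5),
]
best = pick_min_latency(candidates, compute_rate=1e10, tx_rate=1e7)
```

Early split points pay in transmission time and late ones in computation time, so the minimum tends to sit where the two terms balance; with a different transmission rate the same candidates can yield a different target split point.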
本实施例中为方便说明,假设上述分割点A是基于上述实现方式所确定出的目标分割点。
在另一种可能的设计中,该目标分割点基于功耗确定。
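Assuming that the terminal device executes the first computing task, the power consumption is the power the terminal device needs to execute the first computing task, which may specifically include the computation power consumption and the transmission power consumption. The computation power consumption is the power the terminal device needs to carry out the computation amount of the first computing task, and the transmission power consumption is the power the terminal device needs to transmit the intermediate data obtained from the first computing task.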
第一计算任务的计算功耗与第一计算任务的计算量相关。计算功耗随计算量增大而增大。
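As described above, the computation amount is related to the number of neural network layers. For example, the computation power consumption $P_c$ satisfies $P_c = \sum_i P_{c,i}$, where $i$ indexes the $i$-th neural network layer and $P_{c,i}$ denotes the computation power consumption of the $i$-th layer.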
将第一计算任务对应的神经网络层数,以及每个神经网络层的计算功耗代入上式,便可得到终端设备执行第一计算任务的计算功耗。其中,每个神经网络层的计算功耗可以根据每个神经元计算所需要的运算次数,以及终端芯片中每次运算所需的功耗值计算得出,例如,可以根据每个神经元计算所需要的加法器的运行次数、乘法器的运行次数,以及终端芯片中的加法器与乘法器每次运行所需的功耗值计算得出。或者,网络设备与终端设备也可以针对典型的神经网络元的运算功耗进行预规定,根据神经网络结构按照标准计算每个神经网络层的计算功耗。
第一计算任务的传输功耗与第一计算任务的传输数据量以及信道状态相关。传输数据量越大,所需的功耗也越大。又由于终端设备执行第一计算任务得到的中间数据需要通过无线信道传输至无线接入网设备,因此其传输功耗也与信道状态相关。
示例性地,传输功耗Pt与计算任务分割后所需的传输数据量以及当前的信道状态相关。无线接入网设备可以根据当前的信道状态和传输数据量进行功率分配,进而确定该传输功率。第一计算任务的传输功耗也可以由无线接入网设备基于第一计算任务对应的传输数据量和实时获取到的信道状态确定。无线接入网设备根据当前的信道状态和传输数据量进行功率分配的具体方法可参看已有技术,此处不作详述。
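From the computation power consumption and the transmission power consumption, the power consumption $P$ of the terminal device executing the first computing task is obtained as $P = P_c + P_t$.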
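The power calculation can be sketched in Python using the adder and multiplier counts mentioned above. The per-operation costs, the layer counts, and the transmission term are hypothetical inputs in arbitrary units; in the text, the transmission power consumption comes from the radio access network device's power allocation, which this sketch simply takes as a given number.

```python
def layer_compute_cost(n_adds, n_muls, cost_add, cost_mul):
    """P_{c,i}: cost of one layer from its adder and multiplier run counts."""
    return n_adds * cost_add + n_muls * cost_mul


def task_power(layers, cost_add, cost_mul, p_tx):
    """P = P_c + P_t, with P_c summed over the layers the terminal device runs."""
    p_c = sum(layer_compute_cost(a, m, cost_add, cost_mul) for a, m in layers)
    return p_c + p_tx


# Two hypothetical layers given as (adder runs, multiplier runs); units are arbitrary.
layers = [(1000, 1000), (500, 2000)]
power = task_power(layers, cost_add=1.0, cost_mul=3.0, p_tx=50.0)
```

The per-layer sum makes it explicit that moving the split point one layer later adds that layer's cost to the terminal device while changing the transmission term through the new intermediate data size.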
对于一些对功耗较敏感的终端设备,而对时延要求不高的业务来说,无线接入网设备可以基于功耗来确定目标分割点。例如,在终端设备电量较低的情况下,或者,在用户使用的终端设备电池容量较小的情况下,无线接入网设备可以根据上述功耗的计算方式来确定与第一计算任务对应的功耗P,进而确定是否将与第一计算任务对应的分割点A确定为目标分割点。
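One possible implementation is to preset a power consumption threshold for the computing task corresponding to the terminal device. If the power consumption of the first computing task obtained by splitting at split point A does not exceed the power consumption threshold, the computing tasks split at split point A are considered to meet the power consumption requirement, and split point A may be determined as the target split point. If the power consumption of the first computing task obtained by splitting at split point A exceeds the power consumption threshold, the computing tasks split at split point A are considered not to meet the power consumption requirement, and split point A may not be determined as the target split point.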
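Another possible implementation is to split computing task A, which corresponds to the neural network model, at split points in different positions to obtain multiple different computing tasks corresponding to the terminal device, calculate the power consumption of each of them, and determine the split point of the computing task with the smallest power consumption as the target split point. For example, if the first computing task is the one with the smallest power consumption among these computing tasks, split point A is determined as the target split point. The computation power consumption and transmission power consumption of the terminal device for the different computing tasks can be calculated as described above; only the transmission data amount and the computation amount differ.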
在多用户场景下,无线接入网设备还可以考虑小区中多个终端设备之间的相互干扰,调整终端设备的传输功耗,从而降低干扰,以获得更优的传输质量。此外,无线接入网设备还可以考虑小区内的总带宽,根据模型分割点的速率要求,调整终端设备的传输功率,实现系统级的业务最优传输。
在又一种可能的设计中,该目标分割点基于时延和功耗确定。
关于时延和功耗的说明可参看前文,此处不再赘述。无线接入网设备可以综合时延和功耗,来确定目标分割点,以使得所确定的目标分割点对于终端设备来说,既不会带来很大的时延,又不会带来很大的功耗。
无线接入网设备基于时延和功耗来确定目标分割点时,可以根据需求,对时延和功耗施加不同的权重,以获得二者的加权和。例如,对于时延要求较高的业务,可以将时延施加更高的权重;而对于时延要求不高的业务,但对功耗较敏感的终端设备,可以将功耗施加更高的权重。
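For example, the weighted sum of latency and power consumption can be expressed as $\alpha T + \beta P$, where $\alpha$ is the weight of the latency, $0 < \alpha \le 1$, and $\beta$ is the weight of the power consumption, $0 < \beta \le 1$.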
不同的业务、不同的终端设备所对应的α、β的值可以是不同的。无线接入网设备若基于时延和功耗确定目标分割点,则可以响应于每一次业务发起,为对应的计算任务确定α和β值,由此可以计算与第一计算任务对应的时延和功耗的加权和,进而确定是否将与第一计算任务对应的分割点A确定为目标分割点。
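One possible implementation is to preset a threshold for the computing task corresponding to the terminal device. If the weighted sum of the latency and the power consumption of the first computing task obtained by splitting at split point A does not exceed the threshold, the computing tasks split at split point A are considered to meet the requirement, and split point A may be determined as the target split point. If the weighted sum of the latency and the power consumption of the first computing task obtained by splitting at split point A exceeds the threshold, the computing tasks split at split point A are considered not to meet the requirement, and split point A may not be determined as the target split point.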
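Another possible implementation is to split computing task A, which corresponds to the neural network model, at split points in different positions to obtain multiple different computing tasks corresponding to the terminal device, calculate the weighted sum of latency and power consumption for each of them, and determine the split point of the computing task with the smallest weighted sum as the target split point. For example, if the first computing task is the one with the smallest weighted sum of latency and power consumption among these computing tasks, split point A is determined as the target split point. The latency and power consumption of the terminal device for the different computing tasks can be calculated as described above; only the transmission data amount and the computation amount differ.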
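The weighted-sum selection can be sketched as follows. The sketch assumes that latency and power consumption have already been normalized to comparable scales, which the application does not specify; the weights, the per-split metrics, and the names `weighted_cost` and `pick_split` are illustrative.

```python
def weighted_cost(latency, power, alpha, beta):
    """alpha * T + beta * P, with 0 < alpha <= 1 and 0 < beta <= 1."""
    if not (0 < alpha <= 1 and 0 < beta <= 1):
        raise ValueError("weights must lie in (0, 1]")
    return alpha * latency + beta * power


def pick_split(metrics, alpha, beta):
    """Return the split point id with the smallest weighted sum.

    metrics maps split id -> (latency, power), both assumed normalized.
    """
    return min(metrics, key=lambda s: weighted_cost(*metrics[s], alpha, beta))


# Hypothetical normalized (latency, power) pairs for three split points.
metrics = {1: (0.8, 0.2), 2: (0.4, 0.5), 3: (0.2, 0.9)}
latency_first = pick_split(metrics, alpha=0.75, beta=0.25)
power_first = pick_split(metrics, alpha=0.25, beta=0.75)
```

With the latency-heavy weights the sketch selects split point 3, and with the power-heavy weights it selects split point 1, showing how the same candidates lead to different target split points for different services or terminal devices.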
应理解,上文所提供的基于时延和/或功耗确定目标分割点的具体实现方式仅为示例,基于相同的构思,本领域的技术人员还可以采用其他实现方式,来基于时延和/或功耗确定目标分割点。
在步骤530中,无线接入网设备向终端设备发送指示信息,该指示信息指示目标分割点。相应地,终端设备接收该指示信息。
无线接入网设备确定出目标分割点后,便可以将该目标分割点通知终端设备。无线接入网设备可以向终端设备发送指示信息,该指示信息具体可以包括目标分割点的标识,例如目标分割点的索引等可用于唯一标识一个分割点的信息。
示例性地,该指示信息可承载于MAC-CE或下行控制信息(downlink control information,DCI)中。本申请对用于承载该指示信息的信令不作限定。
本实施例中假设分割点A为目标分割点,则无线接入网设备可以将分割点A通过指示信息通知终端设备。
终端设备在基于该分割点A确定了第一计算任务之后,便可执行第一计算任务,并将由此得到的中间数据传输给第一设备。由于第一设备接收到该中间数据后,还需要将该中间数据作为本地第二计算任务的输入,来继续进行计算,因此,第一设备也需要根据目标分割点来确定第二计算任务。
如前所述,对于同一个神经网络模型来说,各分割点及其的标识在终端设备和第一设备中是一致的。因此,第一设备如果可以获知目标分割点,便可确定第二计算任务。
在步骤540中,终端设备向第一设备发送由第一计算任务得到的中间数据和上述指示信息。相应地,第一设备接收该中间数据和指示信息。
第一设备接收到该指示信息后,便可以确定目标分割点,进而确定第二计算任务。第一设备可以将从终端设备接收到的中间数据作为第二计算任务的输入,执行第二计算任务。
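How the two devices use the same split point identifier can be sketched with a toy model in which each layer is a plain Python function. This is an illustration only: `run_split` runs both halves in one process, whereas in the described method the intermediate data and the indication information travel over the network between the terminal device and the first device.

```python
def run_split(layers, split_id, raw_input):
    """Run layers[:split_id] as the first task and layers[split_id:] as the second task."""
    intermediate = raw_input
    for layer in layers[:split_id]:      # first computing task (terminal device side)
        intermediate = layer(intermediate)
    result = intermediate                # intermediate data handed to the first device
    for layer in layers[split_id:]:      # second computing task (first device side)
        result = layer(result)
    return intermediate, result


# Toy three-layer "model"; the layers are arbitrary arithmetic stand-ins.
layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
intermediate, result = run_split(layers, split_id=1, raw_input=5)
```

Whichever split point is indicated, the final result is the same; only the intermediate data and the share of work on each side change, which is why both devices need a consistent view of the split point identifiers.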
在另一种实现方式中,该目标分割点的指示信息也可以由无线接入网设备直接发送给第一设备。可选地,该方法还包括:无线接入网设备向第一设备发送该指示信息。相应地,第一设备接收该指示信息。
可以理解,如果无线接入网设备向第一设备发送目标分割点的指示信息,则终端设备在步骤540中可以不必发送指示信息,而只发送中间数据。
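In the embodiments of this application, the radio access network device determines the target split point based on the transmission data amount and the computation amount corresponding to the terminal device and the first computing task, and on the channel state between the terminal device and the radio access network device. It can therefore analyze the choice from several angles, including the power consumed by transmission and computation and the latency of transmission and computation, and determine the target split point reasonably according to different requirements. Because the radio access network device obtains the channel state between the terminal device and itself in real time, and can perceive channel state changes at the millisecond level, it can adjust the target split point promptly and effectively as the channel state changes. In addition, the radio access network device can take into account the mutual interference among multiple terminal devices in the cell and adjust the transmission power consumption of the terminal device to reduce interference and obtain better transmission quality; it can also combine the total bandwidth of the cell with the rate requirement to adjust the transmission power consumption of the terminal device and achieve system-level optimal service transmission.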
以上,以无线接入网确定目标分割点为例,描述了本申请提供的方法。可以理解,除了无线接入网之外,核心网设备也可以实时地获取终端设备的传输速率,因此也可以用于确定目标分割点。下面将以核心网设备确定目标分割点为例,描述本申请提供的方法。
参考图8,图8是本申请另一实施例提供的计算任务的分割方法800的示意性流程图。可以理解,图8中主要以核心网设备、终端设备、和第一设备作为该交互示意的执行主体为例来示意该方法,但本申请并不限制交互示意的执行主体。例如,图8中的核心网设备也可以是支持该核心网设备实现该方法的芯片、芯片系统、或处理器,还可以是能实现全部或部分核心网设备功能的逻辑模块或软件;图8中的终端设备也可以是支持该终端设备实现该方法的芯片、芯片系统、或处理器,还可以是能实现全部或部分终端设备功能的逻辑模块或软件;图8中的第一设备也可以是支持该第一设备实现该方法的芯片、芯片系统、或处理器,还可以是能实现全部或部分第一设备功能的逻辑模块或软件。
图8所示的方法800包括步骤810至步骤840。下面详细说明图8中的各个步骤。
在步骤810中,核心网设备获取终端设备与第一计算任务对应的传输数据量和计算量,该第一计算任务是基于分割点对神经网络模型的计算任务进行分割得到的。
下文示例性地给出了核心网设备获取终端设备与第一计算任务对应的传输数据量和计算量的实现方式:
一种可能的实现方式是,核心网设备从终端设备接收第一信息,该第一信息指示上述传输数据量和计算量,可对应于图中的810a。
由于终端设备经由无线接入网设备与核心网设备连接,故用于承载传输数据量和计算量的第一信息例如可以是NAS信令。
另一种可能的实现方式是,核心网设备从第一设备接收第二信息,该第二信息指示上述传输数据量和计算量,可对应于图中的810b。
又一种可能的实现方式是,核心网设备从第一设备接收第三信息,该第三信息指示与第二计算任务对应的传输数据量和计算量;核心网设备根据与第二计算任务对应的传输数据量和计算量,确定与第一计算任务对应的传输数据量和计算量,可对应于图中的810c。
示例性地,上述第二信息和第三信息可以由服务器(即,第一设备的一例)通过N33接口发送给核心网设备,也可以由另一终端设备(即,第一设备的另一例)通过RRC信令、MAC-CE等信令发送给核心网设备。
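The specific process of step 810 is similar to that of step 510 in method 500 above; refer to the related description above, which is not repeated here.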
在步骤820中,核心网设备基于上述传输数据量、上述计算量,以及终端设备与无线接入网设备之间的传输速率,确定该分割点为目标分割点。
在本实施例中,核心网设备用于确定目标分割点。核心网设备可以通过业务流所配置的QoS流检测平均传输速率,故核心网设备也可以基于检测到的平均传输速率,结合上述传输数据量和计算量,确定目标分割点。
或者,核心网设备也可以实时地从无线接入网设备获取终端设备与无线接入网设备之间的信道状态,因此核心网设备也可以基于获取到的信道状态确定终端设备与无线接入网设备之间的传输速率,进而结合上述传输数据量和计算量,确定目标分割点。
核心网设备可以基于时延和/或功耗,确定目标分割点。
如前所述,时延包括计算时延和传输时延。其中,计算时延与终端设备的计算能力相关,若要确定计算时延,还需获取终端设备的计算能力。故可选地,该方法还包括:核心网设备接收来自终端设备的算力信息,该算力信息指示终端设备的计算能力。相应地,终端设备向核心网设备发送该算力信息。
与此对应,步骤820具体可以包括:核心网设备基于上述传输数据量、上述计算量、终端设备的算力信息,以及终端设备与无线接入网设备之间的传输速率,确定该分割点为目标分割点。
示例性地,该算力信息可以承载于NAS信令中。
步骤820的具体过程与前文方法500中的步骤520相似,可参看前文相关说明,此处不再赘述。
在步骤830中,核心网设备向终端设备发送指示信息,该指示信息指示目标分割点。相应地,终端设备接收该指示信息。
示例性地,该指示信息可以承载于NAS信令中。
在步骤840中,终端设备向第一设备发送由第一计算任务得到的中间数据和上述指示信息,相应地,第一设备接收该中间数据和指示信息。
步骤830和840的具体过程与前文方法500中的步骤530和540相似,可参看前文相关说明,此处不再赘述。
在本申请实施例中,由于核心网设备可以基于终端设备与第一计算任务对应的传输数据量、计算量、以及终端设备与无线接入网设备之间的传输速率,确定目标分割点,可以从传输、计算的功耗,以及传输、计算的时延等多个角度来予以分析,从而根据不同的需求,合理地确定出目标分割点。由于核心网设备可以基于业务流所配置的QoS流检测平均传输速率,也可以实时地从无线接入网设备获取到终端设备与无线接入网设备之间的信道状态,因此也可以及时有效地根据信道状态的变化来调整目标分割点,使得目标分割点可以随着信道状态的变化而调整。
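It should be understood that the sequence numbers of the steps in Figure 5 and Figure 8 do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of this application in any way. In addition, the steps in the flows shown in Figure 5 and Figure 8 are only examples, and not every step must be performed. Based on the same concept, a person skilled in the art may make simple variations to the flow shown in Figure 5 or Figure 8, for example adjusting the execution order of some steps, or adding or removing steps, to implement the method provided in this application. All such variations fall within the protection scope of this application.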
以上,结合图5至图8详细说明了本申请实施例提供的方法。以下,结合图9至图11详细说明本申请实施例提供的装置。
参看图9,图9是本申请实施例提供的通信装置的示意性框图。如图9所示,该通信装置900可以包括:获取模块910、处理模块920和接口模块930。该通信装置900可以用于执行上述计算任务的分割方法500中无线接入网设备执行的步骤,或者,也可用于执行上述计算任务的分割方法800中核心网设备执行的步骤。
示例性地,该装置900用于执行上述方法500中无线接入网设备执行的步骤时,获取模块910用于执行步骤510,获取终端设备与第一计算任务对应的传输数据量和计算量,所述第一计算任务是基于分割点对神经网络模型的计算任务分割得到的;该处理模块920用于执行步骤520,基于所述传输数据量、所述计算量,以及所述终端设备与无线接入网设备之间的信道状态,确定所述分割点为目标分割点;该接口模块930用于执行步骤530,向所述终端设备发送指示信息,所述指示信息指示所述目标分割点。
该装置900用于执行上述方法800中核心网设备执行的步骤时,获取模块910用于执行步骤810,获取终端设备与第一计算任务对应的传输数据量和计算量,所述第一计算任务是基于分割点对神经网络模型的计算任务分割得到的;该处理模块920用于执行步骤820,基于所述传输数据量、所述计算量,以及所述终端设备与无线接入网设备之间的传输速率,确定所述分割点为目标分割点;该接口模块930用于执行步骤830,向所述终端设备发送指示信息,所述指示信息指示所述目标分割点。
应理解,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。另外,在本申请各个实施例中的各功能模块可以集成在一个处理器中,也可以是单独物理存在,也可以两个或两个以上模块集成在一个模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
参看图10,图10是本申请实施例提供的通信装置的另一示意性框图。如图10所示,该通信装置1000可以包括至少一个处理器1010,用于实现本申请实施例提供的方法中无线接入网设备的功能或核心网设备的功能。
该通信装置1000还可以包括至少一个存储器1020,用于存储程序指令和/或数据。存储器1020和处理器1010耦合。本申请实施例中的耦合是装置、单元或模块之间的间接耦合或通信连接,可以是电性,机械或其它的形式,用于装置、单元或模块之间的信息交互。处理器1010可能和存储器1020协同操作。处理器1010可能执行存储器1020中存储的程序指令。所述至少一个存储器中的至少一个可以包括于处理器中。
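The communication apparatus 1000 may further include a communication interface 1030 for communicating with other devices through a transmission medium, so that the components of the communication apparatus 1000 can communicate with other devices. For example, when the communication apparatus 1000 implements the functions of the radio access network device or the core network device in the method provided in the embodiments of this application, the other devices may include the terminal device and the first device. The communication interface 1030 may be, for example, a transceiver, an interface, a bus, a circuit, or an apparatus capable of transmitting and receiving. The processor 1010 may use the communication interface 1030 to send and receive data and/or information, and to implement the method performed by the radio access network device in the embodiment corresponding to Figure 5, or the method performed by the core network device in the embodiment corresponding to Figure 8.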
例如,当该装置1000用于实现本申请实施例提供的方法中无线接入网设备的功能时,处理器1010可用于控制通信接口1030获取终端设备与第一计算任务对应的传输数据量和计算量,还用于基于所述传输数据量、所述计算量,以及所述终端设备与无线接入网设备之间的信道状态,确定所述分割点为目标分割点;并用于控制通信接口1030向所述终端设备发送指示信息,所述指示信息指示所述目标分割点。
又例如,当该装置1000用于实现本申请实施例提供的方法中核心网设备的功能时,处理器1010可用于控制通信接口1030获取终端设备与第一计算任务对应的传输数据量和计算量,还用于基于所述传输数据量、所述计算量,以及所述终端设备与无线接入网设备之间的传输速率,确定所述分割点为目标分割点;并用于控制通信接口1030向所述终端设备发送指示信息,所述指示信息指示所述目标分割点。
本申请实施例中不限定上述处理器1010、存储器1020以及通信接口1030之间的具体连接介质。本申请实施例在图10中以处理器1010、存储器1020以及通信接口1030之间通过总线1040连接。总线1040在图10中以粗线表示,其它部件之间的连接方式,仅是进行示意性说明,并不引以为限。所述总线可以分为地址总线、数据总线、控制总线等。为便于表示,图10中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
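Refer to Figure 11, which is a schematic structural diagram of a base station provided in an embodiment of this application. The base station 1100 shown in Figure 11 has the functions of the radio access network device shown in Figure 5 and can be applied to the communication systems shown in Figures 1 to 3. As shown in Figure 11, the base station 1100 may include one or more radio frequency units, such as a remote radio unit (RRU) 1110, and one or more baseband units (BBU) (also called distributed units (DU)) 1120. The RRU 1110 may be called a transceiver unit and may correspond to the obtaining module 910 and the interface module 930 in Figure 9, or to the communication interface 1030 in Figure 10. Optionally, the RRU 1110 may also be called a transceiver or a transceiver circuit, and may include at least one antenna 1111 and a radio frequency unit 1112. Optionally, the RRU 1110 may include a receiving unit and a sending unit; the receiving unit may correspond to a receiver (or receiver circuit), and the sending unit may correspond to a transmitter (or transmitter circuit). The RRU 1110 is mainly used to send and receive radio frequency signals and to convert between radio frequency signals and baseband signals, for example to perform the operations of the radio access network device in the foregoing method embodiments, such as sending the indication information to the terminal device. The BBU 1120 is mainly used for baseband processing and for controlling the base station. The RRU 1110 and the BBU 1120 may be physically co-located or physically separate, the latter being a distributed base station.
The BBU 1120 is the control center of the base station and may also be called a processing unit. It may correspond to the processing module 920 in Figure 9 or the processor 1010 in Figure 10, and is mainly used to perform baseband processing functions such as channel coding, multiplexing, modulation, and spreading. For example, the BBU (processing unit) may be used to control the base station to perform the operations of the radio access network device in the foregoing method embodiments, such as determining the target split point and generating the indication information.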
在一个示例中,所述BBU 1120可以由一个或多个单板构成,多个单板可以共同 支持单一接入制式的无线接入网(如LTE网),也可以分别支持不同接入制式的无线接入网(如LTE网,5G网或其他网)。所述BBU 1120还包括存储器1121和处理器1122。所述存储器1121用以存储必要的指令和数据。所述处理器1122用于控制基站进行必要的动作,例如用于控制基站执行上述方法实施例中关于无线接入网设备的操作流程。所述存储器1121和处理器1122可以服务于一个或多个单板。也就是说,可以每个单板上单独设置存储器和处理器。也可以是多个单板共用相同的存储器和处理器。此外每个单板上还可以设置有必要的电路。
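It should be understood that the base station 1100 shown in Figure 11 can implement the processes involving the radio access network device in the method embodiment shown in Figure 5. The operations and/or functions of the modules in the base station 1100 implement the corresponding flows in the foregoing method embodiments. For details, refer to the descriptions in those embodiments; detailed descriptions are omitted here where appropriate to avoid repetition.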
当基站1100用于执行上文方法实施例中涉及无线接入网设备的操作流程时,BBU 1120可以用于执行由无线接入网设备内部实现的动作,而RRU 1110可以用于执行无线接入网设备发送、接收及转发的动作。具体请见前面方法实施例中的描述,此处不再赘述。
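It should be understood that the base station 1100 shown in Figure 11 is only one possible form of the radio access network device and does not limit this application in any way. The method provided in this application is applicable to radio access network devices of other forms. For example, the device may include an active antenna unit (AAU), and may further include a centralized unit (CU) and/or a DU; or it may include a BBU and an adaptive radio unit (ARU); or it may be a BBU. This application does not limit the specific form of the radio access network device.
This application further provides a chip system that includes at least one processor configured to implement the functions involved in the method performed by the radio access network device in the embodiment shown in Figure 5, or the functions involved in the method performed by the core network device in the embodiment shown in Figure 8, for example receiving or processing the data and/or information involved in the foregoing methods.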
在一种可能的设计中,所述芯片系统还包括存储器,所述存储器用于保存程序指令和数据,存储器位于处理器之内或处理器之外。
该芯片系统可以由芯片构成,也可以包含芯片和其他分立器件。
本申请还提供了一种通信系统,包括前述的无线接入网设备、终端设备和第一设备。
本申请还提供了一种通信系统,包括前述的核心网设备、终端设备和第一设备。
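This application further provides a computer-readable storage medium storing a computer program (which may also be called code or instructions) which, when run by a processor, causes the method performed by the radio access network device in the embodiment shown in Figure 5, or the method performed by the core network device in the embodiment shown in Figure 8, to be performed.
This application further provides a computer program product that includes a computer program (which may also be called code or instructions) which, when run, causes a computer to perform the method performed by the radio access network device in the embodiment shown in Figure 5, or the method performed by the core network device in the embodiment shown in Figure 8.
It should be understood that the foregoing method embodiments may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, the steps of the foregoing method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software.
The foregoing processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor.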
结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器、闪存、只读存储器、可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
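The methods provided in the foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product may include one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the flows or functions described in the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or wirelessly (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic disk), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.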
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (24)

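  1. A method for splitting a computing task, characterized by comprising:
    obtaining a transmission data amount and a computation amount of a terminal device that correspond to a first computing task, wherein the first computing task is obtained by splitting a computing task of a neural network model based on a split point;
    determining, based on the transmission data amount, the computation amount, and a channel state between the terminal device and a radio access network device, that the split point is a target split point; and
    sending indication information to the terminal device, wherein the indication information indicates the target split point.
  2. The method according to claim 1, characterized in that the obtaining a transmission data amount and a computation amount of a terminal device that correspond to a first computing task comprises:
    receiving first information from the terminal device, wherein the first information indicates the transmission data amount and the computation amount.
  3. The method according to claim 1, characterized in that the obtaining a transmission data amount and a computation amount of a terminal device that correspond to a first computing task comprises:
    receiving second information from a first device, wherein the second information indicates the transmission data amount and the computation amount, and the first device is another terminal device or a server.
  4. The method according to claim 1, characterized in that the obtaining a transmission data amount and a computation amount of a terminal device that correspond to a first computing task comprises:
    receiving third information from a first device, wherein the third information indicates a transmission data amount and a computation amount of the first device that correspond to a second computing task, and the first device is another terminal device or a server; and
    determining, based on the third information, the transmission data amount and the computation amount that correspond to the first computing task;
    wherein the first computing task and the second computing task are obtained by splitting the computing task of the neural network model based on the split point.
  5. The method according to any one of claims 1 to 4, characterized in that the target split point is determined based on at least one of a latency or a power consumption;
    wherein the latency is the time required for the terminal device to execute the first computing task, and the power consumption is the power consumption required for the terminal device to execute the first computing task.
  6. The method according to claim 5, characterized in that the target split point is determined based on the latency; and
    the determining, based on the transmission data amount, the computation amount, and a channel state between the terminal device and a radio access network device, that the split point is a target split point comprises:
    determining the latency based on computing power information of the terminal device, the transmission data amount, the computation amount, and the channel state; and
    determining, based on the latency, that the split point is the target split point.
  7. The method according to claim 5, characterized in that the target split point is determined based on the power consumption; and
    the determining, based on the transmission data amount, the computation amount, and a channel state between the terminal device and a radio access network device, that the split point is a target split point comprises:
    determining the power consumption based on the transmission data amount, the computation amount, and the channel state; and
    determining, based on the power consumption, that the split point is the target split point.
  8. The method according to claim 5, characterized in that the target split point is determined based on the latency and the power consumption; and
    the determining, based on the transmission data amount, the computation amount, and a channel state between the terminal device and a radio access network device, that the split point is a target split point comprises:
    determining the latency and the power consumption based on computing power information of the terminal device, the transmission data amount, the computation amount, and the channel state; and
    determining, based on the latency and the power consumption, that the split point is the target split point.
  9. The method according to claim 6 or 8, characterized in that the computing power information comprises at least one of a time required for the terminal device to complete a predefined test task or a computing capability of the terminal device;
    wherein the predefined test task comprises a task executed based on at least one of a predefined test neural network model, a predefined computation type, or predefined input data.
  10. The method according to any one of claims 1 to 9, characterized in that the method is applied to the radio access network device.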
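  11. A communication apparatus, characterized by comprising:
    an obtaining module, configured to obtain a transmission data amount and a computation amount of a terminal device that correspond to a first computing task, wherein the first computing task is obtained by splitting a computing task of a neural network model based on a split point;
    a processing module, configured to determine, based on the transmission data amount, the computation amount, and a channel state between the terminal device and a radio access network device, that the split point is a target split point; and
    an interface module, configured to send indication information to the terminal device, wherein the indication information indicates the target split point.
  12. The apparatus according to claim 11, characterized in that the obtaining module is specifically configured to receive first information from the terminal device, wherein the first information indicates the transmission data amount and the computation amount.
  13. The apparatus according to claim 11, characterized in that the obtaining module is specifically configured to receive second information from a first device, wherein the second information indicates the transmission data amount and the computation amount, and the first device is another terminal device or a server.
  14. The apparatus according to claim 11, characterized in that the obtaining module is specifically configured to:
    receive third information from a first device, wherein the third information indicates a transmission data amount and a computation amount that correspond to a second computing task, and the first device is another terminal device or a server; and
    determine, based on the third information, the transmission data amount and the computation amount that correspond to the first computing task;
    wherein the first computing task and the second computing task are obtained by splitting the computing task of the neural network model based on the split point.
  15. The apparatus according to any one of claims 11 to 14, characterized in that the target split point is determined based on at least one of a latency or a power consumption;
    wherein the latency is the time required for the terminal device to execute the first computing task, and the power consumption is the power consumption required for the terminal device to execute the first computing task.
  16. The apparatus according to claim 15, characterized in that the target split point is determined based on the latency; and
    the processing module is specifically configured to:
    determine the latency based on computing power information of the terminal device, the transmission data amount, the computation amount, and the channel state; and
    determine, based on the latency, that the split point is the target split point.
  17. The apparatus according to claim 15, characterized in that the target split point is determined based on the power consumption; and
    the processing module is specifically configured to:
    determine the power consumption based on the transmission data amount, the computation amount, and the channel state; and
    determine, based on the power consumption, that the split point is the target split point.
  18. The apparatus according to claim 15, characterized in that the target split point is determined based on the latency and the power consumption; and
    the processing module is specifically configured to:
    determine the latency and the power consumption based on computing power information of the terminal device, the transmission data amount, the computation amount, and the channel state; and
    determine, based on the latency and the power consumption, that the split point is the target split point.
  19. The apparatus according to claim 16 or 18, characterized in that the computing power information comprises one or more of a time required for the terminal device to complete a predefined test task or a computing capability of the terminal device;
    wherein the predefined test task comprises a task executed based on at least one of a predefined test neural network model, a predefined computation type, or predefined input data.
  20. The apparatus according to any one of claims 11 to 19, characterized in that the apparatus is a radio access network device, or the apparatus is configured in the radio access network device.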
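  21. A communication apparatus, characterized by comprising a processor and a communication interface, wherein the communication interface is configured to receive a signal from a communication apparatus other than the apparatus and transmit the signal to the processor, or to send a signal from the processor to a communication apparatus other than the apparatus; and the processor is configured to implement, by using a logic circuit or by executing program instructions, the method according to any one of claims 1 to 10.
  22. The apparatus according to claim 21, characterized in that the apparatus is a radio access network device, or a chip configured in the radio access network device.
  23. A computer-readable storage medium, characterized in that the storage medium stores a computer program or instructions, and when the computer program or instructions are executed by a computer, the method according to any one of claims 1 to 10 is implemented.
  24. A computer program product, characterized by comprising instructions, wherein when the instructions are run by a computer, the method according to any one of claims 1 to 10 is implemented.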
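
The selection recited in claims 1 and 5 to 9 can be illustrated with a short Python sketch. It is not part of the claims and is only one possible reading under stated assumptions: the channel state is reduced to a bandwidth and a linear SNR, the uplink rate is approximated by the Shannon formula, latency is modeled as local compute time plus transmission time, energy as compute energy plus transmit energy, and the target split point minimizes a weighted sum of the two. All names (`SplitCandidate`, `capability_from_test`, `choose_target_split`, the weights, `joules_per_op`, `tx_power_w`) are hypothetical and do not appear in the application.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitCandidate:
    """One candidate split point and the terminal-side cost it implies."""
    split_point: int    # hypothetical index of the layer after which the model is split
    tx_bits: float      # transmission data amount of the terminal's task, in bits
    compute_ops: float  # computation amount of the terminal's task, in operations


def capability_from_test(test_ops: float, measured_seconds: float) -> float:
    """Assumed reading of claim 9: derive ops/s from a predefined test task."""
    return test_ops / measured_seconds


def link_rate_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Assumption: channel state is mapped to a rate via the Shannon formula."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def latency_s(c: SplitCandidate, device_ops_per_s: float, rate_bps: float) -> float:
    """Assumed latency model: local compute time plus transmission time."""
    return c.compute_ops / device_ops_per_s + c.tx_bits / rate_bps


def energy_j(c: SplitCandidate, joules_per_op: float,
             tx_power_w: float, rate_bps: float) -> float:
    """Assumed energy model: compute energy plus transmit power times airtime."""
    return c.compute_ops * joules_per_op + tx_power_w * (c.tx_bits / rate_bps)


def choose_target_split(candidates, *, device_ops_per_s, bandwidth_hz, snr_linear,
                        joules_per_op, tx_power_w,
                        latency_weight=1.0, energy_weight=0.0):
    """Return the candidate with the lowest weighted latency/energy cost.

    latency_weight=1, energy_weight=0 corresponds to a latency-only choice,
    the reverse to a power-only choice, and both non-zero to a joint choice.
    """
    rate = link_rate_bps(bandwidth_hz, snr_linear)

    def cost(c: SplitCandidate) -> float:
        return (latency_weight * latency_s(c, device_ops_per_s, rate)
                + energy_weight * energy_j(c, joules_per_op, tx_power_w, rate))

    return min(candidates, key=cost)
```

With three hypothetical candidates, (split 0: 8e6 bits, 0 ops), (split 2: 1e6 bits, 2e9 ops), and (split 5: 1e5 bits, 1e10 ops), a device capability of 1e10 ops/s, and 1 MHz of bandwidth, the latency-only choice is split 2 at a linear SNR of 1023 and moves to split 5 at a linear SNR of 1, which shows how a worse channel pushes more computation onto the terminal in this model.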
PCT/CN2023/100037 2022-08-24 2023-06-13 Method for splitting a computing task and related apparatus WO2024041117A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211020380.6A CN117632463A (zh) 2022-08-24 2022-08-24 Method for splitting a computing task and related apparatus
CN202211020380.6 2022-08-24

Publications (1)

Publication Number Publication Date
WO2024041117A1 (zh) 2024-02-29

Family

ID=90012356

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/100037 WO2024041117A1 (zh) 2022-08-24 2023-06-13 Method for splitting a computing task and related apparatus

Country Status (2)

Country Link
CN (1) CN117632463A (zh)
WO (1) WO2024041117A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
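CN114630300A (zh) * 2020-12-11 2022-06-14 Huawei Technologies Co., Ltd. Data transmission method and communication apparatus
CN114630341A (zh) * 2020-12-11 2022-06-14 Huawei Technologies Co., Ltd. Communication method, apparatus, and system
CN114756340A (zh) * 2022-03-17 2022-07-15 China United Network Communications Group Co., Ltd. Computing power scheduling system, method, apparatus, and storage medium
WO2022156752A1 (zh) * 2021-01-21 2022-07-28 Vivo Mobile Communication Co., Ltd. Data transmission method, terminal, and network-side device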

Also Published As

Publication number Publication date
CN117632463A (zh) 2024-03-01

Similar Documents

Publication Publication Date Title
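WO2021243619A1 (zh) Information transmission method and apparatus, communication device, and storage medium
US20220116799A1 (en) Method and device for o-ran-based performance optimization and configuration
US11388644B2 (en) Apparatus and method for load balancing in wireless communication system
KR102548902B1 (ko) Method and apparatus for utilizing a data collection and analysis function in a wireless communication system
US20230014932A1 (en) Method and device of communication in a communication system using an open radio access network
CN114071560B (zh) Network optimization method, apparatus, and storage medium
CN105830415A (zh) Method for managing media streams, wireless communication device, and base station device
WO2016187756A1 (zh) Method, apparatus, system, and base station for resource allocation
CN117289669B (zh) Self-adjusting production line control system and method based on an industrial large model
US20230284194A1 (en) Carrier management method, resource allocation method and related devices
CN111200821B (zh) Capacity planning method and apparatus
WO2024041117A1 (zh) Method for splitting a computing task and related apparatus
CN107426809B (zh) WVN power and cache allocation method based on a virtual user queue model
US20230062037A1 (en) Method for Cell Issue Forecasting
CN110892679A (zh) Management plane performance indicator transfer
WO2022082516A1 (zh) Data transmission method and communication apparatus
WO2024012319A1 (zh) Electronic device and method for wireless communication, and computer-readable storage medium
Gu Matching theory framework for 5G wireless communications
WO2024093739A1 (zh) Communication method and apparatus
WO2023185452A1 (zh) Communication method and communication apparatus
WO2024046419A1 (zh) Communication method and apparatus
WO2024108443A1 (zh) Communication method and apparatus
WO2022253001A1 (zh) Handover method and apparatus
WO2023193579A1 (zh) Data transmission method and apparatus
TW202318899A (zh) Resource allocation method and resource allocation system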

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23856209

Country of ref document: EP

Kind code of ref document: A1